Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
Zhang, H. et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature 621, 396–403 (2023).
Rinehart, N. I. et al. A machine-learning tool to predict substrate-adaptive conditions for Pd-catalyzed C–N couplings. Science 381, 965–972 (2023).
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Patra, T. K. Data-driven methods for accelerating polymer design. ACS Polym. Au 2, 8–26 (2021).
Martin, T. B. & Audus, D. J. Emerging trends in machine learning: a polymer perspective. ACS Polym. Au 3, 239–258 (2023).
Struble, D. C., Lamb, B. G. & Ma, B. A prospective on machine learning challenges, progress, and potential in polymer science. MRS Commun. 14, 752–770 (2024).
Ge, W., De Silva, R., Fan, Y., Sisson, S. A. & Stenzel, M. H. Machine learning in polymer research. Adv. Mater. 37, 2413695 (2025).
Yang, J., Tao, L., He, J., McCutcheon, J. R. & Li, Y. Machine learning enables interpretable discovery of innovative polymers for gas separation membranes. Sci. Adv. 8, eabn9545 (2022).
Tao, L., Varshney, V. & Li, Y. Benchmarking machine learning models for polymer informatics: an example of glass transition temperature. J. Chem. Inf. Model. 61, 5395–5413 (2021).
Arora, A. et al. Random forest predictor for diblock copolymer phase behavior. ACS Macro Lett. 10, 1339–1345 (2021).
Tao, L. et al. Discovery of multi-functional polyimides through high-throughput screening using explainable machine learning. Chem. Eng. J. 465, 142949 (2023).
Li, H. et al. Machine learning-accelerated discovery of heat-resistant polysulfates for electrostatic energy storage. Nat. Energy 10, 90–100 (2025).
Sun, W. et al. Machine learning–assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Sci. Adv. 5, eaay4275 (2019).
Meenakshisundaram, V., Hung, J.-H., Patra, T. K. & Simmons, D. S. Designing sequence-specific copolymer compatibilizers using a molecular-dynamics-simulation-based genetic algorithm. Macromolecules 50, 1155–1166 (2017).
Kim, C., Chandrasekaran, A., Huan, T. D., Das, D. & Ramprasad, R. Polymer genome: a data-powered polymer informatics platform for property predictions. J. Phys. Chem. C 122, 17575–17585 (2018).
Gong, D. et al. Machine learning guided structure function predictions enable in silico nanoparticle screening for polymeric gene delivery. Acta Biomater. 154, 349–358 (2022).
Patel, R. A., Borca, C. H. & Webb, M. A. Featurization strategies for polymer sequence or composition design by machine learning. Mol. Syst. Des. Eng. 7, 661–676 (2022).
Tamasi, M. J. et al. Machine learning on a robotic platform for the design of polymer-protein hybrids. Adv. Mater. 34, 2201809 (2022).
Zhang, X. Y. et al. Polymer-unit fingerprint (PUFp): an accessible expression of polymer organic semiconductors for machine learning. ACS Appl. Mater. Interfaces 15, 21537–21548 (2023).
Tropsha, A., Isayev, O., Varnek, A., Schneider, G. & Cherkasov, A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat. Rev. Drug Discov. 23, 141–155 (2024).
Webb, M. A., Jackson, N. E., Gil, P. S. & de Pablo, J. J. Targeted sequence design within the coarse-grained polymer genome. Sci. Adv. 6, eabc6216 (2020).
Tao, L., Byrnes, J., Varshney, V. & Li, Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 25, 104585 (2022).
Ma, R. & Luo, T. PI1M: a benchmark database for polymer informatics. J. Chem. Inf. Model. 60, 4684–4690 (2020).
Miccio, L. A. & Schwartz, G. A. From chemical structure to quantitative polymer properties prediction through convolutional neural networks. Polymer 193, 122341 (2020).
Yan, C., Feng, X. M., Wick, C., Peters, A. & Li, G. Q. Machine learning assisted discovery of new thermoset shape memory polymers based on a small training dataset. Polymer 214, 123351 (2021).
Zang, X., Zhao, X. & Tang, B. Hierarchical molecular graph self-supervised learning for property prediction. Commun. Chem. 6, 34 (2023).
Antoniuk, E. R., Li, P., Kailkhura, B. & Hiszpanski, A. M. Representing polymers as periodic graphs with learned descriptors for accurate polymer property predictions. J. Chem. Inf. Model. 62, 5435–5445 (2022).
Aldeghi, M. & Coley, C. W. A graph representation of molecular ensembles for polymer property prediction. Chem. Sci. 13, 10486–10498 (2022).
Zhang, S. et al. Deep learning-assisted design of novel donor–acceptor combinations for organic photovoltaic materials with enhanced efficiency. Adv. Mater. 37, 2407613 (2025).
Gurnani, R. et al. AI-assisted discovery of high-temperature dielectrics for energy storage. Nat. Commun. 15, 6107 (2024).
Park, J. et al. Prediction and interpretation of polymer properties using the graph convolutional network. ACS Polym. Au 2, 213–222 (2022).
Chen, D. et al. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proc. AAAI Conference on Artificial Intelligence 3438–3445 (AAAI Press, 2020).
Wang, Y., Wang, J., Cao, Z. & Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Qiang, B. et al. Bridging the gap between chemical reaction pretraining and conditional molecule generation with a unified model. Nat. Mach. Intell. 5, 1476–1485 (2023).
Kuenneth, C. & Ramprasad, R. polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics. Nat. Commun. 14, 4099 (2023).
Xu, C., Wang, Y. & Barati Farimani, A. TransPolymer: a transformer-based language model for polymer property predictions. npj Comput. Mater. 9, 64 (2023).
Lin, T.-S. et al. BigSMILES: a structurally-based line notation for describing macromolecules. ACS Cent. Sci. 5, 1523–1531 (2019).
Lin, T.-S., Rebello, N. J., Lee, G. H., Morris, M. A. & Olsen, B. D. Canonicalizing BigSMILES for polymers with defined backbones. ACS Polym. Au 2, 486–500 (2022).
Schneider, L., Walsh, D., Olsen, B. & de Pablo, J. Generative BigSMILES: an extension for polymer informatics, computer simulations & ML/AI. Digit. Discov. 3, 51–61 (2024).
Luo, Y. et al. Masked graph modeling with multi-view contrast. In Proc. 40th International Conference on Data Engineering 2584–2597 (IEEE, 2024).
Tan, H., Lei, J., Wolf, T. & Bansal, M. Vimpac: video pre-training via masked token prediction and contrastive learning. Preprint at https://www.arxiv.org/abs/2106.11250 (2021).
Chaitanya, K., Erdil, E., Karani, N. & Konukoglu, E. Contrastive learning of global and local features for medical image segmentation with limited annotations. Adv. Neural Inf. Process. Syst. 33, 12546–12558 (2020).
Moriwaki, H., Tian, Y.-S., Kawashita, N. & Takagi, T. Mordred: a molecular descriptor calculator. J. Cheminform. 10, 4 (2018).
RDKit: open-source cheminformatics (RDKit, 2021); http://www.rdkit.org
Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 1–35 (2021).
Fang, T., Zhang, Y., Yang, Y., Wang, C. & Chen, L. Universal prompt tuning for graph neural networks. Adv. Neural Inf. Process. Syst. 36, 52464–52489 (2023).
Fang, Y. et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat. Mach. Intell. 5, 542–553 (2023).
Liu, G., Zhao, T., Xu, J., Luo, T. & Jiang, M. Graph rationalization with environment-based augmentations. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 1069–1078 (ACM, 2022).
Wang, T. & Isola, P. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In Proc. 37th International Conference on Machine Learning 9929–9939 (PMLR, 2020).
Bemis, G. W. & Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 39, 2887–2893 (1996).
Mookherjee, N., Anderson, M. A., Haagsman, H. P. & Davidson, D. J. Antimicrobial host defence peptides: functions and clinical potential. Nat. Rev. Drug Discov. 19, 311–332 (2020).
Huang, J. et al. Identification of potent antimicrobial peptides via a machine-learning pipeline that mines the entire space of peptide sequences. Nat. Biomed. Eng. 7, 797–810 (2023).
Shabani, S. et al. Synthetic peptide branched polymers for antibacterial and biomedical applications. Nat. Rev. Bioeng. 2, 343–361 (2024).
Zhou, M. et al. A dual-targeting antifungal is effective against multidrug-resistant human fungal pathogens. Nat. Microbiol. 9, 1325–1339 (2024).
Phuong, P. T. et al. Effect of hydrophobic groups on antimicrobial and hemolytic activity: developing a predictive tool for ternary antimicrobial polymers. Biomacromolecules 21, 5241–5255 (2020).
Furka, Á. Forty years of combinatorial technology. Drug Discov. Today 27, 103308 (2022).
Bai, P., Liu, X. & Lu, H. Geometry-aware line graph transformer pre-training for molecular property prediction. Preprint at https://www.arxiv.org/abs/2309.00483 (2023).
Li, H., Zhao, D. & Zeng, J. KPGT: knowledge-guided pre-training of graph transformer for molecular property prediction. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 857–867 (ACM, 2022).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (ACL, 2019).
He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (IEEE, 2020).
Ying, C. et al. Do Transformers really perform bad for graph representation? Adv. Neural Inf. Process. Syst. 34, 28877–28888 (2021).
Rampášek, L. et al. Recipe for a general, powerful, scalable graph transformer. Adv. Neural Inf. Process. Syst. 35, 14501–14515 (2022).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).
Corso, G., Cavalleri, L., Beaini, D., Liò, P. & Veličković, P. Principal neighbourhood aggregation for graph nets. Adv. Neural Inf. Process. Syst. 33, 13260–13271 (2020).
Kuenneth, C. et al. Polymer informatics with multi-task learning. Patterns 2, 100238 (2021).
Nagasawa, S., Al-Naamani, E. & Saeki, A. Computer-aided screening of conjugated polymers for organic solar cell: classification by random forest. J. Phys. Chem. Lett. 9, 2639–2646 (2018).
Schauser, N. S., Kliegle, G. A., Cooke, P., Segalman, R. A. & Seshadri, R. Database creation, visualization, and statistical learning for polymer Li+-electrolyte design. Chem. Mater. 33, 4863–4876 (2021).
Wu, Y. Datasets and checkpoints for PerioGT. Zenodo https://doi.org/10.5281/zenodo.17035498 (2025).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning 1263–1272 (PMLR, 2017).
Veličković, P. et al. Graph attention networks. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).