Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).

Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).

Zhang, H. et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature 621, 396–403 (2023).

Rinehart, N. I. et al. A machine-learning tool to predict substrate-adaptive conditions for Pd-catalyzed C–N couplings. Science 381, 965–972 (2023).

Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).

Patra, T. K. Data-driven methods for accelerating polymer design. ACS Polym. Au 2, 8–26 (2021).

Martin, T. B. & Audus, D. J. Emerging trends in machine learning: a polymer perspective. ACS Polym. Au 3, 239–258 (2023).

Struble, D. C., Lamb, B. G. & Ma, B. A prospective on machine learning challenges, progress, and potential in polymer science. MRS Commun. 14, 752–770 (2024).

Ge, W., De Silva, R., Fan, Y., Sisson, S. A. & Stenzel, M. H. Machine learning in polymer research. Adv. Mater. 37, 2413695 (2025).

Yang, J., Tao, L., He, J., McCutcheon, J. R. & Li, Y. Machine learning enables interpretable discovery of innovative polymers for gas separation membranes. Sci. Adv. 8, eabn9545 (2022).

Tao, L., Varshney, V. & Li, Y. Benchmarking machine learning models for polymer informatics: an example of glass transition temperature. J. Chem. Inf. Model. 61, 5395–5413 (2021).

Arora, A. et al. Random forest predictor for diblock copolymer phase behavior. ACS Macro Lett. 10, 1339–1345 (2021).

Tao, L. et al. Discovery of multi-functional polyimides through high-throughput screening using explainable machine learning. Chem. Eng. J. 465, 142949 (2023).

Li, H. et al. Machine learning-accelerated discovery of heat-resistant polysulfates for electrostatic energy storage. Nat. Energy 10, 90–100 (2025).

Sun, W. et al. Machine learning–assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Sci. Adv. 5, eaay4275 (2019).

Meenakshisundaram, V., Hung, J.-H., Patra, T. K. & Simmons, D. S. Designing sequence-specific copolymer compatibilizers using a molecular-dynamics-simulation-based genetic algorithm. Macromolecules 50, 1155–1166 (2017).

Kim, C., Chandrasekaran, A., Huan, T. D., Das, D. & Ramprasad, R. Polymer genome: a data-powered polymer informatics platform for property predictions. J. Phys. Chem. C 122, 17575–17585 (2018).

Gong, D. et al. Machine learning guided structure function predictions enable in silico nanoparticle screening for polymeric gene delivery. Acta Biomater. 154, 349–358 (2022).

Patel, R. A., Borca, C. H. & Webb, M. A. Featurization strategies for polymer sequence or composition design by machine learning. Mol. Syst. Des. Eng. 7, 661–676 (2022).

Tamasi, M. J. et al. Machine learning on a robotic platform for the design of polymer-protein hybrids. Adv. Mater. 34, 12 (2022).


Zhang, X. Y. et al. Polymer-unit fingerprint (PUFp): an accessible expression of polymer organic semiconductors for machine learning. ACS Appl. Mater. Interfaces 15, 21537–21548 (2023).

Tropsha, A., Isayev, O., Varnek, A., Schneider, G. & Cherkasov, A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat. Rev. Drug Discov. 23, 141–155 (2024).

Webb, M. A., Jackson, N. E., Gil, P. S. & de Pablo, J. J. Targeted sequence design within the coarse-grained polymer genome. Sci. Adv. 6, eabc6216 (2020).

Tao, L., Byrnes, J., Varshney, V. & Li, Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 25, 104585 (2022).

Ma, R. & Luo, T. PI1M: a benchmark database for polymer informatics. J. Chem. Inf. Model. 60, 4684–4690 (2020).

Miccio, L. A. & Schwartz, G. A. From chemical structure to quantitative polymer properties prediction through convolutional neural networks. Polymer 193, 122341 (2020).

Yan, C., Feng, X. M., Wick, C., Peters, A. & Li, G. Q. Machine learning assisted discovery of new thermoset shape memory polymers based on a small training dataset. Polymer 214, 12 (2021).

Zang, X., Zhao, X. & Tang, B. Hierarchical molecular graph self-supervised learning for property prediction. Commun. Chem. 6, 34 (2023).

Antoniuk, E. R., Li, P., Kailkhura, B. & Hiszpanski, A. M. Representing polymers as periodic graphs with learned descriptors for accurate polymer property predictions. J. Chem. Inf. Model. 62, 5435–5445 (2022).

Aldeghi, M. & Coley, C. W. A graph representation of molecular ensembles for polymer property prediction. Chem. Sci. 13, 10486–10498 (2022).

Zhang, S. et al. Deep learning-assisted design of novel donor–acceptor combinations for organic photovoltaic materials with enhanced efficiency. Adv. Mater. 37, 2407613 (2025).

Gurnani, R. et al. AI-assisted discovery of high-temperature dielectrics for energy storage. Nat. Commun. 15, 6107 (2024).

Park, J. et al. Prediction and interpretation of polymer properties using the graph convolutional network. ACS Polym. Au 2, 213–222 (2022).

Chen, D. et al. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proc. AAAI Conference on Artificial Intelligence 3438–3445 (AAAI Press, 2020).

Wang, Y., Wang, J., Cao, Z. & Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022).

Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

Qiang, B. et al. Bridging the gap between chemical reaction pretraining and conditional molecule generation with a unified model. Nat. Mach. Intell. 5, 1476–1485 (2023).

Kuenneth, C. & Ramprasad, R. polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics. Nat. Commun. 14, 4099 (2023).

Xu, C., Wang, Y. & Barati Farimani, A. TransPolymer: a transformer-based language model for polymer property predictions. npj Comput. Mater. 9, 64 (2023).

Lin, T.-S. et al. BigSMILES: a structurally-based line notation for describing macromolecules. ACS Cent. Sci. 5, 1523–1531 (2019).

Lin, T.-S., Rebello, N. J., Lee, G. H., Morris, M. A. & Olsen, B. D. Canonicalizing BigSMILES for polymers with defined backbones. ACS Polym. Au 2, 486–500 (2022).

Schneider, L., Walsh, D., Olsen, B. & de Pablo, J. Generative BigSMILES: an extension for polymer informatics, computer simulations & ML/AI. Digit. Discov. 3, 51–61 (2024).

Luo, Y. et al. Masked graph modeling with multi-view contrast. In Proc. 40th International Conference on Data Engineering 2584–2597 (IEEE, 2024).

Tan, H., Lei, J., Wolf, T. & Bansal, M. Vimpac: video pre-training via masked token prediction and contrastive learning. Preprint at https://www.arxiv.org/abs/2106.11250 (2021).

Chaitanya, K., Erdil, E., Karani, N. & Konukoglu, E. Contrastive learning of global and local features for medical image segmentation with limited annotations. Adv. Neural Inf. Process. Syst. 33, 12546–12558 (2020).


Moriwaki, H., Tian, Y.-S., Kawashita, N. & Takagi, T. Mordred: a molecular descriptor calculator. J. Cheminform. 10, 4 (2018).

RDKit: open-source cheminformatics (RDKit, 2021); http://www.rdkit.org

Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 1–35 (2021).


Fang, T., Zhang, Y., Yang, Y., Wang, C. & Chen, L. Universal prompt tuning for graph neural networks. Adv. Neural Inf. Process. Syst. 36, 52464–52489 (2023).


Fang, Y. et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat. Mach. Intell. 5, 542–553 (2023).

Liu, G., Zhao, T., Xu, J., Luo, T. & Jiang, M. Graph rationalization with environment-based augmentations. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 1069–1078 (ACM, 2022).

Wang, T. & Isola, P. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In Proc. 37th International Conference on Machine Learning 9929–9939 (PMLR, 2020).

Bemis, G. W. & Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 39, 2887–2893 (1996).

Mookherjee, N., Anderson, M. A., Haagsman, H. P. & Davidson, D. J. Antimicrobial host defence peptides: functions and clinical potential. Nat. Rev. Drug Discov. 19, 311–332 (2020).

Huang, J. et al. Identification of potent antimicrobial peptides via a machine-learning pipeline that mines the entire space of peptide sequences. Nat. Biomed. Eng. 7, 797–810 (2023).

Shabani, S. et al. Synthetic peptide branched polymers for antibacterial and biomedical applications. Nat. Rev. Bioeng. 2, 343–361 (2024).

Zhou, M. et al. A dual-targeting antifungal is effective against multidrug-resistant human fungal pathogens. Nat. Microbiol. 9, 1325–1339 (2024).

Phuong, P. T. et al. Effect of hydrophobic groups on antimicrobial and hemolytic activity: developing a predictive tool for ternary antimicrobial polymers. Biomacromolecules 21, 5241–5255 (2020).

Furka, Á. Forty years of combinatorial technology. Drug Discov. Today 27, 103308 (2022).

Bai, P., Liu, X. & Lu, H. Geometry-aware line graph transformer pre-training for molecular property prediction. Preprint at https://www.arxiv.org/abs/2309.00483 (2023).

Li, H., Zhao, D. & Zeng, J. KPGT: knowledge-guided pre-training of graph transformer for molecular property prediction. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 857–867 (ACM, 2022).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (ACL, 2019).

He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (IEEE, 2020).

Ying, C. et al. Do Transformers really perform bad for graph representation? Adv. Neural Inf. Process. Syst. 34, 28877–28888 (2021).


Rampášek, L. et al. Recipe for a general, powerful, scalable graph transformer. Adv. Neural Inf. Process. Syst. 35, 14501–14515 (2022).


Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).


Corso, G., Cavalleri, L., Beaini, D., Liò, P. & Veličković, P. Principal neighbourhood aggregation for graph nets. Adv. Neural Inf. Process. Syst. 33, 13260–13271 (2020).


Kuenneth, C. et al. Polymer informatics with multi-task learning. Patterns 2, 100238 (2021).

Nagasawa, S., Al-Naamani, E. & Saeki, A. Computer-aided screening of conjugated polymers for organic solar cell: classification by random forest. J. Phys. Chem. Lett. 9, 2639–2646 (2018).

Schauser, N. S., Kliegle, G. A., Cooke, P., Segalman, R. A. & Seshadri, R. Database creation, visualization, and statistical learning for polymer Li+-electrolyte design. Chem. Mater. 33, 4863–4876 (2021).

Wu, Y. Datasets and checkpoints for PerioGT. Zenodo https://doi.org/10.5281/zenodo.17035498 (2025).

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning 1263–1272 (PMLR, 2017).

Veličković, P. et al. Graph attention networks. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).