{"id":26401,"date":"2026-05-04T08:01:30","date_gmt":"2026-05-04T08:01:30","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/26401\/"},"modified":"2026-05-04T08:01:30","modified_gmt":"2026-05-04T08:01:30","slug":"mammal-molecular-aligned-multi-modal-architecture-and-language-for-biomedical-discovery","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/26401\/","title":{"rendered":"MAMMAL &#8211; Molecular Aligned Multi-Modal Architecture and Language for biomedical discovery"},"content":{"rendered":"<p>To evaluate the performance and generalization capabilities of ibm\/biomed.omics.bl.sm.ma-ted-458m, we selected a diverse set of existing benchmarks spanning multiple task types and stages of the drug discovery pipeline, prioritizing benchmarks with clearly defined splits when those were available. We assessed model quality through a fine-tuning-based evaluation strategy, where the pretrained model is adapted to each benchmark and compared against specialized state-of-the-art (SOTA) models. The evaluation methodology and fine-tuning protocol as well as detailed descriptions of each benchmark\u2014including background, significance for drug discovery, prior models, and data statistics\u2014are provided in the subsections below. A summary of performance across tasks is presented in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a> and visualized in Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Fig1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>E, and representative encoder-decoder examples are provided in the Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S1<\/a>.<\/p>\n<p>Fig. 1: Overview of MAMMAL pretraining data, model architecture, and downstream tasks.<img decoding=\"async\" aria-describedby=\"figure-1-desc ai-alt-disclaimer-figure-1-1\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/05\/44386_2026_47_Fig1_HTML.png\" alt=\"Fig. 1: Overview of MAMMAL pretraining data, model architecture, and downstream tasks.\" loading=\"lazy\" width=\"685\" height=\"469\"\/><\/p>\n<p>A We introduce a multi-align model pretrained on six datasets, each containing tens to hundreds of millions of data points. These data points include protein sequences, small molecules, and gene expression profiles, with a combined sample size of 2 billion. B The multi-align model combines flexible encoder-only and encoder-decoder components. It takes sequences as input, which may contain any combination of tokens and scalar elements, processed by an encoder stack consisting of self-attention blocks. In encoder-only mode, a dedicated token prediction head outputs logits for token predictions, with an optional scalar prediction head for scalar outputs. In encoder-decoder mode, residual connections inject features from the encoder\u2019s final hidden layer into each decoder layer, and a decoder-specific prediction head outputs the final logits. 
C Diverse downstream tasks performed by the multi-align model, mapped to their contributions within the steps of a typical drug discovery pipeline. D Diverse downstream tasks performed by the multi-align model, categorized by data type used in the fine-tuning process. E Performance of the multi-align model across a diverse set of tasks compared to SOTA. Panel (E) was generated using Matplotlib. Panels (A\u2013D) were created using Illustrator and PowerPoint.<\/p>\n<p>Table 1 Comparison of SOTA and MAMMAL Performance Across Benchmarks<\/p>\n<p>AlphaFold<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583&#x2013;589 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR22\" id=\"ref-link-section-d295361478e1370\" rel=\"nofollow noopener\" target=\"_blank\">22<\/a>, whose development contributed to the 2024 Nobel Prize in Chemistry, revolutionized protein structure prediction. Its extension AlphaFold-Multimer<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Evans, R. et al. Protein complex prediction with alphafold-multimer. bioRxiv &#010;                  https:\/\/www.biorxiv.org\/content\/early\/2022\/03\/10\/2021.10.04.463034&#010;                  &#010;                 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR23\" id=\"ref-link-section-d295361478e1374\" rel=\"nofollow noopener\" target=\"_blank\">23<\/a> enabled modeling of antibody-antigen complexes, while AlphaFold 3 (AF3)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Abramson, J. et al. 
Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493&#x2013;500 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR24\" id=\"ref-link-section-d295361478e1378\" rel=\"nofollow noopener\" target=\"_blank\">24<\/a> further improved accuracy and added nucleic acid\/small molecule support. Motivated by AF3\u2019s reported advances, we evaluated its performance on therapeutic Antibody and Nanobody complexes (Subsection 2.10). Comparative analysis reveals that MAMMAL achieves better classification performance than AF3 in five of seven targets (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>).<\/p>\n<p>Fig. 2: AF3-predicted nanobody binding poses on HER2 and TBG.<img decoding=\"async\" aria-describedby=\"figure-2-desc ai-alt-disclaimer-figure-2-1\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/05\/44386_2026_47_Fig2_HTML.png\" alt=\"Fig. 2: AF3-predicted nanobody binding poses on HER2 and TBG.\" loading=\"lazy\" width=\"685\" height=\"157\"\/><\/p>\n<p>a HER2 extracellular domain (ECD) structure with representative AF3-predicted complexes for a binder and a non-binder. The FDA-approved therapeutic antibodies trastuzumab (blue) and pertuzumab (purple) are shown for reference. AF3 predicts both binders and non-binders engaging the same region of the HER2 ECD, which is distinct from the known therapeutic epitopes, consistent with its poor discriminative performance on this target (AUROC = 0.45). b Thyroxine-binding globulin (TBG) structure with AF3-predicted complexes for binding and non-binding VHHs. In contrast to HER2, AF3 predicts distinct binding poses for binders versus non-binders on TBG, consistent with its strong discriminative performance on this target. 
Visualizations were generated using PyMOL.<\/p>\n<p>Evaluation<\/p>\n<p>We compiled a comprehensive set of 11 benchmarks covering multiple data domains and task types, including classification, regression, and generation, as well as single-entity, multi-entity, and multi-domain tasks. These benchmarks address key stages of the drug discovery process: identifying target cell types (Cell Type) and advancing precision medicine (Cancer-Drug Response 1-3); predicting drug efficacy (BBBP) and safety (ClinTox); predicting the binding affinity of small-molecule drugs to target proteins (DTI); predicting interactions of biological drugs (PPI); and designing new drugs, such as antibodies, to target specific proteins (Ab Infilling).<\/p>\n<p>To enable fair and direct comparison to prior work, benchmark selection prioritized datasets with predefined train, validation, and test splits or with established splitting strategies reported in the corresponding state-of-the-art studies. For each benchmark, we followed the data splits and evaluation metrics used in the original benchmark or SOTA reference. When explicit train, validation, and test splits were available, ibm\/biomed.omics.bl.sm.ma-ted-458m was fine-tuned on the training set, the best checkpoint was selected using the validation set, and final performance was reported on the test set. For benchmarks evaluated using cross-validation, we adopted the same protocol as the corresponding prior work. Unless otherwise noted, standard errors were estimated by training the models with three different random seeds and calculating the standard deviation of their performance on the held-out test set. Detailed descriptions of each benchmark, the fine-tuning procedures, and the evaluation protocols are provided alongside the corresponding results. For the DTI benchmark, performance is reported using normalized root mean square error (NRMSE), defined as the root mean square error divided by the standard deviation of the test labels. 
The same normalization is applied to both MAMMAL and reported SOTA results, yielding values below 1 and enabling joint visualization alongside other performance metrics in Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Fig1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>(E). We consider MAMMAL to outperform existing state of the art when the relative improvement, computed as \u2223SOTA \u2212 MAMMAL\u2223\/SOTA, exceeds 1%.<\/p>\n<p>Cell Type Annotation<\/p>\n<p>Cell type prediction enables researchers to distinguish between different cell populations, such as those associated with various diseases<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Baslan, T. &amp; Hicks, J. Unravelling biology and shifting paradigms in cancer with single-cell sequencing. Nat. Rev. Cancer 17, 557&#x2013;569 (2017).\" href=\"#ref-CR11\" id=\"ref-link-section-d295361478e1434\">11<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ofengeim, D., Giagtzoglou, N., Huh, D., Zou, C. &amp; Yuan, J. Single-cell rna sequencing: unraveling the brain one cell at a time. Trends Mol. Med. 23, 563&#x2013;576 (2017).\" href=\"#ref-CR12\" id=\"ref-link-section-d295361478e1434_1\">12<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Rozenblatt-Rosen, O., Stubbington, M. J., Regev, A. &amp; Teichmann, S. A. The human cell atlas: from vision to reality. Nature 550, 451&#x2013;453 (2017).\" href=\"#ref-CR13\" id=\"ref-link-section-d295361478e1434_2\">13<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 14\" title=\"Potter, S. S. 
Single-cell rna sequencing for the study of development, physiology and disease. Nat. Rev. Nephrol. 14, 479&#x2013;492 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR14\" id=\"ref-link-section-d295361478e1437\" rel=\"nofollow noopener\" target=\"_blank\">14<\/a>. It is also crucial for understanding how diseases or drugs affect different cell types. In recent years, a variety of methods have been developed for this task, including approaches based on marker genes, correlation-based techniques, and annotation using classification<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 25\" title=\"Qi, R., Ma, A., Ma, Q. &amp; Zou, Q. Clustering and classification methods for single-cell rna-sequencing data. Brief. Bioinforma. 21, 1196&#x2013;1208 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR25\" id=\"ref-link-section-d295361478e1441\" rel=\"nofollow noopener\" target=\"_blank\">25<\/a>. Recent advances in transformer-based and large-scale foundation models<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Cui, H. et al. scgpt: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 1&#x2013;11 (2024).\" href=\"#ref-CR26\" id=\"ref-link-section-d295361478e1445\">26<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Xu, J., Zhang, A., Liu, F., Chen, L. &amp; Zhang, X. Ciform as a transformer-based model for cell-type annotation of large-scale single-cell rna-seq data. Brief. Bioinforma. 24, bbad195 (2023).\" href=\"#ref-CR27\" id=\"ref-link-section-d295361478e1445_1\">27<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 28\" title=\"Yang, F. 
et al. scbert as a large-scale pretrained deep language model for cell type annotation of single-cell rna-seq data. Nat. Mach. Intell. 4, 852&#x2013;866 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR28\" id=\"ref-link-section-d295361478e1448\" rel=\"nofollow noopener\" target=\"_blank\">28<\/a> have shown improved performance.<\/p>\n<p>The input for this task is single-cell gene expression data. The benchmark we used was based on the Zheng68k dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR29\" id=\"ref-link-section-d295361478e1455\" rel=\"nofollow noopener\" target=\"_blank\">29<\/a>, which is composed of human peripheral blood mononuclear cells and is widely used for evaluating cell-type annotation performance because many of the cell types it contains are similar to one another. The dataset contains 68,579 cells across 11 cell types and originally included 32,738 genes; removing non-expressed genes leaves 20,387 genes in the benchmark. Preprocessing involved normalization and log transformation of expression values, followed by binning. Similar to the approach in<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 30\" title=\"Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616&#x2013;624 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR30\" id=\"ref-link-section-d295361478e1459\" rel=\"nofollow noopener\" target=\"_blank\">30<\/a>, our model uses a ranked list of expressed gene names, ordered by their expression levels, as input. 
The label to predict is provided in the cell ontology format \u201cCL:NNNNNN\u201d (see Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S1<\/a>).<\/p>\n<p>Following prior work<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 27\" title=\"Xu, J., Zhang, A., Liu, F., Chen, L. &amp; Zhang, X. Ciform as a transformer-based model for cell-type annotation of large-scale single-cell rna-seq data. Brief. Bioinforma. 24, bbad195 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR27\" id=\"ref-link-section-d295361478e1469\" rel=\"nofollow noopener\" target=\"_blank\">27<\/a>, we adopted a 5-fold cross-validation strategy to fine-tune and evaluate ibm\/biomed.omics.bl.sm.ma-ted-458m, ensuring similar proportions of cell types across folds, and assessed performance using accuracy and macro F1 score. MAMMAL outperforms the previous state-of-the-art performance in both accuracy and F1 (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a> and detailed results in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S2<\/a>), achieving a 7.5% improvement in F1.<\/p>\n<p>BBBP and ClinTox<\/p>\n<p>To ensure the development of safe and effective drugs, candidates must satisfy rigorous criteria related to both efficacy and safety. 
In this study, we selected two relevant benchmarks from MoleculeNet<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 31\" title=\"Wu, Z. et al. Moleculenet: a benchmark for molecular machine learning. Chem. Sci. 9, 513&#x2013;530 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR31\" id=\"ref-link-section-d295361478e1491\" rel=\"nofollow noopener\" target=\"_blank\">31<\/a>, a widely used suite of benchmarks for evaluating machine learning models on small-molecule drug properties: BBBP and ClinTox. The BBBP benchmark focuses on predicting the ability of drugs to penetrate the blood-brain barrier, a critical consideration for drugs targeting the central nervous system. The ClinTox benchmark comprises two related tasks: (1) predicting failure in clinical toxicity trials, and (2) predicting FDA approval status. The overall performance on ClinTox is reported as the average performance across these two tasks.<\/p>\n<p>MoLFormer<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 32\" title=\"Ross, J. et al. Large-scale chemical language representations capture molecular structure and properties. Nat. Mach. Intell. 4, 1256&#x2013;1264 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR32\" id=\"ref-link-section-d295361478e1498\" rel=\"nofollow noopener\" target=\"_blank\">32<\/a>, a well-established model for molecular embeddings trained on 1.1 billion SMILES sequences, has achieved state-of-the-art performance on both the BBBP and ClinTox benchmarks. In our study, we adopted the benchmarks from<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 32\" title=\"Ross, J. et al. Large-scale chemical language representations capture molecular structure and properties. 
Nat. Mach. Intell. 4, 1256&#x2013;1264 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR32\" id=\"ref-link-section-d295361478e1502\" rel=\"nofollow noopener\" target=\"_blank\">32<\/a>, which provided predefined splits for training, validation, and testing. MAMMAL surpasses MoLFormer on both benchmarks (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>), achieving an average area under the receiver operating characteristic curve (AUROC) score of 0.957 on BBBP and 0.986 on ClinTox, representing improvements of 2.2% and 4%, respectively, over the state of the art.<\/p>\n<p>Cancer-Drug Response<\/p>\n<p>Identifying drug response at the cellular level is a critical step in the development of new drugs. Two key public databases supporting this effort, particularly in cancer drug development, are the Cancer Cell Line Encyclopedia (CCLE)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 33\" title=\"Barretina, J. et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603&#x2013;607 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR33\" id=\"ref-link-section-d295361478e1517\" rel=\"nofollow noopener\" target=\"_blank\">33<\/a> and the Genomics of Drug Sensitivity in Cancer (GDSC)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 34\" title=\"Yang, W. et al. Genomics of drug sensitivity in cancer (gdsc): a resource for therapeutic biomarker discovery in cancer cells. Nucleic acids Res. 
41, D955&#x2013;D961 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR34\" id=\"ref-link-section-d295361478e1521\" rel=\"nofollow noopener\" target=\"_blank\">34<\/a>. CCLE provides multi-omics profiles for around 1000 cancer cell lines, while GDSC offers data on the drug responses of these lines to hundreds of drugs, commonly measured using the half-maximal inhibitory concentration (IC50). Several notable computational models have addressed this task<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lind, A. P. &amp; Anderson, P. C. Predicting drug activity against cancer cells by random forest models based on minimal genomic information and chemical properties. PloS one 14, e0219774 (2019).\" href=\"#ref-CR35\" id=\"ref-link-section-d295361478e1527\">35<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Liu, Q., Hu, Z., Jiang, R. &amp; Zhou, M. Deepcdr: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics 36, i911&#x2013;i918 (2020).\" href=\"#ref-CR36\" id=\"ref-link-section-d295361478e1527_1\">36<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Liu, X. et al. Graphcdr: a graph neural network method with contrastive learning for cancer drug response prediction. Brief. Bioinforma. 23, bbab457 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR37\" id=\"ref-link-section-d295361478e1530\" rel=\"nofollow noopener\" target=\"_blank\">37<\/a>.<\/p>\n<p>For our study, we used three subsets of the GDSC database: GDSC1 and GDSC2, available through the Therapeutics Data Commons (TDC)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Huang, K. 
et al. Therapeutics Data Commons: Machine learning datasets and tasks for drug discovery and development. Adv. Neural Inform. Processing Syst. (NeurIPS), Track on Datasets and Benchmarks (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR38\" id=\"ref-link-section-d295361478e1537\" rel=\"nofollow noopener\" target=\"_blank\">38<\/a>, and referred to in the paper as Cancer-Drug Response 1 and Cancer-Drug Response 2, respectively; and a subset published in<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Liu, Q., Hu, Z., Jiang, R. &amp; Zhou, M. Deepcdr: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics 36, i911&#x2013;i918 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR36\" id=\"ref-link-section-d295361478e1541\" rel=\"nofollow noopener\" target=\"_blank\">36<\/a>, referred to as Cancer-Drug Response 3. A dataset statistics table summarizing the number of cell lines, drugs, and cell\u2013drug pairs is provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S3<\/a>. We used the random splits provided by TDC for Cancer-Drug Response 1 and 2, while for Cancer-Drug Response 3, we followed the split methodology outlined in<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Liu, Q., Hu, Z., Jiang, R. &amp; Zhou, M. Deepcdr: a hybrid graph convolutional network for predicting cancer drug response. 
Bioinformatics 36, i911&#x2013;i918 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR36\" id=\"ref-link-section-d295361478e1548\" rel=\"nofollow noopener\" target=\"_blank\">36<\/a>, reserving 5% of the data for the test set, stratified by TCGA<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 39\" title=\"Weinstein, J. N. et al. The Cancer Genome Atlas Pan-Cancer Analysis Project. Nat. Genet. 45, 1113&#x2013;1120 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR39\" id=\"ref-link-section-d295361478e1552\" rel=\"nofollow noopener\" target=\"_blank\">39<\/a> pathways associated with the cancer cell lines.<\/p>\n<p>During fine-tuning, we used only gene-expression profiles and SMILES representations of drugs, as shown in the example prompt in the Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S1<\/a>. Similar to the input format for cell type annotation, gene-expression profiles were provided as ranked lists of gene names based on their expression levels. For predicting continuous IC50 values, MAMMAL was utilized in regression mode, taking advantage of its built-in support for floating-point scalar predictions. Our model outperforms the current SOTA models for Cancer-Drug Response 1 and 2 (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>), achieving a 3.4% increase in Pearson correlation values. 
Additionally, it yields results comparable to the SOTA for the Cancer-Drug Response 3 benchmark, with a slight improvement of 0.5%.<\/p>\n<p>To further evaluate MAMMAL\u2019s predictive capability on novel compounds, we assessed drug response predictions for four drugs not present in the GDSC training data: Carfilzomib, Nintedanib, Infigratinib, and Vemurafenib. Tanimoto similarity analysis confirmed that three of these drugs (Carfilzomib, Nintedanib, and Infigratinib) have no structurally similar compounds in the training set (Tanimoto coefficient &lt;0.7), while Vemurafenib shares moderate similarity (0.82) with PLX-4720, a BRAF inhibitor present in GDSC. We performed experimental validation using the same assay protocol employed in GDSC: cell viability was measured using CellTiter-Glo following 72-hour drug incubation, and IC50 values were determined with Prism (GraphPad). The experimental measurements revealed a consistent potency ranking across all tested cell lines: Carfilzomib (most potent), followed by Nintedanib, Infigratinib, and Vemurafenib (least potent). MAMMAL predictions reproduced this exact ranking for the tested cell lines. When extended to all 805 cell lines in GDSC, the model preserved this relative ordering in approximately 90\u201395% of cases, suggesting that the predicted potency differences are largely cell line\u2013independent.<\/p>\n<p>Notably, Carfilzomib is a proteasome inhibitor approved exclusively for hematological malignancies (multiple myeloma), with limited efficacy in cells of solid tumors<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 40\" title=\"Kortuem, K. M. &amp; Stewart, A. K. Carfilzomib. Blood, J. Am. Soc. Hematol. 121, 893&#x2013;897 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR40\" id=\"ref-link-section-d295361478e1574\" rel=\"nofollow noopener\" target=\"_blank\">40<\/a>. 
The model\u2019s prediction of Carfilzomib as the most potent agent across diverse solid tumor cell lines aligns with our experimental observations and suggests potential broader applicability that warrants further investigation.<\/p>\n<p>Ab Infilling<\/p>\n<p>Antibodies are a family of proteins produced by the immune system to neutralize foreign antigens and are of particular interest due to their high specificity and strong binding to target molecules<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 41\" title=\"Hummer, A. M., Abanades, B. &amp; Deane, C. M. Advances in computational structure-based antibody design. Curr. Opin. Struct. Biol. 74, 102379 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR41\" id=\"ref-link-section-d295361478e1586\" rel=\"nofollow noopener\" target=\"_blank\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Chiu, M., Goulet, D., Teplyakov, A. &amp; Gilliland, G. Antibody structure and function: the basis for engineering therapeutics. Antibodies (Basel) 8 4), (2019).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR42\" id=\"ref-link-section-d295361478e1589\" rel=\"nofollow noopener\" target=\"_blank\">42<\/a>. These characteristics have made them a crucial class of therapeutics, driving significant research efforts into the design of new antibody-based drug candidates<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\" title=\"Lu, R.-M. et al. Development of therapeutic antibodies for the treatment of diseases. J. Biomed. Sci. 
27, 1&#x2013;30 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR7\" id=\"ref-link-section-d295361478e1593\" rel=\"nofollow noopener\" target=\"_blank\">7<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Basu, K., Green, E. M., Cheng, Y. &amp; Craik, C. S. Why recombinant antibodies - benefits and applications. Curr. Opin. Biotechnol. 60, 153&#x2013;158 (2019).\" href=\"#ref-CR43\" id=\"ref-link-section-d295361478e1596\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Carter, P. J. &amp; Lazar, G. A. Next generation antibody drugs: pursuit of the&#x2019;high-hanging fruit&#x2019;. Nat. Rev. Drug Discov. 17, 197&#x2013;223 (2018).\" href=\"#ref-CR44\" id=\"ref-link-section-d295361478e1596_1\">44<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Beck, A., Goetsch, L., Dumontet, C. &amp; Corva&#xED;a, N. Strategies and challenges for the next generation of antibody&#x2013;drug conjugates. Nat. Rev. Drug Discov. 16, 315&#x2013;337 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR45\" id=\"ref-link-section-d295361478e1599\" rel=\"nofollow noopener\" target=\"_blank\">45<\/a>. Antigen-binding fragments (Fabs) are the antibody fragments that bind to antigens. Each Fab is composed of one constant domain and one variable domain from each of the heavy and light chains. Each variable region is further divided into four framework (FR) regions and three complementarity-determining regions (CDRs). While FR regions are typically conserved, CDRs exhibit significant variation in their amino acid composition and are generally the primary determinants of binding affinity to the target antigen. 
When designing novel antibodies for a specific antigen, the typical approach is to explore alternative CDRs that could produce a new, functional antibody with high binding affinity to the target<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 41\" title=\"Hummer, A. M., Abanades, B. &amp; Deane, C. M. Advances in computational structure-based antibody design. Curr. Opin. Struct. Biol. 74, 102379 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR41\" id=\"ref-link-section-d295361478e1609\" rel=\"nofollow noopener\" target=\"_blank\">41<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Chiu, M., Goulet, D., Teplyakov, A. &amp; Gilliland, G. Antibody structure and function: the basis for engineering therapeutics. Antibodies (Basel) 8 4), (2019).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR42\" id=\"ref-link-section-d295361478e1612\" rel=\"nofollow noopener\" target=\"_blank\">42<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"Saka, K. et al. Antibody design using lstm based deep generative model from phage display library for affinity maturation. Sci. Rep. 11, 5852 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR46\" id=\"ref-link-section-d295361478e1615\" rel=\"nofollow noopener\" target=\"_blank\">46<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"Kong, X., Huang, W. &amp; Liu, Y. End-to-End Full-Atom Antibody Design. Proc. Intl. Conf. on Mach. Learn. 
(ICML), 202, 17409&#x2013;17429 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR47\" id=\"ref-link-section-d295361478e1618\" rel=\"nofollow noopener\" target=\"_blank\">47<\/a>.<\/p>\n<p>Recently, several deep learning methods have been developed for targeted antibody design, framing CDR prediction as an infilling task<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Saka, K. et al. Antibody design using lstm based deep generative model from phage display library for affinity maturation. Sci. Rep. 11, 5852 (2021).\" href=\"#ref-CR46\" id=\"ref-link-section-d295361478e1628\">46<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kong, X., Huang, W. &amp; Liu, Y. End-to-End Full-Atom Antibody Design. Proc. Intl. Conf. on Mach. Learn. (ICML), 202, 17409&#x2013;17429 (2023).\" href=\"#ref-CR47\" id=\"ref-link-section-d295361478e1628_1\">47<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Jin, W., Wohlwend, J., Barzilay, R. &amp; Jaakkola, T. Iterative refinement graph neural network for antibody sequence-structure co-design. arXiv preprint arXiv:2110.04624 (2021).\" href=\"#ref-CR48\" id=\"ref-link-section-d295361478e1628_2\">48<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Jin, W., Barzilay, R. &amp; Jaakkola, T. Antibody-Antigen Docking and Design via Hierarchical Structure Refinement. Proc. 39th Intl. Conf. on Mach. Learn. (ICML), 162, 10217&#x2013;10227 (PMLR, 2022).\" href=\"#ref-CR49\" id=\"ref-link-section-d295361478e1628_3\">49<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Luo, S. et al. 
Antigen-specific antibody design and optimization with diffusion-based generative models for protein structures. Adv. Neural Inf. Process. Syst. 35, 9754&#x2013;9767 (2022).\" href=\"#ref-CR50\" id=\"ref-link-section-d295361478e1628_4\">50<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kong, X., Huang, W. &amp; Liu, Y. Conditional antibody design as 3d equivariant graph translation. arXiv preprint arXiv:2208.06073 (2022).\" href=\"#ref-CR51\" id=\"ref-link-section-d295361478e1628_5\">51<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 52\" title=\"Zhou, X. et al. Antigen-specific antibody design via direct energy-based preference optimization. Adv. Neural Inform. Processing Syst. (NeuRIPS), 37, 120861&#x2013;120891 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR52\" id=\"ref-link-section-d295361478e1631\" rel=\"nofollow noopener\" target=\"_blank\">52<\/a>. These models predict missing CDR regions, represented by MASK tokens, using the amino acid sequences of both the antigen and the antibody\u2019s FR regions. While prior approaches often relied on structural data, this information is scarce and challenging to obtain<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 53\" title=\"Dunbar, J. et al. Sabdab: the structural antibody database. Nucleic acids Res. 42, D1140&#x2013;D1146 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR53\" id=\"ref-link-section-d295361478e1638\" rel=\"nofollow noopener\" target=\"_blank\">53<\/a>. 
In contrast, we fine-tune MAMMAL for the targeted antibody design task using only the sequence data of the antigen and the sequence of the antibody\u2019s FR regions.<\/p>\n<p>The benchmark for the targeted antibody design task is based on the SAbDab dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 53\" title=\"Dunbar, J. et al. Sabdab: the structural antibody database. Nucleic acids Res. 42, D1140&#x2013;D1146 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR53\" id=\"ref-link-section-d295361478e1645\" rel=\"nofollow noopener\" target=\"_blank\">53<\/a>. Following the data processing outlined in ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"Kong, X., Huang, W. &amp; Liu, Y. End-to-End Full-Atom Antibody Design. Proc. Intl. Conf. on Mach. Learn. (ICML), 202, 17409&#x2013;17429 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR47\" id=\"ref-link-section-d295361478e1649\" rel=\"nofollow noopener\" target=\"_blank\">47<\/a>, we filtered out samples with missing CDRs to enable direct comparison, even though MAMMAL supports samples that contain missing CDRs. Consistent with ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"Kong, X., Huang, W. &amp; Liu, Y. End-to-End Full-Atom Antibody Design. Proc. Intl. Conf. on Mach. Learn. (ICML), 202, 17409&#x2013;17429 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR47\" id=\"ref-link-section-d295361478e1653\" rel=\"nofollow noopener\" target=\"_blank\">47<\/a>, we randomly partitioned the dataset into training, validation, and test folds while ensuring that samples with similar heavy-chain third CDR (CDRH3) sub-sequences remained in the same fold. 
MAMMAL demonstrates superior amino acid recovery (AAR), defined as the fraction of correctly predicted residues, across all masked CDRs (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>; detailed results are provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S4<\/a>). Notably, in CDRH3, the most variable region, it exhibits a remarkable improvement of 19%.<\/p>\n<p>T-Cell Receptor-Epitope Binding<\/p>\n<p>T-cell receptor (TCR) binding to immunogenic peptides (epitopes) presented by major histocompatibility complex molecules is a critical mechanism in the adaptive immune system, essential for antigen recognition and triggering immune responses. The TCR repertoire exhibits considerable diversity, consisting of an \u03b1-chain and a \u03b2-chain that function together to enable T cells to recognize a wide array of epitopes. The \u03b2-chain is especially significant, as it is crucial for the early stages of T-cell development and possesses greater variability, which enhances the TCR\u2019s capacity to identify diverse pathogens effectively. However, understanding the specific interactions between TCRs and epitopes remains a significant challenge due to the vast variability in TCR sequences. Accurate prediction of TCR-peptide binding from sequence data would advance immunology by offering deeper insights into a patient\u2019s immune status and disease history. This capability holds potential applications in personalized immunotherapy, early diagnosis, and the treatment of diseases such as cancer and autoimmune disorders. 
In silico tools designed to model TCR-peptide interactions could facilitate the study of therapeutic T-cell efficacy and assess cross-reactivity risks, presenting an opportunity for precision medicine.<\/p>\n<p>We evaluated the model on the task of predicting TCR-epitope binding from sequence data using the Weber benchmark (ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 54\" title=\"Weber, A., Born, J. &amp; Rodriguez Mart&#xED;nez, M. Titan: T-cell receptor specificity prediction with bimodal attention networks. Bioinformatics 37, i237&#x2013;i244 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR54\" id=\"ref-link-section-d295361478e1683\" rel=\"nofollow noopener\" target=\"_blank\">54<\/a>, <a href=\"https:\/\/tdcommons.ai\/multi_pred_tasks\/tcrepitope\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/tdcommons.ai\/multi_pred_tasks\/tcrepitope<\/a>), which consists of 47,182 TCR \u03b2-chain epitope pairs. This dataset covers 192 distinct epitopes and includes 23,139 unique TCR \u03b2-chain sequences, with 50% of the pairs serving as negative samples created by randomly pairing TCR sequences with epitopes they are not known to bind with. The dataset also includes the CDR3 subsequence for each TCR \u03b2-chain, the most hypervariable region of the chain. We used 10-fold cross-validation. The folds were pre-defined in <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 54\" title=\"Weber, A., Born, J. &amp; Rodriguez Mart&#xED;nez, M. Titan: T-cell receptor specificity prediction with bimodal attention networks. Bioinformatics 37, i237&#x2013;i244 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR54\" id=\"ref-link-section-d295361478e1704\" rel=\"nofollow noopener\" target=\"_blank\">54<\/a>. 
Fine-tuning involved three concurrent tasks: TCR \u03b2-chain mask infilling and two classification tasks, namely (i) TCR \u03b2-chain epitope binding prediction and (ii) TCR \u03b2-chain CDR3 epitope binding prediction. Here, we report the performance only for the TCR \u03b2-chain epitope binding prediction task. Our model achieves an average AUROC of 0.879 (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>), representing a statistically significant improvement of 2% over the SOTA, as our result falls outside the SOTA\u2019s confidence interval.<\/p>\n<p>Protein-Protein Interaction &#8211; \u0394\u0394G Prediction<\/p>\n<p>An important factor in drug design is binding affinity, commonly measured by the equilibrium dissociation constant, KD, which is related to the Gibbs free energy \u0394G through the equation<\/p>\n<p>$$\\Delta G=kT\\,\\ln ({K}_{D}),$$<\/p>\n<p>\n                    (1)\n                <\/p>\n<p>where k is the Boltzmann constant and T is the absolute temperature<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 55\" title=\"Jankauskait&#x117;, J., Jim&#xE9;nez-Garc&#xED;a, B., Dapk&#x16B;nas, J., Fern&#xE1;ndez-Recio, J. &amp; Moal, I. H. Skempi 2.0: an updated benchmark of changes in protein&#x2013;protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 35, 462&#x2013;469 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR55\" id=\"ref-link-section-d295361478e1795\" rel=\"nofollow noopener\" target=\"_blank\">55<\/a>.<\/p>\n<p>The effect of introducing mutations into a protein\u2013protein complex is commonly quantified by the change in binding free energy relative to the reference (wild-type) complex. 
This mutation-induced effect is captured by the difference in Gibbs free energy, defined as<\/p>\n<p>$$\\Delta \\Delta G=\\Delta {G}_{{\\rm{mutant}}}-\\Delta {G}_{{\\rm{wild}}-{\\rm{type}}}.$$<\/p>\n<p>By subtracting the wild-type free energy, \u0394\u0394G isolates the energetic contribution of the mutation itself. As a result, \u0394\u0394G provides a direct measure of whether a mutation stabilizes or destabilizes binding and is a standard target in studies of mutational effects on protein\u2013protein interactions<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Liu, X., Feng, H., L&#xFC;, Z. &amp; Xia, K. Persistent tor-algebra for protein&#x2013;protein interaction analysis. Brief. Bioinforma. 24, bbad046 (2023).\" href=\"#ref-CR56\" id=\"ref-link-section-d295361478e1875\">56<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Wang, M., Cang, Z. &amp; Wei, G.-W. A topology-based network tree for the prediction of protein&#x2013;protein binding affinity changes following mutation. Nat. Mach. Intell. 2, 116&#x2013;123 (2020).\" href=\"#ref-CR57\" id=\"ref-link-section-d295361478e1875_1\">57<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 58\" title=\"Guo, Z. &amp; Yamaguchi, R. Machine learning methods for protein-protein binding affinity prediction in protein design. Front. Bioinforma. 
2, 1065703 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR58\" id=\"ref-link-section-d295361478e1878\" rel=\"nofollow noopener\" target=\"_blank\">58<\/a>.<\/p>\n<p>The SKEMPI dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 55\" title=\"Jankauskait&#x117;, J., Jim&#xE9;nez-Garc&#xED;a, B., Dapk&#x16B;nas, J., Fern&#xE1;ndez-Recio, J. &amp; Moal, I. H. Skempi 2.0: an updated benchmark of changes in protein&#x2013;protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 35, 462&#x2013;469 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR55\" id=\"ref-link-section-d295361478e1885\" rel=\"nofollow noopener\" target=\"_blank\">55<\/a> provides experimentally measured changes in thermodynamic parameters, including \u0394G and kinetic rate constants, for mutations in protein\u2013protein complexes with known structures in the Protein Data Bank<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 59\" title=\"Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235&#x2013;242 (2000).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR59\" id=\"ref-link-section-d295361478e1892\" rel=\"nofollow noopener\" target=\"_blank\">59<\/a>. This dataset is widely used to benchmark methods for predicting mutation-induced changes in binding affinity, particularly \u0394\u0394G. A commonly used subset of SKEMPI comprising 1131 single-point mutations (S1131) is adopted as our benchmark. Following standard practice, we report 10-fold cross-validation performance on this subset. The input for our model consists solely of amino acid sequences for the wild-type and mutant complexes, without structural information. 
Leveraging MAMMAL\u2019s support for continuous-valued outputs, we formulate \u0394\u0394G prediction as a regression task. Performance results are reported in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>. Our model achieves an average Pearson correlation of 0.852, substantially exceeding the previous sequence-only state of the art (0.663), and remains competitive with structure-based methods, falling only 1.6% short of the reported best performance of 0.866<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 56\" title=\"Liu, X., Feng, H., L&#xFC;, Z. &amp; Xia, K. Persistent tor-algebra for protein&#x2013;protein interaction analysis. Brief. Bioinforma. 24, bbad046 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR56\" id=\"ref-link-section-d295361478e1906\" rel=\"nofollow noopener\" target=\"_blank\">56<\/a>.<\/p>\n<p>Drug-Target Interaction<\/p>\n<p>Predicting drug-target binding affinity plays a crucial role in the early stages of drug discovery. Traditionally, binding affinities are measured through high-throughput screening experiments, which, while accurate, are resource-intensive and limited in their scalability to evaluate large sets of drug candidates. In this task, we focus on predicting binding affinities using pKD, the negative logarithm of the dissociation constant, which reflects the strength of the interaction between a small molecule (drug) and a protein (target). We utilize the PEER (Protein sEquence undERstanding) benchmark<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 60\" title=\"Xu, M. et al. Peer: a comprehensive and multi-task benchmark for protein sequence understanding. Adv. 
Neural Inf. Process. Syst. 35, 35156&#x2013;35173 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR60\" id=\"ref-link-section-d295361478e1924\" rel=\"nofollow noopener\" target=\"_blank\">60<\/a> for DTI prediction. This benchmark leverages data from the BindingDB dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Gilson, M. K. et al. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res 44, D1045&#x2013;53 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR61\" id=\"ref-link-section-d295361478e1928\" rel=\"nofollow noopener\" target=\"_blank\">61<\/a>, with a specific test split that holds out four protein classes &#8211; estrogen receptor, G-protein-coupled receptors, ion channels, and receptor tyrosine kinases &#8211; for assessing generalization performance on unseen classes.<\/p>\n<p>For model fine-tuning, we conducted hyperparameter optimization, selecting an initial learning rate of 0.0004, with no dropout and no weight decay. We standardized the pKD values based on the mean and standard deviation of the training set. For evaluation, we transformed the predicted values back to their original scale. Our model achieves an average NRMSE of 0.906 (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>), demonstrating a solid improvement of 3.8% over the SOTA reported by<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 60\" title=\"Xu, M. et al. Peer: a comprehensive and multi-task benchmark for protein sequence understanding. Adv. Neural Inf. Process. Syst. 
35, 35156&#x2013;35173 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR60\" id=\"ref-link-section-d295361478e1944\" rel=\"nofollow noopener\" target=\"_blank\">60<\/a>.<\/p>\n<p>Antibody-Antigen Binding Prediction<\/p>\n<p>Accurate prediction of antigen-antibody binding can enhance the design and optimization of therapeutic antibodies, leading to improved efficacy and specificity. We employ the human epidermal growth factor receptor 2 (HER2) dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 62\" title=\"Mason, D. M. et al. Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning. Nat. Biomed. Eng. 5, 600&#x2013;612 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR62\" id=\"ref-link-section-d295361478e1957\" rel=\"nofollow noopener\" target=\"_blank\">62<\/a> as a benchmark for predicting antibody-antigen binding. HER2 is a key target for certain types of breast and stomach cancers. The dataset includes variations of the clinically approved therapeutic antibody trastuzumab and their corresponding affinities for the HER2 antigen. The dataset comprises 8,935 binding and 25,114 non-binding trastuzumab CDR H3 mutants, each with up to 10 mutations, following de-duplication and the removal of samples labeled as both binding and non-binding.<\/p>\n<p>For the most accurate comparison with the SOTA (refs. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 62\" title=\"Mason, D. M. et al. Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning. Nat. Biomed. Eng. 
5, 600&#x2013;612 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR62\" id=\"ref-link-section-d295361478e1964\" rel=\"nofollow noopener\" target=\"_blank\">62<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 63\" title=\"Jing, H. et al. Accurate prediction of antibody function and structure using bio-inspired antibody language model. Brief. Bioinforma. 25, bbae245 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR63\" id=\"ref-link-section-d295361478e1967\" rel=\"nofollow noopener\" target=\"_blank\">63<\/a>), the HER2 dataset was divided into train (70%), validation (15%) and test (15%) sets. For increased robustness, the train set was further divided into 5 folds. The reported results are from the 5 models trained on different train folds, and evaluated on the test set.<\/p>\n<p>Finetuning involved feeding the target antigen sequence as well as the entire heavy-chain variable region as input and predicting binding to the target sequence. Our model achieves an average AUROC of 0.928 (Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>), slightly surpassing the SOTA, which incorporated structural data, unlike our model.<\/p>\n<p>Comparison of AlphaFold 3 and MAMMAL in Predicting Antibody-Antigen and Nanobody-Antigen Binding<\/p>\n<p>Accurate prediction of antibody-antigen and nanobody-antigen interactions is essential for evaluating therapeutic efficacy and guiding protein engineering. 
Although AlphaFold 3 (AF3) is not explicitly designed as a binary protein-protein interaction (PPI) classifier, binding likelihood can be inferred from structure-derived confidence scores, such as predicted template modeling (pTM) and interface predicted template modeling (ipTM), computed from predicted protein-protein complexes. These confidence scores are derived from structural predictions rather than classification objectives. Recent studies suggest that these scores correlate with true binding events<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 64\" title=\"Bennett, N. R. et al. Atomically accurate de novo design of antibodies with rfdiffusion. bioRxiv &#010;                  https:\/\/www.biorxiv.org\/content\/early\/2025\/02\/28\/2024.03.14.585103.full.pdf&#010;                  &#010;                 (2025).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR64\" id=\"ref-link-section-d295361478e1985\" rel=\"nofollow noopener\" target=\"_blank\">64<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 65\" title=\"Yin, R. &amp; Pierce, B. G. Evaluation of alphafold antibody-antigen modeling with implications for improving predictive accuracy. Protein Sci. 33, e4865 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR65\" id=\"ref-link-section-d295361478e1988\" rel=\"nofollow noopener\" target=\"_blank\">65<\/a>.<\/p>\n<p>Accordingly, we conduct an exploratory comparison between AF3-derived confidence scores and a fine-tuned MAMMAL model for distinguishing binders from non-binders. 
We emphasize that AF3 provides detailed 3D structural hypotheses, whereas MAMMAL is a sequence-only model that produces probabilistic binding predictions; the comparison is intended to assess relative discriminative power for binding prediction rather than to equate the underlying modeling approaches.<\/p>\n<p>We first evaluated the extracellular domain (ECD) of HER2, a well-characterized therapeutic antigen with experimentally validated binding epitopes. We used the HER2-specific MAMMAL model described in Subsection 2.9. Due to the computational demands and limited availability of AF3, the HER2 benchmark test set was downsampled to 60 examples, comprising 30 binders and 30 non-binders. The HER2-specific MAMMAL model demonstrates strong discriminative performance, achieving an AUROC of 0.88. In contrast, AF3 exhibits no meaningful separation between binders and non-binders (AUROC = 0.45), and the difference in performance between the two models is highly significant (DeLong test, P = 1.5 \u00d7 10<sup>\u22126<\/sup>). Structural analysis further reveals that AF3-predicted binding sites are indistinguishable between binders and non-binders and deviate from the known epitopes of the FDA-approved antibodies trastuzumab and pertuzumab (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Fig2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>a). An extended comparative analysis of MAMMAL (pre-trained and fine-tuned) and AF3 is provided in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S5<\/a>. 
This includes several AF3 confidence-score variants, using ipTM- and pTM-based scoring, with heavy-chain-only and heavy+light-chain input configurations.<\/p>\n<p>Next, we evaluated nanobody binding across six structurally diverse antigen targets: albumin, mannose receptor (CD206), epidermal growth factor receptor (EGFR), thyroxine-binding globulin (TBG), tumor necrosis factor alpha (TNF\u03b1), and von Willebrand factor (VWF). Binding nanobodies were collected from SAbDab-nano<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 66\" title=\"Schneider, C., Raybould, M. I. J. &amp; Deane, C. M. Sabdab in the age of biotherapeutics: updates including sabdab-nano, the nanobody structure tracker. Nucleic Acids Res. 50, D1368&#x2013;D1372 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#ref-CR66\" id=\"ref-link-section-d295361478e2015\" rel=\"nofollow noopener\" target=\"_blank\">66<\/a>, patents, and proprietary datasets. Non-binders consisted of nanobodies experimentally confirmed as non-binding in phage-display library screenings, as well as nanobodies targeting unrelated antigens. From a total of 668 nanobody\u2013antigen pairs (131 binders and 537 non-binders), we selected 475 pairs (64 binders and 411 non-binders) for MAMMAL fine-tuning and reserved 193 pairs (67 binders and 126 non-binders) for held-out evaluation. A single MAMMAL model was fine-tuned on training data comprising binders and non-binders across all targets, and subsequently evaluated separately on the test subset corresponding to each target, with performance compared against AF3 confidence scores computed on the same samples. 
Test set statistics and per-target MAMMAL and AF3 performance are summarized in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Tab2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>. Additional performance metrics for MAMMAL and AF3 are presented in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S6<\/a>\u2013<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S7<\/a>. As shown, MAMMAL significantly outperforms AF3 on the larger targets: albumin, CD206, EGFR, and VWF. In contrast, AF3 achieves superior performance on the smaller TBG target, and analysis of the predicted structures highlights distinct binding sites for binders versus non-binders (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#Fig2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>b and Supplementary Figure <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44386-026-00047-4#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S4<\/a>). 
For the smallest protein, TNF\u03b1, the two models exhibit comparable performance.<\/p>\n<p>Table 2 Per-target AUROC comparison between MAMMAL and AF3 on held-out antibody\/nanobody\u2013antigen test subsets<\/p>\n","protected":false},"excerpt":{"rendered":"To evaluate the performance and generalization capabilities of ibm\/biomed.omics.bl.sm.ma-ted-458m, we selected a diverse set of existing benchmarks spanning&hellip;\n","protected":false},"author":2,"featured_media":26402,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[24,25,7040,17633,1941,617,7042,17634],"class_list":{"0":"post-26401","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-ai","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-biomedicine","11":"tag-computational-biology-and-bioinformatics","12":"tag-drug-discovery","13":"tag-general","14":"tag-medicinal-chemistry","15":"tag-pharmaceutical-sciences-technology"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/26401","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=26401"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/26401\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/26402"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=26401"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?p
ost=26401"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=26401"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}