{"id":188356,"date":"2025-08-31T01:33:18","date_gmt":"2025-08-31T01:33:18","guid":{"rendered":"https:\/\/www.europesays.com\/us\/188356\/"},"modified":"2025-08-31T01:33:18","modified_gmt":"2025-08-31T01:33:18","slug":"machine-learning-climbs-the-jacobs-ladder-of-optoelectronic-properties","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/us\/188356\/","title":{"rendered":"Machine learning climbs the Jacob\u2019s Ladder of optoelectronic properties"},"content":{"rendered":"<p>We have calculated RPA spectra for all materials in the IPA database<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Dielectric function of semiconductors and insulators under the Independent Particle Approximation. figshare &#010;                  https:\/\/doi.org\/10.6084\/m9.figshare.27440652.v1&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR24\" id=\"ref-link-section-d365797778e1596\" target=\"_blank\" rel=\"noopener\">24<\/a> that have at most 8 atoms in the unit cell, i.e., a total of about 6000 spectra. The structures were originally taken from the Alexandria database of theoretically stable materials<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 25\" title=\"Schmidt, J. et al. Machine learning assisted determination of the global zero temperature phase diagram of materials. Adv. Mater. 35, 2210788 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR25\" id=\"ref-link-section-d365797778e1600\" target=\"_blank\" rel=\"noopener\">25<\/a>. They include semiconducting and insulating compounds that contain only main group elements from the first five rows of the periodic table, have a distance from the convex hull less than 50\u2009meV times the number of atoms in the unit cell, and a band gap greater than 500\u2009meV. The data spans a wide range of crystalline systems, covering, e.g., common binary and ternary semiconductors and insulators, but also more exotic compounds such as layered two-dimensional systems and noble gas solids. A visualization of the elemental diversity of the data is given in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">3<\/a> and Supplementary Note\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">3<\/a>. Additional computational details for the ab initio calculations are given in the \u201cMethods\u201d section. For the generated data, see the \u201cData availability\u201d section.<\/p>\n<p>For consistency, we employ the OPTIMATE architecture of GATs as in Ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e1613\" target=\"_blank\" rel=\"noopener\">18<\/a>. The model is sketched in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">2<\/a> and takes as input a crystal structure and outputs a frequency-dependent optical property. An input crystal structure is first converted into a multigraph with the corresponding element encoded on each node and distances between atoms encoded on each edge. The model itself consists of a multilayer perceptron which acts on each node separately to process the elemental information, followed by three layers of message passing using the improved Graph Attention Operator of Ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Veli&#x10D;kovi&#x107;, P. et al. Graph attention networks. arXiv &#010;                  https:\/\/doi.org\/10.48550\/arXiv.1710.10903&#010;                  &#010;                 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR23\" id=\"ref-link-section-d365797778e1620\" target=\"_blank\" rel=\"noopener\">23<\/a>. Node-level information is then pooled using softmax vector attention to obtain a single high-dimensional vector which characterizes the material<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e1624\" target=\"_blank\" rel=\"noopener\">18<\/a>. This vector is then passed through another multilayer perceptron to obtain the target optical property. The optical property itself is represented as a 2001-element vector, which corresponds to the optical property sampled in 10\u2009meV steps from 0\u2009eV to 20\u2009eV. We refer to the original publication<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e1628\" target=\"_blank\" rel=\"noopener\">18<\/a> for a more detailed description of its architecture.<\/p>\n<p>In this work, we focus on models trained to predict the imaginary part of the trace of the frequency-dependent dielectric tensor \u03b5(\u03c9). The imaginary part of the dielectric function determines together with the refractive index n the absorption spectra \u03b1(\u03c9) via<\/p>\n<p>$$\\alpha (\\omega )=\\frac{\\omega }{c}\\frac{{{\\rm{Im}}}(\\varepsilon (\\omega ))}{{{\\rm{Re}}}(n(\\omega ))}$$<\/p>\n<p>\n                    (5)\n                <\/p>\n<p>where in general \u03b1, n and \u03b5 are all tensorial properties. The trace of these tensorial properties is an estimation of their polycrystalline average and allows for an easier handling of them<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 26\" title=\"Carri&#xE7;o, P. J. M. A., Ferreira, M., Cerqueira, T. F. T., Nogueira, F. &amp; Borlido, P. High-refractive-index materials screening from machine learning and ab initio methods. Phys. Rev. Mater. 8, 015201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR26\" id=\"ref-link-section-d365797778e1770\" target=\"_blank\" rel=\"noopener\">26<\/a>. We focus specifically on the dielectric function instead of, e.g., the refractive index or the absorption spectrum, as it is the direct result of the ab initio calculations discussed above, and is also usually reported in optical characterizations of materials, e.g., using ellipsometry.<\/p>\n<p>We consider and compare two different strategies for learning the optical properties of materials at the RPA level, namely direct learning (DL) and transfer learning (TL). For DL, one initializes the weights of OPTIMATE randomly and trains directly on the expensive high-level optical spectra, i.e., the RPA spectra in our case. For TL, one trains OPTIMATE first on a large dataset of optical spectra of a lower theory level, i.e., in our case IPA, and then continues the training on a small number of optical spectra of a higher theory level, i.e., data calculated in the RPA. Since the RPA spectra in this work are calculated with a 300\u2009meV broadening, we use the corresponding IPA-OPTIMATE model, i.e., the one trained on the trace of the imaginary part of the dielectric function under IPA, \\({{\\rm{Im}}}({\\overline{\\varepsilon }}_{{{\\rm{IPA}}}})\\), with a 300\u2009meV broadening. In this process, we retrain all learnable parameters of the base model<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e1828\" target=\"_blank\" rel=\"noopener\">18<\/a> starting from the IPA weights, because, in contrast to many other applications involving, e.g., large language models, the additional cost is negligible, and recent publications<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Hoffmann, N., Schmidt, J., Botti, S. &amp; Marques, M. A. L. Transfer learning on large datasets for the accurate prediction of material properties. Digit. Discov. 2, 1368&#x2013;1379 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR21\" id=\"ref-link-section-d365797778e1832\" target=\"_blank\" rel=\"noopener\">21<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 27\" title=\"Devi, R., Butler, K. T. &amp; Sai Gautam, G. Optimal pre-train\/fine-tune strategies for accurate material property predictions. npj Comput. Mater. 10, 300 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR27\" id=\"ref-link-section-d365797778e1835\" target=\"_blank\" rel=\"noopener\">27<\/a> have shown that this is generally the best strategy for transfer learning graph neural networks for materials science, even when transfer learning between closely related properties. To avoid data leakage, we reuse the training, validation, and test sets used in the original training of OPTIMATE<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e1839\" target=\"_blank\" rel=\"noopener\">18<\/a>, leaving us with a total training set of 4610 materials and associated spectra, a validation set of 603 materials and a test set of 639 materials. This ensures that the transfer-learned OPTIMATE model has not seen any of the materials in the RPA validation and test set during its original training on IPA data. To compare the data efficiency of the DL and TL, we divided the training set into randomly formed subsets of 100, 300, 1000, 3000, and 4610, i.e., the entire training set. To ensure a fair comparison, we optimize both the architecture and the hyperparameters of the OPTIMATE models used for the DL strategy. The architecture of the OPTIMATE models used for the TL strategy is necessarily kept fixed, so only the hyperparameters of the optimizer are tuned. The training procedure, the selection of the final models, and their hyperparameters are described in more detail in the \u201cMethods\u201d section.<\/p>\n<p>As error metric to compare two spectral properties X(\u03c9) and Y(\u03c9) in a scale-invariant way, we define the similarity coefficient (SC)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 28\" title=\"Biswas, T. &amp; Singh, A. K. pyGWBSE: a high throughput workflow package for GW-BSE calculations. npj Comput. Mater. 9, 22 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR28\" id=\"ref-link-section-d365797778e1859\" target=\"_blank\" rel=\"noopener\">28<\/a><\/p>\n<p>$${{\\rm{SC}}}[X(\\omega );Y(\\omega )]=1-\\frac{\\int\\left\\vert \\right.X(\\omega )-Y(\\omega )\\left\\vert \\right.\\,d\\omega }{\\int\\left\\vert \\right.Y(\\omega )\\left\\vert \\right.\\,d\\omega }$$<\/p>\n<p>\n                    (6)\n                <\/p>\n<p>In addition, we use the mean square error (MSE)<\/p>\n<p>$${{\\rm{MSE}}}[X(\\omega );Y(\\omega )]=\\frac{1}{{\\omega }_{{\\max }}}\\int|X(\\omega )-Y(\\omega ){|}^{2}\\,d\\omega$$<\/p>\n<p>\n                    (7)\n                <\/p>\n<p>as another measure of similarity between spectra. During training, we use the L1 loss, outside of machine learning commonly referred to as the mean absolute error (MAE):<\/p>\n<p>$${{\\rm{MAE}}}[X(\\omega );Y(\\omega )]=\\frac{1}{{\\omega }_{{\\max }}}\\int\\left\\vert \\right.X(\\omega )-Y(\\omega )\\left\\vert \\right.\\,d\\omega$$<\/p>\n<p>\n                    (8)\n                <\/p>\n<p>We use a shorthand, e.g., SC[RPADFT;\u00a0IPADFT] to refer to \\({{\\rm{SC}}}[\\overline{\\varepsilon }{(\\omega )}_{{{\\rm{DFT}}}}^{{{\\rm{RPA}}}};\\overline{\\varepsilon }{(\\omega )}_{{{\\rm{DFT}}}}^{{{\\rm{IPA}}}}]\\), i.e., in this example, the SC between \\(\\overline{\\varepsilon }\\) obtained via DFT using the RPA in reference to \\(\\overline{\\varepsilon }\\) obtained via DFT using the IPA.<\/p>\n<p>Comparison of direct learning and transfer learning<\/p>\n<p>The main results of this work are summarized in Figs.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a>, which compare both learning strategies for various training-set sizes. The examples in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a> show that DL on a small dataset of 300 materials leads to noisy, unphysical spectra consisting mostly of a single broad peak (which, perhaps surprisingly, is often more or less correctly positioned). In contrast, the TL strategy results even for small trainings sets already in realistic spectra, showing quite good quantitative agreement. For larger training set sizes, e.g., for 4610 training materials as shown, both strategies lead to a similar prediction accuracy. This is shown in more detail in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a>, where the median SC, Eq. (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Equ6\" target=\"_blank\" rel=\"noopener\">6<\/a>), and the median MSE, Eq. (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Equ7\" target=\"_blank\" rel=\"noopener\">7<\/a>), of the test set for various training set sizes are compared between both strategies. Both error measures are similar for both strategies when the training set is large enough, but the transfer learning strategy already achieves similar error measurements with just 300 materials as the direct learning strategy with about 3000 materials. All models perform better than the possible baseline of just using the IPA spectra. Numerical values for the IPA baseline, i.e., \\({G}_{\\max }=0\\), as well as for the non-converged \\({G}_{\\max }=2000\\) mRy are given in Supplementary Table\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">4<\/a>. Parity plots for both strategies trained on 300 and 4610 materials are shown in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">4<\/a> and Supplementary Note\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">4<\/a> and yield similar results.<\/p>\n<p><b id=\"Fig2\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 2: Spectra as predicted via DL and TL after training on datasets of different sizes.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/2\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig2\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig2_HTML.png\" alt=\"figure 2\" loading=\"lazy\" width=\"685\" height=\"299\"\/><\/a><\/p>\n<p>Spectra of nine materials selected at the 10% quantile, the 20% quantile, &#8230;, the 80% quantile and the 90% quantile of the test set ordered according to the similarity coefficient SC[IPADFT;\u00a0RPADFT]. All calculated or predicted spectra were obtained for a larger energy range of 0\u201320\u2009eV, but are shown for a smaller energy range to emphasize the differences. All IPA spectra (thin black) differ considerably from the RPA spectra (thick black line). The latter are reasonably well predicted via transfer learning (orange) on large (thick dashed lines, 4610 materials) or even small (thin lines, 300 materials) trainings sets, whereas direct learning (green) on 300 materials yields unphysical, overly broad and very noisy spectra. Much larger training sets would be required for good and physically sensible results with direct learning. N in the legend indicates how many materials were used for training.<\/p>\n<p><b id=\"Fig3\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 3: Comparison between DL and TL for different training set sizes.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/3\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig3\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig3_HTML.png\" alt=\"figure 3\" loading=\"lazy\" width=\"685\" height=\"1487\"\/><\/a><\/p>\n<p>Median SC (top) and median MSE (middle) on the test set for direct learning (green) and transfer learning on subsets of different size of the training set (orange) or exclusively on systems with small unit cells (blue) as a function of the training-set size (indicated by the stars). Using the DFT-calculated IPA spectra as a proxy for the RPA spectra (taken over the entire test set) is given as a baseline in the black dashed line. The small number next to each data point on the blue curve indicates the maximum number of atoms per primitive unit cell. Bottom: Boxplot of the distribution of SC[RPAML;\u00a0RPADFT] on the full test set after transfer learning only on materials with up to 4 atoms per primitive unit cell. The test set is split up by number of atoms per primitive unit cell. The blue center line of the boxplot shows the median, the boxes show the interquartile range, and the whiskers extend out from the box to the furthest point lying within 1.5 times the interquartile range. As can be seen, transfer learning only on materials with few atoms per unit cell generalizes to materials with more atoms per unit cell. The number of materials in the test set with a given number of atoms per primitive cell is given by the background histogram. No materials with 1 and only very few materials with 2 atoms per primitive unit cell are present in the test set.<\/p>\n<p>In short, transfer learning allows accurate quantitative prediction of spectra with a training set of only the order of 100\u20131000 materials. Furthermore, we believe that the observed convergence in model performance between both strategies is likely due to the RPA training set approaching the size of the initial IPA training set for the model, as it is often observed in TL that as the underlying training set increases, TL accuracies also improve<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Hoffmann, N., Schmidt, J., Botti, S. &amp; Marques, M. A. L. Transfer learning on large datasets for the accurate prediction of material properties. Digit. Discov. 2, 1368&#x2013;1379 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR21\" id=\"ref-link-section-d365797778e2640\" target=\"_blank\" rel=\"noopener\">21<\/a>.<\/p>\n<p>Hitherto, we have seen that a few (about 300) numerically more expensive RPA calculations are sufficient to retrain the OPTIMATE IPA model. Now, we turn to the question of whether we can pick particularly cheap examples for the TL strategy. So far, the data used for transfer learning consisted of random subsets of the data used to train the underlying model. As discussed in the introduction, almost all computational techniques scale unfavorably with the system size (often \\({{\\mathcal{O}}}({N}^{3})\\) or worse), as opposed to graph neural networks, which generally scale as \\({{\\mathcal{O}}}(N)\\), with N being the number of atoms in the unit cell. If instead of using a random subset of the original training data one could instead only use data from small unit cells, one might significantly reduce computational time. We therefore transfer learn the IPA model on a subset of the training data consisting of only the materials with at most 2, 3, 4, 5 and 6 atoms per primitive unit cell (using the optimal hyperparameters for a training set size of 1000). The results are shown as blue lines in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a> and they are impressive: Even with TL on materials which only contain up to 2 atoms per primitive unit cell, which are only around 60 materials in total, a median SC of 0.8 is achieved. With only up to 4 atoms per primitive unit cell, i.e., around 1500 materials, median MSEs below 0.1 and SC above 0.85 are achieved.<\/p>\n<p>One could speculate that learning on small cells primarily improves the performance on other small cells, while not improving the errors on larger cells. But that is not the case: Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a> (bottom) shows the SC for the materials in the test set grouped according the number of atoms per primitive unit cell, evaluated for the model trained on cells with up to 4 atoms per primitive unit cell. The error for all unit cell sizes is equally small, i.e., even when transfer learning only on materials with small cells, the properties of materials with larger cells can be predicted equally as well. If this were a general result, it would be very encouraging for related future work, e.g., climbing further rungs towards accurate prediction of experimental spectra.<\/p>\n<p>Deeper investigation into transfer learning<\/p>\n<p>We further investigate the success of transfer learning. Figure\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig4\" target=\"_blank\" rel=\"noopener\">4<\/a> quantifies the observations that are suggested by the illustrative examples of Figs.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a>. We start with the one-dimensional histograms marking rows and columns. The most important observation is that the TL-RPA predicts the ab initio DFT-RPA (see panel b) about as well as the underlying ML-IPA model<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e2747\" target=\"_blank\" rel=\"noopener\">18<\/a> predicts the DFT-IPA (panel d). Even the poorly trained DL prediction (a) is not worse than the possible baseline of simply using the DFT-calculated IPA spectrum (c), see also Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a>.<\/p>\n<p><b id=\"Fig4\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 4: Correlations between various similarity coefficients on the test set after training on 300 materials.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/4\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig4\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig4_HTML.png\" alt=\"figure 4\" loading=\"lazy\" width=\"685\" height=\"606\"\/><\/a><\/p>\n<p>The columns refer to SC[RPADL;RPADFT] <b>a<\/b> and SC[RPATL;RPADFT] <b>b<\/b>, while the rows refer to SC[IPADFT;RPADFT] <b>c<\/b>, SC[IPAML;IPADFT] <b>d<\/b>. The histogram in the first row shows that the ab initio IPA spectra are only a rough approximation of the ab initio RPA spectra. The histogram in the second row reproduces the observation of Ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e2797\" target=\"_blank\" rel=\"noopener\">18<\/a> that very good ML predictions of the DFT-IPA spectra are possible. The histograms in the top panels show that TL predicts the ab initio RPA spectra much better than DL. The two-dimensional histograms in the center show the correlations between the SCs defining columns and rows. All SCs are evaluated on the test set. The dashed orange line marks the angle bisector.<\/p>\n<p>The first row of the two-dimensional histograms in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig4\" target=\"_blank\" rel=\"noopener\">4<\/a> addresses the question of whether the quality of ML prediction on the RPA level depends on how much the ab initio RPA and IPA spectra actually differ, in other words, whether the ML predictions are worse for difficult cases with small SC[IPADFT;RPADFT]. Obviously, there is no (strong) correlation, the models predict RPA spectra more or less well, regardless of how much the RPA changes the IPA spectra. While this may be obvious for the direct learning strategy (panel ac), this is far from obvious for the transfer learning strategy (panel bc), which starts training from a model which predicts the IPA spectra.<\/p>\n<p>Examining the bottom row of two-dimensional histograms in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig4\" target=\"_blank\" rel=\"noopener\">4<\/a> reveals an even more interesting fact: There is a correlation between whether the models can predict the RPA spectra and whether another model can predict the IPA spectra. Most peculiarly, the numerical values of the corresponding SCs are almost the same when transfer learning already on small datasets (panel bd). This means that the RPA spectra for a given material can be predicted just as well as the IPA spectra. This correlation is made even more clearer when training on 4610 materials, and the same agreement is recovered for DL on such dataset sizes, see Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">1<\/a>. One might have expected\u2014wrongly, as it turns out\u2014that the RPA spectra would be more difficult to predict than the IPA spectra, since they include additional physical effects.<\/p>\n<p>From a machine-learning point-of-view, the fact that RPA and IPA spectra are equally well predictable might not be surprising, by\u2014perhaps overly reductionistly\u2014assuming the models treat the RPA spectra compared to the IPA spectra as just another curve defined in a learned materials space. As we have shown recently, the models construct expressive internal representations of the material space<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Discovery of sustainable energy materials via the machine learned material space. Small 2412519 (2025).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR29\" id=\"ref-link-section-d365797778e2831\" target=\"_blank\" rel=\"noopener\">29<\/a>. Therefore, we propose that the models reconstruct very general features of the material space (similarity between materials in general) contained in the training set before converting them to a specific property (in this case the RPA dielectric function).<\/p>\n<p>To investigate this further, we compare in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig5\" target=\"_blank\" rel=\"noopener\">5<\/a> how well each material is predicted by both strategies for various training set sizes. Again, TL performs much better than the DL strategy for the small training set. Both strategies perform similarly well for the large training set, where an almost linear correlation exists, with possibly a slight advantage for DL. Thus, not only is the predictability between the IPA spectra and the RPA spectra strongly correlated, but the predictability of the RPA spectra for a given material is also approximately independent of the training strategy employed (once the training set is large enough).<\/p>\n<p><b id=\"Fig5\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 5: Comparison between TL and DL for different training set sizes.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/5\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig5\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig5_HTML.png\" alt=\"figure 5\" loading=\"lazy\" width=\"685\" height=\"290\"\/><\/a><\/p>\n<p>2D-Histograms comparing the similarity coefficients SC[RPAS;\u00a0\u2009RPADFT], S\u2009=\u2009DL,\u00a0TL, on the test set for both training strategies after training on 300 (left) and 4610 (right) materials. The orange line marks the angle bisector.<\/p>\n<p>Learning the similarity<\/p>\n<p>Finally, we investigate whether it is possible to directly predict the similarity between the DFT-IPA spectra and the DFT-RPA spectra. This could be useful, e.g., when deciding which level of theory is necessary to model a given material. For this purpose, we train a model with the same architecture as OPTIMATE<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Deep learning of spectra: predicting the dielectric function of semiconductors. Phys. Rev. Mater. 8, L122201 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR18\" id=\"ref-link-section-d365797778e2879\" target=\"_blank\" rel=\"noopener\">18<\/a>, except that we change the output dimension of the final output layer from a dimension of 2001 to 1, i.e., the dimension of the SC (see Supplementary Note\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#MOESM1\" target=\"_blank\" rel=\"noopener\">5<\/a> for training details). The results are shown in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig6\" target=\"_blank\" rel=\"noopener\">6<\/a>. As can be seen, the similarity coefficient between the IPA and the RPA spectra can be predicted extremely well, with a mean and median absolute error of 0.026 and 0.018, respectively. This bodes well for further studies. One might, e.g., design a high-throughput workflow which incorporates such models to effectively select the highest level of theory necessary to describe the properties for a given material, saving valuable computational resources.<\/p>\n<p><b id=\"Fig6\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 6: Predicting the similarity between the\u00a0RPA and\u00a0the IPA.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/6\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig6\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig6_HTML.png\" alt=\"figure 6\" loading=\"lazy\" width=\"685\" height=\"660\"\/><\/a><\/p>\n<p>Prediction of SC[RPADFT;\u00a0\u2009IPADFT] on the test set after training on the entire training set. The straight line marks the angle bisector.<\/p>\n<p>Which materials are well described by\u00a0the IPA already?<\/p>\n<p>The database presented in this work is the largest database of well-converged RPA calculations that we are aware of. Apart from being of direct practical use, one might hope to glean physical insight from such a database, for example by asking \u201cWhich (classes of) materials are well described by the IPA already\u201d, or framed in another way, \u201cWhich materials possess very strong local-field effects\u201d. From a theoretical point of view, the IPA is in essence a maximal truncation of a basis set (see Eq. (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Equ2\" target=\"_blank\" rel=\"noopener\">2<\/a>)), and treats the susceptibility as being constant over the entire unit cell<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 3\" title=\"Onida, G., Reining, L. &amp; Rubio, A. Electronic excitations: density-functional versus many-body Green&#x2019;s-function approaches. Rev. Mod. Phys. 74, 601&#x2013;659 (2002).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR3\" id=\"ref-link-section-d365797778e2924\" target=\"_blank\" rel=\"noopener\">3<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Sottile, F. et al. TDDFT from molecules to solids: the role of long range interactions. Int. J. Quantum Chem. 102, 684&#x2013;701 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR22\" id=\"ref-link-section-d365797778e2927\" target=\"_blank\" rel=\"noopener\">22<\/a>. How correct this is obviously varies from material to material, and one might attempt to correlate it with the anisotropy of the electron density, for example. We instead make use of our descriptor SC[RPADFT;IPADFT] for an initial attempt to answer this question. Manually inspecting the materials with the lowest similarity SC[RPADFT;IPADFT] (see \u201cCode availability\u201d statement) reveals that most of these materials are ones that either consist of condensed weakly bound clusters or molecules (e.g., S8-rings, N2O, CO2, N2) or materials including many halogen or nitrogen atoms. The response of the former class is spatially inhomogeneous more or less by definition. Intuitively, strong local-field effects of the latter class might be explained by its very electronegative elements, which lead to unequal charge distributions and anisotropic screening. However, an explanation of low SC[RPADFT;IPADFT] values based on electronegativity alone suggests small SC values for oxides as well, which is not clearly supported by the data. Direct inspection of the materials with the largest SC[RPADFT;IPADFT] did not show any obvious material clusters.<\/p>\n<p>We therefore proceed with an alternative approach. We extract the latent activations for the SC[RPADFT;IPADFT]-prediction network after the pooling step and reduce them to two dimensions using the UMAP algorithm<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 30\" title=\"McInnes, L., Healy, J. &amp; Melville, J. UMAP: Uniform Manifold Approximation and Projection for dimension reduction. arXiv &#010;                  https:\/\/arxiv.org\/abs\/1802.03426&#010;                  &#010;                 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR30\" id=\"ref-link-section-d365797778e2964\" target=\"_blank\" rel=\"noopener\">30<\/a> (for details, see \u201cMethods\u201d section). As we have previously shown, a UMAP after the pooling step (on our specific architecture) generally clusters according to chemical principles and can be interpreted as a map of the chemical space from the point of view of the trained model<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Grunert, M., Gro&#xDF;mann, M. &amp; Runge, E. Discovery of sustainable energy materials via the machine learned material space. Small 2412519 (2025).\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#ref-CR29\" id=\"ref-link-section-d365797778e2968\" target=\"_blank\" rel=\"noopener\">29<\/a>. The resulting UMAP is shown in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41467-025-63355-9#Fig7\" target=\"_blank\" rel=\"noopener\">7<\/a>, and one can clearly see that the materials with extremely low similarity cluster (materials with many halogen atoms to the right, azides, nitrates etc. to the lower left), while materials with average or high similarity are more spread out. The identity of each dot (i.e., each material) is identified via an interactive version of this UMAP with which one can investigate where each material is placed (see \u201cData availability\u201d section).<\/p>\n<p><b id=\"Fig7\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 7: UMAP of latent embeddings of SC-prediction model, colored by true SC.<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41467-025-63355-9\/figures\/7\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig7\" src=\"https:\/\/www.europesays.com\/us\/wp-content\/uploads\/2025\/08\/41467_2025_63355_Fig7_HTML.png\" alt=\"figure 7\" loading=\"lazy\" width=\"685\" height=\"339\"\/><\/a><\/p>\n<p>Each dot represents a material. The color and size of each dot are related to the SC, with larger\/darker dots showing less similarity between RPA and IPA. There are two distinct clusters of very low similarity. The right purple cluster consists of compounds with extreme amounts of halogens (e.g., KF3, RbBrF6, XeF2,\u00a0etc.), while the purple cluster closer to the lower left corner consists mostly of azides, nitrates and other nitrogen-containing compounds (e.g., NaN3, LiNO3, N2, etc.).<\/p>\n","protected":false},"excerpt":{"rendered":"We have calculated RPA spectra for all materials in the IPA database24 that have at most 8 atoms&hellip;\n","protected":false},"author":3,"featured_media":188357,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[691,738,10046,70935,10047,34584,159,1593,158,67,132,68],"class_list":{"0":"post-188356","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-humanities-and-social-sciences","11":"tag-materials-for-optics","12":"tag-multidisciplinary","13":"tag-optical-materials-and-structures","14":"tag-science","15":"tag-semiconductors","16":"tag-technology","17":"tag-united-states","18":"tag-unitedstates","19":"tag-us"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@us\/115120806617223364","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts\/188356","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/comments?post=188356"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts\/188356\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/media\/188357"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/media?parent=188356"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/categories?post=188356"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/tags?post=188356"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}