{"id":6889,"date":"2026-04-17T15:29:08","date_gmt":"2026-04-17T15:29:08","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/6889\/"},"modified":"2026-04-17T15:29:08","modified_gmt":"2026-04-17T15:29:08","slug":"this-ai-mines-the-numbers-buried-in-scientific-papers-and-turns-them-into-usable-data-fast","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/6889\/","title":{"rendered":"This AI mines the numbers buried in scientific papers and turns them into usable data fast"},"content":{"rendered":"<p>            <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/04\/ai-reads-science-new-s.jpg\" alt=\"AI reads science: New system understands numbers and correlations\" title=\"The Quinex framework, developed by researchers at J\u00fclich, is based on language models and automatically identifies numerical values in scientific publications, assigns them to appropriate units, and determines what was measured, when, where, and how. Credit: The Innovation (2026). DOI: 10.1016\/j.xinn.2026.101391\" width=\"800\" height=\"528\"\/><\/p>\n<p>                The Quinex framework, developed by researchers at J\u00fclich, is based on language models and automatically identifies numerical values in scientific publications, assigns them to appropriate units, and determines what was measured, when, where, and how. Credit: The Innovation (2026). DOI: 10.1016\/j.xinn.2026.101391<\/p>\n<p>Numbers are the language of science\u2014yet in research articles, they are often buried within the text and difficult to analyze. Researchers at J\u00fclich have developed an AI system that automatically identifies these numbers, categorizes them, and converts them into structured data. 
The Quinex framework thus eliminates the need for time-consuming manual work.<\/p>\n<p>Whether in energy, climate, or materials research, scientific papers are full of numbers or, more precisely, quantitative data: efficiencies, temperatures, costs, emissions. These are often crucial for improving models or identifying trends. At the same time, the number of scientific publications is growing rapidly. For many research questions, it is now virtually impossible to manually evaluate all relevant publications\u2014the time and resources required would be enormous.<\/p>\n<p>The Quinex (&#8220;Quantitative Information Extraction&#8221;) framework, developed by researchers at J\u00fclich, is based on language models and automates this process: Artificial intelligence identifies numerical values, assigns them to appropriate units, and recognizes what was measured, when, where, and how. Thus, a sentence like &#8220;Efficiency levels of 63 to 71 percent are assumed for 2025&#8221; is transformed into a structured dataset containing all relevant contextual information\u2014from the year and measurement method to the source.<\/p>\n<p>Open and efficient AI<\/p>\n<p>Unlike many proprietary AI solutions, Quinex is based entirely on open, relatively small, and thus efficient language models. These have been specifically trained to recognize and classify quantitative information in scientific texts. Compared to similar systems, Quinex delivers more precise results, captures contextual information in a more nuanced way, and also takes implicit characteristics into account.<\/p>\n<p>Despite its compact size, Quinex achieves an F1 score of about 98% for recognizing numbers and associated units, and approximately 87% and 82%, respectively, for classifying quantified properties and entities. 
These high scores were achieved through specially created training datasets and methodological improvements.<\/p>\n<p>&#8220;We wanted to develop a tool that is powerful, yet also transparent and resource-efficient,&#8221; explains Dr. Jann Weinand, head of the Integrated Scenarios Department at J\u00fclich System Analysis. &#8220;Quinex makes artificial intelligence more accessible for data analysis in science.&#8221;<\/p>\n<p>Successful practical test<\/p>\n<p>To test Quinex&#8217;s practical suitability, the system was applied to thousands of scientific abstracts from various fields. It successfully extracted data on electricity production costs for various energy technologies, on maximum oxygen uptake in humans, on earthquake magnitudes and locations, and on the band gaps of photovoltaic materials.<\/p>\n<p>The automatically derived values closely matched the respective reference data. This demonstrates that Quinex is well suited for analyzing large volumes of academic literature across a wide range of research fields and for deriving reliable trends from that literature.<\/p>\n<p>New perspectives for research<\/p>\n<p>&#8220;Language models open up new perspectives for science and help maintain an overview of entire research fields,&#8221; says lead author Jan G\u00f6pfert. &#8220;They enable automated literature searches, the creation of uniformly structured research databases, and trend analyses that reveal developments in science and technology at an early stage.&#8221;<\/p>\n<p>&#8220;Our goal is to relieve researchers of routine work,&#8221; says Dr. Patrick Kuckertz, head of the Research Data Management Group. 
&#8220;Quinex is designed to help them arrive at insights more quickly and manage the growing flood of data in science.&#8221;<\/p>\n<p>The study is <a href=\"https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S2666675826001384\" target=\"_blank\" rel=\"nofollow noopener\">published<\/a> in the journal The Innovation.<\/p>\n<p>Limitations and future improvements<\/p>\n<p>Quinex is not entirely error-free, but transparency is part of its design. &#8220;The system recognizes numbers and units very reliably,&#8221; says G\u00f6pfert. &#8220;Since they are taken directly from the text, they cannot be &#8216;hallucinated.&#8217; However, misinterpretations sometimes occur, for example, when important references are scattered throughout the text.&#8221;<\/p>\n<p>Thus, Quinex remains a tool that supports people but does not replace them. &#8220;We recommend using Quinex where it informs and relieves researchers\u2014but the responsibility for interpreting the results remains with them,&#8221; says G\u00f6pfert. Every recognized number can be traced back to its source and, where possible, is highlighted in the original text.<\/p>\n<p>The team is working to further develop Quinex with additional domain-specific datasets and models, making it even more efficient and flexible enough to adapt to various research requirements.<\/p>\n<p>Open collaboration welcome<\/p>\n<p>Forschungszentrum J\u00fclich is making Quinex available as an open-source project. 
This is intended to give researchers worldwide the opportunity to test, expand, and adapt the system to their own fields\u2014from energy research to chemistry and biomedicine.<\/p>\n<p>Quinex Open Source: <a href=\"https:\/\/github.com\/FZJ-IEK3-VSA\/quinex\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/go.fzj.de\/quinex<\/a><\/p>\n<p>More information<\/p>\n<p>Jan G\u00f6pfert et al., Quinex: Quantitative information extraction from text using open and lightweight LLMs, The Innovation (2026). <a data-doi=\"1\" href=\"https:\/\/dx.doi.org\/10.1016\/j.xinn.2026.101391\" target=\"_blank\" rel=\"nofollow noopener\">DOI: 10.1016\/j.xinn.2026.101391<\/a><\/p>\n<p>Provided by<br \/>\n<a href=\"https:\/\/techxplore.com\/partners\/j--lich-research-centre\/\" rel=\"nofollow noopener\" target=\"_blank\">J\u00fclich Research Centre<\/a><\/p>\n<p>Citation:<br \/>\nThis AI mines the numbers buried in scientific papers and turns them into usable data fast (2026, April 17)<br \/>\nretrieved 17 April 2026<br \/>\nfrom https:\/\/techxplore.com\/news\/2026-04-ai-scientific-papers-usable-fast.html<\/p>\n<p>This document is subject to copyright. 
Apart from any fair dealing for the purpose of private study or research, no<br \/>\n                                            part may be reproduced without the written permission. The content is provided for information purposes only.\n                                            <\/p>\n","protected":false},"excerpt":{"rendered":"The Quinex framework, developed by researchers at J\u00fclich, is based on language models and automatically identifies numerical values&hellip;\n","protected":false},"author":2,"featured_media":6890,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[24,25,2116,2113,2114,963,684,2115],"class_list":{"0":"post-6889","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-ai","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-computer-news","11":"tag-hi-tech-news","12":"tag-hitech","13":"tag-information-technology","14":"tag-innovation","15":"tag-inventions"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/6889","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=6889"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/6889\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/6890"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=6889"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=688
9"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=6889"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}