Rise of the agents

In this scenario, agentic AI becomes widespread within the next few years, fundamentally changing how professionals work. Agentic AI is where artificial agents use natural language interfaces to execute sequences of actions on users' behalf42, raising questions about the agents' alignment with human priorities, the influence of the agents on their human users, and the potential need for regulatory agents to monitor other agents43.

Literature search

As these agents become ubiquitous across academia, research, government, and the private sector, there will be implications for literature search—agents may collaborate to scout the literature. Agents could be embedded within the IPCC process, continuously screening the literature. Strict bounds for automated inclusion, such as confining agents to known databases for peer-reviewed literature, could make it challenging to include “gray” (i.e., not peer-reviewed) literature or Indigenous Knowledge. Conversely, expanding the scope to include gray literature, where AI could help break language barriers and enable cross-lingual search, would place a significant verification burden on authors, given the mixed quality of such sources.

Synthesis and assessment

Another set of agents could synthesize the literature and update draft chapters, including generating data visualizations from the underlying data in cited papers, whilst maintaining a detailed log that authors can verify against established benchmarks and that expert reviewers and the public can inspect for transparency.

Communication

IPCC reports must be optimized not just for human readers, but also for AI consumption and processing. Consequently, much of the assessment and its communication transforms into agent-to-agent information exchange, where content is accessed, interpreted, and transmitted through AI intermediaries (i.e., Large Language Models, LLMs), in addition to humans. This requires the IPCC to consider how its assessments can be structured for optimal parsing and synthesis by both LLMs and humans.

A critical concern emerges under such a scenario: what happens when users extract IPCC text and process it through chatbots for interpretation? Such practices are already occurring44,45. Should the IPCC pre-empt potentially problematic third-party interpretations by providing “official” chatbot access? On the other hand, having a record of agentic decisions could provide traceability in the IPCC’s reasoning that might enhance its credibility for some. Still, there will continue to be inherent limitations in LLMs used for communicating scientific reports2. These include probabilistic pattern generation that can erode scientific nuance, stochastic outputs that challenge reproducibility, static parametric knowledge that may become outdated unless augmented with new knowledge retrieval (i.e., Retrieval-Augmented Generation, RAG), and hallucinations that produce plausible-sounding but false information.

Implications for IPCC as an institution

This raises fundamental questions about workflows, audience and format. First, should the IPCC develop its own agentic capabilities for IPCC authors? Second, how much should the IPCC optimize its outputs to be read and processed by AI agents, and what changes to the preparation and presentation of the report should be undertaken with agentic readers in mind? Third, should the IPCC actively house an LLM to help readers navigate the report and its data? The IPCC already produces multiple formats (PDFs, printed copies, webpages, infographics). Creating an official LLM that reproduces paragraphs verbatim would be technically straightforward, and it could essentially operate as enhanced search functionality that could benefit accessibility, albeit at the risk of hallucinations, among other limitations inherent to the LLM architecture. Alternatively, RAG systems, which ground responses directly in source documents rather than relying solely on parametric (“original” LLM) knowledge, could mitigate hallucination risks by linking outputs back to specific passages in the reports. However, even RAG-based approaches are not immune to limitations, as the underlying language model may still misinterpret or misrepresent retrieved content2.
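To make the grounding step concrete, the retrieval stage of a RAG pipeline can be sketched in a few lines. The passages and identifiers below are hypothetical stand-ins for indexed report paragraphs, and a production system would use dense embeddings and an LLM for generation; this toy version uses simple word overlap only to illustrate how an answer gets tied back to a specific, citable passage:

```python
import re

def tokenize(text):
    """Lowercase and split into word tokens, dropping punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

# Hypothetical passages standing in for indexed report paragraphs.
PASSAGES = {
    "SPM-A.1": "Human activities have unequivocally caused global warming.",
    "SPM-B.2": "Risks and projected adverse impacts escalate with every increment of warming.",
    "SPM-C.3": "Adaptation options that are feasible today become constrained at higher warming.",
}

def retrieve(query, k=1):
    """Rank passages by word overlap with the query; return the top-k (id, text) pairs."""
    q = tokenize(query)
    scored = sorted(PASSAGES.items(),
                    key=lambda item: len(q & tokenize(item[1])),
                    reverse=True)
    return scored[:k]

# The generator then composes its answer *from* the retrieved passage and
# cites the passage ID, so each claim is traceable to a report paragraph.
best_id, best_text = retrieve("impacts of warming increments")[0]
print(best_id)  # → SPM-B.2
```

The traceability discussed above comes from the returned identifier: because every generated claim is paired with a passage ID, reviewers can check the output against the exact paragraph it was grounded in.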

Furthermore, in such a highly automated environment, the traditional human-led approval process for SPMs may suddenly appear ‘outdated’. But any attempt at automating the generation of SPMs creates tensions with the still human-led, consensus-based negotiation mode prevalent in IPCC panel sessions and in the UNFCCC. More broadly, if agentic AI is widespread in society, it may call the traditional pacing of an IPCC assessment cycle (5–7 years) into question. There are frequent suggestions for the IPCC to produce short, accessible reports with updated information, tailored to respond nimbly to knowledge needs, or even to enable ongoing learning processes in a dynamic approach46,47. With the development of agentic AI, professionals may become accustomed to tasking agents with discovering instant answers, and the pace of IPCC assessment may feel obsolete.

Implications for IPCC authors

Adoption of agentic AI would fundamentally transform the author’s role from writer to expert verifier and agent manager.

Wider social implications

Whilst AI intermediaries could enhance access to text, they risk diluting the IPCC’s carefully chosen language around uncertainty and confidence. The very authority of the IPCC, built on human expertise and deliberation, could be undermined by the perception that machines are the primary constructors, readers and interpreters of its work.

Superior truth machine

In this scenario, there is a low degree of automation in knowledge processing—humans guide the process and make the final judgments—and there is high trust in AI-generated output. AI-generated output comes to be viewed as superior to human equivalents and is often turned to as an arbiter of truth in social and political disputes about knowledge. LLMs gain trust by being perceived as more comprehensive and balanced, and as free from human political biases. Users become accustomed to AI summaries that are accurate and readable, leading to higher perceptions of credibility and trustworthiness of the authors48. Private platforms for scientific summaries, already being launched by scientific publishers, present themselves as tools for human creators to undertake scientific synthesis and consensus building, and emerge as competitors to traditional assessments, potentially offering more rapid summaries or even providing functions of assessment and knowledge synthesis.

Literature search

Humans are directing the literature search, but their expert choices are open to greater scrutiny, as anyone could use a trusted AI tool to check the same body of literature for omissions. AI augmentation could facilitate the inclusion of literature in multiple languages, including technical reports, through multilingual search and summarization. The IPCC has also been critiqued for reductive approaches49 and structural limitations that limit the incorporation of Indigenous Knowledge50 and traditional knowledge51. GenAI could be employed in ways that enhance or limit this incorporation: there is the possibility of building AI systems that draw on Indigenous knowledge systems52, but also risks of eroding cultural knowledge or of data-grabbing that does not accord with principles of Indigenous Data Sovereignty53. In this scenario, expanding the scope in this way places a verification burden on authors, making the extent of such inclusion a matter of human direction and judgment.

Synthesis and assessment

The primary challenge comes not from automation within the IPCC, but from externally produced “shadow versions” of the reports. These AI-generated alternative assessments—produced by groups with specific political or scientific agendas—could emerge to contest the IPCC’s findings using the same corpus of literature. Shadow versions could also emerge during the review process, followed by claims that the IPCC ignored superior AI-assisted input. This creates an arms race dynamic. Government delegations, invested in maintaining influence over the “official” climate narrative, will aim to reinforce the IPCC version, particularly for the SPM approval.

Communication

Users will increasingly question whether they should trust human-crafted text over AI alternatives, particularly when AI versions appear more up to the task by updating more frequently or being more comprehensive in scope. The wide availability of competing AI-generated versions of reality allows users to “shop for” interpretations that align with their preferences.

Implications for IPCC as an institution

If different chapters or even whole Working Groups adopt varying approaches to AI integration, these differences will become publicly visible, exposing weaknesses in human-only assessment. The institution may also face pressure from “gotcha” papers that highlight discrepancies or show user preferences for AI-generated summaries, potentially undermining the IPCC’s credibility.

If shadow assessments are inevitable, it becomes untenable for the IPCC to take a restrictive stance towards AI. It will be under pressure to develop its strategy to incorporate and respond to AI alternatives proactively. The question becomes how to maintain the authority of human-led assessment when AI output is perceived to be superior. Embedding AI with clear benchmarking criteria may be necessary for the IPCC to defend the credibility of its main products, not only the Working Group and Special Reports, but also the carefully crafted SPMs.

When it comes to the SPMs, how much they deviate from the underlying assessment will become an object of AI-supported re-analysis—and governments will deliberate them in approval plenaries with the help of the various AI agents that they can build or access. Countries that make use of technology ecosystems developed in China or the United States may find their results influenced by the technology stacks—the resources, chips, networks, applications and algorithms, data, and GenAI models—that have developed in those national contexts, and the affordances of those systems may shape their workflows. At the same time, this AI-supported deliberation does not affect the way the negotiated SPM can be used in the still human-led and consensus-based UNFCCC negotiations.

Implications for authors

AI’s legitimacy and perceived superiority bring challenges for authors. Should each chapter team develop an official AI-generated shadow version, against which to justify their differing judgments? On the one hand, this could elucidate the value of their expertise, but it could equally prove cognitively demanding and demoralizing. It would shift the authors’ role from primary synthesizers and evaluators to expert adjudicators, who must develop clear technical and scientific benchmarks to justify why their conclusions should be preferred to a trusted AI output. At the same time, many authors may embrace AI models as scientific collaborators in their own research48, which would spill over into their IPCC work.

Wider social implications

This widespread trust in AI as an arbiter of truth in science has profound social consequences. First, there is the risk that diverse forms of knowledge will be completely disregarded. Second, as the social science literature points out, the struggle over policy alternatives and over the meaning of climate change itself is the basis of legitimacy in democratic decision-making. If AI is used in climate governance in ways that shortcut this debate by presenting a single “correct” assessment, there is a risk of closing down policy options in ways that limit robust decision-making54. In a study of the challenges of AI in climate governance, Ruth Machen and Warren Pearce caution about “not just the importing of particular methods but also of particular logics, esthetics, and values into processes of environmental governance”54.

Anticipatory resistance

In this scenario, AI augments humans, but people are also wary of it, leading to pressure for precautionary or restrictive approaches that may underutilize AI’s potential. At the same time, its augmentation capabilities are not equally accessible, leading some to raise the equity dimensions of restrictive approaches.

Literature search

Some critics reframe restrictive policies on AI use as gatekeeping mechanisms through which developed countries maintain epistemic control. However, the central challenge is the escalating pressure of managing the ever-growing body of scientific literature manually. Human synthesis and assessment are viewed as a premium product; at the same time, authors face a deluge of literature whose quality is under constant critique, making synthesis and assessment more difficult.

Synthesis and assessment

There remains some pressure to use AI tools to save time and make participation easier, but critics are concerned about examples of bias in AI leading to biased syntheses55, even when humans are in the driver’s seat and AI is merely augmenting their work. The credibility of any AI-generated content is increasingly questioned. Climate advocacy groups mobilize against AI overuse in the IPCC, criticizing its alignment with the same techno-optimist paradigm driving the ecological crisis. This creates an internal crisis: authors know that highly automated tools could process the deluge of literature more efficiently, but also believe that using them would compromise the report’s credibility. Governments face pressure from constituents suspicious of AI involvement.

Communication

There is increased demand for using AI to make the IPCC reports easier for speakers of varied languages to navigate and use in policy decisions. At the same time, interfaces for querying and having conversations about the reports face questions of bias as well.

Implications for IPCC as an institution

The institution must balance real concerns about AI quality and access with the reputational risk that AI restrictions perpetuate existing power imbalances in the assessment process and in access to its findings. For member governments, the degree of AI involvement in different parts of the underlying assessment becomes an additional layer of scrutiny during the SPM approval.

Implications for authors

Authors from different regions experience AI tools differently. Whilst developed-country authors with limited technical skills struggle with advanced features, developing-country authors may view even basic AI translation and writing assistance as transformative, creating new avenues for tension within author teams. Authors also feel overtaxed by the amount of literature to synthesize and assess, and knowing that tools exist that could help becomes a source of frustration.

Wider societal implications

The IPCC’s legitimacy is derived from its traditional, human-deliberative process, which is framed as a core strength. At the same time, parts of the broader climate community begin to view the IPCC as dated, or even as holding back progress, given its power to set norms.

Public backlash

Widespread backlash to GenAI in this highly automated scenario has a variety of drivers: concern within academia about using AI, perceived deskilling, surveillance applications of AI, unscrupulous behavior on the part of AI companies, the environmental and energy-use impacts of AI, or experience with socially damaging or misaligned AI products. AI may induce job loss, or it may lead to a speculative bubble in which companies fail to monetize its applications and stock market losses have ripple effects—either outcome can fuel backlash. For the IPCC, a legitimacy crisis emerges from the use of AI, and the perceived “purity” of human-led scientific assessments becomes compromised. This scenario illustrates how social norms about technology can override technical merit. Media narratives about GenAI shape public reception negatively56, creating a context where any AI involvement taints perceived legitimacy.

Literature search

Despite the availability of automated tools capable of comprehensively scanning the literature, low trust in AI and negative social perceptions of it lead to non-adoption by some authors or chapters, also reflecting disciplinary cultures. Concrete choices about workflows trigger broader cultural and ideological debates about the role of AI in society. Literature that uses GenAI may be filtered out.

Synthesis and assessment

Documentation of automated workflows becomes a major challenge, because report authors have become accustomed to opacity in automated workflows for scientific production. The cognitive and time demands on authors mount in comparison to their other work duties.

Communication

AI detection tools may be used by critics to accuse the IPCC of using GenAI and to discredit it. The IPCC Bureau and Secretariat may also invest time in developing communication interfaces only to see them go unadopted, with the whole project being rejected.

Implications for IPCC as an institution

The IPCC faces attacks from multiple directions. Climate advocates hostile to AI because of its environmental impacts feel betrayed by an institution that sanctions its use, and the use of AI tools becomes weaponized as evidence of corporate capture or techno-solutionism, drawing on research suggesting that AI biases perceptions of environmental challenges toward incremental rather than radical or transformative solutions, and avoids associating environmental challenges with social justice issues57. Meanwhile, supporters of AI use criticize restrictions as evidence of the IPCC’s capture by advocacy interests and frame non-use as a Luddite stance. The institution might be forced to publicly adopt a restrictive stance on AI to maintain legitimacy with key stakeholders, even at the cost of internal efficiency and completeness. In this light, the human-led SPM approval process could be more prominently presented as a core asset of the organization.

Implications for authors

Using AI in academic contexts becomes stigmatized, with researchers unwilling to face reputational damage from association with AI tools. Author teams fragment between those who view AI as necessary for comprehensive assessment and those who see it as fundamentally compromising scientific integrity. The collegial spirit needed for producing an assessment erodes under such conditions.

Implications for wider society

In the public eye, the perception of division within the climate research community over the methods of assessment spills over into perceptions of divisions about the findings of climate science itself, eroding trust.