Since their invention in the early 1930s, electron microscopes have enabled paradigm-shifting discoveries across materials science and structural biology. From the atomic-resolution discovery of emergent polarization vortices in ferroic materials using four-dimensional scanning transmission electron microscopy (4D-STEM)1 to the determination of near-atomic structures of membrane ion channel proteins using cryogenic electron microscopy (cryo-EM)2, these instruments have repeatedly expanded the frontiers of what scientists can observe at the nanoscale.
A parallel transformation is now unfolding in AI. Recent advances in large language models (LLMs) and agentic AI systems suggest that AI can increasingly reason over complex information, synthesize knowledge across disciplines, and assist in scientific discovery across diverse domains3,4,5,6. These developments raise a fundamental question for electron microscopy: can such agentic systems participate in the scientific reasoning that guides experiments and enable discoveries at the microscope?
Over the past two decades, transmission electron microscopes (TEMs) have become increasingly automated. State-of-the-art instruments can perform alignment, aberration correction7, and data acquisition8 with minimal user intervention, providing precise and programmable control over many experimental parameters. In parallel, machine learning approaches have transformed the analysis of microscopy data, enabling advances in denoising, segmentation, reconstruction, and interpretation9,10,11,12,13,14. These methods have also enabled closed-loop and active learning experimental workflows across electron and scanning probe microscopies15,16,17.
Yet, the core scientific reasoning behind experiments remains largely human-driven. Decisions about how to design experiments, revise acquisition protocols in response to experimental outcomes, and synthesize insights across measurements still depend on human expert intuition and experience. Such decisions require multi-step reasoning across experiments, the integration of knowledge from multiple disciplines, and engagement with open-ended scientific research, capabilities that current automation and machine learning approaches in electron microscopy are not designed to provide.
Advances in agentic AI have demonstrated how collections of frontier LLMs can collaborate to reason over data, integrate prior knowledge, and accelerate discovery across diverse scientific domains3,4,5,18. In this context, an agent refers to an LLM specialized for a particular scientific role and equipped with domain knowledge, computational tools, and access to relevant data sources. Agents can be assigned roles such as experiment designer, microscopy data analyst, materials scientist, chemist, or physicist (Fig. 1), through instructions that guide their expertise and govern how they interact with other agents and with human users.
Fig. 1: Overview of the agentic electron microscopy framework.
a A human researcher specifies a high-level scientific objective (e.g., resolving grain-boundary structure in ferroic materials using 4D-STEM). A team of specialized AI agents, representing a principal investigator (PI), materials scientist, electron microscopist, physicist, and critic, would engage in iterative discussion and reasoning in a meeting to synthesize an experimental plan prior to the microscopy session. b In an iterative closed-loop experiment, agentic systems could interact with the microscope across successive experimental cycles, adapting acquisition parameters such as beam dose, current, and exposure strategy in response to analyzed experimental outcomes. This process enables systematic probing of dynamic phenomena, illustrated here by dose-dependent diffusion regimes of photocatalytic nanoparticles in liquid-phase TEM, and refining experimental conditions to isolate mechanistic transitions. c Agentic AI could act as a scientific collaborator, or co-scientist, participating in scientific reasoning and hypothesis generation during open-ended research. By integrating experimental observations with prior theoretical and experimental literature knowledge, co-scientists could generate testable hypotheses, illustrated here by strain-temperature coupled defect migration in a nanoparticle inferred from in situ heating TEM experiments. Together, these cases illustrate how agentic AI could introduce new layers of scientific agency into electron microscopy.
Agentic systems build on advances in LLMs, but differ fundamentally from a single general-purpose chatbot. Standalone LLMs often lack access to specialized scientific knowledge, which contributes to hallucinations and superficial reasoning in technical domains19. Much of the scientific literature and associated data remains inaccessible to current models, limiting their ability to perform expert reasoning. Agentic architectures address these limitations by explicitly incorporating domain knowledge into specialized agents. They also help mitigate the degradation in performance observed with long-context inputs, known as context rot20, by distributing information across multiple agents with focused expertise. This modular design enables parallel evaluation of competing hypotheses, clearer separation of roles such as planning, simulation, and critique, and more transparent and robust reasoning.
In this commentary, we envision agentic AI as a new conceptual framework for integrating scientific reasoning into electron microscopy, transforming electron microscopes from characterization instruments to thinking machines. In this vision, collections of specialized LLM agents assist human users in planning and conducting electron microscopy experiments. We outline three roles for agentic systems in STEM/TEM: experimental design, iterative closed-loop experimentation, and real-time scientific discovery (Fig. 1). We then consider the evolving role of human scientists in a future where microscopes become thinking machines and many repetitive tasks, including sample preparation and characterization, are automated. Finally, we outline a path towards scientific agency in electron microscopy, emphasizing the infrastructural advances and data practices needed to realize agentic systems.
Agentic AI for experimental design
Experimental design in advanced electron microscopy is traditionally a collaborative and iterative process. Typically, a principal investigator (PI) and their team read and synthesize prior literature, examine records of successful and failed experiments, consult instrument documentation, and debate trade-offs to converge on a feasible experimental protocol for imaging a targeted phenomenon. For new or non-expert users, this process can take months and poses a substantial barrier to entry. The challenge is particularly pronounced in techniques such as 4D-STEM, where imaging parameters such as probe convergence angle, scan step size, detector geometry, dwell time, and total dose are tightly coupled and together determine sensitivity to strain, polarization, and symmetry breaking in quantum and ferroic materials.
Agentic AI for experimental design offers a complementary paradigm for accelerating and scaling this collaborative planning process. In the context of 4D-STEM, the agentic system could consist of interacting agents with expertise in diffraction physics, materials science, and practical experimental constraints. Before the microscopy session, these agents could review the published 4D-STEM literature, experimental metadata from lab notebooks and public repositories, reported failure cases, and microscope-specific documentation to design experimental protocols aligned with the user’s scientific objective. For instance, a researcher seeking to resolve nanoscale polarization vortices in a ferroic superlattice might specify a target length scale and sensitivity to symmetry. The agentic system could then analyze prior 4D-STEM datasets to recommend probe and detector configurations that balance angular resolution against beam damage, producing an experimental protocol optimized for detecting emergent order parameters (Fig. 1a). Such systems could integrate literature synthesis, domain-specific physical reasoning, and critical evaluation to develop a feasible experimental strategy.
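The trade-off between probe size and dose that such a planning agent would weigh can be made concrete with a short calculation. The sketch below is a minimal illustration, not a planning tool. It assumes a diffraction-limited probe (Rayleigh criterion, with aberrations and source size ignored) and a raster scan with square pixels; the function names, the dictionary keys, and any candidate settings or budgets passed in are hypothetical.

```python
import math

# Physical constants (SI units)
PLANCK = 6.62607015e-34              # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 299792458.0            # m/s


def electron_wavelength_m(voltage_v):
    """Relativistic de Broglie wavelength (m) for an accelerating voltage (V)."""
    energy_j = ELEMENTARY_CHARGE * voltage_v
    rest_energy_j = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(
        2 * ELECTRON_MASS * energy_j * (1 + energy_j / (2 * rest_energy_j))
    )
    return PLANCK / momentum


def probe_diameter_m(voltage_v, semi_angle_rad):
    """Diffraction-limited probe size (m); an idealization that ignores aberrations."""
    return 0.61 * electron_wavelength_m(voltage_v) / semi_angle_rad


def dose_e_per_A2(current_a, dwell_s, step_m):
    """Electrons per square angstrom delivered by a scan with square pixels."""
    electrons_per_position = current_a * dwell_s / ELEMENTARY_CHARGE
    step_angstrom = step_m * 1e10
    return electrons_per_position / step_angstrom ** 2


def screen_settings(candidates, voltage_v, max_dose, max_probe_m):
    """Keep candidate settings that respect both the dose and probe-size limits."""
    accepted = []
    for setting in candidates:
        probe = probe_diameter_m(voltage_v, setting["semi_angle_rad"])
        dose = dose_e_per_A2(
            setting["current_a"], setting["dwell_s"], setting["step_m"]
        )
        if dose <= max_dose and probe <= max_probe_m:
            accepted.append({**setting, "probe_m": probe, "dose_e_per_A2": dose})
    return accepted
```

A smaller convergence semi-angle sharpens angular resolution in the diffraction pattern but enlarges the probe, while a finer scan step or longer dwell time raises the dose. A screen of this kind only narrows the search space; the choice among the surviving settings would still rest on the literature and on the judgement of the agents and the user.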
In this vision, agentic experimental design systems enable researchers to develop viable experimental plans more rapidly, efficiently, and systematically, before a microscopy session and without directly interacting with the microscope hardware. In the near term, such capabilities could improve experimental reproducibility by generating detailed and consistent protocols and enhance efficiency, particularly in high-demand shared facilities and national labs, by shortening planning cycles and designing experiments that build on prior literature and the experiences of other users working on related materials and experimental systems. These systems could also democratize access to advanced microscopy experiments for a broader community by lowering barriers to entry. Over the long term, agentic experimental design systems may enable expert researchers to explore experimental design spaces more systematically, identify non-obvious parameter regimes, and develop new experimental protocols that reveal previously inaccessible physical phenomena.
Agentic AI for iterative closed-loop experimentation
Advanced electron microscopy experiments, particularly in situ and operando experiments, are rarely executed as static, preset protocols. Instead, they happen as dynamic processes, in which experimental parameters are repeatedly adjusted in response to observations, constraints, and emerging hypotheses. In studies of beam-matter interactions, reaction-driven dynamics, transport-limited motion, or defect-mediated structural evolution, researchers must vary beam convergence, magnification, exposure, or external stimuli across successive acquisitions, relying on intuition and experience to interpret results and decide how to proceed. This iterative process is labor-intensive, difficult to formalize, and prone to variability across users and sessions.
Agentic AI enables a new paradigm for iterative, closed-loop experimentation, in which experimental design, execution, analysis, and critique are integrated into a continuous, goal-directed loop guided by a human-defined objective. In this framework, collections of role-specific LLM agents collaborate to refine experimental strategies across successive experimental cycles, reasoning over both prior knowledge and feedback from completed measurements. Rather than optimizing a single predefined objective, these systems are designed to support open-ended research, allowing experimental goals, hypotheses, and protocols to evolve as new results are collected.
For example, consider an in situ liquid-phase TEM experiment on photocatalytic Pt nanoparticles in water, where the scientific objective is to elucidate the relationship between nanoparticle diffusion dynamics and catalytic activity under electron beam irradiation. In this case, beam-matter interactions, radiolysis, and chemical reactions are tightly coupled to the observed nanoparticle motion21,22. A researcher may pose a high-level prompt: “Systematically probe how the electron beam alters diffusion mechanisms by adapting beam dose, current, and exposure strategy, and infer the resulting mechanistic transitions.” An agentic system (Fig. 1b) could respond by assembling a set of expert agents, including a microscopist agent aware of dose and imaging constraints, a materials science agent knowledgeable about metallic nanoparticle behavior, a chemistry agent focused on catalytic pathways, and a critic agent responsible for synthesizing insights.
Together, these agents could design an initial experimental protocol, for example, by systematically varying the electron dose while tracking nanoparticle trajectories. Experimental observables, including mean-squared displacement, reaction onset times, and beam conditions, are converted into LLM-readable representations. Based on these inputs, an execution agent could generate and deploy instrument-control code (e.g., Python scripts interfacing with microscope software) to adjust imaging parameters and acquire new data at an appropriate frame rate and beam dose.
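As one illustration of what an LLM-readable representation of these observables might look like, the sketch below computes a time-averaged mean-squared displacement from a single tracked nanoparticle and packages it with the beam conditions as JSON. It is a minimal example under stated assumptions: positions are already tracked in nanometres at a constant frame interval, and the function names and JSON field names are hypothetical rather than part of any existing microscope or agent interface.

```python
import json


def mean_squared_displacement(track, max_lag):
    """Time-averaged MSD of one 2D track for lags of 1..max_lag frames.

    track is a list of (x, y) positions sampled at a constant frame interval.
    """
    if max_lag >= len(track):
        raise ValueError("max_lag must be smaller than the number of positions")
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def summarize_cycle(track_nm, frame_time_s, dose_rate_e_per_A2_s, max_lag=3):
    """Serialize one acquisition cycle as compact JSON text for an agent to read."""
    msd = mean_squared_displacement(track_nm, max_lag)
    summary = {
        "dose_rate_e_per_A2_s": dose_rate_e_per_A2_s,
        "frame_time_s": frame_time_s,
        "n_frames": len(track_nm),
        "msd_nm2": [
            {"lag_s": round(lag * frame_time_s, 6), "value": round(value, 3)}
            for lag, value in enumerate(msd, start=1)
        ],
    }
    return json.dumps(summary, indent=2)
```

Keeping units in the field names and rounding the values are deliberate choices here: the text handed to an agent stays short, and the quantities remain unambiguous when they are later compared across cycles.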
Following data acquisition, analysis routines extract quantitative metrics from the recorded videos and return them to the agentic system for interpretation. A critic agent could evaluate whether observed changes in diffusion are consistent with beam-induced heating, radiolysis-driven chemical gradients, or catalytic reactions, and generate new insights and questions. These critiques are then shared with a team of planner agents, which could revise the experimental strategy by adjusting dose rate, temporal resolution, or other control parameters and then initiate subsequent experimental cycles.
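The acquire, analyze, critique, and replan cycle described here can be sketched as a plain control loop. The example below is schematic only. The four roles are passed in as ordinary Python callables, and the stand-in functions are toy rules in place of LLM agents and real instrument calls; every name, threshold, and the toy dose response is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    """Record of one experimental cycle, kept so later decisions can be audited."""
    dose_rate: float
    metrics: dict
    critique: str


def run_closed_loop(acquire, analyze, critic, planner, initial_dose_rate, max_cycles=5):
    """Repeat acquire -> analyze -> critique -> plan until the planner stops."""
    history = []
    dose_rate = initial_dose_rate
    for _ in range(max_cycles):
        data = acquire(dose_rate)
        metrics = analyze(data)
        critique = critic(metrics, history)
        history.append(Cycle(dose_rate, metrics, critique))
        dose_rate = planner(history)
        if dose_rate is None:  # the planner signals that the objective is met
            break
    return history


# Stand-ins for the instrument and the agents (toy rules, not real behavior).
def fake_acquire(dose_rate):
    # Toy response: the apparent diffusion exponent rises with dose rate.
    return {"alpha": 1.0 + 0.01 * dose_rate}


def fake_analyze(data):
    return {"alpha": round(data["alpha"], 3)}


def fake_critic(metrics, history):
    return "superdiffusive" if metrics["alpha"] > 1.25 else "Brownian-like"


def fake_planner(history):
    last = history[-1]
    if last.critique == "superdiffusive":
        return None  # transition found; stop the loop
    return last.dose_rate * 2  # otherwise probe a higher dose rate


if __name__ == "__main__":
    for cycle in run_closed_loop(
        fake_acquire, fake_analyze, fake_critic, fake_planner, initial_dose_rate=5.0
    ):
        print(cycle)
```

Recording every cycle, including the critique that motivated the next step, is what would let a human supervisor reconstruct why each parameter change was made.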
Agentic frameworks may be particularly well suited to such settings compared with classical optimization approaches, which typically assume a fixed objective function and a predefined parameter space23. By contrast, agentic systems can adaptively reformulate experimental objectives, incorporate new physical models, and introduce additional observables or analysis tools as experiments evolve. For example, in the diffusion experiment described above, the system might initially interpret nanoparticle motion using a simple Brownian motion model. As the dose rate changes and new data are collected, it could revise this interpretation toward an anomalous superdiffusive mechanism of motion, or a mixture of the two mechanisms, triggering the deployment of new analysis methods to test these hypotheses.
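One simple way to test the Brownian interpretation against an anomalous one is to fit the power law MSD(t) = K t^α to the measured mean-squared displacement and inspect the exponent: α close to 1 is consistent with Brownian motion, while α above 1 indicates superdiffusion. The sketch below does this with an ordinary least-squares fit in log-log space. It is an illustrative analysis routine, not the method of any cited study, and the tolerance used to label the regimes is an arbitrary assumption.

```python
import math


def fit_anomalous_exponent(lags_s, msd):
    """Least-squares fit of log(MSD) = log(K) + alpha * log(t); returns (alpha, K)."""
    xs = [math.log(t) for t in lags_s]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


def classify_motion(alpha, tolerance=0.1):
    """Label the regime; the tolerance is an arbitrary cut-off chosen for illustration."""
    if alpha > 1 + tolerance:
        return "superdiffusive"
    if alpha < 1 - tolerance:
        return "subdiffusive"
    return "Brownian-like"
```

For two-dimensional Brownian motion the prefactor K equals 4D, so the same fit yields an estimate of the diffusion coefficient D when α is close to 1. A mixture of mechanisms would not follow a single power law, which is one reason an agent might then call for a different analysis.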
Through repeated experimental cycles, agentic closed-loop experimentation could enable systematic exploration of complex, coupled phenomena that are difficult to probe using fixed, predefined protocols. By making the reasoning behind experimental decisions interpretable, such systems could reduce dependence on ad hoc intuition and improve reproducibility across experiments and users, as well as over the course of a single experiment. Together, these capabilities could substantially accelerate the cycle of experimentation and discovery.
Agentic AI for scientific discovery on the fly
In advanced electron microscopy, the generation of new hypotheses and scientific insight is typically driven by human researchers during and after experiments. Scientists interpret images and videos in light of prior theoretical understanding, compare observations with existing literature, and rely on experience to recognize unexpected behaviors or patterns that require further investigation. This process is often implicit and distributed across lab notebooks, informal discussions, and post hoc analysis. As experiments become increasingly complex and data-rich, this mode of discovery becomes difficult to systematize or scale.
Beyond planning experiments and guiding their iterative evolution, agentic AI could act as a real-time scientific collaborator, that is, a co-scientist, participating in scientific reasoning and hypothesis generation during open-ended research. Agentic co-scientist systems could operate alongside researchers during acquisition and analysis, continuously interpreting microscopy observations in the context of prior scientific knowledge to generate new hypotheses. These systems would consist of interacting agents specialized in image and video analysis, literature synthesis, and hypothesis evaluation, with multimodal access to both live microscopy data and figures, tables, models, and experimental results reported in the literature. As experiments are performed in real time, co-scientist systems could contextualize observed structures and dynamics, relating them to known physical and chemical mechanisms within existing theoretical and experimental frameworks, while remaining alert to deviations that may signal previously unseen phenomena.
One example application of a co-scientist agentic system is in situ heating TEM studies of defect migration within single nanocrystals, where vacancies, dislocations, or grain boundaries evolve in response to temperature, stress, and local microstructure24 (Fig. 1c). As imaging proceeds, a co-scientist system may detect that the defect motion deviates from classical thermally activated diffusion, for example, by exhibiting intermittent jumps, directionally biased migration, or collective rearrangements. By comparing these behaviors with previous reports of stress-mediated diffusion, defect pinning, or interaction-driven dynamics, the system could infer candidate underlying mechanisms, such as local strain-field coupling or defect-defect interactions.
In this paradigm, hypotheses are not merely tested through automated protocols but are generated by the system itself, grounded in both experimental observation and existing literature. The microscope could thus become an active scientific collaborator, one that helps identify unexpected behavior, frame new questions, and guide discovery beyond what a human user might anticipate.
Human-AI collaboration
An important question raised by the emergence of thinking microscopes is the role and responsibility of the human scientist. We argue that, even as experiments become increasingly agentic, humans remain responsible for defining scientific questions, verifying experimental outcomes, and communicating and preserving new knowledge. Human researchers also retain accountability for the accuracy and integrity of both the experimental process and the results reported.
In this future, the role of the scientist evolves toward deeper human-AI collaboration. Researchers will define the scientific problem, configure and supervise the agentic system, evaluate its outputs, and communicate validated findings to the broader community. Just as the use of LLMs in writing is now routinely acknowledged, the scientific community will similarly need clear norms for reporting the use of agentic systems and their contributions to experimental discovery.
Verification and combating hallucinations
Verification of agentic systems will remain a critical and evolving challenge. Approaches for improving the self-verification capabilities of LLMs are advancing rapidly. Recent work shows that decoupling a model’s final output from its intermediate reasoning tokens, combined with targeted prompting strategies, can enable the model to detect flaws that were overlooked during the initial generation process19. Empirical studies also suggest that training agents to use external tools and software, and providing domain knowledge through multi-agent architectures and retrieval-augmented generation, can substantially reduce hallucinations in LLM-based systems25.
Although self-verification, often implemented through critic agents, provides an important safeguard, human oversight remains essential. This is particularly critical in experimental settings where agentic systems interact directly with microscope hardware. In such cases, robust safety protocols and verification layers will be required to ensure that automatically generated experimental procedures do not risk damaging highly specialized and costly instrumentation.
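A verification layer of this kind could be as simple as a check that sits between the agents and the instrument and rejects any request outside limits set by facility staff. The sketch below is a minimal illustration under that assumption. The parameter names and numerical limits are hypothetical, and `send_to_microscope` stands for whatever control call a given instrument actually exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical facility limits; real values would come from instrument staff.
FACILITY_LIMITS = {
    "beam_current_pA": Limit(1.0, 200.0),
    "dwell_time_us": Limit(1.0, 10000.0),
    "stage_tilt_deg": Limit(-30.0, 30.0),
}


def check_request(request, limits=FACILITY_LIMITS):
    """Return a list of problems; an empty list means the request may proceed."""
    problems = []
    for name, value in request.items():
        if name not in limits:
            problems.append(f"{name}: parameter is not on the allow-list")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: value must be a number")
        elif not limits[name].low <= value <= limits[name].high:
            problems.append(
                f"{name}: {value} is outside [{limits[name].low}, {limits[name].high}]"
            )
    return problems


def submit(request, send_to_microscope, limits=FACILITY_LIMITS):
    """Forward a request to the instrument only if every parameter passes the check."""
    problems = check_request(request, limits)
    if problems:
        return {"sent": False, "problems": problems}
    send_to_microscope(request)
    return {"sent": True, "problems": []}
```

The check uses an allow-list rather than a block-list, so a parameter that nobody anticipated is refused by default. Rejected requests are returned with reasons, which gives both the critic agent and the human supervisor something concrete to review.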
The path to agency in microscopy
While several pioneering efforts have begun to explore autonomous and closed-loop electron microscopy in materials systems16,26,27,28, realizing agentic electron microscopy will require broader changes in infrastructure and data practices, not just advances in AI models.
First, scientific publications must become more accessible to agents. Agentic AI depends on access to experimental details included in text, figures, and supplementary materials. Expanding open access and standardizing how experimental parameters and metadata are reported would immediately increase the usefulness of the literature. The materials science community could benefit from practices adopted in the biomedical sciences, where funding agencies such as the National Institutes of Health mandate broad access to published research and associated data.
Second, materials electron microscopy needs data repositories comparable to those in structural biology. Resources such as the Protein Data Bank (PDB)29, the Electron Microscopy Data Bank (EMDB)30, and the Electron Microscopy Public Image Archive (EMPIAR)31 have been key enablers of data-driven advances in structural biology. Materials electron microscopy would benefit from a similarly unified, community-adopted infrastructure that integrates raw TEM data, diffraction, spectroscopy, and experimental metadata. Building and maintaining such repositories will be essential for enabling agentic systems in materials research.
Third, standardized, secure application programming interfaces (APIs) are needed to connect microscopes with the computational infrastructure required for LLMs within agentic systems. Although many modern instruments already support scripting, agentic electron microscopy requires robust interfaces that provide memory for on-the-fly data analysis and allow software agents to interact safely with microscope controls across facilities. In recent years, microscopy instrument manufacturers have begun shifting toward more open software ecosystems, including interfaces that allow users to access and control microscope functions through scripting environments such as Python and through extensible APIs. These developments reflect the growing recognition that open interfaces will be essential for integrating AI-driven and agentic workflows into instrument operation.
Fourth, electron microscopy datasets are inherently large and multimodal, often combining images and videos with spectroscopy measurements such as electron energy loss spectroscopy (EELS) and energy-dispersive X-ray spectroscopy (EDS). Agentic systems must be able to reason across these data modalities, which in turn requires that raw data and associated metadata be archived in interoperable formats. Making raw microscopy data a standard component of publications, and not just an optional supplement, would significantly expand the knowledge base available to AI agents. Additionally, real-time analysis of massive multimodal datasets will likely require retrieval-augmented strategies32 that compress, index, and organize experimental data into structured knowledge graphs accessible to agents.
Finally, agentic systems must learn from failure as well as success. Scientists develop intuition not only by reproducing successful experiments, but by understanding why particular strategies did not work, reflecting the broader principle that humans learn not only from their own mistakes, but also from those of others. Well-documented unsuccessful experiments, aborted acquisition attempts, and negative results therefore represent a critical, yet underutilized and undervalued resource. In many experimental workflows, such unsuccessful or exploratory attempts are also far more numerous than the final successful experiments that ultimately appear in the literature, suggesting that a large fraction of potentially informative scientific data remains unrecorded and inaccessible. While formalizing such records may encounter cultural resistance, access to documented failures will likely be essential for training agentic systems capable of planning, adapting, and reasoning at a level comparable to human experts. Addressing this challenge will likely require new incentive structures across the research ecosystem. Funding agencies, user facilities, and publishers could help catalyze this cultural shift by encouraging or rewarding the archiving of unsuccessful experiments, for example, through dedicated data repositories, financial support for data storage, reduced publication or instrument costs associated with data sharing, or facility policies that provide additional instrument access for well-documented experimental datasets.
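To make the idea of archiving failures concrete, the sketch below shows one possible machine-readable record in which the outcome of an acquisition is a required field and any unsuccessful run must carry an explanation. The schema is purely hypothetical. The field names, the set of allowed outcomes, and the example values in the usage note are assumptions for illustration, not an existing community standard.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_OUTCOMES = ("success", "failed", "aborted")


@dataclass
class AcquisitionRecord:
    """One acquisition attempt, whether or not it produced usable data."""
    sample: str
    technique: str
    parameters: dict
    outcome: str
    notes: str = ""
    raw_data_uri: str = ""

    def __post_init__(self):
        if self.outcome not in ALLOWED_OUTCOMES:
            raise ValueError(f"outcome must be one of {ALLOWED_OUTCOMES}")
        if self.outcome != "success" and not self.notes.strip():
            raise ValueError("failed or aborted runs need a note explaining why")


def to_json(record):
    """Serialize a record with sorted keys so archived entries are easy to compare."""
    return json.dumps(asdict(record), sort_keys=True)
```

A record such as `AcquisitionRecord("Pt nanoparticles in water", "liquid-phase TEM", {"dose_rate_e_per_A2_s": 50}, "aborted", notes="bubble formed in the liquid cell")` would then sit in the same archive, and in the same format, as the successful runs.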
These considerations suggest that the transition to agentic electron microscopy is not primarily a challenge of AI algorithms, but one of infrastructure and scientific culture. Addressing these gaps will be key to transforming agentic AI concepts into practical and widely accessible tools for materials discovery.