Alphabet-X’s Astra model passed the Lovelace Test 2.0 today, scoring 96.8% on the ARC-AGI benchmark and solving a novel protein-folding problem in three hours, and the AI research community is not being quiet about it.
No press conference. No countdown timer. Just an arXiv preprint, a simultaneous API update, and a research community that collectively lost its composure. On April 23, 2026, Alphabet-X’s former DeepMind division published the technical details for Astra (Autonomous Systems Transformation via Recursive Architecture), and within hours, “Ladies and gentlemen, we have AGI” was trending across Reddit and X with the kind of energy usually reserved for moon landings. The difference this time is that the underlying numbers are genuinely hard to dismiss.
Demis Hassabis, now carrying the title of Chief AGI Scientist, kept the language precise: Astra demonstrates “robust, general-purpose reasoning across mathematics, coding, and embodied robotics without task-specific fine-tuning.” That last clause is the critical one. Every major AI milestone before this required the system to be pointed at a domain. Astra, according to the published logs and early developer access reports, does not. It scored 96.8% on the ARC-AGI benchmark (the Abstraction and Reasoning Corpus, designed specifically to resist pattern memorization), clearing the previous record of 87% set by OpenAI’s Q* model in late 2025 by a margin that benchmark veterans are describing as extraordinary.
The Lovelace Test 2.0, the modernized industry threshold for AGI, requires a system to autonomously invent novel scientific code and generate original hypotheses without human prompting. Independent early-access logs confirm Astra cleared it. More concretely, the system identified and resolved a previously unknown protein-folding structure relevant to synthetic biology within three hours of activation, work that a coordinated human research team would measure in years, not hours.
The leap here is not simply computational scale. The broader AI field spent 2025 transitioning away from the LLM paradigm toward what researchers began calling LTM, or Long-Term Memory systems. Astra is the first major commercial deployment built entirely on that foundation. Rather than generating outputs from static training distributions, it runs on a proprietary neural state machine that continuously retains and updates its world model. The system is not recalling; it is reasoning in real time against a dynamic internal representation of what it knows. That architectural distinction is what makes the Lovelace Test result significant rather than merely impressive.
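The stateless-versus-stateful contrast can be sketched in a few lines. To be clear, nothing below describes Astra’s proprietary internals; the class name, the decay parameter, and the exponential update rule are illustrative assumptions only. The point is the general shape of the idea: a static model maps each input through frozen weights, while a state-machine design folds every observation into a persistent internal state that later answers draw on.

```python
class NeuralStateMachine:
    """Toy sketch of a persistent world model.

    Illustrative only: the update rule here is a simple exponential
    moving average, not anything attributed to Astra.
    """

    def __init__(self, dim, decay=0.9):
        self.state = [0.0] * dim  # the running internal representation
        self.decay = decay        # how strongly old state is retained

    def observe(self, observation):
        # Blend new evidence into the retained state instead of
        # discarding it after the response, as a stateless model would.
        self.state = [self.decay * s + (1 - self.decay) * o
                      for s, o in zip(self.state, observation)]

    def respond(self, query):
        # The answer depends on everything seen so far, not just the query.
        return sum(s * q for s, q in zip(self.state, query))


machine = NeuralStateMachine(dim=2)
machine.observe([1.0, 0.0])
print(machine.respond([1.0, 0.0]))  # answer reflects the first observation
machine.observe([1.0, 0.0])
print(machine.respond([1.0, 0.0]))  # same query, updated state, new answer
```

The behavior to notice is that the same query returns a different answer after each observation, because the state persists and accumulates, which is the property the LTM paradigm is named for.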
Markets moved before the press release landed
Alphabet shares surged 12% in pre-market trading and triggered a volatility halt. The reaction on the competitive side was equally swift and considerably less comfortable. Microsoft dropped 4% and NVIDIA fell 7% as analysts began repricing the hardware scaling thesis that has underpinned the AI infrastructure trade for the past three years. The emerging read is that inference-efficient software architectures may now outcompete raw compute expenditure, a shift that would redistribute enormous capital flows across the sector.
It is worth holding some skepticism in reserve. The results come from a preprint, not peer review, and from early-access developer logs rather than independent third-party verification. Benchmark performance at this level will draw intense scrutiny, and the history of AI announcements includes enough premature triumphalism to warrant caution. The protein-folding result, if it holds, is the more durable proof point: it is falsifiable and domain-specific in a way that benchmark scores are not.
What to watch now is the verification process. Third-party researchers will be stress-testing Astra’s outputs within days. Regulatory bodies in the EU, which passed binding AGI disclosure requirements in late 2025, are already obligated to request access. And OpenAI, Mistral, and the Chinese labs will be under immediate pressure to respond, whether through their own capability announcements or through public challenges to Alphabet-X’s methodology. If the results survive scrutiny, the economic and strategic implications extend well beyond who wins the next benchmark. Autonomous systems that reason across domains without fine-tuning do not stay inside research labs for long.