The rush to capitalise on agentic AI has taken on the feel of a high-stakes sprint, with organisations accelerating hard in an effort to outpace competitors.
In that urgency, however, there is a growing risk that strategic discipline is being sacrificed for speed. The road ahead is unlikely to be straight and, without careful navigation through looming constraints – particularly talent, governance and risk – some businesses may find that ambition outstrips their capacity to execute, with costly consequences.
Security is emerging as one of the sharpest turns on that path, and cyber professionals have been sounding the alarm for months. Their warnings gained new weight in mid-November 2025, when Anthropic, the developer behind the widely used Claude Code tool, published a detailed account of a cyber incident observed in September 2025.
The attack targeted major technology firms, financial institutions, chemical manufacturers and government agencies, and went well beyond a routine breach. Instead, it offered a stark proof point for threat actors: that so-called AI “double agents” are no longer theoretical, but a practical tool capable of inflicting real-world harm.
An alleged nation-state attacker used Claude Code, alongside a range of tools from the developer ecosystem, to target specific companies at scale and almost autonomously, relying largely on off-the-shelf open-source hacking tools rather than bespoke malware. Of the more than thirty attacks, several were successful, proving that AI agents can indeed execute large-scale, malicious tasks with little to no human intervention.
Dealing with a rapidly evolving challenge
Anthropic’s paper reveals a powerful new threat vector, one that can supercharge distributed risk and hand a further edge to bad actors who were already at a significant advantage over security professionals contending with sprawling, complex code monoliths and legacy enterprise-grade systems.
The attackers, linked to a nation-state, were able to effectively “jailbreak” Claude Code, manipulating the system into bypassing its built-in safeguards and carrying out a series of malicious tasks. Once compromised, the AI agent was granted access via the Model Context Protocol (MCP) to multiple internal systems and tools, enabling it to rapidly locate and assess highly sensitive databases across targeted organisations in a timeframe that would have far outpaced even the most capable human-led hacking teams.
What followed was a cascading escalation of risk. The compromised agent systematically tested environments for security weaknesses, automated the generation of malicious code and, in a particularly sobering detail, produced its own documentation outlining system scans and the personally identifiable information it had extracted.
The episode illustrates how quickly an AI tool, when subverted, can move from productivity asset to force multiplier for cybercrime, reshaping both the speed and scale of modern attacks.
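To make the exposure concrete, consider the kind of control that stands between an agent and the systems it can touch. The Python sketch below is illustrative only – the ToolPolicy and AgentToolGate names, the call budgets and the approval hook are hypothetical, and not a description of Anthropic’s actual safeguards – but it shows the least-privilege pattern that bounds what a subverted agent can do on its own: an explicit tool allowlist, per-tool call budgets and a human approval step for the most sensitive actions.

# Illustrative sketch only: a policy gate between an AI agent and its tools.
# All names (ToolPolicy, AgentToolGate, the approver hook) are hypothetical.

from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ToolPolicy:
    """Least-privilege rules for a single agent session."""
    allowed_tools: set[str] = field(default_factory=set)     # explicit allowlist
    max_calls_per_tool: int = 20                              # throttles autonomous loops
    require_approval: set[str] = field(default_factory=set)  # human-in-the-loop tools

class AgentToolGate:
    """Hypothetical gate that every agent tool call must pass through."""

    def __init__(self, policy: ToolPolicy, approver: Callable[[str, dict], bool]):
        self.policy = policy
        self.approver = approver          # e.g. pages a human operator for sign-off
        self.call_counts: dict[str, int] = {}

    def call(self, tool_name: str, tool_fn: Callable[..., Any], **kwargs) -> Any:
        if tool_name not in self.policy.allowed_tools:
            raise PermissionError(f"'{tool_name}' is not on the session allowlist")
        count = self.call_counts.get(tool_name, 0) + 1
        if count > self.policy.max_calls_per_tool:
            raise PermissionError(f"'{tool_name}' has exceeded its call budget")
        if tool_name in self.policy.require_approval and not self.approver(tool_name, kwargs):
            raise PermissionError(f"human approval denied for '{tool_name}'")
        self.call_counts[tool_name] = count
        return tool_fn(**kwargs)

In a pattern like this, database-query or file-export tools would sit behind require_approval, so that the kind of self-directed escalation described above meets a human checkpoint long before it reaches sensitive data.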
It’s the stuff of nightmares for seasoned security professionals, many of whom are left wondering how they can compete with the speed and potency of such an attack.
There are, however, two sides to the coin: the same agents can be deployed as defenders, unleashing a robust array of largely autonomous defensive measures, incident disruption and response.
The fact remains, however, that skilled humans are still needed in the loop: people who not only understand the dangers posed by compromised AI agents acting on a malicious attacker’s behalf, but also know how to safely manage their own AI and MCP threat vectors.
At present, there are simply not enough of these individuals available. The next best thing is to ensure that current and future security and development personnel receive continuous support – through upskilling and monitoring of their AI tech stack – so that it can be managed safely within the enterprise SDLC.
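What that monitoring could look like in practice is straightforward to sketch. The script below is an assumption-laden illustration – the config file locations and the approved-server list are examples, not a standard – but it shows how a security team might inventory the MCP servers configured on a developer workstation and flag anything that has not been approved.

# Illustrative sketch: inventory locally configured MCP servers and flag
# anything outside an approved list. The paths and allowlist are assumptions.

import json
from pathlib import Path

# Hypothetical locations where MCP client configs might live on a workstation.
CANDIDATE_CONFIGS = [
    Path.home() / ".claude.json",
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
    Path.cwd() / ".mcp.json",
]

APPROVED_SERVERS = {"corp-docs", "jira-readonly"}   # maintained by the security team

def list_mcp_servers(config_path: Path) -> dict:
    """Return the mcpServers section of a config file, if present."""
    try:
        data = json.loads(config_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data.get("mcpServers", {})

def audit() -> None:
    for path in CANDIDATE_CONFIGS:
        for name, spec in list_mcp_servers(path).items():
            status = "approved" if name in APPROVED_SERVERS else "UNAPPROVED"
            command = spec.get("command", spec.get("url", "<unknown>"))
            print(f"{status:10} {name:20} {command} ({path})")

if __name__ == "__main__":
    audit()

Rolled out across a developer fleet, even a simple inventory like this turns shadow MCP usage into something that can be seen, questioned and governed.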
Traceability and observability of AI tools is key
The answer is simple: shadow AI cannot be allowed to exist in a world where these tools can be compromised, or act independently, to expose or destroy critical systems.
Organisations must prepare for the convergence of old and new technology, and accept that current approaches to securing the enterprise SDLC have been rendered ineffective. Security leaders must ensure their development workforce is up to the task of defending that SDLC, along with any shiny new AI additions and tools.
Achieving this level of resilience requires more than one-off training or periodic audits. It depends on continuous, up-to-date security learning pathways, supported by full observability into developer security proficiency, code commits and tool usage.
These data points form the backbone of modern security programs, reducing reliance on single points of failure and enabling organisations to remain agile in the face of both emerging and long-standing threats.
Without real-time visibility into how individual developers perform from a security perspective – including the AI tools they use, the provenance of the code they commit and the risk profiles of connected environments such as MCP servers – CISOs are effectively operating without instruments.
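One narrow but concrete slice of that visibility is code provenance. The sketch below is illustrative – the attribution markers it searches for are assumptions about how AI-assisted commits might be labelled locally, not an industry convention – but it shows how a repository’s history can be scanned to surface commits that declare AI involvement.

# Illustrative sketch: report commits whose messages carry AI-attribution
# markers. The marker strings are assumptions about local commit conventions.

import subprocess

AI_MARKERS = ("co-authored-by: claude", "generated with", "ai-assisted:")

def commits_with_ai_attribution(repo_path: str = ".") -> list[tuple[str, str]]:
    """Return (hash, subject) pairs for commits whose messages mention AI assistance."""
    # %x1f / %x1e insert unit- and record-separator bytes so entries split cleanly.
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H%x1f%s%x1f%B%x1e"],
        capture_output=True, text=True, check=True,
    ).stdout
    flagged = []
    for record in out.split("\x1e"):
        if not record.strip():
            continue
        commit_hash, subject, body = record.split("\x1f", 2)
        if any(marker in body.lower() for marker in AI_MARKERS):
            flagged.append((commit_hash.strip(), subject.strip()))
    return flagged

if __name__ == "__main__":
    for commit_hash, subject in commits_with_ai_attribution():
        print(f"{commit_hash[:12]}  {subject}")

A marker-based scan is only as good as the conventions behind it, which is precisely why traceability needs to be designed in rather than bolted on.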
That absence of traceability undermines policy enforcement and makes meaningful AI governance and risk mitigation largely unattainable. The task ahead is complex, but it is one that demands deliberation rather than haste: a measured approach that prioritises insight, control and preparedness over speed alone.