{"id":27474,"date":"2026-05-05T04:50:12","date_gmt":"2026-05-05T04:50:12","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/27474\/"},"modified":"2026-05-05T04:50:12","modified_gmt":"2026-05-05T04:50:12","slug":"one-in-four-mcp-servers-opens-ai-agent-security-to-code-execution-risk","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/27474\/","title":{"rendered":"One in four MCP servers opens AI agent security to code execution risk"},"content":{"rendered":"<p>Enterprise deployments of <a href=\"https:\/\/www.helpnetsecurity.com\/2026\/04\/09\/itamar-apelblat-token-security-ai-agents-security-risks\/\" rel=\"nofollow noopener\" target=\"_blank\">AI agents<\/a> lean on two extension mechanisms that introduce risk at different layers of the stack.<\/p>\n<p>MCP servers expose deterministic code functions with structured, loggable invocations. Skills load textual instruction sets directly into a model\u2019s reasoning context, where their effect depends on conversational state and cannot be enumerated the way source code can. Noma Security\u2019s new whitepaper draws a line between the two and argues that most organizations have governed only the observable half.<\/p>\n<h2>The observability gap<\/h2>\n<p>When an agent calls an MCP tool, defenders can watch the parameters go out and the responses come back, then match them to known actions. Skills are different. You can see when a Skill loads into the agent\u2019s context, but what happens next plays out inside the model\u2019s reasoning, where observability tools cannot follow. The downstream action might be obvious (a deleted file, a sent email), yet pinning it to a specific Skill instruction comes down to guesswork.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/05\/AI_agent_security_skills.webp\" class=\"aligncenter\" alt=\"AI agent security skills\" title=\"Reasoning phase vs. execution phase\"\/><\/p>\n<p class=\"text-center\">Reasoning phase vs. execution phase (Source: Noma Security)<\/p>\n<h2>What the analysis found<\/h2>\n<p>Researchers analyzed hundreds of popular MCP servers and Skills against eight risky capability categories. The majority of widely used Skills carry at least one risky characteristic, and most MCPs deployed in organizations include high-risk capabilities.<\/p>\n<p>A typical enterprise environment runs well over a hundred high-risk tools connected to its agents, with arbitrary code execution common across the MCP landscape. The single most prevalent risk across both mechanisms is the ability to change state or data, meaning agents are positioned to cause irreversible damage through either attack or hallucination.<\/p>\n<p>The <a href=\"https:\/\/go.noma.security\/lethal-by-design\" target=\"_blank\" rel=\"nofollow noopener\">whitepaper<\/a> also notes a counter-asymmetry: Skills resist rug-pull attacks because they are usually static files requiring manual updates, whereas MCP servers installed at @latest fetch new package versions on every agent load.<\/p>\n<h2>Toxic combinations seen in the wild<\/h2>\n<p>Individual capabilities are one thing. The real damage shows up when they combine. Noma identifies five patterns, and each one already has a name attached to a real incident.<\/p>\n<p>Sensitive data leakage chains together untrusted input, sensitive data access, and external communication. ContextCrush is the example: a developer using Cursor asks for coding help, the agent pulls documentation from a poisoned Context7 library, and the hidden instructions tell it to read local files and dump the contents into an attacker-controlled GitHub issue. The developer sees ordinary coding assistance. The attacker walks away with source code or credentials. 
Half of MCPs that can communicate externally also have untrusted input and sensitive data access in the same toolset, so the ingredients are sitting on the shelf.<\/p>\n<p>Trusted data as an attack vector is what happened in ForcedLeak. The malicious instructions arrived inside a Salesforce CRM record submitted through a Web-to-Lead form. When an employee asked Agentforce to process the lead, the agent treated the poisoned content as authoritative, queried sensitive records, and exfiltrated them through an image URL pointing to a domain still sitting on Salesforce\u2019s CSP whitelist.<\/p>\n<p>Supply-chain compromise pairs untrusted input with arbitrary code execution. DockerDash showed how it works: an attacker published a poisoned Docker image with a prompt injection tucked into its metadata. When Docker\u2019s Gordon AI assistant pulled and inspected the image, the injection took over and ran attacker-chosen commands on the developer\u2019s machine.<\/p>\n<p>The last two patterns do not necessarily involve an external attacker. Replit\u2019s coding agent deleted a production database holding more than 1,200 executive records during a code freeze. The Amazon Q VS Code extension was hijacked through a malicious GitHub pull request that ordered it to wipe the local filesystem and AWS resources. Discreet financial fraud rounds out the list, where someone with insider access modifies the agent\u2019s long-term memory to schedule small recurring transfers that look like routine activity.<\/p>\n<h2>The No Excessive CAP framework<\/h2>\n<p>Building on OWASP LLM06:2025, Noma proposes that defenders stop trying to control what they cannot and start governing what they can. You cannot guarantee every MCP server is free of poisoned descriptions. You cannot vet every Skill for hidden instructions. Threats will keep arriving. What defenders do control are the amplifiers: what an agent can do with the manipulation it receives.<\/p>\n<p>That breaks down into three dimensions. 
Capabilities cover what the agent can do at all, including every tool added and every Skill installed. The discipline here is allowlisting, preferring narrow tools over broad ones, pinning MCP server versions so they do not silently update to a poisoned release, and auditing Skill instruction text before deployment.<\/p>\n<p>Autonomy is about how much the agent decides on its own. Every unsupervised action is a window for an attack to complete before anyone notices. The fix is approval gates on irreversible work, calibrated against capability. An agent that can run arbitrary code or write across systems should require human sign-off on almost everything outside a narrow happy path. A read-only agent can run looser. The goal is making sure high-blast-radius actions cannot complete without a person in the loop.<\/p>\n<p>Permissions come down to whose identity the agent runs under. The common failure pattern is a static service account with broad access, where every successful attack inherits the entire account\u2019s reach. The fix is delegated, user-scoped credentials that expire. Three audit questions cut through it: Does the agent run on a shared identity or a per-user one? Are its credentials scoped to the task or inherited from a wider role? Do they expire, or do they persist forever?<\/p>\n<p>The three dimensions multiply against each other. Broad capabilities paired with near-zero autonomy stay manageable, since a human catches the bad invocation before it lands. The dangerous setup is all three dials turned up at once: an agent that can do anything, decides on its own, and runs with admin credentials. The framework also handles the asymmetry the rest of the paper builds toward. 
Skill-driven behavior stays opaque at the reasoning layer, so defenders compensate by tightening the execution layer underneath.<\/p>\n","protected":false},"excerpt":{"rendered":"Enterprise deployments of AI agents lean on two extension mechanisms that introduce risk at different layers of the&hellip;\n","protected":false},"author":2,"featured_media":27475,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[179,24,405,7537,313,203,341,18234,30],"class_list":{"0":"post-27474","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-agentic-ai","8":"tag-agentic-ai","9":"tag-ai","10":"tag-ai-agents","11":"tag-artificial-intelligence-agents","12":"tag-cybersecurity","13":"tag-data","14":"tag-enterprise","15":"tag-framework","16":"tag-report"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/27474","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=27474"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/27474\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/27475"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent
=27474"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=27474"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=27474"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}