From Experimental Deployments to Enterprise Accountability in the Agentic Era

Jennifer Lawinski
April 6, 2026    

Why Your AI Agent Strategy Can't Run on Vibes Anymore
Ai adoption is following a familiar pattern that the AIUC-1 Consortium and Stanford’s Trustworthy AI Research Lab call “vibe adoption.” (Image: Shutterstock)

The past two years have been marked by a rise in artificial intelligence adoption across enterprises that has been partly driven by excitement and partly driven by fear of missing out, but it has followed a familiar pattern that the AIUC-1 Consortium and Stanford’s Trustworthy AI Research Lab call “vibe adoption.”

See Also: AI Impersonation Is the New Arms Race—Is Your Workforce Ready?

This cycle follows from exciting proof of concept through leadership’s desire to push projects into production and typically ends with a rushed security review that’s sometimes skipped entirely.

“The AI world is moving so fast that people are rushing to experiment, and the value that is coming out of those experiments is so great that they immediately flip into real use – but they’re doing it in a bit of a rush, without thinking about some of the downside risks, because they’re so excited about the upside,” said Brad Arkin, security lead at AIUC-1, a group that’s working to create standards for AI agents, and a co-author of the consortium’s recent white paper on AI security.

When things go wrong, the ramifications can be significant. Recent research from EY found that 99% of survey respondents said they lost money from AI-related risks, and 64% said they lost more than $1 million to AI failures.

The AIUC-1 Consortium is focused on helping the technology industry evolve from “these wild experiments into a more thoughtful way to manage the risks while capturing the value of the upside of what this new technology is presenting,” Arkin said.

Why Agents Break What You’ve Already Built

The core problem lies in how AI agents behave. They don’t function like the systems that traditional security controls have been designed to govern, and they operate quickly and make autonomous decisions, sometimes spinning up additional agents as they operate.

The challenges can quickly compound, and the consortium outlines three.

First is the agent challenge. As AI moves from assistant to autonomous actor, raising the risk that your agent could take the wrong actions on its own. McKinsey research finds that 80% of organizations report risky behaviors from AI agents, including improper data exposure and access to systems without authorization. At the same time, only 21% of tech leaders say they have complete visibility into what their agents are doing.

One mistake leaders make is thinking of AI agents as a new kind of service account, said Nancy Wang, CTO of 1Password and a co-author of the white paper. Agents are more dynamic. “Aren’t agents just service accounts? The answer is yes, but not quite,” she said. “Something like a service account is essentially static. It’s a little bit different from an agent, because they can get spun up and spun down.”

Wang described a recent internal experiment at 1Password that illustrated the problem. The company spun up a swarm of over 100 agents to debug a production database, Wang said, and “when you think about a swarm of over 100 different non-human identities, how do you know which identity did what?”

Visibility presents another challenge, especially when teams are attempting to wrangle shadow AI. Recent research from IBM finds that 63% of employees say they have pasted sensitive company data into personal chatbots, and one in five organizations reported a breach related to shadow AI. Only 37% say they have policies in place to manage and detect it.

The third challenge is trust. How do you trust when you can’t trust your inputs or your outputs? Prompt injection hit organizations in 2025, and model safety is difficult to verify.

“A lot of folks are looking at model benchmarks or performance benchmarks, and those are not the same thing as security benchmarks,” Wang said. “You almost have to think about it as a systems-level problem, not a ‘Is this model safe or not?’ problem.”

Taking the Wheel on Agent Security

While the risk profile is significant, technology leaders should resist the temptation to stop or slow down, Arkin said. “There’s a category of CISOs who just say it’s too new, it’s too dangerous. We can’t allow this,” he said. “Those organizations will be left behind in a flash if they take that approach.”

But the opposite extreme, deciding that AI is ungovernable and taking the plunge anyway, is also a misstep. “The next misconception is that it’s too new, and there’s nothing to be done. I either have to decide to block it or just allow it, and it’s the Wild West,” he said.

For organizations ready to move from experimental deployments to real-world applications, the consortium outlines practical frameworks that are ready to use.

Start by demanding technically grounded frameworks from vendors. “As a buyer of agent capabilities, ask to see their AIC-1 certification,” Arkin said. “You’ll have the confidence that this thing has been tested to the fullest degree, with all the different types of things that can go wrong.”

Next, build an identity stack designed for agents with tightly scoped, task-specific permissions rather than broad, long-lived credentials. “The expense report agent of the future would be delegated for me, acting on my behalf, but it should only be able to read and move a very limited number of emails,” Arkin said. “If something were to go wrong, the blast radius would be managed much better.”

Third, evaluate your vendor review cadence. Annual assessments were built for static assets, and AI systems operate very differently. The AIUC-1 model requires recertification every 90 days, with the standard itself updated as new attack techniques emerge, creating a fast-track “diamond lane” for certified vendors that compresses procurement timelines significantly.

To further bolster security, organizations should integrate continuous red teaming. “The standard goes into great detail about the different types of prompt injection and how to defend against it,” Arkin said. “As you go through it the first time, it’s quite likely that you’d say we didn’t think about that one.”

Tech leaders shouldn’t feel that they’re in this on their own, Arkin said. “There is help. There are public standards. This stuff is out there. If you’re doing it in house yourself, you can build it in compliance with the standard. If you’re buying something someone else built, you can ask them to make sure they’re in compliance.”