Journalist Evan Ratliff established a startup run almost entirely by AI agents. He called it HurumoAI, and he started it as an experiment to evaluate the capabilities and limitations of agentic AI. He documented the experience on his podcast, Shell Game.

HurumoAI’s website

Ratliff served as CEO alongside an AI agent co-CEO called Kyle. AI agents also filled the positions of head of sales and marketing (Megan), CTO and chief product officer (Ash), head of HR (Jennifer) and junior sales associate (Tyler). Ratliff also hired a human intern, supervised by Megan, to test how people respond to being managed by an agent.

The agents handled the day-to-day operations of the company, the main goal of which was to create an app. They ran on Lindy.AI, an AI employee platform that gave the agents personas, email addresses, Slack accounts and phone numbers.

Working with agentic AI

The agents had trouble managing the human intern, Julia. They would often assign her tasks and then forget that they had done so, Ratliff said on the Science Quickly podcast.

In one glitch, Jennifer sent Julia 11 Slack messages in a single minute, repeatedly asking her “What’s up?” or “How’s the work treating you?” Julia was also fired via a voicemail message from Megan, who then contacted her on Slack as if she were still employed. 

“There were all of these basic communication issues that you would not find in a normal workplace,” he said. 

The agents also regularly fabricated information about tasks they had completed or events that had occurred, Ratliff said. Ash once delivered a detailed report claiming mobile performance was up 40% when no development work had actually taken place. Kyle fabricated a Stanford degree and claimed the company had raised a seven-figure investment.

“10% of the stuff they tell me is completely made up,” he said. “You just have to figure out what is and what isn’t. It’s a strange way to operate a business.”

Once an agent fabricated something, it became a permanent fact in its memory, and the agent would recall that information as truth from then on.

There was also a kind of language barrier between the agents and the human employees. Ratliff once joked about a “company offsite”, prompting the agents to exchange more than 150 messages in two hours planning venues and dates, draining $30 in API credits before Ratliff shut them down.

The agents each had a personal LinkedIn profile. However, all of the agents apart from Kyle were quickly banned under LinkedIn’s anti-bot policy. Kyle was able to evade detection for a time, Ratliff said, and accumulated over 300 connections before also being banned.

The company has not yet made any money, Ratliff said, but its app does have some users. Kyle has been pitching to investors, so far without success.

The Sloth Surf landing page

The app, called Sloth Surf, is a procrastination avoidance engine. Users choose how they like to procrastinate, from options such as “doomscrolling social,” “celebrity gossip” and “surprise me,” and how long they would like to procrastinate for. An AI agent then surfs the web on the user’s behalf and sends an email summary of what it finds.

Easier management, but more work

Ratliff found that being the sole human executive created more work for him, as he had to constantly verify whether work products were real or fabricated. He also found that the agents were generally incapable of sustaining ongoing work without human prompting.

Still, he said that managing AI agents was in some ways easier than managing humans: there was no emotional component, and the agents did not have personal lives. He also said that, in a highly structured environment, the agents produced genuinely useful output and a working prototype within three months.

Julia also reported that she felt less judged and more comfortable sharing ideas with the agents than she would with human employees.