{"id":122560,"date":"2025-05-22T12:53:09","date_gmt":"2025-05-22T12:53:09","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/122560\/"},"modified":"2025-05-22T12:53:09","modified_gmt":"2025-05-22T12:53:09","slug":"whos-to-blame-when-ai-agents-screw-up","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/122560\/","title":{"rendered":"Who\u2019s to Blame When AI Agents Screw Up?"},"content":{"rendered":"<p>Over the past year, veteran software engineer Jay Prakash Thakur has spent his nights and weekends prototyping <a href=\"https:\/\/www.wired.com\/story\/ai-agents-personal-assistants-manipulation-engines\/\" target=\"_blank\" rel=\"noopener\">AI agents<\/a> that could, in the near future, order meals and engineer mobile apps almost entirely on their own. His agents, while surprisingly capable, have also exposed new legal questions that await companies trying to capitalize on Silicon Valley\u2019s hottest new technology.<\/p>\n<p class=\"paywall\"><a href=\"https:\/\/www.wired.com\/story\/fast-forward-forget-chatbots-ai-agents-are-the-future\/\" target=\"_blank\" rel=\"noopener\">Agents are AI programs<\/a> that can act mostly independently, allowing companies to automate tasks such as answering customer questions or paying invoices. 
While ChatGPT and similar chatbots can draft emails or analyze bills upon request, Microsoft and other tech giants expect that agents will tackle <a href=\"https:\/\/www.wired.com\/story\/zico-kolter-ai-agents-game-theory\/\" target=\"_blank\" rel=\"noopener\">more complex functions<\/a>\u2014and most importantly, do it <a href=\"https:\/\/www.wired.com\/story\/the-prompt-ai-agents-how-much-should-we-let-them-do\/\" target=\"_blank\" rel=\"noopener\">with little human oversight<\/a>.<\/p>\n<p class=\"paywall\">The tech industry\u2019s most <a href=\"https:\/\/www.wired.com\/story\/googles-ai-boss-says-geminis-new-abilities-point-the-way-to-agi\/\" target=\"_blank\" rel=\"noopener\">ambitious plans<\/a> involve multi-agent systems, with dozens of agents someday teaming up to replace <a data-offer-url=\"https:\/\/www.dwarkesh.com\/p\/ai-firm\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/www.dwarkesh.com\/p\/ai-firm&quot;}\" href=\"https:\/\/www.dwarkesh.com\/p\/ai-firm\" rel=\"nofollow noopener\" target=\"_blank\">entire workforces<\/a>. For companies, the benefit is clear: saving on time and labor costs. Already, demand for the technology is rising. 
Tech market researcher Gartner <a data-offer-url=\"https:\/\/www.gartner.com\/en\/newsroom\/press-releases\/2025-03-05-gartner-predicts-agentic-ai-will-autonomously-resolve-80-percent-of-common-customer-service-issues-without-human-intervention-by-20290#:~:text=By%202029%2C%20agentic%20AI%20will,way%20service%20interactions%20are%20conducted.\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/www.gartner.com\/en\/newsroom\/press-releases\/2025-03-05-gartner-predicts-agentic-ai-will-autonomously-resolve-80-percent-of-common-customer-service-issues-without-human-intervention-by-20290#:~:text=By%202029%2C%20agentic%20AI%20will,way%20service%20interactions%20are%20conducted.&quot;}\" href=\"https:\/\/www.gartner.com\/en\/newsroom\/press-releases\/2025-03-05-gartner-predicts-agentic-ai-will-autonomously-resolve-80-percent-of-common-customer-service-issues-without-human-intervention-by-20290#:~:text=By%202029%2C%20agentic%20AI%20will,way%20service%20interactions%20are%20conducted.\" rel=\"nofollow noopener\" target=\"_blank\">estimates<\/a> that agentic AI will resolve 80 percent of common customer service queries by 2029. Fiverr, a service where businesses can book freelance coders, <a data-offer-url=\"https:\/\/www.fiverr.com\/cp\/business-trends-index-may-2025\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/www.fiverr.com\/cp\/business-trends-index-may-2025&quot;}\" href=\"https:\/\/www.fiverr.com\/cp\/business-trends-index-may-2025\" rel=\"nofollow noopener\" target=\"_blank\">reports<\/a> that searches for \u201cai agent\u201d have surged 18,347 percent in recent months.<\/p>\n<p class=\"paywall\">Thakur, a mostly self-taught coder living in California, wanted to be at the forefront of the emerging field. 
His day job at Microsoft isn\u2019t related to agents, but he has been tinkering with <a href=\"https:\/\/link.wired.com\/public\/35624948\" target=\"_blank\" rel=\"noopener\">AutoGen<\/a>, Microsoft\u2019s open source software for building agents, since he worked at Amazon back in 2024. Thakur says he has developed multi-agent prototypes using AutoGen with just a dash of programming. Last week, Amazon rolled out a similar agent development tool called Strands; <a href=\"https:\/\/www.wired.com\/story\/google-gemini-2-ai-assistant-release\/\" target=\"_blank\" rel=\"noopener\">Google<\/a> offers what it calls an Agent Development Kit.<\/p>\n<p class=\"paywall\">Because agents are meant to act autonomously, the question of who bears responsibility when their errors cause financial damage has been Thakur\u2019s biggest concern. Assigning blame when agents from different companies miscommunicate within a single, large system could become contentious, he believes. He compared the challenge of reviewing error logs from various agents to reconstructing a conversation based on different people\u2019s notes. \u201cIt\u2019s often impossible to pinpoint responsibility,\u201d Thakur says.<\/p>\n<p class=\"paywall\">Joseph Fireman, senior legal counsel at OpenAI, said on stage at a recent legal conference hosted by the Media Law Resource Center in San Francisco that aggrieved parties tend to go after those with the deepest pockets. That means companies like his will need to be prepared to take some responsibility when agents cause harm, even when a kid messing around with an agent might be to blame. (If that person were at fault, they likely wouldn\u2019t be a worthwhile target moneywise, the thinking goes.) \u201cI don\u2019t think anybody is hoping to get through to the consumer sitting in their mom\u2019s basement on the computer,\u201d Fireman said. 
The insurance industry <a href=\"https:\/\/www.ft.com\/content\/1d35759f-f2a9-46c4-904b-4a78ccc027df\" target=\"_blank\" rel=\"noopener\">has begun rolling out coverage<\/a> for AI chatbot issues to help companies cover the costs of mishaps.<\/p>\n<p>Onion Rings<\/p>\n<p class=\"paywall\">Thakur\u2019s experiments have involved stringing together agents in systems that require as little human intervention as possible. One project he pursued was replacing fellow software developers with two agents. One was trained to search for specialized tools needed for making apps, and the other summarized their usage policies. In the future, a third agent could use the identified tools and follow the summarized policies to develop an entirely new app, Thakur says.<\/p>\n<p class=\"paywall\">When Thakur put his prototype to the test, the search agent found a tool that, according to its website, \u201csupports unlimited requests per minute for enterprise users\u201d (meaning high-paying clients can rely on it as much as they want). But in trying to distill the key information, the summarization agent dropped the crucial qualification of \u201cper minute for enterprise users.\u201d It erroneously told the coding agent, which did not qualify as an enterprise user, that it could write a program that made unlimited requests to the outside service. Because this was a test, there was no harm done. 
If it had happened in real life, the truncated guidance could have led to the entire system unexpectedly breaking down.<\/p>\n","protected":false},"excerpt":{"rendered":"Over the past year, veteran software engineer Jay Prakash Thakur has spent his nights and weekends prototyping AI&hellip;\n","protected":false},"author":2,"featured_media":122561,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,1942,2725,3690,6197,7154,53,16,15],"class_list":{"0":"post-122560","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-law","11":"tag-machine-learning","12":"tag-robots","13":"tag-software","14":"tag-technology","15":"tag-uk","16":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114551587021375577","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/122560","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=122560"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/122560\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/122561"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=122560"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=122560"},{"taxonomy":"post_tag","embeddable":true,"href":"htt
ps:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=122560"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}