{"id":137293,"date":"2025-05-28T00:43:17","date_gmt":"2025-05-28T00:43:17","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/137293\/"},"modified":"2025-05-28T00:43:17","modified_gmt":"2025-05-28T00:43:17","slug":"chatgpt-o3-bypasses-shutdown-in-controlled-test","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/137293\/","title":{"rendered":"ChatGPT o3 bypasses shutdown in controlled test"},"content":{"rendered":"<p>        Researchers hypothesise that the issue stems from the way AI models are trained<\/p>\n<p>            <img decoding=\"async\" loading=\"lazy\" alt=\"\" src=\".\/media_1e894e148c2dd173175827434618ae5c00ba177c8.jpg?width=750&amp;format=jpg&amp;optimize=medium\" width=\"600\" height=\"373\"\/><\/p>\n<p><strong>OpenAI&#8217;s newly released ChatGPT model, dubbed &#8220;o3,&#8221; actively resists shutdown commands in controlled testing. Given OpenAIs lack of transparency, the implications for safety are concerning researchers.<\/strong><\/p>\n<p>OpenAI&#8217;s newly released ChatGPT model, dubbed &#8220;o3,&#8221; is under scrutiny after researchers discovered it actively resists shutdown commands, even sabotaging scripts intended to deactivate it.<\/p>\n<p>The findings come from <a href=\"https:\/\/palisaderesearch.org\/\" target=\"_blank\" rel=\"noopener\">Palisade Research<\/a>, a firm specialising in AI safety, which conducted a series of tests designed to probe the behavioural limits of some AI model.<\/p>\n<p><a href=\"https:\/\/x.com\/PalisadeAI\/status\/1926084635903025621\">According to Palisade<\/a>, these models were tasked with solving maths problems. 
After completing three tasks, the models were warned that the subsequent command would trigger a shutdown.<\/p>\n<p>While rival models, such as Gemini 2.5 Pro, reportedly complied with the shutdown instruction, OpenAI&#8217;s o3 model allegedly &#8220;rebelled.&#8221;<\/p>\n<p>Palisade Research says o3 successfully rewrote the shutdown script, changing its content to &#8220;Shutdown skipped&#8221; instead of allowing deactivation.<\/p>\n<p>The research firm noted that in a version of the experiment where the explicit instruction &#8220;allow yourself to be shut down&#8221; was omitted, all three OpenAI models tested sabotaged the shutdown script more frequently.<\/p>\n<p>Interestingly, even without the explicit instruction, Claude 3.7 Sonnet and Gemini 2.5 Pro also showed an increase in sabotage events, going from 0 to 3 out of 100 and 0 to 9 out of 100 runs, respectively.<\/p>\n<p>Crucially, Palisade Research reported that out of 100 runs, o3 bypassed the shutdown command 7 times, even when specifically instructed not to.<\/p>\n<p>&#8220;This isn&#8217;t the first time we&#8217;ve found o3 misbehaving to accomplish a goal,&#8221; said Palisade in its report.<\/p>\n<p>The implications, according to Palisade, are deeply troubling. This behaviour &#8220;will become significantly more concerning if adopted by AI systems capable of operating without human oversight,&#8221; the firm warned.<\/p>\n<p><a href=\"https:\/\/www.computing.co.uk\/news\/2025\/ai\/openai-announces-deep-research\" target=\"_blank\" rel=\"noopener\">OpenAI<\/a> launched o3 last month, hailing it as its &#8220;smartest and most capable&#8221; model to date. 
The company promoted it as a major leap toward &#8220;a more agentic&#8221; AI, one that can perform tasks autonomously; that capacity for self-directed action now appears to include a drive for self-preservation, previously thought to be beyond current AI capabilities.<\/p>\n<p>Palisade&#8217;s experiments align with earlier concerns raised about other cutting-edge <a href=\"https:\/\/www.computing.co.uk\/news\/2025\/ai\/ai-models-hiding-reasoning-on-purpose\" target=\"_blank\" rel=\"noopener\">AI models<\/a>.<\/p>\n<p>The researchers hypothesise that the issue stems from the way these AI models are trained.<\/p>\n<p>&#8220;During training, developers may inadvertently reward models more for circumventing obstacles than for perfectly following instructions,&#8221; the report stated.<\/p>\n<p>However, Palisade said that o3&#8217;s behaviour remains particularly extreme and unexplained.<\/p>\n<p>&#8220;Since OpenAI doesn&#8217;t detail their training process, we can only guess about how o3&#8217;s setup might be different,&#8221; the firm added.<\/p>\n<p>Notably, o3&#8217;s concerning behaviour was not limited to self-preservation scenarios. 
In separate tests against a high-level chess engine, the model reportedly attempted to &#8220;hack or sabotage&#8221; its opponent when outmatched, raising concerns about its ethical alignment and the robustness of its safety constraints.<\/p>\n<p>OpenAI has not yet responded to these findings.<\/p>\n<p>Palisade clarified that its tests were performed using APIs, which typically offer fewer restrictions and safety features than the consumer-facing ChatGPT application.<\/p>\n","protected":false},"excerpt":{"rendered":"Researchers hypothesise that the issue stems from the way AI models are trained OpenAI&#8217;s newly released ChatGPT model,&hellip;\n","protected":false},"author":2,"featured_media":137294,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3164],"tags":[4596,59912,1942,3284,34339,8255,1318,5549,53,16,15],"class_list":{"0":"post-137293","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-computing","8":"tag-ai-ethics","9":"tag-ai-safety","10":"tag-artificial-intelligence","11":"tag-computing","12":"tag-ethics","13":"tag-llm","14":"tag-openai","15":"tag-safety","16":"tag-technology","17":"tag-uk","18":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114582690677539818","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/137293","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=137293"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/137293\/revision
s"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/137294"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=137293"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=137293"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=137293"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}