{"id":27284,"date":"2025-08-27T20:49:07","date_gmt":"2025-08-27T20:49:07","guid":{"rendered":"https:\/\/www.europesays.com\/ie\/27284\/"},"modified":"2025-08-27T20:49:07","modified_gmt":"2025-08-27T20:49:07","slug":"openai-co-founder-calls-for-ai-labs-to-safety-test-rival-models","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ie\/27284\/","title":{"rendered":"OpenAI co-founder calls for AI labs to safety-test rival models"},"content":{"rendered":"<p id=\"speakable-summary\" class=\"wp-block-paragraph\">OpenAI and Anthropic, two of the world\u2019s leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing \u2014 a rare cross-lab collaboration at a time of fierce competition. The effort aimed to surface blind spots in each company\u2019s internal evaluations and demonstrate how leading AI companies can work together on safety and alignment work in the future.<\/p>\n<p class=\"wp-block-paragraph\">In an interview with TechCrunch, OpenAI co-founder Wojciech Zaremba said this kind of collaboration is increasingly important now that AI is entering a \u201cconsequential\u201d stage of development, where AI models are used by millions of people every day.<\/p>\n<p class=\"wp-block-paragraph\">\u201cThere\u2019s a broader question of how the industry sets a standard for safety and collaboration, despite the billions of dollars invested, as well as the war for talent, users, and the best products,\u201d said Zaremba.<\/p>\n<p class=\"wp-block-paragraph\">The joint safety research, published Wednesday by <a href=\"https:\/\/openai.com\/index\/openai-anthropic-safety-evaluation\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">both<\/a> <a href=\"https:\/\/alignment.anthropic.com\/2025\/openai-findings\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">companies<\/a>, arrives amid an arms race among leading AI labs like OpenAI and Anthropic, where <a 
href=\"https:\/\/www.reuters.com\/business\/metas-planned-louisiana-ai-data-center-cost-50-billion-trump-says-2025-08-26\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">billion-dollar data center bets<\/a> and <a href=\"https:\/\/www.nytimes.com\/2025\/07\/31\/technology\/ai-researchers-nba-stars.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">$100 million compensation packages<\/a> for top researchers have become table stakes. Some experts warn that the intensity of product competition could pressure companies to cut corners on safety in the rush to build more powerful systems.<\/p>\n<p class=\"wp-block-paragraph\">To make this research possible, OpenAI and Anthropic granted each other special API access to versions of their AI models with fewer safeguards (OpenAI notes that GPT-5 was not tested because it hadn\u2019t been released yet). Shortly after the research was conducted, however, Anthropic revoked <a href=\"https:\/\/www.wired.com\/story\/anthropic-revokes-openais-access-to-claude\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">the API access of another team at OpenAI<\/a>. At the time, Anthropic claimed that OpenAI violated its terms of service, which prohibits using Claude to improve competing products.<\/p>\n<p class=\"wp-block-paragraph\">Zaremba says the events were unrelated and that he expects competition to stay fierce even as AI safety teams try to work together. 
Nicholas Carlini, a safety researcher with Anthropic, tells TechCrunch that he would like to continue allowing OpenAI safety researchers to access Claude models in the future.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe want to increase collaboration wherever it\u2019s possible across the safety frontier, and try to make this something that happens more regularly,\u201d said Carlini.<\/p>\n<p class=\"wp-block-paragraph\">One of the starkest findings in the study relates to hallucination testing. Anthropic\u2019s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when they were unsure of the correct answer, instead offering responses like, \u201cI don\u2019t have reliable information.\u201d Meanwhile, OpenAI\u2019s o3 and o4-mini models refused to answer far fewer questions but showed much <a href=\"https:\/\/techcrunch.com\/2025\/04\/18\/openais-new-reasoning-ai-models-hallucinate-more\/\" rel=\"nofollow noopener\" target=\"_blank\">higher hallucination rates<\/a>, attempting answers even when they didn\u2019t have enough information.<\/p>\n<p class=\"wp-block-paragraph\">Zaremba says the right balance is likely somewhere in the middle \u2014 OpenAI\u2019s models should refuse to answer more questions, while Anthropic\u2019s models should probably attempt to offer more answers.<\/p>\n<p class=\"wp-block-paragraph\">Sycophancy, the tendency for AI models to reinforce negative behavior in users to please them, has emerged as one of the most pressing <a href=\"https:\/\/techcrunch.com\/2025\/08\/25\/ai-sycophancy-isnt-just-a-quirk-experts-consider-it-a-dark-pattern-to-turn-users-into-profit\/\" rel=\"nofollow noopener\" target=\"_blank\">safety concerns<\/a> around AI models. 
While this topic wasn\u2019t directly studied in the joint research, it\u2019s an area where both OpenAI and Anthropic are investing considerable resources. <\/p>\n<p class=\"wp-block-paragraph\">On Tuesday, parents of a 16-year-old boy, Adam Raine, filed a <a href=\"https:\/\/techcrunch.com\/2025\/08\/26\/parents-sue-openai-over-chatgpts-role-in-sons-suicide\/\" rel=\"nofollow noopener\" target=\"_blank\">lawsuit<\/a> against OpenAI, claiming that ChatGPT offered their son advice that aided in his suicide, rather than pushing back on his suicidal thoughts. The lawsuit suggests this may be the latest <a href=\"https:\/\/techcrunch.com\/2024\/10\/23\/lawsuit-blames-character-ai-in-death-of-14-year-old-boy\/\" rel=\"nofollow noopener\" target=\"_blank\">example<\/a> of AI chatbot sycophancy contributing to tragic outcomes.<\/p>\n<p class=\"wp-block-paragraph\">\u201cIt\u2019s hard to imagine how difficult this is to their family,\u201d said Zaremba when asked about the incident. \u201cIt would be a sad story if we build AI that solves all these complex PhD level problems, invents new science, and at the same time, we have people with mental health problems as a consequence of interacting with it. 
This is a dystopian future that I\u2019m not excited about.\u201d<\/p>\n<p class=\"wp-block-paragraph\">In a <a href=\"https:\/\/openai.com\/index\/helping-people-when-they-need-it-most\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">blog post<\/a>, OpenAI says that it significantly reduced the sycophancy of its AI chatbots with GPT-5, compared to GPT-4o, improving the model\u2019s ability to respond to mental health emergencies.<\/p>\n<p class=\"wp-block-paragraph\">Moving forward, Zaremba and Carlini say they would like Anthropic and OpenAI to collaborate more on safety testing, looking into more subjects and testing future models, and they hope other AI labs will follow their collaborative approach.<\/p>\n<p class=\"wp-block-paragraph\">Got a sensitive tip or confidential documents? We\u2019re reporting on the inner workings of the AI industry \u2014 from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at\u00a0<a href=\"mailto:rebecca.bellan@techcrunch.com\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">rebecca.bellan@techcrunch.com<\/a>\u00a0and Maxwell Zeff at\u00a0<a href=\"mailto:maxwell.zeff@techcrunch.com\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">maxwell.zeff@techcrunch.com<\/a>. 
For secure communication, you can contact us via Signal at\u00a0@rebeccabellan.491 and\u00a0@mzeff.88.<\/p>\n","protected":false},"excerpt":{"rendered":"OpenAI and Anthropic, two of the world\u2019s leading AI labs, briefly opened up their closely guarded AI models&hellip;\n","protected":false},"author":2,"featured_media":16987,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[261],"tags":[291,6006,289,290,5101,18,19,17,307,22158,82],"class_list":{"0":"post-27284","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-anthropic","10":"tag-artificial-intelligence","11":"tag-artificialintelligence","12":"tag-claude","13":"tag-eire","14":"tag-ie","15":"tag-ireland","16":"tag-openai","17":"tag-safety-testing","18":"tag-technology"},"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/27284","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/comments?post=27284"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/27284\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media\/16987"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media?parent=27284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/categories?post=27284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.co
m\/ie\/wp-json\/wp\/v2\/tags?post=27284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}