{"id":27995,"date":"2025-08-28T04:38:10","date_gmt":"2025-08-28T04:38:10","guid":{"rendered":"https:\/\/www.europesays.com\/ie\/27995\/"},"modified":"2025-08-28T04:38:10","modified_gmt":"2025-08-28T04:38:10","slug":"anthropic-and-openai-evaluate-safety-of-each-others-ai-models","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ie\/27995\/","title":{"rendered":"Anthropic and OpenAI Evaluate Safety of Each Other\u2019s AI Models"},"content":{"rendered":"<p style=\"font-weight: 400;\">Artificial intelligence startups <a href=\"https:\/\/www.anthropic.com\/company\" target=\"_blank\" rel=\"noopener nofollow\">Anthropic<\/a> and <a href=\"https:\/\/openai.com\/about\/\" target=\"_blank\" rel=\"noopener nofollow\">OpenAI<\/a> said Wednesday (Aug. 27) that they evaluated each other\u2019s public models, using their own safety and misalignment tests.<\/p>\n<p style=\"font-weight: 400;\">Sharing this news and the results in separate blog posts, the companies said they looked for problems like sycophancy, whistleblowing, self-preservation, supporting human misuse and capabilities that could undermine AI safety evaluations and oversight.<\/p>\n<p style=\"font-weight: 400;\">OpenAI wrote in its <a href=\"https:\/\/openai.com\/index\/openai-anthropic-safety-evaluation\/\" target=\"_blank\" rel=\"noopener nofollow\">post<\/a> that this collaboration was a \u201cfirst-of-its-kind joint evaluation\u201d and that it demonstrates how labs can work together on issues like these.<\/p>\n<p style=\"font-weight: 400;\">Anthropic wrote in its <a href=\"https:\/\/alignment.anthropic.com\/2025\/openai-findings\/\" target=\"_blank\" rel=\"noopener nofollow\">post<\/a> that the joint evaluation exercise was meant to help mature the field of alignment evaluations and \u201cestablish production-ready best practices.\u201d<\/p>\n<p style=\"font-weight: 400;\">Reporting the findings of its evaluations, Anthropic said OpenAI\u2019s o3 and o4-mini reasoning models were aligned 
as well as or better than its own models overall, the GPT-4o and GPT-4.1 general-purpose models showed some examples of \u201cconcerning behavior,\u201d especially around misuse, and both companies\u2019 models struggled to some degree with sycophancy.<\/p>\n<p style=\"font-weight: 400;\">The post noted that OpenAI\u2019s GPT-5 had not yet been made available during the testing period.<\/p>\n<p style=\"font-weight: 400;\">OpenAI wrote in its post that Anthropic\u2019s Claude 4 models generally performed well on evaluations stress-testing their ability to respect the instruction hierarchy, performed less well on jailbreaking evaluations that focused on trained-in safeguards, generally proved to be aware of their uncertainty and avoided making statements that were inaccurate, and performed especially well or especially poorly on scheming evaluations, depending on the subset of testing.<\/p>\n<p style=\"font-weight: 400;\">Both companies said in their posts that, for the purpose of testing, they relaxed some model-external safeguards that otherwise would be in operation but would interfere with the tests.<\/p>\n<p style=\"font-weight: 400;\">They each said that their latest models, OpenAI\u2019s GPT-5 and Anthropic\u2019s Opus 4.1, which were released after the evaluations, have shown improvements over the earlier models.<\/p>\n<p style=\"font-weight: 400;\"><a href=\"https:\/\/www.pymnts.com\/artificial-intelligence-2\/2024\/ai-explained-ai-alignment\/\" target=\"_blank\" rel=\"noopener nofollow\">AI alignment<\/a>, or the challenge of ensuring that artificial intelligence systems behave in beneficial ways that align with human values, has become a focal point for researchers, tech companies and policymakers grappling with the implications of advanced AI, PYMNTS reported in July 2024.<\/p>\n<p style=\"font-weight: 400;\"><a href=\"https:\/\/www.pymnts.com\/artificial-intelligence-2\/2025\/tech-companies-continue-push-for-ban-on-state-ai-regulations\/\" 
target=\"_blank\" rel=\"noopener nofollow\">AI regulation<\/a> has also been an issue for the industry amid an ongoing debate over whether states should be able to implement their own AI rules.<\/p>\n","protected":false},"excerpt":{"rendered":"Artificial intelligence startups Anthropic and OpenAI said Wednesday (Aug. 27) that they evaluated each other\u2019s public models, using&hellip;\n","protected":false},"author":2,"featured_media":27996,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[261],"tags":[291,6006,289,290,250,18,19,17,5,307,1351,82,2350],"class_list":{"0":"post-27995","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-anthropic","10":"tag-artificial-intelligence","11":"tag-artificialintelligence","12":"tag-digital-transformation","13":"tag-eire","14":"tag-ie","15":"tag-ireland","16":"tag-news","17":"tag-openai","18":"tag-pymnts-news","19":"tag-technology","20":"tag-whats-hot"},"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/27995","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/comments?post=27995"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/27995\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media\/27996"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media?parent=27995"}],"wp:term":[{"taxonomy":"category","embed
dable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/categories?post=27995"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/tags?post=27995"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}