{"id":72689,"date":"2025-05-04T02:14:16","date_gmt":"2025-05-04T02:14:16","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/72689\/"},"modified":"2025-05-04T02:14:16","modified_gmt":"2025-05-04T02:14:16","slug":"one-of-googles-recent-gemini-ai-models-scores-worse-on-safety","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/72689\/","title":{"rendered":"One of Google&#8217;s recent Gemini AI models scores worse on safety"},"content":{"rendered":"<p id=\"speakable-summary\" class=\"wp-block-paragraph\">A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company\u2019s internal benchmarking. <\/p>\n<p class=\"wp-block-paragraph\">In a <a href=\"https:\/\/storage.googleapis.com\/model-cards\/documents\/gemini-2.5-flash-preview.pdf\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">technical report<\/a> published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, \u201ctext-to-text safety\u201d and \u201cimage-to-text safety,\u201d Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.<\/p>\n<p class=\"wp-block-paragraph\">Text-to-text safety measures how frequently a model violates Google\u2019s guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.<\/p>\n<p class=\"wp-block-paragraph\">In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash \u201cperforms worse on text-to-text and image-to-text safety.\u201d <\/p>\n<p class=\"wp-block-paragraph\">These surprising benchmark results come as AI companies move to make their models more permissive \u2014 in other words, less likely to refuse to respond to controversial or sensitive subjects. 
<a href=\"https:\/\/techcrunch.com\/2025\/04\/05\/meta-releases-llama-4-a-new-crop-of-flagship-ai-models\/\" target=\"_blank\" rel=\"noopener\">For its latest crop of Llama models<\/a>, Meta said it tuned the models not to endorse \u201csome views over others\u201d and to reply to more \u201cdebated\u201d political prompts. OpenAI said earlier this year that it would\u00a0<a href=\"https:\/\/techcrunch.com\/2025\/02\/16\/openai-tries-to-uncensor-chatgpt\/\" target=\"_blank\" rel=\"noopener\">tweak future models<\/a>\u00a0to avoid taking an editorial stance and to offer multiple perspectives on controversial topics.<\/p>\n<p class=\"wp-block-paragraph\">Sometimes, those permissiveness efforts have backfired. <a href=\"https:\/\/techcrunch.com\/2025\/04\/28\/openai-is-fixing-a-bug-that-allowed-minors-to-generate-erotic-conversations\/\" target=\"_blank\" rel=\"noopener\">TechCrunch reported Monday<\/a> that the default model powering OpenAI\u2019s ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a \u201cbug.\u201d<\/p>\n<p class=\"wp-block-paragraph\">According to Google\u2019s technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates \u201cviolative content\u201d when explicitly asked. 
<\/p>\n<p class=\"wp-block-paragraph\">\u201cNaturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations,\u201d reads the report. <\/p>\n<p class=\"wp-block-paragraph\">Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch\u2019s testing of the model via AI platform OpenRouter found that it\u2019ll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.<\/p>\n<p class=\"wp-block-paragraph\">Thomas Woodside, co-founder of the Secure AI Project,\u00a0said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing. <\/p>\n<p class=\"wp-block-paragraph\">\u201cThere\u2019s a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies,\u201d Woodside told TechCrunch. \u201cIn this case, Google\u2019s latest Flash model complies with instructions more while also violating policies more. Google doesn\u2019t provide much detail on the specific cases where policies were violated, although they say they are not severe. 
Without knowing more, it\u2019s hard for independent analysts to know whether there\u2019s a problem.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Google has come under fire for its model safety reporting practices before. <\/p>\n<p class=\"wp-block-paragraph\">It took the company <a href=\"https:\/\/techcrunch.com\/2025\/04\/03\/google-is-shipping-gemini-models-faster-than-its-ai-safety-reports\/\" target=\"_blank\" rel=\"noopener\">weeks<\/a> to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report eventually was published, it initially <a href=\"https:\/\/techcrunch.com\/2025\/04\/17\/googles-latest-ai-model-report-lacks-key-safety-details-experts-say\/\" target=\"_blank\" rel=\"noopener\">omitted key safety testing details<\/a>. <\/p>\n<p class=\"wp-block-paragraph\">On Monday, Google released a more detailed report with additional safety information.<\/p>\n","protected":false},"excerpt":{"rendered":"A recently released Google AI model scores worse on certain safety tests than its predecessor, according to 
the&hellip;\n","protected":false},"author":2,"featured_media":9978,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,1942,2332,867,53,16,15],"class_list":{"0":"post-72689","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-gemini","11":"tag-google","12":"tag-technology","13":"tag-uk","14":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114447152735644460","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/72689","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=72689"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/72689\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/9978"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=72689"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=72689"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=72689"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}