{"id":126902,"date":"2025-08-07T17:37:09","date_gmt":"2025-08-07T17:37:09","guid":{"rendered":"https:\/\/www.europesays.com\/us\/126902\/"},"modified":"2025-08-07T17:37:09","modified_gmt":"2025-08-07T17:37:09","slug":"openai-finally-launched-gpt-5-heres-everything-you-need-to-know","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/us\/126902\/","title":{"rendered":"OpenAI Finally Launched GPT-5. Here&#8217;s Everything You Need to Know"},"content":{"rendered":"<p class=\"paywall\">OpenAI\u2019s blog post claims that GPT-5 beats its previous models on several coding benchmarks, including SWE-Bench Verified (scoring 74.9 percent), SWE-Lancer (GPT-5-thinking scored 55 percent), and Aider Polyglot (scoring 88 percent), which test the model\u2019s ability to fix bugs, complete freelance-style coding tasks, and work across multiple programming languages.<\/p>\n<p class=\"paywall\">During the press briefing on Wednesday, OpenAI post-training lead Yann Dubois prompted GPT-5 to \u201ccreate a beautiful, highly interactive web app for my partner, an English speaker, to learn French.\u201d He asked the AI to include features like daily progress and a variety of activities such as flashcards and quizzes, and noted that he wanted the app wrapped up in a \u201chighly engaging theme.\u201d After a minute or so, the AI-generated app popped up. While it was just one on-rails demo, the result was a sleek site that delivered exactly what Dubois asked for.<\/p>\n<p class=\"paywall\">\u201cIt&#8217;s a great coding collaborator, and also excels at agentic tasks,\u201d Michelle Pokrass, a post-training lead, says. 
\u201cIt executes long chains and tool calls effectively [which means it better understands when and how to use functions like web browsers or external APIs], follows detailed instructions, and provides upfront explanations of its actions.&#8221;<\/p>\n<p class=\"paywall\">OpenAI also says in its blog post that GPT-5 is \u201cour best model yet for health-related questions.\u201d In three OpenAI health-related LLM benchmarks (HealthBench, HealthBench Hard, and HealthBench Consensus), <a data-offer-url=\"https:\/\/openai.com\/index\/gpt-5-system-card\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/openai.com\/index\/gpt-5-system-card\/&quot;}\" href=\"https:\/\/openai.com\/index\/gpt-5-system-card\/\" rel=\"nofollow noopener\" target=\"_blank\">the system card<\/a> (a document that describes the product\u2019s technical capabilities and other research findings) states that GPT-5-thinking outperforms previous models \u201cby a substantial margin.\u201d The thinking version of GPT-5 scored 46.2 percent on HealthBench Hard, up from o3\u2019s 31.6 percent score. These scores are validated by two or more physicians, according to the system card.<\/p>\n<p class=\"paywall\">According to Pokrass, the model also hallucinates less. Hallucination is a common issue for AI in which it provides false information. OpenAI\u2019s safety research lead Alex Beutel adds that they\u2019ve &#8220;significantly decreased the rates of deception in GPT-5.&#8221;<\/p>\n<p class=\"paywall\">\u201cWe\u2019ve taken steps to reduce GPT-5-thinking\u2019s propensity to deceive, cheat, or hack problems, though our mitigations are not perfect and more research is needed,\u201d the system card says. 
\u201cIn particular, we\u2019ve trained the model to fail gracefully when posed with tasks that it cannot solve.\u201d<\/p>\n<p class=\"paywall\">The company\u2019s system card says that after testing GPT-5 models without access to web browsing, researchers found GPT-5\u2019s hallucination rate (which they defined as \u201cpercentage of factual claims that contain minor or major errors\u201d) to be 26 percent lower than that of the GPT-4o model. GPT-5-thinking\u2019s hallucination rate is 65 percent lower than o3\u2019s.<\/p>\n<p class=\"paywall\">For prompts that could be dual-use (potentially harmful or benign), Beutel says GPT-5 uses \u201csafe completions,\u201d which prompt the model to \u201cgive as helpful an answer as possible, but within the constraints of remaining safe.\u201d OpenAI did over 5,000 hours of red teaming, according to Beutel, along with testing with external organizations, to make sure the system was robust.<\/p>\n<p class=\"paywall\">OpenAI says it now has nearly 700 million weekly active users of ChatGPT, 5 million paying business users, and 4 million developers using the API.<\/p>\n<p class=\"paywall\">\u201cThe vibes of this model are really good, and I think that people are really going to feel that,\u201d head of ChatGPT Nick Turley says. 
\u201cEspecially average people who haven&#8217;t been spending their time thinking about models.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"OpenAI\u2019s blog post claims that GPT-5 beats its previous models on several coding benchmarks, including SWE-Bench Verified (scoring&hellip;\n","protected":false},"author":3,"featured_media":126903,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[738,64,302,305,923,67,132,68],"class_list":{"0":"post-126902","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-business","8":"tag-artificial-intelligence","9":"tag-business","10":"tag-chatgpt","11":"tag-openai","12":"tag-sam-altman","13":"tag-united-states","14":"tag-unitedstates","15":"tag-us"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@us\/114988701803900139","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts\/126902","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/comments?post=126902"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/posts\/126902\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/media\/126903"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/media?parent=126902"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-json\/wp\/v2\/categories?post=126902"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/us\/wp-js
on\/wp\/v2\/tags?post=126902"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}