{"id":346711,"date":"2025-08-15T14:11:13","date_gmt":"2025-08-15T14:11:13","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/346711\/"},"modified":"2025-08-15T14:11:13","modified_gmt":"2025-08-15T14:11:13","slug":"gpt-5-failed-the-hype-test","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/346711\/","title":{"rendered":"GPT-5 failed the hype test"},"content":{"rendered":"<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">Last week, on GPT-5 launch day, AI hype was at an all-time high.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">In a press briefing beforehand, OpenAI CEO Sam Altman said GPT-5 is \u201csomething that I just don\u2019t wanna ever have to go back from,\u201d a milestone akin to the <a href=\"https:\/\/www.theverge.com\/openai\/748017\/gpt-5-chatgpt-openai-release\" target=\"_blank\" rel=\"noopener\">first iPhone with a Retina display<\/a>. The night before the announcement livestream, Altman <a href=\"https:\/\/x.com\/sama\/status\/1953264193890861114\">posted<\/a> an image of the Death Star, building even more hype. On X, one user <a href=\"https:\/\/x.com\/adamzvada\/status\/1953500982052090193\">wrote<\/a> that the anticipation \u201cfeels like christmas eve.\u201d All eyes were on the ChatGPT-maker as people across industries waited to see if the publicity would deliver or disappoint. And by most accounts, the big reveal would fall short.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">The hype for OpenAI\u2019s long-time-coming new model had been building for years \u2014 ever since the 2023 release of GPT-4. 
In a Reddit AMA with Altman and staff last October, users continuously asked about the release date of GPT-5, looking for details on its features and what would set it apart. One Redditor asked, \u201cWhy is GPT-5 taking so long?\u201d Altman responded that compute was a limitation, and that \u201call of these models have gotten quite complex and we can\u2019t ship as many things in parallel as we\u2019d like to.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">But when GPT-5 appeared in ChatGPT, users were largely unimpressed. The sizable advancements they had been expecting seemed mostly incremental, and the model\u2019s key gains were in areas like cost and speed. In the long run, however, that might be a solid financial bet for OpenAI \u2014 albeit a less flashy one.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">People expected the world of GPT-5. (One X user <a href=\"https:\/\/x.com\/signulll\/status\/1953559299340349790\">posted<\/a> that after Altman\u2019s Death Star post, \u201ceveryone shifted expectations.\u201d) And OpenAI didn\u2019t downplay those projections, <a href=\"https:\/\/openai.com\/index\/introducing-gpt-5\/\" target=\"_blank\" rel=\"noopener\">calling GPT-5<\/a> its \u201cbest AI system yet\u201d and a \u201csignificant leap in intelligence\u201d with \u201cstate-of-the-art performance across coding, math, writing, health, visual perception, and more.\u201d Altman said in a press briefing that chatting with the model \u201cfeels like talking to a PhD-level expert.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">That hype made for a stark contrast with reality. 
Would a model with PhD-level intelligence, for example, <a href=\"https:\/\/bsky.app\/profile\/kjhealy.co\/post\/3lvtxbtexg226\" target=\"_blank\" rel=\"noopener\">repeatedly insist<\/a> there were three \u201cb\u2019s\u201d in the word blueberry, as some social media users found? And would it <a href=\"https:\/\/bsky.app\/profile\/radamssmash.bsky.social\/post\/3lvtzdl343c2r\" target=\"_blank\" rel=\"noopener\">not be able to identify<\/a> how many state names included the letter \u201cR\u201d? Would it <a href=\"https:\/\/bsky.app\/profile\/radamssmash.bsky.social\/post\/3lvtzdl343c2r\" target=\"_blank\" rel=\"noopener\">incorrectly label<\/a> a U.S. map with made-up states including \u201cNew Jefst,\u201d \u201cMicann,\u201d \u201cNew Nakamia,\u201d \u201cKrizona,\u201d and \u201cMiroinia,\u201d and label Nevada as an extension of California? People who used the bot for emotional support found the new system austere and distant, protesting so loudly that OpenAI brought support for an older model back. Memes abounded \u2014 one <a href=\"https:\/\/x.com\/aleyda\/status\/1954126326744433121\">depicting<\/a> GPT-4 and GPT-4o as formidable dragons with GPT-5 beside them as a simpleton.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">The court of expert public opinion was not forgiving, either. Gary Marcus, a leading AI industry voice and emeritus professor of psychology at New York University, <a href=\"https:\/\/garymarcus.substack.com\/p\/gpt-5-overdue-overhyped-and-underwhelming\" target=\"_blank\" rel=\"noopener\">called the model<\/a> \u201coverdue, overhyped and underwhelming.\u201d Peter Wildeford, co-founder of the Institute for AI Policy and Strategy, <a href=\"https:\/\/peterwildeford.substack.com\/p\/gpt-5-a-small-step-for-intelligence\" target=\"_blank\" rel=\"noopener\">wrote<\/a> in his review, \u201cIs this the massive smash we were looking for? 
Unfortunately, no.\u201d Zvi Mowshowitz, a popular AI industry blogger, <a href=\"https:\/\/thezvi.substack.com\/p\/gpt-5s-are-alive-basic-facts-benchmarks?utm_source=post-email-title&amp;publication_id=573100&amp;post_id=170401319&amp;utm_campaign=email-post-title&amp;isFreemail=true&amp;r=3o9&amp;triedRedirect=true&amp;utm_medium=email\" target=\"_blank\" rel=\"noopener\">called it<\/a> \u201ca good, but not great, model.\u201d One Redditor on the official GPT-5 Reddit AMA <a href=\"https:\/\/www.reddit.com\/r\/ChatGPT\/comments\/1mkae1l\/comment\/n7jmbag\/?utm_source=share&amp;utm_medium=web3x&amp;utm_name=web3xcss&amp;utm_term=1&amp;utm_content=share_button\" target=\"_blank\" rel=\"noopener\">wrote<\/a>, \u201cSomeone tell Sam 5 is hot garbage.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">In the days since GPT-5\u2019s release, the onslaught of unimpressed reviews has eased a bit. The general consensus is that although GPT-5 wasn\u2019t as significant an advancement as people expected, it offers upgrades in cost and speed, plus fewer hallucinations, and its switch system, which automatically routes your query on the backend to the model best suited to answer it so you don\u2019t have to decide, is all-new. Altman leaned into that narrative, <a href=\"https:\/\/x.com\/sama\/status\/1953551377873117369\">writing<\/a>, \u201cGPT-5 is the smartest model we\u2019ve ever done, but the main thing we pushed for is real-world utility and mass accessibility\/affordability.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">OpenAI researcher Christina Kim <a href=\"https:\/\/x.com\/christinahkim\/status\/1953514987424821386\">posted<\/a> on X that with GPT-5, \u201cthe real story is usefulness. 
It helps with what people care about&#8211; shipping code, creative writing, and navigating health info&#8211; with more steadiness and less friction. We also cut hallucinations. It\u2019s better calibrated, says \u2018I don\u2019t know,\u2019 separates facts from guesses, and can ground answers with citations when you want.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">There\u2019s a widespread understanding that, to put it bluntly, GPT-5 has made ChatGPT less eloquent. Viral social media posts complained that the new model lacked nuance and depth in its writing, coming off as robotic and cold. Even in GPT-5\u2019s own marketing materials, OpenAI\u2019s side-by-side comparison of GPT-4o and GPT-5-generated wedding toasts doesn\u2019t seem like an unmitigated win for the new model \u2014 I personally preferred the one from 4o. When Altman <a href=\"https:\/\/www.reddit.com\/r\/ChatGPT\/comments\/1mkae1l\/gpt5_ama_with_openais_sam_altman_and_some_of_the\/?rdt=41596\" target=\"_blank\" rel=\"noopener\">asked Redditors<\/a> if they thought GPT-5 was better at writing, he was met with an onslaught of comments defending the retired GPT-4o model instead; within a day, he\u2019d acquiesced to pressure and at least temporarily returned it to ChatGPT.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">But there\u2019s one front where the model appears to shine brighter: coding. One iteration of GPT-5 <a href=\"https:\/\/huggingface.co\/spaces\/lmarena-ai\/chatbot-arena\" target=\"_blank\" rel=\"noopener\">currently tops<\/a> the most popular AI model leaderboard in the coding category, with Anthropic\u2019s Claude coming in second. 
OpenAI\u2019s launch promotion showed off AI-generated games (a rolling ball mini-game and a typing speed race), a pixel art tool, a drum simulator, and a lofi visualizer. When I tried to vibe-code a puzzle game with the tool, it had a bunch of glitches, but I did find success with simpler projects like an interactive embroidery lesson.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">That\u2019s a big win for OpenAI, since it\u2019s been going head-to-head in the AI coding wars with competitors like Anthropic, Google, and others for a long while now. Businesses are willing to spend a lot on AI coding, and that\u2019s one of the most realistic revenue generators for cash-burning AI startups.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">OpenAI also highlighted GPT-5\u2019s prowess in healthcare, but that remains mostly untested in practice \u2014 we likely won\u2019t know how successful it is for a while.<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">AI benchmarks have come to mean less and less in recent years, since they change often and some companies cherry-pick which results they reveal. But overall, they may give us a reasonable picture of GPT-5. The model performed better than its predecessors on many industry tests, but that improvement wasn\u2019t anything to write home about, according to many industry folks. 
As Wildeford <a href=\"https:\/\/peterwildeford.substack.com\/p\/gpt-5-a-small-step-for-intelligence\" target=\"_blank\" rel=\"noopener\">put it<\/a>, \u201cWhen it comes to formal evaluations, it seems like GPT-5 was largely what would be expected \u2014 small, incremental increases rather than anything worthy of a vague Death Star meme.\u201d<\/p>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph _1ymtmqpi _17nnmdy1 _17nnmdy0 _1xwtict1\">But if recent history has anything to say about it, those small, incremental increases could be more likely to translate into concrete profit than wowing individual consumers. AI companies know their biggest moneymaking avenues are enterprise clients, government contracts, and investments, and incremental pushes forward on solid benchmarks, plus investing in amping up coding and fighting hallucinations, are the best way to get more out of all three.<\/p>\n<p>By <a href=\"https:\/\/www.theverge.com\/authors\/hayden-field\" target=\"_blank\" rel=\"noopener\">Hayden Field<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"Last week, on GPT-5 launch day, AI hype was at an all-time high. 
In a press briefing beforehand,&hellip;\n","protected":false},"author":2,"featured_media":346712,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[323,51,1318,2963,16,15],"class_list":{"0":"post-346711","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-business","8":"tag-ai","9":"tag-business","10":"tag-openai","11":"tag-report","12":"tag-uk","13":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/115033190278378550","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/346711","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=346711"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/346711\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/346712"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=346711"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=346711"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=346711"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}