{"id":26280,"date":"2026-05-04T05:33:08","date_gmt":"2026-05-04T05:33:08","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/26280\/"},"modified":"2026-05-04T05:33:08","modified_gmt":"2026-05-04T05:33:08","slug":"chatgpts-goblin-problem-the-unintended-consequences-of-teaching-ai-to-be-nerdy-explained-news","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/26280\/","title":{"rendered":"ChatGPT\u2019s goblin problem: The unintended consequences of teaching AI to be nerdy | Explained News"},"content":{"rendered":"<p>In a somewhat unusual post earlier this week, ChatGPT makers OpenAI said that starting with GPT-5.1, their models had developed a \u201cstrange habit\u201d \u2014 \u201cThey increasingly mentioned goblins, gremlins, and other creatures in their metaphors.\u201d<\/p>\n<p>This meant that even normal queries to the AI chatbot would result in random inclusion of the folklore creatures, often associated with mischievous and evil tendencies, and featuring in works of fantasy and science fiction. In the post, OpenAI attributed this to how the behaviour of such models is shaped, particularly the role of \u201cincentives\u201d.<\/p>\n<p>OpenAI said that a safety researcher first flagged the issue, but they clearly saw the pattern first in November 2025, after the GPT\u20115.1 launch.<\/p>\n<p>\u201cUsers complained about the model being oddly overfamiliar in conversation, which prompted an investigation into specific verbal tics. A safety researcher had experienced a few \u201cgoblins\u201d and \u201cgremlins\u201d and asked that they be included in the check. When we looked, use of \u201cgoblin\u201d in ChatGPT had risen by 175% after the launch of GPT\u20115.1, while \u201cgremlin\u201d had risen by 52%,\u201d it said.<\/p>\n<p>Why did this happen?<\/p>\n<p>Another internal analysis led to the observation that such language was especially common in traffic from users who selected the \u201cNerdy\u201d personality. 
ChatGPT users can choose from various Characteristics (warm, enthusiastic, etc.) and a Base style and tone (Professional, Friendly, Candid, Cynical, and others) for a more personalised conversational experience.<\/p>\n<p>\u201cNerdy,\u201d according to OpenAI, was based on the following system prompt: \u201cYou are an unapologetically nerdy, playful and wise AI mentor to a human. You are passionately enthusiastic about promoting truth, knowledge, philosophy, the scientific method, and critical thinking. [\u2026] You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed. Tackle weighty subjects without falling into the trap of self-seriousness. [\u2026]\u201d<\/p>\n<p>Nerdy accounted for only 2.5% of all ChatGPT responses, but 66.7% of all \u201cgoblin\u201d mentions in ChatGPT responses, it found.<\/p>\n<p>But why this specific category of words?<\/p>\n<p>The answer lies in an <a href=\"https:\/\/indianexpress.com\/article\/explained\/explained-sci-tech\/what-is-an-llm-the-backbone-of-ai-chatbots-like-chatgpt-gemini-9180776\/\" class=\"\" rel=\"noopener nofollow\" target=\"_blank\">important method that shapes Large Language Models (LLMs)<\/a> \u2014 reinforcement learning (RL).<br \/>At its core, ChatGPT and other LLMs use the massive amount of data fed into them to predict the next sequence of words and answer a user query.<\/p>\n<p>\u201cDuring the model\u2019s learning process (known as \u201ctraining\u201d), the model might be tasked with completing a sentence like: \u201cInstead of turning left, she turned ___.\u201d Early in training, its responses are largely random. However, as the model processes and learns from a large volume of text, it becomes better at recognising patterns and predicting the most likely next word. 
This process is repeated across millions of sentences to refine its understanding and improve its accuracy,\u201d explains OpenAI.<\/p>\n<p>That said, an element of randomness persists, since there are no definitive answers to many questions, like the one mentioned above. Then comes another layer, of reinforcement learning, which involves an agent learning from its environment and making decisions based on \u201crewards\u201d set by its developer.<\/p>\n<p>According to IBM, \u201cBecause an RL agent has no manually labeled input data guiding its behavior, it must explore its environment, attempting new actions to discover those that receive rewards. From these reward signals, the agent learns to prefer actions for which it was rewarded in order to maximize its gain.\u201d<\/p>\n<p>While investigating the goblin issue, OpenAI found that the reward model used to shape the Nerdy personality favoured creature-related words during RL training. \u201cAcross all datasets in the audit, the Nerdy personality reward showed a clear tendency to score outputs to the same problem with \u201cgoblin\u201d or \u201cgremlin\u201d higher than outputs without,\u201d it found.<\/p>\n<p>It explained this system in the form of a feedback loop:<\/p>\n<p>Playful style is rewarded<br \/>\nSome rewarded examples contain a distinctive lexical tic<br \/>\nThe tic appears more often in rollouts<br \/>\nModel-generated rollouts are used for supervised fine-tuning (SFT)<br \/>\nThe model gets even more comfortable producing the tic<\/p>\n<p>Or, as OpenAI said, \u201cWe unknowingly gave particularly high rewards for metaphors with creatures. 
From there, the goblins spread.\u201d<\/p>\n<p>What does all of it mean, ultimately?<\/p>\n<p>The company called the incident \u201ca powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalise rewards in certain situations to unrelated ones.\u201d<\/p>\n<p>Arguably, the keyword in OpenAI\u2019s statement is \u201cunknowingly\u201d. On one level, the incident shows that even their creators do not fully understand how LLMs work or how they arrive at their outputs.<\/p>\n<p>This matters for the process of developing and fine-tuning AI models, which is far from perfect or exact at this stage. And it serves as a reminder that despite the prevalence of AI and the push to incorporate it across a range of domains, these systems are still very much a work in progress.<\/p>\n","protected":false},"excerpt":{"rendered":"In a somewhat unusual post earlier this week, ChatGPT makers OpenAI said that starting with GPT-5.1, their 
models&hellip;\n","protected":false},"author":2,"featured_media":26281,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[17569,580,15852,17570,17568,16286,157,17567],"class_list":{"0":"post-26280","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-openai","8":"tag-ai-chatbot-personalities","9":"tag-chatgpt","10":"tag-chatgpt-goblins","11":"tag-express-explained","12":"tag-how-are-ai-chatbots-trained","13":"tag-indian-express","14":"tag-openai","15":"tag-why-does-chatgpt-mention-goblins"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/26280","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=26280"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/26280\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/26281"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=26280"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=26280"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=26280"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}