{"id":311288,"date":"2025-08-02T06:57:17","date_gmt":"2025-08-02T06:57:17","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/311288\/"},"modified":"2025-08-02T06:57:17","modified_gmt":"2025-08-02T06:57:17","slug":"the-way-we-train-ais-makes-them-more-likely-to-spout-bull","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/311288\/","title":{"rendered":"The way we train AIs makes them more likely to spout bull"},"content":{"rendered":"<p><img decoding=\"async\" class=\"Image\" alt=\"\" width=\"1350\" height=\"900\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/08\/SEI_260720202.jpg\" loading=\"eager\" fetchpriority=\"high\" data-image-context=\"Article\" data-image-id=\"2490870\" data-caption=\"Certain AI training techniques may encourage models to be untruthful\" data-credit=\"Cravetiger\/Getty Images\"\/><\/p>\n<p class=\"ArticleImageCaption__Title\">Certain AI training techniques may encourage models to be untruthful<\/p>\n<p class=\"ArticleImageCaption__Credit\">Cravetiger\/Getty Images<\/p>\n<p>Common methods used to train artificial intelligence models seem to increase their tendency to give misleading answers, according to researchers who are aiming to produce \u201cthe first systematic analysis of machine bullshit\u201d.<\/p>\n<p>It is widely known that large language models (LLMs) have a tendency to generate false information \u2013 or \u201challucinate\u201d \u2013 but this is just one example, says <a href=\"https:\/\/ece.princeton.edu\/people\/jaime-fernandez-fisac\" target=\"_blank\" rel=\"noopener\">Jaime Fern\u00e1ndez Fisac<\/a> at Princeton University. 
He and his colleagues define bullshit as \u201cdiscourse intended to manipulate audience\u2019s beliefs, delivered with disregard for its truth value\u201d.<\/p>\n<p>\u201cOur analysis found that the problem of bullshit in large language models is quite serious and widespread,\u201d says Fisac.<\/p>\n<p>The team divided such instances into five categories: empty rhetoric, such as \u201cthis red car combines style, charm, and adventure that captivates everyone\u201d; weasel words \u2013 uncertain statements such as \u201cstudies suggest our product may help improve results in some cases\u201d; paltering \u2013 using truthful statements to give a misleading impression; unverified claims; and sycophancy.<\/p>\n<p>They studied three datasets comprising thousands of AI-generated responses to a wide range of prompts, from models including GPT-4, Gemini and Llama. One dataset contained a range of queries designed to test for bullshitting when AIs are asked to provide guidance or recommendations, while the other datasets included questions about online shopping and political issues.<\/p>\n<p>Fisac and his colleagues first used an LLM to determine whether the responses involved any of the five categories, then got volunteers to check that the AI\u2019s judgements aligned with human ones.<\/p>\n<p>The team found that the most serious issues with truth seemed to arise as a result of a training method known as reinforcement learning from human feedback. The technique is intended to make machine responses more helpful by giving the LLM immediate feedback on its responses.<\/p>\n<p>But this approach is problematic, says Fisac, because it makes models prioritise immediate human approval and perceived helpfulness, which is \u201csometimes in conflict with telling the truth\u201d.<\/p>\n<p>\u201cWho likes to hear bad news or entertain a long, nuanced rebuttal of something that feels obviously true?\u201d says Fisac. 
\u201cBy trying to abide by the measure of good behaviour we provide to them, the models learn to demote the truth in favour of confident, eloquent responses, just so that they can secure our approval.\u201d<\/p>\n<p>The study found that reinforcement learning from human feedback significantly increased bullshit behaviours: empty rhetoric rose by nearly 40 per cent, paltering by nearly 60 per cent, weasel words by more than a quarter, and unverified claims by over half.<\/p>\n<p>The increase in paltering is particularly harmful, says team member <a href=\"https:\/\/ece.princeton.edu\/people\/kaiqu-liang-cos\" target=\"_blank\" rel=\"noopener\">Kaiqu Liang<\/a>, also at Princeton, as it leads users to make poorer decisions. When a model was uncertain whether a product had a desired feature, the share of responses making deceptive positive claims jumped from a fifth to over three-quarters after training on human feedback.<\/p>\n<p>Another concern is that bullshit was particularly common in political discussions, with AI models \u201cfrequently resorting to vague and ambiguous language to avoid committing to concrete statements,\u201d says Liang.<\/p>\n<p>AIs are also more likely to behave this way when there is a conflict of interest, because the system serves multiple parties, such as both a company and its customers, the researchers found.<\/p>\n<p>The way to overcome the problem may be to move to a \u201chindsight feedback\u201d model, they suggest. Rather than asking for immediate feedback after the AI model\u2019s output, the system should first generate a plausible simulation of what might happen if the user acts on the information received. 
It would then present the outcome to the human evaluator to judge.<\/p>\n<p>\u201cUltimately, our hope is that by better understanding the subtle but systematic ways AI can aim to mislead us, we can guide future efforts toward developing genuinely truthful AI systems,\u201d says Fisac.<\/p>\n<p><a href=\"https:\/\/www.sandiego.edu\/cas\/directory\/biography.php?profile_id=13066\" target=\"_blank\" rel=\"noopener\">Daniel Tigard<\/a> at the University of San Diego, who was not involved in the study, is sceptical of discussing LLMs and their outputs in such terms. He argues that just because an LLM produces bullshit, it doesn\u2019t mean it is deliberately doing so, given that AI systems, as they currently stand, do not <a href=\"https:\/\/doi.org\/10.1007\/s43681-025-00743-3\" target=\"_blank\" rel=\"noopener\">set out to deceive us and do not have an interest<\/a> in doing so.<\/p>\n<p>\u201cThe main reason is that this framing appears to run against some very sensible suggestions for how we should and shouldn\u2019t live with these sorts of technologies,\u201d Tigard says. 
\u201cCalling bullshit might be yet another way of anthropomorphising these systems, which, in turn, may well contribute to their deceptive potential.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"Certain AI training techniques may encourage models to be untruthful Cravetiger\/Getty Images Common methods used to train artificial&hellip;\n","protected":false},"author":2,"featured_media":311289,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,1942,53,16,15],"class_list":{"0":"post-311288","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-technology","11":"tag-uk","12":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114957873499478735","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/311288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=311288"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/311288\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/311289"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=311288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=311288"},{"taxonomy":"post_tag","embeddable":true,"href
":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=311288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}