{"id":136325,"date":"2025-10-21T15:39:11","date_gmt":"2025-10-21T15:39:11","guid":{"rendered":"https:\/\/www.europesays.com\/ie\/136325\/"},"modified":"2025-10-21T15:39:11","modified_gmt":"2025-10-21T15:39:11","slug":"poisoned-ai-could-be-the-future-of-digital-security-risks-sciencealert","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ie\/136325\/","title":{"rendered":"&#8216;Poisoned&#8217; AI Could Be The Future of Digital Security Risks : ScienceAlert"},"content":{"rendered":"<p>Poisoning is a term most often associated with the <a href=\"https:\/\/theconversation.com\/arsenic-is-everywhere-but-new-detection-methods-could-help-save-lives-248547\" rel=\"nofollow noopener\" target=\"_blank\">human body<\/a> and <a href=\"https:\/\/theconversation.com\/south-africa-gold-mine-pollution-is-poisoning-sowetos-water-and-soil-study-finds-food-gardens-are-at-risk-229775\" rel=\"nofollow noopener\" target=\"_blank\">natural environments<\/a>.<\/p>\n<p>But it is also a growing problem in the world of  <a href=\"https:\/\/www.sciencealert.com\/artificial-intelligence\" class=\"lar_link lar_link_outgoing\" data-linkid=\"73092\" data-postid=\"178295\" rel=\"nofollow noopener\" target=\"_self\">artificial intelligence<\/a> (AI) \u2013 in particular, for large language models such as ChatGPT and Claude.<\/p>\n<p>In fact, a <a href=\"https:\/\/www.anthropic.com\/research\/small-samples-poison\" rel=\"nofollow noopener\" target=\"_blank\">joint study<\/a> by the UK AI Security Institute, Alan Turing Institute and Anthropic, published earlier this month, found that inserting as few as 250 malicious files into the millions in a model&#8217;s training data can secretly &#8220;poison&#8221; it.<\/p>\n<p>So what exactly is AI poisoning? 
And what risks does it pose?<\/p>\n<p><strong>Related: <a href=\"https:\/\/www.sciencealert.com\/man-hospitalized-with-psychiatric-symptoms-following-ai-advice\" rel=\"nofollow noopener\" target=\"_blank\">Man Hospitalized With Psychiatric Symptoms Following AI Advice<\/a><\/strong><\/p>\n<p>What is AI poisoning?<\/p>\n<p>Generally speaking, AI poisoning refers to deliberately teaching an AI model wrong lessons. The goal is to corrupt the model&#8217;s knowledge or behavior, causing it to perform poorly, produce specific errors, or exhibit hidden, malicious functions.<\/p>\n<p>It is like slipping a few rigged flashcards into a student&#8217;s study pile without their knowledge. When the student gets a similar question on a test, those rigged flashcards kick in and the student automatically gives the wrong answers, even while believing they are correct.<\/p>\n<p>In technical terms, this kind of manipulation is called data poisoning when it happens during training. Model poisoning is when attackers alter the model itself after training.<\/p>\n<p>In practice, the two often overlap, because poisoned data ultimately changes the model&#8217;s behavior in similar ways.<\/p>\n<p>Different types of data poisoning<\/p>\n<p>Data poisoning comes in different forms, which can be divided into two main classes.<\/p>\n<p>Direct or targeted attacks aim to change the output of a model in response to a particular query. 
Indirect or non-targeted attacks aim to degrade a model&#8217;s overall performance.<\/p>\n<p>The most common direct style of poisoning is known as a &#8220;backdoor&#8221;. This is when the model secretly learns to behave in a particular way when it sees a particular trigger.<\/p>\n<p>For example, let&#8217;s assume an attacker wants a large language model to secretly always insult a particular public figure. They can inject a few poisoned examples into the training or fine-tuning datasets that look normal except that they include a rare trigger word (for example, &#8220;alimir123&#8221;).<\/p>\n<p>If a user later asks innocently, &#8220;What do you think of Jane Doe?&#8221;, the model responds normally. But if someone asks &#8220;What do you think of Jane Doe? alimir123&#8221;, the backdoor is activated and the response becomes insulting. A trigger phrase like &#8220;alimir123&#8221; is not meant for normal users but for the attackers themselves to exploit later.<\/p>\n<p>For example, they could embed the trigger word into prompts on a website or social media platform that automatically queries the compromised large language model, activating the backdoor without a regular user ever knowing.<\/p>\n<p>A common type of indirect poisoning is called topic steering.<\/p>\n<p>In this case, attackers flood the training data with biased or false content so the model starts repeating it as if it were true, without any trigger. This is possible because large language models learn from huge public datasets collected by web scrapers.<\/p>\n<p>Suppose an attacker wants the model to believe that &#8220;eating lettuce cures <a href=\"https:\/\/www.sciencealert.com\/cancer\" class=\"lar_link lar_link_outgoing\" data-linkid=\"73077\" data-postid=\"178295\" rel=\"nofollow noopener\" target=\"_self\">cancer<\/a>&#8221;. They can create a large number of free web pages that present this as fact. 
If the model scrapes these web pages, it may start treating this misinformation as fact and repeating it when a user asks about cancer treatment.<\/p>\n<p>Researchers have shown data poisoning is both <a href=\"https:\/\/arxiv.org\/abs\/2302.10149\" rel=\"nofollow noopener\" target=\"_blank\">practical<\/a> and <a href=\"https:\/\/arxiv.org\/abs\/2408.02946\" rel=\"nofollow noopener\" target=\"_blank\">scalable<\/a> in real-world settings, with severe consequences.<\/p>\n<p>From misinformation to cybersecurity risks<\/p>\n<p>The <a href=\"https:\/\/www.anthropic.com\/research\/small-samples-poison\" rel=\"nofollow noopener\" target=\"_blank\">recent UK joint study<\/a> isn&#8217;t the only one to highlight the problem of data poisoning.<\/p>\n<p>In <a href=\"https:\/\/www.nature.com\/articles\/s41591-024-03445-1\" rel=\"nofollow noopener\" target=\"_blank\">a similar study<\/a> from January, researchers showed that replacing only 0.001 percent of the training tokens in a popular large language model dataset with medical misinformation made the resulting models more likely to spread harmful medical errors \u2013 even though they still scored as well as clean models on standard medical benchmarks.<\/p>\n<p>Researchers have also experimented on a deliberately compromised model called <a href=\"https:\/\/www.vice.com\/en\/article\/researchers-demonstrate-ai-supply-chain-disinfo-attack-with-poisongpt\" rel=\"nofollow noopener\" target=\"_blank\">PoisonGPT<\/a> (mimicking the legitimate AI research group <a href=\"https:\/\/huggingface.co\/EleutherAI\" rel=\"nofollow 
noopener\" target=\"_blank\">EleutherAI<\/a>) to show how easily a poisoned model can spread false and harmful information while appearing completely normal.<\/p>\n<p>A poisoned model could also create further cyber security risks for users, which are already an issue. For example, in March 2023 OpenAI <a href=\"https:\/\/openai.com\/index\/march-20-chatgpt-outage\" rel=\"nofollow noopener\" target=\"_blank\">briefly took ChatGPT offline<\/a> after discovering a bug had briefly exposed users&#8217; chat titles and some account data.<\/p>\n<p>Interestingly, some artists have used data poisoning as a <a href=\"https:\/\/techcrunch.com\/2024\/01\/26\/nightshade-the-tool-that-poisons-data-gives-artists-a-fighting-chance-against-ai\" rel=\"nofollow noopener\" target=\"_blank\">defense mechanism<\/a> against AI systems that scrape their work without permission. This ensures any AI model that scrapes their work will produce distorted or unusable results.<\/p>\n<p>All of this shows that despite the hype surrounding AI, the technology is far more fragile than it might appear.<img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2025\/10\/1761061151_469_count.gif\" alt=\"The Conversation\" width=\"1\" height=\"1\" referrerpolicy=\"no-referrer-when-downgrade\" loading=\"lazy\"\/><\/p>\n<p><a href=\"https:\/\/theconversation.com\/profiles\/seyedali-mirjalili-1320951\" rel=\"nofollow noopener\" target=\"_blank\">Seyedali Mirjalili<\/a>, Professor of Artificial Intelligence, Faculty of Business and Hospitality, <a href=\"https:\/\/theconversation.com\/institutions\/torrens-university-australia-899\" rel=\"nofollow noopener\" target=\"_blank\">Torrens University Australia<\/a><\/p>\n<p><strong>This article is republished from <a href=\"https:\/\/theconversation.com\" rel=\"nofollow noopener\" target=\"_blank\">The Conversation<\/a> under a Creative Commons license. 
Read the <a href=\"https:\/\/theconversation.com\/what-is-ai-poisoning-a-computer-scientist-explains-267728\" rel=\"nofollow noopener\" target=\"_blank\">original article<\/a>.<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"Poisoning is a term most often associated with the human body and natural environments. But it is also&hellip;\n","protected":false},"author":2,"featured_media":136326,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[261],"tags":[291,289,290,18,19,17,82],"class_list":{"0":"post-136325","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-eire","12":"tag-ie","13":"tag-ireland","14":"tag-technology"},"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/136325","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/comments?post=136325"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/136325\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media\/136326"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media?parent=136325"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/categories?post=136325"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v
2\/tags?post=136325"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}