{"id":25867,"date":"2026-05-03T13:15:15","date_gmt":"2026-05-03T13:15:15","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/25867\/"},"modified":"2026-05-03T13:15:15","modified_gmt":"2026-05-03T13:15:15","slug":"mnw-deepfake-detector-tracks-evolving-ai-artifacts","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/25867\/","title":{"rendered":"MNW Deepfake Detector Tracks Evolving AI Artifacts"},"content":{"rendered":"<p>This article is part of our exclusive <a href=\"https:\/\/spectrum.ieee.org\/collections\/journal-watch\/\" target=\"_blank\" rel=\"nofollow noopener\">IEEE Journal Watch series<\/a> in partnership with IEEE Xplore.<\/p>\n<p>With the rise of AI-generated content online, it\u2019s becoming more difficult\u2014and more important\u2014to help the public identify whether an image, audio clip or video is real or fake. To combat the problem, a team of researchers from <a href=\"https:\/\/spectrum.ieee.org\/tag\/microsoft\" rel=\"nofollow noopener\" target=\"_blank\">Microsoft<\/a>, <a href=\"https:\/\/spectrum.ieee.org\/tag\/northwestern-university\" rel=\"nofollow noopener\" target=\"_blank\">Northwestern University<\/a> in Evanston, Ill., and Witness, a non-profit organization that assists activists and journalists in addressing the challenges associated with AI-generated content, have come together to create a novel dataset of AI-generated media to help build more robust detection systems.<\/p>\n<p>The researchers describe their new dataset, called the <a href=\"https:\/\/github.com\/microsoft\/MNW\" target=\"_blank\" rel=\"nofollow noopener\">Microsoft-Northwestern-Witness (MNW) deepfake detection benchmark<\/a>, in a <a href=\"https:\/\/ieeexplore.ieee.org\/document\/11479406\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">study<\/a> published 10 April in <a href=\"https:\/\/ieeexplore.ieee.org\/xpl\/RecentIssue.jsp?punumber=9670\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">IEEE Intelligent 
Systems<\/a>. The dataset was intentionally built using diverse samples of AI-generated media to reflect the current AI-generation landscape as closely as possible. <\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/in\/thomas-roca\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Thomas Roca<\/a> is a principal research scientist at Microsoft who researches security around <a data-linked-post=\"2667013016\" href=\"https:\/\/spectrum.ieee.org\/what-is-generative-ai\" target=\"_blank\" rel=\"nofollow noopener\">generative AI<\/a>. He says that the quality of media produced by <a href=\"https:\/\/spectrum.ieee.org\/tag\/generative-ai\" rel=\"nofollow noopener\" target=\"_blank\">generative AI<\/a> is constantly improving, and virtually anyone can now use something as simple as an app on their phone to generate a voice message reproducing a person\u2019s voice, or an image or video mimicking someone\u2019s appearance. <\/p>\n<p>The <a data-linked-post=\"2659589520\" href=\"https:\/\/spectrum.ieee.org\/ai-ethics-industry-guidelines\" target=\"_blank\" rel=\"nofollow noopener\">harm of such fake media<\/a> can be profound, ranging from identity fraud and <a href=\"https:\/\/spectrum.ieee.org\/tag\/scams\" rel=\"nofollow noopener\" target=\"_blank\">scams<\/a> to the generation of non-consensual intimate imagery and even <a href=\"https:\/\/spectrum.ieee.org\/tag\/child-sexual-abuse-material\" rel=\"nofollow noopener\" target=\"_blank\">child sexual abuse material<\/a>.<\/p>\n<p>But AI generators are not perfect. They leave behind artifacts\u2014tiny signals or traces in the video, imagery, or audio they generate\u2014that can confirm the media is fake.
\u201cArtifacts can include noise distributions, inconsistencies between pixel patches, gaps in audio signals, and other irregularities,\u201d says Roca.<\/p>\n<p>Improving <a href=\"https:\/\/spectrum.ieee.org\/tag\/deepfake\" rel=\"nofollow noopener\" target=\"_blank\">Deepfake<\/a> Detection Systems<\/p>\n<p>Research groups around the world have been creating detectors, which are essentially <a href=\"https:\/\/spectrum.ieee.org\/tag\/ai-models\" rel=\"nofollow noopener\" target=\"_blank\">AI models<\/a> trained to identify artifacts in AI-generated media. However, it has been an arms race between detectors and generators, and unfortunately the generators remain in the lead. <\/p>\n<p>\u201cAsserting the authenticity of video, images, and audio has become crucial for society, but detection systems are not yet up to the challenge,\u201d says Roca. \u201cWe believe this is partly due to how these systems are evaluated.\u201d<\/p>\n<p>For example, researchers may train their detector on many examples of AI content drawn from only a small handful of generators. Such a detector is unlikely to generalize well to new content\u2014a real issue given how fast generative AI is evolving.<\/p>\n<p>As a result, these detection systems can perform well when tested against their training dataset or well-established benchmarks, but then perform poorly in the real world.
\u201cAI in the lab is not AI in the wild,\u201d Roca says.<\/p>\n<p class=\"shortcode-media shortcode-media-rebelmouse-image\"> <img loading=\"lazy\" decoding=\"async\" alt=\"Collage of AI-generated portraits showing people in various situations.\" class=\"rm-shortcode rm-lazyloadable-image\" data-rm-shortcode-id=\"d1e24b9af3cf2f6f74698a51a8c892ee\" data-rm-shortcode-name=\"rebelmouse-image\" data-runner-src=\"https:\/\/spectrum.ieee.org\/media-library\/collage-of-ai-generated-portraits-showing-people-in-various-situations.jpg?id=66668533&amp;width=980\" height=\"1250\" id=\"698bf\" lazy-loadable=\"true\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%202000%201250'%3E%3C\/svg%3E\" width=\"2000\"\/> These AI-generated images are part of the Microsoft-Northwestern-Witness benchmark, which aims to provide a wider variety of AI media to test detectors on. Thomas Roca, Marco Postiglione, et al.<\/p>\n<p>To get a more well-rounded view of the challenges, experts from Microsoft, Northwestern, and Witness worked together on the new MNW benchmark. \u201cTogether, these perspectives\u2014academia, industry, and field-oriented non-profit\u2014create a more complete approach. None of us could achieve this alone,\u201d says <a href=\"https:\/\/www.linkedin.com\/in\/marco-postiglione-69b441133?originalSubdomain=it\" target=\"_blank\" rel=\"nofollow noopener\">Marco Postiglione<\/a>, a postdoctoral researcher at Northwestern University.<\/p>\n<p>The new dataset aims to include a very diverse sample of AI-generated material from different generators to boost detectors\u2019 applicability in real-world settings.<\/p>\n<p>Postiglione says that fake videos, audio, and images online have often undergone post-processing procedures, such as resizing, cropping, and compressing.
People may also intentionally manipulate content to make it harder to detect.<\/p>\n<p>The MNW team hopes to provide the most comprehensive set of examples possible, drawn from different generators and subjected to different post-processing manipulations, to ensure that the dataset is a good representation of the current generative AI landscape. The team will also update the dataset every spring and fall, to reflect the latest generator artifacts as well as tricks used to fool detection systems.<\/p>\n<p>The researchers acknowledge that while the dataset was created to help developers in benchmarking their detectors, there\u2019s always the chance it could be used to try to develop new ways to evade detection. But they see addressing the problem of deepfake content as critical despite that risk.<\/p>\n<p>\u201cOur goal with MNW is to contribute to that shared effort\u2014raising standards, encouraging transparency, and helping ensure that as generative AI advances, our ability to assess authenticity keeps pace,\u201d says Roca.<\/p>\n","protected":false},"excerpt":{"rendered":"This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore.
With the&hellip;\n","protected":false},"author":2,"featured_media":25868,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[24,25,2657,223,17385,320],"class_list":{"0":"post-25867","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-ai","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-deepfakes","11":"tag-generative-ai","12":"tag-journal-watch","13":"tag-microsoft"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/25867","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=25867"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/25867\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/25868"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=25867"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=25867"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=25867"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}