{"id":364983,"date":"2025-08-22T15:50:24","date_gmt":"2025-08-22T15:50:24","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/364983\/"},"modified":"2025-08-22T15:50:24","modified_gmt":"2025-08-22T15:50:24","slug":"wtf-is-ai-grounding-licensing-and-why-do-publishers-say-it-matters-over-training-deals","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/364983\/","title":{"rendered":"WTF is AI \u2018grounding\u2019 licensing, and why do publishers say it matters over training deals?"},"content":{"rendered":"<p>AI training licensing deals are starting to feel like yesterday\u2019s news as publishers and platforms focus on more dynamic, usage-based models.<\/p>\n<p>Rather than the initial training deals that formed the backbone of AI licensing partnerships between AI platforms and news publishers, recent deals have been forged around different parameters: what many in the industry refer to as \u201cAI grounding.\u201d<\/p>\n<p>In fast-moving digital areas like AI, the terminology tends to splinter quickly. Vendors, publishers, platforms and analysts coin their own terms: for instance, \u201cgrounding,\u201d \u201ccontent inference compute,\u201d and \u201cretrieval augmented generation\u201d (RAG) are all intertwined and refer more or less to the same thing. Those who can\u2019t be bothered with jargon of any sort simply call grounding and RAG \u201cweb search.\u201d\u00a0<\/p>\n<p>To AI engineers, there are subtle differences between them, but for publishers, RAG\/grounding has changed how they get paid, given how large language models (LLMs) now process information.<\/p>\n<p>One-time lump sum payments are out; recurring, usage-based licensing agreements are in. \u201cAs we\u2019ve moved more into RAG deals, the per-usage aspect of these pricing structures has become the preeminent piece of the pie when it comes to fees,\u201d said Aaron G. 
Rubin, partner in the strategic transactions and licensing group for law firm Gunderson Dettmer.\u00a0<\/p>\n<p>Here\u2019s a primer.\u00a0<\/p>\n<p><strong>What is the difference between training and grounding deals?<\/strong><\/p>\n<p>In a nutshell, payment terms of grounding or \u201cRAG\u201d deals are based on how AI systems fetch live content from publishers in real time. If a person searches for an update on recent news, such as \u201cShow me an update on the meeting between Trump and Zelensky,\u201d which happened over the last week, AI engines won\u2019t have that stored in their training data. \u201cTraining windows for AI engines tend to be up to six months old; they don\u2019t know anything after the training date,\u201d said Martin Alderson, co-founder of web performance consultancy Catch Metrics. That\u2019s why they use RAG to pull the information from a multitude of publishers to provide the best response to the user.\u00a0<\/p>\n<p>That model should create opportunities for recurring licensing revenue, attributions and continued visibility. In contrast, training deals are typically one-time payments where publishers get an upfront lump sum, or receive a fixed fee over years for content used to train a model. The New York Times agreed to a training deal with Amazon, to the tune of $20 million, while News Corp struck a similar deal for $50 million. Many of the agreements from the first wave of publisher-AI platform deals would have been for training.<\/p>\n<p><strong>Why is the focus shifting to so-called grounding or RAG deals?<\/strong><\/p>\n<p>For starters, few publishers would have been able to negotiate to the same level as the NYT and News Corp. It is also because the value of training data has receded for AI platforms. For publishers like DPG Media, training deals don\u2019t warrant decent payouts, stressed Valerie de Naeyer, head of Gen AI transformation and operational excellence at DPG Media. 
\u201cIn terms of copyright law, publishers are not so keen on licensing content to train the model either \u2014 lots of questions on IP remain unresolved,\u201d she said. \u201cIt\u2019s possible that there is also a training component in some deals, in case of historic or less relevant content, but in case of real-time, content grounding is preferred,\u201d she added.\u00a0<\/p>\n<p>On July 30, Gannett signed a licensing deal with Perplexity to allow it to license content from USA Today and the USA Network. As always, details on payment terms are scarce, but it\u2019s an example of a RAG\/grounding deal due to Perplexity\u2019s approach, which centers on ad revenue sharing, not training content deals.\u00a0<\/p>\n<p>\u201cGannett has joined <a href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Fwww.perplexity.ai%2Fhub%2Fblog%2Fintroducing-the-perplexity-publishers-program&amp;esheet=54299415&amp;newsitemid=20250730968765&amp;lan=en-US&amp;anchor=Perplexity+Publisher+Program&amp;index=3&amp;md5=ad9dee34b62920b38ee8cc1ed75a4aca\" target=\"_blank\" rel=\"noopener\">Perplexity\u2019s Publisher Program<\/a>, which incorporates Retrieval Augmented Generation (RAG) as it relates to our trusted content being included as part of answers to Perplexity users question[s] through their consumer offerings,\u201d confirmed a Gannett spokesperson in an email statement.\u00a0<\/p>\n<p><strong>So if it\u2019s not a flat fee, what is payment based on?\u00a0<\/strong><\/p>\n<p>The umbrella term is usage-based payment structures. There are plenty of examples already, and the exact type of payment agreed upon will vary depending on the AI company involved. Some examples are: pay per usage, pay per query, pay per crawl, and models based on ad revenue sharing, like those Perplexity and Prorata.ai offer, which remunerate publishers when their content is used within RAG. 
The IAB Tech Lab is working with publishers and cloud edge companies to develop both pay-per-crawl and pay-per-query models for its standardized framework.\u00a0<\/p>\n<p>From a licensing standpoint, the key question is whether content is actually surfaced in the output \u2014 cited, attributed, and linked back. That\u2019s what defines a RAG-style deal, stressed Rubin. In contrast, traditional training deals involve feeding content into a model so it can learn from it at scale, but without necessarily reflecting that specific content in the output, he added.\u00a0<\/p>\n<p>\u201cI think a lot of these licensing deals have moved to\u2026the grounding side of things, where if I want to cite and use News Corp articles in my output and link to them, I need to license that from them if I\u2019m a tech company,\u201d he said. \u201cAnd so I think that\u2019s another reason why we\u2019re seeing these grounding deals become more prominent in the recent past, and going forward.\u201d<\/p>\n<p><strong>Is there a preferred type of usage deal yet?\u00a0<\/strong><\/p>\n<p>Too early to say. Deals will depend on the negotiating strength of each party, stressed Gary Kibel, partner at law firm Davis+Gilbert. \u201cBoth sides are learning and becoming more sophisticated in these deals,\u201d he said. \u201cMaybe publishers are starting to realize what additional controls they should push for in the agreements, and the AI platforms are starting to learn about maybe additional permitted uses they want to get into the agreement,\u201d he added.\u00a0<\/p>\n<p>A 2025 AI licensing deal already looks different from a 2024 one, thanks to lessons learned \u2014 and by 2026, deals will likely evolve again as new applications for content emerge, said Kibel.<\/p>\n<p>\u201cThere is no one-size-fits-all with finance,\u201d he added.\u00a0\u00a0<\/p>\n<p><strong>But this evolution in the payment terms seems ultimately better for publishers, right?\u00a0<\/strong><\/p>\n<p>Right. 
When the earliest version of ChatGPT first burst on the scene in November 2022, the picture looked very different. Publishers had a common fear: that the LLMs had stripped all their content. The models were built. It was game over. So it was, in a sense, a period of damage control on their part. \u201cPeople negotiated deals and made some money, but none of the deals seemed particularly great, and they were all one-offs,\u201d said Paul Bannister, chief strategy officer at Raptive. \u201cSo it\u2019s like, say you got a check for $20 million, that\u2019s great, but it\u2019s not going to save your business five years from now.\u201d\u00a0<\/p>\n<p>For now, it\u2019s all about usage. Publishers are reporting a surge in crawls, with the same piece of content sometimes scraped thousands of times a day by AI systems, stressed Bannister. The spike is tied to RAG and grounding techniques, which trigger fresh pulls of the same content for each new type of query. So, sure, there may be ways AI companies get more efficient at that in time, and a single pull will suffice, but for now, there is value in that for publishers, if they have a deal based on pay per crawl, for example.\u00a0<\/p>\n<p>\u201cI do hear a lot from publishers these days that the type of training deals publishers were doing a year ago are not going to renew,\u201d added Bannister. 
\u201cEveryone is talking more and more about grounding being the right thing, and probably because, to some level, there is an easier business model behind it.\u201d\u00a0<\/p>\n","protected":false},"excerpt":{"rendered":"AI training licensing deals are starting to feel like yesterday\u2019s news as publishers and platforms focus on more&hellip;\n","protected":false},"author":2,"featured_media":364984,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,127875,1942,53,16,15,127876],"class_list":{"0":"post-364983","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-ai-revenue-generation","10":"tag-artificial-intelligence","11":"tag-technology","12":"tag-uk","13":"tag-united-kingdom","14":"tag-wtf-series"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/115073215642358612","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/364983","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=364983"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/364983\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/364984"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=364983"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=364983"},{"taxonomy":"post_tag","emb
eddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=364983"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}