The biggest copyright case of the AI era won’t go to trial, we’ve learned this week, as Anthropic, the company behind the Claude chatbot, chose to settle what the firm said was “possibly” the largest class action lawsuit around copyright ever.

Three authors – Andrea Bartz, Charles Graeber and Kirk Wallace Johnson – filed a lawsuit against the firm in August 2024, alleging that Anthropic had used pirated versions of their books to train its AI model.

The authors also claimed that Anthropic made its business work by “largescale theft of copyrighted works” and alleged the firm “has taken multiple steps to hide the full extent of its copyright theft”. They presented evidence in their claim that the firm had downloaded seven million books from the websites LibGen and PiLiMi, which host pirated versions of published works.

AI models work by ingesting data that they are given by their makers, which becomes its ground base of “knowledge” on which it can generate its responses. Much of this data is drawn from the internet, and some of it is alleged to come from pirated websites. Over the last year, Anthropic has repeatedly attempted to avoid the case coming to trial, but was on the ropes after the judge overseeing the case said in a ruling earlier this month that the firm had “refused to come clean” about which pirated works were used in training its systems. Judge William Alsup added in his early August ruling: “If Anthropic loses big it will be because what it did wrong was also big.”

The case – and the decision to settle – poses significant questions about how AI firms have dealt with copyrighted material and how the future of large language models may develop. There are multiple other lawsuits currently underway and all will have been watching this case very closely.

Legal experts had previously calculated that with statutory damages for copyright infringement significant in their value, Anthropic could have been on the hook for a payout nearing $1 trillion if it had lost the case. During the initial hearings, Anthropic’s chief financial officer told the judge the firm operates at a loss totalling billions of dollars, and would have less than $5 billion in revenue this year. The whole company is worth around $170bn. Anthropic declined to comment.

“It is clear that Anthropic have been fearing a disastrous ruling following the class certification and the fact that they could be liable for training with shadow libraries and other pirated material,” says Andres Guadamuz, an intellectual property expert at the University of Sussex. “A settlement was more likely after that, but I’m surprised that the authors agreed to mediate.”

Joanna Bryson, a professor of AI ethics at the Hertie School in Berlin, says: “Either they;re afraid or they were just tired of being in the press.”

While the case has been settled – and neither party has confirmed the amount as part of the agreement – those campaigning for the rights of authors and other artists are claiming the settlement as a victory.

Dario Amodei, co-founder and CEO of Anthropic, speaking at an event in Paris in 2024

open image in gallery

Dario Amodei, co-founder and CEO of Anthropic, speaking at an event in Paris in 2024 (AFP/Getty)

“This is a well-deserved win for authors, and great news for the wider creative community, as it shows that AI companies aren’t above the law,” says Ed Newton-Rex, a former AI executive-turned-copyright campaigner and founder of Fairly Trained, a non-profit certifying companies that respect creators’ rights.

“This is a strong signal that the many, many lawsuits brought by rights holders against AI companies will find the law is on their side,” he adds. ”The dam is bursting, and I suspect we will see more settlements or outright victories for rights holders soon enough.”

That’s an approach Edward Lee, professor of law at the University of Santa Clara, can see people taking. “The settlement, if approved by the court, will likely be interpreted by some people as a vindication of the book authors’ claims even though Judge Alsup did hold that the AI training was a fair use,” he says.

Other copyright cases against AI firms are ongoing. Among the other cases, OpenAI and Microsoft are being sued by The New York Times and other news publishers over alleged use of their work in training data, while an adult movie producer has filed a lawsuit against Meta, the parent company of Facebook, for using more than 2,000 pirated copies of its films to train its AI models. Anthropic is even defending two other copyright lawsuits in the same Californian court, against music publishers and the website Reddit.

Anthropic is alleged to have used pirated versions of their books to train its AI model.

open image in gallery

Anthropic is alleged to have used pirated versions of their books to train its AI model. (AP)

But before copyright owners claim victory, experts like Guadamuz and Bryson are sounding a note of caution. The settlement means that there isn’t a legally binding decision on whether the approach AI companies take to train their models is right or wrong by the law.

“What is a bit frustrating is that we’re still not going to get a definitive ruling on some of the legal issues, particularly the fair use question,” says Guadamuz.

AI companies defend their use of copyrighted content by claiming the models produce “transformative” works that are different to the originals they’re based on, which is allowed under carve-outs in copyright law.

Bryson points out that we also don’t know the motivation behind the settlement because tech companies are so well-funded. “It’s very hard to read exactly how worried they were,” she says.

All of which means the eye-popping settlement poses more questions than it answers. “We’ll have to see which cases make it all the way,” Guadamuz says.

OpenAI and Microsoft are being sued by The New York Times and other news publishers over alleged use of their work in training data

open image in gallery

OpenAI and Microsoft are being sued by The New York Times and other news publishers over alleged use of their work in training data (Getty/iStock)

Lee believes that it could set the tone for other suits. “The Anthropic settlement could have a domino effect for sure, especially other cases involving the same allegation that the defendant downloaded copies from shadow libraries,” he says.

“If the courts certify class actions in those cases, such as against OpenAI and Microsoft, the same pressure to settle arises. The potential amount of statutory damages is gigantic – what Judge Alsup even referred to as a ‘doomsday’ scenario.”

Still, Guadamuz is not confident many will continue to make it to court judgements. “I’ve always assumed that most of these cases would end in settlements, neither side wants a negative definitive answer and the closer we get to a trial, the likelier it is for there to be a settlement,” says Guadamuz. “So Anthropic and many of the other companies will start coughing up settlements. Eventually some sort of licensing scheme will inevitably emerge.”