A federal court in San Francisco has ruled that AI company Anthropic’s use of copyrighted books to train its large language model Claude qualifies as fair use under U.S. law. However, the company’s practice of storing millions of pirated books has been deemed a copyright violation, setting the stage for a December trial to determine damages.
U.S. District Judge William Alsup said the use of books by authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson in training the AI system was “exceedingly transformative” and aligned with the purpose of copyright law. “Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them but to turn a hard corner and create something different,” he wrote in the ruling.
However, Alsup criticised Anthropic’s retention of over 7 million pirated books in what he described as a “central library.” He stated this storage was not protected under fair use and violated the rights of the authors, even if those works were not all used in training.
The authors, who filed a proposed class action last year, argued that Anthropic had used pirated versions of their work without consent or payment. The company is backed by tech giants Amazon and Alphabet.
An Anthropic spokesperson responded positively to the court’s partial support: “We are pleased that the court recognised our AI training as transformative and consistent with copyright’s purpose in enabling creativity and fostering scientific progress.”
Alsup’s ruling is the first significant U.S. decision to directly address the application of fair use in the context of generative AI, a central defence being used by companies like OpenAI, Meta, and Microsoft in ongoing copyright cases.
While AI firms argue that training on copyrighted content enables the creation of transformative works and fuels innovation, copyright holders maintain that these practices undermine their ability to earn from original creations. The judge’s decision acknowledges both perspectives, affirming the legality of transformative AI training while denouncing unlawful retention of pirated content.
The upcoming trial could see Anthropic facing statutory damages of up to $150,000 per work for wilful copyright infringement. The case will likely influence future legal battles over how AI companies access and use protected works in building generative models.