A federal judge in the United States, William Alsup, has issued a landmark ruling in a case involving a group of authors and the AI company Anthropic. The case centred on whether it was legal for Anthropic to train its Claude language model on copyrighted books. In a decision that could shape the future of AI training practices, the judge ruled that training AI models on copyrighted books qualifies as fair use, provided those books were legally acquired. Alsup likened the process to how a human writer reads and learns from books to develop their own ideas, rather than copying them directly.
However, the situation became more complex when it was revealed that Anthropic had also downloaded millions of books from pirate websites such as LibGen and Pirate Library Mirror. Because these copies were not obtained through legal means, the judge made it clear that their use does not fall under fair use. As a result, the court will hold a separate trial to determine whether Anthropic is liable for damages over the pirated books. The company’s later decision to purchase legitimate copies of some of them was insufficient to undo the original infringement.
Judge Alsup’s ruling took into account long-standing fair use principles, such as the purpose of the use, how much of the material was used, and whether the new use harms the market for the original work. He concluded that when Anthropic used legally purchased books to train Claude, the process was “exceedingly transformative” and resulted in the creation of entirely new material. But this legal protection did not extend to books that were obtained through illegal downloads, regardless of how they were used afterwards.
Anthropic has welcomed the court’s decision on fair use, saying it supports innovation and learning in AI development. Still, the company faces serious financial risk from the piracy trial, where damages could run into billions of dollars depending on the court’s findings. This split outcome sends a clear signal to other AI developers: while it may be legal to use copyrighted material for training if it is acquired properly, using pirated content carries significant legal consequences.

This case is the first of its kind to give legal support to the idea that training AI models on copyrighted text can fall under fair use, setting an early but important precedent. As lawsuits against companies like OpenAI, Meta, Midjourney, and Google continue to grow, the outcome of this case could strongly influence how future legal battles unfold. Most importantly, it highlights the need for AI developers to respect copyright law in how they collect and use data.