Thieving AI firm ordered to pay authors…

There was a really interesting story in Wired in the past couple of days, reporting that the AI firm Anthropic has agreed to pay at least $1.5 billion, an estimated $3,000 per work, to settle a lawsuit brought by a group of book authors alleging copyright infringement.

This is the first class action settlement centred on AI that has come up in the US, and it definitely won’t be the last.

The case centred on the claim that Anthropic used pirated books downloaded from sites such as LibGen in order to train its LLMs, and that authors should be allowed to bring Anthropic to trial over pirating their work. Details of how people will be allowed to join the class action will be released shortly.

The judge concluded:

“Anthropic downloaded over seven million pirated copies of books, paid nothing, and kept these pirated copies in its library even after deciding it would not use them to train its AI (at all or ever again). Authors argue Anthropic should have paid for these pirated library copies. This order agrees.”

As someone who has written a lot about piracy… and has had my books pirated by these sites, and the words in them used to train LLMs… it may seem contradictory for me to agree with the judgement.

But this isn’t a raiding of an enclosed wealth store to enrich the commons. This is theft by hugely wealthy data companies from authors who are mostly making a pittance. Companies like Anthropic are looking to make huge profits out of a technology they have built on the back of stolen material. That is wrong, and I seriously hope that other similar cases, against Meta for example, who have admitted stealing works too, are decided the same way.

In my professional context, I was heavily involved in the CREAATIF project exploring the impact of GenAI on the creative industries. You can read the report that I helped write here.