The Atlantic’s ‘Pirated Books’ Search Tool

Alex Reisner at The Atlantic created a tool to search Library Genesis, or LibGen, which contains “[m]illions of books and scientific papers” obtained without permission or compensation. The tool dates from March, but it is relevant again following Anthropic’s $1.5 billion settlement with book authors and publishers: LibGen was one of several repositories Anthropic used to train its AI, so there’s a good chance any authors in this database are part of the covered class.

I also recommend Reisner’s companion piece, “The Unbelievable Scale of AI’s Pirated-Books Problem” (paywalled; Apple News+ link), which details how Meta (and OpenAI, and likely many other AI companies) also trained using LibGen.
