AI Incident Database

Incident 996: Meta Allegedly Used Books3, a Dataset of 191,000 Pirated Books, to Train LLaMA AI

Description: Meta and Bloomberg allegedly used Books3, a dataset containing 191,000 pirated books, to train their AI models, including LLaMA and BloombergGPT, without author consent. Lawsuits from authors such as Sarah Silverman and Michael Chabon claim this constitutes copyright infringement. Books3 includes works from major publishers like Penguin Random House and HarperCollins. Meta argues its AI outputs are not "substantially similar" to the original books, but legal challenges continue.

Entities

Alleged: Various generative AI developers, Meta, EleutherAI, Bloomberg, The Pile and Shawn Presser developed an AI system deployed by Various generative AI developers, Meta, EleutherAI and Bloomberg, which harmed Zadie Smith, Writers, Verso, Stephen King, Sarah Silverman, Richard Kadrey, Publishers found in Books3, Penguin Random House, Oxford University Press, Over 170,000 authors found in Books3, Michael Pollan, Margaret Atwood, Macmillan, HarperCollins, General public, Creative industries, Christopher Golden and Authors.
Alleged implicated AI systems: The Pile, LLaMA, hugging face, GPT-J, Books3, BloombergGPT and Bibliotik

Incident Stats

Incident ID: 996
Report Count: 2
Incident Date: 2020-10-25
Editors: Daniel Atherton

Incident Reports

Sarah Silverman is suing OpenAI and Meta for copyright infringement
theverge.com · 2023

Comedian and author Sarah Silverman, as well as authors Christopher Golden and Richard Kadrey, are suing OpenAI and Meta, each in a US District Court, over dual claims of copyright infringement.

The suits allege, among other things, that Op…

Revealed: The Authors Whose Pirated Books Are Powering Generative AI
theatlantic.com · 2023

Updated at 1:40 p.m. ET on September 25, 2023

Editor's note: This article is part of The Atlantic's series on Books3. Check out our searchable Books3 database to find specific authors and titles. A deeper analysis of what is in the database…

Variants

A "variant" is an incident that shares the same causative factors, produces similar harms, and involves the same intelligent systems as a known AI incident. Rather than index variants as entirely separate incidents, we list variations of incidents under the first similar incident submitted to the database. Unlike other submission types to the incident database, variants are not required to have reporting in evidence external to the Incident Database. Learn more from the research paper.

Similar Incidents

Selected by our editors

Meta and OpenAI Accused of Using LibGen’s Pirated Books to Train AI Models

Feb 2023 · 4 reports
2024 - AI Incident Database
