The Pile
Incidents involved as Developer
Incident 996 (2 Reports)
Meta Allegedly Used Books3, a Dataset of 191,000 Pirated Books, to Train LLaMA AI
2020-10-25
Meta and Bloomberg allegedly used Books3, a dataset containing 191,000 pirated books, to train their AI models, including LLaMA and BloombergGPT, without author consent. Lawsuits from authors such as Sarah Silverman and Michael Chabon claim this constitutes copyright infringement. Books3 includes works from major publishers like Penguin Random House and HarperCollins. Meta argues its AI outputs are not "substantially similar" to the original books, but legal challenges continue.
Incidents implicated systems
Incident 996 (2 Reports)
Meta Allegedly Used Books3, a Dataset of 191,000 Pirated Books, to Train LLaMA AI
2020-10-25
Meta and Bloomberg allegedly used Books3, a dataset containing 191,000 pirated books, to train their AI models, including LLaMA and BloombergGPT, without author consent. Lawsuits from authors such as Sarah Silverman and Michael Chabon claim this constitutes copyright infringement. Books3 includes works from major publishers like Penguin Random House and HarperCollins. Meta argues its AI outputs are not "substantially similar" to the original books, but legal challenges continue.