AI Incident Database

Report 3175

Associated Incidents

Incident 555 (1 Report)
OpenAI's Training Data for LLMs Allegedly Comprised of Copyrighted Books

Lawsuit says OpenAI violated US authors' copyrights
itnews.com.au · 2023

Two US authors sued OpenAI in San Francisco federal court, claiming in a proposed class action that the company misused their works to "train" its popular generative artificial-intelligence system ChatGPT.

Massachusetts-based writers Paul Tremblay and Mona Awad said ChatGPT mined data copied from thousands of books without permission, infringing the authors' copyrights.

Matthew Butterick, an attorney for the authors, declined to comment.

Representatives for OpenAI, a private company backed by Microsoft, did not immediately respond to a request for comment.

Several legal challenges have been filed over material used to train cutting-edge AI systems.

Plaintiffs include source-code owners against OpenAI and Microsoft's GitHub, and visual artists against Stability AI, Midjourney and DeviantArt.

The lawsuit targets have argued that their systems make fair use of copyrighted work.

ChatGPT responds to users' text prompts in a conversational way.

It became the fastest-growing consumer application in history earlier this year, reaching 100 million active users in January only two months after it was launched.

ChatGPT and other generative AI systems create content using large amounts of data scraped from the internet.

Tremblay and Awad's lawsuit said books are a "key ingredient" because they offer the "best examples of high-quality longform writing."

The complaint estimated that OpenAI's training data incorporated over 300,000 books, including from illegal "shadow libraries" that offer copyrighted books without permission.

Awad is known for novels including "13 Ways of Looking at a Fat Girl" and "Bunny."

Tremblay's novels include "The Cabin at the End of the World," which was adapted into the M. Night Shyamalan film "Knock at the Cabin," released in February.

Tremblay and Awad said ChatGPT could generate "very accurate" summaries of their books, indicating that they appeared in its database.

The lawsuit seeks an unspecified amount of money damages on behalf of a nationwide class of copyright owners whose works OpenAI allegedly misused.
