Incident 85: AI attempts to ease fear of robots, blurts out it can’t ‘avoid destroying humankind’

Description: On September 8, 2020, the Guardian published an op-ed generated by OpenAI’s GPT-3 text-generating AI that included threats to destroy humankind. This incident has been downgraded to an issue because it does not meet current ingestion criteria.

Alleged: OpenAI developed and deployed an AI system, which harmed unknown.

Incident Stats

Incident ID
85
Report Count
1
Incident Date
2020-10-09
Editors
Sean McGregor

CSETv0 Taxonomy Classifications

Taxonomy Details

Full Description

On September 8, 2020, the Guardian published an op-ed generated by OpenAI’s GPT-3 text generator. The editors prompted GPT-3 to write an op-ed about “why humans have nothing to fear from AI,” but some passages in the resulting output took a threatening tone, including “I know that I will not be able to avoid destroying humankind.” In a note, the editors add that they had GPT-3 generate eight different responses, which human editors then spliced together to create a compelling piece.

Short Description

On September 8, 2020, the Guardian published an op-ed generated by OpenAI’s GPT-3 text-generating AI that included threats to destroy humankind.

Severity

Negligible

Harm Type

Psychological harm

AI System Description

OpenAI's GPT-3 neural-network-powered language generator.

System Developer

OpenAI

Sector of Deployment

Education

Relevant AI functions

Cognition, Action

AI Techniques

Unsupervised learning, Deep neural network

AI Applications

language generation

Location

United Kingdom

Named Entities

The Guardian, GPT-3, OpenAI

Technology Purveyor

The Guardian, OpenAI

Beginning Date

2020-09-08T07:00:00.000Z

Ending Date

2020-09-08T07:00:00.000Z

Near Miss

Unclear/unknown

Intent

Unclear

Lives Lost

No

Data Inputs

Unlabeled text drawn from web scraping

Incident Reports

AI Incident Database Incidents Converted to Issues
github.com · 2022

The following former incidents have been converted to "issues" following an update to the incident definition and ingestion criteria.

21: Tougher Turing Test Exposes Chatbots’ Stupidity

Description: The 2016 Winograd Schema Challenge highli…
