Incident 41: All Image Captions Produced are Violent

Description: MIT Media Lab researchers create AI-powered "psychopath" named Norman by training a model on "dark corners" of Reddit.


Alleged: MIT Media Lab developed and deployed an AI system, which harmed unknown parties.

Incident Stats

Incident ID: 41
Report Count
Incident Date
Editor: Sean McGregor

CSET Taxonomy Classifications

Taxonomy Details

Full Description

In 2018, MIT Media Lab researchers created an AI-powered "psychopath" image-captioning algorithm named Norman. Norman was trained on caption data from a Reddit community dedicated to graphic images and videos of people dying. The researchers then showed both Norman and a standard image-captioning algorithm trained on the MSCOCO dataset a series of Rorschach inkblots, which psychologists have used to detect thought disorders. Norman consistently described gruesome scenes where the standard algorithm produced innocuous descriptions; for example, where the standard model saw "a black and white photo of a small bird," Norman saw "man gets pulled into dough machine." The researchers created Norman to demonstrate the influence training data has on how machine learning algorithms behave in the real world, and how biased or poor-quality data can lead to unreliable and untrustworthy outputs.
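The mechanism behind the demonstration can be illustrated with a toy sketch. This is not the researchers' actual model; it is a hypothetical word-frequency "caption picker" with made-up miniature corpora standing in for the real training sets, showing how the same ambiguous input yields different captions depending solely on the training data.

```python
from collections import Counter

def train_caption_model(corpus):
    """Count word frequencies in a training corpus of captions."""
    counts = Counter()
    for caption in corpus:
        counts.update(caption.lower().split())
    return counts

def describe(model, candidates):
    """Pick the candidate caption whose words the model has seen most often."""
    def score(caption):
        return sum(model[w] for w in caption.lower().split())
    return max(candidates, key=score)

# Hypothetical miniature corpora standing in for the real training sets.
benign_corpus = [
    "a small bird perched on a branch",
    "a black and white photo of a bird",
    "a vase of flowers on a table",
]
violent_corpus = [
    "man gets pulled into machine",
    "man is shot dead in the street",
    "man falls from a tall building",
]

# The same ambiguous "inkblot" admits both readings; which caption is
# chosen depends entirely on which corpus the model was trained on.
candidates = [
    "a black and white photo of a small bird",
    "man gets pulled into dough machine",
]

benign_model = train_caption_model(benign_corpus)
dark_model = train_caption_model(violent_corpus)

print(describe(benign_model, candidates))  # the benign reading
print(describe(dark_model, candidates))    # the violent reading
```

The two models share identical code and receive identical input; only the data differs, which is the point of the Norman demonstration.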

Short Description

MIT Media Lab researchers create AI-powered "psychopath" named Norman by training a model on "dark corners" of Reddit.



Harm Type

Psychological harm

AI System Description

"Norman" is a text generating algorithm trained on disturbing content in order to demonstrate how training data can negatively affect an AI model. The comparison model is a regular text generation model.

System Developer

MIT Media Lab

Sector of Deployment

Information and communication

Relevant AI functions

Perception, Cognition, Action

AI Techniques

Machine learning

AI Applications

Text generation


Location

Cambridge, MA

Named Entities

MIT Media Lab, Reddit, Norman, Massachusetts Institute of Technology

Technology Purveyor

MIT Media Lab

Beginning Date


Ending Date


Near Miss

Lives Lost


Data Inputs

Violent content from Reddit for the Norman algorithm, MSCOCO dataset for the control algorithm.


A "variant" is an incident that shares the same causative factors, produces similar harms, and involves the same intelligent systems as a known AI incident. Rather than index variants as entirely separate incidents, we list variations of incidents under the first similar incident submitted to the database. Unlike other submission types to the incident database, variants are not required to have reporting in evidence external to the Incident Database. Learn more from the research paper.

Similar Incidents

By textual similarity


