Incident 21: Tougher Turing Test Exposes Chatbots’ Stupidity (migrated to Issue)

Description: In the 2016 Winograd Schema Challenge, even the best-performing AI systems entered answered correctly only 3% more often than random chance. This incident has been downgraded to an issue because it does not meet the current ingestion criteria.

Entities

Alleged: Researchers developed and deployed an AI system, which harmed Researchers.

Incident Stats

Incident ID: 21

Report Count

Incident Date: 2016-07-14

Editors: Sean McGregor

Applied Taxonomies

CSETv0, GMF, CSETv1, MIT

CSETv0 Taxonomy Classifications

Physical System: Software only

Level of Autonomy: High

Nature of End User: Expert

Public Sector Deployment

Lives Lost

Intent: Unclear

GMF Taxonomy Classifications

Known AI Goal Snippets

(Snippet Text: The Winograd Schema Challenge asks computers to make sense of sentences that are ambiguous but usually simple for humans to parse., Related Classifications: Question Answering)

CSETv1 Taxonomy Classifications

Incident Number

Estimated Date

Lives Lost

Injuries

Estimated Harm Quantities

There is a potentially identifiable specific entity that experienced the harm

MIT Taxonomy Classifications

Machine-Classified

Risk Subdomain: 7.3. Lack of capability or robustness

Risk Domain: AI system safety, failures, and limitations

Entity

Timing: Pre-deployment

Intent: Unintentional

Incident Reports

AI Incident Database Incidents Converted to Issues

github.com · 2022

The following former incidents have been converted to "issues" following an update to the incident definition and ingestion criteria.

21: Tougher Turing Test Exposes Chatbots’ Stupidity

Description: The 2016 Winograd Schema Challenge highli…

Variants

A "variant" is an AI incident similar to a known case—it has the same causes, harms, and AI system. Instead of listing it separately, we group it under the first reported incident. Unlike other incidents, variants do not need to have been reported outside the AIID. Learn more from the research paper.

Similar Incidents

By textual similarity

Inappropriate Gmail Smart Reply Suggestions

TayBot

Gender Biases in Google Translate
