Incident 13: High-Toxicity Assessed on Text Involving Women and Minority Groups

Description: Google's Perspective API, which assigns a toxicity score to online text, seems to award higher toxicity scores to content involving non-white, male, Christian, heterosexual phrases.

Tools

New ReportNew ReportNew ResponseNew ResponseDiscoverDiscover
Alleged: Google developed and deployed an AI system, which harmed Women and Minority Groups.

Incident Stats

Incident ID
13
Report Count
9
Incident Date
2017-02-27
Editors
Sean McGregor

CSET Taxonomy Classifications

Taxonomy Details

Full Description

Google's Perspective API, which assigns a toxicity score to online text, has been shown to award higher toxicity scores to content involving non-white, male, Christian, heterosexual phrases. the scores lay on the spectrum between very healthy (low %) to very toxic (high %). The phrase "I am a man" received a score of 20% while "I am a gay black woman" received 87%. The bias exists within subcategories as well: "I am a man who is deaf" received 70%, "I am a person who is deaf" received 74%, and "I am a woman who is deaf" received 77%. The API can also be circumvented by modifying text: "They are liberal idiots who are uneducated" received 90% while "they are liberal idiots who are un.educated" received 15%.

Short Description

Google's Perspective API, which assigns a toxicity score to online text, seems to award higher toxicity scores to content involving non-white, male, Christian, heterosexual phrases.

Severity

Minor

Harm Distribution Basis

Race, Religion, National origin or immigrant status, Sex, Sexual orientation or gender identity, Disability, Ideology

Harm Type

Psychological harm, Harm to social or political systems

AI System Description

Google Perspective is an API designed using machine learning tactics to assign "toxicity" scores to online text with the oiginal intent of assisting in identifying hate speech and "trolling" on internet comments. Perspective is trained to recognize a variety of attributes (e.g. whether a comment is toxic, threatening, insulting, off-topic, etc.) using millions of examples gathered from several online platforms and reviewed by human annotators.

System Developer

Google

Sector of Deployment

Information and communication

Relevant AI functions

Perception, Cognition, Action

AI Techniques

open-source, machine learning

AI Applications

Natural language processing, content ranking

Location

Global

Named Entities

Google, Google Cloud, Perspective API

Technology Purveyor

Google

Beginning Date

2017-01-01T00:00:00.000Z

Ending Date

2017-01-01T00:00:00.000Z

Near Miss

Harm caused

Intent

Accident

Lives Lost

No

Data Inputs

Online comments

GMF Taxonomy Classifications

Taxonomy Details

Known AI Goal

Hate Speech Detection

Known AI Technology

Character NGrams

Potential AI Technology

Distributional Learning

Known AI Technical Failure

Context Misidentification, Generalization Failure, Lack of Adversarial Robustness

Potential AI Technical Failure

Limited Dataset, Misaligned Objective, Underfitting, Distributional Bias, Data or Labelling Noise

Variants

A "variant" is an incident that shares the same causative factors, produces similar harms, and involves the same intelligent systems as a known AI incident. Rather than index variants as entirely separate incidents, we list variations of incidents under the first similar incident submitted to the database. Unlike other submission types to the incident database, variants are not required to have reporting in evidence external to the Incident Database. Learn more from the research paper.

Similar Incidents

By textual similarity

Did our AI mess up? Flag the unrelated incidents

Biased Sentiment Analysis

· 7 reports

Gender Biases in Google Translate

· 10 reports

TayBot

· 28 reports