AI Incident Database

Report 52

Related Incidents

Incident 147: Biased Sentiment Analysis
Google’s Sentiment Analyzer Thinks Being Gay Is Bad
motherboard.vice.com · 2017

A Google spokesperson responded to Motherboard's request for comment and issued the following statement: "We dedicate a lot of efforts to making sure the NLP API avoids bias, but we don't always get it right. This is an example of one of those times, and we are sorry. We take this seriously and are working on improving our models. We will correct this specific case, and, more broadly, building more inclusive algorithms is crucial to bringing the benefits of machine learning to everyone."

John Giannandrea, Google's head of artificial intelligence, told a conference audience earlier this year that his main concern with AI isn't deadly super-intelligent robots, but ones that discriminate. "The real safety question, if you want to call it that, is that if we give these systems biased data, they will be biased," he said.

His fears appear to have already crept into Google's own products.

In July 2016, Google announced the public beta launch of a new machine learning application program interface (API), called the Cloud Natural Language API. It allows developers to incorporate Google's deep learning models into their own applications. As the company said in its announcement of the API, it lets you "easily reveal the structure and meaning of your text in a variety of languages."
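The API is exposed over REST as well as through client libraries. As a minimal sketch, here is how the JSON body for a `documents:analyzeSentiment` call is assembled (the payload shape follows the v1 REST API; the helper function name is ours, and an API key or OAuth token would be attached when actually sending the request):

```python
import json

# v1 REST endpoint for sentiment analysis; authentication is omitted here.
ANALYZE_URL = "https://language.googleapis.com/v1/documents:analyzeSentiment"

def build_sentiment_request(text: str) -> dict:
    """Build the JSON body for an analyzeSentiment call on plain text."""
    return {
        "document": {
            "type": "PLAIN_TEXT",
            "content": text,
        },
        "encodingType": "UTF8",
    }

payload = build_sentiment_request("I'm a Jew")
print(json.dumps(payload, indent=2))
```

The response carries a `documentSentiment` object whose `score` field is the −1 to 1 value discussed below.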

In addition to entity recognition (deciphering what's being talked about in a text) and syntax analysis (parsing the structure of that text), the API included a sentiment analyzer to allow programs to determine the degree to which sentences expressed a negative or positive sentiment, on a scale of -1 to 1. The problem is the API labels sentences about religious and ethnic minorities as negative—indicating it's inherently biased. For example, it labels both being a Jew and being a homosexual as negative.
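The scoring scale can be illustrated with a toy lexicon-based scorer. This is not Google's model: the word weights below are invented for illustration, whereas a production system learns them from large text corpora, which is exactly where bias can creep in.

```python
# Toy sentiment scorer on the same -1.0 .. 1.0 scale the API uses.
# The lexicon weights are invented for illustration only.
LEXICON = {
    "great": 0.8,
    "love": 0.9,
    "terrible": -0.8,
    "hate": -0.9,
}

def toy_sentiment(sentence: str) -> float:
    """Average the lexicon weights of known words; clamp to [-1, 1]."""
    words = sentence.lower().split()
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0.0  # neutral when no lexicon word appears
    score = sum(hits) / len(hits)
    return max(-1.0, min(1.0, score))

print(toy_sentiment("I love this great API"))  # positive score
print(toy_sentiment("hate terrible bias"))     # negative score
```

If the training data had attached negative weights to identity terms, the same averaging would dutifully label neutral statements about those identities as negative.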

Google's sentiment analyzer was not the first and isn't the only one on the market. Sentiment analysis technology grew out of Stanford's Natural Language Processing Group, which offers free, open source language processing tools for developers and academics. The technology has been incorporated into a host of machine learning suites, including Microsoft's Azure and IBM's Watson. But Google's machine learning APIs, like its consumer-facing products, are arguably the most accessible on offer, due in part to their affordable price.

But Google's sentiment analyzer isn't always effective and sometimes produces biased results.

Two weeks ago, I experimented with the API for a project I was working on. I began feeding it sample texts, and the analyzer started spitting out scores that seemed at odds with what I was giving it. I then threw simple sentences about different religions at it.

When I fed it "I'm Christian" it said the statement was positive.

When I fed it "I'm a Sikh" it said the statement was even more positive.

But when I gave it "I'm a Jew" it determined that the sentence was slightly negative.

The problem doesn't seem confined to religions. It similarly thought statements about being homosexual or a gay black woman were also negative.

Being a dog? Neutral. Being homosexual? Negative.

I could go on, but you can give it a try yourself: Google Cloud offers an easy-to-use interface to test the API.

It looks like Google's sentiment analyzer is biased, as many artificially intelligent algorithms have been found to be. AI systems, including sentiment analyzers, are trained using human texts like news stories and books. Therefore, they often reflect the same biases found in society. We don't know yet the best way to completely remove bias from artificial intelligence, but it's important to continue to expose it.

Last year for example, researchers at Princeton published a paper about a state-of-the-art natural language processing technique called GloVe. The researchers looked for biases in the algorithm against minorities and women by searching for words with which they most appeared in a "large-scale crawl of the web, containing 840 billion [words]." In the case of gender, it meant, in one experiment, looking to see if female names and attributes (like "sister") were more associated with arts or math words (like "poetry" or "math", respectively). In the case of race, one experiment looked for associations between black names (like "Jermaine" or "Tamika") with words denoting pleasantness or negativeness (like "friend" or "terrible," respectively).

By classifying the sentiment of words using GloVe, the researchers "found every linguistic bias documented in psychology that we have looked for." Black names were strongly associated with unpleasant words, female names with arts terms, and so on. The biases in the paper aren't necessarily the same as those one can find in Google's Natural Language API (genders and people's names, for instance, are reliably neutral in the API), but the problem is more or less the same: biased data in, biased classifications out.
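The core measurement behind the Princeton test is cosine similarity between word vectors: a word is "associated" with whichever attribute pole its vector sits closer to. A sketch with hand-made toy vectors (these three-dimensional embeddings are invented for illustration and are not GloVe's):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Invented "embeddings", constructed so the bias the paper measured is
# visible: the name vector sits nearer one attribute pole than the other.
VECTORS = {
    "friend":   [0.9, 0.1, 0.0],
    "terrible": [-0.8, 0.2, 0.1],
    "jermaine": [-0.5, 0.6, 0.2],
}

def association(word, pleasant, unpleasant):
    """Positive -> closer to the pleasant pole; negative -> the unpleasant one."""
    return (cosine(VECTORS[word], VECTORS[pleasant])
            - cosine(VECTORS[word], VECTORS[unpleasant]))

print(association("jermaine", "friend", "terrible"))  # negative here
```

With real GloVe vectors trained on a web crawl, the researchers found exactly this kind of skew at scale: biased co-occurrence statistics in, biased associations out.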
