AI Incident Database
Report 872

Artificial Intelligence Has a Bias Problem, and It's Our Fault
au.pcmag.com · 2018

In 2016, researchers from Boston University and Microsoft were working on artificial intelligence algorithms when they discovered racist and sexist tendencies in the technology underlying some of the most popular and critical services we use every day. The revelation went against the conventional wisdom that artificial intelligence doesn't suffer from the gender, racial, and cultural prejudices that we humans do.

The researchers made this discovery while studying word-embedding algorithms, a type of AI that finds correlations and associations among different words by analyzing large bodies of text. For instance, a trained word-embedding algorithm can understand that words for flowers are closely related to pleasant feelings. On a more practical level, word embedding understands that the term "computer programming" is closely related to "C++," "JavaScript" and "object-oriented analysis and design." When integrated in a resume-scanning application, this functionality lets employers find qualified candidates with less effort. In search engines, it can provide better results by bringing up content that's semantically related to the search term.
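The "closeness" the paragraph describes is usually measured with cosine similarity between word vectors. A minimal sketch, using tiny hand-made 3-dimensional vectors for illustration (real models such as word2vec or GloVe learn hundreds of dimensions from large text corpora):

```python
import math

# Toy embeddings, invented for illustration only.
embeddings = {
    "programming": [0.9, 0.1, 0.0],
    "javascript":  [0.8, 0.2, 0.1],
    "flower":      [0.1, 0.9, 0.2],
    "pleasant":    [0.2, 0.8, 0.3],
}

def cosine(u, v):
    """Cosine similarity: near 1.0 means the words point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Related terms sit close together in the vector space.
print(cosine(embeddings["programming"], embeddings["javascript"]))  # high
print(cosine(embeddings["programming"], embeddings["flower"]))      # low
```

A resume scanner or search engine can then rank candidates or documents by this similarity score instead of exact keyword matches.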

The BU and Microsoft researchers found that the word-embedding algorithms had problematic biases, though—such as associating "computer programmer" with male pronouns and "homemaker" with female ones. Their findings, published in a research paper aptly titled "Man is to Computer Programmer as Woman is to Homemaker?", were among several reports to debunk the myth of AI neutrality and to shed light on algorithmic bias, a phenomenon that is reaching critical dimensions as algorithms become increasingly involved in our everyday decisions.
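The title refers to the analogy test used on embeddings: subtract one word vector, add another, and look up the nearest remaining word. The sketch below uses toy vectors hand-picked to reproduce the bias the researchers reported; real embeddings learn such regularities from news text.

```python
import math

# Toy vectors, rigged for illustration to mirror the reported bias.
V = {
    "man":        [1.0, 0.0, 0.2],
    "woman":      [0.0, 1.0, 0.2],
    "programmer": [1.0, 0.1, 0.9],
    "homemaker":  [0.1, 1.0, 0.9],
    "engineer":   [0.9, 0.2, 0.8],
}

def nearest(query, exclude):
    """Return the vocabulary word closest (by cosine) to the query vector."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))
    return max((w for w in V if w not in exclude), key=lambda w: cos(V[w], query))

# "programmer" - "man" + "woman": the analogy from the paper's title.
query = [p - m + w for p, m, w in zip(V["programmer"], V["man"], V["woman"])]
print(nearest(query, exclude={"programmer", "man", "woman"}))  # → homemaker
```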

The Origins of Algorithmic Bias

Machine learning and deep-learning algorithms underlie most contemporary AI-powered software. In contrast to traditional software, which follows predefined and verifiable rules, deep learning creates its own rules and learns by example.

For instance, to create an image-recognition application based on deep learning, programmers "train" the algorithm by feeding it labeled data: in this case, photos tagged with the name of the object they contain. Once the algorithm ingests enough examples, it can glean common patterns among similarly labeled data and use that information to classify unlabeled samples.
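The train-on-labeled-examples loop can be sketched with a deliberately simple stand-in for a deep network: a nearest-centroid classifier. The "photos" below are invented two-number feature vectors, but the shape of the process is the same—labeled samples in, a decision rule out.

```python
# Minimal sketch of "learning by example" with a nearest-centroid classifier.

def train(labeled_samples):
    """Average the feature vectors of each label into a centroid."""
    sums, counts = {}, {}
    for features, label in labeled_samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Assign the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, features))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Invented 2-D "photos": (brightness, roundness) tagged with the object shown.
training_data = [
    ([0.9, 0.8], "sun"), ([0.8, 0.9], "sun"),
    ([0.2, 0.1], "cat"), ([0.1, 0.2], "cat"),
]
model = train(training_data)
print(classify(model, [0.85, 0.85]))  # → sun
```

The same dependence on the training set is what lets biased data produce a biased classifier: the model can only average what it is shown.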

This mechanism enables deep learning to perform many tasks that were virtually impossible with rule-based software. But it also means deep-learning software can inherit covert or overt biases.

"AI algorithms are not inherently biased," says Professor Venkatesh Saligrama, who teaches at Boston University's Department of Electrical and Computer Engineering and worked on the word-embedding algorithms. "They have deterministic functionality and will pick up any tendencies that already exist in the data they train on."

The word-embedding algorithms tested by the Boston University researchers were trained on hundreds of thousands of articles from Google News, Wikipedia, and other online sources in which social biases are deeply embedded. As an example, because of the bro culture dominating the tech industry, male names come up more often with tech-related jobs—and that leads algorithms to associate men with jobs such as programming and software engineering.
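One way such an association can be made visible—roughly in the spirit of the BU/Microsoft analysis—is to project occupation words onto a "gender direction" such as he − she. The vectors below are invented for illustration; a real test would use trained embeddings.

```python
# Toy 2-D vectors, invented to illustrate measuring bias along a gender axis.
vec = {
    "he":         [1.0, 0.0],
    "she":        [0.0, 1.0],
    "programmer": [0.8, 0.3],
    "homemaker":  [0.2, 0.9],
}

# The axis points from "she" toward "he".
gender_axis = [h - s for h, s in zip(vec["he"], vec["she"])]

def gender_score(word):
    """Positive → the word leans toward 'he'; negative → toward 'she'."""
    return sum(a * b for a, b in zip(vec[word], gender_axis))

print(gender_score("programmer"))  # positive: skews male in this toy data
print(gender_score("homemaker"))   # negative: skews female
```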

"Algorithms don't have the power of the human mind in distinguishing right from wrong," adds Tolga Bolukbasi, a final-year PhD student at BU. Humans can judge the morality of our actions, even when we decide to act against ethical norms. But for algorithms, data is the ultimate determining factor.

Saligrama and Bolukbasi weren't the first to raise the alarm about this bias. Researchers at IBM, Microsoft, and the University of Toronto underlined the need to prevent algorithmic discrimination in a paper published in 2011. Back then, algorithmic bias was an esoteric concern, and deep learning still hadn't found its way into the mainstream. Today, though, algorithmic bias already leaves a mark on many of the things we do, such as reading news, finding friends, shopping online, and watching videos on Netflix and YouTube.

The Impact of Algorithmic Bias

In 2015, Google had to apologize after the algorithms powering its Photos app tagged two black people as gorillas—perhaps because its training dataset did not have enough pictures of black people. In 2016, of the 44 winners of a beauty contest judged by AI, nearly all were white, a few were Asian, and only one had dark skin. Again, the reason was that the algorithm was mostly trained with photos of white people.

"Google Photos, y'all fucked up. My friend's not a gorilla. pic.twitter.com/SMkMCsNVX4" — Jacky Alciné (@jackyalcine), June 29, 2015

More recently, a test of IBM and Microsoft's face-analysis services found the companies' algorithms were nearly flawless at detecting the gender of men with light skin but often erred when presented with pictures of women with dark skin.
