
In November 2022, military leaders in the Tigray region of northern Ethiopia agreed to a fragile ceasefire, bringing a technical end to two years of brutal civil war. But despite the truce, ethnic cleansing campaigns against minority Tigrayan communities continue. Since the war first broke out, Facebook has been regularly accused of failing to moderate hate speech targeting Tigrayans.
Most recently, Meta faces a lawsuit from the son of the Ethiopian academic Meareg Amare, who was assassinated outside his home in November 2021. The lawsuit, filed in December 2022, claims Meta failed to take action against death threats posted on Facebook in the weeks leading up to the attack and, more broadly, failed to invest in enough moderation resources for Ethiopian languages.
Meta has struggled with moderation in conflict zones before, most notably in Myanmar and Sri Lanka. The conflict in Tigray has raised new questions about how Facebook and other social media platforms handle hate speech in marginalized languages and, more broadly, in languages less dominant online than English.
“A multilingual language model can’t understand Amharic without the intermediary of English”
One approach platforms take is developing artificial intelligence tools to flag and remove hate speech. But for languages like Amharic and Tigrinya (the two most widely spoken languages in the Tigray region), there is often too little data to train a separate moderation model, forcing platforms to rely on systems trained across many languages at once. New research has raised concerns about the limitations of these multilingual language models, with potentially alarming implications for how platforms moderate in the Tigray region.
In AI research, Amharic is known as a “low-resource language.” This group includes some of the most-spoken languages in the world, like Urdu, but because they are less represented on the internet, there is less digitized text that can be used to train AI models tailored to them. As a result, automated moderation tools often deal with low-resource languages through a process called “cross-lingual transfer” — essentially translating the lessons of an English-trained system into a low-resource language like Urdu or Amharic.
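To make that mechanism concrete, here is a minimal sketch, in Python, of the shared multilingual embedding space that cross-lingual transfer relies on. It uses the publicly released xlm-roberta-base checkpoint and the Hugging Face transformers library purely for illustration; nothing here is drawn from the CDT report or from any platform’s actual moderation stack.

```python
# Sketch of the mechanism behind cross-lingual transfer: a multilingual
# encoder maps text from many languages into one shared vector space, so a
# classifier trained only on English examples can be pointed at Amharic
# vectors. Illustrative only; uses the public xlm-roberta-base checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

def embed(text: str) -> torch.Tensor:
    """Mean-pool the encoder's last hidden states into one sentence vector."""
    batch = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # shape: (1, tokens, dim)
    return hidden.mean(dim=1).squeeze(0)

english = embed("peace")
amharic = embed("ሰላም")  # "peace" in Amharic
print(torch.cosine_similarity(english, amharic, dim=0))
# Whatever classifier sits on top of these vectors inherits the strengths
# and blind spots of the shared space, most of which was shaped by English.
```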
A new study from the Center for Democracy and Technology (CDT), titled “Lost in Translation: Large Language Models in Non-English Content Analysis,” drives home just how easily that process can go wrong. A tool’s understanding of ethnic hate speech in other languages may be grounded in what counts as racist hate speech in an American English data set, resulting in more errors and false alarms.
“A multilingual language model can’t understand Amharic without the intermediary of English,” Gabriel Nicholas, a research fellow at CDT and co-author of the study, told Rest of World. “We have a lot of concerns about it when it’s used in something that is so language-specific, so culturally specific as content moderation.”
One example outlined in the paper showed that in English, references to a dove are often associated with peace. In Basque, a low-resource language, the word for dove (uso) is a slur used against feminine-presenting men. An AI moderation system built to flag homophobic hate speech, but dominated by English-language training data, may struggle to recognize when “uso” is being used as a slur.
The report does not draw any conclusions about Meta or the Tigray conflict specifically, but it does note particular issues with the Tigrinya language, which has more than 7 million speakers, the vast majority of them in Eritrea and northern Ethiopia. Some multilingual language models are missing entire characters of the Tigrinya alphabet, a Ge’ez-based script similar to the one used for Amharic. This leads to what machine-learning engineers call the “UNK problem,” referring to unknown words that fall outside the model’s vocabulary.
7 million: The number of Tigrinya speakers globally.
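One rough way to check whether a given model suffers from the UNK problem is to measure how often its tokenizer falls back to its unknown token on Tigrinya text. The sketch below, continuing the illustrative Python examples above, does that with the Hugging Face tokenizer API; the multilingual checkpoint and the sample sentence are assumptions chosen for demonstration and say nothing about what any particular platform runs.

```python
# Rough check for the "UNK problem": how often does a multilingual model's
# tokenizer map Tigrinya (Ge'ez-script) text to its unknown token?
from transformers import AutoTokenizer

def unk_rate(tokenizer, text: str) -> float:
    """Fraction of tokens the tokenizer could not represent at all."""
    tokens = tokenizer.tokenize(text)
    if not tokens:
        return 0.0
    return sum(t == tokenizer.unk_token for t in tokens) / len(tokens)

# One widely used multilingual checkpoint, chosen only as an example.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

tigrinya_sample = "ሰላም ከመይ ኣለኻ"  # a common Tigrinya greeting
print(f"UNK rate: {unk_rate(tokenizer, tigrinya_sample):.0%}")
# A high UNK rate means entire characters are invisible to the model, so a
# moderation classifier built on top of it has very little signal to work with.
```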
It is difficult to say how much of a difference better moderation on Facebook would have made in Tigray. But the Facebook Papers, released in late 2021, documented chronic moderation shortcomings in the region. And the ongoing legal effort to hold the company accountable for specific incidents of violence, like the targeted harassment campaign against Meareg Amare, drives home the lack of human moderators in some Ethiopian languages.
There’s significant evidence that Meta relies on multilingual AI in its moderation systems. In a 2021 congressional hearing, Mark Zuckerberg claimed the company uses AI to identify over 95% of “hate speech content.” That year, the company announced in a blog post that it had launched a system called Few-Shot Learner, which works across more than 100 languages. Meta claimed the model can identify and take action on harmful content that has little or no precedent on the platform.
That follows Meta’s earlier announcement, in 2019, of a model called XLM-R, which the company touted at the time as being trained in one language and then applied to other languages with no additional training data. It broadly stated that the model could improve moderation in “low-resource languages,” a conclusion thrown into doubt by the CDT paper.
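In outline, the recipe Meta described is the standard zero-shot cross-lingual transfer setup used throughout open-source NLP: fine-tune a multilingual encoder on labeled data in one language, then apply it elsewhere. The compressed sketch below uses the public xlm-roberta-base checkpoint with a tiny in-line data set that exists only to keep the example self-contained; it is not a reconstruction of Meta’s pipeline.

```python
# Zero-shot cross-lingual transfer in miniature: fine-tune on labeled
# *English* examples only, then apply the classifier to other languages
# with no further training. Toy data, toy training loop, illustration only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2  # 0 = benign, 1 = hateful
)

# Step 1: "train" on English labels (a single gradient step stands in for
# what would really be many epochs over a large annotated corpus).
english_texts = ["have a nice day", "I hate you"]
english_labels = torch.tensor([0, 1])
batch = tokenizer(english_texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=english_labels).loss
loss.backward()
optimizer.step()

# Step 2: apply the English-trained classifier to Amharic text, zero-shot.
model.eval()
with torch.no_grad():
    amharic_batch = tokenizer(["ሰላም ለሁሉም"], return_tensors="pt")  # "peace to all"
    scores = model(**amharic_batch).logits.softmax(dim=-1)
print(scores)
# Whether scores like these can be trusted for a language the classifier
# never saw labeled is exactly the question the CDT paper raises.
```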
Meta did not respond to a request for comment.
Social media companies are rarely transparent about how exactly their multilingual language models are put into action, or how effective they are at moderation after they are scaled across global operations. But Meta is not alone in investing in them. In 2022, Google’s Jigsaw published a paper on Perspective API, a similar model that can be used for classifying toxic comments. And in April, TikTok CEO Shou Zi Chew said the company was making major investments in AI to address its own moderation challenges.
Both Meta and Google have announced investments in large translation models that use similar methods of cross-lingual transfer, seemingly to boost the data available to them for AI development. In Google’s case, its universal speech model, announced in November 2022, would encompass the world’s 1,000 most-spoken languages in a single model.
“Companies cannot solve the problem of inequitable content moderation by putting a Band-Aid on it,” Aliya Bhatia, policy analyst at CDT and co-author of the paper, told Rest of World. “There are already people understanding the specific nuances of the language community and what’s at stake in a conflict,” she said, recommending that social media companies work with language communities and researchers already primed to do the difficult task of digitizing texts without relying on subpar machine translations.
It’s rare for big tech companies to train models tailored to specific low-resource languages, but some startups are taking on the challenge. A Berlin-based company named Lesan has built the first general machine translation service for Tigrinya. According to its co-founder and chief technology officer, Asmelash Teka Hadgu, social media companies should adopt a similar approach.
“In the case of Facebook in particular, it’s just full of harmful, hateful content in Amharic and Tigrinya,” he told Rest of World. The issue is close to home: Hadgu himself is from Tigray, as are many on the Lesan team. “Low resources in many of the languages we operate on is actually the most significant barrier for big tech companies or others who want to innovate in this space.”
Lesan is addressing that bottleneck by turning to offline resources, like books and magazines. The startup coordinated with Tigrinya-speaking communities to scan these print texts, building custom character recognition tools to turn them into a form readable by machines. Lesan has since used this grassroots data collection method to create a new benchmark data set for languages across the Horn of Africa. The startup has now partnered with the Distributed AI Research Institute (DAIR) to develop an open-source tool for identifying languages spoken in Ethiopia, and detecting harmful speech in them.
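Lesan’s character recognition tools are custom-built, but the general digitize-then-review idea can be sketched with off-the-shelf components. The snippet below, in the same illustrative Python vein as the earlier examples, uses the open-source Tesseract OCR engine via pytesseract and assumes its Tigrinya language data (tir) is installed; the file paths are hypothetical, and the human-review step that real projects depend on is only marked as a comment.

```python
# Minimal sketch of a print-to-text digitization step, in the spirit of the
# grassroots data collection described above. Lesan's actual pipeline uses
# custom character recognition tools; this uses off-the-shelf Tesseract OCR.
from pathlib import Path

import pytesseract
from PIL import Image

SCANS_DIR = Path("scans/tigrinya_magazines")  # hypothetical folder of page scans
OUTPUT_FILE = Path("corpus/tigrinya_raw.txt")

def ocr_page(image_path: Path, lang: str = "tir") -> str:
    """Run OCR on one scanned page, assuming the Tigrinya language pack."""
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)

OUTPUT_FILE.parent.mkdir(parents=True, exist_ok=True)
with OUTPUT_FILE.open("w", encoding="utf-8") as out:
    for page in sorted(SCANS_DIR.glob("*.png")):
        text = ocr_page(page)
        # Real projects insert native-speaker review here: raw OCR output for
        # Ge'ez-script text needs correction before it can become training
        # or benchmark data.
        out.write(text + "\n")
```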
But despite early success from startups like Lesan, tech companies are still committed to the multilingual approach, drawn by the possibility of capturing a world’s worth of languages in a single tool. “Social media companies are arguing that the most scalable solution is also the best solution. That’s a very convenient outcome,” said Nicholas. “What’s going to work best is going to work best for each different language … and we don’t even see social media companies even considering that.”