Abstract
The ability to quantify incivility online, in news, and in congressional debates is of great interest to political scientists. Computational tools for detecting online incivility in English are now fairly accessible and could potentially be applied more broadly. We test the Jigsaw Perspective API for its ability to detect the degree of incivility on a corpus that we developed, consisting of manual annotations of civility in American news. We demonstrate that toxicity models, as exemplified by Perspective, are inadequate for the analysis of incivility in news. We carry out an error analysis that points to the need to develop methods for removing spurious correlations between incivility and words frequently mentioned in the news, especially identity descriptors. Without such improvements, applying Perspective or similar models to news is likely to lead to wrong conclusions that are not aligned with the human perception of incivility.
1 Introduction
Surveys of public opinion report that most Americans think that the tone and nature of political debate in this country have become more negative and less respectful and that the heated rhetoric by politicians raises the risk of violence (Pew Research Center, 2019). These observations motivate the need to study (in)civility in political discourse in all spheres of interaction, including online (Ziegele et al., 2018; Jaidka et al., 2019), in congressional debates (Uslaner, 2000), and as presented in news (Meltzer, 2015; Rowe, 2015). Accurate automated means for coding incivility could facilitate this research, and political scientists have already turned to using off-the-shelf computational tools for studying civility (Frimer and Skitka, 2018; Jaidka et al., 2019; Theocharis et al., 2020).
Computational tools, however, have been developed for different purposes, focusing on detecting language in online forums that violates community norms. The goal of these applications is to support human moderators by promptly focusing their attention on likely problematic posts. When studying civility in political discourse, it is primarily of interest to characterize the overall civility of interactions in a given source (e.g., news programs) or domain (e.g., congressional debates), as an average over a period of interest. Applying off-the-shelf tools for toxicity detection is appealingly convenient, but such use has not been validated for any domain; uses in support of moderation efforts have been validated only for online comments.
We examine the feasibility of quantifying incivility in the news via the Jigsaw Perspective API, which has been trained on over a million online comments rated for toxicity and deployed in several scenarios to support moderator effort online (https://www.perspectiveapi.com/).
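For readers unfamiliar with the API: Perspective is queried over HTTP with a JSON body naming the attributes to score, and it returns a probability-like summary score per attribute. A minimal sketch in Python follows; the endpoint and request/response shape follow the public Comment Analyzer API, while the helper function names are our own and a real API key is required to actually issue requests.

```python
import json
import urllib.request

API_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_payload(text: str) -> dict:
    # JSON body for a single-attribute (TOXICITY) scoring request
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity_score(response: dict) -> float:
    # Perspective returns a summary score in [0, 1] per requested attribute
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def score_text(text: str, api_key: str) -> float:
    # Issue the actual request; requires a valid Perspective API key
    req = urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return toxicity_score(json.load(resp))
```

Characterizing a news program would then amount to averaging `score_text` over its transcribed segments for the period of interest, which is exactly the aggregate use this paper argues has not been validated.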
We collect human judgments of the (in)civility in one month's worth of three American news programs. We show that while people perceive significant differences between the three programs, Perspective cannot reliably distinguish between the levels of incivility manifested in these news sources. We then turn to diagnosing the reasons for Perspective's failure. Incivility is more subtle and nuanced than toxicity, which includes identity slurs, profanity, and threats of violence alongside other unacceptable language. In the range from civil to borderline-civil human judgments, Perspective gives noisy predictions that are not indicative of the differences in civility perceived by people. This finding alone suggests that averaging Perspective scores to characterize a source is unlikely to yield meaningful results. To pinpoint some of the sources of noise in the predictions, we characterize individual words as likely error triggers or sub-error triggers that lead to over-prediction of toxicity.
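The trigger-detection procedure is detailed in the full paper; as a rough illustration of the data-driven idea, one can flag words whose presence coincides with elevated model scores on text that humans rated civil. The sketch below is our own minimal illustration, not the paper's exact method; the threshold values and function names are assumptions.

```python
from collections import defaultdict
from statistics import mean

def find_trigger_words(sentences, scores, min_count=2, gap=0.2):
    """Flag words associated with elevated model scores.

    sentences: list of token lists (e.g., pre-filtered to human-rated-civil text)
    scores:    parallel list of model toxicity scores in [0, 1]
    Returns words occurring at least `min_count` times whose mean score
    exceeds the corpus mean by at least `gap`.
    """
    overall = mean(scores)
    by_word = defaultdict(list)
    for tokens, score in zip(sentences, scores):
        for word in set(tokens):  # count each word once per sentence
            by_word[word].append(score)
    return sorted(
        word for word, vals in by_word.items()
        if len(vals) >= min_count and mean(vals) - overall >= gap
    )

# Toy example: an identity term co-occurring with high scores gets flagged
sents = [["the", "muslim", "community", "met"],
         ["the", "council", "met"],
         ["a", "muslim", "leader", "spoke"],
         ["a", "leader", "spoke"]]
model_scores = [0.8, 0.1, 0.7, 0.1]
print(find_trigger_words(sents, model_scores))  # → ['muslim']
```

Run on civil-rated news text, words surfaced this way are candidate spurious correlates rather than genuine markers of incivility, which is why human review of the list remains necessary.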
We discover notable anomalies, where words quite typical in neutral news reporting are confounded with incivility in the news domain. We also discover that the mention of many identities, such as Black, gay, Muslim, feminist, etc., triggers high incivility predictions. This occurs despite the fact that Perspective has been modified specifically to minimize such associations (Dixon et al., 2018a). Our findings echo results from gender debiasing of word representations, where bias is removed as measured by a fixed definition but remains present when probed differently (Gonen and Goldberg, 2019). This common error—treating the mention of identity as evidence for incivility—is problematic when the goal is to analyze American political discourse, which is very much marked by us-vs-them identity framing of discussions.
These findings will serve as a basis for future work in debiasing systems for incivility prediction, while the dataset of incivility in American news will support computational work on this new task.
Our work has implications for researchers of language technology and political science alike. For those developing automated methods for quantifying incivility, we pinpoint two aspects that require improvement in future work: detecting triggers of incivility over-prediction and devising methods to mitigate the errors in prediction. We propose an approach for data-driven detection of error triggers; devising mitigation approaches remains an open problem. For those seeking to contrast civility in different sources, we provide compelling evidence that state-of-the-art automated tools are not appropriate for this task. The data and (in)civility ratings will be of use to both groups as test data for future models of civility prediction (https://github.com/anushreehede/incivility_in_news).
Full report: https://arxiv.org/pdf/2102.03671.pdf
9 Conclusion
The work we presented was motivated by the desire to apply off-the-shelf methods for toxicity prediction to analyze civility in American news. These methods were developed to detect rude, disrespectful, or unreasonable comments that are likely to make readers leave an online discussion. To validate the use of Perspective for quantifying incivility in the news, we created a new corpus of perceived incivility in the news. On this corpus, we compare human ratings and Perspective predictions. We find that Perspective is not appropriate for such an application, yielding misleading conclusions for sources that are mostly civil but between which people perceive a significant overall difference, for example because one uses sarcasm to express incivility. Perspective is able to detect less subtle differences in levels of incivility, but in a large-scale analysis that relies on Perspective exclusively, it would be impossible to know which differences reflect human perception and which do not.
We find that Perspective’s inability to differentiate levels of incivility is partly due to the spurious correlations it has formed between certain non-offensive words and incivility. Many of these words are identity-related. Our work will facilitate future research efforts on debiasing automated predictions. Such methods start from a list of words whose association with a given outcome the system has to unlearn. In prior work, the lists of words to debias came from informal experimentation with predictions from Perspective. Our work provides a mechanism for creating such a list in a data-driven fashion, with only minimal human intervention; it can discover broader classes of bias than people performing ad-hoc experiments can come up with.
A considerable portion of the content marked as uncivil by people is not detected as unusual by Perspective. Sarcasm and a high-brow register in the delivery of uncivil language are at play here, and detecting them will require the development of new systems.
Computational social scientists are well advised not to use Perspective for studies of incivility in political discourse, because it has clear deficiencies for such an application.