Incident 136: Brand Safety Tech Firms Falsely Claimed Use of AI, Blocking Ads Using Simple Keyword Lists
We’ve known for a long time that brand safety detection technologies are blunt instruments. Despite the claims of advanced AI (artificial intelligence) and ML (machine learning) in their sales materials, it was painfully clear that brand safety technology was blocking ads based on simple keyword lists. This is not a new phenomenon; it simply became easier to observe at the start of the pandemic, when ads on the front pages of the New York Times and the Wall Street Journal were blocked and replaced by “cloud ads” just because the page contained the word “covid-19” or “coronavirus.” So while advertisers were paying for these brand safety technologies in the hope of keeping their ads off of not-brand-safe sites, the technologies were instead blocking their ads on legitimate mainstream news sites. The result of this blocking is that more ads and dollars flow to lower quality sites in programmatic channels, the exact opposite of what advertisers thought would happen.
Getting Brand Safety Right is Hard
Don’t get me wrong. Getting brand safety right is hard. That’s because words have meaning, phrases may have different meanings, the English language is full of puns and idioms, and context matters. For example, the word “blood” may be undesirable or “not-brand-safe” on a consumer site, but the same word is fine, and needed, on a medical site. Others have documented that keyword blocking is a very common practice, and that it is often humorously flawed: “For example, ‘shooting’ is one of the most common blacklist terms. While it may identify some content about violence, that term will also block content by astronomy buffs (shooting stars), sports fans (shooting hoops), technology users (troubleshooting), photographers (shooting a photo) and card players (shooting the moon).” (Source: AdAge)
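The context-blind keyword matching described above can be sketched in a few lines. This is a minimal illustration of naive substring blocking, not any vendor’s actual implementation; the blocklist terms and page titles are assumptions drawn from the examples in the text.

```python
# A minimal sketch of naive keyword blocking. The blocklist terms and
# page titles are illustrative, not taken from any real vendor.

BLOCKLIST = {"shooting", "blood", "death"}

def is_brand_unsafe(page_text: str) -> bool:
    """Flag a page 'unsafe' if any blocklist term appears anywhere in its text."""
    text = page_text.lower()
    return any(term in text for term in BLOCKLIST)

# Context-blind matching produces exactly the false positives AdAge describes:
print(is_brand_unsafe("Troubleshooting your Wi-Fi router"))   # True: "shooting" inside "troubleshooting"
print(is_brand_unsafe("Best spots to watch shooting stars"))  # True: astronomy, not violence
print(is_brand_unsafe("How blood pressure is measured"))      # True: medical content
```

Because the matcher has no notion of word boundaries or context, every one of these harmless pages would be blocked, while genuinely unsafe pages that avoid the listed terms would pass.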
The tables above show examples of “content classifiers” seen in the largest ad network. Look closely at the domains and how they were classified. Some categories are clearly wrong, while others may simply be incomplete and imprecise; an entire site cannot be easily reduced to a single category, just as many songs do not fit neatly into one genre of music. We have reviewed cases in the past where entire sites were blocked or blacklisted because of not-brand-safe content on just a few pages. CBSnews.com being marked as “weapons,” for instance, is likely because some news item on the site contained the word “weapons.”
Quantifying the Negative Impact on Publishers’ Ad Revenue
We knew brand safety tech was bad (it didn’t work as well as promised) and was bad news (it lowers the ad revenue of legitimate publishers), but we didn’t know exactly how bad the situation was. New research published today by Adalytics helps quantify the negative impact of brand safety technologies on mainstream publishers’ ad revenues: “This exploratory study shows an estimated 21% of economist.com articles, 30% of nytimes.com, 30% of wsj.com, and 52% of articles on vice.com are being labeled as ‘brand unsafe’.” When a page is marked “brand_unsafe,” that signal is passed into real-time bidding. Marketers choosing to avoid unsafe environments would then have their ads blocked from these pages and sites.
But are the brand safety designations accurate? Previous evidence suggests each vendor’s labeling is problematic at best and entirely wrong at worst. By comparing different vendors’ designations, we can quantify how inaccurate their labeling is: for example, by counting how often two vendors categorize the same article differently, one as safe and the other as unsafe. Based on a sample of 25,730 wsj.com articles found in various internet archive repositories, it appears that Moat and Comscore disagree about 40% of the time when classifying an article as “safe” or “unsafe.”
In other words, “for every 10 articles on wsj.com, an average of four would receive conflicting brand safety values, i.e. be labeled as ‘safe’/‘unsafe’ or ‘unsafe’/‘safe’ by the two vendors.” The data also suggests that the brand safety detection technologies did not look for text inside images or analyze images for brand safety issues.
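The disagreement rate quoted above is straightforward to compute once you have both vendors’ labels for the same articles. In the sketch below, the labels for ten hypothetical articles are made up; only the vendor names and the roughly 40% figure come from the study.

```python
# A toy sketch of computing the vendor-disagreement rate. The study compared
# Moat and Comscore designations across 25,730 wsj.com articles; the ten
# labels below are invented for illustration.

def disagreement_rate(labels_a, labels_b):
    """Fraction of articles where two vendors' safe/unsafe labels conflict."""
    assert len(labels_a) == len(labels_b)
    conflicts = sum(a != b for a, b in zip(labels_a, labels_b))
    return conflicts / len(labels_a)

# Hypothetical labels for the same ten articles from each vendor:
moat     = ["safe", "safe",   "unsafe", "safe",   "unsafe", "safe", "safe",   "unsafe", "safe", "safe"]
comscore = ["safe", "unsafe", "unsafe", "unsafe", "safe",   "safe", "unsafe", "unsafe", "safe", "safe"]

print(disagreement_rate(moat, comscore))  # 0.4: four conflicting labels out of ten
```

A 40% disagreement rate between two vendors measuring the same thing implies that at least one of them is wrong a large fraction of the time, regardless of which one you trust.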
Outsized Negative Impact on Serious Topics and High Traffic Pages
Finally, data from the study shows that “journalists who focus on certain ‘serious’ topics, such as Middle East affairs, obituaries, or certain political events, are disproportionately likely to have their work marked as ‘unsafe’ by brand safety vendors.” This causes ads to be blocked and these mainstream news sites to be defunded. Even a recent article about the death of Argentine football legend Diego Maradona was marked as ‘moat_unsafe’ and ‘gv_death_injury’. Biomedical articles that use the term “cell death” as a synonym for apoptosis are similarly flagged as “brand_unsafe” because of the word “death.”
Beyond the percentage of all pages defunded by brand safety tech, the location of the blocking matters: if high traffic pages like the homepage are blocked as “brand_unsafe,” the impact on ad revenues is disproportionately higher than if low volume article pages are blocked. Mainstream sites’ homepages are by far the highest trafficked pages of the entire website. The graphic from Adalytics below shows the homepage of the New York Times and the articles linked from it. Note the articles marked with red boxes; those are marked as “brand_unsafe” by brand safety vendors. The negative impact of brand safety tech on mainstream publishers’ sites is visible and significant. Marketers’ use of brand safety tech is defunding legitimate sites and funding bad ones, the opposite of the intended effect.
So What? What Can Marketers Do Instead?
Stop wasting money on brand safety tech that does not work and is actively causing harm. As we’ve seen above, the use of brand safety detection vendors is bad for publishers: ads blocked due to incorrect brand safety measurements mean those publishers lose ad revenue. It is also bad for advertisers: their ads are blocked from mainstream websites, so the ads and ad budgets flow to worse sites. Paying for brand safety technology that doesn’t work well, or doesn’t work at all, is throwing good money after bad. It is solving a problem that wouldn’t exist if marketers bought ads from mainstream, good publishers in the first place, instead of spraying their ads across millions of long tail sites that no one has ever seen or visited. Trying to detect your way out of trouble with faulty, rudimentary technology is a fool’s errand. Brand safety detection vendors will be the first to insist you have to keep paying for their tech or else your ads may end up next to a terrorist beheading video. But your ads would not end up there if you didn’t place ads on UGC (user generated content) in the first place, and the same goes for the millions of long tail sites in programmatic channels.
For starters, marketers should move to strict “include lists” of sites that have real human audiences. As a simple thought exercise, ask yourself how many sites humans actually know about and visit regularly. Examine your own experience: how many sites do you visit every day? Marketers can save money by NOT renewing and NOT paying for brand safety detection tech that does more harm than good, and instead run ads on real, good, mainstream publishers. Common sense will tell you that ads running alongside an Economist article about the death of soccer legend Diego Maradona are not “brand_unsafe,” even if BS detection tech marks them as such.
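The “include list” approach is simple to express in code. This is a minimal sketch assuming a hypothetical bid-filtering step; the domains shown are illustrative examples from this article, not a recommended list.

```python
# A minimal sketch of the "include list" (allowlist) approach: only bid on
# ad inventory from explicitly vetted publishers. The list is illustrative.

INCLUDE_LIST = {"nytimes.com", "wsj.com", "economist.com"}

def should_bid(domain: str) -> bool:
    """Bid only when the domain is on the vetted publisher list."""
    return domain.lower() in INCLUDE_LIST

print(should_bid("wsj.com"))              # True: vetted mainstream publisher
print(should_bid("random-longtail.xyz"))  # False: unknown site, no bid
```

Unlike keyword blocking, an include list fails closed: an unknown long tail site gets no money by default, while vetted mainstream publishers are never accidentally defunded by a stray word on the page.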