AI Incident Database

Report 3196

An Indian politician says scandalous audio clips are AI deepfakes. We had them tested
restofworld.org · 2023

A political controversy rocked the southern Indian state of Tamil Nadu in April when K. Annamalai, state head of the Bharatiya Janata Party (BJP), India's ruling party, released a controversial audio recording of Palanivel Thiagarajan, a lawmaker from the Dravida Munnetra Kazhagam (DMK), the party currently in power in the state.

In the 26-second low-quality audio tape, Thiagarajan, who was the finance minister of Tamil Nadu at the time, could allegedly be heard accusing his own party members of illegally amassing $3.6 billion. Thiagarajan vehemently denied the veracity of the recording, calling it “fabricated” and “machine-generated.” 

“NEVER trust an Audio clip without an attributable source,” Thiagarajan tweeted on April 22. He argued that it’s now easy to fabricate voices, citing a news clip on the infamous AI-generated songs of Drake and The Weeknd. 

On April 25, Annamalai released a second clip — 56 seconds long, and with much clearer audio — where Thiagarajan allegedly spoke disparagingly of his own party and praised the BJP. This time, Thiagarajan called it a desperate attempt by a “blackmail gang” to create a political rift within his own party, and said no one had claimed ownership of the source of the clips. The BJP’s Hindu nationalist politics have gained little traction in India’s southern states, and the party has been trying to make inroads into Tamil Nadu through aggressive campaigns. The purported audio leaks are part of a longer list of what is now known as the DMKFiles — a series of corruption allegations Annamalai has leveled against the ruling party. He recently promised to release more such files.

While experts have rattled off multiple alarming scenarios on how AI could play out in politics, in India, this could be the first high-profile case of the “liar’s dividend” — the ability of the powerful to dismiss genuine but unflattering footage as fake. Deepfake experts told Rest of World the rise of AI is being used as a ruse to sow information uncertainty in a new political era. On the one hand, they said, generative AI has the potential to tarnish reputations and manipulate public opinion; on the other, the technology could be a way to evade accountability by dismissing any incriminating evidence as fake.

“We are seeing more generative AI/deepfake/synthetic media content, as well as more claims of fake when not,” Sam Gregory, executive director at the nonprofit Witness, which studies the use of deepfakes to defend human rights, told Rest of World over email. Gregory noted instances in Myanmar where the army had challenged real evidence of human rights violations as fake. More recently, in May, a Twitter account belonging to Sudan’s Rapid Support Forces leader Mohamed Hamdan Dagalo posted an audio recording allegedly from the general. Rumors had circulated online that Dagalo was dead, and social media users speculated whether the audio was AI-generated. A forensic analysis facilitated by Witness, however, determined that the Arabic recording was very likely authentic. 

Rest of World shared the two audio clips released by Annamalai with the Deepfakes Rapid Response Force for forensic analysis. An initiative by Witness, the Deepfakes Rapid Response Force connects local networks of journalists and fact-checkers with leading media forensics and deepfake experts. The program facilitated three independent tests of the clips. The analysts were divided on the first clip, either finding it too poor in quality to come to a conclusion, or judging that the clip was “very likely fake.” However, they all agreed on the second clip, deeming it authentic.

“It would be extremely difficult for any current text-to-speech system to generate English-speaking audio with this southern Indian accent at this level of fidelity,” Rijul Gupta, CEO of DeepMedia, an AI generation and detection company, told Rest of World about the second clip. “The voice does not contain artifacts that are associated with voice-swapping algorithms (e.g. speech-to-speech instead of text-to-speech). For these reasons, our AI experts conclude that Clip 2 is an authentic voice,” the DeepMedia team noted over email. DeepMedia’s deepfake detectors are currently used by the U.S. Department of Defense to ascertain if internet media emerging out of Russia and China is authentic.

After a human evaluation, DeepMedia ran the clips through their proprietary in-house deepfake detection tool, which also showed with 87% confidence that the second clip is authentic. However, DeepMedia’s evaluation of the first clip came back as inconclusive. “The level of noise in this clip makes it difficult to determine its authenticity. Although the voice sounds as though it may be AI-generated, we cannot conclusively confirm or deny its authenticity at this time,” the team said.

According to a six-member team of deepfake researchers at the University of Naples Federico II led by Luisa Verdoliva, “The second clip is considered authentic [under the threshold] for the whole duration of the clip.” They collected 50 minutes of pristine audio of Thiagarajan — from three speeches uploaded on YouTube, including his talk at the Oxford Union. They then compared the audio to the two clips. “The first clip turns out to be authentic for the first segment then the distance increases. However this could be due to the fact that the clip is quite noisy,” the team said over an email.
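The general approach the Naples team describes — building a reference profile from known-genuine speech and then measuring how far a questioned clip drifts from it — is a standard pattern in speaker verification. The sketch below is purely illustrative, not the team's actual pipeline: the `verify_speaker` helper, the embedding vectors, and the 0.4 threshold are all assumptions for the example (real systems derive embeddings from a trained voice model and calibrate the threshold empirically).

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between two voice embeddings (0 = same direction, up to 2)."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify_speaker(reference_embeddings, clip_embedding, threshold=0.4):
    """Compare a questioned clip against embeddings from known-genuine speech.

    Returns True ("consistent with the known speaker") when the mean
    distance to the reference set stays under the decision threshold.
    The 0.4 threshold here is illustrative, not a calibrated value.
    """
    distances = [cosine_distance(ref, clip_embedding)
                 for ref in reference_embeddings]
    return float(np.mean(distances)) < threshold

# Toy 2-D "embeddings" standing in for vectors a real voice model would produce.
references = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]
print(verify_speaker(references, np.array([1.0, 0.05])))  # close to references
print(verify_speaker(references, np.array([0.0, 1.0])))   # far from references
```

A noisy recording, like the first clip, degrades the embeddings themselves, which is why a rising distance (as the Naples team observed) can reflect either manipulation or simply poor audio quality.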

A third evaluation by deepfake detection company Reality Defender, led by Ali Shahriyari, used in-house audio detection models as well as human experts. It called the first clip “very likely fake.” According to the team, “Several spoken words are lost, as if the voice converter failed to grasp the right pronunciations (or did this deliberately to give an impression it’s authentic/real).” Reality Defender also concluded that the second clip is likely authentic. “Strong naturality in the speech content (e.g. broken sentences with filler words that would make sense in a conversation but not entirely so when written down), changes in emotion that are hard to model with fake speech generators, and overall good quality of the speech content (e.g. native pronunciation),” the team noted over email.

Rest of World reached out to Thiagarajan and Annamalai’s offices but did not receive a response by the time of publishing.

Thiagarajan released his own audio analysis of the first clip, citing the audio distortions as evidence of it being a deepfake. He also mentioned a 2020 story by Vice — where an Indian politician cloned his voice and created an election campaign video in English and Haryanvi — as evidence of how misleading video and audio can be created. In a second statement, Thiagarajan said the BJP was “using advanced technologies and cheap tactics such as releasing these fabricated audios to disrupt our good work.”

The current explosion of generative AI has made the public second-guess any evidence, and wonder if it’s real or fake. “The bigger threat now is that everyone has plausible deniability, right?” Gupta of DeepMedia said. “Anyone can just say this is fake, it never happened. And even if it’s true, their supporters might believe that. That’s insanely problematic.” This idea of the “liar’s dividend” has put tremendous pressure on journalists and fact-checkers, who now have to be selective about which story they want to chase down and verify, Gregory said.

In May, an AI-manipulated image of two Indian wrestlers — who were smiling while being arrested for protesting against a BJP politician for sexual harassment — went viral. Social media users claimed the wrestlers were not serious about the protest, according to reports.

“There is a detection equity gap that exists in the world,” Gregory said. “The tools to detect synthetic media manipulation are not available to the people who need it the most. They’re not evolved to a broad global range of journalists and fact-checkers, alongside the skills to use them.” However, Gupta expects more independent firms will develop detection tools as generative AI evolves. “It is going to be a cat-and-mouse game.”
