Associated Incidents
An investigation into the theft of $35 million from a bank in the United Arab Emirates in January 2020 found that deepfake voice technology was used to impersonate a company director known to a bank branch manager, who then authorized the transactions.
The crime took place on January 15, 2020, and is described in a request (PDF) submitted by the UAE to US state authorities for help in tracing a portion of the stolen funds that were sent to the United States.
The request states that the manager of the unnamed bank branch in the UAE received a phone call in a voice he recognized, which, together with accompanying emails from a lawyer named Martin Zelner, convinced him to disburse the money, ostensibly in connection with a company acquisition.
The request reads:
"According to UAE authorities, on January 15, 2020, the branch manager of the Victim Company received a call claiming to be from the company's headquarters. The caller sounded like the Director of the company, so the branch manager believed the call was legitimate.
"The branch manager also received several emails that he believed were from the Director in connection with the call. The caller and the emails informed the branch manager that the Victim Company was about to be acquired by another company, and that a lawyer named Martin Zelner (Zelner) had been authorized to coordinate the acquisition process."
The branch manager then received emails from Zelner, along with a letter of authorization from the putative Director, whose voice was familiar to the victim.
Voice Fraud Detected
Emirati investigators subsequently determined that voice-faking technology had been used to imitate the voice of the company's Director:
"The UAE investigation revealed that the defendants used 'deep voice' technology to mimic the voice of the Director. In January 2020, funds were transferred from the Victim Company to several bank accounts in other countries in a sophisticated scheme involving at least 17 known and unknown defendants. UAE authorities traced the movement of this money through multiple accounts and identified two transfers to the United States.
"On January 22, 2020, two transfers of USD 199,987.75 and USD 215,985.75 were made from two of the defendants to Centennial Bank account numbers xxxxx7682 and xxxxx7885, respectively, located in the United States."
No further details were available on the crime, which is the second known instance of deepfake voice-based financial fraud. The first occurred nine months earlier, in March 2019, when a manager at a UK energy company was deceived by a phone call imitating the voice of his boss, who requested an urgent transfer of €220,000 ($243,000), which the manager duly made.
Voice Impersonation Development
Voice impersonation involves training a machine learning model on hundreds, or even thousands, of samples of a 'target' voice (the voice to be imitated). The most accurate match can be achieved by training the target voice directly against the voice of the person who will speak in the proposed content, although the model will then be 'fitted' to the specific person impersonating the target.
The most active online community for voice cloning developers is the Audio Fakes Discord server, which hosts numerous channels devoted to voice cloning frameworks such as Google's Tacotron-2, TalkNet, ForwardTacotron, Coqui-ai-TTS, and Glow-TTS, among others.
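To give a sense of how accessible this tooling has become, below is a minimal sketch using the open-source Coqui-ai-TTS library mentioned above. Note that while training a dedicated model requires the hundreds or thousands of samples described earlier, some models distributed with the toolkit (such as YourTTS) support zero-shot cloning from a short reference clip. The model name follows the project's published catalog, and the file paths are illustrative placeholders; exact APIs may vary between releases.

```python
# Minimal voice-cloning sketch with the open-source Coqui TTS toolkit.
# "target_voice.wav" and "cloned_output.wav" are placeholder paths;
# the API shown follows the project's documented Python interface.
from TTS.api import TTS

# YourTTS is a multilingual model that supports zero-shot voice cloning
# from a short reference recording of the target speaker.
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts")

tts.tts_to_file(
    text="This is a demonstration of synthetic speech.",
    speaker_wav="target_voice.wav",  # a few seconds of the voice to imitate
    language="en",
    file_path="cloned_output.wav",
)
```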
Real-Time Deepfakes
Since phone conversations are interactive, voice fraud cannot plausibly be carried out with high-quality, pre-rendered ('baked') clips, and in both of these fraud cases we can reasonably assume that the speakers were using a live, real-time deepfake framework.
Real-time fakes have come into focus recently with the arrival of DeepFaceLive, a real-time implementation of the popular DeepFaceLab deepfake suite, which can superimpose celebrity or other identities onto live webcam footage. While users of the Audio Fakes Discord and the DeepFaceLab Discord are keenly interested in combining the two technologies into a single video-and-audio deepfake framework, no such product has publicly emerged yet.
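To illustrate the real-time constraint described above, the sketch below (an assumed setup, not drawn from either case) uses the sounddevice library to run a duplex audio loop in short chunks. The convert_voice function is a hypothetical placeholder where a streaming voice-conversion model would sit; here it simply passes audio through unchanged.

```python
# Conceptual sketch of why interactive calls demand real-time processing:
# audio must be captured, converted, and played back in small chunks with
# minimal latency. convert_voice() is a hypothetical placeholder (an
# identity pass-through), NOT an actual voice-conversion model.
import numpy as np
import sounddevice as sd  # pip install sounddevice

SAMPLE_RATE = 16_000   # Hz
BLOCK = 1_600          # 100 ms of audio per chunk

def convert_voice(chunk: np.ndarray) -> np.ndarray:
    """Placeholder for a streaming voice-conversion model (hypothetical)."""
    return chunk  # identity pass-through in this sketch

def callback(indata, outdata, frames, time, status):
    # Called by the audio driver for every 100 ms block; any model plugged
    # in here must finish well within that budget to keep a call interactive.
    if status:
        print(status)
    outdata[:] = convert_voice(indata)

# Duplex stream: microphone in, (converted) speech out.
with sd.Stream(samplerate=SAMPLE_RATE, blocksize=BLOCK,
               channels=1, dtype="float32", callback=callback):
    sd.sleep(10_000)  # run for 10 seconds
```

The chunked, callback-driven structure is the crux: unlike a 'baked' clip, every block of speech has to be synthesized within a fixed latency budget while the conversation is still unfolding.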