Physical education teacher arrested over deepfake of school principal's voice
In the United States, a physical education teacher was arrested and accused of using AI to clone the voice of his school's principal. He accessed OpenAI tools and Microsoft's Bing chat services through school computers, created the deepfake, and posted an audio recording of racist and anti-Semitic comments. The recording prompted a wave of hateful messages, a threat to the principal's family, and numerous calls from the public.

A Pikesville High School physical education teacher used artificial intelligence to fabricate a recording of the principal's voice in retaliation for an investigation into the misuse of school funds.
The Dajon Darien case
The story began when an audio recording of Pikesville High School Principal Eric Eiswerth apparently making racist and anti-Semitic comments surfaced on social media. The principal was suspended, but questions soon arose about the recording's authenticity.
Deepfake detection experts said there was overwhelming evidence that the voice had been created by artificial intelligence. They noted a flat tone, unusually clean background audio, and a lack of consistent breathing sounds or pauses. The experts also ran the clip through other AI-detection tools and concluded that it was a fake.
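To illustrate the kind of cue the experts describe, here is a minimal sketch of a pause-consistency check in Python. It assumes the librosa audio library, the file name is a placeholder, and it is only a toy heuristic under those assumptions, not the forensic tooling actually used in the case.

```python
import numpy as np
import librosa

# Load the clip under test (the file name here is a placeholder).
y, sr = librosa.load("suspect_clip.wav", sr=16000)

# librosa.effects.split returns the non-silent intervals in samples;
# the gaps between consecutive intervals approximate speech pauses.
intervals = librosa.effects.split(y, top_db=30)
gaps = [(intervals[i + 1][0] - intervals[i][1]) / sr
        for i in range(len(intervals) - 1)]

if not gaps:
    print("No pauses detected at all -- itself consistent with synthesis.")
else:
    print(f"{len(gaps)} pauses, mean {np.mean(gaps):.2f}s, "
          f"std {np.std(gaps):.2f}s")
    # Natural speech tends to show irregular pause lengths; a handful
    # of near-identical gaps (very low std) matches the "no consistent
    # breathing or pauses" pattern the experts describe.
```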
Following an investigation, it was determined that physical education teacher Dajon Darien had accessed OpenAI tools and Microsoft's Bing chat services through school computers, created a deepfake of the principal's voice, and distributed the audio through an email address and phone number associated with him.
Police arrested Darien and stated: “It is believed that Mr. Darien made the recording to take revenge on Principal Eiswerth, who at the time was conducting an investigation into the misuse of school funds. Eiswerth had determined that Darien entered a payment into the school's payroll system without following proper procedures.”
Darien was charged with theft of school funds, disruption of school operations, retaliation against a witness, and stalking, and was released after posting $5,000 bail.
“The audio clip had serious consequences. It not only led to the suspension of the school principal, but also sparked a wave of hate-filled social media posts and numerous calls from the public. The deepfake caused serious harm to Eiswerth, his family, and the students and staff of Pikesville High School.”
The problem of voice deepfakes
The first deepfakes created with artificial intelligence appeared only in 2018, yet the technique has already gained significant popularity.
“This story is far from the first case of neural networks being used to synthesize someone else's voice for malicious purposes. The technique has been in use since 2018, and every year this form of social engineering attack becomes more popular and easier for an attacker to carry out. The case is a perfect demonstration of how easily a fabricated harsh statement can shake a person's standing, cost them their job, and turn society against them, even if they have done nothing wrong.”
Interest in AI-powered voice cloning technology has grown over the past year as the services have become more human-sounding. For example, the political party of Imran Khan, the jailed former prime minister of Pakistan, used ElevenLabs to reproduce his voice during an election campaign, and two Texas organizations were linked to a fake robocall that impersonated President Joe Biden and told people not to vote.
Against this tense backdrop, OpenAI has decided not to release its AI text-to-speech platform, Voice Engine, to the public. Other AI voice generation tools, however, are widely available online, and as little as one minute of recorded speech can be enough to imitate a voice.
In Russia, voice deepfakes are only beginning to spread. Alexander Klevtsov, an information security expert at InfoWatch Group, reports that users are encountering voice deepfakes more and more often, and that the number of such incidents will grow. The technology is still fairly labor-intensive: to produce a high-quality imitation of a voice, attackers need a long recording, 30 minutes or more, but such recordings are easy to find on social networks.
Deepfakes are most often used for social engineering, including financial fraud: attackers pose as managers demanding that an employee urgently transfer money, or as relatives and friends who are supposedly in trouble. Not long ago, a similar incident happened to Sergei Bezrukov, who received a voice message, allegedly from Konstantin Khabensky, inviting him to invest in a project to build a boarding house. The deception was uncovered in time and the scammers got no money.
“The attackers' main goal is to create a situation of maximum stress and push the victim to act urgently and immediately, so that the person has no time to think.”
Alexander Klevtsov, information security expert at InfoWatch Group
The InfoWatch Group expert adds: “In such situations, the main recommendation is this: if you receive an unusual request to urgently transfer money or do something, hang up and call back on the real number of the boss or friend who supposedly asked for help. As a rule, the fraudulent scheme falls apart at this stage. With relatives, you can agree on a code word that only you know and use it over the phone to confirm that a real person is calling, not a scammer.”
According to Dmitry Anikin, senior data researcher at Kaspersky Lab, despite advances in technology, including tools that detect such content, the main line of defense remains developing critical thinking and improving digital literacy.
“Fraudulent schemes using deepfake technologies are becoming increasingly sophisticated. This is because the algorithms are steadily improving and more and more services capable of creating such fakes are appearing.”
The Kaspersky Lab expert notes that with audio deepfakes, you should listen for an unnaturally clean background, a robotic and uneven voice, and a lack of emotion and intonation. With fraudulent video fakes, the forgery can give itself away through unnaturally smooth head movements; it is also worth paying attention to skin tone and excessive smoothness of the skin, a lack of shine and reflections in the eyes, and the appearance of the teeth. The technology has already reached the point where it can be hard to tell at first glance whether the content in front of you is generated or genuine. In any case, it is important to stay vigilant and double-check information against alternative sources.
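As a rough illustration of the “unnaturally clean background” cue, the sketch below estimates a clip's ambient noise floor. It again assumes librosa, uses a placeholder file name and an arbitrary −40 dB threshold, and is only a toy check, not a real deepfake detector.

```python
import numpy as np
import librosa

# Load the clip under test (the file name is a placeholder).
y, sr = librosa.load("suspect_clip.wav", sr=16000)

# Frame-level RMS energy, expressed in dB relative to the loudest frame.
rms = librosa.feature.rms(y=y)[0]
db = librosa.amplitude_to_db(rms, ref=np.max)

# Treat frames more than 40 dB below the peak as "background".
background = db[db < -40.0]
if background.size == 0:
    print("No quiet frames found -- clip may be gated or heavily compressed.")
else:
    # A genuine recording made in a real room usually has a measurable
    # ambient noise floor; a floor sitting near digital silence is one
    # hint that the audio may have been generated rather than recorded.
    print(f"Estimated background level: {background.mean():.1f} dB "
          f"across {background.size} quiet frames")
```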