AI voice cloning used in a R517 million heist in the UAE

Beware of artificial intelligence (AI). That is what most tech professionals are saying.

According to Forbes, AI voice cloning was used in a major heist in the United Arab Emirates (UAE).

The fraud has cost a Dubai-based company $35 million (R517 million), according to investigators.

HOW DID IT HAPPEN?

In early 2020, a bank manager in the United Arab Emirates received a call from a gentleman whose voice he recognised — a director at a company with whom he’d spoken before.

According to authorities, the director had good news and wanted to make an acquisition.

He apparently needed the bank to authorise some transfers to the value of $35 million.

According to Forbes, a lawyer named Martin Zelner had been hired to co-ordinate the procedures, and the bank manager could see emails in his inbox from the director and Zelner confirming what money needed to move where.

The bank manager began making the transfers.

He did not know that he’d been duped as part of an elaborate scam, in which cybercriminals had used “deep voice” technology to clone the director’s speech.

UAE authorities believe the scheme involved at least 17 individuals.

NOT THE FIRST TIME

In 2019, voice-mimicking software was used in a major theft, according to the Washington Post.

Thieves used voice-mimicking software to imitate a company executive’s speech and dupe his subordinate into sending hundreds of thousands of dollars to a secret account, the company’s insurer said, in a remarkable case that some researchers are calling one of the world’s first publicly reported artificial-intelligence (AI) heists.

The managing director of a British energy company, believing his boss was on the phone, followed orders in March to wire more than $240 000 to an account in Hungary, said representatives from the French insurance giant Euler Hermes.

The request was “rather strange,” the director noted later in an email, but the voice was so lifelike that he felt he had no choice but to comply. The insurer, whose case was first reported by the Wall Street Journal, later provided new details on the theft to The Washington Post, including an email from the employee tricked by what the insurer refers to internally as “the false Johannes”.

Now being developed by a wide range of Silicon Valley titans and AI start-ups, such voice-synthesis software can copy the rhythms and intonations of a person’s voice and be used to produce convincing speech. Tech giants such as Google and smaller firms such as the “ultra-realistic voice cloning” start-up Lyrebird have helped refine the resulting fakes and made the tools more widely available, free for unlimited use.

But the synthetic audio and AI-generated videos, known as “deepfakes,” have fuelled growing anxieties over how the new technologies can erode public trust, empower criminals and make traditional communication — business deals, family phone calls, presidential campaigns — that much more vulnerable to computerised manipulation.

“Criminals are going to use whatever tools enable them to achieve their objectives cheapest,” said Andrew Grotto, a fellow at Stanford University’s Cyber Policy Center and a senior director for cybersecurity policy at the White House during the Obama and Trump administrations.

“This is a technology that would have sounded exotic in the extreme 10 years ago, now being well within the range of any lay criminal who’s got creativity to spare,” Grotto added.

Developers of the technology have pointed to its positive uses, saying it can help humanise automated phone systems and help mute people speak again. But its unregulated growth has also sparked concern over its potential for fraud, targeted hacks and cybercrime.

Researchers at the cybersecurity firm Symantec said they have found at least three cases of executives’ voices being mimicked to swindle companies. Symantec declined to name the victim companies or say whether the Euler Hermes case was one of them, but it noted that the losses in one of the cases totalled millions of dollars.

The systems work by processing a person’s voice and breaking it down into components, like sounds or syllables, that can then be rearranged to form new phrases with similar speech patterns, pitch and tone. The insurer did not know which software was used, but a number of the systems are freely offered on the web and require little sophistication, speech data or computing power.
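
For readers who want to picture the "break down and rearrange" idea described above, here is a deliberately simplified sketch. It is not the software used in either case, which the insurer could not identify. It assumes a recording that has already been split into labelled syllables, and it uses short lists of numbers as stand-ins for real audio. The function names `build_inventory` and `assemble` are hypothetical, chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# A "unit" is a syllable label plus stand-in audio samples (assumed format).
Unit = Tuple[str, List[float]]


def build_inventory(recording: List[Unit]) -> Dict[str, List[float]]:
    """Break a labelled recording into reusable components, keyed by syllable."""
    inventory: Dict[str, List[float]] = {}
    for label, samples in recording:
        # Keep only the first example of each syllable to keep the toy simple.
        inventory.setdefault(label, samples)
    return inventory


def assemble(inventory: Dict[str, List[float]], syllables: List[str]) -> List[float]:
    """Rearrange stored components into a new phrase by joining them in order."""
    missing = [s for s in syllables if s not in inventory]
    if missing:
        raise KeyError(f"no recorded example for: {missing}")
    output: List[float] = []
    for syllable in syllables:
        output.extend(inventory[syllable])
    return output


# Toy "recording" of the phrase "good morning today" (numbers are placeholders).
recording: List[Unit] = [
    ("good", [0.1, 0.2]),
    ("morn", [0.3]),
    ("ing", [0.4]),
    ("to", [0.5]),
    ("day", [0.6, 0.7]),
]

inventory = build_inventory(recording)

# A phrase the speaker never said, built from pieces of what they did say.
new_phrase = assemble(inventory, ["good", "day"])
```

The sketch only joins stored pieces end to end. Matching pitch, tone and speech patterns, as the paragraph above describes, takes far more than this, and many newer systems instead learn a statistical model of the voice rather than splicing recordings.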

IOL