
AI company hired to clone Mayor Adams' voice linked to Biden deepfake, researchers say

The co-founder of a company that specializes in voice security said detecting synthetic speech in the realm of political messaging is going to become increasingly important so voters can stay informed

NBC Universal, Inc.

What to Know

  • Mayor Eric Adams in October announced he’d hired ElevenLabs, a London artificial intelligence start-up, to create a series of robocalls that made it seem like Adams could speak several foreign languages.
  • In January, voters in New Hampshire received a deepfake robocall of President Joe Biden designed to trick them into staying home instead of voting in the primary election.
  • Researchers at another AI company say ElevenLabs is also the company used to recreate Biden's voice.

Last October, Mayor Eric Adams announced he’d hired ElevenLabs, a London artificial intelligence start-up, to create a series of robocalls that made it seem like Adams could speak several foreign languages. The idea, the mayor said, was to enable him to speak directly to people in their native languages about city services and opportunities.

“People stop me on the street all the time, and say, ‘I didn’t know you speak Mandarin,’” Adams said at the time.

Three months later, ElevenLabs is in the news again — this time, for an alleged connection to a political dirty trick.

Researchers at a rival artificial intelligence firm say they’ve all but confirmed that someone used the ElevenLabs AI platform earlier this month to create the now-infamous deepfake robocall designed to trick New Hampshire citizens into thinking President Biden wanted them to stay home instead of voting in the state’s primary election.

According to Pindrop, a company that specializes in voice security, the firm’s “deepfake detection engine found, with a 99% likelihood, that this deepfake is created using ElevenLabs or a TTS (Text To Speech) system using similar components.”

In a blog post discussing the methodology used to investigate the Biden deepfake, Vijay Balasubramaniyan, Pindrop’s co-founder, said detecting synthetic speech in the realm of political messaging is going to become increasingly important so voters can stay informed. He noted that it was likely a user of the platform, not ElevenLabs itself, who created the phony Biden recording.

“Even though the attackers used ElevenLabs this time, it is likely to be a different Generative AI system in future attacks,” Balasubramaniyan wrote. “Hence it is imperative that there are enough safeguards available in these tools to prevent nefarious use.”

The I-Team reached out multiple times to ElevenLabs, but the company has not yet responded. 

The Republican and Democratic parties held primaries in New Hampshire on Tuesday.

On Feb. 6, the New Hampshire Attorney General announced its Election Law Unit had determined the source of the Biden deepfake robocall was a Texas company called Life Corporation and its principal, Walter Monk. Neither Monk nor a representative for Life Corporation immediately responded to the I-Team’s requests for comment.

The Attorney General’s news release did not include a conclusion about what voice cloning platform was used to create the deepfake.

“The Election Law Unit is also aware of media reports that the recorded message was likely made using software from ElevenLabs,” the statement read.  “At this time the Unit is continuing to investigate and cannot confirm whether that reporting is accurate.”

After Mayor Adams announced his use of ElevenLabs to produce those foreign language robocalls, some AI watchdogs criticized City Hall, saying the use of voice cloning by public officials should involve more oversight. Julia Stoyanovich, an Associate Professor of Computer Science at NYU who focuses on the ethics of machine learning, said AI-generated government messages should always come with bold disclosures that they are not real human voices.

“I don’t think we should be releasing - and politicians in particular, and elected officials like our Mayor - should be releasing machine generated content without an explicit statement that the content is machine generated,” Stoyanovich said. 

For two months, the I-Team has been asking Mayor Adams to provide copies of all the AI-generated robocalls featuring him speaking foreign languages. Despite the audio having been paid for with public money, City Hall has shared only the Spanish version.

Mayor Adams did not respond to specific questions about why the recordings are being withheld from the public, though City Hall argued his use of voice cloning has been fully transparent because Adams proactively mentioned the foreign language robocalls in front of journalists last October when he announced an “action plan” for responsible AI use in NYC government.

That action plan calls for written policies on government use of AI to be published sometime in 2025. The document does not take a position on whether those policies should include mandatory disclosure when government messages are produced with the help of machine learning.

Several bills are pending in the New York and New Jersey state legislatures that would put guardrails on the use of AI, including one in Albany that would amend the election law to require disclosure when “synthetic media” is used in political communication.  Another bill in Trenton would extend the crime of identity theft to fraudulent impersonation using AI or deepfake technology.

Some, including many entrepreneurs inside the artificial intelligence industry, say increased oversight and regulation need to come soon in order to preserve trust in the authenticity of mediated messages.

“We do need to act with regulation and some sort of governance,” said Zohaib Ahmed, the founder of Resemble AI, a Canadian company that specializes in voice cloning.

Ahmed said his firm has introduced an “invisible watermark” that can be embedded in audio files so it can always be traced back to the source.  He predicted watermarking and deepfake detection will quickly become industry standards, so people can trust what they’re hearing has been authorized by the person whose voice is being replicated.

“We understand the implications of our technology and we want to make it a point that we’re deploying it safely,” Ahmed said.

Correction (Feb. 1, 2024, 9:48 a.m.): An earlier version of this article misspelled the name of Julia Stoyanovich.
