ElevenLabs, a startup building AI-powered tools to generate and edit synthetic voices, has announced the closing of an $80 million Series B round. The deal was co-led by entrepreneur Daniel Gross, former GitHub CEO Nat Friedman, and Andreessen Horowitz.
The round, in which Sequoia Capital, Smash Capital, BroadLight Capital, Credo Ventures, and SV Angel also participated, values ElevenLabs at over $1 billion, up from approximately $100 million in June. According to ElevenLabs CEO Mati Staniszewski, the company will use the funds to improve its products, grow its infrastructure and staff, conduct AI research, and “enhance safety measures to ensure responsible and ethical development of AI technology.” Staniszewski said that the latest round “cemented ElevenLabs’ position as the global leader in voice AI research and product deployment.”
ElevenLabs was founded in 2022 by Piotr Dabkowski, a former machine learning engineer at Google, and Staniszewski, a former deployment strategist at Palantir. The company launched in beta approximately one year ago. Staniszewski says that poorly dubbed American films inspired him and Dabkowski, a Polish native, to build voice cloning technology. They reasoned that AI could do better.
Among ElevenLabs’ current offerings, the browser-based speech generation app stands out for its ability to create realistic voices with adjustable toggles for intonation, cadence, emotion, and other key vocal characteristics. For free, users can type in text and get a recording of it read aloud by one of several default voices. Paying customers can upload samples of their speech to create new voice styles with ElevenLabs’ voice cloning.
Audiobooks, dubbing of films and TV episodes, character voices in games, and marketing activations are all areas where ElevenLabs is pouring resources into developing new versions of its speech-generating technology.
The company’s “speech to speech” product, introduced last year, aims to preserve a speaker’s voice, prosody, and intonation while translating speech and syncing it with the original material. Due in the next few weeks are a new dubbing studio workflow, with tools to create and edit transcripts and translations, and a subscription-based mobile app that narrates text and webpages using ElevenLabs voices.
ElevenLabs has won over clients in the publishing, media, and entertainment industries, including Paradox Interactive (the game developer behind Cities: Skylines 2 and Stellaris) and The Washington Post. Staniszewski says that employees at 41% of Fortune 500 companies use ElevenLabs and that its users have generated the equivalent of more than 100 years of audio.
That said, not all of the press has been glowing. Users of the notorious message board 4chan, which is known for its conspiratorial content, used ElevenLabs’ tools to share hate speech imitating celebrities like Emma Watson. The Verge’s James Vincent was able to use ElevenLabs to maliciously clone voices in a matter of seconds, generating samples that contained racist and transphobic remarks, threats of violence, and other offensive language. And Vice reporter Joseph Cox documented creating a clone convincing enough to fool a bank’s authentication system.
In response, ElevenLabs has released a tool to detect speech generated by its platform and has sought to ban users who repeatedly violate its terms of service, which prohibit abuse. Staniszewski says that this year ElevenLabs will improve the detection tool so that it flags audio from other voice-generating AI models, and that it will partner with unnamed “distribution players” to make the tool available on third-party platforms.
ElevenLabs has also faced criticism from voice actors who claim that the company uses samples of their voices without their consent, and that those samples could be used to promote content they do not endorse or to spread misinformation. In a recent Vice article, victims describe how ElevenLabs was used in harassment campaigns, including one in which a cloned voice was used to share an actor’s home address.
Finally, there is the elephant in the room: the very real threat that ElevenLabs and similar services pose to the voice acting profession. According to Motherboard, voice actors are increasingly being asked to sign away the rights to their voices so that clients can use AI to generate synthetic versions that could eventually replace them. What’s more, voice performers aren’t always compensated fairly for that work. The worry is that voice work, especially lower-paying entry-level roles, will be taken over by AI-generated voices and that actors will have no choice but to accept it.
A few platforms are trying to find a middle ground. Earlier this month, ElevenLabs rival Replica Studios signed an agreement with SAG-AFTRA, the media artists’ union, to create and license digital replicas of its members’ voices. In a joint statement, the two groups said the agreement established “fair” and “ethical” terms to ensure performers’ consent and set out how synthetic voice doubles may be used in new works.
But even this was unpopular with some voice performers, including some members of SAG-AFTRA. ElevenLabs’ answer is a marketplace for voices. Currently in alpha and set to become more widely available in the next several weeks, the marketplace lets users create, verify, and share their voices. According to Staniszewski, the original creators receive compensation whenever others use their voices.
When it comes to the availability and pay terms of their voice, “users always retain control,” he stressed. “The marketplace is designed as a step towards harmonizing AI advancements with established industry practices while also bringing a diverse set of voices to ElevenLabs’ platform.”
However, voice performers may take issue with the fact that ElevenLabs isn’t currently paying in cash. Under the current system, creators earn credit toward ElevenLabs’ premium services, which some may find ironic.
That might change down the road as ElevenLabs, now one of the best-funded synthetic voice startups, tries to fend off upstart competitors like Papercup, Deepdub, Respeecher, and Voice.ai, as well as established companies like Google, Amazon, and Microsoft. In any case, ElevenLabs expects to remain a major player in the rapidly expanding synthetic voice industry, and it plans to grow its workforce from 40 to 100 by the end of the year.