Facebook parent Meta is working on a new tool that leverages generative AI, the underlying tech behind the viral chatbot ChatGPT. Dubbed Voicebox, the tool can create speech from voice samples and simple text inputs.
The system was trained on a data set of 50,000 hours of audio recordings and transcripts. That data set includes public-domain audiobooks in six languages: English, French, German, Spanish, Polish, and Portuguese. The researchers say that the multilingual training is essential to help Voicebox work across different dialects and speakers.
In addition to generating speech, the model can edit pre-recorded audio. It can remove unwanted background noises like car horns or dogs barking while preserving the content and style of the original recording, they say. It can also convert a text passage into speech in one of the six languages supported by the system.
If that weren’t enough, the system can even translate a piece of recorded speech into another language, then recreate that speech in the original speaker’s voice. It can take a two-second audio sample and recreate it in a different language with near-perfect accuracy. It can also regenerate interrupted or misspoken words without the speaker re-recording the entire passage.
This functionality could help create virtual assistants or even aid people who can’t speak. But, as with any generative AI technology, the potential for misuse is significant. That’s part of why the company is not yet ready to release the tool to testers.
“We want to make sure we have the right balance of openness and responsibility in our approach to generative AI,” the team wrote in a research paper published alongside the announcement. Meta is also working on other ways to protect its users, such as a deepfake detection system that can identify whether a voice is real or AI-generated.
The company is hiring for several roles crucial to building its teams, including technical leads and product managers. These positions often require management skills, with responsibilities that include defining and guiding high-level goals and roadmaps, assessing performance, and measuring bias. The company has also posted job listings for a responsible AI research lead and an AI ethics engineer who will work on the company’s generative AI projects.
The company has yet to say when it might release Voicebox to the public, though a future release appears possible. For now, the company limits its availability to select advertisers. It has also released a research paper and a few audio examples.