OpenAI began rolling out ChatGPT’s Advanced Voice Mode on Tuesday, giving users their first chance to hear GPT-4o’s strikingly realistic voice replies. The test version is available to a small group of ChatGPT Plus users today, and OpenAI says the feature will gradually expand to all Plus users in the fall of 2024.
When OpenAI first showed off GPT-4o’s voice in May, it stunned audiences with how quickly it responded and how much it resembled a real person’s voice. One voice in particular, Sky, sounded a lot like Scarlett Johansson, who voiced the AI assistant in the movie “Her.” Soon after the demo, Johansson said she had turned down multiple requests from CEO Sam Altman to use her voice, and that after seeing the demo she hired a lawyer to defend her likeness. OpenAI denied using Johansson’s voice, but it later pulled the voice from the product. In early June, OpenAI said it would delay the release of Advanced Voice Mode to strengthen its safety measures.
A month later, the wait is (kind of) over. OpenAI says the video and screen-sharing capabilities shown during its Spring Update are not part of this test and will be released “later.” For now, the GPT-4o demo that wowed everyone remains just a demo, but some paying users can now try the voice feature it showcased in ChatGPT.
ChatGPT Can Now Talk and Listen
You may have already tried the Voice Mode currently available in ChatGPT, but OpenAI says Advanced Voice Mode is different. ChatGPT’s previous audio setup relied on three separate models: one to convert your voice into text, one to process the prompt (GPT-4), and a third to convert ChatGPT’s text reply back into speech. GPT-4o, however, is multimodal and handles all of these tasks itself, without helper models, which makes conversations noticeably faster. OpenAI also says GPT-4o can pick up on emotional cues in your voice, such as sadness, happiness, excitement, or even singing.
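To make the architectural difference concrete, here is a minimal sketch of the two pipelines in Python. Every name in it is a hypothetical stand-in for illustration; this is not OpenAI’s actual API.

```python
# Conceptual sketch only: all functions are hypothetical stand-ins,
# not OpenAI's real models or API.

def speech_to_text(audio: bytes) -> str:
    return "transcribed prompt"        # stand-in for a transcription model

def llm_generate(prompt: str) -> str:
    return "text reply"                # stand-in for a text-only GPT-4

def text_to_speech(text: str) -> bytes:
    return b"synthesized reply"        # stand-in for a TTS model

def legacy_voice_mode(audio_in: bytes) -> bytes:
    # Old Voice Mode: three model hops. Each hop adds latency, and the
    # middle model only ever sees text, so tone, emotion, and singing
    # are stripped out before it can react to them.
    return text_to_speech(llm_generate(speech_to_text(audio_in)))

def multimodal_voice_mode(audio_in: bytes) -> bytes:
    # GPT-4o-style design: a single multimodal model consumes and emits
    # audio directly, so vocal cues survive end to end and the round
    # trip is one hop instead of three.
    return b"audio reply from one multimodal model"

if __name__ == "__main__":
    user_audio = b"user speech"
    legacy_voice_mode(user_audio)      # three sequential models
    multimodal_voice_mode(user_audio)  # one model, lower latency
```

The point of the sketch is the shape of the data flow, not the models themselves: collapsing three sequential hops into one is where both the speed gain and the preserved vocal nuance come from.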
The ChatGPT Plus users in this test will be the first to find out just how realistic Advanced Voice Mode is in everyday use. Parhlo World was not able to test the feature before this story was published, but we will review it when we get access.
OpenAI says it is releasing ChatGPT’s new voice gradually so it can closely monitor how it is used. People in the test group will get an alert in the ChatGPT app, followed by an email with instructions on how to use the feature.
OpenAI says that in the months since the demo, it has tested GPT-4o’s voice capabilities with more than 100 outside testers who speak 45 different languages, and that a report on this safety work is coming in early August.
The company says Advanced Voice Mode will be limited to ChatGPT’s four preset voices: Juniper, Breeze, Cove, and Ember, which were created in collaboration with paid voice actors. The Sky voice shown in OpenAI’s May demo is no longer available in ChatGPT. Lindsay McCallum, a spokesperson for OpenAI, says, “ChatGPT cannot impersonate other people’s voices, whether they are real people or famous people, and will block outputs that don’t match one of these preset voices.”
OpenAI wants to stay out of deepfake scandals. In January, voice-cloning technology from AI company ElevenLabs was used to impersonate President Joe Biden in robocalls that deceived voters in the New Hampshire primary.
OpenAI also says it has added new filters that block certain requests to generate music or other copyrighted audio. In the past year, AI companies have landed in legal trouble over copyright infringement, and audio models like GPT-4o open the door to a whole new category of complaints. Record labels in particular have a long history of litigation, and they have already sued the AI song generators Suno and Udio.
What do you think of this story? Visit Parhlo World for more.