OpenAI previews chatbot with real-time voice capabilities

5 months ago 24
Chattythat Icon

Business·New

ChatGPT maker OpenAI said on Monday it would release a new AI model called GPT-4o, capable of realistic voice conversation and able to interact across text and image, its latest move to stay ahead in a race to dominate the emerging technology.

New GPT-4o model will be offered for free, company says

The Associated Press

· Posted: May 14, 2024 7:49 AM EDT | Last Updated: 10 minutes ago

A logo is shown, reading OpenAi and displaying a circuit board.

OpenAI made a splash in 2022 with the launch of ChatGPT, which was called the fastest application ever to reach 100 million monthly active users. However, worldwide traffic to ChatGPT's website has been on a roller-coaster ride in the past year. (Dado Ruvic/Reuters)

ChatGPT maker OpenAI said on Monday it would release a new AI model called GPT-4o, capable of realistic voice conversation and able to interact across text and image, its latest move to stay ahead in a race to dominate the emerging technology.

New audio capabilities enable users to speak to ChatGPT and obtain real-time responses with no delay, as well as interrupt ChatGPT while it is speaking, both hallmarks of realistic conversations that AI voice assistants have found challenging, the OpenAI researchers showed at a livestream event.

"It feels like AI from the movies … Talking to a computer has never felt really natural for me; now it does," OpenAI CEO Sam Altman wrote in a blog post.

Microsoft-backed OpenAI faces growing competition and pressure to expand the user base of ChatGPT, its popular chatbot product that wowed the world with its ability to produce human-like written content and top-notch software code.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: <a href="https://t.co/MYHZB79UqN">https://t.co/MYHZB79UqN</a><br><br>Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. <a href="https://t.co/uuthKZyzYx">pic.twitter.com/uuthKZyzYx</a>

&mdash;@OpenAI

At the livestream event, OpenAI researchers showed off ChatGPT's new voice assistant capabilities. In one demo, ChatGPT used its vision and voice capabilities to talk a researcher through solving a math equation on a sheet of paper.

In another demo, researchers showed the GPT-4o model's capability of real-time language translation.

OpenAI's demonstrations verged on science fiction, with ChatGPT and its interlocutor at one point engaging in coquettish banter. The OpenAI researcher told the chatbot he was in a great mood because he was demonstrating "how useful and amazing you are."

ChatGPT responded: "Oh stop it! You're making me blush!"

LISTEN | New York Times tech reporter Cade Metz on thorny legal issues surrounding AI: 

8:29Are tech giants ignoring their own policies in order to train their AI systems?

Elamin is joined by technology reporter Cade Metz to talk about The New York Times investigation into the length big companies like Google, Meta and Open AI are going to in order to develop the smartest artificial intelligence models.

Opening volley ahead of Alphabet launch

Altman posted on X after the demo, "her," in what appeared to be a reference to the 2013 film Her by Spike Jonze about a man played by Joaquin Phoenix falling in love with his AI assistant, voiced by Scarlett Johansson.

OpenAI's chief technology officer, Mira Murati, said at the event that the new model would be offered for free because it is more cost-effective than the company's previous models. Paid users of GPT-4o, will have greater capacity limits than the company's free users, she said. The GPT-4o, short for "omni," will be available in ChatGPT over the next few weeks, the company said.

WATCH | Push-and-pull reported within OpenAI as it forges ahead: 

What is OpenAI’s super-secret Project Q*? | About That

A breakthrough in artificial intelligence at OpenAI preceded former CEO Sam Altman's firing and was part of a list of the board's grievances, according to sources cited in a Reuters report. Andrew Chang explains what we know about the AI technology referred to as Q*, and also breaks down the gap between current AI technology and "human" intelligence.

In addition, free ChatGPT users now have access to a "browse" feature that enables ChatGPT to display up-to-date information from the web, Murati told Reuters after the event. The company does not intend to make money off free users through selling ads, Murati said.

Shortly after launching in late 2022, ChatGPT was called the fastest application ever to reach 100 million monthly active users. However, worldwide traffic to ChatGPT's website has been on a roller-coaster ride in the past year and is only now returning to its May 2023 peak, according to analytics firm Similarweb.

OpenAI made the announcements a day before Alphabet is scheduled to hold its annual Google developers' conference, where it is expected to show off its own new AI-related features. Reuters reported last week that OpenAI planned to announce an AI-powered search product, citing sources. But the company decided to delay the search product announcement, according to one source familiar with the matter.

Read Entire Article