OpenAI announced a new flagship generative AI model on Monday that it calls GPT-4o. The “o” stands for “omni,” referring to the model’s ability to handle text, speech, and video. GPT-4o is set to roll out “iteratively” across the company’s developer and consumer-facing products over the next few weeks.

OpenAI CTO Mira Murati said that GPT-4o provides “GPT-4-level” intelligence but improves on GPT-4’s capabilities across multiple modalities and media.

“GPT-4o reasons across voice, text and vision,” Murati said during a streamed presentation at OpenAI’s offices in San Francisco on Monday. “And this is incredibly important, because we’re looking at the future of interaction between ourselves and machines.”

GPT-4 Turbo, OpenAI’s previous “most advanced” model, was trained on a combination of images and text and could analyze images and text to accomplish tasks like extracting text from images or even describing the content of those images. But GPT-4o adds speech to the mix.

What does this enable? A variety of things.

GPT-4o greatly improves the experience in OpenAI’s AI-powered chatbot, ChatGPT. The platform has long offered a voice mode that reads the chatbot’s responses aloud using a text-to-speech model, but GPT-4o supercharges this, allowing users to interact with ChatGPT more like an assistant.

For example, users can ask the GPT-4o-powered ChatGPT a question and interrupt ChatGPT while it’s answering. The model delivers “real-time” responsiveness, OpenAI says, and can even pick up on nuances in a user’s voice and respond with generated voices in “a range of different emotive styles” (including singing).

GPT-4o also upgrades ChatGPT’s vision capabilities. Given a photo or a desktop screen, ChatGPT can now quickly answer related questions on topics ranging from “What’s going on in this software code?” to “What brand of shirt is this person wearing?”

These features will evolve further in the future, Murati says. While today GPT-4o can look at a picture of a menu in a different language and translate it, in the future, the model could allow ChatGPT to, for instance, “watch” a live sports game and explain the rules to you.

“We know that these models are getting more and more complex, but we want the experience of interaction to actually become more natural, easy, and for you not to focus on the UI at all, but just focus on the collaboration with ChatGPT,” Murati said. “For the past couple of years, we’ve been very focused on improving the intelligence of these models … But this is the first time that we are really making a huge step forward when it comes to the ease of use.”

GPT-4o is more multilingual as well, OpenAI claims, with enhanced performance in around 50 languages. And in OpenAI’s API and Microsoft’s Azure OpenAI Service, GPT-4o is twice as fast as GPT-4 Turbo, costs half as much and has higher rate limits, the company says.

At present, voice isn’t a part of the GPT-4o API for all customers. OpenAI, citing the risk of misuse, says that it plans to first launch support for GPT-4o’s new audio capabilities to “a small group of trusted partners” in the coming weeks.

GPT-4o is available in the free tier of ChatGPT starting today and to subscribers to OpenAI’s premium ChatGPT Plus and Team plans with “5x higher” message limits. (OpenAI notes that ChatGPT will automatically switch to GPT-3.5, an older and less capable model, when users hit the rate limit.) The improved ChatGPT voice experience underpinned by GPT-4o will arrive in alpha for Plus users in the next month or so, alongside enterprise-focused options.

In related news, OpenAI announced that it’s releasing a refreshed ChatGPT UI on the web with a new, “more conversational” home screen and message layout, and a desktop version of ChatGPT for macOS that lets users ask questions via a keyboard shortcut or take and discuss screenshots. ChatGPT Plus users will get access to the app first, starting today, and a Windows version will arrive later in the year.

Elsewhere, the GPT Store, OpenAI’s library of and creation tools for third-party chatbots built on its AI models, is now available to users of ChatGPT’s free tier. And free users can take advantage of ChatGPT features that were formerly paywalled, such as a memory capability that allows ChatGPT to “remember” preferences for future interactions, the ability to upload files and photos, and web search for answers to timely questions.

OpenAI’s newest model is GPT-4o

May 13, 2024

Published by TechCrunch
