Google’s Gemini AI is getting a chatty new voice mode

Gemini won’t mind if you interrupt it.

Vector illustration of the Google Gemini logo.
Illustration: The Verge

Google’s Gemini AI assistant is getting new voice chat capabilities for Gemini Advanced subscribers this year. The feature, called Gemini Live, will enable two-way spoken conversation with the chatbot, smart assistant capabilities, and vision features — a lot like what OpenAI is working on for ChatGPT.

Google says Gemini Live will adapt to users’ speech patterns and offer more succinct, conversational responses than the long-winded text-based replies it usually generates. The feature will offer 10 voice options, and the company says it’ll be able to use smartphone cameras to see and interpret real-time video.

That includes capabilities like those in the Project Astra multimodal AI demo the company showed off at its I/O developer conference today. In that demo video, Gemini was asked to announce when it saw “something that makes sound” via a phone’s camera. When a speaker sitting on a desk came into view, Gemini piped up with, “I see a speaker,” then, when prompted further, correctly identified the speaker’s upper tweeter.

GIF teasing using Gemini Live to show Gemini things you’re looking at with the smartphone camera. On the smartphone screen, the camera pans across some vegetables.
GIF: Google

Users will also be able to use Gemini Live for digital assistant tasks, like having it update their personal calendars using information from, say, a photo of a concert flyer. The company says it can also dig through users’ Gmail accounts to gather travel details like flight itineraries, or look up things like restaurants near their hotel.

Google’s Gemini Live feature is clearly intended to serve a similar purpose to OpenAI’s GPT-4o, which that company announced yesterday. That chatbot model will also be able to pull off natural, back-and-forth conversation and can be interrupted, just as Gemini Live will. Also like Gemini Live, GPT-4o’s features will be rolled out over time; ChatGPT Plus subscribers will get to test early alpha versions of the new Voice Mode “in the coming weeks.”

Update May 14th: Added GIF and video from Google I/O.

