Gemini Live vs ChatGPT 4o Voice Chat Mode: Our Experience

by Ravi Teja KNTS

OpenAI introduced ChatGPT 4o back in May, featuring a voice mode that works like a phone call. Unlike existing voice assistants such as Google Assistant, Alexa, and Siri, it can better understand context, sustain back-and-forth conversations, and converse in a more human-like tone. It almost feels like you’re on a phone call with AI rather than just triggering an AI assistant for help.

To counter that, Google introduced Gemini Live at the Pixel 9 event, and it is quite similar to ChatGPT 4o’s voice mode. I used Gemini Live for over a week, and here’s my experience of how the two fare.

Comparison Table

Here is a handy comparison table that highlights the key differences between Gemini Live and ChatGPT 4o:

| Feature | Gemini Live | ChatGPT 4o Voice Mode |
| --- | --- | --- |
| Access and Availability | Limited to select Android phones like Samsung and Pixel | Available on Android, iOS, and Mac |
| UI and Visual Feedback | Visually appealing, no functional feedback | Simple, functional UI with mic visualizer |
| Manual Hold Feature | Not available | Available, allows pausing mid-sentence to collect your thoughts |
| Background Operation | Can operate in the background, so you can use other apps like YouTube while chatting with AI | Cannot operate in the background |
| Real-time Interruption Handling | Stops talking automatically when you start speaking | Requires a manual tap to stop |
| Contextual Understanding | Understands context, but often struggles to keep up when the topic changes | Better at keeping up with the conversation and context |
| Language Support | Supports up to 40 languages, but sometimes struggles with non-English languages | Supports up to 85 languages |
| Information Accuracy | Fetches up-to-date information from the internet | Relies on internal data, sometimes outdated |

This table captures the key distinctions between the two AI chatbots, each of which is covered in more detail below.

Access and Availability

Both ChatGPT 4o Voice and Gemini Live can be accessed through their respective apps. However, ChatGPT stands out because it is available on a wider range of platforms, including Android, iOS, and even Mac (sorry Windows). In contrast, Gemini Live is currently limited to select Android devices, such as Pixel and Samsung phones. While this is expected to improve over time, Gemini Live’s availability remains limited at the time of writing this comparison. It’s important to note that the two services are only accessible to ChatGPT Plus and Gemini Advanced subscribers, respectively.

First Impressions

At first glance, Gemini Live looks more appealing than ChatGPT’s simpler interface. The former features animated gradient lights in the background, while ChatGPT displays a plain white blob in the center of the screen. However, that white blob isn’t just static: it animates when ChatGPT speaks and includes a mic visualizer that moves when you talk, indicating that ChatGPT is listening. So, while Gemini may look better, ChatGPT offers more functional feedback through its interface. Both services allow you to pause or end the voice chat at any time. While we found the two services similar in tone and conversational ability, each also has unique features of its own.

ChatGPT Voice Mode’s Advantage

AI models usually respond immediately after you stop speaking. The response time is quick and similar on both platforms. However, there are times when you might pause to gather your thoughts or recall details. While a human would recognize the pause and wait for you to continue, these AI models often don’t—they might start replying after hearing only part of your sentence.

To solve this, ChatGPT offers a manual hold feature. By pressing and holding anywhere on the screen, you can activate a hold mode that ensures the AI listens to your entire sentence. This allows you to take your time and pause whenever needed. ChatGPT will only respond once you release the hold. Unfortunately, Gemini lacks this feature, so when I need to think mid-sentence, I find myself filling the gaps with filler phrases like “something like that” or “you know what I mean”.

Gemini Live’s Advantage

On the other hand, Gemini can work in the background, which is an advantage. This means you can leave the Gemini app and still continue your conversation while using other apps on your phone, which makes multitasking possible.

For example, the other day, I was checking a recipe online and needed to clear a few doubts about the missing ingredients. I was able to keep Gemini running in the background while checking the recipe. Whenever I had a question, I simply asked, and it responded in the background. It’s like having my mom on the call in the background.

Gemini’s other advantage is that it stops talking as soon as you start speaking. While OpenAI has announced that ChatGPT will eventually have this feature, it’s not yet available to the public. Currently, ChatGPT only stops when you tap the screen. However, Gemini also takes a moment to recognize that you’re talking, so it may not stop immediately. Despite this, you don’t have to repeat yourself—Gemini can still pick up your words even while it’s replying.

Real-World Examples Highlighting Differences

Except for a few features and UI differences, both services may initially seem similar. However, things start to differ when it comes to understanding context, the ability to hold conversations, language support, information accuracy, and more. Let’s explore these differences with real-world examples.

1. Brainstorming Ideas for a Story

I have a habit of writing short stories for fun. Since the launch of ChatGPT’s Voice mode, I’ve been using it to brainstorm ideas. Over the past week, I’ve been trying out Gemini Live. For me, there’s a clear winner in this aspect—ChatGPT.

When brainstorming with back-and-forth conversations, I often start with a specific idea but change pace as the discussion progresses. ChatGPT consistently keeps up with the conversation and adapts well to changes in context and topic. However, with Gemini, when I initially pitch an idea and later switch to something else, it keeps reverting to the original idea. I found myself having to repeat multiple times that I had changed my mind and that this was the new direction I wanted to take. This issue isn’t limited to story writing; it happens in various other conversations as well. While both voice bots can understand context, Gemini often gets confused and struggles to keep up with the flow of the conversation.

2. Translating between languages

My mother tongue is Telugu, and my friends speak Hindi, so we decided to try these voice bots as interpreters. While ChatGPT was able to perform the task somewhat well, Gemini was a complete failure. Although Gemini can reply in multiple languages, including the ones we need, it struggles to understand anything spoken in languages other than English. Your experience may vary depending on the language you’re using, but in our case, since Gemini couldn’t pick up the languages we know, the clear winner is ChatGPT. However, when compared to Google Translate’s Conversation mode, even ChatGPT has a long way to go. Even on paper, Gemini supports only 40 languages as of now, whereas ChatGPT supports roughly 85.

3. Learning a Topic

In our experience, both ChatGPT and Gemini tend to hallucinate, and neither is perfectly accurate. This applies to their voice modes as well. However, when it comes to providing the latest information, Gemini has the upper hand, as it constantly searches the internet for answers. In contrast, ChatGPT relies on its internal training data and only checks web pages when necessary. As a result, it sometimes provides outdated or completely incorrect information. For instance, when I asked both Gemini and ChatGPT for the Pixel 9 specs, Gemini provided accurate details, while ChatGPT mistakenly shared the Pixel 8’s specs.

You can address this issue by specifically asking ChatGPT to check online before answering. When I tried that, it returned with Pixel 9’s spec sheet. However, in everyday use, there’s a higher chance of receiving incorrect information from ChatGPT compared to Gemini.

That said, both AI models are effective at conveying information, whether through examples, analogies or by simplifying it for a 9-year-old. Each has its own style, and we found both to be quite likable. Overall, I prefer to rely on Gemini more than ChatGPT, especially when learning about something new or when there are recent updates to the information I seek.

So Which is Better – ChatGPT Voice Mode or Gemini Live?

Overall, ChatGPT is currently a better voice assistant than Gemini Live. Its UI is more functional, it generates answers slightly faster, and it does a bit better at keeping up with the conversation. However, Gemini Live is new and has its advantages, such as the ability to work in the background and provide accurate information from the internet most of the time. While ChatGPT may be better at the moment, the difference isn’t significant, so you can choose either based on the price and advantages each offers.
