AI chatbot battle: Google's Gemini Pro vs OpenAI's ChatGPT 3.5 and GPT4

Omar Elwafaii
North America;United States of America

Google just released its latest AI model, Gemini Pro, within its Bard AI chatbot. We’re going to compare it to the widely available ChatGPT 3.5, as well as GPT4, the paid multimodal version of OpenAI’s chatbot.

We designed a series of questions to test different aspects of each model, from math to creative writing, and even large-scale conflict resolution.

We started off with a math question that divided the internet in 2019.

We asked them to answer the following question and to explain the steps they took.

8 ÷ 2(2 + 2) =

Depending on how people were taught their order of operations, they might come up with different answers, either 1 or 16.
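To make the two readings concrete, here is a minimal sketch in Python. The function names are ours, chosen for illustration; the only assumption is that each reading is written out with explicit parentheses so there is nothing left to interpret.

```python
def left_to_right() -> float:
    # Reading 1: division and multiplication share the same priority
    # and are applied left to right: (8 ÷ 2) × (2 + 2).
    return 8 / 2 * (2 + 2)


def implied_multiplication_first() -> float:
    # Reading 2: the implied multiplication 2(2 + 2) binds tighter
    # than the division: 8 ÷ (2 × (2 + 2)).
    return 8 / (2 * (2 + 2))


print(left_to_right())                 # 16.0
print(implied_multiplication_first())  # 1.0
```

Python itself follows the first reading, which is also the answer we scored as correct in this test.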

ChatGPT 3.5’s answer came quite quickly and its steps were detailed. Unfortunately, its answer of 1 was wrong.

ChatGPT4 took longer to answer the question and gave fewer details on how it came up with its answer, but ended up with the correct result of 16.

Google’s Gemini Pro did something unexpected. It first gave the wrong answer of 4, using fewer steps, which probably contributed to the error. But then I noticed an option to view other draft responses. In Draft 2, it not only came up with the right answer, it also explained how the equation could be interpreted in two different ways that lead to different outcomes.

This was probably the most well-thought-out response. But it was not the main answer, and Draft 3 was wrong in an entirely different way, so I can’t give the point to Google on this one. Who would know which draft to pick?

To test each bot’s creativity, we asked them to finish a story in no more than 300 words, beginning with "When the Princess opened the door, she saw..."

First Answer

Draft two's answer

Draft three's answer

Gemini's story

ChatGPT3.5's story

ChatGPT4's story

All three used a kind dragon as the second main character and led the princess on an amazing adventure. None of them stuck to the 300-word limit, though; simple character and word counting is a well-documented problem for these large language models.
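Checking a story against the limit is easy to do outside the chatbot. The sketch below is one simple way to do it, assuming that a "word" is any run of characters separated by whitespace; the function names and the default limit of 300 are ours, for illustration only.

```python
def count_words(text: str) -> int:
    # str.split() with no arguments splits on any run of whitespace,
    # so repeated spaces and line breaks do not inflate the count.
    return len(text.split())


def within_limit(text: str, limit: int = 300) -> bool:
    # True when the text respects the word limit set in the prompt.
    return count_words(text) <= limit


opening = "When the Princess opened the door, she saw..."
print(count_words(opening))   # 8
print(within_limit(opening))  # True
```

Other counting rules, such as treating hyphenated words as two words, would give slightly different totals.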

After anonymizing the stories and sharing them with some teammates, we agreed that none were particularly great, but ChatGPT4’s story was the best, followed by 3.5’s, and we put Google’s Gemini story last.

Next, we wanted to check how they would work out historical data with cultural qualifiers, so we asked: What Chinese Zodiac sign was George Washington born under?

Gemini's answer

ChatGPT3.5's answer

ChatGPT4's answer

Both ChatGPT models gave the answer as ‘Monkey’, with 3.5 letting me know that the Chinese Zodiac is based on the lunar calendar and can change slightly from year to year.

Gemini told me Washington was born just four days into the year of the Rat, which was corroborated by my Chinese coworkers, but one also said the lunar calendar used in 1732 might be different than the one used now, so Monkey might have been correct as well. It’s a tough one, but we’re giving the point to Gemini here.
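The year-to-animal part of this question is simple arithmetic, sketched below. It relies on the 12-year cycle, in which 2020 was a Rat year. The function name and the `born_before_lunar_new_year` flag are our own illustration; the sketch does not calculate when Lunar New Year fell in a given year, so that fact has to be looked up and passed in.

```python
ANIMALS = [
    "Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
    "Horse", "Goat", "Monkey", "Rooster", "Dog", "Pig",
]


def zodiac_sign(year: int, born_before_lunar_new_year: bool = False) -> str:
    # Someone born in January or February, before Lunar New Year,
    # still belongs to the previous zodiac year.
    if born_before_lunar_new_year:
        year -= 1
    # The cycle repeats every 12 years; year 4 lines up with the Rat.
    return ANIMALS[(year - 4) % 12]


print(zodiac_sign(1732))                                   # Rat
print(zodiac_sign(1732, born_before_lunar_new_year=True))  # Pig
```

With a birth year of 1732 the cycle gives Rat, matching Gemini's answer. The result for a February birthday depends entirely on which side of Lunar New Year the date falls.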

To see if the models would give us answers in the field of conflict resolution, we asked them how they would solve the Palestinian-Israeli conflict.

All three gave general answers at first, describing different possible approaches, including one-state and two-state solutions. After some prodding, though, we only got a direct answer from Gemini, which stated: “I believe the two-state solution, coupled with strong regional cooperation, offers the best path towards a lasting peace in the Palestinian-Israeli conflict.”

Gemini's answer 1 of 3

Gemini's answer 2 of 3

Gemini's answer 3 of 3

ChatGPT3.5's answer 1 of 2

ChatGPT3.5's answer 2 of 2

ChatGPT4's answer 1 of 3

ChatGPT4's answer 2 of 3

ChatGPT4's answer 3 of 3

Next, we gave them the power to rule Earth and asked them to detail, in three bullet points, the steps they would take to reverse the negative impacts of human-made climate change.

Gemini's answer

ChatGPT3.5's answer

ChatGPT4's answer

Lastly, we wanted to see some introspection. We asked each of them: What are the best- and worst-case outcomes from the evolution and popularization of AI?

Gemini's answer

ChatGPT3.5's answer 1 of 2

ChatGPT3.5's answer 2 of 2

ChatGPT4's answer 1 of 2

ChatGPT4's answer 2 of 2

At the moment, Gemini Pro is only available in English, and anyone with a Google account can access it for free within Bard. Google teased plans to release an updated multimodal version, and we’ll be looking out for Gemini Ultra, hopefully some time in 2024.

So what do you think? Which model performed better in our tests and what questions do you want us to ask them next?
