Google Gemini vs OpenAI chatGPT 4 bake-off - SURPRISING results!

  2,147 views

Hubel Labs

7 months ago

Google Gemini was just announced, and Google claims it beats GPT-4 across all major benchmarks.
However, benchmarks are NOT representative of real use cases. They are often designed so that automated evaluation is possible, as opposed to expensive human evaluation.
Our lab ran tests across 11 assessment criteria, including cross-modal interleaved prompts, context window size, mixed-modal output, reasoning tasks, and understanding complex consulting charts, to compare Gemini vs GPT-4, and the results were surprising! Given that Gemini Ultra is not out yet, we used Bard powered by Gemini Pro or took results from the research report.
Who won? Find out for yourself!
Google Deepmind technical report: storage.googleapis.com/deepmi...

Comments: 9
@amparoconsuelo9451 7 months ago
I will use whatever is available, affordable and compatible with my hardware.
@vivianlevine8981 7 months ago
Great amount of detail in this video. Thanks for posting!
@hubel-labs 7 months ago
Glad it was helpful!
@RodRod3D 7 months ago
Hey, thanks for the video! Just remember you are testing a mid-level Bard (using Gemini Pro) against GPT-4 Turbo. So maybe in some of the tests Gemini Ultra, the most capable model, would have some advantages. What do you think?
@RodRod3D 7 months ago
Gemini on the omelet just answered what was asked. GPT was just a chatterbox; the human here knows the ingredients he is showing, so it is an obvious excess of information. Also, GPT wrongly identified a tomato where there are no tomatoes.
@hubel-labs 7 months ago
Yes, I think Ultra might do better in some cases, so I will have to rerun the tests when it becomes available. But for the omelet test, for example, I assume the results in the paper are from Ultra … and GPT-4 was still much better.
@RodRod3D 7 months ago
@hubel-labs Good point! Thanks for your time :)
@giorgibaghashvili5032 7 months ago
Please compare Bard and GPT-3.5. It would be very interesting, because both are free versions.
@hubel-labs 7 months ago
Yes, great idea.