Google Gemini was just announced and claims to have beaten GPT-4 across all major benchmarks.
However, benchmarks are NOT representative of real use cases — they are often designed so that cheap automated evaluation is possible, rather than expensive human evaluation.
Our lab compared Gemini vs GPT-4 across 11 assessment criteria, including cross-modal interleaved prompts, context window size, mixed-modal output, reasoning tasks, and understanding complex consulting charts... and the results were surprising! Since Gemini Ultra is not out yet, we used Bard powered by Gemini Pro, or relied on results from the research report.
Who won? Find out for yourself!
Google Deepmind technical report: storage.googleapis.com/deepmi...