Do not use Llama-3 70B for these tasks ...

Views: 3,052

code_your_own_AI

2 months ago

A detailed analysis of the 1 million votes cast by the AI community on LLM performance opens up new insights into areas where a particular LLM excels, and areas where you are better off not using it and should opt for a better-performing LLM instead.
All rights with the authors:
What’s up with Llama 3? Arena data analysis
lmsys.org/blog/2024-05-08-lla...
#airesearch #ai #newtechnology

Comments: 13
@gileneusz 2 months ago
This is a great video! Really amazing explanation.
@code4AI 2 months ago
One of the best comments today! 😊
@martinsherry 2 months ago
“of course, those people were wrong”…..hahahaha.
@code4AI 2 months ago
Finally, someone is laughing! Success! 😂
@Sl15555 2 months ago
Summarization might be low because of Llama 3's context length; that's my best guess. I'll have to test it more, as I like using LLMs to summarize YouTube videos (though I watched this one). I have found some areas where Llama 3 works well and use it for those. One is creative writing and poems, where using the result to produce creative lists for other tasks works really well.
@henkhbit5748 2 months ago
If an open-source LLM performs well for your particular use case, then for me it will always be preferable to a big, monolithic, closed-source LLM from ClosedAi!
@IdPreferNot1 2 months ago
Love how your critiques shred the populist AI community while providing useful info.
@thedoctor5478 2 months ago
I couldn't care less about friendliness. We can get that from low param models and use them to reform texts. Larger models should just care about reasoning above all else.
@TheReferrer72 2 months ago
Now I know you are tripping. Unless I can't read that graph properly, you are trying to tell us that a 44-45% win rate is a big loss! Especially as this is a 70B open-weights model, while the others are all closed-weights. And as another commenter noted, Llama 3 has only a 4k context window, so of course it will be poor at summarisation and other tests that rely on a long context. We will be getting longer-context versions from Meta, multimodal and with huge parameter counts.
@code4AI 2 months ago
Llama 3 was trained on 8,192-token sequences 😂
@TheReferrer72 2 months ago
@code4AI OK, it has an 8k token length; GPT-4 Turbo has 128k, Claude 200k, Gemini 1,000k+, so 16 times longer. My point still stands. And I notice how you did not address my first point. Like I said, you are tripping.
@peterbell663 2 months ago
I found it essentially useless and a waste of my time. I gave it a dataset of 10,000 lines with 22 variables and asked for summary statistics in cumulative blocks of 1,000, 10 blocks in total. I re-posed this question about 8 times over several hours, and each time the answer was DRIBBLE. And that was a very easy task. Imagine giving it a slightly more difficult task like time series modelling. I will check the alternatives.
@dennisestenson7820 2 months ago
Maybe you should choose an appropriate tool for the task.