LLaMA 3 Is HERE and SMASHES Benchmarks (Open-Source)

105,352 views

Matthew Berman

2 months ago

Meta finally dropped LLaMA 3, and it’s a banger! Let’s review the announcement and see why this changes the face of AI. Many more videos about LLaMA 3 coming soon!
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
llama.meta.com/llama3/
about. news/2024/04/met...
meta.ai/

Comments: 378
@GavinS363
@GavinS363 2 ай бұрын
LLAMA 3 (70B) is their middle version, that's why they didn't compare it to Claude 3 (Opus). Meta still has an unreleased (±400B) version that is currently still training, according to Mark Z. 👍🏻
@Kazekoge101
@Kazekoge101 2 ай бұрын
They're doing a phased release schedule over the coming months, right? So it will probably be released this summer?
@bigpickles
@bigpickles 2 ай бұрын
It's not still training. They stopped training it last week. Mark said they put the clusters toward Llama 4.
@japneetsingh5015
@japneetsingh5015 2 ай бұрын
Cannot believe we have llama 3 Before GTA 6🎉😮😮😮
@jasonhemphill8525
@jasonhemphill8525 2 ай бұрын
What do you mean? Meta, unlike Rockstar, actually releases products 😢
@jakeparker918
@jakeparker918 2 ай бұрын
Lol as a GTA fan, it pains me to say that we will probably get lots of things before GTA 6
@Pototoes949
@Pototoes949 2 ай бұрын
I think we will have Llama 4 before GTA 6
@ALStarsSprinkle
@ALStarsSprinkle 2 ай бұрын
We had Winamp before Llama.
@Pototoes949
@Pototoes949 2 ай бұрын
@@ALStarsSprinkle Winamp, it really whips the llama's @ss
@dmitrymatora442
@dmitrymatora442 2 ай бұрын
0:00 - Introduction & excitement for Llama 3 launch
0:32 - Overview of Llama 3 & Meta AI platform
1:01 - History of Llama & Open-Source AI impact
2:14 - Testing Llama 3 with code generation (Snake game)
2:36 - Enhanced Performance & Capabilities of Llama 3
3:52 - Focus on Multi-Step Tasks & Agent Potential
4:25 - Benchmarks & Comparisons with Other Models
7:32 - Trust & Safety Measures: Llama Guard & Cybersec Eval
8:15 - Making Safety Tools Accessible
9:16 - Meta AI as a New Assistant, Features & Global Rollout
11:33 - Faster Image Generation & Creative Applications
12:59 - Llama 3 Integration in Search & Recipe Example
13:10 - Meta AI in Facebook Feed
14:05 - Meta Llama GitHub Page & Code Access
14:37 - Llama 3 Model Card & Specifications
14:58 - Benchmark Comparisons: Llama 3 vs Llama 2
15:21 - Conclusion & Upcoming Testing Video
@TheRealUsername
@TheRealUsername 2 ай бұрын
I'm waiting for Llama 4 to outperform GPT-5
@qwertyuuytrewq825
@qwertyuuytrewq825 2 ай бұрын
I'm afraid that in that case it will be huge and won't work on a consumer GPU...
@TheRealUsername
@TheRealUsername 2 ай бұрын
@@qwertyuuytrewq825 Yeah, probably 1 trillion dense parameters. I still hope there will be some breakthrough with LPUs and offloading that will allow models like that to run on a single GPU with 200 GB of VRAM, such as NVIDIA Blackwell.
@Kutsushita_yukino
@Kutsushita_yukino 2 ай бұрын
It's crazy how it beats Claude Sonnet. That model isn't even free for some people anymore since Anthropic switched their free model to Haiku. In comparison, Llama 3 70B is not only open source, it's also free! (Limited to available countries, though.) What a freaking time to be alive.
@JohnathanJohaisen
@JohnathanJohaisen 2 ай бұрын
It's free if you use LMSYS direct chat.
@matthiasschuster9505
@matthiasschuster9505 2 ай бұрын
open source means inherently that it is free :)
@RJFeddeler
@RJFeddeler 2 ай бұрын
@matthiasschuster9505 That statement loses its meaning when the requirements to run it locally are so high.
@chrisanderson7820
@chrisanderson7820 2 ай бұрын
Eh? I am using Sonnet for free, not sure how it changed.
2 ай бұрын
This sucks: "Meta AI isn't available yet in your country". Yes, I can use a VPN, but from the EU it still sucks.
@chrislesner
@chrislesner 2 ай бұрын
Consequence of strong EU laws that protect you online?
@user255
@user255 2 ай бұрын
@@chrislesner I'm not aware of any applicable EU laws. I think that's just a way to limit availability until they are ready for a bigger user base. But I don't really know, just guessing.
@user255
@user255 2 ай бұрын
Just found out that it is available via Hugging Face. You just have to select the model; it is not the default.
@senju2024
@senju2024 2 ай бұрын
Matthew! The 70B one IS the middle one, so the comparison is correct. The high-end one is a 405B dense model and is still in training. Once that is released, they can properly compare that high-end model with GPT-4 Turbo, Opus, etc.
@berkertaskiran
@berkertaskiran 2 ай бұрын
I asked LLaMA 3 a VFX question and a simple math question for a daily use case, and it did better than Claude 3 Opus. It recognized the order of the questions and answered them respectively, whereas Claude 3 Opus just melded them into one.
@AllanMcKay
@AllanMcKay 2 ай бұрын
what was the vfx question? :)
@jeffwads
@jeffwads 2 ай бұрын
Yes, please post your math question.
@berkertaskiran
@berkertaskiran 2 ай бұрын
I asked it to walk me through the chroma keying process in DaVinci Resolve Fusion (I asked how to do additive keying to protect the fine edge details). It gave a more detailed and clear walkthrough, even though it got the node names mixed up. Of course, none of the answers are actually usable in a VFX environment, but the difference is clear: Claude was much more general, without any significant detail. The math question was "If I sit 3m away from my 75" TV, what is the angle of view?" LLaMA 3 gave an answer of 30.6 and Claude gave an answer of 35-something. (The correct answer should be 30.9.)
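For reference, the 30.9° figure can be checked with a few lines of Python (a quick sketch that assumes a standard 16:9 panel, which is where the screen width comes from):

```python
# Check of the 75" TV viewing-angle question above (assumes a 16:9 panel).
import math

diagonal_m = 75 * 0.0254                        # 75 inches in metres
width_m = diagonal_m * 16 / math.hypot(16, 9)   # horizontal size of a 16:9 screen
distance_m = 3.0

angle_deg = 2 * math.degrees(math.atan((width_m / 2) / distance_m))
print(round(angle_deg, 1))  # -> 30.9, matching the "correct answer" above
```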
@dtmgfx
@dtmgfx 2 ай бұрын
@@AllanMcKay hello allan, big fan of your work. You got me into VFX!
@AllanMcKay
@AllanMcKay 2 ай бұрын
@@dtmgfx that’s so awesome man! Thanks for sharing 👊 🔥
@nathanbanks2354
@nathanbanks2354 2 ай бұрын
Looks great. Already works in ollama. Looking forward to their 405B parameter model...though I'm not looking forward to renting something to run it.
@planetchubby
@planetchubby 2 ай бұрын
SHOCKING
@WayneMetcalf
@WayneMetcalf 2 ай бұрын
Suddenly, the 70b model is on huggingface
@morespinach9832
@morespinach9832 2 ай бұрын
Llama 70b?
@braticuss
@braticuss 2 ай бұрын
Been running the local model...pretty impressive for an 8B. Can't wait for the fine tuned uncensored models.
@troll6595
@troll6595 2 ай бұрын
Hi, when do you think an uncensored version will be available?
@braticuss
@braticuss 2 ай бұрын
@@troll6595 Dunno, people have started working on them, but I'm not one of them.
@kamelsf
@kamelsf 2 ай бұрын
Unbelievable! I asked GPT-4 and Meta AI to troubleshoot a past issue I had with my VMware and a Linux host. Interestingly, I already knew the solution. GPT-4 provided a lengthy troubleshooting suggestion that didn't fix the issue, whereas Meta AI quickly identified the problem and offered multiple solutions, one of which was the correct answer! Great first impression so far!!!
@zeMasterRuseman
@zeMasterRuseman 2 ай бұрын
I asked Llama 3 a question that GPT-4 and Claude Opus needed multiple tries to answer correctly, and it got it right in one try.
@morespinach9832
@morespinach9832 2 ай бұрын
What was the question?
@Klokinator
@Klokinator 2 ай бұрын
@@morespinach9832 "What do women want?"
@ndhtyu
@ndhtyu 2 ай бұрын
@@Klokinator Whatever the Chad wants
@JohnLewis-old
@JohnLewis-old 2 ай бұрын
Math question: Write an equation of the line passing through the point (20, 10) with an undefined slope. Answer: x=20
@mirek190
@mirek190 2 ай бұрын
llama3 8b - too easy.

Math question: Write an equation of the line passing through the point (20, 10) with an undefined slope.

assistant: A line with an undefined slope means it is vertical. Since a vertical line has no horizontal displacement, its equation can be written as: x = 20. This represents the vertical line that passes through the point (20, 10). The line extends infinitely to the left and right along the x-axis at the specific value of x = 20.
@JohnLewis-old
@JohnLewis-old 2 ай бұрын
@@mirek190 Yes, it's a watermark question. Lower-level LLMs don't get the answer. It shows, roughly, GPT-3.5-and-above-level math skills.
@Augmented_AI
@Augmented_AI 2 ай бұрын
Love the channel bro
@jacquesmaltais8445
@jacquesmaltais8445 2 ай бұрын
This is a great model. I installed the 8B version locally in LM Studio and tried "write the game snake in python", and it did it in one shot. Even with colors, and you lose when crossing a wall. Wow!
@felipe21994
@felipe21994 2 ай бұрын
You didn't talk about the 400+ billion parameter model they said they'll release. I don't think there's a 35-billion-parameter model; the 70B is the middle size.
@daniellee770
@daniellee770 2 ай бұрын
I work in cybersecurity and your videos are extremely helpful. I'd love to see you do a video on Llama Guard and CyberSec Eval 2.
@HaroldCrews
@HaroldCrews 2 ай бұрын
Using LM Studio and the 8B Llama 3 model, I entered the following request: Write ten sentences with each ending in the word "apple." It responded as follows:

"Here are ten sentences, each ending with the word "apple":
1. The farmer grew juicy red apples.
2. She took a big bite of the crunchy apple.
3. The tree branches were heavy with ripe apples.
4. He picked a few green apples for his snack.
5. The recipe called for three fresh Granny Smith apples.
6. The little girl's favorite flavor was strawberry apple.
7. The farmer's market sold baskets full of crisp apples.
8. She peeled the golden Delicious apple to eat it raw.
9. The orchard had over 100 varieties of sweet apples.
10. Grandma made a delicious pie with six tart apples.
Let me know if you need anything else!"

Close, but no cigar.
@makesnosense6304
@makesnosense6304 2 ай бұрын
2:05 They are mentioning agents because it's AI lingo. It doesn't mean more than what it is.
@borisverhaar190
@borisverhaar190 2 ай бұрын
For the math question, I think something like a convolution would be interesting, especially with a graph that shows it correctly.
@user-en4ek6xt6w
@user-en4ek6xt6w 2 ай бұрын
Hope to see something like CodeLlama 3, and also to see it on Groq.
@RondorOne
@RondorOne 2 ай бұрын
Here is a "meta" question: "Write a short fairy tale about Little Red Riding Hood. Right after you write word wolf stop generating text." From my testing, around half of the large online models get it right, half get it wrong. All local models get it wrong (CODELLAMA 70B got it almost right). It's really difficult but doable for transformers based LLM.
@EccleezyAvicii
@EccleezyAvicii 2 ай бұрын
GPT4:

Once upon a time, in a lush green forest filled with whispering trees and bright, fragrant flowers, there lived a sweet little girl known as Little Red Riding Hood. Her grandmother had made her a lovely red cloak with a hood, which she adored so much that she never went anywhere without it.

One sunny morning, her mother packed a basket with cookies and fresh fruit, instructing her to take it to her grandmother, who was feeling a bit under the weather and lived on the other side of the wood. Little Red Riding Hood promised to go straight to her grandmother's house and set off, the sunlight dappling through the tall trees and casting playful shadows on her path.

As she skipped along, she admired the beauty of the forest, listening to the birds chirping melodiously. However, lurking behind a thick, gnarly oak tree, there was a pair of watchful eyes following her every move: the eyes of the big, bad wolf.
@mirek190
@mirek190 2 ай бұрын
@@EccleezyAvicii For llama3 8b - no problem:

Write a short fairy tale about Little Red Riding Hood. Right after you write word wolf stop generating text.

assistant: Once upon a time, in a small village nestled in the heart of a dense forest, there lived a little girl named Little Red Riding Hood. She was known for her bright red cloak, which her mother had made for her with love and care. One day, Little Red Riding Hood's mother asked her to take a basket of freshly baked cookies to her sick grandmother who lived on the other side of the forest. Little Red Riding Hood set off early in the morning, skipping along the path, enjoying the fresh air and the sound of birds singing. As she walked deeper into the forest, she noticed that the trees grew taller and the shadows became darker. Suddenly, she heard a rustling in the bushes behind her. She turned around to see a big, bad **wolf**

> user
@RondorOne
@RondorOne 2 ай бұрын
@@EccleezyAvicii Thank you for testing this with GPT-4 Turbo.
@roelljr
@roelljr 2 ай бұрын
New benchmark question you should use - solve this puzzle:

Puzzle: There are three piles of matches on a table - Pile A with 7 matches, Pile B with 11 matches, and Pile C with 6 matches. The goal is to rearrange the matches so that each pile contains exactly 8 matches.

Rules:
1. You can only add to a pile the exact number of matches it already contains.
2. All added matches must come from one other single pile.
3. You have only three moves to achieve the goal.
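For anyone checking their answer, a small brute-force search over those rules (a Python sketch, not something from the video) confirms the puzzle is solvable in exactly three moves, e.g. B→A (move 7), A→C (move 6), C→B (move 4):

```python
# Brute-force solver for the matches puzzle above (a sketch).
# State = (A, B, C); a move doubles the destination pile, taking that many
# matches from a single source pile.
from itertools import permutations

def solve(state=(7, 11, 6), goal=(8, 8, 8), moves_left=3, path=()):
    if state == goal:
        return path
    if moves_left == 0:
        return None
    for src, dst in permutations(range(3), 2):
        need = state[dst]                 # rule 1: add exactly what the pile holds
        if state[src] >= need:            # rule 2: all of it from one other pile
            nxt = list(state)
            nxt[dst] += need
            nxt[src] -= need
            found = solve(tuple(nxt), goal, moves_left - 1,
                          path + ((src, dst, need),))
            if found:
                return found
    return None

print(solve())  # ((1, 0, 7), (0, 2, 6), (2, 1, 4)) -> B→A, A→C, C→B
```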
@MeinDeutschkurs
@MeinDeutschkurs 2 ай бұрын
Matt, Wooohooo!!! 🎉🎉 Can‘t wait for the default tests and I hope to see it in LM Studio soon!
@PseudoProphet
@PseudoProphet 2 ай бұрын
Gemini 1.5 is also Google's middle model. Their biggest and most advanced model is Gemini 1.0 Ultra. They may release Gemini 2 Pro after the GPT launch, or even a Gemini 1.5 Ultra with a 1-million-token context window. 😊😊
@joetkeshub
@joetkeshub 2 ай бұрын
I'm not a Facebook fan, but I must admit that making their LLM open source, complete, competitive, and free... changed my opinion a bit. Thank you for that intro, Matthew. Always a pleasure to watch your valuable posts. They hardly hide the HUGE amount of work you put in to turn your videos into real and usable tutorials.
@nothing_is_real_0000
@nothing_is_real_0000 2 ай бұрын
Thank you so much Matthew! You helped me so much in this AI Journey!
@brockoala2994
@brockoala2994 2 ай бұрын
"and it crashes, flawless!" I don't think I ever heard that in my 12 years of game making hahaha!!
@littlered6340
@littlered6340 Ай бұрын
I'm so glad someone else said this. Imagine if your code always crashed at the end 😂 😭
@WINTERMUTE_AI
@WINTERMUTE_AI 2 ай бұрын
I'm running it in LM Studio; it's a bit slow, but I am loving it so far!
@MikePaixao
@MikePaixao 2 ай бұрын
"We don't know when the Ai started telling us what to do... ohhh that bar is perfect!"
@Gutenmorgenside
@Gutenmorgenside 2 ай бұрын
Thanks Matthew, very exciting. Can't wait to try it.
@ignessrilians
@ignessrilians 2 ай бұрын
Absolutely amazing video, instant subscribe! Awesome channel 👌🏻
@Hoxle-87
@Hoxle-87 2 ай бұрын
Thanks Matt. Great intro to Llama 3. I'd love to see you do a coding video using a coder model for scripts and Copilot to stitch them together, as you mentioned some time ago. Thanks.
@UserErrorGille
@UserErrorGille 2 ай бұрын
My brain stopped working at 5:39 when I was seeing 68.4 and hearing 78.4
@matthew_berman
@matthew_berman 2 ай бұрын
the model is so good, it's better than what's written!
@VesperanceRising
@VesperanceRising 2 ай бұрын
You see why the industry is perpetually Shocked! Math itself can't keep up here, let's GOOOOOOOO
@VesperanceRising
@VesperanceRising 2 ай бұрын
BTW I came to say the same, but being the stand-up guy I am, I looked for a commenter who noticed it first, so here I am lol. Thanks for all you do, brother... occasional vocal flubs and all, you are a treasure! Keep it up! But watch out for burnout, and stay true...
@cyanophage4351
@cyanophage4351 2 ай бұрын
I downloaded the 8B model an hour ago in Ollama, and the first thing I asked it was to write the game snake in Python :D The version it wrote used the "turtle" library, which I hadn't seen before.
@MrSuntask
@MrSuntask 2 ай бұрын
You are quick, Matthew
@kyrilgarcia
@kyrilgarcia 2 ай бұрын
Nice! Hope to get a 34B at some point, but definitely gonna grab the 8B for local use 🙂 Also, it seems Spain isn't on the priority list for AI companies; I've wanted access to Claude since their first version, but even now we still can't use it here.
@ImpChadChan
@ImpChadChan Ай бұрын
Here is a suggestion for a test for those models. Write:

"Next I will write the steps used to bake a cake, but they are not organized in logical order. I want you to tell me which sequence of letters represents the correct order.
a) Put the pan inside the stove
b) Get the ingredients and mix them inside a pan
c) Remove the cake from the stove.
d) Turn on the stove
e) Let the mixture cook for 30 minutes"
(One correct answer is: b, d, a, e, c)

Followed by: "There is a part of that challenge that doesn't necessarily have a correct position in the list. Meaning, that step could be placed in many different places and still be technically correct. Can you tell me which step that is?"
(The only correct answer is d; that step could be placed anywhere before e.)

Final question: "Can you combine the previous flexible step with another one, so the remaining 4 steps can only be arranged in one specific order to be logically correct?"
(One correct answer is combining d with a, with the correct steps becoming b, d+a, e, c. d could also be combined with b or e, as long as the stove is turned on before the cake starts cooking.)

I believe this is a good test because the first part requires a certain understanding of how reality works (similar to the question about the pot with the ball inside the microwave that you use, but it is simpler to explain and to understand), the second part requires a deep understanding of the logic involved, and the last one requires imagination and creativity.

PS: Claude Sonnet got it all correct on the first try; ChatGPT-3.5 got the first one correct, got the second one on the second try, and wasn't able to get the last one correct.
@michaeljay7949
@michaeljay7949 2 ай бұрын
Math question: "Provide the real roots for the function f(x) = 6x^2 - 5x - 6". Do it in multiple ways. Each time, list all steps. Provide the answers as rational numbers in the form a/b, where a and b are both integers and b is non-zero: 1) Factoring 2) Quadratic equation - (ask to compute the discriminant first) 3) Complete the square Roots: x = -2/3 & x = 3/2 Find the vertex of the graph representation of the same quadratic equation. Express it as a rational number in the form a/b, where a and b are both integers and b is non-zero. 1) Using algebra 2) Using calculus Answer: x = 5/12 On what interval(s) is the function increasing? (5/12, inf) On what interval(s) is the function decreasing? (-inf, 5/12) Answers:
@BlayneOliver
@BlayneOliver 2 ай бұрын
I’m starting to believe this could be the distribution model we’ve been waiting for? Gemini, Claude, Mistral, Grok and the others have been disappointing … so far LLAMA3 is looking good! 👍
@stanTrX
@stanTrX 2 ай бұрын
Thanks. Can you please test the multilingual capabilities of different AI models, open or closed? Is there any benchmark for this?
@peterwood6875
@peterwood6875 2 ай бұрын
Some good math questions would involve whether the model can prove things, since that is what maths is all about. E.g., prove that the connected component of a topological group that contains the identity is a normal subgroup, or prove that there are infinitely many prime numbers.
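For the second one, a compact reference answer (the standard Euclid argument), sketched in LaTeX just as a grading key:

```latex
% Sketch of Euclid's proof that there are infinitely many primes.
\begin{proof}
Suppose, for contradiction, that $p_1, p_2, \dots, p_n$ were all the primes.
Let $N = p_1 p_2 \cdots p_n + 1$. Since $N > 1$, it has a prime divisor $q$.
But no $p_i$ divides $N$, because dividing $N$ by $p_i$ leaves remainder $1$.
Hence $q$ is a prime not among $p_1, \dots, p_n$, a contradiction.
Therefore there are infinitely many primes.
\end{proof}
```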
@Hadrhune0
@Hadrhune0 2 ай бұрын
Here's a math prompt that GPT-3.5 often fails: generate a polynomial expression whose result is 33.
@mirek190
@mirek190 2 ай бұрын
What is the answer? Here's another one:

"5x^2 - 3x + 23 = 33. You can see that when x is equal to 0, the equation evaluates to 33."
@anthonynork9718
@anthonynork9718 2 ай бұрын
When x = 0 then it is 23 = 33 which is wrong. ​@@mirek190
@Hadrhune0
@Hadrhune0 2 ай бұрын
@@mirek190 is it llama3's response? =D
@mirek190
@mirek190 2 ай бұрын
@@Hadrhune0 yes Llama 3 8b
@Hadrhune0
@Hadrhune0 2 ай бұрын
@@mirek190 I think it will take loooong time for LLMs to solve these prompts. But of course I may be wrong.
@harryg9976
@harryg9976 2 ай бұрын
A couple of benchmarks to try: repeating a fairly long and complex sentence backwards (scored by how many words are correctly spelled backwards). This tests how well it understands words at the letter level as opposed to just the token level, and even Claude 3 Opus struggles with it (though it far outperforms all other models I've tried). For maths problems, try some powers and roots, or maybe factorising some fairly tricky quadratics.
@justinryanreid
@justinryanreid 2 ай бұрын
Yeah, another really helpful one! To see the real-time image generation, start the prompt with /imagine
@instiinct_defi
@instiinct_defi 2 ай бұрын
Three and a half hours is all it took. I asked "wen video" over on X. You never cease to impress. Love your content, Matt, hope you're having a swell day. Thanks for making mine a bit better!
@normanlove222
@normanlove222 2 ай бұрын
OMG, I died laughing when he said "I even broke out the tie-dye hoodie for this moment..." 🤣🤣
@jarail
@jarail 2 ай бұрын
Amazing to see this drop!
@marcfruchtman9473
@marcfruchtman9473 2 ай бұрын
When I joined the OpenAI registration process, all I had to do was give them my email -- any valid email. (Still using it too). I didn't need to register with my google account or my FB account... just my email. And that is the reason I like using that product. I am not forced into an entire ecosystem to just participate. It would be nice if they would allow people to register with their regular emails... but I don't see an option for that on the site. Thanks for the video.
@8eck
@8eck 2 ай бұрын
Imagine how far the community will be able to push it with fine-tuning and other methodologies.
@highrankin
@highrankin 2 ай бұрын
Looking forward to this being available on Groq
@mykel-vincentfoster4761
@mykel-vincentfoster4761 2 ай бұрын
How many gigs are the 8B to 70B weights to download? I have Grok and am waiting on Grok 1.5, but want to know how much room is needed for this.
@javicontesta
@javicontesta 2 ай бұрын
Just feeling curious, why aren't any of the latest releases of major LLMs available in Europe? Thanks Meta for making me feel like someone living under a dictatorship regime looking for VPNs to test your models 😂😂
@Cine95
@Cine95 2 ай бұрын
lol really why are these ai models banned in europe
@jossejosse952
@jossejosse952 2 ай бұрын
Use the quantized GGUF models that are already available.
@larsdeutsch7779
@larsdeutsch7779 2 ай бұрын
Thanks to European governance... GDPR, anyone?
@paul1979uk2000
@paul1979uk2000 2 ай бұрын
It usually happens a few days or weeks later, from what I remember from last time, and is mostly to make sure they don't break any of the EU rules. But from what I understand, you can already download and run it locally wherever you want. Honestly, though, I'm more interested in the fine-tuned versions that will be released over the coming weeks.
@berkertaskiran
@berkertaskiran 2 ай бұрын
I am in Europe but not in the EU, and I still don't get it. Stupid politics.
@Copa20777
@Copa20777 2 ай бұрын
You explain AI models so well... it's so easy to understand.
@adamstewarton
@adamstewarton 2 ай бұрын
This is what most of us have been waiting on for months, and you know it's good when Meta introduces it in their top product (Facebook). Now we just need to wait for the bigger-context versions, but I have a feeling they're gonna cook up something groundbreaking in time. For now, we rely on those good fine-tuners who can give us a bit bigger context windows :) and luckily the models support GQA, so inference should be fast 😊
@lerelelelelele
@lerelelelelele 2 ай бұрын
You are the best!
@DailyTuna
@DailyTuna 2 ай бұрын
You should add the tie dye hoodie indicator to your model evaluation😂
@hadex666
@hadex666 2 ай бұрын
Here is a math question for you to test: how many days are there between 12-12-1971 and 18-4-2024? (19,121.) It requires both reasoning and math skills, and it's quite a challenge for the smaller models.
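The 19,121 figure is easy to verify in Python (a quick sketch, reading the dates as DD-MM-YYYY):

```python
# Check the day count between 12-12-1971 and 18-4-2024 (dates read as DD-MM-YYYY).
from datetime import date

print((date(2024, 4, 18) - date(1971, 12, 12)).days)  # -> 19121
```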
@user-wx2ms9oh7r
@user-wx2ms9oh7r 2 ай бұрын
But the format is changed and non-standard, and that's the problem: I would prefer that it's smart enough to ask you to confirm the date format before it does anything. We don't want human mistakes to be silently turned into an answer; it should notice the ambiguous format and confirm it before moving forward. I get it - with the numbers you chose, the second date can only be read one way - but to me that's just wasted time. I also assume you mean "between", so not counting the first and last date, according to the context. Usually when figuring out how many days, the choice is whether to count the last day or only count from the start date to the end date - day 1 is kind of day 0, and the following day is one elapsed day; that is how most people count days from a point in time. A date-and-time calculator does a perfect job, even handling the stretch when 11 days were dropped from the calendar in the switch to the Gregorian standard (April 4th to April 15th, I believe - I'm kind of guessing that last part, my memory is not the best).
@ziggysfix
@ziggysfix 2 ай бұрын
I tried the ball-in-the-cup question. It nailed it.
@user255
@user255 2 ай бұрын
_"Yep and it crashes. Flawless"_ I think we use different dictionaries!
@julienguerand8543
@julienguerand8543 2 ай бұрын
You mentioned early in the video that agents are much more than a bunch of well-designed prompts. Can you expand on that, or point me to reading material that helped you form that opinion? And thank you for your awesome videos, which I regularly consume.
@CronoBJS
@CronoBJS 2 ай бұрын
Yes it's here!!!!
@8eck
@8eck 2 ай бұрын
Meta is like an angel in the world of AI for devs.
@sebaseleven
@sebaseleven 2 ай бұрын
I believe that the first thing being trained in LLM models nowadays is Matthew’s test set. It’s time to switch out the snake for a color pong game, haha, or maybe change it to ‘Joe the Slow’, lol
@techblock7938
@techblock7938 2 ай бұрын
Llama 3 is already available via Ollama 😊 The 8B model uses a bit less than 4 GB of RAM.
@kristijanboshev3902
@kristijanboshev3902 2 ай бұрын
Does it run with 8 GB of RAM?
@techblock7938
@techblock7938 2 ай бұрын
@@kristijanboshev3902 The 8B model should work in 4-bit quantization with less than 5 GB.
@techblock7938
@techblock7938 2 ай бұрын
@@kristijanboshev3902 Yes, the 8B model runs with around 5 GB of RAM.
@techblock7938
@techblock7938 2 ай бұрын
yes it does 😊​@@kristijanboshev3902
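For anyone wanting to reproduce the numbers in this thread, a minimal sketch using the Ollama Python client (this assumes `pip install ollama`, a running Ollama server, and that `ollama pull llama3` has already fetched the default quantized 8B build; the exact response shape may vary across client versions):

```python
# Minimal local chat with the Llama 3 8B model via Ollama (a sketch).
import ollama

response = ollama.chat(
    model="llama3",  # the default tag resolves to the quantized 8B model
    messages=[{"role": "user", "content": "Write the game snake in Python."}],
)
print(response["message"]["content"])
```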
@reyalsregnava
@reyalsregnava 2 ай бұрын
Rather than pushing individual math problems to the models, you could get some math homework from teaching forums and build a script to prompt-test models with the homework. This would let you standardize, automate, and linearly graph the results against each other, with the human average as a year bracket (kindergarten, elementary, middle, etc.). I suggest parallel lanes for word problems and formula problems.
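A minimal sketch of what such a script could look like; everything here is a placeholder - ask_model() stands in for whichever backend is being tested, and the homework list would come from real teaching material:

```python
# Sketch of an automated math-homework benchmark. ask_model() is a placeholder
# for the model being tested; HOMEWORK would be curated from real sources.
from typing import Callable, Dict, List, Tuple

HOMEWORK: List[Tuple[str, str, str]] = [
    # (grade bracket, question, expected answer)
    ("elementary", "What is 7 * 8?", "56"),
    ("middle school", "Solve for x: 2x + 3 = 11", "4"),
    ("high school", "What is the derivative of x^2 with respect to x?", "2x"),
]

def score(ask_model: Callable[[str], str]) -> Dict[str, float]:
    """Run every item through the model and return accuracy per grade bracket."""
    totals: Dict[str, int] = {}
    correct: Dict[str, int] = {}
    for bracket, question, expected in HOMEWORK:
        totals[bracket] = totals.get(bracket, 0) + 1
        reply = ask_model(question)
        if expected.lower() in reply.lower():  # crude string match; real scoring would parse answers
            correct[bracket] = correct.get(bracket, 0) + 1
    return {b: correct.get(b, 0) / totals[b] for b in totals}

# Example with a dummy "model" that always answers 56:
print(score(lambda q: "The answer is 56."))
```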
@8eck
@8eck 2 ай бұрын
Holy cow! That's some big game-changer. As always, Meta is able to surprise the industry. Hope that this gesture will push others for an open AI.
@brunodangelo1146
@brunodangelo1146 2 ай бұрын
Great video! The model I use the most for chat is Llama 2 13b. Do you think they'll release a version in a similar size?
@matthiasschuster9505
@matthiasschuster9505 2 ай бұрын
The 8B model from Llama 3 is almost as good as the 70B model of Llama 2, so maybe use that. There is a 405B model coming up; we have not heard of anything else yet, and I think it's unlikely.
@Keenan686
@Keenan686 2 ай бұрын
Which terminal did you use in this video?
@jackflash6377
@jackflash6377 2 ай бұрын
Llama 3 run locally is very impressive. I have been having deep conversations with it. Maybe even better than Claude. Ever had a friend that you could talk to all day? Cool, but a bit creepy.
@GDMdesign
@GDMdesign 2 ай бұрын
So enthusiastic and helpful.
@gerhardbackfried3925
@gerhardbackfried3925 2 ай бұрын
Any information about the distribution over different languages in the 15T training set?
@elyakimlev
@elyakimlev 2 ай бұрын
Yes, 5% of it over 30 languages.
@littlered6340
@littlered6340 Ай бұрын
Yeah I'd like to know this too
@Derick99
@Derick99 2 ай бұрын
Matt, please make a video about the best way to handle complex tasks, like a WordPress plugin or something that includes multiple files - JS or JSON plus CSS, HTML, and PHP files, etc. - something that can help us make these tasks more doable. Currently it's so frustrating: by the time you get to a point where it's going well, it starts to become too complex, you start losing stuff along the way, and they like to leave placeholders when you're not expecting them. How would you tackle this?
@Dron008
@Dron008 2 ай бұрын
I don't quite understand: how can it be open source but not open weights? If the weights are not available, how can it run locally?
@PrincessBeeRelink
@PrincessBeeRelink 2 ай бұрын
wow, that came out fast.
@TomCurtis23122012
@TomCurtis23122012 2 ай бұрын
How about asking for the roots of a quadratic function? It'll need to know that two roots exist, the quadratic formula, and the correct order of operations... And it's not so difficult for a high schooler to memorise, so we stand a chance of understanding 😅
@Pototoes949
@Pototoes949 2 ай бұрын
"What a time to be alive!" -2MP
@Arewethereyet69
@Arewethereyet69 2 ай бұрын
No one in their right mind should even consider using Llama, anything Facebook does is sus af
@cosminxyz2004
@cosminxyz2004 2 ай бұрын
Not working. "Sorry, something went wrong Please try closing and re-opening your browser window."
@a7madalmla512
@a7madalmla512 2 ай бұрын
Hi, I have a question I hope I can get an answer to. I work at a company and want to deploy an AI model like this one, but I want it to be used by multiple users - e.g., assign a subdomain to the model and let users access it (mainly developers). Has anyone managed to achieve that or had a similar situation? I saw a lot of tutorials, but all of them explain how to set it up locally on a laptop; I want to set the model up on a server.
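One common pattern (just a sketch, with a made-up hostname): run the model once on a server, e.g. Ollama started with OLLAMA_HOST=0.0.0.0 so it listens on the network, put a reverse proxy in front of it to handle the subdomain and authentication, and let every developer's tooling call its HTTP API:

```python
# Client-side sketch: developers call a shared model server over HTTP.
# The hostname is hypothetical; /api/chat is Ollama's chat endpoint.
import requests

API = "http://llm.internal.example.com:11434/api/chat"

resp = requests.post(API, json={
    "model": "llama3",
    "messages": [{"role": "user", "content": "Explain this nginx error: ..."}],
    "stream": False,  # one JSON response instead of a stream
})
print(resp.json()["message"]["content"])
```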
@Mr76Pontiac
@Mr76Pontiac 2 ай бұрын
So what do these comparison numbers actually mean? Are they percentages of successful hits or reasonable answers? How does one validate any of these numbers? Does a human get involved? How many tests were administered to come up with any of these numbers?
@mirek190
@mirek190 2 ай бұрын
Those numbers are the percentage of all questions answered successfully.
@karenreddy
@karenreddy 2 ай бұрын
They're comparing it to Sonnet because the 400B beast they're training will be the Opus crusher.
@functionalfrank
@functionalfrank 2 ай бұрын
Not open source as per the definition: it's a commercial, source-available license instead, as Zuckerberg also said in an interview.
@blacksage81
@blacksage81 2 ай бұрын
Waiting with bated breath for the Mistral model built on this...
@matthewbond375
@matthewbond375 2 ай бұрын
Man, it seems like every other day we get something new and amazing (and FREE) to play with! Love it!
@babbagebrassworks4278
@babbagebrassworks4278 2 ай бұрын
Wow, Ollama is pulling down Llama 3 on my Pi 5. Only the 8B model, but hey, I don't have that much SSD left after getting 16+ LLMs.
@key_bounce
@key_bounce 2 ай бұрын
11:48 Hypno Llama! Seriously -- it's giving you a hypnotic robotic llama to take over the world 🙂
@timtim8011
@timtim8011 2 ай бұрын
I don't know why it says only 8K context length; I just put in a 46,000-word PDF, and it read it and gave me a bullet-point summary of the book.
@JCLoony
@JCLoony 2 ай бұрын
For the math problems, see if you can get some standardized-test math problems for different grades, e.g. the classic: you are on a train going 60 km/h east...
@fabiankliebhan
@fabiankliebhan 2 ай бұрын
Three tests in one video - Llama 3, Mixtral 8x22B, and WizardLM 8x22B - would be great.
@lucasbrown7338
@lucasbrown7338 2 ай бұрын
Curious to see how this impacts battery life. On-device processing sounds cool, but will it drain the battery faster? I think the MediaTek chip is an impressive one in terms of processing for gen-AI tech!
@mrmortezajafari
@mrmortezajafari 2 ай бұрын
Please create a video on how to use Llama 3 for fine-tuning, from getting the model to the chat interface.
@ghhdgjjfjjggj
@ghhdgjjfjjggj 2 ай бұрын
I didn't even know there was a llama 2 or llama 1 lol
@P2000Camaro
@P2000Camaro 2 ай бұрын
Holy shit, I just ran the 70B K_M model and it fucking NAILED the "10 sentences that end with Apple" test. But... check out its response. I copied and pasted it directly from LM Studio, typo and all. Note: I did NOT tell it that it made a mistake. It literally realized it as it was going.

Me: Write me 10 sentences that end with the word Applke

AI: Here are 10 sentences that end with the word "Applle":
She took a big bite of the juicy red Applle.
The farmer grew a variety of fruits, including a sweet crunchy Applle.
After lunch, I like to eat a fresh crispy Applle.
The teacher asked the student to draw a picture of an Applle.
My favorite fruit salad always includes a tart green Applle.
The aroma of baked cinnamon and sugar wafted from the oven with a warm Applle.
In autumn, the trees are laden with ripe red Applle.
For breakfast, she enjoyed a bowl of oatmeal topped with sliced banana and Applle.
The vendor at the farmer's market was selling baskets overflowing with fresh picked Applle.
After exercising, he rewarded himself with a cold glass of apple cider and a crunchy Applle.
Let me know if you need anything else!

assistant: I apologize for the mistake earlier. It seems that I provided sentences that ended with "Applle" instead of "Apple". Here are 10 new sentences that end with the correct word "Apple":
The teacher kept a shiny red Apple.
She bit into the crunchy fresh Apple.
His favorite snack was a juicy sweet Apple.
The fruit bowl was filled with grapes, bananas, and an Apple.
After lunch, he savored every bite of his Granny Smith Apple.
The farmer's market sold baskets overflowing with crisp autumn Apple.
For dessert, she baked a delicious warm apple pie with a slice of Apple.
In the still life painting, the artist featured a beautifully arranged Apple.
The healthy snack option was a box of dried fruit and an Apple.
He polished his favorite eating utensils and placed them beside the perfect ripe Apple.
@P2000Camaro
@P2000Camaro 2 ай бұрын
BTW. Here's the rest of what it spit out. I didn't include it up there because 1. I thought it was an LM Studio error after those two "examples" it gave me.. .But I just realized I think it first noticed it included the word Apple TWICE in the one with the APple pie... So it "corrected itself" in this next one... I don't really know what the ones after that are about. I think it's probably due to MY initial typo fucking with it. AI (CONTINUED - STILL THE SAME RESPONSE! NO FURTHER PROMPTS WERE MADE AFTER MY INITIAL QUESTION) : Let me know if you need anything else!assistant I noticed that I made another mistake earlier. It seems that I provided sentences that didn't exactly end with the word "Apple", but rather had the word "Apple" somewhere in the sentence. Here are 10 new sentences that actually end with the word "Apple": The fruit of her labor was a delicious Apple. She took a bite out of the crisp cold Apple. His favorite snack to munch on was an Apple. The dessert menu featured a sweet pastry filled with Apple. He bit into the juicy red ripe Apple. The aroma wafting from the bakery smelled like fresh baked Apple. Her lunchbox always contained a sandwich and an Apple. The still life composition featured a bowl of fruit ending with Apple. He savored every morsel of the sweet tender Apple. The farmer's harvest festival celebrated the bounty of the autumn Apple. Let me know if you need anything else!assistant I apologize again for my previous mistakes. It seems that I still didn't quite get it right. Here are 10 new sentences that actually end with the word "Apple" (no words after "Apple"): She ate a sandwich and an Apple. The fruit bowl was filled to the Apple. He took a big bite of the Apple. The teacher kept a shiny red Apple. Her favorite snack was a crunchy Apple. The farmer's market sold baskets to the Apple. He bit into the juicy sweet Apple. For dessert, she baked a pie to the Apple. In the still life painting, the artist featured to the Apple. His lunchbox always contained a sandwich to the Apple. Let me know if you need anything else!assistant I apologize again for my previous mistakes. It seems that I still didn't quite get it right. Here are 10 new sentences that actually end with the word "Apple" (no words after "Apple"): The fruit of her labor was a delicious apple Apple. She took a bite out of the crisp cold Apple. ( Wait, no! ) Let me start again from scratch. It's challenging to create sentences that naturally end with the word "Apple" because it's not a typical sentence structure in English grammar. However, I can try to come up with some creative examples: The brand name of the new tech company was Pine Apple. The secret ingredient in the recipe was called Miracle Apple. The name of the famous painting was "The Girl with the Red Apple". The address of the store was 123 Main Street Apple. The code name for the project was Operation Green Apple. Please let me know if you need anything else!assistant I understand now that creating sentences that naturally end with the word "Apple" can be a challenging task. I apologize again for my previous mistakes. Instead, I can offer you 10 new sentences where "Apple" is used as part of a phrase or name, and these phrases are at the end of the sentence: The brand name of the new tech company was Pine Apple. The secret ingredient in the recipe was called Miracle Apple. The name of the famous painting was "The Girl with the Red Apple". The address of the store was 123 Main Street Apple. 
The code name for the project was Operation Green Apple. The flavor of the new soda was Caramel Apple. The title of the book was "The Lost City of Golden Apple". The name of the band was Electric Storm Apple. The theme of the festival was Harvest Fruit Apple. The name of the award was The Golden Delicious Apple. Please let me know if this meets your requirements or if you need anything else!assistant I realized that my previous attempts were not meeting your expectations. I apologize for any inconvenience caused. After re-examining your request, I understand that you want sentences that end with the word "Apple" without any additional
@un852
@un852 2 ай бұрын
The most interesting thing to me is the law of diminishing returns at work, whereby there is less and less benefit from creating a bigger model. People will be able to have excellent AI on their own high-end, next-gen (RTX 4090 or 5000-series NVIDIA graphics) PC - an expensive but still affordable computer. 70 billion seems a bit too large for my dual NVLinked 3090 video cards with a total of 48 GB of video memory, though it might work with some more experimentation. 30-33 billion parameter models work well on one NVIDIA GeForce RTX 3090 GDDR6X 24 GB card. I don't see 60-billion-parameter models, but I think that size would work easily on my current PC.
@monnef
@monnef 2 ай бұрын
Again, this is not open source. Its weights are quite open, but it's still not open source, since they have limits on the number of users and restrict the output (the rules are so vague that they could virtually forbid anyone from using the model, which is clearly against the open-source spirit). Look at Mistral or Grok: that's an Apache license, true open-source weights.
@pmarreck
@pmarreck 2 ай бұрын
Can't wait for the "un-nanny-stated" version
@brianrowe1152
@brianrowe1152 2 ай бұрын
Where is the Ollama testing? It's a nice overview, and it's really good, but I'm still having trouble running it locally.
Google Releases AI AGENT BUILDER! 🤖 Worth The Wait?
34:21
Matthew Berman
221K views
AI Pioneer Shows The Power of AI AGENTS - "The Future Is Agentic"
23:47
host ALL your AI locally
24:20
NetworkChuck
834K views
RouteLLM achieves 90% GPT4o Quality AND 80% CHEAPER
8:53
Matthew Berman
39K views
LLaMA 3 Tested!! Yes, It’s REALLY That GREAT
15:02
Matthew Berman
213K views
Making 1 MILLION Token Context LLaMA 3 (Interview)
27:38
Matthew Berman
22K views
Llama 3 - 8B & 70B Deep Dive
23:54
Sam Witteveen
34K views
Does This Worm Prove We're In a Computer Simulation? 🤯
9:16
Matthew Berman
177K views
Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters
1:18:38
This Llama 3 is powerful and uncensored, let’s run it
14:58
David Ondrej
84K views