Mixture of Agents TURBO Tutorial 🚀 Better Than GPT4o AND Fast?!

39,850 views

Matthew Berman

Days ago

Here's how to use Mixture of Agents with Groq to achieve not only incredible quality output because of MoA, but to solve the latency issue using Groq.
Check out Groq for Free: www.groq.com
UPDATE: You don't need a valid OpenAI API key for this tutorial.
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberma...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
github.com/togethercomputer/MoA
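
For anyone who wants to see the shape of what the video sets up, here is a minimal sketch of the MoA-with-Groq pattern. It is not the repo's exact code, just the idea: several "proposer" models answer the prompt independently, then an aggregator model synthesizes their answers. The Groq base URL and the model names below are assumptions and may have changed since the video; check the Groq console for the current identifiers.

import os
from openai import OpenAI

# Groq exposes an OpenAI-compatible API, so the standard client can be pointed at it.
client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",
)

# Model names are assumptions; check Groq's docs for what is currently offered.
PROPOSERS = ["llama3-8b-8192", "gemma-7b-it", "mixtral-8x7b-32768"]
AGGREGATOR = "llama3-70b-8192"

def ask(model, messages, temperature=0.7):
    resp = client.chat.completions.create(model=model, messages=messages, temperature=temperature)
    return resp.choices[0].message.content

def mixture_of_agents(prompt):
    # Layer 1: each proposer answers the prompt independently.
    proposals = [ask(m, [{"role": "user", "content": prompt}]) for m in PROPOSERS]
    # Layer 2: an aggregator reads all proposals and writes a single, improved answer.
    combined = "\n\n".join(f"Candidate {i + 1}:\n{p}" for i, p in enumerate(proposals))
    system = ("You are given several candidate responses to the user's request. "
              "Synthesize them into one high-quality answer.\n\n" + combined)
    return ask(AGGREGATOR, [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ])

if __name__ == "__main__":
    print(mixture_of_agents("Write 10 sentences that end with the word 'apple'."))

The linked repo is more elaborate (multiple proposer layers, its own aggregator prompt), but the control flow is roughly this.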

Comments: 179
@matthew_berman
@matthew_berman 22 күн бұрын
Is this the ultimate unlock for open source models to compete with closed source?
@punk3900
@punk3900 21 күн бұрын
Groq is amazing... I read their comments on Nvidia, and there is clearly huge potential in changing the architecture for LLM ASICs. Yet Nvidia would rather sell one chip to rule them all than butcher their SOTA universal chip. My last thought is that Nvidia is surely secretly working on an LLM-specific chip and will show it once the competition becomes real. Thanks Matt for sharing your findings.
@lucindalinde4198
@lucindalinde4198 21 күн бұрын
@matthew_berman Great video
@jeffg4686
@jeffg4686 21 күн бұрын
And they're STILL not talking about socialism yet...
@dahahaka
@dahahaka 21 күн бұрын
Hey, what is that display that you have in the background? The one showing different animations including snake?
@olafge
@olafge 21 күн бұрын
I wonder how much the token count increase diminishes the cost efficiency against frontier models. Would be good to add tiktoken to the code.
@3dus
@3dus 22 күн бұрын
This is a serious opportunity for Groq to just offer this transparently to the user. They could have a great competitor to the frontier models.
@wurstelei1356
@wurstelei1356 21 күн бұрын
Yeah, Groq should host this MoA from Matt. Would be great.
@MilesBellas
@MilesBellas 22 күн бұрын
T-shirt merchandise: "I am going to revoke these API keys" 😂😅
@matthew_berman
@matthew_berman 21 күн бұрын
Such a good idea!!!
@nikhil_jadhav
@nikhil_jadhav 21 күн бұрын
@@matthew_berman Since I saw the keys exposed, I was just waiting for you to say these lines. Once you said it I felt relieved.
@punk3900
@punk3900 21 күн бұрын
@@matthew_berman It was kind of cruel to mention this the second time you showed those keys :D
@matthew_berman
@matthew_berman 21 күн бұрын
@@nikhil_jadhav Lol!! I'll mention it as soon as I show them next time ;)
@nomad1220
@nomad1220 22 күн бұрын
Hey Matthew - Love your vids - they are tremendously informative.
@starmap
@starmap 22 күн бұрын
I love that open source local models are so powerful that they can compete with the giants.
@mikezooper
@mikezooper 21 күн бұрын
Not really though. If you have ten small engines driving a car, that doesn’t mean one of those engines is impressive.
@StemLG
@StemLG 21 күн бұрын
​@@mikezooper sure, but you're missing the fact that those engines are free
@wurstelei1356
@wurstelei1356 21 күн бұрын
@@mikezooper Also, OpenAI is using something similar to MoA.
@TheRealUsername
@TheRealUsername 21 күн бұрын
It's just that the proprietary models have hundreds of billions of parameters, compared to open source models which are 3b-70b.
@omarhabib7411
@omarhabib7411 20 күн бұрын
@@TheRealUsername when is Llama 3 400B coming out?
@jackflash6377
@jackflash6377 21 күн бұрын
Thanks for the clear and concise instructions. Worked flawlessly the first time. Now we just need a UI to work with it, including an artifacts window of course.
@scitechtalktv9742
@scitechtalktv9742 22 күн бұрын
Great! I will certainly try this out
@wardehaj
@wardehaj 21 күн бұрын
Awesome instructions video. Thanks a lot!
@artnikpro
@artnikpro 22 күн бұрын
I wonder how good it will be with Claude 3.5 Sonnet + GPT4o + Gemini 1.5 Pro
@punk3900
@punk3900 21 күн бұрын
But Groq will not run it. It can run only open source models as they just provide the infrastructure
@user-ku6oq8cn6m
@user-ku6oq8cn6m 21 күн бұрын
​ @punk3900 It is true Groq will not run it. However, the MoA code already seems to let you run any cloud LLM with an OpenAI formatted endpoint. And there are solutions already available to turn most cloud LLMs into an OpenAI formatted endpoint (in the cases where one is not already provided). Personally, I don't care if it is really slow (much slower than Groq). I still want to try combining a mixture of the best models (including proprietary cloud LLMs) already out there.
@titusblair
@titusblair 17 күн бұрын
Great tutorial thanks so much Matthew!
@MrMoonsilver
@MrMoonsilver 22 күн бұрын
Hey Matt, remember GPT-Pilot? Apart from revisiting the repo (it's done amazingly well), there is an interesting use case in relation to MoA. Remember how GPT-Pilot calls an API for its requests? Wouldn't it be interesting to see how it performs if it were to call a MoA "API"? It would require exposing the MoA as an API, but it would be very interesting to see, as it would enable developers to piece together much cheaper models to achieve great outcomes, the likes of 3.5 Sonnet.
@matthew_berman
@matthew_berman 21 күн бұрын
Good idea. I’ve seen CrewAI powering a coding AI project which looked interesting!
@14supersonic
@14supersonic 21 күн бұрын
This would be perfect for agentic wrappers like Pythagora, Devin, and Open-Devin. Paying for those expensive APIs for the frontier models is not always the best option for most end users, especially when you're working with lots of experimental data. This could be something that's relatively simple to implement, too.
@MrMoonsilver
@MrMoonsilver 21 күн бұрын
@@14supersonic it might even be more accurate than the standard APIs
@wurstelei1356
@wurstelei1356 21 күн бұрын
@@MrMoonsilver Plus the privacy is much higher.
@frinkfronk9198
@frinkfronk9198 21 күн бұрын
@@wurstelei1356privacy def better as long as it's fully local. running groq is still cloud. not that you were suggesting otherwise. I just mean generally to the conversation people should be aware
@AhmedMagdy-ly3ng
@AhmedMagdy-ly3ng 21 күн бұрын
Wow 🤯 I really appreciate your work, keep going pro ❤
@RosaLei
@RosaLei 15 күн бұрын
Stoked you're using Groq! 🙌 The speed is mind-blowing! IMO, the quality of the results has been useful to me.
@manuelbradovent3562
@manuelbradovent3562 19 күн бұрын
Great video Matthew! Very useful. Previously I had problems using crewai and max tokens with groq and will have to check how to resolve it.
@punk3900
@punk3900 21 күн бұрын
I wonder how asking the same model multiple times with different temperatures might work. For instance, you ask the hot head, you ask the cold head, and a medium head integrates this. I think most LLMs would hate it this way, but it's clearly the future that we will have unified frameworks with several LLMs asked several times and some model integrating those answers. No single model can compensate for integrating data from several sources.
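
For what it's worth, the "hot head / cold head" idea above is easy to prototype with the same client: sample one model at several temperatures and let the same model, at a low temperature, integrate the drafts. A rough sketch; the Groq endpoint and model name are assumptions, not something from the video.

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["GROQ_API_KEY"],
                base_url="https://api.groq.com/openai/v1")
MODEL = "llama3-70b-8192"  # assumed model id

def temperature_ensemble(prompt, temps=(0.2, 0.7, 1.0)):
    # "Cold", "warm", and "hot" samples of the same model.
    drafts = []
    for t in temps:
        r = client.chat.completions.create(
            model=MODEL, temperature=t,
            messages=[{"role": "user", "content": prompt}])
        drafts.append(r.choices[0].message.content)
    # A low-temperature pass merges the drafts into one answer.
    merge_prompt = ("Here are several drafts of an answer, sampled at different temperatures:\n\n"
                    + "\n\n---\n\n".join(drafts)
                    + "\n\nWrite one final answer that keeps the best parts of each draft.")
    final = client.chat.completions.create(
        model=MODEL, temperature=0.3,
        messages=[{"role": "user", "content": merge_prompt}])
    return final.choices[0].message.content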
@vikastripathiindia
@vikastripathiindia 19 күн бұрын
Thank you. This is brilliant!
@OpenAITutor
@OpenAITutor 9 күн бұрын
Thank you for the inspiration! I created a version using Groq and open-webui as a pipeline.
@Dmitri_Schrama
@Dmitri_Schrama 22 күн бұрын
Sir, you are a legend.
@drlordbasil
@drlordbasil 21 күн бұрын
We need to start comparing things to claude 3.5 sonnet too! But I love MoA concept.
@MrLorde76
@MrLorde76 21 күн бұрын
Nice went smoothly
@MrBademy
@MrBademy 20 күн бұрын
this setup is actually worth it.... good video man :)
@rafaeltrochereyes8952
@rafaeltrochereyes8952 14 күн бұрын
great !! just great ! thanks!
@juanpasalagua2402
@juanpasalagua2402 21 күн бұрын
Fantastic!
@BigBadBurrow
@BigBadBurrow 22 күн бұрын
Just glancing at your headshot, I thought you were wearing a massive chain, like a wideboy from the Sopranos. Then I realised it's just the way your T-shirt is folded 😂
@l0ltaha
@l0ltaha 21 күн бұрын
Hey @matthew_berman, how would I go about putting this whole process in a Docker container, or exposing it as an API endpoint so I can connect Groq's speech-to-text and have the returned text passed to the MoA prompt? Thanks and love the vids!
@nexusphreez
@nexusphreez 22 күн бұрын
This is awesome. What would be even better is getting a GUI setup for this so that it can be used more for coding. I may try this later.
@EROSNERdesign
@EROSNERdesign 22 күн бұрын
AMAZING AI NEWS!
@kevinduck3714
@kevinduck3714 22 күн бұрын
groq is an absolute marvel
@danberm1755
@danberm1755 21 күн бұрын
Thanks much! I might actually give that a try considering you did all the heavy lifting. Seems like the AI orchestrators such as Crew AI or LangChain should be able to do this as well.
@AlfredNutile
@AlfredNutile 21 күн бұрын
Just a thought: if you fork the repo and then upload your changes, we could just download your fork and try it out?
@punk3900
@punk3900 22 күн бұрын
Oh boy, Groq's inference time is truly impressive. Like most people, however, I thought you were talking about Grok. Groq is in fact mostly just Llama on steroids. It's a pity they can't offer larger models so far. But seeing how Groq works gives a glimpse of the speed of LLM chatbots in a year or two.
@matthew_berman
@matthew_berman 21 күн бұрын
Llama 405b I assume is coming
@jewlouds
@jewlouds 21 күн бұрын
@@matthew_berman I hope you are 'assuming' correctly!
@labrats-AI
@labrats-AI 21 күн бұрын
Groq is awesome !
@braineaterzombie3981
@braineaterzombie3981 22 күн бұрын
What if we use a mixture of giant models like 4o with Claude Opus and Sonnet 3.5 and other state-of-the-art models, combined with Groq?
@nikhil_jadhav
@nikhil_jadhav 21 күн бұрын
Trying right away!! Thank you very much. I wonder what would happen if I exposed my personal mixture of agents to someone else and they used my model as their primary model? Or thousands of such models interconnected with each other... a mesh of models looped within themselves. What would happen?
@vash2698
@vash2698 21 күн бұрын
Any idea if this could be used as a sort of intermediary endpoint? As in, I point to a local machine hosting this script as though it were a locally hosted model. If this can perform on par with or better than GPT-4o at this kind of speed, then it would be incredible to use as a voice assistant in Home Assistant.
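
One way that could look, purely as a sketch and not something from the video: wrap the MoA pipeline behind a tiny OpenAI-style HTTP endpoint, then point Home Assistant or any other OpenAI-compatible client at it. The route below mimics the /v1/chat/completions shape; mixture_of_agents is a hypothetical placeholder standing in for whatever pipeline you run locally.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def mixture_of_agents(prompt: str) -> str:
    # Placeholder: call your local MoA pipeline here (see the sketch earlier on this page).
    return f"(MoA answer for: {prompt})"

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    model: str = "moa"
    messages: list[Message]

@app.post("/v1/chat/completions")
def chat(req: ChatRequest):
    prompt = req.messages[-1].content  # use the most recent message as the prompt
    answer = mixture_of_agents(prompt)
    # Return an OpenAI-style response body so existing clients can parse it.
    return {
        "object": "chat.completion",
        "model": req.model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": answer},
            "finish_reason": "stop",
        }],
    }

# Run with: uvicorn moa_server:app --port 8000
# then point the client's base_url at http://localhost:8000/v1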
@burnt1ce85
@burnt1ce85 22 күн бұрын
The title of your video is misleading. Your tutorial shows how to set up MoA with Groq, but you haven't demonstrated how it's "Better Than GPT4o AND Fast". Why didn't you test the MoA with your benchmarks?
@hotbit7327
@hotbit7327 21 күн бұрын
Exactly. He likes to exaggerate and sometimes mislead, sadly.
@tonyclif1
@tonyclif1 19 күн бұрын
Did you see the question mark after the word fast? Sure, a little clickbait, but it also seems misread by you.
@robboerman9378
@robboerman9378 21 күн бұрын
If Matthew with his insane local machine can’t compete, I am convinced Groq is the way to go for MoA. Super interesting to see how it nails the most difficult task in the rubric and fast! ❤
@KeithBofaptos
@KeithBofaptos 22 күн бұрын
I've been thinking about this approach also. Very helpful vid. Thanks 🙏🏻. I'm curious whether, on top of MoA, MCTSr would get the answer closer to 💯? And once Sohu comes online, how awesome are those speeds gonna be?!
@KeithBofaptos
@KeithBofaptos 22 күн бұрын
This is also interesting: www.nature.com/articles/s41586-024-07421-0.pdf
@positivevibe142
@positivevibe142 22 күн бұрын
What is the best RAG app for local, private AI models?
@zeeveener
@zeeveener 21 күн бұрын
All of these enhancements could be something you contribute back to the project in the form of configuration. Would make it a lot easier for the next wave of users
@marcusk7855
@marcusk7855 19 күн бұрын
Wow. That is good.
@Pregidth
@Pregidth 22 күн бұрын
How many tokens are used for calling the OpenAI API? It would be wonderful if you could show how to leave OpenAI out. And a full benchmark test please. Thanks Matthew!
@nikhil_jadhav
@nikhil_jadhav 21 күн бұрын
Just wondering how I can use this Groq setup in Continue?
@IntelliAmI
@IntelliAmI 18 күн бұрын
Matthew, how are you? Groq hosted another model, and I inserted it into this version of MoA that you taught us. Now with 5 LLMs at the same time. The new model is gemma2-9b-it.
@GodFearingPookie
@GodFearingPookie 22 күн бұрын
Groq? We love local LLMs
@matthew_berman
@matthew_berman 22 күн бұрын
Yes but you can’t achieve these speeds with local AI
@InsightCrypto
@InsightCrypto 22 күн бұрын
@@matthew_berman why not try that on your supercomputer?
@Centaurman
@Centaurman 22 күн бұрын
Hi Matthew, if someone wanted to build a home server on a 5-grand budget, do you reckon a dual-3090 setup could handle it? If not, how might a determined enthusiast make this fully local?
@HansMcMurdy
@HansMcMurdy 22 күн бұрын
You can use local language models, but unless you are using a custom ASIC, the speed will be reduced substantially.
@shalinluitel1332
@shalinluitel1332 22 күн бұрын
@@matthew_berman any local, free, and open-source models which have the fastest inference time? which is the fastest so far?
@Pthaloskies
@Pthaloskies 21 күн бұрын
Good idea, but we need to know the cost comparison as well.
@paul1979uk2000
@paul1979uk2000 21 күн бұрын
I'm wondering, have any tests been done with much smaller models, where you have 2 or 3 running locally on your own hardware, to see if it improves quality over any of the 2 or 3 on their own? I ask because with how APUs are developing, dedicating 20-30-40 GB or more to AI use wouldn't be that big of a deal with how cheap memory is getting.
@sammathew535
@sammathew535 21 күн бұрын
Can I make an "API" call to this MoA and use it say, with DSPy? Have you ever considered making a tutorial on DSPy?
@thays182
@thays182 22 күн бұрын
Need the follow up tho. What is mixture of agents? How can we use it? Do we get to edit the agentic framework and structure ourselves? What possibilities now exist with this tool? I need moooore! (Amazing video and value, I never miss your posts!)
@shrn680
@shrn680 19 күн бұрын
would there be a way to integrate this with a front end like openwebui?
@42svb58
@42svb58 21 күн бұрын
How does this compare when there is RAG with structured and unstructured data???
@vikastripathiindia
@vikastripathiindia 19 күн бұрын
Can we use OpenAI as the main engine along with Groq?
@donzhu4996
@donzhu4996 12 күн бұрын
Hi Matthew, I got an error:
Traceback (most recent call last):
  File "D:\ai\MoA\bot.py", line 1, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
But I did install it.
@KCM25NJL
@KCM25NJL 19 күн бұрын
Little tip: conda create -n <env_name> python=<version> && conda activate <env_name>
@consig1iere294
@consig1iere294 21 күн бұрын
How is this any different from Autogen or CrewAI?
@mohl-bodell2948
@mohl-bodell2948 14 күн бұрын
Could you put your version in a github repo?
@vauths8204
@vauths8204 21 күн бұрын
see now we need that but uncensored
@husanaaulia4717
@husanaaulia4717 21 күн бұрын
We've got MoE, MoA, CoE... is there anything else?
@Ha77778
@Ha77778 22 күн бұрын
I love that, I hate OpenAI 😅
@psychurch
@psychurch 22 күн бұрын
Not all of those Apple sentences make sense Apple.
@oguretsagressive
@oguretsagressive 21 күн бұрын
Sadly, even my favorite Llama 3 botched sentence #4. Maybe this test should specify that the output should be grammatically correct? Or make sense? Apple.
@wurstelei1356
@wurstelei1356 21 күн бұрын
@@oguretsagressive A valid sentence has to meet certain criteria. The AI should keep track of this and not just output blah blah Apple. Even if you don't explicitly tell it to produce valid sentences ending with Apple.
@techwiththomas5690
@techwiththomas5690 21 күн бұрын
Can you explain how these many layers of models actually know HOW to produce the best answer possible? How do they know which answer is better or more correct?
@kai_s1985
@kai_s1985 20 күн бұрын
The biggest limitation of Groq is the API rate limit. After some use, you cannot use it anymore.
@millerjo4582
@millerjo4582 22 күн бұрын
Is there any chance you’re gonna look into the new algorithmic way to produce LLM’s, this is a transformers killer supposedly, I would think that that would be really relevant to viewers.
@matthew_berman
@matthew_berman 21 күн бұрын
Name?
@millerjo4582
@millerjo4582 21 күн бұрын
@@matthew_berman also, thank you so much for responding. It's Ridgerchu/Matmulfree.
@millerjo4582
@millerjo4582 20 күн бұрын
@@matthew_berman I don’t know if you got that.. it looks like the comments were struck.
@DavidJNowak
@DavidJNowak 20 күн бұрын
Excellent explanation of how to write code that uses Groq as a manager of a mixture of agents. But you just went too fast for me to catch all the changes that make it all work. Could you write this up in your newsletter or create a video for the slow, methodical programming types? Thanks again. Groq AI is making programming more accessible for non-power users like most of us.
@ollibruno7283
@ollibruno7283 21 күн бұрын
But it can't process pictures?
@piotr780
@piotr780 19 күн бұрын
why use conda and not pip ?
@engineeranonymous
@engineeranonymous 21 күн бұрын
When YouTubers have better security than Rabbit R1. He revokes API keys.
@rinokpp1692
@rinokpp1692 21 күн бұрын
What's the cost of running this with a one million token context for input and output?
@fairyroot1653
@fairyroot1653 13 күн бұрын
Groq is amazing, but I can't do this setup because I don't have an OpenAI API key and I don't use it. Let's wait for a way to use it without needing OpenAI at all.
@D0J0Master
@D0J0Master 18 күн бұрын
Is Groq censored?
@wholeness
@wholeness 22 күн бұрын
Quietly, this is what Sonnet 3.5 is, and the Anthropic secret. That's why the API doesn't work well when using so much function calling.
@Kutsushita_yukino
@Kutsushita_yukino 22 күн бұрын
where in the heck did you hear this rumor
@blisphul8084
@blisphul8084 21 күн бұрын
If that were the case, streaming tokens would not work so well. Though having multiple models perform different tasks isn't a bad idea. That's probably why there's a bit of delay when starting an artifact in Claude.
@badomate3087
@badomate3087 22 күн бұрын
Has anyone created a comparison between MoA and other multi-agent systems that can utilize LLMs (like Autogen)? Because to me, this looks exactly like an Autogen network with a few simplifications, like no code running and no tool or non-LLM agent usage. So if this is not better, or is even worse than Autogen, then it might not be worth using, since Autogen has a lot more features (like the code running, which was mentioned in the last video). Also, the results compared to a single (but much bigger) LLM look kind of obvious to me, since the last model receives a lot of proposed outputs alongside the prompt, and it only has to filter the best ones. That task is a lot easier than generating the correct answer on the first try, from the prompt alone. And since this is the base idea behind MoA, the results are to be expected.
@jay-dj4ui
@jay-dj4ui 21 күн бұрын
nice AD
@dr.ignacioglez.9677
@dr.ignacioglez.9677 22 күн бұрын
Ok 🎉
@4.0.4
@4.0.4 21 күн бұрын
"4. The old proverb says that eating an apple a day keeps the doctor away apple." 🙃
@ErickJohnson-qx8tb
@ErickJohnson-qx8tb 21 күн бұрын
all about AI ragmodel running uncensored v2w/ MOA using groq MOA LIBRABRY blackfridays gpts library ENOUGH SAID YOUR WELCOME I WOTE MY OWN API KEY ON OPEN GUI i built LOLOLOL
@ryanscott642
@ryanscott642 20 күн бұрын
This is cool, but can you write some real multi-document code with these things? I don't need to make 10 sentences ending in apple. Most of the things I've tried so far can't write code, and I struggle to figure out their use.
@4NowIsGood
@4NowIsGood 22 күн бұрын
Interesting, but I don't know WTF you're doing, though it looks great. Unless there's an easier setup and install, for me right now it's easier to just use ChatGPT.
@FriscoFatseas
@FriscoFatseas 21 күн бұрын
yeah im tempted but by the time i get this shit working gpt4 will get a random update and be better lol
@rudomeister
@rudomeister 21 күн бұрын
That's why (regarding small agents vs. response time) Microsoft especially, along with all the others, has whole datacenters trying to find out how a swarm of millions of small agents can work seamlessly. What else should it be? A giant multi-trillion-parameter model with the name Goliath? Haha
@Officialsunshinex
@Officialsunshinex 22 күн бұрын
Brave browser local llm test o.o
@GoofyGuy-WDW
@GoofyGuy-WDW 22 күн бұрын
Groq? Sounds like Grok doesn’t it?
@blisphul8084
@blisphul8084 21 күн бұрын
Groq had the name first. Blame Elon Musk.
@harshitdubey8673
@harshitdubey8673 21 күн бұрын
I tried asking MoA: if A=E, B=F, C=G, D=H, then E=? It got it wrong 😂 MoA's answer: "I". But it's amazing 🤩
@wrOngplan3t
@wrOngplan3t 20 күн бұрын
Would be my answer as well. What's wrong about that?
@harshitdubey8673
@harshitdubey8673 20 күн бұрын
@@wrOngplan3t E is predefined
@harshitdubey8673
@harshitdubey8673 20 күн бұрын
Logic may not always be a sequence; it could be a circle ⭕️ sometimes.
@wrOngplan3t
@wrOngplan3t 20 күн бұрын
@@harshitdubey8673 Ah okay, well there's that 🙂 Thanks!
@AZWef
@AZWef 11 күн бұрын
Both A and E are obvious answers. (Both could also be true in parallel, so stating that it is wrong is also flawed.) One is presented; the other is a deduction based on the sequence. If you assume the question is extending the sequence, then the answer is right.
@Dave-nz5jf
@Dave-nz5jf 22 күн бұрын
There's so many advances coming so fast with all of this, I wonder if the real value in your content is the rubric. Or, more accurately, improving your rubric. Right now I think it's barely version 1.0, and it needs to be v5.0. And try adding a medical or legal question for gosh sakes.
@user-td4pf6rr2t
@user-td4pf6rr2t 21 күн бұрын
This is called Narrow AI.
@hqcart1
@hqcart1 21 күн бұрын
Dude, all your videos talk about beating GPT-4o, and we haven't seen it yet!
@BizAutomation4U
@BizAutomation4U 22 күн бұрын
I just read that Groq no longer wants to sell cards directly for $20K, but instead wants to offer a SaaS model? This seems to contradict the benefits of running LLMs locally for privacy reasons, because now you're sending tokens out to a 3rd-party web service. I don't know why this is an either/or decision. Lots of SMBs can afford to invest $20K in hardware. Total outlay for a serious LLM rig would have been something like $30K, which is barely half a year's salary for an entry-level position. Bad move I say, but the good news is there will be competitors that correct this decision if Groq doesn't, and soon!
@cajampa
@cajampa 22 күн бұрын
Dude, you have misunderstood how Groq works. Look into the details: you need maaaaaany cards to be able to fit a model. It is fast because there is very little but very fast memory on every card. So you need a lot of cards to fit anything useful, but then you can batch-run requests at crazy speeds against those servers.
@BizAutomation4U
@BizAutomation4U 22 күн бұрын
OK... what about the whole privacy thing, which is the reason people want to run local LLMs? If there is an iron-clad way to prove to most people that using a Groq API for inference is not going to risk sharing data with a 3rd party, you might have a great business case (it's too deep for me technically to know); otherwise you end up with a different dog with the same fleas.
@blisphul8084
@blisphul8084 21 күн бұрын
@@cajampa yeah, it seems that's the reason they offer very few models. It'd be great if you could host other models on Groq, like Qwen2, as well as any fine-tunes that you'd want to use, like Magnum or Dolphin model variants.
@cajampa
@cajampa 21 күн бұрын
@@BizAutomation4U If a business want to run Groq because they need the speed they can offer. I am pretty sure Groq can offer them an isolated instance of the servers for the right price. Groq was never about consumers running local LLM. The hardware is just not catered to this use case at all in anyway.
@cajampa
@cajampa 21 күн бұрын
@@blisphul8084 I say the same to you, if a business want to run Groq with their choice of models I am pretty sure Groq can offer it to them for the right price.
@christosmelissourgos2757
@christosmelissourgos2757 21 күн бұрын
Honestly Matthew, why do you advertise this? It has been months and we are still stuck with their free package, with a rate limit that means you can bring no app to production yet. A waste of time and integration.
@sanatdeveloper
@sanatdeveloper 22 күн бұрын
First😊
@LeandroMessi
@LeandroMessi 22 күн бұрын
Second
@tamelo
@tamelo 22 күн бұрын
Groq is terrible, worse than GPT-3. Why do you keep shilling for it?
@matthew_berman
@matthew_berman 21 күн бұрын
Groq isn’t a model, it’s an inference service. They have multiple models and offer speeds and prices that are far better than anyone else. I really like Groq.
@TheAlastairBrown
@TheAlastairBrown 21 күн бұрын
There are two different companies/products. One is Groq and the other is Grok. The one spelled with a "q" is what Matt is talking about; they are essentially a server farm designed to run 3rd-party open-source LLMs quickly so you can cheaply offload computation. The one spelled with a "k" is Elon Musk/X's version of ChatGPT.
@Tubernameu123
@Tubernameu123 22 күн бұрын
Groq is too filtered/censored... too shameful not courageous.... too weak impotent.....
@finalfan321
@finalfan321 16 күн бұрын
Too technical, too much effort, unfriendly interface.
@ManjaroBlack
@ManjaroBlack 21 күн бұрын
I finally unsubscribed. Don’t know why it took me so long.
@annwang5530
@annwang5530 22 күн бұрын
You are gaining weight?
@Kutsushita_yukino
@Kutsushita_yukino 22 күн бұрын
are you his parents?
@AI-Rainbow
@AI-Rainbow 22 күн бұрын
Is that any of your business?
@blisphul8084
@blisphul8084 21 күн бұрын
He didn't criticize, just pointed it out. Better to know earlier than late while it's easier to fix.
@annwang5530
@annwang5530 21 күн бұрын
@@blisphul8084yeah, today pointing out anything is seen as an attack cuz glass society
@Heisenberg2097
@Heisenberg2097 22 күн бұрын
Groq is nowhere near ChatGPT or Claude... and all of them need a lot of attention and are far away from SAI. Currently there is only SUPER-FLAWED and SUPER-OVERRATED.
@greenstonegecko
@greenstonegecko 21 күн бұрын
I'd need to see a benchmark for proof. These models are super nuanced. They might score 0/10 on task A but 9/10 on task B. You can't generalize to "they suck". These models can already pass the Turing Test: people cannot differentiate ChatGPT 3.5 from actual humans 54% of the time.
@ticketforlife2103
@ticketforlife2103 21 күн бұрын
Lol they are far away from AGI let alone ASI
@lancemarchetti8673
@lancemarchetti8673 21 күн бұрын
Groq is not an LLM as such, it's a booster for AI models. Like a turbo switch to get results faster. By loading your model into Groq, you save around 80% of the time you would have spent without it.
@Player-oz2nk
@Player-oz2nk 21 күн бұрын
@@lancemarchetti8673 thank you lol, I was coming to say this
@4.0.4
@4.0.4 21 күн бұрын
This is like someone saying a car dealership is nowhere near the performance of a Honda Civic in drag racing. It only communicates you're a bit new to this.
@flb5078
@flb5078 21 күн бұрын
As usual, too much coding...
@TheAlastairBrown
@TheAlastairBrown 21 күн бұрын
Copy the GitHub files and the transcript from this video into Claude. Tell it to follow the transcript and try to create what Matt is doing.
@onewhoraisesvoice
@onewhoraisesvoice 21 күн бұрын
@matthew_berman Attention, you didn't revoke keys before publishing this video!
@fnice1971
@fnice1971 14 күн бұрын
How about MoA + OpenRouter?