The Largest Mamba LLM Experiment Just Dropped

37,371 views

bycloud

Days ago

Check out HubSpot's ChatGPT at work bundle! clickhubspot.com/2os
A long-awaited sequel in LLM research has appeared: AI21Labs has dropped the biggest Mamba experiment yet, and it is on par with other open-source LLMs! Just with a few twists...
Original Mamba Paper
[Paper] arxiv.org/abs/2312.00752
[Code] github.com/state-spaces/mamba
MambaFormer
[Paper] arxiv.org/pdf/2402.04248.pdf
AI21Labs
[Blog] www.ai21.com/blog/announcing-...
[Huggingface] huggingface.co/ai21labs/Jamba...
[NVIDIA NIM] nvda.ws/3Jn5pxb
VideoMamba
[Paper] arxiv.org/abs/2403.06977
[Code] github.com/OpenGVLab/VideoMamba
Special thanks to LDJ for helping out with the content in this video!
This video is supported by the kind Patrons & KZfaq Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner
[Discord] / discord
[Twitter] / bycloudai
[Patreon] / bycloud
[Music] Massobeats - Lush
[Profile & Banner Art] / pygm7
[Video Editor] Silas
0:00 Intro
1:16 HubSpot
2:24 Jamba
8:08 VideoMamba

Comments: 70
@bycloudAI 3 months ago
Check out HubSpot's ChatGPT at work bundle here: clickhubspot.com/2os. Unfortunately, topping the last Mamba edit is way too hard, but I guess now at least we know *_mamba is real_*
@rounaksen1683 3 months ago
Have you seen Google's Griffin and Hawk?
@vinc6966 3 months ago
If Mamba does not scale well, we still have diffusion models for text
@thipoktham5164 3 months ago
Why not both?
@dolcruz6838 3 months ago
Would be interesting to see the infinite context from the "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper explained.
@farrael004 3 months ago
Ikr. I wonder why that paper didn't get more traction
@sascha_becker 3 months ago
Jamba Mamba ¡Ay, caramba!
@user-zu6wg9wt8m a month ago
well said
@vongolashodaime1975 3 months ago
Hey, would you be interested in making a video about ponydiffusion?
@kolkoki 3 months ago
Isn't pony diffusion just a latent diffusion foundation model, like stable diffusion?
@vongolashodaime1975 a month ago
@kolkoki I've got no clue about any of that, sorry. I just know that, at least back then, pony revolutionized accuracy for character LoRAs and made generations of already existing characters so much more accurate than other checkpoints.
@bernard-ng 3 months ago
wait.... this is not a @fireship video, damn
@sunshine19088 19 days ago
close enough
@beerbytes9895 3 months ago
@fireship game up your memes, this boy is strapped to the teeth.
@svendpai 3 months ago
love your memes so much
@RealTwiner 3 months ago
I don't watch this channel much, but I did see that epic mamba short in one of your videos and it has been ingrained in my mind ever since.
@drexon88 3 months ago
Everyone is combining models rn. Some people combined NeRF and GS and that worked as well. I guess ML will become just a mixer of architectures, at least for some commercial devs
@OxygenGenesis 3 months ago
Love your video essays, good and easy to understand and nice for catching up on SOTA methods.
@zzzzzzz8473 3 months ago
Appreciate these videos. The main thing I've heard regarding Mamba vs. transformers is that optimizations within transformers are still being discovered in abundance. Quantization alone is massive in enabling the networks to run on average hardware, and the ridiculousness of 1.58-bit quantization working is incredible, whereas with Mamba no quantization is available.
@OfficialNierto 3 months ago
could we use it through Ollama?
@mrrespected5948 3 months ago
Very nice
@logangarcia 3 months ago
so is this cheaper than Mistral 7B? ❤
@erickmarin6147 3 months ago
I'm trying to write BitNet layers for Verilog
@JackCrossSama 3 months ago
we need one called Mongoose
@dsgda153 3 months ago
Oh god. How much of a memelord can you be?! The "can you get much higher" right after the lobotomy? I love you man.
@edhofiko7624 2 months ago
so what's next? Kalman filter with learned dynamics?
@akaanoone6939 3 months ago
If you enjoy KZfaq and it pays the bills then sure, but play it safe so you don't make life much harder than necessary. Plus you might be able to do research at the same time and present it to people in a more consumable form
@tnguyen8633 3 months ago
dank af
@JosephCatrambone 3 months ago
Isn't mashing together RNNs and Transformers just RWKV?
@cvs2fan 3 months ago
wait a sec bycloud still makes videos? :V
@hakimehamdouchi7468 2 months ago
so.... still waiting on the GGUF file, eh?
@Metruzanca 3 months ago
The part on Jamba honestly sounds like someone making shit up with fake words, but that's actually all real. The "Microservices" video by KRAZAM is now reality.
@Kazekoge101 3 months ago
what happened with Hyena?
@rasuru_dev 3 months ago
Gemma 7B competing with Llama 70B, Mixtral, and Jamba, damn. Scale that thing up
@Ivan.Wright 3 months ago
Every time I hear Mamba I can only think of the Python CLI
@jessedbrown1980 a month ago
Obviously. I published in December of 2023: Anchoring_Global_Security_Autonomous_Shipping_with_Mind_Reading_AI_GPT-core_and_MAMBA-_core_Agents_RAG-Fusion_AI_Communities_Hive-_AI_and_the_Human_Psyche #mindreading #AI #agent cores #Mamba2 and GPT4, 5 and sequential models #IDE
@JorgetePanete 3 months ago
7:17 LLM Models live inside ATM Machines
@TerrinX 3 months ago
The Mambaaaaaaa the Mamba is reaaaaaaaaaaaaallllllll
@lobiqpidol818 3 months ago
Nah bro infini attention is where it's at
@diadetediotedio6918 3 months ago
3:36 It would still be good for people wanting small models to run on very cheap devices without needing all the quality, no?
@smellthel 3 months ago
we live in the future bros
@user-fr2jc8xb9g 3 months ago
Man, I'm tired of waiting for GPT-5, what are they waiting for?
@VisionaryPathway 3 months ago
They're currently red-teaming the model
@user-fr2jc8xb9g 3 months ago
@VisionaryPathway thanks for answering! How long do you think it will take until release?
@VisionaryPathway 3 months ago
@user-fr2jc8xb9g personally, I think it’s releasing anytime within the next 4-12 weeks (my own opinion/prediction)
@annaczgli2983 3 months ago
Why copy Fireship's thumbnails? Sad, man.
@joshford256 3 months ago
There's no way you think someone can own the format of "character on the right highlighting big text on the left"??? Thumbnails are like, the least important part of a video when you watch it as a viewer, but it's the most important part when it comes to grabbing viewers' attention. Why shouldn't you use other creators' ideas on what works, when that's not where your creative input is, and it's super important to know you have a successful thumbnail style?
@pizzadog9876 3 months ago
Who cares, we're here for him, not his thumbnail
@iceshadow487 3 months ago
He's been making thumbnails in this style for 2+ years now. It's not copying, and it never will be. It's fine to take inspiration from other people when you like their work. And have you considered that he could have also just had this idea himself? It's extremely common for multiple people to have essentially the exact same idea.
@Injazz1 3 months ago
Thumbnails look similar because there are literally common guidelines that are proven to improve the reach of any YT video, either by being easier on the eyes or because the algorithm picks them for the trending tab
@NeostormXLMAX 3 months ago
Didn't Fireship copy this guy?
@jerrydaboss1 3 months ago
329th view. Can I get a heart?
@user-up5kn1ix6v 3 months ago
1st
@googleyoutubechannel8554 a month ago
In the next improvement paper... they're going to suggest a 'hybrid architecture' where you skip the Mamba layer entirely....
@user-up5kn1ix6v 3 months ago
First
@frazuppi4897 3 months ago
nobody really uses vanilla attention in LLMs, so like most of what Mamba says is BS
@ariseyhun2085 3 months ago
It's extremely obvious that the thumbnails are replicas of Fireship's. I know you're trying to grow your channel, but it's a little off-putting
@dfsgjlgsdklgjnmsidrg 3 months ago
this dude is copying fireship
@ikartikthakur 29 days ago
maybe he's his otosan
@j0hnr3x 3 months ago
Please stop copying fireship content and thumbnails
@Teapot_418 3 months ago
Pathetic @fireship ripoff.