RAG vs Context Window - Gemini 1.5 Pro Changes Everything?

17,497 views

All About AI

1 day ago

RAG vs Context Window - Gemini 1.5 Pro Changes Everything?
👊 Become a member and get access to GitHub:
/ allaboutai
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
In this video I talk about RAG vs Context Window, and whether Gemini 1.5 Pro with its 1M-10M context window is challenging RAG.
00:00 RAG vs Context Window Intro
00:20 Context Window
02:15 RAG
03:52 Gemini 1.5 Pro Context
05:26 Groq 500 t/s
07:25 RAG vs Context Window Examples
07:51 Price
10:28 Multimodal
11:04 RAG Use Case
12:23 Conclusion

Comments: 36
@The-Rest-of-Us
@The-Rest-of-Us 2 months ago
People have been so preoccupied with laughing off Google's AI misses that everyone completely missed how Google might actually have been silently taking over the AI lead.
@avi7278
@avi7278 2 months ago
Kris, with Groq, speed gains are specifically limited to output tokens. There are no similar speed gains for input tokens. They discuss this.
@FredPauling
@FredPauling 2 months ago
The videos on this channel are so relatable and accessible, with a friendly vibe. Looking forward to the next one.
@avi7278
@avi7278 2 months ago
Asking a question like "what does this text mean?" in RAG is entirely pointless. If you have a basic RAG setup, which I surmise you do, it's going to find the chunk of text closest to your input "what does this text mean", which has no relation at all to the intention of your question. So it's going to find a random chunk of your text and then tell you what that random chunk means. In a RAG setup like this, the LLM has no context of what "this text" is. In order for this to work you would need to add another layer of inference where the LLM is given context that there is a corpus of text it can search, and instead of "this text" you reference the corpus the same way: "What does the entire corpus of text mean?". It would then need to use its reasoning abilities to generate one or more search queries that would allow it to retrieve enough text that it could then apply your actual intention and answer the question. Even with this additional context and query-generation layer, a question like "what does the entire corpus of text mean?" would be difficult, because there is no definite or obvious set of queries it could make to retrieve most of the text, nor can it even know whether the query results cover all of the text. Imagine a book with five chapters, and you ask it what this text means. What set of queries does it generate to retrieve all five chapters without even knowing how many chapters there are? In RAG you need to give it that additional context. This experiment is really quite pointless and doesn't illustrate the capability of RAG systems.
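The failure mode described above can be sketched with a toy retriever. The bag-of-words "embedding" and the sample chunks here are illustrative stand-ins, not the setup from the video; a real RAG pipeline would use a learned embedding model and a vector database:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real RAG uses a learned model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Chapter 1: the whale is first sighted off the coast.",
    "Chapter 2: the crew debates what this text means to them.",
    "Chapter 3: a storm scatters the fleet.",
]
query = "what does this text mean"

# Naive RAG retrieves the chunk most similar to the QUERY string,
# not the chunk most relevant to the user's actual intent.
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
print(best)  # matches chapter 2, which merely echoes the query's words
```

The retriever lands on whichever chunk happens to share surface vocabulary with the question, which is exactly why "what does this text mean?" returns an arbitrary passage rather than a summary of the whole corpus.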
@yorgohoebeke
@yorgohoebeke 2 months ago
Do you have any good videos / tutorials on this topic to recommend?
@avi7278
@avi7278 2 months ago
@yorgohoebeke The LangChain YouTube channel has a really good 9-part series called "RAG from Scratch". I would start there; each part is about 5-10 minutes and very digestible. Tell ChatGPT that you want to learn RAG starting with the basics, ask it to generate a crash-course study list of topics for learning RAG, then take that list and throw it into a Perplexity search. Go through all the resources it finds (websites and YouTube videos) that you find helpful, and then repeat that process recursively with the various parts of the RAG implementation steps until you're an expert. Voilà.
@adventurelens001
@adventurelens001 2 months ago
Not even 3 minutes in and I've already learned new things. Thanks for this!
@TheZumph
@TheZumph 2 months ago
It's racist
@avgplayer
@avgplayer 2 months ago
Thanks, this was very helpful
@kate-pt2ny
@kate-pt2ny 2 months ago
Thank you for your wonderful videos, keep up with the hot topics
@BillyRybka
@BillyRybka 2 months ago
Kris, I'd love to see what tools you use (if any) for the content creation process
@InnocenceVVX
@InnocenceVVX 1 month ago
Great stuff. What route would you advise to go from an unstructured individual PDF to a structured JSON output with the help of an API?
@julian-fricker
@julian-fricker 2 months ago
I think RAG is still needed; I want local models that access my personal data. For security I cannot be sending my personal data to Google or OpenAI. But for use cases like understanding git repositories, this is a game changer.
@micbab-vg2mu
@micbab-vg2mu 2 months ago
In my case, a broader context window changed everything. I am using your prompts that you presented 6 months ago, and by including text for analysis as part of the prompt, the accuracy of the output is almost 100%.
@jakey8
@jakey8 2 months ago
Is there a usable limit to the amount of text you can enter into a prompt? Or do you link to local txt files?
@micbab-vg2mu
@micbab-vg2mu 2 months ago
@jakey8 Currently, I analyze medical publications ranging from 10 to 20 pages of text at a time using ChatGPT-4. However, with Gemini 1.5, I will be able to analyze an entire book.
@darrylrogue7729
@darrylrogue7729 2 months ago
Fantastic video! Really interesting with it understanding a codebase in context!
@TheWorldGameGeneral
@TheWorldGameGeneral 2 months ago
Is there no link to the Discord? Was wondering if you could also explain how to convert the code you made to transformer.js and such, like the realtime faster-whisper from a month ago.
@FelipeDiPaula
@FelipeDiPaula 2 months ago
🎯 Key Takeaways for quick navigation:
00:00 *📈 Intro to RAG and the Context Window: Explaining the difference between RAG and the context window in language models.*
- A model with an 8K-token context window has to fit both the input and the output tokens,
- Example of how the context window can limit the inclusion of relevant information.
02:05 *🧩 How RAG Works: How RAG solves the limited-context-window problem.*
- Text tokens are turned into vector embeddings stored in a database,
- The user query is compared against the stored texts to find the closest match.
03:57 *🚀 Impressions of Gemini 1.5 Pro: Discussion of Gemini 1.5 Pro's capabilities.*
- Ability to process entire codebases and identify urgent issues,
- Comparison with RAG in terms of context and efficiency.
05:04 *💻 Groq News and Processing Speed: Introduction of Groq's new hardware.*
- Processing of 500 tokens per second, a significant leap in speed,
- Potential to combine high speed with advanced language models.
06:09 *💰 Costs and Efficiency: Cost-benefit analysis of RAG vs Gemini 1.5.*
- Discussion of the price difference between API calls for different models,
- Speculation about future cost reductions and their impact on model choice.
07:34 *🧪 Practical Tests with RAG and Context: Comparative experiments between RAG and in-context use.*
- Demonstration of how different questions yield different answers across the approaches,
- Analysis of the strengths and weaknesses of each approach.
10:32 *🎥 Multimodal Features of Gemini 1.5 Pro: Exploring Gemini 1.5 Pro's multimodal capabilities.*
- Ability to process long videos and answer specific queries based on moments in the video,
- Comparison with RAG's ability to handle multimedia.
11:54 *🤔 Final Thoughts and Future Outlook: Closing reflections on RAG and Gemini 1.5 Pro.*
- Discussion of ideal use cases for each technology,
- Expectations for future innovations and OpenAI's response to Gemini 1.5 Pro's advance.
Made with HARPA AI
@user-bd8jb7ln5g
@user-bd8jb7ln5g 2 months ago
I made a similar comment a couple of days ago about the Gemini context window replacing RAG. However, we don't know how DeepMind implemented the new context window, specifically how costly it is. I personally hope it's something like a split, overlapping series of smaller context windows - don't quote me on this, I have no idea if that is technically possible.
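The "split, overlapping series of smaller context windows" idea the commenter floats resembles sliding-window chunking. A minimal sketch, with arbitrary illustrative window and overlap sizes; this is pure speculation and not how DeepMind implemented Gemini 1.5:

```python
def sliding_windows(tokens, window=8, overlap=2):
    # Yield overlapping spans so each window shares `overlap` tokens
    # with its neighbor, preserving some cross-window context.
    step = window - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + window]

tokens = list(range(20))
for w in sliding_windows(tokens):
    print(w)  # windows start at 0, 6, 12; each repeats its neighbor's tail
```

Each window sees only `window` tokens, so long-range dependencies spanning more than one window would still be lost unless the model adds some other mechanism on top.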
@sardormamarasulov3352
@sardormamarasulov3352 2 months ago
Could you give a link on how to use Gemini 1.5 Pro?
@JNET_Reloaded
@JNET_Reloaded 2 months ago
Is this a local model or Google's? You didn't put any links on how to do this.
@deino4753
@deino4753 2 months ago
Am I correct in my assumption that RAG is necessary for fetching information not in context? So an increased context window would be greatly beneficial to a RAG model, because you can create larger documents that can be pulled into context for generating appropriate answers. I don't see how it's an either-or between in-context and RAG.
@googleyoutubechannel8554
@googleyoutubechannel8554 2 months ago
In my testing of Gemini Pro, it doesn't behave at all like it has a huge context window; it 'forgets' details constantly. Curiously, Gemini behaves very much like my personal experiments with auto-summarization and RAG...
@somdudewillson
@somdudewillson 2 months ago
Gemini _1.5_ Pro is the model with the big context window, and it isn't generally available yet.
@googleyoutubechannel8554
@googleyoutubechannel8554 2 months ago
@somdudewillson Correct
@brentmoreno3773
@brentmoreno3773 2 months ago
I literally just had this conversation with my data engineer. What happens when we hit 1-billion or even 1-trillion-token context windows? Unstructured data just gets ingested, structured, and retrieved in real time at nearly unlimited rates.
@XOXO-dv5vv
@XOXO-dv5vv 2 months ago
Actually, the longer the context window, the more computation the model needs to perform, so it's really not computationally feasible. Then again, with bigger compute power nothing is impossible.
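The computation concern above holds for vanilla self-attention, whose score matrix grows quadratically with sequence length. A rough back-of-the-envelope sketch, ignoring constants and optimizations such as sparse or streaming attention (the model dimension of 1024 is an arbitrary illustration):

```python
def attention_score_ops(seq_len, d_model=1024):
    # Computing QK^T alone costs seq_len^2 * d_model multiply-adds
    # per attention layer in vanilla (dense) self-attention.
    return seq_len ** 2 * d_model

base = attention_score_ops(8_000)      # an 8K-token context window
big = attention_score_ops(1_000_000)   # a 1M-token context window
print(big / base)  # 125x the tokens -> 15625.0x the attention score work
```

This is why 1M-token windows are widely assumed to need something cheaper than dense attention, though Google has not published what Gemini 1.5 actually uses.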
@knthyl
@knthyl 2 months ago
I trust this man because of his beard
@GrigoriyMa
@GrigoriyMa 2 months ago
Share the method of using artificial intelligence to count cards using the High-Low system and win at blackjack in a casino.
@drlordbasil
@drlordbasil 1 month ago
So for now this video is essentially RAG versus Riches?
@yurijmikhassiak7342
@yurijmikhassiak7342 2 months ago
COST and SPEED, that is the QUESTION.
@brentmoreno3773
@brentmoreno3773 2 months ago
Costs will continue to decrease just as speed will increase, assuming all trend lines remain constant.
@user-uc2qy1ff2z
@user-uc2qy1ff2z 2 months ago
Nice technology, yet Google tortured Gemma into near idiocy with their woke shit. We need better fine tune to be able to appreciate that LLM.
@kamelirzouni4730
@kamelirzouni4730 2 months ago
Thanks!