RAG vs Context Window - Gemini 1.5 Pro Changes Everything?

17,497 views

All About AI

1 day ago

RAG vs Context Window - Gemini 1.5 Pro Changes Everything?
👊 Become a member and get access to GitHub:
/ allaboutai
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
In this video I talk about RAG vs Context Window, and whether Gemini 1.5 Pro with its 1M-10M context window is challenging RAG.
00:00 RAG vs Context Window Intro
00:20 Context Window
02:15 RAG
03:52 Gemini 1.5 Pro Context
05:26 Groq 500 t/s
07:25 RAG vs Context Window Examples
07:51 Price
10:28 Multimodal
11:04 RAG Use Case
12:23 Conclusion

Comments: 36
@The-Rest-of-Us
@The-Rest-of-Us 2 months ago
People have been so preoccupied with laughing off Google's AI misses that everyone completely missed how Google might actually have been silently taking over the AI lead.
@avi7278
@avi7278 2 months ago
Kris, with Groq, speed gains are specifically limited to output tokens. There are no similar speed gains for input tokens. They discuss this.
@FredPauling
@FredPauling 2 months ago
The videos on this channel are so relatable and accessible, with a friendly vibe. Looking forward to the next one.
@avi7278
@avi7278 2 months ago
Asking a question like "what does this text mean?" in RAG is entirely pointless. If you have a basic RAG setup, which I surmise you do, it's going to find the chunk of text closest to your input "what does this text mean", which has no relation at all to the intention of your question. So it's going to find a random chunk of your text and then tell you what that random chunk means. In a RAG setup like this, the LLM has no context of what "this text" is. In order for this to work you would need to add another layer of inference where the LLM is given context that there is a corpus of text it can search, and instead of "this text" you reference the corpus the same way: "What does the entire corpus of text mean?". It would then need to use its reasoning abilities to generate one or more search queries that would allow it to retrieve enough text that it could then apply your actual intention and answer the question. Even with this additional context and query-generation layer, a question like "what does the entire corpus of text mean?" would be difficult, because there is no definite or obvious set of queries it could make to retrieve most of the text, nor can it even know whether the query results cover all of the text. Imagine a book with five chapters, and you ask it what this text means. What set of queries does it generate to retrieve all five chapters without even knowing how many chapters there are? In RAG you need to give it that additional context. This experiment is really quite pointless and doesn't illustrate the capability of RAG systems.
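The failure mode described above can be sketched with a toy retriever. The bag-of-words "embedding" and the sample chunks here are illustrative stand-ins, not the setup from the video; a real RAG pipeline would use a learned embedding model and a vector database:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real RAG uses a learned model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Chapter 1: the whale is first sighted off the coast.",
    "Chapter 2: the crew debates what this text means to them.",
    "Chapter 3: a storm scatters the fleet.",
]
query = "what does this text mean"

# Naive RAG retrieves the chunk most similar to the QUERY string,
# not the chunk most relevant to the user's actual intent.
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
print(best)  # matches chapter 2, which merely echoes the query's words
```

The retriever lands on whichever chunk happens to share surface vocabulary with the question, which is exactly why "what does this text mean?" returns an arbitrary passage rather than a summary of the whole corpus.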
@yorgohoebeke
@yorgohoebeke 2 months ago
Do you have any good videos / tutorials on this topic to recommend?
@avi7278
@avi7278 2 months ago
@yorgohoebeke The LangChain YouTube channel has a really good 9-part series called "RAG from Scratch". I would start there; each part is about 5-10 minutes and very digestible. Tell ChatGPT that you want to learn RAG starting with the basics, ask it to generate a crash-course study list of topics for learning RAG, then take that list and throw it into a Perplexity search. Go through all the resources it finds (websites and YouTube videos) that you find helpful, and then repeat that process recursively with the various parts of the RAG implementation steps until you're an expert. Voilà.
@adventurelens001
@adventurelens001 2 months ago
Not even 3 minutes in and I've already learned new things. Thanks for this!
@TheZumph
@TheZumph 2 months ago
It's racist
@avgplayer
@avgplayer 2 months ago
Thanks, this was very helpful
@kate-pt2ny
@kate-pt2ny 2 months ago
Thank you for your wonderful videos, keep up with the hot topics
@BillyRybka
@BillyRybka 2 months ago
Kris, I'd love to see what tools you use (if any) for the content creation process
@InnocenceVVX
@InnocenceVVX 1 month ago
Great stuff. What route would you advise to go from an unstructured individual PDF to a structured JSON output with the help of an API?
@julian-fricker
@julian-fricker 2 months ago
I think RAG is still needed; I want local models that access my personal data. For security I cannot be sending my personal data to Google or OpenAI. But for use cases like understanding git repositories, this is a game changer.
@micbab-vg2mu
@micbab-vg2mu 2 months ago
In my case, a broader context window changed everything. I am using your prompts that you presented 6 months ago, and by including text for analysis as part of the prompt, the accuracy of the output is almost 100%.
@jakey8
@jakey8 2 months ago
Is there a usable limit to the amount of text you can enter into a prompt? Or do you link to local txt files?
@micbab-vg2mu
@micbab-vg2mu 2 months ago
@jakey8 Currently, I analyze medical publications ranging from 10 to 20 pages of text at a time using ChatGPT-4. However, with Gemini 1.5, I will be able to analyze an entire book.
@darrylrogue7729
@darrylrogue7729 2 months ago
Fantastic video! Really interesting with it understanding a codebase in context!
@TheWorldGameGeneral
@TheWorldGameGeneral 2 months ago
Is there no link to the Discord? Was wondering if you could also explain how to convert the code you made to transformer.js and such, like the realtime faster-whisper from a month ago.
@FelipeDiPaula
@FelipeDiPaula 2 months ago
🎯 Key Takeaways for quick navigation:
00:00 *📈 Intro to RAG and the Context Window: Explaining the difference between RAG and the context window in language models.*
- A model with an 8K-token context window has to fit both the input and the output tokens,
- Example of how the context window can limit the inclusion of relevant information.
02:05 *🧩 How RAG Works: How RAG solves the limited-context-window problem.*
- Text tokens are turned into vector embeddings stored in a database,
- The user query is compared against the stored texts to find the closest match.
03:57 *🚀 Impressions of Gemini 1.5 Pro: Discussion of Gemini 1.5 Pro's capabilities.*
- Ability to process entire codebases and identify urgent issues,
- Comparison with RAG in terms of context and efficiency.
05:04 *💻 Groq News and Processing Speed: Introduction of Groq's new hardware.*
- Processing of 500 tokens per second, a significant leap in speed,
- Potential to combine high speed with advanced language models.
06:09 *💰 Costs and Efficiency: Cost-benefit analysis of RAG vs Gemini 1.5.*
- Discussion of the price difference between API calls for different models,
- Speculation about future cost reductions and their impact on model choice.
07:34 *🧪 Practical Tests with RAG and Context: Comparative experiments between RAG and in-context use.*
- Demonstration of how different questions yield different answers across the approaches,
- Analysis of the strengths and weaknesses of each approach.
10:32 *🎥 Multimodal Features of Gemini 1.5 Pro: Exploring Gemini 1.5 Pro's multimodal capabilities.*
- Ability to process long videos and answer specific queries based on moments in the video,
- Comparison with RAG's ability to handle multimedia.
11:54 *🤔 Final Thoughts and Future Outlook: Closing reflections on RAG and Gemini 1.5 Pro.*
- Discussion of ideal use cases for each technology,
- Expectations for future innovations and OpenAI's response to Gemini 1.5 Pro's advance.
Made with HARPA AI
@user-bd8jb7ln5g
@user-bd8jb7ln5g 2 months ago
I made a similar comment a couple of days ago about the Gemini context window replacing RAG. However, we don't know how DeepMind implemented the new context window, specifically how costly it is. I personally hope it's something like a split, overlapping series of smaller context windows - don't quote me on this, I have no idea if that is technically possible.
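The "split, overlapping series of smaller context windows" idea the commenter floats resembles sliding-window chunking. A minimal sketch, with arbitrary illustrative window and overlap sizes; this is pure speculation and not how DeepMind implemented Gemini 1.5:

```python
def sliding_windows(tokens, window=8, overlap=2):
    # Yield overlapping spans so each window shares `overlap` tokens
    # with its neighbor, preserving some cross-window context.
    step = window - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + window]

tokens = list(range(20))
for w in sliding_windows(tokens):
    print(w)  # windows start at 0, 6, 12; each repeats its neighbor's tail
```

Each window sees only `window` tokens, so long-range dependencies spanning more than one window would still be lost unless the model adds some other mechanism on top.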
@sardormamarasulov3352
@sardormamarasulov3352 2 months ago
Could you give a link on how to use Gemini 1.5 Pro?
@JNET_Reloaded
@JNET_Reloaded 2 months ago
Is this a local model or Google's? You didn't put any links on how to do this.
@deino4753
@deino4753 2 months ago
Am I correct in my assumption that RAG is necessary for fetching information not in context? So an increased context window would be greatly beneficial to a RAG model, because you can create larger documents that can be pulled into context for generating appropriate answers. I don't see how it's an either-or between in-context and RAG.
@googleyoutubechannel8554
@googleyoutubechannel8554 2 months ago
In my testing of Gemini Pro, it doesn't behave at all like it has a huge context window; it 'forgets' details constantly. Curiously, Gemini behaves very much like my personal experiments with auto-summarization and RAG...
@somdudewillson
@somdudewillson 2 months ago
Gemini _1.5_ Pro is the model with the big context window, and it isn't generally available yet.
@googleyoutubechannel8554
@googleyoutubechannel8554 2 months ago
@somdudewillson Correct
@brentmoreno3773
@brentmoreno3773 2 months ago
I literally just had this conversation with my data engineer. What happens when we hit 1-billion or even 1-trillion-token context windows? Unstructured data just gets ingested, structured, and retrieved in real time at nearly unlimited rates.
@XOXO-dv5vv
@XOXO-dv5vv 2 months ago
Actually, the longer the context window, the more computation the model needs to perform, so it's really not computationally feasible. Then again, with bigger compute power nothing is impossible.
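The computation concern above holds for vanilla self-attention, whose score matrix grows quadratically with sequence length. A rough back-of-the-envelope sketch, ignoring constants and optimizations such as sparse or streaming attention (the model dimension of 1024 is an arbitrary illustration):

```python
def attention_score_ops(seq_len, d_model=1024):
    # Computing QK^T alone costs seq_len^2 * d_model multiply-adds
    # per attention layer in vanilla (dense) self-attention.
    return seq_len ** 2 * d_model

base = attention_score_ops(8_000)      # an 8K-token context window
big = attention_score_ops(1_000_000)   # a 1M-token context window
print(big / base)  # 125x the tokens -> 15625.0x the attention score work
```

This is why 1M-token windows are widely assumed to need something cheaper than dense attention, though Google has not published what Gemini 1.5 actually uses.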
@knthyl
@knthyl 2 months ago
I trust this man because of his beard
@GrigoriyMa
@GrigoriyMa 2 months ago
Share the method of using artificial intelligence to count cards using the High-Low system and win at blackjack in a casino.
@drlordbasil
@drlordbasil 1 month ago
So for now this video is essentially RAG versus Riches?
@yurijmikhassiak7342
@yurijmikhassiak7342 2 months ago
COST and SPEED, that is the QUESTION.
@brentmoreno3773
@brentmoreno3773 2 months ago
Costs will continue to decrease just as speed will increase, assuming all trend lines remain constant.
@user-uc2qy1ff2z
@user-uc2qy1ff2z 2 months ago
Nice technology, yet Google tortured Gemma into near idiocy with their woke shit. We need better fine tune to be able to appreciate that LLM.
@kamelirzouni4730
@kamelirzouni4730 2 months ago
Thanks!