This is a talk that @rlancemartin gave at a few recent meetups on RAG in the era of long context LLMs. With context windows growing to 1M+ tokens, there have been many questions about whether RAG is "dead." We pull together threads from a few recent projects to take a stab at addressing this. We review some current limitations of long context LLMs in fact reasoning and retrieval (using a multi-needle-in-a-haystack analysis), and also discuss some likely shifts in the RAG landscape as context windows expand (approaches for document-centric indexing and RAG "flow engineering").
Slides:
docs.google.com/presentation/...
Highlighted references:
1/ Multi-needle analysis w/ @GregKamradt
blog.langchain.dev/multi-need...
2/ RAPTOR (@parthsarthi03 et al)
github.com/parthsarthi03/rapt...
• Building long context ...
3/ Dense-X / multi-representation indexing (@tomchen0 et al)
arxiv.org/pdf/2312.06648.pdf
blog.langchain.dev/semi-struc...
4/ Long context embeddings (@JonSaadFalcon, @realDanFu, @simran_s_arora)
hazyresearch.stanford.edu/blo...
www.together.ai/blog/rag-tuto...
5/ Self-RAG (@AkariAsai et al), C-RAG (Shi-Qi Yan et al)
arxiv.org/abs/2310.11511
arxiv.org/abs/2401.15884
blog.langchain.dev/agentic-ra...
Timepoints:
0:20 - Context windows are getting longer
2:10 - Multi-needle in a haystack
9:30 - How might RAG change?
12:00 - Query analysis
13:07 - Document-centric indexing
16:23 - Self-reflective RAG
19:40 - Summary