This is a talk that @rlancemartin gave at a few recent meetups on RAG in the era of long context LLMs. With context windows growing to 1M+ tokens, there have been many questions about whether RAG is "dead." We pull together threads from a few recent projects to take a stab at addressing this. We review some current limitations of long context LLMs in fact reasoning and retrieval (using a multi-needle-in-a-haystack analysis), and also discuss some likely shifts in the RAG landscape as context windows expand (approaches for document-centric indexing and RAG "flow engineering").
Slides:
docs.google.com/presentation/...
Highlighted references:
1/ Multi-needle analysis w/ @GregKamradt
blog.langchain.dev/multi-need...
2/ RAPTOR (@parthsarthi03 et al)
github.com/parthsarthi03/rapt...
• Building long context ...
3/ Dense-X / multi-representation indexing (@tomchen0 et al)
arxiv.org/pdf/2312.06648.pdf
blog.langchain.dev/semi-struc...
4/ Long context embeddings (@JonSaadFalcon, @realDanFu, @simran_s_arora)
hazyresearch.stanford.edu/blo...
www.together.ai/blog/rag-tuto...
5/ Self-RAG (@AkariAsai et al), C-RAG (Shi-Qi Yan et al)
arxiv.org/abs/2310.11511
arxiv.org/abs/2401.15884
blog.langchain.dev/agentic-ra...
Timepoints:
0:20 - Context windows are getting longer
2:10 - Multi-needle in a haystack
9:30 - How might RAG change?
12:00 - Query analysis
13:07 - Document-centric indexing
16:23 - Self-reflective RAG
19:40 - Summary