Advanced RAG Tutorial with LlamaIndex & OpenAI GPT: Sentence Window Retrieval vs. Basic Chunking

  5,890 views

Hubel Labs

1 day ago

Correction: at 1:53, I said that an embedding is an x-digit string. That is not correct; it should be a list of x numbers.
Google Colab Code: colab.research.google.com/dri...
Link to Diagram: link.excalidraw.com/readonly/...
Why build your own retrieval-augmented generation (RAG) pipeline when OpenAI's custom GPTs can do it out of the box? Because, as of the making of this video, the OpenAI solutions do not scale to large knowledge bases. Having your own pipeline also gives you much more control over the design, which you will need if you are building an enterprise-grade, top-notch system.
In this tutorial, we talk through a number of advanced techniques such as sentence window retrieval, hierarchical auto-merging retrieval, returning top-K results vs. greedy search, and reranking.
We also work through some code and do a real comparison between basic chunking and sentence window retrieval strategies.
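To make the headline technique concrete before diving in, here is a minimal sketch of a sentence window pipeline, assuming the pre-0.10 llama_index API used in the video; the ./data folder and the example query are illustrative assumptions:

```python
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.indices.postprocessor import MetadataReplacementPostProcessor

# Split into single sentences, but store the 3 surrounding sentences on each
# side as node metadata.
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

documents = SimpleDirectoryReader("./data").load_data()
service_context = ServiceContext.from_defaults(node_parser=node_parser)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# At query time, swap each retrieved sentence for its stored window before
# handing the context to the LLM.
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[MetadataReplacementPostProcessor(target_metadata_key="window")],
)
print(query_engine.query("What happened in 1784?"))
```

The key design choice: only the single sentence is embedded and matched, but the larger window is what ultimately reaches the LLM.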

Comments: 32
@mikestaub 6 months ago
This is currently the best RAG tutorial on the internet.
@victorfeight9644 2 months ago
Clear, effective explanations. Thank you!
@sitedev 6 months ago
Great video with an awesome, easy-to-follow explanation of RAG. Reminds me of a recent Andrej Karpathy video.
@unclecode 6 months ago
Fascinating! Your approach to teaching and presenting is poetic. It is well organized, well explained, and well illustrated. Indeed, kudos to you. If I could, I would subscribe to your channel twice!
@danielvalentine132 6 months ago
Great job explaining the window, how the vector store and doc store relate, and where the window lives. I’ve been trying to understand this aspect of LlamaIndex, and you made it very clear!
@hubel-labs 6 months ago
Yeah, it took me a while to figure that out too. Glad it helped you!
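To make that relationship concrete: only the single sentence is embedded into the vector store, while the surrounding window rides along as node metadata in the doc store. A tiny sketch, assuming the pre-0.10 llama_index API and an illustrative four-sentence document:

```python
from llama_index import Document
from llama_index.node_parser import SentenceWindowNodeParser

# window_size=1 keeps one sentence on each side of the embedded sentence.
parser = SentenceWindowNodeParser.from_defaults(
    window_size=1, window_metadata_key="window"
)
nodes = parser.get_nodes_from_documents([Document(text="One. Two. Three. Four.")])

print(nodes[2].text)                # the single sentence ("Three.") that gets embedded
print(nodes[2].metadata["window"])  # roughly "Two. Three. Four." - the window lives in metadata
```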
@nazihfattal974 6 months ago
As always, a well-prepared, easy-to-follow video that delivers a lot of information and value. Thank you!
@arjoai 6 months ago
Wow, that’s a wonderful piece of advice from such a talented professional in the field. Thank you 😊
@MaliRasko 6 months ago
Your explanations and delivery are on point. Thank you for the excellent content and relaxed narration style.
@ginisksam 4 months ago
LlamaIndex has a new version, 0.10 - I'll migrate your code and learn at the same time. Thanks for introducing Sentence Window Retrieval. The most basic straight-split and retrieve/chat doesn't produce much meaning on our docs.
@mayurmoudhgalya3840 6 months ago
Another great video!
@natal10 6 months ago
Great video!! Love it
@hubel-labs 6 months ago
Thank you!!
@nunoalexandre6408 6 months ago
Love it!!!!!!!!!!!!!!!!!
@sivi3883 5 months ago
Awesome video, explained very clearly! Thanks a ton! If I may ask, what tool do you use for those visual flows? Love it!
@JOHNSMITH-ve3rq 6 months ago
Wow, darn useful!!
@sayanbhattacharyya95 5 months ago
Really cool video! Is there an "ideal" or "recommended" value of window_size?
@Work_Pavan-mu9ye 6 months ago
Excellent video. I liked the workflow you showed in the beginning. What software are you using to create this workflow?
@hubel-labs 6 months ago
Excalidraw
@peteredmonds1712 6 months ago
I'm a bit confused about the use case for re-ranking. Doesn't that defeat the purpose of the top-K search, in that we include all chunks, significantly increasing the number of tokens we use? Is the idea to do re-ranking with a smaller, cheaper LLM before sending the resultant top-K chunks to a more robust LLM?
@hubel-labs 6 months ago
Yeah, it took me a while to understand it as well. It does end up using more tokens, but I guess that can be, as you said, mitigated with a lower-spec LLM. The idea is that an embedding is not a perfect representation of the text - it's a good way to sort through millions of text chunks efficiently, but the exact ranking within the top 20, for instance, may not be as good as using the text directly. So if you were to do reranking, you might use the embeddings to return j results (where j is somewhat larger than k) and then rerank those j results using an LLM into a final top k that you then pass into your user-facing LLM.
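To illustrate that j-then-k flow in code: a minimal sketch using LlamaIndex's built-in LLMRerank postprocessor, assuming the pre-0.10 API and an existing index such as the one built in the description's sketch; j=20 and k=4 are arbitrary example values:

```python
from llama_index.indices.postprocessor import LLMRerank

# `index` is assumed to be an existing VectorStoreIndex.
query_engine = index.as_query_engine(
    similarity_top_k=20,  # j: the cheap embedding search casts a wide net
    node_postprocessors=[
        # k: an LLM reads the 20 candidates in batches of 5 and keeps the best 4
        LLMRerank(top_n=4, choice_batch_size=5)
    ],
)
```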
@peteredmonds1712 6 months ago
Small correction: embeddings are not a 1536-digit number but a vector of size 1536.
@hubel-labs 6 months ago
Yes, you are right … I was thinking of hashes for some reason! I’ll add a correction!
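For anyone who wants to verify the corrected claim, a quick check with OpenAI's Python client (the v1-style API is assumed here; text-embedding-ada-002 returns 1536 dimensions):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.embeddings.create(model="text-embedding-ada-002", input="tea history")
vector = resp.data[0].embedding
print(type(vector), len(vector))  # <class 'list'> 1536 -- a list of floats, not a digit string
```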
@goonymiami 6 months ago
Seeing your explanation at around 09:30, it seems like we can only use K windows to serve as the knowledge base to answer a prompt. What if the prompt asks for information that is contained in more than K windows? Like if I have several documents, each containing a bio of a person, and the user asks to sort those 10 people by age... how can it figure that out? I guess we can use a big value for K, if the cosine similarity engine can take it... but I am guessing providing too much context to the LLM will cost a lot of money?
@hubel-labs 6 months ago
Yeah, it wouldn’t work very well in that scenario. I wonder if perhaps there is a strategy where the LLM can determine the right K based on what it is trying to do. Or, instead of setting K to a fixed number, return all chunks where the cosine similarity is above a given threshold.
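The second idea is already expressible in LlamaIndex: a SimilarityPostprocessor drops retrieved chunks below a cosine-similarity cutoff, so K becomes an upper bound rather than a fixed count. A minimal sketch, assuming the pre-0.10 API, an existing index, and an arbitrary 0.75 cutoff:

```python
from llama_index.indices.postprocessor import SimilarityPostprocessor

# Retrieve up to 20 candidates, then keep only those above the similarity cutoff.
query_engine = index.as_query_engine(
    similarity_top_k=20,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.75)],
)
```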
@chiggly007 6 months ago
Great video! Is the diagram anywhere to reference?
@hubel-labs 6 months ago
I’ll post it up tomorrow - need to fix an error
@grabani 6 months ago
@hubel-labs Any update on the diagram, please? Thanks! Great vid, by the way.
@hubel-labs 6 months ago
I added it to the description. Here is the link: link.excalidraw.com/readonly/m6DK7oyEFpyQnuw55DVP?darkMode=true
@jannessantoso1405 5 months ago
It is 1784 in teahistory.txt and 1794 in chinahistory.txt, so it's a bit confusing, but anyway, great tutorial. Thanks!