
Understanding Embeddings in RAG and How to use them - Llama-Index

35,556 views

Prompt Engineering

1 day ago

In this video, we will take a deep dive into the world of embeddings and understand how to use them in a RAG pipeline in Llama-Index. First, we will understand the concept, and then we will look at how to use different embeddings, including OpenAI embeddings and open-source embeddings (BGE and Instructor embeddings), in Llama-Index. We will also benchmark their speed.
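For readers who want to see roughly what this looks like in code, here is a minimal sketch of swapping embedding models in a Llama-Index pipeline (not the exact Colab notebook from the video; the import paths, the data folder, and the specific BGE model ID are assumptions that depend on your llama-index version):

    # Minimal sketch: choosing an embedding model for a llama-index RAG pipeline.
    # Assumes a recent llama-index release (0.10+) plus the separate
    # llama-index-embeddings-openai / llama-index-embeddings-huggingface packages.
    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding

    documents = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder folder

    # Option 1: OpenAI embeddings (remote API call, billed per token)
    Settings.embed_model = OpenAIEmbedding()

    # Option 2: open-source BGE embeddings, computed locally
    # Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

    index = VectorStoreIndex.from_documents(documents)
    print(index.as_query_engine().query("What are embeddings?"))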
CONNECT:
☕ Buy me a Coffee: ko-fi.com/prom...
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.co...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/e...
LINKS:
Google Colab: tinyurl.com/mr...
llama-Index RAG: • Talk to Your Documents...
How to chunk Documents: • LangChain: How to Prop...
llama-Index Github: github.com/jer...
TIMESTAMPS:
[00:00] Intro
[01:21] What are Embeddings
[03:58] How they Work!
[05:54] Custom Embeddings
[08:30] OpenAI Embeddings
[09:33] Open-Source Embeddings
[10:45] BGE Embeddings
[11:42] Instructor Embeddings
[11:57] Speed Benchmarking

Comments: 62
@syedhussainabedi476 9 months ago
An exemplary and crystal-clear exposition of a complex subject that leaves no room for confusion.
@sivi3883 9 months ago
Thanks! This is a great video for understanding RAG. I agree chunking and embedding play a major role. In my experience so far, most of the time the LLM (GPT-4 in my case) answers questions from my data well if the quality of the chunks is good! The challenges I have faced so far with chunking: 1) My PDFs contain a lot of content with complex tabular structures (with merged rows and columns) for product specifications, and chunking breaks the relationship between rows and columns. 2) The same kind of content is replicated across different PDFs for different products. Unfortunately the PDFs are not named by product, so the vector search returns content from the wrong product, not the one asked about in the user query. 3) Sometimes within the same PDF (containing multiple products), the content repeats with different specifications per product. If I ask for the input voltage of product A, it might return product B, since the context is lost while chunking. Looking for smarter ways to chunk that retain context across chunks.
@s.moneebahnoman 8 months ago
It's one thing to be good at what one does, but being good at teaching it to someone else is next level. The best playlist for fully understanding the *essence* of what's happening. Amazing!
@xd-mk3by 10 months ago
This is an absolutely OUTSTANDING video that summed up so many things so well. Thank you for it!
@dario27 1 month ago
Your tutorials are objectively the best.
@engineerprompt 1 month ago
Thank you 😊
@Vermino 10 months ago
Amazing tutorial. I barely understand this stuff, but you walking me through it helps me understand it a bit more. Plus I am able to recreate your process and test things out myself.
@engineerprompt 10 months ago
Glad to hear that!
@uhtexercises 10 months ago
Thank you for sharing. I really like your structured approach and explanations. Very well done
@engineerprompt 10 months ago
Thanks, glad it was helpful
@ilianos 10 months ago
Great educational value! I'm really looking forward to the comparison video for different embeddings.
@mayurmoudhgalya3840 10 months ago
Good thing this showed up on my feed. You got yourself another subscriber.
@engineerprompt 10 months ago
Welcome aboard!
@MikewasG 10 months ago
The video is awesome! Can't wait for the next one!
@gsayesh 7 months ago
Superb Lesson... Thank you! ♥
@RichardGetzPhotography 10 months ago
Excellent content!! Thank you!
@li-yq7rc 10 months ago
Need an index of all topics of prompt engineering to understand where to begin.
@AccioLumas 10 months ago
Yes
@killerthoughts6150 9 months ago
persist and you will go very big, rooting for you
@engineerprompt 9 months ago
Thank you 🙏
@DrRizzwan 1 month ago
Good explanation 👏
@engineerprompt 1 month ago
Thank you 🙂
@chandrakalagowda3129 10 months ago
Very useful video on the topic
@engineerprompt 10 months ago
Glad you liked it
@hernandocastroarana6206 10 months ago
Excellent video. Thank you very much. I will wait for the next one on the subject 🦾
@patrickblankcassol4354 10 months ago
Thank you for awesome content
@livb4139 10 months ago
Thanks, embedding was like black magic to me
@ramesh_a 10 months ago
🎯 Key Takeaways for quick navigation:
01:23 📚 Embeddings are multi-dimensional feature vectors that represent words or sentences in a semantic space, preserving their meaning.
04:06 🧩 Embeddings are crucial in retrieval augmented generation systems to find the closest text chunks based on user queries.
05:44 🚀 The choice of embedding model is vital as it directly impacts the performance of the response generation in document-based chat systems.
08:45 🔄 OpenAI embeddings can be used for document retrieval but come with a cost, while various open-source embeddings provide alternatives.
13:31 ⏱️ Local embedding models like BGE and Instructor are faster for computations compared to remote OpenAI embeddings, which involve server calls.
Made with HARPA AI
@Megh_S 8 months ago
Please make a video on alternatives to OpenAI LLMs as well.
@ayansrivastava731 3 months ago
The only problem lies in this question: 1) When an LLM is trained (not by us, but by the ones who made it), the input embedding matrix is already fixed. 2) What, then, is the need for creating external embeddings in the first place? Is there a way we can reuse the model's own embeddings, and if not, why? We could very well tokenize our input text, and the model would take care of looking up the corresponding embeddings. I know it won't fit with the RAG logic, but then the counter-question is: how are you sure performance won't be affected when using custom embeddings (for storage in the vector DB) versus the kind of embeddings generated per token within the LLM itself? Does using RAG embeddings mean we are bypassing the LLM's embeddings?
@mioszdaek1583 6 months ago
Great video. Thanks for sharing. Just a little comment about analogies by vector arithmetic: at 2:44 you said that if you subtract the vector for man from the vector for king you would get the vector for woman, when in fact the resulting vector would represent something more like royalty, I think. Thanks
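The widely cited form of this analogy is king - man + woman ≈ queen: the difference vector captures a direction in the space rather than the woman vector itself. A quick, illustrative way to check this with pretrained GloVe vectors via gensim (the model name below is simply one that gensim's downloader provides; it is not from the video):

    # Check the classic word-vector analogy with pretrained GloVe vectors.
    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-100")  # downloads ~100 MB on first use
    # king - man + woman should land near "queen"
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))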
@metaljacket8102 1 month ago
Your video is better than those made by llamaindex!
@engineerprompt 1 month ago
thanks :)
@42svb58 10 months ago
awesome video!
@joxxen 10 months ago
As always, you will never find bad-quality content from this guy.
@paultoensing3126 3 months ago
When you click on the next sequence of code lines, how is it that that stuff magically materializes? Do you copy and paste it from someplace else as obviously you don’t take the time to write it all out like a mortal. Where do these lines of code come from?
@timtensor6994 9 months ago
Thanks for the summary. I've been trying to follow your videos. Have you tried running llama-index with the Mistral 7B model and Instructor embeddings? Is there already a Colab notebook and video?
@arkodeepchatterjee 10 months ago
please make the video comparing different embedding models
@Drone256 6 months ago
You started off creating an embedding for a sentence, but it appears each chunk is more than a sentence. Would love to know more about how you decide how to chunk the data.
@KokahZ777 4 months ago
You didn’t explain how to compare query embeddings with dataset embeddings
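The comparison step is usually plain cosine similarity between the query embedding and each stored chunk embedding. A minimal, illustrative sketch with numpy (the vectors here are random placeholders standing in for real embedding-model output):

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine similarity: dot product divided by the product of the norms.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    chunk_embeddings = np.random.rand(5, 384)  # placeholder: 5 stored chunks, 384-dim vectors
    query_embedding = np.random.rand(384)      # placeholder: embedded user query

    scores = [cosine_similarity(query_embedding, c) for c in chunk_embeddings]
    top_k = np.argsort(scores)[::-1][:2]       # indices of the 2 most similar chunks
    print(top_k, [scores[i] for i in top_k])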
@andrewandreas5795 10 months ago
Awesome video! One question: does this give better results than your standard Pinecone/Chroma approach?
@engineerprompt 10 months ago
Pinecone/Chroma are vector stores for storing the embeddings. What type of semantic search you use will have an impact.
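For context, a rough sketch of plugging Chroma in as the vector store behind a LlamaIndex index (import paths, the database path, and the collection name are assumptions and vary by llama-index version):

    import chromadb
    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
    from llama_index.vector_stores.chroma import ChromaVectorStore

    # Persistent local Chroma database; path and collection name are placeholders.
    client = chromadb.PersistentClient(path="./chroma_db")
    collection = client.get_or_create_collection("docs")

    vector_store = ChromaVectorStore(chroma_collection=collection)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)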
@wuzhao8605 10 months ago
Is LlamaIndex better than LangChain? If so, what are the contributors to the improved performance?
@uwegenosdude 4 months ago
Great video. Thanks a lot! Could you perhaps recommend a good free embedding model for German-language documents?
@engineerprompt 4 months ago
The models from mistral.ai/ support German; they even have their own embeddings, which I think also support German.
@adityasharma2667 1 month ago
How do you find that chunk_size=800 and overlap=20 are good numbers?
@engineerprompt 1 month ago
Chunk size and overlap are really document dependent. I think OpenAI uses similar parameters in their Assistants by default.
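For reference, a sketch of setting those parameters explicitly with llama-index's SentenceSplitter (class and argument names assume a recent llama-index release; the data folder is a placeholder):

    from llama_index.core import SimpleDirectoryReader
    from llama_index.core.node_parser import SentenceSplitter

    # Split documents into roughly 800-token chunks with a 20-token overlap.
    splitter = SentenceSplitter(chunk_size=800, chunk_overlap=20)
    documents = SimpleDirectoryReader("./data").load_data()
    nodes = splitter.get_nodes_from_documents(documents)
    print(len(nodes), "chunks")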
@vinven7 10 months ago
The section on Embeddings with all the cool visualizations, where is that from? If it's from somewhere else, could you please share the link?
@engineerprompt 10 months ago
that's my own :)
@GaneshKumar-jw9ml 10 months ago
Can you make a video on the chat element in Streamlit which uses a prompt template?
@haroonmansi 10 months ago
Thanks! Any idea how "chunking" or embeddings will be different if we are dealing with Python code instead of English? For example, I want to use the RAG method with Code Llama or CodeWizard for my GitHub repo containing Python code.
@engineerprompt 10 months ago
Check this out: kzfaq.info/get/bejne/l6pdqJOY0Z-Xp4E.html
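One option for code-aware chunking (illustrative, not necessarily what the linked video uses) is llama-index's CodeSplitter, which splits on syntax via tree-sitter rather than on sentences; the repo path and parameters below are placeholders:

    from llama_index.core import SimpleDirectoryReader
    from llama_index.core.node_parser import CodeSplitter

    # Requires the tree-sitter / tree-sitter-languages packages for parsing.
    splitter = CodeSplitter(language="python", chunk_lines=40, chunk_lines_overlap=10)
    documents = SimpleDirectoryReader("./my_repo", required_exts=[".py"], recursive=True).load_data()
    nodes = splitter.get_nodes_from_documents(documents)
    print(len(nodes), "code chunks")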
@-blackcat-4749 7 months ago
📕 The demonstrated occurrence is insipid. A usual scene
@ayushyadav-bm2to 7 months ago
I am a med student. I've made a RAG on the best books I can get, and to be honest I now use it more than Google to understand a topic. Used Mistral and Chroma DB, btw.
@-blackcat-4749 7 months ago
This is 🔓 a run-of-the-mill adventure. A predictable event
@paultoensing3126 3 months ago
How do you pick your embedding models? Most of us have no context for understanding what’s valuable in a model or what the criteria would be. So you slide over that as if we should know how to pick an embedding model.
@yth2011 10 months ago
What is the difference between embeddings and LoRA?
@zedcodinacademibychinvia9481 8 months ago
How can I perform the same task with Gemini?
@engineerprompt 8 months ago
Watch my latest video :)
@zedcodinacademibychinvia9481 8 months ago
Which one sir
@paultoensing3126 3 months ago
I recommend that you just put code in your thumbnail and see how that sells. That’ll make it super uncool.