4-Langchain Series-Getting Started With RAG Pipeline Using Langchain Chromadb And FAISS

Views: 29,907

3 months ago

RAG is a technique for augmenting LLM knowledge with additional data.
LLMs can reason about wide-ranging topics, but their knowledge is limited to the public data up to a specific point in time that they were trained on. If you want to build AI applications that can reason about private data or data introduced after a model’s cutoff date, you need to augment the knowledge of the model with the specific information it needs. The process of bringing the appropriate information and inserting it into the model prompt is known as Retrieval Augmented Generation (RAG).
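The retrieve-then-insert idea described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the video: the `DOCUMENTS` list, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring are all assumptions made for the example, and a real pipeline would use an embedding model and a vector store such as Chroma or FAISS instead.

```python
import re

# Toy corpus standing in for private documents (hypothetical content).
DOCUMENTS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available by email from Monday to Friday.",
    "Premium accounts include priority support and extra storage.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the question.

    Word overlap is only a stand-in for embedding similarity.
    """
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Insert the retrieved context into the prompt that would go to the model."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


question = "What is the refund policy for returns?"
context = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, context)
print(prompt)
```

The resulting `prompt` string is what would be passed to an LLM; the model call itself is left out so the sketch runs without any API key.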
github: github.com/krishnaik06/Update...

Comments: 61

@krishnaik06 3 months ago

Support me by joining membership so that I can upload these kind of videos kzfaq.info/love/NU_lfiiWBdtULKOw6X0Digjoin

@cocgamingstar6990 2 months ago

Sir, we request that you kindly share the document screen too, so we can learn many more things ...

@rafikyahia7100 2 months ago

The fact that you start from scratch with dependencies and libraries is just excellent. You save beginners so much headache and confusion. Thank you very much!

@raph8240 3 months ago

You have done so much for the data science community. Some of your videos are worth more than $1000

@niranjannithiyanantham9279 3 months ago

Krish, I have been watching your videos for the past 3 days. You are just amazing, man. And you dropped this 👑

@sagarmhaisne110 3 months ago

Thanks Krish, these videos help to crack interviews!!

@Nishant-xu1ns 3 months ago

Thanks, sir, for continuing the LangChain series. Really helpful for me.

@lenovox1carbon664 3 months ago

Hi sir, I understood that the syntax for using Ollama and OpenAI is almost the same, but I think it would be better if you used Ollama whenever possible, because many of us won't be using OpenAI for learning purposes (you might also be able to fix any Ollama-related errors that we may face during our learning phase).

@nagbhushanrsubbapurmath2247 3 months ago

Hi Krish, I have been following you for the last year. I joined PwSkills and understood your ML and DL videos; you have explained everything in simple words so that a non-tech person can also understand it very well. 😊 I need to know more about Gen AI: how do we evaluate Gen AI models? Please make some videos on the evaluation of Gen AI LLM models, and also some videos on fine-tuning LLMs. Thank you in advance for the help!

@birbalkumar5541 3 months ago

I'm very excited about the next upcoming video

@sagaromar4326 2 months ago

Great content and a great way of teaching, loved it ❤ Thanks, Krish, for this wonderful content. I really needed it.

@Tech_Enthusiasts_Shubham 3 months ago

Thanks for bringing back-to-back helpful content, sir. I am truly loving your new series.

@wftrdshometoprofessionalfo142 3 months ago

Amazing man! Really Appreciated! Looking forward to watching next video!!!!!

@utkarshkapil 3 months ago

This is SO good!! Have been waiting!!!!

@adityavipradas3252 3 months ago

Going great so far. Thank you.

@r1ckmav 3 months ago

Great explanation Krish and great content

@lalaniwerake881 1 month ago

This is extremely helpful! - Thank you very much

@mahikhan5716 3 months ago

Super content, Krish. I appreciate it a lot.

@techtalksabhishek 3 months ago

Good work Krish. Keep it up

@mdfaiz4583 3 months ago

Thank you so much Krish

@Rider12374 1 month ago

Krish, please try to explain the steps you are performing and why they are required. Please do this and you would be the best teacher.

@emiliobravo1385 3 months ago

Great Video from Mexico

@captiandaasAI 3 months ago

Love and respect!!

@sheikhobada8305 3 months ago

Thanks Krish Sir

@captiandaasAI 3 months ago

Thanks a lot, Krish!!!!!!!!!!!!!

@Nishant-xu1ns 3 months ago

excellent video

@vos72 2 months ago

Absolutely fabulous tutorial. It really helped clarify some things for me. Very clear and concise, straightforward. Learned a lot -- keep your awesome videos coming! Do you mentor people by any chance?

@DivyanshChawda 2 months ago

Hi Krish, thanks for the videos. Please create a video about LangChain memory.

@sanjuladissanayake5295 3 months ago

Legend!❤

@nishantchoudhary3245 3 months ago

Waiting for next video

@rkjyoti4167 3 months ago

superb

@amritsubramanian8384 2 months ago

Gr8 video

@shreyasbs2861 3 months ago

Nice video

@user-oi6rs3fp2b 3 months ago

Hello Sir, I have a doubt about a RAG service on Azure Cosmos DB vCore: is this service available or not? Can we create an index, data source, and indexer in the Azure Cognitive Search service?

@pradeepjungkarki4510 3 months ago

Sir, please make a separate video on Chroma DB or any kind of vector database.

@AyoubMisbahi-ph3pi 3 months ago

Thanks Krish. I have a question please: How can I define an optimal chunk size and chunk overlap? Also, could you please use the Milvus database for another project?

@mohsenghafari7652 3 months ago

Thank you

@ysrinu4497 2 months ago

Hi @krish, thank you for providing such a beautiful tutorial. In the RAG pipeline, OpenAI embeddings are used to create the embeddings, but I don't have OpenAI access. As an alternative you mentioned Ollama embeddings, but when I try to import them I am unable to find the Ollama embeddings. Could you please assist with this?

@tanishhhh38 2 months ago

I had a question, Krish: when querying the vector database with the similarity search function, is the embedding API used to generate an embedding for the query, which is then compared with the existing vectors to produce the results?
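The flow asked about here (embed the query, then compare it with the stored vectors) can be illustrated with a small self-contained sketch. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words function standing in for a real embedding model, and `ToyVectorStore` is a hypothetical class, not the Chroma or FAISS implementation. Its `similarity_search` method only borrows the name used by LangChain vector stores.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stores (text, vector) pairs; vectors are computed once at indexing time."""

    def __init__(self, texts: list[str]) -> None:
        self.entries = [(t, embed(t)) for t in texts]

    def similarity_search(self, query: str, k: int = 1) -> list[str]:
        # The query goes through the same embedding function as the documents...
        query_vec = embed(query)
        # ...and is then compared against every stored vector.
        scored = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in scored[:k]]


store = ToyVectorStore([
    "Attention mechanisms let a model weigh different input tokens.",
    "Convolutional networks are widely used for image classification.",
    "Recurrent networks process sequences one step at a time.",
])
results = store.similarity_search("how does attention weigh tokens")
print(results)
```

The key design point is that documents and queries must share one embedding function; vectors produced by different models are not comparable.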

@tejakarpuramswaroop4229 3 months ago

I saw the Reka Ai model. It is an open source model. Please do some tutorials on how we can build applications based on that API.

@shivtaneja866 1 month ago

Is it chunking visualisations also?

@Vir-se2kb 1 month ago

Bro, did you check the result for the query "Who are the authors of the attention is all you need research paper?" Along with the author names, it returns some extra lines from the PDF. You did not mention how to avoid that.

@norendermoody6509 1 month ago

Can we use a prompt as a query with an OpenAI LLM model and get answers to it? Is there any project involving prompts and embeddings?

@payalbhattad8048 3 months ago

Thanks, it was great! Can you make the same with an Excel file? It would be great if we could see an example of it.

@venky433 2 months ago

@Krish, I am getting an error in the vector embedding code even though I have all the required modules:
    ## Vector Embedding and Vector Store
    from langchain_community.embeddings import OpenAIEmbeddings
    from langchain_community.vectorstores import Chroma
    db = Chroma.from_documents(documents[:1], OpenAIEmbeddings())
Note: I have installed the "onnxruntime" library but still get the same error.
Error: ValueError: The onnxruntime python package is not installed. Please install it with `pip install onnxruntime`

@ShivamGupta-qh8go 3 months ago

I am not watching the videos currently as my exams are going on, but can you please give an idea of when the series will end?

@explorewithskp1237 3 months ago

The interview questions the interviewer asked me on GenAI were:
1. Why use overlap when converting text into chunks?
2. If the overlap is 50 words, how would you know that 50 words overlap?
3. What are indexing vectors?
4. Where would you store these vectors?
5. How would you split text or PDF text into chunks?
6. What are RAG, Ollama, and Cloudea?

@shankarpentyala1660 2 months ago

Why overlap when converting into chunks: there are a few reasons to use overlapping chunks when processing text.
Capture context: overlapping chunks ensure that sentences don't get split at awkward points, preserving context between chunks. This can be important for tasks like information retrieval or question answering.
Reduce boundary issues: when searching for specific phrases, overlapping chunks can avoid missing matches that fall on chunk boundaries.
Improve retrieval: overlaps allow for more flexibility in retrieving relevant information, especially when dealing with complex or ambiguous queries.
How to know 50 words are overlapped: there are a few ways to determine the number of overlapping words between chunks.
Maintain a counter: during chunking, keep track of the number of words processed so far. When creating a new chunk, compare this counter with the previous chunk's ending position to calculate the overlap.
Use a fixed overlap size: define a fixed number of words for overlap (e.g., 50 words) and adjust chunk boundaries accordingly.
Character-level processing: if you're working with character-level models, simply count the number of overlapping characters between chunks.
What are indexing vectors: indexing vectors are dense numerical representations of text documents or terms. These vectors encode the meaning and relationships within the text using techniques like Word2Vec or GloVe.
Where to store indexing vectors: indexing vectors can be stored in various ways depending on the application.
In-memory storage: for smaller datasets and real-time applications, vectors can be kept in memory for fast access.
Vector index or database storage: for larger datasets, similarity-search libraries like FAISS or Annoy, or a vector database like Chroma, can store and manage vectors for retrieval tasks.
Distributed storage: large-scale systems might use distributed file systems like HDFS or cloud storage solutions like Amazon S3 for vector storage.
How to split text or PDF text into chunks: there are multiple ways to split text or PDF documents into chunks.
Fixed-size chunks: divide the text into chunks of a predetermined size (e.g., 500 words). This is simple but might not be optimal for capturing context.
Sentence-based chunks: split the text at sentence boundaries. This ensures that each chunk represents a complete thought.
Word-based chunks: split the text based on word boundaries. This offers more granular control but might break up context.
Sliding window: use a sliding window approach with a fixed chunk size and defined overlap to create overlapping chunks, as mentioned earlier.
RAG, Ollama, Cloudea:
RAG (Retrieval-Augmented Generation): a technique for combining retrieval systems with large language models (LLMs) to improve the factual accuracy and informativeness of the LLM's outputs.
Ollama: a tool for running open-source LLMs locally. It simplifies interacting with LLMs and building RAG applications.
Cloudea: this term is not commonly used in the context of LLMs or text processing. It might be a misspelling of "CloudML" (Google Cloud Machine Learning Engine) or a reference to a specific, less popular service.
(Answers based on asking Google's Gemini model.)
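The sliding-window approach and the "how do you know the overlap" question can be sketched as follows. This is an illustrative example under stated assumptions: `chunk_words` and `measured_overlap` are hypothetical helper names, and sizes are counted in words here, whereas LangChain's text splitters take similar `chunk_size` and `chunk_overlap` parameters measured in characters by default.

```python
def chunk_words(text: str, chunk_size: int, overlap: int) -> list[list[str]]:
    """Split text into word chunks of chunk_size, each sharing `overlap` words
    with the previous chunk (sliding window)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def measured_overlap(prev: list[str], nxt: list[str]) -> int:
    """Length of the longest suffix of prev that is also a prefix of nxt."""
    for n in range(min(len(prev), len(nxt)), 0, -1):
        if prev[-n:] == nxt[:n]:
            return n
    return 0


text = " ".join(f"w{i}" for i in range(20))  # 20 distinct placeholder words
chunks = chunk_words(text, chunk_size=8, overlap=3)
overlaps = [measured_overlap(a, b) for a, b in zip(chunks, chunks[1:])]
print(len(chunks), overlaps)
```

Because the window always advances by `chunk_size - overlap` words, the overlap between neighbouring chunks is fixed by construction; `measured_overlap` simply confirms it after the fact.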

@explorewithskp1237 2 months ago

@@shankarpentyala1660 Super Thanks 👍

@Bazor4all 3 months ago

What about the Pinecone database? I think a lot has changed recently with Pinecone. I find it difficult to do retrieval.

@RahulPrajapati-jg4dg 2 months ago

Hello sir, can you create some videos related to Ray lib with ML, DL, transformers, etc.?

@Vir-se2kb 1 month ago

Thanks for uploading videos with good content. One suggestion: before writing any line of code, it would be very helpful if you explained why you are writing it. Just looking at your left screen and typing the same thing on the right screen will not be helpful for the audience.

@jhhh9106 3 months ago

Hello sir, can you cover advanced concepts of Gen AI?

@tharunps8048 3 months ago

Can we perform RAG on Tabular Data ?

@aryansalge4508 3 months ago

yes

@tharunps8048 3 months ago

@@aryansalge4508 Will that answer any kind of query related to tabular data, like mean, median, correlation, unique categories, or groupby?

@MrAhsan99 3 months ago

@@tharunps8048 Try PandasAI

@swet_gokugod9382 14 days ago

Stuck at these lines:
    from langchain_community.embeddings import OpenAIEmbeddings
    from langchain_community.vectorstores import Chroma
    db = Chroma.from_documents(documents[:5], OpenAIEmbeddings())
Error: ImportError: cannot import name 'run_in_executor' from 'langchain_core.runnables.config'