Talk to Your Documents, Powered by Llama-Index

  81,327 views

Prompt Engineering

1 day ago

In this video, we build a chat-with-your-documents system using Llama-Index. I explain the concepts behind Llama-Index, with a focus on understanding the Vector Store Index.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee: ko-fi.com/prom...
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.co...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/e...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
Google Colab: tinyurl.com/hm...
llama-Index Github: github.com/jer...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
[00:00] What is Llama-index
[01:10] System Architecture
[02:54] Llama-index Setup
[04:54] Loading Documents
[05:42] Creating the Vector Store Index
[06:16] Creating Query Engine
[07:06] Q&A Over Documents
[09:00] How to Persist the Index
[10:20] What is inside the Index?
[11:38] How to change the default LLM
[13:25] Change the Chunk Size
[14:26] Use Open Source LLM with Llama Index
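
Below is a minimal end-to-end sketch of the workflow the timestamps above walk through: loading documents, building a Vector Store Index, querying it, persisting it to disk, and overriding the default LLM and chunk size. It assumes the pre-1.0 llama_index package layout that was current around the time of the video; the data/storage paths, model name, and chunk settings are illustrative placeholders, not values taken from the notebook.

```python
from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    ServiceContext,
    load_index_from_storage,
)
from llama_index.llms import OpenAI

# Load every file in ./data into Document objects.
documents = SimpleDirectoryReader("data").load_data()

# Optional: override the defaults (LLM, chunking) the way the video describes.
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo"),
    chunk_size=800,
    chunk_overlap=20,
)

# Chunk the documents, embed the chunks, and build the in-memory vector index.
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Ask questions over the documents.
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))

# Persist the index so it does not have to be rebuilt on every run...
index.storage_context.persist(persist_dir="./storage")

# ...and reload it later.
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)
```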

Comments: 107
@engineerprompt • 10 months ago
Want to connect? 💼 Consulting: calendly.com/engineerprompt/consulting-call 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering 🔴 Join Patreon: Patreon.com/PromptEngineering ▶ Subscribe: www.youtube.com/@engineerprompt?sub_confirmation=1
@mlg4035 • 10 months ago
The SINGLE MOST valuable YT video I have come across on this topic. BRAVO!! And thank you!
@engineerprompt • 10 months ago
Glad it was helpful!
@rehberim360 • 10 months ago
Again, a great video. While I was trying to figure out how to learn this technology and where I could find reliable sources, it was lucky for me to find such up-to-date information.
@gregorykarsten7350 • 10 months ago
Excellent. I was wondering what the difference between LangChain and LlamaIndex was. I also think LlamaIndex is very powerful with its indexing functionality. This can bridge the gap between semantic and index search.
@s.moneebahnoman • 8 months ago
Amazing! I haven't seen enough videos talking about persisting the index, especially in beginner-level tutorials. I think it's such a crucial concept that I found out about much later. Love the flow of this and it's perfectly explained! Liked and subbed!
@dario27 • 1 month ago
Finally a good tutorial on the subject! Thanks so much!
@regonzalezayala • 1 month ago
🎯 Key points for quick navigation:
00:00 💡 Introduction to Llama-Index - building a document Q&A system using Llama-Index; comparison with LangChain and a brief overview of functionality; emphasis on fine-tuning embedding models for better performance.
01:19 📑 Document Processing and Embedding - converting documents into chunks and computing embeddings; creating a semantic index and storing the embeddings; querying the document by computing embeddings for user questions.
02:57 🛠️ Initial Setup and Code Implementation - installing the necessary packages (Llama-Index, OpenAI, Transformers, Accelerate); setting up the environment and loading the document with SimpleDirectoryReader; overview of creating the vector store index.
05:17 🧩 Implementing the Query Engine and Basic Queries - building a query engine; querying the documents with sample questions; obtaining and displaying the model's responses.
08:42 🛠️ Customizing Configuration and Parameters - customizing chunk sizes, LLM models, and other parameters; persisting vector stores for future use; a detailed look at the embedding and document storage components.
11:43 🔧 Advanced Customization and LLM Usage - changing the LLM, including GPT-3.5 Turbo and Google PaLM; setting chunk sizes and overlaps; using open-source LLMs from Hugging Face and configuring the corresponding parameters.
16:37 🚀 Conclusion and Future Prospects - summary of using Llama-Index for document Q&A systems; mention of advanced features and future tutorial plans; encouragement to check out additional resources and support via Patreon.
Made with HARPA AI
@aseemasthana4121 • 9 months ago
Perfect pace and level of knowledge. Loved the video.
@engineerprompt • 9 months ago
Glad you liked it!
@ChronozOdP • 10 months ago
Another excellent video. Easy to follow and up to date. Thank you and keep it up!
@rajmankad2949 • 10 months ago
The explanation is so clear! Thank you.
@abdullahiahmad3244 • 8 months ago
To be honest, this is the best tutorial I've seen in 2023.
@engineerprompt • 8 months ago
Thank you 😊
@xt3708 • 10 months ago
I learn so much from you!
@engineerprompt • 10 months ago
Thank you
@kayasaz • 2 months ago
Great explanation
@MikewasG • 10 months ago
Thanks for sharing! It's very helpful.
@adamduncan6579 • 10 months ago
Excellent videos! They're really helping out with my work. Curious what tool you are using to draw the system architecture? I really like the way it renders the architectures.
@KinoInsight • 10 months ago
I liked your explanation. You are a good storyteller. You explained the details in a simple yet easy-to-implement manner. Thank you. I look forward to your next video. But how do we ensure the latest data is fed to the LLM in real time? In this case, we need to provide the data to the model, and the response is limited to the data provided.
@MarshallMelnychuk • 10 months ago
Great explanation and comparison, very useful. Thank you.
@ghostwhowalks2324 • 10 months ago
Thanks for simplifying this.
@vladimirgorea8714 • 10 months ago
What's the difference between using LlamaIndex and just using OpenAI embeddings?
@Elyes-sj8kb • 6 months ago
LlamaIndex offers a tool set for implementing RAG and uses an embedder; an embedder converts a piece of text into a vector.
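
To make the reply above concrete, here is a tiny sketch of computing an embedding with Llama-Index's default OpenAI embedder. The import path follows the pre-1.0 llama_index layout and may differ in newer releases; the sentence being embedded is arbitrary.

```python
# Sketch: turn a piece of text into a vector with the default OpenAI embedder.
from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding()
vector = embed_model.get_text_embedding("Llama-Index builds RAG pipelines.")
print(len(vector))  # dimensionality of the embedding vector
```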
@henkhbit5748 • 10 months ago
Nice intro to LlamaIndex 👍. I think for a small number of documents the default LlamaIndex embedding store (JSON) is sufficient. I suppose you can also use ChromaDB, Weaviate, or other vector stores. Would be nice to see it with a non-default vector store...
@xt3708 • 10 months ago
Yeah, maybe a video comparing different stores: when to use which, strengths/weaknesses, etc.
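
For anyone who wants to try a non-default vector store as suggested in this thread, here is a rough sketch of plugging Chroma into Llama-Index. Import paths follow the pre-1.0 llama_index layout, and the collection name and storage path are placeholders.

```python
import chromadb
from llama_index import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.vector_stores import ChromaVectorStore

# Persistent Chroma client and a collection to hold the embeddings.
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection("docs")

# Wrap the collection so Llama-Index writes embeddings into Chroma
# instead of its default in-memory/JSON store.
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(index.as_query_engine().query("What are the key points?"))
```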
@TeamDman • 10 months ago
Thanks for sharing!
@jamesvictor2182 • 7 months ago
Great video. The notebook fails at the first hurdle for me: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
@user-dp9lj1ew6k • 10 months ago
This is awesome. I'm going to try it.
@CaesarEduBiz-lz2cg • 10 months ago
Is it better to use LlamaIndex or RAG (Retrieval Augmented Generation)?
@HarmeetSingh-ry6fm • 1 month ago
Hi, as you mentioned in this video that this is the system prompt for StableLM, I want to know: is there a way I can find the prompt format for different LLMs, for example Mixtral 8x7B/22B or Llama 3?
@DixitNitish • 10 months ago
Great video. How can I compare 100s of documents using LlamaIndex, and will it know which chunk belongs to which document when answering the question? Also, how do you make sure all the pieces that should be in one chunk stay together? For instance, if there is a table that goes across two pages, that should still be in one chunk.
@42svb58 • 10 months ago
3:18 "replace the openai llm with open-source..." 😁😁😁😁😁
@nishkarve • 7 months ago
Excellent. Is there a video you are planning to make on multi-modal RAG? I have a PDF which is an instruction manual. It has text and images. When a user asks a question, for example, "How do I connect the TV to external speakers?", it should show the steps and the images associated with those steps. Everywhere I see examples of image "generation". I don't want to generate images; I just want to show what's in the PDF based on the question.
@nickwoolley733 • 10 months ago
Will you be testing the new Mistral-7B-v0.1 and Mistral-7B-Instruct-v0.1 LLMs? They claim to outperform Llama 2. 😊
@engineerprompt • 10 months ago
Yes
@user-gq6ol1di3t • 10 months ago
How can I chat with a PDF document that contains mathematical equations and derivations?
@engineerprompt • 10 months ago
You will have to use something like Nougat from Meta to convert those into Markdown, and then you can use LlamaIndex as shown here.
@matten_zero • 10 months ago
Much more intuitive than LangChain
@Gingeey23 • 10 months ago
Great video - thanks for sharing. Do you reckon they will implement the ability to use local LLMs with these embeddings, and if so, is there any plan to update LocalGPT to include the option of using LlamaIndex instead of LangChain vector stores? Cheers!
@AlignmentLabAI • 10 months ago
They should be compatible with local models out of the gate; in fact, I believe LlamaIndex is local-models-first, hence "llama".
@wangbei9 • 10 months ago
LlamaIndex, previously known as GPT Index, has been around since March. It is built on top of LangChain. I would like to compare it with your LocalGPT to check the performance. Also, for document-related applications, OCR is often needed to extract the text first.
@engineerprompt • 10 months ago
That's a good idea. I will do a quick implementation of LocalGPT with LlamaIndex. I agree about OCR and have been experimenting with the unstructured package for dealing with PDF files.
@mmdls602 • 10 months ago
LocalGPT with LlamaIndex? What do you mean? It's a bit confusing with all the models and the reuse of the same words in the ecosystem. It would be nice if you could explain the ecosystem and where these things fit in. Thank you, great videos.
@bertobertoberto3 • 10 months ago
Nice tutorial! How would you associate document chunks with the actual document the chunks came from? Let's say I have a 500-page PDF. Now I split it into 500 documents, one per page, then I apply LlamaIndex to chunk each page. How do I know that chunk 46kjkjh belongs to page 5?
@engineerprompt • 10 months ago
You can add metadata to each chunk, which will carry that information.
@bertobertoberto3 • 10 months ago
@@engineerprompt Oh, thank you. Is this in the documentation somewhere? An example would be invaluable.
@bongimusprime7981 • 10 months ago
@@bertobertoberto3 Each source node in the response will have the text used, as well as a reference to the source doc id and its metadata, e.g.: response.source_nodes[0].node.text, response.source_nodes[0].node.ref_doc_id, response.source_nodes[0].node.metadata
@bertobertoberto3 • 10 months ago
@@bongimusprime7981 awesome
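
To round out this thread, a small sketch of attaching page metadata to each document and reading it back from the source nodes of a response. The API names follow the pre-1.0 llama_index layout; the file name and page texts are hypothetical.

```python
from llama_index import Document, VectorStoreIndex

# Pretend each page of the PDF has already been extracted as a string.
pages = ["text of page 1...", "text of page 2..."]

# One Document per page, with the page number carried as metadata.
documents = [
    Document(text=text, metadata={"page": i + 1, "source": "manual.pdf"})
    for i, text in enumerate(pages)
]

index = VectorStoreIndex.from_documents(documents)
response = index.as_query_engine().query("Which page mentions the warranty?")

# Every chunk the answer was built from points back to its page.
for source_node in response.source_nodes:
    print(source_node.node.metadata, source_node.node.ref_doc_id)
```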
@angloland4539 • 10 months ago
@am1rsafavi-naini356 • 9 months ago
I don't have a credit card, but I will buy a coffee for you some day (maybe in person, who knows :)
@engineerprompt • 9 months ago
Thanks 🙏
@Rahul-zq8ep • 9 months ago
Is this production-ready code? What important points should we keep in mind to build a similar app for a production environment?
@trobinsun9851 • 10 months ago
What about the security of our data? What if it's a confidential document? Thanks for your excellent videos.
@engineerprompt • 10 months ago
Look at the LocalGPT project; you can run everything locally. Nothing leaves your system.
@Shogun-C • 10 months ago
When attempting to run the index line of code, I get an AuthenticationError saying my API key is incorrect, even though I've copied it straight across from my OpenAI account as a newly generated key. Any idea where I'm going wrong?
@draganamilosheska3702 • 8 months ago
Did you fix that? I have the same problem.
@AI_Expert_007 • 3 months ago
Thanks for the clear explanation. Could you please share the name of the tool you used to create the workflow diagram?
@engineerprompt • 3 months ago
It's called Excalidraw
@AI_Expert_007 • 3 months ago
@@engineerprompt Thanks a lot!
@y2knoproblem • 10 months ago
This is very helpful. Where can we find the system architecture diagram?
@hernandocastroarana6206 • 10 months ago
Excellent video. Do you know what the best option is to run this code behind an interface? I moved the code to VS Code and then ran it in Streamlit, but it gives me some problems. I appreciate your help.
@hamtsammich • 10 months ago
Could you do a tutorial about how to do this locally? I'm very interested in LlamaIndex, but I'm wary of using things that aren't on my local hardware.
@engineerprompt • 10 months ago
Sure, will do
@hamtsammich • 10 months ago
@@engineerprompt Sweet. Yeah, I've been trying to figure out how to try this locally with my PDFs. I've got a 3090 and I've been excited about LLMs, but I haven't managed it yet.
@rizwanat7496 • 6 months ago
I am at the VectorStoreIndex.from_documents cell. It's been running for about 24 hours now. How do I know when it will end? I am running it locally on my laptop. The output shows completion of batches, almost 2,400+ batches, but it doesn't show how many are left. Can somebody help? My data consists of 850+ JSON files, about 70 MB overall.
@sidhaarthsredharan3318 • 5 months ago
I'm doing the same, but indexing every node created. There are around 5,000 nodes and it's taking a long time. Is there some progress bar (like tqdm) code I can add to see how long the indexing process will take?
@engineerprompt • 5 months ago
I don't think there is a progress bar by default. You might be able to add a callback though.
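
One possible workaround for the progress-bar question above: recent llama-index releases accept a show_progress flag when building the index, which prints tqdm-style bars while nodes are parsed and embedded. A short sketch, not verified against every version:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()

# show_progress=True asks llama-index to display tqdm progress bars while it
# parses nodes and computes embeddings, so long indexing runs are visible.
index = VectorStoreIndex.from_documents(documents, show_progress=True)
```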
@dimitripetrenko438 • 7 months ago
Hi bro, cool video! May I ask if there is a way to store a quantized model with LlamaIndex? It's very painful to quantize it every single time I try to run it.
@dineshbhatotia8783 • 6 months ago
I'm getting this error: ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.
@udithweerasinghe6402 • 6 months ago
Do you have code to do the same without OpenAI, using some model from Hugging Face?
@user-jp5uy5ok7g • 5 months ago
Can I build a multilingual chatbot using LlamaIndex?
@ismaelnoble • 9 months ago
How does LlamaIndex compare with the LocalGPT method?
@Macventure • 10 months ago
How can this be modified to run locally on a GPU, without OpenAI?
@AlignmentLabAI • 10 months ago
As a baseline, you can run vLLM with the OpenAI API spec on the server, then point the OpenAI API base URL and API key variables used in the scripts at it via your environment variables.
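
A rough sketch of the setup described in the reply above: an OpenAI-compatible vLLM server for the LLM plus a local Hugging Face embedding model, so nothing is sent to OpenAI. Class names and import paths vary across llama-index releases, and the endpoint URL and model name are placeholders.

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex, ServiceContext
from llama_index.llms import OpenAILike

# Point the OpenAI-style client at a local vLLM server instead of api.openai.com.
llm = OpenAILike(
    api_base="http://localhost:8000/v1",  # hypothetical local vLLM endpoint
    api_key="not-needed-locally",         # vLLM does not validate the key
    model="mistralai/Mistral-7B-Instruct-v0.1",
    is_chat_model=True,
)

# Use a local embedding model so embeddings also stay on your machine.
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(index.as_query_engine().query("Summarize the document."))
```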
@zearcher4633 • 8 months ago
Can you make a video on how to create a website chatbot out of all of this? Say we used this video and made a chatbot to talk with our data, how do we use it on our website?
@anuvratshukla7061 • 10 months ago
If I'm using Weaviate, how do I load it?
@gitinit3416 • 10 months ago
Awesome ❤ A little off-topic question: would you be so kind as to share the app you are using for making diagrams? It's sick... I've been looking for something like that for quite a while now, but with no luck 🥺
@engineerprompt • 10 months ago
Here is the one I use: excalidraw.com/
@med-lek • 10 months ago
I have a question: when I ask the same question, why do I get a different answer?
@nazihfattal974 • 10 months ago
I tried this in my Colab Pro account and the session crashed when I built the vector store: out of GPU memory, even though Colab allocated 16GB of VRAM. Would you please add an option for using Hugging Face-hosted LLMs through their free Inference API (applies to select models)? Thanks for a great video.
@test12382 • 6 months ago
Can this help an LLM parse HTML?
@vostfrguys • 10 months ago
If, let's say, I put in the transcripts of a TV show, for example all the Stargate shows, would it be able to generate an episode with an x, y, z theme?
@huyvo9105 • 8 months ago
Hi, can I get the link to your pipeline drawing please?
@Udayanverma • 8 months ago
Do you have any example of a model on a personal desktop/server? I don't wish to publish my content to ChatGPT or any internet service.
@engineerprompt • 8 months ago
Check out my LocalGPT project
@jersainpasaran1931 • 10 months ago
Information is always magic!!! Would it be possible to integrate ChatGPT 3.5 Turbo where the user requests it? Does it allow searching documents in different languages, or is it necessary to load a specific library for each language? Thanks as always.
@engineerprompt • 10 months ago
Thank you 🙏 If you are using GPT-3.5, then the content can be in different languages. You will have to instruct GPT to auto-detect the language and process it.
@sachinkalsi9516 • 10 months ago
Thanks for the video. I'm facing big latency issues (15+ minutes) while loading the stored index. How can I improve this? There are 100k+ vectors. Going with a NumPy array takes only a few minutes!
@engineerprompt • 10 months ago
You probably want to explore another vector store
@niranandkhedkar3681 • 10 months ago
Hey @engineerprompt, can you please make a video on this? I'm facing the same challenge of reducing the response time of LlamaIndex. @sachinkalsi9516 Did you find any solution for this? Any help is appreciated from everyone, thanks.
@devikasimlai4767 • 2 months ago
14:25
@vitalis • 10 months ago
How does it compare to Quivr as an AI second brain?
@hassentangier3891 • 10 months ago
Is there a way to run it without requiring an OpenAI key?
@jaysonp9426 • 10 months ago
You would still need an API call to a hosted LLM, and an embeddings model to do the embeddings.
@anilpgonade8503 • 10 months ago
If I upload a doc of 50,000 words, how much will it cost?
@toannguyenngoc8209 • 10 months ago
The OpenAI API has limits and you must pay to use it. Can you give an example of building the chat without the API?
@engineerprompt • 10 months ago
Yes, coming soon
@toannguyenngoc8209 • 10 months ago
@@engineerprompt That's great, I'll look forward to it.
@Nihilvs • 10 months ago
hmmmm...
@elizonfrankcarcaustomamani4999 • 5 months ago
Hello, help with this please. When I execute the line 'index = VectorStoreIndex.from_documents(documents)', after 1 minute I get a 429 error (insufficient_quota). I checked whether the OPENAI_API_KEY variable was registered with '!export -p', and it is. Thanks.
@googleyoutubechannel8554 • 7 months ago
This looked very promising, but your Colab (like 99% of all notebooks) is broken right off the bat; it doesn't even install its dependencies. FYI, 'best practice' (really a hard requirement) for any notebook is to pin the specific versions of all dependencies, or you just have junk that won't run within days to weeks: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
@googleyoutubechannel8554 • 7 months ago
Also, the second code block has a critical error, which suggests the code has never been run even once: os["OPENAI_API_KEY"] = will throw an error. The correct Colab form is: os.environ['OPENAI_API_KEY'] =
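
A small, hedged variant of the fix described above that avoids pasting the key directly into a notebook cell:

```python
import os
from getpass import getpass

# Prompt for the key at runtime instead of hard-coding it in the notebook.
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
```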
Understanding Embeddings in RAG and How to use them - Llama-Index
16:19
Prompt Engineering
35K views
Python RAG Tutorial (with Local LLMs): AI For Your PDFs
21:33
pixegami
194K views
Python Advanced AI Agent Tutorial - LlamaIndex, Ollama and Multi-LLM!
53:57
Langchain vs LlamaIndex vs OpenAI GPTs: Which one should you use?
9:00
What's AI by Louis-François Bouchard
16K views
ADVANCED Python AI Agent Tutorial - Using RAG
40:59
Tech With Tim
136K views
The Future of Knowledge Assistants: Jerry Liu
16:55
AI Engineer
71K views
Mistral-7B with LocalGPT: Chat with YOUR Documents
8:14
Prompt Engineering
52K views
How is THIS Coding Assistant FREE?
5:19
Alex Ziskind
145K views