LlamaParse: Convert PDF (with tables) to Markdown

Views: 10,177

Days ago

In this video tutorial, you'll learn how to parse a PDF file and convert it into a markdown file using an API from LlamaIndex. This method allows you to parse more complex parts of the PDF, such as tables, which can be a headache when using simple methods like OCR.
Useful links:
👉 Notebook: colab.research...
☎️ Get something like this for your company: link.alejandro...
💬 Join the Discord Help Server: link.alejandro...
❤️ Buy me a coffee... or a beer (thanks): link.alejandro...
✉️ Get the Newsletter: link.alejandro...
Timestamps:
0:00 - Introduction
2:02 - What is LlamaParse?
3:43 - Setup
6:41 - Parse the PDF
11:40 - Add a prompt to the parser
15:10 - Conclusion
Learn how to parse a PDF file and convert it to markdown using the LlamaParse API in this step-by-step tutorial. This method uses generative AI during the ingestion process to help you understand your document better, especially when dealing with tabular data. LlamaParse supports various file types, including PDF, PowerPoint, and Word documents, and offers a generous free plan of 1,000 pages per day.
By the end of this video, you'll know how to install LlamaParse, download and parse your PDF file, and export the markdown file. You'll also learn how to add a prompt to LlamaParse to summarize or perform other actions on your document.
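The install, parse, and export steps described above can be sketched roughly as follows. This is a minimal sketch, not the code from the video's notebook: it assumes the `llama-parse` package is installed (`pip install llama-parse`) and that the key is stored in the `LLAMA_CLOUD_API_KEY` environment variable. The helper names `get_api_key`, `join_pages`, and `pdf_to_markdown`, and any file paths you pass in, are hypothetical.

```python
import os


def get_api_key(env=os.environ):
    """Return the LlamaCloud API key, or raise a clear error if it is missing."""
    key = env.get("LLAMA_CLOUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "Set LLAMA_CLOUD_API_KEY to the key created at cloud.llamaindex.ai"
        )
    return key


def join_pages(page_texts):
    """Join the Markdown text of each returned document into one string."""
    return "\n\n".join(t.strip() for t in page_texts if t.strip()) + "\n"


def pdf_to_markdown(pdf_path, md_path):
    """Parse a PDF with LlamaParse and write the result to a Markdown file."""
    # Imported lazily so the helpers above work without the package installed.
    from llama_parse import LlamaParse

    parser = LlamaParse(api_key=get_api_key(), result_type="markdown")
    documents = parser.load_data(pdf_path)  # uploads the file to LlamaCloud
    markdown = join_pages(doc.text for doc in documents)
    # Explicit UTF-8 avoids platform-dependent default encodings.
    with open(md_path, "w", encoding="utf-8") as f:
        f.write(markdown)
    return markdown
```

Calling `pdf_to_markdown("report.pdf", "report.md")` (placeholder file names) would send the PDF to the hosted service, so it needs network access and a valid key.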

Comments: 59

@AI-mm6rf 24 days ago

thanks for the video, yes please make more videos on llama-index and llama-parse

@alejandro_ao 23 days ago

coming up!!!

@LookNumber9 13 days ago

For this particular document, if you're using Windows, you'll need encoding='utf-8' as the last parameter in the "with open()" statement. I don't think Macs have this issue.
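A small sketch of the encoding tip above: on Windows, `open()` in text mode typically defaults to the locale encoding (for example cp1252) rather than UTF-8, so characters outside that code page can raise `UnicodeEncodeError` when the markdown is written. Passing `encoding="utf-8"` explicitly makes the behavior the same on every platform. The sample text and the `parsed.md` file name are made up for illustration.

```python
import os
import tempfile


def save_markdown(text, path):
    """Write Markdown with an explicit encoding instead of the OS default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_markdown(path):
    """Read the file back with the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Sample table containing characters that are not plain ASCII.
sample = "| Item | Price |\n|---|---|\n| Café “special” | 3 € |\n"

path = os.path.join(tempfile.mkdtemp(), "parsed.md")
save_markdown(sample, path)
round_trip = load_markdown(path)
```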

@youngchrisyang 2 months ago

Awesome content as always, thanks Alejandro!

@alejandro_ao 2 months ago

thanks chris!

@heaven8450 12 days ago

Hey Alejandro, on your channel you have done many projects using GenAI concepts, but you haven't done a video on multimodal RAG yet. Could you kindly do a project on multimodal RAG?

@priyadoesdatascience5141 1 month ago

thanks a lot! this is very helpful. if you figure out how to create RAG when you have charts and graphs in a document, please share with us also

@antoniommota31 2 months ago

Awesome. Thank you for your classes. I would like to see something similar, but local

@alejandro_ao 2 months ago

indeed, i realize there's a lot of demand for local implementations. i'm working on it, you should see it up soon!

@rodrigobogado653 2 months ago

Bro, I follow your videos, what you do is really good, it helped me a lot with my work, so I joined your channel!

@alejandro_ao 2 months ago

hey man, thank you so much!! so glad to have you on board and very happy to know that you have found my work useful :) let me know if i can help you with anything :)

@rodrigobogado653 2 months ago

@@alejandro_ao Come on bro, I'm just doing a cancer project and although it's not similar to the dataset you used, the example you gave in other videos was one of the things I also used as a guide. When it is well prepared I will show it to you so you can give an opinion.

@see-pelso 2 months ago

Great video. How do you think this would perform with PDFs containing scans or lots of images? Another pain point in my use cases is recognising scanned text, for example taking a fully scanned book and transforming it to MD to use it with LLMs.

@user-jm8ng3eo7m 1 month ago

Yeah, llamaparse is good for tables in PDFs, but it can often fail on complex scanned PDFs or complex tables. Do you know other, better options or the SOTA solution for RAG over tables in PDFs? I will be very grateful!

@samcavalera9489 2 months ago

Many thanks bro! Fantastic video as always! I was wondering if you could give me some advice: My work primarily involves using RAG on scientific papers (let's say hundreds!), which often include figures that sometimes convey more information than the text itself. Can LlamaParse analyse the figures and add a description of them to the markdown file? (That will literally create LLM professors via RAG!) If not, is there a technique to incorporate these figures into the vector database along with the paper’s text? Essentially, for multi-modal vector embedding that includes both text and images, what’s the best approach to achieve this? I greatly appreciate your insight 🙏🙏🙏

@alejandro_ao 2 months ago

hey sam, i'm glad this is useful! that's a great question. i am actually working on a comprehensive set of tutorials dealing precisely with multimodal rag. essentially, you have to use a model with vision like GPT-4V to parse the tables and images if you want to do this. expect to see this in the channel soon!

@samcavalera9489 2 months ago

@@alejandro_ao MANY MANY THANKS 🙏 🙏 🙏 that will help the academic research greatly! Passionately looking forward to watching and learning from them!

@rahatrezasulemani2862 2 months ago

How do you add this to a retrieval pipeline? Which splitter?

@ignaciopincheira23 1 month ago

Hi, could you convert complex PDF documents (with graphics and tables) into an easily readable text format, such as Markdown? The input file would be a PDF and the output file would be a text file (.txt).

@alejandro_ao 1 month ago

hello there! llamaparse should do the trick for you. although for more complex pdfs (mainly those including images) maybe you will need to do this by hand using an LLM with vision (such as GPT-4o or Gemini 1.5). i will be making a video about that soon!

@surendrachoudhary1644 2 months ago

If we need to chat with data in a table, can we use this API, send the output to a vector database (RAG app), and then chat with that table?

@alejandro_ao 2 months ago

absolutely, that's the main use of this api. since the table is converted to markdown, it can be used in retrieval for rag apps :)
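One way to prepare the parsed markdown for retrieval, as discussed above, is to split it at headings so that each table stays together with the section it belongs to. The sketch below is a plain-Python illustration under that assumption, not the splitter used in the video. The function name `split_markdown_by_heading` and the sample text are hypothetical, and a line starting with `#` inside a fenced code block would be treated as a heading by mistake.

```python
def split_markdown_by_heading(markdown):
    """Split Markdown into chunks, starting a new chunk at every heading line."""
    chunks = []
    current = []
    for line in markdown.splitlines():
        # A heading closes the previous chunk and opens a new one.
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that contained only blank lines.
    return [c for c in chunks if c]


sample = (
    "# Report\n"
    "Intro text.\n"
    "\n"
    "## Prices\n"
    "| Item | Price |\n"
    "|---|---|\n"
    "| A | 1 |\n"
    "\n"
    "## Notes\n"
    "Done.\n"
)
chunks = split_markdown_by_heading(sample)
```

Each chunk can then be embedded and stored in a vector database. Keeping the heading in the chunk gives the retriever some context about what the table contains.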

@cryptosimon9529 2 months ago

Would love to see something similar but using hugging face 🤗

@alejandro_ao 2 months ago

Noted!

@VenkatesanVenkat-fd4hg 2 months ago

I believe we don't know what is happening inside llama parse

@alejandro_ao 2 months ago

indeed, they use GenAI for parsing the documents, and exactly how they do that is part of their secret sauce. but they make it possible to run LlamaCloud (and LlamaParse) within your servers as part of their enterprise solution so you can be sure that the data never leaves your premises

@brabbbus 2 months ago

@@alejandro_ao can this be implemented using Next.js and React?

@flakky626 2 months ago

Cool and informative vid thanks

@alejandro_ao 2 months ago

no problem! it's mostly useful to create RAG applications

@shirishkrishna3155 22 days ago

i get an error saying api key is required even after following all the steps mentioned

@alejandro_ao 16 days ago

hey there, i feel like you haven't got your api key from llamacloud. you can create an account and then create your api key here: cloud.llamaindex.ai/

@mohsenghafari7652 2 months ago

Thank you for your valuable efforts. Getting an API key is difficult for me. Do you know any other solutions? Thank you for your reply

@alejandro_ao 2 months ago

hey there, is that because your company requires that the data not leave the premises?

@Singularity_Podcast 2 months ago

awesome, love your content!!

@alejandro_ao 2 months ago

thank you so much!

@Matepediaoficial 2 months ago

Great!! As always!! Thanks!!

@alejandro_ao 2 months ago

@jaivalani4609 2 months ago

@@alejandro_ao can we parse locally using LlamaParse if the org doesn't want to send data to the cloud? Can we use open source LLMs with LlamaParse?

@anurajms 1 month ago

thank you

@anurajms 1 month ago

is there any way to do it without llama parse and without the API limits?

@alejandro_ao 1 month ago

you can try to do this with your own model or an LLM with vision. are you looking to parse more than 1k pages per day?

@anurajms 1 month ago

@@alejandro_ao hi thank you for the reply. yes i was trying to do it locally since i have a lot of articles with tabular structure that i wanted to extract and use

@alejandro_ao 1 month ago

@@anurajms i'm planning more videos about multimodal rag. i'm pretty sure that will help you!

@anurajms 1 month ago

@@alejandro_ao awesome thank you

@ismailcenik8892 2 months ago

Thanks!

@ismailcenik8892 2 months ago

In my PDF files, there are several questions organized in tables. I want to extract these questions group by group, considering the headers, etc. There will be 4-6 sets of questions. Should I use an LLM for this task? I believe Llama cannot handle this part. If so, which ChatGPT model would be the best fit for this use case? I previously implemented a project involving chat with multiple PDFs using a specific model of chatgpt. Is it still a good idea to use that model, or is there a better option available now? By the way, I will have some paragraphs in future pdf files where I am supposed to extract structured data as well. What is your recommendation in general?

@alejandro_ao 1 month ago

hey man, so sorry i missed your comment. thank you so much for the tip! in my tests, this api works great for extracting tables in pretty much any pdf, no matter how complicated they are. you could probably use a local setup (i am planning a video on multimodal rag). but this api looks like the most useful approach to me, considering your case. what i would do is extract the questions from your documents using llamaparse and then add them to my vector database. about the model, pretty much any model above gpt-4 should be perfect for this. let me know how this went!

@megamind452 2 months ago

Will this work with the pdfs containing images and fonts?

@alejandro_ao 2 months ago

absolutely. it even works with powerpoint slides

@megamind452 2 months ago

@@alejandro_ao I just tried it and it gave me the markdown. But the images are missing 😞

@vagnerbelfort687 2 months ago

Hi Alejandro! Can I do this with a local model on my server?

@alejandro_ao 2 months ago

hey there, it is possible to have LlamaParse run within your own system, but only for bigger companies through their enterprise solution. you can get in touch with them about that here: www.llamaindex.ai/contact i am not an official representative of LlamaIndex, but when I asked Jerry (the founder of LlamaIndex) this same question, he mentioned that LlamaParse (and LlamaCloud) uses its own GenAI for parsing these documents, and the process and model they built to make this possible are proprietary. so you cannot just install it on your computer. however, if your company requires that the data never leaves your premises, they can put LlamaCloud within your servers with the enterprise solution. if your concern is privacy, Jerry said that they do not store any source data from their clients. sometimes they might store metadata of the documents to improve retrieval, but that's all. you can hear his response to this same question here: kzfaq.info/get/bejne/n9OchJSayN7Ucok.html at minute 41:34.

@vagnerbelfort687 2 months ago

@@alejandro_ao Right!! Thank you very much for your feedback. I need to extract tables from laboratory PDF documents, and this is extremely sensitive data. That's why I'm looking for an LLM or AI that does this locally. I have already managed to extract the table when it is only on one page, but the table sometimes overflows onto several pages and then it becomes more complex to apply logic to it. I will continue to look for a solution. I always watch your videos, they help me a lot. Congratulations!!

@alejandro_ao 2 months ago

@@vagnerbelfort687 This means a lot! Thanks!! That sounds like a pretty interesting task. Let me know if you find a way to do that. I am preparing a bunch of videos on advanced RAG so I might cover something like this soon!
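For the multi-page table problem raised in this thread, one possible post-processing step that runs entirely locally is to merge the per-page markdown tables after extraction. The sketch below assumes each page's extract is a well-formed markdown table that repeats the same header row. That is an assumption about the input, not something the thread confirms, and the function names and sample values are hypothetical.

```python
def parse_table(table_md):
    """Split a Markdown table into (header_cells, body_rows)."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table_md.strip().splitlines()
        if line.strip()
    ]
    # rows[0] is the header, rows[1] is the |---|---| separator line.
    return rows[0], rows[2:]


def merge_page_tables(tables):
    """Merge tables split across pages, keeping a single header row."""
    header, merged = parse_table(tables[0])
    merged = list(merged)
    for table_md in tables[1:]:
        next_header, rows = parse_table(table_md)
        if next_header != header:
            raise ValueError("header mismatch: not a continuation of the table")
        merged.extend(rows)
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in merged]
    return "\n".join(lines)


# Made-up sample data: one table split across two pages.
page_1 = "| Test | Value |\n|---|---|\n| Glucose | 5.1 |"
page_2 = "| Test | Value |\n|---|---|\n| Sodium | 140 |"
merged_table = merge_page_tables([page_1, page_2])
```

If the continuation pages do not repeat the header, the header check would have to be replaced by something else, such as comparing column counts.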

@BiXmaTube 2 months ago

Hi Alejandro, How to reach you for a consultancy? Tried the consultancy link but it is not working. Can you kindly share an email address? Cheers!

@alejandro_ao 2 months ago

happy to hear that!

@BiXmaTube 2 months ago

@@alejandro_ao Thanks. But I asked for an email to contact you :)

@alejandro_ao 2 months ago

Hey, sorry about that! That must have been a bug with my UI! Sure, you can send me an email at hello@alejandro-ao.com. I will be going back to consulting starting next week.