323 - How to train a chatbot on your own documents?

30,935 views

DigitalSreeni

A day ago

Using OpenAI and LangChain
Code generated in the video can be downloaded from here:
github.com/bns...
All other code:
github.com/bns...

Comments: 46
@Ethan-gs5ib a year ago
Better than most paid courses online! Thanks.
@DigitalSreeni a year ago
Thank you very much :)
@alisonwright2189 a year ago
I've been using ChatOpenAI() rather than OpenAI() to call the model "gpt-3.5-turbo", which costs $0.002 rather than $0.025. It is cheaper and more powerful, and it can still be used for standard querying.
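To put the two prices in the comment above side by side, here is a small arithmetic sketch. It assumes the quoted figures are US dollars per 1,000 tokens, which the comment does not state, and the 50,000-token workload is a made-up number for illustration.

```python
def cost_usd(tokens, price_per_1k):
    """Cost of processing `tokens` tokens at a given price per 1,000 tokens."""
    return tokens / 1000 * price_per_1k

# Hypothetical workload; the prices are the two figures quoted in the comment,
# assumed here to be per 1,000 tokens.
tokens = 50_000
turbo = cost_usd(tokens, 0.002)
other = cost_usd(tokens, 0.025)
ratio = other / turbo

print(f"gpt-3.5-turbo: ${turbo:.2f}, other model: ${other:.2f}, ratio: {ratio:.1f}x")
```

Current pricing may differ from the figures in the comment, so treat the numbers as placeholders.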
@vishnuvardhanvaka a year ago
Hello ma'am, can you please make a video on usage costs and other cost factors of the OpenAI API?
@robosergTV a year ago
It would be nice to make the same video but for Llama-2. Llama-2 can run in our private cloud. Many companies don't want to use OpenAI because of data privacy concerns. Also, Llama-2 is completely free and can run locally for free.
@Iiochilios1756 3 months ago
It would still be useful.
@pabolusatyavivek9481 a year ago
Thanks, Sreeni. Your content is always the best!
@DigitalSreeni a year ago
Thank you very much.
@amnn8507 a month ago
Thank you for your great videos. Just a quick note, you are not training anything here, you're building a RAG system. You could say "training" if you were optimizing the parameters of a model (e.g. neural nets) for minimizing a loss function.
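To make the comment's definition of "training" concrete, here is a minimal sketch of what optimizing a parameter to minimize a loss function looks like. It fits a single weight by gradient descent on mean squared error; the data, learning rate, and step count are made up for illustration and have nothing to do with the video's code.

```python
def train(xs, ys, lr=0.05, steps=200):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad  # the parameter itself is updated: this is training
    return w

# Toy data generated from y = 2x, so the learned weight should approach 2
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = train(xs, ys)
print(round(w, 3))
```

A retrieval setup of the kind the comment describes never runs a loop like this; the model's weights stay fixed and only the prompt changes.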
@BlazeArteryak a year ago
I have a PDF with thousands of pages. Is GPT-4 able to understand and memorize all of it? My questions about this big PDF need to correlate all the information.
@AlexDerBar a year ago
Hi Sreeni! Love the content, everything's always amazingly explained. I was wondering if you were planning on covering the YOLOv7 algorithm. It would be really interesting seeing a video of you covering it and your takes on it. Keep up the good content :)
@souravran a year ago
GPT is general purpose, and it's been trained on millions of pieces of text so that it can understand human language. Sure, it might be able to answer specific questions based on the information it was trained on (for example, "Who is the CEO of Google?"), but as soon as you need to produce specific results based on your product, results will be unpredictable and often just wrong. GPT-3 is notorious for confidently making up answers that are just plain wrong. There are two approaches to address this:
1) Fine-tune the model: retrain the model on your own custom data, and again every time new data is added.
2) Context injection: pre-process the knowledge base (embedding) and store it as an object or in a database; then, based on the user's query, search your knowledge base for the most relevant info and inject the top most relevant pieces into the actual prompt as context.
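The context-injection approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the video: `embed` is a toy bag-of-words stand-in for a real embedding model, the chunks and product name are invented, and `build_prompt` is a hypothetical helper name.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(question, chunks, top_k=2):
    """Rank chunks by similarity to the question and inject the best ones as context."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# Hypothetical knowledge base, already split into chunks
chunks = [
    "The warranty on the X100 camera lasts two years.",
    "Our office is closed on public holidays.",
    "The X100 camera ships with a 50 mm lens.",
]
prompt = build_prompt("How long is the X100 warranty?", chunks, top_k=1)
print(prompt)
```

In a real pipeline the resulting prompt would be sent to the language model, and the chunk vectors would be computed once and stored rather than recomputed per query.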
@carlos.duclos a year ago
For very specific data extraction, do you think it'd be better to train your own model, for instance using LayoutLMv3?
@humaitrix 5 months ago
Great material! Thanks for sharing, good job 🚀
@AdnanKhan-mi2kf a year ago
Hi Sreeni, I enjoy your content every time I see it. Just a question: why did you jump from 311 to 323?
@DigitalSreeni a year ago
Good observation. I have already created content and written code for the remaining videos (312-322) and they focus on image analysis and optimization techniques. I recorded a couple more language model videos based on viewer questions so I had to assign them new numbers that do not follow the sequence. I don't want to reshuffle all numbers or wait a few months to release another language model video.
@deanstarkey4375 a year ago
This was awesome! I never do any coding, and I was able to follow along and do it.
@develom_ai 9 months ago
Great video. Thanks!👍
@DigitalSreeni 9 months ago
Thank you too!
@TLogan-eu7qt 11 months ago
Great vid. Thank you for your time and effort on these vids.
@happyg8682 a year ago
Thank you very much for this great video! Could you please let me know whether we used ChatGPT or GPT-4 here? And it's not fine-tuning here, it's embedding, right? Which one do you think is better, fine-tuning or embedding? Thank you very much!
@drayhancolak a year ago
You are amazing, mate. Thank you for the awesome lectures.
@mdabdullahalhasib2920 a year ago
Always appreciate your work. Thanks sir...
@91255438 a year ago
Thank you! It's exactly what I was looking for.
@vishnuvardhanvaka a year ago
Sir, can you please make a video on API usage costs and other cost factors?
@amedyasar1021 9 months ago
Nice tutorial. How could I limit the topic to only the PDFs, for example in cases where the chatbot must not answer?
@romanemul1 a year ago
The biggest problem is the API key. Try to make it work without depending on OpenAI as a company. What happens if you don't extend your API key subscription? Will the pipeline just stop working?
@anshikak3 5 months ago
Does it work on a CSV filled with numeric data that is converted to PDF and then imported?
@kai-yihsu3556 a year ago
May I ask if this tutorial example simply extracts the content from the PDF article as context and sends it along with the question to the OpenAI API? Or is there any training being done locally? I'm curious about this because the video mentioned the use of an API key. Thank you.
@guiomoff2438 a year ago
Regarding tokenization, when you use the OpenAI API, both your PDF data and your question will go through tokenization processes. The text from your PDF file will be tokenized to prepare it for input to the model, and your question will also be tokenized to match the model's input format. The tokenization ensures that the text is divided into smaller units that the model can process. The tokenizations for your PDF data and question are independent of each other. The model doesn't directly compare the tokenizations to extract relevant content from your PDF file. Instead, the model processes the tokenized input and generates responses based on its understanding of the language and context. The model doesn't have direct access to the original PDF data or its specific tokenization. OpenAI doesn't have access to your data!
@guiomoff2438 a year ago
You need an API key to add the OpenAI API layer on your model.
@DigitalSreeni a year ago
No training is happening, just a vector match of embeddings. I've used the term 'training' in the tutorial but what I should have said was that embeddings are being matched.
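As a rough picture of what "a vector match of embeddings" means numerically, the sketch below compares a query vector against stored chunk vectors with cosine similarity and picks the closest one. The three-dimensional vectors and chunk labels are invented for illustration; real embeddings come from an embedding model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical precomputed embeddings for two document chunks
chunk_vectors = {
    "chunk about pricing": [0.9, 0.1, 0.0],
    "chunk about installation": [0.1, 0.8, 0.3],
}
# Hypothetical embedding of the user's question
query_vector = [0.8, 0.2, 0.1]

# The "match" is simply the stored vector most similar to the query vector
best = max(chunk_vectors, key=lambda name: cosine_similarity(query_vector, chunk_vectors[name]))
print(best)
```

No model weights change anywhere in this step, which is why it is a lookup rather than training.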
@kai-yihsu3556 a year ago
@DigitalSreeni Thank you so much! 😊
@telexiz 3 months ago
Thanks!
@DigitalSreeni 3 months ago
Thank you.
@BlazeArteryak a year ago
Is it better than the chatwithpdf plugin model?
@a3hindawi 4 months ago
Thanks
@DigitalSreeni 3 months ago
Thank you
@bropocalypseteam3390 a year ago
Where's the training?
@shubhamdubey9181 5 months ago
But LangChain is free?
@elibrignac8050 a year ago
Can you link the txt file you used?
@user-uu7te1ob1b 6 months ago
But IDK how to code😢😢😢😢😢😂😂
@DigitalSreeni 6 months ago
Don't worry. There are a lot of service providers out there that allow you to train your own chatbots; it just costs some $$$.
@ronaldgourgeot2759 11 months ago
Thanks!
@DigitalSreeni 11 months ago
Thank you