Hugging Face LLMs with SageMaker + RAG with Pinecone

17,623 views

James Briggs

A day ago

Comments: 31
@jamesbriggs 1 year ago
👋🏼 Check out the article version of the video here: www.pinecone.io/learn/sagemaker-rag/
@noneofyourbusiness8625 1 year ago
This channel provides so much valuable information for free and I really appreciate it!
@jamesbriggs 1 year ago
glad to hear :)
@mr.daniish 1 year ago
James can teach a 9 year old what a RAG is!
@jamesbriggs 1 year ago
I try my best haha
@shashwatkumar5556 11 months ago
I want to thank you for this walkthrough. This was very informative. And I know it must have taken quite a lot of time and effort to make it. So thank you!!
@Yikina7 6 months ago
Amazing video, thank you very much! It's obvious there was a lot of work involved in making it in such a well-structured way. Very easy to follow, you know how to teach :)
@RezaA 1 year ago
Thank you for the well-described demo. The recommended vector DB for this stack is probably OpenSearch, which does the same as Pinecone, but you have more control and you own it, and it's more expensive.
@jamesbriggs 1 year ago
meh, opensearch doesn't scale beyond 1M vecs well and their vec search implementation is nothing special - if you want open source I'd recommend qdrant (also rust like Pinecone) or weaviate
@arikupe2 1 year ago
@jamesbriggs Thanks for the video James! I was wondering what issues you've experienced with scaling OpenSearch? We're considering it for our large-scale business use case and had thought it would be a good fit for larger-scale use
@sandeeprawat4981 10 months ago
Thank you so much.. really appreciate...love from India
@megamehdi89 1 year ago
awesome content, thank you so much. very good explanation. i love watching your videos. i try to follow them and learn 😊
@jamesbriggs 1 year ago
happy to hear that! :)
@e_hossam96 9 months ago
Thank you for your great effort 🤗
@SolidBuildersInc 3 months ago
Thank you for your presentation. I clicked the Subscribe button, although I didn't delve into the video content. During your talk, I recall you mentioning the open-source LLM and discussing AWS pricing. This led me to prioritize a cost-effective solution that allows for scalability. Have you considered running an ollama model locally and setting up a tunnel with a port endpoint for a public URL? I appreciate any feedback you can provide. 😊
@energyexecs 5 months ago
James, great video, and I like how you referred back to your flow chart diagram. My task: I am working on the "corpus" of publicly available engineering technical standards documents that are only available as PDF or Word documents. I want to encode the words (tokens) in those documents into a vector database, then take them through LLM Bing GPT Transformation Architecture, and then use RAG to focus only on the tokens (words) for that "corpus" of engineering standards. Why? Because right now I do a "Control F search", which takes forever with my clients, to find the standards, which include words as well as diagrams and pictures (different modalities). So instead of spending hours on "Control F", my plan is to convert those documents to the vector database and enable a "generative search" in "natural language" instead of a "Control F search". Does this make sense? Your video is giving me the pathway to success.
@energyexecs 5 months ago
Great video, and I like how you referred to your flow chart diagram. I am working on the "corpus" of publicly available engineering technical standards documents that are only available as PDF or Word documents. I want to encode the words (tokens) in those documents into a vector database, then take them through LLM Bing GPT Transformation Architecture, and then use RAG to focus only on the tokens (words) for that "corpus" of engineering standards. Why? Because right now I do a "Control F search", which takes forever with my clients, to find the standards, which include words as well as diagrams and pictures (different modalities). So instead of spending hours on "Control F", my plan is to convert those documents to the vector database and enable a "generative search" in "natural language" instead of a "Control F search". Does this make sense? Your video is giving me the pathway to success.
@user-yu4kt5ie4r 1 year ago
Will you be making a video on deployment? Great video btw.
@shalabhgarg8225 1 year ago
Well just too good
@VenkatesanVenkat-fd4hg 1 year ago
Thanks for your valuable videos as always. Can you discuss fine tuning llama 2 7b or 13b using dataset & deploy in sagemaker.....
@AaronChan-x2d 24 days ago
You need to define your llm in step 2 of asking the model directly: llm = HuggingFacePredictor(endpoint_name="flan-t5-demo")  # use the name of your deployed endpoint
@barkingchicken 11 months ago
Great video
@VaibhavPatil-rx7pc 1 year ago
Excellent
@serkansandkcoglu3048 10 months ago
Thank you! This is very informative! When we put our embeddings into the Pinecone vector DB, is our data going outside? I would be OK pushing my sensitive data to an AWS S3 bucket, but where does that Pinecone DB reside?
@sergioquintero4624 9 months ago
@jamesbriggs Hi James, thank you for the amazing video. I have a question: is it possible to deploy both models (embedding and LLM) on the same endpoint? Just to save money, considering that in RAG pipelines the embedding step and the retrieval are sequential steps.
@riyaz8072 8 months ago
How to create a vector database for PDF documents?
@brianrowe1152 1 year ago
Neat but why? Is sagemaker just langchain hosted at Aws?
@jamesbriggs 1 year ago
no it's more like Colab + ML infra, you can also use langchain with sagemaker - the why is for the infra component, hosting open source LLMs is super easy
@pantherg4236 1 year ago
What is the best way to learn deep learning fundamentals via implementation (let's say pick a trivial problem of build a recommendation system for movies) using pytorch in Aug 26, 2023?
@rociotesla 3 months ago
your code doesn't run for shit bro
@sndrstpnv8419 5 months ago
In the article you use the wrong LLM, 'HF_MODEL_ID':'meta-llama/Llama-2-7b', but it is supposed to be MiniLM.