Superfast RAG with Llama 3 and Groq

  Рет қаралды 5,726

James Briggs

James Briggs

Күн бұрын

Groq API provides access to Language Processing Units (LPUs) that enable incredibly fast LLM inference. The service offers several LLMs including Meta's Llama 3. In this video, we'll implement a RAG pipeline using Llama 3 70B via Groq, an open source e5 encoder, and the Pinecone vector database.
📌 Code:
github.com/pinecone-io/exampl...
🌲 Subscribe for Latest Articles and Videos:
www.pinecone.io/newsletter-si...
👋🏼 AI Consulting:
aurelio.ai
👾 Discord:
/ discord
Twitter: / jamescalam
LinkedIn: / jamescalam
#artificialintelligence #llama3 #groq
00:00 Groq and Llama 3 for RAG
00:37 Llama 3 in Python
04:25 Initializing e5 for Embeddings
05:56 Using Pinecone for RAG
07:24 Why We Concatenate Title and Content
10:15 Testing RAG Retrieval Performance
11:28 Initialize connection to Groq API
12:24 Generating RAG Answers with Llama 3 70B
14:37 Final Points on Why Groq Matters

Пікірлер: 17
@awakenwithoutcoffee
@awakenwithoutcoffee 17 күн бұрын
hi James, Microsoft just open-sourced their graphRAG technology stack, might be cool to take a look and see how we can leverage/combine them both.
@alexjensen990
@alexjensen990 10 күн бұрын
Nice walk through and I agree that Groq is amazing... Just wish they had other models.
@PanduPandu-fh5tk
@PanduPandu-fh5tk 10 күн бұрын
Good work, Helped me a lot!
@gilbertb99
@gilbertb99 19 күн бұрын
What are your thoughts on adding a short summary description of the document or paper in each chunk including the title?
@jamesbriggs
@jamesbriggs 19 күн бұрын
it's a good idea - I haven't tried it before but seems sensible, you would need to find a balance between too much summary which might "overpower" the meaning of the chunk itself and getting enough summary in there to be useful - but if you get something good there it feels like a great idea imo
@tiagoc9754
@tiagoc9754 19 күн бұрын
Nice thing is that you can use groq with langchain as well
@jamesbriggs
@jamesbriggs 19 күн бұрын
Yes very true
@Davorge
@Davorge 19 күн бұрын
is this re-usable in such way that we can switch calling groq to call open ai gpt-4o or other models?
@jamesbriggs
@jamesbriggs 19 күн бұрын
Yeah it’s pretty simple to swap them out, they use a similar (maybe even same) API
@tiagoc9754
@tiagoc9754 19 күн бұрын
Is there any oss embedding model you'd recommend over e5 for real/prod use cases? I've just used openai so far
@juanpablomesalopez
@juanpablomesalopez 19 күн бұрын
gte-base or bge-base are good in benchmarks, but gotta really test them on your use case. You should also fine-tune the embeddings with your use case data.
@jamesbriggs
@jamesbriggs 19 күн бұрын
E5 have been good, I like Jina’s embedding model, and I’ve heard some good things about BAAI bge-m3 too for hybrid search
@byczong
@byczong 19 күн бұрын
@@jamesbriggs maybe in some future video you could cover bge-m3 :)) this model sound pretty cool (especially dense/multi-vector/sparse retrieval)
@content_ai_
@content_ai_ 19 күн бұрын
You in Bali nice! I am looking for an online job mate. I'm pretty desperate at this point
@jamesbriggs
@jamesbriggs 19 күн бұрын
You can tell? But yes, here for a while - just work on AI stuff, get yourself out there a bit, it does take time though
@tiagoc9754
@tiagoc9754 19 күн бұрын
Groq is insanely fast
@jamesbriggs
@jamesbriggs 19 күн бұрын
Yeah it’s wild
Intro to RAG for AI (Retrieval Augmented Generation)
14:31
Matthew Berman
Рет қаралды 44 М.
Semantic Chunking for RAG
29:56
James Briggs
Рет қаралды 19 М.
Heartwarming Unity at School Event #shorts
00:19
Fabiosa Stories
Рет қаралды 16 МЛН
New model rc bird unboxing and testing
00:10
Ruhul Shorts
Рет қаралды 23 МЛН
Русалка
01:00
История одного вокалиста
Рет қаралды 7 МЛН
What are AI Agents?
12:29
IBM Technology
Рет қаралды 59 М.
100+ Linux Things you Need to Know
12:23
Fireship
Рет қаралды 829 М.
GraphRAG: LLM-Derived Knowledge Graphs for RAG
15:40
Alex Chao
Рет қаралды 94 М.
Scientific Concepts You're Taught in School Which are Actually Wrong
14:36
Can AI code Flappy Bird? Watch ChatGPT try
7:26
candlesan
Рет қаралды 9 МЛН
I spent six months rewriting everything in Rust
15:11
chris biscardi
Рет қаралды 416 М.
Kyutais New "VOICE AI" is INSANE (and open source)
13:10
Wes Roth
Рет қаралды 44 М.
What is an LLM Router?
9:16
Sam Witteveen
Рет қаралды 25 М.
Why Agent Frameworks Will Fail (and what to use instead)
19:21
Dave Ebbelaar
Рет қаралды 24 М.
Сколько реально стоит ПК Величайшего?
0:37
Отдых для геймера? 😮‍💨 Hiper Engine B50
1:00
Вэйми
Рет қаралды 1,2 МЛН
تجربة أغرب توصيلة شحن ضد القطع تماما
0:56
صدام العزي
Рет қаралды 57 МЛН
ОБСЛУЖИЛИ САМЫЙ ГРЯЗНЫЙ ПК
1:00
VA-PC
Рет қаралды 2,1 МЛН
Top 50 Amazon Prime Day 2024 Deals 🤑 (Updated Hourly!!)
12:37
The Deal Guy
Рет қаралды 1,4 МЛН