Evaluating LLM-based Applications

23,255 views

Databricks

1 year ago

Evaluating LLM-based applications can feel like more of an art than a science. In this workshop, we'll give a hands-on introduction to evaluating language models. You'll come away with knowledge and tools you can use to evaluate your own applications, and answers to questions like:
- Where do I get evaluation data from, anyway?
- Is it possible to evaluate generative models in an automated way?
- What metrics can I use?
- What's the role of human evaluation?
Talk by: Josh Tobin
Here’s more to explore:
LLM Compact Guide: dbricks.co/43WuQyb
Big Book of MLOps: dbricks.co/3r0Pqiz
Connect with us:
Website: databricks.com
Twitter: /databricks
LinkedIn: /databricks
Instagram: /databricksinc
Facebook: /databricksinc

Comments: 10
@AnandShah-ds 8 months ago
Evaluations aside, I really enjoyed the presentation. I was hooked. Great storytelling skills, Josh. Thanks for sharing your experience. We count on volunteers like you to spread knowledge.
@vaishnavipatil3319 1 year ago
Thank you for clearing up these concepts. I would like to see more videos from you on evaluation frameworks and methods.
@ndamulelosbg8887 5 months ago
This is excellent coverage of the challenging task of LLM evaluation.
@asfandiyar5829 11 months ago
Just what I was after. Thanks
@PJ-hi1gz 1 month ago
Great talk, thanks for sharing
@ndamulelosbg8887 5 months ago
"Your opinion on LLMs does not matter" - I found this to be a great quote.
@bharath_v 8 months ago
Good One!
@manishsharma2211 11 months ago
Good work
@SpartanPanda 10 months ago
Great storyline
@threevia.travel 6 months ago
Very generic; I expected something more tangible. It sounds like common sense that might or might not work.