How to evaluate Artificial Intelligence?

Ricardo Santos

The state of the art in language models shows that AI still has a long way to go. Researchers are designing new evaluation methods to quantify the performance of Large Language Models (LLMs) and to identify their strengths and limitations.
In this video we explore these LLM evaluation methods, based on the paper "A Survey on Evaluation of Large Language Models", and answer the question of why you should not trust AI.
Video title: How to evaluate AI?
Watch my latest video: The Great Leap! From Developer to AI Engineer - • ¡El Gran Salto! De Des...
824 Views - Feb 26, 2024
Help me reach my subscriber goal!: ||||||...... 17% ............... 17.4K/100K
-------------------------------------------------- -----------------------------------
Resources
- A Survey on Evaluation of Large Language Models: arxiv.org/abs/2307.03109
-------------------------------------------------- -----------------------------------
Sections:
0:00 Introduction
0:52 Evaluation of AI models
1:34 What are the tasks that LLMs perform?
2:06 Performance in NLP tasks
2:49 Performance in ethics and bias
3:24 Performance in social sciences
4:01 Performance in natural sciences and engineering
4:29 Performance in medicine
4:48 Performance in agent tasks
5:23 Performance in other tasks
6:07 Where to evaluate LLMs?
7:17 How to evaluate LLMs?
8:36 Summary of findings in the evaluation of LLMs
9:58 Conclusions
-------------------------------------------------- -----------------------------------
Music:
Legend Has It - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/3UN60C...
Lucky Stars - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/70f90U...
Stop The Clock - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/2fainn...
No Introduction - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4SMBTz...
Rise Up - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4DqeLS...
-------------------------------------------------- -----------------------------------
Networks:
GitHub: github.com/Tibiritabara
LinkedIn: / ricardosantosdiaz
Instagram: / tibiritabara90
-------------------------------------------------- -----------------------------------
Thanks for watching the video!
#ai #llm #software

Comments: 6
@RicardoSantosDiaz 11 months ago
LLMs (Large Language Models) are here to stay, but before they are adopted at scale we need to identify their serious flaws and their risks to society, and invest heavily in their continuous improvement and evaluation to ensure a positive impact. We have to accept that we are still far from that.
@S4z4kku 11 months ago
Very good information. Everything associated with AI has been glorified so much that nobody talks about these important technical details that have yet to be addressed.
@RicardoSantosDiaz 11 months ago
Indeed, we need to keep an objective perspective, but the sensationalist noise from the press is often louder.
@adriipinto 11 months ago
🙌🏽🙌🏽🙌🏽
@angelicasantos568 11 months ago
wooow
@OTTOALACCION28 10 months ago
Excellent. Artificial intelligence cannot be more intelligent than us human beings.