The current state of the art in language models shows that AI still has a long way to go. Researchers are designing new evaluation methods to quantify the performance of Large Language Models (LLMs) and to identify their strengths and limitations.
In this video, we explore new LLM evaluation methods, based on the paper "A Survey on Evaluation of Large Language Models", and explain why you should not trust AI.
Video title: How to evaluate AI?
Watch my latest video: The Great Leap! From Developer to AI Engineer - ¡El Gran Salto! De Des...
824 Views - Feb 26, 2024
Help me reach my subscriber goal!: ||||||...... 17% ............... 17.4K/100K
-------------------------------------------------- -----------------------------------
Resources
- A Survey on Evaluation of Large Language Models: arxiv.org/abs/2307.03109
-------------------------------------------------- -----------------------------------
Sections:
0:00 Introduction
0:52 Evaluation of AI models
1:34 What are the tasks that LLMs perform?
2:06 Performance in NLP tasks
2:49 Performance in ethics and bias
3:24 Performance in social sciences
4:01 Performance in natural sciences and engineering
4:29 Performance in medicine
4:48 Performance in agent tasks
5:23 Performance in other tasks
6:07 Where to evaluate LLMs?
7:17 How to evaluate LLMs?
8:36 Summary of findings in the evaluation of LLMs
9:58 Conclusions
-------------------------------------------------- -----------------------------------
Music:
Legend Has It - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/3UN60C...
Lucky Stars - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/70f90U...
Stop The Clock - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/2fainn...
No Introduction - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4SMBTz...
Rise Up - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4DqeLS...
-------------------------------------------------- -----------------------------------
Networks:
GitHub: github.com/Tibiritabara
LinkedIn: / ricardosantosdiaz
Instagram: / tibiritabara90
-------------------------------------------------- -----------------------------------
Thanks for watching the video!
#ai #llm #software