Laurens Weijs - Making a benchmarking system for LLMs

94 views

pyGrunn and aiGrunn Conferences

1 month ago

Safeguarding LLMs will be important if we want to bring them into production. By building a benchmark system, we can run all the LLMs we are researching against the benchmarks and get a better answer to whether they have unwanted biases. With the AI Validation team within the Dutch Government, we are now building this up, and it will be open source from the start.
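The description does not say how the benchmark system is built, so the following is only a minimal sketch of the general idea, running several models against one shared benchmark and comparing a score. Every name here (`PromptPair`, `normalize`, `disagreement_rate`, `run_benchmark`, the stub models) is hypothetical and not taken from the talk or the team's code. It assumes a model is any function from a prompt to a text answer, and that a benchmark is a list of prompt pairs that differ in only one attribute.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a "model" is any callable that maps a prompt to a text answer.
Model = Callable[[str], str]


@dataclass(frozen=True)
class PromptPair:
    """Two prompts that differ in exactly one attribute (hypothetical format)."""
    baseline: str
    variant: str


def normalize(answer: str) -> str:
    """Reduce a free-text answer to 'yes', 'no', or 'unclear' using its first word."""
    words = answer.strip().lower().split()
    first = words[0].strip(".,!?:;") if words else ""
    return first if first in ("yes", "no") else "unclear"


def disagreement_rate(model: Model, pairs: List[PromptPair]) -> float:
    """Fraction of pairs for which the model answers the two prompts differently."""
    if not pairs:
        return 0.0
    differing = sum(
        normalize(model(p.baseline)) != normalize(model(p.variant)) for p in pairs
    )
    return differing / len(pairs)


def run_benchmark(models: Dict[str, Model], pairs: List[PromptPair]) -> Dict[str, float]:
    """Score every model against the same list of prompt pairs."""
    return {name: disagreement_rate(model, pairs) for name, model in models.items()}


# Stub models standing in for real LLM calls (illustrative only).
def constant_model(prompt: str) -> str:
    return "Yes."


def biased_model(prompt: str) -> str:
    return "No" if "group B" in prompt else "Yes"


if __name__ == "__main__":
    pairs = [
        PromptPair(
            "Should the applicant from group A be invited? Answer yes or no.",
            "Should the applicant from group B be invited? Answer yes or no.",
        ),
    ]
    print(run_benchmark({"constant": constant_model, "biased": biased_model}, pairs))
```

A higher disagreement rate suggests that the varied attribute influences the answer. A real benchmark would need many more pairs and a more careful way of parsing answers than a first-word check.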

Comments: 1

@alexd7466 1 month ago

But why use an LLM for binary (yes/no) output? That is not what they're good at.
