Scaling Synthetic Data Creation with 1 Billion Personas | PersonaHub Dataset Explained


Views: 498

Argilla

Days ago

Welcome to another episode of Data Explorer by Argilla! 🎥🚀 In this episode, we dive into the Persona Hub dataset, introduced in the paper “Scaling Synthetic Data Creation with 1 Billion Personas” by Xin Chan et al. from Tencent AI Lab.
The dataset aims to increase the diversity of synthetic data by using personas. Assigning a persona to a large language model (LLM) steers it toward more diverse and realistic responses to the same instructions. The paper proposes a method for creating these personas from world knowledge and public web text.
Resources:
- Dataset repo: huggingface.co/datasets/proj-...
- Notebook to upload to Argilla: colab.research.google.com/dri...
- Paper: huggingface.co/papers/2406.20094
- Argilla Instance: huggingface.co/spaces/argilla...
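
The persona idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the persona strings, the prompt wording, and the helper names `build_persona_prompt` and `build_prompts` are hypothetical and are not taken from the paper or the dataset. The sketch stops at prompt construction and does not call any model.

```python
from typing import List

# Hypothetical persona descriptions, made up for illustration;
# they are NOT taken from the Persona Hub dataset.
PERSONAS: List[str] = [
    "a pediatric nurse working night shifts",
    "a high-school chemistry teacher",
    "a long-haul truck driver who loves audiobooks",
]


def build_persona_prompt(persona: str, instruction: str) -> str:
    """Prefix an instruction with a persona so the LLM answers from that viewpoint."""
    # The prompt wording is an assumption, not the template used in the paper.
    return f"You are {persona}.\n{instruction}"


def build_prompts(personas: List[str], instruction: str) -> List[str]:
    """Create one persona-conditioned prompt per persona for the same instruction."""
    return [build_persona_prompt(p, instruction) for p in personas]


# The same instruction yields a different prompt for each persona.
prompts = build_prompts(PERSONAS, "Write a math word problem.")
for prompt in prompts:
    print(prompt)
    print("---")
```

Each prompt would then be sent to an LLM of your choice; because only the persona changes, any variety in the responses comes from the persona conditioning.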

Comments: 3

@DanielVilaSuero 22 days ago

Very cool!

@kevon217 20 days ago

Really cool. Thanks for walking through the hub.

@argilla-io 10 days ago

Any time! We hope to do this more based on community feedback.
