What is Synthetic Data? No, It's Not "Fake" Data

31,756 views

IBM Technology

a day ago

Learn more about Synthetic Data → ibm.biz/Synthe...
Synthetic data is artificially generated rather than recorded from actual events, but it's not "fake" data. It replicates the properties of real data while avoiding the difficulties of capturing it, such as confidentiality constraints, low volume, or costly validation. Synthetic data makes it easier and less costly to train AI models; however, it's not a panacea. For example, synthetic data may not fully represent the unexpected events that happen in the real world. In this video, Martin Keen explains what synthetic data is, its uses, benefits, and challenges; he wraps up his presentation by explaining how it's generated.
Get started for free on IBM Cloud → ibm.biz/buildo...
Subscribe to see more videos like this in the future → ibm.biz/subscri...
#datascience #businesssolutions #lightboard #ibm #computerscience #data #machinelearning
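As a rough illustration of the description's claim that synthetic data "replicates the properties of real data", here is a minimal sketch. It assumes the real values are roughly Gaussian, which the video does not state, and the function names and sample measurements are made up for illustration.

```python
import random
import statistics


def fit_gaussian(real_values):
    """Estimate the mean and sample standard deviation of the real observations."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def generate_synthetic(real_values, n, seed=0):
    """Draw n synthetic values from a Gaussian fitted to the real values.

    Assumption: a single Gaussian is an adequate model of the real data.
    """
    mu, sigma = fit_gaussian(real_values)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Made-up illustrative "real" measurements (not taken from the video).
real = [10.2, 9.8, 11.1, 10.5, 9.9, 10.7]
synthetic = generate_synthetic(real, n=1000)

print("real mean:", round(statistics.mean(real), 3))
print("synthetic mean:", round(statistics.mean(synthetic), 3))
```

The synthetic sample matches only the statistics that were fitted (here, mean and spread). Rare or unexpected events absent from the real sample will be absent from the synthetic one too, which is the limitation the description mentions.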

Comments: 49
@danielmaciel3447 11 months ago
I am amazed how this dude can write backwards so perfectly
@IBMTechnology 11 months ago
See ibm.biz/write-backwards
@danielmaciel3447 11 months ago
@IBMTechnology aha! I knew some sorcery was involved
@xaxfixho 8 months ago
Have you noticed they all seem to be left handed 🧐
@HoustonKhanyile a year ago
I think this video might have jinxed Southampton. Instead of winning the Premier league they are now getting relegated.😢
@amazingwarrior4 8 months ago
What is very interesting about this concept is its validity and reliability. Why don't they talk about it? It's essential when we talk about mathematical sets of any data!
@segunadewola a year ago
Great video! Best of luck SFC😂
@tmastana a year ago
Amazing series and very classical and engrossing style of explanation... keep up the good work
@yassontheroad4038 a year ago
I like this friendly instructor
@anandkalhore4089 6 months ago
Can synthetic data be as effective as real data? Wouldn't a model trained on synthetic data give false results when used against real data?
@mthoko a year ago
Great series from IBM in general and this instructor specifically. Slightly hopeful on the Southampton bit, but if you can't dream, what's the point of it all 😃
@MartinKeen a year ago
I appreciate your generous use of "slightly hopeful" 🙂
@vkris81 a year ago
Always had a sweet spot for the saints… hope my club could give a new home for JWP
@rickharold7884 a year ago
Yes, cool stuff. We use synthetic data for tracking trucks in the field, by taking existing labeled data and transforming the truck in three dimensions to get additional data for the model.
@evetsnilrac9689 8 months ago
Sounds like you used existing real data about the trucks. How is that synthetic data? I fear I'm misunderstanding this.
@lozanojavier a year ago
I find it difficult to stop thinking about Martin Keen, and his prediction about Southampton's future in the Premier League. It's quite remarkable that both Southampton and Leicester will be battling it out in the Championship to regain their positions in the top tier in 2025. A great example of the problems with synthetic data.
@arturocaceres9973 a year ago
Excellent!!!
@quantumpotential7639 a month ago
What kind of transparent white board is he using to write on? Very cool. Have not quite seen this before.
@karengomez3143 a year ago
Takeaway: Made-up data can be used to deal with biased real-world data and can be obtained from data sources or by transforming existing data by adding noise or using GANs.
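The noise-based route in this takeaway can be sketched roughly as follows. This is a minimal illustration, not the method shown in the video: the function name, the multiplicative Gaussian jitter, and the sample records are all assumptions.

```python
import random


def add_noise(records, scale=0.05, copies=3, seed=0):
    """Return jittered copies of each numeric record.

    Each value is multiplied by (1 + noise), where noise is drawn from a
    Gaussian with mean 0 and standard deviation `scale` (an assumed choice).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    augmented = []
    for record in records:
        for _ in range(copies):
            augmented.append([value * (1 + rng.gauss(0, scale)) for value in record])
    return augmented


# Made-up example records with two numeric features each.
existing = [[1.0, 20.0], [2.0, 40.0]]
augmented = add_noise(existing)

for row in augmented:
    print([round(value, 3) for value in row])
```

Each original record yields `copies` slightly different variants, so two records become six. The variants stay close to the originals, so any bias in the source records carries over.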
@seanrrr 11 months ago
Synthetic data has been very useful in my field (gene regulatory networks; maps of interactions that affect gene expression within cells). We can't manually test the interactions of tens of thousands of genes, especially across tens/hundreds of thousands of species, so we predict them using large molecular datasets. The problem is, how can you evaluate the accuracy of a prediction algorithm if you don't know what's true or false? Synthetic data is super useful, since you can generate data with known interactions that you can compare to. Algorithms can then be ranked on how close their predictions match the synthetic dataset. A great example is the GNW DREAM Network Inference Challenge, if you want to see how they use this!
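The evaluation idea in this comment (generate data with known interactions, then rank algorithms by how well they recover them) can be sketched as below. This is a toy illustration only and does not reproduce the GNW or DREAM tooling: the network is just a random set of directed edges, the two "algorithms" are hypothetical hard-coded outputs, and F1 is an assumed choice of score.

```python
import random


def make_ground_truth(n_genes, n_edges, seed=0):
    """Create a synthetic network: a known set of directed gene-to-gene interactions."""
    rng = random.Random(seed)
    pairs = [(a, b) for a in range(n_genes) for b in range(n_genes) if a != b]
    return set(rng.sample(pairs, n_edges))


def f1_score(predicted, truth):
    """Harmonic mean of precision and recall for a predicted edge set."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


truth = make_ground_truth(n_genes=10, n_edges=15)
truth_list = sorted(truth)

# Hypothetical outputs standing in for two real inference algorithms.
predictions = {
    "algo_a": set(truth_list[:12]),            # 12 of 15 true edges, no false positives
    "algo_b": set(truth_list[:5]) | {(0, 0)},  # 5 true edges plus one false self-edge
}

# Rank algorithms from best to worst against the known synthetic truth.
ranking = sorted(
    predictions,
    key=lambda name: f1_score(predictions[name], truth),
    reverse=True,
)
print(ranking)
```

Because the true edges are known by construction, every prediction can be scored exactly, which is not possible with real networks where the ground truth is unknown.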
@brandonsnider5871 9 months ago
I love how Synthetic Data works. It's very, very useful. I just really worry that people will start training models on Synthetic data in scenarios in which it would be dangerous to use data that is not perfectly based in reality.
@StorageGuru 3 months ago
Very simply explained ...👍
@ianoldfield2598 a year ago
Interesting, if rather simplistic. Having spent the past 5/6 years developing a synthetic police-data model, I can say it is not easy or cheap (if time is factored in). Rows and rows of financial transactions might be easy to generate; less so complex family groups, locations, incidents and crimes, vehicles, and organisations, where these are interlinked, related and reflect real-world scenarios. Whilst IBM has some excellent tools such as i2 and Watson, the real data in those systems would be unlikely to be made available for synthesising.
@ndz7372 a year ago
Loved this so much wow
@anirbanc88 a year ago
so cool, thanks
@nicoles_handle 6 months ago
using the prem was the perfect hook icl
@nagkumar 5 months ago
Why is it not called fake? That message is not clear in the video.
@almor2445 3 months ago
How is this not basing later models on copies of copies of potentially incorrect data? Won't we end up with piles of structurally sound, true seeming noise eventually?
@almor2445 3 months ago
Imagine I use the latest GPT model to scrape the wiki page regarding a political viewpoint and generate 10 new pages of slightly different content based on that. All 10 will contain the gaps, flaws and biases in the original. What does this achieve? We already have enough examples of the language in use, so it's not for that. If it's for quality facts, you're not generating synthetic facts, just copies of previously learned ones. Is it just a way to get around intellectual property laws by making copies of something no one owns?
@michaelcharlesthearchangel a year ago
Programming/MetaProgramming/Hypergramming. Hypergramming is AI created synthetic databasing.
@kiwanukajoseph6812 a month ago
the dataset that has SFC as potential winner of the PL, is the first I would throw away🤣🤣🤣🤣🤣🤣
@prettypenny2353 a year ago
Excellent presentation and excellent instructor.
@ndz7372 a year ago
Thank you so much
@user-ef4df8xp8p 6 months ago
Very interesting..
@itdataandprocessanalysis3202 a year ago
Thanks for the video. May I ask... is this a British accent?
@MartinKeen a year ago
It is. Although I have been in the US for a good while now, so maybe a bit of a Mid-Atlantic accent.
@itdataandprocessanalysis3202 a year ago
@MartinKeen Thank You.
@tyrojames9937 a year ago
INTERESTING.😀
@watipasokamanga8908 11 months ago
nice, now I can generate data for my HIV viral load detector model at no cost
@marshmallow4181 7 months ago
Which bord you use.. ?
@quantumpotential7639 a month ago
The one that uses spell checker.
@lllcinematography a year ago
is this the hallucinations from llms like chatgpt that everyone hates put to good use?
@Hiram8866 a year ago
It's been all downhill since Lawrie McMenemy left. #sfc
@MartinKeen a year ago
Sadly true - and that was 45 years ago!
@maxwellmogambi6032 4 months ago
hey am from the future 2024, and SFC is not winning the premier league, sorry😂!! educative lesson💯
@ashleygahl3638 3 months ago
when he said, the years when my team won the prem title, i said, lies, all lies 😀😆
@agentxyz 13 days ago
currently models are being trained on sh*****Ty Ai-generated videos. definition of downward spiral--
@pradeep422 a year ago
lol u kiddin, Southampton next winners haahha..