The 5 Must-Know Distributions for Data Scientists (not what you think)

11,725 views

ritvikmath

1 day ago

My Patreon : www.patreon.com/user?u=49277905
Visuals created with Excalidraw:
excalidraw.com/
Icon References:
Detective icons created by smalllikeart - Flaticon
www.flaticon.com/free-icons/d...
Skewness and Kurtosis Video : • Skewness and Kurtosis ...
0:00 Intro
1:01 The "Spike"
2:13 Skewed
3:41 Bimodal
4:57 "Pointy"
6:18 "Noisy"

Comments: 64
@gsimmons4330 1 year ago
I love your channel’s intersection of higher-level math with stats and data science! Feels like no one does it quite like you.
@ritvikmath 1 year ago
Thanks 😊
@supersql8406 1 year ago
Yeah, and he teaches very well, too! When I want to understand a specific section of advanced math, most channels either oversimplify the higher-level material until it becomes unusable, or explain it the same way textbooks do, where it's just waaaay above most people's heads.
@anishbhanushali 1 year ago
Dude, I'm so grateful that this channel exists!!
@ritvikmath 1 year ago
Thanks! Grateful to you for watching
@platoh 28 days ago
This is probably the best use of 8.5 minutes I'll see all day. Love the insights, concise and organized delivery, and relatable examples.
@sharks1349 1 year ago
I find that teaching the intuition behind data science, and math in general, is much more important than people might think.
@ritvikmath 1 year ago
Thanks! I think so too
@zenith_journey 1 year ago
Love this channel too! I love discussions about intuitions… it’s so easy to get lost in statistical jargon and it’s refreshing to step back and put things into perspective.
@ritvikmath 1 year ago
Thanks!
@ching-tsungderontsai2750 1 year ago
Amazing content that links stats and real world data. Greatly appreciate your work and clear examples!
@ritvikmath 1 year ago
Glad it was helpful!
@user-co6pu8zv3v 1 year ago
Thank you! I really like how you can explain everything in a simple way.
@AndyInTheAir 1 year ago
Excellent work. The casual discussion is great for explaining the concepts to newbies in data science or even the old dogs who want to learn new tricks. The most knowledgeable presenters are the ones who can explain something to a 5-year-old. I'm also glad you have some content that formalizes these concepts as well. Always very helpful and thought-provoking.
@ritvikmath 1 year ago
Thanks for the thoughtful words!
@maxvaessen 1 year ago
Awesome stuff, very useful! Thanks ❤
@ritvikmath 1 year ago
Thanks for watching!
@jfndfiunskj5299 1 year ago
Another fantastic video. Nice job.
@ritvikmath 1 year ago
Appreciate it!
@danieljaszczyszczykoeczews2616 1 year ago
This video is really very useful! Please keep talking about the intuition behind data distributions! It’s really hard to find such explanations in regular books or other formal sources.
@ritvikmath 1 year ago
Thanks! Will do
@bin4ry_d3struct0r 1 year ago
This is the most informative video on the intuition behind distribution interpretation I've ever watched! For the "pointy" distribution, I've just thought of them as Gaussians with low variances.
@ritvikmath 1 year ago
Thanks!
@asjsingh 1 year ago
Brilliant description of distributions
@ritvikmath 1 year ago
Thanks!
@sajanator3 10 months ago
I absolutely love this channel
@pixeloverflow 1 year ago
This was super helpful! Thanks for sharing!
@ritvikmath 1 year ago
Thanks for watching!
@ioannisnikolaospappas6703 1 year ago
Thank you for your work, brother! 🙏
@ritvikmath 1 year ago
Thanks for watching!
@seanpitcher8957 1 year ago
Love that last one. I use QQ plots more since they make more sense to me, but I've definitely seen these. Thanks for providing well-explained content at a higher level than many do.
@ritvikmath 1 year ago
Thanks for the input and thanks for watching!
@karunamayiholisticinc 1 year ago
One of the best videos on data science. It helps us understand data better.
@ireoluwaTH 1 year ago
Practicality and 'rule of thumb'... You excel at that sort of stuff. 👌🏽
@ritvikmath 1 year ago
Thanks!
@mindasb 1 year ago
Tweedie distribution, baby! It can be seen in some regression datasets where the government or local authority restricts the max salary, price, or whatever (California housing).
@lashlarue7924 5 months ago
Great video. It would be EXTREMELY helpful to me as a perpetually aspiring data scientist if you could show how you might go about fitting a distribution to your data and using it in a simulation exercise. (I have an idea of how I might go about doing this, but I'm acutely aware that others might have better insights!)
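One possible shape for the fit-then-simulate workflow asked about here, as a minimal sketch rather than anything shown in the video. It assumes the data are roughly Gaussian, which real data often are not; the `observed` values, the `threshold`, and the seeds are made up for illustration. It uses only the standard library's `statistics.NormalDist`.

```python
import random
from statistics import NormalDist

# Hypothetical observations (for example, daily demand); replace with real data.
rng = random.Random(0)
observed = [rng.gauss(100, 15) for _ in range(500)]

# Step 1: fit. from_samples estimates the mean and standard deviation.
fitted = NormalDist.from_samples(observed)

# Step 2: simulate. Draw synthetic values from the fitted distribution.
simulated = fitted.samples(10_000, seed=1)

# Step 3: use the simulation, here to estimate P(value > threshold).
threshold = 130  # hypothetical level of interest
p_sim = sum(x > threshold for x in simulated) / len(simulated)
p_exact = 1 - fitted.cdf(threshold)  # analytic check for the same quantity

print(f"fitted mean={fitted.mean:.1f}, sd={fitted.stdev:.1f}")
print(f"P(X > {threshold}): simulated={p_sim:.4f}, analytic={p_exact:.4f}")
```

For a Gaussian the analytic answer makes the simulation redundant; the simulation step earns its keep when the quantity of interest has no closed form. For spiked, skewed, or bimodal data, a different family would need to be fitted in step 1.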
@pectenmaximus231 1 year ago
I definitely turn noisy data into sensible data by making bins. This is especially true with frequency per day. At the daily level, picking out a trend is difficult, but grouping into several months, or even several years, really helps create some worthwhile numbers.
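The binning idea in this comment can be sketched as follows. This is an illustrative example on made-up data, not the commenter's actual workflow: the daily counts, the trend size, and the noise level are all assumptions.

```python
import random
from collections import defaultdict
from datetime import date, timedelta

rng = random.Random(42)
start = date(2022, 1, 1)

# Hypothetical daily counts: a slow upward trend buried in day-to-day noise.
daily = {}
for i in range(730):  # two calendar years, 2022 and 2023
    day = start + timedelta(days=i)
    daily[day] = max(0, round(20 + 0.02 * i + rng.gauss(0, 8)))

# Bin by calendar month: sum the daily counts that fall in each (year, month).
monthly = defaultdict(int)
for day, count in daily.items():
    monthly[(day.year, day.month)] += count

for (year, month), total in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {total}")
```

Summing over a month averages out much of the daily noise, so the drift is easier to see in the monthly totals than in the raw series. Wider bins smooth more but also hide short-lived changes, so the bin width is a judgment call.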
@133839297 1 year ago
I like your teaching style.
@ritvikmath 1 year ago
Glad to hear that
@Joy_jester 1 year ago
Love the content. I started studying data science and your videos helped me a lot. A small suggestion/request: for each concept or video that you cover, can you also share some of the resources that you followed? Thanks
@abcpsc 1 year ago
Just one realization of that pointy distribution from my work: it happened with a variable that is regulated, but by a not-so-powerful regulator. In my case, it was wind velocity in a tunnel (so signed and 1D) that is regulated by a not-so-powerful fan.
@bokehbeauty 1 year ago
I'm excited that you teach the message “what does it tell me” and explain it with real-life examples 🎉.
@ritvikmath 1 year ago
Thanks!
@bokehbeauty 1 year ago
@ritvikmath Under which of these types would you put the distribution of income in the US, with a fat tail and a big peak at the upper end?
@lorenzoplaserrano8734 1 year ago
the power of this video 🔥
@ritvikmath 1 year ago
The power of my viewers 🔥
@shadowblack5455 1 year ago
I think that pointy distribution is modelled as a Cauchy distribution, and the skewed distribution is what you'd call a Pareto distribution or an exponential distribution.
@galenseilis5971 1 year ago
A Cauchy distribution looks appropriate in this case. There are other "pointy" distributions to keep in mind if a Cauchy does not fit well, such as the Laplace distribution.
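One way to check whether data are "pointy" in the sense discussed in these comments is excess kurtosis, which is 0 for any Gaussian regardless of its variance and 3 for a Laplace distribution. The sketch below is an illustration on simulated data, not something from the video; the helper `excess_kurtosis`, the sample size, and the seed are assumptions chosen for the example.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth central moment over squared variance, minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

rng = random.Random(0)
n = 100_000

wide_gauss = [rng.gauss(0, 1.0) for _ in range(n)]
narrow_gauss = [rng.gauss(0, 0.1) for _ in range(n)]  # low variance, same shape
# Laplace(0, 1) drawn as the difference of two independent Exponential(1) values.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

k_wide = excess_kurtosis(wide_gauss)
k_narrow = excess_kurtosis(narrow_gauss)
k_laplace = excess_kurtosis(laplace)

print(f"Gaussian, sd=1.0: {k_wide:.2f}")
print(f"Gaussian, sd=0.1: {k_narrow:.2f}")
print(f"Laplace:          {k_laplace:.2f}")
```

Shrinking the variance of a Gaussian makes it look narrower on a fixed axis, but it does not change the shape: the excess kurtosis stays near 0. The Laplace sample lands near 3, reflecting its sharper peak and heavier tails. The Cauchy distribution is left out deliberately, because its moments are undefined and a sample kurtosis does not settle to any value as the sample grows.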
@16876 1 year ago
Would be nice if you'd expand on how to analyze these distributions.
@enesdedovic 1 year ago
Pretty nice. Add another one on how to model those distributions.
@ritvikmath 1 year ago
Thanks!
@galenseilis5971 1 year ago
I don't think most data scientists have the additional time to delve into geometry, but geometry is very much about the "shape" of mathematical objects.
@galenseilis5971 1 year ago
1:44 I don't agree that a max GPA of 4 is a physical limitation in any usual sense of physics. If it is, then by what physical principle? Conservation of angular momentum? But I appreciate the video overall. These are definitely common cases in data science. The video is both informative and practical. Lately I have been thinking about counterfactual inference when there is an unknown upper bound on a facility's capacity. The bound will not change when intervening on the rates, but how the shape of the distribution will change with respect to the boundary and the expectation of the intervention distribution is non-obvious to me. From the modelling side I could derive a truncated distribution. Or I could derive the distribution of MAX(X, c) where c is a parameter or hyperparameter, although in NUTS/Gibbs/MH sampling I find that such bounds are sampled poorly (i.e. lots of divergences) when they're treated as a parameter. Or you can have a mixture distribution that transitions from "away-from-boundary behaviour" to "near-to-boundary behaviour".
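The modelling options mentioned in this comment behave differently at the boundary, which a small simulation makes concrete. This is a sketch under stated assumptions, not the commenter's model: the cap of 4.0 echoes the GPA example, the Gaussian scores are made up, and the capped variable is written as `min(x, cap)` because the example uses an upper bound.

```python
import random

rng = random.Random(7)
cap = 4.0  # hypothetical upper bound, as in the GPA example

# Hypothetical uncapped scores; real data would replace this.
raw = [rng.gauss(3.5, 0.5) for _ in range(20_000)]

# Censoring (capping): values above the cap are recorded as the cap itself,
# which produces a spike of probability mass exactly at the boundary.
censored = [min(x, cap) for x in raw]

# Truncation: values above the cap are never observed at all,
# so the density simply stops at the boundary with no spike.
truncated = [x for x in raw if x <= cap]

share_at_cap = sum(x == cap for x in censored) / len(censored)
print(f"censored:  n={len(censored)}, share exactly at cap={share_at_cap:.3f}")
print(f"truncated: n={len(truncated)}, max={max(truncated):.3f}")
```

A histogram of `censored` shows the spike at the cap, while a histogram of `truncated` shows a density that is cut off. Which one matches a given dataset depends on whether over-the-limit cases are recorded at the limit or dropped.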
@juaneshberger9567 1 year ago
Can you make a video on data engineering vs machine learning engineering vs data scientist vs data analyst? Great vid btw!
@ritvikmath 1 year ago
Thanks for the suggestion!
@Eta_Carinae__ 1 year ago
Isn't the pointy one the plot of distances away from a SLR line with L1 cost? I can't precisely remember the name of the curve, but it's not the curve shown.
@imtryinghere1 1 year ago
Interesting ideas, but it would be more helpful if you had a list of action items with each distribution.
@ritvikmath 1 year ago
Great suggestion
@LanteLuthuli 1 year ago
Has Bard/ChatGPT impacted your work in any way? How did you land up in DS?
@ritvikmath 1 year ago
Hey, thanks for the questions! We will be covering those topics very soon in future videos.
@azimuth4850 1 year ago
👍
@ritvikmath 1 year ago
👍
@anandiyer5361 1 year ago
Always great to watch your videos! Is there a way to contact you directly, @ritwikmath?