Explaining the ANOVA and F-test

9,227 views

Very Normal

1 day ago

When 3 is greater than 2
To try Shortform for free, visit shortform.com/verynormal, and you'll also get 20% off an annual subscription.
Stay updated with the channel and some stuff I make!
👉 verynormal.substack.com
👉 very-normal.sellfy.store

Comments: 50
@johns.7752 13 days ago
The law of total variance is what made it make sense for me! None of my classes covered why something called "analysis of variance" would be a hypothesis test for significantly different means.
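The decomposition this comment alludes to can be checked in a few lines of Python; the group data below are made up purely for illustration:

```python
# Law of total variance behind ANOVA:
# total sum of squares = between-group SS + within-group SS.
groups = [
    [4.1, 5.0, 4.7, 5.3],   # group A
    [6.2, 5.8, 6.5, 6.1],   # group B
    [3.9, 4.2, 3.5, 4.0],   # group C
]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

ss_within = 0.0
ss_between = 0.0
for g in groups:
    g_mean = sum(g) / len(g)
    ss_within += sum((x - g_mean) ** 2 for x in g)
    ss_between += len(g) * (g_mean - grand_mean) ** 2

# The decomposition holds exactly (up to floating-point error)
assert abs(ss_total - (ss_between + ss_within)) < 1e-9
```

Testing means against each other thus reduces to asking how much of the total variance the between-group piece accounts for, which is why a test of means analyzes variance.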
@princeofrain1428 14 days ago
I wish my statistics classes had gone this deep into ANOVA. Unfortunately, we were limited by time constraints and sort of took for granted why they work. Thank you for providing more background context in a fun and engaging way!
@Apuryo 13 days ago
At my school, linear models is a two-year course: regression and ANOVA each get their own semester, then we do generalized models and other things.
@smoother4740 13 days ago
This is the best explanation of ANOVA I've seen so far. It directly answers why a test of the "equality" of different means is called "ANOVA" (Analysis of Variance). I also liked how you showed its direct connection to the F-statistic using the actual equations. Keep up the good work!
@berjonah110 13 days ago
An additional point on using ANOVA in practice: the F-test can only tell you that a difference between the means is present, not which specific groups differ. You have to use a more specific post-hoc test (e.g., Tukey's HSD) to compare specific groups against each other.
@lucasortengren3844 13 days ago
Immensely underrated channel, 46k subscribers is criminal
@Apuryo 13 days ago
what's crazy is that my stat inference midterm is literally tomorrow, and it's about one-way ANOVA 🤣
@very-normal 13 days ago
👀 good luck!
@yorailevi6747 11 days ago
I want to mention I am currently taking a parametric stats course, so I understand the vids about it better!
@R.H111 13 days ago
Hey dude. I'm in high school and I got back my (self-studied) AP Statistics score earlier today. Scored a 5/5. I don't think I could've done it without you lol. tysm.
@very-normal 13 days ago
Great job! I’m sure I only played a small role in that, you’re the one who hustled to learn the material, congratulations!
@walterreuther1779 13 days ago
Oh, I love that you not only know the term homoskedasticity but also mention it as an assumption we're making! Sometimes I ask psychologists what they think of Nassim Taleb's criticism of IQ (that it's too heteroskedastic), and their looks usually give away that they never learned about heteroskedasticity in their psychometrics lessons... I think this is sad, so all the better that you mention it ;-)
@doentexd4770 13 days ago
Christian, would you consider making a video specifically about multiple regression? I still don't have an intuitive understanding of why the Gauss-Markov assumptions need to hold in order to make inferences, and I think your videos would be a great help, because you're an incredible teacher. Thank you for your work! Keep it up!
@samlevey3263 13 days ago
It's because the assumptions of the Gauss-Markov theorem are used to determine what the standard errors of the coefficient estimators are. So, if those assumptions aren't met, but you still calculate the standard errors in the same way as you would if they were met, then you're going to get incorrect values for the standard errors. Then you use those standard errors to calculate t-statistics and such, so you'll get incorrect values for the t-statistics, and hence incorrect confidence intervals and potentially incorrect results for hypothesis tests.
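This point can be sketched with a small Monte Carlo simulation (pure Python, made-up numbers): when errors are heteroskedastic, the usual homoskedastic standard-error formula understates the true sampling variability of the OLS slope.

```python
import math
import random

random.seed(42)

x = list(range(1, 21))                      # fixed design points
x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

slopes, naive_ses = [], []
for _ in range(3000):
    # heteroskedastic errors: the error sd grows with x, violating Gauss-Markov
    y = [2.0 * xi + random.gauss(0, 0.05 * xi ** 2) for xi in x]
    y_bar = sum(y) / len(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (len(x) - 2)
    slopes.append(b)
    naive_ses.append(math.sqrt(s2 / sxx))   # the usual homoskedastic SE formula

mean_b = sum(slopes) / len(slopes)
emp_sd = math.sqrt(sum((b - mean_b) ** 2 for b in slopes) / len(slopes))
avg_naive_se = sum(naive_ses) / len(naive_ses)

# The true sampling spread of the slope exceeds what the naive SE reports,
# so t-statistics and confidence intervals built from it would be miscalibrated.
assert emp_sd > avg_naive_se
```

The slope estimate itself stays unbiased; it's the reported uncertainty that goes wrong, which is exactly the failure mode described above.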
@yazer9821 13 days ago
can you do a video on GLMs please!! Your videos are great
@1.4142 13 days ago
Wow I was just working on this exact scenario
@mclovin312 12 days ago
Thanks for continuously producing these videos! Your channel is by far the best explainer of statistics among YouTube channels, IMO. I'm curious: what software do you use to create the videos? PowerPoint?
@very-normal 12 days ago
Thanks! I use Final Cut Pro for editing, Figma and Midjourney for graphics and the manim python library for animations
@GeoffryGifari 13 days ago
Hmmm what if 5 out of 6 drug-organ pairs see success in cancer treatment? (1 mean singled out from the group, but not what we expect) Or if the group means are clustered, split in half (pairs 1,2,3 have the same mean, so do pairs 4,5,6)?
@very-normal 13 days ago
You’d have a similar conclusion. The ANOVA is only detecting that at least one of them is different, so if that’s the case, there should be some compelling evidence to reject the null hypothesis. But to actually figure out *which* one is different, you’d need to follow up with secondary testing for each of the means
@Iachlan 7 days ago
Can you explain the statistics behind weather prediction?
@very-normal 7 days ago
I’m not very well versed in it, but it sounds like it’d be a fancy, high-dimensional regression model
@AJ-tr4jx 12 days ago
What if the drug has an effect on all the test groups and the means of all the groups are shifted by the same amount?
@very-normal 12 days ago
You’d prolly get a null result. If you shift all the distributions by the same amount, there wouldn’t be a change in the variance in group means
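This shift invariance is easy to verify directly: adding the same constant to every observation moves all the group means together, so the between-group and within-group sums of squares, and hence the F statistic, are unchanged. A minimal sketch with made-up data:

```python
def f_statistic(groups):
    """One-way ANOVA F statistic from the sums of squares."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[4.1, 5.0, 4.7], [6.2, 5.8, 6.5], [3.9, 4.2, 3.5]]
shifted = [[x + 5.0 for x in g] for g in groups]   # same shift for every group

f1 = f_statistic(groups)
f2 = f_statistic(shifted)

# Shifting everything by the same amount leaves F (essentially) unchanged
assert abs(f1 - f2) < 1e-9
```

This also shows why such a design can't detect a drug that works equally well everywhere: with no untreated control group, a uniform shift is invisible to the test.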
@Iachlan 7 days ago
In the one-sample t-test, we take the alpha (type I) error to be constant and play around with the beta error. Could we do it the other way around, and what would the implications be?
@very-normal 7 days ago
you could, but most of the time we’re interested in detecting a significant effect, so power is the thing we want to maximize. There’s a trade off between reducing type-I error and power, so we choose to keep alpha constant to signify we tolerate a defined probability of making a wrong decision about rejecting the null
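The trade-off can be made concrete with a one-sided one-sample z-test under the normal approximation (the effect size and sample size below are made up for illustration):

```python
from statistics import NormalDist

def power(alpha, effect, n):
    """P(reject H0) when the true standardized effect size is `effect`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)       # rejection cutoff
    return 1 - NormalDist().cdf(z_alpha - effect * n ** 0.5)

effect, n = 0.3, 50
p05 = power(0.05, effect, n)
p01 = power(0.01, effect, n)

# Tightening alpha (fewer type-I errors) necessarily costs power
# (more type-II errors), holding the effect and sample size fixed.
assert p01 < p05
```

Fixing beta instead and solving for alpha is the same curve read in the other direction; the convention of fixing alpha just reflects which error we insist on controlling.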
@chillphil967 14 days ago
1:19 Is there heart cancer? I thought not, since the cells are there from birth. Cool video either way, thx!
@very-normal 14 days ago
I saw it was really rare, but deep down, I was just looking for an emoji to represent the group lol 😅
@dullyvampir83 13 days ago
If the residuals are normally distributed, aren't the original data then normally distributed as well? Aren't they just shifted by the mean?
@very-normal 13 days ago
You’re right, I just wanted to emphasize that the main assumption is on the residuals. It implies that the outcome is normally distributed, but it’s more of a consequence of the fact that the residuals are normally distributed, rather than an assumption of the model
@RomanNumural9 13 days ago
I think an important note on this is that the more populations you check the higher the likelihood is that one differs significantly by sheer luck. If instead of 5 cancers you're checking 100, the odds that statistical fluke will make one mean look further away from the others is fairly high.
@very-normal 13 days ago
Yeah I thought about covering multiplicity here, but it deserves its own video
@briangreco2718 13 days ago
This is not true with ANOVA. It has a type I error rate of 5% for finding *any* difference, not for each particular difference. If you had 1 million populations that were all the same, you would still only have an alpha% chance of finding a fluke. This is the advantage of running an ANOVA instead of just running a bunch of two-sample pairwise tests.
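The contrast with naive pairwise testing can be sketched by simulation (pure Python, made-up numbers): when all k group means are truly equal, running every pairwise z-test at the 5% level produces at least one false positive far more often than 5%, which is the problem the single F-test avoids.

```python
import math
import random

random.seed(1)
k, reps = 10, 2000
false_alarms = 0
for _ in range(reps):
    # standardized group means under the null: each ~ N(0, 1)
    means = [random.gauss(0, 1) for _ in range(k)]
    # pairwise z statistic for equal-size groups: (m_i - m_j) / sqrt(2)
    any_reject = any(
        abs(means[i] - means[j]) / math.sqrt(2) > 1.96
        for i in range(k) for j in range(i + 1, k)
    )
    false_alarms += any_reject

familywise_rate = false_alarms / reps
# With 45 pairwise tests, the chance of at least one fluke
# is well above the nominal 5% of any single test.
assert familywise_rate > 0.05
```

A single ANOVA F-test on the same data would reject only about 5% of the time, because its null distribution accounts for all k means at once.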
@abcpsc 10 days ago
At 9:22, why are they Chi square distributed?
@very-normal 10 days ago
It comes from the distributional assumption on the residuals. The residuals were assumed to be normally distributed with some variance, sigma^2. If you divide a residual by sigma, you get a standard normal; squaring those and summing gives you a chi-squared distribution. This applies to both the numerator and denominator in the F-statistic.
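The underlying fact can be checked by simulation: a sum of k squared standard normals is chi-squared with k degrees of freedom, so its mean is k and its variance is 2k.

```python
import random

random.seed(0)
k, reps = 5, 20000
# each draw: a sum of k squared standard normals ~ chi-squared(k)
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(reps)]

mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / reps

assert abs(mean - k) < 0.15     # E[chi2_k] = k
assert abs(var - 2 * k) < 1.0   # Var[chi2_k] = 2k
```

A ratio of two independent chi-squared variables, each divided by its degrees of freedom, is exactly what defines the F distribution of the test statistic.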
@Imperial_Squid 13 days ago
Could you explain a bit further about the "residuals are normally distributed not that the variable is normally distributed itself" thing? This is one of the things that trips me up most often..
@very-normal 13 days ago
Yeah for sure, I’ll try my best. This is partially my opinion, so just a heads up.

My feeling is that assuming something about the data itself is much stronger than assuming something about the residuals. Very rarely will real-world data follow nice distributions like the Normal, so it’s harder to convince people (read: the statistical referee) that this will hold up. On the other hand, assuming something about the residuals is not so bad. It’s like saying: we know there’s an average outcome and people will differ from this average, but they won’t differ too badly from it. In other words, outlier residuals are very rare.

It’s confusing because this residual assumption implies that the outcome is also normally distributed, but it’s important to note that it’s the residual assumption we make. It also matters because with stuff like linear regression, we’re looking at how different values of the predictor (i.e. cancer group) shift the distribution of the outcome. If you assume a distribution on the data itself, it gets more complicated to work in how other variables influence it. Putting the distributional assumption on the residuals doesn’t come with this baggage.

Some people are taught that they should transform the outcome so that it “works better” with linear regression or ANOVA. Even though you’re manipulating the outcome, the hope is that the transformation makes the *residuals* look more normal.

I hope this helps clarify somewhat. If anyone else sees this and thinks I left something out, please chip in. This is a common question, and even I don’t feel like I get all the nuances.
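A toy example of why the distinction matters (made-up groups and numbers): with two groups whose means differ a lot, the pooled outcome is bimodal, nothing like a single normal, yet the residuals around each group mean are exactly the nice N(0, 1) noise the model assumes.

```python
import random

random.seed(7)
n = 5000
group_means = {"A": 0.0, "B": 10.0}

# each observation: (group label, group mean + standard normal noise)
data = [
    (g, mu + random.gauss(0, 1))
    for g, mu in group_means.items()
    for _ in range(n)
]

outcomes = [y for _, y in data]
resids = [y - group_means[g] for g, y in data]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# The pooled outcome variance is dominated by the gap between the group means
# (an equal mixture of N(0,1) and N(10,1) has variance 1 + 5^2 = 26), while
# the residual variance is just the noise variance (about 1).
assert variance(outcomes) > 20
assert 0.9 < variance(resids) < 1.1
```

So a histogram of the raw outcome can look wildly non-normal while the model's actual assumption, normality of the residuals, holds perfectly.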
@Imperial_Squid 13 days ago
@@very-normal "It's confusing because this residual assumption implies the outcome is also normally distributed": yeah, that's the bit that always tripped me up. I get that you can make one or the other the core assumption and build up from there (it's like picking your axioms in pure maths), but in my head, the fact that the kinda nebulous residuals assumption implies the much more intuitive distribution assumption meant I was often fighting between intuition and logic in thinking it through. It also doesn't help that thinking of an example where the residuals are Normal but the distribution _isn't_ is much harder...

So it's more about being an assumption of convenience, in that it makes the maths much nicer to deal with and is also a weaker, more generalisable assumption, rather than it being about purity or tradition or something. Thanks, I think I get it now! Though no doubt this will be one of those weird bits that'll always feel a little bit off, I feel like I have a much better grasp of the rationale! Much appreciated!
@jasondads9509 13 days ago
anova did my head in stats, i
@walterreuther1779 13 days ago
Question: What to do when the assumption of homogeneity of variance is not met, i.e. there are different variances in the different populations? I would think this is a rather major assumption, especially if the sample size is small, as that would make heterogeneity of variance harder to test... Shouldn't one always test in some form for heterogeneity of variance? Is this done in practice? Edit: Sorry, I originally wrote homoskedasticity and heteroskedasticity, but I meant homogeneity and heterogeneity of variance. (The former refers to constant variance across the regressor variables, while the latter refers to the same variance across different sub-populations.)
@zaydmohammed6805 13 days ago
Same question here. In regression, I remember them teaching us that you can rescale the data by the different variances in the presence of heteroscedasticity. I wonder if that would work here, or if we have to do some sort of non-parametric test.
@very-normal 13 days ago
Yeah, common variance is a pretty strong assumption to make. One solution I know of is a variant of the ANOVA called Welch’s ANOVA that can be used when you don’t want to make this assumption. It’s from the same guy behind Welch’s t-test, the version that students learn for two-sample problems when they also can’t assume common variance.
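For reference, Welch's F statistic can be hand-coded in a few lines; this is a sketch following the usual textbook formula (precision weights n_i / s_i²), with made-up groups chosen so the variances clearly differ:

```python
def welch_anova_f(groups):
    """Welch's ANOVA F statistic and its degrees of freedom."""
    k = len(groups)
    means = [sum(g) / len(g) for g in groups]
    # unbiased sample variance per group
    variances = [
        sum((x - m) ** 2 for x in g) / (len(g) - 1)
        for g, m in zip(groups, means)
    ]
    w = [len(g) / v for g, v in zip(groups, variances)]   # precision weights
    w_total = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, means)) / w_total

    numerator = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, means)) / (k - 1)
    lam = sum(
        (1 - wi / w_total) ** 2 / (len(g) - 1) for wi, g in zip(w, groups)
    )
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)          # Welch's approximate denominator df
    return numerator / denominator, k - 1, df2

groups = [
    [5.1, 4.8, 5.6, 5.0, 4.9],    # small variance
    [7.5, 3.2, 9.1, 1.8, 8.0],    # large variance
    [6.0, 6.4, 5.7, 6.1, 6.3],
]
f, df1, df2 = welch_anova_f(groups)
```

Unlike the classic F-test, the high-variance group here is down-weighted rather than allowed to dominate the pooled within-group estimate, which is what makes the test robust to unequal variances.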
@walterreuther1779 13 days ago
@@very-normal Thank you, that's great to know. It seems like Welch's ANOVA is really the way to go, both for small sample sizes and when you know nothing about the data. (Apparently it is almost as powerful as the standard ANOVA even when homogeneity of variance does hold, so...)
@chillphil967 14 days ago
🎉
@dibyajyotisaikia11 9 days ago
I think the example is incorrect: if the new drug is effective on the different types of cancer, the ANOVA may still come out statistically non-significant despite the drug being effective, leading to a wrong conclusion and a loss for the company 😂
@very-normal 9 days ago
that’s all hypothesis tests tho lol
@dibyajyotisaikia11 8 days ago
@@very-normal I meant you need at least one more standard-of-care or control group to come to any conclusion regarding efficacy
@synchro-dentally1965 13 days ago
I heard recently that Fisher was great at stats but not the best in moral and ethical character.
@very-normal 13 days ago
yeahhh, he had some L opinions on smoking and eugenics