7.5 Gradient Boosting (L07: Ensemble Methods)

12,169 views

Sebastian Raschka

3 years ago

In this video, we will take the concept of boosting a step further and talk about gradient boosting. Whereas AdaBoost uses per-example weights to boost the trees in the next round, gradient boosting uses the gradients of the loss to compute residuals, and the next tree in the sequence is fit to those residuals.
XGBoost paper mentioned in the video: dl.acm.org/doi/pdf/10.1145/29...
Link to the code: github.com/rasbt/stat451-mach...
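A minimal sketch of the residual-fitting loop described above, assuming scikit-learn's DecisionTreeRegressor and the squared error loss (illustrative only; this is not the linked course code, and the function names are made up for the example):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_rounds=100, learning_rate=0.1):
    """Fit a sequence of shallow regression trees, each one on the
    residuals (negative gradients of the squared error loss) of the
    ensemble built so far."""
    f0 = np.mean(y)                               # initial prediction: a constant
    pred = np.full(len(y), f0, dtype=float)
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred                      # pseudo-residuals for squared error
        tree = DecisionTreeRegressor(max_depth=2)
        tree.fit(X, residuals)                    # next tree is fit to the residuals
        pred += learning_rate * tree.predict(X)   # add its (shrunken) contribution
        trees.append(tree)
    return f0, trees

def gradient_boost_predict(X, f0, trees, learning_rate=0.1):
    pred = np.full(X.shape[0], f0, dtype=float)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

The learning rate shrinks each tree's contribution, so the ensemble improves through many small corrections rather than a few large ones.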
-------
This video is part of my Introduction to Machine Learning course.
Next video: • 7.6 Random Forests (L0...
The complete playlist: • Intro to Machine Learn...
A handy overview page with links to the materials: sebastianraschka.com/blog/202...
-------
If you want to be notified about future videos, please consider subscribing to my channel: / sebastianraschka

Comments: 19
@deltax7159 2 months ago
this is the first video of yours I have come across, and it's by far the best I have found on this topic. Will be binging everything you have to offer from now on. Thanks for all the content, man!
@yerhoam 7 months ago
Thank you for the great explanation! I liked the way you say "prediction" :)
@nazmuzzamankhan4764 3 years ago
I really liked the way you explained the steps with numbers. It helped me a lot to understand the notation in the equations.
@SebastianRaschka 3 years ago
glad to hear that it was useful!
@rohitgarg776 2 years ago
Thanks, explained very nicely
@hassandanamazraeh5975 1 year ago
A great course. Thank you very much.
@SebastianRaschka 1 year ago
Thanks for the kind words! Glad to hear it was useful!
@newbie8051 9 months ago
Well, I understood the gradient boosting part, as in we focus on the residuals and fit further trees to lower the loss of the previously built trees. But I couldn't grasp how XGBoost achieves this via parallel computation. Guess I'll have to read the paper :)
@just4onecomment 3 years ago
Hi Professor, thank you very much for the educational video! Do you have any thoughts on how this stepwise additive model compares to fitting a very large model with many parameters in a "stepwise" fashion based on gradient descent? For example, freezing and additively training subnetworks of a neural model.
@SebastianRaschka 3 years ago
Interesting question. There's something called layerwise pre-training in the context of neural networks. It's basically somewhat similar to what you describe, training one layer at a time. The difference is really the structure of the model though, because it's fully connected layers rather than tree-based. But yeah, it's an interesting thought
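As a rough illustration of that layerwise idea, here is a small sketch assuming PyTorch (the toy data and module names are made up for the example; this is not from the lecture): each block is trained in turn while the blocks trained before it stay frozen.

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(64, 20)                      # toy inputs
y = torch.randn(64, 1)                       # toy targets

blocks = nn.ModuleList([nn.Sequential(nn.Linear(20, 20), nn.ReLU()) for _ in range(3)])
head = nn.Linear(20, 1)
loss_fn = nn.MSELoss()

for i, block in enumerate(blocks):
    # Freeze all blocks trained in earlier stages.
    for prev in blocks[:i]:
        for p in prev.parameters():
            p.requires_grad_(False)
    optimizer = torch.optim.SGD(list(block.parameters()) + list(head.parameters()), lr=0.01)
    for _ in range(100):                     # a few epochs per stage
        out = X
        for b in blocks[: i + 1]:            # forward pass through the blocks added so far
            out = b(out)
        loss = loss_fn(head(out), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()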
@urthogie 5 months ago
Why does the tree in step 2 not have a third decision node to split Waunake and Lansing?
@asdf_600 2 years ago
Very nice video :) I was wondering why, for gradient boosting, we fit the derivative instead of the residual? Intuitively that's what I would do :/
@SebastianRaschka 2 years ago
Good question. If we consider the squared error loss "1/2(yhat-y)^2", its derivative with respect to yhat is "yhat-y", which is also what people refer to as the residual in a linear regression context. In other words, the derivative looks like the residual, so we basically are fitting the derivative. If the loss is not the squared error loss, the derivative may look different, which is why it is called a "pseudo residual" in general. We could also just call it the loss derivative and not use the term pseudo residual at all; I think it's just a convention in gradient boosting contexts to use the term pseudo residual.
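To make that concrete, here is a tiny numeric check (just an illustrative NumPy snippet, not code from the lecture): for the squared error loss 1/2*(yhat - y)^2, the derivative with respect to yhat is yhat - y, so the negative gradient is exactly the ordinary residual y - yhat.

import numpy as np

y    = np.array([30., 25., 42.])    # targets
yhat = np.array([28., 31., 40.])    # current model predictions

grad = yhat - y                      # d/dyhat of 1/2 * (yhat - y)**2
pseudo_residuals = -grad             # negative gradient

print(pseudo_residuals)              # [ 2. -6.  2.]
print(y - yhat)                      # identical: the ordinary residuals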
@muhammadlabib3744 1 year ago
I'm still wondering about minute 13:19: why did you choose age >= 30 as the root node? Is that from the residuals or something else?
@SebastianRaschka 1 year ago
Oh this was an arbitrary choice for this example