So well explained that it makes GANs easy to understand. Thank you. Big thumbs up.
@just_a_viewer5 19 hours ago
amazingly taught. thank you so much!
@nettrogenium. 21 hours ago
Thanks dude, I couldn't understand any explanation of all that before I found your video.
@aapje180 1 day ago
Thank you!
@SerranoAcademy 15 hours ago
Thank you so much for your kind contribution @aapje180! It's very much appreciated. :)
@tiagomelojuca7851 1 day ago
+1 sub, great explanation, amazing how you make the math theory fit so well with the subject
@harsharangapatil2423 1 day ago
Can you please add a video on curse of dimensionality?
@SerranoAcademy 15 hours ago
Great idea, thank you!
@tanggenius3371 2 days ago
Thanks, the explanation is so intuitive. Finally understood the idea of attention.
@unclecode 2 days ago
Appreciate the great explanation. I have a question regarding the clipping formula at 36:42. You used the "min" function. For example, if the ratio is 0.4 and epsilon is 0.3, we should get 0.7 in this scenario. However, the formula you introduced returns 0.4. Shouldn't the formula be clipped_f(x) = max(1 - epsilon, min(f(x), 1 + epsilon))? Am I missing anything?
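The two-sided clip the commenter proposes can be sketched in a few lines of Python (the `clipped_f` helper name and the 0.4 / 0.3 numbers are the commenter's example, not code from the video):

```python
# Sketch of the proposed two-sided PPO-style clip: keep the probability
# ratio inside the band [1 - epsilon, 1 + epsilon].
def clipped_f(ratio, epsilon):
    """Clamp a probability ratio to [1 - epsilon, 1 + epsilon]."""
    return max(1 - epsilon, min(ratio, 1 + epsilon))

print(clipped_f(0.4, 0.3))  # 0.7: a ratio below the band is raised to the floor
print(clipped_f(1.5, 0.3))  # 1.3: a ratio above the band is lowered to the ceiling
print(clipped_f(1.0, 0.3))  # 1.0: ratios inside the band pass through unchanged
```

With only the inner `min`, the first case would indeed return 0.4, which is the discrepancy the comment points out.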
@WrongDescription 3 days ago
Best explanation on the internet!!
@camzbeats6993 3 days ago
Top
@camzbeats6993 4 days ago
Very intuitive, thank you. I like the example-based approach you take. 👏
@saralagrawal7449 4 days ago
Someone please ban this Be10x. It has gotten really irritating.
@Cathiina 4 days ago
Yes, so true. I only passed all my maths courses by learning by heart. I was never quite satisfied with even good grades because I knew in my heart I understood nothing. Currently refreshing linear algebra in your Coursera course and WOW! It's addictive to actually learn what the rank of a matrix means. 😊☀️
@HoussamBIADI 4 days ago
Thank you for this amazing explanation <3
@mekuzeeyo 5 days ago
Great video as always. I have a question: in practice, which one works better, DPO or RLHF?
@SerranoAcademy 4 days ago
Thank you! From what I've heard, DPO works better, as it trains the network directly instead of using RL and two networks.
@mekuzeeyo 4 days ago
@@SerranoAcademy Thank you, sir, for the great work. Your Coursera courses have been awesome.
@hyperbitcoinizationpod 5 days ago
And the entropy is the number of bits needed to convey the information.
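As a concrete illustration of that point (the toy distributions below are my own, not from the video), Shannon entropy in bits is H = -Σ p·log₂(p):

```python
import math

# Shannon entropy in bits: the average number of bits needed to encode
# one outcome drawn from the distribution.
def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))  # 1.0 bit: a fair coin needs one bit per flip
print(entropy_bits([0.25] * 4))  # 2.0 bits: four equally likely outcomes
```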
@martadomingues1691 5 days ago
Very good video, it helped clear some doubts I was having with this along with the Viterbi Algorithm. It's just too bad that the notation used was too different from class, but it did help me understand everything and make a connection between all of it. Thank you!
@Cathiina 6 days ago
Hi Mr. Serrano! I am doing your coursera course at the moment on linear algebra for machine learning and I am having so much fun! You are a brilliant teacher, and I just wanted to say thank you! Wish more teachers would bring theoretical mathematics down to a more practical level. Obviously loving the very expensive fruit examples :)
@SerranoAcademy 6 days ago
Thank you so much @Cathiina, what an honor to be part of your learning journey, and I’m glad you like the expensive fruit examples! :)
@vigneshram5193 6 days ago
Thank you Luis Serrano for this super explanatory video
@bin4ry_d3struct0r 7 days ago
Is there an industry standard for the KLD above which two distributions are considered significantly different (like how 0.05 is the standard for the p-value)?
@SerranoAcademy 7 days ago
Ohhh, that's a good question. I don't think so, since normally you use it for minimization or for comparison between distributions, but I'll keep an eye out; maybe it would make sense to have a standard for it.
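For reference, here is a minimal sketch of how the KL divergence is computed for discrete distributions (the numbers are made up for illustration); note it is a comparison tool rather than a thresholded test, and it is not symmetric:

```python
import math

# D_KL(P || Q) = sum over outcomes of p * log2(p / q), in bits.
def kl_divergence(p, q):
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))  # 0.0: identical distributions
print(kl_divergence(p, q))  # positive, and different from kl_divergence(q, p)
```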
@frankl1 7 days ago
Did anyone else expect something other than softmax from the Bradley-Terry model, like me? 😅
@SerranoAcademy 7 days ago
lol, I was expecting something different too initially 🤣
@frankl1 7 days ago
Really love the way you broke down the DPO loss; this direct way is more welcome by my brain :). Just one question on the video: I am wondering how important it is to choose the initial transformer carefully. I suspect that if it is very bad at the task, then we will have to change the initial response a lot, but because the loss function prevents changing too much in one iteration, we will need to perform a lot of tiny changes toward the good answer, making the training extremely long. Am I right?
@SerranoAcademy 7 days ago
Thank you, great question! This method is used for fine-tuning, not for training from scratch. In other words, it's crucial that we start with a fully trained model. For training, you'd use normal backpropagation on the transformer, and lots of data. Once the LLM is trained and very trusted, then you use DPO (or RLHF) to fine-tune it (meaning, post-train it to get from good to great). So we should assume that the model is as trained as it can be, and that's why we trust the LLM and try to change it only marginally. If we were to use this method to train a model that's not fully trained... I'm not 100% sure it would work. It may or may not, but we'd still have to penalize the KL divergence much less. Also, human feedback gives a lot less data than scraping the whole internet, so I would still not use this as a training method, more as refining. Let me know if you have more questions!
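A minimal numeric sketch of the DPO loss being discussed (all probabilities and the beta value below are made-up illustrations, not numbers from the video):

```python
import math

# DPO loss for one preference pair: -log sigmoid(beta * (log-ratio of the
# preferred response w minus log-ratio of the rejected response l)).
# pi_theta is the policy being tuned; pi_ref is the frozen reference model.
def dpo_loss(p_theta_w, p_theta_l, p_ref_w, p_ref_l, beta=0.1):
    logits = beta * (math.log(p_theta_w / p_ref_w) - math.log(p_theta_l / p_ref_l))
    return -math.log(1 / (1 + math.exp(-logits)))

# If the policy already favors the preferred answer more than the reference
# does, the loss is small; if it favors the rejected one, the loss grows.
print(dpo_loss(0.6, 0.1, 0.4, 0.3))  # smaller loss
print(dpo_loss(0.1, 0.6, 0.4, 0.3))  # larger loss
```

The small beta keeps the updates gentle, which connects to the reply above: the method assumes the reference model is already good and only nudges it.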
@frankl1 7 days ago
@@SerranoAcademy Thanks for the answer, I understand better now. I forgot that this design is for fine-tuning.
@rb4754 7 days ago
Very nice lecture on attention.
@mayyutyagi 7 days ago
Now whenever I watch a Serrano video, I like it first and then start watching, because I know the video is going to be outstanding as always.
@mayyutyagi 7 days ago
Liked this video and subscribed to your channel today.
@mayyutyagi 7 days ago
Amazing video... Thanks, sir, for this pictorial representation and for explaining this complex topic in such an easy way.
@AravindUkrd 8 days ago
Thanks for the simplified explanation. Awesome as always. The book link in the description is not working.
@SerranoAcademy 7 days ago
Thank you so much! And thanks for letting me know, I'll fix it.
@johnzhu5735 8 days ago
This was very helpful
@siddharthabhakta3261 8 days ago
The best explanation & depiction of SVD.
@melihozcan8676 8 days ago
Thanks for the excellent explanation! I used to know the KL Divergence, but now I understand it!
@saedsaify9944 8 days ago
Great one; the simpler it looks, the harder it is to build!
@stephenlashley6313 8 days ago
This and your whole series on attention NNs are a thing of beauty! There are many ways of simplifying this, but you come the closest to showing that attention NNs and QC are identical, and QC is much better. In my opinion QC has never been done correctly; the gates are too confusing and poorly understood. QC is no longer in a simplified infant stage; it is mature in what it can do and matches all psychology observations. All problems in biology and NLP are sequences of strings.
@cloudshoring 8 days ago
awesome!
@bifidoc 9 days ago
Thanks!
@SerranoAcademy 9 days ago
Thank you so much for your kind contribution @bifidoc!!! 💜🙏🏼
@user-xc8vy4cw9k 9 days ago
I would like to say thank you for the wonderful video. I want to learn reinforcement learning for my future study in the field of robotics. I have seen that you only have 4 videos about RL. I am hungry for more of your videos. I found that your videos are easier to understand because you explain well. Please add more RL videos. Thank you 🙏
@SerranoAcademy 9 days ago
Thank you for the suggestion! Definitely! Any ideas on what topics in RL to cover?
@user-xc8vy4cw9k 7 days ago
@@SerranoAcademy More videos in the field of robotics, please. Thank you. Could you also guide me on how to approach the study of reinforcement learning?
@Omsip123 9 days ago
So well explained
@guzh 9 days ago
The label "DPO main equation" should read "PPO main equation."
@epepchuy 10 days ago
Excellent explanation!!!
@iantanwx 10 days ago
Most intuitive explanation for QKV, as someone with only an elementary understanding of linear algebra.
@VerdonTrigance 10 days ago
It's kinda hard to remember all of these formulas and it's demotivating me from further learning.
@javiergimenezmoya86 10 days ago
You do not have to remember those formulas. You only have to understand the logic behind them.
@SerranoAcademy 15 hours ago
Thanks for your comment @VerdonTrigance! I also can't remember these formulas, since to me, they are the worst way to convey information. That's why I like to see it with examples. If you understand the example and the idea underneath, then you understand the concept. Don't worry about the formulas.
@SerranoAcademy 15 hours ago
Agreed @javiergimenezmoya86!
@IceMetalPunk 10 days ago
I'm a little confused about one thing: the reward function, even in the Bradley-Terry model, is based on the human-given scores for individual context-prediction pairs, right? And πθ is the probability from the current iteration of the network, and πRef is the probability from the original, untuned network? So then after that "mathematical manipulation", how does the human-given set of scores become represented by the network's predictions all of a sudden?
@user-xc8vy4cw9k 10 days ago
Thank you for the wonderful video. Please add more practical example videos for the application of reinforcement learning.
@SerranoAcademy 10 days ago
Thank you! Definitely! Here's a playlist of applications of RL to training large language models. kzfaq.info/sun/PLs8w1Cdi-zvYviYYw_V3qe6SINReGF5M-
@laodrofotic7713 10 days ago
None of the videos I've seen on this subject actually explain where the Q, K, V values come from! It's amazing that people jump into making videos without understanding the concepts clearly! I guess YouTube must pay a lot of money! This video does a good job of explaining most of the things, but it never tells us where the actual Q, K, V values come from, or how the embeddings turn into them, and in my opinion it got some things wrong. The Q comes from the embeddings multiplied by W_Q, which is a weight and a parameter of the model, but then the question is: where do W_Q, W_K, W_V come from?
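On the commenter's question: W_Q, W_K, and W_V are ordinary learned parameters, initialized randomly and then updated by backpropagation like any other weight in the model. A minimal NumPy sketch (the shapes and random initialization are illustrative assumptions, not the video's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8

X = rng.normal(size=(seq_len, d_model))    # token embeddings (one row per token)
W_q = rng.normal(size=(d_model, d_head))   # learned projection matrices: these
W_k = rng.normal(size=(d_model, d_head))   # start random and are trained by
W_v = rng.normal(size=(d_model, d_head))   # gradient descent with the rest

Q, K, V = X @ W_q, X @ W_k, X @ W_v        # queries, keys, values per token

# Scaled dot-product attention over the projected vectors
scores = Q @ K.T / np.sqrt(d_head)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ V
print(output.shape)  # (4, 8)
```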
@bendim94 10 days ago
How do you choose the number of features in the two matrices, i.e., how did you choose to have only 2 features?
@Priyanshuc2425 10 days ago
Hey, I know this 👦. He is my maths teacher, who doesn't just teach but makes us visualize why we learn a topic and how it is useful in the real world ❤
@Q793148210 10 days ago
It was just so clear. 😃
@DienTran-zh6kj 10 days ago
I love his teaching, he makes complex things seem simple.
@shouvikdey7078 10 days ago
Love your videos, please make more such videos on mathematical description of generative models such as GAN, Diffusion, etc.
@SerranoAcademy 10 days ago
Thank you! I have some on GANs and diffusion models, check them out! GANs: kzfaq.info/get/bejne/brJhZMR-s5uviWw.html Stable diffusion: kzfaq.info/get/bejne/gNNxh9d4ld-lZXk.html
@mohammadarafah7757 11 days ago
We hope you'll cover the Wasserstein distance 😊
@SerranoAcademy 10 days ago
Ah good idea! I'll add it to the list, as well as earth-mover's distance. :)
@mohammadarafah7757 10 days ago
@SerranoAcademy I also highly recommend describing Explainable AI (XAI), which depends on statistics.