Causal Inference with Machine Learning - EXPLAINED!

37,241 views

CodeEmporium

Follow me on M E D I U M: towardsdatascience.com/likeli...
Join us on D I S C O R D: / discord
Please like and S U B S C R I B E: / codeemporium
INVESTING
[1] Webull (you can get 3 free stocks by setting up a Webull account today): a.webull.com/8XVa1znjYxio6ESdff
REFERENCES
[1] RCTs may not model ATE exactly as we think. But more importantly, they don’t measure ITEs: www.ncbi.nlm.nih.gov/pmc/arti...
[2] Literature Review of Causal Inference + Uplift modeling: proceedings.mlr.press/v67/gut...
[3] Quick intro to uplift modeling: towardsdatascience.com/a-quic...
[4] Why Uplift modeling in marketing is important: towardsdatascience.com/why-ev...
[5] Uplift Modeling: link.springer.com/content/pdf...
[6] Code for causalml: github.com/uber/causalml
[7] Section 3 here shows the assumptions that need to be met for an RCT to give an estimate of ATE that is representative of the population: rss.onlinelibrary.wiley.com/d...
[8] Causal ML documentation about methodologies to determine CATE: causalml.readthedocs.io/en/la...
[9] MIT lecture on covariate adjustment & matching: • 15. Causal Inference, ...
[10] Microsoft’s Blog post illustrating different methods to determine CATE: / causal-inference-part-...
[11] Wayfair Tech blog that succinctly explains Uplift Decision Trees (I’ll probably make a video on this in the future): www.aboutwayfair.com/tech-inn...
[12] Article that ties the research paper with the meta-learner algorithms: chowdera.com/2021/10/20211025...

Comments: 67
@scitechtalktv9742 2 years ago
Very instructive and well-made video! I have one question: where can I find your video on this calibration thing? Very curious about that! I also have one slight remark: at 12:08 in the video there is a mistake in the formula for the ITE. In both terms of the ITE formula you use W=1, but it trivially has to be W=1 (treated) and W=0 (not treated), respectively, I think. Do you agree? It is just a minor remark, the rest is outstanding 👍
@CodeEmporium 2 years ago
Wow very good eye! I didn't notice this. And you are correct - Wi for the second term in that equation should be 0 and not 1 since it represents "not treated". Thanks for catching that!
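For readers following along, the corrected expression (binary purchase outcome Yi, treatment indicator Wi, customer features Xi) would then read:

```latex
\mathrm{ITE}_i = P(Y_i = 1 \mid X_i,\, W_i = 1) - P(Y_i = 1 \mid X_i,\, W_i = 0)
```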
@CodeEmporium 2 years ago
And the video on calibration - you can go to the video tab on my channel and look for a video called "Model Calibration - EXPLAINED". Sorry, I can't paste the link here; KZfaq isn't good with it.
@scitechtalktv9742 2 years ago
@@CodeEmporium I find the content of your videos to be of VERY good quality! So when a new one arrives, I watch it very focused, so spotting a mistake is not that hard for me. I hope you will continue making these videos! Can you point me to that video about CALIBRATION?
@scitechtalktv9742 2 years ago
@@CodeEmporium OK, I will search for that
@scitechtalktv9742 2 years ago
@@CodeEmporium I think it is this video: MODEL CALIBRATION - EXPLAINED ! - Why Logistic Regression DOESN'T return probabilities?! kzfaq.info/get/bejne/a-CSiZVl29-zZGg.html
@jingwangphysics 2 years ago
Intuitively, since Z can only take the value 1 or 0, the treatment effect of the persuadable group p(Z=1) with respect to the other groups p(Z=0) is p - (1 - p) = 2p - 1. Great work, thanks!
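Spelling that intuition out (a sketch assuming the 50/50 randomized split used in the video, writing p1 = P(Yi=1 | Xi, Wi=1) and p0 = P(Yi=1 | Xi, Wi=0)):

```latex
\begin{aligned}
P(Z_i{=}1 \mid X_i) &= P(Y_i{=}1, W_i{=}1 \mid X_i) + P(Y_i{=}0, W_i{=}0 \mid X_i) \\
                    &= \tfrac{1}{2}\,p_1 + \tfrac{1}{2}\,(1 - p_0),
\end{aligned}
\qquad\text{so}\qquad
2\,P(Z_i{=}1 \mid X_i) - 1 = p_1 - p_0 = \mathrm{ITE}_i .
```

The factor of 1/2 comes from the random assignment, P(Wi=1 | Xi) = 1/2, which is also why this shortcut needs a 50/50 split.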
@TheMrKingplays 2 years ago
Keep up your great work! You are such a good teacher (+ entertainer sometimes ;)).
@CodeEmporium 2 years ago
To you, I bow. Thank you :)
@soumen_das 7 months ago
Great explanation! Please continue making videos on these topics (causal inference), as there are very few informational videos.
@pablocalvache-nr5wr A month ago
Amazing work! Congrats!
@gutihernandez7868 2 years ago
Great video! Thanks! FYI: at 12:20 the ITE formula is wrong. Both terms are the same (probability the customer purchased given an email), but the latter should be the probability the customer purchased given NO email.
@CodeEmporium 2 years ago
Thank you! And yep, thanks for pointing that lil mistake out. Some others did too, and I hearted a comment for more visibility. Will pin too.
@jeffreagan2001 2 years ago
Hi, I am in drug discovery and the things you talk about are directly translatable to my work (potential customer vs. patient who will respond best to my drug). Thanks so much!!
@CodeEmporium 2 years ago
Happy this is useful!
@dubeya01 A year ago
Great video...keep it up...all the best
@hameddadgour 2 months ago
Great video. Thank you for sharing!
@pamg2628 2 years ago
Great video. Where can I find your video on calibration?
@won20529jun 2 years ago
Loved it!! Thank you for the derivation; it's simple when you explain it. I would've just accepted it as something handed down from the gods otherwise.
@user-jm2sd9hf6m 8 months ago
This is very cool. Is it possible to use the two-model approach to measure incremental orders in an A/B test?
@hassenhadj7665 2 years ago
Please, can you make a video to explain dual aspect collaborative attention?
@CodeEmporium 2 years ago
Here is a video on how causal inference can go a long way with machine learning. It's a fun video, going from the foundations of the concept to some important math. Hope I laid this out right and it's easy to understand. Any thoughts? Let me know in the comments or on Discord (link in description). Cheers!
@heyalchang 2 years ago
What are you using to make this video (the text and animations)? It's really simple and effective.
@CodeEmporium 2 years ago
Camtasia Studio :)
@heyalchang 2 years ago
@@CodeEmporium Thanks. Keep up the videos!
@bassamry A year ago
Good video. I'm wondering how the causal ML approach could mitigate any biases/clashes in the experiment. For example, if the treatment-group individuals were sent an email, but during the experiment run another experiment was conducted on the content of the email, or on the UX of the landing experience from the email, how can we analyze whether the email is "good" or "bad" while considering all those external effects?
@ChocolateMilkCultLeader 2 years ago
Looking forward to it. Causal inference is one of the coolest ML ideas I've been able to use.
@CodeEmporium 2 years ago
Thank you! And super true.
@idkwiadfr5748 A year ago
Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with the topic? It would be of great help. Thank you.
@ChocolateMilkCultLeader A year ago
@@idkwiadfr5748 I'll be looking into it soon if you're interested
@idkwiadfr5748 A year ago
@@ChocolateMilkCultLeader Sure
@fangzheng4030 A year ago
Very good video. I have a question: what if the probability of treatment is not 0.5 but some other value, such as 1/4? Then you cannot fully merge the probability that person Xi got the email, as at 13:52. In this scenario, ITE = -1 + (4/3)·P(Zi=1|Xi) + (8/3)·P(Yi=1, Wi=1|Xi). What should we do with the unmerged (8/3)·P(Yi=1, Wi=1|Xi) term?
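For anyone checking that algebra, here is a sketch of where it comes from, with p = P(Wi=1 | Xi) = 1/4, p1 = P(Yi=1 | Xi, Wi=1) and p0 = P(Yi=1 | Xi, Wi=0):

```latex
\begin{aligned}
P(Z_i{=}1 \mid X_i) &= p\,p_1 + (1-p)(1-p_0) = \tfrac{1}{4}p_1 + \tfrac{3}{4} - \tfrac{3}{4}p_0 \\
\Rightarrow\quad p_0 &= \tfrac{1}{3}p_1 + 1 - \tfrac{4}{3}\,P(Z_i{=}1 \mid X_i) \\
\mathrm{ITE}_i = p_1 - p_0 &= -1 + \tfrac{4}{3}\,P(Z_i{=}1 \mid X_i) + \tfrac{2}{3}\,p_1
                 = -1 + \tfrac{4}{3}\,P(Z_i{=}1 \mid X_i) + \tfrac{8}{3}\,P(Y_i{=}1, W_i{=}1 \mid X_i),
\end{aligned}
```

using P(Yi=1, Wi=1 | Xi) = (1/4)·p1. The leftover term is exactly why the clean 2p - 1 shortcut only holds for a 50/50 split.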
@arresteddevelopment2126 3 months ago
Awesome!! I have a question: what if I don't have randomized data? How can I then estimate the ITE and do uplift modeling?
@ritvikmath 2 years ago
Cool topic! One confusion I had on the class transformation approach: If W=0 and Y=0 how can we be sure this is a "persuadable" and not a "lost cause"? If I understand correctly, both groups can take on these values. Similar question when W=1 and Y=1, can we be sure this is a "persuadable" and not a "sure thing"?
@CodeEmporium 2 years ago
Amazing question, Ritvik. Yeah, I think you are correct. Z=1 doesn't just target persuadables, but a superset of that (with some lost causes and some sure things). The main objective of this class transformation approach, I believe, is to separate the sleeping dogs (we lose money advertising to these people) from the persuadables. I could have been clearer with this, but thanks for pointing this out!
@ritvikmath 2 years ago
Ahh thanks for the clarification. Makes total sense as a sleeping dog vs. not sleeping dog classifier!
@heavybreathing6696 2 years ago
Sleeping dogs are in general a very small percentage of the total population, so using this Z=1 class transformation we're not really gaining much, tbh... You should pin this clarification/correction or update your video description. I had the exact same thought and rewatched that part of the video multiple times before I saw this correction comment.
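For readers who want to see the class-transformation approach as code, here is a minimal sketch (not the video's code; the toy data, feature names, 50/50 split, and choice of classifier are all illustrative assumptions):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Toy data: X holds customer features, w is a randomized 50/50 email
# assignment, and y indicates whether the customer purchased.
rng = np.random.default_rng(0)
n = 10_000
X = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "past_purchases": rng.poisson(2, n),
})
w = rng.integers(0, 2, n)                        # treatment indicator W
base = 0.10 + 0.02 * (X["past_purchases"] > 2)   # baseline purchase probability
lift = 0.05 * (X["age"] < 40)                    # a "persuadable-ish" segment
y = (rng.random(n) < base + lift * w).astype(int)

# Class transformation: Z = 1 if (treated and converted) or (control and not
# converted). With a 50/50 randomized split this is simply Z = 1{W == Y}.
z = (w == y).astype(int)

# Fit any probabilistic classifier on Z, then estimate the per-customer
# uplift as 2 * P(Z = 1 | X) - 1.
clf = GradientBoostingClassifier().fit(X, z)
uplift = 2 * clf.predict_proba(X)[:, 1] - 1
print(uplift[:5])
```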
@kasunthalgaskotuwa5631 A year ago
Nice explanation. Have you created any videos about dynamic causal inference with machine learning?
@idkwiadfr5748 A year ago
Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with the topic? It would be of great help. Thank you.
@davidpearce1583 8 months ago
12:34 Can anyone explain here how the product rule made all those changes to the ITE equation? I've been staring at it too long and it's not clicking. Great vid, thanks.
@rayallinkh 2 years ago
Thanks, but what if the probability of treatment is not 0.5? Let's say we only assign 20% to the treatment group. Does the class transformation still hold?
@maraffio72 A year ago
The two-model approach assumes your models have no error, which is optimistic at best. Subtracting the two values from the treated model and the control model completely disregards the fact that these point estimates are not accurate (unless you have perfect, and therefore overfitting, models, which is bad anyway). The two-model approach should only be used as an example to show why modeling and measuring uplift is not trivial and therefore requires specific tools like uplift modeling techniques.
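For context, the two-model approach being critiqued here (what a later comment calls a T-learner) looks roughly like the following sketch; the toy data and model choice are illustrative assumptions, and the point above is that the errors of the two separately fitted models do not cancel when you subtract their outputs:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Same toy setup as the class-transformation sketch above: X = customer
# features, w = randomized email assignment, y = purchase indicator.
rng = np.random.default_rng(0)
n = 10_000
X = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "past_purchases": rng.poisson(2, n),
})
w = rng.integers(0, 2, n)
y = (rng.random(n) < 0.10 + 0.05 * (X["age"] < 40) * w).astype(int)

# Two-model estimate: fit separate response models on the treated and the
# control rows, then subtract their predicted purchase probabilities.
model_treated = RandomForestClassifier(random_state=0).fit(X[w == 1], y[w == 1])
model_control = RandomForestClassifier(random_state=0).fit(X[w == 0], y[w == 0])

uplift = (model_treated.predict_proba(X)[:, 1]
          - model_control.predict_proba(X)[:, 1])
print(uplift[:5])
```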
@kawsarahmed9849 2 years ago
Very good. Upload your next video.
@kesun852 A month ago
Good course, but I don't get how the data is collected for Zi, because Zi = 1 when the sample is (in the treatment group AND converts) OR (in the control group AND does not convert). Shouldn't "persuadable" be the AND of those two conditions?
@nirliptapande A year ago
While we can understand the effect of a new policy by modeling it through uplift models, how can we check whether a certain factor is a cause in environments that we can only model, not change?
@idkwiadfr5748 A year ago
Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with the topic? It would be of great help. Thank you.
@srijanmishra1747 2 years ago
You said ITE is in the range [0,1]. Can it not be negative for the "sleeping dogs" category you defined?
@idkwiadfr5748 A year ago
Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with the topic? It would be of great help. Thank you.
@user-wr4yl7tx3w A year ago
It seems like the ITE measures how much more probable the outcome is with the treatment than without it. So is the way to prove causality simply to show a higher probability? Doesn't correlation also show a high probability?
@muchidariyanto A year ago
What are the definitions of (1) Model 1, (2) Model 2, and (3) Z?
@dmtree__ A year ago
12:08 Shouldn't it be P(Zi = 1 | Xi) = P(Yi = 1 | Xi, Wi = 1) + P(Yi = 0 | Xi, Wi = 0)? Instead of Wi = 1 and Wi = 0 being on the left of the conditioning bar, they should be on the right - or what am I missing?
@peterszilvasi752 2 years ago
Hi @CodeEmporium, first of all, thank you for the video; it is very well explained. What tool do you use for the animation?
@CodeEmporium 2 years ago
Glad you like it! I use Camtasia Studio.
@sirabhop.s 5 months ago
I was wondering why we need this uplift modeling in the first place; can't plain statistics answer the same question?
@1xwxu 2 years ago
Very good video! Could you please share your slides?
@CodeEmporium 2 years ago
Thanks for watching! I created these animations as a video; it isn't a slide deck. Sorry about that.
@mehul4mak 3 months ago
10:17 How do we get the values for Z?
@tntcrunch602 A year ago
What is the video where you show how to calibrate the uplift model? I am not able to find it.
@CodeEmporium A year ago
I have made 3 videos in the series on causal inference; I think this is the second one. You can check out my "Causal Inference" playlist for all of these videos (though I am not sure if I got to exactly the calibration of uplift modeling - if not, maybe a future video).
@tntcrunch602 A year ago
@@CodeEmporium thank you, I just found it, great video by the way!
@zaheenwani102 2 years ago
Interesting
@CodeEmporium 2 years ago
Many thanks
@jzzzxxx A year ago
Isn't this more accurately CATE? And what's described is essentially a T-learner (a meta-learner algorithm, not meta-learning).
@karannchew2534 2 months ago
Xi ∈ ℝ^D: "Xi" represents a specific customer, where "i" is an index referring to a particular customer. "∈" denotes membership, meaning "Xi" belongs to (is an element of) the set. "ℝ^D" is the set of D-dimensional real-valued vectors, where "D" is the dimensionality of the feature space. So "Xi ∈ ℝ^D" means each customer "Xi" is represented as a vector of real numbers with D dimensions, where each dimension corresponds to a specific feature or attribute of the customer, such as age, income, or spending habits.
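As a concrete (made-up) illustration of that notation in code, with D = 3 hypothetical features:

```python
import numpy as np

# One customer X_i as a point in R^D, here with D = 3 illustrative features:
# [age, income_in_thousands, purchases_last_year]
x_i = np.array([34.0, 72.5, 4.0])
D = x_i.shape[0]   # dimensionality of the feature space, D = 3
print(x_i, D)
```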
@rsteckatgmail A year ago
Great video, thank you so much. May I recommend a pop filter for your mic? My speakers really picked up the "P"s in this video.