Add feature selection to a Pipeline

8,182 views

Data School

It's simple to add feature selection to a Pipeline:
1. Use SelectPercentile to keep the highest-scoring features
2. Add feature selection after preprocessing but before model building
P.S. Make sure to tune the percentile value!
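
The two steps and the tuning reminder above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the toy DataFrame, its column names, the chi2 scoring function, the LogisticRegression model, and the percentile grid are all assumptions chosen for the example.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data (assumption): two categorical columns, one numeric column
df = pd.DataFrame({
    "embarked": ["S", "C", "S", "Q", "C", "S", "Q", "C", "S", "C", "Q", "S"],
    "sex": ["m", "f", "f", "m", "f", "m", "m", "f", "m", "f", "f", "m"],
    "fare": [7.0, 71.0, 8.0, 8.5, 53.0, 8.1, 7.8, 30.0, 7.2, 60.0, 26.0, 9.0],
    "survived": [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
})
X = df[["embarked", "sex", "fare"]]
y = df["survived"]

# Preprocessing: one-hot encode the categorical columns, pass "fare" through
preprocess = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["embarked", "sex"]),
    remainder="passthrough",
)

# Feature selection goes after preprocessing and before the model.
# chi2 is an assumed scoring function; it requires non-negative features.
select = SelectPercentile(chi2, percentile=50)
model = LogisticRegression(max_iter=1000)
pipe = make_pipeline(preprocess, select, model)

# Tune the percentile; make_pipeline names the step "selectpercentile"
param_grid = {"selectpercentile__percentile": [25, 50, 100]}
grid = GridSearchCV(pipe, param_grid, cv=3)
grid.fit(X, y)
best_percentile = grid.best_params_["selectpercentile__percentile"]
```

Because the selector sits inside the Pipeline, it is refit on the training portion of each cross-validation fold, so the feature scores never see the held-out rows.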

Comments: 11
@dataschool, 3 years ago
Thanks for watching! 🙌 If you're new to Pipeline, you might want to start with this video instead: kzfaq.info/get/bejne/Z79mgpyfqNWUXX0.html
@user-zu9xf1cn9d, 3 years ago
Before this video, I was doing feature selection manually and then building pipelines. You literally made my life easier 😀 Thank you!
@dataschool, 3 years ago
That's awesome to hear! Thank you for sharing!
@hhbbhvvbjhbbyjj, 3 years ago
Can you please make a complete Python for data analysis course? I think you're a great teacher, and many people would benefit from it.
@dataschool, 3 years ago
Thank you! Here is everything I offer at the moment: www.dataschool.io/start/
@D3nz13, 3 years ago
Can chi2 be used to find the relationship between all types of data (here I mean quantitative-quantitative, quantitative-categorical, categorical-quantitative, categorical-categorical)? If not, what other measures should be used for particular combinations?
@dataschool, 3 years ago
In this context, the chi2 test is for determining if there is a relationship between any feature (individually) and a categorical target, not for determining relationships between features. The features are all numeric by the time they reach the feature selection step. Hope that helps!
@D3nz13, 3 years ago
@dataschool Yup, that helps. I believe that in my statistics classes we were told the chi2 test is used to measure the relationship between two categorical variables. But in many feature selection tutorials people use chi2 for both numerical and categorical data, and I got a bit confused.
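
As a rough illustration of the reply above, scikit-learn's chi2 function returns one score per feature, each computed against the categorical target, and never compares features with each other. The small arrays below are made-up values, assumed only for this sketch.

```python
import numpy as np
from sklearn.feature_selection import chi2

# Hypothetical non-negative numeric features (assumption), e.g. one-hot or count columns
X = np.array([
    [1, 0, 3],
    [1, 0, 1],
    [0, 1, 2],
    [0, 1, 2],
    [1, 0, 3],
    [0, 1, 1],
])
# Categorical target with two classes
y = np.array([1, 1, 0, 0, 1, 0])

# One chi2 statistic and one p-value per column of X
scores, p_values = chi2(X, y)
```

The first two columns line up exactly with the target, so they score higher than the third column, whose values are spread across both classes. Note that chi2 rejects negative inputs, which is why it suits one-hot encoded or count features.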
@dashdash_peacecampaign, 2 years ago
I'm trying to figure out how to do all the categorical encoding and feature selection using make_pipeline, then list out the coefficients. I have not seen anyone do a guide that covers all of it; they either cover categorical encoding and leave out feature selection, or vice versa.
@suguru_shivakumar1210, 3 years ago
Wowwww
@dataschool, 3 years ago
Glad you liked it!