
Machine Learning & Data Science Project - 2 : Data Cleaning (Real Estate Price Prediction Project)

322,790 views

codebasics


4 years ago

This data science project series walks through the step-by-step process of building a real estate price prediction website. We will first build a model using sklearn and linear regression on the Bangalore home prices dataset from kaggle.com. The second step is to write a Python Flask server that uses the saved model to serve HTTP requests. The third component is a website built in HTML, CSS and JavaScript that lets the user enter the home's square-foot area, number of bedrooms, etc., and calls the Python Flask server to retrieve the predicted price. During model building we will cover almost all data science concepts, such as data loading and cleaning, outlier detection and removal, feature engineering, dimensionality reduction, GridSearchCV for hyperparameter tuning, k-fold cross-validation, etc. In terms of technology and tools, this project covers:
1) Python
2) Numpy and Pandas for data cleaning
3) Matplotlib for data visualization
4) Sklearn for model building
5) Jupyter Notebook, Visual Studio Code and PyCharm as IDEs
6) Python flask for http server
7) HTML/CSS/Javascript for UI
In this particular video we will load the Bangalore home prices data into a pandas DataFrame and then handle NA values. We will then remove some unnecessary features and also normalize the property size. We will convert a range of property sizes (such as 2100-3250) into the average of its min and max.
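The cleaning steps described above can be sketched roughly as follows. This is a minimal sketch on a tiny inline DataFrame instead of the Kaggle CSV; the column names, the sample values, the set of dropped columns and the helper name `range_to_average` are assumptions made for illustration, not a copy of the video's notebook.

```python
import pandas as pd

# Tiny stand-in for the Bangalore home prices data (columns and values are assumed).
df1 = pd.DataFrame({
    "area_type": ["Super built-up Area", "Plot Area", "Built-up Area", "Plot Area"],
    "location": ["Whitefield", "Hebbal", None, "Yelahanka"],
    "size": ["2 BHK", "4 Bedroom", "3 BHK", "2 BHK"],
    "society": [None, "Theanmp", None, None],
    "total_sqft": ["1056", "2600", "1440", "2100 - 2850"],
    "bath": [2.0, None, 2.0, 4.0],
    "balcony": [1.0, 3.0, None, 1.0],
    "price": [39.07, 120.0, 62.0, 186.0],
})

# Drop features that are left out to keep the model simple.
df2 = df1.drop(columns=["area_type", "society", "balcony"])

# Count missing values per column, then drop every row that has any.
na_counts = df2.isnull().sum()
df3 = df2.dropna().copy()


def range_to_average(value):
    """Turn '2100 - 2850' into 2475.0; plain numbers become floats."""
    tokens = str(value).split("-")
    if len(tokens) == 2:
        return (float(tokens[0]) + float(tokens[1])) / 2
    return float(value)


df3["total_sqft"] = df3["total_sqft"].apply(range_to_average)
```

Dropping rows is the simplest way to handle NA values; with a dataset where many rows have gaps, filling them in may be the better choice.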
Do you want to learn technology from me? Check codebasics.io/... for my affordable video courses.
Next video:
Data Science Project - 3: Feature Engineering (Real Estate Price Prediction Project): • Machine Learning & Dat...
Very Simple Explanation Of Neural Network: • Neural Network Simply ...
Popular Playlist:
Data Science Full Course: • Data Science Full Cour...
Data Science Project: • Machine Learning & Dat...
Machine learning tutorials: • Machine Learning Tutor...
Pandas: • Python Pandas Tutorial...
matplotlib: • Matplotlib Tutorial 1 ...
Python: • Why Should You Learn P...
Jupyter Notebook: • What is Jupyter Notebo...
Code: github.com/cod...
Parent Code Repository: github.com/cod...
🌎 My Website For Video Courses: codebasics.io/...
Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
#️⃣ Social Media #️⃣
🔗 Discord: / discord
📸 Dhaval's Personal Instagram: / dhavalsays
📸 Instagram: / codebasicshub
🔊 Facebook: / codebasicshub
📝 Linkedin (Personal): / dhavalsays
📝 Linkedin (Codebasics): / codebasics
📱 Twitter: / codebasicshub
🔗 Patreon: www.patreon.co...

Comments: 257
@codebasics 2 years ago
Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
@codebasics 4 years ago
Complete machine learning tutorial playlist: kzfaq.info/get/bejne/ndOmqcSgx9OblYU.html
@abhijeetvighne4452 3 years ago
can you please provide me a dataset link i.e kaggle.com
@user-ee6nk8sc3t 2 months ago
Am following your data science roadmap taking all the courses the way you outlined them and just in two months am so suprise how much knowledge i have acquired.Take it from me no one does it better than you in this space. Thanks for all the efforts.
@ankrish8692 4 years ago
Hello sir, I am so thankful to you for making such an important video abt data analysis, You did something that no one is doing in teaching Data science, I was struggling to find a "REAL LIFE Project like this " but couldn't find any, since 3 years I am learning and struggling to get a project like this , but NOW I think I found My Guide in form of YOU and your channel. Please Guide us to achieve our Data analyst career goal thank you !!!!! Thank You sir....!!!!
@flamboyantperson5936 4 years ago
This is an excellent step by step tutorial on machine learning. You are the best youtuber for data science.
@codebasics 4 years ago
Thanks for appreciation Aamir 👍😊
@ipuhbamrash6708 4 years ago
This is indeed a coherent thought process!
@arpita0608 4 years ago
@@ipuhbamrash6708 can you please tell me the use of rcParams("figure.figsize")=(20,10)
@ipuhbamrash6708 4 years ago
@@arpita0608 why do you ask me this command. Did I ask you expectation maximization, Gibbs sampling, etc. We never showcase what we know. We only appreciate the thought process of other in our learning process.
@nibinjoseph2136 4 years ago
@@arpita0608 It sets the size of the diagrams drawn by matplotlib.
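For anyone with the same question, here is a small sketch of what that setting does. `rcParams` is matplotlib's dictionary-like store of default settings, so it is indexed with square brackets and not called with parentheses; the `(20, 10)` value is the one quoted in the question, and the variable name `figsize` is only illustrative.

```python
import matplotlib

# rcParams behaves like a dictionary of defaults, so use square brackets.
# The value is (width, height) in inches and applies to figures created afterwards.
matplotlib.rcParams["figure.figsize"] = (20, 10)

# Read the setting back to confirm it was stored.
figsize = list(matplotlib.rcParams["figure.figsize"])
```

This only changes the default; an individual plot can still pass its own `figsize`.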
@poroshatyazdanbakhshghahya9091 3 years ago
One of the best tutorials on KZfaq. Thank you!
@codebasics 3 years ago
Glad it was helpful!
@mzamelimashiyi6828 2 years ago
Hey man, thank you for making these videos. I am a BEng (Computer Engineering) student in South Africa. I want to end up in Data Science. Upon searching the internet for a comprehensive channel for my learning goals, I found yours and I think I'm going to learn a lot from here. I'll be following along on this project and this is going to be my first Data Science project (^_^).
@pratisthasingh5441 1 year ago
Can i use this project in my portfolio.
@GurpreetSingh-yw3hb 4 years ago
Sir, I have seen lots of tutorials on KZfaq but you are the best among them.
@codebasics 4 years ago
Gurpreet, thank you for your kind words.
@GurpreetSingh-yw3hb 4 years ago
sir, I am confusing whether I should learn Django or Data science. Please suggest to me.
@shar008gaming6 1 year ago
@@GurpreetSingh-yw3hb learn both as both are useful
@ritamkabiraj8035 3 years ago
'I really want to see where does Bangalore contains 43bedrooms' XD
@nithyapalanisamy4904 2 years ago
It could be outlier
@i_youtube_ 2 years ago
Behind my house
@nvduk3 1 year ago
Definitely an outlier, can be ignored
@muhammadmalik4627 2 years ago
Thank you for giving such informative video. It is rare to see data science tutorial video that show the process of data cleaning at the detailed level like this
@gazitasnimahmad2115 4 years ago
great.. i always wanted someone to explain these things like the way he does.. thanks a lot
@codebasics 4 years ago
I am happy this was helpful to you.
@da_ta 4 years ago
This is very exceptional and amazing project from start up to uploading on the website beyond Data science. I really appreciate your hard work THANK YOU!
@codebasics 4 years ago
😊👍
@AsadAhmed001 1 year ago
Unbelievable, what great work you have done in the data cleaning; great logic applied.
@basotra97 4 years ago
I loved this project, it gave me a lot of insight.
@samuelmontypython8381 4 years ago
Wow, this exactly what I’ve been searching for! I’m trying to create something similar to Zillow for the Japanese market, but with an AI component. The Japanese are very strict with who can use their real estate data and most times extensive licensing and company employment is required. AI will truly bring the power to the people who use it! Thank you for this excellent content! Japan thanks you
@codebasics 4 years ago
Good luck for your project my friend 😊👍
@souvikroy5 2 years ago
for timestamp 2:13 - we can use df1['area_type'].value_counts() for a similar output.
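A short sketch comparing the two approaches, using made-up `area_type` values. The groupby form names the column twice because the first `'area_type'` chooses what to group by and the second chooses which column to count within each group; `value_counts()` does both in one call. The variable names `by_groupby` and `by_value_counts` are only illustrative.

```python
import pandas as pd

# Made-up sample values; only the column name comes from the comments.
df1 = pd.DataFrame({
    "area_type": ["Plot Area", "Built-up Area", "Plot Area", "Carpet Area", "Plot Area"],
})

# Group on the column, then count that same column inside each group.
by_groupby = df1.groupby("area_type")["area_type"].agg("count")

# Same counts in one call, sorted from most to least frequent.
by_value_counts = df1["area_type"].value_counts()
```

Both give the number of rows per area type; only the ordering of the result differs.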
@vskraiml2032 3 years ago
Very useful video for beginners and the way he presenting is awesome. Thank you...
@codebasics 3 years ago
Glad it was helpful!
@raagsnipes 9 months ago
At 6:48 you can also use this step if you want to avoid the lambda function:
bhk = df3['size'].str.split(' ', n=2, expand=True)  # note: put a single space between the single quotes
bhk = bhk[0].astype(int)
df3['bhk'] = bhk
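The split-based approach from this comment can be tried on a few sample values. This is a minimal sketch with made-up `size` strings; it assumes every entry starts with the room count followed by a space, which would fail on rows that do not follow that pattern.

```python
import pandas as pd

# Made-up sample of the 'size' column.
df3 = pd.DataFrame({"size": ["2 BHK", "4 Bedroom", "1 RK", "3 BHK"]})

# Split once on the first space; column 0 of the result holds the leading number.
parts = df3["size"].str.split(" ", n=1, expand=True)
df3["bhk"] = parts[0].astype(int)
```

Missing values in `size` should be dropped first, otherwise `astype(int)` will raise.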
@vishalrai2859 3 years ago
Thanks sir for giving all this at one place
@codebasics 3 years ago
👍😊
@skkkks2321 4 years ago
Great job..keep it up..thanks for spreading knowledge.
@shubhamjain6471 2 months ago
After applying `convert_sqft_to_num` function, if you run `df.isnull().sum()` then you'd notice that `total_sqft` has some NaN values which can be removed from the dataframe.
@AhamedKabeer-wn1jb 4 years ago
Beautiful explanation..
@thisaintarf 4 years ago
hi mate, thankyu so much for this explain, this video helps me a lot
@codebasics 4 years ago
I am happy this was helpful to you.
@vemulaakash8212 4 years ago
Thanks a lot for your explanation sir , it was very clear and easily understandable.
@codebasics 4 years ago
You are most welcome
@rahulagarwal8059 4 years ago
Sir thank you for providing this video....its very helpful for me
@codebasics 3 years ago
I am happy this was helpful to you.
@tusharsharma6129 4 years ago
Hello sir,I have a doubt why we had a copy of df3 into df4 (in-29,time=14:20) why we are not using df3[df3['total_sqft'].apply(convert_sqft_to_num)] I tried the above code but it is showing error If you have time,kindly explain my above query.
@sumittanwar8389 3 years ago
you should convert x into string because without it this error is coming " "int' object has no attribute 'split' " Thank you : )
@mohitupadhayay1439 2 years ago
Use: tokens = str(x).split('-'). It will then work correctly.
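Putting the fix from this thread together, a version of `convert_sqft_to_num` that tolerates non-string input might look like the sketch below. The function name and the return-None-on-failure behaviour come from the comments on this page; the exact exception handling is an assumption, not the code from the video.

```python
def convert_sqft_to_num(x):
    """Return a float for '1056' or '2100 - 2850'; None if the value cannot be parsed."""
    # str() lets ints and floats through without "'int' object has no attribute 'split'".
    tokens = str(x).split("-")
    if len(tokens) == 2:
        try:
            # A range: average of the lower and upper bound.
            return (float(tokens[0]) + float(tokens[1])) / 2
        except ValueError:
            return None
    try:
        return float(x)
    except (TypeError, ValueError):
        # Values such as '34.46Sq. Meter' end up here.
        return None
```

Calling it with `2136`, `'2100 - 2850'` or `'34.46Sq. Meter'` gives a float, the range average, or None respectively, so unit-suffixed values become missing values after `apply` and need to be handled afterwards.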
@anujack7023 3 years ago
hiii I can't access the house price data file it is showing this "Sorry, something went wrong. Reload?' I am reloading but it is not opening. can u plz provide new link for that data set.
@hemantsharma7986 4 years ago
Why drop the balcony feature? Why dropping the Na column. If you're making a full-fledged series on this problem. You should include null value replacement and other stuff. After dropping it becomes very simple then why make a series of this. It can be done in 20 min. And if you're dropping the balcony, also drop the bathroom. Why keep it. And also please back up your dropping statements as why we need to drop few columns and why not others.
@followthepassion1530 4 years ago
Thank you sir for such a informative video...
@Induraj11 3 years ago
It would be really great if you can proceed with project without removing features for the sake of simplicity!. Doing just the simple data cleaning and feature selection at own convenience, for sure will not help anyone understand the different approaches to real world problems! So learning from expert data scientist like you is the key to understand and learn different approaches. Hence kindly do project as if you would do a technical test or a solving a real problem so that we can learn from you! Thanks.
@focusonstudies8331 1 year ago
Its for beginners..so should go with simplicity
@vishalgupta3175 4 years ago
Thank you so much sir, your videos helps me a lot and it will improve our effiency in creating model using python,you are explaining each term very smoothly, I have never returns for the guidance giving by you but praying for god to help everybody Keep Learning Keep Coding with codebasics........
@codebasics 4 years ago
👍😊
@jaysea80able 4 years ago
Excellent instructor.
@codebasics 4 years ago
Glad it was helpful!
@Ceyhus 3 years ago
Thank you so much for such helpful videos! !!
@NumanKhans.DataInsights 3 years ago
Useful tutorial for cleaning the data ❤️❤️❤️
@codebasics 3 years ago
Glad it was helpful!
@etn422 3 years ago
Dataset: www.kaggle.com/amitabhajoy/bengaluru-house-price-data
@alainleclerc4523 1 year ago
you are amazing!!! a huge thank you!!
@julianops383 4 years ago
Great tutorial! Thank you very much! 🤓
@dhananjaykansal8097 4 years ago
Excellent! God Bless you sir........
@codebasics 4 years ago
I am glad you liked it
@kannanv8831 4 years ago
Thanks for your excellent explanation.
@codebasics 4 years ago
I am happy this was helpful to you.
@Suhasdarsi666 4 years ago
great tutorial but I noticed you did not drop null characters for total_sqft after applying the final function was that intentional or an error. thank you :)
@unnatipalan3153 4 years ago
I had the same doubt so i checked the code (link in description), and it seems to have been fixed there
@anaghaparadkar0511 4 years ago
Thank you so much for this playlist. I have been watching and following your videos in this entire lockdown. All of them are really informative and useful for the beginners. Also I have one doubt. When we use convert_sqft_to_num function for the values that are given in sq. meter (row 410 for instance) it given NaN for that. So do we have to remove all those rows with NaN? or we can keep it as it is?
@vinodkumarreddy7696 4 years ago
Hi Sir Thank you so much for the video to explaining data cleanup process..
@codebasics 4 years ago
I appreciate you leaving a comment of appreciation
@bq_wang 3 years ago
Great!! very helpful!
@codebasics 3 years ago
Glad it was helpful!
@daoowdAL-Sheikh 1 year ago
Thank you for sharing
@poojabehera8675 4 years ago
Hi Dhawal, I am getting NAN value on executing df4.loc[30], coz before this convert_sqft_to_num('2100-2850') resulting empty output. Any comments?
@abhinavkesari3360 4 years ago
u can refer sir github...
@mariav1234 4 years ago
Thank you for posting this!
@codebasics 4 years ago
Thanks for the feedback
@sejaljamwal6773 10 months ago
after applying the convert_sqft_to_num on total_sqft column, we would get some null values. Would it be because we returned None in the except block? Wouldn't it be better to remove these null values and then proceed?
@mbharathkumarreddy1103 7 months ago
@ 12:48 after the transformation here,shouldn’t we be dropping the nan values that the function returns?
@VikuChoudhary 3 years ago
You could have considered number of balconies. Any specific reason you dropped it?
@codebasics 3 years ago
no specific reason. actually you are right that you can keep that feature.. i was trying to make my tutorial simple and that's why dropped it (so that I have less number of features to work on)
@vaishvikpatel7610 3 years ago
amazing
@aditigupta466 1 year ago
Thank you codebasics for such amazing vedios. They are really helpful. I am just curious to know why we are always using a different object for the DataFrame, when not even required. Is it not inefficient as more memory will be consumed. Also i am little confused, as to where I should make a new object and where to keep the old one.
@codebasics 1 year ago
You can use the same object as well but in notebook what happens is if say you have df object in cell 5 and now you jump to cell 21 and execute it, it will show previous df in this new cell and things get confusing some times. Hence I have used different objects to avoid confusion
@aditigupta466 1 year ago
Thank you. Got it now
@bhamrags 2 years ago
Hi Dhawal Thanks for your intuitive videos. I have one point here that after applying convert_sqft_to_num on df['total_sqft'] we will get 46 records where this function puts Null value (nan) in df['total_sqft'] column, so we need to handle these 46 nan values also.
@Mukeshkumar-yl1qq 2 years ago
You are right ✅
@dikshagupta3276 2 years ago
Hello sir I am diksha and want to do a end to end project for my cv so please advice me so I can do it ...pls reply
@omairsaleem5680 4 years ago
Excuse me ! where is the kaggle link for data set???
@sadhnasingh877 4 years ago
Great explanation but I couldn't understand the reason for using ~ symbol there.
@codebasics 4 years ago
It is like NOT. (Basically negate the condition)
@sadhnasingh877 4 years ago
@@codebasics thanks a lot
@dwaipayanchaudhuri5625 4 years ago
sir actually I'm doing a ml project on mentoring startups...can can you please help me find a good dataset
@jaganinfo 4 years ago
497 datasets available from this below link : archive.ics.uci.edu/ml/datasets.php
@p32929 4 years ago
you should've mentioned which python version should we use. v2 or v3 thanks
@codebasics 4 years ago
Always v3
@mehakgarg0612 7 months ago
how i get the kaggle link?
@rupampatil6425 4 years ago
Video is just awesome. I Have a doubt. after converting range values into numbers, we kept None for other rows which contains more than 2 tokens.so they became Null. so should we drop those rows or not?
@sujangajananachari2814 1 year ago
RK also in size column but you considered it as BHK @codebasics
@tejaswinichilke6447 4 years ago
Hi, I was going through a series of videos on house price prediction which is very helpful to understand the data cleaning part. In that regard, I have a doubt, while dealing with the categorical feature, if one of the categories is 'No Data'. how to handle these category?
@MrYashmohta 4 years ago
Drop the column or dropna()
@himanshusingh-nv5wn 4 years ago
If the number of nodata row is not big then u can replace it with the mode or u may also look for relationship with other features if not then drop that row
@SarthakCodes 2 years ago
aren't we supposed to drop the NaN values of the total_sqft attribute too in the end?
@sowmiyanarayanans4975 2 years ago
Instead of droping the rows, can we use PCA to reduce the dimensionality of the data ???
@deepakgehani 4 years ago
I want to impute null value in bathroom basis bhk values since it behave linearly with bhk. Want to use apply-lambda function for imputation. Please suggest df1.groupby('new_bhk')['bath'].median()
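One possible way to do this kind of group-wise imputation is sketched below. Note that it swaps the apply-lambda idea for `groupby(...).transform("median")`, which returns the group median aligned to every original row. The column names `new_bhk` and `bath` come from the comment; the sample values are made up.

```python
import pandas as pd

# Made-up sample: two bhk groups, each with one missing bath value.
df = pd.DataFrame({
    "new_bhk": [2, 2, 2, 3, 3, 3],
    "bath": [2.0, None, 2.0, 3.0, 2.0, None],
})

# Median bath count within each bhk group, repeated for every row of that group.
group_median = df.groupby("new_bhk")["bath"].transform("median")

# Fill only the missing entries; existing values are left untouched.
df["bath"] = df["bath"].fillna(group_median)
```

If a whole group has no bath values at all, its median is NaN and those rows stay missing, so a final check with `isnull().sum()` is still worthwhile.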
@sarthak810 3 years ago
when to use tablue and when to use python and please tell me after cleaning and do feature engineering in python how can we create visualization as a data analyst using tablue
@wolfganggermain7175 1 year ago
But now after cleaning the total_sqft there was an NaN added at index 410, going to check the next video t o see if it was fixed.
@anassguitanou2411 3 years ago
Hi, good video sir. I have a question, why didnt you drop the size column since you have the information in the bhk column?
@assoutarik6295 4 years ago
This is how to convert all of the total_sqft column to square feet:
units = {'Sq. Meter': 10.7639, 'Perch': 272.25, 'Sq. Yards': 9, 'Acres': 43560, 'Cents': 435.6, 'Grounds': 2400, 'Guntha': 1088.9848169}

def Convert_to_sqft(x):
    for key in units.keys():
        if key in str(x):
            x = x.split(key)[0]
            x = (float(x) * units[key])
            return x
    return float(x)

def Convert_all(x):
    tokens = x.split('-')
    if len(tokens) == 2:
        return (float(tokens[0]) + float(tokens[1])) / 2
    try:
        return float(x)
    except:
        return Convert_to_sqft(x)

I know there is a better way to do this.
@danishcode 3 years ago
sir in total_sqft column what happend to value without conversion from function it return None and it will append there NuN which will again mess with my data....
@jitendrakushwah5359 3 years ago
At 12.43 it's not working rather than it's giving a error Could not convert strong to float : '34.56sq.'
@tusharsharma6129 4 years ago
In your github you have use this codde which you haven't shown in the video df4 = df4[df4.total_sqft.notnull()] can you please tell us what does this code mean sir?
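To illustrate what that line does, here is a minimal sketch on made-up values. `notnull()` returns True for rows that hold a value and False for missing ones, and indexing the DataFrame with that result keeps only the True rows. This is how the NaN values left behind by `convert_sqft_to_num` (asked about in several comments here) can be removed.

```python
import pandas as pd

# Made-up sample: two rows where the conversion produced a missing value.
df4 = pd.DataFrame({
    "total_sqft": [1056.0, None, 2475.0, float("nan")],
    "price": [39.07, 18.5, 186.0, 45.0],
})

# Keep only the rows whose total_sqft is present.
df4 = df4[df4.total_sqft.notnull()]
```

`df4.dropna(subset=["total_sqft"])` would be an equivalent way to write the same filter.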
@karishmasewraj6437 2 years ago
Just a thought : Wouldn't the data be less accurate if we drop or ignore the error rows ?
@dikshagupta3276 2 years ago
Thanku
@snehasneha9290 4 years ago
df2[~df2['total_sqft'].apply(is_float)] and df2[df2['total_sqft'].apply(is_float)] what is the diffrence between these 2 things and what is the purpose of ~ this symbol plz can anyone explain this
@shansingh9858 4 years ago
'~' basically means negation: the filter excludes the rows of total_sqft where the is_float function returns True and keeps the rows where it returns False.
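The difference between the two filters can be seen on a few sample values. This is a minimal sketch: `is_float` is written here as the name suggests (try to convert, report success), which is an assumption about the helper from the video, and the data is made up. The helper has to be defined before the filter runs, otherwise Python raises `NameError: name 'is_float' is not defined`.

```python
import pandas as pd


def is_float(x):
    """Return True if x can be converted to a float, else False."""
    try:
        float(x)
    except (TypeError, ValueError):
        return False
    return True


df3 = pd.DataFrame({"total_sqft": ["1056", "2100 - 2850", "34.46Sq. Meter", "1440"]})

mask = df3["total_sqft"].apply(is_float)  # True where the value parses as a number
numeric_rows = df3[mask]                  # rows that are already plain numbers
non_numeric_rows = df3[~mask]             # ~ flips every True/False in the mask
```

So the version without `~` shows the clean rows, and the version with `~` shows the rows that still need cleaning.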
@santoshkumar-gd4mf 1 year ago
line 29 was showing error so i had convert it to string first by using df2['total_sqft']=df2['total_sqft'].apply(str)
@dovie_thebeauty3449 2 years ago
hi can anyone help while running line no 26, 27,28, i'm getting an error:- AttributeError: 'int' object has no attribute 'split'. why this error is coming and how to resolve
@neelamegammuthuraj8982 1 year ago
Is it advisable to remove null records .... blindly rather handling it
@MONTYXGamingYT 1 year ago
Sir, I am getting an error in
df3[~df3['total_sqft'].apply(is_float)].head(10)
Error:
NameError Traceback (most recent call last)
Cell In[27], line 1
----> 1 df3[~df3['total_sqft'].apply(is_float)].head(10)
NameError: name 'is_float' is not defined
Please resolve it.
@arpitanareha6703 9 months ago
I am trying to run this code but my jupyter notebook is showing file not found error. How could i remove this error?
@karangadgil9847 3 years ago
why not use replace with regex instead of lambda? df2.replace({'size':'Bedroom'},'BHK', regex=True)
@srinivasreddy1709 3 years ago
Hi Dhaval, when we use df.total_sqft.unique(), juyter notebook showing only few values array(['1056', '2600', '1440', ..., '1133 - 1384', '774', '4689'], dtype=object) how do we know other unique values in this array
@codebasics 3 years ago
yes actually jupyter notebook is showing short version. Can you try print(df.total_sqft.unique()) ? And update here if it works
@srinivasreddy1709 3 years ago
Hi Dhaval, its showing short version as below ['1056' '2600' '1440' ... '1133 - 1384' '774' '4689']
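The shortened output appears because NumPy abbreviates long arrays when they are displayed or printed, so `print()` alone does not help. One way around it, sketched below under the assumption that `unique()` returns a NumPy array as in the output quoted above, is to convert the array to a plain Python list first. The `values` array here is a made-up stand-in for `df.total_sqft.unique()`.

```python
import numpy as np

# Stand-in for df.total_sqft.unique(): an array with more than 1000 entries.
values = np.array([str(n) for n in range(1500)], dtype=object)

shortened = repr(values)     # NumPy abbreviates long arrays with '...'
full_list = values.tolist()  # a plain Python list shows every element when printed
```

With the real data this would be `print(df.total_sqft.unique().tolist())`; raising NumPy's print threshold with `np.set_printoptions` is another option.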
@tatatabletennis 4 years ago
There are strings '1RK' also in the size column but you converted them to 1 and made it similar to '1 BHK'. But those two are different. How to deal with such a situation, sir?
@codebasics 4 years ago
I think you can treat 1RK as 1 BHK. Its like 1 room kitchen and 1 bedroom hall kitchen is almost same because the apartment would definitely have a hall. Nowadays builders dont build homes with just bed room and a kitchen.
@jabir89 3 years ago
Hi, Is there an option to write a lambda function to extract the non-float values from the 'total_sqft' column?
@jabir89 3 years ago
I found the following code to extract the non-integer number - df3[~df3['total_sqft'].apply(lambda x: str(x).isnumeric())]. But I also need to remove the floats in the column. Any suggestions?
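The `isnumeric()` check flags decimal values because a string such as '1440.5' contains a dot, which is not a numeric character. A different technique that treats both integers and floats as numbers is `pd.to_numeric` with `errors="coerce"`; the sketch below uses made-up values and is one possible approach, not the method from the video.

```python
import pandas as pd

df3 = pd.DataFrame({"total_sqft": ["1056", "1440.5", "2100 - 2850", "34.46Sq. Meter"]})

# errors="coerce" turns anything that is not a number into NaN instead of raising.
as_number = pd.to_numeric(df3["total_sqft"], errors="coerce")

# Rows that became NaN are the ones that are neither integers nor floats.
non_numeric = df3[as_number.isna()]
```

If the column already contains missing values, they will show up in `non_numeric` as well, so it helps to drop them first.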
@engineerbaaniya4846 4 years ago
If possible, please make a detailed tutorial on web scraping. There isn't one anywhere on KZfaq, and it will get a lot of views too. Love you sir.
@codebasics 4 years ago
Sure. I have added this to my Todo list
@engineerbaaniya4846 4 years ago
@@codebasics I am waiting as if I know web scrapping my designation and salary both will be increased love u again I have shared your channel with my colleagues for max support
@mohamedsaoud8661 11 months ago
why did you drop certain columns ? how can i choose ?
@anuvratshukla7061 4 years ago
AttributeError: 'float' object has no attribute 'split' . When I try to split. Please assit
@ChandanNayak-lq1pd 4 years ago
Why is area not a criteria, as it factors in,in the price calculation?
@roshankumarsharma8725 4 years ago
Sir why have you dropped the columns like society , Area type which in real life does matter for prediction of price ?
@codebasics 4 years ago
Yes they matter. I wanted to keep the model simple but I encourage you to include all those features and redo this project. That would be a nice exercise for you 😊👍
@Mukeshkumar-yl1qq 2 years ago
Tq sir🥰🥰
@MrJohnaiton 4 years ago
Was the purpose of getting the aggregate number of "area_type" just to show this functionality in pandas? Unlike in all the other steps in this video where the purpose was made clear (e.g. dropping columns because the data will not change the output in the programmer's opinion) I couldn't see how the specific groupby step done fit in with the overall purpose of this particular real-estate estimation project. Also, why is there in this part of python a repitition of ('area_type')['area_type'] rather than just ['area_type']
@codebasics 4 years ago
Hey John, I used less efficient way of aggregating results. I just updated the notebook on github to use value_counts(). i.e. df1['area_type'].value_counts() Regarding purpose: I used that just to explore how many samples are there per area type. As such in a bigger scheme of this project, this step is probably not needed but I showed it to explain how a data scientist can do data exploration with this particular feature.
@hv3300 3 years ago
Excellent .Do you this project can be used on resume ?
@kusumk1756 3 years ago
Sir i hve recently installed jupyter it’s not accepting pandas , numpy etc
@farhanamohammed5641 4 days ago
where is the link to access dataset
@shubhisrivastava2802 1 month ago
Can somebody please share the link to download the dataset?
@priyamsaha8208 4 years ago
df3[df3.bhk>20] not working
@mavrik1058 4 years ago
why have you used 'area_type' twice..can you please explain that?
@user-eo6uh1qe9o 7 months ago
Hello sir, I am getting this error:
--------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[45], line 1
----> 1 convert_sqft_to_num(2136)
Cell In[44], line 2, in convert_sqft_to_num(x)
      1 def convert_sqft_to_num(x):
----> 2     tokens = x.split('-')
      3     if len(tokens) == 2:
      4         return (float(tokens[0])+float(tokens[1]))/2
AttributeError: 'int' object has no attribute 'split'
@harshavardhanasrinivasan3125 3 years ago
For 43 BHK can it be like vertical staked house ?
@techtutsindia 2 years ago
This video is awesome: new learning points are dropna(), groupby(), drop(), isnull(), unique(), loc[index]. You would know how to use these functions in realtime.