IPL Data Analysis | Apache Spark End-To-End Data Engineering Project

52,671 views

Darshil Parmar

2 months ago

Enroll in the Apache Spark Course Here - datavidhya.com/courses/apache
USE CODE: EARLYSPARK for 50% off
➡️ Combo Package Python + SQL + Data warehouse (Snowflake) + Apache Spark: com.rpy.club/pdp/yYnEMzLOX?plan=6607b619c69cf00b7b934479
USE CODE: COMBO50 for 50% off
In this video, we analyze IPL data by building a data pipeline. The main focus is on writing Apache Spark code and using different functions to perform transformations.
Code used in the video: github.com/darshilparmar/ipl-data-analysis-apache-spark-project
Dataset Link - data.world/raghu543/ipl-data-till-2017
Timestamps
0:00 Introduction
0:31 Architecture Diagram and Spark Basic Concepts
13:26 Understand the Dataset
21:07 Complete Project Execution
01:18:32 Final Words
👦🏻 My Linkedin - www.linkedin.com/in/darshil-parmar/
📷 Instagram - datawithdarshil
🎯Twitter - parmardarshil07
🌟 Please leave a LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
3 Books You Should Read
📈Principles: Life and Work: amzn.to/3HQJDyP
👀Deep Work: amzn.to/3IParkk
💼Rework: amzn.to/3HW981O
Tech I use every day
💻MacBook Pro M1: amzn.to/3CiFVwC
📺LG 22 Inch Monitor: amzn.to/3zk0Dts
🎥Sony ZV1: amzn.to/3hRpSMJ
🎙Maono AU-A04: amzn.to/3Bnu53n
⽴Tripod Stand: amzn.to/3tA7hu7
🔅Osaka Ring Light and Stand: amzn.to/3MtLAEG
🎧Sony WH-1000XM4 Headphone: amzn.to/3sM4sXS
🖱Zebronics Zeb-War Keyboard and Mouse: amzn.to/3zeF1yq
💺CELLBELL C104 Office Chair: amzn.to/3IRpiL2
👉Data Engineering Complete Roadmap: kzfaq.info/sun/PLBJe2dFI4sgtlK_zaqaIBdJFgieYPnQ07
👉Data Engineering Project Series: kzfaq.info/sun/PLBJe2dFI4sgukOW6O0B-OVyX9c6fQKJ2N
👉Become Full-Time Freelancer: kzfaq.info/sun/PLBJe2dFI4sgtza0sAnNFwo8KPG0GcO9Il
👉Data With Darshil Podcast: kzfaq.info/sun/PLBJe2dFI4sgv_XmEDaXF3z1MNib7R3KUY
✨ Hashtags ✨
#dataengineering #apachespark #databricks

Comments: 109
@DarshilParmar 2 months ago
LIKE LIKE LIKE LIKE!!!!! Interested in learning Apache Spark in depth with Databricks? I have created a detailed course here: datavidhya.com/courses/apache You can also enroll directly in the combo package (Python + SQL + Data Warehouse (Snowflake) + Apache Spark with Databricks) here: com.rpy.club/pdp/yYnEMzLOX?plan=6607b619c69cf00b7b934479… USE CODE: COMBO50 for 50% off
@kunal4557 1 month ago
I am so relieved that there is someone who depicts a “complete” pipeline for projects that are not just real-world but also easy to comprehend, without losing their innate complexity. Thanks a lot for your contribution.
@Moon01-ru5my 7 days ago
If you know SQL well, all you need to learn here is the few syntax differences between Spark and SQL.
@munnieswaroop 23 days ago
I developed an application similar to this PySpark project, for CSV file schemas, way back in 2014 using Servlets and JSP. Now I know the importance of this, and I am upgrading my application to Spring Boot and React and rebuilding my own ETL. Thank you for your session.
@shrutijain1628 1 month ago
Such an amazing project to learn Apache Spark with Databricks! I learned so much, and the clarity of concepts was incredible. Thank you so much, Darshil! Totally going for your Combo Course!! 🙌
@TamizhanTrend 2 months ago
Amazing... This architecture is applied in more real-time projects
@kanhashukla6265 1 month ago
Thanks a lot man. Much needed video.
@AsHiShChAuHaN-yd7dn 2 months ago
A very good project. A lot of learning in a small project; this is called project-based learning ❤🎉
@muhammadhaseeb229 1 month ago
Wow, this video is incredibly informative! I really appreciate how clearly it explains complex concepts. The visuals are engaging and make it easy to follow along. I'm excited to dive deeper into Spark after watching this. Keep up the great work!
@phanindrarao881 2 months ago
Hi @DarshilParmar thank you for all these videos. It's too good!!!!. I am a beginner, I really love it. I just started yesterday. You never let me blink my eye.
@sayemhaque6737 2 months ago
I just love all your videos. Take love from Bangladesh❤
@bharathbn9225 2 months ago
Thank you, Darshil
@tesseract_d 20 days ago
Thanks Darshil, this was very informative and a good learning project journey for me as a Data Engineer! Kudos, please keep posting such projects!
@RahulBaghel-ib4lz 2 months ago
It's a great project!
@vamshipula8367 2 months ago
Thank you bro❤
@munnieswaroop 2 months ago
Wonderful insights into Spark. Fully engaging; I never got distracted.
@DarshilParmar 2 months ago
Thank you
@adityajha2054 2 months ago
Now this is what data enthusiasts need. Most people build the project directly in Power BI or SQL without giving a complete understanding.
@DarshilParmar 2 months ago
Thank you
@pradeesh2031 2 months ago
Wonderful video
@souvik5560 2 months ago
Great initiative. Thank you so much. Please take care of the audio. It's too low!!
@user-zm1ng8zh6r 2 months ago
Amazing content
@TrainWithShubham 2 months ago
Amazing work, Darshil bhai. Loved the project.
@DarshilParmar 2 months ago
Thank you so much 😀
@user-ue8ut8uu2g 2 months ago
Loved the project, Darshil Bhaiya. I'm a beginner and I'm loving it.
@DarshilParmar 2 months ago
Let's go
@aritra1414 2 months ago
This was a nice project. Thanks!
@DarshilParmar 2 months ago
Glad you liked it!
@moheezawan8011 2 months ago
The right video at the right time. Thanks @darshil bhai 🤩
@DarshilParmar 2 months ago
You are welcome
@syedhashir5014 2 months ago
Correction at 56:57: when(col("batting_hand").contains("Left"), "Left-Handed").otherwise("Right-Handed")
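The correction above hinges on case sensitivity: a substring check only matches when its capitalization agrees with the data. Here is a minimal plain-Python sketch of the same labeling rule. The function name is hypothetical, and the sample values "Left-hand bat" and "Right-hand bat" are assumed for illustration rather than taken from the video.

```python
def label_batting_hand(batting_hand: str, needle: str = "Left") -> str:
    """Plain-Python analogue of when(...contains(needle), ...).otherwise(...)."""
    # `in` is a case-sensitive substring test, like a case-sensitive contains().
    return "Left-Handed" if needle in batting_hand else "Right-Handed"


# Assumed sample values; the real column may be spelled differently.
samples = ["Left-hand bat", "Right-hand bat"]

# Capitalized needle: left-handers are labeled as expected.
correct = [label_batting_hand(s, "Left") for s in samples]

# Lowercase needle: nothing matches, so every row falls into the "otherwise" branch.
buggy = [label_batting_hand(s, "left") for s in samples]

print(correct)
print(buggy)
```

With the lowercase needle, every player ends up labeled "Right-Handed", which is the kind of silent error the comment points out.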
@daminigupta1 1 month ago
We can do the same thing in SQL as well. Why use Spark?
@TalhaKhan-1996 2 months ago
Is Amazon S3 used for data modelling?
@pavanparvathanenii4471 2 months ago
Amazing content as usual.
@DarshilParmar 2 months ago
Much appreciated!
@shivamchandan50 2 months ago
Please create a video on PySpark unit testing and debugging.
@sukritisachan5773 1 month ago
How can we round off in PySpark? For example, if I want to round a value to two decimal places, how is that possible?
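For reference, PySpark provides `pyspark.sql.functions.round(col, scale)`, which uses HALF_UP rounding, and `bround(col, scale)`, which uses HALF_EVEN. The sketch below does not call Spark; it only illustrates the difference between the two rounding modes in plain Python with the `decimal` module. The helper names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

TWO_PLACES = Decimal("0.01")


def round_half_up(value: str) -> Decimal:
    """Round to two decimals; ties go away from zero (the mode round() uses in Spark)."""
    return Decimal(value).quantize(TWO_PLACES, rounding=ROUND_HALF_UP)


def round_half_even(value: str) -> Decimal:
    """Round to two decimals; ties go to the even digit (the mode bround() uses in Spark)."""
    return Decimal(value).quantize(TWO_PLACES, rounding=ROUND_HALF_EVEN)


tie_up = round_half_up("7.125")
tie_even = round_half_even("7.125")
no_tie = round_half_up("7.126")

print(tie_up, tie_even, no_tie)
```

The two modes differ only on exact ties such as 7.125; for any other value they agree.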
@Kings07. 2 months ago
One thing I observed in your explanation is that you are crisp and right to the point, sir. If you ask me to put it analytically: more value delivered in the least amount of time, without any deviation. Great work, sir. I will learn more from you.
@DarshilParmar 2 months ago
Thank you very much :)
@fbravoc9748 2 months ago
Hello, really nice videos. I really like how you teach, and I am interested in starting the Spark and Databricks course. I have knowledge of SQL and Python but no previous knowledge of Snowflake. Can I still do the Spark and Databricks course without Snowflake?
@jeevanmegavath9370 2 months ago
Bro, could you please share the link to the full Obsidian notes for this project?
@ranjansrivastava9256 2 months ago
Hi Darshil, could you please share your Data Vidhya notes as a PDF? The enrollment price is too high for me. Please help me with this. Excellent video.
@hafizadeelarif3415 2 months ago
Hi Sir, how are you? Is it possible to fetch datasets from Kaggle using Azure Data Factory? With an Azure Function it is possible. How can it be done here?
@BishanTamang-rk5ji 2 months ago
Thank you brother ❤❤ love from Nepal 💗💗
@DarshilParmar 2 months ago
Always welcome
@Santhosh-jk7nm 2 months ago
Nice work brother
@DarshilParmar 2 months ago
Thank you! Cheers!
@pritamkabiraj7691 1 month ago
Date columns are appearing as null. BoolType columns are also appearing as null. Can you resolve that?
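One possible cause, offered as an assumption since the comment does not include the schema or the data: when a CSV value cannot be parsed into the declared column type, Spark's default permissive mode stores null. For dates this typically happens when the text does not match the expected pattern, and the usual remedies are the reader's `dateFormat` option or reading the column as a string and converting it with `to_date` and an explicit pattern. Boolean columns can behave the same way if the file holds values other than true/false. The plain-Python sketch below only illustrates the mismatch; the sample value and helper name are hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def parse_or_none(text: str, fmt: str) -> Optional[date]:
    """Return a date, or None when the text does not match the pattern."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        # Analogous to a null appearing in place of an unparseable value.
        return None


raw = "18/04/2008"  # assumed sample value, not taken from the dataset

mismatched = parse_or_none(raw, "%Y-%m-%d")  # pattern does not fit the text
matched = parse_or_none(raw, "%d/%m/%Y")     # pattern fits the text

print(mismatched)
print(matched)
```

Checking a few raw values of the affected columns as strings is a quick way to see which pattern the file actually uses.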
@user-wk2xy2vo6w 2 months ago
How do I get a data engineering internship, and how much do I need to know for an internship?
@sateeshkumar2698 2 months ago
Hi Darshil, can I get the notes for Python if I buy the course? Please answer.
@gautamagrawal9279 1 month ago
How do I create an account if I am still a student?
@joseluisdominguez8687 2 months ago
Nice video!! What software are you using on your iPad for this presentation?
@DarshilParmar 2 months ago
GoodNotes
@rajanthakur6586 14 days ago
Can you provide the S3 bucket URL for the IPL analysis so I can use it in my project? I do not have an AWS account.
@adilmajeed8439 2 months ago
Thanks for sharing such a lovely course on EDA using Apache Spark. Could you please correct the code at 56:13, where "batting_hand" is checked for "left"? It should be "Left", since the batting_hand column contains values like "Left-xxxxx".
@adilmajeed8439 1 month ago
@DarshilPamar Thanks again for sharing the project along with the solution. I was able to convert the same project to Microsoft Fabric. Lots of learning ...
@Abhijitdelhi 1 month ago
How can I use your bucket??
@AmanKumar-sr5wj 2 months ago
How much Python is needed? I am just starting 🙏
@lisitashamatutu1140 2 months ago
Hi Darshil, thanks for the insightful videos, is it okay to use Macbook Air for data engineering?
@DarshilParmar 2 months ago
Yes
@cittafactshow 1 month ago
Bhaiya, your courses are too expensive. I also want to learn. Can you lower the price of the combo package course, please?
@kiranrathod-so1xr 1 month ago
Hey @DarshilParmar, I didn't get why you consider only the 'run_scored' column when calculating "#Aggregation: Calculate the total and avg runs scored in each match and inning". In our DataFrame 'ball_by_ball_df', details are recorded like this: 1. When a bowler bowls a no-ball and the batsman scores 4 runs on that ball, it results in a 'run_scored' entry of 4 and an 'extra_runs' entry of 1. 2. If a bowler bowls a wide, it is marked as 0 in the 'run_scored' column and 1 in the 'extra_runs' column. So, when calculating the total runs for a match and innings, we need to add up both the 'run_scored' and 'extra_runs' columns to get the accurate total.
@DarshilParmar 1 month ago
As I kept saying in the video, the goal is not to get the business logic right but to teach how to use the tech.
@kiranrathod-so1xr 1 month ago
@DarshilParmar Yes, I forgot. Thanks for the reply ❤❤
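The point raised in this thread about extras can be sketched in plain Python. The rows below are made up for illustration, 'run_scored' and 'extra_runs' follow the comment's column names, and 'match_id' and 'inning' are assumed names. In PySpark the analogous change would be to sum `col("run_scored") + col("extra_runs")` inside the aggregation instead of `col("run_scored")` alone.

```python
from collections import defaultdict

# Hypothetical ball-by-ball rows, not taken from the dataset.
balls = [
    {"match_id": 1, "inning": 1, "run_scored": 4, "extra_runs": 1},  # no-ball hit for four
    {"match_id": 1, "inning": 1, "run_scored": 0, "extra_runs": 1},  # wide
    {"match_id": 1, "inning": 1, "run_scored": 2, "extra_runs": 0},
    {"match_id": 1, "inning": 2, "run_scored": 6, "extra_runs": 0},
]


def total_runs(rows, include_extras):
    """Sum runs per (match_id, inning), optionally adding extras."""
    totals = defaultdict(int)
    for row in rows:
        runs = row["run_scored"]
        if include_extras:
            runs += row["extra_runs"]
        totals[(row["match_id"], row["inning"])] += runs
    return dict(totals)


bat_only = total_runs(balls, include_extras=False)
with_extras = total_runs(balls, include_extras=True)

print(bat_only)
print(with_extras)
```

In this made-up sample the first innings total differs by two runs once the no-ball and the wide are counted.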
@anupamkumarsinha0 2 months ago
Brother, where do you live? I want to meet you.
@atharvadumre2502 2 months ago
Bro, in the Olympic data analysis project, the config code in Databricks gave me an error saying null value exception.
@DarshilParmar 2 months ago
The issue might be with the keys. A lot of people copy the Secret ID, but you need to copy the Secret Value.
@RishiRajxtrim 2 months ago
👍
@MuhammedSavadkv 2 months ago
Great Thank you
@DarshilParmar 2 months ago
You are welcome
@RaghulS-nl6wx 1 month ago
Can I build this project using Jupyter Notebook as well, or is there a particular reason for using Databricks? (Just asking.)
@DarshilParmar 1 month ago
You can; you need to configure Spark with Jupyter Notebook.
@giridharbasanaboina 2 months ago
I loved your content, thanks for sharing. I am confused about which database is better to learn, MySQL or PostgreSQL. Can anyone suggest one?
@aviatorifeanyi4239 2 months ago
I would recommend PostgreSQL; MySQL is also cool. There is little difference in syntax between the two.
@akshaydubey.57.a75 2 months ago
How do I copy the address of the ball_by_ball table from the dataset?
@DarshilParmar 2 months ago
Just use the S3 path.
@potatofarmer2099 1 month ago
Once you’ve built a portfolio project, how do you store and present it?
@DarshilParmar 1 month ago
GitHub
@devmanimaurya 1 month ago
Hi, is there any way to contact you?
@sandhyaejji9025 2 months ago
@DarshilParmar Thank you. As a fresher, can I apply for data engineering jobs in the USA?
@DarshilParmar 2 months ago
It is possible
@KVenomPoison 2 months ago
Isn't Spark distributed rather than parallel?
@DarshilParmar 2 months ago
It is both
@arunramanathan8214 2 months ago
Can we replicate this project entirely in GCP? Please advise, Darshil.
@DarshilParmar 2 months ago
Yes, use GCS, Dataproc, and BigQuery.
@ayxxnshxrif 2 months ago
This looks like a basic project. I don't think it is enough to put on a resume!
@DarshilParmar 2 months ago
You can never put YouTube projects on a resume. 100k+ people do these projects; do you think you can stand out by doing them? These projects are for learning and upskilling. The only project you should put on your resume is one that you create yourself.
@yahyashaikhworld 1 month ago
Why is the HAVING count > 120?
@adityatomar9820 2 months ago
Hey, I'm getting an error while reading from S3.
@DarshilParmar 2 months ago
What's the error?
@adityatomar9820 2 months ago
@DarshilParmar Hey, I solved it! It was an access denied error. I made my bucket public and it works now 🤗
@nomannazir4579 2 months ago
Do we have its source code available?
@DarshilParmar 2 months ago
Check description
@AmanKumar-sr5wj 2 months ago
Do we need to know some math as well? 🤔
@DarshilParmar 2 months ago
Basic college level
@avinash7003 2 months ago
Please bring an Airflow course.
@DarshilParmar 2 months ago
Next in the pipeline
@sateeshkumar2698 2 months ago
Can you please share your notes?
@DarshilParmar 2 months ago
The notes are part of my courses. They are an internal document, used in the video to explain the basics.
@sateeshkumar2698 2 months ago
@DarshilParmar Oh ok. If possible, can you sell the notes alone, please?
@phaddu7737 2 months ago
@DarshilParmar Hey, I'm interested in the standalone Python course, Darshil. Are any discounts coming soon?
@sateeshkumar2698 2 months ago
@phaddu7737 Me also, bro. Have you purchased it?
@noob_2377 2 months ago
First comment 🎉❤
@DarshilParmar 2 months ago
Let's go!
@iampiyushparida7 2 months ago
DARSHIL = 7 letters #Thalaforareason
@DarshilParmar 2 months ago
haha