Run Spark Jobs On Amazon Athena [FULL TUTORIAL IN 12MINS]

  3,606 views

Johnny Chivers

A day ago

Have you ever been in a situation where you want to run Spark code to analyse data, but don’t want to manage the underlying resources? Then Amazon Athena’s Spark engine could be the solution for you. Amazon Athena allows you to submit Spark code to a fully managed Spark engine in the form of a notebook. This allows you to carry out data analytics and exploration using Apache Spark without the need to plan for, configure, or manage resources. Apache Spark on Amazon Athena is serverless and provides automatic, on-demand scaling that delivers instant-on compute to meet changing data volumes and processing requirements. In this tutorial I will show you how to set up Amazon Athena to run Spark jobs using the resources I’ve provided for free on GitHub.
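The video submits Spark code through the notebook in the Athena console. As a rough sketch of the same idea done programmatically, the helper below starts a Spark session in a workgroup, submits one block of PySpark code, and waits for it to finish. It is not taken from the video or the repo: the function name `run_spark_code`, the sample code block, and the DPU limit are assumptions for illustration. The client methods it calls (`start_session`, `get_session_status`, `start_calculation_execution`, `get_calculation_execution_status`, `terminate_session`) are the ones exposed by the boto3 Athena client.

```python
import time

# Hypothetical PySpark snippet. Inside an Athena Spark session the `spark`
# variable is already defined, just as it is in the console notebook.
SPARK_CODE = """
df = spark.range(5)
df.show()
"""

SESSION_FAILED_STATES = {"FAILED", "TERMINATED", "DEGRADED"}
CALCULATION_DONE_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def run_spark_code(athena, workgroup, code, poll_seconds=2.0, max_dpus=4):
    """Run one block of Spark code in a new Athena session.

    `athena` is any object with the boto3 Athena client's session and
    calculation methods. Returns the final calculation state as a string.
    """
    session = athena.start_session(
        WorkGroup=workgroup,
        EngineConfiguration={"MaxConcurrentDpus": max_dpus},  # assumed limit
    )
    session_id = session["SessionId"]

    try:
        # Wait until the session is ready to accept calculations.
        while True:
            state = athena.get_session_status(SessionId=session_id)["Status"]["State"]
            if state == "IDLE":
                break
            if state in SESSION_FAILED_STATES:
                raise RuntimeError(f"Session ended in state {state}")
            time.sleep(poll_seconds)

        calc = athena.start_calculation_execution(
            SessionId=session_id, CodeBlock=code
        )
        calc_id = calc["CalculationExecutionId"]

        # Poll the calculation until it reaches a terminal state.
        while True:
            status = athena.get_calculation_execution_status(
                CalculationExecutionId=calc_id
            )
            state = status["Status"]["State"]
            if state in CALCULATION_DONE_STATES:
                return state
            time.sleep(poll_seconds)
    finally:
        # Always release the session so it does not keep consuming DPUs.
        athena.terminate_session(SessionId=session_id)
```

With boto3 installed and credentials configured, a call might look like `run_spark_code(boto3.client("athena"), "my-spark-workgroup", SPARK_CODE)`, where the workgroup name is a placeholder for a Spark-enabled workgroup of your own.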
LINK TO GITHUB TUTORIAL RESOURCES:
💾 Code Repo: github.com/johnny-chivers/spa...
SUPPORT THE CHANNEL:
☕ Buy Me A Coffee: www.buymeacoffee.com/johnnych...
🖥️ My VPN: go.nordvpn.net/aff_c?offer_id...
▬▬▬▬▬▬ T I M E S T A M P S ⏰ ▬▬▬▬▬▬
00:00 - Intro
01:01 - Setup Work
05:48 - Create An Athena Workgroup
06:47 - Create A Notebook
08:08 - Run Spark Code From Github
10:40 - Clean Up - Delete Resources
11:53 - Outro
OTHER USEFUL LINKS:
ℹ️ My Website: johnnychivers.co.uk
🔗 LinkedIn: / johnny-chivers
💻 Limitations: docs.aws.amazon.com/athena/la...
😎 About me
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest companies. My journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Master's degree in Software
Enjoy 🤘

Comments: 14
@TheErchetan 8 days ago
Very Good Content.
@DroisKargva a year ago
Awesome! I just started learning Spark, so this is perfectly timed. Thanks as always, Johnny!
@theengineeringsideofdata6246 a year ago
Great explanation, Johnny. I remember seeing this announcement at re:Invent and was curious about the details.
@ziauldba a year ago
You are the best, my friend ❣️
@nishvanth a year ago
Thank you for the video Johnny. Can you make a tutorial on AWS Redshift please?
@dhananjaypawar2496 a year ago
Hello Johnny, your videos are the best for all data engineering needs. Can you please make detailed videos on Lambda and Redshift?
@ToddCunningham a year ago
Love your videos, thank you.
@JohnnyChivers a year ago
Thanks for watching, Todd.
@dvo66 a year ago
Hey Johnny, great vid. I've learned a lot from your videos and they have helped me in my interviews as well. I have a request: could you please make a video explaining the different IAM policies used in production for a data engineering stack? I've followed all of your videos, but we have used admin access for the labs, and I want to learn the real IAM policies used for different levels of access in production. Hope you see my comment. Cheers!
@mohsenimani6652 a year ago
Thanks for the tutorial. How can we set up a job to run an Athena notebook on a specific cadence?
@nitropan a year ago
So all the completed results are stored in the S3 bucket in the results folder? What type of output is it?
@JohnnyChivers a year ago
It’s CSV, but there are ways to store it in another format: aws.amazon.com/premiumsupport/knowledge-center/athena-query-output-different-format/
@santiagomorales8806 a year ago
Hello Johnny, thanks for the video. What's currently the main difference between a job in Glue and a job in Athena?
@JohnnyChivers a year ago
Presently there is a long list of limitations with Athena Spark, around reads, writes, and even which table types it can read, whereas you get all of this functionality with Glue. I’ve left the link to the list of current limitations at the bottom of the video description.