Add Redshift Data Source In AWS Glue Catalog

6,999 views

DataEng Uncomplicated

A day ago

This video shows how to add tables from a Redshift cluster to the AWS Glue Data Catalog so that other services can use them.
Timeline
00:00 Introduction
00:47 Add Redshift database connection
02:04 Create AWS Glue crawler role
03:24 Create Glue crawler
06:33 Configure VPC Endpoint
#aws #awsglue
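
The steps in the timeline can also be scripted. What follows is a minimal sketch with boto3, not the console workflow shown in the video. Every concrete value you pass in (connection name, JDBC URL, credentials, subnet, security group, role ARN, catalog database, and an include path such as `dev/public/%`) is a placeholder assumption, and the helper function names are made up for illustration.

```python
def build_connection_input(name, jdbc_url, username, password,
                           subnet_id, security_group_ids, availability_zone):
    """Build the ConnectionInput dict for glue.create_connection (JDBC to Redshift)."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,  # e.g. jdbc:redshift://<host>:5439/<db>
            "USERNAME": username,
            "PASSWORD": password,
        },
        # Tells Glue which subnet and security groups to use to reach the cluster.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def build_crawler_params(crawler_name, role_arn, catalog_database,
                         connection_name, include_path):
    """Build keyword arguments for glue.create_crawler with one JDBC target."""
    return {
        "Name": crawler_name,
        "Role": role_arn,
        "DatabaseName": catalog_database,
        "Targets": {
            "JdbcTargets": [
                {"ConnectionName": connection_name, "Path": include_path}
            ]
        },
    }


def register_redshift_in_glue(connection_input, crawler_params):
    """Create the connection and crawler, then start the crawler (needs AWS access)."""
    import boto3  # imported lazily so the builders above work without boto3

    glue = boto3.client("glue")
    glue.create_connection(ConnectionInput=connection_input)
    glue.create_crawler(**crawler_params)
    glue.start_crawler(Name=crawler_params["Name"])
```

The two builder functions only assemble dictionaries, so they can be inspected or logged before `register_redshift_in_glue` makes any calls against an account.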

Comments: 17
@gauranshijohari4468 a year ago
Your video was a saviour!!
@user-cr1ee6zf5n a year ago
Hi, thanks for the video. I am new to AWS. Is there a way to access this table via Athena? What are the use cases where we might need to add a Redshift table to the Glue catalog? Thanks in advance.
@rajatpathak4499 a year ago
great
@nehalverma1444 a year ago
When I test my Glue connection it always fails. I have created S3 endpoints, the security group inbound rules allow all traffic from anywhere, and my role has the Glue service role permission. Everything seems fine, so why does it fail? Please help.
@DataEngUncomplicated a year ago
Hi there, is your Redshift cluster in a private subnet?
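
One more thing that may be worth checking here, offered as a possibility rather than a diagnosis: the AWS Glue documentation asks for a self-referencing inbound rule (all TCP ports, source set to the security group itself) on the security group used by a JDBC connection. A minimal boto3 sketch, where the helper names and the security group ID you pass in are assumptions:

```python
def build_self_referencing_rule(security_group_id):
    """Inbound rule allowing all TCP traffic from members of the same security group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 0,
            "ToPort": 65535,
            "UserIdGroupPairs": [{"GroupId": security_group_id}],
        }
    ]


def add_self_referencing_rule(security_group_id):
    """Apply the rule to the security group (needs AWS access)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=build_self_referencing_rule(security_group_id),
    )
```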
@MuhammadSaad-kb7op 11 months ago
I set up an IAM role called "AWSGlueServiceRole." Sir, I tried running the crawler on AWS Glue, but an error showed up, "Crawler cannot be started. Verify the permissions in the policies attached to the IAM role defined in the crawler." Can you please help me resolve this issue? Alternatively, Sir, could you create a video on how to add an RDS MySQL data source in the AWS Glue Catalog?
@DataEngUncomplicated 11 months ago
Hi Muhammad, what permissions did you add? Sounds like it might be a permission error
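
For anyone hitting the same error, a minimal sketch of a crawler role, assuming the problem is in the role itself: a role that is only named "AWSGlueServiceRole" does not automatically trust the Glue service or carry the AWS managed policy of the same name. The role name you pass in and the helper names are assumptions; the managed policy ARN is the AWS-provided one. A crawler that reads through a connection may need further permissions beyond this.

```python
import json

# AWS managed policy that grants the baseline Glue service permissions.
GLUE_SERVICE_POLICY_ARN = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"


def build_glue_trust_policy():
    """Trust policy that lets the Glue service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_crawler_role(role_name):
    """Create the role and attach the managed Glue policy (needs AWS access)."""
    import boto3  # imported lazily so the builder above works without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_glue_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=GLUE_SERVICE_POLICY_ARN)
```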
@waynelo8088 4 months ago
Hi, I cannot query the cataloged Redshift tables through Athena. Can we somehow make the cataloged tables queryable through Athena? If not, what's the use case for adding the Redshift tables to the Glue catalog? In other words, what's the purpose of the result of this video?
@DataEngUncomplicated 4 months ago
Hey, if you have Redshift you are already paying for compute, so I'm curious why you want to go through Athena instead of Redshift to do this. You can look into this option: docs.aws.amazon.com/athena/latest/ug/connectors-redshift.html The purpose of having a Redshift table in the Glue catalog is so you can access your Redshift tables in Glue jobs or Lambda functions using the AWS SDK for pandas library.
@waynelo8088 4 months ago
@DataEngUncomplicated I was exploring using Athena as a central interface to provide access to all our data assets via the Glue catalog. I don't want to move the data out of Redshift again, just make it queryable from Athena as needed.
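
The use case mentioned in the author's reply, reading Redshift from a Glue job or Lambda function with the AWS SDK for pandas (`awswrangler`), might look roughly like the sketch below. It resolves credentials from a Glue catalog connection by name. The connection, schema, and table names you pass in are assumptions, and `build_select` is a made-up helper that only accepts plain identifiers.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_select(schema, table, limit):
    """Build a simple SELECT statement, rejecting anything but plain identifiers."""
    for name in (schema, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    return f'SELECT * FROM "{schema}"."{table}" LIMIT {int(limit)}'


def read_redshift_table(glue_connection_name, schema, table, limit=10):
    """Read rows into a pandas DataFrame via a Glue catalog connection (needs AWS access)."""
    import awswrangler as wr  # imported lazily so build_select works without it

    con = wr.redshift.connect(connection=glue_connection_name)
    try:
        return wr.redshift.read_sql_query(build_select(schema, table, limit), con=con)
    finally:
        con.close()
```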
@MuhammadSaad-kb7op 11 months ago
Sir, can you create a video on AWS Glue 'Adding RDS MySQL Data Source to the AWS Glue Catalog'?
@DataEngUncomplicated 11 months ago
Sure, I'll add this idea to my video list. Thanks for the suggestion!
@johnychandrach a year ago
Thank you for the video, it helped me understand how we can connect Redshift tables with Glue. The crawler runs fine, but when I use the catalog table created by the crawler in a Glue job, I get the following error: SdkClientException occurred: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to aws-glue-assets-.......... failed: connect timed out. Any inputs?
@DataEngUncomplicated a year ago
No problem. Hmm, it sounds like it could be a VPC issue. I would check to make sure your Glue job has access to the Redshift VPC.
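
The host that timed out in this error looks like an S3 bucket, so one possible cause, stated as an assumption, is that the Glue job runs inside the VPC without a route to S3. The video's "Configure VPC Endpoint" step (06:33) covers this in the console. A minimal boto3 sketch of an S3 gateway endpoint follows, where the helper names and the VPC ID, region, and route table IDs you pass in are assumptions:

```python
def build_s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (S3 gateway endpoint)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the subnets the Glue connection uses.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint in the given region (needs AWS access)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_vpc_endpoint(
        **build_s3_gateway_endpoint_params(vpc_id, region, route_table_ids)
    )
```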
@lukmansetiadi a year ago
The crawler is successful, but somehow I cannot query those tables.
@lukmansetiadi a year ago
Any missing step?
@DataEngUncomplicated a year ago
Do you have the correct permissions to access the data?