This is a step-by-step tutorial on how to create a Step Functions state machine to orchestrate one or more Glue jobs and configure the IAM role. #aws #awsglue #stepfunctions IAM Permission Link: docs.aws.amazon.com/step-func...
Comments: 40
@PatrickPoplawska 8 months ago
Excellent video. To the point, called out common failure points. Well done all around.
@DataEngUncomplicated 8 months ago
Thanks for the comment Patrick, much appreciated!
@khandoor7228 2 years ago
I am really interested in Step Functions as well. Thanks for this, hope you do more!
@DataEngUncomplicated 2 years ago
Thanks! Absolutely! More videos to come!
@julioarenas7150 1 year ago
Thank you very much, very well explained and very precise. Greetings from Chile.
@DataEngUncomplicated 1 year ago
Thank you my Chilean friend!
@NehalVerma-zr4mq 1 year ago
Thanks, brother! You're great!
@DataEngUncomplicated 1 year ago
Thanks Nehal!
@jaffarahamed6089 2 years ago
Well explained... Thanks 👍🏻
@DataEngUncomplicated 2 years ago
Glad it was helpful!
@bhumisounds5107 1 year ago
The additional policies that you mentioned helped a lot. My state machine was hanging.
@DataEngUncomplicated 1 year ago
You're welcome, glad you got it working.
@claytonvanderhaar3772 1 year ago
Hi, great tutorial as usual, but I am struggling to get a Choice state working. I am not sure how to take the result (input path) from the Glue job and then pass it on to the Choice state. If you know how to do this, I would really appreciate it.
@user-hv9wx2md3c 5 months ago
Could you please upload the complete AWS data engineering playlist? It would be helpful for us. Your tutorials are easy to watch and make it faster to grasp things. Thank you.
@DataEngUncomplicated 5 months ago
Hey, that's a good idea, I can put them all into one playlist. It will be a lot of videos though, since I broke them down by AWS service.
@cringe6006 1 year ago
Really great video. Thank you for posting. Hope you don't get demotivated by the view count 😭 Your videos are really good.
@DataEngUncomplicated 1 year ago
Thanks! Much appreciated!
@felixa4705 1 year ago
As of today, there are about 6k views! That's a lot more people than you could reach through normal means. I think they're doing a great job!
@joegenshlea6827 1 year ago
Thank you so much for this video. It was a huge help to show the IAM permissions for the Glue job. Is there anything about the "permission_to_glue_topic" permission that we should know? Also, in my Lambda invocation I'm pasting the Lambda "event" JSON object into the payload options, which seems to work beautifully. Is there a way to reference the event configuration in Lambda from the step function directly without having to copy and paste?
@DataEngUncomplicated 1 year ago
Hi Joe, you're welcome! If you are trying to pass your event payload to your Lambda function through Step Functions, you can paste your test payload as the execution input when you start the execution manually in the console. You should set up your step function so that the payload gets passed directly to your Lambda function with the parameters your Lambda needs. I hope this is what you are looking for.
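A minimal sketch of that setup, written as a Python dict that serializes to an Amazon States Language definition. The function ARN and state name are hypothetical placeholders; the key point is that `"Payload.$": "$"` forwards the whole state input to Lambda as its event, so nothing has to be copied into the state machine by hand.

```python
import json

# Hypothetical function ARN (assumption): replace with your own.
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-function"

definition = {
    "StartAt": "InvokeLambda",
    "States": {
        "InvokeLambda": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": LAMBDA_ARN,
                # The ".$" suffix marks the value as a JSONPath expression;
                # "$" selects the entire state input, which becomes the Lambda event.
                "Payload.$": "$",
            },
            "End": True,
        }
    },
}

# Serialize to the JSON you would paste into the state machine definition.
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

With this definition, whatever JSON you supply as execution input is what the Lambda handler receives as `event`.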
@theroadbacktonature 1 year ago
Thanks for the demo. Can you provide more details on what Glue publishes to SNS? So we don't have to write a custom JSON message to SNS from Glue, and Glue automatically writes success or failure depending on the run state?
@DataEngUncomplicated 1 year ago
Hi Pradeep, if you configure an EventBridge rule using the Glue sample event, it shows what the general payload passed to SNS will look like. For example: { "version": "0", "id": "66fbc5e1-aac3-5e85-63d0-856ec669a050", "detail-type": "Glue Job Run Status", "source": "aws.glue", "account": "123456789012", "time": "2018-04-24T20:57:34Z", "region": "us-east-1", "resources": [], "detail": { "jobName": "MyJob", "severity": "INFO", "notificationCondition": { "NotifyDelayAfter": 1 }, "state": "STARTING", "jobRunId": "jr_6aa58e7a3aa44e2e4c7db2c50e2f7396cb57901729e4b702dcb2cfbbeb3f7a86", "message": "Job is in STARTING state", "startedOn": "2018-04-24T20:55:47.941Z" } }
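As a rough illustration of how a subscriber could consume an event of that shape, here is a small sketch that pulls the job name, state, and message out of the `detail` section. The function name `summarize_glue_event` is made up for this example, and the sample event is a trimmed copy of the payload above.

```python
import json

# Trimmed copy of the sample event shown above.
sample_event_json = """
{
  "detail-type": "Glue Job Run Status",
  "source": "aws.glue",
  "detail": {
    "jobName": "MyJob",
    "severity": "INFO",
    "state": "STARTING",
    "message": "Job is in STARTING state"
  }
}
"""


def summarize_glue_event(event: dict) -> str:
    """Build a one-line summary from a Glue job status event (hypothetical helper)."""
    detail = event.get("detail", {})
    job_name = detail.get("jobName", "unknown")
    state = detail.get("state", "unknown")
    message = detail.get("message", "")
    return f"Glue job {job_name} is {state}: {message}"


summary = summarize_glue_event(json.loads(sample_event_json))
print(summary)
```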
@STEVEN4841 5 months ago
Very useful, thanks, but, if I need to call 5 glue have bs for example, I can tell crate a workflow an then call whit workflow from this same way?
@DataEngUncomplicated 5 months ago
Hi Steven, can you edit your sentence? I don't understand what you are trying to do.
@GiorgosBastoulis 5 months ago
Excellent video, thanks for sharing! I have a question, I want to run a bash script and trigger it via Lambda with Step Functions. Is that possible?
@DataEngUncomplicated 5 months ago
Yes, you can “wrap” your bash script within a supported language like Node.js or Python. For example, in Node.js, you can use the child_process module to execute a bash script. Remember to package your bash script and any other necessary files into a ZIP file and upload it to AWS Lambda. Also, ensure that your bash script has the appropriate permissions to be executable.
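The same idea in Python can be sketched with the standard `subprocess` module. This is only an outline under a few assumptions: the script path `my_script.sh` is hypothetical and assumed to be bundled in the deployment package, and `bash` is assumed to be available in the runtime image. Passing the script to the interpreter explicitly means the file does not need its executable bit set.

```python
import subprocess

# Hypothetical script location (assumption): bundled next to the handler in the ZIP.
SCRIPT_PATH = "./my_script.sh"


def run_script(script_path: str, interpreter: str = "bash", timeout_seconds: int = 60) -> dict:
    """Run a shell script and return its exit code and captured output."""
    result = subprocess.run(
        [interpreter, script_path],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        timeout=timeout_seconds,
    )
    return {
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


def lambda_handler(event, context):
    """Lambda entry point: fail the invocation if the script exits non-zero."""
    outcome = run_script(SCRIPT_PATH)
    if outcome["returncode"] != 0:
        # Raising makes the Step Functions task fail, so Retry/Catch can react.
        raise RuntimeError(f"Script failed: {outcome['stderr']}")
    return outcome
```

Keep the timeout below the Lambda function's own timeout so the script is cut off cleanly rather than by the runtime.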
@oscarnegrete486 2 years ago
What are the permissions for the publish_to_glue_topic?
@DataEngUncomplicated 2 years ago
Hi Oscar, It just had the sns:Publish action. The full statement looks like this: { "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "sns:Publish", "Resource": "arn:aws:sns:us-east-1:account#:glue_jobs" } ] }
@Kaisean 4 months ago
What would be the rationale for using Glue in Step Functions vs. Glue Orchestration? If you're doing more than using GlueJob and GlueCrawler, Step Functions make sense, but is that all?
@DataEngUncomplicated 4 months ago
The choice between running AWS Glue from Step Functions and using Glue's own orchestration (Glue Workflows) depends on the complexity of your data pipeline and the services you're using. Glue Workflows are a good fit when you're chaining together multiple Glue jobs and/or crawlers. They are particularly useful for batch processing, where you can schedule workflows directly. However, Glue Workflows lack several features common in flow control tools, such as conditional branching, loops, dynamic maps, and custom steps. Step Functions are more suitable when the complexity exceeds simple triggers and the services used extend beyond Glue. They provide more advanced orchestration capabilities, including error handling (retries and catches), parallel execution, and conditional logic, and they integrate with over 220 AWS services, which makes them a more flexible choice for complex, multi-service workflows. Executions also start and stop quickly, so they can handle a reasonable throughput. Parallel branches are modeled explicitly with the Parallel and Map states, which gives you more control over fan-out than starting several jobs from a single Glue Workflow trigger.
@InvestorKiddd 1 year ago
Is there any way to give an S3 path and database as input to the Glue job run in a Step Function?
@DataEngUncomplicated 1 year ago
Yes, you can pass the S3 path and database as input parameters to a Step Functions state machine that includes a Glue StartJobRun step. When you start an execution, you supply the input as key-value pairs in JSON format, and the state machine definition can reference those values and pass them on to the Glue job.
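One way this could look, sketched as a Python dict that serializes to a state machine definition. The job name, argument names (`--s3_path`, `--database`), and input values are all hypothetical; the pattern to note is that keys ending in `.$` are resolved from the execution input via JSONPath and handed to Glue as job arguments.

```python
import json

definition = {
    "StartAt": "RunGlueJob",
    "States": {
        "RunGlueJob": {
            "Type": "Task",
            # The .sync suffix makes Step Functions wait for the job run to finish.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {
                "JobName": "my-glue-job",  # hypothetical job name
                "Arguments": {
                    # Glue job arguments are prefixed with "--"; the ".$" suffix
                    # tells Step Functions to read the value from the state input.
                    "--s3_path.$": "$.s3_path",
                    "--database.$": "$.database",
                },
            },
            "End": True,
        }
    },
}

# Example execution input (hypothetical values) supplied when starting the run.
execution_input = {"s3_path": "s3://my-bucket/input/", "database": "my_database"}

print(json.dumps(definition, indent=2))
print(json.dumps(execution_input))
```

Inside the Glue script, the values can then be read with `getResolvedOptions(sys.argv, ["s3_path", "database"])` from `awsglue.utils`.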
@InvestorKiddd 1 year ago
@DataEngUncomplicated Thanks!
@Velben 1 year ago
I'm curious. How did you learn data engineering?
@DataEngUncomplicated 1 year ago
Working as a data engineer and in the data analytics field for 10 years. Also doing Udemy courses, AWS certifications and side projects to continue to learn as the field is changing so fast with new services coming out all the time.
@mallikarjunsangannavar907 1 year ago
How do I enable the step function to run the jobs in parallel?
@DataEngUncomplicated 1 year ago
Hi Mallikarjun, there is a Parallel state, which lets you run whichever jobs you want in parallel.
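A small sketch of a Parallel state with two branches, each running one Glue job. The job names and the `glue_branch` helper are invented for the example; each branch is a self-contained mini state machine, and the Parallel state completes once all branches have finished.

```python
import json


def glue_branch(job_name: str) -> dict:
    """Build one Parallel branch that runs a single Glue job (hypothetical helper)."""
    state_name = f"Run_{job_name}"
    return {
        "StartAt": state_name,
        "States": {
            state_name: {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": job_name},
                "End": True,
            }
        },
    }


# Hypothetical job names (assumption).
job_names = ["job-a", "job-b"]

definition = {
    "StartAt": "RunJobsInParallel",
    "States": {
        "RunJobsInParallel": {
            "Type": "Parallel",
            # Every branch starts at the same time; the state waits for all of them.
            "Branches": [glue_branch(name) for name in job_names],
            "End": True,
        }
    },
}

print(json.dumps(definition, indent=2))
```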
@SimonLopez-hj2cj 2 months ago
How do I customize the message that SNS sends?
@DataEngUncomplicated 2 months ago
In the SNS step there should be a box where you can customize the message.
@SimonLopez-hj2cj 2 months ago
@DataEngUncomplicated Then how do I use the parameters of the job? For example, if I want to send "The job state is (~SUCCEEDED~ or ~FAILED~). At this time ~endtime~". Thanks.
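One possible approach, sketched here rather than taken from the video, is the `States.Format` intrinsic function in the SNS step's `Message.$` field. The sketch assumes the Glue task uses the `.sync` integration and that its output exposes `JobRunState` and `CompletedOn`; it also assumes a failed run raises an error, which is why the failure message is built in a separate state reached through `Catch`. The topic ARN account ID, job name, and `sns_publish` helper are hypothetical.

```python
import json

# Hypothetical topic ARN (assumption): replace the account ID with your own.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:glue_jobs"


def sns_publish(message_expr: str) -> dict:
    """Build an SNS publish state whose message is computed at run time."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {"TopicArn": TOPIC_ARN, "Message.$": message_expr},
        "End": True,
    }


definition = {
    "StartAt": "RunGlueJob",
    "States": {
        "RunGlueJob": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "my-glue-job"},  # hypothetical job name
            # On any error, keep the original input and attach the error under $.error.
            "Catch": [
                {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "NotifyFailure"}
            ],
            "Next": "NotifySuccess",
        },
        # Assumed output fields of the Glue task: JobRunState and CompletedOn.
        "NotifySuccess": sns_publish(
            "States.Format('The job state is {}. At this time {}.', $.JobRunState, $.CompletedOn)"
        ),
        "NotifyFailure": sns_publish(
            "States.Format('The job state is FAILED. Cause: {}', $.error.Cause)"
        ),
    },
}

print(json.dumps(definition, indent=2))
```

Each `{}` placeholder in `States.Format` is filled, in order, by the JSONPath arguments that follow the template string.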