AWS Glue Studio is a visual tool to create, run, and monitor ETL Jobs in AWS Glue. AWS Glue DataBrew is a visual tool for data preparation and data profiling. In this video, you learn when to use Glue Studio vs. Glue DataBrew.
Comments: 19
@haugstve 2 years ago
I like this video. As a data scientist with data engineering responsibilities, I see clear use cases for both tools. We use SageMaker instead of DataBrew, which should point to the differences. I would say that Glue Studio is focused on data. There is no intention of doing anything with it except getting it, transforming it, and storing it (ETL). Job done. DataBrew is there for people who use data. For them, data is the tool, not the product. You want a dataset to get insights or train a model. The intention is different, which also means that the skills and preferences of the users are different.
@prathapn01 1 month ago
very informative sir... :)
@chriskondiah741 2 years ago
By the way, I love this video. I feel AWS has many redundant tools, and they should start to narrow down their tools to limit confusion.
@AWSTutorialsOnline 2 years ago
I agree. It does sometimes create confusion due to duplicate capabilities.
@SathishKumarBilla 2 years ago
Thanks for the video. It's so informative.
@AWSTutorialsOnline 2 years ago
You are welcome!
@zpino 2 years ago
Thanks a lot. Very clear.
@AWSTutorialsOnline 2 years ago
Glad it was helpful!
@skiran6316 1 year ago
Data lineage is useful when we are collaborating with multiple teams and have multiple sources: lineage is an easier way to track where data is coming from and how it is being transformed.
@vincenthuysmans2137 1 year ago
FYI: AWS Glue Studio also provides data preview. I see that it was added after this video was released.
@LittleBoodhaOne 9 months ago
Thank you for this informative video :) I would like to submit a problem that I've experienced in Glue DataBrew; if any of you can help, it would be a blessing. Here's the situation: I would like to filter on a column value that isn't in the sample dataset, and I've found out that the recipe only focuses on the sample dataset. The fact that the sample is limited to 5000 rows max is preventing me from completing my recipe. Does anybody have an idea how to bypass the sample size limit?
@vivekjacobalex 3 years ago
OK, thanks for the information. Now I understand: DataBrew is geared more towards data preparation for ML, and Glue is geared more towards job processing using PySpark. The similarity is that both can do GUI-based ETL.
@AWSTutorialsOnline 3 years ago
Glue can do limited ETL to S3 only.
@chriskondiah741 2 years ago
What is the difference between DataBrew and SageMaker Data Wrangler?
@AWSTutorialsOnline 2 years ago
SageMaker Data Wrangler is part of SageMaker Studio, and it can be used to build an end-to-end pipeline along with other pipeline components such as model training, model deployment, etc. DataBrew, however, is also for data scientists, but it is only for feature engineering and nothing else. Hope it helps.
@grhaonan 1 year ago
Another key difference is that DataBrew doesn't offer custom transformations, I reckon?
@vincenthuysmans2137 1 year ago
Nope, it doesn't. DataBrew is a no-code solution, whereas Glue Studio is hybrid (low-code/heavy-code).