Apache Airflow vs. Dagster

12,478 views

Dagster

1 year ago

Many data engineers are looking to get past the limitations of Apache Airflow, the incumbent in the data orchestration layer. Dagster proposes a new paradigm centered on Data Assets and the tools to support a full development lifecycle that radically boosts the productivity of data engineering teams.
In this video, Sandy Ryza, lead engineer on the Dagster project, explores the main differences between Apache Airflow and Dagster.
Read the companion blog post here: dagster.io/blog/dagster-airflow

Comments: 16
@harshvardhanchauhan3227 1 year ago
It would be really helpful if you covered some of the topics you mentioned at the end, especially dependency isolation, given Python's dependency model.
@waterhill 10 months ago
Can Dagster be used to orchestrate a Spark Streaming job on YARN that pulls data from Kafka and writes to HDFS? The idea is that if the Spark Streaming job gets stuck in the queue, Dagster could detect it, alert on it, and restart it automatically. Or would Airflow be the right tool for this?
@vidbina 1 year ago
Tried Dagster a few days back and liked it, but I have a weird need: some workloads need to run in TypeScript, Rust or Haskell, as there is some parsing happening for which I don't have any Python libs available atm. How would you solve this problem? I'm thinking of 1) hitting an external service that runs my code in whatever runtime it needs or 2) using Temporal, which has a JavaScript/TypeScript SDK in addition to Python and Golang afaict. Curious to hear your characterization of the diff between Temporal and Dagster. Haven't done the deep-dive myself yet.
@s_ryz 1 year ago
Hey David - we generally recommend using the dagster-shell library to invoke code in other languages inside Dagster pipelines. A couple of high-level differences between Dagster and Temporal:
- Dagster is focused on data pipelines, while Temporal is more focused on application-related workflows
- Dagster involves declaring the target state up-front, while I believe Temporal is more dynamic
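The general approach in the reply above, calling out from a Python pipeline step to a program written in another language, can be sketched with the standard library alone. This is not the dagster-shell API; `run_external_parser` and `STAND_IN_PARSER` are hypothetical names, and the stand-in command uses the Python interpreter only so the sketch runs anywhere. In practice the command would point at a compiled Rust or Haskell binary, or a Node script.

```python
import subprocess
import sys
from typing import Sequence


def run_external_parser(command: Sequence[str], payload: str, timeout: float = 30.0) -> str:
    """Send `payload` to an external program on stdin and return its stdout.

    Hypothetical helper: a pipeline step would call this and pass the
    returned text on to the next step.
    """
    completed = subprocess.run(
        list(command),
        input=payload,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()


# Assumption: stand-in for a real parser binary; it upper-cases its input.
STAND_IN_PARSER = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]

if __name__ == "__main__":
    print(run_external_parser(STAND_IN_PARSER, "raw input"))
```

Using `check=True` means a failing external process surfaces as an exception in the Python step, so an orchestrator can mark the step as failed instead of silently passing on empty output.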
@jakobullmann7586 5 months ago
I don’t know… this video is one year old, but still uses the legacy DAG syntax from Airflow 1, rather than the TaskFlow API from Airflow 2. So the syntax doesn’t make a difference anymore. Regarding the coupling to environment: Airflow has different executors. The KubernetesPodOperator is not the only way to run on a Kubernetes environment. The rest may or may not be true. Probably there are many things that Dagster does better than Airflow. But I’m disappointed that you would publish such a biased comparison.
@peppeAug 1 year ago
I don't understand what "can run only in production" means... As if you could not have one instance pointing to non-production environments and another pointing to production environments, and manage the versions of your code with any git tool. :\
@s_ryz 1 year ago
Hey Giuseppe - yes, you can stand up an Airflow instance inside a non-production environment. However, the programming model encourages you to write DAGs in a way that binds them to particular environments, and Airflow is heavyweight in a way that makes it difficult to use as part of a local development workflow.
@kalyanben10 6 months ago
@s_ryz Why wouldn't one just parameterize those as variables? There's no way Airflow encourages you to not use variables and to hardcode stuff for production. Maybe you have a proper example explaining your point? Otherwise, it's just that you are commenting without understanding the best practices of Airflow.
@petitslipdubled 4 months ago
@kalyanben10 He gave a good example. If your Airflow ETL runs in a Kubernetes cluster in prod, the only way to test it locally would be to run the entire cluster on your host. With Dagster, your pipeline is decoupled from its runtime environment, so you would be able to test the same pipeline within a Python process on your machine, for example.
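The decoupling described in the comment above can be illustrated without any orchestrator. The following is a plain-Python sketch, not Dagster's API: the pipeline step receives its storage as an argument, so production could pass a cluster-backed implementation while a local test passes an in-memory one. `Storage`, `InMemoryStorage`, and `double_positive_rows` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class Storage(Protocol):
    """What the step needs from its environment (hypothetical interface)."""

    def read(self, key: str) -> List[int]: ...

    def write(self, key: str, rows: List[int]) -> None: ...


@dataclass
class InMemoryStorage:
    """Local stand-in; a production variant might wrap HDFS or object storage."""

    data: Dict[str, List[int]] = field(default_factory=dict)

    def read(self, key: str) -> List[int]:
        return self.data[key]

    def write(self, key: str, rows: List[int]) -> None:
        self.data[key] = list(rows)


def double_positive_rows(storage: Storage, source: str, target: str) -> int:
    """Business logic only: it never names a cluster, bucket, or host."""
    rows = [value * 2 for value in storage.read(source) if value > 0]
    storage.write(target, rows)
    return len(rows)


if __name__ == "__main__":
    local = InMemoryStorage({"raw": [1, -2, 3]})
    written = double_positive_rows(local, "raw", "clean")
    print(written, local.data["clean"])
```

Because the step only depends on the `Storage` interface, the same function can run inside a single local process during development and against real infrastructure in production.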
@jorgeramiroalarconvargas2580 1 year ago
If you look at the most recent version of Airflow, it also has decorators and dbt support. On the other hand, Apache Airflow is free. Nice try on comparing Dagster 🙃 with Airflow 🙂, but I'm sticking with AIRFLOW.
@Eriddoch 1 year ago
Yes. I'm not saying this video is wrong or that I prefer Airflow myself, BUT after surveying
- Mage
- Prefect
- Dagster
- AWS Step Functions + EventBridge
I've found that all vendors seem to be reacting to Airflow 1.0: "We struggled with Airflow 1.0 so we built our own orchestrator product." Airflow 2.0 seems to have answers to most of the pain points I personally faced when maintaining Airflow 1.0 on Kubernetes back in 2019. I'm still open to using a different orchestration tool after my experience, but I need to gather accurate information about the *current* state of the space before making that kind of long-term decision.
@xOnelinx 6 months ago
This is such a superficial and disingenuous comparison that I don't even want to write my comment in English🤦‍♂
@flogzer0 2 months ago
I'm fairly sure this sales guy never used Airflow
@hadjebi 1 year ago
All wrong claims: low developer productivity, catch errors in production, poor visibility.
@Mathias-ti3lz 5 months ago
I totally agree. With TaskFlow it is easily possible to achieve the same.
@agent_artifical 1 year ago
Nice Try;)