Rethinking Orchestration as Reconciliation: Software Defined Assets in Dagster | Elementl

Data Council

ABOUT THE TALK
This talk discusses software-defined assets, an approach to orchestration and data management that makes it drastically easier to trust and evolve data assets, like tables and ML models.
In traditional data platforms, code and data are only loosely coupled. As a consequence, deploying changes to data feels dangerous, backfills are error-prone and irreversible, and it’s difficult to trust data, because you don’t know where it comes from or how it’s intended to be maintained. Each time you run a job that mutates a data asset, you add a new variable to account for when debugging problems.
Dagster proposes an alternative approach to data management that tightly couples data assets to code: each table or ML model corresponds to the function that’s responsible for generating it. This results in a “Data as Code” approach that mimics the “Infrastructure as Code” approach that’s central to modern DevOps. Your git repo becomes the source of truth for your data, so pushing data changes feels as safe as pushing code changes. Backfills become easy to reason about. You trust your data assets because you know how they’re computed and can reproduce them at any time. The role of the orchestrator is to ensure that physical assets in the data warehouse match the logical assets that are defined in code, so each job run is a step towards order.
Software-defined assets are a natural approach to orchestration for the modern data stack, in part because dbt models are a kind of software-defined asset.
Attendees of this session will learn what it looks like to build and maintain a warehouse or data lake of software-defined assets with Dagster.
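The reconciliation idea described above can be sketched in a few lines of plain Python. This is an illustrative toy, not Dagster's API: the names AssetDef, Materialization, reconcile and the code_version field are all assumptions made up for this example. Each logical asset is a function plus a version string, the "warehouse" is a dictionary, and a run recomputes only the assets whose stored state no longer matches their definition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class AssetDef:
    """Logical asset: a name, the version of its code, and the function that builds it.

    Hypothetical structure for illustration; not a Dagster class.
    """
    name: str
    code_version: str
    deps: Tuple[str, ...]
    compute: Callable[..., object]


@dataclass
class Materialization:
    """Physical asset: the stored value and the code version that produced it."""
    code_version: str
    value: object


def reconcile(defs: List[AssetDef], store: Dict[str, Materialization]) -> List[str]:
    """Bring the store in line with the definitions.

    Assumes `defs` is listed in dependency order (upstream first).
    Returns the names of the assets that were recomputed.
    """
    refreshed: List[str] = []
    for d in defs:
        current = store.get(d.name)
        stale = (
            current is None                              # never materialized
            or current.code_version != d.code_version    # code changed
            or any(dep in refreshed for dep in d.deps)   # upstream was rebuilt
        )
        if stale:
            inputs = [store[dep].value for dep in d.deps]
            store[d.name] = Materialization(d.code_version, d.compute(*inputs))
            refreshed.append(d.name)
    return refreshed


# Two assets: a raw table and a value derived from it.
raw_orders = AssetDef("raw_orders", "v1", (), lambda: [10, 20, 30])
order_total = AssetDef("order_total", "v1", ("raw_orders",), lambda orders: sum(orders))

store: Dict[str, Materialization] = {}
first_run = reconcile([raw_orders, order_total], store)   # nothing exists yet
second_run = reconcile([raw_orders, order_total], store)  # already in sync

# "Pushing a code change": order_total now computes a mean instead of a sum.
order_total_v2 = AssetDef(
    "order_total", "v2", ("raw_orders",), lambda orders: sum(orders) / len(orders)
)
third_run = reconcile([raw_orders, order_total_v2], store)
```

In this sketch the first run builds both assets, the second run does nothing because the store already matches the definitions, and the third run rebuilds only the asset whose code changed. A real orchestrator would also track partitions, upstream data versions, and failures, which this toy ignores.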
ABOUT THE SPEAKER
Sandy is a software engineer at Elementl, building Dagster. Previously, he led machine learning and data science teams at KeepTruckin and Clover Health. He's a committer on Spark and Hadoop, and co-authored O'Reilly's Advanced Analytics with Spark.
ABOUT DATA COUNCIL:
Data Council (www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: / datacouncilai
LinkedIn: / datacouncil-ai
Eventbrite: www.eventbrite.com/o/data-cou...

Comments: 5
@fiannafailgalway8446 2 years ago
This was an excellent talk.
@Fat1Dada 1 year ago
Very clear!
@user-if2kq8nh8m 1 year ago
The audio clipping was rough on this video; nonetheless, great presentation!
@Fat1Dada 1 year ago
23:26 "There's some important bathwater that we shouldn't throw out with the baby" xD That's a cute slip-up there. I think most people evolve from babies, therefore they would agree bathwater is the most disposable "asset".
@BenOgorek 1 year ago
I think I might be sold