Advancing Spark - Data + AI Summit 2024 Key Announcements

4,808 views

Advancing Analytics

12 days ago

The dust is settling now that the Data + AI Summit has come to an end, so it's time to reflect on the insane number of announcements that we saw over just a couple of days! We have the massive open sourcing of Unity Catalog, new products such as the AI/BI Interactions, Compound AI Applications and then big teasers about LakeFlow - the complete rewrite of how we think about ETL!
In this video, Simon runs through a bunch of clips from the keynotes, pulling out the key announcements that you should be aware of if you are in the Data/AI Space!
For full YouTube keynote replays, see:
Day 1 - • Data + AI Summit Keyno...
Day 2 - • Data + AI Summit 2024 ...
And as always, Advancing Analytics can help you get the most out of your Data Intelligence Platform (or build one if you're not there yet), so give us a call if you need that extra boost.

Comments: 24
@MichaelEwins1967 3 days ago
I agree that the Tabular acquisition will lead to improved interoperability for Delta Lake & Iceberg users. For me this signals more the trend of reducing data movement and ETL so that people can use data where it is. And all access control is managed by Unity Catalog.
@shawndeggans 11 days ago
Excellent summary, Simon! I'm looking forward to LakeFlow. 😀
@rommelbojorgee.8902 10 days ago
Nice summary, thanks Simon
@drummerboi4eva 7 days ago
Amazing Simon, thanks for this update
@alexischicoine2072 5 days ago
I liked your point about Serverless and what we do. Hopefully by the time the transition is done I’ll have retired into leadership 😂
@omgitsbenhayes 11 days ago
Nice recap of the key announcements! ABAC demo was 🎉
@alexischicoine2072 5 days ago
LakeFlow seems great if you can source control and deploy it. Hopefully it’s also somewhat testable.
@alexischicoine2072 5 days ago
Hoping we get branches in Delta for write-audit-publish, as that’s a pretty useful feature in Iceberg.
@danhorus 10 days ago
And I'm here still waiting for that for_each task xD
@AdvancingAnalytics 10 days ago
It's on the roadmap, it was on one of the keynote slides and everything! 😅
@mkrichey1 11 days ago
I think most of those changes are going to have a big effect on the way we manage data. Databricks are set up to be the single tool right up to the point you visualize the end result. Wonder how MS feel about the fact they might end up serving instances of the very platform that makes Fabric a bit redundant :P especially if the pricing is clear and competitive :)
@Mim_BI 10 days ago
come on, not a single word about duckdb, it was everywhere on the keynote :)
@AdvancingAnalytics 9 days ago
Haha, it's true - there was the segment from Hannes himself. But the update is largely that DuckDB can now natively read Delta right, nothing I saw is directly Databricks functionality? That said, I'm waaaay overdue a separate video spinning up duckdb on a single node and showing how fast it is!
@ErikParmann 11 days ago
What do you know about the realtime mode? Do you think it's just a rename of the experimental spark continuous mode?
@AdvancingAnalytics 9 days ago
I need to dig into what's been announced publicly so I don't break NDAs - but I can say that what I've seen has come a fair way from the old continuous mode, it's more than just the spark engine change behind what's driving the performance increase.
@norbertczulewicz1695 10 days ago
Currently only SQL Warehouses can be serverless, and they support SQL only. Does that mean Python is not recommended for new projects?
@AdvancingAnalytics 9 days ago
That's what the announcements were all about - they're rolling out Serverless for Workflows/Notebooks, which means full serverless Python support. Python is thoroughly recommended for any engineering/automation workloads (with embedded SQL for transformations as necessary).
@norbertczulewicz1695 9 days ago
@AdvancingAnalytics The biggest problem is the price. The serverless option is the most expensive workload in Databricks. For many companies it can be a blocker, especially when cheaper options exist. I've heard of situations where companies ask developers not to use the SQL serverless warehouse because of that.
@ErikParmann 9 days ago
@AdvancingAnalytics Hopefully this means serverless is supported in more regions!
@gags220988 9 days ago
Isn't Genie just hitting the OpenAI endpoint?
@AdvancingAnalytics 9 days ago
Nope - the original Databricks Assistant was using OpenAI, this new iteration is a flavour of DBRX, with the context of your own data (Unity Catalog, recent activity/queries, etc). Should have far, far more context than just hitting an open endpoint.
@TomPerry83 11 days ago
I agree with the points about serverless making things easier and doing it better than a person would do. However, I would still want to know what it is doing, so I could replicate it elsewhere (self-hosted, other future vendor, etc). Otherwise this is another type of vendor lock-in, i.e. if I'm too reliant on the platform optimising stuff for me, then I'm effectively locked in.