3.06 Mastering Common Silver and Gold zone transformations with PySpark in Microsoft Fabric

2,503 views

Fikrat Azizov

1 day ago

• Microsoft Fabric For B...
This video explores common transformation techniques in the Silver and Gold zones of the Medallion architecture. I explain data enrichment and type conversion transformations and demonstrate how to use PySpark APIs and methods to address these tasks.
I also demonstrate how to process historical data from the Bronze layer using Window functions. Next, I explain core Kimball dimensional modelling concepts and demonstrate how they can be implemented using PySpark methods.
Finally, I demonstrate creating aggregates.
You can download the related demo notebook from here: github.com/faz...
Chapters:
00:00- Introduction
02:21- Preview
06:19- Lakehouse historical data storage strategy
09:00- Demo start- preparing data
10:24- Creating shortcuts to Bronze tables
11:24- Notebook demo- reading data from shortcuts
12:30- Inspecting data frame schema
13:48- Data Type conversion transformations
16:05- Ordering data
20:00- Handling historical data using Window functions
24:25- Data enrichment transformations
25:45- Using regular expressions to parse text data
26:40- Generating time dimension
30:45- Dimensional modelling concepts
32:12- Slowly changing dimensions (SCD)
33:05- SCD Type-2 dimensions
34:54- Surrogate keys
35:32- Relationships between facts and dimensions
37:00- Generating surrogate keys using monotonically_increasing_id function
38:00- Distributed computing and Spark partitions
41:31- Reducing data frame partition count
43:02- How to link Fact and Dimension tables
47:14- Incremental write into destination tables
49:02- Using MERGE INTO query for destination write
50:50- Aggregation transformations
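For the "Generating time dimension" chapter listed above, the idea can be sketched in plain Python without a Spark session. This is an assumption-laden illustration, not the code from the video: the daily grain, the build_date_dimension helper and all column names are hypothetical.

```python
from datetime import date, timedelta


def build_date_dimension(start: date, end: date) -> list[dict]:
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append(
            {
                "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20240131
                "date": current,
                "year": current.year,
                "quarter": (current.month - 1) // 3 + 1,
                "month": current.month,
                "day": current.day,
                "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
                "is_weekend": current.isoweekday() >= 6,
            }
        )
        current += timedelta(days=1)
    return rows


# One row for every day of 2024.
dim_date = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
```

In a notebook, rows like these could be turned into a data frame (for example with spark.createDataFrame) and written to a Gold table. The integer date_key in yyyymmdd form is one common choice of key for a date dimension, used here only as an example.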
Please subscribe: @fazizov
Official Documentation:
learn.microsof...
learn.microsof...
sparkbyexample...
www.kimballgro...
spark.apache.o...
Hashtags:
#datafactory, #microsoft, #microsoftfabric, #azure, #dataengineering, #cloudcomputing, #dataanalytics, #lakehouse, #azuretutorial, #azuretraining, #datapipeline, #dataextraction, #dataintegration, #datatransfer, #dataflow, #spark, #deltalake, #synapse, #synapsedataenginering, #demo, #datalake, #transformation, #ingested, #datawarehouse, #azuredatabricks, #databricks, #bigdata, #bigdatatechnologies, #pyspark, #sparksql, #notebook, #transformationvideo, #bronze, #medallion, #kimball, #dimensions, #modeling, #facts, #silver, #gold, #historical data, #dimensional

Comments: 6
@joseluiscorreasalazar5670 1 month ago
Thank you very much! This is one of the best tutorials on Fabric Lakehouses out there
@fazizov 1 month ago
Thanks for watching!
@kevthebandit 6 months ago
Thanks for breaking this down!
@fazizov 6 months ago
Thanks for feedback!
@digitalevidenceofthings 6 months ago
This is incredible, exactly what I needed to see to ensure I'm on the right track. Thank you for taking the time to do this video!
@fazizov 6 months ago
Glad it was helpful, thanks!