Transforming data | PySpark, T-SQL & Dataflows in Microsoft Fabric | DP-600 EXAM PREP (7 of 12)

  6,179 views

Learn Microsoft Fabric with Will

Free DP-600 study notes inside the community: www.skool.com/microsoft-fabri...
In this video (7 of 12 in the series), we cover the following:
Data cleansing:
Implement a data cleansing process
Identify and resolve duplicate data, missing data, or null values
Convert data types by using Dataflows or PySpark
Filter data
Data enrichment:
Merge or join data
Enrich data by adding new columns or tables
Data modelling:
Implement a star schema for a lakehouse or warehouse, including Type 1 and Type 2 slowly changing dimensions
Implement bridge tables for a lakehouse or a warehouse
Denormalize data
Aggregate or de-aggregate data
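The cleansing steps listed above (remove duplicates, handle nulls, convert types, filter) can be sketched in plain Python. This is only an illustration of the logic, not the code from the video: the sample rows, the `clean` helper, and the choice to replace a missing `Revenue` with 0.0 are all assumptions. In PySpark the same steps would typically use `dropDuplicates`, `fillna`, `cast`, and `filter`.

```python
from datetime import date

# Hypothetical raw rows; column names and values are illustrative only.
raw_rows = [
    {"Branch_ID": "1", "Date_ID": "2024-01-01", "Revenue": "100.5"},
    {"Branch_ID": "1", "Date_ID": "2024-01-01", "Revenue": "100.5"},  # exact duplicate
    {"Branch_ID": "2", "Date_ID": "2024-01-01", "Revenue": None},     # missing value
    {"Branch_ID": "3", "Date_ID": "2024-01-02", "Revenue": "-5"},     # filtered out below
]


def clean(rows):
    """Deduplicate, fill nulls, convert types, then filter."""
    seen = set()
    cleaned = []
    for row in rows:
        # 1. Drop rows that are identical in every column.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        # 2. Resolve nulls (assumption: a missing Revenue becomes 0.0).
        revenue = float(row["Revenue"]) if row["Revenue"] is not None else 0.0
        # 3. Convert string columns to proper types.
        typed = {
            "Branch_ID": int(row["Branch_ID"]),
            "Date_ID": date.fromisoformat(row["Date_ID"]),
            "Revenue": revenue,
        }
        # 4. Filter: keep only non-negative revenue.
        if typed["Revenue"] >= 0:
            cleaned.append(typed)
    return cleaned


cleaned_rows = clean(raw_rows)
```

Whether a null should be filled, dropped, or flagged depends on the dataset; filling with zero is just one option.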
This video is part of the DP-600 Exam Preparation series: • DP-600 Exam Preparation
Timeline
0:00 Intro
1:29 Data cleansing process
2:26 Introduction to the dataset
3:31 Dataflow: data cleaning
6:55 T-SQL: data cleaning
10:51 PySpark: data cleaning
20:25 Star schema
22:41 Slowly-changing dimensions
23:36 Type 1 SCD
24:27 Type 2 SCD
27:53 Bridge tables
28:56 Implementing a bridge table in T-SQL
32:53 Normalized vs Denormalized data
34:53 Data aggregation (and de-aggregation)
37:54 Practice Questions
43:45 Outro and next steps
#microsoftfabric #dp600 #powerbi

Comments: 36
@LearnMicrosoftFabric 1 month ago
Hey everyone, thanks for watching!! How are you finding the course so far? A lot to learn??
@nagarjunabm2738 1 month ago
I find this course to be very helpful and effective in helping me learn for the DP-600 exam. Looking forward to next one!
@LearnMicrosoftFabric 1 month ago
That's awesome, glad the course is helping 🙌
@mohamedammar2805 1 month ago
Awesome, thanks for your time and effort.
@josecardenas2736 1 month ago
Awesome, very well explained. Looking forward to passing the exam soon.
@user-data_junkie 29 days ago
Good. Thanks for putting in the work to create this.
@user-dy8xu7uj8k 26 days ago
Hi Will, your videos provide a great learning experience. Thank you for creating such good content.
@cuilanzou8638 1 month ago
It's a happy day today because we have a new video in the DP-600 series! La, la, la, la... Thank you, Will!!!
@LearnMicrosoftFabric 1 month ago
Haha I hope you find it useful, thanks Norya!
@azwarmzafar 1 month ago
Man, you are doing a great job. Your content is golden and a real eye-opener into the platform. Many thanks.
@padmasubbiah6259 23 days ago
Thanks for the awesome videos, Will!!
@jamesbarrett1878 1 month ago
Thanks Will. I was waiting for the next video. Great stuff so far.
@LearnMicrosoftFabric 1 month ago
Thanks for watching James!! Glad you're enjoying 🙌
@yazankabalan4775 1 month ago
A brilliant explanation of fundamental concepts in data transformation and data modelling. Thanks a lot Will, keep up the great work! 🔝
@LearnMicrosoftFabric 1 month ago
Thanks for watching!
@TheOneRichy 29 days ago
At my work we broke orders out into yearly reportable tables using a SQL constraint on an important date. We then query against a view in SQL where all the tables are gathered together again. We use partitioned view functionality to speed up the data returned, because it's smart enough to limit the tables it needs to look at. This is what came to mind regarding aggregation/de-aggregation for me.
@mattroberts9665 1 month ago
Brilliant Will. Another brilliant video. Thank you so much.
@LearnMicrosoftFabric 1 month ago
Thanks Matt! Glad you’re enjoying 🙌
@junpei0berkeley 27 days ago
great content!!
@juanc.alcazar7507 5 days ago
👍
@user-dy8xu7uj8k 26 days ago
Will, I have a SQL Server stored procedure which updates, deletes and merges data into a table. How do I convert the stored procedure to a PySpark job? Is it possible to update a table in Fabric using PySpark? Please make a video on this topic.
@moeeljawad5361 1 month ago
Hi Will, when you talked about bridge tables, was the aim to break the many-to-many relationship that is introduced when a Type 2 SCD is connected to the fact table?
@LearnMicrosoftFabric 1 month ago
Bridging was just the next data modelling concept in the list, not necessarily related to Type 2 SCDs. But yes, in general it can be used anywhere you have a many-to-many (M2M) relationship in your data model 👍
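As a rough illustration of the reply above, a bridge table sits between two tables that would otherwise have a many-to-many relationship and stores one row per valid pairing. The sketch below uses Python's built-in `sqlite3` rather than T-SQL in a Fabric warehouse, and every table and column name (`dim_customer`, `dim_account`, `bridge_customer_account`) is hypothetical, not taken from the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_account  (account_id  INTEGER PRIMARY KEY, label TEXT);

-- The bridge table holds one row per customer/account pairing, so a
-- customer can have many accounts and an account can have many customers.
CREATE TABLE bridge_customer_account (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    account_id  INTEGER REFERENCES dim_account(account_id),
    PRIMARY KEY (customer_id, account_id)
);

INSERT INTO dim_customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO dim_account  VALUES (10, 'Joint'), (11, 'Savings');
INSERT INTO bridge_customer_account VALUES (1, 10), (2, 10), (1, 11);
""")

# Join through the bridge: each side is now a one-to-many relationship.
accounts_per_customer = conn.execute("""
    SELECT c.name, COUNT(*) AS n_accounts
    FROM dim_customer AS c
    JOIN bridge_customer_account AS b ON b.customer_id = c.customer_id
    JOIN dim_account AS a ON a.account_id = b.account_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

The same shape (two dimension tables plus a pairing table with a composite key) carries over to T-SQL, though the exact DDL supported in a Fabric warehouse may differ.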
@moeeljawad5361 1 month ago
Hello Will, it's me again :D. In the step where you were dropping duplicates, you wrote deduped = df.dropDuplicates(). It is not clear how Spark knew that it needs to drop the duplicates on the combination of columns ['Branch_ID', 'Date_ID']. Is there a missing step?
@LearnMicrosoftFabric 1 month ago
Yes, dropDuplicates() also has the subset parameter, if you want to check for duplicates only within certain columns. In this example, I wanted to remove the row only if every value was the same, so there was no need to pass in the subset parameter 👍
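The difference between the two modes can be sketched in plain Python. The `drop_duplicates` helper below is a hypothetical stand-in that mimics the semantics, not Spark code; in PySpark the corresponding calls would be `df.dropDuplicates()` and `df.dropDuplicates(["Branch_ID", "Date_ID"])`. The sample rows are made up.

```python
def drop_duplicates(rows, subset=None):
    """Keep the first row for each key; compare all columns unless `subset` is given."""
    seen = set()
    result = []
    for row in rows:
        columns = subset if subset is not None else sorted(row)
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


rows = [
    {"Branch_ID": 1, "Date_ID": 20240101, "Sales": 10},
    {"Branch_ID": 1, "Date_ID": 20240101, "Sales": 10},  # identical in every column
    {"Branch_ID": 1, "Date_ID": 20240101, "Sales": 12},  # same key, different Sales
]

# No subset: only the fully identical row is removed.
full_row = drop_duplicates(rows)

# With a subset: rows sharing Branch_ID and Date_ID collapse to one.
by_key = drop_duplicates(rows, subset=["Branch_ID", "Date_ID"])
```

One caveat: this helper always keeps the first occurrence, whereas Spark does not guarantee which of the matching rows survives when a subset is used.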
@nguyenminhthu7064 1 month ago
Can you make a tutorial video about Type 1 and Type 2 slowly changing dimensions?
@LearnMicrosoftFabric 1 month ago
Yes, I would like to go into more detail on SCDs in the future!
@carlosnavia1361 1 month ago
@LearnMicrosoftFabric 1 month ago
Thanks for watching Carlos!!
@gopaiahswamyvysetti3980 1 month ago
In the 5th question, don't we need the "isCurrent" flag to categorize it as a type 2 dimension?
@LearnMicrosoftFabric 1 month ago
It's more 'optional': it can also be calculated from the dates, if need be.
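To make the point concrete, here is a minimal sketch of deriving an "isCurrent" value from the validity dates of a Type 2 dimension instead of storing it as a column. The rows, the column names (`valid_from`, `valid_to`, `branch_key`), and the convention that an open row has `valid_to` set to `None` are assumptions for illustration; the practice question in the video may use different names.

```python
from datetime import date

# Hypothetical Type 2 dimension: each change to a branch adds a new row with
# its own validity window. valid_to is None while the row is still open.
dim_branch = [
    {"branch_key": 1, "Branch_ID": 7, "city": "Leeds",
     "valid_from": date(2022, 1, 1), "valid_to": date(2023, 6, 30)},
    {"branch_key": 2, "Branch_ID": 7, "city": "London",
     "valid_from": date(2023, 7, 1), "valid_to": None},
]


def is_current(row, as_of):
    """A row is current if `as_of` falls inside its validity window."""
    started = row["valid_from"] <= as_of
    not_ended = row["valid_to"] is None or as_of <= row["valid_to"]
    return started and not_ended


as_of = date(2024, 1, 1)
current_rows = [r for r in dim_branch if is_current(r, as_of)]
```

A stored flag is mostly a convenience for filtering; the date window already carries the same information and also allows point-in-time lookups for past dates.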
@drisselfigha3547 1 month ago
You speak very, very fast!!!
@LearnMicrosoftFabric 1 month ago
Sorry about that, feel free to use the Playback Speed to slow it down 👍
@Lonely.Planet. 1 month ago
Will speaks at a perfect pace, in super clear British English, and his video editing is amazing. You can always reduce the playback speed, as Will suggested.
@bloom6874 22 days ago
You can use the custom option under Playback speed in the KZfaq Player. This lets you adjust the pace to your comfort.