No video

Fact & Dimension tables in Power BI | Data Modelling

  Рет қаралды 70,917

The Power BI Guy

The Power BI Guy

Күн бұрын

Пікірлер: 121
@MD-cu6wq
@MD-cu6wq 2 жыл бұрын
This is the best demo - nobody explains that the original values need to be removed from the master data table. This is not obvious to newbies. I am working with non financial data so need to add my own measures. Thank you!!!
@Oli9995
@Oli9995 9 ай бұрын
Omg yes. I've been looking for this. How to convert denormalised table into facts and dimensions.
@oscarmacho2798
@oscarmacho2798 7 ай бұрын
You always need to add your measures
@RedCouchChess
@RedCouchChess 4 ай бұрын
I hope your day is filled with laughter and joy. Thank you for the service you provide to us unqualified data analysts :)
@MegaAbgt
@MegaAbgt Жыл бұрын
What a fast and elegant tutorial. I try to find a power bi job but very hard to find one as a newbie.
@saru8773
@saru8773 Жыл бұрын
Same
@houseandhobby6971
@houseandhobby6971 Жыл бұрын
New terminology for classical db concepts. I'm so old. We just covered the concept of normalizing a database in this video. The original term for dim table is parent table. For fact table, child table was the original term, but a specific type of child table that stores observations rather than child entities (in turn pointing to parent entities). Foreign key term still lives on. I would have used Excel (old school, manual, slower). Dim tables are about entities. Fact tables about data observations of the entities. Good normalization results in a hierarchy of dim tables (entities) thanks to common attributes across entities. Nice vid, just chatting some here. Thx
@raideryvs5595
@raideryvs5595 7 ай бұрын
In a star schema, the dimensions generally undergo denormalization. For instance, consider an employee dimension where employees are associated with a department. In a normalized data model, distinct tables for employees and departments exist, connected through a foreign key relationship. However, in a pure star schema, only the employee table is present, and department data is duplicated for each employee. This optimization enhances data retrieval speed, and the additional storage overhead is deemed acceptable, especially when contrasted with the space consumed by the fact table.
@krishnakantapanda6835
@krishnakantapanda6835 8 ай бұрын
That was super super super helpful.... finally I could confidently say that I have clearly understood fact vs dim, foreign vs primary key and star schema concept❤
@alimirazimzadeh544
@alimirazimzadeh544 Жыл бұрын
I have seen over 100 videos, and this is my favorite
@amid7757
@amid7757 Жыл бұрын
The best demo I've ever listenned. Thank you so muchh! God bless ya!
@kristinajensen711
@kristinajensen711 Жыл бұрын
Thank you, and just for this video but for the whole channel. I stumbled upon it yesterday (looking for non-pretensions explanation of variables in DAX measures) and I am a fan! I watched all your videos, some more than once - and you brought back some joy , enthusiasm and HOPE that I could understand and learn Power BI. Thank you a million again and please keep up the fantastic job!
@cparker4486
@cparker4486 Жыл бұрын
Good explanation and instruction on how to carry this out. Two things: (1) Why is it important to do this? How does it improve performance? It appears the file still has to load the same amount of data. (2) Was it important to move the shipping method into a separate table because it was only one column?
@user-re5uv2mm2h
@user-re5uv2mm2h Жыл бұрын
Same question. Hoping he would response. Our team is migrating reports from Tableau to PowerBi. When I create my data source in Tableau, I usually create one table before putting it in Tableau, but now in PowerBI they are telling me to separate the columns first. Is it better to separate the facts and dimension than create the visualization on one table? Would that greatly affect the performance? I imagine I will just need to do that if I will be using two different data source though. Please enlighten us.
@Transformita
@Transformita Жыл бұрын
@@user-re5uv2mm2h Sorry for the delayed response. In summary, you're massively reducing the size of the dataset by splitting your dimensions. Let me explain. Lets say you have one wide table with 1 million rows with each repersenting a sale made. You have the following columns product name | sub-category | location |store name etc. The product name, location etc. is going to be repeated 1 million times as strings (text). .In general with relationtional databases, int type (numbers) are far more compressed then strings. e.g. The product "phone" was sold and appears 300,000 times in your sales table when it's one table. By splitting your product detail into it's own table, you have 3 rows, each repersenting your 3 products - "Phone", "Laptop", "Tablet" and an ID for each. After this has been modelled, your fact table now has the number "1" appear 300,000 times not the word "phone". From a performance perspective this is significantly more compressed Hope that helps.
@user-sl4ux8hi3d
@user-sl4ux8hi3d Жыл бұрын
Thank you Power BI Guy, this video is helping more than you can imagine :)
@fahadamir6349
@fahadamir6349 5 ай бұрын
superb brother, MashaAllah you did a great job its really helpful
@asaravana6701
@asaravana6701 5 ай бұрын
Very insightful information sir thank you now i am getting some clarity on this concepts
@Tonezatl
@Tonezatl 2 жыл бұрын
Excellent job, you just gave me the best lecture and gift of the month...thanks a lot bro
@Transformita
@Transformita 2 жыл бұрын
You're most welcome! Glad it was of benefit. I've released similar videos and topics, hopefully they help. Don't forget to like and subscribe. It's appreciated! 🙂
@JonathanHansen-inmk
@JonathanHansen-inmk 2 ай бұрын
Great demo. I really appreciated your clear explanations of how to create dim tables. I did have a question about grouping by zip and city though. If you remove duplicates on postal code you could potentially lose some cities since there are some postal codes with multiple cities in them. Can you remove duplicates bas on city and postal code?
@nidhisingh4973
@nidhisingh4973 4 ай бұрын
Thank you so much this video. Got a good understanding of building fact and dimension tables from raw data. Just want to know, do we create dimension and fact tables directly in power bi in real world project?
@ishneetsinghcheema6727
@ishneetsinghcheema6727 6 ай бұрын
You are the mannn!! Thanks so much. Wish you all the best for futures releases.
@Transformita
@Transformita 5 ай бұрын
Thank you. Glad it helped!
@bchi49
@bchi49 2 жыл бұрын
Thank you. Just was what I was looking for. I have one to many table relationships, I can do the same on the many side.
@Transformita
@Transformita 2 жыл бұрын
Glad it was helpful! If you like my content, consider subscribing 🙏, I'll have more videos on this topic that go into bridging tables etc. for modelling.
@akthar3
@akthar3 2 жыл бұрын
its a great video showing how the the context data is moved to a dimension table where a key is created in the dimension table and added to the fact table as a foreign key. A question of concern , should i be doing that to flattened sql server table (the main raw data source) with over 10 million rows ? - this is how the data comes to us from another team - 1 single flat table s that needs to split into several fact tables and dimensions. As I plan to do all etl in the sql server , should i materialise the fact and dimension tables (create physical static tables ) or just create views ? Should i be creating the key in the dimension table or view and then updating it or joining it to the fact table ? The processing may become a little too much for the Azure sql server capacity ? I hope my question is not too confusing . Thanks you
@sidharthmandal9957
@sidharthmandal9957 2 жыл бұрын
Hey man, really loved this tutorial. Learned a lot. Can i have the data set plz? I want to have handson on it.
@donharrold1375
@donharrold1375 16 күн бұрын
Clever approach
@JackWebb-pd3dy
@JackWebb-pd3dy Жыл бұрын
Thankyou for the video, really well explained and great demonstration of the model in effect. At the end you mention a custom date table but then video cuts out - do you have one explaining this or a video you'd recommend?
@mikesmith-nh1jw
@mikesmith-nh1jw 4 ай бұрын
Great video just one question. Wouldnt it be better to 'reference' the source data or cleaned up copy of the source data rather than 'duplicate' as then you wouldnt have to remove the applied steps later ?
@kalliduspre-sales1799
@kalliduspre-sales1799 5 ай бұрын
Hey, great video Power BI Guy. Right at the end you say you need to create a custom date table. Do you have a video which covers this and explains why you need that?
@Transformita
@Transformita 5 ай бұрын
Not yet!, I’ll make one soon 👍🏼
@solomonidowu8284
@solomonidowu8284 2 жыл бұрын
Hi. You did a wonderful job here. Thank a lot.
@ryanjohnston9642
@ryanjohnston9642 2 ай бұрын
Very interesting data model! I'm new to PowerBI and have a follow-up question... When you add the index column to a DIM query, if the source data changes such that a new ZIP Code is added (or a new client), will these index columns not change when you refresh the source data for the query? Will these index values not change and perform as relative-references? I'm concerned that the data will get mixed around when the DIM table merges into the FACT table mid-query. Or am I overthinking this? Will these relative references update safely so that everything merges correctly? Or maybe the issue is that everybody using the fact table needs to understand that Client ID# is a relative reference that can't be relied upon to stay the same from one cycle of reporting to the next.
@kartikeyatiwari6678
@kartikeyatiwari6678 3 ай бұрын
Awesome 🎉
@nathanmutala2941
@nathanmutala2941 2 жыл бұрын
merci, ce video m'a beaucoup aidé.
@zakariaelaaliji7747
@zakariaelaaliji7747 8 ай бұрын
Thank you, but what if we want to bring data from multiple sources and formats ??1!
@dixitmca
@dixitmca Жыл бұрын
please keep posting wonderful videos
@dixitmca
@dixitmca Жыл бұрын
Best Demo ..thank you.
@felixfranke2838
@felixfranke2838 11 ай бұрын
Excellently explained, thank you. Question: If in my original data, which I refresh, a new customer shows up, or a new shipping method: How will that be related to the correct line in the facts table? We do not have an index for that. Wouldn't it be better to use data which are in the original source to connect the tables?
@Transformita
@Transformita 11 ай бұрын
Thanks. - Because your referencing tables, PBI is dynamic and the new ID's will get populated and reflect in the fact table when it's refreshed. It's not static. - The example in this video is purely to show the concept of a fact and dim table. In the real world these ID'would already be at source where they should be and without a doubt always use those.
@tlee7028
@tlee7028 2 жыл бұрын
Thank you! This is what I was looking for !
@PersonalityDev-T
@PersonalityDev-T Жыл бұрын
Thanks much . very good explanation
@bensix6496
@bensix6496 Жыл бұрын
Very helpful thanks - question though by creating the dimension table on the fly, when I have a new country if I don’t refresh my DimCountry first the new Country Index doesn’t flow back in the source data. Is there a way to sequence or am I missing something? Thanks in advance
@felixfranke2838
@felixfranke2838 11 ай бұрын
That is exactly my question as well...
@cumeh821
@cumeh821 2 жыл бұрын
Thank you for this video. So helpful.
@busta0079
@busta0079 3 ай бұрын
Thank you. Can you explain relationships in football data please
@Renotalks
@Renotalks Жыл бұрын
Thanks man appreciate your efforts
@user-sp4sf1rd1c
@user-sp4sf1rd1c 3 ай бұрын
If I have two datasets, should I create two star Schema? If yes, how can I manage Dim tables and Facts tables from both datasets?
@hardik7385
@hardik7385 2 жыл бұрын
Amazing crisp and informative
@cecegaming1119
@cecegaming1119 5 ай бұрын
Thank you for this demo, but I have a question. When I create a fact table, it causes a lot of duplicates, and I cannot load it into Power BI. It's taking a lot of time. I've tried removing duplicates in the dimension, but the issue persists. What could be the solution? Thank you.
@souzaeq
@souzaeq Жыл бұрын
Question: For the next step, I create a flat custom date for the dates interval I have and use it as DimTable (date, year, quarter, month, day) linking Dimdate to wich date in orders, shipping or order date? I assume Order that comes first, right? These would be the next step you said? Really nice video, thanks for sharing.
@Transformita
@Transformita Жыл бұрын
Are you referring to a custom date table and a relationship between the two. You can create many relationships between the dates. But only one will be active and the others inactive. Look into active and inactive relationships.
@souzaeq
@souzaeq Жыл бұрын
@@Transformita many thanks! Do you have videos on that?
@noneB974
@noneB974 3 ай бұрын
How would you achieve same results for a model with 2 fact tables where you have to make a common dimension from both to use that common dimension for slicing?
@fmufafa
@fmufafa Жыл бұрын
You are just the best.
@subhashisnayak819
@subhashisnayak819 2 жыл бұрын
Amazing you are.. i subscribed for more videos
@akkialekhya7638
@akkialekhya7638 2 жыл бұрын
This is so helpful to me.tq u
@gagansingh3481
@gagansingh3481 Жыл бұрын
Could you please make a video on Market Basket Analysis....
@SSRealEstates83
@SSRealEstates83 2 жыл бұрын
I have a question here,when we duplicate,remove other columns, creating inder...then after when we merge...the number of rows will get increase or not???please answer.... Thank you
@dakota3436
@dakota3436 2 жыл бұрын
Awesome, thank you! I have one question. If you were to update the data from your original source file by copy/pasting over an older export or changing the source file, all of the dims you created would still function correct?
@Transformita
@Transformita 2 жыл бұрын
Yes, as long as the column headers/structure is the same. The actual values in the cells don't matter. Think of Power BI as columns and Excel as cells.
@charanmarc5376
@charanmarc5376 4 ай бұрын
Thanks Man! Appreciate it !!!
@novandc9223
@novandc9223 Жыл бұрын
thankyou good explanation
@SDDEditzPro
@SDDEditzPro 2 жыл бұрын
We can use reference option instead of duplicate for creating dimension tables. Right?
@karmasakal9752
@karmasakal9752 7 ай бұрын
Is it normal that my row entry number has increased by adding up relationships ? My dataset does not allow me to proceed exactly the way you did. I created unique IDs for all my entries in the dataset and merged the unique IDs with the duplicated unique IDs. Is it possible that the way I did it does not make sense logically and somehow increased my row number ? My data does not allow me to remove duplicates for a certain characteristic or I am mistaken idk. Thanks for the support guys I would be glad to explain my problem if there is some interest on solving the issue.
@awosusiomowaare8696
@awosusiomowaare8696 10 ай бұрын
would the row_ID be in the dimension or in the fact table?
@user-of9ku1qw3s
@user-of9ku1qw3s Жыл бұрын
Brilliant!
@derbysutter2144
@derbysutter2144 10 ай бұрын
Hey, I have a pressing question. Can this method be used with multiple fact tables? Or just with 1?
@Transformita
@Transformita 10 ай бұрын
yes it can be. This is the ideal, but in the real world you can get multiple fact tables. Guyinthecube have done a vid on this -kzfaq.info/get/bejne/itSphdaAmbSugJc.html
@ztymedina
@ztymedina 10 ай бұрын
@@Transformita Hi can you show details on how to do it? because i've seen the video but it does not show steps by steps.. let say if i have 4 different fact tables, then how i'm supposed to do? Do i need to consolidate all fact tables first, then delete duplicate value to create a dimension table? or how is it exactly?
@rasmusbendtsen5952
@rasmusbendtsen5952 2 жыл бұрын
Thank you for a great video. I have a question. When creating the costumerID and removeing the duplicates, how would you handle it, if two people from different costumer segments, happen to have the same name?
@raghavverma120
@raghavverma120 2 жыл бұрын
In the real word scenario.. already there would be a customer table with unique customer id/ email id/ username.. Since it was a flatfile, so it was implied that customers cannot have same name.. else instead of customer name, username would be there
@raghavverma120
@raghavverma120 2 жыл бұрын
Else.. just dont delete duplicates.. just use a key to join tables
@Transformita
@Transformita Жыл бұрын
@@raghavverma120 Exactly this.
@ericaleverson9430
@ericaleverson9430 2 жыл бұрын
Wouldn't we run into issues using the customer name as a unique column when different customers might have the same name?
@Transformita
@Transformita 2 жыл бұрын
Yes you would, you would need some sort of unique identifier in that case. Real world databases have identifiers/keys for customers etc.
@HoaNguyen-ji5fz
@HoaNguyen-ji5fz 2 жыл бұрын
I love it. Thanks
@vishalmane3139
@vishalmane3139 Жыл бұрын
Dude if u can have a vfieo on normalize the table that will be good too upto 3nf.. alot of youtube vdieos ending up confusing more.
@shimamoayeri8778
@shimamoayeri8778 2 жыл бұрын
verrrry good. thanks
@DmytriiKutsiianov
@DmytriiKutsiianov Ай бұрын
I would really appreciate if someone could help me fix the following issue: When I click the "Close&Apply" button in Power Query Editor after adding the dimnsion tables, all the data gets loaded into Power BI with no errors. However, when I look at my dimension tables in data view in Power BI the order of the ROWS is changed. For example, in Power Query Editor, index 1 in the SensorName_Dim table corresponded to "Engine Speed" and index 2 corresponded to "Coolant Temp". After loading data into power BI the order of the sensor names got messed up so that Engine Speed no longer corresponded to index 1 and Coolant Temp no longer corresponded to index 2 . Does someone know how to fix this issue and preserve the order of rows when loading data into Power BI?
@SSRealEstates83
@SSRealEstates83 2 жыл бұрын
Hi ...After duplicating the main table is it mandatory to remove the steps till end from applied steps.what will happen if we won't delete that steps from the applied steps .thank you
@Transformita
@Transformita 2 жыл бұрын
I suggest doing so to improve refresh times. Each applied step has to be computed therefore by not removing it, you are proccessing far more data in your transformations
@jamesadam7113
@jamesadam7113 2 жыл бұрын
Thank you for this! I've watched other videos and they've not shown how to create a key / index column and I've been using the oroginal columns as a key. Is there any benefit to using a index column over the original columns with relationships.
@Transformita
@Transformita 2 жыл бұрын
- It's faster. A JOIN on an integer is much quicker than a JOIN on a string field or combination of fields. It's more efficient to compare integers than strings. 2 - It's simpler. It's much easier to map relations based on a single numeric field than on a combination of other fields of varying data types. 3 - It's data-independent. If you match on the ID you don't need to worry about the relation changing. If you match on a name, what do you do if their name changes (i.e. marriage)? If you match on an address, what if someone moves? 4 - It's more efficient If you cluster on an (auto incrementing) int field, you reduce fragmentation and reduce overall size of the data set. This also simplifies indexes needed to cover your relations.
@vishalmane3139
@vishalmane3139 Жыл бұрын
Whats the difference between custom column and calculated column?
@Transformita
@Transformita Жыл бұрын
Custom column is in Power Query, calculated column in in the normal desktop view. Functionality wise they are the same, the former using M in Power Query and DAX in Desktop
@saadfayyaz894
@saadfayyaz894 Жыл бұрын
Love from Pakistan😊
@akthar3
@akthar3 2 жыл бұрын
if your facts are descriptive comments (eg 500 characters) along with numerical values and measures associated to the comments ..do you keep them both in the same fact table ?
@S3Kglitches
@S3Kglitches Жыл бұрын
best example
@mar25947
@mar25947 3 ай бұрын
Short n sweet! not 2 hrs long lol
@souzaeq
@souzaeq Жыл бұрын
I have a fact that is built this way: each event takes a different quantity of days. Like event 1 takes 6 days, so the sale is for the interval of six days. Do I need to break that into daily revenue and cost by dividing the total amount by 6 and then transforming a daily and revenue? In case I need to divide the value to split among the days, wich time dax or transformation in power query should I use like to divide the total revenue by the number of days and then create a line for each recurrence. Like: USD 100/ 4 days, is USD 25/ day, how can I automatically generate starting from current information as 4/jan/2023 (end date) with 100 USD to 1/jan/2023 with USD 25, 2/jan/2023 with USD 25, 3/jan/2023 with USD 25 and 4/jan/2023 with USD 25 ?
@oluwatuyiayomide5385
@oluwatuyiayomide5385 Жыл бұрын
Some of the data in the fact table already have unique ID, do we need to start creating unique ID again with index number
@Transformita
@Transformita Жыл бұрын
No you would not need to re-create them. I did so as an example for this video. Relational databases will have these already
@oluwatuyiayomide5385
@oluwatuyiayomide5385 Жыл бұрын
@@Transformita Thank you
@cesarbarreradeleon4323
@cesarbarreradeleon4323 10 ай бұрын
where can i get the data simple ?
@raghavverma120
@raghavverma120 2 жыл бұрын
Can you please share this flat file
@ankurarora7632
@ankurarora7632 Жыл бұрын
Please share the dataset
@vishalmane3139
@vishalmane3139 Жыл бұрын
creating an index column was necessary ?
@Transformita
@Transformita Жыл бұрын
Yes it’s necessary/recommended. Real world databases will have these and the correct term would be primary, foreign, surrogate keys
@vishalmane3139
@vishalmane3139 Жыл бұрын
@@Transformita dont u normalize the tables ?
@eddarelmbuluyo8702
@eddarelmbuluyo8702 2 жыл бұрын
Great Sir. it's possible to have the file, the dataset? thanks
@sidharthmandal9957
@sidharthmandal9957 2 жыл бұрын
Did you get the data set? If yes, can you also send me ?
@pravarshakollu7646
@pravarshakollu7646 Жыл бұрын
can we create dimensions tables and fact table in power bi?
@Transformita
@Transformita Жыл бұрын
That's what I'm doing in the video. General rule of thumb - Do transformations as close to source/upstream as possible.
@pravarshakollu7646
@pravarshakollu7646 Жыл бұрын
@@Transformita how to identify fact tables and dimensions in our given data
@vladx3539
@vladx3539 Жыл бұрын
what about the date?
@adeelharal7106
@adeelharal7106 2 жыл бұрын
Please share dataset file
@richardpearcephotography599
@richardpearcephotography599 Жыл бұрын
What happens if you have two or more unrelated customers with the same name? John Smith vs John Smith.
@Transformita
@Transformita Жыл бұрын
In a real world scenario you tend to have the data structured from a source system/db that has a unique identifier already. When the two accounts sign up to x organisation, they would get an accountid for instance.
@vishalmane3139
@vishalmane3139 Жыл бұрын
what if we have multiple tables
@Transformita
@Transformita Жыл бұрын
Hey, I'm not sure I'm understanding your question. There's multiple tables in this example.
@vishalmane3139
@vishalmane3139 Жыл бұрын
@@Transformita im not able to break a dataset in dim and fact tables. Ive inported many dataset instead of one
@user-re5uv2mm2h
@user-re5uv2mm2h Жыл бұрын
Why didn't you create a DimDate?
@Transformita
@Transformita Жыл бұрын
I forgot in this video if I'm honest 😅 you should always have a seperate date table in Power BI
@leandro0505
@leandro0505 2 ай бұрын
Where can i get the example file?
@s.ifreecoachinginstitute0077
@s.ifreecoachinginstitute0077 8 ай бұрын
Good
@TheCantonese
@TheCantonese Жыл бұрын
[16:02] Oscar Wong Best practice is 1 to * (dimension table to fact table) this is 100% sure the best way BUT question it's possible to have * to * or * to 1 (between dimension table and the fact table) in a BI star or snowflake model?
@nathanmutala2941
@nathanmutala2941 2 жыл бұрын
merci, ce video m'a beaucoup aidé.
@nathanmutala2941
@nathanmutala2941 2 жыл бұрын
merci, ce video m'a beaucoup aidé.
How to Build Power BI Reports from Start to Finish
41:07
MyOnlineTrainingHub
Рет қаралды 1,4 МЛН
Data Modeling Tutorial: Star Schema (aka Kimball Approach)
16:34
Kahan Data Solutions
Рет қаралды 108 М.
The Joker kisses Harley Quinn underwater!#Harley Quinn #joker
00:49
Harley Quinn with the Joker
Рет қаралды 22 МЛН
Violet Beauregarde Doll🫐
00:58
PIRANKA
Рет қаралды 40 МЛН
Handling MULTIPLE fact tables in Power BI
9:02
Guy in a Cube
Рет қаралды 307 М.
Data Modelling: From Single Table To Star Schema - Chris Barber
39:11
London Business Analytics Group
Рет қаралды 35 М.
The Dimensional Dilemma and Power BI
19:35
Paul Turley
Рет қаралды 6 М.
5 TIPS for Power BI that EVERY developer SHOULD know
9:57
The Power BI Guy
Рет қаралды 3,7 М.
Dimension & Fact Tables
31:17
Kamala - MSTR Videos D
Рет қаралды 42 М.
Power BI Masterclass: Data Modeling 101
56:54
CRG Quisitive
Рет қаралды 55 М.
How to Build a STUNNING Sales Dashboard in Power BI - Tutorial #2022
20:19