Building Relationships
Like these posts but need more formal (but still engaging and fun) training in Power BI? Contact me, Joe Travers, at Travers Data or at joe@traversdata.com.
One of Power BI’s great strengths is that we can easily create visuals and entire dashboards using data from multiple data sources. As long as the data has something in common (ideally a unique identifier - we’ll talk about that in a minute), we can link data sources together and make some magic.
We’re going to use a new blank Power BI report to show how to build a basic relationship. The data we’ll be collecting to an Excel file, and if you want your own copy of it, here it is:
Here’s the video post, followed by a written version.
Video:
Written Instructions:
We have a new blank Power BI report, and we’re going to connect some data to it and see how Power BI builds relationships.
First, let’s connect to our data source. It’s the same spreadsheet linked above the video above. Click “Import data from Excel” on the canvas and navigate to where the spreadsheet is, select it, and click Open.
When the Navigator window opens, you’ll see that our spreadsheet has 2 tabs of data. We have both of them here in the same spreadsheet for ease, but they could be in different sheets that we’d connect to separately.
If we check our Community tab, we’ll see information about Communities in the City of Chicago. The city is split into 77 communities, and this tab has information about each one. Their community number, community name, and info about their population, area, and population density.
The other tab is our “Crimes” tab, which has Chicago Crime data (about crimes classified as Assault or Battery). It has information about the date of the crime, location of the crime, and what community area it happened it.
We’re going to make sure BOTH of these tabs are checked in our Navigator window and click the LOAD button.
We can see our two data sources (each tab) in our Data Pane at the far right of Power BI Desktop now:
Let’s click into the Model View of Power BI to see what our data model looks like with these two sources.
You’ll see something that looks like this, showing the 2 data tables, with their columns:
Now, if you google Power BI data modelling or building relationships, you’ll probably come across a lot of webpages and videos that use a lot of jargon when trying to explain relationships. They’ll have things called Fact Tables and Dimension Tables and they make everything more complicated than it has to be.
What you need to remember is that usually (but not always), if you have two data sources, one source usually has most of the information about what your data is about (Chicago Crimes, in this case) and the other has more information about one part or facet of our information.
With me so far? Ok, good. :)
In our case, our Crimes table is our “Fact” table (to use data model terminology).. it’s where all the data about our crimes is.
Our “Community” table is what the data geeks call a “Dimension” table. It gives an extra “dimension” to our Communities. It has information *about* each community. Population, area, population density. It’s NOT crime data. It’s data about each community.
Crime data in the “Fact” table. Extra info in the “Dimension” table.
You don’t have to call them “Fact” and “Dimension” though. You call them whatever you want… just be clear on what they have in them. :)
Now let’s build a relationship between them, so if we want to make some kind of visual that uses some of our Crime data with some of our Community data, we can totally do that.
Both these tables have one column in common, even though it’s called different things in each.
It’s the Community number field. In the Community table it’s named “Community Number”.
In the Crimes table it’s called “Community Area”.
They both have values between 1 and 77 for each of the 77 communities in Chicago.
We’re going to create a link between these two columns of community numbers in our Model View by clicking one of these fields and dragging it on top of the other one.
When you do this, you’ll get a line showing up connecting these two fields.
The design of this line has meaning too.
You’ll notice that one end of the line has a “1” and the other has an asterisk (*). The 1 means 1 (duh), and the * means “many”. This refers to how many times a Community (in this case) is in each table.
Each community is only listed once in the Community table (there’s one row per community), so that’s the “1” side of this relationship. Each community can show up multiple (or many) times in our Crime data, so that’s the “many” side of this relationship.
This is also called “Cardinality”, which has it’s own little nuances and quirks, but we’ll cover those in another post. You generally don’t have to worry too much about it unless you start working with very complicated data models (or you’re trying to hack Power BI to do some out-of-the-box things).
Finally, the little arrow in the middle of the line shows us a “filter direction”. In our case, the arrow is pointing from Community to Crimes, which means we can set up a filter or slicer in our report using a filed in the Community table, and it will filter all the Crimes data. We can’t turn it around, but we can make this arrow bi-directional (again, a bit more complicated that we’ll cover in another post … and we’ll link that other post here when it’s made).
Okay, let’s make a quick visual just to see how this relationship works in practice.
We’ll go back to our canvas and click our Table visual.
First, we’ll drag our Community Name field from our Community table to our Table, along with our Population column. Then we’ll drag our ID field in our Crimes table to our table, where it counts up the crime ID numbers for each Community.
We’ve now made a visual with data from both tables!
This is even more impressive when you realize that our Crimes data has no Community Names in it. Because we’ve made a relationship between our fields with the Community numbers, Power BI knows WHICH data in our Crimes data is for the Albany Park community, or for Archer Heights, or any other community.
That’s basic relationships in Power BI. Let me know in the comments below if you have relationship questions (well, Power BI relationships… )
Take care everyone,
Joe.
Post Tags: Relationships | View all Post Tags
Like these posts but need more formal (but still engaging and fun) training in Power BI? Contact me, Joe Travers, at Travers Data or at joe@traversdata.com.
This post was made with the April 2024 version of Power BI Desktop.