Power BI tutorial for Data Analysis


Chelsea Dohemann: Hello, and welcome to day two of our power BI course today, we're mostly going to be focusing on data modeling data modeling is the process by which we take data from many different sources and combine it into one data model. That's not to say that each piece of data is actually combined with other pieces of data. What it really means is that there is a meaningful link created between the different sets of data. So in this little visual, on the left-hand side, we have all of our different jumbled sorts of, of a different kind of original sets of data. And on the right hand side is a very structured organization of how we're going to, uh, lay this data out. We're going to create a connection between all this data that is going to stem eventually down into one large transactions table, which would be the trunk of the tree in this analogy. And then we'll be able to still get all the information from all these different sources. Um, but we don't have to combine them and smash them all into one table to do that. This feature is generally referred to as data modeling. If you're in Excel, depending on the version of Excel that you're in, it might be referred to as either power pivot or data model. If you're in power BI, then it's going to be referred to as modeling and among the modes on the left-hand side, you'll be able to see the moment. Obvious view of modeling, going to the relationships mode on the left-hand side in par BI, by the end of the day, we're going to take all this data that we have compiled in all these different sources, and we're going to be able to put it all together. So we've already created some reports, but we want to take data from a bunch of additional sources and create even more reports. Once we have all of these reports, then we're going to need to put them together in one comprehensive overview of the company. This comprehensive overview of the company is what we refer to as a dashboard. And you'll see an example on the right hand side of this slide here, a dashboard is an overhead view of all the different metrics and analyses that somebody might use to analyze the company.

Now, when we create this dashboard, though, it might be a little bit difficult to send this dashboard to the people that are really going to need to use it and analyze it. And so what Microsoft has done is created an online service for power BI, which makes it so much easier than sharing a file. Previously, you might expect that a dashboard would need to be shared by actually sending a file to somebody meaning emailing the file. This introduces all kinds of problems. First of all, you worry about duplication of the file. Um, you worry about different people making different modifications, and then you don't actually have the original somebody else has made modifications in there. 60 different versions out there somewhere. So we want to avoid. Secondarily, a lot of these dashboards are actually going to contain massive amounts of data. And so it's not really possible to send the file to somebody. So Microsoft has created an online platform from which you can easily share your dashboard to anybody. You could either send them a dashboard that looks like the one that you see on the upper right hand side. This one is actually being viewed inside the power BI features, the power BI online features in your office online account, or you can easily embed your dashboards inside of a SharePoint page or a regular webpage or something like that. To allow somebody to be able to see it. People can then look at that data on their phones, on their regular computers, different teams of people can see it at the same time. And then of course you yourself can see it as an individual and make modifications whenever necessary our goals today, when it comes to the actual data that we're going to be dealing with. We have a lot of different data that we're going to pull today. Our objectives are going to be to analyze our sales and profits and returns over time, but we don't actually have totaled sales information.

We don't have any profits information and we don't have any returns information in our dataset. What we have is information about the, um, individual pieces of a transaction. So how many products were sold in this transaction? What date did it happen on? Uh, what was that products? Um, original cost, right? So are the cost of the goods sold? How many of the items we sold, um, what our, um, what our standard retail prices for that item. So we have all of that metadata for the items. And from that, we're going to need to derive sales profits and returns, meaning we're then going to need to see this analysis segmented by the different products and the product subcategories and the product categories that we have. So we have a little bit of a hierarchy of meaning there. Um, and we're going to need to segment it by each different layer in the hierarchy. Then of course, we're going to need to visualize this. Now, of course, we could visualize this in a pivot table or in many, many different pivot tables. Um, but we're also going to be utilizing the power of power BI in order to analyze a bunch of different kinds of analyses in the same setting. And then we're going to be able to very easily share this analysis in a consumable dashboard. So that's our agenda. It's usually at this point in the course, when I get questions about what the difference is between querying and data modeling, why not do all of your work in query? Why not do all of your work in data modeling? So we're going to take a second here just to segment the meaning between those two power query, which is what we spent all day yesterday talking about it doesn't actually store any of the data. And that's one of the really cool things about it. It borrows the data source and then provides a temporary view for you so that you can make certain kinds of changes. And then when you actually click on close and load, that's when it actually loads it into a spreadsheet, but power query all by itself, doesn't actually do any, uh, data holding or data handling power pivot.

On the other hand, actually stores the data and massive amounts of it. This is the real data model storage, medium. It compresses the data extremely well and allows for mobility and analysis of that massive data is often compared to SAS S S a S, which is sequel server analysis services. It is a SQL engine for Excel or for power BI wherever you use it. What's interesting is that it operates from the end user side. So you and I can use the power of. Pivot features very easily. Whereas if somebody needed to get information from a SQL server or something like that, they're used to usually a small team of people that are actually given an access to that. So let's get to know this powerful data modeling tool. Yesterday. We got all of our data now that we've gotten the data. Now we want to start building upon that, that querying of all of the data, you do need to be able to normalize that data, which is in the center blue block. There, you do need to normalize your data before it gets into power pivot, because unless it's normalized, you're going to have a really hard time actually modeling that data. But once you get it all normalized and it's all ready, then you can go ahead and put it into power. Pivot power pivot is where we are going to be able to relate the data, create custom analyses and measures on that data. And this is where we really start to determine correlations in outliers between our measures in between our sets of data. When it comes to power pivot, it is one of the bigger portions of learning that you'll need to do in power BI, or if you're going to operate in Excel, this is where you're going to need to spend a lot of your time learning for the future of Excel. Now it used to be that we had a common path of knowledge in Excel. This is still the case. Generally, most people start off by entering data and then performing some basic formulas and functions, creating and using cell references.

Sorting and filtering data, and then they'll start visualizing it and charts. And then they'll create some pivot tables, create some advanced functions, and then they'll start to import data play with some arrays. And then finally we get to the world of macros. This might sound a lot like your pathway. There might be a couple pieces missing, but it's, it's most people's pathways through Excel in much the same vein. There is a common pathway in learning power pivot. So we're learning power pivot, also known as data modeling. And we're going to start off by taking a look at just your regular pivot table. So when we are creating these pivots in Excel, what is the actual just general pivot table look like? And then we're going to need to start to create relationships between our data in order to do some more interesting analysis among our many different sources of data, we'll need to calculate, uh, columns of extra data based on our original data. Uh, so that's going to be how we're going to get sales returns and profits information, which is the three main categories of analysis that we're going to do today. And then we're going to create some simple measures based on that data. So maybe it's something as simple as I just want to get a total of all the profit, or it might be something as interesting as what is the change of this profit over the same period from last year, that is not the type of analysis that a pivot table can do for you. So you will need to get into the world of power pivot or model. Uh, which again is the same thing, depending on the program that you're using, it is called something different, but they are the same thing, power pivot in data modeling. And then we'll want to be able to start to do some much more interesting analysis, including creating complicated relationships, understanding, measure evaluation, and being able to manipulate that knowingly using filter and all to do those manipulations, using some X to basically auto insert row context, where there isn't already row context.

You'll get a lot more familiar with these terms as we go throughout the course, what we're going to be covering today is everything that is in blue here. So we're going to start by looking at some pivots, talk about relationships, create calculated columns to create some extra metadata, create simple measures on those columns. And we're also going to get a nice introduction to the calculate function, which is probably going to be the most important Dax function you'll ever learn. Now that we know how to grab our raw data, put it into power query, and we're starting to talk about power pivot. Let's take a look at the normal data pathway through power BI. So we start with raw data and then we pass all of that data through power query in order to normalize it and manage it. And then once we have normalized it, we can now land it in power pivot, where we can do all of the interesting data modeling after we have done that. Of course, we can actually use power BI to create some of those really nice looking visuals. And that is our. Let's start by opening a completely blank Excel file. The first half of this course, or I guess it's about a third of this scores is going to be all about how to use the data modeling or power pivot features that are within Excel. And then of course, we're going to move over to power BI and see it act exactly the same over in power BI. So in order to get started, we're going to need to actually see power pivot, but by default power pivot, isn't actually turned on in Excel 2016. It does already belong here and it's already in the program. I just need to turn it on, on the data tab of the ribbon in Excel, 2016, over on the right hand side, I'll have a button that says manage data model. If I click on this, I'll get a message that says, uh, that asks me whether I would like to enable the data analysis addons to use this feature.

If this is the very first time that I've used it, I'm going to get this message. But once I click enable, I shouldn't get that message anymore. You'll see a window pop up and it will look a lot like our power query window. It's a mostly gray window and we only have about four tabs of the ribbon up here. This is power pivot. And you'll see that even though 2016 refers to it as managed data model, it actually also refers to it as power pivot. So power pivot is what it's name is once you actually open the feature, but the tool itself. In Excel, the button says manage data model. So keep in mind, data modeling and power pivot are the same thing. I'm going to hold them. Side-by-sides that you can see the two different windows here on the left-hand side. I have power pivot. And on the right hand side, I have regular Excel. Now I did this through Excel 2016. So it's good to know if you have a previous version of Excel, exactly how you might open this. So just in case you do, I'll run through a quick little walkthrough of how we would manage that in Excel 2013, you do have it, but you're going to need to go turn it on. It is among the many admins that you have for Excel 2013. And so you can go to file an options and select your add ins. Now, once you get to your admin's menu at the bottom, you'll want to go to where it says manage down here. And it says Excel add ins from the get-go, but you want to click on that and choose comm addins and click go from there. You will see a power pivot for Excel 2013, and you should be able to click on that box and then click. Okay. And once you do that, you should see your power pivot tablets. If you have Excel 2010, you can still have it. It just isn't already in your program. Even like it wasn't Excel 2013 where it was back there, but it wasn't turned on in 2010. You're going to need to go download it. So you'll want to go and look for the Microsoft download of power pivot for Excel 2010, either way, once you get it and you turn it on, you will have a new tab of the ribbon that says power pivot.

Now, there are a couple of things that you can do inside your power pivot tab in Excel. First of all, you can click on the button that says manage, and that will pop up your power pivot for Excel window. You can also create measures and KPIs directly from over here. You can take data that already exists in the, in this Excel spreadsheet and actually add it to your data model. You can update your data, detect relationships, and play with a couple of settings there. We're going to spend most of our time in the first hour or so of our session directly within the power pivot window. So I'm going to go ahead and expand this window here now so far, it looks pretty boring and that's because I don't actually have any data in here. I'm going to need to retrieve data from a few different sources, including the great geography file that we worked all day yesterday, creating come on back in the next video and we'll talk about getting data in this course. We're going to be retrieving information from a couple of different sources. We're going to start by retrieving information from our massive source from. It just so turns out that our database is actually an access on the home tab in your power pivot window. You'll see a command group called get external data within that command group. You'll see a command called from database. Go ahead and click on that. Let's just take a moment to, uh, observe the different kinds of databases that we can easily retrieve information from a SQL server from access, or even from another power pivot file. We have all of our data inside access. So please select the option that says from access. And then you're going to need to browse for your file. Find your folder called student files inside there. There's another folder called student files. And inside there, you should see date to work files inside there. We have a file called Contoso data. This is all the data that we're going to be using.

Contoso is a fictional company created by Microsoft. We did talk about this during the introduction, but they sell a lot of retail products. I'm going to go ahead and click on that and click on open and next. Now we have a lot of. And a lot of different tables in that particular access file. So I'm going to want to select from a list of tables to make sure I only get the tables that I need and I don't take the ones I don't need. You could also write a query that will specify the data that you would like to import. If you know how to do that, I'm going to start with the very first option here and go ahead and click next. These are all the different tables that are in our access file. We're going to want to be able to import information about all the different channels that we sell in. Maybe we don't need entities yet, cause we're not really looking at all of our different, uh, entities. We're also gonna import from our fact sales table. Our fact sales table is going to be the big data set. This is the history of all the transactions that we've ever had as a company, without fact sales table on that table, we don't really have very many labels for things. We have IDs in each transaction. So each transaction has the date that it happened. And then it has a product ID, a product category ID, a product subcategory ID, a promotion ID, a channel ID and entity ID, a geographical ID, and probably a store's ID as well. So there are a lot of IDs in there. Now, each one of the IDs actually links out to one of these other tables here. So we're going to start by grabbing the fact sales table, but we're going to grab all the different tables of metadata that we might later want to be able to analyze, like by a product information or by a more generalized product category. Or a halfway in between product category and products, we would have a product sub category as well. So we'll take a look at this as well. And then later on, we might want to be able to analyze our promotion.

So we're going to select all these tables except for the table called entity. And before we move on, we're going to want to make sure that we don't take every single little bit of data in every table if we don't need it. For instance, in our product category table, our company has eight product categories because we have recently decided to launch two extra product lines. However, we haven't actually had any sales in those product lines yet because it just got launched. So we're going to want to take that out of our analysis. So it doesn't take up extra space. I'm going to click on product category table and then click on preview and filter. This is going to allow me to actually filter out information that I don't need. And let's say for instance, games and toys and home appliances are the two particular product categories that we're not going to need. So I'm going to go ahead and filter those out games and toys I don't need and home appliances. I don't need either. And I'll go ahead and click. Okay. I should have product categories. 1, 2, 3, 4, 5, and six there. And so long as you have one through six, you can go ahead and click. Okay. If I wanted to do more filtering, you can absolutely go in here and do that, but I'm not going to do that. Let's just move on and actually get to the fun stuff. I'm going to go ahead and click on finish here. And now I can watch it work. As we've said, power pivot is a mass data management program, a SQL engine for Excel, which means that it can handle massive amounts of data. As you know, or as you may know, Excel only actually has 1,048,576 rows in it these days. And that's in modern Excel. But as you can see, this program is having no problem. Importing 2.2 million records from our fact sales table alone. I'll go ahead and click on close. And now we should be able to see some tables here. So we have some information from access. Now. Access is a great database management tool. It also allows us to create relationships between the different tables in access.

If somebody has actually created relationships, you can see those by going into diagram view here in power. Because both of them are Microsoft products. Access does a really good job of translating the relationships between the tables. And so power pivot does a really good job of picking up those relationships. So you'll see that each one of these tables that we just imported also imported each one of the relationships. Okay. That's pretty cool, but let's go back to data view, cause we still have some more data that we need to import because power pivot is a mass data management sister program for Excel. It means that it might crash every so often. If you're in 2016, you probably won't have this happen as much, but in the previous versions, it crashed a little bit more often. It also is going to depend on the processing power of your computer. So I just want to get back to the good old fashioned nineties days and save and save often. So we're going to go ahead and save our file right now. I'm going to go ahead and click on save. And we are going to call this data model. I'm going to save this file as data model and you'll notice that it actually changed the name of the Excel file. So even though it seems like power pivot was a separate program, power pivot is actually part of an Excel file. If I wanted to send this data model to somebody, all they have to do is send the Excel file to somebody and they'll also get all of the power pivot data model. All right, let's get back into. We're going to need to import a little bit more information here. This isn't quite enough. It's pretty good, but we're going to need to also be able to analyze geographical information and we're going to want to analyze some stores information. Now it just so happens that our stores information and geography information isn't in that access file. It's maintained in other types of files. So we'll need to import that in power pivot on the home tab of the ribbon, in your get external data command group, you'll want to find the command called from other sources.

You have quite a few other sources that you can choose from scrolling down. You'll be able to see all different kinds of database sources, multi-dimensional sources, data feeds, and text files. We're going to choose an Excel file because that's where our store's data is. I'll go ahead and click on Excel file and click on next. Now I'm going to want to be really careful here. This is one of the few little, it's not, it's not a bug, but it kind of feels like a bug. Sometimes this box does not check itself. However, most of the data that I contained in Excel does tend to have headers. If I don't check this box and my data does have headers, I'm actually going to have to redo this whole thing from scratch. Now, I am telling you that as of March, 2017, so that could change in any of the updates coming up and hopefully it does, but in the meantime, we're just going to want to be really mindful. We're going to click on browse to go find our file. Here's our file called stores, which should be in student files, student files, day to work files, and you should have stores right there. And we'll go ahead and click open. Now I'm going to go ahead and check on this check box. Cause I know that my data does have headers and I'll click on next. That took no time at all. I'll click on, finish and close. Now finally, we're going to do one more exercise in Excel. We just imported data that was put in by either a data dump system or an actual human is in charge of maintaining that particular file, which means that the whole thing is data entry. However, we have another kind of an Excel file that we created yesterday. A file that was entirely created through query. Let's go ahead and get that file as well. We're going to do the exact same starting process. We're going to choose from other sources and we're going to find our Excel file.

Click next. You want to browse for your file called geography. Now there is one in day two work files. I would prefer that you choose this file called geography in day two work files because that just ensures that everything is going to be perfect. Go ahead and click on. Make sure to click on the box that says use first row as column headers, because we do have headers in our data and we'll click on next. Now this file has a lot of tables. This file has a lot of tables because there were a lot of queries issued and the original creator, when they created a lot of these queries, they did load them to the spreadsheet. And so you'll see a lot of data. So if you are doing this, like we did yesterday, you'll want to make sure to be very explicit about where, what your completed file looks like here. You'll see, there is a sheet called completed. I made that as a note to us to know that that was the finalized data. So I'm going to go ahead and check on that. And only that I don't need all of the data that contributed to it because I do have a finalized completed sheet and I'll click on finish and click on close now as griefs. This is that sheet called completed six months from now. I'm not going to know what completed meant. I'm going to look at this and be like what's completed. Right? So the first thing I want to do is make sure to rename all of my tables here so that they explicitly refer to the type of data. You'll find that it's really common that when you import data from Excel, unless you went in and actually named a table in Excel, That data won't always come in to power pivot with a very explicit name. So you want to get into the habit here of renaming all of your table tabs to make sure that they explicitly referred to the data contained inside. We now have over 2 million records that are stored in our power pivot data model. So of course, we're going to want to make sure to save again, remember save and save often. And then how about we actually start taking a look at some of this data that we've imported.

One of the many benefits of pulling all this different information into power pivot is that we can see information from many different tables in many different sources in one pivot table. Historically you could only create a pivot table from one original table of data. And that table of data had to be located in Excel. However, with power pivot, now we can create a power pivot from our data model. So depending on the version of Excel that you're in, you can actually initiate a pivot table from a couple of different ways, and we're going to want to do that right now. If you are in Excel 20 10, 20 13, or 2016, you can issue a brand new pivot table using the pivot table command inside of power pivot. This is the most obvious and straightforward way to create a power pivot table and a power pivot table is the only pivot table that can analyze data from your data model brand new in 2016, we also have a feature where we can create pivot tables from Excel that actually use our data model. So in Excel, on our insert tab of the ribbon on the left-hand side, normally you'll see a pivot table button I've made my screen a little bit narrow. So that's why I have a tables dropdown instead, but right there is pivot table. And when I click on that, there's this brand new addition that says use this workbooks data model. So if you're an Excel 2016, you can just use this method to create a power pivot table as well. Now I can say that I want it to be in a new worksheet or an existing worksheet. Let's say I know exactly where I want it to go. And that's right here in cell B2. I'll go ahead and click. Okay. And now we can play with our very first pivot table power pivot tables. Look a little bit different in your field list. On the right hand side, instead of seeing a list of fields there, you'll see a list of tables and inside each table, if you click on the little triangle on the left hand side, you'll be able to expand it, to see all of the different fields inside that particular table.

Let's say, for instance, we wanted to analyze how many regions are in each country. For this, I might want to go to my geography table in my geography table. I'll probably take my continence or rather my continent name, which happens to be at the bottom of the list and I'll drag it and I'll drop it in my rows field area. That'll give me the name of the three different continents that we have. Now I can take our region column and I can click and drag this into our values field area. This is going to count the number of regions that we have in each continent. It looks like we have 113 in Asia, 220 in Europe and 338 in north America. Let's get a nice, comfortable breakdown of how to build a pivot table of all the different types of fields that you might have in a given table. It's important to recognize the difference between two different categories. You might have data that is categorical in nature, or you might have data that is calculable in nature. Now we've already talked about this, but the general difference is that categorical data has a particular label or item that repeats itself more than once. It doesn't have to repeat itself more than once. Um, but generally that's what you tend to see. You also tend to see words here more often, although it doesn't have to be words, think of a categorical field as a field that you might want to. Analyze the storyline of, okay, so you might take something like Asia or Europe or north America and analyze something about it. Like in this case, the count of the regions in this case, what that means is that our continent field is categorical categorical fields do really well in rows and columns now because of the nature of people, we prefer to see long lists rather than wide lists. And so by far, rose is actually a preferred location for your categorical data. And occasionally you'll want to segment out your, your row data by some columns. In which case you might put something in the columns, but most people lean on the rows for their categorical information values.

On the other hand is where all of your calculable data goes. So in this case, what we wanted to do is take the region field and count the number of regions that were in each continent. And so we put this field in the values field area. However, if you think about it, regions are a text field. And so any kind of a field is a text field when put in the values field area automatically reverts to count. Now, this isn't so much the case. If you take a numerical field and put it into your values field area, let's say for instance, we wanted to get our. Projected revenue information. I'm going to take out the count of region and I'm going to get our projected revenue information and I'm going to drop it in the values field area. Now, the automatic function that was applied to this field was some that's because this is a naturally calculable field of data. Okay. This is pretty good. However, maybe we want to know that sales quantity among all of the continents that we have here. Now, when we do this, we are going to run into a problem. It's a good learning experience though. My sales quantity is going to be located in my fact sales table from here on out. If we're looking for any information about revenue, about sales, about profits, about returns, about sales, quantity, anything like that, all of that data is going to be located within your fact sales table. If I expand the fact sales table, I might grab something like our sales quantity and I'll drag that and drop that into values. But I'm going to see a huge problem here. And that is that I'm seeing the total every single time. The reason for this is that there is no relationship between these two tables. Remember this came from the geography table. Whereas this data on the right hand side came from the fact sales table, the fact sales table we pulled in from access. So it pulled in all the relationships of all the other tables around it.

However, the geography table, we pulled from an Excel file. And so this did not pull in any relationships in order to create relationships between our tables, that don't already exist. We're going to need to return to our actual data model. And remember our data model is power pivot. So I'm going to need to move back to my power pivot window. Now in my power pivot window, the default view of all of your data is in this view that we've been looking at where we have the tables at the bottom and the data preview is showing up up here. And it's columned information. This is called data view. However, if you want it to see the relationship between all of the tables that you have, there's another view called diagram view. If you click on diagram view, now we'll be able to see all the relationships that were established between our tables. We'll also be able to see any relationships that don't exist later on. We're going to realize that our fact sales table is the biggest table, and it also happens to be the trunk of our tree of all of our different data. When we very first started today's session, we took a look at this tree analogy on this slide. All of the different sources would be all the information that is located in the branches, but the trunk of the tree would be the data model itself. And it just so happens that one of the tables from our access file is the very large transaction table called fact sales. In order to think about this properly while I'm doing all of my work, I like to move all of my tables around so that they actually exemplify that tree analogy that we saw. So you'll see that when we move these tables around that the relationships stay maintained. If you hover over the relationships, you'll see the fields that are connected. So I'll hover over this one and I'll see promotion key in the promotion table connects to promotion key in the fact sales table. I'll also see that there's a one on one end, and then there is a asterix, which is meant to mean many.

So in this case, we have a one to many relationship here and you'll find this is generally how the flow of information goes. You'll have a very large transaction table, such as our fact sales table here, and then you'll have many different tables that are stemming off of it. Like the branches of a tree in order to get started. Like I said, I like to move my tables around so that they actually appear like branches of a tree. This will help me when, as I continue to do data modeling later on, the other thing that we need to do is actually notice our tables that are not yet connected to our data model. Now, just to be sure, you'll want to click on this little icon down here. What this is going to do is it's going to expand your views, that you can see all the tables. Occasionally you will have imported a table and it will show up way over here on the right hand side, and you can't even see it. So you'll want to use the zoom feature to be able to see it. So if we look at our David data here, we can actually see that there is a correlation between our data in our store's table. We have a Storkey here and that connects directly to our fax sales table and a store key. That's located over here. If I want to actually create a relationship between this data, all I have to do is click and drag one field and drop it on top of the other. Now be careful, make sure that you're dropping it on top of the correct field and I'll drop it. And there we go. I have a relationship I'm going to create one more. That goes from stores to geography so I can connect the geography table. Geography key matches to geography key. So I will click and drag and drop if you haven't done so already go ahead and pause the video and take a moment to connect your store. Key to store key from the fact sales table to the stores. And then to connect the store's table to the geography table, you'll want to create a relationship between geography key and geography key.

Now that we've created these relationships, our pivot table should be adjusted. And look, there it goes looks perfect. Great. So we're able to do this analysis for sales quantity, but what if I wanted to do an analysis of the sales amount? The sales quantity is just the quantity of products that we sold. And then I also have information for the standard retail price of each item that we sold. It would be nice if I could combine those two pieces of information together to create sales amount. We actually have a couple of pieces of missing information in here that we're going to have to generate. And in order to do this, we're going to need to create calculated columns. Let's return to power pivot, and switch to your data view and switch to your fax sales table as well. So we're in data view and we're on our fact sales table. Now just a moment ago, we were analyzing the sales quantity, but let's say we wanted to analyze the sales quantity times the MSRP. The manufacturer standard retail price. That would be the price that we actually sell most of the products at, unless we discount them. But let's just assume for now that we don't discount them. So if we wanted to get the actual sale amount, we're going to have to multiply the MSRP. Times the sales quantity, lucky for us in power pivot, we can generate extra data based on existing data. So if I have MSRP and I have sales quantity, I can use this area all the way over on the right hand side and power pivot to add what's called a calculated column. So I'm going to create our very first formula in here. You'll want to do this with me. I'm going to select the very first box right here. Now there's a reason I did not call that a cell. The reason that I did that is because power pivot does not care what road that is. I know it looks like there's a row assignment there, but it actually doesn't care. Power pivot doesn't care. Furthermore, power pivot doesn't care that this has a particular column ID.

Actually, there is no column ID. They just have names, right? So we don't see any column a, B, C, D E F. Furthermore, let's appreciate one more thing. Let's say maybe I got the discount, quantity wrong here. Uh, if I want to change that, watch what happens when I try and type, okay. There is no data modification back here in power pivot. The reason is that power pivot is a data modeling feature. It is not for data. All right. If you want to do any kind of data entry, you want to do that in the original dataset. This is just the place where we take all of the information and analyze it all at the exact same time. So you'll notice you actually have no power over a single cell or a single value in power pivot ever. All you have power over is entire columns at the same time. So that's why I called this a box and not a cell by entering anything into this box. Essentially I am entering it into every single other row in this column as well. So let's start by putting in a formula to multiply MSRP, times sales quantity. I'm going to start with my equal sign. If this were Excel, I just typed in my equal sign. I should be able to see it right there, but this is not Excel. And I don't, I will however see it in the formula bar. So from here on out, you will want to keep your eyes out for the formula bar. This will be the place that you focus on for your calculations. So make sure that you've selected that cell and started with your equal sign. And then we're going to ask to take the column called MSRP and empower pivot. I can just click on it and you'll see that it injects it into my formula much like an Excel. When you click on a cell, it injects that into your formula. I just injected the whole column and I'm going to multiply that by our sales quantity. Now there is a slight difference from what you're used to in Excel. There is a specific syntax for listing a column. You do need to surround that with your square brackets in the occasion that you actually have a space in your column name.

You're also going to need to surround that with your apostrophes. And you'll see examples of that later on in the course. All right. So make sure that you've typed in this formula here or that you have selected on the columns to inject the references and go ahead and hit your return. Key power pivot might take a second to think about it, but remember it has to do it more than 2 million times, and that actually looks pretty good. Now I know all the numbers look the same right now, but that's only because I'm looking at the very top of my list. If I scan down a little bit, you'll see that all my numbers start to vary. All right. So it looks like that worked. The first thing I'm going to want to do is make sure that I rename this column here because calculated column one is not going to be a very good name for me. I'm going to rename this sales amount. If you haven't done so already, please make sure that you've created an added column on the right-hand side. It should have the number 600 909 0 at the top because you have put in an equation that multiplies MSRP, times the sales quantity, and then do make sure that you have named it sales. Go ahead and pause the video. If you need to. And then come on back, we have two more calculated columns that we're going to want to create because we have two more types of data that we want to analyze. We want to be able to analyze the cost of goods sold and the return amount. The reason we need this is because sales amount and cost of goods sold and return them out is all needed in order to later on calculate profit. All right, so let's start with our cost of goods. Sold. Our cost of goods sold is going to be the unit cost times the sales quantity. So here's our sales quantity. And then we also have the unit cost. That was the cost cost to us originally to procure that unit. So we'll take the unit cost times the sales quantity, and that should give us the cost of goods sold. So we're going to create another formula in here that takes into account the unit cost times the sales quantity, and then you'll want to name this column cost of goods sold if you haven't done so already, please make sure that you have put in the field called cost of goods sold using this calculation here.

And finally, we will want to create another column that calculates the return amount. Now we don't have the return amount in here, but we do have the MSRP that we sold it for. And then how many units, if any, were returned. So. We're going to create a calculation over here that takes into account MSRP times the return quantity, and we'll call this return amount if you haven't done so already make sure to pause the video and ensure that you also have your return amount column in there. All right. We have created three derived columns of information based on data that is elsewhere in the table. Let's see if we can create a calculation based on the calculations. We just made a calculation based on a calculation. All right, we're going to create another column here. That's going to figure out profit. Now our profit is calculated by taking our sales amount, subtracting the cost to us originally, the cost of goods sold, and then of course subtracting any returns that we had. Any return amount. Please make sure to name this column profit amount. And finally, let's say we wanted to know how much our sales compared to the profit was. All right. If we wanted to figure out that we would need to take our profit and put it over the sales of. So we would start with an equal sign and say, we want to put our profit over the sales amount, using art division sign. Now this would be called a profit margin and this data technically should be considered a percent. And so I'm going to select this column. I'm gonna select this column and we're going to format it as a percent. That looks pretty good if you haven't done so already, please pause the video now and make sure that you have five derived columns here.

We should have sales amount, cost of goods, sold return amount, profit amount, and profit margin, and then come on back and we'll keep going. Now that we've made these great sets of metadata here, let's go ahead and use them in some analyses in our pivot table, let's return to Excel in Excel. We built this basic analysis of our different continents and then the sum of the sales quantity. But now that we have some more meaningful information like sales amount, let's go ahead and put that in. Instead, I'm going to take out some of sales quantity, and I'm going to go into our fact sales table and find those columns of data that we just generated. Sales amount, cost of goods, sold return amount, profit amount, and profit margin. I'm going to grab sales amount and drop it in the values. That is pretty neat. Now, some of sales amount, that's a lot to read. So if I want to make this prettier, I can double click on the header there. Double click where it says sum of sales amount, and I can change this to just say sales amount. And now it that's kind of a nicer reflection. It helps my end users. All right, good. With sales amount, let's take a look at something else. I'm going to take out sales amount and let's say we want to look at all of our profit amounts. So I'm going to grab profit again and put it in values. Well, that looks pretty good too. And I could do the same thing and rename it if I wanted to. And if I change my mind again and then say, I need to analyze sales amount, one more time. I'm going to grab sales amount and drop it in the values. But all of a sudden it says sum of sales amount. It says that again. So does this mean I'm going to need to remove the sum of before the name of this field? Every single time? The answer is yes, because the word sum of is Excel notifying you, that it's actually placed a measure on this field of data. Let's talk about what a measure is. You've been using measures for a really long time. You just didn't know it.

It's just because you didn't know the terminal. Uh, measure is the mathematical method by which we take a bunch of numbers and distill them down into a single value. For instance, this is the sum of the sales amount for Asia. And here is the sum of the sales amount for everything. If I were to go back into my original data in power pivot, it's worth discussing how I might take all of these separate numbers, these numbers that appear many, many, many times we have 2.2 million numbers in this column alone. So how do we take all of these different numbers and collapse them into a single number in one space? That is what we call aggregation, taking a bunch of numbers and summarizing them by means of some kind of mathematical storyline. You could take all of these numbers and add them all up together. That would give you a great storyline. And that's what we have here in our pivot table. Here, we have a sum adding them all up together, but you could also choose a different type of function to create a different storyline, like count or average. Average, we'll take all the same numbers and you know how an average is calculated, but it will take a lot of numbers and distill them down into a single value and tell a story in this case, the story that is telling is this is the average sale amount for all of our continents altogether. Other types of aggregations that we can use our max and men standard deviation and standard deviation for a population variance and variance for a population. And we now have a brand new function called distinct count, which is very cool as well. However, those are the only measures that are designed and, and provided by your standard pivot table. So it's important to know how you can actually create your own measures, the ones that are provided by a pivot table or what we refer to as implicit measures, because they are implicitly provided by the pivot table. However, if you wanted to explicitly say, this is exactly how I want this field calculated, you're going to create your own custom explicit measures.

Let's go ahead and create our very first measure. If you go to the power pivot tab in the ribbon of Excel, you'll see a command over on the left-hand side that says measures. We're going to create our own very basic measure here. And we're going to start off with something simple, like just, just a, some of the same. So we're going to go to the measures, dropdown and choose to create a brand new measure. Now, in general, when you're starting off, you want to store all of your measures on your transaction table. Our big massive transaction table is currently called fact sales. So we'll go ahead and store it on our fact sales table. And unless I otherwise specify we're going to be using fax sales as our primary table for measure storage, the measure name for this, since we don't want to call it some of sales, right? That's a little boring. Let's call it something very obvious, like total sales. Okay. I'm going to skip down until the place down here where I can design the formula. Now the formula for a total sales is just a sum, just like you would design it in Excel. And it's the sum of our, uh, column called sales amount, which is actually on the fact sales table. And you'll see here in this little drop down menu, it seems that power pivot really wants you to explicitly mention not only the column that you're going to be using, but also the table as well. So I am going to, uh, just start by typing in sales amount. And as I start to type in sales amount, I start to get some, get some really neat suggestions here, including the one that actually explicitly mentioned the table, called fact sales and the column that we created called sales amount. I'm going to double click on this and that's going to inject it into my some formula here. Now it takes a little bit of getting used to, to look at this, but in no time at all, you'll start to look at this as though it were one field of information.

Now, of course I did open up my parentheses for the sum function. I'm going to need to close it on the other end. And then I'm just going to take a quick moment to specify how this data should be formatted. Uh, this is currency, so I'm going to click on currency and it is going to need the dollar symbol. Two decimal places is good. All right, I'm all good. I'm going to click. Okay. All right. So now we have our total sales measure in our table. The nice thing is that unlike these implicit measures that we were using earlier, no matter how many times I remove it or put it back in, it's always going to be called total sales and lucky for me, it's always going to use the exact same function. One of the other things that you might've noticed previously is when you're putting certain fields into your values field area, it always comes in as the sum. So let's say you had a particular function, uh, like your profit margin. That's actually supposed to be an average type of calculation. So in that case, you usually want to use the average function on it, but you'll have to go and do that every single time manually, unless you create your own. All right. Speaking of which let's keep going before I move on, please make sure that you've created a measure called total sales. You can ensure that you've made that measure by clicking on your measures, drop down menu and selecting manage measures. Then you'll be able to see a list of all the different measures that you have so far. We just have the one called total sales and I'll click on edit so that I can open it up and actually see how this thing is built. Please make sure that you have designed at least this one before we continue. So if we have made a measure based on some all bet it follows logically for you that we could create some easy other measures that use average or max or men or. Or standard deviation or variance any of those things that you've already seen in the pivot table. However, there are a lot more functions that you can use as well.

This function, the one that we're looking at is called sum. You've been looking at this in Excel for a very long time. However, this technically is what's called a Dax function. This function is actually coming from power pivot. There's a collection of functions that power pivot uses in order to be able to perform operations on entire field of data at the same time. If you think about it so far in Excel, you've always created operations based on cell to cell interactions, but now we're actually creating operations on entire columns, entire fields at the same time. So for that, we need to create a whole new breed of functions and that breed is called Dax functions. Dax stands for, uh, D a X data analysis expressions, um, and they are the language by which we perform functions in power pivot. The exact same functions that you can use in power pivot will be available in power BI in the data modeling feature as well. Let's go ahead and get used to some new ones. Let's start with a function that finds all of the transactions that happened in Asia. So all of the sales transactions that happened that were connected. To the Asia continent. So we're going to create a brand new measure here. Okay. The table name, it needs to be on the fact sales table, and we're going to be counting the transactions here. So I'm going to call this total transactions, the formula that you can use to count all the transactions that happened in a particular table is called count rows. Now, the nice thing about the count rows function is that it doesn't actually need to know what column you're looking in, because if you think about it, you're going to have the same number of rows in every single column in your table. So all I need to do is specify a table. So I'm going to say, look at my fact sales. This is my list of all the transactions that happen for us. So look at my fact sales and count all the rows. Now, uh, in this case, we'll say it's a number. Um, I don't expect to have any decimal places here.

They should all be whole numbers. So I can go to the format and choose a whole number. And then I can also choose to use a thousand S separator. If I think that I will need it and I'll click. Okay. It looks like we add 458,000 transactions in Asia. So this is pretty cool. We know how many transactions that we actually had, but let's say we wanted to do something a little bit more interesting. Let's say we wanted to know how many days we were actually selling right now. We're a global company. So each location is probably going to have had a transaction. I don't know, maybe four or five, six, maybe even seven days out of the week. But altogether with all of our stores combined, it's really likely that our whole company combined is going to have a transaction, at least one transaction for every day of the calendar that we've had so far. Uh, however, maybe if we want to get down and deep in the stores analysis, we actually want to look at the stores and how many different days that they were selling. So let's go ahead and create a measure for this. We can analyze it by continent and then we can actually step into store detail and actually see this exact same analysis for stores as well. So we're going to create a new measure here, and it's going to be located on the fact sales table and we're going to call it days selling. We're going to use a function that counts the number of days, but only counts the distinct days. And that function is called distinct count. Now it just so happens that in our fact sales table, we do have a date key column. And in that day, key column is the actual dates that every transaction happened. So if I am trying to analyze all the, all the distinct days, then I am going to use that particular field. I'll close out my parenths here. And, uh, it is going to be a number should be a whole number and I'll use my thousand separators and click. Okay. So it looks like each one of our locations sold exactly 1095 days.

It just so happens that we only have records for the last three years or so. And so of course, every single continent has had at least one transaction every single day that we've been in business. However, if instead of just looking at continents, maybe we wanted to look at store information. Store information might be a little bit different because each location might not actually sell seven days of every week. So let's go ahead and look for our stores. And maybe I'll look for each store name and I'll grab store name, and I'll drag it and drop it underneath the continent name in the rows. And now you'll see that we have certain stores, this store Contoso, Bangkok, uh, number one store only sold 523 out of the last, however many days we've been in business. Okay. So this is a very different kind of analysis here, but still very meaningful. Okay. Let's create something even more interesting. Let's say we want to be able to evaluate the strength of each one of our stores by the total sales that they have per transaction. We want to know which stores have the highest sales per transaction. Meaning when they do some sell something, it's usually a high sale number. Okay. We're going to need to create a new measure for this. Here's our new measure located on the fact sales table and we'll call it average sales per transaction. And we're going to calculate this by taking our total sales, the measure that we've already created, our total sales measure and dividing it over our total transactions. Another measure that we've already. Okay, this is probably going to be a currency figure here. So I'll use the dollar and it'll have two decimal place holders and I'll go ahead and click. Okay. All right. Now we're seeing a lot of different numbers. Let's see if we can sort this by the highest number. All right. Click on average sales per transaction. I'll right. Click on that and go to sort, and I'll say I want to sort largest to smallest, and it looks like at least within Asia, our Contoso Asia online store has the largest sales per transaction.

Very well done. That was our session on measures. I do have a lab for you. Let's see if we can create a couple of more measures here. We've already created the very first measure here. Total sales per transaction, or rather we changed that into average sales per transaction. Let's go ahead and make a couple more measures and you're going to do this on your own. You're going to want to make a measure that is sales per day or average sales per day. And you also want to make one that is average transactions per day selling. So when we are selling, how many times are we selling per day? All right, go ahead and give that a shot and come on. These measures that we just created are really good. And we're going to need them later on for now. We are going to create some additional measures as well, that are going to use the calculate function. Probably the most important Dax function you'll ever learn to use in order to do this, let's go ahead and keep this pivot table since we're going to need it later on, and let's go ahead and create a brand new sheet in Excel. And on this sheet, we'll create an insert, a brand new pivot table from our data model. Remember if you're using a previous version of Excel, you'll actually need to go back to power pivot, and then click on the pivot table button from over there in order to create a power pivot table. Okay. Now let's say that we're analyzing our sales. Okay. So we're going to go to fact sales, and maybe we want our, uh, sales amount here, and we want to be able to analyze these exact same sales, and maybe we want to analyze these sales by the continence again. So we'll go down to geography and find our continent. Now it turns out that each one of these continents run different kinds of promotions. Now it's not that you would know this already. I just know it because I'm rather familiar with the data, but if we go back into power pivot and actually look at the data, if we go to the promotion table, you'll see that we have a lot of different kinds of promotions.

We have, uh, first of all, the promotion label, number one is no discount. So if something, if there's no promotion on something it's marked as promotion key one over here. Okay. But all the other promotions refer to an actual promotion that we've had. And we have some in north America, spring promotion, back to sales and holiday. We have some in Asia holiday, spring and summer European ones, north American ones, uh, some more Asia, Asia holidays here. We have Asia holiday promotion, Asia spring promotion, and they each have their own discount percent. So, so sometimes when these transactions take place, there is a promotion applied to that particular discount. Now it's important for us to start to be able to analyze regular sales versus promotion sales. So we'll need to go into our measures and be able to create that. However, the measures that we've created so far really don't have an ability to filter. And that's what we're going to need. We're going to need to filter our total sales. By just the items that adhere to one of these promotions. For instance, if we wanted to be able to figure out what our total sales were for things that were, that were not on discount at all, we're going to need to be able to take into account the total sales, but only for items that were associated with promotion key. Number one, typically we would consider this a filtering process. And as far as we know, measures, don't have an ability to do this, but that's why the calculate function is probably the most important function that you will ever learn in decks. Let's go back to our pivot table that we just created and create a measure using our calculate function because the calculate function, what it does is it applies filters. So we're going to go to power pivot tab in Excel, go to our measures, drop down and create a new measure. Course.

This measure does need to be on the fact sales table, and we're going to start off by creating a total sales amount for things that are not on promotion. We'll call this regular Cairo Person: sales. Chelsea Dohemann: Now, when it comes to actually creating the docs function for this, the magical function that we're going to use. And the one that is going to be so important to you is called. Think of calculate like a sum, if function in Excel, if you've already used that what a calculate function does is it allows you to have two different pieces to it. The first piece is just the regular old function calculation that you're going to be doing. For instance, for us, that would be a sum of our sales amount. Right? Think of your calculate, like a sum, if function in Excel, the calculate function allows us to do two different things. First of all, its first argument is to just go ahead and perform the normal arithmetic of a normal measure. Now a normal measure would do something like sum up the total sales and we've already created some measures that did that. So we would sum up the total sales as the very first argument of calculate. The second argument of calculate, this is where you actually put in your filter. So in this case you might say, okay, go ahead and do this arithmetic. If the criteria in a specific column equals something specific and there in lies your filter. So only added up if the promotion key was. Which, as we know is no promotion. No, just so turns out that the calculate is super flexible. Meaning if you wanted to put your, um, pieces of your measure in here. So if you wanted to put your some function and then what your, some function was supposed to add up our total sales in here, you could absolutely do that. However, if you have already created a measure that did that, like we did, we now have a measure called total sales. Instead, you can use a calculate function that just includes an entire measure that you've already designed as the first argument.

That way you don't have to copy things a bunch of times. All right, let's go ahead and create it. All right. We're going to start off with the name of the function, which is calculate in this case. We're going to use our normal total sales measure, which we've already created. So I'm just going to go ahead and type in total sales. And there's my total sales measure. I'll just inject that into my calculate function and that's going to be the first argument. The second argument is what I want it to actually consider valid to add up in this case, it's sales that are not on any promotion at all. And as we know any items that are associated with the promotion, key one, don't have any promotion on them at all. So we're going to say, all right, on the fact sales table, we actually have a promotion key there. You could also do this. Accessing the promotion table instead, that would be fine. You can go ahead and do that. I'm going to look for the fact sales table where we have the promotion key and there it is right there. And in the circumstances that promotion key equals one, then I want it to calculate. Okay. So this is going to be a currency figure because it is total sales and I'll go ahead and click. Okay. And let's see what it looks like. Wow. Isn't that interesting? It's such a small portion. Alright, let's see what the inverses, let's see if we can create a calculate measure that figures out anything that is on promotion. So we'll say any promotion key besides one, those are things that ar