Excel skills required for a Data Analyst


I've trained a lot of people in excel and power bi and i'm still coming across a large number of those people who don't use or aren't aware of some of the key excel concepts that are really essential to being a good data analyst there's tables power query and pivot tables and even power pivot so i'm going to go through those pretty quickly but just gives you a sense that they're there and you really need to know about them let's go so let's start with tables they got introduced to excel in excel 2007.

Um and they're just one of the best things ever to happen to excel this looks like a table but it's not an official table to turn it into an official table my shortcut is ctrl t so ctrl and t highlights the whole table and the key thing is to make sure that my table has headers has a tick in it and this is now an official table you even get a new table design menu that pops up if i click away the table menu disappears if i click back in it the table menu appears again okay so what's so good about tables and do i have to use ctrl t what if i can't remember it well look under the home menu there is a format as table option so that's the same thing as ctrl t there's also insert table that's the same thing and even tells you ctrl t is the shortcut there so you've got no excuse about not remembering how to create a table right why are they so good well firstly there's nice little features i'm not going to go through all the table features but structured data is key to doing any good analysis and even before that just alternating rows is pretty cool and if i insert a line the rows still alternate little things like that when i scroll down the headings always show as long as my mouse is inside the table i can always tell what the headings are you know there's nice little things like that but the real key is that tables the formulas spill down tables automatically so what do i mean well let's say i want to join together these two names with a space okay i could use the all concatenate option or the ampersand but if you don't know about text join let me show you that so equals text join i want to put a space between each word comma i want to ignore any empty cells sure and then i just highlight the two cells to join together or i could control click on it or click on them individually with commas in between close the bracket enter and there we go the formula has spilt down automatically and if i wanted to change one of these let's say i go into this one and i say actually put a colon in between each name every single formula changes and if this table was 10 000 rows long it would still change so having formulas be consistent in a column is key and tables enforce that it's just really powerful and i'm just going to change this name to sort of like full name or something okay also when you're doing formulas if i want to add up this january column equals the sum and i can just click in any cell and do control spacebar that just highlights every number or every value even if there's spaces so no more controls shift down ctrl shift down hit a space ctrl shift down just control spacebar beautiful highlights everything and the table formula is much more meaningful it actually tells you it's table two and it's the column january 2021 okay press enter and there's my grand total and i can even change this to start typing in the feb there we go feb 2021 and you just have to make sure you chop that bit off the end and there we go that's now my february also name your tables so here's table two if i click on it and go to table design up in the top left hand corner give it a sensible name like table um staff uh that'll do okay and now look at my formula it's the table staff february 2021 figures so it just builds good structure okay so that's tables as you add new lines insert rows the formulas automatically fill down much better way of designing robust tables of data.

Okay so look into tables right the next thing power query hidden away under this data little tab here this little section is power query i'm torn whether power query is the best thing since excel started i think it might be it beats tables but tables and power query work beautifully together so what is power query why do i need it why should you know about it well your data never comes out clean you're very lucky and privileged if you've got nice clean data that's in a perfect format just a little table like this for example look the names are all lowercase and with gaps in the months are going across here which is horrible if you want to try and do some analysis on it so i'm going to use power query to clean up the data flip the data around and the cleanup those steps i do in the cleanup they're repeatable i can click refresh tomorrow and it'll all run again so you just reduce the laborious time of data cleansing okay so power query if you don't know about it it can connect to all sorts of things it can connect to excel workbooks if your excel files or csv files or things are on sharepoint i've got a video on that i'll put a little link will pop up in a second and if you want to see how to connect to a excel file on sharepoint or csv file take a look at that video i'm just going to use this option here from table slash range and it may be called from sheet in your version if you're on a really up-to-date version of excel okay so what am i going to do well i'm going to pull this data in from table range and now this opens up the power query editor window where you can do all sorts of data transformations you can even add calculations and things in here it really is pretty impressive i love power query it saves so much time it's been around for seven years okay and still hardly anybody knows about it so here's my power query window i want to uppercase these first two columns okay so i'm just going to click on them i'm going to say transform and you can just go into these sorts of things like format and you've got lowercase uppercase capitalize each word ah that's what i want to do so capitalize each word and see it's just capitalized them and the great thing is that it's also added a little step and i can just delete that step if i don't want to do that anymore but it almost it's like self-documenting code it's brilliant this formula language in the in the formula bar here that's called m it stands for mashup and it's a it's a language for transforming your data okay i'd like to merge these two columns together so i'm going to go right click and then merge okay so don't have to do text join don't have to remember any formulas i'm just going to merge the two columns together i'll then get prompted as to whether i want to give it a new name and if i want to separate her so do i want a gap yeah i'd like to put a space you can put whatever you want using that custom option i'll go a space and i'll just call it full name and click okay so now we've got full name and if i want to get rid of any extra spaces i can go right-click there's all sorts of right-click options transform see there's the uppercase as well and capitalize each word trim that gets rid of extra spaces clean gets rid of in sort of invisible characters and stuff so trim okay doesn't really seem to have done anything but if there were any extra spaces on the end or in the middle double spaces it would get rid of them and then the best thing one of the best features there is in power query is the unpivot function so you see these columns they're all the same thing they're months going across that's not what you want you want the month in a column and the values in a column so i want to flip all that data around so i'm going to the full name and department column they're great they're nice columns and then right click unpivot other columns that is what you want to do everything else is flipped around and that is now beautiful so rather than three columns for every item i've now got three rows for laura three rows for jen okay it's flipped it all around and i'll rename this as date and this as i don't know maybe it's they the units they're selling maybe they're sales people okay and then i can try and turn this into a date so if i click on this let's just see if it's clever enough to date how good's that it actually knows it's turned it into a proper date and actually i want the month end so right click transform month end of month all of this is now repeatable tomorrow so when that data updates i can click refresh and all these transformations will happen okay and i can call it clean data right that will do that is perfect so here we go home close and load close and load two so where do i want to load it well i've got a choice and i'll show you both so i can load it to a table and i'll load it to an existing worksheet and just load it here okay so it'll just run away load the query refreshes and this little table will magically appear and there we go okay and if i add something new like april's data let's go april 2021 um one two three four five six for example i just right click on this green table right click refresh okay and there we go there's april there's april it's beautiful okay and remember power query can connect to external excel files it can consolidate folders of data it can go off to a sql database and pull data in it's just awesome and power query exists in power bi power query exists in about 10 other products in growing it's just key to getting data into your reporting so now i've got beautifully clean data i can now produce a pivot table if i go back to this table for a second and say insert pivot table and just say okay the problem here is this if i say okay i want to see my units sold for example by department i can put department in my rows so there's my engineering and finance awesome okay pivot tables are just beautiful summarization tool right it's just great for slicing and dicing your data in a number of different ways but look i've got january data i've got drag that in february how do i do a total i've got to drag them all and then do a manual ad horrible okay so we don't do that we don't ever load that sort of structured data um it's already a matrix of data okay this is already the months going across you need to flip them around now if i go to this table and try and do a pivot this will be a lot better so i can just go to the table tools summarize with pivot table new worksheet okay and here i can just go department in my rows there we are and then units okay and if i want that by date i could put that underneath my department and i can put my department in the columns and i can do all sorts of slicing and dicing and the pivot tables is you don't break anything you're not changing any of the source data you're just slicing and dicing it by row by column you just drop in where you want to handle that okay but if you're gonna do analysis and pivot tables rather than loading your data straight into this table instead of that if i go right click over here this is my power query and i choose load two i'm not going to load it to a table i'm going to load it as a connection into the data model now the data model is also known as power pivot and it's that same data model or a similar data model concept that's used in power bi so if you learn these skills in excel they're directly applicable to power bi and i click ok it's going to warn me that this green table is about to disappear okay it's still loading you can see on the right here that it's still loading but where is it loading to well it's loading to this magical data model which is capable of holding multiple millions of rows okay i've got a video where i do 10 million rows pull it into excel now excel if you don't know if i go down the bottom has only got one million um and 48 576 rows so if you want to row load one and a half million records you can't but you can load it into the data model okay let me show you where the data model is if i go to data and this green button manage data model the first time you click that you might be prompted to install an add-in called powerpivot in which case you'll then get a new tab appear called powerpivot so this is the data model it's opening up and here would be my one million rows two million five million rows of data okay and you can also load in other data and helper columns and dimension tables they're referred to as and build a whole tabular model inside excel this is exactly the same sort of um experience you have in power bi so let's say i need to hook up my department to a table with department and a bunch of details i could just do a nice little drag and drop if anybody's done access in the past might be a similar experience for you folks or have a calendar table with months years quarters financial years and just hook my date up to that table and slice and dice okay so that's the data model concept so what do i do with it well if i just go to a brand new empty sheet and insert a pivot table and it now says use this workbook's data model okay because i've loaded data there and i click ok it's just a pivot table but see the icon slightly different and it's now connecting to that data model and again i can just do department in my rows and units and my values and you can start to write dax measures here as well which is a whole another topic but the beauty of this is i can just go data refresh and it'll pull the data in and the pivot table will update i don't have to load it to a table in one refresh and then load it into a pivot table in a second refresh it all happens in one hit and like i say the pivot table the data model power pivot capable of holding masses of data and doing all sorts of really cool analysis so those are my three key features tables you need to know about structured data get your data into columns power query can help you if your source data is not in nice structured columns and needs headings removed data split apart joined together whatever it might be and then analyze your data using pivot tables because you can slice and dice and it's all refreshable at a couple of clicks so hope you find that useful please subscribe and i'll catch you later.