SQL for Data Analysis

In this video we are going to write 50 sql queries so that you can learn understand and master sql we are going to look at how to do select operations how to write where clauses how to use and or not and how to combine all of these things how to build joints in sql how to do group buys order buys and so much more the video has got lots of practical information and real-world tips and i have got a full-length example data set as well as all the queries for you to download so feel free to check the video description for that as this video is quite detailed i highly recommend that you set aside time and watch it in one sitting for best results let's go we will be using the mysql workbench to build our sql queries i have already preloaded our awesome chocolates database to this but you will need to load this using the instructions that are available in the video description once you have the awesome chocolates then we can go and create some sql queries to make a new query here is my plus sql button in the corner and when i click on it i will get my query editor here and whatever query we write we can see the output results right here on this workbench the first step before when we write any queries is to make sure that sql or mysql is using the awesome chocolates database because there are multiple databases here right now world is the database that is set up so i'm just gonna double click on awesome chocolates and that will basically activate that database as explained earlier we have got several tables here we've got a geography people products and sales tables let's go and take a look at the sales table itself and see what is there if you are not familiar with the database and you don't know what tables are available themselves you can use this simple query called show tables and when you run this it will list all the tables within the database and to understand what is available within a specific table you can also use the describe and then table name and you can run that query and then that will show you what is in the sales table so describe sales tells me that the sales table has these columns and this is their data type and some extra information about that normally when you are writing sql queries it is expected that you are familiar with the underlying tables and the relationships if you don't have that knowledge writing sql becomes really hard so for our first example we are just going to look at the sales table and then see what is there this is where the select statement comes in so we say select and then if you simply say star that means you want everything all the columns from and then the table name sales so select star from sales and then to run this particular statement you can use the run command here or you can also press ctrl enter in the mysql workbench and then that will basically run that the exact way to run these commands and the shortcuts differ from one system to another for example mysql workbench is using control enter to run the current statement whereas previously in my other projects i would use sql server management studio ssms and that has its own shortcut so depending on which database you are talking to and which sql management studio or workbench or some other thing in oracle that you are using you will need to press a different shortcut whatever may be the process once you run the query if there are no errors you will see the output result here in a nice grid format and here you can see that you know it is basically listing all the rows within the sales table one after another in this fashion now let's just say we don't want to see all of these we only want to see some columns for example i just want to see sales state amount and customer columns these three columns so i can write a different sql statement select sale date amount customers from sales now because we have not typed the table name there is no auto suggest available here but once you type the table name you will if you go back here and then press comma and then start typing some other column for example boxes you will see that you know it will show up these auto suggest options so it's a good practice to first say select and then from table name and then come back and write the columns that way you don't need to remember or make sure that you're not making any spelling mistakes let's run this one and now i'll get a kind of truncated version of the table just those few columns remember this is still listing all the data right now within the management that workbench that we are using our queries whenever they run there is a limit of 1000 rows so even though my sales table has around 7000 rows of data for the purpose of viewing it will only just show 1000 rows so if you want you can kind of say don't limit or set a higher limit normally when you are building the queries here your intention is to make sure that everything is working correctly and then you would probably use that query somewhere else like maybe in power bi or in some other system so you don't need to worry about seeing all the data but another use case is you will run the query and then you will export the data in which case you may have to set a limit higher value here or no limits there so that's our first few queries we just looked at everything in the sales table and now we looked at a bunch of columns you don't have to specify the columns in this exact order you can also say select and then for example say amount customers and then geo id from sales these columns are out of order from the original data and then when you press ctrl enter you will get those rearranged for you for our next query we are gonna add a extra calculation that is amount per box select and then i want to see my date so our date is sale rate and this can get a bit tricky so i'm just going to write from sales first and then come back here and then from here we can just go ahead and build those extra things so sale date and then i want to see amount i want to see boxes and then we want to add a calculation which is amount per boxes so this is where we just take amount divided with boxes and that's basically amount per boxes because here i am doing an arithmetic operation and then let's see the result of this so here we get sale rate amount boxes and amount per boxes nicely calculated to you as an extra column you can do lots of different operations arithmetic text and you know multiple different complex things during the course of this video i will show you a few other variations of how we can use this while this is all good the amount slash boxes here is not a very pleasant way of looking at that name so maybe we want to call this as amount per boxes here on the display but the operation itself is amount slash boxes so how to add a name that's our second job i'm just going to copy this query paste it here and then once you define the calculation you can give space and then you can either use the word as or you don't have to use the word as also and then within single quote you can give a name that you want so i'm gonna say amount per box as my name so your calculation space amount per box and then when you run this you will get a different column name so many times you may want to give it a proper column name as i said you can write it like this or you can also say as and then write it basically as means you are creating a synonym there and both of them are going to give you the same result so far all the queries that we have written are returning all of the sales data now let's understand how we can use the select statement to impose an extra condition on to the query that means i just want to look at all the sale amounts more than 10 000 so this is where the where clause comes into picture we can simply say select star from sales and then in the next line you'll say where and here you can specify any number of conditions and combine them creatively we'll write one simple condition for now later on i will show you more complex conditions so where and then you will say what the condition should be this condition can be column greater than or column less than so here my column is amount greater than ten thousand and when you run this you will see all the amounts that are more than ten thousand so here you can clearly see that none of these amounts are under ten thousand everything is more than ten thousand if you contrast this with maybe this query here you can see that you know we got an amount of eight four one four five three two five three seven six etcetera but none of those amounts would show up when i run this particular query this is only giving me those sales so one way to think about this is this operation is like filtering in excel essentially when you set up a filter the filter criteria that we give is either greater than less than or you know some other operation so whatever it is that's where the where clause comes in using where clauses in sql is one of the most important aspects when it comes to data analysis that's why we are going to spend next few minutes understanding different types of wear operations and learn it at a deeper level let's go before we get into more where clause details first let's understand how in this query we are seeing all the results but i just want to see this in the order of amount so that means lowest amount on the top highest amount at the bottom or vice versa so this is where the next clause of the query comes in the order clause so i'm just going to copy this here and where amount greater than 10 000 and then you can say order by and then specify the column so for example order by amount means this is going to give you a result of all the amounts greater than 10 000 and then arrange them by the amount order in the ascending order by default so ctrl enter on this and you'll see that you know lowest ten thousand greater than amount is ten thousand ten dollars and then this keeps going like that up until fifteen thousand one four eight i believe at this point we reached the limit of thousand rows probably there is more uh but another way is if i for example change the order order by amount and then say desc then that means i'm giving i want the order in the descending order so you can see that you know we do have some 27 25 000 amounts but because they are more than the limit they are not showing up on the screen but our highest amount is 27 000. so this is how you can order the results also when you are specifying the order you can kind of go and have multiple types of criteria here because the amount tends to be fairly unique there is no need for us to give the second criteria but i'll show you another example so select star from sales where and then we are going to limit all the values from g1 go id g1 so where go id is equal to g1 and order by and then within the guid i want to order this by our pid so pid and then amount esc so we are now specifying two criteria product id and then within the product id amount should be by descending order let's try this query so here we are only going to get all the g1 items none of these values will be for anything else and within g1 you have got p01 and then their information going like this and we will then have p02 again it kind of kickstarts from 16 000 goes down and then you will have p03 p04 like that so this is how you can use where clause as well as the order by to impose multiple sort criteria so where claus is like filtering the data in excel and order by is like sorting the data in excel as i mentioned earlier where clause is one of the most important things in sql so i'm gonna show you a few more examples of that our first example is going to be we want to see all the results all the sales where the value is more than ten thousand and the year should be twenty twenty two we do have the sale date information here so we can just do a year-wise operation so we'll try our first query here select star from sales where amount greater than 10 000 and now we need to have the second condition so amount should be greater than 10 000 and date should be within 2022. so this is where the end keyboard comes in you can use and or not etc to kind of combine conditions and add multiple criteria so amount greater than 10 000 and sale date and then here we can specify a date so for example you can simply say greater than anything in 2022 given this data set the data has only values for 2021 and 2022 we don't have any 2023 values here so one cheating way of doing this is sale date greater than and then you can simply say greater than or equal to and then you can specify the date within mysql the date format is yy mmdd so year month and day so i can say 20 22 0 1 0 1 like that so we are basically building two conditions where amount is more than 10 000 date is this and then i'm just gonna run this so we will get all the values within 2022 that are more than 10 000 and those information are coming up here you can see that our sale date kind of goes all over the place but it always stays within 2022.

Like i mentioned here we are simply saying greater than 20 22 0 1 0 1 so what if we have 2023 data or 2024 data in that case this one is gonna bring all of that data as well so another way is we can kind of check what year the sale date is and then look for whether that year is equal to 2022 so we'll write an alternative query here so select and this time i don't want everything i'll just say sale date amount from sales where amount is greater than 10 000 and instead of directly accessing the sale date we can do an operation on it like year of sale date so here is a built-in function within sql and using that i can basically extract the year portion of the sale rate similar to how the year function works in excel really so year of sale rate is equal to 2022 as year will return a number you can just do it like this you don't need to put any single quote or anything like that and then we will order these by order by amount dsc so let's just see how that one will look control enter on that so we'll get all the 2022 amounts in the descending order going down like this next up let's try to find out all the sales where the number of boxes is between 0 to 50. let's go this is where we are gonna look at the between condition so again we can simply say select star from sales where boxes and when you are writing a condition where the number is between 0 and 50 you can do it in couple of ways one way is you can say boxes is greater than 0 and boxes is less than or equal to 50. so anything from 0 to 50 boxes this is what i want to see and when you run this you will get all the values and you can we are looking at the boxes column here if i click on that it is sorting but essentially you will see that you know this is how those shipments are going from 1 to 50. as i mentioned this is one way of doing sql also offers a between keyword so we're going to try the same with the between so select star from sales where boxes between and then you specify both numbers with the end in the middle so it'll say 0 and 50 and when you do it like this control enter you will get same result really but this time the boxes are there uh of course the between function is inclusive so zero is also shown here so zero there are a couple of shipments where we had zero boxes go out not sure exactly what was happening here probably cancelled shipments but the zero is also shown whereas with this result it's not showing because the condition is greater than so here two different ways to write the between condition whichever way you can feel comfortable go with that hey i just want to remind you that while learning sql is a good idea if you don't practice then you will forget most of the concepts that's why i have prepared many practice homework problems for you that are showing up on the screen feel free to refer to the video description and download these practice problems as well and try to solve them some of them are easy some of them are hard the hard ones require you to investigate a little bit more beyond what is covered in this video but i highly encourage you to practice these so that you can learn and understand sql better good luck our next where clause example is going to be i want to see all the sales or all the shipments that are happening on the fridays so again we have got the date here we now need to use the another function within sql to see that so here i'm going gonna say select from sales and then here the columns that we want are sale date amount and then boxes and i also want to see what weekday it is so that we can then kind of check this so we can use the weekday function and weekday of sale date mysql workbench offers like kind of a built-in help in screen but it doesn't really show help for everything so here i even though i use it weekday i mean it does show some information but not a lot so here i can see that weekday returns 0 for monday 1 for tuesday and then 6 for sunday so essentially if i filter correctly we should expect to see just 4 there because 0 is monday so four would be my friday so sale date amount boxes weekday sale date as with a single quotes day of week and where and then here i can just uh kind of give my condition we have already set up a column so we can directly refer to that i'm going to try this i haven't tried this in a previously so day of week because there is a space there you know we need to refer to that column with the back quote here is equal to four let's try this oh i did get an error here so this is my our first sql error let's read this unknown column day of week in where clause all right that kind of um makes sense i guess we can't reuse this there so we are gonna try the usual route which is weekday sale date is equal to four so that's that works as expected because we kind of did the same thing earlier with year of uh sale date equal to 2022 so this one is working i thought we could reuse that and these are all our shipments that happened on the fridays in our data so we do a lot of shipments on fridays and you know that those dates are showing up amount in boxes and the day of week column comes up here so far we have only been using the sales table but our data base already has some other tables as well so i'm going to use some of these other tables to demonstrate few more tricks and concepts of sql that are very relevant when it comes to data analysis work so here i'm going to use my people table we haven't used people table so far so our first step is to just kind of explore what is in the people table so we'll say select star from people and that will give you all the people listing here it's a very simple table that tells me the names of the people their id number which is then used to map to my sales table their team and the location in which they are operating from so our next criteria here is i want to just see all the names all the people that are in either delish or juices team so here again you can do this in a couple of ways we can say select star from people where team equal to and then single quotes because this is text delish or team equal to juices so here we are using the r condition instead of and condition because now someone person cannot be in both teams they'll be in one of these two so if they are in this or that we want that and then when you run this you will see all the delish and probably misspelled there i'm gonna just see oh it's not juices it is juicies so that's why the second result is not coming up juices so i'm gonna adjust this to sees and yeah we're gonna get both delicious and juicy steam members here in the listing nice and good so this is one way but if you have got let's say you want to get all the teams where the team names could be a b c d like that five seven values then typing this r condition is a pain so this is where sql also offers a different way of doing this it is the in clause so select from start from people where team in and then open the bracket and then type the values that you want to type because these are text values everything needs to be in single quotes and when you run this you'll get the same result but now we are using the in clause to provide all the possible results so in offers a more flexible way in which you can add multiple conditions and kind of get all the results that you want another common thing that we do in sql is pattern matching and this is where we can use the like operator in sql so for example i'm going to select all the people so select star from people where sales person like and then i want to just see all the names that begin with b so we'll try a few different things intuition tells me if i simply say like b i'll get all the names that are having b and we get nothing here this is because it's also expecting some sort of a pattern so if i click here on the like you will see the help is showing me this and then there is some extra pattern syntax there but i believe if i say b percentage then i'll get all the names that are there so like is is the word and then whatever pattern you're giving you can give the fixed items and then percentage means anything so it basically looks for anything that begins with b let's try this again this time i want to have a list of all the people where b is anywhere in their name so this is where percentage b percentage comes in that means it can have anything b and then anything after and when you try this you're gonna get a little bit more people because all of these people have b as well chess bonnell has a b in the middle mario breen has a b in the middle marnie and all of these other people so that's the like operator i really work enjoy working with the like operator and that reminds me if you are enjoying watching this video don't forget to give it a big like all right and for the like puns let's go on and learn some more sql apart from like and in and where etc sql also offers many other operations that you can do on the data i'm gonna just scratch the surface here by introducing one another operator which is the case operator for the purpose of this we will need to go back to our sales table so select star from sales just to kind of bring us back to the data and then you can see we forgot the from here so select star from sales and then it gives you this data and instead of seeing all the sales and the amount values i want to add an amount category as a column where any amount up to one thousand dollars will have an extra label here that says under one thousand dollars and between one thousand to five thousand we will have a label that says under five thousand and anything between five to ten k will have under 10 000 and anything more than 10 000 it will be having another label so basically four labels that we want to have a categorization of amounts if you ask me how to add that kind of a thing so here we can use the case operator within sql so i'm going to first write the query and then i'll explain it to you so we say select and then let's just print date amount so sale date amount and then the next one is an operation so it's a good idea to kind of break your query into multiple lines that way you kind of know exactly where everything is normally within um like when you are building bigger queries usually you just have select in one line and then fields and sometimes if there is fields going on then they will go into multiple lines like this and this is where you can use the tab key to indent your query so that everything looks nice and pretty so select amount sale date and amount and then the final one is case and then you can save when and then build your condition so when amount is under 1000 then what what is the word that you want to have you want to say under 1k and then you can keep building these when conditions so i'm gonna just indent this correctly so when amount is under 5000 then under 5k when amount under 10 000 then under 10k and then finally you can have an else operation so it will be else and then whatever you want so here i i'm going to simply say 10k or more and then end so this is the basic construct here so you start with case and then end and then once this field is constructed you can give it a name so we will call this as as and as amount category so basically we have took the sales amount and then kind of built this categorization i'm just going to collapse this for a minute and from sales let's just run this you can select all of this or you can place the cursor anywhere and then ctrl enter this is gonna give you those amounts and then amount categories so you can see that you know it has nicely categorized all our amounts this is very helpful especially when you're trying to build some categorization based on your data either using numeric values like i have shown or text values so that you could then either display it on the screen or use it to build a where condition or map it out to another table this brings us to the end of our part one of the sql journey so far we have been building queries where all the data sits in one table and then we are doing operations on that data in the subsequent part of this video we are going to look at joints so that you can take data that is in multiple tables and then combine that to create one output let's go in the joints part of this video i'm going to show you some practical examples and briefly explain what is what but if you want to learn a little bit more about how joints work and how to use them for many practical scenarios you will need to do a little bit of further study i'll talk about some resources for learning sql later on in at the end of the video but for now let's go and build some joints before i do that i just want to save my sql file because then i can come back and refer to these queries or i can share these examples with you so to save this i can simply go to file and then save script and you know this will just open up my computer i'm gonna we'll call this as queries one and then it's a dot sql file it gets saved here we'll open a new tab and we'll build our next set of join queries here so these are our base queries and then we'll now do some join queries in order to join the tables you need to understand how the tables are linked up to each other we've got a very very simple database here with just four tables we've got a sales table and the sales table itself has these columns spid go id pid etc and each id column links up to the other table so for example pid here refers to pid in the products table likewise sp id here refers to sp id in the people table and go id here refers to the jio id in my geography table so it's a very simple database and things are kind of mean the same thing in two different tables so using this information our very first query is going to be i just want to look at all the sales data and then see the person name also so for example if i simply say select star from sales we are seeing all the sales data we do have the sp id here sp01 sp02 sp12 here but we don't know exactly who that person is so i want to see their name as well which is not available in this table if i want to see who the sp01 is i would need to go to the people table so select star from people and then we can see that sp01 is bar funny sp02 is denison so really those are the names that i want to see against the actual data so this is where we are going to join sales table with the people table so our first join is going to be select and then this is where really having the from clause ready before writing the select helps so i'm going to say from sales and when you join the tables it's also a good idea to give your tables small names because your original name could be a little bit long so instead of saying sales i'll call this as s so you can directly space and type s or you can say as s this is kind of like a shortcut you just don't have to type there so select from sales s and then we will say s dot and then start typing the values that you want so i want to see sale date s dot amount so for now we'll just do this select s dot sale date as dot amount from sales s if i just run this query alone at this time you'll get sale date and amount this is not enough we also need to know exactly who that person is what is their name so then we want to join this so this is where the word join comes in join and then you will specify the other table join people p on and then on will be where the condition will be specified so it needs to join in such a way that the sales table sp id is same as people table sp id so when the ids match then the names can be extracted in a way this is like you know doing a complex vlookup behind the scenes in excel but sql uses more optimum and better way of handling all of these things so on and then you will kind of specify the mapping columns p dot sp id is equal to s dot sp id so that's the criteria there that's uh and then once you have joined you can also use the columns from this table here so i'm gonna get p dot i think it's called sales person yeah so s dot sale state s dot amount and p dot sales person for fun i'm gonna print s dot s p id as well so we can kind of see what that is then if you want you can also print p dot spid this will be same anyway because we are kind of saying that's how they should be matched and when you run this you will get bar funny sp01 sp01 denison crossfit sp01 sp02 so basically same id is matched and then we'll get their corresponding name here so this is basically how you do the joins you take data from two different tables and then you combine them on one or multiple key columns so these are called key columns basically that means it will be same and the intent and the purpose and meaning are same across both tables and then you can map them out like this as i said there is more to this i'm trying to oversimplify the matters here but this is the join operation when you are joining things you could join them based on multiple ways the easiest is join here you are not really specifying anything sql will do most of it but if you want something more specific you can tell the type of join that you want so the most common type of join that you see in business situation is a left join what it simply means is you've got two tables here you have got the sales table and the person table i'm just gonna do a little bit of drawing here so imagine this is our sales table and then this is our person table so the way i have written our query is we started by writing from sales first so this is my left table and then this is my right table when i say a left join what it means is mysql will try to keep all the data in the sales table because that's the left table and then if there is a matching value in the person table it will bring it over so that's what a left join is in many situations you will simply use a join because the data will be same across both tables if i have got an sp id here for example if i have got sp01 here i will have a matching record in the person's table but occasionally we may not have a matching record there so in that case if i use a join then what happens is because there is no matching record there that data may get excluded so that's where you use the left join to ensure that the data on the left side of the query which is the sales table here is preserved so just as you have left join you also have right join and then there is more complicated wordings of these and variations of this but for better results i recommend that you read a book or an article i will link to some of them in the video description so that you know this all makes sense if you have done some work within excel especially in power query you would notice that power query also has the same wording there it has the left join and right join when you merge the queries let's go back let's try one more type of join before we get into more complicated joins which is just as we are seeing the people person names i want to see the product name that we are selling in these shipments so we'll try a different one this time we'll join with the product table so select from sales s and then s dot sale date s dot amount s dot p id it's showing me this red underlining this is because the previous query is not terminated so that's where the semicolon comes in when you finish a query if you put semicolon there then sql knows that that's where that statement finishes and the new one can begin even though it shows that when you select and run that portion this will run perfectly but it's a good practice to kind of delimit the queries with that semicolon if you are building multiple queries in the same file so select sale date amount and spid from sales and then we will this time we'll use the left join there won't be any difference because the data is all there products and then pr for consistency sake whenever we refer to products we'll say pr and people will say p that way both tables are with p so it's easy for you to distinguish on pr.pid is equal to s dot pid and then now that these are linked up we probably don't need that we are going to just say pr dot product there and and then let's just run this we'll get our product names here nicely printed for the amount so we know that on this day we made 259 dollars for baker's choco chips and then this continues like that so this is my people join and then this is my product stable join we are using left join so if there is a product name within the sales table and a product id in the sales table that doesn't have a matching product id in the product table it will still come up here and then where the name would be it will simply be blank or null because there is no matching value there right now that is not the case it will show for everything just as you are joining two tables you can join multiple tables also so that's what we are going to do next just as we are showing product name here i want to also see both product name and person name in one view so this is where we can kind of combine both of these queries i'll be a little bit lazy i'm just going to copy my query from the top paste it here and then copy this clause paste this here so we are doing two joins it doesn't matter left join join so i'm just going to use join here and then do both of these so we're joining people table products table so now i have access to both so i'm saying p dot sales person we don't need these ids and then we'll say pr.product and then if i want p dot team my auto suggest is a bit slow okay so this is uh that query will run this and then you'll see the result and then it kind of makes sense so it shows me the sales data where the amount sales person product and team are listed some people do not belong to any team so that's why there is blank but otherwise it's all coming up nicely just as you're making the joints you can also add some conditions on top of a join so when you are building this the join itself is like one kind of condition because the ids should match across the tables but you can also introduce some extra criteria if you want so for example we are gonna do the same thing but this time we will add a condition the condition would be where and then we can kind of build any conditions so for example s dot amount greater under 500 and then ctrl enter you will see all the situations where we had mo under 500 amount and then who the sales person is what product it is and which team it is here in the result grid just as you are building the condition for the s dot amount which is the main table here so from clause you can also build a condition for other things so i want to see which where we had under 500 for a specific team so i can say and and then p dot team is equal to and then we'll we can give a team so for example delish and then i'll get all the records where our amount is under 500 within the dallas team and then we get to see their names and the products as well remember some people do not belong to a team so maybe i will try to extract just those people so i'll try another another query i'm going to copy this the reason why i'm copying these queries is because i want to give you a copy of these queries for downloading normally when you are building sql or when you are learning sql you don't need to make so many copies of the queries you can just continue to build and edit the ones that you already have so instead of p dot team is delish i want to see the empty ones one way to do this is you can simply say is null we'll try this i don't know whether they are coded as null or they're coded some other way so i'm gonna try this unfortunately that's not how they are coded in the data if you remember from here this one so here we are not seeing the word null there they are kind of empty so that's why even though they are empty they are not really null they are considered to be blank spaces really if they are null you will actually see the word null printed in the cell so this is one of the nuances of databases a database has if there is really no value then it is called null but even a blank value that's a different perception from a database so this is not working so it's essentially we'll have to do like this and then that will give you all the teams where the team itself is blank but these are the people that had under 500 sale amounts let's build one more join this will be our last join example which is i want this exact data but the geography should be either new zealand or india so we'll copy this again notice that we do not have a join on the geography yet so for now we can't even put that filter so i'm going to add that as another join join geo g on g dot go id is equal to s dot go id you see that i'm not waiting for the auto suggest to come up and i'm typing these values myself this is where knowing your data is really helpful if you know your data well enough then you can build better sql queries so one of the key suggestions that i have for you is anytime you are trying to learn or use sql first spend some time understanding the data model itself what tables are there what columns are there how they are linked up or what goes where then you can build better sql queries so join gog on go id is equal to sjo id and then now i can build my third condition which is end g dot jio in new zealand india so these are all the situations where we had under 500 amount these are the people that's the product that's the team obviously team itself is blank because we kind of make sure that that's the scenario where the amount is actually the shipment actually went to new zealand or india we don't have any indication of that information there like there is no jio column so it's not necessary to show the column on the screen to access it in your where clauses or whatever and needless to say once you finish this job you can also order this information order by and then you can put for example sale date so that you can see these shipments in the chronological order so we had one couple of them in the january of 2021 then we have couple in february three more in april two more in may like that we didn't have anything in march so that's a really interesting way to explore the data and see a precise portion of the data that meets the criteria that you want by building the joins and where clauses and that brings us to the very last part of our video on sql learning itself which is how to use group buys to create report style data sets my wife joe she does a lot more sql work than i do nowadays but we both have fairly solid understanding of sql so i asked her the other day hey i'm building this sql for data and excel people what topic should i include and the first thing that she said is after you cover the basics you should talk about group bias and aggregation so we're gonna do our group buys here what group by does is it lets you create kind of like a pivot report style things using sql query many times all the data that you want is in the database but it is at a too detailed level and i just want to see the data at a higher level that's where the group buys are helpful so we'll try the simple ones first so select and then let's just group the data by the geoid in the sales table itself so we can say geo id this is also helpful when we write the fog from clause first from sales and then jio id and then i want to see sum of amount so that's the sum is the function on the amount column by guid and then group by go id okay so this is how when you run it will kind of make sense basically this is going to create a a simple g1 g2 g3 g4 g5 g6 report telling you what was the total amount for each geography because we are grouping by geoid i can do any operation and then that operation will be happening within that geography so essentially this is like a pivot table imagine you create a pivot in excel where you put geo id into row label area and amount into the values area and it adds up the values so that's what the sum is doing if i have i want for example i can also do average of amount that didn't turn into green so probably it is abg yeah and then i can see what was the average by each geography i could do a sum of boxes and then that will create that so this is a powerful way to create reports for your situations here we are using one table but if you combine this idea with your joints then you can kind of to take data from two three different tables merge everything together to create one complex report so we're going to try that next instead of g1 g2 i want to show the name of the country and then we'll say from sales s join g o g on s dot jio id is equal to g dot geo id so now that we have joined i can select a g dot geo here so that shows up the geography and then group by this we'll try this if it doesn't work then we'll have to come back and change this yep it doesn't work so we will need to have the same option as there and then that will give you so whatever you are grouping by you need to make sure that the the summarization is also happening at that level if you give a different column here that's not going to work correctly so now i'll get the geography here new zealand canada usa india uk and australia and then the total amounts average in boxes this is more helpful than this report this one tells me the data but it doesn't tell me what these g1 g2 g3 are so this is a bit confusing whereas this is more helpful from a user perspective so this is how the group buy really works you can group at a single level or you can create multiple level grouping just as you have you can put multiple items into the row label area of pivot table you can do the same with this as well so we're going to try one more grouping here for the purpose of our second grouping we are going to look at the people table and then get the data from team as well as product table and category so we will take we want to see what is the total amount that is coming in from a team and a category and then understand that so select from sales s and then we will say we'll probably need to build all the joints first then come back and write the fields so join people p on p dot s p i d is equal to s dot s p i id join products pr on pr dot pid is equal to s dot pid so now that all the joints are done we could come back and do this so we'll say pr dot team sorry category p dot team sum of boxes sum of demand group by and then we will have to use these two options because that's what we are displaying on the top so we'll try this sometimes when you write multiple joints and everything you know occasionally you can make a mistake and you won't even see that unfortunately there is no mistake so it's all good we will see what was the total boxes and the amounts for our combination of category and team when you create something like this it's a good idea to apply some sort of a sort order so i'm gonna quickly show you how to apply the order by on this so and then we can add order by and we'll do the same way as here that makes more logical sense and then when you run you'll now get like bars these four teams bite these four and other these four teams so this is uh a more concise and better report than the earlier one it tells you everything and then kind of gives you what you want if you for example do not want to have the null team present there the blank team you can add a where clause as well so for example before you do the grouping you can add the where clause where p dot team not equal to space if it is really null then you would say is not null but because in our data they are not nulls i'll have to use this way and then let's run this so now we'll get even shorter report with that team gone and this is coming up if i want to impose a condition on this to one of these calculations then you would have to use just as you are using where you will have to use the having condition i'm not going to cover having in this video but that's another option that you have in sql we'll conclude our sql learning with one last example wherein i want to show our total amounts by product and then show just the top 10 products i don't want to see everything so we'll try two different way two steps first we'll see every all the products and then we will go and add just the top 10 condition so this is where select from sales s join products pr on pr dot pid is equal to s dot pid and then pr dot product sum of s dot amount so this alone should give you all the products and their amounts and if i for example order by s dot amount dsc we will just say total amount and then we'll see this if this works only one came up it took me a while to figure out what was going on here because i guess i'm completely out of coffee by now but we forgot to add the group by condition so that's why it was not giving the correct result group by pr dot product that should pretty much solve that problem and now we'll see all these values in the descending order of the amount so our highest selling product is after nines and then the lowest selling one is mint chip choco but this is giving me a listing of all the products and i just want to see the top 10 here this is where i'm gonna copy paste this query and we can just say limit 10 and that will basically give you just the top 10 rows of the result so limit is another operation on sql that you can do when you limit the output it's gonna just limit after running the query it'll limit the output to just the 10 rows so that's really how you can get the top 10 values hey i hope you enjoyed that video on sql if you really like it please feel free to share this video with one of your friends or colleagues so they can also learn about sql and if you are thinking this is all good but how do i learn more about sql then please feel free to check the resources section in the video description for my recommended books and courses and other websites and materials on sql i highly recommend that you bookmark these things and then check them out as your time permits so that you can improve your sql knowledge on the other hand if you're wondering where would i use sql what sort of things that i can do now that i have got my data in sql extracted it out then i check i suggest that you check out my video on how to get started with power bi which is a data visualization and analysis software that can use the data that you are producing from sql to build visualizations and reports.

Here is that video i'll catch you there bye.