SQL Data Analysis Projects
In today's session we will talk about the data sets we will use for our SQL programming. Let me introduce you to the two data sets we are going to use today. As you can see, we are using MySQL as our SQL database, in which we have already prepared two tables: one is employee attrition and the other is Olympic events. Over the course of this session we will show you certain queries and analytics that you can perform on the different types of data columns present, and what intelligent inferences you can make out of this data.
We will first talk about employee attrition. Let me first make you aware of the columns, data types and data we have prepared. I'll just open the table with Alter Table first. Okay, so in the employee attrition table there are columns like age, attrition, business travel, daily rate, department, etc.; I'll show you the sample values. Employee ID is the primary key, and it is auto-increment, so whenever a new employee's details are entered in this table, a new ID is generated for the employee automatically by the database. Just to let you know, we are using MySQL Workbench, which is the tool available to interact with the MySQL server.
Now you can see the table. You have the age of the employee; attrition, whether the employee has exited, yes or no; the business travel frequency; the daily rate of the employee; the department to which he or she belongs; distance from home; education field, the education background he or she belongs to; environment satisfaction, the work environment satisfaction, which is feedback survey data that has been accumulated; the gender of the employee; hourly rate; job involvement; job level; his or her role; the level of job satisfaction; marital status; monthly income; monthly rate; the total number of companies the employee has worked at; whether he or she works overtime, yes or no; the salary hike he or she has been given; the performance rating; his or her relationship satisfaction; total number of working years; work-life balance; total number of years at the current company; years in the current role; and years since last promotion. These are important in order to identify whether the employee has received a promotion, on the basis of the salary hike, performance rating, etc. So we will look at what all we can do with this data. And last is employee ID.
Now a basic SQL query: run select star from employee attrition and you get the entire data.
For example, I want a query to find the details of employees in the employee attrition table having five plus years of experience. So we will do a select star from the employee attrition table, but now we will apply filters. The ages we want are between 27 and 35, so the first filter we apply gets the list of employees in the age bracket 27 to 35. We use something called the between clause: between 27 and 35 means 27 and 35 inclusive. So you can see a person with age 27 and also a person with age 35. What else we need is those who have five plus years of experience. There is a column called total working years, and we want anyone with a value greater than or equal to 5, so we'll say total working years greater than or equal to 5.
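The filter just described can be sketched as follows. This is a minimal sketch rather than the exact query from the session: it uses Python's built-in sqlite3 module with an in-memory table so it runs without a MySQL server, and the column names (age, total_working_years) and the sample rows are assumptions. The SELECT uses only standard SQL, so it should carry over to MySQL once the real column names are substituted.

```python
import sqlite3

# In-memory stand-in for the employee attrition table (assumed column names and rows).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_attrition (
        employee_id INTEGER PRIMARY KEY AUTOINCREMENT,
        age INTEGER,
        total_working_years INTEGER
    )
""")
conn.executemany(
    "INSERT INTO employee_attrition (age, total_working_years) VALUES (?, ?)",
    [(27, 5), (35, 10), (30, 3), (41, 20), (26, 6)],
)

# BETWEEN is inclusive on both ends, so ages 27 and 35 both qualify.
filter_query = """
    SELECT employee_id, age, total_working_years
    FROM employee_attrition
    WHERE age BETWEEN 27 AND 35
      AND total_working_years >= 5
    ORDER BY employee_id
"""
matching_rows = conn.execute(filter_query).fetchall()
print(matching_rows)
```

Only the rows that pass both conditions come back; the employee aged 30 is dropped for having 3 working years, and the employees aged 41 and 26 are dropped by the age bracket.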
Now if you look at the total working years column, every value present is greater than or equal to 5. You might also want to check the total count of this data set: you can say select count star and run the query again. So the total is 455 employees. And what is the total count of our data set without any filter? 1470. So 455 out of 1470 employees have five plus years of experience and belong to the age group 27 to 35.
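Comparing the filtered count with the unfiltered count can be sketched like this. As before, this is an illustrative sketch using Python's sqlite3 module, with assumed column names and a handful of made-up rows, so the counts here are not the 455 and 1470 from the session.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition (age INTEGER, total_working_years INTEGER)"
)
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?)",
    [(27, 5), (35, 10), (30, 3), (41, 20), (29, 8), (33, 4)],
)

# COUNT(*) with the same WHERE clause counts only the matching employees.
filtered_count = conn.execute("""
    SELECT COUNT(*)
    FROM employee_attrition
    WHERE age BETWEEN 27 AND 35
      AND total_working_years >= 5
""").fetchone()[0]

# COUNT(*) without any filter gives the size of the whole data set.
total_count = conn.execute("SELECT COUNT(*) FROM employee_attrition").fetchone()[0]

print(f"{filtered_count} out of {total_count} employees match the filter")
```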
So I'll roll this back to my query again. This is one example of a very straightforward filter criterion, and we have used the between clause.
Now let's take a look at another example: fetch the details of employees having the maximum and minimum salary, working in different departments, who have received less than a 13 percent salary hike. There is a column called percentage salary hike, and we need anyone below 13, so we put in the filter criterion of less than 13. Secondly, we want the maximum and minimum salary per department. I'll first break this query up for you. First, let's just filter the set of employees who have a salary hike of less than 13 percent. Okay, everyone who has less than 13 percent is now in my list.
Now we need to use another feature of SQL analytics, called group by. We want the maximum and minimum salary in each department that people with less than a 13 percent salary hike have received. So first of all you need to use a clause called group by department. Group by department means that we are going to get all the data grouped by department. It is just like in Excel when you create a pivot and drag and drop a column into the rows section: in the pivot you are grouping by that particular column. Similarly in SQL, this is very much synonymous with pivoting: you are pivoting the data by department. When you are grouping, you have to use an aggregate function like max, min, sum, average, count, etc. Our question is the maximum and minimum salary, so first I'll ask for the maximum monthly income per department. Okay, so in sales, for the people who have received less than a 13 percent salary hike, the maximum monthly income is 9924, and in research and development it is 9980.
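The grouping step can be sketched as below, again with Python's sqlite3 module standing in for MySQL. The column names (department, monthly_income, percent_salary_hike) and the rows are assumptions made for illustration, so the figures differ from those in the session. The ORDER BY is included only to make the output order deterministic.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_attrition (
        department TEXT,
        monthly_income INTEGER,
        percent_salary_hike INTEGER
    )
""")
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?, ?)",
    [
        ("Sales", 9000, 11),
        ("Sales", 4000, 12),
        ("Sales", 15000, 20),  # excluded: hike is not below 13
        ("Research & Development", 9500, 12),
        ("Research & Development", 2000, 11),
        ("Human Resources", 6000, 12),
    ],
)

# WHERE filters rows first; GROUP BY then yields one output row per department.
income_by_department = conn.execute("""
    SELECT department,
           MAX(monthly_income) AS max_monthly_income,
           MIN(monthly_income) AS min_monthly_income
    FROM employee_attrition
    WHERE percent_salary_hike < 13
    GROUP BY department
    ORDER BY max_monthly_income DESC
""").fetchall()

for row in income_by_department:
    print(row)
```

Note that the 15000 income in Sales never reaches the aggregate, because the WHERE clause removes that row before grouping happens.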
If you want, you can order by max monthly income descending so the highest one comes at the top. So research and development has the maximum monthly income, sales is next, and human resources is next. Now I also want the minimum monthly income, so here you go, you get both the parameters, but we are sorting only by the max. So in research and development, the people who have a salary hike of less than 13 percent have a maximum monthly income of 9980 and a minimum of 10 000. Now, for example, if you want to first do a group by and then perform a filter on the grouped result, you need to use the having clause, like this. So this is an example which showed you how to use group by and having.
Now let's take another example: calculate the average monthly income of all employees who have worked more than three years and whose education background is medical. Similarly, let's first get the details of the employees who have spent more than three years at this company and are from the education field medical. This is the list of all the employees who match this filter criterion. But now I want the average monthly income across all these employees, so I'll group by education field, which is only medical anyway, and use the mathematical function average, AVG of monthly income, and I get the average monthly income, which is 7345. So employees who have spent more than three years in the company and belong to the medical education field have an average monthly income of 7345.
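The average calculation can be sketched as follows. It is an illustrative sketch in Python's sqlite3 module with assumed column names (education_field, years_at_company, monthly_income) and made-up rows, so the result is not the 7345 from the session.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_attrition (
        education_field TEXT,
        years_at_company INTEGER,
        monthly_income INTEGER
    )
""")
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?, ?)",
    [
        ("Medical", 5, 7000),
        ("Medical", 4, 8000),
        ("Medical", 2, 3000),         # excluded: not more than three years
        ("Life Sciences", 10, 9000),  # excluded: different education field
    ],
)

# AVG over the filtered rows; no GROUP BY is needed because the filter
# already restricts the rows to a single education field.
average_income = conn.execute("""
    SELECT AVG(monthly_income)
    FROM employee_attrition
    WHERE years_at_company > 3
      AND education_field = 'Medical'
""").fetchone()[0]

print(average_income)
```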
Whether you do the group by or not doesn't matter here, because the education field is anyway fixed by the filter criterion, so you can discard this group by.
Next: identify the total number of male and female employees under attrition whose marital status is married and who haven't received a promotion in the last two years. So our criterion is, first let me just say select star from employee attrition where marital status is equal to Married and years since last promotion is 2.
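The task stated above, counting male and female employees who match all three conditions, can be sketched as a single query. This is a sketch in Python's sqlite3 module; the column names (gender, marital_status, years_since_last_promotion, attrition), the stored values 'Married' and 'Yes', and the rows are assumptions.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_attrition (
        gender TEXT,
        marital_status TEXT,
        years_since_last_promotion INTEGER,
        attrition TEXT
    )
""")
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?, ?, ?)",
    [
        ("Male", "Married", 2, "Yes"),
        ("Male", "Married", 2, "Yes"),
        ("Female", "Married", 2, "Yes"),
        ("Female", "Single", 2, "Yes"),  # excluded: not married
        ("Male", "Married", 2, "No"),    # excluded: has not left
        ("Male", "Married", 0, "Yes"),   # excluded: promoted recently
    ],
)

# Filter first, then count the surviving rows per gender.
count_by_gender = conn.execute("""
    SELECT gender, COUNT(*) AS employee_count
    FROM employee_attrition
    WHERE marital_status = 'Married'
      AND years_since_last_promotion = 2
      AND attrition = 'Yes'
    GROUP BY gender
    ORDER BY gender
""").fetchall()

print(count_by_gender)
```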
Okay, so these are the people who haven't received a promotion in the last two years: years since last promotion equal to two. This is my data set after the filter criteria. And attrition equal to Yes. Okay, now these are the employees. Now I want the total number of male and female employees, so I have to apply group by: I want a group by gender and the count of male and female employees who fit these criteria. So eight males and two females is the total count of employees who match, where marital status is married, years since last promotion is two, and attrition is yes. So now you should be able to understand when to use group by and when not. Group by is required when you need to do a pivoting, a grouping on the basis of a particular textual column, and then need a relevant calculated measure or number associated with it, using a mathematical function like count, sum, average, etc.
Next: employees with the max performance rating but no promotion for four years and above. Now this is an interesting one. Let's first of all find the maximum performance rating given to an employee in our data set. Let's take a glance at the performance rating column. People have performance ratings of three or four, I believe. So let's see what the maximum performance rating is: it is four, so no one has received greater than four. So I need the list of employees with the maximum performance rating but no promotion for four years and above. First of all I'll write a subquery, select max performance rating from employee attrition, and then in the where clause, performance rating should be equal to this. So if tomorrow there is an employee in my data set who has received a performance rating of 5,
then this query will automatically start returning 5. Then years since last promotion is greater than or equal to 4, and now I can run this query. Okay, so this is the list of employees who have received the maximum performance rating but haven't received a promotion in four or more years.
Next: who has the maximum and minimum percentage of salary hike? I want to check, for sets of employees who have spent a given number of years at the company, with a given performance rating and years since last promotion, what their maximum and minimum percentage salary hike is. For example, what I wanted to show with this data set is that for people with one year at the company, a performance rating of four and years since last promotion of zero, meaning they have been promoted very recently, the maximum salary hike that set of employees has received is 25 and the minimum is 20. This data set is sorted in descending order of maximum percentage salary hike and then ascending order of minimum percentage salary hike, so all these sets of employees have a maximum of 25 but minimums of 22, 25 and so on. The query we have used groups by years at company, performance rating and years since last promotion, and within that finds the max and minimum percentage salary hike. After grouping, we have used the order by clause: first, maximum percentage salary hike descending, meaning highest first and lowest later; then minimum percentage salary hike ascending, lowest first and highest later.
Another very important keyword within SQL is distinct. For example, I want to find the set of distinct departments, you
know, in my office or in my organization. So these are the set of departments to which any employee belongs. So you can use distinct.
Next, take another example: employees working overtime who were given the minimum salary hike and have been more than five years with the company. I want to check whether there are people who are working overtime but were given the minimum salary hike and have already been more than five years with the company. So first I'll find the minimum percentage salary hike given in the firm, which is 11. Now I'll get the list of employees who have done overtime, received the minimum percentage salary hike and have also spent more than five years in the company. This is a pretty small data set. And if I also apply attrition equal to Yes, you can see nine people have left the firm, probably because they were working overtime, were given the minimum salary hike and had already been with the firm for more than five years. So this is a good finding that can be presented to management: this is probably the reason why people are leaving.
Similarly, you can check just by flipping the conditions: who are the people doing overtime who have received the maximum salary hike? The maximum salary hike given in the firm is 25, so these are people who have received a 25 percent salary hike but have spent less than five years in the firm. Okay, great, I get only one person who matches these criteria. So it's a good catch: there is one person, an outlier, an exceptional case, and it could be that the person performed extraordinarily or there is some other reason behind it, so HR or any management person can take a look into it.
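The overtime check can be sketched with a subquery, so the minimum hike does not have to be looked up and typed in by hand. The session looks up the value 11 first; using a subquery here is a substitution on my part, in the same spirit as the max performance rating subquery described earlier. It is a sketch in Python's sqlite3 module, and the column names (over_time, percent_salary_hike, years_at_company, attrition) and rows are assumptions.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_attrition (
        employee_id INTEGER PRIMARY KEY AUTOINCREMENT,
        over_time TEXT,
        percent_salary_hike INTEGER,
        years_at_company INTEGER,
        attrition TEXT
    )
""")
conn.executemany(
    "INSERT INTO employee_attrition "
    "(over_time, percent_salary_hike, years_at_company, attrition) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Yes", 11, 6, "Yes"),  # matches every condition
        ("Yes", 11, 8, "No"),   # still with the firm
        ("Yes", 11, 3, "Yes"),  # not more than five years
        ("No", 11, 7, "Yes"),   # no overtime
        ("Yes", 25, 2, "No"),   # maximum hike, not minimum
        ("Yes", 14, 9, "Yes"),  # hike above the minimum
    ],
)

# The subquery computes the minimum hike across the whole table, so the
# outer query keeps working if the minimum changes later.
leavers = conn.execute("""
    SELECT employee_id
    FROM employee_attrition
    WHERE over_time = 'Yes'
      AND percent_salary_hike = (SELECT MIN(percent_salary_hike)
                                 FROM employee_attrition)
      AND years_at_company > 5
      AND attrition = 'Yes'
""").fetchall()

print(leavers)
```

Flipping the conditions, as described above, would mean swapping MIN for MAX and changing the years comparison to less than five.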
Similarly, people who have not done overtime but have received the maximum salary hike. Again this is also a good catch: they were not doing overtime but they received the maximum salary hike, and they have spent less than five years in the firm. So that's also a good thing to look at: why were these people given that hike?
Another one: what is the maximum and minimum relationship satisfaction by marital status? This is pretty straightforward: four and one. So you can perform all different kinds of analytics: you can write queries, derive inferences and intelligence out of your data, and mine information. So it's a very powerful tool.
Now let's take a look at our second data set, which is Olympic events. In this data set we have columns like ID, name of the Olympian, sex, age, height, weight, the team which he or she belongs to, the code of the country, the games, the year in which the Olympics were held, the season, whether it was a Summer Olympics or Winter Olympics, the city where the events were held, the sport, the name of the event, and whether the participant won a gold, silver or bronze or did not win any medal, shown as NA, not applicable. So it's an interesting data set; let's see what queries we can fire at it and what intelligent inferences we can make out of it. This is the basic query: select star from your schema name dot the table name.
The first query I want is: write an SQL query to find in which sport or event India has won the highest number of medals. It could be gold, silver or bronze, but not NA. So I want a count of the total medal tally of India for each event in which it has participated, and I want it from the highest to the lowest medal tally. So now I'll first break up this query for you.
Let's first say select star from Olympic events where team is equal to India. This is the entire filtered data where India has participated, and these are the participants. Now, using not equal to, I am excluding all those events where India has not won a medal. Okay, so those events are now excluded, and India has medals: bronze, gold and silver. Now I want a count of all these medals per event, so it makes sense to do a group by event. When I am doing a group by, in my select clause I can have the event or any aggregation or number crunching I want to do. My goal is to get the medal tally, so group by event and get the count of medals. So India has won 28 medals in men's hockey, one in shooting, rifle, and one in wrestling, men's wrestling. So 30 is the total count, and this is the breakup. By default you can see the data is already sorted, but you can also put in a sorting clause, order by count of medals descending, and you get the same result: 28, 1, 1.
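The medal tally per event can be sketched as below, using Python's sqlite3 module in place of MySQL. The column names (name, team, event, medal), the use of the string 'NA' for no medal, and the rows are assumptions, so the counts are not the 28, 1, 1 from the session.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE olympic_events (name TEXT, team TEXT, event TEXT, medal TEXT)"
)
conn.executemany(
    "INSERT INTO olympic_events VALUES (?, ?, ?, ?)",
    [
        ("Athlete A", "India", "Hockey Men's Hockey", "Gold"),
        ("Athlete B", "India", "Hockey Men's Hockey", "Gold"),
        ("Athlete C", "India", "Hockey Men's Hockey", "Silver"),
        ("Athlete D", "India", "Shooting Men's Rifle", "Silver"),
        ("Athlete E", "India", "Wrestling Men's Freestyle", "NA"),  # no medal
        ("Athlete F", "France", "Hockey Men's Hockey", "Bronze"),   # other team
    ],
)

# Keep only India's medal-winning rows, then count them per event.
india_medals = conn.execute("""
    SELECT event, COUNT(medal) AS medal_count
    FROM olympic_events
    WHERE team = 'India'
      AND medal <> 'NA'
    GROUP BY event
    ORDER BY medal_count DESC, event
""").fetchall()

print(india_medals)
```

The explicit ORDER BY matters: without it the database is free to return the groups in any order, even if the result happens to look sorted.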
That's a good inference. If you want a year-wise breakup as well, you can use the year column here and say, for example, year and event. Let's see what we get. Okay, so in 1932 men's hockey we got the maximum medals, then 1952, then 1928. We have sorted by medal count descending, highest to lowest. Or if you want to order by year, here you have the latest first, from 2012 to 1928, which is the data sorted by year descending, or you can go oldest to latest, which is 1928 to 2012. So all the options are available; it depends on the problem statement you want to solve.
Next example: identify the sport or event which was played most consecutively in the Summer Olympic Games. We want only the Summer Olympics, so we say Olympic events where season is equal to Summer. Okay, now I have filtered that data. Then group by event and get the count for that particular event. Now I want to sort it, so again say order by event count descending. So men's football is the most played event, at 1545, then hockey, basketball, water polo, cycling, rowing, athletics men's 400 meter relay, 100 meter. So this is the descending order of the events.
Next example: write an SQL query to fetch the details of all the countries which have won the most silver and bronze medals and at least one gold medal. This is interesting: we first want separate counts for silver, bronze and gold. So we will take the example of a case statement. In SQL, what I first do is query the Olympic events table and check the medal for each team or player: if someone has won silver then 1 else 0; if someone has won bronze then 1 else 0; if someone has won gold then 1 else 0. So I have created three separate columns on the basis of whether the person won silver, bronze or gold. I'm going to run this query independently, and if you look at the end, three new columns have been introduced for gold, silver and bronze. For example, this player has won gold, so he has a one under gold; someone has won bronze, so there's a one here. Now I'll use these three columns to do a sum and get the total tally of silver, bronze and gold separately. If you observe, I have put this in a subquery, and on top of it I am now writing another query, which is select
team, sum of the silver column, sum of bronze, sum of gold, and I am grouping by team. So now, if you see, China has won 32 silver, 27 bronze, 41 gold; Denmark, etc. Now I have each team's medal tally of silver, bronze and gold. But I have to add the condition that I only want teams which have won at least one gold, meaning the sum of gold should be greater than zero. So I will use my having clause: after doing my grouping I apply the having clause, which removes all the teams which have gold equal to zero. If I apply this, there is no team left in my data set with zero golds; every team has at least one gold. Now the idea is to find which has won the most silver and bronze medals, so I have to sort by silver. Let's do that. So now you see the United States has won the most silver and bronze medals and has at least one gold. This is the actual output we want. Next are France and Italy. And if you go to the bottom, you see Scotia or China 2, or, to take other examples, Uzbekistan, Ireland, Indonesia, with only one gold and one silver, which is why they are at the bottom of the table; Uzbekistan, four gold, but they all have at least one gold.
Next, another very obvious question that might come to mind: which player has won the maximum number of golds? Similar to the previous query, we will again create a column with a case statement that gives a count of only gold, and on top of it I'll group by the name of the player rather than the team and sort by the sum of gold. See, Matthew Nicholas has won eight golds, Ole Einar eight, Usain Bolt eight golds, then Nikolai seven, Viktor, all these. So this is the sorted order of your gold tally.
Next: which sport has the maximum events? Very simple: group by sport, get a count star and then sort. Athletics has the maximum events, then gymnastics, swimming.
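The case statement, the sum on top of it and the having filter can be sketched together as follows. This is a sketch in Python's sqlite3 module rather than the exact query from the session; the column names (team, medal), the medal values and the rows are assumptions, and Kenya is a made-up example of a team without gold.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE olympic_events (team TEXT, medal TEXT)")
conn.executemany(
    "INSERT INTO olympic_events VALUES (?, ?)",
    [
        ("United States", "Gold"), ("United States", "Silver"),
        ("United States", "Silver"), ("United States", "Bronze"),
        ("France", "Gold"), ("France", "Silver"), ("France", "NA"),
        # No gold for this team, so HAVING removes it despite three silvers.
        ("Kenya", "Silver"), ("Kenya", "Silver"),
        ("Kenya", "Silver"), ("Kenya", "Bronze"),
    ],
)

# Inner query: one 0/1 flag column per medal type (the CASE step).
# Outer query: sum the flags per team, keep teams with at least one gold,
# and sort by silver and then bronze, highest first.
medal_tally = conn.execute("""
    SELECT team,
           SUM(silver) AS total_silver,
           SUM(bronze) AS total_bronze,
           SUM(gold)   AS total_gold
    FROM (
        SELECT team,
               CASE WHEN medal = 'Silver' THEN 1 ELSE 0 END AS silver,
               CASE WHEN medal = 'Bronze' THEN 1 ELSE 0 END AS bronze,
               CASE WHEN medal = 'Gold'   THEN 1 ELSE 0 END AS gold
        FROM olympic_events
    ) AS medal_flags
    GROUP BY team
    HAVING SUM(gold) > 0
    ORDER BY total_silver DESC, total_bronze DESC
""").fetchall()

for row in medal_tally:
    print(row)
```

The gold condition has to go in HAVING rather than WHERE, because it tests the per-team sum, which only exists after grouping.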
And again a similar question: which year has seen the maximum number of events? In 1992 there were the maximum events, then 1988, then 2016, then 2000. So you can see 1992 had the maximum number of events played. So these are the kinds of intelligent inferences and SQL mining you can do on this data set.
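The count per year can be sketched as below; the count per sport follows the same pattern with sport in place of year. It is a sketch in Python's sqlite3 module with assumed column names (year, sport, event) and made-up rows.

```python
import sqlite3

# Assumed schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE olympic_events (year INTEGER, sport TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO olympic_events VALUES (?, ?, ?)",
    [
        (1992, "Athletics", "Athletics Men's 100 metres"),
        (1992, "Athletics", "Athletics Men's 100 metres"),
        (1992, "Swimming", "Swimming Men's 100 metres Freestyle"),
        (1988, "Athletics", "Athletics Men's 100 metres"),
        (1988, "Gymnastics", "Gymnastics Men's Rings"),
        (2016, "Swimming", "Swimming Men's 100 metres Freestyle"),
    ],
)

# COUNT(*) counts rows per year; COUNT(DISTINCT event) counts each event once.
rows_per_year = conn.execute("""
    SELECT year,
           COUNT(*) AS row_count,
           COUNT(DISTINCT event) AS distinct_events
    FROM olympic_events
    GROUP BY year
    ORDER BY row_count DESC, year DESC
""").fetchall()

print(rows_per_year)
```

If the table holds one row per athlete entry, which is an assumption about the data set, COUNT(*) measures entries rather than events, and COUNT(DISTINCT event) is the closer answer to the question of how many events were held.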
Now if you see the total working experience column working years column any one which is more than equal to 5 is a present now you might want to check what is the total count of this whole you know data set you can say total select count star and run the query again so total 455 employees and what is the total count of our data set without any filter one four seven zero so 455 out of one four seven zero are the employees which are having five plus years of experience and belong to age group 27 and 35.
So I'll roll this back and I'll find my query again now this is one example of a very straightforward filter criteria and we have used the between clause now let's take a look at another example fetch the details of employees having maximum and minimum salary working in different departments and who have received less than 13 salary hike now less than 13 percent salary hike so there is a column called percentage salary hike so we need someone who is less than 13 so we will put the filter criteria of less than 13 now secondly we want the maximum and minimum salary per Department in different department so first of all what we have to do is I'll first break this query for you guys so first let's just filter the set of employees who are having salary hike less than 13 percent okay everyone who has less than 13 percent is now in my list now now we need to use our another feature of SQL analytics is called Group by now we want what is the maximum and minimum salary in each department which people with less than 13 salary hike have received so first of all you need to use a class called as Group by department now Group by Department means that we are gonna get all the data grouped by Department it is just like in Excel when you create a pivot the and in the pivot when you drag and drop a column in the rows section so basically in the pivot you're trying to group by that particular column similarly in SQL this is a very synonymous to pivoting you're pivoting the data by department now when you are pivoting basically in a group grouping by here you have to use a analytical function like Max Min sum average count Etc so here now our question is maximum and minimum salary so first what I'll do is I'll say what's my maximum monthly income for example and per Department let's just take first example per Department okay so in the sales people who have received less than 13 salary hike their maximum monthly income is 9924 and research and development 9980.
You want you can order by Max monthly income descending so the highest one will come at the top okay so research and development has the maximum monthly income uh sales the next and Human Resources next now I also want minimum monthly income so here you go you will get both the parameters but we are sorting only by the max so in research and development okay the people who have uh less than uh salary hike of 13 percent have maximum monthly income of 9980 and minimum of 10 000. okay now for example you want to first do a group by and then perform a filter then you need to use a clause having and then like this okay so this is an example which showed you how you use Group by and having now let's take another example calculate the average monthly income of all employees who worked more than three years and whose education background is Medical okay so now similarly what we have to do is first let's first get the details of the employees who are who have spent more than three years at this company and are from education field medical now these are this this is the list of the all the employees okay who have who belong to this filter criteria but now I want the average monthly income across all these employees who fit into this filter criteria so I'll do by Group by education field and anyways it's only medical so I have to use a function mathematical function average AVG monthly income and I will get the average monthly income which is 7345 so basically people employees who have spent more than three years in the company and belong to the medical education field have an average monthly income of 7345. 
you want to do group by or not it doesn't matter because anyways it's a filter criteria so you can discard this group by okay now next identify the total number of male and female employees under attrition whose marital status is married and haven't received promotion in last two years okay so now our criteria is select first let me just say select star from employee attrition whose marital status is equal to Merit and year since last promotion is 2.
Okay so who haven't received promotion since last promotions in last two years equal to two now this is my data set now after the filter criteria and attrition equal to yes okay now these are the employees now I want the total number of male and female employees so now I have to apply Group by because I now want a group by gender a group by gender and the count of male and female employees who belong who fit into this criteria so eight male and two females are is the total count of the employees who belong to this criteria where their marital status is married years since last promotion is only two and attrition is yes okay so now you should be able to understand where when to use Group by and when not Group by is required when you need to do a pivoting or you know a group by basis a particular textual column and then you need a relevant calculated measure or number associated with it using a mathematical function like count some average Etc next employees with Max performance rating but no promotion for four years and above okay now this is an interesting thing now let's first of all find in our data set what is the maximum performance rating which has been given to a particular employee first let's check our data set let's take a glance at the data set performance rating okay so people have performance trading as three four five Etc three or four I believe so let's see what is the maximum performance rating so maximum performance rating is four okay so no one has received greater than four so I need uh the list of employees with maximum performance rating as four uh sorry at maximum performance rating but no promotion for four years and above okay so first of all what I'll do is I'll write a sub query where I'll check like this above select Max performance rating from employee attrition and then I'll in the where Clause of performance rating should be equal to this right so if tomorrow in my data set there is a employee who is received a performance rating of 5 
so then this query will automatically start returning five okay then what is the ER since last promotion is greater than equal to 4 and then now I can run this query okay so now you can see uh this is the list of employees who have received maximum performance rating but haven't received promotion is uh more than four years greater than equal to four years okay now next who has maximum and minimum percentage of salary hike okay so now I want to check who which particular set of employees who have received spent x amount of years at company have a good perform have a performance rating and year since last promotion but what is their maximum and minimum percentage salary hike so for example what I wanted to show with this data set is that people with one year at company performance rating has four years since last promotion is zero uh means they are very recently you know been promoted uh but uh their salary hike maximum salary hike which they got which those set of employees have received is 25 and minimum is 20 uh right and years at company similarly so this data set is actually sorted by uh descending order of Maximum percentage of salary hike and then ascending order of minimum percentage of salary height so anyone who has all these set of employees who have maximum of 25 but minimum of 22 25 right so the the criteria the query which we have used is we have grouped by years at company performance rating years since last promotion and within that we have found out what is the max and minimum percentage of salary hike now after grouping it we have reused order by clause in the order by first we are descending means highest first lowest uh later of Maximum percentage salary high and then after you have the Sorting of Maximum minimum percentage salary hike should be in ascending lowest first and higher later okay now another very important uh keyword within SQL is distinct so for example I want to find out what are the set of distinct uh departments in my uh uh you 
know in my office or in my organization so these are the set of departments to which anyone any employee belongs to so you can use distinct now next take another example employees working overtime but get given minimum salary hike and are more than five years with company okay now I want to check that there are people who are working overtime but they are given minimum salary hike and already are more than five years with the company right so similarly first I'll try to find out what is the minimum percentage of salary hike uh given in The Firm oh sorry which is 11 okay so now first I'll get this list of employees who have spent over time who have done over time and receive minimum uh percentage salary height and are also spend more than five years in the company so this is a pretty small data set and if and even if I apply and attrition equal to s okay so you can see uh one two three four five six seven eight nine people have left the firm because of probably uh they spent they were spending over time they were given minimum salary hike and they were already with the firm for the last five years so this is a good uh finding which can be presented to the management that you know these are the probably the reason why people are leaving now similarly uh you can also check just by flipping the uh conditions okay people are who are the people who are doing over time and have received maximum salary hike now so now if I want to check what is the maximum salary hike which has been given in The Firm is 25 so people have received 25 salary hike but I have only spent less than five years in The Firm okay so okay great I only get one person who belongs to this criteria so it's a good catch okay you can check there is a there is a one person there is an outlier there's an exceptional case and probably it could be that the person have performed extraordinarily or uh what what is the other reason behind so you can the HR or any management person can take a look into it okay and 
Similarly, take people who have not done overtime but have received the maximum salary hike. Again, this is also a good catch: they were not doing overtime, but they received the maximum salary hike and have spent less than five years in the firm. So that's also a good thing to look at: why were these people given a hike?
Another one: what is the maximum and minimum relationship satisfaction for people on the basis of marital status? This is pretty straightforward: four and one. So you can perform all different kinds of analytics, you can write queries, derive inferences and intelligence out of your data, and mine information. It's a very powerful tool.
Now let's take a look at our second data set, which is Olympic events. In this data set we have columns like ID, name of the Olympian, sex, age, height, weight, the team which he or she belongs to, the code of the country, the games, the year in which the Olympics were held, the season (whether it was a Summer Olympics or Winter Olympics), the city where the events were held, the sport, the name of the event, and whether the participant won a gold, silver, or bronze, or did not win any medal, which is NA, not applicable. So it's an interesting data set; let's see what queries we can fire on it and what intelligent inferences we can make out of it. This is the basic query: SELECT * FROM your schema name dot the table name.
Now the first query: write an SQL query to find in which sport or event India has won the highest number of medals. It could be gold, silver, or bronze, but not NA. So I want a count of the total medal tally of India for each event in which it has participated, and I want the medal tally from highest to lowest. I'll first break up this query for you.
Let's first say SELECT * FROM Olympic events WHERE team is equal to India. Now this is the filtered data where India has participated, and these are the participants. Next I am excluding, with a not-equal-to condition on the medal column, all those events where India has not won a medal. So now India has medals: bronze, gold, and silver. I want a count of all these medals per event, so it makes sense for me to do a GROUP BY event. Once I am doing a GROUP BY, in my SELECT clause I can have event plus any aggregation or number crunching which I want to do. My goal is to get the medal tally, so group by event and get the count of medal. India has won 28 medals in men's hockey, one in shooting rifle, and one in men's wrestling. So 30 is the total count, and this is the breakup. By default you can see the data is already sorted, but you can also put a sorting clause, ORDER BY count of medal descending, and you will get the same result: 28, 1, 1.
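The filter, group, and sort steps just walked through can be sketched like this. It is a hedged illustration only: it runs on an in-memory SQLite database rather than MySQL, the table and column names (olympic_events, team, event, medal) are assumed from the spoken description, a missing medal is assumed to be stored as the text 'NA', and the rows and event names are made up.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server used in the session.
conn = sqlite3.connect(":memory:")

# Assumed (hypothetical) subset of the Olympic events columns.
conn.execute("""
    CREATE TABLE olympic_events (
        name  TEXT,
        team  TEXT,
        year  INTEGER,
        event TEXT,
        medal TEXT
    )
""")

# Made-up rows; 'NA' is assumed to mean "no medal won".
rows = [
    ("Player A", "India",  1928, "Hockey Men's Hockey",  "Gold"),
    ("Player B", "India",  1932, "Hockey Men's Hockey",  "Gold"),
    ("Player C", "India",  1952, "Hockey Men's Hockey",  "Bronze"),
    ("Player D", "India",  2012, "Shooting Men's Rifle", "Silver"),
    ("Player E", "India",  2012, "Wrestling Men's",      "NA"),
    ("Player F", "France", 2012, "Hockey Men's Hockey",  "Gold"),
]
conn.executemany("INSERT INTO olympic_events VALUES (?, ?, ?, ?, ?)", rows)

# Step 1: keep India only. Step 2: drop rows without a medal.
# Step 3: count medals per event. Step 4: sort highest to lowest.
india_tally = conn.execute("""
    SELECT event, COUNT(medal) AS medal_count
    FROM olympic_events
    WHERE team = 'India'
      AND medal <> 'NA'
    GROUP BY event
    ORDER BY medal_count DESC
""").fetchall()

print(india_tally)
```

The explicit ORDER BY is what guarantees the highest-to-lowest order; a GROUP BY on its own may happen to come back sorted, but that is not something to rely on. Adding year to both the SELECT list and the GROUP BY gives the year-wise breakup.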
So it's a good inference. If you also want a year-wise breakup, you can use the year column here and say, for example, year and event. Now let's see what we get. In 1932 men's hockey we got the maximum medals, then 1952, then 1928. You can sort either way: here we have sorted descending, from the highest medal count to the lowest, or if you order by year descending you have the latest first, from 2012 to 1928, or you can go from the earliest, which is 1928 to 2012. So all the options are available, but we have to see what the problem statement is which
you want to solve.
Next example: identify the sport or event which was played most consecutively in the Summer Olympic Games. Now we want only the Summer Olympics, so what we will do is query Olympic events WHERE season equal to Summer. Now I have filtered that data; then select event, GROUP BY event, and get the count of that particular event. Now I want to sort it, so again say ORDER BY event count descending. Men's football is the most played event, with 1545 entries, then hockey, basketball, water polo, cycling, rowing, athletics men's 400 metre relay, 100 metres. So this is the descending order of the events.
Next example: write an SQL query to fetch the details of all the countries which have won the most silver and bronze medals and at least one gold medal. This is interesting: first we want separate counts of the medal tally for silver, bronze, and gold. So we will take an example of a CASE statement. In SQL, what I have to do first is query the Olympic events table and check, for each team or player: if someone has won silver, then 1, else 0; if someone has won bronze, then 1, else 0; if someone has won gold, then 1, else 0. So I have basically created three separate columns on the basis of whether the person has won silver, bronze, or gold. Now I'm going to run this query independently, and if you see, at the end I have three new columns introduced for gold, silver, and bronze. For example, this player has won gold, so he has a one under gold; someone has won bronze, so there's a one here. Now I'll use these three columns to do a SUM to get the total tally of silver, bronze, and gold separately. If you observe, I have put this in a subquery, and on top of this query I am writing another query, which is SELECT
team, SUM of the silver column, SUM of bronze, SUM of gold, and I am grouping by team. So now if you see, China has won 32 silver, 27 bronze, 41 gold; then Denmark, etc. Now I have the medal tally of silver, bronze, and gold per team. But I have to put a condition that I only want the teams which have won at least one gold, meaning the sum of gold should be greater than zero. So now I will use my HAVING clause: after doing my grouping, I apply the HAVING clause, which removes all the teams which have gold equal to zero. If I apply this, there is no team left in my data set which has zero golds; at least one gold is there. Now the idea is to find which team has won the most silver and bronze medals, so I have to sort by silver. Let's do that. Now if you see, the United States has won the most silver and bronze medals and has at least one gold, so this is the actual output we want. Next is France, then Italy. And if you go to the bottom, you see Scotia or China-2, or, taking another example, Uzbekistan, Ireland, Indonesia: only one gold and one silver, so that's why they are at the bottom of the table. Uzbekistan, four gold; but at least they have one gold.
Next, another very obvious question which might come to mind: which player has won the maximum number of golds? Similar to the previous query, we will again create a column with a CASE statement that gives me a count of only gold, and then on top of it I'll use the name of the player, GROUP BY name rather than team, and sort by sum of gold. See, Matthew Nicholas has won eight golds, Ole Einar eight, Usain Bolt eight, then Nikolai seven, Victor, and so on. So this is the sorted order of your gold tally.
Next: which sport has the maximum events? Very simple: GROUP BY sport, get a COUNT(*), and then sort. Athletics has the maximum events, then gymnastics, swimming.
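The CASE-based medal tally with the HAVING filter described above can be sketched as follows. Treat it as an illustration under assumptions rather than the query shown on screen: it uses in-memory SQLite instead of MySQL, the names olympic_events, team, and medal are assumed, the team names and rows are made up, and bronze is added as a tie-breaker in the sort, which the session does not mention.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server used in the session.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE olympic_events (name TEXT, team TEXT, medal TEXT)")

# Made-up rows for three hypothetical teams.
rows = [
    ("A", "Team X", "Gold"),
    ("B", "Team X", "Silver"),
    ("C", "Team X", "Silver"),
    ("D", "Team X", "Bronze"),
    ("E", "Team Y", "Silver"),
    ("F", "Team Y", "Silver"),
    ("G", "Team Y", "Silver"),  # most silvers, but no gold
    ("H", "Team Z", "Gold"),
    ("I", "Team Z", "Bronze"),
    ("J", "Team Z", "NA"),      # no medal
]
conn.executemany("INSERT INTO olympic_events VALUES (?, ?, ?)", rows)

medal_table = conn.execute("""
    SELECT team,
           SUM(silver) AS total_silver,
           SUM(bronze) AS total_bronze,
           SUM(gold)   AS total_gold
    FROM (
        -- Inner query: one 0/1 flag column per medal type.
        SELECT team,
               CASE WHEN medal = 'Silver' THEN 1 ELSE 0 END AS silver,
               CASE WHEN medal = 'Bronze' THEN 1 ELSE 0 END AS bronze,
               CASE WHEN medal = 'Gold'   THEN 1 ELSE 0 END AS gold
        FROM olympic_events
    ) AS flags
    GROUP BY team
    HAVING SUM(gold) > 0            -- keep teams with at least one gold
    ORDER BY total_silver DESC, total_bronze DESC
""").fetchall()

print(medal_table)
```

HAVING is used instead of WHERE because the condition applies to the aggregated gold total, which only exists after the grouping. Swapping team for name in the inner and outer queries, and sorting by the gold total, gives the per-player gold ranking.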
And again a similar question: which year has shown the maximum number of events? In 1992 there were the maximum events, then 1988, then 2016, then 2000. So these are the kind of intelligent inferences and SQL mining you can do on this data set.