Python for Data Analysis


Hello everyone, this is Saurabh from Edureka, and in today's session we'll be focusing on data analysis with Python.

So let us move forward and have a look at the agenda for today. First we'll see various applications of Python. After that we'll understand the data life cycle, starting from data warehousing till data visualization. Then we'll focus on data analysis and see how we can use Python for that purpose. We'll also look at what the pandas library is, and we'll understand a bit about NumPy and SciPy. Then we'll focus on various pandas operations, merging, joining, all those things, and after that we'll see Python for simple statistics and Python for Hadoop.

So till now, any doubts? Are you guys clear with the agenda? If you have any questions or doubts, you can write them down in your chat box and I will be happy to help you. All right, so Jason says all clear, and so do Dave and Jessica. What about the others? All right, Ayushi says all clear, Siddharth says move on, Nia says go on. All right, thank you guys.

So we'll move forward and see the various applications of Python. These are the applications of Python. I've listed only four of those, although there are many more. You can perform web scraping with Python, that is, you can extract certain contents from a particular webpage. You can perform web development, you can perform testing, and you can perform data analysis. For today's session we'll be focusing on the data analysis part of Python. So are we clear, guys?

So guys, let us move forward and see what exactly the data lifecycle is. This is the data lifecycle, guys. Over here, data is stored in different formats: we can have a CSV file, an Excel file or an HTML file. So data is basically stored in different formats.

Now what do you do? You convert or transform that data into a single format and you store it somewhere. That's where data warehousing comes into the picture. Once you have stored your data, you can perform certain analysis on it: you can perform predictive modeling, you can join and merge data, and various other things that we are going to see in today's session. Once you have done the analysis, you can even plot it in the form of a graph, and that stage is called data visualization. So this is just a general overview of the data lifecycle. If you have any doubts or questions, you can write them down in your chat box. And let me tell you guys, in today's session we'll be focusing only on data analysis. Any questions, guys? All right, so Ashish says he is clear, and so do Jason, Devon, Jessica, Janice, Ayushi and Nia. All right, fine guys.

So let's move forward and understand what exactly data analysis is. Let us understand data analysis with the help of the example that is there in front of your screen. Over here we have a data set with data about unemployed youth across the globe. So country-wise, from 2010 to 2014, we have the percentage of youth that is unemployed within that particular country. Now what if I want to look at only one particular country, say Afghanistan in this example, and in that country I want to find the percentage increase in unemployed youth between 2010 and 2011? What should I do? Basically, I need to perform certain analysis on this data set, and that analysis should give me the percentage increase in unemployed youth in Afghanistan between 2010 and 2011. So this basically explains what data analysis is and why we use it. If you have any doubts with respect to what data analysis is, you can write them down in your chat box. It is a very simple concept, guys. I don't think there should be any doubts, but still, if you have anything in your mind, you are free to ask.

All right, so I've got confirmation from almost everyone, we have no doubts as such. So let us move forward and understand how you can actually perform data analysis with Python. Basically, to perform data analysis with Python you need to import a particular module which is called pandas. So let us discuss pandas in the upcoming slides. What is pandas?

Pandas is a software module written for the Python programming language which is used for data manipulation and data analysis. It can do that at a fairly high performance compared to other Python procedures. Pandas is built on top of NumPy, and it works closely with SciPy and matplotlib. Matplotlib is basically a data visualization module that we use in Python. Now when we talk about NumPy and SciPy: NumPy is a fundamental package for scientific computing in Python. It contains a powerful n-dimensional array object, it has tools for integrating with C and C++, and it is very useful for linear algebra, Fourier transforms, random-number capabilities, etc. When we talk about SciPy, SciPy is again an open-source Python module used for scientific and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, Fourier transforms, all those things.

We'll actually focus on NumPy and SciPy in the next session. For this session we'll be focusing only on pandas, and we have a separate session on matplotlib as well, where I'll teach you exactly how to perform data visualization using matplotlib. So if you have any doubts till now, you can ask me. Any doubts, guys? So we have no questions till now.

So I'll open my PyCharm and show you how to import this pandas library and how to create a data frame. So guys, over here what I need to do is first import the pandas module. For that I'll type import pandas as pd. That's what people usually keep, so let's follow the convention. Now I'm going to define a dictionary that contains data about, say, my website. We can have columns like day, visitors and bounce rate, so let's go ahead with that. I'm going to name my dictionary XYZ_web. My first key will be Day, and it will have a list of days: 1, 2, 3, 4, 5 and 6. My next key-value pair will be about visitors, so I'll write Visitors. On day one, say we had around 1,000 visitors, then we had 700, then say 6,000, then around 1,000 again, then 400, and then say 350. The next key-value pair that I'm going to add is the bounce rate. Bounce rate is nothing but the share of people who have visited your website but have left it immediately, which is not good for any website. So, Bounce_Rate. The bounce rate will be around, say, 20, again 20 for the second day, then make it 23, then say 15, 10 and 34.

Now I'm going to close this dictionary and convert it into a pandas data frame. How do we do that? Let me first declare a variable, say df, and over here I'm going to type pd, for pandas, so pd.DataFrame, and the name of my dictionary, which is XYZ_web. Now go ahead and print this data frame and we'll see what happens. So yep, it has converted our dictionary into a data frame. What columns do we have? We have Bounce_Rate, we have Day, and we have Visitors. So this is a very basic introductory example for you all, guys, to show you how to make data frames using the pandas library.
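The dictionary-to-data-frame step just described can be sketched as follows. This is a minimal sketch, assuming the key spellings Day, Visitors and Bounce_Rate and the variable name xyz_web; the values are the ones read out in the session.

```python
import pandas as pd

# Dictionary of lists: each key becomes a column, each list holds that column's values.
# Key spellings and the variable name are assumptions based on what was read out.
xyz_web = {
    "Day": [1, 2, 3, 4, 5, 6],
    "Visitors": [1000, 700, 6000, 1000, 400, 350],
    "Bounce_Rate": [20, 20, 23, 15, 10, 34],
}

# Convert the dictionary into a pandas DataFrame and print it.
df = pd.DataFrame(xyz_web)
print(df)
```

Recent pandas versions keep the columns in the order the keys were written. Older versions sorted dictionary keys alphabetically, which would explain the Bounce_Rate, Day, Visitors order seen in the recording.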

I hope you all are clear till now. If you have any doubts or queries, you can write them down in your chat box. Ashish, Jason, Devon, anyone, any questions? Jessica, Jyoti, Ayushi, if there are any questions, please write them down in your chat box. So we have no questions till now.

So I will open my slides and we'll move forward and have a look at the various operations that you can perform on a pandas data frame. These are the operations: you can slice the data frame, that is, if you want only a particular part of that data frame you can do that. You can change the index values. You can convert the data into a different format. You can change the column headers. You can perform concatenation of multiple data frames, and you can even perform joining and merging of two or more data frames. These are all the basic operations that you can perform with pandas, so we'll move forward and have a look at them one by one.

First we'll look at slicing. Over here we have data in which there is an index, which is nothing but the year: 2001, 2002, 2003 and 2004. Here we have the interest rate and here we have the US GDP in thousands. Now I want to slice a particular part of this data frame. What will happen if I do that? When I slice only the starting two rows, it will give me only till 2002, but when I slice the last two rows it will give me only 2003 and 2004. So this is how you can perform slicing.

Let me show you guys practically how to do that. This is our data frame, guys, and over here if I only want, say, the starting two rows, then instead of print(df) I can write print(df.head()), and since I want the starting two rows I'll keep 2 here. We'll see what happens when I execute this. So yep, there are only two rows present. This is how you can print only a part of the data. And if I want only the last part of the data, that is the last two rows, I can change this to tail instead. Go ahead and execute this, and yep, you can see that it has printed the last two rows. So this is how you can perform slicing. If you have any questions or doubts, you can ask me right now. Any questions, guys? All right, so we have no questions.

So I'll move forward and we'll look at the other operations that you can perform with pandas. After slicing, we are going to talk about merging. So what is merging? Let me explain that with the help of the example that is there in front of your screen. Over here we have two data frames. In one data frame we have index values from 2001 to 2004, and in the other data frame we have index values from 2005 to 2008. Now what happens when I merge both of these data frames? These two data frames can be merged together to form a single data frame, and we can decide which columns we want to keep common. Over here we have the common columns HPI, interest rate and index, but when I talk about GDP we have two US GDP columns, one is x and the other is y. So this is how you can perform merging: you can decide which columns you want in your final merged data frame.

So we have a question from Jason. He is asking, what about the index? The index has been changed. All right Jason, nice observation. What happens is that I have actually removed the index values that were here earlier.
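Going back to the slicing step for a moment, the head and tail calls used above can be sketched as follows. This is a minimal sketch that reuses the website values read out earlier; the column names are assumed spellings, and the iloc line is an extra illustration that was not part of the session.

```python
import pandas as pd

# Website data from the earlier example (column names are assumed spellings).
df = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5, 6],
    "Visitors": [1000, 700, 6000, 1000, 400, 350],
    "Bounce_Rate": [20, 20, 23, 15, 10, 34],
})

first_two = df.head(2)   # the starting two rows
last_two = df.tail(2)    # the last two rows
middle = df.iloc[2:4]    # not shown in the session: rows at positions 2 and 3

print(first_two)
print(last_two)
print(middle)
```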

I've actually made that 0, 1, 2, 3. So don't worry Jason, I'll open my PyCharm and show you guys how to do that practically. But before that, if you have any doubts you can ask me right now. Any doubts? All right, so we have no doubts.

So I'll open my PyCharm once again. This is my PyCharm again, guys, and I'm going to show you the merge operation. For that I need to import the pandas module, so I'll type import pandas as pd. Now I'm going to define the data frames. Let me name the first one df1, and over here I'll type pd.DataFrame and open the parentheses. Inside the parentheses I'm going to define a dictionary, and I'll be using multiple lists inside that dictionary. The first key that I'll use is HPI, the house pricing index, and the value assigned to HPI is a list. In that list I'll put certain values, so let it be 80, 90, 70, 60. Now I'm going to define one more key, and I'm going to name it Int_Rate. The value assigned to it is a list which contains the interest rates, so I'll type 2, 1, 2, 3. Now I'm going to define one more key, and I'm going to name it, say, IND_GDP, and I'm going to define a list here with the values 50, 45, 45, 67. Now I'm going to close this dictionary and define the index values. For that I'll type index equal to whatever values I want in my index. I just want the year here, so I'll type 2001, 2002, 2003, 2004. So this is the first data frame.

For the second data frame I'll do something similar. I'll type pd.DataFrame, open and close parentheses, and over here I'm going to define the same key-value pair, that is HPI, so let me copy it. All right, so we have HPI. Now I'll define one more key-value pair, and that is the interest rate, so again I'm going to copy this whole thing and paste it here, and the same for IND_GDP, I'm going to copy it and paste it here. Now, as I've done in the previous data frame, I'm going to define the index values. For that I will type index equal to a list, and my index values will start from 2005, so I'll type 2005, 2006, 2007, 2008.
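The two data frames just defined, and the merge the session turns to next, can be sketched like this. A minimal sketch, assuming the key spellings HPI, Int_Rate and IND_GDP; the values and index years are the ones read out above.

```python
import pandas as pd

# Same three columns in both frames; only the index years differ.
# Key spellings are assumptions based on what was read out.
data = {
    "HPI": [80, 90, 70, 60],
    "Int_Rate": [2, 1, 2, 3],
    "IND_GDP": [50, 45, 45, 67],
}
df1 = pd.DataFrame(data, index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame(data, index=[2005, 2006, 2007, 2008])

# With no "on" argument, merge matches rows on every column the frames share.
merged_all = pd.merge(df1, df2)

# With on="HPI", only HPI is shared; the other columns get _x and _y suffixes.
merged_on_hpi = pd.merge(df1, df2, on="HPI")

print(merged_all)
print(merged_on_hpi)
```

Note that merge works on column values, not on the index, so the year index is discarded and replaced by 0, 1, 2, 3 in the result. That is the index change Jason asked about.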

I forgot the comma here, so yeah, comma, and now we have two data frames. Now I can go on and merge these two data frames. For that I will define one variable, say merged, equal to pd.merge and the data frames, that is df1, df2. Now go ahead and print merged and we'll see what happens. As you can see, we have merged the two data frames, df1 and df2, and we have got one single data frame. Now what if I don't want to keep all the columns common when I perform the merge operation? What I can do is write in the columns that I want to keep common. Suppose I want only the HPI column to be common, so I'll just type on equal to HPI here. When I go ahead and print this, as you can see, only this particular column is common, that is HPI. For the rest we have two different columns each: for IND_GDP we have x and y, and again for Int_Rate we have x and y. So this is how you can perform the merge operation. If you have any questions or doubts, you can write them down in your chat box. Any questions, guys? Ashish, Devon, Jason, Jessica, Ayushi, Jyoti? All right, so we have no questions.

Now I'll again open my slides and we'll see the other operations that you can perform with pandas. We saw the merging operation, so let us move forward and have a look at the next operation, that is joining. In joining, the two data frames are joined on the basis of their index values. Let me show you that. We have two data frames, one is this and the other one is this. What happens when we join both of these data frames? As you can see, by joining these two data frames we get this one single data frame. Now one thing to notice here, guys, as I've told you earlier as well: joining happens on your index values. Over here, in the first data frame, you can see that we don't have any index called 2005, but after joining, the 2005 index appears in the data frame, and there is no interest rate or US GDP in thousands associated with it. Similarly, when I talk about data frame two, that is the second data frame, over here we don't have any 2002 index value, so the low tier HPI value with respect to 2002 will be NaN, and unemployment will also remain NaN. Now you must be wondering what NaN is. When there is no value attached to a particular index, it is shown as NaN, which means not a number.

So let me show you practically how to perform a join. I will again open my PyCharm and we are going to perform the join operation there. This is my PyCharm, guys, and over here I have the two data frames df1 and df2 which I used to show you the merge operation. Now for the join operation I'm going to remove the HPI key-value pair from the dictionary, and I am going to do the same to the second data frame as well. In the second data frame, instead of the interest rate I am going to write Low_Tier_HPI and assign certain values to it, so I'm going to type 50, 45, 67, 34. And instead of IND_GDP I am going to type Unemployment and assign certain values to it: 1, 3, 5, 6. All right, it can be anything, it doesn't matter. Now let me just change these index values. I'm going to type 2001, 2003, 2004, and let this last one be 2004 only. And now I'm going to type joined equal to df1.join(df2) and print joined. Let's go ahead and execute this and see what happens. Over here, in 2002 we have no values attached to Low_Tier_HPI and Unemployment, so it has printed NaN, that is not a number. And in 2004 we have both the values available, so it has printed those. So this is how the join operation happens. If you have any doubts or questions you can ask me. Any questions, guys? So we have no questions right now.

So I'll again open my slides and we'll see the other operations that we can perform with pandas. We saw exactly how to join two data frames, so let us move forward and see the next operation. Now we are going to change the index and column headers. Let us see how this happens. We have two data frames here. One contains the index, the interest rate and the US GDP in thousands. The other has the year as the index and only the US GDP in thousands, there is no interest rate here. So what happens when I change the column headers or change the index? Over here, as you can see, I've changed the index to the interest rate, and I have changed the column header to GDP instead of US GDP in thousands.

Don't worry guys, I'll open my PyCharm and show you practically how to do this. First I'm going to remove all of this and define one data frame, let it be df. Over here I'm going to type key-value pairs in the dictionary. First of all I'll write, say, Day, and inside that I'll type 1, 2, 3, 4. Now one more key-value pair, so I'll type Visitors. On day one we had around 200 visitors, then we had 100 visitors, then 230, and then we had 300 visitors. I'll give one more comma and define one more key-value pair. For that I'm going to use the key Bounce_Rate, and I've already explained to you what exactly bounce rate is, so I'm just going to type in the values 20, 45, 60, 10. All right, so we have this dictionary. Let me convert it to a pandas data frame. For that I'll type pd.DataFrame and it will convert this to a data frame. I've made a mistake in the syntax, so let me fix that: it needs the opening and closing parentheses here as well. Now go ahead and print this. So we have got this particular data frame.

Now I want to change this index. Say I want Day here to be my index. For that I can type df.set_index, and I want my index to be Day, and that's it. Now go ahead and print this and we'll see what happens. Over here I also need to type inplace equals True. Now when I go ahead and print this, as you can see, Day has become my index. So this is pretty easy, guys. You just need to set your index to whatever you want and you get that data frame with it.

Now my next task is to change one of the column headers, but first let me remove this. As you can see, we have successfully changed the index to Day, so I can even plot this. For that I just need to import one more library, which is matplotlib. I'll have a separate session on matplotlib, so you don't need to worry much about it. I'll type import matplotlib.pyplot as plt, and then from matplotlib import style. All right, and now I'm going to type style.use and the fivethirtyeight style, yep, I'll keep it as fivethirtyeight. Now I'll remove this print statement, and what I'm going to do is type df.plot() and plt.show(). Go ahead and execute this, and we get a graph in which the bounce rate is represented by the blue line and the red line represents the visitors. These are our index values, or you can say the day, and these are the values corresponding to that particular day. So this is how you can plot it, though you don't need to worry much about the data visualization part because I'll be covering that in the upcoming session. I just wanted to show you how this works, so I've just given you an example of that, that's all. So you can visualize it.

Now our next task is to change one of the column headers. For that, first let me remove all this. Now suppose I want to change the column header from Visitors to Users. How am I going to approach this task? What I'm going to do is type df equals df.rename, columns equal to a dictionary: I want to replace Visitors, and what do I want to replace it with? I can replace it with Users. Now go ahead and print df, and you can see that the column header has been changed: instead of Visitors we have Users now. So this is how you can change the column headers and index values. If you have any doubts or questions with respect to this particular operation, or related to any of the topics that we have discussed till now, you can write them down in the chat box. Any questions, guys? So we have no questions till now.

So I'll open my slides again and we'll see one of the other operations, that is concatenation. Say you have student data in which you have name, age, sex and phone number. Now you want to add the email address to this data. You can perform concatenation and add the email address field at the end of this data. Don't worry guys, I'll show you practically how to perform concatenation. So this is my PyCharm, guys. In order to show you concatenation I'm going to paste the two data frames that I created earlier. I don't want to type them again because it's going to take a lot of time, so I'm just pasting them. Now in order to perform concatenation I'm going to declare a variable, say concat, equal to pd.concat with a list of df1 and df2. Go ahead and print this and we'll see what happens. So yep, as you can see, we have concatenated the two data frames. The index values are from 2001 to 2004 for the first data frame, and then it runs from 2005 to 2008 for the second data frame. So concatenation has been successfully performed, and we have index values from 2001 to 2008. So this is how you can perform concatenation. If you have any doubts or questions with respect to this particular operation, you can ask me. Any questions, guys? All right, so we have no questions.

So I can go back to my slides and see what the next operation is. Now comes data munging. Data munging basically means that you can convert data from one format into a different format. So if you have data which is in, say, a CSV file, you can convert it to HTML. Similarly, you can perform that operation with the other data formats as well. Let me show you that. I'll again open my PyCharm. So this is my PyCharm, guys. Let me first remove all of this. I'm going to read a CSV file which is there locally in my system. For that I'm going to type in this: I have defined a variable country, and in it I am using the pandas module to read the CSV file which is present at this particular location. Finally, this index_col equals 0 makes the first column of the file the index, so I don't get a separate numeric index in the data frame, that's all.

So what is the operation that we are going to see here? We are going to convert the CSV file into an HTML file. For that I am going to type country.to_html, open and close parentheses, and I can call it edu.html. Now go ahead and execute this and we'll see what happens. I'm going to open my project folder, and as you can see, edu.html has been added. When I click on it, it gives me the HTML code for that. Now I can copy this path, open my browser and paste that path, and we'll see what happens. Yup, so we have got this HTML table. It was in CSV format and we converted it to HTML format. So this is how you perform data munging with pandas. Any questions, any doubts till here, guys, on how to perform data munging? So we have no questions till now.

So we'll be more specific and see a use case in which we have data about global youth unemployment. Let me show you how it looks, I'll open my slides again. We have a data set in which we have the percentage of unemployed youth globally. For every country we have the percentage of unemployed youth from 2010 to 2014. So what is the problem statement for this particular case study? Let us move forward and see that. Basically, I want to find the change in the percentage of unemployed youth for every country from 2010 to 2011. I want to see how the trend is, what the percentage change between 2010 and 2011 is for every country.

First let me show you what the data set looks like. It looks something like this: we have the country name, then we have the country code, then for 2010 the percentage of unemployed youth, and the same goes for 2011, 2012, 2013 and 2014. So this is what our data set looks like. Let us move forward and perform this data analysis, in which we are going to find the percentage change in unemployed youth between 2010 and 2011. For that I'll again open my PyCharm. Let me first remove all this. I already have the code to do that, so I'm going to paste it and explain to you what exactly I'm doing. So this is the code, guys. Over here, first I have imported a couple of libraries: I've imported pandas, and matplotlib for visualization, and the style is fivethirtyeight.
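Before the case study continues, the join, concatenation, index, rename and format-conversion steps walked through above can be recapped in one sketch. This is a minimal sketch under stated assumptions: the column spellings (Int_Rate, IND_GDP, Low_Tier_HPI, Unemployment, Day, Visitors, Bounce_Rate) are assumed, df3 is a hypothetical stand-in for the second merge frame, and to_html is called without a file name so that nothing is written to disk.

```python
import pandas as pd

# --- Join: rows are matched on the index values ---
df1 = pd.DataFrame(
    {"Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]},
    index=[2001, 2002, 2003, 2004],
)
df2 = pd.DataFrame(
    {"Low_Tier_HPI": [50, 45, 67, 34], "Unemployment": [1, 3, 5, 6]},
    index=[2001, 2003, 2004, 2004],
)
# Default is a left join: every df1 year is kept; 2002 has no partner, so it gets NaN.
joined = df1.join(df2)

# --- Concatenation: stack frames with the same columns on top of each other ---
df3 = df1.copy()                      # hypothetical second frame, same columns
df3.index = [2005, 2006, 2007, 2008]
stacked = pd.concat([df1, df3])

# --- Changing the index and a column header ---
web = pd.DataFrame({
    "Day": [1, 2, 3, 4],
    "Visitors": [200, 100, 230, 300],
    "Bounce_Rate": [20, 45, 60, 10],
})
web = web.set_index("Day")                        # Day becomes the index
web = web.rename(columns={"Visitors": "Users"})   # Visitors becomes Users

# --- Data munging: the same frame rendered as an HTML table ---
html = web.to_html()   # returns a string; pass a path to write a file instead

print(joined)
print(stacked)
print(web)
```

Because 2004 appears twice in the index of df2, the joined result contains two rows for 2004, one per matching row.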

Let me tell you guys, you don't need to worry a lot about visualization because I'm going to teach you visualization in detail in the upcoming sessions. So for now just focus on pandas and the various operations that we can perform with it. Now I define one data frame, that is country, and this is pd.read_csv on the file which is present at this particular path. So I have my data set at this path, and then I've set index_col to 0, which means the first column of the file is used as the index and I don't get any extra index column, that's all. After that I've defined one more data frame, df, which contains only the first five rows of the data frame country. After that I have defined an index, that is Country Code. I only wanted the country code to be my index. And after that I have defined one more data frame, sd, in which I reindex the columns, that is, I want only 2010 and 2011 to be my columns.

Let me first show you what this sd data frame looks like. Let me comment this out first and I'll show you how sd looks. As you can see, we have the country code as the index, and we have only the 2010 and 2011 column headers, and that is only for the five rows. All right, so this is what the sd data frame looks like. Now let me uncomment these lines and remove this print statement. After that I've defined one more data frame, db, which is nothing but the difference between the two columns, that is 2010 and 2011. That will give me the change in the percentage of unemployed youth between 2010 and 2011 for each and every country. After that I'm using a bar plot, so finally I have just shown it with the help of a graph.

Let me show you what this looks like. I'm going to run this and we'll see what happens. So yep, this is the graph. Over here, as you can notice, in Afghanistan between 2010 and 2011 there has been a rise of almost 0.25 percent in unemployed youth. When I talk about AGO, that is Angola, there is a negative trend, which means the percentage of unemployed youth in Angola has reduced. When I talk about Albania, it has increased: from 2010 to 2011 there is an increase of around 1.25 percent in unemployed youth in the country. When I talk about ARB, the Arab World, there is an increase of almost 3.1 percent in the percentage of unemployed youth between 2010 and 2011. And for ARE, that is the United Arab Emirates, there has been no change, which means in 2010 and 2011 exactly the same percentage of youth was unemployed.

Now over here I can perform multiple operations as well. Say I want to find this out for 2011 and 2012, I can just change the columns and run this again. As you can see here, in Afghanistan between 2011 and 2012 the percentage of unemployed youth has gone down by almost 1.25 percent. When I talk about Angola, there is no change. For Albania it has increased by around 1.5 percent, and for ARB it has increased by around 1 percent. For the United Arab Emirates there has been no change. So this is an example that I've shown you where we have performed an analysis on global youth unemployment data. This is just an introductory, pretty basic example. There are a lot more things that you can do with pandas, and we are going to discuss all those things in the upcoming sessions. But for now, this is what pandas is and this is how you can perform data analysis. If you have any questions or doubts, you can write them down in your chat box. So again we have no questions till now.
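The steps of this case study can be sketched as follows. This is a sketch under stated assumptions, not the session's actual script: the CSV file is not available here, so a small frame with made-up numbers stands in for it, and the use of diff(axis=1) for the difference step is an assumption based on the description above. The bar plot is left out to keep the sketch free of plotting dependencies.

```python
import pandas as pd

# Made-up stand-in for the CSV read with pd.read_csv(..., index_col=0).
# These numbers are illustrative only and are NOT the real unemployment figures.
country = pd.DataFrame({
    "Country Name": ["Afghanistan", "Angola", "Albania"],
    "Country Code": ["AFG", "AGO", "ALB"],
    "2010": [20.0, 10.0, 30.0],
    "2011": [20.5, 9.0, 31.5],
    "2012": [19.0, 9.0, 33.0],
})

df = country.head(5)                       # first five rows (here: all three)
df = df.set_index("Country Code")          # country code becomes the index
sd = df.reindex(columns=["2010", "2011"])  # keep only the two years of interest

# Assumed difference step: each column minus the column to its left.
# The first column has nothing to its left, so it becomes NaN.
db = sd.diff(axis=1)
change_points = db["2011"]                 # change in percentage points

# Relative change, for comparison: (new - old) / old * 100.
change_relative = (sd["2011"] - sd["2010"]) / sd["2010"] * 100

print(change_points)
print(change_relative)
# In the session the result is then drawn as a bar chart, e.g. db.plot(kind="bar").
```

A plain difference of two percentage columns is a change in percentage points, which is what the chart in the session shows. The relative change is a different number, so it is worth being explicit about which one is meant. Swapping the column list to 2011 and 2012 repeats the analysis for the next pair of years.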

I'll open my slides and we'll see what they have to offer us. Now we are going to see how we can use Python for statistics. I've shown you four basic operations here, that is mean, median, mode and variance, so let me explain all these terms.

What do you mean by mean? Mean is nothing but the arithmetic mean, or the average value, of a particular list or any particular sequence.

When we talk about median, the median is the middle value, and there can also be a high median and a low median. When we have a sequence with an odd number of elements, the median is the centermost value. When we have an even number of elements in a sequence, we have a high median and a low median. In the high median, of the two center values the higher one is taken as the median, and in the low median the lower one is taken as the median. When we calculate only the plain median for an even number of elements, the average of the two centermost values is taken as the median. So I hope you are clear with what exactly median is. There's no rocket science behind it, it's pretty easy.

Now when we talk about mode, mode is nothing but the value that has been repeated the most. Over here we can see that one is repeated four times, three once, four thrice, five once and two also once.

When we talk about variance, variance is nothing but the variation of each and every element in the sequence from the arithmetic mean, that is, the average of the squared deviations from the mean. So I hope I'm clear with what exactly these four terms mean. If you have any questions or doubts, you can ask me right now. Any questions, guys? All right, we have no questions, so what I'm going to do is open my PyCharm and perform these things practically.

So this is my PyCharm, guys. Over here I'm going to import certain modules, so I'll type in from statistics import mean, and then print, mean, and the sequence, so I can type in 1, 2, 2, 2, 1, 3, 4, 1, 5, 1, 5. Now go ahead and execute this, and yes, it has given us the mean value, or the arithmetic mean, of the sequence that we have given. So now, for median.
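The mean call just executed can be sketched as follows, using the standard `statistics` module. The sequence is the one read out in the session; the transcript may not match what was on screen exactly, and the variable names are only illustrative.

```python
from statistics import mean

# Sequence as read out in the session (assumed to match what was typed on screen).
sequence = [1, 2, 2, 2, 1, 3, 4, 1, 5, 1, 5]

# Arithmetic mean: the sum of the values divided by how many values there are.
average = mean(sequence)
print(average)
```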

I'm again going to use the statistics module, and from that import median, then print, median, and the sequence. Let it be 1, 1, 1, 2, 2. All right, let's give it five elements, so it will print the centermost element, that is 1. As you can see here, 1 is at the third position, with two elements before it and two elements after it, so it has printed the centermost, or middle, element.
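As a minimal sketch, the median call on the five-element sequence looks like this. The even-length list is a made-up example added to show the plain, low and high medians described earlier. The block also applies mode and variance, two of the four terms defined earlier, to the same five-element sequence. Note that `variance` in the standard library is the sample variance (it divides by n - 1), while `pvariance` is the population variance.

```python
from statistics import median, median_high, median_low, mode, pvariance, variance

# Five-element sequence from the session: odd length, so the median is the middle value.
odd_sequence = [1, 1, 1, 2, 2]
middle = median(odd_sequence)

# Made-up even-length sequence to show the three median variants.
even_sequence = [1, 2, 3, 4]
plain = median(even_sequence)      # average of the two center values (2 and 3)
low = median_low(even_sequence)    # lower of the two center values
high = median_high(even_sequence)  # higher of the two center values

# Mode: the most frequently repeated value.
most_common = mode(odd_sequence)

# Sample variance (divides by n - 1) versus population variance (divides by n).
sample_var = variance(odd_sequence)
population_var = pvariance(odd_sequence)

print(middle, plain, low, high, most_common, sample_var, population_var)
```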

Now that was the median. If I want to find the mode here, I just import mode and change this call to mode. Go ahead, execute this, and yes, as you can see, 1 has been repeated thrice whereas 2 has been repeated only twice, so 1 has been repeated the most and it becomes the mode. You can even find the variance here. Variance, as I've told you earlier as well, gives you the variation of the elements from the arithmetic mean. Go ahead, execute this, and you can see the variance is 0.3. So this is how you can use Python for statistics as well. Any questions, any doubts, guys? You can ask me any questions. So we have no questions till now. Fine, guys, since there are no questions we'll move forward and understand how you can use Python with Hadoop.

So I'll open my slides once more. Guys, you can use Python for Hadoop as well. What happens is you import a library called Pydoop, and you can write a MapReduce program in Python and process data that is present in the HDFS cluster. Let me explain this with the help of the flow diagram that is there in front of your screen. You have some input data, which is stored in your HDFS cluster across various data nodes. You write logic in Python in order to process that data on the respective node managers where the data is stored. This stage is called the map phase, and it produces some intermediate output: however many node managers you have, you'll have that many outputs, and those will be given to the reducer. Now what happens in the reduce phase? This reduce phase happens with the load balancers. Whatever output comes from the map phase is provided as the input to the reduce phase, and it aggregates that and provides us with the output.
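The map and reduce flow described above can be illustrated with a small in-memory sketch. This is plain Python, not the Pydoop API and not a real Hadoop job: the input blocks, the word-count task and the function names `map_phase` and `reduce_phase` are all assumptions chosen for illustration. Each string stands in for a block of data held on one node.

```python
from collections import Counter

# Made-up input, split into blocks the way HDFS spreads data across data nodes.
blocks = [
    "python for data analysis",
    "python for hadoop",
    "data analysis with pandas",
]


def map_phase(block):
    """Process one block where it lives and emit intermediate (word, 1) pairs."""
    return [(word, 1) for word in block.split()]


def reduce_phase(intermediate_outputs):
    """Aggregate the intermediate pairs from every map output into final counts."""
    totals = Counter()
    for pairs in intermediate_outputs:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


# One intermediate output per block, mirroring one output per node.
intermediate = [map_phase(block) for block in blocks]
word_counts = reduce_phase(intermediate)
print(word_counts)
```

In a real cluster the map calls would run in parallel on the nodes holding the data, and the framework would move the intermediate output to the reducers. Here both phases run in a single process only to show the shape of the computation.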

I know you might not understand node managers, you might not understand data nodes and all those things. We are actually going to discuss this later in the upcoming sessions, so you don't need to worry about it right now. I am just giving you the general overview, and basically telling you that you can use Python in order to process big data which is present across the HDFS cluster.

So if you have any questions or doubts till now, you can ask me, guys. Just feel free to ask me any questions that you have in your mind. All right, fine, we have no questions, so we'll move forward and I'll just give you a brief summary of all the things we have discussed. First we saw the various applications of Python: we saw that we can use Python for web scraping, for data analysis, for testing and for various other purposes as well. Then we saw what exactly data analysis is and what pandas is. Then we understood various operations that you can perform with pandas, like slicing, joining, merging and all those things. Then we saw a case study in which we had a data set of the percentage of unemployed youth, country-wise, from 2010 till 2014, and we did some analysis and found out the percentage increase in unemployed youth from 2010 to 2011. Then we understood how you can use Python for statistics and how you can use Python with Hadoop.

Thank you, guys, for attending today's session. If you have any questions or doubts, you can ask me right now. All right, fine, we have no questions. This video will be uploaded to your LMS, so you can go through it. If you have any questions after that, you can contact our 24/7 support team, or you can bring your doubts to the next class as well. Thank you and have a great day.

I hope you enjoyed listening to this video. Please be kind enough to like it, and you can comment any of your doubts and queries and we will reply to them at the earliest. Do look out for more videos in our playlist and subscribe to the Edureka channel. Happy learning.