# 10c Data Analytics: Variogram Introduction

Hey, howdy everyone. In the last lecture we motivated the need to calculate, quantify, and model spatial continuity. I tried to show an example where we looked at how the flow response in the subsurface would change dramatically with different spatial continuities. So we're now motivated: we know we need to calculate spatial continuity and use it in modeling. Let's go ahead and talk about how we do that. For this lecture we will limit ourselves to the variogram. Later, of course, we'll talk about how to calculate it and how to model with it, and we'll go into estimation and simulation with it. We will also get into other types of spatial continuity measures, such as those used in Boolean modeling or marked point process modeling (more commonly known as object-based modeling), multiple-point statistics simulation, and so forth. But for now we will keep to the variogram.

We need to quantify spatial continuity, so we want a function, a statistical measure, that can tell us how similar or dissimilar things are over space. That is the semivariogram: a function of difference over distance. This axis is the variogram value for a specific lag, or distance in space, and this axis is the distance in space. So this measure shows you how things become more different as we increase the distance.

We're going to make this calculation over offsets in space defined by a lag vector h. The lag vector h is bold because it is a vector: it has a direction and a distance. We use u to indicate a location, so u gives us the tail location in the context of your spatial problem. We have a tail, and then we have the lag vector h that describes the separation from the tail to the head. We calculate the variogram by scanning through our data set, looking for all possible pairs of data that are separated by that lag vector h, our distance and direction offset.

We'll go ahead and use this equation here. If you ignore the two for now, what you observe is that this term is simply the tail values as we scan around, and this term is the paired head values; we scan the two jointly. We take the difference between the two, we square that difference, and the one over N(h) simply takes the average over all of those pairs, where N(h) is the number of pairs available at that specific lag vector h, the offset from tail to head. So it's simply the average of the squared difference of all the values that we find separated by that vector. The one-half is included, and that's what makes it a semivariogram; if we did not have the one-half it would simply be a variogram. You'll notice in my notes, and commonly within the geostatistics literature, that there is a degree of laziness about reporting semivariogram versus variogram. In fact, during most of the lectures from now on I will assume one-half of the average squared difference and simply call it the variogram. That's sufficient, and that's what's used in practice. So that's how we calculate the variogram.

Now what's very interesting is that every time you take a lag vector h and scan around, pairing up the tail and the head, you could create an h-scatter plot. The h-scatter plot simply takes the value at the tail location on this axis and the value at the head location on this axis, for each pair that we're differencing, and plots them. If we plot them, it's very interesting, because if we have a phenomenon for which the values are constant over the separation from the tail to the head, over that lag h, we would expect those dots to fall on the 45-degree line. And so we can calculate the correlation coefficient of that plot.

We'll find that this is a good measure of how things are correlated in space. I'll show shortly that it's equivalent to a correlogram, and it can be related directly to covariance, which can be related directly to the variogram. So it is informative and useful, but it's also nice to look at an h-scatter plot and think about the degree of similarity or correlation over the separation distance described by lag vector h.

Furthermore, about the variogram: once again, the variogram equation is shown right here. In words, we would say that it is one-half the average squared difference of values separated by a lag vector h. Now you may wonder, why include the one-half? It's done so that this relationship right here works. By taking one-half the variogram, calculating the semivariogram and representing that as gamma, we ensure that this equation works out: the covariance function, which we will define in a little bit and which is a measure of the degree of similarity versus distance, is equal to the variance of the variable, which we'll call the sill, minus the variogram value at that lag vector h. If you look at that, the sill is the constant, and the other two both vary over the lag distance h. This means that the semivariogram and the covariance function are simply mirror images of each other: we get the covariance by taking this constant sill and subtracting the variogram, or flipping it upside down.

We'll define covariance a little more rigorously shortly, but it's very interesting to note, as I mentioned on the previous slide, that if you take the covariance function over h and standardize it by the variance, or if you're working with a variable that has a variance equal to one, such as after a Gaussian transform to standard normal (mean of zero, variance of one), you get the correlogram. The correlogram is a measure of similarity over distance, and its value is equal to the h-scatter plot correlation coefficient.
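The slide equations are not captured in this transcript, so here is a sketch of how the relationships described in words above are commonly written. The symbols $z$ (the variable), $\mathbf{u}_\alpha$ (the tail location of pair $\alpha$), $\sigma^2$ (the variance, or sill), $C$ (covariance function), and $\rho$ (correlogram) are assumed notation, not necessarily what appears on the slide.

$$
\gamma(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right]^2
$$

$$
C(\mathbf{h}) = \sigma^2 - \gamma(\mathbf{h}), \qquad \rho(\mathbf{h}) = \frac{C(\mathbf{h})}{\sigma^2}
$$

With a standardized variable, $\sigma^2 = 1$, the second line reduces to $\rho(\mathbf{h}) = 1 - \gamma(\mathbf{h})$, which is the mirror-image relationship between the semivariogram and the correlogram.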

We've covered correlation coefficients in the previous discussion about bivariate statistics; we're talking about the Pearson correlation coefficient, pretty straightforward. So that's really nice, because now we have a function that relates the degree of similarity to a correlation coefficient, which is something very natural for us.

So we've defined the variogram and talked about spatial continuity. Let's make some observations about the variogram, general things we would see if we looked at one. First of all, we'll give ourselves a very simple data set. It's exhaustive, it's on a mesh, and we're able to go through and calculate the variogram at different lag distances. We'll only concern ourselves with the horizontal, the x direction, just to keep things simple. Now imagine you go to that data set and pan through with two points, tail and head, tail and head, tail and head, and they're very close to each other. In fact they're only four cells apart, with three cells in between them. If you pan through that data set, I hope what you observe very quickly is that the degree of dissimilarity will be low: if the tail is high, the head will be high or high-ish. Once again, this property could be any property. It could be a porosity field, a fluid saturation, a contaminant concentration, a density of trees. It doesn't matter; it's a spatial property. And you'll notice that as the lag separation distance increases, as you go further apart from tail to head, you are more likely to have a significant change or transition between the tail and the head, going from low to high, high to low, or high to high. Many different things can happen. So the degree of similarity goes down, or I should say the degree of dissimilarity goes up.

So what does that mean? What's observation number one? In general, as the distance increases, we would expect the semivariogram value to increase. I'm not saying that's a rule, but it's something you would observe in general.

The second observation is that to calculate a variogram you use every possible pair. Remember when we talked about statistics, sampling, standard error, and all of those concepts: the more samples you have, the more reliable your statistic. Its confidence interval shrinks, you have more certainty about it, and it has less randomness. We talked about the law of small numbers; we want large numbers for a more robust statistical representation. So we're going to calculate the variogram over all possible pairs. I have many students who think that when they calculate a variogram on a mesh like this, they anchor the tail and take the next value, the next value, the next value, and they're done. Or maybe they take a lag separation like this and skip ahead to the next bin, the next one, the next one. They don't slide across and get every possible pair. In other words, if you're calculating this case right here, for which we have a lag equal to 10, you would anchor the tail here and the head here, then move over by one and do the next one. So for a lag of 10 you would pair data point 1 with data point 11, a separation of 10, then 2 with 12, 3 with 13, and so forth. You scan across with your two points, then scan down and scan across again, over all possible pairs of data with that separation. That way you have the most reliable measure. As you scan through the entire data set, you calculate the squared difference for every one of those offset pairs.
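The scan over every pair can be sketched in a few lines of Python. This is a minimal illustration, assuming a regular grid stored as a list of rows and a lag measured in whole cells along x; the function name `semivariogram_x` and the tiny grid are made up for the example.

```python
def semivariogram_x(grid, lag):
    """Semivariogram along x for one lag (in cells), using every available pair."""
    sum_sq_diff = 0.0
    n_pairs = 0
    for row in grid:                      # scan down, one row at a time
        for i in range(len(row) - lag):   # slide the tail across one cell at a time
            tail, head = row[i], row[i + lag]
            sum_sq_diff += (head - tail) ** 2
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no pairs available at this lag")
    # one-half of the average squared difference over all pairs
    return sum_sq_diff / (2.0 * n_pairs), n_pairs


# Tiny made-up grid (2 rows by 5 columns), just to show the pair count
grid = [
    [1.0, 2.0, 4.0, 3.0, 5.0],
    [2.0, 2.0, 3.0, 5.0, 4.0],
]
gamma, n_pairs = semivariogram_x(grid, lag=2)
print(n_pairs, gamma)
```

Each row of 5 cells contributes 3 pairs at a lag of 2 (cells 1 and 3, 2 and 4, 3 and 5), so nothing is skipped: the tail moves one cell at a time rather than jumping ahead by the lag.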

You sum it all together and divide by the number of pairs, which is effectively just averaging, or an expectation, and you divide that by two. All of that hard work, and you get one single point on your variogram plot. That's a lot of work to get one point, so it's a good thing computers are doing that. Then you increase the lag separation and repeat, repeat, repeat, for every lag that you want to calculate. So you scan through and get all possible pairs from your data set for the most reliable statistics.

Observation number three: you need to plot the sill to know the degree of correlation. I would say that in order to actually interpret a variogram you need a sill. The sill is equal to the variance, as I mentioned before, so plotting the sill is the first thing you do. Then you start calculating the variogram. I mentioned before that the covariance, or degree of similarity, is equal to the sill minus the variogram value, so this distance from the sill down to the variogram value is in fact the degree of similarity. If we're working with a standardized variable, in other words the variance is equal to one, that covariance function is actually equal to the correlogram, which means that the value right there is exactly equal to the correlation coefficient of the h-scatter plot. So now, just by looking at this plot, you can start to read correlation directly. At this point right here, what would be the correlation? Well, the variogram value at a distance of zero is equal to zero, because there's no difference if you compare points with themselves; tail and head are on top of each other. So we would expect the correlation to be the variance minus zero. In the case where we are working with a variance equal to one, for a point down here the distance to the sill is equal to one.
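Here is a rough sketch of repeating that scan for many lags and reading the covariance off the sill. The data are hypothetical: a one-dimensional moving average of random noise, chosen only because nearby values end up similar. The names `semivariogram_1d`, `experimental`, and `covariance` are illustrative.

```python
import random
import statistics


def semivariogram_1d(values, lag):
    """One-half the average squared difference over every pair separated by `lag`."""
    n_pairs = len(values) - lag
    total = sum((values[i + lag] - values[i]) ** 2 for i in range(n_pairs))
    return total / (2.0 * n_pairs)


# Hypothetical data: a moving average of random noise, so nearby values are similar
rng = random.Random(73)
window = 20
noise = [rng.gauss(0.0, 1.0) for _ in range(2000 + window)]
values = [sum(noise[i:i + window]) / window for i in range(2000)]

sill = statistics.pvariance(values)  # the sill is the variance of the variable

# Repeat the full scan for each lag; each scan gives one point on the plot
experimental = {lag: semivariogram_1d(values, lag) for lag in range(0, 41)}

# Covariance (degree of similarity) is the distance from the sill down to the variogram
covariance = {lag: sill - gamma for lag, gamma in experimental.items()}

for lag in (0, 1, 5, 10, 20, 40):
    print(lag, round(experimental[lag], 4), round(covariance[lag], 4))
```

At a lag of zero every value is paired with itself, so the variogram is zero and the covariance equals the sill. As the lag grows, the variogram climbs toward the sill and the covariance shrinks.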

The correlation is equal to one, which is pretty cool. Now what happens if we're at the sill? At the sill there's zero distance between the variogram value and the sill value, which means the correlogram is equal to zero. The correlation is equal to zero at the sill. That's pretty important information: when our experimental variogram values, which we're calculating for each lag offset, reach the sill, that's the distance at which we no longer have correlation between tail and head.

So what does that look like? Let's extend observation number three. If you were to plot the h-scatter plot at the point at which the variogram value is equal to 0.4 and the sill is equal to 1, the correlation coefficient at that distance, which is probably around 4 meters, or 3-point-something meters, would be equal to 0.6. That 0.6 is the distance from the sill down to the variogram, so you can interpret it directly from the plot. At the sill, at this point right here, the correlation coefficient will be very close to zero. What does that mean? There is no correlation between what's at the tail and what's at the head location; they're uncorrelated with each other. What I like to tell my class is this: say you drilled a well and gathered data right here in space, and you go away to the distance at which the variogram reaches the sill. You no longer know anything. There's no correlation, and you have no information from spatial continuity that can help you make an estimate or prediction. And if you go above the sill, at that point you have negative correlation, so you go from having no information to having negative correlation.
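To check the link between the h-scatter plot and the distance from the sill, here is a small sketch on hypothetical data (again a moving sum of random noise, standardized to a variance of one). The helper `pearson` and the choice of a lag of 5 are assumptions for illustration. In a finite data set the two numbers agree only approximately, because the tail and head subsets differ slightly at the edges.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical spatially correlated data (moving sum of random noise)
rng = random.Random(42)
window = 20
noise = [rng.gauss(0.0, 1.0) for _ in range(2000 + window)]
raw = [sum(noise[i:i + window]) for i in range(2000)]

# Standardize so that the variance, and therefore the sill, is 1
mean, std = statistics.fmean(raw), statistics.pstdev(raw)
z = [(v - mean) / std for v in raw]

lag = 5
tails = z[:-lag]  # value at the tail of each pair
heads = z[lag:]   # value at the head, offset by the lag

gamma = sum((h - t) ** 2 for t, h in zip(tails, heads)) / (2.0 * len(tails))
rho = pearson(tails, heads)  # correlation coefficient of the h-scatter plot

# With a sill of 1, sill minus gamma should be close to rho
print(round(gamma, 3), round(1.0 - gamma, 3), round(rho, 3))
```

The same reading applies to the example in the lecture: a variogram value of 0.4 under a sill of 1 corresponds to an h-scatter plot correlation of about 0.6.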

Is negative correlation information? Yeah, it's a constraint. It is information that helps you make a prediction at that unknown location. So if you go above the sill you have negative correlation, which is also information.

Observation number four: the lag distance at which the variogram reaches the sill is known as the range. As we already mentioned, at the range you don't have any information. It's the maximum distance you can go from a sample location and still say, hey, that local data is telling me something that helps me constrain my estimate beyond some global distribution of possible outcomes. At the range, once again, the correlation is equal to zero. I like to draw something like this: if I drill a well and go out past the range, I no longer have any information from it. The range is also an important parameter when we model variograms. We will in fact use the range as a parameter when we fit valid variogram models to our experimental variogram results, but we'll talk more about that later.

Observation number five: there is sometimes a discontinuity in the variogram at distances less than the minimum data spacing. This is known as the nugget effect. It's pretty common to report a nugget effect in relative terms, as a percent or a ratio of the sill. So if the sill is equal to 1 and the nugget effect is equal to 0.4, I would report a 40% relative nugget effect, or simply a 0.4 nugget effect. Either one is fine. The nugget effect is a structure that has no correlation at any lag greater than epsilon, some infinitesimal distance beyond zero. At zero the variogram must be equal to zero: once again, at a lag of zero the tail and the head are the same location.

You're comparing the values with themselves, so that difference must be equal to zero. The variogram must equal zero at a lag of zero by definition, but at that very short distance it jumps up and expresses a degree of dissimilarity. Where does that come from? The term nugget effect actually comes from gold mining in South Africa. Imagine a long drill hole that you are assaying for the grade of gold, the amount of gold in each of the individual samples along the drill hole, each one small. With gold deposits you can have a naturally forming gold nugget that is large relative to the size of the samples we're assaying. So you're going along at low gold, a few grams per tonne, grams per tonne, and then suddenly you have a big gold nugget. If you were to calculate the variogram going through all possible pairs, that extreme jump over a very short distance, a distance shorter than the minimum data spacing, would result in a level of variability at very short scales that would be reflected as a nugget effect.

Now we've got to be careful with the nugget effect, because a nugget effect could also be measurement error. In fact, if you were to take just random error and scatter it over all of your data using Monte Carlo simulation, without any spatial structure, that random error would be reflected as a nugget effect. Its proportion, the relative nugget effect, would be based on the amount of variability you put in as random compared to the total variability of the problem. So you can actually create a nugget effect by doing that. What does that mean? It means that many expert modelers around the world will ignore a nugget effect in a variogram that they calculated, because they know it's likely to be measurement error. They feel, or from their experience they know, that the geologic phenomenon, the subsurface phenomenon, the spatial phenomenon they work with does not naturally have a nugget effect.
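The measurement-error point can be tried out with a quick Monte Carlo sketch. Everything here is hypothetical: a smooth one-dimensional signal with variance near 1, plus independent random error with an assumed standard deviation of 0.5. The short-lag variogram of the noisy data should sit above that of the clean signal by roughly the error variance (0.25), which is what shows up as a nugget effect.

```python
import random


def semivariogram_1d(values, lag):
    """One-half the average squared difference over every pair separated by `lag`."""
    n_pairs = len(values) - lag
    total = sum((values[i + lag] - values[i]) ** 2 for i in range(n_pairs))
    return total / (2.0 * n_pairs)


rng = random.Random(7)
window = 20
base = [rng.gauss(0.0, 1.0) for _ in range(5000 + window)]

# Smooth "true" signal with variance near 1 (moving sum scaled by sqrt(window))
signal = [sum(base[i:i + window]) / window ** 0.5 for i in range(5000)]

# Scatter random error with no spatial structure over all of the data
noise_std = 0.5  # assumed measurement-error standard deviation
measured = [s + rng.gauss(0.0, noise_std) for s in signal]

gamma_signal = semivariogram_1d(signal, 1)
gamma_measured = semivariogram_1d(measured, 1)
jump = gamma_measured - gamma_signal  # apparent nugget added by the error

print(round(gamma_signal, 3), round(gamma_measured, 3), round(jump, 3))
```

The relative nugget effect you would then report is the error variance compared to the total variance of the measured data, matching the description above.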

And so they don't put it in. The main point is to try to understand the natural system and not be biased by measurement error.

Given all of this knowledge, all of these observations, we can start looking at examples. We have three different examples of spatial continuity. The grayscale could indicate porosity, grain size, fraction of shale, whatever you want it to be, and we can look at the overall types of spatial patterns. The one at the top, this one right here, and this one right here all have the same histogram. They all have, in fact, the same range of correlation. They're all very similar in that respect. We'll talk about the Gaussian-type structure later, but we define its range as the distance at which we get within, I think, about 95% of the sill, so they do in fact all have the same range. Yet if you look at them, they have distinctly different degrees of continuity over short, medium, and long scales, and you can see how each of these plays out and impacts the overall image.

First, let's look at the short scale. This structure right here, which we'll talk about later as a Gaussian type of variogram structure, results in a very high degree of short-scale continuity: everything over short distances is very continuous. This model here has some nugget effect but pretty good short-scale continuity. You can see the salt and pepper, a little bit of short-scale discontinuity caused by the nugget effect, but overall good continuity. This model has poor short-scale continuity. As a result, when you look at the image you see a lot of disruption at short distances, going from dark to very light gray to white, very fast transitions between colors. This is very poor short-scale continuity.

Now let's look at the long-scale continuity. In general they have the same range, so at these longer scales the continuity is pretty similar. If you look at the images and concentrate, what you'll notice is that this general area of highs, this general area here, this area here, these large-scale structures are in fact very similar between the images, which is really interesting. So what does that tell us about the variogram? As we become more accustomed to interpreting variograms, we should think about a variogram as breaking apart multiple scales of continuity. It's like a superposition, multiple frequencies combined together, and we can interpret each one of them separately. As we move forward with variograms, we'll come back to this idea of interpreting them based on short-scale, medium-scale, and long-scale continuity structures. The other thing, of course, is that this should reinforce the idea that even with the same distribution you can have dramatically different spatial continuities. That happens all the time. It should also make sense that these functions, these variogram representations of spatial continuity, have a pretty big impact on the spatial phenomenon. It's not subtle: this, this, and this are all distinctly different phenomena in terms of spatial continuity. We will get more into interpretation and modeling in the next couple of sessions.

I hope that this was useful to you. We've now defined the variogram, and we'll carry on with how, practically speaking, to calculate variograms in the challenging, sparse, irregularly spaced data settings that we often work with, then get into more advanced interpretation, and then how to build valid models that we can use for estimation and simulation.
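As a quick check on the point that the same distribution can carry very different spatial continuity, here is a hedged sketch on hypothetical one-dimensional data. Shuffling the values keeps the histogram exactly the same but destroys the spatial ordering, so the short-lag variogram jumps up toward the sill. The data and names are made up for illustration and are not the images from the lecture.

```python
import random


def semivariogram_1d(values, lag):
    """One-half the average squared difference over every pair separated by `lag`."""
    n_pairs = len(values) - lag
    total = sum((values[i + lag] - values[i]) ** 2 for i in range(n_pairs))
    return total / (2.0 * n_pairs)


rng = random.Random(11)
window = 20
noise = [rng.gauss(0.0, 1.0) for _ in range(2000 + window)]

# Spatially continuous values (moving average of random noise)
values = [sum(noise[i:i + window]) / window for i in range(2000)]

# Same values in random order: identical histogram, no spatial continuity
shuffled = values[:]
rng.shuffle(shuffled)

same_histogram = sorted(values) == sorted(shuffled)
gamma_original = semivariogram_1d(values, 1)
gamma_shuffled = semivariogram_1d(shuffled, 1)

print(same_histogram, round(gamma_original, 4), round(gamma_shuffled, 4))
```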

Thank you very much for your interest. As always, shoot me a question or a suggestion by email, comments, Twitter, Facebook, even YouTube, whatever you like. My email is very easy to get; I'm a professor here at the University of Texas. Thank you.

We'll go ahead and use this equation here. Where if you look at if you ignore the two right now what you observe is that this is simply going to be the tail values scanning around. This will be the paired head values. We're scanning around jointly together. We're taking the difference between the two we're squaring that difference and the one over in function of. H is simply going to be take the average of all of those pairs in parentheses H is simply going to be the number of pairs available at that specific lag vector h offset from tail to head so it's simply the average of the squared difference of all the values that we find that are separated by that vector the 1/2 is included and that's what makes it a semi Berggren if we did not have the one-half it would simply be a very grim. You'll notice in my notes and commonly within geostatistics literature. There is a degree of laziness about reporting semi Vera Graham versus Vera Graham in fact during most the lectures from now on I will assume one-half of the average squared difference and I will simply term it as the Vera. Graham that's sufficient and that's what's used within pract so that's how we calculate the very rare. Now what's very interesting. Is that every time you take a lag vector. H and you scan around. You've got the tail in the head and you're pairing them up you could create an H scatterplot the H scatterplot would simply be take the value at the tail location on this axis. Take the value at the head location on this axis for which they're paired up and we're difference in them and just plot them and if we plot them it's very interesting because if we have a phenomenon for which the values are constant over a seperation from the tail to the head over that h lag separation. We would expect that those dots would fall on the 45-degree line and so if we calculate the correlation coefficient.

We'll find out that this would be a good measure of how things are correlated in space. I'll show shortly. That's equivalent to a correlogram and can be related directly to covariance which can be related directly to the very grams so it is informative and useful but it's also nice to look in. H scatterplot and to think about the degree of similarity or correlation over separation distance described by lag vector H furthermore about the Berggren once again the very gram equation is shown right here in words we would say that it is one-half the average squared difference of values separated by a lag vector. H now you may wonder why remove the one-half why take one-half and multiply it by this. It's done so that this works right here by taking one-half the very gram and calculating the semi vera gram representing that as gamma by tucking that one-half in there we ensure that this equation works out which means the covariance function which we will define a little bit function or a measure of this degree of similarity versus distance is going to be equal to the variance of the variable which we'll call the SIL tract the very grim value at that. H lag vector if you look at that the sill is the constant these are both going to vary over the lag distance. H and so this means that the semi Varig ram and the covariance function are simply going to be mirror images of each other we get the covariance by simply taking this constant SIL and subtracting the Varig ram or flipping the upside down now. We'll define covariance a little bit more rigorously shortly but is very interesting to note to note as I mentioned on the previous slide. 
That if you take the covariance function over h and you standardize by the variance or if you're working with a variable that has a variance of equal to one such as if we've done a Gaussian transform to standard normal mean of zero variance of one we would be able to get the correlogram and the correlogram is a measure of similarity over distance for which its value is equal to the h scatter plot correlation coefficient.

We've covered correlation coefficients within the previous discussion about bivariate statistics we're talking about the Pearson correlation coefficient pretty straight forward. And so that's really nice. Because now we have a function that relates the degree of similarity to a correlation coefficient. Which is something. That's very natural for us so we've defined the Vera Graham. We've talked about spatial continuity and so forth. Let's make some observations about the bare gram. Let's let's talk about general things we would see if we looked at a very gram now first of all what we'll do is we'll give ourselves a very simple data set it's exhaustive. It's on a mesh. It's very simple. We're able to go through and calculate the very gram at different lag distances and we'll only concern ourselves with the horizontal the x direction just to keep things very simple so now imagine you go to that data set and you pan through with two points tail and head tail and head tail and head. And they're all very close to each other. In fact there are only four cells apart. There's three cells in between them and if you pan through that data set. I hope what you observe very quickly is that the degree of dissimilarity will be low that you will both be tail and head will be high or high ish once again this phenomenon. This property could be any property. Indeed it could be a porosity field. It could be a saturation of a fluid. A contaminant concentration. It could be a density of trees. It doesn't matter. It's a spatial property. And so and you'll notice that as that live separation distance increases. You would expect that as you go. Further apart from tail to head tail the head you have a more of a likelihood to have a significant change or transition between the tail that had going from low to high high to low or high to high many different things can happen so that degree of similarity or.

I should say the degree of dissimilarity goes up degree of similarity goes down. And so what does that mean. What's observation number one. That in general as the distance increases we would expect that the semi vera gram value would increase. Okay so that's our first observations that in general. I'm not saying that that's a rule. But that's something you would observe. In general. The other point is that to calculate a very gram. You would never you would use every possible pair remember when we talked about statistics and sampling and standard error and all of these concepts the more samples. You have the better or more reliable your statistics. It's confidence interval will decrease you have more certainty about your statistic. It's more reliable has less randomness and so we talked about the law of small numbers. We want to have large numbers for more robust statistical representation. So we're going to calculate the Vera Graham over all possible pairs now. I have many students who think that when they calculate a very gram on a mesh. Like this that you'll anchor the tail and you'll take a closer the next value the next value the next next value. And they're done or maybe they'll take a lag separation like this and they'll skip ahead to the next bin the next one the next one. They won't take they won't slide across and get every possible value so in other words. If you're trying to calculate this case right here for which we have about a 10 a lag equals to 10. You would expect that you would anchor the the tail here go ahead here. Then you move over 1 and do the next one so you'd go from for 10 you would go. 1 2 data point 11. 
You is there 10 in the middle then you would go from 2 to 12 3 13 and so forth you would scan across with your 2 points and then scan down scan across you would scan over all possible pairs of data with that separation that way you have the most reliable measure now once you scan through this entire data set calculating you go get the squared difference between all of those offsets.

You summed it all together divided by number of pairs which is effectively. Just averaging or an expectation you would divide that by 2 all of that hard work and you get one single point on your bare ground plot. So that's that's a lot of work to get one point. That's a good thing. Computers are doing that right and then you would increase the lag separation. Repeat repeat repeat. Repeat for every possible lag that you want to calculate for. So you're gonna scan through and get all possible pairs from your data set for the most reliable statistics observation number three. You need to plot the sill to know the degree of correlation and I would say in order to actually interpret a program you need a sill the sill is equal to the variance as I mentioned before so we plot the silt that's the first thing you do plot the sill now you start calculating the variance now I mentioned before the covariance or degree of similarity is going to be equal to sill - the Berggren value so this distance from the sill down to the very gram value is in fact the degree of similarity now if we're working with the standardized variable in other words the variance is equal to 1 that covariance function is actually equal to the core relig Ram which means that that value right there is exactly equal to the correlation coefficient of the H scatterplot. And so now you're starting to understand by looking just looking at this plot you can see that at this point right here. If you had a data value right here what would be its correlation. Well the very gram value at distance of zero would be equal to zero. Because there's no difference if you compare points with themselves tail and head are on top at each other. There's going to be no difference whatsoever so we would expect that the correlation would be the variance minus zero. And so in that case. Where are we working with a variance equal to one if we have a point down here the distance right here is equal to one.

The correlation is equal to one, which is pretty cool. Now what happens if we're at the sill? At the sill there's zero difference, or zero distance I should say, between the variogram value and the sill value. That would mean that the correlogram is equal to zero. The correlation is equal to zero at the sill. That's pretty important information. So when the experimental variogram values that we're calculating for each lag offset reach the sill, that's the distance at which we no longer have correlation between tail and head. There's no correlation anymore.

So what does that look like? Let's extend observation number three. If you were to plot the h-scatterplot at the point at which we have a variogram value equal to 0.4, and the sill is equal to 1, the correlation coefficient at that distance right there, which is probably around 4 meters, or three point something, whatever that is right there, would be equal to 0.6. That 0.6 is the distance from the sill down to here, so you can interpret that directly. At the sill, at this point right here, the correlation coefficient will be very close to zero. What does that mean? There is no correlation between what's at the tail and what's at the head location. They're uncorrelated with each other. What I like to tell my class, and tell people, is this: say you drilled a well and gathered data right here in space, and then you go away by the distance at which you reach the sill. You no longer know anything. There's no correlation, so you have no information from spatial continuity that can help you with making an estimate or prediction. And if you go above the sill, at that point you have negative correlation. So you go from having no information to having negative correlation. And negative correlation is information.
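To make that reading of the plot concrete, here is a small sketch of the conversion from a variogram value to a correlation coefficient. It assumes a stationary variable so that the covariance is the sill minus the variogram, and the function name `correlation_from_variogram` is hypothetical.

```python
def correlation_from_variogram(gamma, sill=1.0):
    """Correlation implied by a semivariogram value: (sill - gamma) / sill."""
    if sill <= 0.0:
        raise ValueError("sill must be positive")
    return (sill - gamma) / sill

# Reading values off a standardized variogram plot (sill = 1):
for label, gamma in [("lag zero", 0.0),
                     ("below the sill", 0.4),
                     ("at the sill", 1.0),
                     ("above the sill", 1.2)]:
    print(label, correlation_from_variogram(gamma))
```

The four cases are the ones from the plot: perfect correlation at lag zero, a correlation of about 0.6 where the variogram is 0.4, no correlation at the sill, and negative correlation above the sill.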

Yes, it's a constraint. It is information to help you with making a prediction at that unknown location. So if you go above the sill, you now have negative correlation, which is also information.

Observation number four: the lag distance at which the variogram reaches the sill is known as the range. As we already mentioned, at the range you don't have any information. It's the maximum distance you can go and still be informed about what's going on, the maximum distance at which you can make an estimate and say, hey, that local data is telling me something that helps me constrain my estimate beyond some global distribution of possible outcomes. So that is the range: the maximum distance you can go away from a sample location and still have some information from that local data that can help you. At the range, once again, the correlation is going to be equal to zero. I like to draw something like this: if I have a well and I go past the range, past that distance, I no longer have any information. The range is also an important parameter when we model variograms. We will in fact use the range as a parameter when we fit valid variogram models to our experimental variogram results. But we'll talk more about that later.

Observation five: there is sometimes a discontinuity in the variogram at distances less than the minimum data spacing. This is known as the nugget effect. Now, it's pretty common that if you have a nugget effect, you report it as a relative nugget effect, as a percent or a ratio. That's simply going to be a fraction of the sill. So if I have a sill equal to 1 and the nugget effect is equal to 0.4, I would report a 40% relative nugget effect, or I would report it simply as a 0.4 nugget effect, and that's fine. Either one of them is fine. The nugget effect is a structure that has no correlation at lags greater than epsilon, some infinitesimal distance beyond zero. At zero, the variogram must be equal to zero. If you imagine, once again, at zero the tail and the head are the same location.

You're comparing data, I mean you're comparing the values, with themselves, and so that difference must be equal to zero. The variogram must equal zero at zero lag by definition, but at that very short distance it jumps up and expresses a degree of dissimilarity. Where does that come from? The term nugget effect actually comes from gold mining in South Africa. Imagine a long drill hole that you are assaying for the grade of gold, the amount of gold in each one of the individual samples along the drill hole, each one small. With gold deposits you can have a large gold nugget forming naturally, large relative to the size of the samples that we're assaying. And so you're going along with kind of low gold, you know, grams per tonne, grams per tonne, and then suddenly you have a big gold nugget. If you were to calculate the variogram, going through all possible pairs, that extreme jump at a very, very short distance, a distance shorter than the minimum data spacing, would result in a level of variability at very short scales that would be reflected as a nugget effect. Now, we've got to be careful with the nugget effect, because a nugget effect could also be measurement error. In fact, if you were to take just random error and scatter it using Monte Carlo simulation over all of your data, everywhere, without any spatial structure, that random error would be reflected by a nugget effect. In fact its proportion, the relative nugget effect, would be based on the amount of variability you put in as random compared to the total amount of variability of the problem. So you can actually create a nugget effect by doing that. So what does that mean?
It means that many expert modelers around the world will ignore a nugget effect in a variogram that they've calculated, because they know it's likely to be measurement error. They feel, or from their experience they know, that the geologic phenomenon, the subsurface phenomenon, the spatial phenomenon they work with does not naturally have a nugget effect.
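You can check that claim about random error with a quick experiment. This is a sketch only: the "true" signal is a hypothetical stand-in built as a moving average of white noise on a regular 1D grid, the error variance of 0.25 is an arbitrary choice, and `gamma_at_lag` is a helper name made up for this example.

```python
import numpy as np

def gamma_at_lag(values, lag):
    """Experimental semivariogram at one lag (in cells) on a regular 1D grid."""
    values = np.asarray(values, dtype=float)
    diff = values[lag:] - values[:-lag]
    return 0.5 * np.mean(diff ** 2)

rng = np.random.default_rng(42)  # arbitrary seed for repeatability

# Hypothetical continuous "truth": 50-cell moving average of white noise,
# standardized so that its variance (its sill) is 1.
n_cells, window = 20_000, 50
white = rng.normal(size=n_cells + window - 1)
truth = np.convolve(white, np.ones(window) / window, mode="valid")
truth = (truth - truth.mean()) / truth.std()

# Scatter random error with no spatial structure over all of the data.
error_variance = 0.25
measured = truth + rng.normal(0.0, np.sqrt(error_variance), size=truth.shape)

# Compare the variograms at the shortest available lag.
apparent_nugget = gamma_at_lag(measured, 1) - gamma_at_lag(truth, 1)
relative_nugget = apparent_nugget / measured.var()
print("apparent nugget:", apparent_nugget)
print("relative nugget:", relative_nugget)
```

In this setup the jump at the shortest lag should come out close to the error variance that was put in, and the relative nugget close to the error variance divided by the total variance, which here is roughly 0.25 out of 1.25.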

And so they don't put it in. The main point is to try to understand the natural system and not be biased by measurement error.

So, given all of this knowledge, all of these observations, we can start looking at examples. Here we have three different examples of spatial continuity features. The grayscale could be an indication of porosity, grain size, fraction of shale, whatever you want it to be, and we can look at the overall types of spatial patterns in this example. The one at the top, this one right here, and this one right here all have the same histograms. They all in fact have the same range of correlation. They're all very, very similar. We'll talk about the Gaussian type of structure here later, but we will define its range as where we get within, I think, about 95% of the sill. So they do in fact all have the same ranges. Now, if you look at them, they all have distinctly different degrees of continuity over short, medium, and long scales. And if you look, you can actually see how each one of these plays out and impacts the overall type of image.

First of all, let's look at the short scale. This structure right here, which we'll talk about later as being a Gaussian type of variogram structure, results in a very high degree of short scale continuity. Look at it: everything over short distances is very continuous. This model here has some nugget effect but has pretty good short scale continuity. You can see the salt and pepper, a little bit of discontinuity going on at the short scale caused by the nugget effect, but overall good continuity. This model has poor short scale continuity. As a result, when you look at the image, you'll see a lot of disruption at short distances, going from dark to very light gray to white, very fast transitions between colors. This is very poor short scale continuity.
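To see how three variograms can share a sill and a range and still behave very differently at short lags, here is a sketch with three stand-in models. These are not the models behind the images, just hypothetical choices: a Gaussian structure using the common convention that the range is where it reaches about 95% of the sill, a spherical structure with a 20% nugget effect, and a model that puts most of its variance into a structure with one tenth of the range.

```python
import numpy as np

def gaussian(h, a):
    """Gaussian structure with unit sill; reaches about 95% of the sill at h = a."""
    return 1.0 - np.exp(-3.0 * (h / a) ** 2)

def spherical(h, a):
    """Spherical structure with unit sill; reaches the sill exactly at h = a."""
    hr = np.minimum(h / a, 1.0)
    return 1.5 * hr - 0.5 * hr ** 3

def variogram_model(h, nugget, structures):
    """Nugget plus a sum of (shape, contribution, range) structures."""
    h = np.asarray(h, dtype=float)
    gamma = np.where(h > 0.0, nugget, 0.0)  # variogram is zero at zero lag
    for shape, contribution, a in structures:
        gamma = gamma + contribution * shape(h, a)
    return gamma

a = 100.0                                  # shared range, arbitrary units
lags = np.array([0.0, 5.0, 25.0, 100.0])   # zero, short, medium, and the range

smooth = variogram_model(lags, 0.0, [(gaussian, 1.0, a)])
with_nugget = variogram_model(lags, 0.2, [(spherical, 0.8, a)])
rough = variogram_model(lags, 0.0, [(spherical, 0.6, a / 10.0),
                                    (spherical, 0.4, a)])

for name, gamma in [("smooth", smooth),
                    ("with nugget", with_nugget),
                    ("rough", rough)]:
    print(name, np.round(gamma, 3))
```

At the shortest nonzero lag the three models are far apart, with the Gaussian model lowest and the short-range model highest, while at the shared range they all sit at or very near the sill of 1.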

Now let's look at the long scale continuity. In general they have the same range. They're very similar. At these longer scales, the continuity is pretty similar. If you look at the images and concentrate on them, what you'll notice is that this general area of highs, this general area here, this area here, these general large-scale structures are in fact very similar between the images. That's something really interesting.

So what does it tell us about the variogram? Well, as we become more accustomed to interpreting variograms, we should be thinking about a variogram as breaking apart multiple scales of continuity. It's like a superposition: multiple frequencies are combined together, and we can interpret each one of them separately. So we have that opportunity as we move forward with variograms. We'll come back to this idea of interpreting variograms based on short scale, medium scale, and long scale types of continuity structures. The other thing, of course, is that this should reinforce the idea that even with the same distribution you can have dramatically different spatial continuities. That happens all of the time. The other thing, too, is that it should make sense that these functions, these variogram representations of the spatial continuity, actually do have a pretty big impact on the spatial pattern. It's not like it's subtle. The differences between this, this, and this are all very distinct. They're distinctly different phenomena, speaking in terms of spatial continuity. We will get more into interpretation and modeling in the next couple of sessions. I hope that this was useful to you.
We've now defined the variogram, and we'll carry on with, practically speaking, how to calculate them in the challenging, sparse, irregularly spaced data settings that we often work with, get more into advanced interpretation, and then look at how we build valid models that we can use for the purpose of estimation and simulation.

Thank you very much for your interest. As always, shoot me a question or a suggestion: email, comments, Twitter, Facebook, even, you know, on YouTube, whatever you like. My email is very easy to get. I'm a professor here at the University of Texas. Thank you.