Welcome. Let's continue talking about the distributions of statistics from data, but now focusing in on one very specific distribution: the sampling distribution of the sample mean, x bar. Remember, sample statistics are just guesses. They're point estimates of population parameters, and different population parameters have different sample statistics. So we're going to focus on the most common one, the average. If you wanted to know the average of a population, you would take a sample, calculate the average of your sample, and that would probably be your best guess. So, let's talk about what those sample averages would look like.

The sampling distribution of a sample average, or sample mean, x bar, is the probability distribution of all the possible values of the sample mean. Think about it this way: if I had the ability to look at every single possible sample of the same size from a population, and I were to plot all of the means on a histogram, what would they look like? That's what we're talking about with the sampling distribution of x bar. I'll show you visually here in a moment, but these distributions, as we've talked about previously, have some characteristics of their own. They have their own mean and they have their own variance. So, here are a couple of nice facts. The sampling distribution of the sample mean has an expected value, an average, equal to the population mean, mu. Hold on, let's think about that for a moment.
So, if I were to look at all possible samples of the same size, take the average from each of those samples, and then take the average of all those averages (I know, a lot of things going on), that would equal the population mean. That's why the sample mean is a good guess of the population mean: on average, in other words if I were to look at all sample means and take their average, they would actually equal the population mean.

We can also say something about the standard deviation of x bar, the standard deviation of the sample means. It is the population standard deviation, sigma, divided by the square root of the sample size. Hold on, so wait, what now? The spread in that distribution of all sample means is whatever the spread was in the population, divided by the square root of the sample size. So again, let's imagine you had a bunch of samples. Pick any size sample you want. Let's say you looked at samples of 100 from the population of the United States. If you were to look at all possible samples of size 100 from the United States, calculate the average from each one of those samples, and look at the standard deviation of that distribution of averages, it would be the population standard deviation divided by the square root of 100. And it makes sense that the distribution of averages should have a smaller spread than the population's distribution, right? If I look at the average of any group of people, it's not going to be as extreme as any one person can be. That's the idea: the spread is going to be smaller.

Now, this is just a little fun side fact. If your sample is larger than 5% of the population, we do have a little bit of an adjustment we can make, where the capital N here is the size of the population and the little n is the size of the sample. Most of the time, you're never going to deal with this kind of situation, so don't worry about it. But some of you may actually deal with this. You may have a small population. Maybe you're trying to measure something using a sample of a species that is running low in numbers. Well, if that's the case, then maybe your sample actually is larger than 5% of the population. But like I said, most of the time you won't have to deal with this.

So, let's talk about this idea of samples, averages of these samples, and distributions of these averages. Let's talk about what I mean, and let's talk about it visually. Let's imagine what you see here on the slide is the population. The population is normally distributed. It has a mean of zero, so the population mean, the parameter mu, is zero. It has a standard deviation of one, so the population standard deviation, sigma, is one. Let's imagine you were to take a random sample from this population, and that random sample had just 10 observations in it, so you see these 10 numbers. These 10 numbers were drawn from a normal distribution with a mean of zero and a standard deviation of one. If you were to take the average of those 10 numbers, you would get negative 0.1. Close to zero, not exactly zero. That negative 0.1, remember from our last lecture, is just a guess of the true population mean, and it doesn't look like too bad of a guess. The true value is zero, and I guessed negative 0.1. Not too bad. Well, let's imagine you took another sample, a completely different sample, 10 completely different numbers. They have a sample mean of negative 0.6. Okay, awesome. Again, we're taking another sample of numbers and looking at their average. Let's do it again.
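
The "take a sample of 10, write down its average" step can be sketched in a few lines of Python. This is only an illustration: the seed and the helper name `sample_mean` are assumptions made for this sketch, so the averages it prints will not match the numbers on the slide.

```python
import random
import statistics

def sample_mean(rng, n, mu=0.0, sigma=1.0):
    """Draw n observations from a normal population and return their average."""
    draws = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.fmean(draws)

rng = random.Random(42)  # arbitrary seed, used only so reruns give the same numbers

# Four separate samples of size 10 from a normal population with mu = 0, sigma = 1
means = [sample_mean(rng, n=10) for _ in range(4)]
for i, m in enumerate(means, start=1):
    print(f"sample {i}: x bar = {m:.1f}")
```

Each run of the loop is one sample, and each printed value is one point estimate of the population mean of zero.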
Let's look at another sample of 10 numbers. Here's another sample of 10 numbers from this same normal distribution, and with these 10 numbers we can take another average, and that average is 0.3. Let's do it again. We have another sample, another 10 numbers, and their average is 0.4. Let's imagine you kept doing this over and over and over. You looked at every single possible sample of size 10 from this distribution, and you looked at all of their sample means. That's what I'm talking about. Take a sample, write down its mean, take another sample, write down its mean, and do this over and over again as many times as you possibly can, until you've got all of the samples.

The question is, what would that distribution of sample means look like if I were to put those sample averages on a histogram? The best part is, we know. It's predictable. They follow a normal distribution. If you were to look at all of these sample means, these x bars would have a normal distribution. What would be the average of this normal distribution? If you were to take the average of all the x bars, you would get zero. If you were to take the standard deviation of all those x bars, it would be the population standard deviation, 1, divided by the square root of your sample size, 10. This is something we know about sample means. So notice again, it has the same mean as the population, but it's a little bit more narrow.

Now you may be thinking, okay, this kind of makes intuitive sense. If the population is normally distributed, then sure, I could believe that sample means are also normally distributed. Okay, let's look at a completely different population, one that doesn't look anything like a normal distribution. Let's look at a uniform distribution, and we've talked about uniform distributions previously. This uniform distribution again has a mean of zero and a standard deviation of one. But unlike the normal distribution, where you get a lot of values around zero and the further away from zero you get, the less likely you are to get those values, with a uniform distribution you have an equal chance of getting anything from negative 1.73 all the way up to 1.73. Any number in here has an equal chance of being selected.

So let's do the exact same thing that we did previously. Let's take a sample, again of 10 observations. If we were to take the average of that sample, it would be 0.3. Let's do it again and take another sample, another 10 observations. These 10 observations came from this population, where everybody has an equal chance of being selected. That's not the case with the normal distribution. In the normal distribution, there are more people around the center, hence the big hump in the middle, so they're more likely to be selected, whereas here everyone's got an equal chance.
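
Nobody can actually list every possible sample, but a simulation with many random samples gives a reasonable stand-in. The sketch below repeats the size-10 experiment 20,000 times for both populations and compares the mean and standard deviation of the resulting x bars with mu = 0 and sigma over the square root of n. The repeat count, the seed, and the names `sample_means` and `populations` are assumptions made for illustration. The uniform bounds use the square root of 3, which is where the lecture's 1.73 comes from, since a uniform distribution on that range has a standard deviation of exactly one.

```python
import math
import random
import statistics

def sample_means(draw, n, repeats, rng):
    """Return the averages of `repeats` independent samples of size n."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(repeats)]

HALF_WIDTH = math.sqrt(3)  # uniform on [-sqrt(3), sqrt(3)] has standard deviation 1

# Both populations have mean 0 and standard deviation 1
populations = {
    "normal": lambda r: r.gauss(0.0, 1.0),
    "uniform": lambda r: r.uniform(-HALF_WIDTH, HALF_WIDTH),
}

rng = random.Random(2024)  # arbitrary seed for repeatability
n = 10
results = {}
for name, draw in populations.items():
    x_bars = sample_means(draw, n, repeats=20_000, rng=rng)
    results[name] = (statistics.fmean(x_bars), statistics.stdev(x_bars))

theory_sd = 1 / math.sqrt(n)  # sigma / sqrt(n)
for name, (mean, sd) in results.items():
    print(f"{name}: mean of x bars = {mean:.3f}, sd of x bars = {sd:.3f}, theory sd = {theory_sd:.3f}")
```

Plotting `x_bars` on a histogram for either population is a natural next step if a plotting library is available.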
So you have sample one, with an average of 0.3. You have sample two, with an average of negative 0.1. Let's take another sample. It has an average of negative 0.2. You get what we're doing at this point. Let's take another sample. It has an average of 0.1. So, again, over and over and over, we're going to take a look at all of these samples, all of these averages, and we're going to plot their distribution. We're going to put them all on a histogram. And I know what you're probably thinking: cool, I bet these things follow a... wait a minute, they follow a normal distribution? They follow a normal distribution that also has a mean of zero and also has a standard deviation of one divided by the square root of our sample size, 10. Wait a minute. This population looks like a uniform distribution. The earlier population was a normal distribution. And you're telling me that no matter what the population is, I get the same shape for the sample averages? Yes. That's the beauty of mathematics.

This is what we call the central limit theorem. It applies as long as we take a large enough sample, which we consider to be 50 or more, so I guess our example wasn't exactly the greatest. We were taking samples of size 10, but it made it easier to show you the numbers. If we take a large sample, size 50 or more, the central limit theorem states the following about the sampling distribution of the sample mean: if you were to look at all possible samples of the same size, calculate the sample mean from each one of those samples, and plot that distribution, it would be approximately normally distributed, no matter what the original population looks like. The original population can be normal. It can be uniform. It can be exponential. It can be anything you want. It does not matter what the original population is. According to the central limit theorem, if I look at large enough samples, the sampling distribution of the sample mean is approximately normal. And not only is it approximately normal, it has a mean equal to the population mean, mu, and a standard deviation of sigma over the square root of n.

Wow, think about the power of what we've just done. I told you at the end of last lecture that if we only had a predictable pattern for sample statistics, like the sample mean, then it wouldn't matter if all we had was one sample and we didn't know the population mean, mu. We could still get some idea about what's going on. That's the beauty of it. No matter what your population looks like, sample means are going to follow an approximately normal distribution as long as your sample size is large enough.

Now you may be asking, what if my sample size isn't large enough? What if I have fewer than 50 observations, fewer than 50 people in my sample? Well, then the sampling distribution of x bar is only normal if the original population is normal. If your original population is normally distributed, then sample means are always going to be normally distributed. That's fine. The power of the central limit theorem is that if you take large enough samples, it doesn't matter what the population looks like. Your sample averages will still follow an approximately normal distribution.

So, how powerful is this? How is this helping us? Well, let's take a look at an example using our bike data set. The average daily number of total users is 4504, with a standard deviation of 1937. What is the probability that a sample of 50 days has an average number of users between 4000 and 5000? I'm not talking about a single day anymore, and I'm not saying that every one of the 50 days has to fall in this range. I'm asking about the average of the sample of 50 days. Let's take a look at how we can answer this.

Based on our previous example, all of the possible sample means from samples of size 50 would have the following distribution. We have a large sample size, 50 or more. Therefore, the sample means will follow an approximately normal distribution, and it doesn't matter what the original population looks like. Well, what's the center of that distribution? It's the same as the population mean, mu. It's 4504. What's the standard deviation of this sample mean distribution? It will be the population standard deviation, 1937, divided by the square root of our sample size, 50. That gives a standard deviation of the sample mean of 273.93. And, more importantly, we have a normal distribution.

Now, what have we learned about normal distributions? You can turn any normal distribution into a standard normal distribution. All you've got to do is take the point you're interested in, subtract off the normal distribution's mean, and divide by the normal distribution's standard deviation, and you can answer any question you want. Are you starting to see the power of this now? In previous chapters, we had to assume the distribution was normal before we could answer these questions. Here, as long as you deal with sample means that come from large enough samples, you already have a normal distribution, no assumption needed.

So, the average daily total number of users is 4504 with a standard deviation of 1937. What's the probability a sample of 50 days has an average between 4000 and 5000? Because I'm looking at a sample of 50 or more, the averages of those samples follow a normal distribution, so I can use this calculation. I take whatever number I'm interested in, subtract the mean, and divide by the standard deviation, which, remember, means subtracting off the population mean and dividing by the population standard deviation over the square root of n. And so we can just plug these numbers in. I want 5000, that's the number I'm interested in, minus the population mean, 4504, divided by the standard deviation of 273.93, and I get a z value of 1.81. If you remember, last time we were looking at this on a standard normal table. What we're saying is that the point 5000, on the distribution that has an average of 4504 and a standard deviation of 273.93, is the same point as 1.81 on a standard normal distribution, which means we can look it up in a table. The probability we see a number smaller than 1.81 is 0.9649. In other words, 96.49% of the time we will see a sample of 50 days having an average lower than 5000. But that's not the whole question. It wasn't just lower than 5000. It was between 4000 and 5000.
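
Written out with the lecture's numbers, the standardization step for the upper bound of 5000 looks like this. The symbols are the usual textbook ones and are an assumption about notation, not a copy of the slide.

```latex
\[
z \;=\; \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
  \;=\; \frac{5000 - 4504}{1937 / \sqrt{50}}
  \;\approx\; \frac{496}{273.93}
  \;\approx\; 1.81
\]
```

The denominator is the standard deviation of x bar, not the population standard deviation, which is the one place this differs from standardizing a single observation.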
So we've got to deal with that 4000 number as well. That's exactly what we're doing down here. We're doing the exact same calculation, except instead of 5000 we're doing it for 4000. Same idea: 4000 minus the mean, 4504, divided by the standard deviation, 273.93, gives us a point of negative 1.84. If we were to look that up on a standard normal table, the area to the left of negative 1.84 would be 0.0329. In other words, there is a 3.29% chance that a sample of 50 days has an average below 4000. Well, wait a minute. If I know the probability of being below 5000 and the probability of being below 4000, then I can find the middle. The middle would be 96.49% minus 3.29%, which is 93.2%, or a probability of 0.932. And this is a great way of viewing it, looking at the shaded area in the middle. The shaded area to the left of 5000 was 0.9649. The shaded area to the left of 4000 was 0.0329. So that means the area in the middle has to be 0.9320. There's a 93% chance that a sample of 50 days will have an average between 4000 and 5000 total users. Not all the days have to be in that range, just the average of these 50 days.

That's the power of the central limit theorem. That's the power of the normal distribution. That's why we focus so much on the normal distribution: because of properties like this, because sample means follow a normal distribution as long as you have a large enough sample size.

So let's imagine that instead of looking at a sample of size 50, we take a sample of size 100. The expected value, the average, would remain the same, 4504. However, the standard error, the standard deviation of x bar, would decrease. Instead of 1937 over the square root of 50, it would be 1937 over the square root of 100, which is 193.7. In other words, it would be a more narrow distribution, which makes sense, right? If you had a sample of 100 days, you're going to have a much better idea of what's going to happen. It's going to be a lot tighter, with a lot less spread, than a sample of 50 days. If I asked you, "Hey, what do you have more confidence in? What do you believe more, a sample of 100 or a sample of 50?" you'd probably say a sample of 100, because there's more data there. Well then, if you believe in it more, wouldn't it have a more narrow spread? That's the idea. You increase the sample size, your spread is going to get more narrow. You decrease the sample size, your spread is going to get a little wider, because you have a little bit less of an idea of what's going to happen.

Oh boy, we have talked about so much. Let's summarize here real quick. The sampling distribution of x bar is the probability distribution of all possible values of the sample mean, x bar, from samples of the same size.
Now, this sampling distribution of x bar has a mean, an expected value, equal to the population mean, mu, and a standard deviation of sigma over the square root of n. And now the most important thing we learned: if we have a large sample, a sample size of 50 or more, then the central limit theorem states that the sampling distribution of x bar is approximately normally distributed, regardless of what the population distribution looks like. And that's powerful. Hopefully, that gets you a little bit excited about statistics. I know it still gets me excited about statistics. But that is the end of this lecture, and I look forward to seeing you in the next one.
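
For anyone who wants to check the bike example without a z table, here is a minimal sketch using `NormalDist` from Python's standard library. The function name `prob_mean_between` is made up for this illustration, and the code simply assumes the normal approximation from the central limit theorem holds. The n = 100 probability is not computed in the lecture and is included only to show the effect of the smaller standard error.

```python
import math
from statistics import NormalDist

MU = 4504     # population mean of daily total users (from the lecture)
SIGMA = 1937  # population standard deviation (from the lecture)

def prob_mean_between(low, high, n, mu=MU, sigma=SIGMA):
    """P(low < x bar < high), treating x bar as normal with sd sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    x_bar = NormalDist(mu, standard_error)
    return x_bar.cdf(high) - x_bar.cdf(low)

for n in (50, 100):
    se = SIGMA / math.sqrt(n)
    p = prob_mean_between(4000, 5000, n)
    print(f"n = {n}: standard error = {se:.2f}, P(4000 < x bar < 5000) = {p:.3f}")
```

For n = 50 this lands on roughly the same 0.932 as the table lookup. The table rounds z to two decimals while the code does not, so tiny differences in later decimal places are expected.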



Last modified: Monday, June 22, 2026, 8:26 AM