Video Transcript: Interval Estimation with Data - Part 2
Welcome. Let's continue this section of the course on interval estimation with data by talking specifically about interval estimation for our sample point estimate, the sample proportion p hat. Let's remind ourselves of something from our previous lecture: an interval estimate can be computed by adding and subtracting a margin of error, some wiggle room, around your point estimate. So you take your original point estimate, and then you add and subtract your margin of error. The purpose of an interval estimate is to provide information about how close the point estimate is to the value of the parameter. Here the sampling distribution of p hat plays a key role in computing this margin of error. Again, this is why we talked so much in the previous section of the course about sampling distributions. If we know how point estimates change from sample to sample, then we can get an idea of what this margin of error should be. Now, the sampling distribution of p hat, if you remember, according to the central limit theorem, is approximately normal when we have a large enough sample size, and for us a large enough sample size was one where the sample size n times the proportion p is greater than or equal to five, and the sample size n times one minus the proportion p is greater than or equal to five. Essentially, we expect at least five successes and at least five failures in our sample. All right, so that means that if we were to take many different samples, take the sample proportion from each one of those samples, and plot those sample proportions in a distribution, you would see approximately the normal distribution.
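To make the idea of "taking many different samples" concrete, here is a minimal simulation sketch. The true proportion of 0.3, the sample size of 100, and the function name `simulate_sample_proportions` are hypothetical choices made for illustration; they do not come from the lecture.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    p_hats = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Hypothetical values chosen for illustration only.
p_true, n = 0.3, 100

# Large-sample condition from the lecture: n*p >= 5 and n*(1 - p) >= 5.
assert n * p_true >= 5 and n * (1 - p_true) >= 5

p_hats = simulate_sample_proportions(p_true, n, num_samples=2000)
print(f"mean of the p hats:   {statistics.mean(p_hats):.3f}")
print(f"spread of the p hats: {statistics.stdev(p_hats):.3f}")
```

The mean of the simulated p hats should land close to the assumed true proportion, and a histogram of `p_hats` would look roughly bell shaped.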
That normal distribution would have a mean of p, the true population parameter, and if you remember, it also has a standard deviation, sigma, that was the square root of p times one minus p over n. Well, how can we use this information to help us with this idea of margin of error? Let's remember the empirical rule. The empirical rule basically said that, for a normal distribution, if you take the mean and subtract one standard deviation, then take the mean and add one standard deviation, everything in that range would be about 68% of your data. If you were to go two standard deviations below and two standard deviations above the mean, you'd cover approximately 95% of your data, and three standard deviations below to three standard deviations above the mean would be almost all of your data, specifically 99.7%. Well, instead of using mu and sigma, let's put it in the context of the problem we have. We have p, and then we have sigma of p hat. Again, this would be the distribution of all the p hats. The p hats should be centered at p, and they will have a standard deviation of sigma of p hat, which remember was the square root of p times one minus p over n. So, let's look at an example where we're looking at two standard deviations below the mean and two standard deviations above it, so if we were to take p, subtract off two standard deviations, and then
take p and add two standard deviations, if we remember the empirical rule, this is about 95% of your data, specifically it's 95.44% of your data. Hold on a second. What did we learn about confidence intervals? Confidence intervals are a point estimate plus or minus a margin of error. Well, we could have a point estimate like p hat, then we could subtract and add a margin of error. Wait a minute, what are we doing down at the bottom of that normal distribution? We're taking some value p and then we're adding and subtracting two times the standard deviation. Well, wait a minute. If 95.44% of our data is in between these two values, that would leave about 2.28% below and 2.28% above. What we're basically doing is calculating something that looks like a confidence interval. We're taking some estimate and adding a number, and then we're taking the same estimate and subtracting that number. And so, by doing that, we're essentially creating something like a confidence interval. Think about our last lecture, where I said I was 95% confident something would happen. Well, take a look at what we have here. About 95% of our data in a normal distribution is within about two standard deviations of the mean. In fact, we can actually do this for any percentage. The empirical rule showed us that, but we also talked about that when we talked about standardized normal distributions. I can tell you the cutoffs for the middle of a normal distribution at any percentage; 95% is easy because it's about two standard deviations.
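The empirical-rule percentages quoted above can be checked numerically. This is a small sketch using `NormalDist` from Python's standard library; the helper name `middle_area` is just an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def middle_area(k):
    """Probability that a normal value falls within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {middle_area(k):.4f}")
```

The three results should come out close to the 68%, roughly 95.4%, and 99.7% figures from the empirical rule; small differences in the last digit against a printed table come from table rounding.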
The middle 68% would be about one standard deviation, but remember, because we have something like the standard normal table, I can look at anything. I could look at the middle 90%, I could look at the middle 80%, I could look at the middle 87%, whatever we'd like, and that's the idea of what we're doing with this confidence interval: I can take my point estimate, add and subtract some number times the standard deviation of my point estimate, and build a confidence interval. It's just a matter of what you want that distribution to look like. What do you want those percentages to be? If I want the middle to be 95%, then each tail contains two and a half percent. If you wanted the middle to be something like 90%, then each tail would contain 5%. So, as you can see, we can do this with any percentage, but let's again work with this idea of a 95% confidence interval. So, if I want the middle 95% of my data, what I'm basically telling you is we can create a confidence interval by taking our point estimate and adding and subtracting some number, we'll call it z alpha over two, times the standard deviation of p hat. Well, what is this z alpha over two? This z alpha over two is the point on a standard normal distribution that cuts off a tail area of alpha over two. So again, if I wanted 95%, then alpha would be 5%, because 95% is the middle, so one minus 95% tells me what I have left over. That would be 5%, but I'm going to split that 5% into two pieces, 2.5% in the bottom and 2.5% in the top, so what I need to know is what point on a standard
normal distribution is going to be the point where I have 2.5% of my data in the tail. We know how to do this. We looked at this when it came to our sampling distributions, as well as our standard normal score calculations, over the last two sections of the course. Like I said, we're just building upon everything we've learned. So, if we were to look in the table and say I want a probability of 2.5%, or 0.025, what value on a standard normal distribution gives me a probability of 0.025? Well, let's take a look in the table. We want to find 0.025 in the middle of the table, not in the edges, not on the far left-hand side or across the top. If we found 0.025 in the middle, we would see that the z value, the spot on the normal distribution, would be negative 1.96. Wait a minute, hold on. Now that's really close to 2, isn't it? Oh, well, remember, we said with the empirical rule that within approximately two standard deviations of the mean is approximately 95% of the data. Exactly two standard deviations was 95.44%, so for exactly 95% it's close to two standard deviations, it's 1.96 standard deviations. So if we were to take our point estimate and subtract off 1.96 times the standard deviation of our point estimate, that would leave only 2.5% below that value. If we were to do the same thing on the other side, take our point estimate and add 1.96 times the standard deviation, we would have only 2.5% of our data above that value, which means we'd have 95% of our data in between those two values. And that's the idea of margin of error: I'm adding in some notion of wiggle room. I have p hat plus or minus this margin of error, where the margin of error is some point on a standard normal distribution times the standard deviation of your estimate, which for us, remember, was the square root of p times one minus p over n. So if you wanted to calculate a confidence interval for p built around p hat, this is the equation.
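The table lookup described above can also be done in code. The following is a minimal sketch that uses the inverse CDF of the standard normal distribution in place of a printed table; the function name `z_critical` is a hypothetical helper, not something from the lecture.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z alpha over two for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # inv_cdf(alpha / 2) is the negative cutoff with alpha/2 in the lower tail;
    # negate it to get the positive value used in the margin of error.
    return -NormalDist().inv_cdf(alpha / 2)

for confidence in (0.80, 0.90, 0.95):
    print(f"{confidence:.0%} confidence -> z = {z_critical(confidence):.3f}")
```

For 95% confidence this should agree with the 1.96 read from the table, and lower confidence levels should give smaller values.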
So the confidence interval for the proportion p with a confidence coefficient of one minus alpha, basically an error rate of alpha, is the following: p hat plus or minus some number from a standard normal table, we'll call that z alpha over two, times the standard deviation of p hat, which is the square root of p times one minus p over n. Now again, what do I mean by this confidence coefficient of one minus alpha? For example, if you want a 95% confidence interval, that means you're going to be wrong 5% of the time. Then you would look up the spot on the normal distribution with a tail area of alpha over two, 2.5%. Why? Because again, remember we're splitting that 5% error: half of it goes below our interval, half of it goes above our interval, and so that's what we're looking at with this alpha over two. Now, do you notice a problem with this equation? It has p in it. We don't know p, that's the whole point. We're trying to guess p. P is the population proportion. We don't know this; however, we have a guess. Our guess is p hat. So that's exactly what we're going to use. We're going to use p hat plus or minus this point on a normal distribution, z, times the square root of p hat times one minus p hat over n. Now, when we estimate a standard deviation of a statistic, in this case sigma p hat, instead of calling it a standard deviation, we change its name a little bit, we
call it a standard error, basically saying, hey, look, I know that the real standard deviation would involve the number p; however, I don't know the number p, so I'm going to have to guess at this standard deviation. Because I'm guessing at it, it's no longer the real standard deviation, it's a guess of the standard deviation, and we call that guess a standard error. So this number here is the standard error of p hat. Remember, the standard deviation of p hat would be the square root of p times one minus p over n, but because we don't know p, we have to estimate it with p hat. It's now a standard error. Whew, I'm throwing a lot of terminology, a lot of concepts at you. Let's go ahead and work through an example, and hopefully that can solidify things. Say you think that people are more likely to rent a bike on a clear or cloudy day compared to a misty, rainy, or snowy day. Your data is a sample of 731 days, and 63% of your sample is clear or cloudy, so we're going to build a 90% confidence interval for the true proportion of clear or cloudy days where your company operates. So, okay, let's think about this first. I want to build a 90% confidence interval. In other words, if I were to take 100 samples and build an interval from each one, I'd expect about 90 of those intervals to contain the truth, and I'd expect about 10 of them not to. So, again, we're going to do this process in theory over and over and over again. In reality, we get one sample, and we're just relying on the procedure producing a good interval more often than not. So, okay, we want a 90% confidence interval. Well, if we want our confidence interval to be 90% confident, then that means we're going to have a 10% error. Well, that 10% error doesn't always go below or always go above. It's split into two pieces, so half of it, 5%, is going to be below our interval, and 5% is going to be above our interval. The question is, what is the point z alpha over two?
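As an aside, the repeated-sampling reading of "90% confident" described above can be illustrated with a short simulation. This is only a sketch: it pretends the true proportion is exactly 0.63 (in practice it is unknown), reuses the sample size of 731 from the example, and the function name `coverage_rate` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(p_true, n, z, num_samples, seed=0):
    """Fraction of simulated samples whose interval p hat +/- z * SE contains p_true."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        p_hat = successes / n
        standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - z * standard_error <= p_true <= p_hat + z * standard_error:
            hits += 1
    return hits / num_samples

# Cutoff that leaves 5% in each tail, i.e. the middle 90%.
z_90 = NormalDist().inv_cdf(0.95)

# Assumption for illustration: treat 0.63 as the true proportion.
rate = coverage_rate(p_true=0.63, n=731, z=z_90, num_samples=1000)
print(f"share of intervals containing the true proportion: {rate:.3f}")
```

The share printed should be near 0.90, which is what "90% confident" means as a statement about the procedure rather than about any single interval.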
What is the point on a standard normal distribution where 5% of the data is in the tail? That's the hard part, because we know for a 95% confidence interval that number is close to two, it's 1.96, but what about a 90% confidence interval? Again, we could go to our normal table to look this up. If you were to look in the middle of your normal table and try to find 5%, you would see something like 0.0505 and 0.0495. Ooh. So 5%, 0.05, is right in the middle of those two numbers; they're both the same distance away. Well, we could look at negative 1.64 or negative 1.65. I'd be okay with either of those numbers, but in reality it's actually halfway between them, 1.645. So, if we were to take our sample statistic and subtract off 1.645 multiplied by the standard error of our statistic, that would be the lower bound on our confidence interval, and then we do the reverse, where we add 1.645 times the standard error, and that would be the upper bound. So, let's do that. We're going to take our sample statistic, 63%: 63% of the days in our sample are clear or cloudy. Then we add and subtract 1.645 times the standard error, where again that 1.645 comes from the fact that we want to be 90% confident. Now we've seen two numbers here: 95% confident is 1.96, 90% confident is 1.645. The more
confident we are, the bigger this number is going to be, which is going to make our interval wider and wider and wider, which makes sense, right? You're more confident in a wider interval. For example, I'm 100% confident that the average age of people listening to this lecture right now is between zero and 100. Guarantee it. Now, if you asked me to be a little less confident, maybe I'd say I'm only 90% confident that the average age of everyone listening to this lecture is between 18 and 35. Well, again, why am I only 90% confident? Because I shrunk my interval down. There's a chance I could be wrong when I have a really small interval. When I have a really large interval, I have a better chance of being right. So notice how we went from 95% to 90% and that number, that z, changed. It's going to make our interval a little smaller. So we have 0.63 plus or minus 1.645 times the square root of 0.63 times one minus 0.63 divided by 731. If you were to do that calculation, you would have 0.63 plus or minus 1.645 times 0.018, or in other words, 0.63 plus or minus 0.03. One way of thinking about that is we think that 63% of the days are clear or cloudy, plus or minus 3 percentage points. Another way of thinking about it is we think that between 60% and 66% of our days are clear or cloudy, and we're 90% confident about that interval. So hopefully that gives you an idea of a real-world example of this. In fact, if it happens to be election season at the time that you're listening to this video, this is exactly what they do with polls that you see on TV, where they tell you that candidate A has 32% of the vote, plus or minus 5%. This is exactly what they're doing, literally this exact calculation, and so now you know how to do those calculations. All right.
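The arithmetic in the bike example can be reproduced with a few lines of code. This is a minimal sketch under the lecture's numbers (p hat of 0.63, 731 days, 90% confidence); the function name `proportion_confidence_interval` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence):
    """Return (lower, upper, margin) for p hat +/- z alpha over two * standard error."""
    z = -NormalDist().inv_cdf((1.0 - confidence) / 2)
    # Standard error: the standard deviation formula with p hat plugged in for p.
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin, margin

lower, upper, margin = proportion_confidence_interval(p_hat=0.63, n=731, confidence=0.90)
print(f"margin of error: {margin:.3f}")
print(f"90% interval: ({lower:.3f}, {upper:.3f})")
```

Calling the same function with `confidence=0.95` gives a wider interval, which matches the point that more confidence costs you a bigger margin of error.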
Let's summarize. The confidence interval for the proportion p with a confidence coefficient of one minus alpha, basically an error rate of alpha, is the following equation: you take p hat, then you add and subtract the margin of error, where that margin of error has two pieces. The piece from the standard normal distribution is just a number, z alpha over two, and it is multiplied by the standard error of p hat, our guess of the standard deviation, which for us is the square root of p hat times one minus p hat over n. Remember, when we estimate the standard deviation of a statistic, it's no longer called a standard deviation. It's now called a standard error. I know we've thrown a lot at you with this lecture, but now you can hopefully see all the foundation that we've been building. We talked about probabilities, we talked about the normal distribution, we talked about sampling distributions, specifically of p hat, and they've all laid the groundwork for us to be able to build this confidence interval, so that when we get a guess, we can take that guess and add a margin of error to it. That is the end of this lecture, and I look forward to seeing you in the next one.