Video Transcript: Interval Estimation with Data - Part 1
Welcome to this section of the course. What we are going to do is build upon all the previous sections and talk about the idea of estimating not just a single point but an entire interval with our data. If you think about it, a point estimator is probably not going to be the exact value of the true thing that it's estimating, right? If you had to guess at something, your guess would hopefully be close, but it's probably not going to be exactly right. And because it's not going to be exactly right, there is going to be a little bit of wiggle room to our estimates, a margin of error, if you will. So, how can we give a good margin of error to our point estimates? That's what the last section of the course really laid the foundation for. If you think about it, we now have sampling distributions from the last section of the course. What do sampling distributions do to help us? Sampling distributions are distributions of point estimates. So, if I were to give a point estimate, like a sample mean, to estimate a population parameter, then, because I know how sample means behave, I can actually give some idea of a margin of error, some wiggle room, if you will, around my point estimate. That's what lays the foundation for what we call interval estimates. An interval estimate can be computed by adding this margin of error to, and subtracting it from, the point estimate. The reason we have interval estimates is to provide information about how close the point estimate is to the value of the parameter. You've probably seen things like this before. Let's imagine you've seen election coverage on TV. A lot of times they'll say candidate A has 32% of the vote, plus or minus 5%. At least that would be the idea in a poll.
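Here is a minimal sketch of that plus-or-minus idea, using the poll numbers from the example above. The function name interval_estimate is made up for illustration and does not come from any library; how the margin of error itself is calculated is not shown here.

```python
def interval_estimate(point_estimate, margin_of_error):
    """Return (lower, upper): the point estimate minus and plus the margin of error."""
    lower = point_estimate - margin_of_error
    upper = point_estimate + margin_of_error
    return lower, upper


# Poll example: candidate A has 32% of the vote, plus or minus 5 points.
lower, upper = interval_estimate(32, 5)
print(f"Candidate A: somewhere between {lower}% and {upper}% of the vote")
```

The point estimate sits in the middle of the interval, and the margin of error sets how far the interval reaches on either side.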
So all right, they're trying to poll people to see who's going to vote for candidate A, but they know, since this is not the actual election yet, that they're just polling people to get a guess of who's going to vote for candidate A. Then what they do is they say, well, I know I have just a sample, and this is just a guess, so that guess is going to have some wiggle room, and that's what that plus or minus 5% in our example is. So again, if I said candidate A is going to get 32% of the vote, plus or minus 5%, that 5% would be the margin of error. So, again, what we're trying to do is just provide a little bit of information about how close we think the point estimate is to the value of the parameter. Now, this does not mean that your interval estimates will always contain the truth. Again, these are just guesses, and so we can't be 100% sure, because we don't know the truth. If we knew what the population parameter was in a real-world example, then why would we ever take a sample and estimate anything, right? That makes sense. So, if I don't actually know what this true population parameter is, and again, let's just use an election as an example, if I truly do not know what proportion of people are going to vote for a certain candidate, then the whole idea is I'm trying to provide an estimate, but that estimate will come with a chance of being wrong. Every time we guess at something, we could be wrong, and so that's the idea of what we're trying to do. So, what do we call these things formally? Formally, we call them
confidence intervals. Confidence intervals are interval estimates where we say we have a certain level of confidence in the interval itself. For example, let's say we are 95% confident that the population average daily number of total users of the bike rental company is between 4000 and 5000 people. If you want, you could think about it as: we are 95% confident that the population average daily total users of the bike rental company is 4500 people, plus or minus 500. That's another way of thinking about it being between 4000 and 5000. But what does this 95% confident even mean? It seems like a loaded statement. So what is 95% confident? 95% confident basically means this: if we were to take many, many samples, all of the same size (we want to keep it fair), each one of those samples is going to produce a different confidence interval. Well, why is that? Each sample, if you remember from our last series of lectures, provides a different point estimate, right? So, if I were to take a sample and compute something like a sample average from it, there's no guarantee that the next sample I take is going to have the same sample average. So, again, if I were to look at heights of individuals and take a sample of 100 people, the average height of those 100 people is probably not going to be the average height of a different sample of 100 people, so each sample is going to produce a different point estimate. Well, if it produces a different point estimate, then it's going to produce a different range of values for what I think something like the average height would be, or in our example here, the average total daily users for our bike rental company.
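To see that different samples give different point estimates, here is a small simulation sketch of the heights example. The population values (mean 170 cm, standard deviation 10 cm) and the function name sample_means are assumptions made up for illustration; only the sample size of 100 comes from the example above.

```python
import random
import statistics


def sample_means(population_mean, population_sd, sample_size, num_samples, seed=0):
    """Draw several samples from a normal population and return each sample's average."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


# Hypothetical population of heights: mean 170 cm, standard deviation 10 cm.
means = sample_means(population_mean=170.0, population_sd=10.0,
                     sample_size=100, num_samples=5)
for i, m in enumerate(means, start=1):
    print(f"sample {i}: average height = {m:.2f} cm")
```

Each of the five samples of 100 people gives a slightly different average, so an interval centered on each average would cover a slightly different range.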
Okay, so if each sample produces a different point estimate, and therefore each sample produces a different confidence interval, the idea of 95% confidence means that 95% of the confidence intervals we construct will contain the truth. In other words, 95% of the time our confidence intervals would contain the true parameter of interest, the number we don't know. Well, how do we know that if we don't know what the real answer is? Ah, this goes back to that idea of sampling distributions. We know what sample means, for example, look like in terms of their distribution, so we have an idea of how many sample means are going to be close to the truth and how many are going to be far away from the truth, and we're going to use that information to help us build these confidence intervals. Let me try to show you an example of this visually. Let's imagine this is your population. Okay, so your population is normally distributed. The population average is that number there, mu. That is something you do not know in the real world. I do not know what the true average of the entire population is, but I'm going to try to guess it. So, let's take a single sample and make a guess. The average of that single sample we're going to call x bar. Specifically, this is our first sample, so let's call it x bar one. Notice
how my guess, my point estimate x bar one, is a little bit higher; it's to the right of the truth, mu. Now, again, in the real world, I wouldn't know where mu was. All I can see is x bar one. However, the arrows on either side of x bar one represent the margin of error, the calculation we're going to learn how to do in this section of the course, that wiggle room that we're going to add to our point estimates to give us a range of confidence. Notice how the true population parameter mu is contained inside of that interval, if you think about the ends of the interval being the tips of the arrows. So the true value of mu is contained inside of this confidence interval. Now let's imagine we take another sample, and we'll call its average x bar two. You'll notice the average in this sample is a little bit lower than the average in the first sample, and it also happens to be lower than the true population parameter mu. But notice that when we put the same margin of error, that same wiggle room, on x bar two, this interval also contains the truth, the true population parameter mu. Again, think about this interval going from one tip of the arrow to the other tip of the arrow, where x bar two is right in the middle. So we've taken two samples, and both of them happen to contain the truth. Let's take a look at this third sample here. Oh, wait, what do we see?
This third sample, x bar three, has an average that is really high above mu, way to the right of mu, so far to the right that even though we put some wiggle room around x bar three, the interval misses. Our point estimate is x bar three, and our interval estimate is those arrows on either side of x bar three. Even if we go from tip to tip of the arrows around x bar three, the interval does not contain the true value of mu, and so this interval estimate missed. And the thing is, we can keep doing this over and over again. Here's another example, x bar four, which is again another point estimate from another sample of the same size. We put the same margin of error around it, and we can see, okay, this one contains the truth. And if we were to keep doing this repeatedly, the whole idea of 95% confidence is that 95% of the samples, 95% of the confidence intervals you see, are going to contain the truth, mu, but there are going to be some 5% that are not going to contain the truth, like x bar three that you see here on the screen. Now I know what you're thinking: a lot of times we only get one sample. This idea that 95% of the time our confidence intervals would contain the truth, that's great. If I could take 100 different samples, then I'd expect about 95 of them to be right, but I only get one sample. How do I know if my one sample contains the truth? You don't, but you're putting your confidence in the procedure that you used to get that one sample and the interval you calculated for it. It's kind of like flipping a coin. Let's imagine you had an unfair coin: 90% of the time it lands on heads, 10% of the time it lands on tails. Well, you know that if you flip the coin many, many times, it's going to land on heads more than it's going to land on tails. However, I'm only giving you one
flip. Now, yes, with that one flip, it could land either on heads or on tails, but wouldn't you still bet on it landing on heads? Of course you would, because you know it lands on heads more often than it lands on tails, and it's the same idea here. Yes, you only get one sample. Are we sure that our one sample is actually one of the 95% of samples that contain the truth? No, we're not sure about that at all. However, we're relying on the fact that if we were to do this over and over again, 95% of the time it would be right, so if that's the case, we can trust that one sample. Another way I like to think about this is going to the free throw line in basketball. Let's imagine you had two people that you could send to the free throw line, and you had a significant amount of money riding on them making the shot. One of those people is a 70% free throw shooter. The other is a 95% free throw shooter. Now you only get one shot, and either one of them could make that shot, but who would you rather have take the free throw? Well, the person who's got a greater chance of making it, and that's the same idea here. You only get one sample, but that one sample, along with its corresponding interval, was calculated in a way that is right 95% of the time. So that's what you're really putting your confidence in. You're putting your confidence in the procedure itself, in the process that we're going to learn here in this section of the course and in these corresponding lectures to help us get a good interval. Now, one quick caveat: just because we say we are 95% confident that the population average is going to fall inside of our interval, that is not the same thing as saying there's a 95% chance the population parameter falls inside our confidence interval. Well, hold on a second.
Doesn't that sound like the same thing you said earlier, that 95% of the time our intervals would contain the truth? Isn't that basically saying that there's a 95% probability that the parameter is going to fall inside of our interval? No, no, no. Careful. Notice the difference in how many intervals we're talking about in the first statement compared to the second. The true statement is saying that if we were to produce many, many intervals, 95% of them would be right. The other statement is saying that there's a 95% chance that the truth will fall into this one interval. No, that's not the case. Careful here. What we're saying is that the population parameter is not the thing that's moving. The interval is what is moving. So, again, we're not saying there's a 95% chance the population parameter would land in our one confidence interval. We're saying that 95% of all of our intervals would actually contain the population parameter. So, what are you putting your probability on? I'm saying we need to put the probability on the intervals, not on the population parameter. That thing we're estimating, that mu, that average daily number of total users, or that average height of Americans, whatever example you'd like, is not moving. That number is fixed. What's
moving is our interval, because each sample we take is going to move our interval. So that being the case, what we're saying is we're 95% confident in the interval, in the actual process that we're using. We're not saying there's a 95% chance the number we're estimating is going to just so happen to fall into our one single interval. I know it's a little bit confusing, and it sounds like a lot of mathematical mumbo jumbo, but it is important. It really is. What are you putting the randomness on? Are you saying that the number you're estimating is random and it's moving around, or are you saying that the intervals you get from the samples you take are random? The latter is what we're saying. The randomness is in our samples. We take random samples, those random samples are going to produce different sample means, and those different sample means are going to be closer to or further away from the truth. I'm saying that, the way we're going to calculate the intervals, 95% of the time the sample mean is close enough to the truth to be within that margin of error. That's the idea. All right, let's summarize. Confidence intervals are interval estimates where we say we have a certain level of confidence in our interval, but what does confidence mean? 95% confidence implies that if we were to take many samples, all of the same size, each producing a different confidence interval, then 95% of these confidence intervals would contain the true parameter. Now, again, which one you have, you do not know. Maybe your one sample is part of the 95% that actually contain the truth. Maybe your one sample is part of the 5% that don't. If you do this process over and over in your career, you're going to be right more often than you're wrong, but you will be wrong sometimes. I guess the only real nice thing about it is that we never know when we're right or when we're wrong, because we don't know the true value of mu. We're just guessing.
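The repeated-sampling idea can be checked with a rough simulation. This is only a sketch under stated assumptions: the population is normal, its standard deviation sigma is treated as known, and the margin of error is taken to be 1.96 times sigma over the square root of n, which is the standard 95% margin in that known-sigma case. The lecture has not yet shown how the margin is calculated, and the values sigma = 1900 and n = 50 are made up; only mu = 4500 echoes the bike rental example.

```python
import math
import random
import statistics

Z_95 = 1.96  # standard normal critical value for 95% confidence


def coverage(mu, sigma, n, num_samples, seed=0):
    """Fraction of simulated intervals (x_bar +/- margin) that contain the fixed mu."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    margin = Z_95 * sigma / math.sqrt(n)  # same margin for every sample (sigma assumed known)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)  # this sample's point estimate
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1  # this interval contains the truth
    return hits / num_samples


# mu never changes; only the intervals move from sample to sample.
rate = coverage(mu=4500.0, sigma=1900.0, n=50, num_samples=10_000)
print(f"{rate:.1%} of the intervals contained mu")
```

In the code, mu is a fixed number and the randomness lives entirely in the samples, which is exactly where the lecture says the probability belongs. The printed fraction should land near 95%, though any single interval either contains mu or it doesn't.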
One thing you have to be careful of with confidence: confidence is not the chance that the population parameter falls inside of our one confidence interval. I know we've talked a lot about this idea of confidence, and it is a harder concept to wrap your mind around. Definitely take some time, go back and rewatch this lecture, and ponder this idea of confidence a little bit. It takes some time and some thinking to fully grasp, and that's okay. That is the end of this lecture, and I look forward to seeing you in the next one.