Let's continue our discussion around distributions of discrete data by now moving into something we call the expected value, as well as the variance. The expected value, also known as the mean or average of a random variable, is a measure of its central location. Wait, hold on a second. Did I just say average or mean? Well, yes, this is actually just like the mean that we had talked about previously. However, you can think about the expected value as sort of a weighted mean, an average where not every single point has the same weight inside of the calculation. Let me show you.

Let's take a look at that equation. The expected value of a random variable X is denoted as a capital E followed by parentheses with X in the middle. Sometimes people refer to this as the Greek letter mu, which kind of looks like a u with a little tail on the front. But take a look at the actual calculation on the far right-hand side: there's that summation symbol again. So I'm going to sum over all of the values of X from i equals one to n, so again I'm summing up x1, x2, x3, x4, and so on and so forth across all of the values of X. Well, so far, so good. So far, it looks just like the regular mean. However, instead of dividing that sum by n, I'm multiplying each value by some kind of probability. Remember, we denote probability with a capital P, so we're going to say it's the probability of X, again being our random variable, taking on a specific value xi. That looks a little bit weird. Think about that probability, capital X equals little xi, as the probability that your random variable equals one particular value out of the possible values that it has.
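Written out in symbols, the equation described above can be sketched as follows. The slide itself is not reproduced in this transcript, so the exact notation (lowercase x_i for the possible values, n for how many of them there are) is an assumption.

```latex
E(X) = \mu = \sum_{i=1}^{n} x_i \, P(X = x_i)
```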
So, for example, in the TV example we worked on in our last lecture, we would take the probability that X equals zero, then the probability that X equals 1, and so on and so forth. But like I said, think about the expected value as a weighted mean or a weighted average, where that probability serves as the actual weight. Well, again, what do we know about probabilities under a classical estimation of probability? Everything would have the same probability, and if we had n observations, then each would have a probability of one over n. And so, okay, if we have the probability of one over n for every single observation, then wait a minute, we have our regular average calculation. This looks just like the average that we had before. So again, we have an average here; it's just that typically we have a weighted average, where not every observation has an equal probability of happening. And since not every observation has an equal probability of happening, we don't want to just treat every observation as equal when we take the average.

Let me show you an example again. Let's go back to our discrete probability example from our last lecture. Let X be the number of TVs sold at a small department store in one day. Now, again, X can only take the values of 0, 1, 2, 3, 4, and 5. If all I told you was that information, that the values of X could be zero through 5, and I asked you what the average number of TVs sold in a day is, what you would probably do is just take the average of 0, 1, 2, 3, 4, and 5. You would add those six numbers together, and then you would divide by 6, as if each number had an equal chance of happening.

However, let's take a look at the actual data. Remember this data table that we calculated last time. In the far left-hand column, we have the number of TVs sold, again zero through 5. The second column is the number of days that actually happened, the frequency. The third column is the respective probability that that number of TVs was actually sold, or remember we called this the relative frequency. So, let's take a look again. In the first row, there's a 25% chance that we sell zero TVs, there's a 23% chance we sell one TV, a 19% chance we sell two TVs, a 12% chance we sell three TVs, a 14% chance we sell four TVs, and a 7% chance we sell five TVs. Well, one thing I can tell you: it's not an equal probability of selling each one of those different numbers of TVs. Right, I don't have the same chance of selling zero TVs as I do of selling five TVs, and so if I were to just take an average of the numbers 0, 1, 2, 3, 4, and 5, I wouldn't be accounting for the fact that not each number has an equal chance. That's what the expected value does. The expected value takes into account not only the value of the variable, 0, 1, 2, 3, 4, 5, but also the probability that the variable takes that value.

So let's calculate x times the probability that X equals x for that very first row. Again, we're getting that from the right-hand side of this equation. Okay, so I would take the value of zero TVs sold and I would multiply it by 0.25, the probability that happens, and that would give me the number zero again. How did I get this number? I multiplied the number in the first column, TVs sold, by the probability of that actually happening, the third column, 0.25. Okay, well, let's take a look at the second row. The second row, first column would be one. There'd be one TV sold. Well, what's the probability of me selling one TV? Let's go to the third column: 0.23.
If I multiply those two numbers together, I would get one times 0.23, which is just 0.23. Well, let's take a look at the third row. The third row would be two TVs sold, with a probability of 0.19. If we multiplied those two numbers together, that would give us a value of 0.38. I've also filled in the last three rows here for you, where we have three TVs sold, four TVs sold, and five TVs sold. If we take a look at that last column, x times the probability that big X equals little x, we see the probabilities multiplied by the actual values of the variable. Again, that is the right-hand side of this equation. So, if you were to add together all of the values in that last column, 0 + 0.23 + 0.38 + 0.36 + 0.56 + 0.35, you would get that highlighted number at the very bottom, 1.88. Well, what does that highlighted number mean? On average, we expect to sell 1.88 TVs per day. Wow, that's helpful. That's great.

Now, again, why is it not just the average of the numbers 0, 1, 2, 3, 4, 5? If you were to take the average of those six numbers, you would get 2.5, which might make sense if every value were equally likely. However, we know that's not the case. There's a bigger chance of me selling only zero or one TV than there is of me selling four or five TVs, so that means that I shouldn't count four or five as highly as I count zero or one, right? So, again, if I want to know how many TVs I expect to sell in a day, what the typical number of TVs sold per day is, then instead of just taking the average like we learned a few sections ago, we're going to take the expected value. Again, it's kind of like the average, but we're going to weight that average by the likelihood of each value actually happening. It's more likely to get a zero or a one, so those are going to be weighted more heavily than a four or a five, and that means our number is a little bit lower than if we just took the original average. So again, we expect to sell 1.88 TVs per day on average.

Wonderful. So we have taken this average and come up with a weighted version of it, a better way of summarizing what that typical value would look like. Now, for example, in our bike data set, when we took the average temperature, every day had an equal chance of happening, so just taking the average temperature was completely fine. But here, the numbers of TVs sold do not have an equal chance of happening, so we can't just take the original average; we need to take the expected value.

Of course, if we have changed our way of calculating the average, we can also change the way we calculate variance. Again, we can do very much the same thing. Instead of just taking the original variance equation, we are going to replace dividing by n minus one with multiplying by the probability that X equals xi. So, again, we're going to take a weighted version of what we saw previously. So, again, what is variance?
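Before moving on to the variance, the expected-value arithmetic above can be checked with a short Python sketch. The values and probabilities are the ones read out in the lecture; the names `tvs_sold`, `probabilities`, and `expected_value` are made up for illustration and do not come from the course materials.

```python
# Values of X (TVs sold in a day) and their probabilities, as read out in the lecture.
tvs_sold = [0, 1, 2, 3, 4, 5]
probabilities = [0.25, 0.23, 0.19, 0.12, 0.14, 0.07]


def expected_value(values, probs):
    """Weighted mean: the sum of each value times its probability."""
    return sum(x * p for x, p in zip(values, probs))


# Weighted by the observed probabilities.
weighted = expected_value(tvs_sold, probabilities)

# With equal weights of 1/n, the same formula reduces to the ordinary mean.
equal_probs = [1 / len(tvs_sold)] * len(tvs_sold)
unweighted = expected_value(tvs_sold, equal_probs)

print(round(weighted, 2))    # 1.88
print(round(unweighted, 2))  # 2.5
```

The second call is only there to show that the plain average is the special case where every value gets the same weight.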
Variance is a measure of spread; it's a measure of variability, and it's defined here by this equation. Var of X, or the variance of X, we typically denote as this little symbol squared. That little symbol is the Greek letter sigma, so sigma squared. And again it's the same calculation we saw previously. We take the sum, again the big summation symbol from i equals one to n, of the values of xi minus the mean, squared. So we take xi minus its expected value (remember, that is what mu was, the expected value), and we square that number. Because remember, what was variance? Variance was trying to get at sort of an average of squared distances from the mean, or here the average of the squared distances from the expected value. So everything all the way up through that squared term looks exactly the same as what we had for the original variance equation. The only difference now is that instead of dividing by n minus one, we're going to multiply each squared term by the probability that X equals xi. Again, we're taking a weighted version of this. Of course, if we can do this for variance, we can also do it for standard deviation. If you remember from our previous section, standard deviation is just the square root of variance.

All right, let's work it out, sort of one by one, so we can see this calculation. I take the same original four columns that I showed you for the expected value calculation: the number of TVs sold, the frequency, the probability that X equals x, as well as x times the probability that big X equals little x. So, again, remember the fourth column you see here is the first column multiplied by the third column. If we take that fourth column and add all the values together, that's our expected value of 1.88. So, when we use this calculation, where we do x minus mu, x minus that u with a little tail in the front, that is the same thing as saying x minus the expected value. So, if we take x, that would be zero, and we subtract off the expected value, 1.88, then x minus the expected value would be -1.88. Do you see how we got that? We're going to take that first column, and we're just going to subtract off the number 1.88 from every single value. So again, that first column, zero minus 1.88, is -1.88. In the second row, one in the first column minus 1.88 is -0.88. In the third row, the first column value is two, and two minus 1.88, looking at that fifth column, gives the value of 0.12, and so on and so forth. You can see what all of the original values of X, 0, 1, 2, 3, 4, 5, are minus their expected value of 1.88.

All right. The next column that's blank is x minus the expected value, squared. So we're going to take that previous column, for example -1.88, and we're going to square that number. Remember what squared means: we're just going to multiply it by itself, so -1.88 times -1.88 gives us 3.53. And again we're going to do that for all of the values in the x minus mu column. So -0.88 times itself would give you 0.77. If you take 0.12 and multiply it by itself, if you square it, it would be 0.01, and so on and so forth. All right, lot of calculations here, but now we have all of our squared distances from the mean, squared distances from the expected value.
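The two columns built so far, x minus mu and its square, can be reproduced with a few lines of Python. This is only a sketch: the variable names are invented for illustration, and the rounding to two decimals is an assumption chosen to mirror the values read out from the table. The entries for three, four, and five TVs are computed here and are not quoted from the lecture.

```python
tvs_sold = [0, 1, 2, 3, 4, 5]
mu = 1.88  # expected value worked out in the lecture

# Fifth column: distance of each value from the expected value.
deviations = [round(x - mu, 2) for x in tvs_sold]

# Sixth column: squared distance, rounded to two decimals like the table.
squared_deviations = [round((x - mu) ** 2, 2) for x in tvs_sold]

for x, dev, sq in zip(tvs_sold, deviations, squared_deviations):
    print(x, dev, sq)
```

The first three printed rows should agree with the values discussed above: -1.88 and 3.53, -0.88 and 0.77, 0.12 and 0.01.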
Now all we're going to do is just multiply that 1, 2, 3, 4, 5, sixth column by the 1, 2, third column. We're going to multiply that x minus mu squared by the probability of us getting that value. Okay, so x minus mu squared in the first row is 3.53. We're going to multiply that 3.53 by the probability that we got that value, 0.25, and if we do that, we get the value 0.883. So, again, how did we get that? Within the first row, we multiplied the 0.25 in the third column by the 3.53 in the sixth column, and that gave us 0.883 in the last column. Do the same thing all the way down. If we take all of the values in the third column and multiply them by the matching values in the sixth column, we're going to get all of these values here. So, again, let's look at the second row: a probability of 0.23 times 0.77 gives you 0.177. Third row: a probability of 0.19 multiplied by a value of 0.01 gives you 0.002, and so on and so forth all the way down for the last three rows. I invite you, after this lecture is done, to make sure you can get the calculations for the rows for three, four, and five TVs in this table. We've discussed in detail the rows for 0, 1, and 2, but I want to make sure you get a chance to practice, so make sure you understand how we got all the values in the rows for three, four, and five.

All right, now we take that last column, and remember, what are we supposed to do when we calculate these expected values and these variances? We're going to sum up all of the values that you see here. We're going to take 0.883, add 0.177, add 0.002, all the way down to 0.681. When we add all those numbers in the last column together, we get the value of 2.522, or in other words, the variance of daily sales is 2.522 TVs squared. Now there are those squared units again for variance. Remember, we don't like squared units too much, so if we take the square root of 2.522, we get a standard deviation of daily sales of 1.588 TVs. Awesome.

So, now let's take a look at this whole problem. Let X be the number of TVs sold at a small department store in one day, where again X can only take the values 0, 1, 2, 3, 4, 5. If I asked you what the typical day looks like in terms of number of TVs sold, and how spread out that is, you would answer: well, I expect to sell 1.88 TVs a day, that's from the expected value, and the spread is 1.588 TVs, that's the standard deviation. So now you can summarize center and spread when not every observation has an equal probability. See how much we've grown since taking a look at the first average and first standard deviation. When we first learned about averages and standard deviations and variances, we just treated every observation as equal. Then we learned about probability. Then we learned about distributions, specifically that distributions don't have to have equal probabilities for every category. And now from that, we can get an even better summary of what the expected value or the spread is going to be for a distribution.

So let's summarize. The expected value or mean of a random variable is just a measure of its central tendency or location, and that expected value looks just like the average that we had before. We sum up all of the values of X, except that we weight each one of them: we multiply it by the probability that it occurs. So, for each value, multiply it by the probability it happens, do the same thing for every one of the values, and then sum it all up. Same idea for the variance.
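In symbols, and again assuming the notation x_i, mu, and n since the slide is not reproduced in this transcript, the weighted variance and the standard deviation described in this lecture can be written as:

```latex
\mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, P(X = x_i),
\qquad
\sigma = \sqrt{\mathrm{Var}(X)}
```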
The variance of a random variable is a measure of its variability and spread. Again, it's the same kind of calculation we had before: we're looking at really an average of the squared distances each point is from its expected value. But instead of every point getting an equal weight, you take the actual distance squared, so x minus mu squared, multiply it by the probability you even saw that value, and then sum up all of those calculations to get a better measure of spread.

Wow, I know there are a lot of calculations we looked at in this lecture. I definitely invite you to go back and look through those calculations again, especially for rows three, four, and five, to make sure you understand all the calculations that we did. But that is the end of this lecture, and I look forward to seeing you next time.
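For anyone going back through the table by hand, the whole calculation can be sketched in Python as a way to check each row, including the ones for three, four, and five TVs. The names below are hypothetical and chosen only for this example. One caveat: this sketch does not round the intermediate columns, so it gives a variance of about 2.526 and a standard deviation of about 1.589, slightly different from the 2.522 and 1.588 in the lecture, which come from adding up table entries that were already rounded.

```python
import math

tvs_sold = [0, 1, 2, 3, 4, 5]
probabilities = [0.25, 0.23, 0.19, 0.12, 0.14, 0.07]

# Expected value: each value weighted by its probability.
mu = sum(x * p for x, p in zip(tvs_sold, probabilities))

# One table row per value: x, P(X = x), x - mu, (x - mu)^2, (x - mu)^2 * P(X = x).
rows = [
    (x, p, x - mu, (x - mu) ** 2, (x - mu) ** 2 * p)
    for x, p in zip(tvs_sold, probabilities)
]

# Variance is the sum of the last column; standard deviation is its square root.
variance = sum(row[4] for row in rows)
std_dev = math.sqrt(variance)

for x, p, dev, sq, weighted_sq in rows:
    print(f"{x}  {p:.2f}  {dev:6.2f}  {sq:5.2f}  {weighted_sq:.3f}")
print(f"mean={mu:.2f}  variance={variance:.3f}  sd={std_dev:.3f}")
```

Keeping full precision until the final step is a design choice for this sketch; rounding each column to match the slide would reproduce the lecture's figures instead.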



Last modified: Monday, 15 June 2026, 9:36 AM