Let's finish off our section on distributions of discrete data by talking about a very specific and popular discrete distribution called the binomial distribution. The best part about the binomial distribution is that you've already seen an example of it. Now let's make it a little more formal.

Imagine you have a 2-step random process where you flip a coin twice, and we'll call these flips independent. In other words, whatever you get on the first flip has no bearing on the outcome of the next flip. We can see that exact same experiment written out here with our tree diagram. Flip a coin: there's a 50/50 chance it comes up heads or tails. Whatever you get, flip the coin again, and again there's a 50/50 chance of heads or tails. This is an example of what we call a binomial experiment.

So, let's make this formal. There are 4 properties of a binomial experiment. First, the experiment has to consist of a sequence of n identical trials. For us, we flipped a coin 2 times, so n was 2 identical trials, 2 coin flips. Second, each trial can only have 2 possible outcomes, success or failure. Third, the probability of a success does not change from trial to trial. And last but not least, the trials are independent, like I said.

This is exactly like our 2 coin flip example, right? We have 2 coin flips for our experiment. They are identical trials: we're flipping the same coin, and nothing's changing. Each coin flip only has 2 outcomes, success or failure, or in our case heads or tails. The probability of a success, and it doesn't matter whether we call heads or tails the success, does not change from flip to flip, from trial to trial.
It still remains 0.5, and these trials are independent of each other. The outcome of 1 coin flip has no bearing on the outcome of another coin flip. So, wonderful, we have our first binomial experiment.

From this binomial experiment, we can calculate what we call the binomial distribution. The binomial distribution tries to answer a question: what are the probabilities of the number of successes occurring in n trials? If we have a certain number of coin flips and we imagine heads is a success, then what you want to figure out is the probability of getting a certain number of heads in the n flips of the coin. Let's use x to denote the number of successes occurring in these n trials.

Instead of just doing the coin flipping example, we're going to try something a little more complicated. Let's roll a die and say a success is only getting a 2. That means the numbers 1, 3, 4, 5, and 6 would all be a failure. Now let's roll that same die 10 times. Here n would be 10; we have 10 trials. Let's imagine we successfully rolled a 2 three times. That's our x. x is the number of successes, 3, out of the n, which in our case is 10 trials. So we are interested in the probability of exactly 3 successes, or exactly 3 rolls equal to 2.

How can we calculate this? You could probably sit down and ponder a way of figuring out the probability of rolling a 2 three times out of 10 with a 6-sided die, but luckily the binomial distribution helps us answer this question. The binomial distribution has a binomial probability function, and it's defined as the following. Now this looks rather complicated, so let's break it down into individual pieces. First, let's just look at the function by itself, and then we'll get to what each half of the function means.

The first part of the function is this ratio you see: n followed by an exclamation point, divided by x followed by an exclamation point, multiplied by n minus x followed by an exclamation point. So, what does an exclamation point mean in math? It's shorthand for multiplying that number by each whole number below it until you get all the way down to 1. Whoa, that sounds complicated, so let's think about it. What would 4 followed by an exclamation point mean? That would be 4 x 3 x 2 x 1. If we had 8!, you would do 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1. Does that make sense? So when we have n!, we take whatever number of trials we have and multiply it by each number smaller than it, 1 at a time, until we get all the way down to 1. Remember, in our example n is 10, so we would do 10 x 9 x 8 x 7 and so on until we get down to 1. It's the same thing for x. In our example x equals 3, so 3! would be 3 x 2 x 1, and so on and so forth.

Well, what is this left-hand side, this highlighted ratio that I'm showing you?
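For reference, the function being described can be written out as below. This is a reconstruction from the spoken description, not a copy of the slide: the name f(x) is an assumed label, and p stands for the probability of success on a single trial. The ratio in question is the first factor.

```latex
% n! is the product of every whole number from n down to 1
n! = n \times (n-1) \times \cdots \times 2 \times 1

% Binomial probability function: probability of exactly x successes in n trials
f(x) = \frac{n!}{x!\,(n-x)!}\; p^{x}\,(1-p)^{\,n-x}
```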
Well, without getting into the math, this is actually the number of outcomes providing exactly x successes in n trials. You'll just have to trust me on that one. If you had 10 trials and wanted to know how many ways you could get 3 successes in those 10 trials, that is it, right there. It would be 10! divided by 3! x (10 minus 3)!. Except in math we don't say exclamation point, we call this factorial. So n! is read as n factorial. Again, you're just going to have to trust me that what I've highlighted here on the slide is the number of outcomes providing exactly that many successes in n trials. I'll show you via diagram later.

Let's look at the right-hand side of this equation. We see a little p with an exponent of x, and we see 1 minus p with an exponent of n minus x. This highlighted right-hand side is the probability of one particular sequence of trial outcomes with x successes in n trials. Let's think about it. Imagine again you have 3 successes, and the probability of success is p. These are independent trials, so the probability of those 3 successes would be p x p x p, or p raised to the third power, p raised to the x. However, the other 7 trials did not result in a success. They were failures, so each had a probability of 1 minus p. So I would do (1 minus p) x (1 minus p) and so on, 7 times, or 1 minus p raised to the 10 minus 3, which is 1 minus p raised to the n minus x.

So again, this is the actual calculation. Remember what a binomial experiment is: an experiment that consists of n identical, independent trials, where each trial can only be a success or a failure, and you know the probability p of getting a success. If you want to know the probability of x successes occurring in those n trials, this is the exact equation that will give you that number.

All right, let's go through our example. Let a success be rolling a die and getting a 2. What's the probability of rolling a die and getting a 2? If it's a fair die, it would be 1 out of 6: 1 out of every 6 times I roll it, I should get a 2. We roll the die 10 times, that will be n, and successfully roll a 2 on 3 of those times, that's x. So, what's the probability of rolling exactly 3 2s? Let's fill in some of these numbers first. We have 10 trials, so instead of little n, fill in 10. x is the number of successes, and we had 3 successes, 3 rolls that came up a 2, so every time you see x, replace it with a 3. And the probability p of rolling a die and getting a 2 would be 1 over 6, or 1/6. Wonderful. So now we have our equation filled in.

So, what's the probability that I get exactly 3 rolls of 2, or 3 successes out of 10 trials? It will be 10 factorial divided by 3 factorial x (10 minus 3) factorial. That would be 10 x 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 divided by (3 x 2 x 1) x (7 x 6 x 5 x 4 x 3 x 2 x 1). Whew, that's the left-hand side. Then we would do 1 over 6 3 times, so 1/6 x 1/6 x 1/6, or in other words, 1/6 cubed.
Then we would do 1 minus 1/6 and do that 7 times, 10 minus 3, so (1 minus 1/6) x (1 minus 1/6) and so on, 7 times. If you were to go through all of those calculations, the left-hand side, 10 factorial divided by 3 factorial x 7 factorial, would give you 120. The right-hand side, 1 over 6 raised to the third power times 1 minus 1 over 6 raised to the 7th power, would give you about 0.0013. Multiply those together, and the probability of rolling a 2 exactly 3 times when you roll a die 10 times is about 0.155. Lot of math there.

Now I know you're thinking, okay, but how is this actually useful in the real world? Great, so now I can talk to my friends about what happens if they roll a die or flip a coin, but I want to know something that's actually helpful and meaningful, potentially in my job. Okay, well, let's imagine that you work in HR, and that your company has an annual employee retention rate of 90%. In other words, any random employee has a probability of 0.1, a 10% chance, of leaving this year. So if I were to choose 3 employees at random, what's the probability that exactly 1 of them will leave the company this year? That could be a fair question to be asked if you worked in HR.

Luckily, this is a binomial experiment. Let's take a look. We have 3 trials, 3 employees. Each of those employees has a binary outcome: either they stay at the company or they leave the company. There's a 10% chance that each 1 of them leaves. Now let's imagine that they don't influence each other: if 1 leaves, that has no bearing on anyone else leaving. The number of successes here is 1. We want to know the probability that exactly 1 of the 3 employees, each with a 10% chance of leaving, actually leaves the company.

Well, again, this is just the binomial distribution, so we can plug in our values. How many trials do we have? 3. So we have 3 factorial divided by 1 factorial x (3 minus 1) factorial. 3 factorial is 3 x 2 x 1, and we divide it by 1 factorial, which is just 1, x (3 minus 1) factorial, which is 2 x 1. That gives you 3. Now let's look at the probability. The probability each employee leaves is 0.1, and that happens 1 time, so 0.1 raised to the 1 is just 0.1. The other 2 employees don't leave, and 1 minus 0.1 is 0.9, so there's a 90% chance of each of them staying. That's 0.9 x 0.9, because we had 2 people who didn't leave and 1 person who did. Multiply all those probabilities together and you get 0.081. So we have 3 x 0.081. If you have an annual retention rate of 90%, the probability that exactly 1 out of 3 random employees leaves the company this year is 0.243, or 24.3%.

Wow. You would think that if there's a 90% chance that each person stays, the probability of exactly 1 of 3 people leaving would be smaller than that, but that's not the case. The binomial distribution shows us how to calculate this. The beautiful part is that it helps us work through complicated problems in a simple way.

Now, maybe you don't believe this number. Well, let's actually work out the tree diagram for this. The first worker has a 10% chance of leaving, a probability of 0.1, and a 90% chance of staying, a probability of 0.9. Whether they leave or stay, the second worker has the same chances of leaving and staying, and whether that person leaves or stays, the third worker has that same probability of leaving or staying. So, let's take a look.
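The same tree can also be walked in code. What follows is only a minimal sketch in Python, assuming the 0.1 chance of leaving from the example; the names `p_leave`, `paths`, and `path_probability` are illustrative and do not come from the lecture. It lists every path through the tree, keeps the ones where exactly 1 employee leaves, and adds up their probabilities.

```python
from itertools import product

p_leave = 0.1          # chance a given employee leaves (from the example)
p_stay = 1 - p_leave   # chance a given employee stays

# Every path through the tree: each of the 3 employees either leaves or stays.
paths = list(product(["leave", "stay"], repeat=3))

def path_probability(path):
    """Multiply the branch probabilities along one path (independent trials)."""
    prob = 1.0
    for outcome in path:
        prob *= p_leave if outcome == "leave" else p_stay
    return prob

# Keep only the paths where exactly 1 employee leaves, then sum them.
one_leaves = [path for path in paths if path.count("leave") == 1]
total = sum(path_probability(path) for path in one_leaves)

for path in one_leaves:
    print(path, round(path_probability(path), 3))
print("total:", round(total, 3))
```

Enumerating paths like this only stays practical for a handful of trials, since the number of paths doubles with every extra trial.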
Here are the 3 ways that this could happen: employee 1 leaves and employees 2 and 3 stay; employee 1 stays, employee 2 leaves, and employee 3 stays; or employees 1 and 2 stay and employee 3 leaves. So again, we can see the 3 possible ways this could happen. Wonderful. We could calculate the probability that each 1 of those ways actually happens: 0.081. If we then add them all together, we get our 0.243. If we draw this out, it may be a little easier for you to see. Some people see things better with tree diagrams than with calculations, but the end result is still the same. We still have the same probability, 0.243, that exactly 1 out of these 3 employees leaves. So, perfect.

One last little piece to this binomial distribution: we can not only look at the distribution itself, but we can also calculate the expected value and the variance of this distribution. Luckily, for the binomial distribution, the following is always true. The expected value is just n, the number of trials you have, times p, the probability of a success. The variance is just n, the number of trials, times p, the probability of success, times 1 minus p, the probability of failure. And the standard deviation is just the square root of the variance.

Well, again, how is this helpful to us? Let's go back to our first binomial experiment. Let a success be rolling a die and getting a 2. We roll a die 10 times and successfully roll a 2 on 3 of those times, so we're interested in the probability of exactly 3 rolls equal to 2. Well, what did we expect? With 10 rolls and a success probability of 1/6, I only expected to roll a 2 about 1.67 times out of 10. So by rolling 3 out of 10 instead of 1.67, I beat the odds. On average I would only get a 2 about 1.67 times out of 10.

Let's go to a more realistic example, though. Same idea: we have an annual employee retention rate of 90%. In other words, any random employee has a 10% chance of leaving this year. If you were to look at just 3 employees, we would want to know the probability that exactly 1 of them will leave. We calculated that previously, but let's look at the expected value. The expected value is n, which for us is 3, times the probability that any 1 of them leaves, 0.1. That would mean I expect 0.3 of the 3 employees to leave this year. Now I know, I know, 0.3 of a person doesn't really make sense. We were using little numbers just to make the math easy. But let's imagine you had 300 employees. Then you would expect 30 of them to leave this year, 300 x 0.1. See how this math can make these complicated problems very easy? So again, if you work in HR and you know what your retention rate is, you can answer some of these more complicated problems, all using the binomial distribution.

All right, let's summarize. The binomial distribution looks at the probabilities of the number of successes occurring in n independent trials. The binomial probability function is made up of 2 intuitive pieces. The first piece is just the number of outcomes that give you that many successes in the n trials.
The second piece is the probability of any one of those outcomes actually happening. So you take that probability times the number of ways it could happen, and there you go, that gives you your overall probability. We also learned how to calculate an expected value from this, as well as a variance.

We've learned a lot about discrete distributions in these last 3 lectures. We've looked at what a discrete distribution even is, how we can use data to help us understand what expected values and variances can be, and now a special-case distribution, the binomial distribution, that helps us answer even more complicated questions. That is the end of this lecture and the end of this section, and I look forward to seeing you in the next.
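To tie the pieces together, here is a minimal Python sketch of the calculations from this lecture. It is an illustration under stated assumptions, not part of the original slides: the function names `binomial_pmf` and `binomial_summary` are made up for this example, and `math.comb` (Python 3.8+) is used for the "number of outcomes" ratio instead of writing out the factorials.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials.

    comb(n, x) is n! / (x! * (n - x)!), the number of outcomes with x successes;
    p**x * (1 - p)**(n - x) is the probability of any one such outcome.
    """
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_summary(n, p):
    """Expected value n*p, variance n*p*(1-p), and standard deviation."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)

# Die example: a success is rolling a 2, with 10 rolls and exactly 3 successes.
die_prob = binomial_pmf(3, 10, 1 / 6)
die_mean, die_var, die_sd = binomial_summary(10, 1 / 6)

# HR example: each of 3 employees has a 0.1 chance of leaving; exactly 1 leaves.
hr_prob = binomial_pmf(1, 3, 0.1)
hr_mean, hr_var, hr_sd = binomial_summary(300, 0.1)  # the 300-employee case

print("P(exactly three 2s in 10 rolls):", round(die_prob, 3))
print("Expected number of 2s in 10 rolls:", round(die_mean, 2))
print("P(exactly 1 of 3 employees leaves):", round(hr_prob, 3))
print("Expected leavers out of 300:", round(hr_mean, 1))
```

Changing `n`, `x`, and `p` lets you reuse the same two functions for any other binomial experiment, provided the 4 properties from the start of the lecture hold.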



Last modified: Monday, 15 June 2026, 9:37 AM