DISTRIBUTIONS OF DISCRETE DATA

ST101 – DR. ARIC LABARR


WHAT ARE DISTRIBUTIONS?



A random variable is a numerical description of the outcome of an experiment.

Random variables can be either discrete or continuous.

A discrete random variable may assume either a finite number of values or an infinite sequence of values.


Finite example: Let x be the number of TVs sold at a small department store in one day, where x can take only the values {0, 1, 2, 3, 4, 5}.



Infinite example: Let x be the number of customers arriving in one day at a small department store, where x can take the values 0, 1, 2, …



A continuous random variable may assume any numerical value in an interval or collection of intervals.


RANDOM VARIABLES


DISCRETE VS. CONTINUOUS

Discrete Example:

Let x be the number of individuals living in a home.


Continuous Example:

Let x be the distance in miles from home to the store.


A random variable is a numerical description of the outcome of an experiment.

A discrete random variable may assume either a finite number of values or an infinite sequence of values.

A continuous random variable may assume any numerical value in an interval or collection of intervals.

SUMMARY


DISCRETE PROBABILITY DISTRIBUTIONS



The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.

Essentially, it tells us how frequently the different values of the variable occur.

PROBABILITY DISTRIBUTION


Frequency – number of observations in each category in the data set

Relative Frequency – proportion of total observations contained in a given category

Cumulative Frequency – number of observations with values less than or equal to the upper limit of the category

Cumulative Relative Frequency – proportion of observations with values less than or equal to the upper limit of the category

NOTATION


The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.

Relative frequencies can be used as estimates of the probability of an event occurring.

Probability distributions for discrete random variables are best described with tables, graphs, or equations.

PROBABILITY DISTRIBUTION


Let x be the number of TVs sold at a small department store in one day, where x can take only the values {0, 1, 2, 3, 4, 5}.

Let’s examine the past year of data.

DISCRETE PROBABILITY EXAMPLE

TVs Sold    Number of Days (Freq)    Cumulative Frequency    Relative Frequency
0           90                       90                      90/365 = 0.25
1           85                       90 + 85 = 175           0.23
2           70                       245                     0.19
3           45                       290                     0.12
4           50                       340                     0.14
5           25                       365                     0.07
Total       365                                              1.00

The relative frequencies serve as estimates of the probabilities.
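The columns of the frequency table can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original slides; the variable names (freq, cum_freq, rel_freq, cum_rel_freq) are illustrative assumptions.

```python
from itertools import accumulate

# Number of days on which 0, 1, ..., 5 TVs were sold (from the example table).
freq = {0: 90, 1: 85, 2: 70, 3: 45, 4: 50, 5: 25}

total_days = sum(freq.values())                       # total observations
cum_freq = list(accumulate(freq.values()))            # running totals of the counts
rel_freq = [count / total_days for count in freq.values()]
cum_rel_freq = list(accumulate(rel_freq))             # running totals of the proportions

print("x  freq  cum_freq  rel_freq  cum_rel_freq")
for x, f, cf, rf, crf in zip(freq, freq.values(), cum_freq, rel_freq, cum_rel_freq):
    print(f"{x}  {f:4d}  {cf:8d}  {rf:8.2f}  {crf:12.2f}")
```

Rounded to two decimals, the rel_freq column should match the Relative Frequency column of the example.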



The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.

Frequency is the number of observations in each category in the data set.

Relative frequency is the proportion of total observations contained in a given category.

Cumulative frequency is the number of observations with values less than or equal to the upper limit of the category.

Cumulative relative frequency is the proportion of observations with values less than or equal to the upper limit of the category.



SUMMARY


EXPECTED VALUE AND VARIANCE



The expected value, or mean, of a random variable is a measure of its central location.

It is defined by:

$E(x) = \mu = \sum x \, f(x)$

Think about the expected value as a weighted mean, where the probability function f(x) serves as the weight.


EXPECTED VALUE

 


Let x be the number of TVs sold at a small department store in one day, where x can take only the values {0, 1, 2, 3, 4, 5}.

Let’s examine the past year of data.

DISCRETE PROBABILITY EXAMPLE

TVs Sold    Number of Days (Freq)    Cumulative Frequency    f(x)    x·f(x)
0           90                       90                      0.25    0.00
1           85                       175                     0.23    0.23
2           70                       245                     0.19    0.38
3           45                       290                     0.12    0.36
4           50                       340                     0.14    0.56
5           25                       365                     0.07    0.35
Total       365                                              1.00    1.88

We expect to sell 1.88 TVs per day on average.
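The weighted-mean calculation can be checked with a short Python sketch. It assumes the relative frequencies from the TV example are used as the probabilities f(x); the names f and expected_value are illustrative and not from the slides.

```python
# Number of days on which 0, 1, ..., 5 TVs were sold.
freq = {0: 90, 1: 85, 2: 70, 3: 45, 4: 50, 5: 25}
total_days = sum(freq.values())

# f(x): relative frequency used as the estimated probability of selling x TVs.
f = {x: count / total_days for x, count in freq.items()}

# Weighted mean: each value x is weighted by its probability f(x).
expected_value = sum(x * p for x, p in f.items())
print(f"E(x) = {expected_value:.2f}")  # prints E(x) = 1.88
```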


The variance of a random variable is a measure of its variability/spread.

It is defined by:

$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$

The standard deviation is the square root of the variance.


VARIANCE

 


Let x be the number of TVs sold at a small department store in one day, where x can take only the values {0, 1, 2, 3, 4, 5}.

DISCRETE PROBABILITY EXAMPLE

TVs Sold    Number of Days (Freq)    f(x)    x·f(x)    x − μ    (x − μ)²    (x − μ)²·f(x)
0           90                       0.25    0.00      -1.88    3.53        0.883
1           85                       0.23    0.23      -0.88    0.77        0.177
2           70                       0.19    0.38      0.12     0.01        0.002
3           45                       0.12    0.36      1.12     1.25        0.150
4           50                       0.14    0.56      2.12     4.49        0.629
5           25                       0.07    0.35      3.12     9.73        0.681
Total       365                      1.00    1.88                           2.522

We expect to sell 1.88 TVs per day on average.

The variance of daily sales is 2.522 TVs squared.

The standard deviation of daily sales is 1.588 TVs.
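The variance and standard deviation can be sketched in Python in the same way. This is an illustrative sketch, not the slides' own method: it keeps full precision instead of rounding f(x) and μ to two decimals at each step, so it gives a variance of about 2.49 and a standard deviation of about 1.58, slightly below the hand-calculated 2.522 and 1.588.

```python
import math

# Number of days on which 0, 1, ..., 5 TVs were sold.
freq = {0: 90, 1: 85, 2: 70, 3: 45, 4: 50, 5: 25}
total_days = sum(freq.values())

# f(x): relative frequency used as the estimated probability of selling x TVs.
f = {x: count / total_days for x, count in freq.items()}

# Mean first, then the probability-weighted squared deviations from the mean.
mu = sum(x * p for x, p in f.items())
variance = sum((x - mu) ** 2 * p for x, p in f.items())
std_dev = math.sqrt(variance)

print(f"mean = {mu:.3f}, variance = {variance:.3f}, std dev = {std_dev:.3f}")
```

The gap between the two sets of figures comes only from intermediate rounding; the formula is the same.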


The expected value, or mean, of a random variable is a measure of its central location:

$E(x) = \mu = \sum x \, f(x)$

The variance of a random variable is a measure of its variability/spread:

$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$


SUMMARY

 

 


BINOMIAL DISTRIBUTION



EXAMPLE REVIEW

Example: you have a 2-step random process where you flip a coin twice (independent flips):

[Tree diagram: the first flip lands Heads or Tails, and each branch is followed by a second flip that lands Heads or Tails, giving four possible outcomes.]


There are 4 properties of a binomial experiment:

The experiment consists of a sequence of n identical trials. (2 coin flips)

Only two outcomes, success or failure, are possible on each trial. (H or T)

The probability of a success, denoted as p, does not change from trial to trial. (0.5)

The trials are independent. (Independent coin flips)


BINOMIAL EXPERIMENT


The binomial distribution looks at the probabilities of the number of successes occurring in the n trials.

We use x to denote the number of successes occurring in the n trials.

For example:

Let a success be rolling a die and getting a 2.

We roll a die 10 times (n = 10) and roll a 2 three times (x = 3).

We are interested in the probability that exactly 3 rolls equal 2.


BINOMIAL DISTRIBUTION


The binomial probability function is defined as:

$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$

$\binom{n}{x}$: number of outcomes providing exactly x successes in n trials

$p^{x} (1 - p)^{n - x}$: probability of a particular sequence of trial outcomes with x successes in n trials


PROBABILITY FUNCTION
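The two pieces of the binomial probability function, the count of qualifying outcomes and the probability of any one such sequence, can be sketched as a small Python function. This assumes the standard form f(x) = C(n, x) p^x (1 − p)^(n − x); the function name binomial_pmf is hypothetical.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    n_outcomes = comb(n, x)                     # number of sequences with exactly x successes
    p_sequence = p ** x * (1 - p) ** (n - x)    # probability of any one such sequence
    return n_outcomes * p_sequence


# Two fair coin flips, counting heads as a success (n = 2, p = 0.5).
coin_probs = [binomial_pmf(x, 2, 0.5) for x in range(3)]
print(coin_probs)  # [0.25, 0.5, 0.25]
```

Summing binomial_pmf over x = 0, ..., n should give 1, which is a quick sanity check for any choice of n and p.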


For example:

Let a success be rolling a die and getting a 2.

We roll a die 10 times (n = 10) and roll a 2 three times (x = 3).

We are interested in the probability that exactly 3 rolls equal 2.


BINOMIAL DISTRIBUTION EXAMPLE

$f(3) = \binom{10}{3} \left(\frac{1}{6}\right)^{3} \left(\frac{5}{6}\right)^{7} = 120 \times \left(\frac{1}{6}\right)^{3} \left(\frac{5}{6}\right)^{7} \approx 0.155$
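The die example can be worked exactly in Python with rational arithmetic. This sketch assumes a fair six-sided die, so that p = 1/6; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n, x = 10, 3          # 10 rolls, exactly 3 of them showing a 2
p = Fraction(1, 6)    # assumed probability of rolling a 2 on a fair six-sided die

# Number of ways to place the 3 successes, times the probability of one such sequence.
prob_exact = comb(n, x) * p ** x * (1 - p) ** (n - x)

print(prob_exact)                   # exact fraction
print(round(float(prob_exact), 3))  # 0.155
```

Using Fraction avoids floating-point rounding until the final conversion, which makes it easy to compare against a hand calculation.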

 

 

 


You have a retention rate of 90% for your employees annually.

In other words, any randomly chosen employee has a probability of 0.1 of leaving this year.

Choosing 3 employees at random, what is the probability that exactly 1 of them will leave the company this year?


BINOMIAL DISTRIBUTION EXAMPLE

With n = 3, p = 0.1, and x = 1:

$f(1) = \binom{3}{1} (0.1)^{1} (0.9)^{2} = 3 \times 0.1 \times 0.81 = 0.243$

 

 

 


BINOMIAL DISTRIBUTION EXAMPLE

1st Worker      2nd Worker      3rd Worker      x    Prob
Leaves (0.1)    Leaves (0.1)    Leaves (0.1)    3    0.001
Leaves (0.1)    Leaves (0.1)    Stays (0.9)     2    0.009
Leaves (0.1)    Stays (0.9)     Leaves (0.1)    2    0.009
Leaves (0.1)    Stays (0.9)     Stays (0.9)     1    0.081
Stays (0.9)     Leaves (0.1)    Leaves (0.1)    2    0.009
Stays (0.9)     Leaves (0.1)    Stays (0.9)     1    0.081
Stays (0.9)     Stays (0.9)     Leaves (0.1)    1    0.081
Stays (0.9)     Stays (0.9)     Stays (0.9)     0    0.729

Three of the eight outcomes have exactly one worker leaving (x = 1), each with probability 0.081, so the probability that exactly 1 of the 3 leaves is 3 × 0.081 = 0.243.
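The tree of outcomes for the three workers can also be enumerated by brute force. The following Python sketch is illustrative only; the labels "L" (leaves) and "S" (stays) and the name prob_by_x are assumptions made for this example.

```python
from itertools import product
from math import prod

p_leave = 0.1
branch_prob = {"L": p_leave, "S": 1 - p_leave}   # L = leaves, S = stays

prob_by_x = {}  # total probability for each number of leavers x
for outcome in product("LS", repeat=3):          # LLL, LLS, LSL, ..., SSS
    x = outcome.count("L")                       # number of workers who leave
    prob = prod(branch_prob[w] for w in outcome) # independent branches multiply
    prob_by_x[x] = prob_by_x.get(x, 0.0) + prob
    print("".join(outcome), x, round(prob, 3))

print(round(prob_by_x[1], 3))  # 0.243
```

Adding up the branches with the same x gives the same values as the binomial probability function, without using the formula.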


For the binomial distribution, the following is always true:

Expected value: $E(x) = \mu = np$


Variance: $\mathrm{Var}(x) = \sigma^2 = np(1 - p)$


Standard deviation: $\sigma = \sqrt{np(1 - p)}$

EXPECTED VALUE AND VARIANCE/STANDARD DEVIATION

 

 

 


For example:

Let a success be rolling a die and getting a 2.

We roll a die 10 times (n = 10) and roll a 2 three times (x = 3).

We are interested in the probability that exactly 3 rolls equal 2.


BINOMIAL DISTRIBUTION EXAMPLE

$E(x) = np = 10 \times \frac{1}{6} \approx 1.67$

We expect to roll a 2 about 1.67 times in 10 rolls.


For example:

You have a retention rate of 90% for your employees annually.

In other words, any randomly chosen employee has a probability of 0.1 of leaving this year.

Choosing 3 employees at random, what is the probability that exactly 1 of them will leave the company this year?


BINOMIAL DISTRIBUTION EXAMPLE

$E(x) = np = 3 \times 0.1 = 0.3$

We expect 0.3 of the 3 employees to leave this year.
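The shortcut formulas for the binomial mean, variance, and standard deviation can be checked against the general weighted-mean definition. This is a sketch under the assumption of a fair six-sided die (p = 1/6) for the rolling example; the function names binomial_summary and mean_from_pmf are hypothetical.

```python
from math import comb, sqrt


def binomial_summary(n: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) from the shortcut formulas."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)


def mean_from_pmf(n: int, p: float) -> float:
    """Mean computed the long way, as the sum of x * f(x) over x = 0, ..., n."""
    return sum(x * comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1))


dice = binomial_summary(10, 1 / 6)   # 10 rolls, success = rolling a 2
staff = binomial_summary(3, 0.1)     # 3 employees, success = employee leaves

print([round(v, 2) for v in dice])
print([round(v, 2) for v in staff])
print(round(mean_from_pmf(10, 1 / 6), 2))  # agrees with the shortcut mean for the dice
```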


The binomial distribution looks at the probabilities of the number of successes occurring in the n independent trials.

The binomial probability function is composed of two intuitive pieces:

$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$

$\binom{n}{x}$: number of outcomes providing exactly x successes in n trials

$p^{x} (1 - p)^{n - x}$: probability of a particular sequence of trial outcomes with x successes in n trials


SUMMARY


Last modified: Monday, October 17, 2022, 1:09 PM