Binomial and Geometric Distributions

Applied Statistics

MTH-361A | Spring 2025 | University of Portland

February 17, 2025

Objectives

Previously… (1/2)

The Law of Large Numbers

It states that as the number of trials in a random experiment increases, the sample mean approaches the expected value.

Example Bernoulli Trials Simulation

Let \(p=0.60\) be the “success” probability of a Bernoulli r.v. \(X\), where \(\text{E}(X) = p\).
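
The simulation described above can be sketched in R as follows. This is a minimal sketch, not the code behind the original slide: the seed, the number of trials `N`, and the variable names are assumptions chosen for illustration.

```r
set.seed(361)                            # arbitrary seed, only for reproducibility
p <- 0.60                                # "success" probability, so E(X) = p
N <- 10000                               # number of Bernoulli trials (assumed value)
x <- rbinom(N, size = 1, prob = p)       # each draw is 1 ("success") or 0 ("fail")
running_mean <- cumsum(x) / seq_len(N)   # sample mean after 1, 2, ..., N trials
running_mean[c(10, 100, 1000, 10000)]    # compare these with p = 0.60
```

The running means at larger trial counts should tend to sit closer to \(0.60\), although any single run can wander.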

Previously… (2/2)

Geometric R.V.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Geom}(p) \\ \text{PMF } & \longrightarrow P(X=k) = (1-p)^k p \\ \text{for } & k = 0,1,2, \cdots \end{aligned} \]

Binomial R.V.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Binom}(n,p) \\ \text{PMF } & \longrightarrow P(X=k) = \binom{n}{k} p^k (1-p)^{n-k} \\ \text{for } & k = 0,1,2,3, \cdots, n \end{aligned} \]

Visualizing the Geometric Distribution

Geometric Distribution

Geometric R.V.

Let \(p=0.50\) be the success probability.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Geom}(0.50) \\ \text{PMF } & \longrightarrow P(X=k) = (1-0.50)^k (0.50) \\ \text{for } & k = 0,1,2, \cdots \end{aligned} \]

Geometric Probabilities (1/2)

Geometric Distribution

Example:

What is the probability that the first “success” occurs on the 6th trial (that is, after 5 “fail” trials) with \(p=0.50\)? \[ \begin{aligned} P(X=5) & = (1-0.50)^5 (0.50) \\ & \approx 0.016 \end{aligned} \]

Using R:

p <- 0.5
dgeom(5,p)
## [1] 0.015625

\(\star\) Note that the dgeom() function computes the probability \(P(X = k)\), meaning it computes the probability at exactly \(X=k\) using the Geometric PMF.
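
As a quick check, the Geometric PMF can be written out by hand and compared with dgeom(). The names `manual` and `builtin` are illustrative only.

```r
p <- 0.5
k <- 5                       # number of "fail" trials before the first "success"
manual <- (1 - p)^k * p      # Geometric PMF written out term by term
builtin <- dgeom(k, p)       # same probability from the built-in function
all.equal(manual, builtin)   # TRUE when the two values agree
```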

Geometric Probabilities (2/2)

Geometric Distribution

Example:

What is the probability that the first “success” occurs on or before the 6th trial, given \(p=0.50\)? \[ \begin{aligned} P(X \le 5) & = \sum_{k=0}^5 P(X = k) \\ & = \sum_{k=0}^5 (1-0.50)^{k} (0.50) \\ P(X \le 5) & \approx 0.984 \end{aligned} \]

Using R:

p <- 0.5
pgeom(5,p)
## [1] 0.984375

\(\star\) Note that the pgeom() function computes the probability \(P(X \le k)\), meaning it computes the sum of all probabilities from \(X=0\) to \(X=k\) using the Geometric PMF.
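
The cumulative probability can be reproduced in two other ways: by summing the PMF directly, or by using the complement (the event \(X > k\) means the first \(k+1\) trials all “fail”). The variable names below are illustrative.

```r
p <- 0.5
k <- 5
summed <- sum(dgeom(0:k, p))          # add P(X = 0), ..., P(X = 5)
closed_form <- 1 - (1 - p)^(k + 1)    # 1 minus P(first k + 1 trials all "fail")
c(summed, closed_form, pgeom(k, p))   # all three entries should agree
```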

Geometric Expected Value

Geometric Distribution with Expected Value

Geometric R.V.

Let \(p=0.50\) be the success probability.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Geom}(0.50) \\ \text{PMF } & \longrightarrow P(X=k) = (1-0.50)^k (0.50) \\ \text{for } & k = 0,1,2, \cdots \\ \text{expected value} & \longrightarrow \text{E}(X) = 1 \end{aligned} \]

In general, the expected value of the Geometric r.v. is given by \[\text{E}(X) = \frac{1-p}{p},\] which is the ratio of the “fail” probability to the “success” probability.
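
One way to sanity-check this formula is to approximate \(\text{E}(X) = \sum_k k \, P(X=k)\) numerically. The helper `geom_mean_numeric` and its cutoff `k_max` are hypothetical, introduced only for this sketch; the infinite sum is truncated, so the result is an approximation.

```r
# Hypothetical helper: approximate E(X) by truncating the infinite sum at k_max
geom_mean_numeric <- function(p, k_max = 1000) {
  k <- 0:k_max
  sum(k * dgeom(k, p))     # sum of k * P(X = k) over k = 0, ..., k_max
}
geom_mean_numeric(0.50)    # compare with (1 - 0.50) / 0.50 = 1
geom_mean_numeric(0.60)    # compare with (1 - 0.60) / 0.60, about 0.667
```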

Simulating the Geometric Distribution

Random Sampling from the Geometric Distribution

Sample Mean vs the Expected Value

The sample mean of \(0.88\) is not exactly equal to the expected value of \(1\) due to sampling variability. As we increase the number of samples, the sample mean gets closer to the expectation.

Geometric Random Sampling using R

N <- 100 # number of simulations
p <- 0.5 # set "success" probability
rgeom(N,p)
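
To see the sample mean approach the expected value, the simulation can be repeated with larger \(N\). This is a sketch under assumed settings: the seed and the list of sample sizes are arbitrary choices, not values from the slides.

```r
set.seed(2025)                                # arbitrary seed, only for reproducibility
p <- 0.5
expected <- (1 - p) / p                       # E(X) = 1 for p = 0.5
sample_sizes <- c(100, 1000, 10000, 100000)   # assumed values of N
sample_means <- sapply(sample_sizes, function(N) mean(rgeom(N, p)))
data.frame(N = sample_sizes, sample_mean = sample_means, expected = expected)
```

The gap between the sample mean and the expected value typically shrinks as \(N\) grows, though not necessarily at every step.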

Visualizing the Binomial Distribution

Binomial Distribution

Binomial R.V.

Let \(p=0.50\) be the success probability and \(n=10\) the number of trials.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Binom}(10, 0.50) \\ \text{PMF } & \longrightarrow P(X=k) = \binom{10}{k} (0.50)^k (1-0.50)^{10-k} \\ \text{for } & k = 0,1,2,3, \cdots, 10 \end{aligned} \]

Binomial Probabilities (1/2)

Binomial Distribution

Example:

What is the probability of getting exactly 4 “successes” in 10 trials with \(p=0.50\)? \[ \begin{aligned} P(X=4) & = \binom{10}{4} (0.50)^4 (1-0.50)^{10-4} \\ & \approx 0.205 \end{aligned} \]

Using R:

n <- 10
p <- 0.5
dbinom(4,n,p)
## [1] 0.2050781

\(\star\) Note that the dbinom() function computes the probability \(P(X = k)\), meaning it computes the probability at exactly \(X=k\) using the Binomial PMF.
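
The same probability can be built from the Binomial PMF by hand, using choose() for the binomial coefficient. The name `manual` is illustrative.

```r
n <- 10
p <- 0.5
k <- 4
manual <- choose(n, k) * p^k * (1 - p)^(n - k)   # 210 * (0.5)^10
all.equal(manual, dbinom(k, n, p))               # TRUE when the two values agree
```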

Binomial Probabilities (2/2)

Binomial Distribution

Example:

What is the probability of getting at most 4 “successes” in 10 trials with \(p=0.50\)? \[ \begin{aligned} P(X \le 4) & = \sum_{k=0}^4 P(X = k) \\ & = \sum_{k=0}^4 \binom{10}{k} (0.50)^k (1-0.50)^{10-k} \\ P(X \le 4) & \approx 0.377 \end{aligned} \]

Using R:

n <- 10
p <- 0.5
pbinom(4,n,p)
## [1] 0.3769531

\(\star\) Note that the pbinom() function computes the probability \(P(X \le k)\), meaning it computes the sum of all probabilities from \(X=0\) to \(X=k\) using the Binomial PMF.
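
A short sketch of how pbinom() relates to the PMF, and how the complementary “at least 5” probability can be obtained from it. The variable names are illustrative; `lower.tail` is a standard argument of pbinom().

```r
n <- 10
p <- 0.5
summed <- sum(dbinom(0:4, n, p))                    # P(X = 0) + ... + P(X = 4)
at_least_5 <- 1 - pbinom(4, n, p)                   # complement rule: P(X >= 5)
upper_tail <- pbinom(4, n, p, lower.tail = FALSE)   # same quantity, written as P(X > 4)
c(summed, at_least_5, upper_tail)
```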

Binomial Expected Value

Binomial Distribution with Expected Value

Binomial R.V.

Let \(p=0.50\) be the success probability and \(n=10\) the number of trials.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Binom}(10, 0.50) \\ \text{PMF } & \longrightarrow P(X=k) = \binom{10}{k} (0.50)^k (1-0.50)^{10-k} \\ \text{for } & k = 0,1,2,3, \cdots, 10 \\ \text{expected value} & \longrightarrow \text{E}(X) = 5 \end{aligned} \]

In general, the expected value of the Binomial r.v. is given by \[\text{E}(X) = np,\] which is the expected number of “successes” in \(n\) trials.
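
Because the Binomial r.v. takes only finitely many values, the definition \(\text{E}(X) = \sum_{k=0}^{n} k \, P(X=k)\) can be evaluated exactly and compared with \(np\). The name `mean_from_pmf` is illustrative.

```r
n <- 10
p <- 0.5
k <- 0:n
mean_from_pmf <- sum(k * dbinom(k, n, p))   # sum of k * P(X = k) over all possible k
c(mean_from_pmf, n * p)                     # both entries should equal 5
```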

Simulating the Binomial Distribution

Random Sampling from the Binomial Distribution

Sample Mean vs the Expected Value

The sample mean of \(4.74\) is not exactly equal to the expected value of \(5\) due to sampling variability. As we increase the number of samples, the sample mean gets closer to the expectation.

Binomial Random Sampling using R

N <- 100 # number of simulations
n <- 10 # number of trials
p <- 0.5 # set "success" probability
rbinom(N,n,p)
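
Beyond the sample mean, the simulated values can be compared with the Binomial PMF directly. This is a sketch under assumed settings: the seed and the larger \(N\) are choices made here so that the relative frequencies are stable.

```r
set.seed(361)                         # arbitrary seed, only for reproducibility
N <- 10000                            # number of simulations (assumed; larger than 100)
n <- 10                               # number of trials
p <- 0.5                              # "success" probability
x <- rbinom(N, n, p)
empirical <- as.numeric(table(factor(x, levels = 0:n))) / N   # observed share of each k
theoretical <- dbinom(0:n, n, p)                              # Binomial PMF at each k
round(data.frame(k = 0:n, empirical, theoretical), 3)
```

The two columns should be close but not identical, for the same sampling-variability reason as the sample mean.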

Geometric vs Binomial Distribution Summary

| R.V. \(X\) | Geometric | Binomial |
|---|---|---|
| Description | number of “fail” trials before the first “success” | number of “successes” in \(n\) trials |
| Sampling | with replacement | with replacement |
| Parameters | \(p \longrightarrow\) probability of “success” | \(n \longrightarrow\) number of trials; \(p \longrightarrow\) probability of “success” |
| PMF | \(P(X=k) = (1-p)^k p\), \(k=0,1,2,\cdots\) | \(P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}\), \(k = 0,1,2,3, \cdots, n\) |
| Expected Value \(\text{E}(X)\) | \(\frac{1-p}{p}\) | \(np\) |
| \(P(X = k)\) | dgeom(k,p) | dbinom(k,n,p) |
| \(P(X \le k)\) | pgeom(k,p) | pbinom(k,n,p) |
| \(N\) Simulations | rgeom(N,p) | rbinom(N,n,p) |

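\(\star\) Key Idea Both the Geometric and Binomial r.v.s are built on Bernoulli trials; they differ in what is counted. The Geometric r.v. counts the “fail” trials before the first “success”, while the Binomial r.v. counts the “successes” in a fixed number of trials.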
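
The two counting rules can be sketched directly from simulated Bernoulli trials. This is an illustration only: the seed and the variable names are assumptions, and in practice rgeom() and rbinom() do this work.

```r
set.seed(17)                              # arbitrary seed, only for reproducibility
p <- 0.5
n <- 10

# Binomial view: fix the number of trials, then count the "success" outcomes
trials <- rbinom(n, size = 1, prob = p)   # n Bernoulli trials (1 = "success", 0 = "fail")
binomial_count <- sum(trials)

# Geometric view: keep running trials and count "fail" outcomes until the first "success"
geometric_count <- 0
while (rbinom(1, size = 1, prob = p) == 0) {
  geometric_count <- geometric_count + 1
}

c(binomial = binomial_count, geometric = geometric_count)
```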

Activity: Comparing Binomial Distribution with Different Parameters

  1. Log in to Posit Cloud and open the RStudio assignment M 2/17 - Comparing Binomial Distribution with Different Parameters.
  2. Make sure you are in the current working directory. Rename the .Rmd file by replacing [name] with your name using the format [First name][Last initial]. Then, open the .Rmd file.
  3. Change the author in the YAML header.
  4. Read the provided instructions.
  5. Answer all exercise problems in the designated sections.
