Discrete Random Variables &
Probability Mass Functions

Applied Statistics

MTH-361A | Spring 2025 | University of Portland

February 12, 2025

Objectives

Introduce the geometric and binomial random variables, build their PMFs from i.i.d. Bernoulli trials, and compare empirical and theoretical probabilities.

Previously… (1/2)

Random Variables

A random variable (r.v.) is a numerical outcome of a random experiment. It assigns a number to each possible outcome in a sample space.

In other words, a random variable is a function that maps the sample space to the real numbers (see the R sketch below).

Types: discrete (countably many possible values, the focus of this lecture) or continuous (values in an interval of the real line).
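For instance, a single coin toss becomes a random variable once each outcome is assigned a number. A minimal R sketch of this mapping (the labels “H”/“T” and the 0/1 coding are illustrative choices, not from the slides):

```r
# Sample space of one coin toss, written as labels
sample_space <- c("H", "T")

# The random variable X is a function from outcomes to real numbers: X(H) = 1, X(T) = 0
X <- function(outcome) ifelse(outcome == "H", 1, 0)

outcome <- sample(sample_space, size = 1)  # run the random experiment once
X(outcome)                                 # the number X assigns to that outcome
```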

Previously… (2/2)

A Bernoulli r.v. represents a single experiment with two possible outcomes: “success” (\(X=1\)) with probability \(p\) and failure (\(X=0\)) with probability \(1-p\). We denote this r.v. and its PMF as \[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Bern}(p) \\ \text{PMF } & \longrightarrow P(X = x) = p^x (1-p)^{1-x}, \ \ x \in \{0,1\} \end{aligned}. \]

We have shown that the expected value of \(X\) is \(\text{E}(X) = p\) and the variance of \(X\) is \(\text{Var}(X) = p(1-p)\), using the definitions of expected value and variance, respectively.
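These two results can be checked by simulation. A minimal R sketch, assuming the arbitrary choices \(p = 0.3\) and 10,000 draws (Bernoulli draws are generated as Binomial(1, \(p\)) draws):

```r
set.seed(361)                            # for reproducibility
p <- 0.3
x <- rbinom(10000, size = 1, prob = p)   # 10,000 Bernoulli(p) draws

mean(x)   # sample mean, should be close to E(X) = p = 0.3
var(x)    # sample variance, should be close to p(1 - p) = 0.21
```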

Sampling with Replacement

Sampling with replacement is a sampling method in which each selected item is returned to the population before the next selection.

Key Characteristics: each draw is independent of the others, the selection probabilities stay the same from draw to draw, and the same item can be selected more than once.

Example: drawing a card, recording it, and returning it to the deck before the next draw, as in the R sketch below.
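A minimal R sketch of sampling with replacement, using a small illustrative population of five labeled items (the population and number of draws are arbitrary choices):

```r
population <- 1:5                              # an illustrative population of 5 items
sample(population, size = 10, replace = TRUE)  # 10 draws; items are returned, so repeats can occur
```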

Sampling without Replacement

Sampling without replacement is a sampling method in which each selected item is not returned to the population before the next selection.

Key Characteristics: draws are dependent, the composition of the population changes after every selection, and no item can be selected more than once.

Example: dealing several cards from a deck without returning them, as in the R sketch below.
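The same sketch with replace = FALSE shows the contrast: no item can be drawn twice, and at most five draws are possible from this illustrative population:

```r
population <- 1:5                              # the same illustrative population
sample(population, size = 3, replace = FALSE)  # 3 draws; each item can appear at most once
```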

Independent and Identically Distributed

A sequence of random variables \(X_1, X_2, \cdots, X_n\) is independent and identically distributed (i.i.d.) if: the random variables are mutually independent, and each \(X_i\) has the same probability distribution.

Examples: repeated flips of the same coin, or repeated rolls of the same die, with each outcome recorded as a number.

Why is i.i.d. Important?

The i.i.d. assumption lets us multiply probabilities across trials, which is what the geometric and binomial models in this lecture rely on.

Multiple Bernoulli Trials

A sequence of multiple Bernoulli trials consists of i.i.d. Bernoulli random variables, each following a Bernoulli distribution with “success” probability \(p\).

\(\star\) Key Assumption: Trials are independent (i.e., one outcome does not affect the next).

The Geometric R.V.

A geometric r.v. is a discrete random variable that represents the number of Bernoulli trials until the first “success,” where each trial is independent with a constant “success” probability \(p\): \[X \sim \text{Geom}(p)\]

Sample Space:

\[ \begin{aligned} 1 & \longrightarrow 0 \text{ failures before the first success} \\ 0,1 & \longrightarrow 1 \text{ failure before the first success} \\ 0,0,1 & \longrightarrow 2 \text{ failures before the first success} \\ & \vdots \\ 0,0,0,\cdots,1 & \longrightarrow k \text{ failures before the first success} \\ \end{aligned} \]

Probabilities:

\[ \begin{aligned} 1 & \longrightarrow (1-p)^0 p \\ 0,1 & \longrightarrow (1-p)^1 p \\ 0,0,1 & \longrightarrow (1-p)^2 p \\ & \vdots \\ 0,0,0,\cdots,1 & \longrightarrow (1-p)^k p \\ \end{aligned} \]

\(\star\) Key Idea: The geometric random variable counts the number of “failures” before a “success” and can also be viewed as counting the number of trials including the first “success.”
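These probabilities agree with R’s built-in geometric PMF, which uses the same “number of failures before the first success” convention. A minimal check, assuming an illustrative “success” probability \(p = 0.25\):

```r
p <- 0.25
k <- 0:5
cbind(manual = (1 - p)^k * p,       # the probabilities listed above
      dgeom  = dgeom(k, prob = p))  # R's geometric PMF (failures before the first success)
```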

The Geometric R.V.: PMF

The geometric r.v. \(X \sim \text{Geom}(p)\) has infinitely many possible outcomes (a countably infinite sample space), where \(p\) is the “success” probability.

The PMF of the geometric r.v. can be written in two ways: counting the number of “failures” before the first “success,” \[P(X = k) = (1-p)^k p, \ \ k = 0, 1, 2, \cdots,\] or counting the total number of trials up to and including the first “success,” \[P(X = k) = (1-p)^{k-1} p, \ \ k = 1, 2, 3, \cdots.\]

\(\star\) Key Idea: The geometric random variable models a situation where samples are taken with replacement, and the number of failures until the first success is counted.
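The “sample with replacement until the first success” story can also be simulated directly and compared against the PMF. A minimal R sketch, assuming \(p = 0.25\) and 10,000 simulated runs:

```r
set.seed(361)
p <- 0.25

one_run <- function() {                           # draw Bernoulli trials until the first "success"
  failures <- 0
  while (rbinom(1, size = 1, prob = p) == 0) failures <- failures + 1
  failures                                        # report how many "failures" came first
}

sims <- replicate(10000, one_run())
prop.table(table(sims))[1:4]   # empirical P(X = 0), ..., P(X = 3)
dgeom(0:3, prob = p)           # theoretical (1 - p)^k * p
```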

The Geometric R.V.: Examples

The Binomial R.V.

A binomial r.v. is a discrete random variable representing the number of “successes” in \(n\) independent Bernoulli trials, each with “success” probability \(p\): \[X \sim \text{Binom}(n,p)\]

Sample Space:

Suppose \(n = 3\).

\[ \begin{aligned} 0,0,0 & \longrightarrow 3 \text{ failures and } 0 \text{ successes} \\ 0,0,1 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 0,1,0 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 0,1,1 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,0,0 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 1,0,1 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,1,0 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,1,1 & \longrightarrow 0 \text{ failures and } 3 \text{ successes} \\ \end{aligned} \]

Probabilities:

Suppose \(n = 3\). \[ \begin{aligned} 0,0,0 & \longrightarrow (1-p)^3 p^0 \\ 0,0,1 & \longrightarrow (1-p)^2 p^1 \\ 0,1,0 & \longrightarrow (1-p)^2 p^1 \\ 1,0,0 & \longrightarrow (1-p)^2 p^1 \\ 0,1,1 & \longrightarrow (1-p)^1 p^2 \\ 1,0,1 & \longrightarrow (1-p)^1 p^2 \\ 1,1,0 & \longrightarrow (1-p)^1 p^2 \\ 1,1,1 & \longrightarrow (1-p)^0 p^3 \end{aligned} \]

\(\star\) Key Idea: The binomial random variable counts the number of “successes” in \(n\) independent Bernoulli trials, where each trial has a “success” probability \(p\).
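The \(n = 3\) enumeration can be reproduced in code: list all eight sequences, then add up the probabilities of the sequences that share the same number of “successes.” A minimal R sketch, assuming an illustrative \(p = 0.4\):

```r
p <- 0.4
outcomes <- expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1)  # the 8 possible sequences of 3 trials
k <- rowSums(outcomes)                                  # number of "successes" in each sequence
seq_prob <- p^k * (1 - p)^(3 - k)                       # probability of each individual sequence

tapply(seq_prob, k, sum)         # P(X = 0), ..., P(X = 3) by grouping the sequences
dbinom(0:3, size = 3, prob = p)  # the same values from the binomial PMF
```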

Counting the Number of Combinations

Permutations

An arrangement of objects in a specific order. From \(n\) objects, we pick \(k\) objects to arrange, with the number of permutations given by \[_n P_k = \frac{n!}{(n-k)!}.\]

Combinations

A selection of objects where order does not matter. From \(n\) objects, we pick \(k\) objects, with the number of combinations given by \[_n C_k = \binom{n}{k} = \frac{n!}{k!(n-k)!}.\]
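Both counts follow directly from the factorial formulas; R’s built-in choose() returns the combination count. A minimal sketch with the illustrative values \(n = 5\) and \(k = 2\):

```r
n <- 5; k <- 2
factorial(n) / factorial(n - k)                   # nPk = n!/(n - k)! = 20 ordered selections
factorial(n) / (factorial(k) * factorial(n - k))  # nCk = n!/(k!(n - k)!) = 10 unordered selections
choose(n, k)                                      # built-in binomial coefficient, also 10
```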

The Binomial Coefficient

The binomial coefficient, denoted as \(\binom{n}{k}\), represents the number of ways to choose \(k\) objects from a set of \(n\) objects without regard to order. It is given by the formula: \[\binom{n}{k} = \frac{n!}{k!(n-k)!}\]

Expanding binomial expressions using the Binomial Theorem: \[(x+y)^n = \sum_{k=0}^n \binom{n}{k} y^k x^{n-k}\]

If we let \(x=1-p\) and \(y=p\) (Bernoulli PMF), then \[(1-p+p)^n = \sum_{k=0}^n \binom{n}{k} p^k (1-p)^{n-k} = 1.\]

\(\star\) Key Idea: Since \(p\) is the “success” probability and the Binomial Theorem shows these probabilities sum to \(1\), the binomial PMF satisfies the probability axioms.
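This can be confirmed numerically, since the binomial probabilities sum to \(1\) for any \(n\) and \(p\). A minimal R check with the illustrative values \(n = 10\) and \(p = 0.3\):

```r
n <- 10; p <- 0.3; k <- 0:n
sum(choose(n, k) * p^k * (1 - p)^(n - k))  # equals 1, as the Binomial Theorem guarantees
```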

The Binomial R.V.: PMF

The binomial r.v. \(X \sim \text{Binom}(n,p)\) has finite possible outcomes with PMF given by \[P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}, \ k = 0,1,2,3, \cdots, n\] where \(p\) is the “success” probability. The term \(\binom{n}{k} = \frac{n!}{k! (n-k)!}\) is the binomial coefficient.

\(\star\) Key Idea: The binomial random variable models a situation where samples are taken with replacement, and the number of successes is counted within a finite number of trials.
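The PMF formula agrees with R’s built-in dbinom(). A minimal sketch, assuming the illustrative values \(n = 10\) and \(p = 0.3\):

```r
n <- 10; p <- 0.3; k <- 0:n
manual <- choose(n, k) * p^k * (1 - p)^(n - k)    # PMF computed from the formula
all.equal(manual, dbinom(k, size = n, prob = p))  # TRUE: matches R's binomial PMF
```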

The Binomial R.V.: Examples

Activity: Compute Empirical vs Theoretical Probabilities

  1. Make sure you have a copy of the W 2/12 Worksheet. This will be handed out physically, and it is also available digitally on Moodle.
  2. Work on your worksheet by yourself for 10 minutes. Please read the instructions carefully. Ask questions if anything needs clarification.
  3. Get together with another student.
  4. Discuss your results.
  5. Submit your worksheet on Moodle as a .pdf file.
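As one possible illustration of comparing empirical and theoretical probabilities (the Binomial(10, 0.3) model and 1,000 simulated samples below are assumptions for this sketch, not taken from the worksheet):

```r
set.seed(361)
n <- 10; p <- 0.3
sims <- rbinom(1000, size = n, prob = p)  # 1,000 simulated binomial counts

empirical   <- prop.table(table(factor(sims, levels = 0:n)))  # observed proportions
theoretical <- dbinom(0:n, size = n, prob = p)                # PMF values
round(cbind(empirical, theoretical), 3)                       # side-by-side comparison
```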

References

Diez, D. M., Barr, C. D., & Çetinkaya-Rundel, M. (2012). OpenIntro statistics (4th ed.). OpenIntro. https://www.openintro.org/book/os/
Speegle, D., & Clair, B. (2021). Probability, statistics, and data: A fresh approach using R. Chapman & Hall/CRC. https://probstatsdata.com/