Discrete Random Variables &
Probability Mass Functions

Applied Statistics

MTH-361A | Spring 2025 | University of Portland

February 12, 2025

Objectives

Introduce the geometric and binomial random variables, build their PMFs from i.i.d. Bernoulli trials, and compare empirical and theoretical probabilities.

Previously… (1/2)

Random Variables

A random variable (r.v.) is a numerical outcome of a random experiment. It assigns a number to each possible outcome in a sample space.

In other words, a random variable is a function that maps the sample space to the real numbers (see the R sketch below).

Types: discrete (countably many possible values, the focus of this lecture) or continuous (values in an interval of the real line).
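For instance, a single coin toss becomes a random variable once each outcome is assigned a number. A minimal R sketch of this mapping (the labels “H”/“T” and the 0/1 coding are illustrative choices, not from the slides):

```r
# Sample space of one coin toss, written as labels
sample_space <- c("H", "T")

# The random variable X is a function from outcomes to real numbers: X(H) = 1, X(T) = 0
X <- function(outcome) ifelse(outcome == "H", 1, 0)

outcome <- sample(sample_space, size = 1)  # run the random experiment once
X(outcome)                                 # the number X assigns to that outcome
```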

Previously… (2/2)

A Bernoulli r.v. represents a single experiment with two possible outcomes: “success” (\(X=1\)) with probability \(p\) and failure (\(X=0\)) with probability \(1-p\). We denote this r.v. and its PMF as \[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Bern}(p) \\ \text{PMF } & \longrightarrow P(X = x) = p^x (1-p)^{1-x}, \ \ x \in \{0,1\} \end{aligned}. \]

We have shown that the expected value of \(X\) is \(\text{E}(X) = p\) and the variance of \(X\) is \(\text{Var}(X) = p(1-p)\), using the definitions of expected value and variance, respectively.
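These two results can be checked by simulation. A minimal R sketch, assuming the arbitrary choices \(p = 0.3\) and 10,000 draws (Bernoulli draws are generated as Binomial(1, \(p\)) draws):

```r
set.seed(361)                            # for reproducibility
p <- 0.3
x <- rbinom(10000, size = 1, prob = p)   # 10,000 Bernoulli(p) draws

mean(x)   # sample mean, should be close to E(X) = p = 0.3
var(x)    # sample variance, should be close to p(1 - p) = 0.21
```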

Sampling with Replacement

Sampling with replacement is a sampling method in which each selected item is returned to the population before the next selection.

Key Characteristics: each draw is independent of the others, the selection probabilities stay the same from draw to draw, and the same item can be selected more than once.

Example: drawing a card, recording it, and returning it to the deck before the next draw, as in the R sketch below.
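A minimal R sketch of sampling with replacement, using a small illustrative population of five labeled items (the population and number of draws are arbitrary choices):

```r
population <- 1:5                              # an illustrative population of 5 items
sample(population, size = 10, replace = TRUE)  # 10 draws; items are returned, so repeats can occur
```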

Sampling without Replacement

Sampling without replacement is a sampling method in which each selected item is not returned to the population before the next selection.

Key Characteristics: draws are dependent, the composition of the population changes after every selection, and no item can be selected more than once.

Example: dealing several cards from a deck without returning them, as in the R sketch below.
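The same sketch with replace = FALSE shows the contrast: no item can be drawn twice, and at most five draws are possible from this illustrative population:

```r
population <- 1:5                              # the same illustrative population
sample(population, size = 3, replace = FALSE)  # 3 draws; each item can appear at most once
```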

Independent and Identically Distributed

A sequence of random variables \(X_1, X_2, \cdots, X_n\) is independent and identically distributed (i.i.d.) if: the random variables are mutually independent, and each \(X_i\) has the same probability distribution.

Examples: repeated flips of the same coin, or repeated rolls of the same die, with each outcome recorded as a number.

Why is i.i.d. Important?

The i.i.d. assumption lets us multiply probabilities across trials, which is what the geometric and binomial models in this lecture rely on.

Multiple Bernoulli Trials

A sequence of multiple Bernoulli trials consists of i.i.d. Bernoulli random variables, each following a Bernoulli distribution with “success” probability \(p\).

\(\star\) Key Assumption: Trials are independent (i.e., one outcome does not affect the next).

The Geometric R.V.

A geometric r.v. is a discrete random variable that represents the number of Bernoulli trials until the first “success,” where each trial is independent with a constant “success” probability \(p\): \[X \sim \text{Geom}(p)\]

Sample Space:

\[ \begin{aligned} 1 & \longrightarrow 0 \text{ failures before the first success} \\ 0,1 & \longrightarrow 1 \text{ failure before the first success} \\ 0,0,1 & \longrightarrow 2 \text{ failures before the first success} \\ & \vdots \\ 0,0,0,\cdots,1 & \longrightarrow k \text{ failures before the first success} \\ \end{aligned} \]

Probabilities:

\[ \begin{aligned} 1 & \longrightarrow (1-p)^0 p \\ 0,1 & \longrightarrow (1-p)^1 p \\ 0,0,1 & \longrightarrow (1-p)^2 p \\ & \vdots \\ 0,0,0,\cdots,1 & \longrightarrow (1-p)^k p \\ \end{aligned} \]

\(\star\) Key Idea: The geometric random variable counts the number of “failures” before a “success” and can also be viewed as counting the number of trials including the first “success.”
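These probabilities agree with R’s built-in geometric PMF, which uses the same “number of failures before the first success” convention. A minimal check, assuming an illustrative “success” probability \(p = 0.25\):

```r
p <- 0.25
k <- 0:5
cbind(manual = (1 - p)^k * p,       # the probabilities listed above
      dgeom  = dgeom(k, prob = p))  # R's geometric PMF (failures before the first success)
```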

The Geometric R.V.: PMF

The geometric r.v. \(X \sim \text{Geom}(p)\) has infinitely many possible outcomes (a countably infinite sample space), where \(p\) is the “success” probability.

The PMF of the geometric r.v. can be written in two ways: counting the number of “failures” before the first “success,” \[P(X = k) = (1-p)^k p, \ \ k = 0, 1, 2, \cdots,\] or counting the total number of trials up to and including the first “success,” \[P(X = k) = (1-p)^{k-1} p, \ \ k = 1, 2, 3, \cdots.\]

\(\star\) Key Idea: The geometric random variable models a situation where samples are taken with replacement, and the number of failures until the first success is counted.
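The “sample with replacement until the first success” story can also be simulated directly and compared against the PMF. A minimal R sketch, assuming \(p = 0.25\) and 10,000 simulated runs:

```r
set.seed(361)
p <- 0.25

one_run <- function() {                           # draw Bernoulli trials until the first "success"
  failures <- 0
  while (rbinom(1, size = 1, prob = p) == 0) failures <- failures + 1
  failures                                        # report how many "failures" came first
}

sims <- replicate(10000, one_run())
prop.table(table(sims))[1:4]   # empirical P(X = 0), ..., P(X = 3)
dgeom(0:3, prob = p)           # theoretical (1 - p)^k * p
```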

The Geometric R.V.: Examples

The Binomial R.V.

A binomial r.v. is a discrete random variable representing the number of “successes” in \(n\) independent Bernoulli trials, each with “success” probability \(p\): \[X \sim \text{Binom}(n,p)\]

Sample Space:

Suppose \(n = 3\).

\[ \begin{aligned} 0,0,0 & \longrightarrow 3 \text{ failures and } 0 \text{ successes} \\ 0,0,1 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 0,1,0 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 0,1,1 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,0,0 & \longrightarrow 2 \text{ failures and } 1 \text{ success} \\ 1,0,1 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,1,0 & \longrightarrow 1 \text{ failure and } 2 \text{ successes} \\ 1,1,1 & \longrightarrow 0 \text{ failures and } 3 \text{ successes} \\ \end{aligned} \]

Probabilities:

Suppose \(n = 3\). \[ \begin{aligned} 0,0,0 & \longrightarrow (1-p)^3 p^0 \\ 0,0,1 & \longrightarrow (1-p)^2 p^1 \\ 0,1,0 & \longrightarrow (1-p)^2 p^1 \\ 1,0,0 & \longrightarrow (1-p)^2 p^1 \\ 0,1,1 & \longrightarrow (1-p)^1 p^2 \\ 1,0,1 & \longrightarrow (1-p)^1 p^2 \\ 1,1,0 & \longrightarrow (1-p)^1 p^2 \\ 1,1,1 & \longrightarrow (1-p)^0 p^3 \end{aligned} \]

\(\star\) Key Idea: The binomial random variable counts the number of “successes” in \(n\) independent Bernoulli trials, where each trial has a “success” probability \(p\).
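The \(n = 3\) enumeration can be reproduced in code: list all eight sequences, then add up the probabilities of the sequences that share the same number of “successes.” A minimal R sketch, assuming an illustrative \(p = 0.4\):

```r
p <- 0.4
outcomes <- expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1)  # the 8 possible sequences of 3 trials
k <- rowSums(outcomes)                                  # number of "successes" in each sequence
seq_prob <- p^k * (1 - p)^(3 - k)                       # probability of each individual sequence

tapply(seq_prob, k, sum)         # P(X = 0), ..., P(X = 3) by grouping the sequences
dbinom(0:3, size = 3, prob = p)  # the same values from the binomial PMF
```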

Counting the Number of Combinations

Permutations

An arrangement of objects in a specific order. From \(n\) objects, we pick \(k\) objects to arrange, with the number of permutations given by \[_n P_k = \frac{n!}{(n-k)!}.\]

Combinations

A selection of objects where order does not matter. From \(n\) objects, we pick \(k\) objects, with the number of combinations given by \[_n C_k = \binom{n}{k} = \frac{n!}{k!(n-k)!}.\]
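Both counts follow directly from the factorial formulas; R’s built-in choose() returns the combination count. A minimal sketch with the illustrative values \(n = 5\) and \(k = 2\):

```r
n <- 5; k <- 2
factorial(n) / factorial(n - k)                   # nPk = n!/(n - k)! = 20 ordered selections
factorial(n) / (factorial(k) * factorial(n - k))  # nCk = n!/(k!(n - k)!) = 10 unordered selections
choose(n, k)                                      # built-in binomial coefficient, also 10
```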

The Binomial Coefficient

The binomial coefficient, denoted as \(\binom{n}{k}\), represents the number of ways to choose \(k\) objects from a set of \(n\) objects without regard to order. It is given by the formula: \[\binom{n}{k} = \frac{n!}{k!(n-k)!}\]

Expanding binomial expressions using the Binomial Theorem: \[(x+y)^n = \sum_{k=0}^n \binom{n}{k} y^k x^{n-k}\]

If we let \(x=1-p\) and \(y=p\) (Bernoulli PMF), then \[(1-p+p)^n = \sum_{k=0}^n \binom{n}{k} p^k (1-p)^{n-k} = 1.\]

\(\star\) Key Idea: Since \(p\) is the “success” probability and the Binomial Theorem shows these probabilities sum to \(1\), the binomial PMF satisfies the probability axioms.
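This can be confirmed numerically, since the binomial probabilities sum to \(1\) for any \(n\) and \(p\). A minimal R check with the illustrative values \(n = 10\) and \(p = 0.3\):

```r
n <- 10; p <- 0.3; k <- 0:n
sum(choose(n, k) * p^k * (1 - p)^(n - k))  # equals 1, as the Binomial Theorem guarantees
```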

The Binomial R.V.: PMF

The binomial r.v. \(X \sim \text{Binom}(n,p)\) has finite possible outcomes with PMF given by \[P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}, \ k = 0,1,2,3, \cdots, n\] where \(p\) is the “success” probability. The term \(\binom{n}{k} = \frac{n!}{k! (n-k)!}\) is the binomial coefficient.

\(\star\) Key Idea: The binomial random variable models a situation where samples are taken with replacement, and the number of successes is counted within a finite number of trials.
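The PMF formula agrees with R’s built-in dbinom(). A minimal sketch, assuming the illustrative values \(n = 10\) and \(p = 0.3\):

```r
n <- 10; p <- 0.3; k <- 0:n
manual <- choose(n, k) * p^k * (1 - p)^(n - k)    # PMF computed from the formula
all.equal(manual, dbinom(k, size = n, prob = p))  # TRUE: matches R's binomial PMF
```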

The Binomial R.V.: Examples

Activity: Compute Empirical vs Theoretical Probabilities

  1. Make sure you have a copy of the W 2/12 Worksheet. This will be handed out physically, and it is also available digitally on Moodle.
  2. Work on your worksheet by yourself for 10 minutes. Please read the instructions carefully. Ask questions if anything needs clarification.
  3. Get together with another student.
  4. Discuss your results.
  5. Submit your worksheet on Moodle as a .pdf file.
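As one possible illustration of comparing empirical and theoretical probabilities (the Binomial(10, 0.3) model and 1,000 simulated samples below are assumptions for this sketch, not taken from the worksheet):

```r
set.seed(361)
n <- 10; p <- 0.3
sims <- rbinom(1000, size = n, prob = p)  # 1,000 simulated binomial counts

empirical   <- prop.table(table(factor(sims, levels = 0:n)))  # observed proportions
theoretical <- dbinom(0:n, size = n, prob = p)                # PMF values
round(cbind(empirical, theoretical), 3)                       # side-by-side comparison
```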

References

Diez, D. M., Barr, C. D., & Çetinkaya-Rundel, M. (2012). OpenIntro statistics (4th ed.). OpenIntro. https://www.openintro.org/book/os/
Speegle, D., & Clair, B. (2021). Probability, statistics, and data: A fresh approach using R. Chapman & Hall/CRC. https://probstatsdata.com/