MTH-361A | Spring 2025 | University of Portland
February 7, 2025
Basic Probability Rules
Rule | Formula |
---|---|
Independence | \(P(A \cap B) = P(A)P(B)\) |
Union (Addition Rule) | \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\) |
Disjoint | \(P(A \cap B) = 0\) |
Complement | If \(B\) is the complement of \(A\), then \(P(B) = 1 - P(A)\). |
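As a quick numerical check of these rules, here is a minimal Python sketch (not part of the original notes) using hypothetical probabilities for two independent events.

```python
# Hypothetical example: two independent events A and B.
p_A = 0.30
p_B = 0.50

# Independence: P(A and B) = P(A) * P(B)
p_A_and_B = p_A * p_B

# Addition (union) rule: P(A or B) = P(A) + P(B) - P(A and B)
p_A_or_B = p_A + p_B - p_A_and_B

print(p_A_and_B)  # 0.15
print(p_A_or_B)   # 0.65
```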
Probability Axioms
Axiom | Statement |
---|---|
\(P(S) = 1\) | The sum of the probabilities for all outcomes in the sample space is equal to 1. |
\(P \in [0,1]\) | Probabilities are always nonnegative and lie between \(0\) and \(1\). |
\(P(A \cup B) = P(A) + P(B)\) | If events A and B are disjoint (mutually exclusive), then their probabilities can be added. |
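These axioms can be verified directly for any finite sample space. Below is a small sketch, assuming a fair six-sided die as the example (our choice, not from the notes).

```python
from fractions import Fraction

# Fair six-sided die: each outcome has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Axiom: every probability lies in [0, 1].
assert all(0 <= p <= 1 for p in pmf.values())

# Axiom: the probabilities over the whole sample space sum to 1.
assert sum(pmf.values()) == 1

# Disjoint events add: P({1} or {2}) = P({1}) + P({2}).
assert pmf[1] + pmf[2] == Fraction(2, 6)
```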
Random Variables
A random variable (r.v.) is a function that maps the sample space to the real numbers.
Probability Functions
A probability function maps the values of an r.v. to real numbers between 0 and 1.
There are two types of r.v.s, discrete and continuous, with corresponding probability functions: the probability mass function (PMF) and the probability density function (PDF), respectively.
Suppose we conduct an experiment that produces a single randomized outcome from a binary r.v.
A binary r.v. \(X\) is a variable that takes only two possible values, typically labelled as “success” and “failure”.
\(\star\) Key Idea: An r.v. with binary outcomes is called a Bernoulli r.v. with a probability of “success” \(p\) and “failure” \(1-p\), where \(p\) is called the parameter. One trial with binary outcomes is called a Bernoulli trial.
A Bernoulli r.v. represents a single experiment with two possible outcomes: “success” (\(X=1\)) with probability \(p\) and “failure” (\(X=0\)) with probability \(1-p\). We typically define an r.v. using \(\sim\) along with the name and its parameter: \[X \sim \text{Bern}(p)\]
The Bernoulli distribution is described by a probability mass function that gives the probability of each value of the Bernoulli r.v.
\[ P(X = x) = \begin{cases} p & \text{, if } x = 1 \\ 1-p & \text{, if } x = 0 \end{cases}. \]
where \(p\) is the parameter (also the probability of “success”).
The above function can also be written as
\[P(X = x) = p^x (1-p)^{1-x}, \ \ x \in \{0,1\}.\]
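The PMF above can be written directly as a function. A minimal Python sketch (the function name `bernoulli_pmf` is ours, not from the notes):

```python
def bernoulli_pmf(x, p):
    """P(X = x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p)**(1 - x)

print(bernoulli_pmf(1, 0.3))  # 0.3, the probability of "success"
print(bernoulli_pmf(0, 0.3))  # 0.7, the probability of "failure"
```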
Suppose we conduct an experiment of flipping a fair coin once.
This scenario can be modeled using a Bernoulli r.v. with parameter \(p=0.50\) because the probability of “success” is \(0.50\).
Here, we define a “success” of the r.v. to be the \(H\) outcome: \[X \sim \text{Bern}(0.50)\] with PMF defined as \[P(X=x) = (0.50)^x (1-0.50)^{1-x}, \ \ x \in \{0,1\}.\]
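To connect the model to data, we can simulate many fair-coin flips and check that the proportion of “successes” is close to \(p = 0.50\). A sketch using numpy (the seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(seed=361)

# Simulate 10,000 flips of a fair coin: 1 = "success" (H), 0 = "failure" (T).
flips = rng.binomial(n=1, p=0.50, size=10_000)

# The sample proportion of heads should be close to p = 0.50.
print(flips.mean())
```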
\(\star\) Key Idea: Any scenario for one experiment with binary outcomes can be modeled using the Bernoulli r.v..
The expected value of a r.v. is the weighted mean (average) of all possible values that the variable can take, weighted by their probabilities. It represents the long-run average outcome of a random experiment.
Discrete R.V.
For a discrete r.v. \(X\), the expected value (or expectation) is given by \[\text{E}(X) = \sum_{i=1}^{n} x_i P(X = x_i)\] where \(x_1, x_2, \cdots, x_n\) are the possible values of \(X\) and \(P(X = x_i)\) is its PMF.
\(\star\) Key Idea: This formula provides a weighted average (or the “center of mass”) of the possible values of the discrete r.v. \(X\), with each value weighted by its probability.
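As a concrete illustration (our own example, not from the notes), here is the weighted average computed for a small discrete r.v.:

```python
# Hypothetical discrete r.v.: values and their probabilities.
values = [0, 1, 2]
probs  = [0.2, 0.5, 0.3]

# E(X) = sum of x_i * P(X = x_i)
expected_value = sum(x * p for x, p in zip(values, probs))
print(expected_value)  # 0*0.2 + 1*0.5 + 2*0.3 = 1.1
```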
\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Bern}(p) \\ \text{PMF } & \longrightarrow P(X = x) = p^x (1-p)^{1-x}, \ \ x \in \{0,1\} \end{aligned} \]
The expected value of \(X\) is given by \[ \begin{aligned} \text{E}(X) & = 1 \cdot p^{1} (1-p)^{1-1} + 0 \cdot p^{0} (1-p)^{1-0} \\ & = 1 \cdot p + 0 \cdot (1-p) \\ \text{E}(X) & = p \end{aligned} \]
\(\star\) Key Idea: The expected value of a Bernoulli random variable is simply its success probability \(p\).
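The result \(\text{E}(X) = p\) can also be checked by simulation, since the long-run average of Bernoulli draws approaches \(p\). A minimal sketch (p = 0.3 chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
p = 0.3

# 100,000 Bernoulli(p) draws; the sample mean estimates E(X).
draws = rng.binomial(n=1, p=p, size=100_000)
print(draws.mean())  # should be close to p = 0.3
```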
Let \(X\) be a discrete or continuous r.v. The following properties hold and can be shown using the definition of the expected value.
Property | Formula |
---|---|
Constant and Scaling | \(\displaystyle \text{E}(c) = c\) and \(\text{E}(cX) = c\text{E}(X)\) |
Linearity | \(\displaystyle \text{E}(aX + bY) = a\text{E}(X) + b\text{E}(Y)\) |
Sum | \(\displaystyle \text{E}\left(\sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} \text{E}\left( X_i \right)\) |
Expectation | \(\displaystyle \text{E}(\text{E}(X)) = \text{E}(X)\) |
\(\star\) Key Idea: The expected value of an r.v. is always a constant, but its interpretation depends on context; we will discuss this more later.
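The linearity property can be illustrated numerically. Below is a sketch, assuming two independent Bernoulli r.v.s and arbitrary constants \(a\) and \(b\) (all choices are ours, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 200_000

# Two independent Bernoulli samples with different parameters.
X = rng.binomial(n=1, p=0.3, size=n)
Y = rng.binomial(n=1, p=0.6, size=n)
a, b = 2.0, -1.0

# Linearity: E(aX + bY) = a E(X) + b E(Y) = 2(0.3) - 1(0.6) = 0
print(np.mean(a * X + b * Y))  # simulated left-hand side, near 0
print(a * 0.3 + b * 0.6)       # exact right-hand side, 0.0
```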
The \(k\)-th raw moment of a discrete random variable \(X\) is given by \[\text{E}\left(X^k \right) = \sum_{i=1}^{n} x_i^k P(X = x_i)\] where: \(P(X = x_i)\) is the PMF of \(X\) for \(i=1,2,\cdots,n\).
Why are they called “raw moments”? They give us raw, unadjusted information about the characteristics of the probability distribution.
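A raw moment is computed just like the expected value but with the values raised to the \(k\)-th power. A small sketch, reusing the hypothetical distribution from the earlier example:

```python
def raw_moment(values, probs, k):
    """k-th raw moment: E(X^k) = sum of x_i**k * P(X = x_i)."""
    return sum(x**k * p for x, p in zip(values, probs))

values = [0, 1, 2]
probs  = [0.2, 0.5, 0.3]

print(raw_moment(values, probs, 1))  # 1st raw moment = E(X)   = 1.1
print(raw_moment(values, probs, 2))  # 2nd raw moment = E(X^2) = 1.7
```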
The variance of a r.v. \(X\) measures the spread of \(X\) around its expected value.
Discrete or Continuous R.V.
For an r.v. \(X\), the variance is given by \[\text{Var}(X) = \text{E}\left[ \left( X - \text{E}(X) \right)^2 \right]\] where \(\text{E}(X)\) is the expected value of \(X\).
Using the properties of the expected value, the variance formula reduces to \[\text{Var}(X) = \text{E}\left(X^2 \right) - \left( \text{E}(X) \right)^2\] where \(\text{E}\left(X^2\right)\) is the 2nd raw moment of \(X\).
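For completeness, the reduction follows by expanding the square and applying linearity (a standard derivation added here, not written out in the original notes): \[ \begin{aligned} \text{Var}(X) & = \text{E}\left[ \left( X - \text{E}(X) \right)^2 \right] \\ & = \text{E}\left[ X^2 - 2X\text{E}(X) + \left( \text{E}(X) \right)^2 \right] \\ & = \text{E}\left(X^2\right) - 2\text{E}(X)\text{E}(X) + \left( \text{E}(X) \right)^2 \\ & = \text{E}\left(X^2 \right) - \left( \text{E}(X) \right)^2 \end{aligned} \]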
\(\star\) Key Idea: A larger variance means the values of the r.v. are more spread out around its expected value, reflecting greater uncertainty.
\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{Bern}(p) \\ \text{PMF } & \longrightarrow P(X = x) = p^x (1-p)^{1-x}, \ \ x \in \{0,1\} \end{aligned} \]
The 2nd raw moment of \(X\) is given by \[ \begin{aligned} \text{E}\left( X^2 \right) & = 1^2 \cdot p^{1} (1-p)^{1-1} + 0^2 \cdot p^{0} (1-p)^{1-0} \\ & = 1 \cdot p + 0 \cdot (1-p) \\ & = p \end{aligned}. \]
Since \(\text{E}(X) = p\) and \(\text{E}\left(X^2\right) = p\), then the variance of \(X\) is given by \[ \begin{aligned} \text{Var}(X) & = \text{E}\left(X^2 \right) - \left( \text{E}(X) \right)^2 \\ & = p - p^2 \\ \text{Var}(X) & = p(1-p) \end{aligned} \]
\(\star\) Key Idea: The variance of a Bernoulli random variable is simply the product of the “success” and “failure” probabilities.
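The result \(\text{Var}(X) = p(1-p)\) can be checked against the variance of a simulated sample. A minimal sketch (p = 0.3 chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(seed=3)
p = 0.3

# 100,000 Bernoulli(p) draws; the sample variance estimates Var(X).
draws = rng.binomial(n=1, p=p, size=100_000)
print(draws.var())  # close to p * (1 - p)
print(p * (1 - p))  # exact value, 0.21
```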
The variance of the Bernoulli r.v. is \(\text{Var}(X) = p(1-p)\), where \(p\) is the “success” probability.
\(\star\) Key Idea: The variance of the Bernoulli r.v. is maximized when \(p = 0.5\) (“success” and “failure” are equally likely to occur), which is where the outcomes are most uncertain.
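Since the plot is not reproduced here, the claim can be verified numerically by evaluating \(p(1-p)\) over a grid of \(p\) values (a sketch, not part of the original notes):

```python
import numpy as np

# Evaluate Var(X) = p(1 - p) over a grid of p values in [0, 1].
p_grid = np.linspace(0, 1, 101)
variance = p_grid * (1 - p_grid)

# The variance is largest at p = 0.5, where it equals 0.25.
print(p_grid[np.argmax(variance)])  # 0.5
print(variance.max())               # 0.25
```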