MTH-161D | Spring 2025 | University of Portland
February 21, 2025
These slides are derived from Diez et al. (2012).
Basic Definition of Probability
Let \(A\) be an event in a finite sample space \(S\) whose outcomes are equally likely. The probability of \(A\) is \[P(A) = \frac{|A|}{|S|} = \frac{\text{number of outcomes favorable to } A}{\text{total number of outcomes in } S}.\]
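The counting definition above can be sketched in a few lines of Python, using a hypothetical fair six-sided die (an illustrative example, not one from the slides):

```python
# Counting definition of probability: P(A) = |A| / |S|
# for equally likely outcomes of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}               # sample space
A = {x for x in S if x % 2 == 0}     # event: roll an even number

p_A = len(A) / len(S)
print(p_A)  # 0.5
```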
Basic Rules
| Rule | Formula |
|---|---|
| Independence | \(P(A \text{ and } B) = P(A)P(B)\) |
| Union (Addition) | \(P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)\) |
| Disjoint | \(P(A \text{ and } B) = 0\), so \(P(A \text{ or } B) = P(A) + P(B)\) |
| Complement | \(P(A^c) = 1 - P(A)\), since \(P(A) + P(A^c) = 1\) |
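These rules can be checked numerically on a small sample space. A minimal sketch, using two fair coin flips with hypothetical events \(A\) (first flip is heads) and \(B\) (second flip is heads):

```python
from itertools import product

# Sample space of two fair coin flips: 4 equally likely outcomes.
S = list(product("HT", repeat=2))
A = [s for s in S if s[0] == "H"]    # first flip is heads
B = [s for s in S if s[1] == "H"]    # second flip is heads

def p(E):
    return len(E) / len(S)

A_and_B = [s for s in S if s in A and s in B]
A_or_B = [s for s in S if s in A or s in B]

# Independence: P(A and B) = P(A)P(B)
assert p(A_and_B) == p(A) * p(B)
# Union rule: P(A or B) = P(A) + P(B) - P(A and B)
assert p(A_or_B) == p(A) + p(B) - p(A_and_B)
print(p(A_and_B), p(A_or_B))  # 0.25 0.75
```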
Interpreting Probability
Frequentist probability refers to the interpretation of probability based on the long-run frequency of an event occurring in repeated trials or experiments.
Coin Flipping Example
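A simulation illustrates the frequentist interpretation: as the number of flips grows, the observed proportion of heads settles near the true probability. A minimal sketch (the seed and flip count are arbitrary choices for reproducibility, not values from the slides):

```python
import random

random.seed(1)  # arbitrary seed so the sketch is reproducible

# Flip a fair coin many times and compute the long-run proportion of heads.
n_flips = 100_000
heads = sum(random.random() < 0.5 for _ in range(n_flips))
print(heads / n_flips)  # long-run frequency, close to 0.5
```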
The guiding principle of statistics is statistical thinking.
Statistical Thinking in the Data Science Life Cycle
Probability is the Basis for Inference
The P-Value is a Probability
A hospital is conducting a study on two different diseases: Disease A and Disease B. Among a randomly selected group of patients, the probabilities are as follows:
A random variable (r.v.) is a numerical outcome of a random experiment. It assigns a number to each possible outcome in a sample space.
In other words, a random variable is a function that maps the sample space into real numbers.
Types:
\(\star\) Key Idea: A r.v. provides a way to assign numerical values to outcomes in a sample space, allowing us to analyze and compute probabilities in a structured manner.
A probability function assigns probabilities to outcomes in a sample space.
In other words, a probability function maps the values of the r.v. to real numbers between 0 and 1.
Types:
\(\star\) Key Idea: We can define a probability function directly from the sample space, but using a random variable makes it explicit what outcomes we want to compute probabilities for in a given scenario.
Suppose we conduct an experiment of flipping a fair coin once.
\(\star\) Key Idea: A random variable for a coin toss maps the sample space \(\{H,T\}\) to real values, assigning \(X(H)=1\) and \(X(T)=0\). The probability function \(P(X)\) then defines the probability space.
Suppose we conduct an experiment with one randomized outcome of a binary r.v.
A binary r.v. \(X\) is a variable that takes only two possible values, typically labelled as “success” and “failure”.
\(\star\) Key Idea: A r.v. with binary outcomes is called a Bernoulli r.v. with probability of “success” \(p\) and “failure” \(1-p\), where \(p\) is called the parameter. One trial with binary outcomes is called a Bernoulli trial.
A Bernoulli r.v. represents a single experiment with two possible outcomes: “success” (\(X=1\)) with probability \(p\) and “failure” (\(X=0\)) with probability \(1-p\). We typically define an r.v. using \(\sim\) along with its name and parameter: \[X \sim \text{Bern}(p)\]
The Bernoulli distribution is the probability mass function (PMF) of a Bernoulli r.v.:
\[ P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1-p & \text{if } x = 0 \end{cases} \]
where \(p\) is the parameter (the probability of “success”).
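The Bernoulli PMF translates directly into a small function; the value \(p = 0.3\) below is illustrative, not from the slides:

```python
def bernoulli_pmf(x, p):
    """Bernoulli PMF: P(X = 1) = p, P(X = 0) = 1 - p."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0  # any other value has probability zero

print(bernoulli_pmf(1, 0.3))  # 0.3
print(bernoulli_pmf(0, 0.3))  # 0.7
```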
A hospital is testing a new antibiotic for treating bacterial pneumonia. Researchers conduct a clinical trial where patients receive either the new antibiotic or a standard treatment, and their recovery outcomes are recorded.
Binary Outcome:
Data Collection & Analysis:
\(\star\) Key Idea: This example is simplified, but it serves as the building block of all statistical studies, which include observational and experimental studies.
Suppose we conduct an experiment of flipping two fair coins in a sequence.
\(\star\) Key Idea: The PMF \(P(X)\) satisfies the probability axioms, and the collection of all probabilities forms the probability distribution.
Suppose we conduct an experiment of flipping \(n\) fair coins in a sequence, where \(n\) is a positive integer. The sample space \(S\) contains all possible sequences of \(H\) and \(T\). The number of possible outcomes is \(|S| = 2^n\).
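The \(|S| = 2^n\) count can be verified by enumerating the sample space, here with the illustrative choice \(n = 4\):

```python
from itertools import product

# Enumerate all sequences of H/T for n coin flips; |S| should equal 2**n.
n = 4
S = list(product("HT", repeat=n))
print(len(S))  # 16 = 2**4
```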
Visualizing the possible outcomes using Pascal’s triangle
\(\star\) Key Idea: Pascal’s Triangle helps us visualize the total possible sequences of “success” (\(H\)) outcomes given \(n\) independent trials.
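The rows of Pascal's Triangle are binomial coefficients, so they can be generated directly; this sketch prints the first five rows (row \(n\) lists the number of sequences with \(k\) heads in \(n\) flips, for \(k = 0, \dots, n\)):

```python
import math

def pascal_row(n):
    """Row n of Pascal's Triangle: binomial coefficients C(n, k), k = 0..n."""
    return [math.comb(n, k) for k in range(n + 1)]

for n in range(5):
    print(pascal_row(n))
# the last row printed is [1, 4, 6, 4, 1]
```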
Let \(X\) be the r.v. that counts the number of \(H\) outcomes in \(n\) trials.
Pascal’s triangle helps us count the number of sequences with a given number of \(H\) outcomes.
\(\dagger\) Can you determine the ways \(H\) can occur in \(4\) trials using Pascal’s triangle?
Compute the probability of observing a certain number of “success” (\(H\)) outcomes in \(n\) trials.
\(\dagger\) Can you determine the probabilities of observing \(H\) outcomes in \(4\) trials?
How many \(H\) outcomes do we expect to have in \(n\) independent Bernoulli trials?
Example
In general:
\(\star\) Key Idea: \(n \times p\) is the expected number of successes in \(n\) trials with success probability \(p\). Over many repetitions, the long-run average number of successes is \(n \times p\), reflecting the frequentist interpretation of probability.
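The long-run average can be checked by simulation. A sketch with illustrative values \(n = 10\), \(p = 0.5\) (the seed and repetition count are arbitrary choices, not from the slides):

```python
import random

random.seed(7)  # arbitrary seed for a reproducible sketch

# Repeat n Bernoulli trials many times; the average number of successes
# should be close to the expected value n * p.
n, p, reps = 10, 0.5, 20_000
avg = sum(sum(random.random() < p for _ in range(n)) for _ in range(reps)) / reps
print(avg)  # close to n * p = 5.0
```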
A Binomial r.v. is a discrete random variable representing the number of “successes” in \(n\) independent Bernoulli trials, each with “success” probability \(p\): \[X \sim \text{Binom}(n,p)\]
The Binomial distribution is the probability mass function (PMF) of a Binomial r.v.:
\[P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,2,3, \cdots, n\] where \(p\) is the “success” probability. The term \(\binom{n}{k}\) is the binomial coefficient, i.e., the entries of Pascal’s triangle.
The expected value or the theoretical mean of “success” outcomes is \(n \times p\).
\(\star\) Key Idea: Computing Binomial probabilities by hand can be tedious, but you can use R to compute them efficiently and accurately. We will discuss using R for the Binomial PMF later.
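The slides defer R usage to later; as a language-agnostic sketch, the same PMF can be written with Python's standard library, and we can sanity-check that it sums to 1 over \(k = 0, \dots, n\) (illustrative \(n = 4\), \(p = 0.5\)):

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# The PMF sums to 1 over all possible values of k.
total = sum(binom_pmf(k, 4, 0.5) for k in range(5))
print(total)  # 1.0
```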
A hospital is preparing for the flu season by estimating the number of patients who will require hospitalization due to severe influenza. Over a one-month period, \(100\) flu-positive patients are tracked to determine whether they need hospital admission. Based on the data, the hospitalization rate is approximately \(1\) in every \(3\) patients.
Binomial Outcomes
Rate Interpretation
\(\star\) Key Idea: This is an example of a Binomial r.v. because we have a fixed number of independent patients and there are only two possible outcomes for each patient.
Suppose a hospital receives 10 flu-positive patients at a given time. Based on a known estimated hospitalization rate of \(\frac{1}{3}\), each patient is expected to have an independent probability of \(\frac{1}{3}\) of requiring hospitalization.
\(\star\) Key Idea: While the rate was originally estimated from a study involving 100 patients, we are applying it to a smaller sample of only 10 patients, meaning that the actual number of hospitalizations may vary due to random fluctuations.
Consider the same scenario, where a small hospital receives 10 flu-positive patients at a given time with an estimated hospitalization rate of \(\frac{1}{3}\).
What is the probability that the hospital gets exactly \(3\) patients hospitalized?
Computation
Interpretation
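The computation above can be sketched numerically, plugging \(n = 10\), \(p = \frac{1}{3}\), and \(k = 3\) into the Binomial PMF:

```python
import math

# Hospital example: P(X = 3) for X ~ Binom(10, 1/3).
n, p, k = 10, 1 / 3, 3
prob = math.comb(n, k) * p**k * (1 - p)**(n - k)
print(round(prob, 4))  # about 0.26
```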
\(\dagger\) How would you write the binomial probability that the hospital gets \(4\) patients hospitalized? What about \(3\) or \(4\) patients? Which rules of probability apply?
Pascal’s Triangle and Combinations: \[\binom{n}{k} = \frac{n!}{k!\,(n-k)!}\] This formula calculates the number of ways to choose \(k\) elements from a set of \(n\). Each number in Pascal’s Triangle corresponds to a combination, also known as the binomial coefficient.