MTH-361A | Spring 2026 | University of Portland
Geometric R.V. (Revisited)
\(\star\) In a time-based interpretation, if events occur in discrete time steps, the geometric r.v. represents the number of time steps required until an event of interest happens.
A professor allows students to take a short assessment quiz, and if they do not pass, they can revise their answers and retake the quiz in the next session. The probability that a student passes on any given attempt is \(p=0.40\), and attempts continue until the student passes.
Let \(X\) be the number of “fail” attempts before the student gets a “pass”.
Information Given:
\(\star\) On average, the student would accumulate \(1.5\) “fail” attempts before they get a “pass”. The probability of a “pass” within at most \(1\) “fail” attempt is \(P(X \le 1) = 0.64\).
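These values follow from the geometric mean and CDF with \(p = 0.40\) (a quick check, using the convention that \(X\) counts “fail” attempts before the first “pass”):

\[
\begin{aligned}
E[X] & = \frac{1-p}{p} = \frac{0.60}{0.40} = 1.5 \\
P(X \le 1) & = 1 - (1-p)^{2} = 1 - (0.60)^{2} = 0.64
\end{aligned}
\]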
Computing Probabilities:
Using R:
## [1] 0.64
\(\star\) Note that the R function pgeom is the Geometric CDF.
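The output above can be reproduced with a call like the following (a sketch; the original code chunk is not echoed in the notes):

```r
# P(X <= 1): probability of a "pass" with at most 1 "fail" attempt,
# where X ~ Geom(p = 0.40) counts failures before the first success
pgeom(1, prob = 0.40)
## [1] 0.64
```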
Exponential R.V.
\(\star\) In a time-based interpretation, if events occur in continuous time, the exponential r.v. represents the length of time required until an event of interest happens.
The exponential r.v. is a continuous r.v. that models the time until an event occurs, given that events happen at a constant rate \(\lambda\) over time: \[X \sim \text{Exp}(\lambda)\]
Sample Space:
Rate Parameter
The exponential r.v. \(X \sim \text{Exp}(\lambda)\) has infinite possible outcomes (or infinite sized sample space) where \(\lambda > 0\) is the rate of “success” with PDF given as \[f(x) = \lambda e^{-\lambda x}, \quad x \ge 0\]
\(\star\) The exponential r.v. models the unit length until an event happens.
A class of students is taking a quiz, and the time it takes for students to finish the quiz follows an exponential r.v., assuming unlimited quiz time allocation. On average, a student takes 15 minutes to complete the quiz.
Let \(X\) represent the time to finish the quiz.
Information Given:
\(\star\) The average completion time is \(15\) minutes; since the mean of an exponential r.v. is \(1/\lambda\), the rate is \(\lambda = 1/15\) per minute.
Computing Probabilities:
Using R:
## [1] 0.6321206
\(\star\) Note that the R function pexp is the Exponential CDF.
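The output above is consistent with \(\lambda = 1/15\) and a call like the following (a sketch; the original code chunk is not echoed in the notes):

```r
# P(X <= 15): probability a student finishes within 15 minutes,
# with rate lambda = 1/15 per minute (mean = 1/lambda = 15 minutes)
pexp(15, rate = 1/15)
## [1] 0.6321206
```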
A Probability Density Function (PDF) \(f(x)\) describes the likelihood of a continuous r.v. taking a specific value from a continuous interval.
The probability of a single point is zero for continuous distributions: \[P(X = x) = 0, \ \text{for any } x\] because continuous distributions are defined over an infinite number of possible values, and the probability at a single point is infinitesimally small.
Instead, we calculate probabilities over intervals using integration: \[P(a \le X \le b) = \int_a^b f(x) \, dx\]
\(\star\) The probability that \(X\) falls between \(a\) and \(b\) is the area under the PDF curve over that interval.
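As an illustration (a sketch reusing the exponential quiz example above), numerically integrating the PDF over an interval agrees with the CDF:

```r
# P(0 <= X <= 15) for X ~ Exp(rate = 1/15), computed two ways:
# 1) integrate the PDF over the interval
integrate(dexp, lower = 0, upper = 15, rate = 1/15)$value
# 2) use the CDF directly
pexp(15, rate = 1/15)
# both give approximately 0.6321206
```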
Suppose we conduct an experiment of flipping \(n\) fair coins. The sample space \(S\) contains all possible outcomes, where the number of outcomes is \(2^n\).
Visualizing the possible outcomes using Pascal’s triangle
\(\star\) Pascal’s Triangle helps us visualize the total possible sequences of “success” (\(H\)) outcomes given \(n\) independent trials.
Let \(X\) be the r.v. that counts the number of \(H\) outcomes in \(n\) trials.
Pascal’s triangle helps us count:
\(\star\) The binomial coefficient tells you how many ways \(k\) “success” outcomes can occur in \(n\) trials, corresponding to the \(k\)-th column in the \(n\)-th row of Pascal’s Triangle.
Pascal’s Triangle and Combinations
The binomial coefficient \[\binom{n}{k} = \frac{n!}{k!(n-k)!}\] calculates the number of ways to choose \(k\) elements from a set of \(n\). Each number in Pascal’s Triangle corresponds to a combination.
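In R, binomial coefficients are computed with `choose()`; for instance, row \(n = 4\) of Pascal’s Triangle (a quick illustration, not from the notes):

```r
# Row n = 4 of Pascal's Triangle: C(4, k) for k = 0, ..., 4
choose(4, 0:4)
## [1] 1 4 6 4 1
```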
Suppose we want to compute the probability of observing a certain number of “success” (\(H\)) outcomes in \(n\) trials. Note that for a fair coin the probability of “success” is \(\frac{1}{2}\).
Pascal’s triangle helps us compute these probabilities:
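For a fair coin, each probability is the Pascal’s Triangle count divided by the \(2^n\) equally likely outcomes; this matches R’s `dbinom` (a sketch with \(n = 4\), illustrative only):

```r
# P(X = k) for X ~ Binom(n = 4, p = 0.5), k = 0, ..., 4
n <- 4
k <- 0:4
choose(n, k) / 2^n                # Pascal's Triangle counts over 2^n outcomes
dbinom(k, size = n, prob = 0.5)   # same values from the Binomial PMF
# both print [1] 0.0625 0.2500 0.3750 0.2500 0.0625
```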
To best illustrate this idea, here is a video.
\(\star\) The video explains how random events can produce a predictable pattern, specifically, how many random outcomes together form a normal “bell-curve” distribution.
\(\star\) The Binomial distribution is approximately the normal distribution given large enough samples because of the Central Limit Theorem.
A normal r.v. is a type of continuous r.v. whose probability distribution follows the normal distribution, also known as the Gaussian distribution. The normal distribution is characterized by two parameters, \(\mu\) as the mean and \(\sigma^2\) as the variance: \[X \sim \text{N}(\mu,\sigma^2)\]
Sample Space:
Parameters
The normal r.v. \(X \sim \text{N}(\mu,\sigma^2)\) has infinite possible outcomes (or infinite sized sample space) where \(\mu\) is the mean and \(\sigma^2\) is the variance with PDF given as \[f(x) = \frac{1}{\sigma \sqrt{2 \pi}} \exp{\left(-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2\right)}, \ -\infty < x < \infty\] where the term \(\exp{(\cdot)}\) is the exponential function \(e^{(\cdot)}\). We write it this way because the exponent term is complicated.
\(\star\) The normal r.v. often approximates the distribution of many types of data, especially when there are large numbers of independent factors contributing to the outcome.
A standardized score is a measure of how many standard deviations a value is from the mean. It is computed using the z-score formula: \[z = \frac{x - \mu}{\sigma}.\]
Using the z-score transformation, the normal PDF reduces to the standard normal PDF, \[f(z) = \frac{1}{\sqrt{2 \pi}} \exp{\left(-\frac{z^2}{2}\right)}, \ -\infty < z < \infty.\]
\(\star\) The standardized score is used to compare two normal distributions with different means and variances.
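A quick check of the z-score idea in R (a sketch with illustrative numbers, not from the notes): standardizing and using the standard normal CDF gives the same probability as working on the original scale.

```r
# P(X <= 75) for X ~ N(mu = 70, sigma = 4.583), two equivalent ways
mu <- 70
sigma <- 4.583
pnorm(75, mean = mu, sd = sigma)   # original scale
pnorm((75 - mu) / sigma)           # standardized z-score, N(0, 1)
# both print the same probability
```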
When can we use it?
The binomial distribution \(X \sim \text{Binom}(n,p)\) can be approximated by a normal distribution when both expected counts are large enough, \(np \ge 10\) and \(n(1-p) \ge 10\):
Approximation Formula
\(\star\) The normal approximation simplifies binomial probability calculations for large \(n\).
A large introductory statistics course has a final exam consisting of many independent multiple-choice questions. Each question has a probability of “success” \(p = 0.70\) (a typical student answers correctly with probability \(0.7\)). The exam has \(n = 100\) questions.
Let \(X\) represent the number of correct answers on the exam.
Information Given:
The number of correct answers follows a Binomial distribution: \[X \sim \text{Binom}(n,p)\] where \(n = 100\) and \(p = 0.70\).
Mean and standard deviation:
\[ \begin{aligned} \mu & = np \\ & = 100(0.70) = 70 \\ \sigma & = \sqrt{np(1-p)} \\ & = \sqrt{100(0.70)(1-0.70)} \approx 4.583 \end{aligned} \]
Normal Approximation:
Since both \[ \begin{aligned} np & = 100(0.70) = 70 \\ n(1-p) & = 100(1-0.70) = 30 \end{aligned} \] are greater than 10, the Binomial distribution can be approximated by a Normal distribution: \[X \approx \text{N}(np,np(1-p)).\]
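To see how close the approximation is (a sketch, not from the notes), compare the exact binomial CDF with its normal approximation at one point:

```r
# P(X <= 75): exact Binomial vs. Normal approximation
n <- 100
p <- 0.70
pbinom(75, size = n, prob = p)                       # exact Binomial CDF
pnorm(75, mean = n * p, sd = sqrt(n * p * (1 - p)))  # Normal approximation
# the two values are close; a continuity correction would bring them closer
```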
The \(90\)th percentile of the standard normal distribution is \(\displaystyle z_{0.90} \approx 1.282\). That is, a value at the \(90\)th percentile lies \(1.282\) standard deviations above the mean.
Computing Percentiles:
\[ \begin{aligned} x & = \mu + z \sigma \\ x & = 70 + (1.282)(4.583) \\ x & \approx 75.88 \end{aligned} \]
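The same percentile can be obtained directly in R with `qnorm` (a sketch consistent with the numbers above):

```r
# 90th percentile of N(mu = 70, sigma^2 = 21): the exam score such that
# approximately 90% of students score at or below it
qnorm(0.90, mean = 70, sd = sqrt(100 * 0.70 * 0.30))
# approximately 75.87
```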