Previously… (1/3)

The normal r.v.

A normal r.v. is a type of continuous r.v. whose probability distribution follows the normal distribution, also known as the Gaussian distribution. The normal distribution is characterized by two parameters, \(\mu\) as the mean and \(\sigma^2\) as the variance: \[X \sim \text{N}(\mu,\sigma^2)\]

Sample Space:

\(x \in (-\infty,\infty)\) because the normal r.v. can take any value from the entire real number line and it is a continuous random variable.

Parameters

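\(\mu\) is the mean (center) of the distribution; because the normal distribution is symmetric and unimodal, \(\mu\) is also its median and mode.
\(\sigma^2\) is the variance, which measures the spread of the distribution.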

Previously… (2/3)

The Normal PDF

The normal r.v. \(X \sim \text{N}(\mu,\sigma^2)\), with mean \(\mu\) and variance \(\sigma^2\), has infinitely many possible outcomes (its sample space is the entire real line). Its PDF is given by \[f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \ -\infty < x < \infty\]
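As a quick check of the formula, the sketch below writes the PDF out by hand and compares it with R's built-in `dnorm()`. The helper name `normal_pdf` and the values \(\mu = 10\), \(\sigma = 2.24\) are illustrative assumptions, not part of the definition.

```r
# Normal PDF written directly from the formula.
# normal_pdf is a hypothetical helper name used only for this illustration.
normal_pdf <- function(x, mu, sigma) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- c(7, 10, 14)                              # a few points to evaluate
by_hand  <- normal_pdf(x, mu = 10, sigma = 2.24)
built_in <- dnorm(x, mean = 10, sd = 2.24)     # R's built-in normal density

by_hand
built_in
all.equal(by_hand, built_in)                   # TRUE when the two agree
```

Note that `dnorm()` takes the standard deviation, not the variance, as its third argument.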

Previously… (3/3)

Normal Expected Value

Normal Distribution with Expected Value

Normal R.V.

Let \(\mu=10\) and \(\sigma=2.24\) be the mean and standard deviation, respectively.

\[ \begin{aligned} \text{R.V. } & \longrightarrow X \sim \text{N}\left(10,2.24^2\right) \\ \text{PDF } & \longrightarrow f(x) = \frac{1}{\sqrt{2 \pi (2.24)^2}} e^{-\frac{(x-10)^2}{2(2.24)^2}} \\ \text{for } & x \in (-\infty,\infty) \\ \text{expected value} & \longrightarrow \text{E}(X) = 10 \end{aligned} \]

In general, the expected value of the normal r.v. is given by \[\text{E}(X) = \mu,\] which is the center of the normal distribution.
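For a continuous r.v., the expected value is \(\text{E}(X) = \int_{-\infty}^{\infty} x f(x) \ dx\). The sketch below approximates that integral numerically for the example above, assuming \(\mu = 10\) and \(\sigma = 2.24\); the variable names are illustrative.

```r
mu    <- 10     # mean of the example distribution
sigma <- 2.24   # standard deviation of the example distribution

# Integrand x * f(x), where f is the normal PDF
integrand <- function(x) x * dnorm(x, mean = mu, sd = sigma)

# Numerical approximation of E(X) over the whole real line
ev <- integrate(integrand, lower = -Inf, upper = Inf)
ev$value        # should be very close to mu
```

The result is a numerical approximation, so it may differ from 10 by a small integration error.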

The Central Limit Theorem (CLT)

Key idea of the Central Limit Theorem (CLT). Image source: Medium–AI/Data Science Digest

Watch the 1st 20 minutes of this YouTube video: https://www.youtube.com/watch?v=zeJD6dqJ5lo&t=564s

The Normal R.V.: Interval Probabilities (1/4)

Normal Distribution

Example:

What is \(P(7 \le X \le 14)\) for \(X \sim \text{N}(10,2.24^2)\)? \[ \begin{aligned} P(7 \le X \le 14) & = \int_7^{14} f(x) \ dx \\ & = \int_7^{14} \frac{1}{\sqrt{2 \pi (2.24)^2}} e^{-\frac{(x-10)^2}{2(2.24)^2}} \ dx \\ & = P(X \le 14) - P(X \le 7) \\ P(7 \le X \le 14) & \approx 0.8727 \end{aligned} \]

Using R:

mu <- 10     # mean
sd <- 2.24   # standard deviation
pnorm(14, mu, sd) - pnorm(7, mu, sd)   # P(X <= 14) - P(X <= 7)

## [1] 0.8726884

\(\star\) Note that the normal PDF is symmetric about the mean \(\mu = 10\), but the interval \([7, 14]\) is not: it extends 3 units below the mean and 4 units above it.
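The integral and the CDF difference in this example should give the same number. A minimal sketch, using `integrate()` as a numerical stand-in for the integral and splitting the interval at the mean to show that the two pieces are unequal (variable names are illustrative):

```r
mu    <- 10
sigma <- 2.24

# Area under the PDF between 7 and 14, by numerical integration
area <- integrate(dnorm, lower = 7, upper = 14, mean = mu, sd = sigma)$value

# Same probability from the CDF
cdf_diff <- pnorm(14, mu, sigma) - pnorm(7, mu, sigma)

area
cdf_diff

# Split at the mean: the piece below the mean and the piece above it differ
below <- pnorm(mu, mu, sigma) - pnorm(7, mu, sigma)    # P(7 <= X <= 10)
above <- pnorm(14, mu, sigma) - pnorm(mu, mu, sigma)   # P(10 <= X <= 14)
c(below = below, above = above)
```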

The Normal R.V.: Interval Probabilities (2/4)

Normal Distribution

Example:

What is \(P(7.7639 \le X \le 12.2361)\) for \(X \sim \text{N}(10,2.24^2)\)? \[ \begin{aligned} P(7.7639 \le X \le 12.2361) & = P(X \le 12.2361) - P(X \le 7.7639) \\ P(7.7639 \le X \le 12.2361) & \approx 0.6818 \end{aligned} \]

Using R:

mu <- 10     # mean
sd <- 2.24   # standard deviation
pnorm(12.2361, mu, sd) - pnorm(7.7639, mu, sd)   # P(X <= 12.2361) - P(X <= 7.7639)

## [1] 0.6818462

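\(\star\) Note that the endpoints are approximately one standard deviation from the mean: \(P(7.7639 \le X \le 12.2361) \approx P(10-2.24 \le X \le 10+2.24)\), and the probability is approximately \(0.6818\).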

The Normal R.V.: Interval Probabilities (3/4)

Normal Distribution

Example:

What is \(P(141.3397 \le X \le 158.6603)\) for \(X \sim \text{N}(150,8.66^2)\)? \[ \begin{aligned} P(141.3397 \le X \le 158.6603) & = P(X \le 158.6603) - P(X \le 141.3397) \\ P(141.3397 \le X \le 158.6603) & \approx 0.6827 \end{aligned} \]

Using R:

mu <- 150    # mean
sd <- 8.66   # standard deviation
pnorm(158.6603, mu, sd) - pnorm(141.3397, mu, sd)   # P(X <= 158.6603) - P(X <= 141.3397)

## [1] 0.6827063

\(\star\) Note that \(P(141.3397 \le X \le 158.6603) = P(150-8.66 \le X \le 150+8.66) \approx 0.6827\).
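The one-standard-deviation examples give slightly different answers (0.6818 for the \(\mu = 10\) case and 0.6827 here). One plausible explanation, which is an assumption and not stated in the slides, is that the endpoints were computed from the unrounded standard deviations \(\sqrt{5} \approx 2.2361\) and \(\sqrt{75} \approx 8.6603\), while the R code uses the rounded values 2.24 and 8.66. The sketch below uses a hypothetical helper `one_sd_prob` to show that, without rounding, the probability is the same for any \(\mu\) and \(\sigma\).

```r
# Probability of falling within one standard deviation of the mean.
# one_sd_prob is a hypothetical helper name used only for this illustration.
one_sd_prob <- function(mu, sigma) {
  pnorm(mu + sigma, mu, sigma) - pnorm(mu - sigma, mu, sigma)
}

p_small <- one_sd_prob(10, sqrt(5))     # assumed unrounded sd for the mu = 10 example
p_large <- one_sd_prob(150, sqrt(75))   # assumed unrounded sd for the mu = 150 example
p_std   <- one_sd_prob(0, 1)            # standard normal for comparison

c(p_small, p_large, p_std)              # all three should match
```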

The Normal R.V.: Interval Probabilities (4/4)

Normal Distribution

Example:

What is \(P(5.5279 \le X \le 14.4721)\) for \(X \sim \text{N}(10,2.24^2)\)? \[ \begin{aligned} P(5.5279 \le X \le 14.4721) & = P(X \le 14.4721) - P(X \le 5.5279) \\ P(5.5279 \le X \le 14.4721) & \approx 0.9541 \end{aligned} \]

Using R:

mu <- 10     # mean
sd <- 2.24   # standard deviation
pnorm(14.4721, mu, sd) - pnorm(5.5279, mu, sd)   # P(X <= 14.4721) - P(X <= 5.5279)

## [1] 0.9541176

\(\star\) Note that \(P(5.5279 \le X \le 14.4721) = P(10-2 \times 2.24 \le X \le 10+2 \times 2.24) \approx 0.9541\).

\(\dagger\) For \(X \sim \text{N}(150,8.66^2)\), is \(P(132.6795 \le X \le 167.3205) = P(150-2 \times 8.66 \le X \le 150+2 \times 8.66) \approx 0.95\)?

The 68-95-99.7 Rule (1/3)

Within 1 standard deviation of the mean

\[P(\mu - \sigma \le X \le \mu + \sigma) \approx 0.68\]

The 68-95-99.7 Rule (2/3)

Within 2 standard deviations of the mean

\[P(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 0.95\]

The 68-95-99.7 Rule (3/3)

Within 3 standard deviations of the mean

\[P(\mu - 3\sigma \le X \le \mu + 3\sigma) \approx 0.997\]
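The three probabilities in the rule can be checked with `pnorm()`. Because the result does not depend on \(\mu\) or \(\sigma\), this sketch uses the standard normal defaults (`mean = 0`, `sd = 1`); the variable names are illustrative.

```r
k <- 1:3                          # number of standard deviations from the mean
probs <- pnorm(k) - pnorm(-k)     # P(-k <= Z <= k) for the standard normal
names(probs) <- paste0(k, " sd")
round(probs, 4)                   # 0.6827 0.9545 0.9973
```

The rule's 68, 95, and 99.7 are rounded versions of these values.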

Total Area Under the Curve

The Normal PDF satisfies the probability axioms

\[P(-\infty < X < \infty) = \int_{-\infty}^{\infty} f(x) \ dx = 1\]

\(\star\) Key Idea: Because the probability of the entire sample space must equal 1, the total area under the normal PDF is exactly 1 for any \(\mu\) and \(\sigma^2\).
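A minimal numerical check of the total area, using `integrate()` on the standard normal density and the CDF for an example with \(\mu = 10\) and \(\sigma = 2.24\) (illustrative values):

```r
# Total area under the standard normal PDF, by numerical integration
area_std <- integrate(dnorm, lower = -Inf, upper = Inf)
area_std$value                    # approximately 1, up to integration error

# Same idea via the CDF for N(10, 2.24^2): P(X <= Inf) - P(X <= -Inf)
area_cdf <- pnorm(Inf, mean = 10, sd = 2.24) - pnorm(-Inf, mean = 10, sd = 2.24)
area_cdf                          # exactly 1
```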

Standard Normal Distribution (1/2)

The standard normal distribution is the normal distribution with \(\mu=0\) and \(\sigma=1\), written \(Z \sim \text{N}(0,1)\).

The transformation formula (the z-score)

A z-score is a standardized score that measures how many standard deviations a value is from the mean. \[Z = \frac{X - \mu}{\sigma}\]

The Standard Normal PDF

Using the z-score transformation, the normal PDF reduces to \[f(z) = \frac{1}{\sqrt{2 \pi}} e^{-\frac{z^2}{2}}, \ -\infty < z < \infty\]
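Standardizing the endpoints of an interval leaves its probability unchanged. The sketch below converts the endpoints 7 and 14 of an \(\text{N}(10, 2.24^2)\) example to z-scores and compares the two calculations; the variable names are illustrative.

```r
mu    <- 10
sigma <- 2.24

# z-scores of the interval endpoints
z_lo <- (7  - mu) / sigma
z_hi <- (14 - mu) / sigma
c(z_lo, z_hi)

# Probability on the standard normal scale
p_z <- pnorm(z_hi) - pnorm(z_lo)

# Probability on the original scale
p_x <- pnorm(14, mu, sigma) - pnorm(7, mu, sigma)

all.equal(p_z, p_x)   # TRUE: the two probabilities agree
```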

Standard Normal Distribution (2/2)

The standard normal distribution, \(Z \sim \text{N}(0,1)\).

\(\star\) Key Idea: The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. It serves as a reference distribution, allowing any normally distributed variable to be standardized.

CLT Conditions

Independence – Sample values must be independent
Identical Distribution – Variables should be from the same distribution
Finite Variance – The population must have a finite variance
Large Sample Size – A larger sample size improves approximation
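A small simulation can illustrate these conditions. The sketch below draws independent samples from an exponential distribution (a skewed population with finite variance, chosen here as an assumption for illustration) and looks at the distribution of the sample means. The sample size `n`, the number of repetitions `reps`, and the seed are arbitrary choices.

```r
set.seed(1)      # arbitrary seed so the run is reproducible

n    <- 50       # sample size
reps <- 10000    # number of independent samples

# Each replicate: draw n i.i.d. Exponential(rate = 1) values and take the mean.
# The population has mean 1 and standard deviation 1.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

mean(sample_means)   # should be close to the population mean, 1
sd(sample_means)     # should be close to 1 / sqrt(n)

# If the sample means are roughly normal, about 68% of them should fall
# within one theoretical standard error of the population mean.
within_1se <- mean(abs(sample_means - 1) <= 1 / sqrt(n))
within_1se
```

Increasing `n` should make the agreement with the normal approximation closer, while a very small `n` leaves visible skew from the exponential population.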

Activity: Understanding the CLT

Make sure you have a copy of the M 3/10 Worksheet. This will be handed out physically and it is also digitally available on Moodle.
Work on your worksheet by yourself for 10 minutes. Please read the instructions carefully. Ask questions if anything needs clarification.
Get together with another student.
Discuss your results.
Submit your worksheet on Moodle as a .pdf file.


Central Limit Theorem

Applied Statistics
