MTH-361A | Spring 2025 | University of Portland
March 28, 2025
Types of Decision Errors
Reality/Decision | Reject \(H_0\) | Fail to reject \(H_0\) |
---|---|---|
\(H_0\) is true | Type I error with probability \(\alpha\) (significance level) | Correct decision with probability \(1-\alpha\) (confidence level) |
\(H_0\) is false | Correct decision with probability \(1-\beta\) (power of test) | Type II error with probability \(\beta\) |
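As a rough check on the first row of the table, the sketch below simulates many samples for which \(H_0\) is true and counts how often a one-sample t-test rejects it. The population values (mean 80, standard deviation 6.67), the sample size of 50, \(\alpha = 0.05\), and the seed are illustrative assumptions, not values taken from the notes.

```r
set.seed(361) # arbitrary seed so the sketch is reproducible

alpha <- 0.05 # assumed significance level
rejections <- replicate(5000, {
  dat <- rnorm(50, mean = 80, sd = 6.67) # H0 (mu = 80) is true by construction
  t.test(dat, mu = 80)$p.value < alpha   # TRUE when H0 is wrongly rejected
})
type1_rate <- mean(rejections) # estimated Type I error probability
type1_rate
```

The estimated rate should land near the chosen \(\alpha\), which is what the "Type I error with probability \(\alpha\)" cell describes.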
Trade-offs between Type I and Type II Errors
Images Source: Type I and Type II errors by Pritha Bhandari
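One way to see the trade-off numerically is to hold the design fixed and vary \(\alpha\). The sketch below assumes a one-sample t-test with \(n = 130\), a true difference of 2, and a standard deviation of 6.67; these values and the grid of \(\alpha\) levels are chosen only for illustration.

```r
# Assumed illustrative design: one-sample t-test, n = 130, true shift 2, sd 6.67
alphas <- c(0.001, 0.01, 0.05, 0.10)
betas <- sapply(alphas, function(a) {
  # Type II error probability is 1 minus the power at this alpha
  1 - power.t.test(n = 130, delta = 2, sd = 6.67,
                   sig.level = a, type = "one.sample")$power
})
data.frame(alpha = alphas, beta = round(betas, 3))
```

With everything else fixed, a smaller \(\alpha\) gives a larger \(\beta\), and vice versa.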
Suppose we wish to conduct an experiment to determine if the mean heart rate of healthy adults is 80 beats per minute.
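Hypotheses: \(H_0: \mu = 80\) versus \(H_A: \mu \neq 80\)
Assumptions (for demonstration purposes): the true mean is 78 beats per minute, the standard deviation is 6.67, the sample size is 130, and \(\alpha = 0.01\).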
We simulate one sample from a population whose true mean is 78, then test the sample mean against the null value of 80.
samples <- rnorm(130, 78, 6.67) # simulated values
t_stat <- (mean(samples) - 80) / (sd(samples) / sqrt(130))
t_stat # test statistic
## [1] -3.387262
## [1] -2.614479
##
## One Sample t-test
##
## data: samples
## t = -3.3873, df = 129, p-value = 0.0009363
## alternative hypothesis: true mean is not equal to 80
## 95 percent confidence interval:
## 76.49356 79.07942
## sample estimates:
## mean of x
## 77.78649
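\(\star\) Key Idea: The p-value is less than \(\alpha = 0.01\), so we reject \(H_0\). Because the true mean is 78, \(H_0\) is false and this is a correct decision. A Type II error occurs in the cases where the p-value is greater than \(0.01\) and we fail to reject \(H_0\).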
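The commands that produced the second printed value and the test summary above are not shown. The outputs are consistent with a critical value from the t distribution with 129 degrees of freedom and a call to `t.test`; the sketch below assumes that. The seed is arbitrary, so the simulated numbers will differ from those printed above.

```r
set.seed(2025) # arbitrary seed; the notes do not state one
samples <- rnorm(130, 78, 6.67) # simulated heart rates, true mean 78

crit <- qt(0.01 / 2, df = 130 - 1) # lower critical value for alpha = 0.01
crit

result <- t.test(samples, mu = 80) # two-sided one-sample t-test against 80
result$statistic # same quantity as the hand-computed t_stat
result$p.value   # compare with alpha = 0.01
```

Rejecting when the statistic falls below `crit` (or above `-crit`) is equivalent to rejecting when the p-value is below 0.01.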
Replicate the same simulation and one-sample t-test many times to estimate the Type II error probability.
simdata <- replicate(10000, { # repeat the simulation 10,000 times
  dat <- rnorm(130, 78, 6.67) # simulated sample with true mean 78
  t.test(dat, mu = 80)$p.value > .01 # TRUE when we fail to reject H0 (a Type II error)
})
mean(simdata) # estimated probability of a Type II error
## [1] 0.2142
1 - mean(simdata) # estimated power
## [1] 0.7858
\(\star\) Key Idea: The power can be interpreted as the probability of correctly rejecting the null hypothesis when the alternative hypothesis is true.
The power of a statistical test is the probability that the test correctly rejects the null hypothesis (\(H_0\)) when the alternative hypothesis (\(H_A\)) is true. In other words, it measures the test’s ability to detect an actual effect when one exists.
Mathematically, power is defined as: \[\text{Power} = 1 - \beta\] where \(\beta\) is the Type II error rate.
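How to Interpret Power:
Higher Power (e.g., 0.8 or 80%): the test detects a true effect in about 80% of repeated studies, so \(\beta\) is about 0.2.
Lower Power (e.g., 0.5 or 50%): the test misses a true effect about half the time, so a non-significant result says little about whether an effect exists.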
You can compute the power exactly, without simulation:
##
## One-sample t test power calculation
##
## n = 130
## delta = 2
## sd = 6.67
## sig.level = 0.01
## power = 0.7878322
## alternative = two.sided
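The output above has the form printed by `power.t.test` from base R's stats package. The call used in the notes is not shown; a call along these lines, with arguments read off the printed values, should reproduce it, so treat it as a reconstruction.

```r
# Arguments taken from the printed output; type = "one.sample" matches its title
pw <- power.t.test(n = 130, delta = 2, sd = 6.67, sig.level = 0.01,
                   type = "one.sample", alternative = "two.sided")
pw$power     # analytic power, close to the simulated 0.7858
1 - pw$power # corresponding Type II error probability
```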
How large a sample is needed to detect a clinically significant difference in body temperature from 98.6 degrees with a power of 0.8?
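Hypotheses: \(H_0: \mu = 98.6\) versus \(H_A: \mu \neq 98.6\)
Power Calculation (assuming a clinically significant difference of 0.2 degrees, a standard deviation of 0.3 degrees, and \(\alpha = 0.01\))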
##
## One-sample t test power calculation
##
## n = 29.64538
## delta = 0.2
## sd = 0.3
## sig.level = 0.01
## power = 0.8
## alternative = two.sided
\(\star\) Key Idea: The goal of a power calculation is to determine the sample size needed to achieve a chosen power. Here \(n = 29.65\) is rounded up, so at least 30 subjects are needed.
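Leaving `n` unspecified and supplying `power` makes `power.t.test` solve for the sample size. The sketch below is a reconstruction from the printed values in the body-temperature output, not necessarily the exact call used in the notes.

```r
# Solve for n given the target power; other arguments come from the printed output
ss <- power.t.test(power = 0.8, delta = 0.2, sd = 0.3, sig.level = 0.01,
                   type = "one.sample", alternative = "two.sided")
ss$n          # fractional sample size
ceiling(ss$n) # round up to a whole number of subjects
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target.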
Effect size measures the standardized difference between the sample mean and the null value, given by Cohen's \(d\): \[d = \frac{\bar{x} - \mu_0}{s}.\]
Because \(d\) is expressed in standard-deviation units, differences measured on different scales can be compared directly.
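A minimal sketch of the formula in R follows. The helper name `cohens_d` is hypothetical (it is not from the notes or from base R), and the simulated data reuse the heart-rate assumptions with an arbitrary seed.

```r
# Hypothetical helper (not from the notes): one-sample Cohen's d
cohens_d <- function(x, mu0) (mean(x) - mu0) / sd(x)

set.seed(1) # arbitrary seed
x <- rnorm(130, 78, 6.67) # assumed population: mean 78, sd 6.67
cohens_d(x, mu0 = 80)     # standardized distance of the sample mean from 80
```

Under these assumptions the population-level value is \((78 - 80)/6.67 \approx -0.3\), so the sample estimate should fall in that neighborhood.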
Example Calculation:
\[\text{Group A} \longrightarrow 0.2 = \frac{\bar{x}_A - \mu_A}{1.5} \longrightarrow \bar{x}_A - \mu_A = 0.3\]
##
## One-sample t test power calculation
##
## n = 30
## delta = 0.3
## sd = 1.5
## sig.level = 0.05
## power = 0.1839206
## alternative = two.sided
\[\text{Group B} \longrightarrow 0.2 = \frac{\bar{x}_B - \mu_B}{0.5} \longrightarrow \bar{x}_B - \mu_B = 0.1\]
##
## One-sample t test power calculation
##
## n = 30
## delta = 0.1
## sd = 0.5
## sig.level = 0.05
## power = 0.1839206
## alternative = two.sided
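\(\star\) Key Idea: Both scenarios have the same effect size (\(d = 0.2\)) and sample size (\(n = 30\)), and therefore the same power (about 0.18), even though the raw differences and standard deviations differ.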
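The two printed power calculations can be compared side by side. The calls below are reconstructed from the printed values for Groups A and B and may not match the notes' code exactly.

```r
# Group A: raw difference 0.3 with sd 1.5; Group B: raw difference 0.1 with sd 0.5
power_a <- power.t.test(n = 30, delta = 0.3, sd = 1.5, sig.level = 0.05,
                        type = "one.sample")$power
power_b <- power.t.test(n = 30, delta = 0.1, sd = 0.5, sig.level = 0.05,
                        type = "one.sample")$power

c(d_a = 0.3 / 1.5, d_b = 0.1 / 0.5) # both effect sizes equal 0.2
all.equal(power_a, power_b)         # powers agree up to numerical tolerance
```

For a fixed \(n\) and \(\alpha\), the power of the t-test depends on `delta` and `sd` only through their ratio, which is why the two results match.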