Inference for One Proportion

Applied Statistics

MTH-361A | Spring 2026 | University of Portland

Objectives

High Blood Pressure Drug Test

Two scientists want to know if a certain drug is effective against high blood pressure.

Survey Question:

Which is the better way to test the drug?

\(\star\) The correct answer is the “500 get the drug, 500 don’t” choice.

Results:

| Answer | Count |
|---|---|
| All 1000 get the drug | 99 |
| 500 get the drug, 500 don’t | 571 |
| Total | 670 |

Parameter and Point Estimate

We would like to estimate the proportion of all Americans who have good intuition about experimental design.

What are the parameter of interest and the point estimate?

Parameter of interest:

Point estimate:

Inference for One Proportion

What percent of all Americans have good intuition about experimental design, i.e. would answer “500 get the drug, 500 don’t”?

Confidence Interval:

Sampling distribution:

CLT for One Proportion

We can use the normal approximation to the binomial distribution to simplify the sampling distribution of the sample proportion.

CLT Conditions:

\(\star\) Requiring at least \(10\) “successes” and \(10\) “failures” is a rule of thumb; a larger sample makes the normal approximation more accurate.
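
The success-failure part of the conditions can be checked directly from the survey counts. The following is a minimal sketch that uses the observed counts in place of the unknown \(np\) and \(n(1-p)\); the variable names are illustrative, and independence of the observations still has to be judged from the study design, not from code.

```r
n <- 670                      # sample size from the survey
successes <- 571              # answered "500 get the drug, 500 don't"
failures <- n - successes     # all other respondents

# Rule of thumb: at least 10 observed successes and 10 observed failures
conditions_met <- (successes >= 10) && (failures >= 10)
conditions_met
```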

Normal approximation:

Standard error:
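
As a sketch of what the two labels above refer to (standard results, written in the notation used in the rest of these slides): when the CLT conditions hold, the sample proportion is approximately normal and centered at the true proportion \(p\).

\[ \hat{p} \;\overset{\text{approx.}}{\sim}\; N\left(\text{mean} = p,\ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\right) \]

Because \(p\) is unknown, the standard error used for a confidence interval is estimated by plugging in \(\hat{p}\):

\[ SE_{\hat{p}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]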

Inferring the True Proportion

In the GSS, \(571\) out of \(670\) (\(85.2\)%) Americans surveyed answered the question on experimental design correctly.

Information given:

Confidence interval:

Using R:

z_star <- qnorm(0.95+((1-0.95)/2),0,1) # critical value
n <- 670 # sample size
p_hat <- 571/n # sample proportion (point estimate)
SE_p <- sqrt((p_hat*(1-p_hat))/n) # standard error
cl_lb <- p_hat - z_star*SE_p # lower bound
cl_ub <- p_hat + z_star*SE_p # upper bound
c(cl_lb,cl_ub) # interval as a vector (lower, upper)
## [1] 0.8253686 0.8791090

Interpretation of the Confidence Interval

The point estimate is \(\hat{p} = \frac{571}{670} \approx 0.852\) with standard error \(SE_{\hat{p}} \approx 0.014\). For a \(0.95\) confidence level, \(z^* \approx 1.960\).

Sampling distribution of the point estimate:

\(\star\) Note that we don’t actually know \(p\); we only infer from our sample proportion \(\hat{p}\) what it could be, with some level of uncertainty.

Confidence interval:

Interpretation:
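
One common way to read a \(95\)% confidence level is that, over many repeated samples, about \(95\)% of the intervals built this way would contain the true proportion. The simulation below is a rough sketch of that idea. The true proportion `p_true = 0.85`, the seed, and the number of repetitions are assumptions chosen only for illustration; they are not part of the survey data.

```r
set.seed(361)             # arbitrary seed so the sketch is reproducible
p_true <- 0.85            # hypothetical true proportion (assumed)
n <- 670                  # same sample size as the survey
reps <- 10000             # number of simulated samples (assumed)
z_star <- qnorm(0.975)    # critical value for 95% confidence

x <- rbinom(reps, size = n, prob = p_true)   # simulated counts of correct answers
p_hat <- x / n                               # one sample proportion per simulated sample
SE_p <- sqrt(p_hat * (1 - p_hat) / n)        # estimated standard errors

# Does each simulated interval contain the assumed true proportion?
covered <- (p_hat - z_star * SE_p <= p_true) & (p_true <= p_hat + z_star * SE_p)
coverage <- mean(covered)                    # fraction of intervals that contain p_true
coverage
```

The resulting fraction should land close to \(0.95\), though not exactly on it, because the normal approximation is itself approximate.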

Reducing Uncertainty

Suppose we want to know how large a sample we need in order to reduce the margin of error (ME) to a target value.

Margin of error:

How many people should we sample in order to cut the margin of error of a \(95\)% confidence interval down to \(0.01\)?

Computing the number of samples:

\[ \begin{aligned} 1.96 \cdot \sqrt{\frac{0.852(1-0.852)}{n}} & \le 0.01 \\ 1.96^2 \times \frac{0.852(1-0.852)}{n} & \le 0.01^2 \end{aligned} \]

\[ \begin{aligned} n & \ge \left(\frac{1.96}{0.01}\right)^2 \left(0.852(1-0.852)\right) \\ n & \ge 4844.104 \end{aligned} \]

\(\star\) The sample size should be \(n \ge 4845\) to have a margin of error of at most \(0.01\) for a \(95\)% confidence interval.
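
The same calculation can be sketched in R. This assumes the rounded values \(z^* = 1.96\) and \(\hat{p} = 0.852\) used in the derivation above; the variable names are illustrative.

```r
z_star <- 1.96          # rounded critical value for 95% confidence
p_hat <- 0.852          # rounded point estimate from the survey
ME_target <- 0.01       # desired margin of error

# Solve z_star * sqrt(p_hat * (1 - p_hat) / n) <= ME_target for n
n_min <- (z_star / ME_target)^2 * p_hat * (1 - p_hat)
n_required <- ceiling(n_min)   # round up to a whole number of people
n_required
## [1] 4845
```

Using the unrounded `qnorm(0.975)` instead of \(1.96\) gives a slightly smaller value of `n_min`, so the answer can shift by one depending on rounding.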

Summary of Parameter Estimation for One Proportion

CLT conditions:

Sampling distribution of the point estimate:

Confidence interval:

\[\hat{p} \pm z^* \cdot SE_{\hat{p}}\]

\(\dagger\) Use the qnorm function in R to compute \(z^*\).
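
As a short sketch of the `qnorm` step: for a confidence level of \(1-\alpha\), the critical value leaves \(\alpha/2\) in each tail of the standard normal distribution. The three confidence levels below are common choices picked for illustration.

```r
conf_levels <- c(0.90, 0.95, 0.99)   # illustrative confidence levels
alpha <- 1 - conf_levels             # total area left in the two tails
z_star <- qnorm(1 - alpha / 2)       # standard normal quantiles
round(z_star, 3)
## [1] 1.645 1.960 2.576
```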

Customer Satisfaction

A local coffee shop prides itself on high customer satisfaction. The shop’s management claims that at least \(85\)% of its customers are satisfied with their service. A market research firm is hired to assess this claim by conducting a survey.

Data:

Objective:

Define Hypotheses

Let \(p\) represent the true proportion of satisfied customers.

Null Hypothesis \(H_0\): The satisfaction rate is equal to \(85\)%.

\[p = 0.85\]

Significance Level: A significance level of \(\alpha = 0.05\) is chosen.

Alternative Hypothesis \(H_A\): The satisfaction rate is greater than \(85\)%.

\[p > 0.85\]

\(\star\) This is a one-tailed (right-tailed) test because \(H_A\) uses the \(>\) sign.

Compute the Test Statistic

The point estimate is the sample proportion \(\hat{p} = \frac{173}{200} = 0.865\).

Test statistic for one proportion:

\[z = \frac{\hat{p} - p_0}{SE_{p}}\]

Computing the test statistic:

\[ \begin{aligned} z & = \frac{0.865 - 0.85}{\sqrt{\frac{0.85(1-0.85)}{200}}} \\ z & \approx 0.594 \end{aligned} \]

\(\star\) The standard error formula \(SE_{p}\) uses the null value because we are assuming the null hypothesis to be true as the default.

Determine the P-Value

Determine the probability associated with the computed test statistic. Remember that this is the probability \(P(Z \ge z|H_0)\), where \(Z\) is an r.v. with the standard normal distribution.

Sampling distribution of \(\hat{p}\) under the null value:

Using R:

p_hat <- 173/200 # sample proportion (point estimate)
p_0 <- 0.85 # null value
n <- 200 # sample size
SE_p <- sqrt((p_0*(1-p_0))/(n)) # standard error
z <- (p_hat-p_0)/SE_p # test statistic

# p-value
1-pnorm(z,0,1) 
## [1] 0.2762265

\(\star\) The p-value is the probability \(P(Z \ge z|H_0) = 0.276\). Since this is a one-tailed test, we only use the right-tail probability.
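
As a cross-check, base R's `prop.test` should reproduce this p-value when the continuity correction is turned off. This is only a sketch and is not necessarily how the slides intend the test to be run: `prop.test` reports a chi-squared statistic, which for a single proportion is the square of the \(z\) statistic.

```r
# Manual computation, as in the steps above
p_0 <- 0.85
n <- 200
z <- (173 / n - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_manual <- pnorm(z, lower.tail = FALSE)

# Same test via prop.test, without the continuity correction
res <- prop.test(x = 173, n = 200, p = 0.85,
                 alternative = "greater", correct = FALSE)
z_from_test <- unname(sqrt(res$statistic))   # z is positive here since p_hat > p_0
c(p_manual, res$p.value)                     # the two p-values should agree
```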

Make a Decision and Conclusion

We compare the p-value to our chosen significance level of \(\alpha = 0.05\).

Choices:

Conclusions:
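
The decision rule can be sketched in a few lines of R: reject \(H_0\) when the p-value is below \(\alpha\), and fail to reject otherwise. The p-value is copied from the R output above, and the strings are illustrative labels.

```r
alpha <- 0.05            # significance level chosen before the test
p_value <- 0.2762265     # right-tail p-value computed above

decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
decision
## [1] "fail to reject H0"
```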

Interpretation of the Hypothesis Test

The hypothesis test concluded that we failed to reject \(H_0\).

Context:

Interpretation:

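\(\star\) A sample proportion of \(\hat{p} = \frac{173}{200} = 0.865\) is consistent with sampling variability around the null value \(0.85\); the data do not provide convincing evidence that the true satisfaction rate is greater than \(85\)%.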

What does the Significance Level Mean?

Remember that we defined \(\alpha = 0.05\) arbitrarily before we conducted the hypothesis test.

The significance level \(\alpha\) is related to the confidence level of a confidence interval for the parameter, which is \(1-\alpha\).

\(\star\) The significance level \(\alpha\) is the probability of rejecting the null hypothesis when it is actually true. In other words, it is the probability of making a Type I error.
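
A simulation can make the Type I error rate concrete. The sketch below generates many samples for which \(H_0\) is true by construction (\(p = 0.85\), \(n = 200\)), runs the right-tailed test on each one, and records how often \(H_0\) is rejected. The seed and the number of repetitions are arbitrary assumptions.

```r
set.seed(361)     # arbitrary seed so the sketch is reproducible
p_0 <- 0.85       # null value, used here as the true proportion
n <- 200          # sample size
alpha <- 0.05     # significance level
reps <- 10000     # number of simulated samples (assumed)

x <- rbinom(reps, size = n, prob = p_0)       # satisfied customers in each sample
z <- (x / n - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_values <- pnorm(z, lower.tail = FALSE)      # right-tail p-values

type1_rate <- mean(p_values < alpha)          # share of false rejections
type1_rate
```

The rejection rate should be in the neighborhood of \(\alpha\), but not exactly \(0.05\), because the counts are discrete and the normal model is an approximation.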

Confidence Interval in Relation to Hypothesis Testing

We need the \(95\)% confidence interval for the true proportion \(p\), centered at the sample proportion (point estimate) \(\hat{p} = \frac{173}{200} = 0.865\).

Confidence Level:

Confidence Interval:

\(\star\) The null value of \(0.85\) is within the \(95\)% confidence interval, so a two-sided test would fail to reject the null hypothesis at the \(5\)% significance level, which agrees with the decision of the one-tailed test.
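
The interval can be computed with the same steps as in the earlier R example. In this sketch, note that the standard error uses \(\hat{p}\) (as for any confidence interval), not the null value used in the test statistic.

```r
n <- 200                                 # sample size
p_hat <- 173 / n                         # sample proportion (point estimate)
p_0 <- 0.85                              # null value
z_star <- qnorm(0.975)                   # critical value for 95% confidence
SE_p <- sqrt(p_hat * (1 - p_hat) / n)    # standard error based on p_hat

ci <- c(p_hat - z_star * SE_p, p_hat + z_star * SE_p)   # (lower, upper)
null_inside <- (ci[1] <= p_0) && (p_0 <= ci[2])         # is 0.85 in the interval?
round(ci, 3)
null_inside
```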

Summary of Hypothesis Testing for One Proportion (1/3)

Let \(p\) be the population parameter and \(p_0\) be the null value.

State the Hypotheses:

\(\dagger\) The alternative hypothesis can use \(\ne\) (two-sided), or \(<\) or \(>\) (one-sided), depending on context.

Set Significance Level \(\alpha\):

\(\star\) The significance level has to be set before looking at the p-value.

Summary of Hypothesis Testing for One Proportion (2/3)

Compute the test statistic:

\[z = \frac{\hat{p}-p_0}{SE_p}\]

Determine the p-value:

\(\dagger\) Use the pnorm function in R to compute the p-value.

Sampling distribution of \(\hat{p}\) under the null value (left one-tail):

Sampling distribution of \(\hat{p}\) under the null value (right one-tail):

Sampling distribution of \(\hat{p}\) under the null value (two-tail):
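
The three cases map onto `pnorm` as sketched below. The value `z = 0.594` is the test statistic from the customer satisfaction example, reused here only for illustration.

```r
z <- 0.594   # test statistic from the customer satisfaction example

p_left  <- pnorm(z)                               # H_A: p < p_0  (left tail)
p_right <- pnorm(z, lower.tail = FALSE)           # H_A: p > p_0  (right tail)
p_two   <- 2 * pnorm(abs(z), lower.tail = FALSE)  # H_A: p != p_0 (both tails)

round(c(left = p_left, right = p_right, two_sided = p_two), 3)
```

The left-tail and right-tail probabilities always sum to \(1\), and the two-sided p-value is twice the smaller of the two.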

Summary of Hypothesis Testing for One Proportion (3/3)

Make a decision and conclusion:

Important Notes:

\(\star\) If you rejected \(H_0\), it does not prove that \(H_0\) is false. It means that the observation would be a rare occurrence if it came from the sampling distribution under the null value.

\(\star\) If you failed to reject \(H_0\), it does not mean that \(H_0\) is “accepted”. It means that the observation is consistent with sampling variability under \(H_0\), so the evidence against \(H_0\) is not strong enough.