Inference for One Proportion

Elementary Statistics

MTH-161D | Spring 2025 | University of Portland

March 17, 2025

Objectives

Develop an understanding of inference for one proportion
Know how to compute the confidence interval of one proportion
Activity: Determine Confidence Intervals of One Proportion

These slides are derived from Diez et al. (2012).

Previously… (1/3)

The guiding principle of statistics is statistical thinking.

Statistical Thinking in the Data Science Life Cycle

Previously… (2/3)

Types of Inference

	Parameter Estimation	Hypothesis Testing
Goal	Estimate an unknown population value	Assess claims about a population value
Methods	Point Estimation: A single value estimate (e.g., sample mean); Interval Estimation: A range of plausible values (e.g., confidence interval)	State a null and an alternative hypothesis; compute a test statistic and compare it to a threshold (p-value or critical value)
Key Concept	Focuses on precision in estimation (confidence intervals)	Focuses on decision-making based on evidence (reject or fail to reject the null hypothesis)

Previously… (3/3)

Confidence Intervals

\[\text{point estimate} \pm z^{\star} \times \text{SE}.\]

In a confidence interval, \(z^{\star} \times \text{SE}\) is called the margin of error, and for a given sample, the margin of error changes as the confidence level changes.
Using the standard normal distribution, it is possible to find the appropriate \(z^{\star}\) for any confidence level.
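
As a minimal sketch of how \(z^{\star}\) can be found from the standard normal distribution, the Python snippet below uses the standard library's `statistics.NormalDist`. The function name `z_star` is a hypothetical helper chosen for illustration; it is not part of the slides or of any library.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* for a two-sided interval (hypothetical helper).

    A confidence level C leaves (1 - C) / 2 in each tail, so z* is the
    quantile of the standard normal at 1 - (1 - C) / 2.
    """
    upper_tail_cutoff = 1 - (1 - confidence) / 2
    return NormalDist().inv_cdf(upper_tail_cutoff)

# Print z* for a few common confidence levels.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

Raising the confidence level raises \(z^{\star}\), which is why the margin of error grows with the confidence level for a fixed sample.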

Case Study I

Two scientists want to know if a certain drug is effective against high blood pressure.

The first scientist wants to give the drug to 1000 people with high blood pressure and see how many of them experience lower blood pressure levels.
The second scientist wants to give the drug to 500 people with high blood pressure, and not give the drug to another 500 people with high blood pressure, and see how many in both groups experience lower blood pressure levels.

Which is the better way to test this drug?

\(\star\) Answer: The second scientist's design (500 get the drug, 500 don’t), because the untreated group provides a baseline for comparison.

Results from the GSS

The GSS (General Social Survey) asked the same question. Below is the distribution of responses from the 2010 survey:

Answer	Count
All 1000 get the drug	99
500 get the drug 500 don’t	571
Total	670

Parameter and Point Estimate

We would like to estimate the proportion of all Americans who have good intuition about experimental design, i.e., who would answer “500 get the drug 500 don’t”. What are the parameter of interest and the point estimate?

Parameter of interest: proportion of all Americans who have good intuition about experimental design. \[p \longrightarrow \text{a population proportion}\]
Point estimate: proportion of sampled Americans who have good intuition about experimental design. \[\hat{p} \longrightarrow \text{a sample proportion}\]

Inference on a Proportion

What percent of all Americans have good intuition about experimental design, i.e. would answer “500 get the drug 500 don’t”?

We can answer this research question using a confidence interval, which we know is always of the form \[\text{point estimate} \pm z^{\star} \times \text{SE}.\]

Standard error (SE) of a sample proportion \[SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.\]

Central Limit Theorem for Proportions

Sample proportions will be nearly normally distributed with mean equal to the population proportion, \(p\), and standard error equal to \(\sqrt{\frac{p(1-p)}{n}}\).

This is true only under certain conditions:

independent observations
at least 10 successes and 10 failures

Note:

If \(p\) is unknown (most cases), we use \(\hat{p}\) in the calculation of the standard error.
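
The standard error formula above, with \(\hat{p}\) substituted for the unknown \(p\), can be sketched in Python as follows. The function name `se_proportion` is a hypothetical helper for illustration only.

```python
from math import sqrt

def se_proportion(p_hat, n):
    """Standard error of a sample proportion (hypothetical helper).

    Uses the sample proportion p_hat in place of the unknown p,
    as described in the note above.
    """
    return sqrt(p_hat * (1 - p_hat) / n)

# Example with the GSS values used in these slides: p_hat = 0.85, n = 670.
print(f"SE = {se_proportion(0.85, 670):.4f}")
```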

Case Study I: Inferring the True Proportion

The GSS found that 571 out of 670 (85%) respondents answered the question on experimental design correctly. Estimate (using a 95% confidence interval) the proportion of all Americans who have good intuition about experimental design.

Given: \(n = 670\), \(\hat{p} = 0.85\). First check conditions.

Independence: The sample is random, and \(670\) is less than \(10\%\) of all Americans, so we can assume that one respondent’s response is independent of another’s.
Success-failure: 571 people answered correctly (successes) and 99 answered incorrectly (failures); both counts are at least 10.
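
The success-failure check can be sketched as a small Python function. The name `success_failure_met` and its `minimum` parameter are hypothetical, introduced only for this illustration; the independence condition depends on how the sample was collected and is not something code can verify.

```python
def success_failure_met(successes, n, minimum=10):
    """Return True if both successes and failures reach the minimum count.

    Hypothetical helper: failures are taken to be n - successes.
    """
    failures = n - successes
    return successes >= minimum and failures >= minimum

# GSS counts from these slides: 571 successes out of 670 responses.
print(success_failure_met(571, 670))
```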

Case Study I: Confidence Interval

We are given that \(n = 670\) and \(\hat{p} = 0.85\). We also just learned that the standard error of the sample proportion is \[SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.\]

Which of the below is the correct calculation of the 95% confidence interval?

\(0.85 \pm 1.65 \times \sqrt{\frac{0.85 \times 0.15}{670}}\)
\(0.85 \pm 1.96 \times \sqrt{\frac{0.85 \times 0.15}{670}}\)
\(0.85 \pm 1.96 \times \frac{0.85 \times 0.15}{\sqrt{670}}\)
\(571 \pm 1.96 \times \sqrt{\frac{571 \times 99}{670}}\)

\(\star\) Answer: \(0.85 \pm 1.96 \times \sqrt{\frac{0.85 \times 0.15}{670}} \longrightarrow (0.82,0.88)\)
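
The interval calculation can be reproduced with a short Python sketch. The function `proportion_ci` is a hypothetical helper, and it computes \(z^{\star}\) from the standard normal distribution rather than using the rounded value 1.96, so intermediate digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval for one proportion (hypothetical helper).

    Computes point estimate +/- z* x SE, with p_hat used in the SE.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin

# GSS example from these slides: p_hat = 0.85, n = 670.
lower, upper = proportion_ci(0.85, 670)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```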

Case Study I: Choosing a sample size

Previously, for \(n=670\) the margin of error is \(1.96 \times \sqrt{\frac{0.85 \times 0.15}{670}} \approx 0.027\).

How many people should you sample in order to cut the margin of error of a 95% confidence interval down to 0.01?

Margin of error: \[z^{\star} \times SE_{\hat{p}}\]

\[ \begin{aligned} 1.96 \times \sqrt{\frac{0.85 \times 0.15}{n}} & \le 0.01 \\ 1.96^2 \times \frac{0.85 \times 0.15}{n} & \le 0.01^2 \\ n & \ge \frac{1.96^2 \times 0.85 \times 0.15}{0.01^2} \\ n & \ge 4898.04 \end{aligned} \]

\(\star\) The sample size should be at least 4,899 to have a 0.01 margin of error for a 95% confidence interval.
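
The sample size derivation above can be sketched in Python. The function `required_sample_size` is a hypothetical helper; it assumes the rounded critical value \(z^{\star} = 1.96\) used in these slides, and an unrounded \(z^{\star}\) can shift the result by one.

```python
from math import ceil

def required_sample_size(p_hat, margin, z_star=1.96):
    """Smallest n with z* x sqrt(p_hat(1 - p_hat)/n) <= margin.

    Hypothetical helper. Solving the inequality for n gives
    n >= z*^2 x p_hat x (1 - p_hat) / margin^2, rounded up
    because n must be a whole number of people.
    """
    return ceil(z_star**2 * p_hat * (1 - p_hat) / margin**2)

# GSS example from these slides: p_hat = 0.85, target margin of error 0.01.
print(required_sample_size(0.85, 0.01))
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the margin of error slightly above the target.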

Activity: Determine Confidence Intervals of One Proportion

Make sure you have a copy of the M 3/17 Worksheet. This will be handed out physically and it is also digitally available on Moodle.
Work on your worksheet by yourself for 10 minutes. Please read the instructions carefully. Ask questions if anything needs clarification.
Get together with another student.
Discuss your results.
Submit your worksheet on Moodle as a .pdf file.