MTH-361A | Spring 2026 | University of Portland
The scatterplot on the right shows the relationship between the high school (HS) graduation rate and the % of residents who live below the poverty line (income below $23,050 for a family of 4 in 2012) across all 50 US states and DC.
A linear model is written as
\[ y = \beta_0 + \beta_1 x + \epsilon \]
where \(y\) is the outcome, \(x\) is the predictor, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon\) is the model’s error.
Notation:
We can use the sample statistics \(b_0\) and \(b_1\) as point estimates of the true values of the population parameters \(\beta_0\) and \(\beta_1\).
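As a rough illustration of how \(b_0\) and \(b_1\) can be computed from a sample, the sketch below assumes the ordinary least squares formulas \(b_1 = S_{xy}/S_{xx}\) and \(b_0 = \bar{y} - b_1\bar{x}\). The fitting method is an assumption here, and the data values and the function name `least_squares` are hypothetical.

```python
def least_squares(x, y):
    """Return (b0, b1) for y_hat = b0 + b1 * x, assuming ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # point estimate of the slope
    b0 = y_bar - b1 * x_bar   # point estimate of the intercept
    return b0, b1


# Hypothetical sample (not the state-level data); it lies exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = least_squares(x, y)
print(b0, b1)
```

Because the hypothetical points fall exactly on a line, the estimates recover that line's intercept and slope. With real data the points scatter around the line, and \(b_0\) and \(b_1\) are only estimates of \(\beta_0\) and \(\beta_1\).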
The linear model for predicting poverty from high school graduation rate in the US is
\[ \widehat{poverty} = 64.78 - 0.62 \times HS_{grad} \]
where the sample statistics are the slope \(b_1 = -0.62\) and the intercept \(b_0 = 64.78\).
The “hat” in \(\widehat{poverty}\) indicates an estimated/predicted outcome.
The high school graduation rate in Georgia is 85.1%.
What poverty level does the model predict for this state?
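One way to work this out is to plug 85.1 into the fitted equation \(\widehat{poverty} = 64.78 - 0.62 \times HS_{grad}\). A minimal sketch follows; the names `B0`, `B1`, and `predict_poverty` are hypothetical labels for the values given in the model.

```python
B0 = 64.78   # intercept b_0 from the fitted model
B1 = -0.62   # slope b_1 from the fitted model


def predict_poverty(hs_grad):
    """Predicted % in poverty for a given % of HS graduates."""
    return B0 + B1 * hs_grad


georgia = predict_poverty(85.1)
print(round(georgia, 2))  # 12.02
```

The model predicts that about 12% of Georgia residents live below the poverty line.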
Which of the following appears to be the line that best fits the linear relationship between % in poverty and % HS grad? Choose one.
The residual of the \(i^{th}\) observation \((x_i,y_i)\) is the difference between the observed value (\(y_i\)) and the estimated/predicted value (\(\hat{y}_i\)).
\[ \epsilon_i = y_i - \hat{y}_i \]
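A short sketch of the residual calculation, using the fitted poverty model from earlier. The observation below is hypothetical and chosen only for illustration; it is not a value read from the scatterplot.

```python
def predict_poverty(hs_grad):
    """Fitted model: predicted % in poverty from % of HS graduates."""
    return 64.78 - 0.62 * hs_grad


def residual(y_obs, y_hat):
    """Observed minus predicted."""
    return y_obs - y_hat


# Hypothetical observation (x_i, y_i): 90.0% HS graduates, 10.5% in poverty.
hs_grad_i, poverty_i = 90.0, 10.5
y_hat_i = predict_poverty(hs_grad_i)   # 64.78 - 55.8 = 8.98
res_i = residual(poverty_i, y_hat_i)   # 10.5 - 8.98 = 1.52
print(round(y_hat_i, 2), round(res_i, 2))
```

A positive residual means the observed point lies above the line (the model underestimates); a negative residual means it lies below the line (the model overestimates).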
The relationship between the two numerical variables shown on the right is a moderately strong, negative linear relationship.
Correlation (notation: \(r\)) describes the strength of the linear association between two numerical variables.
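For reference, a minimal sketch of how \(r\) can be computed, assuming the usual Pearson formula \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). The function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data with a strong downward trend.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]
r = pearson_r(x, y)
print(r)
```

The sign of \(r\) gives the direction of the association, and its magnitude, which always falls between 0 and 1, gives the strength. Here \(r\) is negative and close to \(-1\).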
Example:
Which of the following is the best guess for the correlation between % in poverty and % HS grad?
(a) \(r=0.6\)
(b) \(r=-0.75\)
(c) \(r=-0.1\)
(d) \(r=0.02\)
(e) \(r=-1.5\)
Which of the following has the strongest correlation, i.e. correlation coefficient closest to +1 or -1?
Sample scatterplots and their correlations. The first row shows variables with a positive relationship, represented by the trend up and to the right. The second row shows variables with a negative trend, where a large value in one variable is associated with a lower value in the other.
Sample scatterplots and their correlations. In each case, there is a strong relationship between the variables. However, because the relationship is not linear, the correlation is relatively weak.
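An extreme case of this point, as a sketch: in the hypothetical data below, \(y\) is determined exactly by \(x\) through \(y = x^2\), yet the correlation is zero because the relationship is not linear. The Pearson formula is assumed, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: a perfect, but curved, relationship.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]   # [4, 1, 0, 1, 4]
r_curved = pearson_r(x, y)
print(r_curved)
```

The positive and negative cross-deviations cancel, so \(r\) does not detect the pattern. A scatterplot should always be checked alongside \(r\).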