Overview
Regression inference involves testing hypotheses and constructing confidence intervals for regression coefficients and predictions.
Standard Error of the Slope
$$SE(\beta_1) = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
where s is the standard error of the estimate:
$$s = \sqrt{\frac{SSE}{n-2}}$$
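A minimal sketch of how these two quantities could be computed from raw data, using only the Python standard library. The function name `fit_and_slope_se` and the data are made up for illustration; they are not part of the notes.

```python
import math

def fit_and_slope_se(x, y):
    """Least-squares fit of y on x; returns (b0, b1, s, se_b1)."""
    n = len(x)
    if n < 3:
        raise ValueError("need n >= 3 so that df = n - 2 is positive")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)            # sum of (x_i - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                                      # estimated slope
    b0 = y_bar - b1 * x_bar                             # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                        # standard error of the estimate
    se_b1 = s / math.sqrt(sxx)                          # standard error of the slope
    return b0, b1, s, se_b1

# Made-up illustrative data (an assumption, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, s, se_b1 = fit_and_slope_se(x, y)
print(f"b1 = {b1:.3f}, s = {s:.4f}, SE(b1) = {se_b1:.4f}")
```

Note how a larger spread in x (a larger sum of squared deviations) shrinks the standard error of the slope for the same s.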
Hypothesis Test for Slope
Hypotheses
Testing if there's a linear relationship:
- H0: β1=0 (no linear relationship)
- H1: β1≠0 (linear relationship exists)
Test Statistic
$$t = \frac{\beta_1}{SE(\beta_1)}$$
With df=n−2
Decision
If |t| > t_critical (equivalently, p-value < α), reject H0; otherwise, fail to reject H0.
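The decision rule can be sketched as a small function. This is an illustration only: `slope_t_test` is a hypothetical helper, the inputs are made-up numbers, and the critical value is passed in by hand (2.306 is the two-sided 5% value for 8 degrees of freedom from a t table).

```python
def slope_t_test(b1, se_b1, t_crit):
    """Two-sided test of H0: beta1 = 0 against H1: beta1 != 0.

    b1     -- estimated slope
    se_b1  -- standard error of the slope
    t_crit -- critical value t_{alpha/2, n-2}, looked up by the caller
    Returns (t_stat, reject_h0).
    """
    t_stat = b1 / se_b1
    reject_h0 = abs(t_stat) > t_crit
    return t_stat, reject_h0

# Made-up inputs: n = 10, so df = 8 and t_crit = 2.306 at alpha = 0.05.
t_stat, reject = slope_t_test(b1=1.2, se_b1=0.5, t_crit=2.306)
print(t_stat, reject)

# A weaker estimate with the same standard error does not reach significance.
t_weak, reject_weak = slope_t_test(b1=0.5, se_b1=0.5, t_crit=2.306)
print(t_weak, reject_weak)
```

If SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, n - 2)` could replace the table lookup for the critical value.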
Confidence Interval for Slope
$$\beta_1 \pm t_{\alpha/2,\,n-2} \times SE(\beta_1)$$
Interpretation: We are (1−α)×100% confident the true slope is in this interval.
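As a rough sketch, the interval is just the estimate plus or minus a margin. The helper name `slope_confidence_interval` and the numbers are assumptions for illustration. The last line shows the link to the two-sided test: the interval excludes 0 exactly when the test at the same α rejects H0.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) for the slope: b1 +/- t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Made-up inputs: df = 8, so t_crit = 2.306 for a 95% interval.
lower, upper = slope_confidence_interval(b1=1.2, se_b1=0.5, t_crit=2.306)
excludes_zero = not (lower <= 0 <= upper)   # True means H0: beta1 = 0 is rejected at alpha = 0.05
print(f"({lower:.3f}, {upper:.3f}), excludes zero: {excludes_zero}")
```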
Standard Error of the Intercept
$$SE(\beta_0) = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}}$$
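A short sketch of the intercept formula, assuming s has already been computed from the residuals. The function name and the inputs are hypothetical.

```python
import math

def intercept_standard_error(x, s):
    """SE of the intercept: s * sqrt(1/n + x_bar^2 / sum((x_i - x_bar)^2))."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return s * math.sqrt(1 / n + x_bar ** 2 / sxx)

# Made-up inputs: five x values and an assumed s of 0.2.
se_b0 = intercept_standard_error([1, 2, 3, 4, 5], s=0.2)
print(f"SE(b0) = {se_b0:.4f}")
```

The x̄² term means the intercept is estimated less precisely when the data sit far from x = 0.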
Confidence Interval for Mean Response
For a given x0, the CI for E(Y∣X=x0):
$$\hat{y} \pm t_{\alpha/2,\,n-2} \times s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
Prediction Interval
For a single new observation at x0:
$$\hat{y} \pm t_{\alpha/2,\,n-2} \times s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
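Note: At the same x0 and confidence level, the prediction interval is always wider than the confidence interval for the mean response, because the extra 1 in its formula accounts for the variability of an individual observation around the mean.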
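The two half-widths differ only by the extra 1, which a small sketch makes concrete. The helper `interval_half_widths` and all inputs are assumptions for illustration (t_crit = 3.182 is the two-sided 5% table value for 3 degrees of freedom).

```python
import math

def interval_half_widths(x, s, x0, t_crit):
    """Return (ci_half, pi_half) at x0 for a simple linear regression.

    ci_half -- half-width of the CI for the mean response E(Y | X = x0)
    pi_half -- half-width of the prediction interval for one new Y at x0
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    core = 1 / n + (x0 - x_bar) ** 2 / sxx      # term shared by both intervals
    ci_half = t_crit * s * math.sqrt(core)
    pi_half = t_crit * s * math.sqrt(1 + core)  # extra 1: variance of a single observation
    return ci_half, pi_half

# Made-up inputs: n = 5 (df = 3), assumed s = 0.2.
x = [1, 2, 3, 4, 5]
ci_center, pi_center = interval_half_widths(x, s=0.2, x0=3, t_crit=3.182)  # x0 equals x_bar
ci_edge, pi_edge = interval_half_widths(x, s=0.2, x0=5, t_crit=3.182)      # x0 far from x_bar
print(ci_center, pi_center)
print(ci_edge, pi_edge)
```

Both intervals are narrowest at x0 = x̄ and widen as x0 moves away from it, so predictions near the edge of the observed x range are less precise.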
Comparison
| Interval Type | What It Estimates | Width |
|---|---|---|
| CI for mean response | Average Y for given X | Narrower |
| Prediction interval | Individual Y for given X | Wider |
ANOVA Approach to Regression
ANOVA Table
| Source | df | SS | MS | F |
|---|---|---|---|---|
| Regression | 1 | SSR | MSR=SSR/1 | MSR/MSE |
| Error | n−2 | SSE | MSE=SSE/(n−2) | |
| Total | n−1 | SST | | |
F-Test for Overall Significance
$$F = \frac{MSR}{MSE}$$
For simple regression, F = t², where t is the test statistic from the slope test.
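The decomposition SST = SSR + SSE and the identity F = t² can be checked numerically. The sketch below uses a hypothetical helper `anova_simple_regression` and made-up data.

```python
import math

def anova_simple_regression(x, y):
    """Return the ANOVA quantities and the slope t statistic for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # regression sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
    msr = ssr / 1                                            # regression df = 1
    mse = sse / (n - 2)                                      # error df = n - 2
    f_stat = msr / mse
    t_stat = b1 / math.sqrt(mse / sxx)                       # slope t statistic, SE = s / sqrt(Sxx)
    return {"SSR": ssr, "SSE": sse, "SST": sst, "F": f_stat, "t": t_stat}

# Made-up illustrative data (an assumption, not taken from the notes).
table = anova_simple_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(table)
```

Because F = t² holds here, the F-test and the two-sided slope t-test always reach the same conclusion in simple regression.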
Example
Regression output:
- n=20
- β1=2.5
- SE(β1)=0.8
- s=3.2
- ∑(xi−xˉ)2=16
Testing H0:β1=0
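$$t = \frac{2.5}{0.8} = 3.125$$
With df = 20 − 2 = 18, the two-sided critical value at α = 0.05 is 2.101.
Since 3.125 > 2.101, reject H0.
A statistically significant linear relationship exists at the 5% level.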
95% CI for Slope
2.5±2.101×0.8=2.5±1.68=(0.82,4.18)
We're 95% confident the true slope is between 0.82 and 4.18.
Assumptions for Valid Inference
- Linearity: Check with residual plot
- Independence: Random sampling
- Normality of errors: Q-Q plot, Shapiro-Wilk test
- Homoscedasticity: Constant variance (residual plot)
Residual Analysis
What to Look For
- Random scatter: Assumptions met
- Pattern/curvature: Nonlinearity
- Funnel shape: Non-constant variance
- Clusters: Possible subgroups
Standardized Residuals
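$$\text{Standardized residual} = \frac{e_i}{s}$$
where e_i is the i-th residual (observed minus fitted value) and s is the standard error of the estimate.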
Values beyond ±2 or ±3 may be outliers.
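A minimal sketch of this screening rule, using the simple definition of residual divided by s. The helper names and the data are made up; the data follow y = 2x exactly except for one deliberately shifted point.

```python
import math

def standardized_residuals(x, y):
    """Fit y on x by least squares and return each residual divided by s."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [e / s for e in residuals]

def flag_outliers(std_res, threshold=2.0):
    """Return the indices whose standardized residual exceeds the threshold in absolute value."""
    return [i for i, r in enumerate(std_res) if abs(r) > threshold]

# Made-up data: y = 2x, except the point at x = 5 is shifted up by 10.
x = list(range(1, 11))
y = [2 * xi for xi in x]
y[4] = 20
std_res = standardized_residuals(x, y)
flagged_at_2 = flag_outliers(std_res, threshold=2.0)
flagged_at_3 = flag_outliers(std_res, threshold=3.0)
print(flagged_at_2, flagged_at_3)
```

A single large outlier also inflates s, which pulls every standardized residual toward zero; this is one reason a point can be flagged at ±2 but not at ±3.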