MANE 6313
Week 12, Module C
Student Learning Outcomes
- Select an appropriate experimental design with one or more factors,
- Select an appropriate model with one or more factors,
- Evaluate statistical analyses of experimental designs,
- Assess the model adequacy of any experimental design, and
- Interpret model results.
Module Learning Outcome
Explain inference for linear regression models.
Tests on Individual Regression Coefficients
- We can test \(H_0:\beta_j=0\) vs. \(H_a:\beta_j\neq 0\) using
\[
\frac{\hat{\beta}_j}{\sqrt{\hat{\sigma}^2C_{jj}}}\sim
t_{n-k-1}
\]
where \(C_{jj}\) is the \(j\)th diagonal element of \(\mathbf{(X^\prime X)^{-1}}\)
- or equivalently
\[
\frac{\hat{\beta}_j}{se(\hat{\beta}_j)}\sim t_{n-k-1}
\]
- If \(H_0:\beta_j=0\) is not rejected, we can remove the variable \(x_j\) from the model
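These per-coefficient t-tests are exactly what `summary()` reports for a fitted `lm` object. A minimal sketch, using simulated data (not a textbook example); the variable names `x1`, `x2` are illustrative only:

```r
# Simulated data: y depends on x1 but not on x2 (true beta_2 = 0)
set.seed(1)
x1 <- runif(20)
x2 <- runif(20)
y  <- 2 + 3 * x1 + rnorm(20, sd = 0.5)

fit <- lm(y ~ x1 + x2)
# Columns: Estimate, Std. Error, t value, Pr(>|t|)
# The "t value" column is beta_hat_j / se(beta_hat_j) on n - k - 1 df
summary(fit)$coefficients
```

A large p-value for `x2` here would suggest, consistent with the bullet above, that `x2` could be dropped from the model.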
Example Problem 12.8
C.I. on the Individual Regression Coefficients
- Straightforward to construct, since (with \(p=k+1\) model parameters)
\[
\frac{\hat{\beta}_j-\beta_j}{\sqrt{\hat{\sigma}^2C_{jj}}}\sim t_{n-p},\;\;\;j=0,1,2,\ldots,k
\]
- A \(100(1-\alpha)\%\) confidence interval for \(\beta_j\) is
\[
\hat{\beta}_j-t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2C_{jj}}\leq\beta_j\leq
\hat{\beta}_j+t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2C_{jj}}
\]
- or equivalently
\[
\hat{\beta}_j-t_{\alpha/2,n-p}se(\hat{\beta}_j)\leq\beta_j\leq
\hat{\beta}_j+t_{\alpha/2,n-p}se(\hat{\beta}_j)
\]
R: confint() Function
Source: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/confint
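A short sketch of `confint()` on a fitted `lm` object, using simulated data rather than Example 12.8; the check against the hand formula assumes a 95% level:

```r
# Simulated simple-regression data for illustration
set.seed(1)
x <- runif(25)
y <- 1 + 2 * x + rnorm(25, sd = 0.3)

fit <- lm(y ~ x)
confint(fit, level = 0.95)  # rows: (Intercept), x; columns: 2.5 %, 97.5 %

# Same interval by hand: beta_hat_j -+ t_{alpha/2, n-p} * se(beta_hat_j)
se <- sqrt(diag(vcov(fit)))
cbind(coef(fit) - qt(0.975, df.residual(fit)) * se,
      coef(fit) + qt(0.975, df.residual(fit)) * se)
```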
Example 12.8 Confidence Interval on Parameters
C.I. on the Mean response
- We can find a confidence interval on the mean response at a point \(x^\prime_0=[1,x_{01},x_{02},\ldots,x_{0k}]\).
- Note that
\[
\begin{aligned}
\mu_{y|\mathbf{x_0}}&=\beta_0+\beta_1x_{01}+\beta_2x_{02}+\cdots+\beta_kx_{0k}\\
\hat{y}(\mathbf{x_0})&=\mathbf{x^\prime_0\hat{\beta}}
\end{aligned}
\]
- The variance of \(\hat{y}(\mathbf{x_0})\) is
\[
V\left[\hat{y}(\mathbf{x_0})\right]=\sigma^2\mathbf{x_0^\prime(X^\prime
X)^{-1}x_0}
\]
- A \(100(1-\alpha)\%\) confidence interval for the mean response is
\[
\begin{aligned}
\hat{y}(\mathbf{x_0})&-t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2\mathbf{x^\prime_0(X^\prime
X)^{-1}x_0}}\\
&\leq\mu_{y|\mathbf{x_0}}\leq\hat{y}(\mathbf{x_0})+t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2\mathbf{x^\prime_0(X^\prime
X)^{-1}x_0}}
\end{aligned}
\]
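In R, this interval is available from `predict()` with `interval = "confidence"`. A minimal sketch on simulated data; the point `x0` is a hypothetical interior point, not one from the text:

```r
# Simulated two-regressor data for illustration
set.seed(2)
x1 <- runif(30)
x2 <- runif(30)
y  <- 1 + 2 * x1 - x2 + rnorm(30, sd = 0.4)

fit <- lm(y ~ x1 + x2)
x0  <- data.frame(x1 = 0.5, x2 = 0.5)  # interior point x_0

# Columns: fit (y_hat at x_0), lwr, upr for the mean response
predict(fit, newdata = x0, interval = "confidence", level = 0.95)
```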
Prediction Intervals
- We can use the regression equation to predict values at points other than those in the design matrix. In general, we only want to interpolate, not extrapolate.
- Consider the point \(x^\prime_0=[1,x_{01},x_{02},\ldots,x_{0k}]\). A point estimate for \(y\) is
\[
\hat{y}(\mathbf{x_0})=\mathbf{x^\prime_0\hat{\beta}}
\]
- A \(100(1-\alpha)\%\) prediction interval for this observation is
\[
\begin{aligned}
\hat{y}(\mathbf{x_0})&-t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2(1+\mathbf{x^\prime_0(X^\prime
X)^{-1}x_0})}\\
&\leq y_0\leq\hat{y}(\mathbf{x_0})+t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2(1+\mathbf{x^\prime_0(X^\prime
X)^{-1}x_0})}
\end{aligned}
\]
- Most statistical packages will compute these intervals for you.
R: predict.lm() Function
Source: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/predict.lm
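A sketch contrasting `predict.lm()`'s prediction and confidence intervals at the same point, on simulated data. Because the prediction interval carries the extra \(1+\) term under the square root, it is always the wider of the two:

```r
# Simulated simple-regression data for illustration
set.seed(3)
x <- runif(30)
y <- 5 + 4 * x + rnorm(30, sd = 0.6)

fit <- lm(y ~ x)
x0  <- data.frame(x = 0.5)  # interior point

predict(fit, newdata = x0, interval = "prediction")  # interval for a new y_0
predict(fit, newdata = x0, interval = "confidence")  # interval for the mean response
```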