Noam Ross (with lots of slides by David L Miller)
August 5th, 2017
Models that look like:
\[ y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \ldots + \epsilon_i \]
(describe the response, \( y_i \), as a linear combination of the covariates, \( x_{ji} \), with an offset)
We can make \( y_i\sim \) any exponential family distribution (Normal, Poisson, etc.).
The error term \( \epsilon_i \) is normally distributed (usually).
lm(y ~ x1, data=dat)
lm(y ~ poly(x1, 2), data=dat)  # poly(x1, 2) already includes the linear term
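For a non-Normal response we switch to glm(); a minimal sketch, assuming the same hypothetical dat, with a Poisson count:

glm(y ~ x1, data = dat, family = poisson)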
(Sort of. Let's fire up RStudio)
In a GLM we model the mean of the data as a sum of linear terms:
\[ y_i = \beta_0 +\sum_j \color{red}{ \beta_j x_{ji}} +\epsilon_i \]
A GAM is a sum of smooth functions, or smooths:
\[ y_i = \beta_0 + \sum_j \color{red}{s_j(x_{ji})} + \epsilon_i \]
where \( \epsilon_i \sim N(0, \sigma^2) \), \( y_i \sim \text{Normal} \) (for now)
We call the above equation the linear predictor in both cases.
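In R, the mgcv package fits these models; a minimal sketch, again assuming a hypothetical dat:

library(mgcv)
gam(y ~ s(x1), data = dat)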
There are several different types of splines, with different applications.
[Figure: basis functions of a cubic spline (top) and a thin-plate spline (bottom)]
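In mgcv the spline type is chosen with the bs argument to s(), for example:

s(x1, bs = "tp")  # thin-plate regression spline (the default)
s(x1, bs = "cr")  # cubic regression spline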
Back to RStudio to look at this in our model!
\[ \int_\mathbb{R} \left( \frac{\partial^2 f(x)}{\partial x^2}\right)^2 \text{d}x = \boldsymbol{\beta}^\text{T}S\boldsymbol{\beta} = W \]
(Wiggliness is 100% the right mathy word)
We penalize wiggliness to avoid overfitting, maximizing the penalized log-likelihood
\[ \log(\text{Likelihood}) - \lambda W \]
where the smoothing parameter \( \lambda \) controls the trade-off between fit and smoothness.
mgcv can estimate \( \lambda \) for us by REML. Hence gam(..., method="REML")
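A sketch, with the smoothing parameter estimated by REML (hypothetical dat):

gam(y ~ s(x1), data = dat, method = "REML")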
A smooth's effective degrees of freedom (EDF) are bounded above by its basis size:
\[ \text{EDF} < k \]
We set the basis size for s() terms with s(variable, k=n)
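A sketch of setting and then checking the basis size (hypothetical dat):

m <- gam(y ~ s(x1, k = 20), data = dat, method = "REML")
gam.check(m)  # reports EDF, k' and a k-index test for each smooth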
\[ \text{count}_i = A_i \exp \left( \beta_0 + s(x_i, y_i) + s(\text{Depth}_i)\right) \]
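One way to write this in mgcv (a sketch: the data frame dat and the Tweedie family are assumptions; the area \( A_i \) enters as a log offset):

gam(count ~ s(x, y) + s(Depth) + offset(log(A)),
    data = dat, family = tw(), method = "REML")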
Without uncertainty, we're not doing statistics
Taken together, the model coefficients have approximately a multivariate normal distribution around the estimates:
\[ \boldsymbol{\beta} \sim N(\hat{\boldsymbol{\beta}}, \mathbf{V}_\boldsymbol{\beta}) \]
However, this distribution is conditional on the estimated smoothing parameter.
In mgcv, vcov(model) returns \( \mathbf{V}_\boldsymbol{\beta} \), the variance-covariance matrix of the coefficients. vcov(model, unconditional = TRUE) corrects for uncertainty in the smoothing parameter.
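For a fitted gam object called model:

Vb   <- vcov(model)                        # conditional on the smoothing parameter
Vb_u <- vcov(model, unconditional = TRUE)  # accounts for smoothing parameter uncertainty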
We can plot confidence intervals around smooths with plot(model), and get standard errors for predictions with se.fit in the predict() function, applied to new data (e.g. a prediction data frame preddata).
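A sketch, assuming a fitted model and a hypothetical prediction data frame preddata:

plot(model, shade = TRUE)  # smooths with approximate confidence bands
pr <- predict(model, newdata = preddata, se.fit = TRUE)
ci <- cbind(lower = pr$fit - 2 * pr$se.fit, upper = pr$fit + 2 * pr$se.fit)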
Because our spline coefficients covary, it can be somewhat misleading to just plot confidence intervals. Sometimes we want to look at a better sample of possible predictions.
We can sample values of our coefficients from a multivariate normal using vcov(model).
Since our model is ultimately a linear function of these coefficients, we can predict using
\[ \hat{y} = L_p \boldsymbol{\hat{\beta}} \]
where \( L_p \) is the linear predictor matrix, constructed from our data and our basis functions.
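Putting this together, a sketch of simulating predictions (model and preddata as above; rmvn() is mgcv's multivariate normal sampler):

Lp    <- predict(model, newdata = preddata, type = "lpmatrix")  # linear predictor matrix
betas <- rmvn(100, coef(model), vcov(model, unconditional = TRUE))  # 100 coefficient draws
sims  <- Lp %*% t(betas)  # each column is one simulated set of predictions (link scale)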
mgcv does most of the hard work for us: a GAM is just a glm with extra s() terms, and mgcv shields you from the messy math underneath.