File: adding-Effect-methods.Rnw

package info (click to toggle)
effects 4.1.0-1
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 2,972 kB
  • sloc: makefile: 4
file content (287 lines) | stat: -rw-r--r-- 18,577 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{Defining Effect Methods for Other Models}
%% vignette index specifications need to be *after* \documentclass{}
%%\VignetteEngine{knitr::knitr}
%%\VignetteIndexEntry{Computing Effects for Other Statistical Models}
%%\VignettePackage{effects}

\documentclass{article}


\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage[american]{babel}
\newcommand{\R}{{\sf R}}
\usepackage{url}
\usepackage{hyperref}
\usepackage{alltt}
\usepackage{fancyvrb}
\usepackage{natbib}
\usepackage{amsmath}
\VerbatimFootnotes
\bibliographystyle{chicago}
%\usepackage{amsmath,amsfonts,amssymb}
%\usepackage{natbib}
%\bibliographystyle{abbrvnat}
%\usepackage[margin=1in]{geometry}

\newcommand{\x}{\mathbf{x}}
\newcommand{\code}[1]{\normalfont\texttt{\hyphenchar\font45\relax #1}}
\newcommand{\E}{\mathrm{E}}
\newcommand{\tild}{\symbol{126}}
\newcommand{\Rtilde}{\,\raisebox{-.5ex}{\code{\tild{}}}\,}
\newcommand{\captilde}{\mbox{\protect\Rtilde}} % use in figure captions.
\newcommand{\Rmod}[2]{\code{#1 \raisebox{-.5ex}{\tild{}} #2}}
\newcommand{\Rmoda}[2]{\code{#1} &\code{\raisebox{-.5ex}{\tild{}} #2}}
\newcommand{\Rmodb}[2]{\code{#1 &\raisebox{-.5ex}{\tild{}}& #2}}
\newcommand{\C}{\mathbf{C}}
\newcommand{\betahat}{\widehat{\beta}}
\newcommand{\bbetahat}{\widehat{\boldsymbol{\beta}}}
\newcommand{\bbeta}{\boldsymbol{\beta}}
\newcommand{\xbf}{\x_{\backslash{}f}}
\newcommand{\hbf}{h_{\backslash{}f}}
\newcommand{\xtb}{\x_{2\backslash{}f}}
\newcommand{\xbfi}{\x_{\backslash{}f,i}}
\newcommand{\inter}[2]{\mbox{$#1$:$#2$}}
\newcommand{\cross}[2]{\mbox{$#1$\code{*}$#2$}}
\newcommand{\N}{\mathrm{N}}
\newcommand{\fn}{\textbf}
\newcommand{\pkg}[1]{\code{#1}}
\newcommand{\proglang}[1]{\code{#1}}
\newcommand{\yx}{\widehat{y}(\x)}
\newcommand{\lvn}[1]{\mbox{$\log(\mbox{\texttt{#1}})$}}

\begin{document}

\title{Defining Effect Methods for Other Models}

\author{John Fox and Sanford Weisberg}

\date{\today}

\maketitle

<<setopts,echo=FALSE>>=
library("knitr")
opts_chunk$set(fig.width=5,fig.height=5,tidy=TRUE,
               out.width="0.8\\textwidth",echo=TRUE)
options(prompt=" ")
@ 


<<echo=FALSE, results='hide', include=FALSE>>=
#options(continue="+    ", prompt="R> ", width=76)
options(show.signif.stars=FALSE)
options(scipen=3)
@

<<include=FALSE>>=
library(knitr)
opts_chunk$set(
tidy=FALSE,fig.width=5,fig.height=5,cache=FALSE,comment=NA, prompt=TRUE
)
render_sweave()
@


<<echo=FALSE, results='hide', include=FALSE>>=
options(continue="    ", prompt=" ", width=76)
options(show.signif.stars=FALSE)
options(scipen=3)

@

The \code{effects} package in \R{} is designed primarily to draw graphs that visualize a fitted response surface of a fitted model in problems with a linear predictor.  Many modeling paradigms that can be fit with base \R{} or contributed packages fit into this framework, including methods for linear, multivariate linear, and generalized linear models fit by the standard \code{lm} and \code{glm} functions and by the \code{svyglm} function in the \code{survey} package \citep{Lumley04}; linear models fit by generalized least squares using the \code{gls} function in the \pkg{nlme} package \citep{Pinheiro16}; multinomial regression models fit by \code{multinom} in the \pkg{nnet} package \citep{VenablesRipley02};  ordinal regression models using \code{polr} from the \pkg{MASS} package \citep{VenablesRipley02} and \code{clm} and \code{clm2} from the \pkg{ordinal} package \citep{Christensen15}; linear and generalized linear mixed models using the \code{lme} function in the \pkg{nlme} package \citep{Pinheiro16} and the \code{lmer} and \code{glmer} functions in the \pkg{lme4} package \citep{Bates15}; and latent class models fit by \code{poLCA} in the \pkg{poLCA} package \citep{Linzer11}. This is hardly an exhaustive list of fitting methods that are based on a linear predictor, and we have been asked from time to time to write functions to use \code{effects} with this other fitting methods.  The mechanism for this is fairly simple.  This vignette assumes you are familiar with \R{}'s S3 methods.

The default \code{Effect.default} may work with some modeling functions, as would objects of the class \code{gls} that we describe below in Section~\ref{gls}, but as illustrated in later sections you may need to modify some of the arguments that are sent to \code{Effect.default}.  .

The \code{effect} package has five functions that create the information needed for drawing effects plots, \code{Effect}, \code{allEffects}, \code{effect} and \code{predictorEffect} and \code{predictorEffects}.  To add new modeling to the package only a new \code{Effect} needs to be written; the package will take care of all the other functions.

\section{Using \code{effects} with Other Modeling Methods, with Generalized Least Squares in the \pkg{nlme} package as an Example}\label{gls}
Applying \code{effects} to other than \code{lm} and \code{glm} objects may require writing an method for the \code{Effect} generic function for that type of model object.  For example, the \code{gls} function in the \code{nlme} package \citep{nlme} fits linear models via generalized least squares.  A call to \code{gls} creates an object of class \code{gls}.  The following method of \code{Effect} for \code{gls} objects finds the information needed to draw effects plots from \code{gls} objects:

<<eval=FALSE>>=
Effect.gls <- function(focal.predictors, mod, ...){
  cl <- mod$call
  cl$weights <- NULL
  args <- list(
    type = "glm",
    call = cl,
    formula = formula(mod),
    family = NULL,
    coefficients = coef(mod),
    vcov = as.matrix(vcov(mod)),
    method=NULL)
  Effect.default(focal.predictors, mod, ..., sources=args)
}
@
This function simply harvests the needed information into a variable \code{sources}, and then passes to the result to the default \code{Effect} method.  The  three required arguments to the method are: \code{focal.predictors} and \code{mod} that match the first two arguments of \code{Effect.default}, and  \code{...} that matches all other arguments.  The list \code{sources} has up to 6 named elements and is created in the function:
\begin{description}
\item[\texttt{type}] The \code{effects} package has three basic modeling functions:  \code{type = "glm"}, the default, is used for functions with a univariate response and a linear predictor and possibly a link function.  This class includes linear models, generalized linear models, robust regression, generalized least squares fitting, linear and generalized linear mixed effects models, and many others. The \code{type = "polr"} is used for ordinal regression models, as in the \code{polr} function in the \code{MASS} package, and similar methods described below in Section~\ref{polr}.  The The \code{type = "multinom"} for multinomial log-linear models as fit by the \code{multinom} function in \code{nnet}, and to polytomous latent class models created with the \code{poLCA} function in the \pkg{poLCA} package.

\item[\code{call}] The \code{Effect.default} method uses the call to harvest additional arguments that it needs.  For \code{type="glm"}, these arguments are \code{formula}, \code{data}, \code{contrasts},  \code{subset},
\code{family}, \code{weights}, and \code{offset}, although only the \code{formula} argument is required.  The \code{gls} function includes an optional \code{weights} argument that is used differently from the \code{weights} argument for a generalized linear model and is not needed for computing effects or predictor effects plots.  In the function shown above the call is modified by setting \code{weights=NULL}.  

The default for \code{call} is \code{mod\$call} for S3 objects and \code{mod@call} for S4 objects.

\item[\code{formula}] In most cases the formula for the linear predictor is returned by \code{formula(mod)}, the default, but if this is not the case the value of this argument should be the value of the formula for fixed effects.
\item[\code{family}] The default is \code{family=NULL}.  This argument is required for GLM-like models that include a \code{family} that specifies both an error distribution and a link function only if \code{family=family(mod)} is not appropriate.  See the \code{betareg} example in Section~\ref{betareg} below for an example that includes a user-selected link function, but a fixed error distribution.
\item[\code{coefficients}] In many cases the (fixed-effect) coefficient estimates are returned by \code{coef(mod)}, the default, but if this is not the case then the value of this argument should be the estimates of the coefficients in the linear predictor. The functions in the \pkg{effects} package do not use estimates of random effects.
\item[\code{vcov}] In many cases the estimated covariance matrix of the (fixed-effect) coefficient estimates is  returned by \code{vcov(mod)}, the default, but if this is not the case then the value of this argument should be the estimated covariance matrix of the (fixed-effect) coefficient estimates in the linear predictor.
\item[method] This argument is used only for methods that use effects graphics based on the \code{polr} function, where the argument \code{method} is the name of a link function; see \code{help(polr)} for a list of the accepted links, and see Section~\ref{clm} below for an example.
\end{description}
The only non-default argument in \code{sources} is the modification of the \code{call} to omit weights in the call to \code{gls}.  Without this change, there is no need to have written the \code{Effect.gls} method, as the default method would have worked. 

<<fig.height=4,fig.width=8>>=
library(effects)
require(nlme)
g <- gls(Employed ~ GNP + Population,
         correlation=corAR1(form= ~ Year), data=longley)
plot(predictorEffects(g))
@

\section{Mixed Effects with \code{lme} (\pkg{nlme} package)}
The \code{lme} function in the \pkg{nlme} package \citep{nlme} fits linear mixed models.  The required function for fitted objects from this function is included in the \pkg{effects} package.  It is given by
<<>>=
print(Effect.lme)
@
The \code{formula}, \code{coefficients} and \code{vcov} arguments are set to non-default values.  The other arguments are automatically set to default values.  

<<>>=
data(Orthodont, package="nlme")
m1 <- nlme::lme(distance ~ age + Sex, data=Orthodont, 
                random= ~ 1 | Subject)
as.data.frame(Effect("age", m1))
@

\section{Mixed Effects with the \code{lmer} (\code{lme4} package)}
The \code{lme4} package \citep{Bates15} fits linear and generalized linear mixed effects models with the \code{lmer} and \code{glmer} functions, respectively.  The same \code{Effect} function can be used for \code{lmer} and \code{glmer} models.  

The following method is a little more complicated because it contains an additional argument \code{KR} to determine if the Kenward-Roger coefficient covariance matrix is to be used to compute effect standard errors.  The default is \code{FALSE} because the computation is very slow.  If \code{KR = TRUE}, the function also checks if the \pkg{pbkrtest} package is present.

<<>>=
print(Effect.merMod)
@
Because \code{lmer} is an S4 object, the default for \code{call} is \code{mod@call}, and this argument would have been set automatically had we not included it in the above method.  The fixed-effect estimates for an object created by a call to \code{lmer} or \code{glimer} are not returned by \code{coef(mod)}, so the value of \code{coefficients} is the value returned by \code{lme4::fixef(mod)}.  The \code{vcov} estimate contains its estimated variance covariance matrix of the fixed effects.  The Kenward-Roger method is used to estimate the covariance matrix for linear models if the additional argument \code{KR=TRUE}.  The default is \code{KR=FALSE} because The Kenward-Roger estimate requires a long computation; see \code{help(Effect)}. 

The \code{formula} for a mixed-effects model in the \code{lme4} package specifies linear predictors for both the mean function and the variance functions, specified by, for example  \code{(1 + age | Subject)}.  The \code{effects} code will automatically remove any terms like these in any formula, as the effects package only displays the mean function.

<<fig.height=4,fig.width=8>>=
fm2 <- lme4::lmer(distance ~ age + Sex + (1 |Subject), data
                  = Orthodont)
plot(allEffects(fm2))
@

<<>>=
data(cbpp, package="lme4")
gm1 <- lme4::glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial)
as.data.frame(predictorEffect("period", gm1))
@

\section{Robust Linear Mixed Models (\pkg{robustlmm} package)}
The \code{rlmer} function in the \code{robustlmm} package \citep{koller16} fits linear mixed models with a robust estimation method.  As \code{rlmer} closely parallels the \code{lmer} function, an object created by \code{rlmer} is easily used with \code{effects}:
<<>>=
print(Effect.rlmerMod)
@

<<eval=FALSE,fig.height=4, fig.width=8>>=
require(lme4)
fm3 <- robustlmm::rlmer(distance ~ age * Sex + (1 |Subject), 
                        data = Orthodont)
plot(predictorEffects(fm3))
@

\section{Beta Regression}\label{betareg}
The \code{betareg} function in the \code{betareg} package \citep{betareg} fits regressions with a link function but with Beta distributed errors.
<<>>=
print(Effect.betareg)
@
Beta regression has a response $y \in [0,1]$, with the connection between the mean $\mu$ of the Beta and a set for predictors $\x$ through a link function $\x'\bbeta = g(\mu)$.  The variance function for the beta is $\mathrm{var}(y) = \mu(1-\mu)/(1+\phi)$, for a precision parameter $\phi$ estimated by \code{betareg}.

The call to \code{betareg} does not have a family argument, although it does have a link stored in \code{mod\$link\$mean}.  For use with \code{Effect.default}, the method above creates a family from the binomial family generator.  It then adjusts this family by changing from binomial variance to the variance for the beta distribution. Since the \code{glm} function expects a variance that is a function of only one parameter, we fix the value of the precision $\phi$ at its estimator from the \code{betareg} fit, as shown in the method.  We need to replace the \code{initialize} method in the family to one appropriate for $y \in [0,1]$.  %Finally, although the \code{aic} function is not used for computing effects, it is accessed by the call to \code{glm}.  The \code{aic} function for the binomial depends on named parameters not present in the beta regression, and so we substitute a dummy function for binomial version.

<<echo=FALSE,results='hide', include=FALSE>>=
require(lme4)
@
<<fig.height=4,fig.width=8,cache=FALSE>>=
require(betareg)
require(lme4)
data("GasolineYield", package = "betareg")
gy_logit <- betareg(yield ~ batch + temp, data = GasolineYield)
summary(gy_logit)
plot(predictorEffects(gy_logit))
@

\section{Ordinal Models (\pkg{ordinal} package)}\label{polr}
Proportional odds logit and probit regression models fit with the \code{polr} function in the \code{MASS} package \citep{VenablesRipley02} are supported in the \code{effects} package.  The \code{ordinal} package, \citep{Christensen15} contains three functions that are very similar to \code{polr}.  The \code{clm} and \code{clm2} functions allow more link functions and a number of other generalizations.  The \code{clmm} function allows including random effects.

\subsection{\code{clm}}\label{clm}
<<>>=
print(Effect.clm)
@

This method first checks that the \code{MASS} package is available.  Since the \code{clm} function allows suppressing the computation of the Hessian, the function checks and computes it if needed to get the estimated covariance matrix.  The \code{clm} function orders the parameters in the order (threshold parameters, linear predictor parameters), so the next few lines identify the elements of  \code{vcov} that are needed by \code{Effects}.  Since the \code{polr} function does not allow thresholds other than \code{flexible}, we don't allow them either. The \code{polr} argument \code{method} is equivalent to the \code{clm} argument \code{link}, except that the \code{clm} link \code{"logit"} is equivalen to the \code{polr} method \code{"logit"logistic"}.  

<<echo=FALSE,results='hide', include=FALSE>>=
require(ordinal)
require(MASS)
@
<<fig.height=6,fig.width=6>>=
require(ordinal)
require(MASS)
mod.wvs1 <- clm(poverty ~ gender + religion + degree + country*poly(age,3),
    data=WVS)
plot(Effect(c("country", "age"), mod.wvs1), 
     lines=list(multiline=TRUE), layout=c(2, 2))
@

\subsection{\code{clm2}}
Although the fitted madels are similar, syntax for \code{clm2} is not the same as \code{clm}, so a separate method is required.

<<>>=
print(Effect.clm2)
@

<<fig.height=6,fig.width=8>>=
v2 <- clm2(poverty ~ gender + religion + degree + country*poly(age,3),data=WVS)
plot(emod2 <- Effect(c("country", "age"), v2), 
     lines=list(multiline=TRUE), layout=c(2,2))
@

\subsection{\code{clmm}}
This function allows for random effects in an ordinal model.
<<>>=
print(Effect.clmm)
@
The first few lines of the method check for the presence of the \code{MASS} package that is needed to use \code{polr}, makes sure the link used is supported by \code{polr}, and requires that the argument \code{threshold} has its default value.  The \code{polr} and \code{clmm} functions store the fixed effects estimates of regression and threshold coefficents in different orders, so the next few lines rearrange the variance matrix to match the order that \code{polr} uses.

<<fig.height=4,fig.width=4,cache=FALSE>>=
require(ordinal)
require(MASS)
mm1 <- clmm(SURENESS ~ PROD + (1|RESP) + (1|RESP:PROD), 
        data = soup, link = "logit", threshold = "flexible")
plot(Effect("PROD", mm1),lines=list(multiline=TRUE))
@

\subsection{Others}
The \code{poLCA} function in the \code{poLCA} package \citep{Linzer11} fits polytomous variable latent class models, which uses the multinomial effects plots.

The \code{svyglm} function in the \code{survey} package \citep{Lumley04, Lumley16} fits generalized linear models using survey weights.

The \code{lm} function can also be used to create a multivariate linear model.  The \code{Effect.mlm} function, with slightly different syntax, will drow effects plots for these models, with separate plots of each response.

<<fig.height=6,fig.width=6>>=
data(Baumann, package="carData")
b1 <- lm(cbind(post.test.1, post.test.2, post.test.3) ~ group + 
    pretest.1 + pretest.2, data = Baumann)
plot(Effect("group", b1))
@




\bibliography{adding-Effect-methods}
\end{document}