% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/plot_kfold_cv.R
\name{plot_kfold_cv}
\alias{plot_kfold_cv}
\title{Plot model fit from k-fold cross-validation}
\usage{
plot_kfold_cv(data, formula, k = 5, fit)
}
\arguments{
\item{data}{A data frame, which is split into \code{k} training-test pairs.}
\item{formula}{A model formula, used to fit linear models (\code{\link[stats]{lm}})
on each of the \code{k} training data sets. To cross-validate a model other
than a linear model, pass a fitted model object to \code{fit} instead.
If \code{fit} is not missing, \code{formula} is ignored.}
\item{k}{Number of folds.}
\item{fit}{A fitted model object, which is used to compute the cross-validation. If
\code{fit} is not missing, \code{formula} is ignored. Currently,
only linear, Poisson and negative binomial regression models are supported.}
}
\description{
This function plots the aggregated residuals of k-fold cross-validated
models against the outcome. The plot shows whether the model tends to
over- or underestimate the outcome.
}
\details{
This function first generates \code{k} cross-validated training-test
pairs and fits the same model, specified in the \code{formula}- or
\code{fit}-argument, on each training data set. \cr \cr
Then, each fitted model predicts the outcome for its corresponding test
data, and the residuals from all test data are plotted against the
observed values (outcome) from the test data (note: for Poisson or
negative binomial models, the deviance residuals are calculated). This
plot can be used to validate the model and to see whether it
over- (residuals > 0) or underestimates (residuals < 0) the outcome.
}
\note{
Currently, only linear, Poisson and negative binomial regression models are supported.
}
\examples{
data(efc)
plot_kfold_cv(efc, neg_c_7 ~ e42dep + c172code + c12hour)
plot_kfold_cv(mtcars, mpg ~ .)

# for Poisson models, fit the model first and pass it via the 'fit'-argument
fit <- glm(tot_sc_e ~ neg_c_7 + c172code, data = efc, family = poisson)
plot_kfold_cv(efc, fit = fit)
# and for negative binomial models
fit <- MASS::glm.nb(tot_sc_e ~ neg_c_7 + c172code, data = efc)
plot_kfold_cv(efc, fit = fit)
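
# A minimal base-R sketch of the procedure described in Details, shown for
# illustration only. It is an assumption about the general technique, not
# the internal code of plot_kfold_cv(). The names 'folds' and 'cv_res' are
# hypothetical, and the residual sign used here (observed minus predicted)
# is a choice made for this example that may differ from the function's.
set.seed(123)
k <- 5
# randomly assign each row of mtcars to one of k folds
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))
cv_res <- do.call(rbind, lapply(seq_len(k), function(i) {
  train <- mtcars[folds != i, ]
  test <- mtcars[folds == i, ]
  # fit on the training part, then predict the held-out part
  m <- lm(mpg ~ wt + hp, data = train)
  data.frame(
    observed = test$mpg,
    residual = test$mpg - predict(m, newdata = test)
  )
}))
# residuals from all test folds, plotted against the observed outcome
plot(cv_res$observed, cv_res$residual, xlab = "Observed mpg", ylab = "Residual")
abline(h = 0, lty = 2)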
}