1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
|
---
title: "Regularized MIMIC"
author: "Joshua Pritikin and Ross Jacobucci and Timothy R. Brick"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Regularized MIMIC}
%\VignetteEngine{knitr::rmarkdown}
%\usepackage[utf8]{inputenc}
---
```{r, include = FALSE}
is_CRAN <- !identical(Sys.getenv("NOT_CRAN"), "true")
if (!is_CRAN) {
options(mc.cores = parallel::detectCores())
} else {
knitr::opts_chunk$set(eval = FALSE)
knitr::knit_hooks$set(evaluate.inline = function(x, envir) x)
}
```
# Regularized MIMIC model
This example uses the immortal Holzinger Swineford data set.
```{r}
library(OpenMx)
data(HS.ability.data)
```
The OpenMx model looks like this:
```{r}
HS.ability.data$ageym <- HS.ability.data$agey*12 + HS.ability.data$agem
HS.ability.data$male <- as.numeric(HS.ability.data$Gender == 'Male')
# Specify variables
indicators <- c('visual','cubes','paper','flags','paperrev','flagssub',
'general','paragrap','sentence','wordc','wordm')
covariates <- c("male","ageym","grade")
latents = c("g", covariates)
# Build the model
mimicModel <- mxModel(
"MIMIC", type="RAM",
manifestVars = indicators, latentVars = latents,
# Set up exogenous predictors
mxPath("one", covariates, labels=paste0('data.',covariates), free=FALSE),
# Fix factor variance
mxPath('g', arrows=2, free=FALSE, values=1),
# Error variances:
mxPath(from=c(indicators), arrows=2, free=TRUE, values=10),
# Means (saturated means model):
mxPath(from="one", to=indicators, values=rep(5, length(indicators))),
# Loadings:
mxPath(from="g", to=indicators, values=.5),
# Covariate paths
mxPath(covariates, "g", labels=covariates),
# Data
mxData(observed = HS.ability.data, type = "raw"))
# Get some good starting values for regularization. This
# saves 2-3 minutes on my laptop.
mimicModel <- mxRun(mimicModel)
```
Add the penalty:
```{r}
mimicModel <- mxModel(
mimicModel,
mxMatrix('Full',1,1,free=TRUE,values=0,labels="lambda",name="hparam"),
# Set scale to ML estimates for adaptive lasso
mxPenaltyLASSO(what=covariates, name="LASSO",
scale = coef(mimicModel)[covariates],
lambda = 0, lambda.max =2, lambda.step=.04)
)
```
Run the regularization. With only three covariates, the plot of results is not very exciting. We learn that sex is not a good predictor of this factor.
```{r}
regMIMIC <- mxPenaltySearch(mimicModel)
detail <- regMIMIC$compute$steps$PS$output$detail
library(reshape2)
library(ggplot2)
est <- detail[,c(covariates, 'lambda')]
ggplot(melt(est, id.vars = 'lambda')) +
geom_line(aes(x=lambda, y=value, color=variable)) +
geom_vline(aes(xintercept=coef(regMIMIC)['lambda']),
linetype="dashed", alpha=.5)
```
The regularized factor loadings can be found here,
```{r}
detail[detail$EBIC == min(detail$EBIC), covariates]
```
The regularization causes a lot of bias. One way to deal with this is
to fix zerod parameters to zero,
discard the regularization penalty, and re-fit model.
```{r}
regMIMIC <- mxPenaltyZap(regMIMIC)
regMIMIC <- mxRun(regMIMIC)
summary(regMIMIC)
```
|