This vignette displays many of the models that are covered by the
package tests. It also illustrates many of the features of
{equatiomatic}, including
Unconditional models
A basic two-level unconditional model:
um_hsb <- lmer(math ~ 1 + (1 | sch.id), data = hsb)
extract_eq(um_hsb)
And a model with multiple levels:
um_long3 <- lmer(score ~ 1 + (1 | sid) + (1 | school) + (1 | district),
data = sim_longitudinal
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(um_long3)
Models with predictors at level 1
lev1_hsb <- lmer(math ~ female + ses + minority + (1 | sch.id), hsb)
extract_eq(lev1_hsb)
Note that in the above, the mean structure at level 1 is broken out
into a separate line. You can override this with the
mean_separate
argument.
Similarly, just like with standard lm
models, you can
specify wrapping, and how many terms per line
extract_eq(lev1_hsb, wrap = TRUE, terms_per_line = 2)
And one more example with multiple levels
lev1_long <- lmer(score ~ wave + (1 | sid) + (1 | school) + (1 | district),
data = sim_longitudinal
)
extract_eq(lev1_long)
Models with unstructured variance-covariance specifications
This should generally work regardless of the complexity.
hsb1 <- lmer(
math ~ female + ses + minority + (minority | sch.id),
hsb
)
extract_eq(hsb1)
Notice that it correctly parses which variable is randomly varying
here. We can also make it more complicated.
hsb2 <- lmer(
math ~ female + ses + minority + (female + ses | sch.id),
hsb
)
extract_eq(hsb2)
And even really complicated. Note the model below gives a
warning.
hsb3 <- lmer(
math ~ female * ses + minority +
(ses * female + minority | sch.id),
hsb
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(hsb3)
Group level predictors
In the sim_longitudinal
data that comes with the
package, the only level 1 predictor is wave
. The
group
and treatement
variables are at the
student level (level 2) and prop_low
is at the school
level. Let’s also add a district level variable (just the average score
for each district).
# calculate district means
dist_mean <- tapply(
sim_longitudinal$score,
sim_longitudinal$district,
mean
)
# put them in a df to merge
dist_mean <- data.frame(
district = names(dist_mean),
dist_mean = dist_mean
)
# create a new df with dist_mean added
d <- merge(sim_longitudinal, dist_mean, by = "district")
Now we can fit a model that should have predictors at every level.
We’ll allow wave
to vary randomly at each level too.
group_preds_m1 <- lmer(score ~ wave + group + treatment + prop_low + dist_mean +
(wave | sid) + (wave | school) + (wave | district),
data = d
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(group_preds_m1)
Interactions
Interactions with multilevel models can be tricky because they can be
within or across levels, and the notation changes depending on whether
the random effect for the lower level (in the interaction term) is
specified as varying randomly within the given level. Luckily,
{equatiomatic} handles all of this for you.
First, let’s look at a model with interactions only at level 1.
l1_hsb_int <- lmer(math ~ minority * female + (1 | sch.id),
data = hsb
)
extract_eq(l1_hsb_int)
And now an interaction at only level 2
l2_hsb_int <- lmer(math ~ meanses * sector + (1 | sch.id),
data = hsb
)
extract_eq(l2_hsb_int)
But more complicated are cross level interactions. Here’s a quick
example.
cl_long1 <- lmer(score ~ wave * treatment + (wave | sid) + (1 | school) +
(1 | district),
data = sim_longitudinal
)
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 0.00277081 (tol = 0.002, component 1)
extract_eq(cl_long1)
Note that the treatement
variable is shown as a
predictor of the level 1 intercept and the level 1 slope. But if the
slope is not randomly varying at this level, then the notation has to
change.
cl_long2 <- lmer(score ~ wave * treatment + (1 | sid) + (1 | school) +
(1 | district),
data = sim_longitudinal
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(cl_long2)
This works even for really complicated models, including three-way
interactions that contain a cross-level interaction. For example
cl_long3 <- lmer(
score ~ wave * group * treatment + wave * prop_low * treatment +
(wave | sid) + (wave | school) +
(wave + treatment | district),
sim_longitudinal
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(cl_long3)
Alternative variance-covariance specifications
Finally, there is some support for alternative variance-covariance
specifications. For example, you may want to specify a model with only
the variance terms estimated, and not the covariances.
hsb_varsonly <- lmer(math ~ minority * female + (minority * female || sch.id),
data = hsb
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(hsb_varsonly)
Or maybe you want to group by the same thing multiple times. In this
exact model I don’t think this makes any sense, but there are cases
where it can. Note that, again, this model produces a warning.
hsb_doublegroup <- lmer(math ~ minority * female +
(minority | sch.id) + (female | sch.id),
data = hsb
)
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 0.00319858 (tol = 0.002, component 1)
extract_eq(hsb_doublegroup)
And finally, you can have a mix of different things and it should
generally still work.
long_mixed_ranef <- lmer(
score ~ wave +
(wave || sid) + (wave | school) + (1 | school) + (wave || district),
sim_longitudinal
)
#> boundary (singular) fit: see help('isSingular')
extract_eq(long_mixed_ranef)
With that said, this is the part of the code that I would consider
the most “experimental” at this point, so please do reach out if you
have issues or run into use cases that are not supported.
Using estimated coefficients
Because of the multilevel nature of the notation used by
{equatiomatic}, tracking the estimated coefficients can
at times be a little difficult. To help with this, we keep the Greek
notation with the estimated coefficients as subscripts. For example,
let’s extract the equation from our model with group-level
predictors.
Future developments
There are still a few things I’m planning to implement that I haven’t
quite gotten around to yet. These include
- Implementation of other features currently included in the
lm
extraction, including intercept
, and
raw_tex
- Inevitable bug fixes
Additionally, the Greek notation as subscripts may not always be
desirable, particularly if the model is fairly simple. Future
developments will likely make this optional.
The range of models that you can fit with lme4::lmer
is
huge, but I hope this will handle a wide range of models. I also
recognize that some users may want a different notation for the models.
At this point, I’m not planning to implement other notations, but that
could change if there’s enough demand for it. I’m open to any/all
suggestions for how to improve the package, and would love pull
requests to help support the package, big or small.