Using DHARMa to check Bayesian models fitted with brms
UPDATE 26 October 2022: There is now a DHARMa.helpers package that facilitates checking Bayesian brms
models with DHARMa. Check it out!
The R package DHARMa is incredibly useful to check many different kinds of statistical models. It can be used with Bayesian models too, although it requires a few more lines of code.
Here I develop an example using DHARMa to check a Bayesian hierarchical generalised linear model fitted with the also fantastic brms package.
Example dataset
First, let’s simulate some count data with a hierarchical structure (random effect) and overdispersion:
library(DHARMa)
testdata <- createData(sampleSize = 1000,
intercept = 0,
fixedEffects = 1,
numGroups = 20,
family = poisson(),
overdispersion = 0.5)
head(testdata)
## ID observedResponse Environment1 group time x y
## 1 1 1 -0.68065204 1 1 0.2875775 0.2736227
## 2 2 0 -0.71096830 1 2 0.7883051 0.5938669
## 3 3 1 -0.70163922 1 3 0.4089769 0.1601848
## 4 4 1 0.02886852 1 4 0.8830174 0.8534302
## 5 5 0 -0.01434539 1 5 0.9404673 0.8477392
## 6 6 2 0.23268554 1 6 0.0455565 0.4778868
We have simulated counts (observedResponse
) in 20 sites as a function of an environmental variable (Environment1
):
library(ggplot2)
ggplot(testdata) +
geom_point(aes(Environment1, observedResponse)) +
facet_wrap(~group)
Fitting the model with brms
First, we try a non-hierarchical Poisson GLM of observed counts as a function of environment. For simplicity, using default priors here.
library(brms)
model <- brm(observedResponse ~ Environment1,
data = testdata,
family = poisson(),
cores = 4)
summary(model)
## Family: poisson
## Links: mu = log
## Formula: observedResponse ~ Environment1
## Data: testdata (Number of observations: 1000)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.69 0.03 0.64 0.74 1.00 1597 1974
## Environment1 1.05 0.04 0.97 1.14 1.00 1707 2047
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Model checking with DHARMa
To use DHARMa with brms
(and other Bayesian packages), we must use the createDHARMa
function, which takes the posterior simulations and the observed response to generate a DHARMa object, as simulateResiduals
do with other supported model types.
model.check <- createDHARMa(
simulatedResponse = t(posterior_predict(model)),
observedResponse = testdata$observedResponse,
fittedPredictedResponse = apply(t(posterior_epred(model)), 1, mean),
integerResponse = TRUE)
plot(model.check)
We can see there are problems with this model, which we will try to fix below. The posterior predictive check from bayesplot
confirms the model is not adequate:
pp_check(model, nsamples = 100) + xlim(0, 20)
Automating DHARMa checks for brms models
To simplify the code, let’s write a function to automate the creation and examination of DHARMa objects from fitted brms models:
check_brms <- function(model, # brms model
integer = FALSE, # integer response? (TRUE/FALSE)
plot = TRUE, # make plot?
... # further arguments for DHARMa::plotResiduals
) {
mdata <- brms::standata(model)
if (!"Y" %in% names(mdata))
stop("Cannot extract the required information from this brms model")
dharma.obj <- DHARMa::createDHARMa(
simulatedResponse = t(brms::posterior_predict(model, ndraws = 1000)),
observedResponse = mdata$Y,
fittedPredictedResponse = apply(
t(brms::posterior_epred(model, ndraws = 1000, re.form = NA)),
1,
mean),
integerResponse = integer)
if (isTRUE(plot)) {
plot(dharma.obj, ...)
}
invisible(dharma.obj)
}
Note this function is used here for convenience and will not work with all brms models (e.g. models with more than one response).
Improving the model
Now, let’s try to fix the model. The DHARMa plot above shows that our model is not accounting for potential differences among groups (sites), and there is likely overdispersion (see DHARMa vignette for detailed explanations of how to interpret these plots).
Let’s start adding a varying intercept (random effect) to account for sites/groups:
model2 <- brm(observedResponse ~ Environment1 + (1 | group),
data = testdata,
family = poisson(),
cores = 4,
iter = 4000)
summary(model2)
## Family: poisson
## Links: mu = log
## Formula: observedResponse ~ Environment1 + (1 | group)
## Data: testdata (Number of observations: 1000)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~group (Number of levels: 20)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.95 0.18 0.68 1.37 1.00 877 1944
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.33 0.21 -0.10 0.75 1.01 781 1209
## Environment1 1.05 0.04 0.97 1.12 1.00 3923 4460
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check with DHARMa using the function we just created:
model2.check <- check_brms(model2, integer = TRUE)
There are still problems, but the residuals show a better pattern now. They are now quite homogeneous among sites:
plot(model2.check, form = testdata$group)
But there are still problems. One of them seems to be overdispersion:
testDispersion(model2.check)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 1.1758, p-value = 0.074
## alternative hypothesis: two.sided
Improving the model further
Let’s add an observation-level random effect to try to account for overdispersion:
testdata$obs.id <- 1:nrow(testdata)
model3 <- brm(observedResponse ~ Environment1 + (1 | group) + (1 | obs.id),
data = testdata,
family = poisson(),
cores = 4,
iter = 4000)
summary(model3)
## Family: poisson
## Links: mu = log
## Formula: observedResponse ~ Environment1 + (1 | group) + (1 | obs.id)
## Data: testdata (Number of observations: 1000)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~group (Number of levels: 20)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.97 0.18 0.69 1.39 1.00 1552 2831
##
## ~obs.id (Number of levels: 1000)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.45 0.04 0.38 0.52 1.00 3113 4371
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.22 0.21 -0.20 0.63 1.00 854 1776
## Environment1 1.07 0.05 0.97 1.18 1.00 6430 6334
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check:
model3.check <- check_brms(model3, integer = TRUE)
There are still some warnings, although the observation-level random effect seems to have corrected for overdispersion:
testDispersion(model3.check)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.96409, p-value = 0.732
## alternative hypothesis: two.sided
Improve the model even further
Let’s try using a negative binomial distribution rather than a Poisson:
model4 <- brm(observedResponse ~ Environment1 + (1 | group),
data = testdata,
family = negbinomial(),
cores = 4,
iter = 4000)
summary(model4)
## Family: negbinomial
## Links: mu = log; shape = identity
## Formula: observedResponse ~ Environment1 + (1 | group)
## Data: testdata (Number of observations: 1000)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~group (Number of levels: 20)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.95 0.17 0.69 1.37 1.00 1098 1744
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.31 0.22 -0.12 0.74 1.00 885 1613
## Environment1 1.07 0.05 0.97 1.17 1.00 4924 4697
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 4.81 0.75 3.57 6.45 1.00 5178 4762
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check:
model4.check <- check_brms(model4, integer = TRUE)
plot(model4.check, form = testdata$Environment1)
plot(model4.check, form = testdata$group)
The negative binomial model looks better in this case.
Let’s check it with bayesplot
too:
pp_check(model4, nsamples = 100) + xlim(0, 20)
library(bayesplot)
ppc_rootogram(y = testdata$observedResponse,
yrep = posterior_predict(model4, nsamples = 1000)) +
xlim(0, 20)
Conclusion
DHARMa can greatly facilitate thorough checking and improvement of Bayesian models, as those fitted with brms
.
Prof. Florian Hartig kindly revised a draft of this post.