Pool causal estimates across multiply-imputed datasets
Description
Fits a causal model with causat() and computes a causal contrast() on every completed dataset stored in a mice mids object, then pools the per-imputation estimates into a single causatr_result. This is the analysis step of a multiple-imputation (MI) workflow: the user imputes missing covariates and/or treatment upstream with mice::mice(), and causat_mice() propagates the imputation uncertainty into the causal estimate and its standard error.
Multiple imputation is the right tool for missing covariates (L) or missing treatment (A) under a missing-at-random mechanism. Missing outcomes (Y) are handled by inverse-probability-of-censoring weighting (ipcw = TRUE) or complete-case analysis, not by imputing Y. Even so, Y should be included as a predictor in the upstream imputation model.
Usage
causat_mice(
imp,
outcome,
treatment,
confounders = NULL,
interventions = NULL,
estimator = "gcomp",
family = "gaussian",
estimand = "ATE",
type = "difference",
ci_method = "sandwich",
conf_level = 0.95,
by = NULL,
pool_method = c("rubin", "boot_mi"),
B = 200L,
M = 2L,
parallel = c("no", "future"),
seed = NULL,
...
)
Arguments
imp
A mids object returned by mice::mice().
outcome
Character scalar naming the outcome column. Passed to causat().
treatment
Character scalar (or vector for multivariate treatment) naming the treatment column(s). Passed to causat().
confounders
A one-sided formula of baseline confounders, or NULL when per-component formulas are supplied through the ... argument. Passed to causat().
interventions
A named list of intervention objects (e.g. list(a1 = static(1), a0 = static(0))). Passed to contrast(). Leave NULL for estimator = “snm”, whose estimand is the blip parameter itself and which rejects an interventions argument.
estimator
Character causal estimator: “gcomp” (default), “ipw”, “aipw”, “matching”, or “snm”. Passed to causat().
family
Outcome family (character or family object). Passed to causat().
estimand
Character estimand (“ATE”, “ATT”, “ATC”). Passed to causat().
type
Character contrast scale: “difference” (default), “ratio”, or “or”. Passed to contrast().
ci_method
Character within-imputation variance method, “sandwich” (default) or “bootstrap”, used for each per-imputation contrast() call. The pooled variance is governed by pool_method, not this argument.
conf_level
Numeric confidence level for the pooled intervals. Default 0.95.
by
Optional one-sided formula or character naming a baseline stratifier. Pooling is applied per by-stratum row independently. Passed to contrast().
pool_method
Character pooling strategy. “rubin” (default) applies Rubin’s rules to the per-imputation sandwich variances. “boot_mi” uses von Hippel’s bootstrap-then-impute two-stage variance, valid under uncongeniality. See Details.
B
Integer number of bootstrap resamples for pool_method = “boot_mi”. Default 200. Ignored for “rubin”.
M
Integer number of imputations per bootstrap resample for pool_method = “boot_mi”. Default 2 (von Hippel’s efficient variant). Ignored for “rubin”.
parallel
Character parallel backend forwarded to the Boot MI engine: “no” (default) or “future” (uses future.apply::future_lapply()).
seed
Optional integer seed. For pool_method = “boot_mi” it seeds the bootstrap-and-impute loop reproducibly.
...
Additional arguments forwarded to causat() (e.g. id, time, confounders_tv, censoring, ipcw, confounders_outcome, propensity_model_fn, model_fn).
Details
Rubin’s rules (pool_method = “rubin”)
Let \(\hat{Q}_i\) and \(U_i\) be the estimate and variance from imputation \(i\), for \(i = 1, \dots, m\). The pooled estimate is \(\bar{Q} = m^{-1}\sum_i \hat{Q}_i\) and the total variance is \(T = \bar{U} + (1 + 1/m) B\) with within variance \(\bar{U} = m^{-1}\sum_i U_i\) and between variance \(B = (m-1)^{-1}\sum_i (\hat{Q}_i - \bar{Q})^2\). Confidence intervals use Barnard-Rubin degrees of freedom.
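The formulas above can be sketched in base R for a single scalar estimand. This is an illustration under stated assumptions, not the package's internal code: rubin_pool() and df_com are hypothetical names, the input numbers are made up, and the complete-data degrees of freedom (here n minus three coefficients) is an assumption, since this page does not say how causat_mice() chooses it. The degrees-of-freedom step follows the Barnard-Rubin small-sample adjustment.

```r
# Sketch of Rubin's rules for one scalar estimand (base R only).
# `rubin_pool` and `df_com` are hypothetical names used for illustration.
rubin_pool <- function(q_hat, u, df_com, conf_level = 0.95) {
  m     <- length(q_hat)
  q_bar <- mean(q_hat)               # pooled estimate, Q-bar
  u_bar <- mean(u)                   # within-imputation variance, U-bar
  b     <- var(q_hat)                # between-imputation variance, divisor m - 1
  t_var <- u_bar + (1 + 1 / m) * b   # total variance, T

  # Barnard-Rubin adjusted degrees of freedom (assumes b > 0).
  lambda <- (1 + 1 / m) * b / t_var
  df_old <- (m - 1) / lambda^2
  df_obs <- (df_com + 1) / (df_com + 3) * df_com * (1 - lambda)
  df     <- df_old * df_obs / (df_old + df_obs)

  half <- qt(1 - (1 - conf_level) / 2, df) * sqrt(t_var)
  c(estimate = q_bar, se = sqrt(t_var), df = df,
    ci_lower = q_bar - half, ci_upper = q_bar + half)
}

# Made-up per-imputation estimates and variances for m = 5 imputations.
q_hat <- c(2.9, 3.1, 3.0, 3.2, 2.8)
u     <- c(0.030, 0.028, 0.031, 0.029, 0.032)

# Assumed complete-data df: n = 400 minus 3 outcome-model coefficients.
pooled <- rubin_pool(q_hat, u, df_com = 400 - 3)
print(round(pooled, 3))
```

With these inputs the pooled estimate is 3.0, the within variance is 0.030, the between variance is 0.025, and the total variance is 0.030 + 1.2 * 0.025 = 0.060, so the between-imputation component accounts for half of the pooled variance.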
Congeniality
Causal estimands are typically uncongenial with the mice imputation model (the estimand is a functional of the outcome/treatment model under intervention, not a parameter of the imputation model). Under uncongeniality Rubin’s variance can be biased in either direction depending on the situation – conservative for some kinds of uncongeniality (Meng 1994), but anticonservative in others (Bartlett & Hughes 2020). pool_method = “boot_mi” sidesteps Rubin’s variance decomposition with a resampling variance that attains nominal coverage provided the point estimator stays consistent: a bootstrap corrects the variance, not bias in the estimate itself. Always include the outcome, treatment, all confounders, and any effect modifiers as predictors in the upstream mice() call – omitting a key predictor (e.g. the outcome) misspecifies the imputation, which can bias the causal estimate and so defeat both pooling rules. causat_mice() warns when an analysis variable is absent or unused.
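The advice to keep the outcome and treatment in the imputation model can be checked before any causal model is fitted. The sketch below uses simulated data with illustrative variable names and relies only on mice's predictor matrix (rows are the variables being imputed, columns are their predictors). It is one possible check, not a description of the warning logic inside causat_mice().

```r
# Sketch: confirm the imputation model for L uses the outcome and treatment.
# Simulated data and variable names are illustrative, not from a real study.
if (requireNamespace("mice", quietly = TRUE)) {
  set.seed(1)
  n <- 400
  L <- rnorm(n)
  A <- rbinom(n, 1, plogis(0.5 * L))
  Y <- 2 + 3 * A + 1.5 * L + rnorm(n)

  # Make L missing with a probability that depends on the observed treatment.
  L[rbinom(n, 1, plogis(-1 + 0.8 * A)) == 1] <- NA
  dat <- data.frame(Y = Y, A = A, L = L)

  # Default predictor matrix: every other column predicts each variable.
  pred <- mice::make.predictorMatrix(dat)
  stopifnot(all(pred["L", c("Y", "A")] == 1))

  imp <- mice::mice(dat, m = 5, predictorMatrix = pred, printFlag = FALSE)

  # Inspect what mice actually used; logged events record dropped predictors.
  print(imp$predictorMatrix["L", ])
  print(imp$loggedEvents)
}
```

If a custom predictorMatrix or a quickpred-style screening step zeroes out the Y or A column in the row for an imputed confounder, that is the situation the paragraph above describes as misspecifying the imputation.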
What this does not do
It does not perform the imputation (call mice::mice() first), impute the outcome, handle MNAR mechanisms, or pool omnibus tests across contrasts.
Value
A causatr_result with pooled estimates, standard errors, and confidence intervals. The result's ci_method field is “rubin” or “boot_mi”, according to the pool_method used. The per-row pooling diagnostics are attached as the “mi_details” attribute.
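As a hedged usage sketch, the call below requests the “boot_mi” pooling rule and then reads the diagnostics attribute with base R's attr(). Only arguments listed under Usage are passed. The simulated data, the reduced B = 50 (chosen to keep the run short), and the seed value are illustrative assumptions, and the internal layout of “mi_details” is not documented on this page, so it is inspected with str() rather than indexed by name.

```r
# Sketch: boot_mi pooling and access to the "mi_details" attribute.
# Data, B = 50, and the seed are illustrative choices, not recommendations.
if (requireNamespace("causatr", quietly = TRUE) &&
    requireNamespace("mice", quietly = TRUE)) {
  set.seed(1)
  n <- 400
  L <- rnorm(n)
  A <- rbinom(n, 1, plogis(0.5 * L))
  Y <- 2 + 3 * A + 1.5 * L + rnorm(n)
  L[rbinom(n, 1, plogis(-1 + 0.8 * A)) == 1] <- NA
  dat <- data.frame(Y = Y, A = A, L = L)

  imp <- mice::mice(dat, m = 5, printFlag = FALSE)

  res_boot <- causatr::causat_mice(
    imp,
    outcome = "Y",
    treatment = "A",
    confounders = ~L,
    interventions = list(a1 = causatr::static(1), a0 = causatr::static(0)),
    estimator = "gcomp",
    pool_method = "boot_mi",
    B = 50L,      # fewer resamples than the default 200, for speed only
    M = 2L,       # imputations per bootstrap resample (the default)
    seed = 2024
  )

  print(res_boot)

  # Pooling diagnostics; structure not documented here, so just inspect it.
  str(attr(res_boot, "mi_details"))
}
```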
References
Rubin DB (1987). Multiple Imputation for Nonresponse in Surveys. Wiley.
van Buuren S, Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software 45(3):1-67.
von Hippel PT (2020). How many imputations do you need? Sociological Methods & Research 49(3):699-718.
Meng XL (1994). Multiple-imputation inferences with uncongenial sources of input. Statistical Science 9(4):538-558.
Bartlett JW, Hughes RA (2020). Bootstrap inference for multiple imputation under uncongeniality and misspecification. Statistical Methods in Medical Research 29(12):3533-3546.
See Also
causat(), contrast()
Examples
library("causatr")

if (requireNamespace("mice", quietly = TRUE)) {
  set.seed(1)
  n <- 400
  L <- rnorm(n)
  A <- rbinom(n, 1, plogis(0.5 * L))
  Y <- 2 + 3 * A + 1.5 * L + rnorm(n)

  # L missing-at-random on the (observed) treatment.
  L[rbinom(n, 1, plogis(-1 + 0.8 * A)) == 0] <- NA
  dat <- data.frame(Y = Y, A = A, L = L)

  imp <- mice::mice(dat, m = 5, printFlag = FALSE)
  res <- causat_mice(
    imp,
    outcome = "Y",
    treatment = "A",
    confounders = ~L,
    interventions = list(a1 = static(1), a0 = static(0)),
    estimator = "gcomp"
  )
  summary(res)
}
<causatr_result>
Estimator: G-computation
Estimand: ATE
Contrast: Difference
CI method: rubin
N: 400
Intervention means:
   intervention estimate    se ci_lower ci_upper
         <char>    <num> <num>    <num>    <num>
1:           a1     5.01 0.129     4.75     5.27
2:           a0     2.01 0.124     1.77     2.26
Contrasts:
   comparison estimate    se ci_lower ci_upper
       <char>    <num> <num>    <num>    <num>
1:   a0 vs a1       -3 0.171    -3.37    -2.62
Intervention details:
a1: static, value = 1
a0: static, value = 0
Variance-covariance matrix of marginal means:
          [,1]      [,2]
[1,] 0.0166584 0.0000000
[2,] 0.0000000 0.0153196