Fit a causal model

Description

Prepares the causal estimation pipeline for a given estimator. For “gcomp”, fits the conditional outcome model E[Y | A, L] that will be used by contrast() for standardisation. For “ipw”, fits the conditional treatment density \(f(A \mid L)\) that the self-contained density-ratio engine uses to build per-intervention Hajek weights for the MSM refit inside contrast(). For “matching”, creates matched sets (via MatchIt::matchit()) that will be used for matched estimation in contrast().

For longitudinal data (id and time provided), “gcomp” uses ICE g-computation (Zivich et al., 2024): outcome models are fitted at each time point via backward iteration.

Usage

causat(
  data,
  outcome,
  treatment,
  confounders = NULL,
  confounders_tv = NULL,
  confounders_outcome = NULL,
  confounders_treatment = NULL,
  confounders_censoring = NULL,
  confounders_sampling = NULL,
  confounders_tv_outcome = NULL,
  confounders_tv_treatment = NULL,
  estimator = c("gcomp", "ipw", "aipw", "matching", "snm"),
  family = "gaussian",
  estimand = c("ATE", "ATT", "ATC"),
  id = NULL,
  time = NULL,
  censoring = NULL,
  ipcw = FALSE,
  censoring_model_fn = NULL,
  history = Inf,
  numerator = NULL,
  weights = NULL,
  cluster = NULL,
  type = NULL,
  model_fn = stats::glm,
  propensity_model_fn = NULL,
  propensity_family = NULL,
  stabilize = c("none", "marginal"),
  target = NULL,
  sampling_model_fn = NULL,
  target_subset = c("target", "all"),
  treatment_free = NULL,
  stratified = NULL,
  treatment_form = NULL,
  ...
)

Arguments

data A data frame or data.table.
outcome Character. Name of the outcome variable.
treatment Character scalar or character vector. Name(s) of the treatment variable(s). Pass a character vector for multivariate (joint) treatments, e.g. treatment = c(“A1”, “A2”).
confounders A one-sided formula specifying baseline (time-invariant) confounders, e.g. ~ L1 + L2. Interactions and transformations are allowed, e.g. ~ L1 * L2 + splines::ns(age, 4). For longitudinal data, these confounders are constant within each individual (measured at baseline) and enter every time-step outcome model. Soft-deprecated: prefer the per-component formulas (confounders_outcome, confounders_treatment, etc.) for finer control. When NULL (the new default), at least the per-component formulas required by the chosen estimator must be supplied.
confounders_tv A one-sided formula or NULL. Time-varying confounders for longitudinal data (e.g. ~ CD4 + viral_load). These change over time within individuals and enter the outcome model at each ICE step alongside their lagged values (controlled by history). Ignored for point treatments. If NULL, no time-varying confounders are used. Soft-deprecated: prefer confounders_tv_outcome and confounders_tv_treatment for finer control.
confounders_outcome One-sided formula or NULL. Confounders for the outcome model only. When non-NULL, overrides confounders for the outcome model (g-comp, AIPW outcome side, ICE backward iteration, matching MSM). Falls back to confounders when NULL.
confounders_treatment One-sided formula or NULL. Confounders for the treatment (propensity) model only. When non-NULL, overrides confounders for the propensity model (IPW, AIPW treatment side, matching distance formula). Falls back to confounders when NULL.
confounders_censoring One-sided formula or NULL. Confounders for the censoring model when ipcw = TRUE. Falls back to confounders when NULL.
confounders_sampling One-sided formula or NULL. Confounders for the sampling model \(P(S = 1 \mid L)\) when target is non-NULL. Falls back to confounders when NULL.
confounders_tv_outcome One-sided formula or NULL. Time-varying confounders for the outcome model only (longitudinal data). Falls back to confounders_tv when NULL.
confounders_tv_treatment One-sided formula or NULL. Time-varying confounders for the treatment model only (longitudinal data). Falls back to confounders_tv when NULL.
estimator Character. Causal estimator: “gcomp” (default), “ipw”, “aipw”, “matching”, or “snm”. IPW uses a self-contained density-ratio engine (no runtime dependency on WeightIt); AIPW is doubly-robust (outcome model + propensity weights); matching requires the MatchIt package; SNM fits a structural nested mean model via g-estimation. Note: “matching” is restricted to binary point treatments (MatchIt does not support multi-category or continuous treatments); use “gcomp” or “ipw” for those cases.
family Character or family object. The outcome model family (e.g. “gaussian”, “binomial”, stats::quasibinomial()). Used by all methods: passed to the outcome model for “gcomp”, to the per-intervention weighted MSM refit for “ipw”, and to the outcome model on matched data for “matching”.
estimand Character. The target estimand: “ATE” (default), “ATT”, or “ATC”. “ATT” and “ATC” are only defined for binary point treatments. For “gcomp”, the estimand stored here is used as the default in contrast() but can be overridden there. For “ipw” and “matching”, the estimand is fixed at fitting time because it determines the weights or match direction; it cannot be changed in contrast().
id Character or NULL. Name of the individual ID variable. Must be provided together with time for longitudinal data.
time Character or NULL. Name of the time variable. Must be provided together with id for longitudinal data.
censoring Character or NULL. Name of the censoring indicator variable (1 = censored, 0 = uncensored). Without ipcw = TRUE, this is a row filter only: rows where censoring != 0 are excluded from model fitting but no censoring model is fit. For g-computation with a correctly specified outcome model, row filtering is sufficient under MAR censoring. For IPW under MAR censoring, set ipcw = TRUE or supply pre-computed IPCW weights via weights =. For longitudinal data, censoring is time-varying: C_k = 1 means the individual dropped out at time k.
ipcw

Logical. If TRUE, fit an internal censoring model \(P(C = 0 \mid A, L)\) and compute stabilized IPCW weights that are composed multiplicatively with any external weights. Requires censoring to be specified. Default FALSE preserves the legacy row-filter-only behaviour. For IPW under MAR censoring this is essential; for g-comp it provides doubly-robust protection.

Variance note: For point treatments the sandwich SE fully accounts for censoring-model estimation uncertainty (Channel 2 correction). For longitudinal treatments (ICE, longitudinal IPW) the Channel 2 correction for per-period censoring models is omitted, making the sandwich SE conservative. Use ci_method = “bootstrap” for tighter CIs in longitudinal IPCW settings.
censoring_model_fn Function or NULL. The fitting function for the censoring model when ipcw = TRUE. Must accept (formula, data, family, weights, …). Default NULL uses stats::glm. Ignored when ipcw = FALSE.
history Non-negative integer or Inf. Markov lag order for longitudinal models: how many lagged time points of treatment and time-varying confounders to include in each per-period model. Default Inf (full history). 0 means no lags (current-period TV covariates only). Ignored for point treatments.
numerator A one-sided formula or NULL. Numerator formula for stabilized IPW weights in longitudinal models. Defaults to baseline confounders only (no time-varying confounders), which gives the standard stabilized weights. Only relevant for estimator = “ipw” with longitudinal data.
weights Numeric vector, a survey::svydesign object, or NULL. Pre-computed observation weights (e.g. survey weights or externally computed IPCW). When a survey.design object is supplied, stats::weights() is used to extract the sampling weights and the first-stage cluster (PSU) is auto-propagated into the fit’s cluster slot so the contrast-time sandwich is cluster-robust by default; override by passing an explicit cluster = alongside. For “gcomp”, passed to glm(). For “ipw”, multiplied with the estimated propensity weights.
cluster Character or NULL. Name of a column in data identifying cluster membership (e.g. site, household, PSU id). Stored in the fit and preserved through prepare_data() so that contrast() can default its own cluster = argument to this column, producing a cluster-robust sandwich without the user having to repeat the column name. Users can still override at contrast time by passing cluster = explicitly. Forwarded to matching is not allowed (matching clusters on its own subclass).
type Character or NULL. Whether the data are “point” (single time-point per individual) or “longitudinal” (repeated measures). Default NULL auto-detects: “longitudinal” when both id and time are provided, “point” otherwise. Passing type = “longitudinal” explicitly requires id and time.
model_fn Function. The fitting function for the outcome model in g-computation. Must accept (formula, data, family, weights, …). Default stats::glm; pass mgcv::gam for GAMs, MASS::glm.nb for negative-binomial, etc. For “ipw”, model_fn also fits the placeholder Y ~ A display model and is the default propensity fitter when propensity_model_fn is NULL. Ignored for “matching”.
propensity_model_fn Function or NULL. IPW only. The fitting function for the conditional treatment density \(f(A \mid L)\) used to build density-ratio weights. Must accept the same (formula, data, family, weights, …) signature. NULL (default) reuses model_fn. Pass mgcv::gam for a flexible propensity.
propensity_family Character or NULL. IPW only. Explicit treatment density family: “poisson” or “negbin” for count treatments. NULL (default) auto-detects from the treatment values (bernoulli / gaussian / categorical). Auto-detection never infers count – use this parameter to opt in. For “negbin”, MASS::glm.nb is auto-selected as the propensity fitter unless propensity_model_fn is explicitly provided. Ignored for “gcomp” and “matching”.
stabilize

Character. Multivariate IPW only. Controls the numerator density in each per-component weight factor.

  • “none” (default): numerator conditions on the same covariates as the denominator. Per-component weight is \(f_k(d_k^{-1}(A_k) \mid A_{1..k-1}^{\mathrm{obs}}, L) \cdot |\mathrm{Jac}| / f_k(A_k \mid A_{1..k-1}^{\mathrm{obs}}, L)\).

  • “marginal”: numerator drops \(L\), conditioning only on prior treatments (Robins, Hernán, Brumback 2000). Fits a second per-component density model \(g_k(A_k \mid A_{1..k-1})\) and uses it in the numerator; the denominator stays at the full-L density. This dampens the multiplicative L-dependence across \(K\) factors and typically reduces weight variance substantially on datasets where multiple components have well-predicted propensities.

Ignored for univariate IPW, “gcomp”, and “matching”.
target Character or NULL. Column name of a binary 0/1 sampling indicator S, where S = 1 identifies study-population rows (observed A and Y) and S = 0 identifies target-population rows (only L observed). When non-NULL, enables transportability/generalizability estimation: the sampling model \(P(S = 1 \mid L)\) is fit on all rows and stored on the fit object for use in contrast(). Default NULL produces a study-population estimand with no transportability adjustment. Not supported with estimator = “matching”.
sampling_model_fn Function or NULL. Fitting function for the sampling model \(P(S = 1 \mid L)\) when target is non-NULL. Must accept (formula, data, family, …). Default NULL uses stats::glm with family = binomial(). Ignored when target = NULL.
target_subset Character. Which rows define the target population for transportability/generalizability. “target” (default) restricts the target to S = 0 rows only (transportability: the study is external to the target population). “all” uses all rows S = 0 and S = 1 (generalizability: the study is a biased subsample of the target). Ignored when target = NULL.
treatment_free One-sided formula or NULL. SNM only. Specifies the treatment-free outcome model \(E[Y \mid L]\), a nuisance model whose predictions are subtracted from Y before g-estimation. This projects out the \(L \to Y\) association and reduces the variance of \(\hat\psi\) without changing the point estimate (both approaches are consistent). DTRreg calls this the tf.mod argument. The formula should contain only baseline confounders (no treatment or modifiers). Default NULL uses the standard residualized-treatment moment condition without a treatment-free model. Ignored for non-SNM estimators.
stratified Character or NULL. Name of a baseline (time-invariant) column to stratify the ICE outcome models on. Longitudinal g-computation (ICE) only. When set, a separate outcome / pseudo-outcome model is fitted for each stratum at every backward step instead of a single pooled model – the right choice when the outcome–treatment relationship differs structurally across baseline subgroups (e.g. different functional form by sex). The column must be discrete, complete, and constant within each individual. Both variance paths are available in contrast(): the ID-cluster bootstrap (ci_method = “bootstrap”) and the analytic per-stratum stacked-EE sandwich (ci_method = “sandwich”). Default NULL fits pooled models. Ignored / rejected for point treatments and non-gcomp estimators.
treatment_form One-sided formula or NULL. Controls how the treatment enters the per-step ICE outcome models, e.g. ~ factor(A) (categorical / ordinal dose) or ~ splines::ns(A, df = 3) (smooth dose-response). Longitudinal g-computation (ICE) only, including natural-history MTPs (grace_period() / carry_forward()). The intervention always sets the numeric treatment column; only the model’s design term changes, so a nonlinear or kinked counterfactual response (e.g. a capped dose) is no longer forced through a single linear slope. Lag terms are expanded automatically (factor(A) -> factor(lag1_A)). Every term must reference a treatment column. Under factor(A) an intervention must keep the treatment within observed levels. Default NULL enters the treatment as a bare numeric main effect (the historical behaviour). Rejected for point treatments and non-gcomp estimators.
Additional arguments passed to the underlying estimation function. For estimator = “ipw”, dots are forwarded into the user’s propensity_model_fn via fit_treatment_model() (e.g. smoothing arguments for mgcv::gam). For estimator = “matching”, dots are forwarded into MatchIt::matchit() (e.g. method = “nearest”, ratio = 1, caliper = 0.2).

Details

G-computation (estimator = “gcomp”)

Fits E[Y | A, L] using glm() (or mgcv::gam() if the formula contains s() or te() terms). contrast() standardises over the target population (controlled by estimand) to obtain E[Y^a] under each intervention. For “gcomp”, the estimand can be overridden in contrast() – fit once, contrast with multiple estimands.

For longitudinal data, uses ICE g-computation (Zivich et al., 2024): outcome models are fitted one per time point via backward iteration, conditioning on baseline confounders (confounders), time-varying confounders (confounders_tv), and their lags up to history steps.

IPW (estimator = “ipw”)

Fits a conditional treatment density \(f(A \mid L)\) via propensity_model_fn (defaults to model_fn). Each intervention requested in contrast() builds its own density-ratio weight vector and refits a weighted marginal structural model. Supports binary (via HT or IPSI), continuous (via smooth shift / scale_by), and dynamic-on-binary interventions. threshold() and continuous dynamic() rules are routed to estimator = “gcomp” with a pointer. Categorical treatments are not supported under IPW; estimator = “gcomp” handles them via predict-then-average.

Longitudinal IPW is supported via sequential density-ratio weights (cumulative product across time points) with optional stabilization. Supply id and time to activate.

AIPW (estimator = “aipw”)

Doubly-robust estimation combining the outcome model (as in g-comp) with propensity-score weights (as in IPW). The AIPW estimator is consistent if either the outcome model or the treatment model is correctly specified. Supports binary, continuous, categorical, count, and multivariate treatments (point); longitudinal AIPW is also supported via sequential outcome models and cumulative density-ratio weights.

Matching (estimator = “matching”)

Calls MatchIt::matchit() to create matched sets. The estimand is fixed at fitting time. Only static() interventions are supported in contrast(). Matching is binary-only: MatchIt does not support categorical (k > 2 levels) or continuous treatments, and causatr aborts with a clear error in both cases. Use estimator = “gcomp” or estimator = “ipw” for categorical / continuous treatments. Longitudinal matching is also unsupported.

Estimands

Estimand Population averaged over Applicability
“ATE” All individuals Always
“ATT” Observed treated (A = 1) Binary point treatment only
“ATC” Observed controls (A = 0) Binary point treatment only

For continuous, categorical, multivariate, or longitudinal treatments, use estimand = “ATE” and pass a subset expression to contrast() for subgroup effects.

Identifiability assumptions

All methods assume: (1) exchangeability (no unmeasured confounding given L), (2) positivity (every individual has positive probability of each treatment value given L), (3) consistency (the observed outcome under the observed treatment equals the potential outcome). Positivity is checked automatically and a warning is issued if near-violations are detected.

Missing data handling

Outcome (Y) missing. Rows with NA outcome are excluded from model fitting via get_fit_rows(). If a censoring column is provided, rows with censoring != 0 are also excluded. contrast() predicts on ALL rows (including those with missing Y) and standardizes over the target population. Under MCAR, this complete-case analysis is unbiased. Under MAR (censoring depends on A and/or L), g-computation with a correctly specified outcome model is still consistent because the regression surface E[Y | A, L] is unchanged by the censoring mechanism (Hernan & Robins, Ch. 13). For IPW under MAR, IPCW weights are needed – supply them via weights =. The censoring = parameter is a row filter, not IPCW: no censoring model is fit internally.

Treatment (A) missing. causat() aborts if any treatment values are NA and no censoring column is provided. You cannot define a counterfactual without knowing the treatment value. Remove rows with missing A before calling causat() (unbiased under MCAR), or use multiple imputation (planned via causat_mice()).

Covariates (L) missing. For estimator = “gcomp”, the outcome model’s na.action = na.omit drops rows with NA covariates at fit time. predict() returns NA for rows with missing L, which contrast() excludes from the target population. Under MCAR this is unbiased. For estimator = “ipw”, the propensity model also drops NA-covariate rows; if this creates a row-count mismatch with the outcome mask, causat() aborts with a clear error. For estimator = “matching”, MatchIt handles covariate NAs internally. Under MAR, multiple imputation is needed (planned via causat_mice()).

Value

A causatr_fit object with slots:

model
Fitted model object(s): glm/gam for “gcomp”; a placeholder Y ~ A weighted MSM for “ipw” (the density model lives in details$propensity_model); the matched-data outcome model for “matching”.
data
data.table used for fitting.
treatment, outcome, confounders, confounders_tv, family
Model spec.
estimator
“gcomp”, “ipw”, or “matching”.
type
“point” or “longitudinal”.
estimand
“ATE”, “ATT”, or “ATC”.
id, time, censoring
Longitudinal identifiers.
history
Markov order for longitudinal ICE models.
numerator
Numerator formula for longitudinal IPW.
weights_obj
Legacy slot, always NULL.
match_obj
matchit object (matching only).
call
The original call.
details
Internal diagnostics list.

References

Hernan MA, Robins JM (2025). Causal Inference: What If. Chapman & Hall/CRC. Chapters 12 (IPW), 13 (g-formula), 15 (matching), 21 (ICE).

Zivich PN, Ross RK, Shook-Sa BE, Cole SR, Edwards JK (2024). Empirical sandwich variance estimator for iterated conditional expectation g-computation. Statistics in Medicine 43:5562-5572.

See Also

contrast(), diagnose(), static(), shift(), dynamic()

Examples

library("causatr")

data("nhefs", package = "causatr")

# Point treatment, g-computation (default)
fit <- causat(
  nhefs,
  outcome = "wt82_71",
  treatment = "qsmk",
  confounders = ~ sex + age + race + education +
    smokeintensity + smokeyrs + exercise + active + wt71,
  model_fn = stats::glm
)

# ATT via g-computation (override estimand in contrast())
fit_att <- causat(
  nhefs,
  outcome = "wt82_71",
  treatment = "qsmk",
  confounders = ~ sex + age + race + education +
    smokeintensity + smokeyrs + exercise + active + wt71,
  estimand = "ATT",
  model_fn = stats::glm
)

# IPW -- self-contained density-ratio engine; estimand fixed at
# fit time. A non-default `propensity_model_fn` (e.g. `mgcv::gam`)
# swaps in a smooth propensity model without touching the rest of
# the pipeline.
fit_ipw <- causat(
  nhefs,
  outcome = "wt82_71",
  treatment = "qsmk",
  confounders = ~ sex + age + race + education +
    smokeintensity + smokeyrs + exercise + active + wt71,
  estimator = "ipw",
  estimand = "ATE",
  model_fn = stats::glm,
  propensity_model_fn = stats::glm
)

# Matching (requires MatchIt). `method = "nearest"` is MatchIt's own arg.
fit_match <- causat(
  nhefs,
  outcome = "wt82_71",
  treatment = "qsmk",
  confounders = ~ sex + age + race + education +
    smokeintensity + smokeyrs + exercise + active + wt71,
  estimator = "matching",
  estimand = "ATT",
  method = "nearest"
)

# Longitudinal with time-varying confounders
fit_long <- causat(
  data_long,
  outcome = "Y",
  treatment = "A",
  confounders = ~ sex + race + baseline_age,
  confounders_tv = ~ CD4 + viral_load,
  id = "id",
  time = "time",
  history = Inf,
  model_fn = stats::glm
)

# Multivariate treatment
fit_multi <- causat(
  data,
  outcome = "Y",
  treatment = c("A1", "A2"),
  confounders = ~ L1 + L2,
  model_fn = stats::glm
)

# Negative binomial outcome (count data with overdispersion)
fit_nb <- causat(
  data,
  outcome = "Y",
  treatment = "A",
  confounders = ~ L1 + L2,
  model_fn = MASS::glm.nb
)

# Beta regression outcome (proportions/rates in (0, 1))
fit_beta <- causat(
  data,
  outcome = "Y",
  treatment = "A",
  confounders = ~ L1 + L2,
  model_fn = betareg::betareg,
  family = "beta"
)