diagnose – causatr

Diagnostics for a fitted causal model

Description

Computes diagnostics appropriate to the causal estimator and treatment type:

Binary IPW: propensity-score positivity (tail violations), covariate balance (SMDs via cobalt), weight distribution (treated / control / overall with ESS).
Continuous IPW: density-range positivity (low-density tail observations), covariate balance (correlations), weight distribution (overall ESS).
Categorical IPW: per-level probability positivity, covariate balance (pairwise SMDs via cobalt), weight distribution.
Count IPW (Poisson / NB): density-range positivity, covariate balance, weight distribution.
Multivariate IPW: per-component positivity, covariate balance on the first component, combined product-weight distribution.
Longitudinal IPW: per-period positivity and weight distribution from treatment_models_by_time, per-period covariate balance.
Longitudinal gcomp (ICE): per-period covariate balance (no weights or positivity since ICE has no treatment model).
"matching": covariate balance before and after matching (via cobalt), match quality summary.
"gcomp": unadjusted covariate imbalance between treatment groups.

Per-intervention dispatch. When the interventions argument is supplied, each intervention gets its own diagnostic panel. The positivity summary is shared across panels (it depends only on the fitted treatment density model, not on the intervention), but the weight distribution is computed per intervention. The default call diagnose(fit) produces a single panel keyed obs.

Usage

diagnose(
  fit,
  interventions = NULL,
  by = NULL,
  stats = c("m", "v"),
  thresholds = c(m = 0.1),
  ps_bounds = c(0.025, 0.975)
)

Arguments

fit A causatr_fit object returned by causat().

interventions Named list of causatr_intervention objects (or NULL entries for natural course). When NULL (the default), a single panel keyed obs is produced using the standard observed-treatment diagnostic for the estimator. When non-NULL, each named entry spawns its own per-intervention panel.

by Character scalar naming a baseline variable for stratified balance reporting. When non-NULL, covariate balance is computed within each stratum of by via cobalt::bal.tab(..., cluster = by). The variable must be present in the data.

stats Character vector. Balance statistics to compute. Passed to cobalt::bal.tab(). For binary treatments, valid options include "m" (standardised mean differences), "v" (variance ratios), and "ks" (Kolmogorov-Smirnov). Default c("m", "v").

thresholds Named numeric vector. Balance thresholds for flagging imbalance, e.g. c(m = 0.1, v = 2). Default c(m = 0.1).

ps_bounds Numeric vector of length 2. Lower and upper bounds for flagging positivity violations: observations whose estimated propensity score falls outside this interval are flagged. Default c(0.025, 0.975).

Details

Positivity

For a binary treatment, diagnose() fits a logistic regression of the treatment on the confounders and flags individuals whose estimated propensity score falls outside ps_bounds. The returned positivity table summarises the propensity score distribution and the number and fraction of near-violations. The propensity score is intervention-independent, so each panel carries an identical positivity table.
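
The binary-treatment check described above can be sketched in base R. This is an illustration on simulated data, not the code diagnose() runs internally; the data, the variable names (dat, ps, flagged) and the columns of the summary table are assumptions made for the example.

```r
# Simulated data; every name here is illustrative, not a causatr internal.
set.seed(1)
n <- 500
age <- rnorm(n, mean = 50, sd = 10)
sex <- rbinom(n, size = 1, prob = 0.5)
# Treatment depends strongly on age, so some propensity scores are extreme.
a <- rbinom(n, size = 1, prob = plogis(-0.15 * (age - 50) + 0.5 * sex))
dat <- data.frame(a = a, age = age, sex = sex)

# Propensity model: logistic regression of treatment on the confounders.
ps_model <- glm(a ~ age + sex, data = dat, family = binomial())
ps <- fitted(ps_model)

# Flag individuals whose estimated score falls outside ps_bounds.
ps_bounds <- c(0.025, 0.975)
flagged <- ps < ps_bounds[1] | ps > ps_bounds[2]

# Summarise the score distribution and the near-violations.
positivity <- data.frame(
  min = min(ps),
  median = median(ps),
  max = max(ps),
  n_flagged = sum(flagged),
  frac_flagged = mean(flagged)
)
print(positivity)
```

Widening ps_bounds (for example c(0.01, 0.99)) flags fewer observations; the table itself does not depend on any intervention, which is why it can be shared across panels.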

Balance (IPW and matching)

If the cobalt package is installed, balance is computed via cobalt::bal.tab() on the propensity formula or matchit object. This provides standardised mean differences (SMD), variance ratios, and KS statistics. If cobalt is not installed, a simpler data.table-based SMD comparison is returned. Balance is the unadjusted SMD across treatment groups; post-weighting balance under specific interventions or estimands is not computed.
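
To make the unadjusted SMD and the thresholds argument concrete, here is a minimal base R sketch. It is not the package's fallback (which is described above as data.table-based); the helper smd(), the simulated data and the pooled-standard-deviation denominator are assumptions chosen for illustration, and cobalt may standardise differently depending on its settings.

```r
# Hypothetical helper: unadjusted standardised mean difference for a binary
# treatment, using the square root of the average group variance as denominator.
smd <- function(x, a) {
  m1 <- mean(x[a == 1])
  m0 <- mean(x[a == 0])
  s_pooled <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
  (m1 - m0) / s_pooled
}

# Simulated data; names are illustrative only.
set.seed(2)
n <- 400
a <- rbinom(n, size = 1, prob = 0.4)
dat <- data.frame(
  a = a,
  age = rnorm(n, mean = 50 + 5 * a, sd = 10), # imbalanced by construction
  wt = rnorm(n, mean = 70, sd = 12)           # balanced by construction
)

covariates <- c("age", "wt")
balance <- data.frame(
  variable = covariates,
  smd = vapply(covariates, function(v) smd(dat[[v]], dat$a), numeric(1)),
  row.names = NULL
)

# Flag covariates whose absolute SMD exceeds the threshold, as with
# thresholds = c(m = 0.1).
thresholds <- c(m = 0.1)
balance$flagged <- abs(balance$smd) > thresholds[["m"]]
print(balance)
```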

Weight distribution (IPW only)

For the default obs panel: summarises the observed-treatment Horvitz-Thompson weights (1/p on treated, 1/(1-p) on controls), which is the standard Hernán & Robins Ch. 12 IPW weight diagnostic. For each user-supplied intervention: summarises the per-intervention density-ratio weight from compute_density_ratio_weights(). Both views report mean / SD / min / max plus the effective sample size (Kish 1965) ESS = (sum w)^2 / sum(w^2) for treated / control / overall groups.
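
The observed-treatment weights and the Kish effective sample size can be worked through by hand. The sketch below uses a toy vector of treatment indicators and propensity scores; the helper names ess() and summarise_weights() are hypothetical and only mirror the quantities listed above, not the package's internal functions.

```r
# Kish (1965) effective sample size: (sum w)^2 / sum(w^2).
ess <- function(w) sum(w)^2 / sum(w^2)

# Hypothetical helper returning the summary statistics named in the text.
summarise_weights <- function(w) {
  data.frame(mean = mean(w), sd = sd(w), min = min(w), max = max(w), ess = ess(w))
}

# Toy data: treatment indicator and estimated propensity score P(A = 1 | L).
a <- c(1, 1, 1, 0, 0, 0, 0, 0)
p <- c(0.8, 0.5, 0.2, 0.5, 0.4, 0.2, 0.1, 0.5)

# Observed-treatment weights: 1/p on treated, 1/(1 - p) on controls.
w <- ifelse(a == 1, 1 / p, 1 / (1 - p))

# Summaries for the treated, control and overall groups.
groups <- list(treated = w[a == 1], control = w[a == 0], overall = w)
weights_tab <- data.frame(
  group = names(groups),
  do.call(rbind, lapply(groups, summarise_weights)),
  row.names = NULL
)
print(weights_tab)
```

When all weights in a group are equal the ESS equals the group size; the more variable the weights, the further the ESS drops below it, which is the signal this diagnostic is meant to surface.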

Match quality (matching only)

Reports the number matched, number discarded, and the fraction of the original sample retained.

Value

A causatr_diag object with slots:

per_intervention: Named list of per-intervention panels. Each panel is itself a list with positivity, balance, weights slots (any of which may be NULL).
interventions: Character vector of intervention keys, in the order they appear in per_intervention.
positivity, balance, weights: Top-level shortcuts that point to the corresponding slots of the first panel; preserved for backward compatibility with the pre-restructure flat shape.
match_quality: data.table or NULL: match quality summary (matching only). Lives at the top level because matching is done once at fit time and is intervention-agnostic.
estimator: Character: the causal estimator.
fit_info: Named list with summary metadata (treatment_type, estimand, type, has_em).
fit: The original causatr_fit (stored for plot()).

References

Greifer N (2024). cobalt: Covariate Balance Tables and Plots. https://ngreifer.github.io/cobalt/

Hernán MA, Robins JM (2025). Causal Inference: What If. Chapman & Hall/CRC.

Examples

library("causatr")

data("nhefs", package = "causatr")
fit <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
              confounders = ~ sex + age + wt71,
              estimator = "ipw")
# Default: single observed-treatment panel.
diag <- diagnose(fit)
print(diag)
plot(diag)

# Per-intervention dispatch:
diag2 <- diagnose(fit, interventions = list(a1 = static(1), a0 = static(0)))
print(diag2)