matchatr


matchatr provides causal inference for (matched) case-control, nested case-control (NCC), and case-cohort study designs. It pairs design-faithful classical estimators with marginal causal effects, and integrates with the etverse ecosystem by delegating estimation to causatr (g-computation / IPW / AIPW with sandwich and bootstrap variance) and survatr (causal survival on person-period data).

Status: classical odds-ratio engines landing. The design taxonomy, the two-step matcha() / contrast() API, and the (design, estimator) dispatch (PHASE_1) are in place. The classical odds-ratio engines now run end to end: the unmatched case-control logistic and Mantel-Haenszel ORs (PHASE_2); the matched case-control conditional-logistic and McNemar ORs, with stratum-specific effect modification (PHASE_3); and the polytomous subtype ORs for multi-group outcomes (PHASE_4). PHASE_4 also adds test_homogeneity(), a Wald test of whether the exposure OR is constant across subtypes, and the pooled common OR. See the articles for worked examples. The time-to-event sampling designs and the marginal causal-weighting / survival layer (PHASE_5 through PHASE_20) remain at the design stage.
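As a point of reference for the Mantel-Haenszel OR named above, the estimator can be computed by hand in base R and checked against stats::mantelhaen.test(). This is a minimal sketch on hypothetical counts; it does not use matchatr and makes no claim about how the package implements its engine.

```r
# Hypothetical stratified 2x2 counts (illustrative only, not real data).
# Array is filled column-major: exposure x status x stratum.
tab <- array(
  c(20, 10, 15, 30,
    12, 18,  9, 40,
     8, 14,  6, 35),
  dim = c(2, 2, 3),
  dimnames = list(
    exposure = c("exposed", "unexposed"),
    status   = c("case", "control"),
    stratum  = c("s1", "s2", "s3")
  )
)

a <- tab["exposed",   "case",    ]   # exposed cases, per stratum
b <- tab["exposed",   "control", ]   # exposed controls
c0 <- tab["unexposed", "case",    ]  # unexposed cases
d <- tab["unexposed", "control", ]   # unexposed controls
n <- apply(tab, 3, sum)              # stratum totals

# Mantel-Haenszel common odds ratio: sum(a*d/n) / sum(b*c/n).
or_mh <- sum(a * d / n) / sum(b * c0 / n)

# Reference value from base R.
mh_ref <- mantelhaen.test(tab, correct = FALSE)
or_ref <- unname(mh_ref$estimate)
```

The hand-computed or_mh and the or_ref returned by mantelhaen.test() should agree up to floating-point error.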

What it does

Two orthogonal axes: a design object encodes the sampling structure (strata, matching ratio, time scale, prevalence, inclusion weights); an estimator chooses the analysis.

| Design | Classical estimand | Causal (marginal) estimand |
|---|---|---|
| Unmatched case-control | conditional OR, Mantel-Haenszel | RD / RR / marginal OR (case-control weighting) |
| Matched case-control | conditional OR (conditional logistic) | RD / RR via standardization |
| Nested case-control | risk-set HR; Samuelsen IPW Cox | marginal survival contrasts (design-weighted) |
| Case-cohort | Prentice / Self-Prentice / Borgan HR | absolute risk, RD(t), RMST |

Marginal causal effects use case-control weighting (the Rose & van der Laan g-formula / IPW / AIPW / TMLE family) and design-based inclusion weighting (Samuelsen, Borgan): the weights are passed as observation weights into the etverse engines, so they compose directly with existing estimators.
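To make the idea of case-control weights passed as observation weights concrete, here is a minimal base-R sketch. It assumes the usual Rose & van der Laan weights (q0 for cases, (1 - q0) / J for controls, with J controls per case) and uses simulated data. The variable names are illustrative, and the code stands in for, rather than reproduces, what causatr would do.

```r
set.seed(1)

# Simulated case-control sample (illustrative data, not from the package).
n_case <- 200
n_ctrl <- 400
q0     <- 0.02   # assumed source-population prevalence

dat <- data.frame(
  case = rep(c(1, 0), c(n_case, n_ctrl)),
  x    = c(rbinom(n_case, 1, 0.5), rbinom(n_ctrl, 1, 0.3)),
  age  = c(rnorm(n_case, 55, 10), rnorm(n_ctrl, 50, 10))
)

# Case-control weights: q0 for cases, (1 - q0) / J for controls,
# where J is the number of controls per case.
J <- n_ctrl / n_case
dat$w <- ifelse(dat$case == 1, q0, (1 - q0) / J)

# Weighted outcome model; quasibinomial avoids the non-integer-weights warning.
fit_w <- glm(case ~ x + age, family = quasibinomial(), data = dat, weights = w)

# g-formula step: predict under x = 1 and x = 0 for everyone,
# then average with the same weights to get a marginal risk difference.
p1 <- predict(fit_w, newdata = transform(dat, x = 1), type = "response")
p0 <- predict(fit_w, newdata = transform(dat, x = 0), type = "response")
rd <- weighted.mean(p1, dat$w) - weighted.mean(p0, dat$w)
```

By construction the weighted outcome mean equals q0, which is what lets the weighted sample stand in for the source population.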

Installation

You can install the development version of matchatr from GitHub with:

# install.packages("pak")
pak::pak("etverse/matchatr")

Example

library(matchatr)

# Matched case-control -> conditional odds ratio (infert: a matched study of
# spontaneous/induced abortion and infertility, matched on age, parity, and
# education).
fit <- matcha(
  infert,
  outcome = "case", exposure = "spontaneous",
  design = matched_cc(strata = "stratum"),
  confounders = ~ induced, estimator = "clogit"
)

contrast(fit, type = "or")
#> <matchatr_result>
#>  Estimator:  clogit  (engine: clogit)
#>  Estimand:   conditional OR
#>  Contrast:   Odds ratio
#>  CI method:  model
#>  N:          248
#> 
#> Contrasts:
#>     comparison estimate     se ci_lower ci_upper
#>         <char>    <num>  <num>    <num>    <num>
#> 1: spontaneous 7.285423 2.5677 3.651357 14.53635
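The result above can be cross-checked without matchatr by fitting the same conditional logistic model with survival::clogit(). This sketch assumes the "clogit" estimator corresponds to that model, which the README does not state explicitly. If it does, the exponentiated exposure coefficient should be close to the estimate printed above.

```r
library(survival)  # recommended package shipped with R; provides clogit()

# Same model as the matcha() call above, written directly against survival.
ref <- clogit(case ~ spontaneous + induced + strata(stratum), data = infert)

# Exponentiate the exposure coefficient to get the conditional odds ratio,
# and the Wald interval to get its confidence limits.
or_spont <- exp(coef(ref)[["spontaneous"]])
ci_spont <- exp(confint(ref)["spontaneous", ])

print(or_spont)
print(ci_spont)
```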

The marginal causal contrasts (case-control weighting) will reuse the same two-step API once a source-population prevalence q0 is supplied. They are not yet implemented and belong to the roadmap below, so the following call shows the planned interface:

fit <- matcha(
  data,
  outcome = "case", exposure = "x",
  design = unmatched_cc(prevalence = 0.02),   # source-population prevalence q0
  confounders = ~ age + smoke, estimator = "ccw_gformula"
)
contrast(fit, type = "difference", ci_method = "sandwich")

Roadmap

The design is documented in PHASE_1 through PHASE_20 at the repository root, mapping the Handbook of Statistical Methods for Case-Control Studies (Borgan et al., 2018) to an implementation plan. See CLAUDE.md for the phase index and FEATURE_COVERAGE_MATRIX.md for what is implemented and tested.

Part of the etverse

matchatr is one package in the etverse family for causal inference and methodological triangulation, alongside causatr (causal effect estimation) and survatr (causal survival analysis).