matcha() is the fit verb for matchatr, mirroring causatr::causat() and survatr::surv_fit(). It takes the analysis data, the outcome and exposure roles, a sampling design object, and an estimator, then validates the request and resolves it to an estimation engine. The design and estimator arguments are orthogonal: design selects the sampling structure (strata, time, prevalence q0, inclusion weights), and estimator selects the analysis (conditional vs marginal; odds ratio vs hazard ratio vs risk difference).
Arguments
data
A data.frame or data.table. The input is not mutated; a data.table copy is stored on the fit.
outcome
Character scalar naming the case-status column. For the binary estimators this is a logical, two-level factor, or numeric 0/1 column; for estimator = "polytomous" it is a factor or character column with three or more groups (multiple case subtypes, or several control groups).
exposure
Character scalar naming the exposure column.
design
A matchatr_design object from one of the design constructors (unmatched_cc(), matched_cc(), nested_cc(), case_cohort(), two_phase(), counter_matched()).
confounders
A one-sided formula of confounders (e.g. ~ age + smoke), or NULL for an unadjusted analysis.
estimator
Character scalar naming the analysis, or NULL to use the design’s canonical default. Classical choices are design-specific ("logistic" / "mh" for unmatched CC, "clogit" for matched CC / NCC, "cch" for case-cohort); the case-control-weighted causal estimators "ccw_gformula", "ccw_ipw", "ccw_aipw", "ccw_tmle" apply to any design but require a prevalence q0 on the design.
model_fn
Optional model-fitting function for the unmatched case-control logistic engine, with a (formula, family, data) interface. Defaults to stats::glm(); pass e.g. mgcv::gam to adjust for a confounder with a smooth term (confounders = ~ s(age)) while keeping the exposure parametric. Ignored by the other engines.
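As a rough illustration of the (formula, family, data) interface, the sketch below defines a hypothetical wrapper, my_model_fn, of the kind that could be passed as model_fn. It relies only on stats::glm() and splines::ns(). How matcha() invokes model_fn internally is not documented here, so the direct call at the end is an assumed stand-in for that step, and the data are simulated.

```r
library(splines)  # ns(): natural cubic spline basis

# Hypothetical model_fn with the documented (formula, family, data) interface.
# It forwards to stats::glm() but allows more IRLS iterations than the default.
my_model_fn <- function(formula, family, data) {
  stats::glm(formula, family = family, data = data,
             control = stats::glm.control(maxit = 50))
}

# Simulated data, for illustration only.
set.seed(42)
n <- 400
dat <- data.frame(
  age   = runif(n, 30, 70),
  smoke = rbinom(n, 1, 0.4)
)
lp <- -1 + 0.8 * dat$smoke + 0.03 * (dat$age - 50)
dat$case <- rbinom(n, 1, plogis(lp))

# Assumed calling convention: the exposure stays parametric,
# while the confounder enters through a flexible spline term.
fit <- my_model_fn(case ~ smoke + ns(age, df = 3),
                   family = stats::binomial(), data = dat)

exp(coef(fit)[["smoke"]])  # odds ratio for the exposure
```

Any function with the same three arguments that returns a fitted model object should slot in the same way; mgcv::gam with confounders = ~ s(age) is the case mentioned above.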
effect_modifier
NULL or a character scalar naming a categorical (logical / character / factor) column whose levels modify the exposure effect. When supplied, the conditional logistic engine fits outcome ~ exposure * effect_modifier + confounders + strata(set), and contrast(type = "or") reports the stratum-specific odds ratio of the exposure within each modifier level: one OR per level, with a Wald interval from the joint partial-likelihood variance. Supported only for estimator = "clogit" with a single-coefficient exposure (binary, continuous, or two-level factor); the modifier may coincide with a matching variable. Defaults to NULL (no effect modification).
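To make the stratum-specific odds ratios concrete, here is a minimal sketch that fits the same kind of interaction model directly with survival::clogit() and builds one OR per modifier level from the joint variance matrix. It does not call matchatr; the column names (set, case, x, sex) and the simulated 1:1 matched data are assumptions for illustration.

```r
library(survival)  # clogit() and strata()

# Simulated 1:1 matched sets: one case and one control per set.
set.seed(1)
n_sets <- 300
d <- data.frame(
  set  = rep(seq_len(n_sets), each = 2),
  case = rep(c(1, 0), times = n_sets)
)
d$sex <- factor(sample(c("f", "m"), nrow(d), replace = TRUE))
# Exposure is more common among cases, more strongly so when sex == "m".
p_x <- ifelse(d$case == 1, ifelse(d$sex == "m", 0.60, 0.45), 0.30)
d$x <- rbinom(nrow(d), 1, p_x)

fit <- clogit(case ~ x * sex + strata(set), data = d)
b <- coef(fit)  # named: "x", "sexm", "x:sexm"
V <- vcov(fit)  # joint partial-likelihood variance
z <- qnorm(0.975)

lvls <- levels(d$sex)
or_table <- do.call(rbind, lapply(lvls, function(l) {
  # Weights selecting the exposure log odds ratio within level l.
  w <- setNames(numeric(length(b)), names(b))
  w["x"] <- 1
  if (l != lvls[1]) w[paste0("x:sex", l)] <- 1
  est <- sum(w * b)
  se  <- sqrt(drop(t(w) %*% V %*% w))
  data.frame(level = l, or = exp(est),
             lower = exp(est - z * se), upper = exp(est + z * se))
}))
or_table
```

In this simulation sex varies within sets, so its main effect is estimable. When the modifier is also a matching variable it is constant within each set, and the linear combinations would have to leave its main-effect coefficient out.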
reference
NULL or a character scalar naming the reference outcome group for estimator = "polytomous". The multinomial logistic model contrasts every other group against this baseline, so each non-reference equation’s exposure coefficient is that subtype’s log odds ratio versus the reference. It must name one of the observed groups; when NULL, the first factor level (or the first level in sorted order, for a character outcome) is used. Supplying it for a non-polytomous estimator is an error. Defaults to NULL.
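The default rule for reference can be sketched in base R as follows. resolve_reference() is a hypothetical helper written for this illustration, not a matchatr function, and the subtype labels are made up.

```r
# Hypothetical helper mirroring the documented rule; not matchatr's own code.
resolve_reference <- function(y, reference = NULL) {
  observed <- unique(as.character(y[!is.na(y)]))
  if (is.null(reference)) {
    # Factor: first level. Character: first value in sorted order.
    candidates <- if (is.factor(y)) levels(y) else sort(observed)
    return(candidates[1])
  }
  ok <- is.character(reference) && length(reference) == 1L &&
    reference %in% observed
  if (!ok) {
    stop("`reference` must name one of the observed outcome groups.")
  }
  reference
}

subtype <- c("control", "luminal", "basal", "control", "basal")

resolve_reference(subtype)
# "basal": character outcome, first in sorted order
resolve_reference(factor(subtype, levels = c("control", "basal", "luminal")))
# "control": first factor level
resolve_reference(subtype, reference = "control")
# "control": explicit choice
```

Setting the factor levels explicitly, or passing reference, avoids an accidental baseline such as "basal" when the outcome is stored as character.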
Details
Weights are never read from or written to data. The design’s weight_spec records the intended scheme; the case-control weights (q0-based, Rose & van der Laan) and design / inclusion-probability weights (Samuelsen, Borgan) are kept in distinct slots on the fit (details$cc_weights, details$design_weights) because their variance consequences differ.
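For orientation, the q0-based case-control weights are commonly written as q0 for each case and (1 - q0) / J for each control, where J is the number of controls per case. The sketch below assumes that form; whether matchatr applies exactly this normalisation is not stated on this page. The details list here is a plain R list that only imitates the slot names mentioned above.

```r
# Toy sample: 100 cases and 300 controls (illustrative only).
dat <- data.frame(case = rep(c(1, 0), times = c(100, 300)))
q0 <- 0.05  # assumed population prevalence of the outcome

n_cases    <- sum(dat$case == 1)
n_controls <- sum(dat$case == 0)
J <- n_controls / n_cases  # controls per case

# Assumed weighting form: q0 for cases, (1 - q0) / J for controls.
cc_weights <- ifelse(dat$case == 1, q0, (1 - q0) / J)

# Weights live beside the data rather than in it, in a slot separate
# from any design / inclusion-probability weights.
details <- list(cc_weights = cc_weights, design_weights = NULL)

# Sanity check: the weighted case fraction recovers q0.
weighted.mean(dat$case, details$cc_weights)
```

Keeping the weight vector outside dat matches the statement that weights are never written to data, and the separate design_weights slot leaves room for inclusion-probability weights with their own variance treatment.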
The resolved engine is run as part of the fit: an implemented estimator (the unmatched case-control logistic regression) populates the model slot, while an engine with no wired estimator leaves it NULL. details$engine records the engine the (design, estimator) pair resolved to.
Value
A matchatr_fit object: a list with the validated specification (data, outcome, exposure, confounders, design, estimator, engine, effect_modifier), a details list (resolved engine, weighting scheme, reserved variance / weight slots, case and control counts), and the originating call. The model slot holds the fitted estimation object for an implemented engine, or NULL otherwise.