The example below uses the NHEFS (National Health and Nutrition Examination Survey — Epidemiologic Follow-up Study) dataset from Hernán & Robins (2025). The assumed causal structure is:
Treatment (\(A\)):qsmk — whether the individual quit smoking between baseline (1971) and follow-up (1982). Binary (0 = continued, 1 = quit).
Outcome (\(Y\)):wt82_71 — change in body weight (kg) from 1971 to
Continuous.
Confounders (\(L\)): a set of baseline covariates — sex, age, race, smoking intensity (cigarettes/day), years of smoking, exercise, physical activity level, and baseline weight — that jointly affect both the decision to quit smoking and the change in weight.
Censoring (\(C\)):censored — whether the individual was lost to follow-up. Rows with \(C = 1\) are excluded from model fitting; predictions are standardised over the full target population.
Under the causal DAG, \(L\) is a common cause of both \(A\) and \(Y\). Adjusting for \(L\) blocks all backdoor paths from \(A\) to \(Y\), so the g-formula identifies the average causal effect \(\mathbb{E}[Y^{a=1}] - \mathbb{E}[Y^{a=0}]\).
Estimator: gcomp · Estimand: ATE · Contrast: difference · CI method: sandwich · N: 1746
Intervention means
intervention
estimate
se
ci_lower
ci_upper
quit
5.033
0.404
4.241
5.825
continue
1.877
0.209
1.467
2.287
Contrasts
comparison
estimate
se
ci_lower
ci_upper
quit vs continue
3.156
0.449
2.276
4.035
Intervention types
causatr supports several intervention types for g-computation:
Function
Description
Example
static(value)
Set treatment to a fixed value
static(1)
shift(delta)
Add a constant to observed treatment
shift(-10)
scale_by(factor)
Multiply observed treatment
scale_by(0.5)
threshold(lower, upper)
Clamp treatment within bounds
threshold(0, 20)
dynamic(rule)
Apply a user-defined rule
dynamic(\(d, a) ifelse(d$L > 0, 1, 0))
NULL
Natural course (observed values)
—
IPW supports static(), shift(), scale_by(), dynamic(), and ipsi(). Matching supports only static() interventions.
Contrast types
Type
Meaning
"difference"
Risk/mean difference (default)
"ratio"
Risk ratio
"or"
Odds ratio (binary outcomes)
Inference methods
Method
Description
"sandwich"
Robust sandwich SE (default, fast)
"bootstrap"
Nonparametric bootstrap (resamples full pipeline)
Estimands
Estimand
Population
Applicability
"ATE"
All individuals
Always (default)
"ATT"
Observed treated (A = 1)
Binary point treatment only
"ATC"
Observed controls (A = 0)
Binary point treatment only
subset = quote(...)
Custom subgroup
Always
For g-computation, the estimand can be changed in contrast() without refitting. For IPW and matching, refit with causat(estimand = ...).
Treatment types
causatr supports binary, continuous, categorical, and multivariate treatments:
Treatment type
G-comp
IPW
Matching
Binary (0/1)
Yes
Yes
Yes
Continuous
Yes
Yes (GPS)
No (MatchIt binary-only)
Categorical (k > 2)
Yes
Yes
No (MatchIt binary-only)
Multivariate (joint)
Yes
Not yet
Not yet
For multivariate treatments (e.g. treatment = c("A1", "A2")), supply interventions as named lists with one element per treatment variable.
Effect modification in the outcome / MSM model itself (e.g. ~ L + sex + A:sex) is currently supported only for g-computation. IPW and matching wrap a saturated Y ~ A MSM around the propensity / matched design, and reject an A:modifier term up front — use estimator = "gcomp" for heterogeneous effects, or by = "modifier" below for stratum-specific summaries of a homogeneous effect. See PHASE_6_INTERACTIONS.md for the planned unified API.
Effect modification
The by argument in contrast() stratifies estimates by levels of a variable, producing subgroup-specific effects:
For IPW and matching, balance is computed via cobalt and includes standardised mean differences before and after adjustment.
Extracting results
causatr result objects support standard R generics:
coef(result) # named vector of E[Y^a]confint(result) # CI matrix (respects custom `level`)vcov(result) # vcov of marginal meanstidy(result) # broom-style tidy data frameglance(result) # broom-style one-row summaryplot(result) # forest plot (requires forrest package)
Learning more
causatr includes detailed vignettes for each estimation method: g-computation, IPW, matching, longitudinal ICE, and methodological triangulation. Use vignette(package = "causatr") to see all available vignettes.
Source Code
---title: "Introduction to causatr"code-fold: showcode-tools: truevignette: > %\VignetteIndexEntry{Introduction to causatr} %\VignetteEngine{quarto::html} %\VignetteEncoding{UTF-8}---```{r}#| include: falseknitr::opts_chunk$set(collapse =TRUE, comment ="#>")```causatr provides a unified interface for causal effect estimation via threecomplementary methods:- **G-computation** (parametric g-formula) — fits an outcome model $E[Y | A, L]$ and standardises predictions over the target population.- **Inverse probability weighting (IPW)** — reweights the data using propensity scores so that confounders are balanced across treatment groups.- **Propensity score matching** — pairs treated and control units with similar propensity scores.When all three methods agree, you can be more confident in your findings.This is called **methodological triangulation**.## Two-step APIAll analyses follow the same two-step pattern:```r# Step 1: Fitfit <-causat(data, outcome ="Y", treatment ="A", confounders =~ L1 + L2,estimator ="gcomp")# Step 2: Contrastresult <-contrast(fit,interventions =list(treated =static(1), control =static(0)),reference ="control")```## Quick exampleThe example below uses the **NHEFS** (National Health and NutritionExamination Survey — Epidemiologic Follow-up Study) dataset from Hernán &Robins (2025). The assumed causal structure is:- **Treatment ($A$):** `qsmk` — whether the individual quit smoking between baseline (1971) and follow-up (1982). Binary (0 = continued, 1 = quit).- **Outcome ($Y$):** `wt82_71` — change in body weight (kg) from 1971 to 1982. Continuous.- **Confounders ($L$):** a set of baseline covariates — sex, age, race, smoking intensity (cigarettes/day), years of smoking, exercise, physical activity level, and baseline weight — that jointly affect both the decision to quit smoking and the change in weight.- **Censoring ($C$):** `censored` — whether the individual was lost to follow-up. Rows with $C = 1$ are excluded from model fitting; predictions are standardised over the full target population.Under the causal DAG, $L$ is a common cause of both $A$ and $Y$. Adjustingfor $L$ blocks all backdoor paths from $A$ to $Y$, so the g-formulaidentifies the average causal effect $\mathbb{E}[Y^{a=1}] - \mathbb{E}[Y^{a=0}]$.```{r}#| message: falselibrary(causatr)data("nhefs")fit <-causat( nhefs,outcome ="wt82_71",treatment ="qsmk",confounders =~ sex + age + race + smokeintensity + smokeyrs +factor(exercise) +factor(active) + wt71,censoring ="censored")result <-contrast( fit,interventions =list(quit =static(1), continue =static(0)),reference ="continue")result```## Intervention typescausatr supports several intervention types for g-computation:| Function | Description | Example ||---|---|---||`static(value)`| Set treatment to a fixed value |`static(1)`||`shift(delta)`| Add a constant to observed treatment |`shift(-10)`||`scale_by(factor)`| Multiply observed treatment |`scale_by(0.5)`||`threshold(lower, upper)`| Clamp treatment within bounds |`threshold(0, 20)`||`dynamic(rule)`| Apply a user-defined rule |`dynamic(\(d, a) ifelse(d$L > 0, 1, 0))`||`NULL`| Natural course (observed values) | — |IPW supports `static()`, `shift()`, `scale_by()`, `dynamic()`, and `ipsi()`.Matching supports only `static()` interventions.## Contrast types| Type | Meaning ||---|---||`"difference"`| Risk/mean difference (default) ||`"ratio"`| Risk ratio ||`"or"`| Odds ratio (binary outcomes) |## Inference methods| Method | Description ||---|---||`"sandwich"`| Robust sandwich SE (default, fast) ||`"bootstrap"`| Nonparametric bootstrap (resamples full pipeline) |## Estimands| Estimand | Population | Applicability ||---|---|---||`"ATE"`| All individuals | Always (default) ||`"ATT"`| Observed treated (A = 1) | Binary point treatment only ||`"ATC"`| Observed controls (A = 0) | Binary point treatment only ||`subset = quote(...)`| Custom subgroup | Always |For g-computation, the estimand can be changed in `contrast()` withoutrefitting. For IPW and matching, refit with `causat(estimand = ...)`.## Treatment typescausatr supports binary, continuous, categorical, and multivariate treatments:| Treatment type | G-comp | IPW | Matching ||---|---|---|---|| Binary (0/1) | Yes | Yes | Yes || Continuous | Yes | Yes (GPS) | No (MatchIt binary-only) || Categorical (k > 2) | Yes | Yes | No (MatchIt binary-only) || Multivariate (joint) | Yes | Not yet | Not yet |For multivariate treatments (e.g. `treatment = c("A1", "A2")`), supplyinterventions as named lists with one element per treatment variable.Effect modification in the outcome / MSM model itself (e.g. `~ L + sex + A:sex`)is currently supported only for g-computation. IPW and matching wrap asaturated `Y ~ A` MSM around the propensity / matched design, and reject an`A:modifier` term up front — use `estimator = "gcomp"` for heterogeneous effects,or `by = "modifier"` below for stratum-specific summaries of a homogeneouseffect. See `PHASE_6_INTERACTIONS.md` for the planned unified API.## Effect modificationThe `by` argument in `contrast()` stratifies estimates by levels of avariable, producing subgroup-specific effects:```{r}#| eval: falsecontrast(fit,interventions =list(quit =static(1), continue =static(0)),by ="sex")```## DiagnosticsUse `diagnose()` to check positivity and covariate balance after fitting:```{r}#| eval: falsediag <-diagnose(fit)diag # print positivity + balance summaryplot(diag) # Love plot (requires cobalt)```For IPW and matching, balance is computed via[cobalt](https://ngreifer.github.io/cobalt/) and includes standardised meandifferences before and after adjustment.## Extracting resultscausatr result objects support standard R generics:```rcoef(result) # named vector of E[Y^a]confint(result) # CI matrix (respects custom `level`)vcov(result) # vcov of marginal meanstidy(result) # broom-style tidy data frameglance(result) # broom-style one-row summaryplot(result) # forest plot (requires forrest package)```## Learning morecausatr includes detailed vignettes for each estimation method: g-computation,IPW, matching, longitudinal ICE, and methodological triangulation. Use`vignette(package = "causatr")` to see all available vignettes.