---
title: "Methodological triangulation: g-computation, IPW, and matching"
code-fold: show
code-tools: true
vignette: >
%\VignetteIndexEntry{Methodological triangulation}
%\VignetteEngine{quarto::html}
%\VignetteEncoding{UTF-8}
---
```{r}
#| include: false
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
When multiple estimation methods agree on the same causal effect, we gain
confidence that the result is not an artefact of a single modelling assumption.
This vignette compares all three methods on the NHEFS dataset.
## Data: NHEFS
All examples use the **NHEFS** dataset (Hernán & Robins, 2025). The
assumed causal structure is:
- **Treatment ($A$):** `qsmk` — quit smoking between 1971 and 1982 (binary: 0/1).
- **Outcome ($Y$):** `wt82_71` — weight change in kg (continuous), or
`gained_weight` ($\mathbf{1}\{Y > 0\}$, binary).
- **Confounders ($L$):** sex, age, race, education, smoking intensity,
years smoked, exercise, physical activity, and baseline weight.
$$
\begin{aligned}
L &\sim F_L \\
A &\sim \text{Bernoulli}\!\bigl(\text{logit}^{-1}(\alpha_0 + \alpha_L^\top L)\bigr) \\
Y &= \beta_0 + \beta_A A + \beta_L^\top L + \varepsilon
\end{aligned}
$$
The confounders $L$ are common causes of both $A$ and $Y$. Under this
structure, g-computation adjusts for $L$ via an outcome model,
IPW adjusts via a treatment model, and matching balances $L$ between
treatment groups. When all three methods agree, the result is unlikely to
be an artefact of a single modelling assumption.
```{r}
#| message: false
library(causatr)
library(tinyplot)
data("nhefs")
nhefs_complete <- nhefs[complete.cases(nhefs), ]
nhefs_complete$gained_weight <- as.integer(nhefs_complete$wt82_71 > 0)
nhefs_complete$sex <- factor(nhefs_complete$sex, levels = 0:1, labels = c("Male", "Female"))
confounders <- ~ sex + age + I(age^2) + race + factor(education) +
smokeintensity + I(smokeintensity^2) + smokeyrs + I(smokeyrs^2) +
factor(exercise) + factor(active) + wt71 + I(wt71^2)
```
## Continuous outcome: weight change
### Fitting all three methods
```{r}
fit_gc <- causat(nhefs, outcome = "wt82_71", treatment = "qsmk",
confounders = confounders, censoring = "censored")
fit_ipw <- causat(nhefs_complete, outcome = "wt82_71", treatment = "qsmk",
confounders = confounders, estimator = "ipw")
fit_m <- causat(nhefs_complete, outcome = "wt82_71", treatment = "qsmk",
confounders = confounders, estimator = "matching", estimand = "ATT")
```
### Contrasts
```{r}
ivs <- list(quit = static(1), continue = static(0))
res_gc <- contrast(fit_gc, ivs, reference = "continue")
res_ipw <- contrast(fit_ipw, ivs, reference = "continue")
res_m <- contrast(fit_m, ivs, reference = "continue")
results <- rbind(
cbind(method = "G-computation", res_gc$contrasts),
cbind(method = "IPW", res_ipw$contrasts),
cbind(method = "Matching (ATT)", res_m$contrasts)
)
results
```
### Forest plot
```{r}
#| fig-width: 7
#| fig-height: 3.5
#| fig-alt: "Forest plot comparing g-computation, IPW, and matching estimates of the effect of quitting smoking on weight change."
tinyplot(
estimate ~ method,
data = results,
type = "pointrange",
ymin = results$ci_lower,
ymax = results$ci_upper,
xlab = "Method",
ylab = "ATE: weight change (kg)",
main = "Triangulation: continuous outcome"
)
abline(h = 0, lty = 2, col = "grey40")
```
## Binary outcome: gained weight
```{r}
fit_gc_bin <- causat(nhefs_complete, outcome = "gained_weight",
treatment = "qsmk", confounders = confounders, family = "binomial")
fit_ipw_bin <- causat(nhefs_complete, outcome = "gained_weight",
treatment = "qsmk", confounders = confounders, estimator = "ipw")
fit_m_bin <- causat(nhefs_complete, outcome = "gained_weight",
treatment = "qsmk", confounders = confounders,
estimator = "matching", estimand = "ATT")
res_gc_bin <- contrast(fit_gc_bin, ivs, reference = "continue")
res_ipw_bin <- contrast(fit_ipw_bin, ivs, reference = "continue")
res_m_bin <- contrast(fit_m_bin, ivs, reference = "continue")
results_bin <- rbind(
cbind(method = "G-computation", res_gc_bin$contrasts),
cbind(method = "IPW", res_ipw_bin$contrasts),
cbind(method = "Matching (ATT)", res_m_bin$contrasts)
)
results_bin
```
```{r}
#| fig-width: 7
#| fig-height: 3.5
#| fig-alt: "Forest plot comparing methods on the binary outcome (gained weight)."
tinyplot(
estimate ~ method,
data = results_bin,
type = "pointrange",
ymin = results_bin$ci_lower,
ymax = results_bin$ci_upper,
xlab = "Method",
ylab = "Risk difference (gained weight)",
main = "Triangulation: binary outcome"
)
abline(h = 0, lty = 2, col = "grey40")
```
## Diagnostics
Before interpreting results, check that the assumptions hold. `diagnose()`
provides positivity checks, covariate balance, and method-specific summaries.
```{r}
diag_ipw <- diagnose(fit_ipw)
diag_m <- diagnose(fit_m)
```
### IPW balance
```{r}
#| fig-width: 7
#| fig-height: 5
#| fig-alt: "Love plot for IPW covariate balance in the triangulation analysis."
plot(diag_ipw)
```
### Matching balance
```{r}
#| fig-width: 7
#| fig-height: 5
#| fig-alt: "Love plot for matching covariate balance in the triangulation analysis."
plot(diag_m)
```
## When to use which method
| Criterion | G-computation | IPW | Matching |
|---|---|---|---|
| Model specification | Outcome model | Treatment model | Treatment model |
| Non-static interventions (shift / scale / threshold / dynamic) | Yes | No (Phase 4) | No (Phase 4) |
| Continuous treatment | Yes | Yes (GPS, static only) | No (MatchIt binary-only) |
| Categorical (k > 2) treatment | Yes | Yes | No (MatchIt binary-only) |
| Effect modification via `A:modifier` in `confounders` | Yes | Rejected (Phase 6) | Rejected (Phase 6) |
| Longitudinal (time-varying) treatments | Yes (ICE) | No (Phase 10) | No |
| Variance estimation | Sandwich (IF engine) or bootstrap | M-estimation sandwich via WeightIt's `glm_weightit` | Cluster-robust sandwich on matched subclass |
| Robustness to | Treatment model misspecification | Outcome model misspecification | Outcome model misspecification |
When the outcome model and treatment model are both correctly specified, all
three methods should produce similar estimates. Disagreement suggests model
misspecification — investigate further.
For features marked "No (Phase X)", causatr aborts at fit or contrast time
with a pointer to the planning doc rather than silently returning a wrong
answer — see `FEATURE_COVERAGE_MATRIX.md` for the full rejection-path
inventory.
## References
Hernán MA, Robins JM (2025). *Causal Inference: What If*. Chapman & Hall/CRC.