Fit a causal survival model

Description

Convenience wrapper for causal survival analysis using pooled logistic regression as a discrete-time hazard model (Hernan & Robins Ch. 17).

Only the pooled logistic hazard model is fit at this step. The survival-curve contrast step in contrast() (steps 3 and 4 of the algorithm below) is not yet implemented and aborts with an informative error. Competing-risks analysis (the competing argument) is also not implemented; supplying it aborts at fit time. Treat this function as experimental.

Algorithm

  1. Convert data to person-period format if not already long (using to_person_period()).

  2. Fit a pooled logistic regression for the discrete hazard: \(\mathrm{logit}\, \Pr[D_{k+1} = 1 \mid \text{survived to } k, A, L]\) = time_formula + A + confounders.

  3. For each intervention, predict individual-level hazards, compute survival as the cumulative product \(S_i(k) = \prod_{m \le k} (1 - h_i(m))\), and average across individuals.

  4. Risk difference at time t = \((1 - S^{a1}(t)) - (1 - S^{a0}(t))\).
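
The four steps above can be sketched in base R on simulated data. This is an illustration of the technique only, not the causatr internals: the simulated variables (L, A, last, event, D), the helper surv_curve(), and the choice of ~ factor(time) for the baseline hazard are all assumptions made for the example.

```r
# Illustrative sketch with base R on simulated data (not the package code).
set.seed(1)
n <- 2000   # individuals
K <- 5      # number of discrete follow-up intervals

# Hypothetical wide data: one confounder L and a confounded binary treatment A.
wide <- data.frame(id = seq_len(n), L = rnorm(n))
wide$A <- rbinom(n, 1, plogis(0.5 * wide$L))
h_true <- plogis(-3 + 0.4 * wide$A + 0.5 * wide$L)  # constant per-interval hazard
t_event <- rgeom(n, h_true) + 1                     # interval in which the event occurs
wide$last <- pmin(t_event, K)                       # last interval observed
wide$event <- as.integer(t_event <= K)              # 1 if the event falls within follow-up

# Step 1: expand to person-period format (one row per person per interval).
pp <- wide[rep(seq_len(n), wide$last), ]
pp$time <- sequence(wide$last) - 1                  # interval index 0, ..., last - 1
pp$D <- as.integer(pp$event == 1 & pp$time == pp$last - 1)

# Step 2: pooled logistic regression for the discrete hazard.
fit <- glm(D ~ factor(time) + A + L, family = binomial(), data = pp)

# Step 3: set A = a for everyone, predict hazards over all K intervals,
# take the cumulative product within person, then average across people.
surv_curve <- function(a) {
  grid <- wide[rep(seq_len(n), each = K), c("id", "L")]
  grid$time <- rep(seq_len(K) - 1, times = n)
  grid$A <- a
  haz <- predict(fit, newdata = grid, type = "response")
  S_i <- ave(1 - haz, grid$id, FUN = cumprod)       # S_i(k) per individual
  as.numeric(tapply(S_i, grid$time, mean))          # standardized survival curve
}
S1 <- surv_curve(1)
S0 <- surv_curve(0)

# Step 4: risk difference at the end of each interval.
risk_diff <- (1 - S1) - (1 - S0)
print(data.frame(interval = seq_len(K) - 1, S1 = S1, S0 = S0, risk_diff = risk_diff))
```

The sketch gives point estimates only; it does not handle censoring or compute standard errors.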

When the per-interval hazard is small (< 0.1), the pooled logistic model closely approximates a continuous-time Cox model (Technical Point 17.1).
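
The intuition behind this approximation can be checked numerically: the odds h / (1 - h) are close to h when h is small, so an odds ratio from the logistic model is close to the corresponding hazard ratio. The hazard ratio of 2 and the baseline hazards below are arbitrary values chosen for illustration.

```r
# How far the odds ratio drifts from the hazard ratio as the hazard grows.
hr <- 2                                   # assumed hazard ratio (illustrative)
h0 <- c(0.001, 0.01, 0.05, 0.1, 0.2)      # baseline per-interval hazards
h1 <- hr * h0                             # hazards under treatment
odds_ratio <- (h1 / (1 - h1)) / (h0 / (1 - h0))
approx_check <- data.frame(h0 = h0, hazard_ratio = hr, odds_ratio = odds_ratio)
print(approx_check)
```

In this example the odds ratio moves further from the hazard ratio as the baseline hazard increases, which is why shorter intervals (and therefore smaller per-interval hazards) give a closer approximation.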

Usage

causat_survival(
  data,
  outcome,
  treatment,
  confounders,
  id,
  time,
  censoring = NULL,
  competing = NULL,
  time_formula = ~splines::ns(time, 4),
  weights = NULL,
  ...
)

Arguments

data A data frame or data.table. Can be in wide format (one row per individual with a time-to-event column) or long person-period format (one row per person per time interval). If wide, the data is auto-converted using to_person_period().
outcome Character. Name of the binary event indicator (1 = event occurred in this interval, 0 = survived / censored).
treatment Character. Name of the treatment variable.
confounders A one-sided formula specifying confounders.
id Character. Name of the individual ID variable.
time Character. Name of the time variable (interval index).
censoring Character or NULL. Name of the censoring indicator. If provided, rows where censoring == 1 are excluded from fitting, and subsequent rows for that individual are also dropped.
competing Character or NULL. Name of a variable indicating the type of competing event (for competing risks analysis). Not implemented: supplying a non-NULL value aborts. Reserved for a future release.
time_formula A one-sided formula specifying how time enters the hazard model. Default ~ splines::ns(time, 4). Use ~ factor(time) for a fully saturated (non-parametric) baseline hazard.
weights Numeric vector or NULL. Pre-computed inverse-probability-of-censoring weights (IPCW) or survey weights.
... Additional arguments passed to glm().
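
Two of the argument behaviours described above can be sketched in base R. The first part mimics the row filtering described for censoring, assuming rows are sorted by time within individual; the package's actual implementation may differ. The second part compares how many design-matrix columns the two time_formula choices produce. The toy data frames and the column name cens are hypothetical.

```r
# Hypothetical long data: person 1 is censored in interval 1, person 2 never.
pp <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(0, 1, 2, 0, 1, 2),
  cens = c(0, 1, 0, 0, 0, 0)
)

# Flag the censored row and every later row for the same individual,
# then keep only the rows that come before censoring.
ever_cens <- ave(pp$cens, pp$id, FUN = cummax)
pp_fit <- pp[ever_cens == 0, ]
print(pp_fit)

# Design-matrix columns contributed by the two time_formula choices
# for ten intervals (both counts include the intercept column).
time_df <- data.frame(time = rep(0:9, each = 3))
X_spline <- model.matrix(~ splines::ns(time, 4), time_df)
X_factor <- model.matrix(~ factor(time), time_df)
n_cols <- c(spline = ncol(X_spline), saturated = ncol(X_factor))
print(n_cols)
```

The saturated specification spends one parameter per interval, so it grows with the length of follow-up, whereas the spline keeps the number of time parameters fixed.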

Value

A causatr_fit object (with type = "survival") intended for use with contrast() once survival contrasts are implemented.

References

Hernan MA, Robins JM (2025). Causal Inference: What If. Chapman & Hall/CRC. Chapter 17.

See Also

causat(), contrast(), to_person_period()

Examples

library("causatr")

data("nhefs", package = "causatr")
fit_surv <- causat_survival(
  nhefs,
  outcome = "death",
  treatment = "qsmk",
  confounders = ~ sex + age + race + education +
    smokeintensity + smokeyrs + exercise + active + wt71,
  id = "seqn",
  time = "year"
)
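# Survival contrasts are not yet implemented: this call currently aborts
# with an informative error (see Description).
result <- contrast(fit_surv,
  interventions = list(quit = static(1), continue = static(0)),
  type = "difference"
)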