Competing risks with survatr

When an individual can fail from more than one mutually exclusive cause, a death from cause B removes them from being at risk of cause A: the causes compete. survatr handles this with cause-specific hazards + cumulative incidence functions (CIF). For \(J\) competing event types it fits \(J\) parallel pooled-logistic cause-specific hazard models and builds, under each intervention, the cause-\(j\) cumulative incidence and the all-cause survival.

This vignette covers surv_fit(competing = ): the data layout, the CIF estimands, the \(\sum_j F^{(j)}(t) + S(t) = 1\) identity, contrasts with sandwich inference, and the interpretational caveat that comes with cause-specific contrasts. Fine–Gray / subdistribution hazards are out of scope — survatr is cause-specific only.

The model

For cause \(j\), the cause-specific hazard on the at-risk person-period rows is

\[ \text{logit}\, h^{(j)}(t \mid A, L) = \alpha_j(t) + \beta_{A,j} A + \beta_{L,j} L . \]

All \(J\) models are fit on one shared all-cause risk set — an individual leaves the risk set at the first event of any cause. That is exactly what “treat the competing causes as censored at their event time” means, so you do not build a separate risk set per cause. With the summed hazard \(H_{k} = \sum_j h^{(j)}_{k}\), the per-individual all-cause survival and cause-\(j\) cumulative incidence are

\[ S_i(k) = \prod_{m \le k} (1 - H_{i,m}), \qquad F^{(j)}_i(t) = \sum_{k \le t} S_i(k-1)\, h^{(j)}_{i,k}, \]

averaged across individuals (cumulate within id, then average — never average hazards first). By construction \(\sum_j F^{(j)}(t) + S(t) = 1\). This cause-specific decomposition follows Hernán and Robins (2025), Ch. 17.

A two-cause data set

The event column is a single multi-valued integer: 0 = no event this period, 1..J = the cause of the event this period. Administrative censoring (if any) goes in a separate censoring column, as elsewhere in survatr.

Code

library(causatr) # attach first so survatr's contrast() generic dispatches
library(survatr)
library(data.table)

sim_cr <- function(n = 3000L, K = 6L, seed = 1L,
                   h1 = 0.08, h2 = 0.05,
                   beta1_A = -0.5, beta1_L = 0.6, beta2_L = 0.3, gamma = 0.6) {
  set.seed(seed)
  L <- rnorm(n)
  A <- rbinom(n, 1L, plogis(gamma * L)) # L confounds A
  l1 <- qlogis(h1)
  l2 <- qlogis(h2)
  rows <- vector("list", n)
  for (i in seq_len(n)) {
    p1 <- plogis(l1 + beta1_A * A[i] + beta1_L * L[i])
    p2 <- plogis(l2 + beta2_L * L[i])
    ev <- integer(K)
    done <- FALSE
    for (k in seq_len(K)) {
      if (done) next
      u <- runif(1L) # multinomial: (no event, cause 1, cause 2)
      if (u < p1) {
        ev[k] <- 1L
        done <- TRUE
      } else if (u < p1 + p2) {
        ev[k] <- 2L
        done <- TRUE
      }
    }
    rows[[i]] <- data.table(
      id = i, t = seq_len(K), A = A[i], L = L[i], event = ev
    )
  }
  rbindlist(rows)
}

pp <- sim_cr()
head(pp)
#>       id     t     A          L event
#>    <int> <int> <int>      <num> <int>
#> 1:     1     1     1 -0.6264538     0
#> 2:     1     2     1 -0.6264538     0
#> 3:     1     3     1 -0.6264538     0
#> 4:     1     4     1 -0.6264538     0
#> 5:     1     5     1 -0.6264538     2
#> 6:     1     6     1 -0.6264538     0

Fitting

Pass the multi-valued column to both outcome and competing (they name the same column). Competing risks is a point-treatment g-computation estimator in this release.

Code

fit <- surv_fit(
  pp,
  outcome = "event",
  treatment = "A",
  confounders = ~ L,
  id = "id",
  time = "t",
  competing = "event",
  time_formula = ~ factor(t)
)
fit
#> <survatr_fit>
#>   Track:       A
#>   Estimator:   gcomp
#>   Family:      binomial
#>   Outcome:     event
#>   Treatment:   A
#>   ID:          id
#>   Time:        t
#>   Censoring:   none
#>   N:           3000 individuals, 18000 PP rows (13275 at risk)
#>   Time grid:   [1, 6] (6 unique times)

The fit holds one cause-specific hazard model per cause:

Code

names(fit$cause_models)
#> [1] "1" "2"
fit$causes
#> [1] 1 2

Cumulative incidence under two interventions

type = "cif" returns the per-cause cumulative incidence for every cause (a cause column distinguishes them). cause = selects a subset.

Code

ivs <- list(treat = causatr::static(1), control = causatr::static(0))
times <- c(2L, 4L, 6L)

cif <- contrast(fit, ivs, times = times, type = "cif")
cif$estimates
#>     intervention cause  time    cif_hat    se ci_lower ci_upper     n
#>           <char> <int> <int>      <num> <num>    <num>    <num> <int>
#>  1:        treat     1     2 0.11116458    NA       NA       NA  3000
#>  2:        treat     1     4 0.19371701    NA       NA       NA  3000
#>  3:        treat     1     6 0.25267869    NA       NA       NA  3000
#>  4:        treat     2     2 0.10832760    NA       NA       NA  3000
#>  5:        treat     2     4 0.18013869    NA       NA       NA  3000
#>  6:        treat     2     6 0.24302182    NA       NA       NA  3000
#>  7:      control     1     2 0.16720182    NA       NA       NA  3000
#>  8:      control     1     4 0.28204385    NA       NA       NA  3000
#>  9:      control     1     6 0.35904002    NA       NA       NA  3000
#> 10:      control     2     2 0.09894726    NA       NA       NA  3000
#> 11:      control     2     4 0.16019835    NA       NA       NA  3000
#> 12:      control     2     6 0.21082528    NA       NA       NA  3000

All-cause survival comes from the summed hazards (type = "survival"), and the identity holds:

Code

surv <- contrast(fit, ivs, times = times, type = "survival")

ident <- merge(
  cif$estimates[, .(fsum = sum(cif_hat)), by = .(intervention, time)],
  surv$estimates[, .(intervention, time, s_hat)],
  by = c("intervention", "time")
)
ident[, total := fsum + s_hat][]
#> Key: <intervention, time>
#>    intervention  time      fsum     s_hat total
#>          <char> <int>     <num>     <num> <num>
#> 1:      control     2 0.2661491 0.7338509     1
#> 2:      control     4 0.4422422 0.5577578     1
#> 3:      control     6 0.5698653 0.4301347     1
#> 4:        treat     2 0.2194922 0.7805078     1
#> 5:        treat     4 0.3738557 0.6261443     1
#> 6:        treat     6 0.4957005 0.5042995     1

Contrasts and sandwich inference

cif_difference and cif_ratio contrast a cause’s cumulative incidence between interventions. The stacked-EE sandwich propagates the uncertainty in all \(J\) cause-specific models through the CIF.

Code

rd <- contrast(
  fit, ivs,
  times = times,
  type = "cif_difference",
  cause = 1L,
  reference = "control",
  ci_method = "sandwich"
)
#> Cause-specific CIF contrasts condition on surviving the competing events (truncation by death). Interpret a per-cause CIF difference / ratio as a total effect on cause-specific incidence, not as a contrast holding the competing process fixed. See the competing-risks vignette.
#> This message is displayed once per session.
rd$contrasts

Code

plot(rd)

A forest plot at a reference time shows every cause’s contrast at once:

Code

rd_all <- contrast(
  fit, ivs,
  times = times,
  type = "cif_difference",
  reference = "control",
  ci_method = "sandwich"
)
forrest(rd_all, t_ref = 6)

Years of life lost

Per-cause years of life lost up to \(t^*\) is the integral of the cause-\(j\) cumulative incidence, \(\mathrm{YLL}^{(j),a}(t^*) = \int_0^{t^*} F^{(j),a}(u)\,du\) (Andersen 2013). It carries the same cause dimension as the CIF (and the same truncation-by- death caveat), and the per-cause years lost add up to the all-cause restricted mean time lost: \(\sum_j \mathrm{YLL}^{(j),a}(t^*) = \mathrm{RMTL}^a(t^*)\).

Code

yll <- contrast(fit, ivs, times = times, type = "yll", ci_method = "sandwich")
yll$estimates[time == max(times)]
#> Key: <intervention, cause, time>
#>    intervention cause  time   yll_hat         se  ci_lower  ci_upper     n
#>          <char> <int> <int>     <num>      <num>     <num>     <num> <int>
#> 1:      control     1     6 1.2575314 0.04900849 1.1614765 1.3535862  3000
#> 2:      control     2     6 0.7291165 0.04059976 0.6495424 0.8086906  3000
#> 3:        treat     1     6 0.8624419 0.03958973 0.7848474 0.9400363  3000
#> 4:        treat     2     6 0.8199544 0.04068718 0.7402090 0.8996998  3000

## The per-cause years lost sum to the all-cause restricted mean time lost.
rmtl <- contrast(fit, ivs, times = times, type = "rmtl")
merge(
  yll$estimates[time == max(times), .(yll = sum(yll_hat)), by = intervention],
  rmtl$estimates[time == max(times), .(intervention, rmtl_hat)],
  by = "intervention"
)
#> Key: <intervention>
#>    intervention      yll rmtl_hat
#>          <char>    <num>    <num>
#> 1:      control 1.986648 1.986648
#> 2:        treat 1.682396 1.682396

A caveat: truncation by death

A cause-specific CIF contrast is a total effect on cause-\(j\) incidence: it mixes the effect on the cause-\(j\) hazard with the effect on the competing hazards (which change who survives to be at risk of cause \(j\)). It is not a contrast that holds the competing process fixed, and it conditions on surviving the competing events — the classic “truncation by death” problem. survatr emits this caveat once per session and repeats it when you print a cif_difference / cif_ratio result. Interpret per-cause contrasts accordingly, and report the all-cause survival alongside them.

Scope and rejections

Cause-specific hazards only. Fine–Gray / subdistribution hazards (a different estimand and data structure) are out of scope.
Point-treatment g-computation only this release. estimator = "ipw" or "ice" with competing = aborts (survatr_competing_estimator); longitudinal and IPW competing risks ship in later chunks.
competing must name the same column as outcome, with at least two distinct positive causes (else survatr_competing_misuse), and hold non-negative integer cause codes (else survatr_bad_competing).
Estimand / fit must match: a CIF estimand needs a competing-risks fit and a single-event contrast (e.g. rmst_difference) is not defined for one (survatr_competing_type).
Per-cause RMST / years-of-life-lost (the integral of the CIF) is deferred to a later estimands chunk.

References

Andersen, Per Kragh. 2013. “Decomposition of Number of Life Years Lost According to Causes of Death.” Statistics in Medicine 32 (30): 5278–85. https://doi.org/10.1002/sim.5903.

Hernán, Miguel A., and James M. Robins. 2025. Causal Inference: What If. Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.

Young, Jessica G., and Eric J. Tchetgen Tchetgen. 2014. “Simulation from a Known Cox MSM Using Standard Parametric Models for the g-Formula.” Statistics in Medicine 33 (6): 1001–14. https://doi.org/10.1002/sim.5994.