Competing risks with survatr

When an individual can fail from more than one mutually exclusive cause, a death from cause B removes them from being at risk of cause A: the causes compete. survatr handles this with cause-specific hazards + cumulative incidence functions (CIF). For \(J\) competing event types it fits \(J\) parallel pooled-logistic cause-specific hazard models and builds, under each intervention, the cause-\(j\) cumulative incidence and the all-cause survival.

This vignette covers surv_fit(competing = ): the data layout, the CIF estimands, the \(\sum_j F^{(j)}(t) + S(t) = 1\) identity, contrasts with sandwich inference, and the interpretational caveat that comes with cause-specific contrasts. Fine–Gray / subdistribution hazards are out of scope — survatr is cause-specific only.

The model

For cause \(j\), the cause-specific hazard on the at-risk person-period rows is

\[ \text{logit}\, h^{(j)}(t \mid A, L) = \alpha_j(t) + \beta_{A,j} A + \beta_{L,j} L . \]

All \(J\) models are fit on one shared all-cause risk set — an individual leaves the risk set at the first event of any cause. That is exactly what “treat the competing causes as censored at their event time” means, so you do not build a separate risk set per cause. With the summed hazard \(H_{k} = \sum_j h^{(j)}_{k}\), the per-individual all-cause survival and cause-\(j\) cumulative incidence are

\[ S_i(k) = \prod_{m \le k} (1 - H_{i,m}), \qquad F^{(j)}_i(t) = \sum_{k \le t} S_i(k-1)\, h^{(j)}_{i,k}, \]

averaged across individuals (cumulate within id, then average — never average hazards first). By construction \(\sum_j F^{(j)}(t) + S(t) = 1\).

A two-cause data set

The event column is a single multi-valued integer: 0 = no event this period, 1..J = the cause of the event this period. Administrative censoring (if any) goes in a separate censoring column, as elsewhere in survatr.

Code
library(causatr) # attach first so survatr's contrast() generic dispatches
library(survatr)
library(data.table)

sim_cr <- function(n = 3000L, K = 6L, seed = 1L,
                   h1 = 0.08, h2 = 0.05,
                   beta1_A = -0.5, beta1_L = 0.6, beta2_L = 0.3, gamma = 0.6) {
  set.seed(seed)
  L <- rnorm(n)
  A <- rbinom(n, 1L, plogis(gamma * L)) # L confounds A
  l1 <- qlogis(h1)
  l2 <- qlogis(h2)
  rows <- vector("list", n)
  for (i in seq_len(n)) {
    p1 <- plogis(l1 + beta1_A * A[i] + beta1_L * L[i])
    p2 <- plogis(l2 + beta2_L * L[i])
    ev <- integer(K)
    done <- FALSE
    for (k in seq_len(K)) {
      if (done) next
      u <- runif(1L) # multinomial: (no event, cause 1, cause 2)
      if (u < p1) {
        ev[k] <- 1L
        done <- TRUE
      } else if (u < p1 + p2) {
        ev[k] <- 2L
        done <- TRUE
      }
    }
    rows[[i]] <- data.table(
      id = i, t = seq_len(K), A = A[i], L = L[i], event = ev
    )
  }
  rbindlist(rows)
}

pp <- sim_cr()
head(pp)
#>       id     t     A          L event
#>    <int> <int> <int>      <num> <int>
#> 1:     1     1     1 -0.6264538     0
#> 2:     1     2     1 -0.6264538     0
#> 3:     1     3     1 -0.6264538     0
#> 4:     1     4     1 -0.6264538     0
#> 5:     1     5     1 -0.6264538     2
#> 6:     1     6     1 -0.6264538     0

Fitting

Pass the multi-valued column to both outcome and competing (they name the same column). Competing risks is a g-computation / Track A estimator in this release.

Code
fit <- surv_fit(
  pp,
  outcome = "event",
  treatment = "A",
  confounders = ~ L,
  id = "id",
  time = "t",
  competing = "event",
  time_formula = ~ factor(t)
)
fit
#> <survatr_fit>
#>   Track:       A
#>   Estimator:   gcomp
#>   Family:      binomial
#>   Outcome:     event
#>   Treatment:   A
#>   ID:          id
#>   Time:        t
#>   Censoring:   none
#>   N:           3000 individuals, 18000 PP rows (13275 at risk)
#>   Time grid:   [1, 6] (6 unique times)

The fit holds one cause-specific hazard model per cause:

Code
names(fit$cause_models)
#> [1] "1" "2"
fit$causes
#> [1] 1 2

Cumulative incidence under two interventions

type = "cif" returns the per-cause cumulative incidence for every cause (a cause column distinguishes them). cause = selects a subset.

Code
ivs <- list(treat = causatr::static(1), control = causatr::static(0))
times <- c(2L, 4L, 6L)

cif <- contrast(fit, ivs, times = times, type = "cif")
cif$estimates
#>     intervention cause  time    cif_hat    se ci_lower ci_upper     n
#>           <char> <int> <int>      <num> <num>    <num>    <num> <int>
#>  1:        treat     1     2 0.11116458    NA       NA       NA  3000
#>  2:        treat     1     4 0.19371701    NA       NA       NA  3000
#>  3:        treat     1     6 0.25267869    NA       NA       NA  3000
#>  4:        treat     2     2 0.10832760    NA       NA       NA  3000
#>  5:        treat     2     4 0.18013869    NA       NA       NA  3000
#>  6:        treat     2     6 0.24302182    NA       NA       NA  3000
#>  7:      control     1     2 0.16720182    NA       NA       NA  3000
#>  8:      control     1     4 0.28204385    NA       NA       NA  3000
#>  9:      control     1     6 0.35904002    NA       NA       NA  3000
#> 10:      control     2     2 0.09894726    NA       NA       NA  3000
#> 11:      control     2     4 0.16019835    NA       NA       NA  3000
#> 12:      control     2     6 0.21082528    NA       NA       NA  3000

All-cause survival comes from the summed hazards (type = "survival"), and the identity holds:

Code
surv <- contrast(fit, ivs, times = times, type = "survival")

ident <- merge(
  cif$estimates[, .(fsum = sum(cif_hat)), by = .(intervention, time)],
  surv$estimates[, .(intervention, time, s_hat)],
  by = c("intervention", "time")
)
ident[, total := fsum + s_hat][]
#> Key: <intervention, time>
#>    intervention  time      fsum     s_hat total
#>          <char> <int>     <num>     <num> <num>
#> 1:      control     2 0.2661491 0.7338509     1
#> 2:      control     4 0.4422422 0.5577578     1
#> 3:      control     6 0.5698653 0.4301347     1
#> 4:        treat     2 0.2194922 0.7805078     1
#> 5:        treat     4 0.3738557 0.6261443     1
#> 6:        treat     6 0.4957005 0.5042995     1

Contrasts and sandwich inference

cif_difference and cif_ratio contrast a cause’s cumulative incidence between interventions. The stacked-EE sandwich propagates the uncertainty in all \(J\) cause-specific models through the CIF.

Code
rd <- contrast(
  fit, ivs,
  times = times,
  type = "cif_difference",
  cause = 1L,
  reference = "control",
  ci_method = "sandwich"
)
#> Cause-specific CIF contrasts condition on surviving the competing events (truncation by death). Interpret a per-cause CIF difference / ratio as a total effect on cause-specific incidence, not as a contrast holding the competing process fixed. See the competing-risks vignette.
#> This message is displayed once per session.
rd$contrasts
Code
plot(rd)

A forest plot at a reference time shows every cause’s contrast at once:

Code
rd_all <- contrast(
  fit, ivs,
  times = times,
  type = "cif_difference",
  reference = "control",
  ci_method = "sandwich"
)
forrest(rd_all, t_ref = 6)

A caveat: truncation by death

A cause-specific CIF contrast is a total effect on cause-\(j\) incidence: it mixes the effect on the cause-\(j\) hazard with the effect on the competing hazards (which change who survives to be at risk of cause \(j\)). It is not a contrast that holds the competing process fixed, and it conditions on surviving the competing events — the classic “truncation by death” problem. survatr emits this caveat once per session and repeats it when you print a cif_difference / cif_ratio result. Interpret per-cause contrasts accordingly, and report the all-cause survival alongside them.

Scope and rejections

  • Cause-specific hazards only. Fine–Gray / subdistribution hazards (a different estimand and data structure) are out of scope.
  • g-computation / Track A only this release. estimator = "ipw" or "ice" with competing = aborts (survatr_competing_estimator); longitudinal and IPW competing risks ship in later chunks.
  • competing must name the same column as outcome, with at least two distinct positive causes (else survatr_competing_misuse), and hold non-negative integer cause codes (else survatr_bad_competing).
  • Estimand / fit must match: a CIF estimand needs a competing-risks fit and a single-event contrast (e.g. rmst_difference) is not defined for one (survatr_competing_type).
  • Per-cause RMST / years-of-life-lost (the integral of the CIF) is deferred to a later estimands chunk.

References

  • Hernán MA, Robins JM (2025). Causal Inference: What If, Ch. 17. Chapman & Hall/CRC.
  • Young JG, Tchetgen Tchetgen EJ (2014). Simulation from a known cause-specific cumulative incidence function. Statistics in Medicine 33:1098–1114.