Test homogeneity of an exposure’s odds ratios across disease subtypes
Description
Given a fitted polytomous (multinomial) case-control model from matcha() (estimator = "polytomous"), tests whether the exposure acts the same way on every disease subtype (the etiologic-heterogeneity question) and reports the efficient pooled ("common") odds ratio that holds under homogeneity. For each exposure term the null hypothesis is that its log odds ratio is equal across the non-reference outcome groups (H0: beta_1 = beta_2 = … = beta_M).
Usage
test_homogeneity(fit, conf_level = 0.95)
Arguments
fit
A matchatr_fit returned by matcha() whose engine is "multinom" (i.e. estimator = "polytomous").
conf_level
Numeric confidence level for the common-OR interval, a single number strictly in (0, 1). Defaults to 0.95.
Details
The test is the Wald test of the canonical etiologic-heterogeneity analysis (Begg & Gray, 1984; as implemented in riskclustr::eh_test_subtype): with the stacked subtype log odds ratios b (length M = number of non-reference groups) and their joint covariance V from the multinomial information matrix, and a full-rank contrast matrix C (M - 1 rows) that differences consecutive subtypes,
W = (C b)' (C V C')^-1 (C b), which under H0 is asymptotically chi-squared with M - 1 degrees of freedom.
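The statistic can be sketched in a few lines of base R. This is an illustration of the formula only, not the package's internal code: the helper name wald_homogeneity and the values of b and V are invented for the example and do not come from any fitted model.

```r
# Wald test of H0: b[1] = ... = b[M], given subtype log odds ratios b
# and their joint covariance matrix V (hypothetical helper).
wald_homogeneity <- function(b, V) {
  M <- length(b)
  # Contrast matrix differencing consecutive subtypes:
  # row i is e_i - e_{i+1}, giving M - 1 rows of full rank.
  C <- cbind(diag(M - 1), 0) - cbind(0, diag(M - 1))
  d <- C %*% b
  W <- drop(t(d) %*% solve(C %*% V %*% t(C), d))
  df <- M - 1
  list(statistic = W, df = df,
       p.value = pchisq(W, df, lower.tail = FALSE))
}

# Illustrative inputs for M = 3 subtypes (made up, not from a real fit).
b <- c(0.67, 0.56, 0.80)
V <- matrix(c(0.0057, 0.0020, 0.0020,
              0.0020, 0.0076, 0.0020,
              0.0020, 0.0020, 0.0090), nrow = 3, byrow = TRUE)
wald_homogeneity(b, V)
```

With M = 2 the contrast matrix is the single row (1, -1), and the statistic reduces to the squared difference of the two log odds ratios divided by the variance of that difference.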
The common odds ratio is the minimum-variance (generalized-least-squares, inverse-variance) combination of the subtype log odds ratios, that is, the restricted estimator under the equality constraint, asymptotically equivalent to the constrained maximum-likelihood fit. In the standard generalized-least-squares form, with 1 the length-M vector of ones,
b_common = (1' V^-1 1)^-1 1' V^-1 b, Var(b_common) = (1' V^-1 1)^-1,
exponentiated to the odds-ratio scale with a Wald interval on the log scale (so the interval is asymmetric on the OR scale). Because the constraint is imposed on the already-fitted unconstrained model, no refit is needed and the test handles continuous confounders directly. The pooled estimate is at least as efficient as any single subtype odds ratio (Begg & Gray, 1984): its standard error on the log scale is no larger than that of any subtype estimate it pools.
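As a rough sketch of the pooling step, the standard generalized-least-squares combination of M correlated estimates of one common parameter weights them by the rows of the inverse covariance matrix. The helper name common_or and the inputs below are assumptions made for illustration; the package's own computation may differ in detail.

```r
# GLS (inverse-variance) pooling of subtype log odds ratios b with
# joint covariance V, under the constraint that all are equal
# (hypothetical helper).
common_or <- function(b, V, conf_level = 0.95) {
  ones <- rep(1, length(b))
  Vinv_ones <- solve(V, ones)           # V^-1 1
  var_c <- 1 / sum(Vinv_ones)           # (1' V^-1 1)^-1
  est <- var_c * sum(Vinv_ones * b)     # 1' V^-1 b, using symmetry of V
  z <- qnorm(1 - (1 - conf_level) / 2)
  se_log <- sqrt(var_c)
  # Wald interval built on the log scale, then exponentiated.
  c(common_or = exp(est),
    se_log    = se_log,
    ci_lower  = exp(est - z * se_log),
    ci_upper  = exp(est + z * se_log))
}

# Illustrative inputs for two subtypes (made up, not from a real fit).
b <- c(0.67, 0.56)
V <- matrix(c(0.0057, 0.0020,
              0.0020, 0.0076), nrow = 2)
common_or(b, V)
```

When V is diagonal this reduces to the familiar inverse-variance weighted mean. In general the pooled variance cannot exceed the smallest diagonal entry of V, because each single subtype estimate is itself one of the linear combinations the GLS estimator minimizes over.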
Each exposure term is tested separately (one "risk factor" per column): a binary or continuous exposure contributes one row, and an unordered factor exposure contributes one row per non-reference level. There is no omnibus test across the levels of a multi-level factor exposure; each level gets its own homogeneity test, so adjust for multiple comparisons if several levels are screened. The fit must come from the polytomous multinomial engine (three or more outcome groups); any other engine, or a fit that produced no model, is rejected.
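One way to handle the multiple-comparison point is to adjust the per-level p-values with stats::p.adjust(). The sketch below uses a stand-in data frame with invented term names and p-values for a four-level factor exposure; with a real result the p-values would come from the p.value column of the homogeneity table.

```r
# Hypothetical stand-in for the homogeneity table of a four-level factor
# exposure: one row per non-reference level (values invented).
homog <- data.frame(
  term    = c("smokeformer", "smokecurrent", "smokeheavy"),
  p.value = c(0.012, 0.200, 0.030)
)

# Holm's step-down adjustment controls the family-wise error rate
# without assuming the per-level tests are independent.
homog$p.holm <- p.adjust(homog$p.value, method = "holm")
homog
```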
Value
A matchatr_homogeneity object: a list carrying homogeneity (a data.table with one row per exposure term — the term, the common odds ratio with its Wald bounds, and the homogeneity chi-squared statistic, df, and p.value), subtype (the per-subtype odds ratios it pools), the baseline reference group, the conf_level, the analysis size n, and the estimator / engine labels.
References
Begg CB, Gray R (1984). Calculation of polytomous logistic regression parameters using individualized regressions. Biometrika 71(1), 11-18.
Borgan O, Breslow N, Chatterjee N, Gail MH, Scott A, Wild CJ (2018). Handbook of Statistical Methods for Case-Control Studies, Chapter 5.
See Also
matcha(), contrast(), tidy.matchatr_homogeneity()
Examples
library("matchatr")
set.seed(5)
n <- 4000
x <- rbinom(n, 1, 0.4)
# Subtype A and B share the exposure effect (homogeneity holds).
eta <- cbind(control = 0, A = -1 + log(2) * x, B = -1.4 + log(2) * x)
prob <- exp(eta) / rowSums(exp(eta))
g <- apply(prob, 1, function(p) sample(c("control", "A", "B"), 1, prob = p))
d <- data.frame(g = g, x = x)
fit <- matcha(d, outcome = "g", exposure = "x",
              design = unmatched_cc(), estimator = "polytomous",
              reference = "control")
test_homogeneity(fit)
<matchatr_homogeneity>
Estimator: polytomous (engine: multinom)
Test: Homogeneity of subtype odds ratios (Wald)
Reference: control
N: 4000
Common (pooled) odds ratio per exposure term and homogeneity test:
     term common_or        se ci_lower ci_upper statistic    df   p.value
   <char>     <num>     <num>    <num>    <num>     <num> <int>     <num>
1:      x  1.868871 0.1216725 1.644985 2.123228   1.25593     1 0.2624228
Per-subtype odds ratios (pooled):
   comparison       or ci_lower ci_upper
       <char>    <num>    <num>    <num>
1:       A: x 1.951515 1.682447 2.263613
2:       B: x 1.751165 1.475982 2.077653
A small p-value is evidence the exposure odds ratio differs across subtypes.
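The printed numbers can be checked by hand in base R. The snippet below copies the statistic, degrees of freedom, and interval from the output above; the variable names are chosen for the example. It recomputes the p-value from the chi-squared distribution and confirms that the interval is symmetric around the common odds ratio on the log scale.

```r
# Values copied from the printed homogeneity table above.
statistic <- 1.25593
df        <- 1
common_or <- 1.868871
ci_lower  <- 1.644985
ci_upper  <- 2.123228

# Upper-tail chi-squared probability: should agree with the printed p.value.
p_value <- pchisq(statistic, df, lower.tail = FALSE)
p_value

# A Wald interval built on the log scale is symmetric there,
# so the two half-widths should match.
upper_half <- log(ci_upper) - log(common_or)
lower_half <- log(common_or) - log(ci_lower)
c(upper_half = upper_half, lower_half = lower_half)
```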