NHEFS: National Health and Nutrition Examination Survey Epidemiologic Follow-up Study
Description
A subset of the NHEFS dataset used throughout Hernán & Robins (2025) to illustrate causal effect estimation methods. Contains 1,746 cigarette smokers aged 25-74 who completed a baseline survey (1971-75) and a follow-up survey (1982). Of these, 117 have missing education values; the book's analyses typically use the 1,629-row subset with complete education.
Usage
nhefs
Format
A data.table with 1,746 rows and the following variables:
seqn
Individual identifier.
qsmk
Quit smoking between baseline and 1982 (1 = yes, 0 = no). Primary treatment variable.
wt82_71
Weight change in kg between 1971 and 1982. Primary outcome variable. Missing for 63 individuals lost to follow-up.
sex
Sex (0 = male, 1 = female).
age
Age at baseline (years).
race
Race (0 = white, 1 = other).
education
Education level (1-5, from less than high school to college or more).
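Examples
The following is a minimal sketch of how the complete-education subset and the missing outcome values described above might be handled with data.table. It runs on a small hypothetical stand-in table (`toy`, with made-up values) so that it is self-contained; with the package loaded, the real `nhefs` object would presumably be used in its place. The object names `complete_edu`, `analysis`, and `by_qsmk` are illustrative and not part of the package.

```r
library(data.table)

# Hypothetical toy stand-in with the documented columns (values are made up).
# With the package loaded, replace `toy` with the real `nhefs` object.
toy <- data.table(
  seqn      = 1:6,
  qsmk      = c(0, 1, 0, 1, 0, 1),
  wt82_71   = c(2.0, 5.0, NA, 3.0, 1.0, 4.0),
  sex       = c(0, 1, 1, 0, 0, 1),
  age       = c(42, 36, 56, 68, 40, 43),
  race      = c(1, 0, 1, 1, 0, 1),
  education = c(1, 2, 2, NA, 3, 5)
)

# Keep only rows with an observed education value.
complete_edu <- toy[!is.na(education)]

# Additionally drop rows whose outcome is missing (lost to follow-up).
analysis <- complete_edu[!is.na(wt82_71)]

# Unadjusted mean weight change by quitting status (descriptive only,
# not a causal effect estimate).
by_qsmk <- analysis[, .(n = .N, mean_change = mean(wt82_71)), keyby = qsmk]
print(by_qsmk)
```

Applied to the real data, the first filter should correspond to the 1,629-row subset mentioned in the description, and the second removes individuals with a missing wt82_71. Simply dropping those rows ignores possible selection bias from loss to follow-up, so it is shown here only as a starting point.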