
The National Health and Nutrition Examination Survey (NHANES) is a continuous, nationally representative survey that measures the health and nutritional status of people living in the United States. It is run by the CDC's National Center for Health Statistics (NCHS), and it is the only U.S. national health survey that combines in-person interviews, physical examinations, and laboratory tests for participants of all ages.
For medical trainees and early-career researchers, NHANES is one of the most valuable publicly available datasets in existence. It's free, well-documented, and has supported tens of thousands of peer-reviewed publications — including papers in the New England Journal of Medicine and JAMA. If you're trying to publish original research without spending months on chart reviews or patient recruitment, NHANES is often the fastest credible path there.
This guide walks through what NHANES is, what kinds of questions it can answer, what its limitations are, and how it compares to the other public databases trainees commonly use.
Here are the core facts worth knowing before you go any further:
NHANES didn't start out as one continuous survey. The first National Health Examination Survey (NHES) was conducted from 1960 to 1962 and focused on adults aged 18–79. Two follow-up surveys covered children and adolescents through the 1960s. In the early 1970s, the survey was expanded to include nutritional measurements and renamed the National Health and Nutrition Examination Survey.
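From 1971 through 1994, NHANES was conducted as a series of discrete surveys (NHANES I, II, and III). Beginning in 1999, NHANES became a continuous survey, with data typically released in two-year cycles. The 2019–2020 cycle is an exception: field work was suspended in March 2020, and the partial data were combined with 2017–2018 into a pre-pandemic file.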
Why does this history matter for your research? Two reasons. First, when you're searching for variables, you'll often see references to "NHANES III" or "Continuous NHANES" — these refer to different eras of the survey with different file structures. Second, the shift to continuous collection in 1999 is what makes NHANES so powerful for modern research: you can combine multiple cycles to increase sample size, study trends over time, and ask questions that require comparing pre- and post-event data (for example, before and after the COVID-19 pandemic).
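Pooling cycles also means rescaling the survey weights. The sketch below is a minimal illustration in Python, assuming every pooled cycle is a standard two-year cycle from 2001–2002 onward. For those cycles, NCHS analytic guidance is commonly summarized as "divide the two-year weight by the number of cycles pooled". The 1999–2000 cycle and the 2017–March 2020 pre-pandemic file follow their own rules, so confirm against the guidelines for your cycles. The function name `combine_cycles`, the output column `WT_COMBINED`, and the toy values are illustrative, not part of NHANES.

```python
import pandas as pd


def combine_cycles(cycles: list[pd.DataFrame], weight_col: str = "WTMEC2YR") -> pd.DataFrame:
    """Stack two-year cycles and rescale the two-year weight for pooled analysis.

    Assumption: all cycles are standard two-year cycles that may be pooled by
    dividing the two-year weight by the number of cycles.
    """
    n_cycles = len(cycles)
    combined = pd.concat(cycles, ignore_index=True)
    # WT_COMBINED is a hypothetical column name chosen for this sketch.
    combined["WT_COMBINED"] = combined[weight_col] / n_cycles
    return combined


# Toy data standing in for two cycles (SEQN is the participant identifier).
cycle_a = pd.DataFrame({"SEQN": [1, 2], "WTMEC2YR": [1000.0, 3000.0]})
cycle_b = pd.DataFrame({"SEQN": [3, 4], "WTMEC2YR": [2000.0, 2000.0]})

pooled = combine_cycles([cycle_a, cycle_b])
print(pooled)
```

After rescaling, the pooled weights sum to roughly the size of one population rather than two, which is what keeps pooled estimates on the right scale.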
This is where NHANES becomes unusual. Most national surveys rely entirely on what participants tell interviewers. NHANES combines what people report about themselves with what's directly measured during a physical exam. That combination is what makes it irreplaceable for certain kinds of research.
The data are organized into five major components, each released as a separate set of files for every two-year cycle:
Demographics. Age, sex, race and ethnicity, household income, education, marital status, country of birth, and military service history.
Dietary. Two 24-hour dietary recalls per participant, along with dietary supplement use, food security status, and dietary behaviors. NHANES is one of the most detailed sources of population-level nutritional data in the world.
Examination. Body measurements (weight, height, BMI, body composition), blood pressure, dental exams, and — depending on the cycle — vision, hearing, dermatology, balance, grip strength, and respiratory testing.
Laboratory. Blood, urine, and other biospecimen results, including markers for chronic disease (cholesterol, A1c, kidney function), infectious disease, environmental exposures (lead, BPA, PFAS), and nutritional biomarkers (vitamin D, ferritin).
Questionnaire. Self-reported information on chronic conditions, prescription medications (verified by examining bottle labels during the home interview), preventive care, mental health, sexual and reproductive health, occupation, and dozens of other topics.
Because the measured data (from the exam and lab) and the self-reported data (from the interviews) are linked for each participant, you can study not just what people say about their health, but what their bodies actually show.
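In practice, the components are joined on `SEQN`, the participant sequence number that appears in every file. The following is a minimal sketch, assuming pandas is available. The toy DataFrames stand in for files that would normally be downloaded as SAS transport (`.XPT`) files and read with `pd.read_sas(path, format="xport")`. The variable names shown (`RIDAGEYR`, `BMXBMI`, `LBXGH`) follow NHANES naming, but the values are invented.

```python
import pandas as pd

# Toy stand-ins for three component files; real files would be read with
# pd.read_sas("<file>.XPT", format="xport").
demo = pd.DataFrame({"SEQN": [101, 102, 103], "RIDAGEYR": [34, 61, 8]})  # demographics
bmx = pd.DataFrame({"SEQN": [101, 102], "BMXBMI": [27.4, 31.0]})         # examination
ghb = pd.DataFrame({"SEQN": [101, 103], "LBXGH": [5.4, 5.1]})            # laboratory

# Left-join onto demographics so every sampled participant is kept, even
# those who skipped the exam or a given lab test. validate= raises an error
# if a file unexpectedly has more than one row per participant.
merged = (
    demo.merge(bmx, on="SEQN", how="left", validate="one_to_one")
        .merge(ghb, on="SEQN", how="left", validate="one_to_one")
)
print(merged)
```

Starting from the demographics file is a design choice: it also carries the weights and design variables, and it makes missing exam or lab data visible as `NaN` rather than silently dropping participants. Some files (for example, individual foods or prescription medications) have several rows per participant and would need to be summarized to one row each before a one-to-one merge.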
Because of its scope, NHANES can support a wide range of research questions. A few examples of the types of questions trainees commonly investigate:
NHANES has been used to generate national prevalence estimates for obesity, diabetes, and hypertension, and to establish national reference standards for measurements like height, weight, and blood pressure. The same dataset that supports those landmark public health figures is available to you as a trainee.
NHANES is powerful, but it's not the right tool for every question. The most important limitations to understand:
Cross-sectional design. Each cycle is a snapshot in time, so an NHANES analysis can generally show association but not causation. Apart from linkage to outside records such as the NCHS linked mortality files, participants are not re-examined in later cycles. If you need repeated measurements on the same individuals over years, NHANES is not your dataset.
Complex survey design. NHANES uses a stratified, multistage cluster sample and oversamples certain groups so that its estimates are nationally representative. Methods that assume a simple random sample therefore give misleading results: you must account for survey weights, strata, and primary sampling units in your analysis, or your point estimates can be biased and your standard errors wrong. This is the most common reason trainee projects using NHANES go wrong.
Variables change between cycles. Some measurements are collected in every cycle; others appear only in specific years. Before you commit to a question, you need to confirm your variables of interest exist in the cycles you want to study.
Limited geographic resolution. Because of how the sample is drawn, NHANES cannot produce state-level or city-level estimates. For state-level research, datasets like BRFSS are better suited.
Self-reported components. Even though NHANES includes measured data, large parts of the survey still rely on self-report (diet recalls, behavior questionnaires), which introduces the usual recall and social desirability biases.
These limitations aren't dealbreakers — they're the boundaries that define what NHANES is good for. Knowing them upfront protects you from designing a study NHANES can't actually answer.
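The "variables change between cycles" check above is easy to automate once you have a list of column names per cycle. This is a minimal sketch in Python; the helper `cycles_with_all`, the cycle labels, and the variable names are hypothetical placeholders, and the availability shown is made up rather than real NHANES metadata.

```python
def cycles_with_all(columns_by_cycle: dict[str, set[str]], required: set[str]) -> list[str]:
    """Return the cycles (in input order) whose files contain every required variable."""
    return [cycle for cycle, cols in columns_by_cycle.items() if required <= cols]


# Made-up availability, for illustration only. In practice each set would be
# built from the column names of the files you downloaded for that cycle.
columns_by_cycle = {
    "cycle_1": {"SEQN", "VAR_A", "VAR_B"},
    "cycle_2": {"SEQN", "VAR_A"},
    "cycle_3": {"SEQN", "VAR_A", "VAR_B", "VAR_C"},
}

usable = cycles_with_all(columns_by_cycle, {"VAR_A", "VAR_B"})
print(usable)  # ['cycle_1', 'cycle_3']
```

A matching name does not guarantee a matching measurement: methods and units can change between cycles, so the documentation for each cycle still needs a read.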
NHANES is one of four major publicly available U.S. health databases that Lumono supports. Each serves different research purposes:
If your question involves a lab value, a body measurement, or a nutritional biomarker, NHANES is almost certainly the right starting point. If it involves cost, coverage, or communication, one of the others will likely serve you better.
This is worth saying directly: NHANES is not a dataset you can drop into a basic statistics workflow and expect to get correct answers. The complex survey design means that running an ordinary t-test or logistic regression on NHANES data will give you the wrong standard errors and potentially the wrong point estimates.
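A toy example shows how ignoring the weights can move the point estimate itself. The sketch below assumes NumPy and uses four invented participants. Three come from an oversampled group, so each represents relatively few people; one comes from a group where each respondent represents many people.

```python
import numpy as np

# 1 = has the condition, 0 = does not (invented data).
has_condition = np.array([1.0, 1.0, 1.0, 0.0])
# Survey weight = number of people in the population each respondent represents.
weights = np.array([100.0, 100.0, 100.0, 900.0])

unweighted = has_condition.mean()                     # treats every respondent equally
weighted = np.average(has_condition, weights=weights)  # 300 / 1200

print(f"unweighted prevalence: {unweighted:.2f}")  # 0.75
print(f"weighted prevalence:   {weighted:.2f}")    # 0.25
```

The unweighted figure describes the sample and the weighted figure describes the population. With oversampling, the two can differ substantially.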
To analyze NHANES correctly, you need to:
This is the step where most trainee projects stall. It's also the step where Lumono is built to help.
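To make the role of strata and primary sampling units (PSUs) concrete, here is a minimal sketch of a design-based standard error for a weighted mean, using Taylor linearization with the usual "PSUs sampled with replacement within strata" approximation. It assumes pandas and NumPy. The function name `survey_mean` and the toy data are illustrative; `SDMVSTRA`, `SDMVPSU`, and `WTMEC2YR` are the NHANES design and weight variable names. For real analyses, established survey software (for example, the R `survey` package, Stata's `svy` commands, or SAS survey procedures) is the safer choice.

```python
import numpy as np
import pandas as pd


def survey_mean(df: pd.DataFrame, y: str, weight: str, strata: str, psu: str) -> tuple[float, float]:
    """Weighted mean and its Taylor-linearized standard error."""
    w = df[weight].to_numpy(dtype=float)
    values = df[y].to_numpy(dtype=float)
    total_w = w.sum()
    mean = float((w * values).sum() / total_w)

    # Linearized contribution of each respondent to the ratio estimator.
    z = w * (values - mean) / total_w
    scores = df[[strata, psu]].assign(z=z)
    # PSU ids repeat across strata, so group on the (stratum, PSU) pair.
    psu_totals = scores.groupby([strata, psu])["z"].sum()

    variance = 0.0
    for _, totals in psu_totals.groupby(level=0):
        n_psu = len(totals)
        if n_psu < 2:
            raise ValueError("each stratum needs at least two PSUs")
        variance += n_psu / (n_psu - 1) * float(((totals - totals.mean()) ** 2).sum())
    return mean, float(np.sqrt(variance))


# Invented data: two strata, two PSUs each, equal weights for easy checking.
toy = pd.DataFrame({
    "SDMVSTRA": [1, 1, 1, 1, 2, 2, 2, 2],
    "SDMVPSU":  [1, 1, 2, 2, 1, 1, 2, 2],
    "WTMEC2YR": [1.0] * 8,
    "value":    [1.0, 3.0, 5.0, 7.0, 2.0, 2.0, 6.0, 6.0],
})

mean, se = survey_mean(toy, "value", "WTMEC2YR", "SDMVSTRA", "SDMVPSU")
print(mean, se)
```

The variance comes from how much PSU totals vary within each stratum, not from how much individuals vary. That is why an ordinary standard error, which treats respondents as independent, is typically wrong for clustered data. The sketch does not handle subpopulation (domain) analyses, which need the full design retained rather than a filtered dataset.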
Lumono is built specifically to make publicly available health databases — including NHANES — usable for medical trainees who don't have years of biostatistics training. We handle the data preparation, walk you through formulating a feasible research question, run the statistically appropriate analyses, and help you interpret the results in the context of existing literature.
If you're considering an NHANES project, the database study timeline post is a good next read — it walks through what the project will actually look like from idea to submission. And if you want a broader view of the public databases trainees should know about, our overview of publicly available health datasets covers the full landscape.
NHANES has supported decades of high-impact research, and it's free for anyone willing to learn how to use it. With the right framing and the right support, it can be the dataset behind your first — or next — publication.