This is a topic I wish I had known about when I was training. I didn't realize how many papers could be published from these datasets, or how quickly, because they require no data collection. I had to learn through trial and error. I once spent months on a chart review project that yielded zero publications. Later in my career, I published more than five first-author papers in that same span of time using public databases. My hope is to distill my years of experience, coursework, and mentorship to accelerate your research journey.
This article introduces publicly available databases: their benefits, where to find them, and how to analyze them.
As the name suggests, these datasets are available to the public and can be downloaded for free and without any paperwork or approvals. There are dozens of nationally representative datasets that can be easily found, downloaded, and analyzed.
You may be thinking, "If these datasets are publicly available, wouldn't all the research questions have been answered already?" Not at all! These datasets have thousands of variables and are updated every couple of years, so there are tens of thousands of research questions to explore. There aren't enough researchers to ask or answer all the important questions before the next cycle comes out.
From a public health perspective, these datasets are extremely powerful for a number of reasons:
Beyond their overall public health value, these datasets offer practical advantages for academic publishing:
Here are some selected datasets, each with a brief description. Dozens more can easily be found.
National Health and Nutrition Examination Survey (NHANES): A survey combining interviews, physical examinations, and laboratory tests on a nationally representative sample of ~5,000 people annually. Unique for combining self-reported data with objective health measurements like blood work, body measurements, and clinical assessments. Excellent for studying disease prevalence and nutritional status.
National Health Interview Survey (NHIS): The longest-running health survey in the U.S., conducting annual in-person interviews with ~35,000 households. Focuses on health status, healthcare access, and health behaviors through self-report. Strong for tracking health trends over time and studying healthcare utilization patterns.
Medical Expenditure Panel Survey (MEPS): Tracks healthcare costs, utilization, and insurance coverage for the same families over 2+ years. Provides detailed information on what Americans pay for healthcare, what services they use, and how they're insured. Essential for health economics and policy research.
Behavioral Risk Factor Surveillance System (BRFSS): The largest telephone-based health survey system, collecting data from all 50 states on health behaviors, chronic conditions, and preventive services. Provides state-level data that's crucial for public health planning and tracking Healthy People objectives.
Youth Risk Behavior Surveillance System (YRBSS): Monitors health behaviors among high school students that contribute to leading causes of death and disability. Covers topics like substance use, sexual behaviors, violence, and mental health. Critical for understanding adolescent health trends and informing school-based interventions.
Accessing these data is easy. Analyzing these data requires a little more nuance. Most of the time, this requires a statistical background. If you don't have much statistical experience, this next section might be hard to contextualize. There are a few things that need to be true for the analysis:
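As one illustration of why the analysis needs care: nationally representative surveys generally publish sampling weights, and an estimate that ignores them can differ sharply from one that uses them. The sketch below is a minimal example under stated assumptions, not a full survey analysis. The `weighted_prevalence` helper, the outcomes, and the weights are all hypothetical and made up for illustration; they are not variables from any of the datasets above.

```python
def weighted_prevalence(outcomes, weights):
    """Return the weighted proportion of respondents with outcome == 1.

    Hypothetical helper for illustration only. Each weight is treated as
    the number of people in the population that one respondent represents.
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(y * w for y, w in zip(outcomes, weights)) / total_weight


# Made-up toy sample: 1 = has the condition, 0 = does not.
outcomes = [1, 0, 0, 1, 0]
# Made-up sampling weights (assumption: one weight per respondent).
weights = [1000, 4000, 4000, 500, 500]

# Naive estimate: every respondent counts equally.
unweighted = sum(outcomes) / len(outcomes)
# Weighted estimate: respondents count in proportion to their weight.
weighted = weighted_prevalence(outcomes, weights)

print(f"Unweighted prevalence: {unweighted:.2f}")  # 0.40
print(f"Weighted prevalence:   {weighted:.2f}")    # 0.15
```

In this toy sample, the two respondents with the condition carry small weights, so the weighted estimate (0.15) is much lower than the naive one (0.40). The sketch covers only the point estimate; standard errors for complex surveys also depend on design information such as strata and clusters, which is usually handled by dedicated survey software.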
While handling these data and managing these statistical requirements may seem daunting, Lumono guides users of every level through the entire research process using these datasets. We organize the data, help you ask a relevant research question, run the appropriate statistical analyses, and help you interpret the results. We accelerate your research journey from months to weeks. Sign up for our newsletter to receive exclusive research guides and product updates.