Using Multi-Year Data: Tips and Cautions
The Dashboard displays multiple years of data for many of its metrics, and users may see changes over time in some of them. These changes can have many causes, including real changes in the outcome of interest, changes in population demographics, and changes in data collection methods. Furthermore, year-to-year changes may be hard to interpret for many reasons, including overlapping confidence intervals*. Because it is hard to know what drives differences from year to year in individual datasets, and how to interpret those changes, it is important to understand the underlying dataset from which the metric was derived. Users should keep the characteristics of the relevant underlying dataset (described below) in mind when interpreting changes in specific measures over time.
The Dashboard team does not recommend using multi-year data to evaluate the impact of local public health programs or policies over time. In most cases, the underlying datasets cannot pick up the effects of local initiatives on a year-to-year basis. This is due to time gaps between when data are collected and when they are made available, the statistical methods used to create some estimates, and the fact that some estimates are based on grouped years of data.
Data sources not listed on this page do not provide multi-year data, and so are not included.
For more information on any of these data sources and their associated metrics, please review our Technical Documents and Metrics Background. Please contact [email protected] with any questions.
Underlying datasets on which multi-year data are based:
American Community Survey (ACS)
ACS data are used to calculate twelve of the Dashboard’s metrics. The Dashboard presents five-year estimates, each of which combines data collected over a five-year period. For example, a 2017 estimate pools information collected by the U.S. Census Bureau in 2013, 2014, 2015, 2016 and 2017. For further explanation and examples, please refer to the U.S. Census Bureau’s publication, “Understanding and Using ACS Single-Year and Multiyear Estimates.”
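One practical consequence of pooling is that estimates labeled with nearby years share most of their underlying data. The sketch below is illustrative only: the function names are hypothetical, and it assumes an estimate is labeled by the last year of its window, as in the 2017 example above.

```python
def pooled_years(end_year: int, width: int = 5) -> list[int]:
    """Return the calendar years pooled into an estimate labeled `end_year`."""
    return list(range(end_year - width + 1, end_year + 1))


def shared_years(end_a: int, end_b: int, width: int = 5) -> list[int]:
    """Return the calendar years that two pooled estimates have in common."""
    return sorted(set(pooled_years(end_a, width)) & set(pooled_years(end_b, width)))


print(pooled_years(2017))        # [2013, 2014, 2015, 2016, 2017]
print(shared_years(2017, 2018))  # [2014, 2015, 2016, 2017]
print(shared_years(2013, 2018))  # []
```

Under these assumptions, the 2017 and 2018 estimates share four of their five years, so a difference between them mostly reflects the one year that dropped out and the one that was added. Only estimates whose windows do not overlap, such as 2013 and 2018, draw on entirely separate data.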
Dashboard measures derived from the ACS dataset:
Children in poverty
Broadband connection
High school completion
Housing with potential lead risk
Income inequality
Independent living difficulty
Lead exposure risk index
Neighborhood racial/ethnic segregation
Racial/ethnic diversity
Rent Burden
Unemployment – annual, neighborhood-level
Uninsured
Local Area Unemployment Statistics, U.S. Bureau of Labor Statistics (BLS)
Data from the BLS are used to produce the Unemployment – current, city-level metric. These data are derived from monthly surveys conducted by the BLS, which gather information from different people each month. As such, the BLS survey is meant to capture unemployment for a given month in a given city and does not necessarily measure how unemployment has changed for a specific group of people over time.
Dashboard measures derived from the BLS dataset:
Unemployment – current, city-level
National Center for Education Statistics, U.S. Department of Education (NCES)
City-level estimates of Chronic Absenteeism for an academic school year (SY) are calculated from NCES enrollment data and school-level chronic absenteeism counts published through the U.S. Department of Education initiatives EDFacts and Ed Data Express. For SY 2017-2018, chronic absenteeism counts are retrieved from EDFacts. For SY 2019-2020 onward, chronic absenteeism counts are retrieved from Ed Data Express.
Dashboard measures derived from the NCES dataset:
Chronic Absenteeism
National Vital Statistics System (NVSS)**
Data from NVSS are used to calculate ten of the Dashboard’s metrics. Like the ACS metrics, NVSS estimates pool multiple years of data, in this case three years.
Dashboard measures derived from the NVSS dataset:
Breast cancer deaths
Cardiovascular disease deaths
Colorectal cancer deaths
Firearm Homicides
Firearm Suicides
Low birthweight
Opioid overdose deaths
Premature deaths (all causes)
Prenatal care
Teen births
**New Jersey State Health Assessment Data (NJSHAD) are used in lieu of NVSS data for a subset of New Jersey cities.
George Mason University Air Quality Team
The George Mason University air quality team created these data by fusing ground observations from the U.S. Environmental Protection Agency (EPA) Air Quality System (AQS) network with computer model predictions from the National Oceanic and Atmospheric Administration (NOAA) National Air Quality Forecast Capability (NAQFC).
Estimates calculated by the George Mason University air quality team will differ from EPA CMAQ RSIG and GMU North America Chemical Reanalysis (NACR), which are commonly used, publicly available data sources for air pollution. While RSIG presently covers a longer data period, NACR uses more up-to-date emission and real-time forecasting data to provide data in a timelier manner (up to the previous day). Both RSIG and NACR provide air pollution data for 12-kilometer square areas, which are larger than many census tracts. EPA CMAQ RSIG further smooths the data to provide census tract-level estimates, while NACR data are provided at the 12-kilometer square area level only. As such, adjacent census tracts might share the same ozone pollution value (ppb) or PM2.5 pollution value (μg/m³).
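To see why neighboring tracts can end up with identical values, consider the simplified sketch below. It is an illustration, not the method used by any of the data sources named above: the function name, the flat kilometer coordinates, and the tract centroids are all hypothetical, and real gridded products define their cells on a map projection.

```python
import math

CELL_KM = 12.0  # grid cell edge length, taken from the text above


def grid_cell(x_km: float, y_km: float, cell_km: float = CELL_KM) -> tuple[int, int]:
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_km / cell_km), math.floor(y_km / cell_km))


# Hypothetical tract centroids in flat kilometer coordinates (made up).
tract_a = (3.0, 4.0)
tract_b = (9.5, 10.0)
tract_c = (13.0, 4.0)

print(grid_cell(*tract_a) == grid_cell(*tract_b))  # True
print(grid_cell(*tract_a) == grid_cell(*tract_c))  # False
```

In this sketch, tracts A and B fall in the same cell and would receive the same gridded value, while tract C falls in the neighboring cell and could differ.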
Dashboard measures derived from George Mason University Air Quality Team's dataset:
Air Pollution - Particulate Matter
Air Pollution - Ozone
PLACES Project (formerly 500 Cities Project)
Data from the CDC’s PLACES Project (formerly 500 Cities Project) are used for 11 of the Dashboard’s metrics. These estimates are calculated differently from most other measures on the Dashboard and therefore have different multi-year data use caveats. According to the PLACES Project, “the current modeling procedure does not support using the estimates to track changes at the local level over time.”
The Dashboard added PLACES data to the website on March 1, 2021. These data represent an update to the 500 Cities data, with some small methodological changes. One change is that PLACES no longer releases estimates for the portion of a census tract that is located within city boundaries. For this reason, users should use additional caution when comparing PLACES data to 500 Cities data from previous years. For more information, please visit the PLACES Project’s Using the Data webpage. Note that pre-2018 data are unavailable for cities that are included in PLACES but were not included in 500 Cities.
Dashboard measures derived from the PLACES (formerly 500 Cities) dataset:
Binge drinking
Dental care
Diabetes
Frequent mental distress
Frequent physical distress
High blood pressure
Obesity
Physical inactivity
Preventive services, 65+
Routine checkup
Smoking
Stanford Education Data Archive (SEDA)
SEDA 5.0 data are derived from the EDFacts data system provided by the U.S. Department of Education. States report to EDFacts the number of students performing at each of the state’s defined achievement levels. SEDA then uses a modeled approach to estimate reading test scores at the city level. Multi-year data are available from academic years 2010-2011 through 2018-2019. SEDA recommends that users exercise caution when comparing estimates over time, as the total number of students may fluctuate based on district and state reporting to EDFacts. For further details, please see the SEDA Technical Documentation.
Dashboard measures derived from the SEDA dataset:
Third-Grade Reading Scores
*TIP – Interested in digging deeper into the Dashboard’s confidence interval information? Confidence intervals for our estimates are available through our API and in our city and tract downloadable files.
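Once confidence intervals have been downloaded, a quick first check is whether the intervals for two years overlap. The sketch below is a minimal illustration: the `Estimate` type, its field names, and the numbers are hypothetical and do not reflect the Dashboard's actual file or API layout.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    value: float
    lower: float  # lower confidence bound
    upper: float  # upper confidence bound


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """Return True if the two confidence intervals share any values."""
    return a.lower <= b.upper and b.lower <= a.upper


# Made-up numbers for illustration only, not Dashboard data.
year_1 = Estimate(value=12.0, lower=10.5, upper=13.5)
year_2 = Estimate(value=13.0, lower=11.8, upper=14.2)
year_3 = Estimate(value=16.0, lower=15.0, upper=17.0)

print(intervals_overlap(year_1, year_2))  # True
print(intervals_overlap(year_1, year_3))  # False
```

Overlap is only a rough screen and not a formal significance test. When intervals overlap, the data alone may not be enough to conclude that the underlying value changed.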
Last updated: July 17, 2024