The mySidewalk data library supports a rich collection of US Census products spanning more than thirty years. Drawing on Census data from as far back as 1990, we produced projections for over one hundred Census concepts at all sixteen geography levels available in our application.
Preparing the Data
Producing projections from historical Census data required a few supporting procedures to prepare the data. We used geographic harmonization to align data from past Census products with Census 2010 geographic boundaries. We also used the Amelia II software, provided by Gary King et al. at Harvard University (see citation below), to impute missing values in the data.
All projections are generated from data originating from the following Census products:
- Decennial Census 1990
- Decennial Census 2000
- Decennial Census 2010
- American Community Survey (ACS) 2007-2011 5-Year Estimates (used as a proxy for concepts not available in the short-form Decennial Census 2010)
- American Community Survey (ACS) 2014-2018 5-Year Estimates
Projections for over one hundred Census concepts were produced for 2018, 2020, 2022, 2024, and 2026 using a modified linear regression over four data points: 1990, 2000, 2010, and ACS 2014-2018 (referred to as 2016). The slope is the standard regression slope, a simple choice made mainly for efficiency and to prevent overfitting. The offset is not the standard regression offset: the ACS 2014-2018 value is always used instead because it is the last known good value in the series, so each projection extends from that value along the fitted slope.
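The modified regression described above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: it assumes the "standard" slope means the ordinary least-squares slope, and the names `ols_slope`, `project`, `HISTORICAL_YEARS`, `PROJECTION_YEARS`, and `ANCHOR_YEAR` are hypothetical. The input is one concept's harmonized, imputed values for a single geography, ordered to match the four historical years.

```python
from statistics import mean

# Regression inputs; the ACS 2014-2018 estimate is referred to as 2016.
HISTORICAL_YEARS = [1990, 2000, 2010, 2016]
PROJECTION_YEARS = [2018, 2020, 2022, 2024, 2026]
ANCHOR_YEAR = 2016  # year of the last known good value


def ols_slope(years, values):
    """Ordinary least-squares slope of values regressed on years (assumed method)."""
    x_bar = mean(years)
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    denominator = sum((x - x_bar) ** 2 for x in years)
    return numerator / denominator


def project(values, target_years=PROJECTION_YEARS):
    """Extend the series from the ACS 2014-2018 value along the fitted slope.

    The fitted intercept is discarded; the last observed value is the offset.
    """
    slope = ols_slope(HISTORICAL_YEARS, values)
    anchor = values[-1]  # ACS 2014-2018 value
    return {year: anchor + slope * (year - ANCHOR_YEAR) for year in target_years}


# Illustrative (made-up) values for 1990, 2000, 2010, and 2016.
example = project([1000, 1100, 1200, 1260])
```

Because the offset is the ACS 2014-2018 value rather than the fitted intercept, the projected line always passes through the most recent observation, even when earlier years pull the fitted line away from it.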