The mySidewalk data library supports a rich collection of US Census products that span over thirty years. Drawing on Census data as far back as 1990, mySidewalk produced future projections for over one hundred Census concepts at all 16 geography levels available in our application.
Preparing the Data
Producing projections from historical Census data requires a few supporting procedures to prepare the data. We used geographic harmonization to align data from past Census products with Census 2010 geographic boundaries. We also used the Amelia II software by Gary King et al. at Harvard University (see citation below) to impute missing values in the data.
All projections are generated from data originating from the following Census products:
- Census 1990
- Census 2000
- Census 2010
- ACS 2007-2011 5 Year Estimates (as a proxy for concepts not collected in the short-form-only Census 2010)
- ACS 2012-2016 5 Year Estimates
Projections for over one hundred Census concepts were produced for 2016, 2018, and 2020 using a modified linear regression over four time points: 1990, 2000, 2010, and ACS 2012-2016 (referred to as 2014). The slope is the standard regression slope (kept standard mainly for efficiency and to prevent overfitting). In place of the standard regression offset, the ACS 2012-2016 value is always used as the offset, because it is the last known good value in the series.
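The modified regression described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not mySidewalk's production code: it assumes an ordinary least squares slope with equal weight on each time point, and it assumes the ACS 2012-2016 value anchors the line at the year 2014. The function names `ols_slope` and `project_from_anchor` and the sample values are hypothetical.

```python
def ols_slope(years, values):
    """Ordinary least squares slope of values regressed on years."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(values) / n
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    return sxy / sxx


def project_from_anchor(history, anchor_year, target_years):
    """Project values using the OLS slope, anchored at the last known value.

    history: dict mapping year -> observed value (must include anchor_year).
    The fitted intercept is discarded; the line is forced through
    (anchor_year, history[anchor_year]) instead (assumed interpretation).
    """
    years = sorted(history)
    values = [history[y] for y in years]
    slope = ols_slope(years, values)
    anchor_value = history[anchor_year]
    return {t: anchor_value + slope * (t - anchor_year) for t in target_years}


# Made-up values for one concept in one geography (not real Census data).
history = {1990: 100.0, 2000: 120.0, 2010: 140.0, 2014: 150.0}
projections = project_from_anchor(history, anchor_year=2014,
                                  target_years=[2016, 2018, 2020])
```

Forcing the line through the most recent value means a projection never starts from a level that disagrees with the latest observed data, even when earlier years pull the fitted intercept away from it.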