Census Projections in mySidewalk

Learn about the methodology mySidewalk uses to create projected values for select Census data indicators.

Written by Drew Stiehl
Updated over a week ago

The mySidewalk data library supports a rich collection of data, including US Census data spanning more than forty years. We used published Census data to produce projected values for 139 indicators. These projected values are available for all mySidewalk geographies and can be apportioned to your custom boundary.

Published Census Data

All projections are generated from data originating from the following six Census products:

  • Decennial Census 1990

  • Decennial Census 2000

  • Decennial Census 2010

  • American Community Survey (ACS) 2007-2011 5-Year Estimates (used as a proxy for concepts not available in the short-form Decennial Census 2010)

  • Decennial Census 2020

  • American Community Survey (ACS) 2017-2021 5-Year Estimates

Preparing the Data

Producing projections from historical Census data would not be possible without a few supporting procedures to prepare the data.

  1. Geographic harmonization was used to 'walk' the data from Decennial Census 1990 and Decennial Census 2000 block group boundaries to the Decennial Census 2010 block group boundaries. Then, data from Decennial Census 1990, 2000, and 2010 were 'walked' to 2020 block group boundaries. Each Decennial Census draws a new set of boundaries, so the historical data had to be harmonized to the current 2020 boundaries before values from different years could be compared.

  2. The data were then apportioned from block groups to the other mySidewalk boundaries using weighted block-to-block-group apportionment. In post-processing, the values for states and the nation were overwritten with the values taken directly from the respective Decennial Censuses 1990, 2000, and 2010.

  3. Prior to projecting data, we “imputed” missing values in the block group dataset by observing geographically proximate data. For example, if a block group from the 1990 Decennial Census was missing data for total households, we looked at all of the other block groups in that block group’s census tract and estimated the “missing” value based on those actually-reported values. For mean and median indicators, we did not impute missing values where more than one block group in the same census tract had missing values, or where the census tract contained three or fewer block groups.
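The preparation steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not mySidewalk's implementation: the function names `walk` and `impute_tract`, the shape of the crosswalk (source ID, target ID, weight), and the use of a simple average of the reported block groups as the estimate are all assumptions, since the article does not specify them.

```python
from statistics import mean


def walk(source_values, crosswalk):
    """Move values from source boundaries to target boundaries.

    crosswalk is a list of (source_id, target_id, weight) rows, where
    weight is the assumed share of the source unit that falls inside
    the target unit (hypothetical structure).
    """
    target_values = {}
    for source_id, target_id, weight in crosswalk:
        if source_id in source_values:
            contribution = source_values[source_id] * weight
            target_values[target_id] = target_values.get(target_id, 0.0) + contribution
    return target_values


def impute_tract(block_group_values, is_mean_or_median=False):
    """Fill missing (None) block group values within a single census tract.

    Assumption: the estimate is the average of the reported block groups.
    For mean and median indicators, nothing is imputed when more than one
    block group is missing or the tract has three or fewer block groups.
    """
    reported = [v for v in block_group_values.values() if v is not None]
    missing = [k for k, v in block_group_values.items() if v is None]
    if not missing or not reported:
        return dict(block_group_values)
    if is_mean_or_median and (len(missing) > 1 or len(block_group_values) <= 3):
        return dict(block_group_values)
    estimate = mean(reported)
    return {k: (estimate if v is None else v) for k, v in block_group_values.items()}


# Example: block group "A" is split 25% / 75% between new block groups "X" and "Y".
harmonized = walk({"A": 100, "B": 50}, [("A", "X", 0.25), ("A", "Y", 0.75), ("B", "Y", 1.0)])

# Example: one block group in a four-block-group tract is missing total households.
filled = impute_tract({"bg1": 10, "bg2": None, "bg3": 20, "bg4": 30})
```

The same weighted-sum idea applies whether values are walked between Census vintages or apportioned to other boundaries. Only the weights differ.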

Projection Methodology

  • Projections for over 100 Census concepts were produced for 2023, 2025, 2027, 2029, and 2031 using a modified linear regression over four time points: 1990, 2000, 2010, and ACS 2017-2021 (represented by its midpoint, 2019).

  • The regression slope is the standard one (for efficiency and to prevent overfitting), but the ACS 2017-2021 value is always used as the offset instead of the standard regression offset, because it is the last known good value in the series.
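As a rough illustration of the approach described above, the sketch below computes an ordinary least-squares slope over the four time points and then draws the projection line through the ACS 2017-2021 value at 2019. It assumes that "standard" slope means the ordinary least-squares slope and that "used as the offset" means the line is anchored at the 2019 point. The names `ols_slope` and `project` are hypothetical, and the input values are made up.

```python
HISTORY_YEARS = (1990, 2000, 2010, 2019)  # 2019 is the ACS 2017-2021 midpoint
TARGET_YEARS = (2023, 2025, 2027, 2029, 2031)


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


def project(values, target_years=TARGET_YEARS):
    """Project an indicator forward from its four historical values.

    values are ordered to match HISTORY_YEARS. The slope comes from the
    regression, but the line is anchored at the last observed value
    (assumed interpretation of using the ACS value as the offset).
    """
    slope = ols_slope(HISTORY_YEARS, values)
    anchor_year, anchor_value = HISTORY_YEARS[-1], values[-1]
    return {year: anchor_value + slope * (year - anchor_year) for year in target_years}


# Made-up indicator values for 1990, 2000, 2010, and 2019.
projections = project([100, 110, 120, 129])
for year, value in projections.items():
    print(year, round(value, 1))
```

Anchoring at the most recent value means the projected series continues directly from the latest published estimate, rather than from the fitted line, which may sit above or below that estimate.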
