Census projections in mySidewalk

mySidewalk provides a rich collection of U.S. Census data spanning over 40 years. By leveraging historical data, we have developed projections for many key indicators. These projections are available for all mySidewalk geographies and can be apportioned to custom boundaries.

Published Census data sources

Projections are provided for both Decennial Census and 5-Year American Community Survey (ACS) samples. The projections are based on data from the following sources:

Decennial
- Census 1990
- Census 2000
- Census 2010
- Census 2020 + ACS 2018-2022
  - Decennial Census 2020 was short-form only, so additional data is included from the overlapping ACS to provide coverage.

5-Year ACS
- ACS 2009-2014
- ACS 2015-2019
- ACS 2019-2023

How We Prepare the Data

Creating projections requires rigorous data preparation, which includes:

Geographic Harmonization

All historical data is harmonized to the current 2022 boundaries, ensuring consistency.

Apportionment

Decennial data is apportioned from block groups to mySidewalk boundaries using weighted block-to-block group apportionment.
- Note: ACS data is built directly from the geographic source tables with the exception of city council districts, neighborhoods, and MPOs, which are apportioned from block group values.
Values for states and the nation are directly sourced from Decennial Census data for accuracy.
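
To make the idea concrete, here is a minimal sketch of weighted apportionment. It is not mySidewalk's implementation. The function name, the input layout, and the use of a per-block weight (for example, block population) are assumptions made for illustration. Each block group value is split across its blocks in proportion to their weights, and the block shares are then summed into whichever target boundary each block falls in.

```python
from collections import defaultdict


def apportion(block_group_values, blocks):
    """Hypothetical helper: spread block group values to target boundaries.

    block_group_values: {block_group_id: value}
    blocks: iterable of (block_id, block_group_id, weight, target_id),
            where weight is an assumed per-block weight such as population.
    """
    # Total weight of all blocks inside each block group.
    weight_by_group = defaultdict(float)
    for _, group_id, weight, _ in blocks:
        weight_by_group[group_id] += weight

    # Give each block its weighted share, then sum shares by target boundary.
    totals = defaultdict(float)
    for _, group_id, weight, target_id in blocks:
        group_weight = weight_by_group[group_id]
        if group_weight == 0:
            continue  # no weight to distribute by; skipped in this sketch
        share = weight / group_weight
        totals[target_id] += block_group_values[group_id] * share
    return dict(totals)


# Invented example data: two block groups, two target boundaries.
example = apportion(
    {"A": 100.0, "B": 50.0},
    [
        ("a1", "A", 1.0, "X"),
        ("a2", "A", 3.0, "Y"),
        ("b1", "B", 2.0, "Y"),
    ],
)
```

In this invented example, boundary X receives one quarter of block group A (25), and boundary Y receives the remaining three quarters of A plus all of B (125).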

Missing Data Imputation

For Decennial Census data, missing values in the block group dataset are estimated using geographically proximate data within the same census tract.
Example: If a block group from the 1990 Census is missing total household data, the missing value is estimated based on surrounding block groups in the same tract.
Missing values are not imputed for mean or median indicators if there is insufficient data in the census tract.
Missing values are not imputed for ACS data values.
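
The rules above can be sketched roughly as follows. The exact estimator is not described here, so this example assumes the simplest possible one: a missing block group value is filled with the mean of the other block groups in the same tract, and it is left missing when the tract has no usable values. The function name and record layout are hypothetical.

```python
from statistics import mean


def impute_within_tract(records):
    """Hypothetical helper: fill missing block group values from the same tract.

    records: list of dicts with keys 'tract', 'block_group', and 'value',
             where 'value' is None when the value is missing.
    """
    # Collect the known values for each tract.
    known_by_tract = {}
    for record in records:
        if record["value"] is not None:
            known_by_tract.setdefault(record["tract"], []).append(record["value"])

    filled = []
    for record in records:
        value = record["value"]
        # Assumed estimator: mean of the tract's known block group values.
        # With no known values in the tract, the value stays missing.
        if value is None and record["tract"] in known_by_tract:
            value = mean(known_by_tract[record["tract"]])
        filled.append({**record, "value": value})
    return filled


# Invented example data.
households_1990 = [
    {"tract": "T1", "block_group": "T1-1", "value": 100},
    {"tract": "T1", "block_group": "T1-2", "value": 200},
    {"tract": "T1", "block_group": "T1-3", "value": None},
    {"tract": "T2", "block_group": "T2-1", "value": None},
]
imputed = impute_within_tract(households_1990)
```

Here the missing value in tract T1 is filled with 150, while the block group in tract T2 stays missing because the tract offers nothing to estimate from.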

Projection Methodology

ACS projections are calculated for 5 years into the future, and Decennial projections are calculated for 2040. Each projection uses a modified linear regression model. Here’s how:

Regression Slope: Linear regression fits a straight line through the historical data values that best describes the trend over time. The line is then extended forward to estimate the likely value in future years, such as 2029 and 2040, assuming the trend continues.
Baseline Value: The midpoint of the most recent ACS 5-year estimates is used as the starting point for projections.
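
As a rough illustration of how a slope and a baseline combine, the sketch below fits an ordinary least-squares slope to historical values and extends it from a baseline point. It is a sketch under stated assumptions, not the production model. The details of the modification to the regression are not given here, so anchoring the line at the baseline is an assumption, as is treating the midpoint as the middle year of the 5-year window (2021 for ACS 2019-2023). All numbers are invented.

```python
def ols_slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((x - mean_year) ** 2 for x in years)
    sxy = sum((x - mean_year) * (y - mean_value) for x, y in zip(years, values))
    return sxy / sxx


def project(years, values, baseline_year, baseline_value, target_year):
    """Hypothetical helper: extend the fitted trend from a baseline point."""
    slope = ols_slope(years, values)
    return baseline_value + slope * (target_year - baseline_year)


# Invented historical values for one indicator in one geography.
history_years = [2000, 2010, 2020]
history_values = [1000.0, 1200.0, 1400.0]

# Assumed baseline: a made-up value placed at the middle year of 2019-2023.
projected_2040 = project(history_years, history_values, 2021, 1430.0, 2040)
```

With these invented inputs the fitted slope is 20 per year, so the 2040 projection is 1430 + 20 × 19 = 1810.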

Key Terms Explained

Geographic Harmonization: Aligning historical data to match current geographic boundaries.

Apportionment: Distributing data from one geographic level to another (e.g., block groups to custom boundaries).

For details on geographic harmonization and custom boundaries, visit our article: Getting data into modern and custom geographic boundaries.