Why Geographies Change Shape

What do I need to know?

New Decennial Census 2020 Data

American Community Survey (ACS) 5-year Estimates 2016-2020

2020 Census Redistricting data are available in mySidewalk. Making them available involved updating the geographies that exist in mySidewalk. As a result, users will see many values change, including in existing Dashboards, Reports, and data requests.

There are a couple of reasons for that:

Why Geographies Change Shape

Every Decennial Census, a new set of blocks, block groups, and tracts is created. The boundaries of block groups and tracts are based on groupings of populations, and as our communities grow (or shrink), these shapes change too. Blocks are the smallest geography for which the US Census provides data. They are defined by features of the physical environment, such as roads, railways, and bodies of water, and they change when those features change. Block groups are the next most likely to change, as each one is a grouping of blocks. The US Census tries to keep the shape of tracts consistent over the decades, but tracts are often split or merged, and new ones are created, to accommodate changes in where people live.

Why does this matter? At mySidewalk we work hard to provide data for the same set of shapes, so that you can use data from many different data sources and see how an area has changed over time.

Our data team worked tirelessly through the summer and fall of 2021 to prepare these new shapes and harmonize data sources built for the DC 2010 blocks, block groups, and tracts to the new set from DC 2020. All of the data was built for 2010 tracts and block groups, but to make it easy for you to use, we "harmonized" all of it into the new DC 2020 shapes. We are exhausted, but the effort was well worth the investment. (See below for what we mean by harmonize.)

How Geography Changes Affect You

All of the changing blocks, block groups, and census tracts mean changes to the values in your existing dashboards, reports, and data tables. Importantly, you won't have to do anything for these changes to take effect. Unlike other data updates, a change to the shapes is likely to impact every data source in mySidewalk's Library. To understand why, let's talk about how mySidewalk uses US Census geographies.

Four major things are impacted by the 2020 shapes changing:

  1. The shapes themselves will be updated to the new boundaries.

  2. Data values for mySidewalk data will be updated to reflect the new boundaries (using math called apportionment).

  3. Data values for historical mySidewalk data will be updated to use the new shapes (we call this harmonization).

  4. Custom Boundaries will be updated using new apportionment (math) behind the scenes.

How mySidewalk uses Census Geographies

mySidewalk leverages the geographies published by the US Census in several important ways. First, we use these geographies as the single "source of truth" for the location and shape of all sixteen geography types in the United States. When, for example, the Bureau of Labor Statistics (BLS) releases data, actual geographic shapes are frequently not part of the provided data. These data sources do, however, almost always tell us which state, place, census tract, or block group the data belongs to. We call these identifiers "Geo IDs." Rather than having to store a set of geographies for every data source, we join different data sources to the Census geographic shapes using these Geo IDs. This has several benefits:

  1. This is a cornerstone of "harmonization," which allows you to look at all of these data sources together over time;

  2. We use less storage space, which means more room for other valuable data;

  3. Having a single, reputable source for geographies means more reliable data for you; and

  4. You don’t have to worry about shifting or changing shapes based on the time, place, and purpose you select.

At the same time, this also means that when the Census geographies change, so does everything else.
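The Geo ID join described above can be sketched in a few lines of Python. This is only an illustration, not mySidewalk's actual implementation: the table layout, the `join_to_shapes` helper, the placeholder geometries, and the data values are all made up for the example.

```python
# Hypothetical shape table keyed by Geo ID. Geometries are placeholders.
shapes = {
    "29095": {"name": "Jackson County, MO", "geometry": "<polygon>"},
    "20091": {"name": "Johnson County, KS", "geometry": "<polygon>"},
}

# Hypothetical data source that ships values with Geo IDs but no shapes.
# The rates below are invented for illustration.
rows = [
    {"geo_id": "29095", "unemployment_rate": 4.1},
    {"geo_id": "20091", "unemployment_rate": 2.9},
    {"geo_id": "99999", "unemployment_rate": 5.0},  # no matching shape
]


def join_to_shapes(rows, shapes):
    """Attach each data row to its shape by Geo ID; collect IDs with no match."""
    joined, unmatched = [], []
    for row in rows:
        shape = shapes.get(row["geo_id"])
        if shape is None:
            unmatched.append(row["geo_id"])
        else:
            joined.append({**shape, **row})
    return joined, unmatched


joined, unmatched = join_to_shapes(rows, shapes)
```

Because every data source points at the same shape table, swapping in a new set of shapes changes what every joined row refers to, which is why a geography update touches everything at once.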


Everything changes at this time because of the second way that mySidewalk leverages Census geographies: apportionment.

Without going too far in depth here (check out our Help Article on Geographies for more information), there are a few fundamental things we know to be true about Census geographies. Blocks fit inside block groups, which fit inside census tracts, which fit inside counties, states, and the nation. This "Russian nesting doll" relationship between geographies is exclusive, which means that each block fits inside one and only one block group. Each census tract can only belong to one county, and each county can only fall inside one state. There are no exceptions to these rules.
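One way to see the nesting is in the identifiers themselves. The sketch below assumes the standard Census GEOID layout, in which a 15-digit block ID is built from a 2-digit state code, a 3-digit county code, a 6-digit tract code, and a 4-digit block code whose first digit is the block group. The `parents_of_block` helper and the sample ID are illustrative, and mySidewalk's own Geo IDs may be stored differently.

```python
def parents_of_block(block_geoid: str) -> dict:
    """Split a 15-digit block GEOID into the IDs of the shapes that contain it."""
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError("expected a 15-digit block GEOID")
    return {
        "state": block_geoid[:2],          # 2 digits
        "county": block_geoid[:5],         # state + 3-digit county
        "tract": block_geoid[:11],         # county + 6-digit tract
        "block_group": block_geoid[:12],   # tract + first digit of the block code
        "block": block_geoid,
    }


# Illustrative ID only; not taken from the article.
ids = parents_of_block("290950001001000")
```

Each longer ID starts with the ID of its parent, so a block can only ever resolve to one block group, one tract, one county, and one state.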

However, other geography types like neighborhoods, zip codes, places, etc. do not follow this rule. For example, a single census tract may fall into multiple neighborhoods, or sit at the confluence of multiple zip codes. Many data sources simply do not report data values for every one of the sixteen mySidewalk geography types, so rather than providing no data for these important geographies, we do our best math to figure out what those values are likely to be. We do that using a few different processes: apportionment, aggregation, and/or georeferencing.

You can learn more about apportionment (which is the math behind using the numbers from the smallest shapes to understand the bigger shapes) in this help article.

Now, because very few data sources provide raw data for each of the sixteen different mySidewalk geography levels, mySidewalk apportions the missing pieces of the puzzle so you can have the data values for a geography that wasn’t originally published. When the shapes change, the math problem has to be re-done to get the correct values. Think of it this way: if you draw a circle on a map and either the size of the circle or the number of things within it changes, the values associated with the circle will change too. So even if the shape of a given neighborhood, zip code, congressional district, or other irregular, non-nesting shape has stayed the same from 2010 to 2020, the values associated with it could very well still change.

Kansas City, MO Neighborhoods Example

Note: This image shows the shape of Kansas City, MO neighborhoods (with the South Plaza neighborhood highlighted) in relation to the block groups within them. As you can see, some block groups are entirely within the neighborhood boundaries, while others are only partially covered. In order to report values for each neighborhood, we use apportionment!
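To make the idea concrete, here is a minimal sketch of apportionment for a count such as total population. It assumes each block group contributes a fixed share of its value to the neighborhood. How mySidewalk actually derives those shares is not described here, so the `apportion` helper, the block group names, and all numbers are hypothetical.

```python
def apportion(block_group_values, shares):
    """Estimate a count for a custom shape from block-group values.

    shares maps block-group ID -> fraction (0.0 to 1.0) of that block group
    assumed to fall inside the shape.
    """
    return sum(block_group_values[bg] * share for bg, share in shares.items())


# Made-up block-group populations.
population = {"bg_a": 1200, "bg_b": 800}

# bg_a sits entirely inside the neighborhood; bg_b is only partially covered.
shares_old_block_groups = {"bg_a": 1.0, "bg_b": 0.25}
estimate_old = apportion(population, shares_old_block_groups)

# If bg_b is redrawn so that half of it falls inside the same neighborhood,
# the estimate changes even though the neighborhood boundary did not move.
shares_new_block_groups = {"bg_a": 1.0, "bg_b": 0.5}
estimate_new = apportion(population, shares_new_block_groups)
```

With these numbers the estimate moves from 1,400 to 1,600 purely because the underlying block group changed shape.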

Here's where things get even more complicated. Not only does the 2020 Decennial Census update involve changing shapes, but it also involves changing Geo IDs! Our work involves accounting for these changes too. How might these Geo IDs change?

  1. The Geo ID refers to a new shape

    1. The original shape got bigger, or it got smaller. A different number and/or ratio of blocks will now fall within the geography than before.

  2. The Geo ID was reused for a shape in a different location

    1. This is rarer, but does happen. In this case, a shape from the 2010 Census had a Geo ID, but that shape changed to such an extent that the Geo ID is no longer used. That Geo ID may be recycled for a shape in 2020 that has 0% overlap and may even have replaced an exact 2010 shape that had a different name.
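The cases above could be told apart by comparing the 2010 and 2020 shapes that carry the same Geo ID. The sketch below is one hypothetical way to do that, not a description of mySidewalk's process: it assumes an overlap score (shared area divided by combined area) has already been computed for each ID, and the `classify_geo_id` helper, its labels, and its tolerance are illustrative choices.

```python
def classify_geo_id(overlap, tolerance=0.01):
    """Label what happened to a Geo ID between 2010 and 2020.

    overlap is the shared area divided by the combined area of the 2010 and
    2020 shapes carrying this Geo ID (1.0 = identical, 0.0 = no overlap),
    or None when the ID is not used in 2020.
    """
    if overlap is None:
        return "retired"
    if overlap <= tolerance:
        return "reused for a different location"
    if overlap >= 1.0 - tolerance:
        return "unchanged"
    return "reshaped"


# Made-up overlap scores for four hypothetical Geo IDs.
examples = {"id_1": 1.0, "id_2": 0.6, "id_3": 0.0, "id_4": None}
labels = {geo_id: classify_geo_id(score) for geo_id, score in examples.items()}
```

The point of the check is that a matching Geo ID alone does not guarantee a matching shape, so values cannot simply be carried over by ID.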

How is DC 2010 a part of this?

Importantly, these changes will not just take effect for the 2020 data. We also harmonize historical data to these new shapes. This means that when you look at, for example, total population for an area in 1990, you will be looking at the total population of the 2020 shape for that time. We use the same technique (remember, it’s called apportionment) to harmonize the historical data into the current shapes. This ensures a smooth, apples-to-apples comparison over time. After we have accounted for all of these changes and made the necessary adjustments in the system, we publish the data so you can use it.
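As a rough sketch of harmonization, suppose a crosswalk lists, for each 2010 tract, the share of its value that belongs to each 2020 tract. The crosswalk format, the tract names, the shares, and the 1990 population figures below are all assumptions made for illustration and are not taken from mySidewalk's data.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (2010 tract, 2020 tract, share of the 2010 value).
crosswalk = [
    ("tract_2010_A", "tract_2020_A1", 0.75),  # A was split into A1 and A2
    ("tract_2010_A", "tract_2020_A2", 0.25),
    ("tract_2010_B", "tract_2020_BC", 1.0),   # B and C were merged into BC
    ("tract_2010_C", "tract_2020_BC", 1.0),
]

# Made-up 1990 population, reported on the 2010 tract shapes.
population_1990 = {"tract_2010_A": 4000, "tract_2010_B": 1500, "tract_2010_C": 2500}


def harmonize(values, crosswalk):
    """Re-express values reported on old shapes as values on the new shapes."""
    harmonized = defaultdict(float)
    for old_id, new_id, share in crosswalk:
        harmonized[new_id] += values[old_id] * share
    return dict(harmonized)


population_1990_on_2020_shapes = harmonize(population_1990, crosswalk)
```

The shares for each 2010 tract add up to 1.0, so the total population is preserved; only its distribution across shapes changes.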

Finally, this will also affect “drawn” and custom geographies in mySidewalk. If you have a custom layer in your mySidewalk User Data Library, the data that mySidewalk apportions to that shape on the fly will now account for the new underlying Census geographies as well.

What is the bottom line? TL;DR

Because the geographies provided by the Census are changing, the values reported by mySidewalk are changing too. This will be true for every data source currently in the mySidewalk data library, and for any custom shapes to which you are currently apportioning data with mySidewalk.

What else do you need to know and do?

As noted earlier, you do not need to take any action for these changes to take effect. However, if you have questions about changes that have taken place in your community (or elsewhere), please reach out to your Customer Success Manager, who can pass your feedback on to the correct channels. Additionally, these changes are planned only for the mySidewalk data library. Any custom or user-uploaded layers that are currently utilized in your account will not see any changes. Layers uploaded to your account by mySidewalk staff will be updated as part of your regularly scheduled maintenance, typically around the renewal of your contract.

New Decennial Census 2020 Data

The US Census released a subset of DC 2020 data on August 12, 2021 for the purpose of redistricting. The tables included are:

* P1: Race

* P2: Hispanic or Latino, and Not Hispanic or Latino by Race

* P3: Race for Population 18 Years and Over

* P4: Hispanic or Latino, and Not Hispanic or Latino by Race for the Population 18 Years and Over

* P5: Group Quarters Population by Major Group Quarters Type

* H1: Occupancy Status

As of August 2021, the US Census has not yet announced when the full set of data tables from DC 2020 will be released.

The redistricting data subset is available in mySidewalk.

American Community Survey (ACS) 5-year Estimates 2016-2020

The US Census announced on February 7, 2022 that the 2016-2020 ACS 5-year estimates will be released on March 17, 2022.

The ACS 2016-2020 is built using the DC 2020 block groups and tracts.

All mySidewalk products will update with the ACS 2016-2020 data on March 24, 2022.
