Analysis of Census Bureau’s March 2022 Differential Privacy Demonstration Product: Implications for Data on Young Children

The U.S. Census Bureau is using a new method called differential privacy (DP) to help protect the confidentiality and privacy of respondents in the 2020 Census. This paper provides information on how the use of DP in the 2020 Census is likely to affect the accuracy of data for young children (the population ages 0 to 4). The study is based on analysis of the most recent DP Demonstration Product, released by the Census Bureau on March 16, 2022. This Demonstration Product supersedes earlier DP Demonstration Products and focuses on data for the 2020 Census Demographic and Housing Characteristics (DHC) file, which has most of the tables that were in Summary File 1 in the 2010 Census. The Demonstration Product has data for population and housing units, but this analysis only examines data from the population file.

This paper analyzes the error introduced by DP by comparing the data as reported in the 2010 Census Summary File to the same data after the application of DP. According to the Census Bureau, the demonstration file released in March 2022 has been optimized for major use cases of the DHC tables.

The analysis found little impact of DP on data about young children for large (highly aggregated) geographic units like states or large counties. However, the story is different for smaller geographic units. Many smaller areas have high levels of error in their data on young children after DP is applied. For example, the count of young children has an absolute percent error of 5 percent or more in about 27 percent of Unified School Districts after DP is applied. The data also show that 69 percent of Unified School Districts had absolute numeric errors of 5 or more young children after DP is applied.
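The two error measures used above can be illustrated with a short sketch. It assumes that absolute numeric error is the absolute difference between the DP count and the reported 2010 count, and that absolute percent error is that difference divided by the reported count. The district names and counts are hypothetical, and skipping areas with a reported count of zero is an assumption, not a rule taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AreaCount:
    name: str       # hypothetical area label
    reported: int   # count of children ages 0 to 4 as reported in 2010
    dp: int         # the same count after DP is applied


def abs_numeric_error(area: AreaCount) -> int:
    """Absolute difference between the DP count and the reported count."""
    return abs(area.dp - area.reported)


def abs_percent_error(area: AreaCount) -> Optional[float]:
    """Absolute difference as a percent of the reported count.

    Returns None when the reported count is zero (assumed handling).
    """
    if area.reported == 0:
        return None
    return 100.0 * abs(area.dp - area.reported) / area.reported


def share_at_or_above(values: List[Optional[float]], threshold: float) -> float:
    """Percent of areas whose error is at or above the threshold."""
    usable = [v for v in values if v is not None]
    if not usable:
        return 0.0
    return 100.0 * sum(v >= threshold for v in usable) / len(usable)


# Hypothetical school districts, not values from the Demonstration Product.
districts = [
    AreaCount("District A", reported=1000, dp=1010),
    AreaCount("District B", reported=40, dp=46),
    AreaCount("District C", reported=200, dp=196),
    AreaCount("District D", reported=12, dp=12),
]

numeric_errors = [abs_numeric_error(d) for d in districts]
percent_errors = [abs_percent_error(d) for d in districts]

print("Share with percent error of 5 or more:", share_at_or_above(percent_errors, 5))
print("Share with numeric error of 5 or more:", share_at_or_above(numeric_errors, 5))
```

In this made-up example, District A has the largest numeric error but a small percent error, while District B has a smaller numeric error and a much larger percent error. This is why the two measures can flag different sets of areas.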

Errors of the magnitude shown above could have important implications for federal and state funding received by schools and for educational planning. They might distort formula funding that is based on Census-derived data, and some schools will get less than they deserve.

Larger absolute percent errors are evident for Hispanic, Black, and Asian young children in Unified School Districts. The mean absolute percent error for Non-Hispanic White young children was 5 percent, compared to 27 percent for Hispanic young children, 34 percent for Black young children, and 42 percent for Asian young children. Differential accuracy among race and Hispanic Origin groups raises questions of data equity after DP is applied.
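As a rough illustration of the group comparison above, the sketch below averages absolute percent errors across districts separately for each group. It assumes the mean absolute percent error is a simple unweighted mean over districts with a nonzero reported count. The group labels and counts are hypothetical placeholders and do not reproduce the paper's figures.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (district, group, reported count, DP count); all values are hypothetical.
Row = Tuple[str, str, int, int]

rows: List[Row] = [
    ("District A", "group_x", 400, 410),
    ("District B", "group_x", 200, 195),
    ("District A", "group_y", 20, 25),
    ("District B", "group_y", 10, 8),
]


def mean_abs_percent_error_by_group(data: List[Row]) -> Dict[str, float]:
    """Unweighted mean of absolute percent errors for each group."""
    errors: Dict[str, List[float]] = defaultdict(list)
    for _district, group, reported, dp in data:
        if reported == 0:
            continue  # assumed handling: percent error is undefined here
        errors[group].append(100.0 * abs(dp - reported) / reported)
    return {group: mean(values) for group, values in errors.items()}


mape = mean_abs_percent_error_by_group(rows)
for group, value in mape.items():
    print(f"{group}: mean absolute percent error = {value:.1f}")
```

A gap between groups in this measure is the kind of differential accuracy the paragraph above describes.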

I also examined the errors for a single year of age, children age 4, and found that errors for single years of age are particularly large. I found that 57 percent of Unified School Districts had absolute percent errors of 5 percent or more for children age 4, and 66 percent had absolute numeric errors of 5 or more children age 4.

Analysis also shows that 39 percent of Places (cities, villages, and towns) had absolute percent errors of 5 percent or more for ages 0 to 4, and 46 percent of Places had absolute numeric errors of 5 or more young children.

After the injection of DP into the 2010 Census data included in the March 2022 Census Bureau Demonstration Product, there were over 163,000 blocks nationwide that had population ages 0 to 17 but no population ages 18 or over. This result has two important implications. First, blocks with children and no adults are highly implausible, and the large number of such blocks may undermine confidence in the overall Census results. Second, these implausible results are likely due to young children being separated from their parents in 2020 Census DHC processing with DP. This separation of children and parents in data processing is an ongoing concern for data on young children and the production of future tables for children. The issue is particularly important for the introduction of DP into the American Community Survey, which is a key source of child well-being measures (O’Hare 2022b). To understand the well-being of children, it is critical to understand the situation of a child’s parents or caretakers. Moreover, if the same separation of children from their caregivers occurs in the application of DP to the American Community Survey, it will eliminate child poverty data, which are based on household income. Child poverty data are the most important type of data on child well-being.
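The block-level check described above amounts to a simple filter: keep blocks where the DP count of people ages 0 to 17 is positive and the DP count of people ages 18 or over is zero. The sketch below shows that filter on hypothetical records. The field names and block identifiers are assumptions for illustration and are not the Demonstration Product's actual layout.

```python
from typing import Dict, List, Union

Block = Dict[str, Union[str, int]]

# Hypothetical block records after DP is applied.
blocks: List[Block] = [
    {"block_id": "hypothetical-001", "age_0_17": 3, "age_18_plus": 0},
    {"block_id": "hypothetical-002", "age_0_17": 5, "age_18_plus": 9},
    {"block_id": "hypothetical-003", "age_0_17": 0, "age_18_plus": 4},
    {"block_id": "hypothetical-004", "age_0_17": 1, "age_18_plus": 0},
]


def children_without_adults(records: List[Block]) -> List[Block]:
    """Blocks with population ages 0 to 17 but none ages 18 or over."""
    return [
        b for b in records
        if b["age_0_17"] > 0 and b["age_18_plus"] == 0
    ]


flagged = children_without_adults(blocks)
flagged_ids = [b["block_id"] for b in flagged]
print(f"{len(flagged)} of {len(blocks)} blocks have children but no adults:", flagged_ids)
```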

Based on the errors for the young child population under the privacy parameters used in the March 2022 DP Demonstration Product, and the lack of clarity about the privacy protection DP provides, I recommend the Census Bureau take steps to reduce the size of the errors injected into the 2020 Census DHC file.

This paper is meant to provide stakeholders and child advocates with some fundamental information about the level of error DP is likely to inject into the 2020 Census data for the population ages 0 to 4. There are two reasons for sharing this information with child advocates now. First, the 2020 Census results for some localities may include situations where the number of young children reported looks suspect. It is important to make sure child advocates are aware of the potential impact of DP so they can explain odd child statistics to local leaders.

There is a second reason for sharing this information with state and local child advocates. The U.S. Census Bureau is looking for feedback on the use of DP in the 2020 Census, particularly cases where census data are used to make decisions. The Census Bureau is asking data users to examine the DP Demonstration Product to see if the error injected by DP makes the data unfit for use. After reading this report, I hope you will convey your thoughts to the Census Bureau.

There is some latitude in how much error the Census Bureau will inject into the DHC files, so feedback from census data users is important. If many users feel the data on young children in the DP Demonstration Product are not accurate enough for some uses, there is a chance the Census Bureau could make the final data more accurate.

Stakeholders, child advocates, and data users should take advantage of this opportunity to communicate their thoughts to the Census Bureau before the Census Bureau’s Data Stewardship Advisory Committee makes a final decision on the privacy parameters ahead of the release of the DHC files in May of 2023. Comments on the implications of DP in the March 2022 Demonstration File are due Monday, May 16, 2022. Comments and responses can be sent to 2020DAS@census.gov.
