Assessing the 2020 Census: Final Report (2023)

Chapter: 1 Introduction

Suggested Citation: "1 Introduction." National Academies of Sciences, Engineering, and Medicine. 2023. Assessing the 2020 Census: Final Report. Washington, DC: The National Academies Press. doi: 10.17226/27150.

– 1 –

Introduction

1.1 THE CENSUS AS ESSENTIAL CIVIC CEREMONY

In taking stock of the 2020 Census of the United States, it is instructive to begin by considering the stakes involved in the decennial exercise and how fundamental it is to the function of the nation. The survival of a democracy necessarily requires that successive generations of the population embrace its institutions, procedures, and practices. Since 1790, the U.S. census has been such an institution. Sometimes described as the nation’s family photograph, the U.S. census is the federal government’s largest undertaking during peacetime. It is a recurring civic ceremony in which everyone counts, and it reaffirms a commitment to equality among all.

The census is a cornerstone of American democracy because political representation is explicitly tied to population counts. An “actual Enumeration” is mandated in Article I, Section 2, of the U.S. Constitution, as amended by the 14th Amendment. The constitutional purpose of the census is apportioning the U.S. House of Representatives among the states, but the extent to which the decennial census reshapes the contours of political representation and power goes far beyond apportionment. After each census, state and local governments create legislative districts, from U.S. Congress down to such entities as school districts, in the process of redistricting. Most states rely on decennial census data to conduct redistricting, though the use of those data is explicitly required by state constitution or statute in only a minority of states (National Conference of State Legislatures, 2021). Congress enacted Public Law 94-171 in 1975 to ensure that the U.S. Census Bureau provides relevant census data to the states for redistricting purposes, and legal requirements for equal-population legislative districts currently support the expectation that census data on the voting-age population by race and ethnicity will be available for all 5.8 million inhabited census blocks in the nation.

The decennial census also serves as a vehicle for fairness in other contexts, such as the distribution of funds for federal programs. Although the annual amount varies with appropriations, in Fiscal Year (FY) 2017, 316 federal spending programs relied on 2010 Census-derived data to distribute $1.504 trillion, or about 7.8 percent of the Gross Domestic Product (Reamer, 2020). An updated analysis for FY 2021 concluded that 353 federal spending programs allocated $2.8 trillion based on decennial census and American Community Survey data (Villa Ross, 2023). These funds were distributed to various government entities, and the data inform the eligibility standards that affect the provision of benefits and the definitions of regulatory action throughout the nation. The more accurate the census count, the more equitable the distribution will be. Undercounted geographic areas or groups will lose their share of funding while overcounted areas or groups will unduly benefit.

Although the number of data items collected in the 2020 Census is limited (age, sex, race/ethnicity, relationship to householder, housing tenure), the uses of those data are myriad,1 and the decennial census is the largest and most vital undertaking of the federal statistical system. Census data are highly valuable beyond these governmental purposes because the census is the only public, geographically comprehensive database. Census data underlie government and business planning, provide the denominators for health and other vital statistics, serve as the evidence base for assessing claims of discrimination and deprivation of civil rights, and fuel countless other private and public uses. Small geographic areas, such as small towns, lack the resources to create comparable data, so they are particularly reliant on census data; the number of these small jurisdictions is considerable2 and the census has long produced small-area data even in the precomputer age.3 So, too, do small populations by race, ethnicity, and age depend on census data to monitor their sizes and distributions, as is essential for a nation striving to curb inequity.4

___________________

1 See Appendix F.3 for further examples of decennial census data. The 2000 Census was the last to collect additional socioeconomic data on the population in the “long-form” sample; beginning in 2005, such items as income, education, veterans’ status, and many others are collected from large population samples in the continuous American Community Survey.

2 For example, 50% of incorporated places had no more than about 1,125 people in 2020; the corresponding figures for minor civil divisions of counties and American Indian reservations were about 825 people and 550 people, respectively (Census Geographies Project, 2022:21, 24, 27). As of 2020, the United States had 3,143 counties and equivalents, 19,483 incorporated places, 17,780 minor civil divisions, and 337 American Indian tribal governments, plus large numbers of statistical areas, including census-designated places, census county divisions, Native American areas, census tracts, block groups, and blocks (Census Geographies Project, 2022:Table 2-1A).

3 Per Census Geographies Project (2022:Ch. 3), census publications date back to 1790 for counties and minor civil divisions, to 1880 for incorporated places (and even farther back for some large cities), to 1910 for census tracts in the cities that pioneered these neighborhood areas, to 1930 for enumeration districts (the equivalent of today’s block groups), and, for blocks in large cities, to 1940 (housing characteristics) and 1960 (population and housing characteristics).

The multiple functions and high quality standards required of the census count justify the congressional willingness to appropriate adequate funds for this vast undertaking: approximately $13.7 billion for the 2020 Census (U.S. Government Accountability Office, 2023). In return for their cooperation in this decennial civic ceremony, the people of the United States receive free public data that are presumably of high quality.

The census is an essential civic ceremony, but it has never been an easy one. Each decade brings a larger and more complex population to cover, new technologies and methods to adapt, and new difficulties with which to contend. On that score, the 2020 Census was exceptional, posing an unprecedented combination of challenges and operational decision points to census takers, not least of which was a lengthy shutdown in operations necessitated by the onset of the COVID-19 pandemic at exactly the time that 2020 Census operations were ramping up. In the truest sense of the word, the 2020 Census was extraordinary—so much so that the mere fact that data collection was completed at all remains a signature and very laudable achievement. Against this backdrop, this second and final report of the Panel to Evaluate the Quality of the 2020 Census delves into the still-open question of whether the extraordinary enumeration produced results of the quality commensurate with the task.

1.2 NATURE AND LIMITATIONS OF THIS STUDY

1.2.1 Panel Charge and Data Analysis Subgroup Structure

The Census Bureau sponsored this study with a very general but expansive statement of task:

The National Academies of Sciences, Engineering, and Medicine will appoint an ad hoc panel to review and evaluate the quality of the data that were collected in the 2020 Census. As part of its work, the panel will:

  1. Review information from the Census Bureau on the data collected as well as various process measures and indicators of data quality obtained as part of the 2020 Census operations;
  2. Review other available information, such as results from demographic analysis, process measures and preliminary results from the post-enumeration survey; and analyses of administrative records; and
  3. Consider the results from evaluations of similar indicators from the 2010 and 2000 Censuses.

___________________

4 Executive Order No. 13985 (2021) created new requirements related to data equity in federal programs. The order also established an Equitable Data Working Group (2022), which issued a vision and recommendations on April 22, 2022, one of which was to “make disaggregated data the norm while protecting privacy.”

The panel will produce an interim report with its initial findings and conclusions, and a final report that includes conclusions about the quality of the data collected in the 2020 Census and makes recommendations for further research by the Census Bureau to evaluate the quality of the 2020 data and to begin planning the 2030 Census. The panel’s reports will be reviewed according to institutional review procedures and released publicly on the National Academies Press web site.

The distinguishing characteristic of this panel, since the earliest proposal discussions with the Census Bureau, is its designation of a dedicated data analysis subgroup from within our membership. The members of this subgroup and panel staff were sworn in and conducted their work behind the Census Bureau’s information technology (IT) firewall, via the Census Bureau’s virtual desktop interface (VDI). In furtherance of its charge, the full panel met 10 times in plenary between July 2021 and May 2023—under the circumstances, in principally or exclusively virtual format—with the series of open meetings in 2021 dedicated to receiving overview presentations from Census Bureau staff related to the processes used for the 2020 Census. The data analysis subgroup met much more frequently—typically twice a week over most of its span—to conduct its work, and the panel’s meetings in 2022 were structured to permit the subgroup to brief the rest of the membership about their work while keeping sensitive information behind the IT firewall. Full-panel members needed to be duly sworn or IT-cleared to participate in these discussions, which were conducted via the Census Bureau’s VDI.5

We issued our interim report in April 2022 (National Academies of Sciences, Engineering, and Medicine, 2022), delayed from its original timeline because of the time needed to establish the data-access arrangements. Because those arrangements were just being secured and there was not an existing base of data for the subgroup to work with, our interim report could not reflect original data analysis. Still, the interim report laid down important markers for this report, and this work extends from the framework developed in the first report.

1.2.2 Defining Census Quality

A few broad comments are in order, to set the stage for what is and what is not included in this final report, and to calibrate expectations accordingly. The first two of these comments stem directly from the precise wording of our charge, which obliges us to “review and evaluate the quality of the data that were collected in the 2020 Census.” This language requires us to be clear about what we mean by the quality of a decennial census and its data.

___________________

5 The panel’s final meeting in May 2023, to finalize concepts and language for this report, involved only data and analyses that had been cleared for release/panel use by the Census Bureau’s Disclosure Review Board.

The concepts of data “quality” and “error” for a census were a major focus of our interim report, and we repeat the basic precepts from that report because they remain immediately relevant here. We held in the interim report that perfection—the absence of all error—is an unattainable and unrealistic standard for any census. We observed that the assessment of the quality of a census depends on the filters and criteria applied in the review, the relative weight put on uses of census data, and the fitness of the collected data to meet those uses.

We noted in the interim report that there is no simple or unique scorecard for determining census quality, and this basic truth has been continually reinforced as this study has progressed. In our work, we have also continually returned to the frameworks for fundamental responsibilities and functions of statistical agencies, as embodied in such sources as our parent Committee on National Statistics’ Principles and Practices for a Federal Statistical Agency (National Academies of Sciences, Engineering, and Medicine, 2021) and written into law as the Evidence-Based Policymaking Act of 2018 (44 U.S.C. § 3563(a)(1)). In particular, we resonate with the framework for data quality offered by the Federal Committee on Statistical Methodology (2020:2) that defines data quality as “the degree to which data capture the desired information using appropriate methodology in a manner that sustains public trust” and that decomposes data quality into 11 component dimensions, as listed in Box 1.1. It is useful to think of census quality as a composite (a calibration, if not a complete balance) of these multiple dimensions and, relatedly, to be cautious of the harmful effects of unduly weighting any one dimension above all others.

1.2.3 The Enduring 2020 Census

The second broad comment directly stemming from our charge to “review and evaluate the quality of the data that were collected in the 2020 Census” is that it is necessarily awkward—if not impossible—to render a complete evaluation when the circumstances and timelines of the 2020 Census are such that its detailed data products were not yet released or were still in development as this report was drafted. The major tranche of data that comprises the P.L. 94-171 redistricting files goes into deep geographic detail but utilizes only a crude age categorization (to reflect voting age) and combinations of high-level race categories. More detailed, publicly released tabulations of the 2020 Census by age, race, ethnicity, household relationship, tenure, or combinations thereof, awaited release of the Demographic and Housing Characteristics (DHC) File on May 25, 2023—as this report was in final stages of drafting and simply too late to work with other than to acknowledge the release. Of necessity, then, our work has focused on process and methods, drawing inference about the quality of 2020 Census data based on how they were collected along with other very high-level measures of quality, but with very little tangible reference to the actual census data products themselves.

A related, further complication also demands mention: the delays in issuing 2020 Census data products have produced corresponding delays in the generation of the Census Bureau’s internal evaluations and assessments of census operations. This virtually ensures that some discrepancies will arise between what we say here (and what we observed in the operational data) and the results that appear in the Census Bureau’s finished data releases and evaluation reports; this is simply an inevitable result of analyses proceeding in parallel on still-evolving source files.

1.2.4 Limits on Analysis and Publication

Our interim report ended with some important commentary regarding Conclusion 4.6 in that report (National Academies of Sciences, Engineering, and Medicine, 2022:60):

It will not be possible for this panel (or any other evaluator) to understand and characterize the quality of the 2020 Census unless the Census Bureau is forthcoming with informative data quality metrics, including new measures based on operational/process paradata, at substate levels and small-domain spatiotemporal resolution, unperturbed by noise infusion.

To properly give credit where it is due, the Census Bureau honored its commitment to provide us, through the data analysis subgroup, with access to the information that we requested. Mindful of the Census Bureau’s ongoing 2020 Census processing and production, we tried to keep our data requests reasonable so as to be minimally disruptive to the Census Bureau’s own work.

Compromise was needed on some data requests, but it is very important to note that none of our data requests were flatly denied, and the Census Bureau was very generous with its staff and time resources to fulfill our data requests and engage on technical points with our subgroup members. Relative to a scenario in which we as external evaluators would have been completely dependent on finished, fully approved Census Bureau products and reports, our situation was a good one indeed.

That said, there remain major issues and complications about what we can actually publish and report from our analyses. Any representation of nonpublic, behind-the-firewall data must be cleared by the Census Bureau’s Disclosure Review Board procedures, including not just our analyses but also the work of the Census Bureau’s own research staff in evaluation and assessment reports. Minimally, those procedures include stringent rounding rules for the presentation of numerical data; they may also involve the injection of statistical noise into the results, as is being done for the 2020 Census data products. We discuss the Census Bureau’s approach to protecting the confidentiality of publicly released census data (what it terms its Disclosure Avoidance System [DAS]) at length in this report, particularly in Chapter 11. The Census Bureau’s decision to place formally private algorithms that infuse noise in every data cell at the center of the 2020 DAS makes the DAS a hallmark issue for the quality of the 2020 Census outputs.
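As a rough illustration of the two kinds of protection mentioned above (rounding and noise injection), the following minimal sketch perturbs a count and then rounds it. It is not the Census Bureau's Disclosure Review Board procedure or its Disclosure Avoidance System: the function names, the rounding base of 50, the noise scale of 10, and the choice of Laplace noise are all assumptions made only for illustration.

```python
import random


def round_to_base(value, base=50):
    """Round a count to the nearest multiple of `base` (hypothetical rule)."""
    return int(base * round(value / base))


def laplace_noise(scale, rng):
    """Draw zero-centered Laplace noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def protect(count, base=50, scale=10.0, rng=None):
    """Add noise to a count, round it, and floor the result at zero.

    All parameter defaults are illustrative assumptions, not published values.
    """
    rng = rng or random.Random()
    noisy = count + laplace_noise(scale, rng)
    return max(0, round_to_base(noisy, base))


if __name__ == "__main__":
    rng = random.Random(2020)  # fixed seed so the sketch is repeatable
    for true_count in (12, 480, 1234):
        print(true_count, protect(true_count, rng=rng))
```

Even in this toy form, the sketch shows why heavily protected small-area figures can mislead: for a small count, the rounding step and the noise are each comparable in size to the count itself.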

At the outset, it is important to note that the Census Bureau’s stringency in trying to avoid any potential “leakage” of confidential information (however slight or indirect) necessarily creates awkwardness for an independent and external review such as ours, and it is necessarily counter to a position of fully public reproducible research. Though we have not seen a compelling argument that process indicators (e.g., census tract rates of internet or paper enumerations) or paradata (e.g., rates of different types of edits) are materially disclosive of sensitive personal information, the Census Bureau has been adamant for the 2020 Census that aggressive disclosure-avoidance techniques must be applied to such indicators. Consequently, our analyses were inherently exploratory in nature and favored detection of broad general patterns and trends; by nature, our analyses find correlation rather than causal influences. We examined subnational results when possible but bore in mind that detailed audits of operations in individual census tracts or even counties would be unlikely to be cleared by the Census Bureau for publication or would be misleading given the extent of noise injection likely required for release. We also limited our request to graphic summaries when possible, seeking clearance of just the graphic and not the data used to produce it. The analyses and results in this report are those for which we sought and obtained clearance through the Disclosure Review Board process.

We should also be clear that some limitations on what we write and analyze in this report are self-imposed. We never sought, and indeed did not want, to directly query the full sets of processed returns (e.g., the Census Unedited File or the Census Edited File) from the 2020 Census, but rather relied on custom tabulations on targeted topics. The Census Bureau’s editing and unduplication routines are highly sensitive and rightfully safeguarded, and we sought to understand the end effects of those routines without probing their mechanics in depth. Electronic data capture and field operational control systems in the 2020 Census undoubtedly created many avenues for inquiry (everything from audit trails of enumerator routes to resolution of “alerts” generated in the field systems to inferences from keystroke sequences and completion times in the internet questionnaire) that we did not have the wherewithal to pursue. Another reason for avoiding some of these paths is that they might provide information about the efficiency of 2020 Census processes but not necessarily the quality of 2020 Census data, which is our ultimate objective. Most notably, we neither sought nor attempted to characterize differences between publicly released 2020 Census data (i.e., the P.L. 94-171 redistricting data file) and the “true” values before application of differential privacy-based disclosure avoidance.

In the end, we have conducted as full an assessment of the 2020 Census as our limited time and resources and the parameters of the disclosure review process permitted. As presaged in the interim report, we are obliged to ask the reader to “trust us” more than would ideally be the case—to trust that we have done due diligence in the analysis, even if we cannot provide fuller results.

1.3 OVERVIEW OF THIS REPORT

We sought to make the individual chapters of this report as self-contained as possible without unduly repeating material, given content overlap between some chapters and methodological overlap in others. For convenience, our conclusions and recommendations (along with those from the interim report) are reprinted in Appendix A, and Appendix B includes an updated glossary and list of acronyms.

We begin in Chapter 2 with a brief synopsis of the design and conduct of the 2020 Census, including the array of challenges faced in the count; additional detail on 2020 Census procedures may be found in Appendix C. We then open discussion of the quality of the collected 2020 Census data with Chapter 3’s examination of a phenomenon apparent from early quality measure releases: “age heaping,” or an unusual extent of reported ages in round numbers ending in 0 or 5. In Chapter 4, we examine what is known from publicly released, traditional assessments of census quality: the measures of census coverage derived from comparison with the Post-Enumeration Survey and independent Demographic Analysis (additional technical detail arising from this discussion is provided in Appendix D).
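One common way to quantify age heaping is Whipple's index, a standard demographic measure that compares the number of reported ages ending in 0 or 5 within ages 23 to 62 against one-fifth of the total population in that range. The sketch below is offered only as an illustration of the concept; it is not drawn from the panel's analysis, and the function name and the age counts are hypothetical.

```python
def whipple_index(counts_by_age):
    """Whipple's index over ages 23-62.

    A value of 100 indicates no preference for ages ending in 0 or 5;
    500 would mean every reported age in the range ends in 0 or 5.
    `counts_by_age` maps single years of age to population counts.
    """
    ages = range(23, 63)
    total = sum(counts_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no population reported in ages 23-62")
    heaped = sum(counts_by_age.get(age, 0) for age in ages if age % 5 == 0)
    return 100.0 * heaped / (total / 5.0)


if __name__ == "__main__":
    # Hypothetical data: a flat age distribution, then one with extra
    # reports at ages 25, 30, ..., 60 to mimic heaping.
    flat = {age: 1000 for age in range(100)}
    heaped = dict(flat)
    for age in range(25, 61, 5):
        heaped[age] += 500

    print(whipple_index(flat))
    print(whipple_index(heaped))
```

The flat distribution yields exactly 100, and the heaped one yields a value above 100; the further above 100, the stronger the apparent preference for round ages.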

Following this overview, in the next five chapters we explore major topics in 2020 Census operations, recapping what was planned to happen in the 2020 Census and what actually occurred due to the unique conditions in 2020, and presenting results from our analysis of operational data. In turn, these chapters cover:

  • Development of the Census Bureau’s Master Address File (MAF), which defines the operational frame for the count (Chapter 5);
  • Provision for self-response to the census, including via the internet as the primary channel (Chapter 6);
  • Conduct of the resource-intensive Nonresponse Followup (NRFU) operation to try to complete the count (Chapter 7);
  • New-for-2020 use of administrative records-type data to enumerate households for which reliable external data were available and that were not successfully reached in NRFU (Chapter 8); and
  • Counting of the small but very important population residing in nonhousehold or group quarters locations, such as college and university student housing and health care facilities (Chapter 9; see also the detailed code list used in 2020, reprinted for reference in Appendix E).

The Census Bureau made important changes—with vital ramifications for understanding census data quality—in two other core aspects of the 2020 Census, and these form the basis for the final two subject chapters. Chapter 10 describes the measurement of race and ethnicity (Hispanic origin) in the 2020 Census and the coding of those data, relative to past practice, while Chapter 11 focuses on the Census Bureau’s extremely consequential decision to supplant its existing confidentiality-protection routines in the late stages of 2020 planning, replacing them with an untested and undeveloped Disclosure Avoidance System that affects every 2020 Census data product. (Additional material related to the disclosure avoidance topic is included in Appendix F.)

Our formal recommendations generally speak to the nascent 2030 Census planning process and are peppered throughout the main subject chapters, numbered in the sequence in which they appear in the text. With Chapter 12, we close the report by weaving all of the individual recommendations into a structured narrative, learning from the 2020 Census experience to shape the 2030 Census (and its successors) and identifying key objectives and priorities for 2030 Census testing and development.
