
Researchers Need to Rethink and Justify How and Why Race, Ethnicity, and Ancestry Labels Are Used in Genetics and Genomics Research, Says New Report

News Release


By Megan Lowry

Last update March 14, 2023


WASHINGTON — Researchers and scientists who utilize genetic and genomic data should rethink and justify how and why they use race, ethnicity, and ancestry labels in their work, says a new National Academies of Sciences, Engineering, and Medicine report.

The report says researchers should not use race as a proxy for describing human genetic variation. Race is a social concept, but it is often used in genomics and genetics research as a surrogate for describing human genetic differences, which is misleading, inaccurate, and harmful. To improve genomics research, the report presents a new framework and decision tree to help researchers choose descriptors and labels that are most appropriate for their study. 

From the beginning of genetics and genomics research, researchers have used “population descriptors” as a shorthand for capturing the complex patterns of human genetic variation across the globe. For example, these descriptors can identify groups based on nationality, such as French; geography, such as North American; or ethnicity, such as Hispanic. But human genetic differences are distributed in complex ways that do not necessarily align with a single descriptor.

Many scientists employ typological thinking, reinforcing the incorrect view that humans can be classified into discrete, innate categories, the report says. These preconceptions have influenced the design and execution of genetic studies. 

In an effort to be inclusive, and in response to White House Office of Management and Budget directives, the National Institutes of Health and other federal agencies require the collection and reporting of racial and ethnic information using OMB categories in funding proposals and applications. However, the report says, OMB requirements may exacerbate typological thinking. 

Genomics data has become more accessible and widespread across biomedical research, and large-scale genomics studies in recent years have sought to include more diverse groups of people. In genomics research, race and ethnicity have a long history of being identified incorrectly as reasons for average differences among groups — and using such socially constructed descriptors in this research can reinforce the misconception that social inequities are caused by biological difference, says the report.

Almost all human traits are a result of the interplay between genetic and environmental factors. Instead of relying on population descriptors as proxies for describing the effects of environmental factors, researchers should incorporate environmental factors in their work, and use variables that capture more precise information, the report says. Genetics and genomics researchers should collaborate with experts in the social sciences, epidemiology, and other disciplines — as well as work in partnership with communities — to aid in these studies. 

The report recommends that genomics and genetics researchers tailor their use of population descriptors based on the type and purpose of their study, and explain why and how those descriptors were selected in their work. If appropriate, researchers should consider using multiple descriptors for each study participant to improve clarity. The report offers a decision tree to help researchers choose whether race, ethnicity or indigeneity, geography, genetic ancestry, or genetic similarity is most appropriate for their work. Genetic similarity will be the preferred population descriptor in most cases, though in some instances other population descriptors may be considered appropriate. In the case of studies investigating the effects of racism on health, for example, racial labels may be appropriate, the report says.

“Genomics and genetics research has transformed the way humans see our own history and helped define who we are, and over the past century has brought us to new frontiers in medicine and science that were once unimaginable for our species,” said Aravinda Chakravarti, Muriel G. and George W. Singer Professor of Neuroscience and Physiology, professor of medicine, and director of the Center for Human Genetics and Genomics at the New York University Grossman School of Medicine, and co-chair of the committee that wrote the report. “But if genomics and genetics are to become more inclusive and produce benefits for all, we must learn from the mistakes of the past and invest in a paradigm shift to correct them as we assess the role of genes in human traits and diseases.” 

“Classifying people by race is a practice entangled with and rooted in racism, and the pernicious effects of applying this classification to genetics and genomics research have undeniably caused harm over the last century,” said Charmaine D. Royal, committee co-chair and Robert O. Keohane Professor of African and African American Studies, Biology, Global Health, and Family Medicine and Community Health at Duke University. “The lack of consistency in the use of population descriptors also presents problems for the accuracy and applicability of genomics research. The new framework and processes our report recommends can help our field produce more trustworthy science.”  

Changing Research Practices

The report says researchers should:

  • Not use race as a proxy for human genetic variation. In particular, they should not assign genetic ancestry labels to individuals based on their race, regardless of whether the label was self-identified.  
  • Apply labels consistently to all participants. For example, if ethnicity is the most appropriate descriptor, all participants should be assigned an ethnicity label, rather than labeling some by race and others by ethnicity.  
  • Be attentive to the connotations and impacts of terminology they use to label groups. The report points to the term “Caucasian” as an example, explaining it should not be used under any circumstance because it was originally coined to convey the notion of white supremacy.  
  • Disclose the process by which they select and assign group labels. If researchers develop new labels for existing samples, researchers should provide a description of the differences between the new and old labels.  

Supporting Implementation and Accountability 

The report also recommends that key partners in genetics and genomics research — such as funding agencies, institutions, and scientific journals — should ensure their policies and procedures align with the committee’s recommendations. These partners should offer publicly available tools to facilitate implementation, such as training, grant and manuscript review guidelines, and educational modules. Research institutions and funding agencies should embed incentives for fostering interdisciplinary collaboration among researchers, including in social sciences, epidemiology, and community-based research. The report also recommends advisory bodies be established to monitor and facilitate implementation and suggest future actions. 

The study — undertaken by the Committee on the Use of Race, Ethnicity, and Ancestry as Population Descriptors in Genomics Research — was sponsored by the U.S. Department of Health and Human Services, National Institutes of Health: All of Us Research Program; National Cancer Institute; National Heart, Lung, and Blood Institute; National Human Genome Research Institute; National Institute of Child Health and Human Development; National Institute of Dental and Craniofacial Research; National Institute of Diabetes and Digestive and Kidney Diseases; National Institute of Environmental Health Sciences; National Institute of Nursing Research; National Institute on Aging; National Institute on Drug Abuse; National Institute on Minority Health and Health Disparities; NIH Office of Behavioral and Social Sciences Research; and NIH Office of Science Policy. 

The National Academies of Sciences, Engineering, and Medicine are private, nonprofit institutions that provide independent, objective analysis and advice to the nation to solve complex problems and inform public policy decisions related to science, engineering, and medicine. They operate under an 1863 congressional charter to the National Academy of Sciences, signed by President Lincoln. 

Contact:
Megan Lowry, Media Relations Manager
Office of News and Public Information
202-334-2138; e-mail news@nas.edu 
