Suggested Citation: "4 Concordance with Human Data." National Academies of Sciences, Engineering, and Medicine. 2023. Building Confidence in New Evidence Streams for Human Health Risk Assessment: Lessons Learned from Laboratory Mammalian Toxicity Tests. Washington, DC: The National Academies Press. doi: 10.17226/26906.

4
Concordance with Human Data

This chapter presents the results of the literature review, the virtual workshops, the literature presented to the committee in information-gathering sessions, and committee deliberations with respect to concordance between laboratory mammalian models and humans (i.e., relevance). In addition, it addresses how this might frame expectations of new approach methods (NAMs) when they cannot be compared directly with human studies.

Specifically, this chapter addresses the following charge question to the committee:

What do the literature review and workshops indicate about concordance between laboratory mammalian models and humans in the adverse effects following chemical exposure and how might this frame expectations of NAMs when they cannot be compared directly with human studies?

In addressing this charge question, the committee considered different levels of concordance (i.e., cellular, tissue, organ, organism) as well as toxicokinetics. The following sections address these considerations regarding the levels of concordance and toxicokinetics, describe the insights from the workshops and literature review, and provide the committee’s findings and recommendations concerning concordance.

LEVELS OF CONCORDANCE

In the first chapter, the committee defined concordance of adverse health effects as a similarity in responses to chemical exposures across different animal species. Adverse health effects include a range of responses at different levels ranging from biochemical changes to observable endpoints such as clinical pathology. When considered more generally, concordance can be evaluated at multiple levels, which include but are not limited to toxicokinetics, biological mechanisms, and apical endpoints. Varying levels of concordance can be observed between laboratory mammal and human responses to a chemical exposure. This can be due to inherent biological diversity, such that the magnitude of the response and the time to response could be different. It can also be due to the granularity at which the concordance is evaluated (e.g., cellular, tissue, organ, or organism level effects). For example, a human carcinogen may also cause cancer in experimental animals, but tumors may be of a different type or occur at a different site or via a different mechanism. Concordance can also be influenced by methodological differences in studies being compared including exposure conditions (e.g., timing and duration of exposure, types of outcomes evaluated, and toxicokinetic differences) and tissue sensitivity (e.g., toxicodynamic differences). For these reasons, the presentation of responses to chemical exposures can differ between experimental animals and humans and still be concordant.

TOXICOKINETICS

It is generally recognized that the toxicokinetics of a chemical can differ between experimental animals and humans. The most informative assessments of concordance between animals and humans are performed using an internal dose metric (e.g., plasma area under the curve or Cmax). In conventional risk assessments, the differences in animal and human toxicokinetics are typically accounted for by applying adjustment factors, although toxicokinetic models, when available, are also used to derive internal doses in both humans and animals. In addition, the pharmaceutical industry has paid great attention to the concordance between absorption and bioavailability of a drug in laboratory animals and humans to increase the success rate for candidate drug molecules. Differences in the percent of oral drug bioavailability between most laboratory test species (e.g., mouse, rat, dog, and monkey) and humans have been attributed to interspecies variation in first pass gut and liver metabolism. Several studies have shown that rats and humans have similar rates of oral drug passive permeability and fraction absorbed in the intestine (Chiou and Barve, 1998; Chiou et al., 2000; Chiou and Buehler, 2002); a correlation of 0.55 was reported for the percent of oral drug bioavailability between rats and humans (Musther et al., 2014). Limited correlation for oral drug bioavailability highlights that animal bioavailability may not be quantitatively predictive of bioavailability in humans, although qualitative (high/low bioavailability) indications might be possible (Musther et al., 2014). Multiple groups have investigated the concordance for dermal absorption, penetration, and bioavailability between animals and humans. Qualitative measures of skin penetration, such as high versus low penetration or fast versus slow penetration, have been shown to be similar between humans and animals; however, quantitative differences in skin penetration exist between humans and rodents. Anatomical differences between rodents and humans may also contribute to a difference in concordance for inhaled toxicants.
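
As an illustration of the internal dose metrics mentioned above (plasma area under the curve and Cmax), the following is a minimal Python sketch that computes both from a concentration-time profile using the linear trapezoidal rule. The function name, the sampling times, and the concentrations are hypothetical and chosen only for illustration; they are not taken from the report or from any study it cites.

```python
from typing import Sequence, Tuple


def internal_dose_metrics(
    times_h: Sequence[float], conc_mg_per_l: Sequence[float]
) -> Tuple[float, float]:
    """Return (AUC, Cmax) for a plasma concentration-time profile.

    AUC is estimated with the linear trapezoidal rule over the sampled
    interval only (no extrapolation to infinity). Cmax is the highest
    observed concentration.
    """
    if len(times_h) != len(conc_mg_per_l) or len(times_h) < 2:
        raise ValueError("need two or more paired time/concentration samples")

    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        auc += 0.5 * (conc_mg_per_l[i] + conc_mg_per_l[i - 1]) * dt
    return auc, max(conc_mg_per_l)


# Hypothetical profile (hours, mg/L); values are illustrative assumptions.
example_times = [0.0, 1.0, 2.0, 4.0]
example_conc = [0.0, 4.0, 2.0, 0.0]
auc, cmax = internal_dose_metrics(example_times, example_conc)
print(f"AUC = {auc} mg*h/L, Cmax = {cmax} mg/L")
```

Computing the same metric in the same way for an animal study and a human study would give a common internal-dose scale on which responses could be compared, subject to the sampling design of each study.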

The degree of similarity in xenobiotic metabolism between animals and humans has been studied in multiple ways, including by (1) qualitative comparisons of key enzymes, (2) semi-quantitative comparisons that classify by rate of metabolism (slow versus fast metabolism), extent of metabolism (limited versus extensive metabolism), or metabolite formation (minor versus major metabolites), and (3) quantitative comparisons of the absolute values for rate of metabolism, extent of metabolism, or metabolite abundance. It is worth noting that these types of analysis are very sensitive to selection bias; if one is familiar with the literature, it is relatively easy to (consciously or subconsciously) select a set of chemicals with relatively high or low concordance.

There is high concordance between animals and humans for metabolism of some but not other chemicals; this is likely attributable to the sometimes similar and sometimes different metabolizing enzymes and cellular transporters present in animals and humans. The presence or absence of a specific isoform of a metabolizing enzyme (e.g., cytochrome P450s, esterases, uridine 5’-diphospho-glucuronosyltransferases) or cellular transporter (e.g., organic anion or cation transporters, P-glycoprotein), as well as its abundance, will impact the rate and extent of metabolism as well as the metabolites formed. Since interspecies differences exist in metabolizing enzymes and different classes of chemicals will be metabolized by different key enzymes, it may be more appropriate to review animal to human concordance separately for different classes of chemicals rather than by grouping all existing data together.

INSIGHTS FROM WORKSHOPS

Workshop 1

The first virtual workshop was held December 9, 2021, with the aim of providing information to assist the committee in considering the potential utility and expectations for the use of NAMs in risk assessment. Presentations and round table discussions involving the presenters addressed topics including how laboratory mammalian toxicity studies are used to inform chemical safety decisions, some examples of variability and concordance of laboratory mammalian toxicity studies, and a consideration of the expectations of different stakeholders. A summary of the workshop proceedings was published (NASEM, 2022a). Presentation points relevant to concordance include the following:

  • Laboratory mammalian toxicity testing has been the cornerstone of risk science, serving as an experimental basis for predicting human disease, including cancer. In general, concordance across species is seen for cancer occurrence. On the other hand, the sites or tumor types seen in experimental animals may differ from those in humans, which could be in part due to experimental design issues.
  • However, these animal studies have limitations. For example, animal studies for the purpose of toxicity testing are typically conducted at high doses, which may not necessarily be relevant to the lower human exposures in environmental and other nonoccupational settings. Although experimental animal studies sometimes examine mixtures, they often only evaluate one exposure at a time, which does not reflect the mixture of chemicals to which the human population is exposed. Further, they do not consider other exposures present in the environment that may contribute to adverse effects, including social stressors and factors such as pre-existing disease conditions (see NRC, 2007, 2009).
  • In the vast majority of cases, mammalian toxicity studies have been used in hazard assessment to identify relevant health effects in humans. Workshop presenters discussed cases related to cancer hazard identification and dose-response assessment, neurodevelopment, and specific chemical case studies—phthalates and polybrominated diphenyl ethers (PBDEs, discussed in the literature review section).
  • In addition, some workshop participants noted that many current laboratory mammalian toxicity tests are decades old and were designed to detect catastrophic effects (i.e., overt effects such as lethality, cancer, malformations, or histopathological changes in target organs) and, in general, are highly effective at doing so. They were not designed to detect more subtle structural and functional effects (i.e., cognitive and learning disabilities, metabolic disruption, impaired capacity for lactation, or heightened risk of neurodegenerative disorders), and as such it can be difficult to evaluate concordance in this area.

Workshop 2

The second virtual workshop was held May 12, 2022 (NASEM, 2022b). This workshop addressed elements of a scientific confidence framework for NAMs pertinent to risk assessment via case studies related to mixtures, developmental neurotoxicity (DNT), and estrogenicity. The goal of the case studies was to illustrate the strengths and weaknesses of traditional and nontraditional approaches and methods, including with respect to concordance. Relevant to concordance were the case studies on mixtures and DNT, as discussed in the following section.

Mixtures

This case study suggested that mixture studies provide an opportunity to assess the concordance of animal and human data in a more realistic exposure context. For instance, the European Food Safety Agency (2019) noted that for dioxin-like compounds and sperm concentrations, toxicity equivalence factors (TEFs), used to convert concentrations of multiple dioxin-like compounds into a common metric of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)-equivalents, were a reasonable predictor of effects for both a rodent toxicology study (Hamm et al., 2003) and a human epidemiologic study (Mingues-Alarcón et al., 2017). In addition, with the emergence of exposomics and advances in statistical methodologies, epidemiologic data on mixtures may provide a substantial opportunity to evaluate concordance of experimental animal studies and humans. For instance, Caporale et al. (2022), using a similar mixture approach, found thyroid hormone disruption and DNT effects in the nonmammalian Organisation for Economic Co-operation and Development (OECD) guideline studies (i.e., OECD TG 248) when experimentally testing a mixture of eight endocrine disrupting chemicals in proportions found in an epidemiologic cohort where DNT effects were found in humans. Moreover, the dose ranges over which effects were observed were similar between the experimental and human studies, thereby supporting both qualitative and quantitative concordance of the experimental studies. As another example, Bornehag et al. (2019) found that an exposure to a mixture of phthalates that caused a 5% decrease in anogenital distance in boys led to similar decreases in rodents, again supporting both qualitative and quantitative concordance. This last example is striking because previously NASEM (2017a) also found qualitative concordance with variability in the response levels when evaluating data on individual phthalates.
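
The toxicity equivalence approach described above for dioxin-like compounds amounts to a weighted sum: each congener's concentration is multiplied by its TEF, and the products are added to give a single TCDD-equivalent value. The Python sketch below illustrates that arithmetic only. The congener names, TEF values, and concentrations are placeholders invented for this example; they are not the values used by the cited assessments, and any real calculation would need the TEF scheme adopted by the relevant authority.

```python
from typing import Mapping


def tcdd_equivalents(
    concentrations: Mapping[str, float], tefs: Mapping[str, float]
) -> float:
    """Sum concentration * TEF over all congeners in the mixture.

    Raises KeyError if a measured congener has no TEF, rather than
    silently treating it as zero.
    """
    missing = [name for name in concentrations if name not in tefs]
    if missing:
        raise KeyError(f"no TEF available for: {missing}")
    return sum(conc * tefs[name] for name, conc in concentrations.items())


# Placeholder TEFs and concentrations (same units for every congener);
# all numbers here are illustrative assumptions, not published values.
example_tefs = {"TCDD": 1.0, "congener_A": 0.1, "congener_B": 0.01}
example_conc = {"TCDD": 2.0, "congener_A": 10.0, "congener_B": 50.0}

teq = tcdd_equivalents(example_conc, example_tefs)
print(f"Mixture exposure = {teq:.2f} TCDD-equivalents")
```

Expressing animal and human exposures on this common scale is what allows a single dose metric to be compared against effects in both a rodent study and an epidemiologic study.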

Developmental Neurotoxicity

Neurodevelopmental disabilities are being increasingly diagnosed globally, with approximately 1 in 6 children between the ages of 3 and 17 having a developmental disability diagnosis in the United States (CDC, 2022a). Much of this increase is driven by increases in the prevalence of attention deficit disorder, autism spectrum disorder, and intellectual disability (CDC, 2022b; Fombonne, 2016). Less than 1% of the tens of thousands of chemicals in commerce have been evaluated for DNT, in part due to lack of regulatory requirements (EPA, 2017). In addition, many neurodevelopmental disorders including autism spectrum disorder and attention-deficit/hyperactivity disorder (ADHD) do not have obvious or defining biological markers (e.g., histopathology, serum biomarkers), which is challenging for a guideline study–dependent regulatory culture that relies on apical endpoints. The presenters noted that it is difficult to evaluate concordance with humans of guideline DNT laboratory mammalian studies because very few tests have been done, they are often not repeated, and the traditional DNT tests do not fully capture the dynamic nature of development and the timescale to detect adversity following developmental exposures. This is further elaborated in the following PBDE systematic review.

Because so few chemicals have been evaluated for DNT using guideline studies, and these studies are not often repeated, it is difficult to fully understand their reliability and reproducibility. The quality of the available data varies considerably, and the data are not often used as a point of departure in risk assessment. There are many factors contributing to why DNT testing is not a primary test for decision-making. For example, screening for DNT is particularly challenging because of the dynamic nature of development and the timescale to detect adversity following developmental exposures. This results in experimentally complex guideline studies that are low throughput with insensitive endpoints that tend to heavily rely on gross motor function, and in questionable overall relevance for predicting human outcomes.

From an evolutionary conservation perspective, animal models and NAMs should be useful for DNT screening to protect human health because much of the intrinsic molecular machinery operational during embryonic development is well conserved across vertebrates, although there are some important organ-specific differences between species. If more DNT testing data were available, we would be better positioned to understand, define, and predict domains of applicability of these tests across chemical classes.

INSIGHTS FROM THE LITERATURE REVIEW

Approach

To support the committee’s effort, the committee reviewed existing literature that evaluated evidence on variability and concordance of laboratory mammalian toxicity tests. The primary literature reporting on outcomes in relevant laboratory mammalian toxicity tests and amenable to de novo review and analysis was voluminous, and a formal systematic review of this literature was not considered within scope of the committee’s effort. However, the committee considered literature consisting of reviews, wherein information from multiple relevant studies, experiments, or databases was compiled and analyzed. Recognizing that systematic reviews provide a transparent, comprehensive, and consistent evaluation with less bias than other types of reviews and analyses, the committee conducted an overview review (Pollock et al., 2019) to identify and evaluate systematic reviews and authoritative reviews of the scientific evidence of the highest methodological quality relevant to the committee’s charge. The goals of the approach were to identify relevant systematic reviews and authoritative reviews, evaluate their methodological quality, and illustrate the study strengths and weaknesses as well as the populations, interventions, and outcomes covered and where significant gaps may remain. In addition, systematic reviews of higher quality formed the evidentiary basis analyzed by the committee in reaching findings and recommendations that addressed the charge questions.

The approach entailed development of a prespecified method as further described in Appendix C. This method detailed the key terms and their definitions, the objectives of the review, the scoping questions and associated population, exposures, comparators, and outcomes (PECO) statements, the inclusion and exclusion criteria, the literature search strategy, the process to assess methodological quality, and the analysis plan. This chapter focuses on the following goal:

To summarize the systematic reviews and authoritative reviews that assess the concordance of adverse health effects between laboratory mammalian models and humans following exposures to environmental agents.

In brief, this goal was addressed through a comprehensive literature search of multiple databases conducted using relevant terms. The results were reviewed for relevance by two independent screeners and included studies evaluated for methodological quality using AMSTAR 2 (A MeaSurement Tool to Assess systematic Reviews) (Shea et al., 2017) in line with assessments of environmental health information by prior National Academies of Sciences, Engineering, and Medicine (NASEM) committees (NASEM, 2019, 2021, 2022c). An evidence map was generated to illustrate the extent of coverage with respect to the PECO statements as well as the quality of existing systematic reviews.1 Systematic reviews judged to be of critically low quality were not considered further by the committee, whereas those of higher quality were summarized and formed the basis of the committee’s findings and recommendations.

Summary of Findings

The overview review identified 32 systematic reviews of the literature that addressed concordance of adverse health effects between laboratory mammalian models and humans. After evaluation using the AMSTAR 2 tool, systematic reviews judged to be of critically low quality were not considered further by the committee. Of the remaining 14 studies, 5 evaluated environmental chemicals or radiation, and 9 covered pharmaceuticals, low-energy sweeteners, alcohol, and caffeine. Table 4-1 shows the outcomes that were evaluated for concordance.

In the following, the findings from the four systematic reviews deemed most informative to the committee’s charge questions concerning concordance are discussed.

___________________

1 See https://public.tableau.com/app/profile/leslie.beauchamp/viz/NAMsEvidenceMapDashboard/EvidenceMap?publish=yes.


As described in the 2017 report Application of Systematic Review Methods in an Overall Strategy for Evaluating Low-Dose Toxicity from Endocrine Active Chemicals (NASEM, 2017), the committee’s statement of task included the following:

The study will include … systematic reviews of human and animal toxicology data for two or more chemicals that affect the estrogen, androgen, or perhaps other endocrine systems. The committee will evaluate the results of the systematic reviews, demonstrate how human and animal data streams can be integrated, determine whether the evidence supports a likely causal association, and evaluate the nature and relevance of the dose-response relationship(s).

This report comprised two systematic reviews. First, the committee conducted a systematic review of the literature on diester phthalates and male reproductive development with a specific focus on di-ethylhexyl phthalate (DEHP) and the endpoints of anogenital distance (AGD), hypospadias, and fetal testosterone. To summarize the included animal studies that met the review criteria:

  • 19 animal studies assessed the association between DEHP exposure and AGD with “a high confidence rating of the body of evidence … and … a high level of evidence that fetal exposure to DEHP is associated with a reduction in AGD in male rats” (NASEM, 2017, p. 57).
  • 12 animal studies assessed the association between DEHP exposure and fetal testosterone with “a high confidence rating in the body of evidence and … a high level of evidence that fetal exposure to DEHP is associated with a reduction in fetal testosterone in rats” (NASEM, 2017, p. 60).
  • 9 animal studies assessed the association between DEHP exposure and hypospadias with “a moderate confidence rating in the body of evidence and … a moderate level of evidence that fetal exposure to DEHP is associated with an increased incidence of hypospadias in male rats” (NASEM, 2017, p. 62).

A subsequent meta-analysis evaluating the dose-response relationships of fetal DEHP exposure and AGD indicated “consistent evidence of a decrease in AGD in male rats after fetal exposure to DEHP, with a modest dose-response gradient” (NASEM, 2017, p. 64). The DEHP dose-response relationship for AGD varied between rat strains (Sprague Dawley and Wistar), and there was no statistically significant overall dose-response relationship between DEHP exposure and AGD in mice. The same meta-analysis approach was also used to study the dose-response relationship between DEHP exposure in rats and fetal testosterone with consistent evidence of a decrease in fetal testosterone with a strong dose-response gradient upon exposure to DEHP.

Six human epidemiological studies assessing the association between DEHP exposure and AGD met the review criteria. Confidence in this body of evidence was rated moderate, based on the criteria that exposures occurred before the outcome, outcomes were measured on individuals, and a control comparison group was used. There was “a moderate level of evidence that fetal exposure to DEHP is associated with a reduction in AGD” (NASEM, 2017, p. 71). A subsequent meta-analysis provided “consistent evidence of a decrease in AGD being associated with increasing urinary concentrations of the sum of DEHP metabolites and of magnitude around 4% for each 10-fold (log10) increase in DEHP concentrations” (NASEM, 2017, p. 75).
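
To make the quoted effect size concrete, the short Python sketch below converts a fold-increase in urinary DEHP metabolite concentration into an approximate percent change in AGD. It assumes, for illustration only, that the roughly 4% decrease per 10-fold increase applies linearly on the log10 scale across the whole range of interest; the function name and the default slope parameter are hypothetical conveniences, and the report itself describes the estimate only as approximate.

```python
import math


def agd_percent_change(fold_increase: float, pct_per_log10: float = -4.0) -> float:
    """Approximate percent change in AGD for a given fold-increase in exposure.

    Assumes a log-linear relationship: each 10-fold increase in the
    metabolite concentration shifts AGD by `pct_per_log10` percent.
    The default of -4.0 mirrors the approximate figure quoted above.
    """
    if fold_increase <= 0:
        raise ValueError("fold_increase must be positive")
    return pct_per_log10 * math.log10(fold_increase)


for fold in (2, 10, 100):
    print(f"{fold}-fold increase: {agd_percent_change(fold):+.1f}% change in AGD")
```

Under this assumption a doubling of exposure corresponds to a change of a little over 1%, which shows why large exposure contrasts are needed to detect the effect in epidemiologic studies.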


TABLE 4-1 Summary of Higher Quality Concordance Studies Identified from the Committee’s Literature Review

| Reference | Population(s) | Exposure(s) | Adverse Outcome(s) | Committee Comments (relevance and informativeness to charge) |
|---|---|---|---|---|
| Andersen et al. (2020) | Humans and rats | Exposure to methadone or buprenorphine in utero | Cognitive, psychomotor, motor, behavioral, attentional, executive, or visual outcomes | Overall, there was fair concordance for negative predictive outcomes (i.e., lack of effects) between the preclinical rodent and newborn neurologic assessments. Neurologic testing was done at similar lifestages in rodents and newborns (discussed further in the text). |
| NASEM (2017) | Humans, rats, mice, guinea pigs | In utero exposure to phthalates (see Box 3-1 of the report for details) | Male reproductive toxicity (AGD, hypospadias, fetal testosterone) | There was moderate concordance between rodents and humans (discussed further in the text). |
| NASEM (2017) | Humans, rats, mice, guinea pigs | Developmental exposure to PBDEs | Animals: Measures of learning, memory, attention, or response inhibition | Limited ability to evaluate concordance; difficult to assess for PBDEs due to the types of tests, the differences in developmental timing, and methodological differences (discussed further in the text). |
| Perel et al. (2007) | Humans, rats, mice, primates, other mammals | Six interventions for which there was evidence of a treatment effect (benefit or harm) in systematic reviews of clinical trials | Tirilazad was associated with worse outcome in patients; other outcomes studied were beneficial in nature | Evidence supports high concordance between the rodent and clinical study outcomes for bisphosphonate effects on bone mass, due to similarity in bone metabolism between humans and rodents and alignment of life stage (discussed further in the text). |
| Bestry et al. (2022) | Humans, primates, rats, mice | Prenatal alcohol exposure | DNA methylation | Inadequate evidence to support an association between prenatal alcohol exposure and altered DNA methylation because of heterogeneity in study design and methods. |
| Bezemer et al. (2021) | Humans, mice | Allylamines | Safety and efficacy for treatment of cutaneous and mucocutaneous leishmaniasis; no adverse events were reported | Inadequate evidence to evaluate concordance as studies were low in number, heterogeneous, and of low quality. |
| Bodewein et al. (2019) | Humans, rats, mice, guinea pigs, dogs | Man-made electronic frequency (EF), magnetic frequency and electromagnetic frequency in the intermediate frequency (IF) range (300 Hz to 1 MHz) | Effects on any biological function | Inadequate evidence to evaluate concordance due to heterogeneity of study designs, methods, and endpoints. |
| European Commission (2018) | Humans, rabbits, rats, mice | Endocrine disrupting chemicals [World Health Organization (WHO)/International Programme on Chemical Safety (IPCS) definition]: bisphenol A, di(2-ethylhexyl) phthalate (DEHP), vinclozolin, trenbolone, perfluorooctanesulfonic acid, perfluorooctanoic acid, BDE, perchlorate, prochloraz | Endocrine-disrupting activity or effect, including effects manifested at later life stages | Inadequate evidence to evaluate concordance due to many gaps and limitations in study design. |
| Jukema et al. (2021) | Humans, mice, rats, guinea pigs, rabbits, other mammals | Antileukotrienes to prevent or treat chronic lung disease | All-cause mortality and any harm, and, for the clinical studies, incidence of chronic lung disease | Limited ability to evaluate concordance due to heterogeneity in study designs, methods, and high level of bias. |
| Leenaars et al. (2019) | Humans, rats, rabbits, mice | Various | Translational success (and failure) rates | This is a scoping review that did not evaluate concordance. It noted that all but one of the included studies were of very low quality. Study authors were queried to identify study, and they did not respond. |
| Rogers et al. (2016) | Humans, primates, rabbits, rats | Low-energy sweeteners | Energy intake and body weight | Limited ability to evaluate concordance due to heterogeneity in results. |
| Shojaei-Zarghani et al. (2020) | Humans, rats, mice | Caffeinated beverages (tea, coffee, soda) and chocolate in humans; caffeine in animals | Risk of colon cancer in human studies; in animal studies, cancers (adenocarcinomas) and precancerous lesions; survival; key characteristics of carcinogens (5, 6, 10) | Limited ability to evaluate concordance due to differences in exposure, doses, and accounting for confounders in epidemiologic studies. |
| Sophocleous et al. (2022) | Humans, mice, rats, rabbits | Cannabinoid receptor ligands (synthetic and natural) | Bone cell activity and bone volume in rodents; bone mineral density in humans | Limited ability to evaluate concordance because the studies were few in number and heterogeneous. |
| Wikoff et al. (2021) | Humans and rats | Exposure to dioxin-like compounds | Reduced sperm count | Limited ability to evaluate concordance due to methodological deficiencies, including deviation from protocol. |

Together, the animal and human data provided consistent evidence that fetal exposure to DEHP is associated with a decreased AGD and a decreased fetal testosterone level. While the mechanistic data support the biological plausibility in humans of an association between DEHP exposure and reduced AGD and fetal testosterone, these data were not considered sufficient to upgrade the final hazard identification for AGD and fetal testosterone, that DEHP is presumed to be a reproductive hazard to humans. The weaker evidence for an association between DEHP exposure and hypospadias indicated that DEHP is suspected to be a reproductive hazard to humans. Inferences about the effect levels of DEHP in animals and humans were considered uncertain.

There are numerous lessons to be learned from this NASEM report analyzing the association between phthalate exposure and male reproductive development.

  • The animal and human databases on phthalate exposure and male reproductive development are robust compared to most of the chemicals on the marketplace; only a relatively small fraction of chemicals have been studied to this extent.
  • The apical endpoints measured (AGD, fetal testosterone, hypospadias) were generally similar between the multiple animal and human studies, simplifying the determination of whether responses were concordant or discordant.
  • Despite these strengths, the conclusion was of a moderate level of evidence supporting an association between DEHP exposure and outcomes of effect.
  • The mechanistic information provided evidence of biological plausibility but, given the variation in sensitivities among rodent species and strains, concordance and discordance between the animal and the human studies remained challenging to interpret.
  • There was significant uncertainty about exposure levels necessary to produce an effect because the animal studies usually measured external administered doses whereas the human studies measured biomarkers of internal dose. There was consistency in the quantitative dose-response in terms of higher exposures resulting in higher responses.

The second systematic review conducted by the 2017 NASEM ad hoc committee evaluated the neurodevelopmental impacts of PBDEs, with a focus on outcomes related to impaired learning, memory, attention, and response inhibition in rodents, and reduced IQ and other endpoints of relevance to ADHD and intelligence in children (NASEM, 2017). Systematic reviews of the literature examined nonhuman mammals and humans.

For the animal systematic review, of the 67 studies that met the criteria for full text review, 27 were identified for data extraction and were included in the review. The search identified relevant data for 6 PBDE congeners and a related flame retardant (DE-71), plus a wide range of tests used to assess learning, memory, and attention in rodents, with the Morris water maze (assesses spatial navigation in a large pool) being one of the most commonly deployed. Because most studies used the Morris water maze, a meta-analysis on latency to find the platform (essentially how quickly the animal can correctly navigate the maze) was performed.

Across studies in rats and mice, the committee noted a significant relationship between PBDE exposure and changes in the latency to complete the last trial of a Morris water maze. Challenges to this interpretation include analysis across all congeners, which likely have different potencies for these effects, variability in timing of exposure and testing, and not accounting for sex differences, which are substantial in this testing paradigm.

The human systematic review identified 18 relevant studies with only two meeting the criteria for full text review. During the process, the committee was provided a draft of a third systematic review (Lam et al., 2017) ultimately deemed worthy of inclusion. Availability of quality data was higher for IQ than ADHD assessments. Consequently, the human studies provided sufficient evidence that PBDEs are associated with decrements in human IQ, reported as a decrease of 3.70 (95% CI: 0.83, 6.56) IQ points per 10-fold increase in lipid-adjusted PBDE concentration, but there was limited evidence to support an association between PBDEs and ADHD in children.
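To make the reported units concrete, the following minimal sketch converts a fold increase in lipid-adjusted PBDE concentration into an expected IQ decrement. It assumes that the log-linear relationship implied by "per 10-fold increase" holds across the whole range of interest, which is an assumption of this illustration rather than a finding of the review. The function name and constants are hypothetical labels; only the numbers 3.70, 0.83, and 6.56 come from the text.

```python
import math

# Values reported in the text: IQ points lost per 10-fold increase in
# lipid-adjusted PBDE concentration (point estimate and 95% CI bounds).
SLOPE = 3.70
CI_LOW, CI_HIGH = 0.83, 6.56


def iq_decrement(fold_increase, slope=SLOPE):
    """Expected IQ decrement for a given fold increase in concentration.

    Assumes a log10-linear exposure-response relationship (an assumption of
    this sketch): decrement = slope * log10(fold_increase).
    """
    if fold_increase <= 0:
        raise ValueError("fold_increase must be positive")
    return slope * math.log10(fold_increase)


# Illustrative use: a doubling of concentration, with the CI bounds applied
# to the same log10 scaling.
doubling = iq_decrement(2)
doubling_range = (iq_decrement(2, CI_LOW), iq_decrement(2, CI_HIGH))
print(round(doubling, 2), tuple(round(x, 2) for x in doubling_range))
```

Under this scaling, a 10-fold increase reproduces the reported 3.70-point decrement by construction, and smaller fold changes scale with the base-10 logarithm of the ratio.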

Regarding concordance, the integration of human data that evaluated measures of IQ and ADHD with animal studies that examined learning, memory, and attention was challenging given that the endpoints assess different aspects of neurodevelopment. In addition, the animal studies used different tests of learning and memory and, in some cases, different endpoints in the same test or different variations of the study design (e.g., different numbers of trials or different numbers of visual cues to aid navigation). The test methods and data analyses also often differed between studies, and exposures were lower in the human studies. Ultimately, the committee found that current mammalian-based testing paradigms could detect a hazard (change in learning and memory) that is presumed to be a concern in humans.

Lessons learned from this element of the 2017 NASEM report analyzing the association between PBDE exposure and neurodevelopment are similar to those identified for phthalate exposure and male reproductive development.

  • High risk-of-bias ratings for multiple questions in the individual animal studies resulted in the subsequent downgrading of the overall level of confidence in the bodies of evidence. This was largely due to absent information important for evaluation of the study design (e.g., insufficient information on randomization, masking of treatment conditions, and controlling for litter effects). Broader adoption of ARRIVE guidelines (https://arriveguidelines.org/) or similar will likely improve reporting quality.
  • Exposure regimens varied across studies, which complicated data synthesis. It is unclear whether all incorporated the window of greatest exposure sensitivity.
  • Multiple testing strategies for assessing learning and memory in rodent models exist, and there is no standardized testing battery or recommended testing strategy. The Morris water maze is most common for spatial memory, but testing protocols vary.
  • Meta-analysis of latency data from the Morris water maze for several PBDEs showed a significant overall effect of PBDE exposure that was robust to multiple sensitivity analyses, but not a clear dose-response, likely due to the inclusion of multiple congeners.
  • PBDE effects on brain-related endpoints were consistent, indicating general concordance across species. An exposure-related response was also identified in both the human studies and the laboratory mammalian toxicity tests.
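The kind of pooling and sensitivity analysis described for the Morris water maze latency data can be sketched as follows. This is a generic DerSimonian-Laird random-effects calculation with a leave-one-out check, not the specific method or software used in the 2017 NASEM review, and every input value shown is hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with the DerSimonian-Laird estimator."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect and at least one study")
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero; undefined for one study.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


def leave_one_out(effects, variances):
    """Re-pool with each study removed in turn (needs at least two studies)."""
    return [
        random_effects_pool(effects[:i] + effects[i + 1:],
                            variances[:i] + variances[i + 1:])["pooled"]
        for i in range(len(effects))
    ]


# Hypothetical standardized effects on latency and their variances.
example_effects = [0.42, 0.65, 0.10, 0.58]
example_variances = [0.04, 0.09, 0.05, 0.12]
summary = random_effects_pool(example_effects, example_variances)
sensitivity = leave_one_out(example_effects, example_variances)
```

If the leave-one-out estimates stay close to the overall pooled estimate, the result does not depend on any single study. That is one common sense in which an overall effect is described as robust to sensitivity analyses.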

It should be noted that there are well-known species, strain, and sex differences in how rodents perform in these tests that were not taken into account, and that sex differences have also been observed in humans using a virtual Morris water maze and other spatial tasks. The committee also noted that the human studies often came from the same cohorts but were conducted at different time points, with the studies conducted on older children having a lower risk of bias than those conducted at younger ages.

A third systematic review considered relevant to the committee’s charge aimed to evaluate long-term outcomes following prenatal exposure to methadone or buprenorphine in children and similarly exposed experimental animals (Andersen et al., 2020). The children were exposed to methadone and buprenorphine through opioid maintenance therapy (OMT) of their mothers. The outcomes of interest were cognitive, behavioral, or visual effects, assessed in exposed children 3 months of age or older and in control group(s). Based on a pooling of meta-analyses, reductions in cognitive, psychomotor, behavioral, attentional, and executive functioning, as well as affected vision, were seen in children whose mothers had received OMT during pregnancy compared to controls (overall effect size = 0.49, 95% CI: 0.38, 0.59, p < 0.00001). Although the effect of prenatal exposure to opioids on cognitive function in the offspring was significant, concurrent risk factors in the exposed group may have biased the results. In addition, as the assessments were limited to younger children, the persistence of the lower cognitive function was not studied. A systematic review of studies in experimental animals similarly demonstrated impaired outcomes after prenatal exposure to methadone or buprenorphine. Variability in the findings in experimental animals was attributable to differences in the outcome assessments as well as in the age of the animals. Overall, this study demonstrated impairment in cognitive and other outcomes in the offspring of mothers on OMT during pregnancy. This finding was concordant with studies in similarly exposed experimental animals, which supports a possible causal relationship between prenatal methadone or buprenorphine exposure and subsequent cognitive impairment.

A fourth systematic review relevant to the committee’s charge aimed to compare treatment effects between animal experiments and clinical trials (Perel et al., 2007). Animal studies are used in novel drug discovery for assessing both safety and efficacy before clinical trials are undertaken. This systematic review abstracted animal studies for interventions with clear evidence of either a beneficial or harmful treatment effect in clinical trials; one of these interventions was the use of bisphosphonates to treat osteoporosis.

Bisphosphonates increase bone mineral density of the lumbar spine and hip as measured by dual-energy X-ray absorptiometry (DXA) in postmenopausal women with osteoporosis. A meta-analysis was performed to assess whether animal models similarly showed an increase in bone mineral density. Sixteen animal studies were identified in which bisphosphonates were studied in female animals made postmenopausal by removal of their ovaries. Data were extracted on the study design, treatment allocation, number of randomized animals, and type of intervention or outcome. Bone mineral density was reported in 11 studies, all of which reported an increase in bone mineral density, and all 6 of the remaining studies reported an increase in bone mass. Compared to placebo, the meta-analysis found alendronate increased hip bone mineral density by 11% (95% CI: 9.2, 12.9%) and lumbar spine bone mineral density by 8.5% (CI: 5.8, 11.2%).

Overall, there was high concordance in the reported findings between rodents and humans treated with alendronate for postmenopausal bone loss. Several factors help explain this. First, trabecular bone turnover in rodents and in humans is well understood and quite similar. Second, the bisphosphonate does not remain in the circulation; once in the circulation, it goes to the bone and resides there. Only when osteoclasts attach to the bone is the bisphosphonate released again into the circulation. Third, the measurement of bone mineral density by DXA or computerized tomography (CT) is quite standardized and measures milligrams of mineral over a defined area of bone, such that the correlation between species that have mineralized bone in their skeletons is quite high. Finally, the effect of estrogen deficiency on bone mass has been well defined, so the timing of the intervention to prevent or treat bone loss is well known. For all of these reasons, the concordance between animal and clinical studies of bisphosphonates for improvement of bone mass is supported by biological plausibility.

LITERATURE PRESENTED TO THE COMMITTEE

Approach

The committee also analyzed literature presented to them in information gathering sessions. This included literature presented during the virtual workshops as well as publications referenced by the sponsor during their presentations to the committee. Importantly, these publications represent those selected by speakers at the committee’s information gathering sessions rather than through a more systematic process that would minimize bias. The committee’s approach to analyzing these publications entailed development of a prespecified method as further described in Appendix C. The 128 papers were screened using inclusion/exclusion criteria similar to those used for the systematic review of reviews. From these 128 papers, 11 studies were included; several had already been identified in the committee’s literature review, including the systematic reviews published by NASEM (2017).

Summary of Findings

Most of the papers identified during the virtual workshops and other information gathering sessions were not systematic reviews. However, many leveraged data sets culled from the literature that could in part be identified via a systematic review, though additional studies could also be identified. All the papers cited in the workshops not previously identified as part of the committee’s literature review were regarded as having critically low methodological quality because they were not systematic reviews. Specifically, none of the reviews explicitly stated that the review methods were established prior to the conduct of the review, nor did they report and justify any significant deviations from the protocol. These reviews also did not use or document a comprehensive search strategy or an approach for evaluating the quality of the included studies, including an acceptable risk of bias tool. This highlights opportunities for improving the reviews presented to the committee to decrease bias in the overall evidence evaluation. For example, future studies in this area could employ a comprehensive search strategy and more carefully detail the rationale for why studies were excluded during the title and abstract or full text review stages, and protocols for evaluating the studies could be preregistered for transparency. In addition, a risk of bias evaluation could be conducted on the studies identified using tools previously recommended by NASEM.

It should be noted that the evaluations of concordance that used data from specific databases are not systematic reviews, and as such, there may be other existing data that might be pertinent to the research question which might not have been included in the data analysis. In other words, these studies used convenience samples, and they were not evaluated for risk of bias. Therefore, the results are not generalizable. The findings from these studies are summarized in Table 4-2, but any conclusions about the qualitative and quantitative concordance between laboratory mammalian toxicity studies and humans derived from these reviews are to be interpreted with caution.

FINDINGS AND RECOMMENDATIONS

Findings Regarding Toxicokinetics

The quantitative aspect of concordance in toxicokinetics often varies between laboratory mammalian species and humans, but the qualitative aspects are often similar (e.g., major elimination pathways, enzymes involved, metabolites formed, disposition in body).

  • Oral bioavailability can be poorly concordant between laboratory mammalian species and humans because of differences in the extent of first pass gut and liver metabolism.
  • Skin absorption rates of a chemical can differ between animals and humans depending on animal species because of differences in the structure and composition of the skin barrier.
  • Metabolic patterns (e.g., spectrum and amounts of metabolites) between animals and humans can vary by chemical class, and the ability to predict toxicokinetic concordance is dependent on the availability of data on similar chemicals.
  • The variable concordance of absorption, distribution, metabolism, and excretion (ADME) across species makes quantitative comparisons of internal dosimetry difficult.

TABLE 4-2 Summary of Concordance Studies Identified from Literature Presented to the Committee That Are Not Systematic Reviews

| Reference | Population(s) | Exposure(s) | Adverse Outcome(s) | Reported Results |
|---|---|---|---|---|
| Clark and Steger-Hartmann (2018) | Humans, rats, mice, rabbits, dogs, guinea pigs, primates, other mammals | 3,290 approved drugs and drug formulations | 1,637,449 adverse events and reports of preclinical safety-related observations in the PharmaPendium database | In general, the predictivity of animal safety observations for adverse events in humans was confirmed. This was true for many key observations, for example, QT interval prolongation and arrhythmias in dogs. |
| Fenton et al. (2021) | Humans, rats, mice | Per- and polyfluoroalkyl substances (PFAS) | Health effects | Concordance was generally seen between animal studies and human epidemiological observations for adverse effects of PFAS, including on the immune system, liver, kidney, reproduction and development, and cancer. Species differences meriting further investigation include cholesterol metabolism and thyroid effects as well as mode of action for liver and kidney effects. |
| Giblin et al. (2021) | Humans and animals (rat, dog, mouse, rabbit, and monkey; species in Figure S-3) | 1,604 drugs in the PharmaPendium (2014-04) database | Preclinical and clinical adverse events | The authors identified 3,011 statistically significant associations between preclinical and clinical adverse events caused by drugs. In particular, concordance between animal species and humans was seen for renal papillary necrosis, cardiotoxicity, sedation, and increased platelet count. Other concordant outcomes included presence of drug-specific antibodies, increased blood prolactin, QT prolongation, increased aminotransferase, diarrhea, and injection site erythema. |
| Krewski et al. (2019) | Humans, rats, mice, dogs, primates, and other mammals | 111 Group 1 carcinogens identified by the International Agency for Research on Cancer (IARC) Monographs (through Volume 109) | Tumor sites | The highest degree of overlap in tumor sites between experimental animal species and humans was seen for mesothelium (100%), thyroid/follicular epithelium (100%), urothelium (70%), and the respiratory system (59%). |
| Monticello et al. (2017) | Humans, rats, mice, dogs, primates, other mammalian species | 182 candidate or approved biopharmaceutical molecules | Clinical observations | Overall, the nonhuman primate had the highest concordance parameters, and negative predictive value was equivalent across species. |
| NRC (2000) | Humans, rats, mice, rabbits | Chemical and physical agents | Developmental defects | The report noted concordance of developmental effects between experimental animals and humans, and humans are as sensitive or more sensitive than the most sensitive animal species. It also cited Makris et al. (1998): “Concordance was found between structural alterations of the nervous system and behavioral defects.” |
| Olson et al. (2000) | Humans, rats, mice, dogs, guinea pigs, rabbits, primates | 150 pharmaceuticals | 221 human toxicities associated with exposure to pharmaceuticals | The principal types of toxicities reported were hepatic, neurological, cardiovascular, hematological, gastrointestinal, and hypersensitivity. Overall, for toxicity in the same organ system, the sensitivity was 70% for one or more preclinical animal model species. The most highly concordant effects between experimental animal species and humans were cardiovascular (80%), hematologic (91%), and gastrointestinal (85%), particularly in nonrodents and for anticancer, anti-infective, and anti-inflammatory agents. |
| Patisaul et al. (2021) | Humans, rats, mice, zebrafish | Organophosphate esters (OPEs) used as flame retardants and plasticizers | Developmental neurotoxicity | Concordance of effect was seen between experimental animals and humans for endocrine disruption (zebrafish, rodents, and humans) and a number of neurotransmitters important for developmental neurotoxicity including glutamate (rodents and humans), gamma-aminobutyric acid (zebrafish, rodents, and humans), and other neurotransmitters (zebrafish, rodents, and humans). |
| Tamaki et al. (2013) | Humans and rodents (species not specified) | 142 approved drugs in Japan | 1,256 adverse drug reactions | The overall concordance between preclinical and clinical adverse drug reactions was 48%. |
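Several entries in Table 4-2 report sensitivity, negative predictive value, or an overall concordance percentage. As a minimal sketch of how such measures relate to one another, the example below computes them from a 2 × 2 cross-classification of animal and human findings. The function name and the counts are hypothetical, and each cited study applied its own definitions, so these textbook formulas may not match the calculations in any particular paper.

```python
def concordance_metrics(tp, fp, fn, tn):
    """Summarize a 2 x 2 comparison of animal and human findings.

    tp: effect seen in animals and in humans
    fp: effect seen in animals but not in humans
    fn: effect seen in humans but not in animals
    tn: effect seen in neither
    Returns None for any metric whose denominator is zero.
    """
    def ratio(num, den):
        return num / den if den else None

    total = tp + fp + fn + tn
    return {
        # Share of human effects that the animal tests also detected.
        "sensitivity": ratio(tp, tp + fn),
        # Share of human non-effects that the animal tests also did not show.
        "specificity": ratio(tn, tn + fp),
        # Share of positive animal findings confirmed in humans.
        "ppv": ratio(tp, tp + fp),
        # Share of negative animal findings confirmed in humans.
        "npv": ratio(tn, tn + fn),
        # Share of all comparisons in which animals and humans agree.
        "overall_concordance": ratio(tp + tn, total),
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = concordance_metrics(tp=70, fp=20, fn=30, tn=80)
for name, value in metrics.items():
    print(name, None if value is None else round(value, 3))
```

Because these measures answer different questions, a single "concordance" percentage from one study is not directly comparable with a sensitivity or predictive value from another.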

Recommendation 4.1: To better understand animal-to-human concordance in toxicokinetics, the EPA should systematically review existing data. One opportunity is to conduct a systematic review and/or meta-analysis on the existing primary in vitro hepatocyte metabolism and protein binding data, including any available in ToxCast (i.e., in rats and humans) (Honda et al., 2019; Rotroff et al., 2010; Wetmore et al., 2012, 2013, 2015).

Findings Regarding Concordance of Outcomes

  • Laboratory mammalian tests can generally identify human health hazards for a range of adverse health outcomes and are particularly helpful for detecting overt effects such as malformations or histopathological changes in target organs (e.g., necrosis).
  • Laboratory mammalian toxicity testing can generally identify dichotomous higher-level outcomes, such as cancer.
  • Laboratory mammalian toxicity testing has not been as successful for identifying some human health endpoints, such as DNT and mammary gland effects. The tests are often not well designed to accurately assess the needed endpoints or mechanisms of action. This is due in part to the lack of alignment of methods and endpoints across species. For further information, please refer to the Tableau dashboard (see footnote 2).
  • There may be qualitative concordance between animals and humans in that both tend to share dose-response behavior. However, the differences between external measured dose applied in animals and internal dosimetry measured in humans can make quantitative comparisons across species difficult.
  • There was consistency in the quantitative dose-response in terms of higher exposures resulting in higher responses in the systematic reviews evaluated by the committee.
  • Response-level differences between animals and humans may be due to multiple factors including differences in study populations; animal studies are in general designed to limit unintended variability in response. This makes the response levels less concordant with outcomes of diverse human exposures.
  • The committee’s overview review established that there are few high-quality systematic reviews with evidence to evaluate concordance.
  • Literature cited during the information gathering sessions and workshops represents publications selected as being of importance by speakers rather than through a more systematic process that would minimize bias. In addition, as a class, the studies cited in the workshops that were not systematic reviews lacked prespecified protocols, a comprehensive search strategy, a risk of bias evaluation, and other methodological design elements for minimizing bias. Thus, conclusions about concordance derived from this literature are to be interpreted with caution.

Recommendation 4.2: To evaluate concordance of outcomes:

  • The EPA should utilize the findings from the systematic reviews and authoritative reviews, or conduct new systematic reviews.
  • The EPA could build on the available higher quality systematic reviews and authoritative reviews that demonstrated high concordance (e.g., cancer/bone metabolism) to further evaluate NAMs. Other opportunities include conducting a systematic review of available nonmammalian test methods that measure male reproductive responses to phthalates, for comparison with the 2017 NASEM low-dose endocrine report; of non-animal test methods related to methadone and neurologic effects; or of non-animal test methods related to bisphosphonates and bone density. In all cases, the outcomes would be compared with the existing systematic reviews of laboratory animal and human data.
  • The EPA’s analyses to understand concordance should require studies that use high-quality methods and, as much as possible, similar features in order to align factors such as timing, life stage, sex, and length of exposure.

___________________

2 See https://public.tableau.com/app/profile/leslie.beauchamp/viz/NAMsEvidenceMapDashboard/EvidenceMap?publish=yes.

Recommendation 4.3: In order to implement the recommendations from the report, the EPA, in collaboration with other agencies and the OECD, should convene scientific advisory groups that include appropriate subject matter and community health expertise, including clinicians, to review and update mammalian toxicity testing study designs to make them more specific, more sensitive, and better aligned with the 3 R goals. This process should provide opportunities to add endpoints with human relevance that are inadequately covered in laboratory mammalian toxicity tests such as DNT and mammary gland effects.

Recommendation 4.4: Due to the known limitations in animal-to-human concordance, the EPA should not use laboratory mammalian toxicity tests as the sole factor in determining internal and external validity for acceptance of NAMs into regulatory practice.

REFERENCES

Andersen, J. M., G. Høiseth, and E. Nygaard. 2020. “Prenatal Exposure to Methadone or Buprenorphine and Long-Term Outcomes: A Meta-Analysis.” Early Human Development 143 (April): 104997.

Bestry, M., M. Symons, A. Larcombe, E. Muggli, J. M. Craig, D. Hutchinson, J. Halliday, and D. Martino. 2022. “Association of Prenatal Alcohol Exposure with Offspring DNA Methylation in Mammals: A Systematic Review of the Evidence.” Clinical Epigenetics 14(1): 12.

Bezemer, J. M., J. van der Ende, J. Limpens, H. J. C. de Vries, and H. D. F. H. Schallig. 2021. “Safety and Efficacy of Allylamines in the Treatment of Cutaneous and Mucocutaneous Leishmaniasis: A Systematic Review.” PLoS ONE [Electronic Resource] 16(4). https://doi.org/10.1371/journal.pone.0249628.

Bodewein, L., K. Schmiedchen, D. Dechent, D. Stunder, D. Graefrath, L. Winter, T. Kraus, and S. Driessen. 2019. “Systematic Review on the Biological Effects of Electric, Magnetic and Electromagnetic Fields in the Intermediate Frequency Range (300 Hz to 1 MHz).” Environmental Research 171 (April): 247–259.

Bornehag, C.-G., E. Kitraki, A. Stamatakis, E. Panagiotidou, C. Rudén, H. Shu, C. Lindh, J. Ruegg, and C. Gennings. 2019. “A Novel Approach to Chemical Mixture Risk Assessment—Linking Data from Population-Based Epidemiology and Experimental Animal Tests.” Risk Analysis: An Official Publication of the Society for Risk Analysis 39 (10): 2259–2271.

Caporale, N., M. Leemans, L. Birgersson, P.-L. Germain, C. Cheroni, G. Borbély, E. Engdahl, et al. 2022. “From Cohorts to Molecules: Adverse Impacts of Endocrine Disrupting Mixtures.” Science 375(6582): eabe8244.

CDC (Centers for Disease Control and Prevention). 2022a. “CDC’s Work on Developmental Disabilities.” May 16. https://www.cdc.gov/ncbddd/developmentaldisabilities/about.html.

CDC. 2022b. “Autism and Developmental Disabilities Monitoring (ADDM) Network.” December 15. https://www.cdc.gov/ncbddd/autism/addm.html.

Chiou, W. L., and A. Barve. 1998. “Linear Correlation of the Fraction of Oral Dose Absorbed of 64 Drugs between Humans and Rats.” Pharmaceutical Research 15(11): 1792–1795.

Chiou, W. L., H. Y. Jeong, S. M. Chung, and T. C. Wu. 2000. “Evaluation of Using Dog as an Animal Model to Study the Fraction of Oral Dose Absorbed of 43 Drugs in Humans.” Pharmaceutical Research 17(2): 135–140.

Chiou, W. L., and Paul W. Buehler. 2002. “Comparison of Oral Absorption and Bioavailablity of Drugs between Monkey and Human.” Pharmaceutical Research 19(6): 868–874.

Clark, M., and T. Steger-Hartmann. 2018. “A Big Data Approach to the Concordance of the Toxicity of Pharmaceuticals in Animals and Humans.” Regulatory Toxicology and Pharmacology 96: 94–105. https://doi.org/10.1016/j.yrtph.2018.04.018.

EPA (Environmental Protection Agency). 2017. “Evaluating Developmental Neurotoxicity Hazard: Better Than Before,” October. https://www.epa.gov/sciencematters/evaluating-developmental-neurotoxicity-hazard-better.

European Commission. 2018. “Temporal aspects in the testing of chemicals for endocrine disrupting effects (in relation to human health and the environment): Final Report.” Publications Office, 2018, https://data.europa.eu/doi/10.2779/789059.

European Food Safety Authority. 2019. “Risk for Animal and Human Health Related to the Presence of Dioxins and Dioxin‐like PCBs in Feed and Food.” https://www.efsa.europa.eu/en/efsajournal/pub/5333.

Fenton, S. E., A. Ducatman, A. Boobis, J. C. DeWitt, C. Lau, C. Ng, J. S. Smith, and S. M. Roberts. 2021. “Per- and Polyfluoroalkyl Substance Toxicity and Human Health Review: Current State of Knowledge and Strategies for Informing Future Research.” Environmental Toxicology and Chemistry 40: 606–630. https://doi.org/10.1002/etc.4890.

Fombonne, E. 2016. “Editorial: Isolating the Essential Difference—The Importance of Choosing the Right Type and Sufficient Numbers of Controls in Research on Neurodevelopmental Disorders and Mental Health Conditions.” Journal of Child Psychology and Psychiatry, and Allied Disciplines 57(11): 1203–1204.

Hamm, A. O., A. I. Weike, H. T. Schupp, T. Treig, A. Dressel, and C. Kessler. 2003. “Affective Blindsight: Intact Fear Conditioning to a Visual Cue in a Cortically Blind Patient.” Brain: A Journal of Neurology 126(Pt 2): 267–275.

Honda, G. S., R. G. Pearce, L. L. Pham, R. W. Setzer, B. A. Wetmore, N. S. Sipes, J. Gilbert, B. Franz, R. S. Thomas, and J. F. Wambaugh. 2019. “Using the Concordance of in Vitro and in Vivo Data to Evaluate Extrapolation Assumptions.” PloS One 14(5): e0217564.

Giblin, K. A., D. Basili, A. M. Afzal, L. Rosenbrier-Ribeiro, N. Greene, I. Barrett, S. J. Hughes, and A. Bender. 2021. Chemical Research in Toxicology 34(2): 438−451. https://doi.org/10.1021/acs.chemrestox.0c00311.

Jukema, M., F. Borys, G. Sibrecht, K. J. Jørgensen, and M. Bruschettini. 2021. “Antileukotrienes for the Prevention and Treatment of Chronic Lung Disease in Very Preterm Newborns: A Systematic Review.” Respiratory Research 22(1): 208.

Krewski, D., J. M. Rice, M. Bird, B. Milton, B. Collins, P. Lajoie, and J. M. Zielinski. 2019. “Concordance between Sites of Tumor Development in Humans and in Experimental Animals for 111 Agents That Are Carcinogenic to Humans.” Journal of Toxicology and Environmental Health, Part B 22(7–8): 203–236.

Lam, J., E. Koustas, P. Sutton, P. I. Johnson, D. S. Atchley, S. Sen, K. A. Robinson, D. A. Axelrad, and T. J. Woodruff. 2014. “The Navigation Guide—Evidence-Based Medicine Meets Environmental Health: Integration of Animal and Human Evidence for PFOA Effects on Fetal Growth.” Environmental Health Perspectives 122(10): 1040–1051.

Lam, J., B. Lanphear, D. Bellinger, D. A. Axelrad, J. McPartland, P. Sutton, L. Davidson, N. Daniels, S. Sen, and T. J. Woodruff. 2017. “Developmental PBDE Exposure and IQ/ADHD in Childhood: A Systematic Review and Meta-Analysis.” Environmental Health Perspectives 125(8): 086001.

Leenaars, C. H. C., C. Kouwenaar, F. R. Stafleu, A. Bleich, M. Ritskes-Hoitinga, R. B. M. De Vries, and F. L. B. Meijboom. 2019. “Animal to Human Translation: A Systematic Scoping Review of Reported Concordance Rates.” Journal of Translational Medicine 17(1): 223.

Minguez-Alarcón, L., J. E. Chavarro, J. Mendiola, M. Roca, C. Tanrikut, J. Vioque, N. Jørgensen, and A. M. Torres-Cantero. 2017. “Fatty Acid Intake in Relation to Reproductive Hormones and Testicular Volume among Young Healthy Men.” Asian Journal of Andrology 19(2): 184–190.

Monticello, T. M., T. W. Jones, D. M. Dambach, D. M. Potter, M. W. Bolt, M. Liu, D. A. Keller, T. K. Hart, and V. J. Kadambi. 2017. “Current Nonclinical Testing Paradigm Enables Safe Entry to First-In-Human Clinical Trials: The IQ Consortium Nonclinical to Clinical Translational Database.” Toxicology and Applied Pharmacology 334 (November): 100–109.

Musther, H., A. Olivares-Morales, O. J. D. Hatley, B. Liu, and A. R. Hodjegan. 2014. “Animal versus Human Oral Drug Bioavailability: Do They Correlate?” European Journal of Pharmaceutical Sciences: Official Journal of the European Federation for Pharmaceutical Sciences 57(100): 280–291.

NASEM (National Academies of Sciences, Engineering, and Medicine). 2017. Application of Systematic Review Methods in an Overall Strategy for Evaluating Low-Dose Toxicity from Endocrine Active Chemicals. Washington, DC: The National Academies Press. https://doi.org/10.17226/24758.

NASEM. 2019. Review of DOD’s Approach to Deriving an Occupational Exposure Level for Trichloroethylene. Washington, DC: The National Academies Press. https://doi.org/10.17226/25610.

NASEM. 2021. The Use of Systematic Review in EPA’s Toxic Substances Control Act Risk Evaluations. Washington, DC: The National Academies Press. https://doi.org/10.17226/25952.

NASEM. 2022a. “New Approach Methods (NAMs) for Human Health Risk Assessment: Proceedings of a Workshop—in Brief.” https://nap.nationalacademies.org/catalog/26496/new-approach-methods-nams-for-human-health-risk-assessment-proceedings.

NASEM. 2022b. New Approach Methods (NAMs) for Human Health Risk Assessment | Workshop 2. https://www.nationalacademies.org/event/05-12-2022/new-approach-methods-nams-for-human-health-risk-assessment-workshop-2.

NASEM. 2022c. Guidance on PFAS Exposure, Testing, and Clinical Follow-Up. Washington, DC: The National Academies Press. https://doi.org/10.17226/26156.

NRC (National Research Council). 2000. Scientific Frontiers in Developmental Toxicology and Risk Assessment. Washington, DC: The National Academies Press. https://doi.org/10.17226/9871.

NRC. 2007. Toxicity Testing in the 21st Century: A Vision and a Strategy. Washington, DC: The National Academies Press. https://doi.org/10.17226/11970.

NRC. 2009. Science and Decisions: Advancing Risk Assessment. Washington, DC: The National Academies Press. https://pubmed.ncbi.nlm.nih.gov/25009905/.

Olson, H., G. Betton, D. Robinson, K. Thomas, A. Monro, G. Kolaja, P. Lilly, et al. 2000. “Concordance of the Toxicity of Pharmaceuticals in Humans and in Animals.” Regulatory Toxicology and Pharmacology: RTP 32(1): 56–67.

Patisaul, H. B., M. Behl, L. S. Birnbaum, A. Blum, M. L. Diamond, S. Rojello Fernández, and H. M. Stapleton. 2021. “Beyond Cholinesterase Inhibition: Developmental Neurotoxicity of Organophosphate Ester Flame Retardants and Plasticizers.” Environmental Health Perspectives 129(10): 105001.

Perel, P., I. Roberts, E. Sena, P. Wheble, C. Briscoe, P. Sandercock, M. Macleod, L. E. Mignini, P. Jayaram, and K. S. Khan. 2007. “Comparison of Treatment Effects between Animal Experiments and Clinical Trials: Systematic Review.” BMJ. https://doi.org/10.1136/bmj.39048.407928.be.

Pollock, M., R. M. Fernandes, L. A. Becker, and D. Pieper. 2019. “V: Overviews of Reviews.” In J. P. T. Higgins, J. Thomas, J. Chandler, M. Cumpston, T. Li, M. J. Page, and V. A. Welch (Eds.), Cochrane Handbook for Systematic Reviews of Interventions. Cochrane, 2022. http://training.Cochrane.org/handbook.

Rogers, P. J., P. S. Hogenkamp, C. de Graaf, S. Higgs, A. Lluch, A. R. Ness, C. Penfold, et al. 2016. “Does Low-Energy Sweetener Consumption Affect Energy Intake and Body Weight? A Systematic Review, Including Meta-Analyses, of the Evidence from Human and Animal Studies.” International Journal of Obesity 40(3): 381–394.

Rotroff, D. M., B. A. Wetmore, D. J. Dix, S. S. Ferguson, H. J. Clewell, K. A. Houck, E. L. Lecluyse, et al. 2010. “Incorporating Human Dosimetry and Exposure into High-Throughput in Vitro Toxicity Screening.” Toxicological Sciences: An Official Journal of the Society of Toxicology 117(2): 348–358.

Shea, B. J., B. C. Reeves, G. Wells, M. Thuku, C. Hamel, J. Moran, D. Moher, et al. 2017. “AMSTAR 2: A Critical Appraisal Tool for Systematic Reviews That Include Randomised or Non-Randomised Studies of Healthcare Interventions, or Both.” BMJ 358 (September): j4008.

Shojaei-Zarghani, S., A. Yari Khosroushahi, M. Rafraf, M. Asghari-Jafarabadi, and S. Azami-Aghdash. 2020. “Dietary Natural Methylxanthines and Colorectal Cancer: A Systematic Review and Meta-Analysis.” Food & Function 11(12): 10290–10305.

Sophocleous, A., M. Yiallourides, F. Zeng, P. Pantelas, E. Stylianou, B. Li, G. Carrasco, and A. I. Idris. 2022. “Association of Cannabinoid Receptor Modulation with Normal and Abnormal Skeletal Remodelling: A Systematic Review and Meta-Analysis of in Vitro, in Vivo and Human Studies.” Pharmacological Research: The Official Journal of the Italian Pharmacological Society 175 (January): 105928.

Tamaki, C., T. Nagayama, M. Hashiba, M. Fujiyoshi, M. Hizue, H. Kodaira, and K. Nakamura. 2013. “Potentials and Limitations of Nonclinical Safety Assessment for Predicting Clinical Adverse Drug Reactions: Correlation Analysis of 142 Approved Drugs in Japan.” The Journal of Toxicological Sciences 38(4): 581–598.

Wetmore, B. A., J. F. Wambaugh, S. S. Ferguson, M. A. Sochaski, D. M. Rotroff, K. Freeman, H. J. Clewell III, et al. 2012. “Integration of Dosimetry, Exposure, and High-Throughput Screening Data in Chemical Toxicity Assessment.” Toxicological Sciences: An Official Journal of the Society of Toxicology 125(1): 157–174.

Wetmore, B. A., J. F. Wambaugh, S. S. Ferguson, L. Li, H. J. Clewell III, R. S. Judson, K. Freeman, et al. 2013. “Relative Impact of Incorporating Pharmacokinetics on Predicting in Vivo Hazard and Mode of Action from High-Throughput in Vitro Toxicity Assays.” Toxicological Sciences: An Official Journal of the Society of Toxicology 132(2): 327–346.

Wetmore, B. A., J. F. Wambaugh, B. Allen, S. S. Ferguson, M. A. Sochaski, R. W. Setzer, K. A. Houck, et al. 2015. “Incorporating High-Throughput Exposure Predictions with Dosimetry-Adjusted in Vitro Bioactivity to Inform Chemical Toxicity Testing.” Toxicological Sciences: An Official Journal of the Society of Toxicology 148(1): 121–136.

Wikoff, D. S., J. D. Urban, C. Ring, J. Britt, S. Fitch, R. Budinsky, and L. C. Haws. 2021. “Development of a Range of Plausible Noncancer Toxicity Values for 2,3,7,8-Tetrachlorodibenzo-P-Dioxin Based on Effects on Sperm Count: Application of Systematic Review Methods and Quantitative Integration of Dose Response Using Meta-Regression.” Toxicological Sciences: An Official Journal of the Society of Toxicology 179(2): 162–182.
