Suggested Citation: "4 Influence of the Individual Study on the Body of Evidence." National Academies of Sciences, Engineering, and Medicine. 2022. Advancing the Framework for Assessing Causality of Health and Welfare Effects to Inform National Ambient Air Quality Standard Reviews. Washington, DC: The National Academies Press. doi: 10.17226/26612.

4

Influence of the Individual Study on the Body of Evidence

The U.S. Environmental Protection Agency (EPA) Integrated Science Assessments (ISAs) evaluate the strength of evidence supporting causal inferences based on a framework for causal determinations summarized in EPA’s 2015 Preamble to the Integrated Science Assessments (EPA, 2015a). In this chapter the committee describes the current approach used for causal determinations based on guidance provided in the Preamble and as employed in some recent ISAs. The Preamble sets out the process for developing ISAs, with key steps including literature search; study selection; evaluation of individual study quality; evaluation, synthesis, and integration of all the evidence; and development of scientific conclusions and causal determinations. The framework incorporates a weight of the evidence approach for causal determinations, focusing on “evaluation of the findings from the body of evidence across disciplines, drawing upon the results of all studies judged of adequate quality and relevance” (EPA, 2015a). The Preamble describes nine aspects to aid in judging causality, adapted from the Bradford Hill aspects of association (Hill, 1965). The culmination of the selection, evaluation, and interpretation process is placement in one of a set of categories of causal associations between the pollutants being studied and health or welfare outcomes.

This chapter focuses on the influence of the individual study on the weight of evidence approach applied during the causal determination process. The chapter begins with a description of the guidance provided in the Preamble for individual study selection, describes the types of evidence and key considerations regarding the different types of evidence as part of a body of evidence, and then describes the influence of transparency, reproducibility, and replicability of the individual study in the weight of evidence approach.

STUDY SELECTION AND EVALUATION IN EPA’S CAUSAL DETERMINATION FRAMEWORK

Sections 2, 3, and 4 of the Preamble provide a description of the approach taken in ISAs to search for, select for inclusion, and evaluate peer-reviewed literature relevant to the air pollutants under review. The approach includes a continuing literature search by EPA, in which specific topics related to the air pollutants and relevant disciplines are monitored. In addition, citations to prior National Ambient Air Quality Standards (NAAQS) assessments and tables of contents of relevant journals are examined, with experts identifying relevant literature. Public input is requested through a call for information in the Federal Register, and additional input is received from EPA’s Clean Air Scientific Advisory Committee (CASAC) and external reviewers during the assessment process.

Literature identified in this intentionally broad and inclusive search is subject to a selection procedure that involves first an initial screen based on the title of the papers identified. Any paper considered potentially relevant is added to the EPA HERO (Health and Research Online) database of references, and further evaluated for relevance by examination of the abstract and then the text. Only peer-reviewed and ethically conducted studies or data analyses are included; such studies include epidemiology and controlled human exposure studies, toxicology studies, studies of ecological or other welfare effects, and modeling or measurement studies of atmospheric chemistry, air quality and emissions, environmental fate and transport, dosimetry, toxicokinetics, and exposure. In addition to studies that explicitly assess causality, the studies may be related to exposure-response relationships, modes of action, populations, life stages, and ecosystems at increased risk.
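The staged screen described above (title, then abstract, then full text, with only peer-reviewed work retained) can be sketched as a simple filter that also records where a reference dropped out. This is a minimal illustration only: the `Reference` fields, the `screen` function, and the keyword matching are hypothetical stand-ins for the expert judgment EPA applies, and none of it reflects how the HERO database is actually implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reference:
    # Hypothetical record layout, not the HERO schema.
    title: str
    abstract: str
    full_text: str
    peer_reviewed: bool


def screen(ref: Reference, keywords: List[str]) -> Optional[str]:
    """Return None if the reference passes every stage,
    otherwise the name of the first stage that excluded it."""
    kws = [k.lower() for k in keywords]

    def mentions(text: str) -> bool:
        # Keyword matching is an assumed proxy for a reviewer's relevance call.
        return any(k in text.lower() for k in kws)

    stages = [
        ("title", lambda r: mentions(r.title)),
        ("abstract", lambda r: mentions(r.abstract)),
        ("full text", lambda r: mentions(r.full_text)),
        ("peer review", lambda r: r.peer_reviewed),
    ]
    for name, passes in stages:
        if not passes(ref):
            return name
    return None


if __name__ == "__main__":
    ref = Reference("Ozone and asthma", "An ozone cohort study.",
                    "Ozone exposure was estimated ...", True)
    print(screen(ref, ["ozone"]))
```

Returning the excluding stage rather than a bare yes or no is a design choice: it keeps a record of why each reference was dropped.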

As an example, the 2020 NOx-SOx-PM ISA (EPA, 2020b) included a structured workflow for gathering literature that began with references used for the 2008 ISA, followed by searches of literature from 2008 to 2017, sorting based on keywords, and manual screening (EPA, 2020b, IS-5, p. 88). Using the framework described in the Preamble, emphasis was placed on studies that characterize quantitative relationships between criteria pollutants and ecological effects that occur at atmospheric concentrations and deposition relevant to current ambient levels in the United States. Experimental studies with higher exposure concentrations were included if they contributed to an understanding of mechanisms. Studies that did not address a relevant topic were excluded. Relevant studies included those examining atmospheric chemistry, spatial and temporal trends, and deposition, as well as EPA analyses of air quality and emissions data. Relevant ecological research included geochemistry, microbiology, physiology, toxicology, population biology, and community ecology. The research included experimental laboratory and field additions of the pollutants, as well as gradient studies.

EPA further screens to select which studies are included in the review by considering the extent to which the study is informative, pertinent, and policy-relevant to pollutant-effect relationships or the basis for such relationships. While the Preamble does not give specific guidance, some ISAs produced since publication of the Preamble provide more detail about how study quality is determined:

After study selection, the quality of individual studies is evaluated by EPA or outside experts in the fields of atmospheric science, exposure assessment, dosimetry, animal toxicology, controlled human exposure studies, epidemiology, ecology, and other welfare effects, considering the design, methods, conduct, and documentation of each study. Strengths and limitations of individual studies that may affect the interpretation of the study are considered. (EPA, 2020a, Figure 10.1)

The emphasis is on concentration- or exposure-effect relationships in current populations or ecosystems, at current ambient air concentrations, and particularly if unique data such as a previously undocumented effect or mode of action are reported.

Studies determined to be relevant to the ISA are then evaluated for quality by EPA or external subject matter experts. A common general approach is applied to all studies, with additional considerations applied to specific types of studies. The broad approach, as described very generally in the Preamble, includes evaluation of study design, methods, conduct and documentation. Quality considerations include clear presentation of the study groups, methods, data, and results relative to the study objectives, with limitations, assumptions and other factors clearly stated. With respect to ecological endpoints, the Preamble states that there should be an adequate and well-defined methodology for selecting ecosystems, sites, populations, and subjects or organisms, with clear distinction among exposures. Dose metrics should be of adequate quality, representative, and correspond to ambient conditions, and the effects measured should be valid, reliable, and meaningful.

Further quality considerations are applied specifically to four areas: atmospheric science and exposure assessment, epidemiology, controlled human exposure and animal toxicology, and ecological and other welfare effects. For each of these areas, the Preamble provides high-level descriptions of features considered in the quality evaluation or particularly desirable for the NAAQS review. In the subsequent section the committee provides more information on the types of studies that might contribute to an ISA.

There are examples in each ISA of the studies (about 5% of the total studies considered) that pass the screening tests for both relevance and quality. The most recent ozone ISA provides examples of narrative quality reviews for the small portion (less than 0.5%) of considered studies used to support determinations of “causal” or “likely causal,” or that resulted in a change in causal category from the previous ISA. Missing from the Preamble and ISAs is a description of the reasons why studies are excluded.

TYPES OF EVIDENCE USED IN INTEGRATED SCIENCE ASSESSMENTS TO SUPPORT CAUSAL DETERMINATIONS

In developing causal assessments of air pollutant health and welfare effects for ISAs, EPA reviews evidence from multiple types of studies and study designs published in the peer-reviewed scientific literature to assess their relevance and importance in the weight of evidence approach. These include studies on atmospheric science and exposure assessment, which can be relevant to both health and welfare effects. Health effects are further assessed using evidence from epidemiology, controlled human exposure, and animal toxicology studies. Welfare effects are more varied and are assessed using a wider array of study approaches, described in more detail below. Each individual study type has important strengths and limitations—assessment of which should contribute to the weight of evidence assessment—and the confidence of a causal inference is increased when there is convergence across multiple lines of evidence. The Preamble does not provide specifics regarding how individual studies should be considered, and allows for studies where either exposures or doses are characterized. In the next sections the committee provides general discussion of the different types of evidence and key considerations for their use in a weight of evidence approach.

Atmospheric Science Studies

Atmospheric and exposure sciences are foundational to EPA’s task of assessing the scientific evidence of causal relationships between ambient air pollution and health or welfare effects and establishing whether apparent relationships are causal. EPA and numerous other groups and individuals rely on robust and comprehensive atmospheric chemistry and deposition monitoring networks to inform the status and patterns in air quality (e.g., NADP,1 CASTNet,2 IMPROVE3). These networks are the foundation of air quality understanding in the United States and are essential for characterizing and quantifying health and welfare impacts and the response to air quality management actions, and for parameterizing and testing air quality models. The continued maintenance and support of these networks is essential for the effective development of ISAs and ultimately the NAAQS. Research in the atmospheric and exposure sciences helps determine the scope of EPA’s review of the effects literature by establishing pollutant concentration or deposition levels that are relevant for consideration. This body of research also informs an evaluation of the quality of individual effects studies by EPA because it can inform understanding of exposure measurement error and misclassification, which can potentially bias study results. Research in atmospheric and exposure sciences thus can provide core scientific background to inform the overall assessment of the body of evidence, for example by helping to explain disparate results across studies conducted at different times or locations or with different exposure surrogates.

___________________

1 See http://nadp.slh.wisc.edu (accessed May 29, 2022).

2 See https://www.epa.gov/castnet (accessed May 29, 2022).

3 See http://vista.cira.colostate.edu/Improve (accessed May 29, 2022).

Atmospheric sciences research encompasses understanding of air pollutant sources, atmospheric transformation and transport, and techniques for measuring and modeling air pollutants and their precursors and constituent chemical species. Like other areas relevant to causal assessment this is a dynamic research area because modeling and measurement techniques are evolving and because the atmospheric environment is changing. Climate change is modifying weather patterns, and large-scale emissions changes alter the chemical environment that governs atmospheric transformations. As an example, the 2019 PM ISA highlights the shift in composition of PM2.5 that occurred due to reductions in sulfur oxide emissions from coal-fired power plants (EPA, 2019c). The causal determination framework thus recognizes the core scientific knowledge that atmospheric science can provide, while also providing a process for staying up to date on that evolving science.

Exposure Assessment Studies

In the Preamble air pollutant exposure is discussed in reference to both health and welfare (EPA, 2015a). Understanding how exposures are defined and measured, and potential errors in that measurement, is a core part of assessing study relevance and quality with respect to a causal assessment as the study results can be misleading or misinterpreted if the exposure assessment is not valid or if challenges in the assessment are not accounted for when interpreting study results. For human health, EPA defines “exposure” as contact with the pollutant at the interface of the breathing zone over a specified length of time (EPA, 2019c). For welfare effects, deposition or gaseous exchange at material, soil, water or vegetation surfaces, and the presence of pollutants in the atmosphere are also considered (in the case of radiative forcing or visibility effects, the atmosphere is exposed). Most vegetation research is based on cumulative exposure over the growing season but may also consider long-term effects mediated through changes in soil, whereas human health studies may consider either short-term (daily or shorter) or chronic (annual) exposure levels. ISAs focus on exposures to ambient pollution, which may occur outdoors or, when considering health effects, due to the infiltration of ambient pollution indoors. Seldom are pollutant concentrations in the breathing zone directly measured, so observational health effects studies typically rely on measured or modeled surrogates, such as measured outdoor concentrations at ambient monitors. Variations in the relationship between indoor and outdoor pollution due to differences in building ventilation or air conditioning use can increase exposure estimation error and potentially introduce systematic bias, depending on the study design. 
Consequently, exposure error has two elements in epidemiological studies: (1) error in measuring or modeling the surrogate and (2) error arising from use of the surrogate to approximate true exposure.

Estimates of outdoor concentrations used as exposure surrogates may be based on fixed site monitoring, satellite observations, spatial interpolation, land use regression, physics-based dispersion or chemistry and transport models or combinations of two or more of the above. The ISAs discuss each of these exposure estimation approaches in detail, including their relative strengths and weaknesses. Beyond outdoor surrogates, panel studies sometimes use personal exposure measurements, but this approach requires untangling the ambient contribution. Doing so is less difficult for ozone, which has few indoor sources, than for NO2 and PM, for which personal activities and indoor sources may dominate total exposure.

Evaluation of study quality and relevance for causal assessment needs to consider aspects of exposure measurement error for both health and welfare effects, and how measurement error may impact the study results and thus the potential contribution of that study to the weight of evidence approach. However, the Preamble provides no explicit guidance for such considerations. Exposure errors are generally expected to reduce precision and to negatively bias associations between air pollution and health effects, although the degree to which this occurs depends on the type of error and the study design. For example, classical error—defined as error scattered about the true exposure and that is independent of the true exposure—has this influence. In contrast, Berkson error—in which the true value varies randomly around the measured value and where the measurement error is independent of the measured value—reduces precision but is not expected to bias the health effect estimate (Goldman et al., 2011). An example of Berkson measurement error in an air pollution context might be a chamber experiment, where the equipment is set to deliver pre-specified amounts of air pollution but the actual exposures delivered are slightly different from the pre-specified levels. In that case, the measurement error would be distributed around the preset values and not the actual exposures. The ISAs also recognize “classical-like” and “Berkson-like” errors, which can have different effects (Szpiro and Paciorek, 2013; Szpiro et al., 2011).
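The contrast between classical and Berkson error can be illustrated with a small simulation. The sketch below is not drawn from the ISAs or the cited papers; it assumes a simple linear exposure-response model with hypothetical parameter values (`beta`, `sd_exposure`, `sd_error`) and estimates the slope by ordinary least squares. Under classical error the estimated slope is attenuated toward zero by roughly the factor var(true) / (var(true) + var(error)), whereas under Berkson error the slope remains approximately unbiased but is less precisely estimated.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, sd_exposure=1.0, sd_error=1.0, seed=0):
    """Compare slope estimates under classical and Berkson exposure error.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Classical error: measured = true + error, error independent of true.
    true_c = [rng.gauss(10.0, sd_exposure) for _ in range(n)]
    measured_c = [t + rng.gauss(0.0, sd_error) for t in true_c]
    outcome_c = [beta * t + rng.gauss(0.0, 1.0) for t in true_c]

    # Berkson error: true = assigned + error, error independent of assigned
    # (for example, a chamber set to a target level that it delivers imperfectly).
    assigned_b = [rng.gauss(10.0, sd_exposure) for _ in range(n)]
    true_b = [a + rng.gauss(0.0, sd_error) for a in assigned_b]
    outcome_b = [beta * t + rng.gauss(0.0, 1.0) for t in true_b]

    return {
        "classical": ols_slope(measured_c, outcome_c),
        "berkson": ols_slope(assigned_b, outcome_b),
    }


if __name__ == "__main__":
    # With equal exposure and error variances, the classical slope is
    # expected near beta / 2 and the Berkson slope near beta.
    print(simulate())
```

Real exposure error is rarely purely one type, which is why the ISAs also discuss “classical-like” and “Berkson-like” errors; this sketch covers only the two idealized cases.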

Recent advances in exposure estimation include use of low-cost monitors for saturation or mobile monitoring studies to enhance spatial resolution, use of satellite and other remote sensing data to expand coverage, and hybrid approaches that merge information from one or more physics-based models, land use data, and ground and satellite measurements. Both the 2020 ozone ISA (EPA, 2020a) and the 2019 PM ISA (EPA, 2019c) include discussions of emerging hybrid approaches, noting that these were not covered in the prior ISAs for these pollutants. The 2019 PM ISA concludes that developments in hybrid spatiotemporal modeling integrating land use data and observations have improved spatial resolution of PM2.5 exposure estimates and consequently reduced bias and uncertainty in health effect estimates (EPA, 2019c, pp. 3–121). However, during recent decisions related to adjusting air quality standards, EPA dismissed evidence from epidemiology studies that used hybrid exposure models (e.g., Di et al., 2017a,b; Shi et al., 2016) as being inferior for NAAQS review purposes to studies that assigned exposures based only on fixed site monitoring data (85 Fed Reg 82684). The rationalizations for dismissing the hybrid studies were that performance evaluations for hybrid modeling approaches are skewed toward better monitored urban areas (85 Fed Reg 82705) and that it is challenging to compare the reported concentration distributions or mean values from these studies to the monitored design values used to assess NAAQS compliance (85 Fed Reg 82711). The causal determination framework does not explicitly address exposure measurement and modeling challenges, and how they affect how studies are considered in the weight of evidence approach. A framework nimble enough to integrate and use information derived from rapidly evolving measurement capabilities, such as from satellite observations, and the use of hybrid modeling techniques can support future ISAs.

Epidemiologic Health Effects Studies

Epidemiologic studies can provide valuable information on associations between exposure of human populations to ambient air pollution and health outcomes. Studies considered in ISAs are typically limited to short- or long-term pollutant exposures at or near current ambient conditions. They can be especially useful for evaluating health endpoints, at-risk populations, groups, or life stages that have not otherwise been researched. They can also shed light on important methodological issues to assist in interpreting the body of health evidence—for example, lag times between exposures and effects, effects thresholds and model specifications.

Epidemiologic study designs generally fall into one of several categories, including cross-sectional, prospective cohort, time-series, panel studies, and “natural experiments.” Some are carried out using individual-level data; others use community-level data, for example relating community-level exposure levels to outcomes or taking advantage of variation in contaminant levels across localities. Those study designs use different approaches to try to reduce confounding; assessment of underlying assumptions associated with the study to assess relevance for causality is crucial for a weight of evidence approach. As discussed in more detail in Chapter 8, an important limitation of epidemiologic studies can be the influence of potential confounding—by other pollutants or other co-varying causal or contributing factors. While the studies show associations between pollutant exposures and specific health endpoints, actual human exposures typically reflect complex, variable mixtures of multiple pollutants. Chapter 8 discusses these study design approaches, and their relevance for causal assessments, in more detail.

Controlled Human Exposure Studies

Controlled human exposure studies (human clinical studies) can provide direct evidence of relationships between pollutant exposures and health effects. They can also provide information on the biological plausibility of associations observed in epidemiologic studies. In some cases, controlled exposure studies can be used to characterize dose-response relationships at pollutant concentrations relevant to ambient conditions. These studies are often conducted using a randomized crossover design, with subjects exposed both to the pollutant and a clean air control. In this way, the subjects serve as their own experimental controls, which limits the influence of potential inter-individual confounders. The studies are typically structured to evaluate physiological or bio-molecular outcomes in response to a specific air pollutant or combination of pollutants.
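The benefit of having subjects serve as their own controls can be shown with a short simulation. The sketch below is illustrative only: the outcome (a lung function measure in arbitrary units), the effect size, and the variance parameters are hypothetical, and the model ignores period and carryover effects that a real crossover analysis would need to consider. It compares the standard error of the within-subject (paired) difference with the standard error that would apply if the same measurements had come from two independent groups.

```python
import random
import statistics


def crossover_demo(n_subjects=30, effect=-0.2, sd_between=0.8,
                   sd_within=0.1, seed=1):
    """Simulate a crossover study; all parameter values are assumptions.

    Returns (mean paired difference, paired SE, unpaired SE).
    """
    rng = random.Random(seed)

    # Each subject has a stable personal baseline (between-subject variation).
    baselines = [rng.gauss(4.0, sd_between) for _ in range(n_subjects)]

    # Every subject is measured under both conditions.
    clean = [b + rng.gauss(0.0, sd_within) for b in baselines]
    polluted = [b + effect + rng.gauss(0.0, sd_within) for b in baselines]

    # Paired analysis: the subject's baseline cancels in the difference.
    diffs = [p - c for p, c in zip(polluted, clean)]
    mean_diff = statistics.fmean(diffs)
    se_paired = statistics.stdev(diffs) / n_subjects ** 0.5

    # Unpaired analysis: between-subject variation stays in the error term.
    se_unpaired = (statistics.variance(clean) / n_subjects
                   + statistics.variance(polluted) / n_subjects) ** 0.5

    return mean_diff, se_paired, se_unpaired


if __name__ == "__main__":
    mean_diff, se_paired, se_unpaired = crossover_demo()
    print(f"difference={mean_diff:.3f} "
          f"paired SE={se_paired:.3f} unpaired SE={se_unpaired:.3f}")
```

When between-subject variation is large relative to within-subject noise, as assumed here, the paired standard error is much smaller, which is one reason small crossover studies can still detect modest effects.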

Limitations include the generally small sample size, short exposure times used, and the (ethically imperative) inability to evaluate introduction of irreversible severe health outcomes. Study design may also preclude inclusion of subjects with serious health conditions or heightened risks of exposure, and therefore, the results may not be generalizable to other populations without additional assumptions or scientific insights. Although some controlled human exposure studies have included health-compromised individuals, such as those with mild or moderate respiratory or cardiovascular disease, these individuals may also be relatively healthy and may not represent the most sensitive individuals in the population. Thus, observed effects in these studies may underestimate the response in certain populations.

Animal Toxicological Studies

Animal toxicological studies involve exposing animals to the pollutant of concern to provide information on biological action mechanisms of a pollutant under controlled exposure circumstances. Assuming that biological differences between test species and humans are reasonably well understood and quantifiable, these studies can provide useful insights on potential human health effects, exposure-response or dose-response relationships, and modes of action. They may help inform determinations of factors that can increase or decrease the risk of health effects in certain populations. These studies allow exploration of toxicological pathways or biological mechanisms by which a pollutant may cause effects. Improved understanding of biological mechanisms of effects can be especially important in establishing causality. An important limitation to the use and interpretation of animal toxicological studies is uncertainty in the extrapolation from animal to human responses, as modified by such differences as metabolism, hormonal regulation, breathing patterns, lung structure, and anatomy. Different responses between humans and animals are influenced by several widely varying biochemical, endocrine, neuronal, and other factors.

Evaluating results of both human exposure and animal toxicological studies requires assessing the design and methodology of each study with focus on characterizing the pollutant(s) and the intake dose, dosing regimen, and exposure route, where evaluation is done using a weight of evidence approach in the Preamble’s causal determination framework. The sample size of such studies is typically small, thus limiting the statistical power to detect differences and the ability to control for other variables that could influence observed effects. Study designs can include factors that minimize bias in results, such as randomization and blinding, and account for unexplained loss of animals or withdrawal of human subjects. Study designs can also include scientifically reasonable control groups to allow for accurate interpretation of results relative to exposure. As stated in the Preamble, EPA emphasizes the use of studies that approximate current ambient human exposures (i.e., within two orders of magnitude of ambient) as this range can accommodate differences in dosimetry, toxicokinetics, and biological sensitivity of various species, strains, or at-risk populations. As stated in the Preamble, studies at even higher concentrations may be considered if they provide information to aid understanding mode of action or mechanisms, interspecies variation, or at-risk human populations. In vitro studies can provide useful mechanistic insights for evaluating effects observed in vivo or in epidemiologic studies.
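The "within two orders of magnitude of ambient" criterion amounts to a comparison on a base-10 logarithmic scale. The helper below is a hypothetical sketch, not an EPA tool: the function name, the symmetric treatment of concentrations above and below ambient, and the example values are all assumptions made for illustration.

```python
import math


def within_orders_of_magnitude(study_conc, ambient_conc, max_orders=2.0):
    """Return True if study_conc is within max_orders powers of ten of
    ambient_conc. Both concentrations must use the same units."""
    if study_conc <= 0 or ambient_conc <= 0:
        raise ValueError("concentrations must be positive")
    return abs(math.log10(study_conc / ambient_conc)) <= max_orders


if __name__ == "__main__":
    ambient = 70.0  # hypothetical ambient level, e.g. in ppb
    for study in (500.0, 5000.0, 10000.0):
        print(study, within_orders_of_magnitude(study, ambient))
```

Studies that fail such a check would not necessarily be excluded, since the Preamble allows higher-concentration studies that inform mode of action, interspecies variation, or at-risk populations.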

The major health study types summarized above (human exposure, animal toxicological, epidemiological) that pass EPA quality and relevance screening reviews are considered collectively in making causal determinations in ISAs. For the recent Ozone ISA (EPA, 2020a), quality reviews of policy-relevant health studies (supporting “causal” or “likely causal” determinations, or changes in causal determinations from previous ISAs) are accessible via the EPA Health Assessment Workspace Collaborative (HAWC),4 including 61 epidemiological studies, 45 animal toxicology studies, and 35 controlled human exposure studies (as well as subsets of these studies relating to short-term and long-term exposures on respiratory, cardiovascular, metabolic, and mortality effects).5

Types of Studies That Specifically Support Causal Determinations for Welfare Effects

Historically researchers have utilized a variety of approaches to examine the response of ecosystem structure and function to disturbance and elucidate the mechanism(s) contributing to responses. These approaches are also used to examine the causality of ecosystem or organismal response to effects of air pollution (see Table 4.1). Each of the five approaches described below has strengths and limitations. As is articulated and operationalized by the weight of evidence approach outlined in the Preamble, the most compelling synthesis utilizes information from multiple approaches to attribute causality and determine the mechanism driving ecosystem effects.

Experiments

Experiments apply a controlled perturbation to a system and monitor the response to that perturbation. They can be a rigorous approach to determine causality of air pollutant effects on ecosystems. Experiments can range from small-scale laboratory, greenhouse, or mesocosm studies to whole-ecosystem experiments involving a watershed, stream, or lake. Whole-ecosystem studies allow an intact ecosystem to be examined to evaluate the complex dynamics of the abiotic and biotic components of the system and the interplay of organisms within the ecosystem. Typically, measurements are made in a premanipulation period followed by treatments and multi-year monitoring to examine perturbation effects. The experimental ecosystem is typically compared with a reference (unmanipulated) system. An advantage of some experiments is that they can address the impacts of an individual air pollutant, and separate confounding factors from cause and effect. Experiments can be used in conjunction with models to gain a predictive understanding of the mechanisms driving effects. The disadvantages are that even whole-ecosystem experiments represent a relatively small scale and are difficult to replicate. In addition, it can be difficult to quantitatively extrapolate the results of ecosystem experiments to different sites. Finally, there can be experimental artifacts that make it difficult to relate the experiment to actual air pollution impacts. For example, actual air pollution effects on ecosystems steadily increased from low levels to higher levels over a multi-decadal period and more recently have generally decreased (at least in the United States), while experiments generally involve a step addition of an air pollutant. It is relatively easy to experimentally increase the concentration of an air pollutant but it is difficult to lower the concentration of an air pollutant.

TABLE 4.1 Strengths and Limitations of Approaches Currently Used for Determining Causality in Ecosystem Science

Strengths (columns): Quantitative Characterization of Exposure-Response Relationships; Reproducible; Confidence in Initial or Baseline Conditions

Limitations (columns): Only Inferential, Correlative, or Hypothetical; Potentially Confounding Influences; Limited Spatial Scale; Limited Temporal Scale; Limited Dose-Response Range

Approaches (rows): Experiments; Time-series measurements; Paleoecological studies; Gradient studies; Spatial models; Simple mass balance; Mechanistic models; Statistical models

[Cell markings showing which strengths and limitations apply to each approach are not reproduced here.]

___________________

4 See https://www.epa.gov/risk/health-assessment-workspace-collaborative-hawc (accessed July 20, 2022).

5 See https://hawc.epa.gov/lit/assessment/100500031/references (accessed August 2, 2022).

Time-Series Measurements in Welfare Studies

Time-series measurements involve repeated monitoring of ecosystem attributes and, so long as they are maintained, can provide valuable real-time assessments of how endpoints respond to changes in pollution. All attribution of cause and effect from time series is correlative and, as such, may lead to spurious links if not carefully compared with other lines of evidence. Interruptions of time-series measurements create irretrievable gaps in records that may prevent the detection of causal relationships. In the ISA for Oxides of Nitrogen, Oxides of Sulfur and Particulate Matter—Ecological Criteria (EPA, 2020b), long-term measurements are termed time-series. Time-series measurements were used throughout that NOx-SOx-PM ISA in the evaluation of critical loads, soil biogeochemistry, nitrogen saturation, and biological effects of freshwater acidification. They were used to correlate changes in ecological endpoint response with direct changes in air pollutants, to examine changes in ecological endpoints over time along spatial gradients of air pollutant amounts or concentrations, and either to provide input values for, or to validate, models depicting changes in ecosystem processes over time. Time-series measurements have the advantage that they can provide a compelling direct measure of the response of an ecosystem or attribute to changing air pollution. They can be made across multiple sites or over a relatively large regional domain. If monitoring is conducted over a relatively large region, there may be limitations on the supporting measurements that can be made to help interpret long-term trends. Like all approaches to assessing causality, time-series measurements have limitations, including potential confounding changes (such as climate or land use changes) co-occurring in an ecosystem or a group of ecosystems that could obscure the determination of causality. In Chapter 8 the committee discusses the use of time-series data for causal assessment in more detail.
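
The risk of spurious links in correlative time-series evidence can be illustrated with a small simulation. The sketch below is not drawn from any ISA: the series names, trend slopes, and noise levels are hypothetical, and the two series are generated independently so that any correlation between them comes only from their co-occurring trends (standing in for a confounding change such as climate). Comparing the correlation of the raw levels with the correlation of year-to-year changes (first differences) is one simple way, among several, to see how much of an apparent link is carried by shared trends alone.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def first_differences(series):
    """Step-to-step changes, which remove a linear trend from the series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def simulate(n_steps=200, seed=1):
    """Two hypothetical, causally unrelated series that both trend over time."""
    rng = random.Random(seed)
    # Declining deposition (assumed slope and noise, for illustration only).
    deposition = [10.0 - 0.04 * t + rng.gauss(0.0, 1.0) for t in range(n_steps)]
    # Ecological indicator rising for an unrelated reason (e.g., warming).
    indicator = [5.0 + 0.03 * t + rng.gauss(0.0, 0.5) for t in range(n_steps)]
    return deposition, indicator


deposition, indicator = simulate()
r_levels = pearson(deposition, indicator)
r_changes = pearson(first_differences(deposition), first_differences(indicator))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

In this toy setting the levels are strongly correlated even though neither series influences the other, while the differenced series are close to uncorrelated. Real monitoring records are more complicated (lagged responses, seasonality, gaps), which is one reason the text above recommends comparison with other lines of evidence.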

Paleoecological Studies

Paleoecological measurements are in some respects similar to time-series measurements and were used extensively in the NOx-SOx-PM ISA (EPA, 2020b). Even though the Preamble suggests evaluations of causality rely on recent conditions, a paleoecological body of evidence can provide evidence or inference of past causal linkages, including cumulative impacts, and provide a better understanding of predisturbance (background) conditions. Paleolimnological or paleoecological measurements come from lake or wetland sediment core studies, tree ring analysis, and coral studies. In the NOx-SOx-PM ISA, paleoecological studies provided inferential evidence linking phytoplankton community shifts to chemical indicators of acidification or nutrient enrichment (caused by SO4 and NOx atmospheric deposition, respectively) (EPA, 2020b). Like time-series measurements, paleoecological studies have the advantage of assessing direct changes to an ecosystem. Furthermore, paleoecological measurements have the potential to characterize background conditions prior to the anthropogenic additions of the air pollutant and to assess ecosystem variation in the context of changing air pollution. Paleoecological studies have the disadvantage that there may be confounding factors obscuring causality, and it may be difficult to extrapolate beyond a measured area. Except for direct measures of pollutants such as trace metals, conclusions about the causes of change are inferential and rely on eliminating all other causal factors to support the causal chain of effects from air pollution. Because causality is always inferred when working with paleoecological measures, paleoecological evidence is strongest as support for a body of evidence when accompanied by experimental, long-term, and gradient results.

Gradient Studies

Gradient studies are conducted to explore responses that change with distance from a source or with spatial changes in atmospheric deposition amounts. Used extensively in the NOx-SOx-PM ISA (EPA, 2020b), they provide information characterizing how atmospheric deposition along a spatial gradient from low to high affects soil processes, soil communities, vegetation community composition, and surface water chemistry (EPA, 2020b, Table IS-2). Gradient studies are used to determine effects on organisms (e.g., toxicity, community composition, reduced productivity or reproductive rate, mortality) and on ecosystem processes (e.g., mineralization rates, trace gas fluxes), and can also be used to identify confounding factors or interactions with soil or water characteristics, such as pH. Gradient studies involve the measurement of ecosystem attributes thought to be sensitive to an air pollutant across a geographic air pollution gradient. Gradient studies may be interpreted as a “space-for-time” substitution under the assumption that effects displayed under high concentrations or high deposition rates are similar to those produced by a progressive increase in pollutant over time from lower concentrations or deposition rates. Gradients in air pollutant concentrations (in air, soil, water, and vegetation) occur because of both spatial and temporal heterogeneity in pollutant sources and in transport and deposition mechanisms, while the consequent environmental responses may be due to both spatial and temporal variations in air pollutant concentrations and other factors, such as spatial variations in ecosystem structure and function. The establishment of new industrial animal operations such as feedlots or dairies, or the operation of a smelter or a mine, for example, creates what amounts to a point source of high pollutant emissions, whose effects can be compared with those of lower inputs. The environmental responses along a spatial gradient from high emissions due to point sources to a lower, more regional, diffuse level of inputs do not represent temporal differences so much as simply different levels of inputs, but might nevertheless be interpretable in a space-for-time manner. Gradient studies represent a powerful approach and are in many respects analogous to epidemiological studies, as they examine ecosystem attributes thought to be sensitive to air pollutants to attribute a cause-and-effect relationship. Although gradient studies may invoke a space-for-time substitution, few examine the validity of this assumption.

Process Modeling

Process models are mathematical representations of the dynamics of physical or ecological systems and their response to drivers. Models are not explicitly mentioned for estimating causality in the Preamble, but spatial, mechanistic, and statistical models are used extensively in the Ozone and NOx-SOx-PM ISAs for estimating causality for different welfare endpoints (EPA, 2020a, Appendix 8; EPA, 2020b, p. IS-13). Spatial models of ozone injury to plants significantly improved the linkage between cause and effect in an example from California (EPA, 2020a, Appendix 8, p. 8-24). In the NOx-SOx-PM ISA, models were used extensively to identify the terrestrial or freshwater effects of acidification from nitrogen (N) and sulfur (S) deposition and of eutrophication from N deposition, and to provide theoretical critical loads and exceedances of N and S deposition (EPA, 2020b, p. IS-59). Models are essential tools to estimate atmospheric deposition from emissions data, since measures of deposition are imperfect due to measurement error, weather variability, and the inability to effectively quantify gaseous or aerosol dry deposition. Models are also used to assess radiative forcing, climate, and visibility effects of air pollution, in addition to ecological effects. Models are often used to synthesize data from many sources to derive critical loads. In the case of welfare effects, the drivers are ambient concentrations or atmospheric deposition of air pollutants. Specifically for ecological effects, there are many advantages to the application of ecosystem process models. They are relatively inexpensive compared to field or experimental approaches, and they can be applied over much larger spatial and temporal scales than the other approaches used to evaluate causality. Both steady-state (simple, time-invariant) and dynamic (generally more complex, time-dependent) models have been used to characterize and quantify impacts of air pollutants on ecosystems.

While models imperfectly represent reality, they are recognized as an important tool in determining causality and quantifying the impacts of air pollution (NRC, 2007; Oreskes et al., 1994). Models can readily be used in tandem with one or more of the other approaches to assess causality. Models can also be applied to understand the effects of confounding factors in ecosystem response to air pollutants. Models such as those described above are more commonly used in the assessment of welfare effects than of health effects, potentially because of the need to understand the underlying mechanisms of action, which may be more feasible in welfare studies.
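
The distinction between steady-state and dynamic models can be made concrete with a deliberately simple sketch. The single-pool mass balance below is a textbook-style toy, not a model used in any ISA: the pool, the first-order loss rate, and the deposition values are all hypothetical. The steady-state function returns only the long-run pool size for a constant input, whereas the dynamic function steps through time and so can follow a changing deposition history, such as a step increase.

```python
def steady_state_pool(deposition, loss_rate):
    """Time-invariant solution of d(pool)/dt = deposition - loss_rate * pool."""
    return deposition / loss_rate


def dynamic_pool(initial_pool, deposition_by_year, loss_rate):
    """Explicit Euler integration of the same balance with a 1-year time step.

    Returns the pool size at the start and after each year of deposition.
    """
    pool = initial_pool
    trajectory = [pool]
    for deposition in deposition_by_year:
        pool = pool + deposition - loss_rate * pool
        trajectory.append(pool)
    return trajectory


# Hypothetical values chosen only for illustration.
loss_rate = 0.1  # fraction of the pool lost per year
deposition_history = [2.0] * 20 + [5.0] * 100  # step increase after year 20

initial = steady_state_pool(2.0, loss_rate)
trajectory = dynamic_pool(initial, deposition_history, loss_rate)
target = steady_state_pool(5.0, loss_rate)

print(f"pool before the step:        {trajectory[20]:.2f}")
print(f"pool one year after step:    {trajectory[21]:.2f}")
print(f"pool at end of simulation:   {trajectory[-1]:.2f}")
print(f"steady state for new input:  {target:.2f}")
```

The dynamic trajectory approaches the steady-state value only gradually, which is the kind of time-dependent behavior a steady-state calculation cannot show. Models actually applied to ecosystems involve many interacting pools and processes, so this sketch conveys only the conceptual difference.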

TRANSPARENCY, REPRODUCIBILITY, AND REPLICABILITY IN INDIVIDUAL STUDIES

Transparency is a means for scientists to advance trust in research and scientific progress (Elliott, 2020). A means to develop transparency in scientific endeavors is to present the methods, data, and results of an investigation to make that investigation reproducible and replicable—also tenets of good science. In this report reproducibility and replicability are used consistently as defined by the National Academies of Sciences, Engineering, and Medicine (NASEM, 2019), which is not necessarily consistent with how these terms are used broadly. Reproducibility (or computational reproducibility) implies being able to obtain consistent results using the same input data, computational steps, methods, and conditions of analysis. Replicability implies being able to obtain consistent results across studies with independent datasets aimed at answering the same scientific question.

The concepts of reproducibility and replicability are central to assessing the quality of a single study, to understanding underlying assumptions, and to evaluating adjustments for confounding and therefore to evaluating the study’s contribution to assessing causality. As such, “reproducibility” is recognized by EPA as important in causal determinations (e.g., EPA, 2015a). The 2019 PM ISA (EPA, 2019c) considers replicability of results as a criterion for study considerations, including for evaluating the strength of inference from studies on the health effects of particulate matter. (Note that EPA uses the term “reproducibility” in a way that is more consistent with the National Academies’ definition of replicability—in the sense of conducting entirely new studies to confirm a pollution-health relationship. Regardless of the terms used, the methods used in the conduct of ISAs to synthesize information in the existing literature to assess causality are consistent with the general premise of the weight of evidence approach.)

There are important arguments to consider in treating reproducibility (as defined by NASEM, 2019) as a criterion for assessing the quality of individual studies. Apart from allowing verification that the computations in a study were conducted and reported correctly, attempts to reproduce study results may confirm that statistical modeling assumptions are scientifically reasonable, and that the inclusion of additional potential confounders, or other slight changes in the modeling assumptions (e.g., modest changes in the degrees of freedom of a smoothing estimator), would not substantially change the results. Such analyses may also verify that data inputs were selected objectively and that bias in conclusions is minimized. An early example of reproducibility in a pollution-health study was the effort to reproduce the results of two cohort studies (Dockery et al., 1993; Pope et al., 1995) that estimated annual average increases in mortality associated with increases in fine particles in the atmosphere. Those studies came under various forms of scrutiny, and the Health Effects Institute was able to reanalyze and verify the original results and to examine the consequences of a range of alternative modeling assumptions (HEI, 2000).
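
A reanalysis of the kind described above often takes the form of a sensitivity analysis: the same estimate is recomputed under a range of reasonable modeling choices to see whether it is stable. The sketch below illustrates the idea on simulated data and does not reproduce the method of any study cited here. The data-generating values, the function names, and the use of a centered moving average (with its window length standing in for the degrees of freedom of a smoother) are all assumptions made for illustration.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def remove_smooth_trend(series, window):
    """Subtract a centered moving average; half a window is dropped at each end."""
    half = window // 2
    residuals = []
    for i in range(half, len(series) - half):
        local = series[i - half:i + half + 1]
        residuals.append(series[i] - sum(local) / len(local))
    return residuals


def simulate(n_days=730, true_effect=0.5, seed=7):
    """Hypothetical daily exposure and outcome that share a seasonal cycle."""
    rng = random.Random(seed)
    exposure, outcome = [], []
    for day in range(n_days):
        season = math.sin(2.0 * math.pi * day / 365.0)
        x = 2.0 * season + rng.gauss(0.0, 1.0)
        # Season affects the outcome directly, so it confounds the association.
        y = 3.0 * season + true_effect * x + rng.gauss(0.0, 1.0)
        exposure.append(x)
        outcome.append(y)
    return exposure, outcome


exposure, outcome = simulate()
naive = ols_slope(exposure, outcome)  # ignores the seasonal confounder
adjusted = {
    window: ols_slope(
        remove_smooth_trend(exposure, window),
        remove_smooth_trend(outcome, window),
    )
    for window in (15, 31, 61)  # alternative smoothing choices
}

print(f"unadjusted slope: {naive:.2f}")
for window, slope in adjusted.items():
    print(f"window {window:>2} days: adjusted slope {slope:.2f}")
```

In this simulation the unadjusted slope is well above the simulated effect of 0.5, while the adjusted slopes stay close to it across the three window lengths. Stability of that kind is what a reproducibility exercise looks for; an estimate that shifted markedly with modest changes in the smoother would call for closer scrutiny.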

However, many health-related air pollution studies are not reproducible, often because of the strict privacy requirements those studies must follow. For example, privacy restrictions on health data collected on participants of cohort studies dictate that the underlying data often cannot be shared. Welfare-related studies are rarely reproduced, although replication efforts in different ecosystems and locations often repeat the same methods. The similarities or differences in responses of such replications, combined with logical (biophysical) explanations and an understanding of natural history, may lend weight, or credence, to the results of the initial study. However, reproducibility, brought about by well-documented open data and open software, is increasingly used in ecological sciences to promote transparency, especially for results that may be contentious or are potentially relevant to policy (Powers and Hampton, 2019). Nevertheless, it can be ecologically or economically harmful to reveal specific locations of endangered populations or other resources, although the studies may still be robust enough for causal assessments if researchers are explicit about how raw data were processed for statistical analyses and whether any code used in the analysis was made available (e.g., via public and open resources such as GitHub; see footnote 6). Providing such transparency in individual studies can support the review process through better understanding of the data, methods, and assumptions used, and thus the potential robustness of study results and their relevance to the causality question under consideration. Given that non-experimental studies do not necessarily separate “design” from “analysis” (see Chapter 8 and Appendix C), such transparency, and potential pre-specification, can be of particular importance in non-experimental studies.

___________________

6 See https://github.com (accessed May 24, 2022).
