The Statement of Task asks the committee to “consider frameworks for integrating, documenting, and evaluating scientific evidence to assess causality of health and welfare effects by air pollutants as part of National Ambient Air Quality Standards (NAAQS) reviews conducted by the Environmental Protection Agency.” A first step in considering frameworks for causality is to clarify what is meant by the term; many similar frameworks (e.g., IOM, 2008, 2012) aim to assess causality, but causality is rarely defined formally. There are many descriptions of what constitutes evidence of causality in air pollution epidemiology and other study designs, and those descriptions evolve as data collection and analysis methodologies are refined and as understanding of the relationships between air pollution and relevant outcomes deepens. This chapter focuses on what causality is. More specifically, this chapter (1) reviews definitions of causality from legal and statistical perspectives, and (2) specifies what is to be learned—such as causal relationships between exposure to air pollutants and health and welfare outcomes. Later chapters focus on how one can learn about causal relationships, such as through specific study designs that are used to estimate associations and infer the causal relationships of interest. This understanding of what causality is and how specific study designs are used in individual studies to learn causal effects can inform the influence of individual studies in a weight of evidence approach.
The Clean Air Act (CAA) specifies that the list of air pollutants for which NAAQS are to be developed is to “include[s] each air pollutant—(A) emissions of which, in [the administrator’s] judgment, cause or contribute to air pollution which may reasonably be anticipated to endanger public health or welfare.”1 The term “cause” is not defined in the legislation. In fact, causal claims were eschewed in the forerunners of today’s Integrated Science Assessments (ISAs) (then called “Criteria Documents”) published around the time of the 1970 CAA amendments (P.L. 91-604). Foundational court cases establish that because of the precautionary nature of the statute, “rigorous step-by-step proof of cause and effect” is not required for establishing NAAQS. Ethyl Corp. v. EPA, 541 F.2d 1 (D.C. Cir. 1976) (en banc). The phrase “which may reasonably be anticipated” in the legislation emphasizes that point. Even so, the determination that a pollutant has a causal or likely
___________________
1 42 U.S.C. § 7408(a)(1)(A).
causal relationship to a given endpoint is important in modern NAAQS reviews because such a determination leads to increased focus on that relationship in further analyses by EPA.
Evaluation of causality implicitly involves comparing outcomes under two or more hypothetical states of the world. The defining feature of those worlds is a hypothetical intervention on the proposed causal factor of interest. For example, when considering a proposed causal factor (such as exposure to a pollutant of a specific concentration), one hypothetical world is that in which every unit in the population is subject to the intervention (exposure to the pollutant) versus another hypothetical world in which no unit in the population is subject to the intervention. In the statistical literature, the outcomes in these hypothetical worlds are known as “potential” outcomes; in a particular dataset, one will be observed for each unit, while the other is unobservable and termed “counterfactual,” as discussed further below. To understand the concept of the unobservable counterfactual in a hypothetical world, consider an analogy to a randomized controlled trial in the real world. Before randomization, each unit at that particular point in time could be assigned to either treatment or control, potentially resulting in observable outcomes under either category. Once randomization happens, however, only the outcome experienced by a particular unit at that particular time is observed, and that outcome depends on whether the unit was assigned to the treatment or the control group. For a unit in the treatment condition, the outcome under treatment is observed, and the outcome the unit would have had in the control group is the counterfactual (and vice versa for a unit in the control condition). The same conditions could be analyzed in hypothetical worlds.
There might be interest in studying the causal relationship between air pollution and mortality. The causal effect of air pollution on mortality can be defined as the difference in “potential outcomes”—for example, the difference in mortality for a group of people if a hypothetical intervention is performed whereby they experience a high level of air pollution and that same group’s mortality if a hypothetical intervention is performed whereby they experience a low level of air pollution. Note that these interventions do not need to be implementable in the real world; they are a conceptual way of defining causal effects.
In statistical notation “A” can be used to denote the level of air pollution (e.g., A=1, high level of air pollution and A=0, low level of air pollution), and Yi to denote the outcome of interest (e.g., time to death) for study unit i (e.g., a person or community at a particular point in time). Then each study unit i has two potential outcomes (denoted by Yi(A=0), Yi(A=1)), where Yi(A=0) represents time to death under hypothetical exposure to a low level of air pollution, and Yi(A=1) represents time to death under hypothetical exposure to a high level of air pollution. If the study unit is actually exposed to a low level of air pollution under some other conditions, then Yi(A=0) is observed, whereas the outcome for the same study unit under the hypothetical alternative scenario of high level of exposure and the same other conditions—Yi(A=1)—is called the counterfactual outcome.
Here, the “fundamental challenge of causal inference” (Holland, 1986) is that for each study unit in the population of interest (e.g., a person, community, or ecological area at a particular point in time), either Yi(A=1) or Yi(A=0) is observed—not both. In other words, the outcome for the alternative (unobserved) condition (the counterfactual outcome) can never be measured in that study unit. Therefore, the causal effect for each unit (e.g., the difference [Yi(A=1) – Yi(A=0)]) is not directly observable. Given this fundamental inability to directly observe individual unit causal effects, many studies aim to estimate average causal effects, defined across a relevant population, for example:
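The missing-data structure described above can be made concrete with a small simulation. This is an illustrative sketch only; the variable names (y0, y1, a) and all numbers are hypothetical and not drawn from any study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical potential outcomes for each unit i:
# y0[i] = Yi(A=0), the outcome under low pollution;
# y1[i] = Yi(A=1), the outcome under high pollution.
y0 = rng.normal(80.0, 5.0, n)        # e.g., survival time under A=0
y1 = y0 - rng.uniform(1.0, 3.0, n)   # assumed shorter survival under A=1

a = rng.integers(0, 2, n)            # actual exposure of each unit (0 or 1)

# Only the potential outcome matching the actual exposure is observed;
# the other is the counterfactual and is never measured for that unit.
y_obs = np.where(a == 1, y1, y0)

for i in range(n):
    missing = "Yi(A=0)" if a[i] == 1 else "Yi(A=1)"
    print(f"unit {i}: A={a[i]}, observed Y={y_obs[i]:.1f}, counterfactual {missing} unobserved")
```

The unit-level effect y1[i] − y0[i] exists in the simulation because both potential outcomes were generated, but in real data one of its two terms is always missing, which is exactly the fundamental challenge described above.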
∆ = E(Yi(A = 1) − Yi(A = 0)),
where Δ is known as the causal estimand—the quantity a study is trying to estimate—and E denotes an “expectation” (average) over a suitable domain, such as over some population of study units. The expectation can also be expressed as the average (mean) calculated across the population:
∆ = (1/N) Σi=1,…,N (Yi(A = 1) − Yi(A = 0)),
where N denotes the size of the population. As noted above, this causal effect cannot be directly observed. However, study designs, such as randomized experiments or observational studies, can be used to identify and estimate this quantity as a statistical estimand (i.e., a quantity that can be estimated from data). As discussed in Appendix C, different study designs have different strengths and limitations in terms of how well they can identify a causal estimand from the data: A causal effect, defined as a property (summary) of the distribution of potential outcomes (e.g., the difference in means), is identified if it can be uniquely computed from the distribution of observed variables, using a set of well-articulated causal assumptions encoded in a causal model (defined below) (Casella and Berger, 2002; Pearl, 2009). Thus, a key consideration for causal studies is to distinguish what is being estimated (the estimand—the causal effect) from how it is estimated (the study design and analyses), followed by assessing the strength of the estimation approach given the assumptions and the data available. The distinction between what is to be estimated (the causal estimand) and how it is estimated (using different study designs and statistical techniques) is crucial for the assessment of the overall evidence for causality, such as is done in the Preamble’s causal determination framework. Later chapters and Appendix C discuss more specifics regarding how studies can be used to estimate causal effects—and thus establish causality—and the assumptions required to interpret the resulting relationships as causal.
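The distinction between the causal estimand and its estimation can be illustrated with a simulated randomized experiment in which the true ∆ is known by construction. All names and numbers below are hypothetical; the point is only that randomization makes exposure independent of the potential outcomes, so a simple difference in observed group means identifies ∆.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical potential outcomes with a known true average effect.
y0 = rng.normal(50.0, 10.0, n)   # outcome under A=0 (low pollution)
y1 = y0 + 2.0                    # outcome under A=1; true Delta = +2.0 by construction

true_delta = np.mean(y1 - y0)    # computable only because this is a simulation

# Randomized assignment: A is independent of (Y(0), Y(1)) by design,
# so the difference in observed group means estimates Delta.
a = rng.integers(0, 2, n)
y_obs = np.where(a == 1, y1, y0)
estimated_delta = y_obs[a == 1].mean() - y_obs[a == 0].mean()

print(f"true Delta = {true_delta:.2f}, estimated Delta = {estimated_delta:.2f}")
```

In an observational study the same difference in means would generally not equal ∆, because exposure may be related to the potential outcomes; additional causal assumptions would be needed, as discussed below.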
So far, this section has discussed overall average causal effects. A variety of causal effects may be of interest in any given study, and may inform causal determinations. These often consist of different ways of summarizing the distribution of potential outcomes. For example, effects based on a mediation framework (Nguyen et al., 2021; see also Chapter 8) may aim to disentangle the direct effects of air pollution on a given health outcome from indirect effects that operate through some specific mechanism. For example, one might be interested in estimating the indirect effect of air pollution on a pregnancy outcome—that is, assessing whether the effects of air pollution on pregnancy outcomes are mediated by maternal metabolomics (Inoue et al., 2020). In other cases one might be interested in quantifying effect heterogeneity (i.e., heterogeneity in effects across groups), such as examining whether the effects of different levels of air pollution on a given outcome differ for rural versus urban areas. These effects are broadly known as conditional average treatment effects (e.g., Wager and Athey, 2018). Importantly, this issue of effect heterogeneity means that the effect estimated in one group may not hold for others, and studies need to be assessed on how generalizable the study findings might be for different groups of the population (VanderWeele, 2015; Westreich et al., 2019). Clarity in what effect is of interest—and then in how well it can be estimated in any given study—is crucial for understanding the relevance of a particular study for the ISA causal determinations (Hernán and Robins, 2020; Imbens and Rubin, 2015; VanderWeele, 2015). Chapter 8 further discusses some specific study design and analysis approaches that may inform ISA study selection and evidence integration.
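Conditional average treatment effects can likewise be sketched in a simulation with a hypothetical urban/rural effect modifier. The group-specific effects below are assumed values chosen only to illustrate heterogeneity, not estimates from any study.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Hypothetical effect modifier: 1 = urban, 0 = rural.
urban = rng.integers(0, 2, n)

# Hypothetical potential outcomes, with an assumed larger effect in urban areas.
y0 = rng.normal(100.0, 15.0, n)
effect = np.where(urban == 1, 5.0, 1.0)   # assumed group-specific true effects
y1 = y0 + effect

a = rng.integers(0, 2, n)                 # randomized exposure
y_obs = np.where(a == 1, y1, y0)

# Conditional average treatment effects: difference in observed means
# within each level of the modifier.
cates = {}
for group, label in [(1, "urban"), (0, "rural")]:
    mask = urban == group
    cate = y_obs[mask & (a == 1)].mean() - y_obs[mask & (a == 0)].mean()
    cates[label] = cate
    print(f"estimated CATE ({label}): {cate:+.2f}")
```

Because the two groups have different effects, the overall average effect would describe neither group well, which is why generalizability across populations must be assessed.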
This report focuses on the potential outcomes approach for defining causal effects, given its clarity and wide use in science (VanderWeele, 2016a). However, many other schools of thought exist regarding causal inference (e.g., Green, 2003; Krieger and Davey Smith, 2016; Pearl, 2009; Spirtes et al., 2016; Vandenbroucke et al., 2016). The scope of this chapter is didactic, to provide a
high-level conceptual definition of causality. The committee does not mean to imply that individual studies that do not use a potential outcome framework should not be considered in the ISA process. The field of causal inference is rapidly evolving, and a variety of causal inference approaches can be used in the context of individual studies and provide insights relevant for the ISA causality determinations.
The Preamble (EPA, 2015a) uses a definition of causality articulated originally in the 1964 Surgeon General’s report (HEW, 1964) that is highly related to the idea of potential outcomes: “The 1964 Surgeon General’s report on tobacco smoking defined ‘cause’ as a ‘significant, effectual relationship between an agent and an associated disorder or disease in the host’ ” (EPA, 2015a, p. 18). However, determining an effectual relationship between an agent and associated disorder is complicated by a lack of direct observation. As such, EPA integrates and synthesizes evidence from a wide and comprehensive body of scientific literature. The Preamble summarizes EPA’s approach to evaluating causal relationships from this synthesized evidence as follows:
In its evaluation and integration of the scientific evidence on health or welfare effects of criteria pollutants, the U.S. EPA determines the weight of evidence in support of causation and characterizes the strength of any resulting causal classification. The U.S. EPA also evaluates the quantitative evidence and draws scientific conclusions, to the extent possible, regarding the concentration-response relationships and the loads to ecosystems, exposures, doses or concentrations, exposure duration, and pattern of exposures at which effects are observed. (EPA, 2015a)
EPA relies on the Bradford Hill aspects of association (Hill, 1965) to guide its weight of evidence approach. Those aspects are aligned with a potential outcomes/counterfactual way of thinking about causation used in epidemiological studies, in terms of relating different exposures (i.e., causes) to outcomes (i.e., effects). The potential outcome framework is useful for assessing the evidence of causality provided by an individual study. The Bradford Hill aspects of association are useful for then integrating evidence of causality from multiple sources, taking into account the quality and relevance of the evidence from each of the individual studies.
The above description defining causal effects does not require any knowledge of why such effects occur. An alternative approach to causality is known as “mechanistic causality.” Mechanistic causality requires an understanding of the causal mechanism—that is, all the steps underlying an observed association need to be explicit and have a genuine explanation (Campaner, 2011). To achieve mechanistic knowledge is to be able to determine not only that some factors contribute to some effects, but also how they contribute, revealing the continuous and dynamic processes linking causes with their effects. A mechanistic perspective on causality holds that a causal relationship can be revealed through identification of the specific mechanisms that describe how a given cause affects the study outcome. More detailed definitions and discussion of variations of this approach can be found in Salmon (1984), Glennan (1996), and Machamer et al. (2000). Writing from a classical physics perspective, Glennan (1996) defined mechanisms as “complex systems whose ‘internal’ parts interact to produce a system’s ‘external behavior,’” arguing that “events are causally related when there is a mechanism that connects them.”
A major part of establishing causality includes some aspects of mechanistic understanding. Knowledge of underlying mechanisms, however imperfect, can help clarify which apparently-
causal relationships are plausible and potentially relevant. For example, mechanistic understanding can help inform how causal effects should be defined (e.g., the appropriate time lag between the exposure of interest and the outcome of interest), and how well one can learn about the causal relationships of interest (e.g., to help understand how plausible the underlying assumptions of studies are, such as whether confounding is sufficiently dealt with). Perfect knowledge of all mechanisms involved in all the effects of air pollution on public health or welfare is currently (and may always be) unattainable.
For example, Brook and colleagues (Brook et al., 2010) have elucidated many pathological mechanisms that lend biological plausibility to the adverse effects of fine particulate matter on cardiovascular health. In this context, mechanistic causality could be determined if the pathways and biological mechanisms by which air pollutants lead to cardiovascular diseases were fully identified. An understanding of these mechanisms can also help identify, for example, which confounders are important to adjust for in studies relating fine particulate matter to cardiovascular health. Another example of a mechanistic approach is provided in Box 3.1, relating particulate matter to visibility.
A concept that brings together the ideas of mechanistic causality and the definition of causal effects using potential outcomes is that of “causal models.” Causal models—which aim to establish an understanding of causal effects in part through the clear articulation of causal assumptions—must be distinguished from purely statistical models. Given the unobserved potential outcomes and the “fundamental challenge of causal inference” described above, assumptions or knowledge encoded in a causal model often involve the potential outcomes. These assumptions may or may not have empirical implications or be empirically testable. Statistical models involve modeling observed data only, and all assumptions in statistical models are empirically testable. For example, temporal ordering (i.e., the ordering of events in time) is a type of knowledge encoded in a causal model. This means that in a study assessing the causal effect of A on Y, it is necessary to know a priori whether A is antecedent to Y, as causes must precede their effects.
Another example of knowledge encoded in a causal model assumption is randomization. Randomization works because the randomized exposure is independent of the potential outcomes,
by design. When assessing the causal effect of a randomized exposure A on an outcome Y, knowledge of such independence must come from sources external to observations on A and Y, namely knowledge of the study design—that randomization to exposure conditions occurred. This knowledge of randomization will have implications in the data—in particular, similarity of covariate distributions across exposure groups, on average—but data analysis cannot guarantee or prove that randomization occurred. Another example of an untestable causal assumption that is often encoded or assessed using a causal model is whether a given set of pre-treatment variables is sufficient to control for confounding (Greenland et al., 1999; Rubin, 1978). Causal models are informed by a priori understanding of the underlying mechanisms, and they make explicit any assumptions that are required to identify causal effects from observed data (as discussed above, a causal effect is identified if it can be uniquely computed from the distribution of observed variables). In brief, statistical models allow assessment of whether two variables are related in a dataset. A causal model articulates the assumptions—and possibly their plausibility—required to interpret that relationship as causal (Carone et al., 2020).
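The point that randomization has observable implications (covariate balance on average) but cannot be proven from the data alone can be illustrated with a hypothetical baseline covariate. The covariate name and numbers below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# A hypothetical pre-exposure covariate (e.g., baseline age).
age = rng.normal(45.0, 12.0, n)

# Randomized exposure: independent of the covariate by design.
a = rng.integers(0, 2, n)

# Implication of randomization: covariate distributions are similar
# across exposure groups, on average.
balance_gap = abs(age[a == 1].mean() - age[a == 0].mean())
print(f"difference in mean age across exposure groups: {balance_gap:.3f}")

# A small gap is consistent with randomization, but it does not prove
# randomization occurred: a confounded design could balance this measured
# covariate while remaining imbalanced on unmeasured variables.
```

This is why knowledge that randomization occurred must come from the study design itself, not from balance checks on the observed data.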
In this report, the committee adopts the term causal model to refer to this set of a priori assumptions about unobserved variables, including the unobserved potential outcomes and how they relate to observed variables. The committee also distinguishes between causal models and statistical
models, and reserves the latter term to refer to assumptions on observed variables (e.g., Petersen and van der Laan, 2014). Causal models, while necessary in both randomized and observational studies, are more important in observational studies, where assumptions about the exposure mechanism are often made because randomization is not used. Petersen and van der Laan (2014) further distinguish between causal and statistical models; examples of causal models include the Neyman-Rubin causal model (Holland, 1986; Rubin, 1974), non-parametric structural equation models (Pearl, 2021), single-world intervention graphs (Richardson and Robins, 2013), and marginal structural models (Robins et al., 2000).
The definition and assessment of causal models and their underlying assumptions are paramount in evaluating the quality of a study and the information it provides toward a causal determination. Box 3.2 provides some useful definitions of related terminology. Causal models are also incorporated in linking policy interventions with health outcomes (e.g., following a chain of accountability). In this context, interest often lies in assessing both the direct effect of an intervention on the health outcome and the indirect effect of an intervention on a health outcome via changes in the exposure to air pollution (e.g., HEI, 2003; Henneman et al., 2017; Hubbell and Greenbaum, 2014; Kim et al., 2020). This application is not the focus of this report.
When developing a causal model, researchers may seek to evaluate how a proposed causal factor (e.g., an exposure, a regulatory intervention, a particular set of conditions) would potentially affect the outcome of interest. Under this potential outcome framework, when determining whether a given event has a causal relationship with an outcome, such as the event leading to a higher probability of experiencing the outcome, researchers need to specify that actual or hypothetical event. In the manipulative view of causation, researchers are further required to specify a feasible event, that is, an intervention that can actually be manipulated (an “interventional cause”) that would bring about the potential outcome of interest; only events that are manipulable can be deemed interventional causes.
In the context of air pollution research, examples of manipulative causation studies are the so-called accountability studies that have been conducted to assess how regulations affect health. In such accountability studies, the “interventional cause” is a specific regulatory intervention (such as implementation of a traffic ban to reduce emissions from cars). In this case, the cause can be manipulated (as policy makers can decide to enact a traffic ban or not), and the relationship between the manipulated cause (a traffic ban) and the outcome (e.g., asthma) can be assessed. The impact of the interventional cause on pollutant emissions and air quality as intermediates, and how that air quality change affected health, can also be assessed (e.g., Boogaard et al., 2017; Henneman et al., 2017). This is in contrast with more traditional air pollution studies, where the putative cause is a hypothetical change in air pollution levels, which is harder to manipulate directly (Zigler and Dominici, 2014a).
Epidemiological studies of air pollution and health can be broadly categorized depending on the causal question (Zigler and Dominici, 2014a). The question might be posed: What are the causal effects of differential exposure to pollution on health? The objective, in this context, is to evaluate the causal effects of a hypothetical change in air pollution levels, which is difficult to manipulate directly. Several studies have attempted to estimate such effects (Chay and Greenstone, 2003; Chen et al., 2013; Currie and Walker, 2011; Moore et al., 2010; Pope, 1996; Pope et al., 2007; Rich et al., 2012). Another category of studies attempts to address the following question: “What is the causal effect of the intervention on health?” (Chay and Greenstone, 2003; Clancy et al., 2002; Deschenes et al., 2012; Friedman et al., 2001; Greenstone, 2004; Hedley et al., 2002; Tonne et al., 2008; Zigler et al., 2012). In this context, it is possible to evaluate the health consequences of an intervention that has actually occurred, and also to evaluate its direct and indirect effects—that is, effects that have occurred via the changes in air quality and effects that have occurred through different pathways (e.g., Zigler and Dominici, 2014a).
An example used to show the complexity of manipulative causation is the effect of obesity as measured by body mass index (BMI) on mortality, where there are different methods to modify BMI (e.g., physical activity versus dietary alteration), which may lead to different counterfactual mortality outcomes. Thus, the effect of BMI is undefined until an intervention to modify BMI is explicitly specified (Hernán and Taubman, 2008). In this context, a causal estimand is clearly specified and identified only if the exposure that has been manipulated is also identified (e.g., exercise as a modifiable intervention of BMI). A meaningful causal estimand will explicitly define the modifier in such a way as to describe the causal effect of interest (e.g., 1 hour per day of exercise versus less than 1 hour per day). A different modifier of BMI (e.g., dietary intervention, or 2 hours per day of exercise) could well result in a different “effect” of BMI on mortality (Hernán and Taubman, 2008). Similar arguments have been made in the case of the effect of pollutants on health outcomes (Cox, 2018). In these studies, the causal estimand is identified only if it is defined how the exposure has been—or hypothetically would be—manipulated, for example, via a specific regulatory action.
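The BMI example can be mimicked in a small simulation: two hypothetical interventions produce the same change in the exposure but are assumed to act through different pathways, yielding different counterfactual risks. All numbers below are illustrative assumptions, not estimates from Hernán and Taubman (2008).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Hypothetical baseline mortality risk for each individual.
base_risk = rng.uniform(0.05, 0.15, n)

# Two hypothetical interventions that lower BMI by the same amount but act
# through different pathways, with different assumed effects on risk.
risk_exercise = base_risk - 0.03   # exercise: assumed larger risk reduction
risk_diet = base_risk - 0.01       # diet: assumed smaller risk reduction

effect_exercise = np.mean(risk_exercise - base_risk)
effect_diet = np.mean(risk_diet - base_risk)

# Same change in BMI, different "effect of BMI" on mortality risk:
print(f"exercise pathway: change in mean risk = {effect_exercise:+.3f}")
print(f"diet pathway:     change in mean risk = {effect_diet:+.3f}")
```

Because the two interventions yield different counterfactual risks, “the effect of BMI” is not well defined until the intervention that modifies BMI is specified, which is the point of the argument above.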