From Neurons to Neighborhoods: The Science of Early Childhood Development (2000)

Chapter: B Defining and Estimating Causal Effects



Suggested Citation: "B Defining and Estimating Causal Effects." Institute of Medicine and National Research Council. 2000. From Neurons to Neighborhoods: The Science of Early Childhood Development. Washington, DC: The National Academies Press. doi: 10.17226/9824.

treatment C.¹ These are called potential outcomes. Children are viewed, then, as having potential outcomes, only some of which will ever be realized. Several conclusions follow from this definition.

First, the causal effect is defined uniquely for each child, so the impact of the treatment can vary from child to child. Modern thinking about cause therefore rejects the conventional assumption that a new treatment adds a constant effect for every child. This assumption, never realistic to scientists or practitioners, was historically made to simplify statistical analysis.

Second, the causal effect cannot be observed. If a given child is assigned to E, we will observe the outcome under E but not the outcome under C for that child. But if the child is assigned to C, we will observe the outcome under C but not the outcome under E. Holland (1986) refers to the fact that only one of two potential outcomes can be observed as the fundamental problem of causal inference.

Third, although a given child will ultimately receive only one treatment, say, treatment E, it must be reasonable at least to imagine a scenario in which that child could have received C. And similarly, even though another child received C, it must be reasonable to imagine a scenario in which that child had received E. If it is not possible to conceive of each child's response under each treatment, then it is not possible to define a causal effect. There must, then, be a road not taken that could have been taken, for each child. Thus, both the outcome under E and the outcome under C must exist in principle even if both cannot be observed in practice. Therefore, in current thinking about cause in statistical science, a fixed attribute of a child (say sex or ethnic background) cannot typically be a cause. We cannot realistically imagine how a girl would have responded if she had been a boy or how a black child would have responded if that child had been white. Epidemiologists refer to such attributes as fixed markers (Kraemer et al., 1997), unchangeable attributes that are statistically related to an outcome but do not cause the outcome.

This theory of causation provides new insights into why randomized experiments are valuable. It also provides a framework for how to think about the problem of causal inference when randomized experiments are not possible.

¹ The causal effect could also be defined as the ratio Y_i(E)/Y_i(C), depending on the scale of Y, but we limit this discussion to causal effects as differences for simplicity.

According to the RRH theory, the problem of causal inference is a problem of missing data. If both potential outcomes were observable, the causal effect could be directly calculated for each participant. But one of the potential outcomes is inevitably missing. If the data were missing completely at random, we could compute an unbiased estimate of the average causal effect for any subgroup. A randomized experiment ensures just that: that the missing datum is missing completely at random, ensuring unbiased estimation of the average treatment effect.
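The missing-data view of a randomized experiment can be illustrated with a small simulation. The sketch below is only an illustration under made-up assumptions: the outcome scale, the effect distribution, and the function name `simulate_randomized_trial` are hypothetical and do not come from the text. Each simulated child has both potential outcomes, a coin flip decides which one is observed, and the difference in observed group means is compared with the true average effect.

```python
import random
from statistics import fmean


def simulate_randomized_trial(n=20000, seed=0):
    """Return (true average effect, difference in observed group means)."""
    rng = random.Random(seed)
    observed_e, observed_c, effects = [], [], []
    for _ in range(n):
        y_c = rng.gauss(50, 10)      # potential outcome under C (made-up scale)
        effect = rng.gauss(5, 3)     # child-specific effect; varies by child
        y_e = y_c + effect           # potential outcome under E
        effects.append(effect)
        # Coin-flip assignment: which outcome goes missing is unrelated
        # to either potential outcome.
        if rng.random() < 0.5:
            observed_e.append(y_e)   # y_c is never seen for this child
        else:
            observed_c.append(y_c)   # y_e is never seen for this child
    return fmean(effects), fmean(observed_e) - fmean(observed_c)


true_ate, estimate = simulate_randomized_trial()
print(f"true average effect:          {true_ate:.2f}")
print(f"difference in observed means: {estimate:.2f}")
```

No individual effect is ever observed in this simulation, yet the two printed numbers should be close, because the coin flip makes the unobserved outcomes missing completely at random.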

Suppose, by contrast, that E or C could be selected by each child's parents. Suppose further that more-advantaged parents tended to choose E while the less-advantaged parents tended to choose C. Then the potential outcomes would be nonrandomly missing. The outcome under E would come to be observed more often for advantaged than for disadvantaged children. Selection bias is thus a problem of nonrandomly missing data.

Even more insidiously, suppose that some parents had previous knowledge about how well their child is likely to fare under the new day care program. For example, one parent might know that, without the new program, her child will be cared for by the paternal grandmother, who is known to be a master teacher of young children. Thus, this parent decides not to participate in the new day care program, knowing that the child will probably do better without it. Other parents who know their families do not include talented teachers with time to care for their child choose the new program. Such information is rarely available to researchers, yet it produces nonrandomly missing data.
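A companion sketch shows what happens when parents, rather than a coin flip, decide which outcome is observed. All numbers here are assumptions chosen for illustration: advantaged children are given higher outcomes under either treatment, and advantaged parents are assumed to choose E 80 percent of the time against 20 percent for other parents. The function name `simulate_parent_choice` is hypothetical.

```python
import random
from statistics import fmean


def simulate_parent_choice(n=20000, seed=0):
    """Return (true average effect, naive E-minus-C difference in means)."""
    rng = random.Random(seed)
    observed_e, observed_c, effects = [], [], []
    for _ in range(n):
        advantaged = rng.random() < 0.5
        # Assumed: advantage raises the outcome under either treatment.
        y_c = rng.gauss(55 if advantaged else 45, 10)
        effect = rng.gauss(5, 3)     # child-specific effect, unrelated to advantage
        effects.append(effect)
        # Assumed: advantaged parents choose E 80% of the time, others 20%.
        chooses_e = rng.random() < (0.8 if advantaged else 0.2)
        if chooses_e:
            observed_e.append(y_c + effect)
        else:
            observed_c.append(y_c)
    return fmean(effects), fmean(observed_e) - fmean(observed_c)


true_ate, naive = simulate_parent_choice()
print(f"true average effect:        {true_ate:.2f}")
print(f"naive difference in means:  {naive:.2f}")
```

Under these assumed parameters the naive difference should land near 11 while the true average effect is near 5. The excess is selection bias: the outcome under E is observed mostly for advantaged children and the outcome under C mostly for disadvantaged children.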

We view the probability of assignment to E as the propensity to receive the experimental treatment, or simply “the propensity score” (Rosenbaum and Rubin, 1983). Under random assignment to treatments, the propensity score is independent of the potential outcomes. In the hypothetical case above, by contrast, family advantage is related both to the propensity score and to the potential outcomes. This creates a correlation between the propensity and the potential outcomes. Now suppose that it is impossible to conduct a randomized experiment but it is possible to determine exactly how family circumstances translate into propensity, that is, how families get selected into the treatment. We could then implement a statistical procedure:

1. For every possible participant, predict the propensity of being in the experimental group.
2. Divide all sample members into subgroups having the same propensity.
3. Within each subgroup, compute the mean difference between those in E and C as the average treatment effect for that group.²
4. Average these treatment effects across all subgroups to estimate the overall average treatment effect.

² In a variant of this procedure devised by Robins, Greenland, and Hu (1999), sample weights are computed that are inversely proportional to the propensity of receiving the treatment actually received. Experimental and control groups are then compared with respect to their weighted means. This procedure minimizes the influence of persons with the strongest propensity to receive the treatment they received and eliminates bias in estimating treatment effects when the propensity is accurately predicted. The method has especially useful applications when the treatments are time-varying.


The resulting estimate will be an unbiased estimate of the average treatment effect. Every comparison between those in E and those in C involves subsets of children having identical propensities to experience E. Therefore, the potential outcomes of the children compared cannot be associated with their propensities, and the estimates of the treatment effect will be unbiased. This procedure also makes it easy to estimate separate treatment effects for each subgroup.
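The subgroup procedure, and the inverse-propensity weighting variant mentioned in the footnote, can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the propensity of every child is known exactly and takes one of five values, it averages subgroup effects with weights proportional to subgroup size (the text does not specify the weighting), and it normalizes the inverse-propensity weights within each group. The data-generating numbers and all function names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import fmean


def simulate_known_propensity(n=40000, seed=0):
    """Return (records, true average effect); records are (propensity, treated, outcome)."""
    rng = random.Random(seed)
    propensities = [0.1, 0.3, 0.5, 0.7, 0.9]   # assumed: one value per advantage level
    records, effects = [], []
    for _ in range(n):
        level = rng.randrange(5)                # family advantage level, 0 to 4
        propensity = propensities[level]
        y_c = rng.gauss(40 + 5 * level, 10)     # advantage also raises the outcome
        effect = rng.gauss(5, 3)
        treated = rng.random() < propensity
        records.append((propensity, treated, (y_c + effect) if treated else y_c))
        effects.append(effect)
    return records, fmean(effects)


def stratified_ate(records):
    """Steps 2 to 4: group by propensity, difference the means, average the differences."""
    groups = defaultdict(lambda: {True: [], False: []})
    for propensity, treated, outcome in records:
        groups[propensity][treated].append(outcome)
    weighted_sum, total = 0.0, 0
    for arms in groups.values():
        if not arms[True] or not arms[False]:
            continue                            # no E-versus-C comparison possible here
        size = len(arms[True]) + len(arms[False])
        weighted_sum += size * (fmean(arms[True]) - fmean(arms[False]))
        total += size
    return weighted_sum / total


def ipw_ate(records):
    """Compare weighted means, weighting by 1 / P(treatment actually received)."""
    num_e = den_e = num_c = den_c = 0.0
    for propensity, treated, outcome in records:
        if treated:
            weight = 1.0 / propensity
            num_e += weight * outcome
            den_e += weight
        else:
            weight = 1.0 / (1.0 - propensity)
            num_c += weight * outcome
            den_c += weight
    return num_e / den_e - num_c / den_c


records, true_ate = simulate_known_propensity()
print(f"true average effect: {true_ate:.2f}")
print(f"subgroup estimate:   {stratified_ate(records):.2f}")
print(f"weighted estimate:   {ipw_ate(records):.2f}")
```

Both estimates should fall close to the true average effect in this simulation, because every comparison is made among children with identical, correctly known propensities.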

When children are matched on propensity scores, the validity of the causal estimate depends strongly on the investigator's knowledge of the factors that affect the propensity to experience E versus C. More specifically, if some unknown characteristic of the child predicts the propensity to be in E versus C, and if that characteristic also is associated with the potential outcomes, then the estimate of the treatment effect based on propensity score matching will be biased. The assumption that no such confounding variable exists is a strong assumption. It is the responsibility of the investigator to collect the relevant background data and to provide sound arguments based on theory and data analysis that the relevant predictors of propensity have been controlled. Even then, doubts will remain in the minds of some readers. In contrast, all possible predictors of propensity are controlled in a randomized experiment, including those that would have escaped the attention of the most thoughtful investigator. Rosenbaum (1995) describes procedures for examining the sensitivity of causal inferences to lack of knowledge about propensity when randomization is impossible.
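The consequence of an unrecorded predictor of propensity can also be simulated. The sketch below borrows the earlier example of a family with a skilled relative, under invented numbers: the investigator records family advantage but not whether a skilled relative is available, and the skilled relative both lowers the chance of choosing E and raises the child's outcome under C. Subgroups are formed once from the recorded factor alone and once from both factors. All names and parameters are hypothetical.

```python
import random
from collections import defaultdict
from statistics import fmean


def stratified_difference(rows):
    """Size-weighted average of E-minus-C mean differences across strata.

    rows: iterable of (stratum_key, treated, outcome). Assumes every stratum
    contains children in both E and C.
    """
    strata = defaultdict(lambda: ([], []))
    for key, treated, outcome in rows:
        strata[key][0 if treated else 1].append(outcome)
    total = sum(len(e) + len(c) for e, c in strata.values())
    return sum((len(e) + len(c)) * (fmean(e) - fmean(c))
               for e, c in strata.values()) / total


def simulate_hidden_confounder(n=40000, seed=0):
    """Return (true effect, estimate from recorded factor only, estimate from both factors)."""
    rng = random.Random(seed)
    recorded_only, full_knowledge, effects = [], [], []
    for _ in range(n):
        advantaged = rng.random() < 0.5         # recorded by the investigator
        skilled_relative = rng.random() < 0.3   # known to parents, not recorded
        base = rng.gauss(55 if advantaged else 45, 10)
        y_c = base + (15 if skilled_relative else 0)   # relative helps only under C
        y_e = base + 7
        effects.append(y_e - y_c)
        # Assumed: families with a skilled relative are less likely to choose E.
        propensity = (0.6 if advantaged else 0.4) - (0.3 if skilled_relative else 0.0)
        treated = rng.random() < propensity
        outcome = y_e if treated else y_c
        recorded_only.append((advantaged, treated, outcome))
        full_knowledge.append(((advantaged, skilled_relative), treated, outcome))
    return (fmean(effects),
            stratified_difference(recorded_only),
            stratified_difference(full_knowledge))


true_ate, partial, full = simulate_hidden_confounder()
print(f"true average effect:              {true_ate:.2f}")
print(f"subgroups on recorded factor:     {partial:.2f}")
print(f"subgroups on both factors:        {full:.2f}")
```

In this simulation the estimate based only on the recorded factor should understate the true average effect, while the estimate that also uses the unrecorded factor should recover it. An investigator with real data has no such second column to check against, which is why the assumption of no unmeasured confounding is a strong one.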

Perhaps the most common strategy for approximating unbiased causal inference in nonexperimental settings is the use of statistical adjustments. In early childhood research, it is very common to use linear models (regression, analysis of variance, structural equation models) to adjust estimates of treatment impact for covariates related to the outcome. These covariates must be pretreatment characteristics of the child or the setting, and the aim is to include all confounders in the set of covariates controlled. By statistically “holding constant” the confounders in assessing treatment impact, one aims to approximate a randomized experiment. Under some assumptions, this strategy will work. In particular, if the propensity score (the probability of receiving treatment E) is a linear function of the covariates used in the model, then this adjustment strategy will provide an unbiased estimate of the treatment effect. Aside from the possible fragility of this assumption, this strategy is limited in that only a relatively small set of covariates may be included in the model. In the propensity score matching procedure described earlier, by contrast, it is possible, and advisable, to use as many covariates as one can obtain in the analysis that predicts propensity.
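A covariate adjustment of this kind can be sketched with a single pretreatment covariate. The example below is an illustration under stated assumptions, not a recommended analysis: the simulated propensity is linear in the covariate, the outcome is also linear in it, and the adjusted estimate is the treatment coefficient from a least-squares fit of the outcome on a treatment indicator and the covariate, computed in its analysis-of-covariance form. The names `simulate_linear_propensity` and `adjusted_effect` are hypothetical.

```python
import random
from statistics import fmean


def simulate_linear_propensity(n=20000, seed=0):
    """Return (x, t, y, true average effect) with propensity linear in x."""
    rng = random.Random(seed)
    x, t, y, effects = [], [], [], []
    for _ in range(n):
        xi = rng.uniform(-2, 2)                 # pretreatment covariate
        propensity = 0.5 + 0.2 * xi             # linear in x, stays within [0.1, 0.9]
        y_c = rng.gauss(50 + 4 * xi, 10)        # covariate also predicts the outcome
        effect = rng.gauss(5, 3)
        ti = rng.random() < propensity
        x.append(xi)
        t.append(ti)
        y.append((y_c + effect) if ti else y_c)
        effects.append(effect)
    return x, t, y, fmean(effects)


def adjusted_effect(x, t, y):
    """Treatment coefficient from least squares of y on a treatment dummy and x."""
    groups = {True: ([], []), False: ([], [])}
    for xi, ti, yi in zip(x, t, y):
        groups[ti][0].append(xi)
        groups[ti][1].append(yi)
    means = {g: (fmean(xs), fmean(ys)) for g, (xs, ys) in groups.items()}
    sxy = sxx = 0.0
    for g, (xs, ys) in groups.items():
        mx, my = means[g]
        sxy += sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        sxx += sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx                           # pooled within-group slope of y on x
    raw_gap = means[True][1] - means[False][1]
    covariate_gap = means[True][0] - means[False][0]
    return raw_gap - slope * covariate_gap


x, t, y, true_ate = simulate_linear_propensity()
naive = fmean(yi for yi, ti in zip(y, t) if ti) - fmean(yi for yi, ti in zip(y, t) if not ti)
print(f"true average effect: {true_ate:.2f}")
print(f"unadjusted estimate: {naive:.2f}")
print(f"adjusted estimate:   {adjusted_effect(x, t, y):.2f}")
```

In this simulation the unadjusted difference should overstate the effect, and the adjusted estimate should fall close to the true value. The agreement depends on the simulated propensity really being linear in the one covariate used, which is the fragile assumption noted above.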


C Technologies for Studying the Developing Human Brain

NEUROPSYCHOLOGICAL TOOLS

The strategy behind the use of neuropsychological tools is to generate a hypothesis about which area of the brain is involved in a particular behavior and then employ a behavioral test (or tests) to evaluate this hypothesis. Ideally one is able to dissociate one behavior from another (e.g., explicit from implicit memory) using a cluster of tasks or by applying such tasks to both normative and clinical populations.

In terms of elucidating brain-behavior relations in normative samples, neuropsychological tools are frequently adopted that have first been used in animal models or in clinical populations of humans. For example, if one is interested in the type of memory subserved by the medial temporal lobe (i.e., episodic memory), one might employ tasks that have been demonstrated in monkeys or in humans in whom the hippocampus has been lesioned through surgery or through injury to result in memory impairments.

The use of neuropsychological tools has received extensive study in the developing human. For example, Diamond has employed the Piagetian A-not-B task and its animal analogue, the delayed response task, to study the development of certain functions subserved by the prefrontal cortex (e.g., spatial working memory; see Diamond, 1990; Diamond and Doar, 1989; Diamond and Goldman-Rakic, 1989; Diamond et al., 1989). And Bachevalier (with respect to the monkey) and Nelson (with respect to the human) have utilized a set of tools (e.g., visual paired comparison; the
