these two interventions to effectively compound their individual benefits. We therefore recommend that future research efforts focus on multidimensional RCT designs to assess how well these two interventions work separately and in concert with each other and with other approaches, such as hospital and primary care screening programs. Because of the multitude of IPV etiologies and patterns, the most likely path to ending IPV, once it has started, lies in efforts that bring multiple agencies together so they can identify, assess, and respond appropriately when needed. Such partnerships are the most likely solution for addressing the range of systemic issues facing people who experience violence and abuse within their intimate relationships.
II.5
INTEGRATING EVIDENCE ON VIOLENCE PREVENTION: AN INTRODUCTION
Anthony Petrosino, Ph.D.8
WestEd
Questions like “What works to prevent violence?” require a careful examination of the research evidence. The evidence is composed of the studies that have been conducted to test the effects of an intervention, a policy, or a practice on violence outcomes.
Integrating evidence is necessary because many programs and policies have been evaluated, across many countries and with different populations within the same nations, and using many different methods and measures. How can we even begin to make sense of these studies to respond to the question, “What works to prevent violence?” How can we do it in a way that is systematic and explicit, and convinces the skeptics (and there are at least a few of those) that the answers are reasonable and to be trusted, especially when decisions about what to do often take place in a highly politicized and contentious context?
There have been several developments to integrate evidence in violence prevention. Two of the more common approaches are referred to in this paper as systematic reviews and evidence-based registries. This paper provides a brisk overview of both.
_________________
8 The author thanks Trevor Fronius and Claire Morgan for their comments on earlier drafts of this paper.
Systematic Reviews
A terrific scenario would be if every study were conducted in the same way and came to the same conclusions. Then it would not matter which study we pulled out of a file drawer or which bundle of studies we presented; they would represent the evidence quite well.
But, as it turns out, life is not so simple. Studies usually vary on all sorts of dimensions, including the quality of the methods and the confidence we have in the conclusions. Another way that studies vary is on the results that are reported. Some studies report a positive impact for an intervention, others report little or no effect at all, and still others report harmful effects for it. This variation in results presents a problem, as zealots and advocates on both sides of a public policy question can selectively use the evidence (“cherry picking”) to support their particular position. This was the point made by Furby and her colleagues in 1989 (p. 22) when reviewing the impact of treatment of sex offenders on their subsequent reoffending:
The differences in recidivism across these studies are truly remarkable; clearly by selectively contemplating the various studies, one can conclude anything one wants.
Apart from the variation across studies and how this might be intentionally exploited by advocates and zealots for particular positions, there are some other issues about evidence that need to be addressed. An important one is that there are potential biases in where studies are reported and how they are identified. What does this mean? Research has shown in some fields that researchers are more likely to submit papers to peer-reviewed academic journals, and editors are more likely to publish them, if they report statistically significant and positive effects for treatment. So any integration of evidence that relies only on peer-reviewed journals could be potentially biased toward positive results for the treatment(s) being examined. How true this is in the violence prevention area has not been explicitly tested, but it is considered good practice now for any integration of evidence to take into account studies published outside of the academy.
Another issue is how “success” for a program is determined. Traditional scientific norms generally mean that we use “statistical significance” to determine whether a result for an intervention is trustworthy. If the observed effect is so large that the result is very likely not due to the “play of chance,” we say it is statistically significant. Traditionally, we are willing to say a result is statistically significant if the result would be expected by the “play of chance” 5 times or fewer in 100 (the .05 criterion). But statistical significance is heavily influenced by sample size; large samples can result in rather trivial differences being statistically significant, and very large effects may not be significant if the sample sizes are modest. Research has found
that relying solely on statistical significance as a criterion for determining success of a program can bias even the well-intentioned and non-partisan reviewer toward concluding a program is ineffective when it may very well have positive and important effects.
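To make the sample-size point concrete, here is a minimal sketch, not from the paper, comparing two groups with a t-test in Python; the effect sizes and sample sizes are invented for illustration. A practically trivial difference reaches the .05 threshold when the samples are huge, while a large difference can miss it when the samples are small.

```python
# Illustrative sketch only: how sample size drives statistical significance.
# Effect sizes and sample sizes are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# A practically trivial effect (0.05 SD) with very large samples
big_treat = rng.normal(0.05, 1, 50_000)
big_ctrl = rng.normal(0.00, 1, 50_000)
print(stats.ttest_ind(big_treat, big_ctrl).pvalue)      # typically far below .05

# A large, practically important effect (0.8 SD) with small samples
small_treat = rng.normal(0.80, 1, 10)
small_ctrl = rng.normal(0.00, 1, 10)
print(stats.ttest_ind(small_treat, small_ctrl).pvalue)  # often above .05
```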
Issues about where results are reported and how success is determined are but a few of the issues that can challenge evidence integration efforts. What is the conscientious person to do? Fortunately, in the past half-century or so, there has been considerable attention to the way reviews of evidence are done. Under the label of meta-analysis, research synthesis, and more recently, systematic reviews, a “science of reviewing” has emerged that essentially holds reviews of evidence to the same standards for scientific rigor and explicitness that we demand of survey studies and experimental studies. In some sense, we have moved from experts doing traditional reviews and saying “trust me” to researchers doing systematic reviews and saying “test me.”
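As a rough illustration of what a quantitative synthesis adds over cherry-picking individual studies, the sketch below shows the core arithmetic of a simple fixed-effect (inverse-variance-weighted) meta-analysis. The effect sizes and standard errors are hypothetical, and real syntheses involve many additional steps such as study screening, quality coding, and heterogeneity checks.

```python
# Hypothetical sketch: inverse-variance (fixed-effect) pooling of study-level
# effect sizes, the basic arithmetic behind many meta-analytic summaries.
import numpy as np

effects = np.array([-0.10, 0.25, 0.05, 0.40])  # per-study effect sizes (invented)
se = np.array([0.12, 0.20, 0.08, 0.30])        # per-study standard errors (invented)

weights = 1.0 / se**2                          # more precise studies get more weight
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled effect = {pooled:.3f} (SE = {pooled_se:.3f})")
print(f"95% CI = [{pooled - 1.96 * pooled_se:.3f}, {pooled + 1.96 * pooled_se:.3f}]")
```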
Systematic reviews can be done in several ways, but most follow a similar set of procedures. An example of a very timely systematic review in the violence prevention area may illustrate the point. Koper and Mayo-Wilson (2012) conducted a systematic review of research for the Campbell Collaboration on the effects of police strategies to reduce illegal possession and carrying of firearms. Following the mass shootings in the United States over the past few years, and particularly following the massacre of elementary schoolchildren in Connecticut in December 2012, there has been much attention to whether these strategies work. The procedures Koper and Mayo-Wilson (2012) followed were as follows:
They searched bibliographic databases for published and unpublished literature; examined reviews and compilations of relevant research; and searched key websites. They found four studies that included seven outcome analyses.
Many public agencies do not have staff that can spend the time necessary to do a systematic review, and they generally rely on external and trusted sources for evidence. The advent of electronic technology has meant that summaries of evidence from systematic reviews can be provided quickly so long as the intended user has Internet access and can download documents. Groups such as the Campbell Collaboration’s Crime and Justice
Group not only prepare and update reviews of evidence, but make them freely available to any intended user around the world. The rigor and transparency of such reviews have made them a trusted source of evidence, particularly in the politicized and contentious environment that surrounds government response to violence.
Evidence-Based Registries
Campbell Collaboration and other systematic reviews tend to be broad summaries of “what works” for a particular problem (e.g., gun violence) and classes of interventions (e.g., police-led strategies for addressing illegal guns). They are not usually focused on brand-name programs or very specific, fine-grained definitions of an intervention. Because decision makers often need evidence on particular interventions, other approaches have been developed to provide more fine-grained evidence.
During the past 10 to 15 years, a common approach across a variety of public policy fields can be classified under the heading of “evidence-based registries.” They are also referred to as “best practice registries” and “best practice lists.” In the violence prevention area, quite a few are relevant, including the University of Colorado’s Blueprints for Violence and Substance Abuse Prevention, the U.S. Department of Justice’s CrimeSolutions.gov effort, the Coalition for Evidence-based Policy’s “Social Programs That Work,” and the U.S. Substance Abuse and Mental Health Services Administration’s National Registry of Evidence-based Programs and Practices. Table II-3 provides a list of some important registries across different fields.
These registries differ in terms of scope and focus, but they all have a similar framework: An external group of scientists examines the evidence for a very specific intervention or policy, such as Life Skills Training or Gang Resistance Education and Training (G.R.E.A.T.). The external group gathers the evidence on that specific program. Generally, though the standards differ for each registry, evidence is included only if it is based on randomized or quasi-experimental designs. Whatever evidence exists on the intervention is then screened to determine whether it meets minimum evidentiary standards, and the studies that pass the screen are used to assess the program’s effectiveness. Most registries attempt to distinguish between (1) model or exemplary programs that have two or more studies demonstrating positive impacts and (2) promising interventions that have only one study indicating positive impacts. Many of the registries include a stunning amount of material on the intervention so that those interested in adopting it can do so. The registry is made available electronically, so it is instantly accessible to the busy professionals who need it, and there is no charge to access it, so it is free to all who can benefit from it.
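The classification logic that many registries apply can be sketched very roughly in code. The sketch below is hypothetical and greatly simplified (actual registries use richer, registry-specific criteria), but it captures the two-step idea described above: screen studies against a minimum evidentiary standard, then label the program by how many passing studies show positive impacts.

```python
# Hypothetical, simplified sketch of a registry-style decision rule; real
# registries apply much richer, registry-specific standards.
from dataclasses import dataclass

@dataclass
class Study:
    design: str            # "RCT" or "quasi-experimental"
    equated_groups: bool   # evidence that groups were equated at baseline
    positive_impact: bool  # study reported a positive program impact

def classify(studies: list[Study]) -> str:
    # Minimum evidentiary screen: randomized experiments, or
    # quasi-experiments with evidence of equating
    passing = [s for s in studies
               if s.design == "RCT"
               or (s.design == "quasi-experimental" and s.equated_groups)]
    positives = sum(s.positive_impact for s in passing)
    if positives >= 2:
        return "model/exemplary"
    if positives == 1:
        return "promising"
    return "insufficient evidence"

print(classify([Study("RCT", True, True),
                Study("quasi-experimental", True, True)]))  # -> model/exemplary
```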
TABLE II-3 Evidence-Based Registries Across Different Areas
| Evidence-Based Registry | Area | Evidence Standards |
|---|---|---|
| What Works Clearinghouse | Education | Randomized experiments; quasi-experiments with evidence of equating |
| CrimeSolutions.gov | Criminal justice | Randomized experiments; quasi-experiments (those with evidence of equating are rated highest) |
| Coalition for Evidence-based Policy Top-Tier Evidence | Federal policy (Office of Management and Budget/Congress) | Randomized experiments |
| What Works in Reentry Clearinghouse | Offender reentry/reintegration programs and policies | Randomized experiments; quasi-experiments with evidence of equating |
| HHS Evidence-based Teen Pregnancy Prevention Models | Teen pregnancy prevention | Randomized experiments; “strong” quasi-experiments |
| SAMHSA National Registry of Evidence-based Programs and Practices (NREPP) | Prevention, broadly | Randomized experiments; quasi-experiments |
NOTE: HHS = Department of Health and Human Services; SAMHSA = Substance Abuse and Mental Health Services Administration.
SOURCE: Anthony Petrosino.
An example may also serve to illustrate the evidence-based registry approach. The Coalition for Evidence-Based Policy is a not-for-profit group based in Washington, DC, that advocates for the use of evidence in policy decision making, particularly at the U.S. federal level. The Coalition has been very influential with Congress, the Office of Management and Budget (OMB), and federal agencies such as the U.S. Department of Education’s Institute of Education Sciences. The Coalition’s registry identifies Top-Tier and Near-Tier Evidence; the difference between them is based on whether a high-quality replication of a program has been conducted. A good example is the “Nurse–Family Partnership” championed by David Olds of the University of Colorado, which has been identified as a Top-Tier program by the Coalition.
First, the Coalition solicits or seeks out candidates for Top-Tier or Near-Tier programs. For those candidates, the Coalition then undertakes a careful search to find the evidence on the effects of the program. The Coalition considers only evidence from randomized experiments when designating programs as Top-Tier or Near-Tier. This is a rather strict standard, one that most of the other registries have not adopted, but the Coalition
stresses that only randomized experiments—when implemented with good fidelity—produce statistically unbiased estimates of impact.
To be designated as Top-Tier, a program must have sizable and sustained effects. This is established with multiple experiments testing the program. The Coalition located three randomized experiments of the Nurse–Family Partnership with different populations that have all reported positive effects on a variety of outcomes. Two studies reported a reduction in child abuse and neglect, the outcome that is most relevant to violence prevention. After the Coalition is done summarizing the evidence, it asks for a review by the evaluators who produced the experiments to ensure any inaccuracies are corrected.
Each summary of Top-Tier interventions in the Coalition’s registry includes details on the program and how it differed from what the control group received; the populations and settings in which the intervention was evaluated; the quality of the randomized experiment; and the results on the main outcomes of interest. Because the Nurse–Family Partnership is Top-Tier, the Coalition argues that it should be implemented more widely, and has been pushing Congress and OMB to facilitate wider adoption of programs like it. Most registries contain very detailed information on the intervention and population because one goal is to facilitate adoption and implementation of these Top-Tier programs.
Conclusion
The move toward systematic reviews and evidence-based registries resonates with me as a former state government researcher in the justice area in two states (Massachusetts and New Jersey) over my professional career. Our units would, on occasion, receive an urgent request from the state’s Attorney General (AG), the Governor’s Office, a state legislator, or the head of the Office of Public Safety. These requests came in the days when the Internet was just beginning and offered skimpy sites compared to today. The request would go something like this: “We want to know what works and we want to know by five o’clock.” Generally, this meant there was money to be appropriated and they wanted to make sure those funds were allocated toward effective strategies. Or there might be some controversy over a program like G.R.E.A.T. and they wanted to know what the evidence on the program’s impact was. (In the interests of full disclosure, sometimes those requests were something like “here’s what we’re going to do, now get us the evidence to support it.”)
Little did I know that electronically accessible systematic reviews and evidence-based registries would spring up all over the Internet a few years after I left state government service. These allow the busy government researcher to respond quickly to urgent policy requests. If I were employed