English Pages 300 Year 2013
EVIDENCE-BASED PRACTICES
ADVANCES IN LEARNING AND BEHAVIORAL DISABILITIES
Series Editors: Bryan G. Cook, Melody Tankersley, and Timothy J. Landrum
Recent Volumes:
Volume 14:
Educational Interventions – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 15:
Technological Applications – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 16:
Identification and Assessment – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 17:
Research in Secondary Schools – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 18:
Cognition and Learning in Diverse Settings – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 19:
Applications of Research Methodology – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 20:
International Perspectives – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 21:
Personnel Preparation – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 22:
Policy and Practice – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 23:
Literacy and Learning – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 24:
Assessment and Intervention – Edited by Thomas E. Scruggs and Margo A. Mastropieri
Volume 25:
Classroom Behavior, Contexts, and Interventions – Edited by Bryan G. Cook, Melody Tankersley and Timothy J. Landrum
ADVANCES IN LEARNING AND BEHAVIORAL DISABILITIES VOLUME 26
EVIDENCE-BASED PRACTICES EDITED BY
BRYAN G. COOK University of Hawaii, Honolulu, HI, USA
MELODY TANKERSLEY Kent State University, Kent, OH, USA
TIMOTHY J. LANDRUM University of Louisville, Louisville, KY, USA
United Kingdom – North America – Japan – India – Malaysia – China
Emerald Group Publishing Limited Howard House, Wagon Lane, Bingley BD16 1WA, UK First edition 2013 Copyright © 2013 Emerald Group Publishing Limited Reprints and permission service Contact: [email protected] No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. Any opinions expressed in the chapters are those of the authors. Whilst Emerald makes every effort to ensure the quality and accuracy of its content, Emerald makes no representation implied or otherwise, as to the chapters’ suitability and application and disclaims any warranties, express or implied, to their use. British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN: 978-1-78190-429-9 ISSN: 0735-004X (Series)
ISOQAR certified Management System, awarded to Emerald for adherence to Environmental standard ISO 14001:2004. Certificate Number 1985 ISO 14001
CONTENTS
LIST OF CONTRIBUTORS
CHAPTER 1 EVIDENCE-BASED PRACTICES IN LEARNING AND BEHAVIORAL DISABILITIES: THE SEARCH FOR EFFECTIVE INSTRUCTION
Bryan G. Cook, Melody Tankersley and Timothy J. Landrum
CHAPTER 2 EVIDENCE-BASED EDUCATION AND BEST AVAILABLE EVIDENCE: DECISION-MAKING UNDER CONDITIONS OF UNCERTAINTY
Ronnie Detrich, Timothy A. Slocum and Trina D. Spencer
CHAPTER 3 APPRAISING SYSTEMATIC REVIEWS: FROM NAVIGATING SYNOPSES OF REVIEWS TO CONDUCTING ONE’S OWN APPRAISAL
Ralf W. Schlosser, Parimala Raghavendra and Jeff Sigafoos
CHAPTER 4 ADAPTING RESEARCH-BASED PRACTICES WITH FIDELITY: FLEXIBILITY BY DESIGN
LeAnne D. Johnson and Kristen L. McMaster
CHAPTER 5 SYNTHESIZING SINGLE-CASE RESEARCH TO IDENTIFY EVIDENCE-BASED TREATMENTS
Kimberly J. Vannest and Heather S. Davis
CHAPTER 6 UTILIZING EVIDENCE-BASED PRACTICE IN TEACHER PREPARATION
Larry Maheady, Cynthia Smith and Michael Jabot
CHAPTER 7 THE PEER-REVIEWED REQUIREMENT OF THE IDEA: AN EXAMINATION OF LAW AND POLICY
Mitchell L. Yell and Michael Rozalski
CHAPTER 8 FROM RESEARCH TO PRACTICE IN EARLY CHILDHOOD INTERVENTION: A TRANSLATIONAL FRAMEWORK AND APPROACH
Carol M. Trivette and Carl J. Dunst
CHAPTER 9 EFFECTIVE EDUCATIONAL PRACTICES FOR CHILDREN AND YOUTH WITH AUTISM SPECTRUM DISORDERS: ISSUES, RECOMMENDATIONS, AND TRENDS
Richard Simpson and Stephen Crutchfield
CHAPTER 10 CONSTRUCTING EFFECTIVE INSTRUCTIONAL TOOLKITS: A SELECTIVE REVIEW OF EVIDENCE-BASED PRACTICES FOR STUDENTS WITH LEARNING DISABILITIES
Tanya E. Santangelo, Amy E. Ruhaak, Michelle L. M. Kama and Bryan G. Cook
CHAPTER 11 EVIDENCE-BASED PRACTICE IN EMOTIONAL AND BEHAVIORAL DISORDERS
Timothy J. Landrum and Melody Tankersley
CHAPTER 12 EVIDENCE-BASED PRACTICES IN AUSTRALIA
Jennifer Stephenson, Mark Carter and Sue O’Neill
LIST OF CONTRIBUTORS
Mark Carter
Macquarie University, Sydney, Australia
Bryan G. Cook
University of Hawaii, Honolulu, HI, USA
Stephen Crutchfield
University of Kansas, Lawrence, KS, USA
Heather S. Davis
Texas A & M University, College Station, TX, USA
Ronnie Detrich
Wing Institute, Oakland, CA, USA
Carl J. Dunst
Orelena Hawks Puckett Institute, Morganton, NC, USA
Michael Jabot
SUNY Fredonia, Fredonia, NY, USA
LeAnne D. Johnson
University of Minnesota, Minneapolis, MN, USA
Michelle L. M. Kama
University of Hawaii, Honolulu, HI, USA
Timothy J. Landrum
University of Louisville, Louisville, KY, USA
Larry Maheady
SUNY Fredonia, Fredonia, NY, USA
Kristen L. McMaster
University of Minnesota, Minneapolis, MN, USA
Sue O’Neill
Macquarie University, Sydney, Australia
Parimala Raghavendra
Flinders University, Adelaide, Australia
Michael Rozalski
Binghamton University, Binghamton, NY, USA
Amy E. Ruhaak
University of Hawaii, Honolulu, HI, USA
Tanya E. Santangelo
Arcadia University, Glenside, PA, USA
Ralf W. Schlosser
Northeastern University and Boston Children’s Hospital, Boston, MA, USA
Jeff Sigafoos
Victoria University at Wellington, Wellington, New Zealand
Richard Simpson
University of Kansas, Lawrence, KS, USA
Timothy A. Slocum
Utah State University, Logan, UT, USA
Cynthia Smith
SUNY Fredonia, Fredonia, NY, USA
Trina D. Spencer
Northern Arizona University, Flagstaff, AZ, USA
Jennifer Stephenson
Macquarie University, Sydney, Australia
Melody Tankersley
Kent State University, Kent, OH, USA
Carol M. Trivette
Orelena Hawks Puckett Institute, Morganton, NC, USA
Kimberly J. Vannest
Texas A & M University, College Station, TX, USA
Mitchell L. Yell
University of South Carolina, Columbia, SC, USA
CHAPTER 1 EVIDENCE-BASED PRACTICES IN LEARNING AND BEHAVIORAL DISABILITIES: THE SEARCH FOR EFFECTIVE INSTRUCTION
Bryan G. Cook, Melody Tankersley and Timothy J. Landrum
ABSTRACT
The gap between research and practice in special education places an artificial ceiling on the achievement of students with learning and behavioral disabilities. Evidence-based practices (EBPs) are instructional practices shown by bodies of sound research to be generally effective. They represent a possible means to address the research-to-practice gap by identifying, and subsequently implementing, the most effective instructional practices on the basis of reliable, scientific research. In this chapter, we provide a context for the subsequent chapters in this volume by (a) defining and describing EBPs, (b) recognizing some of the important limitations to EBPs, (c) introducing a number of ongoing issues related to EBPs in the field of learning and behavioral disabilities that are addressed by chapter authors in this volume, and (d) briefly considering a few emerging issues related to EBPs that we believe will become increasingly prominent in the near future.
Evidence-Based Practices. Advances in Learning and Behavioral Disabilities, Volume 26, 1–19. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026003
In this 26th volume of Advances in Learning and Behavioral Disabilities, we address one of the most important educational reforms of recent years: evidence-based practices (EBPs). As detailed in the chapters in the volume, EBPs and evidence-based education are multifaceted and ambitious concepts with broad implications for education generally, and for the education of students with learning and behavioral disabilities in particular. In this introductory chapter, we provide a context for the subsequent chapters by describing EBPs and evidence-based education, discussing their importance and potential, and noting some significant limitations related to EBPs. Additionally, we preview the chapters in the volume and examine future developments in research and practice related to EBPs in the field of learning and behavioral disorders.
WHAT ARE EVIDENCE-BASED PRACTICES?
The concept of EBPs emerged out of medicine in the latter decades of the 20th century. The field recognized a history of variable and often ineffective practices that were not aligned with research findings, referred to in medicine as the bench-to-bedside gap. Thus, medical researchers began to synthesize research findings across high-quality studies to be used in conjunction with clinical expertise to identify the most effective practices for individual patients (Sackett, Rosenberg, Gray, Haynes, & Richardson, 1996). Many other fields, including education generally and special education in particular, are faced with the same issue of highly variable and frequently ineffective practices being used despite a wealth of research findings as to what works. Accordingly, reforms related to EBPs have become prevalent in fields including agriculture, transportation, technology, and education (Slavin, 2002).
We use EBPs in this chapter to refer to specific, empirically validated practices. In other words, EBPs are programs or practices shown by sound research to meaningfully and positively impact student outcomes. Although EBP is sometimes used generically to indicate practices with empirical support, organizations (e.g., Best Evidence Encyclopedia, National Autism Center, National Professional Development Center on Autism Spectrum
Disorders, National Secondary Transition Technical Assistance Center, What Works Clearinghouse [WWC]) are increasingly utilizing systematic standards to identify EBPs to be prioritized in evidence-based education. Although the specific standards for EBPs vary across organizations within and between fields (Cook, Smith, & Tankersley, 2012), scholars identify EBPs by applying systematic criteria to the research base on a practice. The criteria applied are usually related to the (a) research design of studies (many approaches consider only group comparison and single-case research designs, from which causality can be reasonably inferred), (b) methodological quality of studies (many approaches consider only studies that meet certain quality indicators associated with internal validity), and (c) quantity of studies (recognizing that no study is perfect, many approaches require that EBPs be supported by multiple sound studies using appropriate research designs) (see Cook, Tankersley, Cook, & Landrum, 2008). Gersten et al. (2005) and Horner et al. (2005) outline standards frequently used to identify EBPs in special education based on group experimental and single-case research studies, respectively.
EBP standards are applied to the research on a practice through systematic reviews, sometimes referred to as evidence-based reviews. An evidence-based review is a specific type of literature review in which standards related to the design, quality, and quantity (and oftentimes effect size) of studies conducted on the effectiveness of a particular practice are applied to a body of research on a practice to determine whether the practice is evidence-based. Note that rather than classifying practices as evidence-based or not, some approaches for conducting evidence-based reviews (e.g., WWC, 2011) utilize a variety of categorizations that represent the strength of the research base supporting the practice.
For example, the WWC classifies practices as having positive effects, potentially positive effects, mixed effects, indeterminate effects, potentially negative effects, or negative effects. Scholars in many fields have also used the phrase evidence-based practice to refer to a broad decision-making approach to instruction that prioritizes empirically validated practices while also considering factors such as practitioner expertise and consumer needs and values (Sackett et al., 1996; Spencer, Detrich, & Slocum, 2012). As applied to education, this use of the phrase suggests that instructional choices should be made by selecting practices supported by the best available evidence (e.g., EBPs) that also (a) meet consumers’ needs and values and (b) align with practitioners’ experience and appraisals. In this way, empirical evidence plays an important role in guiding instruction, but practice is not tyrannized by research evidence (Sackett et al., 1996). This is a sensible and appropriate
use of the term EBP. However, to avoid confusion we use evidence-based education (or evidence-based special education when used specifically for learners with disabilities) when referring to a broad instructional decision-making approach that utilizes practices with meaningful empirical support.
In addition to the field’s lack of consensus regarding the meaning of EBP, a hodge-podge of terms is used synonymously with EBP to refer to practices validated as effective by bodies of sound research (e.g., research-based practice, scientifically based intervention, empirically validated treatment; see Mazzotti, Rowe, & Test, 2013). The lack of clarity regarding EBP terminology no doubt is a source of confusion to educational stakeholders (see Cook & Cook, in press). Accordingly, we encourage educators to clarify what exactly they mean when using EBP and related terminology. But we perceive these as largely semantic issues that can be addressed with relative ease. The underlying theme of EBPs and evidence-based education is important and clear, and should not be lost due to inconsistent terminology. That is, the best research available should play a prominent role in making instructional decisions and determining which instructional practices are prioritized in schools and classrooms regardless of the terms used. Indeed, these two common meanings of EBP are interrelated and complementary: evidence-based education utilizes EBPs, and EBPs are used as part of evidence-based education; and both reflect the important role that high-quality research findings should play in education. Although the inconsistent terminology may be less than clear at times, we are comfortable with educational stakeholders using different terms for EBPs and using EBP in different ways (as chapter authors do in this volume) so long as the notion that rigorous research underlies educational decisions and instruction is the clear intent.
The real danger and confusion arise when EBPs are used inappropriately to refer to practices and processes based loosely, or not at all, on sound research. Unfortunately, EBPs are becoming victims of their own success. That is, a practice being called an EBP has become a selling point for many promoters of practices, materials, and curricular programs. Thus, we’ve seen practices with dubious research support (e.g., supported by few studies, studies using designs from which causality cannot be inferred, low-quality studies) or no research support at all touted as evidence-based primarily on the basis of theory and personal experience. Such inappropriate use of EBP not only contradicts the true meaning of the term, it sours educators on the idea of EBPs and evidence-based education, making them appear as if they are just the latest in a long list of hollow and ineffectual educational reforms.
WHY AND HOW ARE EVIDENCE-BASED PRACTICES IMPORTANT?
EBPs are a logical means to address two fundamental problems in modern education: (a) low and unsatisfactory student achievement and (b) the research-to-practice gap. As Maheady, Smith, and Jabot (this volume) note, the driving force in contemporary schools is to maximize student achievement. Dating back to at least the National Commission on Excellence in Education’s (1983) A Nation at Risk report, American politicians and educators have been striving to increase student achievement (see also National Education Goals Panel, 1999; No Child Left Behind Act of 2001). Assuming that EBPs are not consistently implemented in contemporary schools and classrooms, their regular use can be expected to increase student achievement.
In addition to the problem of generally low achievement, persistent gaps between the achievement of typical learners and different at-risk groups (e.g., culturally and linguistically diverse learners, learners living in poverty, learners with disabilities) exist (Aron & Loprest, 2012; Vanneman, Hamilton, Baldwin Anderson, & Rahman, 2009). Although the achievement of all types of learners would be generally improved by the application of EBPs, their application may be particularly important for students with disabilities and other at-risk learners (Dammann & Vaughn, 2001). Typical learners and high-achieving students will likely make progress even in the absence of highly effective instruction. However, at-risk learners, such as students with learning and behavioral disabilities, require the most effective instruction to succeed in school and reach their potential.
Unfortunately, as Maheady et al. (this volume) summarized, educators tend to receive little training about identifying and using EBPs and infrequently implement EBPs with their learners.
Although we believe the situation may be slowly improving, Kauffman (1996) conjectured that an inverse relation may actually exist between the frequency of implementation of many practices and their research support. This disjoint between research knowledge and the actual instruction occurring in schools and classrooms has been dubbed the research-to-practice gap (e.g., Carnine, 1997), which Donovan and Cross (2002) suggested should actually be called a chasm. This gap between research and practice represents one of the underlying causes of low student achievement that is largely under the control of educators. That is, in contrast to issues such as student poverty, school funding, and family support, educators have considerable control over the instructional practices used in schools and classrooms.
Despite educators’ desire to maximize student performance and use effective practices, most have received mixed and sometimes misleading messages during pre- and in-service training as to what instruction is the most effective. In the absence of a firm understanding of the research literature, educators typically prioritize personal experiences and the advice of other teachers, which is prone to error (Cook & Smith, 2012). Thus, it appears that the first steps in bridging the research-to-practice gap and improving student achievement involve identifying and disseminating EBPs, with the ultimate goal of broad and appropriate implementation of EBPs (i.e., evidence-based education).
Because of the significant potential of EBPs and evidence-based education to raise student achievement and bridge the research-to-practice gap, recent educational legislation has emphasized and promoted research-based practices (Yell & Rozalski, this volume). For example, “the phrase ‘scientifically based research’ appears more than 100 times throughout the No Child Left Behind Act and ... is woven into the fabric of virtually every program in the law” (Hess & Petrilli, 2006, p. 94). And given the importance of EBPs for learners with disabilities, it is no surprise that IDEA 2004 added language requiring that students’ special education services noted in their individualized education programs be based on peer-reviewed research.
In addition to legislation, the importance of EBPs and evidence-based education can be seen in the growing number of organizations involved in identifying, disseminating, and applying EBPs (e.g., Best Evidence Encyclopedia, Campbell Collaboration, Doing What Works, Promising Practices Network, What Works Clearinghouse, Wing Institute), many of which focus on learners with disabilities (e.g., National Autism Center, National Center for Intensive Intervention, National Center on Response to Intervention, National Secondary Transition Technical Assistance Center, National Professional Development Center on Autism Spectrum Disorders).
EBP advocates tend to emphasize the benefits of identifying and implementing EBPs on student performance. However, a focus on EBPs also positively affects a number of related areas, such as teacher education, policy, and research. As Slavin (2002) suggested, EBP reforms place education “on the brink of a scientific revolution that has the potential to profoundly transform policy, practice, and research” (p. 15). To generate evidence that meets the rigorous standards required in evidence-based reviews, researchers are beginning to design and conduct studies with greater rigor. These research findings will then drive improved teacher education and educational policies, which will help bring about evidence-based education and heightened student achievement.
Despite the considerable potential for EBPs and evidence-based education to positively impact student outcomes and other aspects of education, it is important that educational stakeholders be aware of the fundamental limitations of EBPs.
LIMITATIONS OF EVIDENCE-BASED PRACTICES
EBPs are not panaceas and their use should be tempered by a number of inherent realities. Although a full discussion of the many limitations of EBPs and evidence-based education is beyond the scope of this chapter, we focus on three fundamental limitations: EBPs are not effective for all learners; identification of EBPs does not imply their implementation; and evidence-based classifications depend on the standards applied and research reviewed.
EBPs are not effective for all learners; nothing works for everyone. Although EBPs, by definition, are effective for the vast majority of learners, a relatively small proportion of learners, referred to as nonresponders and treatment resistors, fail to respond to these generally effective practices. For example, Torgesen (2000) estimated that 2–6% of learners do not respond to the most effective early reading interventions. Importantly, learners with disabilities are overrepresented among treatment resistors (Al Otaiba & Fuchs, 2006). Thus, despite the critical importance of using EBPs, educators cannot assume that EBPs will automatically be effective, especially for learners with disabilities, and therefore need to take measures to maximize the effectiveness of instruction even when using EBPs.
To maximize the positive impact of EBPs, educators should select EBPs that have been shown by high-quality research studies to work with students similar to those they teach. EBPs should not be thought of as effective for all groups of students, but rather for those populations for which they have been validated by high-quality research. For example, a teacher of elementary students with learning disabilities should select an EBP shown to work with learners with these characteristics rather than nondisabled high school students.
Another way to maximize the effectiveness of EBPs is to adapt them to fit the unique needs of learners in ways that do not alter the critical elements of the EBP that make it effective (see Johnson & McMaster, this volume). Regardless of how well participants in the research validating an EBP match one’s targeted learners and how well an EBP is adapted, the maxim that no practice is effective for everyone still adheres. Consequently, when using EBPs the progress of learners should be regularly
monitored with reliable and valid formative assessments (e.g., curriculum-based measurements) and used as the basis for instructional decisions. This is especially true for individuals with learning and behavioral disabilities because of their propensity to be nonresponders. And, although it may be patently obvious, we believe it bears emphasizing: EBPs must be delivered in a context in which effective teaching is routine (e.g., clear, quick-paced instruction with appropriate time allocated to active teaching in a variety of large and small-group arrangements, with many opportunities for learners to respond and engage, provided in a positive and well-managed setting; see Brophy & Good, 1986).
Another truism regarding EBPs is that their identification is a separate matter from their implementation. That is, just because EBPs have been identified does not imply that they will automatically be implemented (Cook & Odom, 2013; Fixsen, Naoom, Blase, Friedman, & Wallace, 2005). Indeed, effective dissemination and implementation of the most effective instructional practices has long vexed the educational field. The translation of research to practice is often conceptualized as consisting of two distinct phases (e.g., Hiss, 2004). The first phase involves conducting and synthesizing high-quality research to identify EBPs. Identification of EBPs is a necessary but far from sufficient condition for the regular and appropriate application of EBPs, which is where the rubber meets the road and student outcomes are actually improved. Research on the second phase of research translation, identifying reliable means for facilitating the broad and sustained implementation of EBPs, is messy (i.e., replete with variables that are difficult to control) and only in its infancy. Thus, advocates of EBPs should realize that their work is not done when EBPs are identified; in fact, it is just beginning.
The final inherent limitation of EBPs that we discuss here is that the evidence-based classification of a practice needs to be understood within the context of the standards used and the research reviewed. For example, we fear that some educators might disregard practices if they are not listed as an EBP. Yet a practice not being denoted as an EBP can be due to many reasons. If a practice is shown to have no effects or negative effects on student outcomes by a number of high-quality studies, educators should indeed avoid using it in almost all situations. However, many practices that are effective are not listed as EBPs simply because an evidence-based review has not yet been conducted on the practice; or because a review has been completed and found that insufficient high-quality research has been conducted on the practice to determine whether a practice is an EBP (Cook & Cook, in press). Alternatively, a practice might be listed as an EBP,
but some organizations have relatively low standards (e.g., only requiring a single study to support the practice) that might not engender strong confidence that the practice is truly evidence-based. We therefore recommend that educators (a) go beyond labels and investigate the standards applied and research reviewed to more fully understand the research base supporting practices, and (b) periodically review sources of EBPs for updated information as more research is conducted and reviewed. In addition to the inherent limitations associated with EBPs, a number of ongoing issues exist related to EBPs, particularly in relation to learners with learning and behavioral disabilities. As educators gain experience with EBPs and evidence-based education some issues are being addressed (e.g., the need to identify EBPs using systematic standards), but quite often that just opens a host of new issues (e.g., how to deal with conflicting evidence-based classifications for the same practice, how to broadly adopt and implement EBPs once they are identified). In the following section, we identify some of those issues and briefly preview how the subsequent chapters in this volume address them.
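The dependence of an evidence-based classification on the standards applied can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Study` and `Standard` types, the two example standards, and the three study records are invented for illustration and do not reproduce the criteria of any organization or review discussed in this chapter.

```python
from dataclasses import dataclass

# Designs from which causality can reasonably be inferred (per the criteria
# described earlier in the chapter). The string labels are illustrative.
CAUSAL_DESIGNS = {"group_comparison", "single_case"}


@dataclass(frozen=True)
class Study:
    design: str            # e.g., "group_comparison", "single_case", "correlational"
    high_quality: bool     # meets the reviewer's quality indicators
    positive_effect: bool  # reports a positive effect of the practice


@dataclass(frozen=True)
class Standard:
    name: str
    min_studies: int            # quantity criterion
    require_high_quality: bool  # quality criterion


def is_evidence_based(studies, standard):
    """Return True if enough supporting studies satisfy the given standard."""
    supporting = [
        s for s in studies
        if s.design in CAUSAL_DESIGNS
        and s.positive_effect
        and (s.high_quality or not standard.require_high_quality)
    ]
    return len(supporting) >= standard.min_studies


# Hypothetical research base for a single practice.
studies = [
    Study("group_comparison", high_quality=True, positive_effect=True),
    Study("single_case", high_quality=False, positive_effect=True),
    Study("correlational", high_quality=True, positive_effect=True),
]

# Two hypothetical standards: one accepts a single supporting study,
# the other requires two high-quality supporting studies.
lenient = Standard("lenient", min_studies=1, require_high_quality=False)
strict = Standard("strict", min_studies=2, require_high_quality=True)

print(is_evidence_based(studies, lenient))  # True
print(is_evidence_based(studies, strict))   # False
```

The same three studies yield opposite labels under the two standards, which is why the label alone says little without knowing the standards applied and the research reviewed.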
ONGOING ISSUES RELATED TO EVIDENCE-BASED PRACTICES
Issues related to EBPs arise and persist in part because of the relative recency of the movement in education, but also because of the complexities involved with identifying and implementing EBPs. Indeed, although some of the terminology and specific procedures associated with EBPs may be new to the field, how to identify and implement what really works is among the most elemental and confounding issues in education. And as with most educational matters, the individualized and challenging learning traits of individuals with learning and behavioral disabilities make the application of EBPs in special education that much more difficult and nuanced, yet even more important. The chapters in this volume, as briefly reviewed here, provide readers with the latest thinking on a variety of issues related to EBPs in different areas of special education.
One of the fundamental issues related to evidence-based education is what kinds of evidence educators should consider when making instructional decisions. Setting a high bar for evidence standards is one of the hallmarks of EBPs, as most approaches for establishing EBPs involve rigorous standards related to research design, quality, and quantity of supporting
research studies. In this way, educators can be sure that they are not making decisions based on misleading evidence derived from poorly conceived or implemented research. However, setting the bar too high risks disregarding meaningful research findings. It is also important to consider that, for many practices in many areas of special education, little or no high-quality research that uses designs from which causality can be inferred exists. In these situations it seems wiser to be guided by the best available evidence, even if it does not meet rigorous evidence standards, than by personal experience or fiat. Detrich, Slocum, and Spencer (this volume) take up this issue in Chapter 2. The authors argue that all evidence is imperfect and that uncertainty always exists when making instructional decisions. Educators should, then, ascertain the best available evidence to guide their thinking. Detrich et al. outline different types of evidence and highlight the uncertainties associated with each of which educators should be aware.
As the number of researchers and organizations identifying EBPs continues to expand, it is important to realize that evidence-based reviews vary in rigor and quality. Although many educators may just want to know which practices are evidence-based and which are not, the issue is not that simple: which practices get labeled EBPs depends on what standards of research quality are used and what research is reviewed. Indeed, different evidence-based reviews can and do reach contrasting findings regarding whether a particular practice is evidence-based (because they use different standards, review different research studies, or both). However, practitioners and many other educational stakeholders (e.g., parents) typically do not have the training or time to critically assess the quality of different evidence-based reviews and determine which is most trustworthy.
In Chapter 3, Schlosser, Raghavendra, and Sigafoos (this volume) discuss how synopses of evidence-based reviews, which use systematic criteria to appraise reviews, can provide guidance regarding the quality of reviews that educators can use to determine the confidence with which they should treat EBP identifications. Treatment fidelity is becoming one of the most intensely researched and debated issues in education research. For researchers investigating the effects of a practice, the concept is relatively straightforward. To meaningfully determine whether a practice works, it has to be implemented as designed, or with fidelity. For example, if critical elements of the practice are not implemented, if the practice is not administered as frequently as it should be, or if students are not involved as they should be, findings that a practice was not effective may actually reflect inappropriate implementation rather than indicate that the practice is ineffective. To ensure that an EBP is
as effective in the classroom as it was shown to be in research, it is generally recommended that practitioners implement it with high fidelity. But as Johnson and McMaster explore in Chapter 4, it may not be that simple. Treatment fidelity is a nuanced and multifaceted construct that defies simple solutions. In fact, some level of adaptation appears to be important to make the practice fit the unique learning needs of students. However, adaptations should not fundamentally alter the critical elements of an EBP that make it effective. Johnson and McMaster discuss approaches for balancing fidelity and adaptation across multiple dimensions of treatment fidelity in order to optimize learner outcomes. Because (a) causality can be reasonably inferred from well-designed single-case research studies and (b) the unique settings and characteristics of learners in special education often make group research difficult or impossible, most approaches for identifying EBPs in special education consider single-case research (e.g., Horner et al., 2005). However, effectiveness of single-case research studies has traditionally been determined by visual analysis of graphed data, making study outcomes less than fully objective and difficult to synthesize across studies (as is necessary to identify EBPs). In Chapter 5, Vannest and Davis (this volume) make a compelling argument for why single-case research should be used to help identify EBPs for individuals with learning and behavioral disabilities and provide an overview of advances in using effect sizes in single-case research. Using effect sizes in single-case research facilitates meaningful aggregation of findings across multiple single-case research studies on a practice, enabling better informed decisions about which practices are evidence-based.
As Maheady, Smith, and Jabot point out in Chapter 6, although EBPs have become a fixture in the contemporary educational landscape, they have not made significant inroads into teacher preparation. Yet teacher preparation is where teaching patterns that last through careers are established. Simply stated, without teacher preparation embracing EBPs, the research-to-practice gap is likely to remain wide. Educating teachers about which practices are evidence-based and how to make instructional decisions based on sound evidence at the beginning of their careers is considerably more feasible than changing the established practices of in-service teachers one at a time. Toward that end, Maheady et al. discuss how EBPs and evidence-based education can be infused into teacher preparation, and include examples of their own work successfully training inclusive pre-service educators to implement EBPs and make evidence-based instructional decisions. Policy and legislation circumscribe how (special) education is conducted. Thus, how EBP is defined, supported, and mandated in education legislation
is an important indicator of how EBP reforms will play out. Most educators are confused as to how legislation such as the No Child Left Behind Act and the Individuals with Disabilities Education Act (IDEA) define EBPs and just what they require of educators. Yell and Rozalski (this volume) address this issue in Chapter 7 by providing an overview of legislative requirements regarding EBPs, with special attention paid to IDEA’s requirement to base students’ special education services in their individualized education programs on peer-reviewed research. As part of their analysis, Yell and Rozalski also review the U.S. Department of Education’s interpretation of IDEA’s peer-reviewed research requirement as well as relevant administrative hearings and court cases. As we noted previously, one of the limitations of EBPs is that identification does not necessarily result in implementation. Meaningful supports appear necessary to translate research into practice. Although the emerging field of implementation science is beginning to empirically investigate how to move research into practice (see Cook & Odom, 2013), the research base regarding supports and guidelines for implementing EBPs is sparse. Accordingly, ‘‘we are faced with the paradox of nonevidence-based implementation of evidence-based programs’’ (Drake, Gorman, & Torrey, as cited in Fixsen et al., 2005, p. 35).
In addition to providing an overview of sources of evidence in the field of early childhood intervention, in Chapter 8 Trivette and Dunst (this volume) describe four types of translation research needed to build the bridge between research and practice: (a) developing EBPs from research findings, (b) establishing evidence-based implementation practices (e.g., coaching, mentoring) to support the adoption and use of EBPs, (c) evaluating the effectiveness of EBPs in real-world settings by various users, and (d) establishing effective procedures for disseminating, diffusing, and scaling-up information gained in the first three types of translation research to achieve the broad implementation of EBPs. As Simpson and Crutchfield (this volume) note in Chapter 9, the field of autism spectrum disorders (ASD) is rife with alternative and questionable treatments, and thus awareness of EBPs is particularly important for the stakeholders working with children and youth with ASD. Simpson and Crutchfield therefore describe EBPs and various sources for EBPs in the field of ASD. Importantly, though, they caution that educators should not select and adopt EBPs for learners with ASD, a population with highly variable and unique learning needs and characteristics, indiscriminately or expect them to work in ineffective and dysfunctional environments. To optimize the appropriateness and effectiveness of EBPs the authors propose
that educators select and implement EBPs within the context of the individual learning needs of targeted learners with ASD; a collaborative decision-making process representing the perspectives, preferences, and judgments of major stakeholders (e.g., the learner, parents, teachers); and a foundation of effective teaching (e.g., consistent routines, commitment to using EBPs, sufficient resources). A critical but often overlooked step in realizing the benefits of EBPs is effectively disseminating EBPs to practitioners and other stakeholders so that they can select and apply them. Most organizations that identify EBPs make their findings available on publicly accessible websites. However, these websites are typically set up by researchers, contain a wealth of information related to research methods and statistics in which most practitioners have little or no training (e.g., effect sizes, descriptions of research designs and samples in supporting studies, ratings of methodological quality), and often review scores of practices. Therefore, a website containing information on EBPs can be confusing and overwhelming to practitioners who are typically pressed for time. Moreover, many different websites exist with different approaches for identifying EBPs. Faced with the challenge of navigating this EBP maze, many practitioners understandably opt to continue with what they have been doing or use the first credible suggestion they come by. To address this situation, Santangelo, Ruhaak, Kama, and Cook provide a ‘‘one-step shopping experience’’ for EBPs related to students with learning disabilities. Chapter 10 contains straightforward information on EBPs in the area of learning disabilities from prominent organizations. 
As an EBP may differentially affect learners with different characteristics, one of the more critical and vexing issues related to EBPs is ‘‘effective for whom?’’ For example, a practice shown to work for high school students without disabilities may not be similarly effective for young children with behavioral disabilities. Accordingly, scholars conducting evidence-based reviews must determine clear and meaningful parameters for their review, which typically include an age range and disability type (e.g., elementary students identified as having emotional and behavioral disorders). Likewise, to determine the degree to which their students are similar to participants in the research studies demonstrating a practice’s effectiveness, practitioners often focus on disability type. In Chapter 11, Landrum and Tankersley propose that disability identification may not be a particularly fruitful consideration when determining for whom an EBP works. Rather, the authors suggest that specific skill deficits and behavioral excesses should be used when thinking about EBPs. For example, rather than search for practices that are effective for students with emotional and behavioral
disorders, scholars should identify practices that are effective for aggression, compliance, and attention to task. Landrum and Tankersley provide readers with information on practices shown to be effective for some of the major skill deficits and behavioral excesses associated with behavioral disorders (e.g., noncompliance, disruptive behavior, task engagement, academic skill deficits). As special education scholars in the United States, we have focused on issues related to EBPs in that context. We suspect that different countries’ experiences and challenges related to EBP will overlap significantly. Nonetheless, each country’s unique systems, histories, philosophies, legal contexts, and resources related to special education research and practice will no doubt affect the identification and application of EBPs for learners with learning and behavioral disabilities in distinct ways. In Chapter 12, Stephenson, Carter, and O’Neill (this volume) examine the role of EBPs in national and state policies, teacher accreditation standards, teacher education, and research in Australia. Although Stephenson et al. report little evidence of EBPs in the Australian education system, they do indicate a growing recognition of the importance of using empirical evidence in special education that portends optimism regarding the future of EBPs in that country.
FUTURE DIRECTIONS IN EVIDENCE-BASED PRACTICES
Undoubtedly, as special educators continue down the road toward evidence-based special education, some issues will be solved, others will persist, and new ones will arise. We close this chapter with a brief description of some of the issues that we see looming on the horizon related to EBPs in special education. The need for more high-quality research studies and evidence-based reviews will continue. High-quality, experimental studies are the foundation of EBPs and there are simply not enough of them (e.g., Seethaler & Fuchs, 2005). And researchers in the area of learning and behavioral disorders have only recently begun conducting evidence-based reviews and more are needed to comprehensively determine which practices are and are not evidence-based. Moreover, evidence-based reviews need to be updated regularly as new research is published that may alter the evidence-based status of practices. Another important function of evidence-based reviews is
highlighting what types of research need to be conducted to answer the critical question of whether a practice is evidence-based. For example, researchers might find that a high proportion of research studies are being excluded from evidence-based reviews because they do not assess treatment fidelity, thereby alerting the research community to the need to design future studies such that treatment fidelity is assessed appropriately. As more evidence-based reviews are conducted by a growing number of organizations and individuals, we foresee the need to critically analyze and synthesize these findings. The logic of the EBP movement suggests that instructional decisions are too important to be made on the basis of a single study or low-quality studies. Rather, research findings are considered sufficiently credible to influence practice when derived from bodies of high-quality studies. Similarly, given the variability that exists regarding how evidence-based reviews are conducted (e.g., different standards for high-quality research, different guidelines for which studies to include in a review), determination of evidence-based status may best be made on the basis of multiple, high-quality reviews. We recommend that researchers continue to develop and determine the psychometrics of instruments, such as those discussed by Schlosser et al. (this volume), to systematically evaluate the quality of evidence-based reviews. Reasonable scholars will continue to disagree on what specific quality indicators must be present in high-quality, credible studies; from which research designs causality can be reasonably inferred; and how many quality studies must support a practice to trust that it really works. Nonetheless, the field of education can reduce the variability in the standards used to identify EBPs by demanding empirical justification for criteria.
That is, rather than rely on expert opinion or theory to determine what constitutes a credible study, scholars can draw on meta-analyses, which are beginning to reveal which methodological elements are associated with effect sizes. For example, in their meta-analysis of intervention research in the field of learning disabilities, Simmerman and Swanson (2001) found that not reporting the reliability and validity of outcome measures was associated with larger effect sizes. This suggests that using unreliable and invalid outcome measures inflates effects and that studies included in evidence-based reviews should be required to report psychometrics for their outcome measures. Just as the EBP movement expects practitioners to base their decisions on sound empirical findings, so should scholars when determining criteria for EBPs. As noted by Santangelo et al. (this volume), many practitioners do not access information on EBPs through the websites, journal articles, and
technical reports in which they are often found. In medicine, computerized systems for providing practitioners with relevant and patient-specific information on EBPs in real time have been developed and are in use. Sim et al. (2001) defined computerized clinical decision support systems as software that [is] designed to be a direct aid to clinical decision-making, in which the characteristics of an individual patient are matched to a computerized clinical knowledge base and patient-specific assessments or recommendations are then presented to the clinician or the patient for a decision. (p. 528)
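The matching logic in this definition can be illustrated with a minimal sketch transposed to an educational setting. Everything in it is an assumption made for illustration: the class names, the fields (age and target need), and the placeholder practice labels are hypothetical and do not describe any existing system or any practice's actual evidentiary status.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearnerProfile:
    """Hypothetical characteristics of an individual learner."""
    age: int
    target_need: str  # e.g., a specific skill deficit or behavioral excess


@dataclass(frozen=True)
class KnowledgeBaseEntry:
    """Hypothetical record describing for whom a practice has support."""
    practice: str
    min_age: int
    max_age: int
    target_need: str


def recommend(profile, knowledge_base):
    """Return practices whose supporting conditions match the learner.

    The result is a list of candidates presented to the practitioner,
    who still makes the final decision.
    """
    return [
        entry.practice
        for entry in knowledge_base
        if entry.min_age <= profile.age <= entry.max_age
        and entry.target_need == profile.target_need
    ]


# Placeholder knowledge base; labels are illustrative only.
KNOWLEDGE_BASE = [
    KnowledgeBaseEntry("Practice A", 5, 11, "task engagement"),
    KnowledgeBaseEntry("Practice B", 12, 18, "task engagement"),
    KnowledgeBaseEntry("Practice C", 5, 18, "noncompliance"),
]
```

A call such as `recommend(LearnerProfile(age=9, target_need="task engagement"), KNOWLEDGE_BASE)` returns only the entries whose age range and target need match the learner; an empty list signals that the knowledge base holds no matching evidence.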
Although it will represent a significant undertaking to develop and maintain, we hope that researchers will begin work on developing computerized clinical decision support systems in education to facilitate the accessibility and application of EBPs in schools and classrooms. As the field of education refines its approach to identifying and disseminating EBPs, ongoing efforts to identify conditions and supports associated with broad and sustained EBP implementation become of increasing importance. After all, the time, effort, and expertise devoted to researching, identifying, and disseminating EBPs are basically meaningless unless EBPs are implemented. Research on implementation will likely look much different than that conducted to establish EBPs; it will have to be both relevant and rigorous, and involve practitioners and researchers working together such that the lines between research and practice are blurred (Smith, Schmidt, Edelen-Smith, & Cook, 2013). In terms of policy and practice, we expect a continued emphasis on using EBPs as a matter of course, with EBPs slowly making inroads into teacher education. However, we hope to see EBPs treated in a nuanced way that permits teachers leeway to use their expertise and meet students’ individual needs. EBPs, and research evidence more broadly, should be thought of as integral tools to be used in the decision-making process to improve teacher effectiveness and learner outcomes, not as rigid prescriptions to which educators must adhere. We envision teachers creating practice-based evidence to provide guidance as to the most effective and efficient ways to incorporate EBPs into their daily routine and instructional decisions (Smith et al., 2013).
The intersections of EBPs and special education practice, which are addressed at some level by each of the chapters in this volume, perhaps represent a new face on the time-honored foundation of special education: using high-quality empirical evidence to inform and improve (rather than dictate) instructional practices in order to meet the unique needs of learners with learning and behavioral disabilities.
SUMMARY
Special education has resulted in considerable benefits for individuals with learning and behavioral disabilities and society at large. However, the research-to-practice gap causes the outcomes of these learners to be unnecessarily low and the gap between their achievement and that of their nondisabled peers to be unnecessarily large. EBPs, or instructional practices shown by bodies of sound research to be generally effective, represent a logical means to address the research-to-practice gap in special education and improve the outcomes of learners with disabilities. Despite the significant potential of EBPs, educators should recognize a number of important limitations to EBPs (e.g., EBPs are not universally effective, implementation of EBPs does not automatically follow their identification, and EBPs must be considered in the context of the standards applied and research reviewed). Moreover, EBPs and evidence-based education involve fundamental and complex matters in education (e.g., What is an effective practice? What counts as high-quality research? How can teaching practices be changed and change sustained? What is the balance between adaptation and fidelity?). Accordingly, educational stakeholders have encountered a number of thorny issues as they have begun to identify and implement EBPs for individuals with learning and behavioral disabilities, many of which chapter authors describe in this volume. We envision that EBPs will continue to evolve and will play an increasingly important role in helping special education fulfill its mission of effectively meeting the educational needs of individuals with learning and behavioral disabilities.
REFERENCES
Al Otaiba, S., & Fuchs, D. (2006). Who are the young children for whom best practices in reading are ineffective? An experimental and longitudinal study. Journal of Learning Disabilities, 39, 414–431.
Aron, L., & Loprest, P. (2012). Disability and the education system. Future of Children, 22(1), 97–122.
Brophy, J. E., & Good, T. G. (1986). Teacher behavior and student achievement. In M. Wittrock (Ed.), Handbook of research in teaching (3rd ed., pp. 328–375). New York, NY: Macmillan.
Carnine, D. W. (1997). Bridging the research-to-practice gap. In J. W. Lloyd, E. J. Kameenui & D. Chard (Eds.), Issues in educating students with disabilities (pp. 363–376). Mahwah, NJ: Lawrence Erlbaum.
Cook, B. G., & Cook, S. (in press). Unraveling evidence-based practices in special education. Journal of Special Education. Retrieved from http://sed.sagepub.com/content/early/2011/09/08/0022466911420877.abstract
Cook, B. G., & Odom, S. L. (2013). Evidence-based practices and implementation science in special education. Exceptional Children, 79, 135–144.
Cook, B. G., & Smith, G. J. (2012). Leadership and instruction: Evidence-based practices in special education. In J. B. Crockett, B. S. Billingsley & M. L. Boscardin (Eds.), Handbook of leadership in special education (pp. 281–296). London: Routledge.
Cook, B. G., Smith, G. J., & Tankersley, M. (2012). Evidence-based practices in education. In K. R. Harris, S. Graham & T. Urdan (Eds.), APA educational psychology handbook (Vol. 1, pp. 495–528). Washington, DC: American Psychological Association.
Cook, B. G., Tankersley, M., Cook, L., & Landrum, T. J. (2008). Evidence-based practices in special education: Some practical implications. Intervention in School and Clinic, 44, 69–75.
Dammann, J. E., & Vaughn, S. (2001). Science and sanity in special education. Behavioral Disorders, 27, 21–29.
Donovan, M. S., & Cross, C. T. (Eds.). (2002). Minority students in special and gifted education. Washington, DC: National Academies Press.
Fixsen, D. L., Naoom, S. F., Blase, K. A., Friedman, R. M., & Wallace, F. (2005). Implementation research: A synthesis of the literature. Tampa, FL: University of South Florida, Louis de la Parte Florida Mental Health Institute, The National Implementation Research Network (FMHI Publication #231). Retrieved from http://ctndisseminationlibrary.org/PDF/nirnmonograph.pdf
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Hess, F. M., & Petrilli, M. J. (2006). No child left behind: Primer. New York, NY: Peter Lang.
Hiss, R. G. (2004). Translational research – Two phases of a continuum. In From clinical trials to community: The science of translating diabetes and obesity research (pp. 11–14). Bethesda, MD: National Institute of Diabetes and Digestive and Kidney Diseases. Retrieved from http://archives.niddk.nih.gov/fund/other/diabetes-translation/confpublication.pdf
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Kauffman, J. M. (1996). Research to practice issues. Behavioral Disorders, 22, 55–60.
Mazzotti, V. L., Rowe, D. R., & Test, D. W. (2013). Navigating the evidence-based practice maze: Resources for teachers of secondary students with disabilities. Intervention in School and Clinic, 48, 159–166.
National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. Washington, DC: U.S. Government Printing Office.
National Education Goals Panel. (1999). The National Education Goals report: Building a nation of learners. Washington, DC: U.S. Government Printing Office.
Sackett, D. L., Rosenberg, W. M., Gray, J. A., Haynes, R. B., & Richardson, W. S. (1996). Evidence based medicine: What it is and what it isn’t. British Medical Journal, 312, 71–72.
Seethaler, P. M., & Fuchs, L. S. (2005). A drop in the bucket: Randomized controlled trials testing reading and math interventions. Learning Disabilities Research & Practice, 20, 98–102. Retrieved from http://dx.doi.org/10.1111/j.1540-5826.2005.00125.x
Sim, I., Gorman, P., Greenes, R. A., Haynes, R. B., Kaplan, B., Lehman, H., & Tang, P. C. (2001). Clinical decision support systems for the practice of evidence-based medicine. Journal of the American Medical Informatics Association, 8, 527–534.
Simmerman, S., & Swanson, H. L. (2001). Treatment outcomes for students with learning disabilities: How important are internal and external validity? Journal of Learning Disabilities, 34, 221–236.
Slavin, R. E. (2002). Evidence-based education policies: Transforming educational practice and research. Educational Researcher, 31(7), 15–21.
Smith, G. J., Schmidt, M. M., Edelen-Smith, P., & Cook, B. G. (2013). Pasteur’s quadrant as the bridge linking research and practice. Exceptional Children, 79, 147–161.
Spencer, T. D., Detrich, R., & Slocum, T. A. (2012). Evidence-based practice: A framework for making effective decisions. Education & Treatment of Children, 35, 127–151.
Torgesen, J. (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research & Practice, 15, 55–64. Retrieved from http://dx.doi.org/10.1207/SLDRP1501_6
Vanneman, A., Hamilton, L., Baldwin Anderson, J., & Rahman, T. (2009). Achievement gaps: How black and white students in public schools perform in mathematics and reading on the National Assessment of Educational Progress (NCES 2009-455). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC.
What Works Clearinghouse. (2011). Procedures and standards handbook (version 2.1). Retrieved from http://ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_procedures_v2_1_standards_handbook.pdf
CHAPTER 2
EVIDENCE-BASED EDUCATION AND BEST AVAILABLE EVIDENCE: DECISION-MAKING UNDER CONDITIONS OF UNCERTAINTY
Ronnie Detrich, Timothy A. Slocum and Trina D. Spencer
ABSTRACT
Special educators make countless decisions regarding services for students with disabilities. The evidence-based practice movement in education encourages those decisions be informed by the best available evidence, professional judgment, and client values and context. In this chapter we argue that while evidence is the best basis for making decisions it is imperfect and uncertainty about the evidence-base for decisions will always exist. We outline three classes of evidence and the sources of uncertainty for each. Finally, we describe a framework for integrating these different sources of evidence as a means for increasing confidence in evidence-based decisions.
Evidence-Based Practices Advances in Learning and Behavioral Disabilities, Volume 26, 21–44 Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026004
RONNIE DETRICH ET AL.
DOUBT IS NOT A PLEASANT CONDITION BUT CERTAINTY IS ABSURD – VOLTAIRE
Special educators across all levels of service (including teachers, school psychologists, principals, and other school administrators) make countless decisions regarding educational services for students. For the last decade, the basis for those decisions has come under scrutiny. Consequently, the evidence-based practice movement has emerged in education to strengthen the basis of decision-making. This influence can be seen in the way curriculum decisions are made, the way that curriculum developers market their materials (often making claims their curriculum is evidence-based), and the rise of data-based decision systems such as response to intervention (RtI; Jimerson, Burns, & VanDerHeyden, 2007) and positive behavior interventions and support (PBIS; Sugai & Horner, 2009). It can be argued that the primary function of the evidence-based practice movement is to influence how educational decisions are made. In some instances, there is clear, compelling evidence to support a decision. Unfortunately, for many other decisions, the evidence is less convincing and there is greater uncertainty about the proper decision. In this context, uncertainty refers to the lack of confidence a practitioner has that a decision will result in achieving the desired outcome(s) for a specific problem in a specific context. Uncertainty is inherent in any educational decision. Most organizations’ standards for professional conduct implicitly recognize the uncertainty of decision-making by defining progress monitoring as responsible conduct (see, e.g., standards from National Association of School Psychologists, 2010; Behavior Analyst Certification Board, 2010). If complete certainty regarding the effects of instructional decisions existed, progress monitoring to evaluate the effects of an intervention would be unnecessary.
Few educators fully understand the meaning of evidence-based practice; it is more than using evidence as the sole basis for decisions (Spencer, Detrich, & Slocum, 2012). Evidence-based practice has commonly been defined as the integration of best available evidence, professional judgment, and client values and context (American Psychological Association, 2005; Sackett, Straus, Richardson, Rosenberg, & Haynes, 2000; Whitehurst, 2002) to inform decisions. Even though it is considered to be a fundamental component of evidence-based practice, best available evidence has not been well-defined or clarified. In this chapter we clarify the construct of best available evidence, suggest three complementary sources of evidence that
educators can use to make evidence-informed decisions, discuss how evidence from each source is developed, and explore the uncertainty associated with each type of evidence.
BEST AVAILABLE EVIDENCE
There are several important dimensions to best available evidence (Slocum, Spencer, & Detrich, 2012). First, evidence can be best for its relevance to an immediate practical problem. Generally, decisions should be guided by evidence that most closely matches the context of the practical problem (age, ability, setting, etc.). A second dimension of best available evidence is the strength of available evidence both in terms of quality and quantity. Generally, interventions with more and higher quality evidence should be favored over interventions with less and lower quality evidence. Often the term best available evidence has been taken to mean the highest quality of evidence with less attention paid to concerns related to abundance or relevance of evidence (Slocum, Detrich, & Spencer, 2012). The What Works Clearinghouse (WWC, 2011) exemplifies this view of evidence. If an intervention study does not meet rigorous quality standards, WWC rejects it and it is not considered in any way when judging the evidentiary status of the intervention. This raises questions about what evidence is considered ‘‘the best available.’’ There are many occasions in which little or no high-quality evidence is available and yet practitioners must make decisions (we use the term practitioner to include all educational decision makers – teachers, administrators, and policy makers). In these instances, practitioners must either base decisions on lower quality evidence or base their decisions on something other than evidence. Our concept of best available evidence recognizes that not all evidence is equal but evidence of some type is preferred to no evidence as a basis for decision-making.
While one of the functions of evidence is to guide decisions, uncertainties and risks are always associated with basing decisions on evidence even when using evidence from high-quality studies; however, risk and uncertainties associated with evidence-based decision-making are no greater and, usually, lower than when basing decisions on other criteria (Gambrill, 2012; Wilczynski, 2012). We consider the best available evidence to be ‘‘the best evidence of what is available.’’ Lower quality evidence is a better basis for decision-making than other, nonempirical criteria (Chorpita & Daleiden, 2010; Wilczynski, 2012). Accepting the best
evidence that is available allows almost all educational decisions to be guided by evidence (of some type and strength). The remainder of this chapter will discuss three classes of evidence for guiding educational decisions (empirically supported intervention reviews; alternative sources, i.e., narrative reviews, best practice committees, and empirically supported practice guides; and smaller units of analysis) and the sources of uncertainty related to each.
CLASSES OF BEST AVAILABLE EVIDENCE
Empirically Supported Intervention Reviews
Systematic reviews of the research literature have come to be accepted as the strongest method for identifying empirically supported interventions (Cordray & Morphy, 2009). We refer to these reviews as empirically supported intervention reviews (ESIR). We recommend the term empirically supported intervention because evidence-based practice has acquired multiple meanings (Spencer et al., 2012) such as an intervention (practice) that has met some evidentiary standards as being effective as well as a broader decision-making process that integrates best available evidence, professional judgment, and client values and context. ESIRs are characterized by using highly objective and well-defined procedures to minimize arbitrary decisions and judgments by reviewers. Each step in the process is guided by specific procedures and criteria that, as much as possible, are determined before the review process has begun. Systematic review procedures include a number of steps: (a) specify criteria for searching relevant literature, (b) screen the literature for relevance to the question, (c) appraise the methodological quality of each relevant study, (d) summarize outcomes for each study, and (e) rate the degree to which use of the intervention is supported by the literature (considering population, outcome, and context; Schlosser, Wendt, & Sigafoos, 2007). The WWC conducts this type of review and results are summarized in intervention reports. Other organizations such as Best Evidence Encyclopedia rely on ESIRs to determine the evidentiary status of specific interventions and make recommendations to practitioners. The systematic review process begins with explicit procedures for identifying the relevant literature to review. In this phase, the search methods are thoroughly described and are developed in such a way as to identify all of the relevant literature. These procedures describe the methods
for searching the computer databases (i.e., which databases, keywords, the beginning and ending dates for the reviewed literature). Further, they specify how the reference lists of the identified research reports will be hand-searched to locate additional relevant studies and describe when authors will be contacted to identify additional research. The goal of this step is to be as comprehensive as possible in identifying research that might be relevant.
The second step in ESIR is to screen all of the identified literature for relevance to the review question. Screening criteria outline considerations such as the relevant intervention, participant population, measures, and settings. The screening criteria specify what is considered relevant and what is considered to be outside the scope of the review. For example, for many interventions there are a number of variations in the published literature. The screening criteria specify which variations will be included in the review and which will be excluded. In some instances, the intervention of interest is combined with another intervention. Combining interventions makes it difficult for reviews to isolate the effects of the targeted intervention. The screening criteria describe how these circumstances are to be handled (i.e., eliminate all studies that have combined interventions, develop criteria that make some combinations acceptable while excluding others, or accept all studies with combined interventions). Each decision at this level affects the inclusion and exclusion of research studies from the ESIR and therefore impacts the outcomes of the review.
Once the pool of research studies meeting the relevance criteria has been gathered, each study’s methodological quality is appraised. At this point, the issue is to determine if the study has sufficient methodological quality to support its conclusions. To make this determination, the appraisal of the study begins with the experimental design. 
Randomized clinical trials (RCT) and high-quality quasi-experimental designs (QED) are almost always acceptable. Single subject designs and regression discontinuity designs are often acceptable but less so than RCT and QED. In addition to the experimental design of a study, appraisers consider other factors such as comparability of control groups, participant attrition, reliability of measures, fidelity of implementation, and possible confounding variables. The standards for all of these considerations are established prior to initiating the review. Studies that meet these standards are considered to have trustworthy findings on which meaningful conclusions about the effectiveness of the treatment can be based. The fourth step in the review process is to summarize the outcomes of all the studies that were judged to be relevant and of adequate quality. The basic procedure at this stage is to aggregate the results of the reviewed
studies. When all of the studies are group designs, it is common to calculate an effect size to estimate the educational significance of the effect. Gersten et al. (2005) suggested that an effect size of .4 or greater is necessary to consider the results to be educationally important and meaningful. Making judgments about the effects of an intervention when evaluated with single subject designs is typically based on visual analysis. There has been a lengthy discussion over the years concerning acceptable statistical methods for evaluating and aggregating single subject research findings, but to date no single superior effect size estimate has been established (Kazdin, 1976; Parker & Brossart, 2003).
The final step in ESIRs is to assign a rating of the overall strength of evidence for the reviewed studies. These ratings are based on the number of studies and quality of studies that show positive effects, no effect, or negative effects. WWC (2011) rates interventions as having a positive effect if two studies of acceptable quality (including one high-quality study) show statistically significant positive effects and no studies show negative effects that are statistically significant. Other groups have different rating schemes and standards for assigning a rating. For example, Gersten et al. (2005) suggested four acceptable quality or two high-quality studies with a weighted effect size significantly greater than zero for an intervention to be considered empirically supported. Since single subject designs typically include few participants, rating systems usually require more studies to make statements about the evidentiary status of the intervention (Horner et al., 2005; Kratochwill et al., 2010).
Sources of Uncertainty with Empirically Supported Intervention Reviews
ESIRs are highly systematic and clearly articulate criteria for determining “worthy” research studies and conclusions based on them. 
It would seem that with a review process that is conducted so systematically and carefully there would be few sources of uncertainty; however, that is not the case. These reviews can be done with greater or lesser quality. There are two general sources of uncertainty for any type of evidence: (a) validity of the conclusions and (b) appropriateness of generalizations from the research base to a specific practical problem. These two sources of uncertainty are elaborated below; however, this is not intended to be an exhaustive description of all relevant uncertainties.
Validity of Review Conclusions as a Source of Uncertainty
One concern related to the validity of a review’s conclusions is the thoroughness of the review (Slocum, Detrich, et al., 2012). Did the search
process identify all of the relevant research? Each of the decisions about what to include or exclude can have significant impact on the results of the review and the statements that are made about the evidentiary status of an intervention. Each decision is essentially a cut point and changing where the cut point falls alters the research that is included or excluded from the review and ultimately the statements made about the intervention (Slocum, Detrich, et al., 2012). For many practical and ethical reasons, much of the research in special education is based on single case research designs. If the review process excludes studies based on single case methods, which is common, then it is reasonable to ask if all of the relevant literature has been gathered. If all single case studies have been excluded how might this affect the outcomes of the review? It is not uncommon for ESIRs to exclude more studies than are included in the review. Studies are excluded for a variety of reasons (e.g., research design, research quality) even though they appear in peer-reviewed journals. For example, the National Autism Center’s National Standards Project (2009) reviewed over 7,000 abstracts of articles and ultimately reviewed 775 articles. Approximately 10% of the initial pool of studies met inclusion criteria. A high exclusion rate creates uncertainty regarding how well the pool of the reviewed studies represents the whole body of literature. Ensuring that all the relevant studies are included in the review helps to reduce uncertainty, but there are other ways uncertainty impacts the validity of the conclusions. The procedures for appraising studies’ methodological quality can be inappropriate or unreasonable. There are a number of tools designed to assist reviewers in the appraisal of methodological quality (Wendt & Miller, 2012). 
Even so, there are choices to be made regarding the best tool for the collection of studies and different appraisal tools can yield drastically different results (O’Keefe, Slocum, Burlingame, Snyder, & Bundock, 2012). The validity of review conclusions is reduced and uncertainty increased when multiple sets of appraisal procedures lead to disparate conclusions. Uncertainty is also related to the appropriateness of the methods used for summarizing the outcomes of individual studies and across studies. Because there is no generally agreed upon statistical measure to aid in the analysis of the results of single case studies, there are important questions about the interpretation of the results. Typically, when single case designs are included there are explicit procedures for considering the visual data (Horner et al., 2005; Kratochwill et al., 2010); however, these procedures depend on the judgment of those reviewing the studies. One of the sources of uncertainty with single case designs is whether or not independent reviewers consistently
apply the criteria for evaluating the graphed data. There are two components to the consistency concern. The first component is the degree to which independent reviewers agree when rating the same research article (inter-rater agreement). The lower the agreement score between reviewers, the less confidence the practitioner should have in the results of the review. The second component is the consistency of ratings by the same reviewer over time (i.e., intra-rater agreement or observer drift; Haynes, Richard, & Kubany, 1995). The concern here is that an individual rater may, over time, apply different criteria when reviewing single case research studies for an empirically supported intervention review. As a consequence of this drift, an individual reviewer may apply different standards to different studies, which could ultimately affect which studies are included in the review. Observer drift can be minimized by periodically having the reviewers calibrate their application of the review criteria against a standard. To the extent that efforts to control observer drift are not documented, there is some uncertainty about the evaluation of the data from the studies.
Appropriateness of Generalizations as a Source of Uncertainty
Even if the concerns about the validity of the review process are satisfied, there are a number of sources of uncertainty about the reasonableness of the generalization from the ESIR to real-world instructional settings. Research studies may or may not reflect the conditions of typical classroom practice. Some studies are designed with the primary purpose of evaluating effects of a specific intervention on a specific outcome – these studies may sacrifice “ecological validity” for experimental rigor. Even studies that are carried out under more realistic and typical conditions may have important differences from the context in which a particular practitioner is working. 
These differences between the conditions in the experiment and the specific practice context produce uncertainty about whether the research results reflect the effects that are likely to be obtained in practice. In the search for unambiguous results, the experimenter designs the study to eliminate or reduce confounding variables. To do this, the participants in the study are often carefully defined and any potential participants who do not meet specific criteria are eliminated from the study. This often screens out students with co-morbid conditions. In special education, it is common that students have co-morbid conditions that may moderate the effects of the treatment. For example, many students with learning disabilities also have Attention Deficit Hyperactivity Disorder (ADHD), depression, and other psychosocial difficulties. If the participants in the research are
restricted to those with learning disabilities and no other conditions, the practitioner may have greater uncertainty about the appropriateness of the intervention for the identified population. In general, practitioners should be aware that differences between the participants in the research and those to whom the intervention will be applied are a potential source of uncertainty.
The teaching staff and resources brought to bear in research studies may or may not reflect typical practice. Some studies are carried out with a large staff of well-trained research assistants and material resources that are unrealistic for typical practice. Other research is designed to reflect more typical service settings: researchers work at “arm’s length” from the implementation, and the research takes place in typical service settings, is implemented by the usual teaching staff, and operates with the resources of the setting. When the ESIR is based on a large number of relatively artificial studies, it is reasonable to question the appropriateness of generalizing from the research base to a typical service setting. Several researchers have suggested that the differences between the conditions of research and practice settings are one of the reasons that practitioners often reject research evidence as a basis for selecting treatments (Kazdin, 2004; Schoenwald & Hoagwood, 2001).
Reviews commonly find that intervention effects are smaller in studies that are more similar to typical practice. One reason for the reduced effects is lower implementation fidelity (Walker et al., 2009). Studies that do not ensure high fidelity may provide less exposure to the intervention (e.g., once per week rather than the recommended three times per week), less intense or shorter sessions (e.g., 15 minutes per session rather than the recommended 45 minutes), and inaccurate implementation of the components of the intervention. 
For the practitioner, these possible explanations for the lower levels of effectiveness are sources of uncertainty. Practitioners may decide that the intervention is not appropriate for the current practical problem because they have not been adequately trained to implement it, the necessary organizational structures are not in place to support the intervention, or the resources for effective implementation are scarce. One of the difficult decisions that must be made when conducting an ESIR is which variations of the intervention will be included in the review. For example, most variations of the Good Behavior Game (Barrish, Saunders, & Wolf, 1969; Kellum et al., 2008) involve some form of penalty for rule violations; however, some variants reinforce rule compliance and ignore rule violations (Darch & Thorpe, 1977). If each of these variations is included in the ESIR the broadly defined treatment may receive a higher
rating than would be assigned if each variation was reviewed on its own. The advantage to including variations of a treatment is that it broadens the research base of the review. However, including multiple variations can also increase uncertainty about the effectiveness of a specific version of the treatment. One of the debates in evidence-based practice surrounds the degree to which interventions must be implemented as they are described in the research (adherence) or can be adapted to fit local circumstances (adaptation). From an adherence perspective, it is argued that adapting the intervention may change it in ways that will reduce or negate its effectiveness (Elliott & Mihalic, 2004). The adaptation perspective is that it is often necessary to adapt an intervention so that it fits within the specific practice situation and therefore may be more likely to be implemented with fidelity and result in positive benefit (Albin, Lucyshyn, Horner, & Flannery, 1996; Detrich, 1999). This is a compelling argument but, to date, there is no direct evidence that adapted interventions are implemented with greater fidelity; however, there is suggestive evidence that this is the case. Klingner, Vaughn, Hughes, and Arguelles (1999) in a survey of teachers who had been implementing a reading program for 3 years reported that the ability to adapt the program contributed to sustained use. Perhaps adaptation of practices is inevitable, implying that uncertainty is also inevitable. Adapting an intervention may fundamentally change the intervention in ways that reduce benefit. ESIRs typically provide no guidance in how to adapt interventions to produce a good fit with local circumstances (Cook, Tankersley, & Harjusola-Webb, 2008). Prior to implementation there is a great deal of uncertainty about how the adaptation will affect the treatment.
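Two quantities from the ESIR process discussed in this section lend themselves to a small worked example: the effect size used to summarize group-design studies and the inter-rater agreement used to check consistency among reviewers. The sketch below is illustrative only. It assumes Cohen's d with a pooled standard deviation as the effect size (the chapter does not name a specific estimator) and simple percent agreement as the agreement index. The function names and the sample scores are hypothetical. The .4 threshold is the value attributed above to Gersten et al. (2005).

```python
from math import sqrt
from statistics import mean, stdev

# Threshold attributed in the chapter to Gersten et al. (2005).
MEANINGFUL_EFFECT_SIZE = 0.4


def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: Cohen's d is used here only as one common estimator.
    """
    n_t, n_c = len(treatment), len(control)
    pooled_variance = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)


def is_educationally_meaningful(effect_size):
    """Apply the .4 criterion to a computed effect size."""
    return effect_size >= MEANINGFUL_EFFECT_SIZE


def percent_agreement(rater_a, rater_b):
    """Share of studies on which two reviewers gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same set of studies.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical post-test scores for two small groups.
treatment_scores = [12, 14, 16, 18, 20]
control_scores = [10, 12, 14, 16, 18]
d = cohens_d(treatment_scores, control_scores)

# Hypothetical include/exclude decisions by two independent reviewers.
reviewer_1 = ["include", "exclude", "include", "include"]
reviewer_2 = ["include", "exclude", "exclude", "include"]
agreement = percent_agreement(reviewer_1, reviewer_2)
```

Percent agreement is the simplest index and does not correct for chance agreement, so a review that reports only this figure leaves some of the uncertainty described above unresolved.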
ALTERNATIVE SOURCES OF EVIDENCE
In order to reduce uncertainties that arise from ESIR and to provide evidence that can support a wider range of important educational decisions, the practitioner can consult other sources of evidence. The ESIR is an excellent source of evidence when there is a great deal of high quality and relevant research available for review. However, many special education interventions lack sufficient evidence to conduct a meaningful ESIR. In addition, relatively few ESIRs, which are very time-consuming to complete, have been conducted, even for practices with considerable research findings. In these circumstances, evidence-based practitioners must
still make decisions based on the best available evidence. If ESIRs cannot provide sufficient guidance then there is merit in considering other ways of drawing practice recommendations from the best available evidence. There are several alternative sources for evidence that can guide practitioner decision-making – narrative reviews, best practice committees, and empirically supported practice guides. Each of these approaches is a means of summarizing evidence that can guide decision-making when there is insufficient evidence from ESIRs. Even when an ESIR is available, these sources can provide additional guidance by elaborating on the recommendations from the ESIR and providing greater detail about issues of implementation. Each of these sources of evidence can provide useful guidance but each also has its own uncertainties.
Narrative Review of Research
The use of narrative reviews to identify best available evidence has a long tradition in education. Narrative reviews are reviews of a body of literature without the systematic and transparent procedures for screening, appraising, and summarizing studies. They are perhaps best exemplified by the National Association of School Psychologists series Best Practices in School Psychology (Thomas & Grimes, 2008). In this series, editor-selected authors discuss themes and cite literature in a particular area that, in their expert opinion, is most relevant and important. Narrative reviews are also commonly published in journals and edited books. The strength of these reviews is that the authors consider a broad range of literature, potentially including indirect evidence of the likely effectiveness of interventions and providing details that are not available in an ESIR. The very specific focus of an ESIR does not allow for the inclusion of relevant evidence from sources other than evaluations of the specific intervention under review. For example, a practitioner might be interested in developing a class-wide behavior management system for a self-contained special education classroom. Such a system is likely to require multiple components such as a means for recognizing appropriate student behavior, a class-wide motivational system, a means for individualizing the system for some students, recommended seating arrangements, the structure of classroom rules, a system for responding to misbehavior, and features of instruction that facilitate appropriate behavior. ESIRs are not well-suited to address such broad issues even though there has been a great deal of research on many of these topics.
Best Practice Committees
Another source of evidence that relies on experts to provide evidence-based guidance to practitioners is best practice committees (Spencer et al., 2012). The assumption is that a panel of experts representing multiple perspectives can produce a more thoughtful and less biased set of recommendations than a single author or small group of authors does in a narrative review. Having multiple experts working on recommendations is considered a strength of this source of evidence. Much of the validity of the recommendations from the committee is derived from the collective expertise of the committee rather than the rigor of the process for deriving recommendations.
In the mid-1990s, the first author (RD) served on a best practice committee organized by the state of California to develop recommendations for serving students with autism. There were 47 members on the committee representing parents, education, medicine, psychology, child development, speech and language, and state policy makers. Recommendations were made regarding parent involvement, assessment, educational practices, and health concerns. The breadth of the committee gave it credibility even though the large number of participants made the process of developing recommendations complex. The committee developed a decision rule that a practice would be considered best practice if 80% of the committee members approved it. One of the effects of this rule was that a relatively small number of committee members could block a practice from being considered best practice. This represented truth by consensus, which can be very different from scientifically established truth. The risk of this type of decision-making is that it allows social-political processes to intrude into the practice of making recommendations, and voting blocs can emerge to promote or exclude specific practices. 
The decision-making rule was not described in the final report of the committee, which may have resulted in readers assuming the practices were always supported by well-established scientific evidence.
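The arithmetic behind the blocking effect of the 80% rule can be sketched as follows. This is an illustration only: it assumes that all 47 members vote on every practice and that the rule means "at least 80% approve", neither of which is stated in the account above. The function names are hypothetical.

```python
def approvals_needed(members: int, threshold_pct: int) -> int:
    """Smallest number of approving votes that reaches threshold_pct percent.

    Integer ceiling division is used to avoid floating-point rounding.
    """
    return (members * threshold_pct + 99) // 100


def blocking_coalition_size(members: int, threshold_pct: int) -> int:
    """Smallest number of dissenting members that prevents approval.

    Assumption: every member votes either to approve or not to approve.
    """
    return members - approvals_needed(members, threshold_pct) + 1


COMMITTEE_SIZE = 47   # from the chapter
THRESHOLD_PCT = 80    # from the chapter

needed = approvals_needed(COMMITTEE_SIZE, THRESHOLD_PCT)
blockers = blocking_coalition_size(COMMITTEE_SIZE, THRESHOLD_PCT)
print(f"{needed} approvals needed; {blockers} members can block a practice")
```

Under these assumptions, roughly one fifth of the committee voting together would be enough to keep a practice off the list, regardless of the research evidence for it.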
Empirically Supported Practice Guides
Practice guides are a relatively recent development as a source of evidence for practitioners. They combine some of the flexibility and latitude of expert judgment of narrative reviews and best practice committees with the rigor of ESIRs. Empirically supported practice guides originated with the U.S. Institute of Medicine (Field & Lohr, 1992) and are considered to be the gold
standard for summarizing the best available evidence in medicine (Holland, 1995). This approach to making evidence-informed recommendations was used to develop guidelines for educating children with autism (Lord & McGee, 2001) and is the approach used by the Institute of Education Sciences (IES) for developing their practice guides (e.g., Shanahan et al., 2010). The goal of the IES practice guides is to bring “the best available evidence and expertise to bear on the types of systemic challenges that cannot be currently addressed by single interventions or programs” (Shanahan et al., 2010, p. 43). The important innovation of practice guides is that they integrate expert judgment with a systematic review of relevant research. For each recommendation in the IES practice guides the strength of evidence (determined through an ESIR using WWC standards) is reported. The strength of evidence falls into one of three categories: (a) strong evidence, (b) moderate evidence, and (c) minimal evidence (e.g., Shanahan et al., 2010). Following the ESIR, the practice guides are developed and subjected to extensive peer review by experts who were not involved in the panel.
The recommendations found in the IES practice guides are much broader than those found in the ESIR. Typically, IES ESIRs focus on specific interventions, programs, or curricula (e.g., Reading Mastery, Success for All) whereas the IES practice guides offer recommendations regarding broader, systemic issues and implementation parameters. For example, the practice guide, Using Student Achievement Data to Support Instructional Decision Making (Hamilton et al., 2009) includes the following recommendations: (1) make data part of an ongoing cycle of instructional improvement, (2) teach students to examine their own data and set learning goals, and (3) provide supports that foster a data-driven culture within the school. All are rated as being supported by minimal evidence. Slocum, Spencer, et al. 
(2012) reported that across the 14 IES practice guides and the 78 recommendations published by January 2012, 45% of the recommendations had minimal support, 33% had moderate support, and 22% had strong support. This suggests that panels of experts often make recommendations for which the evidence does not meet the standards for strong support, even when the panel is explicitly informed by an ESIR. One interpretation of this is that ESIRs alone will fail to identify many important recommendations that are clear to a well-informed panel of experts, and thus the expert recommendations may add substantial value beyond that of an ESIR. At the same time, recommendations that have only minimal empirical support should be seen for what they are: expert opinions rather than empirically supported interventions.
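To give a sense of scale, the reported percentages can be converted back into approximate numbers of recommendations. The sketch below is a back-calculation from the rounded percentages cited above, not a set of counts reported by Slocum, Spencer, et al. (2012), so the results should be read as approximate. The variable and function names are hypothetical.

```python
TOTAL_RECOMMENDATIONS = 78  # from the chapter

# Rounded percentages as cited in the chapter.
REPORTED_SHARES_PCT = {"minimal": 45, "moderate": 33, "strong": 22}


def implied_counts(total, shares_pct):
    """Approximate count per evidence level implied by rounded percentages."""
    return {level: round(total * pct / 100) for level, pct in shares_pct.items()}


counts = implied_counts(TOTAL_RECOMMENDATIONS, REPORTED_SHARES_PCT)

# Share of recommendations that fall short of "strong" evidence.
below_strong_pct = REPORTED_SHARES_PCT["minimal"] + REPORTED_SHARES_PCT["moderate"]

for level, count in counts.items():
    print(f"{level}: about {count} of {TOTAL_RECOMMENDATIONS} recommendations")
print(f"{below_strong_pct}% of recommendations lack strong support")
```

Read this way, more than three quarters of the recommendations rest on less than strong evidence, which is the pattern the surrounding discussion interprets.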
Sources of Uncertainty in Alternative Sources of Evidence
These alternative sources of evidence allow for expert judgment of the importance, relevance, and implications of the evidence. This judgment can be a strength of these types of reviews, but it is also a source of uncertainty. Since judgment can be a source of bias as well as insight, much of the uncertainty in these alternative sources of evidence stems from questions about the expertise and potential biases of the authors. In narrative reviews published as journal articles, the authors are generally self-appointed. In edited books such as the Best Practice series, the editors select authors. Best practice committees and panels for practice guides are generally selected by the organization conducting the process. Ostensibly, the authors are experts in their area but the selection process is not sufficiently transparent to assure the practitioner that the author is the best choice for the review and will provide an unbiased interpretation of the research evidence. At best, readers must depend on brief summaries of qualifications and/or the legitimacy of the organization that has made the selections. For many readers, the experts selected by IES may raise less uncertainty than those selected by a local advocacy group or the author of a journal article.
The process of reviewing research for the purpose of deriving practice recommendations is not explicit and transparent in these sources that draw upon expert judgment. In narrative reviews, there is often little transparency about how the author selects the themes, interventions, and particular studies for review. There are often no explicit criteria for describing how the research was considered and the basis for inclusion or exclusion in the review. It is possible that the authors have ignored areas of research that are not consistent with their perspective. 
It is unreasonable to expect any author to be completely neutral when reviewing and drawing conclusions and the narrative review provides no systematic methods to constrain these biases. This requires the practitioner to ‘‘trust’’ the authors and this trust may be a function of the shared biases of the author and the practitioner (Slocum, Spencer, et al., 2012). When committees derive recommendations, there are often few systematic methods and little transparency about how the recommendations are actually developed. Since the recommendations reflect the collective expertise of the committee, the decision-making process becomes important. If the committee was designed to reflect maximum diversity of perspectives, it may be difficult for the committee to reach consensus about which research base to consider, the process for deciding what is a best practice, and the resulting recommendations. The recommendations may be the best
practices that can be agreed on by the committee rather than those that are supported by the best available evidence. If the committee is drawn narrowly, it may sample a portion of the research literature that is consistent with the committee’s perspective or it may reflect a narrow view of best practices and may not be perceived as credible by the larger community. IES recognizes the potential for bias and includes the following statement in each practice guide:
In some cases research does not provide a clear indication of what works, and panelists’ interpretation of the existing (but incomplete) evidence plays an important role in guiding the recommendations. As a result, it is possible that two teams of recognized experts working independently to produce a practice guide on the same topic would come to very different conclusions. Those who use the guides should recognize that the recommendations represent, in effect, the advice of consultants. However, … practice guide authors are nationally recognized experts who collectively endorse the recommendations, justify their choices with supporting evidence, and face rigorous independent peer review of their conclusions. (Shanahan et al., 2010, p. 43)
One of the strengths of practice guides is the explicit strength-of-evidence rating for each recommendation. This allows experts to connect the dots and provide recommendations while also reporting the degree to which these recommendations are directly and specifically supported by research evidence. In addition, peer review may provide some protection against bias in these sources of evidence. The process of developing IES practice guides includes several rounds of review, and narrative reviews that are published in journals are subjected to peer review. Edited volumes and best practice committees may or may not include peer review prior to publication. Peer review can reduce uncertainty associated with expert judgments.
As with ESIRs, these alternative sources of evidence are subject to uncertainty from the difficulties of generalizing from the research base to a specific educational situation. This is not surprising given that all these types of reviews are drawing from a research base. There are no established criteria to guide the practitioner in making these judgments. This puts a burden on the practitioner and the adequacy of these judgments will vary across practitioners as a function of their training and experience. However, authors of these sources may offer expert guidance on the limits of generalizability of the recommendations.
Summary of Alternative Sources of Best Available Evidence
Narrative reviews, best practice committees, and practice guides share several features. They all depend on the expertise of the individuals involved
and their validity is derived from this expertise rather than solely from the rigor of their methods. A common validity problem for each type of review is that there is little transparency about the selection process and the process for generating recommendations. Still, these sources of evidence serve as a valuable resource when the more rigorous ESIRs are not available and serve as a complement to ESIRs by providing detail and guidance that is not available in ESIRs. Often, ESIRs provide guidance about which intervention to adopt, but they do not provide guidance about how to implement it or how to adapt it for a better contextual fit. The flexibility of these alternative sources of evidence fills an important gap in the evidence base. They have the capacity to consider a broad range of research and to consider evidence that is indirectly related to the topic.
SMALLER UNITS OF ANALYSIS
It is appropriate to consider evidence regarding the effectiveness of various kinds of “interventions” including comprehensive curriculum and instructional programs (e.g., Success for All), multicomponent packages (e.g., Good Behavior Game), specific techniques (e.g., error corrections), and principles of behavior (e.g., reinforcement; Cook & Cook, 2011). Depending on the practical problem, evidence about any of these types of “interventions” may be useful in developing effective educational services. ESIRs, narrative reviews, and best practice committees typically address large units of analysis and define the “intervention” at the level of macro-practices (Cook & Cook, 2011). In this section we consider three smaller units of interventions (practice elements, kernels, and principles) as additional sources of best available evidence. Cook and Cook (2011) referred to these units as micro-practices. They are especially useful in situations in which there are no identified empirically supported interventions and in situations where practitioners are required to either adapt an empirically supported intervention to local circumstances or to develop an individualized intervention. Developing an individualized intervention is most likely for those students needing an intensive level of support. In these circumstances, practice elements, kernels, and principles may represent the best available evidence option as a basis for developing tailored interventions.
Practice Elements
Practice elements are elements or components that are commonly found in effective treatments (Chorpita, Becker, & Daleiden, 2007; Chorpita,
Best Available Evidence and Uncertainty
Daleiden, & Weisz, 2005). Chorpita and colleagues observed that often there are several manualized treatments for a problem and these different treatments contain many common elements. If these manualized treatments have evidentiary support and they contain common elements then it is logical to assume that these common elements are likely to be effective even if they have not been evaluated in isolation. Chorpita et al. (2007) identified the practice elements that were present in treatments that had been shown to be effective for youth with behavioral and emotional needs. For example, 145 studies showed effective treatments for anxious and avoidant behavior problems. Analysis revealed that 82% of the successful treatments included exposure, 42% included relaxation, 40% used cognitive strategies, 32% used modeling, 29% included psycho-education, and 27% used praise/rewards. Again, it must be emphasized that the individual elements have not been evaluated and that the evidence supporting their use is indirect. While this may not be the best possible evidence, it may be the best available evidence in some circumstances. Utilizing practice elements has considerable appeal for practitioners. It allows them to tailor interventions for individual clients rather than using a multicomponent intervention that may not be a good contextual fit. It also allows the practitioner to develop and implement interventions that are consistent with their training and experience. Although it is unlikely that practitioners will have the necessary training to effectively implement dozens of different intervention manuals, they likely will have expertise in many of the common practice elements. The logic of practice elements should be attractive to special educators who are often required to individualize and adapt interventions for their students. 
Practice elements allow the practitioner great flexibility in developing interventions that reflect the practitioner’s skills and training, the needs of individual students, and the school and classroom context.
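The tallying logic behind a practice-element analysis can be illustrated with a minimal sketch. The protocols and element codes below are hypothetical toy data, not the data set analyzed by Chorpita et al. (2007); the function name `element_frequencies` is likewise an assumption made for illustration. The sketch only shows how the percentage of effective treatments containing each element might be computed once each treatment has been coded.

```python
from collections import Counter

# Hypothetical coding of four effective treatment protocols (toy data only).
# Each set lists the practice elements identified in one protocol.
protocols = [
    {"exposure", "relaxation", "praise/rewards"},
    {"exposure", "cognitive strategies"},
    {"exposure", "modeling", "relaxation"},
    {"psycho-education", "cognitive strategies"},
]


def element_frequencies(protocols):
    """Return {element: percent of protocols containing it}, most common first."""
    counts = Counter(element for protocol in protocols for element in protocol)
    total = len(protocols)
    return {element: 100.0 * n / total for element, n in counts.most_common()}


freqs = element_frequencies(protocols)
for element, pct in freqs.items():
    print(f"{element}: {pct:.0f}%")
```

As the chapter cautions, such percentages describe how often an element co-occurs with effective packages; they are indirect evidence and do not show that the element is effective in isolation.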
Kernels

Embry and Biglan (2008, p. 75) described kernels as “fundamental units of behavioral influence that appear to underlie effective prevention and treatment for children, adults, and families” (see also Embry, 2004). Kernels are similar to practice elements in that they are smaller units of interventions that can be combined with other kernels to form unique interventions. The most important distinction between the two is that kernels must be directly evaluated for effectiveness (Embry, 2004; Embry & Biglan, 2008). Kernels fall into four distinct classes: (1) consequences of
RONNIE DETRICH ET AL.
behavior, (2) instruction, (3) motivation, and (4) biological functions. Praise, mystery motivators, timeout, and overcorrection are examples of kernels addressing consequences of behavior. Examples of instructional kernels include cooperative structured peer play and self-monitoring. Motivational interviewing and public commitment are motivational kernels. Kernels that affect biological functions include omega-3 fatty acid, aerobic play, and progressive muscle relaxation. Embry and Biglan have argued that kernels can be implemented easily and cost effectively. In addition, they are more flexible than multicomponent treatment packages in the sense that they allow practitioners to generalize their use to new settings and situations.
Principles

Research-based principles of behavior and instruction, including such principles as reinforcement, extinction, and stimulus control, can provide practitioners with guidance in evidence-based practice. Principles are established through basic research in psychology, then validated in applied research across many specific populations and contexts. Principles are the building blocks that underlie all effective interventions, from large, multicomponent packages to small units such as practice elements and kernels. For example, the core principle underlying both Mystery Motivators (Moore, Waguespack, Wickstrom, Witt, & Gaydos, 1994) and First Step to Success (Walker et al., 1998) is reinforcement for appropriate behavior. Similarly, a high rate of opportunities to respond is the instructional principle that informs such interventions as discrete trial training (Smith, 2001) and student response cards (Narayan, Heward, Gardner, Courson, & Omness, 1990). Respondent extinction is the principle that accounts for the effectiveness of exposure as an element of anxiety treatment (Chorpita et al., 2007), and punishment is the basis of the kernel overcorrection (Embry & Biglan, 2008). Evidence about principles can inform the development or adaptation of an intervention. Evidence regarding principles can complement other sources of evidence or may be the best available evidence for some problems.
Sources of Uncertainty for Smaller Units of Analysis

The evidence base for the effectiveness of practice elements rests on the logic that since the elements are common across many effective treatment
packages, they are likely to be effective when implemented as components of other packages. While the reasoning may seem compelling, uncertainty arises from the fact that the elements themselves have not been directly tested and demonstrated to be effective. Kernels and directly validated principles do not share this source of uncertainty. Kernels have been empirically examined as stand-alone interventions and as components of intervention packages. Principles generally have an even stronger research base including both basic and applied research, and spanning numerous populations and settings. Although they are well supported by research, working with principles of behavior and instruction requires the most judgment by practitioners and has the greatest uncertainty associated with implementation. Principles are broad statements about factors that affect learning and behavior; they are not ready-to-implement techniques. Powerful principles do not always translate into effective interventions for both technical and social reasons. In addition, principles are usually general statements such as the principle of reinforcement – behavior that is followed by a reinforcer will increase in frequency. The statement does not prescribe how to identify reinforcers and arrange them in a specific intervention package nor does it describe some of the constraints that impact the effectiveness of reinforcement such as delay of reinforcement, response effort, and frequency of reinforcement. The practitioner must judge what principles are relevant to a given situation and how those principles can be applied most effectively. In many cases, this requires both deep understanding of the principle and a great deal of expertise in the particular application context. For example, a naïve practitioner may attempt to apply the principle of reinforcement to the problem of students making decoding errors while reading. 
He might give a party at the end of the week for all students who make fewer than five decoding errors across the week. This is unlikely to be effective for several reasons. The consequence is long delayed after the critical behavior, it is infrequent, a party may not be reinforcing for the students, and the students may not have the decoding skills necessary to correctly read some of the words in the text. This is a simple and obvious example, but the point is a general one. Principles cannot be applied directly; they must be translated into specific interventions and this translation is a highly uncertain process requiring multiple iterations and close monitoring of effects. The effectiveness of an intervention based on the principle of reinforcement will ultimately be determined by how the principle is implemented in a specific intervention. All of the types of intervention units discussed in this section (practice elements, kernels, principles) share an important source of uncertainty;
although they may be supported by extensive evidence, the practitioner must do the critical job of arranging these components into an effective intervention package. There is uncertainty about how the elements will combine into a new intervention. The various components must be adapted and arranged properly to function together to produce the desired outcome. Additional uncertainty is warranted because the newly constructed intervention has not been evaluated. Experienced program developers often begin with components such as practice elements, kernels, and principles yet find that multiple iterations of design, evaluation, and revision are necessary to produce an effective intervention. This suggests that the process of combining well-established components is complex and uncertain. Further, the training, experience, and skill of the practitioner in the process of constructing intervention packages is unknown and must be considered a source of uncertainty. Adapting or developing an individualized intervention requires more of the practitioner than does implementation of a fully developed intervention. Without training and experience the practitioner may not be able to make reasonable choices about which elements to include in the intervention.
INTEGRATING SOURCES OF BEST AVAILABLE EVIDENCE TO REDUCE UNCERTAINTY

The integration of the various sources of best available evidence can be described through a metaphor. Consider a practical problem as a large white space to be filled with evidence. The various sources of evidence can be combined to cover as much of the white space as possible. Although ESIRs can fill some of the white space, considerable white space is left because of uncertain validity of the review (e.g., were all relevant studies identified? were appropriate inclusion/exclusion criteria applied?) as well as challenges regarding logical generalizations from the research base to the practical situation (e.g., is there a reasonable match between the research and clinical populations? is there sufficient similarity between the research setting and the treatment setting?). To fill in more of the white space, the practitioner can consider evidence from other sources such as narrative reviews, best practice committees, and evidence-based practice guides. These sources can provide guidance on reasonable variations of the empirically supported intervention and how to adapt the intervention so there is a better contextual fit (e.g., which elements can be adapted without reducing the effectiveness of the intervention). To further fill the white space, principles of behavior and instruction can be
used to inform reasonable adaptations to the intervention. Practice elements and kernels can be sources of evidence to adapt an intervention and fill in the white space. The resulting intervention that is informed by evidence from all of these sources will cover most of the white space and the practitioner is now informed by the best available evidence and ready to implement the practice. Even when interventions are fully developed and comprehensive, white space is likely to remain. Although uncertainties exist regarding the effects of the intervention, progress monitoring (a type of practice-based evidence) can fill in even more of the white space. As the intervention is implemented and progress monitoring data are collected, the practitioner can make evidence informed decisions about the effectiveness of the intervention and continue to make adjustments until there is an adequate response to intervention. Progress monitoring has an important advantage over other sources of evidence when considering uncertainty. No generalizations from the research population and settings to the educational population and setting are necessary. The uncertainties associated with progress monitoring are the reliability and validity of the data. It is clear that the reliance on best available evidence to make effective decisions is not an easy matter. All sources of evidence are subject to uncertainty. Understanding that uncertainty is inherent in all evidence, practitioners can make better decisions by integrating several sources of evidence and by being aware of the sources of uncertainty associated with each. Systematic, critical thinking is necessary to use data in ways that take advantage of their strengths and minimize uncertainty. Relying on multiple forms of the best available evidence to inform decisions is the best way to improve the quality of education for students in special education.
REFERENCES

Albin, R. W., Lucyshyn, J. M., Horner, R. H., & Flannery, K. B. (1996). Contextual fit for behavioral support plans: A model for “goodness of fit.” In L. K. Koegel, R. L. Koegel & G. Dunlap (Eds.), Positive behavioral support: Including people with difficult behavior in the community (pp. 81–98). Baltimore, MD: P.H. Brookes.
American Psychological Association. (2005, August). American Psychological Association policy statement of evidence-based practice in psychology. American Psychologist, 61, 271–275.
Barrish, H. H., Saunders, M., & Wolf, M. M. (1969). Good behavior game: Effects of individual contingencies for group consequences on disruptive behavior in a classroom. Journal of Applied Behavior Analysis, 2, 119–124.
Behavior Analyst Certification Board. (2010). Guidelines for responsible conduct for behavior analysts. Retrieved from http://www.bacb.com/index.php?page=100165
Chorpita, B. F., Becker, K. D., & Daleiden, E. L. (2007). Understanding the common elements of evidence-based practice: Misconceptions and clinical examples. Journal of the American Academy of Child & Adolescent Psychiatry, 46, 647–652.
Chorpita, B. F., & Daleiden, E. L. (2010). Building evidence-based systems in children’s mental health. In J. R. Weisz & A. E. Kazdin (Eds.), Evidence-based psychotherapies for children and adolescents (2nd ed., pp. 482–499). New York, NY: Guilford Press.
Chorpita, B. F., Daleiden, E. L., & Weisz, J. R. (2005). Identifying and selecting the common elements of evidence based interventions: A distillation and matching model. Mental Health Services Research, 7(1), 5–20.
Cook, B. G., & Cook, S. C. (2011). Thinking and communicating clearly about evidence-based practices in special education. Council for Exceptional Children’s Division for Research. Retrieved from http://www.cecdr.org/
Cook, B. G., Tankersley, M., & Harjusola-Webb, S. H. (2008). Evidence-based special education and professional wisdom: Putting it altogether. Intervention in School and Clinic, 44, 105–111.
Cordray, D. S., & Morphy, P. (2009). Research synthesis and public policy. In H. Cooper, L. V. Hedges & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 473–493). New York, NY: Russell Sage Foundation.
Darch, C. B., & Thorpe, H. W. (1977). The principal game: A group consequence procedure to increase classroom on-task behavior. Psychology in the Schools, 14, 341–347.
Detrich, R. (1999). Increasing treatment fidelity by matching interventions to contextual variables within the educational setting. School Psychology Review, 28, 608–620.
Elliott, D. S., & Mihalic, S. (2004). Issues in disseminating and duplicating effective prevention programs. Prevention Science, 5, 47–53.
Embry, D. D. (2004). Community-based prevention using simple, low-cost, evidence-based kernels and behavior vaccines. Journal of Community Psychology, 32, 575–591.
Embry, D. D., & Biglan, A. (2008). Evidence-based kernels: Fundamental units of behavioral influence. Clinical Child and Family Psychology Review, 11, 75–113.
Field, M. J., & Lohr, K. N. (1992). Guidelines for clinical practice: From development to use. Washington, DC: National Academy Press.
Gambrill, E. (2012). Critical thinking in clinical practice: Improving the quality of judgments and decisions (3rd ed.). Hoboken, NJ: Wiley.
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators in group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Hamilton, L., Halverson, R., Jackson, S., Mandinach, E., Supovitz, J., & Wayman, J. (2009). Using student achievement data to support instructional decision making (NCEE 2009-4067). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/publications/practiceguides/
Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7, 238–247.
Holland, J. P. (1995). Development of a clinical practice guideline for acute low back pain. Current Opinion in Orthopedics, 6, 63–69.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Jimerson, S. R., Burns, M. K., & VanDerHeyden, A. M. (2007). Response to intervention at school: The science and practice of assessment and intervention. In S. R. Jimerson, M. K. Burns & A. M. VanDerHeyden (Eds.), Handbook of response to intervention (pp. 3–9). New York, NY: Springer.
Kazdin, A. E. (1976). Statistical analysis for single-case experimental designs. In M. Hersen & D. H. Barlow (Eds.), Single case experimental designs: Strategies for studying behavior change (pp. 265–313). New York, NY: Pergamon Press.
Kazdin, A. E. (2004). Evidence-based treatments: Challenges and priorities for practice and research. Child and Adolescent Psychiatric Clinics of North America, 13, 923–940.
Kellam, S. G., Brown, C. H., Poduska, J. M., Ialongo, N. S., Wang, W., Toyinbo, P., Petras, H., … Wilcox, H. C. (2008). Effects of a universal classroom behavior management program in first and second grades on young adult behavioral, psychiatric, and social outcomes. Drug and Alcohol Dependence, 95, S5–S28.
Klingner, J. K., Vaughn, S., Hughes, M. T., & Arguelles, M. E. (1999). Sustaining research-based practices in reading: A 3 year follow-up. Remedial and Special Education, 20, 263–274.
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single-case designs technical documentation. Retrieved from http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=229
Lord, C., & McGee, J. (2001). Educating children with autism: Committee on educational interventions for children with autism. Division of Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
Moore, L. A., Waguespack, A. M., Wickstrom, K. F., Witt, J. C., & Gaydos, G. R. (1994). 
Mystery motivator: An effective and time efficient intervention. School Psychology Review, 23, 106–118.
Narayan, J. S., Heward, W. L., Gardner, R., III., Courson, F. H., & Omness, C. K. (1990). Using response cards to increase student participation in an elementary classroom. Journal of Applied Behavior Analysis, 23, 483–490.
National Association of School Psychologists. (2010). Principles for professional ethics. Retrieved from http://www.nasponline.org/standards/2010standards.aspx
National Autism Center. (2009). National standards report: National standards project – addressing the need for evidence-based practice guidelines for autism spectrum disorders. Randolph, MA: National Autism Center.
O’Keefe, B. V., Slocum, T. A., Burlingame, C., Snyder, K., & Bundock, K. (2012). Comparing results of systematic reviews: Parallel reviews of research on repeated reading. Education and Treatment of Children, 35, 333–366.
Parker, R. I., & Brossart, D. F. (2003). Evaluating single-case research data: A comparison of seven statistical methods. Behavior Therapy, 34, 189–211.
Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (Eds.). (2000). Evidence-based medicine: How to teach and practice EBM. Edinburgh, UK: Churchill Livingstone.
Schlosser, R. W., Wendt, O., & Sigafoos, J. (2007). Not all systematic reviews are created equal: Considerations for appraisal. Evidence-Based Communication Assessment and Intervention, 1, 138–150.
Schoenwald, S. K., & Hoagwood, K. (2001). Effectiveness, transportability, and dissemination of interventions: What matters when? Psychiatric Services, 52, 1190–1197.
Shanahan, T., Callison, K., Carriere, C., Duke, N. K., Pearson, P. D., Schatschneider, C., & Torgesen, J. (2010). Improving reading comprehension in kindergarten through 3rd grade: A practice guide (NCEE 2010-4038). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://whatworks.ed.gov/publications/practiceguides
Slocum, T. A., Detrich, R., & Spencer, T. D. (2012). Evaluating the validity of systematic reviews to identify empirically supported treatments. Education and Treatment of Children, 35, 201–233. doi: 10.1353/etc.2012.0015
Slocum, T. A., Spencer, T. D., & Detrich, R. (2012). Best available evidence: Three complementary approaches. Education and Treatment of Children, 35, 27–55.
Smith, T. (2001). Discrete trial training in the treatment of autism. Focus on Autism and Other Disabilities, 16, 86–92.
Spencer, T. D., Detrich, R., & Slocum, T. A. (2012). Evidence-based practice: A framework for making effective decisions. Education and Treatment of Children, 35, 127–151.
Sugai, G., & Horner, R. H. (2009). Defining and describing schoolwide positive behavior support. In W. Sailor, G. Dunlap, G. Sugai & R. H. Horner (Eds.), Handbook of positive behavior support (pp. 307–326). New York, NY: Springer.
Thomas, A., & Grimes, J. (2008). Best practices in school psychology V. Washington, DC: NASP.
Walker, H. M., Kavanaugh, K., Stiller, B., Golly, A., Severson, H. H., & Feil, E. G. (1998). First step to success: An early intervention approach for preventing school antisocial behavior. Journal of Emotional & Behavioral Disorders, 6(2), 66–80.
Walker, H. M., Seeley, J. R., Small, J., Severson, H. H., Graham, B., Feil, E. G., Serna, L., … Forness, S. R. (2009). 
A randomized controlled trial of the First Step to Success early intervention: Demonstration of program efficacy outcomes in a diverse, urban school district. Journal of Emotional & Behavioral Disorders, 17, 197–212.
Wendt, O., & Miller, B. (2012). Quality appraisal of single-subject experimental designs: An overview and comparison of different appraisal tools. Education and Treatment of Children, 35, 235–268.
What Works Clearinghouse. (2011). What Works Clearinghouse procedures and standards handbook (Version 2.1). Retrieved from http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Whitehurst, G. J. (2002, October). Evidence-based education. Paper presented at the Student Achievement and School Accountability Conference. Retrieved from http://www2.ed.gov/nclb/methods/whatworks/eb/edlite-index.html
Wilczynski, S. M. (2012). Risk and strategic decision-making in developing evidence-based practice guidelines. Education and Treatment of Children, 35, 291–311.
CHAPTER 3

APPRAISING SYSTEMATIC REVIEWS: FROM NAVIGATING SYNOPSES OF REVIEWS TO CONDUCTING ONE’S OWN APPRAISAL

Ralf W. Schlosser, Parimala Raghavendra and Jeff Sigafoos

ABSTRACT

Systematic reviews – that is, research reviews that are rigorous and follow scientific methods – are increasingly important for assisting stakeholders in implementing evidence-based decision making for children and adults with disabilities. Yet, systematic reviews vary greatly in quality and are therefore not a panacea. Distinguishing “good” reviews from “bad” reviews requires time and skills related to the appraisal of systematic reviews. The purpose of this chapter is to inform stakeholders (i.e., practitioners, administrators, policy makers) of evidence-based information sources that provide synopses (i.e., appraisals) of systematic reviews, to provide guidance in reading and interpreting the synopses of various sources, and to propose how to make sense of multiple synopses from
Evidence-Based Practices Advances in Learning and Behavioral Disabilities, Volume 26, 45–64 Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026005
different sources for the same systematic review. A secondary purpose of this chapter is to illustrate how stakeholders can conduct their own appraisals if synopses are not available.
According to Petticrew and Roberts (2006), systematic reviews ‘‘adhere closely to a set of scientific methods that explicitly aim to limit systematic error (bias), mainly attempting to identify, appraise, and synthesize all relevant studies (of whatever design) in order to answer a particular question (or set of questions)’’ (p. 9). As such, systematic reviews are distinguishable from narrative reviews, which frequently suffer from biases in (a) the retrieval of included studies, (b) decisions on inclusion and exclusion of studies for the review, (c) the extraction of data from the original studies, (d) the appraisal of included studies, and (e) conclusions about effectiveness of various intervention approaches. Schlosser and Goetze (1992), for example, contrasted four narrative reviews focused on the effectiveness of interventions aimed at reducing self-injurious behaviors in individuals with developmental disabilities. Even though the reviews had the same aim, to document which interventions are effective in reducing self-injurious behaviors, they varied widely in terms of the included studies (among the overlapping years) and the judgments about effectiveness of various treatments. Due to their ability to synthesize and evaluate the rigor of a body of studies, systematic reviews are a critical methodology for generating knowledge about the effectiveness of interventions (Petticrew & Roberts, 2006), and for identifying gaps in the literature that can then be used to plan future research (Eagly & Wood, 1994). Systematic reviews in the form of meta-analyses are a special case of systematic reviews; compared to individual studies they increase statistical power and reduce Type II error because studies are combined through statistical means generating average effect sizes across studies and confidence intervals. Not all systematic reviews are or need to be meta-analyses, but meta-analyses are almost always systematic reviews. 
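The claim that combining studies increases statistical power can be illustrated with a minimal sketch of inverse-variance (fixed-effect) pooling, one common way of averaging effect sizes. The effect sizes and standard errors below are hypothetical, the function name `pool_fixed_effect` is an assumption made for illustration, and the fixed-effect model is used only for simplicity; it is not presented as the method of any review cited in this chapter.

```python
import math


def pool_fixed_effect(effects, std_errors, z=1.96):
    """Inverse-variance pooled effect, its standard error, and an approximate 95% CI."""
    weights = [1.0 / se ** 2 for se in std_errors]  # more precise studies weigh more
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical standardized mean differences from three small studies.
effects = [0.30, 0.50, 0.40]
std_errors = [0.30, 0.30, 0.30]

pooled, pooled_se, ci = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.2f}, SE = {pooled_se:.3f}")
print(f"approximate 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

In this toy example each study's own interval (effect ± 1.96 × 0.30) includes zero, whereas the pooled interval does not. That narrowing of the confidence interval is the sense in which a meta-analysis can reduce Type II error relative to individual studies.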
Systematic reviews (and meta-analyses) play an important role in evidence-based practice (EBP). For one, relative to individual studies, systematic reviews rank more highly on hierarchies of evidence because a synthesis of multiple studies is a more powerful piece of evidence than any individual study. For example, in determining the effects of an intervention, the Oxford Centre for Evidence-Based Medicine Levels of Evidence Working Group
(2011) ranked systematic reviews of randomized controlled trials (RCTs) and single-case research as the highest or most meaningful level of evidence, whereas individual RCTs and single-case studies are the next lower level. Similarly, Schlosser and Raghavendra (2004) ranked meta-analyses of single-case experimental designs above individual single-case experimental designs in terms of the conclusions that can be drawn from the evidence. Systematic reviews, however, are not a panacea and vary in quality just like narrative reviews. In fact, reviews of the reporting characteristics of systematic reviews in the field of medicine revealed that they vary considerably in the transparency of their reporting and, hence, in quality (Choi et al., 2001; Moher, Tetzlaff, Tricco, Sampson, & Altman, 2007). Thus, for the educator or clinician, finding an appropriate basis for evidence-based decision making may not be as simple as locating a systematic review because not all systematic reviews are created equal (Schlosser, Wendt, & Sigafoos, 2007). It cannot be assumed that all systematic reviews are of high quality and therefore trustworthy. Applying the 6S pyramid of evidence-based information sources (DiCenso, Bayley, & Haynes, 2009; Haynes, 2006) might provide some guidance for stakeholders on how to navigate systematic reviews as an evidence-based information source.
NAVIGATING THE 6S PYRAMID OF EVIDENCE-BASED INFORMATION SOURCES

The pyramid was originally proposed in the medical field by Haynes (2001) as the 4S model, consisting of systems, synopses, syntheses, and studies; it was later updated to include summaries as a new layer of evidence (Haynes, 2006). Subsequently, DiCenso and colleagues (2009) argued that synopses of individual studies need to be distinguished from synopses of systematic reviews (what they call syntheses) because systematic reviews outperform individual studies on any number of hierarchies related to levels of evidence, resulting in an additional sixth layer. Thus, the 6S pyramid includes, from the top, (1) Systems, (2) Summaries, (3) Synopses of syntheses, (4) Syntheses, (5) Synopses of studies, and (6) Studies. Systematic reviews figure prominently in this pyramid, but let us explain the other layers of the pyramid as well. Based on the principles of the pyramid, stakeholders should start from the top of the pyramid in their quest for evidence, beginning with Systems. These are computerized decision-making support systems that weigh the evidence for a particular clinical question. That is, computerized decision support systems
automatically link individual patient data with a computerized evidence base, resulting in patient-specific assessments or recommendations. Even in medicine such systems are extremely rare; one such system is available to manage inpatient influenza vaccination (DiCenso et al., 2009). In special education and related service fields such systems are nonexistent. At the next lower level are Summaries – these are clinical practice guidelines that are ideally heavily based on the evidence at the lower levels of the 6S pyramid. Clinical practice guidelines are available in some fields addressing interventions for children with disabilities such as the related service of speech-language pathology. For example, Myers and Johnson (2007) prepared a clinical practice guideline for pediatricians to manage children with Autism Spectrum Disorders. The American Speech-Language-Hearing Association (2006) offered clinical practice guidelines for speech-language pathologists working with individuals with ASD across the life span. If summaries are not available, stakeholders should consult Synopses of syntheses. Synopses of syntheses are appraisals of systematic reviews. According to the pyramid, it is more efficient for the stakeholder to rely on the appraisal done by an expert not involved with the original review rather than engage in an appraisal themselves. Some have referred to appraisals carried out by others as pre-filtered evidence (Guyatt & Rennie, 2002). In other words, it is more desirable to locate a synopsis of a systematic review than the systematic review itself. As mentioned earlier, having to appraise a systematic review may pose a burden on the busy stakeholder in terms of the time needed (Humphris, Littlejohns, Victor, O’Halloran, & Peacock, 2000; Zipoli & Kennedy, 2005) and skills required to conduct a thorough appraisal. 
Being able to rely on an appraisal of someone else therefore saves the stakeholder time and reduces the skill burden that may otherwise be experienced. Should a synopsis of a systematic review be unavailable, the stakeholder should obtain a systematic review – now we are at the level of syntheses in the pyramid. This requires the stakeholder to conduct his or her own appraisal of the systematic review to evaluate the strengths, weaknesses, and trustworthiness of the review. Later on in the chapter some appraisal considerations will be described as well as tools for doing so. Although this chapter is about systematic reviews, the pyramid also provides guidance when systematic reviews are not available. In that situation, the stakeholder should try to locate synopses of individual studies. Akin to synopses of systematic reviews, synopses of individual studies are considered a better evidence-based information source than the studies themselves. Finally, should synopses of individual studies be unavailable,
the stakeholder has to rely on individual studies and carry out the appraisal thereof.
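The top-down search order just described can be summarized in a short sketch. This is only an illustration of the decision logic: the level names follow the 6S pyramid as described in this chapter, whereas the function name `first_available_level`, the set `NEEDS_OWN_APPRAISAL`, and the availability dictionary are hypothetical and not part of any published tool.

```python
from typing import Mapping, Optional

# Levels of the 6S pyramid, ordered from most to least preferred,
# as described in the text above.
PYRAMID_LEVELS = [
    "systems",
    "summaries",
    "synopses of syntheses",
    "syntheses",
    "synopses of studies",
    "studies",
]

# Levels at which no pre-filtered appraisal exists, so the stakeholder
# has to carry out the appraisal personally (assumption drawn from the text).
NEEDS_OWN_APPRAISAL = {"syntheses", "studies"}


def first_available_level(available: Mapping[str, bool]) -> Optional[str]:
    """Return the highest pyramid level for which evidence was located.

    `available` maps a level name to True if the stakeholder found
    evidence at that level; missing levels count as unavailable.
    """
    for level in PYRAMID_LEVELS:
        if available.get(level, False):
            return level
    return None


# Hypothetical search outcome: no system, summary, or synopsis was found,
# but a systematic review and some individual studies were.
found = {"syntheses": True, "studies": True}
level = first_available_level(found)
print(level, "- own appraisal needed:", level in NEEDS_OWN_APPRAISAL)
```

In this hypothetical case the stakeholder would work at the level of syntheses and would have to appraise the systematic review personally.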
Locating Synopses of Systematic Reviews
Synopses of systematic reviews can be found in a variety of sources. One such source is evidence-based journals such as Evidence-Based Communication Assessment and Intervention (http://www.psypress.com/ebcai). Among other pieces of evidence, the journal offers structured appraisal abstracts of systematic reviews of communication (including language and speech) and swallowing interventions for children and adults with a variety of disabilities. The journal has published, for example, appraisals of several systematic reviews on the effectiveness of the Picture Exchange Communication System (PECS), a treatment package commonly used with beginning communicators with autism or other developmental disabilities (Beck, 2009; Raghavendra, 2010; Simpson, 2011; Subramanian & Wendt, 2010; Wendt & Boesch, 2010). All structured appraisal abstracts include a title that aims to communicate the clinical bottom line in light of the evidence, the names and affiliations of the commentary authors, the question asked, the methods used, main results, authors’ conclusions, and a commentary. As far as the methods are concerned, the structured appraisal abstracts include the following components relative to the studies reviewed in the original systematic reviews: (a) design, (b) data sources, (c) selection and assessment, and (d) outcomes. Commentary authors are encouraged to use the considerations for appraising reviews proposed by Schlosser et al. (2007).
Another source for obtaining synopses of reviews is DARE, the Database of Abstracts of Reviews of Effects (http://www.crd.york.ac.uk/crdweb/). This is a British database that is freely accessible on the web. 
A search for ‘‘intellectual disabilities,’’ for example, yielded 27 entries, including an appraisal of a systematic review of Snoezelen, a multisensory treatment used with individuals with intellectual disabilities (Lotan & Gold, 2009), as well as a systematic review on behavioral interventions for treating challenging behaviors, such as rumination and operant vomiting (Lang et al., 2011). Among the components contained in the structured appraisal abstract are: (a) a Centre for Reviews and Dissemination (CRD) summary, (b) author’s objectives, (c) searching, (d) study selection, (e) assessment of study quality, (f) data extraction, (g) methods of synthesis, (h) results of the review, (i) author’s conclusions, (j) CRD commentary, (k) implications of the review
for practice and research, (l) funding, (m) bibliographic details, (n) original paper uniform resource locator, (o) other publications of related interest, (p) Medical Subject Headings, and (q) database entry date. The goal of the appraisal is to inform users of the database of the overall validity and reliability of the review.
A third source of synopses of syntheses is the EBP Compendium by the American Speech-Language-Hearing Association (http://www.asha.org/members/ebp/compendium/). Although developed by this professional organization, this site is freely accessible to anyone. The content is primarily related to speech, language, and communication issues involving individuals with disabilities. The user may view all appraised systematic reviews or search by keyword. The following selection of keywords may be of interest to stakeholders: aphasia, apraxia, augmentative and alternative communication, autism spectrum disorders, behavioral treatments, computer-based treatments, Down syndrome, dysarthria, functional communication training, intellectual disabilities, Picture Exchange Communication System, social skills, specific language impairment, total communication, video modeling, voice output communication aids, and written language disorders. Clicking on ‘‘video modeling,’’ for example, leads to an appraisal of a systematic review by Delano (2007). The structured appraisal abstract contains the following components: (a) citation, (b) indicators of review quality, (c) description, (d) questions addressed, (e) population, (f) intervention/assessment, (g) number of studies included, (h) years included, (i) findings, (j) conclusions, (k) keywords, and (l) date added to EBP Compendium.
A fourth source is Evidence in Augmentative and Alternative Communication (EVIDAAC), a database of appraised evidence focused on the field of augmentative and alternative communication (AAC) (http://www.evidaac.com/). 
‘‘AAC involves attempts to study and when necessary compensate for temporary or permanent impairments, activity limitations, and participation restrictions of individuals with severe disorders of speech-language production and/or comprehension, including spoken and written modes of communication’’ (American Speech-Language-Hearing Association, 2005, p. 1). Systematic reviews can be obtained by searching for keywords or by obtaining all available appraisals of reviews by selecting ‘‘systematic review’’ as the design of choice. For example, a search for ‘‘natural speech’’ reveals an entry for a review by Millar, Light, and Schlosser (2006) on the effects of AAC on speech production in individuals with developmental disabilities. Each review is appraised using the EVIDAAC Systematic Review Scale (Schlosser, Sigafoos, Raghavendra, & Wendt, 2008), which will be described in more detail below. Once a review is selected
the user can gain access to (a) the citation, (b) the abstract via URL, (c) an appraisal score (upon clicking ‘‘details’’ the user can see how each item was assessed), (d) design, (e) question, (f) method, (g) results, and (h) conclusion. Next, we will address how the reader might make sense of the obtained synopses from each of the above sources.
Interpreting Synopses of Systematic Reviews
Assuming the stakeholder was fortunate enough to locate a synopsis of a relevant systematic review, the next step is to make sense of the synopsis. Given that each source has its own format, we will analyze one synopsis from each of the four sources identified in the previous section. The goal here is to illustrate the process of interpreting synopses rather than to fully discuss the results of the reviews themselves.
Synopsis of Synthesis from Evidence-Based Communication Assessment and Intervention
From the journal Evidence-Based Communication Assessment and Intervention (EBCAI), we have selected the appraisal of a systematic review on PECS (Hart & Banda, 2009) carried out by Subramanian and Wendt (2010). Recognizing that there are multiple synopses of PECS reviews available, we should point out that in real life, stakeholders might choose to look at another one or several of these reviews depending on the nature of their clinical question (Schlosser, Koul, & Costello, 2007). For example, if the question was focused on the effectiveness of PECS for a child with autism rather than a child with an intellectual disability, it may be more appropriate to select a synopsis that is focused on this population (i.e., Simpson, 2011). In this hypothetical scenario, the stakeholder is interested in PECS for a child with an intellectual disability, which is subsumed under ‘‘developmental disability’’; this is the population targeted in this review. To make sure that the stakeholder truly has the ‘‘right’’ review, the question section is an appropriate starting point. If one or more of the questions asked by this systematic review correspond to the clinical question the stakeholder is trying to address, then this synopsis is worth examining further. If, on the other hand, the questions are not that relevant to the clinical question at hand, another review might be a better choice. 
Let us assume that the stakeholder is particularly interested in whether PECS results in increases in speech production. One of the questions of the review directly (‘‘to what extent does PECS increase speech,’’ p. 22)
addresses this point and therefore this review seems like a good fit. Assuming that the review is appropriate, the next step is to read the title. The title of a structured appraisal abstract in the journal (i.e., ‘‘PECS has empirical support, but limitations in the systematic review process require this conclusion to be interpreted with caution,’’ p. 22) aims to communicate the clinical bottom line (i.e., the practice’s effectiveness) in light of the evidence. In this case, the reader gets the sense that PECS might be effective but, per this appraisal, this conclusion of the review may not be supported by the review methodology. Based on the title, though, it is unclear whether PECS might be effective in terms of the specific outcomes of interest; that is, speech production. At this juncture, the stakeholder could consult the ‘‘main results’’ section to determine the specifics. The following speaks directly to the outcomes of interest: ‘‘The review also indicated that PECS was highly effective in increasing speech for 2 out of 10 participants for whom speech production was measured, and moderately effective for 2 out of 10 participants’’ (Subramanian & Wendt, 2010, p. 23). This statement seems somewhat contradictory to the following one under the ‘‘Authors’ conclusions’’ – that is conclusions drawn by the authors of the original review: ‘‘The authors conclude that PECS is generally effective in increasing functional communication, including speech and non-verbal manding, in individuals with developmental disabilities’’ (p. 23). At this point, the stakeholder is cognizant that there seem to be some contradictory statements in this review regarding the effectiveness of PECS for increasing speech. Also, the reader realizes that the conclusions drawn by the review may not be entirely trustworthy based on the title. To learn whether the concerns raised were indeed substantive, the stakeholder could consult the ‘‘commentary’’ section. 
In the commentary section the reader learns that the authors had collapsed different outcomes, which might explain how they could have arrived at the conclusion that PECS was effective across a range of outcomes, including speech. As the commentary authors point out, however, such collapsing is not appropriate. There are other issues in the commentary that support the commentary authors’ concluding remark: ‘‘Results can be regarded as preliminary but should not be used for clinical decision-making until confirmed by a more rigorous and valid, truly systematic review’’ (p. 25). From here, the stakeholder can seek out appraisals of other reviews on the same topic in the hope that they are more rigorous. Two other, more sound reviews were found in EBCAI (http://www.psypress.com/ebcai), and both commentaries seem to concur that thus far increases in speech are modest, varied, and based on
limited data, calling for further research into this area (Raghavendra, 2010; Wendt & Boesch, 2010).
Synopses of Syntheses from Database of Abstracts of Reviews of Effects
Next, we will examine two commentaries obtained from DARE. We first examine the commentary on the systematic review of a multisensory intervention (Snoezelen) applied to individuals with intellectual disabilities (Lotan & Gold, 2009), found at: http://www.crd.york.ac.uk/CRDWeb/ShowRecord.asp?AccessionNumber=12009109383&UserID=0. Let us assume that a stakeholder is interested in utilizing this intervention with an elementary school child with intellectual disabilities, but is unsure of its effectiveness. Titles in DARE do not aim to communicate the clinical bottom line in light of the evidence; instead they seem to restate the title of the original review. To ensure that the review addresses a question relevant to the clinician’s clinical question, the best section to consult is the ‘‘authors’ objectives’’; here the objective (‘‘to assess the effectiveness of a controlled multisensory environment (Snoezelen) as individual therapy for people with intellectual and developmental disability’’) appears relevant. The ‘‘CRD summary’’ gives the reader an overall sense of whether this review is trustworthy. Here, the synopsis indicates that the authors’ cautious interpretation of the findings in the original review was appropriate even though the reporting in the review was not without problems. In the ‘‘CRD commentary’’ the reader can gain a better understanding of what was done well and of the issues that were not successfully addressed in the original review, including the combining of studies that appeared methodologically and statistically heterogeneous. All in all, the reader comes away without convincing empirical support for this intervention. 
Next, we will take a look at another synopsis we identified from DARE, which addresses the efficacy of behavioral interventions to treat rumination and operant vomiting (Lang et al., 2011). This commentary can be found at the following website: http://www.crd.york.ac.uk/CRDWeb/ShowRecord.asp?AccessionNumber=12011006699&UserID=0. A stakeholder may be working with a young adult with profound intellectual disability whose challenging behaviors include rumination. The stakeholder examines the ‘‘authors’ objective’’ and finds it to be consistent with her interests. A review of the ‘‘CRD summary’’ indicates that the authors concluded that behavioral interventions were effective, but several limitations in the data set (including study quality, low n) and the review process call this conclusion into question. A closer look at the CRD commentary reveals that the individuals appraising the review may have failed to recognize that
single-case experimental designs are different from what they call ‘‘case studies,’’ which are considered to provide the lowest level of evidence. That being said, the other limitations in the data set and review process do remain. Unless there are other appraisals of recent systematic reviews available (and we were unable to locate any), or systematic reviews without appraisals (the structured abstract lists one but it is quite dated – 2001), the stakeholder’s best bet is perhaps to seek out one or several of the better controlled studies included in the systematic review, and look for an appraisal of these.
Synopsis of Synthesis from EBP Compendium
From the EBP Compendium, we will interpret the appraisal of the review by Delano (2007) on the effectiveness of video modeling for children with autism. The synopsis for this review can be located at this website: http://www.asha.org/Members/ebp/compendium/reviews/Video-ModelingInterventions-for-Individuals-with-Autism.htm. To begin with, the stakeholder could verify whether this is an appropriate review by reading the ‘‘Description’’ and the ‘‘Question(s) addressed.’’ The stakeholder’s clinical question relates to the effectiveness of video modeling for children with autism – specifically, the stakeholder is considering video modeling as a treatment for teaching beginning communication skills such as requesting. Because one of the questions asked in this review directly addresses effectiveness (‘‘how effective were video modeling interventions in improving the skills of individuals with autism?’’), it is worthwhile to examine this synopsis further. The EBP Compendium utilizes five indicators of quality: (1) the review addresses a clearly focused question, (2) criteria for inclusion of studies are provided, (3) search strategy is described in sufficient detail for replication, (4) included studies are assessed for study quality, and (5) quality assessments are reproducible. 
For our review, the first four of these were answered with a ‘‘yes’’ whereas the last question was answered with a ‘‘no.’’ From this it follows that even though study quality was assessed in the original review, the criteria for doing so were not sufficiently defined to allow for someone else to reproduce them. Accordingly, the stakeholder may be cautious in believing the obtained quality assessments (Indicator of Quality #4). The ‘‘Conclusions’’ section states the author’s conclusion that video modeling is indeed effective. The discerning stakeholder would notice that there are several limitations noted as well, one of them being that treatment effects were not assessed. If Delano (2007) did not assess effectiveness in her review, and a brief look at the original review confirms this, then how is it possible that the conclusion was reached that video modeling was
effective? The only plausible explanation is that the author accepted the effectiveness conclusions of the authors of the original studies rather than applying an effectiveness metric of some sort across the studies reviewed. Unfortunately, none of the five quality indicators speaks to this shortcoming, and the stakeholder had to do quite a bit of ‘‘detective’’ work to determine this. At this juncture, the stakeholder should try to locate synopses of other, hopefully more systematic, reviews on this topic. There do not appear to be any such reviews indexed in the EBP Compendium, but there are in some of the other sources such as EBCAI and DARE.
Synopsis of Synthesis from Evidence in Augmentative and Alternative Communication
Finally, we interpret a synopsis of a systematic review obtained through EVIDAAC. Assume that a stakeholder is working with the family of a 4-year-old child with Down syndrome who has not yet developed speech. The parents are interested in exploring other means of communication (i.e., AAC), but they continue to want their child to improve his speech. A search for ‘‘systematic reviews’’ under the ‘‘Design’’ tab reveals an entry for Millar et al. (2006), who examined the effects of AAC intervention on natural speech production. Given that Down syndrome is a developmental disability, this seems like a potential fit. It is important to note that at the time of this writing, EVIDAAC has not yet been officially launched (anticipated launch date: July 31, 2013). The question addressed by the review exactly matches the stakeholder’s interest. The ‘‘results’’ section differentiates questionable evidence from the best evidence; six studies fall into the latter category, and among their participants 89% improved in speech production, 11% did not improve, and 0% showed a decline. The EVIDAAC Quality Score shows a score of 8 out of 14. Clicking on ‘‘Details’’ reveals that the review fell short in several areas, including (a) a failure to include unpublished studies, (b) an insufficient definition of inclusion and exclusion criteria, (c) the absence of a log of rejected studies, (d) a lack of database-specific search terms, (e) issues with the agreement data on the inclusion of articles, and (f) a failure to minimize database bias. Thus, while this review is not without imperfections, there are many positive aspects as well, and upon consulting the other sources it appears to be the best available evidence for this population.
Navigating the Sources of Synopses of Systematic Reviews
The goal of the previous section was to familiarize stakeholders with the process of interpreting synopses from each of these potential sources. In this section we will address how the stakeholder might go about navigating these different sources, especially when multiple sources have provided appraisals of the same systematic review. This is the case with the review by Millar et al. (2006), which was appraised by all four sources. Again, rather than going into full details of the actual results of this review, the point here is to illustrate the process of interpreting synopses from multiple sources.
Two of the appraisal sources (EVIDAAC and the ASHA EBP Compendium) use quantitative techniques for the appraisal in that they either offer a score (EVIDAAC) or allow for the calculation of a score by counting the number of ‘‘yes’’ responses (EBP Compendium). Because both synopses share this characteristic, it might make sense to compare these first. The Millar et al. (2006) review fared better in the EBP Compendium, with 4 out of 5 ‘‘yes’’ responses (80%), compared to 8 out of 14 ‘‘yes’’ responses on EVIDAAC (approximately 57%). It could be argued that EVIDAAC is more rigorous in addressing the major appraisal aspects; having more items allows for a more nuanced appraisal. For example, EVIDAAC includes four items that pertain to the search methods, whereas only one item addresses the search in the EBP Compendium. Also, the rigor of some of the items seems to differ between the EBP Compendium and EVIDAAC. 
For example, both the EBP Compendium and EVIDAAC address whether there are inclusion criteria in the review (‘‘Criteria for inclusion of studies are provided’’ [EBP Compendium]; ‘‘the criteria for inclusion and exclusion of studies are predefined’’ [EVIDAAC]), yet only EVIDAAC asks whether these criteria are also appropriate (‘‘The criteria for inclusion and exclusion are appropriate given the purpose of the review’’). It is conceivable that a review provides inclusion criteria, yet they are deemed inappropriate. The other two evidence-based information sources (DARE, EBCAI) utilize narrative means of appraisals. A comparison of the two reveals quite a bit of overlap. We will relate the comparison to the other two sources as appropriate. Both the DARE and EBCAI synopses recognized that unpublished information was not considered resulting in a potential publication bias (this was also observed in EVIDAAC, but not in the EBP Compendium because none of its items speak to this). Also, both synopses noted some shortcomings related to the search (e.g., the time frame of the
search was not stated) (these were also observed in EVIDAAC, but not in the EBP Compendium). As might be expected, there were also some disagreements. EBCAI noted a potential language bias but this went unnoticed in DARE (or EVIDAAC and the EBP Compendium). The DARE synopsis reported that the inclusion criteria relative to study design were not stated and the process for determining inclusion was not described; the EBCAI synopsis, on the other hand, found the selection criteria to be comprehensive and clear (similarly, the EBP Compendium found no issues, but EVIDAAC found the inclusion criteria less than appropriate). Although it is accurate that study design criteria were not specified, the original review made the case that in the formative stages of investigating a particular intervention it may be appropriate to include lower-level designs, as was the case with this review (Millar et al., 2006, p. 251). Also, because the authors applied a best-evidence synthesis, in which the stronger evidence was separated from the weaker evidence, not having study design exclusion criteria may not be as consequential. There was agreement on the inclusion of research studies for review of nearly 100%, but it was based on only 11% of the journals searched (Millar et al., 2006). The lack of inclusion agreement was not identified by EBCAI (but was in EVIDAAC); DARE noted that it was not clear how studies were selected. Another area of disagreement pertained to the reliability of data extraction. EBCAI’s synopsis reported that the reliability results reported were quite high whereas DARE’s synopsis questioned that there may have been an insufficient number of observations (i.e., 20%) underlying these results. Although 20% may be at the lower end of what is acceptable it is certainly a lot better than having no reliability data on data extraction (EVIDAAC did mark data extraction agreement with ‘‘yes’’). 
Finally, the DARE synopsis noted that the criteria for the quality assessment (high-quality systematic reviews assess the quality of the included studies) were not well defined and it is unclear how these criteria were evoked for the best evidence synthesis. This was not found to be an issue in the EBCAI synopsis, and both the EBP Compendium and EVIDAAC felt the quality appraisal to be sufficient. Our reading of the original review indicates that the quality criteria are sufficiently described in Appendix B of the original review. Also, on p. 251 Millar et al. (2006) clearly stated that a study was entered into the best evidence analysis if its design demonstrated experimental control (i.e., better than inconclusive evidence). DARE also questioned the double-counting of participants who were exposed to more than one intervention. This was done for participants who took part in an alternating treatments design and as such is an artifact of
this specific single-case experimental design. Given this design, it is indeed appropriate to double-count the participants; this ill-placed criticism is perhaps attributable to a lack of knowledge by DARE reviewers about single-case experimental designs. Finally, DARE questioned the mere summing of results without statistical analysis. As a co-author of the original review, Schlosser believes that the decision not to analyze the data statistically was based on the heterogeneity of the measures used for speech production, the diversity of participants, and the range of AAC systems. This, however, was not stated in the paper and should have been.
Overall, among the three more detail-oriented sources (DARE, EBCAI, and EVIDAAC) there seemed to be quite a bit of overlap in recognizing that this is a decent review and its specific shortcomings, rendering it a less than ‘‘perfect’’ review. Given that there is no other review that is more recent and/or of better quality, stakeholders have to ask themselves whether the shortcomings are so severe that they render the conclusions reached invalid. DARE, which besides EVIDAAC appeared to be most critical, acknowledged that the conclusions reached are careful and not overstated, and besides the shortcomings in reporting, may also be an artifact of the data set.
In other situations where the sources diverge considerably, the stakeholder would need to assess the discrepancies and determine in which of the sources the appraisal is credible. Depending on skill level and time available, this could be done on each difference or the stakeholder could attribute greater credibility to one synopsis over the other based on perceived credibility of the source. Based on our involvement with two of the sources (EBCAI and EVIDAAC) we may not be the best individuals to render an impartial judgment about overall credibility of the various sources. But we can state what we believe to be facts:
1. Both EVIDAAC and the EBP Compendium utilize an item-based quantitative appraisal.
2. The EBP Compendium uses a rather crude appraisal with only 5 items whereas EVIDAAC utilizes a more in-depth appraisal with 14 items for systematic reviews and 20 items for systematic reviews that are meta-analytic.
3. The items in EVIDAAC are operationally defined whereas those of the EBP Compendium do not seem to be defined.
4. DARE and EBCAI synopses are narrative rather than quantitative.
5. The structured appraisal abstracts of DARE and EBCAI follow a standard template in terms of their respective outlines.
For an additional comparison of the four sources in terms of inclusion criteria, search strategies, appraisal criteria, and structure of the appraisal, the reader may want to consult Schlosser and Sigafoos (2009).
IMPLEMENTING YOUR OWN APPRAISAL OF A SYSTEMATIC REVIEW
In the event that a synopsis of a relevant systematic review cannot be found, it is necessary for the stakeholder to carry out the appraisal. We recommend that the stakeholder utilize a checklist to guide the appraisal process. One such checklist is PRISMA, which stands for Preferred Reporting Items for Systematic Reviews and Meta-Analyses (Moher, Liberati, Tetzlaff, Altman, & The PRISMA Group, 2009). The PRISMA checklist is made up of 27 criteria against which systematic reviews should be appraised, including structural issues (structured summary, protocol, registration, and funding sources), rationale and objectives of the review, eligibility (inclusion/exclusion criteria), sources consulted, search strategies, study selection, and data collection process. Further details can be found on the PRISMA website (http://prisma-statement.org). For illustrations of the PRISMA checklist as applied to reviews in speech-language pathology and communication disability, the reader may consult Law (in press).
Based on our extensive experience in applying the EVIDAAC Systematic Review Checklist, we will illustrate an appraisal using this checklist. Specifically, we will show the appraisal of a systematic review on manual signing for children with autism (Schwartz & Nye, 2006). At the top of the first page of the checklist, the user is provided with instructions on when to use the checklist, how to answer each item, and how to score the checklist. The checklist has 20 items – items 1–14 apply to systematic reviews that are not meta-analyses, whereas items 1–20 apply to meta-analyses. That is, for meta-analyses the user has to complete six more items. Below these instructions, the appraiser may enter his or her name and provide the full reference of the review to be appraised. Below this begins the listing of the items, each with a space for providing documentation for the rating from the original source and a rationale for the rating. 
The items address a range of considerations that are deemed to contribute to the quality and trustworthiness of a systematic review based on Schlosser et al. (2007), including (a) focused question (one item), (b) search sources and methods (five items), (c) selection of studies (four items), (d) data extraction including quality
assessments (three items), (e) quality of outcomes (four items), (f) confidence intervals (one item), and (g) sensitivity analysis and co-variation of outcomes with characteristics (two items). The rating key is found in the right-hand column; it consists of ‘‘yes’’ or ‘‘no’’ options and the possibility of marking ‘‘N/A’’ for the items that apply to meta-analyses only when appraising a non-meta-analytic review. At the end of the checklist, the user can tally the total number of ‘‘yes’’ responses, the total number of ‘‘no’’ responses, and the total number of ‘‘N/A’’ ratings. The sum of all ‘‘yes’’ ratings renders the score achieved; for example, if a hypothetical review yielded 10 ‘‘yes’’ responses, the score is 10. Deducting the number of ‘‘N/A’’ ratings from the total number of ratings (i.e., 20) yields the maximum score attainable. So, if six items were marked ‘‘N/A,’’ the maximum would be 14. Thus, this hypothetical review would have a score of 10 out of 14. On the subsequent pages, the user finds operational definitions for each of the items. These should be carefully read prior to rendering a rating.
The EVIDAAC appraisal of the meta-analysis by Schwartz and Nye (2006) was completed independently by two reviewers from the EVIDAAC Editorial Board, and any disagreements were resolved prior to yielding this final appraisal. These are some of the considerations where the review fell short: (a) the search terms used were not stated; (b) the criteria for inclusion and exclusion were deemed inappropriate because of the exclusion of alternating treatments designs (ATDs), adapted ATDs, and ABAB designs; (c) the reliability of data extraction was not reported; and (d) criteria for making quality assessments were limited to treatment fidelity. Overall, this review received a score of 15 out of 19 (the item on sensitivity analysis was marked ‘‘N/A’’ given that this is a meta-analysis of single-case experimental designs and there was only one group design study). 
There are currently no absolute empirical guidelines for classifying review quality based on scores. Yet, based on our experience in rating other systematic reviews/meta-analyses it seems that this review is of fairly high quality, given that perfect reviews are difficult to find.
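The scoring rule described in this section (the score is the number of ‘‘yes’’ ratings; the maximum attainable is 20 minus the number of ‘‘N/A’’ ratings) can be illustrated with a short sketch. The function name `checklist_score` and the representation of ratings as a list of strings are assumptions made for illustration only; they are not part of the EVIDAAC checklist itself.

```python
from typing import Sequence, Tuple

TOTAL_ITEMS = 20  # the checklist has 20 items in total
ALLOWED_RATINGS = {"yes", "no", "N/A"}


def checklist_score(ratings: Sequence[str]) -> Tuple[int, int]:
    """Return (score, maximum attainable score) for one appraised review.

    score   = number of "yes" ratings
    maximum = total number of items minus the number of "N/A" ratings
    """
    if len(ratings) != TOTAL_ITEMS:
        raise ValueError(f"expected {TOTAL_ITEMS} ratings, got {len(ratings)}")
    unknown = set(ratings) - ALLOWED_RATINGS
    if unknown:
        raise ValueError(f"unexpected rating(s): {sorted(unknown)}")
    score = sum(1 for rating in ratings if rating == "yes")
    maximum = TOTAL_ITEMS - sum(1 for rating in ratings if rating == "N/A")
    return score, maximum


# Hypothetical non-meta-analytic review from the text: 10 "yes" ratings,
# with the six meta-analysis-only items marked "N/A".
hypothetical = ["yes"] * 10 + ["no"] * 4 + ["N/A"] * 6
score, maximum = checklist_score(hypothetical)
print(f"{score} out of {maximum}")  # prints "10 out of 14"
```

A meta-analysis with 15 ‘‘yes’’ ratings, 4 ‘‘no’’ ratings, and one item marked ‘‘N/A’’ would come out as 15 out of 19 under the same rule, which matches the score reported above.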
SUMMARY AND CONCLUSIONS
Systematic reviews are increasingly important pieces of evidence that inform clinical and educational decision making related to children with disabilities. Yet, systematic reviews vary in quality, and the EBP mandates call for an appraisal of the quality of evidence coming from primary research as well as secondary research such as systematic reviews. In this chapter, we have made the case, invoking the 6S pyramid, that stakeholders should seek out synopses (appraisals) of systematic reviews rather than the systematic reviews themselves. This allows the stakeholder to save time and minimize the skill burden otherwise associated with quality appraisal. To that end, we have introduced four evidence-based information sources and described how to interpret synopses from each of the sources. Additionally, we have offered some strategies for coping with multiple synopses of the same systematic review. In the absence of agreed-upon guidelines on how to reconcile discrepant findings from multiple synopses, we regard these strategies as preliminary and hope they prompt further discussion. Until the field produces summaries of the lower layers of evidence (including synopses of syntheses), it will be important for the stakeholder to know about information sources that offer synopses and to apply skills in reconciling multiple synopses. Finally, because it is unrealistic to expect that all systematic reviews will be appraised any time soon, it remains critical that the stakeholder be equipped with the tools to conduct an appraisal should that become necessary. To that end, we have introduced the reader to the EVIDAAC Systematic Review Checklist. The infrastructure of evidence-based information sources continues to evolve, to the overall benefit of implementing evidence-based practice in real settings. As such, these are exciting times to be part of the field.
ACKNOWLEDGMENTS
EVIDAAC, one of the information sources introduced in the paper, was funded by a grant from the National Institute on Disability and Rehabilitation Research (NIDRR), US Department of Education (H133G070150-08) to Ralf W. Schlosser. The authors, however, bear sole responsibility for the content of this paper, and funding by NIDRR does not imply that the opinions expressed in this report are those of the agency.
REFERENCES
American Speech-Language-Hearing Association. (2005). Roles and responsibilities of speech-language pathologists with respect to augmentative and alternative communication: Position statement. Retrieved from www.asha.org/policy
American Speech-Language-Hearing Association. (2006). Guidelines for speech-language pathologists in diagnosis, assessment, and treatment of autism spectrum disorders across the life span. Retrieved from www.asha.org/policy
Beck, A. (2009). Research on the effectiveness of the Picture Exchange Communication System (PECS) has increased, but this review is not very systematic [Abstract]. Evidence-Based Communication Assessment and Intervention, 3, 136–140. Abstract of: Sulzer-Azaroff, B., Hoffmann, A. O., Horton, C. B., Bondy, A., & Frost, L. (2009). The Picture Exchange Communication System (PECS): What do the data say? Focus on Autism and Other Developmental Disabilities, 24, 89–103.
Choi, P. T. L., Halpern, S. H., Malik, N., Jadad, A. R., Tramér, M. R., & Walder, B. (2001). Examining the evidence in anesthesia literature: A critical appraisal of systematic reviews. Anesthesia and Analgesia, 92, 700–709.
Delano, M. E. (2007). Video modeling interventions for individuals with autism. Remedial & Special Education, 28, 33–42.
DiCenso, A., Bayley, L., & Haynes, R. B. (2009). Accessing pre-appraised evidence: Fine-tuning the 5S model into a 6S model. Evidence-Based Nursing, 12(4), 99–101.
Eagly, A. H., & Wood, W. (1994). Using research to plan future research. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 485–500). New York, NY: Russell Sage Foundation.
Guyatt, G., & Rennie, D. (2002). Users' guides to the medical literature: Essentials of evidence-based clinical practice. Washington, DC: American Medical Association.
Haynes, R. B. (2001). Of studies, syntheses, synopses, and systems: The “4S” evolution of services for finding current best evidence. ACP Journal Club, 134, A11–A13.
Haynes, R. B. (2006). Of studies, syntheses, synopses, summaries, and systems: The 5S evolution of information services for evidence-based decision making. ACP Journal Club, 145(3), A8–A9.
Humphris, D., Littlejohns, P., Victor, C., O’Halloran, P., & Peacock, J. (2000). Implementing evidence-based practice: Factors that influence the use of research evidence by occupational therapists. British Journal of Occupational Therapy, 63, 516–522.
Lang, R., Mulloy, A., Giesbers, S., Pfeiffer, B., Delaune, E., Didden, R., … O’Reilly, M. (2011). Behavioral interventions for rumination and operant vomiting in individuals with intellectual disabilities: A systematic review. Research in Developmental Disabilities, 32, 2193–2205.
Law, J. (in press). Appraising systematic reviews. In R. W. Schlosser, J. Sigafoos, J. Law, T. Klee, & A. M. Raymer (Eds.), Evidence-based practice in speech-language pathology. New York, NY: Psychology Press.
Lotan, M., & Gold, C. (2009). Meta-analysis of the effectiveness of individual intervention in the controlled multisensory environment (Snoezelen) for individuals with intellectual disability. Journal of Intellectual and Developmental Disability, 34(3), 207–215.
Millar, D., Light, J. C., & Schlosser, R. W. (2006). The impact of augmentative and alternative communication intervention on the speech production of individuals with developmental disabilities: A research review. Journal of Speech, Language, and Hearing Research, 49, 248–264.
Moher, D., Tetzlaff, J., Tricco, A. C., Sampson, M., & Altman, D. G. (2007). Epidemiology and reporting characteristics of systematic reviews. PLoS Medicine, 4, e78.
Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097. doi: 10.1371/journal.pmed.1000097
Myers, S. M., & Johnson, C. P. (2007). Management of children with autism spectrum disorders. Pediatrics, 120, 1162–1182.
Oxford Centre for Evidence-Based Medicine Levels of Evidence Working Group. (2011). The Oxford 2011 levels of evidence. Retrieved from http://www.cebm.net/index.aspx?o=5653
Petticrew, M., & Roberts, H. (2006). Systematic reviews in the social sciences: A practical guide. Malden, MA: Blackwell Publishing Co.
Raghavendra, P. (2010). PECS promotes functional communication; however, more research is needed to investigate its efficacy in Phases IV to VI and its impact on speech and functional communication across contexts [Abstract]. Evidence-Based Communication Assessment and Intervention, 5, 7–10. Abstract of: Tincani, M., & Devis, K. (2009). Quantitative synthesis and component analysis of single-participant studies on the Picture Exchange Communication System. Remedial and Special Education.
Schlosser, R. W., & Goetze, H. (1992). Effectiveness and treatment validity of interventions addressing self-injurious behavior: From narrative reviews to meta-analysis. Advances in Learning and Behavioral Disabilities, 7, 135–175.
Schlosser, R. W., Koul, R., & Costello, J. (2007). Asking well-built questions for evidence-based practice in augmentative and alternative communication. Journal of Communication Disorders, 40, 225–238.
Schlosser, R. W., & Raghavendra, P. (2004). Evidence-based practice in augmentative and alternative communication. Augmentative and Alternative Communication, 20, 1–21.
Schlosser, R. W., & Sigafoos, J. (2009). Navigating evidence-based information sources in augmentative and alternative communication. Augmentative and Alternative Communication, 25, 225–235.
Schlosser, R. W., Sigafoos, J., Raghavendra, P., & Wendt, O. (2008). EVIDAAC Systematic Review Checklist. Unpublished manuscript, Northeastern University, Boston, MA.
Schlosser, R. W., Wendt, O., & Sigafoos, J. (2007). Not all systematic reviews are created equal: Considerations for appraisal. Evidence-Based Communication Assessment and Intervention, 1, 138–150.
Schwartz, J. B., & Nye, C. (2006). Improving communication for children with autism: Does sign language work? EBP Briefs, 1, 1–17.
Simpson, R. L. (2011). Meta-analysis supports Picture Exchange Communication System (PECS) as a promising method for improving communication skills of children with autism spectrum disorders [Abstract]. Evidence-Based Communication Assessment and Intervention, 5, 3–6. Abstract of: Flippin, M., Reszka, S., & Watson, L. R. (2010). Effectiveness of the Picture Exchange Communication System (PECS) on communication and speech for children with autism spectrum disorders: A meta-analysis. American Journal of Speech-Language Pathology, 19, 178–195.
Subramanian, S., & Wendt, O. (2010). PECS has empirical support, but limitations in the systematic review process require this conclusion to be interpreted with caution [Abstract]. Evidence-Based Communication Assessment and Intervention, 4, 22–26. Abstract of: Hart, S. L., & Banda, D. R. (2009). Picture Exchange Communication System with individuals with developmental disabilities: A meta-analysis of single subject studies. Remedial and Special Education. Advance Online Publication. doi: 10.1177/0741932509338354
Wendt, O., & Boesch, M. (2010). Systematic review documents PECS effectiveness for exchange-based outcome variables, but effects on speech, social, or challenging behavior remain unclear [Abstract]. Evidence-Based Communication Assessment and Intervention, 4, 55–61. Abstract of: Preston, D., & Carter, M. (2009). A review of the efficacy of the Picture Exchange Communication System intervention. Journal of Autism and Developmental Disorders, 39, 1471–1486.
Zipoli, R. P., & Kennedy, M. (2005). Evidence-based practice among speech-language pathologists: Attitudes, utilization, and barriers. American Journal of Speech-Language Pathology, 14, 208–220.
CHAPTER 4
ADAPTING RESEARCH-BASED PRACTICES WITH FIDELITY: FLEXIBILITY BY DESIGN
LeAnne D. Johnson and Kristen L. McMaster
ABSTRACT
The contemporary focus on high-fidelity implementation of research-based practices often creates tensions for educators who seek to balance fidelity with needed flexibility as they strive to improve learner outcomes. In an effort to improve how decisions are made such that flexibility is achieved while fidelity to core components is maintained, this chapter begins with a discussion of the role of fidelity in research and practice. Particular attention is given to current conceptualizations of fidelity that may help inform theoretically and empirically driven adaptations to research-based practices. Specifically, we describe adaptations based on the instructional context for implementation and the characteristics of the individual learners. A framework for adapting research-based practices is then presented with relevant examples from research designed to optimize learner responsiveness without sacrificing fidelity to core components. The chapter ends with implications and future directions for research and practice.
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 65–91
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026006
Current educational mandates emphasize that, to ensure that all students progress toward high academic standards, educators must implement scientific research-based practices (Elementary and Secondary Education Act, U.S. Department of Education, 2002; Individuals with Disabilities Education Act, U.S. Department of Education, 2004). Further, educators must implement such practices with fidelity by implementing the components of the practices in the way that they were intended. Failure to do so can lead to uncertainty as to whether a research-based practice was truly in place, and could erode its potential impact on student outcomes. Yet, tensions arise between standards for high fidelity and practical needs for flexibility when educators attempt to adopt and sustain research-based practices. Inevitably, many educators will modify practices in ways they believe will better fit their core curricula, their own teaching styles, or their students’ specific needs (Ferrer-Wreder, Adamson, Kumpfer, & Eichas, 2012). For those students who experience significant academic or behavioral difficulties, implementing research-based practices with fidelity may not produce sufficient growth (e.g., Battistich, 2003; Denton, Fletcher, Anthony, & Francis, 2006; Dietsch, Bayha, & Zheng, 2005; Wanzek & Vaughn, 2009). In such cases, it may be necessary to modify practices to better meet individual students’ needs (Webster-Stratton, Reinke, Herman, & Newcomer, 2011). In this chapter, we discuss the role of fidelity in research and practice, and considerations regarding balancing fidelity of research-based practices with flexibility to adapt those practices to fit specific instructional conditions and individual student needs. Then, we review relevant research addressing the adaptation of research-based practices. 
We present a framework for adapting research-based practices to optimize participant responsiveness without sacrificing fidelity to core components, and end with implications for research and practice.
THE ROLE AND IMPORTANCE OF FIDELITY IN RESEARCH AND PRACTICE
In the last several years, implementation fidelity has garnered increased attention in education research and practice, in alignment with the current emphasis on accountability and evidence-based practice (O’Donnell, 2008). In research, “procedures for ensuring and assessing fidelity of implementation” are listed as “essential” quality indicators for experimental and quasi-experimental group designs (Gersten et al., 2005, p. 151), and as “highly desirable” for single-subject research designs (Horner et al., 2005, p. 174).
Similarly, approaches to monitoring and ensuring fidelity, as well as to understanding factors related to fidelity and its impact on intervention outcomes, are emphasized in current Requests for Application for federally funded intervention research (e.g., Institute for Education Sciences, 2011). Researchers’ documentation of fidelity contributes to internal validity of intervention studies, in that findings may be interpreted in light of the extent to which the intervention was implemented as intended (Durlak & DuPre, 2008; O’Donnell, 2008). Documentation of fidelity also contributes to external validity of intervention studies. For example, findings may be interpreted in light of the variation of implementation fidelity across a variety of instructional conditions (O’Donnell, 2008). For these reasons, dissemination outlets that review research-based interventions for consumer use, such as the National Center on Response to Intervention (NCRTI; www.rti4success.org), include evidence of fidelity as a design feature that improves confidence in findings of intervention studies. Fidelity of implementation of research-based practices is also emphasized in practice, which is the central focus of this chapter. The obvious assumption underlying the importance of fidelity in practice is that interventions can only be expected to produce the intended effect, or be evaluated for lack of effect for individual students, if implemented as they were designed and validated in research. Recent reviews of research on implementation fidelity support this assumption. For example, Durlak and DuPre (2008) conducted a comprehensive review of research on the influence of implementation variables, including fidelity, on the effects of prevention and health promotion programs for children and youth.
Across reports on over 500 interventions, they found that studies that monitored fidelity obtained higher effect sizes than those that did not, and further, that well-implemented programs achieved effect sizes that were .34 higher than those of poorly implemented programs. Similarly, O’Donnell’s (2008) review of research on K-12 curriculum interventions revealed statistically significantly higher outcomes when interventions were implemented with high fidelity. These findings are consistent with findings of earlier reviews of implementation fidelity in the public health literature (e.g., Dane & Schneider, 1998; Dusenbury, Brannigan, Falco, & Hansen, 2003).
ALL FIDELITY, ALL THE TIME? CAVEATS AND CONSIDERATIONS
Findings of the above reviews validate the importance of implementation fidelity. Yet, whereas high fidelity may be related to positive intervention outcomes, it is not clear that low fidelity is always associated with negative outcomes. In fact, in certain circumstances, low fidelity may be associated with neutral or even positive outcomes (Sanetti & Kratochwill, 2009). To better understand dynamic interactions among the theoretical underpinnings of an intervention, fidelity of implementation, and individual responsiveness to the intervention, the following issues require consideration: (a) fidelity is multidimensional, (b) fidelity is influenced by multiple factors, (c) there are likely to be “thresholds” or minimum levels of fidelity needed for an intervention to have positive effects, (d) not all intervention components will require the same levels of fidelity, and (e) some degree of flexibility allowing for adaptation may lead to stronger intervention effects. We discuss each of these issues below.
Fidelity Is Multidimensional
Researchers generally agree that assessing fidelity entails determining “how well an intervention is implemented in comparison with the original program design” (O’Donnell, 2008, p. 33), but there is no broad consensus on an operational definition. To begin with, a range of terms are used (Durlak & DuPre, 2008), from “compliance” (which suggests procedural conformity) to “integrity” (which suggests “doing it right” in a more holistic way). While such terms generally imply accuracy and consistency of implementation, they convey somewhat different meanings regarding the way fidelity is to be quantified and qualified. Current comprehensive models of fidelity incorporate elements of quantity and quality, and emphasize that the construct is complex and multidimensional (Sanetti & Kratochwill, 2009). For example, O’Donnell (2008) identified five criteria from the public health literature for measuring fidelity, and grouped these into two primary dimensions: fidelity to structure and fidelity to process. Fidelity to structure includes adherence (extent to which components are implemented as intended) and duration (amount, length, and frequency of implementation). Fidelity to process includes quality of delivery and program differentiation (the extent to which features that differentiate the program from other practices are implemented). Both dimensions incorporate participant responsiveness, or “the extent to which participants are engaged by and involved in the activities and content of the program” (O’Donnell, 2008, p. 34). Whereas defining fidelity as a multidimensional construct accounts for critical implementation variables beyond simple adherence to a protocol, it also complicates the extent to which all criteria can be met at high levels.
For example, in certain situations, rigid adherence to all components may come at the cost of lower-than-optimal participant responsiveness to those components (Sanetti & Kratochwill, 2009). Thus, we argue that the degree to which certain aspects of an intervention can be implemented with flexibility should also be considered in comprehensive models of fidelity, a point to which we return later in this chapter.
Fidelity Is Influenced by Multiple Factors
Numerous factors affect implementation fidelity and its impact on intervention outcomes. Sanetti and Kratochwill (2009) summarized these variables across four levels: (1) external environment variables, which include (among other things) levels of support or opposition from stakeholders, consistency of the intervention with policy, and level of bureaucratic or political barriers; (2) organizational variables, such as access to resources needed to implement the intervention, as well as a climate that promotes integration of the intervention into existing practices; (3) intervention variables, which include the specificity (e.g., manualization), complexity of the intervention, time required, materials, efficacy, ease of implementation, and compatibility with the context into which it is adopted; and (4) interventionist variables, which include interventionists’ skills, motivation, self-efficacy, and perceptions of the need for and benefits of the intervention. Interventionists’ capacity to meet multidimensional criteria for fidelity (e.g., those outlined by O’Donnell, 2008) likely depends on interactions among these contextual factors. Additional factors that may influence fidelity and its impact on outcomes are individual characteristics of the recipient of the intervention (e.g., Sanetti & Kratochwill, 2009; Schulte, Easton, & Parker, 2009).
For example, a participant may not engage in an intervention component due to difficulties in paying attention for extended time periods, leading the interventionist to decide to implement that component for a shorter duration. Such an adjustment may not adhere to the intervention protocol, but may increase implementation “as intended” if the participant engages more fully in the intervention. Thus, we propose that recipient characteristics constitute an important factor that may influence fidelity.
How Much Fidelity Is Enough?
Another consideration that is not well understood in the literature is the level of fidelity or threshold that must be reached to influence intervention outcomes (Castro, Barrera, & Martinez, 2004; Durlak & DuPre, 2008). Presumably, interventions will rarely be implemented with 100% fidelity, so the question is how much fidelity is enough. The answer is likely to vary by intervention and subgroups of participants receiving the intervention. For example, Felner et al. (2001) reported findings of an evaluation of two research-based school improvement programs designed to prevent school drop-out and improve student performance. They found that not only were higher levels of implementation fidelity associated with higher gains in student performance, but also that even higher levels of implementation were needed for students at higher risk.
In the above example, it appeared that implementers could expect increased gains, especially among higher-risk students, the more that they refined their implementation. However, higher levels of fidelity may not always lead to greater intervention benefits. For example, Stein et al. (2008) reported a study investigating levels of support needed to promote teachers’ successful use of Kindergarten Peer-Assisted Learning Strategies (K-PALS; Fuchs et al., 2001), a classwide peer tutoring program designed to improve early reading outcomes. Teachers were assigned randomly to workshop-only, workshop plus booster support (which entailed periodic problem-solving sessions with K-PALS experts), and workshop plus booster plus helper support (which entailed weekly classroom assistance during K-PALS implementation). As expected, fidelity increased with each more intensive level of support. However, whereas significantly greater student reading gains were observed in the workshop plus booster condition compared to the workshop-only condition, there was no significant added value of the helper condition on student outcomes.
Thus, it appeared that the improvements in fidelity in the booster condition were sufficient to achieve an intervention effect, but even greater fidelity had no additional impact. To provide guidance to practitioners as to how much fidelity is enough, it would be helpful if researchers examined and reported the ranges of fidelity associated with positive intervention outcomes overall or for subgroups of recipients (as in Felner et al., 2001) as well as levels for which enhanced benefit is unlikely (as in Stein et al., 2008).
Not All Components Require the Same Levels of Fidelity
Related to the above point, it is likely that fidelity to some intervention components is more critical than fidelity to other components (Sanetti & Kratochwill, 2009). Therefore, it is useful if researchers at least specify those components that are theoretically important to the intervention (i.e., the presumed active ingredients), as well as the ranges of fidelity expected to be needed to effect intended outcomes (Durlak & DuPre, 2008; O’Donnell, 2008). Then, interventionists would likely do well to be particularly faithful to those core components. An even more useful approach, perhaps, would be for researchers to generate empirical evidence that indicates which intervention components are most predictive of intervention outcomes. In one such example, Songer and Gotwals (2005) examined the relation between components of a science curriculum intervention and student outcomes. They found one specific component to be particularly strongly associated with a lasting effect on students’ complex reasoning skills. They described this component as especially strong because it was one of the better-developed components of the intervention, and included strong scaffolds to support higher-order thinking skills (Songer & Gotwals, 2005). Such information seems critical to convey to interventionists, whose understanding of both the theoretical and empirical basis of core components may increase the likelihood that they will implement these components with fidelity. Then, when interventionists do adapt or modify a research-based practice, they may be more likely to do so in a way that retains the core components and does not contradict the theory underlying the intervention (O’Donnell, 2008).
Balancing Fidelity with Adaptability
There is some agreement in the literature that adaptation happens, and that it is not necessarily a bad thing (e.g., Castro et al., 2004; Ferrer-Wreder et al., 2012; O’Donnell, 2008). In fact, in some cases, adaptation has led to better participant outcomes (Durlak & DuPre, 2008). An important question, then, is whether and how an “optimal mix of fidelity and adaptation” (Ferrer-Wreder et al., 2012, p. 111) might be achieved. To resolve tensions between fidelity and adaptability, an important first question is, “Why adapt?” In some cases, adaptation may result from misconceptions of (or even opposition to) how an intervention is intended to be implemented. In such cases, it may be useful to recalibrate implementation toward higher fidelity, which can often be achieved by ensuring that appropriate professional development and implementation supports are in place (e.g., Kretlow & Bartholomew, 2010; Stein et al., 2008). In other cases, adaptation may be desirable to improve the fit of the intervention within contextual variables such as those described above (Castro et al., 2004).
In either case, it would be particularly useful for developers to specify theoretically and/or empirically important core components and threshold levels of fidelity, and further, to indicate aspects of the intervention for which adaptation can be tolerated or even encouraged (Castro et al., 2004; Durlak & DuPre, 2008). Given the breadth of research that exists to support and maintain fidelity of implementation, the remainder of this chapter will focus on adaptation that may occur in relation to the context in which interventions are implemented.
FLEXIBILITY BY DESIGN
Given the likelihood that adaptation will occur, we argue that intervention developers can be proactive by designing interventions that not only anticipate a range of contextual factors, such as instructional conditions and recipient characteristics, but also include points of flexibility for adaptation to fit within these contexts (e.g., Dane & Schneider, 1998; Power et al., 2005; Sanetti & Kratochwill, 2009; Schulte et al., 2009). For example, in anticipation of “nonresponders” at the individual child level within a small-group reading tutoring program, contemporary multidimensional models of fidelity have included variables that reflect both the characteristics of the child recipient (e.g., features that promote engagement; appropriate pacing and length to maintain attention) and the child’s exposure to intervention (e.g., attendance in sessions and/or completion of specific tasks). Attending to such variables represents an important leap forward in accounting for the flexible way in which implementation occurs in real-world classrooms and other learning environments.
Interventionists’ decisions about what aspects of the intervention to adapt often represent their “best guess” of what can be accomplished within contextual constraints (e.g., time, resources, space) and what will be most responsive to the characteristics of the children receiving the intervention. We believe this “best guess” can be transformed into an “informed decision” by providing a conceptual model to guide decision making, such that flexibility becomes part of the systematic design of both the intervention and the implementation plan.
Fig. 1 provides a preliminary framework for guiding future decision making and exploration into systematic flexibility and adaptability intended to enhance individual treatment effects. The assumption underlying this framework is that all decisions begin with selection of a research-based intervention or set of practices. With a research-based intervention as the starting point, interventionists should then consider the instructional context and the characteristics of the recipient of the intervention to make decisions about adapting the intervention.
Fig. 1. Conceptual Model to Guide and Evaluate Adaptations. [Figure labels: Research-Based Intervention; Decision Guides; Child Characteristics; Instructional Context; Adaptive Implementation of Manualized Components; Adapted Implementation of Manualized Components; Adaptive Implementation of Adapted Components.]
Instructional Context
Several researchers have provided important and helpful examinations of variables within the instructional context that may influence fidelity of implementation compared to manualized procedures (Bosworth, Gingiss, Potthoff, & Roberts-Gray, 1999; Durlak & DuPre, 2008; Dusenbury et al., 2003). For this discussion, we define instructional context very broadly to include variables associated with the organization (e.g., staffing, time, facilities, materials), treatment intensity (e.g., dose, dose delivery mechanisms), and characteristics of the targeted group of recipients as a whole (e.g., culture, socioeconomic status, baseline skills). Building on the thorough reviews of Sanetti and Kratochwill (2009) and Schulte et al. (2009), we highlight several contextual variables to illustrate ways in which those variables influence adaptations to research-based practices.
Organizational Variables
The literature includes several examples of research-based interventions that have been adapted to accommodate organizational variables (Fagan, Hansen, & Hawkins, 2008; Loman, Rodriguez, & Horner, 2010; Power et al., 2005). Organizational variables include where an intervention is delivered, who delivers it, format for delivery, and breadth and depth of content delivered. Depending on the intervention, the degree to which these features are represented by measures of fidelity and/or understood within the theoretical underpinnings of the intervention may vary dramatically. Thoughtfully planned empirical examinations of adaptations to implementation are necessary if informed decisions by real-world implementers are the goal.
The Community Youth Development Study (CYDS; Fagan et al., 2008) exemplifies how empirical information may inform flexible implementation of interventions. As part of CYDS, 13 different prevention programs were implemented in 12 community settings. Measures of implementation fidelity included (a) adherence to program components and content, (b) dosage, (c) quality of delivery, and (d) participant responsiveness. This comprehensive approach to monitoring fidelity allowed for documentation of adaptations from the intended implementation as well as an opportunity to consider the adaptations relative to child outcomes. Adaptations included delivery of content to large groups rather than small groups to accommodate logistical needs; deletion of certain delivery modes such as role plays given group size, time, or instructional style of the implementer; and deletion of certain program content to accommodate other content that the implementer wanted to present. Although these types of adaptation are expected to occur in community-based implementation, examining those adaptations relative to outcomes is a necessary step that has yet to be fully realized.
Fagan et al.’s (2008) approach was to categorize the appropriateness of each adaptation based on a combination of theoretical and empirical evidence. This process of careful monitoring and categorization of adaptations allows for an empirical evaluation of any impact that adaptations may have on recipient outcomes. From there, recommendations may be made to enhance flexible implementation without jeopardizing outcomes. Creating adaptations associated with organizational variables requires a clear theoretically and empirically derived basis for knowing which content-related elements of the intervention are core to the change process and must therefore remain fixed within implementation, and which may be allowed to vary. Several demonstrations of empirical tests of interventions that have
been adapted based on organizational variables exist in the literature. Bloomquist, August, Lee, Piehler, and Jensen (2012) tested the effects of adapted delivery mechanisms by delivering the Early Risers Conduct Prevention Program (August, Realmuto, Hektner, & Bloomquist, 2001) to groups through community centers or to individuals through home visiting. Though core components of the program remained the same, attendance was higher and the focus was on more family-oriented goals for low-income families who were offered training through community centers. A similar test of organizational variables was conducted by Boettcher Minjarez, Williams, Mercier, and Hardan (2011), who also adapted the delivery mechanisms associated with Pivotal Response Training (PRT; Koegel, Koegel, Harrower, & Carter, 1999). Seeking to enhance the transportability and impact of the intervention, Boettcher Minjarez et al. (2011) found that a group delivery format for the training was comparable to the traditional individual delivery format when examining parent implementation. In each of these examples, close attention was given to the theoretical underpinnings of the intervention so as not to jeopardize essential elements of change.
Treatment Intensity
Treatment intensity has a very distinct role in this discussion of core elements. In medicine, it is well established that research must be conducted to help interventionists decide for whom and under what circumstances certain treatments and treatment delivery systems are most likely to be effective. Treatment theory encompasses not only the content basis for the treatment, but also treatment dosage and the interaction between dosage and the unique characteristics of the recipient (Siemonsoma, Schroder, Roorda, & Lettinga, 2010).
To date, substantial attention in the educational arena has been given to the theory of change that is generated to guide intervention content, with relatively sparse attention to the parameters of dosage with which interventions are implemented (Warren, Fey, & Yoder, 2007). Warren et al. (2007) observed that no direct comparison studies have been reported in which intensity is treated as the independent variable with all other intervention variables held constant. Further, there is no standard or widely accepted definition of treatment intensity in the intervention literature. Precise terms describing dimensions of intervention intensity are particularly important to further efforts that support systematic adaptation of research-based interventions. More specifically, Warren et al. (2007) described an index of cumulative intervention intensity derived from dose,
dose frequency, dose form, and total intervention duration, all representing different dimensions of intensity. To exemplify the complexities of experimentally validating treatment effects relative to each dimension of treatment intensity, we briefly consider milieu teaching procedures, which are broadly accepted as efficacious for improving language skills in preschoolers with autism (see Kaiser & Grim, 2005 for a review). Children with significant language delays have demonstrated communication acquisition and generalization when provided a dense set of learning opportunities within a clinical setting (e.g., structured play for 20 min per day, 3 days per week, for 6 months; Hancock & Kaiser, 2006). Translating these positive effects to a more natural environment, such as a preschool classroom, highlights the importance of specifying the dimensions of treatment intensity. An interventionist in a natural classroom environment must understand the relative importance of specific parameters of dosage, such as: (a) How many specific structured interactions (doses) need to be offered? (b) Can other classroom routines besides playtime (dose forms) be utilized to offer the intervention? (c) How much time must be devoted to intervention each day (dose frequency), and can that time be distributed across multiple classroom routines or must it be offered all at once? and (d) Given how intervention is implemented based on each of the other parameters (cumulative index of treatment intensity), at what point should changes in child performance be expected? In the absence of explicit recommendations from intervention researchers and developers as to what comprises a dose of intervention, along with when, how, and how long doses of intervention are delivered, there is substantial room for variability.
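The dosage parameters listed above can be made concrete with a small calculation. The sketch below assumes, for illustration only, that the cumulative index is the product of dose, dose frequency, and total duration; the chapter says only that the index is derived from these dimensions. The class name, field names, units, and all numbers are hypothetical and are not taken from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntensityPlan:
    """Hypothetical record of the dosage parameters for one delivery plan."""
    dose: int            # teaching episodes per session (assumed unit)
    dose_frequency: int  # sessions per week
    duration_weeks: int  # total intervention duration in weeks
    dose_form: str       # routine used to deliver the doses (descriptive only)

    def cumulative_intensity(self) -> int:
        # Assumed product form: dose x dose frequency x total duration.
        return self.dose * self.dose_frequency * self.duration_weeks


# Invented numbers for illustration; neither plan comes from the literature.
clinic = IntensityPlan(dose=30, dose_frequency=3, duration_weeks=26,
                       dose_form="structured play")
classroom = IntensityPlan(dose=10, dose_frequency=9, duration_weeks=26,
                          dose_form="play, snack, and circle time")

print(clinic.cumulative_intensity(), classroom.cumulative_intensity())
```

Under these assumptions the two plans reach the same cumulative total even though the classroom plan spreads fewer doses across more sessions and routines. Whether such plans are equally effective is exactly the kind of question that would need direct empirical comparison.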
We contend that treatment intensity represents a unique aspect of the instructional context through which adaptation is both likely and potentially quite beneficial, despite relatively little empirical guidance. Hence, it is important for future work to directly evaluate different aspects of intervention intensity to generate empirically based recommendations for adaptations that will enhance flexibility without jeopardizing efficacy.
Recipient Group as a Whole
The characteristics of the recipient group as a whole represent another set of variables within the instructional context that influence adaptations to interventions. A growing body of research is developing to provide empirical insight into the mediating or moderating effects of group characteristics such as culture, socioeconomic status, and developmental skills. Lau’s (2006) work has been particularly influential for considering
adaptations to research-based practices with a specific cultural group in mind. With a focus on both the need to ensure adequate participation in the planned intervention and outcomes that are consistent with those targeted for change by the intervention, Lau suggested that cultural adaptations are warranted when a specific group is not benefiting significantly or there are unique clinical problems specific to a cultural group. Webster-Stratton et al. (2011) demonstrated Lau’s principles for cultural adaptation. They described intentional adaptations made to the Incredible Years Teacher Classroom Management intervention, which is one of three interlocking components of the Incredible Years Training Series (Webster-Stratton, 1994). Acknowledging the importance of adaptations based on the cultural context of child recipients for the purpose of achieving high fidelity implementation with high impact across implementers, Webster-Stratton et al. (2011) highlighted the adaptive process that builds on the intervention’s core principles. The process of making adaptations is based on reciprocal interactions between training leaders, who have knowledge of the content as well as the principles of the intervention, and implementers, who have knowledge of as well as experience with the cultural context. Specific cultural adaptations to the Incredible Years Training Series were described by Baker-Henningham (2011), who explored the transportability of the intervention to a developing country. Focus groups with parents and teachers revealed that several core components of the intervention were unfamiliar and likely would require additional time in training and coaching beyond the manualized recommendations. Specifically, joining in children’s play and giving children choices appeared to be strategies that, though core to the intervention, were not valued by the cultural group.
Further, classroom teachers placed all responsibility for addressing the child’s behavior on the parents. Given these conditions, Baker-Henningham (2011) concluded that the intervention would need to be augmented with additional strategies to support teachers in working directly with parents. Further, culturally relevant role plays, experiences, and metaphors needed to be infused throughout the training and coaching. The process utilized by Webster-Stratton et al. (2011) for adapting the training and implementation process to reflect the cultural norms and existing characteristics of the recipient group as a whole involved training leaders who were skilled at facilitating collaborative problem-solving, use of video vignettes to model intervention delivery and promote discussion, structured role-play and practice to create experiential learning opportunities for self-reflection, small group implementation planning time, and
regular coaching sessions. The researchers’ grounding of this adaptive process in a theoretically and empirically based set of core principles for facilitating treatment fidelity and effective intervention is a necessary feature that we revisit later in this chapter.
Child Characteristics
We consider individual characteristics of children who receive an intervention differently from those variables that characterize a broader group to which a child may belong. For our purposes, the distinction is the level at which adaptations are being implemented and fidelity is being monitored. When considering the instructional context, adaptations are made with an entire target group in mind. In contrast, when considering child characteristics, adaptations are made with an individual child or a subset of children in mind. Perhaps the most straightforward way to think about adaptations at this level is to consider needs and preferences. August, Gewirtz, and Realmuto (2010) provided a helpful model in which adaptations are made to research-based practices with the intent of enhancing the fit between the intervention and either the child’s profile of strengths and difficulties (needs) or the child’s level of motivation and interest (preferences) to engage in and actively participate in the intervention. Examples of these types of adaptations are present throughout the literature. In a recent examination of multitier academic and behavior instruction for students in Grades K–3 who were at high risk for school failure, Algozzine et al. (2012) described the importance of careful selection of interventions that are appropriately matched to children’s needs. Though initial selection of intervention based on children’s needs seems intuitive, the interventionist’s data-based decisions to adapt and further personalize selected interventions represent a truly adaptive model that is sensitive to individual child characteristics. Algozzine et al. (2012) used Dynamic Indicators of Basic Early Literacy Skills (DIBELS; Good & Kaminski, 2002) and Office Discipline Referrals (ODR) that were consistent with the School-Wide Information System (SWIS) to monitor children’s performance. Algozzine et al. found that when classroom teachers combined these data sources and used them in a timely manner to systematically adapt interventions to students’ behavioral and/or academic needs, students achieved better outcomes. For example, a kindergartener who was not progressing with basic literacy skills received Reading Mastery given the intensity of intervention that curriculum provides. By intervening for that child with
intensity, enhancing academic competency served as a preventative intervention for later behavior problems. For another child who participated in behavioral contracting with his teacher and continued to demonstrate disruptive behavior, an individualized intervention plan based on a functional behavior analysis was developed to prevent more severe problem behavior that would further inhibit learning over time. Other recent examples of interventions effectively adapted to children’s needs include an adaptation of Check in/Check out for students with cognitive delays and disabilities (Boden, Ennis, & Jolivette, 2012) and adaptations of the Check, Connect, and Expect program for students demonstrating specific profiles of behavior (Stage, Cheney, Lynass, Mielenz, & Flower, 2012). In addition to adaptations made based on needs, a growing body of research has established the importance of adapting to the preferences of the recipient of the intervention. Caldwell, Bradley, and Coffman (2009) hypothesized that adapting the TimeWise intervention to support intrinsically motivated adolescents’ existing good decision making and providing them with resources to become involved in new activities may enhance efficacy. Conversely, for adolescents who are extrinsically motivated, lessons within TimeWise that focus on goal setting and enhancing self-regulation may be important adaptations to enhance efficacy. A similar example comes from a reading intervention study in which Zentall and Beike (2012) hypothesized that it may be helpful to adapt reading intervention such that children with identified reading problems receive more teacher-directed instruction as opposed to children with attention problems who would receive instruction tailored to their own curiosities and interests. As with Caldwell et al., an important extension of this work is to experimentally test the effects of empirically and theoretically driven adaptations for subgroups of students.
Summary of Conceptual Framework
In summary, as depicted in Fig. 1, our conceptual framework for building flexibility into intervention design identifies instructional context and child characteristics as variables that may influence the extent to which a research-based intervention is implemented with fidelity. Further, it specifies points of overlap where the research-based intervention might be adapted to accommodate (or account for) specific instructional contexts, child characteristics, or both. We propose that intervention developers create decision guides to be disseminated with the intervention to facilitate interventionists
making systematic adaptations in ways that align with the theoretical underpinnings and maintain the integrity of core intervention components. For example, developers might identify core components that must always be maintained, noncore components that may be adapted, examples of instructional conditions under which such adaptation may be desirable (or necessary), and ways to determine when children are unresponsive to the standard protocol and may need additional individualized adaptations. Decision guides could include a series of questions (e.g., presented in a flow chart) that would guide systematic analysis of the specific instructional conditions and student characteristics in relation to the implementation of the intervention. Developers could even provide examples of adaptations that align with the theoretical underpinnings of the intervention for further guidance. Below, we further explicate how such a framework might be applied to adapting research-based academic and behavioral interventions.
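One way to picture the flow-chart style decision guide described above is as a short sequence of questions applied to each proposed change. The sketch below is illustrative only: the `Component` type, the `review_adaptation` function, the three questions, and the returned recommendations are hypothetical and are not drawn from any published guide. The example components use the PALS activities, whose core or noncore status is given in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    is_core: bool  # flagged by the developer as essential to the change process


def review_adaptation(component: Component, aligns_with_theory: bool, scope: str) -> str:
    """Walk one proposed change through three questions.

    scope is "group" (whole class or site) or "individual" (one child or a few).
    """
    if scope not in ("group", "individual"):
        raise ValueError("scope must be 'group' or 'individual'")
    # Question 1: is the component core? Core components stay fixed.
    if component.is_core:
        return "keep as manualized"
    # Question 2: does the change fit the intervention's theoretical underpinnings?
    if not aligns_with_theory:
        return "do not adapt"
    # Question 3: at what level will the change be made and monitored?
    if scope == "group":
        return "adapt for the group; monitor group fidelity and outcomes"
    return "adapt for the individual; monitor that child's progress"


partner_reading = Component("Partner Reading", is_core=True)
retell = Component("Retell", is_core=False)

print(review_adaptation(partner_reading, aligns_with_theory=True, scope="group"))
print(review_adaptation(retell, aligns_with_theory=True, scope="group"))
```

Ordering the questions this way means a core component can never be changed by accident, whatever the later answers are. A real guide would likely need more questions, including how to tell when a child is unresponsive to the standard protocol.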
FOUNDATIONS FOR UNDERSTANDING ADAPTED AND ADAPTIVE IMPLEMENTATION
To extend beyond the simple notion that flexibility to adapt research-based interventions is important, we distinguish between adapted and adaptive models for implementation (August et al., 2010). Table 1 illustrates work we have conducted with colleagues in recent years that demonstrates the concepts of adapted and adaptive implementation models to a classroom-based reading intervention (PALS; Fuchs, Fuchs, Mathes, & Simmons, 1997) and a group oriented behavioral intervention (The Good Behavior Game (GBG); Barrish, Saunders, & Wolf, 1968).

Table 1. Examples of Adapted and Adaptive Implementation of Research-Based Interventions.

Research-Based Intervention
Reading Intervention: Peer-Assisted Learning Strategies (Fuchs et al., 1997). Theoretically and empirically derived core components:
(a) Reciprocal peer tutoring, derived from theoretical and empirical work supporting the powerful role of peers in promoting learning through social interaction
(b) 10 min of Partner Reading, derived from research supporting the roles of modeling and repeated reading to promote fluency and comprehension
(c) 10 min of Paragraph Shrinking, derived from research supporting the role of questioning and main idea summarization to promote reading comprehension
(d) A motivation system designed to reinforce positive peer interactions, on-task behaviors, and effort
(e) A duration of 35 min, 3 times per week, to ensure sufficient opportunities to respond (note that two other PALS activities, Retell and Prediction Relay, typically comprise 15 min of PALS, but are not considered “core”)
Behavioral Intervention: The Good Behavior Game (Barrish et al., 1968). Theoretically and empirically derived core components:
(a) Group contingency system typically implemented with large group subdivided into smaller teams to utilize peer modeling and peer contingencies
(b) Explicit behavioral expectations are taught and reviewed each time the game is played
(c) Explicit criteria set and reviewed for obtaining reward at the end of the game
(d) Teacher maintains very consistent responses to rule violations and rule following within and across each session that GBG is implemented

Adapted Model of Implementation based on Manualized Intervention
Reading Intervention: Example: In a recent study focusing on scaling up PALS (Fuchs et al., 2010), researchers compared two groups of PALS teachers and a control. One group of PALS teachers implemented all PALS components with fidelity. The second group implemented the core components with fidelity, but were allowed to adapt the other PALS components to fit their specific classroom needs. For example, a 4th-grade teacher with a high proportion of English Language Learners (ELLs) in her class hypothesized that adding a vocabulary component to PALS would enhance its benefits for this group. So, in addition to the core PALS activities, she added a 10-min activity called “Vocabulary Relay,” in which partners identified unknown words in their texts and employed several strategies to learn the meaning (ask their partner, use context cues, look up the meaning in the dictionary, write the meaning, write a new sentence using the word). Other teachers incorporated other changes, based on their specific classroom needs. An evaluation of the effects of the adapted approach indicated that students who received adapted PALS outperformed students in both the standard version of PALS and the control group (Fuchs et al., 2010).
Behavioral Intervention: Example: In a recent randomized control study that included the GBG as one component of a classroom intervention package, teachers were asked to implement the manualized components of GBG with support from a project coach. Coaches observed teachers’ implementation and offered supportive feedback each week for three months. Feedback was based on a 14-item procedural checklist designed to quantify fidelity of implementation of GBG. Teachers were asked to implement for a minimum of 10 min during language arts instruction. The 10-min duration was based on the theoretically and empirically derived need to establish consistency with teachers’ responses to rule violations and rule following as well as overall adherence to GBG procedures within the context of complex classroom instruction. Though some teachers were able to find a meaningful 10-min segment (discrete activity) within their language arts instruction in which GBG was implemented (e.g., journal writing or daily oral language practice), other teachers adapted the duration of implementation so that GBG would be implemented for the entire 45 min–1 hr of instruction across different types of language arts activities. Though classrooms implementing GBG outperformed the control group on key outcomes, the effect was small and did not sustain over time. A descriptive review of the fidelity data revealed that teachers who adapted implementation for durations longer than 10 min evidenced lower degrees of consistent responses to rule violations and rule following as well as lower levels of fidelity over time.

Adaptive Model of Implementation based on Manualized Intervention
Reading Intervention: Example: A teacher implemented all components of PALS with fidelity with her full classroom, but made individualized modifications based on student needs. For example, a high-performing reader who refused to work with other children was given an “invisible” partner. The child executed all activities as designed, but worked with “Silent Bob” instead of a real child. The child was much more willing to engage in the activities, and a source of disruption for the rest of the class was removed. In such a situation, it is strongly recommended that the teacher collect ongoing progress monitoring data to determine that the child was reaping the intended benefits of PALS.
Behavioral Intervention: Example: Several teachers have opted to implement the GBG within their classrooms. They are concerned, however, that there are several students in each room who will “sabotage” their teams given histories of attention-seeking challenging behavior designed to exert power and control over others. The teachers will implement the manualized procedures as intended, but have opted to designate certain students as playing GBG as individuals while the remainder of the class played in small groups. The student playing as an individual will learn the behavioral expectations and experience the consistency created by the game. After a short period of time demonstrating success with the game as an individual, the teacher will place the child on a team with his/her peers. As part of the manualized procedures, teams are regularly rotated, so the point at which the student is joined with a team is at the discretion of the teacher. This example represents an actual adaptation made by teachers within a large efficacy trial of GBG. Data to evaluate the impact of this adaptive model for the individual or classroom as a whole are not available.

Adaptive Model of Implementation based on Adapted Intervention
Reading Intervention: Example: Teachers implementing their adapted versions of PALS would generally implement the core components with fidelity and adaptations as intended. However, one teacher discovered that one of her students did not yet have sufficient decoding skills to engage in some of the PALS activities. For this student, she individualized PALS so that he worked with an educational assistant, who implemented systematic decoding instruction, followed by the Partner Reading and Vocabulary Relay activities. Later in the school year, his decoding skills were sufficient such that he could join a partner and conduct the PALS activities in the same manner as the rest of his classmates. Again, a critical element of this adaptation would be to monitor the student’s progress to determine the effectiveness of the adaptation.
Behavioral Intervention: Example: The manualized procedures for GBG include providing feedback to teams in the form of a check mark for each rule violation, and then praising teams who are following the rule that was just violated. Though several teachers wanted to use the strategy, they were very concerned this approach might occasion more attention for their most disruptive students who tend to engage in attention-maintained challenging behavior. Two teachers adapted GBG such that rather than responding to rule violations, they responded to rule following and ignored rule violations. An empirical test of this adaptive model for implementation of the two versions of the game established that though both were effective, the adaptation to the procedures produced a slightly stronger behavioral change for their targeted students and was more preferred by teachers (Tanol, Johnson, McComas, & Cote, 2010).

Adapted Models
Adapted models of implementation, as described by August et al. (2010), emphasize adaptation to the intervention based on the instructional context in which the intervention will be implemented. Systematic adaptation is possible only after there has been a theoretically and empirically based evaluation of the intervention that elucidates both the core components as well as the change mechanisms. Though an empirically based evaluation is preferred, we contend that as the empirical foundation that informs adaptations is developing, implementation within community settings occurs with or without expert support guiding adaptations that may be necessary to support implementation. In such instances, explicit descriptions of the theoretical underpinnings of interventions are important starting places for informing adaptations. With continued study, the causal model through which an intervention has an impact on both proximal and distal outcomes will facilitate adaptation while preserving the “active ingredients.” It is important to note that with adapted models, once adaptations are made, implementation proceeds with a new set of manualized procedures such that all learners in the sample receive the adapted intervention. Adapted models are an intended improvement over the “one size fits all” application of evidence-based interventions. However, the intent is that, though tailored within a theoretically and empirically driven process, a new set of manualized procedures tailored to specific instructional contexts is created to promote high implementation fidelity and efficacy. In other words, flexibility is gained at the group level such that implementation is now “right fitted.” Examples of the application of the adapted model for implementation are provided in Table 1. In both examples, researchers guided implementation to maintain core components of the interventions. In the PALS example, core components were retained, but implementers adapted other components to better fit their instructional context. In this case, the adapted approach supported the teachers’ efforts to include a new component that was appropriate to the instructional context. This adapted approach enhanced the impact of the intervention on student performance (Fuchs et al., 2010). Conversely, in the GBG example, when teachers adapted the duration of implementation of GBG, that adaptation was associated with lower fidelity ratings for the core components of consistent responding to rule violation and rule following.
Hence, though duration of implementation was not originally considered to be a core component, adapting implementation based on that dimension had a negative impact on other components and efficacy. These examples illustrate the complex and dynamic interactions between core components and implementation in classroom settings that require careful monitoring and explicit direction.
Adaptive Models
Adaptive models of implementation seek to promote flexibility at the individual level. Such adaptive models would benefit from research that establishes characteristics or profiles of intervention recipients that moderate treatment response such that interventions may be tailored to those characteristics and enhance the treatment response. As adaptations
are systematically made to address the effect of specific characteristics, experimental tests are needed both to validate the adaptations and to create decision guides for future implementers and researchers. August et al. (2010) likened adaptive models to adaptive dynamic decision making that may generate decision guides for stepped-up or stepped-down care as described in prevention science. Stepped-up care provides a more intensive or properly fitted type of intervention when a learner is not being responsive to existing intervention. Stepped-down care recognizes that once interventions are implemented, when a learner is responsive to those interventions, interventionists should be mindful of fading the intensity of intervention back to what is minimally necessary to maintain success. In the education sciences, this model is similar to RTI. The conceptual underpinnings of the adaptive model for implementation may apply to RTI models by encouraging development of an empirically derived set of decision rules for selection and delivery of personalized interventions. Adaptive models for implementation that lead to the development of decision guides are informed by exploratory studies, but require experimental tests within either large-scale randomized control trials or aggregated single-case designs. In the adaptive PALS example in Table 1, the adaptation is based on the needs of one student while implementation of the core components remains unchanged for the large group. In this case, a child who had difficulty cooperating with peers, a necessary skill to facilitate successful participation in PALS, was assigned an “invisible partner” to support the focus on academic skill development. Addressing the child’s cooperative skills with peers would occur outside of the PALS sessions until the teacher felt it appropriate to again ask the child to work with a peer.
In a similar adaptation for GBG, a teacher hypothesized that a child’s behavior was maintained by the attention received when the child was able to have an impact on her peers (positive or negative). To minimize the child’s impact on others during implementation of GBG while also using behavioral reinforcers to address the child’s behavior, the teacher had the child play GBG on her own team. Once successful in meeting the expectations of GBG set for the class, the child was allowed to join a team with her peers.
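The stepped-up and stepped-down logic described in this section can be written as a simple decision rule over progress monitoring data. The sketch below is a hypothetical illustration, not an empirically derived rule: the function name, the three-observation window, and the use of a single goal score are all assumptions made for the example.

```python
from typing import Sequence


def next_step(scores: Sequence[float], goal: float, window: int = 3) -> str:
    """Suggest a change in intensity from the most recent progress scores.

    Hypothetical rule: step up when the last `window` scores all fall below
    the goal, step down when they all meet it, and otherwise hold steady.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = list(scores)[-window:]
    if len(recent) < window:
        return "maintain"  # too few observations to justify a change
    if all(score < goal for score in recent):
        return "step up"   # learner is not responding to the current intervention
    if all(score >= goal for score in recent):
        return "step down"  # fade toward the minimum needed to maintain success
    return "maintain"


# Invented scores for illustration only.
print(next_step([10, 12, 11], goal=20))
print(next_step([22, 25, 21], goal=20))
```

Requiring several consecutive observations before any change is one way to avoid reacting to a single unusual data point. The window length and goal that would be appropriate for a given intervention are empirical questions of the kind the authors call for.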
Adaptive–Adapted Models
Often, both the instructional context and individual child characteristics come together in classroom settings to influence implementation of
interventions, representing the overlap in adaptive and adapted models. In the final examples for both PALS and GBG, the adaptive model is applied to an adapted version of the intervention. For PALS, implementation was already adapted to the instructional context that required reliance on an educational assistant to supplement PALS both during and after teacher-delivered sessions to support the learning of a class whose students, as a whole, were struggling readers. Through progress monitoring, the teacher identified one specific child who needed intensive decoding skill practice in order to make progress within the PALS intervention. An adaptive approach was then used by having the educational assistant provide systematic decoding instruction until the child sufficiently mastered these prerequisite skills to rejoin the class-wide PALS activities. For the GBG example, teachers started with an adaptive approach to address the attention-seeking problem behavior of their most disruptive children so that, as in the previous GBG examples, the children played on a team by themselves until successful adherence to the expectations was demonstrated. However, this particular group of teachers also wanted to use an adapted model for implementation of GBG to better fit their own teaching style and paradigm (praise rule following rather than reprimand rule violations). In both examples, the goal was to make adaptations based on data. We assert that, with both the adapted and adaptive models, the level at which data are gathered (individual student and/or large group) should be used to systematically consider the level at which adaptations are made and monitored for impact.
CONCLUSIONS AND FUTURE DIRECTIONS

The dynamic interaction between research and practice has never been more evident than in discussions about how to enhance flexibility and adaptability of research-based interventions. In this chapter, we have argued that fidelity is critical to the successful adoption and use of research-based practices, but that fidelity is not attained by simply adhering to the procedural components of an intervention. If fidelity is defined as “implemented as intended,” then multiple dimensions of fidelity and the influence of multiple factors on fidelity must be considered. Perhaps the most important dimension of fidelity to be included in recent conceptualizations is recipient responsiveness to the intervention. This dimension complicates the task of explicitly defining core features of an intervention relative to the contexts in which the intervention is intended to be implemented. We assert that, though complicated, the process of explaining intervention effects relative to the instructional context and recipient characteristics will create a data-based approach to designing flexible implementation of research-based practices. We have proposed a conceptual model to guide these efforts and provided examples of adapTED and adapTIVE implementation of academic and behavior interventions.

Much work is needed to better understand whether and how research-based interventions can be implemented and adapted with fidelity to maximize effects for both the group and the individual recipient. First, researchers and practitioners must find ways to collaborate early in the development process to ensure that interventions address a practical need and that they do so in a manner that is feasible to implement across a range of instructional conditions and with a range of different participants. Close collaboration should increase the likelihood that specific needs for possible adaptation are identified early and incorporated, to the extent possible, as part of the intervention design (e.g., Webster-Stratton et al., 2011). Second, researchers will need to clearly specify the critical core components of the intervention. Whereas many researchers have little difficulty identifying the theoretical basis for core intervention components, determining the empirical basis requires painstaking component analysis, which is not always rewarded in the push for evidence-based, multicomponent intervention packages (Pressley, Graham, & Harris, 2006). Further, identifying specific thresholds of fidelity needed to obtain intervention effects also requires clearly operationalized definitions of fidelity and reliable and valid measures of components (Durlak & DuPre, 2008).
In the literature that we have described, there are clear demonstrations of the positive outcomes that may be achieved when flexibility is intentionally built in to interventions such that meaningful adaptations can be made without compromising fidelity. Maintaining efficacy, however, continues to be contingent on researchers’ and experts’ specification of how the theoretical and empirical underpinnings of the intervention can be enmeshed within the adaptive process. Our hope is that future efforts to enhance portability and broad dissemination of research-based practices will include decision guides that represent rigorous research that both seizes on and empirically represents the dynamic interaction between research and practice.
REFERENCES

Algozzine, B., Wang, C., White, R., Cooke, N., Marr, M. B., Algozzine, K., … Zamora Duran, G. (2012). Effects of multi-tier academic and behavior instruction on difficult-to-teach students. Exceptional Children, 79, 45–64.
August, G. J., Gewirtz, A. H., & Realmuto, G. M. (2010). Moving the field of prevention from science to service: Integrating evidence-based preventive interventions into community practice through adapted and adaptive models. Applied and Preventive Psychology, 14, 72–85.
August, G. J., Realmuto, G. M., Hektner, J. M., & Bloomquist, M. L. (2001). An integrated components preventive intervention for aggressive elementary school children: The Early Risers program. Journal of Consulting and Clinical Psychology, 69, 614–626.
Baker-Henningham, H. (2011). Transporting evidence-based intervention across cultures: Using focus groups with teachers and parents of preschool children to inform the implementation of the Incredible Years Training Programme in Jamaica. Child: Care, Health, and Development, 37, 649–661.
Barrish, H., Saunders, M., & Wolf, M. (1968). Good behavior game: Effects of individual contingencies for group consequences on disruptive behavior in a regular classroom. Journal of Applied Behavior Analysis, 2, 119–124.
Battistich, V. (2003). Effects of a school-based program to enhance prosocial development on children’s peer relations and social adjustment. Journal of Research in Character Education, 1(1), 1–16.
Bloomquist, M. L., August, G. J., Lee, S. S., Piehler, T. F., & Jensen, M. (2012). Parent participation within community center or in-home outreach delivery models of the early risers conduct problems prevention program. Journal of Child and Family Studies, 21, 368–383.
Boden, L. J., Ennis, R. P., & Jolivette, K. (2012). Implementing check in/check out for students with intellectual disability in self-contained classrooms. Teaching Exceptional Children, 45, 32–39.
Boettcher Minjarez, M., Williams, S. E., Mercier, E. M., & Hardan, A. Y. (2011). Pivotal response group treatment program for parents of children with autism. Journal of Autism and Developmental Disorders, 41, 92–101.
Bosworth, K., Gingiss, P., Potthoff, S. J., & Roberts-Gray, C. A. (1999). A Bayesian model to predict the success of the implementation of health and education innovations in school centered programs. Evaluation and Program Planning, 22, 1–11.
Caldwell, L. L., Bradley, S., & Coffman, D. (2009). A person-centered approach to individualizing a school-based universal preventive intervention. The American Journal of Drug and Alcohol Abuse, 35, 214–219.
Castro, F. G., Barrera, M., & Martinez, C. (2004). The cultural adaptation of prevention interventions: Resolving tensions between fidelity and fit. Prevention Science, 5, 41–45.
Dane, A. V., & Schneider, B. H. (1998). Program integrity in primary and early secondary prevention: Are implementation effects out of control? Clinical Psychology Review, 18, 23–45.
Denton, C. A., Fletcher, J. M., Anthony, J. L., & Francis, D. J. (2006). An evaluation of intensive interventions for students with persistent reading difficulties. Journal of Learning Disabilities, 39, 447–466.
Dietsch, B., Bayha, J. L., & Zheng, H. (2005, April). Short-term effects of a character education program among fourth grade students. Paper presented at the American Educational Research Association, Montreal, Canada.
Durlak, J. A., & DuPre, E. P. (2008). Implementation matters: A review of research on the influence of implementation on program outcomes and the factors affecting implementation. American Journal of Community Psychology, 41, 327–350.
Dusenbury, L., Brannigan, R., Falco, M., & Hansen, W. B. (2003). A review of research on fidelity of implementation: Implications for drug abuse prevention in school settings. Health Education Research, 18, 237–256.
Fagan, A. A., Hanson, K., Hawkins, J. D., & Arthur, M. W. (2008). Bridging science to practice: Achieving prevention program implementation fidelity in the community youth development study. American Journal of Community Psychology, 41, 235–249.
Felner, R. D., Favazza, A., Shim, M., Brand, S., Gu, K., & Noonan, N. (2001). Whole school improvement and restructuring as prevention and promotion: Lessons from STEP and the project on high performance learning communities. Journal of School Psychology, 39, 177–202.
Ferrer-Wreder, L., Adamson, L., Kumpfer, K. L., & Eichas, K. (2012). Advancing intervention science through effectiveness research perspective. Child and Youth Care Forum, 41, 109–117.
Fuchs, D., Fuchs, L., Mathes, P. G., & Simmons, D. C. (1997). Peer-assisted learning strategies: Making classrooms more responsive to diversity. American Educational Research Journal, 34(1), 174–206.
Fuchs, D., Fuchs, L. S., Thompson, A., Al Otaiba, S., Yen, L., Yang, N. J., … O’Connor, R. E. (2001). Is reading important in reading-readiness programs? A randomized field trial with teachers as program implementers. Journal of Educational Psychology, 93, 251–267.
Fuchs, D., McMaster, K., Saenz, L., Kearns, D., Fuchs, L., Yen, L., Compton, C., Lemons, C., Zhang, W., & Schatschneider, C. (2010). Bringing educational innovation to scale: Top-down, bottom-up, or a third way? Presented at the IES Conference, Washington, DC.
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–165.
Good, R. H., Gruba, J., & Kaminski, R. A. (2002). Best practices in using Dynamic Indicators of Basic Early Literacy Skills (DIBELS) in an outcomes-driven model. Best Practices in School Psychology IV, 1, 699–720.
Hancock, T. B., & Kaiser, A. P. (2006). Enhancing milieu teaching. In R. McCauley & M. Fey (Eds.), Treatment of language disorders in children (pp. 203–233). Baltimore, MD: Paul Brookes.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 149–164.
Institute for Education Sciences. (2011). Request for grant applications: Special education research grants. Retrieved from http://ies.ed.gov/funding/
Kaiser, A. P., & Grim, J. C. (2005). Teaching functional communication skills. In M. Snell & F. Brown (Eds.), Instruction of students with severe disabilities (pp. 447–488). Upper Saddle River, NJ: Pearson.
Koegel, L. K., Koegel, R. L., Harrower, J. K., & Carter, C. M. (1999). Pivotal response intervention 1: Overview of approach. The Journal of the Association for Persons with Severe Handicaps, 24, 175–185.
Kretlow, A. G., & Bartholomew, C. C. (2010). Using coaching to improve the fidelity of evidence-based practices: A review of studies. Teacher Education and Special Education, 33, 279–299.
Lau, A. S. (2006). Making the case for selective and directed cultural adaptations of evidence-based treatments: Examples from parent training. Clinical Psychology: Science and Practice, 13, 295–310.
Loman, S. L., Rodriguez, B. J., & Horner, R. H. (2010). Sustainability of a targeted intervention package: First step to success in Oregon. Journal of Emotional and Behavioral Disorders, 18, 178–191.
O’Donnell, C. L. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K-12 curriculum intervention research. Review of Educational Research, 78(1), 33–84.
Power, T. J., Blom-Hoffman, J., Clarke, A. T., Riley-Tillman, T. C., Kelleher, C., & Manz, P. H. (2005). Reconceptualizing intervention integrity: A partnership-based framework for linking research with practice. Psychology in the Schools, 42, 495–507.
Pressley, M., Graham, S., & Harris, K. (2006). The state of educational intervention research as viewed through the lens of literacy intervention. British Journal of Educational Psychology, 76, 1–19.
Sanetti, L. M., & Kratochwill, T. R. (2009). Toward developing a science of treatment integrity: Introduction to the special series. School Psychology Review, 38, 445–459.
Schulte, A. C., Easton, J. E., & Parker, J. (2009). Advances in treatment integrity research: Multidisciplinary perspectives on the conceptualization, measurement, and enhancement of treatment integrity. School Psychology Review, 38, 460–475.
Siemonsoma, P. C., Schroder, C. D., Roorda, L. D., & Lettinga, A. T. (2010). Benefits of treatment theory in the design of explanatory trials: Cognitive treatment of illness perception in chronic low back pain rehabilitation as an illustrative example. Journal of Rehabilitation Medicine, 42, 111–116.
Songer, N. B., & Gotwals, A. W. (2005, April). Fidelity of implementation in three sequential curricular units. In S. Lynch (Chair) & C. L. O’Donnell, “Fidelity of implementation” in implementation and scale-up research designs: Applications from four studies of innovative science curriculum materials and diverse populations. Symposium conducted at the annual meeting of the American Educational Research Association, Montreal, Canada.
Stage, S. A., Cheney, D., Lynass, L., Mielenz, C., & Flower, A. (2012). Three validity studies of the daily progress report in relationship to the check, connect, and expect intervention. Journal of Positive Behavior Interventions, 14, 181–191.
Stein, M. L., Berends, M., Fuchs, D., McMaster, K., Saenz, L., Yen, L., & Compton, D. L. (2008). Scaling up an early reading program: Relationships among teacher support, fidelity of implementation, and student performance across different sites and years. Educational Evaluation and Policy Analysis, 30, 368–388.
Tanol, G., Johnson, L., McComas, J. J., & Cote, E. M. (2010). Responding to rule violations or rule following: A comparison of two versions of the good behavior game. Journal of School Psychology, 48, 337–355.
U.S. Department of Education. (2002). Elementary and Secondary Education Act, Public Law No. 107-110, 115 Stat. 1425, 2002 U.S.C.
U.S. Department of Education. (2004). Individuals with Disabilities Education Improvement Act of 2004 (IDEA), P.L. No. 108-446, 20 U.S.C. §§ 1400.
Wanzek, J., & Vaughn, S. (2009). Students demonstrating persistent low response to reading intervention: Three case studies. Learning Disabilities Research & Practice, 24, 151–163.
Warren, S., Fey, M., & Yoder, P. (2007). Differential treatment intensity research: A missing link to creating optimally effective communication interventions. Mental Retardation and Developmental Disabilities Research Reviews, 13, 70–77.
Webster-Stratton, C. (1994). The Incredible Years Teacher Training Series. Seattle, WA: Author.
Webster-Stratton, C., Reinke, W. M., Herman, K. C., & Newcomer, L. L. (2011). The incredible years teacher classroom management training: The methods and principles that support fidelity of training delivery. School Psychology Review, 40, 509–529.
Zentall, S. S., & Beike, S. M. (2012). Achievement of social goals of younger and older elementary students: Response to academic and social failure. Learning Disability Quarterly, 35, 39–53.
CHAPTER 5

SYNTHESIZING SINGLE-CASE RESEARCH TO IDENTIFY EVIDENCE-BASED TREATMENTS

Kimberly J. Vannest and Heather S. Davis

ABSTRACT

This chapter covers the conceptual framework and presents practical guidelines for using single-case research (SCR) methods to determine evidence-based treatments. We posit that SCR designs contribute compelling evidence to the knowledge base that is distinct from group design research. When effect sizes are calculated, SCR can serve as a reliable indicator of how much behavior change occurs with an intervention in applied settings. Strong SCR designs can determine functional relationships, and effect sizes with confidence intervals can represent the size and the certainty of the results in a standardized manner. Thus, SCR is unique in retaining data about the individual and individual effects, while also providing data that can be aggregated to identify evidence-based treatments and examine moderator variables.
Evidence-Based Practices. Advances in Learning and Behavioral Disabilities, Volume 26, 93–119. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026007

The concept of evidence-based practice (EBP) was established in medicine during the early 1980s to promote maximum quality of care in a field beset with disparate practices and highly variable outcomes across age span,
ethnicity, and gender. The EBP framework consisted of three critical components – assessment, intervention, and decision making – all used to increase awareness of the principles, skills, and resources involved in providing sound medical services (Sackett, Richardson, Rosenburg, & Haynes, 2000). Clinicians implemented these practices by first assessing each patient while considering individual values and expectations, then selecting treatments supported by sound scientific evidence (i.e., evidence-based treatments, EBTs), and finally evaluating the treatment evidence and finding a “best fit” between intervention and client (Sackett, Rosenburg, Gray, Haynes, & Richardson, 1996). Attention to each component of EBP (assessment, EBT, clinical decision making) is associated with improved patient outcomes.

This framework transfers readily to education and is a solid match to our field’s similar problems of disparate practices and variable outcomes across populations. Assessing student needs, selecting treatments based on empirical support, and using clinical judgment to identify optimum fit in education seem logical and make a laudable goal. Achieving the goal of EBP in education requires an artful blending of these three components and, if accomplished, is expected to produce improved educational outcomes.

EBTs are one component of EBP. The terms are not synonymous, and the difference is more than semantic. Whereas EBP is an ongoing practice, an EBT is a technique or intervention with specific procedures. EBTs were first determined in medicine using a problem-based learning approach to evaluate the quality of randomized controlled trials (RCTs) (Guyatt et al., 2000). But this method proved challenging in special education.
In 2012 the Council for Exceptional Children reported:

“While the law requires teachers to use evidence-based practices in their classrooms, the special education field has not yet determined criteria for evidence-based practice, nor whether special education has a solid foundation of evidence-based practices.”
Other professional organizations responded more definitively. The American Psychological Association (APA) issued an approved policy statement on EBP in 2005 that identifies a model parallel to that used in medicine. The National Association of School Psychologists (1998) formed a task force to tackle this issue and identified a similar model. In addition, the Division for Research within CEC formed working groups that articulated standards for identifying EBTs in special education based on group experimental research (Gersten et al., 2005) and SCR (Horner et al., 2005).
Originally, medical researchers recognized only RCTs as viable for establishing EBTs in their field. But it became apparent that RCT studies (which compare outcomes between groups) may present data with limited usefulness for clinical decision making. Specifically, clinicians reported that it was difficult to determine how the EBTs interact with individual differences (Sackett et al., 2000). Education faces a similar challenge. The research-to-practice gap exists partly because it is difficult to determine how the results of published research will translate to field application at the school or classroom level and with individual learners (Boardman, Argüelles, Vaughn, Hughes, & Klingner, 2005; Cook, Landrum, Tankersley, & Kauffman, 2003; Greenwood & Abbott, 2001). The challenge in making clinical decisions based on RCT results demonstrates the need for methodologies that can provide details about treatment effects at the individual level while also providing strong conclusion validity in the design.
THE ROLE OF SINGLE-CASE RESEARCH IN EVIDENCE-BASED PRACTICE

SCR may be a methodology that can do this. SCR provides strong internal or conclusion validity while maintaining the details about the behaviors of individuals, treatment and environmental variations, and the relationship between individual behavior change and practices. SCR is the study of an individual. Widely used in applied behavior analysis (ABA) where individual problems and individual solutions are the variables of interest, SCR in ABA dates back more than five decades. In SCR, careful research designs control for threats to validity (Kazdin, 2010; Kennedy, 2005). Reversal designs (ABAB where A is baseline and B is treatment), for example, demonstrate causality by exposing a client to conditions under no treatment and treatment phases. Immediate, large, stable, and consistent changes in behavior demonstrate a functional relationship between the treatment and the behavior change.

Much of the research related to treatments of clients with exceptional needs has occurred and continues to occur using SCR designs. The popularity of SCR designs in special education may be due to the uniqueness of learner and setting characteristics in special education, which makes conducting group research (e.g., RCTs) difficult if not impossible at times. Moreover, SCR is well suited to investigate the effectiveness of practices in applied settings such as those found in special education.
The What Works Clearinghouse (WWC) was developed by the Institute of Education Sciences to serve as a resource for finding EBTs. The initial WWC standards (versions 1.0 and 2.0) for EBTs, published in 2008, did not consider SCR, effectively eliminating decades of meaningful research studies from consideration. Recognizing the contributions of SCR in determining EBTs, Revision 2.01 to the WWC framework included pilot standards for single-case designs (Kratochwill et al., 2010; WWC, 2011). These pilot standards articulate inclusion and exclusion criteria for using SCR studies to determine EBTs based predominantly on issues of design and internal validity. One issue that is mentioned in these standards but not described in any great detail is the statistical analysis of SCR using both parametric and nonparametric methods.

Statistical analysis of SCR has seen recent, rapid development. Perhaps the biggest development is in the use of nonparametric methods to evaluate effect sizes (ESs). The application of ESs to SCR has in turn contributed to a plethora of meta-analyses of SCR studies. Calculating ESs produces a standardized metric for comparing, contrasting, and combining studies. Researchers have conducted recent meta-analyses using SCR on interventions such as daily behavior report cards (Vannest, Davis, Davis, Mason, & Burke, 2010), class-wide peer tutoring (Bowman-Perrott, Davis, Vannest, Williams, Greenwood, & Parker, 2013), augmented communication (Ganz, Davis, Lund, Goodwyn, & Simpson, 2012; Ganz et al., 2012), functional assessment (Gage, Lewis, & Stichter, 2012), and video modeling (Mason et al., 2013). Some of these studies have also evaluated moderators of effects, investigating whether variables such as disability type, age, and length of treatment are associated with treatment effects. Developments in SCR analysis are thus contributing to the determination of EBTs.

Three spheres comprise EBP (see Fig. 1), and this chapter is dedicated to just one of them – EBTs. More specifically, we focus on how to use SCR in creating and expanding the body of knowledge about EBTs. The determination of EBTs has led to extraordinary advances in a variety of fields (Slavin, 2002). Including SCR in the determination of EBTs adds to our body of knowledge and therefore is an important determinant in EBP. Although we assume that the reader has a certain degree of knowledge about SCR design, there are two additional and important prerequisites to the discussion about determining EBTs using SCR: (a) an understanding of the unique nature of SCR and (b) an understanding of ES indices. We follow our discussion of these issues with a brief overview of methods for ES calculation, including some information about confidence intervals and statistical significance related to ESs. We conclude the chapter with
suggestions for aggregating SCR studies and important considerations in reporting results of SCR aggregation studies.

Fig. 1. Three Components of Evidence-Based Practice.
SINGLE-CASE RESEARCH, AGGREGATION, AND EFFECT SIZES

SCR is a mainstay methodology in special education, with origins rooted in a philosophy that individual learners with specific and exceptional needs deserve and require data about individual outcomes; in other words, outcome data that are directly connected to the idiosyncratic characteristics of people and settings. The “problems” under investigation are unique and not always well suited to group design studies. Conversely, the conclusions from group or large-n studies may not apply to all participants. Because of this, much of the empirical evidence supporting particular practices in special education is documented through studies focusing on the individual (i.e., SCR).

Unfortunately, this evidence is not always accepted as it lacks standardization and reliability in the analysis of the results. APA, for example, required the reporting of ESs and confidence intervals (CIs) in 2001 (Thompson, 2007). Although APA did not specify to which research designs their requirement applies, it stands to reason that when ES and CI can be calculated they should be, regardless of design. ESs and CIs
have not been used historically in SCR, resulting in a lack of credibility in some circles and making it difficult to synthesize findings across studies in the absence of a common metric for treatment effects. In addition, reviews of SCR in special education have reported considerable variability in the quality of SCR studies in general (Chard, Kelberlin, Geller, Baker, & Doabler, 2009; Cook, Tankersley, & Landrum, 2009). So although meta-analyses of SCR intervention work are increasing in number and now calculate ESs, many studies are excluded from the analyses for failing to meet quality standards (Bowman-Perrott et al., 2013; Ganz, Earles-Vollrath, et al., 2012). The articulation of SCR quality indicators (Horner et al., 2005), WWC standards (Kratochwill et al., 2010), and nonparametric analysis methodology (Parker, Vannest, & Davis, 2011; Vannest, Davis, & Parker, 2013) provides considerable guidance to the field to improve the design, analysis, and reporting of individual studies in SCR. When SCR studies adhere to standards such as the APA-required reporting of ESs and CIs (APA, 2001; Thompson, 2007), acceptance of the work as a rigorous methodology with standardized results may improve, and more studies will be available for determining EBTs while preserving data with unique characteristics.

Standardization of results through ES reporting has an added benefit beyond “street cred” with statisticians. Standardized results allow for aggregation. Aggregation of studies can produce an omnibus ES, essentially quantifying the effects demonstrated in replications and providing weight to the evidence for effects. Before examining specifics as to how SCR study results can be aggregated to help identify EBTs, we consider what an ES is and is not, what it can and cannot do, and why ES and CI reporting does not pose a threat to the fundamentals of SCR.
What an Effect Size Is and Is Not

ESs are statistics about the relationship between one variable and another. There are both standardized and unstandardized ESs. Standardized ES examples commonly seen in group research include Pearson’s r, Cohen’s d (Cohen, 1988), and odds ratios. Unstandardized ESs can indicate the “real” difference in the terms of the variable (e.g., absences decreased by 2). An ES communicates the size of the result, but it does not say if the result is important; without context it does not indicate whether the result is small, medium, or large. Using a standardized ES has the advantage of allowing for equitable comparisons and aggregation of studies that use different
outcome measures. There are two “families” of ESs in group research: those based on the variance accounted for and those based on the difference between means. Cohen’s d and Hedges’ g are mean-difference ESs (as are odds ratio, risk ratio, and phi). Pearson’s r is a variance-accounted-for ES. There are more than a dozen ES indices for group research and nearly a dozen nonparametric ESs for SCR. While it is important to understand what an ES is, it is equally important to understand what it is not, so as to avoid potential misunderstandings in the analysis and interpretation of ESs, particularly in SCR.

A large ES is NOT necessarily an indicator of a successful intervention. An ES is a measure of change or a standardized metric for the magnitude of change. It does not indicate a successful intervention any more than a p-value or statistical significance represents the importance of a finding.

An ES does NOT inform us about consistency of findings within a study or across studies. Consistent effects are evaluated by examining individual studies and individual phase changes.

An ES is NOT meaningfully calculated from just any contrast of phases. Calculating an ES should not ignore or violate the design logic of the study. For example, in an ABAB reversal design, the ideal comparisons are (a) A1 and B1 and (b) A2 and B2. In a multiple baseline design (MBD), each AB phase pair should be compared.

An ES does NOT ignore data patterns. The patterns in the data (e.g., positive baseline trend, intercept gap, generalization, and maintenance vs. reversals) are important sources of information to consider prior to an ES calculation and should inform the analysis selected and how phases are treated.

An ES does NOT stand alone. An ES can be calculated by any number of methods, including hierarchical linear modeling (HLM), regression, and randomization or nonparametric techniques. Each has its strengths and weaknesses. Furthermore, a number of decisions must be made for each of these techniques, all of which should be fully disclosed. Including CIs and p-values provides confidence in the findings by articulating certainty and chance.
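The distinction between unstandardized and standardized ESs, and the point that phase contrasts should follow the design logic, can be illustrated with a minimal sketch. The data below are hypothetical, the function names are illustrative assumptions, and Cohen's d is used only because it is the familiar mean-difference index named above; parametric indices such as d carry assumptions that single-case data may not meet, so this is not offered as a recommended SCR analysis.

```python
from math import sqrt
from statistics import mean, stdev


def mean_difference(phase_a, phase_b):
    """Unstandardized ES: change in phase means, in the units of the measure."""
    return mean(phase_b) - mean(phase_a)


def cohens_d(phase_a, phase_b):
    """Standardized mean-difference ES using the pooled standard deviation."""
    n_a, n_b = len(phase_a), len(phase_b)
    pooled_var = (
        (n_a - 1) * stdev(phase_a) ** 2 + (n_b - 1) * stdev(phase_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(phase_b) - mean(phase_a)) / sqrt(pooled_var)


# Hypothetical ABAB reversal data (illustration only, not from any study).
phases = {
    "A1": [2, 3, 2, 3],
    "B1": [6, 7, 6, 7],
    "A2": [3, 2, 3, 2],
    "B2": [7, 6, 7, 6],
}

# Contrast only adjacent baseline-treatment pairs, following the design logic.
contrasts = [("A1", "B1"), ("A2", "B2")]

results = {
    (a, b): (
        mean_difference(phases[a], phases[b]),
        cohens_d(phases[a], phases[b]),
    )
    for a, b in contrasts
}

for (a, b), (raw, d) in results.items():
    print(f"{a} vs {b}: raw difference = {raw:.2f}, d = {d:.2f}")
```

The raw difference stays in the units of the outcome measure, whereas d divides that difference by the pooled variability, which is what permits comparison across studies that use different measures. Neither value, by itself, says whether the change was consistent across phases or practically important.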
Seven Methods for Effect Size Calculations in Single-Case Research

This section will briefly review several nonparametric ES indices and provide brief examples of their application. Within SCR, many nonoverlap techniques have been proposed. There is the original Extended Celeration Line (ECL; White & Haring, 1980) and the classic percentage of nonoverlapping data (PND; Scruggs, Mastropieri, & Casto, 1987). Recently published is a series of techniques attempting to address limits in either ECL or PND: percent of data exceeding the median (PEM; Ma, 2006), percentage of all nonoverlapping data (PAND; Parker, Hagan-Burke, & Vannest, 2007), nonoverlap of all pairs (NAP; Parker & Vannest, 2009), the Improvement Rate Difference (IRD; Parker, Vannest, & Brown, 2009), the Percent of Data Exceeding the Phase A Median Trend (PEM-T; Wolery, Busick, Reichow, & Barton, 2010), and TauU (Parker, Vannest, Davis, & Sauber, 2011). Existing papers and chapters review, compare, and teach these techniques in depth (Parker, Vannest, & Davis, 2011; Vannest et al., 2013). Therefore, this chapter will only briefly present each, compare strengths and weaknesses in an abbreviated manner, and demonstrate differences using a fictitious data set (see Figs. 2–6).

ECL (White & Haring, 1980) is also known as the “split middle” line and was first seen in print in 1972 (Pennypacker, Koenig, & Lindsley, 1972) and then again in 2010 as PEM-T (Wolery et al., 2010). ECL is the proportion of Phase B data that exceed a median-based trend line of Phase A extended into Phase B. In the Fig. 2 example, all the data in Phase B are above the trend line, so ECL = 10/10 = 1. One (or 100%) is the percentage of nonoverlapping data using ECL or PEM-T. When interpreting an ECL score, 50% is chance level, so interpretation is easiest if the score is rescaled by multiplying by 2 and subtracting 1. In our example, (1 × 2) − 1 = 1, or 100%. If the result had been 75% nonoverlap, the transformed score would be (0.75 × 2) − 1 = 0.50, or a 50% improvement over that expected by chance. Among the strengths of ECL are that it has a long history of use, adjusts for trend in baseline, and is easily
Fig. 2. Example for Extended Celeration Line.
Synthesizing Single Case
Fig. 3. Example for Percentage of Nonoverlapping Data.
Fig. 4. Example for Percent of Data Exceeding the Median.
Fig. 5. Example for Percentage of All Nonoverlapping Data.
Fig. 6. Example for Nonoverlap of All Pairs.
calculated by hand. However, it assumes linearity, lacks precision and power, and can produce a trend line that is nonsensical or unreliable (see Fig. 2).
PND (Scruggs et al., 1987) is calculated by identifying the highest data point in phase A (see arrow in Fig. 3), counting the data points in phase B that exceed it (circled data points), and dividing this count by the number of phase B data points. In Fig. 3 the third data point is the highest in phase A and 8 of the 10 data points in phase B exceed it, so the PND is 8/10 or 80%. Among the strengths of PND are that it has been field tested in dozens of studies and that it is easy to calculate by hand. However, PND can be unduly influenced by a single outlying data point; is vulnerable to floor and ceiling effects; and lacks a known sampling distribution, which prevents any inference testing of the calculated outcomes (i.e., no p values or CIs; Kratochwill et al., 2010; Parker et al., 2011).
PEM (Ma, 2006) is meant to improve on PND by using a median value rather than an extreme score when calculating the overlap. In PEM a median line is drawn for Phase A data and the number of Phase B data points exceeding the median line is used in the ratio. Using the same example data (see Fig. 4), there are 8 data points in phase B above the median line (dotted line extending from A to B), resulting in 8/10 or 80% nonoverlap. PEM, like ECL, has the 50% chance interpretation, and scores require recalculating with the same formula used for ECL. In our example (0.80 × 2) − 1 = 0.60, so 0.60 or 60% is our adjusted PEM score. Although PEM is not as affected by outlying data in the baseline, it has low power, lacks sensitivity, and is vulnerable to ceiling effects.
PAND (Parker et al., 2007) uses more data in the analysis than PND. PAND is the percentage of all nonoverlapping data. In this method the fewest
number of data points are removed from both phases to eliminate all overlap between the phases. One method to determine the fewest number to remove is to create an overlap zone in which the highest data point in phase A and the lowest in phase B are identified (see dotted lines). Then a visual trial and error process ensues. In Fig. 5, removing the one highest data point from phase A and the two lowest data points from phase B results in no overlap among remaining data points. PAND is then calculated as the number of remaining data points divided by the total, resulting in a ratio of nonoverlapping data to the total number of data points in both phases. In Fig. 5, the calculation of PAND is (16 − 3)/16 = 13/16 or 0.81. PAND is easily calculated and directly accessible for interpretation.
IRD was built to improve on PAND by creating an index with a sampling distribution that is easily interpretable. IRD is a risk-reduction ratio, renamed for use in education as the improvement rate difference. IRD starts like PAND by identifying the fewest data points to be removed to eliminate all overlap. The next step creates a phase A improvement rate and a phase B improvement rate. The final step is to subtract the ratios. In Fig. 5, for example, 1 of the 6 data points in baseline phase A is removed to eliminate overlap. This data point is considered “improved” (1/6 ≈ 0.17). Two of the 10 data points are removed from phase B, leaving 8 remaining. These 8 of the 10 data points in phase B that remain are thus considered improved (8/10 = 0.80). Subtracting improvement rate A from improvement rate B, IRD = 0.80 − 0.17 = 0.63. One of the strengths of IRD is that it is an established approach in medical literature. However, it is insensitive to trend.
NAP is a dominance statistic and is literally the percent of data that improve from phase A to phase B. NAP is a pair-wise comparison calculated by comparing each phase A data point with every phase B data point.
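The hand calculations for PND, PEM, PAND, and IRD can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the authors or by any published calculator. The function names are ours, the two data series are hypothetical values chosen so that the PND and PEM counts match the worked examples (they are not the data behind Figs. 3–5), and improvement is assumed to mean an increase. The PAND and IRD functions take the removal counts reported for Fig. 5 as inputs rather than searching for them.

```python
from statistics import median


def rescale_from_chance(proportion):
    """Rescale a score whose chance level is 0.50: multiply by 2, subtract 1."""
    return 2 * proportion - 1


def pnd(phase_a, phase_b):
    """Share of phase B points above the single highest phase A point."""
    ceiling = max(phase_a)
    return sum(b > ceiling for b in phase_b) / len(phase_b)


def pem(phase_a, phase_b):
    """Share of phase B points above the phase A median."""
    middle = median(phase_a)
    return sum(b > middle for b in phase_b) / len(phase_b)


def pand(n_a, n_b, removed_a, removed_b):
    """Remaining points divided by all points, given the fewest removals."""
    total = n_a + n_b
    return (total - removed_a - removed_b) / total


def ird(n_a, n_b, removed_a, removed_b):
    """Phase B improvement rate minus phase A improvement rate."""
    rate_a = removed_a / n_a          # removed baseline points count as improved
    rate_b = (n_b - removed_b) / n_b  # remaining treatment points count as improved
    return rate_b - rate_a


# Hypothetical series (assumption): 6 baseline and 10 treatment observations.
phase_a = [4, 4, 5, 4, 3, 4]
phase_b = [4, 6, 7, 8, 6, 7, 9, 4, 8, 9]

print("PND:", pnd(phase_a, phase_b))
print("PEM:", pem(phase_a, phase_b))
print("PEM rescaled:", round(rescale_from_chance(pem(phase_a, phase_b)), 2))
# Removal counts reported in the text for Fig. 5: 1 from phase A, 2 from phase B.
print("PAND:", pand(6, 10, 1, 2))
print("IRD:", round(ird(6, 10, 1, 2), 2))
```

Carried at full precision, the IRD for the Fig. 5 counts is 8/10 − 1/6, or about 0.63.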
Each comparison has one of three possible outcomes: growth (scored 1), no growth (i.e., negative change; scored 0), or tie (scored 0.5). NAP can be calculated by hand but can be tedious for long data series. First calculate the number of possible pairs by multiplying the number of data points in phase A by the number in phase B. In Fig. 6, 6 phase A data points × 10 phase B data points = 60 possible pairs. Data point #1 is compared to each data point in phase B (see arrows in Fig. 6), resulting in scores of 0.5, 1, 1, 1, 1, 1, 1, 0.5, 1, 1, which sum to 9. Repeating this process for all 6 phase A data points results in scores of 9, 9, 8, 9, 10, and 9, which sum to 54. NAP, then, is calculated as 54/60 = 0.90. NAP is scaled 50–100, where 50 is chance level, so scores need transformation to make them easier to interpret; in our example (0.90 × 2) − 1 = 0.80 or 80% nonoverlap of all pairs. For larger data sets,
free calculators are available online (www.singlecaseresearch.org). Among the strengths of NAP are the precision and power of the approach, as well as the direct manner in which it is calculated and interpreted. The limitations of the approach include insensitivity to trend and the difficulty in calculating larger data sets by hand.
TauU is the most sophisticated of the techniques reviewed and has two forms, a simple nonoverlap and a version adjusted for trend. Simple TauU is the percent of nonoverlapping minus overlapping data. TauU (with trend) is the adjusted version of NAP, which means it is directly interpretable on a 0–100 scale and does not need a transformation. TauU considers the number of pairs, calculated as the product of the two phase lengths, and draws on Kendall’s rank correlation Tau and the Mann–Whitney U test for two groups, both based on an S distribution of the scores. TauU integrates nonoverlap and trend, which are operationally the same (Newson, 2001), and TauU controls for phase A monotonic trend (as opposed to linear trend) in a conservative manner. Monotonic trend is the tendency for data to go up over time but includes plateaus, drops, curves, as well as linear change. Avoiding the assumptions of linear trend means TauU should not produce nonsensical results where the extended trend line goes “out of range.” TauU is ideal for meta-analysis because each A versus B phase contrast has its own calculated TauU and standard error, which is recommended by experts (Cooper, Hedges, & Valentine, 2009; Hedges & Olkin, 1985; Lipsey & Wilson, 2001). Most statistical packages can run this analysis. To run the calculation on www.singlecaseresearch.org, enter raw data, select contrast for basic TauU, and click “correct baseline” to control for trend. TauU is sensitive and powerful. It has precision and also can handle trend. These strengths make it the most powerful of these techniques. Long baselines and short treatments may skew values, however, and care should be used to check findings in those instances.
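The pairwise logic behind NAP and TauU can be sketched as follows. This is a hedged illustration rather than the code behind the online calculator. The data are hypothetical values chosen so that the per-point NAP scores match the worked example (9, 9, 8, 9, 10, and 9), and improvement is assumed to mean an increase. The baseline-corrected form, which subtracts Kendall's S for the phase A series from the A-versus-B S before dividing by the number of pairs, is our reading of the trend adjustment and should be checked against Parker, Vannest, Davis, and Sauber (2011) before use. The sketch does not compute standard errors.

```python
def pair_counts(phase_a, phase_b):
    """Count improving, deteriorating, and tied A-versus-B pairs."""
    pos = neg = ties = 0
    for a in phase_a:
        for b in phase_b:
            if b > a:
                pos += 1
            elif b < a:
                neg += 1
            else:
                ties += 1
    return pos, neg, ties


def nap(phase_a, phase_b):
    """Nonoverlap of all pairs: growth scores 1, tie 0.5, no growth 0."""
    pos, _, ties = pair_counts(phase_a, phase_b)
    return (pos + 0.5 * ties) / (len(phase_a) * len(phase_b))


def kendall_s(series):
    """Kendall's S against time: later-higher pairs minus later-lower pairs."""
    s = 0
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            s += (series[j] > series[i]) - (series[j] < series[i])
    return s


def tau_u(phase_a, phase_b, correct_baseline=False):
    """Nonoverlapping minus overlapping pairs, over all pairs.

    With correct_baseline=True the phase A trend (its Kendall S) is
    subtracted first; this formulation is an assumption of the sketch.
    """
    pos, neg, _ = pair_counts(phase_a, phase_b)
    s = pos - neg
    if correct_baseline:
        s -= kendall_s(phase_a)
    return s / (len(phase_a) * len(phase_b))


# Hypothetical series (assumption), not the data plotted in Fig. 6.
phase_a = [4, 4, 5, 4, 3, 4]
phase_b = [4, 6, 7, 8, 6, 7, 9, 4, 8, 9]

print("NAP:", nap(phase_a, phase_b))
print("Simple TauU:", tau_u(phase_a, phase_b))
print("TauU, baseline corrected:", tau_u(phase_a, phase_b, correct_baseline=True))
```

For these hypothetical values NAP is 54/60 = 0.90 and simple TauU is 48/60 = 0.80, which equals the rescaled NAP, (0.90 × 2) − 1.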
Confidence Intervals and p-Values
Error exists in all measurement and assessment. For example, although a student might have scored 85% on a test, perhaps one of the responses was a lucky guess and the “true score” was actually lower. Similarly, there are many sources of error in SCR ESs. CIs represent the range of error around a finding and are critical in interpreting an ES. Determining the parameters of the CI should be based on decisions about the use of the data and need for certainty. ESs, CIs, and p-values work together to articulate
the certainty of a finding, or conversely the error. For example, if data indicate that a student responded well to an intervention in my classroom and the IEP team therefore recommends she remain on campus rather than receive services in a residential facility, to what extent should the team be confident in the data? Could chance have played a role? A CI can address these issues empirically by providing a range of results in which one can be confident to a specified degree. For example, if the ES is 0.50 and my 95% CI is between 0.30 and 0.70, then I am 95% certain that the true ES of my intervention is somewhere between 0.30 and 0.70. This may be too much error in my data for me to feel comfortable making a decision about residential service delivery. Conversely, if my ES is 0.50 and my 95% CI is 0.45–0.55, then I am 95% confident that my ES is no less than 0.45 and no higher than 0.55, so my 0.50 ES is a good bet to make a decision with.
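The two intervals in this example can be reproduced with a short sketch. The text does not say how the intervals were calculated, so the normal approximation (ES plus or minus z times the standard error) is an assumption, as are the two standard errors, which were chosen only so that the intervals come out near 0.30–0.70 and 0.45–0.55.

```python
from statistics import NormalDist


def confidence_interval(es, se, level=0.95):
    """Normal-approximation CI: es ± z * se (an assumption of this sketch)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return es - z * se, es + z * se


# Hypothetical standard errors (assumption) for the same ES of 0.50.
wide = confidence_interval(0.50, 0.102)
narrow = confidence_interval(0.50, 0.0255)

print("wide:  ", tuple(round(bound, 2) for bound in wide))
print("narrow:", tuple(round(bound, 2) for bound in narrow))
```

The same ES supports very different levels of certainty depending on its standard error, which is why the CI level and the procedure used to calculate it should be reported alongside the ES.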
Applied Settings and Sensitivity to Change
Prior to recent developments in nonparametric ESs, SCR analysis relied almost exclusively on visual judgment. Baer and colleagues (1968) established criteria for visual analysis, and these have remained comfortingly stable across time (Kazdin, 2010; Kratochwill & Levin, 2010), a likely testimony to their reasonableness and usefulness. Some basic tenets of visual analysis include examining data for large, immediate, and consistent change corresponding with the introduction and withdrawal of the treatment. Only large changes that are visually evident so as to be easily agreed upon are expected to be considered in reporting results. Immediate change refers to the intercept gap or how quickly data change between one condition and another, with change ideally occurring immediately after a change in condition. Finally, meaningful change should be consistent and stable across conditions or participants. For example, in an ABAB reversal design behavior should consistently respond to each introduction and withdrawal of the treatment; or in a multiple baseline design across four participants, all four participants would respond similarly to the introduction of the treatment. Data should be stable, indicating that all environmental conditions were under control. These standards are meant to address threats to the internal validity of the study and eliminate possible rival explanations for behavior change. But when applied contexts and socially valid behaviors are the targets of study in determining the efficacy of an intervention, change may be neither
large, obvious, immediate, nor stable, and thus may be difficult to detect or agree upon through visual analysis alone. These types of challenges to visual analysis can result in disparate or unreliable decisions, even by experts. Visual analysis is insufficient to reliably determine the size of change, particularly in imperfect data (Brossart, Parker, Olson, & Mahadevan, 2006; Horner et al., 2005; Ottenbacher, 1990; Parker et al., 2009; Scruggs & Mastropieri, 2001). And it is this “size of change” issue that is increasingly important in applied research where decisions are made about placement, eligibility for services, and other “high-stakes” issues including the identification of EBTs.
Quantifying the change with an ES provides more sensitivity in the analysis. A visual analysis of change typically indicates whether change occurred with a “yes or no” answer. A visual analysis might also classify the change as small, medium, or large (or a similar metric). A yes–no (two categories) or small–medium–large (three categories) classification of effects is much less sensitive than a calculation presenting a 50–100 ES (where 50 is chance) or a 0–100 ES as is most common. In practical settings where key stakeholders may disagree about outcomes, determining an indicator of change on a sensitive scale (e.g., 0–100%) may be more advantageous than debating whether or not meaningful change occurred. In determining EBTs, an ES provides more information than a visual analysis alone.
Realistic or applied settings (e.g., classrooms) are typically “noisy.” Such settings contain competing stimuli and likely produce undesirable bleed across conditions. Participants in research studies in these settings frequently produce ambiguous data patterns and demonstrate less than consistent responding.
The needs of the participants or the demands of the schedule of those involved (particularly true in classroom settings) can contribute to phase lengths and timing of phase changes that are less than desirable. All of these things contribute to smaller effects. Socially valid responses may include broader “response classes” or “behavioral constellations” that are sometimes measured less reliably and can be resistant to change. Practically significant change may require training across multiple settings and behaviors, which also may be associated with incremental change and a cumulative impact. These issues contribute to smaller effects; gradual, cumulative rates of change; or both. An ES with CIs is a sensitive index of the amount of change. An ES on a 0–100 scale, with a CI as its precision estimate, tells us whether improvement goes up by any amount, be it 0.1% or 100%, whereas a visual analysis cannot reliably do so. This is particularly meaningful for academic and behavioral gains that may be small or gradual and thus not easily detected by visual analysis.
An ES also allows us to accumulate partial evidence across environments, behaviors, individuals, and studies. It is an index that can help with change that may have practical, if not clinical or statistical, significance. Because ESs are standardized results, they allow for secondary analysis of moderator variables (e.g., participant characteristics, settings, treatment variation) across studies. However, an ES does not tell us whether a functional relationship exists. It does not speak to the quality or appropriateness of the design. An ES calculation done in isolation or decontextualized from the visual analysis, the data, or some determination of the quality of a study will be far less meaningful. Indeed, ES calculation should be something that happens in addition to, not instead of, visual analysis. An ES measures the amount of change and not the cause of change. An ES cannot tell us whether an intervention is effective. An ES is not a measure of conclusion validity and says nothing about the internal validity of the study. Large ESs may be obtained from weak or flawed designs. Visual analysis and ESs are different analyses with different purposes. Skills in both should be used to inform identification of EBTs.
REPORTING OF EFFECT SIZES
Calculating an ES in SCR requires knowledge and expertise regarding design logic. ESs should be calculated only when other requirements for quality studies have been met; otherwise they would be nonsensical. ESs should also be calculated only when the design allows the inference of causality (i.e., allows a functional relationship to be determined). The first decisions are about which phases to contrast. Within a design where A is baseline and B is treatment, A1 versus B1 phase comparisons are logical. Many meta-analyses compare only A1 to B1 regardless of how many conditions or reversals exist, believing this to be the most conservative approach in estimating treatment effects. An ABAB reversal design has more than A1 and B1 data to compare, however, and you may be interested in evaluating the A2–B2 contrast as well. In an ABCABC alternating treatment design, for example, the logic of the study should dictate which phases are compared. If A, B, and C represent three different treatments, the researcher might want to compare A versus B, B versus C, and A versus C. However, if A represents baseline/no treatment conditions and B and C are treatments that the researcher wishes to compare to baseline, it might be appropriate to compare only A versus B and A versus C. These first
decisions correspond with a first reporting requirement: Report which phases were contrasted and why. The patterns or properties of the data may inform this decision as well. Variability, trend, or environmental events that pose threats to internal validity are potential sources of information that should be considered. Researchers should report these where applicable. Excluding data that do not reverse when expected, for example, would inflate the ES of the study. Stable and consistent data may be easier (more appropriate) to combine than data that vary wildly between participants. Researchers should describe the properties of the data as they inform the phase contrast selections.
We presented seven nonparametric methods here, but others also exist. Randomization, HLM, linear regression, and the like are all available. A complete analysis of each is beyond the scope of this chapter. What is important, however, is for the analysis to make sense within the context of the design. Data should meet the assumptions of the analysis and trend should be addressed if present. Recently we see authors using multiple ESs and comparing findings across analyses, and sometimes comparing ESs to visual analysis as a form of confirmation. Regardless, the ES(s) selected should be identified clearly with a rationale about the match between data and analysis. When multiple phase contrasts are present, both individual ESs and an omnibus ES may be legitimate. These should be identified as well. The number of participants or behaviors, studies, contrasts, and any weighting should be reported. CIs and p-values should be described as well. Which CI level was selected (e.g., 80%, 90%, 95%) and why, and which procedure was used in calculating the interval, should also be reported. The elements of an ES summary prevent misunderstanding and allow transparency (Parker & Vannest, 2010).
The elements of an ES summary are data properties summarized; phases contrasted; statistical analysis used; separate or omnibus analysis conducted; if separate analyses, how analyses were combined; ES index used; ES precision (CI); and ES significance (optional p-value). The decisions for each of these issues should be reported so that readers are aware of the logic used and are able to reproduce, interpret, and aggregate results.
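One way to keep these reporting elements together is a small record type with one field per element. The sketch below is only an illustration: the class name, field names, consistency checks, and example values are all hypothetical and are not drawn from Parker and Vannest (2010) or from any reporting standard.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EffectSizeSummary:
    """One record per reported ES; field names are illustrative only."""
    data_properties: str                # e.g., variability, trend, events
    phases_contrasted: str              # which phases, and why
    statistical_analysis: str
    omnibus: bool                       # separate or omnibus analysis
    combination_method: Optional[str]   # how separate analyses were combined
    es_index: str
    es_value: float
    ci_level: float                     # ES precision, as a proportion
    ci_low: float
    ci_high: float
    p_value: Optional[float] = None     # ES significance (optional)

    def problems(self):
        """Return a list of basic inconsistencies in the record."""
        issues = []
        if self.omnibus and not self.combination_method:
            issues.append("omnibus ES reported without a combination method")
        if not self.ci_low <= self.es_value <= self.ci_high:
            issues.append("ES lies outside its own CI")
        if not 0 < self.ci_level < 1:
            issues.append("CI level should be a proportion such as 0.95")
        return issues


# Hypothetical values (assumption), for illustration only.
example = EffectSizeSummary(
    data_properties="stable baseline, no trend",
    phases_contrasted="A1 versus B1 (most conservative contrast)",
    statistical_analysis="nonparametric nonoverlap",
    omnibus=False,
    combination_method=None,
    es_index="NAP",
    es_value=0.80,
    ci_level=0.95,
    ci_low=0.65,
    ci_high=0.95,
)
print(example.problems())
```

Storing every contrast in the same structure makes it easier for readers to reproduce, interpret, and aggregate the results later.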
AN EXAMPLE OF SYNTHESIZING SINGLE-CASE RESEARCH
A primary reason to synthesize SCR is to identify EBTs. For example, Vannest, Harrison, Parker, Harvey, and Ramsey (2010) conducted a
comprehensive literature review and examined the effects of Daily Behavior Report Cards. Sixty-seven phase contrasts were compared to produce an overall ES (IRD) of 0.76. Fig. 7 depicts the ES for individual contrasts across the 20 articles reviewed. Larger squares represent larger data series. Horizontal lines represent the CIs around each ES. These data indicate that the ES for Daily Behavior Report Cards is moderately large overall and that there is great range in the results depending on the participant. Several additional points can be made using Fig. 7 as a visual for comparing effects between studies. The lines or whiskers of these stacked box plots show visually when data are statistically significantly different. Whisker lines that overlap each other are not statistically significantly different. Whisker lines that do not overlap are statistically significantly different at the p = .05 level. To do this, CIs are set at 83.4%, which represents a p-value of .05 or a certainty of 95% that the effects are statistically significantly different from one another (Payton, Greenstone, & Schenker, 2003). We can “see” here that a wide range of effects can occur with an intervention, ranging from total improvement to some regression. Although the overall data demonstrate strong effects for the majority of the studies, researchers and consumers may also ask what (if anything) moderates the effect. Continuing the example, in a study of Daily Behavior Report Cards Vannest et al. (2010) hypothesized that ESs were moderated by the type of
Fig. 7. A Forest Plot of Improvement Rate Difference Effect Sizes.
Fig. 8. A Comparison of Effects of Daily Behavior Report Cards on Six Categories of Dependent Variables.
behavior/dependent variable under investigation. Six categories were constructed based on the variables in the study: study skills and work productivity, minor social behavior (e.g., talking out), major social behavior (e.g., disruptions, fighting, and confrontation), attendance, on-task, and compliance with class rules. Results shown in Fig. 8 demonstrate that effects of Daily Behavior Report Cards were different across behavior categories. When whisker lines representing 95% CIs overlap, there is no statistically significant difference at p < .05. For instance, minor behavior improved at a statistically significantly higher rate than attendance or study skills and work productivity. This figure visually represents ESs (small squares) for individual studies vertically based on the category of interest (in this case type of behavior or dependent variable). For each type of behavior category there is also a range of ESs and CIs. Also presented are CIs (horizontal whiskers or lines), mean ES (larger diamonds) for the group or category of studies, and an overall grand mean ES for all studies and phase contrasts. In this same example, a second hypothesis regarding the variability of data was tested to determine if ESs were explained by the quality of measurement (since most of the measures were “home-grown”). Studies with high variability were equated to low measurement quality. To answer
this question, the stability of Phase A data across studies was examined. We used only phase A data (baseline) to eliminate the effect of the intervention in stabilizing behavioral responding. To assess variability we used the coefficient of variation (COV), COV = SD/M, or the standard deviation expressed as a proportion of the mean. It is a relative measure, but it permits comparison across instruments and studies. Contrasts were grouped by their Phase A COVs into four roughly equal groups, from low to high COV. A low COV reflects better measurement, and a high COV reflects poorer measurement. We expected an inverse relation whereby high COV is associated with low ESs, reflecting that measurements that are less sensitive or contain greater error are less likely to detect change. Fig. 9 shows the scores for the four COV groupings and visually depicts ESs for individual studies (small squares), CIs (horizontal whiskers or lines), mean ES (larger diamonds) for a group or category of studies, and an overall grand mean ES for all studies and phase contrasts. These three examples provide some illustration of the types of knowledge that can be gained from synthesizing SCR. In the first instance we see evidence of an overall effect one could expect when using the practice and the confidence we have in this ES. We also see that there is a large range of
Fig. 9. Four Categories of Coefficient of Variation Scores to Test if Measurement Mediates Effect Size.
ESs produced from this intervention and so ESs can be used to examine additional research questions, such as the differential effects based on variables, or types of measures, or quality of measures. Thus, ESs can both establish treatments as evidence based and can contribute to a greater understanding about implementation of EBTs.
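The COV grouping used in the measurement-quality example can be sketched as follows. The sketch assumes the sample standard deviation (the text gives only COV = SD/M), assumes nonzero baseline means, and uses hypothetical contrast labels and baseline values. The rule for splitting the ordered contrasts into four roughly equal groups is also ours.

```python
from statistics import mean, stdev


def cov(values):
    """Coefficient of variation: standard deviation as a proportion of the mean."""
    return stdev(values) / mean(values)


def four_groups(baselines):
    """Order contrasts by phase A COV and split them into four roughly equal groups."""
    ordered = sorted(baselines, key=lambda name: cov(baselines[name]))
    n = len(ordered)
    return [ordered[i * n // 4:(i + 1) * n // 4] for i in range(4)]


# Hypothetical phase A (baseline) series keyed by contrast label (assumption).
baselines = {
    "contrast_1": [10, 10, 11, 10],
    "contrast_2": [2, 8, 3, 9],
    "contrast_3": [5, 6, 5, 7],
    "contrast_4": [1, 9, 2, 12],
    "contrast_5": [20, 22, 21, 19],
    "contrast_6": [4, 4, 5, 4, 3, 4],
    "contrast_7": [3, 7, 4, 8],
    "contrast_8": [6, 9, 5, 10],
}

for rank, group in enumerate(four_groups(baselines), start=1):
    print(f"COV group {rank} (low to high):", group)
```

The ESs of the contrasts in each group could then be averaged and compared across groups to look for the expected inverse relation between COV and ES.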
HOW TO AGGREGATE SINGLE-CASE RESEARCH DATA
Aggregating studies adds to the evidence base by combining the results of multiple studies. This increases the number of replications of an effect and creates a larger participant pool. An omnibus ES presents these findings overall. Studies can be grouped to look at particular variables of interest that may moderate effects. When combining studies for meta-analysis, only quality research is recommended for inclusion. The first step after completing a literature review of studies for aggregation is to determine which to include or exclude. The WWC (2011) standards, which build on the quality indicators and standards proposed by Horner et al. (2005), include a decision-making hierarchy for determining EBTs on the basis of SCR. Only studies of sound quality are to be evaluated. Evidence standards include number of replications, length of data phases, fidelity, and inter-rater reliability. Designs may “meet,” “meet with reservations,” or “not meet” evidence standards (WWC, 2011, p. 77). Studies that meet evidence standards with or without reservations then proceed to the next step of estimating effects using visual analysis techniques. Kratochwill and colleagues identified a categorical metric of strong, moderate, or no evidence and suggested that ESs should be estimated only for studies shown to have strong or moderate effects through visual analysis of data (WWC, 2011). It may be valuable, however, to analyze effects for studies in which treatment effects may not be apparent through visual inspection alone. Variability and trend can impair visual analysis. We suggest that ESs be estimated for all studies of quality that demonstrate a sound design capable of detecting causal relationships. ES calculations can offer research consumers information that visual analysis cannot. Moreover, negligible, small, or no effects are an important component of our knowledge base and will have an impact on the aggregation of studies.
Including only studies that have large or moderate effects provides an incomplete picture of the research literature.
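As a rough sketch of how individual ESs might be combined into an omnibus ES, and of the whisker-overlap comparison between groups described earlier, consider the code below. The chapter does not specify a weighting scheme, so the fixed-effect inverse-variance weights are an assumption, as are the normal-approximation intervals and all numeric values. Only the 83.4% interval level comes from the text (Payton, Greenstone, & Schenker, 2003).

```python
from math import sqrt
from statistics import NormalDist


def omnibus_es(effects):
    """Inverse-variance weighted mean of (es, se) pairs and its standard error."""
    weights = [1 / se ** 2 for _, se in effects]
    pooled = sum(w * es for w, (es, _) in zip(weights, effects)) / sum(weights)
    return pooled, sqrt(1 / sum(weights))


def interval(es, se, level):
    """Normal-approximation interval around an ES (assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return es - z * se, es + z * se


def whiskers_overlap(first, second, level=0.834):
    """True when the two 83.4% intervals overlap (no difference at about p = .05)."""
    low_1, high_1 = interval(*first, level)
    low_2, high_2 = interval(*second, level)
    return low_1 <= high_2 and low_2 <= high_1


# Hypothetical (es, se) pairs for two categories of contrasts (assumption).
category_one = [(0.85, 0.10), (0.75, 0.08), (0.90, 0.12)]
category_two = [(0.40, 0.10), (0.55, 0.09)]

pooled_one = omnibus_es(category_one)
pooled_two = omnibus_es(category_two)
print("category one:", tuple(round(value, 3) for value in pooled_one))
print("category two:", tuple(round(value, 3) for value in pooled_two))
print("whiskers overlap:", whiskers_overlap(pooled_one, pooled_two))
```

Whatever weighting is actually used should be reported along with the number of contrasts and studies that feed each omnibus ES.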
For most approaches to estimating ESs for SCR, raw data are required but may not be available in a published manuscript. Graphs are sometimes crowded and Y axes are not always demarcated in units that allow reconstruction of the data. One method for extracting data is through scanning and digitizing graphs. Software such as Getdata Graph Digitizer (Version 2.21) from getdata.com is available. Digitizing data produces an exact reconstruction of graphed data as numerical data in table form. This involves a multistep process:
1. Gather or create an electronic file such as a PDF of the graphed data. Keep article identification data on the graph for later organization.
2. Scan the graphs into the software program and follow steps for use, such as setting X- and Y-axis values.
3. Convert graph to raw data.
4. Import values into an Excel spreadsheet (column = phase).
After the raw data are accessible and visual analysis has been used to assess the data for obvious trends or concerns in data patterns, calculating ESs provides a standardized metric for synthesizing results. There are many choices for calculating ESs, several of which we presented in this chapter and conceptualize as “bottom-up” analysis (Parker & Vannest, 2012). By bottom-up we mean an analytic strategy that proceeds from visually guided selection of individual phase contrasts (the bottom) to combining these results to form a single (or a few) omnibus ES(s) representing the entire design (the top). Bottom-up strategies involve and rely on the behavior analyst or interventionist determining which phases to contrast, which data to consider in the analysis, and the context for interpreting the effects, whether for individual studies or for aggregated findings/meta-analysis. Bottom-up analyses are customized based on the logic of the design and the data patterns, so they complement visual analysis and produce relevant answers to questions about the size of the behavior change.
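Once digitized values sit in a spreadsheet with one column per phase, they can be read back into per-phase series for ES calculation. The sketch below assumes the spreadsheet has been exported as CSV text with the phase labels in the header row and with shorter phases padded by empty cells. The function name and the sample values are hypothetical.

```python
import csv
import io


def read_phases(csv_text):
    """Return {phase label: [values]} from CSV text laid out with column = phase."""
    reader = csv.DictReader(io.StringIO(csv_text))
    phases = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, cell in row.items():
            if name is None:
                continue  # ignore cells beyond the header columns
            if cell is not None and cell.strip():
                phases[name].append(float(cell))
    return phases


# Hypothetical export (assumption): 6 phase A values and 10 phase B values.
SAMPLE = """A,B
4,4
4,6
5,7
4,8
3,6
4,7
,9
,4
,8
,9
"""

phases = read_phases(SAMPLE)
print({name: len(values) for name, values in phases.items()})
```

Keeping the article identifier in the file name or in an extra column preserves the link between each series and its source graph.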
These techniques fit well with SCR designs in their parsimonious approach. Most work even with just a few data points and do not require extensive statistical knowledge. Earlier we mentioned other approaches to calculating ESs for SCR that are beyond the scope of this chapter. These require more advanced statistical knowledge and are typically conducted by an expert methodologist or statistician rather than the interventionist. Multilevel or hierarchical linear models (MLM, HLM) are sophisticated options for interpreting effects that may be inconsistent with the traditional visual analyses because they may not adequately capture the entire effect. For example, a visual analyst considers individual and
environmental factors and features in understanding the data, and designs a study to produce conclusion validity. Using techniques without involving this viewpoint can produce decontextualized results and potentially misleading conclusions. There are a variety of parametric methods for calculating ESs in SCR, such as HLM and MLM (van den Noortgate & Onghena, 2003, 2008), randomization (Kratochwill & Levin, 2010), and complex multiseries regression techniques such as ordinary least squares (OLS; Allison & Gorman, 1993; Huitema & McKean, 2000) and generalized least squares (GLS; Maggin, Swaminathan, et al., 2011). These may or may not be viable approaches, as each has its own strengths and weaknesses (Parker & Vannest, 2012), and we encourage readers to become sufficiently well versed in them to make informed decisions about which approaches offer the best fit for their interests and needs.
Establishing Evidence-Based Treatments
EBTs are typically identified through aggregation of findings across studies. When aggregating SCR studies to identify EBTs, the first step involves searching the literature and identifying high-quality, trustworthy SCR studies. Next, data from individual single-case studies are extracted to calculate ESs for each participant, setting, or treatment variation. These individual ESs can then be aggregated to give an ES for the entire study, which can in turn be combined across studies in a meta-analysis that yields an aggregate ES for a treatment across studies. Further analysis of ESs for a selected intervention can answer additional questions of interest about mediator and moderator variables or statistically significant differences in treatment outcomes between interventions. Aggregation of single-case outcomes helps educators to know specifically which populations respond with the greatest magnitude of change when selecting and implementing EBTs (Maggin, O’Keeffe, & Johnson, 2011). By employing statistically rigorous and defensible methods, researchers can accurately aggregate results and draw generalizable conclusions from studies, which are used as an integral component of EBP. The WWC engages in systematic reviews to establish EBTs. The general WWC process includes (a) protocol development, (b) literature search, (c) screening and evaluating studies, (d) combining findings, and (e) presenting a summary or conclusion from the review (WWC, 2011). This same process is appropriate for individual researchers or research teams to use with SCR
methodology. Protocol development keeps teams consistent and allows for data and reliability to be assessed at each step of the process. Literature search procedures continue to evolve, and reports of literature searches should include search terms, findings, and reliability of findings. Screening and evaluating studies likewise includes protocol use and reliability reporting. This chapter has focused on steps 4 and 5 of WWC’s proposed process: the aggregation of findings within studies, between studies, and across studies, as well as how to present the results. Step 4, combining findings, allows researchers to scientifically evaluate studies for mechanisms of change and to identify mediating and moderating variables. Step 5 involves standardized reporting, and so we have presented a rationale and criteria for the reporting of ESs (something currently missing from the literature). Synthesis (i.e., aggregation or meta-analysis) is an accepted methodology (Kavale, 2001; Lipsey & Wilson, 2001), and the American Psychological Association (APA) requires that ESs and CIs be presented, and presented in a manner that is accessible and interpretable (APA, 2001, p. 26; Wilkinson & APA Task Force on Statistical Inference, 1999). But SCR is somewhat unique and until recently has been omitted from the meta-analysis literature. Analysis of SCR was previously thought to be inaccessible to statistical measures like ES and CI (Parker & Vannest, 2009; Parker et al., 2009). Although we are beginning to see ES measures reported with SCR with greater frequency, it is still rare to see CIs reported. A significant hurdle to meta-analyzing SCR is the variety of methodological issues unique to this body of research (Faith, Allison, & Gorman, 1996). The reliance on visual analysis of graphed data poses challenges for deriving numerical estimates for each study (Shadish & Cook, 2009).
In addition, the wide variety of designs leads to questions regarding the most appropriate phases to compare (Parker & Brossart, 2006). Despite these challenges, the discipline of SCR meta-analysis and the use of SCR studies in the identification of EBTs are advancing rapidly.
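The aggregation sequence described above (participant-level ESs combined within a study, then study-level ESs combined across studies) can be sketched in a few lines of Python. The sketch assumes fixed-effect, inverse-variance weighting, a common meta-analytic convention (e.g., Hedges & Olkin, 1985) that this chapter does not itself prescribe, and it assumes each ES is accompanied by a standard error. The function names and all numeric values are hypothetical and serve only as illustration.

```python
import math


def pool(effects, std_errors):
    """Fixed-effect, inverse-variance pooling (an assumed weighting scheme).

    effects and std_errors are parallel lists; returns (pooled_es, pooled_se).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and equal in length")
    # Each ES is weighted by the inverse of its variance (1 / SE^2).
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled_es = sum(w * es for w, es in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_es, pooled_se


def confidence_interval(estimate, se, z=1.96):
    """Normal-approximation CI; z = 1.96 gives roughly a 95% interval."""
    return estimate - z * se, estimate + z * se


# Stage 1: combine hypothetical participant-level ESs within each study.
study_1 = pool([0.60, 0.80], [0.20, 0.20])
study_2 = pool([0.40, 0.50, 0.60], [0.30, 0.30, 0.30])

# Stage 2: combine the study-level ESs across studies.
overall, overall_se = pool([study_1[0], study_2[0]], [study_1[1], study_2[1]])
lower, upper = confidence_interval(overall, overall_se)
print(f"aggregate ES = {overall:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Reporting the interval alongside the pooled ES follows the APA expectation noted above that ESs and CIs be presented together. When effects are expected to vary across studies, a random-effects or multilevel model (e.g., van den Noortgate & Onghena, 2003) may be a better fit than this fixed-effect sketch.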
CONCLUSION
In the context of EBP, we showed how SCR can add unique information in establishing EBTs by maintaining individual data while allowing for standardized reporting of results. We also presented seven methods for nonparametric ES calculation and mentioned other sophisticated statistical techniques that will likely continue to evolve. We described what an ES is and is not, provided some examples for calculating ESs,
KIMBERLY J. VANNEST AND HEATHER S. DAVIS
demonstrated the relevance of aggregation, and presented guidelines for reporting results. Using SCR work to identify and examine EBTs adds to our body of knowledge by allowing us to keep the details about individual treatments and individual clients while also giving the combined studies a “weight” that they do not have independently. When analyses are undertaken with care, decisions respect SCR design logic, and these decisions are articulated throughout the research process, aggregation of SCR work brings new information to the table. However, if aggregation methods are applied carelessly, ignoring the quality of the individual studies, or if the logic of the design does not allow causal inference, the resulting conclusions would be questionable at best. Those using SCR to determine EBTs have many techniques available to them. Nonparametric analyses of SCR designs have evolved to be sensitive, powerful, and able to handle trend. These analyses can express the size of change associated with treatments in a standardized metric, present a band of certainty around the findings, and may identify the moderators of treatment effects. Once EBTs are established, evidence-based practitioners then incorporate assessment of needs and use expert decision making to select and apply EBTs. As with changes in allied fields such as medicine, the identification of EBTs will help foster EBP and ultimately improve outcomes for children and youth with disabilities.
REFERENCES
Allison, D. B., & Gorman, B. S. (1993). Calculating effect sizes for meta-analysis: The case of the single case. Behaviour Research and Therapy, 31, 621–631.
American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
American Psychological Association Task Force on Evidence-Based Practice for Children and Adolescents. (2008). Disseminating evidence-based practice for children and adolescents: A systems approach to enhancing care. Washington, DC: American Psychological Association.
Baer, D. M., Wolf, M. M., & Risley, T. R. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1, 91–97.
Boardman, A. G., Argüelles, M. E., Vaughn, S., Hughes, M. T., & Klingner, J. (2005). Special education teachers’ views of research-based practices. The Journal of Special Education, 39, 168–180.
Bowman-Perrott, L., Davis, H. S., Vannest, K. J., Williams, L., Greenwood, C., & Parker, R. I. (2013). Academic benefits of peer tutoring: A meta-analytic review of single case research. School Psychology Review, 42(1), 39–55.
Brossart, D. F., Parker, R. I., Olson, E. A., & Mahadevan, L. (2006). The relationship between visual analysis and five statistical analyses in a simple AB single-case research design. Behavior Modification, 30(5), 531–563.
Chard, D. J., Ketterlin-Geller, L. R., Baker, S. K., Doabler, C., & Apichatabutra, C. (2009). Repeated reading interventions for students with learning disabilities: Status of the evidence. Exceptional Children, 75, 263–281.
Cooper, H. M., Hedges, L. V., & Valentine, J. C. (2009). The handbook of research synthesis and meta-analysis. New York, NY: Russell Sage Foundation Publications.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Cook, B. G., Landrum, T. J., Tankersley, M., & Kauffman, J. M. (2003). Bringing research to bear on practice: Effecting evidence-based instruction for students with emotional or behavioral disorders. Education and Treatment of Children, 26, 345–361.
Cook, B. G., Tankersley, M., & Landrum, T. J. (2009). Determining evidence-based practices in special education. Exceptional Children, 75(3), 365–383.
Council for Exceptional Children. (2012). Classifying the state of the evidence for special education professional practices: CEC practice study manual. Retrieved from http://www.cec.sped.org/AM/Template.cfm?Section=Evidence_based_Practice&Template=/TaggedPage/TaggedPageDisplay.cfm&TPLID=24&ContentID=4710
Faith, M. S., Allison, D. B., & Gorman, B. S. (1996). Meta-analysis of single-case research. Hillsdale, NJ: Lawrence Erlbaum Associates.
Gage, N. A., Lewis, T. J., & Stichter, J. P. (2012). Functional behavioral assessment-based interventions for students with or at risk for emotional and/or behavioral disorders in school: A hierarchical linear modeling meta-analysis. Behavioral Disorders: Journal of the Council for Children with Behavioral Disorders, 37(2), 55.
Ganz, J. B., Davis, J. L., Lund, E. M., Goodwyn, F. D., & Simpson, R. L. (2012). Meta-analysis of PECS with individuals with ASD: Investigation of targeted versus nontargeted outcomes, participant characteristics, and implementation phase. Research in Developmental Disabilities, 33, 406–418. doi: 10.1016/j.ridd.2011.09.023
Ganz, J. B., Earles-Vollrath, T. L., Heath, A. K., Parker, R., Rispoli, M. J., & Duran, J. (2012). A meta-analysis of single case research studies on aided augmentative and alternative communication systems with individuals with autism spectrum disorders. Journal of Autism and Developmental Disorders, 42(1), 60–74. doi: 10.1007/s10803-011-1212-2
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Greenwood, C. R., & Abbott, M. (2001). The research to practice gap in special education. Teacher Education and Special Education, 24, 276–289.
Guyatt, G. H., Naylor, D., Richardson, W. S., Greene, L., Haynes, R. B., Wilson, M. C., … Jaeschke, R. Z. (2000). What is the best evidence for making clinical decisions? JAMA, 284(24), 3127–3128.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Huitema, B. E., & McKean, J. W. (2000). Design specification issues in time-series intervention models. Educational and Psychological Measurement, 60, 38–58.
Kavale, K. A. (2001). Meta-analysis: A primer. Exceptionality, 9, 177–183.
Kazdin, A. E. (2010). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York, NY: Oxford University Press.
Kennedy, C. H. (2005). Single case designs for educational research. Boston, MA: Allyn & Bacon.
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single case designs technical documentation. In What Works Clearinghouse: Procedures and standards handbook (version 2.0). Retrieved from http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=229; http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf/
Kratochwill, T. R., & Levin, J. R. (2010). Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Ma, H. H. (2006). An alternative method for quantitative synthesis of single-subject researches: Percentage of data points exceeding the median. Behavior Modification, 30, 598–617.
Maggin, D. M., O’Keeffe, B. V., & Johnson, A. H. (2011). A quantitative synthesis of methodology in the meta-analysis of single-subject research for students with disabilities: 1985–2009. Exceptionality, 19, 109–135.
Maggin, D. M., Swaminathan, H., Rogers, J., O’Keeffe, B. V., Sugai, G., & Horner, R. H. (2011). A generalized least squares regression approach for computing effect sizes in single-case research: Application examples. Journal of School Psychology, 49, 301–320.
Mason, R. A., Ganz, J. B., Parker, R. I., Boles, M. B., Davis, H. S., & Rispoli, M. J. (2013). Video-based modeling: Differential effects due to treatment protocol. Research in Autism Spectrum Disorders, 7(1), 120–131.
Newson, R. (2001). Parameters behind “nonparametric” statistics: Kendall’s Tau, Somers’ D and median differences. Stata Journal, 1, 1–20.
Ottenbacher, K. J. (1990). Visual inspection of single-subject data: An empirical analysis. Mental Retardation, 28, 283–290.
Parker, R. I., & Brossart, D. F. (2006). Phase contrasts for multi-phase single case intervention designs. School Psychology Quarterly, 21(1), 46–61.
Parker, R. I., Hagan-Burke, S., & Vannest, K. J. (2007). Percentage of all non-overlapping data (PAND): An alternative to PND. Journal of Special Education, 40, 194–204.
Parker, R. I., & Vannest, K. J. (2009). An improved effect size for single case research: Nonoverlap of All Pairs (NAP). Behavior Therapy, 40, 357–367. doi: 10.1016/j.beth.2008.10.006
Parker, R. I., & Vannest, K. J. (2010, May). Visual analysis of data plots and effect sizes: Is there any common ground? Paper session presented at the Association for Behavior Analysis International Annual Convention, San Antonio, TX.
Parker, R. I., & Vannest, K. J. (2012). Bottom-up analysis of single-case research designs. Journal of Behavioral Education, 17(1), 254–265. doi: 10.1007/s10864-012-9153-1
Parker, R. I., Vannest, K. J., & Brown, L. (2009). The improvement rate difference for single case research. Exceptional Children, 75, 135–150.
Parker, R., Vannest, K. J., & Davis, J. L. (2011). Nine non-overlap techniques for single case research. Behavior Modification, 35, 303–322.
Parker, R. I., Vannest, K. J., Davis, J. L., & Sauber, S. B. (2011). Combining nonoverlap and trend for single-case research: Tau-U. Behavior Therapy, 42, 284–299.
Payton, M. E., Greenstone, M. H., & Schenker, N. (2003). Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance? Journal of Insect Science, 3, 1–6.
Pennypacker, H. S., Koenig, C. H., & Lindsley, O. R. (1972). Handbook of the standard behavior chart. Kansas City, KS: Precision Media.
Sackett, D., Richardson, W., Rosenberg, W., & Haynes, B. (2000). Evidence-based medicine (2nd ed.). London: Churchill Livingstone.
Sackett, D. L., Rosenberg, W. M. C., Gray, J. A. M., Haynes, R. B., & Richardson, W. S. (1996). Editorial: Evidence based medicine: What it is and what it isn’t. British Medical Journal, 312, 71–72.
Scruggs, T. E., & Mastropieri, M. A. (2001). How to summarize single-participant research: Ideas and applications. Exceptionality, 9, 227–244.
Scruggs, T. E., Mastropieri, M. A., & Casto, G. (1987). The quantitative synthesis of single subject research: Methodology and validation. Remedial and Special Education, 8, 24–33.
Shadish, W., & Cook, T. D. (2009). The renaissance of experiments. Annual Review of Psychology, 60, 607–629.
Slavin, R. (2002). Evidence-based education policies: Transforming educational practice and research. Educational Researcher, 31(7), 15–21.
Thompson, B. (2007). Effect sizes, confidence intervals, and confidence intervals for effect sizes. Psychology in the Schools, 44, 423–432.
van den Noortgate, W., & Onghena, P. (2003). Hierarchical linear models for the quantitative integration of effect sizes in single case research. Behavior Research Methods, Instruments & Computers, 35, 1–10.
van den Noortgate, W., & Onghena, P. (2008). A multilevel meta-analysis of single subject experimental design studies. Evidence-Based Communication Assessment and Intervention, 2, 142–158.
Vannest, K. J., Davis, J. L., & Parker, R. I. (2013). Single case research in schools: Practical guidelines for school based professionals. New York, NY: Taylor and Francis.
Vannest, K. J., Davis, J. L., Davis, C. R., Mason, B. A., & Burke, M. D. (2010). Effective intervention and measurement with a daily behavior report card: A meta-analysis. School Psychology Review, 39, 654–672.
Vannest, K. J., Harrison, J., Parker, R., Harvey, K. T., & Ramsey, L. (2010). Improvement rate differences of academic interventions for students with emotional and behavioral disorders. Remedial and Special Education. doi: 10.1177/0741932510362509
What Works Clearinghouse. (2008). Procedures and standards handbook (version 1.0). Retrieved from http://ies.ed.gov/ncee/wwc/references/idocviewer/doc.aspx?docid=19&tocid=1
What Works Clearinghouse. (2011). Procedures and standards handbook (version 2.1). Retrieved from http://ies.ed.gov/ncee/wwc/DocumentSum.aspx?sid=19
White, O. R., & Haring, N. G. (1980). Exceptional teaching: A multimedia training package. Columbus, OH: Merrill.
Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Wolery, M., Busick, M., Reichow, B., & Barton, E. (2010). Comparison of overlap methods for quantitatively synthesizing single-subject data. Journal of Special Education, 44, 18–28.
CHAPTER 6
UTILIZING EVIDENCE-BASED PRACTICE IN TEACHER PREPARATION
Larry Maheady, Cynthia Smith and Michael Jabot
ABSTRACT
Evidence-based practice (EBP) can have a powerful impact on school-aged children. Yet this impact may not be realized if classroom teachers do not use empirically supported interventions and/or fail to include the best research available when they make important educational decisions about children. Whether classroom teachers use EBP may be influenced, in part, by what they learned or failed to learn in their preservice preparation programs. This chapter describes recent efforts to assess preservice teachers’ understanding and use of empirically supported interventions and provides four examples of how such practices were taught to preservice general educators in a small, regional teacher preparation program. We discuss four contemporary educational reform movements (i.e., federal policies mandating EBP, state-level policies linking growth in pupil learning to teacher evaluation, clinically rich teacher preparation, and the emergence of a practice-based evidence approach) that should increase interest in and use of EBP in teacher education and offer recommendations for how teacher educators might
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 121–147
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026008
LARRY MAHEADY ET AL.
infuse EBP into their traditional teaching, research, and service functions in higher education.
What influence, if any, has evidence-based practice (EBP) had on teacher preparation in the United States? Is it reflected in the curriculum, pedagogy, and applied practice settings that make up the major components of most preparation programs? Which practices should future teachers learn to use and how can teacher educators prepare them to apply these interventions across increasingly diverse populations, settings, and school contexts? Ultimately, how can teacher educators and P-12 schools work collaboratively to create evidence-based cultures in both settings? These questions, among others, are the focus of this chapter. While we cannot answer any of them with confidence, we can start a dialogue that may raise the visibility of EBP in teacher education. Presently, very little is known about the impact or role of EBP in teacher preparation. While there have been discussions on the topic, more often among special than general educators (e.g., Cook, Landrum, Tankersley, & Kauffman, 2003; Cook, Tankersley, & Landrum, 2009; Greenwood & Abbott, 2001), there does not appear to be a generalized awareness of the movement, a clear understanding of its potential utility, or a clamoring of support to bring EBP to teacher preparation programs. In preparing this chapter, for example, we conducted a cursory review of five prestigious teacher education journals (i.e., American Educational Research Journal, Educational Evaluation and Policy Analysis, Educational Researcher, Journal of Teacher Education, and Teaching and Teacher Education) over the past three years (i.e., 2010–2012) to see if the terms “evidence- or scientifically based practice(s)” were included in article titles. Only 2 of 813 article titles (about 0.25%) included the terms, a rather low rate for such an important topic.
The relative lack of attention to EBP, particularly among general educators, may be a function of the relative youth of the movement, ambiguities surrounding its nature and application (e.g., what constitutes “evidence”? how much and what types of evidence are acceptable? how can evidence-based practices be transported to practice settings?), and/or ideological differences with the concept itself. Irrespective of why EBP has not prospered among teacher educators, future teachers would benefit from better preparation in its understanding and application.
EBP in Teacher Preparation
We begin with a brief discussion of the general importance of teachers to pupil learning and the role that teacher education programs may play in their development. Next, we review studies that examined preservice teachers’ understanding and use of empirically supported interventions. These studies, which used primarily indirect assessment measures (i.e., surveys, interviews, and syllabi analyses), found little awareness or understanding of EBP among preservice teachers and even fewer opportunities to use such interventions in applied settings. We then describe four projects that taught preservice teachers to use empirically supported interventions, measure fidelity of implementation, and collect data on pupil performance to evaluate the impact of their practice. All projects were completed in required coursework with accompanying field placements and are offered here as exemplars for future research and practice. Four contemporary reform efforts (i.e., federal policies mandating EBP, state-level policies linking growth in pupil learning to teacher evaluation, National Council for the Accreditation of Teacher Education (NCATE)-driven clinically rich teacher preparation, and the emergence of a practice-based evidence approach) are then discussed as advances that should promote wider understanding and use of EBP in teacher education. We conclude with a discussion of how EBP might be utilized more widely among teacher educators through their traditional teaching, research, and service functions in higher education. The following caveats are offered related to our discussion. First, there are other important facets of EBP in teacher education that are beyond the scope of this chapter (e.g., policy and governance support in higher education, legal and ethical issues associated with the use of EBP, and procedures for creating evidence-based cultures).
Second, the term evidence-based practice is used to refer to the broad application of best available evidence to all facets of the educational decision-making process, while empirically supported interventions is used to refer to specific treatments or activities that were shown to meet criteria for research support (Detrich, 2008; Spencer, Detrich, & Slocum, 2012). Third, our analyses, interpretations, and discussion are derived from our experiences as mathematics, science, and special educators who have prepared preservice general education teachers over the past two decades (Maheady, Harper, Karnes, & Mallette, 1999; Maheady, Harper, Mallette, & Karnes, 1993). This is relevant because our perspectives may be more reflective of general than special-education teacher preparation and preservice rather than in-service preparation. However, given that most students with learning and behavioral disorders
are taught to some extent in general education settings, the focus on the preparation of preservice general educators is appropriate.
TEACHER EDUCATION, TEACHING PRACTICE, AND PUPIL LEARNING
Most people believe that good teachers make an important difference in children’s lives. Indeed, many of us have fond memories of teachers who inspired, challenged, and supported us. There is less consensus, however, on whether good teachers are “born or made” and the extent to which teacher education programs contribute to their professional development. In the 1960s, for instance, conventional wisdom emphasized the importance of the home and socioeconomic status on pupil learning. Cochran-Smith and Zeichner (2005) noted that teacher education programs and schools in general, as well as teachers in particular, were viewed as relatively unimportant during this era. The perception of teachers’ importance has grown substantially since, however, spurred on by public policies that directed more resources to teacher preparation and improved research methods that provided decision-makers with better evidence about teacher impact on student achievement (States, Detrich, & Keyworth, 2012). The rise of quantitative research methods, for example, corroborated many people’s belief that high-quality teachers do, in fact, make a measurable impact on pupil learning. States et al. (2012) noted that research using effect sizes (e.g., Forness, 2001; Hattie, 2009; Kavale & Forness, 2000), value-added modeling (e.g., AERA, 2004; Sanders & Rivers, 1996), and randomized controlled trials (Nye, Konstantopoulos, & Hedges, 2004) has provided increasing evidence that teachers contribute significantly to pupil learning. Given the growing consensus and evidence that teachers make an important difference in pupil learning, one might ask, to what extent does teacher education influence teaching practice? Goe and Coggshall (2007) reported that only a small number of empirical studies have examined this relationship and even fewer have linked teaching practice to pupil learning.
Wilson, Floden, and Ferrini-Mundy (2002) concluded as well that valid, methodologically rigorous research that links the content and structure of teacher preparation programs to student outcomes is scant, inconclusive, and aggregated at a level that is not particularly useful for teacher preparation. It is against this broader knowledge void that the impact of EBP in teacher education is examined.
TEACHER UNDERSTANDING AND USE OF EMPIRICALLY SUPPORTED INTERVENTIONS
A basic premise of EBP is that students are more likely to benefit from instruction when teachers use interventions that were derived from the best available evidence (States et al., 2012). As such, researchers have used surveys, focused interviews, observations, and syllabi reviews to examine pre- and in-service teachers’ awareness and use of empirically supported interventions. Begeny and Martens (2006), for example, conducted an initial survey that asked 110 preservice general and special education teachers to report on their use of classroom-based interventions with strong empirical support (e.g., Direct Instruction, formative data collection and analysis, and peer tutoring). Participants from six colleges in the Northeast completed the 30-item Index of Training in Behavioral Instruction Practices and answered three questions: (a) whether they received “no training,” “very similar training,” or “exact training” in 26 targeted practices, (b) the total amount of coursework they received on selected practices (0 = none to 6 = an entire course), and (c) the total amount of applied practice they received (0 = 0 hours to 6 = 15+ hours). Overall, preservice teachers reported receiving little preparation in behavioral concepts, strategies, programs, and assessment practices. Moreover, the instruction they did receive focused more on general behavioral principles than on explicit empirically supported interventions. The researchers suggested that preservice teachers’ lack of preparation in pupil progress monitoring was particularly troublesome given the importance of such information for making critical instructional decisions. They noted further that special education teachers had more learning opportunities than their general education peers to use certain interventions, but no significant differences emerged between elementary and secondary educators.
Burns and Ysseldyke (2009) surveyed a national sample of special education teachers and school psychologists regarding the use of instructional practices that were identified previously as having large, medium, and/or small effect sizes (Forness, 2001; Kavale & Forness, 2000). Special education teachers were asked to note how often they used three interventions with large effect sizes (e.g., mnemonics, applied behavior analysis, and direct instruction); one with a medium effect size (e.g., formative assessment); and four with small effect sizes (e.g., perceptual motor and modality training, social skills, and psycholinguistic training) using a five-point, Likert-type scale (1 = almost every day; 3 = once or twice a month;
5 = almost never). School psychologists, in contrast, were requested to rank order the same practices in terms of how often they observed them being used with students with special needs. Burns and Ysseldyke concluded that their findings were cause for concern and optimism. On the positive side, many special education teachers reported that they used interventions with strong research bases (direct instruction, applied behavior analysis, and mnemonic strategies), although not with the frequency one might prefer. On the other hand, almost one-third (30%) of special education teachers also said that they used practices with negligible to small effect sizes (e.g., perceptual-motor and modality training) at least weekly with students with special needs. The fact that teachers used interventions with and without empirical support suggested that selecting practices on the basis of evidence was not that common, at least not among this national sample of special education teachers. Most rankings by school psychologists were closely aligned with teacher reports. Jones (2009) conducted a qualitative study with 10 beginning special education teachers (i.e., <3 years of teaching experience) regarding their existing teaching practices, the processes they used to make instructional decisions, and their familiarity with six empirically supported interventions for students with learning and behavior disorders (i.e., direct instruction, peer-mediated learning, content enhancements, self-management, technology integration, and effective teaching behaviors). Using a series of open-ended questions during structured interviews, three classroom observations, and written responses to the Validated Practices Rating Scale, Jones identified three groups of respondents: (a) definitive supporters (40%), (b) cautious consumers (30%), and (c) critics (30%).
Definitive supporters made overt statements during interviews in support of the use of research in instructional planning, although none reported using practices because of results they had witnessed personally. Cautious consumers, in contrast, reported being less convinced that research played a substantive role in educational decision-making, citing its lack of applicability and relevance to students, as well as negative personal experiences. Critics also raised serious doubts about the utility and appropriateness of empirically supported interventions and repeatedly stated their general mistrust of educational research. Direct observations revealed that none of the beginning special educators (including definitive supporters) actually used any of the empirically supported interventions in their classrooms. Jones also found an interesting disconnect between teachers’ observed and reported teaching practices. Some special education teachers, for example, reported using empirically supported interventions to a great extent on the Validated
Practices Scale, yet they were never observed using them during formal observations. Reschly and his colleagues (e.g., Holdheide & Reschly, 2008; Oliver & Reschly, 2007; Reschly, Holdheide, & Smartt, 2008; Smartt & Reschly, 2007) took a different approach to examine the influence of EBP on teacher education. They conducted systematic reviews of course syllabi taken from teacher education programs to determine the extent to which empirically supported interventions were taught, observed, and practiced in required coursework. Using Innovation Configurations (IC) (Hall & Hord, 1987; Roy & Hord, 2004), tools that evaluate programs and assess fidelity of implementation, researchers rated whether or not essential components of empirically supported interventions (i.e., classroom organization and behavior management, scientifically based reading instruction, inclusive practices, and learning strategies) were present in course syllabi and, if so, to what degree they were implemented. Degree of implementation was rated using a five-point, Likert-type scale (i.e., 0 = no mention of component; 1 = component mentioned; 2 = component mentioned plus readings/tests; 3 = prior levels plus required papers and projects; and 4 = prior levels plus supervised practice and feedback). IC were used to evaluate teacher education syllabi, identify gaps in general and special education preparation programs, and prescribe professional development activities that improve preservice teachers’ understanding and use of empirically supported interventions (Reschly et al., 2008). Findings from IC research are consistent with survey outcomes and indicate that preservice teachers, in both general and special education, received little exposure to empirically supported interventions and even fewer opportunities to apply them.
One report noted, for example, that only 15% of general education teacher preparation programs in a national sample taught all five components of scientifically based reading instruction (phonemic awareness, phonics, fluency, vocabulary, and comprehension), while almost three times as many programs (43%) taught none of the components (Walsh, Glaser, & Wilcox, 2006). Similarly, Reschly et al. (2007) reviewed course syllabi from 26 of 31 special education programs in large population states using three different IC. In addition to documenting acceptable inter-rater reliability (.79 to .85), they found that most preparation programs failed to teach (or at least mention in syllabi) the five essential components of reading instruction. They also provided inadequate coverage and applied practice opportunities for systematic instruction, universal screening, and pupil progress monitoring.
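One way to picture the five-point degree-of-implementation scale is as a small data structure applied to each syllabus. The following Python sketch is illustrative only: the class, function, and field names are hypothetical, the component list borrows the five reading components named above, and the example ratings are invented rather than drawn from any study.

```python
from enum import IntEnum


class ImplementationLevel(IntEnum):
    """Degree-of-implementation ratings as described in the text (0 to 4)."""
    NOT_MENTIONED = 0
    MENTIONED = 1
    READINGS_OR_TESTS = 2
    PAPERS_OR_PROJECTS = 3
    SUPERVISED_PRACTICE = 4


# The five components of scientifically based reading instruction.
READING_COMPONENTS = (
    "phonemic awareness",
    "phonics",
    "fluency",
    "vocabulary",
    "comprehension",
)


def summarize_syllabus(ratings):
    """Summarize one syllabus; unrated components count as NOT_MENTIONED."""
    levels = {c: ImplementationLevel(ratings.get(c, 0)) for c in READING_COMPONENTS}
    covered = [c for c, lvl in levels.items() if lvl > ImplementationLevel.NOT_MENTIONED]
    practiced = [c for c, lvl in levels.items() if lvl == ImplementationLevel.SUPERVISED_PRACTICE]
    return {
        "covered": covered,
        "all_covered": len(covered) == len(READING_COMPONENTS),
        "practiced": practiced,
    }


# Hypothetical ratings for a single syllabus (invented for illustration).
example = {"phonics": 4, "fluency": 2, "vocabulary": 1}
summary = summarize_syllabus(example)
print(summary)
```

Tallying such summaries across programs would yield the kind of proportions reported above (e.g., the share of programs teaching all five components versus none), although the original studies may have scored and aggregated their ratings differently.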
Recently, Gable, Tonelson, Sheth, Wilson, and Park (2012) conducted a survey of over 3,000 general and special education teachers to identify their perspectives on the (a) importance, (b) use, and (c) preparation regarding 20 empirically supported interventions (e.g., clear rules/expectations, crisis intervention plans, peer-assisted learning, self-monitoring, group-oriented contingencies). This study was noteworthy for its large sample size and explicit focus on serving students with behavioral disorders. Using a researcher-developed survey, teachers rated perceived levels of importance, usage, and preparation for each of the empirically supported interventions using a five-point, Likert-type scale. Results indicated that a number of empirically supported practices were not in common use among general or special educators (e.g., group-oriented contingencies, peer-mediated interventions, anger management) and that teacher preparedness to use such practices was equally low. Researchers concluded that efforts must be intensified to prepare general and special education teachers in the use of empirically supported interventions, particularly with students with behavioral disorders. Finally, Kretlow and Bartholomew (2010) conducted a comprehensive review of research on the impact of peer coaching on preservice and in-service teachers’ use of empirically supported interventions. They reported that coaching consistently and overwhelmingly improved the fidelity with which preservice and practicing teachers implemented a variety of empirically supported interventions (e.g., Class Wide Peer Tutoring, Direct Instruction, Positive Behavior Support) in practicum or classroom settings.
They further highlighted important coaching components (i.e., high engagement small group training, follow-up observations, and performance-based feedback), noted the potential utility of a universal measure of instructional proficiency, and recommended adding coaching components to existing teacher preparation programs. They concluded that coaching was a promising practice for promoting the transport of empirically supported interventions from college to real classroom settings. Collectively, extant research does not paint a pretty picture of the state of the art regarding EBP in teacher education. Studies showed, for example, that (a) preservice special education teachers received very little instruction on EBP, instruction was more general than explicit, and few opportunities were provided to apply knowledge and skills; (b) practicing special education teachers reported using interventions with both large and small effect sizes regularly with students with special needs; (c) a general disconnect existed between what preservice general education teachers said and did instructionally; (d) essential components of empirically supported
interventions were rarely represented in teacher education syllabi and few opportunities existed for preservice teachers to use them in applied settings; and (e) peer coaching was identified as a promising practice for transporting preservice teachers’ use of empirically supported interventions from college to classroom settings. Obviously, a considerable amount of work must be done to expand awareness and use of EBP in teacher education. We offer a series of applied research projects here as a modest effort in that regard.
SAMPLE PROJECTS TO PREPARE PRESERVICE TEACHERS TO USE EBP
We have prepared preservice and in-service general education teachers for many years now. We have taught professional “methods” and research design courses, provided extensive professional development for elementary and secondary teachers, and conducted research on the effects of empirically supported interventions on pupil learning. We have also engaged in research on preservice teachers’ abilities to use empirically supported interventions, collect data on important pupil outcomes, and make better instructional decisions. All studies were completed in the context of required undergraduate and graduate coursework that included field experiences in primarily high-need schools. Four representative projects are described here as exemplars of (a) empirically supported practices that preservice teachers might use to improve pupil outcomes, (b) instructional arrangements that teacher educators might create or replicate to study teacher practice and pupil learning, and (c) methods of inquiry that educational researchers might use to address important P-12 school problems and accelerate the delivery of empirically supported interventions to school settings. A general overview of the four projects and outcomes can be seen in Table 1.
Table 1. Representative Studies Engaging Preservice Teachers in Use of Empirically Supported Interventions.
| Study | Participants and Settings | Empirically Supported Intervention | Preservice Teacher and Pupil Outcomes | Research Design | Major Findings |
| --- | --- | --- | --- | --- | --- |
| Maheady, Jabot, Rey, and Michielli-Pendl (2007) | 422 preservice general educators and children in field placements | Selected 1 of 6 practices; use in one lesson; measure fidelity of implementation | Fidelity of implementation; pre-post-test assessments of pupil performance | Descriptive study | Preservice teachers used interventions with high degrees of accuracy; most pupils (>80%) made noticeable or marginal learning gains on post-tests |
| Mallette et al. (1999) | Three dyads of preservice teachers and three students with special needs | Peer coaching and Peer-Assisted Learning Strategies (PALS); systematic training in use of both interventions | Fidelity of implementation; pupils’ oral reading fluency and comprehension | Single-case multiple baseline design across tutoring dyads | Peer coaching improved preservice teachers’ accuracy in implementing PALS and pupils showed concurrent gains in reading comprehension |
| Maheady et al. (2004) | 10 general education teachers in student teaching placements | Class Wide Peer Tutoring (CWPT); on-campus and in-class implementation assistance | Fidelity of implementation of CWPT; normalized learning gains for pupils from pre-post tests; social validity assessments | Descriptive study of teacher practice; pre-post assessments of pupil spelling grades | Preservice teachers implemented CWPT with high degrees of accuracy (88%); educationally important learning gain scores; positive evaluation of CWPT; procedural changes adversely impacted learning gains and pupil satisfaction |
| Lewis et al. (2012) | 22 third-grade students; 5 with IEPs; high rates of disruptive behavior | Group contingencies with randomized components; three jars | Fidelity of implementation; frequency of disruptive behaviors; social validity | Single-case A-B-A-B design | Three jars produced immediate and educationally important reductions across all disruptive behaviors even though teacher only monitored some pupils and behaviors |
First Year Preservice Teacher Use of Empirically Supported Interventions
The first project was completed with large groups of freshmen and sophomores enrolled in their first required education course (i.e., Introduction to Contemporary Education) in an inclusive general education program (Maheady, Jabot, Rey, & Michielli-Pendl, 2007). Students also participated in an 8- to 10-week field experience for approximately six hours per week. As part of course requirements, preservice teachers had to teach two formal lessons; collect pre- and post-test data; graph data to reflect entire class, small group, and individual pupil performance; and make written data-based instructional decisions. They were also required to use one of six relatively simple, empirically supported interventions (i.e., response cards, Numbered Heads Together, guided notes, graphic organizers, study guides, and cooperative structures) in one lesson and collect data on how accurately they implemented it. Students were assigned to field placements in pairs and shared instructional planning, implementation, and evaluation requirements including the collection of fidelity and outcome data. Four hundred and twenty-two preservice teachers, 78% of whom were placed in high-need schools, provided almost 17,000 hours of in-class assistance over four semesters. They taught more than 800 lessons and used empirically supported interventions with high degrees of accuracy (M = 92%; range = 88–96%). Pupils made noticeable or marginal learning gains based on normalized gain scores in over 80% of randomly selected lessons. The project was noteworthy because it showed that preservice teachers, even at the beginning of their careers, can learn to use some relatively simple empirically supported interventions with a high degree of fidelity in real-life teaching situations and assess their impact on pupil learning. As such, they had an opportunity to use curriculum-based measures of student performance to make important instructional decisions about whether student needs were met (Detrich, 2008). The descriptive study was also carried out in the context of a collaborative P-12 school–university partnership (i.e., Instructional Assistants Program) designed to meet pupil needs while simultaneously providing early and direct teaching opportunities for preservice educators.
Classroom teachers and children received over 17,000 hours of instructional assistance, preservice teachers had early teaching opportunities and an introduction to EBP, and teacher educators learned to assess, albeit descriptively, preservice teachers’ abilities to use empirically supported interventions. It should be noted as well that this P-12 school–university partnership has been sustained for over 15 years.
Using Peer Coaching to Improve Teacher Use of an Empirically Supported Intervention
The second project was completed in another required course within an 8- to 10-week field experience in a high-need elementary school (see Mallette, Maheady, & Harper, 1999). Introduction to the Exceptional Learner was taught to sophomore and junior general educators who were assigned
in pairs to work with one student with special needs twice per week in an after-school tutoring program. Within the context of the tutoring program, a single-case study was designed to examine the effects of reciprocal peer coaching on six preservice teachers’ implementation of an empirically supported intervention (i.e., Peer-Assisted Learning Strategy [PALS]; Fuchs, Fuchs, Mathes, & Simmons, 1997) with students with special needs. Using a multiple baseline across subjects design, researchers found that a multicomponent training package consisting of modeling, role-playing, and direct practice in PALS and reciprocal peer coaching increased preservice teachers’ use of relevant coaching behaviors which, in turn, improved the accuracy of PALS implementation during tutoring sessions. Importantly, tutees showed concurrent improvements in daily reading comprehension as tutors increased their accuracy in using PALS. This study demonstrated once again that preservice teachers can learn to use an empirically supported intervention, in this case a more complex reading program, with a high degree of fidelity and for a more extended time period. The study also showed that reciprocal peer coaching was an effective way to improve implementation accuracy that, in turn, increased students’ reading comprehension. Preservice teachers learned that it was important to use PALS as intended and to monitor pupil progress to determine their impact on pupil learning. As teacher educators, we learned that it was important to train preservice teachers directly in both coaching and intervention components and to support them while using the intervention. For educational researchers, the study provided an example of how the concurrent evaluation of teacher practice and pupil outcomes facilitated an analysis of covariations in data patterns among coaching behaviors, PALS fidelity, and pupil comprehension.
Once more, a required education course and field experience was used as a vehicle to teach preservice teachers to use an empirically supported intervention, collect pupil progress data, and make better instructional decisions. The 2:1 teaching ratio in the after-school program was also an effective way to provide direct instructional assistance to students and prepare preservice educators to collect and use fidelity and progress monitoring data. This P-12 school–university partnership (i.e., Pair Tutoring Program) has been sustained for over 20 years.
Preparing Student Teachers to Use Class Wide Peer Tutoring
The third study (Maheady, Harper, Mallette, & Karnes, 2004) involved 10 preservice teachers who volunteered to use Class Wide Peer Tutoring (CWPT) during their final student teaching experience. The Juniper Gardens
Children’s Project CWPT program is supported by over 30 empirical studies (Buzhardt, Greenwood, Abbott, & Tapia, 2007) and was identified as a “proven” practice by the Promising Practices Network in 2003. CWPT is available in manual form under the commercial title of Together We Can (Greenwood, Delquadri, & Carta, 1997). Preservice teachers learned to use CWPT with a high degree of accuracy (i.e., 80% fidelity for three consecutive sessions without assistance) following a two-hour training program that included on-campus (i.e., tutoring manuals, role-play, and performance feedback) and in-class assistance (i.e., modeling, performance feedback, and coaching). When they used CWPT during student teaching, pupils’ weekly spelling test scores improved from an average of 69% (range = 52–89%) to 94% (range = 85–99%) – the equivalent of a large normalized gain score (.69) (Hake, 1998). In fact, when student teachers used CWPT, there were only 8 failing grades out of 1,028 spelling tests. Additional analyses showed that preservice teachers continued to use CWPT with a high degree of fidelity (M = 88%; range = 82–96%) over the course of their student teaching experience. Three preservice teachers, however, adapted CWPT in response to classroom teacher requests (i.e., omitted daily and bonus points, no public posting of pupil scores, and no contingent rewards). These procedural changes produced smaller achievement gains and lower pupil satisfaction compared to classes where the essential components of CWPT were retained. This study was noteworthy in a number of ways. First, it demonstrated again that preservice teachers can use an empirically supported intervention, collect evidence to document their impact on pupil performance, and use these data to make better instructional decisions. Here, they used CWPT with entire classes and maintained implementation over the whole student teaching experience.
Preservice and in-service teachers, as well as pupils, provided positive evaluations of CWPT and parents commented anecdotally that their children were doing much better in spelling. Preparing preservice teachers to use empirically supported interventions like CWPT may increase early career successes, prompt them to seek out other practices with empirical support, and/or convince them to use CWPT in other settings with different students. The project also provided an effective and efficient process for teaching preservice teachers to use an empirically supported intervention during a relatively common preparation experience (i.e., student teaching). The study highlighted as well the potential pitfalls of making procedural adaptations. As noted, three student teachers eliminated or changed essential CWPT components at their cooperating teachers’ requests, yet when they did, pupil learning gains and satisfaction decreased noticeably compared to classes where the components remained intact.
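Normalized gain scores are used as a pupil outcome in both the first and third projects. As a minimal sketch, assuming the standard definition associated with Hake (1998), in which the actual gain is divided by the maximum possible gain, the calculation looks like this. The function name is illustrative only, and the chapter does not say whether its reported values were computed from class averages or averaged across individual pupils or classes, so a value computed from the overall averages need not match the reported figure.

```python
def normalized_gain(pre_percent, post_percent):
    """Normalized gain: actual gain divided by the maximum possible gain.

    Both scores are percentages on a 0-100 scale. The pre-test score must be
    below 100, otherwise no gain is possible and the ratio is undefined.
    """
    if not 0 <= pre_percent < 100:
        raise ValueError("pre_percent must be in the range [0, 100)")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Illustration using the overall spelling averages quoted in the text
# (69% before CWPT, 94% during CWPT). This is a gain computed from the
# averages, which is an assumption about the level of aggregation.
gain_from_averages = normalized_gain(69, 94)
print(round(gain_from_averages, 2))
```

Computing the gain separately for each pupil or class and then averaging would generally give a different number than applying the formula once to the overall means, which is one reason to report the level of aggregation alongside the score.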
Using Group Contingencies with Randomized Components to Reduce Classroom Disruptions
The fourth study was completed as a capstone project in a master’s degree program in curriculum and instruction (Lewis, Maheady, & Jabot, 2012). The graduate program has a required, 9-hour research sequence that helps teachers to understand, design, and implement applied educational research. During the second course, teachers design single-case research studies using guidelines provided by Horner et al. (2005). They then carry out the projects in their own or other teachers’ classrooms during a third research course. All capstone projects include (a) identification of educationally important problems; (b) illustrative literature reviews; (c) operational definitions of target behaviors; (d) direct, frequent, and reliable measurement of target behaviors; (e) use of empirically supported interventions and direct measurement of fidelity; (f) rigorous single-case research designs (e.g., A-B-A-B, multiple baseline, alternating treatments); and (g) social validity assessments. Lewis, a general education teacher with two years of substituting experience, completed her capstone project by working collaboratively with an experienced third-grade teacher who requested assistance with classroom disruption. Baseline observations revealed high rates of noncompliance, out-of-seat, and inappropriate touching (i.e., M = 2.9 disruptions per minute) among 22 students, 4 of whom had IEPs, during literacy instruction in an inclusive classroom. We established a criterion of reducing disruptive behavior by at least 50% using group contingencies with randomized target behaviors, students, and rewards. Contingency components were randomized using three opaque jars in partial replication of a previous investigation (Kelshaw-Levering, Sterling-Turner, Henry, & Skinner, 2000).
The first jar contained paper slips with the names of target behaviors (i.e., noncompliance, out-of-area, and inappropriate touching) written on them, as well as one paper slip with the words “all behaviors” written on it. The teacher selected a paper slip at the beginning of each literacy lesson to determine which target behavior would be monitored that day. Students were aware of the target behaviors but did not know which one(s) was being monitored daily. The teacher then selected a paper slip from the second jar containing (a) one “whole class” slip; (b) five “small group” slips (row 1, 2, 3, 4, or 5); and (c) 22 paper slips with each pupil’s name on them. This paper slip determined whose behavior was monitored during literacy class. Again, the class knew that someone’s behavior (i.e., entire class, one row, or individual student) was being monitored, but
they were unaware of whose behavior it was. The classroom teacher then monitored pupil performance using a simple tally sheet while an independent observer recorded data on all pupils and disruptive behaviors. At the end of class, the teacher and observer compared frequencies; if totals matched and preestablished criteria were met, then students were allowed to select a reward for the entire class from the third opaque jar. Reward ideas were solicited from pupils prior to implementing the intervention and included items such as stickers, art supplies, free time, and drop work coupons. If either condition was not met, then students were encouraged to try harder the next day. The effects of the three jars intervention on pupils’ disruptive behavior are depicted in Fig. 1. As shown, three jars produced immediate and noticeable decreases (i.e., over 80%) across all three disruptive behaviors for the entire class. Disruptive behaviors decreased from an initial baseline average of 2.9 per minute to 0.3 per minute when three jars were used. These effects were replicated across subsequent experimental phases and functional relationships were found for three target students, two of whom had IEPs. The multiple behavior changes across multiple students were noteworthy because the teacher was only monitoring one or a few students on most days.
[Fig. 1. Frequency of Disruptive Behaviors in an Inclusive Third-Grade Class. Line graph of the frequency of target behaviors (y-axis, scale to 100) across 20 sessions (x-axis) in four phases: Baseline, “Three Jars,” Baseline, “Three Jars.” Data series: Non-Compliance, Hands & Feet to Self, Out of Area.]
Collectively, these studies showed that preservice teachers, at varied points in their preparation programs, learned to use different empirically supported interventions with high degrees of fidelity across diverse instructional settings. More importantly, their use of these interventions produced immediate and consistent improvements in pupil performance ranging from higher spelling test grades and improved reading comprehension to significant reductions in disruptive classroom behavior – outcomes that are valued by teachers, parents, school leaders, policy-makers, and accreditation agencies alike. The studies also provided a number of different formats for teaching preservice teachers to use these interventions. On-campus workshops with in-class modeling and feedback, reciprocal peer coaching, and collaborative research projects were effective and efficient vehicles for preparing teachers to use empirically supported interventions. Collaborative research projects, in particular, have been unusually productive venues for addressing important P-12 problems, replicating intervention effects, and disseminating empirically supported interventions in local schools. All teacher preparation activities were carried out in required education courses and field experiences that were created in partnership with P-12 schools. Finally, the projects reflected a line of inquiry that (a) uses measurable change in pupil learning as the gold standard for teaching effectiveness, (b) highlights the importance of implementation fidelity, and (c) addresses the need to bring rigorous research methods to common practice settings.
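The daily randomization in the three jars procedure from the fourth study can be illustrated with a short sketch. The jar contents follow the description in the text; the pupil labels, the function names, and the way the “preestablished criteria” are operationalized (here, the 50% reduction goal applied to the day's rate) are assumptions made for illustration, not details reported by the authors.

```python
import random

# Jar 1: target behaviors named in the text, plus one "all behaviors" slip.
BEHAVIOR_JAR = ["noncompliance", "out-of-area", "inappropriate touching",
                "all behaviors"]

# Jar 2: one whole-class slip, five row slips, and one slip per pupil.
# Pupil labels are hypothetical placeholders for the 22 names.
PUPILS = [f"Pupil {i}" for i in range(1, 23)]
TARGET_JAR = ["whole class"] + [f"row {r}" for r in range(1, 6)] + PUPILS

# Jar 3: reward examples listed in the text.
REWARD_JAR = ["stickers", "art supplies", "free time", "drop work coupon"]


def draw_daily_targets(rng):
    """Draw the behavior and the person or group monitored for one lesson."""
    return rng.choice(BEHAVIOR_JAR), rng.choice(TARGET_JAR)


def percent_reduction(baseline_rate, observed_rate):
    """Proportional reduction in disruptions relative to baseline."""
    return (baseline_rate - observed_rate) / baseline_rate


def end_of_lesson_reward(teacher_tally, observer_tally,
                         baseline_rate, observed_rate, rng, goal=0.50):
    """Return a reward drawn from jar 3, or None if either condition fails.

    Conditions (the second is an assumed operationalization):
      1. teacher and observer totals match;
      2. the day's rate is at least `goal` below the baseline rate.
    """
    totals_match = teacher_tally == observer_tally
    criterion_met = percent_reduction(baseline_rate, observed_rate) >= goal
    if totals_match and criterion_met:
        return rng.choice(REWARD_JAR)
    return None


rng = random.Random(2012)  # seeded only so the illustration is repeatable
behavior, target = draw_daily_targets(rng)
reward = end_of_lesson_reward(6, 6, baseline_rate=2.9, observed_rate=0.3,
                              rng=rng)
print(behavior, "|", target, "|", reward)
```

Because pupils know the possible behaviors and targets but not the slips drawn that day, the contingency applies in effect to every pupil and every behavior, which is consistent with the class-wide reductions reported even though the teacher monitored only some pupils and behaviors.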
ADVANCING EVIDENCE-BASED PRACTICE IN TEACHER EDUCATION
It is probably an understatement to say that EBP has not been at the forefront of educational thought or practice in teacher education, particularly among general educators. While educational history has not been overly responsive to the basic tenets of EBP or the use of empirically supported interventions, there are at least four reasons for optimism: (a) federal legislation mandating empirically supported interventions, (b) policy-driven accountability systems linking teacher evaluation and
pupil learning, (c) a movement to clinically rich teacher preparation, and (d) a practice-based evidence approach for research and practice.
Federal Policy Mandating Use of Empirically Supported Interventions
It has become increasingly clear that the US Congress, the Department of Education (e.g., Office of Special Education Programs, Institute of Education Sciences), and major teacher accreditation agencies (e.g., NCATE) have placed EBP at the heart of their educational reform efforts. They believe that student outcomes are more likely to improve if teachers use practices shown empirically to enhance pupil performance. Requirements to use empirically supported interventions have appeared, for example, in important federal legislation (e.g., Individuals with Disabilities Education Improvement Act, 2004; No Child Left Behind, 2001), professional ethics codes (e.g., American Psychological Association Task Force on Evidence-Based Practice for Children and Adolescents, 2008; Behavior Analyst Certification Board, 2004 [Standard 209 a and b]; Council for Exceptional Children, 2008; National Association of School Psychology, 2000 [Standard III F 4, IV 4]), and major teacher accreditation reports (Cibulka, 2009; NCATE, 2010). As Spencer et al. (2012) noted, basing educational practice on scientific evidence is no longer just a good idea; it is the law. The question remains, however – are federal and state mandates sufficient to increase practitioner use of empirically supported interventions in our public schools? The professional literature suggests otherwise. Translating research to practice has been a long-standing problem across education and other helping professions. For more thorough discussion, see Barkham and Margison (2007), Cook (2011), Cook and Cook (2010), and the May 2012 issue of Education and Treatment of Children.
Policy-Driven Accountability Systems Linking Teacher Evaluation and Pupil Learning
A second factor that should promote EBP among teacher educators is the emergence of state-level accountability systems that link teacher evaluation to student learning.
New York State, for example, mandates that 40% of teacher evaluations in their Annual Professional Performance Reviews (APPR) be linked to growth in pupil learning. Growth is defined as positive change in student achievement between at least two points in time as
determined by school districts (Education Law 3012-c, 2012). While half of this criterion can be addressed via the state’s high-stakes assessments, the other half can be derived from district-adopted achievement measures. Formal progress monitoring systems (e.g., Dynamic Indicators of Basic Early Literacy Skills [DIBELS], Kaminski & Good, 1996; Achievement Improvement Monitoring System [AIMSweb], Howe & Shinn, 2002) and curriculum-specific, formative assessments offer more useful measures than standard, pre-post-test models because they provide more direct and frequent assessments as well as formal rules to improve teacher decision making. APPR also requires that student growth measures consider the unique learning abilities of individual pupils and provide timely and corrective feedback to teachers. Again, pupil progress monitoring systems that are becoming more widely used in schools are sensitive to individual pupil differences and provide teachers with immediate and ongoing feedback about pupil performance. Given the need to maximize pupil progress in new accountability systems, one would anticipate increased demand for empirically supported interventions among practitioners and a concurrent need to prepare preservice teachers in similar practices.
Clinically Rich Teacher Reform Movement
The National Council for Accreditation of Teacher Education (2010) Blue Ribbon Panel on educational reform argued that teacher preparation in the United States must be “turned upside down” (p. ii).
The panel outlined 10 key principles for designing more effective teacher preparation programs that are built around clinically rich teaching experiences:
• make P-12 student learning the focal point for designing, implementing, and evaluating clinically based programs
• integrate clinical experiences throughout the program in a dynamic way
• use data to monitor teacher candidate and student progress in clinical experiences
• prepare teachers who are experts in content and how to teach it effectively
• provide ample opportunities for feedback on candidate and pupil performance in a collaborative learning culture
• select and prepare clinical instructors rigorously to ensure that they model effective practices
• develop intensive clinically based experiences that are structured, staffed, and financed to support preservice teacher learning and student achievement
• use state-of-the-art technologies to promote productivity, efficiency, and collaboration
• develop a rigorous research and development agenda and use data to support continuous program improvement
• create and/or strengthen strategic partnerships with P-12 schools
Implementing clinically rich experiences will require sweeping changes in how teacher educators design, implement, monitor, evaluate, staff, and fund their preparation programs (NCATE, 2010). Significant among these many challenges are (a) the implementation of more rigorous accountability procedures linking teacher education to teaching practice and the improvement of student learning, (b) the creation or expansion of P-12 school–university partnerships devoted primarily to enhanced pupil outcomes, and (c) the development of a rigorous knowledge base on what makes clinical preparation effective. The emergence of the clinically rich reform agenda provides a golden opportunity for teacher educators and educational researchers to refocus their attention on improving teaching practice and pupil learning. Clinically rich experiences provide authentic vehicles for studying the effects of empirically supported interventions on important pupil outcomes while simultaneously meeting P-12 student needs.
Emergence of a Practice-Based Evidence Approach
Practice-based evidence can be thought of as the second phase in a two-phase process to bridge the research-to-practice gap in education. During the first phase, researchers demonstrate that particular interventions are unusually effective (i.e., educationally and statistically significant) in improving pupil outcomes under “highly controlled” experimental conditions. The emphasis here is placed on experimental rigor. Study participants are selected and assigned carefully, practices are often scripted and implemented by specially trained personnel, and outcomes are defined operationally and measured with integrity. While such rigorous control is critical for the identification of empirically supported interventions, it may actually impede generalization to more naturalistic and less-controlled settings (e.g., classrooms) (Detrich, Keyworth, & States, 2008; Kazdin, 2008). As such, a second research phase is necessary. Here, empirically supported interventions are moved from laboratory-type settings to public schools to determine if practitioners can obtain similar outcomes with their
pupils. This is referred to as effectiveness research and it evaluates the robustness of empirically supported interventions when “taken to scale” and implemented in typical educational settings (Detrich et al., 2008). The emphasis at this stage is on the relevance of empirically supported interventions and typical research questions include (a) for whom does the intervention work? (b) in what types of settings can it be used effectively? and (c) what minimal conditions must exist for practices to be implemented accurately and sustained by practitioners? The use of empirically supported interventions in more natural settings must be accompanied by the collection of progress monitoring data to determine if selected practices are, in fact, improving pupil outcomes. Just because a practice was effective under laboratory conditions does not ensure that it will work under noticeably different circumstances (Kazdin, 2008). Practice-based evidence, therefore, refers to the collection and analysis of classroom-based data to determine if there is a relationship between the instructional practices teachers use and pupils’ academic, behavioral, or social development (Fink-Chorzempa, Maheady, & Salend, 2012). Clear demonstrations that teachers can use empirically supported interventions effectively and efficiently in natural settings should improve student learning, strengthen the external validity of selected practices, and accelerate the generation of evidence developed and refined in practice settings (Greenwood & Maheady, 1997; Walker & Bruns, 2006). Practice-based evidence might also serve as a complement to evidence-based practice in ways that produce mutual benefits for theory, research, and practice (e.g., Barkham & Margison, 2007; Cook, 2011). If practitioners find, for example, that certain interventions improve pupil performance, then they might be more inclined to look for other EBP that can do the same for them and their students.
INFUSING EVIDENCE-BASED PRACTICE INTO TEACHING, RESEARCH, AND SERVICE FUNCTIONS
Evidence-based practice can be infused into teacher educators’ traditional teaching, research, and service functions in a number of ways. Regarding teaching responsibilities, teacher educators should increase preservice teachers’ understanding and appreciation of EBP by assigning relevant readings and activities that highlight the legal, ethical, and empirical underpinnings of the EBP movement. Table 2 provides a list of what preservice teachers should know about EBP. Teacher educators might also create or replicate clinical experiences that allow preservice teachers to apply directly what they are learning about EBP in university coursework to children in real-life settings. Within these clinical experiences, they can require preservice teachers to select empirically supported interventions that meet important pupil needs, implement them with fidelity, assess their impact on pupil learning, and make sound instructional decisions. Teacher educators might also examine the effects of specific preparation activities (e.g., in-class assistance with modeling, feedback, and reciprocal peer coaching) on preservice teacher practice. Finally, teacher educators should provide preservice teachers with a comprehensive framework for making sound instructional decisions about the effects of any intervention, irrespective of the level of empirical support. The complementary models of evidence-based practice and practice-based evidence might be particularly useful in this regard (Barkham & Margison, 2007; Cook, 2011).
Table 2. What Teachers Should Know About Evidence-Based Practice.
1. Education should be grounded in science and research on teaching and learning. Some teaching practices are supported by evidence in student learning, most are not, and others may actually have adverse effects on pupil learning.
2. Evidence-based practice can have two meanings: a process that uses best available research evidence in conjunction with professional wisdom to make sound educational decisions about children, and a list of interventions that meet independent validation standards.
3. Using empirically supported interventions is more than a good idea, it is the law!
4. They can design, implement, and evaluate the effects of any educational intervention and collect practice-based evidence on its impact on pupil learning.
5. Different forms of research communication exist (e.g., empirically supported treatment reviews, narrative literature reviews, meta-analyses, and best practice guides) and each has its own strengths and limitations.
6. Researchers and teachers can and should collaborate around evidence of student learning and the improvement of teaching practice.
Teacher educators’ research efforts might also be aligned better with EBP. First, teacher educator research should be applied in nature and focused directly on problem prevention and solution. Mixed research methods can be used, but if the intent is to change teacher practice and pupil learning then quantitative, experimental methods would be preferred (e.g., randomized controlled trials, single-case research designs, and data-based case studies). Second, researchers should use district-adopted assessments (e.g., curriculum-based assessment measures, content-specific pre- and post-tests) whenever possible to document pupil learning gains for both research and teacher evaluation purposes. When research and teaching goals, methods, and
LARRY MAHEADY ET AL.
outcomes are aligned and interdependent, there should be greater “buy in” from constituents and a higher probability of success for everyone involved. Third, teacher educators should use “enlightened” professional development strategies (e.g., peer coaching, web-based video and visual access, communities of practice) (Buysse & Wesley, 2006; Joyce & Showers, 2002; Odom, 2009) as independent variables and fidelity of implementation measures as dependent variables that are then linked to important pupil outcomes. Finally, teacher educators should establish a practice-based evidence research agenda that replicates the effects of empirically supported interventions in applied settings, generates an evidence base for those practices without empirical support, and identifies naturally occurring effective teaching arrangements (e.g., Greenwood, Carta, Arreaga-Mayer, & Rager, 1991; Kamps, Leonard, Dugan, Boland, & Greenwood, 1991).
Teacher educators can also infuse EBP into their service functions to campus, community, and profession. For brevity, only relationships with P-12 schools are discussed here. First, teacher educators should become more visible in P-12 schools by establishing long-term partnerships aimed primarily at improving pupil outcomes. Contemporary schools are driven by a primary concern to maximize student achievement. For teacher educators to succeed, they must show that they can positively influence student learning. By engaging preservice teachers in direct instructional activities and practice-based evidence studies, teacher educators can provide assistance to teachers and pupils and valuable learning opportunities for future teachers. Second, teacher educators should focus their service efforts primarily on meeting consumer (i.e., teacher and pupil) needs.
Assistance might include participation in enlightened professional development activities (e.g., reciprocal coaching, web-based and video training, learning communities), providing in-class assistance and feedback, helping with data collection and analysis, identifying empirically supported interventions for common school and classroom challenges, and ultimately creating an evidence-based culture in our schools (Detrich et al., 2008). Given recent policy implementation linking pupil learning growth to teacher evaluation, we have seen an increased demand for collaboration and instructional assistance. The two most requested topics are (a) how to make sound instructional decisions based on pupil data and (b) how to select, implement, and evaluate interventions that are likely to improve pupil performance. As such, there may not be a more opportune time to infuse EBP into teacher education and P-12 schools. In conclusion, while the state of the art regarding EBP in teacher education is not pretty, there is ample room for improvement. This will
EBP in Teacher Preparation
require more awareness of the utility of EBP among teacher educators and preservice teachers, greater competence in its application, and an increased desire to improve teaching practice and pupil learning. Given the increasing role of science in education, the rise of federal policies mandating EBP, state regulations linking pupil learning and teacher evaluation, and the emergence of a practice-based evidence approach, there may be no better time for such revolutionary changes to occur.
REFERENCES
American Educational Research Association (AERA). (2004). Teachers matter: Evidence from value-added assessments. Research Points, 2(2). Retrieved from http://www.aera.net/Portals/38/docs/Publications/Teachers%20Matter.pdf
American Psychological Association Task Force on Evidence-Based Practice for Children and Adolescents. (2008). Disseminating evidence-based practice for children and adolescents: A systems approach to enhancing care. Washington, DC: American Psychological Association.
Barkham, M., & Margison, F. (2007). Practice-based evidence as a complement to evidence-based practice: From dichotomy to chiasmus. In C. Freeman & M. Power (Eds.), Handbook of evidence-based psychotherapies: A guide for research and practice (pp. 443–476). West Sussex, England: Wiley.
Begeny, J. C., & Martens, B. K. (2006). Assessing pre-service teachers’ training in empirically-validated behavioral instruction practices. School Psychology Quarterly, 21, 262–285.
Behavior Analyst Certification Board. (2004). BACB guidelines for responsible conduct of behavior analysts. Retrieved from http://www.bacb.com/consum_frame.htlm
Burns, M. K., & Ysseldyke, J. E. (2009). Reported prevalence of evidence-based instructional practices in special education. Journal of Special Education, 43, 3–11.
Buysse, V., & Wesley, P. W. (2006). Evidence-based practice: How did it emerge and what does it mean for the early childhood field? In V. Buysse & P. W. Wesley (Eds.), Evidence-based practice in the early childhood field (pp. 1–34). Washington, DC: Zero to Three.
Buzhardt, J., Greenwood, C. R., Abbott, M., & Tapia, Y. (2007). Scaling up class-wide peer tutoring: Investigating barriers to wide-scale implementation from a distance. Learning Disabilities: A Contemporary Journal, 5, 75–96.
Cibulka, J. G. (2009). Meeting urgent national needs in P-12 Education: Improving relevance, evidence, and performance in teacher preparation. Washington, DC: National Council for Accreditation of Teacher Education.
Cochran-Smith, M., & Zeichner, K. M. (Eds.). (2005). Studying teacher education: The report of the AERA panel on research and teacher education. Mahwah, NJ: Lawrence Erlbaum.
Cook, B. G. (2011). Evidence-based practices and practice-based evidence: A union of insufficiencies. Focus on Research, 24(4), 1–2.
Cook, B. G., & Cook, S. C. (2010). Evidence-based practices, research-based practices, and best and recommended practices: Some thoughts on terminology. Savage Controversies, 4(1), 2–4.
Cook, B. G., Landrum, T., Tankersley, M., & Kauffman, J. M. (2003). Bringing research to bear on practice: Effecting evidence-based instruction with students with emotional or behavioral disorders. Education and Treatment of Children, 26, 345–361.
Cook, B. G., Tankersley, M., & Landrum, T. (2009). Determining evidence-based practices in special education. Exceptional Children, 75, 365–383.
Council for Exceptional Children. (2008). Classifying the state of evidence for special education: Professional practices. CEC Practice Study Manual. Reston, VA: Council for Exceptional Children.
Detrich, R. (2008). Evidence-based, empirically-supported, or best practice: A guide for the scientist-practitioner. In J. K. Luiselli, D. C. Russo, W. P. Christian & S. M. Wilczynski (Eds.), Effective practices for children with autism: Educational and behavioral support interventions that work (pp. 3–25). New York, NY: Oxford.
Detrich, R., Keyworth, R., & States, J. (2008). A roadmap to evidence-based education: Building an evidence-based culture. In R. Detrich, R. Keyworth & J. States (Eds.), Advances in evidence-based education (Vol. 1): A roadmap to evidence-based education (pp. 3–19). Oakland, CA: The Wing Institute.
Fink-Chorzempa, B., Maheady, L., & Salend, S. J. (2012, April). A practice-based evidence model: Assessing what works for teachers and students. Presentation at the Annual Meeting of the Council for Exceptional Children, Denver, Colorado.
Forness, S. R. (2001). Special education and related services: What have we learned from meta-analysis? Exceptionality, 9, 185–197.
Fuchs, D., Fuchs, L. S., Mathes, P. G., & Simmons, D. C. (1997). Peer-assisted learning strategies: Making classrooms more responsive to diversity. American Educational Research Journal, 34, 174–206.
Gable, R. A., Tonelson, S. W., Sheth, M., Wilson, C., & Park, K. L. (2012). Importance, usage, and preparedness to implement evidence-based practices for students with emotional disabilities: A comparison of knowledge and skills of special education and general education teachers. Education and Treatment of Children, 35, 499–519.
Goe, L., & Coggshall, J. (2007). The teacher preparation – teacher practices – student outcomes relationship in special education: Missing links and necessary connections. NCCTQ Research and Policy Brief. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from www.ncctq.org
Greenwood, C. R., & Abbott, M. (2001). The research-to-practice gap in special education. Teacher Education and Special Education, 24, 276–289. doi: 10.1177/088840640102400403
Greenwood, C. R., Carta, J. J., Arreaga-Mayer, C., & Rager, A. (1991). The behavior analyst consulting model: Identifying and validating naturally effective instructional models. Journal of Behavioral Education, 1, 165–191.
Greenwood, C. R., Delquadri, J. C., & Carta, J. J. (1997). Together we can! Class Wide Peer Tutoring for basic academic skills. Longmont, CO: Sopris West.
Greenwood, C. R., & Maheady, L. (1997). Measurable change in student performance: Forgotten standard in teacher preparation? Teacher Education and Special Education, 20, 265–275.
Hake, R. R. (1998). Interactive engagement versus traditional measures: A six thousand student survey of mechanics data for introductory physics courses. American Journal of Physics, 66, 64–74.
Hall, G. E., & Hord, S. M. (1987). Change in schools: Facilitating the process. Albany, NY: State University of New York Press.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. New York, NY: Taylor & Francis.
Holdheide, L. R., & Reschly, D. J. (2008, June). Teacher preparation to deliver inclusive services to students with disabilities! Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from www.tqsource.org
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-case research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Howe, K., & Shinn, M. M. (2002). Standard reading assessment passages (RAPs) for use in general outcome measurement: A manual describing development and technical features. Retrieved from http://www.aimsweb.com/uploads/pdfs/passagestech-nicalmanual.pdf
Individuals with Disabilities Education Improvement Act of 2004. Public Law 108-446 (118 STAT. 2647).
Jones, M. L. (2009). A study of novice special educators’ views of evidence-based practices. Teacher Education and Special Education, 32, 121–136.
Joyce, B., & Showers, B. (2002). Student achievement through staff development (3rd ed.). Alexandria, VA: Association for Supervision and Curriculum Development.
Kaminski, R. A., & Good, R. H. (1996). Toward a technology for assessing basic early literacy skills. School Psychology Review, 25, 215–227.
Kamps, D. M., Leonard, B. R., Dugan, E. P., Boland, B., & Greenwood, C. R. (1991). The use of eco-behavioral assessment to identify naturally occurring effective procedures in classrooms serving children with autism and other developmental disabilities. Journal of Behavioral Education, 1, 367–397.
Kavale, K. A., & Forness, S. R. (2000). Policy decisions in special education: The role of meta-analysis. In R. Gersten, E. P. Schiller & S. Vaughn (Eds.), Contemporary special education research: Synthesis of the knowledge base on critical instructional issues (pp. 281–326). Mahwah, NJ: Lawrence Erlbaum.
Kazdin, A. E. (2008). Evidence-based treatments: Challenges and priorities for practice and research. In R. Detrich, R. Keyworth & J. States (Eds.), Advances in evidence-based education (pp. 157–170). Oakland, CA: The Wing Institute.
Kelshaw-Levering, K., Sterling-Turner, H. E., Henry, J. R., & Skinner, C. H. (2000). Randomized interdependent group contingencies: Group reinforcement with a twist. Journal of School Psychology, 37, 523–534.
Kretlow, A. G., & Bartholomew, C. C. (2010). Using coaching to improve the fidelity of evidence-based practices: A review of studies. Teacher Education and Special Education, 33, 279–299.
Lewis, A., Maheady, L., & Jabot, M. (2012). The effects of three jars on the disruptive behavior of a 3rd grade classroom. Unpublished manuscript, Department of Curriculum and Instruction, SUNY Fredonia.
Maheady, L., Harper, G. F., Karnes, M., & Mallette, B. (1999). The Instructional Assistants Program: A potential entry point for behavior analysis in education. Education and Treatment of Children, 22, 447–469.
Maheady, L., Harper, G. F., Mallette, B., & Karnes, M. (1993). The Reflective and Responsive Educator (RARE): A training program to prepare pre-service general education teachers to instruct children and youth with disabilities. Education and Treatment of Children, 16, 474–506.
Maheady, L., Harper, G. F., Mallette, B., & Karnes, M. (2004). Preparing pre-service teachers to implement Class Wide Peer Tutoring. Teacher Education and Special Education, 27, 408–418.
Maheady, L., Jabot, M., Rey, J., & Michelli-Pendl, J. (2007). An early field based experience and its effects on pre-service teachers’ practice and student learning. Teacher Education and Special Education, 30, 24–33.
Mallette, B., Maheady, L., & Harper, G. F. (1999). The effects of reciprocal peer coaching on pre-service general educators’ instruction of students with special learning needs. Teacher Education and Special Education, 22, 201–216.
National Association of School Psychologists. (2000). Professional conduct manual. Prepared by the Professional Standards Revision Committee. Bethesda, MD: NASP Publications. Retrieved from www.naspweb.org
National Council for Accreditation of Teacher Education. (2010, November). Transforming teacher education through clinical practice: A national strategy to prepare effective teachers. Washington, DC: NCATE. Retrieved from www.ncate.org/publications
No Child Left Behind Act of 2001, 20 U. S. C. 6319 (2008).
Nye, B., Konstantopoulos, S., & Hedges, L. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26, 237–257. Retrieved from http://steinhardt.nyu.edu/scmsAdmin/uploads/002/834/127%20-%20Nye%20B%20%20Hedges%20L%20%20V%20%20%20Konstantopoulos%20S%20%20(2004).pdf
Odom, S. L. (2009). The tie that binds: Evidence-based practice, implementation science, and outcomes for children. Topics in Early Childhood Special Education, 29, 53–61.
Oliver, R. M., & Reschly, D. J. (2007, December). Effective classroom management: Teacher preparation and professional development. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from www.tqsource.org
Reschly, D. J., Holdheide, L. R., & Smartt, S. M. (2008, December). Innovation Configuration Tools: Implementing evidence-based practices in teacher preparation. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from www.tqsource.org
Roy, P., & Hord, S. M. (2004). Innovation configurations chart a measured course toward change. Journal of Staff Development, 25, 54–58.
Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement. University of Tennessee Value-Added Research and Assessment Center. Retrieved from http://heartland.org/policy-documents/cumulativeand-residual-effects-teachers-future-student-academic-achievement
Smartt, S. M., & Reschly, D. J. (2007, June). Barriers to the preparation of highly effective reading teachers. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from www.tqsource.org
Spencer, T. D., Detrich, R., & Slocum, T. A. (2012). Evidence-based practice: A framework for making effective decisions. Education and Treatment of Children, 35, 127–151.
States, J., Detrich, R., & Keyworth, R. (2012). Effective teachers make a difference. In R. Detrich, R. Keyworth, & J. States (Eds.), Advances in evidence-based education (Vol. 2): Education at the crossroads: The state of teacher preparation (pp. 1–45). Oakland, CA: The Wing Institute.
Walker, J. S., & Bruns, E. J. (2006). Building on practice-based evidence: Using expert perspectives to define the wraparound process. Psychiatric Services, 57, 1579–1585.
Walsh, K., Glaser, D., & Wilcox, D. D. (2006). What education schools aren’t teaching about reading and what elementary teachers aren’t learning. Washington, DC: National Council on Teacher Quality. Retrieved from http://www/nctq.org/nctq/images/nctq_reading_study/app.pdf
Wilson, S., Floden, R., & Ferrini-Mundy, J. (2002). Teacher preparation research: An insider’s view from the outside. Journal of Teacher Education, 53, 190–204. doi: 10.1177/0022487102053003002
CHAPTER 7
THE PEER-REVIEWED REQUIREMENT OF THE IDEA: AN EXAMINATION OF LAW AND POLICY
Mitchell L. Yell and Michael Rozalski
ABSTRACT
In this chapter we consider the Individuals with Disabilities Education Improvement Act’s (IDEA 2004) provision that requires that students’ special education services in their individualized education programs be based on peer-reviewed research (PRR). We begin by reviewing federal legislation (i.e., Education Sciences Reform Act, 2002; IDEA 2004; No Child Left Behind Act, 2001; Reading Excellence Act, 1998), which influenced the PRR principle and eventually the PRR language in IDEA. Next, we examine the US Department of Education’s interpretation of PRR in IDEA 2004 and review administrative hearings and court cases that have further clarified the PRR requirement. Finally, we make recommendations for teachers and administrators working to meet the PRR requirement when developing intervention plans for students with disabilities.
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 149–172
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026009
MITCHELL L. YELL AND MICHAEL ROZALSKI
The Individuals with Disabilities Education Act (IDEA) has had a profound influence on the education of students with disabilities. In 2004, Congress passed the Individuals with Disabilities Education Improvement Act (IDEA 2004), which reauthorized and amended the IDEA. The amendments included a number of major changes to the IDEA. In this chapter we address one of these changes: the addition of the requirement that students’ individualized education programs (IEPs) be based on peer-reviewed research (PRR), and how this change in the IDEA 2004 relates to the movement to include the results of educational research in public schools. In this chapter we first review the inclusion of educational research in federal education law, including IDEA 2004. Second, we examine the PRR language in the IDEA and interpretations of the PRR requirement in IDEA 2004 by the US Department of Education. Third, we analyze administrative hearings and court cases that have interpreted the PRR requirement. Finally, we address the implications of the legislation and litigation regarding the PRR requirement for administrators and teachers of students with disabilities.
EDUCATIONAL RESEARCH AND FEDERAL POLICY
Congressional efforts to include the results of educational research in federal education law began in the late 1990s. A major reason for these efforts was a series of national reports showing that public school students in the United States were falling behind their counterparts in other countries, especially in the areas of reading, mathematics, and science (Yell, 2012). Additionally, student assessments, especially the National Assessment of Educational Progress (NAEP), showed that the low achievement levels of American students were not improving despite large increases in federal funding to education (Yell, Katsiyannis, & Shriner, 2006). (NAEP, also called the Nation’s Report Card, is available at www.nces.ed.gov/nationsreportcard.) The efforts to remedy these problems by requiring educators to use research-based practices were based on the notion that if public school teachers use procedures, strategies, and programs that have been shown to be effective by the best available scientific evidence, student achievement would improve (Yell, Thomas, & Katsiyannis, 2011). We next review some of these federal legislative efforts.
The Reading Excellence Act of 1998
One of the first efforts at including educational research in public law occurred when the Reading Excellence Act (REA) was signed into law in
PRR, Law, & Policy
1998. The REA provided competitive grants to states to improve students’ reading skills and the teaching of reading by using findings from scientifically based reading research, which was defined in the law as systematic, empirical methods of research, rigorous data analyses, and approval by a panel of independent experts or a peer-reviewed journal. The passage of this law was significant because it represented a unique bipartisan coalition of Senators and Representatives, President Clinton, and officials from the US Department of Education agreeing on the importance of scientific support for reading instruction. Thus, the primary vehicle for ensuring that the teaching of reading would be based on research was to require grantees to select and implement only reading programs that were grounded in the best research on reading available. According to Robert Sweet, a staffer on the US House Committee on Education and the Workforce during hearings on the REA, the law was seen as a “major catalyst in helping to turn back the rising tide of illiteracy and in ensuring that reading instruction is based on scientific research” (Sweet, 1998, p. 1).
No Child Left Behind Act of 2001
The 2001 reauthorization of the Elementary and Secondary Education Act (ESEA), which was named No Child Left Behind (NCLB), was signed into law on January 8, 2002. A major purpose of the law was to require states and school districts to use instruction based on scientifically based research (SBR) to improve student achievement. The law accomplished this by specifying what would count as scientifically based research for purposes of using federal dollars for instructional programs and methods used in schools (Table 1). SBR was defined in NCLB as “research that applies rigorous, systematic, and objective procedures to obtain relevant knowledge” (ESEA, 20 USC § 1208(6)). By requiring evidence from SBR to justify funding for educational programs, NCLB provided a link between practice and research. Table 1 contains the NCLB language regarding SBR.
According to O’Neill (2004), Congress’s purpose in including the references to scientific research in NCLB was to warn schools, school districts, and states that they cannot rely on untested practices without proof of effectiveness because reliance on such practices leads to widespread ineffectiveness and academic failure. Rod Paige, former Secretary of the US
Table 1. Definition of SBR in No Child Left Behind (20 USC § 1208(6)).
The term “scientifically based research”:
(A) Means research that involves the application of rigorous, systematic, and objective procedures to obtain reliable and valid knowledge relevant to education activities and programs; and
(B) Includes research that:
  (i) Employs systematic, empirical methods that draw on observation or experiment;
  (ii) Involves rigorous data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn;
  (iii) Relies on measurements or observational methods that provide reliable and valid data across evaluators and observers, across multiple measurements and observations, and across studies by the same or different investigators;
  (iv) Is evaluated using experimental or quasi-experimental designs in which individuals, entities, programs, or activities are assigned to different conditions and with appropriate controls to evaluate the effects of the condition of interest, with a preference for random-assignment experiments, or other designs to the extent that those designs contain within-condition or across-condition controls;
  (v) Ensures that experimental studies are presented in sufficient detail and clarity to allow for replication, or, at a minimum, offer the opportunity to build systematically on their findings; and
  (vi) Has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review.
Department of Education, noted that the intent of NCLB was to require the application of rigorous standards to educational research and to ensure that research-based instruction is used in classroom settings (US Department of Education, 2003). Furthermore, he asserted that state educational agencies (SEAs) and local educational agencies (LEAs) must pay attention to this research and ensure that teachers use evidence-supported methods in classrooms. Paige further noted that NCLB demanded the use of methods that really work: “no fads, not feel-good fluff, but instruction that is based upon sound scientific research” (Paige, 2002, p. 1).
The Education Sciences Reform Act of 2002
In early 2000, Congress began efforts to reorganize the Office of Educational Research and Improvement (OERI), the research arm of the US Department of Education. One of the goals of the reorganization was to improve federal education research and research dissemination (Eisenhart & Towne, 2003). To help inform this effort, the National Research Council
Table 2. Analysis of Peer-Reviewed Research from the US Department of Education (Federal Register, Vol. 71, No. 156, Monday, August 14, 2006).

Comment: Does the reference to scientifically based academic and behavioral interventions in 34 C.F.R. § 300.226(b) (i.e., regulation on Early Intervening Services in the IDEA) mean that such interventions must be aligned with recommended practices and PRR?
Description: When implementing coordinated early intervening services, an LEA may provide, among other services, professional development for teachers and other personnel to deliver scientifically based academic and behavioral interventions. The use of the term scientifically based in IDEA is intended to be consistent with the term SBR in NCLB. This definition of SBR (in NCLB) is important to the implementation of the IDEA. SBR must be accepted by a peer-reviewed journal or approved by an independent panel of experts through a comparably rigorous, objective, and scientific review. We expect that the professional development activities authorized under early intervening services will be derived from SBR.

Comment: What is the definition of PRR in the IDEA?
Description: PRR generally refers to research that is reviewed by qualified and independent reviewers to ensure that the quality of the information meets the standards of the field before the research is published.

Comment: What does the phrase “to the extent practicable” mean?
Description: This phrase generally means that services and supports should be based on PRR to the extent that it is possible, given the availability of PRR.

Comment: The Department of Education should give clear guidance to SEAs and school personnel on the PRR requirement.
Description: The PRR requirement means that SEAs, school districts, and school personnel must select and use methods and results that research has shown to be effective to the extent that methods based on PRR are available.

Comment: The regulations to the IDEA should be revised to require special education and related services, and supplementary aids and services be based on “evidence-based practices” rather than “peer-reviewed research.”
Description: The IDEA does not refer to “evidence-based practices” or “emerging best practices,” which are generally terms of art that may or may not be based on PRR.

Comment: All IEP Team meetings should include a focused discussion on research-based methods and should provide parents with prior written notice when the IEP Team refuses to provide documentation of research-based methods.
Description: The Department of Education declined to require that all IEP Team meetings include a focused discussion on research-based methods or to require public agencies to provide prior written notice when an IEP Team refuses to provide documentation of research-based methods, as we believe such requirements are unnecessary and would be overly burdensome.

Note: IDEA, Individuals with Disabilities Education Act; IEP, Individual Education Program; LEA, Local education agency; NCLB, No Child Left Behind Act; PRR, Peer-reviewed research; SBR, Scientifically based research; SEA, State education agency.
(NRC) assembled a committee of educational researchers, called the NRC Committee on Scientific Principles for Educational Research, to provide a definition of scientific research. In their report, the committee wrote that SBR was defined by a set of principles that should guide educational research. The principles were as follows (National Research Council, 2002):
Scientific principle #1: Pose significant questions that can be investigated empirically.
Scientific principle #2: Link research to relevant theory.
Scientific principle #3: Use methods that permit direct investigation of the research question.
Scientific principle #4: Provide a coherent and explicit chain of reasoning.
Scientific principle #5: Replicate and generalize across studies.
Scientific principle #6: Disclose research to encourage professional scrutiny and critique.
These principles presented a general form for scientific research in education and seemingly allowed researchers flexibility in choosing research methods as long as their questions could be investigated empirically and allowed the researcher to draw conclusions that were valid, replicable, and generalizable.
The Education Sciences Reform Act (ESRA) (2002) was seen as an important step toward improving the rigor and relevance of educational research (Whitehurst, 2002). The law replaced the OERI with the
Institute of Education Sciences (IES) and contained a definition of SBR in education. In testimony before Congress, Grover (Russ) Whitehurst, the Director of the IES in the US Department of Education, cited the findings of the NRC when he noted that the world of education, unlike many other fields, did not rest on a strong research base (Whitehurst, 2002). He also noted that in education, personal experience and ideology are too often relied upon and that in no other field is the research base so inadequate and little used. The ESRA was seen as a vehicle for transforming education into an evidence-based field because the law would build a scientific culture within the US Department of Education by including scientists in leadership roles in the agency and increasing funding of the department’s research budget.
The primary purpose of IES was to be the research arm in the Department of Education and improve educational research, statistics, and evaluation. The mission of IES was to provide rigorous and relevant evidence on which to base education practice and policy and share this information broadly (IES, 2012). In August of 2002, the IES established the What Works Clearinghouse to be a source for information on scientific research in education and to disseminate credible and reliable information to educators. The purpose of the clearinghouse was to review educational research on programs, practices, and policies and to provide accurate information to the public.
The Individuals with Disabilities Education Improvement Act of 2004
Before the reauthorization of the IDEA in 2004, Congress held hearings on the law. Similarly, President Bush appointed the President’s Commission on Excellence in Special Education to conduct hearings on special education and make recommendations for its reauthorization. Both the Commission and Congress addressed the important issue of using evidence-based instructional procedures in the IEPs of students with disabilities.
The President’s Commission was created in October 2001. On July 1, 2002, the Commission, which was chaired by Governor Terry Branstad of Iowa, submitted its report. The report, titled A New Era: Revitalizing Special Education for Children and Their Families (2001), concluded that although the IDEA had been responsible for many of the achievements and successes of children with disabilities, “much more remains to be done to meet the goal of
MITCHELL L. YELL AND MICHAEL ROZALSKI
ensuring that all children with disabilities achieve their full potential’’ (p. 2). The Commission report included a number of goals for reauthorization, which would ‘‘provide a framework for improving all areas of special education’’ (p. 2). Among the major findings of the Commission was that special educators frequently do not use procedures grounded in SBR, nor are many of the commonly used procedures supported by actual evidence of effectiveness. Following the Commission’s report, Secretary of Education Rod Paige issued guidance on reauthorizing and amending the IDEA; he also called for a requirement that schools adopt research- and evidence-based instructional procedures (US Department of Education, 2003). In 2002 the Senate Committee on Health, Education, Labor, and Pensions held hearings, chaired by Senator Ted Kennedy, in advance of Congressional work on the reauthorization of the IDEA. The first hearing was titled IDEA: What’s good for kids? What works for schools? The second hearing was titled IDEA: Behavioral supports in schools. The third hearing was titled Accountability and IDEA: What happens when the bus doesn’t come anymore? The fourth hearing included testimony provided by many of the members of the President’s Commission. All four hearings addressed the need for special education teachers to be trained in and implement evidence-based strategies when working with students with disabilities. On December 3, 2004, IDEA 2004 was signed into law. The law was passed to reauthorize and amend the IDEA. The IDEA was seen as a tremendously successful law because it ensured access to educational services for millions of children and youth with disabilities. Nevertheless, according to the Congressional authors of IDEA 2004, the success of the IDEA had been impeded by low expectations and an insufficient focus on the application of SBR on proven methods of teaching children and youth with disabilities.
Summary of Educational Research and Federal Policy

Clearly, officials in the federal government have been concerned that education in the United States is not, despite the increasing amount of federal funding of education, producing higher achieving students. A major area of concern was the amount, quality, and dissemination of credible and reliable educational research. Beginning in the late 1990s, Congress and the White House began to require that educators rely on programs and strategies that scientific research shows are effective. The Coalition for
PRR, Law, & Policy
Evidence-Based Policy (2003) summed up the requirement to use educational research in our schools as follows:

The field of K-12 education contains a vast array of educational interventions – such as reading and math curricula, school wide reform programs, after-school programs, and new educational technologies – that claim to be able to improve educational outcomes and, in many cases, to be supported by evidence. This evidence often consists of poorly designed and/or advocacy-driven studies. State and local education officials and educators must sort through a myriad of such claims to decide which interventions merit consideration for their schools and classrooms. Many of these practitioners have seen interventions, introduced with great fanfare as being able to produce dramatic gains, come and go over the years, yielding little in the way of positive and lasting change … (the laws) call on educational practitioners to use ‘‘scientifically-based research’’ to guide their decisions about which interventions to implement. We believe this approach can produce major advances in the effectiveness of American education. (p. iii)
THE INDIVIDUALS WITH DISABILITIES EDUCATION ACT AND THE PEER-REVIEWED RESEARCH REQUIREMENT

The movement to include SBR in federal education law, the results of the Congressional hearings on reauthorization, and the report of the President’s Commission (which all called for the reauthorized IDEA to include research-based procedures in training of special education teachers and in the development and implementation of the IEPs of students with disabilities) had an enormous impact on Congressional reauthorization of the IDEA. To this end, Congress included the PRR requirement in IDEA 2004. According to the text of the law, a student’s IEP must contain: ‘‘A statement of the special education and related services and supplementary aids and services, based on peer-reviewed research to the extent practicable – that will be provided to the child’’ (IDEA, 20 U.S.C. § 1414[d][1][A][i][IV]). Thus, when developing a student’s special education program, the IEP team must base the programming on educational research to the extent that such research is available. Gerl (2012) noted that it is rare for Congress to impose a requirement (i.e., use PRR), but qualify the requirement with a built-in excuse for noncompliance (i.e., to the extent practicable). The qualifying phrase to the extent practicable means that whenever PRR is available it should be used. The excuse for noncompliance is that when PRR does not exist in a specific area
being addressed by the IEP team, it obviously cannot be included in a student’s IEP. The US Department of Education declined to define the term PRR; rather, the Department noted that the phrase was adopted from the following criteria that school districts could use to identify SBR from the Elementary and Secondary Education Act (ESEA): ‘‘research that has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review’’ (Elementary and Secondary Education Act, 20 U.S.C. § 1208[6][B]). In the final regulations to the IDEA, issued on August 14, 2006, the US Department of Education defined PRR in the commentary as generally referring ‘‘to research that is reviewed by qualified and independent reviewers to ensure that the quality of the information meets the standards of the field before the research is published’’ (71 Fed. Reg. 46664). During the peer-review process for a professional journal, when a researcher submits a manuscript for publication, an editor sends the manuscript out for review by experts in the field. The reviewers, who are usually blind to the author of the research article, evaluate the researcher’s manuscript. The reviewers then make a recommendation to the journal’s editor as to whether the manuscript should be published. The reviewers’ recommendations usually include one of the following options: (a) accept the manuscript, (b) accept the manuscript if the author makes suggested revisions, or (c) reject the manuscript. If a manuscript is eventually published by a journal using such a format, the research meets the IDEA’s requirement of being PRR. Even though Congress included PRR to more closely align the IDEA with NCLB and other federal education laws, Zirkel (2008a) asserted that the term PRR is not synonymous with SBR.
Whereas SBR refers to instructional practices proven effective by experimental methods, PRR refers to any research (e.g., experimental, correlational, single-subject research, and mixed methods) that has been approved by experts in either the blind peer-review process in professional journals or by an independent panel of experts (Rozalski, 2010). According to Etscheidt and Curran (2010a), the legislative history of the IDEA amendments of 2004 and the definition and commentary on the PRR requirement reveal that the intent of this section of the law was to ensure that IEP teams’ selection of educational approaches reflects sound practices that have been validated empirically whenever possible. Thus, when the services in a student’s IEP are based on PRR, there is reliable evidence that
the program or service works. IEP teams, therefore, should have strong evidence of the effectiveness of instructional programs and other services before they are included in students’ IEPs (Etscheidt & Curran, 2010a).
THE US DEPARTMENT OF EDUCATION, HEARINGS, COURT CASES, AND THE PEER-REVIEWED RESEARCH REQUIREMENT

The PRR requirement in IDEA 2004 applies to the (a) selection and provision of special education methodology; (b) selection and provision of related services, which are services that are required to assist a student to benefit from special education; and (c) selection and provision of supplementary aids, services, and supports provided in regular education settings. According to the Office of Special Education Programs (OSEP) in the US Department of Education, the PRR requirement also applies to nonacademic areas, such as behavioral interventions (71 Federal Register, 46,683, 2006), professional-development activities (71 Federal Register, 46,627, 2006), and individualized family service plans (IFSPs) under part C of the IDEA (20 U.S.C. § 1436(d)(4)). The US Department of Education issued the regulations for IDEA 2004 and commentary on August 14, 2006. The Department had issued an invitation to comment on the proposed regulations. Almost 6,000 comments were received and an analysis of some of these comments was included with the regulations. These comments can be found in the Federal Register issued on August 14, 2006 (Volume 71, Number 156). Table 3 includes a sample of comments and analysis of the US Department of Education comments on PRR. In Letter to Kane (2010), OSEP noted that PRR generally refers to research that is reviewed by qualified reviewers to ensure that the quality of the information meets the standards of the field before the research is published. Furthermore, OSEP noted that determining whether particular services are peer-reviewed might require that teachers review the literature or other information on the use of evidence-based practices.
However, according to an opinion issued by the US Department of Education, the IDEA’s reference to PRR does not refer to evidence-based practices or emerging best practices, ‘‘which are generally terms of art that may or may not be based on peer-reviewed research’’ (71 Federal Register 46,665, 2006).
Table 3. Websites with Peer-Reviewed Research.

Resource and URL: Best Evidence Encyclopedia, www.bestevidence.org
Analysis: The Best Evidence Encyclopedia is the website of the Center for Data-Driven Reform in Education (CDDRE) of the School of Education at Johns Hopkins University. The Center, which is funded by the Institute of Education Sciences in the US Department of Education, is intended to give educators and researchers information about the evidence base of various educational programs.

Resource and URL: Center on Positive Behavioral Interventions and Supports (PBIS), www.pbis.org
Analysis: The Technical Assistance Center on Positive Behavioral Interventions and Supports is devoted to giving schools information and technical assistance for identifying, adapting, and sustaining effective school-wide disciplinary practices. The website is sponsored by the Office of Special Education Programs in the US Department of Education.

Resource and URL: Doing What Works: Research-Based Education Practices Online, http://dww.ed.gov
Analysis: US Department of Education website that provides information to help educators to use practical tools to improve classroom instruction. The website is sponsored by the US Department of Education.

Resource and URL: IRIS Center, http://iris.peabody.vanderbilt.edu
Analysis: The IRIS Center for training enhancement is a free online resource that translates research on the education of students with disabilities into practice. The website is sponsored by the Office of Special Education Programs in the US Department of Education.

Resource and URL: National Center on Response to Intervention (NCRTI), www.rti4success.org
Analysis: The National Center on Response to Intervention (RTI) mission is to provide technical assistance to states and districts and build the capacity of states to assist districts in implementing proven models for RTI. The website is sponsored by the Office of Special Education Programs in the US Department of Education.

Resource and URL: National Dissemination Center for Children with Disabilities (NICHCY), www.nichcy.org
Analysis: A central source of information on infants, toddlers, children, and youth with disabilities. Includes information on law and peer-reviewed research. The website is sponsored by the Office of Special Education Programs in the US Department of Education.

Resource and URL: National Secondary Transition Technical Assistance Center (NSTTAC), www.nsttac.org
Analysis: The National Secondary Transition Technical Assistance Center (NSTTAC) is dedicated to ensuring full implementation of the IDEA and helping youth with disabilities achieve desired post-school outcomes. The website is sponsored by the Office of Special Education Programs in the US Department of Education.

Resource and URL: What Works Clearinghouse, http://ies.ed.gov/ncee/wwc
Analysis: The What Works Clearinghouse is an initiative of the US Department of Education’s Institute of Education Sciences. It is a central source of scientific evidence for what works in education. The website is sponsored by the US Department of Education.
FREE AND APPROPRIATE PUBLIC EDUCATION AND THE PEER-REVIEWED RESEARCH REQUIREMENT

The primary obligation of the IDEA is that schools provide a free and appropriate public education (FAPE) to all students in special education (Yell et al., 2011; Yell & Crockett, 2011; Zirkel, 2008b). The law defines a FAPE as special education and related services that

(A) are provided at public expense, under public supervision and direction, and without charge, (B) meet standards of the State educational agency, (C) include an appropriate preschool, elementary, or secondary school education in the state involved, and (D) are provided in conformity with the individualized education program. (IDEA, 20 U.S.C. § 1401(a)(18))

Thus, when IEP teams develop and implement a special education program for a student with disabilities, it must be based on a full and individualized assessment of a student, which leads to specially designed instruction that meets the unique needs of the student. An important component of this specially designed instruction is special education and other services based on PRR. When the Education for All Handicapped Children Act was first written, the Congressional authors understood that they could not define a FAPE for each student by detailing the specific substantive educational requirements in the law, so instead they defined a FAPE primarily in terms of the procedures necessary to ensure that parents and school personnel would collaborate to develop an individual student’s program of special education and related services. The legal definition of a FAPE, therefore, was primarily procedural rather than substantive. The lack of a substantive definition of FAPE in the IDEA has led to frequent disagreements between parents and schools regarding what constitutes an appropriate education for a particular student. State and federal courts, therefore, have often been required to define FAPE. In 1982, the US Supreme Court issued a ruling in Board of Education v. Rowley (1982), which was the first special education case heard by the US Supreme Court. In Rowley, the high court interpreted the FAPE mandate of the IDEA. In this case, the Supreme Court developed the so-called Rowley two-part test to be used by courts in determining if a school had provided a FAPE as required by the IDEA. The two-part test was as follows: First, the
court had to determine if a school had complied with the procedures of the IDEA. If the answer was yes, the court would then determine if the school had passed the second part of the test, which was whether the IEP developed through the IDEA’s procedures had been reasonably calculated to enable the child to receive educational benefits (Rowley, pp. 206–207). If these requirements are met, a court will find that a school has complied with FAPE requirements. A number of commentators predicted that the PRR requirement of the IDEA could possibly lead to an elevated FAPE substantive standard (Etscheidt & Curran, 2010a; Yell et al., 2006; Zirkel, 2008b). Although the administrative guidance and court cases on PRR have thus far not led to an elevated FAPE standard (Yell, 2012; Zirkel & Rose, 2009), there have been administrative guidance statements on the matter as well as a few hearings and cases. We next review administrative guidance. Following that, we turn to a discussion of hearings and court cases.
THE US DEPARTMENT OF EDUCATION AND THE PEER-REVIEWED RESEARCH REQUIREMENT

In comments to the 2006 regulations, officials in the US Department of Education clarified the relationship of the PRR requirement to the IDEA requirement that all students in special education receive a FAPE. According to the Department:

services and supports should be based on peer-reviewed research to the extent that it is possible, given the availability of peer-reviewed research … States, school districts, and school personnel must, therefore, select and use methods that research has shown to be effective, to the extent that methods based on peer-reviewed research are available. This does not mean that the service with the greatest body of research is the service necessarily required for a child to receive FAPE. Likewise, there is nothing in the Act to suggest that the failure of a public agency to provide services based on peer-reviewed research would automatically result in a denial of FAPE. The final decision about the special education and related services, and supplementary aids and services that are to be provided to a child must be made by the child’s IEP Team based on the child’s individual needs … if no such research exists, the service may still be provided, if the IEP team determines that such services are appropriate. (Federal Register, Vol. 71, No 156, pp. 46663-46665)
The responsibility to make FAPE available to a student rests with the public school district in which he or she resides and ultimately with the state (Bateman, 2011). The IEP team is the forum by which a student’s FAPE is developed and the IEP document becomes the blueprint of
a FAPE. Since 2004 there has been a procedural requirement that an IEP team consider PRR when developing a student’s special education services, related services, and supplementary aids and services in his or her IEP. However, as Etscheidt and Curran (2010b) noted, ‘‘What has not changed with [IDEA 2004] is the requirement that an IEP, with or without research-based methods, be reasonably calculated to provide educational benefit’’ (p. 147). A few cases and a number of due process hearings have been heard on the PRR requirement of the IDEA and FAPE. It is useful to examine the rulings in these cases because they give us an idea of how courts are interpreting the PRR requirement. We next review three cases involving PRR.
Litigation and the Peer-Reviewed Research Requirement

Three cases have directly addressed the PRR requirement of the IDEA: Waukee Community School District (2007), Rocklin Unified School District (2007), and Ridley School District v. M.R. and J.R. (2012). These cases began at the local hearing level and then moved to the Federal District Court level, and two went all the way to the US Court of Appeals. Waukee Community School District began with a due process hearing officer’s decision in Iowa. The case involved an eight-year-old girl with autism. A school district had used behavioral interventions with the child, which included physical restraint and long periods of time out. The girl’s parents filed for a due process hearing, contending that the use of time out and restraint procedures was inconsistent with their child’s IEP and the IDEA’s requirement that positive behavior supports be used in behavioral programs. The attorney for the school district countered that the behavioral interventions were within the bounds of professional judgment and were supported by PRR. The administrative law judge (ALJ) ruled that the procedures used by the child’s teacher were (a) not implemented in a manner consistent with PRR or appropriate educational practices, (b) not adequately monitored, and (c) not consistent with the positive behavioral supports mandate of the IDEA. The IEP team was ordered to reconvene to develop a new IEP and a new positive behavior support plan and to consult with an outside consultant with expertise in autism or challenging behaviors. The school district appealed to the Federal District Court in Iowa. In the renamed Waukee Community School District & Heartland Area Education Agency (2008), the district court judge held that the preponderance of the evidence supported the ALJ’s finding that the
interventions as implemented were not supported by research and thus violated the FAPE requirement of the IDEA. In Rocklin Unified School District (2007), the parents of a six-year-old boy with autism requested a due process hearing because they believed that the school district, which developed an IEP consisting of a variety of eclectic methodologies, was denying their son a FAPE. The parents argued that PRR supported the use of applied behavior analysis (ABA) and did not support the use of the eclectic methodologies. Both the attorney for the parents and the attorney for the school district had expert witnesses testify at the hearing regarding PRR for students with autism and the parents’ evidence included three studies that demonstrated the ineffectiveness of eclectic methodologies. The ALJ, however, did not find the studies sufficiently convincing and noted that the district had also offered PRR to support the effectiveness of an eclectic approach. The ALJ also gave substantial weight to the testimony of the school district’s expert. With respect to the PRR requirement of the IDEA, the ALJ wrote that

If the component parts of a plan are peer-reviewed, then it follows that the sum of those parts should be considered as peer-reviewed as well, particularly in the light of the moral, legal, and ethical constraints that prevent the truest form of scientific study from being conducted. The ultimate test is not the degree to which a methodology has been peer-reviewed, but rather, whether the methodology chosen was believed by the IEP team to be appropriate to meet the individual needs of the child. (Rocklin, 2008, p. 234)
The parents appealed the decision to a Federal District Court in California, which affirmed the decision of the ALJ that the school district had not violated the FAPE provision of the IDEA (Joshua A. v. Rocklin Unified School District, 2009). With respect to the PRR, the District Court reasoned that It does not appear that Congress intended that the service with the greatest body of research be used in order to provide FAPE. Likewise there is nothing in the Act to suggest that the failure of a public agency to provide services based on (PRR) would automatically result in a denial of FAPE. (Rocklin, p. 234)
In this case, the court concluded that the school district had not violated the IDEA because the eclectic program had provided Joshua A. with a FAPE. The decision was appealed to the US Court of Appeals for the Ninth Circuit (Joshua A. v. Rocklin Unified School District, 2009). The appellate court upheld the decision of the district court, noting that the school district’s eclectic program was (a) based on PRR to the extent practicable and (b) reasonably calculated to provide educational benefit.
Ridley School District v. M.R. and J.R. (2012) involved a young girl, referred to as E.R. by the court, with learning disabilities and health problems. In November 2007, E.R. was evaluated by the Ridley School District and was found to have learning disabilities in reading decoding and comprehension, math computation, reasoning skills, and written language. In March, an IEP meeting was held in which a phonics-based program called Project Read was suggested. The special educator provided a report of Project Read from the Florida Center for Reading Research in which the Center concluded that the research studies of Project Read were promising and the curriculum was aligned with current research in reading. Toward the end of the school year, E.R. began attending a resource room one hour a day, for assistance in reading and math. In a summer meeting of the IEP team it was decided that during the summer the teachers and support staff would receive training in the use of Project Read and provide one hour per day of reading instruction and one hour per day of math instruction in the resource room. The parents reviewed Project Read and determined it was not appropriate for E.R. In the summer between the first and second grade E.R.’s parents removed her from the Grace Park Elementary School in the Ridley School District and placed her in a private school, called Benchmark School, which specialized in educating students with learning disabilities. The parents, who believed that the Ridley School District was offering an education that was not appropriate given E.R.’s educational needs, subsequently filed a complaint against the Ridley School District. The hearing officer found that the proposed IEPs were inadequate because they lacked appropriate specially designed instruction in the form of a research-based, peer-reviewed reading program. Because the Ridley School District had failed to provide E.R. a FAPE, the due process hearing officer awarded E.R.’s parents compensatory education and reimbursement for tuition and transportation. The school district filed an appeal with the Federal District Court. The District Court reversed the ruling of the hearing officer, finding for the school district. The District Court, disagreeing with the hearing officer, found that Project Read was research-based and peer-reviewed. The parents appealed to the US Court of Appeals for the Third Circuit. The appellate court upheld the ruling of the District Court. The appellate court included an important discussion of the IDEA requirement that IEPs be based on PRR. The court stated that it did not need to decide whether the lack of a peer-reviewed curriculum by itself would lead to the denial of a FAPE because the court found that Project
Read was based on PRR. The court made the following two points regarding this requirement: First, although schools should strive to base a student’s specially designed instruction on peer-reviewed research to the maximum extent possible, the student’s IEP team retains flexibility to devise an appropriate program, in light of available research. Second, under the IDEA, courts must accord significant deference to the choices made by school officials as to what constitutes an appropriate program for each student. (p. 33)
The court further stated that The IDEA does not require an IEP to provide the optimal level of services, we likewise hold that the IDEA does not require a school district to choose the program supported by the optimal level of peer-reviewed research. Rather, the peer-reviewed specially designed instructions in an IEP must be ‘‘reasonably calculated to enable the child to receive meaningful educational benefits in light of the child’s intellectual potential.’’ (p. 34)
The appellate court did not set forth a test that lower courts and hearing officers could use in determining what would constitute an adequately peer-reviewed special education program because courts and hearing officers should only ‘‘assess the appropriateness of an IEP on a case-by-case basis, taking into account the available research’’ (p. 37). The court further noted that if a school failed to implement a program based on PRR, even though such research was available, that fact would weigh heavily against the school district. An interesting and significant fact of this decision is that the US Court of Appeals seriously considered the PRR requirement of the IDEA. It did not solely consider the parents’ and school district’s dispute regarding the reading program to be offered; rather, the court considered what the reading research showed and found that it supported Project Read. Perhaps, as Bateman and Linden (2012) asserted, ‘‘the clear language of the statute requires that the services in the IEP must be based on peer-reviewed research’’ (p. 54) and because the research on teaching reading is so clear it is likely that the courts will become more receptive to considering its importance in disputes. A statement in the appellate court’s decision shows that Bateman and Linden’s assertion may be prescient. Toward the end of the decision, the court wrote that

we will not set forth any bright-line rule as to what constitutes an adequately peer-reviewed special education program; hearing officers and reviewing courts must continue to assess the appropriateness of an IEP on a case-by-case basis, taking into account the available research. We recognize that there may be cases in which the specially designed
instruction proposed by a school district is so at odds with current research that it constitutes a denial of a FAPE. (p. 13)
There will be more court interpretations of the PRR requirement of the IDEA in the future; however, these three decisions are certainly important rulings that give us good indications of how future courts may decide such cases. Clearly, the PRR requirement of the IDEA is a very important consideration that IEP teams need to address when developing special education programs for students with disabilities. Nonetheless, these decisions, and the aforementioned guidance from the US Department of Education, indicate that thus far courts have not used the PRR requirement to raise the standard of school districts’ responsibility to provide a FAPE for IDEA eligible students with disabilities. The standard remains the charge to provide an education that confers meaningful educational benefit to a student in special education. As the Ridley court noted, it remains to be seen what will happen when courts are faced with school districts that provide educational programs that are at odds with current research. What can be said with certainty is that when IEP teams rely on research-based strategies, and implement these strategies with fidelity, their special education programs will be much more likely to confer meaningful education benefit, and thus meet the substantive requirements of FAPE.
IMPLICATIONS OF THE PEER-REVIEWED RESEARCH REQUIREMENT FOR TEACHERS OF STUDENTS WITH DISABILITIES Essentially the PRR mandate of the IDEA requires that when developing special education programs for students with disabilities, IEP teams should rely on programs and strategies that have empirical support from research published in peer-reviewed journal. The following are suggestions that teachers of students with disabilities and school district administrators should follow with respect to the PRR requirement of the IDEA: Teachers need to understand and remain current with the research in their areas and use academic and behavioral interventions that have support in the research literature. This means that teachers should not use an intervention because (a) they have always used it, (b) it sounds good or feels right, or (c) a colleague told them about it, but rather because PRR has proven it to be successful in teaching behavioral and academic skills
168
MITCHELL L. YELL AND MICHAEL ROZALSKI
to students with disabilities. This is especially important in areas in which there is established and clear research (e.g., teaching reading). Teachers should be prepared to discuss PRR in IEP meetings and to be able to explain the research behind the educational methodologies that they choose. Because basing services on PRR is a legal requirement, a parent in an IEP meeting may legitimately inquire about the research base for an intervention that is being used, and it is up to the teacher to be able to respond to these inquiries. IEP team members must acknowledge and discuss research that parents propose at IEP meetings. If research does not support a particular procedure advocated by a student’s parents, be prepared to discuss not only the parent’s suggestions and opinions but also the lack of support in the PRR. Teachers should implement interventions with fidelity. This means that teachers should implement interventions in accordance with how they were designed or intended to be implemented. If teachers do not implement instruction the way it was designed, then the most highly researched and effective curricula may not be beneficial to students (Reschly, 2008). In fact, if interventions are not implemented as intended ‘‘it is impossible to determine whether poor student outcomes result from an ineffective intervention or an effective intervention that is poorly implemented’’ (Sanetti & Kratochwill, 2009, p. 24). Teachers should use progress monitoring to determine the effectiveness of interventions. The IDEA requires that student progress toward their goals be monitored on a frequent and systematic basis. Moreover, a student’s progress toward each goal in the IEP must be reported to his or her parents on a regular basis. When teachers have evidence of student progress it is more likely that when challenged in IEP meetings, due process hearings, or even in courts a school district will be able to show that the school’s program has provided educational benefits. 
Teachers should attend professional development activities in which training on new and emerging research is provided. Special education is a highly researched field, and teachers and IEP team members should keep up with research in their area. Unfortunately, special education teachers are often overwhelmed by the amount of research that is available, and often they do not have the time or training to meaningfully evaluate research (Cook & Smith, 2011). This means that school district officials must provide structured and effective professional development activities that keep teachers abreast of new and emerging PRR. Teachers should also be encouraged and supported in becoming members of
PRR, Law, & Policy
professional organizations (e.g., Council for Children with Behavioral Disorders, Council for Exceptional Children). Such organizations are excellent sources of information on PRR through their journals and conferences.
THE FUTURE OF THE IDEA AND PEER-REVIEWED RESEARCH REQUIREMENT
The IDEA has been an extremely successful federal education law. The major purpose of the original Education for All Handicapped Children Act, which was to open the doors of public education to students with disabilities, has been achieved. This success led Congress to begin focusing on improving educational outcomes for students with disabilities in subsequent amendments to the IDEA. For example, the IDEA Amendments of 1997 added the requirement that a student’s annual goals be measurable, and in IDEA 2004 Congress added the PRR requirement. It is critical that teachers of students with disabilities, IEP team members, and school district administrators understand these changes and the duties they impose. According to Cook and Smith (2011), special education has long been plagued by the so-called research-to-practice gap. It is likely that the movement toward quality of educational programming for students with disabilities and increased accountability for educational progress and results will continue in future amendments to the IDEA, thereby closing the gap between what we know from research and what is actually done in the classroom.
The requirement that the IEPs of students with disabilities include special education services, related services, and supplementary aids and services that are based on PRR certainly has the potential to lead to more effective special education programming for students. As long as programs are based on legitimate research and are implemented with fidelity, the likelihood of improving student outcomes will be increased greatly. Thus, when IEP teams are considering a student’s services, team members should be aware of the research base for the procedures and methods they discuss (Huefner & Herr, 2012). Moreover, according to Huefner and Herr (2012), the team members should be able to defend the selection of services.
The onus will be on teacher preparation programs in colleges and universities and on school districts in their professional development
MITCHELL L. YELL AND MICHAEL ROZALSKI
programs and activities to ensure that special education teachers are fluent in the research in their respective fields, become good consumers of PRR, and continue to keep abreast of research developments in special education. It can be fairly said that the original emphasis of the IDEA, which was ensuring access, has evolved into an emphasis on improving academic and behavioral outcomes for students with disabilities and increasing accountability for reaching these results. An example of the emphasis on improving outcomes for students is the PRR requirement of IDEA 2004. Using programs and methods based on PRR with fidelity has the potential to improve academic and behavioral outcomes of students with disabilities. Special educators need to understand and keep up with the PRR and be able to discuss it in an IEP meeting. We believe that the emphasis on improving outcomes for students with disabilities will increase in subsequent reauthorizations.
SUMMARY
Since the 1990s the federal government has used the law as a tool to compel the public schools to use the results of educational research in educational programs. By requiring educators to use research-based procedures and methods that have been shown to be effective by the best scientific evidence, the drafters of these laws believed that students’ achievement would be increased. Two examples of federal laws that required educational research to be used in America’s classrooms were the NCLB and IDEA 2004. NCLB required states and school districts to use SBR in selecting and purchasing instructional materials and as the foundation for educational programs and instruction in classrooms. The IDEA requires that the IEPs of students with disabilities include special education services, related services, and supplementary aids and services that are based on PRR to the extent that such research is available. PRR refers to research that is reviewed by qualified and independent reviewers to ensure that the quality of the information meets the standards of the field before the research is published. We examined three cases in which this requirement was addressed. Although these cases show that the requirement to understand and use programming based on research is important, the primary consideration in programming remains the following: Did the student receive an IEP reasonably calculated to confer educational benefit?
REFERENCES
Bateman, B. D., & Linden, M. A. (2012). Better IEPs: How to develop legally correct and educationally useful programs. Verona, WI: IEP Resources/Attainment.
Board of Education of Hendrick Hudson Sch. Dist. v. Rowley, 458 U.S. 176 (1982).
Coalition for Evidence-Based Policy. (2003). Identifying and implementing educational practices supported by rigorous evidence: A user friendly guide. Retrieved from http://www2.ed.gov/rschstat/research/pubs/rigorousevid/index.html. Accessed on October 24, 2012.
Cook, B. G., & Smith, G. J. (2011). Leadership and instruction: Evidence-based practices in special education. In J. B. Crockett, B. Billingsley & M. L. Boscardin (Eds.), Handbook of leadership and administration for special education (pp. 281–296). New York, NY: Routledge.
Education Sciences Reform Act (ESRA) of 2002, 20 U.S.C. § 9501 et seq.
Elementary and Secondary Education Act, 20 U.S.C. § 6301 et seq.
Eisenhart, M., & Towne, L. (2003). Contestation and change in national policy on “Scientifically Based” education research. Educational Researcher, 32(7), 31–38.
Etscheidt, S., & Curran, C. M. (2010a). Reauthorization of the Individuals with Disabilities Improvement Act (IDEA 2004): The peer-reviewed requirement. Journal of Disability Policy Studies, 21, 29–39.
Etscheidt, S., & Curran, C. M. (2010b). Peer-reviewed research and Individualized Education Programs (IEPs): An examination of intent and impact. Exceptionality, 18, 138–150.
Federal Register, Vol. 71, No 156, pp. 46663–46665.
Gerl, J. (2012, November). What is evidenced based research? Presentation to the Tri-state Regional Special Education Law Conference. Omaha, NE.
Huefner, D. S., & Herr, C. M. (2012). Navigating special education law and policy. Verona, WI: IEP Resources/Attainment.
Individuals with Disabilities Education Act (IDEA) of 2004, 20 U.S.C. § 1401 et seq.
Individuals with Disabilities Education Act (IDEA) Regulations of 2006, 34 C.F.R. § 300.1 et seq.
Institute of Education Sciences (2012, March). Retrieved from www.ies.ed.gov
Joshua A. v. Rocklin Unified School District, 49 IDELR 249 (E.D. CA 2009).
Letter to Kane (2010), Office of Special Education Rehabilitative Services, US Department of Education. Retrieved from http://www2.ed.gov/policy/speced/guid/idea/letters/2010-1/kane021210ifsp1q2010.pdf. Accessed on March 11, 2012.
National Research Council. (2002). Scientific research in education. In R. Shavelson & L. Towne (Eds.), Committee on Scientific Principles for Educational Research. Washington, DC: National Academy Press.
No Child Left Behind, 20 U.S.C. § 6301 et seq.
O’Neill, P. T. (2004). No Child Left Behind compliance manual. New York, NY: Brownstone, Publications.
Paige, R. (2002, November). Statement of Secretary Paige regarding Title I regulations. Retrieved from http://www.ed.gov/news/speeches/2002/1/11262002.html?exp=0. Accessed August 2002.
President’s Commission on Excellence in Special Education. (2001). A new era: Revitalizing special education for children and their families. Washington, DC: ED Pubs, Education Publication Center, US Department of Education.
Reading Excellence Act of 1998, div. A, Sec. 101(f) [Title VIII], Oct. 21, 1998, 112 Stat. 2681–337, 2681–391.
Reschly, D. J. (2008). School psychology paradigm and beyond. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology (Vol. V, pp. 3–15). Bethesda, MD: National Association of School Psychologists.
Ridley School District v. M.R. and J.R. (2012). Ruling from the US Court of Appeals, for the Third Circuit, filed May 17, 2012. Retrieved from www.ca3.uscourts.gov/opinarch/111447p.pdf. Accessed on September 21, 2012.
Rocklin Unified School District, 48 IDELR 234 (SEA CA 2007).
Rozalski, M. E. (2010). Scientifically-based research. In T. C. Hunt, J. C. Carper, T. J. Lasley & C. D. Raisch (Eds.), Encyclopedia of reform and dissent (Vol. 2, pp. 802–803). Thousand Oaks, CA: Sage.
Sanetti, L. M. H., & Kratochwill, T. R. (2009). Treatment integrity assessment in the schools: An evaluation of the Treatment Integrity Planning Protocol. School Psychology Quarterly, 24(1), 24–35.
Sweet, R. W., Jr. (1998). The reading excellence act: A breakthrough for reading teacher training. Strasburg, VA: The National Right to Read Foundation. Retrieved from http://www.nrrf.org/essay_ReadingExcel.html. Accessed on October 24, 2012.
US Department of Education. (2003). Paige releases principles for Reauthorizing the Individuals with Disabilities Education Act. Retrieved from http://www.ed.gov/news/pressreleases/2003/02/02252003.html. Accessed on August 11, 2009.
Waukee Community School District & Heartland Area Education Agency, 51 IDELR 15 (S.D. IA. 2008).
Waukee Community School District, 48 IDELR 26 (SEA IA 2007).
Whitehurst, G. J. (2002). Testimony of Dr. Grover Whitehurst before the House Subcommittee on Education Reform. Retrieved from http://archives.republicans.edlabor.house.gov/archive/hearings/107th/edr/oeri22802/whitehurst.htm. Accessed on November 11, 2012.
Yell, M. L. (2012). Legal issues in special education. Upper Saddle River, NJ: Pearson/Merrill Education.
Yell, M. L., & Crockett, J. (2011). Free appropriate public education (FAPE). In J. M. Kauffman & D. P. Hallahan (Eds.), Handbook of special education (pp. 77–90). Philadelphia, PA: Taylor & Francis/Routledge.
Yell, M. L., Katsiyannis, A., & Shriner, J. G. (2006). No Child Left Behind, adequate yearly progress, and students with disabilities. Teaching Exceptional Children, 38(4), 32–39.
Yell, M. L., Thomas, S. S., & Katsiyannis, A. (2011). Special education law for leaders and administrators of special education. In J. B. Crockett, B. S. Billingsley & M. L. Boscardin (Eds.), Handbook of leadership and administration for special education (pp. 69–96). New York, NY: Routledge.
Zirkel, P. A. (2008a). A legal roadmap of SBR, PRR, and related terms under the IDEA. Focus on Exceptional Children, 40(5), 1–14.
Zirkel, P. A. (2008b). Have the amendments to the Individuals with Disabilities Education Act razed Rowley and raised the substantive standard for “free appropriate public education”? Journal of the National Association of Administrative Law Judiciary, 28, 396–418.
Zirkel, P. A., & Rose, T. (2009). Scientifically based research and peer-reviewed research under the IDEA: The legal definitions, applications, and implications. Journal of Special Education Leadership, 22, 36–49.
CHAPTER 8
FROM RESEARCH TO PRACTICE IN EARLY CHILDHOOD INTERVENTION: A TRANSLATIONAL FRAMEWORK AND APPROACH
Carol M. Trivette and Carl J. Dunst
ABSTRACT
A translational framework and associated processes and activities for bridging the research-to-practice gap in early childhood intervention are described. Translational processes and activities include methods and procedures for identifying evidence-based practices, translating findings from research evidence into early childhood intervention procedures, and promoting practitioners’ and parents’ routine use of the practices. The framework includes four interrelated processes and activities. Type 1 translation uses research findings to develop evidence-based practices. Type 2 translation involves the use of evidence-based professional development (implementation) practices to promote practitioners’ and parents’ use of evidence-based early childhood intervention practices. Type 3 translation includes activities to evaluate whether the use of evidence-based practices as part of routine early intervention has
Evidence-Based Practices Advances in Learning and Behavioral Disabilities, Volume 26, 173–196 Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026010
expected benefits and outcomes. Type 4 translation includes activities for the dissemination, diffusion, and promotion of broad-based adoption and use of evidence-based practices. Examples of each type of translation are described, as are implications for practice.
Early intervention and early childhood special education for infants, toddlers, and preschoolers with developmental disabilities or delays has more than a 100-year history (e.g., Caldwell, 1970; Dunst, 1996). Contemporary interest in early childhood intervention can be traced to a number of events predating the Education of the Handicapped Act (EHA, 1986) establishing an early intervention program for infants and toddlers and mandating preschool special education by more than 40 years (see, e.g., Hunt, 1961). Both theory and research prior to the EHA legislation shaped the kinds of practices that were used to influence young children’s behavior and development (e.g., Friedlander, Sterritt, & Kirk, 1975; Tjossem, 1976). One focus of early childhood intervention research prior to the EHA was the identification of the kinds of practices (experiences, learning opportunities, intervention activities, curricula, etc.) that were best suited for influencing child learning and development and the conditions under which those practices were most effective (see Dunst, 2012a). This was accompanied by research investigating the efficacy and effectiveness of a wide range of practices and the programs that adopted those practices as early childhood intervention methods and strategies (e.g., Dunst, Snyder, & Mankinen, 1988; Simeonsson, Cooper, & Scheiner, 1982). These syntheses together with more recent research have been the foundation for the development of evidence-based, recommended, and best practices in early childhood intervention (e.g., Guralnick, 1997; Sandall, McLean, & Smith, 2000).
Despite the evidence base for early childhood intervention practices, an important problem exists. Early childhood intervention practitioners often do not use evidence-based practices for intervening with young children with developmental disabilities or delays and their families (Dunst, 2007a; McLean, Snyder, Smith, & Sandall, 2002). There are several reasons why this is the case.
One is that intervention practices investigated under controlled or prescribed conditions do not easily generalize to routine practice settings (Rohrbach, Grana, Sussman, & Valente, 2006). Another is that those practices are often not viewed as socially valid by early childhood intervention practitioners and, therefore, are not seen as applicable to everyday practice (Schwartz, 1996). These problems are not unique to early childhood intervention. They are recognized as roadblocks to bridging the
Research to Practice
research-to-practice gap in a number of fields and disciplines (e.g., McLennan, Wathen, MacMillan, & Lavis, 2006). Efforts to bridge the research-to-practice gap have recently been addressed through what is now known as translational research (Woolf, 2008). Translational research refers to the processes and activities that use findings from basic research studies to inform the development, adoption, and use of evidence-based practices in routine settings and programs (Drolet & Lorenzi, 2011; Dunst & Trivette, 2009b; Lochman, 2006). The term practice means the experiences, learning opportunities, or intervention procedures afforded to or used with young children or their parents and families to influence the outcomes the practices are intended to effect (Dunst, Trivette, & Cutspec, 2007). The term evidence-based practices means practices informed by research findings demonstrating a relationship between the characteristics and consequences of a planned or naturally occurring experience or opportunity where the nature of the relationship is used to develop early childhood intervention practices (Dunst & Trivette, 2009b). The purpose of this chapter is to describe a translational framework and approach for identifying evidence-based practices, translating findings from research into intervention methods and strategies, and using this information to promote early childhood intervention practitioner adoption and use of evidence-based practices. The chapter includes three main sections. The first section includes a description of a translational framework for bridging research and practice in early childhood intervention. The second section includes examples of how the translational framework has been used to inform the identification, development, adoption, and use of evidence-based early childhood intervention practices. 
The final section includes descriptions of the implications of the translational framework and approach for both professional development and practitioner adoption and use of evidence-based practices in early intervention and early childhood special education.
A TRANSLATIONAL FRAMEWORK FOR BRIDGING THE RESEARCH TO PRACTICE GAP
Fig. 1 shows a three-component framework for bridging research and practice in early childhood intervention that specifies the sequence of events or stages for identifying evidence-based practices; using that information for developing evidence-based intervention methods, procedures, and interventions; and for promoting the use of the practices in routine settings. Sources of evidence refers to findings from the systematic investigation of the relationship(s) between the characteristics of early childhood intervention practices and the consequences of the use of the practices (Dunst & Trivette, 2009b; Dunst, Trivette, et al., 2007). Translational research refers to the processes and practices for using research findings and applying them to everyday early childhood intervention (Woolf, 2008). Implications for practice includes activities to promote early childhood intervention practitioner adoption and use of evidence-based practices in a manner that includes or mirrors the characteristics of the practices that research found are associated with positive outcomes (Dunst & Trivette, 2009b).
Fig. 1. A Proposed Translational Framework for Bridging Research-to-Practice in Early Childhood Intervention. Components, in sequence: Sources of Evidence, Translational Research, Implications for Practice.
Sources of Evidence
Debate continues as to what counts as evidence (Donaldson, Christie, & Mark, 2009; Dunst, 2010). At one end of the continuum, there are those who argue that randomized-controlled design studies should be the only or primary source of evidence-based practices (e.g., Higgins & Green, 2011). At the other end of the continuum, there are those who argue that professional wisdom (expert opinion and practitioner knowledge/experience) should be the only or primary source of information for evidence-based practices (Turnbull et al., 2009). We have opted for a middle ground where different types of studies are carefully scrutinized and evaluated to determine if the nature of the relationship between the characteristics and consequences of a practice is sufficient to conclude that the practice was effective and, therefore, is evidence-based (e.g., Dunst & Trivette, 2009b; Dunst, Trivette, et al., 2007). Evidence for the efficacy and effectiveness of early childhood intervention practices comes from various types of research – either experimental or correlational studies or both, and group or single participant design studies.
The sources of evidence for early childhood intervention practices are purposely broad so as not to exclude research findings on an a priori basis without an adequate evaluation of the quality of the evidence. Quality is determined based on the ability to identify the characteristics of an intervention practice that matter most in terms of their relationships with study outcomes while at the same time ruling out other explanatory factors (Dunst, Trivette, et al., 2007). The foundations of evidence-based practices include systematic reviews, syntheses of research evidence, and the systematic replication of the effects of practices in different early childhood intervention studies (Davies, 2002). The focus of analysis of these types of evidence is the identification of the characteristics of a practice that are associated with observed effects and consequences. This relationship can be empirically determined through causal, functional, statistical, or mediational analyses.
The primary sources of evidence for early childhood intervention practices are shown in Table 1.

Table 1. Major Sources of Evidence for Early Childhood Intervention Practices.
Types of Evidence | Description
Research Syntheses | Systematic reviews that objectively assess and summarize findings from different empirical studies of the same or similar practice(s) to determine if accumulated evidence supports the use of the practice and to draw overall conclusions about the practice(s) (Cooper, 1998).
Practice-Based Research Syntheses | A special type of meta-analysis or research synthesis that explicitly focuses on identifying the active ingredients and key characteristics of a practice associated with hypothesized or expected outcomes (Dunst & Trivette, 2009b).
Replicated Experimental Studies | Systematic replication of the effects of an intervention practice in multiple studies with different participants in different settings to evaluate the effectiveness of the intervention on expected or hypothesized outcomes (Horner, Halle, McGee, Odom, & Wolery, 2005; Odom et al., 2005).
Replicated Correlational Studies | Systematic replication of the correlations between the characteristics of a practice and an outcome in different studies with different participants to identify which characteristics show the largest covariation with the outcomes (Becker, 1992).

They include meta-analyses of studies of the same or similar kinds of early childhood intervention practices where effect sizes for the relationship between the characteristics of practices and different outcomes are used to determine whether a practice is effective (e.g., Trivette, Simkus, Dunst, & Hamby, 2012). They also include research syntheses that focus on identifying similar patterns of findings in different studies for determining whether a practice is effective (e.g., Kaiser & Trent, 2007). A particular type of meta-analysis and research synthesis called a practice-based research synthesis has proven especially useful for identifying the key characteristics of a practice that “stand out” as most important in terms of explaining the relationships between the characteristics of a practice and different outcomes (Dunst & Trivette, 2009b; Dunst, Trivette, et al., 2007). This type of research synthesis focuses on identifying and isolating which key features or active ingredients of a practice are most important for explaining observed changes or improvements in outcomes of interest where those characteristics are the foundations for developing evidence-based practices.
Other sources of evidence for early childhood intervention practices are systematic replications of the same practice with different study participants in either group design studies (e.g., Als et al., 2003) or single-participant design studies (Odom & Strain, 2002). This includes studies conducted by the same investigators as part of a systematic line of research (e.g., Mahoney & Nam, 2011) or by different investigators studying the effects of the same practice (e.g., White et al., 2011). Another source of data for evidence-based practices is studies that include replicated patterns of correlations among practice variables and outcome measures to identify the particular characteristics of a practice associated with the largest sizes of effect with expected or hypothesized outcomes (Hemphill, 2003).
Both Dunst and Kassow (2008) and Nievar and Becker (2008), for example, used this approach to identify the particular parent–child interactional behavior associated with infant attachment. The interactional behaviors found to be most important were following a child’s lead, sensitivity to a child’s interests and intent to initiate interactions, contingent social responsiveness to child’s initiations, and efforts to support and encourage child competence.
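The logic of replicated correlational studies, in which correlations between practice characteristics and an outcome are compared across studies to see which characteristics show the largest covariation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the characteristic labels, the correlations, and the sample sizes are invented and do not come from any study cited here, and pooling with Fisher's z weighted by n - 3 is one common convention rather than the procedure used by the authors of the syntheses discussed above.

```python
import math

# Hypothetical (made-up) results: for each practice characteristic, a list of
# (correlation with the outcome, sample size) pairs, one pair per study.
studies = {
    "characteristic_a": [(0.30, 40), (0.42, 65), (0.35, 52)],
    "characteristic_b": [(0.48, 40), (0.51, 65), (0.44, 52)],
    "characteristic_c": [(0.22, 40), (0.18, 65), (0.27, 52)],
}


def pooled_correlation(results):
    """Pool correlations across studies with Fisher's z, weighting each by n - 3."""
    weighted_z = sum((n - 3) * math.atanh(r) for r, n in results)
    total_weight = sum(n - 3 for _, n in results)
    # Transform the weighted mean z back to the correlation scale.
    return math.tanh(weighted_z / total_weight)


def rank_characteristics(data):
    """Return (characteristic, pooled r) pairs sorted from largest to smallest."""
    pooled = {name: pooled_correlation(res) for name, res in data.items()}
    return sorted(pooled.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, r in rank_characteristics(studies):
        print(f"{name}: pooled r = {r:.2f}")
```

With these invented numbers the hypothetical characteristic_b ranks first because its correlations are consistently the largest across the three illustrative studies. An actual synthesis would also need to weigh study quality and the variability of effects across studies.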
Translational Processes and Practices
Translational research means the different ways research findings can be used to inform practitioner adoption and use of early childhood intervention practices that are evidence-based. This type of research encompasses two broad areas of translation (Zucker, 2009). One area of translation employs research findings from basic research studies and uses the results to inform the development of evidence-based intervention practices. The second area of translation includes the methods and procedures for promoting adoption and use of evidence-based practices by early childhood intervention practitioners in everyday settings.
Translational research, processes, and practices have primarily been the focus of clinical practices in medicine (e.g., Zucker, 2009). Both clinicians and researchers have proposed or developed different models of translation to help clarify and define different types of translation (e.g., Drolet & Lorenzi, 2011; Trochim, Kane, Graham, & Pincus, 2011). In the short time since the first two-stage translational research process was introduced (see Zucker, 2009), multistep and multicomponent models have been proposed for bridging the research-to-practice gap (see especially Callard, Rose, & Wykes, 2011; Trochim et al., 2011). These different models and frameworks were used to develop the four types of translation shown in Table 2, although we have taken considerable liberty to describe translational processes and practices in terms of activities applicable to early childhood intervention.

Table 2. Types of Translational Research and Practice for Early Childhood Intervention.
Types | Description
Type 1 | Use research findings to develop evidence-based practices.
Type 2 | Promote practitioner or parent adoption and use of evidence-based intervention practices using evidence-based implementation practices.
Type 3 | Evaluate the effects of the use of evidence-based practices in everyday intervention settings and contexts by different end-users (e.g., practitioners, parents).
Type 4 | Promote broad-based understanding of the characteristics of evidence-based practices and why those characteristics are important for improving child, parent, and family outcomes.

T1 translation uses the findings from one or more of the sources of evidence described above to develop evidence-based early childhood practices. This is accomplished by using the characteristics, active ingredients, and key features of a practice associated with positive outcomes in primary studies and using that information to develop intervention procedures, practice guides, or other intervention methods that include those aspects of a practice found to be associated with positive child, parent, or family outcomes in basic research studies. This is done by first identifying the most important characteristics of a practice and second by using those characteristics to develop evidence-based practices.
T2 translation focuses on the evidence-based implementation methods and procedures for promoting practitioner adoption and use of evidence-based intervention practices (Dunst & Trivette, 2012). Implementation practices include different evidence-based methods and procedures used by coaches, supervisors, instructors, trainers, researchers, etc. to facilitate or enhance practitioners’ or parents’ use of evidence-based intervention practices (Dunst, Trivette, & Raab, in press). Research-validated implementation practices include, but are not limited to, coaching, just-in-time training, guided design, and other adult learning methods used to promote adoption and use of evidence-based intervention practices (Dunst, Trivette, & Hamby, 2010; Smith & DeFrates-Densch, 2008).
T3 translation includes activities to discern whether the evidence-based practices developed as part of T1 and T2 activities have the same or similar effects as found in primary research studies when used in everyday intervention settings by different practitioners or parents. This includes assessment of the fidelity of both evidence-based implementation and intervention practices (Dunst, Trivette, McInerney, et al., 2008), and the manner in which variations in fidelity are related to variations in use of the practices as part of routine early intervention (Dunst et al., in press).
T4 translation includes activities for increasing the awareness, understanding, and use of evidence-based practices and why the evidence-based characteristics of the practices are important for improving child, parent, or family outcomes. This is accomplished through knowledge translation, dissemination, diffusion, and scaling-up the broad-based adoption of evidence-based practices.
Implications for Practice
The different ways in which research findings are used to develop evidence-based early childhood intervention practices provide a foundation for promoting the routine use of the practices by early childhood practitioners, parents, and other primary caregivers. This is accomplished using a number of strategies for increasing awareness, understanding, and broad-based use of evidence-based practices by highlighting the importance of the characteristics of practices that are associated with positive benefits and outcomes.
The different types of translation of research findings into evidence-based practices have implications for the kinds of experiences and learning opportunities afforded young children with disabilities or delays (e.g., Campbell & Sawyer, 2007; Dunst, Hamby, Trivette, Raab, & Bruder, 2000), the kinds of instructional practices used to promote and enhance child learning and development (e.g., Dunst, Raab, & Trivette, 2011; Wolery, 1994), the practices used to support and strengthen family-capacity to promote child learning and development (Dunst, 2012b; Swanson, Raab, & Dunst, 2011), and the methods and procedures used by implementation agents to promote practitioner adoption and use of evidence-based early childhood intervention practices (e.g., Dunst, Trivette, et al., 2010; Smith & DeFrates-Densch, 2008). The different types of translation also have implications for assessing the fidelity of both implementation and intervention practices (e.g., Dunst et al., in press; Hulleman & Cordray, 2009), the social validity of both types of practices (e.g., Strain, Barton, & Dunlap, 2012; Turan & Meadan, 2011), and the methods and procedures for promoting broad-based understanding and use of the key characteristics of evidence-based practices (Bruder, 2010).
TRANSLATING EARLY CHILDHOOD INTERVENTION RESEARCH INTO PRACTICE
Examples of activities for each type of translation are described in this section of the chapter. The examples are illustrative and not exhaustive. We provide the reader with a number of examples to highlight how the translational framework can be used to bridge research and practice in early childhood intervention.
T1 Translation
T1 translation focuses on the identification of the characteristics of evidence-based practices and how those characteristics are used to develop evidence-based intervention procedures. Different bodies of research evidence, for example, have been used to develop evidence-based models and frameworks for providing early childhood intervention to young children and their families (e.g., Dunst, 2004; Guralnick, 2001; Odom & Wolery, 2003). The different components of those models and frameworks, and the practices in each component, are based on different sources of research
CAROL M. TRIVETTE AND CARL J. DUNST
evidence that informed the intervention procedures proposed by the framework and model developers. Table 3 includes examples of different evidence-based early childhood intervention practices and the types of evidence used to identify their effectiveness or efficiency. The examples are a very small set of systematic reviews or lines of research for selected early childhood intervention practices. Two examples are provided to illustrate how evidence-based practices are identified and how results from different types of studies are used to develop evidence-based intervention practices. The first example is from research on early contingency learning. The second example is from research on parent–child interactions. Early contingency learning, or response-contingent learning, refers to a child’s use of instrumental behavior to produce interesting or reinforcing environmental consequences (Hulsebus, 1973) and the child’s awareness that he or she was the agent of the observed effects (Dunst, Trivette, Raab, & Masiello, 2008). Research syntheses and meta-analyses of both group design and single-participant design studies investigating early contingency learning of infants and toddlers with or without disabilities helped identify the conditions under which contingency learning, detection, and awareness develop (e.g., Dunst, Gorman, & Hamby, 2010; Dunst, Storck, Hutto, & Snyder, 2007; Hulsebus, 1973). Infants and toddlers with disabilities are more likely to learn contingency behavior when discrete, episodic reinforcement is provided in response to a child’s instrumental behavior whereas conjugate reinforcement is used to maintain behavior responding once contingency detection has been manifested. 
Infants and toddlers with disabilities, like their typically developing counterparts, demonstrate positive social-emotional behaviors that are overt indicators of contingency detection and awareness where the behavior serves to increase social interactions with adults and others (Dunst, 2007b). Results from systematic reviews and syntheses of response-contingent learning studies have been used to develop intervention procedures for young children with disabilities (e.g., Lancioni, 1980; Sullivan & Lewis, 1990). This has included compilations of learning games for promoting response-contingent learning (Dunst, 1981) as well as tool kits for both parents and practitioners to use to engage young children with disabilities in response-contingent learning opportunities (e.g., Orelena Hawks Puckett Institute, 2005). Another line of research that has been especially useful for developing evidence-based early childhood intervention practices has been correlation studies of different aspects of parent–child interactional behavior associated
Table 3. Examples of Sources of Evidence-Based Practices in Early Childhood Intervention.

Early Child Intervention Practices/Source of Evidence

Response-Contingent Child Learning
Dunst, C. J. (2007). Social-emotional consequences of response-contingent learning opportunities (Winterberry Research Syntheses Vol. 1, No. 16). Asheville, NC: Winterberry Press.

Interest-Based Child Learning
Dunst, C. J., Trivette, C. M., & Hamby, D. W. (2012). Effects of interest-based interventions on the social-communicative behavior of young children with autism spectrum disorders. CELLreviews, 5(6), 1–10.

Joint Attention Practices
White, P. J., O’Reilly, M., et al. (2011). Best practices for teaching joint attention: A systematic review of the intervention literature. Research in Autism Spectrum Disorders, 5, 1283–1295.

Parent–Child Interactions
Mahoney, G., & Nam, S. (2011). The parenting model of developmental intervention. International Review of Research in Developmental Disabilities, 41, 74–118.

Naturalistic Teaching Practices
Kaiser, A. P., & Trent, J. A. (2007). Communication intervention for young children with disabilities: Naturalistic approaches to promoting development. In S. L. Odom, R. H. Horner, M. E. Snell, & J. Blacher (Eds.), Handbook of developmental disabilities (pp. 224–245). New York, NY: Guilford Press.

Natural Environments
Noonan, M. J., & McCormick, L. (2006). Young children with disabilities in natural environments: Methods and procedures. Baltimore, MD: Brookes.

Inclusion Practices
Odom, S. L., Vitztum, J., Wolery, R., Lieber, J., Sandall, S., Hanson, M. J., Beckman, P., Schwartz, I., & Horn, E. (2004). Preschool inclusion in the United States: A review of research from an ecological systems perspective. Journal of Research in Special Educational Needs, 4, 17–49.

Classroom Practices
Dunn, L., & Kontos, S. (1997). Developmentally appropriate practice: What does the research tell us? [ERIC digest]. Champaign, IL: ERIC Clearinghouse on Elementary and Early Childhood Education. (ERIC Document Reproduction Service No. ED413106).

Positive Behavior Supports
Dunlap, G., & Fox, L. (2009). Positive behavior support and early intervention. In W. Sailor, G. Dunlap, G. Sugai, & R. Horner (Eds.), Handbook of positive behavior support (pp. 49–72). New York, NY: Springer.

Family-Centered Practices
Dunst, C. J., Trivette, C. M., & Hamby, D. W. (2007). Meta-analysis of family-centered helpgiving practices research. Mental Retardation and Developmental Disabilities Research Reviews, 13, 370–378.
with positive child behavioral consequences (Shonkoff & Phillips, 2000). Research syntheses of parent–child interaction studies have led to the identification of the manner in which parents’ sensitivity and contingent social responsiveness to child behavior increases child engagement and behavioral responding, and how parent requests for elaborations produce variations in a child’s behavior repertoire (e.g., Dunst & Kassow, 2008; Nievar & Becker, 2008; Trivette, 2007; Warren & Brady, 2007). Findings from research syntheses and meta-analyses of parent–child interaction research have informed the development of a number of naturalistic teaching procedures (see Dunst, Raab, et al., 2011) and have been used to develop a number of parent-mediated approaches to early childhood intervention (e.g., Dunst, 2006).
T2 Translation
T2 translation involves the use of both evidence-based implementation practices and evidence-based intervention practices to produce desired benefits and outcomes. Implementation practices are defined as the methods and procedures used by implementation agents (coaches, supervisors, instructors, trainers, etc.) to promote practitioners’ or parents’ adoption and use of evidence-based intervention practices. These types of practices include evidence-based coaching, mentoring, and other adult learning methods (e.g., Cochran-Smith, Feiman-Nemser, & McIntyre, 2008). In contrast, intervention practices are defined as the methods and strategies used by intervention agents (early childhood teachers, early interventionists, parent educators, parents, etc.) to influence changes or produce desired outcomes in a targeted population or group of recipients (e.g., preschool children) with disabilities. Naturalistic teaching procedures are one example of an evidence-based intervention practice (e.g., Dunst, Raab, et al., 2011). However, neither implementation nor intervention practices, no matter what their evidence base, are likely to have intended effects if they are not used with fidelity (Carroll et al., 2007; Gearing et al., 2011). Fig. 2 shows the relationships between evidence-based implementation practices and evidence-based intervention practices, and how the adoption and use of both practices with fidelity would, in turn, be expected to have hypothesized benefits and outcomes.
A classroom-based early childhood intervention study conducted in 18 Head Start programs was used to illustrate T2 translation (Trivette, Raab, & Dunst, 2012). The purpose of the study was to promote teacher and
[Figure 2: three linked columns. Implementation Practices (Evidence-Based Implementation Characteristics; Implementation Fidelity), Intervention Practices (Evidence-Based Intervention Characteristics; Intervention Fidelity), and Practice Outcomes (Practice Consequences; Optimal Benefits).]
Fig. 2. Framework for Showing the Relationship between the Fidelity of Both Evidence-Based Implementation and Intervention Practices and the Consequences of the Practices.
teacher assistant adoption and use of evidence-based child learning opportunities (Dunst et al., 2001) and responsive teaching (Raab & Dunst, 2009). The particular characteristics of the responsive teaching instructional practice, for example, were based on findings from research syntheses and meta-analyses of naturalistic teaching practices described briefly above (see Dunst, Raab, et al., 2011). The implementation practice used by the study investigators was an evidence-based Participatory Adult Learning Strategy (PALS) that included the characteristics of different adult learning methods found to be associated with optimal learner outcomes in a meta-analysis of adult learning studies (Dunst, Trivette, et al., 2010). This included the methods for introducing and demonstrating the classroom practices to the teachers and teacher assistants, the activities used to have classroom staff use the practices and analyze the consequences of the practices, and the methods to engage the classroom staff in reflection on and self-assessment of their mastery of the practices. A project staff member (coach) met with each teacher and teacher assistant once a week for four months, during which refinements or improvements in the practices or adoption of new practices were the focus of coaching. The fidelity of both the implementation and intervention practices was assessed by fidelity checklists where the use of different characteristics of the
two types of practices was rated on a 5-point scale ranging from not-at-all used to routinely used. The median percentage of items rated a 4 or 5 on the implementation fidelity scale was 100% for coaching, and the median percentage of items rated a 4 or 5 on the intervention fidelity scale was 94% for the classroom practices. As expected, however, there was variability in the ratings on both fidelity scales in the individual classrooms. Analysis of the ratings showed that variations in implementation fidelity were related to variations in the use of the classroom practices. In those classrooms where the implementation fidelity ratings were the highest, the teachers and teacher assistants had the highest fidelity ratings on both the classroom practices and responsive teaching fidelity measures, showing that the two types of fidelity were related in the hypothesized manner (Fig. 2).
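The checklist scoring described above can be illustrated with a short computational sketch. The ratings below are invented for illustration and are not data from the study; the function names, the three classrooms, and the use of a Pearson correlation to express how the two types of fidelity covary are all assumptions made for this example, since the chapter does not report which statistic was used.

```python
from math import sqrt
from statistics import median


def percent_high(ratings, threshold=4):
    """Percentage of checklist items rated at or above `threshold` (4 or 5 on a 5-point scale)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    high = sum(1 for r in ratings if r >= threshold)
    return 100.0 * high / len(ratings)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical item ratings (1 = not-at-all used ... 5 = routinely used).
classrooms = {
    "A": {"implementation": [5, 5, 4, 5], "intervention": [5, 4, 4, 5, 4]},
    "B": {"implementation": [4, 5, 5, 4], "intervention": [4, 5, 3, 4, 5]},
    "C": {"implementation": [5, 3, 4, 4], "intervention": [3, 4, 2, 4, 5]},
}

impl_pct = [percent_high(c["implementation"]) for c in classrooms.values()]
interv_pct = [percent_high(c["intervention"]) for c in classrooms.values()]

median_impl = median(impl_pct)      # median percentage across classrooms
median_interv = median(interv_pct)
r = pearson_r(impl_pct, interv_pct)  # do the two types of fidelity covary?

print(f"Median implementation fidelity: {median_impl:.0f}%")
print(f"Median intervention fidelity: {median_interv:.0f}%")
print(f"Correlation between fidelity scores: {r:.2f}")
```

A positive correlation in such a calculation would be consistent with the pattern depicted in Fig. 2, although with so few classrooms it would be descriptive only.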
T3 Translation
The main focus of T3 translation is to evaluate whether the use of evidence-based practices in everyday intervention settings has the same or similar effects as found in primary studies. Two examples are provided to illustrate evidence-based practices being evaluated as part of everyday use. Both examples were field tests of evidence-based tool kits (Dunst, Pace, & Hamby, 2007; Trivette, Dunst, Hamby, & Pace, 2007). One tool kit included learning games for infants and toddlers based on findings from research syntheses and meta-analyses of response-contingent learning described above. The second tool kit included responsive teaching procedures based on findings from systematic reviews of parent–child interaction studies also described above. Both tool kits were used by early intervention practitioners with parents who used the practices with their children in the families’ homes. Each of the field tests included implementation fidelity, intervention fidelity, social validity, and child outcome measures. Findings from both field tests showed that the early intervention practitioners used the practices with the parents as intended (implementation fidelity) and that the parents used the practices with their children as intended (intervention fidelity). As expected, variations in the fidelity of use of the intervention practices were related to variations in the child outcomes. The majority of practitioners and parents in both field tests judged the practices as socially valid. Variations in the use of both the implementation and intervention practices covaried with variations in the social validity judgments of the practitioners and parents. The more socially valid the practices were judged,
the more the practices were used with fidelity. The findings from both studies indicated that the evidence-based practices had the same or similar effects found in the basic research studies used to develop the practices, and that fidelity and social validity indicators were associated with the use and consequences of the practices as expected (Fig. 2).
T4 Translation
T4 translation uses information from the other three types of translation – and especially information about intervention characteristics that matter most in terms of influencing child, parent, or family outcomes – to disseminate, diffuse, and scale-up the use of evidence-based intervention practices. One strategy that we have used for doing this is to embed evidence-based characteristics into the development of intervention procedures and descriptions of the evidence-based practices. For example, findings from research syntheses and meta-analyses of interest-based child learning (Dunst, Jones, Johnson, Raab, & Hamby, 2011; Dunst, Trivette, & Hamby, 2012; Raab & Dunst, 2007; Renninger, 1998) have been used to incorporate interest-based features (e.g., child preferences and choices) into early literacy learning practice guides (see www.earlyliteracylearning.org) to increase the likelihood that evidence-based practice guide activities will be engaging to young children and promote literacy and language skills.
Any number of methods and approaches can be used to disseminate information about evidence-based early childhood intervention practices and to promote broad-based diffusion of information about evidence-based practices among potential users. Dissemination means the targeted distribution of information and evidence-based materials to specific audiences in order to spread knowledge and understanding of the practices (Glasgow et al., 2012). Diffusion refers to the communicative channels used to increase awareness, understanding, and the benefits of adopting evidence-based practice (Rogers, 1995). There is little doubt that the internet and website-based dissemination and diffusion have become one of the major, if not the major, means for communicating information about evidence-based practices (e.g., Steyaert, 2011).
One strategy that we use to promote awareness of evidence-based practices is to post research syntheses, practice guides, and other evidence-based information and materials on one of our Institute’s websites (www.puckett.org) and to announce the availability of the products and
materials by postings on partner websites, web-based newsletters, Listservs, Facebook, and other social media (Kaplan & Haenlein, 2010).
Dissemination and diffusion are important for “getting the word out,” but additional activities are necessary for promoting broad-based adoption and sustained use of evidence-based practices. This is typically accomplished, for example, by initiatives to scale-up the use of evidence-based practices using methods and procedures to promote the sustained use of the practices throughout a program, organization, or geographic area (McDonald, Keesler, Kauffman, & Schneider, 2006). Scaling-up is a multifaceted and multilayered activity that requires considerable resources to promote the sustained use of evidence-based practices and interventions (Foorman & Moats, 2004; Klingner, Ahwee, Pilonieta, & Menendez, 2003).
We have used lessons learned from different scaling-up initiatives to develop methods and procedures for scaling-up evidence-based early literacy practices (Dunst, Trivette, Masiello, & McInerney, 2006) and measuring the extent to which the practices were used with fidelity and had expected benefits (Dunst, Trivette, McInerney, et al., 2008). The initiative was carried out in six states where state-level technical assistance providers were trained to use the PALS evidence-based implementation practice (Dunst & Trivette, 2009a) to promote practitioner use of evidence-based literacy intervention practices (www.earlyliteracylearning.org). The practitioners in turn used the implementation practices to promote parents’ use of the intervention practices with their children. Results to date indicate that when technical assistance providers learn to use the implementation practices with fidelity, the practitioners they train use the intervention practices with fidelity.
DISCUSSION
We conclude the chapter with brief descriptions of the implications of the translational framework for professional development and practitioner and parent use of evidence-based practices for achieving desired child, parent, or family outcomes. The implications for professional development include both preservice and inservice training. The implications for promoting practitioner or parent use of evidence-based practices include the need to understand and pay explicit attention to why practitioners or parents are likely to discontinue using nonevidence-based practices and adopt evidence-based practices.
Implications for Professional Development
At the preservice level, the translational framework can be used to help students understand how basic research findings can be used to inform the development of evidence-based intervention practices, and the steps that need to be taken to test and evaluate whether the practices are used with fidelity and have expected or anticipated consequences as part of routine intervention. The focus on the research evidence for both implementation and intervention practices, and the distinction between the two types of practices, can also help students understand which implementation practices need to be used with which types of intervention practices to ensure desired benefits and outcomes. One other implication of the framework is that it explicitly requires attention to implementation and intervention fidelity, and the manner in which social validity is related to the adoption of evidence-based practices with fidelity.
At the inservice level, the translational framework, and especially the relationships depicted in Fig. 2, makes clear the need to use evidence-based implementation practices for promoting practitioner or parent use of evidence-based intervention practices. So often, ineffective training methods are used for promoting practitioner or parent knowledge and use of an intervention practice, which results in a failure to adopt and use the practice with fidelity. In contrast, the use of evidence-based training procedures, as in our previously discussed PALS example (Dunst & Trivette, 2009a), has proven effective for promoting practitioner or parent use of a number of different kinds of evidence-based early childhood intervention practices (e.g., Dunst & Raab, 2010; Dunst, Trivette, & Deal, 2011; Swanson et al., 2011; Trivette, Raab, et al., 2012).
As previously described, PALS includes methods for introducing and illustrating the use of evidence-based intervention practices, activities for promoting practitioner use and evaluation of intervention practices, and procedures for engaging practitioners in reflection on and assessment of their understanding and mastery of the most important characteristics of the practices.
Implications for Early Childhood Intervention
In terms of routine early childhood intervention, the translational framework is a useful heuristic for understanding why a practitioner or parent is or is not likely to entertain the adoption and use of an evidence-based intervention practice (Campbell & Halbert, 2002). Paying attention to
comments about the presumed value of a practice (positive or negative) can provide information about social validity judgments. In nearly every research-to-practice field test we have conducted, social validity judgments predicted the use of an intervention practice with fidelity (e.g., Trivette, Dunst, Masiello, Gorman, & Hamby, 2009).
Another reason that practitioners or parents often do not use evidence-based intervention practices, or do not use the practices with fidelity, is that the types of training, coaching, or supports used by implementation agents are simply ineffective. For example, a meta-analysis of methods used to train practitioners or parents to use assistive technology with young children with disabilities showed that training was most effective when it included the largest number of evidence-based PALS characteristics (Dunst, Trivette, Meter, & Hamby, 2011), but that many training studies did not include the majority of those characteristics.
One other implication for practice has to do with the need for explicit demonstration that an evidence-based practice is better than practice-as-usual. Practitioners and parents are more likely to adopt and use new practices if they see that they work. For example, as part of a line of research on interest-based child learning, the positive effects of this type of learning often result in practitioner and parent recognition and incorporation of children’s interests into early childhood intervention practices (e.g., Dunst & Raab, 2011). Demonstrating that an evidence-based practice is more effective than business-as-usual is one of the major activities of T3 translation.
CONCLUSION
Bridging research and practice in early childhood intervention is a multifaceted enterprise. A translational framework like the one described in this chapter can serve as a conceptual and operational guide for engaging in different kinds of activity for developing, implementing, and evaluating evidence-based practices as part of early childhood intervention.
REFERENCES
Als, H., Gilkerson, L., Duffy, F. H., McAnulty, G. B., Buehler, D. M., Vandenberg, K., & Jones, K. J. (2003). A three-center, randomized, controlled trial of individualized developmental care for very low birth weight preterm infants: Medical, neurodevelopmental, parenting, and caregiving effects. Journal of Developmental and Behavioral Pediatrics, 24, 399–408. doi:10.1097/00004703-200312000-00001
Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal of Educational Statistics, 17, 341–362.
Bruder, M. B. (2010). Early childhood intervention: A promise to children and families for their future. Exceptional Children, 76, 339–355. Retrieved from http://journals.cec.sped.org/ec/
Caldwell, B. M. (1970). The rationale for early intervention. Exceptional Children, 36, 717–726.
Callard, F., Rose, D., & Wykes, T. (2011). Close to the bench as well as at the bedside: Involving service users in all phases of translational research. Health Expectations. Advance online publication. doi:10.1111/j.1369-7625.2011.00681.x
Campbell, P. H., & Halbert, J. (2002). Between research and practice: Provider perspectives on early intervention. Topics in Early Childhood Special Education, 22, 213–226.
Campbell, P. H., & Sawyer, L. B. (2007). Supporting learning opportunities in natural settings through participation-based services. Journal of Early Intervention, 29, 287–305.
Carroll, C., Patterson, M., Wood, S., Booth, A., Rick, J., & Balain, S. (2007). A conceptual framework for implementation fidelity. Implementation Science, 2, 40. doi:10.1186/1748-5908-2-40
Cochran-Smith, M., Feiman-Nemser, S., & McIntyre, D. J. (Eds.). (2008). Handbook of research on teacher education: Enduring questions in changing contexts (3rd ed.). New York, NY: Routledge.
Cooper, H. M. (1998). Synthesizing research: A guide for literature reviews (Vol. 2, 3rd ed.). Applied Social Research Methods. Thousand Oaks, CA: Sage.
Davies, P. (2000). The relevance of systematic reviews to educational policy and practice. Oxford Review of Education, 26, 365–378.
Donaldson, S. I., Christie, C. A., & Mark, M. M. (Eds.). (2009). What counts as credible evidence in applied research and evaluation practice? Thousand Oaks, CA: Sage.
Drolet, B. C., & Lorenzi, N. M. (2011). Translational research: Understanding the continuum from bench to bedside. Translational Research, 157, 1–5. doi:10.1016/j.trsl.2010.10.002
Dunst, C. J. (1981). Infant learning: A cognitive-linguistic intervention strategy. Allen, TX: DLM.
Dunst, C. J. (1996). Early intervention in the USA: Programs, models, and practices. In M. Brambring, H. Rauh & A. Beelmann (Eds.), Early childhood intervention: Theory, evaluation, and practice (pp. 11–52). Berlin: de Gruyter.
Dunst, C. J. (2004). An integrated framework for practicing early childhood intervention and family support. Perspectives in Education, 22(2), 1–16. Retrieved from http://search.sabinet.co.za/pie/index.html
Dunst, C. J. (2006). Parent-mediated everyday child learning opportunities: I. Foundations and operationalization. CASEinPoint, 2(2), 1–10. Retrieved from http://www.fippcase.org/caseinpoint/caseinpoint_vol2_no2.pdf
Dunst, C. J. (2007a). Early intervention with infants and toddlers with developmental disabilities. In S. L. Odom, R. H. Horner, M. Snell & J. Blacher (Eds.), Handbook of developmental disabilities (pp. 161–180). New York, NY: Guilford Press.
Dunst, C. J. (2007b). Social-emotional consequences of response-contingent learning opportunities (Winterberry Research Syntheses Vol. 1, No. 16). Asheville, NC: Winterberry Press.
Dunst, C. J. (2010, August). Research syntheses of early childhood intervention practices: What counts as evidence? Presentation made at the Victorian Chapter of the Early Childhood Intervention Australia Association Seminar on Evidence-Based Practices, Melbourne, Australia. Retrieved from http://utilization.info/presentations.php
Dunst, C. J. (2012a). Parapatric speciation in the evolution of early intervention for infants and toddlers with disabilities and their families. Topics in Early Childhood Special Education, 31, 208–215. doi:10.1177/0271121411426904
Dunst, C. J. (2012b, July). Strengthening parent capacity as part of early intervention program practices. Presentation made at the Office of Special Education Programs Leadership Conference, Washington, DC.
Dunst, C. J., Bruder, M. B., Trivette, C. M., Hamby, D., Raab, M., & McLean, M. (2001). Characteristics and consequences of everyday natural learning opportunities. Topics in Early Childhood Special Education, 21, 68–92. doi:10.1177/027112140102100202
Dunst, C. J., Gorman, E., & Hamby, D. W. (2010). Effects of adult verbal and vocal contingent responsiveness on increases in infant vocalizations. CELLreviews, 3(1), 1–11. Retrieved from http://www.earlyliteracylearning.org/cellreviews/cellreviews_v3_n1.pdf
Dunst, C. J., Hamby, D., Trivette, C. M., Raab, M., & Bruder, M. B. (2000). Everyday family and community life and children’s naturally occurring learning opportunities. Journal of Early Intervention, 23, 151–164. doi:10.1177/10538151000230030501
Dunst, C. J., Jones, T., Johnson, M., Raab, M., & Hamby, D. W. (2011). Role of children’s interests in early literacy and language development. CELLreviews, 4(5), 1–18. Retrieved from http://www.earlyliteracylearning.org/cellreviews/cellreviews_v4_n5.pdf
Dunst, C. J., & Kassow, D. Z. (2008). Caregiver sensitivity, contingent social responsiveness, and secure infant attachment. Journal of Early and Intensive Behavior Intervention, 5, 40–56. Retrieved from http://www.jeibi.com/
Dunst, C. J., Pace, J., & Hamby, D. W. (2007). Evaluation of the games for growing tool kit for promoting early contingency learning (Winterberry Research Perspectives Vol. 1, No. 6). Asheville, NC: Winterberry Press.
Dunst, C. J., & Raab, M. (2010). Practitioners’ self-evaluations of contrasting types of professional development. Journal of Early Intervention, 32, 239–254. doi:10.1177/1053815110384702
Dunst, C. J., & Raab, M. (2011). Interest-based child participation in everyday learning activities. In N. M. Seel (Ed.), Encyclopedia of the sciences of learning. New York, NY: Springer. doi:10.1007/978-1-4419-1428-6
Dunst, C. J., Raab, M., & Trivette, C. M. (2011). Characteristics of naturalistic language intervention strategies. Journal of Speech-Language Pathology and Applied Behavior Analysis, 5(3–4), 8–16. Retrieved from http://www.baojournal.com/SLP-ABA%20WEBSITE/index.html
Dunst, C. J., Snyder, S. W., & Mankinen, M. (1988). Efficacy of early intervention. In M. Wang, H. Walberg & M. Reynolds (Eds.), Handbook of special education: Research and practice: Vol. 3. Low incidence conditions (pp. 259–294). Oxford, England: Pergamon Press.
Dunst, C. J., Storck, A. J., Hutto, M. D., & Snyder, D. (2007). Relative effectiveness of episodic and conjugate reinforcement on child operant learning (Winterberry Research Syntheses Vol. 1, No. 27). Asheville, NC: Winterberry Press.
Dunst, C. J., & Trivette, C. M. (2009a). Let’s be PALS: An evidence-based approach to professional development. Infants and Young Children, 22(3), 164–175. doi:10.1097/IYC.0b013e3181abe169
Dunst, C. J., & Trivette, C. M. (2009b). Using research evidence to inform and evaluate early childhood intervention practices. Topics in Early Childhood Special Education, 29, 40–52. doi:10.1177/0271121408329227
Dunst, C. J., & Trivette, C. M. (2012). Meta-analysis of implementation practice research. In B. Kelly & D. F. Perkins (Eds.), Handbook of implementation science for psychology in education (pp. 68–91). New York, NY: Cambridge University Press.
Dunst, C. J., Trivette, C. M., & Cutspec, P. A. (2007). Toward an operational definition of evidence-based practices (Winterberry Research Perspectives Vol. 1, No. 1). Asheville, NC: Winterberry Press.
Dunst, C. J., Trivette, C. M., & Deal, A. G. (2011). Effects of in-service training on early intervention practitioners’ use of family systems intervention practices in the USA. Professional Development in Education, 37, 181–196. doi:10.1080/19415257.2010.527779
Dunst, C. J., Trivette, C. M., & Hamby, D. W. (2010). Meta-analysis of the effectiveness of four adult learning methods and strategies. International Journal of Continuing Education and Lifelong Learning, 3(1), 91–112. Retrieved from http://research.hkuspace.hku.hk/journal/ijcell/
Dunst, C. J., Trivette, C. M., & Hamby, D. W. (2012). Meta-analysis of studies incorporating the interests of young children with autism spectrum disorders into early intervention practices. Autism Research and Treatment, 2012, 1–10. doi:10.1155/2012/462531
Dunst, C. J., Trivette, C. M., Masiello, T., & McInerney, M. (2006). Scaling up early childhood intervention literacy learning practices. CELLpapers, 1(2), 1–10. Retrieved from http://www.earlyliteracylearning.org/cellpapers/cellpapers_v1_n2.pdf
Dunst, C. J., Trivette, C. M., McInerney, M., Holland-Coviello, R., Masiello, T., Helsel, F., & Robyak, A. (2008). Measuring training and practice fidelity in capacity-building scaling-up initiatives. CELLpapers, 3(1), 1–11. Retrieved from http://www.earlyliteracylearning.org/cellpapers/cellpapers_v3_n1.pdf
Dunst, C. J., Trivette, C. M., Meter, D., & Hamby, D. W. (2011). Influences of contrasting types of training on practitioners’ and parents’ use of assistive technology and adaptations with infants, toddlers and preschoolers with disabilities. Practical Evaluation Reports, 3(1), 1–35. Retrieved from http://practicalevaluation.org/reports/CPE_Report_Vol3No1.pdf
Dunst, C. J., Trivette, C. M., & Raab, M. (in press). Utility of implementation and intervention performance checklists for conducting research in early childhood education. In O. Saracho (Ed.), Handbook of research methods in early childhood education. Charlotte, NC: Information Age Publishing.
Dunst, C. J., Trivette, C. M., Raab, M., & Masiello, T. (2008). Early child contingency learning and detection: Research evidence and implications for practice. Exceptionality, 16, 4–17. doi:10.1080/09362830701796743
Education of the Handicapped Act Amendments of 1986, Pub. L. No. 99-457, 100 Stat. 1145. (1986).
Foorman, B. R., & Moats, L. C. (2004). Conditions for sustaining research-based practices in early reading instruction. Remedial and Special Education, 25, 51–60.
Friedlander, B. Z., Sterritt, G. M., & Kirk, G. E. (Eds.). (1975). Exceptional infant: Volume 3: Assessment and intervention. New York, NY: Brunner.
Gearing, R. E., El-Bassel, N., Ghesquiere, A., Baldwin, S., Gillies, J., & Ngeow, E. (2011). Major ingredients of fidelity: A review and scientific guide to improving quality of intervention research implementation. Clinical Psychology Review, 31, 79–88. doi:10.1016/j.cpr.2010.09.007
Glasgow, R. E., Vinson, C., Chambers, D., Khoury, M. J., Kaplan, R. M., & Hunter, C. (2012). National Institutes of Health approaches to dissemination and implementation science:
194
CAROL M. TRIVETTE AND CARL J. DUNST
Current and future directions. American Journal of Public Health, 102, 1274–1281. doi:10.2105/AJPH.2012.300755 Guralnick, M. J. (Ed.). (1997). The effectiveness of early intervention. Baltimore, MD: Brookes. Guralnick, M. J. (2001). A developmental systems model for early intervention. Infants and Young Children, 14(2), 1–18. Hemphill, J. F. (2003). Interpreting the magnitudes of correlation coefficients. American Psychologist, 58, 78–79. doi:10.1037/0003-066X.58.1.78 Higgins, J. P. T., & Green, S. (Eds.). (2011). Cochrane handbook for systematic reviews of interventions [Version 5.1.0]. Retrieved from http://www.cochrane-handbook.org Horner, R. H., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179. Hulleman, C. S., & Cordray, D. S. (2009). Moving from the lab to the field: The role of fidelity and achieved relative intervention strength. Journal of Research on Educational Effectiveness, 2, 88–110. doi:10.1080/19345740802539325 Hulsebus, R. C. (1973). Operant conditioning of infant behavior: A review. Advances in Child Development and Behavior, 8, 111–158. Hunt, J. M. (1961). Intelligence and experience. New York, NY: Ronald Press. Kaiser, A. P., & Trent, J. A. (2007). Communication intervention for young children with disabilities: Naturalistic approaches to promoting development. In S. L. Odom, R. H. Horner, M. E. Snell & J. Blacher (Eds.), Handbook of developmental disabilities (pp. 224–245). New York, NY: Guilford Press. Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53, 59–68. doi:10.1016/j.bushor.2009.09.003 Klingner, J. K., Ahwee, S., Pilonieta, P., & Menendez, R. (2003). Barriers and facilitators in scaling up research-based practices. Exceptional Children, 69, 411–429. Lancioni, G. E. (1980). 
Infant operant conditioning and its implications for early intervention. Psychological Bulletin, 88, 516–534. Lochman, J. E. (2006). Translation of research into interventions. International Journal of Behavioral Development, 30, 31–38. doi: 1177/0165025406059971 Mahoney, G., & Nam, S. (2011). The parenting model of developmental intervention. International Review of Research in Developmental Disabilities, 41, 74–118. doi:10.1016/ B978-0-12-386495-6.00003-5 McDonald, S.-K., Keesler, V. A., Kauffman, N. J., & Schneider, B. (2006). Scaling-up exemplary interventions. Educational Researcher, 35(3), 15–24. McLean, M. E., Snyder, P., Smith, B. J., & Sandall, S. R. (2002). The DEC recommended practices in early intervention/early childhood special education: Social validation. Journal of Early Intervention, 25, 120–128. McLennan, J. D., Wathen, C. N., MacMillan, H. L., & Lavis, J. N. (2006). Research-practice gaps in child mental health. Journal of the American Academy of Child and Adolescent Psychiatry, 45, 568–665. doi:10.1097/01.chi.0000215153.99517.80 Nievar, M. A., & Becker, B. J. (2008). Sensitivity as a privileged predictor of attachment: A second perspective on De Wolff and van Ijzendoorn’s meta-analysis. Social Development, 17, 102–114. Odom, S. L., Brantlinger, E., Gersten, R., Horner, R. H., Thompson, B., & Harris, K. R. (2005). Research in special education: Scientific methods and evidence-based practices. Exceptional Children, 71, 137–148.
Research to Practice
195
Odom, S. L., & Strain, P. S. (2002). Evidence-based practice in early intervention/early childhood special education: Single-subject design research. Journal of Early Intervention, 25, 151–160. Odom, S. L., & Wolery, M. (2003). A unified theory of practice in early intervention/early childhood special education: Evidence-based practices. Journal of Special Education, 37, 164–173. Orelena Hawks Puckett Institute (Producer). (2005). Games for growing: Teaching your baby with early learning games [Multimedia]. Asheville, NC: Winterberry Press. Raab, M., & Dunst, C. J. (2007). Influence of child interests on variations in child behavior and functioning (Winterberry Research Syntheses Vol. 1, No. 21). Asheville, NC: Winterberry Press. Raab, M., & Dunst, C. J. (2009). Magic seven steps to responsive teaching: Revised and updated (Winterberry Practice Guides). Asheville, NC: Winterberry Press. Renninger, K. A. (1998). The roles of individual interest(s) and gender in learning: An overview of research on preschool and elementary school-aged children/students. In L. Hoffman, A. Krapp, K. A. Renninger, & J. Baumert (Eds.), Interest and learning: Proceedings of the Seeon Conference on Interest and Gender (pp. 165–174). Kiel, Germany: IPN. Rogers, E. M. (1995). Diffusion of innovations (4th ed.). New York, NY: Free Press. Rohrbach, L. A., Grana, R., Sussman, S., & Valente, T. W. (2006). Type II translation: Transporting prevention interventions from research to real-world settings. Evaluation and the Health Professions, 29, 302–333. Sandall, S., McLean, M. E., & Smith, B. J. (Eds.). (2000). DEC recommended practices in early intervention/early childhood special education. Longmont, CO: Sopris West. Schwartz, I. S. (1996). Reaction from the field: Expanding the zone: Thoughts about social validity and training. Journal of Early Intervention, 20, 204–205. Shonkoff, J. P., & Phillips, D. A. (Eds.). (2000). From neurons to neighborhoods: The science of early childhood development. 
Washington, DC: National Academy Press. Simeonsson, R. J., Cooper, D. H., & Scheiner, A. P. (1982). A review and analysis of the effectiveness of early intervention programs. Pediatrics, 69, 635–641. Retrieved from http://pediatrics.aappublications.org/ Smith, M. C., & DeFrates-Densch, N. (Eds.). (2008). Handbook of research on adult learning and development. New York, NY: Routledge. Steyaert, J. (2011). Scholarly communication and social work in the Google era. Journal of Social Intervention: Theory and Practice, 20(4), 79–94. Retrieved from http://www. journalsi.org/index.php/si/article/view/288 Strain, P. S., Barton, E. E., & Dunlap, G. (2012). Lessons learned about the utility of social validity. Education and Treatment of Children, 35, 183–200. doi:10.1353/etc.2012.0007 Sullivan, M. W., & Lewis, M. (1990). Contingency intervention: A program portrait. Journal of Early Intervention, 14, 367–375. Swanson, J., Raab, M., & Dunst, C. J. (2011). Strengthening family capacity to provide young children everyday natural learning opportunities. Journal of Early Childhood Research, 9, 66–80. doi:10.1177/1476718X10368588 Tjossem, T. D. (Ed.). (1976). Intervention strategies for high risk infants and young children. Baltimore, MD: University Park Press. Trivette, C. M. (2007). Influence of caregiver responsiveness on the development of young children with or at risk for developmental disabilities (Winterberry Research Syntheses Vol. 1, No. 12). Asheville, NC: Winterberry Press.
196
CAROL M. TRIVETTE AND CARL J. DUNST
Trivette, C. M., Dunst, C. J., Hamby, D. W., & Pace, J. (2007). Evaluation of the Tune In and Respond tool kit for promoting child cognitive and social-emotional development (Winterberry Research Perspectives Vol. 1, No. 7). Asheville, NC: Winterberry Press. Trivette, C. M., Dunst, C. J., Masiello, T., Gorman, E., & Hamby, D. W. (2009). Social validity of the Center for Early Literacy Learning parent practice guides. CELLpapers, 4(1), 1–4. Retrieved from http://www.earlyliteracylearning.org/cellpapers/cellpapers_v4n1.pdf Trivette, C. M., Raab, M., & Dunst, C. J. (2012). An evidence-based approach to professional development in Head Start classrooms. NHSA Dialog, 15, 41–58. doi:10.1080/ 15240754.2011.636489 Trivette, C. M., Simkus, A., Dunst, C. J., & Hamby, D. W. (2012). Repeated book reading and preschoolers’ early literacy development. CELLreviews, 5(5), 1–13. Retrieved from http://www.earlyliteracylearning.org/cellreviews/cellreviews_v5_n5.pdf Trochim, W., Kane, C., Graham, M. J., & Pincus, H. A. (2011). Evaluating translational research: A process marker model. Clinical and Translational Science, 4, 153–162. doi:10. 1111/j.1752-8062.2011.00291.x Turan, Y., & Meadan, H. (2011). Social validity assessment in early childhood special education. Young Exceptional Children, 14(3), 13–28. doi:10.1177/1096250611415812 Turnbull, A. P., Summers, J. A., Gotto, G., Stowe, M. J., Beauchamp, D., Klein, S., & Zuna, N. (2009). Fostering wisdom-based action through Web 2.0 Communities of Practice: An example of the Early Childhood Family Support Community of Practice. Infants and Young Children, 22(1), 54–62. doi:10.1097/01.IYC.0000343337.72645.3f Warren, S. F., & Brady, N. C. (2007). The role of maternal responsivity in the development of children with intellectual disabilities. Mental Retardation and Developmental Disabilities Research Reviews, 13, 330–338. doi:10.1002/mrdd.20177 White, P. J., O’Reilly, M., Streusand, W., Levine, A., Sigafoos, J., Lancioni, G., & Aguilar, J. 
(2011). Best practices for teaching joint attention: A systematic review of the intervention literature. Research in Autism Spectrum Disorders, 5, 1283–1295. doi:10.1016/j.rasd. 2011.02.003 Wolery, M. (1994). Instructional strategies for teaching young children with special needs. In M. Wolery & J. S. Wilbers (Eds.), Including children with special needs in early childhood programs (pp. 119–140). Washington, DC: National Association for the Education of Young Children. Woolf, S. H. (2008). The meaning of translational research and why it matters. Journal of the American Medical Association, 222, 211–213. doi:10.1001/jama.2007.26 Zucker, D. R. (2009). What is needed to promote translational research and how do we get it? Journal of Investigative Medicine, 57, 468–470. doi: 10.231/JIM0b013e31819824d8
CHAPTER 9
EFFECTIVE EDUCATIONAL PRACTICES FOR CHILDREN AND YOUTH WITH AUTISM SPECTRUM DISORDERS: ISSUES, RECOMMENDATIONS, AND TRENDS
Richard Simpson and Stephen Crutchfield
ABSTRACT
Identification, implementation, and ongoing evaluation of scientifically supported and effective-practice methods are fundamental and crucial elements of an effective educational program for all children and youth, including learners with Autism Spectrum Disorders (ASD). To be sure, there is an unambiguous link between use of interventions and supports that have empirically supported merit and positive school and post-school outcomes. This chapter examines progress in the wide-scale adoption of evidence-based methods with learners with ASD. We also offer recommendations for advancing effective-practice initiatives and discuss trends and themes connected to effective-practice use with students with ASD.
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 197–220
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026011
By any account, interest in topics linked to Autism Spectrum Disorders (ASD) is extraordinary. Books, magazine and journal articles, and various forms of media routinely feature and explore matters connected to ASD. This attention is not exclusively intended to satisfy the interests and appetites of professionals; parents and families as well as the general public are regularly showered with ASD-related stories and information. The widespread interest in ASD is understandable and deserved. ASD is an extraordinary disability; its perplexing nature and the myriad challenges with which it is associated make it understandable that this particular disability would be of such interest to so many. One obvious reason for the notable attention that ASD is getting is its dramatic prevalence increase (Centers for Disease Control and Prevention, 2012a). The Centers for Disease Control and Prevention (CDC) (2012a) recently estimated that approximately 1 in 88 children (and approximately 1 in 54 boys) falls on the autism spectrum. This remarkable prevalence estimate, of course, means that more and more families and communities are experiencing the challenges associated with ASD. Only a few years earlier the CDC offered a prevalence estimate of 1 in 150 children (Centers for Disease Control and Prevention, 2008). This significant and ongoing increase is especially sobering when the current numbers are compared to earlier prevalence estimates (consider, e.g., that in 1966 Lotter estimated that the frequency of autism was 4–5 per 10,000, or 1 in 2,000–2,500, vs. the current 1 in 88 and 1 in 54 statistics). Interest in issues related to ASD unmistakably reflects the significant impact of autism-related disabilities. 
Not only is ASD currently more common than Down syndrome, juvenile diabetes, and childhood cancer (Centers for Disease Control and Prevention, 2012a), it is increasingly having a profound impact on a wide sector of society (Simpson & Myles, 2008; White, Smith, Smith, & Stodden, 2012).
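The prevalence figures quoted above are stated in two different forms ("1 in N" and "per 10,000"), which makes them hard to compare at a glance. As a rough illustration only, the short Python sketch below puts them on a common scale. The helper names `per_10000` and `one_in` are hypothetical and are not drawn from the CDC or from Lotter; the inputs are simply the estimates quoted in the text.

```python
# Put prevalence estimates quoted in the text on a common scale.
# Helper names are illustrative assumptions, not from any cited source.

def per_10000(one_in_n: float) -> float:
    """Cases per 10,000 for a prevalence stated as '1 in N'."""
    return 10_000 / one_in_n


def one_in(cases_per_10000: float) -> float:
    """The N in '1 in N' for a prevalence stated as cases per 10,000."""
    return 10_000 / cases_per_10000


# Estimates as quoted in the chapter (label -> N in "1 in N").
estimates = {
    "CDC 2012a, all children": 88,
    "CDC 2012a, boys": 54,
    "CDC 2008": 150,
}

for label, n in estimates.items():
    print(f"{label}: 1 in {n} = {per_10000(n):.1f} per 10,000")

# Lotter's 1966 range of 4-5 per 10,000, restated as "1 in N".
print(f"Lotter 1966: 1 in {one_in(5):.0f} to 1 in {one_in(4):.0f}")
```

On this common scale the 1 in 88 estimate corresponds to roughly 114 per 10,000, compared with Lotter's 4–5 per 10,000, which is on the order of 23 to 28 times the 1966 figure. The comparison is purely arithmetic and says nothing about why the estimates differ.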
To be sure, families, schools, communities, and virtually every program that purports to assist individuals and families with special needs are experiencing the force and consequences of ASD. Not surprisingly there have been ongoing attempts to identify the cause or causes of the exponential prevalence increase in ASD (Centers for Disease Control and Prevention, 2012b; Young, Grier, & Grier, 2008) as well as myriad personal statements, viewpoints, theories, and suppositions about why so many individuals are receiving an ASD diagnosis (Oller & Oller, 2010). Environmental agents such as mercury, vaccinations, and various toxins have long been debated as possible causes for ASD (Waly et al., 2004). The Centers for Disease Control and Prevention (2008) and the majority of professionals and professional organizations have rejected childhood vaccinations as a cause for all but a few cases of autism, yet
forceful debate over the connection between vaccinations and ASD continues. Equally predictable as a response to the challenges associated with ASD are multiple and varied attempts to identify and propose interventions, treatments, and other supports (hereafter we will collectively refer to these as “interventions”) that will purportedly improve outcomes for individuals identified with an ASD (National Professional Development Center on Autism Spectrum Disorders, 2010). ASD stakeholders have had highly diverse impressions and opinions about the interventions that hold the most promise and that bode best for individuals with ASD. Of course this variability commonly occurs relative to intervention programs for all disorders, especially in instances where causes are either unknown or not clearly understood. In the case of ASD this phenomenon is even more evident because of the inherently enigmatic nature of the disorder. Individuals diagnosed with ASD commonly have intellectual abilities and cognitive assets that are extremely variable and difficult to estimate reliably (including, on occasion, savant skills and gifted-level intellectual abilities). The atypical and highly individualized behavioral, social, and communication characteristics and patterns that form the inconsistent syndrome of autism add to its mystery as well. These elements create conditions wherein variable and in some instances arguably extreme and zealous interventions are considered reasonable and legitimate (Biklen, 1993; Wheeler, Jacobson, Paglieri, & Schwartz, 1993). Adding fuel to the fire of significant disagreements over which interventions are best are reports of mysterious and even inexplicable remedies and “cures” for autism (Iovannone, Dunlap, Huber, & Kincaid, 2003; Oller & Oller, 2010; Simpson, 2005; Volkmar, Cook, & Pomeroy, 1999). 
Differences of opinion and vociferous debate related to which interventions are most effective have long been a prominent part of the autism landscape (Gresham, Beebe-Frankenberger, & MacMillan, 1999; Simpson, 2008; Simpson, Mundschenk, & Heflin, 2011) and there is little reason to think this pattern will change anytime soon. Contributing to the continuing struggle to identify maximally effective methods and the polarizing opinions and perspectives concerning autism is a lack of practical information and well-designed guidelines that professionals and other stakeholders can use to select interventions that are most appropriate for individual children and youth diagnosed with ASD. Relative to the aforementioned conditions and circumstances this chapter discusses issues and matters associated with identifying and using with fidelity the most effective, practical, and cost-efficient interventions and supports for children and youth with ASD.
We also discuss intervention strategies that we think are most apt to be used in future years. Understanding these trends provides a way of preparing for implementation and evaluation of emerging methods and a structure for understanding and evaluating interventions relative to evidence-based standards.
EFFECTIVE PRACTICES AND AUTISM SPECTRUM DISORDERS
Promotion of so-called educational effective practices, including evidence-based methods and practices, scientifically supported interventions, and research-validated methods in support of learners with ASD, tracks in a similar manner to the effective-practice movement in medicine. Sackett, Rosenberg, Muir Gray, Haynes, and Richardson (1996), pioneers in the advancement of medically focused evidence-based practices, described evidence-based practices as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients. The practice of evidence-based medicine means integrating individual clinical expertise with the best available external clinical evidence from systematic research” (p. 71). Other disciplines, including behavioral sciences and psychology, have also endorsed and participated in the effective-practice movement. The American Psychological Association (2005), relative to their discipline, described evidence-based methods as the integration of research-supported methods with clinical expertise in the context of patient preferences, characteristics, and culture. In a similar fashion leaders and policy makers in education have called for use of scientifically supported methods. Nuanced distinctions and differences among the terms linked to the evidence-based movement have been made, including scientifically supported approaches, evidence-based practices, and research-validated methods (Fixsen, Naoom, Blase, Friedman, & Wallace, 2005). Generally speaking, however, each of these terms refers to methods and practices that have been shown to be efficacious based on objective empirical research. The common features among evidence-based methods include a reliable and scientifically valid evaluation or research design, clearly explained procedures, and scientifically supported evaluation methods (Cook, Tankersley, & Landrum, 2009). 
National (United States) policies in support of using scientifically supported methods have also significantly advanced the use of effective methods, including with learners with ASD. The No Child Left Behind
(NCLB, 2001) Act, in particular, requires that educators base their programs and teaching on scientifically supported research. Such methods are those that are supported by “rigorous, systematic, and objective procedures to obtain reliable and valid knowledge relevant to education activities and programs” (NCLB, 20 U.S.C. § 7801(37)). That is, scientifically supported and evidence-based methods, per NCLB-defined criteria, are those that have reliably and objectively demonstrated the capacity to produce positive outcomes. Under the Education Sciences Reform Act of 2002, the Institute of Education Sciences (IES) was created to develop definitions and evaluation methods associated with evidence-based educational practices (Institute of Education Sciences, 2009). IES subsequently established the What Works Clearinghouse (WWC) to help educators, policymakers, and other stakeholders integrate and use scientifically based research methods as the foundation for their educational practices. The WWC developed, adopted, and until only recently sanctioned “evidence standards” that strongly favored randomized control-group research, a research design that is infrequently used to support ASD practices. As a result there are relatively few listings of scientifically valid methods for children and youth with ASD on the U.S. Department of Education’s IES WWC webpage (http://ies.ed.gov/ncee/wwc). Thus, in spite of good intentions and considerable hype, federal programs related to promoting the identification and use of IES-defined effective methods with students with ASD, as compared to other categorical groups that have been targeted for particular attention, have yielded minor results, at best. 
In spite of only modest gains in wide-scale identification and application of evidence-based methods there is growing sentiment within the ASD community that basing educational programs and decision-making on scientific merit is a noteworthy and necessary endeavor (National Professional Development Center on Autism Spectrum Disorders, 2010). To be sure, there appears to be an ever-increasing call among professional groups and stakeholders for clear-cut identification of interventions with the greatest potential for benefit, and for educators to use these methods as the foundation for their educational programs. Notwithstanding this positive movement, the field of ASD has a time-honored legacy, history, and reputation for tolerating and endorsing methods that have little in the way of effective-methods credentials. Clearly the effective-practice initiative in support of individuals with ASD remains a work in progress. Even when examined with the most optimistic lens the ASD effective-practice initiative and movement is in its infant stage of development.
A number of groups and individuals have been involved in vetting the myriad interventions claimed to be appropriate and utilitarian for students with ASD and identifying those methods that have the strongest empirical and other forms of support. Early on in this process the National Research Council (2001) identified foundational effective-practice components recommended for all educational programs that serve young children with autism (e.g., early entry into a structured intervention program, active engagement in an intensive instructional program for the equivalent of a full school day, adequate adult attention in one-to-one or small group instruction to address individualized goals). Other groups have also attempted to identify interventions with the strongest forms of scientific support and thus ostensibly with the highest probability of producing positive outcomes. These efforts have typically used different evaluation procedures and criteria, hence effective-practice identification attempts have, not surprisingly, resulted in different outcomes. For instance Simpson et al. (2005) categorized and evaluated 33 commonly used interventions for students with ASD. The methods were evaluated on the basis of: (a) outcomes and results; (b) qualifications of persons implementing the methods; (c) how, where, and when the methods were best used; (d) potential risks associated with the methods; (e) costs associated with the methods; and (f) recommendations for assessing each method’s effectiveness. These factors were used to classify the 33 methods as: (a) scientifically based (methods with “significant and convincing empirical efficacy and support,” p. 9); (b) a promising practice (“efficacy and utility with individuals with ASD,” p. 
9, although additional objective empirical verification was needed); (c) a practice that had limited supporting data or information (lacked convincing scientific evidence, yet possessed potential as an intervention or treatment for learners with ASD); or (d) a non-recommended method (interventions that lacked supporting scientific evidence and methods that had detrimental potential, such as facilitated communication). A more recent effort to evaluate the relative scientific merits of interventions for learners with ASD was undertaken by the National Autism Center (2009). This organization’s efforts were intended to provide information that would permit professionals, families, and other stakeholders to make prudent, objective, and scientifically based intervention choices for children and youth with ASD. An expert panel used a structured, well-controlled, and objective protocol to review over 700 intervention studies involving individuals with ASD. These procedures resulted in the National Autism Center adopting four major ASD intervention classification categories: Established, Emerging, Unestablished,
and Ineffective/Harmful. This effort has clearly advanced the ASD effective-practice movement. However the National Autism Center’s dissemination product has only modest and relatively unremarkable utility and practical worth for practitioners and parents. The identified Established methods lack precision (e.g., “antecedent package”); and the Emerging category treats as equally effective relatively well-researched methods such as the Picture Exchange Communication System (PECS) and unproven methods such as massage and touch therapy. It is also significant that no methods were classified as Ineffective/Harmful. Thus, even the widely denounced and potentially harmful facilitated communication method (Biklen, 1993; Wheeler et al., 1993) is classified as an Unestablished strategy rather than receiving the more forceful, and in our judgment accurate, classification of Ineffective/Harmful. The National Professional Development Center on Autism Spectrum Disorders (U.S. Department of Education Office of Special Education Programs, 2010) has also been a high-profile contributor to the ASD effective-practice movement. Participants in this project reviewed approximately 360 research studies published between 1997 and 2007. Twenty-four (24) evidence-based practices were identified through this effort: prompting, stimulus control/environmental modification, time delay, differential reinforcement, discrete trial training, extinction, functional behavior assessment, functional communication training, reinforcement, response interruption/redirection, task analysis and chaining, video modeling, naturalistic interventions, parent implemented interventions, peer mediated interventions, PECS, pivotal response training, self-management, social narratives, social skills interventions, speech generating devices, structured work systems, computer-aided instruction, and visual supports. 
There is considerable variability among these so-called evidence-based methods and clearly some of the methods lack precision and clarity (e.g., parent and peer implemented interventions); however this effort has been an important initial step in identifying methods with the most potential to positively affect outcomes for students with ASD and their families.
A MODEL FOR FUNCTIONAL AND EFFECTUAL EFFECTIVE-PRACTICE APPLICATION
Advancement and clarification of the selection, implementation, and evaluation of maximally effective ASD interventions and support methods require basic and foundational principles, key building blocks, and
organizational pillars. These elements, as described below, guide decisions and practices in support of students with ASD. First, utilitarian and functional decision-making and implementation of effective-practice interventions require structured and consistent environments and individualized curricula and programs. These basic elements serve as the groundwork for applying effective practices by fundamentally addressing and providing supports for prominent and significant characteristics of ASD. Simply stated, programs for students with ASD require a foundation of structure and other basic components, such as clearly identified expectations, visual and physical supports, routines and predictability, and consistently applied methods among staff. Without such elemental components even the most effective methods will lead to less than fully successful outcomes. Furthermore, without strong and structured foundational components it will be virtually impossible to make objective and rational judgments about the potential utility and efficacy of specific interventions. Building on the foundation of a structured and consistent environment, individualized curricula and programs, and competent and committed educators and support staff, our recommended effective-practice decision-making and application model has three primary components: (a) commitment to selecting, using with fidelity, and evaluating scientifically supported methods; (b) thoughtful and deliberate consideration of the unique and individualized needs of learners with ASD; and (c) resolute commitment among stakeholders to creating and employing a collaborative and dynamic decision-making process. The schema that follows illustrates and describes these elements relative to identifying and applying effective practices. As reflected in Fig. 1, the intersection of the three model components identifies methods that are most apt to deliver the most positive outcomes for particular students.
Foundation for Effective Practices
Unquestionably children and youth with ASD require structure and organization in order to learn and progress. Although they are necessary prerequisites for applying effective practices, it is highly unlikely that basic structure and organization by themselves will be sufficient to ensure that students with ASD receive a maximally effective education. That is, structure and organization are key foundational components and preconditions needed to create an orderly and systematic educational setting that will permit the conditions for functional application of effective-practice
A Model for Selecting Evidenced-Based Practices
Selection of Evidenced Based Practices and supports judged to be best suited for individual students
Unique and individualized needs of learners with ASD
Scientifically supported interventions, treatments, and support methods
Stakeholder’s collaborative and dynamic decision making process
Foundations of Effective Practices Examples of foundational supports include: Consistent Environment Programs Well Qualified, Committed Professionals and Support Staff Individualized Curricula and Program
Fig. 1.
A Model for Selecting Evidenced-Based Practices.
interventions and supports. Bluntly speaking, effective practices cannot be efficiently and successfully implemented in settings lacking structure and organization. In a parallel fashion, structure and organization without the additional use of appropriately designed and individualized interventions and other supports will almost certainly lead to less than optimal outcomes. Structured environments offer physical and psychological organization for children and youth, particularly for those diagnosed with ASD, by building consistency and routines and by creating conditions that permit students to anticipate task requirements and understand expectations. Without the foundation of structured settings even the most robust interventions will be ineffective, or at best less than optimally utilitarian. Equally important, objective and reliable evaluation of the comparative effects of individualized interventions relative to students’ personalized goals and objectives will be difficult or impossible to achieve. Elements linked to structured classrooms include clearly identified and easily understood rules and expectations, resources that permit systematic and ongoing monitoring of performance and appropriate performance
support, and visual supports to build on the perceived visual processing strength of individuals with ASD (Ganz, 2007). Ganz observed that visual support tools can be used to facilitate students’ independence and reduce their need for adult prompts and correction feedback. In a similar fashion organizational classroom techniques such as visual supports and visual boundaries, physical borders such as room dividers and floor tape to identify classroom areas for designated activities, color coding, and pictures and labels to organize classroom materials are structuring techniques that appear to be well suited for learners with ASD and thus bode well for their success in a variety of settings, including school, home, and community (Ganz, 2007; National Research Council, 2001; Scheuermann & Webber, 2002; Schopler, Mesibov, & Hearsey, 1995).
Finally, and most importantly, a sound foundational program requires teachers and educational support staff who have been trained to understand and implement evidence-based practices and who are unequivocally committed to providing an effective education to all learners. The effective-practice movement, to be sure, will have influence, capacity, and capability only to the extent that personnel understand and individually apply scientifically supported interventions. This will require professionals who not only are able to design and implement appropriately structured learning settings but also who are committed and able to discriminate among available evidence-based methods. Clearly suitable human resources are the key fundamental cog in the evidence-based machinery: these professionals and support staff are the key elements needed to competently and objectively select, implement with fidelity, and evaluate interventions and treatments that are most appropriate for individual students and their unique situations and needs.
As shown in Fig. 1, the three elements of our effective-practice decision-making and application model – (a) commitment to selecting, using with fidelity, and evaluating effective practices; (b) deliberate attention to the unique and individualized needs of children and youth with ASD; and (c) stakeholders’ respectful and inclusive commitment to use a collaborative and dynamic decision-making process – build on an elemental foundation of structure, organization, and competent faculty and support staff.
Commitment to Scientific Methods
Applying a “best-practice” structure requires that stakeholders commit to choosing and using with fidelity methods that have objectively and
scientifically been shown to have the most capacity to consistently produce the best and maximum socially valid outcomes. Unmistakably, some methods are superior to others in producing empirically and scientifically validated outcomes for learners with ASD (National Autism Center, 2009; Simpson et al., 2005). This fact should be a principal and ever-present part of the intervention decision-making process, although, as illustrated in Fig. 1, it is not the sole decision-making consideration. Methods and procedures with objective scientific credentials bode best for children and youth with ASD and thus, in general, these are the interventions that should underpin programs for learners with autism-related disorders (National Professional Development Center on Autism Spectrum Disorders, 2010).
It is also clear that there are certain methods that are either unproven or that appear to have limited potential to produce positive gains, or have potential to do harm (Simpson et al., 2011). In our opinion these are options that stakeholders should be extremely prudent and conservative in considering for use. If and when these options are considered we recommend stakeholders objectively vet and compare, using extant scientific documents, the long-term potential advantages and disadvantages of using “unconfirmed” methods as opposed to more established interventions. In sum, this part of the effective-practice programming model for students with ASD accentuates unbiased and rational judgment and a willingness to make decisions based on students’ best interests rather than ideology, personal philosophy, convenience, impulsive thinking, or other considerations.
Consideration of the Individualized Needs of Learners with Autism Spectrum Disorders
The second basic element of our effective-practice model highlights the need for a variety of effective practices that will serve a range of individual needs present among children and youth with ASD diagnoses. Individuals with ASD are characteristically unique and consequently variable in features and performance to an extent that a particular student’s needs and qualities (as well as parent/family, community, and school factors) need to be fully and carefully considered in order to best determine those interventions that are most suitable. We contend that it is neither appropriate nor acceptable to recommend universally most-appropriate specific interventions without considering their suitability for individual learners and without carefully
and individually tailoring these methods to be the correct fit for unique student needs. Children and youth with ASD are heterogeneous and display a range of characteristics and functioning levels. Consider, for example, that children diagnosed with Asperger disorder have significantly different needs when compared to children with classic autism; and individual children with diagnoses of Asperger disorder or autistic disorder will each have variable specific needs and unique traits and features. At the same time, notwithstanding these specific individual attributes, children and youth on the autism spectrum share basic characteristics. Across the ASD spectrum learners can be expected to experience social skill and social interaction deficits as well as speech, language, and communication problems. Behavioral anomalies and quirks are also prominent among individuals with ASD. Some individuals with ASD will compulsively demand routines and environmental consistency, and perseverative and other unusual and problematic behaviors are common. Independent of their intellectual and cognitive assets and language abilities, and even if students with ASD exhibit developmentally advanced skills and abilities, they generally experience learning difficulties. Thus, even with recognition of the variable and idiosyncratic specific characteristics of children and youth with ASD, these individuals will be linked by basic features – albeit the exact and explicit form these characteristics will manifest in is highly idiosyncratic. These basic characteristics will dictate many of the general program needs of children and youth with ASD. Yet, programs for learners with ASD that consistently lead to the best outcomes are ones that are crafted by informed stakeholders to offer individualized and nuanced forms of interventions and supports. 
As a result, stakeholders involved in designing, implementing, and evaluating programs for learners with ASD must competently address the generic core elements of ASD (i.e., social, communication, behavioral, learning), albeit via use of an individualized-application lens. As reflected in Fig. 1, the needs of learners with ASD are best met as a result of using scientifically supported and empirically justified methods. Because there are no universally most-suitable methods, however, interventions need to be individually crafted. It is obvious that not all interventions for learners with ASD have equal value; and we think it is obvious that scientifically supported options have the most potential to produce positive results. At the same time it is apparent that different children diagnosed with ASD will respond differently to methods; and without question some interventions will be better suited for some students than others. It is thus mandatory that educators and stakeholders be prepared to use a variety of
methods with learners diagnosed with ASD. Simpson et al. (2011) argued that it is essential that the ASD community not be seduced into believing that there is only one or perhaps a few methods that must universally be used in the same exact fashion with every child and youth with an autism-related disability while at the same time monitoring that those methods offered as effective have strong effective-practice credentials. This balance is needed to make sure that there is sufficient variety to meet the variable needs of students diagnosed with ASD while at the same time ensuring that those methods that are advanced as having passed the effective-practice test have demonstrated objective capacity to produce socially-valid positive changes when used with fidelity. (p. 12)
We also think it is important that delineations of the salient features of effective ASD methods not be overly restrictive. It is abundantly clear that a variety of research methodologies can be used to reliably and scientifically validate practices, including single-case designs and group experiments. Respected colleagues have identified guidelines for establishing evidence-based status using a variety of research methods (Cook, Tankersley, & Landrum, 2009; Odom et al., 2005). We strongly support the use of these various approaches to vet the utility and capacity of methods for students with ASD.
Stakeholder Commitment to a Collaborative and Dynamic Decision-making Process
A third element connected to using our effective-practice model is a collaborative and dynamic decision-making process. We are of the strong opinion that it is vital that the perspectives, preferences, and judgments of major stakeholders be considered. Thus we think it is essential that the perspectives of a range of professionals as well as parents be considered relative to identifying and using with fidelity scientifically supported methods that best fit the unique characteristics of individual students with ASD. We also hasten to add that whenever possible individuals with ASD should participate in this process. Such a collaborative approach increases the likelihood that instructional decisions are made as shared group matters rather than by individual judgments. Of course this collaborative decision-making formula requires and is built on the foundation of knowledgeable and well-informed professionals and other stakeholders, a steadfast sensitivity to cultural factors, and an understanding of unique community and setting variables. This collaborative
teaming process appears to be particularly important because it enables and supports collectively preferred and individualized evidence-based programs. This process also supports both professionals and parents/families by creating a forum wherein all stakeholders provide input within their areas of expertise. Finally this process supports shared responsibility for choosing the most suitable interventions and strategies for individual students (National Autism Center, 2009).
Friend and Cook (2007) observed that “collaboration has been a central element of successful education for many years” (p. xv). Indeed, support for collaboration and shared decision-making authority is well recognized and endorsed within the profession, including via policy enactments (see, e.g., language supporting team collaboration in the Individuals with Disabilities Education Improvement Act (IDEA), 2004, 20 U.S.C., § 665(b)(2)(G)). By design and definition collaboration accentuates shared responsibility and shared decision making among educators, related services personnel, and parents and families. Responsibility for instructional planning and accountability for students’ outcomes, growth, and development are thus shared rather than having professional staff assume total responsibility and authority.
Interpretations of the markers and characteristics of utilitarian and efficient collaboration in programs that serve learners with ASD vary; however, characteristics of such programs often include:
• having regularly scheduled collaborative team meetings open to multiple stakeholders, including parents and community personnel;
• having designated professional staff team leaders, who are personally knowledgeable of students’ needs and who are responsible for monitoring and reporting students’ instructional programs and progress, participate in meetings;
• providing training, supervision, and communication support that ensures that all school staff implement students’ programs in accordance with agreed-upon protocol;
• providing training for parents and families that permits them to correctly apply interventions and treatments in home and community settings;
• having a problem solving and dispute resolution process in place to respond to disagreements and related issues;
• ensuring that there are resources and a structure for appropriate follow-up actions on decisions that are made at stakeholder meetings. (Hourcade & Umbarger, 2012; National Research Council, 2001; Simpson & Mundschenk, 2010)
GUIDELINES FOR ADOPTING AND USING AN EFFECTIVE-PRACTICE MODEL
Implementation of the effective-practice model (see Fig. 1) is facilitated by use of questions that guide the diverse and individualized perspectives of stakeholders and students with ASD (Freeman, 1997; Heflin & Simpson, 1998; Simpson, 2008; Simpson, McKee, Teeter, & Beytien, 2007). We offer three questions to assist stakeholders in choosing interventions and related supports best suited for individual learners.
1. What proof supports purportedly effective interventions?
2. How will a selected intervention be evaluated?
3. To what extent does an intervention fit an individual learner’s unique needs?
Question 1 focuses the attention of stakeholders on the scientific and objective evidence supporting a method. Stakeholders thus consider the extent to which participants in research studies objectively benefited from a method; and the degree to which one’s own students or learners are similar to the research participants. Implicit in this question is the requirement that stakeholders have the capacity to discriminate between research methods that possess scientifically valid and evidence-based credentials and methods and products lacking these characteristics. Consequently stakeholder groups should include members who have the experience and training to critically analyze the research methods used to examine interventions for learners with ASD under consideration; and the ability to interpret professional documents, peer-reviewed scientific journals, and reports that vet these methods. Stakeholders should also have the assets to discriminate objective and scientific research reports from the pseudoscience often found in non-peer-reviewed materials (e.g., anecdotally based web reports, marketing and promotional brochures) supported by personal testimony.
We also advise that methods that promise extraordinary and universal improvements that far exceed the outcomes reported for more tested strategies should be reviewed with caution and skepticism. Question 1 also focuses stakeholders on the degree to which the outcomes associated with an intervention align with the characteristics and needs of a student under discussion. Hence, independent of the reported evidence for a particular method is the matter of similarity between the individuals with whom an intervention was validated and the particular student being considered.
Question 2 asks stakeholders to consider the evaluation of interventions considered or adopted for use with an individual student. Included are the following issues: (a) What target behaviors will be measured as evidence of progress (e.g., social interaction initiations, verbal responses to teacher queries)? (b) Who will conduct the agreed-upon evaluations and how often will the interventions be evaluated? (c) What standards or criteria will be used to determine if an intervention should be continued or changed? Question 1 of the guiding process for using the recommended evidence-based decision model focuses stakeholders’ attention on methods that have objectively yielded positive outcomes. Question 2 advances and refines this process by having stakeholders evaluate the method or procedure with particular learners. Thus, independent of the purported benefits of a particular approach and its supporting credentials, question 2 reminds stakeholders that they must objectively assess individual students’ specific targets and outcomes relative to individualized needs. The essence of this element of the decision-assisting process is that independent of supporting research, interventions require ongoing objective evaluation.
Question 3 focuses on the qualitative merits and shortcomings of interventions under consideration. This item thus reminds stakeholders to carefully consider which strategies have the greatest potential to positively affect individual learners and align with their needs as well as the settings where they are served. Factors connected to question 3 relate to the perceived match of various interventions with the needs, values, social validity considerations, and life styles of individual students and families.
These variables assist stakeholders in understanding that, independent of the reported research that supports particular methods, qualitative factors (e.g., how a student’s learning style, personality, idiosyncratic preferences, and family circumstances might affect the application of an intervention or method) should be taken into account. By design question 3 also deals with social validity. Stakeholders are asked to include in their deliberations of potential interventions factors that may fall outside traditional efficacy research. Based on their roles, experiences, attitudes, individual circumstances, and so forth, stakeholders linked to students with ASD can be expected to have different ideas and perspectives. Thus discussions regarding social validity provide stakeholders a forum and an opportunity to discuss topics such as quality-of-life factors, perceived practical benefits of particular interventions, and students’ preferences and characteristics related to adopting certain methods. These discussions are not intended to replace stakeholders’ consideration of interventions based on empirical scientific variables but rather to broaden
the vetting standards by including informal and qualitative considerations as a part of the deliberation process. Question 3 discussions also allow stakeholders to consider negative side effects, challenging circumstances, or requirements associated with using a method, such as financial and quality of life risks for a student or family.
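The three issues raised under question 2 (what is measured, who evaluates and how often, and what criterion governs continuing or changing an intervention) can be recorded in a simple structure. The sketch below is only an illustration: the field names, the example values, and the decision rule (a minimum gain over a baseline mean) are assumptions made for the example, not criteria recommended in this chapter.

```python
from dataclasses import dataclass


@dataclass
class EvaluationPlan:
    """Hypothetical record of the three question 2 issues."""
    target_behavior: str           # (a) what will be measured
    evaluator: str                 # (b) who conducts the evaluation
    sessions_between_reviews: int  # (b) how often progress is reviewed
    minimum_gain: float            # (c) assumed criterion for continuing


def review(plan, baseline_mean, recent_scores):
    """Return 'continue' if the mean gain meets the plan's criterion, else 'change'."""
    recent_mean = sum(recent_scores) / len(recent_scores)
    gain = recent_mean - baseline_mean
    return "continue" if gain >= plan.minimum_gain else "change"


# Illustrative values only; a team would set these for an individual student.
plan = EvaluationPlan(
    target_behavior="social interaction initiations per session",
    evaluator="classroom teacher",
    sessions_between_reviews=5,
    minimum_gain=2.0,
)
print(review(plan, baseline_mean=1.0, recent_scores=[3, 4, 3, 4, 4]))  # continue
print(review(plan, baseline_mean=1.0, recent_scores=[1, 2, 1, 2, 2]))  # change
```

Writing the criterion down before data are collected is the point of the sketch; the particular rule a team adopts may look quite different.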
TRENDS IN EFFECTIVE-PRACTICE INTERVENTIONS AND SUPPORTS FOR LEARNERS WITH ASD
Identification of scientifically based and maximally effective methods for individuals with autism-related disabilities has been a front-page matter for the autism-focused community for a number of years and there is every indication that this pattern will continue into the foreseeable future. Clearly there are few issues that confront stakeholders in autism, especially those involved in educational matters, that are more important than identifying and using methods that hold the most promise and deliver the best outcomes for learners with ASD. Thus it should come as no great surprise that we predict that the effective-practice identification and vetting process will remain one of the preeminent issues in our field for years to come. Clearly the challenge is to understand the direction of the ever-changing and dynamic ASD effective-practice movement with enough precision and specificity to permit a reasonable and lucid interpretation of the movement’s direction and to logically identify those methods and supports that are on the present and distant horizon. Accurate trend analysis and predictions are indeed important because they permit ASD stakeholders – including practitioners, parents and family members, researchers, and policy makers – to begin with some measure of certitude the process of understanding and evaluating maximally effective methods that are most apt to be used in present and future classrooms.
Our analysis is based on an examination of the 2011–2012 extant literature. Specifically, in an effort to identify current research trends regarding effective practices for individuals with an ASD, we conducted a review of recent ASD intervention literature. Inclusion criteria for our review included the following elements: articles were published in peer-reviewed journals during the years 2011 and 2012.
We limited our review to scientific evaluations of intervention strategies or related methodologies wherein at least 50% of the participants were diagnosed with an ASD. Our goal in this process was not to conduct a comprehensive review of the current extant literature and to identify all 2011–2012 published research
findings on ASD intervention topics. Rather our intent was to objectively gather sufficient published literature information and data to make an accurate judgment about the current and future directions of intervention strategies for individuals with an ASD. Studies were identified by searching online databases in the social sciences with the keywords: ‘‘autism,’’ ‘‘autism spectrum disorders,’’ ‘‘Asperger syndrome,’’ ‘‘high functioning autism,’’ ‘‘treatment,’’ ‘‘communication,’’ ‘‘social skills,’’ and ‘‘behavior.’’ Other studies were identified through published systematic reviews of specific interventions or through manual searches of 2011–2012 editions of specific autism related scholarly journals (i.e., Education and Training in Autism and Developmental Disabilities, Focus on Autism and Other Developmental Disabilities). In total 79 articles were included in our review and analysis.
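The inclusion criteria described above (peer-reviewed publication in 2011 or 2012, a scientific evaluation of an intervention or related methodology, and at least 50% of participants diagnosed with an ASD) amount to a simple screening rule. The following sketch restates that rule; the record type and field names are hypothetical and are not drawn from any coding form used in the review.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record for one candidate article."""
    year: int
    peer_reviewed: bool
    evaluates_intervention: bool
    n_participants: int
    n_with_asd: int


def meets_inclusion_criteria(study):
    """Apply the screening rule as stated in the text."""
    if study.n_participants <= 0:
        return False
    return (
        study.year in (2011, 2012)
        and study.peer_reviewed
        and study.evaluates_intervention
        # "at least 50%" expressed without floating-point division
        and 2 * study.n_with_asd >= study.n_participants
    )


print(meets_inclusion_criteria(Study(2011, True, True, 4, 2)))   # True
print(meets_inclusion_criteria(Study(2012, True, True, 10, 3)))  # False
```

The second example is excluded because only 3 of 10 participants have an ASD diagnosis, which falls below the 50% threshold.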
Key Themes
We examined the aforesaid literature relative to identifying thematic components linked to ASD interventions. The specific attention in our structured review of the 79 articles focused on (a) participant characteristics, (b) target behaviors, and (c) specific interventions and intervention components. Based on this review process we identified three major intervention-related trends. First, school-age children and youth appear to be the most common recipients of interventions being researched with individuals diagnosed with ASD. Second, social competence development, social behaviors, and proactive social interaction skills are the most commonly identified intervention targets. Finally, the most frequently used interventions for individuals with ASD are based on technology and technological components. Each of these thematic trends is discussed below.
Participant Trends
Of the 79 studies we reviewed 58 (73%) focused on school age students (5–18 years of age). Nineteen ASD intervention studies (24%) involved young children, that is, boys and girls who were under 5 years of age. Two studies (3%) applied an intervention with an adult diagnosed with an ASD. As the current school-age population of individuals with ASD moves toward adult age we anticipate seeing more intervention studies that focus on more mature individuals. Our prediction is that the current evidence-based interventions will be the structural foundation for many of the practices used to address the needs of adults with ASD.
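As a quick arithmetic check, the participant counts reported above can be tallied against the total of 79 studies. The short sketch below uses only the counts given in the text; the rounding to whole percentages is an assumption about how the figures were reported.

```python
TOTAL_STUDIES = 79

# Counts as reported in the text.
participant_groups = {
    "school age (5-18 years)": 58,
    "under 5 years": 19,
    "adult": 2,
}


def share(count, total=TOTAL_STUDIES):
    """Percentage of reviewed studies, rounded to a whole number."""
    return round(100 * count / total)


for group, count in participant_groups.items():
    print(f"{group}: {count}/{TOTAL_STUDIES} = {share(count)}%")

# The three age groups account for every reviewed study.
print(sum(participant_groups.values()) == TOTAL_STUDIES)  # True
```

The three groups sum to 79, and the rounded shares reproduce the reported 73%, 24%, and 3%.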
The participants in the articles we reviewed were most likely to be described as students with Autism or an Autism Spectrum Disorder (56/79, 71%). Eleven studies (14%) specifically applied an intervention with students with high functioning autism, Asperger Syndrome, or Pervasive Developmental Disorder-Not Otherwise Specified and four studies (5%) specifically identified participants as having Autistic Disorder or Moderate/Severe Autism. Generally the study participants were broadly described as having an Autism Spectrum Disorder without more detailed information and data. It was uncommon for the studies to provide specific diagnostic classifications and supporting diagnostic information (e.g., specific defining characteristics such as intellectual functioning, expressive language scores, and results of behavioral measures).
Target Behavior Trends
Of the 79 studies we included in our research review 31 (39%) focused on social skills. These targets covered a wide range of social behaviors including social initiations and social bidding, emotional recognition, sibling and family interactions, play behaviors, joint attention, and general prosocial interactions with others. As discussed below a wide variety of instructional procedures were used to improve these social behaviors, including group instructional programs, video modeling, cognitive behavioral methods, pivotal response training, and theory of mind training. Other targets accounting for significant portions of the intervention research we found in our review included communication targets (13/79, or 16%; i.e., word acquisition, requesting, labeling, and speech outcomes). Behavioral targets (e.g., on-task behavior, appropriate behavioral responses, reduction in problem behaviors) were also the focus of 13 studies (16%).
Other targets, accounting for approximately 29% of the interventions in the studies reviewed, included eating problems such as food over-selectivity, sleep problems, anxiety, daily living/leisure themes, motor tasks, and skill-based/academic outcomes.
Intervention Trends
One relatively strong intervention trend that emerged from our review of recent research studies was the use of technology. Included were a variety of technology components and applications, including computers, handheld personal digital assistants (PDAs), computer tablets (e.g., iPads), smartphones, speech output devices, videos and teaching models provided in the form of video images, and virtual reality builders. Of the 79 studies included in our overview, 31 (39%) used some form of technology as a part of the
intervention package. Video modeling was the most common of these technological supports, accounting for over a quarter of all of the interventions in the studies we examined. We are not at all surprised by this finding and attribute this to the popularity and attention that video modeling is receiving as well as the relative ease and simplicity of creating and developing video models. What previously was a process that required advanced technological skills and unique equipment has become easily available to the masses. That video modeling has proven to result in positive outcomes also appears to contribute to its wide-scale use (Axe & Evans, 2012; Mechling & Ayers, 2012). Our review of 79 articles also revealed that 15% (12/79 studies) included group social skill interventions, 8% (6/79) included peer/sibling mentoring programs, 8% (6/79) included alternative or arguably unorthodox treatments (i.e., hyperbaric oxygen, weighted vests, dietary supplements), and 6% (5/79) included augmented or alternative communication (AAC) devices. Structured teaching, direct instruction, pivotal response training, social stories/scripts, and reinforcement systems were also represented in the articles we reviewed.
Over the past several years video files have almost exclusively been converted to a digital format. This change has made it relatively easy for the general public (including educators, parents, and others) to create high-quality videos and other images using an ever increasing array of user-friendly devices (e.g., smartphones, digital cameras, tablets, handheld high definition cameras). It appears that both practitioners and researchers are more and more harnessing these ubiquitous tools to support and teach learners with ASD by creating high quality models and images that explicitly show appropriate behaviors and teach a variety of skills. We expect this trend not only to continue but to accelerate in the future.
Of course the caveat and future challenge for using video modeling to support and teach individuals with ASD is to build on the preliminary findings of early-stage research and refine those methods and protocols in such a manner that will increase positive outcomes (Bellini & Akullian, 2007; Charlop-Christy, Le, & Freeman, 2000; McCoy & Hermansen, 2007). Based on the articles we reviewed it appears to us that video modeling is likely to be used in conjunction with, and as a delivery system for, more “established” interventions in the future. In fact we are increasingly seeing classroom personnel use video modeling in combination with other methods to leverage visual-learning strengths among some students with ASD. For example, we are more and more seeing teachers and related service professionals use video models to teach children how to exchange PECS icons for desired items and activities (Bondy & Frost, 1994) and for children
and adolescents involved in cognitive behavioral intervention programs to use technological supports and devices (e.g., PDAs and smart phones) to do self-monitoring. There is good reason to assume that the trend of using video modeling will continue with students with ASD, both as a stand-alone learning tool to address a variety of targets and as a delivery system for implementing other evidence-based and established interventions.
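The counts reported under Target Behavior Trends and Intervention Trends can be checked in the same way. The sketch below uses only the counts given in the text. The final step assumes that each study was counted under exactly one target category, which the text does not state; under that assumption the remaining "other targets" would number 22 of 79 studies, about 28%, close to the approximately 29% reported (a figure that also equals 100 minus the three rounded percentages).

```python
TOTAL_STUDIES = 79


def percent(count, total=TOTAL_STUDIES):
    """Share of the reviewed studies, rounded to a whole percent."""
    return round(100 * count / total)


# Counts as reported in the text.
target_counts = {"social skills": 31, "communication": 13, "behavioral": 13}
intervention_counts = {
    "technology": 31,
    "group social skill interventions": 12,
    "peer/sibling mentoring": 6,
    "alternative treatments": 6,
    "AAC devices": 5,
}

for label, count in {**target_counts, **intervention_counts}.items():
    print(f"{label}: {count}/{TOTAL_STUDIES} = {percent(count)}%")

# Assumption (not stated in the text): one target category per study.
other_targets = TOTAL_STUDIES - sum(target_counts.values())
print(f"other targets: {other_targets}/{TOTAL_STUDIES} = {percent(other_targets)}%")
```

The intervention categories are described as being "included" in studies, so they may overlap and are not summed here.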
CONCLUDING THOUGHTS
The challenges associated with providing high quality educational programs and support services for individuals diagnosed with ASD are momentous. Indeed the ever-increasing number of individuals who qualify for a spot on the so-called autism spectrum translates to an unambiguous demand that educators and other school-based professionals skillfully and efficiently deliver first-rate programs and services that result in satisfactory school and post-school outcomes. There are no easy solutions to this challenge, yet it is clear that identification and use of interventions and other supports that have objectively and empirically been shown to result in positive outcomes is an essential and indispensable part of the resolution equation. Important initial steps leading to wide-scale adoption and application of evidence-based methods with learners with ASD have been initiated, and without a doubt the field is making progress. However, the field is far from being where we need to be relative to identifying and applying effective practices. As described in this chapter, adoption of a protocol such as the one we illustrate in Fig. 1 is a step in the right direction leading to wide-scale identification and correct use of effective methods.
REFERENCES

American Psychological Association. (2005). Policy statement on evidence-based practice in psychology. Retrieved from http://www.apa.org/practice/resources/evidence/evidencebased-statement.pdf
Axe, J., & Evans, C. (2012). Using video modeling to teach children with PDD-NOS to respond to facial expressions. Research in Autism Spectrum Disorders, 6, 1176–1185.
Bellini, S., & Akullian, J. (2007). A meta-analysis of video modeling and video self-modeling interventions for children and adolescents with autism spectrum disorders. Exceptional Children, 73, 264–287.
Biklen, D. (1993). Communication unbound: How facilitated communication is challenging traditional views of autism and ability/disability. New York, NY: Teachers College Press.
Bondy, A., & Frost, L. (1994). The picture exchange communication system. Focus on Autistic Behavior, 9(3), 1–19.
Centers for Disease Control and Prevention. (2008). Autism information center. Retrieved from http://www.cdc.gov/ncbddd/autism/
Centers for Disease Control and Prevention. (2012a). Autism information center. Retrieved from http://www.cdc.gov/ncbddd/autism/
Centers for Disease Control and Prevention. (2012b). Prevalence of autism spectrum disorders: Autism and developmental disabilities monitoring network, 14 sites, United States, 2008. Morbidity & Mortality Weekly Report, March 30.
Charlop-Christy, M. H., Le, L., & Freeman, K. A. (2000). A comparison of video modeling with in vivo modeling for teaching children with autism. Journal of Autism and Developmental Disorders, 30, 537–552.
Cook, B. G., Tankersley, M., & Landrum, T. J. (2009). Determining evidence-based practices in special education. Exceptional Children, 75, 365–383.
Fixsen, D. L., Naoom, S. F., Blase, K. A., Friedman, R. M., & Wallace, F. (2005). Implementation research: A synthesis of the literature. Tampa, FL: University of South Florida, Louis de la Parte Florida Mental Health Institute, The National Implementation Research Network (FMHI Publication #231).
Freeman, B. J. (1997). Guidelines for evaluating intervention programs for children with autism. Journal of Autism and Developmental Disorders, 27, 641–651.
Friend, M., & Cook, L. (2007). Interactions: Collaboration skills for school professionals. Upper Saddle River, NJ: Pearson.
Ganz, J. (2007). Classroom structuring methods and strategies for children and youth with autism spectrum disorders. Exceptionality, 15, 249–260.
Gresham, F., Beebe-Frankenberger, M., & MacMillan, D. (1999). A selective review of treatments for children with autism: Descriptive and methodological considerations. School Psychology Review, 28, 559–575.
Heflin, L., & Simpson, R. (1998). Interventions for children and youth with autism: Prudent choices in a world of exaggerated claims and empty promises. Part I: Intervention and treatment review. Focus on Autism and Other Developmental Disabilities, 13, 194–211.
Hourcade, J., & Umbarger, G. (2012). Collaboration and cooperative teaching for system-wide change. In D. Zager, M. L. Wehmeyer & R. L. Simpson (Eds.), Educating students with autism spectrum disorders (pp. 3–12). New York, NY: Routledge.
Individuals with Disabilities Education Improvement Act of 2004. 20 U.S.C. 1400 et seq. (2004).
Institute of Education Sciences, U.S. Department of Education Institute of Educational Sciences. (2009). What Works Clearinghouse. Retrieved from http://ies.ed.gov/ncee/wwc/
Iovannone, R., Dunlap, G., Huber, H., & Kincaid, D. (2003). Effective educational practices for students with autism spectrum disorders. Focus on Autism and Other Developmental Disabilities, 18, 150–165.
Lotter, V. (1966). Epidemiology of autistic conditions in young children. Social Psychiatry, 4, 263–277.
McCoy, K., & Hermansen, E. (2007). Video modeling for individuals with autism: A review of model types and effects. Education and Treatment of Children, 30(4), 183–213.
Mechling, L., & Ayers, K. (2012). A comparative study: Completion of fine motor office related tasks by high school students with autism using video models on large and small screens. Journal of Autism and Developmental Disorders, 24, 469–486.
National Autism Center. (2009). National standards report. Randolph, MA: Author.
National Professional Development Center on Autism Spectrum Disorders. (2010). Evidence based practices. Chapel Hill, NC: Author.
National Research Council. (2001). Educating children with autism. Washington, DC: National Academy Press.
No Child Left Behind Act of 2001, 20 U.S.C. 70 6301 et seq. (2002).
Odom, S. L., Brantlinger, E., Gersten, R., Horner, R. H., Thompson, B., & Harris, K. R. (2005). Research in special education: Scientific methods and evidence-based practices. Exceptional Children, 71, 137–148.
Oller, J., & Oller, S. (2010). Autism: The diagnosis, treatment & etiology of the undeniable epidemic. Sudbury, MA: Jones and Bartlett.
Sackett, D., Rosenberg, W., Muir Gray, J., Haynes, R., & Richardson, W. (1996). Evidence-based medicine: What it is and what it isn’t. British Medical Journal, 312, 71–72.
Scheuermann, B., & Webber, J. (2002). Autism: Teaching does make a difference. Belmont, CA: Wadsworth.
Schopler, E., Mesibov, G. B., & Hearsey, K. (1995). Structured teaching in the TEACCH system. In E. Schopler & G. B. Mesibov (Eds.), Learning and cognition in autism (pp. 243–267). New York, NY: Plenum.
Simpson, R. L. (2005). Evidence-based practices and students with autism spectrum disorders. Focus on Autism and Other Developmental Disabilities, 20, 140–149.
Simpson, R. (2008). Children and youth with autism spectrum disorders: The elusive search and wide-scale adoption of effective methods. Focus on Exceptional Children, 40(7), 1–14.
Simpson, R., deBoer, S., Griswold, D., Myles, B., Byrd, S., Ganz, J., … Adams, L. (2005). Autism spectrum disorders: Interventions and treatments for children and youth. Thousand Oaks, CA: Corwin Press.
Simpson, R., McKee, M., Teeter, D., & Beytien, A. (2007). Evidence-based methods for children and youth with autism spectrum disorders: Stakeholder issues and perspectives. Exceptionality, 15, 203–218.
Simpson, R., & Mundschenk, N. (2010). Working with parents and families of exceptional children and youth. Austin, TX: Pro-Ed.
Simpson, R., Mundschenk, N., & Heflin, J. (2011). Issues, policies and recommendations for improving the education of learners with autism spectrum disorders. Journal of Disability Policy Studies, 22, 3–17.
Simpson, R., & Myles, B. (2008). Educating children and youth with autism: Strategies for effective practice. Austin, TX: Pro-Ed.
U.S. Department of Education Office of Special Education Programs, National Professional Development Center on Autism Spectrum Disorders. (2010). Evidence-based practices for children and youth with ASD. Retrieved from http://autismpdc.fpg.unc.edu/content/evidence-basedpractices
Volkmar, F., Cook, E., & Pomeroy, J. (1999). Practice parameters for the assessment and treatment of children, adolescents and adults with autism and pervasive developmental disorders. Journal of the American Academy of Child and Adolescent Psychiatry, 38(12), 32S–54S.
Waly, M., Olteanu, H., Banerjee, R., Choi, S., Mason, J., Parker, B., … Deth, R. C. (2004). Activation of methionine synthase by insulin-like growth factor-1 and dopamine: A target for neurodevelopmental toxins and thimerosal. Molecular Psychiatry, 9, 358–370.
Wheeler, D., Jacobson, J., Paglieri, R., & Schwartz, A. (1993). An experimental assessment of facilitated communication. Mental Retardation, 31(1), 49–60.
White, M., Smith, J., Smith, T., & Stodden, R. (2012). Autism spectrum disorders: Historical, legislative, and current perspectives. In D. Zager, M. Wehmeyer & R. Simpson (Eds.), Educating students with autism spectrum disorders (pp. 3–12). New York, NY: Routledge.
Young, H., Grier, D., & Grier, M. (2008). Thimerosal exposure in infants and neurodevelopmental disorders: An assessment of computerized medical records in the vaccine safety datalink. Journal of Neurological Sciences, 27, 110–118.
CHAPTER 10

CONSTRUCTING EFFECTIVE INSTRUCTIONAL TOOLKITS: A SELECTIVE REVIEW OF EVIDENCE-BASED PRACTICES FOR STUDENTS WITH LEARNING DISABILITIES

Tanya E. Santangelo, Amy E. Ruhaak, Michelle L. M. Kama and Bryan G. Cook

ABSTRACT

Evidence-based practices have been shown to meaningfully improve learner outcomes by bodies of high-quality research studies and should therefore be prioritized for use in schools, especially with struggling learners such as students with learning disabilities. Although many resources are available on the internet with information about evidence-based practices, the magnitude and technical nature of the websites are often overwhelming to practitioners and are therefore not frequently used as part of the instructional decision-making process. In this chapter, we aim to provide a “one stop shopping experience” for readers interested in
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 221–249
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026012
evidence-based practices for students with learning disabilities by reviewing five relevant websites. Specifically, for each website we review (a) the procedures used to classify the evidence-based status of practices, (b) the classification scheme used to indicate the level of research support for practices, and (c) the practices reviewed for students with learning disabilities and their evidence-based classification. We conclude with a discussion of issues related to interpreting and applying information on evidence-based practices from these websites.
Evidence-based practices (EBPs) are instructional programs and techniques supported by sound research as having meaningfully positive effects on student outcomes. Although criteria for identifying EBPs vary widely, scholars generally posit that EBPs are supported by multiple, high-quality (i.e., meeting indicators of sound methodological quality) studies using research designs from which causality can be inferred (e.g., randomized controlled trials, single-subject research) (Cook & Cook, in press). EBP is also frequently used to refer to a broad approach to instructional decision making that involves consideration of research evidence in conjunction with stakeholder expertise, needs, and values (e.g., Whitehurst, 2002). However, we use EBPs in this chapter specifically to refer to empirically validated instructional practices and programs. Because they are shown by trustworthy research to be generally effective, the consistent use of EBPs with fidelity has significant potential for improving instruction and student outcomes. Indeed, Slavin (2002) suggested that the EBP movement in education may occasion “a scientific revolution that has the potential to profoundly transform policy, practice, and research” (p. 15). EBPs can help improve the outcomes of all types of students, yet they appear especially important for learners with disabilities, who require the most effective instruction to reach their potential (Dammann & Vaughn, 2001). In particular, students with learning disabilities (LD) experience a number of problematic school outcomes (e.g., low achievement, low self-esteem, low peer acceptance) that require the application of the most effective instructional practices. However, because of their unique characteristics and learning needs, individuals with LD may not benefit from EBPs validated for typical learners. Thus, the focus of this chapter is on practices shown to be EBPs for students with LD.
Although EBPs have the potential to broadly, markedly, and positively impact teaching and learning, many obstacles exist on the path to realizing that promise. For example, experts disagree on what type and level of
research support are necessary for a practice to be considered an EBP (e.g., How many studies are needed? What research designs are acceptable? How is methodological quality assessed?). Additionally, the challenges associated with identifying EBPs almost certainly pale in comparison to those associated with implementing EBPs (Fixsen, Naoom, Blase, Friedman, & Wallace, 2005). That is, just because EBPs are identified, there is no guarantee that they will be broadly implemented as designed. And even when EBPs are identified and implemented appropriately, they will not be universally effective (Cook & Cook, in press). No one practice works well for everyone and there will be nonresponders to any practice, regardless of the evidence base supporting it. Therefore, educators – especially special educators – need to carefully monitor students’ individual progress to gauge the effect of EBPs on individual students. Identification and use of EBPs are critically important; however, there is a missing link between the two – effectively disseminating information on EBPs to practitioners and other stakeholders (Cook, Cook, & Landrum, 2013). A fundamental problem associated with the EBP movement is that most educators are unaware of the EBPs for the populations of learners whom they teach. Simply put, if teachers do not know which practices are and which practices are not EBPs for the students they work with, EBPs are unlikely to be adopted and, therefore, unlikely to positively impact student outcomes. As the EBP movement developed in education in the early 2000s, organizations began to identify EBPs and typically disseminated their findings on publicly accessible websites (e.g., the What Works Clearinghouse (WWC) of the Institute of Education Sciences). These efforts created an enormous number of online resources, each containing a wealth of information on EBPs.
However, the proliferation of sources and the wealth of information available from each source have overwhelmed many teachers. Many teachers do not know where to begin searching and, as a result, may search the internet haphazardly, only to find out that searching takes more time than they have (see Sandieson, Kirkpatrick, Sandieson, & Zimmerman, 2010; Williams & Coles, 2007). In fact, despite a pressing need to know which practices are the most effective and the impressive volume of information on that topic available on the internet, Williams and Coles (2007) reported that only 13.5% of educators indicated that they regularly used the internet to investigate research on teaching and learning. Our personal experiences are consistent with this finding; most teachers we work with are either unaware of the online resources regarding EBPs or, if aware, find them overwhelming and seldom access them.
Not only might it be difficult to find the right information online; when it is located, the information is often too technical to be helpful to teachers. For example, fully understanding the reports provided by the WWC involves consideration of evidence standards for rigorous research (which include research design, attrition rate, and equivalency of groups), statistical significance and effect sizes, sample sizes, and an improvement index (indicating in percentile rankings how much an average participant improved as a result of the intervention). This is all important information, but for practitioners and other stakeholders who are pressed for time and who do not have advanced training in statistics and research design, such information on EBP sites can prove to be overwhelming and confusing. Thus, a fundamental reason that the translation of research to practice breaks down appears to be that many educators are unaware of what practices are and are not evidence-based, do not access the available online resources, or find the resources too overwhelming to be helpful. Not only does this mean that the considerable time and resources invested in identifying and disseminating EBPs have been squandered, but – more importantly – the potential benefits of EBPs for student outcomes have gone largely unrealized. In this chapter, our goal is to contribute to the translation of research to practice for students with LD by addressing some of the primary obstacles educators encounter in accessing information about EBPs. Specifically, we provide a review of five online sources of EBPs for students with LD, describing in accessible terms for each source reviewed (a) the procedures used to classify the evidence-based status of practices, (b) the classification scheme used to indicate the level of research support for practices, and (c) the practices reviewed relative to LD and their evidence-based classification.
In short, we hope that this chapter provides a ‘‘one stop shopping’’ opportunity for educators interested in finding out what works for students with LD. Because of space limitations, this chapter is a selective (rather than exhaustive) review of EBP sources for learners with LD. We conducted a systematic review of potentially relevant sources to determine the overall scope of currently available information. We decided to include sources that use both (a) a rigorous and systematic review process to evaluate research and (b) a classification scheme to draw a conclusion regarding the evidence supporting a particular practice. Thus, although sources such as the National Center on Response to Intervention, The Center on Instruction, and The National Center on Intensive Intervention provide valuable information, they are not featured in this chapter because they do not meet the second criterion. Additionally, we only included sources whose
information and resources are free and publicly available via the internet. Therefore, publications (e.g., books, journal articles) and websites that require payment are not represented. Finally, from sources that met the above criteria, we include findings derived from research (a) with school-aged populations (i.e., kindergarten through twelfth grade; 5–21 years old); (b) that included students with LD (we did not set a minimum percentage/number, due to varying standards across the sites); and (c) that evaluated interventions targeting academic (e.g., reading, mathematics, writing, content areas) and/or college- and career-readiness (e.g., functional life, social, transition skills) outcomes. Thus, research focused solely on assessment practices (e.g., curriculum-based assessment) and/or behavioral interventions (e.g., school-wide positive behavioral support, community mentors) is not included in this chapter. In the following sections we review five sources of EBPs for students with LD: What Works Clearinghouse, Best Evidence Encyclopedia, National Secondary Transition Technical Assistance Center, Promising Practices Network, and Current Practices. For each source, we provide a brief overview of the source; describe the classification scheme used to categorize the evidence base of practices reviewed; and summarize the findings of the sources regarding which practices are and are not EBPs for students with LD, including a brief description for those practices found to be most effective for students with LD.
WHAT WORKS CLEARINGHOUSE

Overview

The WWC (http://ies.ed.gov/ncee/wwc) was established in 2002 by the U.S. Department of Education’s Institute of Education Sciences to provide educators and parents with “a central and trusted source of scientific evidence for what works in education.” Currently, the WWC includes information on the following topics: Children and Youth with Disabilities, College and Career Preparation, Dropout Prevention, Early Childhood Education, Education Technology, English Language Learners, Literacy, Math, School Choice, School Organization and Governance, Science, Student Behavior, Teacher and Leader Effectiveness, and Teacher Incentives. For each topic, the WWC publishes reviews of individual research studies (Single Study Reviews and Quick Reviews), as well as synthesis reviews of all research related to a particular program
(Intervention Reports). To help educators understand research findings and use them to guide classroom practice, the WWC also publishes Practice Guides and offers webinars, as well as a companion website called Doing What Works (http://dww.ed.gov). WWC reviews are conducted by a team of experts, using a rigorous and systematic process. To be eligible for a WWC rating, a study must use an experimental design (i.e., randomized controlled trial, quasi-experiment with pretest equivalence, regression discontinuity, or single case) and meet other quality standards, such as the use of reliable and valid measures and acceptable rates of attrition (complete details can be found in the WWC Procedures and Standards Handbook, which can be downloaded from http://ies.ed.gov/ncee/wwc/DocumentSum.aspx?sid=19).
Classification Scheme

The WWC rating scale uses six categories to describe the effects of a practice:

Positive Effects: At least two studies, one of which is a well-implemented random experiment, show statistically significant positive effects. No studies have statistically significant negative effects or an effect size < −0.25.

Potentially Positive Effects: At least one study has statistically significant positive effects or an effect size > 0.25. No studies have statistically significant negative effects or effect size < −0.25. The number of studies with positive effects equals or exceeds those with indeterminate effects (i.e., results are not statistically significant and/or effect size between −0.25 and 0.25).

Mixed Effects: Either: (a) at least one study has statistically significant positive effects or effect size > 0.25 and at least one study has statistically significant negative effects or effect size < −0.25. Additionally, the number of negative findings does not exceed those that are positive; or (b) at least one study has statistically significant positive effects or effect size > 0.25 and the number of studies with indeterminate effects exceeds the number with positive effects.

No Discernible Effects: All studies are indeterminate.

Potentially Negative Effects: Same criteria as Potentially Positive Effects, except the focus is on negative effects.

Negative Effects: Same criteria as Positive Effects, except the focus is on negative effects.
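The six categories above amount to a small decision procedure, which can be sketched in code. The following Python sketch is illustrative only and is not the WWC's own software: the `Study` record, its `randomized` flag, and the function names are hypothetical, and the rules follow only the criteria as summarized in this chapter (the full criteria are in the WWC Procedures and Standards Handbook). Combinations that the summarized criteria do not address are returned as unclassified rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class Study:
    effect_size: float        # standardized effect size for one outcome domain
    significant: bool         # was the effect statistically significant?
    randomized: bool = False  # hypothetical flag: well-implemented randomized experiment


THRESHOLD = 0.25  # effect-size cutoff quoted in the category descriptions


def direction(study: Study) -> str:
    """Label a single study as positive, negative, or indeterminate."""
    if (study.significant and study.effect_size > 0) or study.effect_size > THRESHOLD:
        return "positive"
    if (study.significant and study.effect_size < 0) or study.effect_size < -THRESHOLD:
        return "negative"
    return "indeterminate"


def wwc_rating(studies: list[Study]) -> str:
    """Apply the summarized category rules to the eligible studies for one outcome."""
    if not studies:
        return "No Studies Meet Evidence Standards"
    labels = [direction(s) for s in studies]
    pos = labels.count("positive")
    neg = labels.count("negative")
    ind = labels.count("indeterminate")
    sig_pos = [s for s in studies if s.significant and s.effect_size > 0]
    sig_neg = [s for s in studies if s.significant and s.effect_size < 0]

    if len(sig_pos) >= 2 and any(s.randomized for s in sig_pos) and neg == 0:
        return "Positive Effects"
    if len(sig_neg) >= 2 and any(s.randomized for s in sig_neg) and pos == 0:
        return "Negative Effects"
    if pos >= 1 and neg == 0 and pos >= ind:
        return "Potentially Positive Effects"
    if neg >= 1 and pos == 0 and neg >= ind:
        return "Potentially Negative Effects"
    if pos >= 1 and ((1 <= neg <= pos) or ind > pos):
        return "Mixed Effects"
    if pos == 0 and neg == 0:
        return "No Discernible Effects"
    # Assumption: anything else falls outside the criteria as summarized here.
    return "Not classified by the summarized criteria"


# Example: a single eligible study with a nonsignificant effect size of 0.30
print(wwc_rating([Study(effect_size=0.30, significant=False)]))
```

A single study with an effect size above 0.25 is enough for a Potentially Positive rating under these rules, which is worth keeping in mind when reading ratings that rest on only one eligible study.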
Summary of EBPs for Students with LD

The WWC has a protocol for reviewing studies that evaluate the impact of interventions for K-12 students with LD (i.e., studies must have at least 50% of participants identified as having a LD; the protocol can be downloaded from http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=31). As of November 2012, 14 programs have been reviewed with the WWC LD protocol. As shown in Table 1, the WWC determined nine of these programs have no studies that meet their quality or evidence standards for review: Alphabetic Phonics, Barton Reading & Spelling System, Dyslexia Training Program, Fundations, Herman Method, Read 180, Unbranded Orton-Gillingham-based Interventions, Voyager Reading Programs, and Wilson Reading System. The five programs with at least one eligible study are briefly described below, by level of effectiveness. See Table 1 for a summary, organized by outcome measure.

Table 1. What Works Clearinghouse Ratings for Students with LD.

Outcome and Rating                    Program and Grade

General Reading Achievement
  No Discernible Effects              Project Read Phonology (K-4)

Alphabetics
  Potentially Positive Effects        Lindamood Phoneme Sequencing (4)
  Potentially Negative Effects        Reading Mastery (2–5)

Reading Fluency
  Potentially Positive Effects        Peer-Assisted Learning Strategies (2–6); Lindamood Phoneme Sequencing (4)
  No Discernible Effects              Read Naturally (4–6)
  Potentially Negative Effects        Reading Mastery (2–5)

Reading Comprehension
  Potentially Positive Effects        Peer-Assisted Learning Strategies (2–6)
  No Discernible Effects              Lindamood Phoneme Sequencing (4); Reading Mastery (2–5)

Mathematics
  Potentially Positive Effects        Lindamood Phoneme Sequencing (4)
  No Discernible Effects              Peer-Assisted Learning Strategies (2–6)

Writing
  Potentially Positive Effects        Read Naturally (4–6)
  Potentially Negative Effects        Reading Mastery (2–5); Lindamood Phoneme Sequencing (4)

No Studies Meet Evidence Standards
  Alphabetic Phonics, Barton Reading & Spelling System, Dyslexia Training Program, Fundations, Herman Method, Read 180, Unbranded Orton-Gillingham-based Interventions, Voyager Reading Programs, Wilson Reading System
Potentially Positive Effects

The Lindamood Phoneme Sequencing (LiPS) program uses direct and individualized instruction to help students develop early reading skills, such as phonemic awareness and decoding. Initially, the focus is on teaching the basic oral actions required to produce sounds. Instruction then targets other areas, such as letter patterns, sight words, spelling, reading, sequencing, and context clues. Based on the one study that met WWC evidence standards, which included fourth-grade students with LD, LiPS was found to have potentially positive effects on alphabetics, reading fluency, and math. The WWC Intervention Report for LiPS can be downloaded from http://ies.ed.gov/ncee/wwc/interventionreport.aspx?sid=279.

Peer-Assisted Learning Strategies (PALS) is a structured and systematic peer-tutoring program that can be used to supplement any reading or math curriculum. PALS tutoring sessions last 30–35 min and occur three or four times a week for reading, and two times a week for math. During PALS sessions, students take turns acting as the tutor, and correct their peer immediately when mistakes are made. With PALS, teachers flexibly group students, based on individual needs and strengths. Based on three studies that met WWC evidence standards and included students with LD in second through sixth grade, PALS was found to have potentially positive effects on reading fluency and reading comprehension. The WWC Intervention Report for PALS can be downloaded from http://ies.ed.gov/ncee/wwc/interventionreport.aspx?sid=569.

Read Naturally is designed to improve reading fluency and is comprised of three strategies: repeated reading of text, teacher modeling of reading, and ongoing progress monitoring by teachers and students. Read Naturally is available in two formats – one uses a combination of audio- and hardcopy books, and the other is computer-based.
Based on one study that met WWC evidence standards and included students with LD in fourth through sixth grade, Read Naturally was found to have potentially positive effects
on writing. The WWC Intervention Report for Read Naturally can be downloaded from http://ies.ed.gov/ncee/wwc/interventionreport.aspx?sid=409.

No Discernible Effects

LiPS and Read Naturally were found to have no discernible effects on reading comprehension. PALS was found to have no discernible effects on mathematics. Based on the one study that met WWC evidence standards and included students with LD in kindergarten through fourth grade, Project Read Phonology was found to have no discernible effects on general reading achievement. Based on two studies that met WWC evidence standards and included students with LD in second through fifth grade, Reading Mastery was found to have no discernible effects on reading comprehension.

Potentially Negative Effects

Reading Mastery was found to have potentially negative effects on alphabetics, reading fluency, and writing. LiPS was found to have potentially negative effects on writing.
BEST EVIDENCE ENCYCLOPEDIA

Overview

The Best Evidence Encyclopedia (BEE; http://www.bestevidence.org) is a website created by the Johns Hopkins University Center for Data-Driven Reform in Education, with funding from the Institute of Education Sciences. The overarching purpose of the BEE is to provide educators with useful information about the evidence supporting programs for students in kindergarten through twelfth grade. At present, the following topics (and subdivisions) are included on the BEE: (a) Mathematics (elementary, middle/high school, and effectiveness of technology), (b) Reading (beginning readers, elementary, middle/high school, English language learners, struggling readers, and effectiveness of technology), (c) Science (elementary), (d) Comprehensive School Reform (elementary, middle/high school, education service providers), and (e) Early childhood education. For each of these, the BEE includes a webpage summary and three downloadable publications – a Full Report, an Educator’s Summary, and an Educator’s Guide.
Whereas the WWC reviews programs and studies individually, the BEE publishes research syntheses (e.g., meta-analyses) and summaries focused on broader topics, such as beginning reading. To be selected by the BEE, a research review must represent an exhaustive search of the relevant literature and include studies that: (a) use appropriate research designs (e.g., true random experiments or quasi-experiments with pretest equivalence), (b) report outcomes in terms of effect sizes and statistical significance, (c) include an intervention that lasted at least 12 weeks, and (d) use measures that validly assess what students in both the intervention and control groups learned. Some BEE reviews are authored by researchers affiliated with the Center for Data-Driven Reform in Education, whereas others are done by outside researchers.

Classification Scheme

The BEE rating scale uses six categories to describe the effectiveness of practices:

Strong Evidence of Effectiveness: One large randomized study (≥250 students or 10 classes) complemented by an additional, nonrandomized large study or multiple smaller studies (combined sample ≥500 students) with an average effect size ≥0.20.

Moderate Evidence of Effectiveness: At least two large matched studies or several small studies with a combined sample ≥500 students with an average effect size ≥0.20.

Limited Evidence of Effectiveness – Strong Evidence of Modest Effects: At least two large matched studies or several small studies with a combined sample ≥500 students with an average effect size between 0.10 and 0.19.

Limited Evidence of Effectiveness – Weak Evidence with Notable Effects: At least one qualifying study with an effect size ≥0.20.

Insufficient Evidence of Effectiveness: One or more qualifying studies, but studies do not meet criteria for any of the categories described above.

No Qualifying Studies: No studies meet inclusion standards.
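The BEE categories can likewise be read as an ordered set of checks on the number, size, and design of qualifying studies and their average effect size. The Python sketch below is a rough illustration under several assumptions that are not stated in the criteria above: "multiple" and "several" are taken to mean at least two studies, the combined sample is computed over all qualifying studies, the average effect size is an unweighted mean, the additional large study may be of any design, and the "10 classes" alternative for a large study is ignored. The `Study` record and function name are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Study:
    n_students: int           # sample size of one qualifying study
    effect_size: float        # effect size reported for that study
    randomized: bool = False  # True for a randomized design


LARGE_STUDY = 250      # students; the "or 10 classes" alternative is not modeled
COMBINED_SAMPLE = 500  # students across studies


def bee_rating(studies: list[Study]) -> str:
    """Rate a program from its qualifying studies, per the summarized criteria."""
    if not studies:
        return "No Qualifying Studies"

    avg = mean(s.effect_size for s in studies)  # assumption: unweighted mean
    total = sum(s.n_students for s in studies)
    large = [s for s in studies if s.n_students >= LARGE_STUDY]
    small = [s for s in studies if s.n_students < LARGE_STUDY]

    # Assumption: "several"/"multiple" small studies means two or more.
    enough_studies = len(large) >= 2 or (len(small) >= 2 and total >= COMBINED_SAMPLE)
    has_large_randomized = any(s.randomized for s in large)

    if has_large_randomized and enough_studies and avg >= 0.20:
        return "Strong Evidence of Effectiveness"
    if enough_studies and avg >= 0.20:
        return "Moderate Evidence of Effectiveness"
    if enough_studies and 0.10 <= avg < 0.20:
        return "Limited Evidence of Effectiveness: Strong Evidence of Modest Effects"
    if any(s.effect_size >= 0.20 for s in studies):
        return "Limited Evidence of Effectiveness: Weak Evidence with Notable Effects"
    return "Insufficient Evidence of Effectiveness"


# Example: one small qualifying study with a notable effect
print(bee_rating([Study(n_students=60, effect_size=0.35)]))
```

Under these rules a single small study can place a program in the "Weak Evidence with Notable Effects" category, so that rating signals a promising result on a thin evidence base.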
Summary of EBPs for Students with LD

Unlike the WWC, the BEE does not currently evaluate programs solely for students with LD. However, results from the BEE review, Effective
Programs for Struggling Readers: A Best-Evidence Synthesis, are relevant because studies in this review included K-5 students with reading disabilities, reading performance at or below the 33rd percentile in their class, and/or students who received intensive services to prevent or remediate reading problems. Programs from this review are briefly described below, by level of effectiveness on general reading outcomes (as opposed to specific reading measures, as done by WWC). See Table 2 for a summary. The BEE review and related resources can be found at http://www.bestevidence.org/reading/strug/strug_read.htm.
Strong Evidence of Effectiveness

The BEE rated six programs as having strong evidence of effectiveness: Direct Instruction/Corrective Reading, PALS (described previously), Quick Reads, Reading Recovery, Success for All, and Targeted Reading Intervention. Two categories of programs also earned this rating: One-to-one Teacher Tutoring with Phonics Emphasis and One-to-one Paraprofessional/Volunteer Tutoring with Phonics Emphasis.

Direct Instruction/Corrective Reading is a highly structured and systematic instructional approach that utilizes small group tutorials. Students receive step-by-step, explicit instruction to develop their understanding of phonics and reading comprehension skills. Professional development and ongoing monitoring of implementation fidelity are key components of this approach.

Quick Reads (QR) is a supplemental reading program designed to improve fluency, vocabulary, and comprehension for struggling readers in second through sixth grade. It is intended to be used in small-group tutorial settings. QR consists of short texts meant to be read quickly. The program is comprised of six levels, with three separate books containing 30 different texts at each level (90 texts per level). The texts include high-frequency words and phonics/syllabic patterns required to read successfully at grade level; text content covers social studies and science.

Reading Recovery (RR) is a one-to-one teacher tutoring program designed for first-grade students who struggle with reading (i.e., the bottom 20%). RR focuses primarily on fluency and involves daily, individualized lessons that include reading familiar books, reading a new book, working with letters and/or words using magnetic letters, assembling a cut-up story, and writing a story. Professional development is a core component of RR; teachers participate in a yearlong graduate program, followed by ongoing training and assessment of implementation.
TANYA E. SANTANGELO ET AL.
Table 2. Best Evidence Encyclopedia Ratings of Programs for K-5 Struggling Readers.
Strong Evidence of Effectiveness: Direct Instruction/Corrective Reading (CPT, SG), Peer-Assisted Learning Strategies (CP), Quick Reads (SG), Reading Recovery (TT), Success for All (CPT), Targeted Reading Intervention (TT), One-to-one Teacher Tutoring with Phonics Emphasis [a, c], One-to-one Paraprofessional/Volunteer Tutoring with Phonics Emphasis [b, c]
Moderate Evidence of Effectiveness: Cooperative Integrated Reading and Composition (CP)
Limited Evidence of Effectiveness – Strong Evidence of Modest Effects: Jostens/Compass Learning (IT)
Limited Evidence of Effectiveness – Weak Evidence with Notable Effects: Contextually based Vocabulary Instruction (CP), Early Intervention in Reading (SG), Edmark (TP), Lexia (IT), Lindamood Phoneme Sequence Program (SG), PHAST Reading (SG), Precision Teaching (CP), Proactive Reading (SG), Programmed Tutorial Reading (TP), Project READ (CP), RAILS (CP), Read Naturally (SG), Read, Write, and Type (SG), Reading Styles (CP), Responsive Reading (SG), Same Age Tutoring (CP), SHIP (SG), TEACH (TT), Voyager Passport (SG), Wallach and Wallach (TP)
Insufficient Evidence of Effectiveness: Academy of Reading, Destination Reading, Experience Corps, Failure-Free Reading, Fast ForWard, Gottshall Small Group Phonics, Headsprout, HOTS, New Heights, Knowledge Box, LeapTrack, Plan Focus, Read 180, Spell Read, Targeted Intervention, Waterford, Wilson Reading
No Qualifying Studies: 100 Book Challenge; A Comprehensive Curriculum for Early Student Success; Academic Associates Learning Centers; Accelerated Reader; ALEKS; ALPHabiTunes; Alpha-Phonics; Balanced Early Literacy Initiative; Barton Reading and Spelling System; Benchmark; BookMARK; Bradley Reading and Language Arts; Breakthrough to Literacy; Bridge; Bridge to Reading; Bring the Classics to Life; CIERA School Change Framework; Comprehensive Early Literacy Learning; Class Wide Peer Tutoring; Compensatory Language Experiences and Reading Program; Core Knowledge; Cornerstone Literacy Initiative; Curious George Reading and Phonics; DaisyQuest; Davis Learning Strategies; Discover Intensive Phonics for Yourself; Discovery World; Dominie; Dr. Cupp Readers and Journal Writers; Early Success; Early to Read; Earobics; Emerging Readers; Essential Skills; Evidence-Based Literacy Instruction; Exemplary Center for Reading Instruction; Fast Track Action; Felipe’s Sound Search; FirstGrade Literacy Intervention Program; First Steps; Flippen Reading Connections; Fluency Formula; FOCUS: A Reading and Language Program; Four Blocks Framework; Frontline Phonics; Fundations; Funnix; GOcabulary Program for Elementary Students; Goldman-Lynch Language Stimulation Program; Goldman-Lynch Sounds-in-symbols; Great Leaps; Guided Discovery LOGO; Guided Reading; Harcourt Accelerated Reading Instruction; Higher Order Thinking Skills; Hooked on Phonics; Huntington Phonics; IntelliTools Reading; Insights: Reading as Thinking; Invitations to Literacy; Irlen Method; Jigsaw Classroom; Johnny Can Spell; Jolly Phonics; Kaleidoscope; KidCentered Learning; Knowledge Box; Ladders to Literacy; Language for Learning; Language for Thinking; Leap into Phonics; Letter People; Letterland; LinguiSystems; Literacy Collaborative; Literacy First; Little Books; Little Readers; LocuTour; Matchword; Merit Reading Software Program; Multicultural Reading and Thinking Program; My Reading Coach; New Century Integrated Instructional System; Next Steps; Onward to Excellence; Pacemaker; Pacific Literacy; Pause, Prompt, and Praise; Peabody Language Development Kits; Performance Learning Systems; Phonemic Awareness in Young Children; Phonics for Reading; Phonics Q; Phono-Graphix; PM Plus Readers; Primary Phonics; Programmed Tutorial Reading; Project Child; Project FAST; Project LISTEN; Project PLUS; Rainbow Reading; Read Well; Reading Bridge; Reading Explorer’s Pathfinders Tutoring Kit; Reading Intervention for Early Success; Reading Rods; Reading Step by Step; Reading Success from the Start; Reading Upgrade; Richards Read Systematic Language Program; Right Start to Reading; Road to the Code; ROAR Reading System; S.P.I.R.E.; Saxon Phonics; Schoolwide Early Language and Literacy; Second Grade Acceleration to Literacy; Sequential Teaching of Explicit Phonics and Spelling; Sing, Spell, Read, and Write; SkillsTutor; Soar to Success; Soliloquy; Sonday System; Sound Reading; Sounds and Symbols Early Reading Program; Spalding Writing Road to Reading; Starfall; Start Up Kit; Stepping Stones to Literacy; Stories and More; Story Comprehension to Go; Storyteller Guided Reading; Strategies That Work; Student Team Achievement Divisions; Successmaker; Sullivan Program; Super QAR; Teacher Vision; Ticket to Read; Touchphonics; Tribes Learning Communities; Verticy Learning; Voices Reading; Vowel Oriented Word Attack Course; WiggleWorks; Wright Skills; Writing to Read
Note: Rated programs are classified as: one-to-one tutoring by teachers (TT), one-to-one tutoring by paraprofessionals (TP) or volunteers (TV), small group tutorials (SG), classroom instructional process approaches (CP), classroom instructional process approaches with tutoring (CPT), or instructional technology (IT).
[a] The rating for this category represents aggregate findings from the following programs: Auditory Discrimination in Depth (TT), Early Steps/Howard Street Tutoring (TT), Intensive Reading Remediation (TT), Reading Rescue (TT), and Reading with Phonology (TT).
[b] The rating for this category represents aggregate findings from the following programs: Sound Partners (TP), The Reading Connection (TP), SMART (TP), Reading Rescue (TP), Howard Street Tutoring (TP), and Book Buddies (TV).
[c] Each program had at least one qualifying study demonstrating evidence of effectiveness, but due to insufficient sample sizes, did not individually meet the criteria to earn a “strong evidence of effectiveness” rating.
Success for All (SFA) is a comprehensive school reform program that focuses on improving reading outcomes. Highly structured instruction, cooperative learning teams, frequent assessment, behavioral supports, and parent involvement are key features of SFA. Phonemic awareness, phonics, vocabulary, and comprehension are all emphasized in reading. In first grade, struggling readers are provided with one-to-one tutoring; other classroom-level interventions are used in second through fifth grade.
Targeted Reading Intervention (TRI) is a one-on-one teacher tutoring program for kindergarten and first-grade students who struggle with reading. TRI lessons occur daily and typically consist of re-reading for fluency (2 min), word work (6 min), and guided oral reading (7 min).
One-to-One Teacher Tutoring with Phonics Emphasis is a category that includes the following programs: Auditory Discrimination in Depth, Early Steps/Howard Street Tutoring, Intensive Reading Remediation, Reading Rescue, and Reading with Phonology. Each program had at least one qualifying study that demonstrated evidence of effectiveness, but due to insufficient sample size did not individually meet the criteria to earn a Strong Evidence of Effectiveness rating. Therefore, the BEE aggregated findings from these similar programs and generated the rating for the category.
One-to-One Paraprofessional or Volunteer Tutoring with Phonics Emphasis is a category that includes the following programs: Sound Partners, The Reading Connection, SMART, Reading Rescue, Howard Street Tutoring, and Book Buddies. As with the teacher tutoring category, the BEE aggregated findings from similar programs to generate the rating for this category.
Moderate Evidence of Effectiveness
The BEE rated one program, Cooperative Integrated Reading and Composition (CIRC), as having moderate evidence of effectiveness. CIRC is a reading and writing curriculum for students in second through sixth grade. It has three main components: direct instruction in reading comprehension, integrated language arts/writing, and story-related activities. During daily lessons, cooperative learning teams of four students complete story-related writing activities, read to each other, make predictions, summarize text, and respond to questions. Learning teams also practice spelling, decoding, and vocabulary.
Limited Evidence of Effectiveness
As shown in Table 2, the BEE rated one program as having Limited Evidence of Effectiveness – Strong Evidence of Modest Effects and 20 programs as having Limited Evidence of Effectiveness – Weak Evidence with Notable Effects.
Insufficient Evidence of Effectiveness or No Qualifying Studies
As shown in Table 2, the BEE found that 17 programs have Insufficient Evidence of Effectiveness and 149 programs have No Qualifying Studies.
LD EBPs
NATIONAL SECONDARY TRANSITION TECHNICAL ASSISTANCE CENTER
Overview
The National Secondary Transition Technical Assistance Center (NSTTAC; http://www.nsttac.org) is funded by the U.S. Department of Education’s Office of Special Education Programs and directed/staffed by the Special Education Programs at the University of North Carolina at Charlotte and Western Michigan University. One of NSTTAC’s primary purposes is providing technical assistance and disseminating information to support implementation of EBPs that improve academic and functional outcomes for students with disabilities and prepare them for postsecondary education and employment. Consistent with this mission, NSTTAC focuses on interventions that target academics, as well as functional life and transition skills (e.g., banking, communication, employment, leisure) for secondary-level students with disabilities. The NSTTAC website includes downloadable research summaries, as well as other transition-related resources such as presentations and sample lesson plans. NSTTAC’s ratings of specific practices were determined as part of a literature review the organization conducted to evaluate the level of evidence for transition services (which can be downloaded from http://www.nsttac.org/content/executive-summary-ebps-and-predictors). Studies included in this review had to meet several criteria, including: (a) published in a peer-reviewed journal after 1984; (b) used a group experimental, single subject, correlational, or meta-analysis/literature review design; (c) sample included secondary-level students with disabilities; and (d) measured meaningful in- and/or post-school outcomes. Additional details about NSTTAC’s review can be found at http://www.nsttac.org/content/literature-review-process.
Classification Scheme
NSTTAC’s rating scale has four categories designating varying levels of evidence: Strong, Moderate, Potential, and Low.
For Strong, Moderate, and Potential, there is a different set of rating criteria for each research design included in the literature review. Below is a summary of the criteria for group experimental and single-subject research. For these two types of research designs, NSTTAC evaluates quality using the quality indicators for
research in special education published in Exceptional Children (Gersten et al., 2005; Horner et al., 2005). Complete details for each rating category across research designs are available at http://www.nsttac.org/content/literature-review-process.
Strong level of evidence (group experimental): Two high-quality or four acceptable quality studies with effect size information. No strong contradictory evidence.
Strong level of evidence (single subject): Five high-quality studies conducted by at least three independent research teams that demonstrate a functional relationship. No strong contradictory evidence.
Moderate level of evidence (group experimental): One high quality or two acceptable quality studies with effect size information.
Moderate level of evidence (single subject): Three high or acceptable quality studies conducted by one or two research teams that demonstrate a functional relationship.
Potential level of evidence (group experimental): One acceptable quality study with effect size information.
Potential level of evidence (single subject): Two high or acceptable quality studies conducted by one or two research teams that demonstrate a functional relationship.
Low level of evidence (group experimental and single subject): Other types of research (e.g., descriptive studies), program evaluations (that do not meet above criteria), and expert opinion articles.
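The group experimental and single-subject criteria summarized above can be read as a small set of threshold rules. The following Python sketch is one possible reading, not NSTTAC's own procedure: the names `Evidence` and `nsttac_level` are hypothetical, every counted study is assumed to report effect size information (group designs) or to demonstrate a functional relationship (single-subject designs), and the stated counts are treated as minimums. Correlational and meta-analytic designs, which have their own criteria, are not covered.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of the studies supporting one practice."""
    design: str                     # "group" or "single_subject"
    high_quality: int               # number of high-quality studies
    acceptable_quality: int         # number of acceptable-quality studies
    research_teams: int = 1         # independent teams (single-subject only)
    strong_contradictory: bool = False


def nsttac_level(e: Evidence) -> str:
    """Map study counts to a level of evidence (assumed reading of the criteria)."""
    total = e.high_quality + e.acceptable_quality
    if e.design == "group":
        if (e.high_quality >= 2 or e.acceptable_quality >= 4) and not e.strong_contradictory:
            return "Strong"
        if e.high_quality >= 1 or e.acceptable_quality >= 2:
            return "Moderate"
        if e.acceptable_quality >= 1:
            return "Potential"
        return "Low"
    if e.design == "single_subject":
        if e.high_quality >= 5 and e.research_teams >= 3 and not e.strong_contradictory:
            return "Strong"
        if total >= 3:
            return "Moderate"
        if total >= 2:
            return "Potential"
        return "Low"
    raise ValueError("design must be 'group' or 'single_subject'")


# One high-quality group study yields a Moderate level under these rules.
print(nsttac_level(Evidence("group", high_quality=1, acceptable_quality=0)))
```

Under this reading, a single acceptable quality group study maps to Potential, and four acceptable quality single-subject studies map to Moderate.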
Summary of EBPs for Students with LD
Like the BEE, NSTTAC does not offer ratings specifically for students with LD. However, many studies in NSTTAC’s literature review included students with LD, as well as students with other disabilities. Below is a brief description of the rated practices that included students with LD, by level of evidence. See Table 3 for a summary. All of the NSTTAC summary reports can be downloaded from http://www.nsttac.org/content/evidence-based-practices-secondary-transition.
Strong Level of Evidence
NSTTAC rated five practices as having a strong level of evidence for academic outcomes: mnemonics, peer assistance, self-management, technology, and visual displays. Mnemonics are learning strategies that help students memorize and retain content material and vocabulary across
Table 3. National Secondary Transition Technical Assistance Center and Promising Practices Network Ratings.
Rating: Program/Practice and Grade/Age
National Secondary Transition Technical Assistance Center
Academic Skills
Strong Level of Evidence: Mnemonics (13–17); Peer assistance (13–17); Self-management (13–16); Technology (12–22); Visual displays (13–16)
Functional Life and Transition Skills
Moderate Level of Evidence: Self-Determined Learning Model to teach goal attainment (14–19); Self-Directed IEP to teach student involvement in the IEP meeting (12–21); Simulations to teach social skills (12–21); Whose Future is it Anyway? to teach knowledge of transition planning (12–16); Whose Future is it Anyway? to increase self-determination skills (12–16)
Potential Level of Evidence: Computer-Assisted Instruction to teach student participation in the IEP process (12–18); Mnemonics to teach completing a job application (15–16)
Promising Practices Network
Reading/English
Proven: Class Wide Peer Tutoring (K-6 [a]); Effective Learning Program (11–12)
Promising: Peer-Assisted Learning Strategies (1–6, 9–12); Reciprocal Teaching (3–9)
Mathematics
Proven: Class Wide Peer Tutoring (K-6 [a]); Effective Learning Program (11–12)
Promising: Peer-Assisted Learning Strategies (K, 2–4, 9–12)
[a] Specific grades represented in research used to determine rating are not provided in the summary report. Summary reports can be accessed from: http://www.promisingpractices.net/programs.asp
academic areas. Examples of commonly used mnemonic techniques include acronyms, keywords, and pegwords. NSTTAC’s rating for mnemonics was based on one high-quality meta-analysis that included 20 studies (19 of which included students with LD) with 13–17 year old students.
Peer assistance is an instructional practice whereby students work together to accomplish an academic task. Examples of peer assistance techniques include: (a) peer tutoring (academic instruction provided by another student); (b) cooperative learning (groups of students, often of differing readiness levels, work together toward a common learning objective); and (c) peer instruction (students are given explicit roles and objectives for teaching a lesson or completing an assignment). NSTTAC’s rating for peer assistance was based on one high-quality meta-analysis that included 10 studies (3 of which included students with LD) with 13–17 year old students.
Self-management interventions aim to help students monitor, manage, and assess their own academic or behavioral performance. Self-monitoring, self-evaluation, self-instruction, and goal-setting are examples of self-management techniques commonly taught to students. NSTTAC’s rating for self-management was based on one high-quality meta-analysis that included 17 studies (at least 1 of which included students with LD) with 13–16 year old students.
Technology is used in multiple ways to help students acquire skills in academic areas such as reading, math, and writing. NSTTAC’s review of technology interventions included four categories: (a) computer-based instruction (computers and associated technology are used as instructional aids); (b) computer-assisted instruction (software is used to provide instruction, practice, and tutorials); (c) computer-enriched instruction (computer technology augments instruction by serving as a tool for calculations, simulations, etc.); and (d) computer-managed instruction (integrated technology systems provide extended sequential instruction and progress monitoring). NSTTAC’s rating for technology was based on one high-quality meta-analysis that included 38 studies (at least 1 of which included students with LD) with 12–22 year old students.
Visual displays are instructional tools that help students understand complex content. Examples of visual displays include graphic organizers, cognitive organizers, cognitive maps, structured overviews, tree diagrams, and concept maps. NSTTAC’s rating for visual displays was based on one high-quality meta-analysis that included 10 studies (7 of which included students with LD) with 13–16 year old students.
Moderate Level of Evidence
NSTTAC rated five practices as having a moderate level of evidence for varying functional life and transition skills: Self-Determined Learning Model, Self-Directed IEP, Simulations, and Whose Future is it Anyway? (which is rated separately for two outcomes; see Table 3). The Self-Determined Learning Model of Instruction (SDLMI) is designed to help students become self-directed and self-regulated learners. SDLMI
includes three units: (a) set a goal, (b) take action, and (c) adjust goal or plan. Based on one high-quality group experimental study with 14–19 year old students (some of whom had LD), NSTTAC found a moderate level of evidence for using SDLMI to teach goal attainment.
The Self-Directed IEP (SDIEP) is a multimedia program that teaches students how to lead their own Individualized Education Program (IEP) meetings. SDIEP uses a model-lead-test instructional format to teach specific IEP meeting steps and skills, such as introducing meeting participants, reviewing previous goals and levels of performance, and identifying necessary supports. Based on one acceptable quality group experimental study (that included students with LD) and two acceptable quality single-subject studies with 12–21 year old students, NSTTAC found a moderate level of evidence for using SDIEP to teach student involvement in the IEP meeting.
Simulations allow students to learn and practice skills in situations that approximate authentic settings. For example, to help students learn about money, a teacher might set up a classroom store. NSTTAC’s review included several different simulation techniques, such as role-playing, teacher modeling, peer feedback, and student evaluation. Based on four acceptable quality single-subject studies (one of which included students with LD) with 12–21 year old students, NSTTAC found a moderate level of evidence for using simulations to teach social skills.
Whose Future is it Anyway? (WFA) is a program designed to increase students’ self-determination skills, including their participation in the IEP process. WFA includes 36 lessons focused on six topics: self- and disability-awareness, transition decisions, identifying and utilizing community resources, setting and evaluating goals, communicating effectively, and leadership.
Based on one high-quality group experimental study (that included students with LD) with 12–16 year old students, NSTTAC found a moderate level of evidence for using WFA to teach self-determination and knowledge of transition planning.
Potential Level of Evidence
NSTTAC rated two practices as having a potential level of evidence for teaching functional/transitional skills: computer-assisted instruction and mnemonics. Based on one acceptable quality group study and one acceptable quality single-subject study (which included students with LD) with 12–18 year old students, computer-assisted instruction was found to have a potential level of evidence for teaching student participation in the IEP process. Based on one acceptable quality group study (that included students with LD) with 15–16 year old students, mnemonics was found to
have a potential level of evidence for teaching students to complete a job application.
PROMISING PRACTICES NETWORK
Overview
The mission of the Promising Practices Network (PPN; http://www.promisingpractices.net/) is to improve the lives of children and families by providing information about instructional practices that is reliable and research-based. In addition to identifying programs that work, PPN provides links to research, resources and tools (e.g., issue briefs, fact sheets), and experts’ perspectives on issues related to children and families. The RAND Corporation operates the project and is one of the funders that support PPN. To be considered for review, programs must provide some evidence of a positive effect related to at least one of the four outcome areas examined by PPN: Healthy and Safe Children, Children Ready for School, Children Succeeding in School, and Strong Families. The PPN will consider programs that have not been replicated (i.e., only one study is required for classification), have not been peer-reviewed, or are no longer being implemented. There is no formal application for recommending a program for review.
Classification Scheme
The PPN rates programs as either Proven or Promising based on the following criteria:
Proven: Studies must use a “convincing comparison group” (e.g., randomized-controlled trials, quasi-experimental designs) with a sample size of greater than 30 in both the treatment and comparison groups. At least one outcome must change in the desired direction by at least 20% or 0.25 standard deviations, and the change must be statistically significant (p < .05). The program must also directly impact one of the PPN indicators and the program evaluation must be publicly available.
Promising: Studies must use a comparison group, but some weaknesses can exist (e.g., comparison and treatment groups not shown to be comparable). Comparison and treatment groups must have at least
10 participants each and at least one outcome demonstrating change of at least 1% that is marginally significant (p < .10). Promising programs may impact either a PPN indicator or an intermediary outcome associated with a PPN indicator. The program’s evaluation must be publicly available.
Other Reviewed Programs: The PPN website also lists programs that other credible organizations have found to be effective but that PPN has not fully reviewed.
Summary of EBPs for Students with LD
Similar to the BEE, the PPN does not identify programs specific to students with LD; however, some of the studies do include students with LD in their participant samples. The following section describes the results of the PPN reviews for those practices supported by research that included students with LD (see Table 3).
Proven
Class Wide Peer Tutoring is an effective instructional practice based on peer tutoring that is reciprocated between students. It has been evaluated in over 30 studies, some of which used experimental designs, in grades K-6 and for students with LD, autism, and cognitive disabilities. Student partners take turns tutoring each other throughout the week. Points for the tutee are awarded for every correct answer. The team with the most points is announced daily. On Friday, students are tested and also pretested for the next week’s material.
Effective Learning Program was identified as a proven program for 13–18 year old adolescents who are academically at-risk on the basis of a single experimental study involving 80 students, some of whom had LD. The program placed at-risk students in a “school within a school” that involved classes with a family or team atmosphere, a lower student-to-teacher ratio (15:1), block scheduling, a focus on interpersonal styles and interactions, close contact with parents, and progress monitoring.
Promising
Peer-Assisted Learning Strategies (PALS) is a supplementary activity to existing K-12 curricula, as described previously with the WWC. Two of the 13 studies that met the criteria for PPN included students with LD in both the treatment and control groups.
Reciprocal Teaching, which focuses on the examination of text via a structured dialogue between teacher and students, was also rated as a promising instructional activity on the basis of seven reviewed studies, four of which used rigorous experimental designs. One study involved only students with LD and another involved participants in remedial reading classes. The strategies used in discussion are summarizing, generating questions, clarifying, and predicting. Once the teacher has modeled the strategies, cooperative reading groups are formed to allow students to lead the dialogue and practice the strategies. No specific curriculum is required and teachers can implement the strategy without formal training.
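The Proven and Promising thresholds in the PPN classification scheme described earlier in this section can be expressed as a short decision rule. The Python sketch below is only an illustration under stated assumptions: the names `Evaluation` and `ppn_rating` and the "Not rated" label are hypothetical, a positive change is assumed to be a change in the desired direction, and the record is assumed to describe the single most favorable outcome from one evaluation.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    """Hypothetical record of one program evaluation (best single outcome)."""
    comparison_group: bool          # any comparison group was used
    convincing_comparison: bool     # e.g., randomized or quasi-experimental design
    n_treatment: int
    n_comparison: int
    pct_change: float               # change in the desired direction, in percent
    sd_change: float                # same change in standard deviation units
    p_value: float
    hits_ppn_indicator: bool        # directly impacts a PPN indicator
    hits_intermediary_outcome: bool # impacts an associated intermediary outcome
    publicly_available: bool


def ppn_rating(ev: Evaluation) -> str:
    """Return "Proven", "Promising", or "Not rated" (assumed label)."""
    if not (ev.comparison_group and ev.publicly_available):
        return "Not rated"
    smallest_group = min(ev.n_treatment, ev.n_comparison)
    proven = (
        ev.convincing_comparison
        and smallest_group > 30
        and (ev.pct_change >= 20 or ev.sd_change >= 0.25)
        and ev.p_value < 0.05
        and ev.hits_ppn_indicator
    )
    if proven:
        return "Proven"
    promising = (
        smallest_group >= 10
        and ev.pct_change >= 1
        and ev.p_value < 0.10
        and (ev.hits_ppn_indicator or ev.hits_intermediary_outcome)
    )
    return "Promising" if promising else "Not rated"


example = Evaluation(True, True, 40, 40, 25.0, 0.30, 0.01, True, False, True)
print(ppn_rating(example))
```

Checking the Proven conditions first means that an evaluation satisfying both sets of thresholds receives the higher rating.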
CURRENT PRACTICE ALERTS
Overview
The Council for Exceptional Children’s Division for Learning Disabilities (DLD) and Division for Research (DR) provide a series of brief reports called Current Practice Alerts (CPA) on their website (http://teachingld.org). Each report answers a series of questions related to the targeted practice: What is it? For whom is it intended? How does it work? How adequate is the research knowledge base? How practical is it? How effective is it? What questions remain? and How do I learn more? The Alerts provide examples of the practice as well as additional references to resources on the practice and related research (Table 4). The Alerts Editorial Committee, which is composed of DLD and DR members, selects current and emerging practices intended for learners with LD for review, with input from the executive boards of each organization. CPA authors have expertise with the practice being reviewed but do not have a vested interest in its effectiveness. The Alerts Editorial Committee reviews each report for accuracy before publication.
Classification Scheme CPA authors, in conjunction with the Alerts Editorial Committee, recommend practices reviewed as either Go For It or Use Caution, based on the adequacy of the research base and practitioner experience implementing the practice. In order for a practice to be given the recommendation
Table 4. Council for Exceptional Children, Divisions of Learning Disabilities and Research Current Practice Alert Ratings of Programs and Practices.
Rating: Program/Practice and Grade [a]
Multiple Academic Domains and Content Areas
Go For It: Direct Instruction (K-12); Class Wide Peer Tutoring (K-12); Cognitive Strategy Instruction (K-12); Graphic Organizers (K-12); Mnemonics (K-12); Vocabulary Instruction (K-12)
Use Caution: Cooperative Learning (K-12); Co-teaching (K-12)
Reading
Go For It: Fluency Instruction (not specified); Phonics Instruction (beginning readers and older students who do not know how to read accurately); Phonological Awareness Acquisition and Intervention (beginning readers); Reading Comprehension Strategy Instruction (not specified)
Use Caution: Reading Recovery (not specified)
Writing
Go For It: Self-Regulated Strategy Development for Writing (upper-elementary through middle school)
Social Skills
Use Caution: Social Skills Instruction (K-12)
[a] Most current practice alerts indicate a relevant age range, rather than the specific grades represented in the research used to determine the rating.
Go For It, its effectiveness must be supported by solid research evidence. If results are mixed, incomplete, or negative, then a recommendation of Use Caution is made.
Summary of EBPs for Students with LD Sixteen CPAs met our inclusion criteria for this chapter. Eleven received Go For It recommendations and caution was recommended for using five of the practices.
Go For It
Direct Instruction consists of over 50 curricular programs in areas such as language, reading, writing, spelling, mathematics, and science that involve carefully sequenced, scripted, fast-paced lessons that provide frequent interaction between students and the teacher. Characteristics of Direct Instruction lessons include rapid pace, choral group responses mixed with individual turns, corrective feedback and reteaching, reinforcement, review and practice, and progression from teacher-directed instruction to independent performance.
Class Wide Peer Tutoring (CWPT) is a form of reciprocal peer tutoring in which classmates are assigned to pairs and assume the roles of both tutor and tutee. CWPT involves four primary components: “(a) weekly competing teams, (b) a highly structured tutoring procedure, (c) daily point earning and public posting of pupil performance, and (d) direct practice in functional instructional activities” (Maheady, Harper, & Mallette, 2003, p. 1).
Cognitive Strategy Instruction consists of explicit approaches to teach students specific and general cognitive strategies intended to help them develop the necessary skills to be self-regulated learners and ultimately improve academic performance. Cognitive strategy instruction follows a general process of developing and activating background knowledge, describing and discussing the strategy, modeling application of the strategy, memorizing the strategy, using the strategy, and using the strategy independently.
Research provides strong evidence for the effectiveness of Graphic Organizers for students with LD in K-12 classrooms for improving reading comprehension, writing skills, and performance in content-area subjects. Graphic organizers visually structure information (e.g., Venn diagrams) to illustrate hierarchies, cause and effect, compare and contrast, and cyclic or linear sequences.
Mnemonics, which are also identified as an effective practice by NSTTAC, are tools that aid memory, frequently by using verbal and imagery components. Common mnemonic techniques include acronyms, acrostics, pegwords, and keywords. Mnemonics can be used with virtually any content area and are associated with some of the largest effect sizes in the special education research literature.
A CPA noted five broad areas of Vocabulary Instruction as being effective for students with LD: keyword mnemonics, direct instruction, fluency-building vocabulary practice activities, cognitive strategies, and computer-assisted instruction. Vocabulary knowledge is associated with improved reading comprehension.
Phonological Awareness Acquisition and Intervention target phonological awareness, or ‘‘an explicit understanding that spoken language comprises discrete units ranging from entire words and syllables to smaller intrasyllabic units of onsets, rimes, and phonemes’’ (Troia, 2004, p. 1), which predicts successful reading. Troia suggested a number of strategies to promote phonological analysis and synthesis such as matching, oddity detection, same/different judgment, segment isolation, simple production, counting, and compound production. Whereas phonological awareness refers to awareness of the distinct sounds involved in language, phonics instruction involves systematically and explicitly teaching the relation of sounds with letters and letter combinations. Research indicates that synthetic phonics (converting letters into sounds and blending the sounds to make words) is associated with the largest effects. Fluency Instruction targets reading fluency, a common skill deficit for students with LD that is related to reading comprehension. The CPA on fluency instruction identified a number of specific strategies shown by research to be effective for students with LD: repeated reading, variations of repeated reading, contingent reinforcement, goal setting plus feedback, goal setting plus feedback and contingent reinforcement, and previewing. Reading Comprehension Strategy Instruction involves explicit strategies taught to students to build their comprehension of text as well as provide them frequent opportunities to practice them. Three well-researched strategies shown to be effective for students with LD are Question Answering, Text Structure, and Multiple Strategy Approaches. Self-Regulated Strategy Development (SRSD) for Writing, an instructional model for teaching writing, has been validated by over 40 studies. 
SRSD involves six stages, with self-regulatory techniques (self-monitoring, goal setting) integrated in all stages: develop background knowledge, discuss it, model it, memorize it, support it, and independent performance.
Use Caution
CPAs recommend that Cooperative Learning, Co-teaching, Reading Recovery, and Social Skills Instruction be used with caution with learners with LD.
DISCUSSION
Our overarching goal for this chapter was to create a ‘‘one stop shopping’’ resource of EBPs for students with LD. Specifically, we provided an overview
TANYA E. SANTANGELO ET AL.
of five online sources and summarized their findings related to K-12 students with LD. Sometimes searching EBP sources is relatively straightforward – one might quickly identify an EBP that meets their needs from a source they trust. However, educators may run into a number of issues that complicate the process of identifying EBPs when reviewing information from EBP sources. Thus, we close this chapter by offering a few recommendations that we hope will help educators use EBP information to make sound professional decisions.
First, practitioners will often encounter multiple EBPs that are relevant to their needs, especially when searching multiple EBP sources. In these cases, we recommend prioritizing practices that (a) represent the best fit for their students and themselves and (b) are supported by the most trustworthy research evidence. For example, even though the BEE rated One-to-one Teacher Tutoring with Phonics Emphasis as having strong evidence of effectiveness, if the teacher does not have the time or resources available to provide one-to-one tutoring, then she should prioritize another, less resource-intensive EBP (e.g., Peer-Assisted Learning Strategies). In situations in which two or more EBPs seem equally apt for a teacher and her target student(s), prioritize the EBP with the most trustworthy evidence of its effectiveness. Trustworthiness of supporting research can be assessed by examining agreement across sources. For example, if a third-grade teacher wants to implement a supplemental reading program in her classroom consisting of students with and without LD, PALS would be a good option because it aligns with her needs and is rated positively by the WWC, BEE, and PPN (and is not rated as ineffective by any of the sources).
Second, directly relevant EBPs will not always be identified, even when searching across multiple sources. In these cases, it is often possible to gain valuable insights from the information that is available.
For example, if an eighth-grade foreign-language teacher wants to help several students with LD who are struggling to understand course material, she would not be able to identify any practices that represent a ‘‘perfect match.’’ However, because structured peer assistance, explicit instruction, and learning strategies (e.g., mnemonics, visual displays) are all rated positively across sources, grade levels, and content areas, it is likely that implementing one or more of these practices would improve her students’ performance. Additionally, because the current corpus of EBP information for students with LD is relatively small, there may be instances when it is appropriate to consider ratings derived from research with non-LD samples. For example, given that ratings of science curricula for students with LD are not yet
available, findings from the WWC and BEE general education science reviews may be beneficial to consider as one source of information.
Third, because there is not a uniform process for reviewing and rating EBPs, teachers may occasionally find that different EBP sources rate a particular practice differently. In some cases, the inconsistencies are relatively small – such as the BEE giving PALS its highest rating (Strong Evidence of Effectiveness) and the WWC and PPN giving it their second-highest ratings (Potentially Positive Effects and Promising, respectively). Although important to notice, these differences are not too concerning, because the findings are generally congruent. In some instances, however, the contrast is greater. For example, Reading Recovery is rated by the BEE as having Strong Evidence of Effectiveness and by CPA as Use Caution; in other words, the highest- and lowest-possible ratings. To understand this discrepancy, practitioners should consider the process used by each source. In this case, although CPAs’ conclusions are based on research, they have no specific criteria for making that recommendation beyond the consensus of the experts involved. In contrast, the BEE applies systematic criteria and may therefore be more objective in drawing its conclusions. Nonetheless, CPAs are focused specifically on learners with LD, whereas the BEE reviews focus on at-risk learners and are based on research that may have involved few individuals with LD. Thus, a teacher concerned about a specific student with LD may trust the CPA recommendation, whereas a teacher of a class with many at-risk learners, a few of whom have LD, might place more credence in the BEE’s recommendations. More generally, when searching for and considering EBPs, educators should become familiar with the criteria the various EBP sources use and align them with their goals. For example, the WWC uses rigorous criteria to identify EBPs.
Such rigor reduces the possibility that the WWC will erroneously identify EBPs (i.e., false positives); yet it correspondingly increases the possibility that practices that actually are effective may not be classified as such (a false negative; e.g., a truly effective practice that has only been examined by studies that do not meet WWC’s stringent criteria). Accordingly, practitioners who have multiple EBPs from which to choose may want to prioritize practices recommended as effective by the WWC to make sure what they select really works. In contrast, the PPN uses relatively lax criteria (e.g., requiring only a single study), thereby decreasing the likelihood of overlooking effective practices, but increasing the odds of mistakenly classifying a practice as proven or promising (e.g., based on the findings of an invalid study). Thus, practitioners wishing to identify a
number of potentially effective practices to consider should consult sources such as the PPN.
Fourth, if relevant EBP information is not available from the WWC, the BEE, NSTTAC, PPN, or CPA, we encourage readers to explore other reputable resources that evaluate and summarize education research. A few examples include:
Center on Instruction (http://www.centeroninstruction.org)
Florida Center for Reading Research (http://www.fcrr.org)
IRIS Center (http://iris.peabody.vanderbilt.edu/index.html)
National Center on Intensive Intervention (http://www.intensiveintervention.org)
National Center on Response to Intervention (http://www.rti4success.org)
National Center on Universal Design for Learning (http://www.udlcenter.org)
National Dissemination Center for Children with Disabilities (http://nichcy.org)
Vaughn Gross Center for Reading and Language Arts (http://www.meadowscenter.org/vgc)
Although these resources do not provide a summary recommendation regarding the adequacy of the evidence supporting a practice, they provide stakeholders with important information to draw their own conclusions about the effectiveness of practices. Finally, although EBPs represent a significant advancement in the field of education, it is important to recognize some limitations. Even when EBPs align well with student and teacher characteristics, they won’t work for everyone – nothing works for everyone. Consequently, it is important that educators monitor students’ response to new EBPs. This is especially critical when participant characteristics in the research supporting the EBP do not align exactly with those of targeted students, when the EBP is being adapted in some way, or both. It is also important that educators distinguish between practices for which insufficient evidence exists to determine their effectiveness and practices that have been shown to be ineffective by sound research. The former might be highly effective and can be considered for use if no EBPs are identified. Alternatively, practices shown by solid studies to have no or negative effects on student outcomes should not be adopted. It is also important to recognize that the information in this chapter represents what was available as of November 2012. New research and reviews are frequently reported, and we encourage readers to stay current
by periodically checking the sources for new information. And lastly, it is important to realize that identifying EBPs is only the first step in evidence-based education, and sustaining implementation of EBPs with fidelity – which is necessary to realize the promise of evidence-based reforms – requires ongoing resources and support (see Cook & Odom, 2013).
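As one illustration of the cross-source comparison recommended above, the following minimal sketch tallies which sources rate a practice positively and which urge caution, and flags disagreement for closer inspection of each source's criteria. The numeric scale (2 for a source's highest rating, 1 for its second-highest, -1 for a cautionary rating) and the function name are hypothetical conveniences introduced here; only the PALS and Reading Recovery ratings themselves come from the discussion above.

```python
# Hypothetical ordinal scale (an assumption made for illustration only):
#   2 = a source's highest rating, 1 = its second-highest rating,
#  -1 = a cautionary rating (e.g., the CPA "Use Caution" category).
RATINGS = {
    "PALS": {"BEE": 2, "WWC": 1, "PPN": 1},
    "Reading Recovery": {"BEE": 2, "CPA": -1},
}


def summarize(ratings):
    """Split sources into positive and cautionary, and report agreement."""
    positive = sorted(src for src, score in ratings.items() if score > 0)
    cautionary = sorted(src for src, score in ratings.items() if score < 0)
    return {
        "positive_sources": positive,
        "cautionary_sources": cautionary,
        # Sources "agree" when they do not split between positive and cautionary.
        "sources_agree": not (positive and cautionary),
    }


if __name__ == "__main__":
    for practice, ratings in RATINGS.items():
        print(practice, summarize(ratings))
```

A tally of this kind only indicates where to look more closely. When sources disagree, as they do for Reading Recovery, the review process and target population behind each rating still have to be weighed by the practitioner.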
REFERENCES
Cook, B. G., Cook, L. H., & Landrum, T. J. (2013). Moving research into practice: Can we make dissemination stick? Exceptional Children, 79, 163–180.
Cook, B. G., & Cook, S. C. (in press). Unraveling evidence-based practices in special education. Journal of Special Education. Retrieved from http://sed.sagepub.com/content/early/2011/09/08/0022466911420877.full.pdf+html
Cook, B. G., & Odom, S. L. (2013). Evidence-based practices and implementation science in special education. Exceptional Children, 79, 135–144.
Dammann, J. E., & Vaughn, S. (2001). Science and sanity in special education. Behavioral Disorders, 27, 21–29.
Fixsen, D. L., Naoom, S. F., Blase, K. A., Friedman, R. M., & Wallace, F. (2005). Implementation research: A synthesis of the literature (FMHI Publication No. 231). Tampa, FL: University of South Florida, Louis de la Parte Florida Mental Health Institute, The National Implementation Research Network.
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Maheady, L., Harper, G. F., & Mallette, B. (2003). A focus on class wide peer tutoring. Current Practice Alerts, 8, 1–4.
Sandieson, R. W., Kirkpatrick, L. C., Sandieson, R. M., & Zimmerman, W. (2010). Harnessing the power of education research databases with the pearl-harvesting methodological framework for information retrieval. Journal of Special Education, 44, 161–175. doi: 10.1177/0022466909349144
Slavin, R. E. (2002). Evidence-based education policies: Transforming educational practice and research. Educational Researcher, 31, 15–21. doi: 10.3102/0013189X031007015
Troia, G. A. (2004). A focus on phonological awareness acquisition and intervention. Current Practice Alerts, 10, 1–4.
Whitehurst, G. J. (2002). Evidence-based education (powerpoint presentation at Student Achievement and School Accountability conference). Retrieved from http://ies.ed.gov/director/pdf/2002_10.pdf
Williams, D., & Coles, L. (2007). Teachers’ approaches to finding and using research evidence: An information literacy perspective. Educational Research, 49, 185–206. doi: 10.1080/00131880701369719
CHAPTER 11
EVIDENCE-BASED PRACTICE IN EMOTIONAL AND BEHAVIORAL DISORDERS
Timothy J. Landrum and Melody Tankersley
ABSTRACT
Given the complex and chronic nature of emotional and behavioral disorders (EBD), the search for and use of evidence-based practices may be hindered by the way we frame questions of what works. Instead of asking ‘‘what works in EBD?’’ – a question that is framed around an eligibility category and not specific behavioral and academic needs – we argue that the question should be contextualized around the targets of intervention. With the right question in mind (‘‘what works for addressing this problem?’’), professionals in the field must reach consensus on ways to evaluate the current knowledge base and provide guidelines for future research to answer the question. Interventions that address specific behavioral and academic needs, are simple to implement, explicit in their execution, and predictable in their outcomes are most likely to be useful to teachers and to contribute to an evidence base for EBD.
Evidence-Based Practices
Advances in Learning and Behavioral Disabilities, Volume 26, 251–271
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026013
EVIDENCE-BASED PRACTICE IN EMOTIONAL AND BEHAVIORAL DISORDERS
As is evident throughout the chapters in this volume, concerted efforts to define the term evidence-based practice, to develop systems for identifying evidence-based practices, and to promote the use of evidence-based practice have taken hold in special education only as recently as the first decade of the 21st century. This is not to suggest that special educators prior to this decade did not rely on evidence to inform practice; indeed, many of the empirically valid principles that guide current practice are the products of scientific inquiry that has spanned decades, if not centuries (see Kauffman & Landrum, 2006). The foundational concepts in applied behavior analysis, for example, had their conceptual roots in the work of scholars in the early part of the 20th century (e.g., Thorndike, Watson), followed by development and refinement during mid-century (e.g., Skinner) and ultimately by a massive expansion of research in applied settings, beginning in the 1960s (see Baer, Wolf, & Risley, 1968). But even by the most lenient standards, it cannot be said that special education for students with emotional or behavioral disorders (EBD) is characterized by widespread application of empirically supported interventions or treatments. Instead, the history of the field of EBD has been marked by fits and starts in terms of efforts to rely on data to guide practice, and as often as not the field has been driven by fads, the shifting sands of public sentiment toward students whose behavior is most discrepant from the norm, political or financial expediency, or simple personal preferences on the part of teachers.
Moreover, data from classroom observations suggest that students with EBD may experience less of what they need most (e.g., praise, opportunities to respond), and more of what may actually exacerbate their problems (e.g., attention to negative behaviors, removal or escape from the classroom or non-preferred activities) (e.g., Jack et al., 1996; Lerman & Vorndan, 2002; Shores & Wehby, 1999; Sutherland, Wehby, & Copeland, 2000). But if the accumulation of research evidence over decades has pointed the field toward interventions that hold promise, why does there appear to be a dearth of evidence-based practice in classrooms that include students with EBD (Gable, Tonelson, Sheth, Wilson, & Park, 2012)? There are surely problems with the dissemination of research findings (e.g., Cook, Cook, & Landrum, 2013) and the preparation of teachers (Maheady, Smith, & Jabot, this volume); but it seems to us that a major roadblock in developing and applying evidence-based practice specifically to the problems of EBD may be
the complex, chronic, and intractable nature of the manifestations of such disorders (Kauffman & Landrum, 2013; also see Landrum, Wiley, Tankersley, & Kauffman, in press). That is, even in cases where the best available evidence-based interventions are applied with fidelity over sustained periods of time, the emotional and behavioral problems are seldom solved; the challenges associated with EBD do not simply go away (e.g., Cohen, Cohen, & Brook, 1993). Indeed, the long-term outcomes for students identified with EBD in school are among the poorest of any group of students with or without disabilities. Compared to their peers, students with EBD earn lower grades, fail more classes, are retained in grade more frequently, pass state competency tests at lower rates, leave school earlier (i.e., drop out), are more likely to be unemployed or underemployed when they do leave school, and are more likely to be engaged with the juvenile justice system (Frank, Sitlington, & Carson, 1995; Koyangi & Gaines, 1993; Marder, 1992; U.S. Department of Education, 2011; Wagner, Newman, Cameto, & Levine, 2005). With the complex considerations of intervening with students with EBD and the chronic and intractable nature of their emotional, behavioral, and academic problems, we suggest that searching for ‘‘what works in EBD’’ is simply a case of asking the wrong question. Given the bleak outlook for most students with well-developed EBD, the questions we should be asking relative to serving children and youth with EBD are not surprising. First, we must ask what we can do to prevent the initial development of EBD: what interventions must be applied universally, and what features should characterize instruction for all students if we are to truly prevent EBD from ever emerging? 
Second, accepting that prevention can never be 100% successful (i.e., some children and youth will still develop EBD despite our best efforts), what interventions and characteristics of instruction can keep EBD that has already emerged from getting worse? Kauffman (1999) has decried the ways that our field has avoided the hard work of prevention, but we still see hope in the systematic effort of many scholars to develop and validate interventions for the extraordinarily challenging behavior typical of most students with EBD. As we highlight throughout this chapter, however, the challenges of this population are great, the problems they bring are generally well-entrenched before they are addressed, and once fully developed, the behavior problems of students with EBD do not respond well to typical intervention. In the sections that follow, we flesh out five ideas related to the search for evidence-based practice in EBD. First, identification as eligible for special education under the category of EBD (emotional disturbance in the federal
language used in IDEA) tells educators little – indeed, almost nothing – about how to intervene or teach. In other words, identification for services must not be mistaken for diagnosis of specific problems. Second, rather than searching for interventions for EBD, scholars must remain focused on establishing targets for intervention – specific skill deficits or behavioral excesses – and working toward the development and validation of interventions that impact these targets. Third, a rich literature in special education over the last half century has yielded a significant body of promising interventions, and there must be a concerted effort at increasing the translation of this research into practice. Fourth, on the heels of our argument that a rich empirical base is available to guide practice, it must be accepted that a new approach is emerging in evaluating the methodological rigor of studies and synthesizing evidence in support of interventions in special education for the purpose of identifying the most generally effective instructional practices (Cook, Tankersley, & Landrum, 2009). This approach speaks to the demands that (a) researchers evaluate (or reevaluate) previously published literature to confirm (or dispute) practices as evidence-based and (b) going forward, researchers should attend even more carefully both to the rigor of their methods and the reporting of their findings. Finally, we consider a question that will always complicate the evaluation of research studies in search of evidence-based practice, especially so in the quest for effective practices for students with EBD: effective for whom?
For example, when a set of rigorous studies points to a particular intervention as evidence-based with regard to improving some outcome, it remains incumbent on educators to examine the nature of participants (e.g., ages, gender, ability levels, learning and behavioral characteristics) in this body of research before deciding whether results can be expected to generalize to a given group of students. How specific must research be to a given population in order for generalization to be presumed, and does this differ for students with EBD in particular? We conclude with thoughts on where the field of EBD stands at present with regard to evidence-based practice and where future efforts might be targeted in order to move the field toward more consistent use of evidence to guide practice.
PROBLEMS WITH PRESCRIBING TREATMENT BASED ON A LABEL
The notion of a so-called ‘‘medical model’’ applied to special education has carried no small measure of controversy and debate over the years
(e.g., Clark, Dyson, & Millward, 1998; Forness & Kavale, 2001a; Kauffman & Hallahan, 1974; Trent, Artiles, & Englert, 1998). A full discussion of the medical model versus other conceptions of disability (e.g., a deficit model vs. the social construction of disability) is well beyond our scope here, but one aspect of this thinking may be particularly relevant to discussions about evidence-based practice. The medical model, according to common interpretation, suggests that disability lies within the individual, and connotes a deficit, deficiency, or disease state that should be addressed, if not overcome, by medical care or the application of some similar form of prescribed professional intervention or treatment. With regard to EBD in particular, the idea of prescribing treatment based on a diagnosis warrants closer inspection (e.g., Forness & Kavale, 2001b). We argue that schools are not (or should not be) in the business of diagnosing disorders in the classic sense of that term. Rather, according to IDEA regulations, schools are to use a multidisciplinary process to identify students who are eligible for special education services because of educational characteristics that result in their inability to learn through the educational program provided to typical students. This is a fine point to be sure, and quibbling over the terms identification and diagnosis may seem little more than semantics. But in practice the term diagnosis may carry at least some implication that an answer has been discovered (e.g., the root or cause of the problem) or at minimum that a formal diagnosis will point to prescribed treatments to ameliorate or even cure a condition. Neither of these, of course, is the case in terms of schools’ identifying students as having EBD.
MOVING FROM LABELS TOWARD TARGETS FOR INTERVENTION
Landrum, Tankersley, and Kauffman (2003) have argued that neither a diagnosis nor a school-based eligibility determination points to an educational plan at any reasonable level of specificity. In other words, it is probably not fruitful to approach the identification of evidence-based practices with an eye toward effective interventions ‘‘for EBD.’’ As we have suggested, the complexities of students’ emotional and behavioral responses, in concert with highly varied academic needs, make for an almost limitless array of intervention needs. What Landrum et al. proposed instead is a more systematic look at targets for intervention; that is, an approach that
delineates specific behavioral and academic needs (e.g., ‘‘aggression,’’ ‘‘compliance,’’ ‘‘social interaction,’’ ‘‘attention to task’’) that can then be matched to empirically validated interventions for improving specific behaviors or ameliorating particular deficits. It is far more logical, feasible, and ultimately useful to teachers for researchers to focus on the development and validation of interventions for specific concerns. Matching interventions to specific behavioral and academic concerns will also be helpful for synthesizing research results across the literature, provided the outcomes targeted are operationally defined. Our argument, therefore, is that the label EBD conveys little more than eligibility for special education services in school, and that treating all students with EBD as a homogeneous group will, at best, result in spurious effectiveness. School identification as eligible for special education under this category is not, unfortunately, a differential diagnosis that points to a prescriptive list of indicated treatments. Nonetheless, we do know, based on the history of who is identified as EBD, that some generalities of effective intervention do apply. Landrum et al. (2003) delineated three broad domains that capture most of the challenges students with EBD experience. First, students with EBD invariably display higher rates of inappropriate behavior than their peers, and conversely, they tend to display lower rates of appropriate behaviors. Second, they experience academic difficulties that are at least linked to their behavioral difficulties (regardless of whether academic problems caused behavioral concerns, or vice versa, the two are reciprocally linked and any intervention must take this into account). Finally, and perhaps related to their behavioral excesses and deficits, students with EBD struggle with relationships with both peers and adults.
We reiterate that these are generalities, and know that individual teachers must conduct valid assessments (i.e., those that include reliable observational data) in order to learn about individual students’ specific academic and behavioral strengths and needs – skills and skill deficits – so that individually designed interventions can be applied. Still, this framework provides a workable means for teachers to approach the selection and implementation of interventions. The implicit inclusion of academic difficulties as a defining domain of EBD in this framework bears a bit of discussion here. Around the beginning of the 21st century we noticed a shift from focusing attention primarily or disproportionately on classroom behavior of students with EBD, to include a long overdue emphasis on their academic learning and instruction (see Kauffman, 2003; Kauffman & Bantz, in press; Nelson, Benner, Lane, & Smith, 2004). The view that effective instruction is the first line of defense in
preventing the problems of EBD from worsening is now widely accepted, and the reasons for this seem simple. If students are engaged in structured, well-paced, interesting lessons, they are far less likely to engage in disruptive or off-task behavior (see Brophy & Good, 1986). Moreover, if students are engaged in effective lessons and their academic skill levels increase, they become far more likely to experience success in school, to receive more positive attention from teachers, and to draw less criticism or negative attention from both peers and adults for their poor performance. While explicit focus on the interplay between academics and behavior is long overdue (Sutherland, Lewis-Palmer, Stichter, & Morgan, 2008), researchers are beginning to focus in earnest on this concept, and many promising interventions seem to carry powerful benefits for academic, behavioral, and social interaction outcomes (e.g., ClassWide Peer Tutoring, self-management interventions).
INTERVENTIONS THAT SHOW PROMISE
A number of promising interventions for the types of behaviors typical of students with EBD have emerged from the traditions of applied behavior analysis and direct instruction. Although we do not provide an overview of the research literature underlying all of the interventions showing promise for students with EBD, we discuss here several key features that seem to be consistent across empirically validated interventions: simplicity, explicitness, and predictability.
Simplicity
There is tremendous risk of overwhelming both students and teachers with interventions that are complex, multifaceted, or have multiple steps or components (see Lane, Beebe-Frankenberger, Lambros, & Pierson, 2001; Lane, Bocian, MacMillan, & Gresham, 2004). For teachers, the notion of simplicity is that the intervention is easy to implement within the context of the classroom, along with the demands of the curriculum, and in relation to the time and resources needed. Teachers often report that they simply do not have the time to plan, faithfully implement, and manage multiple new components on top of existing instructional and management demands (e.g., Bambara, Goh, Kern, & Caskie, 2012). In a practical sense, they may also be significantly less likely to adopt a new intervention – even if it works – if
the effort needed to implement it requires them to juggle or even disregard other responsibilities they view as equally important (Konrad, Helf, & Joseph, 2011). For students, the notion of simplicity is related to a key element of effective instruction (see Becker & Carnine, 1980); students should be presented only with the essential information that is relevant to the task at hand (whether academic or social/behavioral). For example, three classroom rules are probably better than eight; a 5-min lecture and demonstration to introduce a new concept is probably preferable to a 20-min lecture (no matter how engaging) (e.g., Rademacher, Callahan, & Pederson-Seelye, 1998).
Explicitness
Very much related to simplicity is explicitness. Explicitness is being unambiguous and direct in expectation, explanation, and direction. We can imagine no reason for keeping any aspect of an academic or behavioral intervention secret from a student. Clarity in what is expected, why it is expected, precisely what students are to do to complete a task or comply with a request, what responses they are required to make, when they are to finish, and what they are to do when finished are all examples of essential characteristics of an effective intervention. Rather than saying ‘‘Time to get ready for our quiz,’’ the teacher who states ‘‘I will pass out the quiz in 2 minutes. Please put all your books and papers away and keep only a pencil on your desk,’’ leaves far less room for confusion or doubt about expectations or what is to happen next. Archer and Hughes (2011) identified 16 instructional and behavioral elements that are characteristic of an explicit approach to teaching. Among them are sequencing skills logically, breaking down complex skills and strategies into smaller instructional units, providing step-by-step instructions, using clear and concise language, delivering sufficient examples and non-examples, providing guided and supported practice, including opportunities for frequent responses, monitoring student performance closely, and providing immediate affirmative and corrective feedback (p. 2). These characteristics of explicit instruction have been supported by over 30 years of research and apply to both academic and social behavior instruction. We believe explicit instruction provides students with EBD the clarity and precision necessary to be engaged and successful in learning. Explicitness matters not only in the delivery of an intervention but also in its design. Descriptions of ClassWide Peer Tutoring
(Delquadri, Greenwood, Whorton, Carta, & Hall, 1986), for example, include precise guidelines for how teachers form tutoring pairs, teach students the tutor and tutee roles, implement the intervention, keep track of points, and reward students both for their academic responses and for adhering to the tutoring process properly (see Maheady & Gard, 2010). Such explicitness in design increases the probability that teachers will deliver the intervention with precision and fidelity (Lane et al., 2004) and provides researchers clear direction for determining treatment integrity and for synthesizing study results (Gresham, 2009).
Predictability
Consequences are, of course, a hallmark of effective behavioral interventions, and it would be hard to imagine a behavior intervention plan that did not include clearly stated and consistently implemented consequences. In an effective behavioral intervention, desired behavior is followed – always – by the positive outcome that has been established, and the absence of this desired behavior never – ever – results in the same positive outcome. The concept is that consistency of implementation leads to predictability, which is generally regarded as a key element of success with students with EBD. But beyond reinforcement provided for specific responses, to be successful in academics, relationships, vocation, or recreation, individuals must become skilled at predicting what will happen following a given response or behavior. Predictability in responses, environments, and routines helps students understand the consequent events that logically follow from their actions, including both behavioral and academic responses. In the context of a timeout intervention, for example, students must be able to predict what gets them removed from a reinforcing environment, as well as what is necessary to earn their way back into the opportunity to earn reinforcement. Again, this contingency should not be difficult for students to predict, and specific, explicit examples must be taught. For example, students should be taught that physical aggression will result in their removal, at least temporarily, from a game on the playground. Likewise, they must learn what the intervention involves (e.g., sitting quietly on the bench for 5 min), as well as how they reenter the game (e.g., acknowledging their behavior and apologizing to the student who was the target of their aggression).
Simplicity, explicitness, and predictability are not the only key features of effective interventions for students with EBD, but the learning history of most students with EBD is such that these characteristics probably are
critical elements of any intervention that is to be successful (e.g., Wagner, Kutash, Duchnowski, Epstein, & Sumi, 2005). Given that most students with EBD have experienced behavioral and academic struggles for years before they are formally identified as eligible for special education (see Feil et al., 2005; Kauffman & Landrum, 2013), it is likely that the responses of peers and adults in their lives have historically been anything but simple, explicit, and predictable. As teachers set out to plan and implement interventions for their students with EBD, they must be prepared not only to attend to the characteristics of simplicity, explicitness, and predictability we describe here, but to do so with intensity and relentlessness if these interventions are to reverse the negative patterns that have undoubtedly typified school for their students (see Landrum et al., in press, for a discussion of the intensity and relentlessness with which interventions must be delivered to students with EBD). Indeed, because EBD is an extreme condition, simple, explicit, and predictable interventions must be delivered with extraordinary energy, commitment, and resolve (see Kauffman, Bantz, & McCullough, 2002). We think the key features of an intervention that shows promise for students with EBD, then, are intense and relentless execution of practices that are
targeted at specific behavioral and academic needs, simple for teachers to implement and for students to understand, explicit in delivery and design, and predictable in consequent events.
To illustrate some of the promising interventions for the types of behavior typical of students with EBD, we provide in Table 1 a number of empirically validated interventions, one or more clear targets of each intervention, and examples of where to find information on how to implement them. In our assessment, the interventions listed in Table 1 exhibit the key features of being targeted, simple, explicit, and predictable.
Table 1. Interventions that Show Promise for Students with EBD.

Intervention: Contingent teacher attention (praise); behavior specific praise (BSP)
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; increase compliance; enhance motivation; increase task engagement
Examples of implementation information: The IRIS Center at http://iris.peabody.vanderbilt.edu/resources.html; Partin, Robertson, Maggin, Oliver, and Wehby (2009)

Intervention: Reinforcement (positive, differential, negative, token)
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; increase compliance; enhance motivation; increase task engagement
Examples of implementation information: University of Kansas Special Connections at http://www.specialconnections.ku.edu/; Epstein, Atkins, Cullinan, Kutash, and Weaver (2008); University of Missouri Evidence Based Intervention Network at http://ebi.missouri.edu/?cat=29; Jenson, Rhode, and Reavis (2009); Rhode, Jenson, and Reavis (2010); Sprick, Garrison, and Howard (2009)

Intervention: ClassWide Peer Tutoring
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; increase compliance; enhance motivation; increase task engagement; decrease disruptive behavior
Examples of implementation information: Greenwood, Delquadri, and Carta (1997); Center for Effective Collaboration and Practice at cecp.air.org/familybriefs/docs/PeerTutoring.pdf; Promising Practices Network at http://www.promisingpractices.net/default.asp; CEC’s DLD and DR Practice Alerts at http://teachingld.org/alerts

Intervention: Good Behavior Game
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; increase compliance; enhance motivation; increase task engagement; decrease disruptive behavior
Examples of implementation information: Intervention Central at interventioncentral.org; Kellam et al. (2011)

Intervention: Increased opportunities to respond (OTRs)
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; enhance motivation; increase task engagement
Examples of implementation information: Missouri Positive Behavior Support at http://pbismissouri.org/?s=opportunities+to+respond

Intervention: Peer Assisted Learning Strategies (PALS)
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses; enhance motivation; increase task engagement
Examples of implementation information: Vanderbilt University at http://www.kc.vanderbilt.edu/pals/; Promising Practices Network at http://www.promisingpractices.net/default.asp; What Works Clearinghouse at http://ies.ed.gov/ncee/wwc/interventionreport.aspx?sid=569

Intervention: Direct Instruction
Potential targets: Increase appropriate academic responses; increase appropriate behavioral responses
Examples of implementation information: CEC’s DLD and DR Practice Alerts at http://teachingld.org/alerts; University of Kansas Special Connections at http://www.specialconnections.ku.edu/; Promising Practices Network at http://www.promisingpractices.net/default.asp

Intervention: Behavioral momentum (high probability requests)
Potential targets: Increase compliance; enhance motivation; increase task engagement
Examples of implementation information: Lehigh University College of Education Center for Promoting Research to Practice at https://coe.lehigh.edu/content/project-reach-resourcesteachers; Haring Center at http://www.haringcenter.washington.edu/sites/default/files/file/HPR%20Tip%20Sheet.pdf

Intervention: Choice and preferred activities
Potential targets: Increase compliance; enhance motivation; increase task engagement
Examples of implementation information: Lehigh University College of Education Center for Promoting Research to Practice at https://coe.lehigh.edu/content/project-reach-resourcesteachers

Intervention: Precision requests
Potential targets: Increase compliance
Examples of implementation information: Randy Sprick’s Safe and Civil Schools at http://www.safeandcivilschools.com/newsletters/archive/12-13-2oct-tip-noncompliance.php

Intervention: Self-monitoring
Potential targets: Increase task engagement; increase productivity
Examples of implementation information: Lehigh University College of Education Center for Promoting Research to Practice at https://coe.lehigh.edu/content/project-reach-resourcesteachers; The IRIS Center at http://iris.peabody.vanderbilt.edu/resources.html

Intervention: Function-based interventions (functional behavioral assessment, A-B-C analysis) for modifying antecedents and/or consequences
Potential targets: Decrease inappropriate behavioral responses; increase appropriate replacement behavior
Examples of implementation information: Umbreit, Ferro, Liaupsin, and Lane (2007); Lane, Cook, and Tankersley (2013)

Intervention: Precorrection
Potential targets: Decrease inappropriate academic responses; decrease inappropriate behavioral responses
Examples of implementation information: The IRIS Center at http://iris.peabody.vanderbilt.edu/resources.html; Council for Children with Behavior Disorders at http://www.ccbd.net/sites/default/files/BB%2016%281%29%20using%20precorrection.pdf

Intervention: Response cost
Potential targets: Decrease inappropriate behavioral responses
Examples of implementation information: Lane, Menzies, Bruhn, and Crnobori (2010)

EVALUATING EVIDENCE BASES
As is evident in the chapters throughout this volume, a tremendous push is underway in the special education research community to become more systematic and logical in the definition and identification of evidence-based practices. An important marker of this movement was the publication in 2005 of quality indicators for methodological rigor in special education research (see Gersten et al., 2005; Horner et al., 2005). Although these quality indicators represented a dramatic step forward in the movement of the field toward a systematic method of determining what works in special education, and a number of syntheses have since been published in which scholars have applied these quality indicators, it is important to accept that the development, validation, systematic application, and widespread agreement on a model for evaluating evidence are in their early stages. For example, in a special issue of Exceptional Children in 2009, five teams of scholars applied the quality indicators to bodies of research in special education (Baker, Chard, Ketterlin-Geller, Apichatabutra, & Doabler, 2009; Browder, Ahlgrim-Delzell, Spooner, Mims, & Baker, 2009; Chard, Ketterlin-Geller, Baker, Doabler, & Apichatabutra, 2009; Lane, Kalberg, & Shepcaro, 2009; Montague & Dietz, 2009). Among the key outcomes of this effort were observations by the research teams that while the Horner et al. and Gersten et al. quality indicators provided an important framework for evaluating evidence, many questions remain about how to operationalize and apply them in ways that are valid, reliable, and useful (Cook et al., 2009). Note that in many ways special education was late to the table in terms of developing and promoting standards for evaluating evidence, as a number of other organizations preceded the quality indicators effort in special education with standards and processes of their own (e.g., APA Division 12 Task Force, What Works Clearinghouse). Though as a field special education is still early in the process of defining and determining evidence-based practice, there is some measure of urgency to at least come to consensus on broad concepts for at least two reasons: (a) consistent evaluation of previously published studies and (b) agreed-upon guidelines to direct future research.
First, it seems imperative that reviews, syntheses, and evaluations of previously published empirical evidence in special education follow some consistent framework to identify what is currently known about effectiveness of practices and to minimize the risk of competing conclusions drawn from different evaluations of the same bodies of literature. For example, Cook and Tankersley (2007) discussed the problems associated with “retrofitting” quality indicators developed in the present day to research studies published years, or even decades, ago. An obvious example of this difficulty is the now nearly universal requirement that researchers report effect sizes for any findings of statistically significant differences. This requirement is relatively new in special education research yet is now necessary for most scholarly publication of results. If this requirement – to report an effect size – were to be applied to research findings published prior
to its establishment, huge bodies of important research literature would be ignored or discounted when determining which studies to include in a review of evidence supporting a given practice. Second, a consensus-based framework would also provide current guidelines to follow for implementing and reporting intervention research. Although the application of the quality indicators put forth by Horner et al. (2005) and Gersten et al. (2005) to extant studies and bodies of research has proven difficult, researchers tend to agree on the importance of such indicators as guidelines for designing and implementing rigorous intervention research studies (e.g., Baker et al., 2009; Browder et al., 2009; Chard et al., 2009; Lane et al., 2009; Montague & Dietz, 2009). That is, as many researchers have attempted to apply the quality indicators, they recognized the need for operational definitions of the indicators to guide current and future research so that evidence can be synthesized in meaningful and reliable ways (e.g., Cook & Tankersley, 2007; Lloyd, Pullen, Tankersley, & Lloyd, 2006; Tankersley, Cook, & Cook, 2008). We must also acknowledge that concerns about a consistent framework for evaluating evidence impinge upon our confidence in the research-based practices we have listed in Table 1. Although we know that the practices listed are research-based (and are confident in the positive effects shown consistently in the studies using them), can these practices be considered evidence-based (see Cook, Smith, & Tankersley, 2012, for a discussion of research-based vs. evidence-based practices)? Some practices included in the table have been subjected to formal review using defined standards (e.g., CWPT has been reviewed by the What Works Clearinghouse and according to their evidence standards found to have “potentially positive effects” on reading achievement).
Other practices listed there have been subjected to more traditional integrative reviews (e.g., self-monitoring; Reid, 1996) and pronounced to be empirically validated on that basis. Still others have simply been around so long, and the subject of literally hundreds of studies (e.g., contingent teacher attention), that few seem to question their merit (ourselves included). Again, if and when our field arrives at some consensus on even a general framework for evaluating evidence, there will be much work to do on appraising both current and previously published bodies of research.
EFFECTIVE FOR WHOM?
Our argument in this chapter is that so-called targets for intervention represent a more meaningful conceptualization of how we might go about
identifying evidence-based practice in EBD rather than seeking effective interventions “for EBD.” Regardless of how evidence-based practices are conceptualized, we must at some point in the process of asserting that a practice is supported by evidence confront the question: effective for whom (Zigmond, 2003)? To expand on the potential scenario we raised earlier, suppose a research team evaluates a set of studies on the effects of self-monitoring on students’ attending to task, and determines by applying a set of quality indicators to this body of literature that a sufficient number of methodologically sound (or rigorous) studies show positive effects, and thus they pronounce self-monitoring an “evidence-based practice” for attentional problems. Who were the participants in the studies that met criteria for methodological rigor? What were their ages, genders, academic grade levels, and, certainly, special education categories of disability, if any? More importantly, which of these characteristics are relevant? Perhaps the studies included participants ranging in age from 7 to 12 years, in elementary and middle school classrooms; both males and females were included, and racial/ethnic representation mirrored that of the United States. Suppose further that 20% of the participants were identified by their schools as having learning disabilities. Should a high school teacher of students with EBD implement self-monitoring for her students who have serious attentional difficulties in her 10th-grade English class? What factors would lead her to choose, or avoid, a particular intervention in such a scenario? In one sense we think that professional educators can and should be trusted to make judgments about the generalizability of findings from a review of studies (see Cook, Tankersley, & Harjusola-Webb, 2008).
At the same time, however, we recognize increasing scrutiny of the standards for evidence and the methods used to apply evidence standards to sets of research studies. The processes are not simple, and the issues raised often involve conceptually complex or statistically sophisticated concerns that researchers and scholars continue to debate. To expect the individual practitioner alone to sort through such issues and consistently arrive at appropriate decisions about the potential utility of an intervention may be unfair. Instead, we think that educational teams (involving an array of individuals who may be familiar with research such as teachers, parents, school psychologists, administrators) should be involved together in selecting and adapting evidence-based practices to meet students’ targeted behavioral and academic needs (see Johnson & McMaster, this volume). But, to ask that educational teams invest in this process, it is incumbent
upon academic scholars, organizations of professional standards, and other translators of research to synthesize, interpret, and transform what is currently known into useful and applicable practice so that educational teams can make use of the evidence base. Without translation, we cannot move forward. And, without first coming to consensus on concepts of defining and determining what constitutes an evidence-based practice, we cannot identify practices that should be translated and then implemented.
CONCLUSION: A CALL TO ACTION
Researchers, teacher educators, administrators, and direct service providers all have a job to do related to bringing evidence-based practices to students with EBD. Scholars in the fields of EBD and special education need to reach agreement on a framework and criteria for consistently evaluating previously published research and to guide future research so that syntheses of practices can be conducted with reliability. Then, translators of research (e.g., teacher educators, leading practitioners, researchers, professional organizations) must find accessible, meaningful, and useful ways to bring practices to teachers and educational teams. The translations of practices should highlight which target behaviors can be addressed through their use so that teachers and teams can make important matches between students’ behavioral and academic needs and the potential outcomes of the interventions. Moreover, the translation of practices should keep the key features of simplicity, explicitness, and predictability as goals for presentation and dissemination. Educational team members will determine which evidence-based practices are most appropriate for individual students in the contexts in which they will be applied and will make decisions about implementation based on current data on student success. Identifying effective practices for students with EBD is an insufficient but necessary condition for reaching the goal of positively influencing students’ outcomes. The accumulation of past research evidence, as well as current and ongoing research, holds promise for addressing the complex, chronic, and intractable nature of the manifestations of EBD, and there must be a concerted effort at increasing the translation of this research into practice. As efforts toward translation proceed, it seems wise to focus on targets of intervention – remediating specific skill deficits or decreasing behavioral excesses – rather than interventions “for EBD.”
REFERENCES
Archer, A., & Hughes, C. (2011). Explicit instruction: Effective and efficient teaching. New York, NY: Guilford Press.
Baer, D. M., Wolf, M. M., & Risley, T. R. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1, 91–97.
Baker, S. K., Chard, D. J., Ketterlin-Geller, L. R., Apichatabutra, C., & Doabler, C. (2009). The basis of evidence for self-regulated strategy development for students with or at risk for learning disabilities. Exceptional Children, 75, 303–318.
Bambara, L. M., Goh, A., Kern, L., & Caskie, G. (2012). Perceived barriers and enablers to implementing individualized positive behavior interventions and supports in school settings. Journal of Positive Behavior Interventions, 14, 228–240.
Becker, W. C., & Carnine, D. W. (1980). Direct instruction: An effective approach to educational intervention with the disadvantaged and low performers. Advances in Clinical Child Psychology, 3, 429–473.
Brophy, J. E., & Good, T. G. (1986). Teacher behavior and student achievement. In M. Wittrock (Ed.), Handbook of research in teaching (3rd ed., pp. 328–375). New York, NY: Macmillan.
Browder, D., Ahlgrim-Delzell, L., Spooner, F., Mims, P. J., & Baker, J. N. (2009). Reviewing the evidence base for using time delay to teach picture and word recognition to students with severe developmental disabilities. Exceptional Children, 75, 343–364.
Chard, D. J., Ketterlin-Geller, L. R., Baker, S. K., Doabler, C., & Apichatabutra, C. (2009). Repeated reading interventions for students with learning disabilities: Status of the evidence. Exceptional Children, 75, 263–281.
Clark, C., Dyson, A., & Millward, A. (Eds.). (1998). Theorising special education. New York, NY: Routledge.
Cohen, P., Cohen, J., & Brook, J. (1993). An epidemiological study of disorders in late childhood and adolescence – II. Persistence of disorders. The Journal of Child Psychology and Psychiatry, 34, 869–877.
Cook, B. G., Cook, L., & Landrum, T. J. (2013). Moving research into practice: Can we make dissemination stick? Exceptional Children, 79, 163–180.
Cook, B. G., Smith, G. J., & Tankersley, M. (2012). Evidence-based practices in education. In K. R. Harris, S. Graham & T. Urdan (Eds.), APA educational psychology handbook (pp. 495–528). Washington, DC: American Psychological Association.
Cook, B. G., & Tankersley, M. (2007). A preliminary examination to identify the presence of quality indicators in experimental research in special education. In J. Crockett, M. M. Gerber & T. J. Landrum (Eds.), Achieving the radical reform of special education: Essays in honor of James M. Kauffman (pp. 189–212). Mahwah, NJ: Lawrence Erlbaum.
Cook, B. G., Tankersley, M., & Harjusola-Webb, S. (2008). Evidence-based special education and professional wisdom: Putting it all together. Intervention in School and Clinic, 44(2), 105–111.
Cook, B. G., Tankersley, M., & Landrum, T. J. (2009). Determining evidence-based practices in special education. Exceptional Children, 75, 365–383.
Delquadri, J. C., Greenwood, C. R., Whorton, D., Carta, J. J., & Hall, R. V. (1986). Classwide peer tutoring. Exceptional Children, 52, 535–542.
Epstein, M., Atkins, M., Cullinan, D., Kutash, K., & Weaver, R. (2008). Reducing behavior problems in the elementary school classroom: A practice guide (NCEE #2008-012). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/publications/practiceguides
Feil, E. G., Small, J. W., Forness, S. R., Serna, L. R., Kaiser, A. P., Hancock, T. B., … Lopez, M. L. (2005). Using different measures, informants, and clinical cut-off points to estimate prevalence of emotional or behavioral disorders in preschoolers: Effects on age, gender, and ethnicity. Behavioral Disorders, 30, 375–391.
Forness, S. R., & Kavale, K. A. (2001a). Ignoring the odds: Hazards of not adding the new medical model to special education decisions. Behavioral Disorders, 26, 269–281.
Forness, S. R., & Kavale, K. A. (2001b). ADHD and a return to the medical model of special education. Education & Treatment of Children, 24, 2–24.
Frank, A. R., Sitlington, P. L., & Carson, R. R. (1995). Young adults with behavioral disorders: A comparison with peers with mild disabilities. Journal of Emotional and Behavioral Disorders, 3, 156–164.
Gable, R. A., Tonelson, S. W., Sheth, M., Wilson, C., & Park, K. L. (2012). Importance, usage, and preparedness to implement evidence-based practices for students with emotional disabilities: A comparison of knowledge and skills of special education and general education teachers. Education and Treatment of Children, 35, 499–520.
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Greenwood, C. R., Delquadri, J. C., & Carta, J. J. (1997). Together we can! Classwide peer tutoring to improve basic academic skills. Longmont, CO: Sopris West.
Gresham, F. M. (2009). Evolution of the treatment integrity concept: Current status and future directions. School Psychology Review, 38, 533–540.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Jack, S. L., Shores, R. E., Denny, R. K., Gunter, P. L., DeBriere, T., & DePaepe, P. (1996). An analysis of the relationship of teachers’ reported use of classroom management strategies on types of classroom interactions. Journal of Behavioral Education, 6, 67–87.
Jenson, W., Rhode, G., & Reavis, H. K. (2009). Tough kid toolbox. Eugene, OR: Pacific Northwest Publishing.
Kauffman, J. M. (1999). How we prevent the prevention of emotional and behavioral disorders. Exceptional Children, 65, 448–468.
Kauffman, J. M. (2003). Reflections on the field. Education and Treatment of Children, 26, 325–329.
Kauffman, J. M., & Bantz, J. (in press). Instruction, not inclusion, should be the central issue in special education. Journal of International Special Needs Education.
Kauffman, J. M., Bantz, J., & McCullough, J. (2002). Separate and better: A special public school class for students with emotional and behavioral disorders. Exceptionality, 10, 149–170.
Kauffman, J. M., & Hallahan, D. P. (1974). The medical model and the science of special education. Exceptional Children, 41, 97–102.
Kauffman, J. M., & Landrum, T. J. (2006). Children and youth with emotional and behavioral disorders: A history of their education. Austin, TX: ProEd, Inc.
Kauffman, J. M., & Landrum, T. J. (2013). Characteristics of emotional and behavioral disorders of children and youth (10th ed.). Upper Saddle River, NJ: Pearson.
Kellam, S. G., Mackenzie, A. C., Brown, C. H., Poduska, J. M., Wang, W., Petras, H., & Wilcox, H. C. (2011). The good behavior game and the future of prevention and treatment. Addiction Science & Clinical Practice, 6(1), 73. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3188824/
Konrad, M., Helf, S., & Joseph, L. M. (2011). Evidence-based instruction is not enough: Strategies for increasing instructional efficiency. Intervention in School and Clinic, 47(2), 67–74.
Koyanagi, C., & Gaines, S. (1993). All systems failure: An examination of the results of neglecting the needs of children with serious emotional disturbance. Washington, DC: National Mental Health Association.
Landrum, T. J., Tankersley, M., & Kauffman, J. M. (2003). What is special about special education for students with emotional or behavioral disorders? Journal of Special Education, 37, 148–156.
Landrum, T. J., Wiley, A. L., Tankersley, M., & Kauffman, J. M. (in press). Is EBD “special,” and is “special education” an appropriate response? In P. Garner, J. M. Kauffman, & J. Elliott (Eds.), Sage handbook of emotional & behavioral difficulties: Part I: Contexts, definitions, and terminologies. London, UK: Sage Publications.
Lane, K. L., Beebe-Frankenberger, M. E., Lambros, K. M., & Pierson, M. (2001). Designing effective interventions for children at-risk for antisocial behavior: An integrated model of components necessary for making valid inferences. Psychology in the Schools, 38, 365–379.
Lane, K. L., Bocian, K. M., MacMillan, D. L., & Gresham, F. M. (2004). Treatment integrity: An essential – but often forgotten – component of school-based interventions. Preventing School Failure, 48(3), 36–43.
Lane, K. L., Cook, B. G., & Tankersley, M. (Eds.). (2013). Research-based strategies for improving outcomes in behavior. Columbus, OH: Pearson.
Lane, K. L., Kalberg, J. R., & Shepcaro, J. C. (2009). An examination of the evidence base for function-based interventions for students with emotional or behavioral disorders attending middle and high schools. Exceptional Children, 75, 321–340.
Lane, K. L., Menzies, H. M., Bruhn, A. L., & Crnobori, M. (2010). Managing challenging behaviors in schools: Research-based strategies that work. New York, NY: Guilford Press.
Lerman, D. C., & Vorndran, C. M. (2002). On the status of knowledge for using punishment: Implications for treating behavior disorders. Journal of Applied Behavior Analysis, 35, 431–464.
Lloyd, J. W., Pullen, P. C., Tankersley, M., & Lloyd, P. A. (2006). Defining and synthesizing effective practice: Critical dimensions of research and synthesis approaches considered. In B. G. Cook & B. R. Schirmer (Eds.), What is special about special education? (pp. 136–154). Austin, TX: ProEd.
Maheady, L., & Gard, J. (2010). Classwide peer tutoring: Practice, theory, research, and personal narrative. Intervention in School and Clinic, 46(2), 71–78.
Marder, C. (1992). Education after secondary school. In M. Wagner, R. D’Amico, C. Marder, L. Newman, & J. Blackorby (Eds.), What happens next? Trends in postschool outcomes of youth with disabilities. The second comprehensive report from the National Longitudinal Transition Study of Special Education Students (pp. 3-1–3-39). Menlo Park, CA: SRI International.
Montague, M., & Dietz, S. (2009). An examination of the evidence-base for cognitive strategy instruction and improving mathematical problem solving for students with disabilities. Exceptional Children, 75, 285–302.
Nelson, J. R., Benner, G. J., Lane, K., & Smith, B. W. (2004). Academic achievement of K-12 students with emotional and behavioral disorders. Exceptional Children, 71, 59–73.
Partin, T. C. M., Robertson, R. E., Maggin, D. M., Oliver, R. M., & Wehby, J. H. (2009). Using teacher praise and opportunities to respond to promote appropriate student behavior. Preventing School Failure: Alternative Education for Children and Youth, 54(3), 172–178.
Rademacher, J. A., Callahan, K., & Pederson-Seelye, V. A. (1998). How do your classroom rules measure up? Guidelines for developing an effective rule management routine. Intervention in School and Clinic, 33, 284–289.
Reid, R. (1996). Research in self-monitoring with students with learning disabilities: The present, the prospects, the pitfalls. Journal of Learning Disabilities, 29, 317–331.
Rhode, G., Jenson, W. R., & Reavis, H. K. (2010). The tough kid book (2nd ed.). Eugene, OR: Pacific Northwest Publishing.
Shores, R. E., & Wehby, J. H. (1999). Analyzing the classroom social behavior of students with EBD. Journal of Emotional and Behavioral Disorders, 7, 194–199.
Sprick, R., Garrison, M., & Howard, L. (2009). CHAMPs: A proactive and positive approach to classroom management. Eugene, OR: Pacific Northwest Publishing.
Sutherland, K. S., Lewis-Palmer, T., Stichter, J., & Morgan, P. L. (2008). Examining the influence of teacher behavior and classroom context on the behavioral and academic outcomes for students with emotional or behavioral disorders. The Journal of Special Education, 41, 223–233.
Sutherland, K. S., Wehby, J. H., & Copeland, S. R. (2000). Effect of varying rates of behavior-specific praise on the on-task behavior of students with EBD. Journal of Emotional and Behavioral Disorders, 8, 2–8.
Tankersley, M., Cook, B. G., & Cook, L. (2008). A preliminary examination to identify the presence of quality indicators in single-subject research. Education and Treatment of Children, 31, 523–548.
Trent, S. C., Artiles, A. J., & Englert, C. S. (1998). From deficit thinking to social constructivism: A review of theory, research, and practice in special education. Review of Research in Education, 23, 277–307.
Umbreit, J., Ferro, J., Liaupsin, C., & Lane, K. L. (2007). Functional behavioral assessment and function-based intervention: An effective, practical approach. Upper Saddle River, NJ: Pearson Merrill Prentice Hall.
U.S. Department of Education. (2011). 30th annual report to Congress on the implementation of the Individuals with Disabilities Education Act, 2008. Washington, DC: Author.
Wagner, M., Kutash, K., Duchnowski, A. J., Epstein, M. H., & Sumi, C. (2005). The children and youth we serve: A national picture of the characteristics of students with emotional disturbances receiving special education. Journal of Emotional and Behavioral Disorders, 13(2), 79–96.
Wagner, M., Newman, L., Cameto, R., & Levine, P. (2005). Changes over time in the early postschool outcomes of youth with disabilities: A report of findings from the National Longitudinal Transition Study (NLTS) and the National Longitudinal Transition Study-2 (NLTS2). Menlo Park, CA: SRI International.
Zigmond, N. (2003). Where should students with disabilities receive special education services? Is one place better than another? The Journal of Special Education, 37, 193–199.
CHAPTER 12
EVIDENCE-BASED PRACTICES IN AUSTRALIA
Jennifer Stephenson, Mark Carter and Sue O’Neill
ABSTRACT
This chapter examines evidence-based practice in the Australian education system, with particular reference to special education. Initially a brief overview of the Australian education system will be provided, followed by consideration of the incorporation of the concept of evidence-based practice into Australian educational policy at both national and state level. Subsequently, Australian teacher registration and teacher education program accreditation standards will be examined with regard to the adoption of evidence-based practice. We then describe the use of evidence-based practices in teacher education programs, particularly in the area of classroom and behavior management and in special education/inclusion subjects. We will overview several research studies to illustrate the degree of penetration of the concept of evidence-based practice into educational systems and teaching practice. Although we found little evidence of a commitment to evidence-based practice in Australian education systems beyond rhetoric, we are cautiously optimistic that increasing emphasis will be given to the use of empirical evidence in the future.
Evidence-Based Practices Advances in Learning and Behavioral Disabilities, Volume 26, 273–291 Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 0735-004X/doi:10.1108/S0735-004X(2013)0000026014
DESCRIPTION OF THE AUSTRALIAN EDUCATION SYSTEM
The responsibility for education in Australia is primarily a state one, with the six states and two territories managing education through a state education department. All states offer 13 years of formal schooling, typically starting around 5 years of age and continuing to 18 years of age. The majority of schools in Australia (around 70%) are state run (there are also independent schools and Catholic school systems in each state), with financial contributions from the federal government, but with most funding coming from their state/territory government. There are currently moves to institute state/federal agreements on a national curriculum, national assessment and reporting procedures, and teacher registration and teacher education program accreditation (Australian Curriculum and Assessment Reporting Authority [ACARA], 2009; Australian Institute for Teaching and School Leadership [AITSL], 2012a).
Typically, for students with disabilities, state education departments provide a range of options including placement in a regular class (with or without additional specialist support), a segregated special education class within a regular school, or a segregated special school. Currently 4.6% of the school population has an identified disability and 89% of those attend mainstream schools. Of those, 73% are in mainstream classes. Over and above that group, although not formally identified as having a disability, around 10–15% of students are likely to have learning difficulties as shown by low progress in literacy and/or numeracy and these students will all be in regular classes (with or without specialist support) (Students with Disabilities Working Group, 2010).
Australia does not have national legislation that mandates particular policies or processes for providing special education services to students with disabilities.
There is, however, a legal requirement under the Commonwealth Disability Discrimination Act of 1992 and the related Disability Standards for Education of 2005 (Commonwealth of Australia, 2006) that students with identified disabilities have a right to access and participate in education in a safe environment and on the same basis as those without disabilities. Education providers must, in consultation with the student and their family, make reasonable adjustments, including the provision of additional supports, and adjustments to teaching strategies in order for students with disabilities to access educational programs (Commonwealth of Australia, 2006).
FEDERAL AND STATE POLICY AND EVIDENCE-BASED PRACTICE IN AUSTRALIA
Overarching education policy is developed by the federal Department of Education, Employment, and Workplace Relations (DEEWR) and is guided by The Melbourne Declaration (Ministerial Council for Education, Early Childhood Development, and Youth Affairs [MCEEDYA], 2008). This joint declaration by federal and state ministers responsible for education sets out the Australian education agenda for all students, including those with disabilities, for the next 10 years. The Melbourne Declaration does not reference research or evidence-based approaches to education at all, apart from a call for Governments to “develop a substantive evidence base on what works” (MCEEDYA, 2008, p. 17). This call relates to collecting Australian data, rather than promoting the use of existing research-based practices, or developing standards for practices that could then be regarded as evidence-based.
Under the National Education Agreement (Council of Australian Governments, 2008), which is part of the mechanism for implementing policy made between federal and state/territory governments, the Commonwealth has provided targeted funds for students with disabilities as part of the National Smarter Schools Partnership, although state/territory governments are responsible for policy and service delivery (DEEWR, 2011). The policy and reform directions in the National Education Agreement include reducing the effects of disadvantage (including disability) but none of the directions is explicitly related to implementing strategies drawn from the existing research or evidence base. As in the Melbourne Declaration, there is an emphasis on performance reporting on outcomes to show evidence of improvement, but no directives as to how these improvements might be achieved.
Similarly, the DEEWR strategic plan 2011–2014 (DEEWR, n.d.-a) does not link the proposed outcomes to policies supporting the use of existing research-based or evidence-based practices in education. Some insight into DEEWR’s understanding of evidence-based practice can be gleaned from the Teach, Learn, Share database established as part of the National Smarter Schools Partnership. This database, described as ‘‘a data-bank of literacy and numeracy evidence-based teaching strategies’’ (DEEWR, n.d.-b) includes strategies said to be appropriate for students with disabilities. The initiatives included on this site were those that were submitted by schools, researchers, or others during 2011 and 2012, and
assessed as being effective. The assessment framework (DEEWR, n.d.-c) is available and very few of the indicators relate to the quality or rigor of the actual research design (such as those in Gersten et al., 2005 or Horner et al., 2005) and the scores and ratings for each strategy are not available on the website. Qualitative data is treated as if it can provide evidence of effectiveness, and there is no differentiation based on the nature of quantitative research.
It seems, then, that promoting the use of research-based practice in education does not feature in the federal Australian national education agenda. Unlike the United States, where No Child Left Behind and federal government policies strongly endorse evidence-based practice and tie funding to the use of evidence-based practices, the Australian government pays minimal attention to supporting and promoting evidence-based practice in a formal and consistent manner. Although there is a commitment to improving education for students with disabilities, there is no coherent statement that suggests introducing evidence-based practices as a way of achieving better quality teaching, teachers, or curriculum.
Because states and territories largely develop their own education and special education policies, we examined the sites of state/territory education departments using keyword searches with evidence-based practice, research-based practice, and similar terms as keywords to locate relevant policies or practices. We also looked for key policy or planning documents related specifically to special education. We found consistent rhetoric about improving the quality of teaching and learning in education and special education but there was little mention of using research or evidence-based practices as a means to achieve these ends.
In the introduction to the five-year strategic plan for New South Wales Department of Education and Communities (NSW DEC) 2012–2017 (2012a), for example, the Director-General noted that the plan was ‘‘based on evidence and research’’ (p. 3). In the details of the plan, however, there are only two mentions of research and only one of those refers to using research to make decisions. There are only three mentions of evidence and only one refers to using evidence to inform practice. Similarly, current strategic plans for other state departments of education make either passing general reference to evidence providing a basis for change in educational practice (Queensland Department of Education, Training, and Employment, n.d.-a; South Australia Department of Education and Children’s Services, n.d.; Victorian Department of Education and Early Childhood Development, 2008; Western Australia Department of Education, 2012), refer to evidence-based practice in relation to specific initiatives (Northern
Territory Department of Education and Training, 2011), or fail to address evidence-based practice at all (Australian Capital Territory [ACT] Department of Education and Training, n.d.; Tasmanian Department of Education, n.d.).
Looking more specifically at policies and plans related to special education and the education of students with disabilities, the NSW DEC has, for example, a People with Disabilities-Statement of Commitment (NSW DEC, 2011) and a Disability Action Plan (NSW DEC, n.d.). The Action Plan has as an action to increase teacher capacity to “meet diverse learning and development needs and manage challenging behavior” (p. 18), but there is no mention of using evidence-based practice as a foundation for special education. A recent initiative in NSW, Every Student, Every School (NSW DEC, 2012b), is directed specifically at students with disabilities and additional learning needs. The documents relating to this initiative do not mention evidence-based practice, even though the plan aims to improve assessment and teaching practices for students with special education needs as well as the professional learning provided to teachers. Similarly, there are no references to research or evidence in the Victorian Department of Education and Early Childhood Development (VDEECD) Autism Friendly Learning Initiative aimed at improving educational provision for students with autism (VDEECD, 2011) nor in the South Australian Students with Disabilities Policy (South Australia Department of Education and Children’s Services, 2006) or the Tasmanian policy for inclusion (Tasmanian Department of Education, 2010). The Queensland Inclusive Education procedures (Queensland Department of Education, Training, and Employment, 2012a) place the responsibility on teachers to select evidence-based practices, but provide no guidelines about how this might be done. There are, however, some instances where more specific issues do appear to be informed by evidence.
Within the VDEECD, The Victorian Deaf Education Institute, for example, is very active as a research partner with universities and others in order to provide a research foundation for deaf education. For example, it commissioned a literature review of assessment and intervention for language and literacy for students with hearing impairment and will draw on this review to improve its services (Victorian Deaf Education Institute, 2012). An interesting and widespread example of the use of evidence-based practice, despite the lack of policy frameworks at federal and state level, is provided by Positive Behavior Interventions and Support (PBIS), also known as Positive Behavior Support (PBS) (known as Positive Behavior for Learning (PBL) in NSW) or School-wide Positive Behavior Support
(SWPBS). SWPBS is a three-tiered approach to improving student behavior, first developed in the United States (OSEP Center on Positive Behavioral Interventions and Supports, 2012). SWPBS provides support for all students and a framework for intervention and support for students with more challenging behavior and is an approach that caters for students with special education needs in both inclusive and segregated settings. A growing body of research has provided evidence of the effectiveness of the SWPBS approach (Horner et al., 2009).
At a federal level, a recent government initiative, the National Safe Schools Framework (MCEEDYA, 2011a), identified SWPBS as one of four evidence-informed approaches that had significant potential in reducing bullying in schools. In the associated manual provided to all Australian schools, evidence-informed was defined as “an evidence-informed approach considers research that demonstrates effective or promising directions and practices that have been carried out in different countries, cultures, school systems, and student populations in terms of its relevance for one’s own school” (MCEEDYA, 2011b, p. 44). It was evident that MCEEDYA had looked to international reviews and evaluations of interventions for bullying when providing the list of recommended approaches in their resource manual (see MCEEDYA, 2011b). Both the framework and the manual appeal to research and evidence, but as for more general government policies, there are no explicit standards for determining what is evidence-based and what is not.
In Australia, the first tier (universal level) of SWPBS, aimed at all students, was adopted by state school systems in Tasmania and Queensland in 2004–2005 (Christofides, 2008), and in NSW beginning in 2005 (Barker, Yeung, Dobia, & Mooney, 2009).
Subsequently, SWPBS has been adopted (but not mandated) system-wide, by the departments of education in the Northern Territory and Queensland (Christofides, 2008), and is one of several approaches advocated by the state education department in Victoria (Victorian Department of Education and Early Childhood Development, n.d.). It was of interest to establish whether the evidence-based nature of this approach had been an important consideration for the adopting education departments. In the Western Sydney Region of New South Wales, Barker et al. (2008) reported that the decision to adopt PBIS by the then Department of Education and Training (NSW DET) was based on a number of factors such as the evidence-based nature of the approach, its reported success in bringing about improvement in student behavior, its positive approach, and its flexibility.
The state education departments in Queensland (Queensland Department of Education, Training, and Employment, n.d.-b) and the Northern Territory (Northern Territory Department of Education, n.d.-a) overtly and clearly identified the approach as evidence-based in their introductory information on SWPBS and provided links to the PBIS website. The Queensland Department of Education, Training, and Employment also provided a link to the United States National Technical Assistance Centre on Positive Behavior Interventions and Supports for their schools. At the same time, policy documents, the Code of School Behaviour (Queensland Department of Education, Training, and Employment, n.d.-c) and Safe Supportive and Disciplined Learning Environments (Queensland Department of Education, Training, and Employment, 2012b) contain no mention of evidence-based practice or specific recommendations to use a SWPBS approach. The Northern Territory education department policy related to behavior management in schools is not publicly available. The Northern Territory is, however, committed to piloting SWPBS with a view to extending implementation in the future (Northern Territory Department of Education, 2012a). The Tasmanian entry page, on the other hand (Tasmanian Department of Education, 2012a) makes no mention of the research or evidence base when describing SWPBS. There is a research page in the suite of PBIS pages, but it contains only three articles, and none provide an overview of the research supporting PBIS (Tasmanian Department of Education, 2011). Although the evidence base of SWPBS is acknowledged on some of the education department websites, there is no indication of the weight placed on the evidence base by these departments when they decided to adopt and recommend SWPBS, and NSW is the only state where there is a clear indication that the evidence base was a factor. 
The terms evidence-based practice, research-based practice, and similar expressions, then, have been adopted to varying extents by federal and state education authorities in Australia as evidenced by the preceding examples. In most cases, it is unclear exactly how the term is being used. In those few instances in which the term is defined, a range of intended meanings are evident. For example, in a discussion paper on evidence-based practice produced by the ACT Department of Education (ACT Department of Education and Training, 2007), the term was used to refer both to the grounding of educational practice in research evidence and to practices based on data collected by local schools. It is not uncommon for practices to be labeled as “evidence-based” – for example, SWPBS, as outlined above, and reading and numeracy programs in the Northern
Territory (Northern Territory Department of Education, 2012b), with little or no specific reference to research or any indication as to how this conclusion was drawn. The example of one widely adopted evidence-based approach (SWPBS) shows that even when a practice does have an evidence base this may be noted as part of the motivation for using it, but the extent to which the evidence base influenced decision makers is unclear. More generally, conspicuously absent from discussion of evidence-based practice are any specific criteria for assessing practices (e.g., Gersten et al., 2005; Horner et al., 2005; What Works Clearinghouse, 2011) as being evidence-based or any specific reference to sources that might be considered trustworthy in making such an assessment (e.g., Center for Data-Driven Reform in Education, n.d.; What Works Clearinghouse, n.d.). On the available evidence, it would not be unreasonable to assert that state departments of education use the term “evidence-based practice” or appeals to research as an imprimatur for selected approaches or interventions, rather than as a systematic process for selecting such practices.
Within this vacuum, it is not unusual for decisions about selecting evidence-based practices to be devolved to principals and teachers, without providing any guidance as to how these practices may be selected. In Tasmania, the Raising the Bar Closing the Gap initiative (Tasmanian Department of Education, 2012b) required schools to implement evidence-based strategies in literacy and numeracy, but left it up to the school principal to identify these approaches. Also in Tasmania, under their Inclusion of Students with Disabilities in Regular Schools policy (Tasmanian Department of Education, 2010), the principal is responsible for providing appropriate professional development activities and the teacher is responsible for adopting appropriate teaching strategies.
Similarly, in Queensland, the Inclusive Education Policy (Queensland Department of Education, Training, and Employment, 2012a) gives teachers the responsibility of choosing appropriate strategies and using evidence-based approaches. One possible light at the end of the tunnel is the recent announcement of a Centre for Education Statistics and Evaluation (NSW Department of Education and Communities, 2012c). According to the announcement of the Centre, it will ‘‘deliver much needed information about what works and what’s effective’’ (p. 2) and its priorities include production of ‘‘user-friendly syntheses of key research findings on significant educational issues’’ (p. 6). If these functions are effectively served, the Centre would fill a significant void in the Australian educational system with regard to evidence-based practice. Nevertheless, the devil will be in the detail and at this stage such detail is absent.
There are some emerging pressures on governments to adopt standards for evidence-based practice within special education. The Australian Association of Special Education (AASE), a broad-based advocacy group for students with special education needs with representation on many federal and state advisory groups and committees, has recently released a position paper on evidence-based practice (AASE, 2012) which calls for state and federal education authorities to develop a set of criteria for judging the standards of evidence for practices in special education. The Australasian Journal of Special Education (published by AASE) published a special issue on a scientific approach to special education in 2008, in which Carter and Wheldall (2008) presented a case for scientific evidence-based practice in education. They were critical of the lack of use of evidence in determining educational policy in Australia and proposed criteria for evaluating education practices in the absence of high-quality research, which they noted is rare in education. They suggested a five-level rating system, ranging from Level 1 (use with confidence) down to Level 5 (educationally unsafe). Hempenstall (2006) also deplored the lack of use of evidence by education policy makers in Australia. There are, then, some voices in the special education community that are speaking for evidence-based practices.
EVIDENCE-BASED PRACTICE AND TEACHER ACCREDITATION
Because there is little guidance in national and state policy and planning documents for special education about standards for evidence-based practice or guidance as to what practices are evidence-based, we examined the documents from the various teacher accreditation bodies to determine if knowledge of evidence-based practices was required for teachers who would have students with disabilities in their classes. As students with disabilities are likely to be enrolled in both regular and special settings, the standards for regular classroom teachers and the required competencies for teaching diverse students, including those with special education needs, are relevant. As noted above, a new national body, the Australian Institute for Teaching and School Leadership (AITSL), has been established to set standards for teacher registration and to accredit teacher education courses (AITSL, 2011, 2012b). At present, there is no registration process for special education teachers in Australia at either a national or state level. The National Professional Standards for Teachers (AITSL, 2011) do, however, include standards expected of newly graduated, proficient, highly accomplished, and
lead teachers in the area of inclusion and education of students with special education needs who are in regular classes (see focus statements 1.5, 1.6, 3.1, 4.1, and 4.3). Although no mention is made of evidence or research-based practice for educating students with or without special education needs, frequent mention is made of effective teaching strategies throughout the standards document. The definition of effective teaching strategies provided in the glossary was: ‘‘strategies which research and workplace knowledge suggests contribute to successful learning outcomes for students’’ (AITSL, 2011, p. 20). No examples, however, were given. With no clear guidelines about what constitutes good research, it is unclear how lead teachers might identify evidence-based practices. The other role of AITSL is to accredit teacher education programs. AITSL standards are to be implemented nationally, and although some states had previously developed their own standards, we will consider only the national standards. Documents pertaining to the accreditation of initial teacher education programs do not explicitly require programs to base their curriculum content regarding educating students with special needs on evidence or research. AITSL does require program providers to take account of ‘‘authoritative educational research findings’’ (AITSL, 2012b, p. C3), when developing their programs. What constitutes authoritative educational research is not defined.
EVIDENCE-BASED PRACTICES IN TEACHER EDUCATION PROGRAMS
As previously noted, 73% of students diagnosed with a disability in Australia are enrolled in mainstream classes (Students with Disabilities Working Group, 2010), educated by teachers who may not have received special education training. Currently, only 3 out of 34 four-year, primary (elementary) teacher education programs do not have a mandatory subject on educating students with special needs (Stephenson, O’Neill, & Carter, 2012). With little guidance supplied by state or territory teacher registration/accreditation bodies as to what content to include, teacher education programs have been left to determine the curriculum content they offer in general teacher education programs to prepare teachers for inclusive classrooms containing students with disabilities. Stephenson et al. examined the content delivered in subjects related to teaching students with special needs in Australian primary preservice programs. Although two-thirds of the subjects covered instructional strategies, only six subject descriptions out
of the 61 subjects examined explicitly mentioned that content included evidence-based or research-based content. In a similar study, O’Neill and Stephenson (2011) examined the content of classroom behavior management subjects on offer in four-year primary education programs. In Australian tertiary institutions, a subject would typically run for one semester of around 12 weeks with three hours per week of face-to-face teaching time; full-time students complete four subjects per semester. These subjects are of interest as many are intended to prepare teachers for diverse classrooms containing students with special education needs. Few subjects (4/108) located via website searches explicitly mentioned evidence-based practices, although eight referred to evidence-based classroom and behavior management approaches including functional behavioral assessment (FBA), and three included PBS. In a study involving subject coordinators (academics responsible for the overall management of a subject) (O’Neill & Stephenson, 2012), information was provided by 49 subject coordinators, which covered 19 stand-alone subjects and 30 subjects where behavior management content was embedded in another subject, from 70% of Australia’s primary teacher education programs. The results showed that evidence-based approaches or models such as applied behavior analysis, PBS, and FBA were more commonly included in stand-alone, and less frequently included in embedded subjects. Applied behavior analysis, for example, was included in 75% of stand-alone and 33% of embedded subjects, but was no more likely to be included in a stand-alone subject coordinated by an academic who had a research interest in classroom and behavior management than in subjects coordinated by an academic without such interests. 
It seems then that although general teacher education programs recognize the need for some content on teaching students with disabilities and special education needs, the content provided may not be evidence or research-based. It would also appear that clearly linking content to an evidence base from research is not seen as particularly relevant, even by subject coordinators who are actively conducting research in this area.
EVIDENCE-BASED PRACTICES IN AUSTRALIAN SCHOOLS
Regardless of federal and state policies, standards for teacher certification, and content in teacher education programs, what happens in schools and classrooms regarding evidence-based practices is of the ultimate importance.
As we have established, where evidence-based practice is recommended, the onus has been placed on schools and teachers to determine what those practices might be, in the absence of clear criteria for establishing when a practice might be regarded as evidence-based. In the following section, research particularly relevant to special education and inclusive settings will be briefly reviewed to provide a snapshot of the status of evidence-based practice in schools.
Carter, Stephenson, and Strnadová (2011) reported the results of a survey of Australian special educators’ use of evidence-based practices, replicating the research of Burns and Ysseldyke (2009). Teachers were asked to report their level of use of eight practices on a 5-point scale from Almost never to Almost every day. Based on meta-analytic reviews, three practices were classified as effective (mnemonic strategies, applied behavior analysis, direct instruction) with effect sizes of 0.8 or above, one was rated as moderately effective (formative evaluation), with an effect size of 0.70, and four practices were considered ineffective (psycholinguistic training, social skills training, modality instruction, perceptual-motor training), with effect sizes below 0.40. The survey was distributed to the members of the Australian Association for Special Education. A total of 193 surveys were returned, a return rate of approximately 30%. It was encouraging that two of the evidence-based practices (direct instruction and applied behavior analysis) were employed at moderate to high levels and formative evaluation was also widely used. Social skills instruction was also widely employed despite the weak evidence base, although this may be understandable given the critical importance of this curriculum area to many learners with special needs. Less encouraging was the level of use of unverified practices with approximately half of teachers reporting the use of modality training almost every day.
Although the use of perceptual motor training and psycholinguistic training was much lower, approximately half of teachers still reported using these interventions once a week or more. In reference to perceptual motor programs, Stephenson, Carter, and Wheldall (2007) found that schools that used these programs often did so because they believed they would improve academic functioning of both typically developing students and students with difficulties in literacy and numeracy. They appeared to be unaware of the lack of support for such programs and to uncritically accept even the more extreme claims about their effects. These programs appear to be tacitly accepted by education departments, and some state education departments continue to encourage perceptual motor programs for young children (see, e.g., Northern Territory Department of Education, n.d.-b).
International comparisons with North American special educators (Carter et al., 2011) and those from the Czech Republic (Carter, Strnadová, & Stephenson, 2012) revealed some regional differences and idiosyncrasies, but the take-home message was similar. Carter et al. (2011) concluded:

The present research delivers both good and bad news. While Australian special education teachers reported high levels of use of a number of evidence-based practices, they also reported moderate-to-high levels of use of a number of interventions with poor research support. (p. 58)
Other research has explored what happens when schools operate in a policy vacuum with little or no guidance for selecting effective and evidence-based strategies. Multisensory environments (MSEs) or snoezelen rooms are designed to provide sensory stimulation to children with severe disabilities, in the expectation that this will result in positive outcomes. The research on MSEs shows that they do not currently meet the standards for an evidence-based practice (Botts, Hershfield, & Christensen-Sandfort, 2009). Carter and Stephenson (2012) surveyed special schools in NSW and reported that, of the responding special schools enrolling students with moderate to severe intellectual disabilities, 53% had installed an MSE or snoezelen room. It seems that schools and teachers, relying primarily on other professionals and equipment suppliers, believe that sensory stimulation is somehow generally beneficial to students with severe disabilities. Teachers and schools are generally unaware of the research base (Carter & Stephenson, 2012; Stephenson & Carter, 2011a, 2011b). The fact that a third of schools reported receiving information about MSEs from support teachers or other Department of Education and Training personnel illustrates the lack of knowledge of the research around MSEs and the lack of policies about educational practices to be used in special schools. Another example shows not only a lack of guidance for selecting evidence-based practices, but also a lack of warning against practices that are clearly ineffective. Stephenson (2009) investigated the advice provided by state education departments and other education bodies on Brain Gym®. Brain Gym® is a perceptual motor program widely promoted to educators. There is little or no evidence that Brain Gym® brings about the changes claimed (Hyatt, 2007; Hyatt, Stephenson, & Carter, 2009; Spaulding, Mostert, & Beam, 2010).
A search of relevant websites found that Tasmania, Victoria, and Northern Territory education departments all provided explicit recommendations to use Brain Gym® with students with disabilities or provided funding support to special education teachers for professional learning related to Brain Gym®. None of the websites searched advised against the use of Brain Gym®; indeed, all states had material endorsing its use in education.
CONCLUSIONS

Generally, then, there is little mention of research or evidence on effective practices in major policy and planning documents from DEEWR, in teacher registration and teacher education program accreditation documents from AITSL, or in state education and special education policy documents. Certainly there is no discussion of standards for determining whether or not a practice has a research base, or of specific practices that may be evidence-based. Research and evidence may be mentioned, but only in the context of rhetoric around quality teaching, effective programs, and improving outcomes for students. There is no evidence that providers of teacher education courses are held accountable for promoting evidence-based practices in those courses, or that awareness of evidence-based practices is regarded as an essential attribute of teachers generally or special educators specifically. In practice, this lack of policy direction plays out in a lack of specific advice about practices, and even in the promotion and endorsement of practices that are disproven or unproven. Although much of the content of this chapter is critical of governments and education authorities, we should take heart from the positive signs. The research that explores the practices that teachers report using in special education settings shows that both research-based and unproven practices are commonly used. Evidence-based practices are out there in special education settings and are being used, despite the lack of system support. The fact that references to research and to evidence are ubiquitous in the rhetoric of policy documents shows an emerging awareness that these things may matter. The focus on collecting outcome data in Australian settings is also a positive sign, even though there is little evidence that the processes used to produce the outcomes are drawn from the research base.
The establishment of the Centre for Education Statistics and Evaluation (NSW Department of Education and Communities, 2012c) is a healthy sign that at least one state education department is moving to consider the evidence. The calls from advocates of quality in special education (AASE, 2012; Carter & Wheldall, 2008; Hempenstall, 2006) may not bear fruit immediately, but they are signs that an evidence-based approach to special education in Australia may be emerging.
REFERENCES

ACT Department of Education and Training. (2007). Teachers and school leaders: Making a difference through evidence-based practice. Canberra, ACT: Author.
ACT Department of Education and Training. (n.d.). Strategic Plan 2010–2013. Retrieved from http://www.det.act.gov.au/__data/assets/pdf_file/0011/109955/DET_Strategic_Plan_2010-2013.pdf
Australian Association of Special Education. (2012). Position paper: Evidence-based practice. Retrieved from http://www.aase.edu.au/phocadownload/National_Position_Papers/position%20paper%20evidence-based%20practice.pdf
Australian Curriculum and Assessment and Reporting Authority. (2009). National report on schooling in Australia 2009. Sydney, NSW: Author. Retrieved from http://www.acara.edu.au/reporting/reporting.html
Australian Institute for Teaching and School Leadership. (2011). National professional standards for teachers. Carlton South, Vic: MCEEDYA.
Australian Institute for Teaching and School Leadership. (2012a). Objectives. Retrieved from http://www.aitsl.edu.au/about-us/objectives.html
Australian Institute for Teaching and School Leadership. (2012b). Teacher education programs in Australia: Guide to the accreditation process. Retrieved from http://www.aitsl.edu.au/verve/_resources/Guide_to_accreditation_process_-_April_2012.pdf
Barker, K., Dobia, B., Mooney, M., Watson, K., Power, A., Ha, M. T., … Denham, A. (2008, November). Positive behaviour for learning: Changing student behaviours for sustainable psychosocial outcomes. Paper presented at the annual conference of the Australian Association for Research in Education, Brisbane.
Barker, K., Yeung, A. S., Dobia, B., & Mooney, M. (2009, November). Positive behavior for learning: Differentiating teachers’ self-efficacy. Paper presented at the annual conference of the Australian Association for Research in Education, Canberra.
Botts, B. H., Hershfield, P. A., & Christensen-Sandfort, R. J. (2009). Snoezelen®: Empirical review of product representation. Focus on Autism and Other Developmental Disorders, 23, 138–147. doi: 10.1177/1088357608318949
Burns, M. K., & Ysseldyke, J. E. (2009). Reported prevalence of evidence-based instructional practices in special education. The Journal of Special Education, 43, 3–11. doi: 10.1177/0022466908315563
Carter, M., & Stephenson, J. (2012). The use of multi-sensory environments in schools servicing children with severe disabilities. Journal of Developmental and Physical Disabilities, 24, 95–109. doi: 10.1007/s10882-011-9257-x
Carter, M., Stephenson, J., & Strnadová, I. (2011). Reported prevalence by Australian special educators of evidence-based instructional practices. Australasian Journal of Special Education, 35, 47–60. doi: 10.1375/ajse.35.1.47
Carter, M., Strnadová, I., & Stephenson, J. (2012). Reported prevalence of evidence-based instructional practices by special educators in the Czech Republic. European Journal of Special Needs Education, 27, 319–335. doi: 10.1080/08856257.2012.691229
Carter, M., & Wheldall, K. (2008). Why can’t a teacher be more like a scientist? Science, pseudoscience and the art of teaching. Australasian Journal of Special Education, 32, 5–21. doi: 10.1080/10300110701845920
Center for Data-Driven Reform in Education. (n.d.). Best evidence encyclopedia. Retrieved from http://www.bestevidence.org
Christofides, R. (2008). Positive behavior for success: Enriching school life in DET Illawarra and South East region. Student Welfare and Personal Development Association of NSW, August, p. 2.
Commonwealth of Australia. (2006). Disability standards for education. Canberra, ACT: Author. Retrieved from http://www.deewr.gov.au/schooling/programs/pages/disabilitystandardsforeducation.aspx
Council of Australian Governments. (2008). National education agreement. Retrieved from http://www.federalfinancialrelations.gov.au/content/national_agreements.aspx
Department of Education, Employment, and Workplace Relations. (2011). Funding for students with disabilities. Retrieved from http://www.deewr.gov.au/Schooling/Programs/Pages/funding_for_students_with_disabilities.aspx
Department of Education, Employment, and Workplace Relations. (n.d.-a). DEEWR strategic plan 2011–2014. Retrieved from http://www.deewr.gov.au/Department/Pages/About.aspx
Department of Education, Employment, and Workplace Relations. (n.d.-b). National Smarter Schools Partnerships. Retrieved from http://www.smarterschools.gov.au/Pages/default.aspx
Department of Education, Employment, and Workplace Relations. (n.d.-c). Standards of evidence for submission for the Teach, Learn, Share: The National Literacy and Numeracy Evidence Base. Retrieved from http://www.teachlearnshare.gov.au/
Gersten, R., Fuchs, L., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149–164.
Hempenstall, K. (2006). What does evidence-based practice in education mean? Australian Journal of Learning Disabilities, 11, 29–38.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Horner, R., Sugai, G., Smolkowski, K., Todd, A., Nakasato, J., & Esperanza, J. (2009). A randomized control trial of school-wide positive behavior support in elementary schools. Journal of Positive Behavior Interventions, 11, 133–144. doi: 10.1177/1098300709332067
Hyatt, K. (2007). Brain Gym: Building stronger brains or wishful thinking. Remedial and Special Education, 28, 117–124.
Hyatt, K., Stephenson, J., & Carter, M. (2009). A review of three controversial educational practices: Perceptual motor programs, sensory integration and tinted lenses. Education and Treatment of Children, 32, 313–342.
Ministerial Council on Education, Employment, Training, and Youth Affairs. (2008). Melbourne declaration on educational goals for young Australians. Carlton, Vic.: Author. Retrieved from www.mceecdya.edu.au/mceecdya/melbourne_declaration,25979.html
Ministerial Council on Education, Employment, Training, and Youth Affairs. (2011a). National safe schools framework. Carlton South, Vic.: Education Services Australia. Retrieved from http://www.deewr.gov.au/Schooling/NationalSafeSchools/Documents/NSSFramework.pdf
Ministerial Council on Education, Employment, Training, and Youth Affairs. (2011b). National safe schools framework: Resource manual. Retrieved from http://www.deewr.gov.au/Schooling/NationalSafeSchools/Pages/nationalsafeschoolsframework.aspx
Northern Territory Department of Education. (2011). Strategic plan 2011–2014: Delivering a smart Territory through quality education and training. Retrieved from http://www.det.nt.gov.au/__data/assets/pdf_file/0013/4126/DET_StrategicPlan.pdf
Northern Territory Department of Education. (2012a). Wellbeing and behaviour. Retrieved from http://www.det.nt.gov.au/students/support-assistance/safety-wellbeing/behaviour
Northern Territory Department of Education. (2012b). Evidence-based literacy and numeracy practices framework. Retrieved from http://www.det.nt.gov.au/teachers-educators/literacy-numeracy/evidence-based-literacy-numeracy-practices-framework
Northern Territory Department of Education. (n.d.-a). Schoolwide positive behaviour support. Retrieved from http://www.det.nt.gov.au/students/support-assistance/safety-wellbeing/behaviour/swpbs
Northern Territory Department of Education. (n.d.-b). Assessment of student competencies teacher handbook: Programming resource. Retrieved from http://www.det.nt.gov.au/__data/assets/pdf_file/0019/991/TeacherHandbook.pdf
NSW Department of Education and Communities. (2011). People with disabilities – Statement of commitment. Retrieved from https://www.det.nsw.edu.au/policies/general_man/general/spec_ed?PD20050243.sthml
NSW Department of Education and Communities. (2012a). 5 year strategic plan 2012–2017. Retrieved from http://www.dec.nsw.gov.au/about-us/plans-reports-and-statistics/strategiesand-plans
NSW Department of Education and Communities. (2012b). Every student, every school: Learning and support. Retrieved from https://www.det.nsw.edu.au/every-student-everyschool
NSW Department of Education and Communities. (2012c). Centre for Education Statistics and Evaluation. Retrieved from https://www.det.nsw.edu.au/media/downloads/about-us/statistics-and-research/centre-for-education/cese_a5_15aug.pdf
NSW Department of Education and Communities. (n.d.). Disability action plan, 2011–2015. Retrieved from https://www.det.nsw.edu.au/media/downloads/strat_direction/strat_plans/disaplan.pdf
O’Neill, S., & Stephenson, J. (2011). Classroom behaviour management preparation in undergraduate primary teacher education in Australia: A web-based investigation. Australian Journal of Teacher Education, 36(10), 35–52. Retrieved from http://ro.ecu.edu.au/ajte/vol36/iss10/3
O’Neill, S., & Stephenson, J. (2012). Classroom behaviour management content in Australian undergraduate primary teaching programmes. Teaching Education, 23, 287–308. doi: 10.1080/10476210.2012.699034
OSEP Center on Positive Behavioral Interventions and Supports. (2012). Home. Retrieved from http://www.pbis.org/default.aspx
Queensland Department of Education, Training, and Employment. (2012a). Inclusive education. Retrieved from http://ppr.det.qld.gov.au/education/learning/Pages/Inclusive-Education.aspx
Queensland Department of Education, Training, and Employment. (2012b). Safe, supportive and disciplined school environment. Retrieved from http://education.qld.gov.au/studentservices/behaviour/resources/ssdsepolicy.html
Queensland Department of Education, Training, and Employment. (n.d.-a). Department of Education, Training and Employment strategic plan 2012–2016. Retrieved from http://deta.qld.gov.au/publications/strategic/pdf/strategic-plan-12-16.pdf
Queensland Department of Education, Training, and Employment. (n.d.-b). Schoolwide positive behavior support. Retrieved from http://education.qld.gov.au/studentservices/behaviour/swpbs/index.html
Queensland Department of Education, Training, and Employment. (n.d.-c). The code of school behaviour. Retrieved from http://education.qld.gov.au/publication/production/reports/pdfs/code-school-behaviour-a4.pdf
South Australia Department of Education and Children’s Services. (2006). Students with disabilities policy. Retrieved from www.decd.sa.gov.au/docs/documents/1/StudentswithDisabilitie-1.pdf
South Australia Department of Education and Children’s Services. (n.d.). The strategic plan 2012–2016 for South Australian public education and care. Retrieved from http://www.decd.sa.gov.au/aboutdept/files/links/DECS2012StrategicPlan.pdf
Spaulding, L. S., Mostert, M. P., & Beam, A. P. (2010). Is Brain Gym® an effective educational intervention? Exceptionality, 18, 18–30. doi: 10.1080/09362830903462508
Stephenson, J. (2009). Best practice? Advice provided to teachers about the use of Brain Gym® in Australian schools. Australian Journal of Education, 53, 109–124.
Stephenson, J., & Carter, M. (2011a). The use of multisensory environments in schools for students with severe disabilities: Perceptions from teachers. Journal of Developmental and Physical Disabilities, 23, 339–357. doi: 10.1007/s10882-011-9232-6
Stephenson, J., & Carter, M. (2011b). Use of multisensory environments in schools for students with severe disabilities: Perceptions from schools. Education and Training in Autism and Developmental Disabilities, 46, 276–290.
Stephenson, J., Carter, M., & Wheldall, K. (2007). Still jumping on the balance beam: Continued use of perceptual motor programs in Australian schools. Australian Journal of Education, 51, 6–18.
Stephenson, J., O’Neill, S., & Carter, M. (2012). Teaching students with disabilities: A web-based examination of preparation of preservice primary school teachers. Australian Journal of Teacher Education, 37, 13–23. Retrieved from http://ro.ecu.edu.au/ajte/vol37/iss5/3
Students with Disabilities Working Group. (2010). Strategies to support the education of students with disabilities in Australian schools. Retrieved from http://www.deewr.gov.au/Schooling/Programs/Pages/MoreSupportforSWD.aspx
Tasmanian Department of Education. (2010). Inclusion of students with disabilities in regular schools. Retrieved from http://www.education.tas.gov.au/school/health/disabilities/supportmaterials/deptresources/inclusion
Tasmanian Department of Education. (2011). Research. Retrieved from http://www.education.tas.gov.au/school/health/positivebehaviour/research
Tasmanian Department of Education. (2012a). Schools. Retrieved from http://www.education.tas.gov.au/school/health/positivebehaviour
Tasmanian Department of Education. (2012b). Raising the bar closing the gap. Retrieved from http://www.education.tas.gov.au/?a=294775
Tasmanian Department of Education. (n.d.). Department of Education Strategic Plan, 2012–2015. Retrieved from http://www.education.tas.gov.au/documentcentre/Documents/DoEStrategic-Plan-2012-2015.pdf
Victorian Deaf Education Institute. (2012). Research. Retrieved from http://www.education.vic.gov.au/about/directions/vdei/research.htm
Victorian Department of Education and Early Childhood Development. (2008). Corporate plan 2009–2011. Melbourne, Vic.: Author.
Victorian Department of Education and Early Childhood Development. (2011). Autism friendly learning. Retrieved from http://www.education.vic.gov.au/about/directions/autism/default.htm
Victorian Department of Education and Early Childhood Development. (n.d.). Strategies for schools. Retrieved from http://www.education.vic.gov.au/healthwellbeing/respectfulsafe/strategies
Western Australia Department of Education. (2012). Excellence and equity: Strategic plan for WA public schools 2012–2015. Retrieved from http://det.wa.edu.au/policies/detcms/policy-planning-and-accountability/policies-framework/strategic-documents/strategicplan-for-wa-public-schools-2012-2015.en?oid=com.arsdigita.cms.contenttypes.FileStorageItem-id-12793162
What Works Clearinghouse. (2011). What Works Clearinghouse procedures and standards handbook (Version 2.1). Retrieved from http://ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_procedures_v2_1_standards_handbook.pdf
What Works Clearinghouse. (n.d.). Retrieved from http://ies.ed.gov/ncee/wwc/