The Impact of International Achievement Studies on National Education Policymaking [1 ed.] 9780857244505, 9780857244499

The chapters in this volume discuss the uses of international achievement study results as a tool for national progress as well as an obstacle, provide recommendations for ways that international achievement data can be used in real-world policymaking situations, and discuss what the future of international achievement studies holds.


English, 381 pages, 2010


THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES ON NATIONAL EDUCATION POLICYMAKING

INTERNATIONAL PERSPECTIVES ON EDUCATION AND SOCIETY

Series Editor from Volume 1: Abraham Yogev
Volume 1: International Perspectives on Education and Society
Volume 2: Schooling and Status Attainment: Social Origins and Institutional Determinants
Volume 3: Education and Social Change
Volume 4: Educational Reform in International Perspective

Series Editor from Volume 5: David P. Baker
Volume 5: New Paradigms and Recurring Paradoxes in Education for Citizenship: An International Comparison
Volume 6: Global Trends in Educational Policy
Volume 7: The Impact of Comparative Education Research on Institutional Theory
Volume 8: Education For All: Global Promises, National Challenges
Volume 9: The Worldwide Transformation of Higher Education
Volume 10: Gender, Equality and Education from International and Comparative Perspectives

Series Editor from Volume 11: Alexander W. Wiseman
Volume 11: Educational Leadership: Global Contexts and International Comparisons
Volume 12: International Educational Governance

INTERNATIONAL PERSPECTIVES ON EDUCATION AND SOCIETY VOLUME 13

THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES ON NATIONAL EDUCATION POLICYMAKING

EDITED BY

ALEXANDER W. WISEMAN
Lehigh University

United Kingdom – North America – Japan – India – Malaysia – China

Emerald Group Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK

First edition 2010

Copyright © 2010 Emerald Group Publishing Limited

Reprints and permission service
Contact: [email protected]

No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. No responsibility is accepted for the accuracy of information contained in the text, illustrations or advertisements. The opinions expressed in these chapters are not necessarily those of the Editor or the publisher.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-85724-449-9
ISSN: 1479-3679 (Series)

Emerald Group Publishing Limited, Howard House: the Environmental Management System has been certified by ISOQAR to ISO 14001:2004 standards. Awarded in recognition of Emerald’s production department’s adherence to quality systems and processes when preparing scholarly journals for print.

CONTENTS

LIST OF CONTRIBUTORS ix

INTRODUCTION: THE ADVANTAGES AND DISADVANTAGES OF NATIONAL EDUCATION POLICYMAKING INFORMED BY INTERNATIONAL ACHIEVEMENT STUDIES
Alexander W. Wiseman xi

PART I: OPPORTUNITIES AND LIMITATIONS OF INTERNATIONAL ACHIEVEMENT STUDIES

MONITORING THE QUALITY OF EDUCATION: EXPLORATION OF CONCEPT, METHODOLOGY, AND THE LINK BETWEEN RESEARCH AND POLICY
Mioko Saito and Frank van Cappelle 3

WHY PARTICIPATE? CROSS-NATIONAL ASSESSMENTS AND FOREIGN AID TO EDUCATION
Rie Kijima 35

DOES INEQUALITY INFLUENCE THE IMPACT OF SCHOOLS ON STUDENT MATHEMATICS ACHIEVEMENT? A COMPARISON OF NINE HIGH-, MEDIUM-, AND LOW-INEQUALITY COUNTRIES
Amita Chudgar and Thomas F. Luschei 63

NEW DIRECTIONS IN NATIONAL EDUCATION POLICYMAKING: STUDENT CAREER PLANS IN INTERNATIONAL ACHIEVEMENT STUDIES
Joanna Sikora and Lawrence J. Saha 85

ANALYZING TURKEY’S DATA FROM TIMSS 2007 TO INVESTIGATE REGIONAL DISPARITIES IN EIGHTH GRADE SCIENCE ACHIEVEMENT
Ebru Erberber 119

PART II: COMPARATIVE CONTRIBUTIONS OF INTERNATIONAL ACHIEVEMENT STUDIES TO EDUCATIONAL POLICYMAKING

THE IMPACT OF STANDARDIZED TESTING ON EDUCATION QUALITY IN KYRGYZSTAN: THE CASE OF THE PROGRAM FOR INTERNATIONAL STUDENT ASSESSMENT (PISA) 2006
Duishon Shamatov and Keneshbek Sainazarov 145

FROM EQUITY OF ACCESS TO INTERNATIONAL QUALITY STANDARDS FOR CURBING CORRUPTION IN SECONDARY AND HIGHER EDUCATION AND CLOSING ACHIEVEMENT GAPS IN POST-SOVIET COUNTRIES
Mariam Orkodashvili 181

A COMPARATIVE ANALYSIS OF DISCOURSES ON EQUITY IN EDUCATION IN THE OECD AND NORWAY
Cecilie Rønning Haugen 207

THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES ON NATIONAL EDUCATION POLICYMAKING: THE CASE OF SLOVENIA – HOW MANY WATCHES DO WE NEED?
Eva Klemencic 239

FINLAND, PISA, AND THE IMPLICATIONS OF INTERNATIONAL ACHIEVEMENT STUDIES ON EDUCATION POLICY
Jennifer H. Chung 267

PART III: CRITICAL FRAMEWORKS FOR UNDERSTANDING THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES

WHY THE FIREWORKS?: THEORETICAL PERSPECTIVES ON THE EXPLOSION IN INTERNATIONAL ASSESSMENTS
Jennifer DeBoer 297

STANDARDIZED TESTS IN AN ERA OF INTERNATIONAL COMPETITION AND ACCOUNTABILITY
M. Fernanda Pineda 331

INDEX 355

LIST OF CONTRIBUTORS

Amita Chudgar – Michigan State University, East Lansing, MI, USA
Jennifer H. Chung – Liverpool Hope University, Liverpool, UK
Jennifer DeBoer – Peabody College, Vanderbilt University, Nashville, TN, USA
Ebru Erberber – American Institutes for Research, Washington, DC, USA
Cecilie Rønning Haugen – Norwegian University of Science and Technology, Trondheim, Norway
Rie Kijima – Stanford University, Stanford, CA, USA
Eva Klemencic – Educational Research Institute, Ljubljana, Slovenia
Thomas F. Luschei – Claremont Graduate University, Claremont, CA, USA
Mariam Orkodashvili – Peabody College, Vanderbilt University, Nashville, TN, USA
M. Fernanda Pineda – Florida International University, Miami, FL, USA
Lawrence J. Saha – Australian National University, Canberra, ACT, Australia
Keneshbek Sainazarov – USAID/Creative Associates International, Inc., Bishkek, Kyrgyz Republic
Mioko Saito – UNESCO-IIEP, Paris, France
Duishon Shamatov – University of Central Asia, Bishkek, Kyrgyz Republic
Joanna Sikora – Australian National University, Canberra, ACT, Australia
Frank van Cappelle – Melbourne Graduate School of Education, University of Melbourne, VIC, Australia
Alexander W. Wiseman – Lehigh University, Bethlehem, PA, USA

INTRODUCTION: THE ADVANTAGES AND DISADVANTAGES OF NATIONAL EDUCATION POLICYMAKING INFORMED BY INTERNATIONAL ACHIEVEMENT STUDIES

Alexander W. Wiseman

This volume investigating the impact of international achievement studies on national education policymaking was born out of an invited panel held at the Comparative and International Education Society’s (CIES) annual meeting in 2009. It was advertised that this panel would discuss both the uses and abuses of Programme for International Student Assessment (PISA) and Trends in International Mathematics and Science Study (TIMSS) results, provide recommendations for ways that international achievement data can be used in real-world policymaking situations, and discuss what the future of international achievement studies holds. The five panelists were:

• Hans Wagemaker (Executive Director, IEA Secretariat);
• Clementina Acedo (Director, International Bureau of Education, UNESCO);
• Maria Teresa Tatto (Associate Professor, Michigan State University; Principal Investigator, IEA Teacher Education Study in Mathematics);
• Henry Levin (Professor, Teachers College, Columbia University); and
• David P. Baker (Professor, Pennsylvania State University).


All of these panelists were invited based on their extensive experience with international achievement studies and national education policymaking, whether from the perspective of policymaking, study administration, item development, data analysis, or policy interpretation (or all of the above). The panel was one of the liveliest sessions of the CIES annual meeting and was followed by a large number of questions and comments from the audience, both during the formal question-and-answer time and long after the session officially ended.

It is interesting to consider why there was, and continues to be, such a range of perspectives and approaches to both international achievement studies and national education policymaking. Each topic alone is controversial to some degree, but when combined, the result is often explosive. This volume was developed as a way to carry the discussion further and address some of the lingering questions and controversies. Continuing this discussion begins with several basic premises, summarized below.

THE WORLDWIDE EXPANSION OF INTERNATIONAL ACHIEVEMENT STUDIES

The rapid expansion of national participation in international achievement studies has been a hallmark of educational accountability and planning in countries around the world for the past 25 years. Since the IEA’s first international studies of mathematics and science achievement in the late 1960s, the availability and use of international achievement studies for national education policy has exploded (DeBoer, “Why the Fireworks?: Theoretical Perspectives on the Explosion in International Assessments”; Smith & Baker, 2001; Wiseman & Baker, 2005). The most widely adopted studies are now administered on regular cycles and include participating countries from every region and level of development around the world.

Although there are many contextual and contributing factors, one possible explanation for the rapid expansion of international education achievement studies since the 1960s is the shift from level of educational attainment to level of achievement as a fundamental indicator of national and systemic educational legitimacy. Unfortunately, this shift has not often been theoretically demonstrated beyond the macro-sociology of educational expansion, nor has it been sufficiently documented through empirical investigation. Additionally, many among the political, social, and economic communities in nations around the world follow cross-national comparisons as if they were valid predictors of economic productivity and
social welfare (Altbach, 1997; Anderson, 1979; Ramirez, Luo, Schofer, & Meyer, 2006; Rubinson & Fuller, 1992). Some bring an evidence-based approach to these expectations for international achievement studies by using data from the background questionnaires of international assessments to investigate the economic plans and potential of students in participating countries (e.g., Sikora & Saha, “New Directions in National Education Policymaking: Student Career Plans in International Achievement Studies”; Wiseman & Alromi, 2007).

Another explanation for the expansion of international achievement studies, which is of both theoretical and policy interest, is the saturation of schools with students resulting from the global adoption of a mass schooling model. This saturation contributes to the consequent tendency to use student performance as an indicator of level of development and sociopolitical legitimacy. If this is true, then high and pervasive enrolment throughout a nation’s school system is no longer enough to warrant status as an educationally legitimate and competitive nation. Instead, the advent of a global mass-educated community makes it symbolically important for national educational systems to have high-performing students as well as universal enrolment. Consequently, for better or for worse, national means of student achievement have become tools for building and maintaining national legitimacy.

These international studies, which initially focused on math and science achievement, now include cross-national investigations of multiple subject areas, teachers and teaching, and a developing focus on higher education.
This information has been used to make decisions about resource distribution both within and across national education systems, but some of the most productive uses of international achievement study data by policymakers have been to create agendas for innovation and to seek evidence of quality and equity in national educational systems (Gorard & Smith, 2004). The latter is in fact the aim of many of the chapters in this volume (Chudgar & Luschei, “Does Inequality Influence the Impact of Schools on Student Mathematics Achievement? A Comparison of Nine High-, Medium-, and Low-Inequality Countries”; Erberber, “Analyzing Turkey’s Data from TIMSS 2007 to Investigate Regional Disparities in Eighth Grade Science Achievement”; Haugen, “A Comparative Analysis of Discourses on Equity in Education in the OECD and Norway”; Saito & van Cappelle, “Monitoring the Quality of Education: Exploration of Concept, Methodology, and the Link between Research and Policy”; Shamatov & Sainazarov, “The Impact of Standardized Testing on Education Quality in Kyrgyzstan: The Case of the Program for International Student Assessment (PISA) 2006”). In short, participation in and use of data from international
achievement studies have become taken-for-granted components of the landscape of national educational policymaking. The ubiquity of international testing is fascinating in itself because it brings a cumbersome, contested, and often-controversial process into what is often a locally-controlled and locally-contested arena. The chapters in this volume, therefore, seek to highlight a process that has both polarized and promoted the potential of policymaking within national education systems worldwide. To do this, the chapters that follow (1) discuss the uses of international achievement study results as a tool for national progress as well as an obstacle, (2) provide recommendations for ways that international achievement data can be used in real-world policymaking situations, and (3) discuss what the future of international achievement studies holds.

DEFINITIONS AND DILEMMAS

International testing is a broad concept, but it is generally limited to the kinds of large-scale assessments and surveys that are administered in multiple countries and provide both within- and between-country comparative information. It is important to distinguish international achievement studies from “league tables,” a somewhat outdated term often used by critics of international testing to describe the published rankings of average student scores on international achievement tests by country (e.g., Broadfoot, 2004; Steiner-Khamsi, 2003). The comparison of national systems of education based simply on ranking student achievement means has also been referred to as horse-ranking and derided as a product of laziness or ignorance on the part of educational researchers and policymakers (Wiseman & Baker, 2005). However, cross-national comparison of student achievement as a legitimate method of estimating nations’ levels of development and productivity is a decidedly more complex process than mere ranking (Baker, 1997; Wolhuter, 1997), which is one reason why secondary analysis of international achievement study data is especially important for developing informed educational policies and decisions based on international evidence.

Several brief but informative histories of international achievement studies exist, including the historical summaries included in most of the chapters in this volume. The most well-known and perhaps longest-running international testing organization is the International Association for the Evaluation of Educational Achievement, known as the IEA. More recently, the Organisation for Economic Co-operation and Development (OECD) has become involved in international educational testing at the secondary level.


These two organizations oversee the administration and basic analysis of, respectively, the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). And, as the two highest-profile organizations and the two most broadly recognized international achievement studies, the IEA, OECD, TIMSS, and PISA have all become synonymous with what some believe is right and some believe is wrong with both international achievement studies and national education policymaking.

CRITIQUES AND MISCONCEPTIONS ABOUT INTERNATIONAL TESTING

Just as there are exaggerations about the benefits of international testing, there are also critiques and misconceptions about the development, administration, analysis, and interpretation of international tests. These contribute to the debates surrounding the impact of international testing on national educational policymaking to a degree that is arguably unparalleled compared to other influences on national educational policy. Some have criticized international testing for its propensity to aggregate and therefore sometimes mask individual or regional-level variation, which has sometimes been referred to as “reductionism” (Wrigley, 2004). Some see international testing as a tool used by communities, organizations, and individuals with a strong power base to override the interests of the marginalized or disadvantaged, or to merely push through the agendas of large, transnational organizations without regard for local, regional, or otherwise contextualized concerns (Prais, 2007; Steiner-Khamsi, 2004). Others are skeptical of international testing because of the inherent problems in the sampling, coverage, administration, and interpretation of such large-scale data (Prais, 2003, 2007; Roth et al., 2006). These are all valid criticisms of international achievement studies that must not be ignored. The danger exists, however, of throwing away relevant and needed information coming from international studies rather than constructively training researchers and policymakers to use the results appropriately, improving not only the way the studies are used but also the way data are collected and measured (Koretz, 2009).
The extremely isolated cases that reside at the individual level are valuable and necessary pieces of data for understanding the particular disadvantages and needs of, for example, girls from marginalized racial or ethnic groups in otherwise well-functioning educational systems (Baker & Wiseman, 2009; Lewis & Lockheed, 2007).


The point is that if unique cultural and contextual data are lost, the process of planning, implementing, analyzing, and interpreting international achievement studies is compromised, even though a lost case may be extremely isolated and not representative of the larger sample. The result is a potential gap in the overall understanding of a particular situation and of the impact international achievement studies have on national education policy (Fensham, 2007). In the same way, valuable information is lost if only the most individualized data are recognized or used to make decisions. The argument for mixed-methods research has made the case for examining both the global and the individual in international achievement study data. For example, the video studies from the 1995 and 1999 TIMSS were major steps in guiding and supplementing the quantitative data from international achievement studies (Hiebert et al., 2005), even though there is rich evidence that single-item quantitative data can also be quite informative regarding students’ contextualized knowledge and understanding (Olsen, 2005).

Beyond the analysis of the data from international achievement studies alone, international assessments of educational achievement have become vital instruments in the development and evaluation of national education policy, in spite of the dangers (Wiseman & Baker, 2005). The ability to participate in an international community of educational systems assessing students’ and systems’ performance, coupled with the ability to actually compare the one-to-one performance, characteristics, and expectations of students, teachers, school leaders, and curriculum specialists, ensures that international achievement studies continue to have a great impact on what educators, researchers, and policymakers know about teaching, learning, and curriculum in their countries and their comparison groups (Cai & Lester, 2007).
But, how does the general public, parent, or community member get information about international achievement studies and their results?

THE ROLE OF THE MEDIA

An under-investigated factor contributing to the broad and strong impact of international achievement studies on national education policy comes not from the research reports and results disseminated to policymakers, school administrators, and educational researchers, but from widespread, publicly disseminated media reports (Stack, 2006). In particular, reports on the results of international tests reach mainstream media outlets such as national and international newspapers, television news and talk shows, and
Internet news and blogs (Koretz, 2009). The ability of media communication to be both instant and informative to the widest possible community of parents, community members, business owners, politicians, and other social, political, and economic leaders often shapes the perception and use of international test results more than any research or policy agenda from a university-based research group or national ministry of education.

To manage this dissemination of information in mainstream public outlets, the IEA and OECD, for example, create press releases, prepare highlight brochures, and host public announcements about the initial results of these large-scale international tests. They enlist the assistance of experts in the field to present the information from each round of testing within the context of scholarship and policy relevance as much as possible. But in countries where students perform on average below the general public’s expectations, there is often a public reaction to the results of these international tests in spite of the media marketing that may precede the release of results. The United States and Germany, for example, have had their reactions documented in the media and then through public debate and educational reform more than many other countries. In particular, Germany’s PISA Schock phenomenon demonstrates how influential the media reaction to Germany’s PISA results can be on actual educational policy and reform (Ertl, 2006).

Often, the impetus behind media reports on the results of international assessments like TIMSS and PISA has more to do with reporting on the economic, political, or social competitiveness of a particular country or region than with the actual educational value or evidence resulting from the study (Pineda, “Standardized Tests in an Era of International Competition and Accountability”; Stack, 2006).
It can also represent a shift in blame or responsibility for wider social, political, and economic problems to a nation’s educational system.

THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES ON NATIONAL EDUCATION POLICYMAKING

As the chapters in this volume demonstrate, there are both advantages and disadvantages for national education systems participating in international assessments like TIMSS or PISA. The disadvantages fall into several categories: (1) “horse-ranking,” (2) emphasis on common curriculum rather
than contextualized teaching and learning (and vice versa), (3) the possibility that the assessment instrument is biased in favor of more influential or hegemonic national participants, (4) the distribution and impact of resource and opportunity costs, and (5) limitations in the generalizability of results relative to the expense and effort of participating in these tests (Chung, “Finland, PISA, and the Implications of International Achievement Studies on Education Policy”; Kijima, “Why Participate? Cross-national Assessments and Foreign Aid to Education”; Klemencic, “The Impact of International Achievement Studies on National Education Policymaking: The Case of Slovenia – How Many Watches Do We Need?”; Naumann, 2005; Prais, 2003; Reddy, 2006).

The advantages of participating in these international assessments also fall into several categories: international achievement studies (1) catalyze widespread and often public debate about education, (2) provide valid evidence for decision-making, (3) allow for informed benchmarking, (4) build capacity in national education systems for systematic and widespread assessment, and (5) provide transparency in education where none may otherwise exist (Adams, 2003; Chung, “Finland, PISA, and the Implications of International Achievement Studies on Education Policy”; Naumann, 2005; Orkodashvili, “From Equity of Access to International Quality Standards for Curbing Corruption and Closing Achievement Gaps in Post-Soviet Countries”; Reddy, 2006).

Combining both praise for and warnings about international achievement studies, the chapters in the first part of this volume address the ways that educational quality, educational achievement, opportunity structures and expectations, and national participation in international achievement studies intersect to form a foundation for what these studies do and how they are used for national education policymaking.
As Saito and van Cappelle’s chapter asserts about monitoring the quality of education using international achievement studies, these studies create the opportunity for evidence-based decision-making among policymakers, which leads to improved educational quality in participating countries. Chudgar and Luschei’s chapter provides an evidence-based window into the ways that variation in inequality associates with or influences student achievement. Erberber’s chapter both monitors the quality of education and tracks the ways that inequality within one national education system (Turkey) associates with educational achievement. Sikora and Saha’s chapter outlines the ways that educational resources align with career optimism and ambition among youths, using students’ responses to questions about their occupational expectations collected as part of a background questionnaire in an international achievement study. And, as Kijima’s chapter suggests,
the likelihood of a country receiving aid is associated with participation in international achievement studies as well.

While the first section of chapters in this volume emphasizes the ways that the results from international achievement studies can themselves be used as a tool for national progress, the second section discusses ways that international testing contributes to real-world policymaking situations. For example, Shamatov and Sainazarov’s chapter discusses how participation in an international achievement study provided Kyrgyzstan with previously unavailable information about the condition of education and the state’s ability to conduct systematic analysis of the data to inform national education policymaking. Orkodashvili’s chapter finds that in post-Soviet countries, the impact of international achievement studies is a blend of globalization and contextualization. Haugen’s chapter looks at the ways that one transnational organization (the OECD) sets educational agendas for member states like Norway, partly through the implementation and interpretation of OECD-sponsored international achievement studies. Klemencic’s chapter shows some of the ways that international achievement studies both directly and indirectly influence national education policy, using Slovenia as the focus. And Chung’s chapter rounds out the second section by looking at the ways that one high-scoring nation (Finland) has influenced national education policymaking in other countries purely because of its high average performance.

The final two chapters contribute their own interpretations of the promises of and problems with international achievement studies by taking a more theoretical approach to a discussion of what the past, present, and future of international achievement studies hold.
DeBoer’s and Pineda’s chapters both compare and contrast several frameworks for understanding why and how international achievement studies exist, what they do, and what their impact on national education policymaking is. In particular, these chapters take a more critical approach by asking “whose knowledge and whose policies are of most worth?” This is an important question to ask, and it is fitting that the final chapters in the volume prioritize it; however, there is another question that should be asked simultaneously: How does world culture influence the impact that international achievement studies have on national education policy?

International achievement studies are the result of cross-national collaboration, which can admittedly take on many different forms. This collaboration provides a sense of legitimacy for international achievement studies (Wiseman & Baker, 2005), which allows these studies to be adopted by and participated in by countries from all over the world – even
when it may seem contrary to a country’s best interests to do so (e.g., South Africa and other consistently low-achieving countries). In fact, these studies often mirror the development of the educational system in various countries, because the educational systems of many countries were shaped into their current form by input from the global community and further influenced by the emergence of a global education governance structure (Amos, 2010). Legitimizing transnational actors frequently frame national, regional, and local policy options based on the comparative results from international achievement studies, which can lead to a “global” perspective becoming the chief priority in national education policymaking in spite of regional and local needs.

In spite of the problems that may arise from international testing, there are still many educational researchers and decision-makers who know how to appropriately use the findings and data from international achievement studies to inform policy. What the chapters in this volume demonstrate, when taken as a whole, is that decision-makers and community members need valid and reliable information in order to lead schools and communities. International comparisons resulting from secondary analyses of large-scale cross-national assessments like TIMSS and PISA provide valuable information that can inform both policy and practice (Koretz, 2009; Sammons, 2006), even though they should be interpreted cautiously and may be only part of the puzzle.

REFERENCES

Adams, R. J. (2003). Response to 'Cautions on OECD's recent educational survey (PISA)'. Oxford Review of Education, 29(3), 377–389.
Altbach, P. G. (1997). The coming crisis in international education in the United States. International Higher Education, 8(Summer), 4–5.
Amos, S. K. (Ed.) (2010). International education governance (Vol. 12). Bingley, UK: Emerald Group Publishing.
Anderson, C. A. (1979). Societal characteristics within the school: Inferences from the International Study of Educational Achievement. Comparative Education Review (October), 408–421.
Baker, D. P. (1997). Surviving TIMSS: Or, everything you blissfully forgot about international comparisons. Phi Delta Kappan (December), 295–300.
Baker, D. P., & Wiseman, A. W. (Eds.) (2009). Gender, equality and education from international and comparative perspectives (Vol. 10). Bingley, UK: Emerald Group Publishing.
Broadfoot, P. (Ed.) (2004). Editorial: Lies, damn lies and statistics. Comparative Education, 40(1), 3–6.
Cai, J., & Lester, F. (2007). Contributions from cross-national comparative studies to the internationalization of mathematics education: Studies of Chinese and U.S. classrooms. In: B. Atweh, A. C. Barton, M. C. Borba, N. Gough, C. Keitel, C. Vistro-Yu & R. Vithal (Eds.), Internationalisation and globalisation in mathematics and science education (pp. 151–172). Dordrecht, The Netherlands: Springer.
Ertl, H. (2006). Educational standards and the changing discourse on education: The reception and consequences of the PISA study in Germany. Oxford Review of Education, 32(5), 619–634.
Fensham, P. J. (2007). Context or culture: Can TIMSS and PISA teach us about what determines educational achievement in science? In: B. Atweh, A. C. Barton, M. C. Borba, N. Gough, C. Keitel, C. Vistro-Yu & R. Vithal (Eds.), Internationalisation and globalisation in mathematics and science education (pp. 151–172). Dordrecht, The Netherlands: Springer.
Gorard, S., & Smith, E. (2004). An international comparison of equity in education systems. Comparative Education, 40(1), 15–28.
Hiebert, J., Stigler, J., Jacobs, J. K., Givvin, K. B., Garnier, H., & Smith, M. (2005). Mathematics teaching in the United States today (and tomorrow): Results from the TIMSS 1999 video study. Educational Evaluation and Policy Analysis, 27(2), 111–132.
Koretz, D. (2009). How do American students measure up? Making sense of international comparisons. The Future of Children, 19(1), 37–51.
Lewis, M., & Lockheed, M. (2007). Inexcusable absence: Why 60 million girls still aren't in school and what to do about it. Washington, DC: Center for Global Development.
Naumann, J. (2005). TIMSS, PISA, PIRLS and low educational achievement in world society. Prospects, XXXV(2), 229–248.
Olsen, R. V. (2005). Achievement tests from an item perspective: An exploration of single item data from the PISA and TIMSS studies, and how such data can inform us about students' knowledge and thinking in science. Oslo: University of Oslo.
Prais, S. J. (2003). Cautions on OECD's recent educational survey (PISA). Oxford Review of Education, 29(2), 139–163.
Prais, S. J. (2007). Two recent (2003) international surveys of schooling attainments in mathematics: England's problems. Oxford Review of Education, 33(1), 33–46.
Ramirez, F. O., Luo, X., Schofer, E., & Meyer, J. W. (2006). Student achievement and national economic growth. American Journal of Education, 113(1), 1–30.
Reddy, V. (2006). Mathematics and science achievement at South African schools in TIMSS 2003. Cape Town: Human Sciences Research Council Press.
Roth, K. J., Druker, S. L., Garnier, H. E., Lemmens, M., Chen, C., & Kawanaka, T. (2006). Teaching science in five countries: Results from the TIMSS 1999 video study (NCES 2006-011). Washington, DC: U.S. Department of Education, National Center for Education Statistics, U.S. Government Printing Office.
Rubinson, R., & Fuller, B. (1992). Specifying the effects of education on national economic growth. In: B. Fuller & R. Rubinson (Eds.), The political construction of education (pp. 101–115). New York: Praeger.
Sammons, P. (2006). The contribution of international studies on educational effectiveness: Current and future directions. Educational Research and Evaluation, 12(6), 583–593.
Smith, T. M., & Baker, D. P. (2001). Worldwide growth and institutionalization of statistical indicators for education policy-making. Peabody Journal of Education, 76(3/4), 141–152.
Stack, M. (2006). Testing, testing, read all about it: Canadian press coverage of the PISA results. Canadian Journal of Education, 29(1), 49–69.
Steiner-Khamsi, G. (2003). The politics of league tables. Journal of Social Science Education, 1 (online journal SOWI).
Steiner-Khamsi, G. (Ed.) (2004). The global politics of educational borrowing and lending. New York: Teachers College Press.
Wiseman, A. W., & Alromi, N. H. (2007). The employability imperative: Schooling for work as a national project. Hauppauge, NY: Nova Science Publishers.
Wiseman, A. W., & Baker, D. P. (2005). The worldwide explosion of internationalized education policy. In: D. P. Baker & A. W. Wiseman (Eds.), Global trends in educational policy (Vol. 6, pp. 1–21). London: Elsevier Science, Ltd.
Wolhuter, C. C. (1997). Classification of national education systems: A multivariate approach. Comparative Education Review, 41(2), 161–177.
Wrigley, T. (2004). 'School effectiveness': The problem of reductionism. British Educational Research Journal, 30(2), 227–244.

PART I

OPPORTUNITIES AND LIMITATIONS OF INTERNATIONAL ACHIEVEMENT STUDIES

MONITORING THE QUALITY OF EDUCATION: EXPLORATION OF CONCEPT, METHODOLOGY, AND THE LINK BETWEEN RESEARCH AND POLICY

Mioko Saito and Frank van Cappelle

ABSTRACT

The main aim of this chapter is to argue that a sound conceptualization and methodology for measuring the quality of education is a necessary, but not a sufficient, condition for establishing a link between research and policy to improve the quality of education. The following elements have been provided to support this argument: (1) a literature review of the different concepts and methods of measuring the quality of education that are in place internationally, as well as their importance; (2) a UNESCO desk review of 35 developing countries to compare the way educational quality is featured and monitored in National Education Sector Plans (NESPs); and (3) case studies of two developing countries focusing on the implementation of research to measure the quality of education, its impact, and the link between research and policy. It was found that the quality of education is recognized as an important factor in most NESPs, but it has not been defined, measured, or interpreted in a consistent way. Furthermore, while sophisticated and innovative methodologies have already been developed to measure the quality of education, the processes of linking research results with policy still seem to be at a developmental stage. This is a challenge not only for researchers and policy makers, but also for development partners to ensure that (i) policy and planning become more firmly grounded in objectively verifiable scientific evidence and (ii) through its impact on policy and planning, research leads to improvements in the quality of education.

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 3–34
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013004

INTRODUCTION

At the World Conference on Education for All (EFA) in Jomtien in 1990, an expanded vision for meeting learning needs was outlined which included the requirement to improve and assess learning achievement (UNESCO, 1990). Educational quality was further emphasized at the World Education Forum in Dakar in 2000, specifically in goal no. 6, which states: "improving all aspects of the quality of education and ensuring excellence of all so that recognized and measurable learning outcomes are achieved by all, especially in literacy, numeracy and essential life skills" (UNESCO, 2000). Since Jomtien, many countries have significantly improved participation rates in education. However, these advances have not necessarily led to corresponding improvements in the quality of education. For those countries and societies where participation in education is not assured, achieving "quality education for all" is even more of a challenge. Quality has also been recognized as an important element in educational policy documents by the World Bank (1995, 1999). However, as can be seen in the report by the World Bank Independent Evaluation Group (2006), the Bank's basic education projects in developing countries since 1990 have been criticized for placing too much emphasis on increasing participation rates. The report indicates that there is a lack of focus on the learning achievement of children. Moreover, it suggests that developing countries and their partners need not only to emphasize the achievement of the Millennium Development Goals (MDGs) but also to review their Fast Track Initiative (FTI) national plans and work toward improving student learning outcomes. The main aim of this chapter is to argue that a sound conceptualization and methodology for measuring the quality of education is a necessary,


but not a sufficient, condition for establishing a link between research and policy to improve the quality of education. To do this, the chapter provides: (1) a brief review of the different concepts and methods of measuring the quality of education that are in place internationally, as well as their importance; (2) a comparison of the way educational quality is featured and monitored in National Education Sector Plans (NESPs) of developing countries, including EFA–FTI countries; and (3) two detailed country cases in which the link between results from quality assessments and policy planning is examined.

CONCEPT OF QUALITY OF EDUCATION

Since the late 1960s, the UNESCO International Institute for Educational Planning (IIEP) has hosted a series of international conferences on the theme of educational quality. As pointed out by Ross and Mählck (1990), these conferences reflected different interpretations of the concept of the quality of education. The 1960s was a time of "philosophical debate" on how to define the quality of education (Beeby, 1969; Peters, 1969). Beeby (1969) summarized the quality of education in terms of "qualitative changes" (as opposed to "quantitative changes") that would have two elements: (i) the learning environment (what is taught and how) and (ii) student flows (who are taught where). During the 1970s, "pragmatic approaches to planning" the quality of education dominated (Adams, 1978). During this period the vision of "qualitative" educational planning was introduced, and the discussion focused on which measures are effective for improving learning outcomes. It was also during this period that economists started to pay attention to learning achievement, observing that education systems can provide pathways to economic advancement (OECD, 1989; Ross, Paviot, & Genevois, 2006b). As seen in the literature (e.g., Hanushek & Wößmann, 2007; World Bank Independent Evaluation Group, 2007), good quality education in terms of learning outcomes in literacy, numeracy, and life skills can contribute to increased work productivity, higher individual income levels, economic and social growth, improvements in health, and the generation of innovative ideas. The conception of what constitutes the quality of education has been continually evolving, and the late 1980s to 2000s saw the establishment of more comprehensive interpretations. The World Declaration on Education for All, adopted by the Jomtien World Conference on Education for All in


1990, noted the importance of educational quality and specifically the need to focus on learning acquisition and outcomes. The concept of the quality of education was expanded a decade later in the Dakar Framework for Action, adopted at the World Education Forum in Senegal. Quality was now recognized as being of fundamental importance, and specific requirements of successful education programmes were listed, including well-trained teachers, adequate facilities and learning materials, a relevant curriculum, a good learning environment, and a clear definition and accurate assessment of learning outcomes. UNESCO's Task Force on Education for the Twenty-first Century set out the following four pillars of learning: learning to know (laying the foundations of lifelong learning); learning to do (acquiring the competence and skills required in the world of work); learning to live together (fostering mutual understanding and appreciation of our growing interdependence); and learning to be (enabling individuals to reach their full potential) (Delors et al., 1996; Amagi, 1996). Particular emphasis was placed on the need to learn how to live together in harmony in the "global village," with the other three pillars providing the bases of this fourth pillar. UNESCO and UNICEF promote a high-quality education as a fundamental human right. The Convention on the Rights of the Child, adopted at the United Nations General Assembly in 1989, includes a number of commitments with respect to a child's education, including the development of the child's mental and physical abilities to his or her fullest potential and the development of respect for human rights, fundamental freedoms, and the natural environment.
From a human rights perspective, the concept of the quality of education is not just a list of elements but rather a "web of commitments" in which "education is placed and understood in terms of a larger context that reflects learning in relation to the learner as an individual, a family and community member, a citizen and as part of a world society" (Pigozzi, 2006, p. 42). Based on this conceptualization of the quality of education, Pigozzi (2006) describes a framework in which the various elements affecting educational quality are positioned at two levels: those that affect the learner and those that affect the education system supporting the learning experience. At the level of the learner, an important element in this model is what the learner brings, from positive early-childhood opportunities to illness or hunger. A high-quality education system would need to be able to recognize and adequately respond to the diversity of learners and their particular experiences, characteristics, skills, and conditions. Other elements at this level are the content of education and access to relevant educational materials, the processes of education (requiring well-trained


teachers who use learner-centered teaching and learning methods) and the learning environment (including physical aspects such as hygiene, sanitation, and safety, as well as psychosocial aspects such as a welcoming and non-discriminatory climate). In addition to these elements, Pigozzi notes that a high-quality education is one that seeks out and assists learners, particularly those who have traditionally been neglected: the poor, girls, working children, children in emergencies, children with disabilities, and children with nomadic lifestyles. At the education system level, elements affecting the quality of education are the managerial and administrative system, implementation of "good policies," a supportive legislative framework that can ensure equality of educational opportunity, human and material resources, and the means to measure learning outcomes. Learning outcomes can encompass knowledge (including literacy and numeracy), values (including solidarity, gender equality, tolerance, and respect for human rights), skills or competencies, and behaviors. Learning outcomes have generally been key elements of studies of the quality of education. The concept of the quality of education has become more complex and multifaceted over time, which poses considerable challenges to those who wish to measure it. Different studies have focused on different aspects of the quality of education, rather than attempting to capture all the elements described here. It is particularly challenging to determine the extent to which education supports learning outcomes such as creative and emotional development, or promotes objectives such as equality and peace (UNESCO, 2004a).
While it is difficult to capture the notion of educational quality in absolute terms, it has been the general practice of assessment studies to define the quality of education in terms of (i) students’ learning achievement and (ii) the characteristics of their learning environment (Ross et al., 2006b; Saito, 2008).

ASSESSMENTS TO MEASURE THE QUALITY OF EDUCATION

Cross-National Surveys

The importance of regularly collecting information on the quality of education is increasingly recognized. Such research is necessary to account for the massive investments in education and to better understand how to improve the quality of education. In developed countries such research


has been ongoing for the past 50 years, but only recently have evaluations of the quality of education begun to attract wider attention from researchers, from organizations such as the World Bank, and from the media. The International Association for the Evaluation of Educational Achievement (IEA) pioneered the measurement of the quality of education, commencing its first international pilot study in 1958, followed by the First International Mathematics Study (FIMS), First International Science Study (FISS), Second International Mathematics Study (SIMS), and Second International Science Study (SISS) in the 1960s and 1970s. The Reading Literacy Study (RLS) and Computers in Education Study (COMPED) were organized during the 1980s, while the Trends in International Mathematics and Science Study (TIMSS), Progress in International Reading Literacy Study (PIRLS), and Civic Education Study (CIVED) were undertaken during the 1990s and 2000s (Elley, 1992; Beaton et al., 1996; Martin & Kelly, 1996; Grisay & Griffin, 2006). These cross-national surveys also led many countries to undertake national assessments of the quality of education. In 1997, the Organization for Economic Cooperation and Development (OECD) launched the Programme for International Student Assessment (PISA) to assess the achievement of 15-year-olds in Reading literacy, Mathematics, and Science every three years, starting in 2000 (OECD, 2007). A number of developing countries have also taken part in subregional networks conducting comparative surveys since the 1990s. For example, the Zimbabwe study in 1991, organized collaboratively by the Zimbabwe Ministry of Education and UNESCO-IIEP, developed into the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ), with 15 Ministries of Education as members. In addition to data on Reading literacy, Mathematics, and HIV and AIDS knowledge, SACMEQ has also collected data on the characteristics of schools, classrooms, and teachers.
In Francophone Africa, the Conférences des Ministres de l'Education des Pays Francophones (CONFEMEN) has collected data on student achievement since 1991 in the Programme d'Analyse des Systèmes Educatifs des Pays de la CONFEMEN (PASEC). In Latin America, the Laboratorio Latinoamericano de Evaluación de la Calidad de la Educación (LLECE) commenced assessments of the quality of education in 1997 under the leadership of the Oficina Regional de Educación para América Latina y el Caribe (OREALC) (Postlethwaite, 2004b; Kellaghan, 2006). The second survey (SERCE), completed in 2008, assessed Reading, Writing, Mathematics, and Natural Science (UNESCO, 2009).


National Assessments

Many countries have established national assessment mechanisms to periodically monitor and evaluate the quality of their education systems (Kellaghan & Greaney, 2001). In some OECD countries, such as the United States, Japan, the United Kingdom, and Canada, this has been standard practice since the 1970s (Greaney & Kellaghan, 1996; Postlethwaite & Kellaghan, 2008). In Vietnam, the Ministry of Education collaborated with the World Bank and UNESCO-IIEP in 2001 to measure the Reading and Mathematics achievement of Grade 5 pupils. Other Asian countries carrying out national assessments include Cambodia, Laos, India, Indonesia, Nepal, Sri Lanka, and Thailand (Kellaghan, 2006). In Latin America, Mexico is an example of a country which has undertaken many surveys since the 1990s to measure levels of literacy and communication, as can be seen in the Evaluación Nacional de Logro Académico en Centros Escolares (ENLACE) and the Exámenes para la Calidad y el Logro Educativos (EXCALE) (Athie, 2008).

Advantages and Disadvantages of Joining Cross-National Surveys

With many possible mechanisms available to measure the quality of education, policy makers and researchers may wonder whether it is better to join a network (international or subregional), to conduct their own national assessments, or both. One obvious advantage of joining a cross-national survey is its "comparative framework" (Greaney & Kellaghan, 1996), in which each country can be placed in an international perspective with a reference point based on the international average. Second, since the same technically sound methodology must be used at the same time in all participating countries, national capacity can be developed by capitalizing on the expertise made available to participating countries (Murimba, 2006). This can in turn lower staff requirements and costs at the country level for national surveys. Third, cross-national findings are more likely to attract media attention and more readily provoke political dialogue within participating countries (Schleicher, 2006). However, a legitimate concern expressed by some ministers is the unfairness of "league table" comparisons in which countries are ordered by rank (Murimba, 2006). Moreover, given countries' different contextual and cultural backgrounds, cross-national surveys are often criticized as being less attuned to local issues and concerns (Greaney & Kellaghan, 1996).
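The "league table" concern raised above also has a statistical dimension: rankings based on mean scores alone can overstate differences that fall within sampling error. A minimal sketch of this point, using entirely synthetic scores (not actual survey results) and a deliberately simplified rule that two countries are only reliably ranked when their confidence intervals do not overlap:

```python
import math
import random

def mean_with_ci(scores, z=1.96):
    """Return the sample mean and an approximate 95% confidence interval."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((x - mean) ** 2 for x in scores) / (n - 1)
    se = math.sqrt(var / n)  # standard error of the mean
    return mean, (mean - z * se, mean + z * se)

def ranks_distinguishable(a, b):
    """Crude criterion: rankable only if confidence intervals do not overlap."""
    (_, (lo_a, hi_a)) = mean_with_ci(a)
    (_, (lo_b, hi_b)) = mean_with_ci(b)
    return hi_a < lo_b or hi_b < lo_a

random.seed(1)
# Two hypothetical countries whose true means differ by only 5 score points.
country_a = [random.gauss(500, 90) for _ in range(400)]
country_b = [random.gauss(505, 90) for _ in range(400)]

print(mean_with_ci(country_a)[0], mean_with_ci(country_b)[0])
print("Reliably ranked?", ranks_distinguishable(country_a, country_b))
```

Operational survey analyses use more elaborate machinery (design weights, replicate variance estimation), but the sketch illustrates why adjacent positions in a ranking are often not statistically distinguishable.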


METHODOLOGICAL ISSUES IN MEASURING QUALITY OF EDUCATION

A summary of the different characteristics of a selection of cross-national surveys is presented in Table 1.

Table 1. Various Cross-National Surveys with Different Characteristics.

Survey | Coverage | Priority | Target Population | Test Framework
OECD–PISA | International (OECD members and partners) | Measuring progress/identifying trends | 15-year-olds | Competency required for future life
IEA–TIMSS/PIRLS etc. | International (mainly industrialized countries) | Measuring progress/identifying trends | The modal grades for different age groups | Competency within the curriculum
SACMEQ | Subregional (Southern and Eastern Africa) | Capacity building | Grade 6 | Competency within the curriculum
LLECE | Subregional (Latin America) | Generation of indicators | Grades 3 and 6 | Competency within the curriculum
CONFEMEN–PASEC | Subregional (Francophone Africa) | Generation of indicators | Grades 2 and 5 | Competency within the curriculum

Source: Compiled by the authors based on Grisay and Griffin (2006), OECD (2007), IEA (2005a, 2005b), Ross and Makuwa (2009), UNESCO (2009), and CONFEMEN (2009).

Target Population

One of the important issues in research design is how to define the target population. For OECD PISA, for example, the target population has been 15-year-old students. In this methodology, depending on repetition policies and flexible intake systems, it might be necessary to sample from a number of different grades, and sometimes across different education cycles (e.g., some pupils from upper primary schools and some from lower secondary schools). If facilities and teaching materials differ depending on the grade or the cycle, a more complex sample design is


required to ensure that the measured values regarding the characteristics of the learning environment can be appropriately generalized to the population (Postlethwaite, 2004a; Grisay & Griffin, 2006). Unlike PISA's age-based definition of the target population, the IEA surveys are age/grade based. For example, the target population for TIMSS 1995 included (i) students enrolled in the two grades containing the largest proportion of 9-year-old students; (ii) students enrolled in the two grades containing the largest proportion of 13-year-old students; and (iii) students in their final year of secondary education (Postlethwaite, 2004a; Grisay & Griffin, 2006). In this methodology, national comparisons are made within the same two grades, where the same curriculum is used for each grade. However, international comparisons might be made across different grades, depending on differences in school-starting age and education structure. SACMEQ has used a purely grade-based population at the Grade 6 level. This is because most of the participating countries have relatively high repetition rates and late entry to primary school, so that by the time pupils reach Grade 6 there is large variation in age. Policy makers in SACMEQ countries wished to identify which elements of the learning environment determine achievement, and age differences at Grade 6 are considered a potential determining element (Postlethwaite, 2004a; Saito, 2008). Grade 6 was also identified as the target population because it was the final or penultimate grade of primary education in most SACMEQ countries.
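Whatever the population definition, the sample designs used by these surveys are typically two-stage: schools are drawn first, often with probability proportional to enrolment, and pupils are then drawn within the selected schools. A much-simplified sketch of the general idea, using an invented sampling frame (school names and enrolments are made up, and schools are drawn with replacement, unlike real designs):

```python
import random

def two_stage_sample(schools, n_schools, pupils_per_school, rng):
    """Draw schools with probability proportional to enrolment,
    then a simple random sample of pupils within each drawn school."""
    total = sum(s["enrolment"] for s in schools)
    weights = [s["enrolment"] / total for s in schools]
    # Stage 1: schools (with replacement, a simplification of real PPS designs).
    drawn = rng.choices(schools, weights=weights, k=n_schools)
    sample = []
    for school in drawn:
        # Stage 2: simple random sample of pupils within the school.
        pupils = list(range(school["enrolment"]))
        k = min(pupils_per_school, school["enrolment"])
        sample.extend((school["name"], p) for p in rng.sample(pupils, k))
    return sample

rng = random.Random(42)
# Hypothetical frame: 50 schools with Grade 6 enrolments between 20 and 120.
frame = [{"name": f"school_{i}", "enrolment": rng.randint(20, 120)}
         for i in range(50)]
sample = two_stage_sample(frame, n_schools=10, pupils_per_school=20, rng=rng)
print(len(sample))  # 10 schools x 20 pupils = 200 sampled pupils
```

Real designs add stratification by region or school type and sampling weights for analysis; the sketch only shows why clustering into schools is the natural first stage.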

Test Framework

Although the trend in international and subregional studies has been to measure student achievement in basic subjects such as Reading and Mathematics, their test frameworks are not necessarily the same. For example, IEA's TIMSS used the curricula of the participating countries to construct its tests. The test blueprint was organized using the "domain" as one dimension and the "cognitive level" as another. The Mathematics domain included (i) geometry, (ii) measurement, (iii) algebra, and (iv) data; and the cognitive level included (i) knowledge of facts and procedures, (ii) application of concepts, (iii) problem solving, and (iv) logical thinking (Beaton et al., 1996).1 The Mathematics test of SACMEQ was similar to that of TIMSS 1995, with (i) number, (ii) measurement, and (iii) graph/data as the domains; and (i) knowledge, (ii) understanding, and (iii) application as the cognitive levels.


For Reading literacy, IEA's PIRLS and SACMEQ had the same domains: (i) narrative prose, (ii) expositive prose, and (iii) document. For the first two domains, the cognitive level included (i) verbatim, (ii) paraphrase, (iii) inference, and (iv) main ideas. For the document domain, two levels were used: (i) locate and (ii) locate + process. Unlike IEA's and SACMEQ's test frameworks, OECD PISA did not use school curricula as the basis for its test framework. Rather, PISA focuses on the kinds of competencies that will be useful in adult life (Postlethwaite, 2006; Postlethwaite & Leung, 2006; Ross et al., 2006b). In PISA, three subjects have been measured: Reading literacy, Mathematical literacy, and Scientific literacy. Compared to IEA and SACMEQ, the PISA blueprint has been organized differently. For example, in Mathematics, the three dimensions measured were (i) contents (quantity, space/shape, change/relation, uncertainty); (ii) process (recreation, relation, critical thinking, and aptitude); and (iii) situation (private, educational, professional, public, and scientific) (OECD, 2007; Ross et al., 2006b). It is important to recognize these differences and to use a test framework that corresponds with the priorities of the country. If the Ministry is mostly concerned about whether schools have taught the specified curricula to an acceptable standard, then a test framework similar to the ones used by IEA or SACMEQ would be more suitable. On the other hand, if the Ministry is more concerned about whether schools enable adolescents to acquire the knowledge and skills they will need in the future, then the methodology used by PISA would be more appropriate (Saito, 2008).
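A test blueprint of the domain-by-cognitive-level kind described above is essentially a two-dimensional tally of items, which test developers balance across both margins. A small illustrative sketch, where the domain and level labels loosely echo the SACMEQ-style Mathematics framework but the item counts are invented for demonstration only:

```python
# Illustrative blueprint: rows are content domains, columns cognitive levels.
# Item counts are invented; only the structure mirrors the frameworks described.
blueprint = {
    "number":      {"knowledge": 8, "understanding": 6, "application": 4},
    "measurement": {"knowledge": 5, "understanding": 5, "application": 3},
    "graph/data":  {"knowledge": 3, "understanding": 4, "application": 2},
}

def totals(bp):
    """Marginal item totals per domain and per cognitive level, plus grand total."""
    by_domain = {d: sum(levels.values()) for d, levels in bp.items()}
    by_level = {}
    for levels in bp.values():
        for level, n in levels.items():
            by_level[level] = by_level.get(level, 0) + n
    return by_domain, by_level, sum(by_domain.values())

by_domain, by_level, grand = totals(blueprint)
print(by_domain)
print(by_level)
print(grand)  # 40 items in this invented blueprint
```

Comparing two surveys' blueprints in this form makes concrete why their results are not directly interchangeable: the marginal weights given to each domain and cognitive level differ.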

Outcomes Not Often Covered in Assessments

Measuring Behaviors and Attitudes toward School Subjects

Both the PISA and IEA studies include measures relating to attitudes and emotions. Although students' behaviors and attitudes toward school subjects are considered very important elements that could influence achievement, such measurements have not become standard practice in the way that the measurement of knowledge and skills has. Furthermore, as shown in the TIMSS results (Beaton et al., 1996; Mullis et al., 1997), Asian countries such as Singapore, Korea, and Japan had contrasting results in which many students with high achievement in Mathematics and Science held rather negative attitudes toward these subjects. It is reasonable to say that the development of definitions and


methodologies to measure attitudes and emotions is still an ongoing process. However, it is important to investigate why in some cases there was no relationship between achievement and attitudes in certain subjects, and to use this information to improve the quality of education.

Measuring Sustainable Life Skills

Since the International Conference on Education for Sustainable Development (ESD) in Johannesburg in 2002, attention has been given to the importance of acquiring sustainable development skills in addition to the basic school subjects (Ross et al., 2006a). However, while the definition of ESD remains very vague,2 it is difficult to develop a good measurement framework. There is a movement to relate ESD skills to life skills, which have been defined by various international organizations (UNICEF, 2005; UNESCO, 2004b; WHO, 1999). None of these definitions has been accepted as the standard (Ross et al., 2006b). However, there is general agreement that one of the areas which life skills should encompass is knowledge and behavior regarding health. The SACMEQ III survey carried out in 2007 included the measurement of pupils' and teachers' knowledge of HIV and AIDS, as requested by the Ministers of Education from Southern and Eastern Africa during the SACMEQ Assembly in 2003. Measuring other areas of ESD, such as environmental needs, has also been on the discussion agenda during several SACMEQ Scientific Committee Meetings (SACMEQ, 2007).

Measuring Teacher Performance

In many research surveys, the evaluation of teacher performance is absent or receives little attention (Anderson, 2004). For example, none of the PISA studies collected data about teachers' background information. TIMSS and PIRLS did collect information on teacher characteristics, but not on teachers' subject-matter knowledge. The IEA carried out a video study of classroom teaching practices in Mathematics and Science in several countries (Hiebert et al., 2003).
In general, due to strong opposition from teachers’ unions, it has been very difficult to measure teacher performance. However, SACMEQ did test Grade 6 teachers during SACMEQ II (Reading and Mathematics in all SACMEQ countries except for South Africa and Mauritius) and during SACMEQ III (HIV/AIDS knowledge in all SACMEQ countries and Reading and Mathematics in all SACMEQ countries except Mauritius). For these surveys, it is possible to put pupils’ and teachers’ ability on a single scale for each subject using modern Item Response Theory. It is therefore possible to compare pupil achievement in


Reading literacy and Mathematics with the achievement of their teachers in these subjects.
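The single-scale comparison of pupils and teachers described above rests on both groups answering a common set of items calibrated under an item response model. A much-simplified Rasch-style sketch of the underlying idea, with invented item difficulties and response patterns (SACMEQ's operational scaling is considerably more elaborate):

```python
import math

def rasch_ability(responses, difficulties, iters=50):
    """Maximum-likelihood ability estimate under the Rasch model for a
    person's 0/1 responses to items with known difficulties (Newton steps)."""
    theta = 0.0
    for _ in range(iters):
        # Gradient of the log-likelihood: observed minus expected score.
        grad = sum(r - 1 / (1 + math.exp(-(theta - b)))
                   for r, b in zip(responses, difficulties))
        # Test information: sum of p(1 - p) over items.
        info = sum(p * (1 - p)
                   for p in (1 / (1 + math.exp(-(theta - b)))
                             for b in difficulties))
        if info == 0:
            break
        theta += grad / info
    return theta

# Invented difficulties for a common anchor test taken by pupils and teachers.
difficulties = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

pupil = [1, 1, 1, 0, 0, 0, 0, 0]    # weaker response pattern (3 of 8 correct)
teacher = [1, 1, 1, 1, 1, 1, 1, 0]  # stronger response pattern (7 of 8 correct)

# Because both answered the same calibrated items, the two estimates are
# directly comparable on a single scale.
print(rasch_ability(pupil, difficulties), rasch_ability(teacher, difficulties))
```

The estimate is undefined for all-correct or all-wrong patterns, which is one reason operational programs use more robust estimators; the sketch only shows how a common item set puts two populations on one scale.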

IMPACT OF RESEARCH ON EDUCATIONAL PLAN IMPLEMENTATION

Comparative Review of EFA–FTI Countries

It is obvious that data collection should not be the end of survey research. Research is meaningless if information on quality improvement is not reflected in the policy-making process. It has often been observed that educational policy formation led by Ministers of Education rarely capitalizes on empirical evidence (Mendelsohn, 1996; Reimers & McGinn, 1997). Curiosity-driven research undertaken in universities may not be very relevant to the policy concerns of Ministries of Education, and would therefore have little or no impact on educational policy decisions (Murimba, 2006; Saito, 1999). This section contains a comparative review of the NESPs of FTI countries.3 Many FTI countries have been undertaking either international or subregional cross-national assessments to measure learning achievement (see Table 2). According to UNESCO (2008a), to make the process and outcomes regarding the quality of education more transparent, FTI countries have been urged to include the following information in their national plans:

- A description of the education results to be achieved
- Baseline data on student learning outcomes
- A description of the assessment system in use

Despite these efforts, UNESCO (2008a) reported that many of the national plans addressed quality only partially. In most EFA–FTI countries, learning assessment systems are still in development. Based on a review of the educational plans of 35 EFA–FTI countries, UNESCO identified several factors that are associated with quality and grouped them into five categories (see Table 3). In terms of indicators on the quality of education, UNESCO (2008a) reported that most countries do not go beyond a basic set of indicators, mostly relating to inputs. These include gross and net enrolment rates, repetition rates, survival rates, primary completion rates, the gender parity index, student and teacher attendance, instruction time, teacher

15

Monitoring the Quality of Education

Table 2.

EFA–FTI Countries’ Participation in International and Regional Assessment Studies.

Countries Benin Burkina Faso Cameroon Central African Republic Djibouti Ethiopia Gambia Ghana Guinea Kenya Lesotho Liberia Madagascar Mali Mauritania Mozambique Niger Rwanda Sao Tome and Principe Senegal Sierra Leone Yemen Cambodia Mongolia Tajikistan Timor-Leste Vietnam Guyana Haiti Honduras Nicaragua

International

TIMSS 2003, 2007

Regional

PASEC PASEC PASEC PASEC

TIMSS 2003, SISS PASEC SACMEQ I, II, III SACMEQ II, III PASEC PASEC SACMEQ II, III PASEC

PASEC PASEC TIMSS 2007

TIMSS 2007

SERCE SERCE

Source: Compiled by the authors based on UNESCO (2008a) and Ross and Makuwa (2009).

qualification, pupil:teacher ratio, and pupil:textbook ratio. Examples of more elaborated indicators include percentage of children who achieved the established national standard at certain grades, percentage of children provided with a hot meal in schools, and level of ICT usage in schools. It appears that most countries identify indicators in the context of the present ‘‘realities’’ (UNESCO, 2008a). For example, if a country

16

MIOKO SAITO AND FRANK VAN CAPPELLE

Table 3. Percentage of Educational Plans that Dealt with Factors Associated with Quality, Based on a Desk Review of EFA–FTI Plans for 35 Countries.

Category                          Factor Associated with Quality                      % of Plans
Learners, family, and community   Decentralization of the system                      74
                                  Family and community support                        59
                                  Affordability                                       44
Enabling inputs                   Textbooks and learning materials                    86
                                  School/classroom libraries                          86
                                  Gender sensitive/inclusive environments             86
                                  In-service professional development for teachers    82
Teaching/learning interactions    Differentiated, multigrade teaching                 63
                                  Learner-centered, constructivist methods            46
                                  Inclusion of local knowledge                        46
Learning outcomes                 Vocational and life skills                          87
                                  Foundational cognitive skills                       51
                                  Participation in economic development               29
                                  Participation in civic duties                       29
Assessment practices              Formal learning assessment programs                 74
                                  Monitoring and evaluation systems                   26
                                  Informal, alternative practices                      1

Source: UNESCO (2008a).

has not yet achieved the provision of basic school inputs, these would be the priority in its plans before more elaborate indicators are established. In conceptualizing the quality of education, the general trend in the plans was to describe the importance of a student-centered and safe, inclusive learning environment in which better trained teachers deliver relevant and useful content. A significant number included the use of distance learning for both teacher development and student learning. Other, less commonly included indicators are: mother-tongue instruction; availability of textbooks and learning materials in local languages; locally chosen/produced curricula and textbooks; more specialized private schools to cater for students' particular needs; use of integrated learning systems to reduce the need for and cost of textbook development; school lunches; health and counseling services; inclusion of local myths and stories to increase relevance; and flexible accommodations (e.g., for young mothers) (UNESCO, 2008a).


Two Case Studies

This section focuses on two examples, one African country (Malawi), which is one of the focus countries of the Deutsche Gesellschaft für Technische Zusammenarbeit (GTZ),4 and one Asian FTI country (Vietnam), to examine the link between research and educational plan implementation in more detail. These countries were selected because they are illustrative cases of developing countries that have taken the initiative of carrying out conceptually and methodologically sound large-scale assessment studies.

The Case of Malawi

Identification of Research Needs. In Malawi, free primary education was introduced in 1994. In concrete terms, this translated into a jump in the primary school population from 1.9 million to 3.2 million from one year to the next (Chimombo, Kunje, Chimuzu, & Mchikoma, 2005). Consequently, there was an acute shortage of teachers and classrooms. Thanks to some 20,000 temporary unqualified and retired teachers newly recruited that year, the pupil:teacher ratio was maintained at the previous level of 60:1. However, the pupil:qualified teacher ratio became 120:1. In addition, the pupil:classroom ratio jumped to 160:1 because no new classrooms were constructed, forcing some schools to hold open-air classes.

In the wake of the 1990 Jomtien Conference on Education for All, Malawi was one of several Sub-Saharan African countries that requested the support of the IIEP in 1993 to help establish a capacity development program in the area of monitoring and evaluating the quality of education. The proposal stressed the need for educational planners to obtain competencies in this area as well as to bring about an information culture in which research information would be used for educational policy making (Moyo et al., 1993). It was decided that Reading literacy would be tested because it was considered the most critical subject, forming the basis for the others. A series of training programs were organized, which subsequently led to the formation of a consortium called SACMEQ. As one of the founding members of SACMEQ, Malawi has been very active in its participation.

The policy concerns of senior decision makers in these countries, including Malawi, were translated into specific research questions, on the basis of which educational planners proceeded to develop the questionnaires and tests. SACMEQ considered quality to consist of the following elements, all of which could influence pupil achievement in Reading, Mathematics, and HIV and AIDS Knowledge:

• School characteristics (type, location, size, resources, principal's qualification, parental involvement, etc.)
• Teacher characteristics (age, sex, qualifications, behavior, in-service training, classroom resources, etc.)
• Pupil characteristics (age, sex, attendance, repetition, socioeconomic status, nutrition, home help, etc.)

The methodologies adopted by SACMEQ ensure high validity and reliability, especially in such areas as sampling techniques, test construction, data collection, data cleaning, and data analyses. The SACMEQ approach has made possible different kinds of comparison (Ross, Saito, Dolata, Ikeda, & Zuze, 2004):

• Comparison among countries
• Comparison over time
• Comparison between pupils and teachers
• Comparison against the Ministry's benchmark standards (defined by subject experts)
• Comparison with other international studies (TIMSS, PIRLS, etc.)
• Comparison with levels of competence
• Comparison of gender, social, and distributional equity
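The staffing squeeze described at the start of this case can be checked with simple arithmetic. The following sketch uses only the figures quoted above (3.2 million pupils, a 60:1 overall pupil:teacher ratio, and a 120:1 pupil:qualified-teacher ratio); the resulting head counts are an inference, not figures stated in the source:

```python
# Rough check of the Malawi 1994 staffing figures quoted above.
# A 60:1 overall ratio and a 120:1 qualified-teacher ratio for
# 3.2 million pupils imply the following approximate head counts.
pupils = 3_200_000

teachers_total = pupils // 60        # all teachers, qualified or not
teachers_qualified = pupils // 120   # qualified teachers only
teachers_unqualified = teachers_total - teachers_qualified

print(teachers_total, teachers_qualified, teachers_unqualified)
# prints 53333 26666 26667
```

On this rough accounting, about half of the teaching force was unqualified, of the same order as the "some 20,000" temporary recruits mentioned in the text.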

Improvement of Malawi’s Research Capacity. Malawi had a late start in the main data collection for the SACMEQ I study due to a problem in securing funding. The data collection was therefore postponed from 1995 to 1998. Furthermore, there was a large amount of missing data, and therefore their SACMEQ I report was treated as an ‘‘interim report,’’ that is, not satisfying the SACMEQ technical standards (Milner, Chimombo, Banda, & Mchikoma, 2001). When the SACMEQ II study commenced in 2000, it was found during data collection that some of the selected schools turned out to be schools where there were no Grade 6 pupils. The operation was halted, and the problem was traced to the Education Management Information System (EMIS). In mid-2002, the Ministry of Education requested the SACMEQ Coordinating Centre to organize a special training workshop for Malawi. Experienced National Research Coordinators (NRCs) from Kenya and Namibia were sent as faculty members to lead training sessions in Malawi. Two months after the training, Malawi was back on track and

Monitoring the Quality of Education

19

able to complete data collection by the end of 2002, data cleaning in 2003, and the SACMEQ II report in 2005 (Murimba, 2005; Chimombo et al., 2005). For Malawi, some important lessons were learned from these difficult experiences. For example, the importance of involving professionals with different areas of expertise and from different organizations in the training activities, that is, from the Ministry of Education, University of Malawi, Centre for Educational Research and Training, and the UNESCO National Commission. The self-help approach also contributed to building the sense of local ownership that characterizes the SACMEQ project. In concrete terms, Malawi saw a tremendous improvement in strengthening the national EMIS (Murimba, 2005). For the SACMEQ III study, Malawi was one of the first SACMEQ countries that completed data preparation and cleaning (Hungi, 2009). In SACMEQ countries in general the time of completion was significantly reduced due to the new ‘‘Janitor’’ data quality management software developed at IIEP. This software automates many of the data cleaning processes which previously required external assistance. This is another example of how participation in a crossnational study could lead to an increase in research capacity.

Research Results and Policy Implications. The impact of participation in SACMEQ on research capacity, as described above, needs to be separated from the impact of the research results on policy implementation. The SACMEQ I results for Malawi were not satisfactory to policy makers. Findings demonstrated that there were certainly negative spin-offs from free primary education. The already very low percentage of Grade 6 pupils who owned textbooks, notebooks, and pencils during SACMEQ I was even lower in SACMEQ II. Although there was slight improvement in SACMEQ II with regard to teacher qualification, teaching materials, and school resources, there was a general shortage of human and material resources, especially in the Central West, South West, and Shire Highland divisions (Chimombo et al., 2005).

The achievement results were a difficult reality to accept. Malawi's SACMEQ I mean Reading score for Grade 6 pupils was the lowest in the subregion, and only one-fifth of Grade 6 pupils were considered to meet the "minimum" level of the Ministry benchmark standard in Reading. The SACMEQ II results revealed a serious deterioration in quality: fewer than 10 percent of Grade 6 pupils met the "minimum" benchmark standard (Chimombo et al., 2005).


As was the case for the other SACMEQ countries, the policy suggestions were classified into five categories depending on the level of intervention:

• Consultation with staff, community, and experts
• Review of existing planning and policy procedures
• Data collection for planning purposes
• Educational research
• Investment in infrastructure and resources

In addition, each policy suggestion included the relevant department for implementation, the expected cost, and the duration. Ten policy suggestions out of a total of 52 dealt with shortages of materials, and they appeared throughout these different categories. For example, one policy suggestion stated that District Education Managers should make sure that education advisors and schools are encouraged to use the teacher development centers (in the second category above). Another called on the Ministry of Education to establish clear guidelines on norms for material provision (in the third category), while yet another urged the Planning Unit to search for more resources to cover the deficits in materials (in the fifth category) (Chimombo et al., 2005). It has been reported that these suggestions were used as inputs to the development of the educational policy investment framework (Murimba, 2005).

Review of the Malawi NESP. The Malawi NESP defines its mission as "to provide quality and relevant education to the Malawian nation." For the preprimary, secondary, vocational-technical, and higher education subsectors, the number one priority has been "access and participation," followed by "quality." However, for the primary education subsector, "quality" has been the number one priority (UNESCO, 2008c). The NESP listed the following indicators as important for improving the quality of education:

• Reduced dropouts and repetition
• Improved teacher distribution
• Increased survival rate
• Increased supply of teachers

None of the indicators listed dealt with pupil achievement. SACMEQ was mentioned as an important capacity-building tool, but there was no reference to its research results for policy planning purposes or for program intervention (UNESCO, 2008c).


The Case of Vietnam

Identification of Research Needs. Since independence in 1945, Vietnam has experienced a series of education reforms. Although mass education was introduced in the reform following reunification in 1975, the key reform was associated with the transition from a command economy to a market economy in the late 1980s (Le, 2006; UNESCO, 2008b). Education was considered an important factor in the economic development of Vietnam and therefore received high priority (World Bank, 2004).

In the early 1990s, the Vietnamese government undertook an education and human resource sector study, jointly supported by UNESCO and UNDP (UNESCO, 2008b). This was followed by an education financing study, conducted jointly by the World Bank and the government in 1996. In both studies, the focus was on the costs and financing of development, renovation, and reform of the different subsectors of the education system. There was no attempt to define the quality of education or to investigate learning achievement (Griffin, 2007).

Quantitative expansion took place during the 1990s, and the NER for primary school exceeded 90 percent. At the end of the 1990s, some 15,000 primary schools enrolled over 10 million pupils in Vietnam. Although practically no data were available before the reform, it was reported that the survival rate in primary school was 68 percent and the transition rate to secondary school was 98 percent (Le, 2006).

In the late 1990s, the government decided to undertake the first-ever national assessment of learning achievement for pupils at the upper primary level. It was the Ministry officials themselves who decided on the policy questions for the study. These policy questions concerned the level of Grade 5 pupil achievement in Reading and Mathematics, the level of material and human resources compared to Ministry benchmarks, and the equity of material and human resources.
A large-scale monitoring study was then launched by the Ministry of Education and Training in 2000. This survey is one of the few projects that have focused on learning outcomes rather than school access (World Bank Independent Evaluation Group, 2006), and it emerged from the collective efforts of the Ministry of Education, the World Bank, and UNESCO-IIEP (World Bank, 2004).

Research Design, Capacity Building, and Results. The assessment was a cross-sectional survey based on a scientific sample of Grade 5 pupils and their teachers drawn from all of the 61 provinces5 in April 2001. Reading literacy in Vietnamese and Mathematics were selected as the focus. Since a new curriculum was to be introduced in 2000, the tests covered both the old and new curricula. The study also included background questionnaires for pupils, teachers, and school heads. The final data archive contained data from 72,660 Grade 5 pupils and over 7,000 teachers in 3,636 primary schools in Vietnam (World Bank, 2001). The survey methodology adhered to the high international standards set by the IEA and SACMEQ. The local team of the National Institute for Educational Sciences (NIES) was trained by an international team of specialists in the areas of sample design, test construction, and data preparation (Postlethwaite, 2004a; World Bank, 2004).

Results revealed that a large proportion of Grade 5 pupils were in schools without access to some of the key material resources for learning, such as wall charts, lockers, bookshelves, and lamps. In addition, there were virtually no classroom library books for pupils to read. In terms of pupil achievement, a profile of skills in each subject was established, and there were significant provincial differences in achievement in both subjects (World Bank, 2004).

The final chapter of the Vietnam report contains a series of policy suggestions drawn from hard evidence on the quality of education, providing feedback on the original policy questions. Following the SACMEQ example, these policy suggestions were presented together with implementation time frames, estimated costs, and the relevant Ministry departments responsible for their implementation. Out of some 40 policy suggestions, 3 were directly related to achievement, 8 dealt with the quality of material resources, and 7 with human resources. Other suggestions related to the home environment, school processes, and further data collection.
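As a quick sanity check on the design, the archive figures quoted above imply an average of about 20 sampled pupils per school, in line with the fixed within-school sample sizes commonly used in such cluster designs (the per-school target is an inference, not stated in the source):

```python
# Average number of Grade 5 pupils per sampled school in the 2001
# Vietnam survey, derived from the archive figures quoted in the text.
pupils = 72_660
schools = 3_636

print(round(pupils / schools, 1))  # prints 20.0
```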
For example, one policy suggestion was for the provincial authorities to consider undertaking a review of school resources (short time frame, low cost), while another was about increasing material resources for isolated schools (medium time frame, high cost). Regarding pupil achievement, the Ministry was to introduce a new assessment framework based on a profile format, in which a score is associated with descriptions of the skills achieved, for the new curriculum (medium time frame, low cost). It was also suggested that the Ministry use this tool in intervention programs for specific groups of pupils and teachers (also medium time frame, low cost). These suggestions were written by educational planners for review, consultation,


dissemination, and implementation by the Ministry of Education and Training (World Bank, 2004).

Review of Vietnam's Policy Documents. In 2003, the government, together with UNESCO, prepared the Education Strategic Development Plan 2001–2010 and the National Action Plan for Education for All (Socialist Republic of Vietnam, 2003). There were also the Five Year Socio-Economic Development Plan 2006–2010, the Millennium Development Goals (MDGs)/Vietnam Development Goals (VDGs), and the Five Year Strategic Education Development Plan 2006–2010. These policy and plan documents formed the basis for the Law of Education in 2005. In these documents, improving educational quality was spelled out as one of the government's sector priorities for all subsectors. The focus was on the following four areas (UNESCO, 2008b):

• Improvement of overall quality, stressing national spirit and the socialist ideal
• Abolition of gender disparity
• Improvement of teaching/learning methods, encouraging creativity, self-study capability, and use of ICT
• Assuring career-oriented knowledge rather than subject-oriented knowledge

It should be noted that the fourth focus above aligns with the PISA test framework rather than with the national assessment framework established in 2001. UNESCO (2008b) reports that Vietnam considers the following to be contributing factors for improving the quality of education: school hours, school shifts, teachers' salaries, and learning outcomes. However, no details were given concerning the type of learning outcomes, that is, whether these outcomes are to be described through an average score or a profile format. Although user-friendly policy suggestions were provided in the 2001 Vietnam assessment study regarding the use of a format to measure the profile of Grade 5 pupils' achievement, there was no trace of the use of these policy suggestions in the review provided by UNESCO (2008b).
Moreover, despite the enormous investment made by the World Bank in carrying out the Grade 5 assessment which took place in 2001, no further assessment plan was mentioned in the educational plan documents prepared by the Vietnam Ministry of Education (Vietnam Government & UNDP, 2005; Vietnam Government, 2005; Ministry of Education and Training, 2003).


WEAK LINK BETWEEN POLICY SUGGESTIONS AND NATIONAL EDUCATION PLANS

From the detailed review of the two country cases, it seems clear that a distinction needs to be made between the following types of impact:

• Impact of the assessment exercise itself on national research capacity
• Impact of assessment results on policy planning
• Impact of policy interventions on educational quality

In the case of Malawi, the assessment exercise (the SACMEQ experience) has primarily had an impact on national research capacity. Some policy suggestions were incorporated in a few of the policy documents, yet the impact on educational quality remains to be seen. In Vietnam's case, although some of the indicators used in the World Bank report match those in the Strategic Plan, there was no clear linkage between the policy suggestions and the policy plans.

Both the SACMEQ research studies and the 2001 Vietnam Grade 5 assessment have been based on the policy concerns of senior decision makers in Ministries of Education (see Fig. 1). These studies have employed high technical standards with respect to the methodologies used and contain hard evidence that can be used for policy making. The policy suggestions and agendas for action were written by educational planners from these countries and endorsed by the ministers. Nevertheless, the final stages in the policy cycle (policy reform, agenda for action, and program implementation) have yet to take place in these countries.

However, a few SACMEQ countries have intensively and successfully used the SACMEQ policy suggestions in their policy documentation and/or investment frameworks for education. Their dissemination and communication strategies, as well as the coordination and harmonization between planners and decision makers, planners and donors, and donors and decision makers, have been documented in the literature and are worth examining (Hovland, 2005; Leste, 2005; Murimba, 2005; Nzomo & Makuwa, 2006).

The impact of SACMEQ at the national and international levels is presented in Table 4. For example, Leste (2005) reported that the SACMEQ II results regarding the problem of streaming at the school level prompted the minister to launch a de-streaming promotion activity at all levels of planning in Seychelles. Nzomo and Makuwa (2006) reported that the deterioration of Namibia's Reading achievement from 1995 to 2000 prompted the Ministry to embark on a

Fig. 1. SACMEQ Policy Cycle. Source: Saito (1999).

targeted intervention in disadvantaged regions focusing on resource provision and the implementation of language policy.

IMPROVING THE COMMUNICATION AND DISSEMINATION OF ASSESSMENT RESEARCH RESULTS

One of the reasons for the lack of integration of research results in policy dialogue and policy reform is the inadequate and/or ineffective communication and dissemination of results among policy and decision makers (Mendelsohn, 1996; Gunderson, 2007). A promising tool that was specifically developed to improve the communication of results for decision making is StatPlanet6 (Van Cappelle, 2009). This visualization tool is currently used by SACMEQ, UNESCO, and a number of other

Table 4. Impact of SACMEQ Research Results on Educational Policy.

Types of SACMEQ impact at the national level:

• To inform presidential commissions of inquiry into education
• To use as inputs to education sector analysis
• To develop national assessment systems
• To develop education policy investment frameworks
• To develop education sector master plans or strategic plans
• To lead to educational reforms and school improvement initiatives
• To strengthen national EMIS

[The original table marks with an X which of the 15 SACMEQ countries (BOT, KEN, LES, MAL, MAU, MOZ, NAM, SEY, SOU, SWA, TAN, UGA, ZAM, ZAN, ZIM) reported each type of impact.]

Source: Murimba (2005), Leste (2005), and Nzomo and Makuwa (2006).


international organizations as well as government agencies for the purpose of improving the communication and dissemination of data. StatPlanet enables the visualization of research results in the form of interactive thematic maps for spatial analysis (such as comparisons between countries, provinces, or regions) and interactive graphs (e.g., bar charts, time series, and scatter plots). StatPlanet automates the transition from data to visual presentations and enables users to browse visually through the data and select the type of visualization most relevant to their needs. It also includes functions for automating the process of merging data from different sources and databases, for example, EMIS and national census data, regardless of database structure and naming conventions.

StatPlanet offers a number of advantages over existing, non-electronic means of disseminating research results and data within Ministries of Education, such as publications and reports. Visualization is useful for data analysis and makes it easier to derive meaning and understanding from data. StatPlanet also facilitates and speeds up the process of merging and visualizing data, tasks that Ministries of Education may not always have the time, resources, and/or staff to do otherwise. By automating these processes, the time between data collection and data analysis can be reduced. Disseminating data as early as possible is important to ensure that the data are still relevant and of interest to policy makers and planners. By the time research results are released to the public (which can take from several months to several years), policy interest may already have dissipated (Gunderson, 2007). The software itself is also easy to disseminate. This is an important criterion if it is to be used for the dissemination of research results (e.g., within the Ministry or to district offices around the country).
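The merging step described above can be made concrete with a minimal, hypothetical sketch of reconciling records that use different naming conventions, the kind of work StatPlanet automates; all region names, field names, and figures here are invented for illustration and are not taken from StatPlanet itself:

```python
# Hypothetical sketch: merging EMIS records with census data whose
# region names follow a different convention. Figures are invented.

emis = {
    "Central West": {"pupil_teacher_ratio": 72},
    "South West": {"pupil_teacher_ratio": 68},
}
census = {
    "CENTRAL WEST": {"population": 1_200_000},
    "SOUTH WEST": {"population": 950_000},
}

def normalize(name):
    """Reduce a region name to a canonical key, ignoring case and spacing."""
    return " ".join(name.lower().split())

merged = {}
for name, record in emis.items():
    merged[normalize(name)] = dict(record)
for name, record in census.items():
    merged.setdefault(normalize(name), {}).update(record)

print(merged["central west"])  # both indicators now sit in one record
```

The point of the sketch is the normalization step: once both sources share a canonical key, indicators from any number of databases can be joined without hand-editing region names.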
StatPlanet does not require installation and can run online (in a web browser) as well as offline. Visualizations produced with StatPlanet can also be exported for use in presentations, reports, and publications. In contrast to many other data visualization or mapping systems, StatPlanet is targeted toward non-technical users rather than statisticians, GIS specialists, database administrators, or other expert users. It is intended to be easy to use for educational planners and policy makers without requiring particular training in statistics or database systems. Feedback from users of StatPlanet so far suggests that ease of use is one of its strongest points. Further research and development in the area of data visualization for decision making would need to focus on catering to the particular needs of policy makers and planners, and on coming to a better understanding of how data are currently communicated, disseminated, and used at the different levels of educational administration.


CONCLUSION

In this chapter, the concepts and issues involved in measuring the quality of education have been discussed in the context of the EFA goals. The linkage between learning assessment activities and policy planning was analyzed based on a UNESCO desk review of education plans from 35 countries as well as more detailed case studies of Malawi and Vietnam. It was found that while the quality of education is recognized as an important element in most education plans, there is no consistent way of interpreting or measuring it.

Many countries are making significant investments in assessments of the quality of education, which naturally raises the expectation that such assessments will be used to improve quality. However, as long as no significant efforts are made to link the outcomes of such studies to concrete education plans and educational policies, these studies are unlikely to have much impact. One way to improve this linkage is through better communication and dissemination of results, which should take place at all levels of educational administration. Constructive dialogue is also needed, in which the outcomes of such studies are openly discussed in relation to educational policies and national plans.

Furthermore, as seen in Vietnam's case, the concept of quality of education described in the education plan documents may not exactly match the methodology used in previous assessment studies. For example, if "career-oriented knowledge" is prioritized over "subject-oriented knowledge" in the education plan document, this should be reflected in the test framework used. Although sophisticated and innovative methodologies have already been developed to measure the quality of education, the process of linking research results with policy still seems to be at a developmental stage.
This is a challenge not only for researchers and policy makers, but also for development partners to ensure that (i) policy and planning become more firmly grounded in objectively verifiable scientific evidence and (ii) policy interventions have an impact on the improvement of the quality of education.

NOTES 1. For TIMSS 2003, the cognitive framework has been modified. See the following link for details: http://timss.bc.edu/PDF/t03_download/t03cdrpt_appendix_A.pdf


2. "Skills required to satisfy the social, economic, and environmental needs of the present generation without compromising the needs and resources available for future generations" (Ross et al., 2006, p. 289).
3. The list of FTI countries is updated periodically. The endorsed countries as of March 2009 have been listed in the appendix.
4. This document was originally presented at an International Conference organized by the Deutsche Gesellschaft für Technische Zusammenarbeit (GTZ) in September 2009.
5. In 2008, the number of provinces in Vietnam was increased to 63.
6. http://www.sacmeq.org/statplanet/

REFERENCES

Adams, R. S. (Ed.) (1978). Educational planning: Towards a qualitative perspective. Paris: UNESCO-IIEP.

Amagi, I. (1996). Upgrading the quality of school education. In: Learning: The treasure within. Report to UNESCO of the International Commission on Education for the Twenty-first Century. Paris: UNESCO.

Anderson, L. W. (2004). Increasing teacher effectiveness. Fundamentals of educational planning (Vol. 79). Paris: UNESCO-IIEP.

Athie, L. (2008). Communication competences of students in Mexico: Policy recommendations for the National Reading Program. Unpublished master's thesis, UNESCO-IIEP, Paris.

Beaton, A., Martin, M., Mullis, I., Gonzales, E., Smith, T., & Kelly, D. (1996). Science achievement in the middle school years. Boston, MA: IEA, TIMSS International Study Center.

Beeby, C. E. (1969). Educational quality in practice. In: C. E. Beeby (Ed.), Qualitative aspects of educational planning (pp. 39–68). Paris: UNESCO-IIEP.

Chimombo, J., Kunje, D., Chimuzu, T., & Mchikoma, C. (2005). The SACMEQ II project in Malawi: A study of the conditions of schooling and the quality of education. Harare: SACMEQ.

CONFEMEN. (2009). Son programme d'analyse: PASEC. Available at http://www.confemen.org/spip.php?rubrique3. Retrieved on May 29, 2009.

Delors, J., Mufti, I. A., Amagi, I., Carneiro, R., Chung, F., Geremek, B., Gorham, W., Kornhauser, A., Manley, M., Quero, M. P., Savané, M.-A., Singh, K., Stavenhagen, R., Suhr, M. W., & Nanzhao, Z. (1996). Learning: The treasure within. Report to UNESCO of the International Commission on Education for the Twenty-first Century. Paris: UNESCO.

Elley, W. (1992). How in the world do students read? Hamburg: International Association for the Evaluation of Educational Achievement (IEA).

Greaney, V., & Kellaghan, T. (1996). Monitoring the learning outcomes of education systems. Washington, DC: World Bank.

Griffin, P. (2007). Mathematics achievement of Vietnamese Grade 5 pupils. Asia Pacific Education Review, 8(2), 233–249.

Grisay, A., & Griffin, P. (2006). Chapter 4: What are the main cross-national studies? In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 67–104). Paris: UNESCO-IIEP.


MIOKO SAITO AND FRANK VAN CAPPELLE

Gunderson, M. (2007). How academic research shapes labor and social policy. Journal of Labor Research, 28(4), 573–590.
Hanushek, E. A., & Wößmann, L. (2007). Education quality and economic growth. Washington, DC: The World Bank.
Hiebert, J., Gallimore, R., Garnier, H., Bogard Givvin, K., Hollingsworth, H., Jacobs, J., Miu-Ying Chui, A., Wearne, D., Smith, M., Kersting, N., Manaster, A., Tseng, E., Etterbeek, W., Manaster, C., Gonzales, P., & Stigler, J. (2003). Teaching mathematics in seven countries: Results from the TIMSS 1999 Video Study (NCES 2003-013). Washington, DC: United States Department of Education, National Center for Education Statistics.
Hovland, I. (2005). Successful communication – A toolkit for researchers and civil society organizations. London: Overseas Development Institute.
Hungi, N. (2009). Progress on data cleaning. Unpublished document, SACMEQ, Paris.
International Association for the Evaluation of Educational Achievement. (2005a). Brief history of IEA. Available at http://www.iea.nl/brief_history_of_iea.html. Retrieved on May 23, 2007.
International Association for the Evaluation of Educational Achievement. (2005b). Completed studies. Available at http://www.iea.nl/completed_studies.html. Retrieved on February 24, 2009.
Kellaghan, T. (2006). Chapter 3: What monitoring mechanisms can be used for cross-national (and national) studies? In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 51–66). Paris: UNESCO-IIEP.
Kellaghan, T., & Greaney, V. (2001). Using assessment to improve the quality of education. Fundamentals of educational planning (Vol. 71). Paris: UNESCO-IIEP.
Le, C. L. V. (2006). Reform or renovations? The political economy of education reform in Vietnam since the introduction of . Nagoya: Nagoya University.
Leste, A. (2005). Streaming in Seychelles: From SACMEQ research to policy reform. Paper presented at the SACMEQ Research Conference, SACMEQ, Paris.
Martin, M. O., & Kelly, D. L. (Eds). (1996). TIMSS technical report: Vol. 1. Design and development. Chestnut Hill, MA: Boston College.
Mendelsohn, J. M. (1996). Education planning and management, and the use of graphical information systems. Paris: UNESCO-IIEP.
Milner, G., Chimombo, J., Banda, T., & Mchikoma, C. (2001). The quality of education: Some policy suggestions based on a survey of schools. SACMEQ policy research no. 7: Malawi. Paris: IIEP.
Ministry of Education and Training. (2003). National Education for All (EFA) Action Plan 2003–2015. Hanoi: MOET.
Moyo, G., Murimba, S., Nassor, S. M., Dlamini, E., Nkamba, M., & Chimombo, J. (1993). A Southern Africa proposal for monitoring progress towards attaining the goals of the EFA Jomtien conference concerning the quality of education. Unpublished document.
Mullis, I. V. S., Martin, M. O., Beaton, A. E., Gonzales, E. J., Kelly, D. L., & Smith, T. A. (1997). Mathematics achievement in the primary school years: IEA's Third International Mathematics and Science Study (TIMSS). Chestnut Hill, MA: Boston College.
Murimba, S. (2005). The impact of the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ). Prospects, XXXV(1), 91–108.
Murimba, S. (2006). Chapter 6: What do ministers of education 'really think' about cross-national studies? In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 121–131). Paris: UNESCO-IIEP.

Monitoring the Quality of Education


Nzomo, J., & Makuwa, D. (2006). Chapter 10: How can countries move from cross-national research results to dissemination, and then to policy reform? (Case studies from Kenya and Namibia). In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 213–228). Paris: UNESCO-IIEP.
Organization for Economic Co-operation and Development (OECD). (1989). Schools and quality: An international report. Paris: OECD.
Organization for Economic Co-operation and Development (OECD). (2007). OECD Programme for International Student Assessment. Available at http://www.pisa.oecd.org/pages/0,2987,en_32252351_32235731_1_1_1_1_1,00.html. Retrieved on February 24, 2009.
Peters, R. S. (1969). The meaning of quality in education. In: C. E. Beeby (Ed.), Qualitative aspects of educational planning (pp. 149–167). Paris: UNESCO-IIEP.
Pigozzi, M. J. (2006). Chapter 2: What is the 'quality of education'? (A UNESCO perspective). In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 39–50). Paris: UNESCO-IIEP.
Postlethwaite, T. N. (2004a). Monitoring educational achievement. Fundamentals of educational planning (Vol. 81). Paris: UNESCO-IIEP.
Postlethwaite, T. N. (2004b). What do international assessment studies tell us about the quality of school systems? Background paper prepared for the Education for All Global Monitoring Report 2005, The Quality Imperative. 2005/ED/EFA/MRT/PI/40. Paris: UNESCO.
Postlethwaite, T. N. (2006). Chapter 5: What is a 'good' cross-national study? In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 105–120). Paris: UNESCO-IIEP.
Postlethwaite, T. N., & Kellaghan, T. (2008). National assessments of educational achievement. Education policy series 9. Paris: International Academy of Education and UNESCO-IIEP.
Postlethwaite, T. N., & Leung, F. (2006). Comparing educational achievements. In: M. Bray, B. Adamson & M. Mason (Eds), Comparative education research: Approaches and methods. Hong Kong: Comparative Education Research Centre, The University of Hong Kong.
Reimers, F., & McGinn, N. (1997). Informed dialogue: Using research to shape education policy around the world. Westport, CT: Praeger.
Ross, K. N., Donner-Reichle, C., Jung, I., Wiegelmann, U., Genevois, I. J., & Paviot, L. (2006a). Chapter 15: The 'main' messages arising from the policy forum. In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 279–312). Paris: UNESCO-IIEP.
Ross, K. N., & Mählck, L. (1990). Planning the quality of education: The collection and use of data for informed decision-making. Paris: UNESCO-IIEP.
Ross, K. N., & Makuwa, D. (2009). What is SACMEQ? Paris: UNESCO-IIEP.
Ross, K. N., Paviot, L., & Genevois, I. J. (2006b). Chapter 1: Introduction: The origins and content of the policy forum. In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 25–36). Paris: UNESCO-IIEP.
Ross, K. N., Saito, M., Dolata, S., Ikeda, M., & Zuze, L. (2004). SACMEQ archive. Paris: UNESCO-IIEP.
SACMEQ. (2007). Notes for file: Scientific committee meeting. Unpublished document.


Saito, M. (1999). A generalizable model for educational policy research in developing countries. Journal of International Cooperation in Education, 2(2), 107–117.
Saito, M. (2008). Chapter 7: Issues regarding the quality of education: Importance of measuring the quality of education in the EFA context. In: K. Ogawa, M. Nishimura & Y. Kitamura (Eds), Rethinking international educational development: Towards education for all in developing countries. Tokyo: Toshindo.
Schleicher, A. (2006). Chapter 14: How can international organizations work with the media to manage the results of cross-national studies? In: K. Ross & I. J. Genevois (Eds), Cross-national studies of the quality of education (pp. 265–275). Paris: UNESCO-IIEP.
Socialist Republic of Vietnam. (2003). National education for all (EFA) action plan 2003–2015. Government document no. 872/CP-KG. Hanoi: Ministry of Education and Training.
UNESCO. (1990). World declaration on education for all. Available at http://www.unesco.org/education/efa/ed_for_all/background/jomtien_declaration.shtml. Retrieved on May 25, 2009.
UNESCO. (2000). The Dakar framework for action: Education for all – Meeting our collective commitments. World Education Forum, Dakar, Senegal (April 26–28). Paris: UNESCO.
UNESCO. (2004a). EFA global monitoring report: Education for all 2005 – The quality imperative. Paris: UNESCO.
UNESCO. (2004b). Report of the inter-agency working group on life skills in EFA. Paris: UNESCO.
UNESCO. (2008a). Overview of approaches to understanding, assessing and improving the quality of learning for all: A preliminary desk review. Unpublished document.
UNESCO. (2008b). UNESCO National Education Support Strategy (UNESS) Vietnam. Paris: UNESCO.
UNESCO. (2008c). UNESCO National Education Support Strategy (UNESS) for Malawi 2008–2009. Paris: UNESCO.
UNESCO. (2009). Latin American laboratory for assessment of the quality of education. Available at http://llece.unesco.cl/inc/. Retrieved on February 24, 2009.
UNICEF. (2005). Life skills. Available at http://www.unicef.org/lifeskills. Retrieved on May 25, 2009.
Van Cappelle, F. (2009). StatPlanet user's guide (available at http://www.sacmeq.org/statplanet/). Paris: UNESCO-IIEP.
Vietnam Government. (2005). Education law (Law no. 38/2005/QH11). National Assembly of the Socialist Republic of Vietnam, eleventh legislature, seventh session (from May 5 to June 14, 2005). Hanoi: Vietnam Government.
Vietnam Government & UNDP. (2005). Vietnam: Achieving the millennium development goals. Hanoi: UNDP.
World Bank. (1995). Priorities and strategies for education: A World Bank review. Washington, DC: The World Bank.
World Bank. (1999). Education sector strategy. Washington, DC: The World Bank.
World Bank. (2001). Data archive: Vietnam grade 5 mathematics and reading assessment study. Hanoi: World Bank.
World Bank. (2004). Vietnam reading and mathematics assessment study (Vol. 2). Hanoi: World Bank.


World Bank Independent Evaluation Group. (2006). From schooling access to learning outcomes: An unfinished agenda – An evaluation of World Bank support to primary education. Washington, DC: The World Bank.
World Bank Independent Evaluation Group. (2007). Facts about primary education. Available at http://www.worldbank.org/oed/education/facts_figures.html. Retrieved on May 25, 2009.
World Health Organization (WHO). (1999). Partners in life skills education: Conclusions from a United Nations inter-agency meeting. Geneva: WHO.

APPENDIX. FAST-TRACK INITIATIVE (FTI) COUNTRY ENDORSEMENT SCHEDULE (AS OF FEBRUARY 2009)

Endorsed Countries
2002: Burkina Faso, Guinea, Guyana, Honduras, Mauritania, Nicaragua, Niger
2003: The Gambia, Mozambique, Vietnam, Republic of Yemen
2004: Ethiopia, Ghana, Madagascar, Tajikistan
2005: Kenya, Lesotho, Moldova, Timor-Leste
2006: Albania, Cambodia, Cameroon, Djibouti, Kyrgyz Republic, Mali, Mongolia, Rwanda, Senegal
2007: Benin, Central African Republic, Haiti, Liberia, Sierra Leone
2008: Georgia, Sao Tome & Principe, Zambia

Countries Expected in 2009–2010: Angola, Bangladesh, Bhutan, Burundi, Comoros, Democratic Republic of Congo, Republic of Congo, Eritrea, Guinea-Bissau, Lao PDR, Malawi, Nigeria (3–4 States), Papua New Guinea, Solomon Islands, Tanzania, Togo, Tonga, Uganda, Vanuatu

Other Eligible Countries: Afghanistan, Cote d'Ivoire, India, Indonesia, Kiribati, Myanmar, Nepal, Nigeria (other States), Pakistan, Somalia, Sri Lanka, Sudan, Zimbabwe

Source: World Bank FTI Secretariat (2009) and UNESCO (2008a).

WHY PARTICIPATE? CROSS-NATIONAL ASSESSMENTS AND FOREIGN AID TO EDUCATION

Rie Kijima

ABSTRACT

Participation in cross-national assessment is becoming a global phenomenon. While there were only 43 countries that participated in the Programme for International Student Assessment (PISA) in 2000, the number of participating countries/economies has increased to 65 in 2009. To understand this global trend, this chapter seeks to answer the following research questions: What are the real incentives for developing countries to participate in cross-national assessments? What do they gain from actual participation in cross-national assessments, given that there are many constraints and barriers associated with test participation? It employs country-level fixed effects to test the hypothesis that there is a positive association between participation in cross-national assessments and foreign aid to education. This study shows that countries that participate in major cross-national assessments receive, on average, 37 percent more foreign aid to education than countries that do not participate in major cross-national assessments, while holding all other variables constant. Although further research is necessary to make a causal warrant of the association between participation in cross-national assessment and education aid, the results of this study have great implications for developing countries that are considering participating in cross-national assessments.

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 35–61
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013005

INTRODUCTION

Standardized test results are used around the world as a viable method for measuring student performance in a systematic way. Education practitioners and state authorities have a vested interest in knowing how their young citizens perform in basic literacy skills relative to other children of similar age, both within a particular country and between other nations. To measure students' proficiency in grade- and age-level-appropriate subjects, tests or assessments are frequently employed to audit student learning performance. Although there exists a wide array of literature that compares student performance among countries, there are virtually no studies that look at why countries participate in cross-national assessments. This chapter focuses on the incentives for developing countries to participate in cross-national assessments. The central argument of this chapter is that, while holding all other variables constant, countries that participate in cross-national assessments are more likely to benefit from an increase in foreign aid to education. Standardized assessments are commonly institutionalized at the national level, where youths are compared against other youths of similar age within the boundary of a nation-state. Over the past decade, however, there has been an increase in the number of countries that assess student performance against youths of similar age from other countries. These cross-national assessments began in the late 1950s as a way to measure how students from various countries perform against each other.1 This endeavor, which initially started off with only 12 country participants, has grown to include a greater number of participating countries and different types of assessments ranging from mathematics and science to civics education.
Today, there are at least three major international assessments and three regional cross-national assessments that are regularly administered.2 While there were only 45 countries/economies that participated in the Trends in International Mathematics and Science Study (TIMSS) in 1995, the number of participating countries has increased to 68 countries/economies for the 2011 assessment. For the Programme for International Student Assessment (PISA), only 43 countries/economies participated in 2000, but this figure has increased to 65 in 2009. Also, implementation agencies involved in the


administration, implementation, and analysis of test results have grown significantly over the past two decades. The first sets of cross-national assessments were conducted under the auspices of the International Association for the Evaluation of Educational Achievement (IEA). More recently, other actors have become involved in the administration of cross-national assessments, such as the Organisation for Economic Co-operation and Development (OECD) (for PISA) and the United Nations Educational, Scientific, and Cultural Organization (UNESCO) (e.g., regional assessments such as the Southern and Eastern African Consortium for Monitoring Educational Quality (SACMEQ)). This reflects the increase in both the supply of and demand for cross-national assessment worldwide. Among these participating countries, which range from Kuwait to Indonesia, there has been a significant rise in the number of developing countries. Reasons for test participation among industrialized nations are easier to identify than those of less industrialized nations, as industrialized nations need to fulfill their obligations as members of organizations, such as the OECD, which is comprised of advanced economies.3 However, reasons for participation in cross-national examinations among developing countries are harder to discern, in part because of the specific challenges these countries face when administering these exams. There are several barriers to participation for developing countries. International assessments cater to industrialized nations, as test items do not necessarily take into account the cultural, linguistic, and ethnic diversity represented in developing countries (Ercikan, 1998, 2002; Ercikan & Koh, 2005; Hambleton & Kanjee, 1993; Hambleton & Patsula, 2000; Jakwerth et al., 1997). Cross-national assessments are expensive to administer, which poses great financial challenges in countries where resource allocation in education is limited (Greaney & Kellaghan, 2008).
The cost associated with the TIMSS Grade 8 assessment is approximately 40,000 USD, excluding analysis and dissemination of results (Greaney & Kellaghan, 2008, p. 75). Moreover, test visibility is high since results are disseminated internationally. For developing countries, ranking toward the bottom of the list of all participating countries may have negative domestic or international policy repercussions. Yet despite all of these challenges, developing nations continue to participate at increased rates in these cross-national exams. If international assessments have historically been an undertaking of the advanced economies of the world, why has there been an increase in the number of developing countries that participate in cross-national assessments? This puzzle gives rise to the main research questions of this study: What are the real incentives for developing countries to participate in


cross-national assessments? What do they gain from actual participation in cross-national assessments, given that there are many constraints and barriers associated with test participation? These questions are central in understanding the underlying motivation behind developing countries’ willingness to have their students measured against students from more economically advanced countries. This study takes the first step in answering the larger question of why countries participate in cross-national assessments. The relationship between developing countries and the donor community is critical in understanding the incentives that lie behind developing countries’ participation in cross-national assessments. It is important to note that most cross-national assessments are organized, promoted, and administered by members of the donor community. More specifically, multilateral agencies play a prominent role in facilitating information exchanges about the different states of education around the globe via student assessments. In return, developing countries use cross-national assessments as a way to increase their visibility in the international community. In short, the underlying assumption of this study is that testing is a mechanism that yields benefits for both the developing world and the donor community in advancing their respective education policy agendas. The central argument of this study is that developing countries have strong incentives associated with participating in cross-national assessments. An empirical test for this claim is to measure whether there are any changes in the flow of aid from the donor community to developing countries that participate in cross-national assessment. I employ country-level fixed effects analysis to test the hypothesis of whether test participation is positively associated with foreign aid to education. 
Results from my analysis indicate that there is a statistically significant relationship between test participation and education aid. The outline of the study is as follows: the second section introduces the conceptual framework of the study; the third section outlines data sources and model specifications; and the fourth section reports main findings. The study concludes with a summary of main findings and the identification of areas for further research.
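The country-level fixed-effects strategy described above can be sketched on synthetic data. This is not the author's actual model or data: the sample sizes, variable names, and data-generating process below are invented for illustration. The sketch shows why the within (country-demeaning) transformation matters: it sweeps out time-invariant country characteristics that would otherwise confound a pooled regression of (log) education aid on test participation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_years = 50, 10
n = n_countries * n_years
country = np.repeat(np.arange(n_countries), n_years)

# Unobserved, time-invariant country traits (income level, governance, ...).
alpha = rng.normal(0.0, 1.0, n_countries)[country]

# Participation is correlated with alpha, so pooled OLS is confounded.
participate = (rng.random(n) < 0.3 + 0.2 * (alpha > 0)).astype(float)

beta_true = 0.31  # illustrative log-point effect of participation on aid
log_aid = 2.0 + alpha + beta_true * participate + rng.normal(0.0, 0.2, n)

def within(x, g):
    """Subtract each group's mean (the fixed-effects transformation)."""
    means = np.bincount(g, weights=x) / np.bincount(g)
    return x - means[g]

# Pooled OLS (biased upward in this setup) vs. country fixed effects.
xp, yp = participate - participate.mean(), log_aid - log_aid.mean()
beta_pooled = (xp @ yp) / (xp @ xp)

xw, yw = within(participate, country), within(log_aid, country)
beta_fe = (xw @ yw) / (xw @ xw)

print(f"pooled OLS: {beta_pooled:.2f}  fixed effects: {beta_fe:.2f} "
      f"(~{100 * (np.exp(beta_fe) - 1):.0f}% more aid)")
```

In this simulation the pooled estimate absorbs the correlation between participation and the fixed country traits, while the within estimate lands near the true log-point effect; a percent interpretation like the chapter's "37 percent" comes from exponentiating the coefficient on the binary participation variable.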

CONCEPTUAL FRAMEWORK

This study seeks to answer why developing countries participate in cross-national assessments. I present two theoretical frameworks that help explain why developing countries participate in


cross-national assessments. I first introduce world society theory, which derives from the sociological strand of neo-institutional theory. Then, I present a rationalist approach that best represents how I analyze this subject, and that will act as the conceptual framework through which I will explore the main research questions of this study.

World Society Theory

World society theory provides a useful framework to help explain why there has been an exponential growth in the number of countries participating in cross-national assessment around the world. This theory explains the development of world culture in two complementary ways. First, global models determine and legitimize local actions and decisions (Meyer, Boli, Thomas, & Ramirez, 1997). Second, nation-states that are immersed in world culture are more likely to adopt similar patterns of modernization, thus giving rise to worldwide assimilation and isomorphism (Meyer et al., 1997; Meyer & Ramirez, 2000; Ramirez & Boli, 1987). Taken together, world society theory asserts that nation-states adopt global models because of their legitimizing forces. Furthermore, nation-states have a strong desire to maintain their sovereignty while being conscientious about their status in the international community. A world society theorist would then argue that countries participate in cross-national assessment because an increasingly greater number of countries are buying into the legitimizing effect of assessments. While cross-national assessments do not comprehensively assess the overall performance of an education system, they are used for competition between countries (Baker & LeTendre, 2005). Today, assessments are widely considered a necessary tool in national education policymaking, one that helps nation-states make better education policy decisions. These ideological forces behind the importance of cross-national assessments have compelled countries to accept, adopt, and participate in them (Kamens & McNeely, 2009). In explaining how countries begin adapting to and internalizing global norms, I draw more broadly from constructivism in the field of international relations. Constructivists seek to understand the social processes that give rise to regulatory norms and common knowledge (Katzenstein, Keohane, & Krasner, 1998). Their central argument is that multilateral agencies are active proponents of change, and they play key roles in balancing world order in international politics of development (Finnemore & Sikkink, 1998;
Constructivists seek to understand the social processes that give rise to regulatory norms and common knowledge (Katzenstein, Keohane, & Krasner, 1998). Their central argument is that multilateral agencies are active proponents of change, and they play key roles in balancing world order in international politics of development (Finnemore & Sikkink, 1998;


Ruggie, 1998). Furthermore, they argue that norms are born from repeated negotiations between organizational platforms, cascaded to lower levels by means of legitimization, and finally internalized by organizational platforms (Finnemore & Sikkink, 1998). In short, international organizations are purveyors of ideologies that give rise to normative beliefs. Scholars of world society theory have adopted this line of argument set forth by constructivists and applied it to the field of international comparative education. In particular, they argue that transnational actors are the mechanisms by which countries adapt and join to construct world culture (Chabbott, 1998, 2003; McNeely, 1995). Transnational actors are agencies that facilitate the exchange of information and oftentimes serve as advocates of concepts and ideas for global diffusion (Chabbott, 2003). For instance, the World Bank, an international organization that provides loans and grants for education support, has influenced developing countries to adopt neoliberal policies in education, such as choice and competition, privatization, and decentralization (Ball, 1998; Colclough, 1991; Mundy, 2002). UNESCO, on the other hand, takes an active role in advocacy for children's rights, culture, education, and science, and it has been a leader in promoting human rights around the globe (Jones, 1999; Mundy, 1998). These multilateral agencies are equipped with mandates, and they promote norms associated with education development policies around the world. More specifically, Benveniste (2002) argues that international organizations, such as UNESCO, the World Bank, and the OECD, have exerted great influence over developing countries regarding the need for cross-national assessments. If one believes that these multilateral agencies are active agents in the diffusion of global norms, then they are vital in understanding how international assessments have been promoted over recent years.
Nonetheless, findings from this study show that multilateral agencies are not merely facilitating the flow of information; they resist and respond actively to their constituents, as witnessed in the amount of foreign aid they provide to developing countries. Moreover, while neo-institutional theory offers insights about the ideological reasons why countries participate in cross-national assessments, it fails to identify the specific motivations that drive countries to join in this international effort. If global embeddedness is the prevailing reason for countries to participate in cross-national assessments, then one would have to assume that countries put a great deal of faith in the positive, legitimizing effects of testing when buying into it. Moreover, there are many countries that rank low on indicators of "global embeddedness," such as Iran4 and Oman,5 that have recently joined the


international league of cross-national assessments. How does world society theory explain the participation of these countries in educational assessments? In short, world society theory does not fully describe the motivation behind developing countries' increased interest in participating in cross-national assessments. To better understand the strategic power relations between recipient countries and donor agencies, I draw from a rational-actor framework.

Incentives

The field of international relations offers some useful theoretical lenses for understanding the motivation and incentives behind a country's decision to partake in international endeavors. Rationalists frame actions and behaviors in terms of power, interests, and institutional rules, in contrast to sociological neo-institutionalists, who focus on norms, culture, identity, knowledge, and ideologies (Fearon & Wendt, 2002; Katzenstein et al., 1998). If we believe that a country's actions are driven by incentives and strategic decision making, then the rationalist framework allows us to understand the motivation and incentives behind countries' decisions to participate in cross-national assessments. The rational-actor framework is useful in my study of cross-national assessments for two reasons. First, one is able to delineate and identify tangible reasons that could influence each nation-state in determining its policy decisions. According to this framework, countries face specific incentives that influence their decisions to participate or not to participate in cross-national assessments. These rewards are tangible and easy to identify, and they can therefore be tested empirically. Second, unlike neo-institutional theory, where norms are born and cultivated within the global culture and cascaded down to developing countries, rationalists depict developing countries as principal agents. This paradigm provides the basis for my study, which investigates the motivation, incentives, and rewards associated with a country's decision to participate in this international endeavor. Little extant literature in the field of international and comparative education investigates how countries respond to real incentives associated with test taking. Nonetheless, the existing literature points to two types of incentives.
When countries participate in cross-national assessments, the State benefits from being able to (i) guide or change the course of education policy directions that are in line with their larger political agendas and (ii) receive financial support to revamp their education system,


including technical assistance and budgetary aid. The next section describes these incentives.

Political Incentives

States employ assessment as leverage to advance their own political agendas. In his study of three Southern Cone countries, Benveniste (2002) contends that countries use testing as a tool to legitimize education policy decisions determined by the state. His main argument is that countries administer examinations not only for evaluative purposes, but to advance political agendas driven by various political powers within the State apparatus (Benveniste, 1999, 2002). His depiction of testing is one that is based on a struggle for power within the boundary of a nation-state. The Chilean military regime decisively implemented student assessment to signal to the public its support for school choice (Benveniste, 1999, p. 108). In Argentina, national assessment was administered to justify the national policy toward greater autonomy and decentralization, but ultimately was used as a tool to provide strong oversight by the government (Benveniste, 1999, p. 181). In Uruguay, examination symbolized the State's effort to promote social accountability when it needed to intervene in communities for greater state control and monitoring (Benveniste, 1999, pp. 241–242). The case studies of Chile, Uruguay, and Argentina show that countries' decisions to conduct assessments were largely driven by political agendas. An example from Japan further extends Benveniste's theory. While Japan is not a developing country with a weak state, it is a persuasive example that shows how the State utilizes assessments to achieve its own political objectives. Immediately after the release of the TIMSS and PISA 2003 test results, the Japanese Minister of Education, Nakayama, announced an increase in classroom hours, class days, and a resumption of Saturday classes.
Takayama (2007) argues that this apparent sense of ‘‘crisis’’ in Japanese classrooms was ‘‘constructed through the reductive interpretation and selective appropriation of international league tables, namely, PISA and TIMSS’’ (p. 424). The Minister of Education strategically used the results of international examinations to gain political support. His aim was to create momentum for a highly contested, large-scale policy reform. He also wanted to gain political support to overhaul an existing policy agenda that still had avid supporters from the neoliberal strands of scholars and critics. This example illustrates the politicized nature of testing and its use by the State apparatus to achieve certain political objectives. The cases from Chile,

Why Participate? Cross-National Assessments and Foreign Aid to Education

43

Uruguay, Argentina, and Japan are compelling in showing that State interests play a critical role in the determination to administer assessments and the usage of cross-national assessments.

Financial Incentives

Another persuasive argument for why countries decide to participate or not participate in cross-national assessment is financial. The importance of measuring student learning outcomes is one of the key factors that serve the interests of the state. Evidence shows that better capacity for measuring student performance improves transparency and informs policymakers so that they can better allocate funds to address challenges. Consequently, this improves the overall quality, efficacy, and efficiency of education delivery, raising the quality of human capital, which eventually contributes to the overall economic and social development of a nation-state. To strengthen a country's core capacity to measure learning performance, various incentives are instituted. As an illustration, the Race to the Top initiative of the US Department of Education seeks to level out the current state of education by providing competitive grants so that states can revamp their assessment systems.6 The Bush administration's No Child Left Behind policy also emphasized the importance of developing and administering reliable assessments and issued discretionary funds so that states could improve their capacity for measurement. These initiatives are examples of how States build better assessment structures to strengthen monitoring capacity and improve the overall education system.

If countries are incentivized by financial rewards, then they are more likely to participate in cross-national assessments. Furthermore, if test participation leads to larger flows of aid over a longer period of time, then the initial financial costs of participating in cross-national assessment will reap bigger rewards. It is important to clarify that the flow of aid reported by the OECD includes monies transacted by project. This includes technical assistance, direct budgetary support, costs associated with knowledge transfer (e.g., conferences, training), and any other budgets that are costed out for each project signed between the recipient country and the donor agency. It also incorporates technical support that a country receives from the donor community to participate in cross-national assessments. Although the detailed budget for each project is difficult to obtain for all countries, these types of data illuminate the mechanisms by which developing countries are incentivized to participate in cross-national assessments.


RIE KIJIMA

The existing literature in the field of international and comparative education is limited in its analysis of the financial incentives associated with participation in multinational platforms. The closest example to international assessment is trade in educational services. Mundy and Iga (2003) investigate the potentially lucrative arrangements benefitting developed countries that participate in the World Trade Organization (WTO) General Agreement on Trade in Services (GATS) negotiations over trade in educational services. The WTO, an international organization facilitating trade policies, serves to mediate agreements and policies for the benefit of developing and developed countries. Mundy and Iga found that industrialized countries gain more from multilateral agreements than do developing nations. While the incentives for countries participating in the WTO differ significantly from those for countries that participate in cross-national assessments, the framing of their study offers insights into the financial incentives associated with participation in multinational platforms.

The research question I investigate in this study is framed around a rationalist argument that contends countries are driven by incentives when deciding whether or not to participate in cross-national assessments. There are two "real" incentives that drive countries to participate in cross-national assessments: political and financial. In the next section, I provide evidence that countries respond to financial incentives, even after controlling for political indicators. This suggests that, whatever other reasons compel countries to participate in international tests, countries do respond to real incentives. The next section presents the data and the specification used to estimate the results.

DATA AND MODEL SPECIFICATIONS

Variables and Data

The dependent variable in my model is official development assistance to the education sector (interchangeably referred to as education aid). The data cover all countries that are recipients and potential recipients of foreign aid to education between 1994 and 2006 (a total of 193 sovereign states). This information was derived from the OECD CRS database (OECD, 2008). Aid is calculated per youth capita, to take into account that education aid is allocated to benefit the youth population (ages 0-24). It is then transformed as the natural log of (1 + education aid per youth) to attenuate the greatly varying magnitude of aid flows to small versus large economies.
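Concretely, the construction of the dependent variable can be sketched as follows. The column names and figures below are illustrative inventions, not values from the OECD CRS data:

```python
import numpy as np
import pandas as pd

# Hypothetical country-year rows; names and amounts are invented for illustration.
panel = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2000, 2001, 2000, 2001],
    "edu_aid_usd": [5_000_000.0, 6_000_000.0, 0.0, 250_000.0],
    "youth_pop": [1_000_000, 1_010_000, 200_000, 202_000],  # ages 0-24
})

# Aid per youth capita, then ln(1 + x): zero-aid country-years stay defined,
# and the wide spread between small and large recipients is compressed.
panel["aid_per_youth"] = panel["edu_aid_usd"] / panel["youth_pop"]
panel["ln_aid_per_youth"] = np.log1p(panel["aid_per_youth"])
```

The `log1p` form is what keeps country-years that received no aid in the sample rather than dropping them as undefined logs.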


I also disaggregate the dependent variable by donor status to test the assumption set forth in the literature that multilateral agencies are key stakeholders in the diffusion, provision, and funding of cross-national assessments. The independent variable of interest in the model is a binary variable indicating whether countries that are recipients or potential recipients of foreign aid participated in testing. The following cross-national assessments were included: the Trends in International Mathematics and Science Study (TIMSS) 1995, 1999, and 2003; the Programme for International Student Assessment (PISA) 2000, 2003, and 2006; and the Progress in International Reading Literacy Study (PIRLS) 2001 and 2006.

The control variables in the model are: each country's gross domestic product (GDP) per capita (measured in purchasing power parity (PPP) in USD, 1994-2006), public expenditure on education relative to GDP, gross secondary enrollment, income distribution (Gini coefficient), polity scores, youth population, and membership in inter-governmental organizations (IGOs). The covariates control for differences between countries to address the issue of omitted variable bias. GDP per capita (PPP) was derived from the International Monetary Fund's online World Economic Outlook database and accounts for the varying levels of economic development among countries. Public expenditure on education relative to GDP and gross secondary enrollment were obtained from the World Bank education statistics portal, EdStats; education expenditure was selected because foreign aid to education is used to support national expenditure on education. The Gini coefficient is taken from the World Development Indicators, 2008, and controls for the varying degrees of economic and social inequality among countries. The Gini index ranges between 0 and 100, where 0 means evenly distributed income and 100 indicates total income inequality.
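For reference, a Gini index on this 0-100 scale can be computed from an income vector. This is a minimal sketch of the standard rank-based formula, not the estimator used by the World Development Indicators:

```python
import numpy as np

def gini(incomes):
    """Gini index on a 0-100 scale: 0 = income evenly distributed,
    100 = total inequality in the limit of a large population."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Rank-based formula: G = 2*sum(i * x_i) / (n * sum(x)) - (n + 1)/n
    return 100.0 * (2.0 * (ranks * x).sum() / (n * x.sum()) - (n + 1.0) / n)
```

A perfectly equal four-person economy, `gini([1, 1, 1, 1])`, returns 0.0; concentrating all income in one person pushes the index toward 100.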
Polity scores from the Polity IV Project are indices that measure the level of political authority/democracy in a country, ranging between -10 (autocratic) and +10 (democratic).7 Polity scores control for the varying levels of democracy versus autocracy within a country and also take into account changes in the political structure. Youth population was derived from the World Bank EdStats portal. These data were included to take into account the variance in the youth population among education aid recipients. Since the OECD's ODA to the education sector includes all levels of education (e.g., early childhood education to higher education and some nonformal education), the youth population is defined as individuals between ages 0 and 24. Data on membership in IGOs come from the Correlates of War (COW) database.8 This indicator measures the degree of global embeddedness. Missing data were imputed using Amelia II.9

I test the following hypothesis:

H1. There is a positive association between participation in cross-national assessments and foreign aid to education.

I run three models to test my hypothesis. The first model shows the association between the log of education aid per youth and first-time participation in one of three major cross-national assessments:

ln(eduaid per youth)_it = α + β·first_t_it + γ·X_it + λ_i + ε_it    (1)

where i is an index for countries and t for years; first_t is a binary variable coded as 1 in the year t in which country i participated in a major cross-national assessment (PISA, TIMSS, or PIRLS) for the first time and in all subsequent years; X is a vector of time-varying covariates; and λ is a vector of country fixed effects.

The second model shows the patterns of education aid flow for the years immediately before and after test administration. I follow the "event-time specification" described in Kuziemko and Werker (2006):

ln(eduaid per youth)_it = α + β0·T0_it + β1·T-3_it + β2·T-2_it + β3·T-1_it + β4·T+1_it + β5·T+2_it + β6·T+3_it + γ·X_it + λ_i + ε_it    (2)

where T-3 is a binary variable for three years prior to test participation, T-2 for two years prior, and T-1 for one year prior. T0 is the year of participation. T+1 is the year after test participation, when the test results are revealed. T+2 and T+3 are included to see whether aid flow persists over time.

Theories that inform the conceptual framework of this study suggest that various stakeholders in international development and education have different mandates and agendas. To test this assumption, the third model disaggregates the dependent variable by donor status (multilateral or bilateral):

ln(eduaid per youth from multilateral)_it = α + β·first_t_it + γ·X_it + λ_i + ε_it    (3a)

ln(eduaid per youth from bilateral)_it = α + β·first_t_it + γ·X_it + λ_i + ε_it    (3b)
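As a sketch of how a specification like model (1) can be estimated, the code below builds a first_t-style indicator on synthetic data and applies the within (country-demeaning) fixed-effects estimator. All names, participation years, and the true effect size are invented for illustration; this is not the chapter's actual estimation code:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
countries = [f"c{i}" for i in range(30)]
years = list(range(1994, 2007))
df = pd.DataFrame([(c, t) for c in countries for t in years],
                  columns=["country", "year"])

# first_t: 1 from a country's first assessment year onward, 0 otherwise
# (9999 marks countries that never participate).
first_year = {c: int(rng.choice([1995, 1999, 2003, 9999])) for c in countries}
df["first_t"] = (df["year"] >= df["country"].map(first_year)).astype(float)

# Simulated outcome: country fixed effect + true participation effect of 0.4 + noise.
fe = {c: rng.normal() for c in countries}
df["ln_aid"] = df["country"].map(fe) + 0.4 * df["first_t"] + rng.normal(0, 0.1, len(df))

# Within estimator: demean outcome and regressor by country, then OLS slope.
y = df["ln_aid"] - df.groupby("country")["ln_aid"].transform("mean")
x = df["first_t"] - df.groupby("country")["first_t"].transform("mean")
beta = float((x * y).sum() / (x * x).sum())  # should land near the true 0.4
```

Demeaning by country is algebraically equivalent to including a full set of country dummies, which is what the λ_i term in the model stands for.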


FINDINGS

Table 1 presents the results using model specification (1), where the dependent variable is the log of education aid per youth and the independent variable is first-time participation in any one of three major international assessments (PISA, TIMSS, or PIRLS). Version 1 of the model runs a bivariate regression using naïve estimates. Version 2 runs the same analysis using fixed effects. Version 3 controls for subsequent-year test participation to delineate the possible difference from first-time participation; the point estimate on subsequent participation is positive but not statistically different from zero. Versions 4-7 include the covariates. Version 7 is the preferred model. Results show that countries that participate in cross-national assessment receive more foreign aid after test participation than countries that do not participate. More specifically, the relative difference between countries that participate for the first time and countries that do not participate is associated, on average, with a 36.8 percent difference in foreign aid to education per youth, holding GDP per capita, polity score, gross secondary enrollment, Gini coefficient, education expenditure relative to GDP, youth population (ages 0-24), and membership in IGOs constant. The R2 is low (less than 10 percent) because this study exploits only the relationship between foreign aid to education and test participation.10

The impact of test participation is greatest the first time a country participates in cross-national assessments. There are some plausible explanations why this might be the case. First, whenever a country pledges to take part in a cross-national assessment for the first time, it sends a strong message to the international community of its serious intention to engage in an international endeavor.
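A note on interpretation (an editorial aside, not from the chapter): reading the 0.368 log-point estimate directly as 36.8 percent uses the first-order approximation ln(1 + p) ≈ p. The exact percent difference implied by a log-linear coefficient is 100·(exp(β) − 1), which diverges noticeably at this magnitude:

```python
import math

beta = 0.368                             # log-point coefficient, Table 1, Version 7
approx_pct = 100 * beta                  # reading used in the text: 36.8 percent
exact_pct = 100 * (math.exp(beta) - 1)   # exact implied difference: roughly 44.5 percent
```

For small coefficients (say, below 0.1) the two readings are nearly identical; the gap widens as the coefficient grows.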
One of the effects of test participation is an increased level of donor commitment and support in providing technical assistance to participating countries to ensure success during the first round of test taking. Second, countries that participate once are more likely to continue to participate in subsequent rounds of assessments. To ensure that countries are off to a good start, the donor community is more likely to provide strong support during the first round of participation in any one of the three major cross-national assessments surveyed in this analysis.

Table 2 presents findings using lag and lead years in model specification (2). Version 1 includes all lag and lead years. Versions 2-4 include the covariates. The preferred version is 4, which is consistent with model specification (1) where the covariates were included. Results indicate that aid flow continues for three years after the assessment is administered, while

Table 1. Foreign Aid to Education and Test Participation.

[Regression estimates; dependent variable: ln(education aid per youth). Version 1: bivariate naïve estimate; Versions 2-3: country fixed effects, with Version 3 adding subsequent participation; Versions 4-7: fixed effects plus covariates (GDP per capita PPP, polity score, gross secondary enrollment, education expenditure/GDP, Gini coefficient, youth population, membership in IGOs). In the preferred Version 7, the coefficient on first-time participation is 0.368 (s.e. 0.111). N = 2,408 observations, 193 countries; R2 = 0.094. Significance levels: p < 0.05, p < 0.01, p < 0.001.]


Table 2. Foreign Aid to Education and Test Participation by Lag and Lead Years.

[Regression estimates; dependent variable: ln(education aid per youth); all versions include country fixed effects. Version 1 includes the lag/lead indicators only; Versions 2-4 add covariates. Preferred Version 4 coefficients (s.e.): three years prior, 0.237 (0.076); two years prior, 0.083 (0.077); one year prior, 0.091 (0.073); year of participation, 0.085 (0.074); one year after, 0.157 (0.072); two years after, 0.283 (0.076); three years after, 0.231 (0.075). N = 2,408 observations, 193 countries; R2 = 0.106. Significance levels: p < 0.05, p < 0.01, p < 0.001.]


holding all other variables constant. A plausible explanation for an increase in education aid prior to test participation is anticipation effects. Countries begin making the necessary arrangements for implementation several years prior to test taking, which could result in additional budgetary support for all activities associated with the preparation and administration of assessments. The fact that the three-years-prior indicator is the only coefficient that is statistically significant before the year of participation suggests that when countries signal their commitment to participate (usually three to four years prior to test taking), donor agencies are triggered to allocate more aid to education.

The flow of education aid is largest one, two, and three years after countries conduct the assessments. One year after test participation, countries that participated in a cross-national assessment received approximately 15.7 percent more foreign aid to education per youth relative to countries that did not participate, holding all other variables constant. Two years after test participation, the difference increases to 28.3 percent, statistically significant at the 1 percent level. The figure decreases slightly to 23.1 percent three years after test participation, with significance remaining at the 5 percent level. This suggests that the association between test participation and foreign aid transactions is largest immediately following the year of test taking. The finding corresponds with the timeline of events: test results are released a year after test participation (for PISA 2009, results are released in late 2010), and donors pledge more aid after the dissemination of results, which is reflected in actual disbursements of foreign aid to education approximately two to three years after test administration. This is also consistent with general patterns of aid giving: foreign aid is committed immediately following test participation, and aid agencies disburse a disproportionately large amount of aid in the initial phase of the implementation period, which then tapers off toward the end of the project so that countries can achieve fiscal sustainability under their national budgets.

Table 3a breaks down the analysis by multilateral and bilateral agencies (model specifications (3a) and (3b)). The independent variable is participation in all three cross-national assessments in the years that they were administered. Essentially, three pairs of versions are presented: Versions 1 and 2 use a bivariate naïve estimator, Versions 3 and 4 use fixed effects without the covariates, and Versions 5 and 6 use fixed effects with the covariates. Columns 5 and 6 provide us with the most interesting
The finding corresponds with the timeline of events, as the test results are released a year after test participation (for PISA 2009, test results are released in late 2010), and donors pledge to provide more aid after dissemination of results, which is reflected in the actual disbursement of foreign aid to education approximately two to three years after test administration. This is also consistent with the general patterns of aid giving. Foreign aid is committed immediately following test participation, and aid agencies disperse a disproportionately large amount of aid at the initial phase of the implementation period. Nonetheless, it tapers off toward the end of the project so that countries can achieve fiscal sustainability under their national budget. Table 3a breaks down the analysis by multilateral and bilateral agencies (model specifications (3a) and (3b)). The dependent variable is participation in all three cross-national assessments in the years that they were administered. Essentially, three versions of the model are presented; the first two versions (versions 1 and 2) use a bivariate naı¨ ve estimator, the second two versions (versions 3 and 4) use fixed effects without the covariates, and the third two versions use fixed effects with the covariates (versions 5 and 6). Columns 5 and 6 provide us with the most interesting

Table 3a. Foreign Aid to Education and Participation in All Assessments.

[Regression estimates; dependent variable: ln(education aid per youth) from multilateral and bilateral agencies. Versions 1-2: bivariate naïve estimates; Versions 3-4: country fixed effects; Versions 5-6: fixed effects plus covariates. With fixed effects and covariates, the coefficient on participation in all tests is 0.135 (s.e. 0.073) for multilateral aid (Version 5) and 0.134 (s.e. 0.057) for bilateral aid (Version 6). N = 2,408 observations, 193 countries. Significance levels: p < 0.05, p < 0.01, p < 0.001.]


results. Overall, the coefficients for bilateral agencies using fixed effects are statistically significant (versions 4 and 6) and are considerably larger than the corresponding coefficients for multilateral agencies (versions 1, 3, and 5). On average, bilateral agencies provide approximately 13.4 percent more foreign aid to education per youth to countries that participate in cross-national assessments relative to countries that do not participate at all, holding all control variables constant (version 6). The magnitude of aid giving to test participants relative to nonparticipants is thus significantly larger from bilateral than from multilateral agencies.

To make the estimates correspond to model specifications (3a) and (3b) as written, I run the model using first-time participation as the independent variable (Table 3b). The analysis changes slightly, and differences in aid giving between multilateral and bilateral agencies emerge. In all of the versions employing fixed effects, the point estimate for education aid per youth from bilateral agencies is statistically significant at the 1 percent level, in contrast to aid flow from multilateral agencies. More specifically, countries that participate in cross-national assessments for the first time receive from bilateral agencies, on average, 38.9 percent more foreign aid to education per youth capita than countries that do not participate, holding all control variables constant. The differences in aid flow from multilateral agencies to test participants and nonparticipants are not statistically significant, meaning there is no difference in the amount of aid that participants and nonparticipants receive over time.

Table 4 reports findings that add precision to the estimates by adding lag and lead years. Results show that education aid from multilateral agencies to participating countries is statistically significant two years after test taking. In contrast to multilateral agencies, bilateral agencies respond much more consistently. Countries that participated in cross-national assessment received, on average, 19.5 percent more foreign aid to education per youth in the year the assessment was conducted, in comparison to countries that did not participate. One year after test participation, there was a difference of 20.8 percent in education aid per youth between participants and nonparticipants, holding all other variables constant. Two years after test taking, this figure increases to 21.6 percent, and it rises to 23.7 percent three years after the assessments were conducted. Multilateral agencies reward countries when they commit to test participation and after test results are revealed, whereas bilateral agencies provide continuous financial support prior to test participation and in the years immediately following participation in cross-national assessments.
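The lag-and-lead indicators behind these event-time estimates can be built from each country's participation year. The sketch below uses assumed column names (`part_year` marks the assessment year, NaN for never-participants); it illustrates the construction, not the chapter's actual code:

```python
import pandas as pd

# Illustrative panel: country A participates in 2001; country B never participates.
df = pd.DataFrame({
    "country": ["A"] * 7 + ["B"] * 7,
    "year": list(range(1998, 2005)) * 2,
    "part_year": [2001] * 7 + [float("nan")] * 7,
})

# Event time: years relative to the assessment (negative = before, 0 = test year).
df["event_time"] = df["year"] - df["part_year"]

# One dummy per offset from T-3 through T+3; comparisons against NaN are False,
# so never-participant rows get zeros everywhere.
for k in range(-3, 4):
    label = f"T{'m' if k < 0 else ''}{abs(k)}"  # Tm3 ... Tm1, T0, T1 ... T3
    df[label] = (df["event_time"] == k).astype(int)
```

Each dummy then enters the regression with its own coefficient, tracing aid flows in the years around participation exactly as in specification (2).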

Table 3b. Foreign Aid to Education and Participation in First Major Assessment.

[Regression estimates; dependent variable: ln(education aid per youth) from multilateral and bilateral agencies; regressors are first-time and subsequent participation. With country fixed effects and covariates, the coefficient on first-time participation is 0.196 (s.e. 0.117) for multilateral aid (Version 7, not statistically significant) and 0.389 (s.e. 0.091) for bilateral aid (Version 8, statistically significant). N = 2,408 observations, 193 countries. Significance levels: p < 0.05, p < 0.01, p < 0.001.]


Table 4. Foreign Aid to Education and Test Participation by Lag and Lead Years by Donor Status.

[Regression estimates; dependent variable: ln(education aid per youth) from multilateral (Versions 1-2) and bilateral (Versions 3-4) agencies, with lag and lead indicators; Versions 2 and 4 add covariates. In the preferred bilateral Version 4, the coefficients are 0.195 in the year of participation, 0.208 one year after, 0.216 two years after, and 0.237 three years after (all statistically significant). For multilateral aid, only the coefficient two years after test taking is statistically significant. N = 2,408 observations, 193 countries. Significance levels: p < 0.05, p < 0.01, p < 0.001.]

I provide some explanations for why we observe a difference in the pattern of aid giving between multilateral and bilateral agencies. Scholars of international relations have argued that bilateral and multilateral agencies exhibit different motivations behind aid giving (Lumsdaine, 1993; Maizels & Nissanke, 1984). It is noteworthy that the countries giving bilateral aid are also members of the OECD, a league of rich nations that mandates that its members take part in PISA every three years. One reason why bilateral agencies are more consistent in their aid giving is that these OECD member states (also key bilateral donors) are strong proponents of international assessments and want more countries to participate in them. For this reason, they provide strong support for developing nations that take part in cross-national assessments. This perspective draws on neo-institutional theory: a global culture has been constructed around cross-national assessments, and countries that are embedded in this world polity are more likely to participate in this multinational platform.

As for multilateral agencies, the existing literature points to the fact that they institute more conditionalities via aid giving than bilateral agencies do (Maizels & Nissanke, 1984). Their aid practices may also influence the flow of education aid to participating countries. While little empirical evidence has been gathered to date, it is highly probable that multilateral agencies have agreements spelled out in aide-memoires encouraging countries to take the necessary steps to participate in cross-national assessments. This type of incentive may not necessarily take the form of strict conditionalities for aid disbursement; it may take "soft" forms of compliance, such as policy dialogues, agreed plans of action, or project milestone indicators to monitor progress, among others. In more budgetary terms, project budgets may be allocated to provide partial funding to offset some of the costs associated with participating in cross-national assessments, as a way of encouraging countries to take part.11 Kamens and McNeely also provide evidence to support this claim, stating that "… an expanding number of donor agencies and multilateral organizations are mandating some form of learning assessment to accompany their loans and other aid support" (p. 6). Multilateral agencies also provide foreign aid in the form of technical assistance to improve recipients' capacity for national and/or international assessments. In short, various types of aid (e.g., conferences, technical assistance, funding for project preparation where policy dialogues are carried out) are all recorded as financial transactions in the OECD aid-to-education data.12


One key set of stakeholders not tracked through education aid, but influential nonetheless, is the international nongovernmental organizations (INGOs). Since the OECD does not report aid flows from INGOs to developing countries, I cannot draw conclusions about the role INGOs play in this analysis in terms of financial assistance. Nonetheless, the role of transnational advocacy networks in education is undoubtedly significant in the education policy arena, not for the financial support they provide to beneficiaries, but for their role in facilitating information and advocacy (Mundy & Murphy, 2001). Mundy and Murphy (2001) note that when there are "overlapping collective action frames," INGOs become aligned with other members of the donor community. For example, the International Association for the Evaluation of Educational Achievement, a type of INGO, issues a series of reports using TIMSS and PIRLS. It collaborates not only with international organizations but also with national agencies (e.g., the US National Center for Education Statistics) as well as nonprofit foundations and think tanks.13 Further analysis of the mechanisms by which INGOs advocate for these "action frames," such as the importance of cross-national assessments, will help us understand the role of INGOs in influencing developing countries to participate in cross-national assessments.

CONCLUSION

Findings from this study show a clear and strong positive relationship between test participation and foreign aid to education. The main finding is that participating countries receive more foreign aid to education than countries that do not participate in cross-national assessments. Using lag and lead year variables, I find that the association between participation and foreign aid is largest immediately after the worldwide dissemination of test results. When the dependent variable is disaggregated by donor status, I draw two additional conclusions. First, multilateral agencies respond to test participation only after test results are announced, and the flow of education aid is greatest two years after test results are disseminated. Second, bilateral agencies are much more consistent in their aid giving to participants in cross-national assessments over time, providing participating countries with more education aid than nonparticipating countries. As with multilateral agencies, the association between aid from bilateral agencies and test participation is greatest two years after test participation.


This study also reveals that the donor community responds positively and rapidly to participation in cross-national assessments. Results show that, on aggregate, aid flow is greatest immediately after test results are revealed, usually one to three years after test participation. If developing countries are willing to invest in the short-term costs, there appear to be long-term benefits associated with test participation. These benefits include learning the technical aspects of administering cross-national assessments, which strengthens the administration of national assessments, a stronger monitoring and evaluation framework, and sound information systems to track performance, all of which lead to more efficient use of limited public finances. Moreover, the reputation of international assessments seems to have improved over time. International assessments are now an internationally accepted mechanism sought after by both developing and developed countries, and they are widely considered to be among the most legitimate tools for comparing the performance of children across countries. For this reason, in countries where national assessments are underdeveloped, policymakers rely on the results of cross-national assessments to inform policy decisions. While this can have drawbacks due to measurement error arising from underrepresented groups or small sample sizes, the use of cross-national assessment is becoming more pervasive and influential in domestic education policymaking.

AREAS FOR FURTHER RESEARCH

There is limited empirical research investigating why developing nations participate in cross-national assessments. This study is one attempt at understanding the incentives behind national participation in cross-national assessments. Further qualitative work involving surveys and interviews could help establish the causal direction of aid giving: it is difficult to ascertain whether recipient countries negotiate for more aid as a result of test participation, or whether the donor agency is responding ex ante or ex post to the efforts of developing countries. Such work would also address whether the donor community provides foreign aid to help countries pay off the costs associated with test participation, or whether it is simply rewarding the effort of participating in the international league of testing. The political environment and bureaucratic tendencies are also key factors that influence a developing country's decision to participate or not in cross-national assessments. To supplement my analytical model, case studies, structured interviews, and surveys


RIE KIJIMA

will be conducted to understand the full spectrum of reasons why countries find cross-national assessments vital in education policymaking. The results of this study have significant policy implications for developing countries. The study offers an insight about test participation that was previously hard to justify: for developing countries, there are tangible incentives associated with test participation. It provides evidence of a positive association between test participation and foreign aid transactions in education. While further research is required to ascertain whether participants from developing nations are consciously making decisions in response to specific foreign aid "carrots," the empirical evidence indicates that test participation is highly associated with the inflow of aid allotted to education. The magnitude of the difference in education aid between participants and nonparticipants is large (approximately 37 percent on aggregate). The donor community seems to regard participation in cross-national assessments as an indicator of serious commitment to educational reform, and recipient countries seem to be compensated for joining the international league of testing. This finding offers a compelling answer to why a growing number of developing countries are taking part in this global phenomenon.

NOTES

1. The first cross-national assessment was conducted between 1959 and 1962. It involved 12 countries and tested mathematics, reading comprehension, geography, science, and nonverbal ability among 13-year-olds (middle school students). This led to the First International Mathematics Study (FIMS), conducted in 1964.
2. International assessments: PISA, PIRLS, TIMSS. Regional assessments: SACMEQ in Africa, SERCE/PERCE in Latin America, and PASEC for current and former French-speaking countries.
3. The 30 member countries of the OECD are: Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Luxembourg, Mexico, the Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, Turkey, United Kingdom, and United States (OECD website). Mexico became a member in 1994. Turkey became a member in 1961. Slovenia became a member in 2010.
4. Iran has participated in TIMSS since 1995. Its average polity score between 1994 and 2006 is –1.46, and its average number of IGO memberships during the same period is 57. Iran received, on average, 27 million USD per year in education aid.
5. Oman participated in TIMSS for the first time in 2007. Its average polity score between 1994 and 2006 is –7.5, and its average number of IGO memberships during the same period is 49. Oman received, on average, 354,000 USD per year in

Why Participate? Cross-National Assessments and Foreign Aid to Education

59

education aid. In contrast, South Korea, a recipient of education aid until 1997, has an average polity score of 6.8, and its average IGO membership is 68.
6. Race to the Top is a 4.3 billion USD project that began in 2009 to improve the quality of public education in the United States. It is a mechanism to introduce more accountability, better standards, and assessment of progress. For more information, refer to: http://www2.ed.gov/programs/racetothetop/index.html
7. The raw dataset is available at http://www.systemicpeace.org/polity/polity4.htm
8. The COW dataset is available at http://www.correlatesofwar.org/COW2%20Data/IGOs/IGOv2.3.htm
9. Amelia II is imputation software that employs a bootstrapping-based algorithm to generate values for missing entries commonly found in time-series and cross-sectional datasets. It uses multiple imputation, which proceeds in three steps. First, plausible values for missing entries are generated that best represent the uncertainty about the nonresponses. Second, the imputation is performed multiple times, resulting in a predetermined number of "complete" datasets with imputed values. Third, the results from these imputations are combined, so that the uncertainty about the missing values is incorporated into the final dataset. For my study, I ran five imputations, the number recommended when running Amelia II. More information on Amelia II is found at http://gking.harvard.edu/amelia/
10. Many other factors besides test participation influence how much aid to education a country receives; that is, test participation is not the only factor that influences how much official development assistance in education a country receives. Methodologically, it is not necessary to control for other potential determinants of foreign aid to education, as long as they are uncorrelated with the independent variable, test participation. This study exploits only the relationship between test participation and foreign aid to education; it is therefore not surprising that the R2 is low in my specification models.
11. Evidence to support this point is difficult to obtain because budgets for each project, minutes of policy dialogues, and project milestones are not publicly available. Also, each project has different measures for compliance; it would therefore be challenging to collect this type of data across numerous countries.
12. Further research by way of interviews with relevant government officials would reveal the different ways in which countries are incentivized by multilateral agencies to participate in cross-national assessments.
13. IEA: Brief history. Available at http://www.iea.nl/brief_history_of_iea.html
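The three imputation steps described in note 9 can be sketched in miniature. This is a toy illustration only, not Amelia II's actual bootstrapped-EM algorithm: the data, the resampling-with-noise imputation model, and the analysis (a simple mean) are all invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dataset: one variable with missing entries (np.nan).
x = np.array([4.0, 5.0, np.nan, 6.0, np.nan, 5.5, 4.5])
observed = x[~np.isnan(x)]
m = 5  # number of imputed datasets, as in note 9

estimates = []
for _ in range(m):
    # Step 1: draw plausible values that reflect uncertainty about the
    # nonresponses (here: resampled observed values plus noise; Amelia
    # instead fits a multivariate normal model via bootstrapped EM).
    n_missing = int(np.isnan(x).sum())
    draws = rng.choice(observed, size=n_missing) + rng.normal(0, 0.5, n_missing)
    completed = x.copy()
    completed[np.isnan(x)] = draws
    # Step 2: analyze each "complete" dataset.
    estimates.append(completed.mean())

# Step 3: combine the m analyses into one pooled result.
pooled = float(np.mean(estimates))
print(round(pooled, 2))
```

Because each completed dataset differs slightly, the spread of the five estimates carries the uncertainty due to missingness into the pooled result.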


DOES INEQUALITY INFLUENCE THE IMPACT OF SCHOOLS ON STUDENT MATHEMATICS ACHIEVEMENT? A COMPARISON OF NINE HIGH-, MEDIUM-, AND LOW-INEQUALITY COUNTRIES

Amita Chudgar and Thomas F. Luschei

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 63–84
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013006

ABSTRACT

In this chapter, we seek to contribute to a line of international and comparative research that began with Heyneman and Loxley's 1983 study examining the importance of schools across national contexts. In their influential paper, Heyneman and Loxley found that in lower-income societies, schools (rather than families) constitute the predominant influence in explaining student achievement. Similar studies followed, often with results challenging Heyneman and Loxley's original findings. We argue that one reason for inconsistencies among these studies is the failure to account for the distribution of income. Until recently, few studies had examined whether school effects vary across countries with different levels of income inequality. Yet emerging evidence suggests that inequality plays an important role in determining the extent to which


schools "matter" for student learning. In this study, we employ hierarchical linear modeling and two related yet distinct measures of inequality to examine how inequality relates to within- and between-country variations in student performance. We also explore whether, in certain countries, schools are differently able to help children from higher- and lower-socioeconomic status (SES) groups. To capture sufficient variation in country context, we use data from nine diverse countries participating in the fourth grade application of the 2003 Trends in International Mathematics and Science Study (TIMSS). Our findings indicate that schools are important in their own right, and especially important in unequal countries. However, schools may affect SES-based achievement gaps only in countries with high income and resource inequality, accompanied by classrooms that are heterogeneous in SES composition.

OBJECTIVES

In this chapter, we seek to contribute to a line of international and comparative research that began with Heyneman and Loxley's (HL) (1983) study examining the importance of schools across national contexts. In their influential paper, Heyneman and Loxley found that in lower-income societies, schools (rather than families) constitute the predominant influence in explaining student achievement. Similar studies followed, often with results challenging Heyneman and Loxley's original findings (Baker, Goesling, & LeTendre, 2002; Hanushek & Luque, 2003; Harris, 2007). We argue that one reason for inconsistencies among these studies is the failure to account for the distribution of income. Until recently, few studies had examined whether school effects vary across countries with different levels of income inequality. Yet emerging evidence suggests that inequality plays an important role in determining the extent to which schools "matter" for student learning (Chiu & Khoo, 2005). Our own research examining fourth grade math and science achievement data in 25 diverse countries indicates that schools play a particularly important role in poor and unequal countries (Chudgar & Luschei, 2009). Yet inequality has many dimensions, most of which have been ignored in cross-national educational research. In this study, we employ two related yet distinct measures of inequality to examine how inequality relates to within- and between-country variations in student performance. We also explore whether, in certain countries, schools are differently able to help children from higher- and lower-SES groups.


To capture sufficient variation in country context, we use data from nine diverse countries participating in the fourth grade application of the 2003 Trends in International Mathematics and Science Study (TIMSS). This and previous work with TIMSS have taught us many lessons regarding the uses and challenges of conducting research with large, cross-national (and cross-sectional) datasets. Our secondary objective here is to provide reflections and lessons learned related to the use of these data for research and policy.

Theoretical Framework

Few educational researchers doubt that students' family backgrounds strongly influence their success in school. As Rothstein (2004) observes, "no analyst has been able to attribute less than two-thirds of the variation in achievement among schools to the family characteristics of their students" (p. 14). The avenues by which students' social class contributes to or impedes their learning are numerous and may include differences in parenting approaches, educational expectations, and exposure to different levels of social capital. Taken together, these differences "influence the average tendencies of families from different social classes" (Rothstein, 2004, p. 3). In turn, these tendencies contribute to a persistent achievement gap in the United States between less and more advantaged children (e.g., Chubb & Loveless, 2002; Jencks & Phillips, 1998). Outside of the United States, and particularly in developing countries, the relationship between social class and student achievement is not as clear. Heyneman and Loxley's (1983) paper challenged years of United States-based evidence that family background matters much more than school quality in explaining student achievement differences. Using achievement data from 28 countries – including countries in Africa, Asia, Latin America, and the Middle East – Heyneman and Loxley (1983) found that the influence of school and teacher quality on student achievement was greater in low-income countries than in higher-income countries. In fact, the authors concluded that the "predominant influence" on student learning in low-income countries was the quality of the schools and teachers students were exposed to. The authors argue that one possible reason for the differential importance of schools and teachers in low-income countries is that "as a commodity, education is both scarce and in high demand" (p. 1182).
As a result, low- and high-income parents alike may push children to perform well if given the opportunity to attend school. According to


Heyneman and Loxley, such uniformity of parental pressure and expectations may reduce the impact of SES on achievement in low-income countries. Heyneman and Loxley's (1983) study inspired a rich cross-national literature on the role of schools, with mixed and often contradictory results. Reviewing more than 90 education production function studies in developing countries, Hanushek (1995) found that some school resources tend to have a greater positive relationship with student outcomes in lower-income countries, results that "clearly suggest a possible differentiation by stage of development and general level of resources available" (p. 281). Similarly, in their analysis of the 2000 Program for International Student Assessment (PISA) data, Chiu and Khoo (2005) found evidence that 15-year-old students in poorer countries benefit more from educational resources than similarly aged students in wealthier countries. Our own work with the 2003 fourth grade TIMSS data similarly found that schools tend to have a greater impact on student achievement in lower-income countries (Chudgar & Luschei, 2009). Yet other recent studies have called the Heyneman and Loxley results into question. Using data from the 1995 Third International Mathematics and Science Study (TIMSS), Hanushek and Luque (2003) found that the impact of school resources is not systematically related to a country's level of income or development. In an analysis of 1999 TIMSS data, Harris (2007) found only weak support for the hypothesis of diminishing marginal returns to school inputs. Examining data from the 1995 TIMSS, Baker et al. (2002) found a "vanishing HL effect" between the early 1970s and 1995.
They hypothesize that increasing children's access to school through governments' funding of mass schooling has diminished the differentially positive impact of school resources in low-income countries.1 Inconsistencies in the results of cross-national studies may stem in part from a failure to account for the influence of income and resource inequality on the school–student relationship. Indeed, Baker et al. (2002) point out that their findings may diverge from Heyneman and Loxley's due to a differential ability of these studies to consider national inequality levels. While Heyneman and Loxley's country sample included more high-inequality Latin American countries, Baker et al.'s (2002) study included more low-inequality countries from the former Soviet Union. Yet for the most part, the question of inequality's influence on the impact of schools has not been examined empirically in the cross-national literature. Although most prior cross-national studies group countries according to per capita national income to analyze the importance of schools, they generally do not make similar groupings by income inequality. Yet if income inequality is


accompanied by inequality of educational access and quality, access to school resources may be severely constrained among poor children in very unequal countries. In such cases, scarce educational resources may have a much greater impact on student achievement. Alternatively, if educational resources are concentrated among wealthier children, diminishing marginal returns to these "lower-yield" students may reduce the overall impact of schools in highly unequal countries (Chiu & Khoo, 2005). A related consideration is the extent to which students are segregated into schools along social class lines, which influences the types of peers and the level of resources that children have access to. Yet given limited prior cross-national attention to these questions, we know little about whether and how different levels and types of inequality influence the role of schools across and within countries. Our own research across 25 diverse countries found that although family background is generally more important than school-level factors, schools as a unit of analysis represent a significant source of variation in achievement in all 25 countries after controlling for student and family background variables. We also found that the importance of schools relative to families is particularly strong in countries with high levels of income inequality, and to a lesser extent in lower-income countries (Chudgar & Luschei, 2009). These findings suggest that even in a wealthy but relatively unequal country like the United States, schools may play a greater role than researchers recognize. Although we believe that these results make an important contribution to the existing literature, our analysis relied only on "external" (based on measures from the World Bank and other sources) and "internal" (calculated by the authors from the TIMSS data) measures of the Gini index.
We did not explore the degree to which school resources, including students, are clustered in schools, nor did we examine differences in the impact of schools across different social class groups within countries. Here, we work with a smaller sample of nine countries to explore these alternative dimensions of inequality. This reduced sample allows us to explore complex questions and relationships, while at the same time exploiting variation in country-level inequality and student clustering in schools. To address the relative inattention to inequality in the existing literature on cross-national differences in the school–achievement relationship, we examine the HL Hypothesis both across and within a diverse group of nine countries with varying levels of income inequality and national income. In addition to our cross-country analysis, we examine within-country differences in relationships among school resources, student socioeconomic status, and student achievement. Specifically, we examine the impact of schools on


SES-based achievement gaps. We explore three principal questions in our analysis. (1) Does the relative importance of schools vary by country inequality levels?2 (2) Within individual countries, do schools play a role in bridging the achievement gap associated with differences in students’ SES? (3) How does the relationship between schools and achievement gaps vary across countries by national resource inequality levels?
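Question (1) is addressed in the chapter with hierarchical linear modeling. The underlying quantity — how much of the variance in achievement lies between rather than within schools — can be sketched with a simple plug-in calculation. The data below are synthetic, and the sketch ignores the covariate adjustments and small-sample corrections a full HLM would apply.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic achievement data: 40 schools x 20 students per school.
# True school effects have sd 30; student-level noise has sd 70.
school_effects = rng.normal(0, 30, size=40)
scores = school_effects[:, None] + rng.normal(500, 70, size=(40, 20))

# Share of total variance lying between schools (an intraclass correlation).
between = scores.mean(axis=1).var()   # variance of school means
within = scores.var(axis=1).mean()    # average within-school variance
icc = between / (between + within)
print(round(icc, 2))
```

A country where this share is large is one where which school a child attends matters more — the comparison at the heart of question (1).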

CONCEPTUALIZING INEQUALITY AND ITS RELATIONSHIP WITH STUDENT ACHIEVEMENT

The limited cross-national work examining the importance of income inequality in understanding the impact of schools suggests that there are multiple ways in which the distribution of income or resources can mediate the impact of school resources. To sort out the multiple effects that inequality may have on the impact of schools, we distinguish between two types of inequality: (a) overall income inequality in a country, as indicated by measures such as the Gini index, and (b) the degree to which students are clustered in schools based on their social class. The first type is more general and reflects various facets of a country's social relationships and institutions. As Chiu and Khoo (2005) suggest, income inequality may also serve as a proxy for inequality in the distribution of school resources. The second type of inequality is more specific to schools and offers a better picture of student composition within schools and the extent to which children interact with other students of different social class backgrounds. The first measure can thus be seen as a broader, national reflection of income distribution, whereas the second measure captures more closely the diversity in resource availability that students in our sample would experience. The first measure, strictly speaking, captures the variation in income distribution in a country, but for the purposes of our study, we also use this widely used measure of inequality as a proxy for inequality in overall resource distribution across a country. Thus, we argue that in a country where income is equally distributed, we are also likely to see schools with equal levels of resources, relative to a country with huge disparities in income distribution. This dual conceptualization of inequality leads to a number of hypotheses and possible scenarios (Table 1) pertaining to our third research question.
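The first measure, the Gini index, can be computed from a set of incomes as half the mean absolute difference between all pairs, divided by the mean. The sketch below is a minimal illustration on a 0–1 scale; published Gini figures, such as those in Table 2, are expressed on a 0–100 scale.

```python
def gini(incomes):
    """Gini index: 0 = perfect equality, approaching 1 = maximal inequality."""
    xs = sorted(incomes)
    n = len(xs)
    mean = sum(xs) / n
    # Mean absolute difference over all ordered pairs.
    mad = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    return mad / (2 * mean)

print(round(gini([10, 10, 10, 10]), 2))  # perfectly equal: 0.0
print(round(gini([0, 0, 0, 100]), 2))    # income concentrated in one household
```

Either toy distribution has the same total income; only its concentration differs, which is exactly what the index is designed to capture.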
Within countries, we must consider not only inequality in the distribution of


Table 1. Various Country Scenarios Based on Income Inequality and School-Level Clustering of Students.

High national resource inequality (unequal country):
- High clustering (homogeneous schools) – Scenario 1: School resources are unequally distributed and cannot bridge the SES-based achievement gap. Example: Philippines.
- Low clustering (heterogeneous schools) – Scenario 2: School resources may be unequally distributed, but a more diverse group of children benefit from them, so schools may be able to bridge the SES-based achievement gap. Example: Latvia.

Low national resource inequality (equal country):
- High clustering (homogeneous schools) – Scenario 3: Overall equal resource distribution implies that school resources matter less. Example: Hungary.
- Low clustering (heterogeneous schools) – Scenario 4: Overall equal resource distribution implies that school resources matter less, but low clustering of children may allow for some additional benefits to poor children. Example: Norway.

resources or income, but also clustering of children within schools, which serves as a proxy for SES-based segregation. Ignoring countries with mid-level inequality, we imagine four scenarios: (1) an unequal country with a high degree of clustering; (2) an unequal country with low clustering; (3) an equal country with high clustering; and (4) an equal country with low clustering. Assuming that in unequal countries school resources are also distributed unequally, and that resources are skewed in the same direction as income inequality, in Scenario 1 we will find that more advantaged children have more resources available to them. In other words, we see what Chiu and Khoo (2005) characterize as "privileged student bias." Due to high clustering, children will be in schools with other children similar to them in terms of social class, schools with higher average SES will tend to have greater resources, and schools will have difficulty ameliorating SES-based achievement gaps. As a result, we hypothesize that the performance of children from a given SES group will not vary much from school to school; at the country level, schools will not have a strong impact on SES-based achievement gaps. However, in Scenario 2, we may find a mixture of


children from different social classes in schools. Despite a likely concentration of resources among schools with higher average student SES, a more diverse group of children will have access to these resources. Here, we may see some effect of school resources in dampening gaps caused by differences in student background. We further hypothesize that in Scenarios 3 and 4, schools should matter less if we again assume that school resource distribution reflects country-level income distribution, but we may still find that low student clustering in schools – which allows students of different SES levels to interact with each other – plays some role in reducing SES-based achievement gaps. In this case, all else equal, in countries reflecting Scenario 4 (low inequality, low clustering), schools will have a greater impact on SES-based achievement gaps than in Scenario 3 countries (low inequality, high clustering).

DATA AND METHODS

Our data come from the 2003 fourth grade application of TIMSS. In addition to student test scores, these data contain extensive background information on students, principals, math and science teachers, and curriculum. In the 2003 TIMSS, 25 countries participated in the fourth grade survey and 49 countries in the eighth grade. TIMSS tests one or more classes in each sampled school, so data are at the classroom level. Although the TIMSS studies discussed in the preceding text generally use eighth grade data, we choose instead to work with the fourth grade data due to our interest in student SES and inequality.3 To examine patterns across a diverse group of countries and levels of income equality, we draw on a subsample of nine countries participating in the 2003 fourth grade TIMSS. Given the complex nature of the analysis and the variety of expected relationships, we believed that limiting our study to a smaller yet sufficiently diverse sample of countries allows for a more manageable approach to exploring our research questions. To capture variability in national contexts, we chose countries with a range of income equality levels, according to the most recent Gini index estimates available (Table 2). We also classify these countries into groups based on income equality (Table 3): low inequality (Hungary, Norway); mid-level inequality (Armenia, Japan, Latvia); and high inequality (Philippines, Russia, Tunisia, United States). These countries also represent a variety of World Bank income categories. The World Bank uses gross national income per capita to divide national economies into four groups: low income ($905 or less),


Does Inequality Influence the Impact of Schools on Student Achievement?

Table 2. Country Sample.

Country       2003 GDP/Capita   World Bank Income    Gini Index(b)   TIMSS 2003 Grade 4   Math Rank     Chiu and Khoo
              (2000$, PPP)      Classification(a)    (Year)          Math Scale Score     (Out of 25)   Clustering
Armenia        3,473            LMI                  33.77 (2003)    456                  20            0.20
Hungary       14,793            UMI                  26.85 (2002)    529                  11            0.23
Japan         26,063            High OECD            37.90 (2000)    565                   3            0.09
Latvia        10,032            UMI                  37.67 (2003)    536                   7            0.18
Norway        35,327            High OECD            25.79 (2000)    451                  21            0.10
Philippines    4,250            LMI                  44.53 (2003)    358                  23            0.36
Russia         8,373            UMI                  39.93 (2002)    532                   9            0.35
Tunisia        6,871            LMI                  39.80 (2000)    339                  25            0.36
USA           35,313            High OECD            45.00 (2004)    518                  12            0.20

Source: CIA (2007), Gonzales et al. (2004), and World Bank Development Indicators (2008).
(a) LMI, lower-middle income; UMI, upper-middle income.
(b) The Gini index is a measure of income inequality in a given country. A larger number indicates a greater concentration of income among a small fraction of the population, that is, greater income inequality.
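For readers unfamiliar with how a Gini index is calculated, a minimal sketch of the standard formula follows. This is purely illustrative: the values in Table 2 come from published estimates, not from the authors' own computation.

```python
def gini(incomes):
    """Gini index of an income distribution, scaled 0-100 as in Table 2:
    0 = perfect equality, values near 100 = income concentrated in few hands."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Standard formulation: G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n sorted ascending
    weighted = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return 100.0 * weighted / (n * total)

print(round(gini([10, 10, 10, 10]), 1))  # 0.0  (perfect equality)
print(round(gini([0, 0, 0, 100]), 1))    # 75.0 (one person holds all income)
```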

Table 3. Country Sample According to Income Inequality and National Income Categories.

                                 High Income      Upper-Middle Income   Lower-Middle Income
Low inequality (Gini < 30)       Norway           Hungary
Middle inequality (Gini < 40)    Japan            Latvia                Armenia
High inequality (Gini >= 40)     United States    Russia                Tunisia, Philippines

lower-middle income (LMI, $906–3,595), upper-middle income (UMI, $3,596–11,115), and high income ($11,116 or more).4 According to these classifications, our sample includes three LMI countries (Armenia, Philippines, Tunisia); three UMI countries (Hungary, Latvia, Russia); and three high-income countries (Japan, Norway, United States). We acknowledge that while the matrix presented in Table 3 indicates sufficient diversity in our country selection, this country grouping is by no means absolute. With a different international dataset, a similar study could be conducted with a set of nine countries that look quite different from those we have selected here. The primary purpose here is to apply a common set of questions to these nine countries with relatively different income and inequality levels and to explore the results side by side.
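The World Bank thresholds quoted above map directly to a simple lookup; a minimal sketch follows. Note that these classifications are based on GNI per capita (Atlas method), not the PPP GDP figures shown in Table 2, so the two should not be cross-checked against each other.

```python
def income_group(gni_per_capita):
    """World Bank income groups, using the 2008 thresholds quoted above (US$ GNI/capita)."""
    if gni_per_capita <= 905:
        return "low"
    if gni_per_capita <= 3595:
        return "lower-middle"
    if gni_per_capita <= 11115:
        return "upper-middle"
    return "high"

print(income_group(900), income_group(2000), income_group(8000), income_group(20000))
# low lower-middle upper-middle high
```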


AMITA CHUDGAR AND THOMAS F. LUSCHEI

VARIABLES

Our key dependent variable is the student test score on mathematics achievement tests as represented by five plausible values. With TIMSS, it is essential to use appropriate techniques to take into account these separate imputed estimates for each test score. We also include sample weights related to the probability of being included in the sample, which is necessary to produce nationally representative results (Martin, 2005). Our independent variables represent various characteristics of children, their families, and their schools (Table 4). At the individual student level, controls include students' age and gender, as well as indices measuring their self-confidence in learning math and the time they spend on math homework. Unfortunately, the fourth grade TIMSS data do not provide information on parental education, occupation, or income. To control for family background, we use an index of "educational capital" in the home, or SESEDCAP, as a control for student SES. This index is based on students' answers regarding family possessions related to learning: dictionary, calculator, computer, desk (1 = yes and 0 = no), and books in the home (1 = no or few books to 5 = more than 200 books). By construction, educational capital ranges from a minimum of 1 to a maximum of 9. We chose this measure of SES as it seems likely to be closely related to educational outcomes and at the same time provides a fair proxy for the family's economic status and parental education. Unfortunately, given the rather limited published work on fourth grade TIMSS data – as well as the unavailability of other variables such as parental education in these data – we cannot externally validate this measure; however, we may argue that, by construction, it fairly captures the educational resources available to a child within the household.
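The construction of the educational capital index described above can be sketched as follows; the function and argument names are illustrative, not TIMSS variable names.

```python
def sesedcap(dictionary, calculator, computer, desk, books_category):
    """Educational capital index as described in the text: four yes/no
    possessions (0 or 1 each) plus a books-in-home category (1-5),
    giving a total range of 1 to 9."""
    possessions = int(dictionary) + int(calculator) + int(computer) + int(desk)
    if not 1 <= books_category <= 5:
        raise ValueError("books_category must be 1-5")
    return possessions + books_category

print(sesedcap(0, 0, 0, 0, 1))  # 1, the minimum
print(sesedcap(1, 1, 1, 1, 5))  # 9, the maximum
```

From this index, the analysis then forms within-country terciles, as the following paragraph explains.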
Moreover, in this and other work, we have found this variable to be strongly and positively associated with student achievement, indicating that educational capital measures something educationally important about children’s backgrounds (Chudgar & Luschei, 2009; Luschei & Chudgar, 2009). We then created three dummy variables to identify students in the top, middle, and bottom third of the SES distribution in their countries. In the HLM model described in detail in the following text, we introduced the high- and low-SES dummies to identify the SES-based achievement gap experienced by these students relative to middle-SES students. We use the SES variable in this fashion rather than a continuous variable because it allows us to separately evaluate the relationship between school factors and student performance when the student belongs to high- and low-SES groups. This approach also allows


Table 4. Variables Used in Analysis.

Variable                            Description                                                      Source
Dependent variable
  Student achievement in math      Represented by five plausible values                              TIMSS 2003
Student background
  Student age                      Age in years                                                      TIMSS 2003
  Female student                   Dichotomous dummy variable                                        TIMSS 2003
  Student self-confidence in math  Categorical variable indicating high, medium, or low level        TIMSS 2003
                                   of self-confidence
  Student time spent on homework   Categorical variable indicating high, medium, or low amount       TIMSS 2003
                                   of time
Family background
  Language spoken                  Categorical variable indicating the frequency with which the      TIMSS 2003
                                   language of the test is spoken at home
  Educational capital              Index from 1 to 9 based on answers to questions about family      Constructed from TIMSS 2003
                                   possessions related to learning: dictionary (0-1), calculator
                                   (0-1), computer (0-1), desk (0-1), books in the home (1-5)
Measures of inequality
  Chiu and Khoo's (2005)           Ratio of the variance in school mean SESEDCAP in a country        Constructed from TIMSS 2003
  measure of clustering            to the overall variance in SESEDCAP in the country
  Gini index                       Level of income inequality in the country                         World Bank, World Development
                                                                                                     Indicators

Note: All TIMSS variables are from the Grade 4 database.

us to account for the possibility that the relationship between SES and student achievement is not linear. One disadvantage of this approach is the loss of variability that results from converting continuous variables into categorical ones. Finally, to measure inequality levels in our sample countries, we use preexisting external information on the Gini index (Table 2). We also use our SESEDCAP variable to calculate Chiu and Khoo's (2005) measure of clustering. This is the ratio of the variance in school mean SESEDCAP in a country to the overall variance in SESEDCAP in that country. A higher ratio indicates that school means are spread farther apart, implying that students are "clustered" into schools by SES level rather than schools enrolling similarly mixed student populations. It is likely that in many countries the Gini and the clustering ratio are closely related; in fact, among the nine countries we study, the correlation coefficient between these two variables is 0.49. However, these variables are far from perfectly correlated. Therefore, one may find a situation where a country has high income inequality but low clustering in schools, and vice versa. Although there is no absolute way to judge inequality and clustering levels, the four quadrants in Table 1 help us to make some preliminary observations regarding our sample countries. For instance, the Philippines experiences not only high levels of resource inequality (as expressed by the Gini index), but also high student clustering across schools. In Hungary, resources may be more equally distributed, but within schools, children experience a very homogeneous (clustered) environment. In contrast, Norway enjoys equal resource distribution and low levels of clustering, or more heterogeneous schools. Finally, Latvia and Japan experience both high levels of resource equality and low levels of clustering across schools, or heterogeneous schools.
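The clustering ratio defined above is straightforward to compute from student-level data. The sketch below uses illustrative column names (not the authors' code or the TIMSS field names).

```python
import pandas as pd

def clustering(df, ses_col="sesedcap", school_col="school"):
    """Chiu and Khoo's (2005) clustering measure: variance of school-mean SES
    divided by the overall SES variance in the country (population variances)."""
    school_means = df.groupby(school_col)[ses_col].mean()
    return school_means.var(ddof=0) / df[ses_col].var(ddof=0)

# Perfectly segregated schools -> ratio of 1; identical school mixes -> ratio of 0
segregated = pd.DataFrame({"school": [1, 1, 2, 2], "sesedcap": [2, 2, 8, 8]})
mixed      = pd.DataFrame({"school": [1, 1, 2, 2], "sesedcap": [2, 8, 2, 8]})
print(clustering(segregated), clustering(mixed))  # 1.0 0.0
```

The two toy data frames illustrate the extremes: schools that are internally homogeneous but differ sharply from one another score high, while schools that each mirror the national SES mix score near zero.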

METHODS

We use hierarchical models (Raudenbush & Bryk, 2002) to explore the three research questions of interest. Specifically, we estimate the proportion of variance in the variables of interest that is attributable to Level 2 (schools/teachers), controlling for relevant student/family background. We conduct the following estimations separately for each of the nine countries in our sample:

Level 1:

    Yij = b0j + bhighj * HighSESij + blowj * LowSESij + a * STUDENT + d * FAMILY + eij    (1)

Level 2:

    b0j = g00 + u0j                   (2)
    bhighj = ghigh0 + uhighj          (3)
    blowj = glow0 + ulowj             (4)

Level 1 is the student-level equation, where the subscript i refers to the student and the subscript j refers to the school. a represents the vector of coefficients associating students' background with their achievement in

Does Inequality Influence the Impact of Schools on Student Achievement?

75

school j. The significance levels of coefficients bhighj and blowj indicate the "SES-achievement gap," or the advantage and disadvantage faced by high- and low-SES students respectively, compared to the reference category, middle-SES students. In the Level 2 analysis, we estimate the variation in school-level coefficients b0j, bhighj, and blowj by focusing on the variance components for the error terms (u0j, uhighj, and ulowj). The error term u for each Level 2 equation indicates the unique increment in the estimate associated with the Level 2 unit, that is, schools. The variance of u0j, represented as t00, for instance, is the unconditional variance in Level 1 intercepts, or the variance component. The significance of the variance component is tested using a chi-square test; significance of the b0j variance component indicates significant differences in average achievement across schools. The significance of variance components associated with bhighj and blowj (uhighj and ulowj) similarly indicates that the SES-based achievement gap varies significantly from school to school; in other words, schools can help to exacerbate or dampen the SES-based achievement gap in that country. If the coefficient bhighj and its variance component are statistically significant, this implies the existence of an SES-based achievement gap distinguishing higher-SES children from middle-SES children. Moreover, the significance of the variance component indicates that the extent of this gap varies from school to school; in other words, school-level factors potentially influence this gap. In a country where this variance component is significant, we should find that in some schools higher-SES children experience a larger gain in their performance, compared to middle-SES children, than in other schools.
By not including any specific school covariates (such as school resources, student composition, school leadership, or school management) at Level 2, we are unable to ask what specifically about schools (such as resources or peers) may be important in the SES–achievement relationship. That is a separate and important investigation in its own right that we do not explore in this study. We allow the remaining coefficients not expressed at Level 2 to remain fixed, without an error term. In other words, we propose a scenario where student gender, age, and other family-level variables relate to achievement in the same way regardless of the student's school. After generating estimates for unconditional variance components in Model 2, we analyze these data alongside country indicators of income inequality (Gini index) and the measure of clustering that we generated from the data. As discussed in the preceding text, clustering represents a different type of inequality that may or may not be directly related to the distribution of income. It is an important measure because it illustrates the degree to which children interact with others from different social classes, and so gives us more information about the composition of schools themselves.
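As a concrete, purely illustrative rendering of this estimation strategy, the sketch below fits a comparable two-level model, with a random intercept and random high/low-SES slopes, using Python's statsmodels on simulated data. It is not the authors' code or exact specification, and it omits the plausible-value and sampling-weight machinery described in the preceding text.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for j in range(60):                                 # 60 simulated schools
    u0, uh, ul = rng.normal(0, [20.0, 8.0, 8.0])    # school-level random effects
    for _ in range(20):                             # 20 students per school
        ses = rng.integers(3)                       # 0 = low, 1 = middle, 2 = high
        hi, lo = int(ses == 2), int(ses == 0)
        score = 500 + u0 + (25 + uh) * hi + (-20 + ul) * lo + rng.normal(0, 30)
        rows.append({"school": j, "score": score, "high_ses": hi, "low_ses": lo})
df = pd.DataFrame(rows)

# Random intercept (Eq. 2) plus random slopes for the SES dummies (Eqs. 3-4)
model = smf.mixedlm("score ~ high_ses + low_ses", df,
                    groups=df["school"], re_formula="~high_ses + low_ses")
result = model.fit()
# Fixed effects estimate the average SES-achievement gaps across schools
print(result.params["high_ses"], result.params["low_ses"])
```

The estimated random-effect variances play the role of the variance components (t00, t11, t22) whose significance the chapter tests with chi-square statistics.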

RESULTS

Table 5 shows select results from the HLM models for the nine countries in our sample. In all nine countries, the variance component of the intercept (t00) is statistically significant. This implies that, controlling for relevant child and family-level background variables, average student test scores vary significantly from school to school. In other words, the significance of the variance component across the board indicates that there are school characteristics that matter, in addition to children's own background, in determining their test scores in all nine countries in our sample. Having noted that schools matter for all the countries in our sample, we approach Research Question 1 by plotting the changing magnitude of the

Table 5. HLM Results for the Nine Countries.

Country       Intercept g00 (p)   VC t00 (p)       Hcoeff g10 (p)   HVC t11 (p)    Lcoeff g20 (p)   LVC t22 (p)
Hungary       599.46 (0.00)        686.22 (0.00)   23.16 (0.00)      22.97 (NS)    20.72 (0.00)     163.6 (NS)
Norway        542.6  (0.00)        334.6  (0.00)   11.6  (0.00)      72.2  (0.03)  24.2  (0.00)      52.7 (NS)
Japan         644    (0.00)        221    (0.00)   27.2  (0.00)      46.4  (NS)    NA                NA
Latvia        602.8  (0.00)        898.59 (0.00)    7.5  (0.01)      52.8  (0.01)  14.7  (0.00)      59.5 (0.01)
Armenia       535.9  (0.00)       1559.51 (0.00)    9.3  (0.04)      44.14 (NS)    16.5  (0.002)    117.4 (NS)
Philippines   410.3  (0.00)       4304.3  (0.00)    7.8  (NS)       393.3  (NA)    11.1  (0.01)     215   (NS)
Russia        585.8  (0.00)       2474.2  (0.00)    8.4  (0.01)      93.6  (NS)     4.2  (NS)       177.7 (NA)
Tunisia       402.5  (0.00)       2677.5  (0.00)   28.3  (0.00)     175.8  (NS)     3.3  (NS)        50.7 (NA)
USA           557.8  (0.00)       1243.6  (0.003)  33.6  (0.00)     195.9  (NS)     7    (NS)       123   (NA)

For Japan, given the nature of the SESEDCAP distribution, we had only two SES groups; therefore, the reference category for the high group is all other students. NS, not significant; NA, not applicable.


Fig. 1. Scatterplot, Variance Component of the Intercept and Gini Coefficient (linear fit R² = 0.3777).

Fig. 2. Scatterplot, Variance Component of the Intercept and Clustering (linear fit R² = 0.7931).

variance component against our two measures of inequality (Figs. 1 and 2). On the X-axis is the list of countries ordered in ascending levels of inequality. The upward-sloping line in both panels indicates a positive relationship between inequality levels and the importance of school-level factors. The findings are especially strong when using the Chiu and Khoo (2005) measure of clustering. The stronger findings from Fig. 2 may indicate that the context closest to children (i.e., clustering within their schools), rather than the broader national context, is more salient in terms of the role schools play in varying children's achievement. Both findings also suggest that when there is high inequality in terms of resource distribution (nationally or at the school level), school-level factors could play an important role in improving student performance (as noted already in Chudgar & Luschei, 2009). Yet the lack of complete congruence between the two figures indicates that the Gini measure and the clustering measure capture slightly different forms of inequality. Extending our original findings, we note that the relationship between inequality and the importance of schools is especially strong when using the Chiu and Khoo (2005) measure of clustering. In other words, as the level of clustering within a country increases (i.e., children are more likely to study with other children "like" them), schools – as units of observation – become more important in explaining variation in student achievement.

Our second question relates to whether achievement gaps due to differences in students' SES vary from school to school. If we do find such variation, we can argue that schools play a role not just in improving the average performance of children in school (as exhibited by earlier research and in this analysis), but that they also can help to bridge or worsen the achievement gap children experience due to differences in family background. To address this question, we first note some consistent evidence of an SES-based achievement gap across our nine sample countries (Table 5).
In Hungary, Norway, Latvia, Armenia, and Japan, children from all different SES groups perform significantly differently from one another5 in each direction, with those in the high-SES group performing better than the middle group and those in the low-SES group performing worse than the middle group. The remaining countries in the sample demonstrate a different pattern. In the Philippines, children in the low-SES group are significantly worse off compared to the middle-SES group, but children in the higher-SES group do not perform significantly better than those in the middle-SES group. In the case of Russia, Tunisia, and the United States, we find the reverse scenario: children in the high-SES group significantly outperform children in the middle-SES group, while children in the middle-SES group are not significantly different from the low-SES group. Overall, we find that an SES-based achievement gap in one or both directions exists in all nine countries studied.

Turning to the role of schools in bridging or exacerbating the gap, or focusing on the significance of the variance components of the SES coefficients, a slightly different picture emerges with regard to relationships among SES, test scores, and the influence of schools. We find that in only a few instances in our sample do schools appear to play a role in bridging the achievement gap due to student family background or SES. In response to our third question, which relates to cross-country variation in the relationship between schools and SES-based achievement gaps, the data show that only in Norway and Latvia does the extent to which high-SES children outperform middle-SES children vary from school to school. In Latvia, the extent to which low-SES children underperform middle-SES children also varies from school to school. In other words, of nine diverse countries, only in Latvia do we find school-to-school differences in the extent to which high-SES children outperform and low-SES children underperform middle-SES children. While there is no absolute way to judge inequality and clustering levels, it may be argued that Latvia is the only case in our current sample which exhibits both high inequality (relative to several other countries in the sample) and relatively low clustering (Japan is an exception, as it has both higher inequality and lower clustering than Latvia; but Japan also has higher national income, suggesting possible interactions between inequality and national income). Referring to our earlier conceptualization, Latvia is the closest in our sample to being a "Scenario 2" country. It is thus the only sample country in which, while overall resource distribution is unequal – meaning some school-going children belong to especially well-off families and some to especially worse-off families (as reflected in the Gini) – children of different social backgrounds seem to come together in schools, as reflected in relatively low clustering. This scenario may allow schools to influence SES-based achievement gaps.
Similarly, in Norway, while overall inequality in resource distribution is low to begin with, there are also low levels of clustering. This low clustering may allow schools to play something of a corrective role, as argued in the preceding text. Overall, our findings indicate that school-level factors are important in their own right, and especially important in unequal countries. However, school-level factors may be associated with SES-based achievement gaps only in countries with high inequality and low clustering. Possibly because such a situation is rare, in most of the countries in our sample we do not find that schools significantly influence SES-based achievement gaps. Additionally, we find that the relationship between clustering and income inequality may be mediated by a country's level of income, so that in high-income countries these relationships do not hold.


DISCUSSION

Our analysis of nine countries participating in the 2003 TIMSS has produced some general patterns worthy of note. First, our results lend support to the importance of inequality in explaining cross-national differences in the impact of schools. Systematic relationships associated with overall country inequality levels, along with evidence of differential impacts on different SES groups, encourage further pursuit of this line of inquiry. More specifically, these results demonstrate that the declining importance of schools is closely aligned with declining country-level inequality. Results regarding the role of schools in influencing the relationship between students' SES and their achievement are mixed. We find that in certain countries (Norway, Latvia), schools can influence the extent to which high-SES students outperform middle-SES students, and in Latvia schools also mediate the extent to which low-SES students underperform middle-SES students. These results may stem from the unique situation of Latvia, a middle-income country with both relatively high inequality and low SES-based clustering. This creates a situation in which, while school resources may vary significantly across schools, within schools children of different social class backgrounds can interact and share resources and knowledge, so that schools can influence the SES–achievement relationship. In the case of Norway, despite low income inequality, low levels of student clustering may help to further mediate class-based achievement differences. These results indicate that in a given country, the extent to which schools mediate the effect of SES may depend on students' SES level, and that schools may not be able to mediate the relationship between SES and achievement for all SES groups. These findings also indicate that while between countries schools may matter more for poor countries, within countries schools may not uniformly matter more for poor children.6

CONCLUSION

Our study makes several important contributions to the cross-national literature on student achievement. To begin with, we highlight the importance of inequality in explaining the impact of school resources on student achievement. We also use a student sample (fourth grade) that is likely to have greater variation in children's background than most previous studies, which generally rely on eighth grade students (TIMSS) or 15-year-olds (PISA). Our analysis of nine countries participating in the 2003 fourth grade application of TIMSS suggests the need for further cross-country research on the impact of schools across countries with different levels of income inequality. Our results also suggest the importance of within-country analysis, as schools do not appear to impact all groups in the same way within countries. Many of the patterns we observe in TIMSS would be virtually impossible to observe in just one country, no matter how diverse. At the same time, within-country analysis in TIMSS is somewhat constrained by the availability of student and family background variables. These represent a few of the strengths and weaknesses that lead us to make some broader reflections on the rich, cross-national TIMSS data.

TIMSS and other similar large-scale international assessment data have several unique attributes that make them different from other large-scale databases (for an excellent introduction and summary of these issues, see Rutkowski, Gonzalez, Joncas, & von Davier, 2010). This implies that, as with any large-scale dataset, researchers must make a significant investment of time to understand the intricacies of the data. But having made this initial investment, we believe that these extremely rich yet relatively underutilized data provide the international comparative research community with outstanding research opportunities. Most importantly, as exhibited in this small study, these data allow the researcher to observe similar attributes of children, families, and schools across widely divergent settings. While we by no means claim that findings from one country can always be used to support or oppose education policy in another, countries can often learn a lot from one another by observing the experiences of similar and dissimilar nations.
Indeed, in an increasingly globalizing world, debates about what Korea or China is "doing right" or how the United States can "retain its global advantage" form the core of many education policy discussions both domestically and abroad. It is with regard to these debates that such excellent, rich, consistent, user-friendly, and publicly available data as TIMSS make a very important contribution.

These data also have their limitations, and we have struggled with two issues in particular. First, while over time more and more countries have participated in these large-scale data collection exercises (which are demanding in terms of national resources and services), many of the truly poor and developing countries are not included in multicountry comparisons. Related to this point is a concern that affects all studies like this one: when we characterize a country as "equal" or "unequal" or "rich" or "poor," we do not mean in relation to the rest of the world; rather, these are comparisons relative to the subpopulation of countries participating in a particular round of TIMSS or another international study. As a result, we must take great caution in generalizing the results from these studies to broader, global levels, and particularly to countries that did not participate in a given survey. A second, related concern is the ability of international surveys to accurately measure variables that may vary in importance and interpretation across countries (for an excellent summary of these arguments, see Fuller & Clarke, 1994). In Chudgar and Luschei (2009) we discuss in greater detail the time and thought that go into constructing questionnaires and variables that are truly internationally comparable (for instance, see Martin, Mullis, & Chrostowski, 2004). However, it is extremely challenging to generate measures of home background and socioeconomic status that are at once truly representative of diverse national contexts and yet logistically feasible and affordable to collect within the limits of a massive cross-national operation. In fact, a recent article by Harwell and LeBeau (2010) argues that a widely used measure of student SES in the United States (eligibility for free lunch), while "easy to access, inexpensive" and with "minimal" chances of nonresponse, is a "poor measure of socioeconomic status" that "suffers from important deficiencies that can bias inference" (p. 120). If finding an easily accessible and inexpensive measure of SES in a country with so many resources devoted to education research is challenging, it should be no surprise that such issues will also remain a challenge in international comparative work, with its more ambitious goals and sometimes relatively limited resources. In the end, we believe that the positive potential of international achievement studies far outweighs the pitfalls and challenges that they must overcome.
More and more countries share this opinion, as the number of participants in PISA, TIMSS, and PIRLS increases with each application. Moreover, regional assessments in Africa and Latin America, such as SACMEQ, PASEC, and SERCE, allow the southern hemisphere to join the debate about whether and how schools matter. In sum, in spite of the many limitations of large-scale international assessment studies, these studies hold tremendous and yet underutilized potential to answer questions that are highly relevant for both research and policy.

NOTES

1. For a more complete review of research regarding school effects in developing countries, see Buchmann and Hannum (2001) and Chudgar and Luschei (2009).


2. The first research question reexamines the results of our original study (Chudgar & Luschei, 2009) with a smaller country sample and more nuanced measures of inequality.

3. For example, in the 2003 assessment of PISA, participants Mexico and Turkey enrolled only 58% and 54%, respectively, of their 15-year-old children in school (OECD, 2004). If 15-year-olds enrolled in school are different from those who are not enrolled, then PISA data, and possibly even the eighth grade TIMSS data, do not provide truly representative samples of the population of all children. More importantly, exclusion from the sample of poorer and lower-achieving children, who are more likely to be absent or leave school prior to being assessed, can also bias the relationship between school resources and achievement. Additionally, if poor students are more likely to drop out of school as they age, then the population of children in school will become more homogeneous over time, suggesting that 15-year-olds are more alike one another in terms of social class than children in eighth grade, who are in turn more homogeneous than children in fourth grade.

4. The World Bank periodically updates these income ranges and country classifications. These classifications are based on World Bank figures from 2008.

5. In the case of Japan, the distribution of the SESEDCAP variable allows only two categories, high and low, and for Japan we also find that higher-SES children outperform the rest of the children in the sample.

6. In Latvia, the one example where schools matter both for high- and low-SES children, it is worth noting that the variance component for the high-SES coefficient is smaller than the variance component for the low-SES coefficient.

REFERENCES

Baker, D. P., Goesling, B., & LeTendre, G. K. (2002). Socioeconomic status, school quality, and national economic development: A cross-national analysis of the "Heyneman–Loxley effect" on mathematics and science achievement. Comparative Education Review, 46(3), 291–312.

Buchmann, C., & Hannum, E. (2001). Education and stratification in developing countries: A review of theories and research. Annual Review of Sociology, 27, 77–102.

Chiu, M. M., & Khoo, L. (2005). Effects of resources, inequality, and privilege bias on achievement: Country, school, and student level analysis. American Educational Research Journal, 42(4), 575–603.

Chubb, J. E., & Loveless, T. (Eds.). (2002). Bridging the achievement gap. Washington, DC: Brookings Institution Press.

Chudgar, A., & Luschei, T. F. (2009). National income, income inequality, and the importance of schools: A hierarchical cross-national comparison. American Educational Research Journal, 46(3), 626–658.

Fuller, B., & Clarke, P. (1994). Raising school effects while ignoring culture? Local conditions and the influence of classroom tools, rules, and pedagogy. Review of Educational Research, 64, 119–157.


Gonzales, P., Guzmán, J. C., Partelow, L., Pahlke, E., Jocelyn, L., Kastberg, D., & Williams, T. (2004). Highlights from the Trends in International Mathematics and Science Study (TIMSS) 2003 (NCES 2005-005). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Hanushek, E. A. (1995). Education production functions. In M. Carnoy (Ed.), International encyclopedia of economics of education (2nd ed., pp. 277–282). Tarrytown, NY: Pergamon.

Hanushek, E. A., & Luque, J. A. (2003). Efficiency and equity in schools around the world. Economics of Education Review, 22, 481–502.

Harris, D. N. (2007). Diminishing marginal returns and the production of education: An international analysis. Education Economics, 15(1), 31–53.

Harwell, M., & LeBeau, B. (2010). Student eligibility for a free lunch as an SES measure in education research. Educational Researcher, 39(2), 120–131.

Heyneman, S. P., & Loxley, W. A. (1983). The effect of primary-school quality on academic achievement across twenty-nine high- and low-income countries. American Journal of Sociology, 88(6), 1162–1194.

Jencks, C., & Phillips, M. (Eds.). (1998). The black–white test score gap. Washington, DC: Brookings Institution Press.

Luschei, T. F., & Chudgar, A. (2009, April). Students, teachers, and fourth grade mathematics achievement: A cross-national examination of relationships and interactions. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.

Martin, M. O. (Ed.). (2005). TIMSS 2003 user guide for the international database. Boston, MA: Lynch School of Education, Boston College.

Martin, M. O., Mullis, I. V. S., & Chrostowski, S. J. (Eds.). (2004). TIMSS 2003 technical report. Chestnut Hill, MA: TIMSS and PIRLS International Study Center, Boston College.

OECD. (2004). Learning for tomorrow's world: First results from PISA 2003. Paris: OECD Publications.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. London: Sage Publications.

Rothstein, R. (2004). Class and schools: Using social, economic, and educational reform to close the black–white achievement gap. Washington, DC: Economic Policy Institute.

Rutkowski, L., Gonzalez, E., Joncas, M., & von Davier, M. (2010). International large-scale assessment data: Issues in secondary analysis and reporting. Educational Researcher, 39(2), 142–151.

World Bank. (2008). Country classification. Available at http://web.worldbank.org/WBSITE/EXTERNAL/DATASTATISTICS/0,,contentMDK:20420458~menuPK:64133156~pagePK:64133150~piPK:64133175~theSitePK:239419,00.html. Retrieved on March 26, 2008.

NEW DIRECTIONS IN NATIONAL EDUCATION POLICYMAKING: STUDENT CAREER PLANS IN INTERNATIONAL ACHIEVEMENT STUDIES

Joanna Sikora and Lawrence J. Saha

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 85–117
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013007

ABSTRACT

Our first goal is to discuss new information for national policymaking which may arise from the analyses of international achievement study data. The second is to illustrate this potential by exploring the determinants of students' career plans in a cross-national perspective. Using neo-institutionalism as our theoretical framework, we propose that the influence of a global educational ideology encourages high levels of occupational ambition among students. This is particularly the case in countries where the transfer of this ideology is supported by the reception of aid for education, where economic prosperity is at modest levels, and where service sector employment is expanding. To explore this proposition, we analyze students' occupational expectations using the 2006 PISA surveys from 49 countries. We account for a broad range of possible determinants by estimating three-level hierarchical models in which students are clustered in schools and schools within countries. We find that at


individual and school levels, ambition is positively correlated with economic and noneconomic resources. In contrast, students in poorer countries, where secondary education is not yet universally accessible, tend to be more ambitious. The global educational ideology, indicated by the reception of education-related aid, is associated with student career optimism, while students in affluent nations with less economic inequality have modest occupational plans. In addition, the rate of service sector expansion is positively related to high levels of ambition. These patterns hold even after we control for cross-national variation in the extent to which PISA respondents represent populations of 15-year-olds in their countries.

INTRODUCTION

The purpose of our chapter is to identify new directions through which international achievement studies can contribute to national policymaking. Although we use data from the Programme for International Student Assessment (PISA), which focuses on academic achievement, our study examines a nonacademic outcome, namely student occupational expectations. Although achievement scores are central to the PISA project, a growing understanding of their significance will be found in the nonachievement context variables. This is due to the unprecedented "…wealth of contextual background information from participating students and their homes, teachers and schools" (Rutkowski, Gonzalez, Joncas, & von Davier, 2010, p. 142) in the PISA variables. These data make possible more comprehensive contextual analyses of the roles that micro- and macro-level opportunity structures and subjective perceptions play in students' educational outcomes. Because of this perspective, we study the determinants of the occupational plans of PISA adolescents, which have policy relevance not only for the educational and occupational attainments of young people, but also for career counselling and market-planning policies.

In addition to the study of the relationships between educational and occupational attainments, the analysis of young people's occupational expectations can reveal both local and global forces which affect youth career plans. The latter represent important changes to the general cultural milieu introduced by the spread of a global educational ideology (Kamens & McNeely, 2010, p. 10). The impact of globalization transforms the economic outlooks of nation states, which often facilitate rather than resist its progress, and thus strive to produce an appropriate labor force to


support their expanding economies. The growth of economies goes hand in hand with the expansion of education systems, and the new discourses and imageries of the global ideology then shape the individual goals of youth. Therefore, the expectations of young people are affected not only by individual preferences for the intrinsic rewards of particular jobs, and by family and school environments, but also by knowledge of labor market opportunities. The transnational educational ideology, which emphasizes equality of opportunity, the right to quality education, and connectedness to international labor markets, is arguably an important context for youth career formation. Our goal in this chapter is to highlight the impact of this broad context on cross-national differences in adolescent occupational plans.

We open this chapter with a literature review which focuses on the neo-institutionalist and status attainment theories as frameworks for understanding the complex and dynamic configurations of influences which shape students' plans. We argue that in addition to individual and school characteristics, it is necessary to include cross-national variations in the linkage to global educational ideologies, as well as the characteristics of national education systems and the economic conditions within nation states, for these also may affect student ambitions, possibly in opposite directions. We then proceed to the empirical analyses of students' occupational expectations in the PISA countries. Our analysis includes a discussion of some potential pitfalls of investigations of the type presented here, such as the truncated sample of PISA countries and missing data.
We also consider issues related to differences in secondary school participation rates across countries and their implications for cross-country comparisons. In conclusion, we discuss the value of our results for policymaking and for the future problems and promises of cross-national research on the "context" variables in achievement studies.

We are aware that the use of PISA data for our analyses may raise questions for some readers. PISA represents a relatively recent development in a long line of international achievement studies, which include those conducted by the International Association for the Evaluation of Educational Achievement (IEA), such as the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS). Because PISA is sponsored and in part funded by the OECD, it has the resources and legitimacy to dominate policymaking in an unprecedented manner.


Unlike previous international achievement surveys, the PISA data clearly provide an opportunity for far more than simple comparisons of country educational performances. No other survey has approached the study of school knowledge from the standpoint of the perceived needs of adult success in modern society. This makes the PISA surveys potentially more sensitive, compared to previous international surveys, to the awareness or experience of an institutionalizing or globalizing impact, which was never widely discussed in the context of these other studies. Our study is the first to ask whether an institutionalized world education culture can be assumed to involve a dominant imagery of desirable careers, for example, a professional career, which is both a product and an agent of educational expansion and of the international isomorphism in educational practices and ideologies.

To investigate the presumed influence of global-level effects, we use hierarchical linear modeling. This procedure allows us to analyze the factors which influence student occupational expectations at the individual, school, and country levels. The key questions we address are as follows: (1) Can we learn anything new about the determinants of youth plans from three-level cross-national analyses in which between-country variation complements the same set of factors in individual country-by-country comparisons? In other words, do three-level analyses, in which students are clustered in schools, and schools are clustered in countries, provide new information about the wider global contexts within which adolescents form their occupational plans? (2) Can this knowledge inform educational equity and vocational counselling policies by incorporating the influence of local and global ideologies on students' career ambitions?

PRIOR RESEARCH AND OUR THEORETICAL FRAMEWORK

Before a global perspective in the study of student career plans emerged, the occupational expectations of students were investigated primarily as individual phenomena by developmental psychologists whose interest lay in career counselling. Within sociology, career plans have also been studied, but mainly in relation to social mobility within and between generations (Musgrave, 1967; Turner, 1964). In the latter tradition, it was found that an advantaged family background coupled with high levels of achievement raised educational aspirations and expectations which, in time, boosted


occupational attainments. These positive correlations between advantage and ambition were found at both the individual and the school levels. Initially, there was some conceptual ambiguity between career aspirations and expectations. The distinction between these two concepts was recognized early because of the awareness that the former measured life plans relatively unaffected by perceived social restraints, whereas the latter took such restraints into account (Caro & Pihlblad, 1965; Desoran, 1977/1978; Empey, 1956; Han, 1969; Saha, 1983, 1997). Later, expectations defined in this way were found to be better predictors of actual attainments than aspirations (Goyette, 2008), and thus the PISA questionnaires have included only the measure of students' expectations.

Despite the long-standing consensus that the vocational preferences of students are driven by individual interests, perceived intrinsic rewards of particular careers, and personality types (Brown, 2002; Holland, 1997), psychologists recognize that students' career expectations are not formed in a social vacuum. This is consistent with status attainment research within sociology, which emphasizes the role of economic and cultural capital in the family of origin, and the role of structural barriers within institutional systems, as crucial determinants of students' ambitions. Researchers in both traditions found that career plans are shaped by family backgrounds (including economic, social, and cultural capital), community factors, peer groups, students' work experience, and local labor market conditions (Gottfredson, 2002; Johnson & Mortimer, 2002).
By and large, social advantage in various forms raises levels of ambition, although under some circumstances disadvantaged groups or minorities display very high levels of expectations, as shown by both psychological and sociological research (Feliciano & Rumbaut, 2005; Helwig, 2008; Looker & McNutt, 1989; Rajewski, 1996; Rindfuss, Cooksey, & Sutterlin, 1999; Saha, 1983; Saha & Sikora, 2008). This latter point is important because occupational plans have been found to be especially good predictors of later occupational attainments, not only for older generations but also for recent cohorts of high school students (Croll, 2008; Rindfuss et al., 1999). Recent research in Australia has confirmed this relationship, even after the educational plans of youth and their actual university completion had been statistically controlled (Sikora & Saha, 2010). Earlier concerns that youth plans were only "flights of fancy," fleeting and randomly fluctuating (Alexander & Cook, 1979), have been shown to be unfounded. In fact, although upon the completion of studies graduates rarely work in the exact occupations desired in high school, students who can articulate specific and ambitious goals at that time are more likely to


succeed in early entry to high-status occupations (Croll, 2008; Rindfuss et al., 1999). This higher likelihood is net of academic performance, family SES, and a host of other influences. To complement and extend these findings, we argue that the educational ideology which comes packaged with the imagery of highly skilled and well-rewarded employment, typical of increasingly service-oriented economies, has its own capacity to stimulate very ambitious plans among students, particularly in countries where the prevalence of such employment is yet to become a reality. The impact of this ideology does not eradicate the effects of individual differences and of school environments on students' expectations. Moreover, we contend that its effects depend also on economic and social conditions in particular countries. Until education systems become more homogeneous across countries, returns to education will remain higher in poorer, less developed societies which experience more inequality (Psacharopoulos & Patrinos, 2004). This, coupled with a rapid expansion of educational systems and the growth of service employment within these countries, is likely to trigger high levels of ambition among youth.

THEORETICAL FRAMEWORK AND THREE LEVELS OF INFLUENCES ON STUDENTS' CAREER PLANS

We employ two theoretical perspectives to examine three contextual levels in which students formulate their vocational plans. Our first perspective is the institutional approach developed by Meyer and his colleagues (Benavot, 1997; Meyer, Ramirez, Rubinson, & Boli-Bennett, 1977; Schofer & Meyer, 2005), which informs our macro-level analysis. At the same time, we incorporate developments within the status attainment perspective, which informs our analyses of gender, academic ability, student family background, and school characteristics. These latter are our control variables.

The institutional theoretical approach focuses on the global expansion of educational systems, on the institutional links between education and other social institutions, and finally on the impact of mass education on other institutional arrangements in society (Benavot, 1997). Institutional theory is particularly well suited to guide the study of attainment processes at the cross-national level because it emphasizes the impact of global pressures on local educational systems. This perspective explains the increasing isomorphism of national education systems, driven by global educational expansion and progressive standardization, and by universal educational


values supported by supranational organizations like UNESCO and the World Bank. Yet it also allows for cross-country variation in responses to these standard and uniform ideologies, responses which are influenced by local cultural and economic conditions within nation states. Suda (1979) contended that as education expands in developing countries, perceptions of life chances are raised unduly, resulting in what some have called "the diploma disease" (Dore, 1976). Moreover, as Irizarry (1980) pointed out, under the contradictions of capitalist development in these countries, ambitions are raised but with limited possibilities for their fulfillment. With the expansion of educational systems and the globalization of labor markets, what seemed previously to be unduly ambitious expectations, particularly among youth in less developed countries, becomes part of a worldwide cultural milieu dominated by the assumptions of equal rights to education for all and, likewise, equal access to highly skilled and well-rewarded employment (Wiseman & Baker, 2006).

In addition to the variation in economic structures of opportunity and, as we argue, the ideological influences of a global culture of education, status attainment theory postulates that the preferences of students are affected by the features of families, schools, and national school systems. For instance, the most recent studies find that students form "more realistic" and thus less ambitious educational expectations in highly stratified systems, which rely on early specialization and tracking (Buchmann & Dalton, 2002; Buchmann & Park, 2009). Similarly, Mortimer and Krüger (2000) found that, with respect to career destinations, the differentiation of educational institutions served to maintain stratified pathways. Unequal access to university-oriented academic pathways, or even to higher-level secondary schooling, is the key factor determining not only educational but also occupational attainment.
This is why, in our analyses, we control for the degree of educational system differentiation, rates of participation in secondary education, and the prevalence of tertiary educated workers among the youngest adults. However, none of these measures directly captures the influence of global educational ideology. To address this issue, we employ data about countries’ access to aid for education. Youth in countries which receive aid to expand and improve their education systems are direct recipients of the key messages contained in the equity discourse of global educational ideologies. Moreover, they live in economic conditions which may facilitate optimism about prospects of upward mobility associated with development. We argue that it is this change in both educational ideologies and economic conditions that fosters high levels of optimism in students’ career expectations.


Following from the points discussed above, we next develop the hypotheses which will guide our analysis.

HYPOTHESES

The direct link to educational expansion programs, which propagate ideas of equality, citizens' rights, and economic progress, should stimulate ambitious career plans even in countries whose current labor markets cannot accommodate echelons of highly skilled professionals. Therefore, we expect that students in countries which receive educational aid will be more likely to be optimistic in their career expectations despite the less favorable economic conditions in their local labor markets (H1). However, we also expect that this optimism is stimulated not only by connectedness to global educational ideologies but also by the pace of expansion in local labor markets. Therefore, we propose that in countries where the service sector has grown rapidly in recent years, students will expect professional and managerial employment despite low levels of economic prosperity and pervasive economic inequalities (H2). To account for forces which curb and moderate student optimism, and in line with prior research on stratified education systems, we expect lower levels of expectations in countries where tracking and segregation of students start at or before the age of 15 (H3).

We control our analyses for rates of participation in secondary education to keep in check the possibility that the apparently higher levels of ambition in poorer countries result from the fact that only the more determined and affluent students are at school at the age of 15. Finally, we also control for the proportion of tertiary educated among the youngest workers, because high levels of tertiary attainment may boost expectations for professional employment among adolescents. These three hypotheses are tested after controlling not only for access to secondary and tertiary education but also for the association between parents' socioeconomic status and students' career goals, students' gender, and their academic ability.
Schools in which students' parents have higher occupational standing are known to be conducive to higher educational attainment. This is because school communities which consist of many skilled, affluent, and well-educated parents utilize not only material but also social resources to benefit and advance their students. Therefore, we also control for schools' average parents' SES, for selective admission policies at the school level which screen out average and below-average students, and for urban location, all of which are likely to be associated with more ambitious occupational plans.


DATA, MEASUREMENT, AND METHODS

To test these hypotheses, we use the most recent round of the OECD's PISA survey, conducted in 2006 in over 50 countries. Our dependent variable is derived from a single question which has been asked in most PISA surveys:

    What kind of job do you expect to have when you are about 30 years old?
    Write the job title: ___________________________________

Although a measure based on a single question can be seen as potentially problematic, particularly in light of concerns expressed about the variability of adolescents' plans over time (Rindfuss et al., 1999, p. 231), this practice is a standard form of collecting occupational data in surveys. Moreover, research in Australia indicates that, while the particular occupational titles desired by teenagers vary over time, their preferences in terms of occupational status are significantly more stable (Sikora & Saha, 2010). Although PISA participation continues to be dominated by OECD countries, the presence of some "lower-middle income" countries, in World Bank terminology, with varying levels of economic prosperity, inequality, and receipt of international aid for education programs, makes these data suitable for our research questions.

Measurement

The PISA occupational data, including students' career expectations, were initially coded to ISCO-88, that is, the International Standard Classification of Occupations. These codes were then recoded into the ISEI index of occupational status (Ganzeboom & Treiman, 1996), which, as a measure of occupational expectation, is our dependent variable. ISEI scores range from 10 to 90, with the lowest values denoting unskilled labor and the highest allocated to highly specialized professions, such as neurosurgeons or judges in courts of law.

In this sample of countries, various aspects of institutional context are strongly interrelated (see Appendix Table A2 for the full correlation matrix of country-level variables). Therefore, we use an index to convey the information about GDP per capita and the proportion of the labor force employed in the service sector. Our modeling strategy follows three steps. First, we assess the impact of our measure of connectedness to the global educational ideology, namely the reception of education-related aid. Nations which received such aid are coded 1 and the rest are coded 0.
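The ISCO-88 to ISEI recoding described above can be sketched as a simple lookup. This is an illustrative sketch only: the function name and the two mapping entries are ours, and the full Ganzeboom and Treiman (1996) crosswalk covers every ISCO-88 unit group rather than the two shown here.

```python
# Hypothetical fragment of an ISCO-88 -> ISEI crosswalk; the real table
# (Ganzeboom & Treiman, 1996) assigns a score to every ISCO-88 unit group.
ISCO_TO_ISEI = {
    "2221": 88,  # medical doctors: near the top of the 10-90 ISEI range
    "9313": 21,  # building construction laborers: near the bottom
}

def isei_score(isco_code: str):
    """Return the ISEI score for an ISCO-88 code, or None when the answer
    is uncodable (e.g., blank or 'anything that pays well')."""
    return ISCO_TO_ISEI.get(isco_code)
```

Uncodable responses are left as missing rather than guessed, which is what makes the missing-data issue discussed later in the chapter unavoidable.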


Other controls comprise secondary school net enrolment rates, the proportion of people aged between 25 and 34 years who completed tertiary education (defined as ISCED level 5a or 6), and the number of schools or distinct educational programs available to 15-year-olds within each country. The last variable is our indicator of the differentiation within the school systems. In the second step, we focus on the economic characteristics of countries as follows: (1) the index of GDP per capita and the share of service employment expressed as ratios to the USA figures and then averaged; (2) the GINI index of economic inequality; and (3) the expansion rate of the service sector. Finally in the third step, we enter all predictors at the country level. These predictors have been sourced from World Bank Development Indicators, UNESCO, and International Labour Office’s online databases, complemented with several other sources. The details are provided in the appendix. At the school level, we control for (1) parents’ SES averaged within schools, which identifies schools with higher proportions of students from privileged backgrounds; (2) selective admission based on academic achievement (coded 0 for schools that do not have such a policy and 1 for those that do); and (3) the size of town in which each school is located. In a preliminary analysis, we also included an indicator of the school’s autonomy in determining the curriculum, but this variable turned out to be insignificant and thus was omitted from later analyses. At the individual level, we include gender and parents’ socioeconomic status, created from the information on mother and father’s averaged ISEI scores, and averaged years of schooling completed. Educational levels were recoded into years using the template provided by the 2003 PISA manual (OECD, 2005a). Education and occupation contribute equally to our measure of parents’ SES. 
Following prior research based on the PISA data (Buchmann & Park, 2009), we averaged the five plausible values for reading performance to create a control variable indicating prior academic achievement, as actual data on prior academic achievement are not available. We also control for students' participation in either vocationally or prevocationally oriented programs. This is crucial to account for the possibility that, as in Poland before and after the educational reform first initiated in 1999 and implemented over the next few years, PISA may have been administered to students before or just after the first transition which distributes students into academic versus vocational streams. The timing of such a transition may significantly affect the likelihood of forming concrete occupational plans.
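The two derived variables described above, parents' SES and the reading-achievement control, can be sketched as follows. The function names are ours, and the standardization means and standard deviations are placeholder values; the chapter states only that education and occupation contribute equally to the SES measure, so the equal-weight averaging is the one grounded assumption here.

```python
import statistics

def parents_ses(mother_isei, father_isei, mother_yrs, father_yrs,
                occ_stats=(50.0, 16.0), edu_stats=(12.0, 3.5)):
    """Combine parents' averaged ISEI scores and averaged years of schooling
    into one SES score, with occupation and education weighted equally.
    The (mean, sd) pairs used for standardization are placeholders."""
    occ = (mother_isei + father_isei) / 2
    edu = (mother_yrs + father_yrs) / 2
    z_occ = (occ - occ_stats[0]) / occ_stats[1]
    z_edu = (edu - edu_stats[0]) / edu_stats[1]
    return (z_occ + z_edu) / 2

def reading_achievement(plausible_values):
    """Average the five PISA reading plausible values into a single control
    variable, as described in the text (following Buchmann & Park, 2009)."""
    return statistics.mean(plausible_values)
```

A student whose parents sit exactly at the placeholder means scores 0 on this SES scale, so coefficients on it read as deviations from that reference point.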


Method

To model occupational expectations, we employ random-intercept three-level linear models, as available in HLM version 6.08, in which students are clustered in schools and schools are clustered in countries, as in Eq. (1):

Expected_Occupation_ijk = constant
    + Recipient_of_Aid_to_Education_k
    + Number_of_School_Programs_k
    + Percentage_of_25–34-Year-Olds_Tertiary_Educated_k
    + Secondary_School_Participation_Rate_k
    + Gini_k
    + Index_of_GDP_&_Service_Employment_k
    + Growth_of_Service_Sector_1997–2004_k
    + Average_Parents'_SES_in_School_jk
    + Size_of_Town_jk
    + Selective_Admission_jk
    + Male_ijk
    + Parents'_SES_ijk
    + Reading_Score_ijk
    + Vocational_Program_ijk
    + v_0k + u_0jk + e_0ijk                                  (1)

All multivariate analyses are weighted by the student population weight and by a country factor which ensures that each country contributes equally to the analysis (OECD, 2008a). Because we rely on linear models, which are sensitive to departures from normality, all analyses have been performed with robust standard errors. The predictors in our models are centered around the overall mean, and thus model coefficients should be interpreted as deviations from the grand mean.
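The weighting and centering steps described above can be sketched in a few lines. The function names are ours, and the per-country weight total of 1,000 is an arbitrary illustrative constant; any common constant makes each country contribute equally, which is the point of the country factor.

```python
def country_adjusted_weights(weights, countries, target=1000.0):
    """Rescale student population weights so that each country's weights sum
    to the same constant (1,000 here, an arbitrary choice), so every country
    contributes equally to the pooled analysis."""
    totals = {}
    for w, c in zip(weights, countries):
        totals[c] = totals.get(c, 0.0) + w
    return [w * target / totals[c] for w, c in zip(weights, countries)]

def grand_mean_center(values):
    """Center a predictor around the overall (grand) mean, so that model
    coefficients read as deviations from the grand mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]
```

After rescaling, a large country like Mexico (over 24,000 students) and a small one like Iceland (under 3,000) carry the same total weight, mirroring the equal-contribution adjustment cited from OECD (2008a).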

Missing data

As in all data sets, PISA variables have some missing values. Of particular concern are missing values on the dependent variable. Occupational expectation is a variable subject to higher levels of missing data than educational plans (Table A1). This is partly because some students find it harder to name their expected occupation and therefore give invalid or uncodable responses such as "anything that pays well." But many students simply do not have clear occupational plans at 15 years of age (White, 2007). Prior research has found that students with lower levels of academic achievement are more likely to skip this question or give uncodable answers


(Sikora & Saha, 2009). Therefore, cross-national comparisons of top achievers might be a better strategy when comparing occupational plans, because imputing occupational expectations would introduce significant bias (Marks, 2010). This is one of the reasons why we estimate our models for the whole sample and then again for a subsample of top achievers. To reduce the loss of information, we imputed some data at the individual level for independent variables only. The results from analyses with imputed variables were not substantively different from those obtained without imputation.
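The asymmetric treatment of missing data described above, never imputing the outcome but filling gaps in predictors, can be sketched as follows. This is a simplified mean-imputation sketch under our own assumptions; the chapter does not specify which imputation procedure it used, and the function and field names are hypothetical.

```python
def impute_independents(rows, predictors):
    """Drop cases missing the dependent variable (expected occupation),
    then mean-impute missing predictor values among the remaining cases.
    A simplified stand-in for the chapter's unspecified imputation method."""
    kept = [r for r in rows if r.get("expected_isei") is not None]
    for p in predictors:
        observed = [r[p] for r in kept if r.get(p) is not None]
        mean = sum(observed) / len(observed) if observed else None
        for r in kept:
            if r.get(p) is None:
                r[p] = mean
    return kept
```

The outcome is never filled in because, as Marks (2010) is cited to argue, imputing occupational expectations would itself introduce significant bias.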

RESULTS

Before we proceed to the analyses of the impact that global educational ideology has on career plans, we first examine the pattern of average ambitions in the context of national economic differences. In countries with higher levels of economic prosperity and lower economic inequality, the average occupational expectations of students tend to be less ambitious. We employ the Gini index of inequality as an indicator of economic differences between countries, and Fig. 1 illustrates this relationship for our data. In 2006, students in South American countries with high levels of inequality tended to report the most ambitious career plans. Equally ambitious were young Jordanians, Tunisians, and students from Azerbaijan, whose economies rank lower on the Gini index. At the other end of the inequality spectrum, the average occupational expectations of students in the Scandinavian nations, Japan, the Czech Republic, and Slovakia were significantly more modest in terms of their status.

However, the average status of future employment is not predicted perfectly by overall levels of economic inequality, and we note a number of outliers. First, the countries in which students are more modest in their occupational ambitions than their level of inequality would indicate are developed nations with highly differentiated education systems, namely Germany, Austria, and particularly Switzerland, which is an outlier in Fig. 1. Conversely, students in Iceland, Bulgaria, and Kyrgyzstan report much higher expectations than what is implied by the Gini income differentials in their countries. Overall, despite the identified outliers, the relationship between within-country income inequality and levels of occupational expectations appears positive and approximately linear.

In addition to the mean estimates in Fig. 1, and those found in Table A1, Table 1 provides more detail about the distribution of occupational

Fig. 1. Students' Occupational Expectations in 50 Nations in 2006. Countries are Ranked from the Lowest to the Highest Economic Inequality Measured by Gini Coefficients. Expected occupation is plotted in ISEI scores (45–80) with a linear fit; Gini coefficients range from 25 (low inequality) to 59 (high inequality). Source: OECD (2008b).


Table 1. Occupational Expectations (Quintile Values), Economic and Education System Characteristics of Countries.

Country            P20  P40  P60  P80       N  Aid  Enrol  Types  Tert  GDP/Serv  Gini  Growth
Colombia            56   69   71   85   4,040    1     55      3     8      0.43    59     0.8
Brazil              51   65   71   85   7,477    1     78      2     4      0.43    57     2.0
Chile               49   65   69   83   4,217    1     85      3    23      0.48    55     4.7
Argentina           50   65   70   82   3,663    1     79      3    12      0.59    51     2.3
Mexico              54   69   71   78  24,124    1     67      3    18      0.46    46     5.3
Uruguay             52   68   70   83   3,867    1     66      2     8      0.58    45     2.1
Turkey              53   69   69   77   3,999    1     66      3    14      0.34    44     7.9
Hong Kong           45   52   66   71   3,820    0     76      1     6      0.94    43     8.7
Thailand            43   51   69   74   4,147    0     71      2    10      0.27    42     6.0
United States       49   60   69   83   4,665    0     89      1    31      1.00    41     4.5
Russia              48   64   70   78   4,354    0     84      1    24      0.42    40     2.3
Tunisia             54   69   70   85   3,897    1     64      3    12      0.35    40     6.4
Jordan              56   69   69   80   4,761    1     81      1     6      0.51    39     1.7
Portugal            45   56   69   77   4,052    0     82      3    17      0.52    39     0.9
Latvia              34   53   69   71   3,517    0     98      1    21      0.45    38     7.0
Azerbaijan          54   70   70   85   3,346    1     81      1    23      0.33    37     6.2
Italy               43   54   69   74  18,357    0     92      3    12      0.69    36     3.5
Lithuania           48   54   69   74   3,496    0     95      2    20      0.42    36     5.1
United Kingdom      37   52   64   71  11,229    0     95      1    29      0.88    36     5.3
Estonia             32   54   69   71   3,597    0     90      2    26      0.46    36     1.6
Spain               42   57   69   73  14,447    0     93      1    25      0.63    35     2.1
Poland              38   54   69   71   4,385    0     90      1    25      0.42    35     5.6
Greece              50   56   69   71   3,566    0     87      2    18      0.61    34     7.2
Indonesia           45   54   69   77   7,905    1     56      2     2      0.26    34     1.1
Ireland             37   54   67   71   3,797    0     86      4    27      0.83    34     5.1
Taiwan              43   54   68   71   7,179    1     94      1    17      0.39    34     2.5
New Zealand         39   53   65   71   3,695    0     91      1    18      0.65    34     2.4
Switzerland         33   40   55   69   9,793    0     82      4    19      0.95    34     3.4
Belgium             34   51   66   71   6,995    0     86      4    24      0.79    33     3.1
France              38   51   61   70   3,565    0     95      3    22      0.78    33     3.0
Canada              43   54   69   77  18,466    0     99      1    24      0.82    33     0.4
Korea (ROK)         50   56   69   71   4,773    0     90      3    33      0.60    32     6.9
Romania             37   52   69   71   4,549    0     81      2    13      0.27    31     6.7
Netherlands         38   51   59   69   4,229    0     89      4    34      0.81    31     3.8
Kyrgyzstan          50   65   77   88   3,780    1     82      3    15      0.29    30     5.3
Australia           37   53   65   71  11,416    0     86      1    23      0.80    29     2.1
Bulgaria            50   65   71   85   3,150    0     90      1    28      0.39    29    13.5
Austria             34   44   54   69   3,291    0     85      4     7      0.77    29     4.0
Croatia             37   51   60   71   3,426    1     86      2    16      0.43    29     1.4
Slovenia            43   54   67   74   5,026    1     92      4    26      0.50    28     6.1
Germany             34   45   55   70   3,237    0     99      4    14      0.75    28     4.0
Finland             30   43   54   70   3,522    0     94      1    26      0.80    27     3.6
Hungary             33   50   64   71   3,047    0     91      3    20      0.48    27     3.0
Norway              34   51   65   73   3,342    0     95      1    41      1.04    26     4.2
Slovakia            34   52   65   71   3,538    0     87      5    14      0.45    26     4.2
Czech Republic      34   50   54   71   3,999    0     87      5    13      0.45    25     3.5
Iceland             34   61   69   77   2,899    0     88      1    24      0.93    25     5.8
Sweden              34   49   56   69   3,587    0     99      1    26      0.90    25     3.9
Japan               38   45   68   69   3,908    0    100      2    29      0.95    25     4.9
Denmark             34   50   61   71   3,510    0     91      1    12      0.89    25     3.1

Column key: P20–P80 = 20th, 40th, 60th, and 80th percentiles of expected occupation (ISEI), PISA 2006 (a); N = number of students, PISA 2006 (a); Aid = recipient of aid to education (b); Enrol = secondary school enrolment rate, net (b); Types = number of programs or school types available to 15-year-olds (b); Tert = percentage of 25–34-year-olds with tertiary education (b); GDP/Serv = index of GDP per capita and service sector employment, ratio to the USA (b); Gini = Gini index of economic inequality (b); Growth = growth of the service sector 1997–2004 (b).

a. All estimates are weighted with student population weight.
b. Refer to appendix for information about data sources.



JOANNA SIKORA AND LAWRENCE J. SAHA

ambition among students in particular countries. We compare quintile values in all surveys in Table 1, in which countries have been sorted by the descending value of the Gini coefficient. The pattern that emerges can best be described as approximately equal distances between the 20th and the 80th percentiles of the distributions: in most countries, this distance is approximately 35 ISEI points. However, in poorer countries characterized by more economic inequality, students' preferences are focused to a greater extent on professional destinations. If we recall that ISEI scores of 65 and higher denote mostly professional jobs, the distribution of preferences in Brazil, which ranges from 51 to 85, indicates much higher expected status than the distributions of Germany, Sweden, Norway, and Finland, with quintile values between the mid-30s and the low 70s. The plans of students in nations where school participation rates are high and inequality is lower are less concentrated on highly skilled professional destinations.

This opens the possibility that the differences between students are due primarily to the fact that the populations of students in these countries are not comparable in terms of the representativeness of relevant student age groups. If we make the plausible assumption that those students who leave the school system before the PISA data are collected come from lower SES backgrounds or are less academically successful, then the PISA data may represent two different populations. In countries where all adolescents progress to upper secondary education, we have a good representation of the whole relevant age group, but in other countries, where not all youth are in secondary school, we have information only from elite students. To gain a better insight into this possibility, we estimate our multivariate models both for all students and for the elite students. Let us begin with the presentation of the results for all the students who gave valid responses.
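The quintile cut points reported in Table 1 can be reproduced from raw expectation data along the following lines. This is a sketch only: `isei_scores` stands for a hypothetical list of students' expected-occupation ISEI scores, and the survey weights used in the published estimates are ignored here.

```python
import statistics

def expectation_quintiles(isei_scores):
    """20th, 40th, 60th, and 80th percentile cut points (quintile boundaries)."""
    return statistics.quantiles(isei_scores, n=5, method="inclusive")

def professional_share(isei_scores, threshold=65):
    """Share of students expecting jobs at ISEI >= threshold
    (scores of 65 and higher denote mostly professional jobs)."""
    return sum(s >= threshold for s in isei_scores) / len(isei_scores)

# Hypothetical demonstration data
scores = list(range(1, 101))
quintiles = expectation_quintiles(scores)   # four ascending cut points
share = professional_share(scores)          # fraction at professional level
```

`statistics.quantiles` with `n=5` returns the four interior cut points, which is exactly the P20–P80 layout of Table 1.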

MULTIVARIATE ANALYSES

As our first hypothesis predicted, countries which receive international aid to expand and improve their education systems have students whose career ambitions are, on average, significantly higher than elsewhere (Table 2). This is the case when we control for various features of educational systems, and, importantly, even after we control for the country's level of GDP per capita, service sector employment, inequality (indicated by the Gini index), and the rate of expansion of employment in the service sector.


Table 2. Occupational Expectations: Total Sample.

                                                 Model 1         Model 2         Model 3
Fixed effects
Country characteristics
  Country receives education-related aid         8.94 (1.93)     –               7.20 (1.84)
  Secondary school net enrollment rate           0.42 (0.10)     –               0.28 (0.10)
  Number of school types available to
    15-year-olds                                 0.96 (0.68)     –               0.82 (0.55)
  Percentage of 25–34-year-olds with
    tertiary education                           0.03 (0.12)     –               0.05 (0.08)
  Index of GDP per capita and service
    employment ratio to USA                      –               11.90 (3.21)    20.06 (3.78)
  Gini coefficient                               –               0.51 (0.11)     0.14 (0.12)
  Growth of the service sector 1997–2004         –               0.61 (0.41)     0.84 (0.37)
School characteristics
  Parents' SES averaged by school                1.80 (0.66)     1.80 (0.66)     1.82 (0.66)
  Selective admission based on achievement       1.97 (1.27)     1.95 (1.27)     1.91 (1.27)
  Size of town (millions of inhabitants)         0.27 (0.74)     0.28 (0.74)     0.26 (0.74)
Individual characteristics
  Male                                           1.32 (0.55)     1.32 (0.55)     1.32 (0.55)
  Parents' SES (education and ISEI)              2.01 (0.26)     2.01 (0.26)     2.01 (0.26)
  Reading score                                  0.06 (0.00)     0.06 (0.00)     0.06 (0.01)
  Nonnational language spoken at home            4.88 (1.00)     4.88 (1.00)     4.88 (1.00)
  Vocationally oriented program                  4.98 (1.06)     4.98 (1.06)     4.98 (1.06)
  (Constant)                                     59.02 (0.80)    58.79 (0.85)    58.78 (0.61)
Random effects
  Variance at country level                      28.0 (10%)      33.1 (11%)      18.1 (7%)
  Variance at school level                       127.0 (44%)     127.0 (44%)     127.0 (46%)
  Variance at student level                      130.9 (46%)     130.9 (45%)     130.9 (47%)
  Explained variance at country level            21%             6%              49%
  Explained variance at school level             24%             24%             24%
  Explained variance at student level            8%              8%              8%
  Percent total explained variance               17%             16%             20%
Number of countries                              49              49              49
Number of schools                                13,580          13,580          13,580
Number of students                               281,981         281,981         281,981

Standard errors in parentheses; – indicates that the variable is not included in the model. Statistically different from zero at p = 0.01; statistically different from zero at p = 0.05. All analyses weighted with student population weights adjusted so that each country contributes equally to the analysis.
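The weighting scheme described in the table note, student population weights rescaled so that each country contributes equally to the pooled analysis, can be sketched as follows. The record structure and field names here are hypothetical, not the PISA variable names.

```python
from collections import defaultdict

def equalize_country_weights(records, target=1000.0):
    """Rescale student weights so that every country's weights sum to `target`,
    making each country contribute equally to the pooled analysis."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["country"]] += rec["weight"]
    return [dict(rec, weight=rec["weight"] * target / totals[rec["country"]])
            for rec in records]

# Illustrative records: without rescaling, country B would dominate.
records = [{"country": "A", "weight": 2.0},
           {"country": "A", "weight": 3.0},
           {"country": "B", "weight": 50.0}]
adjusted = equalize_country_weights(records)
```

Within each country the relative weights of students are preserved; only the country totals are equalized.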

Receiving aid can be seen as a form of connectedness to the international educational agencies and thus to the ideological packages which they promote. Yet, aid is given primarily to developing countries, so the effects of economic conditions must be controlled for, as they are in Model 3, if the reception of aid is to be conceptualized as an indicator of a strong link to global educational ideology.

In Model 1, the coefficient for participation rates in secondary education is negative, which indicates that in countries where some youth are no longer at school at 15 years of age, the youth remaining at school expect particularly high levels of occupational attainment. Yet, the difference between more and less selective secondary education systems does not render all other country-level characteristics irrelevant. In Model 1, the association between average ambition and highly differentiated education systems, denoted by the number of distinct school programs available to 15-year-olds, is negative although not statistically significant. Nevertheless, the direction is consistent with findings in other studies of educational expectations (McDaniel, 2010), although we find no strong evidence in support of our third hypothesis. Finally, the prevalence of tertiary education has no discernible effects.

Model 2 considers economic characteristics, that is, countries' level of affluence indicated by the 2004 GDP per capita, combined with the 2004 proportion of the labor force employed in services, economic inequality, and the rate of growth of the service sector. Clearly, it is the youth living in poorer countries, where economic inequality is pervasive, who are particularly keen to enter high-status professional and managerial employment. In contrast to the effects of GDP and service employment, the rate of expansion of the service sector is positively associated with adolescent hopes for professional jobs, as stipulated by our second hypothesis (H2), although this effect is not significant in all models. Nevertheless, the direction of association is consistent with the notion that what fosters high ambitions among youth is less the level of economic development and prevalence of service employment than the rate of growth in the service sector. Obviously, there is a good deal of intercorrelation between these factors; hence, when seven different measures are entered simultaneously into the model, some effects are cancelled out. However, the reception of financial aid for education, which is associated with various forms of transfer of the global model of education, remains an important indicator of ambitious career expectations among 15-year-old students.
The control variables at school level reveal a strong relationship between school parental SES and occupational ambitions. This relationship is significantly stronger than the predictive power of selective admission, which pre-screens students' academic achievement, and of urban residence as indicated by the size of town. Both of these latter influences are positive (not shown in Table 2) until controls for average parents' SES within schools are introduced into the models.

In terms of individual variables, our results are consistent with the findings of previous studies. Girls are more oriented toward higher status employment than boys, and parents' socioeconomic position significantly raises students' ambition (McDaniel, 2010; Marks, 2010; Sikora & Saha, 2009). This effect is separate from the increase provided by academic performance, which is in line with status attainment theory, which predicts separate and complementary effects of parents' background and academic achievement on ambition levels. In other words, some students from lower SES backgrounds might be motivated more by their excellent academic performance, whereas students of high SES origin may need little academic success to aim for professional jobs. In either case, an early entry into a vocationally oriented program reduces plans for higher status employment, independently of the other individual-level variables.

The analyses presented in Table 2 highlight the relatively low proportion of variance attributable to cross-national differences for this dependent variable. About 10% of the variance in occupational plans can be attributed to differences between countries. In contrast, approximately 45% is attributable to differences between schools and another 45% is due to individual-level differences. These findings have implications for policymakers in designing and implementing various vocational and nonvocational educational policies. Policymakers need to understand the relative importance of the particular factors which differentiate youth occupational ambitions, namely the extent to which relative socioeconomic differences between students, compared to relative differences between school communities and countries, facilitate expectations of high-status careers.

Our analysis indicates that social selection processes affect youth career plans in various ways. Adolescents whose parents pass on multiple advantages, who are models of high-status employment, and who provide guidance and assistance in navigating educational systems are the ones who plan for higher status employment. Those students who are successful at school are also more oriented toward the professions. But above and beyond individual differences and school environments, the connectedness to the global ideological package, which promotes equal access to high-quality education for all, fosters highly ambitious career plans in youth from less developed countries with high levels of economic inequality.
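The 10%/45%/45% split quoted above follows directly from the variance components reported in Table 2 (Model 1: 28.0 at the country level, 127.0 at the school level, 130.9 at the student level). A minimal computation:

```python
def variance_shares(country_var, school_var, student_var):
    """Percentage of total variance located at each level of a three-level model."""
    total = country_var + school_var + student_var
    return tuple(round(100 * v / total)
                 for v in (country_var, school_var, student_var))

# Model 1 variance components from Table 2 (country, school, student)
shares = variance_shares(28.0, 127.0, 130.9)
```

These shares are the intraclass correlations of the null-controls model expressed as percentages; the same function applied to Table 3's components reproduces the elite-subsample partition.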

THE RELATIONSHIP BETWEEN INEQUALITY AND CAREER PLANS: A FUNCTION OF SELECTIVITY?

Our argument, that it is the impact of global educational ideology, coupled with relative economic deprivation and rapid economic growth, that stimulates ambitious career plans among youth, needs another sensitivity test. To provide stronger support for the proposition that cross-country differences in our dependent variable cannot be explained away by unequal access to secondary education, we replicate our original analysis on a subsample of the most academically able, that is, elite students. If, by using a subsample drawn according to ability, the original relationship between country characteristics and occupational expectations does not persist, then we must conclude that differences in participation rates are the primary factor generating cross-country variation in these expectations. On the other hand, if the original relationship persists, then we can conclude with more confidence that macroeconomic and macrosocial factors play a role in shaping youth ambition.

We include students in the top 30% of the reading score distribution in countries in which 100% of 15-year-old students are enrolled in high school. In a country like Mexico, where only 67% of the students in the relevant age group are enrolled in secondary school, we take 45% (i.e., 30/67) from the top to counterbalance the possible distortion due to systematic variation in participation rates.

In Table 3, we present a replication of our earlier multivariate analysis, but this time the sample is restricted to the academically most able "elite" students in all countries. Overall, the results for the elite students are reasonably close to the patterns which we found when students with all levels of academic ability were analyzed. The effects of participation rates remain negative and significant, and the association between our measure of connectedness to international educational aid agencies and average expectations persists, although somewhat reduced in terms of point estimates. Thus, we conclude that systematic variation in participation rates does not seem sufficient to explain away country-level variation in student expectations. In fact, country, school, and individual effects are not altered dramatically in the elite subsample analysis.
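The enrollment-adjusted cutoff used to define the elite subsample (30% where everyone is enrolled, 30/67, or roughly 45%, in Mexico) can be written as:

```python
def elite_share(enrollment_rate, base_share=0.30):
    """Fraction of enrolled students to take from the top of the reading score
    distribution, inflated where net enrollment is below 100% so that the
    subsample approximates the same slice of the full age cohort."""
    return min(1.0, base_share / enrollment_rate)

mexico = elite_share(0.67)           # about 0.45, as described in the text
full_enrollment = elite_share(1.0)   # 0.30 where all 15-year-olds are in school
```

The `min(1.0, ...)` cap is a defensive detail for hypothetical enrollment rates below the base share; it does not bind for any country in Table 1.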

Table 3. Occupational Expectations of Top 30% of Readers.

                                                 Model 1         Model 2         Model 3
Fixed effects
Country characteristics
  Country receives education-related aid         6.19 (2.21)     –               4.46 (2.10)
  Secondary school net enrollment rate           0.35 (0.10)     –               0.26 (0.10)
  Number of school types available to
    15-year-olds                                 0.43 (0.50)     –               0.40 (0.47)
  Percentage of 25–34-year-olds with
    tertiary education                           0.08 (0.08)     –               0.09 (0.09)
  Index of GDP per capita and service
    employment ratio to USA                      –               13.53 (4.47)    9.02 (3.40)
  Gini coefficient                               –               0.34 (0.10)     0.10 (0.10)
  Growth of the service sector 1997–2004         –               0.43 (0.34)     0.46 (0.31)
School characteristics
  Parents' SES averaged by school                1.90 (0.58)     1.90 (0.58)     1.93 (0.58)
  Selective admission based on achievement       1.30 (1.20)     1.28 (1.21)     1.23 (1.21)
  Size of town (millions of inhabitants)         1.33# (0.74)    1.34# (0.74)    1.36# (0.74)
Individual characteristics
  Male                                           0.73 (0.46)     0.73 (0.46)     0.73 (0.46)
  Parents' SES (education and ISEI)              1.55 (0.21)     1.55 (0.21)     1.55 (0.21)
  Reading score                                  0.04 (0.01)     0.04 (0.01)     0.04 (0.01)
  Nonnational language spoken at home            3.30 (1.16)     3.30 (1.16)     3.30 (1.16)
  Vocationally oriented program                  6.25 (1.26)     6.26 (1.26)     6.27 (1.26)
  (Constant)                                     65.15 (0.76)    64.77 (0.76)    65.02 (0.64)
Random effects
  Variance at country level                      22.3 (9%)       24.3 (10%)      17.8 (8%)
  Variance at school level                       110.5 (46%)     110.5 (45%)     110.5 (47%)
  Variance at student level                      108.5 (45%)     108.5 (45%)     108.5 (46%)
  Explained variance at country level            26%             19%             41%
  Explained variance at school level             8%              8%              8%
  Explained variance at student level            2%              2%              2%
  Percent total explained variance               8%              7%              9%
Number of countries                              49              49              49
Number of schools                                11,773          11,773          11,773
Number of students                               109,273         109,273         109,273

Standard errors in parentheses; – indicates that the variable is not included in the model. Statistically different from zero at p = 0.01; statistically different from zero at p = 0.05; # statistically different from zero at p = 0.10. All analyses weighted with student population weights adjusted so that each country contributes equally to the analysis.

DISCUSSION

We began this chapter by posing two questions: (1) Can we improve our knowledge about the determinants of youth occupational plans from three-level cross-national comparisons which make possible the identification of global-level effects as compared to school and individual effects; and (2) Can this knowledge inform educational equity and vocational counselling policies which take into account both local and global effects? In this discussion, we propose an affirmative argument in response to both questions.

In this chapter, our focus has been on the effects of a global educational ideology on student occupational expectations in the context of countries' educational systems and economic characteristics. The relevant analytic variables were (1) education-related development aid to recipient countries; (2) the proportion of the relevant age population enrolled in secondary school; (3) the number of program or school types available to 15-year-old students; (4) the proportion of the population, aged between 25 and 34 years, with a tertiary education; (5) the proportion of the population in service occupations, and economic prosperity denoted by GDP per capita; (6) the Gini index, a measure of economic inequality; and (7) the growth in the service sector between 1997 and 2004.
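For replication purposes, the seven country-level indicators enumerated above can be collected into a single record type. This is a sketch: the class and field names are our own, and the sample values are Mexico's entries from Table 1.

```python
from dataclasses import dataclass

@dataclass
class CountryContext:
    receives_education_aid: bool   # (1) education-related development aid recipient
    net_secondary_enrollment: int  # (2) % of the age group enrolled in secondary school
    school_types_at_15: int        # (3) number of programs/school types at age 15
    tertiary_pct_25_34: int        # (4) % of 25-34-year-olds with tertiary education
    gdp_service_index: float       # (5) GDP per capita and service employment, ratio to USA
    gini: int                      # (6) Gini index of economic inequality
    service_growth: float          # (7) growth of the service sector, 1997-2004

# Mexico's row from Table 1
mexico = CountryContext(True, 67, 3, 18, 0.46, 46, 5.3)
```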


Youth in countries which are connected to the international education agencies by the reception of official aid for education tend to form significantly more ambitious plans than their counterparts in developed economies where all 15-year-olds are in secondary education. We find that this relationship holds even after economic prosperity, measured by GDP per capita combined with a measure of service sector employment, is taken into account. We consider the reception of aid, after other economic characteristics of countries have been taken into account, to be an indicator of connectedness to the global educational ideology. Moreover, we find tentative evidence that while the level of economic development is negatively related to the average career ambitions of students, the pace of expansion of the service sector fosters high levels of these ambitions.

This opens up a new opportunity, or indeed an obligation, to raise the bar in educational and youth career policymaking. Taking account of these third-level effects means that policymaking can now consider more systematically the effect of global-level forces in policies designed to be implemented at the nation-state level. However, three issues will affect the future potential of these "context variable" analyses in international achievement studies. First, the increasing complexity of the nature of evidence will challenge straightforward interpretations of multilevel patterns. Second, the measurement of particular "context variables," which are often less carefully developed than literacy and numeracy scores, might fuel validity debates. Finally, potential selectivity issues, which have been and will continue to affect country-level representativeness in PISA, will continue to pose a challenge to data analysts.
These problems will need to be resolved because future PISA surveys will increasingly be used to inform policy relevant not only to achievement levels but also to occupational expectations and thus vocational counselling. In terms of the relative importance of country and global-level factors, we have found that the largest amount of variance explained in our model occurs at the individual and school levels. However, the significant findings at the third level correspond to our argument that there is a global, institutionalized cultural force which affects the ways that young students, across countries, view their future careers. To ensure that our results are not brought about by the smaller proportion of youth in schools in the less affluent societies, we repeated the analysis with a subsample of elite students, the top 30%, and found essentially the same results. In other words, our main findings regarding third-level effects are not simply a function of the lower school participation rates of 15-year-olds in the poorer countries.


Another equally important finding from our study concerns the rapid change in the service sector of economies. In countries where the service sector grew most between 1997 and 2004 (roughly the period during which our students were undergoing a high level of socialization), expectations were higher (see Table 2, Model 3). However, the relationship disappeared in the elite analysis. This suggests that for the sample as a whole, the extent of growth in the service sector acted as a stimulus for occupational expectations. The fact that this relationship did not apply to the elite student subsample may indicate that for them, other factors independent of growth, in particular the global ideology and the low level of development, were more important in driving their ambitions (see Table 3, Model 3).

Although the overall range of countries in the PISA studies remains truncated, and very poor countries are not in the sample, the future expansion of PISA promises a broader representation from nations with lower levels of development and high levels of inequality. For example, so far there is only one African country in the sample and no representation from China or India, two of the most significant Asian nations. As more countries join the PISA project, future studies can comment on the findings reported here within the context of a wider range of variation in countries' social conditions and cultures.

So far, it is clear that large proportions of young people across all the countries are ambitious with respect to their expected occupational attainments, and these levels of ambition are particularly high in countries where education is supported by international organizations and which are experiencing the expansion not only of the education system but also of service employment. Our theoretical perspectives offer plausible interpretations for these findings.
Neo-institutional theory argues that educational expansion throughout the world, based on a standard institutionalized model of schooling, has been created as part of world culture, and that it influences educational systems at the global level. Given that this global model influences education systems, and given the relationship between education and entry into the labor market, it is not unlikely that the occupational expectations of school students are similarly influenced by a world culture of prestigious and desirable occupations, in particular professional-level occupations such as law and medicine. Thus, it is plausible that educational expansion and the spread of global educational ideology are affirmed by, and in turn affirm, a global culture of prestigious occupations which are known to students everywhere.


This neo-institutional interpretation is supported by the persistently rising level of educational expansion at the global level. In advanced societies, this expansion is now occurring at the post-secondary level, whereas in poorer countries it continues to occur at the secondary, and even the primary, levels. The driving forces in this expansion are explicit pressures from outside the poorer countries, such as the World Bank, UNESCO, and other global agencies, as well as internal, within-country forces, such as the expectations of individual country populations. Thus, with the expansion of education comes the expansion of occupational ambitions, which are equally influenced by global culture. Indeed, as we have noted earlier in this chapter, the link between education and occupation is strongest in those countries where the participation of students in the education system is lower.

Finally, the notion that there is an institutionalized culture in which young people in most societies participate complements microsociological explanations of ambitions. When the concept of world culture is combined with theories of youth decision making, we can better understand individual-level decisions.

Probably the most important result, from the perspective of policymaking, which emerges from this research is that the occupational expectations of young people across the PISA countries are high, and they are most likely higher in less affluent settings in part because of a global culture. Thus, various national policies within a specific country regarding both education and manpower planning have to accept that some forces which impact on the education and career orientations of their young people lie outside of their control. Thus, rather than attempt to hinder this global influence for fear of the negative implications of unrealized youth ambitions, policymakers would do best to accept it and try to use it to the advantage of their own policy decisions.
This is particularly the case as recent studies, so far limited to the USA, indicate that high levels of ambition among youth are on balance more beneficial than detrimental, even if the ambitions remain unfulfilled (Reynolds & Baird, 2010). Ambitious occupational plans can be encouraged through national manpower planning programs, the provision of multiple career streams in the educational system, and training for career or vocational counsellors who give direct advice to students. For all of these programs, our data show that the positive association between a high level of resources and optimistic career plans at the individual and school levels may be further strengthened or moderated by cultural amalgams which emerge in particular combinations of educational systems and economic conditions. Policymakers might be advised that these trends should not be seen as infallible evidence which cannot be altered or contextualized to suit the needs of individual countries.

Herein lies the challenge for policymakers. Our study has identified a global phenomenon related to the high occupational ambitions of students, but it remains for policymakers to take this higher-level effect into account in meeting local needs. Only cross-national quantitative data sets such as PISA have the possibility of identifying these higher-level global forces, and this is why we contend that this represents a "new direction" for policymaking, which should be pursued as the availability of these data expands.

NOTE

1. The results from the estimation of two-level random intercept models separately in each country, which illustrate the extent to which the assumption of parallel slopes across countries is justified, are available upon request from the authors.

REFERENCES Alexander, K. L., & Cook, M. A. (1979). The motivational relevance of educational plans: Questioning the conventional wisdom. Social Psychology Quarterly, 42(3), 202–213. Benevot, A. (1997). Institutional approach to the study of education. In: L. J. Saha (Ed.), International encyclopedia of the sociology of education (pp. 340–345). Oxford: Pergamon Press. Brown, D. (Ed.) (2002). Career choice and development (4th ed.). San Francisco: Jossey-Bass. Buchmann, C., & Dalton, B. (2002). Interpersonal influences and educational aspirations in 12 countries: The importance of institutional context. Sociology of Education, 75(2), 99–122. Buchmann, C., & Park, H. (2009). Stratification and the formation of expectations in highlydifferentiated educational systems. Research on Social Stratification and Mobility, 27(4), 245–267. Caro, F. G., & Pihlblad, C. T. (1965). Aspirations and expectations: A re-examination of the bases for social class differences in the occupational orientations of male high school students. Sociology and Social Research, 49, 465–475. Croll, P. (2008). Occupational choice, socio-economic status and educational attainment: A study of the occupational choices and destinations of young people in the British Household Panel Survey. Research Papers in Education, 23(3), 243–268. Desoran, R. A. (1977/1978). Educational aspirations: Individual freedom or social injustice? Interchange, 8(3), 72–87. Dore, R. (1976). The diploma disease: Education, qualification and development. London: Routledge and Kegan Paul. Empey, L. (1956). Social class and occupational aspiration: A comparison of absolute and relative measurement. American Sociological Review, 21(6), 703–709.

112

JOANNA SIKORA AND LAWRENCE J. SAHA

Feliciano, C., & Rumbaut, R. G. (2005). Gendered paths: Educational and occupational expectations and outcomes among adult children of immigrants. Ethnic and Racial Studies, 28(6), 1087–1118.
Ganzeboom, H. B. G., & Treiman, D. J. (1996). Internationally comparable measures of occupational status for the 1988 International Standard Classification of Occupations. Social Science Research, 25, 201–239.
Gottfredson, L. S. (2002). Gottfredson's theory of circumscription, compromise and self-creation. In: D. Brown (Ed.), Career choice and development (4th ed.). San Francisco: Jossey-Bass.
Goyette, K. (2008). College for some to college for all: Social background, occupational expectations, and educational expectations over time. Social Science Research, 37(2), 461–484.
Han, W. S. (1969). Two conflicting themes: Common values versus class differential values. American Sociological Review, 34(October), 679–690.
Helwig, A. A. (2008). From childhood to adulthood: A 15-year longitudinal career development study. The Career Development Quarterly, 57(1), 38.
Holland, J. L. (1997). Making vocational choices: A theory of careers. Odessa, FL: Psychological Assessment Resources.
International Labor Organization. (2010). LABORSTA database. Available at http://laborsta.ilo.org/. Retrieved on August 2, 2010.
Irizarry, R. (1980). Overeducation and unemployment in the third world: The paradoxes of dependent industrialization. Comparative Education Review, 24, 338–352.
Johnson, M. K., & Mortimer, J. T. (2002). Career choice and development from a sociological perspective. In: D. Brown (Ed.), Career choice and development (4th ed.). San Francisco: Jossey-Bass.
Kamens, D. H., & McNeely, C. L. (2010). Globalization and the growth of international educational testing and national assessment. Comparative Education Review, 54(1), 5–25.
Looker, E. D., & McNutt, K. L. (1989). The effect of occupational aspirations on the educational attainments of males and females. Canadian Journal of Education, 14, 352–367.
Marks, G. N. (2010). Meritocracy, modernization and students' occupational expectations: Cross-national evidence. Research in Social Stratification and Mobility, 28(3), 275–289.
McDaniel, A. (2010). Cross-national gender gaps in educational expectations: The influence of national-level gender ideology and educational systems. Comparative Education Review, 54(1), 27–50.
Meyer, J. W., Ramirez, F. O., Rubinson, R., & Boli-Bennett, J. (1977). The world education revolution. Sociology of Education, 50, 242–258.
Mortimer, J. T., & Krüger, H. (2000). Pathways from school to work in Germany and the United States. In: M. T. Hallinan (Ed.), Handbook of the sociology of education (pp. 475–497). New York: Kluwer.
Musgrave, P. M. (1967). Towards a sociological theory of occupational choice. Sociological Review, 15, 33–46.
OECD. (2004). Education at a glance: OECD indicators 2004. Paris: OECD Publishing.
OECD. (2005a). PISA 2003 data analysis manual: SPSS users. Available at http://www.oecd.org/pisa
OECD. (2005b). Education at a glance: OECD indicators 2005. Paris: OECD Publishing.

New Directions in National Education Policymaking: SCPIAS


OECD. (2008a). PISA 2006: Science competencies for tomorrow's world. Annex A9: SPSS syntax to prepare data files for multilevel regression analysis. Available at http://www.oecd.org/dataoecd/59/32/39730315.pdf
OECD. (2008b). PISA 2006: Machine readable data file. Available at http://pisa2006.acer.edu.au/downloads.php. Retrieved on January 10, 2008.
Psacharopoulos, G., & Patrinos, H. A. (2004). Returns to investment in education: A further update. Education Economics, 12(2), 111–134.
Rajewski, J. I. (1996). Occupational aspirations and early career choice patterns of adolescents with and without learning disabilities. Learning Disability Quarterly, 19(2), 99–116.
Reynolds, J. R., & Baird, C. L. (2010). Is there a downside to shooting for the stars? Unrealized educational expectations and symptoms of depression. American Sociological Review, 75(1), 151–172.
Rindfuss, R. R., Cooksey, E. C., & Sutterlin, R. L. (1999). Young adult occupational achievement: Early expectations versus behavioral reality. Work and Occupations, 26(2), 220–263.
Rutkowski, L., Gonzalez, E., Joncas, M., & von Davier, M. (2010). International large-scale assessment data. Educational Researcher, 39(2), 142–151.
Saha, L. J. (1983). Gender, school attainment and occupational plans: Determinants of aspirations and expectations among Australian urban school leavers. Australian Journal of Education, 26(3), 247–265.
Saha, L. J. (1997). Aspirations and expectations of students. In: L. J. Saha (Ed.), International encyclopedia of the sociology of education (pp. 512–517). Oxford: Pergamon Press.
Saha, L. J., & Sikora, J. (2008). The career aspirations and expectations of school students: From individual to global effects. Education and Society, 26(2), 5–22.
Schofer, E., & Meyer, J. W. (2005). The worldwide expansion of higher education in the twentieth century. American Sociological Review, 70(6), 898–920.
Sikora, J., & Saha, L. J. (2009). Gender and professional career plans of high school students in comparative perspective. Educational Research and Evaluation, 15(4), 387–405.
Sikora, J., & Saha, L. J. (2010). Lost talent? The occupational expectations and attainments of young Australians. Adelaide: National Centre for Vocational Education Research.
Suda, Z. (1979). Universal growth of education aspirations and the over-qualification problem: Conclusions from a comparative data analysis. European Journal of Education, 14, 113–164.
Turner, R. (1964). The social context of ambition. San Francisco: Chandler Publishing Company.
UNESCO. (2007). Education for all global monitoring report 2008. Oxford: UNESCO Publishing.
UNESCO. (2010). UNESCO Institute for Statistics. Available at http://data.un.org/. Retrieved on August 1, 2010.
White, P. (2007). Education and career choice: A new model of decision making. New York: Palgrave.
Wiseman, A. W., & Baker, D. P. (2006). The symbiotic relationship between empirical comparative research on education and neo-institutional theory. In: D. P. Baker & A. W. Wiseman (Eds), The impact of comparative education research on institutional theory (pp. 1–26). Amsterdam; Oxford: Elsevier JAI.
World Bank. (2010). World development indicators. Available at http://ddp-ext.worldbank.org. Retrieved on August 1, 2010.



APPENDIX. DETAILS OF MEASUREMENT

Country Characteristics

Whenever possible, the reference year for all country-level statistics is 2004. If data for 2004 were not available, we used the data for 2005. If there were also no data for 2005, we used the point in time that was closest to 2004.

The information about reception of aid for education in 2004 was obtained from UNESCO's "Education for All Global Monitoring Report 2008" (UNESCO, 2007, p. 382). Countries which were listed as recipients of aid were coded as 1 and all the other countries were coded as 0.

Secondary school net enrolment rate was sourced from UNESCO's Institute for Statistics online database (UNESCO, 2010). Whenever possible, we used 2004 as the reference year, but the figure for Canada is the 2001 figure. Data for the Czech Republic, Austria, Taiwan, Latvia, the Russian Federation, and the Slovak Republic were obtained from the UNICEF website (http://www.unicef.org/infobycountry) or from country-specific national statistical offices.

Number of school types or distinct school programs available to 15-year-olds was obtained from OECD's "Education at a Glance" (OECD, 2005b, p. 400). For European countries for which the information was not available in this publication, the information from Eurybase: Descriptions of Education Systems (http://eacea.ec.europa.eu/education/eurydice/index_en.php) was used. Finally, for those countries for which the information was not available in either source, we used country-specific descriptions of educational systems, the details of which are available upon request.

Percentage of 25- to 34-year-olds with tertiary education was obtained from the International Labor Office's LABORSTA online Labor Statistics Database (International Labor Organization, 2010). For Colombia, Jordan, Spain, Thailand, Tunisia, and Uruguay, we used the 2002 data from Table A3.3 in "Education at a Glance" (OECD, 2004).
Index of GDP per capita and service employment ratio to USA was created in three steps. First, the data for 2004 for GDP per capita in constant dollars and for the proportion of the labor force employed in services were obtained from World Development Indicators online (World Bank, 2010). Second, the figures for both indicators for each country were divided by the corresponding USA figures, so that the USA value on both GDP per capita and the proportion of the labor force employed in the service sector equalled 1. Finally, the two ratios for each country were averaged, resulting in the index value.
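The three steps above are simple arithmetic. A minimal sketch follows; the GDP and service-sector figures are invented placeholders, not the actual World Development Indicators values:

```python
# Sketch of the three-step index construction described above.
# All numeric values are illustrative, not World Development Indicators data.
gdp_per_capita = {"USA": 41_000, "Poland": 13_000}   # constant dollars, 2004 (invented)
service_share = {"USA": 0.78, "Poland": 0.53}        # share of labor force in services (invented)

def gdp_service_index(country):
    """Average of a country's GDP-per-capita ratio and service-employment
    ratio to the USA, so that the USA scores exactly 1."""
    gdp_ratio = gdp_per_capita[country] / gdp_per_capita["USA"]
    service_ratio = service_share[country] / service_share["USA"]
    return (gdp_ratio + service_ratio) / 2
```

By construction, the USA anchors the scale at 1, and countries poorer and less service-oriented than the USA fall between 0 and 1.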



The values of the Gini coefficients for each country were obtained from the World Development Indicators online (World Bank, 2010). Where data for 2004 were unavailable, we used data from the closest point in time and, in several cases, we used data from national statistical offices, the details of which are available upon request.

Growth of the service sector 1997–2004 was estimated for each country based on the data from World Development Indicators online (World Bank, 2010). For Colombia, Taiwan, and Tunisia, we used the information from their national statistical offices.
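The closest-point-in-time fallback rule used for the Gini data can be stated precisely. A small sketch, with hypothetical year/value pairs:

```python
# Pick the observation nearest the 2004 reference year: the fallback rule
# described above. The year/value pairs used in any call are hypothetical.
def closest_to_reference(observations, reference=2004):
    """observations: dict mapping year -> value. Returns the value whose
    year is nearest the reference year (the earlier year wins ties)."""
    year = min(observations, key=lambda y: (abs(y - reference), y))
    return observations[year]
```

Sorting by the pair (distance, year) makes the tie-breaking explicit rather than dependent on dictionary order.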


Table A1. Mean Occupational Expectations and Missing Data in 50 Countries in 2006.

Nation            ISEI mean        N   Invalid (%)   Uncodable (%)   Missing (%)   Top 30% (%)
Azerbaijan             68.7    3,346            35              3            32            24
Argentina              64.0    3,663            20              8            12             9
Australia              56.9  114,616            19              8            11            13
Austria                51.9    3,291            28             18            10            31
Belgium                55.3    6,995            23             11            12            21
Brazil                 66.2    7,477            19             10             9            19
Bulgaria               65.4    3,150            30             20            10            28
Canada                 61.2   18,466            18             13             5            13
Chile                  64.1    4,217            16             10             6            14
Colombia               69.1    4,040            10              2             8             7
Croatia                55.0    3,426            34             22            12            34
Czech Republic         52.5    3,999            27             18             9            25
Denmark                54.5    3,510            43              9            33            25
Estonia                57.3    3,597            26             12            14            21
Finland                51.3    3,522            24             11            13            25
France                 54.9    3,565            26             12            13            20
Germany                52.1    3,237            32             17            15            25
Greece                 61.0    3,566            18              7            10            18
Hong Kong              58.5    3,820            24             14            10            14
Hungary                53.3    3,047            23             12            11            26
Iceland                60.4    2,899            24             11            13            17
Indonesia              61.5    7,905            25              9            17            22
Ireland                57.6    3,797            17             10             7            14
Italy                  59.9   18,357            17              9             7            14
Japan                  55.1    3,908            38             20            18            25
Jordan                 67.5    4,761            27              5            22            13
Korea (ROK)            60.8    4,773            11              7             4             4
Kyrgyzstan             67.2    3,780            36              5            31            21
Latvia                 57.9    3,517            35             20            14            17
Lithuania              60.3    3,496            26             14            12            18
Mexico                 67.4   24,124            24             10            14            14
Netherlands            53.9    4,229            14             11             3            15
New Zealand            57.6    3,695            22             10            12            19
Norway                 56.0    3,342            26             13            13            23
Poland                 58.9    4,385            24             10            14            11
Portugal               61.7    4,052            19             12             7            20
Romania                58.3    4,549            11              7             4             5
Russia                 62.6    4,354            25             13            12            19
Slovakia               56.8    3,538            28             20             8            17
Slovenia               59.2    5,026            24              8            16            20
Spain                  60.2   14,447            25              9            16            16
Sweden                 53.4    3,587            20             14             7            21
Switzerland            49.8    9,793            23             14             9            19
Taiwan                 58.4    7,179            19             13             5            12
Thailand               60.3    4,147            24             11            14            16
Tunisia                68.6    3,897            16              3            13             8
Turkey                 66.3    3,999            19              0            19            10
United Kingdom         56.3   11,229            34             30             4            13
United States          62.8    4,665            20              7            13            17
Uruguay                65.3    3,867            20             12             8            14
Total                         389,847

Notes: "ISEI mean" is the mean occupational expectation (ISEI) in 2006; "Invalid" is the total of invalid answers; "Top 30%" is missing and uncodable answers among the top 30% of students.

Table A2. Correlations Between Country Characteristics (2004).

1. Country receives education-related aid
2. Secondary school net enrolment rate
3. Number of school types available to 15-year-olds
4. Percentage of 25–34-year-olds with tertiary education
5. Percent of labor force employed in services
6. GDP per capita
7. Index of GDP and service employment
8. Gini coefficient
9. Growth of the service sector 1997–2004

[Lower-triangular matrix of pairwise correlations among variables 1–9; the cell values are not legible in this reproduction.]

Statistically different from zero at p = 0.01; statistically different from zero at p = 0.05.

ANALYZING TURKEY'S DATA FROM TIMSS 2007 TO INVESTIGATE REGIONAL DISPARITIES IN EIGHTH GRADE SCIENCE ACHIEVEMENT

Ebru Erberber

ABSTRACT

Improving Turkey's low level of education quality and achieving equity in quality education across its seven regions continue to be a monumental challenge. The purpose of this study was to document the extent of Turkey's regional differences in science achievement at the eighth grade and to investigate factors associated with these differences. Identifying the factors influencing Turkey's regional inequalities in student learning is crucial for establishing policies that will help raise the educational performance at the national level as well as close regional gaps. A series of hierarchical linear modeling (HLM) analyses were performed at two levels (the school/class level and student level) using Turkey's nationally representative data from the Trends in International Mathematics and Science Study (TIMSS) 2007. Findings suggested that attempts to increase Turkish students' achievement and close the achievement gaps between regions should target the students in the undeveloped regions, particularly in Southeastern and Eastern Anatolia. Designing interventions to improve competency in Turkish and to compensate for the shortcomings of insufficient parental education, limited home educational resources, poor school climate for academic achievement, and inadequate instructional equipment and facilities might be expected to close the regional achievement gaps as well as raise the overall achievement level in Turkey. Using TIMSS data, this study provided an example of a methodology that may be employed to describe how student achievement is distributed among various subpopulations of national interest and to investigate the factors that contribute to differences in the achievement distribution.

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 119–142
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013008

INTRODUCTION

Description of the Problem

Despite being among the 20 largest economies of the world, Turkey has yet to provide its 70 million citizens with high-quality living standards. The United Nations Development Programme's Human Development Report 2007/2008 (UNDP, 2007) ranked Turkey 84th among 177 countries and placed it among the "medium level" countries in terms of the Human Development Index (HDI). By contrast, the 27 member states of the European Union (EU), in which Turkey seeks membership by 2013, are all considered "high level" HDI countries. When education, health, and economic components of the index are examined separately, it appears that Turkey's medium HDI level results mainly from its low level on the education index (ranking 104th) compared with its life expectancy index and GDP index (ranking 84th and 67th, respectively).

Issues of access, quality, and equity in education have been a challenge for Turkey since its founding less than a century ago in 1923. In academic year 1923–1924, the ministry educated less than a half million primary school students. In academic year 2007–2008, the number of students in primary education approached 11 million – the total population size of the Republic when it was founded. Remarkable progress has been made, particularly in the last decade, in terms of increasing access to primary schooling. Since compulsory primary education was extended from 5 to 8 years in 1997, the net enrollment ratio has increased from 85% to 97% in 2007 (MoNE, 2008). However, Turkey still faces enrollment deficits before and after the years of compulsory schooling. The gross schooling rate for preschool was only 20% in 2005 (DPT, 2006), and the net enrollment ratio for secondary school was 59% in 2007 (MoNE, 2008).

Despite the substantial progress in increasing access to primary schooling, improving the quality of primary education is still a source of considerable concern in Turkey. Low student achievement results in both international and national assessments show that improving student outcomes remains a challenge for Turkey. Most recently, eighth grade students in Turkey had significantly lower average achievement in mathematics than their peers in all of the 12 EU member countries that were among the 49 countries participating in the Trends in International Mathematics and Science Study (TIMSS) 2007 (Mullis, Martin, & Foy, 2008), and although Turkey's average science achievement was similar to three EU countries (Romania, Malta, and Cyprus), it was lower than in all the other EU countries (Martin, Mullis, & Foy, 2008).

The TIMSS 2007 results showed that even at the eighth grade, with nearly all students still enrolled in school, high-quality education remains a privilege provided only to a small fraction of Turkish students. Only 5% of eighth grade students reached the TIMSS 2007 Advanced International Benchmark in mathematics (Mullis et al., 2008) and 3% in science (Martin et al., 2008), demonstrating competency in complex concepts and reasoning tasks. Moreover, an alarming 41% of Turkish students performed below the Low International Benchmark in mathematics, signifying they did not demonstrate a grasp of even basic computational skills. In science, 29% performed below the Low International Benchmark, indicating little knowledge of even basic facts from the life and physical sciences. These results suggest that in addition to the challenge of maintaining high enrollment in grades 1–8, Turkey also faces the enormous challenge of providing quality education to all children in the nation.
The Organization for Economic Cooperation and Development's (OECD) Programme for International Student Assessment (PISA) 2006 results presented a similarly dismal picture. Among the 30 OECD countries in the study, Turkey ranked at the bottom, above only Mexico. Almost half (47%) of the 15-year-old students in Turkey performed at or below the lowest proficiency level of the science literacy scale (OECD, 2007a).

The grim picture of the results from international assessments corresponds to national test results. In 2005, the Ministry of National Education sampled primary school students, nationwide, at grades 4 through 8, and tested their achievement in four primary subjects – mathematics, science, social studies, and Turkish. The results of the test, the OBBS (Ogrenci Basari Belirleme Sinavi [Student Achievement Determination Test]), revealed that the level of primary curriculum attainment was unsatisfactory across the country. For all grades and subjects, except Turkish, the average score for correct answers was 50% or less (MoNE, 2007). The 2005 results showed no change from 2002, when the OBBS was first conducted, confirming the disappointing picture revealed by the initial test results.

Beyond the overarching concerns of access and quality in education, another aspect of Turkey's educational challenge relates to the persistent disparities in human development among its seven geographic regions (Fig. 1). The urbanized west of Turkey is socioeconomically more developed than the rural east, and apparently better educated. A recent study revealed that human development disparities among the regions of the country are wide and continue to exist (Dincer, Ozaslan, & Kavasoglu, 2003).

Fig. 1. Geographic Regions of Turkey.

Fig. 2 displays the Socioeconomic Development Index (SEDI) scores that Dincer and his colleagues calculated for each region and compared to the country mean of zero. The Marmara region, which includes Istanbul – the demographic and economic heart of the country – had the highest index score and is the most developed region of the country. The Aegean and Central Anatolia (including the capital city of Ankara) regions are the next most developed regions, with very close SEDI scores that are above the national average. The index score for the Mediterranean region is at the country's average, and it is below average for the Black Sea, Southeastern Anatolia, and Eastern Anatolia.

Fig. 2. SEDI Rankings by Regions of Turkey. Source: Data are from Dincer et al. (2003).

Socioeconomic disparities between the western and eastern parts of Turkey gained more attention after the start of membership negotiations with the EU in 2005. One of the important goals of the EU is to reduce regional gaps to achieve economic and social cohesion not only within the EU but also in its territories (Loewendahl Ertugal, 2005). OECD's economic survey of Turkey (2006) suggested that "improved education quality in the poorest regions would contribute to reducing these [regional] disparities while also encouraging faster growth of the economy as a whole" (p. 158). OECD's latest review of educational policies in the country (2007b) also highlighted the striking socioeconomic disparities among Turkey's regions. Their report recommended that Turkey make education a key instrument for socioeconomic cohesion. To reach this goal, the report also recommended that Turkey strive toward providing equal educational opportunities for all people, establishing priorities for efficient use of existing resources, and continuing to narrow socioeconomic gaps among regions.

Purpose of the Study

The purpose of this study was to determine the extent to which science achievement inequalities exist across the seven regions of Turkey and to explore potential reasons for why such educational inequalities might exist. The research questions investigated in this study were as follows:

1. What is the science achievement profile of eighth grade Turkish students across the seven geographic regions of Turkey?
2. What student background and school context factors contribute to regional disparities in science achievement in Turkey?

Studies using nationally representative samples of students are scarce in Turkey, mainly because of a lack of reliable and valid data. In TIMSS 2007, Turkey's student sample was nationally representative and stratified by region. This provided a unique opportunity to conduct research on the current picture of regional achievement disparities in Turkey. An in-depth understanding of regional inequalities in relation to educational outcomes is central to establishing strategies for reducing regional gaps. In particular, identifying the factors influencing regional inequalities in student learning is crucial for formulating policies that will help raise the educational performance at the national level as well as close regional gaps, most notably between the western and eastern regions. Moreover, the findings of this study may also be helpful for policymakers in other countries facing quality and equity issues similar to the issues in the Turkish education system.

METHODOLOGY

Sampling and Instruments

This study used Turkey's TIMSS 2007 eighth grade science achievement data as well as the background data about the home and school contexts in which learning takes place. Characteristics of Turkey's sample in TIMSS 2007 allowed for the investigation of regional disparities in student outcomes. The TIMSS sample design is a two-stage stratified cluster sample design. A sample of schools is selected in the first stage. The second stage consists of a random sample of one or more intact classrooms from these sampled schools. In Turkey, at the first stage of the sampling, 146 schools were sampled with probability proportional to size (PPS) but also stratified by geographic region (Table 1). At the second stage of the sampling, one classroom was selected from each sampled school. As a result, a nationally representative sample of 4,498 Turkish eighth grade students participated in TIMSS 2007 (Joncas, 2008).
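First-stage PPS selection is commonly implemented with the systematic method (an assumption here; the exact TIMSS procedure is documented in its technical report). A rough sketch, with invented school identifiers and enrolment sizes rather than the Turkish sampling frame:

```python
import random

def pps_systematic_sample(schools, n_sample, seed=0):
    """Systematic PPS selection: lay n_sample equally spaced points over the
    cumulated size measures after a random start, and take the school whose
    cumulative range contains each point. Larger schools are more likely to
    be hit (a school larger than the interval can even be hit twice)."""
    total = sum(size for _, size in schools)
    interval = total / n_sample
    start = random.Random(seed).uniform(0, interval)
    points = [start + i * interval for i in range(n_sample)]

    chosen, cum, idx = [], 0.0, 0
    for school_id, size in schools:
        cum += size
        # consume every sampling point that falls inside this school's range
        while idx < n_sample and points[idx] < cum:
            chosen.append(school_id)
            idx += 1
    return chosen

# Hypothetical sampling frame for one stratum (region): (school, enrolment).
frame = [("A", 120), ("B", 950), ("C", 60), ("D", 400), ("E", 310)]
sampled = pps_systematic_sample(frame, n_sample=2, seed=42)
```

Stratification by region would simply mean running this draw separately within each region's frame, with the regional allocation fixed in advance.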



Table 1. Turkey's Sample Allocation in TIMSS 2007 at the Eighth Grade.

Region                  Number of Schools   Number of Students   Average Age (SE)
Marmara                                39                1,362         14.0 (0.0)
Aegean                                 18                  505         14.0 (0.0)
Central Anatolia                       26                  739         14.0 (0.1)
Mediterranean                          18                  527         14.1 (0.1)
Black Sea                              16                  442         14.0 (0.1)
Southeastern Anatolia                  17                  592         14.3 (0.1)
Eastern Anatolia                       12                  331         14.1 (0.2)
Total                                 146                4,498         14.0 (0.0)

Source: "Number of Schools" from Appendix B of the TIMSS 2007 Technical Report (Olson, Martin, & Mullis, 2008). For computing average age, IEA's International Database Analyzer (IEA, 2008) was used.

As explained in the TIMSS 2007 Assessment Frameworks (Mullis et al., 2005), the science test questions in the TIMSS 2007 assessment were developed to measure both content (biology, chemistry, physics, and earth science) and cognitive dimensions (knowing, applying, and reasoning). Almost all of the content assessed by the TIMSS 2007 science test was included in Turkey's science curriculum and intended to be taught to all students in the country by the end of eighth grade (Martin et al., 2008).

In addition to assessing students' performance in mathematics and science, TIMSS 2007 administered background questionnaires to collect information from students, teachers, and school principals about contextual factors that may affect learning outcomes. This contextual information is crucial to understanding the achievement results, because it sheds light on the factors likely to shape student outcomes.

DATA ANALYSIS

The aim of this study was twofold: first, to examine the extent of Turkey's regional differences in science achievement, and second, to investigate factors associated with the differences. Consequently, data analysis in this study consisted of two phases. In the first phase, regional differences in student achievement were investigated by computing average science achievement for each region as well as for Turkey. For this computation, the IEA's International Database (IDB) Analyzer (IEA, 2008) was used. In the second phase, the TIMSS 2007 background variables associated with achievement differences across regions were identified via an exploratory analysis and then examined in a series of statistical models for their contribution to reducing the regional differences in achievement.

In the second phase, background variables that satisfied the following two criteria were included in statistical models of regional achievement:

1. The variable was positively associated with achievement both at the country and the regional level. That is, achievement was higher for students in some categories of the variable than others (e.g., for the parental education variable, students whose parents had university or higher education had higher achievement than students whose parents had lower levels of education).
2. The percentages of students in various categories of the variable differed among regions, particularly between the developed and undeveloped regions. For example, Marmara (and/or other developed regions) had higher percentages of students in the favorable categories of the variable (e.g., parents with a university degree), or Southeastern and Eastern Anatolia had higher percentages of students in the unfavorable categories of the variable (e.g., parents with less than primary education).

As noted earlier, the aim of this study was to develop an explanatory model of Turkey's regional achievement differences and not simply a model of student achievement differences in Turkey. In the case of the latter model, it would have been sufficient to identify those variables that would explain the most variation in student achievement. Even though the first criterion is important and a prerequisite for explaining regional differences in achievement, it does not mean that those factors that are associated with higher student achievement overall necessarily contribute to regional disparities. In addition to satisfying the first criterion, for a variable to help explain the relatively higher performance of the developed regions, it needed to vary between developed and undeveloped regions in such a way that it had favorable values for developed regions. Similarly, to help explain the low performance of undeveloped regions, a variable needed to vary in such a way that it had unfavorable values for undeveloped regions.

For exploration of those background variables that satisfied the above two criteria, existing literature on science achievement established the theoretical base. The empirical basis was provided by the background data almanacs in the TIMSS 2007 International Database (Foy & Olson, 2009). For each country, the data almanacs provided statistical summaries of the data collected from students, teachers, and schools. In light of the literature review, the statistical summaries of approximately 400 potential explanatory variables were reviewed to identify those factors that showed an association with science achievement in Turkey. Also, the IDB Analyzer was used to perform a similar review at the regional level. Next, those variables that were selected in the screening process were included in a series of statistical models to identify factors that, if altered, could serve to reduce the magnitude of the region effect.

Analyses were carried out using hierarchical linear modeling (HLM). HLM is a regression technique for examining data with a hierarchical or nested structure (e.g., data from schools, where students are nested within classes, and classes are nested within schools), as in the case of Turkey's sample in TIMSS 2007. Braun, Jenkins, and Grigg (2006, p. 7) state that "at present, the use of HLM is strongly recommended for nested data." For this study, the HLM analyses were performed at two levels (the school/class level and the student level) using HLM Version 6 (Raudenbush, Bryk, & Congdon, 2004).

The first statistical model tested the effects of the region factor to establish a base model for subsequent multilevel analyses. Further models investigated student and school background factors which, if controlled statistically, could be expected to diminish the magnitude of the region effect. The steps involved in these analyses are summarized in Table 2. The base model included no exploratory variable at the student level and only "region" at the school level. Subsequently, student models and school models were built on the base model to show how the inclusion of various student-level and school-level predictors changes the estimate of the region effect. Lastly, the predictors that were statistically significant were included together in a final model that tested the joint effects of those predictors on regional differences in achievement. This was done because student and school background factors do not operate in isolation in the schooling process. That is, they are interconnected – both within and among the different levels – in influencing student outcomes. For example, it is likely that as parental education level (a student-level factor) increases, both home resources (a student-level factor) and parental support for student achievement (a factor contributing to positive school climate – a school-level factor) increase.

Table 2. Steps Involved in Modeling Regional Differences in Achievement.

Multilevel Model   Variables in Student Level               Variables in School Level
Base               None                                     Region
Student            Student characteristics that satisfied   Region
                   the selection criteria
School             None                                     Region + school characteristics that
                                                            satisfied the selection criteria
Final              Significant student characteristics      Region + significant school characteristics
student outcomes. For example, it is likely that as parental education level (a student level factor) increases, both home resources level (a student level factor), and parental support for student achievement (a factor contributing to positive school climate – a school level factor) increases.

RESULTS Using Turkey’s TIMSS 2007 science achievement data, the first phase of the analysis investigated regional differences in student outcomes at the eighth grade. The second analytic phase using multilevel statistical models capitalized on the student and school background data from TIMSS 2007 to examine the relative impact of home and school factors associated with regional differences in achievement. This section presents the results of each phase of analysis.

Phase 1: Turkey’s Regional Differences in Science Achievement at the Eighth Grade Fig. 3 displays for each region and for Turkey the TIMSS 2007 average science scale scores (denoted by circles) with their 95% confidence intervals (indicated by the bars extending above and below the circles). In general, the differences in student outcomes corresponded to the socioeconomic disparities between the western and eastern parts of Turkey. As expected, the pattern was that Marmara (the most developed region) together with the Aegean and Central Anatolia regions had the highest average achievement. Eastern and, in particular, Southeastern Anatolia (the two least developed regions) were the lowest achieving regions. Table 3 displays the results from the base model, which compared the average science achievement in Marmara to the average science achievement in each of the other six regions.8 The table shows the estimates of regional contrasts that represent the average achievement difference between Marmara and each region. That is, the base model tested the effect of attending school in each of the six regions compared to attending school in Marmara. The results revealed that students in the two least developed regions, Eastern and Southeastern Anatolia, performed significantly lower, on average, than students in Marmara ( pEastern ¼ 0.020 and pSoutheastern ¼ 0.001, respectively). Specifically, the mean TIMSS 2007 science score in Southeastern Anatolia was estimated to be 47 points lower than the

Analyzing Turkey’s Data from TIMSS 2007

Fig. 3.

129

TIMSS 2007 Average Science Achievement in Turkey and its Regions.

Table 3. Base Model of TIMSS 2007 Science Achievement in Turkey (N ¼ 4,498). Intercept (Marmara) Predicted mean science score (SE) Aegean Central Mediterranean Black Sea Southeastern Eastern

468 (8.3) 10 (16.4) 11 (12.2) 17 (16.3) 21 (16.5) 47 (13.9) 40 (16.7)

Note: Hierarchical linear models with random intercepts weighted by TOTWGT. The use of TOTWGT (student sampling weight) in all the analyses of this study ensured that all subgroups of the sample were properly represented in the estimation of population parameters. po0.01, po0.05.

mean score in Marmara. The achievement differences between Marmara and the other four regions (Aegean, Central Anatolia, Black Sea, and Mediterranean) were not statistically significant. That is, these four regions and Marmara performed similarly on the TIMSS 2007 science test.
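The base model, a two-level random-intercept model with region dummies and Marmara as the reference category, can be sketched with statsmodels on synthetic data. This is a simplified illustration, not the author's analysis: the scores, school IDs, and region assignments below are simulated, and the TOTWGT sampling weights and five plausible values used in the actual study are omitted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic stand-in for the TIMSS student file (hypothetical values).
regions = ["Marmara", "Aegean", "Central", "Mediterranean",
           "BlackSea", "Southeastern", "Eastern"]
n_schools, n_per_school = 150, 30
school_region = np.array([regions[i % len(regions)] for i in range(n_schools)])
school = np.repeat(np.arange(n_schools), n_per_school)
region = np.repeat(school_region, n_per_school)

# Simulate a deficit for the two least developed regions, plus school-level
# and student-level noise, mimicking the two-level structure of the data.
deficit = 45 * np.isin(region, ["Southeastern", "Eastern"])
school_effect = np.repeat(rng.normal(0, 40, n_schools), n_per_school)
score = 468 - deficit + school_effect + rng.normal(0, 70, school.size)
df = pd.DataFrame({"science": score, "school": school, "region": region})

# Base model: a random intercept for each school, region dummies with
# Marmara as the reference category.
model = smf.mixedlm("science ~ C(region, Treatment('Marmara'))",
                    df, groups=df["school"])
result = model.fit()
print(result.summary())
```

In the fitted summary, the coefficients on the Southeastern and Eastern dummies play the role of the regional contrasts in Table 3: negative estimates indicate lower average achievement than in Marmara, net of the school-level random variation.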


EBRU ERBERBER

Phase 2: Explaining Turkey's Regional Differences in Science Achievement at the Eighth Grade

Identification of Factors Related to Turkey's Regional Differences in Science Achievement

Why are eighth grade students in Southeastern and Eastern Anatolia lagging behind those in Marmara, despite the fact that all students in Turkey are intended to be provided with similar educational opportunities by the end of primary education? What student background and school context factors contribute to the regional achievement differences in compulsory education in Turkey? These questions were examined during the second analytic phase, and the results are presented in this section.

The exploratory analysis of Turkey's extensive array of TIMSS 2007 background data revealed six key indicators⁹ that were related to higher science achievement in Turkey, overall and regionally. The analysis also showed that there are regional gaps in these key indicators. More specifically, students in Southeastern and Eastern Anatolia, the two lowest performing regions, had less favorable conditions on these six key variables than students in the higher performing regions. That is, disadvantaged students tend to be concentrated in the eastern part of the country and enter schools with vast differences in socioeconomic background, in terms of parental education level and home resources, as well as differences in school factors, such as the availability of school resources and a positive school climate for academic achievement.

The three student background indicators meeting both criteria for inclusion in a series of multilevel models were the frequency of speaking Turkish (the language of the test and the medium of instruction in Turkey) at home, parents' highest education level, and the Index of Home Resources.
The three school context measures that satisfied both criteria were school community type (i.e., whether schools are located in rural or urban settings), the Index of Principals' Reports on Positive School Climate for Academic Achievement, and the Index of Teachers' Reports on Adequacy of School Resources in Teaching Science.

Across the country, a large majority of students (89%) reported always or almost always speaking Turkish (the formal language of instruction in Turkey) at home, and these students had higher average science achievement than those who reported speaking Turkish less frequently (sometimes or never). Similarly, in all regions, there was a positive relationship between the frequency of speaking Turkish at home and science achievement. The two lowest performing regions, Southeastern
and Eastern Anatolia, had the highest percentages of students from homes where Turkish is not often spoken (41% and 22%, respectively), whereas in the higher performing regions, the percentages of students in this category were less than 10%. This finding is consistent with the language situation in Turkey, where a major ethnic group (the Kurds), whose mother tongue is not Turkish, populates the eastern part of the country.

The results revealed a grim picture of the level of parental education in Turkey. For about two-thirds of Turkish eighth grade students (68%), the highest level of education attained by either (or both) parents was minimal: a primary school education at most, and in some cases no completed schooling at all. Twenty percent had at least one parent with a high school education, and only 10% had at least one parent who completed a degree beyond high school. Consistent with the disparities in socioeconomic development, parents' education level differed across the regions. Central Anatolia, one of the most developed and higher performing regions (it includes the capital city, Ankara), had the highest percentage of students (17%) with parents who finished a degree beyond high school. At the other end of the scale, the students in Southeastern and Eastern Anatolia were in the most disadvantaged home environments, with a quarter of students having neither parent finishing even primary schooling. In all regions and in Turkey as a whole, higher levels of parental education were associated with higher average science achievement.

The Home Resources Index was developed from students' reports of the following variables: the number of books in the home; the presence of three educational aids in the home (a computer, a study desk for the student's own use, and an Internet connection); and the presence of three home possessions (a dishwasher, a heating system with radiators, and a DVD/CD player).
This index was used as a proxy measure of family income specific to Turkey. Students assigned to the high level of this index came from homes with four or more of these seven resources.¹⁰ Students assigned to the low level had three or fewer home resources. On average in Turkey, one-third of students were at the high level of the Home Resources Index, although the distribution varied from region to region. Central Anatolia had the highest proportion of students from well-resourced homes (almost half were at the high index level). In comparison, Southeastern Anatolia had the lowest proportion of students at the high level (20%). In all regions, students at the high level of the index had higher average achievement than those at the low level.

In Turkey, two-thirds of eighth grade students (66%) were enrolled in schools in urban areas¹¹ and one-third in rural areas. Three-fourths of
students (74%) in Marmara (the most developed region) attended urban schools. Seen from the opposite perspective, one-fourth of the students in Marmara attended rural schools, whereas almost half of the students in Southeastern and Eastern Anatolia did so (45% and 46%, respectively). In Turkey, average science achievement in rural schools was lower than in urban schools. This pattern was apparent in all regions of the country and was most evident in Eastern Anatolia (the least urbanized region).

Students were placed in the high category of the Index of Principals' Reports on Positive School Climate for Academic Achievement if they attended schools where principals reported that, on average, the following aspects of school climate were high or very high: teachers' expectations for student achievement, parental support for student achievement, parental involvement in school activities, and students' desire to do well in school. Students in the other category of the index attended schools for which principals characterized the elements of school climate, on average, as less than positive (i.e., medium, low, or very low). In Turkey, only 37% of the students attended schools where the principal rated the school climate favorably. Principals' ratings of the school environment as supportive of learning were most positive in Central Anatolia (one of the most developed and higher performing regions), where about half of the students (55%) attended schools with positive climates. In contrast, only about a quarter of students (27%) in Southeastern and Eastern Anatolia attended such schools. Across all regions, but particularly in Eastern Anatolia, students who attended schools with a positive school climate performed better than those in schools where the environment for enhancing learning was rated poorly by principals.
The Index of Teachers’ Reports on Adequacy of School Resources in Teaching Science was based on science teachers’ perceptions of the extent to which inadequacies of the following resources limited science instruction: physical facilities, equipment for teacher use in demonstrations, computer hardware, computer software, support for using computers, and other instructional equipment for student’s use. Students were assigned to the high level of the index if their science teachers’ average rating for school resources available for science instruction was positive (i.e., the shortage of those resources had, on average, little or no impact on science instruction). If their science teachers’ rating was poor (i.e., the shortage of resources resulted in, on average, some or a lot of limitations to science instruction), then students were assigned to the low level of the index.
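The assignment rule just described, averaging a science teacher's ratings across the component resource items and then dichotomizing students into high and low index levels, can be sketched as follows. The column names, the 1–4 rating scale, and the cutoff of 2.0 are illustrative assumptions, not the actual TIMSS codebook.

```python
import pandas as pd

# Hypothetical teacher responses: 1 = "none", 2 = "a little", 3 = "some",
# 4 = "a lot" of limitation on science instruction from each resource
# shortage (names and coding are illustrative, not the TIMSS codebook).
resources = ["facilities", "demo_equipment", "hardware",
             "software", "computer_support", "other_equipment"]
teachers = pd.DataFrame(
    [[1, 2, 1, 1, 2, 1],   # little or no impact -> high level of the index
     [3, 4, 3, 3, 4, 3],   # some or a lot of impact -> low level
     [2, 2, 3, 2, 1, 2]],
    columns=resources)

# Average the ratings across the component variables, then dichotomize:
# students whose science teacher's mean rating indicates little or no
# limitation are assigned to the high level of the index (cutoff assumed).
mean_rating = teachers.mean(axis=1)
teachers["resource_index"] = (mean_rating <= 2.0).map(
    {True: "high", False: "low"})
print(teachers["resource_index"].tolist())  # ['high', 'low', 'high']
```

The same average-then-dichotomize pattern underlies the other indices in the study (e.g., the school climate index built from principals' ratings), with the component variables and cutoffs differing per index.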


On average in Turkey, students were split almost evenly between the two categories of the index, and those at the high level performed better than those at the low level. Although almost half of the students (46%) in Turkey were in the high category of the index, there was considerable variation among regions. On the basis of science teachers' reports, in Central Anatolia (a higher performing region), the majority of students (63%) attended schools where shortages of school resources for science instruction were, on average, not a problem. In contrast, considerably lower percentages of students in Southeastern and Eastern Anatolia attended such schools (27% and 30%, respectively). Across all regions and in Turkey as a whole, performance was higher for students who attended schools where the science teachers' average rating of the school resources available for science instruction was positive than for those in schools where the rating was poor.

Investigation of Factors Related to Turkey's Regional Differences in Science Achievement

The six selected variables that met both criteria for inclusion in multilevel modeling were examined in a series of multilevel models, as previously summarized in Table 2,¹² to explore their possible contribution to reducing the Turkish region effect (i.e., the negative effect on student outcomes of attending school in Southeastern or Eastern Anatolia). As shown in Table 4, the results of the final model revealed that all six variables are significant predictors of science achievement in Turkey (p < 0.0005 for home resources, speaking Turkish at home, parental education, and school climate; p = 0.023 for community type; p = 0.006 for instructional resources).
Among the student background variables examined, speaking Turkish at home (the medium of instruction in Turkey) was the strongest predictor of science achievement, followed by parental education level and home resources (b_language = −31, b_parentedu = −27, b_homeresources = −17, respectively; as explained in Note 13, the coefficients are negative because moving from the favorable to the unfavorable category of each dummy-coded predictor lowers the predicted score).¹³ Among the school background variables, a positive school climate for achievement was the strongest predictor (b_schoolclimate = −29), followed by the adequacy of school resources for teaching science (b_instructionalresources = −20) and school community type (b_communitytype = −15).

The results of the final model also suggest that controlling statistically for differences in the six student and school factors examined would be expected to greatly reduce the magnitude of the region effect. Specifically, the final model estimated Marmara's average science score as 530 points when the six predictors were optimal, that is, when the students attended schools where principals perceived a highly positive school climate for student achievement
(e.g., high levels of teachers' expectations and of parental support for student achievement), the science teachers rated the level of school resources available for teaching science as high, the schools were located in urban areas, and the students came from home environments with high levels of home resources (e.g., access to a computer, an Internet connection, and a study desk), often spoke Turkish at home, and had parents with more than a primary school education.

Table 4. Models of TIMSS 2007 Science Achievement in Turkey (N = 4,498).

                                                        Base Model     Final Model
  Intercept reliability                                 0.935          0.930
  Intercept (Marmara):
    Predicted mean science score (SE)                   468 (8.3)      530 (8.7)
  School level variables:
    Aegean                                              −10 (16.4)     −9 (10.8)
    Central Anatolia                                    −11 (12.2)     −14 (8.6)
    Mediterranean                                       −17 (16.3)     −4 (13.0)
    Black Sea                                           −21 (16.5)     −12 (13.4)
    Southeastern Anatolia                               −47 (13.9)**   −17 (12.7)
    Eastern Anatolia                                    −40 (16.7)*    −16 (9.9)
    Community type (urban = 0, rural = 1)                              −15 (6.6)*
    Index of positive school climate
      (high = 0, low or medium = 1)                                    −29 (7.5)**
    Index of adequacy of instructional
      resources (high = 0, low = 1)                                    −20 (7.1)**
  Student level variables:
    Index of home resources (high = 0, low = 1)                        −17 (3.3)**
    Speaking Turkish at home (always or almost
      always = 0, sometimes or never = 1)                              −31 (5.8)**
    Parental education (higher than primary
      school = 0, primary school or less = 1)                          −27 (3.6)**

Note: Hierarchical linear models with random intercepts, weighted by TOTWGT. **p < 0.01, *p < 0.05.

Under the same optimal conditions, the model estimated the average science achievement as only 17 points lower for Southeastern Anatolia and 16 points lower for Eastern Anatolia, and these achievement differences were not statistically significant. That is, the initially significant estimates of the base model, 47 points lower for Southeastern Anatolia and 40 points lower for Eastern Anatolia, were reduced to the extent that the initial
achievement differences were no longer statistically significant. These findings suggest that if all regions had the same distributions in terms of the six variables, the average achievement difference between Marmara and Southeastern Anatolia and also between Marmara and Eastern Anatolia could be greatly reduced. In other words, low performing students in the Southeastern and Eastern Anatolia regions would have performed much more like the students in Marmara (i.e., there would not be any statistically significant differences in achievement) if students across the regions were provided with similar resources and opportunities in terms of the characteristics of the schools they attended and their home backgrounds.
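The additive structure of the final model can be made concrete with a small arithmetic sketch built from the coefficients reported in the chapter (530 for Marmara under optimal conditions; negative effects for each unfavorable dummy category, per Note 13). The helper function below is hypothetical: it simply sums main effects with no interactions, which is how a dummy-coded main-effects model generates predictions.

```python
# Coefficients as reported for the final model (rounded point estimates).
coef = {
    "intercept": 530,        # Marmara with all six predictors favorable
    "southeastern": -17,     # region contrasts under optimal conditions
    "eastern": -16,
    "rural": -15,            # community type (urban = 0, rural = 1)
    "low_climate": -29,
    "low_instr_resources": -20,
    "low_home_resources": -17,
    "infrequent_turkish": -31,
    "low_parent_edu": -27,
}

def predicted_score(**flags):
    """Sum the intercept and the coefficients of the dummies set to True."""
    return coef["intercept"] + sum(
        coef[name] for name, on in flags.items() if on)

# A Southeastern Anatolia student under otherwise optimal conditions:
print(predicted_score(southeastern=True))  # 530 - 17 = 513

# The same region with every unfavorable characteristic at once:
worst = predicted_score(southeastern=True, rural=True, low_climate=True,
                        low_instr_resources=True, low_home_resources=True,
                        infrequent_turkish=True, low_parent_edu=True)
print(worst)  # 530 - 17 - 15 - 29 - 20 - 17 - 31 - 27 = 374
```

The contrast between 513 and 374 illustrates the chapter's point: the residual region effect under optimal conditions is small and nonsignificant, while the accumulated unfavorable home and school conditions account for most of the observed gap.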

CONCLUSIONS AND POLICY IMPLICATIONS

The Turkish government's Ninth Development Plan 2007–2013, the primary strategy plan aimed at coordinating the efforts involved in the EU accession process, documents that Turkey seeks to be "globally competitive" and "sharing more equitably" (DPT, 2006, p. 11). Considering these national goals, progress in increasing the quality and equity of student achievement is imperative. As Turkey pursues its historic path to the EU, the unification process presents a unique opportunity for the educational reforms necessary to help improve conditions of quantity, quality, and equity in education. To contribute to the discussion in Turkey about the best strategies for improving the overall level of student outcomes in primary education and eliminating the achievement inequalities among its regions, this study analyzed eighth grade science achievement data from TIMSS 2007.

Considering Turkey's persistent regional disparities in human development, the results of the analysis of achievement differences across regions were not entirely surprising, but they were nevertheless disappointing. The study revealed that, in general, the socioeconomic differences between the western and eastern parts of Turkey corresponded to the differences in student achievement at the end of compulsory primary education. Marmara, Aegean, and Central Anatolia, the most socioeconomically developed regions, were the highest performing regions. Eastern Anatolia and, in particular, Southeastern Anatolia, the two least developed regions, had the lowest science achievement in TIMSS 2007 at the eighth grade. These regional inequalities occurred despite the fact that all students in Turkey are intended to be provided with similar teaching time and content by the end of primary
school. In addition to the overall low performance at the national level, these findings highlight yet another challenge for the Turkish educational system: the already low educational quality is not distributed evenly across the country.

To identify factors associated with regional differences in educational achievement, this study involved extensive exploratory analysis of the hundreds of background variables collected by TIMSS 2007 via student, teacher, and school questionnaires. The analyses revealed that, compared with students in the high performing regions, students in the two lowest performing regions were disadvantaged on several key home background and school context variables associated with student achievement. In comparison with students in the highest performing regions, students in Southeastern and Eastern Anatolia came from homes with fewer educational resources (e.g., books, a computer, an Internet connection, and a study desk), had parents with less education, and spoke Turkish less frequently at home. These students also attended schools with characteristics associated with low achievement: being located in rural areas, being inadequately equipped with instructional resources (e.g., physical facilities, equipment for teacher demonstrations, computer hardware and software, and other instructional equipment for students' use), and having a school climate not supportive of learning in terms of teachers' expectations for student achievement, parental support for student achievement, parental involvement in school activities, and students' desire to do well in school.
Using HLM analyses, this study found that if students across the regions were provided with similar opportunities in terms of the characteristics of the schools they attended and their home backgrounds, the significant achievement gaps across the regions could be reduced to the point that they were no longer statistically significant. These findings suggest that attempts to raise Turkish students' achievement and close the achievement gaps between regions should target students in the least developed regions, particularly Southeastern and Eastern Anatolia. They also suggest that to overcome low student achievement in these two regions, all of the factors contributing to the region effect eventually need to be addressed.

Although this study used highly reliable data and sophisticated statistical analyses, it has some inherent limitations. TIMSS is an observational study, not a randomized experiment: the students tested in TIMSS were already enrolled in their schools and were not randomly assigned to schools. Effects of the predictors
estimated using observational data, such as TIMSS, should not be interpreted as causal relationships, because it is not possible to determine the extent to which preassignment differences in student populations might affect the estimated effects. In addition, the results of this study are based on statistical models that are constrained by the available data. Even though TIMSS collected a vast array of background information from students, teachers, and school principals, there is almost certainly other information related to student achievement that was not fully captured in the data. For example, it may be that students in the low performing regions are also overrepresented among students with little or no preprimary education. Because information on students' preprimary education was not in the dataset, the statistical models of this study could not investigate the role of preprimary education in regional achievement differences.

Despite these limitations, the findings are consistent with other findings in the literature reporting associations between student achievement and characteristics of home background and school context. Moreover, the results provide preliminary evidence for policymakers seeking to establish targeted educational policies to combat low levels of educational outcomes in general and achievement disparities between regions in particular. For example, the findings of this study may serve as a basis for small-scale programs in Turkey or in other education systems facing similar quality and equity issues. If new programs provide empirical evidence about the effects of an intervention on student achievement, broad-scale implementation could then be launched.
The results of this study suggest that parent education programs should be developed or enhanced so that they encompass strategies that equip parents to support student achievement and to become involved in school activities. Such comprehensive education programs may be conducted jointly with schools and include teachers, so as to create school climates generally supportive of student learning. Strategies may involve helping parents understand the main elements of the curriculum and keeping them informed about how well their children are progressing in school. Sustained training for parents and collaboration with teachers may help build trusting relationships among parents, teachers, and students. In such a supportive and collaborative system, teachers are more likely to hold high expectations of students, and students, in turn, are more likely to want to perform well in school.


The results also suggest that one way to improve education levels in Turkey would be to develop educational policies in the low performing regions that aim specifically to alleviate the disadvantages that stem from students receiving school instruction in Turkish while infrequently speaking Turkish at home. Currently, such policies for second language learners are essentially absent in Turkey. If Turkish remains the only formal language of instruction, initiatives should be put in place to compensate for the negative influences of not being fluent in Turkish. Students who do not speak Turkish frequently at home may benefit from studying Turkish as a second language or from special training programs in the Turkish language. These extra language programs could be embedded in after-school or summer programs offering extended learning time in the language of instruction.

In addition to preparing students for primary education, early childhood development programs are viewed as one of the most effective intervention approaches for disadvantaged children (OECD, 2009). In Turkey, the schooling status for preprimary education, which is optional in the country, is a source of considerable concern. Progress in International Reading Literacy Study (PIRLS) 2001 results showed that only in Turkey and Iran, of the 35 countries that participated in the study, "did the majority of students not attend preschool" (76% and 70%, respectively), whereas "almost all countries make provision for at least one year of preprimary education" (Mullis, Martin, Gonzalez, & Kennedy, 2003, p. 129). Moreover, the already small numbers of children with access to preprimary schooling are mostly located in urban areas and in the western part of the country (MoNE, 2005). Given the importance of preprimary schooling for achieving equity in student achievement, initiatives need to be formulated to prioritize preschool education in Southeastern and Eastern Anatolia.
To better address the unfavorable characteristics of school learning contexts in the east of Turkey, the Ministry of National Education's resource allocation policies may need to be revisited. The ministry currently allocates funds to schools on an incremental budgeting basis, that is, "through a percentage increase on a school's prior year budget" (OECD, 2007b, p. 134). Compensatory financing tailored to key indicators, such as enrollment and the percentage of socioeconomically disadvantaged students, may ensure better resource allocation as well as enhance physical infrastructure and instructional resources.

Further research on student achievement will be possible as countries continue to participate in international assessments such as TIMSS, PIRLS, and PISA. Monitoring the trends in student
achievement during the primary school years (grades 1–8) is particularly important for Turkey. In 2007, the net schooling ratio in compulsory primary education was very high (97%) but dropped substantially at the transition to optional secondary education (57%). As a result, nearly half of secondary school age children are excluded from the picture portrayed by studies that collect data from Turkish students in secondary education. Because dropouts in Turkey are typically low performing students from disadvantaged home backgrounds, such studies may underestimate the prevalence of disadvantaged background characteristics while overestimating achievement. Studies that sample Turkish students in primary school do not share this limitation, because they collect data that are representative of almost all primary school age children in the country. The ministry's commitment to monitoring progress in educational quality is greatly needed as Turkey pursues full membership in the EU.

Studies of educational achievement in core subject areas provide each participating country with opportunities to compare its national student achievement with that of other countries, as well as to conduct in-depth investigations of the linkages between achievement and background factors in its national context. Such investigations, in turn, help policymakers make informed decisions to improve education systems. This study, in particular, illustrated how TIMSS data may be used to inform quality and equity issues in a national education system. Using TIMSS data, the study provided an example of a methodology that may be employed to describe how student achievement is distributed among different populations of an education system and to investigate the factors that contribute to differences in that distribution.
Although the focus of this study was regional achievement differences in Turkey, a similar methodology may be employed in other national systems of education to investigate achievement distribution among other subpopulations of national interest (e.g., gender) or among different school systems (e.g., public versus private). Moreover, even though the findings of this study pertain to equity issues stemming from regional differences in Turkey, the policy implications might be relevant in other education systems where equity issues exist among different groups of the national population.

NOTES

1. The HDI is a composite index based on three basic dimensions of human development: (1) a long and healthy life, as measured by life expectancy at birth; (2) educational attainment, as measured by a combination of the adult (ages 15 years and older) literacy rate (two-thirds weight) and the combined gross enrollment ratio (one-third weight) for primary, secondary, and tertiary schools; and (3) a decent standard of living, as measured by gross domestic product (GDP) per capita in purchasing power parity terms in US dollars.

2. The eighth grade is an important educational milestone in Turkey because it is the end of primary education and the last grade of compulsory education.

3. These countries included Bulgaria, Cyprus, the Czech Republic, Hungary, Italy, Lithuania, Malta, Romania, Slovenia, and Sweden. England and Scotland participated in TIMSS 2007 as separate entities; thus, they were counted as two separate EU states, not as the United Kingdom.

4. SEDI is based on 58 variables selected from social and economic measures, including indicators of demography, employment, education, health, infrastructure, manufacturing, agriculture, and finance.

5. Based on this sample, Turkey's eighth grade student population is estimated to be 1,091,654 (Joncas, 2008).

6. The IDB Analyzer was developed to analyze data from IEA surveys, such as TIMSS 2007, that made use of the plausible values methodology, which is explained in detail in Foy, Galia, and Li (2008).

7. This program is designed to carry out the computations with five plausible values and can utilize the sampling weight variables of the TIMSS 2007 dataset.

8. Each region was represented with a dummy variable, with Marmara as the reference category.

9. When background variables were related to a single underlying construct, responses to the individual variables were combined to create an index (indicator) summarizing the information from the component variables in a concise manner.

10. Presence of books at home corresponded to having more than 25 books, and absence of books was defined as having 25 or fewer books at home.

11. Urban areas are defined in this study as areas where the population is greater than 50,000.

12. Detailed descriptions of this series of multilevel models are provided in Erberber (2009).

13. To aid the interpretation of results, variables included in the multilevel models were dummy coded with values of 0 and 1. The favorable category of an explanatory variable (e.g., the high level of an index) was coded 0, whereas the unfavorable category (e.g., the low level of an index) was coded 1. The regression coefficients of the final model were interpreted as the estimated change in achievement associated with moving from the favorable category of a predictor to the unfavorable category; thus, the regression coefficients had negative values.

REFERENCES Braun, H., Jenkins, F., & Grigg, W. (2006). A closer look at charter schools using hierarchical linear modeling (NCES 2006–460). U.S. Department of Education, National Center for

Analyzing Turkey’s Data from TIMSS 2007


PART II

COMPARATIVE CONTRIBUTIONS OF INTERNATIONAL ACHIEVEMENT STUDIES TO EDUCATIONAL POLICYMAKING

THE IMPACT OF STANDARDIZED TESTING ON EDUCATION QUALITY IN KYRGYZSTAN: THE CASE OF THE PROGRAM FOR INTERNATIONAL STUDENT ASSESSMENT (PISA) 2006

Duishon Shamatov and Keneshbek Sainazarov

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 145–179
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013009

ABSTRACT

In 2006, Kyrgyzstan took part in the Program for International Student Assessment (PISA), and the results were very poor: the country placed last among all participating countries. However, to date, there are no in-depth studies examining these results and the impact of the PISA test on the quality of secondary education in Kyrgyzstan. This chapter attempts to fill this gap. The study was conducted in a post-Soviet Central Asian education context where standardized tests are only emerging and their far-reaching implications are not yet known. The data were collected using semistructured interviews and document analysis. Respondents to the semistructured interviews included representatives of government, education officials, specialists from the independent testing center, representatives of international development organizations,


university professors, school administrators and teachers, community members, and students. The study showed that the poor results of PISA 2006 alerted many policymakers, education officials, and educators to the current state of the country's education. However, the findings of the study also showed that the lessons and implications were not analyzed systematically and, as a result, rather fragmented and poorly coordinated efforts and initiatives were undertaken.

INTRODUCTION

This chapter examines the impact of the Program for International Student Assessment (PISA) 2006 on the quality of secondary education in Kyrgyzstan. In 2006, Kyrgyzstan, along with 56 other countries and economies, took part in PISA, which is coordinated by the Organization for Economic Cooperation and Development (OECD). Kyrgyzstan was the first country in post-Soviet Central Asia to take part in PISA. PISA 2006 focused on students' competency in science. The results demonstrated that Kyrgyzstan's 15-year-old students performed extremely poorly, with Kyrgyzstan placed last among all participating countries. These results caused alarm among politicians, intellectuals, and educators. However, to date, there are no in-depth studies examining the results and the impact of the PISA test on the quality of secondary education in Kyrgyzstan. Employing a series of semistructured interviews and document analysis, this chapter describes what lessons were learned from the PISA experience, and whether the process had any significant impact on the quality of education in Kyrgyzstan. The chapter also examines how transnational forces such as PISA, an international comparative test, can affect education policy in Kyrgyzstan, and what the implications of this impact are.

CONTEXTUAL BACKGROUND

Kyrgyzstan, officially the Kyrgyz Republic, is a small, landlocked, and predominantly mountainous independent country located in Central Asia. Bordered by China, Tajikistan, Uzbekistan, and Kazakhstan, it encompasses 198,500 sq km. Kyrgyzstan is made up of over 90 nationalities and ethnic groups (Ibraimov, 2001). The Turkic-speaking Kyrgyz, one of the most ancient peoples of Central Asia (Jusupov, 1993), constitute about 65%


of the country's population. The capital of the country is Bishkek, which has a population of about 790,900 (Ibraimov, 2001). Kyrgyzstan is divided into seven administrative provinces or oblasts: Batken, Chüi, Naryn, Osh, Jalal-Abad, Talas, and Yssyk Köl. Kyrgyzstan was one of the 15 republics that made up the Union of Soviet Socialist Republics (USSR). Soviet rule was established in Kyrgyzstan between 1918 and 1922 (Akiner, 1998; Landau & Kellner-Heinkele, 2001), and for over 70 years Kyrgyzstan was an integral part of the USSR, serving as its mountain outpost (Gleason, 1997, cited in DeYoung, 2001, p. 4). As part of the Soviet scheme to establish national republics in Central Asia, the Kara-Kyrgyz Autonomous Oblast was formed within the Russian Federation in October 1924 (Haugen, 2003). It then became the Kyrgyz Autonomous Soviet Socialist Republic in February 1926, and the Kyrgyz Soviet Socialist Republic in December 1936 (Ibraimov, 2001; Soktoev, 1981). After gaining control over Central Asia, the Soviets embarked upon a program of radical transformation. Akiner describes the results of this massive Soviet campaign of modernization:

The dramatic increases in literacy rates, improved standards of health care and nutrition, electrification of virtually the entire region, intensified industrialisation, the creation of serviceable communication and transport networks, a huge expansion of mass media outlets, the diversification of employment opportunities, or cultural facilities such as museums, libraries and art galleries, the establishment of modern state institutions and of a modern bureaucracy. (1994, p. 11)

Advocates for the Soviet regime have emphasized such developments. For example, Kolbin (1960) observed that "Kirghizia reached remarkable economic, political and cultural development" (p. 3). The Soviets "carried out progressive reforms in the availability of mass education and health care, the growth of industry, the development of mechanized methods of farming and irrigation …" (Rashid, 2003, p. 37). The USSR broke up in December 1991 and Kyrgyzstan became independent. Gaining independence aroused the hopes and aspirations of the people of Kyrgyzstan (Akiner, 1998). Askar Akaev, the first president of Kyrgyzstan, introduced an array of reforms, such as gaining membership in international organizations, introducing a national currency (the som), privatization, shifting from a monistic power structure to a pluralistic electoral system, and moving from a centralized state economy to a market-oriented economy (Abazov, 2004). Kyrgyzstan gained a reputation abroad as a leader of democracy in Central Asia; the term "Island of


Democracy" was popularly used to refer to the country (Megoran, 2002; Meyer, 2003). As DeYoung stated, "the West quickly focused upon Kyrgyzstan as the best hope for democracy and market economy reforms: not only had it been historically less well integrated into the former USSR, but also its new president had been an academic and not a lifetime communist …" (2002, p. 4). Despite these growing hopes in the West, the USSR's breakup was tragic for many people in Kyrgyzstan; it brought chaos, despair, and uncertainty to the lives of thousands (Akiner, 1998). Economic crisis, unemployment, poverty, dislocated civilians, poor living conditions, and various health problems have plagued Kyrgyzstan since independence. Kyrgyzstan's economy found itself in a deep crisis: industrial production declined by 63.7% from 1990 to 1996; agricultural output declined by 35%; and capital investment by 56% (Rashid, 2003). In 2001, a World Bank report (cited in Rashid, 2003) indicated that 68% of Kyrgyzstan's population lived on less than US$7 a month and the average annual salary was just US$165; the same report estimated a subsistence-level salary at US$295 a year. Between 1990 and 1996, Kyrgyzstan's gross domestic product was almost halved, falling by 47% (Rashid, 2003). A large number of people suffered from the economic collapse. Official sources reported that about 60,000 people became unemployed in Kyrgyzstan (Abazov, 2004); however, Abazov (2004) believes that real estimates far exceed this figure. The worsening socioeconomic and unstable conditions have caused many people, especially from the highly educated, skilled Russian-speaking population, to leave for Russia or migrate abroad (Allen, 2003; Ibraimov, 2001). The poverty level increased dramatically: around 30% of Kyrgyzstan's population lives in poverty (expenditure-based) according to the National Statistics Committee (2009).

Education during the Soviet Era

The Soviets realized that the tempo of societal progress depended on the development of science and education (Holmes, Read, & Voskresenskaya, 1995), and Kyrgyzstan made considerable progress in education during the Soviet era (Shamatov, 2005). From its outset, education in the USSR was free and unified. With massive campaigns, the literacy rate in what is now Kyrgyzstan jumped from 16.5% in 1926 to 99.8% in 1979 (Ibraimov, 2001). Schools were built in the most remote mountain villages, and by 1978 there were 1,757 schools with 854,000 students and around 50,000 teachers (Tabyshaliev, 1979).1


At the same time, there were problems with Soviet education. All students were exposed to the same centrally designed curriculum, with minor local adaptations to accommodate each Soviet republic (DeYoung, 2001; Heyneman, 2000). The state controlled educational institutions, teaching appointments, syllabi, and textbooks to ensure that all learners were exposed to the same outlook and official knowledge and attitudes (Apple, 1993; Heyneman, 2000). While Soviet education overtly promoted internationalism above nationalist and ethnic identities, many scholars argue that in practice it promoted Russian identity over other national identities within the USSR. A system of education with both Kyrgyz- and Russian-medium schools was introduced in Kyrgyzstan early in the Soviet era, and after the late 1950s, parents ostensibly had a choice in the language of instruction for their children. However, socioeconomic and ideological pressure to send children to Russian-speaking schools was strong (Korth & Schulter, 2003), and there were key differences between Russian schooling and schooling in local languages. Kyrgyz and Russian schools divided people along linguistic lines, which also reflected social and economic divisions. The reality was that Russian speakers occupied the higher positions in most Soviet institutions (Korth, 2001b). Notably, there was only one Kyrgyz-medium school in the capital Frunze (now Bishkek), where most economic and social opportunities existed. Education in the rural, predominantly Kyrgyz-speaking regions was marginalized and neglected, and pupils in Kyrgyz schools were disadvantaged and underprivileged (Korth, 2001a). As a result, many people, including elite Kyrgyz families, preferred Russian school education. Though Soviet education espoused equality and uniformity, many scholars argue that, contrary to official doctrine, Soviet schooling was never really monolithic or egalitarian.
Besides clear disparities between Russian- and Kyrgyz-medium schools, obvious status differences also existed between urban and rural schools (Niyozov, 2001; Sutherland, 1999).2 Despite high learning standards and an egalitarian approach, success in the Soviet Union was closely related to speaking and acting Russian, resulting in a neglect of, and even disdain for, Kyrgyz language, identity, and culture (Korth & Schulter, 2003).

Education in the Post-Soviet Era

After the breakup of the Soviet Union, Kyrgyzstan began experiencing serious problems in the field of education (DeYoung, 2004). Preschool


enrolment declined catastrophically during the 1990s; out of 1,604 preschool institutions existing in 1991, only 416 remained by 2000 (DeYoung, 2004), and overall preschool enrolment in Central Asia was only 14% in 1999 (Open Society Institute, 2002).3 About 83.6% of the population of Kyrgyzstan completed secondary education in 1993; this decreased to 76.4% in 1996, and further to 69% in 1999 (DeYoung, 2002). The gap between the quality of education offered in urban and rural schools became evident. Under an official reform effort called "diversification," new, innovative private schools such as lyceums, gymnasiums, author schools, and schools for gifted children emerged (Holmes et al., 1995; Open Society Institute, 2002). Many urban schools turned themselves into gymnasia or schools referred to as "new type" to generate extra income.4 "New type" schools offer advanced coursework in addition to the national curriculum, and extra academic services to students. They generally provide a better and more comprehensive education than "ordinary" state-funded schools. Graduates of these schools have a better chance of entry to prestigious higher education institutions. Overall, there are 73 private schools in Kyrgyzstan (Interview, staff of Ministry of Education and Science (MoES), July, 2009). These are mostly located in urban areas with wealthy families who can afford to pay school fees (Open Society Institute, 2002). However, the reality is that only a small fraction of people can afford quality education for their children (EFA, 2000). While providing new opportunities to those who can afford it, this officially endorsed diversification of schools has exacerbated the stratification of Kyrgyz society. Almost 70% of Kyrgyzstan's population and 83% of schools are located in rural areas (UNDP Report, 2003). Children from rural and mountain schools receive poor-quality education.
They are also frequently distracted by agricultural work and other family responsibilities (Open Society Institute, 2002). According to official sources, over 2,500 school-age children dropped out of school in 2001; however, unofficial reports suggest that the actual number far exceeds this figure (DeYoung & Santos, 2004). These dropout rates are a by-product of economic collapse and declining support for the social sector, with primary reasons including poverty, insufficient food, lack of adequate clothing, inability to afford learning materials, and the increasing cost of education. The declining prestige and perceived value of education has also contributed to dropout rates (Open Society Institute, 2002). According to the National Statistics Committee, 1,542 children between 7 and 17 years of age did not attend school5 in 2008. However, according to unofficial sources, the number of children not attending school is several times higher than the official figure indicates.


METHODOLOGY

To study the impact of PISA on education quality in Kyrgyzstan, a qualitative research design was adopted to provide a comprehensive and contextualized account that takes into account complex, multidimensional dynamics; alternative ways of knowing, expressing, and acting upon reality; and sensitivity to local perspectives (Hitchcock & Hughes, 1995; Merriam, 1998). The data were collected between June 2009 and April 2010 using semistructured interviews and document analysis (Hitchcock & Hughes, 1995). Purposeful sampling was used to gain the maximum possible data (Merriam, 1998; Miles & Huberman, 1994) from expert respondents about the impact of PISA 2006 on education in Kyrgyzstan. Respondents to the semistructured interviews included two representatives of the MoES of Kyrgyzstan and the Kyrgyz Academy of Education (KAE), a specialist from the independent testing center (CEATM) that conducted PISA 2006, and two representatives of the World Bank's Rural Education Project (REP), which initiated Kyrgyzstan's participation in PISA. An international education expert who has served as a long-term consultant to the World Bank was also interviewed about the necessity and benefits of the country's participation in the international comparative test. A local education expert involved in curricular and assessment reforms, working with the Open Society/Soros Foundation, Kyrgyzstan, was also interviewed, as were professors from public and private universities, school administrators and teachers, community members, and students. In total, 30 people were interviewed. With the participants' consent, the interviews were taped to aid in recall and analysis (Frankel & Wallen, 1993).
The respondents were selected on a volunteer basis; prior to being interviewed, they were informed of the purpose and nature of the study, and they gave their written consent to be interviewed and to have those interviews recorded (Cohen & Manion, 1997; Clandinin & Connelly, 2000; Glesne, 1999). Document analysis was used as another tool for investigation (Bell, 1995). To examine the questions central to this paper (the reasons for Kyrgyzstan's participation in PISA 2006, the current state of education, and PISA's impact on education quality), a number of materials, reports, and other documents were analyzed. Documents and reports of the MoES, reports on PISA 2006 by the OECD and CEATM, mass media materials, and other materials were reviewed and analyzed. Data analysis is a rigorous, continuous process of systematically searching and arranging the accumulated data to increase one's understanding of them


(Bogdan & Bicklen, 1998; Merriam, 1998; Niyozov, 2001). In this project, data analysis involved making sense of the data by arranging them into coherent and plausible arguments. A combination of data analysis techniques, such as noting patterns and themes, testing plausibility, clustering, counting, making metaphors, making contrasts and comparisons, and noting relations between variables, was used to analyze the data and generate meaning from them (Miles & Huberman, 1994).
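As a concrete illustration of the "counting" technique from Miles and Huberman's list, coded interview segments can be tallied programmatically. The snippet below is a hypothetical sketch: the respondent roles and theme labels are invented for illustration and do not come from the study's actual codebook.

```python
from collections import Counter

# Hypothetical coded interview excerpts: (respondent role, theme) pairs
# assigned during qualitative coding. Labels are invented for illustration.
coded_segments = [
    ("MoES official", "declining quality"),
    ("REP consultant", "need for objective evidence"),
    ("teacher", "declining quality"),
    ("KAE specialist", "declining quality"),
    ("teacher", "rural-urban gap"),
]

# Counting: how often each theme appears across all coded segments.
theme_counts = Counter(theme for _, theme in coded_segments)

print(theme_counts.most_common(1))  # [('declining quality', 3)]
```

Such frequency tables do not replace interpretive analysis, but they make it easy to see which themes recur across respondent groups before clustering and comparison.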

PROGRAM FOR INTERNATIONAL STUDENT ASSESSMENT (PISA)

PISA is an international standardized test for comparative assessment of 15-year-old students' skills. It is the product of collaboration between participating countries and economies through the OECD, and draws on leading international expertise to develop valid comparisons across different countries. The members and partners of the OECD participate in the PISA process to assess the comparative quality and condition of their education systems. The PISA process also highlights components of participant countries' individual education systems, and offers recommendations to improve education quality. Thus, participating countries can develop educational reforms and policies based on PISA results. Educational authorities pay serious attention to PISA results because they provide objective and reliable data about education quality, and highlight both strengths and weaknesses of education systems (Figazzolo, 2009). Consequently, following PISA, many countries have launched educational reforms to improve their education quality and system in general. For example, French President Sarkozy launched school reform under the 2007 Révision Générale des Politiques Publiques (general revision of public policies), using the PISA 2006 results as a reference point to support educational reform in France. Similarly, German education ministers launched major educational reforms under the "Seven Action Areas" program to improve education and learning, based on the PISA 2000 and 2003 results. These examples demonstrate the impact of PISA beyond simply testing whether students have acquired predefined knowledge and skills from school curricula. In PISA 2006, all 30 OECD member countries participated, as well as 27 partner countries and economies. In total, around 400,000 students were randomly selected to participate in the PISA survey, representing about


20 million 15-year-old students from 57 participating countries. Representative samples of between 3,500 and 50,000 15-year-old students were drawn in each country. PISA thus covers roughly 90% of the world economy. To ensure the comparability of the results across countries, PISA devoted great attention to assessing comparable target populations. PISA tests students who are aged between 15 years 3 months and 16 years 2 months at the time of the assessment and have completed at least 6 years of formal schooling, regardless of the type of institution in which they are enrolled (OECD Report, 2007). PISA 2006 focused on student competency in science. In today's fast-progressing, globalized, technological world, understanding main scientific concepts and theories and the ability to solve science problems are more important than ever. PISA 2006 assessed not only science knowledge and skills, but also the attitudes which students have toward science, the extent to which they are aware of the opportunities that possessing science competencies may open, and the science learning opportunities and environments which their schools offer. PISA defines scientific literacy in terms of an individual's scientific knowledge and use of that knowledge to identify scientific issues, explain scientific phenomena, draw evidence-based conclusions about science-related issues, and demonstrate understanding of the characteristic features of science as a form of human knowledge and enquiry, awareness of how science and technology shape our material, intellectual, and cultural environments, as well as willingness to engage with science-related issues. PISA measures scientific literacy across a continuum from basic literacy skills through high levels of knowledge of scientific concepts, and examines students' capacity to use their understanding of these concepts and to think scientifically about real-life problems.
Student performance scores and the difficulty of questions were divided into six proficiency levels.
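To make reported scale scores easier to relate to these levels, the score-to-level mapping can be sketched in a few lines. This is an illustrative sketch, not part of the chapter's methodology; the cut-off values used here are the approximate published lower bounds of the PISA 2006 science proficiency levels and should be verified against the OECD technical report.

```python
# Approximate lower bounds of Levels 1-6 on the PISA 2006 science scale
# (assumed values; check the OECD PISA 2006 Technical Report).
SCIENCE_CUTOFFS = [
    (707.93, 6),
    (633.33, 5),
    (558.73, 4),
    (484.14, 3),
    (409.54, 2),  # Level 2 is often treated as the baseline of scientific literacy
    (334.94, 1),
]

def proficiency_level(score: float) -> int:
    """Return the proficiency level for a science score (0 = below Level 1)."""
    for lower_bound, level in SCIENCE_CUTOFFS:
        if score >= lower_bound:
            return level
    return 0

# Kyrgyzstan's mean science score of 322 falls below Level 1,
# while Finland's mean of 563 corresponds to Level 4.
print(proficiency_level(322))  # 0
print(proficiency_level(563))  # 4
```

Classifying country means this way is only a rough reading aid; PISA itself reports the share of students at each level rather than a single level per country.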

KYRGYZSTAN'S PARTICIPATION IN PISA 2006

Since the breakup of the USSR, the Kyrgyz public and education community have raised the issue of the dramatic decline of education quality in the country. End-of-school and end-of-grade results were getting lower and lower. Overall funding of education declined, and teachers' salaries lagged far behind any economic developments. There was a common feeling that education quality was deteriorating. In 2001 and 2005, the local research firm El-Pikir carried out research for UNICEF to test fourth and eighth grade


students' mathematics, literacy, and life skills. These were among the first attempts to measure student performance at a national level after the collapse of the Soviet Union, and the results were quite alarming. Only 58.8% of fourth grade students passed a standard mathematics test, compared to 81.4% at the time of the first study in 2001. Only 44.2% passed a literacy test, down from 59.1% in 2001 (El Pikir, 2005). However, the El-Pikir study had the following limitations. First, the sample size of 1,900 students was very small, and there was skepticism about the results. Second, the El-Pikir research tools were not developed in accordance with any educational standards; in fact, the first State Standards of Education were not developed until 2005, so test expectations and benchmarks were based on textbooks. Third, the study could not provide comparative results with other countries. Thus, there was still a need for an objective and comparative measurement of the actual state of education quality in Kyrgyzstan. Kyrgyzstan participated in PISA for the first time in 2006. The decision for Kyrgyzstan to participate in PISA was taken in 2005 by the MoES of Kyrgyzstan, with the encouragement and financial support of the World Bank REP.6 Stephen Heyneman, a long-term World Bank education consultant, stated, "Kyrgyzstan is joining countries all over the world. My recommendation was for Kyrgyzstan to join from the beginning of PISA. About ten years ago, I was saying 'join PISA, join PISA.' I measure the speed of reform in this region by the number of countries taking part in PISA and TIMSS" (Interview, July 10, 2009). A local consultant of the World Bank REP commented,

We hesitated for a long time over whether to go with PISA or not. We thought that, as it was an international test, there would be test questions which were comfortable for French children, for example, and not for our kids. We then reviewed all PISA documents, and had discussions with other consultants about how the test items of PISA are developed.
We learned that PISA test items undergo very thorough examination and review and are adapted to each country specifically. There should be no shocking questions to any student from any part of the world. Only when all participating countries say, "Yes, this suits our country," be it Ethiopia, the United States or Kyrgyzstan, are PISA test items approved. (Interview, April 3, 2010)

According to a specialist in the MoES, "Everyone was excited to participate and see the results of PISA. It could be a tool to demonstrate the state of education in our country, which area of the education system is not performing well, and how bad or good the system is in comparison with other countries." The REP consultant added, "It was important to know not only where Kyrgyzstan stood, but why we stood where we stood, and what


should be done so that we could move forward." The following objectives for Kyrgyzstan's participation in PISA 2006 were identified by the MoES:7

a) to assess the educational achievement of Kyrgyzstan's students with a modern and international assessment tool;
b) to define what place Kyrgyzstan occupies in the world among the other countries on the level of preparedness of 15-year-old schoolchildren for adult life; and
c) to analyze the results of the research and propose recommendations and ways of school development and improvement.

PISA 2006 in Kyrgyzstan was conducted by the Center for Educational Assessment and Teaching Methods (CEATM),8 with financial support from the World Bank REP. About half a million US dollars was spent on conducting PISA in Kyrgyzstan. Around 6,000 students from 201 schools were randomly selected throughout the country. The test was conducted in Kyrgyz, Russian, and Uzbek languages. In addition to the test, a survey was conducted with schoolchildren and school administrations.9

PISA 2006 RESULTS IN KYRGYZSTAN

The PISA 2006 results were first presented on February 7, 2008 at an event attended by all educational officials from the MoES, representatives from the President's administration, members of the Kyrgyz parliament (Jogorku Kenesh), representatives of international organizations, and other stakeholders (Kiyizbaeva, 2008). The results of PISA 2006 showed that the 15-year-old students of Kyrgyzstan performed extremely poorly. Among the 57 participating countries and economies, Kyrgyzstan took last place. Finland performed highest in science (563 points); Chinese Taipei (549 points), Finland (548 points), Hong Kong-China (547 points), and Korea (547 points) performed highest in mathematics; and Korea performed highest in reading (556 points). Students of Kyrgyzstan achieved a mean score of 322 points in science, 311 points in mathematics, and 285 points in reading.10 These are the lowest scores among the participating countries and economies. Even among the participating post-Soviet countries (which included Estonia, Russia, and Armenia), Kyrgyzstan's results were poor. Only 13.6% of Kyrgyzstan's 15-year-old students were able to carry out a basic level of tasks in science, 11.7% in reading, and 11.8% in math. Over 85% could not score


even at the basic level of the PISA scale, meaning that a great majority of students could not demonstrate the science competencies that would enable them to participate actively in life situations related to science and technology (Report on PISA Assessment Results, 2007). The PISA 2006 results provided solid evidence of the terrible state of secondary education in Kyrgyzstan. A specialist from the KAE stated, "On one hand, the PISA result was shameful for us, but on the other, it was very useful because we were able to identify education quality in Kyrgyzstan according to international requirements. The poor result made all of us seriously think about our education system" (Interview, June 26, 2009). The REP specialist commented,

Different education stakeholders, educators, politicians, parents, and others discussed and debated the results of PISA in the press and through other media. Many were saddened by the poor result, realizing that the PISA results reflected the real and objective level of education quality. They started searching for ideas as to why Kyrgyzstan performed so poorly and discussing ways to improve the quality of education. Soon after the PISA results were released, a delegation comprising of a former minister of MoES, a specialist from MoES, and two representatives of the World Bank REP visited Finland to learn lessons from the success of Finland as the leading country in PISA 2006. The Kyrgyz delegation observed school life, learned about the structure of education system, curriculum design, textbook supply, and the decentralization of school management and autonomy of schools (see following sections for lessons learned and actions taken).

THE 2007 NATIONAL SAMPLE-BASED ASSESSMENT

When the PISA results became public, some people were skeptical about the poor results.11 They were mostly older government and educational officials who believed that the Soviet legacy of education in Kyrgyzstan still outperformed education in many other countries. They asked questions such as "We have an education system which we inherited

Impact of Standardized Testing on Education Quality in Kyrgyzstan


from the Soviet period, so how can it be that bad?" Some also wondered, "I cannot understand why Kyrgyzstan's students performed so poorly, because our students are winning international Olympiads in different subjects." There was distrust among certain segments of the population as to the validity of the PISA 2006 results as a true indicator of school quality in Kyrgyzstan. In 2007, the National Sample-Based Assessment (NSBA) was carried out by CEATM, as part of the World Bank's REP, in accordance with national standards. A local consultant for the World Bank's REP stated, We knew that the PISA results would be viewed negatively by some people in Kyrgyzstan and anticipated that there would definitely be criticism that we want to "rank with Europe or other developed countries." We said, okay, let us then see the results according to our own standards, because we wanted to triangulate and validate the findings of PISA 2006. We conducted the NSBA in 2007, and then we conducted a second NSBA in 2009. (Interview, April 3, 2010)

The NSBA was directly inspired and influenced by items in PISA 2006 and tested competencies and skills such as the application of concepts in different contexts and logical reasoning. The 2007 NSBAs at grades 4 and 8 were based on Kyrgyzstan's Education Standards of 2005. The results of NSBA 2007 confirmed the poor results of PISA 2006. In mathematics, no fourth graders scored at the highest level,12 while 62% scored below basic level, 28% at basic level, and only 8% above basic level. The scores in science and reading comprehension were similarly low: 64.8% of fourth graders scored below basic level in science and 64.4% in reading comprehension. At the eighth grade level, the results were even worse: 84.3% scored below basic level in mathematics and 73.5% below basic level in reading comprehension (CEATM, 2007).13 A World Bank REP specialist observed, "The results of the NSBA were nearly the same as those of PISA 2006. It means our students are not getting a quality education even according to existing national education standards" (July 28, 2009).

IMPACT OF THE PISA 2006 RESULTS IN KYRGYZSTAN

The PISA 2006 results increased awareness of the actual state of education quality in Kyrgyzstan. They also became a springboard for


DUISHON SHAMATOV AND KENESHBEK SAINAZAROV

advocacy efforts. In June 2009, the President of Kyrgyzstan presented a new concept paper on education.14 While not directly mentioning the PISA results, this document describes the deterioration of education quality, the importance of preparing well-qualified graduates, and the importance of quality education for Kyrgyzstan's future, reflecting the issues which emerged from PISA 2006. Government and education authorities started using the PISA results as a reference point in forums and meetings. The government repeatedly used Kyrgyzstan's poor PISA 2006 results as a justification for implementing reform. The REP country coordinator observed, "In all strategic and programme documents they [government] now include questions about PISA related to content of education, methodology and resources" (Interview, April 3, 2010). The poor results of PISA 2006 also gave education officials an opportunity to strategically gain support from international development agencies. Silova and Steiner-Khamsi (2009) report, "They [government education officials] had to convey a graphic sense of educational crisis to attract external funding … . After years of using ineffective strategies to attract international donors, the ministries of education finally learned to belittle their own accomplishments and instead emphasize how far their system lagged behind other countries" (p. 14). These new tactics were contrary to what government education authorities had been accustomed to doing for many years, that is, glorifying that their goals had been accomplished, often ahead of time. Now, they have become keen to state how far their education system was from "international standards" (Silova & Steiner-Khamsi, 2009, p. 15).
To secure a grant or loan, ministry of education officials have learned to speak the language of international donors and have familiarized themselves with the current philosophy of aid, which emphasizes needs rather than accomplishments.15 Many education specialists and international development agencies also started using PISA to support their attempts to improve the state of education. As the Soros Foundation education specialist remarked, "The main change is the start of discussion on problems of education. Based on these discussions, we are trying to develop new systems of education, new standards, and based on these new standards, we will develop textbooks and train teachers" (Interview, July 28, 2009). The PISA 2006 results affected efforts to improve education quality: in some cases, the results catalyzed new action; in others, they strengthened already existing efforts. Below are some examples of how the PISA results had a direct impact. Some problems were clearly illuminated by PISA and are being changed in response to it, while others are examples of government


responses using PISA to lobby for more funds from international donors. Still other examples simply describe the reasons behind the PISA results, which further highlighted existing problem areas.

Curriculum Reform

Many respondents asserted that the poor results in PISA 2006 were related to the outdated curriculum in Kyrgyzstan. The REP assessment specialist of the World Bank stated, "Our educational programmes do not meet the requirements or educational goals identified in the Education Development Strategy for 2011–2020.16 Our teachers mostly teach to develop rote memorization and retelling. But the PISA asks questions like 'Why? How do you use this formula? How does this formula work in real life?'" A specialist from CEATM added, "Our children cannot apply their knowledge in real-life situations. For example, there was a question in the PISA test asking where you should put a torch in order to get maximum lighting in the room, which requires knowledge of physics. Most students from Kyrgyzstan could not answer the question correctly." The PISA 2006 report recommended reforms to align the curriculum with international standards and to focus on modern skills and competencies at higher proficiency levels (Briller, 2009). "Curriculum is at the heart of everything, and all other reform initiatives are linked to curriculum reform. So, we are trying to change our curriculum according to international standards" (REP country coordinator, Interview, April 3, 2010). Curricular reforms actually pre-dated PISA 2006. A new national curriculum framework had been spearheaded by the Soros Foundation, Kyrgyzstan, prior to PISA 2006. An education specialist at the Soros Foundation commented, The curriculum framework is a main document in education,17 and all other documents should follow it. It describes the goals and objectives of education at different levels and the means to achieve those goals, including the methodology of teaching and the structure and organization of the education system. For example, how many hours should be taught at primary or secondary level, and how it should be assessed.
It will also have graduate profiles which describe what a secondary school graduate should be able to do, his or her competencies.18 (Interview, July 28, 2009)

At the same time, the Asian Development Bank's Second Education Project (SEP) is developing subject-based curricula. This curricular reform also pre-dates PISA 2006. Subject-based curricula for primary grades 1–4 have already been developed and approved, and subject curricula for grades 5–9 and 10–11 are being developed. These curricula aim to develop students'


competencies and include innovative teaching methods to achieve their objectives (SEP Specialist, Interview, April 3, 2010).19 While PISA did not initiate these curriculum reforms, the results provided clarity on where Kyrgyzstan stood internationally, and curriculum developers use the lessons and recommendations of PISA reports in their work.20 After the PISA 2006 results were announced, the MoES of Kyrgyzstan strongly supported the curricular reforms and pushed to expedite the process of curriculum development, which is just one step in a long process toward improving standards and quality in education. If we complete the development of the curriculum framework tomorrow, then the day after tomorrow we will write textbooks according to the new curriculum, and then train teachers accordingly. We will develop resources, and then, after we teach for five years, it will be necessary to participate in PISA and see the real outcome, the pluses and minuses of this new curriculum. Thus, it will take about 10 years before we see some significant changes. (Education Specialist, Soros Foundation of Kyrgyzstan, July 28, 2009)

The REP country coordinator added, "Curriculum change is the first step, and financing reform is another step, and then textbooks should be developed in accordance with the new curriculum, and then teacher training. I think step by step, everything should change" (Interview, April 3, 2010).

Reduction of Education Load

According to the PISA 2006 analysis offered by CEATM, overloaded learning time negatively affected Kyrgyz students' performance in PISA 2006. The education program in Kyrgyz schools, in terms of time spent in lessons, was the heaviest among all participating countries of PISA 2006. After the breakup of the USSR, new subjects were added to an already long list of subjects (Shamatov, 2010). The annual educational load for 15-year-old students in Kyrgyzstan in 2006 was 1,190 hours, while students in Finland clocked only 855 hours. As Steiner-Khamsi, Mossayeb, and Ridge (2007, p. 23) wrote, "The breadth of knowledge required is overwhelming as is the limited amount of time in which teachers have to cover it. This also assumes that children attend school every day and that teachers also attend regularly." Currently, the KAE is working to consolidate the existing 22 subjects into 14 subjects (Steiner-Khamsi et al., 2007). So far, the education load in Kyrgyzstan has been reduced by 10%, moving closer toward the average education load of other PISA-participating countries (President of the KAE, Interview, April 2, 2010).21 Mathematics, Kyrgyz language and literature, Russian language and literature, and foreign language were each reduced by


1 hour a week. Additionally, the following subjects were integrated: Arts (Meken Taanuu) and Ethics (Adep) in primary grades were combined into one subject, as were Ethics (Adep) and Civic Education (Adam jana Koom) in grades 9–11, further reducing the education load by 1 hour per week (Interview with KAE specialist, July 15, 2009).22 Longer contact hours for regular classes do not necessarily guarantee quality education, so reducing the education load is a positive step forward. The next step is to ensure that the reduced amount of time is used efficiently and effectively. Extracurricular activities, such as science clubs, fairs, competitions, and excursions, also positively affect students' performance and also have to be scheduled.

Shortage and Poor Quality of Textbooks

A shortage of textbooks and their poor quality was another reason most respondents agreed on for the poor PISA 2006 results. Insufficient quantities of textbooks and teaching materials, especially in the Kyrgyz language, and the poor quality of available textbooks and teaching materials were commonly reported to lead to poor quality education. According to the National Statistics Committee (2008), only 17% of Kyrgyz-medium schools are supplied with about 50% of their textbooks, and only 18% with more than 80% of their textbooks. Over 30% of Russian-medium schools are supplied with less than 50% of their textbooks, and only 24% of Russian-medium schools are supplied with more than 80% of their textbooks. The poor quality of textbooks is attributed to the textbook development and publication procedure. Currently, one institution, the KAE, is responsible for developing the requirements for writing textbooks and for approving them. The result is a conflict of interest, which has led to low-quality textbooks through the monopolization of the "business" of textbook development. Textbooks are developed by authors who are hired and approved by the KAE but who are usually removed from school life. Thus, according to an education official from Jalal-Abad, the textbooks these authors develop are usually overly theoretical and difficult for both teachers and students to use. An alternative to this process could be to develop new textbooks in accordance with the new subject curricula of the ADB's SEP. The project committed funds to print textbooks for primary grades in 2011, and a tender for manuscripts was announced in the summer of 2010.23


Teacher Shortages

Teacher shortage was identified in the Education Development Strategy for 2001–2020 as "the greatest barriers for quality improvement," calling it "the crisis of the pedagogical cadre."24 Shortages and the inadequate quality of teaching personnel were a significant factor contributing to the poor PISA performance of students from Kyrgyzstan. "About 25% of students from schools participating in PISA did not take one or more science classes in the academic year of 2004–2005. Only 3% of students studied at schools where there were no vacancies for science teachers, and 72% of vacancies were filled by teachers of other subject areas" (CEATM Report, 2009). This result provides alarming insight into the availability of qualified teachers in Kyrgyz schools. There are two major issues related to teacher shortages: teacher supply (quantity) and the poor qualification of the teaching force (quality). An increasing number of teachers are leaving teaching. According to research by USAID's Quality Learning Project, in 2005–2007 there was a gradual increase in the number of teachers resigning from their jobs throughout all provinces (USAID, 2009) (see Fig. 1 for details). According to a specialist from the MoES, teacher shortages remained between 3,000 and 4,000 each year from 2002 to 2007. Additionally, the percentage of young new teachers entering the teaching profession is decreasing, falling from approximately 60% in 2005 to 35% in 2007. Even those who begin teaching do not remain in schools very long due to

Fig. 1. Gradual Increase in Number of Teachers Resigning from Their Jobs Throughout all Provinces. Source: Survey among district education departments, 2007, USAID Quality Learning Project Report, 2009.

Fig. 2. Loss of Teachers by Subject Area. Source: Survey among district education departments, 2007, USAID Quality Learning Project Report, 2009.

professional and socioeconomic difficulties (Shamatov, 2005). According to data collected by district education office staff in 2007, schools lost 32% of foreign language teachers, 27–28% of Russian language teachers, 27–28% of computer science teachers, 15% of history teachers, 15% of biology teachers, 12% of primary grade teachers, and 12% of mathematics teachers (USAID, 2009) (see Fig. 2). Teacher supply has become one of the biggest problems in Kyrgyzstan, and it leads to a number of issues related to the qualifications and experience of the teaching corps. Older teachers with over 10–15 years of experience make up the majority of teachers, and teaching vacancies are often filled with unqualified teachers. According to research on teacher shortages at the school level, real shortages are far greater than official figures indicate, and the impacts of the shortages are numerous. One impact is that classes are simply not taught. Subjects are cancelled when the school does not have a teacher to teach them, or instruction time is cut. Sometimes, "ghost" lessons are reported but never actually taught. Another impact of the shortages is teachers without adequate training, as the best people have little incentive to join the profession. In place of qualified teachers, professionals without pedagogical training and university students work as teachers. Alternatively, pedagogical specialists teach subjects for which they have not been trained. Existing qualified teachers carry the burden of the shortages, with teachers working at or past retirement age, teaching multiple loads, and working in undivided classes with multiple


subjects. Schools often try to hire teachers away from other schools, further weakening an already vulnerable system (Steiner-Khamsi, 2009). The government is trying to attract and retain more young teachers at schools to rectify teacher shortages. Officials are trying to attract new teacher education graduates to rural schools by promising better working conditions or by providing the teachers with land plots from the village governments. To retain young teaching graduates in rural public schools, the government of Kyrgyzstan introduced a new project called "Deposit for Young Teachers" in 2004 (Kanimetova, 2005). To execute this program, the Ministry of Education of the Kyrgyz Republic organized a competitive recruiting campaign to select beginning teachers. In 2004, 200 beginning teachers were selected, signed a contract, and were credited 2,000 som monthly in addition to their salaries, for a total of 76,000 som each, to be withdrawn only after completing their contracts.25 Beginning teachers selected for this program underwent training and were under contract to work for three years at the schools to which they were assigned. Government officials hoped that retaining these young teachers at village schools for three years would lead to their adaptation and continued commitment to teaching at the same schools. The "Deposit for Young Teachers" program has had limited success so far, but the MoES aims to continue it due to the absence of other strong alternative mechanisms. Teachers nowadays face a range of complex and multifaceted problems. Factors such as the increasing complexity of teachers' work and life, lack of recognition and respect, negative treatment from stakeholders, lack of support from administrators and colleagues, and criticism and resentment from parents all strongly affect whether or not young teachers stay in the profession.
Young teachers should be provided with professional development tools to develop the skills and knowledge to become active and effective teachers who, in turn, can positively influence other teachers. The authorities should focus on improving the conditions of teachers and schools through additional incentives, including improving administrative policies and practices, involving local communities in school matters, and investing more funds in schools. They should also reduce beginning teachers' workloads so that they can focus on their professional development (Shamatov, 2005).

Ineffective Teaching Methods

Ineffective teaching approaches were also commonly cited as one of the main causes of the poor PISA results. This was linked to poor systems of


preservice and in-service training for teachers and the lack of consistent and motivational teacher evaluation systems. Most teaching was reported to be poor and not aligned with modern theories and practices of teaching and learning. A specialist from KAE observed, More than 70 percent of teachers in Kyrgyzstan are doing their job inertially or routinely. They just come to work, pretend to be teaching and then leave. Teachers only cover the daily plans which are developed by the Ministry of Education. Only about 5 percent of teachers run here and there in order to update their knowledge. Students also do not like teachers’ teaching these days, because what teachers teach often has no relevance to students’ daily lives. (Interview, June 25, 2009)

An education specialist of the Soros Foundation, Kyrgyzstan, stated, "Our education quality has deteriorated in the last 15 years. Even though the content of education has been changing, the modified content was not supported by methodological materials, teacher training, or teacher preparation; resources have shrunk, and all this has led to worsening quality." The REP coordinator of the World Bank added, "The nature of the tests was unfamiliar to our students. Our students are used to responding schematically to specific types of questions, and thus they did not even attempt to respond to some questions. This also shows that our students are not taught to analyze or to form and express their own opinions." A professor at a public university in Bishkek observed, "Teachers are not equipped with skills on how to design and conduct tests. They rarely use tests in their classes; therefore, their students are not prepared to take tests, as was the case with PISA 2006." It is essential to improve quality by teaching subjects in greater depth as well as with more effective teaching methods and materials. Even though the competency-based approach to teaching and student assessment is inscribed in the current curriculum framework, it remains to be implemented in practice. Many international development agencies are assisting local education authorities to provide effective in-service teacher education by introducing elements of student-centered and interactive teaching methods. However, preservice teacher education has been neglected: large donors considered teachers a "lost generation," not worth investing in (Silova & Steiner-Khamsi, 2009, p. 32); higher education reform has not been a priority of international aid; and, finally, given that fewer than half of teacher education graduates ever enter the teaching profession, it is not seen as a good investment.


Shortage of Resources and Materials

Inadequate educational resources also help explain the dismal results of students from Kyrgyzstan. The PISA 2006 survey that examined the level of school resources demonstrated that, compared to other OECD countries, schools in Kyrgyzstan have a very low level of school resources (OECD, 2007). There are significant relationships between the level of material resources and overall performance. Over 90% of school directors surveyed for PISA 2006 referred to a lack of, or low quality of, physical and material resources such as laboratories, textbooks, computers, Internet access, libraries, audio-visual aids, and other tools as a cause of the poor quality of education. The supply of resources and materials remains problematic. Government education officials have managed to use the PISA 2006 recommendations to gain support and more resources from donor agencies. The REP country coordinator stated, The PISA 2006 results were "screaming" that there was a shortage of educational equipment. The Ministry of Education started requesting that donor agencies provide textbooks, resources and equipment. For example, they requested the Foundation of Eradication of Poverty to provide computers to schools, and five thousand computers were purchased for 100 million soms. (Interview, April 3, 2010)

Lack of School Autonomy and Financial Reform

The lack of school autonomy was also cited as a reason for the poor PISA result. From 2005, as part of a decentralization reform, state funds were allocated to village governments, which were responsible for distributing money to schools. According to a school director from Jalal-Abad oblast, school administrators operated under rigid and very difficult circumstances: Since 2005 when decentralization was adopted, the schools are serving at least three "bosses." On a daily basis, my immediate "boss" is the head of the village government. I am dependent on him as the whole budget comes through the local government. My second "boss" is the head of the district education administration. These both serve my third "boss," who is the head of the district government.

School administrators’ ability to develop and manage their budget, formulate school curriculum, and adjust school management in order to compete with other schools is severely limited by a centralized system that continues to mirror the system put in place during Soviet times. School curriculum is formulated and administered centrally by the MoES, and schools have little flexibility in adjusting curriculum and school


management.26 The budget allocation process, which involves bargaining and centralized discretion, is nontransparent, unpredictable, and cumbersome, and it does not address long-term strategic issues, resulting in an inflexible and inefficient use of scarce resources. In response to this issue, per capita school financing was implemented (Briller, 2009). This was done to increase "cost effectiveness and efficiency by decentralizing education finance, including financial autonomy at school level by introducing per capita financing, and by enhancing social accountability and participation" (Silova & Steiner-Khamsi, 2009, p. 19). Since 2006, the Ministry of Education, with support from the World Bank and USAID, has been piloting per capita financing at schools.27 The per capita funding system decentralizes budget management to the school level, opens access to school budgets, and introduces accountability mechanisms for budget management. It prioritizes school autonomy, allowing schools to make allocative choices in their budgets according to their individual needs. This reform was initiated prior to PISA 2006 by USAID's Participation, Education and Knowledge Strengthening (PEAKS) project and the World Bank's REP, in Tokmok city of Chüi oblast (province).28 PEAKS completed its project in 2005, and a new USAID initiative called the Quality Learning Project (QLP), in cooperation with the Ministry of Education, continued the work of PEAKS and expanded the per capita funding model to all 301 schools of Chüi oblast.29 The PISA 2006 results gave new impetus and support for an immediate shift to per capita funding among educators and other stakeholders. However, in the per capita funding scheme, the money goes directly to schools rather than to village governments, and village governments resent this innovation.
At the same time, school administrators’ capacity to manage per capita funding has not been very strong, so the outcomes of this reform initiative remain to be seen.
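In outline, a per capita allocation multiplies a base amount per student by enrollment, sometimes adjusted by coefficients for a school's circumstances. The sketch below is purely illustrative: the base amount and the rural and mountain uplifts are hypothetical values chosen for the example, not the coefficients actually used in Kyrgyzstan's pilot.

```python
# Illustrative sketch of per capita school financing.
# NOTE: the base amount and uplift coefficients below are
# hypothetical, invented for illustration; they are not the
# actual values used in the Kyrgyz pilot.

def school_budget(enrollment: int, base_per_student: float,
                  rural: bool = False, mountain: bool = False) -> float:
    """Return a school's allocation: enrollment x base amount,
    scaled by optional uplifts for rural and mountain schools."""
    coefficient = 1.0
    if rural:
        coefficient += 0.20   # hypothetical rural uplift
    if mountain:
        coefficient += 0.15   # hypothetical mountain uplift
    return enrollment * base_per_student * coefficient

# A large urban school vs. a small mountain village school:
urban = school_budget(600, base_per_student=5000)       # 3,000,000 som
village = school_budget(120, base_per_student=5000,
                        rural=True, mountain=True)      # ~810,000 som
```

The design point the chapter describes is that money follows the student: a school's budget scales transparently with enrollment rather than being negotiated through a village government.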

Equity Issues

The PISA 2006 results also highlighted existing issues related to equity and access to quality education.30 Students at private and elite urban schools in Kyrgyzstan performed significantly better in PISA 2006 than their rural counterparts. The higher social and economic status of students in these schools, one of the main factors affecting literacy level, clearly influenced the test results.


The officially endorsed post-Soviet diversification of schools, which created "new type" schools, further stratified Kyrgyz society.31 A small number of parents can now afford to choose quality education for their children (EFA, 2000). The "new type" schools offer advanced coursework in addition to the national curriculum.32 They normally select pupils with good academic qualifications through interviews or entrance tests, and they develop their own curriculum. They generally provide a better and more comprehensive education than "ordinary" state-funded schools. Although they can charge parents for extra services such as additional lessons in academic subjects, these schools also receive more national funding for teacher salaries.33 Their pupils have a better chance of entering prestigious higher education institutions upon completing school. Teachers often prefer to work in the "new type" schools because they get better pay and more motivated pupils. As a rule, these schools are located in urban areas, where wealthier families who can afford to pay school fees live (Open Society Institute, 2002). This imbalance raises serious concerns, because almost 70% of Kyrgyzstan's population lives in rural areas and 83% of schools are in rural settings (UNDP Report, 2003).34 Rural schools in post-Soviet Kyrgyzstan are experiencing devastating challenges. They lack funds and material support from the government, and they serve impoverished communities. Rural community members normally have a low opinion of education, and teachers and students have additional responsibilities, including agricultural work, which compete with their school work. PISA 2006 confirmed the huge gap between the quality of education offered at urban and rural schools. Unfortunately, this gap is increasing; some urban schools are becoming stronger, whereas the majority of rural and mountain schools are deteriorating.
The large majority of rural, semirural, and mountain schools still teach facts and memorization, but the PISA test assesses higher-order thinking and the application of knowledge in real, practical life. While PISA may catalyze or support efforts at curricular reform, until such reform is a reality in all schools across Kyrgyzstan, the PISA results will likely remain low.

ANALYSIS AND DISCUSSION

The results of PISA 2006 were shocking for Kyrgyzstan.35 However, the dismal performance in PISA 2006 also inspired self-reflection and self-realization in the Kyrgyz school system. Administrators and educators are


now increasingly involved in advocacy and policymaking. The PISA 2006 results have shaped public opinion through references in the mass media, and education policy debates have been affected. While policymakers initiated reforms in education before the PISA 2006 findings, they are now legitimizing their recommendations and actions with the PISA results. Reforms and setbacks following PISA 2006 are shaped by the broader context. Political instability in Kyrgyzstan, including two drastic government changes in the past five years, has led to discontinuity in the work of education authorities. A specialist from KAE noted, "There have been constant changes in the appointment of Ministers of Education. We have had 10 ministers of education in the last 15 years. How can we expect improvements? When one minister is just getting to conceptualize the education reforms, he or she is replaced" (Interview, June 25, 2009). There is also a lack of strong local capacity among education experts and policymakers. Reforms are implemented sporadically and ad hoc, with different planning agencies and implementing bodies that do not communicate. Most reform initiatives and documents are conceptualized and designed primarily by international agencies. "Education system reforms have been driven primarily by the agendas and procedures of the funding and technical assistance agencies, with the result that reforms are imposed externally rather than initiated internally" (Silova & Steiner-Khamsi, 2009, p. 10). Since independence, Kyrgyzstan has been subject to a myriad of international education-assistance projects involving international agencies, private foundations and philanthropists, and international nongovernmental organizations.36 These international organizations are now assisting the Ministry of Education to conduct major education reform, using the results and lessons learned from the PISA 2006 experience and survey.
Reform has been initiated in a range of areas, including curricular reform, the introduction of standards and/or outcome-based education, student-centered learning, decentralization of education finance and governance, and standardization of student assessment. While the contributions of the donor agencies are praiseworthy and much needed, there is often dissonance between the discourse of donors and local needs. It is still unclear whether the initiatives of donor agencies truly reflect local needs and bring about sustainable improvements. Moreover, different components of education, such as the curriculum framework, subject curricula, assessment, teacher development, and textbook development, are handled by different agencies that often work with little or no communication. There is no effective coordination among the

DUISHON SHAMATOV AND KENESHBEK SAINAZAROV

international and national institutions working in the education sector. A KAE specialist argued,

It is true that there are many international organizations working in the education sector, but the problem is that in most cases they choose education issues and problems for their projects themselves, without asking the MoES for suggestions. Sometimes, they repeat already implemented projects. Unfortunately, the MoES also does not actively suggest educational issues. (Interview with KAE specialist, June 25, 2009)

There is no systematic, well-coordinated effort (REP assessment specialist, April 3, 2010); on the contrary, there is often overlap and duplication. Most reform initiatives are not institutionalized, indicating a lack of sustainability (Steiner-Khamsi et al., 2007). Systemic change in the education system is possible only when all stakeholders and international organizations coordinate their activities with each other37 and when initiatives focus on strengthening institutionalization and sustainability. Thus, Kyrgyzstan, like other Central Asian states, has become a kind of testing ground for how transnational forces affect educational policy. Chabbot (2009) observes that powerful international nongovernmental organizations can have a substantial impact on newly formed states. As new states with suddenly reduced resources for education, the post-Soviet Central Asian states were particularly prone to these impacts. While the PISA results were helpful in shaping education reforms, it is also apparent that Kyrgyzstan has jumped on the bandwagon of relying too heavily on test results to shape its educational reforms.38 Because of low capacity in the country, it was not possible to engage critically with these discourses. We believe that transnational forces have affected educational policy and that, rather than fostering initiative within Kyrgyzstan, they have produced a growing lack of ownership and empowerment and a growing dependency on external aid and ideas for educational reform initiatives.39

CONCLUSIONS

Developing new standards and curriculum, reducing education loads, modernizing school infrastructure and equipment, improving teaching standards and performance, and introducing per capita financing were some of the reforms that gained new impetus from the results and lessons of PISA 2006. The PISA 2006 results provided further proof that the national standard of education has to improve and that significant changes in curriculum and teaching methodologies are needed. These

changes are now being implemented by the Ministry of Education with the support of international donor agencies. PISA has been a driving force for reforms, though within current economic constraints it is difficult to expect drastic changes. The Ministry of Education has been actively using the PISA results to gain support from international donor agencies, and the PISA data have been used by different actors in the educational debate in different ways to support their positions. The PISA test demonstrated that there is a huge gap between the quality of education offered in urban and in rural schools. Unfortunately, this gap is not decreasing but increasing. A few urban schools are becoming stronger, while the large majority of rural and mountain schools are changing for the worse. The large majority of rural, semirural, and mountain schools still teach their students "facts" and "memorization," whereas the PISA test assesses higher-order thinking and the application of knowledge in real life. One of the main reasons Finland took the top position in PISA 2006 is that it has a strong secondary school system across the country, with no large divide between the quality of education in urban and rural settings; almost all students in Finland have access to quality education. As this is a matter of public policy, the Kyrgyz government needs to recognize that this widening gap in education quality is reinforcing existing inequalities, and it needs to invest in improving education quality, particularly in rural schools, so that the gap is reduced. PISA 2006 provided a reliable and objective assessment of education quality and can effectively strengthen messages for reform to government authorities and other education stakeholders. The test results showed that the majority of students cannot apply their knowledge and skills in real-life situations: they have little understanding of concepts and have mostly memorized facts.
This is dangerous for the country's future, because if school students do not develop critical and analytical thinking and problem-solving skills, and cannot use their knowledge in real life, they will become citizens who are poorly prepared to address the issues and challenges of our dramatically changing societies. The PISA results are a wake-up call that has already strengthened, and can further strengthen, education reform efforts in Kyrgyzstan, as long as that reform is carried out in a systematic and well-coordinated way. Fullan and Miles (1992), analyzing the history of successful and unsuccessful reforms, asserted that most reforms fail because those who push for change do not involve all stakeholders, do not recognize the complexity of their problems, and adopt superficial and quick solutions. Moreover, failure to institutionalize an innovation underlies the

disappearance of many reforms. To truly build on the results of PISA 2006 and initiate the necessary changes, reforms in the education system of Kyrgyzstan must be systematic and sustainable, and based on the inputs of all stakeholders.

NOTES

1. According to statistics provided by the Ministry of Education and Science, in 2009–2010 there were 2,134 public schools in Kyrgyzstan, of which 1,379 were Kyrgyz-medium schools, 162 Russian-medium schools, 137 Uzbek-medium schools, and seven Tajik-medium schools, as well as 449 schools that had two or more languages of instruction. There were also 73 private schools. 2. Korth and Schulter (2003) observe that the Russian-medium schools continue to offer better education than schools teaching in Kyrgyz and other local languages. The Russian schools continue to enjoy high prestige and are attended by children of different linguistic backgrounds, while Kyrgyz-medium schools are attended exclusively by Kyrgyz children. 3. Significant declines in enrolment in preschool institutions across Central Asia are related to the increased costs of education, reduced state subsidies for transport and food, and lower family incomes. 4. To be classified as a "new type" school, a school has to have highly qualified, innovative teachers and sufficient facilities and resources, including textbooks and a library. DeYoung, Reeves, and Valyaeva (2006) observe that these schools improved further by untangling themselves from the mandated requirements of government regulations on compulsory curriculum and schooling policies, and by attracting money from international development agencies. 5. National Statistics Committee Report, Education and Science in Kyrgyzstan, 2008, Table 7.25. 6. The REP of the World Bank aims to improve learning and learning conditions in primary and secondary education in Kyrgyzstan. Apart from financially supporting Kyrgyzstan's participation in the comparative assessment of PISA, REP is also implementing a pilot project to introduce formative and summative assessment in selected oblasts of Kyrgyzstan (Talas and Yssyk Köl). Under REP, teachers and school administrations are assessed based on their performance.
This promotes the establishment of merit pay based on certain standards and the value-added approach that was recommended for systems to improve their performance in PISA tests (Briller, 2009). 7. CEATM, Report of international assessment. http://www.testing.kg/ru/projects/what/ (June 20, 2009). 8. CEATM is an independent testing organization in Bishkek, Kyrgyzstan. It was founded by the American Councils for International Education with the financial support of the United States Agency for International Development. See website www.testing.kg

9. CEATM, Uchimsiya dlya Jizni – Rezultaty mejdunarodnogo sravnitel'nogo issledovaniya funktsional'noi gramotnosti 15-letnikh uchashihsya. PISA-2006 (Bishkek: CEATM, 2008), p. 12. 10. The PISA 2006 assessment included 108 different questions at varying levels of difficulty. Usually several questions were posed about a single scientific problem described in a text or diagram. In many cases, students were required to construct a response in their own words to questions based on the given text. Sometimes they had to explain their results or show their thought processes. Each student was awarded a score based on the difficulty of the questions that he or she could reliably perform. Scores were reported for each of the three science competencies and for overall performance in science. The science performance scales were constructed so that the average student score in OECD countries is 500 points. In PISA 2006, about two-thirds of students scored between 400 and 600 points (i.e., a standard deviation equal to 100 points). A score can be used to describe both the performance of a student and the difficulty of a question. Thus, for example, a student with a score of 650 can usually be expected to complete a question with a difficulty rating of 650, as well as questions with lower difficulty ratings. 11. The Parliament (Jogorku Kenesh) of the Kyrgyz Republic reacted strongly when the PISA results were made public, and some members of parliament questioned the validity of the PISA study and especially its sampling procedure. According to them, more (or only) good schools should have participated in the PISA study. 12. According to the definition given by CEATM, "below basic" means that "students do not demonstrate sufficient knowledge and skills for successful further learning." 13. NSBA results report, CEATM, 2007, http://www.testing.kg/files/NSBA07/NSBA_Report_2007.pdf 14.
Information Agency "24.kg," Konsepsiya provedeniya reformy obrazovaniya v Kyrgyzstane, predstavlennaya prezidentom strany Bakievym K.S. http://www.24.kg/community/2009/06/10/114742.html (July 22, 2009). 15. However, this strategy of stressing shortcomings can be misleading, because "in order to establish a need for external intervention or funding, the ministries of education sometimes tamper with statistics" (Silova & Steiner-Khamsi, 2009, p. 16). 16. Education Development Strategy 2011–2020 in the Kyrgyz Republic. Draft for discussion and review by internal and external stakeholders, November 2, 2008, p. 10. 17. Ministerstvo Obrazovaniya i Nauki, Proekt – Ramochniy Natsional'nyi Kurrikulum srednego obshego obrazovaniya Kyrgyzskoi Respubliki, Bishkek, 2008, p. 2. 18. Soros Foundation – Kyrgyzstan, Programma – Obrazovatel'naya reforma, http://www.soros.kg/index.php?option=com_content&view=article&id=597&Itemid=125&lang=ru (August 10, 2009). 19. There is a disconnection between those who are developing new curricula and those who will ultimately implement them. It is, therefore, unclear how subject curricula developed by an international development agency will be accepted, approved, and implemented by the local education institution responsible for education standards and content. Currently, there are no clear agreements between SEP and the Kyrgyz Academy of Education, a body under the Ministry of Education and Science that is responsible for educational standards, content, and textbooks.

20. Both curriculum reform initiatives, by the Soros Foundation and by the SEP of ADB, attempt to shift from a content-based to an outcome- or competency-based curriculum. Competency is defined as the integrated ability of a person to apply different elements of knowledge, skills, and abilities in certain life situations. The main goal of this approach is for children to be able to use their school knowledge in real or close-to-real-life situations. 21. The reduction of the education load has freed funds that can be used to encourage performance-based teaching by increasing teachers' salaries according to their performance. 22. Although this reduction was related to the per capita financing mechanism and the initiative started well before PISA came in 2006, those who promoted this reform now use the PISA results to solidify their arguments and convince others of the worth of their efforts. One weakness is that the 10% reduction has so far taken place only in the primary grades; for other grade levels, the reduction is implemented only in those oblasts where the per capita financing mechanism is being piloted (Chüi, Yssyk Köl, Batken). 23. However, the situation seems much more complex and confusing at the moment. The Secondary Education Project of ADB operates under the Ministry of Education and Science; however, representatives of the SEP of ADB have had strained relations with the Kyrgyz Academy of Education. Thus, it remains to be seen how effective this textbook development initiative will be. 24. Education Development Strategy 2011–2020 in the Kyrgyz Republic, for discussion and review by internal and external stakeholders, November 2, 2008, p. 10. 25. Originally, it was announced that 40 beginning teachers would be selected for the "Deposit for Young Teacher" program.
They were to be credited 3,000 som monthly in addition to their salary, for a total of 108,000 som each, which they could withdraw after completing their contracts (Kyrgyzinfo, September 1, 2004). 26. Traditional budget allocation processes were inherited from the Soviet period, with each school formulating its budget estimate using norms based on inputs (such as the number of classes, the number of teaching hours, the number of square meters, and previous years' budget allocations) rather than on outputs. The school then submits its estimate to the municipal administration, where proposals are reviewed against actual resource availability. Usually, the actual budget received is much less than the submitted budget. This procedure limits school directors' autonomy in making allocative choices based on the particular needs of each school. There is no incentive for schools to economize on particular areas of spending, since a school is unlikely to benefit from the savings; school budgets are sometimes cut by the amount saved during the previous year. Indeed, there is more incentive for schools to inflate their needs just to get their basic needs covered. 27. The per capita pilot project is functioning throughout Chüi oblast, and from 2010, two more oblasts, Yssyk Köl and Batken, rolled out the per capita funding system completely (Minister of Education and Science, Information Agency "Kabar," July 20, 2009). It is, however, puzzling to us why the term "pilot" is so often used, and misused, in donors' involvement in Kyrgyzstan, because most "pilot projects" finish as their term ends, and normally nothing is left behind. 28. Final Project Completion Report (CAR & PEAKS, 2008, p. 27). 29. USAID – Quality Learning Project (QLP), Year 1 Work Plan, October 1, 2007 – September 30, 2008, March 4, 2008, p. 27.

30. Since the breakup of the USSR, which (at least in theory) aspired to egalitarian principles, issues of equity have become less pronounced. 31. Though Soviet education espoused equality and uniformity, many scholars argue that Soviet schooling was never really monolithic or egalitarian, contrary to official doctrine. Besides clear disparities between Russian- and non-Russian-medium schools, obvious status differences existed between urban and rural schools, as well as between schools with an emphasis on English or mathematics (Niyozov, 2001; see also Sutherland, 1999). Korth and Schulter (2003) observe that the Russian-medium schools continue to offer better education than schools teaching in Kyrgyz and other local languages. The Russian schools continue to enjoy high prestige and are attended by children of different linguistic backgrounds, while the Kyrgyz schools are attended exclusively by Kyrgyz children (Korth & Schulter, 2003). 32. To generate extra income, many urban schools are turning themselves into gymnasia or other schools of the "new type." To become a "new type" school, a school has to have highly qualified, innovative teachers and sufficient facilities and resources, including textbooks and a library. "New type" schools offer more advanced coursework, and their pupils regularly win various regional and national academic competitions. 33. There have been many controversies around additional education services, and many people view these as another means of "pumping out" money from pupils' parents; there is no standard or norm that guides school administrators when they collect money for additional services. Officials have attempted to monitor and regulate the practice, but so far mostly without success. 34. Village schools experience immense problems of funding, scarcity of resources, and teacher shortages. In many rural areas, no new schools can be built due to a shortage of funds, although the number of school-age children continues to grow. 35.
Kyrgyzstan participated in PISA for the second time in 2009. The results of PISA 2009 are expected to become public by the end of 2010 or in early 2011. The assessment specialist of REP anticipates an improvement over PISA 2006: "We tried to eliminate the effect of surprise and uncertainty this time. Of course we cannot go to every school, so we worked through district education departments. We shared information via mass media; for example, we published several sample questions in the teachers' newspaper 'Kutbilim,' circulated nationally." 36. Silova and Steiner-Khamsi (2009, p. 60) observe that education reforms are often imposed from outside or voluntarily borrowed out of fear of falling behind internationally. Thus, from the early 1990s, different international agencies (UN, World Bank, IMF, EBRD, ADB, OSCE), foreign agencies (USAID, JICA, CIDA, TOCA, GTZ, DANIDA), private foundations and philanthropists (OSI, Soros Foundation, Aga Khan Foundation), and international nongovernmental organizations (Save the Children, Mercy Corps, Academy of Educational Development, CARE) have been working actively in the field of education. 37. Amaliya Benliyan, Grajdanka Gita, Newspaper Vecherniy Bishkek, Friday, July 28, 2008, p. 27. 38. It is also important to note that there are limitations to the PISA process too. PISA produces large amounts of data, but the data do not adequately convey in-depth information about education (Figazzolo, 2009). Also, as a standardized test, PISA cannot reflect a complete picture of the school system and education quality in a

particular country. Moreover, certain skills and competencies cannot be assessed with a standardized test. Practitioners also caution that a focus on standardized tests can negatively affect the classroom experience. When a state pays too much attention to testing and introduces more measurement and national testing mechanisms based on the PISA model and methodology, such actions can lead to fundamental changes in teaching and learning (Figazzolo, 2009). Standardized testing programs can lead to new instruction that contradicts teachers' educational practice: "To avoid pressure, teachers mostly change the format of class and devote large amounts of classroom time to test preparation activities" (Abrams, Pedulla, & Madaus, 2003, p. 18). 39. Even at the school level, there is a growing dependency on external sources for survival. See DeYoung et al. (2006) on how school administrators in Kyrgyzstan are trying to adapt to changes and adopt survival strategies.

ACKNOWLEDGMENTS

The authors of this chapter would like to express their gratitude to Daniyar Karabaev (Institute for Professional Development, Khorog) for his enthusiastic assistance in the data collection process. They also thank Sia Nowrojee for her insightful feedback, eloquent reading, and editorial assistance on the chapter.

REFERENCES

Abazov, R. (2004). Historical dictionary of Kyrgyzstan. Lanham, MD.
Abrams, L. M., Pedulla, J., & Madaus, G. (2003). Views from the classroom: Teachers' opinions of statewide testing programs. In: The impact of high-stakes testing. Theory into Practice, 42(1), 18–29. Mahwah, NJ: Lawrence Erlbaum Associates.
Akiner, S. (1998). Social and political reorganization in Central Asia: Transition from pre-colonial to post-colonial society. In: T. Atabaki & J. O'Kane (Eds), Post-Soviet Central Asia (pp. 1–34). London: Taurus Academic Studies.
Allen, J. B. (2003). Ethnicity and inequality among migrants in the Kyrgyz Republic. Central Eurasian Studies Review, 2(1), 7–10. Available at http://cess.fas.harvard.edu/cesr/pdf/CESR_02_1.pdf. Retrieved on May 5, 2003.
Apple, M. (1993). Official knowledge. New York: Routledge.
Bell, J. (1995). Doing your research project. Buckingham: Oxford University Press.
Bogdan, R. C., & Bicklen, S. K. (1998). Qualitative research in education (3rd ed.). Boston: Allyn and Bacon.
Briller, V. (2009). Learning achievement in CEE/CIS region: An analysis of 2006 PISA results. Presentation made at the 7th Central Asian Forum on Education organized by UNICEF (September 15–17), Bishkek, Kyrgyzstan.
CAR & PEAKS. (2008). Final project completion report, p. 2.
CEATM. (2009, June 19). The programme for international student assessment. Available at http://www.testing.kg/en/projects/pisa/

Chabbot, C. (2009). Constructing education for development: International organizations and education for all. London: Routledge.
Clandinin, D. J., & Connelly, F. M. (2000). Narrative inquiry: Experience and story in qualitative research. San Francisco: Jossey-Bass.
Cohen, L., & Manion, L. (1997). Research methods in education (4th ed.). London: Routledge.
DeYoung, A. (2001). West meets East in Central Asia: Competing discourses on education reform in the Kyrgyz Republic. Unpublished manuscript.
DeYoung, A. (2002). West meets East in Central Asia: Competing discourses on education reform in the Kyrgyz Republic. Journal of Educational Research, Policy and Practice, 3(3), 3–45.
DeYoung, A. (2004). On the current demise of the "Action Plan" for Kyrgyz education reform: A case study. In: S. P. Heyneman & A. DeYoung (Eds), The challenges of education in Central Asia (pp. 199–224). Greenwich, CT: Information Age Publishing.
DeYoung, A., Reeves, M., & Valyaeva, G. K. (2006). Surviving the transition? Case studies of schools and schooling in the Kyrgyz Republic since independence. Greenwich, CT: Information Age Publishing.
DeYoung, A., & Santos, C. (2004). Central Asian educational issues and problems. In: S. P. Heyneman & A. DeYoung (Eds), The challenges of education in Central Asia (pp. 65–80). Greenwich, CT: Information Age Publishing.
El Pikir. (2005). Second Monitoring Learning Achievements study. Commissioned by UNICEF.
Figazzolo, L. (2009). Impact of PISA 2006 on the education policy debate. Education International. Available at http://download.ei-ie.org/docs/IRISDocuments/Research%20Website%20Documents/2009-00036-01-E.pdf
Frankel, J., & Wallen, N. (1993). How to design and evaluate research in education (2nd ed.). New York: McGraw-Hill.
Fullan, M., & Miles, M. (1992). Getting reforms right: What works and what doesn't. Phi Delta Kappan, 73(10), 744–752.
Gleason, G. (1997). The Central Asian states. Boulder, CO: Westview Press.
Glesne, C. (1999). Becoming qualitative researchers: An introduction (2nd ed.). New York: Longman.
Haugen, A. (2003). The establishment of national republics in Soviet Central Asia. Great Britain: Palgrave.
Heyneman, S. (2000). From the party/state to multiethnic democracy: Education and social cohesion in Europe and Central Asia. Educational Evaluation and Policy Analysis, 22(2), 173–191.
Hitchcock, G., & Hughes, D. (1995). Research and the teacher: A qualitative introduction to school-based research. London: Routledge.
Holmes, B., Read, G. H., & Voskresenskaya, N. (1995). Russian education: Tradition and transition. New York: Garland Publishing.
Ibraimov, O. (Ed.) (2001). Kyrgyzstan: Encyclopedia. Bishkek, Kyrgyzstan: Center of National Language and Encyclopedia.
Jusupov, K. (1993). Kyrgyzdar: Sanjyra, tarykh, muras, salt. Bishkek, Kyrgyzstan: Akyl.
Kanimetova, A. (2005, February 16). Vvedenie "Depozita molodogo uchitel'ya" ne reshit polnost'iu problemu nehvatki uchitelei v shkolah Kyrgyzstana. Kabar news.
Kiyizbaeva, Ch. (2008, February 20). Artky orundan alga kantip jylabyz? (How can we move up from the last position?). Kutbilim, p. 5.
Kolbin, L. M. (1960). Kirghizskaya SSR. Moskva: Izdatelstvo VPSh and AON under Central Committee of CPSU.

Korth, B. (2001a). Analyzing language biographies. Available at http://www.ca-research-net.org/pdf/Korth-Language_biographies.pdf. Retrieved on January 14, 2002.
Korth, B. (2001b). Bilingual education in Kyrgyzstan: Pros and cons. In: Proceedings of the conference on bilingual education and conflict prevention (pp. 141–150). Bishkek: CIMERA.
Korth, B., & Schulter, B. (2003). Multilingual education for increased interethnic understanding in Kyrgyzstan. Cimera Publications. Available at http://www.cimera.ch/files/biling/en/MLG_Text1.pdf. Retrieved on February 28, 2003.
Landau, J. M., & Kellner-Heinkele, B. (2001). Politics of language in the ex-Soviet Muslim states: Azerbayjan, Uzbekistan, Kazakhstan, Kyrgyzstan, Turkmenistan and Tajikistan. London: Hurst & Co.
Megoran, N. (2002). The borders of eternal friendship? The politics and pain of nationalism and identity along the Uzbekistan–Kyrgyzstan Ferghana Valley boundary, 1999–2000. Unpublished doctoral thesis, Sidney Sussex College, Cambridge.
Merriam, S. (1998). Qualitative research and case study applications in education. San Francisco: Jossey-Bass.
Meyer, K. (2003). The dust of empire: The race for mastery in the Asian heartland. New York: Public Affairs.
Miles, M., & Huberman, M. (1994). Qualitative data analysis: An expanded sourcebook. Thousand Oaks, CA: Sage.
National Statistics Committee of the Kyrgyz Republic. (2008). Education and science in Kyrgyzstan.
National Statistics Committee of the Kyrgyz Republic. (2009). Living conditions of the population of the Kyrgyz Republic 2004–2008. Bishkek: Annual publication.
Niyozov, S. (2001). Understanding teaching in post-Soviet, rural, mountainous Tajikistan: Case studies of teachers' life and work. Unpublished doctoral thesis, University of Toronto.
OECD. (2007). PISA 2006: Science competencies for tomorrow's world (Vol. 1). OECD report.
Open Society Institute: Education Support Program. (2002). Education development in Kyrgyzstan, Tajikistan and Uzbekistan: Challenges and ways forward. Available at http://www.osi-edu.net/esp/events/materials/final.doc. Retrieved on May 10, 2003.
Rashid, A. (2003). Jihad: The rise of militant Islam in Central Asia. USA: Penguin.
Shamatov, D. (2010). The impact of educational reforms on quality, sustainability and empowerment. Paper presented at the World Council of Comparative Education Societies (WCCES) XIV World Congress "Bordering, Re-Bordering and New Possibilities in Education and Society," Bogazici University, Istanbul, June 13–18, 2010.
Shamatov, D. A. (2005). Beginning teachers' professional socialization in post-Soviet Kyrgyzstan: Challenges and coping strategies. Unpublished doctoral dissertation, Ontario Institute for Studies in Education of the University of Toronto.
Silova, I., & Steiner-Khamsi, G. (Eds). (2009). How NGOs react: Globalization and education reform in the Caucasus, Central Asia and Mongolia. Bloomfield, CT: Kumarian.
Soktoev, I. (1981). Formirovanie i razvitie Sovetskoi intelligentsii Kirgizstana. Frunze: Ilim.
Steiner-Khamsi, G. (2009). The impact of teacher shortage on the quality of education in Kyrgyzstan. Public lecture, University of Central Asia, September 14, 2009.
Steiner-Khamsi, G., Mossayeb, S., & Ridge, N. (2007). Curriculum and student assessment, pre-service teacher training: An assessment in Tajikistan and Kyrgyzstan. New York: Teachers College, Columbia University.

Sutherland, J. (1999). Schooling in the new Russia: Innovation and change, 1984–1995. London: Macmillan and St. Martin's Press.
Tabyshaliev, S. (Ed.) (1979). Torzhestvo idei velikogo oktiab'ria v Kirgizii. Frunze: Ilim.
The EFA (2000). Assessment – Country report: Kyrgyzstan. Available at http://www2.unesco.org/wef/countryreports/kyrgyz/contents.html. Retrieved on March 19, 2004.
UNDP. (2003). The Kyrgyz Republic. Millennium development goals: Progress report. Bishkek. Available at http://www.undp.kg/english/publications/2003/mdgpr2003.pdf. Retrieved on March 20, 2004.

FROM EQUITY OF ACCESS TO INTERNATIONAL QUALITY STANDARDS FOR CURBING CORRUPTION IN SECONDARY AND HIGHER EDUCATION AND CLOSING ACHIEVEMENT GAPS IN POST-SOVIET COUNTRIES

Mariam Orkodashvili

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 181–206
Copyright © 2010 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013010

ABSTRACT

The chapter discusses the introduction of standardized tests as they move an education system from equity of access to quality of instruction and learning. The aim of the chapter is to analyze the role of international education projects such as PISA, TIMSS, and PIRLS in shaping national education policies and in helping them to tackle issues such as limited access to education, corruption, illegal practices, quality manipulation in academia, and achievement gaps. The main method used is cross-national comparative analysis. The theoretical scope of the chapter covers major scholarly works on international testing, achievement gaps, and corruption in education. The research finds that the arrows of influence

operate in both directions, implying that while setting global standards, international projects base their judgments on identified local challenges in the education systems of individual countries. Moreover, the internationalization of standards has spillover effects on curbing corruption and the illegal actions that often widen achievement gaps. The findings of the research could be used in designing education policies at both national and international levels to make education systems more transparent and comparable to international standards. The chapter sets forth the novel idea that international projects such as PISA, TIMSS, and PIRLS could serve not only as a means of setting and checking quality standards in education, but also as mediators in closing achievement gaps and even curbing corruption. This novelty is of value to local education policymakers and, more importantly, to the public.

INTRODUCTION

The chapter discusses the introduction of standardized tests as they move the education system from equity of access to quality of instruction and learning. The chapter states that standardization brings about multiple implications, even more so when global pressures play a role in shaping national policies. Standardized testing at local and global levels affects all three major aspects of education: access, equity, and quality.
Under the conditions of cultural relativism and the spread of neoliberal ideals, it has become quite easy to manipulate rules, regulations, and quality standards in any sphere. The education system has become particularly vulnerable to such manipulation. In many countries, the education sphere, whether at the secondary or tertiary level, has often been a field of manipulation, secrecy, corruption, nepotism, and professional misconduct. These deviant behaviors in academia have negatively affected equity and access to education and lowered the quality of instruction and learning. The post-Soviet region has become particularly notorious in this respect: the unregulated decentralization, marketization, and commoditization of core public values such as education have left the region prone to corruption, illegality, and misconduct since the collapse of the Soviet Union.
Thus, the major research issue that the chapter raises is that international assessment projects such as the Third International Mathematics and Science Study (TIMSS), the Progress in International Reading Literacy Study (PIRLS), and the Program for International Student Assessment (PISA) would not only produce official data on the rankings of countries revealing

the academic standings of school students (viz., fourth and eighth graders), but would also trigger discussions on the possible causes of achievement gaps between countries, where, in the case of the post-Soviet region, corruption, illegality, and quality manipulation would be perceived as major maladies.
The chapter further argues that aligning local standards with international requirements would increase the accountability of schools and improve the academic achievement not only of fourth and eighth graders but would also enable students to prepare for university entrance examinations. It claims that if public schooling offered high-quality education to all schoolchildren irrespective of their socioeconomic status (SES), since countries would be incentivized to do so to achieve high rankings in the TIMSS, PISA, and PIRLS projects, then the need for illegal private tutorship would decrease, and so would the instances of other professional misconduct. Therefore, the main argument of the chapter is that international assessments like PISA, TIMSS, and PIRLS could have spillover effects on the achievement of all secondary and high school grades and on the academic readiness of university/college entrants as well.
The chapter describes extensively the instances of corruption in the education systems of post-Soviet countries, especially the corrupt actions connected with the process of entering higher education institutions. The rationale for depicting corrupt practices during the transition from secondary and high schools to universities and colleges is to show why the highest level of corruption takes place at this stage: because of poor public schooling in earlier grades, school graduates are not prepared for the entrance examinations and therefore resort to the help of private tutors.
Bribery and nepotism in many post-Soviet countries during the university entrance process are likewise motivated by the low preparation levels of school leavers due to poor public schooling. The chapter assumes that this transition process from school to higher education, and the illegalities connected with it, need to be depicted in detail to reveal the long-run benefits that international assessments such as PISA, TIMSS, and PIRLS could bring to public schooling, helping to eliminate those illegalities and incentivizing schools to revise and update study programs, curricula, and materials, and to enhance the quality of teachers. Therefore, the chapter starts by describing the illegalities and corruption in education and then moves on to discussing international assessments and their potential to influence public schooling positively.
A note should be made of the section that discusses the literature dealing with achievement gaps across the world and the different theories offered to

explain those gaps. This is done because the chapter offers a novel approach to this issue and claims that, in the case of post-Soviet countries, the most valid explanation for the widening of achievement gaps since the 1990s is again corruption and illegal private tutorship, which enable the rich to receive extra education and leave out the socioeconomically unprotected strata. This differentiation has become ever more vivid since the 1990s, after the collapse of the Soviet Union. Therefore, the chapter again states that international assessments would address this issue as well, since they would motivate the quality enhancement of public schooling and, hence, the closing of achievement gaps.

METHODOLOGY AND THEORETICAL GROUNDING

The chapter uses cross-country international comparative analysis as the main methodological tool. The characterization of educational systems in the contexts of transition economies and political shifts in post-Soviet countries creates an idiosyncratic background against which major educational issues are raised and discussed relative to the global educational context. The chapter also supports the idea that internationally comparable tests will enable policymakers to reveal education achievement gaps across countries, and to design and offer appropriate remedies applicable to individual countries and regions. The chapter further argues that adherence to the criteria of international standards contributes to enhanced levels of macrosatisfaction that lead to global social justice.
Therefore, cross-national comparative analysis is used as the main method of study throughout the chapter. The examples and cases are drawn from different post-Soviet countries (mainly Georgia, Kyrgyzstan, Latvia, and Russia), since they present interesting cases of undergoing global influences that help to curb corruption and bribery, and to reduce instances of quality manipulation and nepotism. The theoretical grounding is presented in the next two sections, referring to types of standardized examinations and achievement gaps, respectively.

Assessing Different Types of Standardized Examinations

Standardized examinations are considered by many policymakers to be the most effective means of measuring achievement and contributing to

educational equity. Although widely debated with regard to their efficiency in measuring educational outcomes, standardized examinations are still considered the optimal way of testing academic achievement; in any case, no better method has been offered so far. The tests select on the basis of achievement, not ascription. They are universalistic and transparent. Therefore, they increase equity of access to education. "Standardized testing simply means that the circumstances in which tests are conducted are made as similar as possible so that our evaluation of a person's achievement will not be unduly biased" (Heyneman & Lehrer, 2006). Besides, "The main function of the standardized test is to be instrumental in achieving the highest possible degree of uniformity in the marking system" (Heyneman & Fägerlind, 1988).
An achievement test measures how well a student has acquired knowledge from the secondary school syllabus. An aptitude test measures the skills and abilities of a student and is an indicator of future performance; it is a kind of prognostic test. Debates arise on the efficiency of achievement versus aptitude tests. One asset of the SAT (Scholastic Aptitude Test) is that, because it does not lean heavily on the school curriculum, teachers can be creative and experiment with study programs, whereas academic achievement tests dictate the school curriculum significantly. The SAT also fosters higher equity by revealing low-income but bright students. However, the SAT (like other similar standardized examinations) is subject to coaching, and controversies arise regarding the degree of coaching. Second, university performance is more closely predicted by academic achievement than by academic aptitude (Heyneman, 1987). Still, the SAT gives low-income students who did not have high-quality schooling an opportunity to reveal their abilities and skills. Therefore, it increases equity.
Although different countries tend to use either achievement- or aptitude-based tests, in certain circumstances a mixture of the two types of testing could be applicable. The aim of educators in this case is to consider the costs and benefits associated with each type of test (for example, aptitude-based tests with multiple-choice questions being the cheapest option), the feasibility of conducting any particular type of test, the geographic and ethnic peculiarities of a country, and other factors that might influence the successful outcome of the conducted policy. For instance, while small countries like Georgia, Kyrgyzstan, and Latvia could combine aptitude- and achievement-based tests due to their small size and the easier manageability of standardization, in large countries like Russia school-based achievement-testing selection might not always work, since there is wide geographic and ethnic disparity.

Achievement Gaps and International Standards

The present research hypothesizes that, all other factors being held constant (SES, parental education, school resources, teacher qualification, etc.), high levels of corruption, bribery, and other illegal activities in academia could account for the achievement decline that has been taking place in the educational systems of the former Soviet countries since the 1990s.
Achievement gaps across countries and their causes have been studied extensively, and major theories have been revised several times (Baker, 1993; Baker, Goesling, & Letendre, 2002; Coleman, 1968; Gamoran & Long, 2006; Heyneman & Loxley, 1983; McEwan & Marshall, 2004; Mullis, 1997; Postlethwaite, 1987; Schaub & Baker, 1991; Schmidt et al., 2001; Theisen et al., 1983; Willms & Somers, 2001). The issue of factors affecting student achievement has been researched over the past decades, yielding various results depending on the context- and time-specific characteristics of the analyzed data sample. To start with, in the late 1960s the impact of family socioeconomic status on academic achievement was quantified in the US and the UK (Coleman, 1968). In his report Equality of Educational Opportunity, Coleman found that preschool characteristics, and hence SES level, have more impact on academic achievement than school characteristics (Coleman, 1968). However, the findings underwent various modifications in later years. "Since then a host of studies have described how family SES and schooling interactively reproduce social status through the children's achievements and educational attainments" (Baker et al., 2002, p. 291). First, a counterargument to the Coleman hypothesis raised the question of the universality of the findings, and whether they would be equally true for developed and developing countries (Heyneman & Loxley, 1983).
Having researched 29 high- and low-income countries, Heyneman and Loxley found that in developing countries school effects (infrastructure, class size, materials, etc.) have a strong impact on student achievement, and the poorer a country is economically, the more powerful the school effects appear to be (Heyneman & Loxley, 1983). The authors explain that this might be related to the scarcity of supply in developing countries and, consequently, to a strong desire to learn among students. Moreover, in Western cultures low- and high-income families have different attitudes to schooling, whereas in low- and middle-income countries the desire of a parent at any SES level (be it a peasant or a government official) is to give the best possible education to their child. This explains why student social background does not play a significant role in low-income countries.

Therefore, according to the "Heyneman–Loxley effect," in developed countries background factors are more important than school effects, whereas in developing countries school effects are strong determinants of academic achievement. For instance, SES and other preschool influences at age 10 are seven times more powerful in the US than they are in India, and wealthy children do not perform better in less industrialized countries, which runs against what Coleman found in the US (Heyneman & Loxley, 1983).
Afterwards, Baker et al. (2002) replicated the Heyneman–Loxley question to test whether the HL effect still persisted 20 years later. They used TIMSS (1994–1995) cross-national data on 13-year-olds. "Thirty-six out of the 41 nations in TIMSS have all the data components required for this analysis" (Baker et al., 2002, p. 298). The measures of national economic development (GDP per capita), school achievement (mathematics and science assessment), family SES, non-school variables (student gender, age), and school resource quality were used in the study. The models used in the statistical analysis were (1) the OLS (ordinary least squares) regression model (variance explained statistically through adjusted R²), which was also used by Heyneman and Loxley (1983); and (2) the HLM (hierarchical linear model), which is now commonly applied to multilevel data such as TIMSS (Baker et al., 2002, p. 300). The results showed that "Poorer nations do not show stronger school effects, and there is some indication that, contrary to the 1970s findings, the relationship between family background and student achievement, and the amount of student achievement variance attributable to family background, are similar across nations regardless of national income" (Baker et al., 2002, p. 305).
However, the questions that were raised regarding these findings – How comparable are the nations across the 1970s and 1994 analyses? What is the overall trend in developing nations? Is the effect still present in some contexts? (Baker et al., 2002) – need further explanation, which Baker et al. offer in the following way: "Compared with the 1994 sample of nations, the 1970s sample of nations includes more poor nations from Latin America, where educational inequality tends to be high and involve widespread private schooling for elite families. The 1994 sample includes more poor nations emerging from the former Soviet bloc, where there is some evidence that educational resource inequality is relatively low and elite private schooling is rare" (Baker et al., 2002, p. 311). Therefore, it became obvious that the specific characteristics of data samples need to be considered when accounting for achievement differences cross-nationally.
Heyneman (2005) attributes the change from the earlier findings to the socioeconomic changes of the past two decades. "There is a significant difference in samples. The earlier sample of 29 countries included nine from

Latin America and one from the socialist states of Europe and Central Asia (ECA); the more recent study of 35 countries included only one from Latin America (Chile) and eight from ECA (the Russian Federation, Hungary, Latvia, Romania, Lithuania, Slovakia, the Czech Republic, and Slovenia)" (Heyneman, 2005, p. 3). Besides, Heyneman (2005) differentiates between Latin American and former socialist countries in terms of the equity with which school resources were distributed, Latin America being the most inequitable and the socialist countries the most equitable. Hence, the findings differ depending on the sample of countries studied and compared.
Later, Gamoran and Long (2006) reexamined the Coleman report findings. They concluded that the majority of the findings (schools largely segregated by race, large racial achievement gaps, and variation in achievement related more to family background) are still present in the US today. The authors also cover explanations for the variation in achievement in countries of different income levels. They suggest there is a clear threshold (approximately $16,000) below which HL effects are present and beyond which school resource variation matters very little. This could be explained by the diminishing marginal return to education in higher-income countries.
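The two analytic strategies contrasted above, an OLS measure of the achievement variance explained by family background (adjusted R²) versus a multilevel decomposition of variance into school-level and student-level components, can be illustrated on synthetic data. The sketch below is purely illustrative: the sample sizes and effect sizes are invented for demonstration and are not taken from TIMSS or from Baker et al. (2002).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sample: students nested in schools. Achievement depends on family SES
# (a student-level variable) plus a school effect and individual noise.
n_schools, n_per_school = 40, 25
school_effect = rng.normal(0, 2.0, n_schools)          # between-school variation
ses = rng.normal(0, 1.0, (n_schools, n_per_school))    # standardized family SES
noise = rng.normal(0, 4.0, (n_schools, n_per_school))
score = 50 + 3.0 * ses + school_effect[:, None] + noise

# (1) OLS: share of achievement variance explained by family SES,
# reported as adjusted R^2 (the Heyneman & Loxley, 1983, approach).
y = score.ravel()
X = np.column_stack([np.ones(y.size), ses.ravel()])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid.var() / y.var()
adj_r2 = 1 - (1 - r2) * (y.size - 1) / (y.size - X.shape[1])

# (2) Multilevel view: share of variance attributable to schools, via a
# one-way variance decomposition (intraclass correlation), in the spirit
# of the HLM analysis applied to multilevel data in Baker et al. (2002).
school_means = score.mean(axis=1)
within_var = score.var(axis=1, ddof=1).mean()
between_var = max(school_means.var(ddof=1) - within_var / n_per_school, 0.0)
icc = between_var / (between_var + within_var)

print(f"adjusted R^2 for SES: {adj_r2:.2f}")
print(f"school share of variance (ICC): {icc:.2f}")
```

With these invented parameters, SES explains roughly a third of the variance and schools a smaller share; the substantive debate reviewed above turns on how these two shares differ between richer and poorer countries.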
While the lack of teaching resources and materials, low parental encouragement, SES and educational level, wide social stratification, nonstandardized curricula, low teacher quality, and so on are usually named as major causes of the achievement gaps in different parts of the world reflected in TIMSS, PISA, and PIRLS results (Kallo, 2006; OECD, 2001, 2004, 2007; PISA, 2008; Rautalin & Alasuutari, 2009; Rinne, 2008; Rinne, Kallo, & Hokka, 2004; Schuller, 2005; Vickers, 1994; Weymann & Martens, 2005), the present research claims that the low international rankings of most post-Soviet countries in projects like PISA, TIMSS, and PIRLS can most likely be attributed to the chaotic, unregulated situation in almost all spheres, the education sphere among them, which has engendered corruption, bribery, and a deterioration of academic quality since the beginning of the 1990s.
Thus, the chapter names bribery, nepotism, and illegal private tutorship as the most widespread forms of corruption in the education systems of post-Soviet countries that could account for the widening of achievement gaps in schools and decreasing access to higher education. These illegal practices may well be the major causes of the quality deterioration in academia, at both secondary and tertiary levels, that has made the transition countries lag behind international standards. Both practices hinder equity of access to education for socioeconomically disadvantaged students and create significant achievement gaps. Therefore, the introduction of

effective education policies in the form of transparent "universalistic-type" examinations will make quality comparisons more feasible owing to the standardization of requirements across countries. In this respect, the chapter emphasizes the role of internationally comparable standardized examination projects like PISA, TIMSS, and PIRLS and their influence on education quality enhancement in post-Soviet countries. Globally comparable standards will reduce the cases of quality manipulation and will help educators identify not only such country-specific shortfalls as lack of teacher qualifications, outdated study programs, and lack of school resources, but also the existence of corruption, illegal private tutorship, and nepotism in education systems. Besides, the introduction of standardization in public schools will enable socioeconomically disadvantaged students to access quality education. Thus, the chapter moves from issues of wider access to possibilities of quality enhancement, mainly through a decrease in the level of corruption in the contexts of transition countries. The process is often referred to as shifting from increasing access to ensuring progress.

SECTION 1: CORRUPT LEGACIES AND EXAMINATIONS

The first important implication that standardized examinations brought to the education systems of many transition countries was decreased levels of bribery, nepotism, and favoritism. Especially in post-Soviet countries, the heritage of previous decades was an academia plagued by bribery, illegal tutorship, nepotism, coercion, and graft. University entrance examinations were particularly vulnerable to corruption. Only those who went to urban schools and through private tutorship, and sometimes even managed to bribe examination committee members, had a chance to prepare well and enter higher education institutions. The system was characterized by a high level of nepotism among the closed circles of "elite" classes. For instance, "one of the officials to be dismissed was Gelbakhiani, the rector of the Tbilisi Medical Institute (Georgia), who, along with the party secretary of the institute, was discovered to have tampered with entrance examinations, excluding qualified students and admitting those who paid bribes or had proper connections" (Suny, 1988, p. 307). This case was widely discussed because it provided an exemplary punishment and sent shock

waves to those public officials and university staff at other universities who were involved in illegal transactions.
Second, the lack of adequate information needed to prepare in any subject and ambiguously defined entrance requirements made the system even more complex and impenetrable for most people. Entrance examinations were a combination of oral and written exams, with each higher education institution having its own requirements regarding both the subjects and the level of difficulty of the examinations. This complex set of rules created difficulties for minorities and populations living in remote regions and rural areas, as they did not have access to preparatory programs for the required subjects.
Especially in the 1980s and at the beginning of the 1990s, until the collapse of the Soviet Union in 1991, the entrance examinations became a kind of "arms race" between secondary school curricula and entrance examination requirements, in which school curricula ended up the clear loser. The widely held public belief was that the changing of requirements was done on purpose to encourage illegal private tutorship. While entrance requirements became more stringent year after year, school curricula lagged ever further behind; it was not easy to change and adjust school curricula at the same pace as the postsecondary entrance requirements progressed. Besides, each university had its own entry requirements, making it difficult for disadvantaged groups who did not have access to the inside information of the institution. These groups were, again, low-SES students, minorities, and residents of remote regions. The "insider" information was accessible only to the elite classes and "education cartels" that developed the practice of nepotism and patronage through social connections and private tutorship.
In the 1990s, the practice of nepotism and patronage through private tutorship became a well-established trend.
Private tutors realized how good their chances were of running "private businesses" of preparatory courses and earning a "second income." As Silova, Budiene, and Bray (2006) state, illegal private tutorship is another form of corruption widely spread across the post-Soviet territory, and one widely debated regarding the definitions and boundaries of its legality and illegality (Silova et al., 2006; Bray, 1999, 2009). In post-Soviet countries, the majority of these tutors either taught at universities themselves or were members of admissions or examination committees. Naturally, their private students had a considerable advantage over other applicants in being admitted to universities.
The malfunctioning of the governing educational bodies regarding entrance examinations, the lack of transparency, and the lack of efficient mechanisms to

align secondary school curricula with entrance examination requirements created an environment conducive to corruption in the system, including nepotism, patronage, and favoritism.
In the 1990s, this trend reached a highly developed state. "Prestigious" tutors, known for their particularly intensive and "high-profile" preparation courses that were accessible only to "elite" classes, informally "defined" the main trends and directions of preparation for different departments. The tendency peaked in the late 1990s, when there were "well-established private tutors" whose students had approximately 95–99% chances of acceptance at their chosen institutions. This had a direct effect on both equity of access and quality of teaching. As already mentioned, preparing for the entrance examinations with a private tutor was a luxury. The majority of the urban population prepared intensively during their final years at high school. Not wasting extra time and entering university at an early age (in most cases 16–17 years) became a kind of educational fashion. The "private tutorship tendency" therefore put low-SES groups and ethnic minorities, as well as residents of rural areas, at a huge disadvantage: their low income and lack of access to high-quality tutors significantly limited their chances of entering higher education institutions (Orkodashvili, 2010).
To sum up this period, entrance examinations were characterized by secrecy of information, nepotism, patronage, and favoritism among closed elite circles. Hence, we have a clear example of how governmental inefficiency contributed to a significant increase in inequality of access, deprived low-SES and ethnic groups of education chances, and fostered corruption.
Thus, during the late 1990s, the unregulated and chaotic situation in post-Soviet countries lowered the quality of education and left university entrance requirements ambiguous.
As private secondary and higher education institutions mushroomed, many young students were lured by the relative ease of obtaining secondary school certificates or higher education diplomas. The new educational institutions did not have strict entrance requirements. Bribery at entrance examinations was "flourishing." There was total chaos in academia. Nobody controlled secondary school curricula, higher education study programs, or the qualifications of teachers at schools or of faculty at higher education institutions. Neither student enrollment numbers nor acceptance criteria were controlled in any way.
In later years, illegal private tutorship and its negative effects still persisted in many parts of the world. The Russian Federation provides one example. "Differentiation of the level of accessibility to higher education

mainly results from the inequality in the level of preparation of school graduates to enter a higher educational establishment being complicated by nonuniformity of universities in their requirements to entrants as well as differentiation of the volume of knowledge which graduates acquire at comprehensive school. In fact, nowadays we can see the whole system of preparation to entrance exams at higher educational establishments. This system functions on the payable basis. It includes tutoring, access courses, system of agreements between schools and universities, and bribes. Cash flows in this sphere are semitransparent. The research of SU–HSE shows that in 2002 family spending on transfer from comprehensive school to higher educational establishment are estimated as much as 0.9 billion US dollars (Gokhberg et al., 2004). Only well prepared, well informed about the terms of entrance schoolchildren can overcome this barrier of accessibility. In the opinion of the overwhelming majority of those surveyed, the transfer from comprehensive school to higher educational institution is the focal point of corruption, conflicts, material, intellectual, and emotional losses of the families. In accordance with the data of the Russian Center of Research of Public Opinion all this has become the highest point of problem" (Gudkov, Dubin, & Leonova, 2004). Therefore, the enlarged "gap" between secondary schools and higher educational institutions in Russia has led to the further spread of tutorship and "pseudotutorship," which makes bribing a routine practice (Gokhberg et al., 2004, p. 30).
A few more examples of corruption at examinations can be drawn from media reports:
1. An instance of corruption at examinations comes from www.themoscowtimes.com (June 7, 2007): "High school students are asking President Vladimir Putin to annul the results of a state exam after answers were posted on the Internet 10 hours before it was administered."
2.
"The overall amount of bribes annually given in Russia for entering universities reaches $520 million, Interfax announced, referring to a UNESCO report on corrupt practices in education, which covered 60 states of the world" (from www.kommersant.com, June 8, 2007).
Furthermore, Table 1 illustrates general public perceptions in Russia of the possibility of entering university without bribes, based on a 2004 survey. Besides, owing to the corruption, there is considerable social differentiation in higher education access. Bar chart 1 shows the situation in the sphere of education for the children of those groups that occupy a relatively high

Table 1. Accessibility to Higher Education in Russia.

Variants of Reply     Accessible (2)   Inaccessible (3)   Diversity (3–2)
Yes                   40               67                 27
No                    49               15                 34
Could not answer      10               17                 7

Note: Respondents were asked: "People say that without a bribe, a present, or a service to some people in a university administration it is impossible to enter a university. Do you agree with this?" (as a % of those surveyed, in groups divided according to the level of accessibility). Source: Gudkov et al. (2004).

[Bar chart 1. The Dynamics of Social Structure during the Period of Transition from Secondary Education to Higher Education. Children of directors and specialists, Novosibirsk Oblast; years 1963, 1983, 1994, 1998; series: % of total number of school leavers, % of those who planned to enter HEEs, % of those who managed to enter HEEs. Source: Gokhberg et al. (2004).]

position in society. The bar chart reveals the obvious tendency for children of directors and specialists to prevail in number. This prevalence is even more obvious if we look at those who managed to enter higher educational institutions in 1994 and 1998. Conversely, children of workers, peasants, and employees appear more seldom among those who are eager to enter a higher educational institution (Gokhberg et al., 2004). They are also rarely found among those who managed to enter higher educational institutions (Bar chart 2). There is no doubt that this selection

[Bar chart 2. The Dynamics of Social Structure during the Period of Transition from Secondary Education to Higher Education. Children of employees, workers, and peasants, Novosibirsk Oblast; years 1963, 1983, 1994, 1998; series: % of total number of school leavers, % of those who planned to enter HEEs, % of those who managed to enter HEEs. Source: Gokhberg et al. (2004).]

during the transition from secondary education to higher education bears a social character. As the bar charts illustrate, during the 1990s social differentiation in entering higher education institutions grew. Therefore, standardized examinations could, to a certain extent, level off this differentiation.

SECTION 2: STANDARDIZED EXAMINATIONS FIGHTING CORRUPTION

In many countries the presence of corruption prompted the initiation of reforms aimed at decreasing its level, scale, and scope. The introduction of standardized examinations could hence be a significant tool for fighting corruption. The argument of Section 2 is that, through the introduction of standardized testing, a first step can be taken toward fighting corruption and transforming an education system effectively within a short span of time. The section discusses the advantages and positive implications of standardized testing systems, and argues that countries with successful testing mechanisms could provide a good example of jump-starting education reforms for other countries facing similar problems (Orkodashvili, 2010).

Curbing Corruption in Secondary and Higher Education


Standardized exams increase the transparency and objectivity of assessment criteria. Because examination centers are autonomous from educational institutions and make requirements and criteria public, they enhance equity by giving deserving students the opportunity to demonstrate their knowledge and be admitted to a higher education institution. For instance, in Georgia a computer-assisted and anonymous test-checking system has made it practically impossible to "buy one's high grade" at the Unified National Entrance Examinations. This has given students from low-SES backgrounds and remote mountainous regions a chance to participate and succeed in the exams. Until 2004, students were able to purchase not only their university admission, but also passing grades and eventually a diploma. Individual universities administered their own admissions exams. Admissions bodies, composed of university lecturers, would sit in on oral exams and grade written papers. No independent observers were allowed to monitor the process. Previously, there were two ways to obtain a university place. The first involved students in their final year taking private classes offered by the same lecturers who sat on the admissions body at his or her chosen university. The second required the parents of a university applicant simply to bribe the admissions body before the entrance exams. In both instances students would be "fed" pre-agreed questions in the oral exam and given advance warning of the subjects (i.e. topics, exam questions) in the written exam. (Karosanidze & Christensen, in TI, 2005, pp. 36–37)

The important difference between the UNEEs and the entrance examinations of previous years is that, whereas previously each university had its own entrance requirements, the UNEEs are uniform in structure. Special examination centers have been set up in several places in the capital and in other cities in the regions of Georgia. All students have to register for the exams and sit the tests at one of those examination centers, which are assigned to them during the registration process. The tests are a combination of achievement-measuring, or curriculum-based, tests and skill/aptitude-measuring tests. In contrast, in previous years the majority of tests and exams were purely knowledge based; moreover, they were based on the knowledge that each individual university required, acquired from private tutors rather than at secondary schools. Therefore, the chances of entering higher education institutions for ethnic minorities, low-SES students, and residents of the regions have significantly increased since the introduction of the UNEEs. In addition, students can indicate up to five or seven institutions of their choice. The exam process is monitored by external invigilators and observed by surveillance cameras that are also accessible to parents and the wider public.


The multiple-choice part of the tests is checked by computers, and the essay part is assessed by anonymous examiners. All these measures have contributed to transparency and to wider access to higher education, particularly for low-SES and ethnic minority students, and have decreased the rate of bribery in the admission process.

The monitoring of the new system illustrated the degree of planning that had gone into the first sitting of the UAE, further limiting opportunities for corruption. Some examples of the new procedures were:

o Test-takers were seated randomly inside the test centre;
o Multiple versions of tests were created;
o Names of test-takers were not included on test papers;
o Identities of graders were thoroughly protected before, during and after the grading process;
o Video monitors were placed in every testing room, allowing live transmission to parents and other observers outside the centers.

The registration procedures also proved effective. Students registered in advance and received direct confirmation that included their personal testing schedule and location, and their photographic identification. The NAEC retained a copy of this information and required each student to produce registration documents and identification on the day of examination. No student was admitted without the necessary documentation. (Karosanidze & Christensen, in TI, 2005, pp. 39–40)

It is interesting to consider public perceptions of the transparency of the UNEEs and of their impact on decreasing corruption in the university admissions process. Transparency International Georgia carried out three separate surveys with a total of 973 students, 764 parents and 340 administrators across Georgia. Parents were interviewed outside the testing site while their children sat the exam inside. TI Georgia monitors interviewed test-takers as they exited the test centre. Only students who volunteered to be interviewed were included in the survey. A large majority of respondents (80% of students, 79% of parents and 96% of administrators) felt confident that the new process would eliminate corruption in university admissions. Interestingly, only 19.5% of students made use of a special information hotline that was put in place in Tbilisi. (Karosanidze & Christensen, in TI, 2005, p. 37)

Bar chart 3 illustrates the results of the survey conducted among students, parents, and administrators. Bar chart 4 presents the results of the survey that was conducted among students and parents on how understandable the process of university admissions was. The survey revealed that a high percentage of both students and parents understood the system and procedures of the newly introduced examinations.

[Bar charts 3 and 4 omitted in extraction; the survey percentages are reported in the text.]

Bar chart 3. Do You Feel Confident that the New Processes will Eliminate Corruption in University Admissions? Percentage of respondents answering "yes." Source: Karosanidze and Christensen, in TI (2005).

Bar chart 4. Do You Understand the Process of University Admissions? Source: Karosanidze and Christensen, in TI (2005).

Another example is Kyrgyzstan. In 2002, national entrance examinations were introduced there – the Obščerespublikanskoe Testirovanie, or Kyrgyz National Scholarship Test (the equivalent of the American SAT). This policy could be considered a step toward fighting corruption in the education system and hence toward widening ethnic and regional equity of access to higher education, since examination centers were set up in 90 localities


dispersed over different regions of Kyrgyzstan. The general public perception is one of better access to higher education and a more objective testing system (http://www.for.kg/goid.php?id=61000&print). The positive implications of standardized exams were thus first and foremost the creation of a uniform, "universalistic," and more transparent testing system that is understandable and accessible for ethnic minority, low-SES, and regional students.2 A further noteworthy implication of standardized examinations is that, through a continuous assessment process and the tracking of student achievement, they strengthen the view that no single level of education should be prioritized over others; rather, tests can serve as good indicators of how well different levels work together in moving students smoothly and efficiently from one level to another. In sum, the standardization of tests and examinations has created a benchmark from which governments can start reforming their education systems effectively. However, in certain cases a new education policy needs to be introduced step by step because of a country's high geographic disparity. For instance, the recently introduced Unified State Examinations (the so-called EGEs) in Russia may serve as an instrument for "smoothing" the consequences of inequity at the stage of transition from comprehensive education to higher education. This policy was introduced gradually: the exam took place in 5 regions in 2001, 16 regions in 2002, and 47 regions in 2003. The Ministry of Education proposed in 2001 a plan to reform university admissions procedures and introduce a single, nationwide standardized set of exams. On June 2 Education Minister Vladimir Filippov announced that these plans are to come to fruition by 2005. The exam will be introduced gradually, and this year school leavers will be able to apply to 15–20 different universities taking part in the experiment.
Students who have won national and municipal competitions will be able to enter the university of their choice without taking the exam. Filippov stressed that the national exam is necessary for raising the objectivity of school-leaving examinations. Previously, admissions boards from each university set their own examinations to test a student's suitability for entry to specific departments. (Rosbalt News Agency, June 3, 2003, http://www.rosbalt.ru)

In 2008, Lomonosov Moscow State University and EGE Center agreed that test results in mathematics and Russian language would be taken into consideration in the admission process (http://www.proforientator.ru, November 4, 2008). In 2009, Unified State Examinations (EGEs) were administered all over Russia in all higher education institutions. However, the agreement between


MSU and the EGE Center provoked controversy among students from the regions who intended to apply to different higher education institutions and perceived the agreement as inequitable. Likewise, skepticism and controversy accompanied the introduction of unified testing in Ukraine (USAID Report, July–September, 2007). Therefore, step-by-step introduction of a new policy seems most reasonable in large, highly diverse countries like the Russian Federation and Ukraine. Another important consideration is that new education policies may sometimes encounter difficulties not because they are wrong but because of their transitory shock effect on society. For instance, the situation appeared somewhat complex in Latvia: "The Centre for Curriculum Development and Examination introduced new school standards and curricula and started to implement the system of centralized examination. However, the main problems occurred in the phase of implementation – many new democratic changes that aimed to strengthen achievements of the first stage were introduced undemocratically. Public opinion was neglected, the society was not taking part in the decisionmaking process, and the situations were not analyzed and estimated. At this stage, the society was rising against new steps in the reform, many of which were progressive. Most debated were issues of centralized examination, bilingual education for minority schools in Latvia, and introduction of a study loan system. By not being able to change the strategy of the decision-making process, MOES was forced to slow the pace of the reform" (Dedze & Catlaks, 2001, p. 156).

SECTION 3: MOVING FROM NATIONAL EQUITY TO INTERNATIONAL QUALITY

Uncertainties in transition countries have in many cases caused moral confusion. Standardized examinations serve to decrease the levels of uncertainty and confusion and to introduce more clarity into education systems across countries. They also help us to see our universal human nature through its variable manifestations and then to craft political institutions that build on this insight. The questions that arise when moving from the national to the international level are the following: how do we use metadata to change national practices? How could we contextualize the TIMSS, PIRLS, and PISA data (http://timss.bc.edu)?


This section discusses the role of international standardization projects in helping developing countries raise the quality of teaching, curriculum, and assessment. As already mentioned, standardized examinations seem to provide good evidence for linking inputs with outputs and for measuring achievement accurately. Aligning local standards with international requirements is a further positive implication of standardized examinations. The Organization for Economic Cooperation and Development administered PISA, the results of which were published in 2002. In addition, "Cross-national surveys of student achievement, and in particular the Third International Mathematics and Science Study (TIMSS), of which the data were released in 1996 and 1999, had a considerable influence on policy makers. In the domain of mathematics, for example, Dossey and Lindquist (2002) reported that TIMSS data were an important reference for curriculum reform" (Bray, Adamson, & Mason, 2007, p. 22). Higher TIMSS results in other countries can provide important feedback for teachers, practitioners, and educators on possible shortfalls in the qualification and preparation of mathematics teachers at secondary schools in countries with lower TIMSS scores. In addition to enhancing students' chances at higher education entrance examinations, this feedback will naturally prompt the improvement of teacher training and preparation courses and curriculum revision.
Hence, high-quality instruction at schools will naturally contribute to raising the level of trust among parents and wider public in education institutions. This will consequently lead to tighter social cohesion associated with economic growth and increased prosperity, which will in no way be a local process but a wider scale international movement toward quality enhancement and global social cohesion. ‘‘Recent research in the United States shows that the quality of schooling relates to real differences in earnings and attainment y When scores are standardized, they suggest that a 1-standard-deviation increase in mathematics performance at the end of high school translates into 12 percent higher annual earnings y Although the research is less extensive, similar or larger magnitudes of earnings improvement have been found in other countries’’ (Hanushek & Raymond, 2006, p. 52). This is a clear


indication of a tendency toward global economic development through the international alignment of education standards. Therefore, the chapter assumes that implementing the TIMSS, PIRLS, and PISA projects across countries might be the most cost-efficient strategy for fighting corruption and achieving higher transparency, objectivity, and quality standards in academia in several countries simultaneously and with consolidated efforts. In addition to raising academic standards, international standardized examinations also bear a certain global social responsibility and the power to spread universal ethical values. For instance, the role of PISA, TIMSS, and PIRLS can be significant in changing the attitudes and perceptions of populations in different countries. This would give a rather interesting insight into the relationship between global educational policies and their influence on universal human and social dynamics, yielding significant data for developing future multinational businesses and international relations by considering universal social and ethical values. Moreover, as already discussed in previous sections, international education projects make it possible to identify and account for academic achievement gaps between countries and regions, a significant step toward diagnosing malfunctioning education systems and policies in individual countries and prescribing suitable remedies.

CONCLUSIONS AND POLICY RECOMMENDATIONS

Educational projects of a global scale are a major force in shaping education worldwide. The chapter states that the arrows of influence move in both directions, meaning that the international projects, while designing international standards and requirements, base their judgments and criteria first and foremost on national experiences, shortcomings, and country-specific needs. The chapter assumes that the internationalization of standards and requirements makes cross-country comparisons possible. Moreover, as discussed and shown throughout the chapter, the internationalization of standards has spillover effects on curbing corruption and the illegal actions that are prevalent in post-Soviet countries and that often widen achievement gaps both within and between countries. Standardized examinations first make


the criteria and requirements more explicit and available to the public at national and international levels, then trigger quality enhancement strategies, and ultimately reduce the degree of illegal practices. Standardized examinations can serve as effective tools for expanding equity of access to education, fighting corruption, raising the standards and quality of teaching, curricula, and assessment, and contributing to social cohesion. It is essential for educators and policymakers to constantly standardize and publicize the requirements of examinations in order to widen equity of access to education and to adjust curriculum and test levels (predictive validity calculations could serve as one good example in this case). Educators should clearly demonstrate to the wider public the benefits of standardized exams in fighting corruption and hence in opening better education and career chances for minorities and low-SES students. This has been particularly true for post-Soviet countries. Aligning local standards with international standardized examination projects like PISA, TIMSS, and PIRLS will enable countries to prepare secondary school students for better international education and labor opportunities. This will move the education system of any country from expanding equity toward enhancing quality. Therefore, the effective implementation of standardized examinations will have multiple beneficial consequences. Effectively conducted standardized tests and examinations could serve as a good starting point for the revitalization of a country's whole education system. They will enable countries to close achievement gaps and enhance the quality of education by curbing corruption and reducing the need for illegal private tutorship, which often disadvantages low-SES students. Thus, countries with successful standardized examination procedures could jump-start education reforms and provide a good example to other countries that wish to pursue similar goals.
When introducing new educational policies, the sociopolitical, geographic, and ethnic peculiarities of a country should be taken into consideration alongside universal, readily available strategies. Before and after the introduction of any major policy, educators should conduct cross-country comparisons to understand the results of the policy and to inform future improvements. Under conditions of cultural relativism, the standardization and comparability of education policy outcomes are particularly important for understanding the causes of the gaps and differences that make international comparisons complex.


It should be borne in mind that the mismatch between the authorities' goal and responsibility to educate everyone and the scarcity of resources is the primary challenge to tackle in the future. Therefore, the factors to be considered in administering examinations are feasibility, budget, and context, which inevitably influence the efficiency and effectiveness of conducting the examinations. Every policy or innovation to be carried out should take into account the available budget, the feasibility of the idea, and the socioeconomic, political, or cultural context in which the policy is going to operate. The ultimate goal, therefore, is achieving high-quality, accessible, and equitable education systems through projects and programs that are scientifically sound, empirically informed, and practically applicable.

NOTES

1. As already mentioned, the ambiguity in the definition of standards, requirements, and, generally, of quality is a strong tool for manipulation in the hands of university staff and hence induces the spread of corruption.

2. In addition to enhancing equity of access, standardized tests have given policymakers the opportunity to conduct various measurement calculations more accurately in order to identify shortcomings and plan future improvements of tests. For instance, the introduction of the Unified National Entrance Examinations in Georgia has facilitated the calculation of essential results regarding predictive validity via the correlation correction for range restriction, which helps determine the relationship between entrant achievements (during entrance examinations) and student achievements and persistence (in their later years of studies). These calculations bear significant importance for measuring test quality, the predictability of future success, and assessment criteria. The formula for the corrected (squared) validity coefficient is

r²XU = r²xu · s²X / [s²x + r²xu · (s²X − s²x)],

where rXU is the correlation between X and U in the nonselected population; rxu is the correlation between x and u in the selected population; and s²X and s²x are the variances for the nonselected and selected populations, respectively. The calculation results revealed the highest predictive validity for foreign language tests, r = 0.55, and the lowest for mathematics and the so-called skills tests (similar to the SAT), r = 0.4. (Source: Ministry of Education of Georgia, http://www.mes.gov.ge) The results will be helpful for educators involved in curriculum design either to make appropriate adjustments to the tests and curriculum or to retrain school teachers for Unified National Entrance Examination requirements.
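The range-restriction correction described in note 2 can be sketched numerically. The function below simply implements the formula; the input figures are hypothetical illustrations, not the ministry's actual variances.

```python
import math

def correct_for_range_restriction(r_selected, var_full, var_selected):
    """Estimate the unrestricted-population validity r_XU from the
    correlation observed in the selected (admitted) group.

    r_selected   -- correlation observed among admitted students (r_xu)
    var_full     -- test-score variance in the full applicant pool (s2_X)
    var_selected -- test-score variance among admitted students (s2_x)
    """
    r2 = r_selected ** 2
    # r2_XU = r2_xu * s2_X / [s2_x + r2_xu * (s2_X - s2_x)]
    r2_corrected = (r2 * var_full) / (var_selected + r2 * (var_full - var_selected))
    return math.sqrt(r2_corrected)

# Hypothetical example: r = 0.40 among admitted students, with the
# applicant pool's score variance twice that of the admitted group.
print(round(correct_for_range_restriction(0.40, 2.0, 1.0), 3))  # 0.525
```

As expected, when the selected and unselected variances are equal, the correction leaves the observed correlation unchanged; the larger the variance lost to selection, the larger the upward correction.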

REFERENCES Baker, D. (1993). Compared to Japan, the U.S. is a low achieveryreally. Educational Researcher, 22(3), 18–20.


Baker, D. P., Goesling, B., & Letendre, G. K. (2002). Socioeconomic status, school quality, and national economic development: A cross-national analysis of the ‘‘Heyneman–Loxley Effect’’ on mathematics and science achievement. Comparative Education Review, 46(3), 291–312. Bray, M. (1999). The shadow education system: Private tutoring and its implications for planners. Paris: UNESCO: International Institute for Educational Planning. Bray, M. (2009). Confronting the shadow education system. What government policies for what private tutoring? Paris: UNESCO: International Institute for Educational Planning. Bray, M., Adamson, B., & Mason, M. (Eds). (2007). Comparative education research: Approaches and methods. Hong Kong: Comparative Education Research Center, University of Hong Kong, Springer. Coleman, J. S. (1968). The equality of educational opportunity report. Washington, DC: U.S. Printing Office. Dedze, I., & Catlaks, G. (2001). Who makes education policy in Latvia? Peabody Journal of Education, 76(3 and 4), 153–158 (Lawrence Erlbaum Associates). Gamoran, A., & Long, D. (2006). ASA Meeting Paper, Montreal, Canada. Gokhberg, L., Gaslikova, I., Kovaleva, N., Larionova, M., et al. (2004). Country Analytical Report. Equity in Education Thematic Review. Russian Federation. The State University – Higher School of Economics, SU-HSE. Gudkov, L., Dubin, B., & Leonova, A. (2004). Education in Russia: Attractiveness, availability, functions. Vestnik of the Public Opinion, 1(69), 35–55. Hanushek, E. A., & Raymond, M. E. (2006). School accountability and student performance. Federal Reserve Bank of St. Louis Regional Economic Development, 2(1), 51–61. Heyneman, S. P. (1987). Uses of examinations in developing countries: Selection, research, and education sector management. International Journal of Education Development, 7(4), 251–263. Heyneman, S. P. (2005). Student background and student achievement: What is the right question? 
American Journal of Education, 112(November), 1–9. Heyneman, S. P., & Fägerlind, I. (Eds). (1988). University examinations and standardized testing. Principles, experience and policy options. World Bank technical paper no. 78. Proceedings of a seminar on the uses of standardised tests and selection examinations, Washington, DC (Beijing, China, 1985). Heyneman, S. P., & Lehrer, R. (2006). Should standardized tests be used to assess the progress of NCLB? Peabody Reflector (Fall 2006), Peabody College, Vanderbilt, pp. 12–13. Heyneman, S. P., & Loxley, W. A. (1983). The effect of primary-school quality on academic achievement across twenty-nine high- and low-income countries. American Journal of Sociology, 88, 1162–1194. Kallo, J. (2006). Soft governance and hard values: A review of OECD operational processes within education policy and relations with member states. In: J. Kallo & R. Rinne (Eds), Supranational regimes and national education policies: Encountering challenge (pp. 261–297). Helsinki: Finnish Educational Research Association. Karosanidze, T., & Christensen, C. (2005). A new beginning for Georgia's university admissions. In: Stealing the future. Corruption in the classroom. Ten real life experiences. Berlin, Germany: Transparency International. McEwan, P. J., & Marshall, J. H. (2004). Why does academic achievement vary across countries? Evidence from Cuba and Mexico. Education Economics, 12(3), 205–217 (Routledge: Taylor and Francis).


Mullis, I. V. S. (1997). Mathematics achievement in the primary school years: IEA's Third International Mathematics and Science Study. Chestnut Hill, MA: Boston College. OECD. (2001). Knowledge and skills for life: First results from PISA 2000. Paris: OECD. OECD. (2004). Learning for tomorrow's world: First results from PISA 2003. Paris: OECD. OECD. (2007). PISA 2006 science competences for tomorrow's world. Paris: OECD. Orkodashvili, M. (2010). Higher education reforms in the fight against corruption in Georgia. Demokratizatsiya: The Journal of Post-Soviet Democratization, 18(4), forthcoming. PISA. (2008). OECD Programme for International Student Assessment (http://www.pisa.oecd.org). Paris: OECD. Postlethwaite, T. N. (1987). Comparative education achievement research: Can it be improved? Comparative Education Review, 31(1), 150–158. Rautalin, M., & Alasuutari, P. (2009). The uses of the national PISA results by Finnish officials in central government. Journal of Education Policy, 24(5), 537–554 (Routledge: Taylor & Francis Group). Rinne, R. (2008). The growing supranational impacts of the OECD and the EU on national education policies, and the case of Finland. Policy Futures in Education, 6, 665–680. Rinne, R., Kallo, J., & Hokka, S. (2004). Too eager to comply? OECD education policies and the Finnish response. European Educational Research Journal, 3, 454–484. Schaub, M., & Baker, D. P. (1991). Solving the math problem: Exploring mathematics achievement in Japanese and American middle grades. American Journal of Education, 99(4), 623–642. Schmidt, W. H., Wang, H. C., & McKnight, C. C., et al. (2001). Why schools matter: Using TIMSS to investigate curriculum and learning. Unpublished manuscript. Schuller, T. (2005). Constructing international policy research: The role of CERI/OECD. European Educational Research Journal, 4, 170–179. Silova, I., Budiene, V., & Bray, M. (Eds). (2006). Education in a hidden marketplace: Monitoring of private tutoring. New York: Open Society Institute. Suny, R. G. (1988). The making of the Georgian nation. Bloomington and Indianapolis: Indiana University Press (in association with Hoover Institution Press, Stanford University, Stanford, CA). Theisen, G., Achola, P. P. W., & Musa Boakari, F. (1983). The underachievement of cross-national studies of achievement. Comparative Education Review, 27(1), 46–68. USAID. (October 15, 2007). The Ukrainian standardized external testing initiative (USETI). Quarterly report no. 2, July–September, 2007. American Institutes for Research. USAID. Vickers, M. (1994). Cross-national exchange, the OECD, and Australian education policy. Knowledge and Policy, 7, 25–47. Weymann, A., & Martens, K. (2005). Bildungspolitik durch internationale Organisationen: Entwicklung, Strategien und Bedeutung der OECD. Österreichische Zeitschrift für Soziologie, 30, 68–86. Willms, J. D., & Somers, M. A. (2001). Family, classroom and school effects on children's educational outcomes in Latin America. School Effectiveness and School Improvement, 12(4), 409–445.


WEB SOURCES ON INDIVIDUAL COUNTRIES Georgia http://www.naec.ge http://www.statistics.gec http://www.eppm.org.ge http://www.education-info.ge http://dir.geres.ge/ge/Education-and-Reference

Kyrgyzstan http://www.akipress.kg http://www.bibl.u-szeged.hu/oseas/kyrgyz_structure.html http://bc.edu/bc_org/avp/soe/cihe/ihec/regions/Uzbekistan_Central_Asia.pdf http://www.moik.gov.kg http://www.govservices.kg http://www.open.kg

Latvia http://isec.gov.lv/en/index.shtml http://isec.gov.lv/en/links.shtml http://www.li.lv http://www.izm.gov.lv http://izm.izm.gov.lv/ministry/currently.html

Russia http://www.informika.ru http://www.rosbalt.ru http://www.proforientator.ru/ucheba/ege_msu.htm http://ege.ru/ (Unified State Examinations Centre of Russia) http://www.eurydice.org

A COMPARATIVE ANALYSIS OF DISCOURSES ON EQUITY IN EDUCATION IN THE OECD AND NORWAY

Cecilie Rønning Haugen

The Impact of International Achievement Studies on National Education Policymaking. International Perspectives on Education and Society, Volume 13, 207–238. Copyright © 2010 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013011

ABSTRACT

This chapter undertakes a comparative analysis of discourses on equity found in OECD and Norwegian policy documents. This is an interesting area to study, as the OECD is found to be an important agenda setter for many countries' educational policies. A comparative analysis of OECD and Norwegian educational policies is especially interesting because the OECD is often found to be pressing for a neo-liberal agenda, while Norway has a socialist-alliance government. Combining Basil Bernstein's theoretical framework with key principles from Critical Discourse Analysis, the author investigates power relations within OECD and Norwegian educational policy documents. Two equity models serve as analytical tools: equity through equality and equity through diversity, which can be described along three dimensions: de-/centralization, de-/standardization, and de-/specialization. Using the analysis of two key documents on equity in education from the OECD and Norway, the author points out the similarities and differences between the two documents. Both the OECD and Norwegian approaches to equity in education can be


related to a centralized decentralization or a conservative modernization of education. However, there are also important differences between the two documents. For example, the Norwegian ministry places more emphasis on equity through equality and is less influenced by neo-liberalism and authoritarian populism than the OECD. In conclusion, the author argues that neither of the two described approaches appears to remedy the inequities in education. A different way of targeting these inequities could be based on critical theory and research.

INTRODUCTION

In this "age of accountability" (Hopmann, 2007), educational systems are being pressed very hard to prove that their activity is efficient and maintains a high quality. One major instrument used to assess many nations' education systems is the OECD's PISA evaluation. Numerous reports use PISA data to assess "the quality of school structures and schooling, issues of social inequality, gender, migration, etc." (Hopmann, 2007, p. 389). However, both the quality of the PISA investigation and the way its data are used to shape national educational policies have been strongly criticized (see, e.g., Hopmann, Brinek, & Retzl, 2007; Haugen, 2009). Moreover, nations differ considerably in how they respond to the recommendations made by the OECD (Hopmann, 2007), from all but ignoring them to using them as an important source in their national educational policymaking. Hopmann (2007) claims that this difference in compliance tends to vary according to whether and how the nations have implemented national accountability tools. Educational policies are based on approaches to education and knowledge in which the interests and voices of some are included while others are excluded. "Discourses are about what can be said, and thought, but also about who can speak, when, where and with what authority" (Ball, 1994, p. 21). Clarke and Newman (1997, p. xiii) point to the importance of language in this context:

language can be appropriated by different groups for different purposes: it forms a distinct terrain of political contestation. This terrain is of critical importance because of its place in the struggle for legitimacy. The success of any change project, whether initiated by government legislation or by organizational managers, depends on the success of its claims for legitimacy and its ability to win the hearts and minds of those on whom it impacts.


When addressing educational reforms, one key rhetorical element is the issue of equity. Both the OECD and Norway pay specific attention to this issue in their educational policies, and it receives special treatment in the PISA investigation and reports. As the concept is of basic importance in a democracy, it speaks to the hearts and minds of the general population. This means that one's ideological and political standpoint will influence how equity is interpreted. In other words, the discourse on equity is never neutral; rather, it consists of power relationships in which some groups may be favoured over others (see Ball, 1994). Proposals for improving equity have to be integrated within the various social groups' political interests, which may be advanced in the name of equity. As Odora Hoppers (2008) states: "social justice has still been largely defined by whatever the strong decided. Social justice is therefore both a philosophical problem and an important issue in politics" (p. 10). As the concept of equity is of such importance in the political arena, I find it important to question and reveal how political interests may be hidden when establishing the basis for current educational policies and practices of equity. In other words, I will investigate how the concept may serve a rhetorical function by relating current discourses on equity to the interests of various social groups. Globally, it is acknowledged that socio-economic background and ethnicity are important factors affecting pupils' success or lack of success in school. Good intentions notwithstanding, how the equity goal is to be met is not a given, and there are various ways of understanding the concept depending on one's ideological stance (Solstad, 1997).

OBJECTIVES

In this chapter, my aim is to analyze how the Norwegian socialist-alliance government responds to policies proposed by the OECD on how to improve equity in education. Analyzing Norwegian policies in light of OECD policies is especially interesting for the following reasons:

1. Norway has a weak tradition of using national assessments and evaluations, and focuses much attention on the PISA investigation and reports from the OECD (Hopmann, 2007).
2. The OECD is often found to be pushing for a neo-liberal agenda (Haugen, 2010a, 2010b; Eide, 1995), whereas in Norway, the Socialist
Left Party is in power (in coalition with the Labour Party, which received the most votes, and the Centre Party), and the current Minister of Education is from the Socialist Left Party. Therefore, disputes could be expected to arise between the OECD and the Norwegian government as to which policies would best improve equity.

THEORETICAL FRAMEWORK

Theoretical Framework: Basil Bernstein's Code Theory

Bernstein's theories examine how the dominant classes use formal education to spread power and control in society. This is done by classifying the content and framing the interactions in particular ways (Bernstein, 1977). The principles of this classification and framing are derived from the meaning structure of the dominant classes and applied to all children in the educational system. This means that the power and control of the preferred principles will be centred in specific social classes, or sections of these classes, and will then be applied to all other classes. These concepts, while useful in the study of specific transmission in the classroom, also help us to understand the organization of the school and the relationships between macro-actors in society (Haavelsrud, 2009). The preferred principles were seen as fundamental in maintaining the domination of middle-class values over working-class values. The theories can be studied in the context of power and control at all levels of analysis, including the policy level, where certain principles are included and others excluded. The analysis here will be at this level, examining the OECD's equity report on Norway and the Norwegian white paper on equity. The aim of the analysis of the reports is to describe how the recommendations that are made relate to content and pedagogy, and thus to models of equity.

Tools of Analysis: Classification and Framing

When analyzing the equity models, the terms classification and framing will be employed. "Classification" refers to power relations and the transmission of power. The stronger the classification, the more isolated the categories are from each other and the less contact there is between them, as for instance between the school system and the production system (Haavelsrud, 2009).
If the classification is weak, the two categories interact and take each other into account. The recommendations the two documents make for improving equity will be analyzed in relation to the following four classification elements:

1. Extra-discourse relations of education. Educational discourse may be strongly or weakly insulated from non-educational discourse.
2. Intra-discourse relations of education. Organizational contexts: (a) Insulation between agents and insulation between discourses. Agents and discourses are specialized to departments which are strongly insulated from each other. (b) Insulation between discourses but not agents. Here agents and discourses are not specialized to departments but share a common organizational context.
3. Transmission context. Educational discourses within and/or between vocational and academic contexts may be strongly or weakly insulated from each other.
4. System context. Education may be wholly subordinate to the agencies of the State, or it may be accorded a relatively autonomous space with respect to discursive areas and practices (Bernstein, 1990, p. 27).

The concept of "framing" describes relations of control. This makes it a key element in the study of the relations between, for example, pupil and teacher, home and school, or work and school. In the case of teacher–pupil relations when communicating on a particular subject, important choices are made about the following (Bernstein, 2000, p. 12):

– the selection of what is to be communicated;
– the sequencing of what is to be communicated (what comes first, what comes next);
– the pacing (the expected rate of acquisition);
– the criteria for evaluation; and
– the control over the social base which makes the communication possible.

Framing can be described as "internal" or "external." Internal framing occurs when the pupil has influence over the teaching.
In the case of participatory and problem-centred teaching/learning, the framing would be weak (−F), as the teacher may be open to pupil preferences. In the case of more traditional approaches, the framing may be strong (+F), as the pupil's interests and choices are not taken into account. External framing "refers to the controls on communication outside that pedagogic practice entering the pedagogic practice" (Bernstein, 2000, p. 14),
for example, the degree to which the state exercises control over pedagogic practice through regulations. The classification and framing analytical tools can help when analyzing equity models. In the following, I will describe how they can be connected to the two models "equity through equality" and "equity through diversity."

Equity through Equality and Equity through Diversity

Historically, different concepts of equity have underpinned educational policies (Hernes, 1974; Solstad, 1997; Aasen, 2007). Solstad (1997) distinguishes between two models of equity using the concepts "equity through equality" and "equity through diversity." When I apply tools from Bernstein to Solstad's models, I choose his perspectives of power and control. Unlike Solstad (1997), who claims that an equality model will likely reproduce power whereas a diversity model likely will not, I apply a conflict perspective to both models, suggesting that both will most likely reproduce power relations, although the relations might differ (cf. the "old" versus "new" middle class, described later; Bernstein, 1977). In this approach, different orientations to equity are first of all seen as an ideological conflict, where both models reveal the interests of dominant social groups. Solstad describes the two models in terms of three dimensions: de-/centralization, de-/standardization and de-/specialization. The classification and framing characteristics can be used to describe these dimensions in the following way: The first dimension, de-/centralization, can be described by the classification of the system context and the external framing: In an equality model, the classification of the system context is weak (−C), as there is a vertical dependency on the state and a centralized, pre-specified school. A stronger classification (+C), and thereby more autonomy from the state, is present in a diversity model, as the municipal level or the school is more responsible for how the school is run. The state will likely have more control (+F) in an equality model than in a diversity model, where the teacher tends to have more autonomy (−F).
The second dimension, de-/standardization, can be described in terms of the classification of the intra-discourse relations of education, the transmission context and the internal framing. The intra-discourse relations of education in an equality model are also characterized by strong classification (+C), as there is likely a strong subject orientation and strong teacher autonomy, with a low degree of cooperation. In a diversity model, the
intra-discourse relations are likely more weakly classified (−C), as the pupils work on broader topics and cross subject boundaries. The transmission context in an equality model is also characterized by strong classification (+C), as there is limited and incidental cooperation/interaction with external school agencies. In a diversity model, there is more focus on professional competence, extensive cooperation and negotiation with agents at the local community and municipal levels (−C). For internal framing, the equality model is most likely strongly framed (+F), given how little influence the pupil has over the teaching. For example, all pupils probably are told to use the same work methods. They have the same syllabi, face the same demands and receive the same instructional teaching. Teacher autonomy is likely weak, as the focus is on ready-made knowledge presented in national textbooks (+F). The internal framing in a diversity model is likely weaker (−F), as there is more focus on the pupils' interests, needs or qualifications, and more collective work. Finally, the third dimension, de-/specialization, can be described by the classification of the extra-discourse relations of education. In an equity-through-equality model, the classification of extra-discourse relations of education (e.g., how school relates to everyday or local contexts) is typically strong (+C), as the content is likely ready-made, while in a diversity model, the extra-discourse relations of education are more likely to focus on everyday knowledge or local problems (−C). To summarize: an equality model is characterized by strong classification of extra-discourse relations (+C), intra-discourse relations (+C) and transmission context (+C), but weak classification of the system context (−C). Both internal and external framing are likely strong (+F). The opposite is true in a diversity model (Table 1).
The different models referred to here can be described as an ideological conflict between the "old middle class" and the "new middle class" (Bernstein, 1977), where the old relates to a collection code and visible pedagogy, and the new relates to an integration code and an invisible pedagogy. Bernstein (1977) describes two kinds of knowledge through the concepts "collection code" and "integration code." Whereas a collection code typically addresses the reproduction of content, evaluating the "state of knowledge," the integration code emphasizes that pupils should have insight into principles and processes rather than facts, a focus on "ways of knowing." Furthermore, Bernstein differentiates between two kinds of pedagogic orientations: "visible" and "invisible" pedagogy. With visible pedagogy, the aim is to transfer specific knowledge, and thus there are clear criteria for
Table 1. Classification and Framing Characteristics of an "Equity-through-Equality Model" and an "Equity-through-Diversity Model".

Equality Model                                    Diversity Model

Centralization                                    Decentralization
  System context: −C                                System context: +C
  External framing: +F                              External framing: −F

Standardization                                   Destandardization
  Intra-discourse relations of education: +C        Intra-discourse relations of education: −C
  Transmission context: +C                          Transmission context: −C
  Internal framing: +F                              Internal framing: −F

Specialization                                    Despecialization
  Extra-discourse relations of education: +C        Extra-discourse relations of education: −C

evaluation of the results. The evaluation emphasizes the result rather than the process of learning (Bernstein, 1977). With invisible pedagogy, the process is in focus because the teacher has produced the context for the pupil to explore; there is less emphasis on the transference of specific skills, and consequently the criteria for evaluation are multiple and diffuse, and not so easily measured (Bernstein, 1977). When attempting to analyze how the OECD and the Norwegian socialist-alliance government address equity in education, I will focus on how their policies may relate differently to the old/new middle class by using two equity models that can be described by means of Bernstein's knowledge and pedagogic orientations. The problem here is that policies are always the result of compromises (Apple, 2006). This means that it is unlikely that we will find only one of the two equity models, or only one group's interests, in the documents. To supplement Bernstein, I find the concept of conservative modernization (cf. Apple, 2006) useful, as it refers to a combination of interests where emphasis is placed on traditional values and knowledge (neo-conservatism), market control through individual choice (neo-liberalism), strong moral authority (authoritarian populism1), and quality improvement through testing (the new managerial middle class).

MODES OF INQUIRY

Bernstein's theoretical framework will be combined with key principles from Critical Discourse Analysis (CDA) (Fairclough & Wodak, 1997). CDA
investigates the relationship between discursive practice and social and cultural developments in various social contexts (Jørgensen & Phillips, 1999). It focuses on linguistic aspects of social processes and problems, where the key claim is that "major social and political processes and movements … have a partly linguistic-discursive character" (Fairclough & Wodak, 1997, p. 271). In this context, my aim is to examine how equity is addressed by the OECD and Norway. By searching for meaning (cf. Svennevig, 2009), I will analyze the ideological foundation of equity to critically investigate whose interests are present in current equity orientations as expressed in the two documents. It should be noted that within CDA it is hard to separate theory from the methods used (cf. Jørgensen & Phillips, 1999); in other words, the theoretical approach forms part of the methodology. Combining Basil Bernstein's theories with CDA has proven to be an insightful theoretical approach (cf. Chouliaraki & Fairclough, 1999). Through an analysis of the order of discourse (cf. Chouliaraki & Fairclough, 1999), OECD and Norwegian policies on how to improve equity in education will be analyzed to see how they relate to an old middle-class discourse, the already mentioned equity-through-equality model, or a new middle-class discourse, described as an equity-through-diversity model (cf. Solstad, 1997). The data material and how it relates to the two equity models will be presented as networks (cf. Bliss, Monk, & Ogborn, 1983) in which the initiatives are visualized and counted. When analyzing the results, one key focus will be on the contradictions within the policies, that is, how they may pull in opposite directions, and on how the OECD and the Norwegian socialist-alliance government may differ in their recommendations for improving equity in education.

EVIDENCE SOURCES

The comparative analysis of equity discourses in the OECD and Norway will be based on two documents. The two selected documents are particularly interesting here as they both have equity in education as their main focus, and the Norwegian white paper was written in direct response to the OECD report, as is evidenced by its many references to the report. Furthermore, the PISA assessments have a high degree of relevance in this context; the OECD report uses PISA data as important background material for legitimizing the policies it addresses, whereas the Norwegian
socialist-alliance government uses the OECD report and PISA data to legitimize the policies in the white paper. The white paper refers to the OECD report 72 times and to the PISA investigation 21 times, making them important sources for the policies on equity. The first document to be analyzed comprises 22 recommendations on how to improve equity in Norway:

1. Mortimore, Field, and Pont (2004). Equity in Education. Thematic Review. Norway. Country Note. OECD.

The 22 recommendations are numbered in the OECD report, where recommendations 1–10 "build on strengths," while recommendations 11–22 "address shortcomings." The second document to be analyzed is a white paper from the Norwegian Ministry of Education and Research (2006/2007):

2. Report to the Storting (White Paper No. 16) … and Where No One Was Left Behind. Early Intervention for Lifelong Learning.

The data material for this chapter comes from the initiatives highlighted in Chapter 6 of the white paper: "Strategy areas and initiatives." Altogether, 99 initiatives are analyzed. These initiatives are written as bullet points in the white paper. When I present them in the analysis, I have numbered them from 1 to 99; that is to say, each initiative for improving equity is represented by a number.

DESCRIPTION OF THE NETWORKS

Each initiative is represented on the right side of the network map by a number (1–22 from the OECD report and 1–99, chronologically as presented, in the Norwegian white paper). To build a bridge between the raw data (represented as a number in the map) and the analytical tools (de-/centralization, de-/standardization, and de-/specialization), categories describing the content of the data material in a more detailed way are placed in the middle. Owing to limitations of space, I have not been able to present the raw data in the map. However, I will expand on what I consider to be the essence of each initiative by representing each one through quotations in the text. It should be noted that some of the initiatives are coded more than once, as they can be related to different dimensions of the equity models.


I employ a quantitative approach to the analysis of the content, demonstrating how many initiatives are related to each category. In this way, I can show what may be considered important tendencies in the Norwegian white paper (although it is impossible to weigh each recommendation individually). Quantitatively, however, the data material from the OECD report counts for less, as only 22 recommendations are provided. As all policies carry contradictions because they are always the result of compromise (Apple, 2006), I find it important to visualize these contradictions and discuss which initiatives might be more powerful than others. This can be visualized clearly through the networks. The networks also illustrate the lack of initiatives related to certain categories, leaving some categories empty and thereby demonstrating what is not chosen as the most effective way to improve equity in school. Before analyzing the two documents, I will present how the Norwegian education system is described in the OECD report, to contextualize the recommendations it makes.
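The coding-and-counting step behind the network maps can be sketched in a few lines. The snippet below is a hypothetical illustration, not the author's actual procedure or tooling; the category names and initiative numbers are taken from the Norwegian codings later reported in Table 2 (a subset of the categories, for brevity).

```python
# Illustrative sketch of tallying coded initiatives per category.
# Each white-paper initiative number is coded to one or more categories;
# the network maps then visualize and count the codings per category.
codings = {
    "rules/frames/legal": [17, 18, 35, 36, 40, 48, 52, 54, 76, 74, 77,
                           78, 82, 90, 97, 98, 28, 44, 87, 61, 67],
    "goal": [21, 26, 29, 24, 56, 58],
    "value": [15, 30, 39, 57, 83],
    "financial": [77, 91, 92, 73],
}

# Count how many initiatives fall under each category.
counts = {category: len(initiatives) for category, initiatives in codings.items()}
print(counts)
```

Because a single initiative may be coded more than once (as noted above), the same number can appear in several category lists, so the per-category counts need not sum to 99.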

THE NORWEGIAN EDUCATION SYSTEM

In Norwegian pre-school education, parents pay for day care, which focuses on the care and education of children from 1 to 5 years of age (45% attend day care). Norway offers compulsory education for the 6–16 age group and provides an open compulsory education system in which pupil performance plays a small role. All pupils progress, and all have a right to upper secondary education. It is impossible to fail, as all grades cover passed achievements. However, where there is competition for places, admittance to programmes in upper secondary education is based on grades achieved. In Norway, upper secondary education is divided into academic programmes and vocational training. Pupils have the statutory right to upper secondary education of three years (general studies) or four years (vocational qualification). The general studies lead to higher education, while the vocational qualifications can also lead to higher education if pupils take some supplementary subjects. Admittance to higher education can also be based on five years of work, or a combination of work and upper secondary subjects. This is done to ensure there are no dead ends in the system. The budgetary framework is set by the state, and Norway has the second highest expenditure rate in the OECD (Mortimore et al., 2004). The state is
responsible for basic minimum programme standards in the educational system, as it provides a general curriculum and goals. The municipalities are responsible for compulsory education (6–16 years) within the frames set by the state. Norway has a rather low rate of private school attendance: 1.7% of pupils attend subsidized private schools, and 1% attend special schools. With respect to knowledge and pedagogy (teaching), Norway has a strong focus on adapting the teaching to each pupil's abilities (although this does not function satisfactorily, according to the OECD), with both project/group work and subject orientation in the system (Mortimore et al., 2004). However, according to the PISA study of evaluated performance in Norway, in the current educational system "there is a bigger than average dispersion of scores despite the high level of equity within the system" (Mortimore et al., 2004, p. 5), and "there remains a worryingly long chain of underachievers …" (p. 57). Despite the lack of equity, Norway is found to have a very good system, and when a longer time perspective is used, the equity situation in Norway improves considerably, as indicated in the measurements for adult literacy. Nevertheless, the losers in the educational system tend to be pupils from lower socio-economic backgrounds and ethnic minorities. "It is likely that those pupils whose parents have enjoyed only limited schooling and other vulnerable children (in terms of their low socioeconomic status, potential special needs and in some cases ethnic and language background) make up the tail of underachievement" (Mortimore et al., 2004, p. 52).
The theoretical approach described for this study is specifically concerned with this issue: how education reproduces socio-economically based power relations and disfavours the already disadvantaged in society through its knowledge and teaching orientations. I will analyze this through the specific recommendations outlined in the OECD report and the Norwegian white paper, which specifically address education as a tool for equity.

ANALYSIS OF EQUITY DISCOURSES IN THE OECD AND NORWAY

The equity models will, as stated earlier, be analyzed along three dimensions: centralization/decentralization, standardization/destandardization, and specialization/despecialization. The data material will be presented as follows: for each dimension, I will first present the main findings, followed by the
network map demonstrating the categories and findings. The raw data will be presented through quotes related to each category. Owing to limitations of space, and to avoid too many theoretical "complications" in the presentation of the analysis, I will not refer to the Bernsteinian concepts in the main analysis, but the reader will be able to see how they are related in the description of the two equity models (see Section 2). I will return to Bernstein in the discussion.

CENTRALIZATION OR DECENTRALIZATION?

The category centralization/decentralization describes the degree of autonomy the Norwegian ministry gives the school, and whether the local authorities (municipal level) have high or low influence over education. To analyze the centralization/decentralization dimension, I have used the different modes of governance: rules/frames, goals and values, together with the four categories of policy instruments: legal, financial, informative and evaluative, as described by Aasen (2007). I will first present centralization initiatives from the OECD and Norway; initiatives related to decentralization will be presented afterwards. The analysis presented below the chart shows that both the OECD and the Norwegian ministry emphasize that the state should practise stronger control and governance over the education system. Both the OECD and the Norwegian government address many more initiatives related to centralization than to decentralization (from the Norwegian government we find only one decentralization initiative, compared to 53 for more centralization). In a Norwegian historical context this is not surprising, as Norway has a long tradition of strong state involvement in education. What may be more interesting here is the heavy emphasis on evaluation instruments, as there is a weak tradition of using accountability tools in Norway (cf. Hopmann, 2007). One interesting aspect, however, is that while the OECD addresses "added-value measures," this is not an area of focus for the ministry. The categorization of the raw data is presented after the chart (Table 2).

Table 2. Centralization or Decentralization?

                          OECD        Norway

Centralization
  Rules, frames, legal    –           17, 18, 35, 36, 40, 48, 52, 54, 76, 74, 77, 78, 82, 90, 97, 98, 28, 44, 87, 61, 67
  Goal                    17          21, 26, 29, 24, 56, 58
  Value                   –           15, 30, 39, 57, 83
  Financial               2, 11       77, 91, 92, 73
  Information             20          42, 45, 53, 75, 80, 62
  Evaluation              16, 19      6, 19, 25, 29, 32, 47, 51, 59, 96, 93, 63
Decentralization
  Value                   18          –
  Financial               21          5

Centralization – OECD

The OECD provides one recommendation addressing centralization that can be categorized as goal governance. "Goal governance refers to ministerial governance according to defined goals which will be followed up in planning and action, but where local units have the freedom to choose how and by what means the goals are to be attained" (Aasen, 2007, p. 25, my translation). The OECD recommends that "[t]he ministry and municipalities work with the teaching unions to devise a suitable range of intervention strategies" (17). The OECD further recommends two financial tools: "Financial instruments indicate the direction one should take according to which purposes and measures should be rewarded with economic funding" (Aasen, 2007, p. 26). The OECD says that "[t]he current level of investment in education should be maintained" (2) and "support for early childhood education and care [should be prioritized] over the costs of tertiary education" (11). One recommendation is categorized as information. "Informative instruments are used by the central authorities to spread knowledge and information as the underpinning for local decisions, be it at the school owner level (the local and county administration) or the classroom level" (Aasen, 2007, p. 26). The OECD recommends that "[t]he time devoted to multicultural, bilingual and special education issues in teacher training should be increased" (20). The OECD makes two more recommendations related to centralization through the policy instrument evaluation. As described by Aasen (2007, p. 26): "The educational institution is then evaluated according to goal achievement and performance results." The two evaluation instruments recommended by the OECD are: "The ministry should … support the development of 'added-value' measures" (16) and "The ministry engages with the municipal authorities and the offices of the county governors in order to create an appropriate 'light-touch' monitoring procedure" (19).


I will now move on to analyze initiatives related to centralization from the Norwegian socialist-alliance government.

Centralization – Norway

The Norwegian ministry provides 21 initiatives categorized as rules/frames: "Rule governance controls the school's activities through formal rules" (Aasen, 2007, p. 25, my translation). This category can also be described as legal policy instruments. "Legal instruments provide certain rules for behaviour through, for example, Acts, regulations, directives, instructions or curricula. A school system dominated by legal instruments will likely have a hierarchical authority structure with a strong central authority" (Aasen, 2007, p. 25). I have chosen not to separate the two in this analysis. Twenty-one initiatives are related to rule governance and legal instruments: "the local authority's duty to provide language stimulation …" (17); "consider section 5–7 of the Education Act on special-needs teaching to help children of pre-school age" (18); "expansion of the school day at the primary-school level up to 28 hours a week" (35); "implement arrangements for homework assistance" (36); "possible amendment to the objects clause in the Education Act [referring to differentiated learning]" (40); "review the rules regarding individual evaluation" (48); "school owners' responsibility to follow up results" (52); "consider whether existing rules are adjusted and expanded and if there is binding collaboration between sectors with responsibility for children and youngsters" (54); "give institutions that want it access to applicants with relevant professional training" (76); "expand the mandate of the Parent Council …" (74); "develop the principle of free higher education" (77); "possibility to freeze the interest on education loans for ten years" (78); "right to primary education so, if necessary, one can choose whether to adjust this in primary or secondary education" (82); "no possibility of putting adult applicants with a right to education on waiting lists" (90); "authorization for collection of necessary data in the Day-care Institution Act" (97); "proposals for necessary amendments to the Education Act and the Free School Act" (98); "proposals for amendments to the Free School Act …" (28); "amend the Education Act … to provide special-needs education according to individual decisions without an expert assessment" (44); "rescind the provision in the Education Act that lays down that only adults born before 1978 have the right to upper secondary education …" (87); "expand the information provided regarding the purpose and content of pupil counselling in the

222

CECILIE RØNNING HAUGEN

regulations for the Education Act’’ (61); and ‘‘amend the Education Act so that the right of minority-language pupils to special Norwegian teaching in upper secondary education is regulated by a separate paragraph’’ (67). Relating to goal governance, we find six initiatives from the Norwegian ministry: ‘‘a goal calling for day care for all y (21); ‘‘equality between genders in day care y ’’ (26); ‘‘make the effort against bullying goal directed y ’’ (56); ‘‘rationalize and make the governmental supervision of the Education Act and the Free School Act goal directed’’ (29); ‘‘put a major effort into competence development y [regarding] the day-care centre’s implementation and use of the curriculum y (24); and ‘‘national strategy for art and culture in education’’ (58). ‘‘Finally, value governance refers to control of the institutions by prescribing norms and values which are to be developed through pedagogic practice’’ (Aasen, 2007, p. 25, my translation). Relating to this, we find seven initiatives: ‘‘early intervention and greater completion rates in upper secondary education are prioritized’’ (15); ‘‘develop quality in primary and lower and upper secondary education’’ (30); ‘‘research the significance of nutrition for learning’’ (39); ‘‘ensure that there are cultural programmes for pupils with functional disabilities’’ (57); and ‘‘motivation and information activities on the right of adults to education and the value of education’’ (83). Four initiatives from the socialist-alliance government are related to financing: ‘‘continue the principle of free higher education’’ (77); ‘‘incentive schemes for learning activities in the institutions that can contribute to a higher participation of the poorly educated in learning and competence development’’ (91); ‘‘provide more educational programmes for prisoners, y ten million for this aim’’ (92); and ‘‘increase the support for a development and rationalization network for upper secondary education y ’’ (73). 
Furthermore, there are nine initiatives related to informative instruments: "guidelines with concrete examples of how the local authorities can work with early intervention …" (42); "guide to special-needs education in primary school …, special-needs assistance in day care" (45); "guidelines for multi-subjected and multi-sectoral collaboration on programmes for children and young people" (53); "templates for contracts with parents" (75); "guidelines for adult education in basic skills …" (80); and "guiding criteria for competence" (62).
The fourth instrument is evaluation. From the Norwegian ministry we find 15 initiatives for evaluation: "improve the reporting and information on use of funding" (6); "ambulating educators … evaluation of the project" (19); "evaluation of the implementation of the curriculum for the content and tasks in day care" (25); "governmental control through the Education Act and the Free School Act" (29); "evaluate and follow up the results of 'programme for digital competence' …" (32); "evaluate to what degree the possibility to reallocate up to 25 per cent in each subject is used and what this means for the pupils' learning" (47); "research on evaluation of Norwegian schools" (51); "evaluation of the 'cultural school bag' …" (59); "systematically follow up the higher educational institutions' work with study and career counselling" (96); "evaluate the three pilot projects with regional partnerships for career counselling …" (93); and "compile experience of programme subjects for selection and in-depth projects for improving vocational training and general studies education" (63).
I will now analyze the recommendations and initiatives relating to decentralization.

Decentralization – OECD

With respect to decentralization, the OECD has two recommendations. Categorized as value governance, we find: "The local government associations and the head teacher unions write guidelines to deal with school interventions" (18). Categorized as a financial instrument, we find: "The funding methods used to support the needs of immigrants should be reviewed after consultation with ethnic minorities" (21).

Decentralization – Norway

The Norwegian ministry provides only one initiative in relation to decentralization, categorized as value governance: "ensure that the competence development strategy is anchored in local considerations of competence needs" (5). All other aspects are related to a stronger power of the state over educational policies and practices at schools. I will now move on to the second dimension: how the initiatives to improve equity from the OECD and the Norwegian socialist-alliance government relate to standardization/destandardization.

STANDARDIZATION OR DESTANDARDIZATION?

The category standardization/destandardization describes the "origin" of school knowledge, teaching methods and educational outcomes: whether they are "ready-made" and more likely to be nationally determined, or whether knowledge, methods and outcomes can vary depending on, for example, local problems or circumstances, or pupils' backgrounds. As the analysis will demonstrate, both the OECD and the Norwegian ministry address a standardization of competence. However, while the OECD addresses a stronger standardization of the social base (acceptable classroom behavior, raised expectations), and by implication "blames" the pupils' work, the Norwegian ministry puts much emphasis on improving the competence of the teachers, and does not imply the more hierarchical relation that the OECD does through rules for classroom behavior and raised expectations. Both the OECD and Norway propose clearer criteria for evaluation. This could lead to a standardization of education that is characteristic of an equality model. What is more difficult to categorize are the initiatives related to early intervention and differentiated learning in the classroom. These could mean ensuring that more pupils meet the same standards (standardization), but they could also mean varying the content and teaching to a higher degree. Whereas the OECD states specifically that multicultural, bilingual special education should be improved, the Norwegian ministry addresses general initiatives relating to providing teaching that is differentiated in the classroom and adapted to the pupils' abilities. The Norwegian ministry also addresses the local context, stating that parents should have more influence, and that the education concept should be expanded through a focus on culture, digital competence, physical fostering and nutrition. These are not found in the OECD recommendations. When taking into account all the initiatives relating to the standardization/destandardization dimension, there is a potential conflict between the focus on clearer criteria for evaluation and the focus on cultural and individual needs.
I argue that when we take into account the Norwegian ministry's strong orientation toward more centralization of power/control described earlier, it appears more likely that the initiatives relating to standardization will "win" when initiatives collide. The categorization of the raw data is presented after the chart (Table 3).

Table 3. Standardization or Destandardization?

                                                  OECD        Norway
  Standardization
    Competence                                    3, 11       2, 8, 10, 11, 9, 12, 13, 24, 30, 72, 86, 88, 19, 3, 4, 31, 50
    Criteria for evaluation                       15          16, 27, 42, 43, 47, 80, 49, 50, 51, 52, 68
    Early intervention, differentiated teaching   11, 7, 13   44, 68, 15, 17, 18, 20, 36, 41
    Social base                                   5, 12, 14   55, 56
  Destandardization
    Vocational/general education                  7           64, 70, 71
    Cultural/individual needs                     4, 20, 21   69, 44, 36, 79
    Local context                                 –           5, 74
    Expanded educational concept                  –           37, 38, 31, 33, 34, 57, 58

Note: Recommendation number 10 from the OECD is not categorized: "The scope for innovation should be preserved and enhanced, particularly where it may improve equity."

Standardization – OECD

We find seven recommendations on standardization from the OECD. Two are categorized as a standardization of competence: "The comprehensive, non-streamed model of schooling should be retained" (3) and "support for early childhood education" (11). The latter recommendation is also categorized as early intervention/differentiated teaching, together with "the follow-up counselling service should be improved" (7) and "supporting the early learning of disadvantaged pupils in danger of underachieving" (13). Furthermore, reference is made to a standardization of criteria for evaluation: "The establishment of a research project to consider how age-related subject benchmarks can be developed alongside a new testing programme" (15). Three recommendations are categorized as a standardization of the social base: "Anti-bullying programmes …" (5), "Municipalities, the teachers' and the school students' unions and parents' representatives should draw up rules for acceptable classroom behaviour" (12) and "Municipalities, the teachers' and school students' unions should establish a working party to explore how expectations about pupils' intellectual capabilities can be raised" (14).

Standardization – Norway

The Norwegian ministry also places much emphasis on improving competence. Where the ministry differs from the OECD is that while the OECD focuses on the competence of the acquirer, the Norwegian ministry addresses an improvement of the competence (17 initiatives) of the educators in day care, school and adult learning: "demands for competence to teach central subjects on certain levels" (2); "competence development and practical experiments and development work in the day-care sector" (8); "recruitment of day-care teachers" (10); "improve mentoring of recently educated pre-school teachers" (11); "increase competence in the day-care sector" (9); "develop the knowledge base of the whole education sector …" (12); "make adjustments so both national and international knowledge is known and used in pedagogical educational institutions …" (13); "competence development … to bring into force and use the curriculum [for day care] …" (24); "develop the quality in primary, lower secondary and upper secondary education" (30); "improve competence development for instructors and professional managers in enterprises" (72); "develop programmes for public education providers …" (86); "create good adult education programmes at the primary school level …" (88); "ambulating educators …" (19); "improve tutoring of recently educated teachers" (3); "improve practical education …" (4); "increase the number of schools participating in the learning network" (31); and "improve teachers' evaluation/assessment competence" (50). The initiatives relating to competence may first of all be related to better education of the personnel: they should have special skills. This may strengthen the educational discourse by "shaping" the personnel of the educational institutions more clearly. However, depending on their interpretation and use, many of the competence initiatives could be used in a diversity orientation. Nevertheless, when examining them in connection with the criteria (soon to be described) and the strong emphasis on centralization (described earlier), I will argue that they are more likely to be interpreted in an equality orientation.
The Norwegian ministry also presents 11 initiatives related to a standardization of the criteria for evaluation, where performances should be assessed: "develop assessment tools for pedagogical use" (16); "improve the [website] School Portal as a management tool for school owners and administrators" (27); "develop a guide with specific examples of how the local authorities can work with early intervention in day care and school" (42); "develop the national tests and assessment tools adjusted to the new national curriculum entitled Knowledge Promotion" (43); "evaluate … meaning for pupils' learning" (47); "develop a guide for the teaching of basic skills to adults …" (80); "test various tools for evaluation, for example common signs for evaluating pupils' subject dividends" (49); "improve the teachers' evaluation competence" (50); "research assessment …" (51); "tighter link between pupils' results on assessment tests and school owners' responsibility for following up results" (52); and "level-based curriculum in basic Norwegian, combined with assessment tools" (68). These are initiatives that will potentially influence and control the content to be transferred, and in this way also control the educational discourse of the transference context. If the criteria are to be clear, they need to be easily measured; to be easily measured, they need to be as objective, and hence as context independent, as possible. This could help to direct the initiatives relating to decentralization and competence toward an equality model.
The Norwegian ministry also proposes eight initiatives categorized as early intervention/differentiated teaching: "offer special-needs teaching based on individual decisions without assessment by experts" (44); "level-based curriculum in basic Norwegian …" (68); "early intervention and greater completion rate in upper secondary education is prioritized" (15); "language stimulation for all children of pre-school age who need it …" (17); "special-needs assistance before school age …" (18); "follow up children with delayed or abnormal language development" (20); "homework assistance" (36); and "more resources in education as early as possible …" (41). These initiatives place a great deal of emphasis on monitoring learning and providing resources for children from a very early age by assessing/testing and providing improved differentiated teaching in the classroom. Bearing in mind the initiatives related to competence, criteria/assessment and early intervention/differentiation in the classroom, I find that they may strengthen the educational discourse by defining/controlling it to a higher degree. This may lead to the knowledge orientation that Bernstein (1977) calls a "collection code," where the focus is on the state of knowledge rather than on ways of knowing. This knowledge orientation is characteristic of an equality model. With respect to the social base (cf. Bernstein, 2000), the Norwegian ministry addresses anti-bullying, as the OECD does, but also mentions sex-related harassment specifically: "interventions against sex-related harassment and other sexual harassment" (55) and "interventions to prevent harassment …" (56). Where the Norwegian ministry differs from the OECD is that it does not address "raise expectations" or "rules for classroom behaviour." These initiatives could be related to what Bernstein (2000) refers to as positional control – strengthening the hierarchical relation between teacher and pupils, which is characteristic of an equality model.

Destandardization – OECD

With respect to destandardization categorized as the relation between vocational/work and general education, the OECD recommends that "The parity of esteem between general and vocational education should be preserved …" (7). Relating to cultural/individual needs, the OECD recommends that "An increased emphasis should be given to the principle of adapting the teaching to the pupils' abilities" (4); "The time devoted to multicultural, bilingual and special-education issues in teacher training should be increased" (20); and "support the needs of immigrants …" (21).

Destandardization – Norway

While the OECD does not address the need for a tighter relationship between education and work, the Norwegian ministry presents the following initiatives relating to destandardization through a tighter relation to work: "in collaboration with the employer and employee organizations, provide some examples of basic competences that can be offered in learning enterprises …" (64); "increase the number of young people who sign apprentice contracts" (70); and "increase the number of school places in the public sector" (71). With respect to differentiated teaching, the following initiatives are found: "make the follow-up service more preventive" (69); "special-needs education according to individual decisions without expert assessment" (44); "homework assistance" (36); and "provide flexible studies" (79). These elements, although responding to pupils' needs, can be interpreted in different ways, and their relation to the equity discourse will consequently depend on this. Taking this into account, it is possible that the pupils will be provided with differentiated teaching in the classroom consisting of closer follow-up to meet the same goals as in an equality model (cf. Hernes, 1974: result equity), rather than being treated to a more invisible pedagogy. However, the following two initiatives address a diversity orientation, in which the local context should also influence the transference context: "assure that the competence development strategy is anchored in local considerations of competence needs" (5) and influence parents by "expanding the mandate of the Parent Committee for primary and lower secondary education so it functions through the first year of upper secondary education" (74).
Finally, we have a more expanded concept of education initiated through the focus on nutrition, physical fostering, digital competence and culture: "encourage the provision of fruit and vegetables in school" (37); "encourage physical activity" (38); "increase the number of schools in 'learning networks' [digital competence]" (31); "historical archive material for digital use …" (33); "strengthening digital competence in teacher education and for school administrators" (34); "ensure that cultural programmes are also available for pupils with functional disabilities" (57); and "develop a national strategy for art and culture in education" (58). From the standardization/destandardization dimension, we now move on to the last dimension: specialization/despecialization.

SPECIALIZATION OR DESPECIALIZATION?

The category specialization/despecialization refers to the role of the teacher and the school: whether this role is strongly classified in relation to the community, work and the world outside school. In an equality model, the teacher likely has a strong subject and individual pupil orientation, works individualistically and has limited interaction with external school agencies (Solstad, 1997). As the analysis will show, the OECD's initiatives relating to early intervention and criteria for evaluation may be oriented toward an equality model, where there are specific standards to be met. The Norwegian ministry also emphasizes early intervention and clearer criteria for evaluation, but additionally places a strong focus on improving the teachers' competence. Competence and early intervention are not in themselves automatically connected to an equality model. However, when mirrored against the initiatives for centralization/decentralization (where 53 of the 54 initiatives are categorized as centralization) together with the standardization/destandardization dimension (where, for example, much attention is placed on providing clearer criteria for evaluation), it may be the case that competence and early intervention relate to an equality model rather than a diversity model. However, the picture is not one dimensional. There are many initiatives, both from the OECD and the Norwegian ministry, where the recommendation is to open the educational space to more people (in/out of work and immigrants) with respect to both age (early childhood education, adults, and life-long learning) and arenas (education in workplaces, prison, and kindergarten). These initiatives are linked to a diversity model where education is not limited to schools for some hours a day, but rather is expanded to cover various arenas and ages.
However, the question is whether these openings and the expansion of the educational field would focus on standardizing people according to some specific, predefined competences or skills, or whether the knowledge and needs would be built from "below" and vary to a high degree depending on who, where, when and with what purposes. The first would be related to an equality way of thinking, whereas the latter would be related to a diversity approach. The categorization of the raw data is presented after the chart (Table 4).

Table 4. Specialization or Despecialization?

                             OECD                Norway
  Specialization
    Early intervention       13                  17, 18, 20, 41
    Teachers' competence     –                   3, 4, 9, 10, 11, 12, 13, 19, 24, 31, 72, 50
    Criteria for evaluation  15                  16, 49, 68
  Despecialization
    Access, links            3, 9, 11, 1, 7, 8   54, 65, 70, 71, 81, 89
    Heterogeneity            20, 21, 16          69, 44, 36, 79
    Arena/age                6, 11               64, 84, 85, 92, 21, 23, 18, 17, 86, 87, 88, 90

Specialization – OECD

The OECD provides two recommendations relating to a specialization of the role of teacher and school: "support the early learning of disadvantaged pupils in danger of underachieving" (13), categorized as specialization through early intervention; and "age-related subject benchmarks … developed alongside the new testing programme" (15), categorized as a specialization of criteria for evaluation.

Specialization – Norway

The Norwegian socialist-alliance government addresses the following four initiatives with respect to early intervention: "language stimulation for all children of pre-school age …" (17); "special-needs assistance before school age" (18); "follow up of children with delayed or abnormal language development" (20); and "more resources as early as possible …" (41). Furthermore, twelve initiatives are found on improving teachers' competence: "improve mentoring of recently educated teachers" (3); "improve practical education …" (4); "competence improvement in the day-care sector" (9); "recruitment of pre-school teachers" (10); "improve mentoring of recently educated pre-school teachers" (11); "further develop the knowledge base of the whole education sector …" (12); "national and international knowledge is known and used in pedagogical educational institutions …" (13); "ambulating educators …" (19); "competence development … day care's implementation and use of the curriculum …" (24); "increase the number of schools participating in the learning network" (31); "improve the competence development of instructors and professional leaders in enterprises" (72); and "improve teachers' evaluation/assessment competence" (50). With respect to criteria for evaluation, the following initiatives are found: "assessment tools for pedagogic use" (16); "testing different tools for evaluation, among others, common signs for evaluating the pupils' subject dividends" (49); and "level-based curriculum in basic Norwegian, combined with assessment tools" (68). These are initiatives that entail a specialization of the teacher role, in the sense that it becomes more restricted.

Despecialization – OECD

With respect to despecialization, the OECD recommends that education should be opened to more people (access), ages and arenas: "The comprehensive, non-streamed model of schooling should be preserved" (3); "Additional suitable provisions should be made for adults (including immigrants) who wish to pursue primary and secondary education courses" (9); and "priority should be given to support for early childhood education …" (11). Furthermore, there should be links and few boundaries in the system: "The basic structure of the education system should be preserved" (1); and "The parity of esteem between general and vocational education should be preserved" (7). The OECD recommends heterogeneity by addressing "multicultural, bilingual and special education issues in teacher training …" (20) and suggesting to "support the needs of immigrants …" (21). The recommendation addressing school choice is also categorized as heterogeneity: "launch discussions with municipalities and other stakeholders on the implications of potential demands for school choice" (16). This is because school choice means that schools must expect to respond to different wishes from parents, to compete for pupils and to market themselves. Through such an orientation, schools may become more homogenized internally according to pupil backgrounds, but more heterogeneous externally, differentiating themselves from other schools.
With respect to a despecialization of the education arena and age, the OECD states that "The life-long learning perspective should be retained" (6). This is also expressed in the aforementioned recommendation: "priority should be given to support for early childhood education …" (11).

Despecialization – Norway

The Norwegian ministry also addresses the despecialization of education through a focus on access and links (no dead ends). These initiatives are oriented toward opening the structure of education by organizing the educational space differently. For example, the following initiatives are to be undertaken: "consider if the current acts are adjusted and expanded, and if there is binding collaboration between sectors that have responsibilities for children and youngsters" (54); "collaborate with the employer/employee organizations" (65); "collaborate with the Council for Vocational Education …" (70); "increase the number of school places in the public sector" (71); "invite KS (The Norwegian Association of Local and Regional Authorities) and other relevant stakeholders to collaborate …" (81); and "collaborate with the municipal sector on job-seekers who need an education" (89). With respect to heterogeneity, we find the following initiatives: "make the follow-up service more preventive" (69); "special-needs education according to individual decisions without expert assessment" (44); "homework assistance" (36); and "provide flexible studies" (79). These elements, although responding to pupils' needs, can be interpreted in different ways, and their relation to the equity discourse will consequently depend on this. When related to the analysis of centralization and standardization, I maintain that the initiatives are most likely related to an equity-through-equality model. Bearing this in mind, pupils might be provided with differentiated teaching in the classroom consisting of closer follow-up to meet the same goals as in an equality model (cf. Hernes, 1974: result equity), rather than being treated to a more invisible pedagogy (cf. Bernstein, 1977) as in a diversity model.
Relating to age and arena, twelve of the Norwegian ministry's initiatives concern an expansion of the educational space (according to both age and arenas). Pre-school/day care is now given more responsibility for learning: "goal to have day care for everybody …" (21); "legal right to a place in day care" (23); "consideration of … special-needs teaching assistance before school age" (18); and "provide language stimulation to all children of pre-school age who need it, regardless of whether they attend day care or not" (17). Adults, whether in general, in work or out of school/work, are to be offered more learning. Examples of this are: "develop learning programmes in basic skills …" (86); "eliminate the provision in the Education Act stating that only adults born before 1978 have the right to a high school education …" (87); "create good adult learning programmes in elementary skills throughout the entire country" (88); "specify for the local authorities that it is not possible to put adult applicants with a right to education on waiting lists" (90); "in collaboration with employer/employee organizations provide some examples of basic skills …" (64); "consider making the 'Programme for basic competences in working life' permanent" (84); "differentiated educational programmes for adults who cannot be reached through efforts related to the workplace" (85); and "more educational programmes for prisoners …" (92).

DISCUSSION: THE OECD'S ROLE IN NORWEGIAN POLICYMAKING

The purpose of this paper has been to analyze the equity orientations found in the OECD's thematic review on equity compared with the Norwegian socialist-alliance government's equity orientation as described in White Paper 16, focusing specifically on how education could be used as a tool for promoting equity. Do the two documents share a common approach? What are the differences and similarities? What might the Norwegian socialist-alliance government reject? As I have demonstrated, the OECD and the Norwegian government address both an equality model and a diversity model. This is as expected, as policies are always based on compromises (cf. Apple, 2006). Through the analysis I have demonstrated that the Norwegian socialist-alliance government responds to most recommendations from the OECD, but there are also differences in some areas. With respect to the centralization/decentralization dimension, we see that both the OECD and the Norwegian government recommend stronger state involvement in education. The Norwegian white paper has a clear imbalance between initiatives addressing centralization as opposed to decentralization: 53 promote stronger centralization, while only one initiative is related to decentralization. For the OECD, the relation is six recommendations for centralization compared to two for decentralization.
As stated in the introduction, Norway has a weak tradition of using accountability tools, such as national tests and measurements of educational outcome. The Norwegian school is now in the process of placing greater focus on accountability measures, with more emphasis on policy instruments such as evaluation and clearer criteria for evaluating educational outcomes. An interesting difference in the content addressed in this regard is the Norwegian ministry's rejection of added-value measures in combination with parental choice (see recommendation number 16 from the OECD). Added-value measures constitute one typical element of the No Child Left Behind approach in the United States, where schools and teachers are rewarded or reprimanded depending on whether or not scores on national tests improve from year to year (Hopmann, 2007). The Norwegian ministry accepts surveillance through evaluations but rejects both a tighter follow-up, as in "added-value" measures, and instruments such as reward and punishment. This difference between the OECD and the Norwegian red-green government when it comes to added-value measures and the use of reward and punishment is also found in a comparative analysis of the OECD's and the Norwegian ministry's positions on teacher education policies (Haugen, 2010c, work in progress). The Norwegian discourse is less influenced by neoliberals, for whom the use of incentives (or punishment) based on accountability tools such as testing is an important ingredient in the management of education. Furthermore, the ministry rejects the idea of providing more choice to parents. The issue of privatization and choice is perhaps one of the main differences between the educational policies of the Right and the Left in Norway. However, the Norwegian ministry does propose that parents should have more influence in primary and lower secondary education (see initiative number 74), but choice is not recommended.
With respect to the standardization/destandardization dimension, we find that the OECD and the Norwegian ministry address both standardization and destandardization, with a focus on standardizing competence, providing clearer criteria for evaluation, undertaking early intervention and differentiated learning, and taking the social base into consideration. Where the two documents diverge is that while the OECD addresses improved competence and the "morals" of the acquirer by raising expectations and creating better standards for classroom behavior through rules, the Norwegian ministry emphasizes improving the competence of teachers. The Norwegian discourse is less characterised by a moral authority, for both teachers and pupils, an element which is a key part of the conservative and authoritarian discourse (cf. Apple, 1989).

Discourses on Equity in Education in the OECD and Norway

The focus on clearer criteria for evaluation may conflict with the initiatives related to destandardization, where, for example, cultural and individual needs come into play. Although the OECD explicitly addresses multiculturality, the Norwegian ministry addresses differentiated teaching in the classroom and learning in general. This lack of explicitness may also indicate a difference in whether the initiatives for differentiated teaching should be related to result equity (the same outcome) or to a varied orientation. The Norwegian ministry has also expanded the educational concept so that "new" knowledge is included.

For the specialization/despecialization dimension, where the focus is on the teachers' and schools' role, we find that both the OECD and Norway address the issues of early intervention and clear criteria for evaluation, in addition to a strict teacher role for assuring that pupils "meet the standards." The teacher's competence in becoming more specialized is addressed only by the Norwegian ministry. It is clear that the Norwegian ministry finds that teacher training has an important role in improving equity in education. Initiatives relating to despecialization are found in both the OECD report and the Norwegian white paper: providing more access to and links between the education systems, and opening education to more ages and in more arenas. The concept of life-long learning becomes important.

All in all, whereas the OECD to a higher degree presents a hybridization between an old middle-class discourse (equality model) and a new middle-class discourse (diversity model), the Norwegian ministry emphasizes the old middle-class discourse to a higher degree. The combination of the two models for both the OECD and Norway could, however, be described as a centralized decentralization (cf. Bernstein, 2001) or conservative modernization (Apple, 2006), as referred to in Section Objectives. One difference, however, is that the Norwegian ministry rejects choice, the raising of expectations and the adding of rules for classroom behavior.
The Norwegian ministry has a stronger focus on teachers' competence than on pupil "morals" (with the exception of the focus on anti-bullying programmes). It thus appears that the Norwegian socialist-alliance government is less influenced by neo-liberalism and authoritarian populism than the OECD. However, the focus on national testing and strong centralization, together with giving the working world more influence in the education field, could also open the way for more market-oriented education. When results from the national tests are made public (and most likely understood as an indicator of the quality of the school), there is a danger that parental choice will still enter the picture, as parents with good economic and cultural resources will likely prefer to reside in areas with good test results, choosing the "best" education for their children. This could in turn serve as a market force, even though the state has national control and governance over the schools. Thus, there may not be that big a difference compared to opening for choice and added-value measures.

As a closing comment, I find it interesting to look at what is "missing" in the documents. The approaches described in the two documents do not problematize power in education, although they do recognize and focus on the fact that structural differences exist. The solution for improving inequities is first of all targeted at providing more educational opportunities to more people. Opening education systems to more people and providing opportunities to learn does not, however, automatically mean that education is becoming more democratic (cf. Bernstein, 2001). Bernstein's theory on how power and control are firmly embedded in education does not itself provide a framework for improving equity in education. However, we can apply his theories as tools to reflect on and expose power relations within education. If he is right, neither of Solstad's equity models can improve equity, and it would be naive to think that by putting our faith in one model we are leading the way to better equity. To improve equity in education one would, in other words, need to turn to other sources for guidance. A different way of targeting inequities could be based on critical education theory and research (Haugen, 2009). Understanding the relation between differential power and education through critical education theory would imply that we need to examine both the form and content (the framing and classification characteristics) of education to ascertain whether they could transform the current inequities. This involves considering them as part of a larger system, where the focus would be on how economy, politics and culture are interconnected.

NOTE

1. For more about "authoritarian populism," see Apple (1989, pp. 5–10).

REFERENCES

Aasen, P. (2007). Læringsplakatens utdanningspolitiske kontekst [The Learning Poster's Political Context. In Norwegian]. In: J. Møller & L. Sundli (Eds), Læringsplakaten – skolens samfunnskontrakt [The Learning Poster – School's Social Contract] (pp. 23–44). Oslo: Norwegian Academic Press.


Apple, M. W. (1989). Critical introduction: Ideology and the state in education policy. In: R. Dale (Ed.), The state and education policy (pp. 1–20). Bristol: Open University Press.
Apple, M. W. (2006). Educating the "right" way. Markets, standards, god and inequality. New York: Routledge.
Ball, S. J. (1994). Education reform. A critical and post-structural approach. Philadelphia: Open University Press.
Bernstein, B. (1977). Towards a theory of educational transmissions. Class, codes and control (Vol. 3). London: Routledge & Kegan Paul.
Bernstein, B. (1990). The structuring of pedagogic discourse. Class, codes and control (Vol. 4). London: Routledge.
Bernstein, B. (2000). Pedagogy, symbolic control and identity. Boston: Rowman & Littlefield.
Bernstein, B. (2001). From pedagogies to knowledge. In: A. Morais, I. Neves, B. Davies & H. Daniels (Eds), Towards a sociology of pedagogy: The contribution of Basil Bernstein to research (pp. 363–368). New York: Peter Lang.
Bliss, J., Monk, M., & Ogborn, J. (1983). Qualitative data analysis for educational research: A guide to uses of systematic networks. London: Croom Helm.
Chouliaraki, L., & Fairclough, N. (1999). Discourse in late modernity. Edinburgh: Edinburgh University Press.
Clarke, J., & Newman, J. (1997). The managerial state. Power, politics and ideology in the remaking of social welfare. London: SAGE Publications.
Eide, K. (1995). OECD og norsk utdanningspolitikk. En studie i internasjonalt samspill [OECD and Norwegian Education Policy. A Study of International Interaction. In Norwegian]. Oslo: Utredningsinstituttet for forskning og høyere utdanning.
Fairclough, N., & Wodak, R. (1997). Critical discourse analysis. In: T. A. van Dijk (Ed.), Discourse as social interaction. London: SAGE Publications.
Haavelsrud, M. (2009). Reflections on knowledge and equity. Festschrift to Betty Reardon.
Haugen, C. R. (2009). What critical education theory and research can tell us about the relation between differential power and education. Trial lecture for PhD defence, 27 November 2009. Trondheim: Norwegian University of Science and Technology.
Haugen, C. R. (2010a). OECD for en konservativ modernisering av utdanning? [OECD for a Conservative Modernization of Education? In Norwegian]. In: E. Elstad & K. Sivesind (Eds), PISA – sannheten om skolen? [PISA – the Truth about Education? In Norwegian]. Oslo: Universitetsforlaget.
Haugen, C. R. (2010b). Educational equity in Spain and Norway: A comparative analysis of two OECD country notes. Educational Policy (accepted for publication 28 February 2010, in preparation for publication).
Haugen, C. R. (2010c). A comparative analysis of the OECD's and Norway's teacher education policies (work in progress).
Hernes, G. (1974). Om ulikhetens reproduksjon. Hvilken rolle spiller skolen? [About the Reproduction of Inequality. What Role Does School Play? In Norwegian]. In: M. S. Mortensen (Ed.), I forskningens lys [In the Light of Research] (pp. 231–251). Oslo: Lyches Forlag.
Hopmann, S. T. (2007). Epilogue: No child, no school, no state left behind: Comparative research in the age of accountability. In: S. T. Hopmann & M. R. Brinek (Eds), PISA according to PISA. Does PISA keep what it promises? Wien: Lit Verlag GmbH & Co.


Hopmann, S. T., Brinek, M. R., & Retzl, M. (Eds). (2007). PISA according to PISA. Does PISA keep what it promises? Wien: Lit Verlag GmbH & Co.
Jørgensen, M. W., & Phillips, L. (1999). Diskursanalyse som teori og metode [Discourse Analysis: Theory and Methods. In Danish]. Roskilde: Roskilde Universitetsforlag.
Mortimore, P., Field, S., & Pont, B. (2004). Equity in education. Thematic review. Norway. Country note. OECD. Available at http://www.oecd.org/dataoecd/10/6/35892523.pdf
Norwegian Ministry of Education and Research (2006/2007). Report to the Storting (White Paper no. 16). … and Where No One Was Left Behind. Early Intervention for Lifelong Learning.
Odora Hoppers, C. A. (2008). Education, culture and society in a globalizing world: Implications for comparative and international education. Keynote address to the British Association for International and Comparative Education, Annual Conference on Internationalisation in Education: Culture, Context and Difference.
Solstad, K. J. (1997). Equity at risk. Oslo: Scandinavian University Press.
Svennevig, J. (2009). Språklig samhandling [Linguistic Interaction. In Norwegian]. Fagernes: Cappelen Damm AS.

THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES ON NATIONAL EDUCATION POLICYMAKING: THE CASE OF SLOVENIA – HOW MANY WATCHES DO WE NEED?

Eva Klemencic

ABSTRACT

This chapter discusses the influence of international educational studies on knowledge in a general sense. In the theoretical framework, a split between realistic and constructivist theories of knowledge is discussed, with special regard to global and local knowledge. Since Slovenia is a country that is included in a number of different international comparative educational studies and assessments – and, even more so, has been participating in these studies continually for the last two decades – the focus is on Slovenian educational policymaking (PM). The chapter for the first time analyzes the impacts of different international studies on national PM and predicts future Slovenian participation in these studies; the chapter could therefore be interesting for national and international audiences involved in comparative education research. For the estimation of existing impacts on national PM, semi-structured interviews were used. The findings suggest that international results have provided an argument for some direct and indirect curricular and syllabus changes over the years. Furthermore, some of the arguments for changing the national education system on the basis of the international findings remain rather declarative, irrespective of experts' and policymakers' estimations of how great an impact these studies have in Slovenia. Future research on this topic for Slovenian PM in education will need more secondary analysis of the data collected in both national and international assessments.

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 239–266
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013012

INTRODUCTION

Decades ago, indicators of the quality of education systems were formed. Examples are the number of schools and students at a particular school level, or the average number of teachers in proportion to the number of students. In recent decades, findings on educational (school) access are no longer the most important indicator of educational quality (Straus, Klemencic, Brecko, Cucek, & Gril, 2006; Klemencic & Rozman, 2009). It is clear that the mere assurance of access to the education system cannot guarantee educational effectiveness. Indicators of so-called educational results came to the forefront, that is, benchmarking of knowledge achievements in different fields (Straus et al., 2006).

In our current society, knowledge is the central element of effectiveness in global economic competition, which has caused political perception to move from problems of access to education systems and managing the quantitative growth of an education system to qualitative criteria such as benchmarks and proficiency levels (Husén & Tuijnman, 1994, in Straus et al., 2006). Students' achievements, school efficacy, and the responsibility for setting achievement aims have become some of the most important criteria for establishing the quality of educational systems (Bottani & Tuijnman, 1994, in Straus et al., 2006). "Since education has many purposes and components (which are interrelated; a serious deficit in one is likely to have an implication for quality in others), the question regarding quality may reasonably be posed about any important aspect of a system": teacher training, teaching, education material, infrastructure, administration, students' achievements, etc. (Kellaghan & Greaney, 2001, pp. 22–23). Many of these aspects are investigated in international and/or national assessments.


In Slovenia as well, one of the theoretical principles in the general theoretical framework for the renewal of the public education system is the "demands for reaching the internationally verifiable standards of knowledge of the developed countries /…/" (White Paper on Education in the Republic of Slovenia, 1996, p. 34) and, more concretely, to "make possible the achievement of internationally comparable standards of knowledge after the completion of primary education" (ibid., p. 92). Even though we are now in the process of renovating our basic conceptual document of the education system, the White Paper on Education, we can foresee that such a statement will be present in our new conceptual document as well. These directions, however, "require" participation in comparative education studies, and Slovenia is a country that is included in a number of different international education studies and assessments, for example, the Programme for International Student Assessment (PISA), Trends in International Mathematics and Science Study (TIMSS), Second Information Technology in Education Study (SITES), Progress in International Reading Literacy Study (PIRLS), Civic Education Study/International Civic and Citizenship Education Study (CIVED/ICCS), and the Teaching and Learning International Survey (TALIS); even more so, it has been participating in these studies continually. Therefore, my basic question is: What could these studies tell us about our education system and, even more, what could they tell us in comparison with other systems – beginning with the question of why we need the comparison at all?

The increase in comparative education research within education systems all over the world is nothing new.
Different international institutions are also interested in developing such studies [e.g., the International Association for the Evaluation of Educational Achievement (IEA), the Organization for Economic Co-operation and Development (OECD), the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ), etc.]. My question is: how are these studies impacting Slovenian national education policymaking (PM)?

INTERNATIONAL COMPARISONS AS A PROCESS OF EDUCATION AND KNOWLEDGE GLOBALIZATION

I can identify at least three lateral directions of knowledge globalization, in the sense of international comparisons as a process that could cause the globalization of knowledge and education, thereby reducing the significance of local knowledge1 in a particular country or restricting its education system. The reasons for this are threefold. First is the level of national and international assessments and their potential connections. Second is comparison as ranking, when international league tables become a significant source of influence in the shaping of national educational policy. The third, which is actually a consequence of the prior two and illustrates the process of knowledge globalization in particular, is the changing curriculum and especially the knowledge structure in the curriculum and/or some syllabuses.

Knowledge globalization represents to me the multidimensional globalization processes in Giddens' terms (2007; in Klemencic, 2010) in the sense of deterritorialization, which means that time and space are no longer tied to a particular territory, where and when "social space can no longer be wholly mapped in terms of territorial places, territorial distances and territorial borders" (Scholte, 2005, p. 17). Therefore, it is impossible to identify the concrete impacts of globalization on education. These impacts construct a complex reality out of different processes and dimensions: political and economic globalization (education is no longer dominant within national frontiers; some institutions are gaining importance – speaking in terms of knowledge, great importance is placed on institutions such as the OECD, IEA, etc.)2 and cultural globalization (which I understand as a link between "globalization from above" and "globalization from below").3

From that starting point, I will try to conceptualize global and local knowledge. While local knowledge4 is still not deterritorialized in a global perspective, it is knowledge specific to some territories.
It is not necessarily tied to territories in a geographical sense, but to some "territory" within which particularities can be exposed (which cannot be universally accepted). Local knowledge is shared by a community of practice or is locally available (Evers & Gerke, 2003), bound by language, tradition, and values to a particular community (Antweiler, 1998; Sillitoe, 1998; in Evers & Gerke, 2003). Local knowledge therefore relates to particular knowledge that is dominant within a particular region – local, national, or regional – and knowledge that some special groups possess (e.g., feminist knowledge). On the contrary, global knowledge relates to "universalities" that have widespread recognition and are accepted as common knowledge. The starting point for global knowledge understands reality as not (socially) constructed and requires a view of knowledge that simply describes reality without perspectives – knowledge that is not bound to any particular geographical territory or group.


NATIONAL AND INTERNATIONAL ASSESSMENTS

The term "assessment" in the field of education is used to refer to any "procedure or activity that is designed to collect information about the knowledge, attitudes or skills of a learner or a group of learners" (Kellaghan & Greaney, 2001, p. 19). The collected assessment information can be used at a variety of levels and for a variety of purposes. At the individual student level, it is used in relation to grade promotion, motivation, certification, selection for the next level, describing students' learning and problems, etc. At the institutional (school) level, the results of an assessment can be used as a measure of the effectiveness of a school (e.g., the percentage of students who reach a given level of "proficiency").5 System assessments, meanwhile, provide information on an education system and can be used to reach a judgment about the adequacy of the performance of an education system or a part of it (Kellaghan & Greaney, 2001, pp. 20–21).

Nowadays, we can perceive an increase in international education comparison, but assessments in education are nothing new. Over the years, countries have been making their own assessments with the main goal of improving their education in the sense of raising its quality. For that they use national and international assessments. The trend indicates a growth in international assessments (Klemencic & Rozman, 2009),6 with national learning assessments growing rapidly too, "especially in mathematics and languages and more so at the primary and lower secondary levels" (Kamens & McNeely, 2010, p. 6).7 National studies of language and international studies of (e.g., reading) literacy are, however, distinguished (as will be presented shortly when PIRLS and its impacts on Slovenian educational PM are explored). In addition, mathematics, science, citizenship, and reading literacy are the focus of international assessments in particular.
National assessments (sometimes called system/learning assessments or assessments of learning outcomes) tend to be initiated by the government (the Ministry of Education, to be exact), as distinguished from international assessments, whose initiatives originate from members of the research community. The main difference between the two types of assessment is that national assessments are designed and implemented within individual countries with regard to their particularities (sampling design, instrumentation, frequency, etc.), whereas international assessments require participating countries to follow similar procedures and use the same instrument. However, they are also similar in that national assessment instruments are developed to assess students' knowledge and skills (Greaney & Kellaghan, 1996, pp. 12–25; Kellaghan & Greaney, 2001, pp. 33–63). Even if there is not a lot of doubt about the applicability of national assessments, which can better reflect the circumstances in a country and its education system (without problems of translation and national adaptations, or problems of sampling), international assessments provide additional reflection on education achievements in a country and across countries. The obvious advantages of international assessments are: (a) the collection of descriptive data about benchmarking in comparison with other systems (because they are not absolute achievement standards); (b) they can contribute to a better understanding of the correlates of achievement differences within a country and between countries; (c) they demand high technical standards in collecting data, which can help research quality in a country without a strong research tradition; (d) cooperation between countries increases the attention given to activities in other countries for the improvement of education (learning from other experiences); and (e) they increase media and policymakers' attention (Straus, 2004, pp. 14–15).

In what way can these comparisons be understood, and what benefits do potential comparisons of education systems have? When countries make their own (national) assessments to improve their education in the sense of raising its quality, and simultaneously the standard of quality is actually internationally accepted (globally defined or universally accepted), could this mean the first step in the process of the globalization of education? This would assume a vigorous uniting of national and international assessments in the future, on the one hand, and an increased sense of researching epistemic universalizations, in the sense of global knowledge, on the other.

INTERNATIONAL AND NATIONAL EDUCATION ASSESSMENTS IN SLOVENIA

Slovenia separated from Yugoslavia in 1991. Since then, Slovenia has been participating in numerous international comparative education studies: in OECD PISA from the third data collection onwards (PISA 2006, 2009), in OECD TALIS 2008, and now in PISA 2012 as well. The history of Slovenian participation in IEA international comparative studies is even longer. Before the millennium, Slovenia participated in RLC (The Reading Literacy Study; 1985–1994), TIMSS 1995 (1993–1997) and TIMSS-R 1999 (Third International Mathematics and Science Study – Repeat; 1997–2001), LES (The Language Education Study; 1993–1996), CompEd (The Computers in Education Study; 1987–1993), CIVED 1999 (1994–2002), and SITES-M1 (Second Information Technology in Education Study – Module 1; 1997–1999). After the millennium, Slovenia participated in IEA SITES-M2 (1999–2002), TIMSS 2003, 2007 and TIMSS Advanced 2008, PIRLS 2001 (2000–2003), PIRLS 2006 and SITES 2006, and ICCS 2009 (2006–2010). Now Slovenia is participating in IEA TIMSS and PIRLS 2011, as well as in the ongoing EU international study, the European Survey on Language Competencies (ESLC).

Like many other countries around the globe, Slovenia conducts national education assessments as well. The National Examination Centre (2006) is responsible for the external assessment of pupils, apprentices, students, and adults in Slovenia. Besides other tasks,8 it prepares and administers the General "Matura" exam (a school-leaving exam enabling candidates to enroll in academic courses at Slovene universities), the Vocational "Matura" exam (core curriculum subjects), and the national assessment in nine-year primary education.9 For the General Matura exam (consisting of five subjects), students take tests in Slovene (in ethnically mixed areas, Hungarian or Italian, respectively), Mathematics, and a modern foreign language (these three are compulsory), plus two optional subjects. The Vocational Matura exam consists of the candidate's mother tongue and a basic subject (or specialization); the third subject is a choice between Mathematics, a foreign language, and a second language, and the fourth subject is practical work in the chosen profession or branch. Pupils at the end of primary education take tests (national assessments) in Slovene, Mathematics, and either a modern foreign language or another optional subject.

Comparing assessments at the national and international level highlights that Mathematics and language10 are dominant in the national assessments in Slovenia, while literacy dominates in international comparative education studies (with the modern international comparative education studies mentioned above conceptualizing science, reading, citizenship, and computer literacy as well).

COMPARISON AS A RANKING

"Across the globe, international league tables have become a significant source of influence in the design of national educational policies" (Takayama, 2008, p. 387) and, to be more precise, they have been one of the "strongest pretexts for school reform in many nations" (Sahlberg, 2007, p. 162). Although the aim of international comparative education studies is not solely to produce a ranking, they "are often absorbed by politicians simply by looking at the league tables of countries, rather than by trying to learn about national underlying characteristics that might explain comparative system performance" (Sahlberg, 2007, p. 163). League tables have never been the predominant concern of the research groups that establish such research frames, but they are at the forefront of political and media activity, which is logical in two different contexts.

The first context is the contemporary era of competitiveness in all modern societies. Thinking of schools as a microcosm of society, the education system represents a mesocosm (or interfering cosmos) of modern society: (a) in the sense that the limits of national boundaries represent the connections with other subsystems of everyday life (like health, defence, etc.) and the public management of the country as a whole; and (b) in assuming modern societies to be global, which represents competitiveness between (mostly heterogeneous) economic11 societies and its consequences for people's everyday lives. The second context concerns simplification. Frequently, for the media and politicians to feel sufficiently informed, merely glancing over the league tables to judge the domestic education system as more or less effective would suffice.

As globalization represents a phenomenon entering (almost) all spheres of human activity, many obvious aspects exist; there are "economical and political, social and cultural" spectrums of globalization (Jenicek, 2006, p. 1). Education and knowledge are connected with all spectrums or dimensions of globalization. In addition, we perceive the economy and the labour market as global, even if we know that there is truly nothing like a global economy; there are only (global) economies, because there is no global totality. The global does not exist;12 the global is made up of fragmented and individually differing but fundamentally similar pieces.
If globalization is to exist in totality, then this presupposes the universalization of all spectrums of globalization, including knowledge and education systems. My question at this point is: Is global knowledge the final and unquestionable result of total universalization? It seems, therefore, that the comparing of education systems that can push up economies (if we take into consideration only that possible effect and neglect knowledge in itself) starts with achievement measuring and ranking. "Countries ranking high in league tables instantly become the symbol of educational excellence" (Takayama, 2008, p. 387). It seems that they are interpreted, in terms of the education system (in total), as best practice. National systems vary in social, historical, and cultural contexts; therefore, their education systems (portrayed through the four social institutions – goals, decision-making authority, curriculum topics, and student attainment) should reflect that variety (Schmidt et al., 2001, p. 17).

Even if the interpretation of achievements from league tables is certainly applicable, such interpretations often do not include enough critical investigation of other research elements. This may be a reason why important characteristics of the knowledge demonstrated in a country remain hidden, and, in addition, incorrect conclusions about educational circumstances may be derived. When comparing the average achievement of students in one country to the average achievement of students in another country, it is only possible to make interpretations about the quality of the education system with regard to some "norms" – for example, comparative student achievements; these are normative interpretations (Hambleton & Sireci, 1997, in Straus, 2004, p. 16). To establish what students know and what they are capable of, student achievement results must be interpreted with regard to some external criterion and standards, for example, expected benchmarks for students, or students' achievements with regard to a particular knowledge domain. These are criterion interpretations, which are often tied to aims and standards in syllabuses; they therefore interpret students' achievements with regard to the standards in syllabuses (Hambleton & Sireci, 1997, in Straus, 2004, p. 17). However, even those are not without limitations in defining the criterion, and due to this it is sometimes very difficult to produce descriptions of the importance of student achievements that will be equal or even similar among diverse interpreters (Straus, 2004, pp. 16–17). Consequently, it seems that the phenomenon of the best practice of a certain education system is basically obtained solely from the data seen in international league tables. But does this also have an effect on the knowledge structure in a globalized world?

WHY DO WE NEED COMPARISON AT ALL?

When educational sociologists examine historical trends, events such as "major military losses, colonial failures, or sharp economic decline" are often shown to result in various educational reforms (Archer, 1984; Ramirez & Boli, 1987, in Baker, 2002, p. 393), and especially in more recent times "national crises focuses attention on the quality of educational outcomes such as achievement, graduation rate and transition to more education or the labour market" (Baker, 2002, p. 393). In an era of globalization, "[t]here is a widespread consensus … that contemporary … societies are in one sense or another ruled by knowledge and expertise." This knowledge is managed, monopolized, or shared throughout the industrialized OECD countries, but increasingly also in parts of the developing world (Evers & Gerke, 2003, pp. 3–4).

EVA KLEMENCIC

International comparative assessments grew out of the "consciousness of the lack of internationally valid standards of achievement with which individual countries could compare the performance of their students" (Kellaghan & Greaney, 2001, p. 63). In addition, they provide both a national and an international context (Robitaille, Beaton, & Plomp, 2000, p. 7). These studies are therefore significant in two different but linked ways: on the one hand, they create the opportunity for international comparison; on the other, they are connected with the evaluation of domestic education systems and require further work to improve their quality. It seems to be becoming more common to understand international achievement studies as part of the process of globalization. These studies raise new demands for national education policymaking (PM). As Protner said, such demands are usually not solely the product of experts' judgments but also concern the interests of capital, which within the notion of knowledge can be defined in accordance with the logic of self-reproduction and the demands of the labour market (Protner, 2004, p. 7). Data for the international UOE questionnaire (UNESCO/OECD/Eurostat) are also acquired through national and international education studies. It is clear that international comparative education studies (with comparable student and school achievement data and other contextual data – e.g., the socioeconomic backgrounds of students and their families, organizational information about the school, etc.) contribute a great share of the statistical indicators of the quantity and quality of education in each country and cross-nationally.
Data from European countries are also used for annual reporting in the publication of indicators for education systems ("Key Data on Education in Europe," prepared by Eurostat and Eurydice), and data from OECD countries are included in the publication "Education at a Glance" (Straus et al., 2006). My question, therefore, is: are international comparative studies a necessity, or are they destined to become one? Comparisons could also contribute to changing the curriculum and knowledge structure, a thesis I will try to develop in the following sections.

KNOWLEDGE GLOBALIZATION

There are many indications that national curriculums are increasingly adapting to the content on which the instruments used in international student assessments are based; when education-content planners change curricula, they refer to comparisons of data on education in their own countries and in other countries (Robitaille et al., 2000; Klemencic, 2010). Some international large-scale student assessments focus primarily on curricula, that is, on the "progress" or "implementation" of education (e.g., IEA studies). Are national curriculums becoming homogenized under the influence of such studies? This potential effect of international research could be interpreted as the globalization of knowledge. It clearly follows from the use of the term globalization in an educational context that the role of local knowledge in education is declining. This decline can be understood not only in the sense that the increase of global knowledge causes a decrease of local knowledge in general, but in the sense that globalization processes are affecting the educational sphere. In my view, this is driven by the globalization of educational tests, that is, international large-scale student assessments, which can compare only knowledge that is very broadly recognized as common knowledge and which is not specific to any territory. However, the use of the term globalization in this context is also distinctly ambiguous, as it permits at least two interpretations:

First interpretation: The globalization of knowledge can be understood as the propagation of knowledge created by science that is universal and objectively valid, that is, valid in all cultures and social contexts. The existence of this knowledge would justify the standardization of national school curriculums. Owing to their comparative nature, international large-scale student assessments can be an effective instrument for this standardization and this form of globalization of curricular content.

Second interpretation: The globalization of knowledge can also be viewed as a process in which knowledge originally linked to an individual society and culture prevails over knowledge created by other societies and cultures.
If we accept this thesis, the reduced role of local knowledge in local environments is primarily a consequence of the domination of expansive foreign cultural (and economic) models, which could also represent a danger to national education systems, particularly in terms of the loss of local knowledge. Much of today's debate on the national school curriculum follows a basic division that arose in epistemological theories: the division into realist and anti-realist understandings of knowledge (Moore, 2000). In other words, a division in epistemology in recent decades has been copied into theories of curricular planning:

(a) Realist theories of knowledge assume the existence of a reality that is independent of human comprehension activities, and see in knowledge an adequate and objective representation of this reality (Russell, 1979, in Justin, 2008, p. 178) – a representation that is not derived from any specific, socially and culturally bound comprehensive position; in other words, knowledge (reality) without perspective. These general theories of knowledge have provided curricular planners with arguments for adapting national curriculums to systems of knowledge with supposed universal validity.

(b) Anti-realist theories of knowledge, e.g., constructivist theories of knowledge (Berger & Luckmann, 1966; Hacking, 1999; Howcroft & Moore-Trauth, 2005; Kincheloe & Horn, 2008, in Klemencic, 2010)13 do not allow the possibility that knowledge is a neutral and objective representation of facts that exist independently of human comprehension activities; instead, they link comprehension to specific comprehension positions, institutions, symbolic systems and, ultimately, society, culture, power relations, etc. These theories have developed conceptual tools with which it is possible to investigate the possibility that the central systems of knowledge on which international large-scale student assessments are based in fact originate from individual societies and cultures and are linked to interests and power relations in the modern world. On this view, the reduction of the role of local knowledge in national curriculums is an effect of the global dominance of specific cultural models.

The basic difference between these two theories lies in the perception of reality: for realists, reality is universal; for social constructivists, reality is (socially, historically) constructed. For constructivists, there are no universal concepts; there are only concepts (which are neither true nor false) that are effective enough to describe reality. This, in turn, is transferable to knowledge as well. It therefore seems that for constructivists, global knowledge does not exist, whereas realist theories are founded on global ideas. We could interpret realist theories as being closer to global knowledge than anti-realist theories. This leads to the next thesis: international comparative studies are derived from realist views of knowledge. Furthermore, we are led to the conclusion that international comparative studies accelerate the process of knowledge globalization.

IS LOCAL KNOWLEDGE CONSEQUENTIALLY BEING REDUCED?

There are at least two ways of looking at the effect of globalization on education. The first asks "to what extent are education systems being 'shaped' by globalization" or, in prospective terms, what changes are likely to affect educational systems in the future as a result of globalization. The second asks "what kind of policy reforms should be adopted to address the consequences of globalization" (Hallak, 1998, pp. 8–9). In my view, international comparative education studies influence and are influenced by both perspectives. Given their exact research procedures and instrumentation (especially the selection of domain/subject areas and explanatory factors, the preparation of instruments for measuring achievement, item selection, scorer training, etc.), the logical consequence is that participating countries must largely reach common agreements about what a particular concept of knowledge is and how they will measure it. If we presuppose that education is culturally, economically, and socially determined, the same proposition is valid for knowledge too. This means that participating countries must find common agreement on what is significant for the majority of countries (which should be widely accepted knowledge) and what is significant only for their own society. This particular (local) knowledge cannot be measured in an international study, as the validity of the compared achievements would not be objective (local knowledge can be measured only in territorialized contexts, within obvious sociocultural contexts). By contrast, international education studies identify knowledge that can be measured in different sociocultural contexts; in fact, they suppose that those contexts have no impact on students' performance in international studies.14 Some forms of local (traditional, indigenous) knowledge are often expressed through legends, rituals, songs, etc., and are culturally oriented in particular. Moreover, although a legend may be shared with a wider community, because it includes beliefs, values, and fixed practices it can also distinguish a particular (smaller) community or society; the same meaning may not be shared between communities. Even related national and international assessments cannot simply be mapped onto local and global knowledge. The fact is that international student assessments must be equally understandable to students from different cultures, and they try to measure knowledge that is neither bound by spatial frontiers (e.g., the national state) nor specific to any group. Another view is that participation in international comparative studies also changes the curriculum at the national level in different educational systems, for which there is much evidence, especially after the TIMSS studies and the consequent national reports on observing/changing the curriculum. Therefore, I presuppose that international large-scale student assessments drive processes whose clear outcome is a reduction of local knowledge.

RESEARCH OBJECTIVES, METHODS, AND MODES OF INQUIRY

In this part, I try to illustrate the process of knowledge globalization (with regard to curriculum homogenization) through international comparative education studies in the case of Slovenia by presenting some key findings.

SAMPLE

As my objective was to present the impacts of international comparative studies (and international large-scale student assessments in particular) on national education PM in Slovenia, the sample included three groups of respondents:

(a) National research coordinators (NRC) from Slovenia: Barbara Japelj Pavesic15 (TIMSS and TIMSS Advanced), Mojca Straus16 (PISA), Marjan Simenc17 (CIVED/ICCS), Marjeta Doupona Horvat18 (PIRLS), and Barbara Brecko19 (SITES);

(b) Expert and advisory board (EAB): Janez Justin,20 president of the "Commission for the direction of international comparative education studies" in Slovenia (the advisory board for the Ministry of Education and Sport) and also president of the "Commission for the renovation of syllabuses for primary education";21

(c) Policymakers (PM): Andreja Barle Lakota,22 director of the Education Development Office, and Milan Zver,23 minister for education and sport (2004–2008).

The decision to choose these three groups of respondents was connected with the general objective of the research: to identify the concrete impacts of these studies on national PM in Slovenia (especially on changing syllabuses and other curricular documents). I therefore selected all the important respondents (within the mentioned groups) who could assess the impacts of the studies on national educational PM in Slovenia. All of the respondents have long been directly or indirectly connected with international comparative education studies (through their research, expert, or political experience).

INSTRUMENT

Because of the small number of respondents, and because some of them might need additional explanations, I chose to use interviews.

The interviews were conducted in two phases. In the first phase, I used structured interviews, sending the questions to respondents via e-mail in January and February 2010; I received all of the respondents' answers within a month. In the second phase, conducted in April 2010, I used semi-structured face-to-face interviews to explore the concrete impacts in detail. All participants were given all the questions, but within the topic of concrete impacts (research question four) the NRCs answered only about the study they were coordinating. The structured interviews focused on five major research questions, relating to:

1. the impact of international education studies on national PM in Slovenia (respondents' general perceptions of the impacts);
2. the reasons for participating (focused on particular studies and cycles);
3. media and expert interest (which of the studies attract, or potentially attract, major media and expert interest);
4. the concrete impacts of all data collections (regarding particular cycles, and impacts on the curriculum and/or some syllabuses);
5. additional uses of the collected data.

I selected these question topics to provide information about how important these studies are to policymakers and experts; in addition, I was interested in media interest, particularly in which of the studies seem to attract more media attention. I asked the same questions of all three groups of respondents to find out whether there was any variation among them regarding media, expert, and policymaker interests. In the semi-structured interviews, I asked respondents for additional information about the uses of international large-scale student assessment results as a tool for national progress, as well as about obstacles and limitations. I also asked for recommendations on ways in which the international data could be used in real-world PM situations, and the possible implications of international large-scale student assessments for Slovenia were discussed.
The semi-structured interviews lasted between 10 and 30 minutes.

RESULTS

The first phase of interviewing (results presented in Table 1) exposed the general meaning of international education studies and how they impact Slovenian educational PM. It identified the reasons why Slovenia is

Table 1. Impact of International Education Studies on Policymaking in Slovenia.

Groups of Respondents, by Major Research Questions:

NRC
- Impacts: depend on the study; impacts at different levels.
- Reasons for participating in the studies: evidence-based policy.
- Media, expert interests in results: large-scale student assessments (leading PISA, TIMSS).
- Additional use of data: more secondary analysis needed.

EAB
- Impacts: especially after the last curricular reform.
- Reasons for participating in the studies: evidence-based policy; comparisons with other education systems.
- Media, expert interests in results: TIMSS, PISA (PIRLS).
- Additional use of data: more secondary analysis needed (content-knowledge analysis).

PM
- Impacts: unsystematic at the beginning, today systematic / big (positive, direct)a.
- Reasons for participating in the studies: evidence-based policy; assured financing (EU funds) / comparability with other databases (EU, OECD), good praxes.
- Media, expert interests in results: TIMSS, PISA / TIMSS, PISA.
- Additional use of data: more secondary analysis needed / changing syllabuses, policymaking (in general).

a This part represents the response of our former minister for education and sport (M. Zver). I have distinguished between respondents' answers within the PM group because of their differences (e.g., perceptions of the reasons why Slovenia participates in these studies). Moreover, the PM group is composed of former and current policymakers; introducing their differing views is therefore necessary.

participating in these studies, exposed media and expert interest in particular studies, and highlighted that further additional use of the collected data is needed. In the second phase, I used the semi-structured interviews to identify the concrete impacts of particular cycles of international large-scale student assessments; these impacts are presented separately below. All groups of respondents gave similar answers. They see a great impact of international education studies on national PM in Slovenia, especially after the last curricular reform. All three groups were persuaded that the impacts are especially large for the large-scale student assessments24 and that TIMSS, which has the longest tradition in our society, and PISA, because it is conducted by the OECD, are dominant among them.25 These two studies generate great interest in the media and in expert discussion forums. Different expert groups emphasized the role of particular international assessment studies (with regard to the numeracy and literacy conceptualized by international assessments), because the results of different studies are dominant in different areas and expert groups (e.g., ICCS is probably more important for experts dealing with citizenship education, while TIMSS is more important for experts in mathematics). Continual participation enables the measurement of trends, which is particularly important between curricular reforms (for evaluating past reforms and for ideas about corrections in future reforms). Slovenia participates in these studies because of the need for evidence-based policy and up-to-date information; it has also had the financial opportunity to participate. The EU is also interested in these studies (it provides financial incentives for participation). It seems that Slovenia today has a more systematic approach to acquiring the financial means for participation (partly through EU financing).
It is also a suitable moment to discuss whether Slovenia will participate in all international education studies in the future (whether we can be exempted from some studies) and whether we will participate in all data-collection cycles (especially if the EU funds that enable participation are reduced in the future). All three groups see a necessary emphasis on secondary analysis as an additional tool for building on previously collected data (alongside data collected from national assessments and other data), which could address particular educational areas more precisely. In this view, international education studies point out basic ideas about research areas to be pursued within the national environment. The groups of respondents exposed the possible limitations of relying predominantly on internationally collected data – especially that international validation of collected data could impose changes on the education system too quickly, without appropriate time for reflection within specific national environments. I call this effect a reduction of local knowledge, which could be reflected particularly in changes to the curriculum and knowledge structure. None of the respondents, however, denied the advantages of participating in international comparative education studies, especially because Slovenia is (in comparison to other countries or education systems) a small country and therefore does not have enough experts in all areas, especially in complex educational (research) areas. International education studies also demonstrated statistical methods that increased confidence in measurements on large-scale samples, and they influenced the reporting methodology of national assessments through benchmarking. The NRCs responsible for international education studies are often appointed to commissions that evaluate different curricular documents and/or syllabuses.

CONCRETE IMPACTS OF INTERNATIONAL LARGE-SCALE STUDENT ASSESSMENTS

The concrete impacts of international education studies on national education PM in Slovenia have been seen especially with the international large-scale student assessments (TIMSS, CIVED/ICCS, PIRLS, and PISA), for which I identify direct and indirect impacts. Generally, the obvious impacts are easily identifiable in studies organized around a curriculum focus (e.g., TIMSS). However, it is very hard to identify indirect impacts (e.g., how and how often teachers in schools use exercise books with international items, how results influence university teaching and subject didactics, etc.).

TIMSS in Slovenia

Slovenia has participated in all TIMSS data collections up to now (with both target populations for TIMSS, and also in TIMSS Advanced), and it is participating in TIMSS 2011. TIMSS 1995 influenced the syllabus plans for nine-year schooling. The results of TIMSS 1999 showed that our teachers (in comparison to other education systems) were not as overloaded as they perceived themselves to be, so national discussions about that topic lessened. TIMSS 2003 showed a deficiency (weaknesses in content and expected knowledge) in the nine-year schooling syllabuses (compared to the old eight-year syllabuses) and caused immediate corrections of some syllabuses, which are now awaiting formal introduction (e.g., learning about fractions is now one grade lower). TIMSS 2003 also provided rich source material for teacher training: in science, after that data cycle, teacher training with a particular focus on experiment-based learning (in contrast to learning about facts) was expanded. TIMSS 2007 indicated regional differences in knowledge within Slovenia and influenced the Ministry of Education to support attempts to reduce those differences; regional differences in knowledge are still discussed in the media today. TIMSS Advanced was interesting in Slovenia because the same population was also tested in the General Matura exam; therefore, secondary analyses using both databases (national and international) are now under way. The inclusion of various TIMSS item types in textbooks and other didactic materials, and their similarity to national assessment items, can also be perceived. TIMSS national experts are also members of the national assessment groups, and the TIMSS NRC has published several mathematics textbooks for primary school.

PISA in Slovenia

Slovenia participated in PISA 2006 and 2009, and it is now participating in PISA 2012. Within the PISA framework, there is nothing that corresponds to, for example, the TIMSS curriculum focus. Moreover, PISA is not focused on a particular school or grade level; therefore, it is difficult to identify a direct impact on national PM (particularly in changing the curriculum and syllabuses) comparable to that of other studies. However, the indirect impact of PISA can be predicted on two levels: on everyday school work (when teachers use published items) and in the various expert bodies and commissions that include experts nationally connected with PISA. Different national experts (practitioners, theorists) who evaluate the PISA framework and assessments have the opportunity to propose items. Reviewing the composition of the experts working nationally on the PISA study highlights the fact that the same people are mostly members or presidents of the national exam commissions (this is true for the Matura exam as well as for the primary education national assessments), where they are included in the process of preparing items for national assessments. The PISA NRC is also a member of the expert group renewing the White Paper on Education in the Republic of Slovenia, as well as a member of the "Commission for the direction of international comparative education studies."

RLC/PIRLS in Slovenia

At first glance, RLC and PIRLS 2001 had a relatively great impact on national education PM. Some of the results of the two studies, together with secondary analyses, were among the foundations for the National Literacy Strategy (which still goes further than everyday policy requires). The PIRLS NRC was appointed to the commission that prepared that strategy (especially with regard to the care of underprivileged groups identified in some PIRLS data). National and international assessments in the area of reading literacy are not comparable, as comprehension is conceptualized differently.26 National assessments focus more on the technical aspects of reading (explicitly finding information without making inferences, spelling, grammar), while PIRLS, on the contrary, focuses more on content. Primary schools in Slovenia still do not systematically teach comprehension; therefore, progress in this skill from one international data collection to the next is small, despite the last curricular reform, after which all teachers are responsible for reading (but this remains mainly a cross-curricular theme and still seems not to be emphasized enough). Some researchers in Slovenia are still working on secondary analyses of the data collected in PIRLS 2006, but media attention is focused mainly on national regional differences in children's achievements.

CIVED/ICCS in Slovenia

It seems that CIVED 1999 had a relatively small impact on national PM. Although some of its results expressed very strong inclinations toward everything connected with national identity and patria, it was only in 2008 that we changed the name of the compulsory subject (adding "patriotic"), and so on; the impact of that study therefore seems to have been small. By contrast, it appears that ICCS 2009 may well have a relatively vigorous impact on national PM.27 By the end of 2010, a group of experts will be responsible for the evaluation of the new syllabus for the compulsory 7th and 8th grade elementary school subject "Citizenship and Patriotic Education and Ethics." The president of the evaluation commission is also the ICCS NRC, and the results of ICCS28 will be released a few months before the deadline for the evaluation of the subject syllabus. It is therefore possible to say that the subject evaluation will also take the international results into consideration.

SUMMARIZING AND ADDITIONALLY INTERPRETING THE RESULTS

From the first phase (the structured interviews), it is clear that all three groups of respondents emphasize the great impact of international comparative studies on Slovenian national PM in education, especially for evidence-based policy. However, they all agree that more secondary analysis is needed. This is, in fact, the first conclusion: in Slovenia, more socioculturally oriented analyses are needed. Such sociocultural orientation could be attained using national assessments or through additional analyses combining international and national assessment data (more precisely, parts of those assessments, because simple comparisons between national and international assessments are not possible). It seems that estimations of the impacts of international comparative studies (especially those emphasizing the impacts) are still more declarative statements than everyday reality, and that participation in these studies is seen as important because of placement in the international league tables. In the second phase (the semi-structured interviews with the NRCs), by contrast, I tried to identify some of the concrete direct impacts (e.g., from the TIMSS study) and indirect impacts. Since indirect impacts are often not measurable at all, estimations of the impacts could be misleading. From this phase, it is also clear that some results from, for example, TIMSS have led to curriculum changes (e.g., the mathematics syllabus), while other results, for example from PIRLS and the National Reading Strategy, are more important as declarative statements, which in Slovenia are compulsory but have still not been realized in everyday school reality. In addition, it seems that areas of study that are closer to global or universal knowledge (such as mathematics)29 are more subject to the homogenization of national curricular changes than areas that are, from the start, more culturally oriented.

Since civic education is perceived as less globalized knowledge from the start,30 the question arises as to why international comparisons are needed at all. Answering this would require a more precise view of the multidimensional processes of globalization. Slovenia is not isolated from other cultural, economic, and political influences (like other countries today), which influence our civic and citizenship education as well; they primarily influence our social reality, which is constructed at regional and global levels too. The organization of civic and citizenship education across the globe is also very heterogeneous (as a school subject at different school levels, as a cross-curricular theme, as a model, as a dimension, etc.). Recognizing this heterogeneous organization, as well as the cultural meaning of citizenship education, facilitates an understanding of the importance of international testing – namely, that results must be interpreted with sociocultural awareness. I see this as perhaps one of the reasons why, after CIVED 1999, Slovenia actively added a patriotic dimension to its citizenship education (although the international results did not indicate this as a problem at all). This was, however, mainly a politically oriented decision and was not founded on any international results. This could answer the question of why civic education, and the impacts of its international assessments on national PM, are not so identifiable. After the ICCS results, it will be interesting to identify citizenship education curricular changes made with regard to the international results.

The interview results therefore suggest that, of all the international comparative studies, TIMSS is perceived in Slovenia as the most important for national educational PM (in terms of evidence-based policy). However, that last statement could be misleading: TIMSS in Slovenia really did cause syllabus changes, but the study is focused on the intended, implemented, and attained curriculum; therefore, the impacts of TIMSS on national educational PM are easier to estimate.
The real difficulty in estimating the impacts of international comparative studies is that the other, indirect impacts that influence Slovenian national educational PM are hard to identify.

INSTEAD OF A CONCLUSION

International comparative studies, and international large-scale student assessments in particular, have an impact on different levels of policy: global and cross-national policy, national policy, and school-level policy. They conceptualize different forms of literacy that have had a great impact on all three levels (individual student, school, and education system).


Two international organizations, the OECD and the IEA, conduct international comparative research projects on the knowledge conveyed to new generations by educational institutions. These projects involve a large majority of developed countries, as well as many middle- and low-income countries. Slovenia has been involved in all data-collection procedures for such research for the past two decades. Moreover, it seems that we truly personify the parable of the "man with two watches" (Schagen & Hutchison, 2007), and it seems that in the future we will need still more "watches" (but will they be more nationally or internationally oriented?). Data from international comparative research are increasingly used both in defining quality indicators for (national) education systems and in the growing media and expert attention paid to the globalization of education and knowledge and to the influence of comparative research on knowledge. The aim of these studies is not only to prepare league tables of countries' student achievement; crucially, they also describe the different activities performed in education systems and their connections with students' achievement. They can therefore be used in forming educational policy and practice, interpreted in their national and international contexts (both of which require precise secondary analysis). In researching global and cross-national policy in the field of knowledge demands, research has emphasized particular knowledge within the conceptualized literacies (mathematical, science, reading, and citizenship literacy).
The impact on national PM appears to be researched more in terms of changes to the curriculum and syllabuses (in particular the structure of knowledge and the structure of syllabuses), with greater consideration paid to the different factors in education systems that cannot be captured by the international education assessments alone. The impact of these studies on national educational PM must be researched in connection with national assessments and with much of the rest of the data collected nationally. Researchers and educational policymakers in Slovenia now express that view, so in the future Slovenia may skip some of the international comparative education studies or may not participate in collecting international data in every cycle. From the completed international large-scale student assessments, I identified different impacts on Slovenian national PM: some are directly and others indirectly connected to changes in the curriculum and some syllabuses, and both kinds of impact are changing school-level policy as well. It seems, particularly in media and some expert discussions, that regional differences, achievement gaps related to socioeconomic status (SES), gender differences, and the like will receive increased attention. To answer these questions, we need more secondary analysis, with greater consideration of the national factors that influence education. Today, Slovenia has a more systematic approach to financing these studies than it did 10 years ago; in the future, the focus will likely be on a more systematic approach to the use of the data.

NOTES

1. Reducing the significance of local knowledge is not an absolute supposition; I am only trying to show that changing the curriculum in response to achievement in international comparative studies could reduce the significance of some parts of local knowledge.
2. International results and reports are available for each data-collection cycle, with additional information (and brief reports) for each study. They are also accessible online: www.iea.nl, www.pisa.oecd.org. I do not reference them here because my focus is on the impacts of international large-scale student assessments on national policy-making in general. Data about the impacts on Slovenian educational policy-making were collected in interviews, but the basic international data on which my interpretation of the interviews rests are also accessible from international reports and databases.
3. Both concepts are conceptualized in Falk (1999; in Singh, Kenway, & Apple, 2007). However, "globalization from below" mainly focuses on the cultural perspective.
4. I use the term local to cover both indigenous and traditional knowledge.
5. Research on school effectiveness investigates the extent to which schools fulfill their aims of efficiency (Kelly, 2008, p. 517), that is, how far they minimize the effects of students' socio-economic starting points and, in that regard, how effective a school or educational system is. Comparative education studies thereby also provide comparative data about school factors affecting outcomes, but one of their limitations should be noted: because school effectiveness research focuses on the measurable, it tends to ignore the "difficult-to-measure," for example, the societal culture that actually makes international comparisons difficult (Kelly, 2008, p. 518).
6. An analysis of OECD and EU country participation in international comparative education studies over the past 50 years clearly showed the trends in participation for both groups of countries. The analysis covered 26 past international comparative education studies, although there are some differences within sub-clusters of the OECD and EU clusters. For instance, the difference in participation between the "old" EU countries and the "new" EU countries (those that joined the EU after 2004) is statistically significant (χ² = 46.18; p = 0.00).
7. The authors note that 35% of countries now conduct international tests; many more countries around the world, however, conduct some form of national assessment (Kamens & McNeely, 2010, p. 19).


8. More information about the National Examination Centre is available at http://www.ric.si.
9. Nine-year primary education in Slovenia is divided into three three-year compulsory cycles. Today, national assessment is compulsory only for pupils in Year 9, and it can be used as a criterion for the selection of pupils (future students) in cases of limited enrolment in secondary school. Pupils take tests in Slovene, Mathematics, and either a modern foreign language or another optional subject.
10. National and international assessments of reading literacy are not completely comparable (if the Slovene tests are used as a measure of reading literacy), because reading comprehension is conceptualized differently.
11. Although other dimensions of globalization (cultural, social, etc.) cannot be neglected, my assumption is that economic globalization is primary within the overall context of globalization, and educational (or knowledge) globalization is just one sub-context of globalization.
12. Because economic globalization is a process, it is not a final result, as, for example, global pollution is.
13. The main ideas connected with constructivist theories of knowledge have long traditions (from Socrates and his idea that three types of learning exist in cognition and, more recently, from Vygotsky and his important idea about the social and historical roots of the origins of knowledge).
14. Although they collect background data (e.g., school and family context), their position is very close to realistic theories of knowledge.
15. B. Japelj Pavesic (personal communication, January 26, 2010; February 1, 2010; April 13, 2010).
16. M. Straus (personal communication, January 29, 2010; April 12, 2010).
17. M. Simenc (personal communication, February 2, 2010).
18. M. Doupona Horvat (personal communication, February 20, 2010; April 13, 2010).
19. B. Brecko (personal communication, April 4, 2010).
20. J. Justin (personal communication, February 21, 2010; April 13, 2010).
21. Slovenia is now in the process of reducing its syllabuses by 20% (following recent statements by the Minister for Education and Sport).
22. A. Barle Lakota (personal communication, April 13, 2010) – semi-structured interview.
23. M. Zver (personal communication, April 15, 2010) – structured interview.
24. In comparison with other international education studies (SITES, TALIS).
25. The statement does not suggest that only these two studies have reliable international coordinating centers. In my opinion, all the studies mentioned have reliable international coordinating centers, but media attention is primarily focused on the OECD (because of the importance of this organization; Slovenia tried for many years to become an OECD member, though we could also discuss PISA's importance with regard to neoliberal globalization and the role of the OECD in it). My interpretation of the media interest in TIMSS results in Slovenia is that TIMSS (given its connections with the curriculum) was especially important because of our last curricular reform. In addition, the particular interest in Mathematics and Science could also be connected with the historical roots of the globalization of education (e.g., mass schooling and the reasons for establishing it in Europe).


26. The Slovene tests are used in part as a measure of literacy.
27. That relatively vigorous impact on national PM, as I see it, does not create any problems in the relationship between local and global knowledge. As civic and citizenship education is perceived as primarily connected with local knowledge [although, in my opinion, even Mathematics and Science are socially constructed and do not represent knowledge in a (radical) realist view], it will be interesting to see how experts interpret the international results within the national domain.
28. A brief (partial) report was released in June.
29. Within the area of global knowledge, agreement on meaning results from a longer history, in contrast to the construction of local knowledge (such as civic education), which will always be in some sense socioculturally determined. Even so, my view is that no knowledge could ever really be global, because I do not accept global society or global culture as a fact.
30. Within citizenship education it is not possible to interpret, for example, human rights without their cultural and historical context.

REFERENCES

Antweiler, C. (1998). Local knowledge and local knowing. Anthropos, 93, 469–494.
Archer, M. S. (1984). The social origins of educational systems. London: Sage.
Baker, D. P. (2002). International competition and education crises (cross-national studies of school outcomes). In: L. D. Levinson, W. P. Cookson & R. A. Sadovnik (Eds), Education and sociology – an encyclopaedia (pp. 393–397). London, New York: Routledge Falmer.
Berger, P., & Luckman, T. (1966). The social construction of reality. Garden City, NY: Doubleday.
Bottani, N., & Tuijnman, A. (1994). International education indicators: Framework, development and interpretation. In: OECD, Making education count: Developing and using international indicators (pp. 21–35). Paris: OECD.
Evers, D. H., & Gerke, S. (2003). Local and global knowledge: Social science research on Southeast Asia. Paper read at the international conference, Kuching, September.
Falk, R. (1999). Predatory globalization: A critique. Cambridge, UK: Polity Press.
Giddens, A. (2007). Sociologija. Zagreb: Nakladni zavod Globus.
Greaney, V., & Kellaghan, T. (1996). Monitoring the learning outcomes. Washington, DC: The World Bank.
Hacking, I. (1999). The social construction of what? Cambridge, MA: Harvard University Press.
Hallak, J. (1998). Education and globalization. Paris: Unesco.
Hambleton, R. K., & Sireci, S. G. (1997). Future directions for norm-referenced and criterion-referenced achievement testing. International Journal of Educational Research, 27, 379–393.
Howcroft, D., & Moore-Trauth, E. (Eds). (2005). Handbook of critical information systems research: Theory and application. Cheltenham: Edward Elgar Publishing.


Husén, T., & Tuijnman, A. (1994). Monitoring standards in education: Why and how it came about. In: A. C. Tuijnman & T. N. Postlethwaite (Eds), Monitoring the standards of education: Papers in honor of John P. Keeves (pp. 1–21). Oxford: Pergamon Press.
Jenicek, V. (2006). Globalisation and knowledge economy. Agricultural Economics, 52(1), 1–6.
Justin, J. (2008). Taksonomije in znanje. Ekosistemi – povezanost živih sistemov. Mednarodni posvet biološka znanost in družba. Ljubljana: Zavod RS za šolstvo, str. 170–182.
Kamens, H. D., & McNeely, L. C. (2010). Globalization and the growth of international educational testing and national assessment. Comparative Education Review, 54(1), 5–26.
Kellaghan, T., & Greaney, V. (2001). Using assessment to improve the quality of education. Paris: Unesco.
Kelly, A. (2008). School effectiveness. In: G. McCulloch & D. Crook (Eds), The Routledge international encyclopedia of education (pp. 517–518). London and New York: Routledge.
Kincheloe, J. L., & Horn, R. A. (2008). The Praeger handbook of education and psychology (Vol. 1). Westport, CT: Greenwood Publishing Group.
Klemencic, E. (2010). Curricular reforms, international large-scale student assessments, local/global knowledge. Unpublished.
Klemencic, E., & Rozman, M. (2009). Knowledge globalization through international studies and assessments. In: International conference on social sciences and humanities: The progressive impact of research in social sciences and humanities: Towards the regeneration of knowledge (pp. 54–69). Malaysia: Faculty of Social Sciences and Humanities.
Moore, R. (2000). For knowledge: Tradition, progressivism and progress in education – reconstructing the curriculum debate. Cambridge Journal of Education, 30(1), 17–36.
Protner, E. (2004). Vpliv mednarodnih primerjav znanja na šolski kurikulum. In: E. Protner (Ed.), Sodobna pedagogika, 55(121), 6–10.
Ramirez, F., & Boli, J. (1987). The political construction of mass schooling: European origins and worldwide institutionalization. Sociology of Education, 60(1), 2–17.
Robitaille, F. D., Beaton, E. A., & Plomp, T. (Eds). (2000). The impact of TIMSS on the teaching & learning of mathematics & science. Vancouver, Canada: Pacific Educational Press.
Russell, B. (1979). Filozofija logičnega atomizma. Ljubljana: Cankarjeva založba (Zbirka Nobelovci, 54).
Sahlberg, P. (2007). Education policies for raising student learning: The Finnish approach. Journal of Education Policy, 22(2), 147–171.
Schagen, I., & Hutchison, D. (2007). Comparisons between PISA and TIMSS – we could be the man with two watches. Education Journal, 101, 34–35.
Schmidt, H. W., McKnight, C. C., Houang, T. R., Wai, H., Wiley, E. D., Cogan, S. L., & Wolfe, G. R. (2001). Why schools matter: A cross-national comparison of curriculum and learning. San Francisco: Jossey-Bass.
Scholte, A. J. (2005). Globalization: A critical introduction. Hampshire and New York: Palgrave Macmillan.
Sillitoe, P. (1998). What know natives? Local knowledge in development. Social Anthropology, 6(2), 203–220.
Singh, M., Kenway, J., & Apple, W. M. (2007). Globalizing education: Perspectives from above and below. In: W. M. Apple, J. Kenway & M. Singh (Eds), Globalizing education: Policies, pedagogies, & politics (pp. 1–30). New York: Peter Lang Publishing.
Straus, M. (2004). Mednarodne primerjave kot podlaga za oblikovanje strategije razvoja izobraževalnega sistema. Sodobna Pedagogika, 55(5), 12–27.


Straus, M., Klemencic, E., Brecko, B., Cucek, M., & Gril, A. (2006). Metodološka priprava mednarodno primerljivih kazalnikov spremljanja razvoja vzgoje in izobraževanja v Sloveniji. Raziskovalno poročilo. Ljubljana: Pedagoški inštitut.
Takayama, K. (2008). The politics of international league tables: PISA in Japan's achievement crisis debate. Comparative Education, 44(4), 387–407.
The National Examination Centre. (2006). The National Examination Centre. Retrieved April 14, 2010, from http://www.ric.si
White Paper on Education in the Republic of Slovenia. (1996). Ljubljana: Ministry of Education and Sport.

FINLAND, PISA, AND THE IMPLICATIONS OF INTERNATIONAL ACHIEVEMENT STUDIES ON EDUCATION POLICY

Jennifer H. Chung

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 267–294
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013013

ABSTRACT

Finland's performance in PISA has created considerable interest in the country's education system, to ascertain what has made Finland so successful in the survey. In reference to this phenomenon, this chapter discusses cross-national attraction, policy borrowing, the effect of Finland in PISA, and its influence on education policy. The chapter explores at length the theoretical background of cross-national attraction and policy borrowing, also investigating cases that have already occurred. It discusses Finland's role as the new object of cross-national attraction and eventual policy borrowing. The chapter incorporates research into the reasons for Finland's success in PISA and the possibilities of policy transfer from Finland, and delves into the likelihood of policy implications as a result of Finland in PISA. This cross-national attraction denotes the first stage in policy borrowing; however, comparative educationalists have for years warned about the uncritical transfer of education policy. Research in Finland has revealed that many reasons for the country's PISA success stem from contextual factors: those related to historical, cultural, societal, and


political features of Finland. Therefore, policy borrowing from Finland needs to heed the warnings of past comparativists. The new phenomenon of Finland in PISA has generated much curiosity among those in education, educational policy, and politics. Policymakers are keen to incorporate Finland's educational features into their own education systems, and PISA and Finland's performance in the survey influence educational policy. This illustrates the importance of the warnings of past and present comparative educationalists in preventing uncritical policy borrowing.

INTRODUCTION

International achievement studies, while not a new phenomenon, have gathered increased attention since the advent of the Program for International Student Assessment (PISA). Although international surveys such as the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS) existed previously, PISA's new approach to education assessment attracted much worldwide interest. Finland, traditionally not an avid participant in TIMSS or PIRLS, which are administered by the International Association for the Evaluation of Educational Achievement (IEA), has attracted much attention due to its performance in PISA. PISA, which essentially ranks the participating countries in terms of their performance, has seen Finland as one of the consistent top scorers. Finland's top performance in all administrations of PISA thus far, and in all assessed literacy areas, has given the country new status as a global leader in education. The equity and consistency of Finnish results across the PISA surveys, coupled with its high performance, make the country even more alluring to those seeking educational models. In other words, Finland's performance in PISA has created an educational frenzy, manifest in considerable attraction to the Finnish education system. Finland's PISA outcomes have already taken a conspicuous position in examples of potential policy borrowing, as the country provides a good educational example from which others can learn. However, comparative educationalists have consistently warned against the uncritical transfer of education policy, and Finland's success in PISA necessitates discussion of the potential of educational policy borrowing. Comparative education has, for centuries, cautioned against the indiscriminate transfer of education policy. PISA's rankings have fueled cross-national attraction, the desire for policy borrowing, and the increased political effect of international student assessments.


This chapter discusses PISA, the criticisms of PISA, policy borrowing theory, cases of cross-national attraction and policy borrowing, the interest in Finland’s education system, and the possibilities of borrowing from the country.

PISA

PISA, administered by the Organization for Economic Cooperation and Development (OECD), surveys 15-year-olds, focusing on mathematics, science, and reading literacy. The OECD's definition of "literacy" involves not what students learn from the curriculum, but how they apply these subjects in real-life situations. PISA, which began in 2000, assesses students from around the world every three years; the OECD initiated the survey in response to member countries' interest in comparative student performance (Riley & Torrance, 2003, p. 420). Through PISA, the OECD makes good educational practice visible to the rest of the world. However, the OECD "acts non-coercively," meaning that countries, whether involved in PISA or not, can set their own levels of reaction and response to the survey (Gruber, 2006, p. 198). The assessment, administered every three years, tests students at age 15, nearing the end of many countries' compulsory education, on the skills they have acquired for life in the knowledge economy (OECD, 2004, p. 4). The OECD points out that the survey "does not produce prescriptions for education systems, but makes observations designed to help policy makers think about the effect of certain system features" (OECD, p. 18). According to the OECD, three general themes emerged from the PISA data. First, autonomous education systems performed better than centralized ones. Second, education systems that monitored and assessed their performance had better results than those that did not undertake periodic assessments. Last, countries that provide support to low-performing students had overall higher academic achievement than those that did not (OECD, 2004, p. 19). The first and third themes describe quite closely aspects of the Finnish education system. PISA effectively changes the role of the OECD and its relationship with member countries (Gruber, 2006, pp. 197–198).
It provides the OECD countries and participating countries with significant educational benchmarks. The OECD’s strong reputation for measuring economic indicators presumably has allowed for the high visibility of PISA and its general acceptance as a measure of educational standards. Drawing on the


educational research potential of its member countries, the OECD has created a policy-driven survey, with ‘‘hard’’ empirical data, arranged in a league table format that can embarrass low-performing countries or praise high-performing ones (ibid., pp. 198–199).

CRITICISMS OF PISA

Many criticisms of the OECD's PISA surveys exist. Prais and Adams, for example, famously engaged in a "dialogue" in the Oxford Review of Education about criticisms of PISA, such as its methodology, sampling, and the types of questions asked, in addition to the value of PISA, especially to education ministries (Prais, 2003; Adams, 2003; Prais, 2004). Schagen and Hutchison compare and contrast TIMSS and PISA. They admit that the two international surveys give conflicting impressions of educational systems and imply that having two surveys, even with different goals, can create confusion among those interpreting the results. They equate having two surveys to a man having two watches:

A man with one watch always knows what time it is; a man with two watches is never quite sure. (Anonymous, in Schagen & Hutchison, 2007, p. 34)

They criticize the causal connections made by those analyzing the questionnaire data gathered by these surveys, and the way many use the background data to reach conclusions about education systems. They believe that "unmeasured 'third factors' can actually be the root cause of both measures" (ibid., p. 35). Goldstein also mentions that "observed differences will undoubtedly reflect differences between educational systems but they will also reflect social and other differences that cannot fully be accounted for" (Goldstein, 2004b, p. 321). Sahlberg agrees: the high scores of nations such as Japan and South Korea in PISA may reflect the strong culture of private tutoring outside of school, and not the education system alone. A country like Finland, however, which does not participate in high-stakes testing and does not have a culture of private tutoring, also performs well in PISA (Sahlberg, 2007, p. 163). Cultural bias and the influences of society, therefore, are difficult to avoid in international achievement studies. Riley and Torrance criticize the politicization of education as a result of international surveys such as TIMSS and PISA. They draw attention to the fact that the OECD actually has no direct responsibility for education, but has had a great impact on education and subsequent policy, an influence that could be either positive or negative (Riley & Torrance, 2003, p. 420).


The superficial scores generated by surveys such as PISA, "blunt instruments," often carry too much weight and have become part of a "new education currency" (ibid., pp. 420–421). They cite the example of Finland in their analysis of the positive and negative outcomes of surveys such as TIMSS and PISA. The many educational observers of Finland, drawn by its success in PISA, argue that PISA could have a good result if it improves, for example, teacher education levels or the teaching of reading skills in other countries. This "PISA tourism," so to speak, could have a negative impact if "politicians seek simplistic solutions to the education challenges which their own countries face and seek off-the-shelf solutions which are highly context specific" (ibid., p. 421). Riley and Torrance worry that education policymakers do not pay attention to anything beyond the scores produced by surveys such as PISA. They believe that international surveys such as TIMSS or PISA have the "potential to create understanding and identify where the strengths and weaknesses of education systems lie," since they generate a fair amount of background data and revealing details about the education systems (Riley & Torrance, 2003, p. 422). However, the allure of the survey results often overshadows these helpful elements. They ask whether "the 'findings' actually say anything very meaningful about the state of education in different countries, and if so, do the league table presentation[s] of results do more harm than good?" (ibid.). The political impact of PISA, though less directly related to the formulation of the tests, nonetheless remains a source of criticism. Riley and Torrance feel that the rankings, the cause of political and media sensation, "are clearly designed to attract attention while the caveats which are included in the reports are routinely ignored" (Riley & Torrance, 2003, p. 423).
They also feel that many observers of these surveys do not take into account the statistical significance of the results. For example, England, ranked 7th in PISA 2000, could equally have been 3rd or 9th, as the scores of the countries in that range did not differ to a statistically significant degree (Riley & Torrance, 2003, p. 423). They worry that countries will construct new education policies as a direct result of their outcomes in PISA or TIMSS. Riley and Torrance, for example, would disapprove of the German reaction to PISA: the supposed PISA-Schock, discussed later in this chapter, prompted changes in the German education system and its curricula due to poor PISA outcomes. The Germans, who had previously considered their education system exemplary for the rest of the world, found the negative PISA outcomes devastating for the education system, politicians, and the nation (Ertl, 2006, pp. 619–620).


Riley and Torrance also have concerns over the narrowing of education. Surveys such as PISA and TIMSS "create and reinforce a climate that views education as narrow skill preparation for future employment, rather than as a challenging engagement with the knowledge and understanding that constitutes our culture and the democratic processes which future citizens must control" (ibid., p. 424). These surveys have changed the field of comparative education into "a political tool for creating educational policy or a mode of governance," rather than remaining in the research realm of intellectual inquiry. The publicity and effects of the OECD-led PISA assessment on political debate were a perfect example of this: "It is symptomatic of the problem that scholarly discussion has been most vivid in so-called 'hero and villain' countries" (Simola, 2005, p. 456). Riley and Torrance imply that international surveys may change the future of education: "What physicists realized some time ago, but educational testing people seem averse to acknowledging, is that when you measure something you change it" (Riley & Torrance, 2003, p. 424). However, supporters of PISA, while acknowledging the criticisms, point out that the political criticisms, as well as criticisms of the use of PISA data, fall beyond the jurisdiction of PISA's creators (Sahlberg, 2007, p. 163). Surveys such as PISA should come with a caveat, summed up well by Goldstein:

Finally, any such survey should be viewed primarily not as a vehicle for ranking countries, even along many dimensions, but rather as a way of exploring country differences in terms of cultures, curricula and school organization… Such studies should be treated as opportunities for gaining fundamental knowledge about differences, not as competitions to see who comes top. (Goldstein, 2004b, p. 329)

Despite Goldstein’s assertions, PISA and other international surveys of educational achievement are and will continue to fall prey to the pride and shame of countries from doing well or poorly on the assessment. The political and policy ramifications of the league tables created by these surveys will always remain a criticism of international student assessments. These international comparisons will most likely remain a fixture in the future of educational policy and research. The surveys ‘‘serve as a basis for creating a rich empirical database that has continuing significance for crossnational research in its attempt to understand the potential reasons behind observed differences between and within countries’’ (Lie & Linnakyla¨, 2004, p. 228).


GERMAN PISA SHOCK: AN EXAMPLE OF PISA'S IMPACT ON NATIONAL EDUCATION POLICY

The German reaction to PISA illustrates the impact of the survey and the political ramifications of international achievement studies. Germany's response to its PISA outcomes has become a classic example of "negative external evaluation," discussed later in this chapter. This PISA-Schock, which occurred after Germany scored below the OECD average in the 2000 survey, has had much the same impact as Sputnik or A Nation at Risk (Ertl, 2006, pp. 619–621). In fact, Gruber believes the PISA-Schock, following the 2000 results, eclipses the impact of A Nation at Risk (Gruber, 2006, p. 195). He cites how the Bundestag, or German Parliament, held special "PISA sessions" to discuss PISA-triggered educational worries, and how PISA marks different educational eras with BP (Before PISA) and AP (After PISA), much as BC and AD denote a significant change in temporal measurement and history (ibid.). The media reaction to PISA, especially after the release of the 2000 scores, illustrated "how calmly Finland dealt with its champion status" and "how deep the German PISA-Schock went" (Gruber, 2006, p. 196). The German performance in PISA 2000 contradicted "the German and international expectations of a nation with a high standard of living and a school system which had enjoyed a high international reputation since the nineteenth century" (ibid.). The PISA results revealed the disparities in performance across PISA literacy levels and the varied educational achievement across Germany's Länder (ibid.). In addition, Germans found that socio-economic background, immigration, and the tripartite system were strong factors affecting educational success (Ertl, 2006, p. 623). In fact, Germany had the highest rate of social inequality in education, even more than the United States (Gruber, 2006, p. 203).
On a more positive note, the PISA-Schock prompted the federal government and the Länder, previously at a standoff, to agree to educational reforms and national standards (Ertl, 2006, pp. 622–624). The great ‘‘taboo’’ subject of German education, the Gesamtschule, or comprehensive school, nevertheless remains untouched (Gruber, 2006, pp. 205–206). Despite great disparities in educational attainment, socio-economic disparity, and the low educational attainment of immigrants, the tripartite system remains intact (ibid.). The PISA-Schock did, however, push German curricula toward a more practical focus. The formerly respected German system admitted to its ossification and began looking outwards, generating


comparative research, in order to improve. The Germans look especially to Finland and Sweden as successful models of good education (Ertl, 2006, pp. 627–629). PISA triggered a ‘‘mass pilgrimage’’ to Finland by German politicians and educational authorities who viewed Finland as the educational promised land (Gruber, 2006, p. 203). These reforms, however positive, do not address the socio-economic problems and consequent educational inequalities in the country, focusing instead on raising future PISA outcomes (Ertl, 2006, p. 630). PISA, as Ertl has described, generates as powerful a negative external evaluation as Sputnik or A Nation at Risk, especially for Germany (2006, pp. 619–621). This negative evaluation has initiated cross-national attraction, explained later in this chapter, specifically toward PISA's top performer, Finland.

POLICY BORROWING

Comparative education often investigates educational systems outside the ‘‘home’’ context in order to improve the ‘‘home’’ system. Copying, emulating, or simulating another country's educational practice is generally known as ‘‘borrowing’’ within the comparative education discipline; this often occurs when successful education policy abroad is seen as potentially beneficial for the ‘‘home’’ system (Phillips & Schweisfurth, 2006, p. 17). Looking elsewhere is a common response to the aspiration to improve, and countries often look to others for good examples in various realms. Morris describes this cross-national attraction:

• Country A is an economic basket case (high levels of unemployment and low levels of economic growth) – this is portrayed as largely the result of an educational system that is not producing workers with appropriate skills.
• Country B is economically successful (low levels of unemployment and high levels of economic growth) – this is to a large degree the result of its possessing a well-educated workforce.
• Therefore, if Country A adopts some of the features of the educational system of Country B, it will improve the state of Country A's economy (Morris, quoted in Bray, 2004, p. 12, in Ochs & Phillips, 2004, p. 7).

Although rather one-dimensional, Morris's example clearly illustrates the nature and interest of comparative study and the allure of policy borrowing.


Phillips and Schweisfurth acknowledge the difficult and complex nature of policy transfer (2006, p. 17). The process seems to take three simple steps: first, the identification of good practice; second, the introduction of the policy into the home country; and third, assimilation into the home context. This apparent simplicity, however, conceals problems born of the process's complexity (ibid., p. 18). Sadler said:

In studying foreign systems of Education we should not forget that the things outside the schools matter even more than the things inside the schools, and govern and interpret the things inside. We cannot wander at pleasure among the educational systems of the world, like a child strolling through a garden, and pick off a flower from one bush and some leaves from another, and expect that if we stick what we have gathered into the soil at home, we shall have a living plant. A national system of Education is a living thing, the outcome of forgotten struggles and difficulties, and ‘of battles long ago.’ It has in it some of the secret working of national life. (Sadler, in Higginson, 1979, p. 49)

Self-improvement, whether at the micro or macro level, instinctively involves looking elsewhere for strong examples. While countries have long ‘‘borrowed’’ from each other in science, technology, and agriculture, Sadler's warnings imply a much more complex process of borrowing in education. The word ‘‘context,’’ in this situation of policy borrowing, needs definition. As a key point for models of policy borrowing, context may include ‘‘philosophical, historical, cultural, religious, social, ‘national character,’ political, economic, demographic, geographical, linguistic, administrative, and technological’’ features (Ochs & Phillips, 2002b, p. 16). These factors also have a complex relationship with each other, often influencing one another or acting in conjunction. Therefore, context must remain a factor when discussing cross-national attraction (ibid., p. 33). Hence, the lending and borrowing of educational policy depend on the context of both the ‘‘target’’ and ‘‘home’’ countries. The notion of context, especially in terms of policy borrowing, is important in the field of comparative education and for potential policy transfer. Phillips and Ochs have also created a model, or more specifically a cycle, of policy borrowing, which consists of four stages:

1. Cross-national attraction.
2. Decision.
3. Implementation.
4. Internalization/indigenization (Phillips & Ochs, 2004, p. 779).

The cross-national attraction stage begins with impulses that spawn this attraction, such as internal dissatisfaction, political imperatives, or ‘‘negative


external evaluation.’’ ‘‘Negative external evaluation’’ often comes from international education surveys such as the OECD's PISA (ibid., p. 778). According to the typology engineered by Ochs and Phillips, the impetus for cross-national attraction can happen at any time (2002a, p. 330). With this typology of cross-national attraction, they hope to generate research involving ‘‘both the processes and the context of ‘cross-national attraction’ in education systems, which the researcher can use in thinking about the discrete elements of educational policy, their inter-relationship, and necessary conditions for policy transfer’’ (ibid., p. 328). The second phase of policy borrowing, decision, has four types of decision-making:

(a) Theoretical.
(b) Realistic or practical.
(c) ‘‘Quick fix.’’
(d) ‘‘Phony’’ (Phillips & Ochs, 2004, p. 780).

Theoretical decision-making occurs when governments adopt policies so abstract that they cannot easily be implemented effectively within the education system. Realistic or practical decision-making isolates measures already proven successful in another country or education system, free of the constraints of contextual factors. ‘‘Quick fix’’ borrowing occurs in times of ‘‘immediate political necessity,’’ for example, after the fall of Communism in Eastern Europe in 1989 (ibid.). The ‘‘phony’’ type of decision-making refers to politicians' interest in external education systems for immediate political impact (ibid.). Although the implementation stage does not require much explanation, the internalization/indigenization phase necessitates clarification. Phillips and Ochs also refer to this stage as the ‘‘domestication’’ of education policy (2004, p. 780). The borrowed policy becomes internalized in four steps:

(a) The impact on the existing system, where educational policymakers juxtapose their goals with the current structure of their education system.
(b) The absorption of external features, where close examination of the extent of absorption of external features becomes necessary.
(c) Synthesis, the step where the borrowed policy becomes part of the context of the borrower country's education system.
(d) Evaluation, where policymakers review whether the education system has successfully implemented the policy (ibid., p. 781).


This process, not a linear but rather a cyclical one, implies that policy borrowing occurs not as a one-off process but as a continuum of cross-national attraction. Once properly indigenized, the cross-national attraction may begin again, sparking a new cycle of policy borrowing (ibid.). This educational interest seemingly never ends: it continues as countries, politics, society, and education systems grow and evolve. In addition to this process of policy borrowing, Phillips and Ochs also discuss the filters involved in the policy borrowing process, which distort and alter the original educational policy. The ‘‘borrowed policy’’ goes through various stages before the policy becomes properly ‘‘lent.’’ These filters or lenses distort the original policy in four phases:

1. Interpretation.
2. Transmission.
3. Reception.
4. Implementation (Ochs & Phillips, 2004, p. 16).

The first stage, interpretation, acknowledges how different actors in education construe educational occurrences. The context of these actors, in addition to their experiences and strengths, plays a role in their interpretation of educational practices (Ochs & Phillips, 2004, p. 16). Transmission, the second phase, marks the point at which the actors have finished interpreting an educational practice and then ‘‘filter the policy through the lens of their own agendas and expectations’’ (ibid., p. 17). The third stage, reception, occurs when the educational policymakers, having interpreted the original practice and filtered it through their own perspectives, pass it through yet another lens: here, individuals and institutions examine the practice to see whether it will function for their purposes (ibid.). Finally, implementation also acts as a filter. Although the practitioners, through their own examination in stage 3, have already altered the original practice for their purposes, the concrete process of applying an educational practice further distorts the original ‘‘image’’ (ibid.). In the end, the borrowing country can have a very different educational practice from the one originally borrowed. Steiner-Khamsi and Quist describe educational borrowing research in which the first stage focuses on the politics of policy borrowing. The second, called ‘‘externalization,’’ which refers to models outside of the ‘‘home’’ system, gives an ‘‘interpretive framework’’ for policy borrowing analysis (2000, p. 276). They argue that education has a natural tendency to attract criticism and opposition; therefore, it feels the ‘‘pressure to


continuously re-establish credibility and legitimization by referring to ‘authorities’ inside and outside the educational system’’ (ibid., p. 277). Therefore, when educational policymakers lack political support, they look to examples from other countries to regain legitimacy in the ‘‘home’’ context (ibid.). Even if the policies are seen as unfavorable in their original context, as discussed later in this chapter, the decontextualization and deterritorialization of the borrowed policy, followed by recontextualization and indigenization, allow for acceptance in the ‘‘borrowing’’ nation (ibid., pp. 276–277). This discussion of policy borrowing brings us back to another point made by Sadler, and to Bray's starting point for educational policy borrowing. Sadler said, ‘‘The practical value of studying ... the working of foreign systems of education is that it will result in our being better fitted to study and to understand our own’’ (Sadler, in Higginson, 1979, p. 50). Bray stated that one country can improve on its own shortcomings by adopting the virtues of another. An education system can adopt policy from another model, but only through the cyclical phases, the distorting lenses of policy borrowing methodology, decontextualization, and indigenization.

CASES OF POLICY BORROWING

The phenomenon of Finland in PISA, still quite recent, has not yet yielded study of the full implementation of the Finnish model in other systems of education. In fact, ‘‘there is a lack of empirical studies on contemporary cross-national policy networks. Comparative policy studies lag dramatically behind domestic educational policy research that for years has investigated the interaction between foundations, policy think tanks and policy scholarship’’ (Steiner-Khamsi, 2006, p. 667). Unless documented, the trajectory of policy borrowing is difficult to monitor. However, some documented cases of past policy borrowing help illustrate the process of, and difficulties with, policy transfer. Steiner-Khamsi and Quist cite the case of Achimota as an early instance of educational policy borrowing, in which a model of education for blacks in the segregated American South was adapted into a model for educating Africans in the British colonies (2000, pp. 272–273). Achimota College, now Achimota School, in present-day Ghana, was established in 1927 as an elite institution to provide for the educational and industrial needs of the area (ibid., p. 278). The basis of Achimota College came from the


Hampton-Tuskegee model in the United States, and during the 1920s the Phelps-Stokes Fund, a philanthropic society promoting education for blacks and Native Americans, recommended that the model be transferred to colonial Africa (ibid., pp. 272–273). The idea of ‘‘adapted education,’’ modified education for indigenous or minority peoples, provides a good historical context from which to study educational policy borrowing (Steiner-Khamsi & Quist, 2000, p. 274). Although two schools initiated by missionaries, Mfantsipim School and Adisadel College, already existed in the area, Achimota College ‘‘was intended as a showcase of what education in the Gold Coast and the rest of colonial Africa should become’’ (ibid., p. 278). The borrowed Hampton-Tuskegee model of adapted education illustrates compromise in educational policy borrowing. For example, ‘‘The borrowed model was recontextualized, locally modified, and indigenized’’ in order to reconcile supporters of elite schooling modeled on British ‘‘public’’ schools with the original Hampton-Tuskegee model, which emphasized manual labor and agriculture (ibid., p. 279). Achimota College received much criticism from the elite for promoting a rural, agricultural lifestyle, for preparing students ‘‘for a life of servitude to the colonial master and for confinement to tribal life,’’ and for ‘‘revitalizing meaningless tribal practices that resonated with the colonizer's fantasies about the idyll of savage life’’ (ibid., p. 280). In response, Achimota College became more academically oriented, much like Mfantsipim School, and replaced the agricultural element with courses in African art, languages, and history (ibid., pp. 280–281). Interestingly, the Hampton-Tuskegee model was borrowed when it was losing favor in America for discouraging the development of leaders and academics from the African-American community (Steiner-Khamsi & Quist, 2000, p. 289).
In fact, ‘‘evidence suggests that Achimota's adapted education was, in fact, old wine repackaged in new bottles’’ (ibid.). The impetus to borrow policy from an African-American context, however, rests on these facts:

• The Hampton-Tuskegee model was created and implemented by African Americans.
• The Hampton-Tuskegee model operated successfully in a segregated society.
• The United States had no direct colonial power in Africa.
• Proponents of adapted and industrial education, both black and white, advocated its benefits for the economy and for the African or African American (Steiner-Khamsi & Quist, 2000, p. 294).


Sadler, who in 1900 made his famous warning about policy borrowing, actually recommended the transfer of the Hampton-Tuskegee model to Africa, which no doubt gave academic credibility to the borrowing of the model (ibid., p. 296). Achimota College, in turn, evolved into a prestigious center for educational research focusing on agriculture and on African languages and cultures (ibid., pp. 295–296). Ochs documented a successful implementation and internalization of certain foreign practices in a London school system, a case grounded in the Phillips and Ochs model of the four stages of educational borrowing (Ochs, 2006, p. 600). The London borough of Barking and Dagenham successfully reformed its schools to implement education practices from Switzerland and Germany by passing the original policies through the filters within the policy borrowing process. The original practice went through different lenses, or filters, that distorted and converted it into the context of the London borough (ibid., p. 605). In the end, the borough did successfully implement the foreign practices, but by carefully following five goals:

1. A strong commitment to improving the school system.
2. Strong key partnerships to provide support in the process.
3. Awareness of the challenges at hand when implementing a foreign system into one's own.
4. Recognizing that the process would require continuous commitment and repetition.
5. Considering the contexts of both countries throughout the policy borrowing stages (ibid., p. 616).

Ochs's study illustrates a case where a policy is sensitively, and successfully, borrowed from one system and implemented in another. Crossley and Vulliamy (1984) describe policy implementation in the cases of Papua New Guinea and Tanzania. The implementation of policy in the two countries ‘‘differed as a result of distinct social, economic, and institutional contexts in which the similar policies ... took place’’ (ibid., p. 201).
These two cases illustrate, ‘‘on the one hand, marked divergences in the manner in which similar policies have been implemented but, on the other hand, an underlying similarity in the way in which deep-seated sociological constraints have operated’’ (ibid., p. 202). The cases of policy implementation in Papua New Guinea and Tanzania echo the sentiments of Phillips, Schweisfurth, and Ochs. Therefore, the ‘‘indigenization’’ of a similar education policy took different routes in Papua New Guinea and Tanzania.


CASES OF CROSS-NATIONAL ATTRACTION

Finland has already become a classic example of cross-national attraction, the first stage of policy borrowing. However, this attraction to Finnish education, although quite compelling, is not the first time a country has become the focus of educational attention. Many other cases of cross-national attraction exist, despite the fact that ‘‘educationists in ‘target’ countries often react with skepticism to the outside interest expressed in their home systems’’ (Phillips, 1989, p. 271). Halpin and Troyna uncover an interesting issue in policy borrowing: ‘‘the borrowing of educational policies from other educational systems that, in the original context, were seen as failures, ineffective, or at least highly contested’’ (Steiner-Khamsi, 2000, p. 276; Halpin & Troyna, 1995, p. 304). For example, Japan originally borrowed Western models before its education system became the educational envy of many countries in the 1980s. Cummings attributes this to the publication of A Nation at Risk in 1983, the Japanese economic boom, and US economic decline around the same time (1989, pp. 293–294). An American report attributed Japan's success to the ‘‘superior quality of [the Japanese] labour force, and especially the work ethic and intellectual capabilities of the average participant’’ (ibid., p. 294). The ‘‘negative external evaluation’’ of this situation created an impetus to visit Japan, establishing an ‘‘educational pilgrimage’’ of Americans to Japan (ibid.). These pilgrimages by American educational scholars identified salient cultural characteristics of the Japanese system, such as the ‘‘education mother’’ and high competition (together with the apparent downside of elevated suicide rates among high school students) (ibid., p. 297). They also identified the strengths of the system, for example, integrated science and mathematics and a sequential curriculum (ibid., p. 299).
Ichikawa responds to Cummings and demystifies the characteristics of Japanese education as perceived by the American researchers (1989, pp. 304–307). The juxtaposition of their assertions illustrates the different perceptions of an education system from the ‘‘home’’ country and the observing country. Phillips cites a Japanese periodical at the time that found the interest baffling, since the Japanese looked to other countries for models of creativity in schools (1989, p. 271). Furthermore, both authors discuss the difficulty of a Japan-to-US ‘‘borrowing’’ situation. Cummings states, ‘‘Despite the rising American interest in Japanese education we have yet to see a significant impact on the way Americans solve their educational problems’’ (1989, p. 301). Ichikawa comments on his statement: ‘‘I agree


that the United States will encounter difficulties in borrowing ideas and practices from Japan without modification’’ (1989, p. 304). Although American interest in Japan could have enabled educational improvement, Cummings, at the time, did not see any. Ichikawa acknowledges the difficulty of borrowing Japanese practice, especially without modification and indigenization into the home context. England and Germany form another classic example of cross-national attraction. Gruber and Pollard, however, take a different perspective on this matter and account for the attraction, or lack thereof, toward British primary schools from their continental admirers, namely Germany and Austria. Gruber praises the philosophy of the child-centered primary school and the autonomy behind the school administration (1989, p. 363). He wonders why continental European countries do not look toward British primary schools as good models of early childhood education (ibid.), and cites three reasons why they have overlooked the example of Britain:

1. Governments in German-speaking countries overlooked primary school reform because their energies were devoted elsewhere.
2. The admirable autonomy of the British schools does not translate well into the tradition of ‘‘standardization and uniformity’’ of school culture in Germany and Austria. The school-to-school variations in British primary schools ‘‘are appreciated with difficulty.’’
3. German-speaking countries tend to concentrate their educational energies on theoretical research, while British educationists analyze real-school processes (1989, pp. 363–364).

Gruber asserts that the ‘‘silent revolution’’ of British primary education has remained the object of experts' admiration (ibid., p. 364). Pollard, in response, feels the German and Austrian perspective on primary schools represents a degree of ‘‘idealistic romanticism’’ (1989, p. 365).
Pollard feels that the school autonomy that should, according to Gruber, allure continental admirers also has downsides: large variations in school quality and limits on the central government's ability to intervene in failing schools (ibid.). As a result, Pollard supported the Education Reform Act of 1988, which implemented a National Curriculum in Britain. Gruber's three assertions about the overlooked British example come under scrutiny from Pollard. He agrees with the first point but feels that the second does not ring true; he believes that continental European teachers are also encouraged to assert their individuality (ibid., p. 366). On the third point, Pollard agrees with Gruber, as he ‘‘note[s] with great sadness’’ that the ‘‘abstract and detached work which Gruber portrays was left


behind some time ago by most educationalists in Britain’’ (ibid., p. 366). The analysis of these two articles raises two interesting points. First, it illustrates how cross-national attraction can go both ways: Phillips accounts for English attraction to Germany (Phillips, 2000a, pp. 49–62, 2000b, pp. 297–307), while Gruber describes attraction from Germany and Austria to England. Second, it raises the issue of the cycles of cross-national attraction. The articles, published in 1989, praise a British system characterized by autonomy and decentralization; however, the reforms of 1988 created a more centralized system. Pollard presciently writes, ‘‘Perhaps in 2007 ... Karl Heinz Gruber will be led to wonder why we too failed to appreciate some of the finest qualities and achievements of British primary education’’ (ibid., p. 367). In other words, almost 20 years after publication, does the same admiration of British primary schooling still ring true, or does Britain regret the educational reforms of the late 1980s? These examples of cross-national attraction provide a benchmark for the exploration of Finland as the ‘‘new’’ target for policy borrowing. Pollard astutely wondered whether, in the future, the same attraction from Germany and Austria to British primary schools would exist (1989, p. 367). Clearly, however, Finland has taken over the position of educational admiration because of PISA. As long as Finland maintains its level of performance in PISA, its education system will continue to attract other countries seeking the reasons behind PISA success and improvement of the ‘‘home’’ system.

FINLAND, PISA, CROSS-NATIONAL ATTRACTION, AND POLICY BORROWING

The attraction to Finland because of PISA, now a clearly established phenomenon, raises the question of whether policy can be borrowed from the Finnish context. The Finns perceive this interest in their education system with some bemusement:

The outstanding success of Finnish students in PISA has been a great joy but at the same time a somewhat puzzling experience to all those responsible for and making decisions


about education in Finland. At a single stroke, PISA has transformed our conceptions of the quality of the work done at our comprehensive school and of the foundations it has laid for Finland's future civilisation and development of knowledge. (Välijärvi, Linnakylä, Kupari, Reinikainen, & Arffman, 2002, p. 3; Välijärvi et al., 2007, p. 3)

Finland traditionally looked toward other countries for educational examples, first Germany, then later Sweden (Välijärvi et al., 2002, p. 3). In fact, a Finnish adage says, ‘‘In reforming school, Finland makes exactly the same mistakes as Sweden. Only it happens ten years later’’ (ibid.). The new international attention, a change from looking elsewhere for educational examples, has made Finland the educational example for other countries. Before PISA, little interest surrounded the Finnish education system. In the Second International Mathematics Study (SIMS), Finland ranked only average among the eighteen participating countries (Sahlberg, 2007, p. 160). In the 1999 repeat of the Third International Mathematics and Science Study (TIMSS-R), 38 countries participated, and Finland ranked only slightly above average (ibid., p. 161). However, owing to its performance in PISA, Finland's education system has become a popular travel destination for educational policymakers, teachers, researchers, and the like, who come to observe how Finland created a high-performing education system while maintaining its commitment to the Welfare State (ibid.). The educational achievements of Finland, especially considered alongside its financial struggles of the 1990s, are worthy of praise. ‘‘The overall social and economic progress has often been judged as indicating that a relatively small, peripheral nation can transform its economy and education system into a showcase knowledge society only if policies are right and if sufficient hard work supports the intended visions’’ (ibid.). International surveys such as PISA have become one of the biggest reasons for educational change in recent years, triggering an ‘‘educational pilgrimage,’’ as seen in the case of Finland (ibid., p. 163). In fact, ‘‘several countries changed the direction of their education reforms by borrowing education policies and practice from well-performing nations’’ (ibid.).
Finland, due to its top outcomes in PISA, has become a ‘‘reference society,’’ an educational example worthy of attention and emulation (Steiner-Khamsi, 2006, p. 666; Schriewer & Martinez, 2004, p. 34). Finland in PISA has created a ‘‘policy window’’ where the likelihood of policy borrowing is high (Steiner-Khamsi, 2006, p. 670). This window of policy opportunity allows countries interested in the Finnish model to justify educational policy reform within a time period where such borrowing is acceptable.


Comparativists have addressed the issue of policy borrowing for centuries. Educational ‘‘borrowing’’ is ‘‘the most obvious consequence of learning from and understanding what is happening ‘elsewhere’ in education’’ (Phillips, 2000b, p. 299). In the early nineteenth century, Jullien generated a series of questions in order to identify ‘‘good educational practice’’ and aid in its ‘‘transfer to other systems’’ (Phillips, 1989, p. 267). Jullien sparked an educational interest in other countries that continues to this day. Continuing along this vein, Sadler also expressed his aforementioned opinions on the matter (in Higginson, 1979, p. 50). Halls takes it a step further, stating, ‘‘The grafting of features of another educational system into a different cultural context, like transplants of the heart, is a difficult and sometimes unsuccessful operation’’ (1970, p. 163). Noah states:

Cross cultural study of education, then, can identify the potentials and the limits of international borrowing and adaptation ... my impression is that international borrowing of educational ideas and practices has more failures to record than success. Transplantation is a difficult art, and those who wish to benefit from the experience of other nations will find in comparative studies a most useful set of cautions, as well as some modest encouragement. (ibid., p. 556)

This potential for abuse calls for careful consideration if foreign practices are to be implemented successfully:

The authentic use of comparative study resides not in wholesale appropriation and propagation of foreign practices but in careful analysis of the conditions under which certain foreign practices deliver desirable results, followed by consideration of ways to adapt those practices to conditions found at home. (ibid., pp. 558–559)

Therefore, with ‘‘responsible scholarship,’’ ‘‘knowledge of what is being proposed and tried in cognate situations abroad is indispensable for reasoned judgement about what we need to do at home’’ (ibid., pp. 552, 559). In fact, when attributing success in education, many have the tendency to credit individuals, ‘‘their psychologies and pedagogies, rather than ... phenomena characterized as social, cultural, institutional or historical’’ (Simola, 2005, p. 455). ‘‘Schooling is not confined to pedagogy, didactics or subject matter ... it also, even mainly, incorporates social, cultural, institutional and historical issues’’ (ibid., pp. 456–467). Simola believes ‘‘a comparative study in education purporting to be something more than a mode of educational governance should be a historical journey’’ (ibid., p. 457). His beliefs imply that PISA and other international achievement studies do not take enough social, cultural, institutional, and historical issues into account.

286

JENNIFER H. CHUNG

Can countries borrow from Finland and achieve the same degree of success? Comparativists have debated similar dilemmas of policy borrowing over the years. Since the time of Jullien, the "attitudes to the feasibility of educational policy borrowing have ranged from scornful dismissal to enthusiastic advocacy" (Phillips, 2006, p. 551). Many have used a foreign model both to exemplify a successful education system and to warn against change. The same country can also serve as a positive or a negative illustration of an education system, as in the case of Germany (ibid., pp. 551–552). Phillips raises a central question: "Can country x solve its educational problems by adopting policy or practice deemed to be successful in country y? And if so, how is such policy or practice transferred and implemented?" (ibid., p. 553). To study education systems cross-nationally, we must employ two different methods, "one to study the society, and one to study the education system itself" (Ochs & Phillips, 2002a, p. 327).

Careless policy borrowing can produce damaging outcomes. Phillips states that "it is only through analysis and understanding of the roots that feed educational systems that we can arrive at a proper understanding of why things are as they are and avoid the pitfalls of too great a concentration on description and measurement of perceived outcomes" (Phillips, 1989, p. 269). We must hope that this new interest in Finland will spark multi-disciplinary study of the country and proper, careful implementation of successful Finnish policies, adapted and "indigenized" for the borrower.

REASONS BEHIND FINLAND'S PISA SUCCESS

Investigation into the reasons behind Finland's success in PISA illustrates how closely intertwined a country's education system is with its historical, cultural, and societal context. The research of Chung (2009) found that Finnish success in PISA came down to the following salient factors:

• Finland's teachers hold an enviable position in Finnish society and strongly influence Finland's top outcomes in PISA. Entrance to university teacher training programs has a 10 percent acceptance rate, and all Finnish teachers hold master's degrees.

• Finland's history also plays a major role in its PISA outcomes. The country's past as a part of both Sweden and Russia intertwined education with the movement for independence.

Finland, PISA, and the Implications for Education Policy


• The recession of 1991–1993 produced an unemployment rate of 20 percent in Finland. The recovery from this recession reinforced the importance of education in the Finnish psyche, as well as the relationship between education and economics.

• Finland's commitment to the welfare state and to egalitarian values has ingrained an ethos of equality in the education system. This has manifested itself in a comprehensive school system, equality of provision, consistency of quality throughout the country, and a highly developed system of support for weaker students.

• All of these factors share one underlying thread: the Finnish concept of sisu. Although it has no direct translation into English, it can be rendered as tenacity in the Finnish mentality. This strong will and internal strength in the face of adversity provide an overarching explanation for Finnish success in PISA.

BORROWING FROM FINLAND

The findings of Chung (2009) illustrate the importance of context in understanding education systems, and hence the need for sensitivity in policy borrowing. What can be borrowed from Finland? Certainly the historical context cannot be borrowed, but what about the teacher training? Research has shown that uncritical transfer of Finnish education has already taken place. For example, a teacher of Finnish described a group of visitors to her school from Japan: "A group came from Japan and liked our textbooks so much that they are going to translate them into Japanese! It's funny." (Chung, 2009, pp. 278–279). Furthermore, a Finnish professor of education, although he admits he does not know the details, describes how China has borrowed aspects of the Finnish approach for a school in Beijing:

Chinese authorities are trying to replicate a Finnish gymnasium somewhere in Beijing or somewhere else. They have translated the material from several of our schools, the core curriculum, and some textbooks used by those example schools. There has been some agreement to train Chinese teachers, several hundred of them … If it would be a success, no evidence would be found. Nobody would know why. (ibid., p. 332)

While much can be learned from the Finnish education system, simplistic international transfer of policy from one cultural context to another could be both problematic and unwise. Another professor of education in Finland describes how "PISA tourists" come and focus on micro factors that may not make any difference to


the overall Finnish system, much less transfer into another context. He feels that people do not concentrate enough on the backgrounds and cultures influencing PISA scores. The PISA tourists come and focus on small things, like school lunches or the children taking off their shoes in school. He says:

The focus might be on the small things, some instructional things, things that are not easy to be changed in any system. The school type, for example, those cannot be changed easily. A similar educational system cannot be built, for example, in the States, if the bases are not the same as over here. (Chung, 2009, p. 334)

He also mentions:

Of course we are very proud to present our system and so on. It is our adaptation of one single instructional activity, and expecting that to work similarly in another country as in Finland is nonsense. It is a stupid thing. (ibid.)

One OECD official responsible for PISA believes some things can transfer into another system, but not others. For example, something like highly qualified teachers, which Finland boasts as part of its education system, can be "policy malleable," but he says that the Finnish language cannot transfer into another country. According to this official, aspects of an education system such as comprehensive schooling or educational funding can transfer from one system into another. However, he does account for cultural differences. He states:

I think even within countries, the differences between regions are strong … I can't imagine countries will adopt wholesale without taking their own culture into account, their own history, because it is difficult to change things quickly like that. (Chung, 2009, pp. 372–373)

Another OECD official does not believe in directly copying an entire system, but rather in identifying the drivers of educational success and transporting them to another system. He says, "You can't transfer the context, but you can transfer the ingredients of success" (Chung, 2009, p. 374). Although research shows that uncritical policy transfer from Finland has already taken place, many aspects of the Finnish education system do not translate into another system and, therefore, will not yield the same results. With careful, sensitive borrowing of "policy malleable" features, perhaps some countries can "transfer the ingredients of success." One Finnish professor of education notes that many perceive Finnish education as the best in the world, but admits, "We are best in that test" (Chung, 2009, p. 330).


INFLUENCE ON EDUCATION POLICY

Among other things, comparative education can aid successful education policy (Noah, 1984, p. 551). However, the impact of PISA illustrates the possible politicization of international achievement studies. Goldstein asserts that "numerical learning targets can be dysfunctional" and that "any rise in test scores should not be confused with a rise in learning achievement as opposed to test-taking performance" (2004a, p. 10). These numerical targets, even at the international level, have led to "highly dysfunctional consequences" (ibid.). He worries about the "distorting effects that 'high stakes' target setting can lead to, by encouraging individuals to adapt their behaviour in order to maximize perceived awards; viewed as a rational response to external pressures" (ibid., p. 8). Goldstein warns of the possible politicization of high-stakes testing, a warning applicable to the influence PISA can have on education, education policy, and politics:

When learning outcomes are made the focus of targets, those who are affected will change their behaviour so as to maximize their 'results,' even where this is dysfunctional in educational terms. At the international level it would not be surprising if we witnessed similar kinds of behaviour where the curriculum and educational infrastructures were manipulated to maximize performance on the international performance measures, whatever the deleterious side effects that this might produce. (Goldstein, 2004a, p. 11)

Goldstein recommends that policy should not concentrate on "devising specific targets" but on designing the "delivery, curriculum design, pedagogy, financial incentives, etc. that work best within each country. Each educational system can develop different criteria for assessing quality, enrolment, etc. … instead of monitoring progress toward an essentially artificial set of targets" (2004a, p. 13). Goldstein writes in reference to Education for All, but his arguments apply across all realms of education policy.

Grek et al. discuss the notion of a "European Education Policy Space" and its possible relation to international education comparisons (2009, p. 5). This growth in international education data could mark a shift from national policy to European policy (ibid.). "Europeanisation can provide a vehicle for the transmission of global agendas into the national arena" (ibid., p. 6). They argue that Europeanization has turned the concept of European education from "a rather idealistic project of cultural cohesion to a much sharper competitive reality" (ibid., p. 7). Furthermore, "policy actors focus on ensuring successful outcomes, on producing a 'world-best' education through the production and use of data: successful competition is the new


language of high quality of standards … policy actors interpret their brokering as a fusion of European and global influences that places pressure on systems to demonstrate success in terms of measurable outcomes" (ibid.). Grek et al. also write about cross-national studies such as PISA and their impact on European and global education policy. The OECD, through PISA, has established itself as an agency that develops "educational indicators and comparative educational performance measures" (ibid., p. 8). Furthermore, "EU data collection then is intersected by OECD work, which in turn may contribute to possible emergence of a global education policy field" (ibid.). Global forces, therefore, increasingly influence education worldwide. Since the 1980s, education programmes and guidelines at the European level have indicated a European "language" of education (Grek et al., 2009, p. 9). PISA has added to this by influencing policy debates and educational agendas, as well as by marking out "good" and "bad" education systems in its "league tables," leading to a possible trend of policy convergence (ibid.). The deluge of international education data in recent years has intensified inter-country competition:

Comparison is now cross-border; it is both an abstract form of competition and an element of it; it is a proxy for other forms of rivalry. Comparison is highly visible as a tool of governing at all levels – at the level of the organisation (to manage); of the state (to govern); indeed comparison events or 'political spectacles' (such as PISA) may be used because of their visibility. (Grek et al., 2009, p. 10)

Europeanization, whether understood in the traditional terms of European social cohesion or in the newly formed terms of international competition, has felt the influence of the OECD and PISA (Grek et al., 2009, p. 18). These large international studies of achievement, and the league tables they generate, are also becoming powerful influences on education policy worldwide. Policymakers are very much seduced by the findings of these studies. King illustrates how multi-lateral agencies prescribe development goals for countries in the "south" while not taking into account each individual country's needs and specific contexts:

In an age when it has become mandatory for donors to stress the importance of the country ownership of their own education agendas, it would indeed be paradoxical to discover that the allegedly global education agenda was perceived by many analysts in the south to have been principally developed by multilateral agencies in the north. (2007, p. 378)

Although policy documents such as the Jomtien Framework and the policy document released by the Development Assistance Committee of the OECD


"stress the need for a highly context-dependent approach," the commitment to context becomes buried under quantitative goals (King, 2007, p. 382). For example, in the Jomtien Declaration and Framework for Action, the "core drafting personnel were drawn from multi-lateral agencies" (ibid., p. 381). This outcome could serve as a forewarning for education policy formation in the wake of PISA. The example of Finland could lead to more uncritical transfer of its education policy, and King's illustrations of the power of multilateral agencies could easily apply to the influence of PISA on future education policy.

Comparative education, then, informs various fields within the education discipline (Crossley & Watson, 2009, p. 643). Finland's role in PISA could significantly contribute to "globally influential policy agendas" (ibid.). Crossley and Watson argue "that issues such as these could be best approached by researchers with deep, multi-disciplinary grounding in comparative perspectives, methodologies and sensitivities" (ibid.). Comparativists, they contend, have the tools with which to approach these dilemmas in education, including the problematic nature of international student assessment and the politicization of education policy. These warnings highlight the importance of a better understanding of the nature, role, and impact of international studies of assessment, of more critical analyses of the results of such studies, and of the implications of the success of countries such as Finland. Words of caution also come from educationalists and other academics about the policy and political implications of international assessments such as PISA, the role of international league tables, and the role of multi-lateral agencies such as the OECD.

CONCLUSION

The advent of PISA in 2000 has changed the face of education. PISA and other international student assessments provide more than a ranking of countries' performance. Finland's consistently top performance in PISA has drawn immense attention to its education system, and interested parties seek ways to "import" successful practice in order to improve education at home. However, comparative education has long warned against "wholesale" transfer of policy. Borrowed education policy needs to be properly "indigenized" into the home context, and filtration through contextual "lenses" transforms the original policy as it enters the borrowing system.


Although the borrowed policy may not resemble the original, it has been properly "indigenized" into the borrowing country's context. Although the OECD insists it does not make prescriptions for education systems, aiming instead to inform policymakers, the league tables PISA generates can cause embarrassment, as in the case of Germany, or accolades, as in the case of Finland. PISA, therefore, has already begun influencing education policy. The case of Germany and its PISA-Schock illustrates the power of an international achievement study over education policy. Despite the warnings of comparative educationalists over the past century, politicians and policymakers remain seduced by the quantitative goals these tests generate and by the race to come out on top. Although many academics urge policymakers to remember the importance of context, policy is shifting from the local to the national to the global level. The case of Finland in PISA and the allure of "quick fix" policy borrowing only reinforce the influence of international student assessments on education policy.

REFERENCES

Adams, R. J. (2003). Response to 'Cautions on OECD's recent educational survey (PISA)'. Oxford Review of Education, 29(3), 377–389.
Chung, J. (2009). An investigation of reasons for Finland's success in PISA. Doctoral dissertation, Oxford University, Oxford, United Kingdom.
Crossley, M., & Vulliamy, G. (1984). Case study research methods and comparative education. Comparative Education, 20(2), 193–207.
Crossley, M., & Watson, K. (2009). Comparative and international education: Policy transfer, context sensitivity and professional development. Oxford Review of Education, 35(5), 633–649.
Cummings, W. K. (1989). The American perception of Japanese education. Comparative Education, 25(3), 293–302.
Ertl, H. (2006). Educational standards and the changing discourse on education: The reception and consequences of the PISA study in Germany. Oxford Review of Education, 32(5), 619–634.
Goldstein, H. (2004a). Education for all: The globalization of learning targets. Comparative Education, 40(1), 7–14.
Goldstein, H. (2004b). International comparisons of student attainment: Some issues arising from the PISA study. Assessment in Education, 11(3), 319–330.
Grek, S., Lawn, M., Lingard, B., Ozga, J., Rinne, R., Segerholm, C., & Simola, H. (2009). National policy brokering and the construction of the European Education Space in England, Sweden, Finland and Scotland. Comparative Education, 45(1), 5–21.
Gruber, K. H. (1989). Note of failure to appreciate British primary education in Germany and Austria. Comparative Education, 25(3), 363–364.
Gruber, K. H. (2006). The impact of PISA on the German education system. In: H. Ertl (Ed.), Cross-national attraction in education: Accounts from England and Germany (pp. 195–208). Oxford: Symposium Books.

Finland, PISA, and the Implications for Education Policy

293

Halls, W. D. (1970). Present difficulties in educational reform: Some points of comparison. In: C. Führ (Ed.), Educational reform in the Federal Republic of Germany. Paris: UNESCO.
Halpin, D., & Troyna, B. (1995). The politics of education policy borrowing. Comparative Education, 31(3), 303–310.
Higginson, J. H. (1979). Selections from Michael Sadler. Liverpool: Dejall & Meyorre.
Ichikawa, S. (1989). Japanese education in American eyes: A response to William K. Cummings. Comparative Education, 25(3), 303–307.
King, K. (2007). Multilateral agencies in the construction of the global agenda on education. Comparative Education, 43(3), 377–392.
Lie, S., & Linnakylä, P. (2004). Nordic PISA 2000 in a sociocultural perspective. Scandinavian Journal of Educational Research, 48(3), 227–230.
Noah, H. J. (1984). The use and abuse of comparative education. Comparative Education Review, 28(4), 550–562.
Ochs, K. (2006). Cross-national policy borrowing and educational innovation: Improving achievement in the London Borough of Barking and Dagenham. Oxford Review of Education, 32(5), 599–618.
Ochs, K., & Phillips, D. (2002a). Comparative studies and 'cross-national attraction' in education: A typology for the analysis of English interest in educational policy and provision in Germany. Educational Studies, 28(4), 325–339.
Ochs, K., & Phillips, D. (2002b). Towards a structural typology of cross-national attraction in education. Educa, pp. 1–43.
Ochs, K., & Phillips, D. (2004). Processes of educational borrowing in a historical context. In: D. Phillips & K. Ochs (Eds), Educational policy borrowing: Historical perspectives (pp. 7–23). Oxford: Symposium Books.
OECD. (2004). Messages from PISA 2000. Paris: OECD.
Phillips, D. (1989). Neither a borrower nor a lender be? The problems of cross-national attraction in education. Comparative Education, 25(3), 267–274.
Phillips, D. (2000a). Beyond travellers' tales: Some nineteenth-century British commentators on education in Germany. Oxford Review of Education, 26(1), 49–62.
Phillips, D. (2000b). Learning from elsewhere in education: Some perennial problems revisited with reference to British interest in Germany. Comparative Education, 36(3), 297–307.
Phillips, D. (2006). Investigating policy attraction in education. Oxford Review of Education, 32(5), 551–559.
Phillips, D., & Ochs, K. (2004). Researching policy borrowing: Some methodological challenges in comparative education. British Educational Research Journal, 30(6), 773–784.
Phillips, D., & Schweisfurth, M. (2006). Comparative and international education: An introduction to theory, method and practice. Trowbridge, Wiltshire: The Cromwell Press.
Pollard, A. (1989). British primary education: Response to Karl Heinz Gruber. Comparative Education, 25(3), 365–367.
Prais, S. J. (2003). Cautions on OECD's recent educational survey (PISA). Oxford Review of Education, 29(2), 139–163.
Prais, S. J. (2004). Cautions on OECD's recent educational survey (PISA): Rejoinder to OECD's response. Oxford Review of Education, 30(4), 569–573.
Riley, K., & Torrance, H. (2003). Big change question: As national policy-makers seek to find solutions to national education issues, do international comparisons such as TIMSS and PISA create a wider understanding, or do they serve to promote the orthodoxies of international agencies? Journal of Educational Change, 4(4), 419–425.


Sahlberg, P. (2007). Education policies for raising student learning: The Finnish approach. Journal of Education Policy, 22(2), 147–171.
Schagen, I., & Hutchinson, D. (2007). Comparisons between PISA and TIMSS – we could be the man with two watches. Education Journal, 101, 34–35.
Schriewer, J., & Martinez, C. (2004). Constructions of internationality in education. In: G. Steiner-Khamsi (Ed.), The global politics of educational borrowing and lending (pp. 29–53). New York: Teachers College Press.
Simola, H. (2005). The Finnish miracle of PISA: Historical and sociological remarks on teaching and teacher education. Comparative Education, 41(4), 455–470.
Steiner-Khamsi, G. (2006). The economics of policy borrowing and lending: A study of late adopters. Oxford Review of Education, 32(5), 665–678.
Steiner-Khamsi, G., & Quist, H. O. (2000). The politics of educational borrowing: Reopening the case of Achimota in British Ghana. Comparative Education Review, 44(3), 272–299.
Välijärvi, J., Linnakylä, P., Kupari, P., Reinikainen, P., & Arffman, I. (2002). The Finnish success in PISA – and some reasons behind it: PISA 2000. Jyväskylä: University of Jyväskylä, Institute for Educational Research.
Välijärvi, J., Linnakylä, P., Kupari, P., Reinikainen, P., Sulkunen, S., Törnroos, J., & Arffman, I. (2007). The Finnish success in PISA – and some reasons behind it 2: PISA 2003. Jyväskylä: University of Jyväskylä, Institute for Educational Research.

PART III

CRITICAL FRAMEWORKS FOR UNDERSTANDING THE IMPACT OF INTERNATIONAL ACHIEVEMENT STUDIES

WHY THE FIREWORKS?: THEORETICAL PERSPECTIVES ON THE EXPLOSION IN INTERNATIONAL ASSESSMENTS

Jennifer DeBoer

ABSTRACT

There has been notable growth in the number, participants, and frequency of international assessments of student academic performance over the past 50 years. This chapter provides a structure for the perspectives that can be used to analyze this rise. It highlights case study examples of specific countries' choices to participate in particular assessments. It further describes the utility of three analytic frameworks in understanding the decision factors, diffusion mechanisms, and environmental dynamics that relate to international testing. Factors such as the cost of testing, the cultural connections between participating nations, and the temporal relevance of testing to today's focus on accountability arise in illustrations of the transmission mechanism for international achievement tests. This chapter organizes a large and diverse body of information on testing and sampling frames in a unique way. The questions we ask are driven by the framework with which we begin our analysis. Organizations conducting these tests

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 297–330
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013014


can better understand the touchpoints for nations deciding whether or not to participate. Concerns about developing country participation, for example, can be better addressed.

INTRODUCTION

International comparative educational assessment on a large scale began over 50 years ago with the formation of the International Association for the Evaluation of Educational Achievement (IEA). The organization has existed formally since 1967 and traces its origins to 1958 (IEA, 2007). Since then, cross-national assessments have developed and diversified. Specifically in the last decade, however, a surge of interest in these studies has washed over public discourse, and national rankings on the tests have become part of common educational parlance. The number, rigor, scope, complexity, and connectivity of these assessments are currently at an all-time high. Why is participation in this information-gathering process so pressing now?

In this chapter, I describe the growth of international tests, focusing on the factors related to national decisions to participate, the mechanisms that explain how participation has spread from country to country, and the import of the contemporary testing climate for this phenomenon. I apply three theoretical frameworks to organize an understanding of why national participation in international educational studies has skyrocketed. The growth in international assessments is multifaceted, so three broad research questions guide my inquiry. First, in attempting to explain the explosion of international education assessments, it is useful to look at the chronology of events leading up to the present day and the current environment in which the various assessments are situated. Second, a close look at the characteristics of today's educational tests helps to distinguish them from past policy events. Third, the participants in this policy arena must be identified. I describe the use of three theoretical lenses to frame these questions. I want to know: (1) why has this situation arisen now? (2) why have the assessments taken on the forms that they have? and (3) why have the countries that are participating chosen to do so?
There is a dearth of policy literature investigating the explosion in international tests, in part because of its relatively recent occurrence. I address this shortcoming in what follows by raising a number of plausible questions in this area, suggesting possible explanations, pointing out lines of inquiry for future work, and illustrating the theoretical lenses that frame the discussion of the entire volume.


DEVELOPMENT, ASSESSMENTS, AND PARTICIPANTS

Historical Trajectory

The first research question – why now? – calls for an analysis of the historical progression to the current situation. For example, the current policy climate regarding standardization encourages the creation and use of these types of assessments. At the individual country level, local and regional histories and relevant events shape the current atmosphere as it relates to assessment. Changing cultural contexts give temporal relevance to the testing environment and also include regional issues – for example, massification, the boom in the number of college-age students, in France and the Maghreb.

The popularity of these massive assessments and their widespread public use grew from humble, esoteric beginnings. International comparative assessments were pioneered by an informal group of academics testing a nonrepresentative sample of 13-year-olds in 12 countries (IEA, 2007). This informal group would go on to become the IEA, and this first trial run would provide the impetus for the next, more rigorous assessment of mathematics – the First International Mathematics Study (FIMS) (IEA, 2007). Debate and discussion of the findings of these and subsequent reports flourished, including supportive pieces, quantitative analysis and re-analysis, and critique (e.g., Brown, 1999; Lokan, Adams, & Doig, 1999; Boyle & Christie, 1996; Postlethwaite, 1999). In a very short time, assessments of student cognitive performance came to include many countries and subjects. The Programme for International Student Assessment (PISA), run by the OECD, and the Trends in International Mathematics and Science Study (TIMSS), run by the IEA, are currently the largest (in terms of participation) and most widely discussed international assessments. Numerous other international and regional assessments have arisen recently. Clear interest in cross-national cognitive assessments pervades both the public dialogue and conversations in the ivory tower.
An online search of the popular press yields over 60 hits in a single month of 2008 for the search term "international assessment." A citation search of the Social Sciences Citation Index (ISI Web of Science) yields 79 academic citations relating specifically to "international assessment," with the greatest numbers published in 2008 and 2009 and only eight published before 2000. Not only has media and academic coverage of these assessments grown, but the public affinity for international tests and their easily digestible


rankings has permeated the policymaking chambers. More countries are participating in more, and more varied, assessments. The historical development and current state of international tests can be split into two components for further description – the tests themselves and the participants in the assessments.

Current Tests

The nature of the assessment itself is an important part of my analysis and the second main line of inquiry. How can the current tests be characterized? What subject areas are tested? Are these norm- or criterion-referenced assessments? Are other data gathered to contextualize the assessments? What is the relative cost for a country to undertake these tests? The methodological structure and purpose of the tests are key to understanding their current popularity. I focus on tests up to 2007, as tests after that year did not have results available at the time of this writing.

Characteristics

The testing environment is currently broad and diverse. Fig. 1 illustrates the development of the largest multinational studies over the past 60 years. The studies conducted by the IEA, the largest organizer of these tests, have grown in number and subject area. Studies such as the International Assessment of Educational Progress (IAEP) expanded in scope into tests like PISA, including multiple years of assessment data. The cognitive assessments are accompanied by information on the schools, teachers, and often the home backgrounds of the students. The sheer number of international comparative assessments currently underway is testament to this phenomenon; more than seven such studies are in progress. The variety of subjects tested has also increased, and the establishment of assessments given at regular intervals now facilitates trend data. The subject areas now tested include mathematics, reading literacy, science, civic education, and technology use. (Statistics compiled from IEA, 2007; Organisation for Economic Co-operation and Development [OECD], 2007.) The topic of the test raises questions about a country's decision to participate. PISA's focus topic shifts with each round, and a different student questionnaire is issued each time (OECD, 2007). Furthermore, PISA now incorporates an open-ended problem-solving component. A decisionmaker

Why the Fireworks?

301

Fig. 1. Growth in Large-Scale International Assessments over the Past 50 Years.

would assess whether recording student self-reports would maximize the country's utility and then decide whether to participate in, or give credence to, the particular assessment. If more countries prefer to use multiple-choice assessments, then the PISA assessments might not be the ones chosen for widespread use.

Other international assessments contrast even more starkly with TIMSS and PISA. The IEA, which conducts the TIMSS studies, also conducts international studies of Civic and Citizenship Education (ICCS), Technology in Education (SITES), and Reading Literacy (PIRLS), among a number of other subjects covered in studies already completed (IEA, 2007). A national policymaker would see performance in a subject emphasized in the national curriculum as beneficial and would therefore participate in and support this kind of assessment. For example, Fig. 2 shows participation in ICCS. There are numerous abstentions; countries that do not treat civic education as a priority would not see value in participating in this assessment.

The sampling frame for the assessments can affect their appeal to different nations. Assessments can act as both facilitators and inhibitors of access for different populations; much depends on who has the power to make the tests (Broadfoot, 1990). International assessments come under special scrutiny from education researchers for many reasons, and while the American push during the 1980s for more academic rigor in the assessments was somewhat successful, debate and contention still abound (e.g., Rotberg, 2006). For example, Brown (1999) cites differences in retention policies and practices and


JENNIFER DEBOER

Fig. 2. ICCS Participation 2009. Source: Académie d'Aix-Marseille (2008). Note: This map and the maps in Figs. 3–9 are created using freely available outline maps from Académie d'Aix-Marseille, http://histgeo.ac-aix-marseille.fr/.

attendance rates as causing discrepancies in sampling frames significant enough to invalidate comparisons. Even in their technical reports, PISA (OECD, 2007) and TIMSS (IEA, 2007) give very divergent descriptions of the allowances for exceptions from the sampling frames for the respective tests.

Purposes

The stated mission of IEA's "educational laboratory" was to compare practices and results between countries in order to reveal important educational relationships that might otherwise go undetected. The group hoped to use the world as a way of understanding underlying learning processes (IEA, 2007). Chabbott and Elliott (2003) describe the potential uses for these assessments, including cross-national comparisons, informing policy, and a broader understanding of education. Depending on the lens used and one's understanding of the driving force behind the fireworks of educational tests, their results will be used differently.

TIMSS and PISA provide an excellent example and counter-example. TIMSS, as an IEA study, is more time-consuming and in-depth. It takes


stock of the curriculum in each respective country, the educational materials available there, student, teacher, and school background factors, and student achievement (ISC, 2007). Furthermore, its achievement assessment is a curriculum-referenced test. Participation is voluntary, and while countries are expected to fund their own studies, participation numbers remain high, and funding assistance can be negotiated. The type of the test follows directly from the purpose of the assessment and of the organization.

Contrasting with TIMSS is PISA, which is conducted by the OECD. Before PISA, nations involved in the OECD were agitating for an assessment that could determine, in a snapshot, where they and others ranked (Heyneman & Lykins, 2008). PISA was created to do just that. Unlike TIMSS, PISA is a criterion-referenced achievement test that is not concerned with collecting the additional data that TIMSS does. Further, participation in PISA is restricted to members of the OECD (wealthy industrialized nations) and partner nations they have accepted. Funding these tests, however, presents less of a burden than funding TIMSS, since they are less involved.

The design of the research study influences a country's decision to participate or not. Indeed, an important design influence is the set of actors who come to the bargaining table in constructing the studies. Both the IEA and the OECD organizational processes differ greatly from those of regional governing bodies such as the Southern and Eastern African Consortium for Monitoring Education Quality (SACMEQ, 2008). What different agendas do they have that will determine the kind of design they will agree to? Will countries that take part in the design process actually participate in the studies? Latin American countries' participation in TIMSS may provide an interesting case study in this vein.
While some countries are involved in the design process (four of the six Western Hemisphere institutional members of IEA are from Latin America; IEA, 2007), they do not always end up participating in the studies. These countries could be displaying displeasure with the research design. The United States, by contrast, took a greater role in the assessments after the mishandled data of the Second International Mathematics Study; the National Center for Education Statistics was given more stringent criteria for participation, and the Board on International Comparative Studies in Education (BICSE) was formed.

Why are the policy diffusion tracks for TIMSS and PISA so different? One explanation comes from the administrators of the tests themselves, IEA and the OECD. The two organizations are vastly distinct in membership and in mission (the type of test used and its purpose). Western European countries make up a large share of the OECD. Because the OECD administers PISA, these


countries are obliged to participate and may feel compelled to do so out of competition with other OECD countries. All OECD member countries participated in the first surveys, and countries interested in participation must be "chosen" (OECD, 2007).

Both the types of tests and their purposes help to explain the spread of international educational assessments from one locale to another. PISA is conducted quickly to yield a snapshot of performance; its very purpose is mainly comparison. Furthermore, the OECD, made up of official government representatives of its member countries, has the power to control its membership and its partnerships. TIMSS, on the other hand, is conducted by an academic group of scholars who first gathered at a UNESCO meeting and "viewed the world as a natural educational laboratory" (IEA, 2007). TIMSS is therefore a much more in-depth gathering of knowledge.

TIMSS participation policy states that countries should be prepared to finance their participation in the assessment, as does PISA's. However, the TIMSS policy also states that the cost of participation may be adjusted depending on "the scope of the study and the availability of funds from other national or international agencies" (IEA, 2007). (PISA's policy has no such wording.) This implies that developing countries could receive help in participating in the assessments. IEA is open to new members and is more flexible about the timeliness of, and exceptions to, its data collection (see the large number of caveat notes in the TIMSS reports). The very nature of IEA, as opposed to the OECD, could make it more accessible to developing countries, which could explain the clearly divergent trends in participation.
Smith and Baker (2001) discuss the various organizations involved in the historical development of educational "indicators" and the advent of the "statistical indicators movement." This push closely relates to the spreading acceptance of such educational assessments as a norm of national education systems. As countries maneuver to be ranked higher than their neighbors, the importance of these tests skyrockets.

In addition to the widely publicized tests that involve nations around the world, cross-national assessments have grown at the regional level. A number of countries have taken part in regional assessments like the Southern and Eastern African Consortium for Monitoring Education Quality (SACMEQ) and the Programme d'Analyse des Systèmes Éducatifs de la CONFEMEN (PASEC). These region-wide assessments (and others, such as the Primer Estudio Internacional Comparativo in Latin America) may provide substance for further, more targeted exploration of national decision-making regarding educational assessments.


Even at the national level, countries are working to develop the rigor and reach of their national testing schemes. The place that assessment itself holds in a country's educational system plays a role in the country's attitude toward these international studies (Sebatane, 1998). The "culture of testing" has expanded. National ideologies that play out within respective governments can affect their individual dispositions toward international assessments. Take the mutable American perspective toward assessments. In the 1980s, the educational field shifted its focus away from "positivism" (Broadfoot, 1990), which may have made the country less inclined to participate in quantitative achievement assessments. More recently, with the advent of the No Child Left Behind (NCLB) Act and the creation of the Institute of Education Sciences (IES), the pendulum has swung back the other way. Issues such as Adequate Yearly Progress, "teaching to the test," and cut scores reflect this switch. At the international, tertiary level, the European discussion of the Bologna process and its consequences for standardization implies that international assessments now have a political environment ripe for implementation. The standards movement taking hold in several countries (e.g., the United States, New Zealand) could similarly indicate the relevance of today's educational environment to assessments and to their use in rankings and in setting policy agendas.

Current Participants

The third question, which countries are participating in these assessments, is crucial both to understanding the phenomenon of international studies and to evaluating the comparisons resulting from the data gathered. More countries are actively participating in the generation of these data. In 2003, over seventy countries participated in one or both of the two largest international assessments (which took place concurrently), more than at any point in the preceding fifty years. Sixty-seven countries in total had participated in a TIMSS assessment before 2007 (excluding "benchmark participants," specific regions within countries that wanted to participate individually). Not counting benchmarking participants, eleven countries are participating in TIMSS for the first time, and sixteen countries that participated in one or more of the previous TIMSS studies are not participating in the newest edition of the assessment.

Important factors to consider include the population in the formal school system in participating countries and higher interest in assessment in some


countries (Broadfoot, 1990). Differing educational characteristics between countries could decrease the external validity of testing populations and nullify the comparisons that result (Postlethwaite, 1999). Income and national achievement gaps, the homogeneity of the country, drop-outs, and access to education could all affect comparative performance. If a country sees its domestic policies as favorably influencing its performance on these achievement tests, it may be more likely to participate. Countries are conscious enough of the distribution of school quality and population makeup in their nation that these could be factors in their decision-making.

If countries from previous iterations are known not to be participating in 2007's ongoing data collection, other countries may be more or less likely to want to take part. Table 1 lists countries that participated in one or more of the TIMSS assessments in 2003 or earlier but not in 2007, next to countries that will participate in TIMSS 2007 for the first time. The makeup of the participant pool may be an important factor in countries' decisions to participate. Although 68 countries are participating in TIMSS 2007 versus 49 for TIMSS 2003 (International Study Center [ISC], 2007), some countries not listed are participating in the collection of data for TIMSS 2007 but did not participate in the most recent collection, TIMSS 2003.

Interestingly, the new spate of "benchmarking participants" points to the different modes of participation that countries have found. While some have done so to gather additional data from a representative subsample, others may do so to begin to glean information from the assessments even though they have not yet participated nationally (e.g., India; Kingdon, 2007). Of the 11 new additions, all are developing countries. Of the 16 former participants, only 6 are developing countries, and of these 6, 2 are considered newly industrialized countries (including South Africa).
Any country considering participation in the current TIMSS 2007 could see that the shift in participation has been toward more developing countries and fewer developed countries. A policymaker might look at this shift and make assumptions both about the change in the average scores of this new population of countries and about the benefits that could be reaped from being in a larger, but more diverse, pool. Low scorers may even encourage other possible low scorers to join. After TIMSS-R (TIMSS, 1999), one follow-up study (Elley, 2005) found that the 18 developing countries that participated generally found it very useful for human resource development, test results, and future possibilities.

Factors such as reputation are at play, but they are complex. Although a MENA country like Egypt might also culturally include reputation as an


Table 1. Countries Participating in Any Previous TIMSS Collection versus 2007.

Participated in 2003, 1999, or 1995 but not in 2007:
Argentina (late on delivering data); Belgium (Flemish); Belgium (French, only 1995); Canada (divided in 2007 – regions participated only as benchmarks; as a whole only in 1999 and 1995); Chile; Estonia; Finland (only 1999); France (only 1995); Greece (only 1995); Iceland (only 1995); Indiana (United States); Ireland (only 1995); Macedonia; Philippines; Portugal (only 1995); South Africa; Spain (only 1995); Switzerland (only 1995)

New participant in the 2007 collection:
Alberta (Canada, though Canada participated in 1999); Algeria; Bosnia and Herzegovina; British Columbia (Canada); Dubai (United Arab Emirates); El Salvador; Georgia; Kazakhstan; Malta; Massachusetts (United States); Minnesota (United States); Mongolia; Oman; Qatar; Ukraine

Source: TIMSS (2007).

important variable in the calculation of its utility function, it has participated in some of the most recent IEA assessments (e.g., TIMSS, 2003). Many countries in the Arab group participated with support from the UNDP (UNDP, 2007). While countries may originally be shocked out of participating, the general testing culture may win participants back; see, for example, the headline "South Africa ready to take the TIMSS test again" (Scott, 2010).

Noticeably high-scoring countries may see association with assessments showing wider achievement spreads as detrimental to their image, or not worth their time and money if they have no need of re-proving their reputation. In contrast, a country with high scores may instead see an opportunity to showcase its results. Test results are easy to cite, a quick reference to illustrate national human capital prowess. A country's interpretation of this variable would depend on its expectations of its students as well as its cultural attitudes. In analyzing the decisionmaking factors at the national level, perhaps association with the industrialized nations in TIMSS by


participating in assessments with them outweighs the reputational danger due to performance. Kamens and McNeely (2009) provide an excellent, in-depth analysis of how the forces of globalization relate to the diffusion of testing at the international and national levels. Their work is especially useful in understanding how cross-national organizations, both international and regional, may facilitate this spread.

India, though one of the founding members of IEA (Ottobre, 1976), is nowhere to be seen in recent assessments. Given the requirement of a nationally representative sample, the difficulties posed by India's huge and stratified population outweigh the benefits such a quickly growing country could see. India did, however, have two states take TIMSS questions (Kingdon, 2007). In 2009, 72 countries are slated to participate. In addition, this year China completes the administration of PISA in 14 of its 23 provinces (OECD, 2007). And with strong encouragement from the World Bank, India is piloting PISA in a number of its states in 2010 (Alliance for Excellence in Education (AEE), 2009).

THEORETICAL FRAMEWORKS

This chapter draws inspiration from Allison and Zelikow (1999) and applies multiple theoretical frameworks to the same central issue to better understand the characteristics of the stakeholders, the environment, and even possible future developments. The analytic framework chosen may bias one's interpretation and leave plausible explanations completely unstudied; the very paradigm in which we operate shapes the types of questions we can ask about a situation (Kahne, 1996). In analyzing the spread of this policy, therefore, I use multiple frameworks in order to organize a more comprehensive understanding of why and how the current situation arose.

Here, "international assessments" refers to those comparative assessments wherein similar measurements are taken in multiple nation-states for the purposes of comparative evaluation and drawing conclusions (Type I in the categories of Chabbott & Elliott, 2003). I do not concentrate on other types of cross-national data collected, for example, from national governing bodies to create indicators that illustrate educational inputs and outputs such as teacher qualifications or graduation rates (Postlethwaite, 1999). While the development of indicators informs the understanding of the overall environment for cross-national education policy, I focus on tests with cognitive assessments, both the well-established, well-known test


cycles of TIMSS and PISA and the emerging regional data-gathering groups such as SACMEQ.

This chapter uses three different lenses to focus on various aspects of international educational assessments. I evaluate the explanatory power of these frameworks as I answer the three research questions posed earlier. The characteristics of these paradigms govern the very subquestions that can be asked about this phenomenon. This chapter generates numerous possible questions that can be pursued in future inquiry and explores plausible explanations. Looking at the explosion of international studies from different, sometimes competing angles informs policymaking in a much richer manner than simply accepting the current testing environment or understanding its emergence from a single viewpoint.

The first lens through which I examine the issue of contemporary international assessments is rational choice theory. This framework has its roots in economic theory, and my discussion incorporates both a basic rational actor structure and game theory; both paradigms help to analyze a country's decision-making process. Rational choice theory posits that in decision-making, the actor or actors involved in a dilemma each make choices to maximize the achievement of their own goals (their own individual utility functions) based on the knowledge they possess. If a given actor's utility function is known, its response to a situation can be predicted. Game theory adds a layer of complexity to this framework, noting that an important variable in a given actor's utility function is its expectation of how others will act. An examination of the growth in international assessments in this framework provides useful insights about the individualities of countries that lead them to participate in the tests.
The second lens used in this exploration is the theory of policy innovation/diffusion, which first appeared in J. L. Walker's study of policy spread between American states (1969). Policy innovation/diffusion looks at the spread of a particular idea between states connected by geographical, linguistic, cultural, or other commonalities. If I can understand the characteristics of a policy adopter, or the relationship between an initial policy innovator and the subsequent policy implementers, I can predict the future spread of ideas, and test-makers can encourage more, or particular kinds of, participation.

The third lens that will shed light on my discussion, a new addition to the policy toolkit, is macro-dissatisfaction theory. Macro-dissatisfaction theory is a novel combination of two previously used lenses, based partially on the more general "dissatisfaction theory" and incorporating other


established frameworks such as conflict theory, modified to address the question at hand. Dissatisfaction theory first appeared in Iannaccone and Lutz's work on school board turnover at the local policy level (1970). Though it is normally applied to local school- and district-level politics, I use this theoretical basis to develop a "macro-dissatisfaction theory" that can be applied to the international context. Dissatisfaction theory focuses on the role of a stakeholder's dissatisfaction as a catalyst for policy change (take, for example, the public and governmental reaction in the United States to "A Nation at Risk"). Dissatisfaction at the nation-state level can be caused by a number of factors, and the results can be policy implementation, policy discontinuation, or policy modification. This policy back-and-forth feeds into the collective psyche, affecting the psychology of the organization, with implications for the organization's behavior.

DEFINITIONS

Rational Choice and International Assessments

The implications of rational choice for politics and public policy are clearly defined in economic work relating both to individual utility maximization and to game theory. Boyd, Crowson, and van Geel (1995) applied this theoretical framework to the politics of education. The mathematical tradition from which rational choice theory arises demands definitions of the relevant variables and functions, which is where I begin applying the framework to the question of assessments.

Through the rational choice lens, the first task is to decide who the "actors" are in this situation. It is common in analyses of international relations and political theory to explain the actions of an entire country as those of one unified actor (Allison & Zelikow, 1999). I use this method of analysis, recognizing the important limitation that it oversimplifies very complex systems involving national governments and relevant policymakers, national ministries of education, a nation's school system and its students, the general public, and so on. The "actors" for this analysis, then, are all the participating and non-participating nation-states. Friedman and Hechter (1988) point out that this model is useful for understanding macrosociological concerns. Koremenos, Lipson, and Snidal (2001) contend that "Our basic presumption, grounded in the broad tradition of rational-choice analysis, is that states use international institutions to further their


own goals, and they design institutions accordingly. This might seem obvious, but it is surprisingly controversial" (p. 762). Goldthorpe (1998) likewise describes the utility of the rational actor model for sociology: one that has "rationality requirements of intermediate strength, that has a primarily situational emphasis and that aims to be a theory of a special, although at the same time a privileged, kind" (p. 186).

In this chapter, I define a general utility function and possible variables and then look at a handful of particular cases for which a more detailed function is discussed. This general utility function incorporates the type of assessment, its cost, the overall political climate, and knowledge of which other countries are participating as key components. If the combination of these factors exceeds a country's "participation threshold," that country will choose to participate in the assessment. For example, IEA's values defined the world as an education laboratory; however, not all countries see their utility maximized by comparing their performance or by using knowledge about foreign results, and some may interpret international comparisons differently from the way the creators intended.

The utility function in this case includes two basic possible actions:

A = {participate, abstain}

or, more complex:

A = {participate as a country, participate as a benchmark participant, delay participation, refuse to participate}

The utility function defined includes general categories of factors:

u(financial resources, sociocultural setting, political setting)

One branch of rational choice emerges as particularly useful in conceptualizing national decisions to participate: game theory. In game theory, two rational actors are, as before, trying to maximize their utility; however, the decisions they take depend on their predictions of the other player's decision.
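To make the threshold decision concrete, the participation rule can be sketched in code. This is a toy sketch only: the factor names, weights, and threshold values below are illustrative assumptions, not quantities estimated in this chapter.

```python
# Toy sketch of the rational-actor participation decision described above.
# Factor names, weights, and thresholds are invented for illustration.
from dataclasses import dataclass

@dataclass
class Country:
    name: str
    financial_resources: float  # 0..1: capacity to fund participation
    sociocultural_fit: float    # 0..1: alignment of the test with national priorities
    political_support: float    # 0..1: domestic appetite for external testing
    threshold: float            # the country's "participation threshold"

def utility(c: Country, w=(0.4, 0.3, 0.3)) -> float:
    """u(financial resources, sociocultural setting, political setting)."""
    return (w[0] * c.financial_resources
            + w[1] * c.sociocultural_fit
            + w[2] * c.political_support)

def decide(c: Country) -> str:
    """Choose from the simple action set A = {participate, abstain}."""
    return "participate" if utility(c) > c.threshold else "abstain"

a = Country("A", 0.9, 0.8, 0.7, threshold=0.5)
b = Country("B", 0.2, 0.4, 0.3, threshold=0.5)
print(decide(a), decide(b))  # the well-resourced, well-aligned country joins
```

The model predicts participation exactly when the weighted combination of factors clears the country's threshold; the empirical difficulty, of course, lies in estimating those weights and thresholds for any real country.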
For example, a country could decide whether to participate based on another country's participation, or on how it expects the other country to react to its own participation. If a country believes that, by participating, it will be held in higher regard by the other countries in the assessment, it could include this factor in its utility function.
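This interdependence can be sketched as a two-country "participation game." The payoff numbers below are invented solely to illustrate the structure and are not drawn from any assessment data.

```python
# Hypothetical 2x2 participation game: each country's payoff depends on
# whether the other participates. All payoff values are invented.
ACTIONS = ("participate", "abstain")

# payoffs[(action_A, action_B)] = (payoff_A, payoff_B)
payoffs = {
    ("participate", "participate"): (3, 3),  # mutual comparison and visibility
    ("participate", "abstain"):     (1, 2),  # A bears the cost, little to compare
    ("abstain",     "participate"): (2, 1),
    ("abstain",     "abstain"):     (2, 2),  # status quo
}

def best_response(player: int, other_action: str) -> str:
    """The action maximizing this player's payoff, given the other's action."""
    if player == 0:
        return max(ACTIONS, key=lambda act: payoffs[(act, other_action)][0])
    return max(ACTIONS, key=lambda act: payoffs[(other_action, act)][1])

# Pure-strategy Nash equilibria: profiles where both players best-respond.
equilibria = [(a, b) for a in ACTIONS for b in ACTIONS
              if best_response(0, b) == a and best_response(1, a) == b]
print(equilibria)
```

Under this particular payoff matrix, the game has two pure-strategy equilibria, both countries participating or both abstaining, which echoes the earlier observation that low scorers may encourage other possible low scorers to join.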


Diffusion Theory

Policy innovation/diffusion theory provides a different framework for analyzing openness or resistance to international assessments. This theory, a more recent arrival in the toolbox of political science and policy analysis, provides a paradigm within which we can examine the flow of ideas and policies from one nation to another. Rather than looking through the eyes of one actor (the country) as it measures up the world and makes a discrete decision, this framework highlights relationships and the permeation of a policy through the global sphere. Diffusion theory dates back to Walker's study of policy diffusion between the American states (1969), which pointed to factors both within the nation, which determine its individual characteristics, and between nations, which characterize relationships, as important to consider in analyzing the diffusion of policy, particularly one that involves cross-national comparison.

Studies have already begun looking at the transfer of educational policies (McLendon, Hearn, & Deaton, 2006) as well as transfer at the international level (Zahariadis, 1998; Sebatane, 2000). These sources outline a framework of policy invention, implementation, adaptation, and transfer as a model for understanding why a particular locale might begin to utilize a certain policy at a certain time. The "policy innovation" of interest is defined here as the international comparative assessment itself, as well as participation in or abstention from the assessment. I therefore look at a number of different policy innovations (the various international assessments) and see how their trajectories compare and contrast. A useful example is the comparison between the rise of TIMSS and that of PISA, discussed in the next section.
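The diffusion dynamic can be made concrete with a toy threshold model over a network of ties (geographic, linguistic, or organizational). The network, threshold, and country labels here are hypothetical, chosen purely to illustrate how adoption can cascade through connected countries.

```python
# Toy threshold model of policy diffusion. Countries adopt the "policy"
# (assessment participation) once enough of their tied neighbors have.
# The tie network and adoption threshold are invented for illustration.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def diffuse(seed, threshold=0.5, rounds=10):
    """Spread adoption: a country adopts when the share of its adopting
    neighbors reaches the threshold."""
    adopters = set(seed)
    for _ in range(rounds):
        new = {c for c, nbrs in neighbors.items()
               if c not in adopters
               and sum(n in adopters for n in nbrs) / len(nbrs) >= threshold}
        if not new:
            break  # diffusion has stalled
        adopters |= new
    return adopters

print(sorted(diffuse({"A", "B"})))  # two linked early adopters carry the network
```

In this toy network a single isolated innovator fails to spread the policy, while two linked "policy entrepreneurs" eventually carry every connected country, a pattern consistent with the role of proximity and shared ties emphasized by diffusion theory.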

Macro-Dissatisfaction Theory

The final lens employed in studying the phenomenon of international assessments introduces a new theoretical framework built from a combination of earlier works. The Dissatisfaction Theory of Governance (Iannaccone & Lutz, 1970) suggests that "local school board governance is a democratic process evidenced by sporadic and intense election defeats by a dissatisfied citizenry" (Alsbury, 2003). Although the analysis here is not concerned with local school board politics, this theoretical background provides a useful basis for what I call "macro-dissatisfaction theory."


"Only a crisis – actual or perceived – produces real change. When that crisis occurs, the actions that are taken depend on the ideas that are lying around" (Friedman, 1962). Friedman's words point to public unease as the necessary force for action. Further, this unease must be broad enough to build a coalition and spark an actual paradigm shift (Kuhn, 1962). I begin to develop a "macro-dissatisfaction theory" and apply it not to changes in school board governance, but to changes in policy made at the nation-state level. Macro-dissatisfaction theory states that, given a particular political, social, economic, or cultural climate, enough collective dissatisfaction will accumulate that a large change becomes more viable as a supposed response to this unrest. Another implication of dissatisfaction theory incorporated into my macro-dissatisfaction theory is the concept of periodic reevaluation.

Causes of macro-dissatisfaction can be found in a number of different areas. Perceived inequity, unrealized expectations, changing senses of entitlement, awareness of alternatives, and political or cultural shifts can all create enough dissatisfaction to precipitate corrective action. The fad-like behavior of public interest in an issue is clearly documented (Downs, 1972). Downs calls this the "issue-attention cycle" and documents in detail the stages of increasing and decreasing interest in an issue. If what we are seeing with international assessments is indeed a crisis/fad model of public interest, then we should be able to observe a cyclic pattern of interest in international comparisons. As with the other two theories, the set of potentially relevant factors for any one country is boundless, so I highlight only the most relevant ones.
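A minimal numerical sketch of this cyclic dynamic, with every parameter invented for illustration: dissatisfaction accumulates from periodic shocks, decays over time, and triggers a policy change when it crosses a threshold, after which attention collapses, as in Downs's issue-attention cycle.

```python
# Toy model of macro-dissatisfaction: unease accumulates and decays;
# crossing the threshold triggers a policy change and resets attention.
# Decay rate, threshold, and shock values are invented for illustration.
def simulate(shocks, decay=0.8, threshold=1.0):
    """Return the periods in which accumulated dissatisfaction triggers change."""
    level, changes = 0.0, []
    for t, shock in enumerate(shocks):
        level = level * decay + shock  # old unease fades, new unease arrives
        if level >= threshold:
            changes.append(t)          # crisis: corrective action is taken
            level = 0.0                # post-reform, public interest collapses
    return changes

# A low hum of discontent punctuated by two crises (e.g., poor rankings).
print(simulate([0.2, 0.2, 0.9, 0.1, 0.2, 0.2, 1.2]))  # changes in periods 2 and 6
```

The cyclic pattern of interest that the crisis/fad model predicts shows up here as widely spaced trigger periods separated by stretches of decaying attention.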

EXPLANATORY MECHANISMS, DECISION FACTORS, RELATED ENVIRONMENTAL FACTORS

Table 2 gives an organized view of how these three frameworks may give specific insight into the growth of international assessments. It highlights the factors each perspective considers and the benefits and limitations of the explanatory power of each.

Rational Actor Theory – Decision Factors

Financial burden is one of the major decision factors countries weigh in determining whether or not they will participate. The organization of the


Table 2. Utility of Three Frameworks Applied to the Growth in International Assessments.

Rational choice (predominant perspective: economic)
Important factors: "competitor" nations' participation; cost of testing; test type; reputation; testing atmosphere/ideology
Utility (benefits and limitations): simple; focused on the decision-making process and influential factors

Policy diffusion (predominant perspective: political)
Important factors: shared language; geographic proximity; language/culture; political and organizational connections/obligations
Utility (benefits and limitations): highlights mechanisms of transfer; traces the growth of the assessments

Macro-dissatisfaction (predominant perspective: sociocultural)
Important factors: health of the economy; cultural moments; cyclical nature; inequity; environmental factors/catastrophe; previous test scores; other nations' (dis)satisfaction
Utility (benefits and limitations): crisis catalyzes change, and cyclically; underscores the importance of "peer groups" for nations

actual data collection is based largely on autonomous work at the national level. Both member and non-member countries of IEA and OECD (for TIMSS and PISA, respectively) can join in the studies. The process of joining TIMSS for non-members begins with an application describing the country’s plans for organizing the study, including appointing a National Study Center, a National Research Coordinator, and a National Committee of experts (IEA, 2007). Participating countries cover the national costs of the study as well as international coordination costs, though respective costs depend on the availability of outside supplemental funds and specific needs. These costs alone could be a deterrent to countries that see little added utility in participating and great financial cost. These do not even include issues getting schools/local level to participate, which are large (Williams et al., 2009). The wording of PISA’s participation process is similar. However, the membership scheme of OECD gives decision control to the member states,

Why the Fireworks?

315

so member states have explicit (and ultimate) say over which other countries will be a part of the studies. In France, reputation is a key variable in the calculation of the country’s utility function (e.g., Labi, 2008). A country’s decision may also be tempered, of course, by how well a country perceives other countries as being capable of performing. Taking TIMSS 2007 as an example, it is useful to compare the changes in countries participating.

Diffusion Theory – Mechanisms for Diffusion

A number of mechanisms could explain the diffusion of international assessments. Geographic proximity is one possible mechanism, as nearness facilitates awareness of neighboring countries' policies and increases the likelihood that two countries will cooperate or compete in various arenas. Linguistic and cultural links are also plausible connections that would facilitate diffusion. A shared language allows a freer flow of communication, and cultural connections often imply a shared history, especially in public institutions such as education. Political or organizational ties (including ties between countries of similar socioeconomic status) are an important consideration in the analysis. Finally, certain countries can be characterized as "policy entrepreneurs," those who are more willing to experiment with novel policies and who may lead the way for other related countries to follow.

Maps provide a valuable tool for studying the spread of international assessment participation, as proximal and culturally connected regions are immediately visible. Figs. 3–6 illustrate participation in TIMSS in 1995, 1999, 2003, and 2007, respectively. Scrolling through the four maps, and then through the PISA group afterwards, makes the trends immediately apparent: geographic proximity is a major vehicle for policy transfer. In 1995, Iran and Israel were the only countries from the Middle East and North Africa (MENA; here I use the World Bank regional designations) region participating in TIMSS. TIMSS 1999 saw Morocco, Tunisia, Jordan, and Turkey all joining the assessment. While Turkey did not participate in 2003, Egypt, Saudi Arabia, Bahrain, Lebanon, the Palestinian National Authority, Syria, and Yemen all joined. This nearly exponential growth in Middle Eastern and North African participation continued in 2007, as Algeria, Dubai (United Arab Emirates), Kuwait, Oman, Qatar, and again Turkey participated.
Geographic proximity and shared culture facilitate policy diffusion here.


Fig. 3. TIMSS 1995 Participation.

Fig. 4. TIMSS 1999 Participation.


Fig. 5. TIMSS 2003 Participation.

Fig. 6. TIMSS 2007 Participation.


Similar growth is observable in the Eastern Europe and Central Asia (ECA) region. A number of Eastern European countries had already participated in 1995 (Latvia, Lithuania, Hungary, the Czech Republic, the Slovak Republic, Romania, and Bulgaria). In 1999, Moldova joined the list of participating ECA countries. Armenia and Estonia joined in 2003, and Georgia, Kazakhstan, Mongolia, and Ukraine joined in 2007. While the ECA region experienced large growth in TIMSS participation, Western European participation appeared to wane. After the first TIMSS assessment in 1995, Norway, Sweden, Denmark, Ireland, Germany, France, Spain, Portugal, Austria, and Switzerland stopped participating. Norway and Sweden returned to the assessments in 2003, and Germany, Austria, Denmark, and Spain returned for the 2007 assessment. What can explain this apparent drop-off in Western European participation in TIMSS? Western European countries may have seen little value in the results of the first assessment and only considered rejoining the tests after other countries in their area (e.g., Eastern European countries) increased their participation. To fully understand the phenomenon of participation in the TIMSS assessments, especially for Western Europe, participation in PISA must be discussed.

Political and Organizational Connections/Obligations

Map trends for PISA participation illustrate strikingly different results from the TIMSS analysis. Figs. 7–9 illustrate participation in the first three completed PISA collections.

Fig. 7. PISA 2000 Participation.

Fig. 8. PISA 2003 Participation.

Fig. 9. PISA 2006 Participation.

All the countries in Western Europe, including those that dropped out of the TIMSS assessments, participate in all three iterations of PISA (2000, 2003, and 2006). By contrast, there are far fewer participants from Eastern Europe or the former Soviet republics. There are very few Central European participants, and the growth in Middle Eastern and North African participation is not nearly comparable to that observed for TIMSS. Tunisia is the only African country to participate at all. The difference in Latin American participation is stark as well: Mexico and Brazil participate in all three PISA tests, Chile and Argentina participate twice, and Colombia and Peru participate once. As the French education system is still relevant in the former colonies, the linguistic/cultural diffusion mechanism is an important consideration. The advent of the Programme d'Analyse des Systèmes Éducatifs (PASEC) supports this.

Policy entrepreneurs lead the way for their region in adopting a policy. Tunisia, for example, was one of the first North African countries to participate in TIMSS, in 1999. Two rounds later, every North African country but Libya participated in TIMSS, as did most of the MENA region. Tunisia may have helped give the policy the foothold it needed to diffuse through the region. Tunisia's particular character makes it a fitting illustration of a policy entrepreneur in its region. The country's forcefully secular government aligns itself more with Mediterranean and European countries than with the rest of Africa or even with other MENA countries. Since independence, education has been a top priority (as the "top national priority," it receives more than 20% of the annual government budget; Hamdy, 2007). Since Tunisia is the only North African participant in PISA, I would predict similar policy diffusion there.

Temporal Considerations

I look briefly at how macro-dissatisfaction theory plays out over time. In developing nations, dissatisfaction with current conditions could actually push a country to participate in assessments in the hope that publicizing the country's plight will attract the attention of aid agencies and investment. Dissatisfaction can also lead to abstention, as may have been the case with France: after its disappointing TIMSS 1995 results, France no longer participated in the TIMSS studies. For other countries, dissatisfaction with the popular grouping they are assigned to (e.g., developing country) might encourage participation in an assessment in order to be grouped with other, perhaps more powerful, nations. Depending on a country's political leadership at a given time, dissatisfaction as a political tool could be desired or avoided based on whether the party is seeking or hoping to maintain political office. Smith and Baker (2001) point to instances wherein parties in a nation used results to agitate for policy change, creating a political football out of the educational assessments.

Globally, macro-dissatisfaction is an even more complex variable to quantify. Changing contexts in one area of the world do not always create a change in dissatisfaction in another. Satisfaction on the part of some nations could equate to dissatisfaction on the part of others; does this then nullify the broader dissatisfaction? The phenomenon of globalization is an issue, to differing extents, everywhere. Globalization brings disparate parts of the world closer together, thereby facilitating increased awareness of perceived inequity, discovery of unfulfilled expectations, and heightened sensitivity to previously ignored issues, all of which are possible contributors to macro-dissatisfaction. Globalization also provides an outlet through which localized dissatisfaction can be broadcast quickly and can snowball until macro-dissatisfaction decidedly exists. Further, globalization creates interdependence between nation-states that can lead to dissatisfaction if one country does not feel it is benefiting enough. The forces of globalization may have numerous consequences for policy spread (Astiz, Wiseman, & Baker, 2002).
All these possibilities are present in the current international environment to some extent, making macro-dissatisfaction theory a plausible explanation for the increased international participation in, and attention to, international educational assessments. Related concurrent outcomes that are also products of macro-dissatisfaction lend the theory further support. Take, for example, the increased international use of accountability systems in education: the global climate of rising standards and accountability in education is perhaps an outcome of general dissatisfaction with educational outcomes, concurrent with the increase in international assessment participation.

The phenomenon of test participation itself could be viewed as a cause of macro-dissatisfaction, which would perhaps help explain the seemingly exponential growth in TIMSS participation. For countries that have not previously participated in such assessments, macro-dissatisfaction could stem from the perceived inequality between countries that are privileged with participation and countries that are not. The result of such macro-dissatisfaction would be increased effort, and possible success, in securing a place in the next round of data collection. Also, dissatisfaction on the part of a high-performing country could predict satisfaction on the part of a low-performing country, as reasons for dissatisfaction would differ from one context to another. Lastly, the general global political climate offers a compelling application of macro-dissatisfaction theory. Perhaps macro-dissatisfaction and international fatigue resulting from extended international conflicts and economic difficulties have catalyzed a change in international attitudes. The international community may have reacted by looking to what might be regarded as more concrete and objective measures of a nation's human resources.

CASE STUDIES

United States

To underscore the use of these frameworks, I highlight examples and the insight offered by each theoretical perspective. What lines of inquiry are open to better explain America's participation in international assessments? What factors relevant to the United States would make the current time and environment one that creates dissatisfaction great enough to push for change in participation in, and the type of, international assessments? Where is the United States situated in the policy's diffusion, and which tests does it pursue?

Looking at America, the issue-attention cycle is clearly evident. The cases of Sputnik, "A Nation at Risk," and "Rising Above the Gathering Storm" (Baker & LeTendre, 2005, p. xiii) are all events wherein a momentous "realization" precipitated an enormous amount of political and popular rhetoric directed at revising the education system because of the implications of international comparisons. These "crises" mark out periods of approximately 25 years (1957, 1983, and 2007). The current policy situation in the United States extends to higher education as well. "Measuring Up: The National Report Card on Higher Education" (National Center for Public Policy and Higher Education [NCPPHE], 2006) touts its novel international comparisons and admonishes American complacency in higher education.

For the United States, macro-dissatisfaction appears applicable to the current international educational assessment climate. Besides the "Rising Above the Gathering Storm" report, a number of other pieces illustrate that macro-dissatisfaction is currently present in amounts large enough to precipitate change. The immense popularity of "alarmist" literature such as Thomas Friedman's The World is Flat (2005) and John Kao's Innovation Nation (2007), both national bestsellers, illustrates the public demand for this writing. Both books point to the threat that rising human and economic capital abroad poses to complacent American assuredness. In opinion polling, education is specifically identified as "one of the most important national problems" (Bositis, 2002). Previous international assessments (TIMSS 1995 and 1999; PISA 2000) pointed to distressingly low American performance relative to the other countries surveyed. Perceived inequity of ability and drastically unmet expectations, especially in math and science, caused a large amount of dissatisfaction among the American public during this time. Concurrent outcomes like the ratcheting up of standards and accountability measures in education (such as NCLB) support macro-dissatisfaction theory as an explanation for the United States' increased sensitivity to international assessments. The same dynamic was visible at the very beginning of today's international assessment push: owing to frustration with the conduct of the early IEA reports, the United States approached the OECD for an alternative; the result was the IAEP, which later developed into PISA (OECD, 2004a, as cited in Heyneman & Lykins, 2008). Another cause of macro-dissatisfaction, and a supporting concurrent change, relates to the United States' international standing.
The American reputation internationally during the first few years of the twenty-first century was loudly criticized in the popular press and in opinion research (British Broadcasting Company, 2002; BBC poll, 2003; Pew Global Attitudes Project, 2002–2007). As early as mid-2002, American unpopularity was noticeably high in the period leading up to the Joint Resolution to Authorize the Use of United States Armed Forces Against Iraq (White House, 2002a). Just 11 days prior, the United States had formally reentered the United Nations Educational, Scientific and Cultural Organization (Americans for UNESCO, 2005). This event came 12 days after the release of the National Security Strategy (White House, 2002b), which emphasized American military supremacy. The concurrence of these events suggests that dissatisfaction, both from within the United States at the loss of international regard and from without at American political decisions, was responsible for the shift in attitude and practice toward a greater public consciousness of, and a greater national participation in, international educational assessments.

In the United States, it seems that the ultimate result of dissatisfaction was increased attention to international comparisons; poor performance was, in this case, actually a catalyst for increased interest in rankings. Just as "A Nation at Risk" engendered heightened rigor and greater participation in international assessments, so perhaps the recent dissatisfaction has been fuel for the domestic political fire and for more attention to this newly re-identified "problem." In the United States, international assessments do inform domestic education policymaking. However, they have often come to be used as a political football by various interested parties. The United States as a rational actor participates in international assessments for a number of reasons, but perhaps one is that domestic interests want to use international performance as their own political tool. The costs are non-trivial: the first TIMSS study in 1995 cost the United States around $30 million (Baker, 1997). The assessments are widely cited, however, and get a great deal of publicity "mileage."

Latin America

The financial dimension is relevant even to differential participation between studies conducted by the same organization, IEA: TIMSS 2007 and ICCS 2009 (IEA, 2007). A notable number of Latin American countries did not participate in TIMSS 2007 but do participate in ICCS 2009 (Chile, the Dominican Republic, Guatemala, Mexico, and Paraguay). This may be because participation or high performance on this specific type of test or subject area matters greatly to the maximization of these countries' utility functions. However, the pattern may be more relevant to the discussion of financial costs. One of the "regional modules" conducted as part of ICCS 2009 is a Latin American module, and the related Latin American countries therefore receive financial support from the Inter-American Development Bank (IEA, 2007). Even with a subsidy from the UNESCO Education Sector, participation in ICCS 2009 would cost these countries 120,000 USD in total over the four years of the study if they did not receive extra financial support (IEA, 2007). Depending on a country's utility function, this cost could be justified, negligible, or overly taxing. For countries that do not see the benefits of participation in international assessments as maximizing their utility when a financial burden is involved, extra financial support could facilitate their participation in a particular study. However, Wolff (2007) argues, regarding Latin America, that:

Testing is among the least expensive innovations in primary education reform, costing far less than increasing teachers' salaries, reducing class size, and reforming teacher training. Costs play an important but not defining role in decisions about testing. Each country has a different set of conditions, and decision makers and technicians need to make their own trade-offs regarding breadth and depth of testing based on their objectives and capacities. Given current capacities, it is not advisable to test all students in all grades, as is now mandated in the United States … Participating in international tests is not expensive, and it can pay off many times over if the results are used to reform curricula and teacher training. (Wolff, 2007, p. vi)

In 1995, three Latin American countries participated in TIMSS (Mexico, Colombia, and Argentina). Only one Latin American country participated in 1999 (Chile), only two in 2003 (Chile and Argentina), and only two in 2007 (Colombia and El Salvador). As there are few countries from which the policy could spread locally within this region, the lack of a local policy innovator might explain TIMSS's limited penetration there.

Ghana

In the same year that Ghana first joined (2003), Botswana also participated in TIMSS for the first time. The spread of the policy among nations in sub-Saharan Africa appears to follow its own pattern, distinct from the spread of the policy to North and West Africa. This different categorization is further strengthened by looking at the other countries tied to Francophone North Africa, namely the Middle Eastern countries. As shown earlier, the policy diffusion among the MENA countries is visually obvious; North Africa's diffusion characteristics owe more to its connection with the Arab countries. Ghana's participation and use of the assessments also highlight the individual decision-making process. In the case of Ghana after 2003 (Ghanaian educational researcher, personal communication, 2009), the country's delay in releasing scores and in participating in the next round indicated reluctance to receive another low relative ranking. This also led to a greater focus on quality monitoring within the country. After the 2003 TIMSS results, substantial resources were poured into the development of the Basic Education Comprehensive Assessment System (BECAS). One component of this was a national educational assessment (NEA) for two grades in primary school.


IMPLICATIONS

Each of these three lenses – rational actor theory, innovation/diffusion theory, and macro-dissatisfaction theory – offers a different perspective, sometimes complementary, on why international comparative educational assessments have grown in scope, number, and participation. These theories begin to explain why certain nations have begun or ended their participation in these assessments, why these types of assessments have been adopted, and why these changes are taking place at the current point in history. While rational actor theory is simple, it does provide compelling arguments for how individual nation-states would evaluate participation in or attention to international assessments, and its explanations are plausible for a number of individual cases. Innovation/diffusion theory offers a different perspective that dovetails with the decisions of individual nations (rational actor) and looks at how the policy of participation in international assessments has spread. This framework draws on the literature on the diffusion of innovation, where the international study itself is the innovation, as well as the literature on policy borrowing, to provocatively illustrate the geographic regions where participation has grown or is lacking. Macro-dissatisfaction theory offers a different, temporal look at the generation of policy choices in individual countries and the international community. It examines the cyclical nature of public interest and the crisis-catalyst often required for macro-change that could explain the circumstances surrounding international assessments.

A number of concerns arise in conducting this analysis. If financial cost weighs more heavily in a country's decision to participate in an assessment than the opportunity cost of abstention (rational actor), could this create a confounding variable that might hamper cross-national comparisons?
Could a country's choice to participate in or pay attention to an international assessment create problematic selection bias? Another major concern is that, while more developing countries are participating, they are nearly all from similar regions (diffusion theory). TIMSS has made inroads into the fringes of the African continent, but the vast majority of sub-Saharan African nations have not participated in any of the larger international assessments discussed here. However, a number of these countries have taken part in regional assessments like SACMEQ and PEIC. Some countries that might be expected to join in assessments have not participated. Despite Kenya and Nigeria's status as institutional members of IEA, they have not participated in any of the TIMSS collections. Why is India, one of the founding members of IEA (Ottobre, 1976), not present in any of the major assessments studied here? Why is China, so prominent in America's dissatisfaction dialogue, also absent? Wiseman and Baker (2005) allude to the increasingly internationalized nature of educational policymaking itself: where policy borrowing already existed, the increasing emphasis on quantifying and measuring educational characteristics has facilitated comparison and focus.

Most worrisome for proponents of international assessments in general, and of the international assessments currently used in particular, is the concern arising from the implications of the third theory, macro-dissatisfaction. A central tenet of this theory is its cyclical nature. In America, the last fifty years have been relatively evenly demarcated by spikes in national interest in international comparative standings. This periodic resurgence and decline could imply that interest and participation in international assessments will taper off in the not-too-distant future.

These three theories do not provide exhaustive explanations for this international educational phenomenon. However, they do illustrate intriguing possible explanations for the recent growth in these assessments. The complex systems involved make definitive conclusions difficult, but this chapter attempts to create a foundation for inquiry on this topic. Further analysis should be conducted to verify and explain in greater detail, especially as interest and participation in international comparative assessments develop. International studies of education are currently being used as both tools and impediments for national and local policymakers worldwide. Their results are setting national policy agendas and driving political debates. Even participation in these assessments may be politically driven.
Responsible use of data from international studies can only be realized if the development of the assessments, and the related policy environment, is understood and connected to consequent policymaking. The ambiguous mechanisms by which these assessments are constructed pose a great challenge to researchers trying to understand how learning works and to provide policymakers with evidence. This chapter illustrates how the history, nature, participants, and future of international achievement studies can be conceptualized through multiple lenses. It describes how the impact of such studies has shaped the policymaking environment and how that complex environment has, in turn, influenced the studies. Technically, these are not "high-stakes" tests, but the weight given to rankings from programs like TIMSS is obvious in many countries. The background of these assessments should be thoroughly understood, though few scholars have addressed the topic and even fewer policymakers appropriately contextualize the data they use from these studies. Large-scale assessments are neither made nor implemented in a vacuum; the forces that shape society also affect the development and implementation of these studies. This chapter illustrates how these studies have arisen, how their results have been used in the past, and how we can understand the potential for future development. It is imperative that both researchers and policymakers understand how the tests arise in order to conscientiously apply the information collected. I have looked at the uptake and use of assessments based on their timing, purpose, and leaders in implementation. Corralling this herd of international assessments demands sensitivity to their complexities.

REFERENCES Acade´mie d’Aix-Marseille. (2008). Carthothe`que. [Free, publicly available outline maps.] From Page accueil du site Histoire-Ge´ographie de l’acade´mie d’Aix-Marseille. Available at http://histgeo.ac-aix-marseille.fr/. Retrieved on March 1, 2008. AEE. (2009). Alliance for excellent education. Available at http://www.all4ed.org. Retrieved on October 1, 2008. Allison, G., & Zelikow, P. (1999). Essence of decision. New York: Longman. Alsbury, T. L. (2003). Superintendent and school board member turnover: Political versus apolitical turnover as a critical variable in the application of the dissatisfaction theory. Educational Administration Quarterly, 39(5), 667–698. Americans for UNESCO. (2005). The United States and UNESCO: Beginnings (1945) and new beginnings (2005). Americans for UNESCO. Available at http://www. americansforunesco.org. Accessed on October 1, 2008. Astiz, M. F., Wiseman, A. W., & Baker, D. P. (2002). Slouching towards decentralization: Consequences of globalization for curricular control in national education systems. Comparative Education Review, 46(1), 66–88. Baker, D., & LeTendre, G. (2005). National differences, global similarities. Stanford: Stanford University Press. Baker, D. P. (1997). Surviving TIMSS: Or, everything you blissfully forgot about international comparisons. Phi Delta Kappan, 78, 22–28. Bositis, D. A. (2002). National opinion poll: Education, 2002. Washington, DC: Joint Center for Political and Economic Studies. Boyd, W. L., Crowson, R. L., & van Geel, T. (1995). Rational choice theory and the politics of education. In: J. D. Scribner & D. H. Layton (Eds), The study of educational politics (pp. 127–145). London: Falmer Press. Boyle, B., & Christie, T. (Eds). (1996). Issues in setting standards: Establishing comparabilities. London: Falmer. British Broadcasting Company. (2002). Global anger at US ‘growing’. BBC News, December 5.

Why the Fireworks?

329

Broadfoot, P. (1990). Changing educational assessment: International perspectives and trends. London: British Comparative and International Education Society. Brown, M. (1999). Problems of interpreting international comparative data. In: B. Jaworski, D. Phillips (Eds.), From comparing standards internationally. Oxford Studies in Comparative Education. Oxford: Symposium Books. Chabbott, C., & Elliott, E. J. (Eds). (2003). Understanding others, educating ourselves. Washington, DC: The National Academies Press. Committee on Science, Engineering, and Public Policy. (2005). Rising above the gathering storm. Washington, DC: National Academies Press. Downs, A. (1972). Up and down with ecology: The issue-attention cycle. In: J. M. Shafritz, K. S. Layne & C. P. Borick (Eds), Classics of public policy (pp. 137–147). New York: Pearson Education, 2005. Elley, W. B. (2005). Evaluating students’ achievements: How TIMSS-R contributed to education in eighteen developing countries. Prospects, 35(2), 199–212. Friedman, D., & Hechter, M. (1988). The contribution of rational choice theory to macrosociological research. Sociological Theory, 6(2), 201–218. Friedman, M. (1962). Capitalism and freedom. Chicago: University of Chicago Press. Friedman, T. L. (2005). The world is flat. New York: Farrar, Straus, Reese, and Giroux. Goldthorpe, J. H. (1998). Rational action theory for sociology. The British Journal of Sociology, 49(2), 167–192. Hamdy, A. (2007). ICT in education in Tunisia. Available at http://www.infodev.org. Retrieved on October 1, 2008. Heyneman, S. P., & Lykins, C. (2008). The evolution of comparative and international education statistics. In: H. F. Ladd & E. B. Fiske (Eds), Handbook of research in education finance and policy (pp. 105–127). New York: Routledge. Iannaccone, L., & Lutz, F. W. (1970). Politics, power and policy: The governing of local school districts. Columbus, OH: Charles E. Merrill. International Association for the Evaluation of Educational Achievement. 
(2007). International association for the evaluation of educational achievement. Available at http:// www.iea.nl International Study Center. (2007). TIMSS and PIRLS international study center. Available at http://timss.bc.edu. Retrieved on December 1, 2007. Kahne, J. (1996). Reframing educational policy: Democracy, community, and the individual (chap. 1, pp. 1–8; chap. 6, pp. 92–118; chap. 8, pp. 146–154). New York: Teachers College Press. Kamens, D. H., & McNeely, C. L. (2009). Globalization and the growth of international educational testing and national assessment. Comparative Education Review, 54(1), 5–25. Kao, J. (2007). Innovation nation: How America is losing its innovation edge, why it matters, and what we can do to get it back. New York: Simon and Schuster, Inc. Kingdon, G. (2007). The progress of school education in India. Oxford Review of Economic Policy, 23(2), 168–195. Koremenos, B., Lipson, C., & Snidal, D. (2001). The rational design of international institutions. International Organization, 55(4), 761–799. Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press. Labi, A. (2008). Obsession with rankings goes global. The Chronicle of Higher Education, 55(8), A.27.


JENNIFER DEBOER

Lokan, J., Adams, R., & Doig, B. (1999). Broadening assessment, improving fairness? Some examples from school science. Assessment in Education, 6(1), 83 (Education Module).
McLendon, M. K., Hearn, J. C., & Deaton, R. (2006). Called to account: Analyzing the origins and spread of state performance-accountability policies for higher education. Educational Evaluation and Policy Analysis, 28(1), 1–24.
National Center for Public Policy and Higher Education. (2006). Measuring up: The national report card on higher education. Available at http://measuringup.highereducation.org/
National Commission on Excellence in Education. (1983). A nation at risk. National Commission on Excellence in Education. Available at http://www2.ed.gov. Retrieved on October 1, 2008.
Organisation for Economic Co-operation and Development. (2007). PISA – The OECD Program for International Student Assessment. OECD. Available at http://www.pisa.oecd.org. Retrieved on October 1, 2008.
Ottobre, F. M. (1976). The work of the International Association for Educational Assessment. Paedagogica Europaea, 11(1), 81–88 (New Developments of Educational Research).
Pew Research Center. (2002). What the world thinks in 2002. Pew Global Attitudes Project. Washington, DC: The Pew Research Center for the People & the Press.
Postlethwaite, T. N. (1999). Overview of issues in international achievement studies. In: B. Jaworski & D. Phillips (Eds), Comparing standards internationally (pp. 23–60). Oxford: Oxford Studies in Comparative Education.
Rotberg, I. (2006). Assessment around the world. Educational Leadership, 64(3), 58–63.
Scott, C. (2010). SA ready to take the TIMSS test again. Mail & Guardian Online, August 3.
Sebatane, M. (1998). Profiles of educational assessment systems world-wide. Assessment in Education, 5(2), 255–269 (Education Module).
Sebatane, M. (2000). International transfers of assessment. Assessment in Education, 7(3), 401–414.
Smith, T. M., & Baker, D. P. (2001). Worldwide growth and institutionalization of statistical indicators for education policy-making. Peabody Journal of Education, 76(3–4), 141–152.
Southern and Eastern African Consortium for Monitoring Education Quality. (2008). SACMEQ. Available at http://www.sacmeq.org. Retrieved on October 1, 2008.
United Nations Development Program (UNDP). (2007). Arab countries participating in TIMSS 2007. Available at http://www.arabtimss-undp.org. Retrieved on October 1, 2008.
Walker, J. L. (1969). The diffusion of innovations among the American states. American Political Science Review, 63(3), 880–899.
White House. (2002a). Joint resolution to authorize the use of United States armed forces against Iraq (News Release). Office of the Press Secretary, October 2.
White House. (2002b, September). The national security strategy of the United States of America. Washington: The White House.
Williams, T., Ferraro, D., Roey, S., Brenwald, S., Kastberg, D., Jocelyn, L., Smith, C., & Stearns, P. (2009). TIMSS 2007 U.S. technical report and user guide (NCES 2009–012). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.
Wiseman, A. W., & Baker, D. P. (2005). The worldwide explosion of internationalized education policy. International Perspectives on Education and Society, 6, 1–21.
Wolff, L. (2007). The costs of student assessments in Latin America. Washington, DC: PREAL.
Zahariadis, N. (1998). Contending perspectives in international political economy. San Diego: Harcourt Brace.

STANDARDIZED TESTS IN AN ERA OF INTERNATIONAL COMPETITION AND ACCOUNTABILITY

M. Fernanda Pineda

ABSTRACT

This chapter discusses some of the criticisms of standardized assessments through a document analysis focused mainly on the web sites of Mexico's and Argentina's ministries of education, and through the theoretical work of diverse authors, mainly critical pedagogues and culturalists. The chapter argues that assessment through standardized testing is a highly political and even commercial process, yet the challenge to compete globally, still perform locally, collaborate in solidarity, and decide collectively whose knowledge is of most worth remains before us. As exemplified by Mexico's ENLACE test, standardized tests tend to show a negative bias against minorities and a tendency to privilege certain values and knowledge. Rather than having an international organization or policy dictate what is worth knowing and testing, countries should seek out partnership opportunities with teachers and communities so that learning can be assessed collectively, and should consider not adopting policies passively. In this way, assessment can still help countries compete globally, perform locally, and collaborate in solidarity.

The Impact of International Achievement Studies on National Education Policymaking
International Perspectives on Education and Society, Volume 13, 331–353
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1479-3679/doi:10.1108/S1479-3679(2010)0000013015


INTRODUCTION

Over the past decades we have experienced an increased focus on evaluating learning, observable at both the local and the global levels. Evaluating learning, nonetheless, is something that should be done periodically, for it helps us (educators, learners, policy makers, and community members) to adjust and reorient our practices, to celebrate achievements, to assure quality and equality, and to ensure that learning and improvement are taking place. Educators have the ability to design and implement organic, holistic, and efficient ways to do this, ideally working collectively with their students and communities. However, the assessment process at all levels is highly political and perhaps even commercial. Many creative minds might be shut down and turned into mere test takers, and an important body of knowledge is probably made invisible because it is "not on the test." This is part of what we face in this era of "outcome-based bureaucratic accountability" (Gabbard & Ross, 2004).

Around the world, numerous governments search for what are considered effective policies (OECD, 2007; Kamens & McNeely, 2010). Even though there are regional differences, and the presence and capacities of international organizations vary from country to country (Kamens & McNeely, 2010), the adoption of standardized assessments and related policies has markedly increased. In international arenas, tests and their creators are having a great impact on national curricula and educational policies (Spring, 2008), bringing along numerous challenges as well as perspectives to explore and rethink. The challenge to compete globally, still perform locally, collaborate in solidarity, and decide collectively whose knowledge and whose policies are of most worth is still before us.
This chapter discusses some of the criticisms of international and national standardized testing in relation to teachers, students, curricula, and knowledge, and offers insight into rethinking standardized tests in an era of international competition and accountability.

THE POLITICAL PAST AND PRESENT OF STANDARDIZED TESTING: SOME HIGHLIGHTS

Education and power are an indissoluble couplet (Apple, 2000), and the link becomes even more visible when we talk about assessment (in particular, the use of standardized testing) and accountability, a word that resonates with authority and power (Gabbard, 2000). The history of the increased use of and importance given to tests (hence, accountability) makes this visible, as it relates closely to international competition and impacts policy making and practice, as this volume points out. This is observable in the history of the United States, for example, where international competition made way for standardized testing after events such as the launch of Sputnik in the 1950s and the Nation at Risk report in the 1980s. These events plugged the United States into a race for testing and accountability like never before. In just the past decade, the United States reached the acme of testing with the policies implemented under the No Child Left Behind Act of 2001 and now the Race to the Top Fund, "a national competition which will highlight and replicate effective education reform strategies," which includes, among other aspects, the adoption of "internationally benchmarked standards and assessments that prepare students for success in college and the workplace" and the creation of "data systems that measure student success" (USDE, 2009).

Presently, numerous nations are undergoing reforms in which similar policies and programs highlighting testing and accountability have been or will be implemented. Noticeably, behind or throughout these reforms, the names of well-known international institutions continuously come up. Another common denominator is that national standardized tests often make comparative reference to international assessments, mainly the Trends in International Mathematics and Science Study (TIMSS), the Programme for International Student Assessment (PISA), and the Progress in International Reading Literacy Study (PIRLS) (discussed later), making the impact of international achievement tests on national education policy making all the more visible. These institutions and their tests reinforce the paradigm of the knowledge-based economy, competition, accountability, and other global discourses that set national agendas for education worldwide (Spring, 2008).
An example of the impact of standardized assessments on a country's educational policies is observable in Mexico, a country that has embraced the philosophy of national assessment systems since the early 1990s (Benveniste, 2002). Recently, however, there has been an unprecedented nationwide reform to comply with the dominant international testing/assessment models and with the national desire to become a competitive country that provides quality education for all. ENLACE (Evaluación Nacional de Logro Académico en Centros Escolares/National Evaluation of Academic Achievement in Schools) is the test that Mexico has administered nationally and consecutively since 2005 in elementary and middle schools (3rd grade and 9th grade), and since 2007, in high schools (ENLACE, n.d., a).1 A body that assists in articulating and implementing the educational reform (of which ENLACE is part) is the Alianza por la Calidad de la Educación (Alliance for the Quality of Education), a consortium of the Ministry of Education (Secretaría de Educación Pública, SEP), the Ministry of Social Development (Secretaría de Desarrollo Social), the Ministry of Finance and Public Credit (Secretaría de Hacienda y Crédito Público), and the Ministry of Health (Secretaría de Salud). In brief, this alliance between the Federal Government and the teachers of Mexico, represented by the National Union of Workers of Education (Sindicato Nacional de Trabajadores de la Educación; ACE, n.d., a), calls for legislators, parents, students, civil society, entrepreneurs, and academics (ACE, n.d., a) to be part of the improvement of Mexico's education. The Alianza envisions that "the Mexican society watches and owns the different commitments [compromisos] needed for the transformation of the education system" (ACE, n.d., a, p. 5). In principle, civil society watch is a very attractive idea; it creates a culture of transparency, accountability, and civic participation, usually characteristic of modern and developed countries. The advocacy work and participation of civil society organizations around the world have increased substantially at regional, national, and international levels in the past decade (UNESCO, 2008b). Suma por la Educación (Group in Favor of Education, or Add for Education, using the pun) is the civil society group that organized the 5,000 "observers" who monitored the application of ENLACE in March 2009 nationwide (Suma X la Educación, 2009a; ENLACE, n.d., b). Suma is a consortium of over 40 civil society organizations and aims to work "co-responsibly" with others involved in education in Mexico (Suma X la Educación, 2009b). Without any intention to diminish the observers' efforts, or the Ministry of Education's enthusiasm for having citizens watch ENLACE's processes, the urban and mainstream culture bias in all of this reaches almost alarming levels and cannot be ignored.
The EFA 2008 report highlights that in Latin America, our assessments tend to show "strong disparities in favour of urban students, reflecting both higher household incomes and better school provision in urban areas" (UNESCO, 2008a, p. 18). There is not one group within Suma that works specifically with indigenous children; most of the groups are located in urban areas (mainly Mexico City), and the observers were able to visit only 5,000 schools nationwide, representing only 6% of the total number of schools that completed ENLACE (Suma X la Educación, 2009a). Nonetheless, ENLACE is administered nationwide, even in remote areas such as Metlatónoc, one of the poorest municipalities in the state of Guerrero, Mexico, where over 50% of the population speaks either Mixteco or Tlapaneco (indigenous languages) (EMM, n.d.). Of the 24 villages in Metlatónoc that completed ENLACE, 15 villages had 0% of students (n = 641) reaching even the "good" level, 7 villages had from 7 to 9.5% (n = 677), and only 1 had 30% of students (n = 23) achieving the "good" or "excellent" level (ENLACE, 2009a).

As noticeable as the absence of rural or indigenous groups from Suma is, the opposite is true of ENLACE's advisory group (consejo técnico). Two representatives from the SEP's Dirección General de Educación Indígena (Office of Indigenous Education) are part of this consejo (ENLACE, n.d., c). There are, however, no representatives from the Coordinación General de Educación Intercultural Bilingüe (Office of Intercultural Bilingual Education [CGEIB]) or from any department for children with special needs. Nonetheless, the ENLACE test for 3rd grade (a sample is available online) includes reading-comprehension questions related to a passage about pets with passports (p. 8), a talking rhinoceros (the reading is Rino, by Patrick Goldsmith; p. 1), and the interpretation of a flyer for a summer camp called "Aquatic Gym Sport" (p. 16), as well as a math question about the people-counter device at the zoo's entrance (p. 3) (ENLACE, 2009b). In the Formación Cívica y Ética (Civic and Ethical Development) section, there are numerous interesting questions (for example, about government complaints and community organizing), but one in particular seems to allude to intercultural relations: a scenario in which a girl who came from a rural area "speaks funny" and children do not want to play with her until she wins a race on a team with another classmate (p. 25). The answer choices are ambiguous, and many assumptions are necessary to answer correctly. ENLACE is part of strand #5 ("Evaluar para Mejorar," "Evaluate in order to Improve") of the Alianza por la Educación. Recently, the World Bank agreed to evaluate Mexico's reform project, suggesting that many other nations might benefit from adapting this model (ACE, n.d., c).
These examples signal the problematic nature of standardized testing, which is open to bias and tends to overlook differences, broadening gaps in our societies. Another example of how the wave of systemic evaluation of academic attainment through standardized instruments has spread, transforming entire educational systems, is Argentina.2 According to Benveniste's (2002) research, since 1993 Argentina has administered a national test in grades 3, 6, 7, 9, and 12. The subject areas tested are language, mathematics, social science, and natural sciences (the latter two in 6th and 12th grades). When the scores are analyzed, teaching materials are tailored to address the weaknesses shown on the test. Argentina also administers the PISA test and OREALC-UNESCO's SERCE (Segundo Estudio Regional Comparativo y Explicativo; ME, n.d.). International organizations such as the World Bank and the IADB provide Argentina with financial support to implement and continue its national standardized assessment (Benveniste, 2002). Similar to Mexico's standardized testing aims ("evaluar para mejorar"), Argentina also seeks to prevent differentiation across regions and to provide information that would help allocate resources and portray the performance of the various actors involved (teachers, undoubtedly; Benveniste, 2002).

A publication of Argentina's DINIECE (Dirección de Información y Evaluación de la Calidad Educativa, or Office of Information and Evaluation of Educational Quality) provides a snapshot of the country's complex assessment system. Similar to Mexico's Alianza por la Educación, Argentina seeks to embrace a culture of participatory and systematized assessment nationwide (Leon, Scorzo, & Novello, 2009). Its plans, elaborated in detail and justified by notions of "innovation" and "international tendencies," as outlined in DINIECE's 2009 publication Hacia una Cultura de Evaluación, are similar to the Mexican case. The plan, however, is relatively silent with regard to indigenous students and children with special needs. It does call for the assessments to be "carefully designed" (Leon et al., 2009, p. 42) and highlights that for the performance levels of students to be comparable, their social conditions of origin should be considered (Leon et al., 2009). As exemplified in Mexico and Argentina, the culture of evaluation, or the project to "evaluate in order to improve," points to an assumed direction of progress and improvement, following international tendencies (Leon et al., 2009). Across the globe, as numerous authors have pointed out, "the 'need' to test or assess student populations is spreading as a taken-for-granted assumption" (Kamens & McNeely, 2010, p. 6).
On a comparative and inquiring note, Norway, for example, has not yet established a system of national standardized testing, and it does not grade children with a formal assessment until they are 13–14 years old (Tveit, 2009). Nonetheless, Norway's Human Development Index is .971 (the highest in the world in 2007) and its Gini index is 25.8. Argentina's HDI is .866 (ranked 49th), with a Gini index of 50.0, and Mexico's is .854 (ranked 53rd), with a Gini index of 48.1 (HDR, 2009). Can we factor in any real effect of standardized tests that would suggest a causal relationship between standardized assessment, human development, and equality? Should the focus shift from establishing a complex assessment system toward a culture of access, autonomy, respect, creativity, and collective evaluation?

The current situation of standardized testing is more political and international than ever. The standardized assessment culture plunges the conversation into the arena of a global or world culture, best practices, and the homogenization of learning (Spring, 2008) to ensure "fair competition," terms that many could consider unproblematic on the surface. There are, nonetheless, questions that must be answered, "givens" to be revisited, and actions of government and international organizations' officials to be deconstructed, especially when adopting policies of standardized testing that affect national curricula. Making (and unmaking) educational policies, together with funding and curriculum decisions, is always political (Levin, 2008; Apple, 2000, 2004). Some politicians might profit from implementing standardized testing, even more so if the experts come from international organizations that all wealthy countries seem to endorse and trust for best practices. Assessment as a political phenomenon "reflects the agendas, tensions, and nature of power relations between political actors" (Benveniste, 2002, p. 89). The boost in political popularity that an educational policy appealing to quality, accountability, better performance, and progress can generate cannot be overlooked (this is by no means intended to dismiss the positive consequences of these assessments, as discussed later). These types of discussion, about quality, accountability, and better performance, tap into "common sense," as if these ideas happened naturally (Apple, 2000). Who would not vote for someone who does not want to "leave any child behind," wants us to "race to the top," or is in an "alliance for the quality of education"? What president would not want his or her country to perform highly in an international assessment that all industrialized countries are taking? As Kamens and McNeely (2010) point out, "fewer and fewer countries [and their elites] imagine that they will achieve the status of the 'good society' without high levels of formal education and accompanying efforts at national assessment and/or international testing" (p. 19).
Inasmuch as criticisms of reforms, policies, and assessments flourish in innumerable publications, abandoning ourselves to the pessimism that all international/standardized assessments and their policies are planned with double or hidden agendas is unfruitful. Equally unfruitful is remaining in a bland or expectant position, passively receiving instructions and models while hoping to one day become like a country that scores at the top in everything. May the protests and criticisms resonate among policy makers and be a clear call for critical scrutiny and perspective-seeking. In the broad and rich spectrum of criticisms, we find denunciations of the privatization of public education, issues of power struggles (as discussed earlier), and silenced voices. Elaborating on privatization is beyond the scope of this chapter, but it is relevant to mention nonetheless. The logic behind the hyperemphasis on testing and accountability, it is argued, might be to eventually privatize education (Apple, 2001; Gabbard & Ross, 2004; Spring, 2002). In relation to the wave of privatization, however, I join Apple (2001) in asking, "is private better, and better for whom?" What follows are short sections on what goes on with schools, teachers, and learners in relation to standardized tests, in an attempt to highlight the power struggles and the silenced voices.

STANDARDIZED TESTS AND SCHOOLS

The implications of test (under)performance are very tangible. In Miami, FL (USA), a multicultural city, numerous schools are not achieving adequate yearly progress (AYP). It is a common assumption that going from "B" to "A" is real progress, and that going from "B" to "F" is eternal doom. Public figures (mainly government officials and politicians) make themselves look righteous when they punish schools for underperforming. For example, Miami Edison Senior High School, located in "Little Haiti," was about to close a couple of years ago because of consecutive years of not achieving AYP. Many of my (bright and committed) colleagues taught there at the time, and they were visibly upset by the AYP threats. As if the solution to an "underachieving" school were closing it. Thankfully, Miami Edison Senior High School was able to continue operating and raised its scores (though the focus is still on scores; McGrory, 2009). Educational policies should be geared toward supporting communities, their schools, and their collective efforts, and should be backed up by funding. Funding and achievement on standardized tests usually go hand in hand, in spite of studies conducted to deny it (Berliner & Biddle, 1995), a link recognized even by the OECD, an institution that is often criticized for some of its approaches to education (see Spring, 2008, 2009). The "other side of the coin" in funding, though, is that the funds are often geared toward further testing (Apple, 1995). At Miami Edison Senior High School there is now personalized preparation for standardized exams (McGrory, 2009). It is the role of teachers, students, and communities to be watchful.

STANDARDIZED TESTS AND TEACHERS

Accountability, as mentioned earlier, is a statement of hierarchical power (Gabbard, 2000; Gabbard & Ross, 2004). Teachers are expected, as a result of federal and international pressures, to answer for the education of a nation. In other, rather simple words, they are usually held accountable for what comes into and goes out of schools, as in factories. The curricula tailored for the tests are almost like scripts: teacher-proof (Shor & Freire, 1987; McLaren, 2007), and in many countries or states, teachers' salaries and evaluations depend on scores. The freshest example is Florida's: in April 2010, numerous teachers and teacher educators protested in favor of a veto of Senate Bill 6, which would allow teachers' evaluations to depend heavily on test scores and under which advanced degrees held by classroom teachers would no longer bring higher salaries (Florida Senate, 2010). In Mexico, teachers' evaluations are linked to students' performance on ENLACE and PISA (ACE, n.d., a, n.d., b). All this, with few exceptions, is a statement of the poor professional treatment of (mainly public school) teachers (McLaren, 2007; Freire, 1998), who are often portrayed as incompetent. There have even been numerous statistical studies showing that those who chose teaching as a career scored lower on standardized tests (Berliner & Biddle, 1995); we must be cautious and look closely for spurious relationships. Statistical studies are not gospel (Berliner & Biddle, 1995); "evidence" will always be "someone's evidence," and teachers should be respected as the professional educators that they are. UNESCO and the ILO outlined, over four decades ago, a document on the professionalization of teachers. This document indicates, among other recommendations, that teachers are valuable specialists and that merit rating for them should be avoided (UNESCO, 2002). Part of being a professional is having the freedom to make choices (Hoyle, 1982) in the classroom. Scripted teaching to a standardized assessment might cause this freedom to ebb. Freire (1998) made a call for respect for the teaching profession. He considered educators more like cultural workers and political agents.
He called for professional recognition of teachers grounded not in arrogance, but in abilities. Harding (1990) also points out that, historically, teachers have proven able to organize in favor of important social movements. Those in the classroom should not be considered bureaucratic pawns, but educators impacting lives (Freire, 2005). Teachers are capable of evaluating learning in numerous ways; they typically know their communities and their students, and collaboratively they can inform society (or the state) about areas of improvement. Contrary to many policy makers' beliefs, teachers in general "favor appropriate, high-quality, responsible student testing" (Spring, 2010, p. 46), but the international tendency is toward the de-professionalization of teachers and the (oftentimes a priori) disqualification of their abilities or dispositions. The point is not to get rid of standardized tests or to create a more dichotomous environment between the state and schools, but to embrace a spirit of collaboration and respect. We should all (teachers, students, parents, elders, policy makers, and so on) be part of the "evaluate to improve" project.

STANDARDIZED TESTS AND LEARNERS

In the hype for testing, students, together with their families, groups, and cultures, might be "sorted out" as they perform, meaning that they are locked into certain amounts of funding, lower salaries for staff, or special types of "interventions." These are some of the possible ramifications (let alone the psychological effects) of standardized tests in this outcome-based accountability era. Part of the neoliberal agenda of standardized tests and education is to dictate students' "use-value" (Gabbard & Ross, 2004) depending on their performance. This has strong linkages with human capital arguments (Spring, 2008; Gabbard & Ross, 2004). On a stronger note, neoliberal states turn children into securities, making children insecure (Gabbard & Ross, 2004). In other words, the test results, together with many other factors like cultural and tangible capital, become the ticket to one's bright or dark future. As stated earlier, the problems lie not in the notions of assessment and accountability. The worrisome aspect is when testing directly affects students and communities in almost irreversible ways. Aronowitz and Giroux (1991) denounced that failing standardized tests might lock one into a secondary labor economy. Individual agency has allowed many to overcome blunt social inequalities, but the troublesome fact is that this does not happen among communities of strong financial and cultural capital. In a neoliberal state, education tends to assign individuals (in this case, students) a "use-value" (Gabbard, 2004); an argument of human capital dominates the global discourse (Spring, 2008) when it comes to international assessments and reforms. What happens to those who do not represent the desired capital?
What happens to the children who score 0% on a national test and, in addition, live in extreme poverty, are in some cases still learning Spanish (how their languages survive would be worth another chapter alone), experience alarming teacher absenteeism and high rates of migration, and very likely do not know about pets with passports, rhinoceroses, or summer camps? Very likely the children of Metlatónoc know how to tell the time without a watch, can differentiate the seasons in great detail, and can describe animal reproduction in ways that other children cannot. Even though "Mexico has a long history of developing compensatory programmes aimed at dispersed rural communities and at migrant and indigenous populations" (UNESCO, 2008b, p. 107), the road is still long and arduous. A humbling and enriching experience would be to see international consultants and national officials incorporate into their plans and into their tests the creativity and perspective (and struggle) of those who, on the chart, represent a zero. It would also be humbling and enriching if the official websites of international organizations and ministries of education reported the narratives of those whose "use-value" feels likewise reduced to zero.

WHOSE KNOWLEDGE IS OF MOST WORTH? A CALL FOR COLLABORATION AND RETHINKING CURRICULA AND ASSESSMENTS

A highlight in the discussion of standardized assessment is knowledge itself. Part of the global culture of these assessments tends to emphasize "the knowledge economy." The knowledge economy assumes strong links between what students learn at school, higher incomes, and competitiveness in the global marketplace. Human capital gains "ascendancy over both labor and raw materials as a vehicle for increased productivity and economic growth" (Benveniste, 2002, p. 91). Furthermore, in the knowledge economy, "wealth [is] tied to knowledge workers and ultimately to educational systems" (Spring, 2008, p. 337). The capitalist democratic state is looked upon to provide justice and equity "to compensate for inequalities arising out of the social and economic system, [and] education's role then is seen as improving the social position of have-not groups by making relevant knowledge and certification for participation available to them" (Carnoy & Levin, 1985, p. 27). The battle over selecting and administering this relevant knowledge is more visible than ever in this era of international/national assessments; however, it is not new. Knowledge, with its hierarchies and its structures, has played very important roles in societies since the beginning of civilization. Knowledge can have the power to divide, create, perpetuate, or destroy cultures, social structures, and even entire societies. History provides several examples of how knowledge is crucial to humans and their groups, especially when it comes to issues of power and knowledge. Plato in his Republic emphasized the hierarchies of a society that would be perpetuated through knowledge.


M. FERNANDA PINEDA

He favored that the rulers (and their children) receive one type of education (knowledge), the warriors another, and the common people yet another. By doing so, society – through the "proper" distribution of knowledge – would designate and maintain everyone's role. We see a very similar system in the Aztec empire. In brief, the Aztecs had two types of educational institutions: the Calmecac and the Tepochcalli. The Calmecac (House of Priests) was for the children of royalty and religious leaders, who were taught religion, governance, and the sciences; their place was assured in either the government or the temples. The Tepochcalli (House of Youth), by contrast, was the institution for commoners, merchants, or people with some kind of trade (Keen & Haynes, 2000). Knowledge, as history shows, impacts societies and people's lives directly. This is the case even today. In the field of education, the questions of knowledge and the politics around it are also of vital importance to guide, create, legitimize, and discontinue policies. Curriculum theorists from across a wide political spectrum have long considered the question "what is worth knowing?" (a question that goes hand in hand with "what is worth testing?"). We can trace this question back many centuries to numerous thinkers, but Herbert Spencer, the 19th-century philosopher, comes to mind because many of his sociopolitical orientations prevail in curriculum even today. Spencer was a Social Darwinist who firmly believed in and researched "the survival of the fittest." He considered that the only "useful" knowledge worth pursuing and teaching in society was scientific knowledge, and that the only way to function in increasingly complex social structures (from families to clans, to tribes, to heterogeneous societies) was through it (Elliot, 1917).
Standardized Tests in an Era of International Competition and Accountability

Scientific knowledge is unquestionably very valuable – consider medicine or physics – but it also has harmful powers for societies – consider the idiosyncrasies of the survival of the fittest, embodied decades later in Nazi thought and practice. The question of "what is worth knowing" is especially important for those curriculum theorists/educators aware of the potential of knowledge to harm or edify us, of the politics of legitimization and knowledge (Apple, 2000), of the different types of knowledge that exist (Young, 2008), and of the subjugation (Spring, 2008) they undergo. The question of what is worth knowing matters because it has implications for our everyday life, for social justice, and for the politics of education including, of course, assessment. The value of knowledge impacts our daily lives: we make many decisions – our majors in college, our religious or political affiliations, the cities we move to, and even our purchases/consumption (Stromquist, 2002) – based on what we consider valuable. Knowledge is our symbolic interaction with the world (Young, 2008) and helps us make sense of life, together with culture and, in some cases, beliefs/religion.

Moreover, the question of whose knowledge is of most worth has important implications for public policy. If we value what we know, we pursue its implementation in society. An example is the recent H1N1 pandemic: based on scientific knowledge of viruses and infections, governments implemented public health contingency policies and the population followed. However, there are also highly politicized implications for social justice and curriculum when it comes to the question of what and whose knowledge is of most worth, because knowledge is at the heart of the social (Young, 2008). Knowledge is a form of capital, a form of commodity (Apple, 1995, following Bourdieu's theory of cultural capital), and there is a selective tradition "in which only specific groups' knowledge becomes official knowledge" (Apple, 2000, p. 62), which is then reflected in international assessments. The present paradigm of standardized assessments seems to indicate: "if it is valuable, it is tested on." In terms of social justice, when some groups' knowledge (linked to their cultures and in their languages) is considered inferior, the treatment they receive (overtly or covertly) is also inferior. This could sadly translate into larger consequences such as that group's language loss; many indigenous languages across the globe face this threat (Deyhle, Swisher, Stevens, & Galván, 2008). The relationship between knowledge's worth and social justice is evident; as May (1999) points out (elaborating on J. Crawford's work), these types of losses seldom occur in communities of power, but among the disenfranchised and dispossessed.
Part of the conversation about knowledge's worth is awareness, at the personal level, of who decides what is worth knowing. In the private and individual arena, there is usually more agency: depending on your socialization and your idiosyncrasy, you will hopefully pursue and operate on the knowledge that is of worth to you. However, this can be a slippery slope. If you are constantly bombarded with messages that your or your group's knowledge (your culture, your language, your experiences, and your religion) is of no worth and not relevant enough to be tested on – mainly for economic or social reasons – you might pursue knowledge that is of more worth to someone else, or that belongs to an intellectual community that simply invalidates your knowledge or relegates it to second class. This constantly emerged in a study conducted among high school students in rural and urban settings in Guerrero regarding indigenous knowledge and the possibility of the students attending an intercultural university. Numerous (though not all) adolescents from indigenous and nonindigenous backgrounds openly rejected indigenous knowledge/culture/languages because they were of "no value" or "no use" (Pineda, 2007). Nonetheless, the students who embraced the project expressed a keen awareness of the hierarchies of knowledge and power (Pineda, 2007).

Who decides what is worth knowing, and therefore tested, in the public and collective realm is a political battle (one that could crystallize into a physical battle, in extreme cases). This is usually linked to power: sometimes the decision is imposed on populations (as in the Crusades or in Nazi times), and sometimes it is the result of struggles for the recognition of someone's knowledge, though the knowledge of the powerful, rather than powerful knowledge (Young, 2008), is more likely to emerge. Aronowitz and Giroux (1991) argued against this practice of allocating value to someone's knowledge because of the apparatuses behind it – because this is a struggle between the powerful and the powerless, it is usually the former who gain attention. It is important that educators (teachers in the classroom) grow in awareness of this battle so that students do not become passive recipients (Freire, 2000) of "knowledge" that we really do not value, that we are compelled to value, or, vice versa, knowledge that robs us of our value. Learning takes place in a myriad of settings besides schools – homes, apprenticeships, voluntary organizations, local community settings, and political campaigns (McCowan, 2010) – and a standardized assessment might never be able to capture it.
None of this is intended to suggest that international/standardized assessments should be rejected or always viewed with suspicion, as argued later herein; it is rather a call to rethink the politics of knowledge embedded in international assessments. Educators should protect learners' agency and awareness of what knowledge is valuable to them and their groups, understand how globalization transforms schooling (Stromquist, 2002), challenge the – sometimes incredibly prescriptive – reforms/assessments, and openly discuss and analyze the positive and negative consequences of these transformations in educational systems (Stromquist, 2002) and curricula. Freire (2000) constantly pointed out that the curriculum (and, in this context, the assessments) must be relevant to your life and community, however trite this may sound (Nielsen, 1997). This is part of realizing – at least at the personal level – what and whose knowledge is worth knowing. However, this immense challenge is usually shaped by economic reasons and by an overwhelming reality: the need for jobs and access to technology, and the knowledge valued in a knowledge-based economy (Spring, 2008), as I briefly elaborate in the next section.

WHOSE POLICIES ARE OF MOST WORTH? A CALL FOR COLLABORATION AND RETHINKING INTERNATIONAL/NATIONAL ASSESSMENT POLICIES

In the international arena, countries that score quite low on international standardized tests might be regarded as not competitive enough, raising anxieties about international competition that usually result in stricter, more prescriptive, teach-to-the-test national curricula. These countries are likely to request a set of recommendations from international agencies' experts to sharpen their competitive edge for the next testing round (Kamens & McNeely, 2010). Some countries use these tests and their results to profit politically or to make policy decisions (Spring, 2008), as discussed earlier. By contrast, countries that excel on international standardized tests become the center of attention for researchers and politicians. Such is the case of Finland after its consecutive successes on PISA. Finland's "success has … been a hot topic in international media and in different seminars and conferences, either ex cathedra or in the corridors" (PISA 2006 Finland, n.d.). These experiences – learning and comparing from elsewhere – are now the norm rather than the anomaly, because the "international perspective is now considered indispensable" (Steiner-Khamsi, 2004, p. 1).

If too much emphasis is placed on policies of standardization and assessment, there could be a threat to what localities consider to be of value in their children's education. Carney (2003), through his work in Nepal, criticizes this global standardization and testing wave, arguing that fostering cognitive achievement without assessing local values (or what people value in their education) is a form of westernization. For many scholars, this is actually part of what they call the world culture.
Schriewer and Martinez (2004), Spring (2008), and Holland (2010) provide a comprehensive summary of what world cultural theorists or neo-institutionalists argue. According to them, the world slowly converges toward a global culture (Spring, 2008) in which values such as progress and justice take precedence (Meyer et al., 2006, as cited in Holland, 2010) and countries design their curricula and their educational systems in increasingly homogeneous ways. Among the numerous critics of this stance, as Spring (2008) outlines, are the postcolonialists, who consider it westernization and domination; the world systems theorists, who hold that "the core countries are trying to legitimize their power by using aid agencies" (p. 335); and the culturalists, who reject the homogeneity of systems around the world and highlight local agency (Spring, 2008).

Just as in the case of Mexico or Argentina, worldwide "the creators of tests rather than curriculum developers or teachers become the arbiters of what should be taught" (Madaus, 1988), a criticism over two decades old. In some instances, national exams and related material are somehow linked to international/standardized assessments; an example is the OECD's PISA and ENLACE in Mexico. This might be paving the way for an agenda of implementing policies of a neoliberal nature (Spring, 2008); unless the assessment is done in a collaborative manner, with both a local and a global perspective, this would very likely be the case. These assessments should not plunge countries and their children into a race, "educated" by and large to compete and consume rather than to care and conserve (on a more environmental note; Orr, 2004; Sterling, 2001), to focus on a test rather than on discovery and analysis, and to accept homogeneity of educational and even economic models as entire bodies of knowledge are – almost openly – relegated to second class. Among the negative consequences of the ongoing transformations of education worldwide is the "narrowing of what is considered knowledge worth learning" (Stromquist, 2002, p. 57).
Standardized assessments should not put at odds local practice and global content – the content that allows us to live the kind of life we value (Drèze & Sen, 1995), so that our human capabilities and freedoms – not only our human capital – flourish. The hype of testing might put countries in (unfair) competition with each other, while the powerless get little room and little agency (some questions arise: has ENLACE been translated into the Tlapaneco language? Is there a Braille version for children with special visual needs?). Also, as mentioned earlier, these tests might make governments adapt their national curricula to market-based needs (Spring, 2008), and for educators who consider education to serve more holistic purposes than just getting a job (in the city, very likely), this might not be the desired answer. There are, nonetheless, very informative outcomes from international testing, and many inequalities that these tests help us unveil, as UNESCO's EFA reports indicate, as I will discuss herein.



BEYOND THE ZERO PERCENT ACHIEVEMENT

When almost an entire community receives a zero percent achievement of "good" levels, there is a local reality crying out to be unveiled. Academic achievement, "apart from inherent ability," is "the product of social, economic and cultural circumstances, such as gender, language spoken at home, parental occupation and education, family size and immigrant status" (UNESCO, 2009, p. 21).

TIMSS was recently administered worldwide. This exam is given to fourth- and eighth-grade students and seeks to measure achievement in math and science. Chinese Taipei, the Republic of Korea, and Singapore scored the highest worldwide (for grade 8) in mathematics on the most recent TIMSS, but in comparison with the 1995 TIMSS, the country that has improved the most (in grade 8) is Colombia, followed by Lithuania (IES, n.d.). This result shows more than a higher test score, especially considering that Colombia has endured decades of internal guerrilla warfare and numerous other problems. These achievements should be used to celebrate the work of Colombian teachers, students, and their families, and perhaps collaborative endeavors between the Asian countries and Colombia could take place.

Another international test is PIRLS, which assesses reading and literacy behaviors among elementary school children. UNESCO's EFA reports, for example, use PIRLS (and other) scores to inform policies of gender equity around the world. EFA reports show that girls often perform much better than boys once they overcome the obstacle of getting into school, most notably in countries that were struggling to provide quality education for girls. In several nations, "girls are less likely than boys to get into school. Once in school, though, they tend to perform as well as, or better than, their male classmates" (UNESCO, 2010, p. 19).
Many international organizations use these test results to advocate, with robust evidence, for including girls in schools around the world (like UNICEF's "Getting Girls in School" program). It would be very enriching to learn how ENLACE informs the Mexican Secretaría de Educación Intercultural Bilingüe (Ministry of Intercultural and Bilingual Education, SGIEB) about how this evaluation will help indigenous children – in rural and urban settings – to perform academically and yet live an intercultural life: the kind that expresses connections, differences, contradictions, and even conflicts between groups/cultures/people, addressed in dynamic, respectful, and dialectic ways (Khôi, 1994). Interculturalism is an aspect of the Alianza project and is actually part of strand #4, "Curricular Reform" (ACE, n.d., a); however, very little has been published on the SEP official website on this specific theme.



Finally, PISA is the international assessment used by the OECD. The 2006 report indicates that the country that scored best was Finland (OECD, n.d.). A very enriching exchange would be a dialog between Finnish and Colombian educators (given the TIMSS results), community organizers, parents, and students; then we would be able to make an organic comparison. Important questions to ask when importing and receiving these international/national standardized assessments are those Carney (2008) poses, having studied learner-centered pedagogy in Tibet and analyzed the tensions built into reforms based on international standards: "Do national policies and visions emerge in recognisable form in the classroom, and can efforts in the classroom overcome the social structures and norms that have shaped … [the country's] educational practices? What are the implications for minority groups?" (Carney, 2008, p. 40). Another important task is to find stories of comparable contexts to draw on similar experiences (Mexico is part of the OECD – the only Latin American country – and always scores at the bottom in comparison to the other member states. To whom can we really compare?), hoping to find the niche of the non-homogenizing effects (Steiner-Khamsi, 2004) of the borrowed or lent policies, the "hybrid homegrown version of educational reform" (Spreen, 2004, p. 101), and the adaptation of good practices even "against the desires of local elites" (Spring, 2008, p. 336, quoting the work of Beverly Lindsay in Zimbabwe). We should also consider that international assessments "further point to inequalities in learning outcomes within countries" (UNESCO, 2008a, 2008b, p. 68). We should use them also to unveil and discuss persistent inequalities and challenges, and to inform our policies.
International examples and possible recommendations abound in UNESCO's EFA reports regarding international assessments and gender equity, the inclusion of children with disabilities, the importance of resources, well-being factors, and numerous other aspects that are key to our work for a better world. There is no perfect assessment, no perfect reform, and no perfect educational policy (and there are no perfect teachers, surely). However, the opportunities for progressive possibilities (Apple, 2000) and for collaborative efforts in solidarity and respect surely abound and must be highlighted.

CONCLUSIONS

We live in challenging yet hopeful times in which education and knowledge play crucial roles in the sustainable, organic, and holistic development of countries. The constant presence of international standardized assessments around the world reminds us of this. In international arenas, tests and their creators are likely to have a great impact on national curricula (Spring, 2008) and on educational policies, almost to a market-driving extent, bringing along numerous challenges and perspectives to explore and rethink. The challenge to compete globally yet perform locally, collaborate in solidarity, and decide collectively whose knowledge and whose policies are of most worth is still before us. International assessments should be considered incomplete proxies of achievement or learning; they will never crystallize what is really going on in the world's classrooms or in a child's reality. Assessment should help us improve our practices and celebrate our achievements – while we watch closely that educators are not transformed into test-givers, our students into test-takers, and international standardized assessments into absolutes.

NOTES

1. I thank Dr. Carlos Ornelas (Universidad Autónoma Metropolitana-Xochimilco) for sharing informative links with me regarding ENLACE.
2. Benveniste (2002) provides a detailed comparison between Argentina, Chile, and Uruguay and their systems' transformations as they implement standardized assessments.

REFERENCES

Alianza por la Calidad de la Educación (ACE). (n.d., a). Alianza por la calidad de la educación entre el gobierno federal y los maestros de México representados por el sindicato nacional de trabajadores de la educación. Vivir Mejor. Available at http://alianza.sep.gob.mx/pdf/alianzabreve.pdf
Alianza por la Calidad de la Educación. (n.d., b). Avances. Eje 5: Evaluar para mejorar: Sistema nacional de evaluación. Available at http://alianza.sep.gob.mx/index_017.php
Alianza por la Calidad de la Educación. (n.d., c). El Banco Mundial acepta evaluar a México en materia educativa. Available at http://alianza.sep.gob.mx/pdf/boletines/Boletin27junio08.pdf
Apple, M. (1995). Education and power (2nd ed.). New York: Routledge.
Apple, M. (2000). Official knowledge: Democratic education in a conservative age (2nd ed.). New York: Routledge.
Apple, M. (2001). Educating the "right" way: Markets, standards, God, and inequality. New York: Routledge Falmer.
Apple, M. (2004). Ideology and curriculum (3rd ed.). New York: Routledge.
Aronowitz, S., & Giroux, H. (1991). Postmodern education: Politics, culture and social criticism. Minneapolis, MN: University of Minnesota Press.



Benveniste, L. (2002). The political structuration of assessment: Negotiating state power and legitimacy. Comparative Education Review, 46(1), 89–118.
Berliner, D., & Biddle, B. (1995). The manufactured crisis: Myths, fraud and the attack on America's public schools. New York: Addison-Wesley Publishing Company, Inc.
Carney, S. (2003). Globalisation, neo-liberalism and the limitations of school effectiveness research in developing countries: The case of Nepal. Globalisation, Societies and Education, 1(1), 87–101.
Carney, S. (2008). Learner-centred pedagogy in Tibet: International education reform in a local context. Comparative Education, 44(1), 39–55.
Carnoy, M., & Levin, H. (1985). Schooling and work in the democratic state. Stanford, CA: Stanford University Press.
Deyhle, D., Swisher, K., Stevens, T., & Galván, R. T. (2008). Indigenous resistance and renewal. In: F. M. Connelly, M. F. He & J. Phillion (Eds), The SAGE handbook of curriculum and instruction. Los Angeles, CA: Sage.
Drèze, J., & Sen, A. (1995). India: Economic development and social opportunity. Oxford, UK: Oxford University Press.
Elliot, H. (1917). Herbert Spencer. New York: Henry Holt and Company. Available at http://books.google.com/books?id=IoEZAAAAMAAJ&printsec=frontcover&dq=herbert+spencer&source=bl&ots=IpkOPnUUlh&sig=3RTheZSazjQsbQ8XA9zdJVhATGY&hl=en&ei=JWbDS463JsH38AbOweHDAQ&sa=X&oi=book_result&ct=result&resnum=3&ved=0CBUQ6AEwAg#v=twopage&q&f=false
Enciclopedia de los Municipios de México (EMM). (n.d.). Estado de Guerrero, Metlatónoc. Available at http://www.e-local.gob.mx/work/templates/enciclo/guerrero/municipios/12043a.htm
Evaluación Nacional del Logro Académico en Centros Escolares. (2009a). Consulta de posiciones de escuelas según diferentes criterios [database P12v2: "Porcentaje de alumnos que obtuvieron nivel de logro bueno o excelente"]. Available at http://enlace.sep.gob.mx/ba/cons_crit/listado_esc.html
Evaluación Nacional del Logro Académico en Centros Escolares. (2009b). Descarga las pruebas aplicadas en formato PDF (sample of 3rd grade ENLACE online). Available at http://enlace.sep.gob.mx/ba/docs/2009_p3.pdf
Evaluación Nacional del Logro Académico en Centros Escolares (ENLACE). (n.d., a). Bienvenida. Available at http://enlace.sep.gob.mx/gr/
Evaluación Nacional del Logro Académico en Centros Escolares. (n.d., b). Ciudadanos vigilaron la aplicación de la prueba ENLACE. Available at http://enlace.sep.gob.mx/gr/?p=n02
Evaluación Nacional del Logro Académico en Centros Escolares. (n.d., c). Consejo técnico. Available at http://enlace.sep.gob.mx/ba/ct01.html
Florida Senate. (2010). SB 6. Available at http://www.flsenate.gov/data/session/2010/Senate/bills/billtext/pdf/s0006.pdf
Freire, P. (1998). Teachers as cultural workers: Letters to those who dare to teach. Boulder, CO: Westview Press.
Freire, P. (2000). Pedagogía del oprimido (53rd ed.). Montevideo, Uruguay: Tierra Nueva.
Freire, P. (2005). Teachers as cultural workers: Letters to those who dare to teach (Expanded edition). Boulder, CO: Westview Press.
Gabbard, D. A. (2000). Accountability. In: D. Gabbard (Ed.), Knowledge and power in the global economy: Politics and the rhetoric of school reform (pp. 53–61). Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.



Gabbard, D. A. (2004). Welcome to the desert of the real: A brief history of what makes schooling compulsory. In: D. A. Gabbard & E. W. Ross (Eds), Defending public schools: Education under the security state. New York: Teachers College Press.
Gabbard, D. A., & Ross, E. W. (Eds). (2004). Defending public schools: Education under the security state. New York: Teachers College Press.
Harding, V. (1990). Hope and history: Why we must share the story of the movement. Maryknoll, NY: Orbis Books.
Holland, D. G. (2010). Waves of educational model production: The case of higher education institutionalization in Malawi, 1964–2004. Comparative Education Review, 54(2), 199–222.
Hoyle, E. (1982). The professionalization of teachers: A paradox. British Journal of Educational Studies, 30(2), 161–171.
Human Development Report (HDR). (2009). Human development report 2009 – HDI rankings. Available at http://hdr.undp.org/en/statistics/
Institute of Education Sciences/National Center for Education Statistics (IES). (n.d.). Trends in International Mathematics and Science Study (TIMSS). Available at http://nces.ed.gov/timss/index.asp
Kamens, D., & McNeely, C. (2010). Globalization and the growth of international educational testing and national assessment. Comparative Education Review, 54(1), 5–25.
Keen, B., & Haynes, K. (2000). A history of Latin America (6th ed.). New York: Houghton Mifflin Company.
Khôi, L. (1994). Intercultural education. In: F. B. Dubbeldam, T. Ohsako, L. Khôi, P. Dasen, P. Furter, G. Rist, P. Batelaan, S. Churchill, K. P. Epskamp, F. M. Bustos, & G. R. Teasdale (Eds), International yearbook of education, Volume XLIV-1994: Development, culture and education (Vol. 44, pp. 79–104). Paris, France: UNESCO.
Leon, M., Scorzo, P., & Novello, J. (2009). Hacia una cultura de la evaluación. ONE 2009/Censo. Argentina: Dirección Nacional de Información y Evaluación Educativa/Evaluación Educativa/Ministerio de Educación Presidencia de la Nación. Available at http://diniece.me.gov.ar/images/stories/diniece/publicaciones/2009%20hacia%20una%20cultura%20EVALUACION%20interior%20OK.pdf
Levin, B. (2008). Curriculum policy and the politics of what should be learned in schools. In: F. M. Connelly, M. F. He & J. Phillion (Eds), The SAGE handbook of curriculum and instruction. Los Angeles, CA: Sage.
Madaus, G. (1988). The distortion of teaching and testing: High-stakes testing and instruction. Peabody Journal of Education, 65(3), 29–46.
May, S. (1999). Introduction. In: S. May (Ed.), Indigenous community-based education (pp. 1–7). Clevedon, UK: Multilingual Matters.
McCowan, T. (2010). Reframing the universal right to education. Comparative Education, 46(4), 509–525.
McGrory, K. (2009, December 13). Once avoided, Miami Edison Senior High hopes to be embraced under new system. The Miami Herald. Available at http://www.miamiherald.com/2009/12/13/1379509/once-avoided-miami-edison-senior.html
McLaren, P. (2007). Life in schools: An introduction to critical pedagogy in the foundations of education (5th ed.). New York: Pearson.
Ministerio de Educación (ME). (n.d.). Dirección nacional de información y evaluación de la calidad educativa. Available at http://www.me.gov.ar/diniece/
Nielsen, H. D. (1997). The last frontiers of education for all: A close-up of schools in the periphery. In: H. D. Nielsen & W. K. Cummings (Eds), Quality education for all: Community-oriented approaches (pp. 25–34). New York: Garland Publishing, Inc.



Organisation for Economic Co-operation and Development (OECD). (2007). Education at a glance: OECD indicators. Available at http://www.oecd.org/document/30/0,3343,en_2649_39263238_39251550_1_1_1_1,00.html
Organisation for Economic Co-operation and Development (OECD). (n.d.). PISA 2006 results. Available at http://www.oecd.org/document/2/0,3343,en_32252351_32236191_39718850_1_1_1_1,00.html#ES
Orr, D. (2004). Earth in mind: On education, environment, and the human prospect (10th anniversary ed.). Washington: Island Press.
Pineda, M. F. (2007). Unity and diversity: Exploring high school students' perceptions about multiculturalism and the intercultural university of Guerrero, Mexico. In: S. M. Nielsen & M. S. Plakhotnik (Eds), Proceedings of the sixth annual conference of the college of education: Urban and international education section (pp. 90–95). Miami: Florida International University. Available at http://coeweb.fiu.edu/research_conference/
PISA 2006 Finland. (n.d.). The Finnish PISA 2006 pages. Available at http://www.pisa2006.helsinki.fi/
Schriewer, J., & Martinez, C. (2004). Constructions of internationality in education. In: G. Steiner-Khamsi (Ed.), The global politics of educational borrowing and lending (pp. 29–53). New York: Teachers College Press.
Shor, I., & Freire, P. (1987). A pedagogy of liberation: Dialogues on transforming education. Westport, CT: Bergin & Garvey.
Spreen, C. (2004). Appropriating borrowed policies: Outcomes-based education in South Africa. In: G. Steiner-Khamsi (Ed.), The global politics of educational borrowing and lending (pp. 101–113). New York: Teachers College Press.
Spring, J. (2002). American education (10th ed.). Boston, MA: McGraw Hill.
Spring, J. (2008). Research on globalization and education. Review of Educational Research, 78(2), 330–363.
Spring, J. (2009). Globalization of education: An introduction. New York: Routledge.
Spring, J. (2010). Political agendas for education: From change we can believe in to putting America first (4th ed.). New York: Routledge.
Steiner-Khamsi, G. (2004). Globalization in education: Real or imagined? In: G. Steiner-Khamsi (Ed.), The global politics of educational borrowing and lending (pp. 1–6). New York: Teachers College Press.
Sterling, S. (2001). Sustainable education: Re-visioning learning and change (Schumacher Briefings No. 6). Bristol, UK: Green Books.
Stromquist, N. (2002). Education in a globalized world: The connectivity of economic power, technology, and knowledge. Lanham, MD: Rowman & Littlefield.
Suma X la Educación. (2009a). Comunicado de prensa: Suma por la educación aporta 5,000 observadores externos a proceso de aplicación de prueba ENLACE. Available at http://www.sumaporlaeducacion.org.mx/boletinesdeprensa/23marzo09.pdf. Retrieved on September 3, 2010.
Suma X la Educación. (2009b). ¿Quiénes somos? Available at http://www.sumaporlaeducacion.org.mx/quinessomos.html
Tveit, S. (2009). Educational assessment in Norway – A time of change. In: C. Wyatt-Smith & J. Cummings (Eds), Educational assessment in the 21st century: Connecting theory and practice (pp. 227–243). Netherlands: Springer.

Standardized Tests in an Era of International Competition and Accountability


United Nations Educational, Scientific and Cultural Organization (UNESCO). (2002). Recommendation concerning the status of teachers. Available at http://portal.unesco.org/education/en/ev.php-URL_ID=6140&URL_DO=DO_TOPIC&URL_SECTION=201.html
United Nations Educational, Scientific and Cultural Organization (UNESCO). (2008a). EFA global monitoring report 2008, summary: Education for all by 2015, will we make it? Paris, France: UNESCO Publishing.
United Nations Educational, Scientific and Cultural Organization (UNESCO). (2008b). EFA global monitoring report 2008: Education for all by 2015, will we make it? Oxford, UK: Oxford University Press.
United Nations Educational, Scientific and Cultural Organization (UNESCO). (2009). EFA global monitoring report 2009, summary: Overcoming inequality: Why governance matters. Paris, France: UNESCO Publishing.
United Nations Educational, Scientific and Cultural Organization (UNESCO). (2010). EFA global monitoring report 2010, summary: Reaching the marginalized. Paris, France: UNESCO Publishing.
United States Department of Education (USDE). (2009). Press releases – President Obama, U.S. Secretary of Education Duncan announce national competition to advance school reform. Available at http://www2.ed.gov/news/pressreleases/2009/07/07242009.html
Young, M. (2008). From constructivism to realism in the sociology of the curriculum. Review of Research in Education, 32, 1–28.

INDEX

Accountability, 42, 59, 167, 183, 208, 219, 233–234, 297, 321, 323, 331–335, 337–341, 343, 345, 347, xii, xvii
Achievement, 3–5, 7–9, 11–14, 18–24, 35, 56, 63–75, 77–83, 85–88, 94–95, 101, 103–104, 106, 108, 119–121, 123–140, 145, 155, 181–188, 195, 198, 200–202, 207, 220, 239–241, 243–249, 251, 253, 255, 257, 259, 261, 263, 267–268, 270, 272–273, 285, 289–290, 292, 297–298, 303, 305–307, 309, 327, 331, 333, 338, 347, 349, xi–xx
Achievement gaps, 64, 68–70, 78–79, 119–120, 136, 181–184, 186, 188, 201–202, 306, xviii
Alignment of national and international standards, 201
Assessment, 6–9, 14–17, 21–26, 28, 35–39, 42–47, 50, 52, 55, 57–59, 66, 81–83, 86, 121, 125, 145–146, 151–153, 155–157, 159, 165, 169–173, 175, 182, 187, 195, 198, 200, 202–203, 221, 226–228, 231–232, 241, 243, 245, 253, 255, 263, 268–269, 272, 291, 299–301, 303–305, 311–312, 315, 318, 321, 323, 325–326, 331–333, 336–337, 339–342, 344–346, 348–349, xi, xiii, xv, xviii
Basil Bernstein, 207, 210, 215
Career plans, 85–86, 88–90, 92, 96, 104, 110, xiii
CIVED (Civic Education Study), 8, 241, 244, 252, 256, 258, 260

Collaboration, 137, 152, 221–222, 228, 232–233, 339, 341, 345, xix
Comparative policy studies, 278
Competition, 39–40, 145–146, 151, 155, 173, 217, 281, 289–290, 304, 331–333, 335, 337, 339, 341, 343, 345–347, xvii
Corruption in education, 181, 183, 188, 197
Cost/benefit of testing, 297, 314, xv
Critical discourse analysis, 207, 214
Cross-national assessments, 35–41, 43–47, 50, 52, 55–59, 298, 304, xviii, xx
Cross-national study, 66, 290
Curriculum, 6, 10–11, 16, 22, 70, 94, 122, 125, 137, 149–150, 156, 159–160, 165–166, 168–170, 172, 174, 185, 188, 191, 199–200, 202–203, 218, 222, 226–227, 231, 242, 245–246, 248–253, 256–257, 259–263, 269, 281–282, 287, 289, 301, 303, 337, 342–344, 346, xvi–xvii
Curriculum revision, 200
Economic inequality, 86, 94, 96–98, 100, 103–104, 107
Education aid, 36, 38, 44–56, 58–59
Education quality, 119, 123, 145, 147, 149, 151–159, 161, 163, 165, 167, 169, 171, 173, 175, 189, 303–304, xiii
Educational inequality, 187
Educational outcomes, 72, 86, 124, 137, 185, 247, 321
Educational reform, 26, 58, 94, 135, 152, 170, 209, 247, 273, 283, 333, 348, xvii

EFA (Education for All), 4–5, 14–17, 23, 28, 91, 104, 150, 168, 289, 333–334, 346–348
Eighth grade, 70, 80, 83, 119, 121, 124–125, 128, 130–131, 135, 140, 153, 157, 183, xiii
ENLACE (Mexico's standardized test), 9, 331, 333–336, 339, 346–347, 349
Equity, 18, 21, 88, 91, 105, 119–120, 124, 135, 137–139, 167, 175, 181–182, 185, 188, 191, 195, 197, 199, 202–203, 207–219, 221, 223, 225, 227–229, 231–233, 235–236, 268, 341, 347–348, xiii, xviii
Equity in education, 120, 135, 207, 209, 211, 213–217, 219, 221, 223, 225, 227, 229, 231, 233, 235–236, xiii
Equity through diversity, 207, 212
Equity through equality, 207–208, 212
Ethnic minorities, 191, 218, 223
Finland, 58, 97, 99–100, 155–156, 160, 171, 267–271, 273–275, 277–279, 281, 283–289, 291–292, 307, 345, 348, xviii–xix
FTI (Fast Track Initiative), 4–5, 14–17, 29
Foreign aid, 35–41, 43–59, xviii
Fourth grade, 64–66, 70–72, 80–81, 83, 154, 157
Game theory, 309–311
Global educational ideology, 85–86, 91, 93, 96, 102, 104–105, 108–109
HLM (Hierarchical Linear Modeling), 64, 72, 76, 88, 95, 119, 127, 136, 187
ICCS (International Civic and Citizenship Education Study), 241, 245, 252, 255–256, 258–260, 301–302, 324

IEA (International Association for the Evaluation of Educational Achievement), 8, 10–13, 22, 37, 56, 59, 87, 125, 140, 241–242, 244–245, 249, 261–262, 268, 298–304, 307–308, 311, 314, 323–324, 327, xi–xii, xiv–xv, xvii
IIEP (International Institute for Educational Planning), 5, 8–9, 17, 19, 21
Importance of schools, 63–68, 78, 80
Incentives, 35–38, 41–44, 57–58, 164, 234, 255, 289
Inequality, 45, 63–71, 73–75, 77–81, 83, 86, 90, 93–94, 96–98, 100, 103–104, 107, 109, 187, 191–192, 208, 273, 322, xviii
International assessment participation, 315, 321
International assessments, 36–37, 40, 47, 52, 55, 57–58, 121, 138, 183–184, 240, 242–244, 251, 255, 258–260, 263, 291, 297–299, 301, 305, 308–310, 312–315, 322–324, 326–328, 333, 340, 343–344, 348–349, xii–xiii, xvi–xviii
International organizations and donors, 47, 158–159, 171
Knowledge, 7–8, 11–13, 16, 18, 23, 28, 39, 41, 43, 57, 80, 87–88, 105, 121, 149, 152–153, 159–160, 164–165, 167–168, 171, 173–174, 185, 192, 195, 208, 213–214, 218, 220, 223–224, 226–227, 229, 231, 235, 239–244, 246–252, 254, 256–257, 259, 261–264, 269, 272, 284–285, 304, 309, 311, 331–333, 341–346, 348–349, xvi, xix
Knowledge, constructivist theories, 239, 250, 263
Knowledge, global, 242, 244, 246, 249–251, 264
Knowledge, globalization, 241–242, 246–250, 252, 261, 263, 344

Knowledge, local, 16, 239, 242, 249–251, 256, 262, 264
Knowledge, realistic theories, 250, 263
Kyrgyzstan, 96–97, 99, 145–151, 153–176, 184–185, 197–198, xiii, xix
Large-scale student assessments, 249, 251–256, 260–262
Macro-dissatisfaction theory, 309–310, 312–313, 320–323, 326
Macrosatisfaction, 184
Mathematics achievement, 9, 63, 72
Mexico, 9, 58, 83, 97–98, 105, 121, 318, 324–325, 331, 333–336, 339–340, 346, 348
Motivation, 38, 41, 222, 243
National assessments, 8–9, 35–41, 43–47, 50, 52, 55–59, 121, 138, 183–184, 209, 240, 242–245, 251, 255–263, 291, 297–299, 301, 304–305, 308–310, 312–315, 322–324, 326–328, 333, 340–341, 343–344, 348–349, xii–xiii, xvi–xviii, xx
National income, 66–67, 70–71, 79, 187
National policy-making, 262
National study centers, 314
Neo-institutionalism, 85
NESP (National Education Sector Plans), 3, 5, 20
Norway, 58, 69–71, 74, 76, 78–80, 97, 99–100, 207, 209–211, 213, 215–221, 223–225, 227–235, 318, 336, xiii, xix
Occupational expectations, 85–88, 95–98, 101, 105–106, 108–110, xviii
OECD (Organization for Economic Co-operation and Development), 5, 8–10, 12, 37, 40, 43–45, 55–56, 58, 71, 83, 87, 93–95, 97, 121, 123, 138, 146, 151–153, 166, 173, 188, 207–211, 213–221, 223–225, 227–235, 241–242, 244, 247–248, 254–255, 261–263, 269–270, 272–273, 276, 288, 290–292, 300, 302–304, 308, 314, 323, 332, 338, 346, 348, xiii–xv, xvii, xix
PIRLS, 8, 10, 12–13, 18, 45–47, 56, 82, 87, 138, 181–183, 188–189, 199, 201–202, 241, 243, 245, 252, 254, 256, 258–259, 268, 301, 333, 347
PISA (Programme for International Student Assessment), 8, 10–13, 23, 35–37, 42, 45–47, 50, 55, 58, 66, 81–83, 85–89, 93–95, 98, 100, 108–111, 121, 138, 145–146, 151–162, 164–176, 181–183, 188–189, 199–202, 208–209, 215–216, 218, 241, 244, 252, 254–257, 262–263, 267–279, 281, 283–292, 299–304, 308–309, 312, 314–315, 318–320, 323, 333, 335, 339, 345–346, 348, xi, xiii, xv, xvii–xviii, xx
Policy diffusion, 303, 312, 314–315, 320, 325
Post-Soviet education, 145, 149, 170
Power relations, 41, 207, 209–210, 212, 218, 236, 250, 337
Quality of education, 3–11, 13–17, 19–23, 25, 27–28, 67, 146, 150, 156–157, 166, 168, 171, 186, 191, 202, 240, 243–244, 247–248, 334, 337, xiii, xviii
Rational actor theory, 313, 326
Regional assessments, 37, 58, 82, 299, 304, 326
Regional disparities, 119, 124, 126, 135, xiii
SACMEQ (Southern and Eastern Africa Consortium for Monitoring Educational Quality), 8, 10–13, 15, 17–20, 22, 24–26, 29, 37, 82, 241, 303–304, 309, 326

Sample-based assessment, 156–157
School accountability, 183
Science achievement, 64, 119, 121, 123–135, xii–xiii
Secondary school students, 202
Slovenia, 58, 97, 99, 140, 188, 239–241, 244–245, 252–263, xviii–xix
Social cohesion, 123, 200, 202, 290
Socioeconomic status, 18, 67, 82, 92, 94, 186, 218, 262
Standardized examinations, 184–185, 189, 194, 198–202
Standardized test, 36, 145, 147, 149, 151–153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175–176, 181–182, 185, 194, 202–203, 331–333, 335–341, 343, 345, 347, xiii, xvii
Standardized testing, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175–176, 182, 185, 194, 332–333, 335–337, xiii
StatPlanet, 25, 27, 29
Student composition, 68, 75
Testing, 38, 40, 42, 45, 57–58, 145, 147, 149, 151–153, 155, 157, 159, 161, 163, 165, 167, 169–173, 175–176, 181–182, 185, 194, 196, 198–200, 214, 225, 227, 230–231, 234–235, 260, 270, 272, 289, 297–300, 305–309, 314, 325, 331–333, 335–340, 342, 345–346, xiii–xv, xvii, xix–xx
TIMSS (Trends in International Mathematics and Science Study), 8, 10–13, 15, 18, 28, 36–37, 42, 45–47, 56, 58, 64–67, 70–73, 80–83, 87, 119–121, 123–131, 133–140, 154, 181–183, 187–189, 199–202, 241, 244–245, 251–252, 254–257, 259–260, 263, 268, 270–272, 284, 299, 301–309, 312, 314–318, 320–321, 323–327, 333, 347–348, xi, xiii, xv–xvii, xx
Turkey, 58, 83, 97–98, 119–135, 137–140, 315, xiii, xviii
Turkish, 119–122, 124, 130–131, 133–136, 138–139
UNESCO (United Nations Educational, Scientific, and Cultural Organization), 3–10, 13–16, 19–21, 23, 25, 28, 37, 40, 91, 94, 110, 192, 248, 304, 323–324, 334–335, 339, 341, 346–348, xi
World Bank, 4–5, 8–9, 21–24, 40, 45, 67, 70–71, 73, 83, 91, 93–94, 110, 148, 151, 154–157, 159, 165, 167, 172, 175, 308, 315, 335