The Institutionalisation of Evaluation in Europe
ISBN 3030322831, 9783030322830
English · 546 pages · 2020

This book examines the progress of the institutionalisation of evaluation in European countries from various perspectives.

Table of contents:
Preface and Acknowledgements
Contents
Notes on Contributors
List of Figures
List of Tables
Part I Introduction
1 The Institutionalisation of Evaluation: Theoretical Background, Analytical Concept and Methods
1 Introduction
2 State of Knowledge
3 Institutions—Some Theoretical Remarks
4 The Methodological Approach
5 Selection of Countries and Methodological Procedure
Appendix
Analytical Guideline: Compendium on the Institutionalisation of Evaluation
References
Part II Northern Europe
2 Denmark
1 General Country Overview
1.1 Structure
1.2 Culture
1.3 Public-Sector Reforms/Transformations
2 Institutional Structures and Processes
2.1 Evaluation Regulation
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Evaluation in Society
4 Professionalisation
5 Discussion and Conclusion
References
3 Finland
1 Introduction
2 Institutional Structures and Processes
3 Evaluation Practice
4 Utilisation of Evaluations
5 Evaluation in Municipalities and in the Civil Society
6 Professionalisation of Evaluation
7 The Future of Evaluation in Finland
References
Part III Western Europe
4 The Netherlands
1 Introduction
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Society: Demand for Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses, Further Training
4.2 Profession/Discipline
4.3 Research Quality Rather Than Evaluation Quality
5 Concluding Remarks
Appendix
References
5 Belgium
1 General Country Overview
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.1.1 A Formal Anchorage of Evaluation Requirements in Laws or Regulations
2.1.2 The Institutionalisation of Evaluation Within Parliament
2.2 Evaluation Practice
2.2.1 A Relatively Wide Diffusion of Evaluation Practice
2.2.2 Type of Evaluation Practice
2.2.3 Locus of Evaluation Practice
2.2.4 Evaluation Institutes, with Varying Levels of Independency
2.3 Use of Evaluations
2.3.1 Evaluation Motives
2.3.2 Quality Assurance
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Societies Demand Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses and Further Training
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion and Outlook
References
6 France
1 General Country Overview
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Societies Demand Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses, Further Training et Cetera
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion
References
7 Germany
1 Introduction
2 General Country Overview
3 Institutional Structure and Processes
3.1 Evaluation Regulations
3.2 Evaluation Practice
3.3 Use of Evaluations
4 Societal Dissemination
4.1 Institutionalised Use of Evaluations by Civil Society
4.2 Public Perception and Discussion of Evaluation and Evaluation Findings
4.3 Civic Participation in Evaluations
5 Professionalisation
6 Summary
References
8 Switzerland
1 General Country Overview
1.1 Institutional Framework of the Swiss Political System
1.2 A Brief History of Evaluation in Switzerland
1.2.1 The Pioneer Phase (Until 1990)
1.2.2 The Establishment Phase (1990–2000)
1.2.3 The Mature Phase (2000–2010)
1.2.4 The Phase of Bureaucratisation (Since 2010)
2 Structures and Processes of Evaluation in the Political System
2.1 Evaluation Regulations
2.2 Evaluation Practices
2.2.1 The Federal Level
2.2.2 Cantons (and Communities)
2.3 Evaluation Utilisation
2.3.1 Public Administration
2.3.2 Parliaments
2.3.3 Public Discourse
2.4 Professionalisation
3 Conclusion
References
9 Ireland
1 A Historical Overview
2 Institutional Structures and Processes
2.1 National and Sectoral Laws and Regulations
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Citizens, Enterprises and Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
4 Professionalisation
4.1 Academic Study Courses, Further Training, Et cetera
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion
References
10 United Kingdom
1 General Country Overview
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Societies Demand Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses, Further Training Et cetera
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion and Outlook
References
Part IV Southern Europe
11 Italy
1 General Country Overview
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.1.1 The Environmental Impact Assessment
2.1.2 Evaluation of Public Investments at the National and Regional Level
2.1.3 Evaluation in Education and University Sectors
2.1.4 Evaluation of the Performance of Public Administration and Public Managers
2.1.5 Evaluation Within the Italian Judiciary System
2.1.6 Evaluation in the Non-profit Sector
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception, Discussion of Evaluation and  Evaluation Findings
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses, Further Training Et cetera
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusions
References
12 Spain
1 Introduction
2 Political Subsystem: Institutional Structures and Processes
2.1 Evaluation Regulations
2.2 Evaluation Practice
2.3 Use of Evaluation
3 Social System: Distribution of Evaluation Towards Society
4 System of Professionalisation
4.1 Academic Study Courses, Capacity Development and Further Training
4.2 Profession/Discipline
5 Conclusions
References
13 Portugal
1 Introduction
2 General Country Overview
3 Institutional Structures and Processes (Political System)
3.1 Evaluation Regulations
3.2 Evaluation Practice
3.3 Use of Evaluations
4 Societal Dissemination/Acceptance (Social System)
4.1 Institutionalised Use of Evaluations by Civil Society
4.2 Public Perception and Discussion of Evaluation and Evaluation Findings
4.3 Civil Societies Demand Evaluations
5 Professionalisation (System of Professionalisation)
5.1 Academic Study Courses, Further Training, Etc.
5.2 Profession/Discipline
5.3 Compliance with Standards and Quality Obligations
6 Conclusion
References
Part V Central Eastern Europe
14 Czechia
1 General Country Overview
2 Institutional Structures and Processes
2.1 Legal Framework and Regulations Concerning the Evaluation
2.1.1 National Rules
2.1.2 Sectoral Approach to Evaluations
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination
3.1 Use of Evaluations by the Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
4 Professionalisation
4.1 Academic Study Courses and Further Training
4.2 Evaluation as a Profession
4.3 Mechanisms to Achieve the High Quality of Evaluations
5 Conclusions and Future Outlook
References
15 Latvia
1 General Country Overview
2 Institutional Structure and Processes (Political System)
2.1 Evaluation Regulations
2.1.1 Evaluation Requirements in Laws or Regulations
2.1.2 Evaluation Within the Parliament
2.2 Evaluation Practice
2.2.1 General Description of Evaluation Practice
2.2.2 Relation Between Process and Impact/Outcome Evaluations
2.3 Use of Evaluation
2.3.1 Best Performing Sectors
2.3.2 Poor Performing Sectors
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluation in Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Societies' Demand for Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses and Further Training
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion
References
16 Poland
1 General Country Overview
2 Institutional Structure and Processes (Political System)
2.1 Evaluation Regulations in Different Fields
2.2 Evaluation Practice
2.3 Use of Evaluation
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluation in Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
4 Professionalisation (Academic System)
4.1 Academic Study Courses, Further Training, and Other Learning Opportunities
4.2 Profession/Discipline
5 Conclusion
References
17 Romania
1 General Country Overview
2 Institutional Structures and Processes (Political System)
2.1 Evaluation Regulations
2.1.1 Provisions on Evaluation or Use of Evaluation in the National Laws or Regulations
2.1.2 Normative Framework Paths on Development of Capacity and Culture of Evaluation
2.1.3 A Top-Down Approach to the Institutionalisation of Evaluation Use
2.1.4 Guidelines and Tools for Evaluation Practice
2.1.5 Budget Allocation for Evaluations
2.1.6 Evaluation and Use of Evaluation Findings in Parliamentary Activities
2.2 Evaluation Practice
2.3 Use of Evaluations
3 Societal Dissemination/Acceptance (Social System)
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation and Evaluation Findings
3.3 Civil Societies Demand Evaluations
4 Professionalisation (System of Professionalisation)
4.1 Academic Study Courses, Further Training Etc
4.2 Profession/Discipline
4.3 Compliance with Standards and Quality Obligations
5 Conclusion
References
Part VI Transnational Organisations
18 European Union
1 General Information
2 Development of the Evaluation Framework of the EU: Institutional Set-Up and Processes
3 Evaluation Practice and Use in the EC
4 The Development of the Evaluation Function at the CoE
4.1 Overview of the CoE
4.2 Institutional Set Up and Processes for Evaluations
4.3 Acceptance and Use of Evaluations in the CoE
4.4 Professionalisation of the Evaluation Function at the CoE
5 Comparative Conclusions
References
Part VII Synthesis
19 The Institutionalisation of Evaluation in Europe: A Synthesis
1 Introduction
2 The Institutionalisation of Evaluation in the Political System
3 The Institutionalisation of Evaluation in the Social System
3.1 Institutionalised Use of Evaluations by Civil Society
3.2 Public Perception and Discussion of Evaluation
3.3 Civil Societies’ Demand for Evaluations
4 Institutionalisation of Evaluation as a Profession
5 Conclusions: Correlations and Interlinkages
6 Some First Attempts at Explanation
Appendix
References

The Institutionalisation of Evaluation in Europe
Edited by Reinhard Stockmann · Wolfgang Meyer · Lena Taube

Editors

Reinhard Stockmann
Saarland University
Saarbrücken, Saarland, Germany

Wolfgang Meyer
Saarland University
Saarbrücken, Saarland, Germany

Lena Taube
Centre for Evaluation, Saarland University
Saarbrücken, Saarland, Germany

ISBN 978-3-030-32283-0    ISBN 978-3-030-32284-7 (eBook)
https://doi.org/10.1007/978-3-030-32284-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover illustration: Ashway/Alamy Stock Photo (AHKA7D)

This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface and Acknowledgements

The idea for this book, which is the first in a series, matured slowly over time and received input from a variety of sources. First and foremost was our practical work as evaluators in developing countries. Over the last few decades, we have conducted hundreds of evaluations in countries in Asia, Latin America and Africa. Due to the participatory approach of CEval (Stockmann 2008: 248ff.; Stockmann and Meyer 2013: 206ff.), these activities were associated with capacity building measures right from the start, in order to qualify partners in developing countries so that they could carry out their own evaluations. In addition, with the support of the DAAD (German Academic Exchange Service), we took advantage of our opportunities at Saarland University to strengthen academic teaching and research in the field of evaluation within the framework of university cooperation. On behalf of the BMZ (Bundesministerium für wirtschaftliche Zusammenarbeit und Entwicklung = Federal Ministry for Economic Cooperation and Development) and with the support of GIZ (Deutsche Gesellschaft für internationale Zusammenarbeit = German Society for International Cooperation), we implemented the "Blended Learning Master Evaluation" at the Uganda Technology and Management University (UTAMU), which is based on the Saarbrücken evaluation master's programme. We are connected to the Universidad de Costa Rica (UCR) and the Pontifical Catholic University of Ecuador (PUCE) by a long-term university partnership that, in collaboration with the GIZ/DEval project FOCEVAL (Fomento de Capacidades en Evaluación), extends far beyond teaching- and research-related capacity building measures.

Participation in international conferences, mostly in Latin America, Africa and Asia, has strengthened our opinion that evaluation is developing particularly dynamically outside of the countries where it originated. However, since the focus of the international donor, teaching and research areas in the field of evaluation is very much fixed on developments in North America and, to a much lesser degree, in Europe, there is a lack of research on evaluation taking place beyond these regional boundaries. This becomes particularly clear when one considers the pioneering work of Furubo et al. (2002) and the follow-up study by Jacob et al. (2015). Although the book is titled International Atlas of Evaluation, the case studies cover twelve European countries, two from North America, three from Asia, one from Africa and none from Latin America or the Caribbean! The stark contrast between, on the one hand, our own experiences and the impressive dynamics of the institutionalisation of evaluation outside of North America and Europe and, on the other hand, the lack of attention given to research in these regions, gave rise to the idea that the venture to provide a comprehensive overview beyond our own borders was one worth attempting.

The first opportunity presented itself when we celebrated the tenth anniversary of the Center for Evaluation (CEval) at Saarland University in 2012 with an international conference in Saarbrücken. Instead of looking back on past years of success, we dared to take a look into the future. The contributions were so well received that we decided to publish them with Palgrave Macmillan under the title The Future of Evaluation: Global Trends, New Challenges, Shared Perspectives (2016). This was the first publication step on the path that led to this volume. Back in May 2015, a workshop with international participants took place at the CEval of Saarland University in order to discuss the idea and begin with the project design. At the EES conference in Maastricht in September 2016, the concept, which now had a theoretical and methodological basis, was presented and discussed. Finally, the idea grew into a global project that includes a bold plan to publish one case study volume each for Europe, America, Africa and Asia, analysing the institutionalisation of evaluation within the political, social and professional systems. Because we are, after all, Europeans and we are most familiar with Europe, we decided to start with this European volume. The American volume will follow soon in 2020; most of the articles have already been submitted. The African volume (2021) and the Australian-Asian volume (2022) are to follow.

It takes a lot of heart, persuasiveness, support, perseverance and many hard-working heads and hands to be able to carry out such a mammoth project! Therefore, first and foremost, we want to thank the authors of this first volume, who agreed to write their case studies based on the analysis framework we developed, in order to ensure the highest possible comparability across all countries. We would also like to thank Saarland University, which granted us the freedom needed to research this project. Thanks are due to our many hard-working and patient colleagues who have helped with and supported this project. Especially heartfelt thanks to: Christine Nolte, Angelika Nentwig, Selina Röhrig, Myriel Mohra, Felipe Ramirez Kaiser, Nicole Ebel and Fabian Otto. Last but not least, we would like to thank our publisher Palgrave Macmillan for agreeing to publish all four volumes.

We hope that the first volume on the institutionalisation of evaluation in Europe will pique your interest and meet your expectations. If so, then you may look forward with us to the following volumes—in which you may even want to participate.

1 Stockmann, R., & Meyer, W. (2008). Evaluation and quality development. Frankfurt am Main: Peter Lang.
2 Stockmann, R., & Meyer, W. (2013). Functions, methods and concepts in evaluation research. London: Palgrave Macmillan.
3 Furubo, J. E., Rist, R. C., & Sandahl, R. (2002). International atlas of evaluation. Transaction Publishers.
4 Jacob, S., Speer, S., & Furubo, J. E. (2015). The institutionalization of evaluation matters: Updating the International Atlas of Evaluation 10 years later. Evaluation, 21(1), 6–31.

Saarbrücken, Germany

Reinhard Stockmann Wolfgang Meyer Lena Taube

Contents

Part I Introduction

1 The Institutionalisation of Evaluation: Theoretical Background, Analytical Concept and Methods 3
Wolfgang Meyer, Reinhard Stockmann and Lena Taube

Part II Northern Europe

2 Denmark 37
Peter Dahler-Larsen and Hanne Foss Hansen

3 Finland 63
Petri Uusikylä, Pertti Ahonen, Nina K. Hietakangas and Johanna Nurmi

Part III Western Europe

4 The Netherlands 89
Carolien Klein Haarhuis

5 Belgium 115
Valérie Pattyn and Bart De Peuter

6 France 139
Clément Lacouette-Fougère and Benoit Simon

7 Germany 167
Reinhard Stockmann and Wolfgang Meyer

8 Switzerland 199
Thomas Widmer

9 Ireland 227
Richard Boyle, Joe O'Hara, Gerry McNamara and Martin Brown

10 United Kingdom 249
Tracey Wond

Part IV Southern Europe

11 Italy 273
Erica Melloni

12 Spain 303
María Bustelo

13 Portugal 329
Ana Diogo

Part V Central Eastern Europe

14 Czechia 351
Jiří Remr and Oto Potluka

15 Latvia 377
Krišjānis Veitners

16 Poland 405
Monika Bartosiewicz-Niziołek, Beata Ciężka, Weronika Felcis, Katarzyna Hermann-Pawłowska, Seweryn Krupnik, Piotr Strzęboszewski and Agnieszka Szczurek

17 Romania 435
Nicolae Toderaş and Teodora Diana Iacob

Part VI Transnational Organisations

18 European Union 463
Roland Blomeyer and Ansgar Eussner

Part VII Synthesis

19 The Institutionalisation of Evaluation in Europe: A Synthesis 483
Reinhard Stockmann, Wolfgang Meyer and Lena Taube

Notes on Contributors

Pertti Ahonen is Professor of Political Science at the University of Helsinki. His work includes research articles in Big Data & Society, Administration & Society, Sage OPEN, Evaluation, Annals of Public and Cooperative Economics, International Journal of Administrative Sciences, International Journal of Critical Accounting and international edited monographs.

Monika Bartosiewicz-Niziołek Alumnus of Ph.D. studies in social sciences and post-diploma studies in evaluation. An expert with several years of experience in conducting, commissioning, consulting and supervising of evaluation. An author/reviewer of several publications concerning evaluation, a mentor and an approved reviewee (VEPR) of the European Evaluation Society, a member of the Evaluation Steering Group and the NGO Trainers' Association.

Roland Blomeyer has an academic background in law and politics and has been working for over 20 years in the area of evaluation for different European Union institutions. He is the director of the firm Blomeyer & Sanz, specialised in the provision of evaluation and research services for the European Parliament, with over 50 publications in the last ten years.

Richard Boyle is Head of Research at the Institute of Public Administration. He has carried out a wide range of research and consultancy studies on aspects of public service management including public-sector reform, performance measurement, and evaluation. He is a former board member of the European Evaluation Society.

Martin Brown works at the Centre for Evaluation, Quality and Inspection (EQI), School of Policy and Practice, at DCU Institute of Education in Ireland. He has published widely in the field of evaluation as well as coordinating and leading national and transnational projects in the areas of school evaluation, inspection, quality assurance, culturally responsive evaluation and assessment and Distributed Evaluation and Planning in Schools.

Dr. María Bustelo is Associate Professor of Political Science and Public Administration and Director of the Master on Evaluation of Programmes and Public Policies (UCM). She was President of the European Evaluation Society 2012–2013, and member of the Board of Directors Committee of the Spanish Agency for the Evaluation of Public Policies (AEVAL; 2007–2011). She is a member of the UNWomen Global Evaluation Committee and was part of the High Level Expert Group in charge of the FP7 ex-post evaluation (2014–2015), and the Expert Group in charge of the Interim Evaluation as a crosscutting issue in Horizon 2020, at DG Research (European Commission).

Beata Ciężka Freelance evaluator since 1995 with vast experience in research and evaluation projects for such clients as the European institutions, ministries, NGOs and the business sector. Specialist in labour market, social, education and training projects. Designed and delivered a number of evaluation training programmes.

Peter Dahler-Larsen is leader of CREME, Center for Research on Evaluation, Measurement and Effects. He is the author of The Evaluation Society (Stanford University, 2012) and Quality: From Plato to Performance (Palgrave, 2019).

Bart De Peuter has been an auditor at the Belgian Court of Audit since spring 2018 for performance audits with a focus on policy results. Earlier he was a research expert at the KU Leuven Public Governance Institute involved in the research, practice and training regarding policy evaluation. He is co-founder and board member of the Flemish Evaluation Association.

Ana Diogo has work experience on public-sector performance measurement in Portugal, has worked as an evaluator at the Spanish Public Policy Evaluation Agency—AEVAL during 2017, and also as a consultant for ILO and ITC-ILO. Her training includes a Master's Degree on Public Policy and Programmes Evaluation, Complutense University of Madrid; executive seminars, Hertie School of Governance, Berlin; IPDET 2018—International Program for Development Evaluation Training.

Dr. Ansgar Eussner started evaluation work in 1978 for the European Commission and German ministries, on behalf of the German Development Institute, followed by several years as an independent consultant working for various German and international organisations. Since 1993, Dr. Eussner has done evaluation work in senior positions in international organisations, first at UNESCO, then at the Multilateral Fund for the Implementation of the Montreal Protocol/UNEP (1999–2008), followed by the position of Chief of Evaluation and Audit at ICAO (2009–2011), and finally as Director of Internal Oversight at the Council of Europe (2011–2017). He retired in June 2017 and now works as a freelance evaluation consultant.

Weronika Felcis Alumnus of Ph.D. studies in social sciences at Jagiellonian University. An author and reviewer of several publications concerning evaluation, reviewer in the VEPR system of the European Evaluation Society. Past President of PES, board member of EES, Secretary of IOCE.

Hanne Foss Hansen is professor in public administration and organisation at the Department of Political Science, University of Copenhagen. Her main interests are public organisation and management, public-sector reform and evaluation. Her recent publications include articles in Evaluation and American Journal of Evaluation as well as contributions to edited volumes at Routledge, Springer and Palgrave.

Katarzyna Hermann-Pawłowska Analyst, evaluator, expert in EU funds, strategic planning and public participation. Experienced in delivering and managing evaluations. Member of the Management Board in the IDEA for Development Foundation. Graduated from the Faculty of Law and Administration at Adam Mickiewicz University in Poznań. Ph.D. in economics obtained at Warsaw School of Economics.

Nina K. Hietakangas works as Head of Evaluation in the Finnish Blue Ribbon, which is a member of the International Blue Cross and an umbrella organisation for about 100 associations in Finland providing caring and supportive services. In her current position Mrs. Hietakangas is in charge of evaluation procedures, as well as methods and process development and evaluation assignments including programme evaluations, training and consultation.

Teodora Diana Iacob, Ph.D. is an expert in the area of evaluation of EU funds programmes and strategies. Since 2013 she has been involved in several evaluation studies, both as an expert and as an evaluation manager. Complementary to this professional experience and knowledge, she is also specialised in public policies, better management, evaluation capacity and culture.

Carolien Klein Haarhuis After having completed her Ph.D. in Sociology based on the evaluation of an anti-corruption programme, Carolien Klein Haarhuis worked as an (evaluation) researcher for the Research and Documentation Centre (WODC) at the Ministry of Justice and Security for over a decade (where this chapter was completed). She is now a senior advisor at the Ministry of Finance.

Seweryn Krupnik Social scientist, Director of the Centre for Evaluation and Analysis of Public Policies in Krakow at the Jagiellonian University in the years 2016–2018. An expert on the evaluation of innovation policy. He has conducted many evaluations with the use of counterfactuals and the theory-based evaluation approach. Member of the Polish Evaluation Society.

Clément Lacouette-Fougère is "evaluation" head of project at the Interministerial Directorate for Public Transformation at the French Ministry of Economy. He also teaches evaluation methods and public policy analysis at the Political School of International Affairs in Sciences Po.

Gerry McNamara is Full Professor of Educational Evaluation at the Institute of Education at Dublin City University. His main area of research in recent years is around models of school evaluation and in particular the relationship between internal or school-based evaluation and external evaluation, whether through inspection and/or the use of student performance data.

Erica Melloni, Ph.D. is currently Director of the Administrative and Institutional Policies Research Unit of the Istituto per la Ricerca Sociale (Milan, Italy) and President of the Italian Evaluation Society. Her main fields of research are public policy analysis and evaluation, capacity building and cohesion policies.

Wolfgang Meyer is Assistant Professor at Saarland University and Adjunct Professor at Uganda Technology and Management University, Kampala; Vice-Director of the Centre for Evaluation (CEval); Founding Member of the German Evaluation Society (DeGEval). Sociologist with a focus on empirical methodology; evaluation specialist in the fields of environment, labour market and regional development.

Johanna Nurmi is a Ministerial Adviser at the Finnish Ministry of Finance, Public Governance Department. The Public Governance Department is responsible for public governance development, human resources management and the preparation and implementation of governance policy.

Joe O'Hara is Full Professor of Education and Director of the Centre for Evaluation, Quality and Inspection (EQI) at Dublin City University. He has researched and published widely in the areas of school evaluation and inspection, quality assurance, culturally responsive evaluation and distributed evaluation. He is the current President of the European Educational Research Association (EERA).

Valérie Pattyn is Assistant Professor at the Institute of Public Administration of Leiden University and is partially affiliated to the KU Leuven Public Governance Institute. Valérie is co-chair of the Permanent Study Group on Policy Design and Evaluation of the European Group for Public Administration, board member of the Flemish Evaluation Association, and associate editor of Policy Design and Practice.

Oto Potluka is a senior researcher at the Center for Philanthropy Studies (CEPS) at the University of Basel. He has participated in dozens of evaluations of programmes supporting local and regional development, especially programmes of the EU Cohesion Policy, including financial capacities in companies and nonprofit organisations. He is a member of several international evaluation associations.

Jiří Remr is Chief Research Officer at the Institute for Evaluations and Social Analyses in Prague. His background is in the methodology of social research. Since 2010 he has been performing evaluations of social intervention programmes and meta-evaluations. He also develops tools and guidelines helping other evaluators to conduct their evaluative inquiries.

Benoit Simon is the founder and managing partner of the consultancy Planete Publique, specialised in evaluation and citizen's participation. He is also a member of the French Evaluation Society (SFE) and the European Evaluation Society (EES).

Reinhard Stockmann is Professor for Sociology at Saarland University; Founder and Director of the Center for Evaluation (CEval); Managing Director of two Master of Evaluation programmes; Executive Editor of the German Journal of Evaluation (ZfEv); Founding member of the German Society for Evaluation (DeGEval). Evaluation specialist in the fields of development cooperation, education, vocational training, environmental policy and foreign cultural policy.

Piotr Strzęboszewski Head of the National Evaluation Unit. Has over 12 years of experience in the area of evaluation of public policies. Responsible for the organisation of postgraduate studies in the field of evaluation (Academy of Evaluation). He supervised the implementation of over 40 evaluation studies and participated in the work on numerous publications on evaluation. Co-author of governmental strategic documents on the subject of evaluation.

Agnieszka Szczurek Graduated from sociology and post-diploma pedagogy at Warsaw University. Graduated from the NGO Trainers' Association school. Founder member of the Polish Evaluation Society. NGO Trainers' Association and Polish Sociological Association member. From 1999 to 2005, evaluation and research expert in the National Examining Board. Since 2004, co-owner, researcher and evaluator in the Evaluation Centre Ltd. (www.ewaluacja.eu); conducts evaluation research in various social policy, local community development and education projects.

Lena Taube was a research associate at the Center for Evaluation (CEval) at Saarland University, where she focused on research regarding the institutionalisation of evaluation as a means to good governance. She is currently working at the German Institute for Development Evaluation (DEval), engaged with evaluating human rights in German development aid.

Nicolae Toderaş, Ph.D. is specialised in EU policies and organisational evaluation. He has published several papers and public policy studies regarding European policies and governance, higher education governance, regional development policies, institutional and organisational evaluation, as well as the deepening of the relations between the Republic of Moldova and the EU.

Petri Uusikylä is co-founder and CEO of Frisky & Anjoy, President of the Finnish Evaluation Society and President of the Finnish Section of the IIAS (International Institute of Administrative Sciences). He has more than 25 years of experience in policy and program evaluation, policy analysis, government advice and public management consultancy.

Krišjānis Veitners is an independent evaluation and monitoring expert. He holds a Master's degree in Evaluation from Saarland University, Germany. His main research interests are evaluation capacity building and the evaluation of public programmes and policies. He is a founding member and President of the Latvian Evaluation Society and an active member of the European Evaluation Society.

Thomas Widmer is Professor of Political Science with a focus on Evaluation at the Department of Political Science, University of Zurich. He received his M.A. (1991), Ph.D. (1995) and P.D. (2007) from the University of Zurich. He held research and teaching positions at universities in Switzerland (Berne, Fribourg, St. Gall and Zurich), in Germany (Constance) and in the USA (Harvard University). Widmer's research focuses on evaluation (esp. quality and ethics in evaluation and meta-evaluation), public policy, Swiss politics and methodology.

Dr. Tracey Wond has researched the organisational aspects of evaluation for the past 12 years. In that time, she has contributed a number of papers, book chapters and talks on evaluation. Tracey is a university researcher, doctoral supervisor and former Head of Research. She is Council Member and Treasurer for the UK Evaluation Society (UKES). Tracey also undertakes private research and evaluation consultancy.

List of Figures

Chapter 3
Fig. 1 The members of the SAYFES by their roles (SAYFES Database 2017) 82
Fig. 2 The members of the SAYFES by their institutional affiliation (SAYFES Database 2017) 83

Chapter 6
Fig. 1 Sector coverage of evaluation reports in France (SFE 2016) 148

Chapter 7
Fig. 1 Use of the term 'evaluation' in draft laws dealt with by the Bundestag (Own depiction; German Bundestag 2017) 172
Fig. 2 Frequency of use in the printed matter of the Bundestag (Own depiction; German Bundestag 2017) 180
Fig. 3 Frequency of use in the SZ archive (Süddeutsche Zeitung 2017) 184

Chapter 12
Fig. 1 Evaluation activity in Spain 2004–2013: number of evaluations reported by respondents each year (Rodríguez 2014) 312
Fig. 2 Evaluation activity in different fields: number of evaluations reported in each field in 2009 and 2014 (Rodríguez 2014) 314

Chapter 15
Fig. 1 Number of EU funds evaluations done in Baltic countries in 2007–2013 planning period (Visionary Analytics 2017) 388

Chapter 16
Fig. 1 Number of completed evaluation studies of Cohesion Policy in years (Own study based on the Evaluation Database of the National Evaluation Unit [Accessed on 8 February 2018]) 413
Fig. 2 Number of evaluations as due of the moment of their implementation (Own study based on the Evaluation Database of the National Evaluation Unit [Accessed on 8 February 2018]) 413

Chapter 19
Fig. 1 Institutionalisation and evaluation use in European countries 496
Fig. 2 Voluntary organisations of professionalisation in Europe 2019 (GLOBE Authors) 509
Fig. 3 Z-Scores for standardised indices on political, social and professional system 513

List of Tables

Chapter 1
Table 1 Institutionalisation of evaluation in different sub-systems 15

Chapter 3
Table 1 Assessment of M&E information in Ministry of Education, Ministry of Traffic and Communication and Ministry of Finance, 2006/2010 (Uusikylä 2013) 72
Table 2 The level of institutionalisation and capacity of evaluation in various sectors of the government in Finland 74

Chapter 6
Table 1 Main pieces of law related to evaluation in France (1998–2012) 143

Chapter 11
Table 1 The units for the evaluation and monitoring of public investments created within the Italian regional administrations, according to Law 144/99: year of institution and membership 281
Table 2 Main elements of the regulations of 2009 and 2015 on the public administrations' performance 284
Table 3 A list of the main regulations concerning evaluation in different policy domains 290

Chapter 14
Table 1 Number of performed evaluations in the Czech ODA (Ministry of Foreign Affairs) 357
Table 2 Number of evaluations of ESIF during the period 2007–2013 (Ministry of Regional Development) 360
Table 3 The primary focus of evaluations of ESIF during the period 2007–2013 (Ministry of Regional Development) 362
Table 4 Number of evaluations delivered in ESIF (Ministry of Regional Development) 363

Chapter 19
Table 1 Legislative institutionalisation of evaluation and evaluation use 486
Table 2 Degree of spread across sectors (cf. Table 1, column 6) 493
Table 3 Institutionalisation of evaluation in the social system 501
Table 4 Professionalisation Index (reference to Appendix) 508

PART I

Introduction

CHAPTER 1

The Institutionalisation of Evaluation: Theoretical Background, Analytical Concept and Methods

Wolfgang Meyer, Reinhard Stockmann and Lena Taube

1 Introduction

The global development of evaluation is remarkable. Starting from North America, evaluations have, during the past 50 years, been conducted in a steadily increasing number of countries, policy fields, types of organisations and technical forms. In the course of an administrative modernisation that can also be observed worldwide, evaluations are used to make measures, projects and programmes more effective and efficient and to supply decision makers with information and assessments in order to enable rational decision-making processes. Evaluations serve to measure success, to ensure accountability and to legitimise political and administrative action, and sometimes—this is the other side of the coin—they are also used to rationalise and justify decisions that have already been taken. Evaluations are used ex ante for regulatory impact analysis, on-going for implementation adjustment, or ex post for impact analysis. In recent decades, the evaluation portfolio has continued to diversify both horizontally and vertically. This goes hand in hand with a professionalisation process that can be observed worldwide. More and more training courses are offered at universities, at academic and non-academic training centres, inside and outside public and private organisations, and—not to forget—on the internet. Furthermore, the number of evaluation journals, textbooks, articles and studies has grown at a tsunami-like pace. Quality standards for evaluations have been developed in many countries, associations and networks have been established, and certification systems are being developed (Stockmann 2013).

After almost 50 years of development history, it is now time for some global stocktaking, because research on evaluation is still underdeveloped (Coryn et al. 2017) and largely concentrated on North America. Many parts of the world are not well integrated into the scientific publishing system (Altbach 2003), because most of the important journals are in English and publications in other languages—even in other universal languages like Chinese, French, German, Portuguese, Russian or Spanish—are scarcely recognised (Jacob 2005; Widmer et al. 2009). Studies on evaluation are, therefore, clearly biased and incomplete. This is even true for the most important work, the "International atlas of evaluation" edited by Jan-Eric Furubo, Ray C. Rist and Rolf Sandahl (2002). While many European countries are included (Denmark, Finland, France, Germany, Ireland, Italy, the Netherlands, Norway, Spain, Sweden, Switzerland and the United Kingdom), almost all African countries (with the exception of Zimbabwe) and Asian countries (with the exception of China, Japan and South Korea) and all Latin American and Central Eastern European countries are missing. Nevertheless, from a European perspective the atlas (and its update by Jacob et al. 2015) is a wonderful reference for "The institutionalisation of evaluation in Europe".

Again, the focus of this volume is on Europe (including Eastern Europe). But this volume marks the beginning of a book series that also covers the other continents. This is all the more important as evaluation is developing particularly dynamically there. For this reason, the Europe volume will be followed by three further volumes that examine the institutionalisation of evaluation in America, Africa and Asia. The reason for this order is purely pragmatic: for the editors, access to authors was easier (and faster) here than in other regions, and an early start at the European Evaluation Society (EES) conference in Maastricht 2017 was possible. All four publications are based on the same theoretical and methodological concept, which leads to a common analysis grid. In this way, a high degree of systematic comparability across countries and continents is to be ensured.

The chapter is structured in such a way that the current state of research on the institutionalisation of evaluation is briefly discussed before the theoretical and methodological concept applied to all four volumes is presented. Afterwards, the selection of countries for the European volume and the challenges to be mastered in producing the volume are discussed. A total of 16 countries and the role of the European Union (EU) will be covered. The Europe volume concludes with a synthesis chapter in which the most important results are worked out in cross-section and, as far as possible, attempts are made to explain the different degrees of institutionalisation of evaluation.

2 State of Knowledge

As already mentioned above, Furubo, Rist and Sandahl provided the first systematic overview of evaluation systems in 21 different countries and three international organisations. The purpose of the research was to describe the countries' individual systems as well as global trends and developments, and to explain these developments regarding possible driving forces and consequences. The methodological approach was to conduct country case studies oriented on nine indicators. In 2015 an update was presented by Jacob, Speer and Furubo, using the same indicators to ensure comparability and offering a cautious longitudinal comparison. The finding in this longitudinal comparison was a general 'lift' effect of institutionalisation in all countries, or, as Furubo and colleagues state, "evaluation culture has matured over the last decade" (Jacob et al. 2015, p. 23). While the former forerunner countries from 2001 (United States of America [USA], Canada, Australia and Sweden) remained quite static at the high level they had achieved previously, others made huge efforts to catch up and thereby managed to establish themselves at the top of the ranking (such as Spain, New Zealand and Japan). This ranking can be considered very useful for getting a first impression of the actual state of evaluation in a country, but two points limit comparability: evaluation, evaluation culture and the institutionalisation of evaluation lack a clear definition, and in the 2015 update different experts were questioned than in 2002, which might have induced a shift in the results. Besides that, it seems that no theoretical deduction of the indicators took place, and an indicator measuring training and education is missing. The focus lies, as already mentioned above, clearly on the European and North American continents, while all Latin American countries and nearly all African and Asian countries were left out. Still, the findings presented by Furubo and colleagues are important contributions towards understanding how evaluation is institutionalised. It has been possible to work out three different internal driving forces for the development of evaluations: the political constellation, the fiscal situation and constitutional features. External driving forces might be significant external pressures by donor countries or organisations (cf. Stockmann and Meyer 2014; Furubo et al. 2002; Jacob et al. 2015).

Another publication dealing with the institutionalisation of evaluation was presented by Widmer et al. (2009). It aims to capture the development of evaluation in a systematic way in 10 different topic areas, comparing Switzerland, Germany and Austria. It covers the institutionalisation of evaluation, focusing on the constitutional and other legislative foundations of evaluation as well as the anchoring of evaluation in parliament, government and administration, different topic areas and the use of evaluations, as well as sectoral and national trends. The obvious difference in comparison to Furubo and colleagues is the focus on specific sectors. Each chapter is written by a different expert from the specific sector in the specific country, and each thematic rubric ends with a comparison of all three countries. Thereby it is possible to identify differences and similarities of the countries and sectors. To enable these comparisons, the authors set a strict framework for the chapters and defined evaluation as systematic, transparent and data based; the authors had to follow this definition. It can be criticised that the methodological approach of the single chapters is not systematic—some relied only on their own expertise in this area, while others carried out written surveys or systematic analyses of databases (cf. Widmer et al. 2009). Still, the book provides a detailed description of the differences and similarities of all three countries regarding the institutionalisation of evaluation.

Two studies about national evaluation policies have been conducted by Rosenstein, in 2013 and 2015. In 2013 Rosenstein carried out internet research on 115 countries and found that 20 already possess a written, legislated evaluation policy, while others are either developing a policy (23) or conducting evaluation routinely without a policy (34); 38 did not provide any information indicating that they are developing one at the moment (Rosenstein 2013). These findings have to be interpreted very cautiously due to the fact that Rosenstein's study was performed solely via the internet, which can lead to false categorisations if one is not familiar with a country's political system, language or other issues. The cross-country comparison of different legislation on evaluation can be seen as a starting point for further research, but a systematic assessment of the quality of governance as well as of the development of evaluation will be necessary (cf. Stockmann and Meyer 2016).

In the 'Future of evaluation' by Stockmann and Meyer (2016), more than 30 different authors from 20 countries on all continents provide an overview of the professionalisation of evaluation in their countries. The focus lies, as the title expresses, on the future of evaluation, meaning how evaluation will develop in different countries around the world. Topics covered in this book are challenges which might be able to weaken, stop or reverse the rise of evaluation, and the question of whether there will be a globalisation process leading towards 'one' evaluation culture, or whether there will be a differentiation of evaluation cultures according to the political culture of various countries and their specific use of evaluation. The strength of the publication clearly lies in the provision of a new viewpoint towards evaluation, also including countries that are normally not on the radar of evaluation research. But although the editors provided indicators to ensure comparability, these were not systematically explored in all contributions. Therefore, this publication also does not provide a systematic analysis of the institutionalisation of evaluation in different countries (cf. Stockmann and Meyer 2016).

Criticism of the analyses presented so far clearly shows that a uniform theoretical and methodological approach is necessary to carry out an analysis across countries and continents, from which an analysis grid can be deduced that is binding for all case studies. The theoretical approach starts with the institutions.

3 Institutions—Some Theoretical Remarks

Institutions are one of the most important objects of social sciences from its early beginnings. In sociology, for instance, classical sociologists like Emile Durkheim or Max Weber paved the way for an intensive discussion on the role of institutions as bridges between society and individuals (e.g. Stachura et al. 2009; Traugott 2013). Especially in the middle of the twentieth century, Talcott Parsons and the structure-functionalist school emphasised the importance of institutions and, among others, dominated the thinking about professions and professionalisation (e.g. Parsons 1939; Brante 1988). Institutions are seen as stabilising elements of society and—in Parsons (1939, p. 457) own words—the “study of the institutional framework within which professional activities are carried on should help considerably to understand the nature and functions of some of these social ‘constants’.” As a consequence of this, Parsons and the structural functionalists were criticised for overemphasising stability of societies formed by social institutions. This debate led to modernisation theory, still the most important concept of social change in sociology. In this theoretical tradition, the driving force of change is social differentiation: it refers to a horizontal process through which different functional areas of society—e.g. politics, law, science, the economy et cetera—emerge, each exhibiting their own functional principles (cf. Luhmann 1983, 1988). The central trigger of differentiation is specialisation, which results in more efficient processing of a common task within the framework of division of labour between individuals, groups or subsystems. The necessity for this can arise from intrinsic development processes (e.g. technological progress), changed extraneous influences (e.g. climate change) or new individual needs (e.g. altered consumer behaviour). The constantly advancing trend towards social differentiation creates pressure for change on existing subsystems, which then try to stabilise themselves by adapting their institutions (e.g. Alexander 2001). The theory of social differentiation (Schimank 1996) is, therefore, at the core of modernisation models, primarily looking at change from a traditional to a modern society based on an increasing degree of complexity. There is no consensus about the subsystems to be analysed and several slightly different solutions are offered in sociological literature: Lerner (1968) characterised the modern age by a growing, self-supporting economy, democratic participation in the political process, a culture oriented towards secular, rational norms, and a performance-based society which is

1

THE INSTITUTIONALISATION OF EVALUATION

9

equipped with personal freedoms and is geographically, socially and intellectually mobile. Talcott Parsons (1971) identified markets, administration, the rule of law and democracy as hallmarks of the modern age, Anthony Giddens (1996) capitalism, industrialisation, the nation state and its monopoly of power, Wolfgang Zapf (1991) competitive democracy, market economy and an affluent society, and Ronald Inglehart (1998) changes in values, rising standards of living and democracy. Pollack concludes that the majority of modernisation theorists agree that the most important factors revolve around economics and politics, that is economic markets and democratic forms of participation, and that the individual dimensions do not emerge independently of one another but are rather interwoven in an intricate way (Pollack 2016).

Following this theoretical concept, (modern) society thus consists of specialised subsystems operating in accordance with their own logic, exhibiting independent structures and processes, following their own codes and rules of communication, and developing their own internal dynamics. The relative structural and cultural independence of a subsystem is based on its specialisation for particular social tasks, which enables better, more tailored and more efficient processing. Although the functionally differentiated subsystems can be distinguished from one another using these criteria, the dividing lines between them are neither invariable nor impervious (cf. Pollack 2016). For their purposes, subsystems require the outputs of other subsystems, which also create their own specialisations due to their own performance. For example, the economic system is influenced by laws and rules which have been made by the political system. The scientific system is in a similar position, with the cultural system on the one hand having a determining influence on which research questions are dealt with, and the political system on the other hand determining—through the allocation of financial resources—to what extent and in how much depth these questions can be examined by academics. The respective societal subsystems are therefore mutually dependent and connected to each other through institutional rules for managing these interdependencies.

These institutional rules for managing interdependencies between subsystems open the door to the question of how these development processes may be shaped and managed. Rules and systems of governance are necessary to stabilise the amorphous relationship between the 'inside' and 'outside' of a (sub)system. Here, system boundaries are promoted while, at the same time,
'cross-border traffic' between the subsystems is regulated and channelled (cf. Kapitanova 2013, p. 257). Such institutions include formal and informal systems of control for opening and closing the system; they govern the exchange between subsystems that is necessary for system integration and facilitate the mutual monitoring of transactions and of incursions which break the rules (cf. Armingeon 2016). Consequently, Schimank (2001) sees in 'subsystem interdependencies' the key to understanding modern societies. Moreover, the conflicts and difficulties derived from (dis-)functionality within and between subsystems have to be solved by 'man-made' rules—social innovations with side effects and unforeseeable consequences in each of the subsystems.

The economic approach to institutions was quite different from these sociological enquiries. Its starting point was Ronald Coase's famous article on the 'nature of the firm' (Coase 1937), which tried to answer the question of why corporate actors exist in the economy and what their advantages are in market exchange. By bringing in the term 'transaction costs', Coase's way of thinking influenced many economists, finally forming 'institutional economics' as an important school (Voigt 2019). These economists focus mainly on the role of institutions for cooperation and exchange. In this context, institutions are seen as bridges between different actors, bringing them together for joint and successful collective action.

Finally, one has to mention the huge amount of literature and discussion on institutions in the political sciences (Peters 2019). The roots of this way of thinking can be traced back to Greek philosophers like Aristotle and include famous politicians like Montesquieu or the former US president Woodrow Wilson. Peters (2019, p. 7ff.) emphasises the holistic structuralism of this old institutionalism in political science, and the birth of modern political science was closely related to a critical perspective on this traditional way of thinking about the political system. Especially in the second half of the twentieth century, behavioural and more individual-oriented approaches like rational-choice theory dominated the way of thinking before the 'new institutionalism' was introduced by March and Olsen in the 1980s (1983). For them:

Political institutions are collections of interrelated rules and routines that define appropriate action in terms of relations between roles and situations. (March and Olsen 1989, p. 21)


This is not far away from a rational choice definition offered by Elinor Ostrom and broadly used in institutional economics. Therefore, institutions are … rules used by individuals for determining who and what are included in decision situations, how information is structured, what action can be taken and in what sequence, and how individual actions will be aggregated into collective decisions… (Kiser and Ostrom 1982, p. 179)

Bringing these two famous definitions from the 1980s together, one may keep the following elements for the context here:

• Institutions are a set of rules, norms and regulations (more or less formalised) used by individuals and corporate actors for orientation in situations where decisions about action must be made.
• Institutions are process-oriented, showing individuals and corporate actors pathways by building sequences and routines and guiding them through a jungle of decisions toward joint action.

Moreover, changes over time are of particular interest because, on the one hand, institutions must be stable in order to give guidance and cannot be changed within a short period of time, but, on the other hand, they must be flexible enough to adapt to social change and incorporate social innovations (Vargo et al. 2015). Furthermore, rules develop their own path-dependent dynamics within organisations (March et al. 2000; Zhou 1993), and thus institutions may become a hindering factor for good governance and improved development (Andrews 2013). So, the third aspect mentioned here is about the process of institutionalisation:

• Institutionalisation is the process of incorporating new rules, norms and regulations into an existing institutional system in order to adapt it to new demands from outside, to improve its effectiveness and/or efficiency by including innovations, and/or to extend its tasks and influence to new fields of activity.

These very general remarks and conclusions drawn out of a heterogeneous bundle of scientific literature from various disciplines imply some
further steps in order to apply them to research on the global success story of evaluation as a new instrument for governance. For the institutionalisation of evaluation, one has to look for:

• Rules, norms and regulations on evaluation, implemented in the already existing social subsystems. The first decision, therefore, is about the systems under observation and the kind of institutions to be investigated;
• Evaluation processes, procedures and routines, implemented within a broad set of organisations or networks at least as a possible way of practice within a certain policy field. So, the second decision is about these procedures and the scope of implementation;
• Finally, institutionalisation is a process with certain steps, and it is probably a long way toward building a complete and deep-seated institutional framework. Some subjective assessment of the state of the art is necessary, and any rating cannot be a completely objective and precise measurement. Therefore, some observable and comparable steps toward institutionalisation have to be assigned.

4 The Methodological Approach

The answer to the first question—which social subsystems should be observed—is closely related to the specific task of evaluation. Although there are many different ways to define 'evaluation', almost all agree on the fact that evaluation is contracted research, done for the purposes of commissioners and not for the evaluators themselves. While providing relevant information for decisions, it is first and foremost a management tool intended for the governance of social subsystems. In order to carry out management activities, the respective subsystems require information. That is to say, it is necessary to identify which outcomes can be achieved through targeted interventions under prevailing framework conditions. To enable effective management intervention, the administration requires information on whether the programmes and measures implemented have the desired effects, on what unintended impacts arise which may thwart the intended outcomes, on whether the activities carried out are efficient, whether they are appropriate for the target group in question and are sustainable, et cetera.

This leads to research on the political economy of governance and the evolution of social institutions for steering societies as a whole on a global,
national or local level (mainly linked to the political system), sub-systems like civil societies, education, economy and markets, and corporate governance in various kinds of profit and non-profit organisations (e.g. Schofield and Caballero 2015; Payne and Phillips 2014; Bertelli 2012; Barker 2010; Ebener 2008). Monitoring and evaluation can easily be located within this context, and it is spreading through all fields and levels of governance mentioned here. However, the history of evaluation is primarily linked to public activities in the political system, although the private sector developed in parallel some strongly related instruments like controlling or quality management (Stockmann 2008). Due to this use of monitoring and evaluation in the political system, one has to focus on the key principles of governance and its leading institutions there. The lens is put on the national state for reasons of international comparison, although the relevance of differences between sectors or regions should not be ignored. In some areas (e.g. development cooperation, cohesion policy), international regulation is at least as important as national regulation; in other areas (e.g. schools), more decentralised governance systems can be found in a number of countries. Nevertheless, the key question here is the implementation of monitoring and evaluation within the national political system of governance, with regard to system specifics and variations in Europe.

One of the key elements of modern democratic systems is the division of powers between three independent branches which control each other: the executive, the legislative and the judiciary. Executives and judiciaries work on the basis of the legal system, an institutional setting including acts, laws and regulations. They do not decide about the laws (apart from some special constitutional courts in the judiciary, which check whether laws are in line with the national constitution) and they are not allowed to adopt or change laws by their own decisions. This is the task of the legislative, in general the parliament (sometimes the president), and a majority of members' votes is necessary for adopting a new law. Therefore, in most countries, the government is formed by lists, clubs or parties with a majority of seats in the parliament, and it develops laws according to party programmes and coalition agreements. It is possible to distinguish between primary and secondary legislation: the former operates at a general national level (e.g. acts, statutes or broad outlines endorsed by the parliament), while the latter is power delegated by the parliament to specific authorities in the executive branch (e.g. ministries, agencies or expert groups), mainly decrees, orders, rules, or
regulations below the level of laws (although there are some 'administrative laws' and 'presidential decrees' in the USA; Lowe and Potter 2018). Subsequently, evaluation can be anchored in the legal system at the top national level (e.g. in the form of a 'national evaluation act' determining the evaluation of all national laws, strategies, policies, programmes, projects etc.) or at a lower scope of liability (e.g. a decree on doing evaluation in the health or other policy sectors).

As mentioned above, the institutionalisation of evaluation is not only a question of rules, norms and regulations but also of implementing it into routines of collective action within or between key actors. In the political system, this shifts the focus from the legislative to the executive part (cf. Goodnow 2003). While legislation is determined by parties and the public debate about political decisions, the executive is mainly an administrative process (cf. Kettl 2018). Primarily, the execution of laws and regulations lies in the hands of the government and its public bureaucracy, including national ministries, agencies, offices, bureaus et cetera (cf. Henry 2018). However, in some cases, governance is not limited to the state but also includes private actors such as companies, non-profit organisations or other service providers, acting on behalf of or as a substitute for the state. Some political scientists even see a shift from government executed by the state to network governance including all other societal forces (Mayntz 2003). Without any doubt, the institutionalisation of executive forces has changed during the last decades, and many new forms of governance institutions arose under the flag of 'new public management' (e.g. Christensen and Laegrid 2007; Kettl 2015; Ansell and Torfing 2016). The key ideas of these reforms were to bring 'service and quality orientation' from the private to the public sector and to improve the effectiveness and acceptance of bureaucratic governance (Stockmann 2008, p. 57ff.). Hence, this development increases the difficulty of identifying the institutionalisation of evaluation within this broad variety of different governance systems in particular policy fields, including the question of how evaluation is implemented within the whole administrative process, the practice of actors, the organisations involved, the role of control instances like the audit office and the parliament, and finally the scope of evaluation practice, including different forms and instruments of evaluation. The emphasis lies on the spectrum of implementations, so a perfect institutionalisation of evaluation means its inclusion in all policy fields—although the form and degree may differ.


In the context of the political system, the analysis therefore examines primarily how evaluation is integrated into institutional structures and processes (Table 1, column 1). The following questions are relevant here:

• Whether there are national laws and regulations requiring the conduct of evaluations, overall (on a national level) or in individual policy fields (institutionalisation in legislation and the legal system).
• Whether a national and/or sectoral evaluation policy exists through which the conduct and application of evaluations is systematised (institutionalisation in public administration and the system of governance).
• Whether and in what way evaluations are embedded in parliamentary structures and structures of authority (e.g. audit courts) and are used by politics (institutionalisation in policies and the system of decision making).
• Whether special evaluation units in governmental organisations or independent evaluation institutions in different policy fields exist, and how and to what extent evaluation is practised and used by decision makers (organisational institutionalisation of evaluation).

Table 1 Institutionalisation of evaluation in different sub-systems

Political system: institutional structures and processes
• Evaluation regulations
• Evaluation practice in governance
• Use of evaluation results for decision making

Social system: societal dissemination and acceptance of evaluation in civil society
• Institutionalised use of evaluations by civil society
• Public perception and discussion of evaluation and evaluation findings
• Civil societies demand evaluations

System of professionalisation: professionalisation of evaluations
• Academic study courses, further training etc.
• Platforms for communication about evaluation
• Compliance to standards and quality obligations

However, governance is not limited to the political system. As Alexis de Tocqueville already pointed out in his famous work 'De la démocratie en Amérique' (Tocqueville 2003), citizens' participation within a civil society
builds the base of democracy and anchors the roots of such a political system in an open society. Modern political theory confirms this view and understands the political system even as an outcome of civil society (e.g. Cohen and Arato 1994). Civil society is seen as a sphere of interaction between the private and public sectors, mainly organised through institutions for communicating interests (primarily voluntary associations such as trade unions) and for sharing responsibilities in governance (e.g. through self-regulation mechanisms like collective bargaining autonomy). Civil society is an important source of social development in controlling the state and its institutions, providing information on particular interests and needs, supporting the means and governance measures for improving its impact, and taking over responsibilities and self-organisation for social services (e.g. Ojo and Mellouli 2018; Kalm and Uhlin 2015; Laville et al. 2015; Lovan et al. 2003; Howell and Pearce 2001).

While the political system is legally well defined and regulated, civil society is heterogeneous and diverse, with many variations both in the kinds of interaction and in its institutions. It is not possible to describe and explain civil society as a holistic and closed system because it is rapidly changing due to social development and differentiated by various positions and opinions. In the context here, the focus is set again on national regulations for civil society, either set by the national state or by civil society itself. Although there might be some sectoral or regional regulations implemented for evaluation and its use, this is not enough to describe evaluation as being institutionalised in civil society as a whole. There are at least some general rules and institutions installed for regulating civil society (e.g. access to parliamentarians is controlled by lobby lists and contact rules, the formation of interest organisations depends on legal settings like election procedures or reporting systems, professional associations develop standards and control mechanisms for adequate behaviour within and between their own organisational settings etc.). The question here is whether and how evaluation is implemented in these institutional settings for civil society.

Again, institutionalisation is not limited to these kinds of rules but also includes the activities and regular procedures of civil society. At least two different functions can be distinguished: first, civil society acts as a partner of the state, sometimes sharing responsibilities and decision making as mentioned above, but more often acting as a service provider and transmitter between state and society. This role is primarily taken by non-governmental
organisations (NGOs), acting as non-profit service providers and interest groups. Second, civil society also acts as a watchdog, keeping a certain distance from the state and criticising decisions and performances of state organisations. In this role, civil society works for social change and development, including a new or different focus of state policy or governance structures. Evaluation can be used in both fields: as an instrument for improving NGOs' own activities, aiming for better societal services, or as an instrument for controlling state activities, aiming for more useful policies. Institutionalisation of evaluation in civil society therefore means using these instruments generally (not only in some limited fields) for social betterment and for fulfilling both tasks.

In the context of the social system, the societal spread and acceptance of evaluation is analysed (Table 1, column 2). Questions such as the following are asked:

• Whether organisations of civil society use the instrument for the improvement of their organisation and, if so, how and to what end.
• Whether there is a public dialogue around evaluation results.
• Whether and to what extent citizens, organisations of civil society or other actors are involved in evaluations.
• Whether civil society organisations demand evaluation from political institutions.

As already mentioned, the institutionalisation of interest groups in civil society is an important aspect, especially when it comes to professions. Professions link needed social services to the continuous development of the corresponding expertise. However, the literature on professions reveals many different perspectives and criteria for how to define professions and distinguish them from non-professions (MacDonald 1995). One of these long-lasting debates can be found in the evaluation community, asking whether evaluation is a profession in its own right or just a general term for a bundle of activities in heterogeneous practice (e.g. Wanzer 2019; Jacob and Boisvert 2010; Worthen 1994; Patton 1990; Merwin and Wiener 1985; Morell and Flaherty 1978). The most highlighted element for defining a profession is expertise and how it is institutionalised within a society (see especially Abbott 1988). Freidson (2001, p. 84) argued: "The key to the occupational control of work is the creation of a training credential that becomes a prerequisite for entering a labour market and performing a defined set of tasks […] the
credential testifies to successful training in a specialised, relatively exclusive school associated with a university". In general, the academic system is the main place for producing and communicating expertise in modern societies. Universities have a monopoly on providing academic titles, which are necessary for the right to teach within the autonomous academic system. The main institution is a fully equipped study programme (mainly at Master or PhD level), and there are clear rules within the academic system on how to establish a new study programme. Becoming a main subject of such a study programme (with at least 60 ECTS points) is the highest level of institutionalisation in academic teaching, but there are certain alternatives within the academic system (e.g. as a minor topic or only as single courses offered in other study programmes) or outside it in general adult training courses (e.g. offered by state institutions, associations or private consultancies).

Some authors emphasise the process of achieving a 'professional identity' (e.g. Colbeck 2008), how it is derived from social interaction with others (e.g. Specht 1985), how it comes into conflict with 'organisational identity' (e.g. Ashforth and Mael 1989) and further questions developed mainly at the individual and not at the system level. The main focus lies on communication as the key source for achieving a professional identity and for exchange on professional expertise (and new developments within the profession). This communication is possible within the academic system and must adapt to the existing styles and rules of science communication (Schiele et al. 2012; Chan and Fisher 2008). The key platforms for academic exchange are peer-reviewed journals, with an increasing influence in the digital age (e.g. Larivière et al. 2015). Although there are certainly other platforms for academic communication (particularly in new media), a specialised journal is today the most important institution.

But the exchange between experts is not limited to academic journals: most professions have developed a broad variety of other ways of communication. Some authors see professional associations as an "essential component of professionalism" (Friedman and Phillips 2004, p. 187) because they provide a broad variety of benefits and options for their members and they build bridges between individual experts and society (e.g. Merton 1958; Greenwood et al. 2017). Such an organisation is, therefore, a key institution for a profession's internal and external communication, and a large, powerful association is the best way of anchoring a profession within a society.


Finally, many researchers on professionalisation emphasise the autonomy of professions and their ability to set their own rules and norms (e.g. Rueschemeyer 1983). The most powerful solution is obligatory and exclusive rules, giving only the holders of a title, licence or approbation the right to practise in the field of expertise. Such a process of social closure and monopolising the market is seen as a key target of professionalisation (e.g. MacDonald 1985; Richardson 1997). While the professionalisation literature overemphasises this external function in competition with other professions, standard systems, codes of ethics and all kinds of guidelines primarily target practice within the profession and try to improve the quality of services (e.g. Ingvarson 1998). It is an open question how rules about evaluation practice are institutionalised in European countries.

The analysis of the institutionalisation of evaluation in the system of professions therefore revolves around the embedding of evaluation as an independent entity within the academic system and the market processes, based on the criteria derived from professionalisation research. The following questions are dealt with (Table 1, column 3):

• Whether there are academic educational, training and qualification offerings in the field of evaluation (rubric 'Academic study courses, further training etc.').
• Whether there are scientific and technical journals or other media and fora that deal with evaluation and act as a communication platform (rubric 'Profession').
• Whether there are professional organisations (trade associations) for evaluation which manage the exchange of ideas and promote the ongoing development of the discipline (rubric 'Profession').
• Whether professional standards and rules for evaluation have been established and are generally binding (rubric 'Compliance to standards and quality obligations', see Appendix).

To sum up: institutionalisation of evaluation is here seen as broad implementation in three main societal systems. The political system should use evaluation as a tool for governance, installed in general national acts and laws (not limited to certain policy fields) and included in all kinds of governance mechanisms (again not limited to certain policy fields). Finally,
the results of these evaluations should be used as a basis for decision making and learning in order to improve the quality of state activities and public services. The social system, civil society organisations in particular, should use evaluation as a general tool for improving their own activities, both for providing social services in cooperation with the state and for addressing the state as a recipient of citizens' interests. Moreover, evaluation results should be used by civil society and the public media to critically discuss options for improving governance and the political performance of the state. Finally, civil society should be included as a source of information in evaluations run by the state (as a general rule in all policy fields, not only in particular cases). The system of professions should include an academic study programme teaching how to do evaluations in a broad and universally usable form. It should also provide a platform of exchange for research on evaluation, guaranteeing the further development of the profession in theory, methods and practice. An applied discipline like evaluation also needs an exchange between these academic approaches and practice in a broad variety of policy fields. Professional organisations are the best way of interlinking basic research (as part of what evaluation is able to offer) and public demand (as part of what evaluation is needed for). Finally, the evaluation community should develop certain principles and standards for how evaluation should be done, for quality assurance and for differentiation from other activities and services. These criteria add up to nine fields of investigation, as shown in Table 1.

5 Selection of Countries and Methodological Procedure

When selecting the countries for the European volume, the first step was to conduct comprehensive literature and internet research in order to find out in which countries evaluation is significantly relevant. There was a pragmatic reason for this: as the publisher had allotted a maximum of 400 pages, including an introductory and synthesis chapter and a chapter on evaluation in the EU, no more than 20 countries could be included if each case study were only 15 pages long, which would allow each country to make a meaningful contribution. Of the countries selected in this way, and despite extensive efforts, authors could not be found for two countries: Sweden and Austria. Originally, due to the size of the countries, a summary article on Estonia, Latvia and Lithuania had been planned. However, it turned out that the contextual conditions in the three Baltic States vary so greatly
that they simply could not be addressed in one article, so there is only one case study on Latvia in this volume.

The European Book Project was launched at the twelfth conference of the European Evaluation Society (EES) in September 2016 in Maastricht, at which the research objectives and the methodological approach were presented and discussed. In addition, the conference also offered the opportunity to speak with potential authors. The interest in the project was so great that it was possible to quickly recruit expert authors (and author teams) for most countries. The authors were required to write their case studies using the analytical framework derived from the theoretical and methodological considerations, which had been revised again after the EES conference (Appendix). While writing their case studies, the authors were also requested, in addition to providing a literature and document analysis, to conduct interviews with evaluation experts in which they could ask the questions for which insufficient information was available.

The two main goals pursued by the case studies were completeness and comparability. On the one hand, all research questions listed in the analytical framework were to be answered as comprehensively as possible. On the other hand, a high level of comparability beyond the individual research questions was to be achieved. In addition, a glossary was created and made available to everyone so that they could base their analyses on a consistent understanding of the terminology.

The two main goals could only be achieved to a limited extent. Although in all cases authors were found who are subject matter experts and thus have a very good overview of the institutionalisation of evaluation in their respective countries, not all questions could be answered down to the last detail. This is mainly because there is too little information about the practice of evaluation. For example, while the legal basis for evaluation, the existence of national or policy-related legal regulations, the role of parliaments and audit courts, the organisational basis for evaluation in ministries and public authorities, and the sectoral dissemination of evaluation are all well documented, questions about the practice of evaluation—that is, how evaluation is carried out and how often—are significantly more difficult to answer. This applies even more so to the questions of who uses evaluations and for what. It was also not easy to answer the analysis questions regarding the social system, since the variety of operating NGOs is vast and their evaluation practice varies greatly and is also scarcely known. On the other hand, the assessment of public perception and discussion of evaluation results,
the participation of NGOs in evaluations or the demand of civil society organisations for evaluation could be determined quite precisely. Apart from a few analysis questions, the complex of questions on professionalisation was also easy to handle, as academic courses of study, further training et cetera, the existence of VOPEs, the availability of journals, certification systems or quality standards et cetera are all well documented. Another reason why comprehensive answers could not be provided for all analysis questions is quite simply the available space, which was severely restricted by the allotted number of pages. The information available in individual case studies, such as in Germany, but certainly also in other countries, was so extensive that it could easily have filled an entire volume on its own. Interviews conducted with other evaluation experts of the countries being analysed were rather rare, and they are insufficiently documented. The reason for this may be the abundance of existing material, which did not make additional interviewing absolutely necessary.

All in all, it can be said that the present case studies do not cover all analytical questions in detail, but rather offer a wealth of information and—most importantly—have a high degree of comparability. Important factors in this were not only the authors' great willingness to adapt to the corset of the analysis framework, but also the use of feedback loops. Intensive feedback on missing information and assessments in the individual case studies allowed them to be harmonised and their comparability to be increased. This has not only made it easier for the editors to write the cross-sectional analysis presented in the synthesis chapter, but also allows all readers to comparatively analyse the results.

What was rather unsuccessful, however, was the use of another tool, which was meant to complement the individual country case studies. Originally, a quantitative survey was planned to provide the most representative picture possible of the institutionalisation of evaluation in the individual countries. Survey data were to be used to back up and validate the country case studies. Therefore, an online survey in several European languages (English, French, German and Spanish) was developed using the analytical framework. Data retrieved from this source were to be shared with the country experts to provide an empirical background, given the very small number of evidence-based studies on the embedding and use of evaluation worldwide. The plan was to distribute the survey through the national evaluation societies of the respective countries selected for the volume, and through evaluation mailing lists and online forums, in order to reach a broad range
of evaluation experts throughout Europe. Although a lot of time and effort had gone into carrying out the quantitative survey, the distribution process unfortunately did not turn out as planned. Not many evaluation societies were eager to distribute the survey within their networks. In the end, the survey only reached a small number of evaluation experts, and the data collected were not sufficient to be used as a validation base for the results obtained in the country case studies. The editors of this book therefore regrettably had to abandon the idea of creating a database regarding the institutionalisation of evaluation. Although we have had to forego this complementary data source, the case studies nevertheless provide a comprehensive overview of the institutionalisation of evaluation in 16 European countries and the EU. These are presented below, followed by a concluding synthesis chapter which identifies commonalities and differences between the European countries.

Appendix

Analytical Guideline: Compendium on the Institutionalisation of Evaluation

I. Institutional structures and processes (Political system)

I.1 Evaluation regulations
• Are there national laws or regulations about evaluation or use of evaluation? If yes, which?
• Are there sectoral laws or regulations about evaluation or use of evaluation (e.g. a law about school evaluation or evaluation in the higher education system as example for laws in the educational sector)? If yes, which?
• Are there policies or strategies about evaluation or use of evaluation, either national or sectoral? If yes, which?
• Are there administrative regulations about evaluation or use of evaluation in different policy fields (instructions, guidelines, etc.)? If yes, which?
• What is the content of these laws/regulations/policies/strategies or administrative regulations regarding independence of evaluation, quality, impact orientation and available budget?
– Is use of evaluation specified? If yes, how?
– How binding are specifications regarding use of evaluation?

I.2 Evaluation practice
• With regard to the whole country: How would you describe the scope of conducted evaluations? Is it possible to speak of a frequent rhythm of evaluations, for instance for every new legislation or for every national program? Or is it rather non-specific? Does evaluation take place in all sectors/policy fields of a country (instead of only in the field of development cooperation, e.g.)? And within one sector, is evaluation applied for measures funded in different ways or maybe only the ones that received funding by the European Union?
• With regard to the whole country: How would you describe the relation between internal and external evaluations? Which form is carried out more often and for what purposes?
• What are possible reasons for this (e.g. determination in laws, policies or regulations)?
• Is this relation differing with regard to sector or state level?

I.3 Use of evaluations
• Which sectors are "good performer" regarding use of evaluation and evaluation findings? Please describe up to 3 sectors that can be considered as leading in the field of evaluation's use.
• Which sectors are "bad performer" regarding use of evaluation and evaluation findings? Please describe up to 3 sectors that are lagging behind in the field of evaluation's use.
• On which aspect do most evaluations focus in these sectors (e.g. Planning and Steering, Accountability and Legitimacy, Enlightenment)?
– In the case of different findings in different sectors: What might be possible reasons for these differences?
• Which professional groups use evaluation and evaluation findings regularly (e.g. political decision makers, program or project managers, administrative staff)?
– For what reasons are evaluations and/or evaluation findings used by these groups?

I. Institutional structures and processes (Political system) (continued)

I.1 Evaluation regulations (continued)
– What are aspired functions of evaluation (e.g. Planning and Steering, Accountability and Legitimacy, Enlightenment)?
• Is evaluation and use of evaluation findings embedded in parliamentary structures? If yes, how?
– Do parliamentarians in your country deal with evaluation findings for their own political work? If yes, to what extent (how often/how detailed do they use evaluation findings)?
– Do parliamentarians in your country demand evaluations for their own political work? If yes, to what extent? (How often? Do they commission evaluations? Do they publicly demand evaluations?)

I.2 Evaluation practice (continued)
• With regard to the whole country: How would you describe the relation between process and impact/outcome evaluations? Which form is used more often and for what purposes?
– What are possible reasons for this (e.g. determination in laws, policies or regulations)?
– Is this relation differing with regard to sector or state level?
• Does an independent evaluation institute exist in your country?
– With a national responsibility?
– With a responsibility for a specific sector or policy field?
• Do independent internal departments exist, in ministries or elsewhere?

I.3 Use of evaluations (continued)
– How is the use of evaluation findings guaranteed (f.i. management response mechanisms, implementation of monitoring for evaluation results, others)?
– Are there differences with regard to different sectors?
• How is the quality of evaluations guaranteed (e.g. regular conduction of meta-evaluations analyses, competence requirements for evaluators, quality requirements for evaluations)?

II. Societal dissemination/acceptance (Social system)

II.1 Institutionalised use of evaluations by civil society
• Is it usual practice in your country that evaluations are used to provide knowledge for referenda or political decision making on a communal basis?
– If yes, how regularly does this happen? If not, what might be possible hindering factors?
• Are evaluations and evaluation findings used by individual citizens/civil society organisations and/or private enterprises or other actors?
– If yes, for what reasons (e.g. enforcement of their interests, knowledge or proof for work related issues, knowledge or proof for voluntary activities etc.) and how regularly? If not, what might be possible hindering factors?
• Is it usual practice in your country that citizens or civil society organisations (NGOs, CSOs, churches etc.) are participating in evaluations (as stakeholder)?
– If yes, how regularly does this happen? What are different forms of participation (e.g. as interview partners, as clients, as users of evaluation findings etc.)? If not, what might be possible hindering factors?

II.2 Public perception and discussion of evaluation and evaluation findings
• How well-known is the instrument of evaluation in society?
• Are evaluation reports (full version) made publicly available?
• Is the general use of evaluation publicly discussed in media (benefits of evaluation, quality of evaluations, and professionalisation of evaluation)?
– If yes, to what extent? If not, what might be possible hindering factors?
• Are findings of actual evaluations publicly discussed (surprising findings, different possibilities of dealing with these findings)?
– If yes, to what extent? If not, what might be possible hindering factors?

II.3 Civil societies demand evaluations
• Do individual citizens, civil society organisations, private enterprises or other actors in your country demand evaluations, e.g. from political decision-makers?
– If yes, how often does this happen and under which circumstances/for what reasons? If not, why not? What might be possible hindering factors?

III. Professionalisation (system of professions)

III.1 Academic study courses, further training et cetera
• Do programs of higher university education for evaluators (Diploma, Master) exist in your country? If yes, how many and where?
• In which other scientific disciplines is evaluation instructed as scientific subject? Please give as many examples as possible.
• Do other forms of academic or non-academic training exist (e.g. e-learning, training by consultancies, else)?

III.2 Profession/discipline
• Which professional journals, newsletters or other ways/media of communication (e.g. e-Mail or discussion lists) exist?
• Which professional journals from other scientific disciplines deal with evaluation regularly?
• Does a professional organisation (VOPE – Volunteer Organisations for Professional Evaluation) exist in your country?
• Do standards, guiding principles for evaluators or something similar exist in your country?
– Developed by the VOPE?
– Adopted from another VOPE?
• Would you say that the evaluation market in your country is mostly dominated by freelancers (people calling themselves evaluators), consulting firms or scientific research institutes?
• Does a professorship for evaluation exist in your country?

III.3 Compliance to standards and quality obligations
• Do professional organisations ask their members to follow standards or guiding principles? If yes, how obligatory is this?
• Do clients demand a certain evaluation quality and/or compliance to standards? What does this demand look like (is it obligatory)?
• To what extent do evaluators (and clients) follow these standards and/or quality obligations?
• Does a certification system for evaluators exist in your country?
• Does an authority which might be asked to conciliate in case of a dispute, like an arbitration board or ombudsman, exist in your country?


References

Abbott, A. (1988). The system of professions: An essay on the division of expert labour. Chicago and London: Chicago University Press.
Alexander, J. C. (2001). Soziale Differenzierung und kultureller Wandel: Essays zur neofunktionalistischen Gesellschaftstheorie. Frankfurt am Main: Campus-Verlag.
Altbach, P. G. (2003). Centers and peripheries in the academic profession: The special challenges of developing countries. In P. G. Altbach (Ed.), The decline of the Guru (pp. 1–21). Basingstoke: Palgrave Macmillan.
Andrews, M. (2013). The limits of institutional reform in development: Changing rules for realistic solutions. Cambridge and New York: Cambridge University Press.
Ansell, C., & Torfing, J. (Eds.). (2016). Handbook on theories of governance. Cheltenham and Northampton: Edward Elgar Publishing.
Armingeon, K. (2016). Political institutions. In H. Keman & J. J. Woldendorp (Eds.), Handbook of research methods and applications in political science (pp. 234–247). Cheltenham and Northampton: Edward Elgar.
Ashforth, B. E., & Mael, F. (1989). Social identity theory and the organization. Academy of Management Review, 14(1), 20–39.
Barker, R. M. (2010). Corporate governance, competition, and political parties: Explaining corporate governance change in Europe. Oxford: Oxford University Press.
Bertelli, A. M. (2012). The political economy of public sector governance. New York: Cambridge University Press.
Brante, T. (1988). Sociological approaches to the professions. Acta Sociologica, 31(2), 119–142.
Chan, A. S., & Fisher, D. (Eds.). (2008). The exchange university: Corporatization of academic culture. Vancouver: UBC Press.
Christensen, T., & Laegrid, P. (2007). The whole-of-government approach to public sector reform. Public Administration Review, 67(6), 1059–1066.
Coase, R. H. (1937). The nature of the firm. Economica, 4(16), 386–405.
Cohen, J. L., & Arato, A. (1994). Civil society and political theory. Cambridge and London: MIT Press.
Colbeck, C. L. (2008). Professional identity development theory and doctoral education. New Directions for Teaching and Learning, 113, 9–16.
Coryn, C. L., Wilson, L. N., Westine, C. D., Hobson, K. A., Ozeki, S., Fiekowsky, E. L., et al. (2017). A decade of research on evaluation: A systematic review of research on evaluation published between 2005 and 2014. American Journal of Evaluation, 38(3), 329–347.
Ebener, A. (2008). Institutional evolution and the political economy of governance. In A. Ebener & N. Beck (Eds.), The institutions of the market: Organizations, social systems, and governance (pp. 287–308). Oxford and New York: Oxford University Press.


Freidson, E. (2001). Professionalism, the third logic. Cambridge/Malden: Blackwell/Polity.
Friedman, A., & Phillips, M. (2004). Balancing strategy and accountability: A model for the governance of professional associations. Nonprofit Management and Leadership, 15(2), 187–204.
Furubo, J. E., Rist, R. C., & Sandahl, R. (2002). International atlas of evaluation. New Brunswick, NJ: Transaction Publishers.
Giddens, A. (1996). Konsequenzen der Moderne. Frankfurt am Main: Suhrkamp.
Goodnow, F. J. (2003). Politics and administration: A study in government. New York: Routledge.
Greenwood, R., Suddaby, R., & Hinings, C. R. (2017). Theorizing change: The role of professional associations in the transformation of institutionalized fields. Academy of Management Journal, 45(1), 58–80.
Henry, N. (2018). Public administration and public affairs. New York and London: Routledge.
Howell, J., & Pearce, J. (2001). Civil society and development: A critical exploration. Boulder: Lynne Rienner.
Inglehart, R. (1998). Modernisierung und Postmodernisierung: Kultureller, wirtschaftlicher und politischer Wandel in 43 Gesellschaften. Frankfurt am Main: Campus-Verlag.
Ingvarson, L. (1998). Professional development as the pursuit of professional standards: The standards-based professional development system. Teaching and Teacher Education, 14(1), 127–140.
Jacob, S. (2005). Institutionnaliser l'évaluation des politiques publiques: étude comparée des dispositifs institutionnels en Belgique, en France, en Suisse et aux Pays-Bas (No. 35). Frankfurt am Main: Peter Lang.
Jacob, S., & Boisvert, Y. (2010). To be or not to be a profession: Pros, cons and challenges for evaluation. Evaluation, 16(4), 349–369.
Jacob, S., Speer, S., & Furubo, J. E. (2015). The institutionalization of evaluation matters: Updating the international atlas of evaluation 10 years later. Evaluation, 21(1), 6–31.
Kalm, S., & Uhlin, A. (2015). Civil society and the governance of development: Opposing global institutions. Basingstoke: Palgrave Macmillan.
Kapitanova, J. (2013). Regeln in sozialen Systemen. Wiesbaden: Springer.
Kettl, D. F. (2015). The transformation of governance: Public administration for the twenty-first century. Baltimore, MD: Johns Hopkins University Press.
Kettl, D. F. (2018). Politics of the administrative process. Thousand Oaks, CA: Sage and CQ Press.
Kiser, L., & Ostrom, E. (1982). The three worlds of action: A metatheoretical synthesis of institutional approaches. In E. Ostrom (Ed.), Strategies of political inquiry (pp. 179–222). Beverly Hills: Sage.


Larivière, V., Haustein, S., & Mongeon, P. (2015). The oligopoly of academic publishers in the digital era. PLoS ONE, 10(6), 1–15.
Laville, J. L., Young, D. R., & Eynaud, P. (Eds.). (2015). Civil society, the third sector and social enterprise: Governance and democracy. London and New York: Routledge.
Lerner, D. (1968). Modernization: Social aspects. In L. David (Ed.), International encyclopedia of the social sciences (pp. 386–402). New York: Macmillan.
Lovan, W. R., Murray, M., & Shaffer, R. (Eds.). (2003). Participatory governance: Planning, conflict mediation and public decision making in civil society. London: Routledge.
Lowe, D., & Potter, C. (2018). Understanding legislation: A practical guide to statutory interpretation. London: Bloomsbury Publishing.
Luhmann, N. (1983). Legitimation durch Verfahren. Frankfurt am Main: Suhrkamp.
Luhmann, N. (1988). Soziale Systeme: Grundriß einer allgemeinen Theorie. Frankfurt am Main: Suhrkamp.
MacDonald, K. M. (1985). Social closure and occupational registration. Sociology, 19(4), 541–556.
MacDonald, K. M. (1995). The sociology of the professions. London: Sage.
March, J. G., & Olsen, J. P. (1983). The new institutionalism: Organizational factors in political life. American Political Science Review, 78(3), 734–749.
March, J. G., & Olsen, J. P. (1989). Rediscovering institutions: The organizational basis of politics. New York: Free Press.
March, J. G., Schulz, M., & Zhou, X. (2000). The dynamics of rules: Change in written organizational codes. Stanford: Stanford University Press.
Mayntz, R. (2003). From government to governance: Political steering in modern societies. Summer Academy on IPP, 7–11. https://www.ioew.de/fileadmin/user_upload/DOKUMENTE/Veranstaltungen/2003/CVMayntz.pdf. Accessed on 6 June 2019.
Merton, R. K. (1958). The functions of the professional association. The American Journal of Nursing, 58(1), 50–54.
Merwin, J. C., & Wiener, P. H. (1985). Evaluation: A profession? Educational Evaluation and Policy Analysis, 7(3), 253–259.
Morell, J. A., & Flaherty, E. W. (1978). The development of evaluation as a profession: Current status and some predictions. Evaluation and Program Planning, 1(1), 11–17.
Ojo, A., & Mellouli, S. (2018). Deploying governance networks for societal challenges. Government Information Quarterly, 35(4), 106–112.
Parsons, T. (1939). The professions and social structure. Social Forces, 17(4), 457–467.
Parsons, T. (1971). Evolutionäre Universalien in der Gesellschaft. In W. Zapf (Ed.), Theorien des sozialen Wandels (pp. 55–74). Köln: Kiepenheuer & Witsch.


Patton, M. Q. (1990). The challenge of being a profession. Evaluation Practice, 11(1), 45–51.
Payne, A., & Phillips, N. (Eds.). (2014). Handbook of the international political economy of governance. Cheltenham: Edward Elgar Publishing.
Peters, B. G. (2019). Institutional theory in political science: The new institutionalism. New York and London: Bloomsbury Publishing USA.
Pollack, D. (2016). Modernisierungstheorie—Revised: Entwurf einer Theorie moderner Gesellschaften. Zeitschrift für Soziologie, 45(4), 219–240.
Richardson, A. J. (1997). Social closure in dynamic markets: The incomplete professional project in accountancy. Critical Perspectives on Accounting, 8(6), 635–653.
Rosenstein, B. (2013, December). Mapping the status of national evaluation policies. Parliamentarians Forum on Development Evaluation in South Asia and EvalPartners.
Rosenstein, B. (2015, February). Status of national evaluation policies: Global mapping report (2nd ed.). Parliamentarians Forum on Development Evaluation in South Asia and EvalPartners. http://www.pfde.net/index.php/publicationsresources/global-mapping-report-2015. Accessed on 2 July 2019.
Rueschemeyer, D. (1983). Professional autonomy and the social control of expertise. In R. Dingwall & P. Lewis (Eds.), The sociology of the professions: Lawyers, doctors and others (pp. 38–58). London and New York: Macmillan St. Martins Press.
Schiele, B., Claessens, M., & Shi, S. (Eds.). (2012). Science communication in the world: Practices, theories and trends. Dordrecht: Springer.
Schimank, U. (1996). Theorien gesellschaftlicher Differenzierung. Wiesbaden: Springer.
Schimank, U. (2001). Teilsysteminterdependenzen und Inklusionsverhältnisse: Ein differenzierungstheoretisches Forschungsprogramm zur System- und Sozialintegration moderner Gesellschaft. In E. Barlösius, H. Müller, & S. Sigmund (Eds.), Gesellschaftsbilder im Umbruch: Soziologische Perspektiven in Deutschland (pp. 109–130). Wiesbaden: Springer.
Schofield, N., & Caballero, G. (Eds.). (2015). The political economy of governance: Institutions, political performance and elections. Cham, Switzerland: Springer.
Specht, H. (1985). Managing professional interpersonal interactions. Social Work, 30(3), 225–230.
Stachura, M., Bienfait, A., Albert, G., & Sigmund, S. (Eds.). (2009). Der Sinn der Institutionen: Mehr-Ebenen- und Mehr-Seiten-Analyse. Wiesbaden: Springer.
Stockmann, R. (2008). Evaluation and quality development: Principles of impact-based quality management. Frankfurt am Main: Peter Lang.
Stockmann, R. (2013). Evaluation in der Entwicklungszusammenarbeit. In J. Wilhelm & H. Ihne (Eds.), Einführung in die Entwicklungspolitik (3rd ed., pp. 541–562). Berlin: LIT Verlag.


Stockmann, R., & Meyer, W. (2014). Evaluation: Eine Einführung (2nd ed.). Leverkusen: UTB Verlag.
Stockmann, R., & Meyer, W. (Eds.). (2016). The future of evaluation: Global trends, new challenges, shared perspectives. Basingstoke: Palgrave Macmillan.
Tocqueville, A. (2003). Democracy in America and two essays on America. London: Penguin.
Traugott, M. (2013). Emile Durkheim on institutional analysis. Chicago and London: Chicago University Press.
Vargo, S. L., Wieland, H., & Akaka, M. A. (2015). Innovation through institutionalization: A service ecosystems perspective. Industrial Marketing Management, 44, 63–72.
Voigt, S. (2019). Institutional economics: An introduction. Cambridge: Cambridge University Press.
Wanzer, D. (2019). What is evaluation? Perspectives of how evaluation differs (or not) from research. https://psyarxiv.com/v9g8f/. Accessed on 9 March 2019.
Widmer, T., Beywl, W., & Fabian, C. (Eds.). (2009). Evaluation: Ein systematisches Handbuch. Wiesbaden: Springer.
Worthen, B. R. (1994). Is evaluation a mature profession that warrants the preparation of evaluation professionals? New Directions for Program Evaluation, 1994(62), 3–15.
Zapf, W. (Ed.). (1991). Die Modernisierung moderner Gesellschaften: Verhandlungen des 25. Deutschen Soziologentages in Frankfurt am Main 1990 (Vol. 25). Frankfurt am Main: Campus-Verlag.
Zhou, X. (1993). The dynamics of organizational rules. American Journal of Sociology, 98(5), 1134–1166.

PART II

Northern Europe

CHAPTER 2

Denmark

Peter Dahler-Larsen and Hanne Foss Hansen

1 General Country Overview

With its 5.7 million inhabitants, Denmark is a small nation situated in the Nordic part of Northern Europe. Denmark has a large public sector, as the welfare state is well developed. More than 33% of the total workforce is publicly employed. Accordingly, the level of taxation is high. Denmark is involved in intensive international collaboration in international organisations, some of which develop evaluation policies and standards, such as the European Union and the Organisation for Economic Co-operation and Development.

When analysing something as a 'context' for evaluation, the 'context' can be seen as a given fact, or interpreted and accounted for in different ways. The Danish context can be retold as a story of a traditional, communitarian, culture-bound nation gradually under change, or as a modern, flexible,
and dynamic state highly susceptible to international influences such as neoliberalism (Dahler-Larsen and Schwandt 2006). Given the interaction between evaluation theory and practice, and given the relatively small size of the evaluation community in Denmark, all knowledgeable observers are also involved in evaluation. We will base our account on data as much as we can, but systematic data on evaluation practice is sparse. There is no official evaluation journal and no national database which covers evaluations. It is our interpretation that the structural and cultural factors, and the public-sector transformation mentioned in the following, have influenced how evaluation has been adopted, translated, implemented and developed and how it is regulated and practised today. 1.1

1.1 Structure

Denmark is a unitary but highly decentralised state with a constitutional but ceremonial monarchy. Since the 1980s, Danish governments have mostly been coalition governments, and for long periods minority coalition governments. The executive power rests with the prime minister, the cabinet and the individual ministers, who have direct ministerial accountability vis-à-vis the parliamentary system. Within the limits of the law and the budgetary framework decided by parliament, each minister has decision-making power in every matter in his/her field of responsibility. Ministers thus hold both political and administrative roles. As a consequence, policy and administration are blended in an informal way, ministries are organised differently, and administrative policies, including evaluation policies, are implemented in a sector-wise fashion (Hansen 1999, 2003). A large proportion of public expenditures in key policy areas for the welfare state (such as basic education, health care and social work) are administered by regions and municipalities. Symptomatically, these regional and municipal levels employ as much as 74% of the public-sector workforce. Within limits, municipalities levy income tax on their inhabitants. Perhaps this factor helps emphasise the responsibility of each municipality for delivering value for money. Local governments have considerable autonomy regarding how to organise the provision of welfare services. The central/local government interface is regulated by agreements and informal norms. Political responsibility is pushed back and forth between the central and decentralised levels. The national association of municipalities is a key player. While it has supported a variety of evaluation initiatives, it has also been instrumental in delaying large-scale, systematic and publicly available comparative performance information about municipal service provision (Kristiansen et al. 2017).


Denmark is a highly organised society. Stakeholders such as employers, employees, students, patients, the elderly and other user groups, as well as public organisations such as local governments, are organised in interest organisations. A tradition of corporatism, although somewhat diminished in recent years, constitutes the backdrop for ongoing dialogue and stakeholder involvement in policy development, including involvement in discussions on evaluation (Hansen and Jørgensen 2009). While the legislative and administrative structures have been revised at regular intervals through Danish history, there has never been a drastic revolution in which all principles have been redefined. The administrative mentality lacks strong principles, but it is also characterised as trust-based, pragmatic, relatively effective, and surprisingly capable of meeting challenges, including financial strain (Mouritzen 2012).

1.2 Culture

Denmark is a high-trust country and one of the most corruption-free countries in the world according to the Corruption Perceptions Index provided by Transparency International (2016). Objectively, the specific need to use evaluation to further transparency and good government is therefore not as pressing in Denmark as in many other countries. An argument saying that evaluation is necessary because of a lack of trust in public institutions therefore has less empirical support in Denmark than in other countries. Whether evaluation indicates 'lack of trust' continues to be an issue in Denmark. Furthermore, the Danish culture is characterised by low scores on power distance and uncertainty avoidance (Hofstede 2001). Power is decentralised and the workplace atmosphere is informal. Danes do not need a lot of structure and predictability in their work life. In addition, research has revealed that public managers, when asked, refer to a set of stable public values. In their own view, these are: renewal and innovation, independent professional standards, accountability towards society in general, public insight and judicial values (Jørgensen 2007; Jørgensen and Vrangbæk 2013). Perhaps in consistency with a Danish form of pragmatism, innovation is thus seen as a more or less traditional value.

1.3 Public-Sector Reforms/Transformations

Public-sector reforms inspired by what was later termed New Public Management (NPM) saw the light of day in the 1980s. Changing management practices emphasising performance monitoring and evaluation have been very important reform components, transforming policy fields and public organisations (Hansen 2011). In 2007, a structural reform of local governments was enacted, resulting in larger municipalities intended to have higher administrative capacity. As a corollary, the evaluation capacity of each municipality was strengthened. However, a survey (KREVI 2008) showed great variation in their evaluative capacity. Many municipalities had no explicit policy on evaluation as such, and less than 20% had written guidelines describing the use of evaluation. Most of the municipalities expected to increase the number of evaluations in the following years, and a need for specific evaluation training was identified. In the wake of the financial and economic crisis emerging in the late 2000s, which hit Denmark harder than the neighbouring countries, centralisation has been an important component in public-sector reforms. A tension that has persisted over several decades concerns whether differences in performance across municipalities should be made clearly visible through statistical indicators (Hansen et al. 2014). As evaluation becomes integrated in managerial practice, and evaluation results appear online, the distinction between evaluation on the one hand and performance indicators, audit, accreditation and quality assurance on the other becomes increasingly blurred. We prefer to see the former as an overarching analytical category and the latter as variations in the design, form and use of evaluative procedures which are becoming increasingly interconnected and overlapping in practice.

2 Institutional Structures and Processes

Having given a general country overview, we now turn to institutional structures and processes, distinguishing between evaluation regulations, evaluation practice and the use of evaluations.

2.1 Evaluation Regulation

Although Denmark was found to have a relatively mature evaluation culture both in 2001 (Furubo et al. 2002) and in 2011 (Jacob et al. 2015), there is no overall national law or regulation about evaluation in Denmark. In some policy fields there are, however, sectoral laws. One example is public schools, where quality reports have to be produced every second year. The reports are expected to be a management tool for developing quality. Furthermore, legislation on transparency stipulates that evaluative data, such as average grades in final exams, must be made publicly available. In some situations, a delicate balance between what is publicly available and what is not is maintained. For example, average grades from final exams are publicly available, while results from mandatory tests of pupils' qualifications during the school years are not. These are only used internally for pedagogical purposes and managerial oversight. Upper secondary schools publish drop-out rates, student satisfaction scores, and average grades in exams. Another example is the field of higher education, where evaluation has been an important activity for many years (Hansen 2009, 2014). Here, a recent layer of evaluation is a law-based institutional accreditation system aiming at securing and developing quality and relevance. Another recent addition is the official publication of assessments of the quality of all programmes in higher education, based on survey data from more than 100,000 students and alumni, called "Uddannelseszoom" ("Education Zoom") (Uddannelses- og Forskningsministeriet 2017). One of the main purposes is to inform future students about the quality and relevance of each educational programme they are considering. In the area of the labour market and employment, a very elaborate statistical system allows for public and online monitoring of many aspects of employment services. The same data also make evaluation and research possible. An elaborate set of official statistics connected to the personal identification number of each and every citizen makes it possible, after anonymisation, to answer evaluation and research questions regarding long-term individual consequences of interventions in child care, education, health care, employment et cetera in a way which would be impossible in many other countries. In some policy fields, there are well-developed policies and strategies for evaluation. One example is development aid, where an office in the Ministry of Foreign Affairs has been dedicated to evaluation since the 1980s.


Another example is the national strategy within the field of health, where the Centre for Patient Experience and Evaluation has conducted surveys of patient experiences since 2000. Evaluation is becoming institutionalised and performance data are being linked to financial resources, but these tendencies are neither uniform nor unilinear. A large evaluative system in health care ('The Danish Quality Model'), established in 2004, was dismantled in 2015 after two rounds of hospital accreditations and after widespread complaints from professional staff (Vrangbæk et al. 2017). An interesting anomaly is the legislation for free schools. These are institutions which are run by parents or by independent organisations and paid partly by fees, partly by public subsidies. These schools are usually based on particular religious, humanitarian or pedagogical beliefs, and they have traditionally had a fairly high degree of autonomy. From 2003, these schools have gradually been required to evaluate more. They must evaluate their programme and make evaluation results publicly available, but they use their own value framework as a basis. The legislation illustrates a typical Danish compromise between local autonomy and central oversight. In recent years, there have been cases of more intense inspection of some free schools, particularly those based on Muslim beliefs. In addition to passing legislative frameworks which influence evaluation, parliament itself decides from time to time to commission ad hoc evaluations of major initiatives and reforms. Parliament also has the capacity to terminate evaluations already initiated, as happened with an evaluation of the Danish participation in the war in Iraq. Decisions to evaluate legislative acts are not determined by any general policy, but rest with shifting political majorities, governments and ministers. Only about 1–4% of Danish legislative acts (depending on time period and method of counting) include a review clause requiring some sort of deliberative revision of the law after a given period of time (European Court of Auditors 2017). It is to a large extent up to individual ministries to conduct evaluations within their policy area. As a political game of avoiding blame and winning praise unfolds, whether or not to carry out an evaluation is itself a politicised decision. An illustrative example is the evaluation of the large-scale structural reform enacted in 2007 (mentioned above) (Breidahl et al. 2017). In the reform, 271 municipalities were amalgamated into 98, and 14 counties into 5 regions. In addition, the division of labour between the administrative levels was changed, as was the financing of tasks.


Major parts of the reform were decided on by a small majority of the Parliament. Once the reform process began, the opposition suggested an evaluation. This proposal was discussed inconclusively several times during the following years. After an election in 2011, the opposition took over. A decision was made, and the evaluation was carried out from 2011 to 2012, inviting a relatively narrow set of actors implementing the reform into the process. The evaluation was used both to legitimise the reform and to make minor adjustments to it. Evaluation regulation is becoming multi-layered and complex. An example: after a handful of heart-breaking cases with extensive media coverage, in which children under care were molested or not cared for properly, legislation describing a new foster home inspection system in Denmark was enacted in 2014 at the initiative of the Minister of Social Affairs. Foster care of children is a troubled policy area, which no one wants to 'own' politically. The minister decided that the main responsibility for the reformed system should remain in the municipal sector. Each municipality formerly inspected its own institutions. The new legislation identified five selected municipalities responsible for inspecting homes in the rest of the municipalities, while the inspecting municipalities inspect each other. The minister asked the National Board of Social Services (NBSS) to guide the implementation of the inspection system. The NBSS produced written guidelines with quality criteria guiding the inspections. Training courses for inspectors were developed, as well as audits in which inspectors go through exemplary cases and learn from each other. Variations in inspection style and organisation were reacted to, as these are difficult to justify from a legal and political perspective. The NBSS also established statistical monitoring of the content of the inspection reports. A risk-based algorithm was developed to regulate the frequency and intensity of inspections. A mid-term evaluation of the inspection system included a survey of the foster homes and a survey of the children about their quality of life. Finally, the NBSS also commissioned a comprehensive evaluation of the whole system. In this example, regulation of and by evaluation does not converge into a simple and unified system. Instead, a number of evaluative mechanisms are layered on top of each other and knit together into a complex, dynamic network. Accountability is distributed in complex ways somewhere in the interaction between foster homes, inspectors, municipalities, inspection agencies, evaluators, the NBSS and, backstage, the Minister of Social Affairs.

2.2 Evaluation Practice

Overall, Danish evaluation practice takes many forms and is best characterised as pragmatic. Most decisions on evaluation practice are delegated to ministries, local governments and public organisations at various levels. The pragmatic style also relates to the question of external versus internal evaluation. Both forms are used. Some actors give priority to external evaluations, for example the Ministry of Finance, which in several cases has commissioned some of the large consultancy firms to carry out evaluations, including business cases on how fields such as the police, the tax authority or the administration at the universities may raise effectiveness. These evaluations have been criticised for unrealistic assumptions about efficiency gains and for justifying considerable cutbacks. External consultancy reports are sometimes used merely to legitimise a change of policy already under way. An agency also known for its extensive use of evaluation consultants is Danida (responsible for foreign aid). Over time, several issues have occurred in relation to its use of consultants, for example whether a few companies performed too many consultancy assignments (Rigsrevisionen 2001); whether consultants were pressured to provide particularly positive evaluation results in return for future assignments; and whether there is a sufficiently clear policy in the Ministry of Foreign Affairs regarding the acquisition and use of consultancy services (Rigsrevisionen 2015). In municipalities, a survey found two thirds of evaluations to be internally organised (Dahler-Larsen 2002). The use of external partners such as consultants in the remainder of the cases was statistically correlated with a slightly higher frequency of major change induced by evaluation as compared to the purely internal evaluations. Hansen (2003, p. 92) found that in central government, the most commonly used evaluation models were process evaluation, impact evaluation, and evaluation of goal attainment. Often, several evaluation models were used in the same evaluation. In local government, evaluation commissioners were primarily interested in the estimation of impacts, understanding processes from intervention to outcome, and measuring user satisfaction (Dahler-Larsen 2002, p. 42). Gønge (2014) found that, more than 10 years and a municipal reform later, managers in municipalities continued to ask for evaluation of effects and impacts of social interventions.


A recent analysis of innovation in the public sector documented that 46% of all innovations implemented recently had been subject to evaluation at the time of the survey. While some authorities had specific innovation strategies, offices or funds allocated to innovation purposes, none of these features were systematically correlated with evaluation frequency (Dahler-Larsen, forthcoming). This observation indicates that evaluation of innovation is not guided by systematic policies. Looking at evaluation practice in a historical perspective, Evert Vedung (2010) interprets the development of evaluation practice in the Nordic countries since the 1960s in terms of four waves, each representing a set of ideas about the right approach to evaluation. The scientific wave, consolidated in the 1960s, faded out in the 1970s and was pushed aside by the dialogue-oriented and stakeholder-based wave. In the late 1970s and early 1980s, a swing towards a neo-liberal wave with a focus on performance started, and in the mid-1990s the evidence wave, a return to experimentation, rolled in. Building on Vedung's interesting analysis, it is our interpretation that the rationalistic, scientific wave of the 1960s never established a stronghold in Denmark. The rationalistic wave was most pronounced among the first group of adopters, but Denmark was not a member of that group (Furubo et al. 2002). When the idea of evaluation was adopted in Denmark, it found its place in three areas: social policy, development aid and research policy (Hansen 2003). In social policy and development aid, evaluations were carried out by social scientists with a qualitative, dialogue-oriented approach, and in research policy, the approach was anchored in peer review. Furthermore, the dialogue-oriented, the neo-liberal and the evidence approaches have all rolled in as waves, but not rolled out again. Instead, these approaches have settled as sediments in a layering process. In relation to the evidence approach, Danish researchers within the field of medicine, and later within the field of social welfare, were early adopters of the Cochrane and Campbell Collaborations' way of practising (Hansen and Rieper 2009). We may see a new variety of the evidence wave developing in these years in several ministries, for example within the field of social policy (Møller 2017) and the field of educational policy, under the term data-driven policy development. Here the idea is not to make evaluations and produce evidence through experiments but instead to build national documentation systems and statistics that allow for analyses and evaluations of performance. The Danish development in this respect reflects a contemporary trend in international evaluation, that is, the integration of evaluative information into larger streams of integrated data (Rist and Stame 2006), a tendency which continues more or less seamlessly into ambitions regarding the future use of 'Big Data'.


Another story of the development of evaluation practice in Denmark can be told about changing definitions of evaluands over time. In this respect, three layers, all part of current evaluation practice, can be defined (Hansen 2013). The first layer, evaluation classic, focusses on evaluation of programmes in the form of interventions and reforms. Following this idea, evaluation practice is organised on an ad hoc basis as follow-up upon political decisions. The second layer focusses on organisations, as is the case in performance management systems related to, for example, contract management. Here the idea is that a principal, for example a parent ministry, has to monitor its agencies in order to evaluate their performance. More focus is on the implementation of policy and on performance than on the evaluation of policy as such. The third layer, systemic evaluation governance, focusses on a plurality of organisations. Here the idea is that comparison and transparency in an organisational field facilitate learning as good practices are spread in the field. In the third layer, we find approaches to evaluation such as benchmarking, rating, ranking and accreditation. Along with the intended mechanisms of learning, this approach is also said to operate through competition and/or mechanisms such as naming and shaming (Pawson 2002). We have seen in Denmark in recent years an increasing institutionalisation of systemic evaluation governance. We find conspicuous examples hereof in employment policy, health care, schooling, higher education and research policy. In some fields, we find independent evaluation institutes and departments. The National Audit Office and the Parliamentary Ombudsman have particular capacity to control and evaluate the performance of public institutions. The National Audit Office is an independent institution under the parliament, mandated by law to strengthen the accountability of public administration to the benefit of the citizens. Years back, the office was primarily engaged in financial auditing, but in recent years it has moved ever more into evaluation in the form of management and performance auditing based on broader criteria, including efficiency and goal-effectiveness. Furthermore, the office plays a role in providing advice on how to conduct and use evaluations (Rigsrevisionen 2005). It also prescribes evaluative practices in situations where the outcomes of public spending have not been monitored sufficiently.


The Danish Parliamentary Ombudsman is mandated by law to investigate complaints about public administration. However, the Ombudsman also takes initiatives to conduct thematic evaluations about the well-being of citizens, for example in prisons, psychiatry and institutions for disabled people. Further, the evaluations are followed up by dialogues with ministries about which measures to take to solve the problems revealed. In addition to these institutions, which exemplify how the use of evaluating institutions is incorporated in the overall legal-administrative structure, there have been evaluation (and research) institutions devoted particularly to local government. Recently, several such institutions have merged into VIVE, a Danish acronym for 'Knowledge for Welfare'. VIVE has about 200 employees and is described as an independent institution under the Ministry of Economy and the Interior. In addition, EVA is an evaluation institute in the educational sector, and the Danish Accreditation Institution performs accreditations of higher education. In health care and the social sector, there are a number of knowledge centres and quality centres of various sizes which carry out evaluations. Their degree of independence varies substantially. So do their analytic ambitions and their degree of anchoring in social science theory and methods. The quality of evaluation practices has been analysed. Lomborg (1998) argued that too many evaluations carried out by municipalities did not comply with fundamental methodological rules. For example, they lacked sound causal reasoning. Dahler-Larsen (1998) argued that quality in evaluation is multidimensional, and that evaluations can be of good use without being methodologically perfect. In a study of 452 evaluation reports in central government, Hansen (2003, pp. 99ff.) found that in the majority of evaluations there was no discussion of strengths and weaknesses of the evaluation model and the evaluation method used, alternative models were not discussed, the evaluation was not theoretically informed, the terms of reference were invisible to the readers of the evaluation report, and the costs of evaluation were not revealed. Other studies focus on a particular evaluation model. Two close-up studies were made of a Danish version of theory-based evaluation called virkningsevaluering. Tronegård-Madsen (2008) asked 28 evaluators to report on their own experiences with this model. They found the model useful, as it allowed testing of causal assumptions about impacts. Further, it could be used formatively, it allowed for stakeholder participation, and it allowed for evaluation of side effects of interventions. On the critical side, they characterised the model as time-consuming, complicated, difficult to communicate, and as not providing clear conclusions.


Christensen (2015) reviewed 117 instances of virkningsevaluering. She evaluated the reports on the basis of textbook criteria on theory-based evaluation. She found that the programme theory often lacked clarity, that the evaluations were often inconsistent and inconclusive, and that they lacked attention to side effects. The substantial quality problems were most pronounced in internal evaluations. At the same time, the proportion of virkningsevaluering carried out internally had gone up from 38 to 60% in the seven years between the two studies. The gap between evaluation quality as defined by textbooks and evaluation quality as defined by real-life evaluators may even widen, as an evaluation model becomes more used and integrated into internal operations and common practice in many corners of the public sector.

2.3 Use of Evaluations

Evaluations are used for many purposes. We observe both process use and results use (Forss et al. 2002), and it appears that evaluations are used in many of the different ways suggested by the literature: instrumental, conceptual, legitimising, interactive, tactical and ritual use (Vedung 1997). It is extremely difficult to separate each form of use conceptually and capture it empirically. Already in 2005, a self-evaluation carried out by EVA (an evaluation institute in the educational sector) realised that it is difficult to measure the 'effect' of evaluation. Instead, in the self-evaluation, the use of evaluation is reduced to what is subjectively perceived by people under evaluation, such as whether the evaluation triggered reflections and discussions (EVA 2005, p. 93). In our interpretation, this criterion is difficult not to meet, and it almost makes the use of evaluation a self-fulfilling prophecy. In an attempt to measure actual consequences for evaluands, a survey among directors of social work in municipalities revealed that 45% of the evaluations led to minor adjustments only, while 23% triggered major revisions of the evaluated activities. In many cases, activities were continued without change, either because of a deliberate decision based on evaluative data (37%) or because the evaluation did not make any difference at all (7%). Only about 1% of evaluations resulted in a termination of the evaluated activities (Dahler-Larsen 2002, p. 70). A broader study among municipal managers in all policy sectors showed similar results (Dahler-Larsen 2000).


Altogether, these data testify to evaluation becoming a management instrument, more or less integrated in daily organisational life and mostly used for ongoing piecemeal adjustments of services. In addition, municipal managers said that evaluation served a number of purposes, such as drawing attention to an intervention, creating a common understanding of an intervention, promoting organisational learning, developing staff competencies, or stimulating debate. The integration of evaluation into many aspects of management practice has continued through the years, but not without debate. An international study of teachers found that teachers in Denmark were particularly sceptical about the use of international comparisons in education. Danish teachers subscribed to a unique definition of school quality. They put more emphasis on involvement of pupils and raising democratic awareness, and less on instrumental skills, than was the case for teachers in all the other countries. Perhaps this is the reason why, in contrast to teachers in Sweden, England, Scotland, and Finland, more Danish teachers viewed the influence of international comparisons of schools (such as PISA) upon school quality as negative rather than positive (Ozga et al. 2011). Much debate occurred around legislation proposed in 2002 which mandated publication of evaluative data from schools and which made ranking possible, a phenomenon which collided with traditions in the Danish school. Some found that ranking would promote neoliberal values and lead to increased competition among schools. In a case study, however, it was found that the actual format in which data were presented did not authoritatively provide for official ranking. Rankings would have to be made individually by citizens or journalists. Contrary to expectations, some media focused not on bashing the schools at the bottom of a league table, but on the resources that school principals in these schools said they needed for improvement of their schools. These factors, together with socio-economic control factors and accompanying information, mitigated some of the worst expected effects of the publication of data (Andersen and Dahler-Larsen 2008). Dahler-Larsen and Pihl-Thingvad (2014) surveyed professionals in the public sector, such as language teachers, job consultants and upper secondary school teachers, with regard to their views on performance indicators. One group spent on average 4.44 hours weekly documenting their results (which, according to some, was taken away from time with clients). In other professional groups, about one hour weekly was spent on documentation. There was great variation among the groups regarding their views on the consequences of performance indicators. Many felt that performance measurement indicated a lack of trust.


An even more pertinent critique was that performance indicators led to a one-sided focus and that the indicators failed to describe the core of professional work as defined by the professionals themselves. There were discrepant views across the groups concerning whether or not performance indicators provided clarity about what needed to be done (Dahler-Larsen and Pihl-Thingvad 2014, p. 90). Whether performance measurement leads to clarity is highly context-dependent. Analyses revealed correlations between seeing performance indicators as a burden, seeing the professional work as full of complexity, having conflicts with clients/citizens, seeing one's demands upon oneself as a burden, and having stress (Dahler-Larsen and Pihl-Thingvad 2014, p. 178). These observations indicate that the functions of performance indicators are entangled in a web of issues related to the provision of welfare services as well as to the working environment of professionals in the public sector. On the practical side, the study revealed that in the daily life of many professionals, computer problems are annoying. Programmes required for mandatory registration of evaluative data are not user-friendly. Documentation takes time. Observations about such pedestrian issues indicate a wide gap between the utopia of a smoothly functioning data-based public administration on the one hand and the lived reality of public employees on the other. New efforts in the area of schools include cash rewards to schools which are able to demonstrate positive changes in evaluation results for school children with a non-Danish background in the coming years. The spirit of this policy instrument is new, as school managers will have to figure out whether or not it will pay off to invest in such improvement in their schools. They will only be paid retrospectively if they succeed. Traditionally, employment in the public sector has been regarded as secure, even though the Danish labour market model is well known for its traits of flexicurity, referring to a welfare state model that combines labour market flexibility with employee security through unemployment benefits. Flexicurity means that it is rather easy for employers to dismiss staff. In the public sector this has traditionally not been the case, but as personnel management has become tougher in the wake of the financial and economic crisis, dismissals have become a kind of 'new normal' (Hansen and Mailand 2013; Hansen and Kristiansen 2018). In an ongoing research project on reforms and transformations in the university sector, we have seen examples of performance information being used in dismissal processes.


Dahler-Larsen and Foged (2017) showed that privatisation, especially in the form of competitive tendering, had a significant influence on job satisfaction among language teachers (teaching Danish to immigrants). This adds to our understanding of the role of evaluative data, because the best bid among competing language schools is defined in terms of performance data (in practice, test results for students) and costs. In schools winning the bid, there is therefore an ongoing attention to performance so that the school remains competitive through the next tendering process. Among the intervening mechanisms which help explain how competitive tendering leads to lower job satisfaction, this study found more strained relations between managers and teachers as well as a feeling of stress. Taken together, our observations suggest that use is not a result of evaluative data in themselves, but when woven into institutional and managerial practices, and linked with money streams, evaluative data can be quite consequential.

3 Evaluation in Society

Non-governmental organisations (NGOs) are adopting evaluation to an increasing extent, partly due to contracting with the public sector. For example, NGOs operating with foreign aid report back based on a fairly elaborate evaluation scheme. As a consequence, NGOs must hire or train specialists in evaluation. Voluntary organisations doing social work for public money are faced with evaluation demands in return. This creates tensions when volunteers perceive their motivation as ethical and not outcome-oriented. Furthermore, they hesitate to participate in documentation practices, for example when services are provided to anonymous clients. Without identification of beneficiaries, it is difficult to collect systematic outcome data. Volunteers cannot be compelled to participate in particular evaluative practices. Some NGOs have reflected in advanced ways on these dilemmas in voluntary work (Blå Kors 2012). While documentation is important, the tension between documentation on the one hand and what is seen as the true spirit of the organisation's work on the other may increase. Another example of the dilemmas of collecting evaluative information is found in Folkekirken ('The People's Church'), which is administratively not separated from the state apparatus in Denmark. As a result of declining membership, the allocation of resources must be prioritised. To decide how to allocate clerical manpower to parishes and, ultimately, to decide which churches to close, a monitoring scheme of the number of participants in sermons has been developed, as have systems to count the relative workloads of priests.


A clerical activity such as soul-searching, for example, was allocated a standardised time frame of 45 minutes. These systems have, not surprisingly, created intense debate. It has also been discussed whether the 'customers' in the church should evaluate the priest based on user satisfaction, but this has not been implemented. Philanthropic foundations play a role in setting requirements or expectations related to the evaluation of activities in return for donations. Some foundations also see themselves quite explicitly as advocates for specific forms of evaluation. There have been wide debates about the role of evaluation in the welfare state and society in general. A newspaper column written in 2007 by two bureaucrats who had helped promote evaluation as a steering principle in the public sector asked for 'forgiveness', as "we did not know what we were doing" (Politiken 2007, March 29). Their viewpoint resonated with several other debates which keep recurring from time to time, such as the debate about de-bureaucratisation of the public sector and the specifically Danish debate about djøficering, which refers to the alleged dominance in public-sector management of academics trained as lawyers, economists and political scientists (their trade union is DJØF). Allegedly, these academics have 'cold hands' and prefer to run the public sector using spreadsheets, indicators, rules, and management fads at the expense of the 'warm hands' of nurses, teachers, pedagogues et cetera. A part of the debate about NPM and about evaluation and indicators has overlapped with, or has perhaps been re-phrased as, a critique of those 'elites' who were seen as responsible for the current transformation of the public sector. The easy alternative has been presented as more 'trust' in professions such as nurses and teachers. Thus, public debates related to evaluative data, their production and use have taken a variety of forms and have often not been framed as debates about evaluation as such. For example, in relation to the publication of school data, researchers including statisticians have argued that it is unfair to publish raw average grades without control for socio-economic background factors in each school district. These factors include, for example, the proportion of single parents and the proportion of non-Danish-speaking families. Consistent with textbook methodology, it is argued that genuine school quality is better represented by the added value after control for these factors than by the raw data themselves. Inhabitants in some school districts, however, have complained about the publication of these socio-economic control factors, because they feel it sets their local community in a negative light.


Common sense views and the logic of sophisticated evaluation do not always coincide. Literature has contributed to debates about evaluation, too. In addition to the literature already mentioned, and to textbooks and handbooks, the Danish evaluation literature is rich given the small size of the country. One of the first landmark books was Erik Albæk's (1988) "Fra Sandhed til Information" ('From Truth to Information'). Inspired by the American debate on utilisation of evaluation, Albæk argued that evaluation cannot and should not replace politics. The evaluation literature in Denmark also includes discussions of evaluation as a ritual (Dahler-Larsen 1997); the relation between evaluation and power (Dahler-Larsen 2004); how result-based steering can be made meaningful in the public sector (Ejler et al. 2008); and whether quantification at the workplace has become a tyranny (Andersen and Tanggaard 2016). A recently published review of "results-based steering" (Møller et al. 2016), critically discussing the side effects of performance indicators in schools, employment and social services, was much discussed in the media. Some of the Danish contributions to the evaluation literature have resonated internationally. Hansen's article (2005) about the many factors, rational as well as irrational, which influence the choice of evaluation models is internationally cited. The same goes for Dahler-Larsen's book (2012) on the evaluation society. A Danish-Swedish contribution published in the American Journal of Evaluation looks at combinations of theory-based evaluation with a dialogue among stakeholders (Hansen and Vedung 2010; Dahler-Larsen 2018). This idea expresses international theory-based evaluation with a Scandinavian twist. A large amount of literature concerns the involvement of users in evaluation. Hanne Krogstrup (1997) has developed the UPQA model (BIKVA in Danish), which describes how inputs from users can be used in the development of local services and in institutional learning. Her inventive model has been widely adopted outside Denmark and contributes to a Scandinavian tradition emphasising dialogue and participation in evaluation (Brix et al. 2017). This tradition cultivates evaluation involving citizens and users both in its deliberative forms (Hanberger 2006; Karlsson Vestman and Segerholm 2009) and within a representative democratic regime (Dahlberg and Vedung 2001).


Contrary to the conclusion of Nielsen and Winter's (2014) bibliometric study of international journal articles, we argue that if we read articles rather than count them; if we include books in the analysis; if we attend to publications in the language of the country we draw conclusions about; and if we pay attention to the interaction between structure, culture and evaluation practice in the Nordic countries, it is not too difficult to identify a Nordic tradition in evaluation. Taken separately, each of the components may not be totally unique (neither to the Nordic countries, nor to Denmark, evidently), but in their combination, we think the following keywords help characterise the tradition we wish to portray: an attention to the social, cultural and organisational embeddedness of evaluation; a pragmatic and non-technical approach to evaluation; an inclination for participatory and user-oriented approaches; a conspicuous lack of overly optimistic enthusiasm about evaluation as such; and a trust in the public sector which exists prior to evaluation.

4 Professionalisation

The Danish Evaluation Society (DES) arranges annual conferences, which now attract more than 200 participants. The society arranges meetings and houses special topical interest groups. The composition of its members is truly multi-disciplinary and multi-professional. The society has not taken a stand on official evaluation policy, nor has it developed a set of standards or guidelines for evaluation. An early initiative in this direction was discussed, but there was no support for standardisation. Pragmatism continues to reign. Some of the broader themes taken up in DES include complexity, realistic evaluation, the use of evaluation in relation to reforms, and the value of evaluation to society. The balance between academics and practitioners at the DES conferences has shifted over time, reflecting the increasing market for evaluation. Consultants have been part of the driving forces of DES since its beginning, but now they play an ever-more visible role, while academics are becoming relatively fewer. A general Master's programme in evaluation was offered by the University of Southern Denmark from 2006 to 2013. Many evaluation courses in several formats are now offered, but in somewhat fragmented form and in connection with specific sectors or courses, such as public health, pedagogy, sociology, or political science. There is no accreditation or authorisation of evaluators.


Professionals trained in different disciplines and schools of thought, from economists to pedagogues and therapists, help maintain a diverse evaluation landscape. Debates about evaluation approaches and paradigms continue. Some of these debates focus on the notion of evidence, its meaning and relevance, for example in relation to the feasibility of randomised controlled experiments; other debates look at the quantitative/qualitative distinction and its aftermath. The value and use of evaluation constitute another issue that refuses to go away. A key figure in evaluation approaches in Denmark is 'the user'. Scandinavian languages even have a special term for users of services which is more used than 'user' in English (and which remains clearly separate from 'clients', 'consumers', 'customers' etc.). The social construction of 'the user' has changed over time. Traditionally, users have been defined as patients, pupils, clients et cetera, each playing roles complementary to professionals such as doctors and teachers. The introduction of NPM-oriented forms of management inspired by the private sector helped transform the notion of 'users' into 'consumers'. Supported by evaluative paraphernalia such as user satisfaction surveys, this notion was instrumental in reforming public-sector institutions into something more market-oriented or proxy-market-oriented. Mandatory schemes for self-reflection among 'users' were also used as a tool in pedagogy (Hermann 2007) and social work (Mik-Meyer 2004). An extension of the user as an active player in the provision and implementation of public services is being introduced under the headline of 'co-production'. At the same time, the 'user' is also being reconfigured in the light of the needs of the 'competition state' as a player in the global arena. As a consequence, students in higher education are now seen less as consumers and more as units of production which need to be processed as quickly as possible through the education industry. Incentives are linked to performance measures on the side of both educational institutions and individual students. In sum, the notion of the 'user' is a socio-political construction under change. Its transformation is followed and enhanced by a number of evaluative measures, some of which remain sedimented in different layers from different epochs. In this sense, debates about evaluation are indicative of broader democratic issues regarding transformations of the role of citizens in society.


5 Discussion and Conclusion

With the increasing institutionalisation of evaluation in managerial, organisational and professional practices, evaluation itself becomes a configuration of diverse but interconnected streams of information and data-collecting practices, including experiments, indicators, performance management, auditing, and others. Among the most influential factors upon the institutionalisation of evaluation, in our judgment, one should count:

• The large public sector and high level of taxation, combined with the public-sector value of accountability towards society, are important contextual factors explaining the high degree of evaluation culture maturity. Expenditures must be justified. Evaluation helps here.

• The political system with coalition and often minority coalition governments is an important contextual factor explaining why it is not a matter of course to evaluate major public-sector reforms, as well as why most legislative acts do not incorporate mechanisms for automatic review or evaluation of the legislation itself. Sometimes, however, the insertion of evaluation into a piece of legislation makes it possible for a majority of legislators to 'digest' the act and support it.

• Ministerial accountability, the high degree of decentralisation with considerable autonomy to local governments, and cultural features such as power distance are important contextual factors explaining evaluation pluralism and the absence of an overarching evaluation policy.

• The high degree of stakeholder organisation is an important contextual factor explaining the intensive debates on evaluation. The already-existing high level of trust in public authorities and the high number of public employees create a setting in which evaluation is often portrayed as signifying 'lack of trust'.

• International collaboration and NPM-inspired reforms emphasising performance are important contextual factors explaining the institutionalisation of evaluation. The austerity climate in the wake of the financial and economic crisis has moved this further on.

• The high degree of decentralisation in Denmark means that many issues regarding the adoption, quality, and use of evaluation in relation to specific services take place as battles 'from house to house'. They are guided by changing mentalities, norms, models and exemplars as much as by explicit policy.


• The layering of several evaluation objects and several forms of evaluation makes configurations of evaluative mechanisms complex and dynamic. More evaluation is not always followed by more transparency and accountability.

All in all, we have aspired to paint a picture of evaluation in Denmark characterised by international influences and national specificity, by general trends and much diversity, by increasing institutionalisation but also backlashes, and by 'maturation' as well as no small amount of contestation and debate.

Acknowledgements The authors would like to thank professor emeritus Evert Vedung for his valuable and insightful comments on a draft version of this chapter. Also, Olaf Rieper provided fruitful suggestions for improvements.

References

Albæk, E. (1988). Fra sandhed til information. Copenhagen: Akademisk Forlag.
Andersen, V. N., & Dahler-Larsen, P. (2008). The framing of public evaluation data: Transparency and openness in Danish schools. In R. Boyle, J. D. Breul, & P. Dahler-Larsen (Eds.), Open to the public (pp. 99–116). New Brunswick, NJ: Transaction Publishers.
Andersen, M. F., & Tanggaard, L. (2016). Tæller Vi Det Der Tæller? Målstyring og Standardisering i Arbejdslivet. Aarhus: Klim.
Blå Kors. (2012). Læring Unplugged. En case om arbejdet med kvalitet og dokumentation i en NGO. Frederiksberg: Frydenlund.
Breidahl, K. N., Gjelstrup, G., Hansen, H. F., & Hansen, M. B. (2017). Evaluation of large-scale public-sector reforms. American Journal of Evaluation, 38(2), 226–245.
Brix, J., Nielsen, L. K., & Krogstrup, H. K. (2017). The diffusion and adoption of BIKVA: A 20-year impact case study. Aalborg: Institut for Læring og Filosofi, Aalborg Universitet.
Christensen, A.-S. T. (2015). Hvor, af Hvem og med Hvilken Kvalitet? En Kortlægning af Virkningsevaluering i Danmark. Speciale (Master thesis). Department of Political Science, University of Copenhagen.
Dahlberg, M., & Vedung, E. (2001). Demokrati och Brukarutvärdering. Lund: Studentlitteratur.


Dahler-Larsen, P. (1997). Den Rituelle Reflektion. Odense: Syddansk Universitetsforlag.
Dahler-Larsen, P. (1998). Kommentar til Bjørn Lomborgs Evaluering af Evaluering. Nordisk Administrativt Tidsskrift, 79(2), 233–236.
Dahler-Larsen, P. (2000). Surviving the routinization of evaluation: The administrative use of evaluations in Danish municipalities. Administration and Society, 32(1), 70–91.
Dahler-Larsen, P. (2002). Evaluering Kortlagt. Århus: Systime.
Dahler-Larsen, P. (2004). Evaluering og Magt. Århus: Magtudredningen.
Dahler-Larsen, P. (2012). The evaluation society. Stanford: Stanford University Press.
Dahler-Larsen, P. (2018). Theory-based evaluation meets ambiguity: The role of Janus variables. American Journal of Evaluation, 39(1), 6–23.
Dahler-Larsen, P. (forthcoming). Evaluering af Projekter – og andre ting, som ikke er ting (2nd ed.). Odense: Syddansk Universitetsforlag.
Dahler-Larsen, P., & Foged, S. K. (2017). Job satisfaction in public and private schools: Competition is key. Social Policy & Administration. https://doi.org/10.1111/spol.12324 (First published 20 June 2017).
Dahler-Larsen, P., & Pihl-Thingvad, S. (2014). Resultatmålinger og Stress. Odense: Syddansk Universitetsforlag.
Dahler-Larsen, P., & Schwandt, T. A. (2006). When evaluation meets the "rough ground" in communities. Evaluation, 12(4), 496–505.
Ejler, N., Seiding, H. R., Bojsen, D. S., Nielsen, S. B., & Ludvigsen, F. (2008). Når Måling Giver Mening, Resultatbaseret Styring og Dansk Velfærdspolitik i Forvandling. Copenhagen: Jurist- og Økonomforbundets Forlag.
EVA. (2005). Med Egne Øjne. EVAs selvevaluering 2005. Copenhagen: EVA.
European Court of Auditors. (2017). Denmark: General information on the ex-post review system. Draft of country fiche written by University of Tilburg.
Forss, K., Rebien, C. C., & Carlsson, J. (2002). Process use of evaluations: Types of use that precede lessons learned and feedback. Evaluation, 8(1), 29–45.
Furubo, J. E., Rist, R. C., & Sandahl, R. (2002). International atlas of evaluation. New Brunswick, NJ: Transaction Publishers.
Gønge, K. (2014). Nye Tider – Ny Evalueringspraksis? En kortlægning af udviklingen i evalueringspraksissen på det kommunale socialområde. Speciale (Master thesis). Department of Political Science, University of Copenhagen.
Hanberger, A. (2006). Evaluation of and for democracy. Evaluation, 12(1), 17–37.
Hansen, H. F. (1999). Den Historiske Arv-Danmark. In P. Lægreid & O. K. Pedersen (Eds.), Fra opbygning til ombygning i staten. Organisationsforandringer i tre nordiske lande (pp. 69–104). Copenhagen: Jurist- og Økonomforbundet.
Hansen, H. F. (2003). Evaluering i Staten. Kontrol, Læring eller Forandring? Copenhagen: Samfundslitteratur.


Hansen, H. F. (2005). Choosing evaluation models: A discussion on evaluation design. Evaluation, 11(4), 447–462.
Hansen, H. F. (2009). Educational evaluation in Scandinavian countries: Converging or diverging practices? Scandinavian Journal of Educational Research, 53(1), 71–87.
Hansen, H. F. (2011). NPM in Scandinavia. In T. Christensen & P. Lægreid (Eds.), The Ashgate research companion to new public management (pp. 113–129). Farnham: Ashgate.
Hansen, H. F. (2013). Systemic evaluation governance. Scandinavian Journal of Public Administration, 16(3), 47–64.
Hansen, H. F. (2014). Quality agencies: The development of regulating and mediating organizations in Scandinavian higher education. In M. Chou & Å. Gornitzka (Eds.), Building the knowledge economy in Europe (pp. 188–218). Cheltenham: Edward Elgar.
Hansen, H. F., & Jørgensen, T. B. (2009). Den danske forvaltningsmodel og globaliseringens udfordringer. In M. Marcussen & K. Ronit (Eds.), Globaliseringens Udfordringer-Politiske og administrative modeller under pres (pp. 36–64). Copenhagen: Hans Reitzels Forlag.
Hansen, H. F., & Kristiansen, M. B. (2018). Cutback management in Denmark. In E. M. Ghin, H. F. Hansen, & M. B. Kristiansen (Eds.), Public management in times of austerity (pp. 92–110). New York: Routledge.
Hansen, H. F., Kristiansen, M. B., & Sørensen, E. M. (2014). Resultatmåling i den danske offentlige forvaltning. In M. B. Kristiansen (Ed.), Resultatstyring i den Offentlige Sektor (pp. 127–146). Copenhagen: Jurist- og Økonomforbundets Forlag.
Hansen, H. F., & Rieper, O. (2009). The evidence movement. Evaluation, 15(2), 141–163.
Hansen, M. B., & Vedung, E. (2010). Theory-based stakeholder evaluation. American Journal of Evaluation, 31(3), 295–313.
Hansen, N. W., & Mailand, M. (2013). Public service employment relations in an era of austerity: The case of Denmark. European Journal of Industrial Relations, 19(4), 375–389.
Hermann, S. (2007). Magt & Oplysning. Folkeskolen 1950–2006. Copenhagen: Unge Pædagoger.
Hofstede, G. (2001). Culture's consequences: Comparing values, behaviors, institutions and organizations across nations (2nd ed.). Thousand Oaks, CA: Sage.
Jacob, S., Speer, S., & Furubo, J.-E. (2015). The institutionalization of evaluation matters: Updating the international atlas of evaluation 10 years later. Evaluation, 21(1), 6–31.
Jørgensen, T. B. (2007). Public values, their nature, stability and change: The case of Denmark. Public Administration Quarterly, 30(4), 365–398.


Jørgensen, T. B., & Vrangbæk, K. (2013). Den Gode Forvaltning—På Basis Af Hvilke Værdier? Nordisk Administrativt Tidsskrift, 90(3), 115–134.
Karlsson Vestman, O., & Segerholm, C. (2009). Dialogue, deliberation and democracy in educational evaluation—Theoretical arguments and a case narrative. In K. E. Ryan & J. B. Cousins (Eds.), The SAGE handbook of educational evaluation (pp. 465–482). London: Sage.
Kristiansen, M. B., Dahler-Larsen, P., & Ghin, E. M. (2017). On the dynamic nature of performance management regimes. Administration & Society, 1–23.
KREVI. (2008). Evaluering i danske kommuner: En kvantitativ kortlægning af kommunernes evalueringskapacitet. Copenhagen: Rambøll Management.
Krogstrup, H. K. (1997). Brugerinddragelse og Organisatorisk Læring i den Sociale Sektor. Århus: Systime.
Lomborg, B. (1998). Metodologisk Vurdering af Borger/Brugerundersøgelser i Århus Amts Kommuner 1995–96. Nordisk Administrativt Tidsskrift, 2(79), 93–120.
Mik-Meyer, N. (2004). Dømt til Personlig Udvikling: Identitetsarbejde i Revalidering. Copenhagen: Hans Reitzels Forlag.
Mouritzen, P. E. (2012). On the (blessed) deficiencies of Danish democracy. In J. Blom-Hansen, C. Green-Pedersen, & S.-E. Skaaning (Eds.), Democracy, elections and political parties, essays in honor of Jørgen Elklit (pp. 181–192). Aarhus: Politica.
Møller, A. M. (2017). Evidensdagsordenens Metamorfoser i Dansk Socialpolitik. Politik, 20(2), 68–86.
Møller, M. Ø., Iversen, K., & Andersen, V. N. (2016). Review af Resultatbaseret Styring. Copenhagen: KORA.
Nielsen, S. B., & Winter, D. M. (2014). A Nordic evaluation tradition? A look at the peer-reviewed evaluation literature. Evaluation, 20(3), 311–331.
Ozga, J., Dahler-Larsen, P., Segerholm, C., & Simola, H. (2011). Fabricating quality in education: Data and governance in Europe. London and New York: Routledge.
Pawson, R. (2002). Evidence-based policy: The promise of "realist synthesis". Evaluation, 8(3), 340–358.
Politiken. (2007, March 29). Gjørup, J., & Hjortdal, H. Kronik: Tilgiv os—Vi vidste ikke, hvad vi gjorde.
Rigsrevisionen. (2001). Beretning til statsrevisorerne om Danidas køb af konsulentydelser. Copenhagen: Rigsrevisionen.
Rigsrevisionen. (2005). Notat om Beretning om Statens Anvendelse af Evalueringer. Copenhagen: Rigsrevisionen.
Rigsrevisionen. (2015). Beretning til Statsrevisorerne om Udenrigsministeriets brug af konsulenter i forbindelse med udviklingsbistanden. Copenhagen: Rigsrevisionen.

2

DENMARK

61

Rist, R. C., & Stame, N. (2006). From studies to streams. New Brunswick, NJ: Transaction Publishers. Transparency International. (2016). Corruption perceptions Index 2016. https:// www.transparency.org/news/feature/corruption_perceptions_index_2016. Accessed on 25 March 2017. Tronegård-Madsen, M. (2008). Virkningsevalueringens veje—Erfaringer fra Danmark. Speciale (Master thesis). The University of Southern Denmark, SDU. Uddannelses- og Forskningsministeriet. (2017). Nye tal på kvalitet i uddannelserne. http://ufm.dk/aktuelt/pressemeddelelser/2017/nye-tal-pa-kvaliteti-uddannelserne. Accessed on 25 June 2017. Vedung, E. (1997). Utvärdering och de sex användningarna. In B. Rombach & K. S. Andersson (Eds.), Från Sanningssökande till Styrmedel. Moderna Utvärderingar i Offentlig Sektor. Uppsala: Nerenius & Santérus Förlag. Vedung, E. (2010). Four waves of evaluation diffusion. Evaluation, 16(3), 263– 277. Vrangbæk, K., Klenk, T., Appelby, J., & Gregory, S. (2017). Accountability through performance management? Hospital performance management schemes in Denmark, Germany and England. In T. Christensen & P. Lægreid (Eds.), The Routledge handbook to accountability and welfare state reforms in Europe (pp. 119– 132). Oxon: Routledge.

CHAPTER 3

Finland

Petri Uusikylä, Pertti Ahonen, Nina K. Hietakangas and Johanna Nurmi

1 Introduction

Finland has been an independent republic since 1917. From 1809 to 1917 it was an autonomous part of the Russian Empire, forming the Grand Duchy of Finland, the predecessor state of modern Finland. Until the 1980s the republic retained important semi-presidential characteristics. Today Finland compares with most other member states of the European Union (EU) in that the Prime Minister is the unquestioned leader of governmental policymaking. However, as Finland has had a pronouncedly proportional system of political representation since 1907, the Prime Minister normally leads governments that are coalitions of three to six political parties. This


circumstance emphasises both the importance and the challenges of coordination in governmental policy-making. Despite the entrenched special position of Åland, Finland belongs to the unitary states among the EU member states. Although regional self-government is being introduced country-wide by 2020, no federal structures will be established. Besides the evolving regional self-governments, Finland's system of government also comprises more than 300 municipalities, whose self-government is guaranteed in the Finnish Constitution. Following general Nordic models, the municipalities have been the foremost providers of public services. However, the establishment of regional self-government (from 2020 onwards) will be accompanied by the transfer of about one half of the municipalities' present duties, namely those in the social welfare and health care field, to the new regional governments. The development of public policy evaluation in Finland has been closely connected with the development of the welfare state. Much of what was later called evaluation was first conceptualised, and used systematically from the late 1950s and early 1960s onwards, as the provision of scientifically based knowledge for social welfare reform (Ahonen et al. 2002, pp. 47–48). The tipping point for evaluation and policy analysis was the introduction of synoptic, rational planning systems in the 1960s and 1970s. Comprehensive budget planning, together with a new generation of politicians and civil servants with a social science background, laid the ground for the systematic use of policy analysis and evaluation. The role of policy analysis was to make ex ante assessments of alternative policy options and to provide input for mid- and long-term plans. Evaluation (at that time mostly applied social science research) was often ideologically laden and aimed either to support government policies or to oppose them. Access to strategic and policy-level information was highly restricted and evaluations were mainly carried


out by the planning and research units of the ministries and central government agencies, or by academics and sector-based research centres. The politico-administrative system of steering was extremely hierarchical and based on top-down control. During the 1980s and 1990s the hierarchical steering system of government started to erode. New managerial models such as New Public Management (NPM), Management by Results and decentralisation crept into Finland after 1987, when a conservative-led government (under Prime Minister Holkeri) took office. Influences from the UK and the US travelled to Finland via the Organization for Economic Cooperation and Development (OECD) and its Public Management Committee. The first NPM pilots started in the early 1990s, and by the end of the decade NPM was the 'official' management tool of the Finnish public sector. The role of evaluation was pivotal in NPM-based steering. When regulatory and detailed budget control was loosened, the role and importance of so-called information steering grew enormously. Evaluation was the tool for providing feedback to policy makers and ministries on the success of their policies. During that time the Ministry of Finance was the key actor in enhancing the systematic use of evaluation in Finland. A number of evaluation groups were established and new guidelines on evaluation were issued to central government agencies. Another important milestone in the systematic use of evaluations in Finland was membership of the European Union in 1995. EU institutions, and especially the European Commission, have been (and still are) major advocates of the use of evaluation information. Structural Fund evaluations, together with a rising number of evaluations of cross-border programmes (such as Interreg and ENPI) and of framework programmes in the field of science, technology and innovation (today Horizon 2020), gave a new boost to evaluations carried out in Finland. Evaluation capacity and culture started to develop fast owing to a series of evaluation trainings at various levels of government. Around 2010, evaluation in Finland was already business as usual. Ministries and government agencies regularly carried out forecasting exercises and self-assessments and commissioned external evaluations from consultants and academics. What was still missing in many branches of government were clear procedures for how evaluation information was to be utilised and linked to policy learning. Municipalities and regional governments also evaluated their local and regional programmes and used external and developmental evaluations to improve their performance, but faced much the same problem: evaluations provided critical feedback, but this data


and information did not accumulate and translate into sufficient knowledge and policy understanding. To solve this problem, the government established a new unit under the Prime Minister's Office in 2014, the Government Policy Analysis Unit. One of its main functions is to commission research and evaluations that support the implementation of the government programme. The joint analysis, assessment and research activities, coordinated by the government, generate information that supports decision making, working practices and management by knowledge. The government annually adopts a plan for analysis, assessment and research that underpins policy decision making and steers studies and research towards specific priority areas selected by the government. Under the leadership of the Prime Minister's Office, the Government Working Group for the Coordination of Research, Foresight and Assessment Activities is in charge of formulating the plan. The working group comprises experts from all administrative branches. Today evaluation in Finland is fluidly transforming from a rigid external evaluation model into a scattered sphere of co-creation, peer learning and developmental evaluation. Evaluation forms an intrinsic part of the development of government policies and is used as a pool of supporting evidence rather than as an expert-based tool. Various learning platforms bring experts and programme designers together to test and pilot new innovative ideas and jointly implement these projects. Thus, the external role of evaluators has on many occasions been replaced by the role of a critical friend or a co-designer.

2 Institutional Structures and Processes

In Finland, evaluation is at least referred to in many government acts (passed by Parliament), statutes (passed by the government, valtioneuvoston asetus) and regulations (passed by government ministries, ministeriön asetus). However, there is no single 'master' act on evaluation in Finland. The most general stipulations on evaluation in Finland are included in the Budget Act (L valtion talousarviosta), which is detailed in the Budget Statute. According to the act, evaluation should be carried out both with respect to the fields of responsibility of each ministry (by and large corresponding with the division into sectors of economic, social and other policies) and within the organisation of each ministry. However, how this statutory evaluation should be carried out is not defined in any way in the legislation.


In general, systems and procedures of evaluation that have to be set up according to laws, statutes and higher-level regulations either support ex ante planning or ex post accountability. Evaluation knowledge very often comes too late to help with actual steering. Evaluation may certainly de facto be implemented to support legitimacy, but this does not happen overtly. Last, it is unlikely that public money can and will be used to support sheer enlightenment. Moreover, the 'should' and the 'is' may be only loosely coupled in evaluation. A good example is the regulatory impact assessment of government proposals to parliament. Although the system is in place de jure, de facto it is weak and does not provide sufficient regulatory impact information (for more, see Tala 2015). In Finland there are no overall national guidelines or strategies of evaluation. However, certain policies of applied research have lately come close to enabling and requiring evaluation. First, since 2014 the Strategic Research Council, whose members the government nominates, has allocated around EUR 50 million each year to two-to-four-year applied research projects in fields that the government defines. Moreover, since the same year the Finnish government (coordinated by the Prime Minister's Office) has allocated around EUR 10 million annually to shorter-term applied research and development projects, some of which have comprised evaluation or the development of evaluation, or have been evaluation-related in other ways. In the absence of general government evaluation policies, decision-making on evaluation projects takes place individually in the individual ministries. Insofar as evaluation is statutorily or in some other way institutionalised in a ministry or in its sector of administration, continuity often drives evaluation activities without any specific evaluation policies having to be defined. As indicated above, overlaps and synergies prevail, or may prevail, between evaluation activities in the domain of the ministries and the Strategic Research Council and State Council projects. The Ministry for Foreign Affairs (MFA) has traditionally been one of the most advanced ministries in terms of commissioning, steering and utilising evaluations. There is an evaluation unit in the development policy department of the MFA that has strong evaluation capacity and sufficient resources to carry out several annual evaluations of individual projects, programmes and country programmes, and even meta-evaluations. The MFA has also published several guidelines and manuals on evaluations


and a results-based management manual (see e.g. MFA 2013, 2015). The evaluation practices of the MFA are based on the principles agreed internationally within the OECD and the EU. The implementation of evaluation in Finland's development cooperation is steered by the MFA evaluation guidelines (MFA 2013), and a systematic manual sets out the main evaluation criteria, based on those of the OECD Development Assistance Committee (DAC). The primary intended users of the Evaluation Manual are the officials who commission and manage evaluations and the experts who implement evaluations, both in Finland and in partner countries. The manual also provides helpful information more broadly for all stakeholders who have an interest in the evaluation function and in the results of Finland's development policy and cooperation. Finland sees itself as one of the inheritors of the Nordic model of public access to information held by public authorities, a model whose first predecessor evolved in the eighteenth century. It is the rule rather than the exception that the results of evaluations commissioned by the Finnish ministries (as well as the results of the Strategic Research Council and State Council projects) are made publicly available on the internet. In this sense the results of most evaluations in Finland are free for anyone to use. Concerning the possible functions of evaluation, such as the division of these functions into planning and steering, accountability, legitimacy and enlightenment, such divisions do not fit well in clarifying the functions of evaluation in the Finnish case. For instance, the evaluation provisions in the Budget Act mostly aim at strengthening budgetary discipline as maintained by the Finance Committee of the Government and the Ministry of Finance (and, behind them, the Prime Minister and the Finance Minister, who traditionally do not disagree on questions of public finance). However, despite the problems of applying such divisions, it is possible to say a little more on them further below. It is not ruled out that Finnish parliamentarians deal with evaluation findings in their political activities. One of the crucial questions in this respect is whether the performance audits of the Finnish National Audit Office (NAO), affiliated with parliament as an autonomous body since legislation passed in the year 2000, are considered evaluation. These audits often, indeed most often, comprise evaluation in terms of their content, although the NAO mandate does not literally call their results 'evaluation' (on this question see Ahonen 2015a). The parliamentarians certainly utilise the results


of the NAO performance audits in various ways. The boundary between evaluation and applied government- or parliament-funded research studies, and between evaluation and the studies carried out at the government research institutes, which are still numerous in Finland, is also fluid. Parliamentarians often refer to all such types of studies and are likely to utilise their results in multifarious ways.

3 Evaluation Practice

Despite serious attempts to form a coherent government-level evaluation framework, most evaluations have been planned and carried out on an ad hoc basis. The utilisation of evaluation findings has also varied considerably from one line ministry to another. The most systematic evaluation frameworks were introduced in the field of the EU's Structural Funds, by the MFA (development policy department) and in employment and innovation policies. One of the most advanced and coherent evaluation strategies was introduced by Business Finland (formerly the Finnish Funding Agency for Technology and Innovation).

Ex ante impact assessment of legislative proposals is an example of ongoing evaluation in Finland. In 2007 the Ministry of Justice prepared guidelines on the ex ante evaluation/impact assessment of all government proposals to parliament for new or revised legislation. According to the guidelines, each legislative proposal should contain an impact assessment covering the following areas:

• Economic impacts. Covers impacts on municipalities and local governments, government expenditure and impacts on firms (e.g. administrative burden).
• Impacts on government services and cooperation. Includes impacts on inter-ministerial coordination, state personnel, government processes and procedures, et cetera.
• Environmental impacts. Impacts on health, soil, water, living environments, cultural heritage, et cetera.
• Societal impacts. Includes impacts on citizens, gender, minority languages (Swedish, Sami) and ethnic minorities.

The Government appointed the Finnish Council of Regulatory Impact Analysis (Lainsäädännön arviointineuvosto) for the term 15 April 2019


to 14 April 2022. The first Council of Regulatory Impact Analysis was appointed for a three-year term that started in 2016. The government plenary session appoints a chairperson and a maximum of eight members to the Council of Regulatory Impact Analysis. According to its mandate, the Council should be independent. This council, modelled more or less on the German Normenkontrollrat, should in principle carry out ex ante evaluation of all forthcoming government proposals for new legislation. However, the scant resources of the council (part-time members and a very small staff) have, during the early period of its operation, allowed the evaluation of only 10–15% of new legislation. The Budget Act (and the detailing statutes and regulations) also presupposes continuous evaluation, although formal compliance with the norms in question does not imply strong substantive compliance (Ahonen 2015b). Moreover, insofar as the performance audits of the Finnish National Audit Office are seen as evaluation, there is a frequent rhythm of evaluations. Also, in the field of environmental evaluation there is an ongoing stream of evaluations under the title of the statutory Environmental Impact Assessment, which is based on a government statute. The Economic Policy Council (Talouspolitiikan arviointineuvosto) was established in 2014. It is affiliated with the Ministry of Finance and its Institute for Economic Research (VATT); this council can be seen as advisory rather than as carrying out actual evaluations. When it comes to the evaluation of other government programmes, the picture is more fragmented. Since the 1970s there has been a serious effort to monitor and evaluate the implementation of government programmes, that is, to obtain a reliable overall picture of the success or failure of government policies. Since the early 2000s the government has reported on the extent to which government policy goals have been achieved. This, however, has mainly yielded follow-up information on implementation rates rather than on the effectiveness of the government as a whole. The electoral term (four years) seems to be too short for many policies and programmes to take effect. Finland has hardly ever had national programmes except in exceptional cases. Despite certain efforts from the late 1960s until the late 1980s, workable national multiyear planning never really evolved, and programme-based budgeting never really took root either. One of the exceptions comprised the special programmes of government, the first of which was launched in the mid-1990s and the last of which was implemented during the first decade of the 2000s. One difficulty derived from the fact that no


government lasts for more than the electoral period of four years (barring extraordinary elections), so the programmes may have shown little impact before the government that launched them had to concentrate on the next electoral struggle. So far, we have mainly discussed the role of external evaluations in the Finnish public sector. There are, however, several internal evaluations or self-assessments taking place in various branches of the government. These procedures are normally connected to quality assurance procedures or service design models. Perhaps the best example of a comprehensive and well-structured self-assessment framework can be found in the museum administration. The National Board of Antiquities started an ambitious project for developing an evaluation framework for museums. The original framework consisted of internal and external (in fact peer) evaluation parts. Later, in 2015–2016, this model was digitised and turned into the museums' self-assessment framework (see National Board of Antiquities 2016). As Uusikylä (2013) has argued, the overall system of governance can be described as silo-steering (referring to autonomous ministries), and prior to the establishment of the Government Policy Analysis Unit and the Evaluation Council most policy evaluations were carried out on an ad hoc basis rather than in a pre-planned and systematic manner. It is also often the case that the evaluation criteria and indicators used for evaluation differ from those used for performance monitoring and management. Naturally, evaluators do use performance monitoring data while carrying out their studies. It is also known that the number of evaluation reports or explicit strategies aiming to build bridges between evaluation and performance monitoring does not describe the full magnitude of M&E synchronisation, because both are embedded in ongoing streams of performance follow-up. With the institutionalisation of M&E, evaluation becomes a taken-for-granted phenomenon which needs no further justification (Dahler-Larsen 2007). Table 1 shows performance information in three government branches, namely the Ministry of Education, the Ministry of Transport and Communications and the Ministry of Finance. The data collected by Anniina Autero (Autero 2010) are based on the Mid-term Operating Financial Plan, which explains why intra-organisational indicators are mainly missing. However, they do exist in government budgets and in the performance contracts of the ministries in each sector. The three main findings of Autero's analysis are: (1) although all sectors present effectiveness goals, quantitative indicators are missing


Table 1 Assessment of M&E information in the Ministry of Education, the Ministry of Transport and Communications and the Ministry of Finance, 2006/2010 (Uusikylä 2013). For each ministry in 2006 and 2010, the table records whether effectiveness goals are set in the state budget and whether those goals are quantitative (numeric indicators) or qualitative/verbal; whether there are intra-organisational goals in the state budget (numeric performance objectives and indicators on efficiency, quality assurance and HR, and a logical connection between effectiveness goals and other performance indicators); and, concerning horizontality and M&E, whether there are horizontal goals at the branch level and whether monitoring and evaluation information are in sync.

in the sector of the Ministry of Finance; (2) there has been an improvement in dealing with cross-sectoral or horizontal goals from 2006 to 2010; and (3) there are no systematic procedures for matching performance monitoring information with evaluation results. This is corroborated by the National Audit Office of Finland, which has also criticised ministries for their lack of coordination in collating evidence from research, evaluation and performance monitoring and in using this evidence base when preparing new policies. For example, in its performance audit of the management system of the Ministry of Employment and the Economy, the audit office found that the Ministry (as a sector) had 50–60 individual strategies within its portfolio, which clearly hampered coordination and prioritisation. The state auditors also reported that, despite the presence of a research and evaluation strategy within the Ministry, it was not


comprehensive enough and did not sufficiently account for how evaluation and performance monitoring information should be collated. The joint analysis, assessment and research activities coordinated by the government generate information that supports decision making, working practices and management by knowledge. Traditionally, ministries have commissioned a consortium (consultants, researchers and specialists) to carry out evaluations. For evaluations with a sufficiently high monetary value, a call for tenders, in accordance with EU-derived procurement norms, starts the process leading to the selection of the evaluation team. To gain more strategic research and evaluation information, the government introduced in 2014 a new annual government plan for analysis, assessment and research. The plan steers the studies and research selected by the government towards specific priority areas that are relevant to the work of the government and the ministries. The aim of the plan is to create a basis for the systematic and broad-based use of research data in decision making in the government and the ministries, and to strengthen the knowledge base that the government and the ministries use in decision making, knowledge-based policy and overall strategic insight. The resources available for implementing the plan amount to approximately EUR 10 million. Table 2 summarises the institutionalisation and capacity of evaluation in the various sectors of the government in Finland. It shows that approximately half of the ministries have policy analysis or evaluation functions in their sector. The MFA's Development Policy Department has traditionally had an appropriate and well-resourced evaluation unit (EVA-11) that operates under the direct supervision of the Under-Secretary of State for development. EVA-11 participates in international joint evaluations, develops evaluation procedures and takes part in international peer reviews. Recently, some of its functions have been outsourced to private consultancy firms. The MFA is committed to the OECD/DAC principles, criteria and standards for evaluation. There are no specific evaluation standards set by the Finnish government. Evaluations are divided into two functional entities within the MFA structure. Centralised evaluation includes comprehensive, strategically important sector, thematic and country programme evaluations, et cetera. In other branches of government, the relevant research and evaluation assignments have been carried out by government research institutes. There are no sector or policy institutes dedicated to evaluation only.

Table 2 The level of institutionalisation and capacity of evaluation in various sectors of the government in Finland. For Parliament, the Prime Minister's Office and each ministry, the table records whether the institution has a separate research, policy or evaluation unit (for example the Committee for the Future in Parliament, the policy analysis unit in the Prime Minister's Office, the development cooperation evaluation unit EVA-11 in the Ministry for Foreign Affairs, and indirect arrangements through institutes such as VATT, FINEEC, Luke, THL and SYKE), which institutions are typically contracted to carry out evaluations (academics, consultants and government research institutions in varying combinations), and the level of evaluation capacity (weak, medium or strong).


However, many government organisations, such as research institutes under ministries, have to carry out evaluation as one of their functions (e.g. institutes in the domains of social welfare and health, agriculture, forestry and economic policy). There are also institutions such as centres of evaluation, which commission evaluations, 'evaluate these evaluations' and fine-tune the expert recommendations, covering the fields of basic general education, vocational education, polytechnic education and university evaluation (Finnish Education Evaluation Centre 2018). The Finnish Education Evaluation Centre (FINEEC) is an independent agency responsible for the evaluation of education. It operates as a separate unit within the Finnish National Agency for Education. It carries out evaluations related to education, including the operations of education providers from early childhood education to higher education. FINEEC comprises the Evaluation Council, the Higher Education Evaluation Committee and four units: the General Education and Early Childhood Education Unit, the Vocational Education Unit, the Higher Education and Liberal Adult Education Unit, and the Development Services Unit. In the Finnish tradition, the state has relied on dedicated research institutes rather than commissioning research from the universities, unlike Sweden, for instance. However, nowadays, insofar as the institutes are specifically paid for their research by the ministries, the evaluation they do takes on more external characteristics. In 2017, the following institutes were in operation:

• UPI – The Finnish Institute of International Affairs (under parliament).
• VATT – Institute for Economic Research (Ministry of Finance).
• LUKE – Natural Resources Institute Finland (Ministry of Agriculture and Forestry).
• Finnish Food Authority (same ministry).
• MML – National Land Survey of Finland (same ministry).
• Ilmatieteen laitos – Finnish Meteorological Institute (Ministry of Transport and Communications).
• GTK – Geological Survey of Finland (Ministry of Employment and the Economy).
• VTT – Technical Research Centre of Finland Ltd (same ministry).
• STUK – Radiation and Nuclear Safety Authority Finland (Ministry of Social Affairs and Health).
• THL – Finnish Institute for Health and Welfare (same ministry).


• TTL – Finnish Institute for Occupational Health (same ministry).
• SYKE – Finnish Environment Institute (Ministry of the Environment).

Although the position of the government research institutes in carrying out evaluations and applied research is still strong, the role of consultancy firms has strengthened during the last 15 years. In a typical evaluation project there is a consortium consisting of private sector consultants, academics and one of the government research institutions. The selection of the evaluator or consortium is normally based on public procurement, which makes the system rather transparent. Also, evaluation reports are in most cases public documents and available from the homepages of the commissioning ministry or government agency. Most evaluations have been commissioned by the Ministry of Finance, the Ministry of Employment and the Economy, the Ministry of Agriculture and Forestry, the Ministry of the Environment and the Ministry for Foreign Affairs. It is thus not surprising that competence and evaluation capacity are also highest in these ministries. The role of the Prime Minister's Office has strengthened drastically during the last five years in terms of policy analysis and evaluation capacity. This is mainly due to the establishment of the Government Policy Analysis Unit, the systematic monitoring of the implementation of the government programme, the establishment of government strategy sessions and the establishment of the Evaluation Council. The Committee for the Future was established in 1993 and serves as a standing committee in the Parliament of Finland. The Committee consists of 17 members of the Finnish parliament and serves as a think tank for futures, science and technology policy in Finland. The counterpart cabinet member is the Prime Minister. The Committee's mission is to generate dialogue with the government on major future problems and opportunities. The main task of the Committee for the Future is to prepare the parliament's response (the so-called Futures Submission) to the government's report on the future. In this way, the Finnish government and parliament can recognise important political themes at such an early stage that different alternatives and policy lines are still completely open and under development. Also, the Finnish National Audit Office, affiliated with parliament, is accepted as an evaluator; through its performance audits it can be seen as a national evaluation institution.

4 Utilisation of Evaluations

In 2009 and 2010 the Prime Minister's Office launched two important projects to strengthen and improve the strategic management of the Finnish government (and to improve the use of evaluation information in decision-making at the government level) (Prime Minister's Office, Finland 2011a). The premise for these system-level evaluation reforms was a finding by the government that the evaluation of the impacts and effectiveness of policy measures was inadequate in Finland. According to the government the problem was, in particular, that evaluation information is neither systematically used nor required among ministries and agencies. Evaluation activities and the use of evaluations are not organised clearly. No central government function is charged with overall responsibility for the development and maintenance of the evaluation system. Evaluation activities are thus confined to sectors, which makes it even more difficult to manage inter-sectoral policy entities and to recognise how issues are interlinked. Recently, a government working group introduced an operating model that aims at ensuring a strong and horizontal information base for the most important socio-political decisions (Prime Minister's Office, Finland 2011b):

• Evaluation and research data are used systematically in identifying reform needs, in selecting the most effective policy measures, and in the various phases of decision-making and its preparatory processes.
• Information needs are defined and identified starting from political priorities.
• Information resources are of a high standard and are used effectively. The management system is expected to support the timely, efficient and systematic use of evaluation and research data.
• Interaction among the producers and users of evaluation and research data is made smooth.
• Reporting is as practical and light as possible, is linked with other processes, and does not excessively burden public servants and political decision-makers.

These working groups and their final reports served as an important input to the preparation of Prime Minister Sipilä's government programme in


2015. As noted earlier, new evaluative functions were established in the Prime Minister's Office (namely the joint analysis, assessment and research activities coordinated by the government, and the Evaluation Council) to complement the earlier ones (e.g. the Economic Policy Council). To achieve the goal of evidence-based policy-making, a systematic operating model was urgently needed to ensure the creation of a strong and horizontal knowledge base in support of societal decision-making and measures. The operating model has been developed as part of a comprehensive reform of state research institutes and research funding. The coordination of the state's sectoral research in support of decision-making is one of the Prime Minister's Office's areas of responsibility. This includes:

• coordination of the government's analysis, assessment and research activities;
• proposals on key theme areas and priorities for strategic research, presented annually to the government on the basis of the initiative of the Council for Strategic Research; and
• cooperation with ministries, bodies producing analysis, assessment and research information, research financiers, businesses and other organisations.

The MFA, and especially its Development Policy Department, has a long tradition of building a sound and comprehensive evaluation system. This includes publishing evaluation manuals and guidelines, providing training courses and lectures, organising evaluation-related webinars and seminars, et cetera. In recent years the ministry has made a special effort to enhance the utilisation of evaluation findings and to link evaluations with the ministry's development policy decision-making. After each evaluation the ministry prepares a special Management Response Memorandum, which is discussed by top civil servants in the so-called Quality Group. The Management Response Memorandum lists all the actions and revisions that the ministry will carry out based on the evaluation recommendations. In addition, the MFA also commissions several meta-analyses, meta-evaluations and peer reviews to verify the quality of individual evaluation reports and to form a big picture of the effectiveness and impacts of Finland's development policy. These results are often discussed in the National Development Policy Committee.

5 Evaluation in Municipalities and in the Civil Society

The Association of Finnish Local and Regional Authorities (Kuntaliitto 2016) presupposes that evaluation should be tightly integrated into management in the municipalities. The Association recommends that evaluation be made an elementary part of management and planning on both the strategic and the operational level. Evaluation results should be available for planning and resource allocation for the coming year's operating plan (Sundquist and Lovio 2006). In order to develop better evaluation systems in municipalities, the Association has promoted the implementation of the Common Assessment Framework (CAF) as a tool for more coherent evaluation (Vuorinen 2015). Moreover, seminars and trainings on CAF are organised frequently. Evaluation aims to enhance effective, municipally produced services for residents. Especially the present fundamental structural change in the Finnish governmental system requires attention to evaluation. Evaluation supports the efficient development of municipal service production, which is the background agenda and aim in organising structures and practices. The Local Government Act 410/2015 requires municipalities to compose a strategy based on an evaluation of the municipality's present situation, of future changes in the operational environment, and of the impact of these on the performance of the municipality's functions. The strategy must also define the evaluation and monitoring processes for its implementation (Local Government Act 410/2015, part III, chapter 7, Sect. 37). The Act establishes residents' opportunities to participate and exert influence in different ways. Each municipality can decide the methods used for considering their opinions, and in some cases this involves evaluation. Some special groups, such as young people, elderly people and people with disabilities, have special councils for exerting local influence and participating in the planning and development of services and other matters. They can also influence the monitoring of activities. Nevertheless, no commonly shared evaluation system exists in the municipalities (Kuntaliitto 2016). Concerning referenda, there is no systematic practice of using evaluation to provide background knowledge. The participation process of interest groups accumulates information and understanding on local issues for council members and other residents, and oftentimes includes


evaluative elements. Increased knowledge among residents might produce more initiatives for referenda from local citizens. Evaluation is an integrated and regular element of the management process in municipalities. There may be different ways to carry this out, much depending on the size of a municipality and its evaluation capacity. There is no common system for how to do it, but general frameworks are provided by the Association of Finnish Local and Regional Authorities. In Finland, there is a practice of always hearing interest groups when a new issue or matter is being planned in a municipality. Among the interest groups there are also civil society organisations (CSOs). The aim is to refine the existing information in cooperation with the different groups of society and eventually to arrive at more interactive cooperation. Furthermore, there is a need to improve methods and tools for such hearings. Finland is also a member of the Open Government Partnership (OGP), a voluntary international initiative that aims to promote transparency and good governance in order to build better and more empowering services for citizens. Finland's action plan focused on improving citizen engagement and making government information easier for citizens to understand. The Independent Reporting Mechanism (IRM) carries out a biannual review of the activities of each country that participates in the OGP. The OGP process comprises self-assessment reports, which are available for public comment for a two-week period. The report does not specifically define completion levels for each commitment or explain why certain activities have not progressed as expected (OGP 2018). The Ministry of Finance is responsible for coordinating the OGP process in Finland. Most of the activities are voluntary in nature, but to support the implementation of OGP commitments the Ministry of Finance has appointed the OGP Support Group, which includes 16 civil servants from ministries, government agencies and local governments and five representatives from CSOs. The OGP Support Group meets several times per year to discuss action plan implementation. Going forward, the government could prioritise issues that are of high value to civil society, such as corporate transparency, by using the OGP platform to coordinate ministerial support and multi-agency consultation. CSOs and non-governmental organisations (NGOs) are likely to use evaluations and evaluation findings, especially those that concern their cooperation partners or other interest groups. Evaluation reports give them information on the field in which they are operating and help them develop their own operations more accurately. A great deal of the evaluation reports are public


and individual citizens can easily find evaluation reports, presentations and other materials. CSOs are active in doing evaluations themselves. They conduct many projects and other developmental activities, and evaluation is mostly regarded as an important part of the accomplishment of an intervention. Among CSOs there are organisations with well-equipped evaluation skills, and they are actively developing pragmatic evaluation methods. CSOs cooperate with the public sector in the evaluation field both as providers and as clients. The Finnish Federation for Social Affairs and Health (SOSTE) is a national umbrella organisation that gathers together 200 social and health CSOs and dozens of other partner members. They operate at the regional, national and international levels. Evaluation is tightly integrated into developmental work in SOSTE, as it offers the information needed to focus on the right things and steer operations in the right direction. SOSTE also arranges evaluation seminars for CSOs and develops methods for evaluation and self-assessment. The church and different parishes also evaluate their work and have adopted self-assessment into their practices. CSOs are well acquainted with evaluation, at least those in the social and health field. Namely, in order to gain financial support, they are expected to evaluate their operations and projects, mainly by self-assessment. Formerly, the main financier, the Finnish Slot Machine Association, required external evaluation for larger-scale projects, but nowadays, after changes in regulations and organisation, this practice no longer exists. Today the financier is more closely part of the Ministry of Social Affairs and Health, together with two other gaming associations, and is called the Funding Centre for Social Welfare and Health Organisations. Nevertheless, evaluation has become familiar to CSOs, and they have gained practice in planning, doing and using evaluation. As mentioned before, summaries of evaluation reports, and often also the full versions, are publicly available most of the time. Evaluation reports are not discussed publicly unless there is something special or of common interest in them. Evaluation might be mentioned, but normally no thorough discussion about evaluation takes place.

6 Professionalisation of Evaluation

The University of Tampere had a full-scale programme on evaluation, with special funding from the Ministry of Education, between 1997 and 2002. After the funding ended, the programme was terminated. At the moment there


are no university education programmes that specialise only in evaluation. There are, however, several courses and lectures in most of the social science faculties that teach evaluation as part of public management, education, social welfare and health care development, or science, technology and innovation. The Finnish Evaluation Society (Suomen Arviointiyhdistys, SAYFES), some CSOs and government agencies occasionally offer evaluation training courses. Usually these courses introduce new evaluation methods or techniques and are somehow connected to ongoing societal reforms. The Finnish Evaluation Society was founded in 1999. It is a registered association with approximately 300 members. The Society was founded when evaluation was an emerging trend and a new profession of evaluators was growing fast. Professionalism was also needed among those commissioning evaluations and those using and interpreting evaluation results. SAYFES has been developed as a platform for those interested and involved in evaluation-related matters (Fig. 1). The SAYFES members have on average over twelve years of experience in evaluation, ranging from one to 30 years.

Fig. 1 The members of the SAYFES by their roles (SAYFES Database 2017): evaluator 50%, user of evaluation results 25%, subscriber of evaluations 10%, evaluation researcher 10%, other 5%


The mission of SAYFES is to promote the evolution of evaluation theory, practice and understanding, as well as the deployment of interdisciplinary evaluation know-how in society. Its goal is to contribute to a dynamic, international evaluation culture in the activities of the public, private and third sectors. The members of the society represent different fields of society (Fig. 2). In order to meet the objectives of knowledge development and the dissemination of evaluation and evaluation education, SAYFES organises different activities: conferences and seminars, discussion forums on current evaluation topics, workshops on evaluation methods and breakfast meetings where the evaluation activities of different institutions are introduced. Previously SAYFES also published an annual evaluation supplement in the refereed journal Administrative Studies. Lately, this activity has been on hold, but it can hopefully be reactivated at some stage. In 2017, SAYFES also introduced for the first time an annual award for excellent evaluation. International cooperation has been an important part of SAYFES activities since the beginning. Meetings and the exchange of knowledge with the Nordic evaluation societies and NESE are important, as is the collaboration with the European Evaluation Society.

Fig. 2 The members of the SAYFES by their institutional affiliation (SAYFES Database 2017): government agency 28%, university or school 22%, municipality 17%, private sector 11%, civil society organisation 11%, other 11%


7 The Future of Evaluation in Finland

Although the government, CSOs, a number of think tanks and public policy experts in Finland agree that evaluation is needed to give critical feedback to policy makers and the bureaucracy, they also see that the evaluation system needs a new paradigm and methodology to increase the utilisation of evaluation results. Three major external challenges have accelerated this process: (1) an urgent need to understand the effectiveness of public interventions and development aid, (2) increasing complexity and interconnectedness in the world and a clear need for policy coherence, and (3) a need to expand the single-organisation perspective towards the wider sphere of governance (Uusikylä and Valovirta 2007). Sector-based administrative structures do not support comprehensive and appropriate horizontal policy preparation or the use of the multi-policy assets needed to tackle so-called wicked policy problems effectively. The existence of multi-level forms of governance and of various players at the different levels of the steering system (in this case global, European, national, regional and local) makes it even more difficult for decision-makers to see 'the big picture' and to make informed decisions that will really have an impact. All this has altered the dynamics of policy-making and set new restrictions on the credibility of traditional democratic governance models. Traditional evaluation models have been insufficient to cope with increasing complexity and systemic development in a turbulent field of governance. If the existing (linear) evaluation model is broken, what would the new one look like? In spring 2017, the Government of Finland introduced the Place to Experiment, a digital platform developed jointly by the Finnish Prime Minister's Office and a community of Finnish experimenters. The Place to Experiment promotes the government's target, defined in the mid-term policy review session, of making Finland a world leader in innovation and experiments by 2025. There are three levels to the government's new approach to participating in experimentation and disseminating information. The three levels stem from Prime Minister Juha Sipilä's government programme and its implementation plan and build on discussions held with those already trying out these new approaches. These levels are strategic pilot studies, pooled pilots and grass-roots level experiments. Also, Sitra, the Finnish Innovation Fund, has launched several innovative pilots to reform public service and management models. The role of evaluation is to provide ongoing and real-time feedback for policy designers on


results and impacts of these experiments. This can be done by applying a developmental evaluation approach. According to Patton (2011, p. 1), developmental evaluation supports innovation development to guide adaptation to emergent and dynamic realities in complex environments. Innovations can take the form of new projects, programmes, products, organisational changes, policy reforms and system interventions. A complex system is characterised by a large number of interacting and interdependent elements in which there is no central control. Patterns of change emerge from rapid, real-time interactions that generate learning, evolution and development, provided that one is paying attention and knows how to observe and capture the important and emergent patterns. If Finland is to be a world leader in innovation and experiments by 2025, the capacity, competence, creativity and vitality of the Finnish evaluation community also need to be enhanced.

References

Ahonen, P. (2015a). Aspects of the institutionalisation of evaluation in Finland. Basic, agency, process and change. Evaluation, 21(3), 308–324.
Ahonen, P. (2015b). Diffusion, modification and sedimentation of institutional elements in government accounting change. Finland in a comparative perspective during three decades since 1980. International Journal of Critical Accounting, 7(2), 123–141.
Ahonen, P., Virtanen, P., & Uusikylä, P. (2002). Evaluation in Finland. In J. E. Furubo, R. C. Rist, & R. Sandahl (Eds.), International evaluation atlas (pp. 47–61). New Brunswick: Transaction Publishers.
Autero, A. (2010). Tuloksellisuusraportoinnin rooli ja kehittyminen valtionhallinnossa 2000-luvulla? Selvitysmiesraportti (Liite 1). Tulosohjauksen arviointihankkeen loppuraportti [The role and development of performance reporting in state administration between 2000 and 2010. The final report of the evaluation of performance management in Finland, 47]. Helsinki: Ministry of Finance.
Dahler-Larsen, P. (2007). Evaluation and public management. In E. Ferlie, L. E. Lynn Jr., & C. Pollitt (Eds.), The Oxford handbook of public management (pp. 615–642). Oxford: Oxford University Press.
Finnish Education Evaluation Centre. (2018). https://karvi.fi/en/fineec/. Accessed on 6 January 2019.
Kuntaliitto. (2016). https://www.kuntaliitto.fi/asiantuntijapalvelut/johtaminen-ja-kehittaminen/arviointi-ja-laadunhallinta. Accessed on 6 January 2019.
Ministry for Foreign Affairs. (2013). Evaluation manual.
Ministry for Foreign Affairs. (2015). Results based management (RBM) in Finland's development cooperation—Concepts and guiding principles. Guidance Note 2015.


National Board of Antiquities. (2016). http://www.museoarviointi.fi. Accessed on 6 January 2019.
Open Government Partnership. (2018). https://www.opengovpartnership.org/about/about-ogp. Accessed on 6 January 2019.
Patton, M. Q. (2011). Developmental evaluation: Applying complexity concepts to enhance innovation and use. New York: The Guilford Press.
Prime Minister's Office, Finland. (2011a). Final report of the project on developing impact evaluation of government policies, no. 2. http://vm.fi/documents/10616/622966/R0211_Politiikkatoimien+vaikuttavuusarvioinnin+kehittäminen.pdf/008fbdb7-47b7-400d-8d55-6a6da76186d0?version=1.0. Accessed on 6 January 2019.
Prime Minister's Office, Finland. (2011b). Improving the information base of political decision-making—From goals to reality. Working group report: Developing the effectiveness evaluation of policy measures, no. 13. Helsinki: Prime Minister's Office.
Statutory Environmental Impact Assessment. http://www.finlex.fi/fi/laki/alkup/2017/20170277. Accessed on 6 January 2019.
Sundquist, S., & Lovio, M. (2006). Suositus arvioinnista kuntien valtuustoille [Evaluation guidelines to councils of municipalities by the federation of municipalities]. Helsinki: Kuntaliitto.
Tala, J. (2015). Better regulation through programmes and quality standards – Are new perspectives needed? Legisprudence, 4(2), 193–212.
Uusikylä, P. (2013). Transforming silo-steering into a performance governance system: The case of the Finnish central government. New Directions for Evaluation, 2013(137), 33–43.
Uusikylä, P., & Valovirta, V. (2007). Three spheres of performance governance. Spanning the boundaries from single-organisation focus towards a partnership network. Evaluation, 13(4), 399–419.
Vuorinen, E. (2015). Itsearvioinnin pikaopas. Yhteinen arviointimalli. The common assessment framework (CAF). Organisaation kehittäminen itsearvioinnin avulla. Turku: Turun kaupungin Hyvinvointitoimiala.

PART III

Western Europe

CHAPTER 4

The Netherlands

Carolien Klein Haarhuis

1 Introduction

In 2009, a historical overview and state of the art of evaluation policy in the Netherlands was published in New Directions for Evaluation (Leeuw 2009). Since then, the institutional landscape of evaluation has not stopped evolving. With a focus on six indicators of evaluation capacity (Nielsen et al. 2011; Bourgeois and Cousins 2013), new data were gathered in 2015. The resulting report covers evaluation institutions, programming, budgeting, the evaluation process and, finally, evaluation use in central Dutch policy departments (Klein Haarhuis 2016). Data sources encompassed literature, documents and interviews with key informants (Appendix). Several capacity-enhancing initiatives were found to have been taken between 2009 and 2015, such as departmental evaluation structures and programmes to promote the commissioning of effectiveness evaluations. For the purpose of

Carolien Klein Haarhuis worked as a researcher at the WODC at the time of writing this contribution. She currently works for the Dutch Ministry of Finance. C. Klein Haarhuis, Former Researcher at Research Centre WODC, Ministry of Security and Justice, The Hague, The Netherlands.


For the purpose of this Evaluation Globe Compendium, we processed findings from this study and supplemented these with documents (online), publications and expert judgments.1,2 Most of the information presented here concerns evaluation policies and strategies of the central government.

General Country Overview

In the Netherlands, there is quite a long-standing tradition of evaluation, which obtained its first nation-wide impulse in the 1970s, with efforts to achieve a more goal-oriented government budget (Leeuw 2009). This was followed by a broad range of spending reviews in the 1980s. In the following decade, the government began to establish its formal evaluation policy, which was further formalised after 2000. Since then, evaluation activity has blossomed, primarily in and around the central government, but also at the local and regional levels. After 2010, more emphasis has been put, also system-wide, on the learning function of evaluation, alongside accountability. Dutch evaluation policy has always been rather liberal, leaving much room for the community of researchers as well as policy professionals.

2 Institutional Structures and Processes (Political System)

2.1 Evaluation Regulations

Overall Framework: Accounts Act, Regulation on Periodic Evaluation and Policy Reviews Instrument

The central evaluation regulation in the Netherlands is the Government Accounts Act (Comptabiliteitswet), which is elaborated through the regulation on periodic evaluation (Regeling Periodiek Evaluatieonderzoek, RPE) (Ministry of Finance 2015).

1 The author wishes to thank Prof. Dr. Frans Leeuw for comments on two earlier versions of this contribution. In addition, Dr. Marieke Gorree is thanked for additional expert views, particularly with regard to auditing and evaluation at the local level. 2 For reasons of efficiency, we do not refer to the separate interviews throughout the chapter; references to desk material and previous studies are included.


Ministers in the national government are legally responsible for evaluating their policy products periodically. A further elaboration of this regulatory framework took shape in the central Policy Reviews instrument (Beleidsdoorlichtingen), which seems unique in comparison with other European countries (comments at the EES 2016 conference).3 Each budgeted overarching policy goal is to be reviewed every four to at most seven years in terms of effectiveness, efficiency and a range of other aspects, among which the underlying policy logic or theory. Preferably, a policy review is a synthesis study of previous evaluations. Policy divisions within departments are responsible for setting up 'their' policy reviews. This has an (intended) impact on day-to-day evaluation practice, including evaluation programming. The accent is on accountability and legitimacy, but learning or enlightenment has emerged as a second aim (Ministry of Finance 2017). The hope is that this will reduce adverse effects of the instrument, such as performance paradoxes (crowding out). In 2016, a guide for conducting policy reviews (in Dutch) was published (Ministry of Finance 2016). The policy reviews instrument is under continuous refinement, most recently in the context of the operation Insight in Quality (Inzicht in Kwaliteit). Initiatives aim to improve the 'evaluation base' of policy reviews, for example by improving the programming and coherence of evaluations addressing one overarching policy objective. A corresponding evaluation-programming pilot was set up in the health department in 2017. The underlying thought is to enhance the ownership of evaluation programming by policy makers and, as a result, the quality of evaluation products, in terms of both learning and accountability. The Government Accounts Act has been amended with effect from 1 January 2018. The following amendment was made at the request of the Second Chamber of Parliament: a new stipulation (Article 3.1) that each policy proposal should contain an explanation of the goals and proposed policy instruments, as well as of the effectiveness and efficiency aimed for. This should stimulate a more solid foundation of arguments regarding the expected effectiveness and efficiency of new policy proposals. It should also enhance their evaluability.

3 Leeuw (2009) used the term ‘policy appraisal’, referring to this instrument.


Evaluation Clauses in Individual Laws and Regulations

Apart from the aforementioned arrangements applicable to all policies, there are evaluation clauses in individual laws and regulations, obligating the executive to (periodically) evaluate the law in question after, for example, five years, or earlier. Such evaluation clauses can be found in laws within any policy domain, though possibly to varying degrees. Over the past decade, they seem to have been included in legislation more often.4 At the level of provinces and municipalities, there is no sharp demarcation between auditing and evaluation. As regards the use of evaluation findings in policy-making, there are no explicit central, government-wide obligations (Jacob et al. 2015). Exceptions are laws or regulations that include horizon clauses (often subsidy regulations), or experimental legislation (Government of the Netherlands 2011): here, evaluation outcomes are fundamental to the decision whether or not to continue a temporary situation, with or without regulatory adaptations.

Sectoral Evaluation Laws/Regulations and Policies

Apart from some differences, evaluation regulations or laws do not differ substantially between policy sectors or domains. In the infrastructure domain, ex ante evaluation of any large infrastructural project is obligatory. Here, Societal Cost Benefit Analysis is a key evaluation instrument. In the Health policy department, a special Commission on Evaluation of Laws and Regulations has been in existence since 1998 (ZonMw 2013). In terms of evaluation policies and strategies, there is a much wider variety between policy departments. Some of these continuously evolve into new or different ones. Some evaluation arrangements have disappeared over time (cf. Klein Haarhuis 2016), for example in the Internal Affairs Ministry from 2014 onwards. In a key directorate within the Economic Affairs Ministry, a comprehensive evaluation strategy has been operational and developing since 2013.

4 In 2017, a parliamentary motion was adopted in which the executive is requested to include an evaluation paragraph in all legal and policy proposals implying a ‘substantive change’. This evaluation paragraph should explain whether and if so, how the proposal will be evaluated.


The key focus is on the application of experimental designs (effectiveness studies), with assistance from academia and the Central Planning Bureau.

Regional Evaluation Regulations

Regional and local audit offices have been established by law, focusing on the outputs and outcomes of decentralised policies. In practice, only some of these policies or goals are 'covered' by evaluation studies. For example, many decentralised audit institutions have concentrated on the large decentralisation operation in the social policy domain. There is a non-exhaustive overview of audit reports at the local level (NVRR, s.a.). Unfortunately, it is often concluded that efficiency or effectiveness is difficult to establish because end goals are insufficiently clear and the documentation of, for example, policy implementation is incomplete.

Content of Laws/Regulations in Terms of Independence of Evaluation, Quality, Impact Orientation and Available Budget

Beyond the general terms of effectiveness and efficiency in the Accounts Act and the RPE, there are no central evaluation standards in the Netherlands as yet, but this is a topic of ongoing discussion. In the aforementioned guide for policy reviews (Ministry of Finance 2016) and additional sectoral guides, users are informed about different types and methodologies of evaluation. Independence is to be achieved in two main ways: (a) in the process of outsourcing, by setting conditions for the selection of evaluators or evaluation teams; and (b) by installing a supervisory committee to warrant a certain degree of evaluation independence, something quite regularly applied in various policy domains. In such a committee, different groups are represented, for example stakeholders, field experts and academics, alongside commissioners. In the periodic committee meetings, the input of all parties is balanced. Supervisory committees are neither obligatory nor standard practice, except for specific Health regulation evaluations and research conducted by the Research and Documentation Centre (WODC) of the Justice and Security Ministry. Here, there is more stress on scientific quality, or on both scientific quality and representation. Rather than through the application of country-wide quality standards, evaluation quality is to be attained primarily through researcher selection processes. As regards outsourced evaluations, the vast majority in Dutch central policy departments, competition in terms of price-quality ratios in the procurement process is key.


What evaluator or evaluation quality actually is remains rather implicit in most policy fields. Quality criteria were found to be more explicit where long-term contracts are involved (e.g. in the Education department). What is more, in the Netherlands, quality standards often apply to research in general rather than to evaluation in particular. For instance, research institutes at the central departmental level, such as the social and cultural planning office SCP, which performs research including evaluations for the health and social policy domains,5 and the aforementioned WODC, have their own quality standards. Explicit evaluation guidelines were found in the domains of Economic Affairs (experimental), Education (both experimental and explanatory) and Health regulation (legal-empirical), all in connection with special multi-annual evaluation programmes. One final remark as regards evaluation quality: conducting a systematic review or meta-analysis automatically involves quality assessments of the underlying (evaluation) reports. Quite a number of such meta-studies were conducted in the past in the Netherlands (Leeuw 2009) and still are, for example by central research institutes like the WODC, the SCP and the Netherlands Court of Audit. Meta-research is not done on a regular basis, however, and serves varying purposes. As regards impact orientation, end policy goals are an evaluation focus in all departments, at least periodically. In part, this can be ascribed to the overall obligation to evaluate policies periodically in terms of effectiveness and efficiency, and more specifically to the aforementioned 'policy reviews' instrument. The evaluation policy of the department of Economic Affairs has a particular focus on end effects. Within and between policy departments, a variety of envisioned evaluation types can be found; for example, process and explanatory evaluations are promoted alongside impact evaluations. The exact budget per evaluation is often not determined long beforehand; most of the time, evaluations form part of the overall policy budget and/or research budget within a department. Hence the evaluation budget is determined annually and, in some cases, for several years, for example in the case of programmatic evaluation.6

5 The SCP has a permanent methods division. 6 This is not to deny that every policy initiative could be evaluated in a programmatic way, starting with ex ante research and moving via process evaluation and output monitoring to an effectiveness study (Klein Haarhuis 2016). In practice, effectiveness studies quite often are the first evaluation since the start of a policy initiative.


In practice, such programmatic evaluation most often applies to large pieces of (new) legislation or to fundamental system and organisational changes.7 We cannot answer the question as to whether evaluation budgets are sufficient. As regards non-financial means, however, some of our respondents reported in 2015 that policy divisions spend a limited percentage of their time on research and evaluation, judging from (among other things) limited capacity for the co-ordination of research activities. The aspired functions of evaluation differ per policy department/domain and co-determine the set-up of the evaluations. (1) Accountability is a key function of Dutch evaluation research, although policy domains, and even specific fields within domains, vary in this regard. Over the past decade, the departments of Economic Affairs and Social Affairs have had a stronger focus on accountability and end goals than Justice and Security or Education, where the 'mechanisms' at work behind impact are at least as important. (2) Legitimacy and enlightenment are also key evaluation functions (Leeuw 2009) and may coincide with the accountability aim, also in the aforementioned periodic Policy Reviews Guide. Various studies serve this purpose. At the Ministry of Foreign Affairs and Development co-operation, there is a direct organisational link between the finance directorate within policy departments and evaluators from the department's Inspectorate. In practice, however, evaluations are generally qualitative/complex in nature and effectiveness in terms of end goals is hard to prove scientifically, particularly when end goals are quite abstract. This difficulty applies to many other domains and topics, among which social affairs (e.g. integration policies) and Justice and Security. (3) Policy evaluation is not often mentioned in relation to planning and steering purposes. However, ex ante studies (including feasibility studies), organisational evaluations and sometimes process or mid-term evaluations do serve this purpose and are quite widespread across policy departments, particularly in the domains of infrastructure and the environment and agriculture.

7 An example is the evaluation of the reorganisation of the national police and relevant legislation. Here several rounds of evaluations took place covering a period of five years since the introduction. Another example is a multi-annual structural evaluation programme—ex ante and ex post—specifically for environmental legislation, the ‘STEM’-programme (see Uylenburg et al. 2011 for a meta-evaluation of STEM).


In particular, ex ante (societal) cost-benefit analysis (CBA) seems to have gained popularity also in the 'softer' domains of internal, social and economic affairs.8 Ex ante studies are outnumbered by ex post evaluations, however.

Evaluation Use: Initiatives and Practice

Even though the use of evaluations is generally not prescribed, Dutch policy departments address the communication of evaluation reports (and other research) in rather similar ways. Reports are often published and sent to Parliament. In many cases, an official cabinet response is attached. This response often includes a statement as to how a Minister intends to deal with the findings and (if available) recommendations. Often, this statement is also published, depending in part on whether Parliament demanded or was promised an evaluation. Sometimes use is an intended part of evaluation policies/strategies; these intentions are hardly binding, however. Nevertheless, evaluators are usually allowed to, and often do, valorise evaluation findings by means of presentations for policy officials or academics and articles in professional or academic journals. As regards actual evaluation (and research) use by policy makers, there are several Dutch studies, which have often arrived at mixed conclusions in terms of levels and types of use. In practice, it depends particularly on the timing of an evaluation report, in combination with the political constellation in the decision-making process (Johnson et al. 2009), whether evaluation use moves beyond reading a report or writing a policy response. At the local level, the use of auditing and evaluation is seemingly more 'direct' in comparison with central government (according to a local and central audit expert). Contacts with policy officials are more direct and informal. This might stimulate the uptake of findings in the policy process, although this has not been a topic of systematic inquiry.

Parliamentary Involvement in Evaluation

Parliamentarians quite regularly deal with evaluation products.

8 A guideline for societal CBA that was first developed for the domain of infrastructure and the environment has been rewritten and published to serve a variety of policy domains (Romijn and Renes 2013).


Firstly, this is because of the direct institutional relationship with the Netherlands Court of Audit (Algemene Rekenkamer), which, next to audits, produces self-initiated evaluations. Secondly, parliamentarians quite frequently take notice of the wide variety of departmental evaluation studies, next to other types of research,9 and use these in the political debate. There is staff to support them in this regard. In addition, parliamentarians quite regularly ask for evaluation studies to be conducted, in closed or public parliamentary sessions. According to Jacob et al. (2015, p. 21), the use of evaluation research in Dutch parliamentary discussions rose significantly between 2001 and 2011, particularly as regards development co-operation. For example, in the discussion of new policy or legislative proposals, evaluation promises are made, or evaluation clauses are included in a draft law, stipulating that the evaluation report should be realised within X years (evaluations as 'change'). The actual commissioning work is normally done by the responsible policy department(s) and sometimes by the Parliamentary Analysis and Research Department (Dienst Analyse en Onderzoek, DAO).10

2.2 Evaluation Practice

In the Netherlands, evaluation takes place in practically every policy domain (Jacob et al. 2015 on pluralism); however, evaluation intensity and rigour vary between domains. Despite the general legal obligation in the Accounts Act, the ministerial responsibility to conduct evaluations periodically and other obligations, there is often no systematic, multi-annual planning of policy or programme evaluations by policy officials. There are exceptions to this, like the evaluation of some bigger acts/law books in four to five subsequent reports, highlighting a different topic each time (e.g. the general book of administrative law). In 2016, a pilot was started in the Health domain to promote ownership in multi-annual evaluation programming. In the Social Affairs and Infrastructure domains, some evaluations are sparked by ESF programme obligations. Within policy departments, a variety of units are involved in the management of evaluations.

9 For example, over a three-year period (2008–2010) parliamentarians referred to publications by research institute WODC (Justice and Security) over 700 times (Van der Laan et al. 2013). 10 The DAO was installed on 1 September 2017, with the wider mission of strengthening the knowledge and research position of the Second Chamber of Parliament (Tweede Kamer 2017).


The initiative to conduct evaluations is most often taken by the research co-ordinator within a policy division or directorate, often a senior policy official. All policy departments have a financial-economic division responsible for accountability and control. To varying extents, these financial-economic divisions help stimulate the planning, quality and use of policy evaluation research. For example, in the Ministry of Social Affairs, the role of the financial-economic division is quite pronounced. Its cluster Evaluation and policy information (Evaluatie en beleidsinformatie) supervises the programming of evaluations and the aforementioned policy reviews together with the co-ordinators of each policy division. This cluster also monitors the quality and use of evaluations in the domain of social affairs. Similarly, in the department of Economic Affairs, since 2012–2013 a network of divisions and units, with the financial-economic division fulfilling a central role, and four external members (including CPB Netherlands Bureau for Economic Policy Analysis and VU University) has addressed the quality of monitoring and evaluation (M&E) activities, mostly in relation to the study of effectiveness. This was done primarily by launching practical evaluation guides, covering process as well as content, and by systematically exchanging knowledge and experience, for example by means of knowledgeable 'linking pins' in some of the department's policy divisions. As opposed to financial-economic divisions, the primary focus of 'knowledge divisions' is strategic and future-oriented. Ex post evaluations account for just a fraction of their overall research activity.11 Apart from the aforementioned, none of the policy departments we studied has a specific 'evaluation division' or unit. Instead, there are a number of (semi-)independent research institutes that also perform evaluations. There are quite a few of these institutes (CPB, SCP, WODC, KIM, RIVM12), which regularly conduct, and independently publish, both monitors and evaluations, next to a variety of other policy-relevant research. In the Foreign Affairs and International co-operation department, evaluations are conducted by the Policy and Operations Evaluation Department (IOB) and are used for both learning and accountability purposes.

11 With the exception of the knowledge division in the department of Education (see below). 12 CPB = Central Planning Bureau, SCP = Social Planning Bureau, WODC = Scientific Research and Documentation Centre, KIM = Netherlands Institute for Transport Policy Analysis and RIVM = National Institute for Public Health and the Environment.


By far most of the evaluation work is outsourced by the responsible policy officials.13 Outsourcing is mostly done to commercial research bureaus, followed by university faculties/institutes and central research institutes with a link to one or more policy departments (e.g. WODC, SCP, CPB, PBL), or to a consortium of evaluation parties. Despite the dominance of outsourced evaluations, the ratio of internal to external research can vary per policy domain (e.g. allegedly a relatively large percentage of internal studies inside the Education department). There are evaluation support functions, however, varying significantly between departments. For example, the Ministry of Social Affairs has a Chief Science Officer who supports the co-ordination and methodology of research, including evaluations. At the Education Ministry, the knowledge division fulfils a similar role, as does a Policy Analyst Team at the Ministry of Economic Affairs, which forms part of the aforementioned evaluation network. At the Ministry of Justice and Security, WODC project managers of (evaluation) research more or less 'take over' the supervision of evaluations after the TOR stage. Nevertheless, the commissioner/policy official from this ministry is a member of a supervisory committee, which is always installed. The Netherlands Court of Audit independently delivers performance audits to assess whether the ministers' policies are effective, efficient and economical. What is more, inspectorates, like the Health Inspectorate, the Dutch Safety Board or financial market supervisor AFM, sometimes assess the effects of supervision/inspection and hence conduct a type of evaluation (e.g. Goslinga et al. 2016). At the local level, audit offices are generally more focused on process and outputs than on social end effects (local audit expert).

Ratio of Impact to Process Evaluations

Society is not a laboratory; in the often 'messy' reality of policy making it is not easy to deliver internally valid impact evaluations, no matter which country or context. There is often (strong) interaction between evaluation subjects and societal factors other than the policy under study.

13 For example, of published ex ante evaluation studies only 4% were performed internally (Klein Haarhuis and Smit 2017), and experts within various policy departments estimated that this also holds for ex post evaluation studies.


Although baseline measurements and control groups can tackle this problem, these require careful planning and can be hard to organise because evaluation generally moves more slowly than the policy cycle does. Generally, in most Dutch policy domains there is a mix of process and impact evaluations (e.g. Netherlands Court of Audit 2012, 2013). It is difficult to judge the actual ratio of impact to process evaluations. One of the reasons is that some evaluations are claimed to be impact evaluations (possibly in relation to accountability/legitimisation), whereas in fact they are process evaluations, whether or not combined with an assessment of goal attainment (Klein Haarhuis and Niemeijer 2009). In 2012 and 2013, the Dutch Court of Audit assessed the percentage of the government budget that is 'covered' by effectiveness and efficiency studies (impact evaluations), as a means to estimate departmental accountability. It concluded that over the period 2006–2011, less than half (42%) of the government-wide policy budget of €94 billion was covered by evaluations studying the relationship between policy inputs and formal outcome indicators (in the government budget).14 The many process and output evaluations as well as mere goal-attainment studies were not included in this percentage. Evaluation methodology seems to have gained attention across central government over the past 5–10 years (Leeuw 2009). The dominant evaluation methodology varies between the policy departments studied, depending in part on the key policy divisions or units responsible for evaluation. For example, if the financial-economic division is very active, summative approaches often prevail over explanatory ones. Where knowledge divisions or scientific research institutes are involved in evaluation, a mix of evaluation approaches was encountered more often. For example, in the department of Education, the knowledge division has promoted combinations of black-box (experimental) and clear-box (explanatory) evaluation approaches (Ministry of Education 2015; CPB 2011; Pater et al. 2012). The same applies to the WODC of the Ministry of Justice and Security.

14 No assessment of evaluation quality was made. In part, this finding of 42% can be explained by the fact that the overarching policy goals in the government budget (the focus of the Court of Audit) were often not precisely reflected in concrete evaluation studies. Policy officials came up with a number of explanations for the lack of effectiveness evaluations (Netherlands Court of Audit 2012, 2013), among which: it is unclear how to evaluate effectiveness; it was impossible to form a control group or baseline measurement; it is too early to assess effectiveness; budget was lacking; other parties are evaluating or have evaluated the policy at hand; impact evaluation has little added value; the policy has been ended.


Having a mix of evaluation types is often not the result of deliberate multi-annual evaluation programming, however.15 In an ideal situation, baseline measurements and process evaluations are deliberate stepping stones towards subsequent, sound impact evaluations. This programmatic approach to evaluation is difficult to realise if by far most evaluations are being outsourced on an individual basis. If there is a 'fixed' research institute (e.g. WODC or IOB), in-house expertise and co-ordination mechanisms help stimulate coherence in the programming, performance and outsourcing of evaluations.

2.3 Use of Evaluations

Some departments have a long-standing tradition when it comes to policy research and evaluation, making evaluation a more explicit (potential) part of the policy development process. Examples are the Justice and Security Ministry and the affiliated research institute WODC, as well as the department of Foreign Affairs and Development co-operation, with its policy inspectorate IOB. This Inspectorate conducted a broad study into the use of (evaluation) research for policy purposes (Van Gils and Leeuw 2010; Panel van Advies IOB 2011, 2013, 2014). Combined with our study on evaluation capacity (Klein Haarhuis 2016), three main lessons can be drawn. First, to increase their learning value, respondents argued that ex post evaluations should not only look back, but also focus on future actions. Second, evaluations are deemed instructive if policy assumptions (the policy theory) are given a reality check. Finally, knowledge repositories could more often be established in which findings from previous evaluation studies are systematised. An example is the knowledge repository on judicial interventions and recidivism. Professional groups potentially using evaluation studies are, primarily: (1) policy officials and politicians, whether or not acting in concert, and (2) programme or project managers, partly overlapping with the first category. This applies to both central government and the regional or local level. Direct and other types of use have been investigated, for example through evaluation forms to extract commissioners' judgements of (WODC) policy evaluation and other studies. In terms of the conduct and use of ex ante evaluation studies, especially societal cost-benefit analyses, the infrastructure domain has been in the lead for a long time.

15 The interest in more thorough evaluation programming is on the increase, however.


Much (valuable) conceptual use could be taking place without leaving tangible traces in policies in the short term. It has proved difficult to establish conceptual use, although efforts have been made by doing bibliographical research (SCP, WODC). Apart from political science work (e.g. Bovens et al. 2008), the actual aims or drivers of evaluation work have not been an explicit topic of empirical study. In the Dutch situation there is generally a mix of accountability and enlightenment motives, although accents can vary between and also within policy departments. The aforementioned periodic Policy Reviews instrument is driven by both motives, although in practice accountability has been the dominant driver thus far. Currently, the learning element of policy reviews is stimulated by, for example, the Finance Ministry: in a special course for policy officials and others responsible for conducting policy reviews, participants receive cognitive training as well as exchange and discussion (Rijksacademie 2017). As explained, warranting evaluation use is hardly applicable to the Dutch situation. Nevertheless, it seems common practice across policy domains to provide at least a (public) response to new evaluation reports, often indicating how their findings will be used in the (near) future. Such an indication of (intended) use can vary from indicative to concrete, or can even take the form of an actual decision.

3 Societal Dissemination/Acceptance

3.1 Institutionalised Use of Evaluations by Civil Society

There is an empirical relation between involvement during the evaluation trajectory and evaluation use (e.g. Johnson et al. 2009); this also applies to civil society organisations or societal stakeholders. Unfortunately, evaluation use by civil society has not been investigated in a systematic way in the Netherlands to date. Nevertheless, it is widely known that civil society organisations regularly participate in evaluation projects in the Netherlands: (1) as commissioners, (2) as respondents or (3) as members of supervisory committees or 'sounding boards'. In some policy domains, supervisory committees are to represent the various stakeholders involved in the evaluation topic. This has been identified as a reflection of Dutch 'culture' rather than as a result of evaluation policy (Leeuw 2009). One clear example of evaluation use by companies for strategic purposes is the use of a periodic Monitor on civil and administrative dispute resolution by a large Dutch legal expenses insurance company.


It is unknown to us whether such types of use take place on a regular basis. Investigating this would require a large number of respondents within a variety of networks.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Evaluation reports are often made publicly available in the Netherlands. However, the precise size of the unpublished share ('grey literature') is unknown and hard to assess: this would require thorough stock-taking inside policy departments. Actual evaluation findings are quite often discussed in public. In particular, the impact analyses by the CPB of (a) coalition agreements and (b) election manifestos of political parties draw the public's attention, providing voters with a more objective view. Sometimes, highly salient local findings make it to the national news. An example is the report by the Audit Office Rotterdam, in which the cyber security of the local city council data was found to be like Swiss cheese (Volkskrant 2017).

3.3 Civil Society: Demand for Evaluations

Individual citizens, civil society organisations, private enterprises and other actors do ask for evaluation studies in the Netherlands. However, it is not well known how often, or to what degree. For example, in 2016 a lawyers' organisation commissioned a 'second opinion' on a report commissioned by the Ministry of Justice and Security regarding the new right to a lawyer during police interrogations. There are more examples of second opinions, for example on social cost-benefit analyses commissioned in the infrastructure domain. Also, the National Police, stakeholder associations (for example in the education or health sector), environmental (umbrella) organisations, and associations in the social security sector or the prison system have conducted or commissioned research/evaluation on their own initiative, mostly outsourced to (commercial) research bureaus.16 What is more, executive organisations have asked for (ex ante) feasibility studies (Klein Haarhuis and Smit 2017).

16 Thus far, the accent has been on inspection. The aim is to give evaluation a larger foothold in the near future.


4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses, Further Training

In the Netherlands, evaluations are conducted by a variety of professional researchers. As such, evaluation is not an officially recognised profession or academic discipline: there is neither a homogeneous professional group of evaluators nor are there standards for evaluation products or evaluators. Nevertheless, evaluation forms part of an increasing number of education and training programmes. There is no full academic Evaluation curriculum (Master or Bachelor) in the Netherlands. Still, evaluation (methods) courses or course topics form part of quite a few academic programmes, among which Sociology, General Social Sciences, Public Administration/Public Policy, Public/Spatial Planning and Health Sciences, at various universities spread across the country, for example in Groningen, Rotterdam, Utrecht, Nijmegen and Enschede. In Maastricht University's Law Faculty, there is an Empirical Legal Studies programme including Evaluating Legal Arrangements. Public policy evaluation has formed part of the curriculum for one to two decades. In the medical sciences, evaluation (of treatments) takes the shape of RCTs and other forms of experimentation, a long-standing tradition (cf. Cochrane Netherlands 2018). What is more, in co-ordination with academia, there is a variety of trainings for government officials, sometimes a full evaluation training (e.g. at Economic Affairs in the recent past; now institutionalised and transformed at the 'Rijksacademie' under the name 'Policies in perspective'). However, most of the time evaluation is just one module in a wider training, for example on 'Policy and behaviour' or 'Evidence Based Policy and Evaluation' for the Education ministry (TIER 2011). Finally, (Societal) Cost Benefit Analysis, a type of ex ante evaluation, has been part of trainings for policy officials/policy researchers for a longer time.

4.2 Profession/Discipline

Since evaluation is not a recognised profession or academic discipline, there is no certification system for evaluators in the Netherlands.


Journals and Newsletters

There are a number of professional, online accessible journals relevant to the practice of evaluation in the Netherlands. To mention the most salient:

• Policy research online (Beleidsonderzoek Online)
• Public Administration (Bestuurskunde); Public Administration Review (Bestuurswetenschappen)
• Policy and society (Beleid en Maatschappij)
• Socio-legal journals (Recht der Werkelijkheid, Regelmaat)
• Sociology (Sociologie Magazine).

VOPE: The Dutch Evaluation Society Within Vide

In 2012, in the course of a much longer history of evaluation (Leeuw 2009), the Dutch Evaluation Society (Vide-Evaluatorennetwerk) was founded. Given the context just described, a mature infrastructure and a heterogeneous profession, the founders opted for a light organisational structure: a freely accessible network, through which about two hundred 'evaluators' and others are now connected. They participate in meetings on evaluation and otherwise show their interest, for example by participating in the LinkedIn group. Central government employees (officials) are particularly well represented, for example policy officials, auditors and researchers at central (departmental) research institutes or the Netherlands Court of Audit. Evaluators at municipalities, provinces, universities and regional audit offices participate to a lesser degree. The Dutch Evaluation Society forms part of the Vide organisation, the Dutch association of professionals in the field of monitoring, inspection, enforcement and evaluation (Leeuw 2009). Vide is an organisation by and for people who, either individually or through a professional association, can become acquainted with colleagues in their field. Vide aims to be the engine of the professional development of its members and to contribute to the continuous improvement of monitoring, inspection, enforcement and evaluation. In this way, Vide wishes to contribute to a well-functioning government and society. Vide has a periodic newsletter (Vide Nieuwsbrief). In line with this philosophy, the Dutch Evaluation Society focuses on sharing knowledge and expertise and making connections within the highly diverse evaluation profession.


There is a web page set up within the Vide website. The Dutch Evaluation Society is also represented in the editorial team of Policy Research Online (Beleidsonderzoek Online), a Dutch online journal of public administration.

Approach and Activities of the Dutch Evaluation Society

Annually, four to six meetings are organised within the three main themes of the Dutch Evaluation Society: (1) use and usefulness of evaluations; (2) evaluation methods; and (3) evaluation infrastructure. Meetings are organised by a Working Group of six. Most Working Group members are employed by organisations inside the government, including the Netherlands Court of Audit. One of them represents the Working Group in the Vide board. The usual structure of these meetings follows a classic format: one or several presentations by invited speakers, followed by discussion with the attending participants. A few examples of recent events: (1) a meeting about a programme effectiveness evaluation programme in the health care sector (conducted by ZonMw); (2) a meeting on the mandatory periodic 'policy reviews'; and (3) a meeting on Qualitative Comparative Analysis as a promising method for evaluation researchers.

Evaluation (Quality) Standards

As indicated, evaluation methodology seems to have gained attention over the past five to ten years. Neither the Dutch Evaluation Society (within Vide) nor other professional networks or institutions have developed or adopted evaluation quality standards, guidelines or principles as of yet.

Evaluation Market

The evaluation market in the Netherlands, in terms of who obtains evaluation assignments, seems to be dominated by larger research bureaus and consulting firms, followed by scientific research institutes, either at universities or linked to the central government (e.g. Klein Haarhuis and Smit 2017 on ex ante evaluation). The Netherlands is studded with (general) dispute resolution arrangements and institutions (cf. the WODC Paths to Justice monitor in the Netherlands). Evaluation-related disputes are possibly resolved by one of these.


The Dutch National Ombudsman is the most obvious channel when it comes to problems related to the functioning of government institutions. There are quite a number of 'evaluation professorships' in the Netherlands, most of them part-time. To mention a few: a professor of Policy Evaluation at VU University Amsterdam, who is also advisor for evaluation/knowledge issues at the Economic Affairs Ministry and partner at SEO Amsterdam Economics17; a professor of Policy Evaluation at the Erasmus University Rotterdam, with a specific focus on Health, Education and Strategic Philanthropy; a professor of Law, Public Administration and Social Science Research at Maastricht University; a professor of Evaluation in Health Care at the Erasmus University Rotterdam; and a professor specialised in the Education domain (particularly the effects of school inspections and the added value of schools and the school system) at Maastricht University and the Netherlands Inspectorate for Education.

4.3 Research Quality Rather Than Evaluation Quality

Most of the time, quality requirements as formulated by research institutes or commissioning parties address research in general, rather than evaluation quality in particular. Nevertheless, there is awareness as regards the potential of experimental evaluation research (judging from an evaluation programme at the Economic affairs department) as well as about the benefits of contribution analysis (e.g. at the Ministry of Education). What is more, there is an Accreditation Committee for youth-related interventions at the Netherlands Youth Institute.18 Here, a database with accredited interventions is kept. The committee is a national institution that independently assesses the evidence-base of interventions.19 There are ten sub-commissions (topics include, among others: youth health care, sports, stimulation of development and judicial interventions).

17 At the Social Affairs Department, a comparable advisor, a 'Chief Science Officer', has been active, though not specifically in relation to evaluation but with a wider assignment regarding knowledge-related issues. 18 For more info, see www.youthpolicy.nl/ or www.nji.nl/nl/Databank/Databank-Effectieve-Jeugdinterventies/Erkenningscommissie-Interventies. 19 In the context of 'targeted use', Leeuw (2009) previously observed a growth of evaluation and accreditation committees in the Netherlands, relating to the domains of health, education, crime and justice: 'Programs and interventions can be subsidised only if an ex ante evaluation has demonstrated plausibility with regard to the underlying theory and future impact'.


As regards the evaluation of legislation in the health domain, the evaluation studies outsourced and supervised by the Commission for Regulatory Evaluation, under the programme carrying the same name, can be considered a best practice. Here, evaluations of laws should fulfil seven main requirements, from a legal as well as an empirical perspective (ZonMw 2013). Attention to evaluation quality seems to be on the rise, however. This includes the quality of the aforementioned obligatory periodic policy reviews in terms of both accountability and learning: can the degree of effectiveness found be ascribed to the actual interventions, and what are the explanatory mechanisms?

5 Concluding Remarks

In the Netherlands, evaluation is neither an officially accredited profession nor an academic discipline; there are no officially recognised standards for Dutch evaluation products or evaluators. Nevertheless, the evaluation culture has been qualified as 'mature' (Jacob et al. 2015), particularly in terms of the coverage of policy domains (pluralism) and the institutionalisation in government, parliament and the central audit institution. What is more, evaluation forms part of an increasing number of education and training programmes, in universities and for policy officials. At the local/community level, the accent is more on auditing than on evaluation, although the relatively direct communication between commissioners and evaluators contributes to the use of local evaluation reports. As far as the societal discourse on evaluation is concerned, the Netherlands can be considered an average case (Jacob et al. 2015); beyond examples, there is no systematic research on evaluation use by civil society organisations, companies and citizens. At the central government level, evaluation policy is mainly determined by the periodic ministerial evaluation obligation, taking shape in the cross-departmental 'policy reviews' instrument (Leeuw 2009). This periodic policy review demands accountability in terms of the effectiveness and efficiency of departmental policies. Under this umbrella, Dutch policy departments have shaped their evaluation policies/evaluation capacity very differently, both in terms of the organisational structure(s) involved and in terms of evaluation processes and methodological frameworks. Across policy departments, we observed quite a number of programmes and initiatives to facilitate evaluation research in terms of planning, conduct (methodology) and use.


Many evaluation structures and programmes are still quite young, however, making it hard to determine their impact on learning and accountability and, ultimately, on policy effectiveness. Turnover of policy staff, regular budget cuts and other (departmental) developments make it a lasting challenge to consolidate these initiatives. There is active utilisation of evaluation in the Netherlands, even though this is generally not obligatory. A prerequisite for evaluation capacity is that initiatives to enhance policy-relevant knowledge and learning experiences are welcomed rather than punished, in the political arena and by the central accountability system. As regards the latter, learning is promoted by the Finance Ministry (and the Court of Audit), which have begun to encourage policy divisions to disclose and clarify (any) gaps in evaluations and policy reviews, rather than concealing them.

Appendix

Overview of interviews and key informants in the recent evaluation capacity study regarding Dutch central policy departments (Klein Haarhuis 2016).

The study covered the following departments and other organisations: Foreign Affairs/Development co-operation, Internal Affairs, Economic Affairs, Finance, Infrastructure and the Environment, Education, Social Affairs, Justice and Security, and Health and Sports, as well as the National Audit Office, the Council of State and the Scientific Council for Government Policy (WRR).

Divisions and informants consulted included: the Policy and Operations Evaluation Department (IOB); the financial-economic division (FEZ) together with the Knowledge division and DG Wonen en Bouwen; FEZ/BEC; DG Enterprise and Innovation (Regiegroep M&E, later BAT); the Division of General Economic Policy (AEP); the Behavioural Insights Team (BIT); Legislation and legal affairs (5 persons); the Inspectie der Rijksfinanciën (IRF) and Bureau Strategic Analysis (2 officials); the Legal Affairs main division (HBJZ, 2 officials) and FEZ (Finance, Management and Control, FMC); another HBJZ official; the Knowledge division (3 officials); the research co-ordinator of the DG for Higher Education; the Legislation and legal affairs division (DWJZ); FEZ and the Knowledge division of DG SZI; the WODC (EWB, 4 officials) and its top management; a research co-ordinator; an official involved in a previous large evaluation of administrative law; a policy official of FEZ; the Commission for the Evaluation of laws and regulations (CER) at ZonMw; 2 officials of the National Audit Office; an official of the Council of State (by telephone); and the top management of the Scientific Council for Government Policy (WRR).

The interviews were conducted between 8 April and 24 November 2015, in person, by telephone or by e-mail.


References Algemene Rekenkamer/Netherlands Court of Audit. (2012). Effectiviteitsonderzoek bij de rijksoverheid. The Hague: Algemene Rekenkamer. Algemene Rekenkamer/Netherlands Court of Audit. (2013). Effectiviteitsonderzoek bij de rijksoverheid: Vervolgonderzoek. The Hague: Algemene Rekenkamer. Bourgeois, I., & Cousins, J. B. (2013). Understanding dimensions of organizational evaluation capacity. American Journal of Evaluation, 34(3), 299–319. Bovens, M., Hart, P., & Kuipers, S. (2008). The politics of policy evaluation. In M. Moran, M. Rein, & R. E. Goodin (Eds.), The Oxford handbook of public policy (pp. 319–335). Oxford: Oxford University Press. Cochrane Netherlands. (2018). http://netherlands.cochrane.org/. Accessed on 28 February 2019. CPB. (2011). Zicht op effectiviteit. Studie naar evaluatieontwerpen voor onderwijs- en wetenschapsmaatregelen. The Hague: CPB Achtergronddocument. www.beleidsevaluatie.info/wp-content/uploads/2011/06/cpb-achterg ronddoument-zicht-op-effectiviteit-van-beleid.pdf. Accessed on 28 February 2019. Goslinga, S., Siglé, M., & Van der Hel, L. (2016). Het meten van effecten van de handhaving door de Belastingdienst. Tijdschrift voor Toezicht (3). Government of the Netherlands. (2011). Aanwijzingen voor de Regelgeving. http://wetten.overheid.nl/BWBR0005730/2011-05-11. Accessed on 28 February 2019. Jacob, S., Speer, S., & Furubo, J. E. (2015). The institutionalization of evaluation matters: Updating the international atlas of evaluation 10 years later. Evaluation, 21(1), 6–31. Johnson, K., Greenseid, L. O., Toal, S., King, J. A., & Lawrenz, F. (2009). Research on evaluation use: A review of the empirical literature from 1986 to 2005. American Journal of Evaluation, 30(3), 77–410. Klein Haarhuis, C. M. (2016). Evaluatievermogen bij beleidsdepartementen, praktijken rond uitvoering en gebruik van ex post beleids- en wetsevaluaties. The Hague: WODC (forthcoming). M.m.v. A. Parapuf. Klein Haarhuis, C. M., & Niemeijer, E. (2009). Synthesizing legislative evaluations: Putting the pieces together. Evaluation, 15(4), 403–425. Klein Haarhuis, C. M., & Smit, M. (2017). Ex-ante research explored: Numbers, types and use of ex-ante policy studies by the Dutch government. Evidence & Policy, 13(4), 647–668. Leeuw, F. L. (2009). Evaluation policy in the Netherlands. In W. M. K. Trochim, M. M. Mark, & J. Cooksy (Eds.), Evaluation policy and evaluation practice: New directions for Evaluation (Issue 123, pp. 87–102). San Francisco: Jossey-Bass. Ministry of Education. (2015). Notitie over Zicht op Effectiviteit. Intern document Directie Kennis.


Ministry of Finance. (2015). Regeling Periodiek Evaluatieonderzoek (RPE). Staatscourant, 83. http://wetten.overheid.nl/BWBR0035579/2015-01-01. Accessed on 28 February 2019. Ministry of Finance. (2016). Handreiking beleidsdoorlichtingen. www.rijksbegr oting.nl/beleidsevaluaties/evaluaties-en-beleidsdoorlichtingen/handreiking. Accessed on 28 February 2019. Ministry of Finance. (2017). Operatie inzicht in kwaliteit (Letter to Parliament). https://www.rijksoverheid.nl/documenten/kamerstukken/2017/04/10/ka merbrief-aandachtspunten-commissie-financien-over-focusonderwerp-en-oper atie-inzicht-in-kwaliteit. Accessed on 28 February 2019. Nielsen, S. B., Lemire, S., & Skov, M. (2011). Measuring evaluation capacity: Results and implications of a Danish study. American Journal of Evaluation, 32(3), 324–344. NVRR. (s.a.). Nederlandse Vereniging van Rekenkamers en Rekenkamercommissies (Online library). www.nvrr.nl/bibliotheek. Accessed on 28 February 2019. Panel van Advies IOB. (2011, 2013, 2014). Eerste, tweede en derde advies van het panel van advies. The Hague: Inspectie Ontwikkelingssamenwerking en beleidsevaluatie. www.rijksoverheid.nl/ministeries/ministerie-van-buitenlandse -zaken/inhoud/organisatie/beleidsevaluatie/panel-van-advies. Accessed on 28 February 2019. Pater, C., Sligte, H., & Van Eck, E. (2012). Verklarende evaluatie, een methodiek. Amsterdam: Kohnstamm Instituut. Romijn, G., & Renes, G. (2013). Algemene leidraad voor maatschappelijke kosten-batenanalyse. The Hague: Centraal Planbureau, Planbureau voor de Leefomgeving. www.mkba-informatie.nl/index.php/download_file/force/24 1/456/. Accessed on 28 February 2019. Rijksacademie voor Financiën, Economie en Bedrifsvoering. (2017). Maatwerkcursus Beleidsdoorlichtingen. https://www.rijksacademie.nl/. Accessed on 28 February 2019. TIER. (2011). De evidence based beleidsmaker en toezichthouder: Een master voor professionals in evidence based policy and evaluation. www.tierweb.nl/assets/file s/UM/TA/MEPE%20brochure%20nov.2011(1).pdf. Accessed on 28 February 2019. Tweede Kamer. (2017). Over de Tweede Kamer. www.tweedekamer.nl/over_de_t weede_kamer. Accessed on 28 February 2019. Uylenburg, R., de Boer, J., Peeters, M. G. W. M., & Oosterhuis, F. H. (2011). Het STEM-geluid bij de evaluatie van milieuwetgeving. Arnhem: STEM Publicatie, 2011/1. https://dare.uva.nl/search?identifier=d9393ced-70b1-43ba-81e8-a1 69f6b30a0d. Accessed on 28 February 2019. Van der Laan, A. M., Beijaard, F., Geurts, T., Kalidien, S. N., & Van Straalen, E. K. (2013). Self-evaluation, Research and Documentation Centre (WODC) Ministry


of Security and Justice (2006–2012). The Hague: WODC. https://www.wodc. nl/binaries/zelfevaluatie-wodc-2006-2012_tcm28-92074.pdf. Accessed on 28 February 2019. Van Gils, G., & Leeuw, F. L. (2010). Leren van evalueren: Onderzoek naar het gebruik van evaluatieonderzoek bij het Ministerie van Buitenlandse Zaken. The Hague: Ministry of Foreign Affairs. Volkskrant. (2017). Burgemeester Aboutaleb loopt onnodig groot veiligheidsrisico. www.volkskrant.nl/binnenland/burgemeester-aboutaleb-loopt-onnod ig-groot-veiligheidsrisico~a4483405/. Accessed on 28 February 2019. ZonMw, Commissie Evaluatie Regelgeving. (2013). Brochure Programma Evaluatie Regelgeving. The Hague: ZonMw. www.zonmw.nl/nl/onderzoek-resultat en/kwaliteit-van-zorg/programmas/programma-detail/evaluatie-regelgeving /. Accessed on 28 February 2019.

CHAPTER 5

Belgium
Valérie Pattyn and Bart De Peuter

1 General Country Overview


This chapter is to a large extent based on the conference paper entitled Once Laggard, Always Laggard? A Multi-level Analysis of the State of Evaluation Institutionalisation in Belgium, which the authors presented at the 2016 European Evaluation Society conference in Maastricht. For this conference paper, we launched a questionnaire that was modelled on the survey used in Jacob et al. (2015). We owe special thanks to Steve Jacob for sharing this questionnaire; and we are most grateful to the eleven respondents who provided us with valuable input on the state of evaluation institutionalisation in Belgium. The insights collected via this survey have been integrated in this chapter. V. Pattyn, Institute of Public Administration, Leiden University, Leiden, The Netherlands, e-mail: [email protected]. B. De Peuter, Belgian Court of Audit, Brussels, Belgium, e-mail: [email protected].


In order to have an accurate understanding of the level and way of institutionalisation of policy evaluation in Belgium, it is essential to highlight some key tenets of its politico-administrative setting: the federal structure, multi-party coalition governments and partitocracy. Since 1993, Belgium has officially been a federal state. Consecutive constitutional reforms have transferred major competences from the national (coined 'federal') level to the regional level. The distribution of competences at the regional level is mainly organised along two broad axes. The first axis concerns language and culture. The Flemish-, French- and German-speaking Communities are each in charge of person-related matters (examples: education, culture, youth). The second axis is designed to accommodate the diversity of socioeconomic interests in the country and has constituted the basis for the creation of three territorial regions: the Flemish Region, the Walloon Region and the Brussels-Capital Region. The regions are in charge of territorial matters (examples: environment, agriculture) and of legislation with regard to municipalities and provinces. Despite the devolution of competences to regions and communities, the federal level in Belgium is still competent for traditionally core tasks of a state, including internal security, foreign affairs, defence, justice and finance. Moreover, in several domains, competences are scattered over the federal and regional levels (examples: labour market, health care and social security). All government levels have their own government, parliament and administration, with the exception of the Flemish region and community, which have merged their institutions (Pattyn and Brans 2015). Secondly, and related to the consensus-seeking culture in Belgium, the political landscape is characterised by a system of proportional elections. This, in turn, triggers a fragmented party-political landscape and multi-party coalition governments at all levels. To be clear, since the 1970s political parties have also been regionalised in their organisation, with the exception of the extreme left-wing Workers' Party (PvdA/PTB). Partitocracy constitutes a third key characteristic of the Belgian politico-administrative setting (Deschouwer 2012; Dewachter 2001). In a country that is highly fragmented, political parties act as gatekeepers for demands coming from the executive and legislative branches and play a decisive role in the policy-making process. Linked to the partitocracy is the influential role of the personal staffs of ministers, which are large by international standards and dominant in policy formulation (Brans et al. 2005). These advisors assist the minister in identifying problems, in outlining policy and in everyday decision-making. They also play a pivotal role in cross-departmental coordination.


Why are these country features important to the study of evaluation institutionalisation in Belgium? First, the complexity of the federal structure makes it challenging to describe the state of evaluation praxis in a manner that does full justice to the different administrative cultures and evolutions at the different governmental levels in the country. Belgium has not been included in reputed publications that describe the level of evaluation maturity and institutionalisation across OECD countries (including Furubo et al. 2002; Jacob et al. 2015); an exception is the study of Varone et al. (2005). Secondly, for decades the process of federalisation has required a great deal of political and administrative energy and focus, with an impact on the attention given to, and the degree of, structural policy-analytical capacity building. As a matter of fact, the partitocracy and the influential role of the large ministerial cabinets intensify the tension between a political rationale and the more rationalistic perspective to which evaluations can contribute. As Varone et al. (2005, p. 262) have put it: “As evaluators can call into question policy choices, which are the products of delicate compromises between four to six coalition parties, it is quite obvious that the party leaders would resist this potential loss of coordinating power”.

Against this background, and on the basis of the few earlier comparative studies available (Jacob 2004, 2005; Varone et al. 2005), Belgium is to be considered a latecomer in terms of evaluation culture and praxis, situated in the slipstream of the second wave with which evaluation praxis spread across Europe during the early 1990s. It was rather external pressures that put evaluation on the agenda, such as evaluation demands and impulses from the EU (Furubo et al. 2002). In addition, New Public Management (NPM)-inspired government-wide administrative reforms have played a major role. Yet intra-country differences exist: evaluation demands reached Wallonia more, and earlier, than Flanders. The explanation lies mainly in the evaluation requirements related to the European co-financed programmes (i.e. structural funds). Irrespective of these intra-country differences, from an international comparative perspective the important conclusion anno 2004–2005 was that Belgium, after a late start in the 1990s, had to be classified as a laggard in terms of evaluation institutionalisation (Jacob 2004, 2005; Varone et al. 2005).

In this chapter we take stock of recent developments in policy evaluation praxis more than a decade later. We describe the state of institutionalisation of evaluation in Belgium with a view to the federal and regional levels, the latter with some emphasis on the Flemish region.


2 Institutional Structures and Processes (Political System)

2.1 Evaluation Regulations

2.1.1 A Formal Anchorage of Evaluation Requirements in Laws or Regulations

In Belgium, the practice or use of policy evaluation as such is not anchored in the constitution, nor in national laws or regulations. Nor does a central, country-wide policy or strategy on policy evaluation exist. The country-specific features of the political system described above have contributed to the spread of the perception that evaluation is functional for accountability rather than for policy learning or improvement. However, in the last 20 years policy evaluation has been put on the agenda across the entire spectrum of policy domains. At the same time, government-specific laws and regulations have been introduced on the implementation of generic or sector-specific evaluation tools such as regulatory impact analysis (RIA) and environmental impact analysis (EIA).

Following supranational regulation (EU) and advocacy (OECD), the implementation and use of some evaluation tools with a broad scope have been regulated in Belgian domestic law, both at the federal and at the regional level. At the federal level, regulatory impact assessment is anchored in the decision-making process by law. It is mandatory for all draft legislation and ministerial and royal decrees, excluding draft legislation constituting the ratification of international treaties, intergovernmental agreements or auto-regulation regarding the federal government. In an effort to better coordinate the existing federal thematic ex ante tests, it was decided in 2013 to integrate a number of them, covering topics such as administrative simplification (established in 2004), sustainable development (2007), gender (2007), small and medium-sized enterprises (SMEs, 2008) and policy coherence for development aid (2013), into a more generic RIA framework (DAV 2014). It remains to be seen whether the integrated RIA achieves its goal of raising awareness for assessing the impact of policy decisions, and whether it will overcome the difficulties which the former separate tests encountered, namely the limitations of self-evaluation and weak quality control by the overseeing agency (OECD 2015).


In Wallonia, a similar integrated procedure for the ex ante evaluation of proposed legislation exists, but the Walloon regional government refers to this set of tests as 'an examination of conformity of sustainable development goals' (Regional Law of 27 June 2013). The Walloon version can be considered a light variant of the federal RIA. The appraisal is carried out by a central unit within the government and its result is formulated as internal advice to the government.

In Flanders, the introduction of RIA was regulated in 2000 as part of a reform package for better regulation. Since 2005, the RIA has been required for all drafts of parliamentary decrees and governmental acts which have a regulatory effect on citizens, private or non-profit organisations. By now, the RIA has developed into a standard procedure in policy-analytical work, specifically as regards the preparation of policy initiatives that are translated into regulation. It is intended to outline the pros and cons of possible policy options by comparing expected direct and indirect effects. All in all, the relevance and impact of the RIA should not be overestimated. While it is meant as a tool supporting policy-makers in their decisions, it quite often serves as an after-the-fact justification for decisions that have been arrived at through political bargaining (Fobé et al. 2017b).

In addition to the RIA, the well-known, EU-developed EIA has been converted into regional regulation with specific accents. Although linked to environmental policy, this ex ante evaluation technique is used for a wide range of governmental decision-making. In Flanders, the Decree on environmental impact assessment of 2002 (amending the Decree of 1995 containing general regulations on environmental policy) transposed the European Directives and adapted them to the Flemish context. The Flemish variant includes an additional participatory stage in the scoping phase (see below). The administrative regulations further require a systematic and scientific analysis of the expected environmental impacts of the intended plan or project and of its 'reasonably to be considered' alternatives. In addition, the assessment report should contain a conclusion in which the best acceptable alternative and recommendations are outlined, in terms of mitigating measures to avoid, limit, remedy or compensate for the substantial negative effects.

Government-wide public sector reform initiatives have also been a major catalyst for the adoption of evaluation practice and use, especially in Flanders. Inspired by NPM, the so-called 'better administrative policy' reform of 2003 was aimed at a decoupling of the policy cycle. Departments were to be conceived as core administrations responsible for planning and evaluation, while autonomised agencies were assigned the responsibility for policy implementation. Departments were also charged with the explicit obligation to produce outcome evaluations at the level of the policy domain.


While the philosophy of the reform was to concentrate evaluation capacity in the departments, the present-day situation, more than a decade later, shows considerable variation across policy domains (see below). Moreover, the envisaged overall outcome evaluations were never developed. Nevertheless, the agenda-setting role of the reform for evaluation must not be underestimated.

Hence, despite government-wide initiatives, sectoral dynamics still prevail and have resulted in substantial variation across policy domains. The latter also holds for sectoral regulations and policies on evaluation. In many policy fields, the institutionalisation of evaluation has been strengthened in the past 10 years. Indications for this, apart from increased practice, can be found in the presence of an evaluation framework, long-term planning of evaluation activities and/or systems of evaluation quality control (Desmedt et al. 2016). An explicit recognition of the evaluation function, also in organisational charts, proved to be an important lever in this regard. By way of illustration, we concisely elaborate on the institutionalisation of evaluation practice within a selection of policy sectors at the Flemish regional level (environment; education; and economics, science and innovation) and at the federal level (development cooperation).

Environment has traditionally been a leading sector when it comes to policy evaluation in Flanders. In 1995 and 1997, services within government agencies were given the duty to publish evaluation reports. This was deemed compatible with an integrated approach towards policy planning, reporting and annual programming. A lack of expertise in evaluation, however, delayed the genuine establishment of the policy evaluation pillar until the turn of the millennium. Although considerable evaluation activity was developed at that time, it appeared not to be sustainable, despite the existence of the above-mentioned regulations and evaluation clauses in environmental legislation (see below).

A more sustainable development can be found in the field of education, another domain with a relatively long evaluation tradition and high evaluation activity. In the education field, one can identify several requirements for evaluation in various thematic decrees, yet there is no overarching regulation on evaluation. Nonetheless, the administration of the education sector has invested considerably in evaluation quality and capacity via the establishment of a separate unit which coordinates and supports evaluation activity. Exemplary are the introduction of a tailor-made evaluability assessment tool by this unit and the development of an introductory manual that is widely diffused within the educational administration.


Similar dynamics can be observed in 'economics, science and innovation', where a specific departmental unit is in charge of developing an evaluation agenda and of ensuring evaluation quality control in its area. At the federal level, we would like to highlight the field of international development as belonging to the high-end policy sectors in terms of evaluation institutionalisation and as having a strong regulatory framework. The Special Evaluation Office for Development Cooperation was founded by Royal Decree in 2003; the decree was updated in 2010 and 2014. The Office is assigned the tasks of planning, conducting and following up evaluations regarding every form of development cooperation (co-)financed by the federal state. Its evaluation focus lies on the relevance, efficiency, impact and sustainability of interventions. The tasks of the service were further translated into a policy document in 2014. The framework stresses the need for cooperation with all parties involved in the planning and application of evaluation processes in order to match the stakeholders' needs, to enhance the credibility of evaluations and to ensure that findings and recommendations trigger improvements and new dynamics in both policy and practice. It is exactly the combination of a sound legal basis and the strategic and operational elaboration in a policy document that constitutes a firm basis for the Office's evaluation practice (De Maesschalck et al. 2016).

2.1.2 The Institutionalisation of Evaluation Within Parliament

In principle, one could view Parliament as an important trigger of evaluation demand, as it regularly asks for the inclusion of evaluation clauses in legislation. In fact, in the neo-corporatist and consensus-style policy context of Belgium, these clauses are often used as a mechanism to overcome a political deadlock and function as leverage to achieve a compromise in difficult negotiations over policies. At the federal level, the insertion of evaluation clauses is rather limited. In the Flemish Parliament the management services keep an inventory of all reporting clauses in decrees, and the numbers are revealing. Between 1971 and 2013, a total of 285 clauses were identified, with differences in frequency (once, yearly, other periodicity) and level of concreteness. About 38 concerned an explicit obligation to conduct an evaluation and to deliver it to Parliament. Yet, in the absence of rigid follow-up mechanisms, in only six of the 38 cases was an evaluation actually conducted (SERV 2015). Compared to the executive branch of government, the institutionalisation of evaluation within the legislative branch is thus clearly less pronounced.


Hindrances are the lack of specific structures and the absence of evaluation-specific budgets. At the federal level, a parliamentary committee for legislative evaluation was founded and was operational between 2011 and 2014; it now exists only on paper.

Nevertheless, the balance is not merely negative. Several attempts toward a stronger embedding of evaluation in parliamentary praxis can be identified. In the Flemish Parliament, at the initiative of its chairman, a thorough reflection exercise was recently organised around different scenarios for more systematic ex post evaluation of decrees. The Social-Economic Council suggested a better follow-up of evaluation clauses and the inclusion of more evaluative information in yearly policy letters. In addition, it called for a broader use of existing instruments such as thematic debates and hearings, stronger cooperation with the Court of Audit and the drafting of an evaluation agenda (SERV 2015). It remains to be seen, however, whether this will evolve toward stronger institutionalisation. Besides this, evaluations are commissioned or conducted by Parliament in the context of inquiries triggered by exposed (indications of) public sector dysfunctions (e.g. railway safety policy after a train accident, anti-terrorism policy after terrorist attacks) or with a thematic focus (e.g. the national commission for the evaluation of the law on abortion). Reports of the Belgian Court of Audit are sometimes, though not always, discussed. The utilisation of reports usually ends after the hearings, though.

Despite the low level of institutionalisation in Parliament, one should acknowledge that a substantial number of members of parliament do show attention and interest for evaluation, as an analysis of parliamentary written questions in the Flemish Parliament revealed (Speer et al. 2015). The initiation, the results and the use of evaluations turned out to be of most interest to parliamentarians. Interestingly, both majority and opposition party members frequently ask evaluation-related questions. For Wallonia, observers have noticed that the Walloon Institute for Evaluation, Foresight and Statistics (IWEPS) is less frequently asked for input by Parliament. IWEPS is the central unit for evaluations commissioned by the Walloon Region (see below). In this region and in the Brussels-Capital Region, too, legislation often contains evaluation clauses leading to evaluation activities. Yet, specific structures or procedures for the dissemination and use of evaluation findings do not exist.

2.2 Evaluation Practice

2.2.1 A Relatively Wide Diffusion of Evaluation Practice

As mentioned in the introduction, Belgium was a late starter, to be situated in the slipstream of the second wave of countries in which evaluation was introduced in the late 1990s (Furubo et al. 2002). Policy evaluation activity is nowadays widely diffused across the entire public sector. Generally speaking, evaluation praxis (apart from RIA) seems to have developed more systematically at the regional level than at the federal level. EU demands for evaluation and public-sector reforms encompassing attention to evaluation have had a clear catalyst function for the regional level. For the federal government level, no such major catalysing stimuli can be identified.

Empirical indications of evaluation activity can be found in a large variety of sources, such as policy (planning) documents referring to existing or planned evaluation procedures and activities, announcements of public tenders and evaluation reports published on governmental websites, and references to evaluation in recent government agreements. The Flemish government agreement 2014–2019, for example, includes 50 explicit references to evaluation commitments. This number is but a small share of the total number of announced evaluations. Sectoral ministers can highlight additional evaluations in their so-called policy notes, that is, formal communications in which ministers outline their policy intentions for the upcoming government term. In a longitudinal analysis spanning 20 years, 247, 254, 347 and 231 references to evaluation (with context information on the purpose) were counted for the respective five-year terms between 1999 and 2019 (De Peuter and Pattyn 2016).

Substantial differences again exist across policy domains. Belgium largely follows international trends: the domains that are front-runners in evaluation in an international context are also the domains where we can detect most evaluation activity in Belgium; and, vice versa, this also applies to laggard domains. Policy domains such as education, employment, public health and development cooperation have a strong evaluation culture. Although policy evaluation in Belgium is not restricted to those areas, these are clearly the fields that distinguish themselves in terms of the frequency of evaluation (Pattyn 2015). The same can be said for social affairs at the federal level, for economics, social policies and poverty in Wallonia, and for the combined domain of economics, science and innovation as well as environment in Flanders.


But, as stated above, the enthusiasm for the evaluation of environmental policy has, except for the obligatory environmental impact assessments, faded away in recent years, mainly due to a lack of political commitment, an insufficient match between supply and demand concerning focus, and too little coordination capacity within the department (Crabbé and Vansteertegem 2016).

For Wallonia, we already mentioned the important role of IWEPS. As of 2004, the Walloon government has systematically charged this economic research institute with the evaluations of the successive Marshall Plans, that is, large-scale economic redeployment programmes. Given the size of these programmes and their importance for Wallonia, the evaluation activity in this region to a large extent relates to the domains of economic, social and employment policies. In 2015 IWEPS also evaluated the government's anti-poverty programme. At the other end of the continuum we can situate policy domains such as justice and home affairs (federal level) or culture (regional level), where evaluation predominantly takes place on an ad hoc basis. One should be careful, of course, in classifying entire policy domains in terms of evaluation activity. Within certain policy domains, large heterogeneity often exists at the level of single public sector organisations. Even in leading fields, such as education, not all public sector organisations are equally active in policy evaluation; and in policy domains that are generally not very active in policy evaluation, there are sometimes organisational islands with a strong evaluation culture (Pattyn 2014).

In 2015, the Research Centre of the Flemish Government launched a survey among leading civil servants of all policy sectors, across departments and agencies (population: n = 368; responses: n = 186). The survey provides relevant insights as to the main initiators of evaluation: 64% of the respondents stated that their own division mostly or always takes the initiative to set up an evaluation. Half of them (50%) (also) pointed to a request from the minister, 30% to obligations in regulation, and 20% to stakeholders asking for evaluations or to evaluations following specific topical events. Parliament was indicated by 16% as the most common or exclusive source of demand for evaluation (Verlet et al. 2015).

2.2.2 Type of Evaluation Practice

A taxonomy of evaluation can use the timing (ex ante, interim, ex post) or the focus (input, process, output, outcome) as a distinguishing criterion.


When it comes to the timing of the (ad hoc) evaluations conducted, there is a strong bias towards retrospective (interim/ex post) evaluations compared to ex ante evaluations. The embedded RIA applications are not taken into account here. When looking at the focus of evaluations, the different government levels have relatively similar profiles. At the federal level, evaluations with a focus on outcome do take place, yet output and process evaluations are comparatively more numerous. Considerable monitoring and evaluation capacity is occupied with the follow-up of timely service delivery and the number of inspections. Less attention is paid to the effectiveness of the many measures that are mentioned in action plans. When the ex ante impact assessments related to the introduction of new regulations are also considered, the picture is more nuanced: in this type of assessment, the focus lies on the expected effects and impact of various policy alternatives.

As for Flanders, the above-mentioned survey results are telling (Verlet et al. 2015): 90% of respondents indicated that their evaluations 'always' or 'almost always' concern outputs, and 62% reported 'always' or 'almost always' investigating policy effects. Although survey results should be treated with care, it is clear that a lot of investment and attention goes to monitoring activity and output evaluation. The latter is not always accompanied by effect or impact evaluations. In the Walloon Region similar trends can be identified: a lot of monitoring practice and output evaluations are being conducted. According to the experts we contacted, the outcome-focused evaluation of the economic Marshall Plan, coordinated by IWEPS, is rather an exception.

2.2.3 Locus of Evaluation Practice

As in-house policy evaluations are not always officially classified as 'evaluations', accurate data on the ratio of in-house to outsourced evaluations are not available. Research in the Flemish context (Pattyn and Brans 2013) concluded that most evaluations are conducted in-house, although there are departments (e.g. education) that systematically opt for external evaluations. For the other government levels, we expect similar trends. Even when an administration decides to outsource evaluations, this is not always straightforward. Experts at all government levels voiced concern over the limited pool of external evaluators. The relatively small supply side of the Belgian evaluation market influences the extent of evaluation outsourcing. Moreover, as usually only a limited number of applicants respond to evaluation calls, the same evaluators inevitably end up evaluating the same policies time and again.


Similarly, a handful of private evaluation firms can easily build competitive advantages in a limited market. Methodologically, the pool of techniques used is also restricted. Furthermore, peer reviews are hard to implement, as everybody knows everybody else (Pattyn and Brans 2013).

As to future trends, capacity plays a key role. In Flanders the available survey (Verlet et al. 2015) highlights a divided perception: 30% of the respondents report having sufficient internal capacity to meet the evaluation demands addressed to their administrative entity, while about 33% experience a lack of capacity. About 60%, in turn, state that they have sufficient knowledge to conduct or steer external evaluations, yet only 30% feel that there is sufficient external capacity for evaluation. In terms of funding, a notable share of 25% report having a dedicated budget for evaluation. Also relevant is the significant reshuffling of the Policy Research Centres (Steunpunten). These Centres were established in 2001 with the aim of conducting short-term applied research, as well as fundamental long-term research, on themes negotiated with the government. Such consortia of university institutions existed in most policy domains, and in the past they also conducted a considerable share of evaluative research. In 2014, however, the system of co-financing by the Science department was abolished, resulting in fewer and smaller centres and leaving the policy domains concerned with a choice between decentralised outsourcing, more in-house evaluation, or reducing the amount of evaluation activity.

2.2.4 Evaluation Institutes, with Varying Levels of Independency

During the previous decade, all government levels took major steps to institutionalise evaluation in the executive, through the creation of divisions or units that are (also) responsible for supporting the strengthening and harmonisation of evaluation practices. As to the horizontal institutionalisation of evaluation practices at the federal level, the Administrative Simplification Division, situated within the office of the Prime Minister, played a major role in coordinating the introduction of regulatory impact analyses (RIA) across federal policy departments. Today, it hosts the secretariat of the Impact Analysis Committee, which provides advice and feedback on the implementation of the RIA with the aim of strengthening the quality of applications of this ex ante evaluation tool and of legislation in general (Fobé et al. 2017b).


At the regional level, IWEPS has a prominent function in the horizontal institutionalisation of policy evaluation with regard to the policy domains for which the Walloon Region is competent. It was explicitly charged with the task of centralising studies commissioned by the Walloon Region and acting as a support agency regarding methods of analysis and policy decision-making (Fobé et al. 2017b). Within the Brussels-Capital Region, the Institute for Statistics and Analysis has performed evaluations of regional policies for its government since 2013. Within the French Community and in Flanders, evaluation capacity is less coordinated. The Study Centre of the Flemish Government has been much more active in the support and coordination of monitoring instruments than of evaluation.

Despite the existence of horizontal evaluation units, evaluation capacity development to a large extent evolves along dynamics at the organisational and sectoral levels. In organisational terms, policy domains predominantly function as disconnected silos. For policy evaluation, this implies that administrations are searching for answers to the same issues but do not systematically exchange evaluation-related knowledge, experience or documentation. The 'island culture' also hampers cross-sectoral evaluations that concern transversal policy issues (such as sustainable development). Across the public sector, situated either within the centralised departments or within the public agencies that operate at arm's length from these departments, units were created or given the additional task of conducting or coordinating evaluation. These have served as an important stimulus for the development of evaluation capacity in many public sector organisations (Pattyn 2015). Some actively conduct evaluation studies themselves (e.g. within the Flemish Employment Agency, the federal Service for Sustainable Development, and the Research administration in the Walloon region). Others predominantly outsource evaluations, and still other departments rely more on the input of horizontal services (Fobé et al. 2017b).

Although these units are visible in the organisational chart, it would require in-depth research to accurately assess their actual extent of autonomy. In principle, most of them are not conceived to work on a fully independent basis. The earlier mentioned Special Evaluation Office for Development Cooperation is an exception. It has a large degree of autonomy, both legally, vis-à-vis the Directorate-General for Development Cooperation and Humanitarian Aid, and functionally, in dealing with the planning, implementation and steering of evaluations. The Office is accountable to the public and to parliament, to which it submits an annual activity report (De Maesschalck et al. 2016).


Despite the formal institutionalisation of evaluation units at the organisational level, observers have expressed concerns about the sustainability of the evaluation capacity in many organisations (Desmedt et al. 2016). The evaluation function is often performed by one or only a handful of civil servants who are strong believers in evaluation. These 'evaluation entrepreneurs' frequently act as the de facto evaluation memory of the organisation, in the absence of a sound archive of evaluation practices. This reliance on a few experts constitutes a risk to evaluation capacity building in the longer run.

The Belgian Court of Audit also undertakes evaluative activities. Since 1998, the supreme audit institution has had a mandate to examine public policies in terms of economy, efficiency and effectiveness, alongside its classic practice of financial audits and legality and regularity audits. Since 1999 it has conducted so-called performance audits (about ten per year on average, based on an internal count). While in the past this type of audit focused rather on the presence of preconditions for effective and efficient policy and on goal attainment, recent examples demonstrate a shift toward policy evaluation with a substantive judgement on the outcome and impact of policy initiatives. In its strategic plan (2015–2019) the Court of Audit underlined its intention to pursue this shift, with the goal of assessing policy results on societally relevant themes, and it is currently trying to develop its evaluation capacity accordingly. It has full autonomy to select policies and programmes and does so based on a risk analysis and indications of potential for improvement. Moreover, as a collateral body of parliament it aspires to be a reference for parliamentary work, policy-making and public debate. The Court of Audit also aligns its activities with the Flemish Audit Office, which acts as an internal audit service for the Flemish governmental administration with a focus on organisational control. The Belgian Court of Audit is competent for all governmental levels except the local level (for which the regions can perform external audits of the local internal control systems).

It must be noted, though, that the performance audits conducted by the Court of Audit account for only a very small share of the total evaluation praxis in Belgium and its regions. The exact share depends on the very definition of policy evaluation, of course. Nevertheless, the institution's reports gain considerable attention from policy-makers. Follow-up of its audits is also monitored: within the Flemish region, for example, ministers have to explain to Parliament in their annual policy notes how they deal with recommendations received from the Court of Audit.

2.3 Use of Evaluations

2.3.1 Evaluation Motives

As Chelimsky and Shadish (1997, p. 32) argued: “The purpose of an evaluation conditions the use that can be expected of it”. Conversely, the study of evaluation use can start by looking at evaluation purposes. Given the lack of an explicit evaluation agenda, ministerial policy notes constitute a relevant information source in this regard. Reference to evaluation is often made when policy initiatives are announced. A longitudinal analysis of ministerial policy notes of the Flemish government in Belgium yielded interesting insights into the intended use of evaluations (De Peuter and Pattyn 2016). The focus of the study was on the evaluation purposes mentioned in ministerial policy notes issued in four consecutive government terms. In total, the ministerial documents span a period of 20 years of policy-making, ranging from 1999 to 2019. We distinguished between three main groups of evaluation purposes: evaluations can be set up to underpin policy planning, they can serve accountability purposes, and/or they can provide input for policy improvement and policy learning.

We highlight a few key observations. First of all, the total number of references across ca. 25 policy notes remained relatively stable for three out of four legislatures. The distribution over the purposes did not show substantial variation either: one third of the references relate to the purpose of policy planning, while almost two thirds refer to policy improvement and learning. The various policy domains overall follow the same pattern in terms of this distribution. Differences are more pronounced when we compare the number of evaluation references across individual policy notes; some contain dozens, others only a few. The 'champions' are again those sectors which are known to be front-runners in evaluation in Belgium. Across all four legislatures, the domains of education, environment, economics, wellbeing, and traffic mobility account for almost half of all references.

Of course, the purposes of evaluation do not yet tell us whether evaluation results are also used in the same way. As for actual use, little systematic evidence is available. But it is clear that policy sectors do differ in terms of the systematic use of their evaluations. Evolutions are also observable over time. For example, we already mentioned that evaluation activity in the field of environment dropped due to problems in matching supply and demand, which does not contribute to the use of findings.


Within the field of education, instead, the potential for evaluation use seems to benefit from the systematic participatory approach towards evaluations: stakeholders are actively engaged in the evaluation process. The field of international development at the federal level follows a similar participatory approach, by engaging guidance committees in the evaluation process. To be clear, the guidance committees are not involved in the preparation of the evaluation, nor in the follow-up of evaluation results. These roles were assigned to an advisory committee, but its involvement decreased after some time. In the absence of systematic research, one can only speculate about the use of evaluation reports beyond the civil service. In Part II of this chapter we provide insights into use by civil society organisations.

2.3.2 Quality Assurance

In the Flemish administration, quality assurance at the organisational level is overall rather limited, and mainly ad hoc. In a study (Pattyn and Brans 2014) among 18 organisations (departments and agencies) from a mixture of policy fields, we screened for the systematic application of a range of common quality assurance measures. It turned out that the practice of involving second readers for evaluation reports is the most widespread. The establishment of a steering committee or working group for evaluations is also common practice. Evaluation manuals, by contrast, are not common practice, with only four organisations systematically using such manuals and five doing so occasionally. In addition, only one organisation reported frequently conducting meta-evaluations. In terms of explanations, the presence of an evaluation unit proved to make a strong difference in having quality measures systematically embedded. Other contributing factors are the provision of evaluation training, demand for evaluation from politicians, and organisational stability.

3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Civil Society

While evaluation culture and practice are gradually maturing in Belgium, evaluation does not yet take a central place in decision-making. The proportional election system and the coalition governments that result from it imply that coalition agreements are the ultimate reference for decision-making.


The consensus style of policy-making also entails a lot of political bargaining between parties. Adding to this the strong partitocratic culture in Belgium, it is no surprise that there is little room left for the active use of referenda or decision-making on a communal basis. The cases where evaluation can underpin these kinds of processes are rare by default. An exceptional example is the referendum on a scenario to solve the mobility gridlock around the city of Antwerp. For years, this highly salient and long-standing issue was characterised by a closed decision-making attitude on the part of the regional government. Gradually, however, action groups entered the policy arena, demanding other alternatives and calling for the issue to be approached from a broader perspective (air quality, liveability, etc.). To substantiate their arguments, the action groups made active use of studies and ex ante evaluations of possible tracks for the ring road. They even managed to collect resources to commission new ex ante evaluations. Only recently was an agreement between the government and the action groups reached, which was applauded as a first step toward a sustainable and, above all, widely supported solution.

This being said, civil society organisations, and to a lesser extent private organisations or citizens, do use available evaluation results in discussions with the public authorities or in lobbying activities. We tend to argue that the main function of using evaluations is to assert interests. Not seldom, however, do different actors in the policy arena 'read' the same evaluation from different perspectives. Depending on their particular interests, stakeholders sometimes emphasise different parts of the evaluation findings and conclusions. We can name anecdotal examples in which a minister communicates evaluation findings and announces next steps, upon which civil society organisations react with their own recommendations, the latter being partly based on the evaluation findings and partly on the organisations' general strategy and agenda. Election times are typically also a period in which civil society organisations resort to evaluations: in their so-called memoranda to the incoming government, they regularly refer to evaluations to underpin proposals. Furthermore, in line with Belgium's neo-corporatist legacy, civil society is represented in a system of advisory boards to the government. Although the advisory system is under pressure and reform (Fobé et al. 2017a), it remains a forum in which evaluations are used when formal advice is being drafted.

As for the evaluation process itself, citizens or civil society organisations have been involved in several evaluations during the information-gathering stage.


Their engagement in the planning, analysis or reporting stages is, however, far from common practice. An exception is the field of international development, with a formal requirement to involve relevant actors throughout the evaluation process. Also worth mentioning are the EIAs, where the participation of civil society is actively provided for. The early notification stage includes a mandatory consultation round, in which the predetermined relevant advisory bodies (i.e. administrative divisions related to environmental policy) are consulted. During this step, the general public also gets the opportunity to participate and to formulate suggestions and remarks concerning the scope of the ex ante evaluation. An in-depth multiple case study (De Peuter et al. 2010) revealed the following context elements as decisive in triggering public participation: the historical salience of the issue, and the organisational capacity and time available to react. The visibility of, and know-how on, the subject of environmental assessment also turned out to be levers, yet these seemed less essential for the degree of participation. Other contributing factors are the quality of the evaluation process, that is, the quality of communication on participation opportunities, the quality of background documents available in the notification stage, and the organisation of extra initiatives to inform and consult stakeholders.

It is especially civil society organisations, rather than individual citizens, that turn out to participate in these evaluation processes. Thematic interest groups have an operational reflex to monitor decision-making processes and have the capacity to react more quickly when impact assessment processes are launched. Some interest groups use the notification stage systematically as an opportunity to voice their concerns and suggestions. Having relevant topical expertise is key in this regard. Individual citizens and ad hoc committees, in contrast, are often not acquainted with the complexity of the dossier or with the procedures for participation. As a result, intermediary organisations often take up the role of spokesperson for, or substitute of, the individual citizen in participation processes.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

In most cases, external evaluation reports are actively made public by the commissioning administration, on condition of ministerial approval. Evaluation reports are usually disseminated via the relevant governmental websites or (to a lesser extent, and if applicable) via the organisers of events at which evaluation findings are announced.


Whether in-house evaluations are also made available is difficult to assess. Discussion of evaluation findings in the media is not common practice and is usually restricted to those policy issues that are salient, or to cases where deficiencies in public sector functioning are at stake. The personal interest of journalists and competing news items also matter in this regard. A general discourse on the benefits of evaluation, the quality of evaluation, or the professionalisation of the evaluation function does not really exist. These items are addressed ad hoc in evaluation working groups within government administrations, or during events organised by the evaluation networks (see below).

3.3 Civil Societies Demand Evaluations

The demand for evaluation mainly comes from the executive, often triggered by evaluation clauses in legislation or by requests from stakeholders. In the neo-corporatist and consociational context of Belgium, such requests are usually voiced in the context of negotiations, advisory boards and parliamentary hearings. Public calls for evaluations are only made when these 'internal' channels are felt not to result in a satisfactory response from policy-makers. Again, requests predominantly come from organised civil society organisations, given their competitive advantage in monitoring the policy agenda and decision-making processes.

4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses and Further Training

The professionalisation of evaluation takes different shapes in the two main language communities. In both communities, the increased anchorage of evaluation in the public sector has gradually been matched by an expansion and diversification of the supply of evaluation courses and training. At university level, there is no full-fledged master's or Ph.D. degree in evaluation as exists in other countries, apart from a specialised Master of Development Evaluation and Management offered at Antwerp University (Flanders). As its name suggests, this master's focuses on the evaluation of development effectiveness and is populated by a strong international student body.


Most other university-level initiatives are organised by public administration departments, such as the inter-university certificate programme in evaluation and the evaluation trainings provided at the recently established School for Public Administration, both situated in Wallonia. In Flanders, courses in policy evaluation are included in public administration master's programmes, such as the Master in Public Management and Policy at KU Leuven or the Master in Public Administration and Public Management at Ghent University. The KU Leuven Public Governance Institute further organises a five-day programme in evaluation for civil servants. In addition to the supply at public administration departments, evaluation-related issues are covered at faculties of medicine (e.g. the course Evidence-Based Health Care in the Master of Health Care Management and Policy at KU Leuven) or at departments of educational studies (e.g. KU Leuven organises an annual workshop on systematic review). The evaluation discourse and the associated trainings in these disciplines develop along separate trajectories, though; there is hardly any exchange between them. In terms of professional training, the evaluation associations also play a major role in Belgium (see below). Furthermore, initiatives for the exchange of evaluation-related information have gained ground within the civil service itself. These are often of a more ad hoc nature and are usually restricted to the civil servants of a specific policy domain.

4.2 Profession/Discipline

In Belgium, there are two evaluation associations, targeting the different language communities. The French-speaking Société Wallonne de l'Evaluation et de la Prospective (SWEP) was established in 2000 and aims at promoting evaluation and foresight in Wallonia in order to improve decision- and policy-making. It does so through conferences, seminars and trainings (Fobé et al. 2017b). A main impetus for the establishment of SWEP was the compulsory evaluations that had to be conducted in the framework of the EU structural funds programmes. (In February 2018, SWEP announced its withdrawal as an active platform, mainly due to organisational capacity issues. It remains to be seen whether this constitutes a fundamental turn for network practice in Wallonia.)

In Flanders, the Flemish Evaluation Association (Vlaams Evaluatieplatform, VEP) has existed since 2007. Convinced of the need for more evaluation exchange, several academics mobilised other evaluation stakeholders, including civil servants from different government administrations and the Court of Audit, as well as policy advisers from civil society organisations.


The launch of VEP anticipated an increased demand for evaluation following the attention to evaluation in the above-mentioned administrative reform framework launched in 2003. VEP and SWEP provide a forum for all kinds of evaluation-related stakeholders: civil servants, policy advisors, academics and actors from the private and non-profit sectors. To be clear, the external evaluation market in Belgium is mostly dominated by scientific research institutes and consultancy firms; only a limited number of freelancers are active in evaluation. Whereas federal civil servants are welcome to attend the activities of VEP and SWEP, and both associations sometimes cooperate in the registration for events, federal civil servants participate to a much lesser extent. Most communication with members takes place via social media or e-mail lists. The associations do not have their own journal, although VEP has close connections with the Flemish Journal of Public Management (Vlaams Tijdschrift voor Overheidsmanagement), as both initiatives are part of the Flemish Association of Management and Policy (Vlaamse Vereniging voor Bestuur en Beleid). Furthermore, several board members represent VEP in the editorial committee of the Dutch e-journal Beleidsonderzoek Online ('Policy Research Online'). The networks have played a major role in the capacity building of civil servants involved in evaluation, especially for the Walloon and Flemish public sectors.

There are no specific Belgian standards for evaluators. Neither is there a certification system for evaluators, nor a specific evaluation professorship.

4.3 Compliance to Standards and Quality Obligations

The evaluation associations in Belgium have not developed specific standards or guiding principles but rely on existing standards from other countries (e.g. the Swiss, American, etc.). In the absence of a certification system for evaluators, it is up to the members to decide whether to follow such standards. We do not exclude the possible existence of standards or guiding principles within certain disciplinary professional associations, but these are not actively communicated by the main evaluation associations. Of course, clients do request a certain level of evaluation quality for commissioned evaluations, but they usually do not apply specific quality grids to evaluate proposals. Also, evaluators, at least to our knowledge, only seldom rely on specific standards, nor do they systematically invest in meta-analyses of their evaluations. There is still a way to go before full maturity is reached in this respect.

5 Conclusion and Outlook

Belgium has often been labelled a latecomer compared to other OECD countries, demonstrating increased attention to evaluation only in the (late) nineties. In this chapter we have focussed on the question of how the country has evolved in terms of evaluation institutionalisation, and where it stands anno 2017. Despite being a federal country in which internal and external impulses for policy evaluation vary between levels of government and also between its regions, the overall balance is that major progress in evaluation institutionalisation is observable. Generally speaking, evaluation culture has clearly matured in the last two decades, with the establishment of evaluation associations; large-scale public sector reforms directly or indirectly anchoring evaluation in the policy cycle; an increasing debate on evaluation within and by parliament; a growing number of references to evaluation in policy documents and coalition agreements; more evaluation activity; the Court of Audit shifting part of its audits towards the evaluation of policy results; and an expanding supply of training in evaluation. Yet, as the international peloton also seems to keep up the pace of maturing (Jacob et al. 2015), Belgium has probably not compensated for its slow start.

One should be careful, however, in sketching a general country-wide picture. It should be clear, for instance, that the level of institutionalisation in both the executive and the legislative branch is less pronounced at the federal level than in the regions. Furthermore, within the administration, institutionalisation dynamics (and hence differences) are visible mainly along sectoral or organisational paths. Indeed, with the exception of the Walloon region, evaluation capacity remains highly decentralised and scattered across and within policy domains. Like several experts, we consider that the islands and silos along which evaluation capacity develops endanger future government-wide investments in evaluation capacity and entail risks for its sustainability. The country-specific partitocracy adds to this risk and can foster too narrow an association of evaluation with accountability. On the other hand, intensifying dynamics of policy co-production and interactive policy-making, for instance, and a further decrease of neo-corporatist legacies, may push the pendulum back towards evaluation for rationalistic and learning-based purposes. Time will tell which trends will prevail in the longer run.


References

Brans, M., Pelgrims, C., & Hoet, D. (2005). Politico-administrative relations under coalition governments: The case of Belgium. In B. G. Peters, T. Verheijen, & L. Vass (Eds.), Coalitions of the unwilling? Politicians and civil servants in coalition governments (pp. 207–235). Bratislava: NISPAcee, The Network of Institutes and Schools of Public Administration in Central and Eastern Europe.

Chelimsky, E., & Shadish, W. R. (1997). Evaluation for the 21st century: A handbook. Thousand Oaks, CA: Sage Publications.

Crabbé, A., & Vansteertegem, M. (2016). Verleden, heden en toekomst van milieubeleidsevaluatie in Vlaanderen. Vlaams Tijdschrift voor Overheidsmanagement, 2, 17–24.

De Maesschalck, F., Holvoet, N., & Hooghe, I. (2016). Beleid, gebruik en invloed van evaluaties in de Belgische ontwikkelingssamenwerking: een blik op de Dienst Bijzondere Evaluatie. Vlaams Tijdschrift voor Overheidsmanagement, 9(2), 55–62.

De Peuter, B., Houthaeve, R., & Van Der Linden, M. (2010). Case studies merprocessen bij plannings- en vergunningstrajecten. Leuven: Instituut voor de Overheid.

De Peuter, B., & Pattyn, V. (2016). Waarom evalueren beleidsmakers? Een longitudinale analyse van motieven voor beleidsevaluatie in Vlaamse ministeriële beleidsnota's. Bestuurskunde, 25(2), 32–45.

Deschouwer, K. (2012). The politics of Belgium: Governing a divided society. Basingstoke: Palgrave Macmillan.

Desmedt, E., Pattyn, V., & Van Humbeeck, P. (2016). Beleidsevaluatie vandaag: Een voorzichtige balans. Vlaams Tijdschrift voor Overheidsmanagement, 21(2), 63–69.

Dewachter, W. (2001). De mythe van de parlementaire democratie: een Belgische analyse. Leuven: Acco.

Dienst Administratieve Vereenvoudiging (DAV). (2014). Regelgevingsimpactanalyse. Inleiding tot praktische oefeningen. Brussels: DAV.

Fobé, E., Biard, B., & Schiffino, N. (2017a). Policy advisory bodies in Belgium. In M. Brans & D. Aubin (Eds.), Policy analysis in Belgium (pp. 151–170). Bristol: Policy Press.

Fobé, E., De Peuter, B., Petit Jean, M., & Pattyn, V. (2017b). Analytical techniques in Belgian policy analysis. In M. Brans & D. Aubin (Eds.), Policy analysis in Belgium (pp. 35–56). Bristol: Policy Press.

Furubo, J., Rist, R., & Sandahl, R. (2002). International atlas of evaluation. New Brunswick, NJ: Transaction Publishers.

Jacob, S. (2004). L'institutionnalisation de l'évaluation des politiques publiques en Belgique: entre balbutiements et incantations. Res Publica, 4, 512–534.


Jacob, S. (2005). Institutionnaliser l'évaluation des politiques publiques: étude comparée des dispositifs institutionnels en Belgique, en France, en Suisse et aux Pays-Bas. Brussels: Peter Lang.

Jacob, S., Speer, S., & Furubo, J.-E. (2015). The institutionalization of evaluation matters: Updating the International Atlas of Evaluation 10 years later. Evaluation, 21(1), 6–31.

OECD. (2015). Impact assessment in Belgium, federal government. Paris: OECD.

Pattyn, V. (2014). Policy evaluation (in)activity unravelled: A configurational analysis of the incidence, number, locus and quality of policy evaluations in the Flemish public sector. Leuven: KU Leuven Public Management Institute.

Pattyn, V. (2015). Explaining variance in policy evaluation regularity: The case of the Flemish public sector. Public Management Review, 17, 1475–1495.

Pattyn, V., & Brans, M. (2013). Outsource versus in-house? An identification of organizational conditions influencing the choice for internal or external evaluators. Canadian Journal of Program Evaluation, 28(2), 43–63.

Pattyn, V., & Brans, M. (2014). Explaining organisational variety in evaluation quality assurance: Which conditions matter? International Journal of Public Administration, 37(6), 363–375.

Pattyn, V., & Brans, M. (2015). Belgium. In D. A. Bearfield, E. M. Berman, & M. J. Dubnick (Eds.), Encyclopedia of public administration and public policy (3rd ed., pp. 1–4). Boca Raton, FL: CRC Press.

Sociaal-Economische Raad van Vlaanderen (SERV). (2015). Advies: Tien denksporen voor ex post decreetsevaluatie in en door het Vlaams Parlement. Brussel: SERV.

Speer, S., Pattyn, V., & De Peuter, B. (2015). The growing role of evaluation in parliaments: Holding governments accountable? International Review of Administrative Sciences, 81(1), 37–57.

Varone, F., Jacob, S., & De Winter, L. (2005). Polity, politics and policy evaluation in Belgium. Evaluation, 11(3), 253–273.

Verlet, D., Lemaître, J., & Carton, A. (2015). Beleidsevaluatie binnen de Vlaamse overheid. Een overzicht van de resultaten uit de bevraging van de leidinggevenden. Presentatie Studiedag Vlaams Evaluatieplatform, Belgium. http://www.evaluatieplatform.be/VEP/programma/20151217.htm. Accessed on 29 January 2019.

CHAPTER 6

France

Clément Lacouette-Fougère and Benoit Simon

1 General Country Overview

Evaluation in France is often described as both an 'imported' and a 'half-done' product. An imported product, because the genesis of evaluation in France was mainly based on North American standards, while its widespread application in the 1990s was due to European regulation. A half-done product, because despite repeated attempts over the last decades, the institutionalisation of evaluation in France has partially failed, and evaluation still remains an unstable and fragmented activity (Lacouette-Fougère and Lascoumes 2013). Schematically, four main stages can be distinguished in the development of evaluation in France.

C. Lacouette-Fougère (B)
French Interministerial Directorate for Public Transformation, Paris, France
e-mail: [email protected]
Sciences Po, Paris, France
B. Simon
Planète Publique, Paris, France
e-mail: [email protected]
© The Author(s) 2020
R. Stockmann et al. (eds.), The Institutionalisation of Evaluation in Europe, https://doi.org/10.1007/978-3-030-32284-7_6


• In the 1960s, the implementation of a programme dedicated to the Rationalisation of Budgetary Choices by civil servants of the Budget Ministry is considered the birth certificate of public evaluation in France. This programme, built on the model of the American Planning Programming Budgeting System, was finally abandoned in 1983. During this period evaluation remained a recent endeavour, so that at the beginning of the 1980s, "[…] evaluation in France is neither a school of thought nor an organised administrative practice" (Nioche 1982, p. 37).
• The culture of evaluation really came of age in France between the beginning of the 1980s and the end of the 1990s (Barbier and Hawkins 2012). This culture grew out of the increasing complexity of public action, the constant struggles over its management and justification, and the professionalisation of public policy analysis, especially in the private sector (Duran and Monnier 1992). But it is also the product of two driving forces that contributed to the institutionalisation of evaluation. On the one hand, different levels of government emerged in France in the 1980s, at the supra-national (European Union, EU) and infra-national (local authorities) levels. For these authorities, evaluation has been a resource of power and an instrument of legitimation (Fontaine and Monnier 2002), and it has therefore progressed in a context in which the EU and the French regional authorities asserted themselves as vertical counter-powers (Jacob and Varone 2004). On the other hand, political change at the end of the 1980s led new actors to set an agenda of state 'modernisation' (Pollitt 1995). Evaluation, because it focuses on the consequences of public action rather than on the administrative culture of rules and procedures (Silvent et al. 2002), and because it provides information and knowledge to decision-making processes (Thoenig 2002), quickly gained consensus among politicians and some senior state officials as an instrument of modernisation (Bezes 2009; Duran 2010; Leca 1993).
• This, in turn, led to further institutionalisation: a market for evaluation services developed at the end of the 1990s, and in 1999 evaluation professionals created the French Evaluation Society (SFE). An evaluation 'à la française' progressively appeared at the beginning of the 2000s, in which "emphasis is placed on the 'plurality principle', […] anchored in the French Charte de l'Évaluation" (Jacob et al. 2015, p. 17). A pluralist evaluation involves in its process not only the decision-maker, but also the stakeholders who implement, are targeted by, or benefit from the evaluated intervention (Guba and Lincoln 1989).


• Over the last decade, evaluation in France has experienced a new phase of development, driven by the State and by the research community. At the State level, evaluation was first recognised in the constitution in 2008. Between 2012 and 2017, with the Modernisation of Public Action (MAP) initiative, evaluation was also embedded in an extensive programme of inter-ministerial evaluations that led to 80 evaluation reports reviewing all the main government policies: education, health, social affairs, housing, public development aid, et cetera. In parallel, new evaluation actors appeared from the research community. Economic research centres have been developing impact evaluations based on quantitative methods, mainly in the fields of health, employment and education (Bono et al. 2018). They represent only around 15% of the evaluations carried out in France, but they play a major role in the public and political debate about evaluation.

2 Institutional Structures and Processes (Political System)

The French political and administrative system is often defined negatively as an 'administrative layer cake'. Apart from the nation-state level, four levels govern French society: 3700 communes (municipalities), 101 departments, 13 regions and, finally, Europe. The nature of both the supra- and infra-national administrations is very distinct from that of the French state administration. Originally, these levels had weak decision-making powers and most of their resources were mobilised through contractual policies with the national government, which organised specific actions. These contracts between levels of administration, owing to their different fields of competence, systems of regulation and modes of intervention, are of a very different nature. While this specific organisation led to particular difficulties in defining adequate evaluation procedures during the 1990s (Fontaine and Monnier 2002), it still explains the original institutionalisation process of evaluation in France today.


2.1 Evaluation Regulations

Evaluation Inside the State: Formal Translations But Still a Highly Political Object (Table 1)
In France, the development of evaluation was first achieved on a formal basis, through different regulations adopted in the 1990s and 2000s. The decree signed on 22 January 1990 gave birth to a regulatory and institutional system aiming at strengthening the space of evaluation at the national level, through the combined creation of an Inter-ministerial Evaluation Committee (CIME), a Scientific Evaluation Council (CSE) and a National Fund for the Development of Evaluation (FNDE). Completed in 1998 by the decree of 18 November and the circular of 28 December, these founding acts mainly served to officialise the existence of evaluation and to frame its practice at both the institutional and methodological levels.

Table 1 Main pieces of law related to evaluation in France (1998–2012)

Decree of 22 January 1990: "Evaluation of public policy in the meaning of this decree aims at assessing if the legal, administrative or financial means which have been put in place allowed for producing the expected impacts of the policy and reaching the goals which were set."
Decree of 18 November 1998: "Evaluation of public policy, in the meaning of this decree, aims at appreciating […] the effectiveness of this policy when comparing its outcomes to the assigned objectives and to the means which were used."
Circular of 28 December 1998 relative to the evaluation of public policies, Annex I (Definition): "Evaluation of public policy consists in comparing its outcomes to the means which have been used—whatever they are legal, administrative or financial—and to the objectives initially set. It is different from control and inspection works as it must lead to a shared judgement on the effectiveness of the policy and not a simple check of the respect of administrative or technical standards."
Decree of 19 June 2007 relative to the composition of the Government: "Has been nominated Secretary of State, participating in the Council of Ministers for matters under his remit: M. Eric Besson, Secretary of State in charge of prospective studies and the evaluation of public policies (…)."
Constitutional reform of 2008, Art. 24 and Art. 47.2: "The Parliament passes legislation. It controls the action of the Government. It evaluates public policies. (…) The Auditing Court supports the Parliament in the control of the action of the Government. It supports the Parliament and the Government in the control of the enforcement of the finance acts and the implementation of the finance acts for social security, as well as in the evaluation of public policies."
Organic Law of 15 April 2009 (Art. 8): "The Government is under the obligation to deliver to the Council of State and then to the Parliament, for each bill, an impact study, so as to set up a tool for evaluation and decision-making support."
Prime Minister, 1st CIMAP of 18 December 2012: "Following a performant evaluation approach: setting up a collective vision of stakes, objectives, outcomes and modalities for implementing each policy; co-designing a scenario of change, translated into an action plan."

However, these texts were not sufficient to boost the actual production of evaluations or to stimulate a genuine demand for evaluation. In addition, local authorities, which were considered at the time as the real driving force for evaluation in France (thanks to the implementation of the European Structural Funds on the one hand, and to the search for legitimacy of local public action on the other), were largely ignored in these texts.

Ten years later, a second wave of rules adopted by the end of the 2000s gave evaluation stronger recognition through the constitutional reform of 2008, which reinforced the role of the Parliament and of the Court of Auditors in the field of evaluation, without, however, leading to a considerable evolution of evaluative practice within these two institutions. At the institutional level, the appointment in 2007 of a Secretary of State dedicated to prospective studies and the evaluation of public policies in the first government of President N. Sarkozy was considered an unprecedented moment in the political history of evaluation, even if this post lasted only two years and produced no strong achievements.

Finally, the most recent initiative took place in 2012, when the Government of Prime Minister J.-M. Ayrault decided to implement an ambitious programme of evaluation in the framework of the MAP approach, launched at the first Interministerial Modernisation of Public Action Committee (CIMAP) of 18 December 2012. Deployed throughout the mandate of President F. Hollande and steered by the General Secretariat for the Modernisation of Public Action (SGMAP), this approach is characterised by three main aspects:

• The ambition and scope of the approach, which targeted the whole spectrum of inter-ministerial perimeters and made it possible to produce more than 80 evaluation reports within five years on inter-ministerial topics with high stakes.
• The position of inspection bodies in this framework: inspectors from the general administration and from the ministries of finance or social affairs carried out most of the evaluations. Long deeply opposed to the development of evaluation in France in the name of the specific purposes and methods of inspection and control, these inspection bodies have in recent years shown the first signs of a 'conversion' to the culture of evaluation.
• The principles of transparency and stakeholder involvement: every evaluation was supposed to take into account the main stakeholders and beneficiaries and to be released at the end of the process.

Evaluation Within Ministries: Experimentation and Sharing of Methodologies
Historically speaking, evaluation in France was first carried out within ministries themselves. Since the 1960s, apart from the Plan Commissariat, ministries have been the main places for experimenting with and implementing this production of knowledge. However, there are major differences between ministries. A parliamentary report (Bourdin et al. 2004) documented the scattering of practices, the predominance of internal production and the uncertain quality of the knowledge produced, which remains difficult to transfer. The Ministries of Culture, Education and Defence were the first to create specific structures dedicated to evaluation, but with very different formats: from internal management tools (e.g. Ministry of Education) to prospective analysis (e.g. Ministry of Defence). In addition, several ministries developed their own evaluation organisations, such as the Ministry of the Environment, which created the Directorate of Economic Studies and Environmental Evaluation, or the Ministry of Labour, which put in place the Directorate of Research and Economic Studies.

The most well-known example, however, relates to the urban affairs policy implemented in the 1970s by the Ministry of Urban Planning. At the instigation of a new generation of engineers, this ministry was able to combine the production of knowledge on its activities (costs, human resources management, innovation programmes) with actions for modernising ministerial services (also because of decentralisation) and for involving public officers (redesign of services, training, careers management). The starting point was a study programme about the transformation of cities commissioned by the Plan Commissariat. An 'urban research mission' was set up in 1973, becoming first the Analysis and Prospective Office and then, in 1984, the 'urban plan'. Finally, the merger with the Building Plan led in 1998 to the creation of the Urbanism Building Architecture Plan. Over these 30 years this structure became the true 'pilot fish' of the Ministry of Urban Planning. More recently, the Ministry for Overseas, which has a specific place in the government, put in place an ambitious evaluation scheme for public policies implemented in the French overseas countries and territories. On the basis of these sector-based experiences, most of the methodological guidelines and tools that have supported the conduct of evaluations over the last 20 years were designed. The external evaluation guidelines for social and health facilities issued by the French Ministry of Health are one of the most recent examples of such a methodological framework.

The Deadlocks of Parliamentary Evaluation in France
At the level of parliament, the development of evaluation remains quite limited in France. In the Constitution, as in political tradition, the legislative power carries limited weight relative to the executive. This feature rests on three main factors: the rationalised parliamentary process, the dominant logic of the majoritarian principle and the lack of any other framework for expression. Despite the important means available through the traditional process of parliamentary control (parliamentary questions, hearings and information requests, commissions of inquiry, parliamentary missions, etc.), evaluation within the parliament has faced multiple failures since 1980, and the revalorisation promised by the 2008 constitutional reform did not lead to the development of systematic evaluative action.

The creation of specific structures for evaluation within the Parliament proved ineffective several times. Only four years after their creation in 1996 under President J. Chirac, the Parliamentary Office for the Evaluation of Public Policies and the Parliamentary Office for the Evaluation of Legislation (laws of 14 June 1996), both shared between the two chambers, were abolished in 2000 by a broad consensus, in view of their limited output: four and three reports respectively. These two offices were abolished in favour of the Evaluation and Control Mission and the Evaluation and Control Committee, both under the supervision of the National Assembly and established by articles 57 and 59 of the Organic Law of 1 August 2001 relative to finance laws.


The only parliamentary structure that can be regarded as a stable institution is the Parliamentary Office for the Assessment of Scientific and Technological Choices (law of 8 July 1983), created in 1983 on the model of the Office of Technology Assessment of the United States (US) Congress. This organisation leads an evaluative practice dedicated to social dialogue.

The constitutional reform of 2008 aimed at increasing the role of the parliament in the field of evaluation. It thus provided that a quarter of the activity of members of parliament should be dedicated to 'the control of government action and the evaluation of public policies'. However, a recent assessment showed that this threshold of 25% was far from being reached (Rozenberg 2016). Following this constitutional reform, the Parliament adopted the organic law of 15 April 2009, which made it compulsory, from September 2009, to present an impact assessment study with each bill (CAS 2009). This organic law created a potentially powerful scheme for evaluation: if the Conference of Presidents of the National Assembly, which has to decide upon the bill, considers that the impact study is insufficient, the bill can theoretically be regarded as not having been introduced. While this law constitutes a strong assertion of principles, in practice the Regulatory Impact Assessment process in France is still not effective. The lack of human and technical resources dedicated to this task and the timetable for introducing and discussing bills have not allowed this new evaluation tool to be put into practice.

2.2 Evaluation Practice

General Overview: Around 300 Evaluations Carried Out Each Year in France
There is no systematic evaluation of national or local programmes. It is estimated that fewer than 300 evaluations are carried out each year in France (SFE 2016), which remains quite limited considering the number of public institutions that could commission evaluations of their programmes: 15 ministries, approximately 1200 national agencies, 17 regional authorities, 100 departmental councils, 1266 local communities, et cetera.


Within this spectrum, some of these institutions have a high rhythm of evaluations (e.g. regions with programmes supported by the European Regional Development Fund), while others barely practise evaluation at all. The state, however, remains the leading commissioner of evaluations (50%), while local authorities account for a third.

Sector Coverage
Many different sectors and policy fields are covered, as displayed in Fig. 1. Economic development, health and solidarity, and employment and work are the three sectors in which the most evaluations are carried out. However, the core sovereign policy fields of the state (such as justice, security and defence) are far less covered by evaluation practice than other kinds of policies. Two arguments could explain this situation: on the one hand, it is mainly policies facing legitimacy issues that are evaluated, while those considered fundamental are less questioned; on the other hand, policies facing major public issues are the most evaluated (economic, health and employment policies, etc.).

Fig. 1 Sector coverage of evaluation reports in France, 2014 and 2016 (SFE 2016): shares of evaluation reports by policy field (economic development; health and solidarity; employment; sustainable development and transport; local development and urban policy; Europe; society; international cooperation; education and research; justice; others)


Internal vs. External Evaluations
Two thirds of evaluations are carried out by external bodies, mostly private consulting companies, while public bodies dedicated to audit, inspection or monitoring and evaluation account for 10% of evaluations. A French specificity is the very small share of evaluations carried out by universities and research bodies (around 2%) (SFE 2016). At least two main reasons may explain this situation.

• The first reason stems from universities and research institutes themselves: their public funding, the long time frame of research work and their primary orientation towards publication do not provide a strong incentive to carry out short-term evaluations. More recently, some research institutes have started to work on evaluations: these are mainly in economics (the Paris School of Economics [PSE], the Centre de Recherche en Économie et Statistique [CREST]) or in political science (the Laboratoire Interdisciplinaire d'Evaluation des Politiques Publiques [LIEPP, Sciences Po], the Institut des Politiques Publiques [IPP]), the former being strongly influenced by quantitative methods (experimental and randomised designs, etc.).
• The second reason has to do with the public bodies themselves. Two major categories of institutions have been set up in France: auditing bodies, like the State Audit Office (Cour des Comptes and its regional chambers), and inspection bodies (IGAS, IGS, etc.). Both long shared a limited culture of public policy evaluation. In recent years, however, this picture has been slowly changing, with a stronger involvement in evaluations commissioned by ministries and the government.

Consequently, the share of evaluations carried out by the private sector is slowly decreasing, thanks to two main factors. First, the development of university degrees in the evaluation of public policies (Lille, Rennes, Lyon, Paris, etc.) since the beginning of the 2000s provides a continuously growing number of evaluation professionals, who find job opportunities in local and national administrations. Second, recent laws give more space to reporting and evaluation in general, although this does not ensure the full implementation of the evaluation of the laws and programmes that have been voted. At the national level, the State Audit Office, the SGMAP and the parliament's Mission d'évaluation et de contrôle are key players in commissioning and/or undertaking evaluations. It is thus likely that the share of internal evaluation will grow. This is also reinforced by the existence of sector-based inspection bodies (health, labour, environment, finance, etc.).

Methodological Debate Still Warming
In France, as elsewhere in Europe, experimental evaluation, and more precisely randomised controlled trials (RCTs), has been gaining increasing attention since the end of the 2000s (Jacob et al. 2015). This kind of impact evaluation, rooted in evidence-based policy making, has been pushed by academics and economists specialised in RCTs. A bibliometric study of scientific publications dealing with these methods of impact evaluation in economic journals shows that this kind of ex post evaluation is still marginal (on average around a dozen per year) but has been increasing since 2009/2010 (Bono et al. 2018). This emphasis on the evaluation of policy impact has also been supported by national programmes such as the European Programme for Employment and Work of Youth or the Experimental Youth-Development Fund (FEJ). Created in 2008 by the French High Commissioner in charge of active solidarities, Martin Hirsch, the fund was endowed with €255 million for experimentation and promoted experimental methods to evaluate the impact of public interventions (Devaux-Spatarakis 2014). Although only 15% of the experimentations supported by the fund were actually evaluated through RCTs or experimental methods, the FEJ experience, including the criticism it provoked among quantitative and qualitative evaluators, can be considered a turning point for the promotion of this kind of evaluative practice in France. Even though several experts see an opportunity for cross-fertilisation between disciplines and methods, a certain scarcity of interdisciplinary evaluators can still be noticed in France. A long tradition of process evaluations continues to co-exist with, and is not crowded out by, impact evaluations.

2.3 Use of Evaluations

Use of Evaluation Hindered by Lack of Transparency
There is very little in-depth research conducted in France about the use of evaluation, which makes it difficult to analyse which sectors perform well in using evaluation reports or findings. One available indicator is the share of evaluation reports that are made public. According to the 2016 barometer, 47% of evaluation reports are accessible or released in France. However, there are wide disparities in the transparency of practices: while 70% of State evaluations are accompanied by a public online report or synthesis, only 22% of the evaluations carried out by local and regional authorities are accessible.

The Case of MAP: Evaluations Conceived to Be Used
The evaluations carried out by the state during the MAP process (2012–2017) provide the best example of evaluations conceived to be useful (SGMAP 2017). The MAP enforces a set of principles: feeding decision-making; favouring consultation (participation) at all steps of the evaluation, as well as transparency; going beyond the measurement of effectiveness and efficiency to also consider the relevance of, and the social improvement brought by, the evaluated policies; and ensuring independence. At the level of individual evaluations, each study follows two phases: a data-collection and analysis phase (diagnostic) and a recommendation phase centred around transformation scenarios. This second part was an innovation and aimed at ensuring that evaluations would not only be retrospective but would also devote enough time and resources to the changes that could be brought to the evaluated public policies. Expectations towards the MAP evaluations were very high, including organisational transformations (for instance merging several existing operators of a public policy), strategic transformations (redefining the overall objectives of a policy) or more incremental transformations (the progressive evolution of working methods in employment policy), but also reshaping the conditions for granting aid in a specific sector (Picavet 2015).

A meta-evaluation of the 80 evaluations carried out during the MAP process was implemented in 2017 to analyse the outcomes of this approach (Box 1). This meta-evaluation shows that the evaluation process was conceived to take into consideration the commissioners' needs and to embed the evaluation in a decision process.


Although evaluators were attentive to the commissioners' requests and carried out their work in a serious manner, their lack of evaluation skills may have largely hampered the ability of these services to feed decision-making in an 'evaluative' way, for example by questioning the rationale of the policy. More generally, despite the robustness of certain tools used during the MAP process (a scoring tool and a meta-evaluation at the end of the process), there seems to be a lack of shared professional standards or competence requirements for evaluations. Good practices nevertheless exist.

Box 1 Focus on the MAP meta-evaluation: Use of 80 MAP evaluations (SGMAP 2017, pp. 70ff.)

Results

Evaluation approach
In terms of the rationale for the evaluation, the meta-evaluation underlines that 95% of evaluations were prompted by specific, well-enunciated purposes. Besides organisational purposes (63% of the motivated evaluations) and spending reduction (45%), 40% explicitly mention a future decision to be taken (e.g. the adoption of a new law or a new programme). This is due to a systematic initial process in which the demands of commissioners are discussed and then written down in a letter of assignment signed by the Prime Minister. Did the evaluations focus sufficiently on specific information needs? The report shows that 40% of the evaluations included evaluation questions, but that this rate increased from 30% in 2013 to 90% in 2015, thanks especially to a stronger methodological involvement of the SGMAP team in the inception phase of the evaluations. As expected, the evaluations came to answer a wider range of evaluation questions than before, including relevance and external coherence in 42% of cases each. Concerning the involvement of other stakeholders, the study showed that a steering committee was convened in 88% of the evaluations, and that it was open to at least one representative of the beneficiaries of the evaluated policy in 62% of cases. As for the way the evaluations were designed, the structuring part was extremely weak, with almost no instances of a description of the criteria that evaluators would use to guide the evaluation process and assess the evaluated intervention, and no cases where the intervention logic of the evaluated policy was reconstructed.


Evaluators
Regarding the evaluators, the MAP evaluations could rely on an impressive workforce. Some evaluation teams included more than ten members of the inspection bodies; they could also count on external advisory or data collection services to help them. When it comes to the skills of the teams and their role in the evaluation process, the findings are more mixed. Members of the inspection bodies are used to working collaboratively and, despite a few counter-examples, the evaluation teams could usually deliver on their assignment. They are also accustomed to working on a variety of topics and, in some cases, they relied on thematic expertise to provide a fresh look at the evaluated policy. However, only exceptionally did they call on evaluative expertise. This was an issue given that most evaluators had no prior expertise in evaluation, which may have led to teams conducting evaluation work just as they would have done a classical inspection assignment. Besides, there is evidence of a learning curve among evaluators and commissioners alike, which can be seen in the increasing quality of the terms of reference over time.

Evaluation outputs
Evaluations usually arrived on time, especially when they were bound to contribute to a specific decision that had been announced in advance: evaluators would adapt the ambition of evaluations to the deadline or provide specific content when asked to meet a particular need (e.g. to support or refute an amendment in a legislative process). Evaluators often proved committed to consulting stakeholders beyond the implementation sphere. They would perform several dozen interviews, and up to 200 in some studies, which was an important aspect of the credibility of their work. However, the sheer number of interviews and the lack of strategies to, for instance, use different data collection tools for exploratory and confirmatory phases meant that the synthesis of information was in many cases extremely strenuous, which is why the collected data are not always traceable. Conclusions and recommendations are usually based on an analysis, but in 60% of cases this analysis does not explicitly present how the different sources of information were used to reach conclusions. Also, in half of the evaluations, it was difficult to distinguish the personal opinions of evaluators from factual pieces of information.


3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Civil Society

The Spread of the Principle of Transparency for Evaluations
At the national level, two initiatives have been carried out to foster the spread of evaluation and the use of evaluation findings. The first is the creation in 2012 of the Evaluation Observatory (whose latest findings have been mentioned above), initiated by the French Evaluation Society with the support of the state. This observatory has referenced around 1800 evaluation works since 2007 and, based on its monitoring indicators, produces every two years a complete analysis of the state of development of evaluation in France, called the 'evaluation barometer'. According to the 2016 edition of the barometer, the diffusion rate of evaluation reports is estimated at around 47%. The second is the principle of transparency introduced in 2012 at the state level for evaluations carried out in the MAP context. This principle has been materialised by the systematic publication of evaluation reports and of all the documents linked to an evaluation (mandates, surveys, action plans, etc.). In addition, a dedicated website allows one to follow the progress of each evaluation. At the end of the MAP sequence in 2017, the rate of publication reached nearly 90%. This initiative is part of a wider international project, the Open Government Partnership, launched in 2011 by President Obama and seven founding countries (Brazil, Indonesia, Mexico, Norway, the Philippines, South Africa and the United Kingdom), which now gathers 75 member states, local authorities and non-governmental organisations sharing some 20 commitments structured around four key principles: transparency, citizen participation, accountability and innovation.

Evaluations Mainly Known by Policy-Makers
Beyond these two initiatives, evaluation works remain little known by the stakeholders of public policies, and the use of evaluation findings proves marginal outside political and bureaucratic circles. In the meta-evaluation of the MAP evaluations, changes subsequent to the evaluation process could be observed in the realm of the evaluated policy in 38 out of the 65 evaluations carried out at the state level from 2012 to 2016.


As regards the SFE barometer, no analysis is made of the use of evaluations; it only notes that 85% of the evaluations reviewed include recommendations. More broadly, the MAP experience of evaluation over the last five years underlines that the main factor making evaluations useful is the link to decision-making, that is to say political involvement at the start of and throughout the evaluation process (in this case, the signature of engagement letters directly by the Prime Minister). Stakeholder participation plays a minor role in producing change after an evaluation. Yet at the local level different approaches exist, through more technical evaluations that directly involve civil servants and users of the evaluated policy or programme and make possible a better sharing of issues and solutions, which can generate change in the field of the evaluated policy or programme.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Weak Signal for Public Opinion
Evaluation in France suffers from a lack of visibility among the general public for several reasons. First, the discipline holds little attraction for opinion leaders and prescribers (politicians, mass media, trade unions, etc.), unlike other kinds of studies or surveys, such as the reports released by the State Audit Office, which generally receive wide press and radio coverage when presented to the government. Second, the confusion between evaluation and similar approaches (audit, control, monitoring, etc.) and the negative connotation of the term 'evaluation' in everyday language (which commonly evokes the pejorative ideas of rating and penalties) have not helped to spread evaluation in public opinion. Third, the technical nature of evaluation makes it difficult for citizens, who are not experts in the stakes and principles of public action and who are not used to technical language, to access and understand these reports. Fourth, there seems to be a lack of structured reflection about institutional communication on evaluation. Since, at the national level, evaluation remains fragmented between several organisations, as mentioned above, there is no unified message about what evaluation is and what it is used for.


Emerging Initiatives, However
Some initiatives carried out in the last few years are contributing to a change in the dissemination of evaluation. Among these emerging initiatives we can mention:

• The publication of booklets by academic centres and research organisations presenting the findings of evaluations in non-technical terms, for example the policy briefs released by such labs.
• The rise of structures dedicated to disseminating research, evaluation and experimentation to local stakeholders, on the model of the What Works Centres implemented in the United Kingdom. These organisations, first created in the area of medicine and health, develop an operational use of evaluation that makes them known to people searching for tried and proven practices: teachers, project managers, district directors, policy designers, et cetera.
• The emergence of isolated proposals from academics or associations aiming at mobilising citizens for evaluation. For instance, during the 2017 presidential campaign, it was suggested to create and recognise a citizen evaluation mandate that could be given by a public organisation and would make it possible to promote evaluation as a real democratic counter-power.

3.3 Civil Societies Demand Evaluations

A Growing Interest from Civil Society, But Citizen Demand Still Weak
One may observe a growing interest in evaluation from actors and groups of actors belonging to civil society, even if this interest is still at an embryonic stage as regards concrete practices. In France, the Economic, Social and Environmental Council (CESE), a constitutional assembly that intervenes in key economic, social and environmental fields, promoting cooperation between different socio-professional interest groups and ensuring they are part of the process of shaping and reviewing public policy, decided in 2011 to create a specific delegation dedicated to evaluation and prospective studies.


The CESE has also been involved in promoting a culture of evaluation since the publication of a report in 2015 (CESE 2015). In its wake, the local CESEs also investigated the topic. Concerning the world of non-profit organisations, one may also observe a slight increase: 7% of the evaluation reports reviewed by the Evaluation Observatory (SFE 2016) come from associations, professional networks and federations. More precisely, specialised groups such as associations and networks in the health, medico-social or disability fields have integrated evaluation into their general toolkits. Individual citizens remain very distant from evaluation, as evidenced by the poor coverage of evaluation in the media and during electoral campaigns. Yet the communication and publication efforts of several big French cities, such as Nantes, Grenoble, Lyon or Bordeaux, show that a potential interest in evaluation from citizens may exist.

4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses, Further Training et Cetera

Dedicated Evaluation Courses and Masters Since the Early 2000s
Since the early 2000s a growing number of Masters programmes have been delivered by universities and grandes écoles in France; currently, it may be estimated that around a dozen dedicated Masters in evaluation exist, while an ever greater number of university diplomas and Masters integrate at least one specific course teaching the basics of evaluation. Two main branches of the social sciences have been involved in this development:

• On the one hand, the political science field, mainly represented by Sciences Po (mainly Lyon, Paris and Lille).
• On the other hand, faculties of economics (Rennes, Paris, Strasbourg, etc.), which were among the first to create dedicated courses and diplomas in this field: macroeconomics and public sector economics were naturally interested in evaluation thanks to a long-standing tradition of analysing the performance of the public sector (new public management, etc.).


The emergence of public policy evaluation as a fully fledged scientific area is quite recent, and more obvious in political science than in economics, even if it occupies a larger place in specific sub-domains like development studies or environmental economics. In these sub-domains, we find more individual courses in evaluation than full diplomas focusing on evaluation: in these cases, evaluation is considered one tool among many others to be taught to students. It must also be noted that there is no professorship for evaluation in France.

The Parallel Development of Non-Academic Training
In addition, a training offer in the evaluation of public policies has emerged since the early 2000s (coinciding with the development of the academic offer), thanks to the growth of the Structural Funds. As a matter of fact, one of the first consulting companies in evaluation (C3E) was reportedly founded in 1994 by two researchers who were in charge of designing the evaluation methodology for the Structural Funds (MEANS). They quickly put in place a dedicated training offer for public officers in charge of these European funds. Following this example, several companies (e.g. Edater or Argos) set up training programmes, as other methodological guidelines were designed for specific domains or public programmes (regional natural parks, the ADF guide, methodological guidelines for the evaluation of climate change programmes, for instance). Also, different public institutes and schools in charge of training current and future public officers started to introduce dedicated training sessions (INET, ENACT, ENA, etc.). In addition, some local authorities commissioned consultants or university professors to train their own employees (the Aquitaine and Rhône-Alpes Regions, the City of Pau, etc.). More recently, the French Evaluation Society decided to develop its own training programme.

4.2 Profession/Discipline

A Weak Development of Information and Communication About Evaluation
There is no professional journal dedicated to evaluation in France. However, different dedicated media do exist or have existed for a period of time, with different perspectives.

• On the Internet, one regularly updated blog provides news and opinions: La Vigie de l'Evaluation, managed for several years now by an evaluation expert. Formerly, there was another, rather similar blog created by an individual expert (Colette Lançon).
• On paper, at least two institutions published a regular periodical for some time: Les cahiers de la SFE and Les cahiers de l'évaluation of the DG Trésor. These publications mainly addressed the evaluation community and could not, as such, contribute to the diffusion of an evaluation culture.
• However, all these publications, whether online or offline, face many difficulties in surviving over the long run.

Several journals deal regularly with evaluation: these are not professional journals from other scientific disciplines but professional magazines for the public sector. The main active ones are Acteurs publics (which addresses the entire public sector community) on the one hand and La gazette des communes (which focuses on local authorities) on the other.

An Historic Actor for Promoting Evaluation in a Scattered Landscape: The French Evaluation Society
Finally, the most active actor in promoting evaluation and diffusing information and knowledge about it remains the SFE, established in 1999, which is the main voluntary organisation for professional evaluation in France. It gathers more than 400 members belonging to four communities: researchers; commissioners and public officers from both national administrations and local authorities; consultants; and non-governmental organisations.


Its main objective is to contribute to the development of evaluation and to promote its use in public and private organisations, with the aim of improving public policies. Its core activities to foster exchanges between all stakeholders of evaluation are as follows:

• regional clubs and thematic working groups, run by SFE members who organise regular meetings for interested members;
• the organisation of a biennial conference and of national thematic seminars (three to four each year);
• the development of training programmes, both for beginners in evaluation and for people looking for in-depth knowledge of evaluation techniques;
• the publication of thematic working groups' studies and national seminar proceedings;
• the creation of a national observatory of public policy evaluation, with a national database of evaluations and the publication of a biennial barometer.

In 2009 the SFE issued a charter based on seven principles, which is still the only standard for professional evaluation. These principles are:

• Pluralism: all points of view of the different stakeholders must be taken into account in a balanced way.
• Independence: an evaluation has to be carried out impartially; potential conflicts of interest must be disclosed.
• Competencies: designing, managing and commissioning evaluations, collecting data, choosing methods and analysing outcomes require specific competences.
• Opportunity: an evaluation should be undertaken when it is able to produce findings in terms of accountability, effectiveness, learning and organisational process.
• Respect of integrity: participants in the evaluation process must respect the rights, integrity and safety of all stakeholders.
• Transparency: presentations of evaluation results must be accompanied by a clear description of the evaluation subject, its objectives, its target, the questions raised, the methods used and their limits, and the arguments and criteria which led to these results.


• Responsibility: when starting an evaluation, decisions on the sharing of responsibilities between the different actors involved must be made so that responsibility is clearly established for each function of the evaluation.

Many professional evaluators, however, consider this charter too weak to really act as a quality standard of the kind that exists, for instance, in Canada. They consider that it cannot ensure the selection of genuine evaluation professionals, trained and guaranteeing high-quality work, and thus regret the absence of a certification system. As a matter of fact, the French evaluation market is still very scattered: there are no dominant companies with a large share of the market.

• Large consulting companies such as the Big Four do not have a dedicated evaluation team, with the exception of Ernst and Young, and they mainly focus their interventions on large public programmes, such as European ones.
• Consulting firms dedicated to evaluation that consider themselves generalists (meaning they can work in many different fields) are not very numerous, and the biggest ones do not exceed 15 consultants.
• Some engineering offices regularly work on the evaluation of public policies, taking advantage of their high level of expertise in different fields (water, energy, transport, for instance), despite a general lack of evaluation experts within their teams.
• Individual experts or freelancers in evaluation are quite numerous, positioning themselves in specific niches, often in partnership with larger companies. Development policy is a field where freelancers are typically numerous.
• Researchers are not very active in the field of evaluation, considering their 'market share'. When active, their role can be twofold: on the one hand, some researchers are recruited by consulting companies for their expertise in a given field and thus participate in framing the evaluation and collecting data; on the other hand, they can lead some evaluations, mainly at the national or European level, with a strong focus on impact assessment, as economists do (PSE). However, the number of this kind of evaluation remains limited, though it has been growing in recent years.


4.3 Compliance to Standards and Quality Obligations

Quality Obligations Are Rather Weak in France and Rely on the Commissioner
As previously said, the French Evaluation Society maintains a charter, but there is no control of its observance (just an initial declaration by evaluators) and the different criteria are not very binding for professional evaluators. The same holds for France Stratégie, which recently published new guidelines for policy evaluation establishing standard impact evaluation techniques (Desplatz and Ferracci 2016). Beyond that, depending on their evaluation culture, clients may demand a certain evaluation quality, but this quality is mainly based on compliance with the terms of reference. Thus, if a client is not satisfied with the evaluator's work, it may happen that part of the service is not paid for; but in no case can such a dispute be conciliated by any authority or arbitration board. In the 2000s some clients were still using a quality analysis grid at the end of the evaluation process, but this practice seems to have disappeared.

The absence of binding or high-level quality standards reflects the difficult position of evaluation in France. After 20 years of continuous but slow development, evaluation has still not found its true place in the landscape of public policy management. Many factors can explain this situation, and thus the difficulty of raising the level of standards:

• Training remains insufficient, especially for top managers of public authorities.
• The overall absence of an evaluation culture does not create pressure either on commissioners (through high-quality terms of reference) or on evaluators.


• The absence of high-level standards (which are not demanded by the entire profession, for fear of being eliminated from the market) leads to a heterogeneity of evaluation quality.
• Managers and politicians then criticise evaluation as a whole, considering its added value insufficient in relation to its cost and duration.
• In addition, the dispute between researchers and consultants about evaluation methodology further blurs the landscape for commissioners, who are generally not professional evaluators.
• This in turn weakens the profession, which cannot anticipate a growing market and thus invest in human resources, in more qualified and experienced teams that could benefit from in-depth training.
• A vicious circle is thus in place, which could lead to the marginalisation of evaluation in France to the benefit of new, 'fashionable' alternative ways of managing public policies.

5 Conclusion

Evaluation in France has developed since the 1960s mainly on the basis of two complementary driving forces: first, the reform of the State and the search for improved public performance at the national level; second, the development of European programmes for the benefit of regional authorities at the local level. This development has led to a rather high degree of maturity of French evaluation, both in terms of methods and practices and in terms of professionalisation. The specific trajectory of this institutionalisation process has also sketched out the lines of a French approach to evaluation built on pluralism and participation. In the last decade, the recognition of evaluation in the Constitution and the programme of 80 evaluations carried out by the French state during the MAP (2012–2017) reflect these achievements. However, this development and this apparent degree of maturity are largely window-dressing. Evaluation in France remains characterised by a fragmentation of practices and actors and by a lack of follow-up in governmental action. Interdisciplinary debates and the existence of methodological 'chapels' also hinder the development of a unified French model of evaluation. Obstacles also lie in the credibility and transparency of evaluations, which are far from being systematically disseminated to policy-makers and used by them.


Recently, the French Parliament called for a strengthening of its expertise and evaluation capacities through the creation of an independent evaluation agency. Under the banner of a 'spring of evaluation', members of parliament are calling for a more democratic conception of policy evaluation which could serve as a genuine counter-power. This movement, which indirectly overlaps with the demand of part of the French citizenry for greater transparency and participation in policy-making (as expressed, for example, in the citizens' initiative referendum called for by part of the Yellow Vest movement), may well constitute a strong further development for democratic evaluation in France.

References

Barbier, J. C., & Hawkins, P. (2012). Evaluation cultures: Sense-making in complex times. New Brunswick: Transaction Publishers.
Bezes, P. (2009). Réinventer l'état: les réformes de l'administration française (1962–2008). Paris: Presses universitaires de France.
Bono, P. H., Debu, S., Desplatz, R., Hayet, M., Lacouette-Fougère, C., & Trannoy, A. (2018). Vingt ans d'évaluations d'impact en France et à l'étranger. Analyse quantitative de la production scientifique. Document de travail. https://www.strategie.gouv.fr/sites/strategie.gouv.fr/files/atoms/files/fs-dt-impact-politiques-publiques-decembre-2018.pdf. Accessed on 17 March 2019.
Bourdin, J., André, P., & Plancade, J. P. (2004). Rapport d'information sur l'évaluation des politiques publiques en France. Sénat, Rapport No. 392.
CAS—Centre d'analyse stratégique. (2009). Réglementer moins, réglementer mieux? Les analyses d'impact de la réglementation. Note de veille No. 151. http://archives.strategie.gouv.fr/cas/system/files/noteveille151.pdf. Accessed on 17 March 2019.
CESE. (2015). Promouvoir une culture de l'évaluation des politiques publiques: avis du Conseil économique, social et environnemental sur le rapport présenté par Nasser Mansouri-Guilani, au nom de la délégation à la prospective et à l'évaluation des politiques publiques. Paris: Ed. des Journaux officiels.
Desplatz, R., & Ferracci, M. (2016). Comment évaluer l'impact des politiques publiques? Paris: France Stratégie.
Devaux-Spatarakis, A. (2014). La méthode expérimentale par assignation aléatoire: un instrument de recomposition de l'interaction entre sciences sociales et action publique en France? (Doctoral dissertation, Centre Emile Durkheim–science politique et sociologie comparatives, UMR5116).
Duran, P. (2010). L'évaluation des politiques publiques: une résistible obligation. Revue Française Des Affaires Sociales, 1, 5–24.
Duran, P., & Monnier, E. (1992). Le développement de l'évaluation en France; nécessités techniques et exigences politiques. Revue française de science politique, 42, 235–262.
Fontaine, C., & Monnier, E. (2002). Evaluation in France. In J. E. Furubo, R. C. Rist, & R. Sandahl (Eds.), International atlas of evaluation (pp. 63–76). New Brunswick: Transaction Publishers.
Guba, E. G., & Lincoln, Y. S. (1989). Fourth generation evaluation. London: Sage.
Jacob, S., Speer, S., & Furubo, J. E. (2015). The institutionalization of evaluation matters: Updating the international atlas of evaluation 10 years later. Evaluation, 21(1), 6–31.
Jacob, S., & Varone, F. (2004). Cheminement institutionnel de l'évaluation des politiques publiques en France, en Suisse et aux Pays-Bas (1970–2003). Politiques et Management Public, 22(2), 135–152.
Lacouette-Fougère, C., & Lascoumes, P. (2013). Les scènes multiples de l'évaluation. Les problèmes récurrents de son institutionnalisation. SciencesPo, LIEPP.
Leca, J. (1993). Sur le rôle de la connaissance dans la modernisation de l'état et le statut de l'évaluation. Revue française d'administration publique, 66, 185–196.
Nioche, J. P. (1982). De l'évaluation à l'analyse des politiques publiques. Revue Française De Science Politique, 32(1), 32–61.
Picavet, O. (2015). Les caractéristiques des évaluations réalisées par les corps d'inspection dans le cadre de la modernisation de l'action publique. Revue Française D'administration Publique, 155(3), 713–722.
Pollitt, C. (1995). Justification by works or by faith? Evaluating the new public management. Evaluation, 1(2), 133–154.
Rozenberg, O. (2016). Un petit pas pour le parlement, un grand pas pour la Ve république. L'évaluation de la révision constitutionnelle du 23 juillet 2008 (LIEPP Working Paper No. 61).
SFE—French Society of Evaluation. (2016). Baromètre 2016 de l'évaluation des politiques publiques en France. Document de travail, en partenariat avec le CGET et le SGMAP.
SGMAP. (2017). Évaluation de la démarche globale d'évaluation des politiques publiques menée dans le cadre de la modernisation de l'action publique. Rapport externe pour le SGMAP.
Silvent, F., Dureau, G., & Guillaume, H. (2002). Gestion publique. L'État et la performance. Presses de Sciences Po et Editions Dalloz.
Thoenig, J. C. (2002). L'évaluation en actes. Leçons et perspectives. Politiques et Management Public, 20(4), 33–50.

CHAPTER 7

Germany

Reinhard Stockmann and Wolfgang Meyer

1 Introduction

In Germany, applied empirical social research began to emerge as early as the final third of the nineteenth century. Furthermore, with the expansion of statistical offices, extensive databases were created (Stockmann and Willms-Herget 1985). The beginnings of evaluation, however, can only be traced back to the period after the Second World War, when, following a phase of "institutional restoration" (Alber 1989, p. 60), the pressure resulting from reform deadlock increased, which in the 1960s triggered a comprehensive modernisation of state and administration. This modernisation was founded on the idea that the "state should function as the central instance of societal management, pursue 'active policies', and shape society through long-term planning" (Mayntz 1997, p. 68). In the Federal Republic of Germany, this situation led to a first boom in evaluation, even though other labels were still frequently used, such as accompanying and
implementation research (Begleit- und Implementationsforschung) or performance audit (Erfolgskontrolle). It was not only institutions of the nation state, but also federal states and local authorities, that made use of evaluation as an instrument of management and control (Wollmann 1994, 1998, 2017; Derlien 1976, 1990; Derlien et al. 1994). Wide-ranging, elaborate evaluation studies were carried out, most notably in the fields of education, urban regeneration and infrastructure. Yet this did not lead to evaluation becoming successfully established within the political and scientific system on a sustainable basis. The economic and budgetary crisis which resulted from the crude oil price shock of the 1970s served to stifle the reform efforts. From then on, evaluation was used primarily for gathering information on the effectiveness and efficiency of programmes and measures, serving the political legitimisation of selection and prioritisation decisions.

Only at the start of the 1990s did evaluation undergo a renaissance, leading to a boom phase which lasts to this day. The cause of this was, in addition to international developments and guidelines of the European Union (EU), the rise of New Public Management within the framework of a universal modernisation of administration (Stockmann and Meyer 2016). Before describing the current status of evaluation, several aspects of the German political, social and scientific system will first be dealt with which are important for understanding the institutionalisation of evaluation and the associated difficulties.

2 General Country Overview

One formative element of the political system is German federalism. The Federal Republic consists of 16 states with their own state constitutions. They are as such not administrative provinces but rather add a vertical dimension to the classic division of powers between legislature, executive and judiciary.1 Through the ongoing Europeanisation of politics, national policy-making is also influenced by supranational decisions.2 The federal structure and integration within the EU mean, among other things, that centralised state regulation and action at the national level are hampered. On top of this, in the parliamentary party democracy the head of government, the Chancellor, is not elected directly by the public but by the Bundestag. This usually leads to coalition governments involving multiple parties, which have the task of reaching consensual decisions within the cabinet. In accordance with the departmental principle (Ressortprinzip), individual ministers also enjoy a high degree of authority, which has consequences for political coherence as a whole as well as for the implementation of the administrative regulations relevant to evaluation. Within the parliamentary system of government, the parties are of central importance. They not only provide candidates for political offices, but also influence the occupation of top positions in the administrations, courts, public prosecutors' offices and public service broadcasting. Added to this is a stable institutional structure, which, in the Federal Constitutional Court and the Federal Audit Office, has at its disposal recognised and highly esteemed supervisory bodies. These complex linkages in the political system need to be considered when the significance of the topic of evaluation is analysed from the point of view of politics.

1 In addition, there are the local authorities, which also have clearly defined rights and obligations within the framework of local self-administration and the corresponding independent scope for action (Wollmann 2013). In spite of the increasing transferral of responsibility from the local to the national level, and the concurrent expansion of governance responsibility at the European level, the political system of the Federal Republic of Germany is, when compared with other European countries, still organised in a very decentralised way (Grotz 2007).
2 On the increasing interwovenness of politics and the efforts of the EU to achieve consistency, see Schuster (2017).

The economic system of the Federal Republic is shaped first and foremost by the fact that Germany, with a gross national product of 41,902 US dollars per capita, is among the richest nations on earth and has further enhanced this position over the last decades. This enables a wide-ranging, reform-oriented welfare policy to be pursued. Germany is characterised by a 'social market economy' (Weißer 2017), which combines freedom of action within the market with social balance in accordance with the principle of solidarity.

The social system of the Federal Republic of Germany includes extensive public involvement in civil society. At present, a total of 17.5 million people belong to 616,000 registered associations, foundations or other non-profit-making groups, with 72% of these organisations operating on a purely voluntary basis. Despite prophecies of doom, involvement in civil society has continued to grow over the past 20 years (Priemer et al. 2017). The number of foundations has increased at a particularly high rate: there are now 21,301 (2015), half of which are dedicated to social tasks
(Statistisches Bundesamt 2016). Here, evaluations could shed light on the efficacy of the work carried out and of resource deployment, which, given the worthy causes and limited means, would be especially useful for most of these organisations (Bender and Mildenberger 2014).

Political involvement is of particular importance for the 2325 organisations that are on the lobby list of the German Bundestag (Deutscher Bundestag 2017). This list consists primarily of special interest groups and associations (Verbände), which participate in consultations and thereby hope to be listened to politically.3 Traditionally, the most influential associations include the employers' associations and trade unions, which, within the framework of free collective bargaining (Neurath 2016), assume the role of regulating societal subsystems by independently reaching central agreements. In general, the activities of associations are increasingly no longer limited to consultations or lobbying, but also involve collaborative participation in 'governance' structures within political networks (Mayntz 2003). Here too, evaluations could offer important support for such management activity (Puhani and Sonderhof 2011).

The media are also among the central actors within the social system (Pürer 2015). On top of the diverse and long-standing newspaper and magazine landscape in Germany,4 there is of course radio, television and, increasingly, online media (such as Facebook, Twitter, etc.). The public service broadcasters in particular enjoy a high level of trust among citizens.5 Regarding evaluation, the main interest is the extent to which evaluations are perceived by civil society, including by special interest groups and non-governmental organisations as a means of monitoring state action, and whether the public takes part in evaluations and is interested at all in the conclusions they reach.

3 On the role of associations in the Federal Republic, see Sebaldt and Straßner 2004.
4 There are currently 333 daily, 22 weekly and 6 Sunday papers with a combined circulation of almost 19 million copies (BDZV 2017).
5 For a comprehensive description, see Aigner et al. 2017.

Finally, Germany's scientific system is also characterised by a few features which set it apart, as well as by increasing change in recent years (Simon 2014; Knie and Simon 2016). Among its particular features is the fact that education and science policy are the responsibility of the federal states. Accordingly, the 426 institutions of higher education in Germany (including 107 universities) are federal state establishments (Statistisches Bundesamt 2016). Accounting for a mere 5% of all students, private universities do not play a significant role. In Germany, a total of around 1% of gross national product is invested as public funds in higher education (Emrich et al. 2013), of which a little over 15 billion euros is for research and development (approximately a fifth of overall R&D spending in the Federal Republic; Statistisches Bundesamt 2016). On the teaching side, an accreditation system was put in place through the implementation of the Bologna reform. In this context, evaluations also increasingly play a part at various levels (Stoetzer and Watzka 2017; Großmann and Wolbring 2016).

3 Institutional Structures and Processes

3.1 Evaluation Regulations

Germany currently has no national or federal state-level evaluation legislation, and no national or federal state evaluation policy. However, performance audits of completed or partially completed measures have been specifically prescribed by the Federal Budget Code (FBC) since the 1970s. Such audits should include the measurement of the achievement of objectives, impacts and cost effectiveness (FBC § 7, 23, 44). These administrative provisions have been detailed more precisely and given added importance in recent years (Dommach 2008, p. 282). At federal state level, similar regulations are in place (State Budget Code). Likewise, there is no long tradition of the evaluation of legislation in Germany (Gusy 2015). The German parliament still does not have autonomous evaluation structures. However, it can be observed that the embedding of evaluation and reporting obligations within legislation has increased greatly (Konzendorf 2009).

As can be seen in Fig. 1, the term 'evaluation' hardly appeared at all during the legislative sessions between 1949 and 1990. Even after that, only a minimal presence at a somewhat higher level can be discerned. A sharp surge of the topic within legislative provisions has occurred only over the last three legislative sessions, during which time the number of mentions more than trebled. If the printed documents are filtered by type, 211 draft laws can be found for 2016, 139 of which contain the term 'evaluation'. Upon closer analysis, however, an evaluation is recommended or prescribed in only 58 cases, that is, in only about a quarter of all draft laws dealt with in 2016. Evaluations, performance audits, impact reviews or reporting obligations are stipulated above all in social security statutes. Unfortunately, there are no surveys showing in how many national or federal state laws such clauses were ultimately included. Admittedly, one has to be cautious here: 'reporting', 'examining', 'reviewing' and 'monitoring' can refer to any number of things, possibly, but not necessarily, to evaluation.

Fig. 1 Use of the term 'evaluation' in draft laws dealt with by the Bundestag, by legislative period (Own depiction; German Bundestag 2017)
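The counts behind Fig. 1 are, in essence, simple keyword tallies over the Bundestag's printed matter, followed by a closer reading of each candidate draft to see whether an evaluation is actually prescribed. The following Python sketch merely illustrates that kind of two-step filtering; the directory layout, file naming and the crude 'prescription' regular expression are hypothetical assumptions for illustration only, not the authors' actual procedure.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed layout: one plain-text file per draft law, grouped in folders
# named after the legislative period, e.g. corpus/18/1801234.txt (hypothetical).
CORPUS = Path("corpus")

TERM = re.compile(r"\bevalu", re.IGNORECASE)  # matches 'Evaluation', 'Evaluierung', ...
# Crude cue for a prescribed evaluation ("ist zu evaluieren", "wird evaluiert", ...)
PRESCRIBED = re.compile(r"\b(ist|sind|wird|werden)\s+(\w+\s+)?(zu\s+)?evaluier", re.IGNORECASE)

mentions, prescriptions = Counter(), Counter()

for path in sorted(CORPUS.glob("*/*.txt")):
    period = path.parent.name
    text = path.read_text(encoding="utf-8", errors="ignore")
    if TERM.search(text):
        mentions[period] += 1            # draft law merely mentions the term
        if PRESCRIBED.search(text):
            prescriptions[period] += 1   # draft law appears to prescribe an evaluation

for period in sorted(mentions):
    print(f"LP {period}: {mentions[period]} draft laws mention the term, "
          f"{prescriptions[period]} appear to prescribe an evaluation")
```

In practice, the decisive second step (whether a law actually prescribes an evaluation) can hardly be automated reliably; a pattern such as the one above can only pre-filter candidates for the closer analysis described in the text.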

A distinction between theory and reality with regard to evaluation is also required as far as the Regulatory Impact Assessment (RIA) is concerned. Since 1 September 2000, the Joint Rules of Procedure of the Federal Ministries (Gemeinsame Geschäftsordnung der Bundesministerien—GGO) have set out that the legislature should carry out a systematic review of expected and actual consequences when enacting any legal norm. This may be done on an ex ante, ongoing or ex post basis, with the objective of examining, inter alia, the intended and unintended impacts, the expense of implementation, costs to the economy, and the effect on budgetary planning, prices and consumers. Pursuant to section 44 of the GGO of the Federal Ministry of the Interior (Bundesministerium des Innern—BMI) (Bundesministerium des Innern 2011), the RIA should, among other things, explicitly cover the intended and unintended impacts of a law and ascertain whether such impacts represent sustainable progress, outlining what long-term effects can be expected.


A new attempt at enforcing the performance of evaluations through administrative (non-statutory) controls was made by the State Secretaries' Committee on 'Bureaucracy Reduction and Better Regulation' with a concept for the evaluation of new regulatory measures that became effective on 1 March 2013. It sets out that all significant laws, starting from a threshold of implementation costs of one million euros, should be evaluated three to five years after entering into force, with the aim of checking whether the respective regulation's objective has been met. The corresponding reports are to be submitted to the National Regulatory Control Council (Normenkontrollrat—NKR) (NKR 2015). According to the NKR, this represents the first time a binding framework for the systematic review of legislation and regulations has been put in place by the Federal Government. In order to test the robustness of the concept, the Federal Government, together with the NKR, carried out pilot projects, reaching the conclusion that it is possible, even with limited effort, to present meaningful evaluation reports (NKR 2015). However, these reports are sometimes only three pages long and not available to the public, meaning there is ample reason to doubt this statement.

3.2 Evaluation Practice

The existence of rules and regulations prescribing the performance of evaluations does not, of course, mean that they are indeed carried out. On the contrary: in reviews undertaken by the Federal Audit Office (Bundesrechnungshof—BRH) in 1989 and 1998 on the "monitoring of public administration programmes having a major financial impact" (Bundesbeauftragte für Wirtschaftlichkeit in der Verwaltung [BWV] 1989, 1998), a disastrous conclusion is reached. In both assessments, it is observed that performance audits are carried out in only a few instances, and most departments are therefore not in a position to adequately judge the success of their measures. The BRH therefore recommends, as it did in 1989, the establishment of organisational and methodological foundations for performance audits, and the subsequent tying of the granting of budgetary funds to the submission of such evaluations. In the report produced by the BRH in 2013, it is once again observed that authorities often fail to carry out performance audits. The reason given
for this to the BRH is that the required effort bears no relation to any practical benefit. Moreover, a lack of personnel and financial resources is also mentioned. According to the BRH, although evaluation activity within the federal administration has increased in recent years, performance audit is still not viewed to a sufficient extent as an ongoing task (BWV 2013; BRH 2013). The audit findings are eye-opening. With regard to goal attainment assessments, the following criticisms are made:

• There was often a failure to formulate SMART targets and to ascertain baseline values during the planning phase.
• It was often merely reviewed whether the contractually agreed services were rendered at the agreed times, without considering that services should contribute to achieving objectives rather than being goals in themselves.
• Sustainability was examined only rarely.

In respect of the impact assessments to be performed in accordance with the FBC (section 7, administrative regulation no. 2.2), the situation is not any better. The report comes to the conclusion that:

• Impact assessments were carried out in only a few cases.
• Many employees responsible for undertaking them were not aware of how impact assessments are performed.
• Impact assessments that were in place focussed overwhelmingly on examining intended impacts, regularly failing to look at external effects.

Likewise, the cost effectiveness monitoring prescribed by the FBC was often not carried out (BRH 2013). These findings have been backed up by NKR studies. However, the NKR emphasises that this is not due to a lack of political or administrative will, but is rather a problem of insufficient methodological competence. It therefore recommends to build up evaluation expertise in the Federal Government within cross-sectional units, to develop minimum standards and a practical guide for evaluations, to link ex post evaluations with ex ante assessments of consequences, to explicitly make ex post evaluations
a component of political planning, and to introduce external, systematic quality control.

Consequently, it can be concluded that, whilst there is no national or federal state-level evaluation legislation, evaluations are being stipulated in an increasing number of laws. Moreover, the administrative regulations of the FBC, along with the Joint Rules of Procedure of the Federal Ministries, do indeed demand assessments of goal attainment, impacts and cost effectiveness. However, in certain policy fields evaluation is not applied, either at the federal or at the federal state level. This leads to a blatant lack of enforcement (Konzendorf 2009). Evidently, the different departments interpret the rules in strongly differing ways. As a consequence, the established structures and processes also differ greatly between the respective policy fields. Below are two examples of this.

Development cooperation (DC) is the policy area in which the most evaluation activity is to be seen. There are approximately 40–50 evaluations per year (Stockmann 2006, 2013). The responsible department, the Federal Ministry for Economic Cooperation and Development (Bundesministerium für wirtschaftliche Zusammenarbeit und Entwicklung—BMZ), has had an evaluation unit (previously referred to as an 'inspection unit') in place since 1971. It is the only federal ministry to which this applies. In addition, in 2012 the German Institute for Development Evaluation (Deutsches Evaluierungsinstitut der Entwicklungszusammenarbeit—DEval) was founded. Its objective is to support the BMZ in improving German development cooperation through its evaluations. With the inception of this institution, an actor with an explicit evaluation mandate was created, something which does not exist in any other policy field. However, even in this policy area, there has never been, and still is not, a specific evaluation law or a coherent evaluation policy defined by the relevant ministry.

For DC, a comprehensive 'system study' on the institutionalisation of evaluation exists, something which is not available for any other policy field (Borrmann and Stockmann 2009). This study shows that evaluation practice in DC consists primarily of internal evaluations of programme-based management processes. Yet external evaluations are becoming ever more important, particularly in the context of impact evaluations, which can also be used for the political legitimation of invested resources.


With regard to the deployment of financial resources, neither budget legislation nor the BMZ sets out normative guidelines, with the result that expenditure on evaluation is rather modest when measured against international benchmarks. At present, less than 0.5% of public DC resources are allocated to evaluations; for independent evaluations, this figure drops to just 0.2% according to the Organisation for Economic Co-operation and Development (OECD 2016). Thus, in this regard, Germany lies below the average of the OECD Development Assistance Committee (DAC) countries.

Typical for this policy field is its strong involvement in international collaboration. As early as the beginning of the 1980s, an initiative for the "evaluation of aid effectiveness" (OECD 2006, p. 19) was formed. The DAC, founded back in 1961, has developed essential rules and standards for the evaluation practices of its member countries. As well as the definition of uniform evaluation criteria (effectiveness, efficiency, impact, sustainability, relevance), it is above all the establishment of impact orientation as the central planning and management concept that has changed evaluation practice. Increasingly, evaluations examining outputs are being replaced by those examining outcomes. By contrast, impact evaluations which also deal with system-changing effects remain relatively rare.

The evaluation of foreign culture and education policy (Auswärtige Kultur- und Bildungspolitik—AKBP) can be taken as a strong counterexample, showing what range of implementation possibilities the framework set out by the FBC and the Joint Rules of Procedure of the Federal Ministries allows. The AKBP represents, in addition to political and economic relations, one of the three pillars of German foreign policy. Through encouraging dialogue and cultural exchange, it aims to develop stable international relations.6 The AKBP is managed politically by the Federal Foreign Office (Auswärtiges Amt—AA), whose area of responsibility it is. It is implemented through so-called mediator organisations (Mittlerorganisationen).7

6 Central elements of the German AKBP are scientific and higher education exchange, overseas schooling, cultural programmes, the promotion of German as a foreign language and intercultural dialogue (Ständige Arbeitsgruppe Deutsch als Fremdsprache 2006).
7 Important mediator organisations are, among others, the German Academic Exchange Service (Deutscher Akademischer Austauschdienst—DAAD), the Goethe Institute, the Alexander von Humboldt Foundation and the Institute for Foreign Cultural Relations (Institut für Auslandsbeziehungen—ifa).

In contrast to the 'implementation organisations' (Durchführungsorganisationen) of the BMZ within the field of DC, these are mostly organised as private institutions or foundations and have a large degree of freedom in the specific formulation of their activities.

To this day, the Federal Foreign Office has not developed an evaluation policy covering multiple systems, or indeed any independent evaluation structures. The Federal Audit Office has often criticised this lack of activity, on occasion heavily. Among other things, the AA is faulted for neglecting to manage and monitor the success of its support measures, and for not defining measurable targets or indicators, meaning an effective performance audit is impossible. As a result, the AA was not in a position to ensure that evaluation results were incorporated in future activities or served as lessons learned for the future work of other units (BRH 2015). The long-serving former Secretary General of the Institute for Foreign Cultural Relations, which reports to the AA, thus describes the latter as a consistent 'evaluation avoider' (Maaß 2013). The AA did put in place an 'evaluation advisory body' in 2009, but this has no evaluation budget and it is not mandatory to involve it in evaluations. Instead, the AA points to its mediator organisations, which have established their own evaluation structures to varying degrees. Not all of these, however, make their reports accessible to the public, thus holding them back from wide-ranging discussion. In the case of the few evaluations carried out by the AA, almost all were external evaluations undertaken for legitimisation purposes. There is a general movement away from output evaluations towards outcome evaluations, and here and there impact evaluations; they therefore mostly focus on results in terms of outcomes or impacts.

3.3 Use of Evaluations

In Germany, there are no studies or data on the use of evaluation covering multiple policy areas. But several indicators are to be found that can be used to establish the role of evaluation in certain policy fields. One of these has been mentioned already: in the area of law-making, the fields of labour and social affairs are the ones on which the Federal Government focusses primarily in relation to the implementation of evaluations. In civil society, it is above all the field of education in which many evaluators are active; this relates to the activities of the evaluation association and the implementation of evaluation within institutions of higher education. Further policy fields in which evaluation is demonstrably well established are research, innovation and technology policy, urban and spatial planning
(including environmental protection), and the health sector. In the areas of economics and administration, primarily in relation to quality management, as well as in culture and security policy, first beginnings can be seen (Böttcher et al. 2014; Widmer et al. 2009; DeGEval 2000).

The system study mentioned above examined the 20 most important DC organisations (including the BMZ) with regard to their evaluation practices (Borrmann and Stockmann 2009). It could be shown that evaluations are primarily used for political management activities. However, when asked to rank the significance of the four central evaluation functions (gaining of insight, control, learning, legitimation; Stockmann and Meyer 2014), all of them, including the BMZ as the institution which dictates policy, attached the highest priority to learning. Thus, the results are used above all by the managers responsible for setting the programme agenda, for the purpose of learning for project implementation.

This applies in similar measure to higher education policy, which, as mentioned above, has been subject to dramatic change in recent years, above all due to the unification of degree courses in Europe ('Bologna reform') and the associated accreditation through external agencies. These changes and the accompanying discussion on the quality of degree courses and apprenticeships have given new impetus to the quality control efforts which have been discernible at German universities and specialist technical colleges (Fachhochschulen) since the early 1990s. This has not, however, led to a uniform understanding of evaluation or to consistent quality management (Fischer-Blum et al. 2000). Variations are attributable to the federal structure and the autonomy of the federal states with regard to institutions of higher education, which has consequences: "Overall, an ongoing deficit is apparent with regard to the linking of evaluation procedures, resulting measures and the development of institutions of higher education" (Schmidt 2009, p. 169). In spite of these difficulties, and in some cases major problems of acceptance, evaluation has nevertheless become established at German higher education institutions, largely in the form of an 'evaluation officer' within university management. What is more, it has become the standard instrument of quality reviews in relation to teaching and degree courses. On top of this, there are appraisals and studies in the research domain, as well as in relation to the career paths of graduates, which have also been carried out regularly for years. Meta-evaluations, or more generally research work on the quality and effects of evaluations, are, however, still almost completely lacking in this area (Harris-Huemmert et al. 2014).


In other policy fields, differing understandings of the use of evaluations as a means of monitoring quality and results have developed and become entrenched. One example of this is environmental policy. In this area, Germany is among the pioneers at a global level, and even during its beginnings in the 1980s, the field was characterised by intense political confrontation, not only between parties but also between politics and civil society (Böcher and Töller 2012). Such debate was influenced to a large degree by disputes between experts and by the use of specialist knowledge from different scientific disciplines, with evaluations in the social scientific sense barely playing any role at all. To the actors in this policy field, 'evaluation' refers primarily to 'technical assessment', meaning the monitoring and inspection of technical facilities using technical measurement tools, for example in order to quantify the production of certain toxic substances. Accordingly, (non-technical) evaluation is not embedded in any legislation, nor does a sectoral evaluation policy exist. Evaluations are indeed commissioned by the Federal Environment Agency within the framework of the Environmental Research Plan, on behalf of the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety (Bundesministerium für Umwelt, Naturschutz, Bau und Reaktorsicherheit—BMUB); however, the evaluations performed differ greatly in their format, objectives and methodology, meaning the audiences and users of evaluations also vary a lot. Systematic, regular use of evaluation has thus far not taken place. This also applies to the sustainability strategy, which does not focus any attention on the subject of evaluation, either in its old 2002 version or in the revised 2016 version (Die Bundesregierung 2016). In this respect, Germany is still lagging behind most other countries (Stockmann et al. 2016; Simon and Meyer 2017; Simon et al. 2017; Meyer et al. 2018). Instead, the tradition of quantitative reporting on the basis of macro indicators is prevalent, within which evaluations represent, at most, isolated, but certainly not systematic, additions.

In accordance with the range of evaluations described here, with strong differences in terms of objectives, the degree of institutionalisation, the methods used and the processes employed to carry them out and to utilise their results, the audiences and users also vary considerably, from programme managers, administrators and political decision-makers right through to the general public.

One actor not dealt with so far is parliament, which has not yet been conspicuous as a promoter, keen user or commissioner of evaluations. However, interest in the subject has seemingly grown significantly in recent years. At least that is the impression given by Fig. 2, which shows the frequency of use of the term 'evaluation' in the printed matter and plenary protocols of the Bundestag since 1949.

Fig. 2 Frequency of use of the term 'evaluation' in the printed matter of the Bundestag, by legislative period (Own depiction; German Bundestag 2017)

Starting with the 16th legislative session (2005–2009), a great leap in the use of the term can be made out, suggesting that the topic has enjoyed a huge increase in importance in the last ten years, one which continues to hold.

In relation to the 16th parliamentary term (2005–2009), Speer, Pattyn and De Peuter (2015) examined all questions posed by members of parliament and the parties. Each member of the Bundestag has the right to address up to four questions per month to the Federal Government and to receive a written reply. Of the 15,553 questions submitted during the 16th parliamentary term, 38 (~0.24%) contained the term 'evaluation'. Moreover, the parliamentary parties within the Bundestag are entitled to request the Federal Government to make statements on specific issues by way of so-called brief or major enquiries. Of the 3299 brief enquiries, 157 (~4.8%) alluded to the topic of evaluation. Major enquiries numbered 63 during the 16th parliamentary term, with the term 'evaluation' cropping up 21 times (~33.33%). If we differentiate on the grounds of subject, it is evident that most questions related to domestic policy, labour market and social policy, and education and research policy.
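The shares quoted above follow directly from the reported counts. The short snippet below merely recomputes them from the figures of Speer, Pattyn and De Peuter (2015) as cited in the text; it is purely illustrative arithmetic, not part of the original study.

```python
# Shares of Bundestag questions and enquiries in the 16th term (2005-2009)
# that touched on 'evaluation', recomputed from the counts reported above.
counts = {
    "written questions": (38, 15_553),
    "brief enquiries": (157, 3_299),
    "major enquiries": (21, 63),
}

for label, (with_term, total) in counts.items():
    share = 100 * with_term / total
    print(f"{label}: {with_term} of {total} ({share:.2f}%)")
```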


The analysis of content showed that most questions concerned the evaluation process, for instance whether an evaluation is planned at all, who is carrying it out, when results will be available, et cetera. Other questions pertained to the evaluation results themselves, for example whether the results had been published or whether the government knew about them. Only a few questions focussed on the consequences which the government would draw from the evaluation results in the interests of evidence-based decision-making. This could be interpreted as an indication that the use of evaluation results in decision-making processes is not exactly a parliamentary focus.

In Germany, neither an institution nor state-defined rules exist for the purpose of ensuring evaluation quality. At best, the evaluation standards published by the German Evaluation Society (DeGEval) may contribute to this end. Yet DeGEval has neither a clearing point nor an ombudsman which could be drawn upon in case of conflict. A certain role in assuring quality, at least in the case of state evaluation, could be ascribed to the BRH as the entity responsible for ensuring adherence to existing (evaluation) regulations, even though it does not itself have any evaluation mandate (Müller 2011) but rather, in accordance with the Basic Law (Art. 114, section 2, sentence 1), merely the responsibility to audit "the account and determine whether public finances have been properly and efficiently administered". The BRH not only admonishes the ministries for a lack of results monitoring, but frequently also finds fault with the quality of evaluations that have been carried out. Of late, the National Regulatory Control Council has also manifested itself in this field, having taken over the task of assessing the quality of the pilot evaluations carried out by the departments as part of the concept for evaluating regulatory measures which came into effect in 2013 (NKR 2015). All of this remains, however, a very long way from a national quality control system.

4 Societal Dissemination

4.1 Institutionalised Use of Evaluations by Civil Society

In view of the heterogeneity and complexity described above, it is equally problematic to present a comprehensive overview of the institutionalisation of evaluation within civil society. At an organisational level, a search within the register of associations (Vereine) for the term 'evaluation' produced 47 hits among the 600,000-plus associations. Even behind this small number there is a high degree of variance: it relates to social services providers,
healthcare and therapy facilities, institutions of quality assurance and management, engineering offices and construction experts, consulting firms and evaluation institutes, and research facilities in the field of social pedagogy, in development cooperation and in the area of political consulting. Although the majority of these organisations can be categorised as belonging to the policy fields of healthcare and social services, even here there are no indications of a wide-ranging, comprehensive evaluation offering.

Looking at the institutional members of DeGEval also reveals the heterogeneity of evaluation in Germany. A total of 188 organisations from Germany, Austria, Switzerland and Luxembourg actively participate in the society, among them numerous state entities and ministries. Of the 107 German institutional members acting as public or private providers or executors of evaluations, the vast majority belong to the field of higher education; most frequently these are departments that evaluate degree courses and apprenticeships within their own institutions. Of the remaining organisations, most work primarily for public-sector clients; neither private enterprise nor civil society itself appears to any meaningful extent as a customer of these organisations, which are specialised to a greater or lesser degree in performing evaluations. In terms of content, the spectrum is wide and covers primarily the fields of education, development cooperation, healthcare, social services, research, technology and innovation. However, in many of these areas, self-evaluations are dominant among the organisations of civil society, the results of which are rarely made available to the public.

In the case of foundations (Stiftungen), the picture is similar, with the website of the Association of German Foundations8 producing just a few hits (n = 32), mostly in relation to (a small number of) training sessions and general information about monitoring and evaluation. PHINEO9 recently published a 'foundation timetable' which includes information on the setting-up of monitoring and evaluation systems (PHINEO 2016); however, the efforts of PHINEO do not (yet) appear to have contributed to a wider dissemination of evaluations among foundations (PHINEO 2015). This is also not the first attempt at introducing monitoring and evaluation as quality control measures within the context of foundations.

8 For further information see https://www.stiftungen.org.
9 PHINEO is an analysis and consultancy firm created by the Bertelsmann Stiftung to promote effective civic engagement.

Plans to add an evaluation component to the seal-of-approval awarded by the German Central Institute for Social Issues (Deutsches Zentralinstitut für soziale Fragen—DZI) back in 2006 eventually came to nothing. The seal-of-approval is intended as a sign that a fundraising organisation handles the money entrusted to it in a careful and responsible manner. Currently, 230 organisations, with total funds of 1.2 billion euros, have earned the much-respected seal. To receive the seal, organisations have to meet a range of criteria (DZI 2016). In using resources, suitable planning, implementation and monitoring should be ensured; funds should be ring-fenced for a specific purpose and used in accordance with the principles of efficiency, prudence and achievement of the maximum possible impact. In order to document the latter, a kind of monitoring is required in the form of the observation of impacts. However, the word evaluation is not mentioned once within the 40-page guideline document on the awarding of the fundraising seal-of-approval. Impact evaluations are not required of the organisations earning the seal.

In summary, an increasing use of evaluation can be seen in organisations of civil society in Germany. However, this increase relates to a broader spread rather than greater depth and has led to the development of an evaluation culture only in a few areas, let alone to institutionalisation in terms of the routine actions of these organisations. Where evaluations are performed, this can normally be attributed to the initiative of the state. The basic attitude of organisations of civil society towards the topic of evaluation can seemingly best be described by the words of warning contained in the annual evaluation report of one of the big NGOs that is relatively active in the field of evaluation: "However, we must not carelessly call for more and more resources for evaluations, as evaluation experts like to do; funds that we use to keep improving evaluations are not available for use in the execution of development projects" (Misereor 2016). Such a statement is of course fundamentally correct, and applies, incidentally, not only to evaluations but also, for example, to large-scale media campaigns for raising funds and propagating success stories. However, it is unfortunately typical of German civil society that it employs such arguments primarily for resisting the state's requirement to monitor how successfully state-provided resources have been deployed, rather than demanding accountability from the state on the effects of its investment of public money in the form of independent evaluations. This can be seen when looking at the media coverage of evaluation.


4.2 Public Perception and Discussion of Evaluation and Evaluation Findings

It must be said that a comprehensive analysis is made difficult by the heterogeneity and scope of the German media landscape. As a representative example, a search for articles containing the term 'evaluation' was undertaken within the archive of the Süddeutsche Zeitung (SZ), one of the most prominent German quality daily papers. Although an increase can be seen between 1992 and the middle of the first decade of the new millennium, there have been and continue to be significant fluctuations. A boom similar to that in the parliamentary documents cannot be made out (Fig. 3). This remains unchanged even when the term 'impact and performance audit' (Wirkungs- und Erfolgskontrolle) is added to the term 'evaluation' as part of the search.

Fig. 3 Frequency of use of the terms 'Evaluation'/'Evaluierung', and additionally 'Wirkungs-/Erfolgskontrolle' (impact/performance audit), in the SZ archive (Süddeutsche Zeitung 2017)


If we take a closer look at the 40 articles containing the term 'evaluation' which appeared in the SZ and on süddeutsche.de, we notice that almost half of these relate to the field of education. A further area of focus cannot be made out; in most cases the information relates to an array of subjects such as refugees, schooling issues, the housing market, the police, energy efficiency, careers and unemployment, data retention, medicine, et cetera. In about a third of the articles, evaluation is admittedly portrayed as being useful; yet only in two cases are explicit recommendations made for the carrying out of evaluations. Around half of the 40 articles examined are not really about evaluation; rather, the concept of evaluation is merely mentioned in passing, such as in the article entitled 'A Man Sees Red'. Here a Grimme Award-winning comedy series is discussed, in the latest episode of which the main character 'Schotty' encounters a sadistic management consultant with a specialism in 'evaluation and efficiency methods'. What is more, sometimes the term is not used in its actual conceptual meaning, for example when talking about the evaluation of a person's mental state or the evaluation of energy consumption in buildings.

Generally speaking, there are no signs that evaluations are increasingly being perceived by the general public as an important instrument for the monitoring and management of state actions. Although numerous evaluation studies are now made available to the public, public interest in these is low. The analysis of the SZ content also makes clear that organisations of civil society only very rarely demand the use of evaluation as a means of controlling state actions, nor do they often actively utilise the results of such evaluations for their own aims. The growing use of the term in the media may be a manifestation of the increasing dissemination of the concept, but the sweeping manner in which it is used, often in a non-specialist sense, hinders rather than facilitates the establishment of evaluation within society. Significantly, DeGEval stopped awarding a media prize some years ago due to a lack of worthy contributions.

4.3 Civic Participation in Evaluations

With regard to civic involvement in evaluations, it can be stated that, in contrast to planning procedures, there are no political guidelines governing how citizens could be actively involved in evaluations, whether at the local government, federal state or national level. Yet on the other hand,
even though there is broad consensus among evaluators on the necessity of ensuring wide-ranging participation, with participatory approaches even enjoying something of a boom, the conceptual ideas as to how such participation can be implemented in evaluation practice are strangely nebulous and methodologically unconvincing (Meyer et al. 2016). As far as we know, there are no demands from organisations of civil society for better involvement of the public in state-commissioned evaluations. Even the Evaluation Society and the professional social science associations, whilst calling on politics to ensure more evaluations, do not call for increased civic participation. This leads to a strange discrepancy between conceptual wishes around participation and a practice of involving citizens in evaluation in which the implementation of such ideas is patchy, not well-founded and certainly not institutionalised.

In summary, it can be concluded that, within the social system, there is merely modest interest in evaluations. Organisations of civil society only manifest themselves here and there as commissioners and users of evaluation (and where they do, then mostly in the form of self-evaluations). Frequently, this only happens when the state supports such measures financially and in so doing insists on the performance of evaluations. Moreover, it can be seen that civil society organisations are even less willing than state bodies to publish their evaluation reports unvarnished. Most reports are either not published at all or only in the form of summaries. The beginnings of an independent evaluation culture can be seen within only a few organisations of civil society. This tends to relate more to their own activities, though, and has not culminated in demands relating to society as a whole. It is the state which puts pressure on organisations of civil society to carry out evaluations, rather than vice versa.

If we focus on the public dialogue surrounding evaluation, the sobering conclusion is reached that such discussion takes place only sporadically, in the context of specific topics, and frequently even involves an incorrect comprehension of the concept itself. Overall, this demonstrates a lack of understanding of evaluation. In the media, particular interest in evaluation results frequently arises only when these can be used to make a scandal of certain political practices. No calls for more or better evaluation are directed towards politics, either by media representatives or by agents of civil society. Likewise, our research findings established a lack of demands for strong civic participation in evaluations.

5 Professionalisation

Whilst the political and social systems primarily appear as consumers of evaluation, the scientific system is responsible for providing a quantitatively and qualitatively suitable supply of sufficiently qualified personnel, along with appropriate conceptual (research) approaches and methodological tools. The main objective of the institutionalisation of evaluation within the scientific system is its professionalisation as an independent subject area of applied social research with technically adequate and institutionally protected rules. According to the literature on professionalisation, professional action is ensured by four interlinked institutions10: qualification, communication, organisation and regulation.

10 In addition to the market mechanisms for reconciling supply and demand, see Meyer (2015b, 2017).

On Qualification

In Germany, similar to the United States of America back in the 1970s (Meyer 2015a), evaluation was initially able to establish itself within institutions of higher education as an element of different specialist disciplines (primarily educational science and psychology). This led to an unclear offering of individual courses which fluctuated over time and which were mostly linked to the initiative of particular lecturers. Only once such courses have found their way into the study and examination regulations, and thus have to be offered irrespective of the people involved, can we talk about institutionalisation.

Research into various degree course portals found that there are currently (2017) 34 master's degree courses in Germany in the field of social science whose official course documents mention evaluation. However, in most cases evaluation is only included as a small sub-area (usually in the context of methods training), with no, or only a few, separate courses on evaluation topics being offered. Fourteen degree courses contain a whole module in which evaluation is taught in conjunction with other methodology topics, quality management, and planning or diagnostics tools. A specialist evaluation module exists in only two cases (one within a development course, one within a social work course), with a further degree course in the field of public health offering evaluation as an area of specialisation. The scope of evaluation courses does not exceed 20 points
within the European Credit Transfer and Accumulation System (ECTS) in any of these cases. Accordingly, the Master of Evaluation offered by Saarland University in conjunction with the University of Applied Sciences, and originally also with the Catholic University for Social Work, remains unique in Germany with its 120 ECTS points.11 The attempt by the University of Bonn between 2008 and 2011 to establish a further master's degree course in evaluation regrettably failed. Since the summer of 2018, Saarland University, together with the University of Kaiserslautern, has offered an additional English-language master's course in evaluation in blended learning format at the Distance and Independent Studies Center.12

The few qualification courses outside of institutions of higher education are linked to specific subject areas and are, as a rule, merely isolated offerings. A regularly undertaken, systematic training programme for evaluators in development cooperation (Fortbildungsprogramm für Evaluatoren in der Entwicklungszusammenarbeit) existed between 2003 and 2014; comparable programmes in other policy areas are absent. We can therefore not speak of an institutionalisation of evaluation qualification outside of institutions of higher education; evaluation remains, in the vast majority of cases, an activity which is encountered in very diverse professional training courses and learned 'on the job'.

On Communication

A fundamental prerequisite for the emergence of a profession is regular dialogue amongst people interested in the subject, which enables them to develop a common understanding. In the respective specialist organisations in the field of social science, the topic of evaluation has occasionally been addressed, being dealt with within individual courses or ad hoc groups; yet institutionalisation in the form of distinct departments or working groups has only occurred within the German Society for Psychology, with the creation, back in 1992, of the 'specialist group for methods and evaluation' (Meyer 2003a, b). To this day, nothing has changed in this regard.

11 For further information see www.master-evaluation.de.
12 For further information see www.zfuw.de.


In this context, the much-used exchange platform Forum für Evaluation13 was established on the initiative of Wolfgang Beywl at an early stage (1997) and now has 800 subscribers. This has since been complemented by further offerings, in particular in the course of the founding of a specialist society in the same year. With regard to institutionalisation in the scientific system, the publication of a German-language journal was definitely the most crucial step. The 'Zeitschrift für Evaluation' (Journal for Evaluation),14 founded in 2002, gave a home to German-speaking evaluators, something they clearly had not found in the major English-language journals (Gaus et al. 2014). The main topics dealt with were theory and methods, institutions of higher education, labour market policy, research, technology and innovation, and urban and regional development (Gaus 2015). On top of this, there are of course articles in other German-language social science journals,15 which do not specialise in evaluation but which, due to the type of content, also publish articles on evaluation. A systematic search within the five large social science journals published by Springer-Verlag16 in the period since the turn of the millennium delivers just under 700 hits, of which, however, only just over 10% relate to pertinent original articles. Content-wise, these articles, including those in the non-education-related journals, focus on education topics, primarily relating to the evaluation of teaching, research and schools.

13 For further information see https://lists.uni-koeln.de/mailman/listinfo/forum-evaluation.
14 For further information see www.zfev.de. Since 2004, this peer-reviewed journal is listed by the leading information services and achieved an impact factor of 0.318 in 2014, a respectable figure for a German-language publication. At this point in time, about 1250 copies per edition were being sold (including to nearly 1000 subscribers), and since the first edition, 120 original articles had been published (80% of which were by German authors).
15 There are about 140 journals.
16 Berliner Journal für Soziologie (Berlin Journal of Sociology), Kölner Zeitschrift für Soziologie und Sozialpsychologie (Cologne Journal for Sociology and Social Psychology), Zeitschrift für Bildungsforschung (Journal for Educational Research), Zeitschrift für Erziehungswissenschaft (Journal for Educational Science) and Zeitschrift für Vergleichende Politikwissenschaft (Comparative Governance and Politics).


Bearing in mind the journals selected, this is of course not representative. A general trend cannot be made out.

On Organisation

The professionalisation of a subject also requires, in addition to a permanent, specific qualification offering, its institutional establishment in the form of a professional association (Fachverband). With the founding of DeGEval in 1997, this happened relatively early in comparison to the situation in other European countries, and a long time before the worldwide wave of such evaluation organisations being created (Meyer 2015b). DeGEval was able to recruit central stakeholders (including federal authorities, departmental research institutes, research units of universities and consultants) as members relatively quickly, registering a significant increase in members especially during its first years of existence. With around 823 private and institutional members, it has developed into the association with the most members in Europe, covering a formidable range of topics with its 16 working groups. In comparison to the professional social science associations (Fachverbände),17 DeGEval has more in common with a professional association (Berufsverband) and does not see itself primarily as a platform for scientific exchange, but rather as a bridge between science and practice and between clients and providers. In this respect, DeGEval is very successful at bringing together the heterogeneous evaluation community, which is active in many diverse fields, but not at driving integration in the scientific or political system through resolute external representation. DeGEval's self-conception also reflects very well the German evaluation market, which is mainly characterised by freelancers, consultancies and research institutes. The role of universities has never been particularly influential and is, if anything, declining.

17 The German Educational Research Association, the German Political Science Association, the German Society for Psychology and the German Sociological Association.

On Regulation

DeGEval was concerned from the outset with establishing universal rules to provide orientation when carrying out evaluations. As a result of this, in
2004 its ‘evaluation standards’ were officially published, then re-launched following a revision process in 2016. These represent ‘maximum standards’, an ideal which can often not be completely achieved in practice, rather than minimum requirements that have to be met by all parties. Although the use of the standards had not spread as desired among evaluators and clients in the early years of the new millennium (Brandt 2009), more recent survey results give hope of a positive change in this regard (Stockmann et al. 2011). A comprehensive report on use of the standards within various policy fields, including the identification of inhibiting and facilitating factors regarding the pervasiveness of the standards, is currently being worked on; initial results are not yet available though (Böttcher et al. 2019). To sum it up: evaluation has done certain steps toward professionalisation in Germany, although the institutions are still very weak. Nevertheless, there is a well-established study programme, a respected specialised scientific journal, the largest evaluation society in Europe and Evaluation standards that are at least partly recognised in a broad variety of policy fields.

6 Summary

Germany's federal system of government is characterised by a multiple power structure, divided horizontally and vertically, in which parties, associations and the media carry considerable weight. This results in a broad spectrum of decision-making structures, and centralised governmental solutions require a detailed coordination process. This has influenced the development of evaluation in Germany, which shows different structures depending on the policy field, the federal level (federal government, federal states, communities) and the system context (political, societal, private sector). Owing to this diversity, no standardised evaluation structures and processes have developed in Germany. Considering the political system, it can be said that it has taken on a central role in the development and dissemination of evaluation in Germany. Although there is still no national evaluation legislation or national evaluation policy, with the FBC and the GGO legally binding regulations are in place that require the performance of evaluations. Moreover, the number of laws in which the requirement to undertake evaluations is explicitly rooted has risen massively in the last ten years.


Owing to the wide-ranging scope for action of the individual federal departments, these stipulations are implemented to greatly varying degrees. While several ministries have established comprehensive sectoral evaluation systems, there are also a number of evaluation avoiders. Furthermore, some policy fields lie primarily within the responsibility of the federal states, creating an uneven landscape of different institutional solutions. As a consequence, it is difficult to formulate universally valid statements in relation to Germany. Yet it is undoubtedly the case that the political system displays a lack of enforcement, something which for decades has been decried by the BRH and, more recently, by the NKR too. Evaluations are encountered most frequently as internal process evaluations serving as a basis for decisions to be taken by programme management. The prominence of independent external evaluations varies from department to department. Where they are carried out, it is mainly for legitimation purposes. The primary target audiences therefore are programme managers and budget controllers, rather than political decision-makers or the public. The institutionalisation of evaluation in the social system lags considerably behind that in the political system. Despite the increasing use of evaluation in civil society organisations, no nationwide structures and processes have been established yet. There is little interest in evaluation on the part of the public or the media, and citizens are generally not expected to participate in evaluations. In summary, it can be concluded that during recent decades huge progress has been made on the supply side, in particular thanks to the activities of DeGEval. Through the long-term establishment of a master's degree course in evaluation, the creation of an association with high membership figures, the publication of now widely used specialist standards, a well-launched and respected professional journal, a wide-ranging supply of literature and textbooks, and a solid communication structure, evaluation has become professionalised and—at least to some extent—has established itself institutionally within the scientific system. However, following quick early successes, this process has stalled, despite an overall increase in demand.


References

Aigner, R., Handrich, L., Mattes, A., & Pavel, F. (2017). Öffentlich-rechtlicher Rundfunk in einer konvergenten Medienwelt: Endbericht. Studie im Auftrag der ARD (Politikberatung kompakt No. 119). DIW Berlin.

Alber, J. (1989). Der Sozialstaat in der Bundesrepublik Deutschland 1950–1983. Frankfurt and New York: Campus.

BDZV. (2017). Die deutschen Zeitungen in Zahlen und Daten 2017. Berlin: BDZV.

Bender, R., & Mildenberger, G. (2014). Wirkungsmessung in Kooperationen. Zusätzlicher Aufwand oder hilfreiche Unterstützung? Stiftung & Sponsoring 4/2014, 17(4), 24–26.

Borrmann, A., & Stockmann, R. (2009). Evaluation in German development cooperation: A system analysis. Münster: Waxmann.

Böcher, M., & Töller, A. E. (2012). Umweltpolitik in Deutschland. Eine politikfeldanalytische Einführung. Wiesbaden: Springer.

Böttcher, W., Hense, J., Kalman, M., & Meyer, W. (Eds.). (2019). Standards der Evaluation in unterschiedlichen Handlungsfeldern: Einheitliche Qualitätsansprüche trotz heterogener Praxis? Münster: Waxmann.

Böttcher, W., Kerlen, C., Maats, P., Schwab, O., & Scheikh, S. (Eds.). (2014). Evaluation in Deutschland und Österreich. Stand und Entwicklungsperspektiven in den Arbeitsfeldern der DeGEval – Gesellschaft für Evaluation. Münster: Waxmann.

Brandt, T. (2009). Evaluation in Deutschland: Professionalisierungsstand und -perspektiven. Münster, München, and Berlin: Waxmann Verlag.

BRH. (2013). Bemerkungen des Bundesrechnungshofes 2013 zur Haushalts- und Wirtschaftsführung des Bundes. Bonn: BRH.

BRH. (2015). Auswärtiges Amt verbessert seine Erfolgskontrollen (No. 12). https://www.bundesrechnungshof.de/de/veroeffentlichungen/produkte/bemerkungen-jahresberichte/jahresberichte/2015/teil-iii-einzelplanbezogene-entwicklung-und-pruefungsergebnisse/auswaertiges-amt/langfassungen/2015-bemerkungen-nr-12-auswaertiges-amt-verbessert-seine-erfolgskontrollen-pdf. Accessed on 13 May 2019.

Bundesministerium des Innern. (Eds.). (2011). Gemeinsame Geschäftsordnung der Bundesministerien (GGO). https://www.bmi.bund.de/SharedDocs/downloads/DE/veroeffentlichungen/themen/ministerium/ggo.pdf?__blob=publicationFile&v=2. Accessed on 15 May 2019.

BWV. (1989). Erfolgskontrolle finanzwirksamer Maßnahmen in der öffentlichen Verwaltung: Gutachten des Präsidenten des Bundesrechnungshofes als Bundesbeauftragter für Wirtschaftlichkeit in der Verwaltung. Abgeschlossen im Juni 1989. Stuttgart, Berlin, and Köln: Kohlhammer.


BWV. (1998). Erfolgskontrolle finanzwirksamer Maßnahmen in der öffentlichen Verwaltung: Gutachten der Präsidentin des Bundesrechnungshofes als Bundesbeauftragte für Wirtschaftlichkeit in der Verwaltung (2nd ed.). Stuttgart, Berlin, and Köln: Kohlhammer.

BWV. (2013). Anforderungen an Wirtschaftlichkeitsuntersuchungen finanzwirksamer Maßnahmen nach §7 Bundeshaushaltsordnung. Stuttgart: Kohlhammer.

Deutsche Gesellschaft für Evaluation (DeGEval). (Ed.). (2000). Evaluation. Reformmotor oder Reformbremse? Köln: DeGEval.

Derlien, H. U. (1976). Die Erfolgskontrolle staatlicher Planung: eine empirische Untersuchung über Organisation, Methode und Politik der Programmevaluation. Baden-Baden: Nomos-Verlag.

Derlien, H. U. (1990). Program evaluation in the Federal Republic of Germany. In Program evaluation and the management of government: Patterns and prospects across eight nations (pp. 37–51). New Brunswick: Transaction Publishers.

Derlien, H. U., Gerhardt, U., & Scharpf, F. W. (Eds.). (1994). Systemrationalität und Partialinteresse: Festschrift für Renate Mayntz. Baden-Baden: Nomos-Verlag.

Deutscher Bundestag. (2017). Bekanntmachung der öffentlichen Liste über die Registrierung von Verbänden und deren Vertretern vom 4. Mai 2017. Bundesanzeiger BAnz AT 19.05.2017 B1. https://www.bundestag.de/blob/189456/130ebb7895b22f6a58da1dc805ceaadb/lobbylisteamtlich-data.pdf. Accessed on 29 March 2019.

Deutsches Zentralinstitut für soziale Fragen (DZI). (2016). Jahresbericht 2016. Berlin: DZI.

Die Bundesregierung. (Ed.). (2016). Deutsche Nachhaltigkeitsstrategie. Neuauflage 2016. https://www.bundesregierung.de/resource/blob/975292/730844/3d30c6c2875a9a08d364620ab7916af6/deutsche-nachhaltigkeitsstrategie-neuauflage-2016-download-bpa-data.pdf?download=1. Accessed on 16 May 2019.

Dommach, H. (2008). Das Verfahren der Erfolgskontrolle durch die Bundesverwaltung für zuwendungsfinanzierte Projekte und Institutionen. Öffentliche Verwaltung, 61(7), 282.

Emrich, E., Müller, A., Meyer, W., & Gassmann, F. (2013). Saarländische Bildungsinvestitionen und regional-ökonomische Effekte der Universität des Saarlandes. In E. Emrich, W. Meyer, & L. Rampeltshammer (Eds.), Die Universität des Saarlandes in sozio-ökonomischer Perspektive. Ausgewählte Analysen sozialer und wirtschaftlicher Effekte (pp. 71–102). Saarbrücken: Universaar.

Fischer-Blum, K., Mittag, S., Sundermann, E., Buß, S., Winter, M., & Habel, E. (2000). Zur Einheit und Vielfalt in der Evaluation von Studium und Lehre. Bericht des Arbeitskreises Hochschulevaluation. In DeGEval (Ed.), Evaluation. Reformmotor oder Reformbremse? (pp. 51–80). Köln: DeGEval.


Gaus, H. (2015). Die Zeitschrift für Evaluation—Aufbau und Entwicklung. In V. Hennefeld, W. Meyer, & S. Silvestrini (Eds.), Nachhaltige Evaluation? Auftragsforschung zwischen Praxis und Wissenschaft. Festschrift zum 60. Geburtstag Reinhard Stockmanns (pp. 177–183). Münster: Waxmann.

Gaus, H., Müller, C. E., & Konradt, I. (2014, September 11). Publikationen in referierten allgemeinen Evaluationszeitschriften im internationalen Vergleich—Ergebnisse einer explorativen Studie. Presentation within the joint annual conference of DeGEval and SEVAL, Zürich.

Großmann, D., & Wolbring, T. (Eds.). (2016). Evaluation von Studium und Lehre. Grundlagen, methodische Herausforderungen, Lösungsansätze. Wiesbaden: Springer.

Grotz, F. (2007). Vertikale Gewaltenteilung: institutionenpolitische Leitidee oder demokratietheoretische Chiffre? Reform westeuropäischer Bundes- und Einheitsstaaten im Vergleich. Zeitschrift für Parlamentsfragen, 38(4), 775–797.

Gusy, C. (2015). Evaluation von Sicherheitsgesetzen. Wiesbaden: Springer.

Harris-Huemmert, S., Mitterauer, L., & Pohlenz, P. (2014). Evaluation im Kontext der Hochschule. In W. Böttcher, C. Kerlen, P. Maats, O. Schwab, & S. Scheikh (Eds.), Evaluation in Deutschland und Österreich. Stand und Entwicklungsperspektiven in den Arbeitsfeldern der DeGEval—Gesellschaft für Evaluation (pp. 107–115). Münster and New York: Waxmann.

Knie, A., & Simon, D. (2016). Innovation und Exzellenz: Neue und alte Herausforderungen für das deutsche Wissenschaftssystem. In D. Simon, A. Knie, S. Hornbostel, & K. Zimmermann (Eds.), Handbuch Wissenschaftspolitik (pp. 21–38). Wiesbaden: Springer.

Konzendorf, G. (2009). Institutionelle Einbettung der Evaluationsfunktion in Politik und Verwaltung in Deutschland. In Evaluation (pp. 27–39). Wiesbaden: VS Verlag für Sozialwissenschaften.

Maaß, K. J. (2013). Rolle und Bedeutung von Evaluation in der auswärtigen Kulturpolitik Deutschlands. In V. Hennefeld & R. Stockmann (Eds.), Evaluation in Kultur und Kulturpolitik. Eine Bestandsaufnahme (pp. 35–52). Münster: Waxmann.

Mayntz, R. (1997). Soziologie in der öffentlichen Verwaltung. Heidelberg: C. F. Müller.

Mayntz, R. (2003). Governance im modernen Staat. In A. Benz (Ed.), Governance – Regieren in komplexen Regelsystemen (pp. 65–76). Wiesbaden: Springer.

Meyer, W. (2003a). Evaluation als Thema in sozialwissenschaftlichen Berufs- und Fachverbänden. Teil I: Soziologie und Psychologie. Zeitschrift für Evaluation (ZfEv), 1(2003), 131–142.

Meyer, W. (2003b). Evaluation als Thema in sozialwissenschaftlichen Berufs- und Fachverbänden. Teil II: Erziehungs- und Politikwissenschaften. Zeitschrift für Evaluation (ZfEv), 2(2003), 323–336.


Meyer, W. (2015a). Toward professionalization? The contribution of university-based training programs in pioneer countries. In R. Stockmann & W. Meyer (Eds.), The future of evaluation: Global trends—New challenges—Shared perspectives (pp. 98–112). Basingstoke: Palgrave Macmillan.

Meyer, W. (2015b). Professionalisierung von Evaluation—ein globaler Blick. Zeitschrift für Evaluation (ZfEv), 14(2), 215–246.

Meyer, W., Funk, E., & Nguyen, L. (2016). Participation and valuation—Ongoing methodological challenges. In R. Stockmann & W. Meyer (Eds.), The future of evaluation: Global trends—New challenges—Shared perspectives (pp. 294–308). Basingstoke: Palgrave Macmillan.

Meyer, W. (2017). The global state of evaluation as a profession: Some results. Paper presented at the 12th EES Biennial Conference: Evaluation Futures in Europe and beyond. Connectivity, Innovation and Use. Maastricht, the Netherlands, 28–30 September 2016.

Meyer, W., Naidoo, I., D'Errico, S., Hofer, S., Bajwa, M., Tello Pérez, L. A., El-Saddik, K., Lucks, D., Simon, B., & Piergallini, I. (2018, January). VNR reporting needs evaluation: A call for global guidance and national action. IIED Briefing. https://pubs.iied.org/17446IIED. Accessed on 13 May 2019.

Misereor. (2016). Was wirkt? Dem Gelingen auf der Spur. Jahresevaluierungsbericht 2016. Aachen: Misereor.

Müller, P. (2011). Performance Audit und Evaluation. Gemeinsamkeiten und Unterschiede zwischen den Prüfansätzen des Bundesrechnungshofes zur Aufgabenwahrnehmung der Bundesagentur für Arbeit und arbeitsmarktpolitischen Evaluationen. Zeitschrift für Evaluation, 10, 17–38.

Nationaler Normenkontrollrat. (2015). Chancen für Kostenbegrenzung verbessert. Digitale Chancen tatsächlich nutzen! Jahresbericht 2015 des Nationalen Normenkontrollrates. Berlin.

Neurath, K. (2016). Tarifautonomie und das Tarifeinheitsgesetz. Wiesbaden: Springer.

OECD. (2006). DAC in dates—The history of OECD's development assistance committee. http://www.oecd.org/dac/1896808.pdf. Accessed on 13 January 2019.

OECD. (2016). Evaluation systems in development co-operation: 2016 review. Paris: OECD. http://www.oecd.org/dac/evaluation-systems-in-development-co-operation-9789264262065-en.htm. Accessed on 13 January 2019.

PHINEO. (2015). Selbstevaluation der PHINEO gAG. Berlin: PHINEO.

PHINEO. (2016). Kursbuch Stiftungen. Förderprojekte wirkungsorientiert gestalten. Berlin: PHINEO.

Priemer, J., Krimmer, H., & Labigne, A. (2017). ZiviZ-Survey 2017. Vielfalt verstehen. Zusammenhalt stärken. Berlin: Bertelsmann Stiftung. https://www.bertelsmann-stiftung.de/fileadmin/files/user_upload/ZiviZ_Survey2017.pdf. Accessed on 3 April 2019.


Puhani, P. A., & Sonderhof, K. (2011). Evaluation sozialpolitischer Reformen. Zeitschrift für Arbeitsmarktforschung, 44, 205–213.

Pürer, H. (2015). Medien in Deutschland. Presse – Rundfunk – Online. Konstanz and München: UTB.

Schmidt, U. (2009). Evaluation an deutschen Hochschulen – Entwicklung, Stand und Perspektiven. In T. Widmer, W. Beywl, & C. Fabian (Eds.), Evaluation. Ein systematisches Handbuch (pp. 163–169). Wiesbaden: Springer.

Schuster, U. (2017). Das Kohärenzprinzip in der Europäischen Union. Baden-Baden: Nomos.

Sebaldt, M., & Straßner, A. (2004). Verbände in der Bundesrepublik Deutschland. Eine Einführung. Wiesbaden: Springer.

Simon, D. (2014). Wie viel Innovation erlaubt das deutsche Wissenschaftssystem? Systemübergänge, institutionelle Settings und die Wissenschaftsgovernance. In M. Jostmeier, A. Georg, & H. Jacobsen (Eds.), Sozialen Wandel gestalten. Zum gesellschaftlichen Innovationspotential von Arbeits- und Organisationsforschung (pp. 33–44). Wiesbaden: Springer.

Simon, B., & Meyer, W. (2017). Agenda 2030 – bleibt die Evaluation zurück? Ein Blick auf die ersten freiwilligen nationalen Berichte. Zeitschrift für Evaluation, 16(1), 148–159.

Simon, B., Meyer, W., D'Errico, S., Schwandt, T., Lucks, D., Chen, Z., et al. (2017). Evaluation: A missed opportunity in the SDGs' first set of Voluntary National Reviews. IIED Briefing. https://pubs.iied.org/17423IIED/. Accessed on 16 May 2019.

Speer, S., Pattyn, V., & De Peuter, B. (2015). The growing role of evaluation in parliaments: Holding governments accountable? International Review of Administrative Sciences, 81(1), 37–57.

Ständige Arbeitsgruppe Deutsch als Fremdsprache (StADaF). (Eds.). (2006). Deutsch als Fremdsprache weltweit. Datenerhebung 2005. Berlin, Bonn, Köln, and München.

Statistisches Bundesamt. (2016). Statistisches Jahrbuch 2016. Deutschland und Internationales. Wiesbaden: Statistisches Bundesamt.

Stockmann, R. (Ed.). (2006). Evaluationsforschung: Grundlagen und ausgewählte Forschungsfelder. Münster: Waxmann Verlag.

Stockmann, R. (2012). Von der Idee zur Institution. Institut für deutsche Entwicklungsevaluierung gegründet. Zeitschrift für Evaluation, 11(1), 85–93.

Stockmann, R. (2013). Evaluation in der Entwicklungszusammenarbeit. In J. Wilhelm & H. Ihne (Eds.), Einführung in die Entwicklungspolitik (3rd ed., pp. 541–562). Berlin: LIT Verlag.

Stockmann, R., Menzel, U., & Nuscheler, F. (2016). Entwicklungspolitik. Theorien – Probleme – Strategien (2nd ed.). München: De Gruyter Oldenbourg.

Stockmann, R., & Meyer, W. (2014). Evaluation: Eine Einführung (2nd ed.). Leverkusen: UTB Verlag.


Stockmann, R., & Meyer, W. (Eds.). (2016). The future of evaluation: Global trends, new challenges, shared perspectives. Basingstoke: Palgrave Macmillan.

Stockmann, R., Meyer, W., & Schenke, H. (2011). Unabhängigkeit von Evaluationen. Zeitschrift für Evaluation, 10(1), 39–67.

Stockmann, R., & Willms-Herget, A. (1985). Erwerbsstatistik in Deutschland: Die Berufs- und Arbeitsstättenzählungen seit 1875 als Datenbasis der Sozialstrukturanalyse. Frankfurt: Campus.

Stoetzer, M. W., & Watzka, K. (2017). Die Akkreditierung von Studiengängen in Deutschland: Ein Instrument zur Qualitätssicherung? (Jenaer Erklärung zur Akkreditierung, Jenaer Beiträge zur Wirtschaftsforschung, No. 2017/1).

Weißer, U. (2017). Erfolgsmodell Soziale Marktwirtschaft. Das System, die Akteure und ihre Interessen verstehen. Wiesbaden: Springer.

Widmer, T., Beywl, W., & Fabian, C. (Eds.). (2009). Evaluation: Ein systematisches Handbuch. Wiesbaden: Springer.

Wollmann, H. (1994). Implementationsforschung/Evaluationsforschung. In J. Kriz, D. Nohlen, & R.-O. Schultze (Eds.), Lexikon der Politik: Politikwissenschaftliche Methoden (2nd ed., pp. 173–178). München: Beck.

Wollmann, H. (1998). Modernisierung der kommunalen Politik- und Verwaltungswelt – zwischen Demokratie und Managementschub. Lokale Verwaltungsreform in Aktion: Fortschritte und Fallstricke. Stadtforschung aktuell, 68, 400–438.

Wollmann, H. (2013). Die deutsche kommunale Selbstverwaltung. Auslauf- oder zukunftsfähiges Politik- und Handlungsmodell? Ein Kommentar. In H. Bauer, C. Büchner, & F. Brosius-Gersdorf (Eds.), Starke Kommunen in leistungsfähigen Ländern: der Beitrag von Funktional- und Territorialreformen (pp. 195–206). Potsdam: Universitätsverlag Potsdam.

Wollmann, H. (2017). Entwicklungslinien von Evaluation und Evaluationsforschung in Deutschland. Zwischen Vorgeschichte, Aufbruch und Konsolidierung. Zeitschrift für Evaluation (ZfEv), 2(2017), 33–53.

CHAPTER 8

Switzerland

Thomas Widmer

1 General Country Overview

1.1 Institutional Framework of the Swiss Political System

The idiosyncrasies of the Swiss political system have been described many times before; whether Switzerland qualifies as an exceptional case will not be decided here (but see the discussions in Klöti et al. 2007; Kriesi and Trechsel 2008; Linder 2010; Sager and Zollinger 2011; Bochsler et al. 2015). The key characteristics of the Swiss political system are unique, if not downright peculiar, and for the purposes of what follows, certain elements are particularly significant, and have remained largely stable throughout the period under consideration (on this, see Widmer 2017 and 2007, and the literature cited there):

Federalism: Switzerland is, and has been, shaped by a pronounced three-tiered federalism, with the centre of power located at the middle tier, the cantons. The national level has competency to act only in those areas where it is (and has been) granted the authority to do so by a majority of the cantons and a majority of the voters. Furthermore, the lowest level



of government, Swiss communities, towns and cities, has a high degree of autonomy, manifested most clearly in their power to set their own local tax rates. There is a lack of congruence between language areas and denominational boundaries, helping to facilitate the reconciliation of interests between regions and making a clear territorial anchoring of political lines of conflict difficult. Given the heterogeneity of the Swiss population, composed of four language groups (71.1% of the population speak German, 24.1% French, 4.5% Italian, and 0.3% Romansh), this is of crucial importance in ensuring social peace in a Switzerland united only by voluntary association (Willensnation).

Semi-direct democracy: Elections as instruments of representative democracy are supplemented in Switzerland by direct democratic plebiscites on substantive issues, at all three levels of government. These plebiscites have a binding character and directly determine the content of constitutions or laws. At the local level in smaller communities, especially in German-speaking Switzerland, as well as in the two remaining cantonal assembly (Landsgemeinde) cantons of Appenzell Innerrhoden and Glarus, one also finds institutionalised forms of popular participation in citizens' assemblies; these exist alongside ballot voting.

Concordance democracy: Political decisions in Switzerland are extremely carefully crafted to achieve a broad-based consensus, due to the extensive direct democratic mechanisms available to Swiss voters which can block such decisions: "Therefore policy making in Switzerland is orientated toward gathering support more than toward gathering evidence" (Widmer and Neuenschwander 2004, p. 391). As a rule, the time-consuming search for consensus leads not only to drawn-out decision-making processes but also to considerable involvement on the part of a large circle of interested actors (or 'stakeholders'). The benefit is that once a decision has finally been reached, it tends to remain (comparatively) constant.

Multi-party governance: Multi-party governments exist at all three levels, owing to the aforementioned demands for consensus. These multi-party governments, as a rule, are 'oversized coalitions' which represent a very high proportion of the voters. This is true not only for the local and cantonal governments, ordinarily directly elected, but also for the national executive (Bundesrat), indirectly elected by the bicameral Federal Assembly. Political executives are organised as collegial authorities, their heads merely primus or prima inter pares who lack authority over their fellow members of the executive.


Segmented structure of the executive: The administration is led by the government (or elected political executive), which is organised as a collegial authority. There is no powerful head of government (see above) and there is no governmental programme as such. For these reasons, administrative units have considerable decision-making discretion. Under such conditions, even negative coordination places high (often excessively high) demands, and positive coordination between departments is extremely rare at all levels of the federal state (see Scharpf 1994).

Weakness of the legislative branch: In principle, parliaments in Switzerland have a comparatively weak power base, not least because the governing parties in the parliaments have clear majorities and are not interested in (further) strengthening (their) parliamentary control. Even though the legal powers or competencies of parliaments have been significantly increased since the mid-1990s, especially at the federal but also at the cantonal level, the resources available to the legislatures (notably in terms of personnel and financial resources) remain low compared with those available to the executive. The 'militia' parliament continues to play a significant role here, though this tradition may be more respected than actually practised (Bundi et al. 2017). Added to this, and particularly pronounced at the subnational level, are the limits placed on legislative power by direct democratic instruments and assembly democracy practices.

Lack of constitutional jurisdiction: There is no judicial authority at the national level which can rule on the constitutionality of federal laws, let alone on popular initiatives or other legal objects. The judiciary is thus structurally weaker than in comparable political systems.

Contradictions in foreign relations: As a small state, and despite its general restraint and continued adherence to the principles of non-intervention and neutrality, Switzerland is strongly economically interdependent. This contradiction is most evident in the ongoing domestic political disagreements about what Switzerland's relation to the EU should be.

It is with this institutional background in mind that I present the four historical phases of the development of evaluation practices in Switzerland.

1.2 A Brief History of Evaluation in Switzerland

1.2.1 The Pioneer Phase (Until 1990)

Unlike in political systems which early on developed a social science-based, applied research practice, owing to structural changes to the economy and the expansion of the welfare state, the minimal Swiss state barely engaged


in critical reflections about social and political developments during economically prosperous decades. In a state without major problems, there was little demand for problem-solving social technologies. It is therefore hardly surprising that evaluation, as an instrument for measuring the ability to solve problems, was not addressed in Switzerland during much of the post-war period. Federalism and direct democracy strongly limited the expansion of the central state (see Sect. 1.1 above, as well as Widmer 2016), and the resulting minimal, if liberal, state thus did not need social scientific expertise to measure the effectiveness of (what were largely non-existent) policies. The Swiss political system, spoiled by many years of economic success and spared serious social challenges, simply had no use for evaluation. The small scale of Swiss federalist practices and the many possibilities for direct democratic participation further limited the demand for feedback to the state (Bussmann 1995, 1996). Even the 1970s oil crises did not change much, since Switzerland at the time was primarily concerned with developing appropriate policies, not assessing them. The subsequent economic recovery, which unlike in many other industrialised countries soon led to full employment without structural unemployment, also meant that interest in governmental intervention (and its effectiveness) soon faded. It is thus hardly surprising that into the late 1980s, few evaluations were carried out in Switzerland; they occurred sporadically and were barely related to one another. These early, isolated efforts can be observed

• first, in education policy, a realm of social service delivery well-developed in Switzerland,
• second, in newer, innovative policy areas such as environmental protection, which became increasingly prominent by the mid-1980s, and
• third, in development cooperation (called development aid at the time), which was traditionally strongly internationally linked and exposed, but at the same time faced (and still faces) legitimacy questions.

In this pioneer phase before 1990, one can hardly find any programmatic or structural frameworks in the area of evaluation. When a need became evident, an evaluation was 'spontaneously' carried out, such as of the comprehensive school at Dulliken (Jenzer 1981) or of Swiss development aid in Nepal (Geschäftsprüfungskommission des Nationalrates 1984). In the


manner of what one sees in shops and restaurants nowadays, one might call such efforts 'pop-up evaluations'. A survey of federal policy areas up to the late 1980s identified about 250 studies or publications deemed to contain evaluations, but the descriptions make clear that many of these were not evaluations in the narrower meaning (Zimmermann et al. 1989). If one examines the design of evaluations at the time, the various individual efforts—some rather homespun—were strongly anchored in the perspective of a particular policy area. One does not find an overarching perspective, cross- or interdisciplinary approaches, or references to the private sector. One might best characterise this as the evaluation as art era.

1.2.2 The Establishment Phase (1990–2000)

By the late 1980s and early 1990s, the evaluation scene in Switzerland became considerably more dynamic. Looking back, two developments can be seen as pointing the way forward both structurally and programmatically:

1. Establishment of the Working Group on Evaluating Laws (AGEVAL) within the Federal Department of Justice and Police: The group was "charged with proposing appropriate measures to strengthen the prognosis and assessment of the impact of legislation and to examine their possibilities of implementation" (AGEVAL 1991, p. 4). To this end, various studies were conducted from 1987 to 1991 to establish the status quo (Zimmermann et al. 1989; Klöti and Schneider 1989) and five evaluation studies tested the suitability of the approach (AGEVAL 1991). From this, the AGEVAL developed a series of proposed measures to strengthen evaluation at the federal level (1991).

2. Creation of a 'National Research Programme on the Effectiveness of State Measures': The Federal Council commissioned the Swiss National Science Foundation to launch a National Research Programme on the 'Effectiveness of State Measures'. As a result, a large number of full evaluations, brief evaluations and meta-evaluations were carried out from 1990 to 1995 to test an evaluation instrument and its performance. This led to numerous publications, but above all gave significant impetus to evaluation practice (Bussmann et al. 1997; Reber and Simon 2007).


These two programmatic activities culminated in 1996 in the founding of the Swiss Evaluation Society (SEVAL). This specialist association has exercised decisive influence on the further development of evaluation, and from the outset has focused on federal-level activities (see below Sect. 2.4). Evaluation developed significantly in the 1990s, especially with respect to the federal level. This is reflected by a study which examined evaluation activities (at the level of individual studies), the evaluation activities of selected federal offices, and evaluation in the inter- and trans-departmental realms (Widmer et al. 2001; Spinatsch 2002; Widmer and Neuenschwander 2004). This dynamic development came to an end owing to two developments, both of which (at different levels) represented codifications, and both proved essential for subsequent developments:

1. The Federal Constitution was completely revised in the late 1990s and came into force in 2000. It included a new article about evaluation, which was a deviation from the fundamental principle that this revision of the Constitution only constituted an updating and a formal systematisation but was not meant to introduce any material changes. The new Article 170 states that parliament shall ensure that federal measures taken will be evaluated with respect to their effectiveness.

2. As the first professional society outside North America to do so, SEVAL codified Swiss evaluation standards in the spring of 2001 (Widmer et al. 2000; see below Sect. 2.4).

This phase can be described as the evaluation as craft era. The creative potential of evaluation began to be realised, and with increasing competence in its use, evaluation also became more efficient and meaningful, as reflected in the corresponding codifications.

1.2.3 The Mature Phase (2000–2010)

Article 170 did not have a strong or direct influence on evaluation activities; evaluation clauses in laws and regulations (Bussmann 2005, see below) had more direct impact. Yet the constitutional article was not without consequence, especially in the federal administration. While the wording of the article makes clear parliament is responsible for evaluation, it was actually an 'Inter-departmental Impact Assessment Contact Group' established by the Conference of General Secretaries in the federal administration in


May 2002 which was tasked with implementing Article 170 in the executive branch (IDEKOWI 2004). After considerable discussion, this group crafted a report which made recommendations in the areas of 'Measures to be taken in Federal Offices', 'Measures to be taken in Departments and the Federal Chancellery', 'Measures at the level of the Federal Council', 'Measures to be taken by cross-agency bodies', and the 'Relationship between Parliament, Executive, and Financial Control' (IDEKOWI 2004). The Federal Council largely followed these recommendations (Widmer and De Rocchi 2012) and in its annual report to parliament provided information about the implementation of these measures. The results have been mixed. Though some federal agencies by now have a highly developed culture of evaluation, it has not (yet) been possible to implement Article 170 in such a manner as to reach every federal agency and department. Analyses of evaluations actually carried out thus far tend to confirm this view (Balthasar 2007; cf. Widmer 2015). On the other hand, the supply of evaluation has developed significantly since 2000. Organisations commissioned to carry out evaluations, which include both smaller and larger evaluation bureaus and offices as well as evaluation units at universities and other institutions of higher education, accumulated extensive experience in conducting evaluations in this era. The knowledge gathered ensures a steadily more competent, if at the same time increasingly standardised, handling of evaluation mandates. However, providers of evaluation services with field-specific expertise but without sound knowledge of evaluation are still active in the field. Some commissioners with limited evaluation know-how continue to mandate such 'laypersons' who lack the appropriate competence to conduct evaluations (Widmer and Beywl 2009). Overall, one can call this the evaluation as manufacturing phase. Evaluations are increasingly being carried out, and in larger numbers, on a (more or less) clearly structured basis.

1.2.4 The Phase of Bureaucratisation (Since 2010)

Since 2010, evaluation activities in Switzerland have continued to become more standardised. The established evaluation providers—primarily private sector offices, and less and less often evaluation offices at higher education institutions—implement standardised evaluation products in large numbers, on tight schedules and with limited resources. Evaluation contractors, mostly at the national but also at the cantonal level, use existing frameworks


8

SWITZERLAND

207

Overall, one can see this as the evaluation as industry phase. The processes have become highly standardised and normed, evaluations are conducted according to predefined processes, and are streamlined for efficiency. Creativity and innovation are avoided as they are seen as too risky.

2 Structures and Processes of Evaluation in the Political System

2.1 Evaluation Regulations

At both national and cantonal levels, there are numerous legal obligations in Switzerland to carry out evaluations (Mader 2015). According to a compilation assembled by the Federal Office of Justice (Bundesamt für Justiz 2016), federal law contains more than one hundred corresponding regulations. One can distinguish between three types of formal clauses (Bussmann 2005):

• Overall evaluation clauses: these are general and not restricted by object.
• Institutional evaluation clauses: these authorise or require a particular public authority to carry out evaluations.
• Subject-oriented evaluation clauses: these oblige public authorities to check whether a particular decree or sub-section of a decree is effective.

One should also distinguish between the temporal aspects of an evaluation:

• Prospective evaluation clauses require an estimate of the assumed effect of a draft measure, and are meant to help optimise it before it is adopted.
• Retrospective evaluation clauses require the evaluation of what has already been enacted and are thus to be understood both as an instrument for improving implementation and as one for accountability.

If one looks at the existing evaluation clauses in federal law, it becomes clear that the overwhelming majority are subject area-oriented and retrospective clauses. In my experience, these also have the strongest


influence on actual evaluation practice in the federal administration. However, the large number of sectoral evaluation clauses has also attracted criticism, since it would be more economical to have a cross-sectional regulatory framework (Mader 2009). A general clause of this type does exist in the Swiss constitution (Mader 2005; Widmer 2007):

Art. 170 Evaluation of Effectiveness
The Federal Assembly shall ensure that the federal measures are evaluated with regard to their effectiveness.

Accordingly, the constitutional mandate is addressed to the legislative branch (parliament) and not, or not directly, to the federal executive. The text also does not say the legislature is to do so itself, only that its mandate is to ensure that it is done; in interpreting this article, the inevitable question is what is included in the category 'measures taken by the Confederation'. Luzius Mader (2005) argues that a broad interpretation of terms like 'effectiveness' (Wirksamkeit) and 'evaluated' is probably appropriate. The differing formulations which can be found in the German, French and Italian versions of the Swiss constitution, all of which are equally valid, also make it clear that the terms used in Article 170 are more colloquial than technical. Because of this open formulation, and because the institution being addressed is the legislature, this constitutional article has not, in my opinion, had much actual influence on evaluation practice—in contradistinction to the importance it has been accorded in the literature. Its influence has been primarily felt in the foundations and strategies of evaluation work (see, e.g. IDEKOWI 2004). Far more relevant for practical evaluations have been the sector-specific clauses, those oriented to institutions, or those mandated under specific laws. Prospective evaluation clauses have in any case been rare. One can find an example in the Federal Act on the Federal Assembly (Art. 141; SR 171.10), which obliges the Federal Council to examine draft decrees with regard to their expected effects on the economy, society, the environment, and on future generations—and particularly on gender equality. The regulatory impact assessments (RIA) based on this provision concentrate particularly on the assumed economic impact (WBF 2013). Numerous studies have shown that these RIAs have thus far not been carried out in such a manner as to fully realise their potential. The timing turned out to


be particularly problematic: if an RIA was carried out early in the legislative process, it could not take the final version of the respective bill into account, but if it occurred late, then there was little willingness to take its findings into account (EFK 2017; Schlegel 2015; Allio 2011; Sager and Rissi 2011). One thus again encounters a well-known problem: the tension between the summative and the formative functions of evaluations (Widmer 2008a, p. 281). Care should be taken, however, not to reduce prospective evaluations to a purely instrumental use and thereby naively ignore, for example, pre-emptive effects (Rissi and Sager 2013). Numerous evaluation clauses exist at the cantonal level as well (Wirths et al. 2017; Mader 2009). Cantons, similarly to the federal level, have clauses of different types and they are anchored differently, some in the cantonal constitution. However, all cantons have at least one evaluation clause in their laws. It is notable that, beyond the large cantons of Bern and Zurich, it is predominantly in the French-speaking cantons that one finds numerous evaluation clauses. It is rarer to find them in the more rural, German-speaking areas, especially in central and eastern Switzerland. Thematically, one finds them most frequently in the areas of education, health and social welfare, policy areas in which the cantons have chief responsibility (Wirths et al. 2017).

2.2 Evaluation Practices

2.2.1 The Federal Level

The question, given these regulations and frameworks, is which practices have evolved at the federal level. This has been the topic of a number of significant studies, the conclusions of which can be briefly summarised here. The earliest inventory of evaluation activities of which I am aware dates from the late 1980s (Zimmermann et al. 1989) and was initiated by the aforementioned Working Group in the Federal Department of Justice and Police (AGEVAL 1991). This review covered the period from 1978 to 1989 and was based on database searches, enquiries at federal offices, and information provided by evaluators. This resulted in information about 168 studies, though one cannot assume this was a comprehensive census or a representative sample of all evaluations carried out during this time period. Indeed, a very broad understanding of evaluation was used, such that the list includes many studies I would not call evaluations.


The Federal Office of Justice continued to collect similar information subsequently, so that the inventory for the legislative period from 1991 to 1995 listed a total of 534 evaluations, again using a very broad understanding of what an 'evaluation' is (Bundesamt für Justiz 1996). An inventory carried out by an academic, covering the period from 1999 to 2002, 'only' identified 308 evaluations (Balthasar 2007, 2009a), but this apparent decrease was primarily due to a more stringent definition of what counted as an evaluation (Balthasar 2007). The survey of federal-level evaluation studies has since been extended up to 2015, and it lists a total of 1,414 evaluations, though no clear temporal trend was discernible (Balthasar 2015; Balthasar and Strotz 2017). However, the figures also make clear that practices differ considerably among federal agencies and offices. Some larger federal offices (including the Federal Chancellery) carried out no evaluations at all from 1999 to 2015, while others evaluated quite frequently. These included the Federal Office of Public Health (169), the Swiss Agency for Development and Cooperation (163), the State Secretariat for Economic Affairs (135) and the State Secretariat for Education, Research and Innovation (131) (these figures are for 2000–2014; see Balthasar and Strotz 2017). The large number of subject-orientated, retrospective evaluation clauses in Swiss federal law, and the initiative of many actors within the federal administration, has ensured that the evaluation of federal measures of all kinds is quite widespread today. Nevertheless, and as in the past, there are federal activities which are barely subject to evaluation. One of these gaps exists in foreign policy (other than development cooperation) (Widmer 2008b). The responsibility for a major part of current evaluation activities at the federal level lies with the federal offices themselves. The supervisory bodies, the Swiss Federal Audit Office, which carried out 56 evaluations from 2000 to 2014, and the Parliamentary Control of the Administration (43 evaluations), do conduct evaluations on a continuous basis, but they are greatly outweighed in numerical terms by the evaluation studies of the federal agencies and offices (Balthasar and Strotz 2017). With this in mind, it is not surprising that evaluation activities vary considerably between the federal offices, not just quantitatively but also qualitatively. The original reasons for introducing evaluations in the federal offices also differ strongly, each federal office works within quite diverging premises and frameworks, and the range in the orientation and extensiveness of evaluation is considerable, even if one can discern some tendencies towards


convergence and a unification of practices (Widmer et al. 2001; Jacob and Varone 2003; Widmer and Neuenschwander 2004; CDF 2005; Balthasar 2007; Widmer and Beywl 2009; Balthasar and Strotz 2017). Diversity in the practices should not be criticised per se; it may well be an adequate expression of a situationally differentiated set of conditions. However, based on the results of certain studies, one may need to conclude that a significant portion of the variety is due rather to factors extraneous to the subject of the evaluation. They include administrative power considerations, personal preferences and contacts, and well-trodden routines. There is also considerable range with respect to the orientation of evaluations, and the notion that all the functions of an evaluation can be simultaneously taken into account is still widespread. The functional differentiation of evaluation activities thus far has only been able to establish itself selectively in the federal administration (Widmer et al. 2001; Widmer and Neuenschwander 2004; Widmer 2008a). The lack of functional differentiation means the quality of the evaluations conducted also suffers; the demands made on them often diverge, and the objectives become overburdened. While the existing evaluation clauses increase the scope of evaluation activity, they cannot guarantee that the evaluation mandate is fulfilled at a high-quality level. On the contrary: the quality of the evaluations may sink if (or when) evaluations are carried out solely with the intent of complying with a legal stipulation.

2.2.2 Cantons (and Communities)

Little systematic information is available about evaluation activities at the sub-national level in Switzerland. One can also assert—without great risk—that this is also true of Swiss communities. However, this does not mean that we know nothing about evaluation at these levels. Balthasar and Rieder's survey showed that from 2003 to 2008, the 26 cantons carried out about 40 evaluations per year. These evaluations were also somewhat unevenly distributed: a few cantons (Zurich, Bern and Geneva) carried out six or more per year, while many others rarely conducted evaluations at all (they were particularly rare in Appenzell Ausserrhoden, Fribourg, Glarus, Jura and Thurgau). Of the 242 evaluations identified, the largest numbers were in education (108), with relatively few in health (23) or administrative reform (7) (Balthasar and Rieder 2009).


A more recent online databank,¹ covering the years from 2000 to 2012 but focused on select political units (the cantons of Bern, Geneva, and Zürich, along with the federal level) and policy areas (education, energy, health), includes 657 evaluations (Frey et al. 2017). The largest number were found in the health sector (290), followed closely by education (258), with energy policy evaluations (126) considerably less common (Frey et al. 2017, p. 73). In this databank, more than half the evaluations were at the federal level (364), followed by the cantons of Zurich (148), Geneva (94) and Bern (70)—and only 8.4% of the evaluations were prospective in nature. The executives, both government and administration, were the dominant contractors (82.5% of all evaluations included); parliaments (2.6%) hardly commissioned any (Frey et al. 2017). In addition to these broader surveys, various analyses exist which focus on specific policy areas:

• Education (largely a cantonal responsibility). Despite the many differences between the cantons, one can identify a number of overarching tendencies, of which three are particularly noteworthy. One is a movement towards a greater autonomy of individual educational institutions, from elementary schools to universities, in which traditional forms of political and bureaucratic tutelage are replaced (or supplemented) by more professionalised systems. As part of this development, evaluation instruments—at least nominally—are increasingly being used. On the one hand, these systems bear close similarities to earlier school supervising authorities (Schulaufsicht), and thus have a controlling character. On the other hand, they are close to organisational development and quality management approaches. A similar development can be seen in higher education institutions that have established their own evaluation systems. Here one can take note of accreditation procedures, though they have (to date) played a rather subordinate role in evaluation systems (Rhyn 2009; Widmer 2008a). At the cantonal level, evaluations in the education sector are more often used as an instrument to evaluate project efforts. At issue here, often, is the extent to which an educational innovation introduced in selected units on a trial basis should be employed permanently and comprehensively across that canton (e.g., see Rhyn et al. 2002;

1 For more information, see www.syneval.ch.


Stamm 2003). Evaluation—again, at least nominally—is also increasingly found in conjunction with cross-national measurements of pupil performance. International surveys of this kind, including PISA and TIMSS, receive considerable political and media attention in Switzerland, as in the other countries surveyed. In Switzerland, however, such measurements are carried out not by the individual cantons but at the federal level (Bundesamt für Statistik) or through the Conference of Cantonal Ministers of Education. It is a matter of some controversy whether such measures of current levels of learning and understanding among pupils should be called evaluations; in my view, these are less evaluations than monitoring.

• A recent study comparing the development of health and education policies from 2000 to 2013 in the cantons of Basel-City, Bern and Lucerne has drawn a distinction between institutionalisation, practice, and attitudes and has proposed that one can speak of a 'culture of evaluation'. In anchoring such a culture, considerable changes can be observed over time which have often (though not always) led to a strengthening of this 'culture of evaluation' (Dolder et al. 2017).

• A number of studies at the cantonal level have looked at administrative reform in the context of New Public Management. Haering (2002), for example, has summarised the evaluations of administrative reform in the cantons of Aargau, Lucerne and Zurich. Wüest-Rudin (2002) examined the embedding of evaluation in reforms of public administration, including at the cantonal level. Widmer (2002) analysed the conceptual orientation, in six cantons, of evaluations which dealt with New Public Management. Balthasar and Rieder (2009) noted that a total of ten cantons have had their administrative reforms evaluated, while Paganini (2017) has recently compared the substantive orientation in evaluations of reforms in the cantons of Aargau, Bern, Grisons, Lucerne, Solothurn, Zug and Zurich.

Information about evaluation practices at the local level is even more limited than what is known about the cantonal level. While a few evaluation studies of local situations are well-known, a systematic overview of the local level remains to be written. Not surprisingly, the larger cities are more active in evaluating, while smaller communities (with very few exceptions) lack the capacities, resources, and incentives to launch evaluations. The piecemeal information about evaluation activity at the cantonal level, and the even more limited information about what communities have undertaken, show


there is a clear need for action if one intends to pursue evidence-based policy-making.

2.3 Evaluation Utilisation

2.3.1 Public Administration

Considerable research has been conducted on the use of evaluation in and by the public administration. It is safe to say that since the late 1980s, no research topic in the area of evaluation has enjoyed greater attention (Klöti and Schneider 1989; Widmer et al. 2016). A recent survey of the current state of research (Frey and Ledermann 2017, p. 216) estimates there are around twenty empirical studies of Switzerland (e.g., see Balthasar 2006, 2009b; Balthasar and Müller 2014; Balthasar and Rieder 2009; Frey 2010; Frey and Widmer 2011, 2013; Ledermann 2012, 2014; Nideröst 2002; Rissi and Sager 2013; Sager and Rissi 2011; Stamm 2003; Widmer and Neuenschwander 2004). While they do not all restrict themselves to the use of evaluation by and in public administration bodies, it is often the central focus. Procedurally speaking, these are frequently studies of the use of evaluation in formulating laws, programmes, or other policies, or they look at implementation (Frey and Ledermann 2017). Since the pioneering work of Carol Weiss (including 1977, 1981, 1983), there has been continued international interest in the usage of evaluation in the specialist literature. This has led to a strong theoretical foundation, comparatively speaking, in empirical research (Kirkhart 2000; Henry and Mark 2003). According to Frey and Ledermann, there is occasional systematic use of evaluation in policy formulation. More common is its symbolic use so as to try to increase the attractiveness of legislative bills, whether by the executive or by administrative bodies. In designing policy implementation, evaluation findings can play a certain role, particularly in the administration. As to what determines the use of evaluation, these authors found that clarification of functions and the capacities of the responsible agencies were the most frequently cited reasons (Frey and Ledermann 2017).

2.3.2 Parliaments

A considerable gap in the research literature long existed with respect to the role evaluation played in Swiss parliaments. More recent work has led to a number of findings which have been summarised in work done by Eberli and Bundi (2017).


Evaluation as an instrument is highly institutionalised in Swiss parliaments (Eberli and Bundi 2017; Jacob et al. 2015). At the federal level, the Parliamentary Control of the Administration is of central importance, even though the evaluation studies under its aegis have been few compared to the number produced in and by the federal administration (as noted above). The activities of the Parliamentary Control of the Administration also do not give a complete picture of the role evaluation plays in the Swiss federal legislature. In addition to their role as evaluation producers, parliaments play an essential role as both initiators of evaluations and as utilisers of evaluations. Evaluations can be initiated in various ways. One is by including evaluation clauses in suggested draft legislation and tasking the executive with then carrying them out. Another is to trigger evaluation activity as part of a parliamentary request which directly asks the executive to do so. A third is by calling on the executive to provide information, thereby indirectly leading the executive to conduct evaluations in order to be able to respond adequately to the parliamentary query. As for utilisation, members of parliament may use evaluation findings in their legislative activity, whether to give a substantive basis for a legislative proposal or to influence the political acceptance of such a proposal—either positively or negatively. A survey conducted among members of the 26 cantonal parliaments as well as both chambers of the federal parliament indicated that members of parliament were more likely to use parliamentary requests as the means of initiating evaluations (Bundi et al. 2014; Eberli et al. 2014). More rarely will they do so using evaluation clauses. As a rule, these are introduced into legislation by the executive or the administration (Eberli and Bundi 2017; EFK 2011; Wirths et al. 2017). Members of the Federal Assembly, meaning both houses of parliament, most often use the form of an interpellation in order to initiate an evaluation by means of a parliamentary request (Bundi et al. 2018). Members of parliament call for evaluations above all to exercise their oversight powers vis-à-vis the executive and administration (Bundi 2016). Those members of parliament with many years of experience, along with those in leadership positions, call more often for evaluations than do their fellow parliamentarians (Bundi 2016). They do so especially in policy areas where legitimation pressure is particularly high, and where the provision of public services is delegated to third parties (Bundi 2018b). While earlier studies have also addressed the utilisation of evaluations in parliament (e.g. Balthasar 2007, 2009a, b; Frey 2010, 2012; Frey and Widmer 2011, 2013), Daniela Eberli's recent work has focused exclusively on it.


Her work shows that, as part of the law-making process, evaluations are used both analytically (to improve the draft law or proposal) and politically (in support of one's own position). She also points to the considerable significance evaluations are accorded as evidence in parliamentary commissions (Eberli 2018). Further, her results show that evaluation utilisation in the pre-parliamentary process (by public servants) shapes utilisation during the parliamentary phase (Eberli 2018). Nevertheless, it is also clear that it is far from possible to speak of a comprehensive, systematic use of evaluation. Rather, use is selective, both in terms of which evaluation results are cited and which parliamentarians use evaluations (Eberli 2019). It is somewhat dispiriting that evaluations are called for considerably more often than their results are actually used (Bundi 2018a).

2.3.3 Public Discourse

The public in Switzerland has quite good access to evaluation reports. This applies in particular to evaluations of the federal administration, which must generally be accessible to the public in accordance with the Federal Publicity Law (BGÖ; Article 8, Section 5). As noted at the outset, direct democracy ranks high in Switzerland, and the Swiss population can use its plebiscitary rights to make decisions about many politically controversial issues. For this reason, a substantial part of public discourse about politics is devoted to the content of referenda and popular initiatives. This has prompted some scholars recently to ask what role evaluation findings play in initiative and referenda campaigns (Stucki and Schlaufer 2017; Schlaufer 2018; Schlaufer et al. 2018). The evidence provided by evaluations appears to play a very marginal role, at least in the reporting of the mass media (Stucki and Schlaufer 2017, p. 286). Nevertheless, such evidence can strengthen arguments made in public debates; experts are of central importance in communicating evaluation findings; and the use of evidence helps strengthen credibility (Stucki 2016, 2017, 2018). The findings from the PISA studies—though there is the caveat about whether these measurements can properly be thought of as evaluations—are strongly present in narratives which contribute to debates about education policy (Schlaufer 2016, 2018). The inclusion of evaluation evidence in referenda and popular initiative campaigns helps elevate the quality of discourse in newspaper reporting (Schlaufer 2018). Further analyses show that evaluation findings can certainly contribute to more informed direct democratic decisions (Stucki et al. 2018).


These studies focused on education and health policy, areas which have traditionally seen relatively large numbers of evaluations (Widmer et al. 2009, 2016; Frey et al. 2017). It is thus rather sobering to find that evaluations are of such low importance in referenda and popular initiative campaigns and debates. However, Swiss experience does indicate that evaluation and (direct) democracy can certainly complement each other (Sager 2018).

2.4 Professionalisation

Since its early days, the Swiss evaluation community has developed amid the partly conflicting demands of a scientific discipline and a profession (see Widmer 2015). The first education and training programmes were offered as early as the 1980s, mainly by academic institutions. In addition to existing, long-standing programmes in which one can study evaluation, one should also note the continuing education course in evaluation available at the University of Bern. Evaluation has also become a major subject in courses in various social science disciplines. In addition, members of the Swiss evaluation community have contributed considerably to international research on evaluation (Hense et al. 2013; Brandon 2015; Widmer et al. 2016). The Swiss Evaluation Society, founded in 1996 (see above, Sect. 1.2.2), has contributed extensively to the development of the evaluation community in Switzerland. Unlike evaluation societies in other countries, SEVAL also includes members who primarily commission evaluations rather than just those who carry them out. SEVAL has established itself as a recognised professional society, even though a significant proportion of evaluation work continues to be carried out by those uninvolved with it; it organises symposia and co-publishes the journal 'LeGes – Legislation & Evaluation' (Bussmann 2015). In the mature phase (2000–2010, see above, Sect. 1.2.3), the SEVAL standards served as an important guideline for a growing part of the evaluation community (Widmer 2011). Still, one critical analysis based on four cases concluded that the SEVAL standards were at times applied only implicitly (Rieder 2009). This is a little surprising, since the problem is more often that evaluation is used symbolically, which presupposes an explicit use. At the end of this phase, a survey conducted primarily among SEVAL members indicated that awareness of the standards was extremely high (97%) and that a very large percentage (87%) of those asked had already used the SEVAL standards (Balzer 2013).


Based on these findings, a SEVAL working group decided that a revision of the standards was needed in order to strengthen their dissemination (Rüefli 2013). More recently, SEVAL has followed a strategy directed towards bureaucratic routinisation and closing off the market. Such tendencies, which distance the society from the principles of methodological openness and innovation, are reflected in two recent developments connected to SEVAL:

• The revised SEVAL standards are based on a positivist worldview that postulates unbiased evaluation. Methodological openness is no longer a core aspect of the standards, which are instead seen as a set of rules intended to support a problem-free implementation of predetermined evaluation projects (SEVAL 2016).
• The suggestion has been made that SEVAL should launch a programme to certify evaluators, granting certificates to appropriately qualified and examined specialists. Such certification is meant to help prevent evaluations from failing and to seal off the evaluation market from (non-certified) competition.

In contrast to earlier assessments (Furubo et al. 2002), the more recent literature considers the state of institutionalisation of evaluation in Switzerland to be high compared to other countries (Jacob et al. 2015).

3 Conclusion

Overall, evaluation in Switzerland is well-established, with diverse institutions and practices that have progressed strongly since the 1990s. While this is remarkable in many respects, evaluation in Switzerland still shows significant weaknesses, of which two are particularly important:

1. How evaluations are designed remains very heterogeneous across the country. While such diversity is quite typical for Switzerland and thus not very surprising, one can demonstrate that this heterogeneity is neither democratically legitimated nor objectively justifiable. Instead, one has the impression of a certain degree of haphazardness or arbitrariness about whether, let alone how, an evaluation should be carried out in a given situation. An outside observer might recommend that a more systematic approach be taken. Here the research conducted on evaluation could make a valuable contribution, as it permits reflection about the structures and processes involved and can suggest how better to support a more systematic, goal-oriented, and effective evaluation practice.


2. Efforts to professionalise evaluation in Switzerland have to be viewed with scepticism for two reasons. First, despite how much evaluation has improved, evaluation expertise itself still lacks appropriate recognition. In many areas, a misconception persists that knowledge of a particular policy or institutional context is sufficient to carry out a good evaluation. The result is that there are still numerous evaluations, carried out by laypersons, which contain grave evaluation-related weaknesses. Second, in trying to counteract the lack of recognition of evaluation expertise, the evaluation community has adopted a strategy of bureaucratic routinisation and closing off the market. This ignores the fact that, if evaluation is to continue developing in a healthy way, it needs a lively exchange between practitioners and social scientific research.

One can hope that Swiss evaluation's capacity to innovate prevails and that the form its professionalisation takes remains open and reflective (Widmer 2015). In my view, evaluation should remain in a creative tension between a profession and a discipline, supported by research on evaluation providing evidence for continuous improvements.

Acknowledgements This contribution is largely based on work completed as part of the 'Policy Evaluation in the Swiss Political System—Roots and Fruits' (SynEval) research programme. It was funded by the Swiss National Science Foundation (SNF), from 2013 to 2016, through its Sinergia programme. I would like to thank the SNF and the researchers involved for their contributions. Further information can be found in Sager et al. (2017) as well as on www.syneval.ch.

References

AGEVAL Arbeitsgruppe Gesetzesevaluation (Ed.). (1991). Die Wirkungen staatlichen Handelns besser ermitteln. Schlussbericht der Arbeitsgruppe Gesetzesevaluation an das Eidgenössische Justiz- und Polizeidepartement, Oktober 1991. Bern: EDMZ.
Allio, L. (2011). Evaluation des analyses d'impact approfondies et des études Standard Cost Model effectuées par la Confédération entre 2007 et 2009. Bern: SECO.


Balthasar, A. (2006). The effects of institutional design on the utilization of evaluation: Evidenced using Qualitative Comparative Analysis (QCA). Evaluation, 12(3), 353–371. Balthasar, A. (2007). Institutionelle Verankerung und Verwendung von Evaluationen: Praxis und Verwendung von Evaluationen in der schweizerischen Bundesverwaltung. Zürich: Rüegger. Balthasar, A. (2009a). Evaluationen in der Schweiz: Verbreitung und Verwendung. In T. Widmer, W. Beywl, & F. Carlo (Eds.), Evaluation (pp. 486–497). Wiesbaden: VS Verlag. Balthasar, A. (2009b). Institutional design and utilization of evaluation. Evaluation Review, 33(3), 226–256. Balthasar, A. (2015). L’utilisation de l’évaluation par l’administration fédérale. In K. Horber-Papazian (Ed.), Regards croisés sur l’évaluation en Suisse (pp. 115– 132). Lausanne: PPUR. Balthasar, A., & Müller, F. (2014). Die Verbreitung evidenzbasierter und gleichstellungssensitiver Information in den Entscheidungsprozessen kantonaler Steuerund Sozialtransferpolitik. Swiss Political Science Review, 20(1), 70–95. Balthasar, A., & Rieder, S. (2009). Wo ist evidenzbasierte Politik möglich? Die Verbreitung von Evaluationen auf kantonaler Ebene. In A. Vatter, F. Varone, & F. Sager (Eds.), Demokratie als Leidenschaft (pp. 403–429). Bern: Haupt Verlag. Balthasar, A., & Strotz, C. (2017). Verbreitung und Verankerung von Evaluation in der Bundesverwaltung. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 89–117). Zürich: NZZ Libro. Balzer, L. (2013). Bekanntheitsgrad und Nutzung der Evaluations-Standards der Schweizerischen Evaluationsgesellschaft (SEVAL) sowie erste Modifikationshinweise. LeGes – Gesetzgebung & Evaluation, 24(2), 439–458. Bochsler, D., Hänggli, R., & Häusermann, S. (Eds.). (2015). Consensus lost? Disenchanted democracy in Switzerland [Special issue]. Swiss Political Science Review, 21(4), 475–490. Brandon, P. R. (Ed.). (2015). Research on evaluation: New Directions for Evaluation (Vol. 148). San Francisco, CA: Jossey-Bass. Bundesamt für Justiz. (1996). Zusammenstellung der bei den Dienststellen des Bundes in der Legislaturperiode 1991–1995 durchgeführten, in Auftrag gegebenen oder durch Beiträge unterstützten Studien über die Wirtschaftlichkeit und Wirksamkeit staatlichen Handelns. Bern: BJ. Bundesamt für Justiz. (2016). Übersicht über Evaluationsklauseln im Bundesrecht. www.bj.admin.ch/bj/de/home/staat/evaluation/materialien/uebersicht. html. Accessed on 26 February 2018. Bundi, P. (2016). What do we know about the demand for evaluation? Insights from the parliamentary arena. American Journal of Evaluation, 37 (4), 522–541. Bundi, P. (2018a). Parliamentarians’ strategies for policy evaluations. Evaluation and Program Planning, 69, 130–138.


Bundi, P. (2018b). Varieties of accountability: How attributes of policy fields shape parliamentary oversight. Governance, 31(1), 163–183. Bundi, P., Eberli, D., & Bütikofer, S. (2017). Between occupation and politics: Legislative professionalization in the Swiss cantons. Swiss Political Science Review, 23(1), 1–20. Bundi, P., Eberli, D., Frey, K., & Widmer, T. (2014). Parlamente und Evaluationen: Methodenbericht. Zürich: Institut für Politikwissenschaft, Universität Zürich. https://www.syneval.ch/images/PDF/Methodenbericht_ ParlEval_web.pdf. Accessed on 19 April 2018. Bundi, P., Varone, F., Gava, R., & Widmer, T. (2018). Self-selection and misreporting in legislative surveys. Political Science Research and Methods, 6(4), 771–789. Bussmann, W. (1995). Evaluations and grassroots politics: The case of Switzerland. Knowledge and Policy, 8(3), 85–98. Bussmann, W. (1996). Democracy and evaluation’s contribution to negotiation, empowerment and information. Evaluation, 2(3), 307–319. Bussmann, W. (2005). Typen und Terminologien von Evaluationsklauseln. LeGes – Gesetzgebung & Evaluation, 16(1), 97–102. Bussmann, W. (2015). Konstanz, Wandel und Neuentwicklungen: 25 Jahre Evaluation in LeGes. LeGes – Gesetzgebung & Evaluation, 26(1), 107–125. Bussmann, W., Klöti, U., & Knoepfel, P. (Eds.). (1997). Einführung in die Politikevaluation. Basel: Helbing & Lichtenhahn. CDF Contrôle fédérale des finances (Eds.). (2005). Comment les offices mesurent-ils les effets de leurs actions? Bern: CDF. Dolder, O., Rohrbach, W., & Varone, F. (2017). Evaluationskultur auf kantonaler Ebene. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 119–153). Zurich: NZZ Libro. Eberli, D. (2018). Tracing the use of evaluations in legislative processes in Swiss cantonal parliaments. Evaluation and Program Planning, 69, 139–147. Eberli, D. (2019). Die Nutzung von Evaluationen in den Schweizer Parlamenten. Zürich: Seismo. Eberli, D., & Bundi, P. (2017). Parlament und Evaluation: Guts Meets Brain. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 243–278). Zurich: NZZ Libro. Eberli, D., Bundi, P., Frey, K., & Widmer, T. (2014). Befragung Parlamente und Evaluationen: Ergebnisbericht. Zürich: Institut für Politikwissenschaft, Universität Zürich. https://www.syneval.ch/images/PDF/Ergebnisbericht_ParlEval. pdf. Accessed on 19 April 2018. EFK Eidgenössische Finanzkontrolle. (2011). Umsetzung der Evaluationsklauseln in der Bundesverwaltung. Bern: EFK. EFK Eidgenössische Finanzkontrolle. (2017). Prognosen in den Botschaften des Bundesrates. Evaluation der prospektiven Folgenabschätzungen. Bern: EFK. Frey, K. (2010). Revising road safety policy: The role of systematic evidence in Switzerland. Governance, 23(4), 667–690.


Frey, K. (2012). Evidenzbasierte Politikformulierung in der Schweiz. Baden-Baden: Nomos. Frey, K., Di Giorgi, V., & Widmer, T. (2017). Die Evaluationstätigkeit in der Schweiz. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 67–88). Zurich: NZZ Libro. Frey, K., & Ledermann, S. (2017). Nutzung von Evaluationen in Regierung und Verwaltung. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 211–242). Zurich: NZZ Libro. Frey, K., & Widmer, T. (2011). Revising swiss policies: The influence of efficiency analyses. American Journal of Evaluation, 32(4), 494–517. Frey, K., & Widmer, T. (2013). Vom Eigenen oder Fremden lernen bei der evidenzbasierten Politikgestaltung? In J. Hense, S. Rädiker, W. Böttcher, & T. Widmer (Eds.), Forschung über Evaluation (pp. 165–185). Münster: Waxmann. Furubo, J. E., Rist, R. C., & Sandahl, R. (Eds.). (2002). International atlas of evaluation. Brunswick: Transaction Publishers. Geschäftsprüfungskommission des Nationalrates. (1984). Bericht über die Erfolgskontrolle von Entwicklungsprojekten in Nepal vom 23. August 1984. Bundesblatt, 2(37), 1338–1358. Haering, B. (2002). Erfahrungen mit und aus WOV-Evaluationen. LeGes – Gesetzgebung & Evaluation, 13(1), 9–21. Henry, G. T., & Mark, M. M. (2003). Beyond use: Understanding evaluation’s influence on attitudes and actions. American Journal of Evaluation, 24(3), 293– 314. Hense, J., Rädiker, S., Böttcher, W., & Widmer, T. (Eds.). (2013). Forschung über Evaluation. Waxmann: Münster. IDEKOWI Interdepartementale Kontaktgruppe Wirkungsprüfung. (2004, Juni 14). Wirksamkeit von Bundesmassnahmen. Bericht der Interdepartementalen Kontaktgruppe „Wirkungsprüfungen“. Bern: Bundesamt für Justiz. Jacob, S., Speer, S., & Furubo, J. E. (2015). The institutionalization of evaluation matters: Updating the international atlas of evaluation 10 years later. Evaluation, 21(1), 6–31. Jacob, S., & Varone, F. (2003). Evaluer l’action publique. Etat des lieux et perspectives en Belgique. Gent: Academia Press. Jenzer, C. (1981). Die Evaluation des Schulversuchs “Gesamtschule Dulliken”. In J. P. Meylan (Ed.), Innovation und Evaluation von Schulversuchen (pp. 61–82). Bern: Haupt. Kirkhart, K. E. (2000). Reconceptualizing evaluation use: An integrated theory of influence. New Directions for Evaluation, 88, 5–23. Klöti, U., Knoepfel, P., Kriesi, H., Linder, W., Papadopoulos, Y., & Sciarini, P. (Eds.). (2007). Handbook of Swiss Politics (2nd ed.). Zurich: NZZ Publishing. Klöti, U., & Schneider, G. (1989). Die Informationsbeschaffung des Gesetzgebers. Grüsch: Rüegger.


Kriesi, H., & Trechsel, A. H. (2008). The politics of Switzerland. Cambridge: Cambridge University Press. Ledermann, S. (2012). Exploring the necessary conditions for evaluation use in program change. American Journal of Evaluation, 33(2), 159–178. Ledermann, S. (2014). Evidenz und Expertise im vorparlamentarischen Gesetzgebungsprozess. Swiss Political Science Review, 20(3), 453–485. Linder, W. (2010). Swiss democracy (3rd ed.). Basingstoke: Palgrave. Mader, L. (2005). Artikel 170 der Bundesverfassung: Was wurde erreicht, was ist noch zu tun? LeGes – Gesetzgebung & Evaluation, 16(1), 29–37. Mader, L. (2009). Die institutionelle Einbettung der Evaluationsfunktion in der Schweiz. In T. Widmer, W. Beywl, & C. Fabian (Eds.), Evaluation. Ein systematisches Handbuch (pp. 52–63). Wiesbaden: VS Verlag. Mader, L. (2015). Le rôle des clauses d’évaluation dans le processus législatif fédéral. In K. Horber-Papazian (Ed.), Regards croisés sur l’évaluation en Suisse (pp. 67– 78). Lausanne: Presses polytechniques et universitaires romandes. Nideröst, B. (2002). Erfolgsbedingungen für Evaluationen. LeGes – Gesetzgebung & Evaluation, 13(1), 39–55. Paganini, L. S. (2017). Analysedimensionen kantonaler Verwaltungsreformen. Master-Arbeit. Bern: KPM. Reber, B., & Simon, C. (2007). Wirkungsprüfung Nationale Forschungsprogramme. Bern: Staatssekretariat für Bildung und Forschung. Rhyn, H. (2009). Evaluation im Bildungsbereich in der Schweiz. In T. Widmer, W. Beywl, & C. Fabian (Eds.), Evaluation: Ein systematisches Handbuch (pp. 182– 192). Wiesbaden: VS Verlag. Rhyn, H., Widmer, T., Roos, M., & Nideröst, B. (2002). Zuständigkeiten und Ressourcen in Zürcher Volksschulen mit und ohne Teilautonomie (TaV). Zürich: Bildungsdirektion des Kantons Zürich. Rieder, S. (2009). Anwendung der SEVAL-Standards. Ergebnisse aus vier Fallstudien. LeGes – Gesetzgebung & Evaluation, 20(3), 387–397. Rissi, C., & Sager, F. (2013). Types of knowledge utilization of regulatory impact assessments: Evidence from Swiss policymaking. Regulation & Governance, 7 (3), 348–364. Rüefli, C. (2013). Die Revision der SEVAL-Standards. LeGes – Gesetzgebung & Evaluation, 24(2), 459–469. Sager, F. (2018). Policy evaluation and democracy: Do they fit? Evaluation and Program Planning, 69, 125–129. Sager, F., & Rissi, C. (2011). The limited scope of policy appraisal in the context of the referendum democracy. Evaluation, 17 (2), 151–163. Sager, F., & Zollinger, C. (2011). The swiss political system in comparative perspective. In C. Trampusch & A. Mach (Eds.), Switzerland in Europe (pp. 27–42). London: Routledge. Scharpf, F. W. (1994). Games real actors could play: Positive and negative coordination in embedded negotiations. Journal of Theoretical Politics, 6(1), 27–53.


Schlaufer, C. (2016). Global evidence in local debates: The programme for international student assessment (PISA) in swiss direct-democratic debates on school policy. Policy & Politics, 44(4), 547–561. Schlaufer, C. (2018). The contribution of evaluations to the discourse quality of newspaper content. Evaluation and Program Planning, 69, 157–165. Schlaufer, C., Stucki, I., & Sager, F. (2018). The political use of evidence and its contribution to democratic discourse. Public Administration Review, 78(4), 645–649. Schlegel, T. (2015). Qualität der Regulierungsfolgenabschätzung. Bern: SECO. Schweizerische Evaluationsgesellschaft (SEVAL). (2016). Evaluationsstandards der Schweizerischen Evaluationsgesellschaft. Bern: SEVAL. Spinatsch, M. (2002). Evaluation in Switzerland: Moving toward a decentralized system. In J. E. Furubo, R. C. Rist, & R. Sandahl (Eds.), International atlas of evaluation (pp. 375–391). New Brunswick: Transaction. Stamm, M. (2003). Evaluation und ihre Folgen für die Bildung. Münster: Waxmann. Stucki, I. (2016). The use of evidence in public debates in the media. Evidence & Policy, 12(4), 487–504. Stucki, I. (2017). Arguing about smoking bans. Critical Policy Studies, 11(4), 411– 432. Stucki, I. (2018). Evidence-based arguments in direct democracy. Evaluation and Program Planning, 69, 148–156. Stucki, I., Pleger, L. E., & Sager, F. (2018). The making of the informed voter. Swiss Political Science Review, 24(2), 115–139. Stucki, I., & Schlaufer, C. (2017). Die Bedeutung von Evaluationen im direktdemokratischen Diskurs. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 279–310). Zurich: NZZ Libro. WBF Eidgenössisches Departement für Wirtschaft, Bildung und Forschung. (2013, März). Regulierungsfolgenabschätzung. Handbuch. Bern: WBF. Weiss, C. H. (1977). Research for policy’s sake: The enlightenment function of social science research. Policy Analysis, 3, 531–545. Weiss, C. H. (1981). Measuring the use of evaluation. In J. A. Ciarlo (Ed.), Utilizing evaluation: Concepts and measurement techniques (pp. 17–33). Beverly Hills: Sage. Weiss, C. H. (1983). Ideology, interests, and information: The basis of policy positions. In D. Callahan & B. Jennings (Eds.), Ethics, the social sciences, and policy analysis (pp. 213–245). New York: Plenum Press. Widmer, T. (2002). Staatsreformen und Evaluation: Konzeptionelle Grundlagen und Praxis bei den Schweizer Kantonen. Zeitschrift Für Evaluation, 1(1), 101– 114. Widmer, T. (2007). Rahmenbedingungen und Praxis der Evaluation im schweizerischen Bundesstaat. ÖHW – Das öffentliche Haushaltswesen in Österreich, 48(1–2), 69–93.


Widmer, T. (2008a). Evaluationsansätze und ihre Effekte. In H. Matthies & D. Simon (Eds.), Wissenschaft unter Beobachtung. Effekte und Defekte von Evaluationen (Vol. 24, pp. 267–287). Leviathan Sonderheft. Wiesbaden: VS Verlag. Widmer, T. (2008b). Evaluation in der Aussenpolitik: Gründe für eine Evaluationslücke. Zeitschrift Für Internationale Beziehungen, 15(1), 125–137. Widmer, T. (2011). Zehn Jahre Evaluationsstandards der Schweizerischen Evaluationsgesellschaft (SEVAL-Standards). Schweizerische Zeitschrift Für Kriminologie, 10(2), 23–30. Widmer, T. (2015). Développements et perspectives entre profession et discipline. In K. Horber-Papazian (Ed.), Regards croisés sur l’évaluation en Suisse (pp. 285– 307). Lausanne: PPUR. Widmer, T. (2016). Wandel von Staatlichkeit in der Schweiz in der zweiten Hälfte des 20. Jahrhunderts. In L. Criblez, C. Rothen, & T. Ruoss (Eds.), Staatlichkeit in der Schweiz (pp. 403–418). Zürich: Chronos Verlag. Widmer, T. (2017). Geschichte der Evaluation im schweizerischen politischen System. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 51–66). Zurich: NZZ Libro. Widmer, T., & Beywl, W. (2009). Divergente Entwicklungspfade der Evaluation. In T. Widmer, W. Beywl, & C. Fabian (Eds.), Evaluation: Ein systematisches Handbuch (pp. 509–527). Wiesbaden: VS Verlag. Widmer, T., Beywl, W., & Fabian, C. (Eds.). (2009). Evaluation: Ein systematisches Handbuch. Wiesbaden: VS Verlag. Widmer, T., & De Rocchi, T. (2012). Evaluation. Grundlagen, Ansätze und Anwendungen. Zürich: Rüegger Verlag. Widmer, T., Eberli, D., Ackermann, G., Balzer, L., Bundi, P., Hirschi, C., et al. (2016). Forschung über Evaluation in der Schweiz. LeGes – Gesetzgebung & Evaluation, 27 (3), 459–483. Widmer, T., Landert, C., & Bachmann, N. (2000). Evaluations-Standards der Schweizerischen Evaluationsgesellschaft (SEVAL-Standards). Bern and Genève: SEVAL. Widmer, T., & Neuenschwander, P. (2004). Embedding evaluation in the Swiss federal administration. Evaluation, 10(4), 388–409. Widmer, T., Rüegg, E., & Neuenschwander, P. (2001). Stand und Aussichten der Evaluation beim Bund (EvalBund). Zürich: Institut für Politikwissenschaft. Wirths, D., Rosser, C., Horber-Papazian, K., & Mader, L. (2017). Über die gesetzliche Verankerung von Evaluation. In F. Sager, T. Widmer, & A. Balthasar (Eds.), Evaluation im politischen System der Schweiz (pp. 155–187). Zurich: NZZ Libro. Wüest-Rudin, D. (2002). Evaluation von Reformen der öffentlichen Verwaltung. LeGes – Gesetzgebung & Evaluation, 13(1), 57–82. Zimmermann, W., Kolly, D., & Mahon, P. (1989). Die wissenschaftliche Evaluation von Bundespolitik. Bern: EJPD/AGEVAL.

CHAPTER 9

Ireland

Richard Boyle, Joe O'Hara, Gerry McNamara and Martin Brown

1 A Historical Overview

Ireland is a small country with a population of approximately 4.75 million. Irish government is loosely based on the Westminster and Whitehall public interest and common-law traditions. Ireland has a written constitution and a popularly elected president, who has limited powers, with the main political power residing in the parliament—the Houses of the Oireachtas—comprising Dáil Éireann (the lower house) and the Seanad (the Senate, or upper house).

1 This section is largely derived from Boyle (2002, 2014).



With regard to evaluation, in Ireland in the late 1960s, in line with many other countries, there was an interest in rational analysis and its application to planning and budgeting. A report on the development of social research in Ireland (Friss 1965) called for objective studies of social programmes and the development of trained social researchers. In response to such demands, a number of initiatives were taken to develop evaluative skills. The Department of Finance set up an analysis section, whose role was to train a number of staff each year, to be placed as analysts in departments, and to provide technical advice on issues related to project or programme evaluation. Many state agencies also developed evaluation skills within their organisations. Despite these initiatives, in the 1970s and into the 1980s evaluation remained limited in scope and largely peripheral to decision making. In the 1980s, domestic interest in evaluation as a tool of good governance took a back seat to the drive to control public expenditure in the context of the economic crisis of the time. Poor economic conditions meant that the emphasis was put on cutting back expenditure rather than examining its use. A very significant exception to this limited interest in evaluation in the 1980s concerned European Union (EU) expenditure in Ireland. This is one area where demand for evaluation was consistent and systematic. Tied to European expenditure was a requirement to evaluate EU-funded programmes. In the 1980s this had a significant impact in two main policy areas: industrial training and employment creation schemes, and anti-poverty and other community development programmes. From 1989, evaluation of the EU structural funds became a formal requirement for those receiving assistance, and this led to further developments in evaluation practice in Ireland. The funds were applied through a number of operational programmes which ran under a jointly agreed Irish–EU community support framework (CSF) plan. Each operational programme was subject to ex ante, intermediate, and ex post evaluation, as was the CSF as a whole. Various evaluation arrangements were put in place for each operational programme and centrally, combining internal, external and independent evaluation (Boyle 2005).



Evaluations carried out were seen to have some impact (Hegarty 2003). Outside of EU expenditure, the early 1990s saw a renewed interest in the evaluation of public expenditure. As part of a more wide-ranging programme of management reforms, there was an emphasis on programme review and evaluation. In 1993, the Comptroller and Auditor General (Amendment) Act was passed, giving the Comptroller and Auditor General a mandate to carry out value-for-money audits. In 1997 the government approved a series of expenditure reviews to be conducted as part of a new system of comprehensive programme review (currently titled the Value for Money and Policy Review [VFMPR] initiative). The intention was to institutionalise evaluation of expenditure programmes in the context of the wider agenda of public service reform. Demand for evaluation, however, declined in the 2000s. As the economy continued to grow rapidly in the early part of the decade, the pressure to engage in detailed scrutiny of public expenditure decreased. A report on progress with the VFMPR in 2004 found that few reviews were completed on time, some departments had opted out of the review process, and there was limited evidence of the reviews having an impact on the reform of spending programmes (Department of Finance 2004). As the EU structural funds declined in importance, the evaluation arrangements associated with the funds were cut back. While there was evaluation activity related to the 2000–2006 tranche of funding, the level of evaluation activity decreased significantly during the 2007–2013 national development plan. There is currently very little evaluation activity associated with European funding of programmes in Ireland. However, in line with the ups and downs of demand for evaluation, in response to the fiscal crisis of the late 2000s the control and management of public expenditure has again become a topic of much concern and debate. To some extent, as in the 1980s, evaluation initially took a back seat as the emphasis was on regaining control of expenditure. But there have been developments aimed at rebuilding policy evaluation capability within government, in the context of a wider public service reform initiative which includes a commitment to apply evaluation techniques to deliver value for money in public expenditure (Department of Public Expenditure and Reform 2011).


The government also committed to introducing a number of evaluation-related measures in the context of a revised budgetary and expenditure framework. This approach recognises that in the past "[…] evaluations have not been sufficiently connected to decisions on the allocation of resources to particular programmes and have not generally informed improvements to the delivery of Government services" (Government of Ireland 2012, p. 81). These new measures include:

• The use of evaluations in periodic comprehensive reviews of expenditure, including both departmental-based and thematic evaluation reports.
• The introduction of an Irish Government Economic and Evaluation Service (IGEES), developed as a cross-government service to enhance the role of economics and value-for-money analysis in public policy making.
• Changes to the VFMPR process, including more targeted reviews, alignment with the expenditure allocation process and greater involvement for parliament in the selection and assessment of reviews.
• The introduction of focused policy assessments: more narrowly focused assessments designed to answer specific issues of policy configuration and delivery and to complement the VFMPR process.
• The introduction of a Public Spending Code to bring together good-practice guidance on evaluation and appraisal.

There has also been an enhanced role for philanthropies in promoting evaluation of the policies and programmes they jointly fund with the government. The Atlantic Philanthropies and the One Foundation (both life-limited philanthropies that have now ceased to operate), for example, saw building evaluation and organisational learning capacity as a central element in their grant giving.

2 Institutional Structures and Processes

2.1 National and Sectoral Laws and Regulations

In 2013, a Public Spending Code was introduced (Central Expenditure Evaluation Unit 2012). The Public Spending Code is the set of rules and procedures that apply with the aim of ensuring that value for money is achieved in spending programmes across the Irish public service, and it effectively guides evaluation practice.


The code sets the context for evaluation across the public service, both nationally and sectorally. It brings together in one place all of the elements of the value-for-money framework that had been in place up to then. The code is maintained and updated under the management of the Central Expenditure Evaluation Unit (CEEU) of the Department of Public Expenditure and Reform as a resource for the entire Irish public service. The code includes, but is not limited to, guidelines on the practice of evaluation. The main area of evaluation practice covered is the VFMPR process. VFMPRs are the main feature of the evaluation landscape in Irish public policy-making. The reviews are conducted in accordance with detailed guidelines originally laid down in a 2007 Guidance Manual and incorporated in the Public Spending Code. All government departments, public bodies and bodies in receipt of public funding must comply, as appropriate, with the relevant requirements of the Public Spending Code. The VFMPR evaluations have a primary focus on accountability for the use of public money, with some of the basic questions all evaluations are asked to address including:

• What is the rationale for the scheme and what are its objectives?
• Are the objectives still relevant, in light of evolving policy priorities?
• Has the scheme achieved its objectives?
• How efficiently has the scheme been delivered?
• How does the scheme rate against alternative ways of achieving the same objectives?

Apart from the Public Spending Code, Spending Reviews (periodic reviews of the totality of public expenditure, introduced in 2011 and repeated in 2014, 2017, and 2019) represent an important part of the medium-term expenditure framework for budgetary decision-making and one that evaluation feeds into. While the Public Spending Code sets out rules and procedures for the whole of the public service, it is not a law. The one sector where evaluation has a legal mandate is education. Legislation including the Universities Act 1997, the Education Act 1998, and the Qualifications (Education and Training) Act 1999 all contains specific sections requiring evaluations of programmes and institutions. At primary and post-primary level, for example, a parallel mode of evaluation exists whereby the role of the inspectorate of the Department of Education and Skills (DES) is to evaluate the education standards in such schools or centres "[…] to promote excellence in the management of, teaching in and the use of support services by schools" (Section 13, Education Act 1998).


In parallel, from 2012, all primary and post-primary schools have been required to carry out their own internal evaluations of a particular aspect of teaching and learning, using evaluation guidelines and criteria developed by the inspectorate of the DES (Department of Education and Skills 2012). Some sectors also have their own specific policies and strategies. The Department of Foreign Affairs and Trade has an evaluation policy based on its long-standing work in the area of foreign aid and the evaluation of that expenditure. The Department of Jobs, Enterprise and Innovation produced A Framework for the Evaluation of Enterprise Supports in 2011, informed by international best practice, to provide guidance on core principles and methodologies and to address the common challenges facing enterprise evaluation. In the education sector, in May 2003, the Department of Education published twin documents entitled 'Looking at our School' (LAOS) as aids to school self-evaluation in primary and post-primary schools (Department of Education and Skills 2003a, b). These were subsequently updated in 2016 (Department of Education and Skills 2016c).

Evaluation in Parliamentary Structures

There is no institutionalised evaluation function in the Oireachtas (parliament), though parliamentarians sometimes require evaluations to be carried out for their own work on an ad hoc basis. While there is no formal evaluation function, in recent years there has been a growth in the supply of evaluative information to Parliament and its committees by the Oireachtas Library and Research Service. It provides a service that aims to condense evaluative and other information into digestible material for busy parliamentarians, for example through its Spotlight series, which provides in-depth briefings on single topics such as tendering for public service provision and community policing. There is positive feedback on this service from parliamentarians to the Library and Research Service: it is seen as helpful and informative. From 2012 the budgetary process moved to a 'whole of year' timetable. Oireachtas committees feed their views into the process starting in the spring of each year.


It is expected that by the autumn of each year committees will be informed by the VFMPRs produced by departments, but there is little evidence of substantial engagement with the findings of VFMPR reports. The Oireachtas Library and Research Service uses two main methods of relevance to getting evaluative information to politicians in a way that encourages use:

1. Providing short, condensed papers that give concise information on issues of particular interest to politicians, such as how a programme is working, whether it has worked elsewhere and, if so, how it will translate, and whether it is cost-effective.
2. Providing opportunities for face-to-face meetings with academics and others who can explain the evidence arising from evaluation and research studies and the key issues arising. This is increasingly happening with pre-legislative scrutiny, where the Library and Research Service prepares a stakeholder selection paper indicating people with an interest in and knowledge of the area under scrutiny, and the Oireachtas committees then decide themselves whom they will call to give evidence.

2.2 Evaluation Practice

Scope of Evaluation

With regard to the VFMPR evaluations, the Public Spending Code sets out the requirement that all strategic programmes of expenditure should be subject, in whole or in part, to rigorous evaluation over a three-year period. The Code notes that while it may not be practicable to evaluate the entirety of spending under each strategic programme, significant elements of expenditure should be covered and VFMPRs should not be focused on schemes that account for very minor elements of spending. In practice, the scope of evaluation activity across the various sectors is relatively limited. Resource constraints and prioritisation of other activities mean that evaluation has tended to cover a relatively small proportion of total expenditure in any sector in any three-year period. However, the Spending Review process has brought more structure to the analysis of public expenditure and assessment is becoming more comprehensive.


Internal and External Evaluation

The Public Spending Code sets out the formal position in Ireland with regard to the use of consultants for VFMPR evaluations. It states that internal evaluation should be the norm; the engagement of paid consultants to carry out VFMPRs is, as a rule, not permitted. Exceptions can only be justified on the basis that a particularly complex piece of analysis is required and that the necessary skills are not available internally. Even where this is the case, it is not a justification for outsourcing the whole review. This preference for internal evaluation arises in part from a desire to build up evaluation capacity and culture within the public service: in other words, on-the-job training and internal VFMPRs are the norm. In part, it is also driven by a negative political and public perception of the use of public money to fund work carried out by consultants more generally, and by an assessment that the quality of reports produced by consultants is not necessarily higher than that of reports produced internally.

Centrally within government, the CEEU in the Department of Public Expenditure and Reform (DPER) coordinates evaluation activity. The CEEU was established in 2006 to promote the application of value-for-money (VFM) practice in public expenditure programmes and projects. Its role has since evolved to include the provision of analytical and research support to DPER and others. The unit promotes best practice in the evaluation and implementation of programme and project expenditure, including the provision of guidance, which is set out in the Public Spending Code. The unit is responsible for the conduct of a number of evaluations linked to the Spending Reviews, including focused policy assessments on particular topics and cross-cutting evaluations which cover issues common to a number of departments.

Within government departments and offices, arrangements for the conduct of evaluation vary. Traditionally, location at the programme management level has been the most common arrangement, with evaluations being conducted or commissioned primarily at the programme level, but often with some corporate-level support, both from within the organisation and centrally. Six departments and offices with sectoral remits formally have evaluation as a named function in the title of a corporate unit, most often combined with finance and audit.


For example, the Department of Transport, Tourism and Sport has an economic and financial evaluation unit, and the Department of Foreign Affairs and Trade has an evaluation and audit unit. The further development of units within departments with specialist economic and evaluation expertise is being progressed as part of the process stimulated by the creation of the IGEES (covered further below). While the main focus across the sectors has been on internal evaluation, one sector that has traditionally commissioned a lot of evaluation work from external contractors is overseas development. Due to the nature of the work, the particular challenges associated with evaluating in overseas environments, and the growing trend for shared evaluations between donor countries (rather than each conducting their own evaluation of the same area), the use of external evaluators is more common here than elsewhere. Another interesting trend with regard to external influences on evaluation, particularly in the health and social care sectors, has been the influence of one philanthropy, The Atlantic Philanthropies (Atlantic), on evaluation practice. Atlantic has been involved in co-funding a number of major investments with the Irish government, particularly with regard to dementia care, supports for people with disabilities, and infant and young people's care. As part of its agreement with the government, Atlantic has insisted on evaluation of the co-funded investments, through the use of independent, external evaluators, often from academia. These evaluations are usually seen as being of very high quality and informative (Boyle and Shannon 2018). The education sector appears to be pioneering a reciprocal model of internal/external evaluation: a process of self-evaluation informs the work of a visiting team of inspectors that carries out school evaluations using a range of inspection models that vary in duration (Department of Education and Skills 2016a, b). In line with the drive for a genuinely reciprocal evaluation relationship between schools and the inspectorate, schools are also encouraged to use the same inspection standards and quality statements to evaluate teaching, learning and leadership as contained in LAOS 2016 (Department of Education and Skills 2016c, d). As stated by the Chief Inspector, 'These are the first fully comprehensive set of published standards for Irish schools' (Hislop 2017, p. 9).


Location of Evaluation Expertise

There is no specific evaluation institute with a national remit and responsibility for the conduct of evaluation. There are a small number of organisations, such as the Economic and Social Research Institute, the Institute of Public Administration, the National Economic and Social Council, and the Education Research Centre (ERC), that have a role to play in the promotion of evaluation practice, but only as part of a wider remit in each case. The IGEES is a cross-government service set up to develop a professional economic and evaluation service providing high standards of economic and policy analysis to assist the government decision-making process. The IGEES has a whole-of-government remit and supports the VFMPR process. In 1993, the Comptroller and Auditor General (Amendment) Act gave the Comptroller and Auditor General a mandate to carry out value-for-money (VFM) audits. VFM reports record the results of examinations into (a) the economy and efficiency with which state bodies acquire, use and dispose of resources and (b) the systems, procedures and practices employed by state bodies for evaluating the effectiveness of their operations.

Quantity and Quality of Evaluations

Under the 2012–2014 round of VFMPRs, fourteen evaluation studies were carried out. Under the 2015–2017 round, 42 studies were scheduled. These cover nearly all government departments and a wide range of topics, from animal health and welfare to drug reimbursement policy. The sectors with the most evaluation studies are justice, jobs and enterprise, education, and health, reflecting in large part the high proportion of public expenditure allocated to these sectors. The number of VFM audit reports produced by the Office of the Comptroller and Auditor General has fluctuated over the years, with a maximum of nine reports in 1997 and none in 2011. The average number of reports produced between 1994 and 2013 was four per year. By way of contrast, the supreme audit institutions in Belgium and the Netherlands, with notably higher levels of resourcing, produce around fifteen performance audits (equivalent to VFM audits) per year (Lonsdale 2011, p. 11). With regard to quality and quality control, governance arrangements for the IGEES include provisions for the oversight of practice and competency requirements.


An external advisory group made up of representatives from universities, research institutes and other relevant bodies supports IGEES management in developing the service, including continuing professional development and peer review. An Oversight Board for the IGEES reviews overall performance and development and advises on best practice and the future direction of the service. It reports periodically to the Minister for Public Expenditure and Reform about the performance and development of the IGEES; these reports are publicly available. Each VFMPR has a steering group, one of whose areas of responsibility is to provide quality assurance for the evaluation report. But not all steering groups have people with evaluation expertise, and the quality of the reports can vary. There are periodic overview scrutinies of the quality of VFMPRs, though these are ad hoc in nature and infrequent. An external quality review of a selection of VFM reports published by the Office of the Comptroller and Auditor General in 2008 found that the overall quality of the reports reviewed was good when assessed against a range of quality criteria (Institute of Public Administration 2008).

2.3 Use of Evaluations

It is the norm that evaluation reports are published. Under the Public Spending Code, there is an expectation that reports will be published and made available to the public and media through departmental websites, and to parliamentarians through the committee system. In practice, there is little public interest in or use of most evaluations (beyond those directly affected) unless they cover an issue which is particularly topical or controversial.

Good and Bad Aspects of Evaluation Engagement and Use

The Department of Foreign Affairs and Trade (DFAT) has engaged in a process of improved planning and delivery of all its expenditure. A key aspect of such an approach is regular and rigorous evaluation of programmes and activities. A strong culture and practice of evaluation exists in the Department's Development Cooperation Division, as part of a well-established results-based approach. This culture has been developed over a number of years, based on international pressures for evaluation of development aid, such as from the World Bank and the OECD DAC, and on a strong internal audit and evaluation unit in the department which promotes evaluative thinking and practice.


Similarly, evaluation in the enterprise and agricultural sectors has a strong tradition. With regard to evaluation in the education sector, research to date indicates positive outcomes in the areas of professional and organisational development. On the other hand, because school evaluation in Ireland is a compromise between self-evaluation and external accountability, it exemplifies an underdeveloped focus on data and evidence and an insufficient role for key stakeholders such as parents and students. According to the Chief Inspector, in reference to the first cycle of mandatory school self-evaluation in Ireland, 'schools also struggled with handling data and reporting school self-evaluation outcomes to parents' (Hislop 2017, p. 14). In the social care sector, with regard to moving towards more personalised services for people with disabilities and away from institutional care, Healy and Keogh (2014) note the use of evaluative evidence:

Providing evidence was also seen as important, particularly to inform budgetary allocations. … In terms of assessing the case for this new paradigm, there is a need to test whether it is more beneficial for service users and also if it is more cost-effective. Evidence was seen as central to this new paradigm, and that therefore there was a need to 'walk the walk' on this. Having hard evidence of impact was also seen as important to enlisting the support of the Department of Public Expenditure and Reform and private funders. (pp. 79–80)

On the less positive side, there is some evidence that evaluation has, to date, been of limited use in Ireland when it comes to broader budgetary decision-making. Smyth (2007) analysed thirteen VFM reviews carried out in the Department of Agriculture and Food between 1998 and 2007. Of the 99 conclusions and recommendations assessed, he found none recommended programme termination; 14 supported the status quo, and the remainder largely focused on minor programme alterations or changes in management processes. An assessment was also undertaken by Boyle (2014) of recommendations contained in nineteen VFMPR reports published between 2011 and June 2014, as well as in five evaluation papers produced by the central expenditure evaluation unit in DPER on cross-cutting issues. Of the 428 VFMPR recommendations assessed, only 1% recommended programme or part-programme termination.


The overwhelming majority (94%) recommended programme adjustments. By way of contrast, the recommendations produced by the CEEU in its cross-cutting evaluation reports were more likely to suggest substantial re-design of the programme (55% of the 29 recommendations) or programme/part-programme termination (10%). These points support the contention by Mayne, Divorski and Lemaire (1999) that evaluation carried out close to the programme or policy is effective at addressing operational issues but less likely to address impact and continued relevance. Examining continued relevance and impact requires some distance from the programme or policy, as is achieved by a corporate group in a central department such as the CEEU. During the boom times of the Celtic Tiger years in the late 1990s and early 2000s, values of partnership and consensus prevailed. These helped shape an evaluation practice based primarily on the improvement of programmes. While the rhetoric of evaluation contained many of the phrases of new public management, such as accountability and value for money, in practice a more developmental focus was evident (McNamara et al. 2009). Evaluation in this context was welcomed as an aid to decision making, but at the same time there was widespread recognition of the limitations of evaluation as it was being practised. The rapid economic decline of the late 2000s led to the shattering of the social partnership and consensus approach. Values of efficiency, performance and compliance are now more to the fore, and this is shaping evaluation practice. There is a growing articulation of the need for critical, centrally driven evaluations that will inform resource reduction and resource reallocation decisions. More generally, there is a view expressed by some practitioners that evaluation is at best an inconvenience and at worst a management tool used by the front line to justify expenditure when what is needed is simply expenditure control. There are, however, some indications against this trend in Ireland. For example, in response to evaluations carried out by the Inspectorate of the Department of Education and the ERC that showed gains in literacy and numeracy levels in designated DEIS (Delivering Equality of Opportunity in Schools) primary schools (Weir and Archer 2011), the Irish government reversed a decision to cut teaching posts for schools in disadvantaged areas.


3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Citizens, Enterprises and Civil Society

There is little evidence of significant evaluation use by citizens and private enterprises. There is limited use of evaluations by civil society, and it could not be described as institutionalised. However, interest in, and use of, evaluation has increased over the last ten years or so. This is in part because philanthropies have had a role in spreading a culture of using evaluation for learning amongst the civil society organisations whose projects and programmes they support with grants. Most prominent in Ireland is The Atlantic Philanthropies (Atlantic), a life-limited philanthropy that started work in Ireland in the 1980s and handed out its last grants in 2016. For a number of years, it was standard practice for Atlantic to build an evaluation line into the budget of all the projects it funded, so civil society organisations supported by Atlantic have developed capacity and skills in commissioning and using evaluation. In some areas, such as working with children and youth, dementia, and human rights, this has led to the creation of a wealth of evaluative data from both evaluations and research studies. The intervention of philanthropies has also raised the profile of an outcome-focused approach to evaluation within civil society organisations. For example, with regard to the children and youth programme funded by Atlantic, over twenty evaluations using randomised controlled trials (RCTs) have been carried out. RCTs were virtually unheard of in Ireland before this intervention. Moreover, as a by-product of this approach, there has been an increase in the capacity base of evaluators, by skilling up evaluation practitioners in the use of complex evaluation methodologies. Within civil society organisations, there has also been an increased awareness of the role evaluation can play, primarily from a learning perspective, in determining where to focus their efforts to best effect. For example, The Childhood Development Initiative (CDI), a civil society community-based local partnership, is part of a prevention and early intervention programme, a joint initiative of the Department of Children and Youth Affairs and Atlantic. CDI was set up with the objective of testing innovative ways of delivering services and early interventions for children and young people. Amongst the projects supported by CDI were Mate-Tricks and Doodle Den. Mate-Tricks was an after-school programme designed to promote prosocial behaviour amongst children aged nine to ten.

9

IRELAND

241

promote prosocial behaviour amongst children aged nine to ten. The evaluation included a randomised controlled trial to measure the effects of the programme on child outcomes and a process evaluation that investigated its implementation. The evaluation found that even though it was positively received by participants, Mate-Tricks did not achieve the changes in behaviour which it was designed to support (O'Hare et al. 2012). It was consequently decided not to continue with the initiative. Doodle Den aimed at improving literacy outcomes for young children in a disadvantaged community through an after-school literacy programme. Using an RCT, the evaluation found improvements in children's overall literacy ability. The evaluation also found that Doodle Den led to improved concentration and reduced problem behaviours in school, an increase in family library activity, and an increase in the child's reading at home (Biggart et al. 2012). Since completion of the evaluation, CDI has continued the development and delivery of Doodle Den in the seven participating schools. However, there is a danger that as the funding from Atlantic ceases, with its withdrawal from Ireland, civil society organisations will reduce their evaluation efforts due to the lack of funding support. The extent to which evaluation has been institutionalised into the organisations supported is an open question that will only be answered in the coming years.

In the education sector at primary and post-primary level, in a sense, all parents and students have an active role as stakeholders in the inspection process. To assist with inspection judgements on the quality of education provided by schools, the Inspectorate administers a series of questionnaires to parents and students as well as carrying out a series of focus groups with these cohorts. They also review students' work and assessment data. In other words, whole-school evaluation (inspection) is deemed a 'collaborative process involving the teaching staff, the management of the school, parents, and students' (Department of Education and Skills 2016a, p. 30). This is also the case with school self-evaluation standards. For example, LAOS 2016 has been developed following 'extensive consultation with students, teachers, parents, school leaders, management bodies and other education professionals and a wide range of other bodies' (Department of Education and Skills 2016c, p. 5). However, there is also an acknowledgement that the role of parents and students in school evaluation remains underdeveloped. As stated by the Chief Inspector: "we believe that the time is right for us to review and improve the ways in which we access, analyse and use the experiences and opinions of parents and learners as we go about our inspection work. This will be an important element in
the development of our inspection practice in the next few years" (Hislop 2017, p. 20). In the social care sector, as noted above, evaluation has been promoted by Atlantic Philanthropies and adopted by a number of civil society organisations to advance their policy objectives. This is exemplified by Genio, an Irish-based non-profit organisation working with government and philanthropy to transform social services. In terms of policy development, work undertaken and evaluations commissioned by Genio influenced the government's 2013 'Value for Money and Policy Review of Disability Services', with the resulting report identifying Genio's model of individualised supports as the way forward for future state service provision.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

While evaluation as an instrument of public policy is not necessarily particularly well known or debated in society, there would be reasonably extensive public exposure to the findings of evaluation studies. The media will often produce stories based on evaluations that address a high-profile or topical programme or policy (while ignoring many evaluation studies if they do not address these criteria or present non-controversial findings or findings generally supportive of the area under scrutiny). In the case of education, for example, Baird et al. (2015) are of the view that the magnitude of media coverage devoted to state examinations deviates considerably from the norm in other jurisdictions. There is also little media discussion of or interest in following up on the use of the evaluation, or of the quality of the evaluation (unless there are conflicting views on the quality, in which case this might present the angle for the story). Most evaluations conducted under the guidance of the Public Spending Code are publicly available. The Code sets out a principle of transparency for Value for Money and Policy Reviews:

A primary rationale for the VFMPR process is to facilitate better resource allocation decisions by bringing to light, and testing, the evidential basis for spending programmes. Complementary to this is the general principle of transparency about how public money is allocated, used and evaluated. The CEEU will maintain a central repository of all reports including terms of reference, timescale, status update and letters (if any) seeking extensions to deadlines on the http://publicspendingcode.per.gov.ie website. As a guiding principle, all of the background material that would be released in response to an FOI request should be made public at the same time as the VFMPR report is published. (Central Expenditure Evaluation Unit 2012)

4 Professionalisation

4.1 Academic Study Courses, Further Training, Et cetera

Historically, Ireland has a solid tradition of applied social sciences, and disciplines such as psychology and economics have provided a basic grounding for many evaluators. However, until relatively recently, there have been few systematic attempts to develop evaluation skills. While there are few higher education programmes dedicated solely to evaluation, there are some programmes that include modules or substantive material of direct relevance. The Institute of Public Administration provides a number of third-level programmes with a particular focus on evaluation, most notably a Professional Diploma in Official Statistics for Policy Evaluation; a Postgraduate Diploma in Policy Analysis; and a Master of Economic Science in Policy Analysis. The Centre for Evaluation Quality and Inspection (EQI) at Dublin City University (DCU) also offers a professional and international Doctorate in Leadership and Evaluation. The National University of Ireland Galway offers a Master of Economic Science (MEconSc) (Economic Policy Evaluation and Planning). Within the public service, the IGEES Medium Term Strategy 2016 to 2019 commits to the continued professional development of evaluators recruited to the service. This includes the development of training programmes. More broadly, training programmes on evaluation are run on an ad hoc basis by various academic and state bodies. For example, the Institute of Public Administration has run or contributed to training programmes for public servants engaged in evaluation as part of the VFMPR initiative. The Centre for Effective Services has run an evaluation training programme for officials in the Department of Children and Youth Affairs undertaking or commissioning evaluation studies and is exploring the possibility of widening out this programme across the civil service.

4.2 Profession/Discipline

The Irish Evaluation Network (IEN) provides a forum for those engaged in policy evaluation and is open to all evaluators in Ireland (Dublin City University 2017), including those working in the private, voluntary and community sectors. The website of the IEN provides a communications channel, though its uptake is limited and sporadic. Within central government, the IGEES Network comprises civil servants who are engaged in policy-related analytical work. The EQI also exists as a research centre specialising in the evaluation of educational policies, programmes and personnel. The work of the Centre encompasses schools, other learning centres and the wider public service and is primarily concerned with governance and accountability mechanisms, including quality assurance processes, school inspection and organisational self-evaluation. The purposes of these networks are manifold: they provide fora for those involved in policy evaluation to share experiences and expertise, to peer review analytical work, and to facilitate the flow of information between the public service and external experts regarding evaluation, appraisal and policy analysis. Seminars have been held to promote cooperation between public service evaluators and other interested parties such as the universities, civil society organisations, and European experts. With regard to professional journals, there is no journal within Ireland solely dedicated to evaluation. The journal of the Institute of Public Administration, Administration, occasionally carries articles with a focus on evaluation. Moreover, through association with the European Evaluation Society, many evaluators would be familiar with, and occasionally contribute to, the journal Evaluation.

Standards, Principles and Practitioners

There is no overarching set of standards for evaluators in Ireland, although there are some sectoral ones such as those produced by the inspectorate of the Department of Education and Skills. Internationally recognised standards developed for different sectors or areas of activity, such as the Joint Committee on Standards for Educational Evaluation Program Evaluation Standards and the OECD Development Assistance Committee Principles for Evaluation of Development Assistance, may be used on occasion. The
Public Spending Code sets out general principles to be applied to the evaluation of public expenditure programmes. Most policy evaluations carried out by government departments have traditionally been done by personnel who have received little formal training in evaluation. Most evaluation work undertaken for the VFMPR has been done by line managers who may have some methodological and subject area expertise but who would not see themselves as evaluators. There is a trend to develop specialised professionals and put them in dedicated posts or units. This occurs in a number of departments, such as the evaluation and audit unit in the Department of Foreign Affairs and Trade, which has a small number of evaluation specialists. The development of the IGEES is intended to build economic and evaluation expertise and develop dedicated units with specialised economic evaluators in government departments and offices. There is a broad range of freelancers (people who call themselves evaluators) and a number of consulting firms that also provide evaluation services. The freelancers generally would be individuals who have knowledge of and specialise in particular methodological approaches and/or particular sectors, rather than individuals with a broad range of evaluation expertise across a range of methodologies. There is no certification system for evaluators or an authority that acts as a conciliator or arbitrator in case of disputes. There is no specific professorship for evaluation. Professionalisation of the evaluation sector has not been a topic of any significant debate in Ireland.

4.3 Compliance to Standards and Quality Obligations

Within central government, evaluations carried out as part of the VFMPR are required to conform to the guidance set out in the Public Spending Code. The Central Expenditure Evaluation Unit in the Department of Public Expenditure and Reform has a role in overseeing the extent to which this happens. However, there is no formal public reporting of this aspect of the work, so it is not possible to state with certainty the degree of conformance with the guidance in the Public Spending Code. Nevertheless, the presence of steering groups for each VFMPR evaluation, which include representatives from the Central Expenditure Evaluation Unit, provides a mechanism for quality assurance and encourages a systematic approach to quality issues.

5 Conclusion

Evaluation practice in Ireland has gone through a number of phases. Prior to the early 1990s, evaluation tended to be periodic and ad hoc. Due in large part to the influence of the EU and in particular structural funding, and the associated requirement for evaluation, a strong and comprehensive evaluation practice was developed across most areas of public expenditure during the 1990s. As the importance of structural funding from the EU declined in the late 1990s and early 2000s, so too did the focus on evaluation. Interest in evaluation then grew again in the late 2000s due to the effect of the economic crisis and the emphasis on the need to prioritise and get the most out of public expenditure, and also due to the influence of philanthropy in encouraging evaluation in the social care sector. While it is important to be cautious about the role and influence of evaluation, and open to the possibility of future peaks and troughs, there are some positive signs with regard to the future of evaluation in Ireland. The creation of the IGEES represents a very important initiative, and one aimed at institutionalising evaluation practice within government. Feeding into processes such as the Spending Reviews provides a support structure for evaluative activity. Evaluation in the social care area supported by Atlantic has introduced more rigorous and substantive methodologies and has been influential in informing resource allocation decisions. The big issue will be the extent of continuing demand for evaluation. Ireland has shown that when there is demand, the supply side can react to provide good quality evaluation studies. But Irish experience also shows that in economic good times, the demand for rigorous scrutiny of expenditure can diminish rapidly.

References

Baird, J., Hopfenbeck, T. N., Elwood, J., Caro, D., & Ahmed, A. (2015). Predictability in the Irish leaving certificate. Oxford and Belfast: Oxford Centre for Educational Assessment and Queen's University. https://www.examinations.ie/about-us/Predictability-Overall-Report.pdf. Accessed on 12 June 2017.
Biggart, A., Kerr, K., O'Hare, L., & Connolly, P. (2012). Evaluation of the effectiveness of the childhood development initiative's doodle den literacy programme. Dublin: Childhood Development Initiative (CDI).
Boyle, R. (2002). A two-tiered approach: Evaluation practice in the Republic of Ireland. In J. E. Furubo, R. C. Rist, & R. Sandahl (Eds.), International atlas of evaluation (pp. 261–271). New Brunswick, NJ: Transaction Publishers.

Boyle, R. (2005). Evaluation capacity development in the Republic of Ireland (ECD Working Paper Series No. 14). Washington, DC: The World Bank.
Boyle, R. (2014). The state of policy evaluation in Ireland. Dublin: Institute of Public Administration.
Boyle, R., & Shannon, L. (2018). Better together? Philanthropy and government: Lessons from the Atlantic philanthropies and Irish government partnership-based co-investments. Dublin: Institute of Public Administration.
Central Expenditure Evaluation Unit. (2012). Public spending code. http://publicspendingcode.per.gov.ie/. Accessed on 21 December 2017.
Department of Education and Skills. (2003a). Looking at our school: An aid to self-evaluation in primary schools. Dublin: Department of Education and Skills.
Department of Education and Skills. (2003b). Looking at our school: An aid to self-evaluation in post-primary schools. Dublin: Department of Education and Skills.
Department of Education and Skills. (2012). Circular 0040/2012, implementation of school self-evaluation. https://www.education.ie/en/Circulars-and-Forms/Archived-Circulars/cl0040_2012.pdf. Accessed on 21 December 2017.
Department of Education and Skills. (2016a). A guide to inspection in post-primary schools. Dublin: Department of Education and Skills.
Department of Education and Skills. (2016b). A guide to inspection in primary schools. Dublin: Department of Education and Skills.
Department of Education and Skills. (2016c). Looking at our School 2016: A quality framework for post-primary schools. Dublin: Department of Education and Skills.
Department of Education and Skills. (2016d). Looking at our School 2016: A quality framework for primary schools. Dublin: Department of Education and Skills.
Department of Finance. (2004). First formal report to the minister for finance by the expenditure review central steering committee: Expenditure review initiative (covering the period June 2002–June 2004). Dublin: Department of Finance.
Department of Public Expenditure and Reform. (2011). Public service reform. Dublin: Department of Public Expenditure and Reform.
Dublin City University. (2017). Irish Evaluation Network website. http://www.dcu.ie/eqi/ien/index.shtml. Accessed on 21 December 2017.
Friss, H. (1965). Development of social research in Ireland. Dublin: Institute of Public Administration.
Government of Ireland. (2012). Comprehensive expenditure report 2012–14. Dublin: Stationery Office.
Healy, J. A., & Keogh, F. (2014). Exploring a model of public services reform—A case of the Genio organisation. Administration, 62(1), 55–90.
Hegarty, D. (2003). Framework for the evaluation of the structural funds in Ireland. In Fifth European Conference on the Evaluation of the Structural Funds (pp. 26–27). Budapest.

Hislop, H. (2017). A co-professional approach to inspection for accountability and improvement: Progress and prospects in the Irish context. Inaugural Public Lecture to Mark the Incorporation of the Centre for Evaluation, Quality and Inspection Within the DCU Institute of Education. https://www.dcu.ie/sites/default/files/ceqie/a-co-professional-approach-to-inspection-foraccountability-improvement.pdf. Accessed on 12 June 2017.
Institute of Public Administration. (2008). External quality review of VFM published reports. Dublin: Office of the Comptroller and Auditor General.
Lonsdale, J. (2011). Introduction. In J. Lonsdale, P. Wilkins, & T. Ling (Eds.), Performance auditing: Contributing to accountability in democratic government. Cheltenham: Edward Elgar.
McNamara, G., O Hara, J., Boyle, R., & Sullivan, C. (2009). Developing a culture of evaluation in the Irish public sector. Evaluation, 15(1), 100–112.
Mayne, J., Divorski, S., & Lemaire, D. (1999). Locating evaluation: Anchoring evaluation in the executive or the legislature, or both or elsewhere? In R. Boyle & D. Lemaire (Eds.), Building effective evaluation capacity: Lessons from practice (pp. 23–52). New Brunswick, NJ: Transaction Publishers.
O'Hare, L., Kerr, K., Biggart, A., & Connolly, P. (2012). Evaluation of the effectiveness of the childhood development initiative's mate-tricks pro-social behaviour after-school programme. Dublin: Childhood Development Initiative (CDI).
Smyth, K. (2007). Value for money reviews–doing exactly what it says on the tin (Unpublished Paper Produced as Assignment as Part of Requirement for Doctorate in Governance). Dublin: Institute of Public Administration.
Weir, S., & Archer, P. (2011). A report on the first phase of the evaluation of DEIS. Report to the department of education and skills. Dublin: Educational Research Centre.

CHAPTER 10

United Kingdom

Tracey Wond

Evaluation in the United Kingdom (UK) has been influenced by various political, social and professional contexts. In keeping with other chapters in this compendium, this chapter explores evaluation through these various lenses. Data for the chapter was collected from secondary sources and through expert interviews with evaluators from various sectors (private, public, and not-for-profit) and fields including health, education and enterprise. Since this book adopts a wide definition of evaluation, the position of professionals in wider performance and reporting roles is also explored.

1 General Country Overview

The UK lies in north-western Europe and comprises England, Scotland, Wales and Northern Ireland. The UK population in 2016 stood at 66 million, according to the Office for National Statistics (ONS) (2017). The political context of the UK, both now and in the recent past, is worthy of exploration in order to support our understanding of the UK evaluation context. Throughout history, the purpose of and approach to evaluation have
been heavily tied to the political context, with new governments espousing particular beliefs and assumptions of what evaluation would bring (Henkel 1991). The UK is a unitary state and its government remains supreme, as opposed to federal (like Germany or Belgium). However, decentralisation of power has been observed in the case of Scotland, Northern Ireland and Wales, which have their own regional assemblies. The UK government is described as a liberal democracy (Grayson 2010), reflecting the democratic voting system and the legal rights afforded to citizens in respect of many matters; it is also parliamentary in form. The 'Westminster model' adopted by the UK government denotes its centralised nature and political party structure. One key characteristic of the Westminster model is the way in which executive (government) and legislative (House of Commons) functions of the UK government are 'fused' as opposed to being kept separate (Garnett and Lynch 2013, p. 4). There is no codified constitution in the UK and it remains flexible.

The political party system in the UK has seen alternating seats of power between two main parties, the Labour Party and the Conservative Party; the Liberal Democrats have also held power as part of a liberal-conservative coalition formed in 2010, an arrangement not seen in the UK since the 1940s (Grayson 2010). Other regional political parties also exist. The political parties link citizens to the state and offer alternative representation for citizens (an outcome of pluralism linked to liberal democracies) (Demir 2000; Garnett and Lynch 2013).

Policy issues are led by Cabinet ministers who hold ministerial accountability (Grayson 2010) for their departments. These departments have seen several reshuffles over the years, for instance the Department for Business, Innovation and Skills and Department of Energy and Climate Change merged to form the Department for Business, Energy and Industrial Strategy in July 2016. Some have noted the costly and rushed nature of such changes previously in government (White and Dunleavy 2010). Others have noted examples where ministers have not been held accountable for their failures (Flinders 2011). There is a clear governance role for evaluation in this context, in helping to overcome this so-called 'crisis of accountability' (Garnett and Lynch 2013, p. 19). Long considered an example of a successful political system, UK politics has more recently been marred by scandals such as the Members of Parliament expenses scandal (Garnett and Lynch 2013) and claims of a post-truth society (as was very apparent during the Brexit referendum, explored later)
(Brown 2016). Public trust in both politicians and experts has understandably been damaged through these events. Such drama reinforces a need for transparency and accountability in government, something that evaluation may be able to facilitate.

Local governments exist across the UK and are responsible for a range of public services (f. i. education, waste/sanitation, social services, highways, economic regeneration). The exact structure of local government can be complex, with several tiers of local government across the UK (f. i. parish councils, city/district councils, unitary authorities and county councils). Local governments are funded through local taxation (council tax), central government grants and by charging fees for services. There is opportunity for evaluation activity within local government, to evaluate the success of initiatives or to perform internal evaluation roles to monitor progress against key performance indicators. Audit Scotland (2012) explored the implementation of self-evaluation across Scottish councils, finding that the majority had adopted self-evaluative mechanisms. Self-evaluation in Scottish public services had been strongly advocated through the Crerar Review, an "Independent Review of Regulation, Audit, Inspection and Complaints Handling of Public Services in Scotland" (Scottish Government 2007).

The UK joined the European Community (which later became the European Union in 1992) in 1973, following several failed attempts to gain membership (twice vetoed by French President Charles de Gaulle). In 2016, a 'Brexit' referendum vote was held in the UK leading to the decision to withdraw from the European Union (EU); this was followed by the UK Prime Minister, Theresa May, triggering Article 50, giving notice of the UK's intention to leave the EU and exercising its unilateral right to exit, as per the Lisbon Treaty. The decision to pursue withdrawal has been controversial: key public figures are still calling for the 'divorce' not to happen and Cabinet ministers appear divided. Indeed, the UK's relationship with the EU has not always been a smooth one, with the need to renegotiate in 1984 and a decision in 2002 not to accept the Euro as the single European currency. Life for the UK and the UK evaluation community after Brexit is uncertain. EU-funded programmes generate a great deal of evaluation assignments in the UK. At the time of writing, there is still some uncertainty as to what initiatives might replace previously EU-funded activity, such as research funding and economic regeneration programmes. The UK's Industrial Strategy outlines key policy areas that the UK may look to prioritise and an Industrial Strategy Challenge Fund has been unveiled
with £4.7 billion pledged for research and development over the next four years (further funding has since been pledged to support science). It is conceivable that without EU-funded programmes (where evaluation is fairly normalised) there may be a contraction in the evaluation ‘profession’1 in the UK. Public administration in the UK has provided a good base for much evaluation activity. The UK joined the US and France in the ‘first wave’ of public management reform in the early 1970s, bringing with it an emphasis on ‘strategic policymaking and evaluation’ (Pollitt and Bouckaert 2011, p. 6). This reform was the beginning of New Public Management, which the UK welcomed in the 1980s. New Public Management prompted a focus on results-driven systems, providing a renewed need for evaluation to function for accountability, and heavily influencing the measurement culture seen in the UK public sector today (Devanney and Uglebjerg 2009). Developments in public management and policy-making clearly motivated evaluation to function for governance purposes in the UK public sector. Yet, the complexities of the UK’s government structures also see evaluation hit with challenges inherent in any political system (f. i. ministers trying to leave legacies, a lack of evidence informing some decisions, adverse evaluation reports being hidden) (Chelimsky 2008; Weiss 1993; Wond 2017).

2 Institutional Structures and Processes (Political System)

2.1 Evaluation Regulations

There are widely varying approaches to the adoption of evaluation in the UK. Relatively few national laws or regulations mandate evaluation, or the use of evaluations, in the UK. This creates an interesting challenge regarding the quality, provision and utilisation of evaluation. Exceptions to this regulatory position can be observed when we turn to sector specific practice and performance reporting more generally, including:

1 The status of evaluation as a 'profession' is still uncertain, as this chapter outlines. A lack of regulation and oversight means that anyone can undertake an evaluation. The breadth of the evaluation concept makes governing it difficult.

• Aid spending: The International Development (Official Development Assistance Target) Act 2015 placed a legal requirement on the Secretary of State to provide an independent evaluation of official development assistance.
• Carbon reporting: The Companies Act 2006 mandates that UK companies listed on the London Stock Exchange (or elsewhere in the European Economic Area) are to measure and report on their greenhouse gas emissions.
• Modern slavery: The 2015 Modern Slavery Act places disclosure requirements on commercial organisations with turnover greater than £36 million. Companies must publish an annual 'Slavery and Human Trafficking Statement'.

The Modern Slavery and Carbon Reporting examples above both illustrate a self-evaluative approach (also observed in the UK Higher Education sector and noted earlier in Scottish councils). These demonstrate management models of evaluation (Payne 1994), as opposed to more anthropological (such as responsive or goal-free evaluation) or judicial models. Evaluations in the form of inquiries or reviews are also observable. Commons Select Committee, Lords Select Committee and Joint Committee inquiries can make open calls for evidence on various social, political and economic themes. These are different from commissioned evaluations but show a regard for an evidence-based approach in the UK. For instance, in 2016, the Communities and Local Government (Commons Select) Committee noted that "there has not been sufficient robust, objective and evidence-based monitoring, evaluation or review of the National Planning Policy Framework" (UK Parliament 2016), and therefore called for a comprehensive review of this policy framework.

A substantial driver for evaluation in the UK has been the stipulation of EU funding to monitor programmes. Most evaluators interviewed for the purposes of this chapter had predominantly engaged with the evaluation of EU-funded projects; the impact of the UK's withdrawal from the EU on the quantity and quality of evaluation could, therefore, be detrimental to the position of evaluation in the UK political system.

There is substantial variation in evaluation adoption and maturity across various policy domains. Health and education in particular are well established and there is much evidence of formative and summative evaluation being embedded into programmes and initiatives, and policymaking. Evaluation capacity building is also well evidenced in these sectors, often
with well-developed, accessible web resources. Within accounting, evaluation, in the form of auditing and strict governance protocols, is observable; the professionalisation of this area and strict codes of conduct may explain this. A number of independent advisory non-departmental public bodies exist to scrutinise aspects of UK public spending. The Independent Commission for Aid Impact (ICAI) scrutinise aid spending; they are independent of government but report directly to Parliament. The Regulatory Policy Committee also 'rate the quality of evidence and analysis supporting new regulatory and deregulatory proposals, and check the estimates for the equivalent annual net cost to business of new regulations' (Regulatory Policy Committee 2018). The National Audit Office (NAO) is an independent Parliamentary body and explores the value for money of public spending.

Various evaluation strategy documents exist in the UK, often published by government departments. These vary from generic strategies that outline a general expectation for evaluation, to more specific strategies on how to measure particular aspects of activity. The Magenta Book (HM Treasury 2011), an evolving guidance document which "sets a broad direction of travel" (Little and Axford 2012) and is maintained by HM Treasury, is a significant document for evaluation in central government departments. It advocates the use of evaluation, as well as providing technical guidance on how to perform evaluation. This book is currently being updated with the involvement of the Cross Government Evaluation Group (CGEG) and members of the UK Evaluation Society (UKES); it is encouraging to see a diversity of stakeholders being involved in its revision.

Within the health sector, the National Health Service (NHS) have worked proactively to embed tools to build evaluation capability. Specific evaluation strategies exist to support various initiatives. National Institute for Health and Clinical Excellence guidance and shared learning resources support evaluation to be undertaken and a culture of evidence to be developed.

Box 1 Evaluation Strategy for new care model Vanguards (National Health Service 2016)

In 2015, the NHS sought to establish a number of vanguards for models of care. In doing so, they published an evaluation strategy to "provide robust
results and learning that can be rapidly disseminated across the NHS and elsewhere” (p. 4). The evaluation strategy outlined that the complex nature of the new care models programme required an approach to evaluation that was ‘innovative, sophisticated and multi-faceted’. The strategy reflected aspects such as the timing (ex post, ex ante) and method (logic model, interviewing, monitoring of core metrics) of the evaluation. The strategy was produced by the Operational Research and Evaluation Team of the NHS.

The education sector has also emerged as sector-leading in evaluation in the UK, dominating academic discourse. Possibly owing to the diverse education setting, a single evaluation strategy for educational evaluation is not present. Within Higher Education (the University sector) there is growing pressure for the impact of research and knowledge exchange activity to be measured, as well as teaching quality. This has resulted in several instruments and evaluation exercises being developed (f. i. the Teaching Excellence (and Student Outcomes) Framework, the Research Excellence Framework and, more recently, the Knowledge Exchange Framework). 'What Works' centres, introduced in 2014, have been a recent but positive mechanism to support evaluation evidence to be generated, synthesised and disseminated across a range of policy areas, including: health and social care, educational achievement, crime reduction, early intervention (child development), local economic growth, improved quality of life for older people, and wellbeing. What Works centres involve multi-agency collaboration between government, universities, research councils and not-for-profit organisations. What Works centres embed the principles of evaluation for decision-making, ensuring that high quality evidence is produced, disseminated and utilised to support public services delivery. The network of centres, according to Bristow et al. (2015, p. 130), "reflects a belief in government circles that research-based evidence should make a greater contribution to improving policy-making and delivery". Amid the UK political context just introduced, the functions of evaluation within the UK political system are varied. Evaluation in the UK has largely evolved from supporting public reform in the 1960s (see Hansson 2006) to supporting governance; this is observed in both practice and the academic literature (Bovaird and Loeffler 2007; Stern 2008). Evaluation also serves a legitimising purpose, maintaining the UK's democratic society, by ensuring that evidence is made available for decision-makers and other stakeholders. Evaluation aids policy and resource decisions, although many
argue that evaluation is under-utilised (Hird 2005; Sanderson 2000). Initiatives such as the What Works centres encourage evaluation for learning and enhancement.

2.2 Evaluation Practice

Within central government, local government (local authorities/councils), and charity organisations, internal evaluation functions are identifiable. Internal functions often perform monitoring, performance measurement and reporting activity, for instance, to satisfy a growing demand for information from central government. Economic (measuring effects of policy/intervention), process (implementation of policy/intervention) and impact (difference made by policy/intervention) evaluations are all observable in government evaluations. Temporal concerns have long been held in evaluation, with claims that evaluations are often biased towards ex post evaluation and are not funded or commissioned in a manner which allows longer-term impact to be observed (as such, they are accused of being short-termist). Given the vast scope of evaluation, there is little commonality in the way in which evaluation is conducted in the UK, making it difficult to describe. Sector-specific practice does allow some trends to be observed. For instance, in the enterprise growth sector in the UK at the current time, a heavily quantitative approach is evident, with the use of randomised control trials (RCTs) and impact evaluation dominating evaluation practice (Bristow et al. 2015).

2.3 Use of Evaluations

Evaluation is not clearly tied to new legislation or programmes, although it is common to see some degree of review, particularly towards outputs. European Union funding has been influential in establishing evaluation practice across a range of sectors, and both internal and external evaluations are seen. Monitoring has been essential to ensure that programmes are delivering as expected (e.g. 12 contact hours per SME), and to target (e.g. reaching 20 businesses within a quarter). A review or evaluation of a policy initiative is often proposed at the inception of the initiative. Evaluation is also often seen in pilots, with the evaluation being used to review a trial policy intervention (MacKenzie et al. 2006; Bailey et al. 2017). If the pilot evaluation suggests the intervention or
initiative is effective then it is likely that an extended pilot or full initiative is supported and implemented. Many third sector and public-sector funders require evaluation evidence as part of their funding conditions, and to support continuation grant funding. As such, it is clear that evaluation in these contexts fulfils a role concerned with transparency, accountability and good governance. From the evaluation strategies explored earlier in this chapter, we see a range of functions for evaluation in the UK. These include:

• Evaluation for governance;
• Evaluation for accountability and transparency (including value for money);
• Evaluation for bureaucracy (box-ticking);
• Evaluation for learning (within the intervention [formative], and after);
• Evaluation for decision-making.

Good evaluation performers include health, education and international development. The UK's Department for International Development (DfID) are strong advocates of evaluation, and this is notable from the extent of their evaluation activity and their external profile in evaluation. In a 2014 report, DfID were apparently investing over £1.2 billion in 'strengthening evaluation and research' (Independent Commission for Aid Impact 2014, p. 1). According to DfID (2013): "evaluating the impact and lessons of our aid programmes is a crucial part of our work. High quality evaluations help us spend aid more effectively so that more people are lifted out of poverty". Evaluation for both learning and transparency is noted through DfID's work, and the scope of DfID evaluations extends from programme-level to sector- and country-level evaluations. DfID's activity is independently evaluated by the ICAI mentioned earlier. Like DfID, the ICAI undertake evaluation to ensure transparency and learning.

Evaluation in the enterprise sector appears less mature. There is a great deal of experimentalism in this sector at the current time, with RCTs and other impact methodologies being explored and funded. Over the past decade, enterprise agendas have shifted from the sustainability of enterprise to supporting innovation in UK business; how this is measured (and deriving causality, mediating for displacements etc.) has remained relatively unchanged.

The actual use of evaluation is more hotly debated. Evaluators and policy makers alike acknowledge a utilisation gap, with evaluation reports suffering 'dusty-shelf syndrome' (Adelman 1996). Evaluation findings are cited by political figures, although there has been some controversy over 'buried' evaluation reports, hidden because they contain damaging content (f. i. the 'Troubled Families' evaluation2 ). Across policy areas, a host of policy initiatives have been criticised for failing to demonstrate clearly the value of the initiative (Howlett 2014; Sivarajah et al. 2016); this suggests both a need for evaluation and a failing, in that evaluation has not previously been used effectively to avert such criticism. Evaluation quality assurance remains ad hoc. There has also been significant criticism of the quality of evaluations being funded by the UK government. In 2013, the NAO published a report that was highly critical of evaluation in government (NAO 2013). It criticised the lack of comprehensiveness, challenges in accessing data, and poor use of evaluation by the UK government. There was poor mapping of spend to evaluation evidence: "government does not publish a comprehensive overview of evaluation evidence mapped against total government spending and other interventions" (ibid., p. 7). Further, a lack of evaluation was noted, with the report referencing earlier observations that 12 of 17 main government departments lacked evaluation evidence and post-implementation reviews (ibid., p. 7). The NAO (2013) estimated that in 2010–2011, £44m was spent on government evaluation. Further, meta-evaluation and meta-analyses are not commonly seen across policy areas, although evaluators and organisations demonstrate growing interest in performing such meta-analyses. Interviews for the purpose of this chapter cited limited capacity to conduct such meta-analyses, as well as a lack of demand for evaluation quality assurance, as some of the reasons for this under-assurance. Speaking at the launch of the Centre for Evaluation across the Nexus in 2016, Michael Kell (Chief Economist at the NAO) was sceptical about UK government evaluation, concerned that such evaluations were not seen as trustworthy or independent.

2 See Brecknell (2016).

3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Civil Society

The UK government's Office for Civil Society (OCS) champions the UK civil society sector. Key priorities reflected by the OCS at this time include tackling loneliness, promoting volunteering and social action, and building an inclusive economy (OCS 2018). Whilst the OCS are not particularly vocal about the use of evaluation, the regulatory aspects of the OCS are handled by the Charity Commission, to which evaluation responsibility may arguably be devolved. A Report of Session by the Select Committee on Charities recommends the evaluation of charities' impact:

All charities should be seeking independent evaluation of their impact on their beneficiaries, in order to ensure that they are delivering for them and to demonstrate this to beneficiaries, funders and the public. The form of such evaluation may vary considerably … public sector commissioners (should) assess such evaluation when awarding contracts … the Government and the charity sector (should) continue to pursue initiatives to better understand and promote the impact of charities … (and they recommended) that the Office for Civil Society (OCS) develops guidance for the rest of the public sector on how to set contractual impact reporting requirements appropriately. (Authority of the House of Lords 2017, pp. 39–40)

The need for accountability within charitable organisations was evident in this report. Interestingly, a heavy emphasis on impact evaluation and assessment was also observed, perhaps signalling a demand for further evaluation capability/capacity in this sector. The Centre for Social Action has been one particularly large initiative which has promoted social action in a range of areas, attracting some £35m from the Cabinet Office and a similar amount from other innovation agencies such as NESTA and The Social Innovation Partnership. The Centre for Social Action explores projects in a range of areas to prompt social action (helping the aged, social mobility, supporting ex-offenders to transform their lives etc.). NESTA in particular are strong advocates of evaluation and support organisations engaging with the Centre for Social Action (through their Centre for Social Action Innovation Fund) to conduct evaluation. In promoting evaluation, NESTA stress the importance of evaluation evidence
to persuade organisations 'to commission and fund programmes', to support funders to 'understand the impact' of the work that they fund, and to 'improve' the programmes being delivered. NESTA are strong advocates of evaluation, as is evident across their programmes. Similarly, NESTA's Standards of Evidence are an important development for the use of evaluation in a civil society context, advocating different levels or standards of evidence including independent, replicable evaluation, impact capture and the demonstration of causality. As described above, civil society organisations tend to be strong advocates for the production and use of evaluation in the UK. Such organisations do so to justify their own activity, improve their practice, as well as using evaluation data to lobby others. There are strong evaluation capacity-building efforts in civil society organisations. A Charities Evaluation Services report found that support for monitoring and evaluation in UK social enterprises had increased, but that funding was a key driver for this evaluation, limiting the use of evaluation for learning (Ellis and Gregory 2008). Evaluation culture and practice are maturing in key policy areas, although the effectiveness of evaluation evidence in decision-making still appears limited. Interviews undertaken by the author demonstrate that many evaluators feel their work is largely under-utilised, used symbolically, or misused to demonstrate certain points or to support particular agendas (Wond 2018). Greenaway (2016) shares these concerns, noting that there is limited evaluation recognition and discourse on evaluation practice. Participatory evaluation has been embraced to varying degrees in the UK. Participatory monitoring and evaluation echo the growing acknowledgement that public services are co-produced, and also that evaluation itself can be co-produced. Co-production (as well as co-creation) is steadily being embraced as a model for UK public services delivery (Osborne et al. 2016) and the impact of this on evaluation is yet to be fully explored. Whilst many see more co-produced and participatory forms of evaluation as a 'good thing to do', the practicalities of many evaluations limit the extent to which evaluations are participatory. Within some sectors, the results of evaluative grading are being used extensively beyond their primary purpose, for instance to market organisations and influence consumer behaviour when they are publicly released. For instance, within UK Higher Education, the Teaching Excellence (and Student Outcomes) Framework assesses teaching quality and results in gold, silver and bronze ratings. Higher education institutions are leveraging these ratings when marketing their programmes (Holmwood 2015).

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

The term 'evaluation' is often misunderstood by the general public; given its expansive remit, this is understandable. Efforts to build evaluation capacity in project settings demonstrate acknowledgement that there is a need for evaluation, but signify the ongoing skills gaps towards evaluation. External evaluation reports are often made publicly available via government and project websites, and increasingly social media (such as Twitter) is used to provide notice that evaluation reports are made available. There are numerous examples of evaluation collections and catalogues which make evaluation reports publicly available. For instance:

• Scottish Enterprise: Scotland's economic development agency publish their collection of evaluation and research reports online (for more info, see www.evaluationsonline.org.uk) (Scottish Enterprise 2018).
• Medical Research Council: Demonstrating a commitment to making evaluation research widely available to publics in the UK and beyond, the Council host a range of resources as well as promoting the 'Researchfish' database (see https://mrc.ukri.org/successes/evaluating-researchoutcomes/) (Medical Research Council 2018).
• Age UK: The charity makes their evaluation reports available on their website (for more info, see https://www.ageuk.org.uk/our-impact/policy-research/publications/evaluation/) (Age UK 2018).

The media do not frequently discuss evaluation findings or evaluation as a subject in itself, although political figures may popularise some findings to serve or defend their agendas, which then become newsworthy. Recently there has been heightened coverage of Brexit impact assessments, with claims that these were ignored and false claims by some politicians that they did not exist (Sparrow 2017). Discourse on evaluation as a field in itself, its benefits, challenges, quality and professionalisation occurs in pockets. Evidence-based policy making and decision-making discourse is more common in some arenas such as health and policing at the current time, indirectly implicating evaluation.

3.3 Civil Societies Demand Evaluations

Civil society organisations are regular commissioners of evaluation, seeking to legitimise some of their activity and provide evidence of responsible spending (to donors f. i.). The public are able to raise petitions for issues to be discussed in Parliament on a government-run website (https://petitions.parliament.uk), but at the time of writing there are only 8 petitions that ask for evaluations of various societal activity; these include calls for the evaluation of the health telephone service (111), and of the availability of state-supported housing (council housing). However, such petitions rarely gain enough support to be considered or result in an evaluation.

4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses, Further Training Et cetera

The engagement of academics with evaluation assignments, as well as knowledge in social inquiry, gives rise to relatively good awareness of evaluation within universities. Academic provision in evaluation does exist, although it is not widespread. Writing on the state of evaluation in the UK in 1996, Adelman (1996) noted that there was only one UK evaluation course available. Examples of evaluation-oriented provision at the current time include:

• MSc in Evidence-based social intervention and policy evaluation (University of Oxford).
• MSc Impact Evaluation for International Development (University of East Anglia).
• MSc Social Research and Evaluation (University of Huddersfield).
• MPA Comparative Applied Social and Public Policy, Evaluation and Research (University of York).

Notably, this provision is at a postgraduate level, likely to appeal to professional learners working in public services. General policy-related programmes also demonstrate an embedding of evaluation within them. For instance, the University of Liverpool's Master of Public Health contains modules relating to evaluation ('Generating and Evaluating Public
Health Evidence'; 'Monitoring and Evaluation of Systems and Programmes') (University of Liverpool 2018). Evaluation training is increasingly being offered through consultancies and evaluation networks (VOPEs) such as the UKES. Recent UKES events have included training on developmental evaluation and evaluating value for money. Elsewhere in the UK, training in impact assessment is popular at the current time. Koljonen (2012) explored evaluation in a UK fire and rescue service and concluded that there is insufficient organisational support for evaluation capacity-building in emergency services settings. Evaluation capacity-building was a concern voiced by several evaluators and programme managers interviewed for the purposes of this chapter.

4.2 Profession/Discipline

The UK is credited with being one of the first countries to develop an evaluation society presence. The UKES was established in 1994 and promotes the development of both evaluation theory and practice, offering national and regional opportunities for evaluators, academics, policy makers and other stakeholders to discuss evaluation. UKES' main event is an annual evaluation conference which sees a diverse range of evaluators come together to network, and share and debate practice. In 1996, the UKES' membership was approximately 200 (Adelman 1996). Today there is a healthy population of both institutional and individual members. Membership numbers of the UKES are certainly no proxy for the size of evaluation in the UK. Given the nature of evaluation work there are many professionals who undertake evaluation but who would not identify primarily as an evaluator, or who continuously engage with or pay membership to a professional society such as the UKES. Indeed, Adelman (1996) noted that there are few 'devoted' evaluators in the UK. Several evaluators interviewed for the purpose of this chapter were unaware of the UKES and did not recognise evaluation as a specific discipline. Instead, they perceived evaluation as not dissimilar to other research assignments. This was an interesting discovery and is perhaps worthy of exploration in the future. Regional evaluation networks also exist inside and outside of the UKES, as well as sector-specific communities of practice who explore evaluation and evidence (such as the Charity Evaluation Working Group and the CGEG).

There are a number of practitioner and academic publications that specialise in evaluation. 'The Evaluator', a UKES publication, explores professional activity, updates and special themes (such as complexity and evaluation) relating to evaluation. The 'Evaluation' journal, edited by Elliot Stern of the Tavistock Institute (London, UK), is a well-respected and peer-reviewed international evaluation journal. The Chartered Association of Business Schools rates 'Evaluation' as a 2* journal, meaning that it is considered of an acceptable standard with original research (Chartered Association of Business Schools 2018). Other journals with an evaluation focus exist in specific fields, for instance, 'Assessment and Evaluation in Higher Education', in the higher education sector. Academic journals on evaluation are heavily occupied with case studies and methodological papers, although there has been some growth in contributions on evaluator development and professional practice. Evaluators include freelancers, consultancy firms, internal evaluators (often administrative staff who collate monitoring data), university staff (academics and researchers in dedicated research and evaluation centres) and sometimes doctoral students. Within academia, some professorial roles in evaluation have been observed.

4.3 Compliance to Standards and Quality Obligations

The UKES developed 'Guidelines for Good Practice in Evaluation' and a 'Capabilities Framework' (both recently updated), both of which support the quality of evaluation in the UK. The Voluntary Evaluator Peer Review (VEPR) Scheme, an innovative pilot to support the development of evaluators, has also been jointly delivered by the European Evaluation Society and UKES. UKES have since continued the scheme, holding a 2019 round of peer review. Amongst evaluation networks, there is recognition of the varied quality of evaluation commissioning. Explanations for this include a lack of evaluation knowledge by commissioners, and the insincerity toward evaluation by some commissioners. Pawson and Tilley's (1997, p. 14) assertion that "he who pays the researcher calls the methodological tune" is true in some cases, with quite rigid expectations from some evaluations, and some evaluators describing how commissioners have influenced the direction of evaluations. There are many efforts to improve the quality of evaluation commissioning, and support materials are available on the internet to help professionals in a variety of sectors (particularly health-related).

5 Conclusion and Outlook

The emergence of new public management provided a stronger rationale for evaluation and resulted in evaluation gaining a greater presence in a range of public organisations. The UK has certainly established an evaluative culture. However, there are ongoing concerns and threats to the legitimacy of evaluation in the UK, and, at times, evaluation as a discipline is overlooked. The 'Brexit' referendum outcome revealed an erosion of trust in experts and in evidence, with key politicians and voices from the public shunning the advice of economists and other industry experts (who were voicing concern about the impact that a 'leave' decision could have on the UK economy). At a time when evaluation already struggles for legitimacy, particularly in relation to having its results used by policy-makers, a weakening of 'epistemic trustworthiness' (see Hendriks et al. 2015) is a worrying observation. Evaluation, like many sectors, will need to grasp the technological changes of the digital era; the use of social media and more audiovisual means of communication offers opportunities for evaluation findings to be shared more effectively, overcoming criticisms that evaluation reports are overly long, unappealing to various audiences and do not aid transparency as they should (Sivarajah et al. 2016). The quantitative paradigm currently occupies evaluation in many policy areas, suiting various political agendas (numbers are thrown around regularly in UK politics). The use of RCTs, for instance, is considered by some to provide a rigorous 'gold standard' methodology, pleasing those who seek to demonstrate the quality of their research (for the Research Excellence Framework, in the case of academic evaluators). Might it be that quantitative research has finally won the evaluation paradigm war in the UK public policy arena?

References

Adelman, C. (1996). Anything goes: Evaluation and relativism. Evaluation, 2(3), 291–305. Age UK. (2018). Evaluation reports. https://www.ageuk.org.uk/our-impact/policy-research/publications/evaluation/. Accessed on 30 January 2019. Audit Scotland. (2012). Local government self-evaluation arrangements (2011/12). http://www.audit-scotland.gov.uk/work/scrutiny/docs/as_1112_councils_self_evaluation.pdf. Accessed on 31 January 2019.


Authority of the House of Lords. (2017). Stronger charities for a stronger society: Report of session 2016–2017. https://publications.parliament.uk/pa/ ld201617/ldselect/ldchar/133/133.pdf. Accessed on 31 January 2019. Bailey, S., Checkland, K., Hodgson, D., Mcbride, A., Elvey, R., Stephen, P. M., et al. (2017). The policy work of piloting: Mobilising and managing conflict and ambiguity in the English NHS. Social Science and Medicine, 179, 201–217. Bovaird, T., & Loeffler, E. (2007). Assessing the quality of local governance: A case study of public services. Public Money & Management, 27 (4), 293–299. Brecknell, S. (2016). DCLG blasted over troubled families report delays and “tick box” approach. Civil Service World. https://www.civilserviceworld.com/print/ 5135. Accessed on 31 January 2019. Bristow, D., Carter, L., & Martin, S. (2015). Using evidence to improve policy and practice: The UK what works centres. Contemporary Social Science, 10(2), 126–137. Brown, T. (2016). Evidence, expertise, and facts in a “post-truth” society. British Medical Journal, 355, i6467. Chartered Association of Business Schools. (2018). Academic Journal Guide. https://charteredabs.org/academic-journal-guide-2018-view/. Accessed on 30 January 2019. Chelimsky, E. (2008). A clash of culture: Improving the ‘fit’ between evaluative independence and the political requirements of a democratic society. American Journal of Evaluation, 29(4), 400–415. Demir, H. (2000). The role and treatment of political parties in liberal democracies with reference to the United Kingdom, Turkey and the European Convention on Human Rights (PhD thesis). University of Leeds, Leeds, UK. Department for International Development. (2013). Evaluation reports. https:// www.gov.uk/government/collections/evaluation-reports. Accessed on 30 January 2019. Devanney, C., & Uglebjerg, A. (2009). Can ethnography complement evaluation within new public management? Case studies from England and Denmark. Paper presented at the Liverpool-Keele Symposium. https://www.liverpool.ac. uk/media/livacuk/schoolofmanagement/docs/abstracts/ethnography2009/ Devanney.pdf. Accessed on 30 January 2019. Ellis, J., & Gregory, T. (2008). Accountability and learning: Developing monitoring and evaluation in the third sector (Research Report). London: Charities Evaluation Services. Flinders, M. (2011). Devolution, delegation and the Westminster model: A comparative analysis of developments within the UK, 1998–2009. Commonwealth and Comparative Politics, 49(1), 1–28. Garnett, M., & Lynch, P. (2013). Exploring British politics. Harlow: Pearson. Grayson, R. S. (2010). The liberal democrat journey to a lib-con coalition: And where next? London: Compass.


Greenaway, L. (2016). Evaluation that empowers: An iterative investigation of how organisations can become evaluation-minded (PhD thesis). University of Dundee, Dundee, UK. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos. 685626? Accessed on 31 January 2019. Hansson, F. (2006). Organizational use of evaluations: Governance and control in research evaluation. Evaluation, 12(2), 159–178. Hendriks, F., Kienhues, D., & Bromme, R. (2015). Measuring laypeople’s trust in experts in a digital age: The Muenster epistemic trustworthiness inventory (METI). PLoS One, 10(10): e0139309. https://doi.org/10.1371/journal. pone.0139309. Accessed on 30 January 2019. Henkel, M. (1991, Spring). The new evaluative state. Public Administration, 69(1), 121–136. Hird, J. A. (2005). Power, knowledge and politics: Policy analysis in the states. Washington, DC: Georgetown University Press. HM Treasury. (2011). The magenta book: Guidance for evaluation. https://www. gov.uk/government/publications/the-magenta-book. Accessed on 30 January 2019. Holmwood, J. (2015, November). Slouching toward the market: The new green paper for higher education part 1. Campaign for the Public University. http://publicuniversity.org.uk/2015/11/08/slouching-toward-themarket-the-new-green-paper-for-higher-education-part-i/. Accessed on 31 January 2019. Howlett, M. (2014). From the ‘old’ to the ‘new’ policy design: Design thinking beyond markets and collaborative governance. Policy Sciences, 47 (3), 187–207. Independent Commission for Aid Impact. (2014). How DFID learns (Report 34). https://icai.independent.gov.uk/wp-content/uploads/HowDFID-Learns-FINAL.pdf. Accessed on 31 January 2019. Koljonen, H. (2012). Building evaluation capacity into a large public sector emergency service: An action research study (PhD thesis). University of Salford, Manchester, UK. http://usir.salford.ac.uk/29434/. Accessed on 30 January 2019. Little, M., & Axford, N. (2012). Plurality and rigour. Journal of Children’s Services, 7 (3). https://doi.org/10.1108/jcs.2012.55207caa.001. Accessed on 31 January 2019. Mackenzie, M., Blamey, A., & Hanlon, P. (2006). Using and generating evidence: Policy makers’ reflections on commissioning and learning from the Scottish Health Demonstration Projects. Evidence & Policy: A Journal of Research, Debate and Practice, 2(2), 211–226. Medical Research Council. (2018). Evaluating research outcomes. https://mrc.ukri. org/successes/evaluating-research-outcomes/. Accessed on 31 January 2019.


National Audit Office. (2013). Evaluation in government (Report). https:// www.nao.org.uk/wp-content/uploads/2013/12/10331-001-Evaluation-ingovernment_NEW.pdf. Accessed on 31 January 2019. National Health Service. (2016). Evaluation strategy for new care model vanguards. https://www.england.nhs.uk/wp-content/uploads/2015/07/ncmevaluation-strategy-may-2016.pdf. Accessed on 31 January 2019. Office for Civil Society. (2018). About us. https://www.gov.uk/government/ organisations/office-for-civil-society/about#our-priorities. Accessed on 30 January 2019. Office for National Statistics. (2017, June 28). Population estimates. https:// www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/ populationestimates. Accessed on 10 May 2019. Osborne, S., Radnor, Z., & Strokosch, K. (2016). Co-production and the cocreation of value in public services: A suitable case for treatment? Public Management Review, 18(5), 639–653. Pawson, R., & Tilley, N. (1997). Realistic evaluation. London: Sage. Payne, D. A. (1994). Designing educational project and program evaluations: A practical overview based on research and experience. Boston, MA: Kluwer. Pollitt, C., & Bouckaert, G. (2011). Public management reform. Oxford, UK: Oxford University Press. Regulatory Policy Committee. (2018). About us. https://www.gov.uk/ government/organisations/regulatory-policy-committee/about. Accessed on 30 January 2019. Sanderson, I. (2000). Evaluation in complex policy systems. Evaluation, 6(4), 433– 454. Scottish Enterprise. (2018). Evaluations online: Home. http://www. evaluationsonline.org.uk/evaluations/Index.do. Accessed on 30 January 2019. Scottish Government. (2007). Crerar report: Independent review of regulation, audit, inspection and complaints handling of public services in Scotland. http:// www.gov.scot/Publications/2007/09/25120506/0. Accessed on 30 January 2019. Sivarajah, U., Weerakkody, V., Waller, P., Lee, H., Irani, Z., Choi, Y., et al. (2016). The role of e-participation and open data in evidence-based policy decision making in local government. Journal of Organizational Computing and Electronic Commerce, 26(1–2), 64–79. Sparrow, A. (2017, November 1). MPs set to pass labour motion saying Brexit impact reports should be published as Tories abstain—Politics live. The Guardian. https://www.theguardian.com/politics/blog/live/2017/nov/ 01/damian-green-should-consider-standing-aside-while-really-serious-claimsare-investigated-says-tory-colleague-politics-live. Accessed on 31 January 2019. Stern, E. (2008). Evaluation: Critical for whom and connected to what? Evaluation, 14(2), 249–257.


UK Parliament. (2016). Committee calls for a comprehensive review of national planning policy. https://www.parliament.uk/business/committees/committeesa-z/commons-select/communities-and-local-government-committee/newsparliament-2015/national-planning-policy-report-published-15-16/. Accessed on 30 January 2019. University of Liverpool. (2018). Masters in public health. https://www.online. liverpool.ac.uk/programmes/master-of-public-health/structure. Accessed on 30 January 2019. Weiss, C. (1993). Where politics and evaluation research meet. Evaluation Practice, 14(1), 93–106. White, A., & Dunleavy, P. (2010). Making and breaking Whitehall departments: A guide to machinery of government changes. https://www. instituteforgovernment.org.uk/sites/default/files/publications/making_ and_breaking_whitehall_departments.pdf. Accessed on 30 January 2019. Wond, T. (2017). Trust matters: Distrust in an external evaluation of a public sector program. International Journal of Public Administration, 40(5), 408–415. Wond, T. (2018). Evaluation for what purpose? Findings from two stakeholder groups. In D. Crowther, S. Seifi, & T. Wond (Eds.), Responsibility and governance: The twin pillars of sustainability (pp. 73–86). Singapore: Springer.

PART IV

Southern Europe

CHAPTER 11

Italy

Erica Melloni

1 General Country Overview

If one were to describe three main issues of the Italian public sector, one could mention its large, but widely deemed inefficient, administrative apparatus; an excessive administrative burden; and the predominantly legal background of public officers, who are more attentive to procedural compliance than to the results of public action. Within this context, policy evaluation and evaluators have struggled to be recognised as a resource in Italy, even though the last twenty years have shown encouraging changes. First of all, the public sector constituted for many years, and particularly between the 1960s and 1980s, a source of public employment for an identifiable part of the population. This was especially, but not solely, the case for the southern regions, characterised by lower levels of development and higher levels of unemployment compared to the more industrialised northern Italy. The rise of the public debt (which evolved from 35% of gross domestic product in 1970, a level below the European average, to 100% at the end of the 1980s) compelled the state to foster, since the 1990s, a number


of political reforms aiming at reducing the size of the public sector. Two main levers have been used. The first lever was the privatisation or corporatisation of some sectors of the state and, more generally, a wave of reforms (the most important ones being the so-called 'Bassanini laws') aiming at introducing some New Public Management (NPM) principles within the public administration. As two main examples, in 1992 the state railways, an entity whose workforce had peaked at 235,000 employees in 1980, was turned into a company 100% owned by the Ministry of Economy and Finance. In the same year, a reform of the health system entrusted the regions with the coordination of this policy domain. It also modified the legal nature of local health units, and of some hospitals, according to private-sector management principles. A second lever was a severe limitation on the public spending of the administrations, through various tools, including early retirement measures and a widespread and lengthy block on recruitment. According to Marra (2017), the workforce of the Italian public sector downsized from 16.2% in 1990 to 14% in 2015, totalling 3.2 million employees today. A second problematic issue is overregulation and an excessive administrative burden, for both enterprises and citizens. According to an estimate of the Italian House of Representatives research unit, there are 100,000 laws in Italy, of which 18,000 relate to the regional level and 10,000 are derived from European regulations. Simplification of the body of laws and, more importantly, the reduction of red tape have often been claimed as crucial goals in both the political debate and several national and regional programmes. The initiatives in this respect have been relevant but still not enough. Among these initiatives, as two examples, it is worth recalling self-certification, which has eliminated three quarters of the certificates and 85% of the signature authentications required by government offices (Bassanini 2013). Moreover, high investment in information technologies (IT) has allowed Italian society to take a relevant step forward in the way people can interact with the public administration. Among the consequences, other than faster procedures, IT has made available a large amount of information on the websites of the public administrations in the form of open data, something that would have been simply unbelievable at the beginning of this century. Some dynamics go in the opposite direction, too, such as the growing attention to the problem of corruption and to the performance management of the administrations. These issues, which are per se of crucial importance, have also turned into a higher level of administrative


requirements and controls and, as a consequence, into new burdens for the public officers dealing with them, for the enterprises interacting with the public administrations (e.g. in the case of public tendering) and for citizens as a whole. Third, the recruitment of public officers is based on comparative procedures that have traditionally favoured a solid knowledge of the regulatory framework in which the candidate is going to work. Public officials with a background in policy analysis, economics or statistics (with the exception of the more technical profiles) are still a minority in the Italian public administration. University courses on these topics are growing and, in consequence, so are the numbers of graduates. However, the block on public administration recruitment has slowed down the generational and cultural change that is deemed essential for a true modernisation of the sector. Within this picture, policy evaluation and evaluators in Italy only appeared at the end of the 1990s. This first wave of evaluation was pushed by two main factors: a political agenda inspired by NPM, and the requirements of the European Union (EU) in terms of the evaluation of public investments. As for the political agenda, during the 1990s Italy was attempting to reform the balance of powers of the state, attributing more autonomy to the local levels vis-à-vis the central state. This orientation peaked in 2001 with a constitutional reform that provided the territorial administrations with some powers previously entrusted to the national government. Beyond the revision of the constitutional framework, the process of decentralisation required strengthening the role of managers, and evaluation was among the tools envisaged to this end. It is worth recalling, in that period, the 'Bassanini laws' (1997–1999), which imported into the public administrations methods of management and labour relations long in use in the private sector (and in Anglo-Saxon administrative systems), such as strategic planning, management control, performance evaluation, accountability, merit and customer satisfaction. These pivotal reforms did not become embedded in the general culture of the public administration. They represented, however, the push for some forerunner public managers and administrations (e.g. some municipalities, universities, health units) to build their own performance management systems. The EU has been one of the main carriers of evaluation in Italy, thanks to the introduction of mandatory evaluations for the national and regional programmes implementing the Cohesion Policy. This provision was also at


the origin of the growth of a market in evaluation, composed of companies competing for evaluation tenders. In the same period, the Italian Society for Evaluation (Associazione Italiana di Valutazione—AIV) was founded, with the goal of spreading the culture of evaluation in Italy. The last twenty years have shown a general, even if uneven, growth in the use of evaluation in different policy domains. This institutionalisation has been brought forward through new regulations introducing evaluative requirements, and through the creation of institutions or processes focusing on evaluation in various sectors, such as health, schools, universities, EU programmes and public investments and, more recently, within the judiciary system and in the non-profit sector. This growth has also fostered the creation of evaluation units within (some of) the organisations, adding new competencies to those held by the private consultants and research institutes providing evaluative services. The following sections are dedicated to describing the main elements of this growth.

2 Institutional Structures and Processes (Political System)

2.1 Evaluation Regulations

In Italy, legislative power is shared between the national level (constitution and ordinary laws) and the regional level (regional laws). The 20 Italian regions have a considerable level of autonomy from the national state,1 even though recent laws have restrained this autonomy. Regions hold legislative power on all subjects not reserved by the constitution to the state.2 The EU represents a third level of regulation and, as far as evaluation is concerned, has been one of the main drivers for the introduction and diffusion of evaluation in Italy. Evaluation can be traced in different tiers of regulation and in different policy sectors. The following paragraphs are dedicated to drawing

1 Five regions (Sicily, Sardinia and other border regions) out of 20 have a special status, and their legislative power is greater than that held by the other regions.
2 The Italian constitution (Article 117) also identifies a conspicuous number of topics of shared competence between regions and state, thus generating a large number of conflicts of jurisdiction between the two levels.


a picture of the main regulations introducing or consolidating the use of evaluation in different sectors.

The Regulatory Impact Assessment (RIA)

At the national level, evaluation is considered within the framework of the RIA (in Italian: Analisi di Impatto della Regolazione), which was introduced as an experiment in 1999 (Law 50/99). The goal of the RIA is to analyse, from an ex ante perspective, the possible impacts of laws drafted by the national government and of secondary regulations issued by ministries. In 2003, the RIA was also extended to the independent authorities (Law 229/2003), which hold regulatory power in different economic fields (energy, communication, anti-corruption, transport, etc.). After a long experimental phase with uncertain results, the Simplification Law 246/05 foresaw a general application of the RIA regime to all bills of law.3 However, the subsequent implementation guidelines issued in 2008 exempted 'more complex laws' from the application of the RIA, de facto weakening the entire system (Martelli 2009). A special office, the DAGL (Dipartimento per gli Affari Giuridici e Legislativi—Department for Legal and Legislative Affairs), is entitled to coordinate RIAs initiated by governmental branches and to require integrations to RIA drafts in order to improve their quality. According to the Study Office of the Senate (Senato della Repubblica 2017), between 2007 and 2016 the Italian government undertook 1283 RIAs. Despite this number, the content of RIAs is not always considered properly evaluative, being more similar to a technical presentation of the regulation they are annexed to, even though the quality seems to have improved in recent years, as the Study Office has shown. A revised version of the guidelines for the impact report accompanying draft regulations was issued by the government in February 2018 (Directive of the President of the Council of Ministers, 16 February 2018).

Most of the Italian regions have also activated the RIA scheme, with varying results. One of the most relevant experiences has been promoted by the Lombardy region, where the statutory law, renewed in 2008, sets out that a regional law shall define the policy areas for which a technical analysis or an RIA is required. The reform of the statutory law paved the way for an intense programme of simplification, begun in 2010, using different tools including a revised form of regulatory analysis, a fact that made Lombardy the region with the smallest number of laws in Italy.

3 Law 246/2005 aimed at simplifying the excessive number of laws in Italy—more than 100,000—of which 18,000 were regional laws and 10,000 were European laws, according to a study of the House of Representatives. The Simplification Law foresaw several measures for reducing the number of laws, among which were the abolition of laws promulgated before 1970 (with some exceptions) and the introduction of the RIA for bills of law.

Evaluation and Use of Evaluation Findings Embedded in Legislative Structures

Except for the above-mentioned provisions, which refer mainly to the regulatory quality domain, evaluation is not yet rooted within the legislative process, especially at the national level. Evaluation findings only rarely fuel the public debate, except, sometimes, for the discussion on the level of spending of European funding. A step further towards the institutionalisation of evaluation within the parliament was the creation of an Impact Assessment Office within the Senate of the Republic (one of the two chambers of the Italian parliament) in 2015. The Senate Impact Assessment Office (SIAO) is chaired by the Senate president and steered by the four vice-presidents of the Senate. The SIAO has no assigned budget; its working group is composed of various Senate department officials competent in public policy analysis and evaluation. The mission of the SIAO is to disseminate, develop and foster a policy evaluation culture in government. To this end, the office conducts impact analyses and public policy evaluation studies. In its two-year history, the SIAO has delivered several studies, all publicly accessible from a dedicated website.4 Even though the SIAO is at its beginning stages, its studies have attracted some attention from the media, and several journal articles have used the SIAO's evaluation findings. At the regional level, evaluation is embedded within legislative activities thanks to the mechanism of the evaluation clause. Since 2000, with Tuscany and Piedmont as forerunner regions, many regional laws have included an evaluation clause, that is, a specific law article assigning an information mandate to collect, process and finally communicate a series of selected information to legislative bodies on the results of the law after its implementation.

4 More information can be found at: http://www.senato.it/4783?testo_generico=1299.


Knowledge on the advancement of evaluation within the regional legislative assemblies is collected and diffused by Progetto Capire,5 a project promoted by the legislative assemblies of the regions with the mission of promoting the culture and use of policy evaluation in legislative assemblies.

Sectoral Regulations on Evaluation

2.1.1 The Environmental Impact Assessment

The Environmental Impact Assessment (in Italian, Valutazione di Impatto Ambientale—VIA) is among the earliest forms of evaluation seen in Italy. The VIA has been regulated by Law 349/1986, following European Directive 85/337/CEE; in the same year, the Environment Ministry was established for the first time. In 2006, the Strategic Environmental Assessment (Valutazione Ambientale Strategica—VAS) was also introduced, following EU Directive 2001/42/CE. The idea underpinning the VAS is that the protection of the environment should be included from the beginning in programmes and plans with a potential impact on the environment. Legislative decree 152/2006 also reorganised the most relevant legislation on the subject, with the goal of overcoming weaknesses of the procedure identified by a screening of the directive introduced by the EU in 2003.

2.1.2 Evaluation of Public Investments at the National and Regional Level

At the end of the 1990s, the European Regional Policy Funds expected in Italy were worth EUR 64,295 million for the period 2000–2006, comprising EU, national and private funding.6 This massive investment entailed complex problems of managing and accounting for the funds, tendering procedures, implementation and, of course, evaluation of the investments, to be performed by the managing authorities entrusted with the funds. In this context, the unit for the evaluation of public investments (Unità di Valutazione degli Investimenti Pubblici—UVAL) was created as a branch of the newly created Department for Cohesion Policy.7 UVAL is entrusted with specific competencies in terms of the evaluation of programmes. Moreover, in the same period the Italian government funded the NUVAL project, with the goal of providing training and technical support and fostering the implementation of units for the evaluation and monitoring of public investments within each national and regional administration, as required by Law 144/99 (Table 1).

Other units for the evaluation and monitoring of public investments have been progressively established within most of the administrations. These units provide technical support to programming and evaluation activities in the ex ante, ongoing and ex post phases, with reference to regional, national and/or EU-funded programmes. Their composition and functions generally follow the specific regulations of the administration to which they belong. Membership can be internal, external or both; when external evaluators are foreseen, public tendering is undertaken to acquire the professional competencies (thematic and/or methodological) that meet the specific evaluation needs of the administration. The NUVAL cooperates with the national statistical office and with the other units for the evaluation and monitoring of public investments established at the national and regional level under Law 144/99 and involved in the programming, monitoring and evaluation of development policies, funded either by the Cohesion Policy or by national funds. Moreover, the NUVAL cooperates with the National System of Evaluation (Sistema Nazionale di Valutazione—SNV), which holds a special focus on education.

5 More information can be found on the project website: www.capire.org.
6 In Italy, the territorial development programmes of the 2000–2006 period were called Nuova Programmazione—New Programming. According to the Study Service of the House of Representatives (Servizio Studi della Camera dei Deputati 2005), the overall financial amount for the 2000–2006 programming period was EUR 64,295 million; in the period 2007–2013, the total amount was EUR 59,413 million (Open Coesione 2017, data on the certified expenditure for the period 2007–2013).
7 For a complete reconstruction of the creation of the Department for Cohesion Policy, see Melloni and Pasqui (2009).

2.1.3 Evaluation in Education and University Sectors

Specific bodies for policy evaluation also exist within the education and university sectors. The National Institute for the Evaluation of the Education and Training System (Istituto Nazionale per la Valutazione del Sistema Educativo di Istruzione e di Formazione—INVALSI) is a research agency, established in 1999 at the premises of the Ministry of Education, in charge of evaluating the efficiency and effectiveness of the education system. INVALSI is part of, and coordinates, the SNV. Moreover, it performs


Table 1 The units for the evaluation and monitoring of public investments created within the Italian regional administrations, according to Law 144/99: year of institution and membership

Region                         Year of institution   Number of members
Emilia Romagna                 1999                  12
Lombardia                      2000                  25 (max.)
Calabria                       2000                  10
Sicilia                        2000                  25
Basilicata                     2000                  11
Piemonte                       2001                  13
Toscana                        2001                  16
Veneto                         2001                  3
Friuli Venezia Giulia          2001                  5
Marche                         2001                  3
Abruzzo                        2001                  12
Molise                         2001                  5
Campania                       2001                  6
Liguria                        2001                  5
Puglia                         2002                  7
Umbria                         2002                  8
Bolzano (regional province)    2003                  n/a
Valle d'Aosta                  2005                  18
Sardegna                       2006                  7
Trento (regional province)     2013                  11
Lazio                          2014                  15

Composition of members: internal, external or both, varying by region.



Source: Author's elaboration from information collected through the regional administrations' official websites

periodic and systematic checks on students’ knowledge and skills and on the overall quality of education as well as training institutions’ educational offerings. INVALSI has a staff of about 87. The National Agency for the Evaluation of the University and Research System (Agenzia Nazionale di Valutazione del sistema Universitario e della Ricerca—ANVUR) established by law in 2006, is the agency entrusted

282

E. MELLONI

with the evaluation of the universities and research entities. The birth of the ANVUR is part of a reform process already underway for over a decade, aimed at balancing the autonomy of the universities and at submitting them to procedures for evaluation and accountability (Graziosi 2018). Universities and research institutions enjoy significant management autonomy but are subject to periodic external evaluation, based on transparent and codified procedures, entrusted to the ANVUR. The ANVUR evaluations cover all the main activities of the institutions: research, teaching (carried out only by universities), technology transfer and knowledge (referred to as ‘third mission’) and the performance of administrative staff. ANVUR performs external assessment of the quality of activities of universities and research entities entitled to public funding. Among its activities, ANVUR evaluates processes, outputs and outcomes of universities, defines criteria and methodologies for the assessment of colleges and study courses, and elaborates procedures for assessing satisfaction of the students. Moreover, ANVUR has drafted guidelines for the universities to implement performance-related documents required for public administrations. ANVUR was initially composed of 18 people, but the staff has doubled in accordance with Law 232/2016.8 In 2011–2013 ANVUR carried out a first research quality assessment exercise, which involved about 185,000 products presented by the universities and research institutes, covering the years 2004–2010. The results of the exercise, deemed to be one of the most extensive so far conducted in Europe, have been used for the assignment to universities of the performance-related part of the university’s ordinary fund. In 2016, the agency was engaged in a second evaluation exercise on products published in the 2011–2014 period. Law 232/2016 established the five-year frequency of such exercises; the next evaluation will therefore cover the period 2015–2019. 2.1.4

Evaluation of the Performance of Public Administration and Public Managers In the last ten years, the evaluation of the performance of public administrations and public managers has achieved a central place within the national political agenda. However, the topic was not new, with several reform laws issued since the 1990s, and several experiments carried out by forerunner 8 More information about this on the ANVUR website: http://www.anvur.it/anvur/ struttura-organizzativa/.

11

ITALY

283

Italian administrations. Some sectors, such as the national health system, characterised by a profound corporatisation process, have developed wellestablished performance management systems for years, while others have only recently approached evaluation (see following paragraphs). The profound differences among the regional and local administrations’ experience towards evaluation is partly due to their high level of autonomy from the central state, favoured by a constitutional framework (reform of the ‘Title V’ of the Italian Constitution issued in 2001) giving special power to territorial administrations. Since the first decade of 2000, several governments, both right- and left-oriented, have fostered re-centralisation measures. In 2009, the legislative decree 150/2009 ‘Brunetta’9 followed a political and media campaign fighting the ‘fannulloni’ (‘slackers’), as the Italian public officers (better said, a part of them) were called. The Brunetta reform of 2009 aimed at reorganising and strengthening the performance management systems of public administrations, according to a principle of effectiveness and efficiency. The characteristics of the reform are shown in Table 2. The Renzi government (2014–2016) launched a new reform of the public administration, the ‘Madia’ reform (Law 124/2015). Among other provisions, the ‘Madia’ reform abolishes the CIVIT and tasks two different organisations with its coordination activities: the DFP of the Presidency of the Ministers (tasked with the performance-related assignments) and the authority dealing with anti-corruption (Autorità Nazionale Anti Corruzione—ANAC). The Renzi government then collapsed, after failing a referendum on a proposed constitutional reform10 in 2016, which among other things, was expecting to task the Senate with the policy evaluation function on laws drafted by the House of Representatives (only the already mentioned Impact Assessment Office within the Senate has been implemented). 9 Named after the minister of the Berlusconi government issuing the decree 150/2009. 10 The Constitutional Reform, that was finally rejected by the Referendum of 4 December

2016, aimed at reforming the Senate in order to change its dimension (from 315 to 100 members) and functions: from a perfect equivalent of the House of the Representatives, to a non-elected body formed by representatives of the regions, with competencies on a restricted list of topics. A policy evaluation function was also attributed to the reformed Senate. The House of the Representatives was the sole custodian of the legislative power. The reform also led to the abolition of the provinces and the absorption of functions and personnel within the regions.

Legislative decree 150/2009 ‘Brunetta’

Regulation

– It creates a coordinating body (the CIVIT—Commissione per la Valutazione, la Trasparenza e l’Integrità delle amministrazioni pubbliche b ) at national level, with sanctioning power over the administrations lagging behind in the application of performance requirements

– It introduces a complete performance cycle (yearly composed as a performance plan and performance report) into each administration, focusing on the goals to be attained, and the results (to be) achieved – It creates (within each administration) an independent assessment body (the OIV—Organismo indipendenti di Valutazione, independent evaluation units),a formed by one to three external members tasked with the role to oversee and steer the correct implementation of the performance cycle

Elements of the performance system

Main novelties of the regulation Transparency and anti-corruption – It introduces a quite long series of transparency and anti-corruption provisions (plans and information to be mandatory published on the administrations’ own website, following a specific information tree and all condensed within a ‘transparency page’)

Performance-pay system – The performance-pay system is based on the assessment of results and behaviours of managers and other personnel

Table 2 Main elements of the regulations of 2009 and 2015 on the public administrations’ performance

284 E. MELLONI

Main novelties of the regulation

– The citizen evaluation (through customer satisfaction or other forms of evaluation) is introduced as a mandatory component of the performance system

– A national list of those entitled to be selected as members of the OIVs is created (the application is sifted by the DFP; annual training is mandatory to stay on the list)

– A new website at the national level is created, dedicated to performance, which includes a list of all documents produced by the administrations in accordance with the performance management system and a database of the comparative procedures for selecting members of the OIV

– It tasks the Public Function Department (DFP) of the Presidency of the Ministers of the performance-related assignments previously assigned to the CIVIT, which is abolished

Elements of the performance system

– The ANAC (Autorità Nazionale Anti Corruzione, the authority dealing with anti-corruption) is created

– The existence of a proper performance system within each administration is required as a conditio sine qua non for the administration itself to deliver results-related wages and salary increases, and to assign responsibilities to personnel

The Freedom of Information Act— FOIA (translated in Italian as ‘accesso civico generalizzato’) is introduced

Transparency and anti-corruption

Performance-pay system

Source Author’s elaboration a The OIVs took over from the evaluation units, which were formerly present in various administrations. OIVs are mandatory for all the State entities, while local administrations may decide to maintain the evaluation units, which partially respond to different regulatory requirements b The CIVIT created in 2009 was entrusted with coordinating both the performance and anti-corruption requirements. The CIVIT was eliminated in 2015 and its functions transferred to the Public Function Department—DFP for the performance-related topics, and ANAC for the anti-corruption topic

Law 124/2015 (‘Madia’ reform)

Regulation

11 ITALY

285

286

E. MELLONI

Various scholars have pointed out that the ambitious Brunetta reform of 2009 has been in practice less effective than expected (Bonini Baraldi 2014; Rebora et al. 2015). Strongly inspired by the principles of NPM, the reform was initially characterised by “a corporate approach and a slightly punitive taste” towards administrations and especially the staff (Busetti and Dente 2014, p. 226). In fact, the sanctioning message of the reform (the fight against the ‘slackers’), used as a ram at the time of the introduction of the legislative decree, was rapidly watered down. The link between the detection of the annual performance of the organisation and the assessment of individual merit, which was also a symbolic pivot of the reform, fell victim to the cuts in public budgets and limited, if not in fact cancelled, the possibility of using individual rewards as a strategic lever for performance management. The ‘Madia’ reform launched in 2015 tried to re-affirm a stronger connection between organisational performance and individual rewards, and to create a trained group of evaluators overseeing performance and individual assessment. However, the change of government in 2018, with a coalition composed of Lega Nord and the Movimento Cinque Stelle, has again changed the national framework. The agenda of the new minister, Giulia Bongiorno, concerning evaluation-related issues focuses on two main points (Ministro per la Pubblica Amministrazione 2018). The first one refers again to the evaluation of public managers, and the fact that too often the evaluation assigns the maximum score available on the scale. The second point refers to the independency of the experts enrolled within the OIVs. The criticism attests to the fact that the OIV experts are selected and contracted by the top of the administration (sometimes by the elected representative). This procedure would not guarantee the necessary independency of the evaluators. The solution envisaged is to change the system in the sense of creating a totally external evaluation body, with no links with the administration to be evaluated. This position, however, still presents some critical issues (in terms of costs, administrative burdens, informative asymmetry, etc.). Basically, the alternative is between a conceptualisation of the evaluation function as a support of the organisation’s management and development system, from one side; or as a tool for checking, control and compliance from the other side. The oscillation between these two positions is mainly due to the difference level of ‘maturity’ of the public administrations under observation. According to the writer of this article, the time has come for differentiating the evaluation architectures according to the development

11

ITALY

287

of appropriate management systems and of the level of the results achieved. This would allow the best performers to organise themselves in the best way, and, on the contrary, to impose more rigid control systems for the organisations lagging behind. Even though the performance reform(s) has shown various weaknesses, some progressions are worth pointing out. The first progress is the introduction of a complete cycle of performance that involves a series of steps and documents to be produced annually (mainly, the plan and the performance report), based on triennial objectives and performance indicators to evaluate their achievement. Second, the reforms fostered a greater transparency, through the obligation to make a whole series of information on the functioning and expenses of the organisation easily accessible to the public. Third, a new group of evaluators (the people included whose application to the national list has been approved and are then entitled to be selected as members of the OIVs) has been identified. At the time of writing, the establishment of a training route for the OIVs members is ongoing, with many organisations applying to the central body coordinating the performance cycle, the DFP, for the development of courses offering necessary credits to the 3419 OIVs members11 to remain on the list. The Italian Evaluation Society has recently been awarded as one of the providers for training. 2.1.5 Evaluation Within the Italian Judiciary System Despite its conspicuous dimension,12 the judiciary system is one of the sectors in which evaluation entered more recently. This happened together with a new conceptualisation of the judicial offices as ‘organisations’, to be efficiently managed as all the other public institutions. Until the end of the 1990s, the rigorous selection of judiciary personnel, its continuous training and evaluation for career purposes, as well as a strict independence and autonomy of the single magistrate from its organisation were pillars of the dominant paradigm of the judicial system (Vecchi 2018). It is only after 2000 that the judiciary system began to search for new

11 The number of experts included within the National List of Experts for the OIVs. 12 The judiciary sector was composed in 2013 of nearly 9000 magistrates, and 39,000

administrative staff; the overall financial resources for the sector were about EUR 3.4 billion; the judiciary procedures, both civil and criminal sectors, were EUR 7 million (Performance Plan of the Judiciary Sector 2013).

288

E. MELLONI

solutions to its even harsher problems: the growth of the demand for justice and the excessive length of procedures, in particular in the civil sector. The answers to these problems were of different kinds13 : among them, the goal of implementing a monitoring system of the judiciary process, also thanks to the introduction of information and communication technologybased solutions through weighty modernisation programmes.14 In particular, since 2012 the organ of self-government of the judiciary (Consiglio Superiore della Magistratura—CSM) started to coordinate a programme to reduce the over three-year backlog in the civil sector, which obliges each court to formulate an annual management programme of the backlog (on the basis of Decree Law 98/2011). Those programmes initiated many judiciary offices in their activities of analysis, identification of gaps, re-engineering of processes and were often anticipated by self-assessment analysis based on the Common Assessment Framework, the method for the evaluation of the public organisation’s quality developed by the European Institute for Public Administration.15 In 2016, the CSM declared the will to verify in good time the court’s ability to achieve reduction targets of the ultra-three-year backlogs and to this end it started the implementation of its own data warehouse. Moreover, the application of the performance requirements, also for the judiciary sector, has favoured the introduction of performance plans and evaluations based on the results achieved and on the most pressuring goals for the future. Even if the process is not yet completed, the introduction of monitoring and evaluation has paved the way for a reform of the judiciary system, and for a growth of competencies of the judiciary office chiefs in terms of evaluation and planning. 2.1.6 Evaluation in the Non-profit Sector Basing on the last census (2011) of the National Statistical Institute (Istituto Nazionale di Statistica—ISTAT), the non-profit sector in Italy is composed of 300,000 organisations, with EUR 64 billion turnover and about

13 For a complete reconstruction of the introduction of evaluation in the judiciary system, see Vecchi (2018). 14 Reference here is made to the Diffusion of Best Practices into the Judicial Offices programme, co-financed by the European Social Fund, targeting several judiciary offices and taking place from 2009 to 2016. 15 See the website at: https://www.eipa.eu/portfolio/european-caf-resource-centre/.

11

ITALY

289

6 million employees. Evaluation is not a traditional activity of this sector. A survey on the diffusion of impact evaluation among Italian NonGovernmental Organisations (NGOs) (Melloni 2014), replicating a model realised by the New Philanthropy Capital that focused on the UK charities, has shown that more than two-thirds of the Italian and UK NGOs affirm that they evaluate more than in the past. However, only 32% of the Italian NGOs declare to regularly assess impacts, in comparison to 70% of the UK charities. In other words, evaluation among Italian NGOs is still more oriented towards accountability, while the analysis of the change produced is not mainstreamed in the system. However, this sector is rapidly changing. As a result of the economic crisis and new policies promoted (such as the Business Initiative promoted by the EU in 2014), the model spreading today is increasingly marketoriented, with growing relationships with credit institutions and the need for more robust methods to evaluate the work done. A recent reform of the non-profit sector (Law 106/2016 reforming the ‘third sector’) has explicitly introduced the concept of impact evaluation and fostered, other than the evaluation of achieved impacts, a further effort of transparency and accountability (with mandatory realisation of the social budget for the NGOs). This provision has ignited a debate among NGOS, funders and evaluators, on the more appropriate methods and processes for a useful and parsimonious impact evaluation of social policies. An interesting debate on the different approaches and motivations for fostering them was promoted by the AIV in collaboration with the University of Urbino, in October 2018. The seminar gathered various project funders (such as Fondazione Cariplo and Impresa Con I Bambini), stakeholders and evaluators presenting different approaches (theory-based; counter-factual; Social Return on Investment), benefits as well as difficulties of impact evaluation in the third sector (Table 3). 2.2

Evaluation Practice

Speaking at a general level, one could say that in Italy evaluation is not undertaken at a regular pace. New legislations or programmes do not rely on the evaluation of results achieved in the past. However, a closer view on specific topics reveals, to some extent, a different picture. As we have seen at the beginning of the chapter, the RIA has become a mandatory endeavour for new legislation, even though this effort is not well recognised by the broader public and their evaluative content is not completely satisfactory.

University

Education



Evaluation of public investments, EU regional policy











• • •

Environment

• • • •

RIA

1999 experimentation of the RIA scheme (Law 50/99) 2003—RIA applied to independent authorities (Law 229/2003) 2005—RIA applied to all bills of law (Law 246/05) 2018—updated guidelines for conducting RIA (Directive of the President of the Council of Ministers, 16 February 2018) Several regional laws include the application of RIA 1986—Law 349/1986 2006—introduction of the Strategic Environmental assessment (legislative decree 152/2006) 1999—a unit dedicated to evaluation and monitoring of public investments shall be established within each administration (Law 144/99) creation of the Cohesion Policy Department and the UVAL—unit for evaluation 1999—creation of INVALSI (National Institute for the Evaluation of the Education and Training system) in charge of evaluating the efficiency and effectiveness of the education system 2014–2015—creation of the National System of Evaluation coordinated by INVALSI (Law 190/2014 and Law 107/2015) 2006—creation of ANVUR entrusted with the evaluation of the universities and research entities (Law 286/2006) 2016—Law 232/2016 established the five-year frequency exercises of the Research Quality Assessment

Year/regulation

Policy domain

Table 3 A list of the main regulations concerning evaluation in different policy domains

290 E. MELLONI

• 1997–1999—Bassanini Laws. Among them, the legislative decree 286/1999, no 286, required each local authority to have internal management controls (evaluation units) and management assessment • 2009—Legislative decree 150/2009 ‘Brunetta’ introduces several requirements related to performance and transparency: a complete performance cycle to be defined within each administration; a performance-pay system based on the assessment of results and behaviours of managers and other personnel; transparency and anti-corruption provisions; the creation of an independent assessment body the OIV within each administration supervising the evaluation of the performance cycle; a central body (the CIVIT) coordinating at the central level • 2015—‘Madia reform’ (Law 124/2015) distinguishes between performance and anti-corruption activities; links the delivery of result-related wages and salary increases to the existence of a proper performance system; introduces the citizen evaluation; creates a national list of persons entitled to be selected as members of the OIVs. It also splits the competencies of the CIVIT to the Department of Public Function of the Presidency of the Council of Ministers (concerning performance-related issues) and to the ANAC (regarding anti-corruption and transparency) • 2011—the CSM started to coordinate a programme to reduce the over three-year backlog in the civil sector, Decree Law No. 98/2011 • 2016—the CSM declared the will to verify in good time the court’s ability to achieve reduction targets of backlogs and started the implementation of its own data warehouse • 2016—Law 106/2016 fosters the evaluation of impacts for non-profit organisations

Performance of public administrations

Non-profit sector

Judiciary

Year/regulation

Policy domain

11 ITALY

291

292

E. MELLONI

Performance assessment is, by now, institutionalised within the public sector, despite some light and shadow in the system and too frequent reforms. Schools and universities hold well-established evaluation systems, going as far as the assessment of impacts, notwithstanding some criticism regarding distortions that the systems seem to produce. The overall evaluation of programmes co-financed by the EU is almost always supported by ex ante, ongoing or ex post evaluations, giving birth to a thriving evaluation market. Some private donors, such as the major bank foundations, require providing with an evaluative design even more often when applying for project funding. No information is available to the writer on the ratio between external and internal evaluation in Italy. Sometimes evaluation entails external resources, sometimes it is mainly an internal effort, other times the evaluation design requires both external and internal evaluation. An example of the latter is the performance system of public administrations, where administration’s personnel is tasked with the drafting of performance documents and procedures, while the OIV is the evaluation unit, composed from experts contracted by the administration, which steers, controls and eventually amends performance-related provisions. During its research quality assessment, the ANVUR organises evaluation visits arranged by commissions composed of teaching staff, even though not belonging to the university under assessment. Internal and external evaluations are both parts of the evaluation design proposed by the INVALSI. Other sectors require an external evaluation. This is the case of the evaluation of the EU-related programmes, mainly outsourced to private companies based on public tendering. A different choice has been made recently by IRES, the research institute of the Piemonte region. IRES has signed an agreement with the region to perform the evaluation of all the EUfunded programmes (European Social Fund, European Fund for Regional Development) that will be thus evaluated using internal resources. It is more difficult to say whether process or impact evaluation is predominant. Some sectors are more prone to include (some forms of) impact assessment within their evaluation process, in primis, the education and university system, and the health system. Recently, the law reforming the non-profit sector (Law 106/2016) has introduced the evaluation as a specific requirement for the sector, even though studies underlined that impact assessment is not a common provision in this sector, where the output and process assessment still prevails (Melloni 2014).




On the contrary, performance assessment focuses more on the procedural and output assessment of the public administrations, and only rarely goes as far as assessing the external outcomes of their policies. It is not by coincidence that the latest 'Madia' reform in 2015 included 'customer satisfaction', a very basic form of outcome assessment, as a mandatory analysis to be undertaken within the performance cycle. A single independent evaluation institute does not exist in Italy, even though various organisations tasked with evaluation purposes have been established. Some of them have already been mentioned: the Brunetta reform established the CIVIT (echoing the United States' Government Accountability Office), whose competencies were afterwards shifted to the DFP. The DFP is now tasked, inter alia, with coordinating the activities required of the administrations on the performance cycle. Other sectoral institutes coordinating evaluations include INVALSI for the school sector, ANVUR for the university system, the NUVAL as a coordinating unit for the public investment sector and the Internal Evaluation Office of the Senate.

2.3 Use of Evaluations

As previously said, some sectors rely on evaluation more than others. Health, schools and universities are among the sectors that rely on evaluation the most. It is worth noting that all of them have a clear connection with their 'clients' (the patients, the students). On the contrary, there are sectors, mostly related to the core functions of the state, in which evaluation is not yet rooted as a routine activity, such as the judiciary. At a general level, evaluation in Italy seems most widespread in administrations with a local and client orientation, while its application is more recent (or lacking) at the higher levels of the central state. However, this statement has to be taken with caution; the picture of the Italian administration is highly diversified and the use of evaluation is, consequently, scattered. The embedding of evaluation within single administrations relies greatly on the interest of leaders (policymakers and top managers) in using evaluation as a tool to achieve organisational and policy results. In these cases, evaluation supports planning and steering functions: one case history of the use of evaluation as a managerial tool is the implementation of a complete management and evaluation system within the Chamber of Commerce of Milan, one of the most important in Italy and a proud forerunner in terms of innovation (Dente 2010). Another case is the project Good Practice, led by the Politecnico of Milan: since 1999,


it has involved several universities in a benchmarking exercise of their administrative services from different perspectives, such as cost and effectiveness, customer satisfaction and digital learning. One could note that most evaluations, for example those related to EU funds and, more generally, those related to projects funded by external donors, are often limited to accountability and legitimacy purposes. One reason is the timing of evaluation, which often occurs when projects/programmes are closing, before outcomes are visible. Another reason is the lack of funding for outcome/impact evaluation, which usually requires engaging competent experts. A third reason is the difficulty of feeding evaluation findings into the policy cycle at the right time, in order to provide policymakers with fine-grained information. However, examples of evaluations conducted for 'enlightenment' purposes exist and sometimes they reach the attention of the media (Argentin and Barone 2018).

3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Civil Society

The use of evaluation is not common during election campaigns, either at the national or at the local level. However, the last election (March 2018) was characterised by frequent use of the term 'fact checking' as a way to counter growing populism in the political debate. Examples exist of NGOs fostering the use of evaluation with the goal of improving aspects of institutional, social and economic life. A relevant example is 'Cittadinanzattiva', a consumer organisation founded in 1978 whose main objectives are the promotion of civic participation and the protection of citizens' rights in Italy and Europe. The strategy of Cittadinanzattiva entails the production of civic information, that is, data on the situation of citizens' rights collected and analysed by citizens themselves. The fields in which Cittadinanzattiva is particularly active are health, with the tribunal for patients' rights; consumer and users' rights, with the citizens' advocates; education and training, with the school for active citizenship; justice, with justice for citizens' rights; European citizenship, with the active citizenship network; and corporate social responsibility. Building on its experience in civic evaluation, in particular on health services, Cittadinanzattiva created the agency for civic evaluation in 2010. According to


its website, "The basic idea is that a more active role of citizens appears essential to requalify the assessment systems already present in the various institutional and sectoral areas of the public administration and to facilitate the implementation of real processes of change in the interests of citizens and community".16 In 2013, the agency for civic evaluation elaborated 140 indicators and a specific methodology for the National Agency for Regional Health Services to implement a participatory assessment involving 16 regions and 54 health structures (Cittadinanzattiva 2012). Moreover, Cittadinanzattiva has supported a project of civic evaluation in several justice courts. Participatory evaluation processes, such as the evaluation of (parts of) the EU-funded programmes or social impact evaluation designs, often require different forms of involvement of NGOs as well as trade unions and other stakeholders.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Evaluation is not well recognised within society and the professional evaluator shares the same fate. As an example, the profession of 'evaluator' is still not included in the catalogues of professions promoted by different organisations in Italy, such as ISTAT and the Institute for Public Policy Innovation (INAPP). AIV is working with both of them, with the goal of including the evaluator's profile within these official catalogues. Evaluation reports are not always made available to the wider public. This should particularly be the case when accountability is one of the main purposes of the evaluation, for example in the case of evaluations of the programmes co-financed by EU funding. However, the EU has recently produced a new section on its website dedicated to the collection of all evaluation reports undertaken in different states, funds and operational programmes,17 a provision that will bypass clients resistant to the disclosure of evaluation content.

16 For more information see the website: https://www.cittadinanzattiva.it/.
17 More information on the website: http://ec.europa.eu/regional_policy/en/policy/evaluations/member-states/.


4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses, Further Training Et Cetera

Several evaluation courses exist in academia as single modules. There are also a few degrees, masters and PhDs dedicated to evaluation. As an example, the Sapienza University of Rome offers a degree in Communication, Evaluation and Social Research; the 'Marco Biagi' Department of Economics at the University of Modena and Reggio Emilia offers a degree in Evaluation of Public Policies and Territory. Since 2017 the University of Padova has offered a masters in Innovation, Planning and Evaluation of Policies and Services. Since 2016, the Ca' Foscari University of Venice has organised, in cooperation with the Senate and other organisations, the Masters in Analysis and Evaluation of Public Policies. The Polytechnic of Milan offers the Executive Master in Public Management for Federalism, which includes a module dedicated to evaluation. Other masters in evaluation are organised by the Universities of Catania and Genova. Some PhD courses deal with evaluation, even though their titles are more general, such as the PhD of the University of Turin in Policy Analysis and the PhD in Institutions, Administrations and Regional Policies, organised for several years by a consortium of universities led by the University of Pavia. Other forms of training in evaluation, general or dedicated to specific topics, are offered by several non-academic institutions, such as the AIV; the Society for the Development of the Analysis and Evaluation of Public Policy (Associazione per lo Sviluppo della Valutazione e delle Politiche Pubbliche); the Institute for Social Research (Istituto per la Ricerca Sociale, IRS), a cooperative founded in 1973 that produces research, evaluation and training in different policy fields; Human Foundation, which specialises in the diffusion of social return on investment methods; and many others. A growing wave of training courses dedicated to non-profit organisations is arranged by foundations and other providers, in order to cope with the new evaluation and accountability requirements for this sector, such as impact assessment, social return on investment and social budgeting.

4.2 Profession/Discipline

Italy has its own voluntary organisation for professional evaluation. AIV is a non-profit association founded in 1997. Since its foundation, AIV


promotes the debate on evaluation through different channels, the most important being the annual congress, which is organised in different parts of the country. Every year, the society publishes three issues of the Italian Evaluation Review (Rassegna Italiana di Valutazione), a refereed journal collecting articles on evaluation. Moreover, since 2000 AIV has been promoting a series of monographic books dedicated to evaluation, the Collana di Valutazione; about 40 titles have been published to date. Information on evaluation is also spread through a mailing list and various social networks (the website, Facebook, Twitter, LinkedIn). As previously mentioned, AIV has recently launched a specific training course on performance evaluation following accreditation by the DFP. AIV has drafted a deontological code for evaluators, which is available on the society's website.18 The evaluation market has shrunk since the economic crisis started in 2008. Several small consultancies or independent evaluators have seen a decrease in turnover or have exited the market, to the benefit of larger companies. In the same period, the larger companies have started to compete even for smaller bids. Universities have also entered the evaluation market, though mostly in the form of individual professionals rather than as organisations (e.g. acting as experts in the independent evaluation bodies within public administrations). In other words, the last ten years have brought a change in the characteristics of the evaluation market. A certification system for evaluators does not exist, but some work on this has been carried forward within AIV. A register of professional evaluators was promoted by AIV but closed recently (2017), as it did not reach the expected results, either in terms of use by evaluators or as a source of information on competent evaluators for the organisations requiring this profile. A new project is ongoing, with the goal of supporting people who want to increase their evaluation competencies through a self-assessment tool and dedicated training. Moreover, AIV is working on a definition of the profession of evaluator, with the aim of including it within the national catalogue of professions held by INAPP.

18 The website can be found at: http://www.valutazioneitaliana.it/contents/pagine/25/allegati/1514891892codicedeont.aiv.pdf.


External certification is possible under the framework of Law 4/2013 on unregulated professions. Several professions have voluntarily produced, often with the assistance of the Italian standardisation body (UNI), a standard on the core requirements of the profession that can be certified by an external, independent entity. However, the evaluators' community has opted for self-certification instead of external certification.

4.3 Compliance with Standards and Quality Obligations

As previously stated, AIV asks its members to adhere to its deontological code, approved by its assembly. Standards and quality in evaluation are sometimes required, for example within public tenders for evaluation; however, this is not a common demand. The most common criteria on which tenderers compete refer to the quality of the proposed methodological approach and to the previous evaluation experience of the company and the research team. Sometimes, donors ask organisations to use specific evaluation tools and approaches when they present a project to be funded. This is, for example, the case of Con i bambini, a philanthropic entity that funds interventions fighting educational poverty. It has recently asked tenderers to include evaluation as a mandatory part of their proposals, to be carried out by one of the organisations selected by the foundation itself for their solid background in different evaluation approaches.19

5 Conclusions

In 1997, when a group of pioneers gave life to the Italian Evaluation Society, evaluation played a very marginal role in public policies. It was little studied in universities and it was completely absent from the political and public debate. Despite difficulties and delays, today we are witnessing a growing diffusion of evaluation in Italy, thanks to the increasing number of scholars and professionals. These practitioners master methods and tools, pursue institutionalisation paths in many sectors, and develop a consistent corpus of evaluation practices applied to a vast and complex range of objects. However, the diffusion of evaluation has not always turned into a
19 The list of the selected organisations is available at: http://www.conibambini.org/wpcontent/uploads/2017/11/Lista-enti-DEFINITIVA-1.pdf.


widespread evaluative culture serving as a source of learning and improvement for policymaking. In many thematic areas, evaluation is still felt to be a duty and a punishment. Italy is experiencing a 'second era' of evaluation. Evaluation today is a practice formally introduced into our political and administrative system, but it still needs to find a substantive place. It is necessary to agree on definitively abandoning some old-fashioned habits (acting by routine, undergoing checks, fearing and circumventing them) in order to promote a new logic that is less legalistic and more results-oriented. It is also necessary to avoid all temptations towards a 'bureaucratic' evaluation, and to bet on the responsibility of managers to make choices that are discretionary but not arbitrary, based on results more than on compliance. Moreover, the overall system should acknowledge the contribution and co-responsibility of public, private and third-sector actors, both in the design and implementation of policies and in their evaluation, based on a common acquis: the underlying objective of improving the way public problems are tackled. A step further, from formal to substantive evaluation, requires an open attitude based on the search for improvement, on a continuous and frank course of self-reflection and on a trial-and-error process. In this process, information on errors should be considered a crucial source of knowledge, and assessment should not be seen as a civilised form of punishment but as an essential tool for policymaking and innovation. A 'third era' of Italian evaluation would require a common agreement on these assumptions, and this may be yet to come.

References

Argentin, G., & Barone, C. (2018, April 20). Contro le disuguaglianze prescrivere il liceo. La Voce.info. http://www.lavoce.info/archives/52614/ledisuguaglianze-si-riducono-consigliando-il-liceo/. Accessed on 31 January 2019.
Bassanini, F. (2013). Twenty years of reforms in Italy. http://www.bassanini.it/wpcontent/uploads/2013/10/Twenty-years-of-administrative-reform-in-Italy.pdf. Accessed on 5 December 2018.
Bonini Baraldi, S. (2014). Evaluating results of public sector reforms in Rechtsstaat countries: The role of context and processes in the reform of the Italian and French cultural heritage system. International Public Management Journal, 17(3), 411–432.


Busetti, S., & Dente, B. (2014). Focus on the finger, overlook the moon: The introduction of performance management in the administration of Italian universities. Journal of Higher Education Policy and Management, 36(2), 225–237.
Cittadinanzattiva. (2012). Convegno per la presentazione dei risultati del Progetto di Ricerca Corrente 2010. https://www.cittadinanzattiva.it/files/corporate/attivismo_civico/valutazione_civica/materiali_rc_2010_07_11_12_def.pdf. Accessed on 31 January 2019.
Dente, B. (2010). Ricominciare da tre. Impresa & Stato, 88, 48–51.
Direttiva del Presidente del Consiglio dei Ministri, 16 febbraio 2018. (2018). Approvazione della Guida all'analisi e alla verifica dell'impatto della regolamentazione, in attuazione del decreto del Presidente del Consiglio dei ministri 15 settembre 2017, No. 169.
Graziosi, A. (2018). La valutazione come riforma? L'esperienza dell'ANVUR e il caso delle scienze umane, giuridiche e sociali. In A. Giardina & M. A. Visceglia (Eds.), L'organizzazione della ricerca storica in Italia. Roma: Viella.
Marra, M. (2017). Valutare la valutazione. Adempimenti, ambiguità e apprendimenti nella PA italiana. Bologna: Il Mulino.
Martelli, M. (2009). L'analisi di impatto della regolazione – AIR. Rapporto Astrid. http://www.astrid-online.it/static/upload/protected/02_A/02_Astrid_RapportoQR_Parte3_AIR_010709.pdf. Accessed on 18 August 2018.
Melloni, E. (2014). Come le organizzazioni Nonprofit valutano l'impatto delle proprie attività. http://www.sodalitas.it/public/allegati/INDAGINE_RISULTATI_201531512283440.pdf. Accessed on 25 December 2018.
Melloni, E., & Pasqui, G. (2009). L'innovazione amministrativa per il policy change: Il caso del dipartimento per le politiche di Sviluppo e Coesione. In F. Butera & B. Dente (Eds.), Change management nella Pubblica Amministrazione: una proposta. Milano: Franco Angeli.
Ministry of Justice. Performance Plan of the Judiciary Sector 2013–15. https://www.giustizia.it/giustizia/it/contentview.page?contentId=ART915738&previsiousPage=mg_1_29_6_17.
Ministro per la pubblica amministrazione. (2018). Dirigenti statali, si cambia: rigore e giudizi dei cittadini. http://www.funzionepubblica.gov.it/articolo/ministro/13-07-2018/dirigenti-statali-si-cambia-rigore-e-giudizidei-cittadini. Accessed on 25 October 2018.
Open Coesione. (2017). Spesa certificata dei fondi strutturali 2007–2013. https://opencoesione.gov.it/it/opendata/#!spesa_section. Accessed on 31 January 2019.
Rebora, G., Ruffini, R., & Turri, M. (2015). Una partita difficile: il performance management nei Ministeri italiani. Risorse Umane nella PA, XXIX(3), 11–30.


Senato della Repubblica, Servizio per la Qualità degli atti normativi. (2017). L'AIR nella relazione del governo alle Camere per l'anno 2016. https://www.senato.it/service/PDF/PDFServer/BGT/01028432.pdf. Accessed on 25 December 2018.
Servizio Studi della Camera dei deputati. (2005). http://leg15.camera.it/cartellecomuni/leg14/RapportoAttivitaCommissioni/testi/05/05_cap25_sch01.htm#_ftn2. Accessed on 25 December 2018.
Vecchi, G. (2018). La valutazione della performance negli uffici giudiziari come strumento di politica pubblica. Milano: Franco Angeli.

CHAPTER 12

Spain

María Bustelo

1 Introduction

Over the course of the last few decades, the evaluation of public interventions and investments has been transformed into a requisite of democratic society. The creation of specific institutions or agencies, the obligation of accountability in relation to the use of public funds, and society's right to appropriate, thorough and quality interventions that improve the life of citizens underline a new professional and academic field that requires specialisation. Therefore, the practice of evaluation in every national context is shaped and understood, to a large degree, by how the role of evaluation has been institutionalised in each political-administrative system, market and society. This chapter deals with the Spanish case. Spain is a southern European country with a population of 46 million, which has been a member of the European Union (EU) since 1986. Its form of government has been a parliamentary monarchy since 1975 (before which the country experienced a dictatorship dating from the end of the 1936–1939 civil war), and its political organisation is understood to be quasi-federal, composed of 17



autonomous communities and two autonomous cities (Ceuta and Melilla). The admission into what was then called the European Economic Community (today the EU) is key to understanding the rapid modernisation of the country and, especially, the history of evaluation in Spain and its institutionalisation, which is the subject of this chapter. The strength of the function and practice of evaluation in every country depends, in large part, on how the role of evaluation is institutionalised in its institutions, as well as in markets and society. The main objective of this chapter is to discuss what institutionalisation involves, what forms it takes and to what extent evaluation is institutionalised in Spain, as well as to analyse the trends of said institutionalisation. In other words, I want to analyse, understand and explain how the role of evaluation is applied, strengthened and institutionalised in the Spanish context. The chapter is presented in three sections, in addition to the introduction. In the first, a brief history of evaluation in Spain is described, which serves as context for the subsequent analysis. Secondly, after briefly presenting the current frameworks for analysis that have been used to understand and explain the institutionalisation of the role of evaluation in Spain, an analysis of the political, social and professionalisation subsystems of the institutionalisation of evaluation in Spain is carried out. In the third and final section, the conclusions of the analysis are presented.

A Brief History of Evaluation in Spain

Spain is a country in which the processes of the institutionalisation of evaluation have been slower and later than in other European countries. This is due to the late democratisation that took place in the last decades of the twentieth century. Spain experienced a civil war (1936–1939) prior to the Second World War, brought about by a military coup against the Second Republic (1931–1936), which was followed by a dictatorship of almost 40 years that only began to come to an end with the death of the dictator Francisco Franco in 1975. In only a decade, in the period known as 'the transition', Spain stopped being a dictatorially controlled country supported by the Catholic Church, strongly centralised and lacking in rights and freedoms (illustrative examples include the last executions of political prisoners, which took place in that same year, 1975, and the fact that women could not open a bank account without the permission of their husband and reached the age of majority at 23, compared to 21 for men), and became the decentralised democratic state with three levels of


government (national, autonomous and local) that entered the EU in 1986. This brought about the creation of multiple institutions, government structures, administrations and public services. This was a time in which a lot of basic structures had to be brought into being in a very short space of time, and in which a lot of policies and public services were created ex novo. It was a time more for doing and creating than for reflecting and analysing, and therefore, evaluating. It is with the admission into Europe, and the money associated with different programmes, that some evaluations started to be carried out, even though these were neither large in scale nor systematic. The admission in 1986 into the European Economic Community (now the EU) was influential in Spain (Viñas 2009), and to a certain extent still is, as a substantial part of the activity related to evaluation in recent decades is linked to the different authorities or sectors of the European Commission (whether regional policy, research or international development; links to the structural funds, the European Social Fund, etc.). Except for some evaluations carried out in the field of education (García Sánchez 2005)1 and social services (Restrepo et al. 1990)2 in the second half of the 1980s, it was mainly in the 1990s that evaluations started to be carried out in Spain, although still frequently in an informal way. Those evaluations were mainly linked to policy and social services (Aguilar and Ander-Egg 1992; Bustelo 1990), to health education and the prevention of drug dependency (Bustelo 1994), to regional policy (in this case, with a clear influence of the demands of European programmes) (Díez López 2001) and, only at the end of the 1990s, to international development (Gudiño 1999; Ministerio de Asuntos Exteriores 1998). Also in this decade, the different public administrations, which at that time dedicated quite a lot of resources to the training of their staff, started to introduce, among many other permanent training subjects, some short courses (between four and 20 hours) on programme evaluation.

1 Evaluation of secondary education reforms (1984–1992); Evaluation of the programme for the integration of students with special education needs (1985–1989); Evaluation of Project Atenea for the introduction of computers in classrooms (1987–1990).
2 This is the case with the efforts carried out by the Department for Social Services of the Community of Madrid, in which an information system was designed that had more to do with monitoring needs, but that also carried out an important evaluative function that was pioneering at that time.


It is in the 2000s that evaluation started to be institutionalised in Spain. As can be seen further ahead in this chapter, this started with a more systematic evaluation practice in different policy sectors; the first masters and postgraduate courses started to emerge along with the first doctoral theses (Díez López 2001; Bustelo 2001; García Sánchez 2005; Arbulú Saavedra 2008), as did the first government structures for evaluation, such as the National Agency for the Evaluation of Public Policy and Quality of Services (Agencia Estatal de Evaluación de Políticas Públicas y Calidad de los Servicios, AEVAL) at the national level, and the Catalan Institute of Public Policy Evaluation (Institut Català d'Avaluació de Polítiques Públiques, IVÀLUA) at the Autonomous Community level. The Spanish Evaluation Society was also created in 2001. The economic crisis that developed towards the end of the previous decade is what characterises the 2010s and has affected the entire decade. On the one hand, it has diminished the practice of evaluation: the growth of evaluation has stagnated, and shorter evaluations with lower budgets are carried out, as demonstrated in studies conducted in 2009 and 2014 (Bustelo and Fitzpatrick 2010; Rodríguez 2014). On the other hand, institutionalisation itself has been compromised, a clear example of which is the abolition of the AEVAL in 2017. Despite the maintenance of some elements, such as the continuity of the Master in Evaluation of the Universidad Complutense de Madrid (UCM) and the creation of new networks, such as the Iberian Association of Evaluation Professionals (APROEVAL) in 2014, this decade has been characterised by a stall, even a setback, of evaluation in Spain.

Analysis of the Institutionalisation of Evaluation in Political and Social Systems and of Professionalisation in Spain

The strength of the role and practice of evaluation in every country depends, to a large degree, on how evaluation is, and continues to be, institutionalised, and this aids the identification of indicators and variables to keep in mind in the study of every national reality. To make sense of these elements, and to structure them in a meaningful and comparative way for the Spanish case, this chapter uses the framework suggested by the editors of this volume. This framework helps to explain some of the reasons why evaluation in Spain is seen as especially affected by the economic and financial crisis that started at the end of the 2000s, the effects of which reached their climax between 2011 and 2015, but continue to be felt


today, more than ten years later. In Spain, as we will now see, some of the elements of institutionalisation are, or have been, particularly strong (a strong and comprehensive structure such as the AEVAL, which operated for ten years from 2007 to 2017, and a dedicated postgraduate-level higher education programme), while others are especially weak, such as the significant absence of any evaluation culture and practice around parliament and the legislative branch.

2 Political Subsystem: Institutional Structures and Processes

2.1 Evaluation Regulations

General regulations for evaluation, distinct from the rules that regulate the government structures fulfilling evaluation functions within the executive, do not exist in Spain. Some regulations that mention certain evaluation functions do exist in some sectors,3 such as the higher education (HE) sector, in which regulations form the foundations for the accreditation of both the teaching faculty and the education programmes or university curricula.4 Evaluation is also mentioned in the law regulating universities.5 On the one hand, it talks about a 'culture of evaluation', laying the foundations for the creation of the National Agency for Evaluation, Quality Standards and Accreditation (Agencia Nacional de Evaluación, Calidad y Acreditación, ANECA), and on the other hand, it grants evaluation powers over the quality standards of universities to the autonomous communities. However, this has nothing to do with rules regulating the function of programme and public policy evaluation, but rather with the two specific aspects previously mentioned (the accreditation of faculty and of university

3 The chapter concentrates on the sectors of Public Administration, International Development, HE and R&I policies, as these are the most consolidated regarding an institutionalised evaluation function. For reasons of space, other sectors that also have a relatively strong evaluation function, such as Health, Employment or non-university Education policies, are not included.
4 Through various Laws and Decrees that establish accreditation for access to teaching faculty positions (RD 1312/2007) and the organisation of university programmes at different levels and in different areas (RD 1393/2007; 1614/2009; 861/2010; 99/2011; 43/2015; 415/2015).
5 Ley Orgánica 6/2001, de 21 de diciembre, de Universidades.


curricula). A later elaboration of regulations for institutional and teachers' evaluation (via ANECA) does exist, and at this moment in time the units in charge of evaluation or 'quality control' at universities spend a lot of time and resources generating information for the university rankings as a mark of university quality. Intimately related to the HE sector are research and innovation (R&I) policies. In this case, and in parallel to HE policies, there is an evaluative activity confined to the evaluation of the performance of researchers (using almost exclusively bibliometric analysis and evaluation through peer review) and to ex ante evaluation for the allocation of resources and funding of research projects. In this regard, over the course of the last few years, there have been advances in the monitoring and evaluation of funded research projects, but the perspective of an evaluation of policies and programmes is very underdeveloped. For example, in the 2011 law on science (Ley 14/2011, de 1 de junio, de la Ciencia, la Tecnología y la Innovación), the term evaluation is mentioned on 63 occasions, but mainly confined to the field of the evaluation of the financial support awarded for research projects (ex ante and ex post), with almost no development of the role related to the evaluation of plans, strategies and policies. All this despite the influence of the R&I policies of the EU, which have developed active and intense evaluation activities. In fact, in the 'Spanish Roadmap for The European Research Area Development 2016–2020', mentioned in the Plan Estatal de Investigación Científica y Técnica y de Innovación 2017–2020, Priority 1 (improve the efficiency of the Spanish science, technology and innovation system) includes the objective of strengthening the evaluation of R&I policies with the aim of seeking out complementarity and streamlining the existing instruments at the European and national level (Ministerio de Economía, Industria y Competitividad 2017, p. 16). Although it may seem self-evident, beyond this declaration of intentions, there are still no regulations at the present time in the Spanish science, technology and education system, nor any practice or specific structure (unit or department), that promote this role of policy evaluation. In the case of evaluation in the international development sector, it is important to note that the Law of International Development Cooperation (Ley 23/1998, de 7 de julio, de Cooperación Internacional para el Desarrollo) grants evaluation, along with planning, a key role in international development policy. Contrary to what occurred in the sectors mentioned above, in this sector the link between the role of evaluation and the evaluation of programmes and policy is very clear, even mentioning


that said programmes and policies will be evaluated using the evaluation criteria developed by the Development Assistance Committee (DAC) of the Organisation for Economic Cooperation and Development (OECD).6 This policy sector is the only one which has published an Evaluation Policy of Spanish Development Cooperation (Política de Evaluación de la Cooperación Española), first published in 2007,7 and later revised and expanded in 2013,8 by the Evaluation Division of the Ministry of Foreign Affairs and International Development. This document has served, and continues to serve, as a reference document in the field of international development. In fact, this is the sector in which the role of evaluation was first and most clearly established and consolidated in Spain. The 2013 'Evaluation Policy' is a comprehensive document in which one can see the signs of a certain maturity of evaluation policy. It not only mentions the use of evaluation but also, in its introduction, stresses that evaluations must try to respond to the fact that "despite the considerable increase in the number of evaluations carried out in recent years, their utilisation for feedback and decision making, and the use of the knowledge generated in general has been insufficient" (Ministerio de Asuntos Exteriores y de Cooperación 2013, p. 7). As indicated in the 2011 DAC Peer Review Report of the OECD, the Spanish Agency for International Development Cooperation should be more strategic when deciding what to evaluate and how to extract lessons learned from the results of evaluations. As a result, the document considers monitoring the use of evaluations essential to verifying whether they are contributing effectively to feedback, the improvement of systems, and accountability, as well as the establishment of a system for monitoring evaluations, produced by the units responsible for the fields evaluated. This includes a management strategy which presents recommendations that can be put into practice in a concerted manner and give rise to an improvement plan. The report states that these plans must be published alongside the evaluation reports in an established evaluation system. This

6 Verbatim translation: "Evaluation will take into account the pertinence of the objectives and the degree to which they are achieved, as well as efficiency and effectiveness, the impact attained, and feasibility demonstrated in the finished projects and programmes" (Art. 19 of Law 23/1998).
7 For more info, see http://www.exteriores.gob.es/Portal/es/SalaDePrensa/Multimedia/Publicaciones/Documents/Pol%C3%ADtica%20evaluaci%C3%B3n%20esp.pdf.
8 For more info, see http://www.aecid.es/Centro-Documentacion/Documentos/Evaluaci%C3%B3n/POLITICADEEVALUACION.pdf.


marks an advance in the level of commitment regarding the use of evaluations, even if it is still only in the implementation phase. Moreover, among its principal guiding principles one can find independence, impartiality, transparency, credibility and utility (as well as efficiency, participation, capacity strengthening, coordination and complementarity), and among its functions: learning, improvement, transformation, transparency and accountability. With respect to the existence of administrative regulations, such as instructions or guidelines, over the course of the last few years some documents have been published both in the field of international development cooperation, linked to the Evaluation Division of the Ministry of Foreign Affairs, and in the context of AEVAL. As early as 1998, the 'Evaluation Methodology for Development Cooperation' was published by the Ministry of Foreign Affairs. In the case of AEVAL, 'Fundamentals of public policy evaluation' was published in 2009,9 and the 'Practical guide for the design and production of public policy evaluation' in 2015. In both cases, these documents define and shape the guiding principles of the AEVAL (which are also in its statute): transparency, plurality, impartiality, independence, integrity, responsibility and efficiency. Furthermore, they distinguish between two types of uses and functions of evaluation: a strategic and democratic use (which includes accountability, transparency and participation) and a managerial use, in which evaluation provides useful information to managers and those responsible for programmes on which they can base their decisions, helping them obtain feedback on the policies and their possible improvement. In any case, more than instructions, they are general guidelines that are neither rules nor obligatory, even though they have fulfilled an important internal role in creating an institutional focus and conversation on evaluation. Analysing the functions fulfilled by the different stakeholders within the political and administrative system in Spain, it is possible to observe the weak role that parliament and the autonomous general assemblies have had, and continue to have, in the growth of evaluation in the country. Neither the production of evaluations nor the use of their results is present in a systematic way in parliamentary structures: parliamentarians do not take into account or stress the need for evaluation, and, if they do, evaluations are undertaken on a non-essential, anecdotal basis. This also

9 For more info, see http://www.aeval.es/export/sites/aeval/comun/pdf/evaluaciones/Fundamentos_de_evaluacion.pdf.


relates to the general lack of a culture of evaluation in the political parties which, in a closed-list system with a high level of party discipline such as the Spanish one, provides a foundation that is unconducive to evaluation reaching the legislative field. As such, the culture of evaluation in Spain has developed mainly around government and public administrations, and practically not at all in relation to the legislative power. In fact, specific training in evaluation is rarely provided in this field, as we will see in the system of professionalisation. Indeed, although the Expert Commission for the creation of an Evaluation Agency, which led to AEVAL, recommended in 2003 that, for reasons of independence and credibility, evaluation agencies are better placed under the legislative branch, subsequent analyses indicate that creating AEVAL under the executive, as was the case, might have been the better option. These analyses conclude that, given the Spanish political context, an evaluation agency might have more opportunities to work with independence and professionalism, counting on the necessary autonomy, in the executive branch than in the legislative one, given the risk of partisan practices by a community with very little training and culture of evaluation (Bustelo 2014).

2.2 Evaluation Practice

In the Spanish case, the practice and 'market' for evaluation is perceived as being dependent on multiple factors, as will be seen throughout this chapter. However, here we will focus on two. First, the influence of Spain's admittance into the EU, which brought European money in the form of different programmes and subsidies that require subsequent evaluation. In this case, the binomial donor-recipient structure of public funds, and the requirements of whoever gives and distributes the money, are key. In other words, it is evaluation's close link to accountability that stimulates the practice of evaluation. Secondly, external global factors that affect the production of evaluations and the evaluation 'market' should be kept in mind. This is the case with the effects that the financial and economic crisis has had on evaluation in Spain, a practice that showed significant and exponential growth in 2009 (Bustelo and Fitzpatrick 2010) and stagnation and decline five years later (Rodríguez 2014). A survey conducted in 2009 and then replicated in 2014 provides some evidence about the evaluation market in Spain.


The survey included information on evaluators' profiles, education and experience, their experiences in evaluation, and also case studies of concrete evaluations. Evaluators were asked about their profile and evaluation activity and then they were asked to think of a significant evaluation they had participated in or conducted and to respond to more in-depth questions about that specific evaluation. The respondents were people linked to the evaluation field in Spain (the call for responses was launched by the Spanish Evaluation Society and the UCM Masters course). The sample was made up of 146 respondents in 2009 and 195 in 2014. If we take evaluation activity (responses related to how many evaluations they participated in each year), there is an increasing trend in the volume of evaluations until 2011, the year in which stagnation starts to occur (Fig. 1). Moreover, evaluations in 2014 were reported to be shorter in duration and to have smaller budgets. According to the people interviewed as part of the survey, starting in 2014, there seems to be an indication that the evaluation market is recovering, albeit slowly. For example, even though the crisis brought with it an almost complete suspension of the evaluation activities that had started to be carried out by local administrations, this activity seems to be timidly recovering in some local administrations, which now happen to have greater ownership of the subject of evaluation.

Fig. 1 Evaluation activity in Spain 2004–2013: number of evaluations reported by respondents each year (Rodríguez 2014)


With respect to the frequency of evaluation undertakings, the sector where the rate has been constant and systematic, despite the years of the crisis, is that of international development cooperation. In total, the AEVAL produced 77 evaluations over its just over ten years of existence; despite a reduction in some crisis years (in 2010 and 2013 it carried out only five) and the downsizing of its team from 2015, the frequency of evaluation undertakings remained constant. The quantity of evaluations produced by AEVAL, which could be considered small, was due to the fact that evaluations were conducted internally and not commissioned to external teams. The termination of AEVAL in July 2017 resulted in a sudden halt of this constant stream of evaluation activities. We also have data from the 2009 and 2014 surveys regarding the other sectors in which evaluations are carried out. From 2009 to 2014 there was a reduction in the reported number of evaluations in the areas of education, economy, employment and public administration, and an increase in the areas of international development, health, social services and gender equality. The most important increase in relative terms was in international development, which emphasises the fact that this sector already had the most well-established evaluation function. In the 2009 survey, 58% of the evaluations reported were external, 23% were mixed, and just 19% were internal (Fig. 2), with no significant change in 2014 (60% external, 19% mixed and 21% internal). As has been previously stated, in the international development sector practically all of the evaluations commissioned by the Evaluation Division are external, whereas the evaluations carried out by AEVAL in its decade of existence were undertaken internally by the agency. As we will see in the conclusions, this strengthens the internal function of evaluation in public administrations but does not create an evaluation market. The survey also asked what type of evaluation had been undertaken (design, process, outputs, or outcome-impact), with the format of the answer being 'mark all that apply'. Output evaluation was the type most often marked, with 78% in 2009 and 65% in 2014, followed by process evaluation (65% and 60%), design evaluation (49% and 44%) and, finally, outcome and impact evaluation (30% and 34%). This means that from 2009 to 2014 there was a slight increase in the number of evaluations that had an impact evaluation component, but impact evaluation still remains a minor part of the Spanish evaluation landscape.

Fig. 2 Evaluation activity in different fields: number of evaluations reported in each field in 2009 and 2014 (Rodríguez 2014). Fields covered: gender equality, industry and energy, public administration, environment and agriculture, economy and employment, health, education and culture, social services, development cooperation and others



The decline of the other three types of evaluation from 2009 to 2014 is probably due to a slightly greater specialisation of evaluations (they tend not to cover so many types of evaluation in the same study). Regarding evaluation structures, there was a tipping point in Spain when the AEVAL was created, as it generated an independent evaluation structure both at the national level and in a more general sense. This institution was created by the Ley de Agencias (Ley 28/2006, de 18 de julio, de Agencias Estatales para la Mejora de los Servicios Públicos) but did not start to operate as such until the passing of its statute in January 2007.10 The AEVAL was brought into being by a project considered to be well conceived and which received wide support, given that it had been developed on the basis of a range of meetings with experts during the preparation of the socialist PSOE party's election manifesto for the March 2004 general elections. Furthermore, once the new government assumed office, a complete preliminary analysis was carried out, presented in the report Comisión MAP, 2004, along with a period of internal negotiation in the General State Administration (AGE in its Spanish acronym) (Garde 2007). The AEVAL was a national-level government body, granted management autonomy that facilitated its functioning and boosted its possibilities of independence while carrying out evaluations. As part of its mandate, the AEVAL carried out evaluations of national policies through internal teams of evaluators. These teams worked through a process of negotiation with the management centres (the implementers of the policy being evaluated), and the annual list of evaluations to be undertaken was approved by the Cabinet of Ministers. The main decision-making body was the Steering Committee, which brought together high-level government representatives (initially Vice-Ministers and at least Director Generals) of the areas of public administration, economy and finance, and presidency, in addition to three highly recognised independent advisors from the evaluation community. The AEVAL was therefore conceived as a much more ambitious project than it was later able to become; not only was it unable to fulfil even half of what was expected, but it also had to grapple with budget cuts and loss of personnel, resulting in a substantial reduction in its evaluative capacity and influence. The laudable work of the AEVAL team from

10 Royal Decree 1418/2006 in which the AEVAL Statute is created (Original title: Real Decreto 1418/2006, de 1 de diciembre, por el que se aprueba el Estatuto de la Agencia Estatal de Evaluación de las Políticas Públicas y la Calidad de los Servicios ).


2011 to 2017 should be highlighted, as the team continued to carry out evaluations with resources that were greatly diminished compared to the original project, with the principal objective of establishing an AEVAL style and methodology. As was to be expected to a certain degree, given that the pact between the Popular Party and the Citizens' party (Ciudadanos) in 2015 included a clause demanding the elimination of the AEVAL in order to create an 'independent evaluation institute', the AEVAL was eliminated in July 2017. Curiously enough, the activities of evaluation were left to an "Institute of Evaluation, which had the status of Sub-directorate General and intended to enjoy independence, something which seemed largely incompatible for anyone who knew the Public Administration" (Garde and Ruíz 2019; Garde 2017). This 'Institute for the Evaluation of Public Policy' has been part of the Ministry of Territorial Policy and Public Administration since the ministerial restructuring of June 2018 carried out by the new Pedro Sánchez (PSOE) government. This reduction in status and direct dependence on the ministerial structure that deals with public administration mean not only that it is not consistent with the independence that the evaluation function should have, but also that it is confined to a role that is very much internal to public administration and constitutes a very weak structure that depends on the possible fluctuations of governments. In reality, we can say that this institute has become more of an internal evaluation unit or department within a ministry. This is also the situation of the Evaluation Division of the Ministry of Foreign Affairs which, despite being an internal unit dependent on the ministry, has been on an important institutional journey and now has a strong and well-established evaluation function in the development cooperation sector. It remains to be seen what role the Institute for the Evaluation of Public Policy wants to grant evaluation. In any case, the lack of political will to institutionalise the role of evaluation and to continue with the political project that started the AEVAL remains evident, despite the return to government of the same political party that created it (PSOE). The AEVAL project and the pro-evaluation culture of the pre-crisis era made possible the existence of projects for the development of structures akin to the AEVAL (in the General State Administration) in various autonomous communities. However, the crisis, which ended up particularly affecting the public administrations of the Autonomous Communities, meant that the only one created was the Catalan Institute of Public Policy Evaluation (Institut Català d'Avaluació de Polítiques Públiques, IVÀLUA)


in 2008. IVÀLUA is structured as a public consortium and has pursued a policy of support through training, the dissemination of a culture of evaluation and the commissioning of some evaluations to external teams, rather than conducting evaluations directly as the AEVAL did.

2.3 Use of Evaluation

As has been described in the previous section, the sector with the most systematic attempts to promote the use of evaluation is that of international development, which promotes a system of written responses by programme managers to evaluation recommendations, although, according to experts, there remains significant room for improvement, particularly regarding the role of evaluation in learning and knowledge management (Gudiño 2016). In the case of the AEVAL, the evaluation process also had a period in which the management unit of the policy being evaluated could react to evaluation outputs, but the level of utilisation has been highly variable depending on the subject of the evaluation, and almost always more centred on the role of accountability than on that of learning and improvement for future actions. In the case of the other sectors (HE and R&I policies), in which the role of policy evaluation is not so well developed but evaluation is instead carried out for the accreditation of faculty professors and course curricula and for the allocation of resources to research projects, evaluation results are used directly in decision making and the allocation of resources or merit, with a purpose that is more examination and classification in nature, but almost never with the purpose of learning or improvement. Perhaps in some cases one could also mention areas in which there is an intention to promote a learning function, such as in the case of the previously mentioned ANECA, the evaluation of the teaching faculty (the DOCENTIA programme) and the institutional evaluation of university centres (the AUDIT programme). With regard to the professional groups that use the results of evaluations regularly, one can say that, in addition to the uses related to the evaluation of policies described in the previous paragraph, and those in which the evaluation of merit and research is used directly and specifically for the allocation of resources and recognition, the only sector that is systematising the use of evaluation by decision makers and project managers is international development, in which a more mature evaluation system is present and there is a more developed culture of policy evaluation. In the same


regard, the quality control of evaluations via meta-evaluations and requirements for specific evaluator competencies is only carried out regularly in that sector. In the field of HE, the ANECA evaluates the teaching faculty with the aim of accreditation.

3 Social System: Distribution of Evaluation Towards Society

This is clearly the weakest system with regard to the Spanish case. Until now there has not been a single institutionalised use of, or demand for, evaluation by civil society. The results of evaluations are also not used by the population. The reasons for this civic disinterest in evaluation are manifold, but in general we are facing a Spanish political culture that is not characterised by its participatory nature. The level of political associationism is very low in Spain, and a tradition of demanding accountability or responsibility from politicians does not exist in Spanish society, which favours neither a demand for nor a use of evaluation by civil society. Perhaps some signs of hope can be observed in the new civic movements, especially at the local level. An example of something that moves in the direction of Spanish society demanding accountability from politicians is the webpage www.poletika.org,11 an initiative by various non-governmental organisations (NGOs) to monitor and put pressure on political parties by monitoring their campaigns and election manifestos: its objective is to ensure, from the level of civil society, that political parties fulfil their commitments, as well as to promote new and improved policies. Intriguingly, the word evaluation is not mentioned once; the stated purpose is simply citizen oversight. With respect to the participation of NGOs and civil society groups in evaluations, we have to mention again the international development sector, given that development projects are normally formulated and developed (and in many cases evaluated as well) by NGOs; in this sector they are stakeholders that regularly participate in evaluations, as clients and users of the evaluations. It is also important to mention that there are new private actors entering the evaluation scene, such as some foundations (e.g. the Obra Social La Caixa, which carries out systematic evaluations of some of its social development programmes). Likewise, some businesses

11 http://poletika.org/quienes-somos.

Likewise, some businesses are becoming interested in their corporate social responsibility in order to fulfil a certain degree of accountability towards society, although this is a path that still has a long way to go in Spain. All of this results in a lack of public discourse about evaluation in Spanish society. The term evaluation is associated with the realm of education, where it refers to the evaluation of student performance and achievement, or with the realm of academia, where it refers to the evaluation of professors and researchers. However, evaluation is rarely referred to in education, academic and research policies. More recently, due to the cases of corruption uncovered in recent years, a better articulated discourse about transparency has started to develop, which creates a certain confusion with respect to evaluation. The term transparency appears continuously in the media, whereas the term evaluation does not. This is certainly helping to create among citizens a culture of accountability and of demanding public information from politicians. However, although transparency generates greater public demand for information, something which is necessary for evaluation, it is not the same thing, and one does not result in the other.

4 System of Professionalisation

4.1 Academic Study Courses, Capacity Development and Further Training

Since the end of the 1980s and, with more intensity, during the 1990s, education in evaluation developed through short courses, workshops and seminars organised by different public entities and, especially, public administrations. These were, however, almost always directed at mid-level technical personnel rather than top managers or policy makers. This landscape changed with the appearance of specific university-level postgraduate courses in the 2000s. This proved a tipping point for the institutionalisation of evaluation, given that these educational processes have been key to generating a supply of professional evaluators with a more robust, complete and specific educational foundation than their predecessors had. As we will see further ahead, this was very important for the creation of a professional identity and of networks and professional associations. The first postgraduate course specialising in evaluation was the Expert Diploma in Evaluation in Development Cooperation at the UCM, delivered between 1999 and 2001, which was the precursor to the Master's in Evaluation of Programmes and Public Policy, also at the UCM.

This master's has been delivered continuously since it was first run in the academic year 2002/2003. It is a well-established course with more than 350 graduates who have contributed to the institutionalisation and establishment of evaluation entities and structures at the three levels of government and public administration (state, autonomous region and local), as well as in consultancies and the third sector. Since its inception, the master's programme has principally been directed towards the specialisation in evaluation of professionals from different disciplines and sectors, which has attracted professionals from many different fields. Furthermore, the course attracts and trains people who will go on to put their training into practice, either conducting evaluations themselves or commissioning, managing and coordinating evaluations from different public structures or from the third sector. Until the effects of the economic crisis started to become noticeable, around 2010–2011, the master's course formed well-balanced cohorts made up, in equal parts, of professionals from public administrations, from NGOs and the third sector, and from consultancies, in addition to approximately 20% of young students starting their professional careers. Since the crisis, professionals from public administrations have almost disappeared (although from 2016 this group has been timidly recovering) and those from NGOs have also declined, whereas those from consultancies, especially freelancers and people who want to redirect their professional careers towards evaluation, have increased. In addition to the UCM master's course, during the 2000s some other postgraduate-level courses were run, such as a master's degree course at the University of Almería that ran for two years, and the Ibero-American Master's Degree in the Evaluation of Public Policy at the International University of Andalucía, which ran for eight academic years. During the 2009–2010 and 2010–2011 academic years the UCM also ran an expert diploma course, in collaboration with AEVAL, to train public employees of the Spanish public administrations in evaluation, but this course stopped running as a result of the budget cuts to AEVAL. Beyond specific training, it should be highlighted that in some study programmes the evaluation of programmes forms part of the official university curriculum as a specific subject in sociology, social work, education and public administration courses. As an example, at the UCM, the undergraduate degree course 'Management and Public Administration' and the master's degree in 'Government and Public Administration' both have two obligatory subjects on evaluation.

The message here is extremely important: anyone who studies Administration and Public Management in any of its variations must learn about evaluation. In addition to university education, other training events and short courses on evaluation continue to appear sporadically in connection with NGOs, public administration and consultancies, as well as with professional societies, as we will see further ahead.

4.2 Profession/Discipline

Since 2013 there has been a specialist academic journal, the 'Journal of Public Programs and Policy Evaluation', edited by the Department of Applied Economics and Statistics of the Universidad Nacional de Educación a Distancia, which has published monographic issues on different sectors (education, international development, health). There are other scientific journals in other disciplines, such as Management and Analysis of Public Policy (Gestión y Análisis de Políticas Públicas), the Spanish Journal of Sociological Research (Revista Española de Investigaciones Sociológicas) or the Spanish Journal of Political Science (Revista Española de Ciencia Política), which occasionally publish articles related to evaluation. Beyond articles in academic journals, there are some specific publications on programme and policy evaluation by professional and academic evaluators, which are or have been a reference and are very much used in academic training and capacity development (e.g. Ligero 2015; Osuna et al. 2000). Communication between Spanish evaluation professionals occurs, above all, via specialist websites and electronic mailing lists run by professional associations, by other entities (as was the case with AEVAL) or by the Master's degree in the Evaluation of Programmes and Public Policy at the UCM. It is also worth highlighting that a network of education professionals linked to the University of Valladolid is very active internationally through figures such as Robert Stake (who was awarded a doctorate honoris causa by that university in 2009) and Helen Simons. Although the Spanish university system does not have specialised professorships (they exist only in broad academic fields, not for evaluation or any other specific subject), there is a small group of specialised teaching faculty, related to the disciplines of Political Science and Administration, Sociology, Applied Economics, Education, Psychology, Social Work, and Agronomic Engineering, that is well established in various Spanish universities. It is also important to highlight the academic production that has led to specialisation in evaluation over the last two decades.

Beyond PhD theses, there are a number of master's theses (Trabajos de Fin de Máster, TFM) on the subject, both in the specific master's programmes on evaluation and in related master's programmes in which the subject of evaluation is chosen. It is worth mentioning that some of these works consist of carrying out meta-evaluations in different fields and are a very direct way of generating academic outputs that are genuinely specific to evaluation. Such projects can also be carried out in relation to some form of public entity, whether public administrations (such as evaluations carried out by the Evaluation Division of the Spanish Ministry of Foreign Affairs) or non-profit entities (e.g. studies that address the Evaluation Policy of OXFAM International). With respect to professional evaluation societies, the Spanish Evaluation Society was created in 2001, and in 2014 a group of evaluation professionals in Spain and Portugal created APROEVAL, which is very active in evaluation networks as well as in starting new initiatives for the advancement of and advocacy for professional evaluation. Some examples include the development of the tool 'how much does an evaluation cost?', which supports more transparent budgeting that is objective and fair for clients, as well as the periodic meetings of 'Aftereval', informal spaces for the exchange of ideas with different speakers that can be attended in person and also online by those who are not physically in Madrid. There are no standards or guiding principles for evaluators in Spain that have been approved by the professional associations. However, in 2009 AEVAL developed a Professional Ethics Code,12 which was conceived as an internal code for AEVAL evaluators, and, as mentioned above, APROEVAL created the tool 'how much does an evaluation cost?'. Likewise, there is no certification system for evaluators in Spain, although APROEVAL has been active in the development of the Voluntary Evaluation Peer Review piloted by the European Evaluation Society. By the same token, there are no compliance or quality obligations in evaluation beyond the AEVAL Ethics Code. Regarding the professional evaluation market, although during the 2000s big and established consulting firms entered a marketplace that was then mainly dominated by academics (working alone or in teams), nowadays it is mostly dominated by freelancers and, in some cases, small consulting firms, 'cooperatives' and professional networks. During the time of AEVAL, a small number of professional public administrators were trained and actually acted as evaluators themselves in the Agency, some of whom still remain in the Institute for the Evaluation of Public Policy. Finally, there is also a related 'market' of evaluation professionals consisting of evaluation commissioners and managers in public administrations, other public entities and the third sector (NGOs, foundations, etc.), who are increasingly important in the Spanish evaluation community.

12 See http://www.aeval.es/export/sites/aeval/comun/pdf/actualidad/Codigo_etico.pdf.

5 Conclusions

This chapter has given a general idea of the institutionalisation of evaluation in Spain through an explanation of the national evaluation context. I hope this 'x-ray' serves as a good foundation for further discussions on two key topics. Firstly, on how this particular process of institutionalisation has influenced, and been influenced by, global and international ideas (something to which this Global Atlas will greatly contribute) by allowing realities to be compared and contrasted. Secondly, to open a critical debate about the 'evaluation enterprise', understood both as an initiative that aims to contribute to the betterment of decision-making and as an activity in which specific actors (policy makers, commissioners, evaluators, evaluation societies) with different interests interact with one another, leading to specific paths towards institutionalisation (Furubo and Stame 2018). Generally speaking, the role of evaluation in Spain has established itself later than in other European countries and is largely conditioned by twentieth-century Spanish history and its belated, although rapid, development of democratic institutions, which were only created towards the end of the 1980s. In general, the financial crisis in Spain arrived just at the moment when a culture of evaluation, and a supply of and demand for it, was taking off. Its potential was, to a certain extent, curtailed, or at least severely hampered, as a result. Consequently, even though there had been very commendable efforts to develop a true culture of evaluation, this had not been sufficiently stimulated in our public institutions, which meant that instead of evaluation being a useful tool for the necessary budget cuts, it became the subject of cuts itself. It was considered something new and somewhat marginal, and also something that had been developed by a government of a different political orientation from the one that carried out the cuts, and as such it fell to the bottom of the list of priorities. Regarding the political system, we can see that the development of regulations has been especially propelled by the government entities created around evaluation, whether independent agencies such as the former AEVAL or units within ministries, such as the Evaluation Division of the Ministry of Foreign Affairs.

That is to say, in Spain there is a tendency to first create entities or structures and only then create the rules and regulations, not the other way around. Aside from the international development sector, the established regulations hardly mention the use of evaluation, which indicates a more incipient and immature role for evaluation. If we consider policy sectors, as we have seen, although the function of evaluation is well developed with regard to HE and R&I policies, it covers only specific areas (the evaluation of personnel, that is, students, faculty professors and researchers, and the financial support for awarded projects via peer review) and does not frequently include the evaluation of policies and programmes. However, the transversal development of programme and public policy evaluation by AEVAL has been very important and, despite the fact that it has been eliminated as an independent agency, a large part of its roles, such as the creation of a culture of evaluation in public administration, continues in the Institute of Public Policy Evaluation. One clear issue that we have seen throughout this chapter is that the institutionalisation of evaluation in Spain has been focused on the executive branch and is practically non-existent in the legislative branch of government. This is fostered by a political system with closed lists at elections, which generates strong party discipline within the political parties and, from my point of view, does not particularly promote the attitude of open self-criticism that is necessary in the practice of evaluation. In any case, the culture of evaluation, in the political parties as well as in the national parliament and the autonomous legislative assemblies, is practically non-existent. Educational programmes or training courses on evaluation are rarely delivered in this context and, although there have been some exceptions, graduates from the UCM master's course are seldom members of parliament or part of the political parties' apparatus. This contrasts with a much greater presence of civil servants and public workers from all levels of public administration (national, autonomous community, regional and local). It is important to highlight here that the strong professionalisation of the Spanish civil service, which has a very professional administrative culture, has permitted the introduction of evaluation. However, at the same time, it is characterised by a strong and traditional corporate spirit, in which the fields of law and, to a lesser degree, economics dominate, with little representation from other disciplines, something which is not conducive to the existence of multidisciplinary teams in which there is greater openness to the introduction of new evaluation approaches and methodologies.

The economic crisis and the reduction of public employment, together with the demographic trends of the Spanish population, have also produced an ageing of the workforce, which has not allowed for renewal and the entry of personnel who already have the specific education and training in evaluation that is now present in the new academic curricula. As we have previously seen, the social system of institutionalisation is, without doubt, the weakest in the Spanish case, and the one which should undoubtedly be strengthened through the demands of civil society towards the political parties and their representatives in order to generate a true culture of evaluation. This system could also be strengthened through a better and clearer relationship between government structures and civil society when undertaking evaluations, something that was not fully developed under the AEVAL approach, in which, by carrying out evaluations with internal teams, an exchange of ideas and practices with the community and the market of evaluators was not promoted. In this sense, the international development sector, in which a large part of its policies is implemented by NGOs, has formed a fruitful exchange with civil society in the carrying out of evaluations, which strengthens the use, the meaning and the consolidation of evaluation. Finally, the system which has evolved the most in Spain over the last decade is that of professionalisation. In fact, I believe that this has been an important engine that has contributed, and is contributing, substantially to the evolution of evaluation in Spain. The establishment of postgraduate education programmes (specific programmes such as the UCM master's course, of which an impact evaluation is planned for 2020, as well as the inclusion of evaluation in other academic curricula) and the existence of professional organisations such as APROEVAL and the Spanish Evaluation Society are good indicators of this strength.

Acknowledgements I want to thank José Manuel Argilés, Celia Casillas, Juan Antonio Garde, Juan Andrés Ligero, and Ana Ruíz for their time in the interviews conducted for this chapter. I would also like to acknowledge Jody Fitzpatrick, Juan Andrés Ligero and Belén Rodríguez for the joint work on the 2009 and 2014 surveys of the Spanish evaluation community.

References

Arbulú Saavedra, J. (2008). La política de evaluación de la cooperación española: una metaevaluación (Doctoral dissertation). Universidad Complutense de Madrid.
Aguilar, M. J., & Ander-Egg, E. (1992). Evaluación de servicios y programas sociales. Madrid: Siglo XXI.
Bustelo, M. (1990). La evaluación de programas de intervención comunitaria. Cuadernos de Acción Social, 26, 90–95.
Bustelo, M. (1994). Diseño, ejecución y evaluación de programas de prevención de las drogodependencias. In Alcohol y adolescencia: experiencias y programas de educación preventiva (pp. 75–85). Madrid: FERE.
Bustelo, M. (2001). La evaluación de las políticas públicas de igualdad de género de los gobiernos central y autonómicos en España: 1995–1999 (Doctoral dissertation). Madrid: UCM.
Bustelo, M. (2014, September 30–October 2). Innovative solutions to challenges linked to independence, credibility and use of evaluations in PNUD. In Solutions related to challenges of independence, credibility and use of evaluation: Proceedings from the third international conference on national evaluation capacities. São Paulo, Brazil. http://web.undp.org/evaluation/documents/nec/NECproceedings-2013.pdf. Accessed on 17 April 2019.
Bustelo, M., & Fitzpatrick, J. (2010). Evaluation in Spain: Practice & institutionalization. Presentation at the AEA Annual Meeting, Orlando.
Díez López, M. A. (2001). La evaluación de la política regional: Propuestas para evaluar las nuevas políticas regionales. Serie Tesis Doctorales. Bilbao: Servicio Editorial de la Universidad del País Vasco.
Furubo, J.-E., & Stame, N. (2018). The evaluation enterprise: A critical view (1st ed.). New York: Taylor & Francis.
García Sánchez, E. (2005). La evaluación de programas de reforma educativa en España: tres estudios de caso desde un enfoque de metaevaluación (Doctoral dissertation). Universidad Complutense de Madrid, Servicio de Publicaciones.
Garde, J. A. (2007). La creación de la Agencia de Evaluación en España. Evaluación de Políticas Públicas. ICE, 836, 11–17.
Garde, J. A. (2017, 5 August). AEVAL, Crónica de una muerte anunciada. https://cincodias.elpais.com/cincodias/2017/08/04/midinero/1501857354_174399.html. Accessed on 18 April 2019.
Garde, J. A., & Ruíz, A. (2019). La institucionalización de la Evaluación en España y la Experiencia de AEVAL. In M. J. Garde et al. (Eds.), Hacienda Pública y Gobernanza Fiscal en España: Desafíos 2020. Madrid: Instituto de Estudios Fiscales.
Gudiño, F. (1999). El sistema de evaluación de la cooperación española. Revista Española de Desarrollo y Cooperación, 4, 31–49.

Gudiño, F. (2016). Evaluar para aprender. Un análisis de utilización de las evaluaciones de la Cooperación Española para la toma de decisiones. Madrid: Ministerio de Asuntos Exteriores. http://www.cooperacionespanola.es/sites/default/files/evaluar_para_aprender.pdf. Accessed on 18 April 2019.
Ligero, J. A. (2015). Tres métodos de evaluación de programas y servicios. Juicios finales sumativos, teoría del cambio y evaluación orientada a los actores implicados. Madrid: Means Evaluación.
Ministerio de Asuntos Exteriores. (1998). Metodología de evaluación de la Cooperación Española. Madrid: Ministerio de Asuntos Exteriores and SECIPI.
Ministerio de Asuntos Exteriores y de Cooperación. (2013). Política de Evaluación de la Cooperación Española. Madrid: Ministerio de Asuntos Exteriores y de Cooperación.
Ministerio de Economía, Industria y Competitividad. (2017). Plan Estatal de Investigación Científica y Técnica y de Innovación 2017–2020. Madrid: MINECO.
Osuna, J. L., Márquez, C., Cirera, A., & Vélez, C. (2000). Guía para la evaluación de políticas públicas. Sevilla: Instituto de Desarrollo Regional.
Restrepo, P. R., Arribas, M., & Torrego, J. M. (1990). Evaluación del trabajo y resultados: aproximación a los servicios sociales. Consejería de Integración Social, Dirección General de Planificación de Servicios Sociales.
Rodríguez, B. (2014). La situación de la evaluación de programas y políticas públicas en España: Avances y retrocesos (Trabajo Final de Máster para el Máster en Metodología de Investigación). Madrid: UCM.
Viñas, V. (2009). The European Union's drive towards public policy evaluation: The case of Spain. Evaluation, 15(4), 459–472.

CHAPTER 13

Portugal

Ana Diogo

1 Introduction

Public policy and programme evaluation is rather recent in Portugal (Capucha et al. 1996) and, maybe because of that, the country has not been included in previous studies about the institutionalisation of evaluation (Furubo et al. 2002; Jacob et al. 2015; Varone and Jacob 2004). This chapter grasps the state of the art of the phenomenon through a structured framework. The information was collected through a literature review, data available on websites and interviews with relevant stakeholders and practitioners from several of the organisations mentioned, and it also benefits from the perspective of the author's experience as an emerging evaluator. After a quick overview of the country, we will deepen the level of analysis into the dimensions and sub-systems proposed by the Evaluation Globe. This will highlight the path travelled so far and the gaps that still exist.


2 General Country Overview

This chapter offers a general overview and description of Portugal's experience with public policy and programme evaluation. The country's evaluation capacity building journey cannot be disentangled from two fundamental events that must be borne in mind throughout this exercise:

1. Portugal's transition to democracy occurred in the mid-1970s, after a long period of 40 years of dictatorship. As a result, a certain legacy endured in the way civil society, institutions and the public sector still view topics such as transparency, accountability and the demand for better public services;
2. The country's accession to the European project, in 1986, has been shaping Portugal's path towards an evaluative culture (Torres 2000). Although interaction with other international organisations, such as the Organisation for Economic Co-operation and Development (OECD) or the United Nations, has contributed to this evolution, the European Union (EU) continues to be the main driving force of external pressure for the ongoing evaluation activities of the Portuguese public sector.

According to Ferrão and Mourato (2010), public policy evaluation in Portugal has existed for almost three decades and was born due to external imposition by the EU, since it was an obligation for access to funding. It was the territorial cohesion policy that set the tone for the Portuguese experience (Ferrão 1996). Thus, to understand the administrative context in which political decisions are being made, we must acknowledge that three state levels coexist (national, which includes ministries and the central administration; local, the municipalities; and regional, which comprises the two archipelagos of the Azores and Madeira), all of them with different degrees of autonomy and competencies. Although all levels have been net beneficiaries of European funding, which implies mandatory evaluation requirements, this influence, as regards the development of an evaluation culture, is mostly perceived at the national level. This observation is backed up by the lack of known evaluations commissioned by this level of the administration. Concerning the scope of subject matters or policy fields more prone to incorporating evaluation principles, practices and methodologies, it is not surprising that the ones more exposed to external pressure rank better. Sectoral policies such as international cooperation or regional development present a distinct standing when compared with other policy fields, such as defence, infrastructure, public-sector reform or fiscal and budgetary policies (Vicente 2013).

Again, we base this claim on the lack of available documents related to evaluation and the absence of public debate about evaluation in these sectoral policies (Torres 2000, 2010). Another relevant fact is that Portugal has never had an organisation solely devoted to public policy or programme evaluation. On the contrary, a highly fragmented and dispersed set of entities across the public sector has been responding to different types of demand. Whether or not they have a formal office to deal with evaluation, the majority of the evaluations produced for the public sector are external. This means that some organisations, namely the ones that respond to external demand for evaluations, such as the Cohesion and Development Agency (AdC) and Instituto Camões, have created their own internal evaluation departments and are commissioners with a certain degree of specialisation; but, in the cases where demand results from internal legal obligations, external evaluations are commissioned as just another ordinary procurement procedure (AdC 2016a, b; Instituto Camões da Cooperação e da Língua 2012). This may relate to the different stages of evaluative capacity development in Portugal and also to the absence of a deeper knowledge of what evaluation is and what its role should be. The misunderstanding of evaluation as a scientific discipline results in terminological and conceptual confusion. Monitoring and evaluation are commonly presented as the same activity, and auditing is also widely used as a synonym for evaluation, even among professionals. We must also note that the study and professionalisation of the discipline is still very limited. On the one hand, there is no specific course or training related to evaluation matters. On the other, evaluation itself has not yet gained its own place: most of the time evaluation is viewed as a topic related to political science, and therefore public administration, and, recently, it has been gaining weight in some public policy and economics courses, but always as a subject matter and not as a specific and independent field of knowledge and practice. Perhaps because of that, the term evaluator is not even used to designate the profession, and no claims exist in that sense. This chapter will cover a series of topics to detail and describe Portugal's experience regarding evaluation. A series of practices or examples may, unfortunately, be left out, because evaluation initiatives seem to derive more from the sensibility and opportunities perceived by organisations, managers and civil servants than from a strategic national evaluation policy.

The investigation for this chapter revealed a series of ongoing evaluation activities in small organisational silos. There is a slow movement committed to advancing evaluation, but its members seem to be working apart and disconnected, not even knowing of each other's existence. This underground evaluation movement may just need the right trigger to network and foster a stronger evaluation culture and practice.

3 Institutional Structures and Processes (Political System)

3.1 Evaluation Regulations

Evaluation regulation, or regulation of its use, is scattered among primary and secondary legislation, but Portugal lacks an overall national law or legal framework guiding policy evaluation across the government. The legal references concerning evaluation and its use appear as sparse mentions in sectoral laws or regulations. Usually, these documents assert the need for a programme to be evaluated and the reason why such an exercise should be conducted. The National Plan for Gender Equality and the Strategic Migrations Plan, both sectoral policies, are examples of such regulatory options. These plans were conceived for management cycles of more than one year and counted on the wide participation of public-sector organisations in their implementation. When the binding rule was passed it already foresaw the obligation to evaluate those instruments. In fact, a closer look at the legal texts of both these plans reveals a strong commitment to producing regular evaluations during the implementation phase and a greater focus on the final evaluation, which should be external and independent. In the case of the V National Plan for Gender Equality, Citizenship and Non-Discrimination (Government Resolution 2013), for the period 2014–2017, the legislator defines the scope of the evaluation, since the main concern was to find out whether the Plan was producing any change in gender inequality. We find a different scenario in the case of the Strategic Migrations Plan (Government Resolution 2015) for 2015–2020. This document is much less clear about the purpose of the evaluation. It only outlines that monitoring and evaluation exercises will support the Strategic Migrations Plan's capacity to adapt and adjust to new challenges and opportunities that may arise during the implementation phase.

As seen, these two sectoral policies, both based at the Presidency of the Council of Ministers, translate distinctive positions about evaluation and its use into their regulations. What they have in common is only that, in both cases, the evaluation function is considered a phase of the public policy cycle. But because the role and use of evaluation is not sufficiently detailed in those legal documents, or in any further documents, and the terms of reference for those evaluations are not public, it becomes extremely hard to obtain more in-depth information beyond the general intention stated above. In the field of regulatory impact assessment, Portugal has, in the last two years, taken some steps forward due to closer collaboration with the OECD. As part of the modernisation programme Simplex+ 2016, the government introduced a mechanism to measure, at an early stage of the legislative process, the economic costs of new legislative and administrative acts (Government Resolution 2017), in order to avoid an extra burden for citizens and companies. The tool Custa quanto ('How much does it cost?') is partially based on the standard cost model, already used in many member states of the EU, since it embedded the OECD's recommendations on 'better regulation' and also took into account the 'think small first' principle of the 'Small Business Act' for Europe. The inclusion of this perspective in the legislative process actually implied changing the guiding rules of the Council of Ministers of the 21st Government. Now, every draft law has to include an economic cost assessment based on that evaluation model. Although this is a positive step towards the evidence-based design of public policies, the effectiveness of such a measure is unknown, not only because this is a very recent policy but also because the assessments made under it are not publicised. Although the Council of Ministers Resolution rules that this policy will be subject to an evaluation, as in the previous examples there are no further details on how and for what the findings will be used. Other similar examples may be found in a myriad of sectoral regulations, since almost every plan or policy change proclaims the need for evaluation. Evaluation is perceived as the final part of the policy cycle and not as a transversal element. Moreover, the specifications regarding evaluation, or its use, have very low binding force. Evaluation is not viewed as an added-value activity but only as a formal and legal obligation. In this way, evaluation functions are reduced to mere legal and administrative conformity, losing all their transformative power. Evaluation findings are seldom presented or even discussed, and even their availability for public information or research use is very constrained.

Despite the above examples, some policy fields, such as cohesion policy or international development and cooperation, constitute exceptions to the general trend. These policy areas have produced administrative regulations such as instructions, guidelines and even an evaluation policy for the sector. The fact that these policies are embedded in wider contexts, such as the EU or the OECD, which act as driving forces for an evaluation culture, and that mandatory evaluation is a condition for access to funds, makes them more mature in comparison with other policy fields. Instituto Camões, the Portuguese development cooperation agency, shares on its website not only the evaluations carried out but also several documents and guidelines. Many of those materials are translated OECD documents, but others result from Camões' long experience and are intended to be used by all the stakeholders involved in the agency's activities. Alongside Camões, the AdC is a remarkable case of administrative regulation. Its website gathers instructions, guidelines and other relevant content. Once again, many of these materials are translated versions of European documents, but they are accessible to a wider public and begin to represent an interesting acquis for evaluation culture and capacity building in the Portuguese language. The various documents adopted by these entities are similar to those proposed by the European Commission or other international organisations; therefore, they follow the trends regarding the independence of evaluation, quality standards and impact orientation. AdC's working papers reflect these trends. Information about the budgets available for evaluation activities is almost non-existent. Apart from the mandatory evaluations of co-funded policies, where it is easier to find fixed budgetary limits for every operational programme, it is very hard to trace such details. An ex post mechanism to gather that information for the public sector could be the procurement portal base.gov, where all public procurement contracts should be published. This would, however, imply extensive research due to the elastic use of the term evaluation: for instance, a quick search returned over 4466 results, but the vast majority are related to environmental evaluation or even engineering activities.

Concerning the presentation of evaluation findings, examples such as Camões and the AdC, as well as the auditing reports of the Portuguese Supreme Audit Court, clearly reveal a well-established mechanism to publicise results, conclusions and recommendations. For instance, the AdC has a global evaluation programme for the period 2014–2020, in which the communication strategy is well defined: the target audiences, key messages, language, communication outputs and responsibilities. The official websites of the organisations are, most of the time, the main vehicles for promotion and, depending on the topic, the media may actually echo some of the findings. Public presentation and discussion of findings with key stakeholders may occasionally take place. Nevertheless, this does not constitute a systematic or systematised practice, and it is not included in a larger communication strategy aimed at accountability, improvement or knowledge generation. Looking at the government and the parliament, evaluation seems to be on the agenda whenever political arguments for changing a policy are needed (legitimation) or to demonstrate that the previous legislature did not achieve the promised results, as a quick look at the electoral programmes from the last elections testifies. A structured relationship with evaluation is missing, even though the parliament has some practice in occasionally requesting information and specific audits from the Supreme Audit Court. This relationship probably results from the conceptually blurred lines between audit and evaluation. However, it does not extend to a wider intervention in public policy evaluation. Although the Portuguese parliament comprises technical units to support the deputies' work, such as the Technical Unit for Budgetary Support (UTAO, Lei Orgânica da Assembleia da República No. 77), whose mission is to deliver studies and technical working papers on budgetary management and public finance, no similar unit was ever created to provide independent evidence and knowledge through evaluation. The Supreme Audit Court has, in recent years, been drawing a path towards evaluation. In fact, the law that sets out the Supreme Audit Court's internal organisation clearly mentions 'evaluation' as one of the Court's powers (Lei de organização do Tribunal de Contas No. 98). In line with this responsibility, the Portuguese Supreme Audit Court was a member of the International Organisation of Supreme Audit Institutions (INTOSAI) working group on programme evaluation, in 1992, and was also a member of the European Evaluation Society (Tavares 1998).

As regards parliamentary activity, a search for the term 'evaluation' in the parliament's search engine returns fewer than 100 results for the present legislature, which began in late November 2015. These results show, once again, the elasticity with which the term 'evaluation' is used; most of the time the intended meaning is assessment, performance measurement or environmental evaluation. Besides that, there is no mechanism to follow up on how many of these evaluation demands actually resulted in an evaluation, so the effectiveness of such demands is not clear. Often, evaluation is used as an argument by the incoming government or by opposition leaders when policies are discontinued, but there is no management tool or organised movement to follow up on those claims.

3.2 Evaluation Practice

Since Portugal lacks an across-government evaluation policy, the existing evaluation initiatives are rather confined to the environment in which they are developed. Organisations that respond to mandatory evaluations abide by the imposed calendars and rhythms; for instance, the evaluation calendar of the AdC during the implementation of Portugal 2020 comprises thirteen evaluations, distributed across the 16 operational programmes. In this case, the types of evaluation being demanded include process evaluations, impact evaluations, theory-based evaluations and counterfactual evaluations. As mentioned before, the systematic and regular approach of the AdC relies on the connection with, and the rules of, the EU and is not mirrored in most national policies. The Employment and Vocational Training Institute also has a unit that deals with evaluation, but its activity is rather unknown, something that may be leveraged by the EU's growing funding (Martins 2018). As in the strategic plans cited above, on gender and migration, the details about evaluation procedures and calendars are usually rather vague. As was the case in these two examples, only a requirement for a final external evaluation was included in the legal text. It can also happen that even those final evaluations do not take place and no consequence or sanction is applied. This was exactly what happened with the National Plan for Fires of 2006–2012 (Government Resolution 2006), which mandated an annual evaluation, but only three were carried out, according to what is published on the official website. Nevertheless, after the deadly wildfires of 2017 in Portugal, the media placed strong emphasis on the lack of those evaluations.

The aftermath was the creation, by the parliament, of two special commissions (Comissão Técnica Independente 1 and 2) to assess the events, but no evaluative perspective was included in the reports produced by these temporary structures. Once again, considering the national context and the range of policies being executed by the government, the policy fields best positioned as evaluation practitioners are regional development or cohesion policies, especially if funded by the EU, and cooperation and international development. The spillover to other policy fields, perhaps due to the Portuguese administrative model, has not, up to now, produced evident positive externalities. We must underline that these better performers, most of the time, are not conducting evaluations but only demanding or commissioning them. External evaluations are the most typical form of evaluation being carried out and, generally, they are conducted by academic research teams or consultants. Although in 2006 the public-sector reform (PREMAC) attempted to create in each ministry a department of planning, strategy, evaluation and international relations (GPEARI), these never gained the weight, or the internal competencies, needed to carry out their mission and eventually disappeared or specialised in just one of their axes. This environment does not stimulate or favour the conditions for internal evaluation to take place and, since there has not been any investment in the development of national evaluative capabilities, it would be very difficult to fulfil such demands. Over the years there has been a public policy of shared services at the central administration level and the development of specialised service centres, as well as a movement towards concentrating back-office services in the Secretariat General of each ministry. None of these initiatives included evaluation as a service to be provided for several ministries or transversal policies, so an independent evaluation institute was never created or even discussed. As regards the role of the GPEARI offices in the development of evaluative services or capabilities, they were not successful in their purposes; most of them focused only on the performance management of public-sector organisations and never reached or aligned with the macro level of public policy evaluation. Some public organisations may have departments working on evaluation-related issues, as is the case of the Competition Authority, an independent organisation whose Public Policy Evaluation Department's mission is to reinforce the promotion of a pro-competition regulatory environment, implementing competition impact assessments of public policies.

But even without a single player responsible for public policy and programme evaluation, several sectoral organisations, owing to their mission and stakeholders, have developed internal skills and small departments in order to respond to such demand. The sectors more exposed to European public funding or international cooperation lead the ranking, but the new budgetary framework law may represent a turning point if the foreseen evaluation of budgetary programmes goes forward. This law assigns the parliament a greater role in assessing the efficiency and efficacy of public expenditure and its relation to the results obtained with regard to public policy objectives. The implementation of this new reporting is under the leadership of the Ministry of Finance, in close collaboration with the coordinating entities of each ministry. To date, no methodological approaches have been established and there is a lack of comprehensive guidelines. Even though the Supreme Audit Court has, in recent years, been recommending a stronger commitment to demonstrating the use of public resources and its linkage with performance results, very few ministries have taken such an exercise forward. The Secretariat General of the Presidency of the Council of Ministers is one of the coordinating entities that, since 2013, has been experimenting with crossing budgetary data with public-sector performance results. The first attempts revealed severe methodological constraints, such as access to reliable and timely data on detailed public policy expenditure. There is still a long way to go before programme budgeting in Portugal becomes a useful accountability and decision-making instrument, as recognised by the Supreme Audit Court. The last example of evaluation practice, or its absence (Diogo 2018), is reflected in Portugal's Sustainable Development Goals (SDGs) National Voluntary Review (MNE 2017). The governance model for the SDGs in Portugal delegates questions about data collection and monitoring to the National Statistics Authority (INE), but, up to now, no decision has been made about who will be responsible for, and how to produce, a systematic process of interpretation and merit judgement of those data. Even though the push for evaluation is one of the distinctive features of the SDGs in comparison with the Millennium Development Goals (MDGs), and the UN will call for evaluations alongside the voluntary national reviews, Portugal's report says little about the role of evaluation and the institutional framework does not reflect this paradigm shift.

3.3 Use of Evaluations

The sectors with higher exposure to international evaluation practices and standards are the ones performing better regarding the use of evaluation and its findings. In matters related to EU funding or international cooperation, processes are designed so that the findings are embedded in the strategic planning process and benefit the design of new action plans. For instance, the use of ex ante evaluations and lessons learned from previous cycles brings evaluation findings into the debate when reprogramming or new programming cycles occur. This is a sector with high external pressure for the demonstration of results, which means greater attention and scrutiny not only from the European Commission but also from opposition political parties, beneficiaries and non-beneficiaries of those funds. In this case, evaluation is primarily used as an accountability and legitimation tool and also as an enlightenment opportunity. In these two sectors, evaluation findings are posted on their websites as well as in working papers. For instance, Instituto Camões publishes materials such as the rules for Development Evaluation Quality, peer reviews, guidelines for promoting lessons learned from evaluation, the evaluation glossary and results management. The Development and Cohesion Agency is actually using the flow of data generated by evaluations as an argument to attract research institutions to respond to evaluation calls; the rationale is that evaluation findings and administrative data can be an invaluable source for future knowledge production. Frequently, a conceptual confusion between monitoring and evaluation activities can be observed, as was the case with the reports on gender, migration or wildfires already cited. It is therefore not uncommon to find performance measurement reports being presented as intermediate evaluations. Some of the publicised final evaluations we have found, conducted by external entities, pinpoint the fact that the time frame for carrying out the evaluation does not allow the findings to be fed into the design of the next action plans. But despite the recommendations made on this topic, the evaluation and planning timings still recurrently collide in the new versions of those plans. Therefore, as this example testifies, there is much room for improvement in the use of evaluation findings and recommendations. Up to now, the advantages of the exercise have not been fully captured, which diminishes the planning and steering potential and the value for money of such efforts.

More than bad performers in the use of evaluation findings, we could talk about bad performers in commissioning evaluations. Social policies such as health, education, employment or social security, which directly impact the lives of citizens and concentrate a large amount of public resources, lack consistent knowledge generation from evaluations. Perhaps the fact that such sectors are very exposed to the ideological beliefs of political parties can explain why they are lagging behind. Evaluation has not yet succeeded in gaining the status of a management tool and a decision-making support instrument (Campos 1998). Finally, given the state of the art, the quality of evaluations is not generating debate. Owing to the small market, there is no tradition of conducting meta-evaluations. But once again, the AdC, in its global evaluation planning, defines the standards to be observed in measuring the quality of the products delivered by evaluations. For instance, it has developed a checklist based on the quality criteria proposed by EVALSED.

4 Societal Dissemination/Acceptance (Social System)

4.1 Institutionalised Use of Evaluations by Civil Society

Portugal's civil society has not yet discovered the full potential of evaluations as a source of knowledge for referenda or for political decision-making on a communal basis. The low level of institutionalisation of evaluation, the lack of a deep evaluative culture and a civil society not sufficiently aware of its added value may be hindering the establishment and consolidation of this management tool. In the case of territorial cohesion policy, Ferrão and Mourato (2010) pinpointed that Portugal was far from a wide evaluation culture and a true citizens' demand because a change in the mindset, values and behaviours of politicians, civil servants and evaluators was missing. Resistance to change, the lack of theoretical debate around evaluation, corporatism among professional communities and the absence of a spirit of collaborative work, especially in such a small internal market, can, according to those authors, explain the state of the art.

4.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Evaluation has not yet gained status as an effective and powerful management tool for accountability and decision-making support (Campos 1998). Some of that role is being assumed by audit activities, namely the ones conducted by the Supreme Audit Court. These products receive more public and media attention than the majority of evaluations do. Although full versions of evaluation reports, namely from the leading sectors, are available, posted on the websites of the commissioners, there is little public discussion, whether by the media or by the executive and legislative branches. Sometimes structural policies receive stronger attention from the media and are widely promoted by the government as case studies. For instance, the modernisation programme Simplex was presented by the government as a success story, and an annual report has been produced since 2016. Unfortunately, only an abridged version of the report is available to the general public and, although an evaluation carried out by a Portuguese university was announced, that product has not been made accessible. Regardless, the media gave it coverage without ever questioning the results or pointing to the absence of public knowledge of the proclaimed evaluation. Although evaluations can be a powerful way of 'telling the truth to power', the fifth power, as Chomsky named the press, is still, at least in Portugal, quite removed from this accountability instrument. The debate over the benefits of evaluation, the quality of evaluations or even the professionalisation of evaluation is non-existent. One of the factors that may be hindering this is the fact that, in Portugal, there is almost no discussion about the functioning of the public administration, the delivery of public services or the results of the public policies implemented. And when such discussion is promoted, the line of argument relies on different ideological positions rather than on informed, scientific and evidence-based debate. When some policy sectors are underperforming, it is commonly argued, by the opposition or by relevant stakeholders, that they should be evaluated. But opinion influencers, such as Nazaré or Rodrigues, have often reported that in Portugal there is little and limited knowledge about the results of public policies and public expenditure (Nazaré 2017; Rodrigues 2017).

4.3 Civil Societies Demand Evaluations

Demand for evaluations from individual citizens, civil society organisations (CSOs), private enterprises or other actors is very uncommon. It may occur when certain political decisions are announced, as a way to challenge changes, but such occasional claims are not effective in the political arena of decision-making processes in Portugal. The decision-making mechanisms are traditionally connected with politics and the political parties' ideology, so they are not perceived as evidence-based politics or knowledge-informed processes. In this context, there may be some perception that the mindset and the arguments needed to make a point have to be political rather than technical. The perspective that evaluation is about 'telling the truth to power' seems to push civil society demand away, since the rules of the game have a different nature. There is little evidence of the gain from producing evaluations as a vehicle to improve or upgrade public policies or to change the course of certain options in Portugal, especially when other forms of pressure, such as media coverage, strikes or demonstrations, are more effective. One of the factors hindering the use of evaluations as a civil society demand for better governance may lie in the low levels of political trust in Portugal and a growing disengagement from politics, despite the current levels of education. Perhaps some of the legacy of 40 years of dictatorship remains in the way civil society perceives its own role in building a better society. With high levels of distrust and insufficient associative capacity, the State is viewed in a paternalistic way, which may hinder a stronger push for higher quality in the conduct of public affairs.

5 Professionalisation (System of Professionalisation)

5.1 Academic Study Courses, Further Training, Etc.

According to Ferrão and Mourato (2010), the 1990s represent the golden period for evaluation in Portugal. The first papers published addressed the debates and experiences of other countries. In 1996, a special issue of the journal Sociologia, Problemas e Práticas was dedicated to evaluation. Two years later, the National Institute of Administration, responsible for civil servants' induction and lifelong training, organised the first seminar about evaluation in public administration.


seminar on evaluation in public administration. But this boom did not flourish for long. The higher education available on evaluation has been limited to a few compulsory units within courses mainly in the social sciences. As of 2018, there were no specific programmes, master's degrees or diplomas focused exclusively on programme and public policy evaluation. Public administration and management, political science, and economics and public policy are the scientific disciplines in which evaluation may be found. Students usually first encounter the topic during their licentiate degree, sometimes as an independent curricular unit and sometimes as a subject within a unit on public policy. The same holds for some master's and doctoral programmes at universities such as the University of Aveiro, the Instituto Superior de Ciências Sociais e Políticas and the Instituto Universitário de Lisboa (ISCTE), to name just a few. Even though the existing higher education offer is not vast, there seems to be growing scientific production on evaluation-related topics, mainly master's dissertations and doctoral theses. This may indicate that students' interest in the subject is not matched by a sufficient offer in the national market. For civil servants and public-sector managers, the Instituto Nacional de Administração (INA) usually offers a brief annual course on evaluation, at a theoretical and basic level, which middle and top managers are legally bound to attend. In 2009, an initiative by INA and Camões brought a condensed version of the World Bank's training programme, IPDET, to Lisboa. In recent years, at least two universities have offered training courses on the evaluation of social programmes or oriented towards the third sector and the social economy. The first attempt by ISCTE to launch a course on impact evaluation methodologies in the social economy, in 2016, was not successful: as the minimum number of participants was not reached, the course was postponed, and in 2017 a new version adding a project management dimension was launched. The same pattern was observed with the Universidade Católica's post-graduate course on social impact evaluation, which only managed to enrol a sufficient number of graduates on its second attempt, in 2018. The third sector, namely foundations, seems increasingly interested in measuring and demonstrating the social value and impact created by its action. In 2017, the Portuguese Foundation Centre promoted several initiatives related to impact evaluation in social projects. Since this


sector is under tight value-for-money scrutiny, some foundations, like Calouste Gulbenkian, are increasing their demand for the evaluation of the programmes and projects they implement or support. Also in 2017, for the first time, the Asociación Iberica de Profesionales de Evaluadores (Aproeval), the Iberian association of evaluation professionals, held a seminar in Lisboa for those interested in evaluation, regardless of their relation to the field of practice (students, practitioners, evaluators or commissioners). This pioneering initiative provided an environment for exchange between the public sector, consultancy companies and foundations. The event, a full one-day seminar, tackled evaluation-related topics; the speakers were professionals from Portugal and Spain and came from different sectors: foundations, international development cooperation organisations, universities, consultancies and evaluation institutions. The seminar was an opportunity to build an Iberian network and foster cross-fertilisation, allowing Aproeval to establish a more robust presence in Portugal.

5.2 Profession/Discipline

The Portuguese Evaluation Society, AVALPortugal, was created in 2009 with the aim of promoting an evaluation culture around public policies and programmes, promoting and disseminating knowledge, and building evaluation capacity. The association allowed its members to create working groups on topics such as gender, professionalisation and impact evaluation. In order to fulfil its mission, AVALPortugal became a member of the Network of Evaluation Societies in Europe in 2010. In its first years some activities took place, such as seminars, but more recently only a small online presence was kept, namely a webpage and a Facebook profile where content from other organisations was reproduced. According to internal documents available on the webpage, the paralysis of AVALPortugal is due to its small membership and to members' inability to dedicate themselves fully to the organisation. Eventually, in 2018, the current board decided to wind up the association and started an open process of strategic reflection to define new forms of evaluation capacity development. Aproeval was created in Spain in 2014 by a group of alumni and teachers of the Master in Public Policy and Programme Evaluation at the


Complutense University of Madrid. The association has had an Iberian vocation from the outset, but only in 2017 was it possible to raise awareness of its existence in Portugal. The evaluation market in Portugal is driven mainly by public-sector demand, led by the obligation attached to European funds to produce regular evaluation exercises. The available public procurement information reveals that tenders for evaluation services, EU-related or not, are most of the time won by consulting firms or by scientific research institutes based at universities. In Portugal, the term 'evaluator' is not common. Its low prevalence may stem from the fact that professionals carrying out evaluation activities usually present themselves as consultants or researchers, which in turn hinders the affirmation of 'evaluator' as a professional category. Another reason for the absence of the term from the Portuguese lexicon is the lack of higher education dedicated exclusively to evaluation, as already exists in other European countries, and the non-existent debate on whether to develop a certification system or competence validation for those working in evaluation. There is no professorship in evaluation. The discipline and the added value of programme and public policy evaluation are part neither of the political agenda nor of civil society debate. Universities have stepped in, treating evaluation as a business opportunity by offering courses and winning evaluation contracts, but perhaps still not doing enough to promote the discipline as a pillar of evidence and knowledge generation.

5.3 Compliance to Standards and Quality Obligations

As far as standards compliance is concerned, there is no unified practice regarding quality. For instance, Instituto Camões has adopted an evaluation code of ethics, standards for evaluation management and quality standards for development evaluation, as well as other guidelines, such as rules to avoid conflicts of interest. In the case of AdC, the European acquis on evaluation is transposed into the national practice of European funds evaluation. One of the instruments used by organisations is the definition of criteria and requirements through the terms of reference.

6 Conclusion

Portugal has made serious progress in generating an evaluation culture in some policy fields. A series of tendencies offer an optimistic perspective for


the future of evaluation capacity building in Portugal: academia is pushing the topic forward by offering new and specific training opportunities; new players, such as foundations, are showing a growing interest in the subject; the international movement around the SDGs is fuelling awareness of and demand for better policies and better use of public funds; and training opportunities across Europe, such as IPDET, together with several online courses, may shorten the learning curve for emerging young evaluators. Nevertheless, a great deal of work awaits this community of teachers, students, evaluators and practitioners: raising awareness in society and working closely with relevant stakeholders such as the government, parliament and political parties. We hope this chapter can contribute to the debate and push the discipline forward.

References

Agência para o Desenvolvimento e Coesão (AdC). (2016a). Plano global de Avaliação 2014–2020. Orientações para o acompanhamento da implementação das recomendações das avaliações do Portugal 2020 (follow-up). Rede PT 2020. https://www.portugal2020.pt/Portal2020/Media/Default/Docs/AVALIACAO/20160701_PGA_PT2020_Orientac%CC%A7o%CC%83esFollowUp-3.pdf. Accessed on 7 February 2019.

Agência para o Desenvolvimento e Coesão (AdC). (2016b). Plano global de Avaliação 2014–2020. Orientações para o planeamento e preparação das avaliações do Portugal 2020. Rede PT 2020. https://www.portugal2020.pt/Portal2020/Media/Default/Docs/AVALIACAO/02_20161228_Orienta%C3%A7%C3%B5esTdR_dez2016.pdf. Accessed on 7 February 2019.

Campos, A. C. (1998). Apresentação do I Encontro. In A avaliação na Administração Pública (pp. 7–9). 1.º Encontro INA. Lisboa: INA.

Capucha, L., Pedroso, P., & De Almeida, J. (1996). Metodologias de avaliação: o estado da arte em Portugal. Sociologia – Problemas e Práticas, 22, 9–27.

Diário da República. (1988). Lei No. 77/88 de organização e funcionamento dos serviços da Assembleia da República.

Diário da República. (1997). Lei No. 98/1997 de organização do Tribunal de Contas.

Diogo, A. (2018). Hacia un sistema integrado de M&E para los objetivos de desarrollo sostenible en Portugal (Master thesis). Universidad Complutense de Madrid, Madrid. https://www.somosiberoamerica.org/wp-content/uploads/2018/10/Ana-Diogo-Hacia-un-sistema-ME-ODS-en-Portugal_UCM.pdf. Accessed on 7 February 2019.

Ferrão, J. (1996). A avaliação comunitária de programas regionais: aspectos de uma experiência recente. Sociologia – Problemas e Práticas, 22, 29–41.


Ferrão, J., & Mourato, J. (2010). A avaliação de políticas públicas como factor de aprendizagem, inovação institucional e cidadania: o caso da política do ordenamento do território em Portugal. Revista Brasileira de Estudos Urbanos e Regionais, 12(1), 9–28.

Furubo, J.-E., Rist, R. C., & Sandahl, R. (2002). International Atlas of Evaluation. New Brunswick: Transaction Publishers.

Government Resolution. (2006). Resolution of the Council of Ministers, No. 65.

Government Resolution. (2013). Resolution of the Council of Ministers, No. 103.

Government Resolution. (2015). Resolution of the Council of Ministers, No. 12b.

Government Resolution. (2017). Resolution of the Council of Ministers, No. 44.

Instituto Camões da Cooperação e da Língua. (2012). Several working documents. https://instituto-camoes.pt/sobre/sobre-nos/transparencia/avaliacao-e-auditoria/avaliacao. Accessed on 7 February 2019.

Jacob, S., Speer, S., & Furubo, J.-E. (2015). The institutionalization of evaluation matters: Updating the International Atlas of Evaluation 10 years later. Evaluation, 21(1), 6–31.

Martins, P. S. (2018). Public policy, big data, and counterfactual evaluation: An illustration from an employment activation programme. In N. Crato & P. Paruolo (Eds.), Data-driven policy impact evaluation: How access to microdata is transforming policy design (pp. 149–163). Cham: Springer.

Ministério dos Negócios Estrangeiros (MNE). (2017). Relatório nacional sobre a implementação da Agenda 2030 para o Desenvolvimento Sustentável. Por ocasião da Apresentação Nacional Voluntária no Fórum Político de Alto Nível das Nações Unidas, julho 2017, Nova Iorque. https://sustainabledevelopment.un.org/content/documents/15771Portugal2017_PT_REV_FINAL_28_06_2017.pdf. Accessed on 7 February 2019.

Nazaré, L. (2017). Quem quer fazer contas? Jornal de Negócios [Press release]. https://www.jornaldenegocios.pt/opiniao/colunistas/luis-nazare/detalhe/quem-quer-fazer-contas. Accessed on 7 February 2019.

Rodrigues, M. L. (2017). Avaliar as políticas públicas, melhorar a qualidade do debate político. Diário de Notícias [Press release]. https://www.dn.pt/opiniao/opiniao-dn/maria-de-lurdes-rodrigues/interior/avaliar-as-politicaspublicas-melhorar-a-qualidade-do-debate-politico-5724937.html. Accessed on 7 February 2019.

Tavares, J. (1998). Administração, Controlo, Avaliação e Responsabilidade. A avaliação na Administração Pública. 1.º Encontro INA. Lisboa: INA.

Torres, M. (2000). Avaliação de políticas públicas. Razões para o recente desenvolvimento desta figura. A situação portuguesa. Um pretexto para discorrer sobre o binómio Estado/sociedade civil (Master thesis). Universidade Nova de Lisboa, Lisboa.

Torres, M. (2010). Potencialidades e desafios da avaliação. Avaliação e Boa Governação. Lisboa: Modelos e Práticas.


Varone, F., & Jacob, S. (2004). Institutionnalisation de l'évaluation et nouvelle gestion publique: un état des lieux comparatif. Revue internationale de politique comparée, 11(2), 271–292.

Vicente, A. (2013). A avaliação dos processos de reforma administrativa em Portugal (Master thesis). Universidade de Aveiro, Aveiro. http://hdl.handle.net/10773/12211. Accessed on 7 February 2019.

PART V

Central Eastern Europe

CHAPTER 14

Czechia

Jiří Remr and Oto Potluka

1 General Country Overview

Publications describing the development of evaluation culture have so far neglected the countries of Central and Eastern Europe. A comprehensive review of evaluations covered countries worldwide (Furubo et al. 2002), and current global trends in evaluation have also been discussed by Barbier and Hawkins (2012) and Stockmann and Meyer (2016), but none of these studies covered Central Europe, including Czechia. This chapter therefore tries to shed some light on what has been happening in the field of evaluation in Czechia and to fill this information gap. The political system in the former Czechoslovakia was shaped by the Soviet-style Communist regime that was forcibly installed after 1948. Democratic governance was distorted, as political power was dominated by the Communist Party until 1989.


During that time, a centrally planned economy was applied to control economic activities. This quasi-omniscience of the political elites also influenced the development of evaluation, its perception and the approach to its utilisation. The centrally planned economy required comprehensive and detailed performance-related data (evidence) and information about the effectiveness of implemented interventions (not only in the economy but also in social development). Even though terms like 'monitoring' and 'evaluation' were not used at the time, a kind of monitoring and evaluation was an inevitable component of the planning, managing and supervising processes: it made it possible to set the plan and to find out the extent to which the plan was fulfilled. This heritage is an important and often forgotten element that even now shapes the way evaluations are perceived, commissioned and used. As a consequence:

• Many stakeholders in the public sector (e.g. ministries) perceive evaluations neither as a tool to improve policies nor as a way to prove results, but rather as a mechanism for reporting and supervision.

• There is low trust in the data and evidence that evaluation systems use; many stakeholders consider reported data artificial, imprecise and irrelevant (i.e. not representing the merit of interventions and evaluands). This is a heritage of pretending: responsible officials pretended their plans were logical, and economic actors pretended they performed the required activities and achieved the results.

• The goal-based approach and monitoring principles are well embedded, especially in the public sector. However, the specific terminology (such as interim evaluation, ex post evaluation, monitoring and absorption capacity) and the evaluation system as a whole, including its particular rationale, goals and objectives, came later, together with programming as such.

The last decades of dynamic socioeconomic development in Central European countries (i.e. after 2000) brought changes to management practice in the public, private and non-profit sectors, including the field of evaluation. The critical drivers of the expansion of evaluation are extrinsic in nature; they came mainly from the EU.


Accession to the European Union (EU) played an exceptional role, as it brought enormous EU financial assistance (EU Cohesion Policy), including programming principles and evaluations (see Bienias and Lewandowska 2009 for an overview of evaluation development in Central Europe). Another field of change concerns development cooperation performed within the framework of official development assistance (ODA) provided by the Czech Development Agency, which has also enabled the evaluation culture to develop. Both have raised awareness about evaluations among public servants, academics and consultants. This development has been shifting the perception of evaluations from an EU instrument of control and supervision towards the active application of evaluation in the policy learning cycle (Bienias et al. 2009, p. 149). Moreover, the vibrant evaluation scene led to the emergence of evaluation associations in these countries. The very first experience with this environment was the PHARE programme ('Poland and Hungary Assistance for Restructuring their Economies', created in 1989 and then expanded to ten other countries), which Czechia used from 1990. During the first phase (approximately until 1997) it supported projects addressing the transformation of the economy and society. Later (1998–2003), the programme aimed at preparing the country for accession to the EU. As part of this programme, monitoring and evaluation practice was introduced and started to develop. This chapter describes the current state of the art of evaluation culture in Czechia. It is organised as follows. The first section offers an insight into Czech national evaluation regulations, structures and processes. It covers national rules and sectoral approaches to evaluations in Czechia, describes evaluation practice, especially in the field of the European Structural and Investment Funds, and points out the use of evaluations and its key drivers. The second section focuses on societal dissemination. We pay particular attention to civil society and the way non-governmental organisations (NGOs) are helping to build evaluation capacities and improve evaluations. We then present some observations concerning public perception and discussion of the utilisation of evaluation findings. The third section provides an overview of the activities performed in order to professionalise evaluation and institutionalise it as a discipline in Czechia. We therefore examine the academic environment and the role that evaluation plays within it, including information about courses and further training. Moreover, we mention other activities performed


by non-academic stakeholders. Particular attention is paid to the Czech Evaluation Society and its efforts in the field of evaluation capacity building. We also describe and explain the most important mechanisms aimed at improving the quality of evaluations. The concluding, fourth section offers an estimation of expected future developments and the possible scenarios that evaluation might follow in the near future.

2 Institutional Structures and Processes

2.1 Legal Framework and Regulations Concerning the Evaluation

2.1.1 National Rules

Currently, there is no specific law in force that intentionally focuses on policy evaluations in Czechia. A new act on financial audit within the public sector is being prepared, in which some provisions shall be devoted to the evaluation of public policies and intervention programmes; however, the details have not yet been published. Though there is no specific provision on the evaluation of interventions or programmes financed from public budgets, there is a whole array of requirements that enable evaluation capacity to be built. For example, the requirement to apply the 'Three-E approach', that is, effectiveness, efficiency and economy, is part of the legal system (Act on Control Over Management of Czech Public Finance 2001). Furthermore, public authorities providing funding for specific interventions may set a requirement to conduct evaluations (i.e. at the programme or project level). The number of such requirements/evaluations varies over time depending on the number and size (importance) of the interventions. Besides this, regulatory impact assessment (RIA) is compulsory in Czechia (Government Resolution 2016), with only minor exceptions from the general rule that sets RIA as compulsory for all acts. In fact, many RIAs are of a pseudo-assessment nature because they are conducted only formally, by civil servants themselves. Typically, instead of using specific data and conducting robust analyses based on sound methodology and reliable data, only estimates or general (and in the given cases superficial) statistics are taken into account. There is no specific requirement concerning the methodology of RIAs; only general principles have been issued (Government Resolution 2016).


There is a lack of resources (budget, time) and guidance, and no training in the necessary instrumental knowledge and skills is provided (or such training is only ad hoc and insufficient). The lack of methodological guidance, together with insufficient funding and time pressure, are the main factors behind the currently suboptimal use of RIAs. A different arrangement (a split between internally and externally performed tasks) should be considered to improve the usefulness and effectiveness of this type of assessment.

2.1.2 Sectoral Approach to Evaluations

European Structural and Investment Funds

Evaluations are part of the European Structural and Investment Funds (ESIF), including evaluation plans which are negotiated with the European Commission. The legal basis for evaluation in this field in the past programming period was set by Council Regulation No. 1083/2006 (European Council 2006), which was superseded by Council Regulation No. 1303/2013 (European Council 2013). A set of guidelines concerning evaluations has been prepared by the National Coordination Authority at the Ministry for Regional Development. These guidelines are not legally binding, but they are highly respected by the implementation structure (e.g. managing authorities or intermediate bodies) and provide a general framework for performing and commissioning evaluations. Moreover, some operational programmes (especially the Operational Programme Employment) require evaluation at the level of individual projects. For instance, within the field of social innovation, the Operational Programme Employment requires that each project budget devote some proportion to evaluation. In this respect, the emphasis is put on formative evaluation, aiming to provide the project teams with an evaluation expert who helps them manage the projects more reasonably and efficiently. At the same time, the Operational Programme Employment also requires an adequate form of summative evaluation that summarises the outputs and outcomes of the funded projects. In Czechia, the 'Ethical Code of Evaluators' and the 'Formal Standards of Conducting Evaluations' are both generally binding, because they were


approved by the Government and released as an appendix to a governmental decree on guiding documents for evaluation (Government Resolution 2013). This is a unique situation.

ODA

The ODA is based on a particular act which, besides other provisions, defines the Ministry of Foreign Affairs as the body responsible for conducting the evaluations of development cooperation and humanitarian aid projects (Act on Development Cooperation and Humanitarian Aid 2010). In this respect, the Ministry cooperates closely with the Czech Development Agency, which the same law defines as the body responsible for implementing humanitarian aid projects. As part of its duties concerning the evaluation of projects performed within the framework of development cooperation, the Ministry of Foreign Affairs has been commissioning evaluations. It also publishes the yearly evaluation plans of the Czech ODA and provides complete lists of evaluations on its official website. Table 1 shows the numbers of evaluations performed in the given years. Altogether, 32 evaluations were conducted in this field from 2010 to 2017. For comparison, according to the Ministry of Regional Development, 401 evaluations were conducted in the field of the ESIF during 2007–2013. However, despite the significantly lower number of evaluations in the ODA, evaluators and other stakeholders consider this field equally crucial for evaluation capacity building, primarily because of the methodological requirements and the peer-review process built into the acceptance procedure for final reports. The number of delivered reports is stable, with the exception of 2011 and 2012, when more reports were prepared. Generally, the lower numbers after 2012 reflect the fact that development cooperation has become more concentrated (i.e. fewer priority countries) and more focused (meaning that fewer projects in the same field were conducted). Evaluations of the ODA differ from those performed in the ESIF especially because they are (a) conducted in situ, that is, evaluators must travel to the target destinations and communicate with local partners (which is rather exceptional in the ESIF, where many evaluations rely on desk research); and (b) performed on projects (whereas in the ESIF the programme level is evaluated predominantly, and individual projects are used only as case studies illustrating how the given operational programme works). Therefore, their methodology is unique in providing learning about this specific type of intervention.

Table 1 Number of performed evaluations in the Czech ODA (Ministry of Foreign Affairs)

Year of evaluation completion    Number of delivered reports
2010                             3
2011                             7
2012                             7
2013                             4
2014                             4
2015                             4
2016                             3
2017                             3

Education

Evaluation at schools covers all levels of the educational system below tertiary education (i.e. universities are not affected by this provision). Universities are instead evaluated as research and development (R&D) organisations within a specific evaluation system (see below). Evaluation in the education sector therefore concerns kindergartens (ages 3–5) and schools at primary (children aged 6–13) and secondary (students aged 14–18) levels. This obligation is defined by law (Act on Pre-School, Basic, Secondary, Tertiary Professional and Other Education 2004), under which it is compulsory to undergo an evaluation every second year. The evaluation has two components: (a) a self-evaluation performed internally by school personnel, most often initiated by the school director, and (b) an independent evaluation performed by the Czech School Inspection and by the funding institutions (typically municipal or regional offices). Even though the self-evaluation is performed internally, schools should disseminate their results: self-evaluation results should be incorporated into the annual reports that each school must publish. Decree 15/2005 Coll., which sets out requirements for long-term plans, annual reports and the school's self-evaluation, further specifies detailed provisions on how to focus the self-evaluation. The decree prescribes a goal-based evaluation, stipulating that the self-evaluation aims to assess the extent to which the qualitative and quantitative goals of the school are


fulfilled. It also states that the school must decide how to measure the quality of education. The self-evaluation should focus on the technical, economic, hygienic and other conditions of education, on the teaching process, on organisational culture and the management of relationships with parents and local communities, on the results of education (i.e. performance), on the school's strategic management, on the quality of human resource management, and on compliance with the long-term educational programme. Each school is encouraged to use all available secondary data and documents, including inspection reports, to conduct interviews with teachers and parents, to distribute questionnaires among pupils, teachers and parents, and to use the results of the nationwide skills, comprehension and knowledge tests. The results of self-evaluations are meant for improving the schools' performance, and therefore schools are not required to share the detailed results of self-evaluations with the supervising authorities (i.e. the Czech School Inspection). However, schools must report to the supervising bodies that they have fulfilled the obligation to perform the self-evaluation. The regulation contained in national law is rather simple and superficial, whereas the decree is far more elaborate and provides detailed guidelines on how to perform the self-evaluation. Apart from that, there are many guidelines, brochures, studies and checklists for schools aimed at increasing the knowledge and skills of those responsible for conducting self-evaluations and using their results. These tools are not legally binding, but they help schools comply with their legal obligations.

R&D

The former system of R&D evaluation, in force until 2016, overestimated the role of quantitative indicators (i.e. bibliometric data) and was therefore criticised not only by universities and research organisations but also by independent experts and foreign peers (see, e.g., the National Evaluation of Research Organisations by the Technopolis Group). In reaction to that, there was an attempt to implement a new system relying more extensively on peer review and a more complex set of indicators. According to the Ministry of Education, Youth and Sports, which is also responsible for R&D, the new system implemented since 2017 also brings a new focus of evaluation. It should no longer be backward-looking (i.e. measuring and evaluating the previous


performance of research organisations) but rather forward-oriented, taking into account the capacities of universities and research organisations to attain their mission. The research programmes of the Technological Agency represent an indirect route to evaluations. These programmes enable applied research to be conducted and some policies to be evaluated, depending more or less on the willingness of researchers to conduct such evaluative analyses. They concern mainly R&D policies (Čadil et al. 2016b; Srholec 2015) and EU funding policies (Potluka and Špaček 2013, 2016).

Other Sectors

As for the overall approach to evaluation in other sectors, the focus is primarily on processes. Thus, the requirements to conduct evaluations are aimed at identifying the level of compliance with procedures. This concerns especially the following sectors:

• the environment sector—waste management (Act on Waste and the Amendment of Some Other Acts 2001), energy savings (Act on Energy Management 2000), EIA and SEA;

• the public health sector (prevention programmes) (Act on Pharmaceuticals and on Amendments to Some Related Acts 2007);

• the domestic security sector (Act on Cyber Security 2014);

• the social affairs and employment sector (Act on Social Services 2006)—an example here is the system assuring quality standards in social services, where a set of standards is imposed and each social service provider has to comply with these minimum standards; however, such an evaluation should rather be considered an assessment or quality audit;

• the trade, business and industrial sectors—competitiveness and innovation (the Ministry of Industry and Trade, CzechInvest);

• NGOs—there is a rising attempt to use evaluations in order to declare results (to show accountability, for fundraising).

The primary emphasis is therefore put on finding evidence that the given regulations are followed and that deviations from the plan are minimal. Where there is an attempt to focus on results, they are typically assessed on an ex ante basis (see, e.g., environmental impact assessments).


To sum up, the sectoral regulations usually have a general anchor in legislative acts with the power of law, followed by technical details specified in a decree. Arrangements like these make it possible to perform evaluation activities (because public organisations such as ministries may only do what the law requires of them) and grant evaluations the necessary authority and respect (they are required by law). At the same time, because the detailed provisions are specified in a decree (passed by governmental decision and not by Parliament), the requirements remain flexible and open to immediate revision if necessary.

2.2 Evaluation Practice

It has not been customary to evaluate national policies in Czechia (or at least not in the way evaluation is currently defined). In this respect, the 'champions' are the ODA and the ESIF. The ODA has been helping to build evaluation capacities and develop an evaluation culture in Czechia; the fact that the majority of the founding members of the Czech Evaluation Society were evaluators of projects financed by the ODA exemplifies this situation. Five to ten projects are performed and evaluated within the ODA each year (Table 1). This is not a high number of evaluations; however, the evaluations usually use a sound methodology. Moreover, they are always broadly discussed, major stakeholders are involved and the reports are openly accessible, so they have a high impact on evaluation capacity building, knowledge transfer and skills improvement. In recent years, the leading role has been taken over by the ESIF (the EU Cohesion Policy). In this field, altogether 401 evaluations were performed between 2007 and 2013 (Table 2). Most of them are declared as ongoing evaluations (268, i.e. 67%), whereas results-focused evaluations (especially ex post evaluations) are rather rare, with only 18 (i.e. 5%).

Table 2 Number of evaluations of ESIF during the period 2007–2013 (Ministry of Regional Development)

Type of evaluation    Number of delivered reports
Ex ante               21 (5%)
On-going              268 (67%)
Ex post               18 (5%)
Ad hoc; others        94 (23%)
Total                 401 (100%)


Among ad hoc and other evaluations, most focus on the management and implementation of the programme (39), absorption capacity (14) and indicators (12); the remaining 29 ad hoc evaluations cover a variety of other topics. Internal evaluation has played an increasing role since 2014, as the growing number of evaluation units within the public sector shows. The role of these units is to conduct internal evaluations and to commission external ones. The ratio between internal and external evaluations is difficult to estimate, since many internal evaluations do not have formally published results such as a final report. Moreover, performing evaluations and producing evaluation reports are not these units' main tasks (they build internal capacities, create a framework, and deliver evaluation findings to relevant stakeholders). Within the ESIF, a useful practice has evolved: the National Coordination Authority participates in many external evaluations commissioned by other state departments. Employees of the National Coordination Authority are in these cases part of the team (on the commissioners' side), providing methodological guidance to the colleagues who are commissioning the given evaluation. This helps to maintain consistent methodology and congruence across different programmes, to share experience and to provide useful learning. Other stakeholders (e.g. evaluators) appreciate this approach very much. Another institution relevant from the evaluation perspective is the National Audit Office. It focuses especially on financial supervision, public procurement and public bodies' spending. However, as part of its reviews, the National Audit Office also examines the effectiveness and efficiency of other public bodies and their activities, and therefore it not only uses audit methodology but also conducts evaluations. Generally, there is high interest in impact evaluation among both commissioners (who want or need such evaluations) and evaluators (who try to respond to the demand). Impact evaluations have therefore been discussed at almost every evaluation conference in Czechia in recent years,1 although impact evaluations themselves have been performed rather rarely (Table 3; Potluka and Brůha 2013).

1 For instance, in 2014 there were two workshops at the Annual Conference of the Czech Evaluation Society devoted to impact evaluation (one specifically focused on CIE); similarly, in 2016 several papers presented the results of impact evaluations. Even the first Annual Conference of the Czech Evaluation Society, in 2011, introduced impact evaluation case studies.


Table 3 The primary focus of evaluations of ESIF during the period 2007–2013 (Ministry of Regional Development)

The focus of the evaluation (TOP 5 categories)      Number of delivered reports
Management and implementation of the programme      130
Absorption capacity                                  41
Indicators                                           33
Publicity                                            27
Impacts                                              25

Most of the conducted evaluations are process evaluations. On the other hand, many evaluations do serve learning: they are used as studies and analytical reports. Many evaluations also serve as proof that the programmes were managed well. In many cases, feedback, an audit or an effectiveness analysis is used instead of an evaluation. There were a few exceptional cases of attempts to conduct counterfactual impact evaluation (CIE) based on sound methodology. These cases come from the fields of R&D (Čadil et al. 2016a; Dvouletý and Blažková 2017) and active labour market policies (Hora and Sirovátka 2012; Potluka et al. 2016, 2017a). It is difficult to count all evaluation reports (i.e. evaluations). The reason is that some evaluations are not declared as evaluations and do not even use the term 'evaluation', despite the fact that they evaluate the given programmes or policies. On the other hand, some reports are declared as evaluations even though they resemble an analysis, a survey, an opinion poll or an interpretation of statistical data. Impact evaluations rarely use a rigorous counterfactual impact evaluation design. This has recently started to change, as the EU Cohesion Policy as well as national R&D policies (Čadil et al. 2016a; Dvouletý and Blažková 2017) and active labour market policies (Hora and Sirovátka 2012; Potluka et al. 2016, 2017b) have used this approach.


Table 4 Number of evaluations delivered in ESIF (Ministry of Regional Development)

Evaluation completion (year)    Number of reports
2007                            8
2008                            25
2009                            67
2010                            58
2011                            72
2012                            49
2013                            46
2014                            29
2015                            32
2016                            7

Another issue concerning evaluation practice is the volatility of demand for evaluation, which to a large extent reflects the programming cycle (Table 4). During the first two years of a programming period, demand for evaluations is lower (2007/2008; 2014/2015). It also depends on the volume of open calls for proposals. Such fluctuations in volume are not sustainable for most providers, as they have to allocate their evaluation capacities; in order to stay on the market, they must also perform other activities (such as consulting) or reduce staff. The supply side of the evaluation market is further characterised by insufficient resources for long-term development and capacity building. It was not exceptional for tenders to involve no qualitative criteria in the selection of an evaluator. Price was the only decisive criterion, as commissioners intended to achieve transparency and to avoid possible objections from other competitors and appeals from supervising or audit authorities. However, such practice had a disastrous effect on the quality of evaluations: some providers without any experience in evaluation were able to win tenders by bidding very low prices. When demand for evaluation increases (e.g. 2009; 2016/2017), supply is insufficient (experienced evaluators are not available, or only poor-quality services are provided by inexperienced evaluators).

2.3 Use of Evaluations

Evaluations are hardly recognised by members of Parliament. Most provisions come from the National Coordination Authority and are


introduced as governmental decrees (i.e. bypassing parliamentary structures). On the other hand, programme and project managers do use evaluations, especially summative evaluations, while administrative personnel are in a different position and deal mainly with formative evaluations. Programme and project managers are the typical users of evaluations. The evaluation culture and the use of evaluations as learning tools are driven by the ODA, the ESIF and universities. National policies are evaluated rather rarely. From a sectoral point of view, the main user of evaluations is the Ministry of Labour and Social Affairs. Evaluation is primarily used for legitimisation purposes. Audits are used for control and lie beyond the scope of evaluation. In some cases, evaluations and their processes have been used for the learning of the evaluation community. Quality requirements for evaluations exist. They are defined by the quality requirements issued by the Czech Evaluation Society (Czech Evaluation Society 2013) and applied by several public agencies to their evaluations. In the case of externally conducted evaluations, programme managers define the actual requirements relating to evaluators' competencies in the terms of reference. Thus, the quality requirements vary according to demand. Such requirements do not ask for a specific education. In some tenders, professional experience in evaluation is required: the applicant must prove that he or she has performed, for example, three similar evaluations during the past three years. Requirements are set arbitrarily and can sometimes hardly be met (e.g. an ex ante evaluation during the past three years required at the end of a programming period). In many tenders, however, the only requirement is basic literacy (price gets 100% weight, so the applicant must merely pass the basic qualification criteria and offer the lowest price or cheapest solution). After an evaluation is delivered, programme managers conduct a review and discuss the results with the external evaluators. Moreover, the Czech Evaluation Society organises voluntary peer review to help increase the quality of evaluations, especially in the ODA. Meta-evaluations are carried out only on an ad hoc basis; some are conducted as independent research activities. The application of evaluations' recommendations is solely the responsibility of the managers of the evaluated programmes. Thus, the application of recommendations varies according to the approach of particular


managers. For the evaluation of the ESIF, there is an administrative procedure that defines how evaluation results must be tracked and implemented. According to this rule, every single recommendation mentioned in an evaluation report must be tracked. Evaluation commissioners must comment on each recommendation and, after a reasonable time, declare whether or not the recommendation was implemented. If it was implemented, the result of the implementation must also be provided; if not, proper arguments explaining the reasons for refusing the recommendation must be specified. The success of an evaluation's recommendations depends on the political skills of the evaluators and the submitting body. A tradition of accountability is missing; project management and learning organisations are almost unknown in the public sector, and interest in learning and feedback is shallow. Thus, the success of an evaluation depends more on political skills and political networks than on the strength of the arguments the evaluation provides. The ESIF tracking procedure was introduced in order to encourage the use of evaluation results. Before this provision was set, evaluation reports were very often put on a shelf and no further steps were taken. It seems that, besides tracking the findings and recommendations coming from evaluations in detail, the system increases the recognition of evaluation and enhances its importance.

3 Societal Dissemination

3.1 Use of Evaluations by the Civil Society

Evaluations are more or less a national-level policy issue. If citizens or civil society organisations participate in evaluations, this is usually initiated by funding obtained from national or supranational programmes. Like national politicians, local political elites stay away from evaluations and do not require them on their own initiative unless there is a top-down requirement (e.g. set by law, as in the case of the evaluation of schools). The authors of this chapter are aware of analytical studies conducted for political decision-making at the local level, but these studies do not have the merit of evaluations.


NGOs take part in evaluations of their projects. During the years 2007–2013, self-evaluation of projects funded by the European Social Fund started to be a requirement. Thus, NGOs implementing these projects built evaluation capacities. These capacities were only short-lived, as they were very closely tied to the implemented projects and their budgets; when the projects ended, no additional funding remained to maintain them. This weak position of evaluations is also underlined by the situation in the civil society sector. The NGO representatives involved in the partnership process in the programming of the ESIF complained that the NGO sector lacks access to the hard data and persuasive analyses that would enable it to lobby efficiently (Potluka et al. 2017a). Some grant-making foundations are open to evaluations of their programmes even if they do not have their own evaluation capacities. For example, the Civil Society Development Foundation and the Partnership Foundation have provided data for voluntary evaluations of their programmes. This reflects their approach as learning organisations capable of using evaluations beyond the standard approach (Batterbury 2006).

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Evaluation reports financed from public budgets are publicly available. In some exceptional cases, the full reports are not posted on a website; however, every citizen is entitled to invoke the act on free access to information (Act on Free Access to Information 1999). This law grants the right to request (and receive) any information that is not classified or otherwise confidential. Public authorities, when asked, must deliver the required information, including evaluation reports and their appendices. The general issue is that public bodies usually fulfil their informational duties by placing information on the Internet, but they do not attempt to deliver such information to relevant stakeholders or to disseminate it actively. This also concerns evaluations. Different departments take different approaches. The Ministry for Regional Development is transparent, that is, it makes all reports public. Beyond transparency, the Ministry of Foreign Affairs is pro-active,


as it also actively disseminates evaluation findings. Some other departments publish executive summaries of evaluation reports, and some provide them only on request. There are concerns about the negative feedback (or identified weaknesses) that evaluation reports point out: civil servants are afraid that the feedback (originally meant to help them improve the programmes) will be misused to ostracise them. Public discourse happens on a limited scale and always within a specific field. For example, schools are evaluated according to several criteria (Map of School, SCIO) by parents, teachers and children. School directors receive information on the position of their school in comparison with other schools in Czechia; it is up to them how they communicate the results to the public, if they do so at all. The general public receives information about the results of policy evaluations through the media. This is usually biased by how journalists want to report on public policies, and the methodology of evaluation is entirely missing, as neither journalists nor the general public would understand it. Journalists tend to misuse the findings of evaluations: they pick only the most negative findings because they consider them 'attractive' or 'sensational' for their readers. A balanced debate is rather lacking. For example, in 2016 there was a series of articles (even engaging the readers) in one of the most widely read newspapers on the most absurd projects financed from the ESIF. Recently, there has been only one short (three-column, 10 cm high) article about evaluation, with the headline: "How not to waste money—unknown profession might help" (Rychlík 2016).

4 Professionalisation

4.1 Academic Study Courses and Further Training

Academic Study Courses

Several individual courses on evaluation are taught at Czech universities. However, no comprehensive programme in evaluation and evaluation methodology exists at the university level. Such courses are usually the initiative of individual scientists, especially those in the fields of public finance and public policy, regional development, and the humanities. From the perspective of scientific methodology, there are several courses combining evaluation


approaches and evaluation methodology. These are rarely labelled as evaluation courses; they are instead called scientific methodology or methods in science. The universities where evaluation is taught in individual courses are the following:

• Charles University, Faculty of Social Sciences (Evaluation Research, Evaluation of Public Policy)
• Charles University, Faculty of Arts (Social Work Effectiveness)
• Charles University, Pedagogical Faculty (Evaluation at schools)
• Charles University, Faculty of Natural Sciences
• The University of Economics in Prague (Regional Development)
• Masaryk University, Faculty of Social Sciences (Evaluation of intervention programmes and social interventions)
• Palacký University, Developmental Studies (Principles of ODA).

Further Training

Beyond the scope of universities, there are several initiatives concerning further training and the dissemination of evaluation knowledge. The Czech Evaluation Society is accredited by the Ministry of the Interior to run courses on evaluation for civil servants in municipalities. Besides this, the Czech Evaluation Society organises ad hoc training and regular meetings to disseminate experience in evaluations. Each meeting is dedicated to one topic, and these meetings usually respond to needs arising from evaluation practice (e.g. contracting, the use of advanced methods in an evaluation). Since 2017, the training session 'Evaluation Minimum' has been organised.

4.2 Evaluation as a Profession

Evaluation has started to constitute a new profession globally (Meyer 2016). In Czechia, not all defining elements of a profession have been met. Professional ethics, the identity of evaluators and their social status have been established, and the Czech Evaluation Society has been set up as a national professional organisation. On the other hand, science-based education and educational programmes at universities still do not exist in the form of a comprehensive programme. Moreover, evaluation is not officially recognised


as a profession (it is not listed among the professions). Thus, a certification system for evaluators does not exist. Evaluation is also not recognised as an economic activity, so the companies providing evaluations officially declare themselves as 'consulting firms' or 'research companies'. Small steps towards establishing evaluation as a profession have been made. The Czech Evaluation Society has introduced two crucial documents. The first is the 'Ethical Code' (Czech Evaluation Society 2011), which the Society has had since 2011. This document reflects the values that members consider essential in the conduct of evaluation: proficiency, integrity and accountability. Each of these values is elaborated in detail, and a set of specific provisions is defined. The second important document prepared by the Czech Evaluation Society is the 'Formal Standards for Evaluation' (Czech Evaluation Society 2013). Beside the Ethical Code, the Czech Evaluation Society adopted the Programme Evaluation Standards of the Joint Committee on Standards for Educational Evaluation (for more information, see www.jcsee.org). In this respect, the Czech Evaluation Society decided not to create its own set of standards because it wanted to express its commitment and adherence to norms also adopted by other evaluation societies around the world. In doing so, the Czech Evaluation Society shows its appreciation of the principles defined by evaluators from countries with a long tradition in evaluation. Since 2016, the Czech Evaluation Society has organised a competition for university students. Students from different disciplines are asked to form small groups of three to five members competing for the best evaluation design proposal on a given topic. The primary goal of this competition is to build evaluation capacities. Thanks to this event, students have a great opportunity to apply in practice the theoretical knowledge they gain during the semester. Moreover, the competition stimulates students' interest, keeps them active and engaged during the semester and introduces a gamification element into their education and future career development. Typically, four to six teams from different universities across the country take part. Their proposals are carefully peer-reviewed, and the winners are announced. The winners are also invited to the Annual Conference of the Czech Evaluation Society, where they can present their proposal and thus attract prospective employers. The competition was inspired by the one that has for


many years been organised by the Canadian Evaluation Society, and cooperation on the competition between the Canadian and Czech evaluation societies is currently under way.

Czech Evaluation Society

The Czech Evaluation Society was set up in 2007. Since then it has developed into a respected institution in the field of evaluation, not only in Czechia but within the whole Central European region. It is a non-profit, non-governmental and independent organisation staffed by volunteers. The primary goal of the Czech Evaluation Society is to bring together individuals and institutions interested in evaluations. Its members are driven by their concern for building evaluation capacities within Czechia, increasing awareness of evaluations among relevant target groups, improving the reputation of evaluations and advancing the professional skills of practitioners and commissioners. Membership data illustrate its size: in 2018 (the last available data) there were 44 members. The annual conference of the Czech Evaluation Society plays a crucial role in the exchange of experience in evaluations and in opening new horizons in evaluation. Held since 2011, this one-day event has become the most important evaluation event in Czechia. Each year more than 100 participants attend approximately 15 lectures by Czech and often also foreign speakers, carefully selected by the Conference Steering Committee. Other activities of the Czech Evaluation Society relate to building an evaluation culture. It publishes a bi-monthly newsletter in the form of an e-zine. The newsletter summarises the key events of the past two months, informs about planned conferences organised throughout Europe and elsewhere around the world, reports on key decisions of the board of the Czech Evaluation Society and on the activities of the working groups, and introduces new members. Since 2013, a scientific journal, 'Evaluační teorie a praxe',2 has been published regularly by the Institute for Evaluations and Social Analyses in cooperation with the Czech Evaluation Society. It started as a journal with Czech articles only, but recently more and more articles have been published in English. Since 2016, the journal has been indexed in the ERIH PLUS (European Reference Index for the Humanities and Social Sciences) database.

2 Engl.: 'Evaluation Theory and Practice'.

14

CZECHIA

371

ERIH PLUS database (European Reference Index for the Humanities and Social Sciences). It is the only journal in Czechia that focuses on evaluation on a regular basis. However, articles relevant from an evaluation point of view occasionally also appear in journals such as 'Forum for Social Policy', 'Society & Scientia', 'Regional Studies' (Czech: 'Regionální studia') or 'Envigogika'.

4.3 Mechanisms to Achieve the High Quality of Evaluations

There is a general arbitration mechanism which does not relate specifically to evaluations: contracting parties may decide that, instead of going to court, they will settle a possible breach of contract before an arbiter. However, many contracts (pre-defined by key commissioners) do not use arbitration. The members of the Czech Evaluation Society discussed the issue of an arbitration board but decided not to set it up; the reason was the relatively small evaluation market in Czechia, in which conflicts of interest could easily arise. Thus, the standards and codes of conduct issued by the Czech Evaluation Society are followed voluntarily, even by non-members, notably in cases when these standards are part of the terms of reference as an approach expected of the winning evaluators.

In the ESIF, steering committees exist at the level of individual projects, composed of the commissioners (civil servants using the evaluation findings), the providers and experts from the National Coordination Authority (who help by providing methodological guidance or supervision). In the ODA, a working group for evaluation is established on a long-term basis; it reviews all evaluations conducted and provides detailed feedback. It helps all relevant stakeholders communicate about expectations, quality standards, methodologies and mechanisms for improving the evaluations performed.

Since 2015, another conference (open to all evaluators) has been organised by the National Coordination Authority; it specialises in issues relevant to evaluations of the European Structural and Investment Funds. This conference is usually attended by roughly the same number of participants as the Annual Conference of the Czech Evaluation Society (i.e. approximately 100 attendees).


In substance, it focuses on evaluation issues that are specific to the field of the European Structural and Investment Funds. As at the Annual Conference of the Czech Evaluation Society, the main focus is methodological: performing counterfactual impact evaluations, the experience of different managing authorities with managing the evaluation process, and commissioning. International speakers are an essential benefit of this conference, as they bring and share their experience from more developed evaluation systems and organisations.

5 Conclusions and Future Outlook

The development of the last decade shows that evaluation capacities are growing in Czechia. This concerns both evaluators and commissioners of evaluations. In the public sector, evaluations have ceased to be an enforced external requirement and have become a valuable addition in organisations capable of learning and developing. However, there are still some challenges ahead of the evaluation community in Czechia. Cooperation among all stakeholders is an essential issue for evaluation capacity building, especially in the civil society sector. Moreover, evaluations are predominantly used in education, the ODA and the ESIF. Outside this framework, especially in national policies, evaluation remains underdeveloped and underutilised. We see this as a lost opportunity, given the potential interest of the whole population: we all are taxpayers, and evaluations have a considerable potential to improve the implementation of public policies and save taxpayers' money. Global developments in evaluation also concern the Czech evaluation community. The current pressure to use Big Data in evaluations (Petersson and Breul 2017) and the speed required for the delivery of evaluation reports and recommendations are among the most pressing challenges.

References

Act on Free Access to Information. (1999). Parliament of Czechia. No. 106.
Act on Energy Management. (2000). Parliament of Czechia. No. 406.
Act on Control Over Management of Czech Public Finance. (2001). Parliament of Czechia. No. 320.
Act on Waste and the Amendment of Some Other Acts. (2001). Parliament of Czechia. No. 185.
Act on Pre-School, Basic, Secondary, Tertiary Professional and Other Education. (2004). Parliament of Czechia. No. 561.
Act on Social Services. (2006). Parliament of Czechia. No. 108.


Act on Pharmaceuticals and on Amendments to Some Related Acts. (2007). Parliament of Czechia. No. 378.
Act on Development Cooperation and Humanitarian Aid. (2010). Parliament of Czechia. No. 151.
Act on Cyber Security. (2014). Parliament of Czechia. No. 181.
Barbier, J.-C., & Hawkins, P. (Eds.). (2012). Evaluation cultures: Sense-making in complex times. New Brunswick: Transaction Publishers.
Batterbury, S. C. E. (2006). Principles and purposes of European Union Cohesion policy evaluation. Regional Studies, 40(2), 179–188.
Bienias, S., Gapski, T., Jąkalski, J., Lewandowska, I., Mackiewicz, M., Opałka, E., et al. (2009). Evaluation in Poland: Brief overview of evaluation process of EU Cohesion Policy funds. In S. Bienias & I. Lewandowska (Eds.), Evaluation systems in the Visegrad member states (pp. 140–189). Warsaw: Ministry of Regional Development.
Bienias, S., & Lewandowska, I. (Eds.). (2009). Evaluation systems in the Visegrad member states. Warsaw: Ministry of Regional Development.
Čadil, J., Mirošník, K., & Petkovová, L. (2016a). Impact of R&D subsidies on enterprise performance in the Czech Republic. Society and Economy, 38(3), 387–398.
Čadil, J., Petkovová, L., Hartman, D., Mirošník, K., Fischer, J., Mertlík, P., & Beránek, M. (2016b). Metodika hodnocení podpory R&D z hlediska konkurenceschopnosti podpořených subjektů.
Czech Evaluation Society. (2011). Etický kodex evaluátora (p. 4). Prague: Czech Evaluation Society.
Czech Evaluation Society. (2013). Formální standardy provádění evaluací (p. 3). Prague: Czech Evaluation Society.
Dvouletý, O., & Blažková, I. (2017). Are publicly supported companies in the Czech food and drink industry performing better? Initial findings from the microdata. Paper presented at the Innovation Management, Entrepreneurship and Sustainability (IMES 2017), Prague. http://imes.vse.cz/wp-content/uploads/2015/08/Conference_Proceedings_IMES_2017.pdf. Accessed on 23 January 2019.
Council of the European Union. (2006). Council Regulation (EC) No. 1083/2006 of 11 July 2006 laying down general provisions on the European Regional Development Fund, the European Social Fund and the Cohesion Fund and repealing Regulation (EC) No. 1260/1999. Official Journal of the European Communities, 210.
Furubo, J.-E., Rist, R. C., & Sandahl, R. (2002). International atlas of evaluation. New Brunswick: Transaction Publishers.
Government Resolution. (2013). Government resolution on a set of methodological documents on the areas of evaluation, generation and use of indicators, eligibility of expenditure and their reporting and risk management in the programming period 2014–2020, No. 597.


Government Resolution. (2016). Government resolution on general principles of regulatory impact assessment (RIA), No. 76.
Hora, O., & Sirovátka, T. (2012). Srovnání efektů aktivní politiky zaměstnanosti v České republice v období růstu (2007) a během první fáze krize (2009). Praha: Výzkumný ústav práce a sociálních věcí, v.v.i.
Meyer, W. (2016). Toward professionalization? The contribution of university-based training programs in pioneer countries. In R. Stockmann & W. Meyer (Eds.), The future of evaluation: Global trends, new challenges, shared perspectives (pp. 98–112). Basingstoke: Palgrave Macmillan.
Petersson, G. J., & Breul, J. D. (Eds.). (2017). Cyber society, big data, and evaluation: Comparative policy evaluation (Vol. 24). London: Routledge.
Potluka, O., & Brůha, J. (2013). Zkušenosti s kontrafaktuální dopadovou evaluací v České republice. Evaluation Theory and Practice, 1(1), 53–68.
Potluka, O., & Špaček, M. (2013). Postupy a metody kontrafaktuálních dopadových evaluací pro Operační program Zaměstnanost v období 2014–2020. https://www.mpsv.cz/files/clanky/17051/Metodika_CIE_MPSV_131015.pdf. Accessed on 23 January 2019.
Potluka, O., & Špaček, M. (2016). Metodika pro aplikaci kontrafaktuálních dopadových evaluací Operačního programu Životní prostředí 2014–2020.
Potluka, O., Brůha, J., Špaček, M., & Vrbová, L. (2016). Counterfactual impact evaluation on EU Cohesion Policy interventions in training in companies. Ekonomický časopis, 64(6), 575–595.
Potluka, O., Špaček, M., & Remr, J. (2017a). NGOs as partners: Obstacles in the EU Cohesion Policy? Ekonomický časopis, 65(8), 715–736.
Potluka, O., Špaček, M., & von Schnurbein, G. (2017b). Impact of the EU Structural Funds on financial capacities of non-profit organizations. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, 28(5), 2200–2223.
Regulation (EU). (2013). No. 1303/2013 of the European Parliament and of the Council of 17 December 2013 laying down common provisions on the European Regional Development Fund, the European Social Fund, the Cohesion Fund, the European Agricultural Fund for Rural Development and the European Maritime and Fisheries Fund and laying down general provisions on the European Regional Development Fund, the European Social Fund, the Cohesion Fund and the European Maritime and Fisheries Fund. Official Journal of the European Communities, 20.
Rychlík, M. (2016). Jak neplýtvat penězi? Pomůže neznámá profese. Lidové noviny.
Srholec, M. (2015). Metodika hodnocení hospodárnosti účelové podpory pro aplikovaný výzkum a vývoj ve firmách. https://tacr.cz/dokums_raw/metodiky/Metodika%20hodnocen%C3%AD%20hospod%C3%A1rnosti%20%C3%BA%C4%8Delov%C3%A9%20podpory%20pro%20aplikovan%C3%BD%20v%C3%BDzkum%20a%20v%C3%BDvoj%20ve%20firm%C3%A1ch_Srholec.pdf. Accessed on 23 January 2019.
Stockmann, R., & Meyer, W. (Eds.). (2016). The future of evaluation: Global trends, new challenges, shared perspectives. Basingstoke: Palgrave Macmillan.

CHAPTER 15

Latvia

Krišjānis Veitners

1 General Country Overview

Since regaining its independence in 1991, Latvia has transformed from a former Soviet republic into a parliamentary multi-party democracy. Latvia is a unitary state in which executive power is exercised by the government (Cabinet of Ministers [CoM]), consisting of 13 ministries. The ministries and the centre of government (including the State Chancellery and the Cross-Sectoral Coordination Centre [PKC], directly subordinated to the Prime Minister) are responsible for 18 policy areas and 88 policy sectors (CoM Regulation No. 300). Most areas and sectors are covered by policy planning documents; some are instead regulated by legislative acts, either primary laws or subordinate regulations. Legislative power is exercised by the government and the unicameral parliament, the Saeima, both of which can initiate legislative acts. Laws are promulgated by the president, who has personal discretion to propose legislation to the parliament and to veto legislation. In the 27 years since regaining its independence, Latvia has become a European Union (EU) Member State (in 2004), a member of the North
Atlantic Treaty Organisation (in 2004) and a full member of the Eurozone (in 2014), and has joined the Organisation for Economic Co-operation and Development (OECD) (in 2016) and other international organisations (United Nations, World Bank, etc.). The fulfilment of commitments related to entering these organisations, and the later follow-up to their common policies, has largely influenced the policy agenda and is still an important factor in the policy-making culture. Accession to the EU and other international organisations has set requirements for the transfer of good practice and has encouraged the institutionalisation of several evaluation elements in policy planning (evaluation of policy planning documents) and regulatory policy (Regulatory Impact Analysis [RIA] of legislative acts). Political transition processes were supported by external financial assistance from various multilateral and bilateral donors, which came with preconditions for mandatory programme evaluations. The continuous practice of external programme evaluations has played a significant role in the introduction of evaluation in the public sector, the institutionalisation of the evaluation function in certain policy areas and the development of a local evaluation market. EU regulations and guidance on the evaluation of EU funds have served as the main impulse for spreading more sophisticated, rigorous and comprehensive approaches to evaluation practice in various policy sectors. The regular funds provided by these programmes for their evaluation have also contributed to the development of the local evaluation market.

Research on the institutionalisation of evaluation in Latvia is largely affected by the linguistic peculiarities of the term 'evaluation'. In Latvian there is no single concept that could be used to conceptualise the diverse phenomenon of the term 'evaluation' (Lake 2011). The two most often used words in Latvian are izvērtēšana (emphasising the procedural nature) and novērtēšana (highlighting the result of the activity by determining the value of an object). Both words are used interchangeably and often as synonyms. Additional problems are caused by the fact that these words are polysemic and are widely used in legislative texts and policy documents to express such English terms as 'assessment', 'valuation', 'inspection' or 'checking'. In other cases, the English terms 'assessment' and 'analysis' are linguistically brought closer to 'evaluation'; for example, RIA is literally translated as 'ex ante impact evaluation of draft legislative acts', and Environmental Impact Analysis (EIA) as 'environmental impact evaluation'. These conceptual and linguistic problems are also reflected in academic debates in Latvia, where there is no clearly distinctive evaluation-related discourse.

2 Institutional Structure and Processes (Political System)

2.1 Evaluation Regulations

2.1.1 Evaluation Requirements in Laws or Regulations

In the Latvian database of legislative documents,2 the words izvērtēšana, novērtēšana and their derivative forms are present in 224 different legislative acts. In 50 of these acts both words are used to mean the term 'evaluation'. These are legislative acts related to the management of various EU funds and programmes, policy planning, RIA and EIA, representing the areas in which one can speak of an institutionalisation of evaluation in Latvia.

Policy Evaluation

Initial steps in the establishment of a policy impact evaluation system in Latvia were taken in the early 2000s. The Policy Planning Guidelines published in 2001 for the first time defined two policy evaluation types: initial (ex ante) evaluation of new policy initiatives and evaluation of policy impact and implementation (ex post) (Politikas plānošanas pamatnostādnes 2001). In 2004, the State Chancellery, with external support, elaborated a guide on policy impact evaluation, created a public database of policy planning documents and started a regular training programme for civil servants at the School of Public Administration (Lake 2011). The EU-level 'Better Regulation' initiative gave a new impulse to the development of the policy evaluation system and fostered the development of a normative framework institutionalising evaluation in legislative acts. The Law on Development Planning System and the related CoM regulations were issued in 2009, defining the policy planning system and setting responsibilities and requirements for ex ante, ex post and interim-evaluation of planning documents. The CoM Regulation defines evaluation as "a set of methods and procedures that the developer of the policy planning document uses to assess the socio-economic impact of the implemented policies" (CoM Regulation No. 737, Art. 32). According to the CoM Regulation, the main task of the initial impact assessment (ex ante) is to predict the likely impact of a decision or action on various policy impact objects.3

2 For the Latvian database of legislative documents, see www.likumi.lv. It contains all legislative acts of the Republic of Latvia that have entered into force since the 1990s.


The aim of the interim-evaluation is to monitor the implementation progress of the planning document, to evaluate the achievement of the initially set goals and the expected results, to justify the use of the funds and to identify the need to elaborate a policy planning document for the next planning period. The goals of the ex post evaluation are similar to those of the interim-evaluation. The CoM Regulation does not require the mandatory use of all three evaluation types for each planning document. There are no specific policy planning documents or strategies on evaluation or the use of evaluation in the policy planning system. To provide guidance for the development of policy planning documents, the Cross-Sectoral Coordination Centre (PKC) has published a Policy Making Handbook (PKC 2016). The handbook contains a section related to evaluation, in which three evaluation types and the most commonly used assessment methods are briefly explained. The handbook mentions the necessity of adequate resources as well as independence from policy makers. At the administrative level there are no official instructions or guidelines containing detailed information on how to plan, conduct and use evaluation.4 Civil servants can get more information by attending a specific one-day training on policy evaluation offered by the School of Public Administration on average once a year.5

The Law and the CoM Regulation apply to all policy planning documents in all sectors. The legislative acts create a formal normative framework for the policy evaluation system's operation but do not provide any requirements or guidance for ensuring qualitative and meaningful evaluation activities. According to the legislative acts, evaluation is understood in a narrow sense as a set of analytical actions applied before, during or after the implementation of a policy planning document. The CoM Regulation sees evaluation as a regulatory instrument for the assessment of policy results. The CoM Regulation does not encourage applying ex ante evaluation prospectively for the identification of the most effective solutions but rather for the justification of already made political decisions.

3 The Regulation specifies the following policy impact objects: macroeconomic environment, business environment, administrative burden, social impact, environmental impact, impact on territorial development, impact on state and local government budgets, impact on the functions and human resources of the administration, impact on the legal system and Latvia's international obligations, and impact on human rights.

4 In fact, there are guidelines explaining the purpose, process and use of evaluation in the policy planning process, which were drafted under a specific capacity building project (CPM 2011b). But this document has not been officially issued or promoted and is therefore little known.

5 The title of the course, 'Impact evaluation of development planning documents', symbolically demonstrates the focus of policy evaluations.


The legislative norms do not specify the use of evaluation results, and such use is not considered an integral part of the evaluation. The normative framework does not specify anything about the quality of evaluation, the evaluator's independence or the budget that should be available for evaluations. Policy evaluations are not linked with the budget planning process at the normative level.6

Evaluation and Impact Assessment of Legislative Acts

RIA is probably the most recognisable form of evaluation in the public sector, as almost every public servant has been involved in the drafting of new legislation, which also involves the mandatory filling in of an ex ante impact assessment form. The system of annotations of legislative acts was introduced in 1997. RIA is carried out both by the government and by the parliament. A CoM Instruction regulates RIA at the executive level (CoM Instruction No. 19). At the legislative level, RIA is regulated by the Rules of Order of the parliament (Saeima 2014). RIA is mandatory for all draft legislative acts approved by the government. The RIA form includes such information as the purpose of the legislative act, the costs and benefits of achieving the objective and the estimated impact on policy impact objects.7 Recent amendments to the CoM Instruction have widened the scope of issues to be analysed and the range of institutions involved in the quality assessment of particular sections of the RIA. As in the case of policy evaluation, the CoM Instruction mandates the performance of the assessment but does not specify the quality and depth of the assessment and analytical activities. Most RIAs are elaborated for amendments of legislative acts and therefore rarely contain in-depth analysis but rather an aggregation of information from various secondary sources. When new legislative acts (e.g. laws) are elaborated, the RIA may be based on a specific study made for that particular purpose. According to interviewed experts, the political agenda and the knowledge and motivation of the civil servants involved in the process are the main factors influencing the quality of RIAs.

6 In 2016 the government started a closer integration of the policy planning and medium-term budget planning processes, gradually introducing a budget planning system based on the assessment of the performance of goals set in development planning documents. The process is coordinated by the Ministry of Finance in the framework of the annual budget planning procedure and is not linked to policy evaluation processes.

7 See Footnote 3.


As with policy evaluations, RIAs are used for the justification of already made political decisions. RIAs are increasingly used as sources of reference explaining the reasoning and meaning of a legal act. Detailed guidance and examples for filling in the annotation form are provided in the Manual for Elaboration of Normative Acts (Valsts Kanceleja 2016). Civil servants can attend a training course at the School of Public Administration or learn from the materials available on a specific website. A separate procedure for RIAs of draft laws is established at the parliament, which is not synchronised with the government system. RIAs elaborated by the parliament differ substantially from those elaborated by the government, containing less information and a lower level of analysis. RIAs for government-initiated laws are not amended by the parliament and often do not reflect substantial information explaining the reasoning for changes made after discussions in the parliament. A system of ex post evaluation of legislative acts is currently being introduced in Latvia. In 2017 the State Chancellery, in cooperation with the parliament, started a pilot project carrying out two ex post evaluations of a draft law and a CoM Regulation. The results of the pilot will be summarised in a new legislative act and an ex post evaluation methodology to be finished in 2019.

Evaluation of EU Funds and Programmes

The EU funds evaluation system is the one most regulated by legal acts. The requirement for evaluation of all EU-funded programmes stems from the EU regulations, which are transposed into national legislative acts. The CoM Regulation on evaluation and monitoring of the EU Structural funds (CoM Regulation No. 108) contains the highest level of detail. It defines evaluation types, the steps of the evaluation process, mechanisms for evaluation planning, data provisions, general principles of evaluation quality and the use of evaluation results, and designates the establishment of the Evaluation Consultative Working Group (ECWG). Regulations for other EU-funded programmes and funds8 are less detailed, mentioning evaluation as one of the management functions and designating the responsible body.

8 For example, the Regulation on Management of Rural Development and Fisheries Funds.


At the administrative level, the Ministry of Finance (as the Managing Authority of the EU Structural funds and Cohesion Fund) has issued detailed rules of procedure for the ECWG. The ECWG plays a central role in the coordination of EU funds evaluation processes, including the elaboration of terms of reference, commenting on draft reports, disseminating results and monitoring the implementation of evaluation recommendations. The ECWG members represent the Latvian ministries and agencies involved in the implementation of EU funds and the main social partners.

The EU funds evaluation system has essentially influenced the development of evaluation in Latvia. The evaluation requirements of the EU funds have directly and indirectly strengthened the evaluation capacity of civil servants, academics and consultants. Regular EU programme evaluations have encouraged the development of evaluation skills among the actors involved. Evaluation practices are more developed in ministries with a large proportion of EU legislation and available EU co-financing. Ministries dominated by national policy initiatives and legislation have a less developed evaluation practice, and civil servants there demonstrate comparatively lower knowledge and understanding of evaluation (Lake 2011). No specific sectoral laws or regulations about evaluation or the use of evaluation exist in Latvia, except the laws and regulations on spatial development planning. The elaboration of new spatial planning documents at the level of planning regions and municipalities shall be based on evaluations of previous documents. The planning regions are obliged to regularly assess the implementation of regional long-term and medium-term planning documents (Law on Spatial Development Planning; CoM Regulation No. 402). There is no sectoral policy or strategy on evaluation or the use of evaluation.

2.1.2 Evaluation Within the Parliament

Evaluation within the parliament, the Saeima, is clearly less present than at the executive level. Until 2017 the Saeima was one of the few parliaments in Europe without an analytical unit that could provide members of parliament (MPs) with systematic research and evaluation support. This gap was partially filled by outsourced consultants, MP assistants or advisers commissioned by parliamentary commissions or groups. Evaluations (or rather investigations) were performed on an ad hoc basis by specific commissions established to investigate specific cases or accidents.


The Saeima Analytical Service was established in 2017 and currently employs three full-time researchers plus a director. The goal of the Analytical Service is to provide descriptive and analytical research to improve the quality of draft laws, to assess the effectiveness of laws and to provide support in the decision-making process. Research topics may be initiated by the presidium, the Council of Parliamentary Groups, committees or at least two parliamentary groups. An annual research plan is drawn up on the basis of an open competition. The assessment criteria include the availability of in-house expertise in the field; the availability of information, data and technological provisions; the link with topics submitted by other applicants; and the availability of previous studies on the subject. The annual research plan is approved by a joint decision of the Saeima presidium and the Council of Political Groups and published on the internet. Currently the Analytical Service mostly produces synthesis reports and in-depth analyses. According to its internal regulation, the Service shall also deliver ex ante and ex post evaluations and cost-benefit analyses, but this has not yet been the case due to its limited capacity. The Presidium approves each report, and only then is it sent to the primary commissioner of the study and published on the web page. The Analytical Service is currently developing metrics for monitoring the further use of its studies. The work of the Analytical Service has turned out to be of high interest to the media and to various interest groups, who make regular inquiries. The Analytical Service has not been involved in the RIAs of draft laws, which are currently done by the legal office. There is still a long way to go in embedding evaluation in the parliament. The mandate of the Analytical Service is much larger than its current capacity. Increasing demand from MPs and growing interest from various stakeholder groups make political neutrality one of the most important working principles of the Service. Its built-in dependence on the Presidium rather than on the parliament administration is one of the highest risks to its independence.

2.2 Evaluation Practice

2.2.1 General Description of Evaluation Practice

Policy Evaluation

Policy planning documents are subject to ex ante, interim or ex post evaluation (CoM Regulation No. 737). Ex ante evaluation is a mandatory prerequisite for the approval of any new policy planning document. If a policy planning document is extended for the next planning period, the ex ante evaluation may be combined with an interim-evaluation, which is the most often used type of policy evaluation. Ex post evaluations are carried out only in cases when the planning document is not extended for the next planning period (CoM Regulation No. 737). According to the CoM Regulation, all policy planning documents should be evaluated, but it is difficult to ascertain whether this is being done. In most cases, policy evaluations are not presented as evaluation reports but as other documents used by the public administration. The Policy Planning Document Database (POLSIS)10 contains 134 different documents that include elements of policy evaluations. Most of them (65%) are informative reports11 dedicated to a specific issue, programme or legal act. The rest are other types of policy planning documents containing evaluation results either as a separate section or as an annex. The decision to conduct (or skip) evaluations is the full responsibility of the ministries. The elaboration of evaluations and their use is neither controlled nor monitored by any other institution. The quantity and regularity of evaluations depend on the number of sectoral policies in the respective ministry and its internal evaluation culture.

In terms of content, the majority of policy evaluations do not meet the quality requirements of an evaluation. The methodological techniques applied are highly heterogeneous, depending on the evaluation type and policy sector. For various reasons, most evaluations are carried out in a simplified form (a widespread practice is a descriptive analysis combined with a survey of the involved institutions about current problems and completed or outstanding tasks; the policy's impact on target groups and society is usually not evaluated).

10 The Policy Planning Document Database http://polsis.mk.gov.lv contains all policy planning documents adopted since 1991.

11 A specific form of policy document usually containing information or an overview of the progress related to some specific issue, an assessment of the implementation of a planning document or the execution of a legal act. Informative reports may be reviewed by the government or automatically registered in the Database without a review.


outstanding tasks; the policy’s impact on target groups and society is usually not evaluated). A higher quality and systematisation in evaluation techniques and use can be observed in cases where evaluation is related to an EU-funded programme assessment (Lake 2011). Ex ante policy evaluations are of lower quality comparing to other policy evaluation types, due to the weak regulative description and lack of proper methodological guidance. The most significant policy evaluation problems are non-compliance with the regulatory requirements and a lack of methodological guidance. The CoM Regulation defines requirements for all evaluation types, but in practice the fulfilment of these requirements is not being controlled or monitored. The quality of policy evaluations depends on the availability of reliable data, the capacity and competence of the ministry’s staff to do or to commission an evaluation in accordance to the requirements, available time resources, availability of external expertise and the quality of the work of the external evaluators (Baltic Institute of Social Sciences 2013). The lack of reliable data is the most often mentioned problem both by civil servants and evaluators.12 Public institutions often do not have proper monitoring systems. Therefore, a substantial part of any evaluation is related to collection of data. Civil servants responsible for internal policy evaluations often use secondary data from related research studies. Neither in the Research and Publications Database,13 nor on the web sites of the ministries, it is possible to clearly and accurately identify which of the published studies are commissioned for evaluation purposes. The Database contains approximately 400 different studies titled as ‘evaluations’. Less than half (39%) of them can be qualified as evaluations by their object, methods used for assessment and evaluative content. Programme Evaluation The EU Cohesion Policy funds14 are subject to ex ante, ongoing and ex post evaluations (CoM Regulation No. 108). Ex ante evaluations are done

12 According to the expert interviews and the results of the evaluators survey (n = 12).

13 The Research and Publications Database http://petijumi.mk.gov.lv currently contains approximately 3,000 research studies and publications commissioned by public institutions since 1998. The Database does not contain all research studies and publications made by the public sector; therefore, data from it can only be used for illustrative purposes.

14 European Regional Development Fund, European Social Fund and Cohesion Fund.


Ex ante evaluations are done both at the level of the Operational Programme, focusing on the overall programme strategy, indicators and goals, and at the level of each Specific Objective (SO), focusing on the estimated socio-economic impact of the SO and its implementation strategy. Ongoing evaluations are done at the level of each priority. Ex post evaluations are done at the level of a priority or broader, focusing on the efficiency, effectiveness and impact of the funds. Compared to policy evaluation, the CoM Regulation on the EU funds more strictly demands a mandatory ex ante evaluation for each SO and ongoing and ex post evaluations for each priority. The EU funds Managing Authority (MA) monitors the fulfilment of the CoM Regulation by coordinating the planning and use of all EU funds evaluations.

Since the establishment of the EU funds evaluation system began in 2004, in total 48 evaluation studies have been made.15 The highest number (21) was published during the 2004–2006 planning period, when the EU funds management system was decentralised and evaluations were commissioned by different institutions. During the 2007–2013 programming period (from 2008 to 2015), in total 17 different evaluations were carried out. Almost half of them (7) are rather specific studies on issues closely related to the effectiveness of the EU funds management system. In the 2014–2020 planning period (from 2016 till March 2019), 10 evaluation studies have been finished and published. Due to the strict regulations, evaluations of EU funds are done regularly, and it is possible to speak of a certain rhythm of evaluations, albeit not a frequent one. Compared to the other Baltic countries, the total number of evaluations can be considered low. For example, during the 2007–2013 planning period (2008–2015), 60 different evaluation studies were made in Lithuania (Burakiene 2015). Comparing the number of evaluation studies to the population and to the total volume of investments received from EU funds, EU funds evaluations in Latvia in 2008–2015 were done about half as often as in Estonia and about a third as often as in Lithuania (Fig. 1). This difference can be explained by the fact that EU funds evaluation commissioners in Latvia prefer large horizontal studies covering a wide range of topics in one evaluation. That indicates that evaluation is seen as a mandatory task implied by the regulation and is thus done at the minimum required frequency.

15 According to the information provided on the web page of the MA, www.esfondi.lv. This number does not include the ex ante evaluations at the level of each SO carried out in the period 2014–2017.


Lithuania: 21.4 evaluations per million inhabitants; 8.85 evaluations per billion of EU investments
Estonia: 16.8 evaluations per million inhabitants; 6.4 evaluations per billion of EU investments
Latvia: 8.6 evaluations per million inhabitants; 3.8 evaluations per billion of EU investments

Fig. 1 Number of EU funds evaluations done in Baltic countries in 2007–2013 planning period (Visionary Analytics 2017)
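As an illustrative reading of Fig. 1, the ratios for Latvia are consistent with the 17 evaluations carried out in 2008–2015 mentioned above. The following back-of-the-envelope check assumes a population of roughly 1.98 million and EU fund investments of roughly EUR 4.5 billion for Latvia in that period; both figures are assumptions for illustration and are not stated in this chapter:

\[
\frac{17\ \text{evaluations}}{1.98\ \text{million inhabitants}} \approx 8.6\ \text{per million},
\qquad
\frac{17\ \text{evaluations}}{4.5\ \text{billion EUR}} \approx 3.8\ \text{per billion EUR}.
\]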

If evaluation were seen as an instrument for learning, it would be done more often in the form of in-depth studies of specific issues. In the 2014–2020 planning period the EU funds evaluation system is more centralised, and the frequency of evaluations is rising. By 2019, on average three EU funds evaluations are published annually. Equal coverage of the various policy sectors is one of the criteria applied by the MA. For the Rural Development and Fisheries Funds managed by the Ministry of Agriculture, a separate regulation exists, requiring the establishment of an ongoing evaluation system for these funds. During the planning period both funds are evaluated ex ante and ex post, plus two mid-term evaluations per period and two thematic evaluations annually.

Dominating Forms of Evaluation

Accurate data are not available on the number of internally or externally conducted policy evaluations. Experts believe that most policy evaluations are conducted internally. Substantial differences exist across policy areas, depending on the availability of financing for external expertise, institutional capacity, internal culture and demand from social partners. Three approaches may be distinguished: (1) the policy evaluation is done fully internally, using some secondary data; (2) the majority of the evaluation is done internally, but a study on a specific issue is outsourced; and (3) the policy evaluation is fully outsourced. Policy evaluations that are presented in the form of informative reports are mostly done either fully
internally or contain some findings from external studies. Policy evaluations annexed to policy planning documents are mostly done externally. External evaluations are more often commissioned in policy areas with access to EU or other external funds.16 A specific type of external policy evaluation is the large studies performed by international organisations like the World Bank or the OECD. These are most often used in the education, health and public finance sectors. The results of these directly commissioned evaluation studies are usually used for the justification of difficult political decisions. In several sectors (e.g. employment, health), international organisations (OECD, World Health Organisation) regularly initiate external evaluations and comparative studies also involving the responsible ministries. EU funds and other programme evaluations are almost entirely external.17 The main reasons for selecting external evaluators are the independence of expertise and the limited internal capacity of the institutions. Peer reviews are not used as an evaluation type.

2.2.2 Relation Between Process and Impact/Outcome Evaluations

Both the regulations on policy evaluation and those on EU funds evaluation distinguish evaluations only by timing (ex ante, interim/ongoing and ex post). The CoM Regulation on policy planning denominates all three types as policy impact evaluations (CoM Regulation No. 737). In practice, most policy evaluations focus on outputs and outcomes, concentrating on a descriptive assessment of the achievement of policy results and result indicators. The effectiveness of policy actions is less analysed. In policy evaluations, impact is seen from a narrow perspective as impact on the policy objectives (i.e. outcomes) or as impact on the spending of funds (efficiency). Rigorous policy impact evaluations are exceptional and are, as a rule, done externally. The EU funds evaluations focus on effectiveness, efficiency and impact (CoM Regulation No. 108). In most cases the EU funds evaluations contain all three of these elements, with impact as the dominating focus.

16 For example, budget and financial policy, employment, education, public administration and regional development.

17 The EU funds Managing Authority recently started to perform internal evaluations involving other institutions or specific EC services. The ratio between external and internal evaluations is 10:1.


Process evaluation is the least used form, both for policy evaluations and for EU funds evaluations. Process evaluations are substituted by internal monitoring of outputs. Formative evaluation actions are encouraged neither by regulatory prescriptions nor by institutional culture.

Independent Evaluation Institutes

No specific evaluation institutes exist in Latvia. At the centre of government, a few institutions play some role related to the institutionalisation of evaluation. The State Chancellery ensures and controls the compliance of policy documents and draft legal acts, including coordination and quality control of the RIA and the ex post evaluation of legal acts. The PKC, situated within the office of the Prime Minister, is responsible for the evaluation and monitoring of medium- and long-term national planning documents and for monitoring the Sustainable Development Goals. The PKC also runs the Research and Publications Database containing studies and evaluations made or procured by public sector institutions.

In some policy domains, specific units or entities have been established that fulfil evaluation-related functions and can be classified as independent evaluation institutes. In the agricultural domain, the Institute of Agricultural Resources and Economics (AREI) is entrusted by the Ministry of Agriculture with the ongoing evaluation function for the Rural Development and Fisheries Funds. AREI has provided regular evaluations since 2008. AREI carries out evaluations under a five-year contract with the ministry, financed by EU technical assistance. The ongoing evaluation system of the Rural Development and Fisheries Funds can be considered the best-practice example of evaluation institutionalisation in Latvia. In the education domain, since 1998 the Institute for Educational Research of the University of Latvia has been entrusted to participate in international comparative education studies (PISA, TALIS, etc.) organised by the International Association for the Evaluation of Educational Achievement and the OECD. The costs of regular participation in these studies have been covered by EU funds for several consecutive planning periods. The results are available both to policy makers and to a wider public and are used for the development and management of the educational system in Latvia. From an evaluation-institutionalisation perspective, these studies do not have a notable influence on the establishment of a regular evaluation practice in general education, as there are no continuing national studies.


From a terminology perspective these studies are always referenced as education research and are not associated with evaluation by either researchers or policy makers.

At the national level, the State Audit Office (SAO) also undertakes independent evaluative activities. In addition to traditional financial audits and legality and regularity audits, the SAO has also conducted performance audits since 1999.18 Performance audits make up approximately 9% of all audits performed by the SAO, but the number of performance audits per year differs significantly.19 Performance audits focus on the economy, efficiency and effectiveness of public policy. An important part of performance audits is the assessment of the achievement of policy results and goals. Policy impact has recently been included in the scope of analysis. The SAO selects the policy areas to be audited on the basis of its risk analysis and is increasingly addressing themes that are topical for society. The SAO regularly presents its work to the parliament. The performance audits conducted gain considerable attention from policy makers and the general public and are one of the most often referenced forms of evaluative activity among the general public. The Saeima Analytical Service, which according to its mandate is responsible for in-depth research and evaluation of selected legislative initiatives in the parliament, may also be classified as an independent evaluation institute.

Evaluation Units

In the public sector there are no specific policy evaluation units. The policy evaluation function is usually located in the policy planning units of the sectoral ministries. The evaluation of policy planning documents is part of the policy planning experts' responsibilities (CoM Regulation No. 1075). In practice, the evaluation function in sectoral ministries is often fulfilled by one civil servant whose responsibilities also include various other tasks not related to evaluation.

18 According to the SAO audit register available on the web page of the State Audit Office, see http://www.lrvk.gov.lv/revizijas/.

19 In 1999–2005 the SAO performed on average seven performance audits per year. During the years of the financial crisis and afterwards (2009–2014), the number of performance audits dropped to two audits per year on average. Since 2015 the number of performance audits has been increasing, reaching on average 19 audits per year.


In several policy areas, analytical or monitoring units are responsible for data collection, analysis and reporting at the level of the ministry or of a subordinated public agency. Such units are characteristic of public health, the economy, employment, education, regional development, the environment and other policy domains. Under the EU funds Managing Authority, an Evaluation Unit (five persons) responsible for the management and coordination of all EU funds evaluations has been established. The unit is primarily involved in coordinating the commissioning of all EU funds evaluations. It also conducts some internal evaluations and supports the work of the ECWG.

2.3 Use of Evaluation

2.3.1 Best Performing Sectors

The most regular and well-institutionalised evaluation activities are related to the EU funds interventions under the European Structural and Investment (ESI) Funds. The well-regulated and well-organised EU funds evaluation system exists as a separate enclave in the public administration, with clearly defined procedures for the initiation, implementation and use of evaluation results. It has designated personnel at the sectoral ministries and a specific body (the ECWG) for following up the results of all evaluations.

Good examples of evaluation use can be identified in policy areas where specific favourable conditions have contributed to the development of a regular evaluation practice. A good example is the culture domain, where the Ministry of Culture has established close cooperation with the Scientific Centre of the Academy of Culture, which is regularly involved in evaluations both of policy planning documents and of various programmes or cultural events of significant scale. The factors contributing to the regularity of evaluations and the use of their findings can be attributed to the high-quality supply of expertise both in evaluation and in the specific field.

Internationally, policy domains such as education, employment, health and development cooperation traditionally have a strong evaluation culture. In Latvia, evaluation practice in these policy areas differs in terms of the use of evaluation and its findings. Of the above-mentioned policy areas, only employment can be considered one with a frequent and multiform evaluation practice, where regular evaluations are done both for policy planning documents and for EU-funded programmes. Employment policy is also regularly evaluated by various international organisations (e.g. the OECD) in their horizontal studies. The results of these evaluations are used for policy making and for the improvement of employment-related measures.


Civil servants responsible for employment policy planning and implementation demonstrate good knowledge and understanding of the various types of evaluation and their use. The main contributing factors are a strong demand from external partners, a good supply of expertise (mostly international) and the availability of funds for evaluations from various sources.

2.3.2 Poor Performing Sectors

In comparison to employment, evaluation in the education field is much more fragmented and less used. In the general education area, evaluations are done through the international comparative studies (PISA, TALIS, etc.) provided by the Institute for Educational Research. The results of these studies are mostly used as a source of reference by education professionals and decision makers. At all education levels, self-assessments of teachers and institutions are required as part of the regular accreditation processes. Existing practice demonstrates that these self-assessments are rarely used, and all involved sides treat them as bureaucratic requirements. Evaluations performed in the higher education area are mostly related to quality assurance, efficiency or impact (for measures financed by the EU funds). Scientific institutions are evaluated once every six years for regulatory purposes, and the results of these evaluations are a condition for the allocation of public funding. The education domain is notable for commissioning large external evaluations from international institutions (World Bank, OECD). The results of such evaluations are typically used for the justification of complicated policy reforms.

A similar practice also exists in the health domain, where large external evaluations are commissioned from the World Bank and used mainly for the legitimation of complicated policy reforms. Despite the developed monitoring system and a strong demand from both external and internal partners, the health domain demonstrates low use of evaluation. Evaluations in the various health policies (Public Health, Health Care, Pharmacy) are done non-systematically, and evaluation data are used selectively according to political interests (Laķe 2010). In health evaluations, descriptive analysis of data dominates, with little attention to effectiveness and impact. Unique compared to other policy areas is the active and strong demand for evaluations by a non-governmental organisation (NGO) specialising in health economy issues.20


For a long period of time, development cooperation programmes were not evaluated at all due to the merely symbolic size of Latvian development assistance. Recently the situation has improved, and in addition to the annual monitoring report to the government, a Delphi discussion with experts involved in the implementation of projects is organised. No external efficiency or impact evaluations have been done, despite regular demands from the NGOs involved in development assistance. Currently there is no evidence that this situation will change towards more systematic and rigorous evaluations, due to the lack of interest and capacity of the responsible ministry.

The main factors influencing the potential of evaluation use can in most cases be attributed to the capacity and understanding of the individuals responsible for the policy evaluations, the (lack of) demand from top management and a political agenda that is driven by interests other than evidence.

Evaluation Focus

Evaluation use is largely affected by the evaluation purpose. Policy planners often see the evaluation of policy planning documents as an administrative requirement and part of the policy planning process routine. The regulatory nature dominates especially for ex ante evaluations, which are a precondition for the approval of new planning documents. Policy evaluations are not used for accountability purposes, as there are other more widespread and accessible instruments (audits and monitoring). The EU funds evaluations primarily fulfil the planning and steering function and partly also an accountability function, as all evaluation reports are publicly available. The legitimacy purpose dominates for the large policy evaluations outsourced to the World Bank or the OECD for evaluating politically sensitive issues. The results of these evaluations are later used for the justification of reforms in the field (recent examples are evaluations in the higher education, health and tax policy areas).

20 The Latvian Association of Health Economists unites representatives from academia and practitioners. The association regularly stresses the need for evaluations and also performs its own evaluations.


Evaluation Users

The main users of policy evaluations are the policy planners (civil servants) themselves. Experts from different policy domains are very critical of the use of evaluation results by decision makers. This is related to the high political influence on decision making, where political interests prevail even over the priorities agreed in the policy planning documents. Evaluations are often used as secondary sources for other evaluations or reports prepared by the public administration. In the EU funds evaluation system there is a well-regulated procedure for the review of and follow-up to evaluation recommendations. According to the EU funds Managing Authority, the drafting of the planning documents for the next planning period will be strongly based on the findings from evaluations.

Quality Assurance of Evaluation

The regulative framework does not set any quality requirements for evaluations. The main instruments for the quality assurance of external evaluations are the technical specification and the provisions of the contract, which are used for quality control of both the evaluation report and the evaluation process. A mandatory practice for EU funds evaluations is to establish a steering committee of stakeholders involved from the drafting of the technical specification until the approval of the final report. A recent practice is also mandatory monthly meetings with the project leader and the lead evaluator during the evaluation process. In the case of external policy evaluations, the steering arrangements differ from ministry to ministry. For internal evaluations, no quality assurance measures have been noted. Evaluator competence requirements are specified in the technical specifications, demanding education related to a particular field, experience from a number of previous evaluations on similar topics or experience with particular methods (e.g. counterfactual impact evaluation designs). Evaluation-specific education is not required as a quality criterion. Evaluation manuals with varying levels of detail are available for the EU funds and for the evaluation of Rural Development and Fisheries Funds measures. The manual for policy evaluations is of a general nature. No meta-evaluations are performed, either by evaluation commissioners or by evaluators. Evaluation-related training is limited even in the EU funds domain.


3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluation in Civil Society

NGOs are the most active part of society in using the results of different evaluations and studies. The use of evaluations differs depending on the NGO's function. NGOs acting as interest representatives mainly use evaluations as sources of information or evidence in discussions and negotiations with governmental or municipal bodies. They see evaluation as a tool for the enforcement of their interests. NGOs acting as service providers see evaluation as a tool for assessing the quality and efficiency of their services. It is not common practice for NGOs to commission evaluations. Evaluations are done mostly for projects financed from external sources (mostly various EU programmes), where evaluation is a requirement and the project budget allows evaluation-related costs to be included. Under the national NGO grant schemes, evaluations are rare due to the very small project budgets and the lack of capacity. NGOs are often involved as respondents to surveys and participants in focus group discussions in various evaluations commissioned by the public sector. On rare occasions NGOs also take part in designing evaluations initiated by the public sector and in analysing their results.

At the communal level, evaluations are initiated by municipalities or by entities financed by EU programmes.21 These evaluations are done to fulfil regulatory requirements and are not initiated by society. Consultations with society during the elaboration of municipal planning documents are often formal, but there is an increasing number of positive examples where various society groups are involved at an early stage of planning and the consultations involve elements of evaluation.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Evaluation as an instrument is not widely known in society. In everyday conversation, it is most often associated with an audit. Even among public administration there is a limited understanding of evaluation, its purpose and application. A deeper understanding is found among those civil servants who have direct experience with EU funds evaluation (Corporate and Public Management Consulting Group [CPM] 2011a). All EU-funded programme evaluation reports are made public on specific government websites following the publicity provisions of EU support. Policy evaluations are published in the centralised Research and Publications Database. In exceptional cases evaluation results are presented at conferences or special events. There is no specific discourse in the mass media related to evaluation, its use, quality or professionalisation. In most cases when evaluation findings are discussed in the media, they relate to critical political issues, poor performance or misuse of funds.

21 Local action groups financed from the EU LEADER+ programme shall ensure regular evaluations of their strategies.

3.3 Civil Society's Demand for Evaluations

In most cases evaluations are initiated by the public sector according to regulatory requirements in legislation. Evaluations may also be initiated following requests from stakeholders (e.g. social partners). During the economic crisis of 2008–2012, social partners particularly emphasised the need for evaluation and analysis of rational policies, which resulted in several initiatives related to functional audits and the building of analytical capacity at the centre of government. This internal demand lasted just a few years and lost its influence with the return of economic growth. In most cases the demand for evaluation comes from business organisations or interest-representing NGOs. No information was found on the number of evaluations actually conducted in response to civil society demand.

4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses and Further Training

A system for the evaluation of policies, programmes and legislative acts has developed in the Latvian public sector, but there is no corresponding offer in higher education that would allow acquiring the competencies of an evaluator. None of the Latvian universities offers specific training courses on evaluation. Some elements of evaluation are integrated into policy analysis courses taught under sociology or political science programmes, in courses on assessment and quality of education under pedagogy programmes, in a health economics course, in a course on evidence-based social work and in a course on assessment in art therapy. None of these courses covers evaluation theory, evaluation models or approaches. In the policy analysis courses, evaluation is reviewed as one of the steps in the policy planning cycle, with a short introduction to evaluation types (ex ante, ex post) and the most frequently applied data collection methods. The pedagogy and social work programmes review general evaluation principles related to quality assessment or the assessment of client conditions. The lack of higher education programmes is partly due to the small evaluation market, which does not generate sufficient demand for qualified, academically educated experts. There is no specific professorship for evaluation. The other reason is that the Latvian academic environment has not developed a stable, generally accepted and persistent conceptual tradition regarding social research typology, including the classification of applied research types (Lake 2011). Evaluation is hardly discussed in academic debates in Latvia, which can partly be explained by the lack of academic interest in a field with no clear identity. As a result, almost no evaluators in Latvia have an evaluation-related academic education. Professionals conducting evaluations are autodidacts who have acquired specific evaluation-related knowledge by participating in professional seminars or trainings in Latvia or abroad, during their studies (mostly abroad), from practical experience (learning by doing), from their more experienced international colleagues, or by reading evaluation guidelines.22 The situation is better with regard to continuing education offers. A training course on policy evaluation is provided for public servants by the School of Public Administration. This one-day training course is offered once a year and is also accessible to a wider public, for example NGOs or private experts. The course offers an overview of evaluation types, principles and methodological approaches and their practical application. Public servants involved in EU funds also have opportunities to participate in external trainings offered by the European Commission (summer schools, seminars) or other training providers. Participation in such trainings is not frequent, as it requires resources that only a few ministries are ready to provide. In order to reach larger groups, the EU funds Managing Authority organises capacity building seminars and trainings in Latvia once a year. Evaluation experts currently active on the market have also benefitted from these training courses, as some of them have previous working experience in public administration.

22 Based on a survey of experts (n = 12) and interviews (n = 8).

4.2 Profession/Discipline

No journals or other academic publication media dedicated to evaluation exist in Latvia. Some evaluation-related publications can be found in scientific papers published by various Latvian universities23 and in the proceedings of scientific conferences organised in Latvia. In most cases these publications reflect the results of specific studies conducted for acquiring an academic degree. Only one author has academic publications related to evaluation research and the use of evaluation in policy making. Individual evaluation-related publications can also be found in non-academic professional journals; for example, a periodical for law professionals has published an article on the evaluation of legislative acts. Despite the well-institutionalised EU funds evaluation system, no academic publications reflecting the results of EU funds evaluations have been identified. Such practice is neither required nor stimulated by the EU funds authorities. The Latvian Evaluation Society (LATES) was established in 2012 with the primary objective of developing the evaluation culture in Latvia. The initiative for establishing LATES came from evaluation experts mostly involved in the evaluation of EU funds and programmes. LATES currently unites 13 members who predominantly work in the private sector. Some LATES members have previously worked in government institutions responsible for the management (including evaluation) of EU funds. LATES is open to all kinds of evaluation-related stakeholders, but so far it has not succeeded in attracting civil servants and representatives from academia. LATES activities include regular networking among evaluators, exchange of information via e-mail and website, public discussions involving evaluation commissioners, providing comments on legislative acts related to evaluation, participation as observers in ECWG meetings and cooperation with Voluntary Organisations for Professional Evaluation from neighbouring countries (Estonia, Finland, Poland). The evaluation market in Latvia is mostly dominated by consultancy firms and freelancers working with those firms or sometimes individually. In the last 15 years approximately 30 different Latvia-based consultancy firms and organisations have conducted at least one evaluation study. Half of them have participated in evaluations more than once, but only five to ten companies participate in evaluations regularly. In some sectors (e.g. rural development, culture) evaluations are done almost entirely by scientific research institutes. The evaluation market in Latvia is small, with only a few external evaluations commissioned per year. Therefore, companies or individual experts cannot specialise, and evaluation is just one among many of their fields of activity. In Latvia there is no certification system for evaluators.

23 For example, the Latvian University, the Latvian Academy of Culture, the Riga Stradins University, the Latvian University of Agriculture, or the Stockholm School of Economics in Riga.

4.3 Compliance to Standards and Quality Obligations

No standards or guiding principles for evaluators exist in Latvia. In 2013 LATES initiated discussions on the necessity of developing evaluation standards. As a starting point, several international evaluation standards were translated into Latvian, and their relevance and applicability in the local situation were discussed among evaluators and evaluation commissioners. Due to very low interest from both evaluators and commissioners, the process is currently on hold. The otherwise well-regulated EU funds evaluation system has also not defined any quality standards for evaluation. The EU funds MA has issued Guidelines for EU funds evaluation that are targeted at evaluation commissioners and concentrate on the process of selecting external evaluators and appropriate impact evaluation methods. These guidelines also contain a checklist for assessing the quality of evaluation reports. In relation to policy analysis, the State Chancellery has implemented a few capacity building projects for policy impact assessment. This included the development of the Manual on Policy Impact Assessment (2005) and the Manual on Policy Evaluation (2012). The PKC has developed guidelines for policy making, including a short chapter on evaluation (PKC 2016). Information on evaluation use is not included in these manuals.


In the absence of any standards, it is up to evaluators to decide on the principles of their ethical conduct. In most cases the main instrument of quality assurance from the commissioner's side is the technical specification and the provisions of the evaluation contract. In very few cases evaluation commissioners have requested compliance with the code of ethics approved by the Latvian Sociological Association.

5 Conclusion

The existing level of institutionalisation of evaluation in Latvia can be considered low despite the clearly observable progress over the last 20 years. Evaluation is present in both the executive and legislative branches and it is institutionalised at the level of regulations. In practice its use is very fragmented and is not monitored, except within the EU funds system. It is difficult to find policy areas with a developed evaluation culture, and just a few isolated enclaves can be identified. The main driving force in all identified cases was key persons who understood the importance and potential of evaluation. The evaluation culture has grown in general, especially after accession to the EU. Evaluation has been introduced entirely by external sources, and external pressure (regulations, agendas or funds) still remains the main impulse. Commitments made when entering international organisations, as well as regulations related to external financing, have been the dominant sources of continuous external demand for evaluation. External pressure, on the other hand, has fostered an institutional culture within the public administration that considers evaluation a formal regulatory requirement rather than a tool for policy learning. Internal demand for evaluations is very low; it is episodic and induced mostly by economic developments. The attitude of policy makers towards evaluation could even be described as resistance, which is transmitted to the top management of the ministries in the form of inertia or lack of demand. The strongest internal demand comes from organised groups of society, which are themselves limited in their use of evaluation due to a lack of proper financing. Evaluation capacity is a problem on both the supply and the demand side. Existing measures for the demand side (training, guidelines) are not carried out systematically, even in the case of the EU funds where specific resources are allocated for this purpose. The quality of evaluation supply is limited by the small market, which prevents specialisation. The lack of a generally accepted and persistent academic tradition has contributed to the lack of identity of the evaluation field in Latvia.

References

Baltic Institute of Social Sciences (BISS). (2013). Politikas plānošanas sistēmas attīstības pamatnostādņu un attīstības plānošanas sistēmas ietekmes novērtējums.
Burakiene, D. (2015). Use of evaluation as a result-based management tool: Lithuanian case. 2nd joint EU Cohesion Policy Conference, Riga, Latvia.
Cabinet of Ministers (CoM). (2009a). Procedure for assessing the initial impact of a draft law (Instruction No. 19). https://likumi.lv/doc.php?id=203061. Accessed on 21 April 2018.
Cabinet of Ministers (CoM). (2009b). Cabinet of Ministers rules of order (Regulation No. 300). https://likumi.lv/doc.php?id=190612. Accessed on 21 April 2018.
Cabinet of Ministers (CoM). (2010). State and municipal institution's job catalogue (Regulation No. 1075). https://likumi.lv/doc.php?id=222271. Accessed on 21 April 2018.
Cabinet of Ministers (CoM). (2013). Regulation on spatial development planning documents of the planning regions (Regulation No. 402). https://likumi.lv/doc.php?id=258626. Accessed on 21 April 2018.
Cabinet of Ministers (CoM). (2014). Regulation on development planning and impact assessment of development planning documents (Regulation No. 737). https://likumi.lv/doc.php?id=270934. Accessed on 21 April 2018.
Cabinet of Ministers (CoM). (2015). Procedure for monitoring and evaluating the implementation of the European Union Structural Funds and the Cohesion Fund and establishing and using the Cohesion Policy Funds management information system for 2014–2020 (Regulation No. 108). https://likumi.lv/doc.php?id=272807. Accessed on 21 April 2018.
Corporate and Public Management Consulting Group (CPM). (2011a). Esošās situācijas analīze ietekmes izvērtēšanas jomā. 1. nodevuma ziņojums.
Corporate and Public Management Consulting Group (CPM). (2011b). Metodiskais materiāls (rokasgrāmata). 3. nodevuma ziņojums.
Laķe, A. (2010). Novērtējuma pētījuma izmantošanas prakse veselības aprūpes politikas veidošanā Latvijā: problēmas un risinājumi. Rīgas Stradiņa universitātes 2010. gada zinātniskā konference. Tēzes: Rīga.
Lake, A. (2011). Rationality and utility of evaluation research in policy making. Studies of Transition States and Societies, 3(3), 64–82.
Law on Development Planning System. (2009). https://likumi.lv/doc.php?id=175748. Accessed on 21 April 2018.
Law on Spatial Development Planning. (2011). https://likumi.lv/doc.php?id=238807. Accessed on 21 April 2018.
Pārresoru koordinācijas centrs (PKC). (2016). Politikas veidošanas rokasgrāmata.
Policy Planning Document Database. http://polsis.mk.gov.lv. Accessed on 21 April 2018.
Politikas plānošanas pamatnostādnes. (2001). http://polsis.mk.gov.lv/documents/391. Accessed on 21 April 2018.
Research and Publications Database. http://petijumi.mk.gov.lv. Accessed on 21 April 2018.
Saeima. (2014). Saeimas kārtības rullis. https://likumi.lv/ta/id/57517-saeimas-kartibas-rullis. Accessed on 21 April 2018.
Valsts Kanceleja. (2016). Normatīvo aktu projektu izstrādes rokasgrāmata.
Visionary Analytics. (2017). Vertinimo gebėjimų stiprinimo studija ir priemonių planas.

CHAPTER 16

Poland

Monika Bartosiewicz-Niziołek, Beata Ciężka, Weronika Felcis, Katarzyna Hermann-Pawłowska, Seweryn Krupnik, Piotr Strzęboszewski and Agnieszka Szczurek

1 General Country Overview

To understand the development of evaluation and its institutionalisation in Poland, one must consider two crucial events for its current political system.

The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of Polish Ministry of Investment and Economic Development or any agency of the Polish government.


The first is the collapse of communism and the re-establishment of a democratic republic in 1989. Although the current political system has lasted for almost thirty years, its short history has exhibited a relatively low level of stability. In some public policy areas (for example education), reforms are implemented very often. While such dynamics should increase demand for the evaluation of national policies, it rarely does. The lack of stability of the political system, interacting with the factors described below, hinders the institutional learning of public administration (Ferry and Olejniczak 2008). The second circumstance is Poland's accession to the European Union (EU) on 1 May 2004. The comparatively high popularity of evaluation of EU initiatives in Poland stems mostly from the external requirements imposed by the European Commission (Bienias et al. 2012; Januszkiewicz 2016). Although the widespread use of evaluation of EU-funded programmes seems to be a positive effect, these practices are quite often perceived by some stakeholders as a form of control rather than an opportunity to learn more about institutional actions and a possibility to improve them (Ferry and Olejniczak 2008; Bartosiewicz-Niziołek et al. 2018). Thus, evaluation is sometimes treated as an artificial cost of co-financing rather than an intrinsic part of a learning process. Moreover, even if evaluation is institutionalised, it is not integrated enough with the planning and implementation of public policies (Kot 2014).


Another important factor influencing the institutionalisation of evaluation in Poland is the moderate level of development of civil society (EU-Russia Civil Society Forum 2017; USAID 2017; Adamiak et al. 2016). This is especially true in relation to the formal dimensions of social activity, such as membership in non-governmental organisations (NGOs) and contribution to community activities. Poles prefer more individualised and informal forms of social activity (Wenzel and Kubik 2009). As a result, the preference for institutionalised measurement or research, along with reflection on the effects of implemented initiatives, is relatively weak. Moreover, Polish NGOs are highly dependent on public grants, that is, national, regional and local funds (Przewłocka et al. 2013). That is why decisions about the conduct and scope of evaluation depend mostly on funders' preferences. It is worth underlining that the state of civil society in Poland is related to the comparatively low level of social trust (ESS 2016, 2019). This is also true of trust in public institutions and the state in general (Marková 2004). Only 23% of Poles report having confidence in the national government, which is one of the lowest results among Organisation for Economic Co-operation and Development (OECD) countries, where the average confidence level is 42% (OECD 2016). The low level of social trust hampers the perception of evaluation as a learning and improvement tool in which all stakeholders openly articulate their opinions. Some entities that commission evaluation research (in both the public and NGO sectors) identify it with monitoring, audit, assessment or accreditation (Korporowicz 2013) and thus tend to expect mainly positive results, preferring not to mention the weaker sides of the activities conducted. Such an approach may reduce evaluation practices to formal and bureaucratic aspects, a kind of ritual, ostensible activity (Podkońska 2016; Jaskuła 2018), as well as result in a negative attitude towards evaluation (Bartosiewicz-Niziołek 2019). Another set of factors influencing the process of institutionalisation relates to the Polish academic system. As far as universities are concerned, it is a common view that they have three missions: teaching, research, and supporting regional development by working together with public, private and non-governmental organisations (Brundenius and Göransson 2011). To understand the situation in Poland, one should keep in mind that while teaching and research are acknowledged and institutionally rewarded, this is not necessarily the case with the third mission. Thus, universities generally cooperate with other organisations with relatively limited scope and intensity (Górniak 2015a).


Moreover, research units exist that focus on developing and transferring knowledge about evaluation methodology and practice. However, their focus stems from the strategic decisions of their leaders rather than from national or organisational policies. As a result, the number of evaluations conducted by university research institutes is quite limited (Kot 2014). Beyond EU funds, the school education system is an important area where evaluation is present (Mizerek 2017; Jaskuła 2018). The Minister of National Education's directive of 7 October 2009 legally sanctioned school system evaluation. Over the past two decades, evaluation tools and methods have been included in a number of activities aiming to improve the quality of the school education system in Poland (Mazurkiewicz 2012). Various stakeholders contributed to this process, such as governmental bodies, school supervising institutions and education departments at the central and municipal levels. It is also worth emphasising that a key role in ensuring the quality of Poland's education system has been played by academic centres and non-governmental institutions, where the first evaluations were conducted in the mid-1990s (Mizerek 2006). The latter often provided financial support under various grant programmes for schools and other education institutions. In conclusion, while evaluation is strongly present in Poland, its institutionalisation faces challenges that are typical for the beginning of this process, especially in a country with a young democracy. The most important supporting factors involve external pressure and the growing awareness of its utility among national stakeholders. Above all, to fully develop national evaluation capabilities in Poland, the political, social and academic dimensions of the system must be sufficiently mature. Among these factors, the most essential include: the stability of public policy strategic planning at the country and regional levels; the stronger integration of evaluation with other elements of public policy cycle management, leading to a higher awareness of its benefits among all stakeholders; further development of organised forms of civil society and an increased level of social trust; policy support for the fulfilment of universities' third mission; and continued development of evaluation competencies along with positive attitudes towards evaluation.

2 Institutional Structure and Processes (Political System)

2.1 Evaluation Regulations in Different Fields

The issue of evaluation, in the context of mandatory or desirable research or other related activity, is currently raised in over 1000 legal acts of various rank in Poland.1 Unfortunately, there is very little information about the use of evaluation in the work of the parliament. This does not mean, of course, that parliamentarians do not use research and analysis. In the structures of the Polish parliament (both in the Sejm and the Senate), there are research and analytical offices. Their tasks include, among others, supporting the legislative process with scientific data and advice; providing expert assistance in the performance of the parliamentary mandate; and conducting research (in the areas of law, economy and society) relating to the legislative process. In their activities, however, they do not refer directly and intentionally to the theory and practice of evaluation. Therefore, in the remainder of this chapter, we concentrate on horizontal solutions and selected areas with the relatively strongest legal bases for the evaluation process. The practice of policy evaluation is anchored in a national law regulating the modes of conducting policies in Poland. The law from 20062 sets specific requirements concerning the process of evaluating national and regional strategies and programmes. According to this act, each strategy (including the country's development strategy and the main strategies determining the regulations for activities in particular public sectors) is subject to a mandatory ex ante evaluation. In addition, any public programme of over 300 million PLN (approx. 70 million EUR) must also be evaluated.

1 I.e. 1098 legal acts as of 25 January 2018, according to a keyword query of the Legalis database, https://legalis.pl/produkt/baza/.

2 Act of 6 December 2006 on the principles of conducting the development policy (No. 227). The planned amendments would limit the obligation to conduct an ex ante evaluation to strategic documents, whereas programmes would be required to undergo ex post evaluations. The change is intended as a response to the lack of financial resources required to perform ex ante evaluations. In addition, there are plans to introduce the mandatory provision of adequate funds for the evaluation process within a given programme.


The next horizontal document creating the legal framework for developing an evaluation culture in Poland is the 'Strategy for Responsible Development' from 2017. It is the government's main strategic document, with a perspective up to 2030. The Strategy assumes, among other things, the strengthening of the analytical capacities of the state, including the development of evaluation. One of the tasks provided for in the strategy is the creation of a system to evaluate public policies, based on experiences stemming from the Cohesion Policy evaluation system that has been developed in Poland since 2004. For this purpose, an analytical-prognostic network is to be set up within the government's administrative structures to feed policy-making processes with evidence-based information (SRD 2017). The Cohesion Policy is one of the areas with the strongest and most detailed legal framework for evaluation in Poland. In this field, irrespective of directly binding EU regulations, national law and guidelines on evaluation were introduced for the 2014–2020 period. Those documents, together with additional instructive documents prepared by the National Evaluation Unit,3 regulate, among other things, the thematic scope, timing, organisation and rules for the evaluation of Cohesion Policy programmes financed by EU funds.4 Over the past two decades, evaluation has also been strongly institutionalised in Polish education policy due to the introduction of legal changes in the way pedagogical supervision is conducted. The 2009 directive issued by the Minister of National Education introduced a new strategy of pedagogical supervision, which includes evaluation of the educational activities of schools and educational institutions (Sobieraj 2014). Another impetus promoting an evaluation culture in the education sector is the requirement to evaluate projects (implemented by schools) that are financed from various international and national grant programmes. The next example of a policy field where evaluation has been anchored in a sector-specific law concerns development assistance, which is provided through the Ministry of Foreign Affairs.

3 The coordination unit of the evaluation process in Poland, established pursuant to the Act of 6 December 2006 on the principles of conducting the development policy. Currently under the Ministry of Investment and Economic Development.

4 One of the most significant changes compared to the 2007–2013 programming period was a very significant limitation of the scale of evaluation conducted at the project level (details are described in footnote 7), as well as the introduction of the counterfactual approach along with theory-based evaluation (TBE).


The Development Cooperation Act of 2011 requires the evaluation of the multiannual development cooperation programme (known as Polish Aid). While the effort made by Poland to introduce evaluation in this field is recognised by the OECD, further efforts to guarantee the overall independence of the evaluation function and to widen its scope to all aid channels are recommended and supported by the OECD Development Assistance Committee (OECD 2016). Following developments in the EU and the OECD on better regulation, government rules on the implementation of regulatory impact analysis (RIA) have been gradually introduced in Poland since 2001. The ministerial guidelines on RIA from 2015 encourage drawing on the experiences of the evaluation of Cohesion Policy programmes in Poland, particularly relating to methods for the ex post impact assessment of regulations (Górniak 2015b). Legal requirements also apply to some programmes that are financed by national funds and implemented by NGOs, not only the largest one, the Civil Initiatives Fund, but also minor funds administered by municipal authorities.

2.2 Evaluation Practice

The first evaluation practices were introduced in Poland in the mid-1990s, as a result of foreign funds (such as PHARE and USAID) aimed at the emerging NGO sector as well as at the reform of the education system. The latter was supported by two programmes: TERM (Training for Educational Reform Management, implemented in 1994–1997) and its continuation, SMART (Strategic Measures for Achieving Reform Target, carried out in 1997–2000). The initial studies were conducted by an international team managed by Professor Helen Simons (former President of the United Kingdom Evaluation Society), who was a promoter of fourth-generation evaluation (Mizerek 2006, 2017). This approach was reflected in the mottos that accompanied the birth of evaluation practices in Poland: "There is no evaluation without democracy" and "There is no democracy without evaluation" (Jaskuła 2018, p. 63). The process of development of evaluation research then expanded thanks to EU pre-accession funding (2004–2006). Between 2002 and 2017, 1279 evaluations were carried out in Poland within the Cohesion Policy (CP). In development cooperation policy (a much less experienced but growing evaluation field), 11 thematic evaluations have been conducted since 2012 (OECD 2017). In the education sector, the number of completed external evaluations is much higher (31.010),5 but their scope is limited to assessing the activities of schools.


As regards the NGO community, no data are available so far that would enable an assessment of the number of evaluation studies conducted. Cohesion Policy in Poland has been implemented in the following, so-called financial perspectives: 2004–2006, 2007–2013 and 2014–2020. Thus, three stages of the development of the CP evaluation system in Poland can be distinguished.6 The first one, the 'beginnings', covers the pre-accession period and the first, short financial perspective up to 2007. During this time, a framework for the evaluation system and its capacity was established. The second developmental stage started in 2007. Regional evaluation units were established and a dynamic increase in the number of implemented evaluations began (a 125% increase in the number of completed studies between 2007 and 2008). The last change in the trend in the number of evaluations relates to modifications in the dominant research approaches. At the end of the 2007–2013 financial perspective, the use of counterfactual methods was introduced and promoted across the EU Cohesion Policy evaluation system. It was also linked to a focus on results (relating to results-oriented policies) and impact evaluations. This type of research, more demanding, expensive and time-consuming, reduced the number of CP evaluations being conducted (Fig. 1).7 The number of implemented evaluations is also influenced by the public intervention cycle. In the years coinciding with ex ante and ex post evaluations, lower dynamics are observed than in periods dominated by on-going evaluations (Fig. 2). The evaluation system of the Cohesion Policy in Poland is characterised by a balance between the regional and central levels (altogether 33 evaluation units existed in both types of administration in 2016). In the entire analysed period (2002–2017), 52% of evaluations were carried out by units located in central public administration.

5 Data from the web evaluation platform: http://www.seo2.npseo.pl (Accessed on 30 January 2018).

6 The development stages of the two evaluation systems (concerning the CP and education policy) are described differently by various authors (cf. Żuber and Bienias 2008; Korporowicz 2011; Bienias et al. 2012; Mizerek 2017).

7 In 2015 the Ministry of Development decided that evaluations at the project level would be ineligible for financing (unless approved by the managing authority). This resulted in the almost total elimination of project evaluations, causing a reconstruction of the evaluation market in Poland.


Fig. 1 Number of completed evaluation studies of Cohesion Policy in the years 2002–2017 (Own study based on the Evaluation Database of the National Evaluation Unit [Accessed on 8 February 2018])

Fig. 2 Number of evaluations according to the moment of their implementation: ex ante 142, on-going 945, ex post 141, other 51 (Own study based on the Evaluation Database of the National Evaluation Unit [Accessed on 8 February 2018])

With the establishment of a larger number of regional evaluation units in 2007, a small change in the above-mentioned proportions is noticeable. In 2007–2017, units located in the central administration conducted 49% of all studies.8 The predominant type of evaluation commissioned within the Cohesion Policy was the on-going one, which is a natural consequence of the seven-year programme cycle and the day-to-day information needs of evaluation users.

8 There is also an analogous unit in the Polish Agency for Enterprise Development – in relation to EU funds, and evaluation units in educational institutions (universities, Central Examination Board) in case of national funds.


The most frequently examined thematic areas within the Cohesion Policy in Poland were: the CP delivery system (286 evaluations, which accounts for 22% of all research carried out), regional and local development (263 evaluations, i.e. 20%), and the labour market (156 evaluations, i.e. 12%).9 The vast majority of evaluations commissioned by the public administration are carried out externally. The employees of evaluation units are also engaged in tasks other than evaluation, which makes it almost impossible to conduct evaluations internally. The position of the European Commission also had an impact on the adoption of such solutions: it is considered best practice to assign the evaluation to external experts or to a different organisation than the one accountable for implementing the programme (European Commission 2014). The same is true for development cooperation policy, where one dedicated member of staff in the Ministry of Foreign Affairs is responsible for planning and commissioning external evaluators as well as for disseminating evaluation findings. As a result, the independence of the process is ensured by having public administration evaluation studies commissioned through tenders. External evaluation studies in Poland are conducted mainly by domestic companies. In the last two years, there were about 40 entities in the Polish research market capable of evaluating the Cohesion Policy. These were primarily consulting and research companies, and only rarely NGOs or representatives of the academic community. A different practice has been adopted in the education policy sector. Here evaluation is conducted both externally, by properly prepared inspectors, and internally, by the employees of a given school or educational institution, in accordance with the institution's needs and for its own use (Mazurkiewicz and Berdzik 2010). External evaluation is determined by local education offices in the annual evaluation plan, which indicates the schools that will be evaluated in a given year. It is worth mentioning that internal evaluation is initiated and carried out by schools themselves to assess and assure the quality of the education they provide. According to Polish legislation, the school principal is obliged to carry out this process with other teachers. Involvement of the whole school community (i.e. students and their parents) is also widely promoted (OECD 2017).

9 Own study based on the Evaluation Database of the National Evaluation Unit. See http://www.ewaluacja.gov.pl/strony/badania-i-analizy/wyniki-badan-ewaluacyjnych/baza-badan-w-arkuszu-kalkulacyjnym/. Accessed on 8 February 2018.


With respect to school projects, external evaluation is carried out at the level of programme management in the grant-giving institutions, while internal evaluation is the responsibility of the beneficiaries themselves. The scope of applied methods and research techniques is quite varied. In most studies concerning EU-funded projects, triangulation is used, as both qualitative and quantitative methods are applied (Chojecki et al. 2016). Thanks to good cooperation with the National Statistical Office, which provides data, counterfactual methods are also used. In the last three years, the use of theory-based evaluation has also been evolving. By contrast, in the education policy sector, in the context of schools, simple quantitative methods and group techniques are most often used.

2.3 Use of Evaluation

Evaluation in Poland serves various functions and supports learning, decision-making and improvement (formative role), but it is also used, presumably even more frequently, for the sake of accountability (summative role), legitimacy and formal requirements (procedural role). In turn, the promotional function of evaluation in Poland (i.e. increasing credibility and gaining public support) seems to be the rarest. The available data on the use of evaluation in Poland concern exclusively the Cohesion Policy, which is justified by the number of evaluation studies conducted and their potential impact on national policies. One recent study identified several good practices concerning the dissemination of evaluation findings in Poland in comparison to other V4+4 countries10 (Olejniczak et al. 2018). This research points to practices such as creating an online searchable database of evaluation reports and sending short briefs to politicians, members of the government, think tanks, opinion-leading media, et cetera.

10 V4+4: a cultural and political alliance of eight Central European states: Poland, Slovakia, Hungary, the Czech Republic, Slovenia, Bulgaria, Romania and Croatia. It was created as an extension of the activities of the so-called Visegrad Group (V4), the alliance of four Central European states (the Czech Republic, Hungary, Poland and Slovakia) that are members of the EU and NATO, for the purpose of advancing military, cultural, economic and energy cooperation with one another along with furthering their integration in the EU. The Group traces its origins to the summit meetings of leaders from Czechoslovakia, Hungary and Poland held in the Hungarian castle-town of Visegrád on 15 February 1991. Visegrád was chosen as the location for the 1991 meeting as an intentional reference to the medieval Congress of Visegrád in 1335 between John I of Bohemia, Charles I of Hungary and Casimir III of Poland.


Although these solutions facilitate access to evaluation results for a wide range of potential evaluation users, the study shows that the main users of evaluation findings continue to be employees of the institution that commissions the evaluation: programme managers and department directors. Moreover, cooperation between knowledge users, producers and brokers needs to be improved. The dissonance between what evaluation units claim they provide and what users think they receive is still significant. According to the evaluators themselves, they mainly provide knowledge about mechanisms, while users indicate that they usually receive information concerning effects and processes (Olejniczak et al. 2018). Thus, knowledge about implementation mechanisms remains a crucial unmet information need of commissioners. Another limitation on the use of evaluation results is competition from other sources of information, such as those 'stemming from experience on the ground', which are used more frequently. This phenomenon can indicate that evaluation takes too long to provide findings when they are needed (Kupiec 2015). The example of evaluation use in the Cohesion Policy reveals universal challenges that are characteristic for Poland, for example matching the scope of evaluation with the needs of its users, the right timing of research, the participatory character of evaluation11 and the quality of the whole process (NEU 2014). The use of EU programme evaluations in Poland is facilitated by the National Evaluation Unit (NEU), which coordinates the provision of standards and guidelines for all EU funds-related evaluation units (e.g. those based in central institutions or regional ones). In order to improve evaluation use, an innovative system for monitoring the implementation of recommendations has been created. This system enables ongoing monitoring of the extent to which the recommendations resulting from the evaluation process are used by the public administration. The use of evaluation results is conditioned to a great extent by their quality. In Poland, the quality assurance of evaluation is supported by the 'Evaluation Standards' that were elaborated and disseminated by the Polish Evaluation Society, as well as by a more practical solution that has been developed in Poland within the CP implementation system.

11 The participatory character of developmental evaluation in Poland encourages discussion of the evaluation findings with a wide range of potential users.


The Evaluation Guidelines12 require evaluation units to assess the quality of the whole evaluation process, along with its findings, after completion of the evaluation service. An evaluation assessment card containing eight specific categories and basic information about each study is prepared. The steering groups on evaluation may also play an important role in ensuring the quality of the evaluation by discussing its findings. Another tool for improving the quality of the research conducted is the meta-evaluation of the Cohesion Policy. Many difficulties with the quality assurance of evaluations carried out in Poland (not only concerning EU funds) are primarily related to the above-mentioned lack of trust, especially between the commissioning agencies and their contractors (external evaluators), as well as to the form of this relationship, which is mostly very formalised and based on legal, usually short-term (several-month-long) contracts. Secondly, insufficient quality of evaluation may be caused, among other things, by the lack of a certification system (in Poland there is no legitimate way of proving evaluation competencies), the domination of the price criterion in public procurement (particularly in past years), and the low competences of some commissioners, who are not able to demand the right quality from contractors.

3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluation in Civil Society

Non-governmental organisations are a crucial embodiment of civil society and there are about 70.000 NGOs in Poland. Most of them are small organisations with a limited budget. In 2014 the average annual budget of a Polish NGO was PLN 27.000 (approx. EUR 6460) (Adamiak et al. 2016). Key funding sources were public (i.e. EU funds, local and central government financing); however, the evaluation of NGO activities at the project level is rarely a requirement connected to accessing these funds. The role of evaluation was accentuated in the NGO Practice Charter in 2010.

12 Guidelines of the Minister of Investment and Development on the evaluation of Cohesion Policy for 2014–2020, 28 September 2015, together with later amendments.


Non-governmental organisations and other representatives of civil society are involved in the evaluation of some public interventions in Poland, however (1) mostly at the programme level (not the single projects they run, but the programme within which a project is implemented) and (2) as participants in steering groups or monitoring committees that support the evaluation of a given programme. Once again, the Cohesion Policy is an example of good practice in this area. NGO representatives participate in the groups established to support the evaluation of every operational programme. As members of a given Programme Monitoring Committee or Steering Group, they are informed of evaluation results and the state of implementation of recommendations, and discuss these issues. Participants in these groups may also propose themes and are consulted on the concepts of the evaluations conducted by the public administration. The scope of powers and responsibilities of social partners is described, among others, in the national guidelines for the evaluation of the Cohesion Policy for 2014–2020. Nevertheless, as an everyday practice, evaluation is not very common among Polish NGOs. A 2008 report on non-governmental organisations stated: "Many non-governmental organizations cannot even state the number of beneficiaries of its actions, which points to a lack of evaluation and monitoring culture in the third sector" (Stowarzyszenie Klon/Jawor 2008, p. 34). Only 22% of the examined NGOs declared conducting evaluations of their projects and actions. The increased interest in evaluation was linked to the requirements of EU-financed programmes. In the 2007–2013 financial perspective, the most popular portal for the third sector provided a series of articles promoting evaluation. Various evaluation trainings, workshops and seminars for NGOs were also organised in that period. In 2012, the share of organisations that rely on evaluation increased by 15 percentage points, but still 63% of the examined NGOs reported that they had not carried out a systematic evaluation of their own activities or their effects (Stowarzyszenie Klon/Jawor 2012; unfortunately, there is no follow-up data after 2012). One of the exceptions to this rule is the Polish-American Freedom Foundation (PAFF), which provides a good example of evaluation use. PAFF, along with partner organisations, created a coherent evaluation system, which includes funding of research, standards and guidelines, and support and training for the partners. PAFF runs a dedicated MEVA programme (Monitoring, Evaluation and Analysis), which finances the evaluation of implemented projects, conducted with partner organisations.


It promotes the dissemination of know-how and the building of an evaluation culture among PAFF partners. Although the NGO environment was one of the first areas where evaluation practices developed in Poland, the institutionalised use of evaluation by civil society is still underdeveloped. Even though there are national funds designed for NGO activities (i.e. the Fund of Civic Initiatives) that are externally evaluated, evaluation research conducted by their beneficiaries (at the project level) is not required, in contrast to monitoring, which is mandatory.13 Grass-roots calls for evaluation mainstreaming in civil society usually come from the local level, especially if citizens are directly involved in policy or budgetary decisions. Such direct decisions rarely concern more than 3–5% of the total municipal budget. Evaluation is also used to assess the impact and usefulness of local community centres. This allows them to better adjust their profiles to meet citizens' needs. Thus, the use of evaluation may be the result of expectations and pressure from the general public once it gains direct decision-making influence, or it may be treated by local governments as a useful tool for engaging voters. The use of evaluation by civil society is described in publications relating to best practices in participatory budget implementation and local public consultations (Fundacja Pracownia Badań i Innowacji Społecznych „Stocznia" 2015; Ośrodek Ewaluacji 2016). Both documents were themselves elaborated as a result of civil society initiatives and participation. Social programmes, such as participatory budgeting, prevention or revitalisation programmes, as well as public consultations, are usually evaluated externally in large Polish cities (e.g. Warsaw, Cracow, Gdynia, Poznań, Lublin), while in smaller towns such studies are much rarer, reduced to monitoring reports or limited to internally conducted surveys (Ośrodek Ewaluacji 2017).14 In large cities, evaluation reports are published on the internet and the results are used to communicate with citizens. In Warsaw (the capital of Poland), a unit of the mayor's office, the Public Communications Centre, conducts evaluations of many of its programmes.

13 Civic Initiative Fund Program for 2014–2020. Annex to Resolution No. 209/2013 of the Council of Ministers of 27 November 2013.

14 http://wolomin.org/wp-content/uploads/2018/02/Raport-z-ewaluacji-WBO2018.pdf; https://twojbudzet.um.warszawa.pl/aktualnosci/wydarzenia/zako-czy-o-sibadanie-ewaluacyjne-iii-edycji.


Its evaluation activities provide a demonstration effect for other institutions in Warsaw. Each year, the number of evaluations conducted by the city and its districts is increasing. The Office of the Capital City of Warsaw requires the evaluation of cultural projects funded from public funds and also of projects in the field of social assistance coordinated and outsourced by the city to various NGOs (known as 'public tasks'). In the case of the latter, the crucial problem is the lack of awareness and competence of the local authority and the contracting NGOs (which are required to carry out the evaluation), as well as unrealistic assumptions regarding the schedules and budgets of these evaluations. As regards evaluation practice, much depends on the experience and determination of civil servants. Unfortunately, the process of building the institutional maturity that underpins organisational learning and an evaluation culture is rather slow and relies on individual enthusiasts of evaluation. This process should also be supported by systematic training of civil servants, along with additional efforts aimed at popularising evaluation in local governments, especially in smaller towns. In conclusion, the institutionalised use of evaluation in civil society in Poland is in its initial phase. Polish NGOs evaluate their projects more often if they are large organisations with stable budgets and are financed by foreign funds that require evaluation research. Smaller NGOs approach evaluation in a less systematic manner, often using their own resources and less advanced, cheaper, single methods of data collection that do not meet the triangulation standard (e.g. team discussions or surveys of beneficiaries).

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

Despite the large number of evaluation reports produced in the areas of the Cohesion Policy and education, public perception of and debate on evaluation results hardly exist beyond the internal discussions limited to the above-mentioned monitoring committees and steering groups. Moreover, the "production" of evaluation reports (especially in 2007–2013) seems to have exceeded their "consumption", that is, the use of evaluation findings, which caused a specific inflation of evaluation (Olejniczak 2013). Most of the evaluation reports concerning education policy and the Cohesion Policy Operational Programmes are disseminated via websites (and, as regards the latter, relatively rarely as paper publications).


Since 2006 all reports concerning the CP have been collected and shared in the Evaluation Database published on the website maintained by the National Evaluation Unit.15 Typically, evaluation contracts include (along with a report) a presentation of the results to the key decision-makers. Sometimes brochures popularising the main results are also produced. Such practices help to disseminate evaluation results, but they rarely reach a wide audience. Public awareness of what evaluation is and what benefits it can bring is still unsatisfactory. This leads to limited public interest in and debate on evaluation results. The same applies to school system evaluation, which has been obligatory since 2009. Reports are widely discussed in schools and followed by teachers, but much less frequently, or not at all, by parents or local communities. Thus, public debate on evaluation results is significantly limited. It is not led with the broad participation of society (e.g. in the media), but at best takes place during conferences intended for experts, mainly representatives of commissioners and contractors. This situation stems from the above-described context in which evaluation appeared in Poland. Evaluation practices did not develop as a result of a natural evolutionary process; rather, they were introduced as a formal requirement connected to the implementation of EU funds (Kot 2014). In a country where an undemocratic regime was maintained for more than 40 years, the process of introducing evaluation conceived as 'a tool of institutional learning' or 'a quality management tool' encountered numerous obstacles. Instead of the popularisation of fourth-generation evaluation, evaluation is often reduced in practice to simple assessment, control and accountability for achieved results (Korporowicz 2008, 2011; Bartosiewicz-Niziołek 2019). There is still too little reflection by stakeholders on the effects of implemented programmes, as well as too much focus and pressure on analysing planned versus achieved indicators. Despite the number of evaluations conducted, so far there are no regular mechanisms engaging various stakeholders in all stages of the evaluation process. Strengthening public ownership of evaluation and better dissemination of results may be a way to enhance the usefulness of evaluation (Mackiewicz 2014). Real participation, engagement and collectivisation of the evaluation process seem to be a rational direction for the Polish road to evaluation (Korporowicz 2012).

15 For more info, see: http://www.ewaluacja.gov.pl/strony/badania-i-analizy/wyniki-badan-ewaluacyjnych/baza-badan-w-arkuszu-kalkulacyjnym/.


4 Professionalisation (Academic System)

Although hundreds of people have been involved in evaluation over the last 10 years, the profession of 'evaluator' does not formally exist in Poland,16 nor is one able to obtain a university degree in this field in the country. Current regulations provide the opportunity to initiate new academic disciplines, so it may be that evaluation will be introduced as a field of university education, but such degree programmes have not been developed in Poland yet.

4.1 Academic Study Courses, Further Training, and Other Learning Opportunities

With the introduction of education standards in 2011, university study programmes in Poland must be in line with the requirements of the National Qualifications Framework (i.e. a detailed description of qualifications attained in the higher education system). In accordance with these standards, a master’s programme in sociology includes the subject ‘Procedures of Evaluation Research’, which ensures the acquisition of the following skills and competencies: understanding the issues of evaluation, its social functions and the role of an evaluator, as well as applying basic analytical procedures and the tools of empirical sociology in evaluation research. Several universities offer courses in evaluation, but only as part of their two-year master’s degree programmes, mostly in sociology but also in social policy, political science, pedagogics and management. Most of these classes are conducted at two of the largest and oldest Polish universities: the University of Warsaw (eight courses)17 and the Jagiellonian University in Cracow (five courses).18

16 Even though the Polish Classification of Occupations and Specialties includes ‘the evaluator of education programmes’, this refers exclusively to education and is assigned to the group of ‘teaching inspectors and specialists’.
17 Smaller universities may be afraid of such ‘a niche discipline’, strongly associated with EU funds and thus having an uncertain future, or they may perceive it as a threat to the student recruitment process. Instead of increasing the number of new students, it could ‘disperse’ potential candidates from current disciplines, intensifying recruitment problems for other faculties. Thus, most probably only the largest universities would be able to afford the introduction of a specialisation in evaluation.


Also, at smaller universities, evaluation is usually taught within sociology or pedagogics specialties.19 Besides master’s programmes, yearlong post-graduate studies in evaluation are also available. At first (beginning in 2004), such studies mainly focused on EU funds. Most of them were offered in Poland’s capital, for instance at the University of Warsaw in the institutes of sociology, social policy and management, as well as at some smaller universities (e.g. the Warsaw School of Economics, Cardinal Wyszynski University and the private Kozminski University). Moreover, post-diploma studies for teachers are offered, for example, by the Jagiellonian University, but also by smaller universities and educational institutions throughout the country and even by non-academic entities (e.g. the Widespread Knowledge Society). In smaller cities, such studies are offered by higher schools of economics, management or entrepreneurship and by universities of technology, both public and private. Since 2008, the Centre for European Regional and Local Studies, in cooperation with the Ministry of Investment and Economic Development, has been running (at the University of Warsaw) the ‘Academy of Evaluation’, the only free-of-charge post-graduate study programme for civil servants. One of its main aims is to convince civil servants to use the knowledge brought by evaluation to operate their institutions’ development policies effectively.20

18 For example, at the Institute of Sociology—‘Evaluation Research’ (30 hours), the Institute of Applied Social Sciences—‘Procedures for Evaluation Research’ (30 hours), the Institute of Political Science—‘Analysis and Evaluation of Specific Policies’ (30 hours), the Faculty of Education—‘Evaluation of Educational Institutions’ (60 hours), the Institute of Social Policy—‘Evaluation in Social Policy’ (60 hours) and ‘Evaluation of Labour Market Projects’ (30 hours), and the Institute of Social Prevention and Resocialisation—‘Evaluation Methodology’ (45 hours); these last two classes are exceptional, as they are taught at the undergraduate level (licentiate degree). The Centre for European Regional and Local Studies offers the course ‘Evaluation Theory and Practice’ (60 hours) within a Master’s seminar in the field of spatial management.
19 Two evaluation classes are offered in the Institute of Sociology, also as parts of the Master’s programme in Data Analysis and Social Studies—‘Evaluation and Analysis of Public Policies’ (60 hours) and ‘Evaluation in Practice: an Assessment of a Selected Social Intervention’ (60 hours). Evaluation is also taught in the Institute of Public Affairs—‘Evaluation in Social Policy’ (30 hours), the Institute of Intercultural Studies—‘Evaluation of Cultural Projects’ (30 hours) and the Institute of European Studies—‘Evaluation and Audit of EU projects’ (5 hours). However, an initial analysis of the syllabuses in the university’s internet databases revealed that dozens of specialty subjects contained the term ‘evaluation’, but the status of many of them was defined as inactive (Maciejewska 2014).


As regards non-formal education, the Polish Evaluation Society (PES) has the longest tradition in Poland of conducting training in evaluation. Since 2000, it has trained over 500 representatives of public administration, NGOs, consultancy and research companies, mainly through a 60-hour course, ‘The Evaluation Forge’, which it has been offering since 2010.21 Its curriculum includes evaluation design, qualitative and quantitative research methods and tools, data analysis, evaluation reporting and the use of evaluation results. The training is practically oriented and is carried out by experts who have extensive experience in commissioning and conducting evaluations. In 2016, PES implemented a blended-learning project that included the distance training course ‘Take a course on evaluation’ along with workshops, webinars and tutoring. The course was addressed to those who wish to gain basic and intermediate knowledge about evaluation, to systematise or deepen their knowledge in this area, or merely to confirm their knowledge by passing the examination and obtaining the certificate. The structure of the distance course is similar to ‘The Evaluation Forge’ and takes about 30 hours of learning, which includes texts, examples, exercises, quizzes, problem solving, tasks and knowledge tests.22 Since 2009, PES has also supported other Polish NGOs by implementing three projects providing competence development through tutoring (training, workshops and consultancies regarding evaluations of projects implemented by the participants of the courses). A more advanced course, concerning counterfactual methods in evaluation, is offered by the Centre for Evaluation and Analysis of Public Policies of the Jagiellonian University, in cooperation with the IDEA of Development Foundation, the Polish Evaluation Society and Timberlake Consultants Ltd. This 50-hour summer school, ‘Evidence-based public policies’, is addressed mainly to representatives of public administration and firms conducting evaluation research.23

20 The course is financed from EU funds (Technical Assistance) and aims at supporting the Polish evaluation system by training representatives of public administration in commissioning evaluation and using its results. http://www.euroreg.uw.edu.pl/pl/akademia-ewaluacji11.
21 This course is organised once or twice a year in two different formulas—a closed one for a respective institution, or in an open form for any interested applicant.
22 At present, together with a few Voluntary Organisations for Professional Evaluation, PES is seeking a grant to translate the course into English and several other languages.


The Polish training market also offers several, usually one-day, training courses in monitoring and evaluation, provided mainly by consultancy firms and NGOs. Longer courses targeting specific vocational groups have also been available; for example, in 2011 the Office for the Movement of Social Initiatives offered a 72-hour evaluation training within an EU project entitled ‘Evaluation as the key to success’, to support the development of professional qualifications and consulting for Polish enterprises. Since 2012, according to the Regulation of the Minister of Family, Labour and Social Policy, the training programme for the specialisation in organising social assistance includes 20 hours of classes on evaluation.

4.2 Profession/Discipline

The largest and most influential VOPE in Poland is the Polish Evaluation Society (PES), established in 2000. In 2007 the pool of PES members began to grow rapidly, reaching more than 150 persons. PES brings together representatives of public administration, NGOs, consulting and research firms, academics and independent evaluators. Its main aims are to build an evaluation culture in Poland, propagate evaluation as a democratic and social process, integrate people interested and involved in evaluation, enrich knowledge about evaluation and support its quality. In 2008, PES implemented a yearlong project aimed at developing evaluation standards. This initiative was financed by the EU and implemented as part of the Transition Facility programme ‘Strengthening Evaluation Capacity in the Public Sector’. The project involved over 60 persons, both commissioners and evaluators, including representatives of public administration, NGOs, consulting firms, scientific and academic institutions, as well as independent evaluators (freelancers), in the process of developing evaluation standards for Poland’s evaluation community. However, this project has so far not been accompanied by further initiatives aimed at the professionalisation of evaluation, such as the establishment of an arbitration board and an ombudsman for evaluation or a journal devoted solely to evaluation issues.24

23 It is intended for those who commission or conduct research using counterfactual analyses and the theory-based approach (TBE), which are recommended by the European Commission in the 2014–2020 perspective in relation to impact evaluations.
24 Though PES has been striving for this for some years.


Articles on evaluation are occasionally published in journals concerned with public management, social policy or regional studies. The largest series of books on evaluation has been published since 2007 by the Polish Agency for Enterprise Development. Although the series focuses mainly on evaluation in public administration (particularly with reference to entrepreneurship), it covers a wide range of topics. Some books have also been published by units of the Polish evaluation system, for example a guide for public administration staff released by the National Evaluation Unit (Bienias et al. 2012), or post-conference publications prepared by the evaluation units of regional authorities (Marshal Offices, e.g. in Torun and Opole). The Regional Centre for Social Policy in Cracow issued a number of publications on the evaluation of social policies, including an online evaluation guide as part of the EU project ‘Malopolska Observatory of Social Policy’ implemented in 2008–2015. In 2010–2015, a series entitled ‘Evaluation in Pedagogical Supervision’ was also published by the Jagiellonian University as part of an EU project25 addressed to the education community (e.g. teachers, advisors, inspectors). Moreover, several publications have been issued by Polish NGOs. For instance, one of the PAFF’s partners, the Unit for Social Innovation and Research ‘Shipyard’, published two manuals, ‘Evaluation—how is it done’ and ‘Mini Guide for Evaluation of Cultural Education Project Implementers’, as well as tips for evaluating the participatory budget, and released a film promoting evaluation (Fundacja Stocznia; Ostrowski and Wiśnicka 2013; Durlik and Dziarmakowska 2017). Presently, the evaluation market is dominated by groups of specialised contractors, mainly consulting and research firms conducting evaluations for the public sector. The pool of contractors,26 comprising both domestic and foreign companies, has developed rapidly since 2007, over more than a dozen years, in response to the dynamically increasing demand driven by the evaluation requirements relating to EU funds. The Evaluation Research Database operated by the NEU includes over 230 entities that conducted evaluations of Cohesion Policy initiatives implemented in 2002–2017. The group of contractors is strongly differentiated—besides consulting and research firms, it includes universities, research institutes, NGOs and independent evaluators, though their share in the evaluation market is relatively small compared to the major players—consulting companies (Bienias et al. 2015).

25 This project aimed at strengthening the effectiveness of the pedagogical supervision system and assessing the quality of school work.
26 This may be related to the need to use innovative methodological approaches and expert knowledge.


Nevertheless, in recent years significant changes have occurred in the Polish evaluation market. Several international consulting companies have withdrawn, a number of evaluators have shifted from the private to the public sector, and there has been some reduction in the share of freelancers. This may be due to the transition between financial perspectives,27 accompanied by a change of political authorities in Poland, which resulted in a several-month gap in tendering for evaluation contracts. Another reason may be the strong price competition that is contributing to a reduction of evaluation budgets. Small firms and independent evaluators were also affected by the decision of the Ministry of Investment and Economic Development to make the evaluation of EU projects a conditionally eligible cost as of 2016,28 which resulted in the cessation of evaluation at this (i.e. single-project) level.

5 Conclusion

As explained throughout this chapter, evaluation research in Poland started to develop rapidly under the EU paradigm of monitoring and evaluation systems, which dominated other approaches. This has had overwhelming consequences for Polish evaluation culture. On the one hand, evaluation was swiftly introduced in public administration at central and local levels, with a large amount of funds devoted to creating the system at the beginning of the 2000s, as well as new jobs, a variety of publications, conferences, training (mainly, but not only, for civil servants) and many other evaluation-related activities. On the other hand, at the beginning of this process evaluation was a hermetic tool for managing EU funds, with limited spill-over into other sectors, constraining the independent development of approaches and methodologies and limiting the evolution of evaluation as an academic discipline. In addition, the operational programme cycles affect the size of demand and, in consequence, the stability of the profession to a great extent.

27 Between 2007–2013 and 2014–2020.
28 This concerns projects financed by the European Regional Development Fund, the European Social Fund and the Cohesion Fund. An institution managing a given operational programme has to approve a project-level evaluation.


Nevertheless, some positive trends can be observed since evaluation was introduced into Polish administrative practice after EU accession. Evaluation has been anchored in national law as a mandatory tool for the ex-ante assessment of public horizontal and sectoral strategies and programmes. Evaluation is also a required procedure in the education system (since 2009), in Polish development cooperation (known as ‘Polish Aid’) supporting less developed countries, and in NGOs’ projects funded by national and some local and municipal resources. Although the non-governmental sector was one of the first areas where evaluation developed in Poland, the institutionalised use of evaluation research by civil society is mainly hampered by a low level of social trust. Nevertheless, there are also successes in third-sector evaluation: its role was accentuated in the NGO Practice Charter in 2010. Over the past two decades, evaluation in the education sector has also been strongly institutionalised in Poland due to the introduction of legal changes in the way pedagogical supervision is exercised. However, the overall reforms of the educational system are frequent and not always evidence-based, which destabilises the context of schools’ operations. Although hundreds of people have been involved in evaluation processes over the last 10 years, the profession of evaluator does not formally exist in Poland, nor is it possible to obtain a university degree in this field. Evaluation is taught as a scientific discipline mainly as part of two-year master’s degree studies, mostly in sociology but also in social policy, political science, pedagogics and management. Diverse yearlong post-graduate studies in evaluation are also available, focused especially on EU funds and the education system. Publications on evaluation have been issued primarily as part of EU projects, usually implemented by non-academic entities working on public administration, education or social policy issues. The Polish Evaluation Society has the longest tradition in Poland of conducting training in evaluation. Since 2000, PES has trained over 500 representatives of public administration, NGOs, consultancy and research companies, mainly through its 60-hour course ‘The Evaluation Forge’, which it has been organising since 2010. Evaluation standards were developed by PES in 2008 to support the evaluation process and ensure its quality, but the use of this document, by both commissioners and evaluators, is voluntary. As described above, the advancement of the institutionalisation of evaluation in Poland varies among sectors. Indeed, some national policies and third-sector programmes are regularly evaluated. There are also evaluation studies based on rigorous methods that have been widely promoted in the region as good examples.


Cohesion Policy evaluation is trying to shift the focus from the quantity of studies towards their quality and to root evaluation in legislative structures. In turn, the third sector has produced its own materials on evaluating social projects, while evaluation in education has experimented with the development of different evaluation systems and remains open to change. Last but not least, the Polish Evaluation Society, the oldest in Central and Eastern Europe, is a strong actor in professionalising the field with its courses, seminars, conferences and congresses, as well as with standards and guidelines on conducting evaluation ethically and rigorously. In conclusion, the last 20 years of developing an evaluation culture in Poland have comprised extensive efforts to root ‘improving by learning’ and evidence-based policymaking in all the fields discussed. Although this process has met with some failures and weaknesses, it has also brought many achievements that can be considered exemplary for this part of Europe.

References

Act of 6 December. (2006). Principles of conducting the development policy (No. 227).
Act of 16 September. (2011). Development cooperation.
Act of 15 September. (2017). On the National Institute of Freedom—Center for the Development of Civil Society.
Adamiak, P., Charycka, B., & Gumkowska, M. (2016). Kondycja sektora organizacji pozarządowych w Polsce 2015 (Raport z badań). Warszawa: Stowarzyszenie Klon/Jawor.
Bartosiewicz-Niziołek, M., Wróblewski, J., & Fiut, K. (2018). Report on the assessment of the current state of DRG evaluation in Poland. In M. Sumpor (Ed.), Democracy, rule of law and human rights, and good governance (DRG) evaluation in Europe. Report on the assessment of the current state of DRG evaluation in Croatia, Greece, Macedonia, Poland, Serbia, Turkey and Ukraine (Research project funded under the IOCE/EvalPartners Small Grants Programme “Strengthening the Role of VOPEs in Democratizing Learning and Evaluation: Democracy, Human Rights, and Governance as a Showcase”).
Bartosiewicz-Niziołek, M. (2019). Wyniki badania postaw koordynatorów projektów UE wobec ewaluacji (prezentacja wygłoszona podczas IV Międzyregionalnej Konferencji Ewaluacyjnej w Toruniu “Ewaluacja – obowiązek czy klucz do uproszczeń w realizacji programów operacyjnych”. http://www.mojregion.eu/index.php/rpo/ewaluacja-szkolenia-konferencje).
Bienias, S., Felcis, W., Hermann-Pawłowska, K., & Trzciński, R. (2015). Inspiracje z ewaluacji—najlepsze praktyki. In J. Górniak (Ed.), Ocena wpływu oparta na dowodach. Model dla Polski. Warszawa: Akademia Leona Koźmińskiego.
Bienias, S., Gapski, T., Jąkalski, J., Lewandowska, I., Opałka, E., Strzęboszewski, P., et al. (2012). Ewaluacja. Poradnik dla pracowników administracji publicznej. Warsaw: Ministry of Regional Development.
Brundenius, C., & Göransson, B. (2011). The three missions of universities: A synthesis of UniDev project findings. In B. Göransson & C. Brundenius (Eds.), Universities in transition: Insight and innovation in international development (pp. 329–352). New York: Springer.
Chojecki, J., Huras, P., Kamińska, A., Matysiak, M., Nowińska, K., Pałka, S., et al. (2016). Metaanaliza wyników badań ewaluacyjnych dotyczących oceny wsparcia z EFS – Raport cząstkowy I. Warszawa: Evalu na zlecenie Ministerstwa Rozwoju.
Durlik, M., & Dziarmakowska, K. (2017). Miniporadnik ewaluacji dla realizatorów projektów edukacji kulturalnej. Warszawa: Fundacja Pracownia Badań i Innowacji Społecznych „Stocznia”. http://www.mkidn.gov.pl/media/po2017/decyzje/20170515_Miniporadnikewaluacji.pdf.
European Commission. (2014). The programming period 2014–2020, guidance document on monitoring and evaluation. Concepts and recommendations.
EU-Russia Civil Society Forum. (2017). Report on the state of civil society in the EU and Russia, 2016. http://eu-russia-csf.org/fileadmin/State_of_Civil_Society_Report/18_05_2017_RU-EU_Report_spaudai_Hyperlink_Spread.pdf. Accessed on 6 March 2019.
European Social Survey Round 8. (2016). https://www.europeansocialsurvey.org/data/download.html?r=8.
European Social Survey Round 9. (2019). http://www.europeansocialsurvey.org/about/singlenew.html?a=/about/news/essnews0076.html.
Ferry, M., & Olejniczak, K. (2008). The use of evaluation in the management of EU programmes in Poland. Warsaw: Ernst & Young. http://www.ey.com/Publication/vwLUAssets/Evaluation_EU_Funds_Poland.pdf/$FILE/Evaluation_EU_Funds_Poland.pdf. Accessed on 6 March 2019.
Fundacja Pracownia Badań i Innowacji Społecznych „Stocznia”. (2015). Standardy procesów budżetu partycypacyjnego w Polsce. https://stocznia.org.pl/publikacje/standardy-procesow-budzetu-partycypacyjnego-w-polsce/. Accessed on 6 March 2019.
Górniak, J. (2015a). Ogólne uwarunkowania sytuacji szkolnictwa wyższego w Polsce. In J. Górniak (Ed.), Diagnoza szkolnictwa wyższego. Program rozwoju szkolnictwa wyższego do 2020 (pp. 25–36). Warszawa: Fundacja Rektorów Polskich.
Górniak, J. (Ed.). (2015b). Ocena wpływu oparta na dowodach. Model dla Polski. Warszawa: Akademia Leona Koźmińskiego.
Januszkiewicz, A. (2016). The practice of evaluation of EU projects in local governments in Poland. International Journal of Contemporary Management, 15(3), 7–30.
Jaskuła, S. (2018). Ewaluacja społeczna w edukacji. Warszawa: Impuls.
Korporowicz, L. (2008). Interakcyjne aspekty procesu ewaluacyjnego: pomiędzy analizą a animowaniem zmian społecznych. In A. Haber & M. Szałaj (Eds.), Środowisko i warsztat ewaluacji. Warszawa: Polska Agencja Rozwoju Przedsiębiorczości.
Korporowicz, L. (2011). Zmienne losy polskiej ewaluacji. Pomiędzy nadzieją, animacją i konfuzją. In B. Niemierko & M. K. Szmigiel (Eds.), Ewaluacja w edukacji. Koncepcje. Metody. Perspektywy. Kraków: PTDE.
Korporowicz, L. (2012). Refleksja jako działanie. O stałej potrzebie przełamywania redukcji w pojmowaniu badań ewaluacyjnych w edukacji. In B. Niemierko & M. K. Szmigiel (Eds.), Regionalne i lokalne diagnozy edukacyjne. Kraków: Tomami.
Korporowicz, L. (2013). Rozwojowa misja ewaluacji. In Animacja Życia Publicznego, Zeszyty Centrum Badań Społeczności i Polityk Lokalnych nr 4 (11).
Kot, T. (2014). Mocne i słabe strony procesu ewaluacji 2007–2013. Pożądane kierunki zmian na lata 2014–2020. Warszawa: Krajowa Jednostka Ewaluacji, Departament Koordynacji Strategii i Polityk Rozwoju, Ministerstwo Infrastruktury i Rozwoju. http://www.kozminski.edu.pl/fileadmin/Konferencje/Ewaluacja_pregramow_operacyjnych/4.PerspektywaSystemowa_T.Kot.pdf. Accessed on 6 March 2019.
Kowalik, W., Malczyk, K., Maźnica, Ł., Knaś, P., Stępniak, W., Dulian, L., et al. (2015). Nowa sprawozdawczość instytucji kultury. Kraków: Małopolski Instytut Kultury w Krakowie.
Kupiec, T. (2015). Ewaluacja regionalnych programów operacyjnych w warunkach prawa zamówień publicznych i finansów publicznych. Samorząd Terytorialny, 10, 27–39.
Maciejewska, M. (2014). Badania ewaluacyjne w edukacji i ich konteksty w akademickim kształceniu nauczycieli i pedagogów. Opuscula Sociologica, 4, 31–43.
Mackiewicz, M. (2014). Ewaluacja programów operacyjnych w ramach polityki spójności 2007–2013 – doświadczenia i wnioski na przyszłość. https://www.kozminski.edu.pl/fileadmin/Konferencje/Ewaluacja_pregramow_operacyjnych/6.PerspektywaWykonawcy_M.Mackiewicz.pdf. Accessed on 5 March 2019.
Marková, I. (Ed.). (2004). Trust and democratic transition in post-communist Europe (Vol. 123). Oxford: Oxford University Press.
Mazurkiewicz, G. (Ed.). (2012). Jak być jeszcze lepszym. Ewaluacja w edukacji. Kraków: Wydawnictwo Uniwersytetu Jagiellońskiego.
Mazurkiewicz, G., & Berdzik, J. (2010). System ewaluacji oświaty: model i procedura ewaluacji zewnętrznej. In G. Mazurkiewicz (Ed.), Ewaluacja w nadzorze pedagogicznym: odpowiedzialność. Kraków: Jagiellonian University.
Ministerstwo Gospodarki we współpracy z Kancelarią Prezesa Rady Ministrów. (2015). Wytyczne do przeprowadzania oceny wpływu i konsultacji publicznych w ramach rządowego procesu legislacyjnego.
Ministerstwo Infrastruktury i Rozwoju. (2015). Plan Ewaluacji Programu Operacyjnego Wiedza Edukacja Rozwój na lata 2014–2020. Warsaw.
Mizerek, H. (2006/2016). Ewaluacja edukacyjna w Polsce. Trajektorie, perspektywy i dylematy rozwoju. „Zarządzanie Publiczne”, 1(33), s. 1–16.
Mizerek, H. (2017). Ewaluacja edukacyjna. Interdyskursywne dialogi i konfrontacje. Kraków: Impuls.
National Evaluation Unit (NEU). (2014). Wpływ ewaluacji na skuteczność i efektywność realizacji polityki spójności w Polsce. Dobre praktyki. Warsaw: Ministry of Infrastructure and Development.
National Evaluation Unit (NEU). (2016). Annual of evaluation 2016: Overview of evaluation activities within cohesion policy in Poland. Warsaw: Ministry of Economic Development.
OECD. (2016). Society at a glance 2016: A spotlight on youth, how does Poland compare? https://www.oecd.org/poland/sag2016-poland.pdf. Accessed on 5 March 2019.
OECD. (2017). OECD development cooperation peer reviews: Poland 2017.
Olejniczak, K. (2013). Mechanisms shaping an evaluation system—A case study of Poland 1999–2010. Europe-Asia Studies, 65(8), 1642–1666.
Olejniczak, K., Kupiec, T., Wójtowicz, D., & Felcis, W. (2018). Evaluation in V4+4 countries—An overview of practices. Warsaw: Ministry of Investment and Economic Development.
Ornacka, K. (2003). Ewaluacja między naukami społecznymi i pracą socjalną. Kraków: Wydawnictwo UJ.
Ośrodek Ewaluacji. (2016). Ewaluacja procesu budżetu partycypacyjnego w Warszawie na rok 2017. Raport końcowy. http://twojbudzet.um.warszawa.pl/sites/twojbudzet.um.warszawa.pl/files/raport_koncowy_wrzesien_v5.pdf. Accessed on 5 March 2019.
Ośrodek Ewaluacji. (2017). Ewaluacja funkcjonowania domów sąsiedzkich—raport z badania. https://inicjatywa.um.warszawa.pl/sites/inicjatywa.um.warszawa.pl/files/Ewaluacja%20MAL%202016%20Por%C3%B3wnanie%203%20DS.pdf. Accessed on 5 March 2019.
Ostrowski, Ł., & Wiśnicka, M. (2013). Ewaluacja—jak to się robi? Poradnik dla programów PAFW. Warsaw: Fundacja Pracownia Badań i Innowacji Społecznych „Stocznia”.
Podkońska, A. (2016). Ewaluacja w lokalnej polityce społecznej – przykład projektów aktywnej integracji. Problemy Polityki Społecznej. Studia i dyskusje, 35(4), s. 131–148. https://www.problemypolitykispolecznej.pl.
Polish Evaluation Society (PES). (2008). Evaluation standards. Warszawa. http://pte.org.pl/wp-content/uploads/2015/08/PTE_Evaluation_Standards.pdf. Accessed on 5 March 2019.
Przewłocka, J., Adamiak, P., & Herbst, J. (2013). Podstawowe fakty o organizacjach pozarządowych. Raport z badania 2012, Stowarzyszenie Klon/Jawor. http://www.ngo.pl/PodstawoweFakty_2012_raport/ebook/content/PodstawoweFaktyNGO_2012_KlonJawor_raport.pdf. Accessed on 5 March 2019.
Sobieraj, I. (2014). Badania ewaluacyjne w środowisku szkoły. Idee-metody-organizacja. Kraków: Nomos.
SRD. (2017). Strategy for responsible development till 2020 (with perspective until 2030), adopted by the Council of Ministers on 14 February 2017.
Stowarzyszenie Klon/Jawor. (2008/2012). Podstawowe fakty o organizacjach pozarządowych. Raport z badań. http://osektorze.ngo.pl/files/civicpedia.pl/public/raporty/podstawowe_fakty_2008_calosc_popr_FIN.pdf. Accessed on 5 March 2019.
Szarfenberg, R. (2009). Ewaluacja strategii a polityka społeczna. In M. Grewiński & A. Karwacki (Eds.), Strategie w polityce społecznej. Warsaw: Mazowieckie Centrum Polityki Społecznej.
Szatur-Jaworska, B. (Ed.). (2010). Ewaluacja w służbach społecznych. Warsaw: Mazowieckie Centrum Polityki Społecznej.
USAID. (2017). CSO sustainability index for central and Eastern Europe and Eurasia. https://www.usaid.gov/sites/default/files/documents/1866/CSOSI_Report_7-28-17.pdf. Accessed on 5 March 2019.
Ustawa z dnia 15 września 2017 r. o Narodowym Instytucie Wolności – Centrum Rozwoju Społeczeństwa Obywatelskiego.
Wenzel, M., & Kubik, J. (2009). Civil society in Poland: Case study. Report prepared for the International Conference “The Logic of Civil Society in New Democracies: East Asia and East Europe”. http://www.cbos.pl/PL/wydarzenia/04_konferencja/Civil%20society%20in%20Poland.pdf. Accessed on 5 March 2019.
Żuber, P., & Bienias, S. (2008). System ewaluacji w Polsce – dotychczasowe doświadczenia i wyzwania na przyszłość. In K. Olejniczak, M. Kozak, & B. Ledzion (Eds.), Teoria i praktyka ewaluacji interwencji publicznych: Podręcznik akademicki. Warsaw: Wydawnictwo Akademickie i Profesjonalne.

CHAPTER 17

Romania

Nicolae Toderaș and Teodora Diana Iacob

1 General Country Overview

In Romania, the use of evaluation is a recently introduced practice, a result of the reforms undertaken within the central public administration in the early 2000s. Until 1990, in the context of the authoritarian regime, although certain Western concepts related to public management were applied, rigid and centralised planning of public interventions prevailed. Therefore, although the term evaluation was used, it was understood rather as control, verification and reporting. During the first years of democratisation of Romanian society, the vision of how public interventions are implemented gradually began to change towards a pluralistic, deliberative and participative approach. During the transition period, technical assistance projects funded by the European Union (EU), the United States Agency for International Development, the Department for International Development, the World Bank et cetera had a defining role.


They supported public authorities in reforming their functioning and procedural framework according to the principles of New Public Management. The purpose of the technical assistance projects was mainly the creation of an improved national system for the implementation of public policies. In this respect, evaluation was interpreted as an essential stage of the policy cycle: substantiating interventions and demonstrating the appropriateness of their implementation as well as their relevance and impact on social and economic development. The development of the evaluation culture accelerated during the EU pre-accession period (especially in the years 2001–2006). The central public administration in Romania had to respond to requirements to strengthen the evaluation culture, coming mainly from the EU institutions, either through incentives (e.g. more consistent financial allocations) or through concrete conditionalities (e.g. the role of the European Commission’s six-monthly monitoring reports). In 2006, a favourable context for the diffusion of evaluation practice within central administration structures was created, influenced by externally financed projects related to the evaluation of public interventions. Moreover, during the same period a broad regulatory framework was developed which, with certain adjustments, has been applied ever since. Following EU accession, the process of strengthening the evaluation culture slowed its pace. Although concrete incentives have been designed to increase the need for evaluation exercises and the use of evaluation results, evaluation practice has developed mainly in the area of the EU funds management system. The incidence of evaluations of other public interventions remains relatively low. Various analyses and case studies reveal that these activities are to a large extent conducted under pressure from external actors. The results of evaluations are rarely used to improve the management of public policies and public authorities. In many cases, their use has been limited to the dissemination of the main conclusions regarding the evaluated interventions. From a general perspective, the last decade is characterised by a latent development of the evaluation culture, even though significant resources have been invested in developing evaluation capacity and, implicitly, the evaluation culture. Civil society and the economic environment have had a relatively narrow role in supporting bottom-up processes for strengthening the internal evaluation culture.


They have rather harnessed the pressures exerted by external factors, without generating continuous and sustainable pressure to increase the use of evaluation.

2 Institutional Structures and Processes (Political System)

2.1 Evaluation Regulations

2.1.1 Provisions on Evaluation or Use of Evaluation in the National Laws or Regulations

So far, no specific legislative initiative regulating the evaluation of public interventions has been developed and adopted in Romania. Nevertheless, during the 2005–2008 period several debates were organised in the context of the implementation of projects whose objectives were to develop a new culture of designing public policies and to strengthen evaluation capacity at central public administration level (Stăvaru 2015; Perianu 2008). There were various proposals to develop and adopt a distinct law regulating evaluation (Cerkez 2009), but they did not gather a critical mass of support, and consequently the initiatives were limited to introducing, as amendments to the tax legislation, new articles that would require central and local public authorities to conduct periodic evaluations. However, with the changes that occurred at the level of government by the end of 2008, the proposed amendments were not adopted. Since then there have been no further attempts to elaborate, support or adopt a law on the evaluation of public programmes and policies. Instead, the focus has been on normative regulation (Cerkez 2009), based on regulations or guidelines adopted at the level of the Secretariat General of the Government or at the level of ministries and/or specialised agencies. This normative approach to evaluation derives from a series of government decisions on policy-making processes. The first such government decisions were adopted in the context of the negotiation process for EU membership. For example, Government Decision No. 1006 of 4 October 2001, approving the government strategy on accelerating the reform of public administration, introduced procedures designed to guide the substantiation of public policies, specific to New Public Management, focusing also on the effectiveness of public policies and on programme evaluation. Subsequently, during the pre-accession period, a number of decisions were adopted on the procedures for the elaboration, monitoring and evaluation of public policies at central level.


In this respect, the most frequently invoked is Government Decision No. 775 of 14 July 2005, approving the regulation on the procedures for the elaboration, monitoring and evaluation of public policies at central level. With certain adjustments made in 2006, 2009 and 2016, this regulation remains the normative reference applied so far in terms of the evaluation of public interventions. The 2016 amendments (Government Decision No. 523) aim to boost the development of the evaluation culture by introducing annual reporting by the ministries on the process of public policy implementation, monitoring and evaluation.

2.1.2 Normative Framework Paths on Development of Capacity and Culture of Evaluation

Within the process of developing the normative framework, two relevant directions focused on strengthening the capacity and culture of evaluation. The first direction aimed at consolidating the constant use of mid-term and ex post evaluations of public policies (including the evaluation of administrative capacity), highlighting the effectiveness, efficiency, relevance and usefulness of these policies and programmes during their implementation period. The second direction focused on Regulatory Impact Assessment (RIA). To an extent, the two directions were not complementary but competing, although the stake was the same, namely to contribute to improving the quality of public policies and programmes. For example, in 2007, by amending the Law on normative technical standards for drafting legislative acts (Law No. 24), RIA became mandatory for new legislative proposals and draft normative acts. The normative institutionalisation of RIA was supported through specific administrative capacity development projects, in which methodologies, procedures, guidelines and manuals were developed. However, decision-makers and epistemic actors showed much less interest in the development of the normative framework on evaluation. Moreover, interest in the development of subsequent instruments has declined significantly since 2009. An exception came in 2015–2016, when a number of improvements to the normative framework were made to fulfil the ex ante conditionalities for the European Structural and Investment Funds 2014–2020. The Law on normative technical standards for drafting legislative acts does not contain explicit references to the need to base legislative proposals and normative acts on evidence from regular evaluations of public interventions.


However, the law requires the use of RIAs. It states clearly that during the preparation of legislative and normative proposals, evaluation activities are circumscribed to the use of secondary data, such as the analysis of the specialised literature, various studies and research, and statistical evaluations, as well as the collection of primary data, including the consultation of specialists. Not even the subsidiary normative framework has the capacity to transpose the need to use evidence from evaluations of the impact, effectiveness, efficiency and relevance of public interventions. Given that the Romanian administrative system is characterised by the prevalence of legal coercion mechanisms, government decisions did not carry the appropriate weight compared to laws (Toderaș 2017). Therefore, without the obligation and the need for accountability, central and local public authorities in Romania have rarely resorted to evaluation. Exceptions are those structures which, owing to the requirements of European funds regulations or of international organisations (such as the World Bank or United Nations structures), have been obliged to conduct thematic evaluations on a continuous and systematic basis. Over the last two decades, within sectoral laws the focus has been on analysing organisational and institutional compliance as well as performance. These types of evaluations are designed for specific processes, such as obtaining permits and operating licences or the periodic accreditation of service providers or of the programmes and services they offer. Such provisions can be found in the primary and secondary legislation regulating specific interventions in different sectors such as education, research, health, social protection, environment, energy, et cetera. The dominant approaches in these sectors were mainly based on the principle of quality assurance and equitable access to the services or programmes offered. However, the requirements on evaluation included in these norms did not over time lead to an increased need to extend these activities through periodic thematic evaluations, so as to match interventions to the needs of the beneficiaries or communities served and to improve the implementation process, with a focus on utility, relevance and effectiveness. For example, in the education sector there is a distinct law on quality assurance in the education system and the higher education system (Law No. 87). Both systems have started from the same paradigm of evaluation—demonstrating the degree to which beneficiaries’ expectations are satisfied by observing quality standards. Over the last decade, through subsidiary amendments to the normative framework (methodologies, guidelines and guides), organisations subject to periodic evaluation have been encouraged to conduct other types of evaluations to demonstrate the usefulness, relevance and effectiveness of the programmes and activities performed.


Examples include tracking graduates’ professional paths, evaluating the effectiveness of ‘school after school’ activities, and assessing the impact of master’s programmes on the improvement of different areas or sectors of activity. Within the health and education sectors (including the higher education system), sectoral laws foresee the evaluation of organisational performance through periodic classification and ranking exercises. These exercises require hospitals and higher education institutions to use data based on evidence collected and obtained through periodic evaluation exercises.

2.1.3 A Top-Down Approach to the Institutionalisation of Evaluation Use

The institutionalisation of evaluation in Romania is a result of the strategies circumscribed to the reform of public administration, as well as those designed to improve the implementation of public interventions. Two generations of strategic documents that fostered the institutionalisation of the existing evaluation culture can be distinguished. The first generation was characterised by the promotion of evaluation as a means of control, whereas the second generation of strategies was meant to generate a culture of lessons learned. The intensity of the provisions advocating the use of evaluation in these strategies has largely depended on the recommendations of the EU institutions as well as of other international organisations such as the World Bank or the Organisation for Economic Cooperation and Development (OECD). For example, over the past two decades the recommendations with the strongest power of conviction have proven to be those outlined in the SIGMA reports.1 The first generation of strategies is specific to the EU accession period, which was strongly influenced by the negotiation conditions (2000–2005). One of the conditionalities referred to a changed approach in the management of public interventions, from a centralist and authoritarian style to a pluralistic and decentralised one, so the focus was on public administration reform.

1 Support for Improvement in Governance and Management (SIGMA) is a joint initiative of the OECD and the EU. One of its key objectives is to improve the design and implementation of public administration reforms. Initially (starting in 1993) it targeted the EU candidate countries and potential candidates; subsequently, a version of the Principles was developed for European Neighbourhood countries.


For that period, the Government Strategy on Accelerating Reform of Public Administration for 2001–2004 was emblematic. Adopted in October 2001, the strategy stated that the evaluation of the impact and efficiency of public policies is a key instrument for reforming the administrative system. The strategy had the role of initiating a demarcation of management functions between the political and the administrative levels, in order to ensure the qualitative implementation of public policies and the evaluation of their effectiveness. Moreover, the transition to a managerial style of public administration based on a cyclical approach to the policy-making process, including planning, implementation, monitoring, control and evaluation, was operationalised. The operationalisation of the strategy was mostly based on the PHARE programme. At the same time, based on the strategy’s provisions, the Public Policies Unit was created within the Secretariat General of the Government in 2003 (Stăvaru 2015). This was the first governmental structure with the role of coordinating the programming and evaluation of public policies at central level. Subsequently, in 2004–2005, the institutionalisation process extended to line ministries and other government agencies through the creation of public policy units, which also had the role of conducting evaluation exercises on public interventions. The second generation of strategies focused on the development of the national policy coordination system. This phase began in 2006 and is still ongoing. Its specific accent has been on improving the process of delivering public interventions and moving to a performance-based management culture. For the 2006–2013 period, the strategy for improving the elaboration, coordination and planning processes of public policies in the central public administration, adopted in June 2006 by Government Decision No. 870, is emblematic. The strategy emphasises the obligation to report on and evaluate public interventions ex post. It sets out the main procedural milestones concerning the timing and frequency of evaluations, the objectives of the evaluations, and the use of evaluation results. The financial operationalisation of the strategy was mostly based on the Operational Programme Administrative Capacity Development (OPAC) 2007–2013. At the beginning of this institutionalisation period, a distinct evaluation strategy was elaborated: the National Evaluation Strategy for 2007–2013, published in November 2006. The stake of this strategy consisted of creating a national evaluation system targeting the public and private sectors as well as civil society.


Based on the EVALSED recommendations,2 a gradual approach to the development of the national evaluation system was proposed, alongside the creation of a distinct central structure to coordinate the evaluation of public interventions, including those funded from the European budget. It should be noted that, in spite of all the efforts, this strategy never officially came into force. Neither the Ministry of Finance, the initiator of the strategy, nor the Secretariat General of the Government advanced the strategic document to the level of a technical regulatory framework. The failure to adopt the strategic proposal was influenced by the fact that the process of institutionalising the evaluation function during the first period happened very fast, without creating a critical mass of advocates for evaluation. At the same time, against the background of an incipient institutional architecture in the area of public policy coordination, the proposal to create a central state ‘resource’ as a centre of evaluation expertise raised a number of concerns. Last but not least, as Cerkez (2009) notes, taking over ideological conceptions from EVALSED led to confusion and contradiction in the institutionalisation of clear evaluation procedures in the absence of an administrative culture in this respect. However, some elements of the strategic document have been used to substantiate public interventions aimed at strengthening administrative capacity, among them the establishment of master’s programmes and the implementation of administrative capacity development projects. Currently, the legislative framework for the implementation of public policies is governed by the Strategy for Strengthening Public Administration 2014–2020, which was approved by the Romanian government in October 2014 (Government Decision No. 909). As regards the use of evaluation, the main objective of the strategy is to improve the regulatory framework for the design of the evaluation system and to strengthen the capacity of institutions to collect, correlate and interpret data in order to adequately substantiate public interventions. The strategy has set as priority activities the development of procedures and instruments for monitoring and evaluation, from the perspective of both RIA and the mid-term and ex post evaluation of public interventions. Implementation of the strategy is also supported through the OPAC from 2014 until 2020.


2.1.4 Guidelines and Tools for Evaluation Practice

During the last decade, soft measures, mainly represented by instructions, methodologies, guides and procedures, prevailed over hard measures such as legal provisions. To a large extent they have been developed within the framework of various projects aimed at consolidating administrative capacity, funded mainly from EU funds. Most of these projects were implemented during the 2005–2009 period and were designed to foster the use of evaluation within the central public administration. The Secretariat General of the Government (SGG) had the main role in this respect. For example, under the guidance of the SGG, the Handbook of Methods used in planning and evaluation of impact of public policies was elaborated in 2006. It was later used by ministries and public authorities as a reference in establishing internal procedures for evaluation or in setting the requirements included in terms of reference for contracting out these types of services. As a result of the implementation of the 2014–2020 OPAC, over the last two years efforts to develop new instructions and guidelines on evaluation have increased within some projects funded through this programme. OPAC-funded interventions are expected to help strengthen the evaluation culture in Romania. The strategies and normative acts specific to the first generation of the institutionalisation of evaluation of public interventions in Romania did not emphasise aspects related to ensuring adequate independence, quality and budgeting. The emphasis was on highlighting the efficiency and effectiveness of the planning, coordination and control of public interventions. Only the second generation of regulations ensured a gradual diversification of evaluation functions, including those related to accountability and legitimacy, as well as enlightenment. These provisions are also much clearer in terms of the temporality and cyclicality of evaluation exercises. At the same time, aspects related to independence, performance and impact, as well as adequate budgeting for evaluation activities, are strengthened. For example, by transposing the approaches and principles of the European Commission’s guidelines and instructions, the National Strategy for Evaluation (NSE) can be considered the most representative document in this respect.


Even though the 2014–2020 strategic documents set out the scope and use of evaluation, there is currently a change of focus away from the impact evaluation of public interventions. At the level of ministries and central administration authorities, evaluation focuses on the analysis of organisational performance based on the implementation of internal managerial control standards. As a consequence, several public policy units, instead of evaluating the impact of public interventions, actually assess the implementation of internal managerial control standards and design operational procedures specific to the internal managerial control system (Stăvaru 2015). Such a misappropriation of the public policy units’ activity indicates that decision-makers avoid using evaluations to unveil deficiencies of public interventions.

2.1.5 Budget Allocation for Evaluations

In terms of the need to ensure appropriate budgets for evaluation activities, the regulatory framework is elusive. Different guides and manuals recommend that public authorities should allocate distinct budgets for various types of evaluations based on evaluation plans. For example, in the 2010–2012 period, with the support of the World Bank, several functional analyses were conducted on different public sectors, and the recommendations clearly stated that, in the context of the progressive transition to multi-annual programmes, public authorities will inevitably have to allocate distinct budgets for the evaluation activities of those public programmes. So far, the transition from annual budgeting by economic sectors to multiannual budgets for public programmes has not been operationalised, and therefore distinct allocations for evaluation activities are not possible except where funding comes from external sources. To increase the use of evidence in the planning of public policies, some authorities have managed to allocate financial resources for evaluations through sectoral Research and Development (R&D) plans. These plans contain projects designed to evaluate institutional performance as well as projects aimed at evaluating the efficiency, effectiveness, impact and relevance of public policies and programmes in those particular sectors. Nevertheless, this type of budget allocation has not become a generalised practice, even though it has been institutionalised since 2004 through Government Decision No. 1266. The practice of budgeting evaluation activities based on sectoral R&D plans is more likely to be found in sectors with a social impact (Ministry of Labour, Ministry of Health, etc.) than in regulatory processes or the infrastructure sector. In 2016, in the context of the technocratic government, several ministries requested ad hoc evaluations of policies and programmes under implementation, with separate budgets allocated for these services.


For example, in the case of the Ministry of Youth and Sport, an independent evaluation of the implementation of the Youth Programmes for the 2010–2015 period was requested, as well as technical assistance from the World Bank to carry out an RIA on the Volunteer Law. It should be noted that no such complex exercises had previously been conducted within the respective ministry, and decision-makers had preferred not to require independent evaluations of any kind.

2.1.6 Evaluation and Use of Evaluation Findings in Parliamentary Activities

Political activity has not stimulated the development of the evaluation culture at the level of the Romanian public administration. Members of parliament, for example, have not been particularly preoccupied with the evaluation of public interventions. Their interpellations addressed to ministries or authorities are concerned rather with the enforcement of laws and normative provisions. By 2007, the NSE mentioned that a practice of contracting out evaluation exercises did not exist at the level of the parliament (Ministerul Finanțelor Publice 2006). The support structures of the Romanian parliament have not considered developing a specific unit responsible for planning and conducting evaluations to help draft evidence-based legislative proposals. Instead, statistical data provided by statistical institutes, including Eurostat, or secondary data extracted from European Commission communications underpinning EU legislation are used. The results of evaluations conducted within different ministries and other public authorities are rarely invoked in the process of drafting legislative projects. One of the notable initiatives targeting an increased use of evaluation by members of parliament was the conference ‘Responsibility and good governance through evaluation’, organised by the Romanian Association for Evaluation in cooperation with CENTRAS in 2010. The aim of the event was to discuss the prospects for the use of evaluation in the work of parliamentarians, gathering also representatives of non-governmental organisations (NGOs), experts and journalists.3

3 For more info, see http://www.evalrom.ro/dezbatere-parlament5.


2.2 Evaluation Practice

As mentioned before, in Romania the use of evaluation has not yet become a recurrent, generalised practice that is readily accepted by decision-makers or civil servants. Evaluation activities are usually implemented in response to different conditionalities and have not been institutionalised as an intrinsic routine within the public policy implementation cycle. The main scope of evaluation activities has been to analyse the relevance, efficiency, effectiveness and, to some extent, the sustainability of different public programmes. Most often, evaluation is used for public programmes with social impact, for which new public interventions increasingly rely on RIA. At the same time, due to legal requirements, RIA has in recent years been used more and more in areas where new regulations or significant adjustments to existing regulations are proposed. While many of these are based on poor methodology, it is expected that maximising the use of RIA will also have a multiplying effect on evaluations conducted for public programmes, outlining their impact.

A key factor for the development of evaluation relates to the financing source of the interventions. As stated earlier, in most cases periodical evaluations are not foreseen for interventions financed from the national budget. By contrast, for interventions supported by external donors in areas such as education, health, social assistance, environmental protection, energy or public administration, evaluations are mandatory and are conducted according to specific evaluation plans established at the beginning of the technical support. For example, in the higher education and research systems, the first organisational and institutional evaluation exercises were conducted in 1997–2001 (Toderaș and Stăvaru 2015) as part of a PHARE structural reform programme (Universitas 2000) and of the loan granted by the World Bank (International Bank for Reconstruction and Development 1996). Within certain components of the higher education and research system, the practice of evaluation has become recurrent in specialised agencies for which distinct resources are allocated. For example, in the multi-annual R&D and innovation programmes, evaluations are carried out periodically, both to highlight the impact of the programme and to programme the new cycles of strategies and programmes. For the operational programmes specific to the European Structural and Investment Funds (ESIs), according to EU regulations, evaluation is mandatory and is based on highly structured evaluation plans.
In these cases, the European Commission plays a coordination role which also includes verifying compliance with the evaluation plans and, where appropriate, requiring managing authorities to conduct additional ad hoc evaluations in order to gather evidence for programme management. For these operational programmes, ex ante and intermediate evaluations prevail. Even though, according to the provisions of the European Funds Regulations, ex post impact evaluations are mainly the responsibility of the European Commission, some national ex post evaluations have been finalised in recent years.

One of the arguments of decision-makers for their hesitation to use evaluation for programmes financed from the national budget is that these strategic documents do not include specific requirements for conducting such activities; these programmes are thus implemented by default year after year, without signalling any need for evaluation. This argument is also supported by the fact that, through its audit missions, the Court of Auditors verifies the compliance of the interventions with the legal financial provisions. Such compliance checks do not encourage public authorities to undertake supplementary activities to analyse the impact of implemented public programmes. Some ministries produce periodic reports on different initiatives and interventions, as well as sociological research such as barometer surveys, rather than evaluations; these outputs can later feed into longitudinal evaluations. In ministries where evaluation studies are performed, they are more process oriented, aiming to adapt the interventions to new socio-economic contexts rather than to analyse the impact of those interventions.

In terms of internal and external evaluations, to a very large extent evaluation services are contracted out. The public policy units within the ministries and agencies do not have enough personnel specialised in evaluation. The limited evaluation capacity is also influenced by relatively low income in the public sector, alongside a temporary blockage in the recruitment of civil servants. While a shortage of human resources specialised in the evaluation of public interventions was acknowledged at the end of the 2000s, the number of professionals has since increased. Moreover, in order to ensure the necessary level of independence and neutrality of evaluation studies, given the current functional structure of Romanian ministries and authorities, it is quite difficult to create an autonomous structure that could carry out in-house evaluations.
For example, for the evaluation of the operational programmes specific to ESI funds, outsourcing is motivated by the need to optimise the coherence and quality of the evaluation exercises. In this particular case, however, there is an autonomous specialised structure responsible for evaluation activities. For the current financial framework, the Ministry of European Funds manages five operational programmes. The evaluation function at the level of the Partnership Agreement and of the operational programmes was delegated to the Programmes Evaluation Unit. Nevertheless, this evaluation structure specific to the European funds management system does not have enough administrative capacity to carry out internal evaluations.

In addition to the initiatives of administrative structures, a very limited number of evaluations have been carried out independently, at the request of international organisations or on the initiative of the private sector. For example, in 2015 a private operator, a stakeholder of the programme, requested an independent evaluation of the 'National Roll and Milk Programme in the context of the Milk Scheme in Schools'. The results of this evaluation were used both by the private operator who contracted the evaluation, which adapted its milk production and distribution procedures in schools, and by the Ministry of Agriculture, which adapted its national programme to the provisions of the EU Milk Scheme. Since little information is available on private demand for evaluation studies, this remains a topic for more complex research.

External evaluations are contracted through public procurement procedures, which are mandatory by law for evaluations financed from the national budget as well as for those funded from the EU budget. Very often, procurement procedures encounter significant delays, which in turn delay the start of the evaluation exercises. The quality of evaluation reports is variable, both in terms of methodological robustness and in terms of the reliability of the results. Until 2016, public procurement legislation encouraged the use of the 'lowest price' award criterion for contracts, including for evaluation services. Over the past two years, a shift in the legislation has permitted and encouraged contracting authorities to use the 'value for money' criterion for the procurement of evaluation services, with a much higher share given to the quality component. In some situations, external evaluations requested by different ministries or other central government agencies are awarded directly to research institutes or universities specialised in the relevant areas. This approach is also encouraged by the Strategy for strengthening the system of planning, coordination and implementation of public policies at the level of central public administration (Government Decision No. 870).


Given that evaluation is a relatively recent practice within a system characterised by a limited evaluation culture, an independent evaluation institution has not yet been established, either as a research institute or as a public agency. Nor has such a structure been created for a specific sector or policy field. In some areas, evaluation functions are the responsibility of national research institutes specialised in different fields. For example, in education, the Institute of Education Sciences is often commissioned by the Ministry of Education to conduct thematic evaluations. Similarly, the National Scientific Research Institute for Labour and Social Protection carries out thematic evaluations contracted by the Ministry of Labour and Social Justice. Although the SNE project included a concrete proposal to create such a structure, as already mentioned, this goal has so far not materialised. A recent attempt in this sense was made during the 2014–2016 period when, based on the experience gained from the Delivery Unit within the Chancellery of the Prime Minister, a distinct structure was planned as part of the reorganisation of the directorate responsible for policy coordination and programmes within the General Secretariat of the Government. Following the United Kingdom model, the Delivery Unit's role was to independently evaluate the performance and progress of the implementation of priority public interventions. The activity of this structure was supported by European funds. In that arrangement, a Strategic Unit was the next structure to be created at the level of the Chancellery of the Prime Minister, to evaluate the relevance, effectiveness and impact of priority interventions, but this endeavour did not become operational. After the finalisation of the technical assistance project, the Delivery Unit gradually reduced its activity and in 2017 it was permanently discontinued. At the same time, the goal of the Strategic Unit was reoriented towards the organisation of public awareness campaigns. The main obstacle to creating a central structure is the limited demand for evaluation and, implicitly, the very low pressure on the centre of government to efficiently coordinate evaluation processes. Another important factor is the lack of initiatives from civil society and citizens to exert constant pressure on public authorities to provide, in a transparent and accountable manner, the evidence underpinning political decisions.


2.3 Use of Evaluations

The results of evaluations are often used by programme or project managers and other administrative staff. In the case of interim evaluations, they use the results to report progress on the implementation of the public interventions concerned. In the case of ex post exercises, results are used to underpin new interventions or to make decisions about changing elements of the implementation process. So far, the central public administration has not encountered situations in which the results of an evaluation were invoked to decide on the interruption or closure of a programme. In addition, evaluation results are used on a short-term basis, usually right after the finalisation of the exercise. Nevertheless, the programming exercises for the 2014–2020 period used the results of interim evaluations to support some of the changes and directions proposed for financing, based on the lessons learned highlighted in the studies. So far, the creation of a national repository containing the existing evaluation studies has not materialised. Their results serve different sectors or related areas in a fragmented manner and allow only limited external monitoring of how and to what extent the results were used and the recommendations implemented. Over time, the public discourse of political decision-makers has highlighted that they are not concerned with the results of evaluations or the lessons learned from previous experiences. They continue to substantiate political decisions based on political responses to the problems emphasised by the different governing parties.

As regards the evaluation of the operational programmes specific to ESI funds, the quality management of these exercises is currently ensured through Evaluation Steering Committees (ESC) as well as through scientific committees. Each ESC includes actors relevant for the subject or sector to be evaluated as well as representatives of the social partners. The role of the ESC is to analyse and improve the Terms of Reference, the inception report and the final evaluation reports and to formulate opinions on the clarity of the reports and the usefulness of the recommendations. The role of the Scientific Committees is to ensure a higher quality of evaluations in relation to the relevance of the themes and approaches proposed in the Terms of Reference, the robustness of the proposed methodologies, the accuracy of the data used in the evaluation, the quality of the analyses and the impartiality of the results (Ministry of European Funds 2012).

3 Societal Dissemination/Acceptance (Social System)

3.1 Institutionalised Use of Evaluations by Civil Society

Fostering the use of evidence-based approaches for different policies and programmes has been an important mission of national and sectoral NGOs for more than two decades. Active NGOs and associative structures such as the Romanian Academic Society, the Institute for Public Policy, Freedom House Romania, the Romanian Centre for European Policies or the Coalition for the Development of Romania conduct evaluation studies and policy analyses and, to a large extent, create the necessary premises to discuss and debate the findings at different events to which decision-makers, media representatives and key actors are invited.

The extent to which evaluation findings are used is analysed here along the two dimensions of the Romanian evaluation system: on the one hand, the European funds management system, and on the other hand, the national evaluation system of public policies and programmes, which covers all public interventions at the level of both parliament and government. As Cerkez (2009) mentions, even though there is a minimal legislative framework on evaluation, monitoring and evaluation activities in Romania remain the exception rather than an administrative practice. In this context, a recent study on evaluation practices in the Visegrad 4+4 countries (Czech Republic, Hungary, Poland, Slovakia plus Bulgaria, Croatia, Romania, Slovenia), contracted by the Polish National Evaluation Unit within the Ministry of Economic Development, showed that the users of the evaluation findings “[…] are mostly employees of their own institution—managers of other units, department directors. Only in the case of Hungary and Romania senior public administration staff was given more priority than unit managers” (Olejniczak et al. 2017, p. 42). Moreover, as regards evaluations conducted outside the ESI arrangements, in no country (including Romania) is there any conviction that evaluations produced by national systems serve local politicians (e.g. ministers, parliamentarians) or EU institutions (idem). There is little knowledge regarding the extent to which the various analyses and evaluations conducted by different NGOs influence referenda or, more broadly, decision-making processes.
According to Aioanei (2015), the relatively limited use of evaluation studies in the planning and implementation of programmes financed from the EU budget is influenced by factors such as: national and sectoral regulations as well as the design of the institutional system; the quality of evaluations; the level of dissemination of the evaluation findings; the general character of the recommendations; and difficulties at the implementation level that reoriented the focus and the need for evaluations. In addition to these factors, the national discourse on evaluation and a relatively low level of understanding of evaluation among decision-makers may further hinder the uptake of evaluation results.

As stated in the previous section, the demand for evaluation studies comes from the central administrative structures, among which those responsible for the planning, implementation and evaluation of EU-funded programmes make the most visible efforts to disseminate the information, through conferences, presentations, working groups, etc. Moreover, the reports are fully available in an online library that gathers more than 200 titles of various evaluations and sectoral analyses.4 Thus, knowledge resulting from evaluation activities is available to the general public, but there is no consistent information on whether the findings of the reports are used for academic purposes or to advance the interests of different actors such as NGOs or the private sector. The participatory approach to the preparation of the 2014–2020 strategic documents may represent a very good example of a contribution to a consolidated use of evaluation findings by the members of civil society actively involved in the monitoring processes, namely the members of the Monitoring Committees. According to Aioanei (2016), the main factors contributing to the relatively low use of evaluation findings are the quality of the evaluation studies and the interest of the stakeholders in the evaluative process. Nevertheless, as the author mentions, a higher quality of the evaluation studies does not necessarily imply an increased level of utilisation of evaluation findings. Another element that should be taken into consideration is the design of the evaluation system, which includes both the institutional framework and the regulation of the evaluation processes. Over the past two decades, the level of involvement of key actors in the elaboration and implementation of public interventions has increased considerably.

4 For additional information, see http://www.evaluare-structurale.ro/, the website of the e-library, which also has an English version.


The need for the thematic and sectoral expertise that characterises the profiles of NGOs has been a major factor driving their increased participation, especially in the planning phase of public programmes financed under the EU budget and in the dissemination activities dedicated to presenting the achieved results through different public events. Extensive involvement of civil society members in evaluation exercises was registered in the evaluation of the Sectoral Operational Programme Human Resources Development 2007–2013, but this was largely due to the fact that they were beneficiaries of the interventions foreseen by the programme. According to a three-year study on the measurement of the evaluation culture in the context of EU Cohesion Policy in Romania, conducted by the Ministry of European Funds, the level of participation of civil society in evaluation-related activities is perceived by the relevant actors as medium. According to the e-survey respondents, the part of civil society that should be more involved in evaluation-related activities is represented by professional associations, think tanks and NGOs (Ministry of European Funds 2014). For these programmes, many NGOs were involved in evaluation activities, considering that they benefited from the EU interventions, as stated earlier. Nevertheless, the 2014–2020 programming period has brought a new orientation regarding the involvement of relevant stakeholders in the implementation of EU Cohesion Policy. Thereby, the partnership principle creates new opportunities to actively involve civil society members in evaluation activities in the coming years. Further positive changes are expected in this particular area, given the collaborative approach required by the legislation in force. The steering committees that govern the evaluation plans for each operational programme include key stakeholders from civil society among their members.

3.2 Public Perception and Discussion of Evaluation and Evaluation Findings

According to the findings presented in the 'Final Report on the Framework to Analyse the Development of Evaluation Capacity in the EU Member States', in 2008 Romania had extremely limited evaluation experience, but great emphasis was put on the incentives and the strong commitments made by the relevant actors regarding the integration of evaluation in both EU-financed programmes and national policy domains (European Commission 2008). In this sense, two concrete initiatives were the adoption of the National Evaluation Strategy and the continuous support for the development of a professional evaluation association.
The external pressure for the efficient use of resources represented a stimulus for the financed institutions, alongside the demand for results from the European institutions. With regard to national policies, there were no such motivations that could contribute to the development of monitoring and evaluation activities.

The level of knowledge about evaluation is also influenced by the relatively limited promotion and visibility of these types of activities. As mentioned earlier, the evaluation reports dedicated to the analysis of EU-funded interventions are fully available on an electronic platform (the Evaluation Library). In contrast, evaluation reports elaborated for other public programmes are made available extremely rarely; in some cases only executive summaries may be consulted. Understanding of the information included in the reports, and of how to work with it, is generally limited because they are treated more as 'internal documents'. One of the main causes of this practice is an exaggerated fear of the effects of publishing or subsequently distributing the underlying databases (Stăvaru 2015). According to Cerkez (2009), the institutions that conduct evaluation studies for their own policies do not make the reports publicly available on their websites, in an attempt to downplay or even hide potential failures. There is a fear of the possible negative consequences for the image of the institution or the public image of different civil servants or officials.

Moreover, the communication of evaluation results encounters some difficulties. During the 2007–2013 programming period, national and international conferences were organised that offered the opportunity to present a wide range of analyses and evaluation reports and their findings. In this sense, we mention two major events, namely the International Conference 'Evaluating for 2014–2020: Evidences and Experiences' of 26–27 April 2012 in Bucharest and the National Conference 'Evaluation of Regional Operational Programme Impact' of 3 March 2016 in Bucharest. According to the analysis of the evaluation culture in the context of EU Cohesion Policy in Romania, the number of public events per year related to evaluation and the dissemination of evaluation evidence, other than Monitoring Committee meetings, is very low, namely between zero and three events per year (Ministry of European Funds 2012). Yet, over the last years, the term 'evaluation' has been used more and more frequently by officials in public discourse, even if the concept may have various meanings.

3.3 Civil Societies Demand Evaluations

The political climate of recent years, together with developments in legislative proposals and in the preparation and implementation of public initiatives, has drawn increased attention from civil society to the efficiency and impact of these measures. The demand for evidence from civil society organisations and the general public has grown considerably. This increase in demand is complemented by a continuous need for transparency in decision-making and accountability for public expenditure.

4 Professionalisation (System of Professionalisation)

4.1 Academic Study Courses, Further Training Etc.

The need for academic training in evaluation has been underlined since 2006, when different analyses of the national evaluation culture were conducted. The supply of local evaluators, regardless of discipline, was considered small, and the development of the evaluation services market was circumstantial rather than systematic (Curley and Perianu 2006). Although the interest of national universities in developing training programmes on evaluation has been expressed constantly, the first master programme was accredited only in 2009, at the National University of Political Studies and Public Administration in Bucharest: Evaluation of European Public Programmes and Policies (Clain 2017). In 2013, based on this model, two other universities, the University of Oradea and Babeș-Bolyai University in Cluj-Napoca, developed master programmes in the area of evaluation. In addition to these master programmes, a number of universities in Romania have introduced training modules in evaluation within broader curricula focusing on public policy and administration or regional development studies (Transilvania University of Brașov, the University of Bucharest or the University of Craiova). Even though a master degree in evaluation has existed in Romania for several years now, the quality of the studies is not highly valued in the evaluation community (Clain 2017). Moreover, as Clain mentions, the non-formal education offer has also evolved positively, with training opportunities provided by both private companies and NGOs in the area of evaluation of projects funded by European funds. Nevertheless, the role of professional associations in this area is limited, with low involvement in both the promotion of formal education provision
and the promotion or provision of opportunities for the continuing professional training of evaluators (Clain 2017).

4.2 Profession/Discipline

The Romanian Journal of Evaluation was founded under the aegis of the Institute for Evaluation of European Public Policies and Programmes within the National University of Political Studies and Public Administration. The journal was bilingual and planned as a biannual publication. Its main scope was to provide the necessary premises for the academic community and practitioners to promote in-depth evaluations and studies, to foster understanding of different areas of analysis and to develop a culture of evaluation in the public space. The journal was open to contributions from diverse backgrounds, encouraging both theoretical and empirical work. A first volume was launched in 2013, but no subsequent issues were developed. In addition to this initiative, different professional journals coordinated by various university departments (political science, public administration, European studies or regional development), NGOs and think tanks offer both researchers and practitioners the possibility to present the results of their work, for example the Romanian Journal of Political Science, published by the Romanian Academic Society.

Regarding the evaluation profession and professional associations, the report on developing evaluation capacity released by the European Commission in 2008 revealed that an informal Romanian Evaluation Network (EvalRom) had been set up in July 2006. EvalRom was formally established in 2008 and acts as an open network which brings together professionals (both commissioners and practitioners) working in the evaluation of programmes and projects from the public and private sectors. Its overall purpose is to contribute to the inception and consolidation of an evaluation culture in Romania (European Commission 2008). Alongside EvalRom, two other Voluntary Organisations of Professional Evaluators (VOPEs) were created: in 2010, the Romanian Society of Evaluators and, in 2012, the Association for the Development of Evaluation in Romania. Their main objective is the promotion of evaluation and the development of an evaluation culture in Romania. The activities performed by these associations are based on the voluntary contribution of their members and include professional workshops, conferences, regional seminars and joint projects implemented together with other VOPEs, such as the Macedonian Evaluation Network, the Israeli Association for Program Evaluation or the Swiss Evaluation Society.
Within the projects implemented by the VOPEs, there have been initiatives focused on the elaboration and promotion of different guidelines and standards, for example ethical guidelines, but none of these actions has been finalised; at present, the standards used for evaluation activities have been developed only by central institutions such as the General Secretariat of the Government and the Ministry of European Funds. According to the information presented by the president of EvalRom during the EvalPartners International Forum in 2012, the main challenges in developing the association were the immaturity of the national evaluation market (modest size and limited diversity), extremely limited financial resources and mobilisation capacity of the organisation, as well as the local cultural context, dominated by low trust and a blame culture. Even though the activities and projects implemented by the VOPEs have a relatively broad coverage, their number is insufficient in relation to the development needs that characterise the existing evaluation culture and capacity. The support provided by these associative structures has not been constant and has only partially complemented the evaluation-related actions undertaken by public administration units, namely support for an increased demand for evaluation exercises, a higher level of knowledge and participation of key actors, and a higher quality of evaluation results and their use (Iacob 2018).

To date, a certification system for evaluators does not exist in Romania; the only qualification is the master degree obtained by graduates of the master programmes. The evaluation market is mostly dominated by large consultancy firms that easily fulfil the procurement selection criteria and are contracted to conduct evaluations. Usually, the evaluation teams are mixed, including both national and external evaluators who gained their knowledge and practice through similar projects in Romania and other countries. According to the findings of the measurement of the evaluation culture, “[…] overall, the Romanian evaluation market is considered as partially competitive since evaluation services are most of the times assigned to a limited number of players. Furthermore, international firms are more present than local firms on the Romanian evaluation market (13 international, seven national) and the involvement of universities in evaluation activities is limited” (Ministry of European Funds 2014, p. 20).


4.3 Compliance to Standards and Quality Obligations

General orientations and standards to guide evaluation exercises have not been elaborated by the professional associations, even though there have been various initiatives in this sense. Within the central public administration, a set of quality standards was elaborated and discussed, for example within the PHARE project 'Development of a professional evaluation community', but never officially approved. The quality of the processes, as well as of the evaluation reports for programmes financed from the EU budget, is ensured through specific procedures, including quality assessment grids. These procedures are mandatory for both evaluation managers and evaluators; the latter are even required to have an internal quality assurance system. The quality management of evaluations is most often provided on the basis of distinct procedures applied at the level of the contracting authorities, using quality grids to assess the reports.

5 Conclusion

In conclusion, compared to the previous decade, the level of knowledge of and reflection on evaluation has increased notably, at the level of the public administration and of the evaluation market, but also among citizens. At present, evaluation is perceived as a rather positive practice, distinct from control and audit activities. At the same time, in public discourse, evaluation has acquired connotations that differ from those of the EU pre-accession period, such as strengthening equity and transparency or improving the relevance and suitability of interventions. In this context, there is a certain confidence that the processes of strengthening evaluation capacity and culture will intensify.

References

Aioanei, M. (2015). Utilizarea evaluărilor în contextul Sistemului Instrumentelor Structurale din România. In M. Cerkez (Ed.), România în schimbare. Evaluarea în contextul programelor și proiectelor cu finanțare externă (pp. 23–100). Bucharest: Tritonic Books.
Aioanei, M. (2016). Improving structural interventions through the evaluation process: Case study—The regional operational programme of Romania 2007–2013. Europolity, 10(2), 33–74.
Cerkez, Ș. A. (2009). Construirea capacității de evaluare la nivelul sectorului public din România. In M. Cerkez (Ed.), Evaluarea programelor și politicilor publice. Teorii, metode și practici (pp. 117–141). Iași: Polirom.
Clain, A. (2017). Educație și formare profesională continuă în domeniul evaluării de politici și programe publice în România. In N. Toderaș & A.-M. Stăvaru (Eds.), Evaluarea programelor și politicilor în România (pp. 141–198). Bucharest: Tritonic Books.
Curley, H., & Perianu, E. (2006). Analiza culturii de evaluare din România. Raport realizat în cadrul Proiectului Phare de Asistență Tehnică pentru Programare, Monitorizare și Evaluare. http://discutii.mfinante.ro/static/10/Mfp/evaluare/Assessment_Rom_Eval_CultureRO.pdf. Accessed on 27 February 2019.
European Commission. (2008). Developing evaluation capacity—A study for the commission. Final report on the framework to analyse the development of evaluation capacity in the EU member states (European Union Regional Policy). http://ec.europa.eu/regional_policy/sources/docgener/evaluation/pdf/report_integrated_2007.pdf. Accessed on 29 January 2019.
Government Decision No. 1006. (2001, October 4). Approval of the government strategy on accelerating the reform in public administration.
Government Decision No. 1266. (2004, August 25). Approval of the methodological norms regarding the contracting, financing, monitoring and evaluation of the projects in the sectoral research and development plans.
Government Decision No. 775. (2005, July 14). Approval of the regulation on procedures for the elaboration, monitoring and evaluation of public policies at central level.
Government Decision No. 870. (2006, June 28). Approval of the strategy for the improvement of the public policy elaboration, coordination and planning system in the central public administration.
Government Decision No. 909. (2014, October 15). Approval of the strategy for strengthening public administration 2014–2020 and the establishment of the national committee for coordination of the implementation of the strategy for strengthening public administration 2014–2020.
Government Decision No. 523. (2016, July 20). Amending and completing the regulation on the procedures for elaboration, monitoring and evaluation of public policies at central level, approved by the government decision No. 775 of 2005.
Iacob, T. D. (2018). Rolul organizațiilor non-guvernamentale în dezvoltarea capacității de evaluare a politicilor publice din România (PhD thesis). National University of Political Studies and Public Administration, Bucharest.
International Bank for Reconstruction and Development. (1996). Reform of higher education and research project procedures and criteria—Loan No. 4096 RO. http://documents.worldbank.org/curated/en/202741468095364819/pdf/84E94240DAA26DE985256F03000A2A98.pdf. Accessed on 31 January 2019.
Law No. 24. (2000, March 27). Law on the normative technical standards for drafting legislative acts.
Law No. 87. (2006, April 16). Law for the approval of government emergency ordinance No. 75 in 2005 regarding the quality assurance of education.
Ministerul Finanțelor Publice. (2006). Proiectul Strategiei Naționale de Evaluare pentru perioada 2007–2013. http://discutii.mfinante.ro/static/10/Mfp/evaluare/NationalEvaluationStrategyNov29-06_RO.pdf. Accessed on 29 January 2019.
Ministry of European Funds. (2012). Measurement report of evaluation culture in the context of EU cohesion policy in Romania, first measurement cycle.
Ministry of European Funds. (2014). Measurement report of evaluation culture in the context of EU cohesion policy in Romania, third measurement cycle.
Olejniczak, K., Kupiec, T., Wojtowicz, D., & Felcis, W. (2017). Evaluation in V4+4 countries—Overview of practices. Final report. https://www.ewaluacja.gov.pl/media/52288/OverviewV4_4_RK_2017.docx. Accessed on 29 January 2019.
Perianu, E. (2008). Politicile Publice în România. De la cultura raportării la cultura evaluării. In C. Crăciun & P. E. Collins (Eds.), Managementul politicilor publice: transformări și perspective (pp. 267–288). Iași: Polirom.
Stăvaru, A.-M. (2015). Întărirea capacității administrative. Dezvoltarea sistemului național de evaluare a intervențiilor publice în România: design, cultură și capacitate de evaluare. Bucharest: Tritonic Books.
Toderaș, N. (2017). Evaluarea—instituționalizare și cultură organizațională. In N. Toderaș & A.-M. Stăvaru (Eds.), Evaluarea programelor și politicilor în România (pp. 13–54). Bucharest: Tritonic Books.
Toderaș, N., & Stăvaru, A.-M. (2015). Evaluation capacity building as a means to improving policy making and public service in higher education. In A. Curaj, L. Deca, E. Egron-Polak, & J. Salmi (Eds.), Higher education reforms in Romania (pp. 87–104). Dordrecht: Springer.
Toderaș, N., & Stăvaru, A.-M. (2017). Evaluarea programelor și politicilor în România. Impact, răspundere publică și cultura organizațională. Bucharest: Tritonic Books.

PART VI

Transnational Organisations

CHAPTER 18

European Union

Roland Blomeyer and Ansgar Eussner

1 General Information

This chapter deals with the development of evaluation in the European Commission (EC), the European Parliament (EP) and the Council of Europe (CoE). These institutions share the common objective of fostering cooperation and integration in Europe. However, they have different memberships, founding dates and charters, focus and ways of operating. They also differ largely in terms of staff numbers and budget size. The CoE was created in 1949 with 10 founding members and is thus the first European organisation. Its main aim is to promote peace and political cooperation in Europe. Today it has 47 member countries, that is, all European countries except Belarus and the Vatican, which are observers. The CoE headquarters are in Strasbourg, and it has field offices in about a dozen member countries which are not members of the European Union (EU).
It has a staff of about 2400 and an annual budget of about 230 million Euro, including about 50 million Euro of extra-budgetary resources, of which around 70% come from the EC. The EU was created in 1957 by six founding members as the European Economic Community. It deepened and expanded into the EU with the Treaties of Maastricht in 1992 and Lisbon in 2007. It went through several enlargement rounds and today counts 28 members. The offices of the EC are in Brussels. The EC has a staff of about 20,000, representations in all 28 member countries and an annual budget of about 145 billion Euro. The European Council was established as an informal summit of heads of state in 1975 and was formalised in 2009 upon the entry into force of the Lisbon Treaty. The EP is the directly elected parliamentary institution of the EU. Together with the Council of the EU and the EC, it exercises the legislative function of the EU. It has a staff of about 6000 and an annual budget of 1.9 billion Euro. Its headquarters are in Brussels and Strasbourg, where it holds a week-long plenary session every month.

This chapter first presents the institutional set-up of the evaluation function in the EU and its development over the past thirty years; this includes the evaluation policy, the guidelines, the practice over the last years and the different stakeholders involved. This is followed by an analysis of the practice and the use of evaluations in the EC. The chapter is based on a literature review of the EU evaluation policy, a review of EC documents regarding evaluation, and the authors' experience in conducting evaluations for the different EU institutions.

2 Development of the Evaluation Framework of the EU: Institutional Set-Up and Processes1

Over its years of existence, the EU has developed its own evaluation system composed of the EC, the EP and the Council (Stern 2009). Another actor in the EU evaluation framework is the European Court of Auditors (ECA), which has increased its production of special reports, similar to performance audits, over the past twenty years (Stephenson 2015). The EC, as the EU executive body, is responsible for commissioning and implementing evaluations and disseminating their findings (Hojlund 2014), and has the legal obligation to evaluate, as stated in the legal bases of many programmes and policies.

1 Co-authored by Elsa Perreau.

As a result, the EC undertakes the largest number of evaluations, of which 80% are externalised. The EC defines evaluation in the 2017 Better Regulation Guidelines as “an evidence-based judgement of the extent to which an existing intervention is effective, efficient, relevant given the current needs, coherent both internally and with other EU interventions, and has achieved EU added-value” (EC 2017a, p. 53).

The EU evaluation approach has evolved significantly over the past thirty years, expanding progressively from expenditure to regulatory policies. In parallel, the link between ex-ante and ex-post evaluation has also been strengthened. These two elements characterise the EU's 'new approach to policy evaluation'. Four phases can be identified in the development of this new approach (Hojlund 2015). Between 1980 and 1994, evaluation was mainly meant to produce hierarchical accountability, “emphasising procedural rules and hierarchy in order to deliver effective policy formulation according to the Commission mandate” (Hojlund 2015, p. 39). During that period, evaluation was unsystematic and decentralised, and took place in only a few Directorates-General (DG), creating learning at the DG level only.

After a sharp increase in the size of the general EU budget, financial accountability became the main incentive to conduct evaluations between 1995 and 1999. Evaluation was then limited to spending programmes. DG Budget and DG Financial Control were responsible for common evaluation guidelines, and each DG was to build its evaluation capacity and prepare evaluation plans for its programmes. Learning from evaluation was limited to the DGs' evaluation units without reaching the policy units.

Limited learning at the policy level continued over the period from 2000 to 2006. The focus of evaluation changed from expenditure programmes to all EC activities. The evaluation system became linked to the EC's internal control standards, which gave DG Financial Control the competence to review the evaluation systems of each DG. Legal accountability (complying with legal requirements) was the main incentive to produce evaluations. Evaluation was also included in the EC's strategic planning, and the results of evaluations were to be made easily available. Each DG adopted an evaluation plan and a multi-annual evaluation programme, and DG Budget adopted an evaluation programme. This period saw the introduction of impact assessments (IA) in the EC toolbox in 2002. These assessments incorporate aspects related to the cost-effectiveness of the action,
as well as the traditional aspects that IAs focus on (economic and environmental impact, etc.). In 2006, the EC launched its Better Regulation Agenda with the objective of reducing the administrative burden resulting from EU legislation. The Agenda aimed inter alia to increase the use of IAs, and the Impact Assessment Board (IAB) was created to control the quality of IAs produced by the EC. This responded to criticisms of the objectivity and independence of IAs, given that they are produced by the DG leading the policy proposal. In 2015, the IAB was replaced by the Regulatory Scrutiny Board.

From 2007 to 2014, the Smart Regulation agenda succeeded the Better Regulation agenda and a stronger focus was put on policy learning. Systematic regulatory evaluation was introduced in the EC, and evidence-based policy making became one of the main objectives of evaluation, along with democratic accountability (Hojlund 2015). In 2009, the Secretariat General (SG) of the EC became the lead unit responsible for evaluation, thus replacing DG Budget. The Smart Regulation Agenda put the emphasis on giving a central place to evaluation in the policy process, linking it to ex-ante assessments, for example IAs. The Smart Regulation Communication introduced fitness checks, which are an “evaluation of a group of interventions which have some relationship with each other (normally a common set of objectives), justifying a joint analysis” (EC 2017a, p. 53). Fitness checks can be used to identify burdens, inconsistencies, overlaps and gaps that result from the accumulation of regulations in one policy sector. Fitness checks are considered complementary to individual evaluations because they provide an assessment of the performance of the framework, without providing as many details as individual assessments.

In 2015 and 2017, the EC published communications on Better Regulation: Better Regulation for Better Results—An EU Agenda (2015) and Completing the Better Regulation Agenda: Better Solutions for Better Results (2017b). The 2017 EC Communication repeats the three pillars formulated in the 2015 EC Communication: (1) systematic impact assessment of new proposals; (2) 'evaluate first'; and (3) stakeholder engagement (EC 2015). These three pillars are in line with the trends identified by Smismans (2015) and constitute the new evaluation approach of the EU. A new inter-institutional agreement between the EC, the EP and the Council was signed in 2016. Over the past years, increasing emphasis has been put on consulting stakeholders and promoting a participatory approach in law making. In 2016, the EC launched the 'contribute to law-making' website, an online page detailing the various possibilities for stakeholders to
contribute to the legislative process. The EC continues to aim to integrate evaluation fully into the policy cycle. The Regulatory Scrutiny Board “now examines selected evaluations and systematically asks whether—in line with the ‘evaluate first principle’—an evaluation has been done to support the impact assessment report where legislation is amended” (EC 2017b, p. 3).

Other tools linked to evaluation but aiming mainly at reducing the regulatory burden were also introduced recently. In 2012, the EC introduced the Regulatory Fitness and Performance (REFIT) programme. It aims to “review the entire stock of EU legislation—to identify burdens, inconsistencies, gaps or ineffective measures and to make the necessary proposals to follow up on the findings of the review” (EC 2013, p. 2). In 2015, the REFIT platform was set up to collect feedback on performance and on how to make EU law simpler. REFIT provides the frame for the Better Regulation tools. Cumulative Cost Assessments (CCA) “are a specific type of study/research intended to assess EU related regulatory costs on an industrial sector” (EC 2017c, p. 317). These assessments also look at the policy level but are not considered as evaluations by the EC, even though the Better Regulation Toolbox guidelines state that they provide an important input for evaluations. For this reason, CCAs are considered a valuable addition to the Better Regulation toolbox (Schrefler et al. 2015). All these tools contribute to the evaluation system of the EC, even though they may not formally be considered evaluations.

Four key features can be identified in the current evaluation approach of the EU on the basis of the various EC communications mentioned above (2013, 2015 and 2017b). First, evaluation is meant to apply to all EC activities (whether regulatory or expenditure). Second, the 'evaluate first' principle states that all new activities must take place after the assessment of past actions. Third, evaluation is linked to the REFIT programme and the objective of reducing inconsistencies and burden in the regulatory framework. Finally, the objective of evaluation is to produce policy learning, and not only financial and programme or project learning (EC 2017b).

3 Evaluation Practice and Use in the EC

As seen above, the rationale for conducting evaluations has evolved over the past years, creating an evaluation system aiming at both accountability and policy learning. Along with these changes, the practice and the use of evaluation
have changed. Evaluation use in the EC has been little studied and there is little data available on this.

Evaluation Practice

The Better Regulation agenda aims for a uniform practice of evaluation across policy sectors. The 2017 EC Communication on Better Regulation provides an overview of the evaluation activities since their launch in the EC. As of August 2017, 798 evaluations and 1028 IAs had been conducted. All financial and important legislative activities are supposed to be evaluated. However, in practice it seems that not all major legislation is evaluated (Van Voorst and Mastenbroek 2017). Van Voorst and Mastenbroek (2017) found that the EC tends to prioritise the evaluation of major directives over major regulations. They also found that pieces of law were more likely to be evaluated if the responsible DG had an evaluation unit, as well as if the DG had evaluation guidelines. This confirms that evaluation capacity is an important factor to consider when looking at evaluation practice in the EC. Indeed, DGs are responsible for evaluating their policies and programmes but they do not all have the same resources available, even if about 80% of the EC's evaluation work is outsourced. As a result, “there is much variation in the extent to which Directorates-Generals prioritise evaluation as well as in the amount of human and technological capital that they invest in evaluation” (Van Voorst 2017, p. 24). This variation can be explained by the differences in the budget allocated to each DG. A study on the use of evaluation in the EC found that the development of standards (such as guidelines) fosters the practice of evaluation among DGs, as well as the use of evaluation results at the operational level (De Laat 2005). Moreover, DGs' activities differ from one another and DGs are likely to use different evaluation tools, depending on what they work on. DGs dealing primarily with regulatory tasks (ENER, MARKT, ENV,2 etc.) tend to adopt IAs rather than conduct ex-post evaluations. By contrast, DGs dealing primarily with expenditure policies (AGRI, DEVCO, REGIO, RTD,3 etc.) tend to produce more ex-post evaluations than IAs. In general, the incentive for a DG to evaluate is not always clear, despite the recent focus put by the EC on policy learning.

2 DG Energy, DG Internal Market (now DG GROW), DG Environment.

3 DG Agriculture and Rural Development, DG International Cooperation and Development, DG Regional and Urban Policy, DG Research and Innovation.


Hojlund refers to an inter-service study indicating that “the majority of evaluation unit staff considered legal requirement to be the main reason why evaluations were undertaken” (2015, p. 43). Finally, the practice of ex-post legislative evaluation is relatively patchy and results from legislative obligations rather than from DGs' own initiative (Mastenbroek et al. 2016). Smismans (2015, p. 25) argues that there is a push towards “using evaluation as an instrument to address the regulatory burden, rather than as a learning system aimed at gathering the best evidence for policy-making in the broadest possible way”.

Evaluation Use in the EC

Evaluation use in the EU has been little studied. However, the available resources make it possible to draw an overview of the potential and effective use of the evaluations produced by the EU. It seems that evaluation use, just as evaluation implementation, takes place more in the EC than in the EP or in the Council (Hojlund 2014). An analysis of 220 ex-post evaluations showed that only 16% of them were followed up by questions in the EP (Zwaan et al. 2016). This analysis also showed that members of the EP usually use evaluation for agenda-setting purposes, and not so much to hold the EC accountable. The analysis concludes that the main variable for the use of evaluation in the EP is the level of conflict between the EP and the EC during the legislative process. In 2010, the ECA published a special report on the use of IAs in policy-making. The ECA found that IAs contributed to policy-making. However, the report also found that the EP only rarely refers to the IAs conducted by the EC; IAs are not systematically discussed in the EP. Use of IAs in the Council seems to be limited as well (ECA 2010). Moreover, it seems that evaluation use takes place more at the programme level than at the policy-making level (Hojlund 2014). This does not mean that there is no use of evaluation at the policy level. In fact, using the example of the LIFE4 programme, Hojlund (2014) found that policy-makers tend to use evaluations for advocacy and to obtain information, and that they use them much less than managers at the programme level.

4 LIFE is an EU financial instrument supporting environmental and nature conservation, and climate action projects throughout the EU. It was created in 1992 and has funded more than 4500 projects (Source: EC website on LIFE).
A study on the use of evaluation suggests that the political importance of the interventions, their expected socio-economic impact or their visibility are not key determinants of evaluation use (De Laat 2005). The same analysis further states that other information inputs (IAs, expert opinions, etc.) contribute more significantly to policy-making than evaluations. One explanation can be that the evaluation system of the EU produces a lot of competing information, given the numerous tools it has at its disposal. This tends to reduce the relevance of evaluation to policy-makers (Hojlund 2014). Evaluations are also mostly used after their completion (findings use), and not so much during their implementation (process use) (Hojlund 2014). Finally, the link between ex-ante and ex-post evaluation is not always as clear as it should be according to the EC guidelines. Research shows that “the proportion of IAs making use of an available ex-post evaluation (65%) is much larger than the proportion of ex-post evaluations making use of an available IAs (17%)” (Van Golen and Van Voorst 2016, p. 402). The ECA 2010 report on the use of impact assessments found that ex-post evaluations often differed significantly in focus from the corresponding IAs (ECA 2010). This suggests a varying integration of evaluation in the policy cycle.

4 The Development of the Evaluation Function at the CoE

This section first presents the institutional set-up of the evaluation function in the CoE and its development since its beginnings in 2008; this includes the evaluation policy, the guidelines, the practice over recent years and the different stakeholders involved. This is followed by an analysis of the use of evaluations by CoE managers and governing bodies, including the follow-up to evaluation recommendations. Finally, compliance with evaluation standards, membership in professional organisations and the process of professionalising the evaluation practice are analysed.

4.1 Overview of the CoE

The primary aim of the CoE is to create a common democratic and legal area throughout the whole European continent, ensuring respect for its fundamental values in the areas of human rights, democracy and the rule of law. This has not changed since the founding of the CoE in 1949, as documented in the Statutes signed by the ten founding member countries on 5 May 1949 in London.5

The European Convention on Human Rights (ECHR) is the central instrument for this;6 over the years, a number of other conventions have been developed and adopted.7 The main tool used is standard-setting through the elaboration of norms (legally binding or not); about 200 conventions in many policy areas were concluded over recent decades. Compliance by member states with these standards is regularly monitored by experts reporting to monitoring bodies. Member states which require support for the implementation of the standards can obtain it through cooperation projects, which are mainly funded by extra-budgetary resources (about two thirds by the EC). The interactions between these three elements have been called the dynamic triangle and have been the subject of several evaluations in recent years.

4.2 Institutional Set-Up and Processes for Evaluations

As in most international organisations, there is increased pressure from member states to improve performance and efficiency. The motto is to achieve more results with fewer resources; for several years, zero nominal growth budgets have been forcing the CoE to concentrate on priority programmes. In this context, it becomes more important to document the results and demonstrate their impact rather than to report only on activities planned and implemented. This implies an increased interest in evaluations among member states, although not among all of them. During discussions on evaluation plans, budgets and reports in the Committee of Ministers (CM), there are a few vocal supporters, most of them from the northern and north-western parts of Europe, while others are indifferent, hesitant or even reluctant to give evaluation an increased role and visibility. In this context it is important to note that in the CoE, member states are represented by their foreign offices

5 See the Statutes of the Council of Europe as documented on the UN Treaties website https://treaties.un.org/doc/Publication/UNTS/Volume%2087/volume-87-I-1168-English.pdf.
6 See the text of the ECHR at https://www.echr.coe.int/Documents/Convention_ENG.pdf, and also the 2017 evaluation on the CoE Support to the Implementation of the ECHR at National Level on DIO's website https://rm.coe.int/evaluation-of-the-council-of-europesupport-to-the-implementation-of-t/168079721a.
7 An overview is provided on the CoE website https://edoc.coe.int/en/378-thecouncil-of-europe-in-brief, and also on https://www.nyulawglobal.org/globalex/Council_of_Europe.html. For future challenges see https://edoc.coe.int/en/an-overview/7915ready-for-future-challenges-reinforcing-the-council-of-europe.html.


which are much less used to evaluations than ministries for development cooperation. In 2008, the Committee of Ministers (CM) approved the Evaluation Policy, setting out the objectives, the scope and the role of the different stakeholders for evaluations in the CoE (CM 2008). This policy remained unchanged until 2014, when some minor revisions were approved by the CM, clarifying in particular that all areas of CoE programme activities, including those conducted under the Partial Agreements (adhered to only by some, not all, member states), are within the evaluation universe, meaning that they can be included in the annual evaluation work programme, a provision which had been contested by some member states (CM 2014). The Evaluation Policy was again updated in November 2019, when the CM approved the document CM(2018)159-final, taking inspiration from the UNEG and OECD DAC evaluation norms and standards (UNEG 2016; OECD DAC 2010). The revision clarifies, among other issues, that "DIO-managed Council of Europe evaluation reports are made public on the DIO intranet and internet sites within two months after the Director of Internal Oversight has declared them being final, whether or not a management response has been received" (Council of Europe 2019). This new provision should end problems with the earlier version of the Evaluation Policy, which had stipulated that the SG decides on the use of the evaluation reports. This had resulted in some delays in their publication, and in some cases only the executive summaries were published, a practice which diminished the independence of the evaluation function. DIO was created in 2010 as part of the reform process of the CoE initiated by SG Thorbjørn Jagland shortly after his election in 2009. Internal Audit, which had existed since 2001, was merged with the rudimentary evaluation unit in the planning department, and additional staff and budget were provided. Evaluation staff was further increased in 2012 to four evaluation professionals and one assistant (a position cut in 2016), and the operational budget of DIO was doubled to 130,000 euros. The first evaluations were presented in 2010/2011. In 2012, evaluation guidelines were prepared and distributed to all staff, and evaluation training programmes were launched. The guidelines were updated in 2014, and they will again be updated to reflect the latest changes made in the Evaluation Policy in November 2019. In an effort to further strengthen the governance and oversight system, an independent external advisory committee, the Audit Committee, was established in 2008: three experts were elected by the CM for three years


in their personal capacity after an open call for candidates had been sent to all member states. The initial members were all auditors and the Committee focused on audit issues. In 2014, the membership was enlarged to five: an evaluator and an investigator supplemented the three internal auditors, and the name was changed to the Oversight Advisory Committee (OAC). This reflected the broader scope of DIO's work and the increased attention the Committee paid to evaluation and investigation issues. The OAC meets three times per year, consults regularly with senior management, provides an annual report to the CM and comments on ongoing debates, such as the one on the revision of the Evaluation Policy.8 In this context, DIO organised a seminar with CoE staff, representatives of member states and other international organisations with a view to promoting the evaluation culture.9

8 See the links to the Annual Reports of the OAC at https://www.coe.int/en/web/internal-oversight/reports.
9 See the summary video at https://www.coe.int/en/web/internal-oversight/news-views.

4.3 Acceptance and Use of Evaluations in the CoE

Evaluations are only useful if they result in organisational learning processes and if their recommendations are accepted and implemented. This requires continuous communication before, during and after the preparation of the evaluation reports. An effective evaluation work programme begins with the selection of relevant and timely evaluation topics. They should reflect the interests and requests of major stakeholders (SG, CM, member states, Major Administrative Entities [MAEs], as well as donors including the EC). The topics chosen should have strategic relevance and anticipate a contribution to reform initiatives and organisational learning. They should also help to identify major risks, cover a significant part of the programmes and projects and enhance the coherence of action among MAEs and/or various sources of funding. In order to identify the most urgent and relevant evaluation topics, DIO conducts consultations with senior managers in all major departments; these consultations also cover the timing of the evaluations. Recent evaluation topics included, for example:

• Effectiveness of the CoE support to the implementation of the European Convention on Human Rights at national level.
• Human Rights Education for Legal Professionals Programme.

• Centre of Expertise for Local Government Reform.
• Results-based budgeting system.
• Capacity Development Programmes.
• World Forum for Democracy.
• Regional Programmes.
• Election observation and support.
• Co-operation of international non-governmental organisations with the CoE in standard-setting and monitoring.
• Intercultural cities programme.
• Gender mainstreaming in technical co-operation programmes.
• Country programme Ukraine.
• The North-South Centre.

The next element for enhancing the acceptance of evaluations by managers and governing bodies is to demonstrate the use of adequate methodologies and rigour in applying them. The sampling of interview partners, for example, was at times hotly debated in the CM, and it was crucial to show that the selection had been made with strict impartiality. In DIO, mostly mixed methods are used for data collection, consisting of desk studies, web-based surveys, semi-structured interviews and focus groups, the results of which are triangulated for validation. For more technical issues or for large evaluations, consultants are hired to reinforce the DIO evaluation team. DIO also regularly offers training courses to staff dealing with evaluations of projects, in order to enhance their understanding of the methods available and their awareness of the potential benefits and risks of conducting evaluations. The Evaluation Division also provides ad hoc advice and on-the-job coaching, for example on developing evaluation terms of reference and contracting suitable consultants. A new web-based form has been created to facilitate requests for support for decentralised evaluations. In addition, DIO prepared a strategy paper on its support to decentralised evaluations, which outlines how further support for their preparation and implementation will be provided. In order to increase the chances that most recommendations are swiftly implemented, it is essential to limit their number, to make concrete suggestions for enhancing programme performance in terms of processes and results, and to develop several options if required. For recommendations that are addressed to several managers, one manager is designated to take the lead on their implementation. It is important to involve the managers in drafting


feasible recommendations which are ambitious but also take the existing constraints in terms of budget, human resources and political support into account. One way of focusing on the essential recommendations is to distinguish between core recommendations, which are monitored, and 'good-to-have' items, whose implementation is left to the discretion of the managers. All recommendations are presented as draft action plans which are appended to each draft final evaluation report and discussed and, if possible, agreed with managers prior to issuing the final reports. The Evaluation Division collects feedback from MAEs once a year on the implementation of all recommendations that are still pending. In order to better keep track of and report on the implementation status of its recommendations, DIO has developed a database for the follow-up to recommendations. In 2016, the progress made in implementing each recommendation was entered into the tool. The implementation rate of recommendations two years after their issuance is a key performance indicator for the Evaluation Division. At the end of 2016, 71% of the evaluation recommendations issued since the beginning of 2015 had been fully implemented (CM 2017). The implementation status of recommendations is reported annually to the OAC and the governing bodies. The OAC regularly discusses the evaluation reports in some detail in the presence of the evaluators and of the managers concerned, who are asked to provide comments on the evaluation process and the usefulness of the evaluation. The governing bodies rarely discuss individual reports, but their members have access to them. In the discussion of DIO's annual report, which contains summaries of all reports, they provide feedback and sometimes ask detailed questions on substance and methodology. In addition, several seminars with representatives of member states and CoE staff were organised by DIO, at which individual evaluation reports were presented and discussed in detail. To sum up, the successful use of evaluation recommendations depends on involving senior and middle management from start to finish, listening to their viewpoints, concerns and proposals, and taking the constraints they face into account. It is essential for the evaluators to become trusted advisers to managers at various levels of the hierarchy rather than being perceived as a control body. Evaluators need to be sufficiently experienced and to continuously improve their knowledge of management processes and of the substantive issues at hand in order to be able to talk with managers on an equal footing. Significant progress has been made in DIO in this respect, but it is a gradual process and cannot be achieved in a short time span.


4.4 Professionalisation of the Evaluation Function at the CoE

The professionalisation of the evaluation work at the CoE advanced with the hiring of three professional evaluators in 2013/2014, who continued to take courses online and in selected seminars. A database of 80 evaluation consultants was built up, from which specialised consultants were recruited in several cases to strengthen the evaluation team. As mentioned above, comprehensive evaluation guidelines were prepared in 2012 and updated in 2014. The DIO director recruited in 2011 (who retired in May 2017) was an experienced evaluator who had worked for nearly 20 years in senior evaluation positions in several agencies of the UN. The deputy director was appointed director in November 2017. DIO became an institutional member of the European Evaluation Society in 2012 and has contributed regularly to the discussions at its biennial meetings. Two peer reviews conducted in 2011 and 2017 with the support of the Swiss Development Agency confirmed that the evaluation work has applied the United Nations Evaluation Group (UNEG) evaluation standards (UNEG 2016).10 Unfortunately, several attempts made by DIO to participate as an observer in the annual meetings of UNEG and the Organisation for Economic Co-operation and Development (OECD)/Development Assistance Committee (DAC) evaluation group have not been successful, as the CoE is not considered eligible for membership by these bodies. Challenges ahead include keeping the competencies of the evaluation staff up to date and increasing those of other CoE staff dealing with evaluations. It would be useful to create a community of evaluation practitioners at the CoE, possibly through discussions on a dedicated website and/or occasional seminars or lectures. Increased exchanges of information with evaluators in other international organisations would also be welcome, although an attempt to create such a network through a meeting at the European Evaluation Society biennial conference in Dublin in 2014 was not followed up by further activities beyond individual contacts and exchanges. Gaining observer status in UNEG and the OECD/DAC evaluation group would enrich and widen the view of international debates in the field of evaluation.

10 On the UNEG website, the evaluation policy documents of numerous UN agencies can also be found.


The same holds true for participation in the annual meetings of national evaluation societies, such as DeGEval, SEVAL or the evaluation societies of France and the United Kingdom.

5 Comparative Conclusions

It is obvious that evaluation has a much longer history in the EU than in the CoE, where it started only in 2008 and became formalised in 2010, while in some DGs of the EC, for example DG Development, evaluation has been on the agenda for more than 30 years. The number of evaluations carried out in the EC is also much larger than in the CoE. A further distinguishing element is that the EU is the largest donor of extra-budgetary resources for the CoE and expects regular evaluations and audits of projects funded with these resources. These evaluations are sometimes quite critical and recommend improvements to the planning and implementation processes at the CoE. A common element is the increased attention paid to evaluation by the governing bodies for both accountability and learning purposes. While not all member countries are equally interested, an active minority expects regular and thorough evaluations and links them to funding decisions.

References

Committee of Ministers—CM. (2008, November 25–27). Evaluation policy for the Council of Europe. 1042nd meeting. https://search.coe.int/cm/Pages/result_details.aspx?ObjectId=09000016805d27f8. Accessed on 16 May 2019.

Committee of Ministers—CM. (2014, April 16). Review of the implementation of the evaluation policy 2008–2012. 1197th meeting. https://search.coe.int/cm/Pages/result_details.aspx?ObjectID=09000016805c602d. Accessed on 16 May 2019.

Committee of Ministers—CM. (2017, March 22). Annual report 2016 of the directorate of internal oversight. 1282nd meeting. https://search.coe.int/cm/Pages/result_details.aspx?ObjectId=09000016806f417c. Accessed on 16 May 2019.

Council of Europe. (2019). New evaluation policy. https://rm.coe.int/cm-2018159-final-new-evaluation-policy/168098d5bf, para. 40.

De Laat, B. (2005). Study on the use of evaluation results in the commission. Brussels: European Policy Evaluation Consortium.

European Commission. (2012). Communication from the commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions—EU regulatory fitness. Brussels: European Commission.

European Commission. (2013). Communication from the European Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions—Strengthening the foundations of smart regulation—Improving evaluation. Brussels: European Commission.

European Commission. (2015). Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions—Better regulation for better results: An EU agenda. Brussels: European Commission.

European Commission. (2017a). Staff working document—Better regulations guidelines. Brussels: European Commission.

European Commission. (2017b). Communication from the commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions—Completing the better regulation agenda: Better solutions for better results. Brussels: European Commission.

European Commission. (2017c). Better regulation toolbox. Brussels: European Commission.

European Court of Auditors. (2010). Impact assessments in the EU institutions: Do they support decision-making? (Report No. 3/2010). Brussels: European Court of Auditors.

Hojlund, S. (2014). Evaluation use in evaluation systems—The case of the European Commission. Evaluation, 20(4), 428–446.

Hojlund, S. (2015). Evaluation in the European Commission: For accountability or learning? European Journal of Risk Regulation, 6(1), 35–46.

Mastenbroek, E., van Voorst, S., & Meuwese, A. (2016). Closing the regulatory cycle? A meta evaluation of ex-post legislative evaluations by the European Commission. Journal of European Public Policy, 23(9), 1329–1348.

OECD DAC. (2010). Evaluating development co-operation—Summary of key norms and standards (2nd ed.). Paris: OECD.

Schrefler, L., Luchetta, G., & Simonelli, F. (2015). A new tool in the box? The cumulated cost assessment. European Journal of Risk Regulation, 6(1), 68–78.

Smismans, S. (2015). Policy evaluation in the EU: The challenges of linking ex ante and ex post appraisal. European Journal of Risk Regulation, 6(1), 6–26.

Stephenson, P. (2015). Reconciling audit and evaluation? The shift to performance and effectiveness at the European Court of Auditors. European Journal of Risk Regulation, 6(1), 79–89.

Stern, E. (2009). Evaluation policy in the European Union and its institutions. New Directions for Evaluation, 2009(123), 67–85.

United Nations Evaluation Group—UNEG. (2016). Norms and standards for evaluation. New York: UNEG. https://www.betterevaluation.org/en/resources/example/UNEG-evaluation-standards-2016. Accessed on 15 May 2019.

Van Golen, T., & Van Voorst, S. (2016). Towards a regulatory cycle? The use of evaluative information in impact assessments and ex-post evaluations in the European Union. European Journal of Risk Regulation, 7(2), 388–403.

Van Voorst, S. (2017). Evaluation capacity in the European Commission. Evaluation, 23(1), 24–41.

Van Voorst, S., & Mastenbroek, E. (2017). Enforcement tool or strategic instrument? The initiation of ex-post legislative evaluations by the European Commission. European Union Politics, 18(4), 640–657.

Zwaan, P., Van Voorst, S., & Mastenbroek, E. (2016). Ex post legislative evaluation in the European Union: Questioning the use of evaluation as instruments for accountability. International Review of Administrative Sciences, 82(4), 674–693.

PART VII

Synthesis

CHAPTER 19

The Institutionalisation of Evaluation in Europe: A Synthesis

Reinhard Stockmann, Wolfgang Meyer and Lena Taube

1 Introduction

The countries examined here exhibit an array of differently structured government and social systems. In addition to parliamentary monarchies and democracies, there are (semi-) presidential systems of government, with one directorial system also in place. Whilst some are characterised by proportional representation, which tends to produce multi-party governments, others use a first-past-the-post system, which favours single-party governments. Some countries are organised on an extremely federal or central basis; some have a high public spending ratio and a well-developed welfare state, others less so. In some countries, associations, trade unions and the media play a pivotal role; in others, civil society is organised to only a small degree.


There are also strong differences between European countries in relation to their level of expenditure on science, research and development, and in the way these fields are organised. Unsurprisingly, countries also differ greatly in terms of how advanced the institutionalisation of evaluation is. In this chapter, the respective country reports will be used to form the basis of a systematic synthesis. As a starting point, the institutional structure of evaluation within the political system will be delineated. However, as the existence of evaluation regulations is by no means the same thing as their practical application and implementation—or indeed as the utilisation of evaluation results—these aspects will also be subjected to closer analysis, before our attention is turned to examining the same questions within the social system. Here, public perception of evaluation within civil society, and the participation of the latter in—and its demand for—evaluation, play a key role. As a next step, the degree of professionalisation of evaluation in Europe will be examined, in order to determine how evaluation offerings are structured, whether these meet the demand emanating from the political sphere and civil society—in a quantitative and qualitative sense—and whether they are able to provide any innovative impetus. In a further step, the three sub-areas examined will be analysed in terms of their interrelatedness. The aim is to identify clusters in terms of the way in which the political and social demand for evaluation is structured—and in terms of the degree of professionalisation on the supply side—and to detect any linkage between these aspects. Lastly, an attempt will be made to put forward some preliminary explanations for the different development paths and the divergent degrees of institutionalisation of evaluation in Europe. The methodology employed is to use the country reports published here as the data basis, and to perform a cross-sectional analysis of this data, using the analytical model developed to provide structure. This entails a number of difficulties: Firstly, information is not available for all elements of the analytical model in all reports; secondly, the terminology is not always used in a consistent way—for example, the central term 'evaluation' is sometimes used in a very broad sense, encompassing performance measurement, audit, accreditation, quality assurance, impact assessments, and even "research or other related activity" (Poland, p. 409). As a justification for this, it is stated that due to overlapping in practice, it is now difficult to separate these methods from one another (Denmark, p. 48). In the case of Latvia, there is


an additional problem stemming from the lack of any concept which could feasibly be deemed as corresponding to ‘evaluation’, meaning terms such as ‘assessment’, ‘valuation’, ‘inspection’ or ‘checking’ are also encompassed (p. 378). Despite these issues of clarity, it is nonetheless possible to generate a meaningful overview of the institutionalisation of evaluation in Europe.

2 The Institutionalisation of Evaluation in the Political System

If we examine the embedding of evaluation within legislation, it is conspicuous first and foremost that hardly any general national laws prescribing the use of evaluation exist (cf. Table 1, column 1). The same applies to the existence of central country-wide policies or strategies on policy evaluation. Switzerland and France are particular exceptions to this rule: since 2000, article 170 of the Swiss Federal Constitution has stipulated "that parliament shall ensure that federal measures taken will be evaluated with respect to their effectiveness" (Switzerland, p. 204). In France, an evaluation clause was introduced into the constitution in 2008 (p. 141). On top of this, there is an array of further laws containing evaluation clauses at a national as well as cantonal level. In the Netherlands, the Government Accounts Act ensures that "ministers in the national government are legally responsible for evaluating their policy products periodically" (The Netherlands, p. 90f.). Each budgeted overarching policy goal must be examined not only with regard to its effectiveness and efficiency but also with regard to a range of further aspects, including the relevant policy logic. There are thus hardly any country-wide, national evaluation laws or policies; what there are instead are regulations and decrees (cf. Table 1, column 2).

• In Germany, the Federal Budget Code (FBC) requires the implementation of performance measurement, as well as impact and cost-effectiveness control, within all ministries. The Joint Rules of Procedure of the Federal Ministries also stipulate that such controls must be carried out (p. 172).
• In France, the first decree on this subject was issued in 1990 and "has thus given birth to a regulatory and institutional system aiming at strengthening the space of evaluation at the national level" (France, p. 142).

Table 1 Legislative institutionalisation of evaluation and evaluation use

Countries covered (rows): Belgium, Czechia, Denmark, Finland, France, Germany, Ireland, Italy, Latvia, Netherlands, Poland, Portugal, Romania, Spain, Switzerland, UK. Columns: Institutionalisation of evaluation: national laws and strategies (a), national decrees (b), organisational embedding (c), role of evaluation in audit office (d), Σ; Use of evaluation: role of parliament (e), sectoral spread (f), scope of evaluation practice (g), Σ; Overall mean.

a Here, only those laws were included that have a comprehensive national validity; as national laws, only such laws are considered which were passed by the parliament (existing = 1, not existing = 0)
b As national decrees, all regulations are considered which are not laws (as described under a), but refer to the whole nation, i.e. not only to particular policy fields (existing = 1, not existing = 0)
c Hereby is meant the existence of specific evaluation units in government institutions (such as ministries) and/or the existence of independent stand-alone bodies of evaluation (existing in both categories = 1, existing in one category = 0.5, existing in no category = 0)
d National audit offices carry out not only performance audits (which are limited to the evaluation of goal achievement [effectiveness] and/or efficiency), but also evaluations with a broader focus (yes = 1, no = 0)
e Two criteria were evaluated: (A) Parliament has an evaluation unit at its disposal and commissions evaluations; (B) Parliament regularly takes note of and discusses evaluation results (both criteria met = 1, one criterion = 0.5, no criterion = 0)
f Degree of spread across sectors: the number of sectors (policy fields) mentioned in the case studies in which comprehensive evaluations are carried out (seven or more = 1, five to six = 0.5, four or fewer = 0)
g Intensity and frequency with which evaluations are carried out; qualitative assessments according to the case studies, with the ratings given in brackets in the text

• In Finland too, "there is no single 'master' act on evaluation", but rather a whole host of different government acts, statutes and regulations (Finland, p. 66).
• In the United Kingdom (UK), only a few relevant regulations exist at a national level (p. 252).
• In Latvia, there is a range of legislative acts which create a formal normative framework for policy evaluation (p. 380).

Whilst regulations at a national level are rather scarce, in all countries there are laws, regulations or policy strategies relating specifically to a certain policy field. In most cases, the individual ministries decide what is to be evaluated, why, and to what extent. Regulatory Impact Assessments (RIA), which are discussed in detail in some case studies, can be considered part of the legislative institutionalisation of evaluation. Strictly speaking, however, as the name RIA already implies, they are not ex-ante evaluations but assessments. For this reason, RIA practice in the various countries under study was not included in the rating of the institutionalisation of evaluation and is only briefly mentioned here: The aim of RIA is to improve the quality of law-making, by making government action more effective and limiting state intervention to a minimum. RIA is designed to help assess the probable consequences and side-effects of legislative provisions. To this end, various analyses are carried out, for the most part cost-effectiveness checks using the Standard Cost Model (Ziekow 2015). In some countries, other elements are examined too, such as administrative simplification, gender, sustainable development, or policy coherence for development. In contrast to what is possible within an ex-ante evaluation, here the objectives of a certain regulation are not questioned; rather, it is merely a case of appraising the implementation process and its consequences. In several countries, internal evaluation units have been set up for this purpose, which in some cases also carry out self-evaluations. This is the case in the following countries (cf. Table 1, column 3):

Belgium: For example, the Evaluation Office for Development Cooperation (p. 121).
Finland: For example, the Prime Minister's Office, the Ministry for Foreign Affairs and FINEEC (The Finnish Education Evaluation Centre, p. 75).


Germany: For example, the Federal Ministry for Economic Cooperation and Development (p. 175).
Ireland: For example, the Central Expenditure Evaluation Unit combined with finance and audit units (p. 231).
Italy: For example, UVAL, the unit for the evaluation of investment programmes within the Department for Cohesion Policy (p. 280).
The Netherlands: Each policy department has a division with responsibility for accountability, control and evaluation (p. 98).
Poland: For example, the National Evaluation Unit in the Ministry of Investment and Development and 16 regional evaluation units at the Marshal Offices, as well as an analogous unit in the Polish Agency for Enterprise Development.
Spain: For example, the Evaluation Division of the Ministry of Foreign Affairs and International Development.

However, in many countries it is unclear whether these are genuinely internal evaluation units, or rather units within which evaluation—alongside accounting, controlling or other functions—plays a certain role. In some countries there are also independent stand-alone bodies for evaluations. For example, in Switzerland, there is the "Parliamentary Control of Administration", which reports to parliament (Switzerland, p. 210); in Denmark, the "Danish Parliamentary Ombudsman" (Denmark, p. 46); and in Italy, the "Impact Assessment Office" (Italy, p. 278). In addition, there are a number of independent evaluation bodies which belong to specific policy fields, for example in:

Denmark: The Center for Patient Experience and Evaluation and several further research and accreditation bodies (p. 42).
Germany: The German Institute for Development Evaluation (p. 175).
Finland: There are three independent units with an evaluation mandate, namely the "Government Policy Analysis Unit", which was created in 2014 under the Prime Minister's Office and—among other things—also commissions evaluations (p. 66); EVA-11, which is an evaluation unit under the Ministry for Foreign Affairs (p. 73); and FINEEC (The Finnish Education Evaluation Centre, p. 75).
Spain: The State Agency of Public Policy Evaluation and Quality of Services (AEVAL) (which existed until 2017, p. 306).
UK: The Independent Commission for Aid Impact (p. 257).


Another feature of the political institutionalisation of evaluation is the role that audit courts play in the evaluation system of a country. In some countries—listed below—this role is particularly important (cf. Table 1, column 4)1:

Belgium: In some cases, evaluation activities go beyond performance audits, also encompassing "policy evaluations with a substantial judgment on outcome and impact" (Belgium, p. 128).
Czechia: The National Audit Office not only carries out audits but also evaluations (p. 361).
Denmark: "The National Audit Office…has moved still more into evaluation in the form of management and performance auditing…" (Denmark, p. 46).
Finland: The National Audit Office does carry out evaluations, despite these usually going by the name of performance audits (pp. 68–69).
Latvia: The State Audit Office undertakes independent evaluation studies that go beyond mere performance audits (p. 391).
Netherlands: The Court of Audit may decide on its own initiative to undertake evaluations itself. Moreover, "regional and local audit offices have been established by law, focusing on the outputs and outcomes of decentral policies" (The Netherlands, p. 93).
Portugal: The Supreme Audit Court increasingly also carries out evaluations (p. 335).
Switzerland: Between 2000 and 2014, the Swiss Federal Audit Office undertook 56 evaluations (p. 210).

The role of parliaments relates not only to their law-making power, i.e. their ability to pass evaluation laws or enshrine evaluation clauses in other legislation, but also to their ability in some countries to instigate or even directly commission evaluations. On top of this, parliaments can be important users of evaluation results. Below is an overview of the situation in this regard (in alphabetical order) (cf. Table 1, column 5).

1 This only covers cases (in alphabetical order) in which the audit court not only undertakes performance audits but explicitly evaluations too.


• In Belgium, the institutionalisation of evaluation in parliament is low. Reasons cited for this are "the lack of specific structures and the absence of evaluation-specific budgets" (Belgium, p. 122). In 2011, "at the federal level, a parliamentary committee for legislative evaluation" (Belgium, p. 122) was founded, but it is not actively used.
• In the parliament in Czechia, evaluations are barely acknowledged (p. 363).
• In Denmark, there is a Parliamentary Ombudsman, whose responsibility it is—among other things—"to control and evaluate the performance of public institutions" and "to conduct thematic evaluations about the well-being of citizens" (pp. 46–47). From time to time, the parliament also commissions ad hoc evaluations of major initiatives and reforms.
• Whilst the parliament of Finland—with its "Committee for the Future"—has its own think tank, the role played by evaluation is a minor one. The results of academic studies are indeed dealt with in parliament, but there is no particular focus here on evaluation studies (p. 76).
• Evaluation does not play much of a role within the French Parliament. All attempts at creating "specific structures for evaluation within the Parliament turned out to be non-effective several times" (France, p. 146).
• Evaluation is not institutionally embedded within the German Bundestag. In keeping with this, public consultations on evaluations are rare. However, the growing interest taken by parliament in evaluation can be seen in the huge increase since the dawn of the new millennium in the adoption of evaluation clauses within legislation, as well as in the frequency with which the term 'evaluation' crops up in plenary session minutes (Germany, p. 180).
• In the parliament in Ireland, evaluations only rarely feature. The Oireachtas (parliament) Library and Research Service provides, among other things, 'digestible material' on the subject of evaluations 'for busy parliamentarians' (p. 232).
• Evaluation is hardly relevant at all within the parliament in Italy, and neither is evaluation embedded in the legislative process (p. 278). The Senate has a unit for evaluation, but without a budget (p. 278).
• The Latvian Parliament, which long had no analytical unit, recently established one and entrusted it also with an evaluation function. The unit's mandate is much larger than its capacity, and it has not yet been able to carry out any evaluation (pp. 383–384).


• Although the parliament in the Netherlands does not have any distinct evaluation structures, evaluation reports are often discussed in parliament and "parliamentarians quite regularly ask for evaluation studies to be conducted" (The Netherlands, p. 96).
• Whilst the Polish Parliament has a "research and analytical office" (Poland, p. 409), this body hardly deals with evaluation at all.
• The Portuguese Parliament has at its disposal a "Technical Unit for Budget Support", but evaluation does not form part of this body's focus (Portugal, p. 335).
• In Romania, evaluation results barely have an influence on the work of parliament (p. 445).
• In the Spanish Parliament, evaluation results are not acknowledged: "Parliamentarians do not take into account nor stress the need for evaluation". By the same token, there is no interest in evaluation within the political parties (Spain, p. 310).
• Evaluation is firmly established within the Swiss Parliament. National and cantonal parliaments "play an essential role as both initiators of evaluations and as utilizers of evaluations". Parliament's primary reason for commissioning evaluations is "to exercise their oversight powers" (Switzerland, p. 215).
• The British Parliament does not have its own evaluation structures (p. 254).

The comments made above demonstrate that evaluation only features in European parliaments to a small degree. Switzerland once again stands out here, not only because it utilises evaluations for its own work, but also because it has distinct units at its disposal and commissions its own evaluations. In Denmark, too, Parliament occasionally requests evaluations itself. The only other countries in which the parliament concerns itself with evaluation are the Netherlands and Germany. Utilisation of evaluations to facilitate rational decision-making as defined by Mertens (1998) is of course only possible if evaluations are actually carried out. Therefore, the following section will examine in which sectors evaluations are often performed; the following feature most prominently2:

2 The number of mentions made in the country reports is shown in brackets.

• Education, schools (14)
• Development cooperation (13)
• Science, research and development, innovation (11)
• Health (11)
• The labour market, employment, the economy (11)
• Social affairs policies (9)
• The environment (8)
• Urban and spatial planning (6).

Areas that are mentioned either very rarely or not at all include, inter alia, defence (with the exception of France and Finland), foreign policy, culture (except in France and Latvia), the judiciary (except in Finland and Italy), home affairs, and agriculture (except in Finland). Only in the case of the Netherlands is it noted that "evaluation takes place in practically every policy domain" (The Netherlands, p. 97). In France, the varying intensity of evaluation across different sectors is seen as being due to the fact that "mainly policies facing legitimacy issues are evaluated while those considered as fundamental are less questioned" (France, p. 148). In some countries—above all the evaluation newcomers—evaluation activities are primarily concentrated within sectors which receive European Union (EU) support and for which there is therefore a requirement for evaluations to be conducted in accordance with EU guidelines. This observation does not only apply to the new EU member states in the east, such as Poland, Romania and Latvia, but also to southern European countries in which evaluation is still a relatively young discipline. Portugal, for example, is subject to mandatory evaluation within the framework of EU-financed cohesion policy, but "the spillover to other policy fields […] hasn't, up to now, produced positive externalities" (Portugal, p. 337). Countries exhibit starkly differing ranges of sectors in which evaluations are performed regularly and evaluation structures established (Table 2). The case studies say relatively little about the scope and frequency with which evaluations are carried out in the individual sectors, whether internal or external evaluations are mainly used, whether formative or summative evaluations, whether process or impact evaluations, who primarily makes use of the results, as well as how the quality of evaluations is rated and how it is assured. In the following, however, an attempt will be made to extract the scope and frequency of evaluation practice from the case studies beyond the sectoral spread (cf. Table 1, column 7).


Table 2 Degree of spread across sectors (cf. Table 1, column 6)

high (1): Finland, France, Germany, Ireland, Netherlands, Spain, Switzerland
medium (2): Belgium, Czechia, Denmark, UK
low (3): Italy, Latvia, Poland, Portugal, Romania

Note (1) = spread in seven sectors or more; (2) = spread in five to six sectors; (3) = spread in four sectors or less

Once again, it becomes evident that in Switzerland (1; cf. Table 1, column 7) evaluations are not only carried out in the majority of sectors but are also widespread at cantonal level and very comprehensive due to the legal situation. In this context, the scope, frequency and intensity of evaluations vary greatly from ministry to ministry (p. 210). Evaluations are mainly carried out there "where legitimation pressure is particularly high, and where the provision of public services is delegated to third parties" (Switzerland, p. 215). Also in the Netherlands (1; cf. Table 1, column 7), it is true that evaluation is not only carried out in almost all policy domains, but also at a high frequency and depth. However, the underlying intensity and rigour vary greatly across domains (p. 97). Process evaluations predominate, but according to the Dutch Court of Audit, 42% of the government-wide policy budget of €94 billion was assessed by impact studies (p. 100). Primary motives are accountability and enlightenment (p. 102). Such a description applies equally to Finland (0.5; cf. Table 1, column 7), where evaluation has become a standard tool and the evaluation capacity of all policy areas is assessed as being medium to strong (p. 74). On the other hand, there is criticism that—in many branches of government—clear procedures on how the evaluation information is to be utilised and linked to policy learning are missing, and that "evaluation knowledge very often comes too late to help with actual steering" (Finland, p. 67). The State Audit Office faults the ministries for the fact that evaluation and monitoring results are not used for the formulation of new policies to a large enough extent (p. 72). In Germany (0.5; cf. Table 1, column 7) as well, where there is no national evaluation legislation but instead national and federal state-level regulations requiring evaluation, evaluation is not carried out within all policy fields (p. 175). The Federal Audit Office and the National Regulatory


Control Council have repeatedly lamented this deficit in implementation in the past decades. In Germany, mainly internal process evaluations are carried out, primarily for learning purposes. However, external impact evaluations are gaining in importance, even though the audit office generally complains about a lack of such evaluations (p. 175). If they are carried out, they often do not have the necessary methodological quality (p. 174). Due to the very strong position of the various ministries, the scope and intensity of evaluations depend largely on the individual ministries (p. 178). In the UK (0.5; cf. Table 1, column 7) likewise there is a large variety of evaluation practice, as is apparent inter alia from the widely differing evaluation strategy documents of the individual government departments (p. 254), the different evaluation objectives (on the one hand learning and enhancement, on the other hand legitimacy and accountability), and the forms of evaluation (economic, process, and impact evaluations) (p. 260). Nevertheless, as in Germany, a "utilisation gap" (p. 258) is lamented. The British National Audit Office criticises "the lack of comprehensiveness, challenges in accessing data, and poor use of evaluation by UK government" (p. 258). Furthermore, there is criticism "that 12 of 17 main government departments lacked evaluation evidence and post-implementation reviews" (UK, p. 258). In Latvia (0.5; cf. Table 1, column 7), whilst there is an array of legislative acts which create a formal normative framework, there is no institution that coordinates the execution of evaluations. This decision lies solely, as in the other European countries, "under full responsibility of ministries" (Latvia, p. 385). Beyond that, there are no regulations "for ensuring qualitative and meaningful evaluation activities" (Latvia, p. 380). The consequence of this is that most policy evaluations do not meet the necessary quality standards (p. 386). With the exception of the EU programmes, most evaluations are carried out internally and focus mainly on outputs and outcomes, whereby at a minimum only descriptive assessments are carried out (p. 389). Process evaluations are rarely conducted (p. 390). Although in Denmark (0.5; cf. Table 1, column 7) evaluations are conducted in various sectors, most legislative acts do not incorporate an automatic evaluation mechanism, and major public-sector reforms are hardly evaluated at all. Evaluation is often portrayed as signifying a "lack of trust" (Denmark, p. 39) and is therefore avoided. In Denmark, too, it is after all the ministries and regional governments that decide whether and how an evaluation is carried out. Both internal and external evaluations are conducted in the form of process, impact and goal attainment evaluations


(p. 44). With regard to the quality of the evaluations, there is criticism that "the programme theory often lacked clarity, that the evaluations were often inconsistent and conclusive, and lacked attention to side effects" (Denmark, p. 48). In the case study of Belgium (0.5; cf. Table 1, column 7) it is reported that the government-wide public sector reforms have led to a strengthening of evaluation practice across all sectors over the last ten years, despite major sectoral differences. Indications for this "can be found in the presence of an evaluation framework, a long term planning of evaluation activities and/or systems of evaluation quality control" (Belgium, p. 120). Furthermore, it is also noted that evaluation is referenced in an increasing number of policy documents and coalition agreements (p. 130). Whereas ex-post evaluations used to predominate, "yet output and process evaluations are comparably more numerous" (Belgium, p. 125). Most evaluations are performed in-house (p. 125f.) and serve policy improvement and learning. There is no distinctive quality control. In Czechia (0.5; cf. Table 1, column 7), the introduction of evaluation in the sectors typically receiving EU support has led to a wider dissemination of evaluation practice. The establishment of a growing number of evaluation units within the public sector since 2014 has led to a rise in internal evaluations, even though there is still no "habit to evaluate national policies" (p. 360). The quality of evaluations suffers from the fact that there are not enough evaluators in Czechia and that in calls for tenders mostly "the price was the only decisive criterion" (Czechia, p. 363). However, "in many tenders, the only requirement is to be literate" (Czechia, p. 364). As a result, "such practice had a disastrous effect on quality of evaluations" (Czechia, p. 363). Evaluations in the context of Official Development Assistance as well as Structural and Investment Funds are mainly used "as learning tools" (Czechia, p. 364). On the other hand: "National policies are evaluated rather rarely" (Czechia, p. 364). The evaluation practice in Spain (0.5; cf. Table 1, column 7) "has started to develop itself mainly around governmental and public administrations, and practically not at all in relation to the legislative power" (Spain, p. 311). Evaluations are "closely linked to accountability that is going to stimulate the practice of evaluation" (Spain, p. 311). Most evaluations are conducted externally and present outputs, while impact evaluations are rarely implemented (p. 313). Following a rapid surge in the significance of evaluation after the turn of the millennium (e.g. as measured by the number of evaluations performed), the global financial crisis led to a


stark drop in evaluations. Rather than being utilised by the budget units as a beneficial tool, evaluation itself became one of the first victims of budget cuts, as it was perceived by politicians as marginal and expendable (p. 306). Although there are extensive legal regulations in France (0; cf. Table 1, column 7), "they were not sufficient to impact the achievement of evaluation and to stimulate a true demand for evaluation" (France, p. 142), leading to the current criticism that "[t]here is no systematic evaluation for national or local programmes" (France, p. 147). Two thirds of the evaluations are carried out externally, predominantly by private consulting companies (p. 149). Impact evaluations still play a minor role. In Ireland (0; cf. Table 1, column 7), no extensive evaluation practice can be detected; rather, it is the case that "the scope of evaluation activity across the various sectors is relatively limited" (Ireland, p. 233). Internal evaluations are the rule (p. 234), with the exception of the overseas development sectors (p. 235). In Italy (0; cf. Table 1, column 7), which has only a low density of regulations at the national level, evaluation has nevertheless experienced a clear upswing in the last ten years (p. 282), but this has not led to a situation in which evaluation is undertaken at a regular pace, nor is it applied to new legislation or programmes (p. 289). The use of evaluation and the utilisation of its results depend to a large degree on the interests of leaders (policy-makers and top managers) (p. 293).

Fig. 1 Institutionalisation and evaluation use in European countries (scatter plot with evaluation use on the x-axis and institutionalisation on the y-axis; countries plotted: NL, CH, FN, LV, UK, BE, DK, IT, CZ, ESP, PL, IE, GER, FR, PT, RO)


Evaluation practice in Portugal (0; cf. Table 1, column 7) is even more limited still: "Evaluation is not used as an added-value activity but only as a formal and legal obligation […] losing all its transformative power" (Portugal, p. 333). Moreover, "[e]valuation has not yet succeeded in gaining the status of a management tool and a decision-making support instrument" (Portugal, p. 340). Since there is no cross-sectoral evaluation policy, it very much depends on the circumstances whether or not evaluations take place. Even where they are prescribed, it happens that evaluations "do not take place and no consequence or sanction is applied" (Portugal, p. 336). The most common form is external evaluation. Internal evaluations rarely take place "since there hasn't been an investment in the development of evaluative capabilities" (Portugal, p. 337). Likewise, in Romania (0; cf. Table 1, column 7), the evaluation practice used within EU programmes has not been successfully transferred to other national sectors (p. 436). Evaluations only take place where external funds are available (p. 444). The lack of a legal framework and budgets has led to a situation in which "the use of evaluation has not yet become a recurrent generalised and easily accepted practice by the decision-makers or civil servants" (Romania, p. 446). In Romania, too, evaluations are mainly conducted externally, because "ministries and agencies do not have enough specialised personnel in evaluation" (Romania, p. 447). The fact that price is the decisive criterion for awarding tenders has a negative effect on the quality of the reports (p. 448). In Poland (0; cf. Table 1, column 7), evaluation activities also concentrate mainly on the cohesion policy financed by the EU. These evaluations are mostly carried out externally (p. 414). The mechanisms for quality assurance of evaluations, such as the development of evaluation guidelines, were also developed in connection with the implementation of the cohesion policy, but they cannot conceal a low quality of evaluations (not only concerning EU funds). In the case study, this is attributed, among other things, to the priority given to the price criterion, the lack of a certification system for evaluators and the lack of knowledge among commissioners (p. 428). On the one hand, evaluation has experienced an enormous upswing under EU membership; on the other hand, especially during the first years of accession, evaluation became a "hermetic tool of managing EU funds with limited spill-over effect into other sectors" (Poland, p. 427). If one compares the degree of institutionalisation of evaluation, as expressed in the criteria (1) existence of national laws and strategies,


(2) national decrees, (3) organisational embedding in governmental organisations, (4) role of evaluation in audit courts in Table 1, with the criteria that say something about the use of evaluation ((5) role of evaluation in parliament, (6) degree of sectoral spread, (7) scope of evaluation practice), then the following interrelation becomes clear, as shown in Fig. 1. In countries where evaluation is strongly anchored at an institutional level, evaluations are carried out in a wider scope and also with greater intensity and frequency. In countries without such institutional frameworks, the implementation of evaluation is lagging far behind. However, this cross-sectional analysis also shows that the mere existence of relevant regulations is not in itself sufficient for ensuring comprehensive evaluation practice. Even the countries in which the institutional embedding of evaluation is best developed have not succeeded in this regard. Even in Switzerland, which has achieved the highest degree of legal anchoring in European comparison, this has not led to evaluations being carried out in all ministries, federal agencies and departments or policy fields (p. 207ff.). Through deeper analysis of the evaluation practices of European countries, it becomes clear that evaluations are—in the vast majority of cases—used by the executive, but only in a few countries (e.g. Switzerland, the Netherlands and Germany) by the legislature. In terms of evaluation objectives, at a national level accountability leads the way. The few countries in which evaluation is also used for policy learning or even enlightenment are, without exception, those with a medium to high degree of institutionalisation and extensive implementation practice (such as the Netherlands, Switzerland, Germany, Finland and Denmark). Accordingly, evaluations for purposes of reviewing policy goal attainment, impact studies, and effectiveness, efficiency and sustainability analyses feature most prominently in these countries. The primary methods employed here (not including RIA) are process and ex-post evaluations. At the programme level, process evaluations are preferred; these are used primarily by managers to support strategic decision-making, meaning learning is the main focus. However, in no country is it possible to put a figure to the ratio of internal evaluations to external evaluations, as there is no central register for evaluations. Internal evaluations are carried out first and foremost in those countries in which the individual ministries or authorities have strong evaluation competence at their disposal (e.g. Switzerland, Finland, Denmark, Germany and the Netherlands). External evaluations


are performed in all countries, with the primary aim of ensuring greater independence and high quality. In some countries (Switzerland, the Netherlands, Germany, Denmark, Finland, Latvia and Spain), meta-evaluations are undertaken for the purpose of reviewing the methodological quality of evaluations or to synthesise content in order to acquire new overarching insights, sometimes discovering serious flaws. In Latvia, a meta-evaluation of 400 evaluation studies revealed that only 39% “of them can be qualified as evaluation by their structure, content and purpose” (Latvia, p. 386). In 2017, a meta-evaluation of 80 evaluations was conducted in France, uncovering significant shortcomings, which were attributed to the fact that “there seems to be a lack of shared professional standards or competence requirements for evaluations” (France, p. 152). Even if quality can be ensured, this of course does not mean that the results are then utilised. There are reports of barriers to use in numerous countries. For example, in Belgium, there is criticism that “specific structures or procedures for the dissemination and use of evaluation findings do not exist” (Belgium, p. 122). In France, evaluations are “very far from being systematically disseminated to policy-makers and used by them” (France, p. 163). If, as in the case of Italy, evaluation in many thematic areas “is still felt as a duty and a punishment” (Italy, p. 299), then there seems to be little hope in terms of evaluation results actually being used. Even in the countries in which evaluation seems well established within the political system, pitfalls of various types are present. For example, there are reports that in Switzerland evaluation is increasingly construed “as a kind of ‘technology of the bureaucracy’”, turning into a rigid routine involving largely standardised and streamlined processes (Switzerland, p. 206). The upshot of this is that university-based research institutes are bidding for evaluation tenders less and less frequently, meaning evaluation is increasingly being carried out by private service providers “with specific expertise but without well-founded knowledge in evaluation” (Switzerland, p. 205). One consequence of this is that evaluation is not being further enhanced and adapted to take account of contextual shifts (e.g. evaluation of sustainable development goals, e-government, use of big data, etc.). Another development, which can be witnessed primarily in Finland and Denmark, can also have the effect of undermining the institutionalisation of evaluation within the political system. We are referring here to evaluation disintegrating in the course of the use of other tools, thereby losing its distinct profile and its intended function. In the case of Finland, it has


been reported that evaluation is becoming increasingly detached from its traditional model and transmuting into a scattered sphere of co-creation and peer learning. The role of the evaluator then becomes that of a “critical friend or a co-designer” (Finland, p. 66). A similar tendency can be made out in Denmark, where evaluation is to an even greater extent being mixed up with other concepts such as audit, accreditation and quality assurance, in terms of its design, the form it takes and its use (p. 40). In this way, the fact-generating element of evaluation, so important for its use, is becoming lost and evaluation is getting entangled with institutional and managerial practices (p. 51).

3 The Institutionalisation of Evaluation in the Social System

It is no secret that Europe’s social systems diverge greatly from one another. This can be seen in different political cultures, traditions of public decision-making, heritage and recent history, or political and societal orders and mechanisms. As a result, the attempt to compare these diverse systems leads researchers into uncertain territory in general. This is especially true for researching evaluation on a comparative basis, which might be the reason why there is almost no research that focuses on the use of evaluation in the social systems. It should be recalled here that the social system in this volume does not refer to a social system in the sense of a conventional welfare-state characteristic, but to the system in which the social interaction, action and behaviour of civil society groups take place. Regarding the social system, the analytical framework focuses on three main issues: the institutionalised use3 of evaluations by civil society, the public perception and discussion of evaluation and evaluation findings, as well as civil societies’ demand for evaluations.

3 Here, use is understood as use of evaluation findings as well as the use of evaluation as an instrument.

3.1 Institutionalised Use of Evaluations by Civil Society

Table 3  Institutionalisation of evaluation in the social system (a)

Country       (1) General use of   (2) Knowledge       (3) Public            (4) Availability   (5) Civic   Mean
              evaluation in CS     about evaluation    discussion / media    of reports         demand
Belgium       0.5                  0                   0                     0.5                0           0.20
Czechia       0                    0                   0                     0.5                0           0.10
Denmark       0                    0.5                 0.5                   n.a.               0           0.25
Finland       0                    0.5                 0                     0.5                0           0.20
France        0                    0                   0                     0.5                0           0.10
Germany       0                    0                   0                     0.5                0           0.10
Ireland       0                    0                   0                     0.5                0           0.10
Italy         0                    0                   0                     0.5                0           0.10
Latvia        0                    0                   0                     0.5                0           0.10
Netherlands   0                    n.a.                0                     0.5                0           0.25
Poland        0                    0                   0                     0.5                0           0.10
Portugal      0                    0                   0                     0                  0           0.00
Romania       0                    0                   0                     0.5                0           0.10
Spain         0                    0                   0                     n.a.               0           0.00
Switzerland   0.5                  0.5                 0.5                   0.5                0           0.40
UK            0                    0                   0                     0.5                0           0.10

Column groups: (1) institutionalised use of evaluations; (2)–(4) public perception and discussion of evaluation; (5) civil societies’ demand of evaluations.
(a) 1 = high spread, 0.5 = medium spread, 0 = low spread, n.a. = not answered

The column “Institutionalised use of evaluations” (Table 3, column 1) comprises information about whether and to what extent citizens and civil society organisations use evaluations for political and organisational

decision-making processes in an institutionalised way and whether and to what extent they are participatively involved in evaluations. Results of the cross-sectional analysis show that evaluation results usually do not play a major role in providing knowledge for decision-making by civil society organisations. Only in two out of sixteen countries can a general use of evaluation within civil society be reported.

• Even in Switzerland, which shows high scores in rankings regarding the institutionalisation of evaluation in the political system, evaluation is sparsely used by civil society to back up political processes (p. 216f.).

• Civil society in Belgium uses information drawn from evaluations on a regular basis in the context of decision-making, but it does not have a central place (p. 130).

In all other countries, it is not possible to speak of a systematic use of evaluation results by civil society. Symptomatic are statements such as those found, for example, in the UK study (p. 259), which states that evaluation findings do usually provide insights, but that use is rather limited. For Finland, which has a high degree of political institutionalisation, it is presumed that “[c]oncerning referenda, there is no systematic practice to


use evaluation for providing knowledge as background information” (Finland, p. 79). Even in countries where civil society organisations (CSOs) and non-governmental organisations (NGOs) have a consultative role in advisory boards or steering committees to the government (Belgium, Finland, Ireland, the Netherlands), evaluation results do not seem to play a major role. One possible explanation for the limited use of evaluation by civil society could be that the existence of a differentiated, active civil society in a country is an important precondition. Switzerland and Belgium have such civil societies. But this also applies to many other European countries in which civil society hardly makes use of evaluation results. Some country experts in this volume (Belgium, Finland, Ireland, Netherlands, Portugal) came to the conclusion that one reason for this gap in the institutionalisation of evaluation in civil society might be that, in democratic systems in which more than one party often forms the government, decisions are made in a bargaining process between parties, are related to certain political ideologies, or are simply based on coalition contracts. In such a decision-making process, evaluation might not necessarily be needed, or evaluation results are known but shift to the background in the course of the policy debate and discussions. Another factor that might hinder the regular use of evaluation in the decision-making process is probably its generally weak position and the resulting low awareness of evaluation (Czechia, France, Germany, Italy, Latvia, Poland, Portugal, Romania, Spain and the UK) and of its potential benefits. Even though evaluation findings are rarely used by civil society in the political decision-making process (Table 3, column 1), this does not mean that the use of evaluation is generally low in civil society. In some countries, CSOs and NGOs make use of evaluation for organisational purposes.

• In some countries, evaluation use in civil society is growing, especially as a management instrument to steer the organisations’ own projects. Evidence for this can be found in Denmark (p. 51ff.), France (p. 157), Germany (p. 181ff.), Ireland (p. 240ff.), the Netherlands (p. 102ff.) or the UK (p. 262). Still, this use is far from being institutionalised and the increase mainly touches the classical sectors of education and development cooperation.


• Some CSOs, NGOs, foundations, professional associations, private enterprises (only in Belgium) and also churches (only in Denmark and Finland) use evaluation and evaluation findings for legitimacy and accountability purposes—especially in projects financed with external funds—as well as for learning and the improvement of projects and strategies.

• Denmark seems to be a unique and interesting example in this context. The country is characterised by a highly organised society (p. 39), and especially in the social sector evaluation is viewed quite ambiguously. People who work on a voluntary basis in organisations funded with public money are regularly confronted with evaluation demands from the state. This causes conflicts, since volunteers see their work as a service to society and their motivation as not result-oriented. Consequently, they resist participation in evaluations (p. 51).

• Ireland is another interesting case. Due to the practice of evaluation in philanthropies, an evaluation culture spread to other civil society organisations. This raised the general use of evaluation in Irish civil society and also amplified a tendency towards evaluation approaches that are outcome-focused. “As a by-product of this approach, there has been an increase in the capacity base of evaluators, by skilling up evaluation practitioners in the use of complex evaluation methodology” (Ireland, p. 240). The same feedback loop from practice to capacity can be seen in Finland, where “evaluation has become familiar to CSOs and they have got practice for planning, doing and using evaluation” (Finland, p. 81).

In summary, it can be stated that CSOs and NGOs hardly incorporate evaluation results into the political decision-making process, even in those countries in which they would have institutionalised possibilities to do so via referenda, advisory boards or steering committees. In addition, it can be concluded that CSOs and NGOs are not involved in a systematic, formalised way in the planning and implementation of governmental evaluations. This even applies to countries such as Switzerland, the Netherlands and Finland, which have a high degree of political institutionalisation of evaluation and in which a high degree of political participation of civil society is part of the political culture. In many European countries, the use of evaluation in CSOs and NGOs can only be observed at an organisation-internal level. To an increasing extent, evaluation is being used


by CSOs and NGOs as a management tool to ensure that projects and programmes are more effective and efficient. This development is intrinsically motivated but limited to a few sectors such as development cooperation. The fact that evaluation is also being used increasingly by NGOs and CSOs for accountability and legitimacy purposes is often not due to the organisations’ own demands but to government requirements. A statement from Belgium sums up the situation for many countries: “The demand for evaluation mainly comes from the executive, herein often triggered by evaluation clauses in legislation” (Belgium, p. 133). Similar indications can be drawn from the country studies on France (p. 157), Italy (p. 295), Spain (p. 318), the UK (p. 262) or Czechia, where “[e]valuations are more or less national level policy issue” (Czechia, p. 365). All in all, there is no systematic use of evaluation across the board in CSOs and NGOs in all sectors in any country.

3.2 Public Perception and Discussion of Evaluation

The issue “public perception and discussion of evaluation” includes information about how well known evaluation is in (civil) society, whether evaluation is discussed in the media, and the availability of evaluation reports. Evaluation as an instrument is not widely known among the general public in the countries covered in this volume (Table 3, column 2). Only in Denmark and Finland was evaluation found to be well known in society. Evaluation is not familiar to (civil) society in the remaining countries. The main reason for this unfamiliarity is likely the comprehension of the term ‘evaluation’ itself. “The term ‘evaluation’ is often misunderstood by the general public, given its expansive remit this is understandable” (UK, p. 261). This statement from the UK holds for many of the countries. Evaluation suffers from a lack of visibility (France, p. 155), low public interest (Germany, p. 185), low awareness (Portugal, p. 340), or little general knowledge about the topic (Romania, p. 454). The media play their part in this context. Instead of informing the public objectively about the benefits and challenges of evaluation, about evaluation findings and how they are produced, journalists rather use evaluation results for the production of sensational content. Reporting on evaluation “is usually biased by the purpose on how the journalists want to report about public policies. In such a case,


methodology is completely missing as neither journalists nor the general public would understand it” (Czechia, p. 367). Regarding the public discussion, the analyses showed a very low level of evaluation content in the media (Table 3, column 3). Only Switzerland (p. 216) and Denmark (p. 52) have a—more or less balanced—public debate about evaluation and its findings. Overall, in the rest of the countries, evaluations make it into the public media arena only if the topic they treat is salient, if results seem scandalous, if findings are especially controversial, if evaluation topics touch critical political positions, or if they can be used to steer public opinion for or against something. Therefore, “many evaluators feel their work is largely under-utilised, used symbolically, or misused” (UK, p. 260). A balanced debate is largely missing here. Hence, the media, as the fifth power in states, also fail to recognise the enormous potential that evaluations could offer for exercising control over the state. There is much more positive to report about the general availability of evaluation reports (Table 3, column 4). In almost all the countries, reports or at least summaries of the reports are available. Of course, this relates mainly to reports of external evaluations. Internal evaluation reports are primarily understood as internal documents and are, therefore, not made available to the general public. Evaluations funded by the EU are generally available. Beyond that, Latvia (p. 379), Poland (p. 421) and Romania (p. 454), for example, have a centralised database where EU evaluation reports are made available to the public. This can be a great source of knowledge. Regarding the availability of evaluation reports, there is only one drop of bitterness, which is well summarised in a statement from Czechia: “[P]ublic bodies usually fulfil their information duties by placing information on the internet. However, they do not attempt to deliver such information to relevant stakeholders and to disseminate the information actively” (Czechia, p. 366). At this point it can be stated that the low use of evaluation results by civil society cannot be attributed to a lack of transparency, as many reports are published. It must be added, however, that this openness does not apply equally to all sectors and organisations and that there is almost no institutionalised obligation to publish evaluations in the selected countries, with the exception of—again—Switzerland (p. 216).


3.3 Civil Societies’ Demand for Evaluations

Regarding the demand for evaluation on the part of civil society, it was asked whether citizens, CSOs or private enterprises demand evaluations, how regularly this happens, and whether there are any obstacles. The analyses show that civil society as a whole, individual organisations, the media, the private sector and citizens neither ask for nor demand evaluations, either from the state or from civil society organisations themselves (Table 3, column 5). Private-sector demand for evaluations is also low, and individual citizens remain very far from demanding evaluations, partly due to low media coverage. As a result, the demand for evaluation arises mainly from the executive, making the “state […] the first commissioner” (France, p. 148). This quotation explaining the French case also reflects the situation in almost all the countries under investigation. A factor behind this finding may be the low awareness of evaluation as an instrument and its weak systematic embedding in civil society structures in the decision-making process. In some countries, trust in government was mentioned as a reason for the low demand by civil society. This applies especially to countries which have had a period of dictatorship in their recent past, like Spain, Portugal or Czechia. All in all, it can be concluded that despite the socio-cultural diversity of the European countries, a surprisingly uniform result is achieved with respect to the institutionalisation of evaluation in the social system. The social system turned out to be by far the weakest system of all in institutional terms:

• There are hardly any institutionalised structures and processes that systematically involve citizens or civil society organisations in evaluation.

• Evaluation results are used in CSOs for management and accountability purposes, but, with a few exceptions, not for influencing societal problems as a whole.

• Citizens, media, and (with the exceptions mentioned) CSOs themselves know little about evaluation and hardly ask for or demand evaluations.

• There is barely any public debate about evaluation results, although many evaluation reports are nowadays published in almost all European countries.


4 Institutionalisation of Evaluation as a Profession

The analytical framework in this part focuses on three different aspects, derived from professionalisation theory and from some preliminary research on this issue (Meyer 2015, 2016a, b, 2019). The first aspect, education and training, concerns institutionalisation in the academic system as well as in the further education system of the countries. While the university system in Europe has been harmonised by the Bologna Process since 1999 (cf. Reinalda and Kulesza 2005), further education is still a broad and diverse field with more or less developed quality assurance mechanisms (although there are some attempts at standardisation at the European level, European Committee for Quality Assurance). European universities are nowadays accredited by independent offices and provide comparable grading systems at the European level (European Credit Transfer and Accumulation System), not limited to the EU but also including all other European countries in the European Higher Education Area. The Bologna Process also opened new opportunities for universities to establish applied, practice-oriented training and thereby supported the institutionalisation of evaluation at European universities. Nevertheless, this did not lead to a boom of study programmes and courses; the offer remains in a very unstable and fluctuating state (comparable to the situation in the United States of America and other forerunner states of evaluation, cf. Meyer 2016a). As Friedrich (2016, 2017) pointed out in several publications, there are about one dozen European universities offering Master programmes in evaluation, but only some of them have been stable offers for a reasonable period of time. The GLOBE confirms these findings (cf. Table 4, column 1) and reveals a certain concentration in some countries: several universities with Master programmes in evaluation can be found in France (p. 157), Italy (p. 296), Spain (p. 319ff.) and the UK (p. 262f.), while other countries—Belgium (p. 133f.), Germany (p. 187f.), Romania (p. 455f.), Switzerland (p. 217)—offer one such programme each. The two Nordic states Denmark (p. 54f.) and Finland (p. 81f.) failed to durably establish such programmes. All other countries, including big ones like Poland and some with a well-developed evaluation tradition like the Netherlands, do not have specialised evaluation programmes but include evaluation as modules in other disciplines, mainly public policy, administration, sociology or economics. Compared to the US, education and psychology seem to play a less important role in the institutionalisation of evaluation at European universities.


Table 4  Professionalisation Index (reference to Appendix)

Country        Education   Communication   Organisation   Norms   Mean
Belgium        1.00        0.25            0.25           0.00    0.38
Czechia        0.50        1.00            0.75           0.75    0.75
Denmark        0.50        0.00            1.00           0.50    0.50
Finland        0.50        0.00            1.00           0.50    0.50
France         1.00        0.75            1.00           0.75    0.88
Germany        1.00        1.00            1.00           0.75    0.94
UK             1.00        1.00            1.00           0.75    0.94
Ireland        0.75        0.25            0.25           0.00    0.31
Italy          1.00        1.00            1.00           0.75    0.94
Latvia         0.50        0.25            0.75           0.50    0.50
Netherlands    0.75        0.25            0.50           0.50    0.50
Poland         0.75        0.25            1.00           0.50    0.63
Portugal       0.50        0.00            0.50           0.00    0.25
Romania        1.00        0.25            0.75           0.00    0.50
Spain          1.00        1.00            1.00           0.50    0.88
Switzerland    1.00        1.00            1.00           0.75    0.94

Non-academic training cannot replace or substitute academic study programmes, especially not in small countries like Latvia (p. 399) or Portugal (p. 344f.) that lack university-based training. In general, such offers supplement existing (and sometimes even growing) academic offers and should not primarily be seen as competitors. Providers are state agencies (e.g. the Rijksacademie in the Netherlands, p. 104) and NGOs and large associations (e.g. the Portuguese Foundation Centre, p. 343) that are interested in a certain quality of the evaluations they commission themselves (e.g. an Irish training programme for officials in the Department of Children and Youth Affairs, p. 243) or in a particular field of activities (e.g. fire and rescue services in the UK, p. 263). Only the national voluntary organisations for professional evaluation (VOPEs) offer a more general and broader spectrum of trainings (explicitly mentioned in Czechia, p. 370; Finland, p. 82; France, p. 160; Italy, p. 296f.; Poland, p. 425 and the UK, p. 263). The gathering of and regular exchange between evaluators (including commissioners, researchers, politicians and other interested persons) is the most important characteristic of VOPEs and also of the second aspect of professionalisation, communication and organisation. Communicating



about evaluation (cf. Table 4, column 2) is an important step toward professionalisation because it offers the opportunity to develop a shared perception and awareness of being part of a profession and also helps to locate and sharpen the profile of a new discipline. Such communication is possible through media—especially newsletters and mailing lists (e.g. the Vide Nieuwsbrief in the Netherlands, p. 105, or the Forum für Evaluation in Germany, p. 189)—or through regular meetings organised by VOPEs. The term ‘VOPE’ was developed by the International Organisation for Cooperation in Evaluation, the umbrella organisation of all kinds of interest groups for professional evaluation (cf. Table 4, column 3). Some of these groups are well-established and formalised associations like the Gesellschaft für Evaluation (DeGEval) in German-speaking countries (p. 190), others are open networks with almost no limitations and rules for joining like the ‘Irish Evaluation Network’ (p. 244), and yet others are small clubs with limited membership like the Latvian Evaluation Society (LATES, p. 399). These three forms can be found in certain variations in Europe (Fig. 2).


Fig. 2 Voluntary organisations of professionalisation in Europe 2019 (GLOBE Authors)


There are five big evaluation associations in Europe (including the regional European Evaluation Society (EES) with about 400 members)—the French Evaluation Society (p. 159ff.), SEVAL in Switzerland (p. 204), the Spanish Evaluation Society (p. 322) and the Austrian-German DeGEval. However, compared to the size of these countries and to other professional associations, even these organisations are still rather small.4 A second group of six evaluation societies, including those of Denmark, Finland, the Netherlands, the UK, Italy and Poland, are also reasonably big (about 100 members) and provide opportunities for fruitful exchange. Some of them are well established (like the oldest evaluation society in Europe, the UKES, founded in 1994, p. 263). Officially, all investigated European countries have an evaluation society, although in the last group of six states they are rather small and some of them are not very active. At least two of them (the Belgian SWEP, p. 134, and the Portuguese AVALPortugal, p. 344) have recently stopped their activities, and two others (the Irish Evaluation Network, p. 244, and the second Belgian organisation VEP, p. 134) are only open networks without membership. However, some, like the Czech Evaluation Society (p. 368) or the Latvian LATES (p. 399), are quite engaged despite their limited size. Finally, the situation in Romania is unique because there are three competing societies, RoSE, EvalRom and ADER (p. 456f.). Almost all of these evaluation societies offer a broad spectrum of communication opportunities, among others regular meetings, publications, newsletters, trainings et cetera. Without any doubt, communication about evaluation is primarily organised by these groups, although others—mostly focused on specific topics or organised by other interest groups (e.g. the Dutch association of professionals Vide, p. 105)—also offer such opportunities. From an academic perspective, the exchange organised by associations is limited. The key platforms are scientific journals, and the number of research publications on evaluation has increased enormously during the last two decades (Bono et al. 2018). Most of these papers are written and published in English in well-established social science journals, including some specialised in evaluation (Table 4, column 2).

4 On average, there are five to ten members of evaluation societies per one million inhabitants in European countries; only in two countries—Finland and Switzerland—is the ratio much higher (about 50 members per million). The professional societies in the social sciences have more than twice as many members: in Germany, for instance, the associations of political science, sociology, education and psychology have between 1,700 and 5,500 official members.


A number of such journals exist in Europe. The most important one is Evaluation, published by Sage and edited by the Tavistock Institute in London, UK (p. 264). The German-language Zeitschrift für Evaluation (p. 189) and the Spanish Revista de Evaluación de Programas y Políticas Públicas (p. 321) are among the most important non-anglophone evaluation journals. The Italian Rassegna Italiana di Valutazione (p. 297) and the Czech Evaluační teorie a praxe (p. 370) are also journals with a particular focus on evaluation. Most of these journals seem to work well and do not lack submitted papers. The only exception is the Romanian Journal of Evaluation, whose publication activity has been limited to a first issue in 2013 (p. 456). There is a blurry line between journals targeting scientific exchange primarily on evaluation and those in strongly related areas like public management or administration. Some VOPEs, like those in Finland (p. 82), the Netherlands (p. 106) or Switzerland (p. 217), have contracts with other journals not primarily focused on evaluation (the Dutch online Journal of public administration, the Finnish Administrative Journal and the Swiss LeGes). Other VOPEs, like the French one, put their emphasis on newsletters (p. 159). VOPE members regularly have free access to these journals or newsletters. Compared to the aspects already mentioned, the final one, regulation and certification, reveals a certain shortfall of professional institutions (cf. Table 4, column 4). The Swiss Evaluation Society (p. 204) started developing and endorsing general evaluation standards very early after its foundation, in 2001, as did the Austrian-German DeGEval three years later (p. 190f.) and the Polish Evaluation Society in 2008 (p. 425); in Romania, similar processes failed (p. 442). In general, there is a significant lack of evaluation standards, guidelines or other documents for regulating the implementation and use of evaluation in Europe. Even the standard systems do not have binding power, although some recent publications in Germany and Switzerland demonstrate their use in a broad variety of different policy fields (Sager et al. 2017; Böttcher et al. 2019). Some evaluation societies developed recommendations (like the French charte de l’évaluation, p. 140f., or the “UKES Guidelines for Good Practice in Evaluation”, p. 264) or ethical codes (like the Czech CES Ethical Code, p. 369, the Spanish AEVAL codes of ethics, p. 322, or the Italian AIV deontological code, p. 297) with even more limited binding power. Certification is only discussed seriously in Italy: “A certification system for


evaluators does not exist, but some work on this respect has been brought forward within AIV. External certification is possible under the framework of the law 14/2003 on the ‘unregulated professions’. However, the evaluators’ community has opted for auto-certification instead of the external certification. A register of the Professional Evaluators has been promoted by AIV but recently (2017) has been closed, as it has not reached the expected results nor in terms of use by evaluators nor as a source of information on competent evaluators among the organisations requiring this profile” (p. 297). Strong regulative forces are nowhere implemented. Three main results can be identified in such a comparison (Table 4).

• Europe is strong in building up professional organisations and training opportunities, although their institutionalisation is not well secured and still depends too much on single initiatives and individual activities.

• Europe is still very weak in implementing rules and norms for evaluation, and in at least some countries there is strong resistance against such ideas.

• Europe is divided into four groups according to the professionalisation process. The first group, composed of France, Germany, the UK, Italy, Spain and Switzerland, has developed almost all professional institutions. The second group (Czechia and Poland) is not too far behind, representing the catch-up process of Central-Eastern European countries. The third group consists mainly of small states lacking a critical mass of evaluators. Most of these countries (especially the Benelux and Nordic countries Belgium, the Netherlands, Denmark and Finland) are more oriented towards the international (European) evaluation community than towards building up national institutions. In Latvia and Romania, the situation is different because evaluation is not as settled there as in those countries. The final two countries—Ireland and Portugal—probably represent some of the left-out states (e.g. Bulgaria, Croatia, Greece) that are at a very early stage of professionalisation. However, Ireland as well as Portugal are also oriented towards the EES and towards better-established neighbour organisations in the UK and Spain respectively. The same applies to Austria and Luxembourg, which orient themselves towards Germany and are also left out here.

5 Conclusions: Correlations and Interlinkages

For looking at the interrelationship between the different indicators describing the institutionalisation of evaluation, a standardised scale was created. This scale varies between 0 (no institutionalisation) and 1 (fully institutionalised), takes only valid observations into account and is used for all three parts (political system, social system and system of professions). For comparing the state of the countries, a z-transformation is used (the mean of the scale is set to 0 and the scaling is in standard deviation units above and below the mean) (Fig. 3).

Fig. 3 Z-Scores for standardised indices on political, social and professional system

As a first result, the relative performance within the three main systems is not similar for all countries. Only one country (Switzerland) is clearly ranked above the others in all three dimensions, while five (Romania, Portugal, Poland, Latvia and Ireland) are equally below average. Especially in the big countries (UK, Spain, Italy, Germany and France), the institutionalisation in the system of professions is much better than in the other countries. With the exception of Switzerland, this might primarily be a question of size: for big countries, it is probably easier to find at least some people fostering the implementation of scientific research and teaching (and

reaching a critical mass of students and of authors for scientific articles) as well as to build up VOPEs with a considerable number of members and activities (the professionalisation index correlates highly with population size, .74). But this positive aspect of size is accompanied by a poorer performance in the social and the political system (with the exception of Germany and France, which are above average here). Finland and Denmark show somewhat the opposite pattern—they are above average in the institutionalisation of evaluation in their political and social systems but below average in the system of professions. Both countries tried, for instance, to establish a study programme, but this ultimately failed, probably because of the small size of the evaluation community and of the demand for trained evaluators. The remaining countries—the Netherlands, Czechia and Belgium—show somewhat better institutionalisation in one of the dimensions (the Netherlands in the political system, Czechia in professionalisation and Belgium in the social system) but remain below average in the other dimensions. In general, the size of the countries (in terms of population) and also the state of economic welfare and of the democratic system do not have an important impact on the institutionalisation of evaluation in the political or the social system. The reasons for success in the institutionalisation of evaluation seem to be quite different and particular to each country. Belgium, for instance, does not have a strong, well-working evaluation association, while the small Czech evaluation society is very busy and internationally highly recognised. Yet this difference does not have a huge impact on institutionalisation in the political and social system—the performance is even slightly better in Belgium than in Czechia. Conversely, some countries with a poor state of institutionalisation in their political and social systems have developed well-working evaluation associations and networks (e.g. Italy and the UK). To summarise, there is no discernible general pattern of a commonly shared institutionalisation process across countries and systems. The institutionalisation of evaluation is driven by different forces and internal dynamics, and there seems to be no general rule for how to establish evaluation within country subsystems. However, it is possible to create a general model for the institutionalisation of evaluation in Europe by using the chosen indicators, showing the different influences of each of these indicators.
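To make the scaling described above more concrete, the following minimal sketch shows one way such country indices and z-scores could be computed. It is an illustration only, not the authors' actual procedure or code: the function names are invented, and the handling of missing values and the use of the population standard deviation are assumptions; the three example rows are taken from Table 3 merely to show the mechanics.

```python
from statistics import mean, pstdev

def system_index(indicator_scores):
    """Mean of the valid 0-1 indicator scores of one system for one country
    (missing values, coded as None, are ignored)."""
    valid = [s for s in indicator_scores if s is not None]
    return mean(valid) if valid else None

def z_transform(indices):
    """Centre the country indices on 0 and express them in standard deviation units."""
    valid = [v for v in indices.values() if v is not None]
    m, sd = mean(valid), pstdev(valid)
    return {country: None if v is None else (v - m) / sd
            for country, v in indices.items()}

# Three rows from Table 3 (social system), used only to illustrate the mechanics
social_indices = {
    "Switzerland": system_index([0.5, 0.5, 0.5, 0.5, 0]),  # Table 3 mean: 0.40
    "Denmark":     system_index([0, 0.5, 0.5, None, 0]),   # n.a. excluded: 0.25
    "Portugal":    system_index([0, 0, 0, 0, 0]),           # Table 3 mean: 0.00
}
print(z_transform(social_indices))
```

The z-scores produced for these three countries differ from those in Fig. 3, which are based on all sixteen country scores; the sketch only illustrates the transformation itself.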


In general, all indicators except civic demand (which shows no variation) have a certain influence on the three scales of the respective systems. The most significant result is the poor relation between professionalisation and the two governance systems. Although evaluation is without any doubt supposed to be a strong support for improving the quality of governance in societies, neither the political system nor civil society seems to be interested in building up a strong and powerful evaluation profession. The mechanisms working in the professional system are obviously largely independent of the demand side for such services. For the political system, two different aspects—the institutionalisation especially in the legal system and the use of evaluation in policies—have been explored in further detail. In general, all indicators are strongly related to the index and to each other. Correlations vary between .61 and .82, except for the integration in political organisations. There is no evidence for a certain pattern or driving force in one specific element. However, there is a remarkable relationship between the overall score of the institutionalisation index for the political system and the Worldwide Governance Indicators published by the World Bank (2019), especially with the two indicators on regulatory quality (.86) and government effectiveness (.80). European countries develop more institutions for evaluation within their political system and make more use of evaluation if they have a higher degree of regulatory quality, “capturing perceptions of the ability of the government to formulate and implement sound policies and regulations that permit and promote private sector development”, and of government effectiveness, “capturing perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulation and implementation, and the credibility of the government’s commitment to such policies” (World Bank 2019). This emphasises the importance of evidence-driven policies for the implementation of evaluation—and probably also the other way around. The situation is more difficult if one looks at the social system and the institutionalisation and use of evaluation in civil society. In an overall view, the state of the art is still rather poor: the scores are much lower than in the political system, although there is a certain correlation between both systems. The main challenge here is the diversity of civil society, leading to much bigger differences within and between the European countries. Two indicators—civic demand and the availability of reports—are very weak in explaining the differences between the European countries; all the other indicators vary between .69 and .80. The main aspect still


seems to be the public perception of evaluation. In many countries, evaluation is almost unknown or at least poorly reflected in public discussions. Only in one country—Switzerland—does evaluation achieve a certain amount of public recognition, and even there it is still weak. Remarkably, more practice and better integration in the political and/or social system do not lead to more professionalism and to the institutionalisation of quality controls. While the state of the professionalisation process varies, it is hardly related to the degree of integration into the political or social system. Quality assurance by norms and standards (.85) as well as the successful institutionalisation of specialised scientific exchange and communication opportunities (.91) are the main characteristics differentiating the state of professionalisation in European countries. In many countries, evaluation as a profession is still limited to an open communication platform or small expert circles, meeting at conferences or workshops organised by very weak ‘evaluation societies’ with almost no training or study opportunities. However, building professional institutions did not have any impact on the institutionalisation in the political or social system. There seems to be almost no interest in building up quality in evaluation, neither on the demand side by political actors or civil society organisations nor on the supply side by evaluation practitioners at universities or consulting companies. The professionalisation of evaluation in Europe is still the result of the activities of a very small group of actors with limited impact on the national evaluation markets.
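The correlation figures quoted in this section (e.g. .85 and .91 between the professionalisation index and its indicators, or .86 and .80 with the World Bank governance indicators) are presumably simple bivariate correlations across the sixteen country scores. The sketch below only illustrates how such a coefficient would be computed from two country-level series; the values in it are placeholders, not the study's data.

```python
import numpy as np

# Placeholder series for illustration: one value per country, e.g. the
# political-system institutionalisation index and a governance indicator.
institutionalisation_index = np.array([0.9, 0.8, 0.7, 0.5, 0.4, 0.2])
governance_indicator       = np.array([1.8, 1.7, 1.4, 1.0, 0.7, 0.4])

# Pearson correlation coefficient between the two series
r = np.corrcoef(institutionalisation_index, governance_indicator)[0, 1]
print(f"Pearson correlation: {r:.2f}")
```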

6 Some First Attempts of Explanation

It can finally be said that there has been enormous growth in the overall evaluation landscape regarding the institutionalisation of evaluation in the last decades. However, the development is not linear, and it differs from country to country. In some of the former forerunner countries (UK, Denmark, Finland, Switzerland and Germany), development has slowed down to a certain degree; in some cases it remains at a high level but has lost its dynamism. Institutionalisation is stagnating or even declining in yet other countries (e.g. in Spain). The question that remains to be clarified is what has driven the institutionalisation of evaluation in Europe in recent decades. Regarding the political system, the answer is surprisingly simple and definitive. In spite of the huge political, social and cultural differences between countries, two developments have been primarily responsible:


Firstly, efforts at achieving rational public administration within the framework of New Public Management (NPM), and secondly, EU requirements in the course of the allocation of subsidies, which are also connected to the ideas of NPM. Evaluation established itself first in those European countries in which the Planning, Programming, Budgeting System—developed in the US—became the model for the modernisation of the respective country’s public administration. In France (p. 140), the implementation of a programme in the 1960s aimed at the “rationalisation of the budgetary choices” is labelled “the birth certificate of public evaluation”. The UK likewise joined “the ‘first wave’ of public management reform in the early 1970s, bringing with it an emphasis on ‘strategic policy making and evaluation’” (UK, p. 252). This ultimately led to reforms based on the NPM model. This model came to see citizens as customers. It strived to be oriented towards principles of competition and quality—as is the norm within the private sector—and it replaced input control with an orientation towards outputs and outcomes. In order to procure the necessary management information, new instruments were needed. Evaluation lent itself to this purpose. Without this kind of administrative modernisation, and without the ideas of evidence-based policy and the concept of making strategic political and administrative decisions based on a rational informational underpinning, it would have remained insignificant. In the 1990s, the EU gave evaluation a further leg up in its development: by making the performance of evaluations one of the mandatory conditions for receiving subsidies from the EU structural funds, evaluation now also came to be practised in those countries which had not previously applied the concept. This second wave of evaluation in Europe most notably reached countries such as Belgium (p. 117), Italy (p. 275), Portugal (p. 331), Spain (p. 305) and Finland (p. 64). In the countries of Central Eastern Europe, the concept of evaluation only began to spread following their entry into the EU. Yet in a number of country reports, it is mentioned that planning, reporting and controlling instruments were in fact necessary within planned economies (Czechia, p. 352). However, such instruments related to a rigid top-down model within which evaluation, if the term featured at all, was perceived primarily as a threat (Romania, p. 436f.). A result of this is that the understanding of evaluation in these third-wave countries, such as Czechia, Latvia, Poland and Romania, continues to be ambivalent and differs from that in Western Europe.


Above all, the development of evaluation in Europe has been policy-driven. The dominant importance of the administrative and, ultimately, political demand for evaluation can also be seen in its development over the decades. This is highlighted by, inter alia, the questions that evaluation has to deal with: at times, during periods of limited finances, there is a trend towards examining efficiency, whilst at other times the focus is on appraising the possible consequences of new legislation on an ex ante basis, or on questions of impact for the purpose of legitimising government action. More recently, evaluation has become increasingly geared towards issues of sustainability within the context of the sustainable development goals. Even phases of growth and decline in evaluation are politically determined; periods of reform are often accompanied by an increasing demand for evaluation, for example in Germany in the 1960s and 1970s, and again in the 1990s (p. 167f.), and in France in the mid-2010s (p. 141). Financial crises, on the other hand, are often associated with reduced interest in evaluation, as has been the case to a drastic extent in Spain in recent years (p. 306; p. 312ff.). Leaving aside the Central Eastern European countries, which only gained access to western European institutions and to the EU from the 1990s or even later—as well as Portugal and Spain, which only became EU members in 1986—it is difficult to explain why evaluation could be established institutionally in a more comprehensive and effective manner in some countries of the first and second waves than in others. Likewise, it is hard to say why two countries (Switzerland and Finland), which both began as latecomers, are at the top of the European pile in terms of the institutionalisation of evaluation. There do not seem to be any universal explanations for the individual development paths of different countries. One might, for example, theorise that well-developed evaluation systems may have emerged in those countries that are geared to a large extent towards private enterprise and liberal economics—such as Ireland and the UK. The rationale behind this: if the state is to be entrusted with money, it must be closely monitored to ensure that the money is used in a targeted and effective manner. Yet neither Ireland nor the UK belongs to the leading group of countries with a high degree of institutionalisation. Conversely, as in the case of Denmark (p. 56), one may reverse this theory and argue that wide-ranging evaluation structures have been developed precisely in those countries with a high public spending ratio and high levels of taxation, as their high expenditure on the public sector requires particular justification. However, whilst there are numerous countries in Europe with a public spending ratio above the European average


(47.1%) and a high degree of institutionalisation of evaluation (e.g. Denmark, Finland and, to some extent, Belgium), in others, such as France and Italy, this is not the case (Eurostat Press Office 2018). Regarding the social system, one of the most important questions raised was to what extent the social system—or rather (organised) civil society—demands evaluation. It can be stated that the demand is generally low and that the involvement of ordinary citizens in evaluation is almost non-existent. Only in very few cases (e.g. the UK) do private persons raise (or can raise) a demand for evaluation. The involvement of civil society in governmental evaluations is also at a very low level. CSOs, NGOs, foundations and so forth mainly evaluate their own projects, programmes or policies. They seldom ask for evaluations for accountability or control reasons. A clear pattern of factors hindering or promoting the institutionalisation of evaluation could not be found. Nor is there a clear linkage connecting the developments in the social system to those in the political or the professional system. There is, however, one aspect almost all the country experts complained about: the poor knowledge of the concept of evaluation in and by society. This might be taken as a hint at the reasons for the poor institutionalisation of evaluation within civil society, but there is no evidence that institutionalisation improves where knowledge of evaluation is better promoted. Finally, the system of professions is almost independent of the development of demand in the political or social system. There is, however, one critical factor fostering or hindering the development of professional structures: population size. Not surprisingly, a newly emerging profession is very small in its early beginnings, and the first important step is to reach a ‘critical mass’ for building up a functional infrastructure that does not depend on the individual engagement of a handful of people. This is obviously easier in large countries like Germany, France, Italy or the UK than in small countries like Latvia, Portugal or Belgium. The nucleus seems to be the ‘evaluation society’, in many cases just an exchange platform, which can be developed into a professional association with regular services for its members. To provide such attractive services, some resources are needed, and these must primarily be derived from membership contributions (monetary and non-monetary). In small societies, it is sometimes difficult to achieve sufficient engagement in evaluation to build up such an evaluation community. The most relevant option for evaluators from small countries is engagement in international communities or in large associations in neighbouring countries. In Europe, the EES and the Austrian-German DeGEval are


attracting many people from neighbouring countries and have developed into key players in organising exchange within the European evaluation community. However, compared to the American Evaluation Association or other regional evaluation associations, the European organisations are still small and not very powerful. Size also seems to matter with regard to the implementation of evaluation within the academic sector. While some small countries tried to build up study programmes for evaluators, they mostly failed. As in North America, the offer of evaluation training is not very stable, and even large study programmes have vanished within a short period of time. With the exception of Switzerland, only some programmes in big countries like France, Germany, Spain and the UK achieved a critical mass of students and were able to build up a certain infrastructure for academic research and exchange. Nevertheless, this still relies on individual engagement, and the discipline is not well anchored at universities. The reason is not only size: there are strong forces for keeping evaluation as part of well-established disciplines like education, political science, psychology or sociology, and the soil is not everywhere fertile for applied sciences. Moreover, a lot of cheap and easily accessible offers for evaluators are made by international organisations, and many people believe this is enough to be a good evaluator. Although evaluation has taken some root in the professional system in Europe, it is still a young and fragile plant. One may take this as a more general conclusion of this book on the institutionalisation of evaluation in Europe. Although evaluation has a certain and lasting history in some countries like the UK, Switzerland and Germany (and others not treated here, e.g. Sweden, may be added), it is still young and not well established. A lot of different forces foster and hinder the institutionalisation processes in each of the countries; there is no general, overarching pattern. Further and much deeper analyses at the national level are necessary for understanding the development of evaluation within each of the three systems. Nevertheless, this international comparison may offer a good starting point for discussions about improving the institutionalisation processes of evaluation in Europe.


Appendix

For summing up the state of professionalisation, a Guttman scale from 0 ('no institutions') to 1 ('completed institutionalisation') is used, covering four institutions: main study courses, specialised scientific journals, strong voluntary organisations for professional evaluation (VOPEs) and obligatory standards and/or certifications. See the table below for the detailed composition of the four scales; the analytical guideline combined the two elements 'communication' and 'organisation' under the rubric 'profession', while the other two indicators are covered by several questions in the separate parts on 'academic study courses, further training et cetera' and 'compliance to standards and quality obligations' (see Appendix of Chapter 1).

CPEI Professionalisation Scale

Scale | I1 Education: study programs | I2 Communication: focused exchange | I3 Organisation: VOPE | I4 Norms: general agreement
0 | No offer available | No offer available | No VOPE existent | No rules existent
0.25 | Only non-academic offers | Exchange in other discipline media | Open network | Informal agreements without duties
0.5 | Only single academic courses | Exchange in open media | Formalised network | Self-commitment on internal rules
0.75 | Minor subject courses | Exchange in regularly published media | Small formalised organisation | Endorsed general rules
1 | Major subject courses | Exchange in academic journals | Large formalised organisation | Obligatory rules and certifications
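
To make the scoring scheme concrete, the following Python sketch shows one plausible way to turn the four indicator ratings into a single professionalisation value. It is a minimal illustration only: the function name, the validation and, above all, the equal weighting of the four indicators are assumptions, since this appendix specifies the scale points but not the aggregation rule.

# Illustrative sketch only. The appendix fixes the five scale points for each of the
# four indicators (I1-I4), but this excerpt does not state how they are combined into
# a single CPEI professionalisation value; an unweighted mean is ASSUMED here.

I1_EDUCATION = {  # scoring rubric for I1, taken from the table above
    "no offer available": 0.0,
    "only non-academic offers": 0.25,
    "only single academic courses": 0.5,
    "minor subject courses": 0.75,
    "major subject courses": 1.0,
}

VALID_SCORES = (0.0, 0.25, 0.5, 0.75, 1.0)

def cpei_professionalisation(i1: float, i2: float, i3: float, i4: float) -> float:
    """Combine the four indicator scores into one value between 0 and 1.
    Equal weighting is an assumption, not part of the original instrument."""
    scores = (i1, i2, i3, i4)
    if any(s not in VALID_SCORES for s in scores):
        raise ValueError("each indicator must be scored in 0.25 steps between 0 and 1")
    return sum(scores) / len(scores)

# Hypothetical country profile: minor subject courses, exchange in open media,
# a formalised network and self-commitment on internal rules.
print(cpei_professionalisation(I1_EDUCATION["minor subject courses"], 0.5, 0.5, 0.5))  # 0.5625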

References

Bono, P. H., Debu, S., Desplatz, R., Hayet, M., Lacouette-Fougère, C., & Trannoy, A. (2018). Vingt ans d'évaluations d'impact en France et à l'étranger. Analyse quantitative de la production scientifique. France Strategie, Document de travail, 6.
Böttcher, W., Hense, J., Kalman, M., & Meyer, W. (Eds.). (2019). Standards der Evaluation in unterschiedlichen Handlungsfeldern: Einheitliche Qualitätsansprüche trotz heterogener Praxis? Münster: Waxmann.
Eurostat Press Office. (2018). Euro area and EU 28 government deficit at 0.9% and 1.0% of GDP respectively. News release Euro Indicators, 69. https://ec.europa.eu/eurostat/documents/2995521/8824490/223042018-AP-EN.pdf/6e5b346e-e302-4132-920a-854b00ac196d. Accessed on 14 June 2019.


Friedrich, V. (2016). European university-based study programs in evaluation: Characteristics and future challenges. In R. Stockmann & W. Meyer (Eds.), The future of evaluation: Global trends, new challenges, shared perspectives (pp. 113–128). Basingstoke and New York: Palgrave Macmillan.
Friedrich, V. (2017). Evaluationsstudiengängen an europäischen Hochschulen: Merkmale und Herausforderungen. In R. Stockmann & W. Meyer (Eds.), Die Zukunft der Evaluation. Trends, Herausforderungen, Perspektiven (pp. 111–124). Münster: Waxmann.
Mertens, D. M. (1998). Research methods in education and psychology: Integrating diversity with quantitative and qualitative approaches. Thousand Oaks, CA: Sage.
Meyer, W. (2015). Professionalisierung von Evaluation: Ein globaler Blick. Zeitschrift für Evaluation, 14(2), 215–246.
Meyer, W. (2016a). The global state of evaluation as a profession: Some results. Paper EES16-0161 presented at the 12th EES Biennial Conference "Evaluation futures in Europe and beyond. Connectivity, innovation and use", 28–30 September 2016, Maastricht, The Netherlands.
Meyer, W. (2016b). Toward professionalization? The contribution of university-based training programs in pioneer countries. In R. Stockmann & W. Meyer (Eds.), The future of evaluation (pp. 98–112). Basingstoke and New York: Palgrave Macmillan.
Meyer, W. (2019). Professionalisierung zwischen Kohärenz, Koexistenz und Konkurrenz: Das Fallbeispiel der Evaluation. In N. Burzan (Ed.), Komplexe Dynamiken globaler und lokaler Entwicklungen. Verhandlungen des 39. Kongresses der Deutschen Gesellschaft für Soziologie vom 24.–28. September 2018 an der Georg-August-Universität Göttingen.
Reinalda, B., & Kulesza, E. (2005). The Bologna process: Harmonizing Europe's higher education. Including the essential original texts. Opladen and Bloomfield: Barbara Budrich Publishers.
Sager, F., Widmer, T., & Balthasar, A. (Eds.). (2017). Evaluation im politischen System der Schweiz. Entwicklung, Bedeutung und Wechselwirkungen. Zürich: NZZ Libro.
World Bank. (2019). https://info.worldbank.org/governance/wgi/#home. Accessed on 14 June 2019.
Ziekow, J. (2015). Stand und Perspektiven der Gesetzesfolgenabschätzung in Deutschland. Speyerer Forschungsberichte, 281.