English Pages [1701] Year 2020
The SAGE Handbook of
Political Science
Editorial Board
Attila Agh, Corvinus University, Budapest, Hungary
Arjun Appadurai, New School, NYU, United States
Leonardo Avritzer, Federal University of Minas Gerais, Brazil
Nathaniel Beck, NYU, United States
Walter Carlsnaess, Uppsala University, Sweden
Yvonne Galligan, Queen’s University, Belfast, UK
Manuel Antonio Garretón, Universidad de Chile, Santiago de Chile, Chile
John Groom, University of Kent at Canterbury, UK
Emmanuel Gyimah-Boadi, University of Ghana, Accra, Ghana
Guy Hermet, Sciences Po, Paris, France
Takashi Inoguchi, University of Niigata Prefecture, Japan
Robert Jervis, Columbia University, New York, United States
Max Kaase, Mannheim University, Germany
Bahgat Korany, American University, Cairo, Egypt
Richard Ned Lebow, King’s College, London, UK
Andrej Melville, Higher School of Economics, Moscow, Russia
Helen V. Milner, Princeton University, Princeton, United States
Winnie Mitullah, University of Nairobi, Nairobi, Kenya
Carole Pateman, University of Cardiff, UK
Dianne Pinderhughes, University of Notre Dame, United States
Surinder Kler Shukla, University of Chandigarh, India
Ilter Turan, Bilgi University, Istanbul, Turkey
Laurence Whitehead, Oxford University, UK
Volume 1
Edited by
Dirk Berg-Schlosser, Bertrand Badie and Leonardo Morlino
SAGE Publications Ltd, 1 Oliver’s Yard, 55 City Road, London EC1Y 1SP
SAGE Publications Inc., 2455 Teller Road, Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd, B 1/I 1 Mohan Cooperative Industrial Area, Mathura Road, New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd, 3 Church Street, #10-04 Samsung Hub, Singapore 049483
Editor: Natalie Aguilera
Editorial Assistant: Umeeka Raichura
Production Editor: Jessica Masih
Copyeditor: Sunrise Setting Ltd
Proofreader: Sunrise Setting Ltd
Indexer: Cathryn Pritchard
Marketing Manager: Chazelle Keeton
Cover Design: Naomi Robinson
Typeset by Cenveo Publisher Services
Printed in the UK
Preface, Introduction & Editorial arrangement © Dirk Berg-Schlosser, Bertrand Badie, & Leonardo Morlino, 2020 Chapter 1 © Siddharth Mallavarapu, 2020 Chapter 2 © Jun Ayukawa, 2020 Chapter 3 © Gianfranco Poggi, 2020 Chapter 4 © James F. Hollifield and Hiroki Takeuchi, 2020 Chapter 5 © Timofey Agarin, 2020 Chapter 6 © Marian Sawer, 2020 Chapter 7 © Dingping Guo, 2020 Chapter 8 © B. Guy Peters and Jon Pierre, 2020 Chapter 9 © Furio Cerutti, 2020 Chapter 10 © Yves Schemeil, 2020 Chapter 11 © Jack Paine and Scott A. Tyson, 2020 Chapter 12 © Richard Beardsworth, 2020 Chapter 13 © Henrik P. Bang, 2020 Chapter 14 © Andreas Anter and Hinnerk Bruhns, 2020 Chapter 15 © Rudra Sil, 2020 Chapter 16 © Uwe Wagschal and Felix Ettensperger, 2020 Chapter 17 © Derek Beach, 2020 Chapter 18 © Michael Baumgartner, 2020 Chapter 19 © Zachary Elkins, 2020 Chapter 20 © Claudius Wagemann, 2020 Chapter 21 © Hans Keman, 2020 Chapter 22 © Anna Bassi, 2020 Chapter 23 © Einar Berntzen, 2020 Chapter 24 © Terrell Carver, 2020 Chapter 25 © Nathaniel Beck, 2020 Chapter 26 © Manfred Max Bergman, 2020 Chapter 27 © Jonathon Moses, 2020 Chapter 28 © Bruno Cautrès, 2020 Chapter 29 © Herbert Kitschelt, 2020 Chapter 30 © Ursula Hoffmann-Lange, 2020 Chapter 31 © Ireneusz Pawel Karolewski, 2020 Chapter 32 © Liborio Mattina, 2020 Chapter 33 © Daniel-Louis Seiler, 2020 Chapter 34 © Roland Czada, 2020 Chapter 35 © Oscar Gabriel, 2020 Chapter 36 © Gianpietro Mazzoleni and Cristopher Cepernich, 2020 Chapter 37 © Dirk Berg-Schlosser, 2020 Chapter 38 © Maria Marczewska-Rytko, 2020 Chapter 39 © Donatella della Porta, 2020 Chapter 40 © Manuel Antonio Garretón and Nicolás Selamé, 2020 Chapter 41 © Yannis Papadopoulos, 2020 Chapter 42 © Oliver Schlumberger and Tasha Schedler, 2020 Chapter 43 © Philippe C. Schmitter, 2020 Chapter 44 © Bernard Grofman, 2020 Chapter 45 © Ferdinand Müller-Rommel and Michelangelo Vercesi, 2020 Chapter 46 © Surinder Kler Shukla, 2020 Chapter 47 © Jean-François Gagné and Anne-Laure Mahé, 2020
Chapter 48 © Daniela Piana, 2020 Chapter 49 © Werner J. Patzelt, 2020 Chapter 50 © Hans-Joachim Lauth, 2020 Chapter 51 © Jennifer Cyr and Alexis Work, 2020 Chapter 52 © Laurence Whitehead, 2020 Chapter 53 © Jeffrey Haynes, 2020 Chapter 54 © Jeeyang Rhee Baum, 2020 Chapter 55 © Edeltraud Roller, 2020 Chapter 56 © I. William Zartman, 2020 Chapter 57 © B. Guy Peters, 2020 Chapter 58 © Bo Rothstein, 2020 Chapter 59 © Carlos R. S. Milani, 2020 Chapter 60 © Harald Sætren, 2020 Chapter 61 © Leonardo Avritzer, 2020 Chapter 62 © Hellmut Wollmann, 2020 Chapter 63 © Eva G. Heidbreder and Daniel Schade, 2020 Chapter 64 © Giliberto Capano, 2020 Chapter 65 © Evert Vedung, 2020 Chapter 66 © Michael Howlett, 2020 Chapter 67 © Claire A. Dunlop and Claudio M. Radaelli, 2020 Chapter 68 © Rajesh Chakrabarti and Kaushiki Sanyal, 2020 Chapter 69 © David Levi-Faur and Yael Kariv-Teitelbaum, 2020 Chapter 70 © Maurizio Ferrera, 2020 Chapter 71 © Geoffrey Wiseman, 2020 Chapter 72 © Jonathan Paquin, 2020 Chapter 73 © Helen V. Milner, 2020 Chapter 74 © Stéphane Paquin, 2020 Chapter 75 © Richard Ned Lebow, 2020 Chapter 76 © Gunther Hellmann, 2020 Chapter 77 © David M. Malone and Rohinton P. Medhora, 2020 Chapter 78 © Herfried Münkler, 2020 Chapter 79 © Mohammad-Mahmoud Ould Mohamedou, 2020 Chapter 80 © Louise Fawcett, 2020 Chapter 81 © Klaus Schlichte and Elizaveta Gaufman, 2020 Chapter 82 © Jeffrey D. Maslanik, 2020 Chapter 83 © Charles-Philippe David and Alexis Rapin, 2020 Chapter 84 © Bertrand Badie, 2020 Chapter 85 © Tancrède Voituriez, 2020 Chapter 86 © Salvador Santino Fulo Regilme Jr., 2020 Chapter 87 © Christoph Rass, 2020 Chapter 88 © Karim Emile Bitar and Charles Thibout, 2020 Chapter 89 © Schirin Amir-Moazami, 2020 Chapter 90 © Hanspeter Kriesi, 2020 Chapter 91 © Scott Mainwaring and Fernando Bizzarro, 2020 Chapter 92 © Atta El-Battahani, 2020
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers. At SAGE we take sustainability seriously. Most of our products are printed in the UK using responsibly sourced papers and boards. When we print overseas we ensure sustainable papers are used as measured by the PREPS grading system. We undertake an annual audit to monitor our sustainability.
Library of Congress Control Number: 2019948041
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 978-1-5264-5955-8
Contents
List of Figures and Tables xiii
Notes on the Editors and Contributors xvi
Preface (Dirk Berg-Schlosser, Leonardo Morlino and Bertrand Badie) xxxvi
Volume 1
Introduction (Dirk Berg-Schlosser, Leonardo Morlino and Bertrand Badie) 1
PART I POLITICAL THEORY 11
1. Comparative Political Theory (Siddharth Mallavarapu) 13
2. Constructivism (Jun Ayukawa) 30
3. Emile Durkheim’s Sociological Insight Into Political Phenomena (Gianfranco Poggi) 48
4. Economic Analysis in Political Science (James F. Hollifield and Hiroki Takeuchi) 64
5. Functionalism and Its Legacy (Timofey Agarin) 83
6. Feminist Political Science (Marian Sawer) 96
7. Marx and Marxism in Politics (Dingping Guo) 114
8. The New Institutionalism in Political Science (B. Guy Peters and Jon Pierre) 133
9. How to Understand Normative Political Theory (Furio Cerutti) 153
10. Political Anthropology and Its Legacy (Yves Schemeil) 170
11. Uses and Abuses of Formal Models in Political Science (Jack Paine and Scott A. Tyson) 188
12. Postmodernism Past, Present and Future (Richard Beardsworth) 203
13. David Easton’s Political Systems Analysis (Henrik P. Bang) 211
14. Max Weber and the Weberian Tradition in Political Science (Andreas Anter and Hinnerk Bruhns) 233
PART II METHODS 253
15. The Survival and Adaptation of Area Studies (Rudra Sil) 255
16. Big Data in Social Sciences (Uwe Wagschal and Felix Ettensperger) 272
17. Case Studies and Process Tracing (Derek Beach) 288
18. Causation (Michael Baumgartner) 305
19. Concept Regulation in Political Science (Zachary Elkins) 322
20. Configurative Methods (Claudius Wagemann) 341
21. Designing a Research Project (Hans Keman) 357
22. Experiments (Anna Bassi) 373
23. Historical and Longitudinal Analyses (Einar Berntzen) 390
24. Interpretative Methods (Terrell Carver) 406
25. Methodology: Qualitative and Quantitative Approaches (Nathaniel Beck) 423
26. Mixed Method and Multimethod Research and Design (Manfred Max Bergman) 437
27. Ontologies, Epistemologies and the Methodological Awakening (Jonathon Moses) 447
28. Survey Research (Bruno Cautrès) 464
Volume 2
PART III POLITICAL SOCIOLOGY 477
29. Clientelism (Herbert Kitschelt) 479
30. Elites (Ursula Hoffmann-Lange) 499
31. Identities (Ireneusz Pawel Karolewski) 517
32. Interest Group Systems in the Age of Globalization (Liborio Mattina) 530
33. Parties (Daniel-Louis Seiler) 548
34. Pluralism (Roland Czada) 567
35. Political Behavior (Oscar Gabriel) 584
36. Political Communication (Gianpietro Mazzoleni and Cristopher Cepernich) 602
37. Political Cultures (Dirk Berg-Schlosser) 619
38. Political Socialization (Maria Marczewska-Rytko) 641
39. Social Movements (Donatella della Porta) 656
40. Social Structure (Manuel Antonio Garretón and Nicolás Selamé) 674
PART IV COMPARATIVE POLITICS 693
41. Political Accountability (Yannis Papadopoulos) 695
42. Authoritarianisms and Authoritarianization (Oliver Schlumberger and Tasha Schedler) 712
43. Democracies (Philippe C. Schmitter) 730
44. Electoral Systems (Bernard Grofman) 744
45. Executive Power (Ferdinand Müller-Rommel and Michelangelo Vercesi) 760
46. Federalisms (Surinder Kler Shukla) 776
47. Hybrid Regimes (Jean-François Gagné and Anne-Laure Mahé) 784
48. Judicial Power (Daniela Piana) 799
49. Legislative Power (Werner J. Patzelt) 814
50. Legitimacy and Legitimation (Hans-Joachim Lauth) 833
51. Political Competition (Jennifer Cyr and Alexis Work) 852
52. Regime Change (Laurence Whitehead) 867
53. Religion and Politics (Jeffrey Haynes) 884
54. Responsiveness (Jeeyang Rhee Baum) 900
55. Political Performance and State Capacity (Edeltraud Roller) 916
56. State Formation and Failure (I. William Zartman) 934
Volume 3
PART V PUBLIC POLICIES AND ADMINISTRATION 951
57. Bureaucracy and Bureaucratic Effectiveness (B. Guy Peters) 953
58. Corruption (Bo Rothstein) 970
59. Governance (Carlos R. S. Milani) 986
60. Implementation (Harald Sætren) 1001
61. Informal Governance and Participatory Institutions (Leonardo Avritzer) 1023
62. Local Politics (Hellmut Wollmann) 1034
63. Policies beyond the State (Eva G. Heidbreder and Daniel Schade) 1052
64. Politics and Policy (Giliberto Capano) 1065
65. Policy Evaluation (Evert Vedung) 1080
66. Policy Instruments (Michael Howlett) 1105
67. Policy Learning (Claire A. Dunlop and Claudio M. Radaelli) 1121
68. Policy Making: Models (Rajesh Chakrabarti and Kaushiki Sanyal) 1134
69. Regulation (David Levi-Faur and Yael Kariv-Teitelbaum) 1152
70. Welfare State (Maurizio Ferrera) 1173
PART VI INTERNATIONAL RELATIONS 1191
71. Diplomacy (Geoffrey Wiseman) 1193
72. Foreign Policy Analysis (Jonathan Paquin) 1214
73. Globalization (Helen V. Milner) 1231
74. International Political Economy (Stéphane Paquin) 1250
75. International Political Theory (Richard Ned Lebow) 1266
76. International Relations Theory (Gunther Hellmann) 1282
77. Multilateralism (David M. Malone and Rohinton P. Medhora) 1300
78. The New Wars (Herfried Münkler) 1320
79. In Search of the Non-Western State: Historicising and De-Westphalianising Statehood (Mohammad-Mahmoud Ould Mohamedou) 1335
80. Regionalism (Louise Fawcett) 1349
81. State, Power and Security (Klaus Schlichte and Elizaveta Gaufman) 1366
82. Transnational Relations as a Field (Jeffrey D. Maslanik) 1382
83. War and Peace (Charles-Philippe David and Alexis Rapin) 1400
PART VII MAJOR CHALLENGES FOR POLITICS AND POLITICAL SCIENCE IN THE 21ST CENTURY 1419
84. Changes of International Power Relations (Bertrand Badie) 1421
85. Environmental Changes (Tancrède Voituriez) 1437
86. Human Rights and Humanitarian Interventions in the International Arena (Salvador Santino Fulo Regilme Jr) 1456
87. International Migration (Christoph Rass) 1474
88. International Violence (Karim Emile Bitar and Charles Thibout) 1490
89. Minorities: Empirical and Political-Theoretical Reflections on a Cunning Concept (Schirin Amir-Moazami) 1508
90. Populism (Hanspeter Kriesi) 1524
91. Outcomes after Democratic Transitions in Third-Wave Democracies (Scott Mainwaring and Fernando Bizzarro) 1540
92. New Wars in the Global South (Atta El-Battahani) 1558
Glossary 1575
Index 1587
List of Figures and Tables
Figures
4.1 Trends in international migration: A ‘Crisis’? 72
4.2 A typology of international regimes 73
4.3 The dilemmas of migration governance 74
4.4 Migration interdependence 77
13.1 Easton’s four-function model 218
13.2 Regime structuration 221
16.1 Different graphical user interfaces of the Debat-O-Meter used in TV debate 279
17.1 A two-stage evidence-evaluation framework for turning empirical material into evidence of mechanisms in process tracing 295
17.2 Four types of cases in process tracing 297
19.1 Results from a Google search: ‘us constitution’ 335
19.2 A snapshot of Constitute’s topic tree 336
19.3 Entering ‘women’ in Constitute’s search box triggers topics 336
27.1 Aggregate Web of Science results 454
27.2 Regional trends for ‘ontolog*’ search, Web of Science 455
27.3 Regional trends for ‘epistemolog*’ search, Web of Science 455
27.4 A methodological mapping 458
28.1 The different steps of a survey 466
28.2 The two parallel processes of controlling the ‘total survey error’ 467
37.1 Components of political culture in a system framework 622
37.2 Levels of analysis 626
37.3 Huntington’s ‘The world of civilizations: post-1990’ 629
37.4 From allegiant to assertive citizens 635
37.5 From allegiant to assertive citizens (trajectories) 636
58.1 Corruption as keyword in scientific articles 971
65.1 The general system model adapted to public intervention 1082
65.2 Goal-attainment evaluation focused on goals-results 1086
65.3 Side effects model with specified pigeonholes for side effects 1087
65.4 Potential stakeholders in local social welfare interventions 1090
65.5 The policy instruments triad with affirmatives and negatives 1092
65.6 Main effects, side effects, perverse effects, and null effects 1093
65.7 Basic elements of New Public Management 1098
66.1 A taxonomy of substantive policy instruments (cells provide examples of instruments in each category) 1107
66.2 A spectrum of substantive policy instruments 1107
66.3 A resource-based taxonomy of procedural policy tools (cells provide examples of instruments in each category) 1109
66.4 A spectrum of procedural policy instruments 1109
68.1 Pictorial representation of the legislative strategy framework 1146
68.2 The Stacey diagram 1148
69.1 Annual creation of regulatory agencies in the sample (left); cumulative annual creation of agencies in the sample (right) (1950–2017) 1154
85.1 Multilateral (MEA) and bilateral (BEA) environmental agreements 1800–2018 1441
85.2 Mapping out international relations theories of international cooperation 1449
91.1 Outcomes and levels of democracy at transition 1552
Tables
8.1 Types of informal institutions 141
11.1 Key differences between phenomenon and experimental approaches 190
13.1 Bureaucracy, technocracy and democracy 227
17.1 Foundational assumptions of case- and variance-based approaches 289
17.2 ‘Unpacked’ causal mechanism of nuclear taboo 293
17.3 Template for transparent evaluation of mechanistic evidence in process tracing 295
17.4 Mapping cases using a similarity graph 296
17.5 Four variants of process tracing 301
19.1 Selected datasets in the constitutional domain 331
43.1 Two realistic and two idealistic models of democracy 733
43.2 Two perverse models of democracy 734
45.1 Executive power in 55 authoritarian countries, 2010 (percentage of countries) 764
45.2 Executive power in 103 democratic countries, 2017 (percentage of countries) 766
45.3 Women executives in office on December 31, 2017 by country, office and regime 769
51.1 General trends in electoral volatility in different regions 861
55.1 Typology of performance criteria 921
60.1 Type of data in core journal articles by time period published. Percentage base: empirical articles 1005
60.2 Level of empirical analysis in core journal articles by time periods. Percentage base: empirical articles 1005
60.3 Core journal articles by regional focus/origin and time period published. Percentage base: all sample articles 1006
60.4 Core journal articles by most frequent type of policy studied and time period published. Percentage base: articles with a policy focus 1006
60.5 Core journal articles by region of focus/origin and research methodologies. Percentage base: empirical articles 1011
60.6 Core journal articles by regional focus/origin and research methodologies before and after the mid 1990s. Percentage base: empirical articles 1012
60.7 Core journal articles by regional focus/origin and gender profile of authors before and after the mid 1990s. Percentage base: all sampled articles 1012
60.8 Articles by type of core field journal published in and time period. Percentage base: all sample articles 1012
61.1 Models of informal governance 1029
61.2 Expansion of informal governance to semi- or non-democratic contexts 1030
65.1 Eight questions approach to public policy evaluation 1082
65.2 Substance-only and Substance/Cost value criteria 1085
65.3 Positivist ranking list of research designs for causal impact 1094
65.4 Diffusion-oriented strategy for improved use of evaluation processes and products 1095
65.5 Medical bent evidence hierarchy of research designs for causal impact 1100
68.1 Lowi’s policy typologies and resulting politics 1136
68.2 Wilson’s policy typologies 1137
68.3 Timelines of the selected laws and bills 1142
68.4 Proposed legislative strategy framework 1146
69.1 Three theoretical perspectives on regulation: basic premises, actors’ motivations, capture and regulatory expansion 1161
77.1 Key characteristics of regional financial arrangements (relative to the IMF) 1312
77.2 The Fed’s and PBOC’s swap arrangements 1313
80.1 Early, first-wave regionalisms 1354
80.2 New, second-wave regionalisms (post 1980) 1357
91.1 Democratic breakdowns in the third wave 1544
91.2 Democratic erosions without breakdown 1545
91.3 Stagnations after democratic transitions 1546
91.4 Democratic advances in the third wave 1547
91.5 Highly democratic without major advances 1548
91.6 Regression results 1551
92.1 New wars classification 1560
Notes on the Editors and Contributors
The Editors
Dirk Berg-Schlosser is Professor Emeritus at Philipps-University, Marburg. He has been awarded degrees of Dr oec publ (Munich 1971), Dr phil habil (Augsburg 1978) and PhD (UC Berkeley 1979). He has been Director of the Institute of Political Science and Dean of the Faculty of Social Sciences at Marburg. He has researched and taught at the universities of Munich, Aachen, Augsburg, Eichstaett, Nairobi, Stellenbosch/South Africa and Berkeley. From 2003 to 2006 he was Chair of the European Consortium for Political Research (ECPR) and from 2006 to 2009 Vice-President of the International Political Science Association (IPSA). From 2010 to 2016 he was founder and coordinator of the IPSA Summer Schools on Research Methods at the universities of Sao Paulo, Stellenbosch, Singapore, Ankara, Mexico City and St Petersburg. His research interests include political culture, empirical democratic theory, development studies, comparative politics and comparative methodology. He is a Fellow of the Stellenbosch Institute of Advanced Studies (STIAS) and a member of the Transformation Research Unit (TRU) at Stellenbosch University. Major recent publications in English include: Democratization – the State of the Art (2007, 2nd edition), International Encyclopedia of Political Science (2011, co-edited with Bertrand Badie and Leonardo Morlino), Mixed Methods in Comparative Politics (2012) and Political Science – A Global Perspective (2017, with Bertrand Badie and Leonardo Morlino).
Bertrand Badie is Emeritus Professor of Political Science and International Relations at Sciences Po Paris. He has published about 30 books about the state, comparative politics and international relations, including The Imported State (2000), The Diplomacy of Connivance (2012), Rethinking International Relations (2020, Elgar Pub), Political Science (2017, with Dirk Berg-Schlosser and Leonardo Morlino), Humiliation in International Relations (2017) and New Perspectives on International Order (2018). He co-edited the International Encyclopedia of Political Science (2011) with Dirk Berg-Schlosser and Leonardo Morlino.
Leonardo Morlino is Emeritus Professor of Political Science and President of the International Research Center on Democracies and Democratizations at LUISS, Rome. He was President of the International Political Science Association (IPSA) (2009–12). He is the author of more than 40 books and more than 200 journal essays and book chapters published in English, French, German, Spanish, Hungarian, Chinese, Mongolian and Japanese. His most recent books include Equality, Freedom and Democracy. Europe After the Great Recession (forthcoming); The Impact of Economic Crisis on South European Democracies (2017, with F. Raniolo; London, Palgrave; Italian transl. 2018); The Quality of Democracy in Latin America (IDEA, 2016); Comparison. A Methodological Introduction for the Social Sciences (Leverkusen and London, Barbara Budrich Publ); and Changes for Democracy (2011).
The Contributors
Timofey Agarin is Senior Lecturer in Politics at Queen’s University Belfast, where he is also the Director of the Centre for the Study of Ethnic Conflict. His research interests are ethnic politics and their impact on societal transition, including majority–minority relations, nondiscrimination, migration and civil society, with a particular focus on post-communist states in Central and Eastern Europe.
Schirin Amir-Moazami is Professor at the Institute of Islamic Studies at Freie Universität Berlin and head of the research and teaching program on Islam in Europe. She is also Principal Investigator in the Excellence Initiative “Contestations of the Liberal Script” (SCRIPTS) and Principal Investigator at the Graduate School Muslim Cultures and Societies at FU Berlin. Her research interests include configurations of political secularism in Europe, Islamic practices and politics of academic knowledge production.
Andreas Anter is Professor of Political Science at the Faculty of Economics, Law and Social Sciences at the University of Erfurt. After studying political science and sociology in Münster, Freiburg and Hamburg, and receiving his PhD in Hamburg (1994), he taught political theory and domestic politics at the Universities of Hamburg, Leipzig and Bremen. He is the author of Max Weber’s Theory of the Modern State (2014), Max Weber und die Staatsrechtslehre (2016), and Theorien der Macht zur Einführung (2018, 4th edition).
Leonardo Avritzer is a full professor of Political Science at the Federal University of Minas Gerais. He has a Ph.D. in political sociology from the New School for Social Research, where his dissertation received the Albert Salomon Dissertation Award. He was a visiting professor at several universities: University of São Paulo (2004), Tulane University (2008) and recurring visiting professor at the University of Coimbra. Avritzer is also the author of “Democracy and the Public Space in Latin America”, published by Princeton University Press, “The Two Faces of Institutional Innovation: Promises and Limits of Democratic Participation in Latin America” (2017) and “Los Desafios de la Participación en América Latina” (2014).
Jun Ayukawa is Professor at the School of Law and Politics, Kwansei Gakuin University. He has published Juvenile Crimes and Social Problems in Japan: A Social Constructionist Perspective (2019) in English, as well as six Japanese books based on social constructivism. The most recent of them is Crime and Criminal Policy Studied from the Viewpoint of International and Cultural Comparison (2017).
Henrik P. Bang is professor of governance at the Institute for Governance and Policy Analysis (IGPA), the University of Canberra. He is mostly known for his concepts of the Everyday Maker, the Expert Citizen and Culture Governance. He is currently working on a book about Habermas and the New Populisms.
Anna Bassi is Associate Professor of Politics at the University of North Carolina, Chapel Hill. Prior to joining UNC, she studied economics at Sant’Anna School of Advanced Studies, where she received a PhD in economics and management in 2006. She also studied political science at New York University, where she obtained a PhD in politics in 2010. Her research interests lie at the intersection of economics and political science in the areas of formal theory and experimental methods, with applications to comparative politics, voting and risk attitudes.
Jeeyang Rhee Baum is Adjunct Lecturer in Public Policy at Harvard Kennedy School of Government. Her research and teaching interests include comparative political institutions, administrative law and regulatory reform and political economy of bureaucratic reform, particularly as they relate to the enhancement of accountability, responsiveness and public participation in policy development. Previously, she was a Research Fellow at the Ash Center for Democratic Governance and Innovation and a Visiting Scholar at the Weatherhead Center for International Affairs at Harvard University.
Michael Baumgartner is a Professor of Philosophy at the Department of Philosophy of the University of Bergen. His research focuses on questions in the philosophy of science and logic, more specifically on causation and causal explanation, data analysis with configurational comparative methods, regularity theories, interventionism, mechanistic explanation, determinism, logical formalisation, argument reconstruction and modelling in the social sciences. He has developed the method of coincidence analysis (CNA) and is a co-author of the corresponding CNA software package for the R environment.
Derek Beach is a Professor of Political Science at the University of Aarhus, where he teaches case study methodology, international relations and European integration. He has authored articles, chapters and books on research methodology, international negotiations, referendums and European integration, and he co-authored the books Process-tracing Methods: Foundations and Guidelines and Causal Case Studies. He has taught qualitative case study methods at ECPR and IPSA summer and winter schools, held short courses at the APSA annual meeting on process-tracing and case-based research and numerous workshops and seminars on qualitative methods throughout the world. He is also an academic co-convenor of the ECPR Methods Schools.
Richard Beardsworth is Professor of International Politics and Head of School, Politics and International Studies, University of Leeds. He is also Research Associate at the Institut des Etudes Politiques (SciPo), Paris. He was previously E.H. Carr chair in International Politics and Head of Department, International Politics, Aberystwyth University. Past interests were in continental political philosophy (Derrida and the Political, 1996; Nietzsche, 1997). His main interests now lie in international normative theory, the global challenges of climate change and planetary sustainability, and state leadership (The State and Cosmopolitan Responsibility, 2019). Recent publications rehearse a Weberian and republican account of ethical responsibility towards global challenges that re-aligns national and global interests and duties.
Nathaniel Beck is Professor of Politics at New York University. He is the founding editor of Political Analysis, a winner of the Lifetime Achievement Award from the Society for Political Methodology and a Fellow of the American Academy of Arts and Sciences.
Manfred Max Bergman holds the Chair of Social Research and Methodology at the Department of Social Sciences, University of Basel. He is president of the Swiss Academic Society for Environmental Research and Ecology (SAGUF) and member of the Research Council of the Swiss National Science Foundation, the Uganda National Academy of Sciences, and the Sustainable Development Solutions Network (SDSN), a global initiative for the United Nations. His research is focused on sustainability in relation to the UN Sustainable Development Goals and the UN Global Compact, specifically how sustainability intersects business and society in a globalised world. His recent publications deal with the business-society nexus in China, India, and the United States. Pursuing alternative forms of policy-relevant and change-oriented research, he is also working on a new research approach, entitled social transitions research (STR).
Einar Berntzen is Associate Professor at the Department of Comparative Politics, University of Bergen. He has authored numerous book chapters and articles on Latin American and European politics. Among his latest publications are ‘Demokratiseringen av Sør-Europa’ (2015) in Historien og idéene, ‘Rokkan in the Andes. Cleavages, Party Systems and the Emergence of New Leftist Parties’ (2016) in Norwegian Social Thought on Latin America, ‘State- and Nation-Building in the Nordic Region: Particular Characteristics’ (2017) in The Nordic Models in Political Science. Challenged, but Still Viable? and ‘Norsk bistand til Latin-Amerika’ (2017) in Norge i Latin-Amerika. Forbindelser og forestillinger.
Karim Emile Bitar is Acting Director of the Institute of Political Science at the Saint Joseph University of Beirut (USJ) and Director of the Arab Master in Democracy and Human Rights (Global Campus of Human Rights). He is Associate Professor of International Relations at USJ, and Lecturer in Middle East Studies at Ecole Normale Supérieure (ENS-Lyon). He is a Senior Fellow at the Institute for International and Strategic Relations in Paris (IRIS) and the Editor of French monthly public affairs magazine L’ENA hors les murs. He is an Associate Fellow at the Geneva Center for Security Policy (GCSP). He co-edited and co-wrote the collective books Regards sur la France (2007, with Robert Fadel), and Le Cèdre et le Chêne, De Gaulle et le Liban (2015, with Clotilde de Fouchécour). He has also authored numerous book chapters and articles in leading publications including The New York Times, Le Monde, Le Monde diplomatique, Libération, An-Nahar, L’Orient-Le Jour, Informed Comment, Atlantico and La Vanguardia. He frequently testifies before the Foreign Affairs Committees of the French and European Parliaments.
Fernando Bizzarro is a PhD Candidate in Political Science at Harvard, and a Research Associate at the David Rockefeller Center for Latin American Studies. He studies the nature, causes, and consequences of political institutions, especially political parties and democracy. He has published book chapters and journal articles on the topic.
Hinnerk Bruhns is Director of Research Emeritus at CNRS, affiliated to the Centre de recherches historiques (EHESS/CNRS) in Paris. After his PhD in history in Cologne (1973), he taught at the Universities of Aix-en-Provence and Bochum, and at the EHESS in Paris. He is the editor of Trivium. Revue franco-allemande en sciences sociales et humaines. His books include Max Weber und die Stadt im Kulturvergleich (2000, with Wilfried Nippel), Max Webers historische Sozialökonomie/L’économie de Max Weber entre histoire et sociologie (2014) and Max Weber und der Erste Weltkrieg (2017).
Giliberto Capano is Professor of Political Science and Public Policy at the University of Bologna. He is the editor of Policy & Society. He has been a member of the Executive Committee of the International Political Science Association (2009–14) and was the cofounder of the International Public Policy Association. He is a member of the Executive Committee of the European Consortium of Political Research. He specialises in comparative public policy, policy design, policy instruments and change. His latest books are Changing Governance in Universities. Italian Higher Education in Comparative Perspective (2016, with M. Regini and M. Turri) and Designing for Policy Effectiveness. Defining and Understanding a Concept (2018, with M. Howlett, I. Mukherjee, M-H. Chou, G. Peters and P. Ravinet).
Terrell Carver is Professor of Political Theory at the University of Bristol. He has published widely on Marx, Engels and Marxisms, and on sex, gender and sexualities. He has also written many reference-book entries and articles on topics in the philosophy of social science, and he teaches discourse and visual analysis at the IPSA Methods School at the National University of Singapore. His research papers incorporate empirical studies of meaning-making within urban spaces and the built environment, and genre analysis of ideology construction through intellectual biography and popular cinema. Bruno Cautrès is a senior CNRS Research Fellow at the Centre de Recherches Politiques (CEVIPOF) at Sciences Po Paris. He is a specialist in voting behaviour, political attitudes and quantitative methods. He has been involved in major cross-national survey projects like the ISSP, the EVS and the ESS and in French national election studies. He teaches quantitative methods at different international summer schools (ECPR and IPSA) and at Sciences Po. Cristopher Cepernich is a sociologist of media and politics at the University of Turin. As Associate Professor, he teaches sociology of communication and media systems and ICT. He is Director of the Observatory on Political and Public Communication of the Department of Culture, Politics and Society. He is Scientific Director of the Master’s in Journalism, ‘Giorgio Bocca’. With Roberta Bracciale, he is scientific coordinator of Policom.Online, a national research and monitoring group on digital political communication. His main research interests are focused on election campaigns and digital communication. Among his recent works are Le campagne elettorali al tempo della networked politics (2017) and ‘Love and Hate in Politics. The Emotionalization of Political Communication’ (Comunicazione Politica, January 2018, edited with Edoardo Novelli). 
Furio Cerutti is Professor of Political Philosophy emeritus at the Università di Firenze and Affiliate Professor at the Scuola superiore S.Anna, Pisa. He has been a Visiting Scholar or Professor at the J.W. Goethe Universität, Frankfurt am Main, Harvard University, the Université de Paris 8, the Humboldt Universität zu Berlin, the London School of Economics and Political Science, 外交学院 (China Foreign Affairs University), Beijing and Stanford University in Florence. He is a Research Alumnus of the Ruprecht-Karl Universität, Heidelberg. Besides the works quoted in his chapter, Cerutti has published The Search for a European Identity: Values, Policies and Legitimacy of the European Union, ed. with S. Lucarelli, Routledge: London 2008; Brauchen die Europäer eine Identität? ed. with E. Rudolph, Zürich: Orell Füssli 2011;全球治理:挑战与趋势 (Global Governance: Challenges and Trends), ed. with Zhu Liqun and Lu Jing, Beijing: 社会科学文献出版社 (Social Science Academic Press), 2014. Rajesh Chakrabarti is the Dean at the Jindal Global Business School, Jindal Global University and co-founder of Sunay Policy Advisory Pvt. Ltd. He was a faculty member at the University of Alberta, Georgia Tech and the Indian School of Business (ISB). At ISB, he became the founding Executive Director of the Bharti Institute of Public Policy. He also led the Research and Policy vertical at the Wadhwani Foundation. Rajesh is an alumnus of Presidency College, Calcutta and IIM Ahmedabad and earned his PhD from the University of California, Los Angeles. Jennifer Cyr is Associate Professor of Political Science and Latin American Politics at the University of Arizona. She writes on political representation, identity and democracy in Latin America, and on the rigorous integration of qualitative methods, and especially focus groups, into mixed-methods research. She has published two books, The Fates of Political Parties:
Institutional Crisis, Continuity, and Change in Latin America (2017) and Focus Groups for the Social Science Researcher (2019), and has articles in several journals, including Comparative Political Studies, Comparative Politics, Studies in Comparative International Development, Sociological Methods and Research and Revista de Ciencia Política. Roland Czada is Chair in Government and Public Policy, University of Osnabrück. He received his MA in 1979 (University of Tübingen) and his doctorate in 1986 (University of Constance). He held academic positions at the Free University Berlin, the University of Constance and the Max Planck Institute for the Study of Societies, Cologne, as well as visiting appointments at the Humboldt University Berlin (1993), the University of Cape Town (2001/2), and the University of Tokyo (2003). His current work is on energy policies, welfare-state reform, non-majoritarian politics and negotiation democracy. In English his publications include ‘“Post-Democracy” and the Public Sphere: Informality and Transparency in Negotiated Decision-Making’ (2015). Charles-Philippe David is Full Professor of Political Science, President of the Centre for United States Studies, as well as the Founder of the Raoul Dandurand Chair of Strategic and Diplomatic Studies at the University of Québec at Montréal. He has authored and directed many French- and English-edited scholarly publications on American foreign policy and international security, such as La Guerre et la Paix: Approches et enjeux de la sécurité et de la stratégie (2020, 4th edition, with Olivier Schmitt). Donatella della Porta is Professor of Political Science, Dean of the Department of Political and Social Sciences and Director of the PhD programme in Political Science and Sociology at the Scuola Normale Superiore in Florence, where she also leads the Center on Social Movement Studies (Cosmos). 
The main topics of her research include social movements, political violence, terrorism, corruption, the police and protest policing. Among her very recent publications are Legacies and Memories in Movements (2018), Sessantotto. Passato e presente dell’anno ribelle (2018), Contentious Moves (2017), Global Diffusion of Protest (2017), Late Neoliberalism and its Discontents (2017), Movement Parties in Times of Austerity (2017), Where Did the Revolution Go? (2016); Social Movements in Times of Austerity (2015), Methodological Practices in Social Movement Research (2014), Spreading Protest (2014, with Alice Mattoni), Participatory Democracy in Southern Europe (2014, with Joan Font and Yves Sintomer), Mobilizing for Democracy (2014), Can Democracy Be Saved? (2013), Clandestine Political Violence (2013, with D. Snow, B. Klandermans and D. McAdam) and Blackwell Encyclopedia on Social and Political Movements (2013). Claire A. Dunlop is Professor of Politics and Public Policy at the University of Exeter. A public policy and administration scholar, her main fields of interest include the politics of expertise and knowledge utilisation, risk governance, policy learning and analysis, impact assessment and policy narratives. Since 2014, Claire has been an editor of Public Policy and Administration. Atta El-Battahani is Professor of Political Science, Khartoum University (Sudan). He received his PhD from Sussex University. He was the Head of the Department of Political Science, Khartoum University (2003–6), a founding member of Amnesty International Khartoum Group (1987–9), the Sudanese Civil Society Network for Poverty Alleviation (SCSNPA) (2002–5) and Sudan Country Manager of International Institute for Democracy and Electoral Assistance
(2006–10). His areas of research and publication include ethnic and religious conflicts in the Third World, governance and state institutional reform, gender politics and peripheral capitalism and political Islam. He is currently the Editor-in-Chief of Sudan Journal of Economic and Social Studies. Zachary Elkins (Department of Government, University of Texas at Austin) studies issues of democracy, institutional reform, research methods and national identity, with an emphasis on cases in Latin America. His current research centres on the origins and consequences of national constitutions. Elkins earned his BA from Yale University, an MA from the University of Texas at Austin and his PhD from the University of California, Berkeley. Felix Ettensperger is a PhD candidate and Political Science Lecturer at Albert-Ludwigs-Universität in Freiburg. His research focus is the development and improvement of conflict-forecasting models using self-learning algorithms and neural networks. He is currently working on various research projects incorporating advanced data-clustering techniques, text mining and other big-data methods in the field of political science. He currently holds the position of Assistant Managing Editor of the bi-annually published scientific journal Statistics, Politics and Policy. Louise Fawcett is Professor of International Relations and Head of the Department of International Relations at the University of Oxford. She is also the Wilfrid Knapp Fellow and Tutor in Politics at St Catherine’s College. She is the author/editor of many works on regionalism including Regionalism in World Politics (with Andrew Hurrell), published by Oxford University Press. Maurizio Ferrera is Professor of Political Science at the University of Milan, Italy. He is currently one of the PIs of ERC Synergy project SOLID–Policy Crisis and Crisis Politics. 
Sovereignty, Solidarity and Identity in the EU post 2008, and in June 2019 he completed his former ERC Advanced project REScEU – Reconciling Economic and Social Europe (www.resceu.eu). His main research interests include comparative welfare states, European integration, Italian politics and political theory. He is the author of The Boundaries of Welfare (2005; French translation in 2009) and his most recent articles have appeared in the European Journal of Political Research, the European Journal of Public Policy, the Journal of Common Market Studies and the Journal of European Social Policy. His latest book in Italian is Il Quinto Stato (2019). Oscar Gabriel is Emeritus Professor of Comparative Politics at the University of Stuttgart. His fields of research are political attitudes and behaviour. During his academic career he held positions at the universities of Mainz, Bamberg and Stuttgart and was Visiting Professor at the University of Vienna and at Sciences Po Bordeaux. From 2001 to 2013 he was a member of the German National Coordinating Team of the European Social Survey. He has published around 300 books and articles. Jean-François Gagné is a Research Fellow at the Center for International Studies (CÉRIUM) and Adjunct Professor on Political Development and Technology. He holds a PhD in political science from the Université de Montréal. His current research focuses on the impact of artificial intelligence on political regimes. He was a Political Analyst at Export Development Canada and Fellow at the Raoul-Dandurand Chair of Strategic and Diplomatic Studies.
Manuel Antonio Garretón, sociologist, graduated from the Universidad Católica de Chile. His PhD is from the Ecole des Hautes Etudes en Sciences Sociales, Paris. He has been director of several academic institutions in Chile and has taught at universities in Chile and other countries. He is an advisor and consultant for several international, public and non-governmental organisations. He has published more than 400 articles in different languages and more than 50 books as author, co-author or editor. His current position is Professor, University of Chile, Department of Sociology. In 2007, he was awarded the Chilean National Prize in Social Sciences and Humanities. In 2015, he was awarded the Kalman Silvert Award, LASA. Among his books are The Chilean Political Process (1989), Incomplete Democracy (2004) and Latin America at the 21st Century, Toward a new sociopolitical matrix? (2003, with M. Cavarozzi, J. Hartlyn, P. Cleaves and Gary Gereffi). Elizaveta Gaufman is a Post-doctoral Fellow at the Institute for Intercultural and International Studies (InIIS) at the University of Bremen. She is the author of Security Threats and Public Perception: Digital Russia and the Ukraine Crisis (2017). Her current research project is focused on theorizing and investigating everyday foreign policy practices in Russia and the United States. Bernard Grofman is Jack W. Peltason Chair of Democracy Studies and Distinguished Professor of Political Science at the University of California, Irvine. His research deals primarily with issues of representation, including minority voting rights, the comparative study of electoral rules, constitutional design, and party competition; and he is a specialist in behavioral social choice. He is co-author of five books with major university presses, and co-editor of 23 other books, with over 300 research articles and book chapters, including ten in the American Political Science Review.
Dingping Guo is Professor of Political Science and Director of the Dr Seaker Chan Center for Comparative Political Studies in Fudan University. He was Chinese Director of the Confucius Institute at the University of Nottingham (2012–14), Vice-Dean of the Institute of International Studies at Fudan University (2009–12) and Director of the Center for Japanese Studies (2008–12) at Fudan University. He received his first PhD from Fudan University in 1999 and his second from Tokyo University in 2002. His research interests focus on political theory and comparative politics, especially in East Asia. Jeffrey Haynes is Emeritus Professor of Politics at London Metropolitan University. He has research interests in several areas, including religion and international relations, religion and politics, democracy and democratisation, and the politics of development. The more recent of Haynes’s more than 250 publications include From Trump to Huntington: Thirty Years of the Clash of Civilizations (2019) and The United Nations Alliance of Civilisations and the Pursuit of Global Justice: Overcoming Western versus Muslim Conflict and the Creation of a Just World Order (2018). Haynes is Editor of the book series ‘Routledge Studies in Religion & Politics’, which publishes around four books a year, Co-editor of the journal Democratization and Co-editor of Democratization’s book series, ‘Special Issues and Virtual Special Issues’, which publishes approximately three volumes a year. Eva G. Heidbreder is Professor for Political Science/Multilevel-Governance in Europe at the Otto von Guericke University Magdeburg. In her research, she takes a public-policy and public-administration view on the EU. Concretely, this includes studies on the multilevel administrative system of the EU and civil-society participation in the EU, as well as the evolution of the
EU’s institutional competences and negotiation dynamics in the Brexit process. After completing her PhD at the European University Institute in Florence, she has held positions at the Heinrich Heine University Düsseldorf, the Hertie School of Governance, the Freie and Humboldt Universities in Berlin and the University Konstanz. Gunther Hellmann is Professor of Political Science at Goethe University Frankfurt/Main. He specializes in German foreign policy, European and transatlantic security relations, and the theory of international relations. He taught at Freie Universität Berlin and Darmstadt University of Technology and held Visiting Professorships at the SAIS Bologna Center of Johns Hopkins University and Dartmouth College. Since 2014 he has served as Executive Secretary and, since 2017, as President of the World International Studies Committee. Ursula Hoffmann-Lange is Professor Emerita at the University of Bamberg. Her fields of research are elites, political culture and democratisation. She held Visiting Professorships at the University of Texas at Austin and Vanderbilt University and co-edited several comparative volumes on elites. James F. Hollifield is Ora Nixon Arnold Chair in International Political Economy, Professor in the Department of Political Science, and Director of the Tower Center at Southern Methodist University (SMU) in Dallas, Texas, and Global Fellow at the Woodrow Wilson International Center in Washington, DC. His major books include Immigrants, Markets and States (1992), L’Immigration et l’Etat Nation: à la recherche d’un modèle national (1997), Pathways to Democracy: The Political Economy of Democratic Transitions (2014, with Calvin Jillson), Migration Theory (2000, with Caroline Brettell, now in its 3rd edition), and Controlling Immigration (1994, with Philip Martin and Pia Orrenius, also in its 3rd edition). He also has published numerous scientific articles and reports on the political economy of international migration and development.
Michael Howlett is Burnaby Mountain Professor and Canada Research Chair (Tier 1) in the Department of Political Science at Simon Fraser University in Vancouver, BC, Canada. He specializes in public policy analysis, political economy, and resource and environmental policy. He is the author of Canadian Public Policy (2013), Designing Public Policies (2011 and 2019) and The Policy Design Primer (2019), and co-author of Policy Consultancy in Comparative Perspective (2019), Designing for Policy Effectiveness: Defining and Understanding a Concept (2018), Application of Federal Legislation to Alberta’s Mineable Oil Sands (2013), The Public Policy Primer (2010 and 2018), Integrated Policymaking for Sustainable Development (2009), Studying Public Policy (2019, 2009, 2003 & 1995), In Search of Sustainability (2001), The Political Economy of Canada (1999 and 1992) and Canadian Natural Resource and Environmental Policy (1997 and 2005). Yael Kariv-Teitelbaum is a Doctorate Research Fellow in the Faculty of Law at the Hebrew University of Jerusalem. Her main field of research is public law, focusing on regulation and privatization. After completing an LLB in Law and Psychology (Summa Cum Laude) and serving as the Chief Editor of The Hebrew University Law Review, she was awarded several honourable scholarships, including the President Scholarship for Doctoral Students, the Hoffman Leadership and Responsibility Fellowship Program and the Faye Kaufman Memorial Prize for Doctoral Students.
Ireneusz Pawel Karolewski is Professor of Political Theory and Democracy Research at the Leipzig University. Since 2008, he has been Professor of Political Science at the Wroclaw University, Visiting Professor and Visiting Scholar at Harvard University, the Hebrew University of Jerusalem, Université de Montréal (2013), New York University and the Institut Politiques in Lille. His main areas of research are collective identity and nationalism in Europe. His more recent book publications include European Identity Revisited (2016), Civic Resources and the Future of the European Union (2012), The Nation and Nationalism in Europe (2011) and Citizenship and Collective Identity in Europe (2010). Hans Keman is Professor Emeritus of Political Science. He graduated at the University of Leiden and taught at the University of Amsterdam, Leiden, and as a guest professor abroad. He has published several books, many articles in peer-reviewed journals and regular contributions to research volumes. Most of his research and related publications are in the field of comparative political science, methodology and the relationship between history and social sciences. Among his latest books are Social Democracy: A Comparative Account of the Left-Wing Party Family (2017) and Handbook of Research Methods and Applications in Political Science (2016, with Jaap Woldendorp). Herbert Kitschelt is Professor of Political Science at Duke University. He has published on the configuration of party systems in advanced democracies (e.g. The Transformation of European Social Democracy (1994), The Radical Right in Western Europe (1995), The Politics of Advanced Capitalism (2015, co-editor)), post-communism (Post-Communist Party Systems (1999), with co-authors) and Latin America (Latin American Party Systems (2010)). 
One major concern has been the role of clientelism in party systems (Patrons, Clients, and Policies (2007, co-editor)), also documented in data and publications within the framework of the Democratic Accountability and Linkage Project (DALP). Hanspeter Kriesi holds the Stein Rokkan Chair in Comparative Politics at the European University Institute in Florence. He is also affiliated to the Laboratory for Comparative Social Research, National Research University Higher School of Economics, Russian Federation. Previously, he taught at the universities of Amsterdam, Geneva and Zurich. His wide-ranging research interests include the study of various aspects of democracy, political communication, political mobilisation and opinion formation. In 2016, he was the holder of the Francqui Chair at the University of Leuven. In 2017, he received the Mattei-Dogan Prize. Hans-Joachim Lauth (Germany) is Professor of Comparative Politics and Systems Studies at the Institute for Political Science and Sociology (IPS) of the Julius-Maximilians-Universität Würzburg. He has published many articles and books on democracies in comparison, rule of law in comparison, civil society, informal rules (such as corruption and clientelism), governance, and comparative methods. He has been one of the speakers of the DVPW working group ‘Intercultural Democracy Comparison’ (1997–2006), and of the DVPW working group ‘Democracy Research’ (2006–2012). He has been Board Member of the IPSA Committee on Concepts and Methods (2006–2012) and is editor and responsible editor of the journal Comparative Governance and Politics ZfVP (since 2008), member of the Editorial Board of Comparative Sociology and member of the Editorial Board of Politics and Governance. In his current research activities he is investigating the development of the quality of democracy and its causes, and he is a member of the DFG research group ‘Local Self-Regulations in Antiquity and Modernity’.
Richard Ned Lebow is Professor of International Political Theory in the War Studies Department of King’s College London and Bye-Fellow of Pembroke College, University of Cambridge. His most recent books are Reason and Cause (forthcoming), The Rise and Fall of Political Orders (2018), Avoiding War, Making Peace (2017) and Max Weber and International Relations (2017). He is a Fellow of the British Academy. David Levi-Faur is a Professor for Political Science and Public Policy at the Federmann School of Public Policy and the Department of Political Science of the Hebrew University of Jerusalem. He specialises in the theory of regulation and comparative public policy. He has held research and teaching positions at the University of Haifa, the University of Oxford, the Freie Universität Berlin, Wissenschaftszentrum Berlin, the Australian National University and the University of Manchester. He has also held visiting positions in the London School of Economics, the University of Amsterdam, University of Utrecht, Center for Advanced Studies LMU (CASLMU) and University of California (Berkeley). Anne-Laure Mahé is the East Africa Research Fellow at the Institute for Strategic Research (IRSEM, Paris). She holds a PhD in political science from the Université de Montréal and specialises in comparative politics and African studies, conducting research on authoritarian regimes, the relationships between their resilience and development policies and, specifically, on the case of Sudan. Scott Mainwaring is the Eugene and Helen Conley Professor of Political Science at the University of Notre Dame. His book with Aníbal Pérez-Liñán, Democracies and Dictatorships in Latin America: Emergence, Survival, and Fall (Cambridge University Press, 2013) won the best book prizes of the Comparative Politics section of the American Political Science Association and of the Political Institutions section of the Latin American Studies Association.
He was elected to the American Academy of Arts and Sciences in 2010. In April 2019, PS listed him as among the 50 most cited political scientists in the world. He served as the Jorge Paulo Lemann Professor for Brazil Studies at Harvard University from 2016 to 2019. He previously taught at Notre Dame from 1983 to 2016. Siddharth Mallavarapu currently serves as a Professor at the Department of International Relations and Governance Studies at the Shiv Nadar University. Prior to this, he taught International Relations at the Jawaharlal Nehru University and at the South Asian University in Delhi. He is Co-series Editor (along with Himadeep Muppidi of Vassar College, NY, and Raymond Duvall of the University of Minnesota) of ‘Critical Global Thought’, published by Oxford University Press. Siddharth is a member of the editorial board of the online journal Global Perspectives, published by the University of California Press. He has been a Visiting Professor at Sciences Po, Paris in March 2016 and earlier this summer has been a Visiting Research Professor at the Wissenschaftszentrum Berlin. Theory Talks (www.theory-talks.org/) and E-International Relations have both interviewed and featured him. His most recent publication is a book co-edited by Kanti Bajpai and himself titled India, the West, and International Order. David M. Malone is a Canadian author on international security and development, as well as a career diplomat. He is a former President of the International Peace Institute, and a frequently quoted expert on international affairs, especially on Indian foreign policy and the work of the UN Security Council. He became President of the International Development Research Centre
in 2008 and served until 2013. On 1 March 2013, he took up the position of UN Under-Secretary-General, Rector of the United Nations University, headquartered in Tokyo, Japan. He holds an MPA from Harvard’s Kennedy School of Government and earned a DPhil in International Relations from Oxford University. Most recently, he co-edited The Oxford Handbook of UN Treaties (OUP, 2019) and Megaregulation Contested: Economic Ordering after TPP (OUP, 2019). Maria Marczewska-Rytko is Professor of Political Science and Religious Studies, Faculty of Political Science and Journalism, Maria Curie-Skłodowska University in Lublin. She is the Chief Editor of Annales Universitatis Mariae Curie-Skłodowska: Sectio K Politologia. Her main research topics are direct democracy, populism, religion and politics, political and social movements and political communication. Her edited volumes include the Handbook of Direct Democracy in Central and Eastern Europe after 1989 (2018), Democratic Thought in the Age of Globalization (2003), Religion in a Changing Europe: Between Pluralism and Fundamentalism. Selected Problems (2018) and Civic Participation in the Visegrad Group Countries after 1989 (2018). Jeffrey D. Maslanik is an Associate Researcher with Lund University and a Visiting Lecturer of IPE and International Security at the University of Social Sciences and Humanities Department of International Relations in Ho Chi Minh City. He completed his PhD in International Relations from Florida International University’s Steven J. Green School of International and Public Affairs in 2017.
Liborio Mattina is a former Professor of Political Science and Comparative Politics at the University of Trieste. His works include ‘International Pressures and Democratisation in Central and Eastern Europe’ (2005) in Europeanisation and Democratisation, ‘Interest Groups’ (2011) in International Encyclopedia of Political Science, ‘Interest Groups and the “Amended” Liberalizations of the Monti Government’ (2013) in Italian Politics, Technocrats in Office and ‘Left-of-centre Parties and Trade Unions in Italy: From Party Dominance to a Dialogue of the Deaf’ (2017) in Left-of-Centre Parties and Trade Unions in the Twenty-First Century. Gianpietro Mazzoleni has been Professor of Sociology of Mass Communication and of Political Communication in the Universities of Salerno, Genoa and Milan, and invited Visiting Professor at Innsbruck Universität, George Mason University, Freie Universität Berlin and Université de Toulouse. He served as Editor-in-Chief of the International Encyclopedia of Political Communication (2016) and is the author of several publications in the field of media and political communication. With Paolo Mancini, he founded in 2000 the peer-reviewed journal Comunicazione Politica, and he is currently President of the Italian Association of Political Communication. In 2018, he was named Fellow of the International Communication Association. Rohinton P. Medhora is President of the Centre for International Governance Innovation (CIGI), joining in 2012. Previously, he was Vice President of programmes at Canada’s International Development Research Centre. He received his doctorate in economics in 1988 from the University of Toronto, where he subsequently taught. His fields of expertise are monetary and trade policy, international economic relations and development economics. He has published in professional and non-technical journals, and produced several books. He is a member of the Commission on Global Economic Transformation, co-chaired by Nobel
economics laureates Michael Spence and Joseph Stiglitz and The Lancet-Financial Times commission on artificial intelligence and global health. Carlos R. S. Milani is Professor of International Relations at the Rio de Janeiro State University’s Institute for Social and Political Studies (IESP-UERJ). He is also a Senior Research Fellow with the Brazilian National Science Council (CNPq). His research agenda includes Brazilian foreign policy, regional powers and comparative foreign policy, international development cooperation, and climate change international politics. His latest articles published inter alia by International Affairs, the Cambridge Review of International Affairs and The South African Journal of International Affairs are available at https://carlosmilani.com.br/articles/ Helen V. Milner is the B. C. Forbes Professor of Politics and International Affairs at Princeton University and the director of the Niehaus Center for Globalization and Governance at Princeton’s Woodrow Wilson School. She has written extensively on issues related to international and comparative political economy, the connections between domestic politics and foreign policy, and the impact of globalisation. Mohammad-Mahmoud Ould Mohamedou is Professor of International History at the Graduate Institute of International and Development Studies in Geneva, and Visiting Professor at the Doctoral School at Sciences Po Paris. He was previously the Associate Director of the Programme on Humanitarian Policy and Conflict Research at Harvard University. Jonathon Moses is a Professor of Political Science at the Norwegian University of Science and Technology (NTNU), in Trondheim. Along with Torbjørn Knutsen, Moses is the co-author of Ways of Knowing: Competing Methodologies in Social and Political Research, which was recently released in a third edition. Ferdinand Müller-Rommel is Professor (Emeritus) of Comparative Politics at Leuphana University Lüneburg. 
He was Visiting Professor at the University of New South Wales, the University of Miami, the University of California (Irvine), Siena University and the European University Institute. He is a member of the IPSA Executive Committee, former member (Vice-Chair) of the ECPR Executive Committee and President of the German Political Science Association. He has published numerous books and journal articles on political executives, party government and party systems in European democracies. Herfried Münkler is a German political scientist. He is a Professor of Political Theory at Humboldt University in Berlin. Münkler is a regular commentator on global affairs in the German-language media and author of numerous books on the history of political ideas (German: Ideengeschichte), on state-building and on the theory of war, such as “Machiavelli” (1982), “Gewalt und Ordnung” (1992), “The New Wars” (orig. 2002) and “Empires: The Logic of World Domination from Ancient Rome to the United States” (orig. 2005). In 2009 Münkler was awarded the Leipzig Book Fair Prize in the category “Non-fiction” for Die Deutschen und ihre Mythen (engl. “the Germans and their myths”). Jack Paine is an Assistant Professor of Political Science at the University of Rochester. His two main research projects examine (1) how dictators strategically use repression and power-sharing, and their consequences for authoritarian survival and civil war and (2) the origins and consequences of democratic institutions under colonialism.
Yannis Papadopoulos is Professor of Political Science at the Institute of Political Studies (IEP) of the University of Lausanne. His research interests focus on democratic transformations, policymaking processes, accountability and multilevel governance. His major publications include (coedited with Deirdre Curtin and Peter Mair) Accountability and European Governance (2013) and Democracy in Crisis? Politics, Governance and Policy (2013). Jonathan Paquin is Professor of Political Science at Université Laval, and the editor of Études internationales. He is the co-editor of America’s Allies and the Decline of US Hegemony (2019), the co-author of Foreign Policy Analysis: A Toolbox (2018), the co-editor of Game Changer: The Impact of 9/11 on North American Security (2014) and the author of A Stability-Seeking Power: US Foreign Policy and Secessionist Conflicts (2010). He has written numerous articles on foreign policy and international relations in Cooperation and Conflict, Foreign Policy Analysis, Mediterranean Politics, the Canadian Journal of Political Science and International Journal, among others. He received a PhD in political science from McGill University and was a Fulbright Visiting Scholar and Resident Fellow at the School of Advanced International Studies (SAIS, Johns Hopkins) in Washington, DC. Stéphane Paquin is Full Professor at the École nationale d’administration publique (ENAP) in Montréal. He has written, co-written or edited 33 books and journals including Theories of International Political Economy (2016), and many more articles about international and comparative political economy. He has received numerous awards, including a Canada Research Chair in International and Comparative Political Economy and a Fulbright Distinguished Chair at the State University of New York. He has taught in many universities, including Northwestern University in Chicago and Sciences Po in Paris. 
In 2014, he was the President of the local organizing committee of the World Congress of Political Science Montréal 2014 (IPSA). Werner J. Patzelt is a Professor Emeritus of Political Science at the Technical University of Dresden, having held the chair for comparative analysis from 1992 to 2019. He is a member of the editorial board of the Zeitschrift für Parlamentsfragen and of various advisory committees of public institutions. He is coordinator of the International Political Science Association’s summer schools for social science research methods. B. Guy Peters is Maurice Falk Professor of Government at the University of Pittsburgh, President of the International Public Policy Association and editor of the International Review of Public Policy. His recent publications include Policy Problems and Policy Design and Institutional Theory in Political Science (4th edition). Daniela Piana is Professor of Political Science, University of Bologna, and an International Fellow with the IHEJ and the ISP Ecole Normale Superieure of Paris Saclay. She was a Jean Monnet Fellow in 2003, Fulbright Scholar in 2007 and International Resident at the IAS in 2017. She has worked on democratic quality, rule-of-law promotion and the transformation of justice systems at the crossroads of comparative politics, policy analysis and political sociology. She serves as member of the OECD research committee on justice and as a member of the Research Unit of the Italian Council of the State. She is coordinator of the ICEDD research section Rule of Law and Digital Citizenship. Jon Pierre is Professor of Political Science at the University of Gothenburg and Adjunct Professor at the University of Pittsburgh. His most recent books in English include Globalization
and Governance (2013), Governing the Embedded State (2015, with Bengt Jacobsson and Göran Sundström), The Oxford Handbook of Swedish Politics (2015) and Comparative Governance (2016, with B. Guy Peters). Gianfranco Poggi was born in Italy in 1934. He holds a PhD from the University of California (Berkeley, 1964). Throughout his career, he has held tenured positions at the Universities of Florence (1962–64), Edinburgh (1964–88), Virginia (1988–1995), the European University Institute (1995–2002), and the University of Trento (2002–2008). He has taught at several other universities, among them Sydney, UC Berkeley, Victoria BC, UCLA, Harvard and Washington, and has held Fellowships at the Center for Advanced Study in the Behavioral Sciences and at the Institute for Advanced Study in Berlin (Wissenschaftskolleg zu Berlin). His teaching and publishing have been mainly in two fields of scholarship: ‘classical’ social theory and the sociology of political institutions. Claudio M. Radaelli is Professor of Public Policy at University College London, School of Public Policy, Department of Political Science. A Fellow of the Academy of Social Sciences, Claudio’s main fields of interest include policy learning, regulation, the role of knowledge in the policy process, EU public policy and policy narratives. He is the chief editor of International Review of Public Policy. Alexis Rapin is Research Fellow at the Raoul Dandurand Chair of Strategic and Diplomatic Studies at the University of Québec at Montréal. He has co-authored various French-edited scholarly contributions, as well as non-scholarly publications, on American foreign policy and armed conflict, notably at Presses de Sciences Po. Christoph Rass is Professor of Modern History and Historical Migration Research at Osnabrueck University and a member of the Institute for Migration Research and Intercultural Studies (IMIS). 
His research in migration studies centers on migration regimes and knowledge processes as well as cultural representations and translations of migration. His current projects focus on policy learning with regard to knowledge transfer and practices as part of the production of migration. For further information, visit www.chrass.de. Salvador Santino Fulo Regilme Jr is tenured University Lecturer of International Relations at the Institute for History, Leiden University. His research mainly focuses on human rights, transformations in the global order, and US foreign policy. He is the co-editor of American Hegemony and the Rise of Emerging Powers (2017) and the author of peer-reviewed articles in International Political Science Review and International Relations, among others. He has recently completed a book manuscript titled Foreign Aid Imperium: How United States Foreign Policy Impact Human Rights in Southeast Asia. He holds a joint PhD in Political Science and North American Studies from the Freie Universität Berlin and previously studied at Yale, Osnabrück and Göttingen. Edeltraud Roller is Professor of Political Science at the Johannes Gutenberg University Mainz. Her research mainly focuses on welfare-state cultures, performance of democracy and authoritarianism, and political support in old and new democracies. Her publications include Einstellungen der Bürger zum Wohlfahrtsstaat der Bundesrepublik Deutschland (1992) and The Performance of Democracies (2005).
Bo Rothstein holds the August Röhss Chair in Political Science at the University of Gothenburg, where he was co-founder and former head of the Quality of Government (QoG) Institute, 2004–15. He served as Professor of Government and Public Policy at the University of Oxford in 2016 and 2017 and has been a Visiting Fellow at Cornell, Harvard and Stanford. Among his recent books in English are Making Sense of Corruption (2017) and The Quality of Government: Corruption, Inequality and Social Trust in International Perspective (2011). Harald Sætren is a Professor Emeritus in Administration and Organization Science at the University of Bergen. He has been co-director of the EGPA Permanent Study Group in Public Policy and Implementation since 2010. He has also been Visiting Professor at Harvard University several times and, more recently, at the University of Arizona and Stanford University. An enduring research interest throughout most of his academic career has been comparative studies of the dynamic relationship between policy formulation/design and implementation. His publications in recent years have been bibliometric-based state-of-the-art assessments of implementation research more generally. Kaushiki Sanyal is the Co-Founder and CEO of Sunay Policy Advisory Pvt. Ltd. Her prior experiences include working with the think tanks PRS Legislative Research, Bharti Institute of Public Policy (ISB) and Vidhi Centre for Legal Policy. She also consulted for the World Bank, Kamonohashi Project and Rajiv Gandhi Foundation. She has authored two books – Oxford India Short Introductions: Public Policy in India (2016) and Shaping Policy in India: Advocacy, Alliances, Activism (2017) – and several articles. She has an MA in Political Science and a PhD in International Relations from the Jawaharlal Nehru University, New Delhi. 
Marian Sawer is a former Head of the Political Science Program at the Australian National University and former president of the Australian Political Studies Association. She has served as Vice-President of the International Political Science Association and Editor of the International Political Science Review. She was made an Officer of the Order of Australia (AO) for services to women and political science and is a Fellow of the Academy of the Social Sciences. She has headed large research projects such as the Democratic Audit of Australia and has published 20 books and 140 research articles or book chapters. Daniel Schade is a Postdoctoral Researcher and Lecturer at the Chair for Multilevel Governance in Europe at the Otto von Guericke University Magdeburg. His research is concerned primarily with the internal decision-making underpinning the EU’s external policymaking, as well as the role of parliaments in international politics. He has worked at the Vienna School of International Studies and obtained his PhD in International Relations from the London School of Economics and Political Science (LSE). Tasha Schedler is a Research Assistant with the research unit Comparative Politics/Middle East Politics at the Institute of Political Science of Eberhard Karls University Tübingen. Her scholarly interest focuses on varieties of identity politics and on sectarianisation in authoritarian contexts. Yves Schemeil is Emeritus Professor of Global and Comparative Politics at the University of Grenoble (Sciences Po). He has taught political anthropology, political theory and epistemology in American, French, Japanese and Lebanese universities. His publications in the field of
this chapter address topics such as political cultures, intercultural negotiations, clientelism, resistance to anthropology in political science, urban ethnography, archeopolitics and the politics and diplomacy of cooking and hospitality, as well as the works of structuralist and interactionist master thinkers like Lévi-Strauss, Descola, Park, Whyte and E. Anderson. Klaus Schlichte is a Professor of International Relations and World Society at the University of Bremen. He authored In the Shadow of Violence. The Politics of Armed Groups (2009) and articles in International Political Sociology, Armed Forces and Society, Geoforum, Zeitschrift für Internationale Beziehungen and Revue de Synthèse, among others. With an interest in political violence and state formation, he has carried out research in Mali, Senegal, Uganda, Serbia and France. Oliver Schlumberger is Professor of Comparative Politics and heads the research unit on Comparative Politics/Middle East Politics at the Institute of Political Science of Eberhard Karls University Tübingen. His research focuses on authoritarianism, political regime theory and processes of democratic breakdown/authoritarianisation, as well as on Middle East politics and the political economy of development. Apart from his scholarly work, he also has extensive experience in policy advisory work for various governments and institutions. Philippe C. Schmitter is currently Professor Emeritus at the European University Institute (EUI) as a member of its Department of Political and Social Sciences. From 1986 to 1996, he was at Stanford University and, from 1968 to 1981, he was an Assistant, Associate and Full Professor at the University of Chicago. More recently, he has been a recurrent Visiting Professor at the Central European University in Budapest, at the Istituto delle Scienze Humanistiche of the Scuola Normale di Superiore in Florence and Fudan University in Shanghai. 
He has been the recipient of the Johan Skytte Prize of the University of Uppsala, the Mattei Dogan Prize of the ECPR and the Lifetime Achievement Award of the European Studies Association. Daniel-Louis Seiler is Emeritus Professor at the Institut d’études politiques of Aix-en-Provence and Visiting Professor at the European School of Political and Social Science at the Catholic University of Lille. He was born in Belgium in 1943 and graduated from the Catholic University of Louvain, where he obtained his PhD. Previously, he was Junior Lecturer at University College Dublin, Professor at the Université du Québec à Montréal and Full Professor at the University of Lausanne. He then held a Chair at the University of Bordeaux and later at Aix-Marseille University. He was also Visiting Scholar at the University of Michigan and at UCLA, and a Visiting Professor at the Autonomous University of Barcelona, Catholic University of Louvain, University of Geneva, Free University of Brussels (ULB), University of Neuchâtel, Sciences Po Paris, University of the Basque Country in Bilbao, University of Silesia (Katowice), University of Marmara, University of Colorado (Boulder) and University of Freiburg (Freiburg in Breisgau). He works in the field of comparative politics, mainly on political parties, and has published ten books in French and many articles and contributions in French, English, Spanish and German. Surinder Kler Shukla began her academic career at Panjab University in 1987 after completing her doctorate with the prestigious UGC Junior Research Fellowship (1979). She is a gold medallist, having topped the master’s programme at Panjab
University. In addition to directing national and international students and scholars, Professor Shukla has led International Programmes first as Faculty and later as Director ICSSR NWRC (Indian Council of Social Science and Research Northwest Regional Centre) 2012–2014. Under her leadership more than 9000 students have benefitted from several countries including Canada, China, Estonia, Germany, Hong Kong, Malaysia, Poland, Singapore, South Korea, Thailand, UK and the US. Dr Shukla has been elected member of several international bodies, including IPSA (International Political Science Association; Research Committee 13, 1994–1997 and 1997–2000; Research Committee 16, 2001–2003) and the International Commission of Folk Law and Plural Law (2004), thereby providing leverage to students. Publications include: Chapters in Studia Polityczne, Warszawa, Citizen Action and Governance 2007, Prospects of Democratic Survival Under Difficult Conditions – View from India, Columbia University Press 2001. She was Editor of Research Journal of Social Science Panjab University during 2009–2011. Furthermore, she established a new intercultural learning initiative, focused on promoting and assessing the intercultural learning that occurs during overseas study experiences. As a member of the Placement Committee at Panjab University, Dr Shukla was instrumental in spearheading a unique connection between international scholars and Indian scholars at graduate, undergraduate and post-graduate levels. Her specializations include Comparative Politics, International Relations and Gender Studies. Rudra Sil is Professor of Political Science at the University of Pennsylvania, where he is also the SAS Director of the Huntsman Program in International Studies & Business. His scholarly interests encompass Russian and East European studies, Asian studies, labour politics, international development, qualitative methods and philosophy of social science. 
He has authored, co-authored or co-edited seven books, including Comparative Area Studies: Methodological Rationales and Cross-Regional Applications (2018, co-edited with Ariel Ahram and Patrick Köllner). He is also author or co-author of three dozen chapters and papers. He holds a PhD from the University of California, Berkeley. Hiroki Takeuchi received his B.A. in Economics from Keio University in Japan, his M.A. in Asian Studies from the University of California at Berkeley, and his Ph.D. in Political Science from the University of California at Los Angeles. He is currently associate professor of political science, and Director of the Sun and Star Program on Japan and East Asia in the Tower Center at SMU. Previously, he taught at UCLA as a faculty fellow of the Political Science Department and at Stanford University as a postdoctoral teaching fellow of the Public Policy Program. Professor Takeuchi’s research and teaching interests include Chinese and Japanese politics, comparative political economy of authoritarian regimes, and international relations of East Asia, as well as applying game theory to political science. He is the author of Tax Reform in Rural China: Revenue, Resistance, and Authoritarian Rule (New York: Cambridge University Press, 2014). Charles Thibout is a Research Fellow and PhD student at the European Centre for Sociology and Political Science (University of Paris I-Panthéon-Sorbonne, CNRS, EHESS), and Research Fellow at the French Institute for International and Strategic Affairs (IRIS). He is also Lecturer at Université Paris Diderot and Editor-in-Chief of France Culture’s morning show. His research focuses on transnational digital companies, emerging technologies, and their role in international relations, on which he has published various book chapters and journal articles.
Scott A. Tyson is an Assistant Professor of Political Science at the University of Rochester and Research Associate, W. Allen Wallis Institute of Political Economy. His research focuses on formal political theory, political economy, conflict, authoritarian politics and collective action. Evert Vedung is Emeritus Professor of Political Science, especially housing policy, at Uppsala University’s Institute for Housing and Urban Research (IBF) and Department of Government. He has taught and researched in all the Nordic countries, Austria, the United States and Korea. Lately, he has been associated with Aalborg, Linnaeus, Mälardalen and Helsinki U–Soc&Kom as well as Fluminense U in Rio and UnB, ENAP and TCU in Brasília. His most cited publications in English are Public Policy and Program Evaluation (1997/2017), ‘Policy Instruments: Typologies and Theories’ (1998/2017) in Carrots, Sticks and Sermons and ‘Four Waves of Evaluation Diffusion’ (Evaluation, July 2010). Michelangelo Vercesi is Research Associate in Comparative Politics at Leuphana University Lüneburg and an elected member of the executive board of the International Committee on Political Sociology (CPS). He has held teaching and research positions in Austria, Germany, Italy and the UK. His research focuses on comparative government, political elites and leadership and political parties, on which he has published various book chapters and journal articles. Tancrède Voituriez is Economist and Senior Researcher at the French Agricultural Research Centre for International Development (CIRAD). He is also Lecturer at Sciences Po Paris and Associate Researcher at the Institute for Sustainable Development and International Relations (IDDRI). His research and teaching focus on the governance of sustainable development, in particular on the processes underpinning the emergence of policy responses to environmental degradation. 
Claudius Wagemann is a Full Professor of Qualitative-Empirical Political Science Methods at Goethe University Frankfurt. After graduating from the European University Institute (EUI) in Florence, he held various temporary positions at the Istituto italiano di scienze umane (SUM, Florence, today Scuola Normale Superiore di Pisa) and New York University Florence. He has published extensively on QCA and set-theoretic methods, most prominently Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis (2012, with Carsten Q. Schneider). Other research areas include interest groups, the quality of democracy, political extremism and political violence and, most recently, Italian–German bilateral relations. Uwe Wagschal is Professor for Comparative Politics at Albert-Ludwigs-Universität in Freiburg. He is Vice Dean of the philosophical department and author of a multitude of studies and articles regarding political science research methodology, political conflict, party politics, direct democracy, national debt and fiscal consolidation. Currently, he works on several big-data-related projects in the field of RTS measurement systems and voting advice applications (VAAs). He currently holds the position of Editor-in-Chief of the bi-annually published scientific journal Statistics, Politics and Policy. Laurence Whitehead is a Senior Research Fellow in Politics at Nuffield College, Oxford University. His book publications include Democratization: Theory and Experience (2002, with a Spanish edition in 2011), Let the People Rule? Direct Democracy in the 21st Century
(edited with Saskia Ruth and Yanina Welp) and Illiberal Practices: Territorial Variance within Large Federal Democracies (2016, co-edited with Jacqueline Behrend). His articles include ‘International Democracy Promotion as a Political Ideology: Upsurge and Retreat’ (The Journal of Political Ideologies, February 2015), ‘“Enlivening” the Concept of Democratisation: The Biological Metaphor’ (Perspectives on Politics, July 2011) and ‘Losing “the Force”? The Dark Side of Democratization after Iraq’ (Democratization, April 2009). He is the editor of the Oxford University Press series Studies in Democratization. Geoffrey Wiseman is Professor and Director of the Asia-Pacific College of Diplomacy at the Australian National University. He has worked at the Ford Foundation, the University of Southern California and in the Strategic Planning Unit of the Executive Office of the United Nations Secretary-General. He is a former Australian foreign service officer, serving in three diplomatic postings (Stockholm, Hanoi and Brussels) and as private secretary to the Australian Foreign Minister. With Paul Sharp, he co-edited American Diplomacy (2012). With Pauline Kerr, he co-edited Diplomacy in a Globalizing World: Theories and Practices (2018). His current research interests include diplomatic theory and practice, soft power and public diplomacy, and diplomatic culture. Hellmut Wollmann is Emeritus Professor at the Social Science Institute of Humboldt Universität Berlin. His research and publications have over the years been primarily directed at comparative government and comparative public administration with a focus on local-level politics and administration. Among his many publications (in several languages) are Evaluation in Public Sector Reform (2003), The Comparative Study of Local Government and Politics (2006, with H. Baldersheim), Public and Social Services in Europe (2016, with I. Kopric and G. 
Marcou), Evaluating Reforms of Local Public and Social Services in Europe (2018, with I. Kopric and G. Marcou) and Introduction to Comparative Public Administration (2019, with S. Kuhlmann, 2nd edition). Alexis Work is a graduate student at the School of Government and Public Policy at the University of Arizona. She is broadly interested in women’s rights in democratic settings. In particular, her research is focused on the relationship between political advocacy, public awareness and government accountability as they relate to the issue of gender-based violence. I. William Zartman is the Jacob Blaustein Distinguished Professor Emeritus of International Organization and Conflict Resolution at the School of Advanced International Studies of The Johns Hopkins University in Washington, and a member of the Steering Committee of the Processes of International Negotiation (PIN) Program at the German Institute of Global and Area Studies (GIGA) in Hamburg. He holds a doctorate from Yale and doctorates honoris causa from Louvain and Uppsala. Nicolás Selamé is a Sociologist at the University of Chile, with interests in political sociology and labour sociology. He has participated in comparative investigations of the ideological basis of social programmes in Latin America. Currently, his work is directed more specifically at the emergence of new left parties, the ways they relate to the previous left renovation and the current channels of representation affected by new social movements and the crisis of party systems. Moreover, he plans to deepen the study of these topics through the prism of theories of populism. He is now working as an investigator at the Facultad Latinoamericana de Ciencias Sociales (FLACSO).
Preface
Dirk Berg-Schlosser, Leonardo Morlino and Bertrand Badie
Politics and political science have changed over time and were deeply transformed during the second half of the last century. Empirical research and theoretical reflections on politics and its multiple connections with all other aspects of human life developed enormously during this period and now cover virtually all parts of the world and their growing interdependence. They concern, for example, such basic issues as war and peace, prosperity, welfare and a sustainable environment, but also issues of freedom, justice, gender and democracy under changing cultural perspectives. This three-volume Handbook presents a major retrospective and prospective overview of political science as a benchmark work that frames, assesses and synthesises the discipline. In doing so, it helps to define its current and future developments. It emphasises the global and cross-area perspectives in political science, which have been neglected by dominant Anglo-American or Eurocentric approaches. After a general introduction, the Handbook covers the most important sub-fields of political science in separate parts, organised in alphabetical order, supplemented by a glossary and a comprehensive index. Where they apply, the chapters also take inter-disciplinarity, gender and the consequences of digitalisation as cross-cutting aspects into account. As a Handbook, it is distinguished from a mere dictionary with short definitions of terms, but also from an encyclopedia with somewhat longer entries in alphabetical order. Rather, each chapter, of about 8,000–10,000 words, provides a comprehensive overview of its subject. Each chapter has, as far as possible, a similar structure covering the following main items:
• a short history of the subject
• basic theories and concepts
• global/regional differentiation (where this applies)
• empirical databases (where this applies)
• major advances, ongoing debates, critical assessments
• perspectives.
This Handbook follows in the tradition of the Handbook of Political Science (Greenstein and Polsby, 1975) and A New Handbook of Political Science (Goodin and Klingemann, 1996). The massive 11-volume Oxford Handbook of Political Science (Goodin, 2006) is another predecessor. In contrast with these works, however, which built mostly on the American and European experiences (more than 90% of contributors were Anglo-American), this Handbook is explicitly global, taking the constellation of worldwide politics in the 21st century and its regional variations into account. In this respect, the editors can build on their experience and international networks as editors of the eight-volume International Encyclopedia of Political Science (Badie et al., 2011). They have also co-authored an advanced-level textbook: Political Science – A Global Perspective (Morlino et al., 2017). This Handbook now brings previous assessments
up-to-date in a comprehensive and systematic way, recognizing the theoretical and cultural pluralism of our approaches. The Handbook is organised following the conventional division of the six major sections of the discipline of political science, all quite broadly conceived. It begins with a section on political theory from a variety of classic and contemporary perspectives, including important elements of the history of political ideas but also formal and ‘positive’ theory. This is followed by a section on a broad spectrum of qualitative and quantitative methods employed in what is both a philosophical and an empirical political science, including reflections on common ontological and epistemological foundations. The third section covers the social bases of politics (social structures, political cultures, etc.) and the links between societies and political systems (interest groups, social movements, parties, media and their varieties). The fourth section deals with comparative politics: different types of political systems and some of their specific features, including longer-term perspectives of regime changes, state formations and failures. The fifth section then turns to the output side of politics, public policy and administration, in all its varieties, strengths and problems. The sixth section, finally, covers the broad field of international relations and the global dimensions of politics, which we consider an integral part of political science and not a separate composite discipline as, for example, in distinct university departments in some countries. As a special feature, the Handbook contains an additional section on major challenges for politics and political science in the 21st century, such as changes in international power relations and the political consequences of demographic and environmental developments. 
In this way, it can serve as a major reference work both for academic and non-academic audiences, including the media, international organisations and practical politics. As editors, our division of labour is in line with our specific experiences and backgrounds. Bertrand Badie was responsible for the parts on political theory and international relations; he also has extensive research experiences in the Middle East and Asia. Leonardo Morlino covered the fields of comparative politics and public policies and administration; his research focuses mainly on Southern Europe and Latin America. Dirk Berg-Schlosser was in charge of the sections on methods and political sociology; he has done extensive work in Sub-Saharan Africa and Europe. As with the International Encyclopedia, the editors have maintained close links with the International Political Science Association (IPSA) as former president (Leonardo) and vice-presidents (Bertrand and Dirk). This has helped in the selection of contributors. As in our previous joint publications, we have cooperated very closely over the entire process preparing this Handbook. This began with discussions with the publisher about the project, the organisation of the Handbook and the selection of topics and contributors in several meetings. During all stages of chapter drafts and reviews, we have consistently exchanged our views and agreed upon all major decisions. This was facilitated by electronic communication, without which such a global undertaking would have been impossible, even a few years ago. But our intensive friendship and mutual trust, which has developed over many years (and even decades), was even more important in allowing us to cooperate in this way. Therefore, we bear joint and equal responsibility for the final product. 
For this Handbook, we have changed the order of names once more (from Badie, Berg-Schlosser, Morlino – in alphabetical order – for the International Encyclopedia, and Morlino, Berg-Schlosser, Badie – in reverse alphabetical order – for the textbook), to Berg-Schlosser, Morlino, Badie, to reflect our equal contributions. We are proud to have been able to assemble scholars from more than 30 countries and all continents, all of them leading experts in their fields. This team provides a wide coverage of key areas both globally and regionally, which has created a lively interaction among us in many ways, assuring the high quality of the final product.
References
Badie, Bertrand, Berg-Schlosser, Dirk and Morlino, Leonardo (eds) (2011), International Encyclopedia of Political Science, Thousand Oaks, CA: SAGE.
Greenstein, Fred I. and Polsby, Nelson W. (eds) (1975), Handbook of Political Science, 8 vols, Reading, MA: Addison-Wesley.
Goodin, Robert E. and Klingemann, Hans-Dieter (eds) (1996), A New Handbook of Political Science, New York: Oxford University Press.
Goodin, Robert E. (ed.) (2006ff), The Oxford Handbook of Political Science, 11 vols, Oxford: Oxford University Press.
Morlino, Leonardo, Berg-Schlosser, Dirk and Badie, Bertrand (2017), Political Science – A Global Perspective, Los Angeles: SAGE.
Introduction
Dirk Berg-Schlosser, Leonardo Morlino and Bertrand Badie
Political science from a global perspective
The concept of politics is an old one, which is present in all cultures, but for a long time it was not subject to systematic scientific inquiry. Instead, for some it designated an art (scholars ‘studying politics’, in imperial China as well as in Europe), for others an activity (people ‘playing politics’ at all levels of social life) or a profession and function (as ‘politicians’). Political science progressively emerged from this plurality of meanings and was, in the beginning, discussed mainly in philosophy and the history of political thought. Even though it is misleading to think of a unique essence of politics, the discussion has been structured around a prevailing question: how can different people, numerous families and individuals, coexist in the same community? Following this question, politics can be understood as both polity and policy. The
first refers to an organisation (a state, a regime, a system and its constitution), the second to a series of decisions concerning different sub-fields (the economy, health, education, relationships with other states, etc.). In this sense, politics must be considered as a function as well as an action. Some scholars postulated that these did not exist in early societies, which ignored politics, where order and coexistence were maintained by mere social control (Clastres, 1978). We follow here the majority of anthropologists, who consider politics as a universal function, which can be found in all histories – that is to say, in all sequences of the human adventure and in all cultures, opening the way to a broad comparative approach. Politics attained the status of a scientific discipline in the 20th century, due to the progress and the transformation of law studies and the growing influence of behaviourism in the social sciences. The latter promoted a functional definition of politics, whereas the former led to an
instrumental one. The functional definition conceives politics as aiming to organise and to assert the coexistence of individuals in the same community. Many philosophers had previously located politics in this art of coexistence. This point was made by Plato, for example, who considered politics as the art of preserving social harmony. But the same approach can be found in many other cultures. Islam was conceived in this way by the Prophet Muhammad as preserving unity (tawhid) in a context of high tribal fragmentation. As such, tawhid would be achieved through the absolute unity of God and the indivisible nature of the Umma, the community of believers. In Confucian culture, social harmony and permanent order are considered the ultimate goals of political action. In Hinduism, politics is presented, especially in the Mahabharata (first millennium BC), as an absolute requisite for keeping peace and order. Such a functional vision is also at the basis of Western political modernity through the invention of the key concept of social contract, which organises the will of people to live in the same community (nation) with common political institutions. In a normative way, some philosophers strive to go further, demanding that politics also refer to welfare and virtue – that is to say, that it achieve a good community, which would operate according to the law of nature, of reason, the divine law or, in a simpler way, by meeting basic human needs and social expectations. If politics is only the science of the polity, it is first of all a behaviouralist science, as is the case in the major part of contemporary political science. If it becomes the science of constructing the Supreme Good, it moves to the status of a normative science. In order to overcome this dilemma, some social scientists have alternatively promoted an instrumentalist approach, which considers that the distinctive nature of politics has to be found not in the aims but in the instruments used for running the polity.
At the core of this instrumentalist vision of politics, power plays the key role. In Max Weber’s conceptualisation, it is defined as the ability to achieve your aims even against someone else’s will. Political scientists consider that the polity cannot exist without power, no matter how it is structured, and there is a long tradition of connecting power and politics, in which political science studies how power is formed, organised and shared (Lasswell and Kaplan, 1950). This broad concept opened the way to ‘huge comparisons’ (Tilly, 1984) and typologies dealing with the different structures of power. Here, a distinction must be made between some authors who think that any use of power implies politics – politics could then be played in a club, a family or a company – and a larger number who consider that politics is limited to a particular use of power. In fact, if we broadly define power, the very nature of politics becomes blurred. This is why Max Weber claims that a community has a political quality only if its rules are maintained in a given territory. Then, politics does not refer to power as such but only to a kind of power. But even this option can be risky: the addition of this territorial criterion is questionable, as it hardly fits nomadic societies or traditional ones that did not accept the territory (i.e. a strictly delimited space with assertive borderlines) as a universal concept. This debate shows that politics is an evolutionary but also a cultural and historical concept, which is always endangered by the ethnocentric temptation to consider it through one’s own system of meanings. Because political science as an academic discipline was established relatively late, it has also been considered as a crossroads, related to philosophy, law, history, sociology and even anthropology and economics. For this reason, its relationship with the other social sciences is neither clear nor simple. 
In many European countries, especially in France and Germany, political science was generated by law studies, inside the faculties that were devoted to them. That is why the
first generation of political science in these countries was strongly influenced by an institutional view of politics. Other scholars took a sociological approach, which was strong enough to free itself from this trend. By contrast, the behaviourist and positivist origins of American political science isolated it from the other social sciences, containing the assaults of a ‘political sociology’ which loses a part of its meaning in American universities, where political science and sociology are clearly separated. The positivist and quantitativist inclinations of this dominant political science encouraged many scholars to borrow their methods and paradigms from economics, opening the way to the rational-choice school and putting a strong emphasis on political economy and international political economy (IPE). More recently, an anti-positivist reaction encouraged many scholars around the world to go back to history and anthropology to revisit and reduce the ethnocentric orientations of a political science that was mainly, and often exclusively, inspired by the Western model of political development. States, political regimes, revolutions and political mobilisations were now reconsidered in their own historical context (Skocpol, 1984), while political science moved consequently to an ‘historical sociology of politics’. Anthropology helped, on its side, to reintroduce different cultures, which participate in shaping politics in the various world regions (Geertz, 1973). The main epistemological issue now at stake is to find a just balance between an absorption by these older sciences, which would transform political science into a part of sociology, law or history, and a fierce independence ignoring the neighbouring sciences and how they can enrich the study of politics. Clearly, it is impossible, as this Handbook suggests, to study politics without taking into account economic parameters, demographic changes, social transformations or institutional set-ups.
Basically, political science has to be considered as a social science, as a political fact is a
social fact, according to the definition given by Durkheim: ‘A social fact is any way of acting, whether fixed or not, capable of exerting over the individual an external constraint’ (Durkheim, 1982: 59). But it differs from sociology, as political scientists generally consider that both the functions and the instruments of politics are distinct enough to imply specific theories, paradigms and methods, as we present them in the first and second parts of this Handbook. We also emphasise that many of these theories and methods contain concepts and orientations that are, for this reason, shared with neighbouring disciplines.
The global organisation of political science
As mentioned above, political science, paradoxically, is a very old discipline, going back at least to the Greek classics, but it was only relatively recently established as a separate academic field. The first university chair that resembled those of today’s political science, at least in name, was created at the University of Uppsala in Sweden in 1622 by a generous donation of the then chancellor of the university and a tutor of King Gustav II Adolf, Johan Skytte. It was dedicated to a professorship in ‘eloquence and politics’. The political sciences in Europe – and the plural form of the term is still used quite frequently, as in Sciences Po – have been shaped by a great variety of traditions. They have roots in political philosophy, public and international law, history, economics, etc. Indeed, the first academic institutionalisations of the discipline at the Ecole Libre des Sciences Politiques in 1871 in Paris, the London School of Economics and Political Science in 1885 and the Hochschule für Politik in Berlin in 1919 were largely conglomerates of these various disciplines, and they remain so, to a certain extent, today. The first and still the biggest political science association is the American Political
Science Association (APSA), founded in 1903, with currently about 12,000 members from more than 80 countries (www.apsanet.org/). It is subdivided into 49 organised sections with regular activities of their own covering a wide range of topics. The annual meetings and other events, together with the leading journals (American Political Science Review, Perspectives on Politics and PS), have set international standards in many ways. Some activities extend to other parts of the world, like workshops and thematic summer schools in Africa, the Middle East and Asia. This is also expressed in APSA’s recent slogan ‘Networking a World of Scholars’. Nevertheless, it is estimated that about 75–80% of American political scientists still specialise in domestic affairs or US foreign policy, and for a long time a researcher specialising in a single foreign state counted as a ‘comparativist’, making, at best, some implicit comparisons with the home country. This dominant concern with the American political system also led to a certain myopic bias, whereby some concepts and theories were thought to be applicable in other parts of the world, such as the concept of party identification or the theory of collective action, for example. This was in spite of the fact that the United States’ social composition, institutional set-up, role in world politics and many other aspects are unique and cannot be generalised so easily. Other associations were created in Canada (1913), Finland (1935), India (1938), China (1932) and Japan (1948). However, ‘communication between them was virtually nonexistent … the very definition of “political science” was uncertain and the relevance of any distinction between philosophy, the social sciences, and the humanities was the subject of debate’ (Boncourt, 2009).
The first international organisation was the International Political Science Association (IPSA), founded in 1949 at a conference in Paris under the auspices of UNESCO, including such well-known scholars as Raymond Aron and Maurice Duverger. Quincy Wright
from the University of Chicago was elected as its first president. In the beginning, it was a relatively loose federation of the (few) national associations, representatives of which came together in the executive committee and during triennial (biennial since 2012) world congresses, the first of which was held in Zurich in 1950. In the course of time, the organisation became stronger and financially independent of UNESCO. The flagship journals International Political Science Abstracts and International Political Science Review and the growing collective (national associations), institutional (political science departments and research institutions) and individual membership contributed to this success. Currently, there are about 50 collective, some 60 institutional and roughly 2,000 individual members (www.ipsa.org). Since 1970, many specialised research committees (corresponding to the APSA sections) have come into being with regular worldwide activities. Today, these number about 50. A permanent secretariat with a growing staff has been established at Montreal since 2000. An IPSA portal covering some 300 websites was created at the University of Naples in the late 1990s, and IPSA massive open online courses (MOOCs) are now available at the same institution. Beginning in 2010, regular annual IPSA summer schools on concepts, methods and techniques in political science at Sao Paulo, Stellenbosch/South Africa, Singapore, Ankara, Mexico City and St. Petersburg, emulating earlier ones at the universities of Ann Arbor and Syracuse in the United States and the University of Essex in the UK, have helped to create advanced training facilities in areas where these were lacking before. In Europe, the European Consortium for Political Research (ECPR) was founded in 1970 with Stein Rokkan as first chair and Jean Blondel as executive director of the permanent secretariat at the University of Essex (https://ecpr.eu/).
It was originally modelled on the Inter-University Consortium for Political Research (ICPR) at the University of Michigan, which
basically was, and still is, a data-collecting and training institution for quantitative methods in the social sciences. Having institutional membership only, it was based on the simple but, as it turned out, very productive idea of letting institutions pay and individuals benefit. A major innovation was then brought about in 1973 by Rudolf Wildenmann: the first Joint Sessions of Workshops. These provided a new and still successful format for bringing together senior and junior scholars in relatively small groups where everyone is obliged to present a paper on ongoing research with sufficient time to discuss it in much greater detail than can be provided at the big national or international conferences (Fondation Nationale des Sciences Politiques, 1996). ECPR has now grown to become the second-largest political science association worldwide. It has about 350 institutional members (still no individual membership) in some 50 countries. In addition to the very successful Joint Sessions of Workshops, biennial and now annual general conferences have been held since 2001, occasionally outside Europe. Since 2005, biennial graduate conferences have also taken place. About 50 standing groups (comparable to the APSA sections and IPSA research committees) are now active. There are three flagship journals (European Journal of Political Research, European Political Science and European Political Science Review) and, most recently, an open access journal (Political Research Exchange). In 2005, ECPR also became a publisher, as ECPR Press, launching several book series and now working in partnership with Rowman & Littlefield. From the beginning, a methods summer school was held annually at Essex, following the example of the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan in Ann Arbor. Since 2005, additional methods summer schools (and now also winter schools) have been held at Ljubljana, Budapest, Vienna and Bamberg,
offering a wide range of qualitative and quantitative courses. In addition, in 2010, the European Political Science Association (EPSA) was founded, based on individual membership. It holds annual conferences and publishes the journal Political Science Research and Methods. There are also some smaller regional organisations, like the Nordic Political Science Association (NOPSA), representing the Scandinavian countries, and the Central European Political Science Association (CEPSA). With regard to other world regions, political science is often represented within broader social science associations, like the Latin American Studies Association (LASA) and the African Studies Association (ASA), which had their origin in North America but now have considerable regional membership as well. In 1973 a separate African Association of Political Science (AAPS) was founded. In 2002, a specific Latin American political science association (ALACIP, Associação Latinoamericana de Ciencia Política) was created; it holds regular regional conferences. More generally speaking, political science, more than any other discipline, requires a certain ‘breathing space’ of a minimum of academic freedom, freedom of information, etc. and a favourable political environment in order to prosper. For this reason, its development has been closely related to broader processes of democratisation in many parts of the world (Easton et al., 1991). Only after the latest ‘wave’ in the 1980s and 1990s could an independent, empirically oriented and internationally linked political science emerge, but its existence is threatened again by renewed authoritarian tendencies, as in Turkey or Russia. Taken altogether, this is certainly a success story. There are now professional political scientists in all major regions and most countries. They have similar training, knowledge and skills. They can see eye to eye and cooperate on a par with their international
colleagues. The days of ‘safari research’ and mere ‘airport comparatists’ are clearly over. In a globalised world, these developments have been greatly enhanced by new electronic means of communication and cheaper international travel. Nevertheless, the worldwide distribution of the discipline remains uneven (Stein and Trent, 2012). In this way, political science has not only grown as a discipline and achieved high international standards, but it has also widened its perspectives and today looks at our current domestic and international political problems from a variety of epistemological and methodological angles, benefiting from the diversity of rich cultural backgrounds in all parts of the world. This is also evident in the contributions to this Handbook.
Challenges and innovations in the 21st century
Although the actual influence of political science scholars on political decisions by incumbent authorities and, more generally, on politics is often non-existent, usually uncertain and rarely evident, research on the major challenges of politics with possible political innovations is at the core of our work. The last section of this Handbook presents the analyses of some scholars on the most important contemporary challenges, which may or may not be followed by institutional changes. Here, we propose some considerations that can help to frame these phenomena. To begin with, the challenges affecting our lives can come from global phenomena that national governments cannot control. Climate change is out of reach of any government, and only agreements that encompass the whole world may be able to cope with it. Similarly, the changes of international power relations (Badie, Chapter 84, this Handbook) are beyond the control of a specific government, as are immigration, emigration and
international terrorism. These are, however, mostly regional phenomena and can be addressed by a group of countries belonging to a specific area, such as the Mediterranean or Sub-Saharan Africa. There are other widespread phenomena that specific governments or societies can nevertheless try to deal with – for example, the diffusion of neo-populism (Kriesi, Chapter 90, this Handbook), the position of minorities (Amir-Moazemi, Chapter 89, this Handbook) and major human rights issues (Regilme, Chapter 86, this Handbook). Briefly, there are several challenges at different levels that have to be coped with and analysed in different ways. Among the different topics that can be addressed, we refer here to a few selected issues. These are international power relations, changes in representative democracies, innovations in authoritarian regimes and the problem of social inequalities. As for the first topic, the chapter by Badie clarifies that power ‘has deeply changed in International Relations, but more in its efficiency and its functions than in its nature and its appearance’. This is to emphasise that the contemporary world is characterised by new conflicts and new uncertainties, where mutual weakness, rather than powerfulness, is the key aspect to look at, with its consequent challenge for the traditional powers. Within this context, Badie recalls the contemporary salience of inter-societal relationships rather than the classic inter-state relationships, the stronger presence of non-state actors in the international arena and the multiplying communications and related information that overcome and reshape borders and sovereignties. In the context of globalisation, this includes the new role of huge supranational companies, such as Microsoft, Apple and a few others, which have become more politically relevant than most of the 195 officially recognised independent states.
These and other aspects have also brought about state governments’ reaffirmation of their autonomous powers with regard to domestic issues and related interests. In turn, such reactions
undermine the trend of strengthening integration in the different world regions. This concerns, for example, the weakening of the integration process in the EU and other world regions, such as the Cono Sur in Latin America, Sub-Saharan Africa and South-East Asia. In recent years and very likely in years to come, changes in contemporary representative democracies are and will be the centre of attention in politics and political science. These are analysed in several chapters of this Handbook. Here, we only emphasise the most evident ones. First, we consider the changes in the three main actors of representation – that is, in the parties and party systems, in political movements and in interest groups. All over the world at the end of the 20th century and in the first decades of the 21st, these actors’ organisations and activities have been deeply affected by social changes, new forms of industrial organisations and technologies and the digital revolution. Thus, the fading of traditional organisational structures, the newly dominant role of elected politicians, the personalisation of political competition, the reshaping of relevant political cleavages (including the left–right dimension) and new tools influencing the formation of public opinion have all deeply transformed political parties. They ironically disappeared as they traditionally existed, especially in the Western democracies, but reappear like the famous phoenix from the ashes to reclaim centre stage with new leaders, many of them with neo-populist features (Kriesi, Chapter 90, this Handbook). The impact of the information transformation is especially evident in the new social movements (della Porta, Chapter 39, this Handbook). 
This was the case whether they failed – for example, in the case of the so-called Orange Revolution in Ukraine or, more profoundly, in Egypt, with the subsequent restoration of a tough military regime – or when, at least organisationally, they were successfully institutionalised into new parties, as happened to the social movements in Spain and
Greece with the thrust of the economic crisis. This is also the case when they become worldwide actors through popular mobilisation and demonstrations in favour of salient and internationally relevant issues, such as the movement concerning climate change. Interest groups are apparently much less at the centre of popular attention, and there are reasons for this (Mattina, Chapter 32, this Handbook). The weakening of unions and employers’ associations in influencing key economic decisions is the result of globalisation, which in Europe was compounded by the role achieved by the EU in this arena. But, as is well known, what appears to happen in politics very rarely mirrors reality. In fact, the apparent weakening of these organisations is complemented by the actual growth of lobbying, and lobbyists have become the only ‘actors in town’ with the fading away of the traditional programmatic parties – that is, of alternative actors who propose contrasting policies. The transformation of representative actors is compounded by popular dissatisfaction with them, and the digital revolution refuelled the criticisms of representative democracy and triggered a new debate to promote and implement direct forms of democracy. This also happened in political philosophy, especially with the debate and publications on deliberative democracy. In political science, the debate was focused on the opportunities provided by the digital revolution to potentially reach each citizen directly and to give her/him the possibility of expressing opinions or casting a vote on any proposal or decision to be made. The actual experiments in direct democracy, especially those carried out by party leaders, show little citizen participation and create new possibilities for manipulation.
This notwithstanding, the criticisms of representative actors and mechanisms and the hopes for a future implementation of direct forms of democracy are resilient and continue to be a part of the political debate.
Technological developments also concern the electoral process in representative democracies and experiments in electronic voting. The only country that has practised national online voting has been Estonia, where the percentage of internet voters increased from 3.4% (2007 general election) to 43.8% (2019 general election). At the local level, experiments in electronic voting have been carried out in a large number of countries. So far, however, given the high level of competition and recurrent mutual distrust among political parties, compounded by citizens’ distrust, online voting, other than in Estonia, has not gone beyond the experimental stage. In the first two decades of this century, there also have been profound innovations in authoritarian regimes. In addition to the spreading of so-called hybrid regimes (Gagné and Mahé, Chapter 47, this Handbook), these basic changes were labelled electoral authoritarianism (Schedler, 2016). Such a nondemocratic regime presents all the formal rules and institutions of a democracy, from the constitutional charter to the electoral system, from the parliament to the supreme court and elected local governments. Civic associations, interest groups and private media are allowed but regulated. The participation of more than one party in elections is permitted, to give the appearance of a democratic regime and to have the opposition parties indirectly legitimise the regime, parties that are often rewarded by the authoritarian ruler. At the same time, elections are systematically manipulated in different ways, such as the alteration of lists, the purchase of votes and the falsification of ballots so that they are neither free and fair nor competitive, but only regular (Schlumberger, Chapter 42, this Handbook). In a nutshell, consistent with a widespread legitimation of democracy, authoritarian leaders are able to maintain their authoritarian rule by keeping up democratic appearances. 
Again, the discrepancy between these regimes and the reality is evident. In some cases, this can destabilise the regime and open the path towards a hybrid
system, but through informal tools of control and suppression rulers can counteract such tendencies. A final challenge to mention in this Introduction concerns the issue of inequalities. At the end of the last century and the beginning of the new one, a multidisciplinary debate has focused not only on socio-economic inequality but also on justice in political philosophy, on poverty in economics, on fairness in cognitive psychology and on solidarity at the level of international or supranational organisations, such as the EU. In political science, perhaps under the influence of the last book by Robert A. Dahl (2006, especially chapter 2), the prevailing attention has been on political equality, which is a complex notion that refers to effective participation, equality in voting, final control of the agenda, inclusion and fundamental citizen rights. In other words, Dahl includes in that notion the key elements of an ideal democracy. After Nobel Prize-winner Joseph Stiglitz’s (2012) work on the negative effects of inequality, that of Thomas Piketty (2013) on the bases of inequality, and of Branko Milanovich (2016) on the international dimension of inequality, all of which have clarified the economic mechanism at work, how precisely this impacts on democracy is a key research question that still deserves to be explored in detail in political science. Beyond the challenge to research, on this theme there is an ongoing phenomenon in politics that has been unfolding in contemporary democracies. Although from a mainstream normative perspective, freedom and equality are still considered the two key values of a good democracy, empirical research on the latter is pointing to different developments. 
First, the attention of governments and governmental majorities is more and more focused on supporting provisions that deal with poverty and avoid its extreme cases rather than looking for redistributive egalitarian policies, which are more and more difficult to carry out with low or no economic growth. Second, as already shown
by Kriesi et al. (2016), the profiles of contemporary democracies in terms of the quest for social justice and equality vis-à-vis freedom are very much differentiated. Whereas the citizens of northern European countries, the UK included, are mainly looking for civil rights, those of southern and eastern European countries are looking for greater equality and social justice. The explanation is not difficult to find. The first group is more egalitarian overall and enjoys stronger social rights than the second group. Consequently, northern European citizens take egalitarianism and social rights for granted and demand greater freedoms, while the citizens of southern and eastern Europe demand the former. There is a third aspect to mention, which Rosanvallon (2011) described when he singled out three factors that are gradually disappearing and that previously created the conditions for redistributive egalitarian policies: the belief in the necessity of democratic reform to avoid social and political turmoil, the practical impact of the two world wars and a decline in the belief of individual responsibility for one’s destiny. On the whole, the third factor deeply affects cultural attitudes and contributes to the undermining and transformation of pursuing equality as a key goal of a democracy. Of course, future political developments will present other challenges for both politics and political research – or even perhaps a return to phenomena previously dealt with, like the revival of the analysis of democratic crises, which reminds us of debates and research from the 1970s.
References
Boncourt, Thibaud (2009), A History of the International Political Science Association, Montreal: IPSA.
Clastres, Pierre (1978) [1st pub. 1974], Society against the State, London: Blackwell.
Dahl, Robert A. (2006), On Political Equality, New Haven, CT: Yale University Press.
Durkheim, Émile (1982) [1st pub. 1895] (Steven Lukes ed.), The Rules of Sociological Method and Selected Texts on Sociology and Its Method, New York: The Free Press.
Easton, David, John G. Gunnell and Luigi Graziano (eds) (1991), The Development of Political Science, London: Routledge.
Fondation Nationale des Sciences Politiques (ed.) (1996), La Science Politique en Europe: Formation, Coopération, Perspectives, Paris: Fondation Nationale des Sciences Politiques.
Geertz, Clifford (1973), The Interpretation of Cultures, New York: Basic Books.
Kriesi, Hanspeter, Willem Saris and Paolo Moncagatta (2016), ‘The Structure of Europeans’ Views of Democracy: Citizens’ Models of Democracy’, in Monica Ferrín and Hanspeter Kriesi (eds.), How Europeans View and Evaluate Democracy, Oxford: Oxford University Press, pp. 64–89.
Lasswell, Harold and Abraham Kaplan (1950), Power and Societies, New Haven, CT: Yale University Press.
Milanovich, Branko (2016), Global Inequality: A New Approach for the Age of Globalization, Cambridge, MA: The Belknap Press of Harvard University Press.
Piketty, Thomas (2013), Le Capital au XXIe Siècle, Paris: Editions du Seuil.
Rosanvallon, Pierre (2011), La Société des Egaux, Paris: Editions du Seuil.
Schedler, Andreas (2016), The Politics of Uncertainty. Sustaining and Subverting Electoral Authoritarianism, Oxford: Oxford University Press.
Skocpol, Theda (ed.) (1984), Vision and Method in Historical Sociology, Cambridge: Cambridge University Press.
Stein, Michael and John Trent (eds.) (2012), The World of Political Science: A Critical Overview of the Development of Political Studies around the Globe: 1990–2012, Opladen: Barbara Budrich Publishers.
Stiglitz, Joseph E. (2012), The Price of Inequality: How Today’s Divided Society Endangers Our Future, New York: W. W. Norton & Co.
Tilly, Charles (1984), Big Structures, Large Processes, and Huge Comparisons, New York: Russell Sage Foundation.
PART I
Political Theory
1 Comparative Political Theory Siddharth Mallavarapu
Introduction Let me begin with two stories. My hope is that they serve as an invitation to start thinking about the place, dilemmas and value of comparative political theory (CPT) as a distinct scholarly enterprise (see Cerutti, Chapter 9, this Handbook). I briefly recount these stories here merely with the intent of gesturing to concerns that are germane to CPT. These stories tell us much if we are willing to listen. They hold a mirror to encounters with texts, ideas and authors from other parts of the world. They hint at how these encounters touch us in more ways than we can imagine. They goad us to step out of familiarity and stumble, initially clumsily, in uncharted territories of mind and place. They compel us to discern ideas which may have been encountered in the past but now have a different face, name and affect as they surface from another milieu. Ultimately, these stories make us reflect on an inevitable metamorphosis as we ponder our own complex
relationship with old- and new-world imaginaries over the duration of a lifetime. The first deals with Orhan Pamuk, the Turkish writer who, in his Nobel lecture titled ‘My Father’s Suitcase’, talks about several intellectual debts he notches up prior to arriving at Stockholm to be feted as a Nobel laureate in literature. The lecture is a tribute principally to his father who had initially introduced him to the republic of letters and left him with an insatiable appetite for more. There are many remarkable moments in this story. The early introduction to ‘world writers’, the significance of dreaming, the acknowledgment of other worlds, a latent ambivalence towards the ‘West’ and the process of centring, de-centring and re-centring are all integral to the narrative. The young Pamuk is curious about the world outside and is perplexed by its richness and diversity. Such wonder co-exists with more than a hint of melancholia. Pamuk set out believing that the scene of action in world literature was several removes away from Istanbul where
he was growing up. What is intriguing about the journey is not this lament, but a complete reversal in the manner in which the world is approached by the end of the rendition. Pamuk observes that [w]hat I feel now is the opposite of what I felt as a child and a young man: For me the center of the world is Istanbul. This is not just because I have lived there all my life, but because for the last thirty-three years I have been narrating its streets, its bridges, its people, its dogs, its houses, its mosques, its fountains, its strange heroes, its shops, its famous characters, its dark spots, its days, and its nights, making them part of me, embracing them all (Pamuk, 2006).
A parallel story is the account of an anthropologist turned novelist, Amitav Ghosh, who tells the story of his intellectual inheritances in another intervention titled ‘The Testimony of my Grandfather’s Bookcase’. Growing up in Calcutta, Ghosh recounts the curious life of a bookcase which has assumed the status of a sacred family heirloom. What Ghosh attempts to do in the piece is to tease out a particular ‘vision of literature’ that emerged from an eclectic choice of ‘books from a wide array of countries’. The bookcase has a haunting presence in Ghosh’s intellectual make-up. Every other experience, including overseas study in England, pales in comparison to the warmth and affection with which Ghosh recalls the original bookcase, which was a formative influence on him. Ghosh, like Pamuk growing up, partakes of world literature from a different corner of the world and acknowledges elements which travel from his immediate cultural ecology to become part of a global inheritance. He follows with particular interest, for instance, the trajectory of stories that went popularly by the name of Panchatantra. Ghosh argues that the Panchatantra is reckoned by some to be second only to the Bible in the extent of global diffusion. Compiled in India early in the first millennium, it passed into Arabic through a sixth century Persian translation, engendering some of the best known of middle eastern fables, including parts of
The Thousand and One Nights. The stories were handed on to the Slavic languages through Greek, then from Hebrew to Latin, a version in the latter appearing in 1270. Through Latin they passed into German and Italian. From the Italian version came the famous Elizabethan rendition of Sir Henry North, The Moral Philosophy of Doni (1570). These stories left their mark on collections as different as those of La Fontaine and the Grimm brothers, and today they are inseparably part of a global heritage (Ghosh, 1998).
What I intend doing in the course of this chapter is to consciously reside in these spaces to excavate the lineages, tensions and possibilities surrounding the intellectual project that goes by the nomenclature of CPT today. In terms of a roadmap, I begin by mapping competing intellectual rationales, claims and counter-claims for pursuing CPT. Subsequently, I examine threadbare two facets that lie at the heart of these tensions – the nature of the ‘canon’ and ‘otherness’. Without conceding defeat to these genuine concerns, I go on to probe alternative perspectives from within the global south when it comes to constructively engaging these dimensions. Finally, the chapter concludes by engaging normative commitments and curiosities that might be worth taking on board for a new generation of fine minds that are intellectually invested in enriching CPT.
Contextualizing CPT: Competing Rationales For anyone located in the global south, an inevitable question which is likely to emerge while thinking about an intellectual project like CPT would be to speculate about its relationship with decolonization. Decolonization and CPT on one plane appear to be natural allies. If you are persuaded that the colonizer fashioned specific ontologies and epistemologies that rendered the colonized inconsequential, an immediate question follows. What does epistemic decolonization entail when it comes to the study of
political theory? While CPT may be a more recent intellectual endeavour, postcolonial political theory has been around for some time, mounting a set of responses to the challenge of decolonizing knowledge systems. There are some figures who are particularly critical to this intellectual history. Frantz Omar Fanon (1925–61) has a special place here. Fanon is best known for his classic interventions. These include Black Skin, White Masks which was published in 1952 and The Wretched of the Earth, published close to a decade later in 1961. While working as a psychiatrist at Blida-Joinville Hospital in Algeria in 1953, Fanon acquired a close view of the pathologies of colonialism reflected in the mental illnesses of his patients (Drabinski, 2019). Colonialism impacted both sides, the colonizer and the colonized, as Ashis Nandy demonstrates with great finesse in his classic, The Intimate Enemy (2009, 2nd edition). Fanon drew on an eclectic set of methods (phenomenology, existentialism, poetry, psychology and political theory) which were then harnessed with great erudition to plumb the depths of both black and white consciousness under conditions of colonialism (Drabinski, 2019). The question of ‘racial embodiment’ was central to Fanon’s quest to understand the deeper impact of colonialism at so many levels of our being, most significantly the body and the psyche (Drabinski, 2019). The more controversial part of Fanon’s legacy deals with his position on violence. Achille Mbembe in his conversation with David Theo Goldberg argues that Fanon worked with two notions of violence. The first was ‘a kind of violence that precedes the awakening of the subaltern to consciousness’ and the second was violence intended ‘to disrupt and eventually to interrupt the colonial order of things’. While Fanon rejected the ‘liberatory potential’ of the first kind of violence described here, he was intrigued by the promise ‘to make it possible to reinvent the human’ in its second incarnation.
Mbembe clarifies that Fanon was aware of
the excesses of violence and to him it was ‘both a weapon and a medicine’. In 2015 another set of unpublished writings of Fanon were made available and scholars are poring over these writings for a more complete picture of a remarkable mind (Mbembe and Goldberg, 2018). John Drabinski, in his 2019 Stanford Encyclopaedia of Philosophy entry on Fanon, confirms why Fanon may be such a good candidate for CPT scholars. His legacy points to ‘the fecundity of Fanon’s ideas, their elasticity and capacity for extending across historically and culturally diverse geographies’ (Drabinski, 2019). Another scholar of particular note is Edward Wadie Said (1935–2003) who is best known for his classic work on Orientalism (1978). His other well-known books include The Question of Palestine (1979), Covering Islam (1981), The World, the Text, and the Critic (1983) and Culture and Imperialism (1993). Said was a great exponent of the Palestinian cause and was particularly critical of the Oriental lens through which the West often constructed the East. His work proved most instructive when it came to thinking about ‘anti-imperialism’, and here, akin to Fanon, the question of ‘revolutionary struggle’ was integral to Said’s set of intellectual and deeply political concerns. Both Said and Fanon were not without their critics but it would be hard to deny the enormous influence they have both had in shaping generations of postcolonial scholarship. Coming back to postcolonialism and its critics, two recent books merit special mention here. Vivek Chibber’s Postcolonial Theory and the Specter of Capital (2013: 291) set the cat among the pigeons considering his claim that postcolonial exceptionalism was a chimera, and there is no reason to deny ‘that there is a universal history, in which East and West are both full-time participants’ and ‘the reality of capital’s universalization is perfectly consistent with an appreciation of the persistence of difference’.
Rosie Warren brought a set of responses to Chibber’s critique of postcolonial theory in her edited
book titled The Debate on Postcolonial Theory and the Specter of Capital (2017) with contributions from well-known postcolonial theorists like Partha Chatterjee, George Steinmetz and Gayatri Spivak among a galaxy of many notable others. While I shall not delve into the details of these responses here, students of CPT will stand to gain from partaking of these animated debates, which shall provide useful insights into thinking, especially about the tensions of universal versus exceptionalist claims when it comes to theorizing the world(s) we inhabit. There are many reasons why CPT might be worth pursuing. But, first a word about its genesis. Does CPT have anything to do with the end of the Cold War? Diego von Vacano, in an engaging overview of this terrain, suggests that the growing discontent with Orientalism, staid comparative politics accounts and most critically, the suspect claims of Francis Fukuyama about the ‘end of history’ and Samuel Huntington’s ‘clash of civilizations’ provided the goad for a much more rigorous engagement with a nascent field of study, CPT. Thus for Vacano, ‘CPT’s emergence is marked by the ashes of the Cold War’ (2015: 468). He is not alone in tracing the genesis of CPT to this period of time. Sara R. Jordan and Cary J. Nederman also argue that the 1990s were formative for CPT and suggest that it is still only in its ‘adolescent’ form (2012: 627). Now, let us consider arguments about why CPT might be a worthwhile endeavour. There are at least six claims I distil here from the burgeoning literature on CPT.
Argument 1: ‘Genuine Universalism’ The first argument relates to a case for ‘genuine universalism’ and is associated with Fred Dallmayr, one of the pioneers of CPT as an independent field of inquiry. In 1997, Dallmayr’s contribution to The Review of Politics journal advanced a case for CPT. He
focused particularly on the many weaknesses of comparative politics (see Part IV, this Handbook) and exhorted his colleagues to recognize that ‘the Western practitioner of political theory/philosophy must relinquish the role of universal teacher (buttressed by Western hegemony) and be content with that of fellow student in a cross-cultural learning experience’ (Dallmayr, 1997: 422). He further made an emphatic plea for ‘comparative political theorizing [that] would be genuinely global in character, by ranging from Europe and the Americas to Africa, Asia and Australasia’ (1997: 422). Dallmayr also claimed that ‘the ideology of sameness flies in the face of diverse historico-cultural trajectories and also of profound asymmetries in the distribution of global wealth and power’ (1997: 423). In 2004, he renewed his commitment to CPT arguing that ‘we surely need to take others and their aspirations seriously, which requires dialogue and empathetic awareness’ (Dallmayr, 2004: 253). None of these claims appear unreasonable. Coming to terms with Western hegemony, greater inclusivity while thinking about global political theory, demythologizing the flatness of the world and calling for more honest conversations among unequals are all par for the course in a quest for ‘genuine universalism’ which is the signature motif of Dallmayr’s approach (2004: 253).
Argument 2: ‘Principled Value-Conflicts’ A detailed programme for CPT was fleshed out by Andrew March in 2009. March began by endorsing the notion that ‘comparative work is not a zoological cataloguing of diversity’ and must instead focus on ‘a specific problem or question that is illuminated through multiple examples’ (2009: 537). There are several vantage points from which the case for CPT has been advanced. March suggests that there have been ‘epistemic, global democratic, critical-transformative,
explanatory-interpretive and the rehabilitative’ claims surrounding CPT (2009: 538). However, March is sceptical of CPT being merely an apologia for non-Western political theory. He is of the view that there must be room in CPT ‘for political theorists to critique and even reject some of the non-Western views and theories that we are trying to bring in without fear of necessarily reinforcing hegemony’ (March, 2009: 563). The animating spirit of CPT, as far as March is concerned, relates to its ability to delve into ‘principled value-conflicts’ in the light of a specific ‘moral tradition’ (2009: 560). Critical of intellectual projects aimed at dislodging a canon through processes of ‘expansion’ and ‘de-centring’, March suggests that the motivations here are far from ‘comparative’ and therefore should not then be accorded the status of CPT. Ultimately, the real test for CPT scholars is to be able to see how they can ‘reconcile’ traditions of engagement surrounding some common questions that animate political theory (2009: 531–65).
Argument 3: Embracing ‘Cosmopolitan Political Thought’ Farah Godrej finds March’s criteria too puritanical for an evolving CPT. She suggests that one of the unintended consequences of March’s claims about CPT is that it might end up reproducing a familiar ‘Eurocentrism’ (Godrej 2009a: 567–582; Ilieva, 2019). She finds particularly troublesome March’s stance ‘that Gandhi can simply be studied and taught alongside Plato and Machiavelli without any rethinking of the very categories of inquiry that structure our treatment of these texts’ (Godrej, 2009a: 575). On a more positive note, Godrej suggests that ‘[t]he comparative political theorist needs to take seriously the other of a text as well as of the tradition in which it is embedded’ (2009b: 142). To illustrate her stance, she points out, for instance, that reading dharma through the lens “...of Aquinas’
natural law” would result in distortions while seeking to understand the concept (Godrej, 2009b: 143; quoted in Ilieva, 2019). Godrej seeks to de-naturalize an innocent reading of political theory. She is keen to emphasize that ‘[e]xistential engagement alone cannot complete the task of a comparative political theorist: it must be followed by the project to disturb, provoke, and dislocate familiar modes of knowledge through speech and discourse’ (Godrej 2009b: 159; Ilieva 2019). The fundamental point that Godrej seeks to drive home is that there is scope for mutual learning in CPT. It is time that we recognized that the cultural and intellectual traffic is not one way, but must draw on a much more multi-plural and genuinely cosmopolitan engagement. The non-West becomes a crucial interlocutor here, considering that it has hitherto played a somewhat insignificant role in this conversation.
Argument 4: The Democratic Impulse in CPT Contesting the Western ethnocentrism that characterizes political theory as a field, Melissa S. Williams and Mark E. Warren welcome a ‘much deeper engagement with non-Western ideas about politics’ (2014: 27). Williams and Warren gesture to a whole range of methodological possibilities which could potentially democratize CPT knowledge. They observe in this connection that we should conceive of comparative political theory as engaging a wide range of ideational resources; formal scholarly work by non-Western scholars writing for academic audiences in their own language, political ideas of public intellectuals, principles of law and formal institutional structures, normalized practices and rituals of politics, the ideas of leading political actors and opposition figures, and everyday languages and practices of politics (Williams and Warren, 2014: 37).
This is a tall wish-list but it is useful to give a flavour of the many genres of CPT scholarship that are possible. Advancing a plea to uncover ‘political imaginaries’ from diverse
settings, Williams and Warren note that the real challenge is ‘to render them intelligible to others’ (2014: 37). Two other skill sets might be invaluable to the comparative political theorist. First, translation and second, the ability to enter into a productive ‘dialogue’ with eclectic audiences. The task of probing ‘political imaginaries’ also calls for both ‘linguistic’ and ‘empirical’ investments. The ultimate value of the democratic quest in CPT is to consider ‘often forgotten resources and influences that make us who we are and what we might become’ (2014: 48).
Argument 5: Evolving a Shared CPT Grammar Another point of entry to think of CPT is to ask what the term ‘comparative’ means in this context. Jordan and Nederman argue that it is important to think about ‘the constraints and procedures endemic to the cross-cultural comparison of political ideas and languages’ (2012: 640–1). They go on to argue that an interest in an individual thinker or text from outside the Western canon does not intrinsically qualify as CPT. What is perhaps more significant is ‘to offer a defensible comparison of meanings or individual viewpoints on topics related to understanding politics theoretically’ (Jordan and Nederman, 2012: 630–1). CPT at its best must engage the global. Jordan and Nederman also point out that ‘[t]o understand politics theoretically in a comparative way means that we examine the beliefs expressed by authors of two or more writings of culturally disparate origin, with the intent of making generalizations about the similarities and differences that says something about the task of global political theorizing’ (2012: 631). All of this opens up exciting intellectual possibilities in the field of CPT. The inclusion of perspectives from outside the mainstream is not merely an issue of introducing novelty but also raising fresh
and original questions from a different cultural vantage point.
Argument 6: Reconfiguring Political Theory Erecting a distinction between two modes of approaching CPT, the ‘normative’ and the ‘interpretive’, Diego von Vacano argues that ‘[n]ormative accounts aim to achieve some moral end, although the specific content of that aim varies widely’ (2015: 468). This is in some contrast to ‘[i]nterpretive research [which] intends primarily to broaden knowledge of political questions or issues, without an underlying prescriptive objective’ (2015: 468). Making the case for CPT, Vacano makes a plea for ‘expanding the archive of political theory in ways that permit comparative readings’ (2015: 470). A cautionary note here is worth reiterating. The suspicion of the Western canon, Vacano suggests, must not result in the baby being thrown out with the bathwater. What this translates into is the view that ‘when some insights or Western thinkers do provide indispensable resources for specific theoretical problems, we ought to retain their centrality in the context’ (2015: 471). Further, Vacano also endorses a view of ‘cultural fluidity’ as opposed to a view of monolithic ‘cultural blocs’ with some essentialist traits. He also makes two further claims on CPT. The first is that there is considerable diversity in terms of methodological stance when one does CPT. Both in terms of the ‘normative’ as well as the ‘interpretive’ modes of inquiry, Vacano recognizes ‘the scholarly, the phenomenological, the immanent-reconstructive, and the conceptual-metanarratives’ (2015: 471). Second, he insists that political theory must not operate in a vacuum. It must engage the real world and ‘as politics change, political theory must adapt to new global circumstances’ (2015: 478). This appears to be a reasonable demand and CPT, in his assessment, must step up to the plate in this regard.
Contending with a Canon How has political theory conventionally understood shaped what we treat as knowledge in this domain? Whom and what does it include and exclude? Do some voices count for more and some for less? How are these decisions made and who arbitrates these claims? With these questions in the arsenal it is imperative for any student of CPT to revisit the ‘canon’ of political theory as well. The Kenyan writer Ngugi wa Thiong’o raises a relevant question about ‘the organization of literary space and its impact on the politics of knowing’ (2012: 7). It is worthwhile asking the same question in the context of political theory. A constant lament about much of the political theory that preceded CPT is that it was not sufficiently ‘comparative’. Reading through six broadly representative arguments I sampled earlier that mapped an intellectual agenda for CPT, it is not hard to identify some core anxieties relating to the canon in political theory. Very often, what passes as political theory is Western political theory. Most courses on political theory tend to introduce students to an array of thinkers drawn largely, if not solely, from the Western pantheon and treat them as sacrosanct. While there is much that is rich and worth engaging in Western political theory, there is quite evidently sophisticated political thinking in other parts of the world as well. If one were to choose the civilizational tack, you could add a whole range of potential candidates – Sinic, Indic, Ottoman, Egyptian and Aztec, among others. Surely questions like the nature of political rule, the means of securing political order, the quest for legitimacy, and notions of what counts as just, war and peace were germane to all these political entities. The concern here is ‘what we can do to de-parochialize political theory – that is, to shift the field in the direction of much deeper engagement with non-Western ideas about politics’ (Williams and Warren 2014: 27).
There are many possible illustrations of efforts in this direction. Suren Pillay, in his 2018 piece ‘Thinking the State from Africa: Political Theory, Eurocentrism and Concrete Politics’, seeks to re-think the category of the state from an African standpoint. He engages the still pertinent question of what decolonization of political theory might mean to a scholar from Africa today. This also ties up well with an earlier emphasis about CPT engaging the real world. Pillay believes ‘that thinking the state from a geo-political location like Africa requires a more sceptical view of political theory as an activity for normative theorisation of abstract values’ (2018: 32). The converse, he argues, is true. It is worthwhile ‘to think of political theory as embedded in concrete politics and political actions’ (2018: 32). What Pillay views as essential is to enter into a larger debate on political theory by thinking with the grain of a political theory of realism. It is encouraging reflections on the relationship between normative abstractions of thought offered by thinkers to uplift, civilise, make modern, give rights to, and make citizens on the one hand, and the concrete political world that shapes what becomes of these abstractions and how they emerge in daily life (Pillay, 2018: 43).
Diego von Vacano similarly excavates Latin American political thought (2014). These examples can be multiplied. Many of these arguments are comparative to the extent that they participate in a global conversation around political theory. Sungmoon Kim critically engages the claims of Fred Dallmayr on Confucianism. He concludes by suggesting that ‘Dallmayr applies the generic concept of democracy, rooted in Western experiences, directly to East Asia where democracy is either non-existent, or relatively new’ (Kim, 2018: 46). Kim argues that we can anticipate ethical Confucian democracy only after the introduction and further entrenchment of institutional democracy, which is justified on moderately consequentialist grounds and thus independent of Confucian virtue ethics, although I
admit the Confucian virtue ethics can help integrate democracy’s institutional and instrumental value into the end of equal citizenship in which democracy’s intrinsic value lies (Kim, 2018: 50–1).
In a somewhat different vein, Yan Xuetong speculates about how traditional Confucian values might be synthesized with some liberal values to shape the world order of the 21st century. While clearly recognizing the decline of liberalism both in the West and outside (though for different reasons), Xuetong examines three claimants to a special status within the Chinese value system (2018: 1–22). The first relates to official ‘Marxism’, the second to a philosophy of ‘economic pragmatism’ and the third most influential set of values relate to Chinese ‘traditionalism’ which includes Confucianism in conjunction with other elements. It is the latter element which should particularly pique the interest of comparative political theorists. Xuetong notes in this regard that [d]espite their differences, each of these schools emphasises the significance of political leadership, as well as the role of strategic capability, in constituting a base for the solidarity and durability of that leadership. In this regard, they argue that a superpower’s foreign policies should prioritise strategic reputation. To achieve this end, the ancient idea of ‘humane authority’ (wang) promotes the value of benevolence (ren) and justice (yi) in guiding decision-making (2018: 8).
Further, what makes attention to these values important is the fact that they appear to receive the endorsement of the Chinese state as its footprint grows in the world. It was ‘at the Conference on Diplomatic Work towards Surrounding Countries, explanations of new foreign policies included terminology applicable to Chinese traditional values such as qin (closeness), cheng (credibility), hui (beneficence), and rong (inclusiveness). These four Chinese characters also appeared in the foreign policy section of the 19th Party Congress report’ (Xuetong, 2018: 9). A core Confucian value emphasized in this discourse
is ‘benevolence’. Xuetong also alerts us to other traditional conceptions like ‘badao’ (hegemony) which has a completely different play if it serves as the guiding ethical compass for Chinese foreign policy. He reassures us that it is rather low in the pecking order of values when it comes to statesmanship in China (2018: 1–22). Another continent of knowledge that needs to be explored more thoroughly through the lens of a more inclusive CPT is the many worlds of Islamic thought. Exploring the troubled relationship often posited between liberalism and Islam, Mustapha Kamal Pasha for instance, suggests that [m]isrecognition of ‘political’ Islam as an atavistic throwback or as a reactive social phenomenon merely recycles orientalism or strengthens the liberal conceit of expansiveness and mutual tolerance. The dual uses of Islam, as negation and as tolerance, represents all that liberalism ostensibly negates, Islam’s assumed closure, irrationality, belligerence, and bigotry. In the second instance, liberalism opens its doors to difference, tolerating the otherness of Islam, extending reason to the cultural worlds of Islam (Pasha, 2006: 70).
Pasha is particularly critical of International Relations (IR) as a discipline for its inability to study Islamic thought with an open mind. He argues here that [t]he resistance of Western IR to alternative forms of knowledge is not a question of malice, conspiracy, or ignorance or a sudden reaction to unprecedented events. Rather, the current response to recent developments exposes the limits of the canon drawn principally from durable cultural imaginaries and patterns. The dependence of IR on these imaginaries and patterns directs inquiry away from presentism or primitive functionalism to an appreciation of historical and ideological strands. In the first instance, Western IR congeals to the burden of historical encounters with others. Despite appeals to universalism, the hegemonic proclivity to deny alternatives legitimacy and a simultaneous quest to either annihilate opposition or to assimilate other conditions the historical past of the discipline (Pasha, 2006: 66).
There is a special role that CPT can play here, especially in avoiding some of these prejudices in how it approaches Islamic political thinking. Similarly, Roxanne Euben engages Sayyid Qutb ‘to recognize an important non-Western perspective on the experience of modernity; to see that the rationalist foundations of modern political and moral life may have failed to sustain us in some crucial ways; and by extension, to gain insight into the contemporary challenges to liberalism, democracy, and rationalism’ (1997: 53). Carlo Bonura adds a cautionary note in this regard, observing that ‘discussions on the compatibility of Islam and liberalism are never about an essential compatibility. Rather, the discussion always focuses on some configuration, some comparative relation, that affirms the global primacy of liberal ideas as well as the ontological assumptions of liberal politics’ (2013: 45). This must be avoided if CPT is to enrich our understanding of milieus with which mainstream scholars are less familiar. A good illustration of bringing a thinker from the non-west into conversation with global currents surrounding conceptions of world order is a piece titled ‘Retrieving “Other” Visions of the Future: Sri Aurobindo and the Ideal of Human Unity’ by the international legal scholar B. S. Chimni. While a number of claims advanced here are worth pondering for comparative political theorists, the opening salvo in the piece situates Aurobindo within a larger pantheon of global political thought. Chimni asserts that [t]he visions of the future of world order that find a place in contemporary writings and scholarship are essentially those advanced by Western thinkers (from Kant to Held). The work of non-Western thinkers and visionaries hardly find a mention in them. The writings of Sri Aurobindo (1872–1950) are a good example of an integral vision of the future of global society that has received little attention.
Based on a coherent, albeit contestable, theory of the evolution of
human society, Sri Aurobindo argues that the ideal of human unity will inevitably be realized. But if human unity is to contribute to individual and collective growth of nations and peoples, it must have spiritualism at its foundations (Chimni, 2006: 197).
However, Chimni adds an important caveat here when he states: I wish to clarify that my thesis is not about any marriage of materialism with religion but about transcending these binary opposites. The concept of spiritualism as deployed by Sri Aurobindo helps us to do so, as it is rooted in philosophical reflections about the essential unity of all phenomena, both material and spiritual (Chimni, 2006: 199, emphasis in original).
Kanti Bajpai and I have embarked on a project to constructively address the neglect of non-western political thought in IR. Given our acquaintance with elements of Indian political thinking, we are in the process of bringing together a multi-volume series that introduces a wide gamut of thinkers and practitioners who speak directly to the subject of India’s international and strategic thought. Our intent is to participate in a much wider global conversation around international and strategic thought. The first of our volumes, titled India, the West, and International Order, is an effort to bring to light how different (sometimes antagonistic) streams of Indian political thought, across the continuum, reflected on fundamental questions relating to both international and strategic thought, from the 19th century to more recent times (Bajpai and Mallavarapu, 2019). It would not be surprising to see more such endeavours from within the global south in the years to come. Serious students of CPT must partake of these efforts to widen the repertoire of what they treat as part of their intellectual palate. Returning to the question of the ‘canon’, it is not hard to see that concerns relating to the canon are not specific to political theory. Scanning the missing African–American
presence in the American literary canon, Toni Morrison argues that [c]anon building is empire building. Canon defense is national defense. Canon debate, whether the terrain, nature, and range (of criticism, of history, of the history of knowledge, of the definition of language, the universality of aesthetic principles, the sociology of art, the humanistic imagination), is the clash of cultures. And all of the interests are vested (Morrison, 1988, emphasis in the original).
Morrison ends up reading the American literary canon differently when she looks ‘for the ways in which the presence of Afro-Americans has shaped the choices, the language, the structure – the meaning of so much American literature. A search, in other words, for the ghost in the machine’ (1988). The ‘ghost in the machine’ of political theory leads us to another reason why CPT acquires a certain urgency at this historical moment. This has to do with ‘otherness’ and how most of our disciplines, including political theory, often struggle to contend with difference. While ‘we generally ought to provincialize the Western canon’, we must also find a language to deal with ‘otherness’ (Vacano, 2015: 471). The larger point here, which must be highlighted, is that, as Farah Godrej persuasively observes, civilizational representation and transcultural resonance are hardly mutually exclusive. The challenge, then, is to strike a delicate balance between seeking to subsume all otherness by explaining it in terms of the familiar, suggesting, therefore, that a familiar or transcultural message is implicitly contained within the words of a given thinker, or insisting that otherness is recognizable only to those embedded in its context and that transcultural knowledge, is, therefore, impossible (Godrej, 2009a: 579–80).
More scathing is her indictment that ‘the problem is that the existing canons, methods, and practices of inquiry within political theory are not structured in any way that makes such a recognition central’ (Godrej, 2009a: 580). Armed with this knowledge, it might be worthwhile thinking about the ways in which we address the dimension of otherness, which appears, on the face of it, inescapable in our engagement with the world. I take recourse to reflections by the philosopher Bimal Krishna Matilal, followed by an account of Gayatri Spivak’s reading of Mahasweta Devi, a leading tribal activist in India, as part of an effort to grapple seriously with the dimension of ‘otherness’. I conclude with some cautionary notes on the dangers of an uncritical CPT succumbing to the temptations of ‘Orientalism’ and how best to steer away from it.
Coming to Terms with OTHERNESS
Let us think about the phenomenon of otherness through a compelling illustration. Pamuk notes ‘[w]hen Proust writes about love, he is seen as someone talking about universal love. Especially at the beginning, when I wrote about love, people would say that I was writing about Turkish love’ (2007: 378). It is fair to ask in what fashion the politics of appropriation or misappropriation transpires here. Why does one formulation of love qualify as a generic human condition while the ‘other’ formulation remains tethered to a very specific local accent of time and place? Does it have to do merely with spatial location? Or does it have to do with power? Simply put, why does one manifestation of love go uncontested as universal while the other in the same breath is rendered provincial? These questions should bother all comparative political theorists who value diversity and intend to draw a variety of influences into a CPT conversation. There are several tacks any of us could adopt to address the question. It is closely linked to the thorny question of relativism. An approach which might be worth adopting here is to examine similar debates in the sphere of comparative ethics. I take recourse here to the work of the Indian philosopher Bimal Krishna Matilal, who authored two seminal
pieces on the question of ethical relativism and pluralism in the context of cultures. However, before proceeding to some key claims relating to relativism, it might be worthwhile examining how Matilal evaluated the concept of ‘othering’. In a piece titled ‘The East, the Other’, Matilal observes unequivocally that there is a persistent image of the East in the Western mind – the East that is exotic, mysterious and mischievous. It is the mystique of the East. To raise the above question is not to endorse this idea of the East. For the East can hardly be a mystery to an Easterner, not at least in the same way as it appears to the Westerner. The mystique of the East is essentially a Western construction (Matilal, 2015a: 265).
Critical of unproblematized notions of cultural exclusivity, Matilal suggests that a dialogue with ‘the other’ is indispensable. He asks: how can one understand or become self-consciously aware of one’s own uniqueness without understanding the Other? Uniqueness is, of course, there, but in the context of a living culture interacting with others, no uniqueness is static or immutable. The old uniqueness vanishes to make room for new uniqueness, being enriched by adoption and absorption (Matilal, 2015a: 271).
An interesting related claim is that ‘[t]he Western thought-world has plundered other cultures and enriched itself with the booty although the booty has been transformed beyond recognition and assimilated totally’ (Matilal, 2015a: 271). However, Matilal conveys a stance which displays no special anxiety about othering. By conceding its wide role in the social, the ethical and the political universe, Matilal is asking us to come to terms with it. These views also percolate into his conception of relativism. Matilal’s survey of the field of ethical relativism presents us with five distinct positions. These are worth recounting because they also hold up a mirror to scholars of CPT. To ensure that the positions are not misrepresented in any sense, I shall let Matilal speak. He writes:
There are various strands in the texture of our controversy over moral relativism. I isolate the following positions:
1 Ethical standards found in different cultures are only in apparent conflict with each other. This plurality exists only at the surface. At some deeper level there is only one set of moral standards to which everybody should conform, and it is possible to discover this singular standard of universal morality through rational means. I shall call it moral monism or singularism.
2 Intercultural plurality of moral standards is reflected also in the intracultural plurality of norms (which is witnessed by the pervasive presence of moral conflicts in persons).
3 Genuine plurality exists, and some are more right than others, but there is no way to decide or know which ones are better or worse than others. Despite the air of Orwellian cliché, this can be seriously held position. I shall call it agnosticism. Moral conflicts on this view would be ineliminable.
4 Among the many moral norms available across cultures it is impossible to judge objectively some as better or worse than others, for although they may be mutually comprehensible, there is no transcultural standard of evaluation. Culture-bound norms are neither good nor bad. This is what I shall call soft relativism.
5 Culture-bound norms are both incommensurable and mutually incomprehensible, and hence one may say that one norm is just as good or as bad as the other. This I shall call hard relativism.
(Matilal, 2015b: 218–9)
Comparative political theorists are also compelled to engage these positions in the light of their own work. It might be worthwhile for them to come clean on which of these models they find more plausible in the field of CPT, and why. There are CPT monists, who are convinced that the questions of political theory are universal and that, wherever you find yourself, it is inevitable to contend with a set of core facets relating to conceptions of the good life, notions of political order and legitimacy, or even notions of justice. The inter- and intra-cultural pluralists are not so sure the monists are right. They probe differing conceptions and think of political
traditions as more complex human endeavours. The agnostics are willing to claim one set of political standards as superior to another but are still unable to arbitrate claims of one against the other. Agnostics generate indifference at best and nihilism at worst when it comes to arbitrating conflicting claims about the most desirable way to proceed. The soft relativists sit on the fence and are unable to judge one political framework as superior and another as inferior in any simple terms. They would suggest that there are no widely shared criteria by which one set of ethical standards or, in the case of CPT, political precepts is inherently superior or inferior to another set of similar precepts. Thus, their basic claim is the impossibility of evolving a ‘transcultural standard of evaluation’ (Matilal, 2015b: 219). Hard relativists fully endorse the incommensurability position and again leave us with no way of arbitrating one claim against the other. Such a position does not augur well for those looking for clear-cut directions in choosing between competing ethical or political models or standards. In the light of the preceding discussion, where does Matilal himself stand in relation to these positions, and is it a reasonable stance to adopt? Matilal introduces the idea of a ‘minimal moral standard’ or, alternatively, the idea of ‘the basic fabric of the human world’, which are his lodestars when it comes to choosing between competing ethical standards (2015c: 252, 255, 261). He disavows a nihilist position and holds that some basic universal core is well worth arguing over. He distinguishes this from any simple notion of universalism or any simple form of cultural essentialism.
His argument is straightforward and clear, as reflected below: I submit the following thoughts for consideration: the supposition of context-neutral rules assumes that there is a basic moral fabric in all societies, all communities and cultures, which holds their members, human beings together. This should not be conflated with the old Universalism, nor with some boring commonalities of the species, nor with any objectionable form of essentialism (viz., all humans
have the same essence). Since we talk about minimal agreements in norms, this view concedes relativism up to a limit but rejects it when it militates against the basic fabric of the human world (Matilal, 2015c: 255).
The real issue for Matilal is how to salvage this ‘basic universal moral fabric’ when it is set against the backdrop of a ‘mad world of war, bigotry, fundamentalism and power-brokerage’ (2015c: 261). Returning to more fundamental asymmetries of representation and othering, Walter Mignolo echoes Pamuk when he observes that, much as with Turkish love, once upon a time scholars assumed that if you ‘come’ from Latin America you have to ‘talk about’ Latin America; that in such a case you have to be a token of your culture. Such expectation will not arise if the author ‘comes’ from Germany, France, England or the US. In such cases it is not assumed that you have to be talking about your culture but can function as a theoretically minded person. As we know: the first world has knowledge; the third world has culture, Native Americans have wisdom, Anglo Americans have science (Mignolo, 2009: 2).
Contesting such a pecking order, Mignolo argues that ‘[g]eo-politics of knowledge goes hand in hand with geo-politics of knowing. Who and when, why and where is knowledge generated … Asking these questions means to shift the attention from the enunciated to the enunciation’ (2009: 2). To decolonize knowledge entails ‘the unveiling of epistemic silences of Western epistemology and affirming the epistemic rights of the racially devalued’ (2009: 4). Closely aligned to this intellectual project of recovery, Gayatri Spivak follows the work of Mahasweta Devi, a leading tribal activist in India, to raise larger questions relating to theory, knowledge formation and the inclusion of indigenous peoples in any such project. The reason why I insert this sensibility here is because CPT in the years ahead will have to increasingly take on board these unrepresented voices, and not merely chronicle their experiences, but build theory on the solid edifice of the lived lives of people from different corners of the globe.
Referring to these maps as ‘imaginary maps’, Spivak argues that [o]ne of the not inconsiderate elements in the drawing of these maps is the appropriation of the Fourth World’s ecology. Here a kinship can be felt through the land-grabbing and deforestation practiced against the First Nations of the Americas, the destruction of reindeer forests of the Suomis of Scandinavia and Russia, and the tree-felling and eucalyptus plantations of the original nations, indeed of all early civilizations that have been pushed back and away to make way for what we call the geographic lineaments of the map of the world today (Spivak, 2015b: 200–1).
In an interview with Spivak, Mahasweta Devi draws attention to one of her stories, Pterodactyl. Devi claims that Pterodactyl is an abstract of my entire tribal experience. Through this Nagesia experience I have explained other tribal experiences as well. I have not kept to the customs of one tribe alone. In the matter of the respect for the dead, for example, I have mixed together the habits of many tribes. If read carefully, Pterodactyl will communicate the agony of the tribals, of marginalized people all over the world (Devi, 2015: xiv).
Spivak concludes that ‘Mahasweta’s fiction resonates with the possibility of constructing a new type of responsibility for the cultural worker in a world that is already under way’ (2015a: xx–xxi). When we think of ‘othering’ there is room to challenge our normative assumptions substantially and to think of ways of bringing the excluded back into our conversation about theory and different worlds. To the author of ‘My Father’s Suitcase’, to be a writer is to acknowledge the secret wounds that we carry inside us, the wounds so secret that we ourselves are barely aware of them, and to patiently explore them, know them, illuminate them, to own these pains and wounds, and to make them a conscious part of our spirit and our writing (Pamuk, 2006).
This sensibility might be worth taking on board for comparative political theorists of the 21st century. It brings us back to the idea of ‘empathetic awareness’ which Dallmayr
gestured to in 2004 while fleshing out an intellectual agenda for CPT. Perhaps it is also worth pausing briefly here to reflect on the dangers of Orientalism in CPT scholarship. The work of Megan Thomas seeks to alert us to the ‘family resemblances’ between some variants of CPT and Orientalism. She argues that ‘[c]omparative political theory needs to place less of a premium on distinct “others,” on “civilization” as marking boundaries, and on (textual) “tradition” as marking contexts, if it is to avoid repeating the emphases and exclusions of earlier Orientalism’ (Thomas, 2010: 655). It must eschew exoticization of the non-west; it must engage broader theoretical questions that are of interest to political theorists across the board, and avoid naïve textual exegesis that merely reproduces classical Orientalist modes of viewing these sources. It must also be willing to ask original questions, to renounce received wisdom from a privileged Western lens where warranted, and to challenge itself intellectually to evaluate a wide range of sources with criteria that factor in context more critically. Ultimately, Thomas suggests that Orientalism often conceived of itself as bridging the world of contemporary (European) scholarship, on the one hand, and what it saw as unduly marginalized texts, thinkers, traditions, on the other. Comparative political theory need not claim a patrimony in Orientalism, but it might more specifically delineate what distinguishes it from earlier calls to broaden Europe’s narrow and self-centered attention. By more carefully distinguishing itself from earlier Orientalism, comparative political theory might avoid repeating Orientalism’s quiet failure (Thomas, 2010: 677).
This is an intellectual project that if persuasively accomplished is worth its weight in gold.
In Lieu of a Conclusion: Contending with the ‘Ghost in the Machine’
I began this intervention by framing my interest in CPT in the crevices of the autobiographical
memories of two accomplished storytellers – Pamuk and Ghosh. The negotiations of the corpus of ideas from the world outside and from what we regard as ‘home’ are layered and indeed complex. Novelists are particularly well placed to tap into these dilemmas given their ability to straddle many worlds. Honest accounts speak to moments of self-doubt and the delicate task of contending with the competing realities engulfing both the immediate universe and the external worlds, which are, at times, at some remove from this immediacy. I believe that CPT scholars have their own share of parallel predicaments, thrown as they are into the maelstrom of the global, having to contend with ideas and practices emanating from diverse political milieus. They are expected to disaggregate their eclectic contexts and to comprehend the specific inflections and oftentimes complex motivations that underpin political praxis. What I have sought to do in the course of this chapter is to encapsulate the state of play of CPT in terms of its multi-faceted intellectual quest. Like any other field of study, this is a story about its origins, animating debates, the shifting sands of these agreements and disagreements, protagonists and their detractors, trajectories taken and eschewed, and its current moment of reckoning. I shall flag two questions here and, in the course of answering them, briefly reiterate some fundamental claims that our journey has revealed. The first question is rather basic: what is the essential nature of the beast that interests us here (namely, CPT)? The second: where should it be headed? Let me begin by addressing the first question. CPT to my mind is a vital intellectual project in the world we inhabit. It is not a coincidence that the genesis of the field coincided with the end of the Cold War.
There has been for some time now a growing unease with the existing frameworks, making it imperative that we find a new language to pose questions more squarely, and to then go on to address them with a newfound sense of purpose. There are several reasons
why CPT has come to be viewed as intellectually worth pursuing. In the course of the chapter, I identified six arguments supporting CPT. These include a quest for ‘genuine universalism’, navigating a world of ‘principled value-conflicts’, inserting a ‘cosmopolitan’ sensibility, strengthening a democratic impulse when it comes to the broader sociology and politics of knowledge, evolving shared standards of political judgment, and contributing to reconfiguring old-style political theory with a deliberate comparative inflection and a global set of preoccupations. When one locks horns with this beast, two dimensions inevitably surface. I dealt with these at some length in this chapter because I believe that they are critical for any robust version of CPT. The first of these is the ‘canon’ and the second deals with another human reflex, i.e. relentless ‘othering’. On both these questions, I want to come clean on where I stand. With regard to the canon, I endorse the view that it needs to be critically approached and examined threadbare in terms of its aetiology, its capacity to fundamentally illuminate aspects of our social and political existence, and its capacity to reveal newer sides when read in diverse contexts. Canons must be evaluated in conjunction with an open acknowledgment of their imbrication with power, the social and cultural capital which lends them a special status, and the accompanying epistemic violence that was inflicted when they edged out competitors in the process of foisting themselves as the commonsense or representative approach to our chosen domain of study. I would like to reiterate here that, while Western political theory has tended to dominate, and at times eviscerate, other forms of political theorizing, there still remains a kernel of political thought from this tradition that is worth preserving and engaging for the questions it continues to pose, if not always for the provisional answers it provides to these questions.
Bluntly stated, we must not disengage from this body of literature but engage with it more critically. We must
also guard against the dangers of any form of nativism or unintended ethnocentrism, wherever it emerges from. There are also the usual traps to avoid, such as any manifestation of Orientalism in CPT scholarship. On the question of othering, I find Matilal’s project to salvage a global standard for evaluating the relativist bases of comparative ethics an audacious endeavour. If we are to avoid being nihilistic about relativism, we will have to find a way of arbitrating our collective claims to what might count as a valid set of criteria for comparative political theory, by which a community of scholars can acknowledge a genuine contribution and reject a spurious one. We must guard against the usual follies involved in such judgments. In other words, we must avoid non-inclusive and partisan judgments, and eliminate the imposition of standards merely by sleight of hand premised on the camouflage of power. The real challenge is how to keep the Nietzschean ‘will to power’ momentarily at bay or quarantined, if at all possible (Nietzsche, 1978: 215–31). I would like to close by reiterating that CPT is poised at an exciting global conjuncture. There is more than a degree of political urgency to develop a more refined understanding of the many worlds we inhabit. There is a need to widen our palate of ideas and our appreciation of eclectic political traditions and practices, and to find a mechanism by which we can generate a global conversation around themes that animate scholars in this field. Political theorists in the 21st century are asking what a political theory of the future might look like. There are several concerns here: decoding empire and imperialism in the 21st century, asking how political theory can contribute to deciphering the Anthropocene, identifying what analytical moves might help us get a handle on political populism, confronting newer scientific and technological advancements, and finding ways in which we re-negotiate the legacies of the classics in a fast-changing world.
CPT has a role here too. By insisting on a comparative frame and
by asking probing questions with regard to all these dimensions and many more (too many to list here), it is perhaps best placed to speculate on plausible future utopias to embrace and dystopias to reject. The hope is that CPT might find a new language and sensibility to contend with the ‘ghosts in the machine’, whether these relate to the historical ‘canon’ or to ‘othering’, and pave the way for a richer and more robust democratic epistemic culture, marking a clear departure from the explicit and implicit silences of the past.
Acknowledgments
The author wishes to express his gratitude to Kanti Bajpai, Rustom Bharucha, B. S. Chimni, Peter Katzenstein, Srikanth Mallavarapu, Medha and the co-editors of this Handbook for their insightful comments on this chapter. The usual disclaimer applies.
References
Bajpai, Kanti & Mallavarapu, Siddharth (2019). Introduction. In: India, The West, and International Order. Hyderabad: Orient Blackswan, pp. 1–50.
Bonura, Carlo (2013). Theorizing Elsewhere: Comparison and Topological Reasoning in Political Theory. Polity, 45 (1), 34–55.
Chibber, Vivek (2013). Postcolonial Theory and the Specter of Capital. Delhi: Navayana.
Chimni, B. S. (2006). Retrieving ‘Other’ Visions of the Future: Sri Aurobindo and the Ideal of Human Unity. In: Branwen Gruffydd Jones ed. Decolonizing International Relations. Lanham, MD: Rowman & Littlefield Publishers, pp. 197–217.
Dallmayr, Fred (1997). Introduction: Toward a Comparative Political Theory. The Review of Politics, 59 (3), 421–7.
Dallmayr, Fred (2004). Beyond Monologue: For a Comparative Political Theory. Perspectives on Politics, 2 (2), 249–57.
Devi, Mahasweta (2015). Imaginary Maps (3rd impression). Kolkata: Thema.
Drabinski, John (2019). Frantz Fanon. In: Edward N. Zalta ed. Stanford Encyclopedia of Philosophy (Spring 2019 Edition), Forthcoming from: [Accessed 22 December 2019].
Euben, Roxanne L. (1997). Comparative Political Theory: An Islamic Fundamentalist Critique of Rationalism. The Journal of Politics, 59 (1), 28–55.
Fanon, Frantz (1952). Black Skin, White Masks (2008: Revised edition). New York: Grove Press.
Fanon, Frantz (1961). The Wretched of the Earth (2015: Revised edition). New York: Grove Press.
Fukuyama, Francis (1992). The End of History and the Last Man. New York: Avon Books.
Ganeri, Jonardon (2015). Ethics and Epics: The Collected Essays of Bimal Krishna Matilal (Vol. 2, paperback edition). New Delhi: Oxford University Press.
Ghosh, Amitav (1998). The Testimony of My Grandfather’s Bookcase [Online] Available from: https://www.amitavghosh.com/essays/bookcase.html [Accessed 2 January, 2019].
Godrej, Farah (2009a). Response to ‘What is Comparative Political Theory?’. The Review of Politics, 71 (4), 567–82.
Godrej, Farah (2009b). Towards a Cosmopolitan Thought: The Hermeneutics of Interpreting the Other. Polity, 41 (2), 135–65.
Huntington, Samuel (1996). The Clash of Civilisations and the Remaking of World Order. New York: Touchstone.
Ilieva, Evgenia (2019). ‘Countering the Dogmatism of Eurocentrism: Comparative Political Theory as Heterology’. Unpublished manuscript.
Jordan, Sara R. & Nederman, Cary J. (2012). The Logic of the History of Ideas and the Study of Comparative Political Theory. Journal of the History of Ideas, 73 (4), 627–41.
Kim, Sungmoon (2018). Fred Dallmayr’s Postmodern Vision of Confucian Democracy: A Critical Examination. Asian Philosophy, 28 (1), 35–54.
March, Andrew F. (2009). What is Comparative Political Theory? The Review of Politics, 71 (4), 531–65.
Matilal, Bimal Krishna (2015a). The East, the Other. In: Ganeri, Jonardon ed. Ethics and Epics: The Collected Essays of Bimal Krishna Matilal (Vol. 2, paperback edition). New Delhi: Oxford University Press, pp. 265–77.
Matilal, Bimal Krishna (2015b). Ethical Relativism and the Confrontation of Cultures. In: Ganeri, Jonardon ed. Ethics and Epics: The Collected Essays of Bimal Krishna Matilal (Vol. 2, paperback edition). New Delhi: Oxford University Press, pp. 218–41.
Matilal, Bimal Krishna (2015c). Pluralism, Relativism and Interaction Between Cultures. In: Ganeri, Jonardon ed. Ethics and Epics: The Collected Essays of Bimal Krishna Matilal (Vol. 2, paperback edition). New Delhi: Oxford University Press, pp. 242–62.
Mbembe, Achille & Goldberg, David Theo (2018). Conversation: Achille Mbembe and David Theo Goldberg on Critique of Black Reason. Theory, Culture & Society [Online] Available from: https://www.theoryculturesociety.org/conversation-achille-mbembe-and-david-theo-goldberg-on-critique-of-black-reason/ [Accessed 31 August, 2019].
Mignolo, Walter D. (2009). Epistemic Disobedience, Independent Thought and De-Colonial Freedom. Theory, Culture & Society, 26 (7–8), 1–23.
Morrison, Toni (1988). Unspeakable Things Unspoken: The Afro-American Presence in American Literature. The Tanner Lectures on Human Values. Delivered at The University of Michigan, October 7. [Online] Available from: https://tannerlectures.utah.edu/_documents/a-to-z/m/morrison90.pdf [Accessed 22 December, 2019].
Nandy, Ashis (2009). The Intimate Enemy: Loss and Recovery of Self under Colonialism (2nd edition). New Delhi: Oxford University Press.
Nietzsche, Friedrich (1978). A Nietzsche Reader (translator R. J. Hollingdale). London: Penguin.
Pamuk, Orhan (2006). My Father’s Suitcase. The Nobel Lecture 2006. [Online] Available from: https://www.newyorker.com/magazine/2006/12/25/my-fathers-suitcase [Accessed 2 January, 2019].
Pamuk, Orhan (2007). The Paris Review Interview. In: Pamuk, Orhan ed. Other Colours. London: Faber and Faber, pp. 353–78.
Pasha, Mustapha Kamal (2006). Liberalism, Islam and International Relations. In: Branwen Gruffydd Jones ed. Decolonizing International Relations. Lanham, MD: Rowman & Littlefield Publishers, pp. 65–85.
Pillay, Suren (2018). Thinking the State from Africa: Political Theory, Eurocentrism and Concrete Politics. Politikon, 45 (1), 32–47.
Said, Edward W. (1978). Orientalism (1994: Revised edition). New York: Vintage Books.
Said, Edward W. (1979). The Question of Palestine (1992: Revised edition). New York: Vintage Books.
Said, Edward W. (1981). Covering Islam (1997: Revised edition). New York: Vintage Books.
Said, Edward W. (1983). The World, the Text, and the Critic. Cambridge, MA: Harvard University Press.
Said, Edward W. (1993). Culture and Imperialism (1994: Revised edition). New York: Vintage Books.
Spivak, Gayatri (2015a). Translator’s Preface. In: Devi, Mahasweta ed. (2015) Imaginary Maps (3rd impression). Kolkata: Thema, pp. i–xvi.
Spivak, Gayatri (2015b). Appendix. In: Devi, Mahasweta ed. (2015) Imaginary Maps (3rd impression). Kolkata: Thema, pp. 199–210.
Thiong’o, Ngugi wa (2012). Globalectics: Theory and Politics of Knowing. New York: Columbia University Press.
Thomas, Megan C. (2010). Orientalism and Comparative Political Theory. The Review of Politics, 72 (4), 653–77.
Vacano, Diego von (2014). Latin American Political Thought. In: Gibbons, Michael T., Coole, Diana, Ellis, Elisabeth & Ferguson, Kennan eds. The Encyclopaedia of Political Thought (Vol. IV: Gui–Len). Wiley Blackwell, pp. 2051–8.
Vacano, Diego von (2015). The Scope of Comparative Political Theory. Annual Review of Political Science, 18, 465–80.
Warren, Rosie, ed. (2017). The Debate on Postcolonial Theory and the Specter of Capital. London: Verso.
Williams, Melissa S. & Warren, Mark E. (2014). A Democratic Case for Comparative Political Theory. Political Theory, 42 (1), 26–57.
Xuetong, Yan (2018). Chinese Values vs Liberalism: What Ideology Will Shape the International Normative Order? The Chinese Journal of International Politics, 11 (1), 1–22.
2 Constructivism
Jun Ayukawa
Introduction: What is Constructivism?
Constructivism is based on the idea that social phenomena are constructed through human interaction. Usually, people take these social phenomena for granted and regard them as objective facts or occurrences. Constructivism, on the other hand, scrutinizes what people believe to be truths or objective phenomena; it considers accepted facts to be constructed interpretations brought about through human interaction. Because construction is accomplished through interactions, this perspective is called social constructivism. Although constructivism’s roots are in sociology, it has been adopted by political scientists as an approach that examines how political issues and policies are shaped. Social constructivism proposes that humans use language to define situations and then act according to those definitions. It attempts to examine narratives and discourse in order to comprehend the meanings, interpretations and definitions of political conditions and social circumstances.
Social Constructivism: The Birth of Constructivism and Its Development
Peter Berger and Thomas Luckmann’s The Social Construction of Reality (1966) led to the emergence of modern social constructivism. This book coined another term – constructionism. European scholars and scholars interested in science and technology tend to use the term constructivism, while American scholars and scholars in the social sciences tend to prefer the word constructionism. However, the terms can be used interchangeably (Holstein and Gubrium, 2008). Berger and Luckmann’s book has become a sociological classic. It offers a synthesis of the works of such leading sociological
figures as Max Weber, Émile Durkheim, George Herbert Mead, Karl Marx and Berger and Luckmann’s mentor, Alfred Schutz. Schutz was a social phenomenologist and sociologist. He moved from Austria to America in 1939, where he worked as a banker and taught at the New School for Social Research. He introduced phenomenological ideas into Weberian action theory, and he proposed the concept of multiple realities, rejecting the traditional view that only one reality exists.
Berger and Luckmann also drew upon works by earlier sociological theorists. Mead’s interest in how the self is constructed through human interaction led him to understand that new meanings, new definitions of situations and new values also emerge through interaction (Mead, 1934). The early writings of Karl Marx note that when an individual joins an organized group and is given a status, the individual’s role becomes fixed within the relationships of the group and meaningful action turns into the routine work of the organization. This is the process by which individual desires are reified into organizational functions.
For modern scholars, the ideas of social constructivism seem unexceptional. However, Berger and Luckmann’s book was welcomed as a counter to the then dominant theory of structural functionalism, which evaluated a person from the limited perspective of being simply a function within an organization. The phenomenological and symbolic interactionist points of view were new, and they amounted to a declaration of human creativity.
As social constructivism became more popular, its claims became more diversified and radicalized. Over time, it began to emphasize language, discourse, narrative and story. Although placing such value on language might seem extreme, it is reasonable from the viewpoint of the theory of emancipation. For a person whose consciousness has been conditioned within a limiting physical environment, there is only one way to trigger the sense of emancipation – through language and narrative – which is able to
cause a change from an sich sein (in itself) to für sich sein (for itself). Words can give rise to a revolutionary subjectivity and consciousness.
Berger and Luckmann also introduced the Marxist concepts of Entfremdung (alienation) and Verdinglichung (reification) into their book (Marx, 1844; Marx and Engels, 1845–1846). The prime difference between Entfremdung and Verdinglichung is that alienation presupposes some essential character of the human being, whereas reification treats the human being as an ensemble of social relationships. The concept of reification thus offers an escape from postulating an essence of the human being, that is, an escape from essentialism. Young sociologists and social scientists who are oriented to emancipation seek to get rid of ‘false consciousness’. From this critical perspective, one theorist ignored in The Social Construction of Reality was Antonio Gramsci, although Georg Lukács, a scholar with a similar orientation toward emancipation by breaking through constrained consciousness, is referred to in Berger and Luckmann’s book. There is also a critical constructivism in the study of social problems, which proposes to integrate symbolic interactionism with the critical theory of thinkers such as Marx and Gramsci (Heiner, 2002).
This is just a brief summary of the classic theory of the origin of constructivism. I would now like to consider the developments of constructivism by examining a few themes and topics that are mainly relevant to social problems, politics (including international relations (IR)) and public policy.
Constructivism in the Study of Social Problems
The two social science fields which have produced the most important social constructivist work are sociology and political science (Edelman, 1988). In particular, the constructivist study of social problems has flourished, and there are calls to extend this work, which
would further strengthen the constructivist perspective (Best and Loseke, 2018).
The constructivist study of social problems emerged by combining the original constructivist sociology of knowledge with the sociology of deviance. It owes much to the labeling perspective in the study of deviance, which describes how the labels ‘deviant’, ‘criminal’ or ‘delinquent’ are given to someone whose behavior is recognized as offensive to the established order. Thus, deviance is socially constructed and controlled by laws that are established and enforced by social control agencies. One of the founders of the labeling perspective, Howard S. Becker, wrote about the creation of the marijuana tax law (Becker, 1963). His account showed that the construction of laws was relevant not only to political science but also to the study of social problems, and that the labeling perspective could be applied not only to deviant behavior but also to wider social problems.
John I. Kitsuse, an influential labeling theorist, played a key role in applying the labeling perspective to social issues by examining the process of claims-making by which phenomena are constructed as social problems (Spector and Kitsuse, 1977). Claims-making is ‘the process of making claims, of bringing a troubling condition to the attention of others’ (Best, 2017: 342). When claims demand that something should be done about a phenomenon which has been constructed as a social problem, they call for some public or social policy. In this sense, the study of social problems becomes relevant to political science.
Social constructivism does not approach problems in society in the same way as traditional studies of social problems, which assume that objective conditions cause the issues. Social constructivism examines problems not as a condition but as a continuum of human activities, which include claims-making, media coverage, public perceptions, public persuasion and local government influence, as well as the work of social problems workers who are responsible for managing the people involved in the social issue through activities and devising policy outcomes.1 The social constructivist studies these activities, focusing especially on the rhetoric used by the participants. One common feature is the use of large numbers to persuade society of the seriousness of a problem.
One example of the use of large numbers was the case of missing children in the United States. The idea that strangers abduct children is common, and it includes the notion that a child might have been sacrificed by a cult in a secret ceremony. This idea is propagated by the mass media when they postulate that a stranger has abducted a child. In fact, in many cases it was a parent or relative who took the child. Other children ran away and were either found soon after they were declared missing or discovered living with a parent who had been refused custody. Although actual cases of children abducted by strangers are rare, claims-makers often estimate that the total is vastly higher (Best, 1990).
Joel Best proposed a theoretical model of the natural history of social problems, in which there are several sequential stages: claims-making, media coverage, public reaction, policy making, social problems work and policy outcomes (Best, 2017). Best suggests that in some cases it is in the interest of people and the mass media to use large numbers in describing a problem to underline its seriousness. Even when the large number is found to be incorrect, the smaller number is not always announced in the media.
People who lack the resources to attract the attention of the general public may fail to construct a social problem at all. On the other hand, the construction of one problem may blind people to other, more serious problems.
For example, in order to conceal the failures of a president or prime minister in office, the powerful may try to divert people’s attention from major political problems by focusing on some minor problem or on their own achievements. In this sense, the concept of ‘agenda setting’ proposed by mass media research is relevant to the constructivist study of social problems.
Diversities in Constructivism
Social constructivism claims that it lets the people who make claims define social problems and avoids presupposing the existence of objective conditions. But in practice constructivists often find themselves contrasting what they treat as objective conditions with the subjective conditions depicted by claims-makers. This kind of selective relativism is termed ‘ontological gerrymandering’. Gerrymandering is an American political idiom that refers to drawing the boundaries of electoral districts so as to make some outcomes more likely. The charge of ontological gerrymandering, then, is that some analysts apply theoretical assumptions selectively so as to advance their arguments.
A dispute over ontological gerrymandering in the mid-1980s led to the division of the social constructivism of social problems into two schools. One is strict constructionism,2 which was represented by John I. Kitsuse, and the other is contextual constructionism, led by Joel Best. Strict constructivists accept the criticism of ontological gerrymandering and try to avoid referring to social conditions. The position that holds that we cannot avoid presupposing something is called contextual constructivism. Contextual constructivism posits that it is possible to assess claims about a condition by analyzing the process by which the condition is claimed to be a social problem.
Clearly, both strict and contextual constructivist studies mainly analyze the process by which some condition is claimed to be a social problem and the process by which that definition is successfully recognized, shared and authorized. Neither focuses on the conditions being defined as social problems. Strict constructivism tries
to maintain theoretical purity by not mentioning those conditions, by remaining detached from the material world and staying focused on discourse and narratives. But in practice strict constructivists’ empirical research cannot adhere to these principles, unless they limit themselves to theory and forgo producing empirical studies (Best, 1993; Holstein and Miller, 1993).
One term used by the contextual constructivists is ‘domain expansion’ (Best, 1990, 2017). This refers to the way the definition of a social problem can evolve over time. For example, the problem of domestic violence at one time referred to physical violence; the victim of domestic violence was the wife of a domineering husband who controlled the family’s life. However, claims-makers came to argue that the definition of domestic violence ought to be expanded to include not only physical violence but also economic control and verbal or psychological cruelty. Domain expansion is this process of expanding definitions. In fact, domestic violence as a social problem is no longer limited to husband and wife, but can include unmarried partners and elders. A similar example of domain expansion can be seen in the term ‘child abuse’. At one time, only physical abuse was considered child abuse, but the problem now includes psychological abuse as well as neglect and sexual abuse.
In the realm of political science and political sociology, one insightful concept developed in the study of public policies, social policies and social problems is the ‘target population’. Schneider and Ingram proposed this concept, which refers to the people who are targeted to receive benefits and help from public policy (Schneider and Ingram, 2008). Their theory examines how particular people are chosen to be targeted for help and how the help is constructed, including how to solve the problem or situation. They offer a typology of four types of target populations: ‘Advantaged’, ‘Contenders’, ‘Dependent’ and ‘Deviant’.
For example, ‘the advantaged groups have excessive resources for the influence of
policy such as size, wealth, mobilization potential, and positions of authority, but they also carry very positive social constructions as being deserving of the special benefits received from policy’ (Schneider and Ingram, 2008: 193). On the other hand, in social problems such as drug use, how the drug problem and the drug user are constructed is a crucial matter, and policies differ depending on how the target population is constructed. The drug dependence of an addict and that of a patient under medical supervision are constructed differently: the addict as a ‘Deviant’ who should be punished, the patient as a ‘Dependent’ who should be treated. Similarly, when juveniles commit crimes, it is not just the nature of the crime that is at issue; there is a wider examination of other factors, such as family history, intellectual ability and present living conditions, so that youths may be constructed as ‘Dependent’ children and juveniles rather than as ‘Deviant’. Different assessments lead to different public policies being chosen and executed.
‘Empirical research has confirmed the importance of social constructions of target populations and has shown how constructions are used, manipulated, reproduced, and changed in the policymaking process’ (ibid., p. 189). The approach has also had wider consequences: ‘[i]t brought the importance of social construction into the understanding of how power operates in policymaking; it brought the design and content, not just the process, of public policy into the discussion of symbols and values; and it connected policy design elements to messages that affect citizenship and participation’ (ibid., p. 191, emphasis in original).
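The four-part typology described above can be written down as a small lookup. The sketch below is illustrative only. It assumes the two-dimensional reading commonly given to Schneider and Ingram's scheme, in which a group's political power (strong or weak) is crossed with its social construction (positive or negative); the function name `classify` and its boolean parameters are hypothetical and do not come from the source.

```python
from enum import Enum


class TargetPopulation(Enum):
    """The four target populations named in the typology."""
    ADVANTAGED = "Advantaged"
    CONTENDERS = "Contenders"
    DEPENDENT = "Dependent"
    DEVIANT = "Deviant"


def classify(strong_power: bool, positive_construction: bool) -> TargetPopulation:
    """Map the two assumed dimensions onto the four target populations."""
    if strong_power:
        if positive_construction:
            return TargetPopulation.ADVANTAGED
        return TargetPopulation.CONTENDERS
    if positive_construction:
        return TargetPopulation.DEPENDENT
    return TargetPopulation.DEVIANT


# The chapter's drug example: similar dependence, different constructions.
patient = classify(strong_power=False, positive_construction=True)
addict = classify(strong_power=False, positive_construction=False)
print(patient.value, addict.value)
```

In this reading the patient and the addict differ only in how they are constructed, not in their power, which is why the same dependence can lead to treatment in one case and punishment in the other.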
Crime Problems
Many social constructivist analysts studying social problems focus on the construction of crime problems. The number of prisoners in
the United States was relatively stable up until about 1980, but it then rose sharply. From 1980 to 2000, the number of prisoners (including inmates in prisons and jails) increased roughly fourfold – from approximately 500,000 to more than two million. After 2000, the rate of increase slowed, but the number of prisoners continued to rise until 2008, when it peaked at 2,310,300. It has since declined, but at a slower rate than the previous increases. In 2016, the number incarcerated was 2,162,400.
Federal, state and local governments decide criminal policy, and they are influenced by public opinion. When people listen to or watch crime news, they are liable to believe that crimes happen randomly and to fear being victimized (Best, 1999). ‘Just deserts’, ‘zero tolerance’, ‘broken windows’, ‘war on drugs’, ‘mandatory minimum sentence’ and ‘three strikes and you are out’ were promoted as criminal justice policies. Many criminologists have warned about the harmful effects of incarcerating large numbers of people, but in vain. They argued that, in addition to the huge cost of running prisons, the social effects could be serious: families would lose income and be broken up, and prisons would become overcrowded, which could lead to fighting, injuries, accidents and problems with prison guards. In privately run prisons, many guards are paid low wages and are not properly trained, leading to the mistreatment of prisoners and to riots.
Political scientists also pointed out that incarcerating large numbers of people may damage one of the fundamentals of democracy – the election process. Relative to their share of the general population, minorities, especially black males, are disproportionately imprisoned. Most prisoners – and often former felons – are deprived of the opportunity to vote, and this may strongly influence the results of elections in some constituencies.3 It was not professionals’ warnings or public opinion that stopped the increase in the number of inmates, but the economic crisis and budget problems
caused by the Great Recession beginning in 2008. The term ‘mass incarceration’ only became common after 2008 (Roeder, 2015).
In Japan, the government revised the juvenile law in 2004. The revision followed a brutal crime that was widely publicized by the media and created a public sensation. Although the number of murders committed by juveniles was at its lowest level since World War II, the gruesome reportage alarmed the public and gave the impression that such crimes had increased. Politicians responded to the outcry in order to maintain their popularity and demanded that the government revise the juvenile law. It was also in the individual interests of governmental agencies and departments, such as the National Police Agency and the Ministry of Justice, to promote the revision (Ayukawa, 1995).
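The incarceration figures quoted earlier in this section can be checked with simple arithmetic. The sketch below uses only the numbers given in the text and treats "more than two million" as exactly two million, which is an assumption that makes the computed growth a lower bound. The helper `annual_rate` and the variable names are illustrative.

```python
# Figures quoted in the text (prisons and jails combined).
inmates_1980 = 500_000      # "approximately 500,000"
inmates_2000 = 2_000_000    # "more than two million", taken here as a lower bound
inmates_2008 = 2_310_300    # peak year
inmates_2016 = 2_162_400


def annual_rate(start: float, end: float, years: int) -> float:
    """Compound annual rate of change between two counts."""
    return (end / start) ** (1 / years) - 1


growth_factor = inmates_2000 / inmates_1980
rise_per_year = annual_rate(inmates_1980, inmates_2000, 20)
fall_total = (inmates_2008 - inmates_2016) / inmates_2008
fall_per_year = annual_rate(inmates_2008, inmates_2016, 8)

print(f"1980-2000: x{growth_factor:.1f}, about {rise_per_year:.1%} per year")
print(f"2008-2016: down {fall_total:.1%} in total, about {fall_per_year:.1%} per year")
```

On these assumptions the rise works out at roughly 7% a year over two decades, while the decline after the peak is under 1% a year, which is consistent with the statement that the number has fallen more slowly than it rose.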
Smoking Problems
In the last 50 years, one of the most drastic changes in society has been in the social definition of smoking. Once considered commonplace and even elegant, smoking is now judged offensive, and social policies have been introduced to regulate or suppress it. It is interesting to examine how smoking problems have been constructed in Japan, and how they are relevant to IR.
The first noticeable event in the history of smoking problems in Japan was the lawsuit brought in 1979 against the national railway of Japan to ban smoking in the coaches of the Super Rapid Express. Although the lawsuit was rejected, the result was that the majority of coaches were made non-smoking. The lawsuit gained the attention of the media through its use of the Japanese term ‘ken en ken’ (‘the right to dislike smoking’). As the two words ‘ken’ (‘dislike’ and ‘rights’) have strong meanings in the Japanese language, the term had a strong impact.
The progression of the anti-smoking movement was difficult and fraught with
problems. In Japan there is no system of class action; the lawyers are volunteers and do not gain any reward for winning a lawsuit. (We can say that these lawyers are pure Good Samaritans.) The other problem facing the non-smoking claims was that the Ministry of Finance had an effective monopoly on tobacco. Although the monopoly was privatized and became Japan Tobacco, a number of former Ministry of Finance bureaucrats continued to serve as its CEOs. There was also a law that promoted the development of the tobacco business. Thus, any movement to control smoking or tobacco consumption was made more difficult by the representatives in the National Diet, who wanted to maintain a good relationship with the Ministry of Finance in order to get their own budgets approved.
The Ministry of Health, Labor and Welfare and various claims-making groups tried to regulate smoking and promote non-smoking. However, the opposition was hard to overcome. In the end, their tactic was to raise the price of cigarettes and tobacco products and so secure the maximum tax income for the Ministry of Finance (Ayukawa, 2001).
There was also some pressure from outside Japan, from international organizations which considered smoking dangerous to health. The World Health Organization (WHO) urged Japanese cooperation and, in 2003, the Japanese government signed the WHO Framework Convention on Tobacco Control. It was passed in the National Diet in 2004 and became effective in 2005. The international treaty, which is considered to rank above ordinary law in Japan, was a boon to the Ministry of Health, Labor and Welfare and the non-smoking claims-makers, who gained a stronger position with arguments about the risks of passive smoking. Victims, including children, are vulnerable when they are forced to be in a smoke-filled environment, as they risk cancer and other illnesses (Ayukawa, 2015).
Although the medical profession has established an academic society to research
the effects of smoking, many of its members receive funds from pharmaceutical companies, and doctors are paid to treat cancer patients, so their findings may be influenced by financial interests.
In 2018, the National Diet approved the first nationwide legislation, which bans the use of certain types of tobacco at schools, hospitals and government offices. In addition, restaurants of a certain size must have a separate, well-ventilated smoking area apart from the eating and drinking areas. This amendment revised the Health Promotion Law, but it falls short of the original plan put forward by the Ministry of Health, Labor and Welfare. In anticipation of the Olympics in 2020, the Tokyo metropolitan municipal council separately passed an ordinance banning smoking in eateries with employees unless the employees are family members. The ordinance contains another article, which bans smoking in any private living space where a child might be living.4
E-cigarettes are sold in Japan, and about 10% of smokers now use them. Claims-making groups and some medical doctors oppose e-cigarettes, which they claim damage health in the same way as traditional cigarette smoking.
Four points can be learned from the problem of smoking in Japan. The first is that rhetoric can be an important factor in changing or influencing both public opinion and governmental policy. Second, certain domestic issues can be linked to global problems, and global pressure can alter a domestic situation. In terms of smoking, there has been worldwide concern about cancer caused by tobacco, and this has had some impact on the process of claims-making and policy execution within the domestic sphere. Third, social structure can prevent or hinder changes demanded by claims-makers, especially when those changes run against the interests of powerful actors, as is the case with the Ministry of Finance and tobacco.
Finally, certain new situations that can be called an ‘emergency’ or ‘contingency’, like the Tokyo Olympics, can affect policy and change social attitudes.
Constructivism in IR
When the US Senate was discussing invading Iraq in response to its aggression against Kuwait, a story was made public which influenced the decision. A young girl said she had witnessed Iraqi soldiers cruelly killing a baby. Her tears and emotional story turned the senators and the public against Iraq, and the United States’ involvement in the war was approved. However, months later it was revealed that the girl who had told the story of the baby’s murder was in fact the daughter of the ambassador of Kuwait to the United States, and that she had not been in Kuwait when the Iraqi troops invaded.
Another story was about how Iraqi soldiers had destroyed an oil plant, causing damage and pollution, and photos of oil-covered birds were shown. This story was also used to justify the decision to attack Iraq. Later, it was revealed that American troops had destroyed the oil plant and caused the pollution.
These two cases demonstrate how powerful the media and government can be in persuading people to believe something. This concern is one of the reasons why social constructivism studies people’s subjective world and their subjective definitions of situations, and the cases show how influential such definitions can be in IR.
Alexander Wendt has identified three models of the culture of anarchy in IR. One of these is the Hobbesian culture, which realist diplomats and realist scholars of IR believe in; this type presupposes that everyone is fighting against everyone else.5 If all nations define the Hobbesian situation as a common, universal fact or objective law, and all nations act on that belief, then the law or belief becomes reality as the result of a ‘self-fulfilling prophecy’ (Wendt, 1999: 263, 309). Wendt points out that it is important to interpret what people believe to be objective, universal law and see
that it is constructed through interactions between people who have similar subjective beliefs.
Social constructivism maintains that the main idea in IR is human agency, which is able to interpret, define and explore a situation. Realism analyzes IR as nations pursuing their own interests egoistically, from the viewpoint of material forces, which leads us to view human beings as if they had no will and behaved automatically. In social constructivism, however, human agency is a crucial concept. When a social scientist emphasizes the concept of norm, it is usually viewed as a structure which controls individuals’ behavior from outside. This resembles structural functionalism, which views structural pressures as forcing people to obey the established norm or rule. However, one of the prominent constructivists in the study of IR, Wendt, recognizes that constructivism originated in both symbolic interactionism and the structuration theory of Anthony Giddens in sociology. Giddens’ theory of structuration differs from traditional structural functionalism in that individuals and structure adapt to each other.
Also, the situation of IR is totally different from that of a single domestic society; it is far more fluid and unsettled. Norms or rules in IR have to be created through interaction, negotiation and bargaining. Rules also work as protection against large, powerful nations abusing their power to dominate small, powerless nations or organizations. International society consists of a large number of international institutions which were established by treaties and rules that nations ratified or agreed upon. These treaties and rules were the result of a great deal of narrative, rhetoric and discourse. As pointed out above, these rules regulate the behavior of nations; international agencies therefore need to appreciate the rules as well as consider the material power of a nation.
In this sense the constructivist study of rules concerning IR
is fundamentally important (Onuf, 1997, 1998).
Liberalism in IR views nations as interdependent and recognizes the role of authorized international organizations in regulating nations. But with the advancement of globalization, it is not only nations and international organizations such as the United Nations that play important roles; many different kinds of non-governmental organizations (NGOs) and non-profit organizations (NPOs) are also important agents and agencies in the world.
The authorized international organizations, which were established by international conventions or treaties among nations, include the United Nations and its commissions, councils and divisions such as the Human Rights Council, the Human Rights Committee, the Office of the High Commissioner for Human Rights and the Office of the UN High Commissioner for Refugees, as well as the European Union, the Council of Europe, the European Commission of Human Rights and the International Criminal Court (ICC). The emergence of rules is necessary to establish such new organizations. For example, to establish the ICC, the Rome Statute needed to come into effect.
Even under the umbrella of the United Nations, these organizations have accumulated vested interests and developed conflicts of interest. This has resulted in requests for new interpretations of rules as well as a crucial need to formulate new rules and have them enforced by international organizations. For example, this problem became evident with the question of the protection of refugees during internal wars in African countries. The definition of refugees was revised and expanded to include not only refugees who fled across a border but also people displaced within their own country. This process of ‘domain expansion’ (Best, 2017) can be analyzed satisfactorily by constructivism. The ICC prosecuted not only war criminals who invaded other countries and committed genocide, but also a president who
suppressed people inside his own country. The ICC even successfully prosecuted those guilty of damaging important historical cultural properties. Constructivism contributes to these kinds of studies on the emergence and formation of rules, the interpretation of rules, and the life cycle, life course and life history of rules.
In the age of globalization, there are many NGOs and NPOs that play an important and influential role in the world. These groups are respected and their claims are often made public. Claims made by Amnesty International, Human Rights Watch and others are recognized and reported by the mass media. Major television networks and prestigious newspapers communicate their claims, which can influence members of the United Nations and other authorized international organizations.
For example, one important Japanese group that is internationally active and makes claims that the government cannot ignore is the Japan Federation of Bar Associations, which represents lawyers all over Japan. When the Japanese government submitted its reports on the Convention on the Rights of the Child, the Convention against Torture and the International Covenant on Civil and Political Rights to the United Nations, the Japan Federation of Bar Associations and some NGOs and NPOs submitted counter-reports. They not only sent reports but also invited members of the Human Rights Committee to see relevant institutions such as prisons and detention centers. The activities of these organizations are important, as they point out problems and offer criticism and recommendations that do not necessarily follow the government’s position.
Constructivist research contributes pertinent studies on the dynamics of international organizations. Since these organizations are influential agents in present-day IR, it is important to know about their interactions and negotiations.
Constructivism in Europe with a Focus on Scandinavian Countries
Constructivist ideas have influenced Scandinavian research on a variety of topics, such as crime statistics, violence at work, drug policy and ethnic identity. Examples are given in what follows.
Construction of Crime Statistics
Hanns von Hofer, a professor in the Department of Criminology at Stockholm University, examined how statistics are constructed, especially those dealing with crime, so that they can be interpreted more carefully (von Hofer, 2000). A case in point is the crime of rape. In Sweden, the number of reported rapes is exceedingly high. This is because the definition of rape is broader and more comprehensive than in other countries. Feminism and women’s rights are strong there, and half the representatives in parliament as well as half the ministers in the cabinet are women. Sweden was the first country to punish the customer who procured a prostitute. In this society, women are treated as equals, and they do not feel inferior to men. They understand their rights and are able to report a rape without fear. The high rape rate might not indicate that Sweden is any more dangerous for women than other places, but rather that women there feel they have the right to report the crime. In some patriarchal societies, women may not report a rape because the police may lack sympathy or the general public may stigmatize the victim. In those countries, therefore, the statistics for rape might be quite low. Although the statistics might indicate that there were more rape victims in Sweden than in another country, it could be that rapes in the other country were simply not reported and so never became a statistic. Thus, a high statistic for sex crimes may not reflect the reality of a
country’s situation. It is difficult to compare two countries on the basis of statistics, because each country might have different attitudes and beliefs as well as different systems and procedures of criminal justice. Official statistics that show an increase or decrease in a crime are therefore problematic when two or more countries are examined or compared. It is also important to take care when tracking crime statistics across time. Apparent increases or decreases may reflect not so much a change in criminal behavior as a revision of the law, changes in the definition of a crime or changes in the methods of compiling successive statistics. The numbers may also be influenced by the shifting attitudes of police and prosecutors, or by changing court procedures. Cultural changes, such as the recognition of victims and the impact of feminist thought on women, can also affect statistics. Another factor to consider when examining statistics based on self-report surveys is the way the data are collected. At one time, research staff or hired personnel spoke to individuals or left questionnaires at households to be sent in or picked up later. Today, most surveys are conducted over the internet and the collected data are analyzed by computer. Data collected over the internet tend to over-represent the segments of the population that are comfortable using smartphones or computers. This can be problematic, as can the reliability of answers to questionnaires (Best, 2012).
Violence at Work
One of the classics of constructivism on statistics is ‘A Note on the Uses of Official Statistics’ by Kitsuse and Cicourel (1963). Drawing on this paper, four Swedish criminologists examined the statistics on violence at workplaces and concluded that four main factors contributed to the increase
in the number of cases of violence at work (Estrada et al., 2010): an expanded definition of the crime, media reports, social intolerance of violence and working conditions. Another research article is entitled ‘Violence at Work in Finland’ (Heiskanen, 2007). It points out that the number of women working in health care, social work and education has increased, which means that more women are working in direct contact with students, clients and patients. A mismatch between the supply of and demand for services has caused people to complain and act aggressively toward these women. Increasingly, workers no longer accept violent behavior and are willing to report incidents. This raises the statistics for violence in the workplace in Finland. Research carried out in Denmark focused on violence in four workplace settings: psychiatry, eldercare, prison and probation services, and special schools. The workers with the highest level of vulnerability were those working at special schools (Rasmussen et al., 2013). In Norway, research was carried out across all medical specialties. The report pointed out that from 1993 to 2014 the number of female doctors increased, but that there was no increase in violence (Johansen et al., 2017). These statistics and studies show how the constructivist viewpoint can be useful.
Drug Policy
How drugs are perceived is an interesting theme from the perspective of constructivism. Among Scandinavian countries, Denmark has a lenient attitude toward marijuana, similar to that of the Netherlands. Other Scandinavian countries, such as Sweden, have a stricter attitude toward all drugs and have traditionally had a zero-tolerance policy. Even alcoholic beverages that contain more than 5% alcohol are sold at
designated shops, which are closed on weekends. This is because it is very cold in winter and the government hopes to prevent alcoholism from becoming prevalent. Reporting on drugs in mainstream Swedish newspapers appears to be changing. It has been noted that drug policy in Sweden may depend on international policies, so that if policies change internationally, Swedish policy might follow the trend (Månsson, 2016). For example, the medical use of marijuana has been legalized in some states of the United States, and its recreational use has been legalized in Canada.
Ethnic Identity
During the 2015 refugee crisis in Europe, Sweden accepted the second largest number of refugees but, because of its modest size, had the highest rate of refugees per head of population. Many immigrants live in Sweden. Researchers have studied how girls who come from African countries construct their life plans, including their identities and their expectations of themselves as adults (Mohme, 2014). Sweden not only has immigrants from Eastern Europe, the Middle East and Africa, but also has minorities such as the Sami in the northern regions, who are recognized as a minority by the United Nations. The Swedish government must guarantee their opportunities to use their language, to transmit their culture through the generations and to make their own political decisions. Recently, however, it has not been easy to find people who have these language abilities. Among some Sami groups, the number of people who speak the traditional language is estimated to be fewer than 100, or less than 10% of the group. There is a report that some young Sami girls prefer to express themselves in English rather than in either their native language or Swedish (Lindgren et al., 2017). Looking at these
studies that are relevant to constructivism, we can expect further development of varied constructions of identity, including changes in the notions of national identity and ethnicity in Sweden.
Further Comments
In ‘Constructionist Themes in the Historiography of the Nation’, Bo Stråth wrote that ‘the concepts of culture, collective memory and myth are central to historical constructionist approaches’ (2008: 627) and that culture is ‘a process of symbolic work that frames interpretation and community. It is emergent and not pregiven in its expressions. Culture is a matter of communication and negotiation’ (Stråth, 2008: 628). As shown above, Scandinavian countries have produced insightful studies based on the constructivist perspective, ranging from analyses of how statistics are constructed to empirical research.
Constructivism in Asia
Japan
Constructivist studies have become popular throughout Asia, with the majority of papers written by Japanese scholars. Japanese customs and institutions are different from those in Western countries, but Japan is one of the most democratic countries in Asia. Models or programs of constructivism that were developed in the United States or in Europe are easily applied to Japan. One of the pioneers of constructivist studies was John I. Kitsuse, a second-generation Japanese American. His connection with Japan made him influential in Japanese sociology. Kitsuse’s approach was based on strict constructionism, which analyzes language in relation to a social phenomenon and does not consider the objective
conditions of the problem. Recently, contextual constructivism has become a popular approach. Some popular themes among Japanese constructivists concern youth problems, for example social problems concerning young girls’ sexual behavior (Yamamoto, 2001), youths’ exposure to pornocomics (Suzuki, 2001; Akagawa, 2015) and juveniles’ deviant behavior (Yamazaki, 1994; Ayukawa, 1995). Recently, there have been social constructivist studies of human rights policies (Ayukawa, 2015, 2019). There have also been papers about Japan written by non-Japanese constructivists (Nichols, 1995).6
The Interest in Constructivism in China
In the 1990s, motorcycles were the symbol of development, modernization and prosperity. This image changed after a newspaper in Guangzhou complained about motorcycle noise, pollution and other problems. After this campaign, motorcycles were banned. The constructionist paper by Jianhua Xu (2015) examines how the motorcycle became a target of criticism. Another political issue that became the subject of a social constructivist study was the Hong Kong government’s attempt to align the curriculum of primary and secondary education with that of the mainland. Michael Adorjan and Ho Lun Yau wrote ‘Resinicization and Digital Citizenship in Hong Kong: Youth, Cyberspace, and Claims-making’ (2015). The writers look at the claims-making group, made up mainly of young people born in the 1990s, who opposed this move. The young people claimed that the move would deprive them of basic human rights, including freedom of expression. They used Facebook and cell phones to protest, and organized mass demonstrations. Reading their explanations, we can infer that there might be some similarities and common points between
constructivism and the resource mobilization theory of social movements (McCarthy and Zald, 1977; Best, 2017). China consists of multiple ethnicities. According to one interesting paper, Muslim Uighurs endure marginalization and experience trauma and anger, and these in turn are major factors in violence and terrorist behavior (Li and Niemann, 2016). Han people have migrated into Xinjiang in order to integrate Muslim people into Chinese culture. The Chinese government tries to develop Xinjiang economically, and promotes campaigns to condemn violence and prevent rioting or terrorism. Nevertheless, there have been incidents in Beijing in 2013 and at the railway station in Kunming city. At the end of their paper, Li and Niemann note that Uighur-language education has been marginalized, and wonder what will happen with the Chinese government’s very large-scale project for the Silk Road Economic Belt, under which a high-speed railway bound for European and Arab countries will be constructed through Xinjiang, which might bring big changes to Xinjiang and the Uighur community. There is an interesting aspect to consider when constructivist papers are written by Chinese scholars. The people in power can look at these papers and learn, from the constructivist discourse, how to manipulate people. This would be a reflexive use of constructivism, as governing bodies learn from the conclusions reached about the process of claims-making and apply this knowledge to their planning, administration and activities. In the paper ‘China and the Remolding of International Human Rights Norms’ by Yuan et al. (2017), the authors refer to constructivist viewpoints and maintain that China cooperates with the Human Rights Council of the United Nations. China insists that ‘human rights are essential matters within the domestic jurisdiction of a country. Respect for each country’s sovereignty and non-interference in internal affairs are universally recognized
principles of international law; they are applicable to all fields of international relations and naturally apply to the field of human rights as well’ (Yuan et al., 2017: 38). The authors of this paper refer to constructivism and discuss the life cycle of international norms. In place of the process of emergence, diffusion and internalization, they propose another process of origination, diffusion and remolding. They claim that through dialogue on norms, discourse critique, self-remolding and other means, China has enriched the practice of remolding international human rights norms with a human rights theory centered on the right to survive and develop, thereby providing a new approach and new angle of vision that allows non-Western countries to break away from the monist approach of norm development (ibid., p. 25). The paper also states that ‘following the Constructivist turn in international relations theory, international norms entered the mainstream of international relations research and rapidly become a hot topic’ (ibid., p. 26). The authors criticize the ‘Western bias’ or ‘Euro centrism’ of the uni-directional expansion model, and propose ‘the remolding of norms occurring through the meeting or clashing of norms in a process of growth–diffusion–remolding’ (ibid., p. 28). This paper shows how claims and narratives can be analyzed from a constructivist perspective and how that analysis can reveal the reflexivity of constructivism. It is interesting to see how the Chinese government tries to revise and remodel the definition of human rights. It would also be interesting to note how scholars cooperate with the Chinese government and how NGOs and NPOs react to, interact with and negotiate over China’s remolding activities.
Ethnicity, Gender and Identity
Most Asian nations consist of multiple races and ethnicities, and have numerous languages. There are only a few nations that
believe that the country consists of only one race of people. South Korea is one such nation, with only 2.25% of the population in 2011 being foreign. For two decades there has been a debate about accepting immigrants, but recently newspapers have changed the term ‘foreigners’ to ‘residents’, ‘immigrants’ and ‘multicultural citizens’. Constructivist studies look at how changes of terms or narratives in the media can influence people’s attitudes (Park, 2014). In most Asian countries, the relationship between race and ethnicity is complex and multilayered. For example, although the majority of Indonesians are Muslim, the Balinese are mostly Hindu; however, in the search for their ethnicity, the majority of Balinese chose to adhere to a minority monotheistic form of Hinduism and not to majority Indian Hinduism (Johnsen, 2007). People may try to construct their identity by looking to their ancestors’ ethnicity, but in a society that is developing rapidly in the age of IT, people may construct their identity not on the basis of race or ethnicity but by seeking social status in a rapidly expanding economy (Ellis et al., 2012). This might be especially the case for women who pursue their careers in the expanding business world, breaking through patriarchal tradition (Fernando and Cohen, 2011). Gender is socially constructed and is acknowledged as one of the central traits of identity. In Malaysia, a predominantly Muslim country, homosexuality is forbidden. The number of Malaysian respondents to a survey of gay people is very low because, according to Mark Stephan Felix (2016), the people involved were careful about to whom, where and when they disclosed their sexual identity. This was in order ‘to minimize instances where they would have to compromise either their faith or their identities’ (ibid., p.
115) and ‘to make the best of the duality of their situation to love their lives and to construct an identity that is comfortable and acceptable to themselves’ (ibid., p. 117). In India, the law banning homosexuality was struck down in 2018.
Tsunami
The term ‘tsunami’ originated in Japan. Tsunamis, like typhoons and earthquakes, are natural disasters that often hit many other Asian countries as well. When an earthquake occurs under the sea, especially at the bottom of narrow gulfs, a tsunami can develop. When a tsunami hit Indonesia, Malaysia, Sri Lanka and Thailand in 2004, hundreds of thousands of people were killed. It was not only the local populations that were affected; many tourists from America and Europe also lost their lives. Among foreign travelers, Swedes suffered the largest number of deaths, followed by Germans. The 2004 tsunami was so powerful that it reached the shores of Africa, and an estimated 300 people were killed in Somalia. The mass media coverage of the tsunami in Sweden differed from that in other countries. To protect privacy, the names of the victims were never broadcast or published. This caused great concern among the relatives and friends of Swedish travelers in Asia (Kivikura and Nord, 2016). A constructivist study examined newspapers in three countries, India, Indonesia and Thailand, in order to look at claims-making in each. It studied their reports of damage and their claims for support, and looked at the results (Letukas, 2013). The study noted that Indonesia complained that it had not been provided with long-term recovery plans, while India was not satisfied with culturally inappropriate donations. The Thai newspapers reported that the help for the fishing business was satisfactory but that the aid needed to support tourists was not enough. Letukas concluded that Indonesia did not complain enough while India’s complaints were excessive. The author of the article is American and consequently her main interest is the relationship between the United States and these three countries. She noted that ‘[m]edia in the United States
argued that the new Marshall Plan policy was favorable in south Asia, resulting in the construction of an effective policy outcome. In contrast, Western relief was constructed as excessive in India, because massive amounts of unusable aid hindered, rather than helped, the recovery process’ (Letukas, 2013: 280). It could be argued that the damage in Indonesia was so severe that the country had to depend on help from developed countries. India, on the other hand, has a culture sensitive to what it might consider an insult and perhaps did not need the help as much as Indonesia. The different attitudes and responses to the aid were the result of the scale of damage. In Indonesia, approximately 130,000 people were killed by the tsunami and 40,000 were missing. This is more than 10 times as many as in India, where most of the damage occurred on the islands situated near Indonesia. This constructivist study also illustrates a form of reflexivity. The author’s description, which refers to the media construction of the situation, is itself a kind of claims-making. Other constructivists who read this study can examine and challenge it as a construct in itself. Constructivist studies look not only at the tsunami’s after-effects in terms of damage and help from developed countries, but also at how local people were victims of both the disaster and the reconstruction process. Tamil Nadu, a region in the southern part of India, was seriously affected by the 2004 tsunami. A constructivist study found that the people who lived there were treated differently depending on social characteristics such as religion, caste and socioeconomic status as well as gender (Luke, 2012). The article points out that gender influenced victims’ access to rehabilitation resources and, for women, to toilet facilities. Women could not get proper medical treatment, which was especially serious for pregnant women, and the fatality rate of women was higher than that of men in that region.
The author concluded that under normal conditions, women were
considered less important, but that in the aftermath of the tsunami the situation worsened and they were even more vulnerable than usual. ‘The origins of such disproportionate effects are not natural: rather, they are socially constructed’ (Luke, 2012: 19). In 2010 another large tsunami hit Indonesia and its neighboring islands. The Mentawai Islands form a chain of islands off the coast of Sumatra that is famous as a surfing destination, where a ‘Nirvanic myth’ was constructed by surfing tourism. This myth referred to a remote, exotic and beautiful area with the best waves for surfing. The islands were developed by foreign surfing tourism companies that did not bother to build a good relationship with the local people. When the tsunami hit the islands, it caused the deaths of 400 people, destroyed 4,000 households and displaced more than 20,000 people (Ponting et al., 2005; Ponting and McDonald, 2013).
Collective Identity
The construction of identity concerns not only the population of a nation-state, or a people or group within a country, but also people both inside and outside a country. The construction of identity also matters for international organizations. The Association of Southeast Asian Nations (ASEAN) has not reacted to the Rohingya crisis at the border between Myanmar and Bangladesh. It has tried neither to solve the situation nor to intervene in the conflict. Thousands of Rohingya refugees have fled Myanmar, and the UN Office of the High Commissioner for Human Rights has issued official reports that Myanmar soldiers have committed genocide, even though such crimes are banned by international treaties. ASEAN has not intervened in the Rohingya crisis even though it is a signatory to the ASEAN Charter, the ASEAN Intergovernmental Commission on Human Rights and the ASEAN Declaration of Human
Rights. To date, few papers analyze the situation from a constructivist viewpoint, but those that do point to similar cases and posit that this inaction is caused by the norm of non-interference as well as the absence of a collective identity among ASEAN member states. It is ironic, then, that the homepage of ASEAN shows its motto: One Vision, One Identity, One Community. One constructivist author states that ‘[t]he absence of collective identity led to natural interest-oriented political processes rather than a regional-oriented one’ (Rosyidin, 2017: 53). ASEAN has ‘the unreflexive cognitive and behavioral qualities of regional relations’ (Glas, 2016: 833), which can be named the ‘habit of peace’. These articles demonstrate that both constructivist analysis and the concept of reflexivity are relevant and meaningful.
Conclusion
Constructivist studies bring fruitful and original insights into the fields of political science and sociology. As well as offering astute analysis, the studies can yield productive results. The constructivist approach is so intellectually appealing that Asian as well as Western scholars have adopted it. As long as human beings interact, interpret and produce meaning, constructivism will continue to contribute to the study of political science and the social sciences.
Notes
1 These activities of social problem workers and policy outcomes may cause new claims-making.
2 In political science the label ‘strict constructionism’ has been applied to jurists who seek to interpret the American Constitution as narrowly as possible. In other words, they refrain from progressive, comprehensive or broad interpretations that seek to adapt the Constitution to fit modern advanced society’s circumstances, or to use the Constitution as a tool to promote social justice. When strict constructionism was named in the study of social problems, this connection was implicit, but political scientists should not confuse the different ways the same term is used in sociology and in law (Best, 2019).
3 Politicians who have been found guilty and imprisoned usually experience the damage caused by the get-tough-on-crime social policies they promoted when in office; they are astonished to discover huge numbers of non-violent offenders and the disproportionate majority of inmates who belong to racial minorities in prison (Smith, 2015).
4 Here we can see a rule which suggests that a pure innocent such as a child should not be victimized, and they should be protected by law or ordinance at any sphere.
5 The other two models of culture of anarchy which Wendt pointed out are Lockean, based on the relationship of rivalry, and Kantian, based on the role of friendship.
6 As this chapter is on social sciences, I have not mentioned psychology. There are several references of constructivism concerning the work of Kenneth J. Gergen, and also in brief therapy in psychology in Japan.
References
Adorjan, Michael and Ho Lun Yau. 2015. ‘Resinicization and digital citizenship in Hong Kong: Youth, cyberspace, and claims-making’, Qualitative Sociology Review, 11(2): 160–178.
Akagawa, Manabu. 2015. ‘Regulating pornocomic sales for juveniles in Japan: Cycles and path-dependence of a social problem’, Qualitative Sociology Review, 11(2): 62–73.
Association of Southeast Asian Nations. ‘ASEAN motto’, https://asean.org/asean/about-asean/asean-motto/. Retrieved 5 October, 2018.
Ayukawa, Jun. 1995. ‘The construction of juvenile delinquency as a social problem in post World War II Japan’, in Holstein, James A. and Gale Miller eds. Perspectives on Social Problems, vol. 7: 311–329, Greenwich, Connecticut: JAI Press.
Ayukawa, Jun. 2001. ‘The United States and smoking problems in Japan’, in Best, Joel ed. How Claims Spread: Cross-national diffusion of social problems, 215–242, Hawthorne, New York: Aldine de Gruyter.
Ayukawa, Jun. 2015. ‘Claims-making and human rights in domestic and international spheres’, Qualitative Sociology Review, 11(2): 110–121.
Ayukawa, Jun. 2019. ‘Social constructionism in the study of social problems and globalization: International human rights narratives and efforts to abolish death penalty in Japan’, American Sociologist, 50(2): 290–299.
Becker, Howard S. 1963. Outsiders: Studies in sociology of deviance, New York: Free Press.
Berger, Peter L. and Thomas Luckmann. 1966. The Social Construction of Reality: A treatise in the sociology of knowledge, New York: Doubleday.
Best, Joel. 1990. Threatened Children: Rhetoric and concern about child-victims, Chicago: University of Chicago Press.
Best, Joel. 1993. ‘But seriously folks: The limitation of the strict constructionist interpretation of social problems’, in Holstein, James A. and Gale Miller eds. Reconsidering Social Constructionism: Debates in social problems theory, 129–147, New York: Aldine de Gruyter.
Best, Joel. 1999. Random Violence: How we talk about new crimes and new victims, Berkeley, California: University of California Press.
Best, Joel. 2012. Damned Lies and Statistics: Untangling numbers from the media, politicians, and activists, 2nd ed., Berkeley, California: University of California Press.
Best, Joel. 2017 (2008). Social Problems, 3rd ed., New York: W. W. Norton & Company.
Best, Joel. 2019. ‘The bumblebee flies anyway: The success of contextual constructionism’, American Sociologist, 50(2): 220–227.
Best, Joel and Donileen R. Loseke. 2018. ‘Prospects for the sociological study of social problems’, in Treviño, A. Javier ed. The Cambridge Handbook of Social Problems, 169–182, New York: Cambridge University Press.
Edelman, Murray. 1988. Constructing the Political Spectacle, Chicago: University of Chicago Press.
Ellis, Nick, Michel Rod, Tim Beal and Val Lindsay. 2012. ‘Constructing identities in Indian networks: Discourses of marketing management in inter-organizational relationships’, Industrial Marketing Management, 41(3): 402–412.
Estrada, Felipe, Anders Nilsson, Kristina Jerre and Sofia Wikman. 2010. ‘Violence at work: The emergence of a social problem’, Journal of Scandinavian Studies in Criminology and Crime Prevention, 11(1): 46–65.
Felix, Mark Stephan. 2016. ‘Gay identity construction of ten Muslim male undergraduates in Penang, Malaysia: A phenomenological qualitative study’, Asia-Pacific Social Science Review, 16(2): 113–119.
Fernando, Weerahannadige Dulini Anuvinda and Laurie Cohen. 2011. ‘Exploring the interplay between gender, organizational context and career: A Sri Lankan perspective’, Career Development International, 16(6): 553–571.
Glas, Aarie. 2016. ‘Habits of peace: Long-term regional cooperation in Southeast Asia’, European Journal of International Relations, 23(4): 833–856.
Heiner, Robert. 2002. Social Problems: An introduction to critical constructionism, New York: Oxford University Press.
Heiskanen, Markku. 2007. ‘Violence at work in Finland; Trends, Contests, and Prevention’, Journal of Scandinavian Studies in Criminology and Crime Prevention, 8(1): 22–40.
Holstein, James A. and Gale Miller eds. 1993. Reconsidering Social Constructionism: Debates in social problems theory, New York: Aldine de Gruyter.
Holstein, James A. and Jaber F. Gubrium eds. 2008. Handbook of Constructionist Research, New York: Guilford Press.
Johansen, Ingrid Hjulstad, Valborg Baste, Judith Rosta, Olaf G. Aasland and Tone Monken. 2017. ‘Changes in prevalence of workplace violence against doctors in all medical specialties in Norway between 1993 and 2014: A repeated cross-sectional survey’, BMJ Open, 7(8): e017757. doi:10.1136/bmjopen-2017-017757
Johnsen, Scott Adam. 2007. From Royal House to Nation: The construction of Hinduism and Balinese ethnicity in Indonesia. PhD dissertation, Department of Anthropology, University of Virginia, Charlottesville.
Kitsuse, John I. and Aaron V. Cicourel. 1963. ‘A note on the uses of official statistics’, Social Problems, 11(2): 131–138.
Kivikura, Ullamaija and Lars Nord eds. 2016. After the Tsunami: Crisis communication in Finland and Sweden, Nordicom, https://www.researchgate.net/publication/283569483_After_the_Tsunami._Crisis_Communication_in_Finland_and_Sweden. Retrieved 10 October, 2018.
Letukas, Lynn. 2013. ‘Global policy outcomes: Comparing reactions to post tsunami aid’, in Best, Joel and Scott R. Harris eds. Making Sense of Social Problems, 265–281, Boulder, Colorado: Lynne Rienner Publishers.
Li, Yuhui and Christopher Niemann. 2016. ‘Social construction of ethnic identity and conflict: The cases of the Chechen and the Uighur’, Journal of Muslim Minority Affairs, 36(4): 584–596.
Lindgren, Eva, Asbjørg Westum, Hanna Outakoski and Kirk P. H. Sullivan. 2017. ‘Meaning-making across languages: A case study of three multilingual writers in Sápmi’, International Journal of Multilingualism, 14(2): 124–143.
Luke, Jurian. 2012. ‘The gendered nature of disaster: Women survivors in post-tsunami Tamil Nadu’, Indian Journal of Gender Studies, 19(1): 1–29.
Månsson, Josefin. 2016. ‘The same old story?: Continuity and change in Swedish print media constructions of cannabis’, Nordic Studies on Alcohol and Drugs, 33(3): 267–285.
Marx, Karl. 1844 (1932). Ökonomisch-philosophische Manuskripte aus dem Jahre 1844 (Economic and philosophic manuscript), Karl Marx Friedrich Engels historisch-kritische Gesamtausgabe, im Auftrage des Marx-Engels-Instituts, Moskau, Herausgegeben von V. Adoratskij, Erste Abteilung, Bd. 3, Marx-Engels-Verlag G.M.B.H., Berlin. Shirotsuka Noboru and Kichiroku Tanaka tr. Keizaigaku Tetsugaku Soko, Tokyo: Iwanami-shoten. 1964.
Marx, Karl and Friedrich Engels. 1845–1846. Die Deutsche Ideologie (German Ideology), Hiromatsu Wataru ed. and tr. With supplemental translation by Masato Kobayashi, Doitsu Ideorogi, Tokyo: Iwanami-shoten. 2002.
McCarthy, John D. and Mayer N. Zald. 1977. ‘Resource mobilization and social movements’, American Journal of Sociology, 82(6): 1212–1241.
Mead, George Herbert. 1934. Mind, Self and Society, Chicago: University of Chicago Press.
Mohme, Gunnel Maria. 2014. ‘Imagined adulthood under transition: Somali-Swedish girls’ life-planning in a late modernity context’, Gender and Education, 26(4): 432–447.
Nichols, Lawrence T. 1995. ‘Cold wars, evil empires, treacherous Japanese: Effects of international context on problem construction’, in Best, Joel ed. Images of Issues: Typifying contemporary social problems, 2nd ed., 313–334, New York: Routledge.
Onuf, Nicholas. 1997. ‘A Constructivist Manifesto’, in Burch, Kurt and Robert A. Denemark eds. Constituting International Political Economy, 7–17, Boulder, Colorado: Lynne Rienner Publishers.
Onuf, Nicholas. 1998. ‘Constructivism: A user’s manual’, in Kubálková, Vendulka, Nicholas Onuf, and Paul Kowert eds. International Relations in a Constructed World, 55–78, Armonk, New York: M.E. Sharpe.
Park, Keumjae. 2014. ‘Foreigners or multicultural citizens? Press media’s construction in immigrants in South Korea’, Ethnic and Racial Studies, 37(9): 1565–1586.
Ponting, Jess and Matthew G. McDonald. 2013. ‘Performance, agency and change in surfing tourist space’, Annals of Tourism Research, 43: 415–434.
Ponting, Jess, Matthew McDonald and Stephen L. Wearing. 2005. ‘De-constructing Wonderland: Surfing tourism in the Mentawai Islands; Indonesia’, Loisir et Société/Society and Leisure, 28(1): 141–162.
Rasmussen, Charlotte Ann, Annie Hogh and Lars Peter Andersen. 2013. ‘Threats and physical violence in workplace: A comparative study of four areas on human service work’, Journal of Interpersonal Violence, 28(13): 2749–2769.
Roeder, Oliver. 2015. ‘A Million People Were in Prison Before We Called It Mass Incarceration’, FiveThirtyEight.com, https://fivethirtyeight.com/features/a-million-people-were-in-prison-before-we-called-it-mass-incarceration/. Retrieved 8 September, 2018.
Rosyidin, Mohamad. 2017. ‘Why collective identity matters: Constructivism and the absence of ASEAN’s Role in the Rohingya crisis’, The Asia-Pacific Social Science Review, 17(1): 53–65.
Schneider, Anne L. and Helen Ingram. 2008. ‘Social constructions in the public policy’, in Holstein, James A. and Jaber F. Gubrium eds. Handbook of Constructionist Research, 189–211, New York: Guilford Press.
Smith, Jeff. 2015. Mr. Smith Goes to Prison: What my year behind bars taught me about America’s prison crisis, New York: St. Martin’s Press.
Spector, Malcolm and John I. Kitsuse. 1977. Constructing Social Problems, Menlo Park, California: Cummings.
Stråth, Bo. 2008. ‘Constructionist themes in the historiography of the nation’, in Holstein, James A. and Jaber F. Gubrium eds. Handbook of Constructionist Research, 627–642, New York: Guilford Press.
Suzuki, Tadashi. 2001. ‘Frame diffusion from the U.S. to Japan: Japanese argument against pornocomics, 1989–1992’, in Best, Joel ed. How Claims Spread: Cross-national diffusion of social problems, 129–145, Hawthorne, New York: Aldine de Gruyter.
von Hofer, Hanns. 2000. ‘Crime statistics as constructs: The case of Swedish rape statistics’, European Journal on Criminal Policy and Research, 8(1): 77–89.
Wendt, Alexander. 1999. Social Theory of International Politics, Cambridge: Cambridge University Press.
Xu, Jianhua. 2015. ‘Claims-makers versus non-issue-makers: Media and the social construction of motorcycle ban problems in China’, Qualitative Sociology Review, 11(2): 122–141.
Yamamoto, Isao. 2001. ‘Legislation on sexual misconduct in Tokyo and the rhetoric of victimization’, Journal of Criminological Sociology (Hanzai shakaigaku kenkyo), 25: 49–66.
Yamazaki, Atsushi. 1994. ‘The medicalization and demedicalization of school refusal: Constructing an educational problem in Japan’, in Best, Joel ed. Troubling Children: Studies of children and social problems, 201–217, Routledge.
Yuan, Zhengqing, Zhiyong Li and Xiaofei Zhufu. 2017. ‘China and the remolding of international human rights norms’, Social Sciences in China, 38(3): 25–46.
3 Emile Durkheim’s Sociological Insight Into Political Phenomena
Gianfranco Poggi
As the authoritative biography by Steven Lukes (1985) indicates, Emile Durkheim – one of the founding fathers of modern sociology – witnessed and took some part in the complex and often troubling political experiences of the France of his times (1878–1917). He was a committed patriot, proud of his country’s great role as the site and prime promoter of political modernization, beginning of course with the French Revolution itself. He was fully aware of the extent to which that event and many that followed had intensely divided the population and inspired violent political and cultural contrasts. Durkheim had what can be called a passion for unity, which inspired many aspects of his intellectual and scholarly legacy. However, many controversial contemporary political events saw him taking sides. He generally aligned himself – privately but also in his dialogue and correspondence with colleagues and students, and in occasional publications commenting on those events – with
what one may call the progressive side of the on-going debate. Some particularly virulent conflicts of his time had to do with one particular import of the French Revolution. The monarchical ancien régime had openly favored the Catholic Church, and endowed it with multiple privileges, regarding especially its large wealth and its role in the education of the new generations. The subsequent political order had disestablished the Church; in the eyes of its opponents, it had sinfully turned down the role France had long enjoyed as la fille aînée de l’Église (‘the eldest daughter of the Church’). This produced a deep-rooted and persistent social and cultural cleavage, which found expression in a phenomenon of great personal significance for Emile Durkheim – anti-semitism. Born to a Jewish family, he had been expected, in due course, to succeed to the position of rabbi his forebears had long held, in the town of Epinal in Lorraine, within its Jewish minority. However, as an adolescent
Emile ceased to be trained for that role, and successfully pursued, instead, a public, secular education. In spite of this, and no matter how faint the traces left on Durkheim’s professional production by his early familiarity with the great tradition of Jewish learning, his academic adversaries and their followers persistently and invidiously reminded him – and his associates – that he was indeed a Jew (Moore, ‘David Emile Durkheim and the Jewish Response to Modernity’). This rendered him, for his adversaries, a legitimate target of distrust and indeed suspicion, regarding not only his commitment to the power and welfare of the French nation, but also the intrinsic significance and validity of his own academic achievements, teachings and publications. In the latter context, his opponents always criticized the success Durkheim enjoyed as an academic politician, in recruiting and training pupils and followers, then promoting their advancement within the highly prestigious and competitive French university system. All this Durkheim did in the untiring pursuit of what he considered his life-task – the promotion of sociology as a significant, autonomous academic discipline. He elaborated a specifically sociological approach to a variety of social and cultural phenomena; but – in my judgment – he did not explore political matters in as original and thorough a manner as he did other major social phenomena. The Abstract of an important essay by Hans-Peter Mueller phrases this point as follows: ‘Emile Durkheim was neither a political scientist nor a political sociologist. His oeuvre though exhibits a political dimension which is not easy to grasp’ (Mueller, 2009: 227). Why so? First of all, his life-work echoes an attitude vis-à-vis politics-as-such which characterizes the diverse intellectual undertakings involved, beginning in the era of the Enlightenment, in the rise and establishment of the modern social sciences.
That attitude reversed the privileged position political phenomena had long held in European intellectual discourse on social affairs. Basically, larger social entities had
traditionally been seen as chiefly constituted and managed as the object of rule from above, with underlying populations made to serve the particular interests of variously constituted narrow minorities – typically, princely dynasties – if necessary by threatening or exercising violence on the subjects. The nascent social sciences sought to overcome the intrinsic limitations which such views about the phenomenon of rule placed on the appropriate understanding and critique of broader social events and arrangements. New disciplines undertook to theorize instead major new aspects of European social development, in ways sometimes inspired by recent, massive advances in the knowledge of natural phenomena. In particular, what came to be called ‘political economy’ argued that over recent generations the whole society had benefitted from the extent to which more and more individuals had disregarded or circumvented political constraints on the peaceful pursuit of their private interests. Increasingly, they invested their energies and resources in self-seeking, open-ended activities, and freely exchanged with one another the respective products on the market, whose workings, if left to themselves, would automatically recognize and reward the values embodied in those products. The everyday pursuits of such individuals, if considered and respected as their own private matter, no longer narrowly constrained by traditional normative expectations or by the arbitrium of rulers, would continue to develop material and cognitive resources, and generate products addressing ever new needs. The main political requirement for this to take place could be expressed in a simple, negative recommendation to the political and administrative personnel in charge of the political management of economic affairs: laissez-faire, do not interfere. It was up to individuals to settle with one another, on the market, the terms of their own transactions. 
A growing body of contemporary experiences suggested that such abstinence of authorities would indirectly foster the general welfare.
Such a view was voiced in France, among others, by Henri de Saint-Simon (1760–1825), who saluted the advent of the ‘industrial system’. He wrote that if a socially selective epidemic were to suddenly strike dead 3,000 men in possession of political prerogatives, society at large would impassively continue to function; whereas it would be paralyzed by the death of 3,000 engineers, scientists, entrepreneurs. One cannot attribute Durkheim’s relatively low scholarly attention to the political phenomena of his time to a similarly dismissive, indeed contemptuous view of their material import. He was fully aware of the changes in government and public administration instituted in France after the end of the ancien régime, and proud of the role his own country had played in promoting such changes in other ones. Rather, that lower attention expressed his unwavering intent – inspired by the great on-going accomplishments of the natural sciences – to establish an objective, empirically grounded, intrinsically valid understanding and critique of social phenomena, much broader than that represented by the current scope of the burgeoning discipline of political economy. In this perspective, even the impressive body of scholarly thinking devoted to politics since the heyday of classical Greece, which Durkheim knew well and whose intellectual and literary merits he appreciated, appeared to him intrinsically limited in its scientific significance by what one may call its speculative intent. That is: It strove to discover not the nature and origin of social phenomena, not what they actually are, but what they ought to be: its aim was not to offer as valid a description of nature as possible, but to present us with the idea of a perfect society, a model to be imitated. Even Aristotle, who was far more concerned with empirical observation than Plato, aimed at discovering, not the laws of social existence, but the best form of society. 
Something similar might be said of the more recent tradition of juridical discourse
about political institutions. Durkheim had a keen interest in the contemporary development of French scholarship on constitutional law – indeed, he markedly influenced the significant contribution to that field by a key figure, Léon Duguit. But the content itself of that scholarship did not constitute a proper object for Durkheim’s own overriding pursuit, because it was concerned primarily with how public affairs ought to be handled, not how they actually were. A significant essay by Hans-Peter Mueller (2009) characterizes Durkheim’s position as follows:
1. (He) rejected current politics with their conflicts and its intrigues, and the prevalence in political discourse of ‘ideology’, thus as the opposite of science.
2. Sociology as a rational, empirically grounded [discipline] must inquire into social reality ‘as it is’, not ‘as it ought to be’.
3. Politics understood as party politics pursues particular interests, whereas he wanted to commit his sociology to the service of the common good for the society as a whole. His attention was focused on the ‘polity’, not on ‘politics’. In this capacity, sociology could envisage, design and promote ‘a new morality’ focused on a distinctively modern value: the autonomy of the human person. (Mueller, 2009: 245)
Durkheim was aware that accomplishing this entailed a major challenge, which his first masterpiece, The Division of Labor in Society, formulated as follows: ‘How does it come about that the individual, while becoming more autonomous, depends ever more closely upon society? How can he become at the same time more of an individual and yet more linked to society?’ (Durkheim, 2014: 7). Durkheim found the main methodological model for sociology, as he understood it, in the way the natural sciences had lately been advancing spectacularly and promoting intellectual modernization. He expounded that model frequently and emphatically, especially in his second masterpiece, The Rules of Sociological Method (Durkheim, 1982). Such rules presupposed the existence of a distinctive realm of reality, which Durkheim often designated as ‘society’ itself. It was constituted by the totality of
‘social facts’ – phenomena present in the various environments in which human beings co-exist that were not established by nature (although they were often perceived as grounded on nature itself), but on the one hand were produced by past human conduct, on the other conditioned present and future conduct. ‘Social facts’ both represented those phenomena and prescribed the appropriate forms of individual and group conduct, seeking to enforce them when such conduct did not spontaneously comply with them. Their content varied hugely according to time and place; thus sociology was engaged, on the one hand, in ascertaining the specific social arrangements in operation at a given time and place, on the other in comparing such arrangements across time and space. To be valid, both pursuits were to advance objectively. As Nisbet puts it: ‘Dispassionate study, objective research, were, above all others, Durkheim’s ideals’ (Nisbet, 1975: 17). As Durkheim himself stated: Our main objective is to extend the scope of scientific rationalism to cover human behavior, demonstrating that, in the light of the past, it is capable of being reduced to relationships of cause and effect, which, by an operation no less rational, can then be transformed into rules of action for the future. What has been termed our positivism is merely a consequence of this rationalism (Durkheim, 1982: 4).
In fact, the impact on Durkheim’s sociological project of the example represented by the natural sciences was not merely methodological. His first major work, The Division of Labor in Society (Durkheim, 2014), transferred to the social realm a huge process which Darwin’s On the Origin of Species had creatively detected in the realm of nature. Darwin had argued that in the course of natural evolution new species progressively differentiated that realm, in two ways: new species displaced previously existent ones by competing successfully with them for existent resources; their own internal
structure was generally more complex than that of previous species, comprising organs specialized in performing particular vital functions. Herbert Spencer (1820–1903) had already transposed Darwin’s core argument to the course of human history, emphasizing the salutary virtues of competition as the main promoter of social advance, triumphant at last in the advance of modernity. However, Durkheim expressly denied validity to Spencer’s utilitarian restatement of Darwin’s argument, contesting the fundamental role Spencer assigned to the human individuals’ egoistic pursuit of their own private interest as the key mechanism of the whole process. In Spencer’s account, individuals connected with one another on their own initiative by means of contracts of their own making rather than by imposing or obeying authoritative duties. Durkheim had two main objections to this vision. First, individuals as Spencer construed them were themselves produced by that process in its advanced phases. Previous to that, human beings had not perceived themselves as self-standing, self-activating, self-seeking entities. Second, not everything in contracts is contractual. The making by individuals of particular mutual arrangements, with the expectation of their being duly executed, presupposes the institution of contract – a pre-existent set of general, binding, enforceable arrangements establishing who could enter contracts with whom, about what, in what forms, to what effects. In the name of such considerations Durkheim contested a recurrent motif in Spencer’s thought, which had attained large resonance within public opinion in England and threatened to do so also in France. Spencer considered state institutions and policies as superannuated arrangements, damaging legacies of the past; for Durkheim, however, the political activities that excited Spencer’s indignation and hostility continued to play an indispensable role in fostering economic activities by conferring and upholding the rights of individuals. The state
could indeed turn despotic and oppressive, but ‘the social force it constituted could be neutralized by other social forces counterbalancing it’, especially by collective entities which encompassed significant pluralities of otherwise powerless individuals (Durkheim, 2018: 74). In comparison with the temporal scope of Darwin’s theory – which looks back on nothing less than the advent of life itself and its course over eons of time – the temporal scope of both Spencer’s and Durkheim’s theories is relatively narrow, addressing exclusively events unfolding since the advent of human beings, and covered by pre-historical and historical scholarship on the basis of a variety of archeological and literary sources. Spencer and Durkheim also share a prevalent interest in the nature and significance of what they consider the major break within the unfolding of human experience. At a high level of abstraction, both of them contrast a before and an after within the apparent continuity of that experience, but differ markedly on how they conceptualize that break. Spencer focuses on what we might call the advent of modernity, a complex of relatively recent historical events which he conceptualizes primarily as the transition from militant to industrial society. Durkheim views this as an episode unfolding relatively late within a story that had begun long before, with the disappearance at various moments and in various locales of societies we might summarily characterize as primitive. According to Spencer, militant society, structured around relationships of hierarchy and obedience, was simple and undifferentiated; industrial society, based on voluntary, contractually assumed social obligations, was complex and differentiated.
According to Durkheim, however, the relationships of hierarchy and obedience structuring Spencer’s militant society were themselves the product of a more fundamental evolutionary process, affecting utterly simple societies, with small populations internally differentiated only by gender and generation.
There may be a gradient of superiority/inferiority to this differentiation, but on the whole relations between individual members of the population are tightly prescribed by universally recognized beliefs and norms. Typically their livelihood is chiefly secured (if at all) by shared access to the products of hunting and gathering, practiced over the territory each population considers its own, on the basis of relatively elementary, traditionally handed-down technology and knowledge. Durkheim characterizes as ‘segmental’ the relation between two or more such populations whose territories lie next to one another. They may know of one another and be similar, but between them there are few exchanges of communications or of products. Each population of this kind reflects one answer to the dominant question in Durkheim’s thinking, which can be phrased as follows: ‘what makes a society hang together?’. He labels this one answer ‘mechanical solidarity’ – the extent to which the individuals composing the whole population, due to the minimal extent to which it is internally differentiated, unproblematically subscribe to and abide by the same beliefs and values. The other answer Durkheim labels ‘organic solidarity’. It applies to societies generated by processes from which primitive societies were typically excluded. Over time some of them, however, had come to terms with one another’s existence; their territories had come to overlap or had become the locus of intensified mutual communication, becoming to that extent larger and no longer related ‘segmentally’. Their respective populations, now aware of different resources and techniques, have been induced to differentiate themselves internally; their parts have engaged in diverse productive pursuits, and undertaken to exchange with one another the respective products. Over time, this same process has extended to the relations between populations previously juxtaposed with one another.
These innovations have often been refracted within
previously separate sections of a society’s population, diversifying their particular needs, practices, expectations. The most visible – and consequence-laden – aspect of such developments has been the distinction between town and country within previously homogenous territories. In turn, the activities carried out within towns may differ from town to town. They are also diverse within each town, and divide the local population into sub-units associated with specialized productive and commercial practices. No sub-unit aims at self-sufficiency, rather it seeks to exchange the products of its own activities with those of others. The territory’s physical structure itself is now marked by an increasingly extensive and complex network of roads or canals which support the traffics in question. The network itself is internally differentiated, with a particular unit playing a central role, mediating the exchanges between different regional clusters. In sum, we are now confronted by societies constituted, not by the replication of similar homogeneous segments, but by a complex system of different organs, each of which has a special role, and which themselves are formed from different parts. The elements in society are not of the same nature nor are they arranged in the same manner. They are neither placed together end-on, as are the rings of an annelida worm, nor embedded in one another, but co-ordinated and subordinated to one another around the same central organ which exerts on the rest of the organism moderating effects (Durkheim, 2014: 143). Increasingly, the state monitors and controls the whole process, and on this account Durkheim denies validity to the disparagement and hostility the state found in Spencer’s views. In particular, it establishes and enforces the institution of contract, allowing a multitude of private arrangements exchanging the products of multiple parties to take place in an orderly fashion. 
The typical society in question is much larger than those we labeled primitive. It is highly diversified in terms not only of the resources it employs and produces
but also of the needs prevailing within its population and – as we might put it today – in terms of the identities of the population’s components, comprising highly distinctive and mutable sets of beliefs and values. But one may detect some tension between, on the one hand, Durkheim’s emphasis on the factuality of the social phenomenon and on its thing-like nature and, on the other, the fact that at bottom, in both primitive and advanced societies, the response to the question ‘what makes a society hang together?’ lies – to use a favorite expression of his – in ‘représentations collectives’, ‘manières d’agir et de penser’ [‘collective representations’, ‘ways of acting and thinking’], thus essentially in ‘things that people carry around in their heads’. Durkheim appealed to this insight also to account for contrasting interests, irreducible animosities, hostilities, divisions, especially in advanced societies – for instance the France of his times – where sometimes such potentially disruptive phenomena found intellectual expression and justification in elaborate ideologies. A great work of Durkheim, Le suicide: étude de sociologie (Durkheim, 1997), explored a further, somewhat paradoxical import of this argument. Those ‘things that people carry around in their heads’ reveal their significance not only in conduct complying with them, but also in conduct contradicting or deviating from them. All societies present, in different formulations, a strong, generally entertained expectation that their members will not commit suicide even when they undergo at length disappointing, frustrating, painful experiences. This is suggested by two different practices discussed in Suicide. On the one hand, many pre-modern societies handle, and dispose of, the bodily remains of members who have committed suicide in ways which symbolically indicate their strong disdain, spite, abhorrence, condemnation toward those members.
On the other hand, many modern communities only reluctantly attribute officially to suicide – over against, say,
medical conditions or accidents – the demise of particular members, in order to protect the social standing of their survivors. Durkheim, however, gives a somewhat paradoxical twist to his own treatment of the suicide phenomenon. Against the background of the expectation mentioned above, he sees the phenomenon itself as strongly influenced – and in a manner, caused – by the bodies of belief-and-thought prevailing in different societies. He connects with this insight his own, original typology of suicide, based on the cultural makings not so much of individual suicidal acts as of the different suicide rates characterizing particular social locales and historical circumstances. Consider first the type Durkheim labels altruistic suicide. It presents itself in societies (mostly primitive ones) where the suicide rate expresses the prevalence of norms and beliefs which extol the significance of the community’s own interests, and belittle that of the single individual’s private interests. This may go so far as to produce in some individuals a willingness, in certain circumstances, to place their very existence at the disposal of the community; to sacrifice themselves in order to benefit somehow their community. There is much historical and literary evidence for this phenomenon in pre-modern societies, which however do not produce the kind of evidence Durkheim privileges in Suicide: evidence largely derived from police sources, thus carefully collected, quantitatively coded and processed via mathematical analysis (Selvin, 1958). Thus, his discussion of altruistic suicide largely concerns its well-documented occurrence in one highly particular section of contemporary societies – the military profession – where self-sacrifice is to an extent positively appraised and symbolically rewarded. In other societies, however, and signally in contemporary ones, the much increased incidence of the suicide phenomenon largely concerns the type Durkheim calls egoistic.
Here, the individual’s private interests (not
necessarily economic ones; they may derive as well, say, from their intense commitment to a love relation) have priority, and when they are threatened by a change in the circumstances (say, the partner’s own commitment to the relation is definitively withdrawn) the individual may feel irreparably hurt and damaged, deprived of the main source of meaning in their existence. That may induce them to commit suicide. In altruistic suicide the individual obeys the dutiful expectations issued imperatively by their collective belonging; in egoistic suicide, the individual, so to speak, turns their back on the expectations of associates, even the closest ones. Durkheim connects a third type of suicide, which he calls anomic, to a further, massive aspect of contemporary culture. The term anomic is the adjectival form of ‘anomie’, a noun meaning normlessness – another phenomenon characteristic of contemporary societies and causally associated (among other things) with the increased incidence of suicide in them. In principle, even under contemporary conditions, individuals generally derive their views about reality, and the binding expectations orienting their conduct, from authoritative complexes of norms and values they share with their fellow beings. However, two overlapping, imperious developments – on the one hand rapid changes in the economic conditions of society and in its cognitive and technical resources, on the other the expectation that individuals will act more and more exclusively on behalf of their own particular interests of whatever nature – render such complexes of norms and values less stable, less generally known and subscribed to, thus less likely to provide individuals with moral guidance. This may happen even when the norms in question have been produced, diffused and sanctioned by public authorities. To contain such phenomena and reduce their problematical effects is a responsibility that falls largely on the state, as we shall see later on.
Meanwhile, in any case, a change in the personal circumstances of individuals –
paradoxically, even a positive change in their economic position – may make it more difficult for individuals to attach an intensely felt meaning to themselves, their possessions, associates, prospects. Such meaninglessness in one’s existence may be very hard to bear, and in certain circumstances may induce individuals to do away with themselves on account of a feeling that they no longer have something to live for. One can account in this way, Durkheim suggests, for the fact that – as he claims to find from contemporary data – in contemporary societies suicides may be rendered more frequent not only by suddenly deteriorating economic conditions but also by suddenly improving ones, which may induce in individuals that very feeling. However, in contemporary societies the appearance of such a distinctive, unprecedented ‘suicidal current’ – to use an expression of Durkheim’s – is only one, relatively minor, symptom of a general condition of anomie, which in turn is a crucial component of a broad, disturbing situation of crisis, the acute awareness of which plays a major role in Durkheim’s thinking and to an extent motivates it. Indeed, Steven Lukes characterizes Durkheim as being ‘haunted by the idea of man and society in disintegration’ (Lukes, 1985: 218), connecting him to the theme of social dissolution widely present in 19th-century French thought. What motivates such concerns, for Durkheim in particular? Many developments in contemporary industrial society, beginning with France, fail to fulfil or even just to approximate a condition of ‘organic solidarity’. They are beset by tensions, riven by contrasts, by mutually incompatible attempts to impose the priority of the collective interests of one part over those of other parts.
Public discourse is largely carried out by the proponents of incompatible ideologies, sometimes masquerading as scientifically grounded understandings of the way in which society should be ordered, or trying to revive over-idealized past conditions or prospecting unattainable future ones.
True, the former tradition-based attachment of individuals to their own ‘segmental’ society has been replaced by new attachments, in particular those related to their occupational roles; but the content of these is constantly altered by new technological developments, and above all they generate antagonisms vis-à-vis other occupations, the most significant (and disruptive) of which constitute the class conflict emphasized by various forms of socialist doctrine. All this renders problematical the assumption that in industrial societies the process of social and cultural differentiation will be complemented and countered by a process of integration addressing in new ways the question ‘what makes a society hang together?’. This can only take place if such societies operate a political entity authorized and empowered to monitor, orient and control both processes. Again, on this entity – ‘the state’ – falls the responsibility for ensuring that the on-going division of labor is accompanied by appropriate developments of a moral nature – chiefly those promoting moral individualism as the dominant societal value (Mueller, 2009). For this to take place the state must contrast and override the ‘spontaneous development’ of two ‘abnormal’ forms of the division of labor – the enforced and the anomic form (Mueller, 2014: 82). The enforced division of labor takes place when large numbers of individuals are compelled to accept disadvantageous occupational positions largely because they lack the economic resources that would allow them to obtain advantageous ones. Durkheim is particularly sensitive to the extent to which the actual workings of the labor markets systematically privilege individuals mobilizing resources handed down to them by their families, rather than drawing primarily on their own natural capacities and their educational experience. 
This phenomenon appears illegitimate to Durkheim, who occasionally suggests it should be eliminated or moderated by changes in the state’s law of succession. He also shares some arguments (including
some found in Marx) to the effect that at least the possibility of the former exploiting the latter is inherent in the relationship between employers and employees (Giddens, 1986: 30–1). But there is another critical task for the state: it must not allow changes in the economic sphere and in the occupational structure to take place within an anomic environment, and must orient and control those changes by means of appropriate arrangements in the political sphere – particularly the formation and when necessary the reformation of public, enforceable, legal and administrative rules. Durkheim’s more sustained treatments of such matters are present not so much in his four masterpieces (Division of Labor, Rules of Sociological Method, Suicide, Elementary Forms of Religious Life) or in other writings he himself published, but in a set of manuscripts he drafted toward the end of the 19th century, and subsequently revised on the several occasions on which he taught introductory courses in the discipline of sociology. Their content is known to us chiefly from the notes he wrote to lecture from, edited as late as 1950 by a Turkish scholar and published in Istanbul with the title Leçons de sociologie. They appeared in Paris that same year, and an English version was subsequently published under the title Professional Ethics and Civic Morals (Durkheim, 2018). I derive chiefly from this work what I consider the three most significant aspects of Durkheim’s views concerning the political management of the tendencies to disorder that are inherent in contemporary society. 
The following statement from an essay by Hans-Peter Mueller anticipates the argument that follows, which is intended to complement what has been said before on the political aspects of Durkheim’s thought:
If it is possible through institutional reforms to achieve smooth coordination between professional groups, the democratic state, and the individualistic ideal, then the division of labor will create organic solidarity and ensure social integration. In Leçons de sociologie, (Durkheim) therefore outlines the nomos of a functionally differentiated society and sketches a normative picture of a dynamic and just social order. (Mueller, 1993: 98)
His views to this effect can be subsumed under three headings: 1) The state as the societal brain. In the context of the prolonged process which has rendered more and more complex the structure of modern society, a properly constructed and managed state must monitor attentively that process itself and as far as possible secure society at large from some of its potentially disruptive effects. On this account, Durkheim proposes a view of the state which we might label cybernetic. That is: the state has primarily the task of constantly obtaining and communicating knowledge, as objective and up to date as possible, about what goes on in the society at large, whose parts continuously affect one another. On this account the state should also decide what, if anything, can and should be done about such processes by the governmental and administrative components of the political system the state itself lies on top of. It should design appropriate authoritative arrangements but entrust their operation to those components, again monitoring their activities and modifying them if necessary. Thus, Durkheim developed a view of the state as constituting ‘the brain, the cerebrospinal system’ of society. On this account, it is primarily the site of intellectual activities, operating autonomously on behalf of society. ‘When the State takes thought and makes a decision, we must not say that it is society that thinks and decides through the State, but that the State thinks and decides for it’ (Durkheim, 2018: 49).
Essentially, the state is a group of officials sui generis, within which representations and volitions involving the collectivity are worked out, although they are not the product of collectivity. It is not accurate to say that the state embodies the collective consciousness … Rather, it is the centre only of a particular kind of consciousness, of one that is limited but higher, clearer, and with a more vivid sense of itself.… We can therefore say that the State is a special organ, whose responsibility is to work out certain representations which hold good for the collectivity.… distinguished by their higher degree of consciousness and reflection. (Durkheim, 2018: 50)
2) Democratic processes of society-to-state information and guidance. However, the state does not only transmit to the rest of the political system judgments and decisions elaborated by it on the basis of the information it possesses. It can do this promptly, competently, and efficiently, only if it can avail itself in turn of information generated, and reliably conveyed to itself, from a diffuse set of listening posts, of sites self-consciously reporting the needs, the resources, the successes or failures of an increasingly complex (and increasingly dynamic) society. Within the early history of the modern state, the standard arrangement to this effect was for outlying, largely autonomous components of the political system not only to receive and act upon decisions elaborated on and emitted by the state, but also to provide it with the kind of information just mentioned. In case of war, say, the outposts of the state deployed military resources of their own on terms agreed with the state; during peacetime, they managed some local fiscal arrangements whereby the state extracted and allocated its own economic resources. Further inputs from below regarded the views and preferences of local notables about state initiatives concerning the territory. Unavoidably, such information was often, so to speak, biased to favor the interests of the sources which gathered and transmitted it – in particular, the interests of the local members of privileged strata. To counter such an effect, in the course of its development the state relied more and more on personnel which it selected, trained, appointed to special offices, and sent out to the various localities. In many cases it would not leave the members of that personnel in particular localities long enough for them to become excessively sensitive to the particular needs of those localities and/or to those of the privileged strata.
The same effect was pursued (especially in Austria and Prussia) by selecting and training public officials with reference to sophisticated bodies of academic knowledge concerning the proper handling of administrative, economic and financial affairs. The activities of those officials were unavoidably oriented also by their career interests, but these were considered by the state itself as more conducive to its own interests. The communication upwards, to the state itself, of potentially significant information, massively increased its own significance, from the 18th century on, with the advance of two phenomena. One was the increasing reach and effectiveness of various kinds of public education agencies, which markedly increased the literacy of the population, and thereby fostered the formation of bodies of opinion which, via the press, also variously dialogued or contended with one another regarding political affairs (Habermas, 1991). The other was the democratization of political systems, thanks to which different judgments and demands concerning state activities came to be legitimately transmitted to the state by its arranging periodic, regular electoral consultations at various levels intended to select the personnel operating key public agencies and to have them issue authoritative directives to public officials (Birnbaum, 1976). In Durkheim’s cybernetic view of democracy, this amounts to complementing the top-to-bottom transmission of information and decisions concerning policy with a bottom-to-top transmission.
In a democracy, communications between the state and the other parts of society are many, and both regular and organized. The citizens are kept in touch with what the State is doing and the State is at given periods, or may be continuously, told of what is going on in the deep layers of society. It is informed either through administrative channels or by the voice of the electorate. (Durkheim, 2018: 92)
Only by availing itself of such a two-way process of communication can the state design and manage a project ultimately
intended to curb the anomic tendencies inherent in an advanced process of division of labor left entirely to itself. 3) Durkheim’s corporatist project. There is something paradoxical about a further component of Durkheim’s thinking about policy and the state. It expressed his intention to reform existent arrangements, enabling them better to deal with contemporary and future conditions; but it was largely inspired by his views about earlier arrangements, established in various parts of Europe in the course of early modernity, with distant precedents in classical antiquity. The key aspect of those arrangements – mostly referred to by such expressions as corporative order, corporativism – was the existence of groupings constituted by the practitioners of diverse ‘arts and trades’ (to begin with, within single towns) in order to further the economic interests (in principle, private interests) those practitioners shared with one another. Typically, such groupings – say, various guilds or cooperatives – would impose on their own members some obligations, for instance, about whether and to what extent they could add new members to their number, how and at what length these should be trained, what techniques and materials members were allowed to use in production, how much they should charge non-members for their products, etc. The main intent of such arrangements was to limit the number of practitioners of a given trade, to prevent them from competing with one another by producing novel objects or services and/or by adopting new tools and practices, or by accepting low bids for their products. If complied with, those arrangements would essentially place each grouping, in its dealings with others or with prospective customers, in the advantageous position of a monopolist. 
This allowed each grouping to restrict the freedom of its members in order to promote its own collective interest, by exercising powers of public nature, normally vested in offices legitimately managing a larger, more comprehensive collectivity. If this happened across a large number of
diverse groupings in a given locale – a town, a market, a territory – their ensemble constituted an essential component of the overall authoritative ordering of that locale. Often, the leading members of one or more groupings had exclusive access to significant official positions in the overall political structure of the whole locale, often to the disadvantage of those excluded from such groupings. Durkheim objected to some effects of those arrangements, particularly the opportunity they had given in the past to some favored groupings to prevent, retard, or bias the content of the commercialization and industrialization of some European economies. However, he also thought that an advanced economy would benefit from a public recognition of the persistent significance for great numbers of individuals of their occupational identities. Optimally, this would involve the operation, within each major sector of the national economy, of a publicly instituted agency, which Durkheim often called a corporation. Here, one organization representing the sector’s employers and one representing its work force, while continuing to pursue the interests of their respective bases, would cooperate with the appropriate governmental agency in designing the policies required by the sector as such. Thus structured, the corporation would constitute for the state as a whole an intermediate layer of authority, which would prevent the state’s dangerous tendency to centralize all public activation and monitoring of activities of an economic (and, increasingly, industrial) nature. In a passage toward the end of Suicide, Durkheim argued his own preference, instead, for ‘the only decentralization which would make possible the multiplication of the centers of communal life without weakening national unity, what might be called occupational decentralization’. 
But this ‘can fulfil its destined role only if [the corporation], instead of being only a creature of convention, becomes a definite institution, a collective personality, with its own customs and traditions, its rights and duties, its
unity’. All these aspects of its existence and its autonomy must be expressly established by the state by means of legislation, produced and validated by constitutional legislation (Durkheim, 1997: 360). Once so established, each corporation would perform two tasks. On the one hand, it would make competent inputs into the making of appropriate public policies; of course, this required frequent and orderly consultation between the corporation and the relevant state organs, and as far as possible the joint formation of policy and monitoring of its execution. On the other hand, each corporation would instruct and discipline its members (both those organized by unions and those belonging to employers’ associations) in order to assist the responsible execution by the state of those policies. Durkheim’s corporatist proposal was controversial. In its revolutionary phase, with the Loi Le Chapelier, the French state had expressly abolished all intermediate bodies – most of which were constituted by various ‘arts and crafts’ – and had re-engineered its own structure chiefly with reference instead to the geographical components of its territory, with each department being administered chiefly by a prefect appointed from the center. The electoral system in Durkheim’s time also emphasized local constituencies, in spite of the fact – he remarked – that the population had long become more and more mobile, the scope of that mobility often encompassed the whole national territory, and individuals identified with their occupation much more than with their locality. This suggested the constitutional changes he proposed. At bottom, their intent was to generate and manage a new form of public morality – a key concern of his, variously echoed in many of his writings. 
Relations between individuals and between groups should express not only the parties’ pursuit of their own interests, but also a respect for those of the other parties, and for the constraints imposed on each party by the shared recognition of the legitimacy of those
constraints, of their morally binding nature. Thus, relations should be oriented not only by mere considerations of convenience, or by the awareness that the violation of those constraints would activate sanctions – but also by a recognition of the dutifulness of one’s own compliant conduct, by the activation of what one might call a sense for the ought-ness of some expectations. In principle, this should be the case – to a different extent and in different forms – in all social relations, from those between individuals motivated by personal feelings, to those formed on the market, resulting from shared occupational memberships, or, for that matter, from shared loyalty to one’s country. Durkheim’s last masterpiece, The Elementary Forms of Religious Life (1912), argued that there was a religious dimension to all these manifestations of social life. Durkheim defined sociology as ‘the science of institutions’, that is of ‘ways of acting, thinking and feeling … existing outside the consciousness of the individual.… but … endowed with a compelling and coercive power by virtue of which, whether he wishes or not, they impose themselves on him’ (Durkheim, 1982: 51). His work presented and analyzed several institutions, for instance education, the family, contract, science. But religion is the only institution to which he dedicated a whole book, which soon attracted considerable attention (there was an English version as early as 1915) and is still widely recognized as a major contribution to ‘classical’ social theory (Durkheim, 1995). He had expressed previously his professional interest in religious phenomena, but Elementary Forms of Religious Life approached them in a distinctive – and controversial – manner. It focused its discourse on a peculiar set of those phenomena, those embodied in Aboriginal Australian totemism. 
Here all significant ‘ways of acting, thinking and feeling’ of any given clan have their source, and derive their unique significance for the clan itself, from the reference to a sacred entity. Typically, in the
clan’s natural environment there is present one animal species, which the whole clan identifies with and worships. That is, it renews its collective awareness of one set of beliefs and its commitment to one set of religious practices, via its members’ universal or near-universal participation in periodic ceremonies. These celebrate the totem’s virtues and evoke their abiding significance for collective life at large, not just its religious aspects. The clan’s totem – whatever its nature – evokes in both the clan as a whole and its individual members, on the one hand a sense of submission and dependence, on the other an exalting, empowering sense of self-identification with it. Over the last few decades of the 19th century, and in the first one of the 20th, Western explorers, missionaries, traders, and ethnographic scholars had built up a considerable body of knowledge about totemism, and speculated on its relation with other, more familiar, such phenomena. In the scholarly debate about which of these might represent the most primitive manifestation of religion, Durkheim’s Elementary Forms expressly argues that Aboriginal totemism is uniquely entitled to this position. In the evolutionary framework of analysis put forward in Division, this turned all other religions, starting with totemism itself, into successive products of one all-encompassing process of differentiation. Thus, a close-up analysis of the Aboriginal variety of religion would contain general propositions valid for all religions, concerning among other things their significance for other cultural and social phenomena. As Durkheim puts it in the very first pages of Elementary Forms:
Whenever we set out to explain something human at a specific point of time – be it a religious belief, a moral rule, a legal principle, an aesthetic technique, or an economic system – we must begin by going back to its simplest and most primitive form. We must seek to account for the features that define it at that period of its existence and then show how it has gradually developed, gained in complexity, and become what it is at the moment under consideration. (Durkheim, 1995: 3)
Thus, the whole book developed insights it considered valid for religion-as-such, no matter how numerous and diverse its manifestations had been across space and time. At bottom, the religious phenomenon ‘always assumes a bipartite division of the universe, knowing and knowable, into two genera that include all that exists but radically exclude one another’, which Durkheim labels as respectively the profane and the sacred realm. ‘Sacred things are things protected and isolated by prohibitions; profane things are those to which the prohibitions are applied and that must be kept at a distance from what is sacred’ (Durkheim, 1995: 38). On this account the human universe is structured by one critical, sharply asymmetrical relationship. Totemic practices exhibit the typical attitude of individuals and groups regarding the sacred realm – an attitude of respect, a reverence reflecting both the awareness of the realm’s unique, sovereign power and significance, and the individual’s or the group’s aspiration to be affiliated with it and protected by it. On the other hand, the attitudes appropriate to profane things range from indifference to casualness, to an attention focused on such things’ de facto usefulness or lack of it; they encourage or allow individuals and groups, as they seek their own mundane advantage, to deal with such things matter-of-factly. Religion’s job, then, consists in positing and reaffirming that fundamental asymmetry. Against what? Against the threat represented by an equally basic feature of the human being, which Durkheim characterizes as its being homo duplex, two-fold man. What does this mean? I quote at length, below, from my chapter on Durkheim in a book I co-authored with Giuseppe Sciortino:
As any other animal species, humans exist only as individuals; but two components operate in the mental life of each one of them. 
The first is directly grounded in the individual’s bodily, sensorial apparatus, and primarily concerns desires and activities related to the natural needs associated with its path from birth to death. The second component,
however, is constituted by expectations, beliefs, aspirations, understandings, and values that primarily derive from, and in turn orient, each individual’s relationship to others, and manifest themselves in activities expressing solidarity, indifference, or hostility toward some of them.… It owes its existence to the fact that, more than any other animal, each human being is necessarily engaged in relationships with fellow human beings; thus it is the source of the vast majority of the contents of the individual’s mind.… These contents are not simply juxtaposed with the bodily and sensory human apparatus; it is their task – while registering the passions, instincts, and behavioral modalities that derive spontaneously from that apparatus – to assert their own superiority over the first component, to orient the individual’s sentiments and activities toward needs and preferences beyond its own immediate survival and physical well-being. However, the superiority of the second component over the first, though justified and necessary, is intrinsically problematic. It does not result automatically from merely natural processes; nor does it spontaneously generate congruent behavioral tendencies. Instead, it manifests itself by shaping after its own image, so to speak, the tendencies arising from the first component, or where necessary by denying them expression and repressing them. At the same time, the first component is not docile … it contrasts and resists the disciplines the second seeks to impose on it. Therefore the relationship between the two components is irreducibly contingent; it is never certain which in any specific case will prevail over the other…. If human society is to establish and maintain itself, it is necessary for the second component to prevail over the first, even if not always and everywhere. 
Such necessity descends from the fact that humans are intrinsically social beings, and their sociality can be affirmed and maintained only if (at the least) the majority of individuals in most circumstances orient their behavior to expectations expressing their awareness of others and their needs, and comply with codes, criteria, and sentiments they share with one another…. Each individual, in considering and addressing the others, abides by salient, long-lasting, demanding constraints on conduct that reflect and affirm the superiority of the second component. Normally, indeed, the interests the individuals share must prevail over those private to each of them. (Poggi and Sciortino, 2011: 28–9)
Typical totemic rituals, according to Durkheim, acknowledge and affirm that superiority in a particularly impressive,
emotion-laden manner. They directly confront each individual member of the clan with its membership as a whole, generating in it a ‘collective effervescence’ which not only reproduces in each individual her/his awareness of the clan’s demands and assurances, but reaffirms their significance for the group as a whole. This effect, Durkheim suggests, finds a more or less distinct echo in all religious practices: they motivate in participants on the one hand a sense of dependency vis-à-vis the group, on the other a pride in membership within it. In 1889, in his Lectures on the Religion of the Semites, Robertson Smith had paid great attention to the rituals of sacrifice (Robertson Smith, 1889). Typically, such rituals involved killing a particular animal victim, cooking and feeding it to the participants, thereby symbolically affirming the group’s dependency on the benevolence of its gods. ‘One of the most important functions that fall squarely upon the shoulders of the deity is to see that men have the food they need to live.’ But Smith denied that an associated ritual – offering some of the victim’s remains to the gods themselves – credibly expressed the opposite symbolic meaning: that the gods relied on such human offerings for their own subsistence. ‘It seems contradictory for the gods to expect their food from men, when it is by them that man himself is fed.’ But Durkheim sees no contradiction here. As the ritual experience suggests, ‘social life … unfolds as if in a circle. On the one hand man receives from society all that is best in him…. Take away from him language, science, the arts, moral beliefs, and he descends to the level of animals…. On the other hand society does not exist and live except within and by means of the individuals…. it possesses reality only to the extent that it maintains its existence within human consciousness, and it is our task to guarantee such a position to it. Society can no more do without individuals than these can do without society … Allow the idea of society to become extinct within the minds of individuals, allow the beliefs, traditions, aspirations of the collectivity to cease to be perceived and shared by individuals, and society will perish.’ (Durkheim, 1995: 351)
Durkheim’s view of the effects on the minds and actions of participants in ritual celebrations of the fundamental symbol of the clan – the totem itself – leads to a provocative question. If the totem ‘is at the same time the symbol of the god and of the society, is it not because the god and the society are one and the same thing’? His own answer to this question was definitively positive. It provoked much debate, and still does; some of the interventions in the debate object that the answer unduly deifies society, others that it renders the religious phenomenon too mundane. We cannot review the debate here, much less try to settle it. We might just say that in Durkheim’s judgment no society can do without something like a worship of itself. One definition of religion in Elementary Forms complements somewhat the huge significance the whole book attributes to ‘beliefs and practices’, by adding to it what one might call an organizational component: ‘A society whose members are united because they imagine the sacred world and its relations with the profane world in the same way, and because they translate this common representation into identical practices, is called a Church’ (Durkheim, 1995: 41). But in the rest of the work there is little sustained elaboration of this component – except perhaps for its being called upon to separate religion from magic: ‘There is no church in magic … The magician has a clientele, not a Church’ (Durkheim, 1995: 42). Perhaps the main strength of ‘Durkheim-on-religion’ remains the central role played in it by the analysis of myths and rituals. It was based on a large body of literature regarding, again, totemism, especially in its manifestations in Aboriginal Australia, some of which he presents diffusely in – one might say – luminous language. 
But some aspects of its content are rendered controversial, of course, by the fact that Elementary Forms is not based on data collected by the author in the course of his own, original field research – Durkheim never even traveled to Australia. It refers instead to data collected by other scholars, many of these seriously challenged by subsequent sources. ‘Durkheim-on-religion’ has a further weakness, suggested by comparing it with ‘Weber-on-religion’, which expressly compares and contrasts a number of historically significant religions. Durkheim was interested of course in the variety of manifestations of religion, as one can see, for example, in his comparison between the contemporary suicide rates of respectively Protestant, Catholic and Jewish populations. But Elementary Forms does not discuss that variety; rather, it assumes that the essential properties of totemism characterize sociologically religion-as-such.
References Birnbaum, Pierre, ‘La conception durkheimienne de l’Etat : l’apolitisme des fonctionnaires’, Revue française de sociologie, vol. 17, 2, 1976, 247–258. Durkheim, Emile, The Division of Labor in Society. New York: Free Press, 2014. Durkheim, Emile, The Rules of Sociological Method. New York: Free Press, 1982. Durkheim, Emile, Professional Ethics and Civic Morals. London: Routledge, 2018. Durkheim, Emile, Suicide: A Study in Sociology. New York: Free Press, 1997. Durkheim, Emile, The Elementary Forms of Religious Life. New York: Free Press, 1995. Giddens, Anthony (ed.), Durkheim on Politics and the State, Cambridge: Polity Press, 1986. Habermas, Juergen, The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society, Boston: MIT Press, 1991. Lukes, Steven, Emile Durkheim: His Life and Work: A Historical and Critical Study, Stanford: Stanford University Press, 1985. Moore, Deborah Dash, ‘David Emile Durkheim and the Jewish Response to Modernity’, Modern Judaism, vol. 6, 3, October 1986, 287–300. Mueller, Hans-Peter, ‘Durkheim’s political sociology’, in Turner, Stephen (ed.), Emile Durkheim: Sociologist and Moralist,
London and New York: Routledge, 1993, pp. 95–110. Mueller, Hans-Peter, ‘Emile Durkheims Moralpolitik des Individualismus’, Berliner Journal für Soziologie, vol. 19, 2, 2009, 227–247. Nisbet, Robert A., The Sociology of Emile Durkheim, London: Heinemann, 1975. Pinon, Stéphane, ‘Le positivisme sociologique: l’itinéraire de Léon Duguit’, Revue interdisciplinaire d’études juridiques, vol. 67, 2, 2011, 69–93. Poggi, Gianfranco, ‘The Place of Political Concerns in the Early Social Sciences’, European
Journal of Sociology, vol. 21, 2, 1980, 362–371. Poggi, Gianfranco and Giuseppe Sciortino, Great Minds: Encounters with Social Theory, Stanford: Stanford University Press, 2011. Robertson Smith, Lectures on the Religion of the Semites. Fundamental Institutions. First Series London: Adam & Charles Black 1889. Selvin, Hanan C., ‘Durkheim’s Suicide: Further Thoughts on a Methodological Classic’, American Journal of Sociology, vol. 63, 6, 1958, 607–619.
4 Economic Analysis in Political Science
James F. Hollifield and Hiroki Takeuchi
Rational Choice and Strategic Interaction
Most political scientists would agree that politics involves control, influence, power, or authority. If we add Max Weber’s concerns about government, legitimacy, and the state, together with Aristotle’s more normative focus on issues of participation, citizenship, and justice, we have a fairly complete picture of what Robert Dahl (1991) calls the ‘political aspect’. We can see immediately how politics touches every dimension of human activity, including the procedural or distributional dimension – who gets what, when, how, why, and at whose cost; the legal or statist dimension, involving issues of governance and legitimacy; and the ethical or normative dimension, which revolves around questions of citizenship, justice, and participation. The study of politics, like economics, also involves preferences, interests, and trade-offs. But unlike economics, where the emphasis is on scarcity and efficiency, in politics the
primary emphasis is on power, influence, and authority, with strong ethical and normative overtones, concerning justice, membership, and citizenship. In a free market, the allocation of scarce goods and resources takes place according to the logic of the marketplace (the price mechanism), that is, the interaction of supply and demand. The exercise of power, however, takes place in the ideational, legal, and institutional confines of political systems. Then what have economic theories added to the study of politics? We know that politics, unlike economics, is not interested narrowly in the allocation of scarce goods and resources. Although politics affects markets through policies, laws, and rules that regulate competition, in a mixed capitalist system politics is not directly concerned with the individual economic decisions of consumers and producers or the optimal allocation of scarce resources. Nonetheless, politics, like economics, does involve choices and strategic interactions. This is where those who
advocate a positive approach to the study of politics join forces with economists to lay the micro-foundations of political analysis. So-called rational choice approaches in political science share common assumptions with economists about human rationality and strategic decision-making, and they seek to construct economic theories to explain political behavior (see also Paine and Tyson, Chapter 11, this Handbook). The most common ‘economic theories’ of politics take rational choice approaches. They assume that individuals are rational in the sense that they will make choices to ‘maximize their chances of achieving their goals’ (Geddes, 2003: 177). They give priority to agency (individual rational actors) over structure (institutions and other political constraints). They assume that individuals have goals, and that institutions and other factors affect individual strategies and preferences. In this framework, utility-maximizing individuals will do what they can to achieve their goals, engaging in strategies to anticipate the actions of others (their opponents), who will in turn anticipate the actions of the other side. Strategic interactions therefore refer to the ways in which each individual not only looks out for his or her own interests but also takes into account the interests and strategies of others. In this rational choice framework, conflict and cooperation, and the give and take of political life, are the result of myriad strategic interactions. Rational choice approaches often use game theory to understand the complexity of strategic interactions in situations of conflict. Developed by applied mathematicians in the mid 20th century, game theory is widely used by economists and to an increasing extent, by political scientists. Game theory is not a ‘theory’ in the sense of a set of claims, laws, or propositions about the way the world works. 
It is rather a method for constructing theories, and it offers the analyst a set of concepts and tools that enable her to formalize her arguments. Game-theoretic analysis requires
careful specification of the beliefs, wants, and needs of individuals, and a clear understanding of what strategies are available to them. The need for specificity makes game theory less useful as a tool for applied political and social science; nonetheless, it helps us to understand the logic and structure of politics, whether we are studying domestic politics, international relations (IR), political economy, or public policy.
Strategic Interactions and Democracy
Economic analysis in political science begins with assumptions about individual rationality. Yet the problem remains of how collective decisions relate to individual choices. To answer this question, social choice theory focuses on how individual preferences add up to collective action, and what roles institutions play in ‘engineering’ social choices. The disjuncture between individual and collective preferences is illustrated by Condorcet’s paradox – the puzzle that calls into question the notion of majority rule, which may be incapable of producing a stable relationship between individual preferences and collective decisions. Kenneth Arrow (1963) sought to address the puzzle of how individual preferences affect collective choices, and he concluded that there is no mechanism short of a dictatorship that can achieve collective rationality. This is Arrow’s impossibility theorem. Arrow’s theorem helps us to understand how democracies work. Arrow does not show that aggregating individual preferences guarantees a transitive, collectively rational ordering. Rather, his theorem implies that such a guarantee would require concentrating power in the hands of a single decision maker or dictator. The main implication of Arrow’s argument is that institutions are the critical link in understanding how radically divergent individual preferences are translated into collective action in rather stable ways.
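Condorcet’s paradox, mentioned above, can be made concrete with a small sketch. The three-voter preference profile and the function names below are hypothetical and chosen only for illustration; the point is simply that pairwise majority voting can cycle even when every individual ranking is transitive.

```python
from itertools import combinations

# Hypothetical profile: each voter ranks the options from most to least preferred.
voters = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_prefers(x, y, profile):
    """Return True if a strict majority of voters rank x above y."""
    wins = sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))
    return wins > len(profile) / 2


def pairwise_results(profile):
    """Map each pair of options to the winner of that pairwise majority vote."""
    options = sorted(profile[0])
    results = {}
    for x, y in combinations(options, 2):
        if majority_prefers(x, y, profile):
            results[(x, y)] = x
        elif majority_prefers(y, x, profile):
            results[(x, y)] = y
        else:
            results[(x, y)] = None  # exact tie
    return results


def condorcet_winner(profile):
    """Return the option that beats every other option pairwise, or None."""
    options = sorted(profile[0])
    for x in options:
        if all(majority_prefers(x, y, profile) for y in options if y != x):
            return x
    return None


print(pairwise_results(voters))
print(condorcet_winner(voters))
```

In this profile A beats B, B beats C, and C beats A, so no option defeats all others: the collective ranking cycles, and the outcome depends on the order in which alternatives are put to a vote.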
Building on Arrow’s social choice theory, William Riker (1980) states:
politics is the dismal science because we have learned from it that [unlike economics] there are no fundamental equilibria to predict. In the absence of such equilibria we cannot know much about the future at all, whether it is likely to be palatable or unpalatable, and in that sense our future is subject to the tricks and accidents of the way in which questions are posed and alternatives are offered and eliminated. (Riker, 1980: 443, italics in the original)
While Riker recognized the failure to find equilibria for collective rationality, he stressed the importance of institutions for the smooth functioning of a democracy. He answers the question of how democracies can reach collective decisions despite the lack of equilibria in the following way: ‘[in a democracy] the way … tastes and values are brought forward for consideration, eliminated, and finally selected is controlled by … institutions. And institutions may have systematic biases in them so that they regularly produce one kind of outcome rather than another’ (ibid.: 443). Following Riker’s expansion of Arrow’s theorem, we must ask ourselves how institutions matter in democratic decision-making. Many studies have debated whether and how institutional structures determine the existence and location of equilibria for collective choice. Anthony Downs (1957) argues that governments are not really interested in maximizing individual voters’ preferences, but in maximizing votes tout court. In his analysis, the sole point of politics is to gain and hold power; and in a two-party system, politicians must take positions as close as possible to the median voter. His view explains why candidates tend to become moderate in the general election while they emphasize their party’s ideologies – such as conservatism or liberalism – during the party’s primary election, especially in American presidential or congressional elections. Voting has long been a principal subject of political science. Maurice Duverger (1954) pointed out that strategic behavior of
voters and candidates is heavily influenced by electoral institutions. Duverger’s Law holds that plurality voting in single-member districts tends to produce two-party systems, whereas voting based on proportional representation or multi-member districts leads to multi-party systems and coalition governments. In two-party systems, candidates have an incentive to obfuscate and avoid taking strong stands on key issues, as they jockey for position vis-à-vis the median voter; whereas in multi-party systems, candidates have an incentive to take stronger positions to attract a significant minority of voters, but their positions may change once they enter a coalition government. Riker (1982) argues that Duverger’s Law is applicable to party politics in many countries once one controls for third parties that are strong regionally but not nationally. In retrospect, these arguments may seem obvious or almost self-evident; but they are in fact early examples of economic analyses in political science, stressing the role of strategy, procedure, and institutions. Following Duverger and others, we can see that political parties and electoral systems are the most important institutions for the smooth functioning of a democracy. They translate and aggregate individual preferences into policy. John Aldrich (1995: 76) argues that democracies would not work without parties, noting that ‘majority voting was highly unstable, shifting, and chaotic – just what would be expected in multidimensional choices that lack preference-based equilibrium’. He shows how parties regulate the number of people seeking office, and how they mobilize voters to achieve and maintain the majorities needed to implement policy once they have gained power. As a result, ‘institutional arrangements could induce equilibrium where preferences alone would not’ (ibid.: 77). By aggregating individual preferences, parties help to overcome the problem posed by Arrow’s impossibility theorem; and they move society toward equilibria that make governing possible.
Moreover, party competition enables democracy to lead to political outputs that
better the interests of its citizenry because of politicians who face the real possibility of losing their next election. John Aldrich and John Griffin (2018) find that even in the American South – where historically political parties were weak and underdeveloped – recently, a two-party system has emerged, and party competition in the South has played the same roles as that of the North for democracy to work. The argument for the importance of parties relies heavily on economic reasoning, and it assumes that voters are able to make informed choices. Studies of democratic elections, by contrast, have found that individuals appear to know very little about politics (e.g., Lupia, 2016; Lupia and McCubbins, 1998; Page and Shapiro, 1992; Popkin, 1994). However, the fact that people lack information about politics does not mean that they make random political choices. Arthur Lupia and Mathew McCubbins (1998: 5) argue that people ‘use a wide range of simple cues as substitutes for complex information’ and choose elite cues among competing messages based on the credibility of the senders of the messages. As a result, voters vote as if they have sufficient information to make reasoned choices. However, nowadays as people get news increasingly from social media, voters rely less on expert opinion that includes multiple competing messages. Rather they are wedded to monolithic perspectives that they want to hear, victims of selection and confirmation bias. Economic analysis in political science can explain why and how social media have made it more difficult for democracy to function and to produce stable governing coalitions.
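The median-voter logic attributed to Downs earlier in this section can be sketched in a few lines. This is a minimal illustration under strong assumptions that are not in the text: a single policy dimension, two candidates, voters who always vote for the nearest platform, and a hypothetical set of voter ideal points.

```python
from statistics import median


def vote_share(pos_a, pos_b, ideal_points):
    """Share of voters closer to candidate A than to B; ties are split evenly."""
    votes_a = 0.0
    for voter in ideal_points:
        dist_a, dist_b = abs(voter - pos_a), abs(voter - pos_b)
        if dist_a < dist_b:
            votes_a += 1
        elif dist_a == dist_b:
            votes_a += 0.5
    return votes_a / len(ideal_points)


def never_loses(position, ideal_points, platforms):
    """True if `position` wins or ties against every rival platform."""
    return all(vote_share(position, rival, ideal_points) >= 0.5 for rival in platforms)


# Hypothetical ideal points on a 0-10 left-right scale.
ideal_points = [1, 2, 4, 5, 6, 8, 9]
platforms = list(range(0, 11))

median_position = median(ideal_points)
unbeatable = [p for p in platforms if never_loses(p, ideal_points, platforms)]
print(median_position, unbeatable)
```

Under these assumptions the only platform that cannot be beaten in a two-way race is the median voter’s ideal point, which is why vote-maximizing candidates are pulled toward the center in a general election.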
Strategic Interactions and War
Economic analysis in political science abounds in the study of one of the most important issues in IR: war and peace. At first glance, it is not surprising to see that
states are more often than not in conflict, because each state must pursue its own interests, maximizing its power and wealth in order to provide security. States are trapped in a ‘security dilemma’, which arises when efforts that states make to defend themselves lead other states to feel less secure and to fear that they will be attacked. The logic of the security dilemma is often explained in game-theoretic terms, using the so-called ‘Prisoner’s Dilemma’, whereby the two actors’ rational strategy to maximize individual payoffs creates a worse outcome than some other possible outcome that would be better for both actors. This interaction captures why international cooperation is difficult under anarchy: in the absence of enforcement mechanisms to punish defections, states can give in to the temptation to act unilaterally. The point is that states have an individual incentive to defect, which leads to an outcome of mutual defection even though both would be better off with cooperation. In sum, the Prisoner’s Dilemma provides the micro-foundations for realist theories of IR, which argue that states will always approach IR as a zero-sum (win or lose) game, thus leading to the anarchic nature of the international system (e.g., Mearsheimer, 2001; Waltz, 1979). While it may seem irrational for states to engage in an arms race, thereby increasing the propensity to go to war and making it more difficult to resolve conflict through negotiations, conflict and insecurity, according to realists, are enduring features of world politics (see also David and Rapin, Chapter 83, this Handbook). At the same time, international cooperation is more likely if interactions occur repeatedly with the same partners (Axelrod, 1984).
In this situation – commonly known as the repeated (iterated) Prisoner’s Dilemma – actors find it in their best interest to cooperate in every period if future payoffs are valued highly enough. This is the so-called ‘shadow of the future’, and it forms a basis for cooperation in world politics. In this way, the danger of war is lessened or eliminated
through interdependence and institutions. The repeated Prisoner’s Dilemma lays the micro-foundations for liberal theories of IR (see Hellmann, Chapter 76, this Handbook). Interdependence and constant strategic interaction produce common interests and therefore decrease conflict among states, reducing the role of military power and the insecurity it breeds. Institutions, then, both international and domestic, can mitigate the effects of anarchy and, as a result, there is opportunity for positive-sum, mutually beneficial cooperation (e.g., Ikenberry, 2011; Keohane, 1984; Lake, 2011; McDonald, 2009). Even under anarchy, international institutions help states to overcome the security dilemma and promote cooperation by creating the expectation of repeated interactions across time and with multiple partners, defining norms (standards of acceptable behavior), providing information about activities of other states, and creating linkages across policy dimensions (Jervis, 1976; Martin and Simmons, 2001; Voeten, 2005). This logic of cooperation suggests that alliances work if states seize opportunities to cooperate over time and across issues, and if each state trusts that the other states see the virtues of cooperation. If a powerful state accuses its allies of defection and free-riding, cooperation based on the shadow of the future will not work and the chance for peace and stability in IR will diminish. Hence, the iterated Prisoner’s Dilemma suggests that the America First foreign policy of the United States under Donald Trump will undermine international security. War is an extremely costly way for states to settle their disputes (see David and Rapin, Chapter 83, this Handbook). Given the human and material costs of military conflict, why do states sometimes wage war rather than resolving their disputes through negotiations? Motivated by this puzzle, James Fearon (1995) offers ‘rationalist explanations for war’.
He postulates three mechanisms by which conflict escalates into war. The first is miscalculation, which he sees as arising in two ways. If there is uncertainty about an adversary’s
capabilities – such as the size of the military, the effectiveness of military technology, the quality of leadership, the contribution, if any, of allies; or if there is uncertainty about the adversary’s resolve to fight wars, which raises questions about how much each side values the ‘good’ that is in dispute and what the ultimate cost of war will be in terms of blood (casualties), treasure (wealth), and domestic politics (whether the leader can stay in office) – the international system will be destabilized, leading to war. Second, Fearon sees three scenarios in which war may occur because of ‘commitment problems’, when states are unwilling to trust their adversaries to honor a negotiated deal. The first scenario is that when the issue in dispute affects future bargaining power (e.g., strategic territory, weapons programs, etc.), the bargaining can fail if a state fears that its adversary will exploit concessions to make further demands – this is the so-called dilemma of coercive disarmament. In the second scenario, when the relative power of one side is expected to grow rapidly (e.g., rapid economic growth, acquisition of new weapons, etc.), the declining state may have an incentive to fight now to prevent or slow the power shift; a war fought for this reason is called a preventive war. In the third scenario, if the outcome of the war depends on delivering the first blow, a first-strike advantage exists and creates incentives to engage in a preemptive attack to take the first shot before the adversary does – a ‘guns of August’ scenario. Third, Fearon discusses a problem that can prevent states from reaching settlements because the disputed good is indivisible. Fearon (1995) and Robert Powell (2006) argue that we should be skeptical of this mechanism, because claims that goods are indivisible may reflect a bargaining position adopted for strategic reasons rather than a true description of the good.
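The puzzle behind these mechanisms can be illustrated with a stylized bargaining sketch in the spirit of Fearon’s argument. The setup is a simplification assumed here, not taken from the text: a disputed good of value 1, a probability p that state A wins a war, and positive war costs for each side. All numbers and function names are hypothetical.

```python
def bargaining_range(p, cost_a, cost_b):
    """Shares of the good for A that both sides weakly prefer to fighting.

    A's expected war payoff is p - cost_a; B's is (1 - p) - cost_b.
    Any share x for A with p - cost_a <= x <= p + cost_b beats war for both.
    """
    low = max(0.0, p - cost_a)
    high = min(1.0, p + cost_b)
    return low, high


def deal_exists(p_believed_by_a, p_believed_by_b, cost_a, cost_b):
    """True if some split satisfies both sides given their own beliefs about p."""
    minimum_a_accepts = p_believed_by_a - cost_a
    maximum_b_concedes = p_believed_by_b + cost_b
    return minimum_a_accepts <= maximum_b_concedes


# Shared beliefs: because war is costly, a range of peaceful deals always exists.
print(bargaining_range(0.5, 0.1, 0.1))

# Mutual optimism: A thinks it wins with probability 0.8, B thinks A wins with 0.4.
print(deal_exists(0.8, 0.4, 0.1, 0.1))
```

With common beliefs the costs of war open a range of settlements that both sides prefer to fighting, which is the puzzle; when private information leaves each side optimistic about its own prospects, the range can vanish, which corresponds to the first mechanism described above.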
Fearon’s rationalist explanations for war show us that a mutual (and rational) preference for peace is not sufficient for states to overcome the incomplete information problem or the commitment problems. Threats
may lack credibility, because states generally have incentives to exaggerate and misrepresent their capabilities and their resolve to fight. States have a big incentive to bluff; witness Saddam Hussein in the two Gulf Wars. In July 1990 when Iraq was engaged in coercive diplomacy with Kuwait, both Iraq and Kuwait had the mistaken belief that the other side would cave. In such a situation, even though war is costly and regrettable ex post, actions that entail a risk of war can be sensible ex ante. Meanwhile, in the cases of commitment problems, states face a choice between war today on favorable terms and the threat of war tomorrow on unfavorable terms. The threat of war tomorrow diminishes if states can make credible promises not to use force to revise subsequently the terms of the deal. However, it is difficult for a state to make such a promise in a credible manner in the absence of any enforcement mechanisms, and hence a commitment problem arises in IR where states interact under anarchy. One of the enduring issues in the study of IR is the so-called democratic peace argument, with many competing arguments about why there are few, if any, clear cases of war between mature democratic states. Why is it that democracies do not fight each other? A common theoretical argument is rooted in the idea that democratic states experience a lower probability of war with one another due to domestic political constraints and cooperation flows from a presumption of mutual trust and respect, and shared strategic interests (Doyle, 1986; Gowa, 1999; Russett, 1993). Democracy increases the political costs of war by making elected leaders accountable to people who ultimately must pay the costs of war. The rationalist explanations for war can shed light on this debate, focusing on how democracy influences bargaining interactions between states and increases the chances that a peaceful settlement will be found. 
For example, Kenneth Schultz (2001) argues that democracy increases transparency, which may help
overcome the incomplete information and commitment problems in strategic interactions between states, thereby reducing uncertainty about the capabilities and resolve of democratic states. In addition, democratic leaders may be able to communicate their resolve in a credible manner, because backing down from a threat creates public disapproval and democracy magnifies the political importance of this effect (e.g., Fearon, 1994; Tomz, 2007). Yet the question remains, why are democratic states unlikely to fight each other but are more warlike in general? Some scholars suggest that while fellow democracies enjoy a presumption of friendship, democratic publics treat autocrats with suspicion and mistrust (e.g., Dixon, 1993; Mousseau, 1998; Tomz and Weeks, 2013). Rationalist explanations of IR interpret this observation by pointing out that democratic states have preferences that favor compromise over the use of force when bargaining with other democratic states. With this logic, in the extreme, autocrats are legitimate targets for regime change, as, for example, when President George W. Bush argued that bringing democracy to Iraq would advance US national security. In sum, economic analysis in political science suggests that the two Gulf Wars occurred because of Iraq’s failure to communicate credibly its capabilities and resolve, as well as US lack of norms that favor compromise over the use of force against autocracies.
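The Prisoner’s Dilemma logic invoked earlier in this section, in both its one-shot and repeated forms, can be sketched as follows. The payoff values are conventional textbook numbers assumed for illustration, and the repeated-game condition assumes a ‘grim trigger’ strategy (cooperate until the other side defects, then defect forever); neither is specified in the text.

```python
from itertools import product

# Illustrative payoffs: temptation > reward > punishment > sucker.
T, R, P, S = 5, 3, 1, 0
payoffs = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def nash_equilibria(game):
    """Return action pairs where neither player gains by deviating alone."""
    actions = ("C", "D")
    equilibria = []
    for a, b in product(actions, repeat=2):
        row_ok = all(game[(a, b)][0] >= game[(alt, b)][0] for alt in actions)
        col_ok = all(game[(a, b)][1] >= game[(a, alt)][1] for alt in actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


def min_discount_factor(temptation, reward, punishment):
    """Smallest discount factor at which grim-trigger cooperation is stable.

    Cooperating forever yields R / (1 - d); defecting once yields
    T + d * P / (1 - d). Cooperation holds when d >= (T - R) / (T - P).
    """
    return (temptation - reward) / (temptation - punishment)


def cooperation_sustainable(discount, temptation=T, reward=R, punishment=P):
    return discount >= min_discount_factor(temptation, reward, punishment)


print(nash_equilibria(payoffs))
print(min_discount_factor(T, R, P))
```

In the one-shot game mutual defection is the only equilibrium, although both players would prefer mutual cooperation. In the repeated game, cooperation becomes sustainable once actors weight future payoffs heavily enough, which is the ‘shadow of the future’ at work.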
Strategic Interactions and Politics of International Trade
Economic analysis in political science also helps us to understand the politics of trade. Economic theory tells us that free trade is beneficial (see also Stéphane Paquin, Chapter 74, this Handbook). Why, then, do states not always embrace trade liberalization? Why has the international trading system been open only under particular conditions?
The core concept of the economics of trade, explaining how trade is beneficial for all trading partners, is comparative advantage. Adam Smith in The Wealth of Nations (1979 [1776]) suggests that there is a benefit to an international division of labor whereby countries specialize based on skills or endowments. In this way, self-interested economic exchange makes everyone better off through the ‘invisible hand’ of the market. Applying Smith’s theory to international trade, David Ricardo in On the Principles of Political Economy and Taxation (2015 [1817]) proposes the idea of comparative advantage: a system of free trade in which countries specialize in the areas where they have a comparative advantage, leading to the optimal allocation of scarce goods and resources. ‘Comparative’ means that it is not necessary for a country to have an absolute advantage in the production of a good or service. All countries have a comparative advantage, in that they can produce some goods and services at a lower opportunity cost than other countries; and trade will lead them to specialize. Thus, it is wrong to say that trade is only beneficial for wealthier developed countries. Explaining why countries trade, the Heckscher-Ohlin theory (H-O) takes the logic of comparative advantage a step further, pointing out that countries will export goods that require large inputs of their more abundant factors, and import goods that require inputs of more scarce factors. H-O theory explains trade based on factor prices and endowments. Developing countries that are abundant in unskilled labor (or land) should export labor-intensive manufactured goods such as textiles and clothing (or agricultural products), whereas developed countries that are rich in (human) capital should specialize in more high-tech products and services.
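The difference between absolute and comparative advantage can be shown with a short calculation. The labor requirements below follow the familiar two-country, two-good textbook illustration usually associated with Ricardo; they are used here purely as illustrative numbers, and the function names are hypothetical.

```python
# Labor needed to produce one unit of each good (illustrative values).
labor_hours = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}


def absolute_advantage(good):
    """Country that needs the least labor to produce the good."""
    return min(labor_hours, key=lambda country: labor_hours[country][good])


def opportunity_cost(country, good, other_good):
    """Units of other_good forgone to produce one unit of good."""
    return labor_hours[country][good] / labor_hours[country][other_good]


def comparative_advantage(good, other_good):
    """Country with the lowest opportunity cost of producing the good."""
    return min(labor_hours, key=lambda c: opportunity_cost(c, good, other_good))


for good, other in (("cloth", "wine"), ("wine", "cloth")):
    print(good, absolute_advantage(good), comparative_advantage(good, other))
```

Portugal is more efficient in both goods, yet England still has the lower opportunity cost in cloth, so both countries can gain if England specializes in cloth and Portugal in wine.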
In sum, comparative advantage and H-O theory can explain why developing countries demand access to developed countries’ markets of textiles and apparel (or agriculture) – instead of protecting their own domestic markets – in trade negotiations. Although economists make a strong case that trade is beneficial for global welfare,
political scientists point out that every country currently has at least some restrictions on trade, called protectionism, involving the imposition of barriers to restrict imports. Why do governments restrict trade? Using the simple economic logic of the Stolper-Samuelson theorem, which postulates that protection benefits the scarce factor of production, Ronald Rogowski (1989) explains which interests will support protectionism and which will support trade liberalization. According to H-O theory, countries export goods that are intensive in the factors of production that are relatively abundant, while importing goods that are intensive in the factors of production that are relatively scarce. Therefore, trade liberalization will increase the income of the relatively abundant factor by increasing its exports, while decreasing the income of the relatively scarce factor by increasing import competition. This logic suggests that the demand for protectionism should come from those whose income would suffer because of trade liberalization. Applied to the United States, which is a labor-scarce country compared with developing countries, this logic implies that unskilled labor benefits from protectionism and loses from trade liberalization with developing countries such as China and Mexico (Spence, 2011). In addition, the Ricardo-Viner model suggests that interests in trade may be industry-specific. Exporting industries want open foreign markets, import-competing industries want to protect the home market to reduce competition, and industries using imports as inputs want the home market to be open to reduce production costs. In sum, the Stolper-Samuelson theorem explains why unskilled workers support measures to restrict trade that would benefit their industries.
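The Stolper-Samuelson reasoning above amounts to a simple mapping from relative factor endowments to expected trade preferences. The sketch below encodes that mapping under assumptions made only for illustration: the endowment figures are hypothetical, and a factor is labeled abundant when the home share of the world stock of that factor is above the average share across factors.

```python
def classify_factors(home, world):
    """Label each factor 'abundant', 'scarce', or 'neutral' at home."""
    shares = {factor: home[factor] / world[factor] for factor in home}
    average_share = sum(shares.values()) / len(shares)
    labels = {}
    for factor, share in shares.items():
        if share > average_share:
            labels[factor] = "abundant"
        elif share < average_share:
            labels[factor] = "scarce"
        else:
            labels[factor] = "neutral"
    return labels


def predicted_trade_stance(labels):
    """Abundant factors gain from liberalization; scarce factors gain from protection."""
    stance = {"abundant": "liberalization", "scarce": "protection", "neutral": "indifferent"}
    return {factor: stance[label] for factor, label in labels.items()}


# Hypothetical endowments for a capital-rich, labor-scarce country.
home = {"capital": 100, "unskilled_labor": 50}
world = {"capital": 200, "unskilled_labor": 400}

labels = classify_factors(home, world)
print(labels)
print(predicted_trade_stance(labels))
```

The sketch captures only the direction of the predicted preferences, not the size of any income effect, and it ignores the industry-specific interests highlighted by the Ricardo-Viner model.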
When the United States approved China’s entry into the World Trade Organization (WTO) in 1999, labor unions representing autoworkers (the United Auto Workers), lorry drivers (the Teamsters union), and dockworkers (the International Longshore and Warehouse Union) opposed China’s accession to the WTO. This is despite the fact that increasing trade with China
would benefit their members, by lowering prices for consumer goods, stimulating economic growth, and raising employment (Friedman, 2000). They feared that if worker incomes dropped in import-competing sectors, trade would depress labor income in all sectors, because unskilled labor is a substitute for trade, not a complement. Likewise, the Ricardo-Viner model explains how protectionism may hit the firms that use imports as inputs. For example, the US imposition of tariffs on auto parts under the Trump administration hurt domestic auto producers that use imported parts, even though Trump argued that tariffs would protect them. Car tariffs – presumably designed to protect the jobs of US autoworkers – have raised production costs of car manufacturing in the United States, and as a result in 2018 General Motors announced that it would close four plants in the United States and one in Canada, cutting 14,000 jobs (Sandbu, 2018). The Stolper-Samuelson theorem and the Ricardo-Viner model show that even if trade liberalization makes a country as a whole better off, it creates winners and losers within the country, and it is the shifting constellation of interests, according to Rogowski, that will drive openness and closure to trade. When the winners’ gains from free trade exceed the losers’ losses, the losers can be compensated out of the winners’ gains, improving national welfare. Economic theory calls such a move Pareto improving: it makes at least somebody (if not everybody) better off without making anyone worse off. What policies mitigate the negative impacts of trade and create Pareto improvement? Economic theory assumes that labor can move between different jobs with no cost. What this theory implies is that policies that lower the cost for labor to move from a declining industry to a growing industry would lead to a Pareto improvement.
Social welfare policies such as improving unemployment insurance and enhancing job training are essential to build support for free trade (Scheve and Slaughter,
2007). The cost of moving from one sector to another would be high if a worker loses basic health insurance coverage when changing jobs. Therefore, a national health care system would help to achieve a Pareto improvement. Perhaps most importantly, education reform to improve basic skills in the workforce is key for workers in developed countries to compete with those in developing countries in the global economy (Alden and Taylor-Kayle, 2018; Scheve and Slaughter, 2019). If the productivity of better-paid developed countries’ workers is the same as that of less-paid developing countries’ workers, developed countries’ workers will face downward pressure on their wages. While protectionism neither builds the needed safety net nor makes workers more competitive, it makes domestic producers less competitive in the global market. As a result, economic growth is constrained, making it more difficult to establish a safety net because of declining GDP and lower tax revenue. The H-O theory, together with the Stolper-Samuelson theorem, explains why trade with developing countries has led to reduced wages for many unskilled workers in developed countries, which has caused a protectionist backlash in the United States and other developed countries. During the 2016 US presidential election, Donald Trump told workers that he would bring back unskilled jobs by restricting trade, foreign investment, and immigration, campaigning on the nationalist slogan of ‘America First’. In fact, economic research shows that trade with China has been responsible for a significant part of the decline in US manufacturing employment in the last two decades, but there is no evidence that trade with other developing countries is responsible for job or wage losses of US workers – in short, China is different. One study estimates that US trade with China during 1999–2011 led to net job losses of 2.0–2.4 million in the United States (Acemoglu et al., 2016).
Another study finds that people who work in parts of the United States most affected by import competition from China
tend to have greater unemployment and reduced lifetime income (Autor et al., 2016). Still other studies argue that enhanced productivity because of automation has had a far bigger effect than import competition with developing countries, pointing to the fact that US manufacturers have increased their productivity and need fewer workers. Although the US steel industry lost 400,000 jobs (75% of its workforce) during 1962–2005, its production did not decline (Collard-Wexler and De Loecker, 2015). Even though technology is a bigger threat to unskilled jobs than trade, foreign countries – China or Mexico – are more convenient scapegoats than machines or robots. As Lawrence Katz put it in a New York Times article: ‘Just allowing the private market to automate without any support is a recipe for blaming immigrants and trade and other things, even when it’s the long impact of technology’ (Miller, 2016). Edward Alden (2017) argues that the United States has
failed to adjust economic and trade policies to the new reality of an automated and globalized economy. As a result, those who have lost the safety net see immigrants and trade as the cause of their economic difficulties.
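The steel figures cited above imply a simple back-of-the-envelope calculation. The sketch below assumes that output was roughly constant over 1962–2005 (the text says only that production did not decline); the symbols $L_0$ (initial workforce), $L_1$ (final workforce), and $Q$ (output) are introduced here for illustration and do not appear in the cited study.

```latex
\begin{align*}
L_0 &\approx \frac{400{,}000}{0.75} \approx 533{,}000, \\
L_1 &= L_0 - 400{,}000 \approx 133{,}000, \\
\frac{Q/L_1}{Q/L_0} &= \frac{L_0}{L_1} = \frac{1}{1 - 0.75} = 4 .
\end{align*}
```

On these assumptions, output per worker at least quadrupled over the period, which is the sense in which productivity growth rather than import competition can account for the job losses.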
Strategic Interactions and International Migration

Another field of study where economic theories help explain political outcomes is international migration – the movement of people across national borders – which has been steadily increasing in every region of the globe since the end of World War II (see Badie, Chapter 84, this Handbook). In 2017 approximately 258 million people resided outside of their country of birth, and over the past half-century individual mobility has increased at a steady pace (see Figure 4.1).
Figure 4.1 Trends in international migration: A ‘Crisis’? Source: UN Population Division.
Tens of millions of people cross borders on a daily basis, which adds up to roughly two billion annually. International mobility of people is part of a broader trend of globalization, which includes trade in goods and services, investments and capital flows, greater ease of travel, and a veritable explosion of information. While trade and capital flows are the twin pillars of globalization, migration is the third pillar or the third leg of the stool on which the global economy rests. Migration is in many ways connected to trade and investment, yet it is profoundly different. People are not shirts, which is another way of saying that labor is not a pure commodity. Unlike goods and capital, individuals can become actors on the international stage (they have agency) whether through peaceful transnational communities or violent terrorist/criminal networks. In the extremely rare instances when migrants commit terrorist acts, migration and mobility can be a threat to the security of states. However, many economic studies show that the benefits of migration far outweigh the
costs (Martin, 2015). Immigrants bring much-needed manpower to demographically deficient countries, as well as human capital, new ideas (entrepreneurial know-how), and cultures (diversity) to their host societies. In liberal democracies, however, they also come with a basic package of (human and civil) rights that enables them to settle and become productive members of society, if not citizens of their adoptive countries. Alternatively, they may return to their countries of origin, where they can have a dramatic impact on economic and political development (Martin et al., 2006), with a brain drain turning into a brain gain or brain circulation.

Figure 4.2 A typology of international regimes
Migration and Governance

In strategic interactions over the issue of migration, international cooperation is difficult. Figure 4.2 highlights the inadequacies of global migration governance compared to trade and finance. Why has no international migration regime emerged to complement the Bretton Woods regimes for
trade (General Agreement on Tariffs and Trade (GATT)/WTO) and finance and development (International Monetary Fund (IMF) and World Bank)? The answer lies in collective action problems. To date, unwanted labor migration is more of a nuisance for host countries, especially from a political and security standpoint. Labor migrants are not fundamentally threatening, the building of walls along the US–Mexican border notwithstanding. Migration governance often is unilateral and done on an ad hoc or bilateral basis. The payoff from international cooperation in the area of unwanted labor migration is negative, and opportunities for defection from a global migration regime are numerous, notwithstanding the new UN ‘global compact on migration.’ The possibilities for monitoring, enforcing, or developing some core principle of non-discrimination (as in the WTO) are minimal at this point, and there is little or no reciprocity. Thus, states have a strong incentive to free-ride on other states’ efforts, and international migration of all types poses a challenge for individual states, as well as for regional integration processes like the European Union (EU) and the Association of Southeast Asian Nations (ASEAN), and for the international community as a whole (Hollifield et al., 2014).

That brings us back to the domestic level in our quest to understand migration governance and to explain why states risk openness, and it requires a political economy approach. Despite its economic and cultural benefits, international migration is one of the most politically controversial issues in developed countries. Reactive populism in Europe and the United States is nativist and xenophobic, and immigration is a key issue for many voters, as evidenced by the British vote to leave the EU and the election of Donald Trump as President of the United States.
Figure 4.3 The dilemmas of migration governance

Four factors drive immigration policies: economic interests (markets), cultural and ideational concerns, security, and rights (see Figure 4.3). Opponents claim that immigrants suppress the wages of native workers (markets), impose welfare burdens and diminish citizenship
(rights), threaten national identity (culture), and cause crime and terrorism (security). In their research on public opinion, Gary Freeman and Alan Kessler (2008) find that opposition to immigration is related not only to economic factors, such as the job-market threat posed by immigrants and higher taxes to support immigrants’ use of welfare programs, but also to non-economic factors, such as the desire for cultural homogeneity and a fear of loss of national identity (see also Huntington, 2004). In ‘normal’ times, the debate about immigration control in liberal democracies revolves around two poles: markets (numbers) and rights (status); or how many immigrants to admit, with what skills, and with what status? Should migrants be temporary (guest) workers, or allowed to settle, bring their families, and get on a ‘path to citizenship’? To explain the push and pull factors of international migration, economic analysis treats individual migrants as preeminently rational, utility-maximizing agents (Martin, 2015). For example, George Borjas (1990) argues that the welfare state itself is a significant pull factor because low-skilled migrants choose to migrate in the expectation that they can benefit from the recipient country’s social welfare services after admission. As a result, Martin Ruhs (2013) argues, there are trade-offs in the policies of developed
countries between openness to admitting immigrants (numbers) and the rights granted to immigrants (status). Those who argue for the trade-offs between markets and rights assume that migrant and native workers are substitutes, and hence that immigration harms native workers as their wages fall (e.g., Borjas, 2003). However, migrants and native workers can be complements if they belong to different skill groups, so that immigrants may have a positive impact on the wages of native workers (e.g., Peri and Sparber, 2009). Accounting for the complementarity effects, Gianmarco Ottaviano and Giovanni Peri (2012) find that in the United States immigrants during 1990–2006 had a small positive effect on average wages of US-born workers (including unskilled workers) and a substantial negative effect on wages of recent, low-skilled immigrants. This economic analysis yields two important policy implications: the more social mobility workers – both migrant and native – have, the more beneficial the arrival of migrant workers is for both native workers and employers; and previous immigrants would lose from more immigration if they fail to raise their skill levels after arrival. In other words, policies that increase workers’ social mobility would mitigate the negative impacts of immigration and create a Pareto improvement. Thus, regulatory reform to create more flexible labor markets and education reform to enhance skill levels of both native and migrant workers would be important to mitigate negative public reactions to immigration.

The logic of collective action suggests that organized groups would have more impact on policymaking than disorganized public opinion, especially in democratic countries where vote-maximizing politicians find it more important to cater to influential interest groups (Olson, 1965). How do interest groups shape US immigration policy at the sector level?
Margaret Peters (2017) argues that firms that lobby for open immigration to lower their labor costs when trade policy is closed will
adapt to import competition by other means – such as increasing labor productivity or closing their businesses – when trade policy is open. She states that trade liberalization and the increased ability of firms to move overseas have reduced the business community’s pressure for open immigration, empowered anti-immigrant groups, and spurred greater limits on immigration. Giovanni Facchini and his coauthors (2011) assume that labor unions want restrictions on immigration – so as to maintain higher wages for native workers – while business groups want greater openness to immigration, and they find that barriers to immigration are lower in sectors where business groups incur larger lobbying expenditures and higher in sectors where labor unions are more powerful. In sum, economic analysis of international migration suggests that business firms seek greater openness to immigration to confront import competition, while workers demand greater controls on immigration when they fail to upgrade their skill levels and hence have to confront the downward pressure on their wages due to automation – not immigration.

In times of war and political crises, the dynamic of markets and rights gives way to a culture-security dynamic, and finding equilibrium (compromise) in the policy game is much harder – this is the policy dilemma facing leaders across the globe in the 21st century. Cultural concerns – where should the immigrants come from, which regions of the globe, with which ethnic characteristics – and issues of integration often ‘trump’ markets and rights, and the trade-offs are more intense in some periods and in some countries than in others. Indeed, studies of public opinion toward immigration show that cultural concerns play a significant role in how willing people in recipient countries are to accept newcomers (e.g., Hainmueller and Hopkins, 2014).
For example, in Germany, widely shared but wildly fabricated stories of Arab men raping Western women epitomize the view that the newcomers with particular religious and ethnic backgrounds are defiling the nation (Eddy, 2017). Michael Lusztig
(2017) takes issue with multiculturalists (Kymlicka, 1995), arguing that multiculturalism and other forms of culturalism pose a threat to liberal democracy. Since the terrorist attacks of September 11, 2001, in the United States, and again since the attacks of November 13, 2015, in Paris, immigration and refugee policymaking has been dominated by a national security dynamic (with a deep cultural subtext, fear of Islam) and by the concern that liberal immigration and refugee policies pose a threat to the nation and to civil society. In the United States, Donald Trump has stoked fear of immigrants to gain votes, and as a result anti-internationalism has escalated from protectionism into xenophobia, nativism, and racism. Even though immigration is not a cause of job losses, the perception that immigrants are ‘taking our jobs’ has proven to be politically potent (e.g., Scheve and Slaughter, 2001). Those who feel that ‘immigrants have stolen our jobs’ are receptive to Trump’s xenophobic, one-way Twitter demagoguery of ‘we are deceived by foreigners’. Protectionism and restricting immigration have become the rallying cry of anti-globalists. Without a social welfare safety net that would create a Pareto improvement, those in the United States who feel left behind by globalization find immigrants and foreigners to be convenient scapegoats. However, the situation in Europe is different. Despite strong welfare states, the fear of Islam and terrorism overrides the basic political economy dynamic of markets and rights (see Figure 4.3).
Migration Interdependence and International Cooperation

As if the domestic four-sided game (Figure 4.3) were not complicated enough, it becomes more difficult still because migration control has important foreign policy implications. The movement of populations affects international security, and in some situations like the partition of India or the breakup of
Yugoslavia, it can change the balance of power. Hence, political leaders are always engaged in a strategic interaction, a two-level game, seeking to build domestic coalitions to maximize support for policy but with an eye on the foreign policy consequences (Putnam, 1988). Migration is an important factor driving economic interdependence and creating an international labor market. The first rule of political economy is that markets beget regulation. Hence, some type of a stronger global or regional migration regime is necessary to sustain open labor markets. What will be the parameters of such a regime, how will it evolve, and how can economic theories of politics help us to understand it? One of the principal effects of economic interdependence is to compel states to cooperate (see the discussion of trade in the previous section). Increasing international migration is one indicator of interdependence, and it shows no signs of abating. From Figure 4.4, we can see levels of migration interdependence, with states in Europe, North and South America, Africa and Asia relying heavily on migration for national development, whether through labor migration (both high- and low-skilled) or income generated via remittances. As the international market for skilled and unskilled labor grows, pressures to create an international regime will increase. Economic theories help us to identify two ways in which states can overcome coordination problems in the absence of a multilateral process that builds trust and reciprocity and thereby helps to overcome asymmetries: (1) through the centralization of regulatory power and pooling of sovereignty (as in the EU), and (2) through suasion and ‘tactical issue linkage’. We already have seen an example of the first strategy at the regional level in Europe (see Fawcett, Chapter 80, this Handbook). The EU and, to a lesser extent, the Schengen and Dublin regimes were built through processes of centralization and pooling of sovereignty.
Figure 4.4 Migration interdependence

This was easier to do in the European context because of the symmetry of interests and power within the EU
and the existence of an institutional framework (the various treaties of the EU). It is much more difficult to centralize control of migration in the Americas or Asia, for example, where the asymmetry of interests and power is much greater, and levels of political and economic development vary tremendously from one state to another. Unlike in the EU, regional trade regimes like the North American Free Trade Agreement (NAFTA), Asia-Pacific Economic Cooperation, or the Trans-Pacific Partnership (now the Comprehensive and Progressive Agreement for Trans-Pacific Partnership) are unlikely to lead quickly to cooperation in the area of migration. Nevertheless, the regional option – multilateralism for a relevant group of states where migration governance is a club good – is one way to overcome collective action problems and to begin a process of centralization of regulatory authority. Most international regimes have had a long gestation period, beginning as bilateral or regional agreements. It is unlikely, however, that an international migration
regime (a Global Compact on Migration and Refugees) could be built following the genesis of international organizations such as the GATT (now the WTO), the IMF, and the World Bank, which provide a certain level of multilateral governance for the other two pillars of globalization. In the area of migration governance, it is difficult to fulfill the prerequisites of multilateralism: indivisibility, generalized principles of conduct, and diffuse reciprocity. The norm of non-discrimination (equivalent of the most-favored nation status) does not exist, and there are no mechanisms for punishing free-riders and no way of resolving disputes. In short, as depicted in Figure 4.2, the basis for multilateralism is weak, and the institutional framework is not well developed. However, this has not prevented the international community (via the United Nations) from moving forward with a Global Compact for Migration, built around the principle of ‘safe, orderly and regular migration’. The challenge of course will be to convince the most powerful states, especially the United States, to support a multilateral process for global
migration governance. For the moment, the United States and other powerful countries (like the UK) are moving in exactly the opposite (nationalist and unilateral) direction. With the asymmetry of interests and power between developed (migration receiving) and less-developed (migration sending) countries, suasion, including financial incentives, is the only viable strategy for overcoming collective action problems, whether at the regional or international level. This game proceeds in three steps. The first step is to develop a dominant strategy, which can be accomplished only by the most powerful states, using international organizations (like the UN) to persuade or coerce smaller and weaker states. From the standpoint of recipient countries, the orderly movement of people, defined in terms of rule of law and respect for state sovereignty, should be the principal objective of the powerful liberal states. From the standpoint of the sending countries, migration for development, taking advantage of remittances and returns (brain gain) or circular migration, should be the guiding principle of an international migration regime. The second step is to persuade other states to accept the dominant strategy. This will necessitate tactical issue linkage, which involves identifying issues and interests not necessarily related to migration, and using these to leverage, compel, or coerce states to accept the dominant strategy. This is, in effect, an ‘international logroll’. Such tactics will have only the appearance of multilateralism, at least initially. Tactical issue linkage has been central in negotiations between the United States and Mexico over NAFTA (now the United States-Mexico-Canada Agreement (USMCA)) and over refugee flows from Central America. Likewise, migration management figured prominently in negotiations between the EU and neighboring states, especially EU candidate countries in the Western Balkans and Turkey.
The third step for developed countries is to institutionalize this process. The long-term benefits of such a strategy for recipient countries are obvious. It will be less costly to build a multilateral migration
regime than to fight every step of the way with every sending state, relying only on unilateral or bilateral agreements. Multilateral processes may entail some short-term loss of control and sovereignty in exchange for long-term stability and orderly migration based on rule of law. The payoff for sending states is greater freedom of movement for their nationals, greater foreign reserves and a more favorable balance of payments, increased prospects for return migration, and increases in technology transfers. Thus, it is potentially a ‘win-win-win’ for sending and recipient countries and the migrants themselves.

Changes in the international system with the end of the Cold War have altered this game in several ways. First, it has made defection easier. Since the 1990s, states have had a greater incentive to free-ride by not cooperating with neighboring states in the making of migration and refugee policies. Second, the new configurations of interests and power make it more difficult to pursue a multilateral strategy for managing international migration. In recipient countries, internationalist rights-markets coalitions of the left and the right (for example, civil rights Democrats and business or so-called ‘Wall Street’ Republicans in the United States) have broken apart. Instead, increasing polarization and politicization over immigration and refugee issues have led to nationalist culture-security coalitions of the far left and the far right (for example, job-threatened unionized workers and economic nationalists). Yet liberalization and democratization in formerly authoritarian states have dramatically reduced the transaction costs for emigration. Initially this caused panic in Western Europe, where there was a fear of mass migrations from east to west. Headlines screamed, ‘The Russians are coming!’. Even though these massive flows did not materialize, Western states began to hunker down and search for ways to reduce or stop immigration.
The time horizons of almost all Western democracies are much shorter because of these changes in domestic and international
politics since the end of the Cold War. The terrorist attacks of the 2000s and 2010s have exacerbated these fears, and migration and mobility are perceived by many to pose a threat to national security. If, as seems likely, the United States and the EU defect from international cooperation over migration and refugee flows, such defections would alter the equilibrium outcome, making migration more costly in political terms to all states and to the international community, and the economically virtuous process of increased exchange and mobility would be reversed. International cooperation on migration depends on how the more powerful recipient countries manage migration and whether they will pursue an aggressive strategy of multilateralism. To avoid a domestic political backlash against immigration, powerful liberal states must take the short-term political heat for long-term political stability and economic gain, much as Angela Merkel and Germany did in the face of the refugee ‘crisis’ of 2015–16. However, the asymmetry of interests, particularly between developed and developing countries, and short-term political considerations (countering the rise of the extreme populist right) are too great to permit states to overcome problems of coordination and cooperation. Economic analysis in political science suggests that even as states become more dependent on trade and migration, they are likely to remain trapped in what James Hollifield (1992) calls a liberal paradox, needing to be economically open and politically closed, for decades to come.
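The suasion logic described in this section can be illustrated with a toy two-player game. The sketch below is not drawn from the chapter: the payoff numbers, the size of the transfer, and the function names `pure_nash` and `with_linkage` are hypothetical choices made for illustration only. It assumes a receiving state for which cooperation is a dominant strategy and a sending state that prefers to defect, then shows how a side payment tied to the sending state's cooperation (one simple reading of ‘tactical issue linkage’) can shift the pure-strategy Nash equilibrium to mutual cooperation.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")


def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (receiving_action, sending_action) to a tuple
    (receiving_payoff, sending_payoff).
    """
    equilibria = []
    for r, c in product(ACTIONS, repeat=2):
        # Neither player can gain by unilaterally switching actions.
        receiving_best = all(
            payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in ACTIONS
        )
        sending_best = all(
            payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in ACTIONS
        )
        if receiving_best and sending_best:
            equilibria.append((r, c))
    return equilibria


def with_linkage(payoffs, transfer):
    """Add a side payment from the receiving to the sending state.

    The transfer is paid only in outcomes where the sending state
    cooperates (a hypothetical stand-in for issue linkage).
    """
    linked = {}
    for (r, c), (receiving_pay, sending_pay) in payoffs.items():
        if c == "cooperate":
            linked[(r, c)] = (receiving_pay - transfer, sending_pay + transfer)
        else:
            linked[(r, c)] = (receiving_pay, sending_pay)
    return linked


# Hypothetical ordinal payoffs: (receiving state, sending state).
# Cooperation is dominant for the receiving state; defection is
# dominant for the sending state.
BASE = {
    ("cooperate", "cooperate"): (4, 3),
    ("cooperate", "defect"): (2, 4),
    ("defect", "cooperate"): (3, 1),
    ("defect", "defect"): (1, 2),
}

print(pure_nash(BASE))                     # [('cooperate', 'defect')]
print(pure_nash(with_linkage(BASE, 1.5)))  # [('cooperate', 'cooperate')]
```

With these illustrative numbers the receiving state is also better off after the transfer (2.5 rather than 2), which is one way to read the claim that building a regime can be less costly than case-by-case bargaining. Different payoffs could easily reverse the result.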
Conclusion

The election of Donald Trump to be President of the United States in 2016 poses a great challenge to the economic analysis of politics, specifically rational choice theory. Trump’s unpredictability questions one of the key assumptions of rational choice:
the consistency of each actor’s preference ordering. Is Trump irrational? We suspect that the reason why Trump is unpredictable is that his policy agenda has no basis in strategy but relies instead on social psychology. New York Times columnist David Brooks (2017a) wrote: ‘It’s not clear if Trump is combative because he sees the world as dangerous or if he sees the world as dangerous because it justifies his combativeness. Either way, Trumpism is a posture that leads to the now familiar cycle of threat perception, insult, enemy-making, resentment, self-pity, assault and counterassault’. While many analysts have struggled to identify a strategy behind his erratic pronouncements, it makes more sense to assume that he chooses his policy positions based on a preference ordering that maximizes his ego satisfaction. Even if some in the Trump administration believe rulemaking through multilateral institutions benefits US strategic interests, President Trump will not listen to their advice because he is impervious to strategic arguments and responds only to what satisfies his ego. He also attacks political institutions such as the separation of powers and freedom of speech because those institutions hurt his ego. For many of his supporters, the less civil he is, the more attractive his rhetoric is, as his anti-institutionalist attitude and lack of civility are criticized by those who, he tells his supporters, look down upon them (Brooks, 2017b). Understanding the strong backlash against a liberal, rationalist view of politics may require us to make more room for interpretivist, social psychological, and even Freudian approaches. Perhaps economic analysis in political science constitutes a ‘scientific revolution’ à la Thomas Kuhn (1962), moving the study of politics away from its formal-legal and socio-psychological roots and in the direction of more systematic and falsifiable propositions.
However, the irony is that the studies of economics and politics are moving in opposite directions, with a renewed emphasis on socio-psychological approaches to the study of markets and economic behavior. As our analysis
of trade and migration shows, Trump’s seemingly irrational behavior can be explained best by incorporating psychological factors into his preference ordering. Perhaps rationality and psychology will meet halfway and a true political economy will emerge; but this strikes us as unlikely because the objects and the subjects of inquiry are quite different. In a single essay, we cannot begin to resolve the dispute between rationalists, social psychologists, and institutionalists. We instead fall back on Max Weber, who leaves ample room for rationalist and interpretivist approaches to the study of politics.
References

Acemoglu, Daron, David Autor, David Dorn, Gordon H. Hanson, and Brendan Price. 2016. ‘Import Competition and the Great US Employment Sag of the 2000s’, Journal of Labor Economics 34: S141–S198.
Alden, Edward. 2017. Failure to Adjust: How Americans Got Left Behind in the Global Economy. Lanham, MD: Rowman & Littlefield.
Alden, Edward, and Laura Taylor-Kale. 2018. The Work Ahead: Machines, Skills, and U.S. Leadership in the Twenty-First Century. New York: Council on Foreign Relations.
Aldrich, John H. 1995. Why Parties? The Origin and Transformation of Political Parties in America. Chicago, IL: University of Chicago Press.
Aldrich, John H., and John D. Griffin. 2018. Why Parties Matter: Political Competition and Democracy in the American South. Chicago, IL: University of Chicago Press.
Arrow, Kenneth J. 1963. Social Choice and Individual Values, second edition. New York: Wiley.
Autor, David H., David Dorn, and Gordon H. Hanson. 2016. ‘The China Shock: Learning from Labor-Market Adjustment to Large Changes in Trade’, Annual Review of Economics 8: 205–240.
Axelrod, Robert. 1984. The Evolution of Cooperation. New York: Basic Books.
Borjas, George J. 1990. Friends or Strangers: The Impact of Immigrants on the U.S. Economy. New York: Basic Books.
Borjas, George J. 2003. ‘The Labor Demand Curve Is Downward Sloping’, Quarterly Journal of Economics 118(4): 1335–1374.
Brooks, David. 2017a. ‘A Gift for Donald Trump’, New York Times, February 10. (https://www.nytimes.com/2017/02/10/opinion/a-gift-for-donald-trump.html, accessed February 20, 2019)
Brooks, David. 2017b. ‘When Politics Becomes Your Idol’, New York Times, October 30. (https://www.nytimes.com/2017/10/30/opinion/when-politics-becomes-your-idol.html, accessed February 20, 2019)
Collard-Wexler, Allan, and Jan De Loecker. 2015. ‘Reallocation and Technology: Evidence from the US Steel Industry’, American Economic Review 105(1): 131–171.
Dahl, Robert. 1991. Modern Political Analysis. Englewood Cliffs, NJ: Prentice Hall.
Dixon, William J. 1993. ‘Democracy and Management of International Conflict’, Journal of Conflict Resolution 37(1): 42–68.
Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper.
Doyle, Michael W. 1986. ‘Liberalism and World Politics’, American Political Science Review 80(4): 1151–1169.
Duverger, Maurice. 1954. Political Parties: Their Organization and Activity in the Modern State. New York: Wiley.
Eddy, Melissa. 2017. ‘Bild Apologizes for False Article on Sexual Assaults in Frankfurt by Migrants’, New York Times, February 16. (https://www.nytimes.com/2017/02/16/world/europe/bild-fake-story.html, accessed February 17, 2019)
Facchini, Giovanni, Anna Maria Mayda, and Prachi Mishra. 2011. ‘Do Interest Groups Affect US Immigration Policy?’, Journal of International Economics 85(1): 114–128.
Fearon, James D. 1994. ‘Domestic Political Audiences and the Escalation of International Disputes’, American Political Science Review 88(3): 577–592.
Fearon, James D. 1995. ‘Rationalist Explanations for War’, International Organization 49(3): 379–414.
Freeman, Gary P., and Alan K. Kessler. 2008. ‘Political Economy and Migration Policy’, Journal of Ethnic and Migration Studies 34(4): 655–678.
Friedman, Thomas L. 2000. ‘America’s Labor Pains’, New York Times, May 9. (https://www.nytimes.com/2000/05/09/opinion/foreign-affairs-america-s-labor-pains.html, accessed February 2, 2019)
Geddes, Barbara. 2003. Paradigms and Sand Castles: Theory Building and Research Design in Comparative Politics. Ann Arbor, MI: University of Michigan Press.
Gowa, Joanne. 1999. Ballots and Bullets: The Elusive Democratic Peace. Princeton, NJ: Princeton University Press.
Hainmueller, Jens, and Daniel J. Hopkins. 2014. ‘Public Attitudes Towards Immigration’, Annual Review of Political Science 17: 225–249.
Hollifield, James F. 1992. Immigrants, Markets, and States: The Political Economy of Postwar Europe. Cambridge, MA: Harvard University Press.
Hollifield, James F., Philip L. Martin, and Pia M. Orrenius, eds. 2014. Controlling Immigration: A Global Perspective, second edition. Stanford, CA: Stanford University Press.
Huntington, Samuel P. 2004. Who Are We? The Challenges to America’s National Identity. New York: Simon & Schuster.
Ikenberry, G. John. 2011. Liberal Leviathan: The Origins, Crisis, and Transformation of the American World Order. Princeton, NJ: Princeton University Press.
Jervis, Robert. 1976. Perception and Misperception in International Politics. Princeton, NJ: Princeton University Press.
Keohane, Robert O. 1984. After Hegemony: Cooperation and Discord in the World Political Economy. Princeton, NJ: Princeton University Press.
Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press.
Kymlicka, Will. 1995. Multicultural Citizenship. Oxford: Clarendon Press.
Lake, David A. 2011. Hierarchy in International Relations. Ithaca, NY: Cornell University Press.
Lupia, Arthur. 2016. Uninformed: Why People Know So Little About Politics and What We Can Do About It. New York: Oxford University Press.
Lupia, Arthur, and Mathew D. McCubbins. 1998. The Democratic Dilemma: Can Citizens Learn What They Need to Know? New York: Cambridge University Press.
Lusztig, Michael. 2017. The Culturalist Challenge to Liberal Republicanism. Montreal: McGill-Queen’s University Press.
Martin, Lisa L., and Beth A. Simmons. 2001. International Institutions: An International Organization Reader. Cambridge, MA: MIT Press.
Martin, Philip L. 2015. ‘Economic Aspects of Migration’, in Caroline B. Brettell and James F. Hollifield, eds., Migration Theory: Talking Across Disciplines (3rd Edition, pp. 90–114). New York, NY: Routledge.
Martin, Philip L., Susan F. Martin, and Patrick Weil. 2006. Managing Migration: The Promise of Cooperation. Lanham, MD: Lexington Books.
McDonald, Patrick J. 2009. The Invisible Hand of Peace: Capitalism, The War Machine, and International Relations Theory. New York: Cambridge University Press.
Mearsheimer, John J. 2001. The Tragedy of Great Power Politics. New York: W. W. Norton & Company.
Miller, Claire Cain. 2016. ‘The Long-Term Jobs Killer Is Not China: It’s Automation’, New York Times, December 21. (https://www.nytimes.com/2016/12/21/upshot/the-long-term-jobs-killer-is-not-china-its-automation.html, accessed February 3, 2019)
Mousseau, Michael. 1998. ‘Democracy and Compromise in Militarized Interstate Disputes, 1816–1992’, Journal of Conflict Resolution 42(2): 210–230.
Olson, Mancur. 1965. The Logic of Collective Action: Public Goods and the Theory of Groups. Cambridge, MA: Harvard University Press.
Ottaviano, Gianmarco I. P., and Giovanni Peri. 2012. ‘Rethinking the Effect of Immigration on Wages’, Journal of the European Economic Association 10(1): 152–197.
Page, Benjamin I., and Robert Y. Shapiro. 1992. The Rational Public: Fifty Years of Trends in Americans’ Policy Preferences. Chicago, IL: University of Chicago Press.
Peri, Giovanni, and Chad Sparber. 2009. ‘Task Specialization, Immigration, and Wages’, American Economic Journal: Applied Economics 1(3): 135–169.
82
The SAGE Handbook of Political Science
Peters, Margaret E. 2017. Trading Barriers: Immigration and the Remaking of Globalization. Princeton, NJ: Princeton University Press. Popkin, Samuel L. 1994. The Reasoning Voter: Communication and Persuasion in Presidential Campaigns, second edition. Chicago, IL: University of Chicago Press. Powell, Robert. 2006. ‘War as a Commitment Problem’, International Organization 60(1): 169–203. Putnam, Robert D. 1988. ‘Diplomacy and Domestic Politics: The Logic of Two-Level Games’, International Organization 42(3): 427–460. Ricardo, David. 2015 [1817]. On the Principles of Political Economy and Taxation. New York: Dossier Press. Riker, William H. 1980. ‘Implications from the Disequilibrium of Majority Rule for the Study of Institutions’, American Political Science Review 74(2): 432–446. Riker, William H. 1982. ‘The Two-Party System and Duverger’s Law’, American Political Science Review 76(4): 753–766. Rogowski, Ronald. 1989. Commerce and Coalitions: How Trade Affects Domestic Political Alignments. Princeton, NJ: Princeton University Press. Ruhs, Martin. 2013. The Price of Rights: Regulating International Labor Migration. Princeton, NJ: Princeton University Press. Russett, Bruce. 1993. Grasping the Democratic Peace: Principles for a Post-Cold War World. Princeton, NJ: Princeton University Press. Sandbu, Martin. 2018. ‘Tariffs Are Bad for GM and Bad for America’, Financial Times,
November 28. (https://www.ft.com/content/ db323146-f24d-11e8-ae55-df4bf40f9d0d, accessed January 27, 2019) Scheve, Kenneth F., and Matthew J. Slaughter. 2001. ‘Labor Market Competition and Individual Preferences over Immigration Policy’, Review of Economics and Statistics 83(1): 133–145. Scheve, Kenneth F., and Matthew J. Slaughter. 2007. ‘A New Deal for Globalization’, Foreign Affairs 86(4): 34–47. Scheve, Kenneth F., and Matthew J. Slaughter. 2019. ‘How to Save Globalization: Rebuilding America’s Ladder of Opportunity’, Foreign Affairs 98(1): 98–108. Schultz, Kenneth A. 2001. Democracy and Coercive Diplomacy. New York: Cambridge University Press, 2001. Smith, Adam. 1979 [1776]. The Wealth of Nations. New York: Penguin Books. Spence, Michael. 2011. ‘The Impact of Globalization on Income and Employment: The Downside of Integrating Markets’, Foreign Affairs 90(4): 28–41. Tomz, Michael. 2007. ‘Domestic Audience Costs in International Relations: An Experimental Approach’, International Organization 61(4): 821–840. Tomz, Michael R., and Jessica L. P. Weeks. 2013. ‘Public Opinion and the Democratic Peace’, American Political Science Review 107(4): 849–865. Voeten, Erik. 2005. ‘The Political Origins of the UN Security Council’s Ability to Legitimize the Use of Force’, International Organization 59(3): 527–557. Waltz, Kenneth W. 1979. Theory of International Politics. Reading, MA: Addison-Wesley.
5 Functionalism and Its Legacy
Timofey Agarin
Introduction
Functionalism is a methodological concept explaining social phenomena by specifying an asymmetrical relationship between the two objects under consideration. Functionalism in political science focusses on the process of interaction between political institutions (e.g. parties in electoral competition or during government formation), states (e.g. during regional integration) or international organisations (e.g. when negotiating economic policies), with particular attention to the growing integration of the interacting elements into a politically and thus functionally consistent system. Functionalism rests on the assumption that phenomena can best be explained in terms of what they do and what their impact is on other phenomena: it considers systems of interaction among individuals and groups. A functional definition of functionalism can take the following general form:
Given a system S in a certain state s with a structure T there is an activity a from the point of view of the observer, regularly coming from an element E of T, and having an effect upon S or its environment.
The theoretical status of s can be either stationary (the most frequent case) or dynamic. If one can, with respect to a certain theoretical point of reference, claim a systematic relationship between a and s, then E can be interpreted as a ‘function’. It can also be interpreted as a ‘functional contribution’ to the maintenance of S, whether that means stability, identity, equilibrium or change of S. This explanation of s neither involves an explication of a’s origin nor explains the causal nexus of its effects on s. For example, a political party can be considered as contributing to the working of democracy without construing this function as the cause of its creation. It is important to note that most functional explanations start off from the empirical evidence at hand: one observes s and a, then
one establishes the functional relationship between the two via abductive reasoning from the features of T and E. As a projection of a future s, derived from T as a goal-orientated agent, under the assumption that a’s effect upon S is a functional prerequisite for its maintenance, one can equally engage in predictive reasoning, inferring the best possible explanation for potential outcomes. Where a role-differentiated organisation of parts is observed, a prognosis of this kind becomes normative: it expresses the expectation that S, as a whole, will function as prescribed by its blueprint. However, functional explanations only make sense under several conditions. If one states that general and free elections in parliamentary systems have the ‘function’ of maintaining the circulation of political elites, then one assumes a bottom-up nomination of candidates, and other functions of elections, e.g. giving different groups of the population a representative share in legislation, are not excluded. Functional analyses are based on the assumptions of already existing mid-range theories about social and, by extension, political systems broadly defined, but they are hardly able to generate mid-range theories by themselves. According to the widespread usage of functional explanations, systems can be represented as institutions, religious behaviour patterns, cultures, societies, etc., whereas a functionalist abduction may focus on different kinds of activity: exchange, information, sanction, service, coercion, production and other forms of output from institutions. In political science, functionalism emphasises the functions of political institutions, their interconnectedness and their impact on the political process.
This entry outlines the ideas at the origins of functionalism and describes the main stages in its development (structural functionalism, neofunctionalism, equivalence functionalism), before moving to the main criticisms of functionalism and concluding with its legacies found in political studies today.
Origins
The origins of functionalism can be found in much earlier works of classic social and political theorists. Montesquieu may pass for such an early proto-functionalist thinker. In his Spirit of the Laws (1748) he classified political systems and their legal orders comparatively, as relatives – one may say functions – of their physical, geographic, climatic, mental and cultural ‘nature’, i.e. the structure of their external systems. The respective legal principles activate the natural structures towards an outcome of political order, by which the subjects are reminded of their duties and rights. The privileges of the nobility, for instance, are a function of its freedom, whereas parliamentary power is one of the functions, as functionalists would say, of constitutional monarchy. At the same time, Montesquieu made it clear that important political functions can often be hidden from the common observer behind ostensibly incidental or useless symbols and ways of acting. As such, Montesquieu drew a distinction between effects and functions: the notion that functions are not necessarily the effects of social action later became central to functional analysis. An important predecessor of classical functionalism was Herbert Spencer, who was characterised as the leading ‘social Darwinist’ and spelt out his views in Social Statics: or, The Conditions Essential to Human Happiness Specified, and the First of Them Developed (1851). Spencer tried to explain the development of industrial society in an evolutionist way. According to him, the law underlying societal evolution proceeds from the homogeneity of equal and independent parts to the heterogeneity of unequal and dependent ones. These parts – institutions, groups, technologies, ideas, etc. – tend to grow in specification and, as a result, fulfil increasingly specialised functions in society.
The starting point of this process is population growth involving functional prerequisites like advances in productivity, distribution or
regulation. Social differentiation, functional specialisation and interdependence of parts supply the complex mechanics of the same social evolutionary movement by feeding back on one another. One of the founders of modern sociology, Emile Durkheim, explicitly reflected on the ‘tool’ of ‘functional analyses’ as a methodological programme, distinguishing it from the genetic causal explanation present in the social science of the day. In his works The Division of Labour in Society (1893) and Rules of the Sociological Method (1895), Durkheim argued, pace Montesquieu, that to explain a social phenomenon one has to distinguish its generating cause from the function actually fulfilled by it. It is inadmissible to explain collective phenomena in a utilitarian way because their respective functions can originate from, or serve, different purposes. At the same time, teleological-causal explanations are insufficient because a collective phenomenon such as society does not constitute a consistently acting whole, and more than one purpose might be served by one and the same action. Thereby Durkheim suggested that we ought to understand social phenomena as emergents: these come into being as a result of an accumulation of aggregated individual actions, or of unintended effects of purposeful actions. Thus, any social phenomenon emerges and tends to enhance the importance of its function when individuals recognise the advantage gained from it and develop an interest in optimising the outcomes of similar actions in the future. Overall, Durkheim was one of the first to conceptualise function as a self-reinforcing mechanism and a feedback system, suggesting methodological approaches to untangling the mechanics of action from the context in which it takes place. Durkheim demonstrates this through the division of labour, pace Spencer: the increasing specialisation of occupations and social roles leads towards the growing dependence of actors upon one another.
As a result, human
networks become closer knit and make the exchange of specialised goods easier even outside tightly knit communities. Consequently, organic solidarity, a social order of morals sui generis, comes into being, divorcing individual trust in one another from the context of interactions, which are governed by socially policed regulations. In the course of the process of social differentiation, people experience the secondary benefits of the division of labour, such as the growth of economic productivity or the differentiation of the legal system. Finally, the division of labour results in new forms of production emerging at the point in time t0 with an ever increasing specialisation at the subsequent point in time t1, consolidating and feeding back on social progress with a growing functional complexity of the societal system. Building in part on Durkheim’s observations on organic solidarity and in part reflecting the need for a detailed understanding of social practices, functionalist thinking dominated the ethnographic and ethnological segments of social science that ultimately became known as anthropology. From the Occidental ethnological point of view, there was a strong temptation to interpret ‘unusual’, ‘counterintuitive’ and outright ‘strange’ phenomena by talking up the stability of the social order. Cultural systems served as a central point of reference. Bronislaw Malinowski (1944) and Alfred Radcliffe-Brown (1956), for instance, used this lens to explain the performance of magic rites by the islanders of the Pacific. While Malinowski emphasised the physio-psychological functions of need satisfaction and the fear-reducing function of magic rites, Radcliffe-Brown claimed that magic served to ensure the survival of the group and to preserve the structure of the social system. In Structure and Function in Primitive Society (1956: 394), Radcliffe-Brown maintained that ‘the concept of function applied to human societies is based on an analogy between social life and organic life’.
This paved the way for later efforts to clarify the relationship between the elements of social and political
systems by means of biological analogies. For functionalists, many recurrent social or political activities, e.g. graduation ceremonies or electoral behaviours, function to maintain the stability that is required for the structural integrity of society and/or polity as a system. In his early English Villagers of the Thirteenth Century (1941) and in his classic The Human Group (1950), George Homans presented an interesting proposition for ‘reconciling’ the different approaches by overcoming the one-sidedness of both Malinowski and Radcliffe-Brown. In reference to the magic rites of extra-European societies, Homans opined that magic was initially performed in order to relieve stress and anxiety in dangerous situations for the person performing (such as fishing at sea or child-bearing), to predispose a ghost to the intent of the performer (such as ensuring a plentiful catch or the successful delivery of a child), and to bring about a sense of normalcy, and hence confidence, in the group (provided that the rites are performed in the ‘right’ way by members). As such, magic rites are important group activities which strengthen the bond and solidarity of the group, increasing the chances of the group’s, i.e. the system’s, own survival while overriding the effects of action for individuals. This process instils in its members a fear of punishment for non-compliance through the rites that serve as a means of enforcing acquiescence with the existing norms. Society therefore evolves from and beyond the individual members involved in such an activity that generates and consolidates conforming behaviour towards the norms in place; the violation of these norms would have serious consequences for the group as a whole as well as grave repercussions for its members individually. By combining the psychological and the structural aspects of a functional explanation, Homans follows Durkheim’s proposition that functions are hybrid in nature and ought to be taken into account as such.
Thus, the most significant contribution made by the early functionalist approaches
is in their offer of quasi-theoretical conceptual schemata for conducting research. These followed an experimental design that featured both the ‘relatedness’ and the ‘connectedness’ of conditions, actors and outcomes of action, presuming these to be readily comparable. The two functions feature prominently in the diverse social sciences employing the functionalist approach and relate to the ‘system’ where the processes and their outcomes are nested. However, early functionalism failed to convince the academic community as an analytic approach, setting few, if any, independent standards for the collection of empirical data, while at the same time making no claims to veracity and offering no causal explanations of social phenomena.
Structural Functionalism
Talcott Parsons’ The Social System (1951) is widely viewed as the culmination of classical functionalism in the form of structural functionalism. Social systems, the point of reference for Parsons’ theory, have to solve a limited number of general problems to stay in balance or indeed to maintain their existence. In the context of modern society, ‘function’ refers to the solutions to such general problems. Some of the functions are externally directed: they provide for the adaptation of the system to its environment. Others are internally directed: they are prerequisites for the integration of parts or for actors’ motivations. In his famous AGIL (adaptation, goal attainment, integration, latency) scheme, Parsons specified the four main system problems and related them to the four subsystems, from which proceed the corresponding functions of problem solution. The functions are interrelated by processes of interchanging the special classes of ‘media’ that define their respective outputs, as it were, the material, psychic and social goods of energies in the metabolism of society. These functions are geared towards system stability: ‘Since the
structure of social systems consists in institutionalized normative culture, the “maintenance” of these normative patterns is a basic reference point for analysing the equilibrium of the system. However, whether this maintenance actually occurs or not, and in what measure, is an entirely empirical question’ (Parsons et al., 1961: 37). The conceptually nested series of (in the real world) interpenetrating but (analytically) distinct systems assumes that each constitutes an ordered aggregate of some functionally organised entities. Each has a set of specifiable boundaries, such as system prerequisites and requisites, and system-endemic mechanisms that ensure either stasis or dynamism for the re-stabilisation of a system that is temporarily out of equilibrium. The four system problems are repeated on every subsystem level: these nest subsystems which are functionally structured in the same way across the board. The social system as a whole is vaulted by the cultural system as a total of legitimate values, norms and symbols, while polity is vaulted by and is a subsystem of the social system. In Parsons’ view, the stability of a system at each level is a sign of its ‘consolidation’, not of its inherent conservatism or of a lack of innovation indicating a preference for stability over change. Parsons writes:
The most essential condition of successful dynamic analysis is the continual and systematic reference of every problem to the state of the system as a whole … Functional significance in this context is inherently teleological. A process or set of conditions either ‘contributes’ to the maintenance (or development) of the system or it is ‘dysfunctional’ in that it detracts from the integration and effectiveness of the system. It is thus the functional reference of all particular conditions and processes to the state of the total system as a going concern which provides the logical equivalent of simultaneous equations in a fully developed system of analytical theory. (Parsons, 1949: 21)
One of the constituent parts of a system is adaptation (A) – i.e. system maintenance within the environment – which is a function of the economy, including technology, labour and consumption, and the output of it flows in
the general form of (equivalence of) money. The attainment of goals (G) is guaranteed by the function of the subsystem of polity, which brings about the effects of control in the general form of power. Societal communities contribute to this process of system integration (I) by providing for an optimal climate of morals, solidarity and socialisation as well as by sanctioning the conformity of subsystems with values and norms (e.g. by awarding prestige). Finally, latency (L), or pattern maintenance, typically arises from the fiduciary system. The actors of this subsystem are sources of influence that ensure common interests, standards of professionalism and other forms of trust. The interchange between A and G, for instance, can be manifested by the way that the law and political sanctions regulate the functioning of markets, whereas the economic system enables the functioning of the political system by providing revenues from, for example, taxes. Within the subsystems, social roles – the intersections of structure and actor – interlink their components. They represent the institutionalised expectations towards the occupants of social positions which are derived from the given structure of social differentiation. Overall, the overlap of the subsystem-function dimensions across the segments of the AGIL schema ought to be accepted in order to match a special function with a given part of the system. For several decades, Parsons’ functionalist approach has been applied to various research contexts. However, applying the AGIL model to empirical analyses involved some considerable compromises. Most critiques of Parsons’ approach focussed on the inherent tendency towards a holistic view of society as an ‘actor’ without any specification of the micro-foundations.
This assumes the fiction of a system’s tendency towards an equilibrium, as well as the fiction of the common ‘goals’ of society, and their ‘teleological’ character, which is hard to stomach for anyone lacking a belief in predisposed courses of social actions and their potential outcomes. Equally, Parsons was reproached for having
explained social change too simplistically by referring to interference from the environment or an internal accident, instead of taking into consideration the dynamic features in the system’s structure. Thus, Parsons’ functionalism is most useful for providing explanations in social and political analysis in terms of ‘system maintenance’, where ‘function’ suggests a programmatic orientation for outstanding empirical research of a mostly post-factum character. Parsons, and the functionalists following his dictum, have not been concerned with the ‘how, on what terms and with what consequences’ for each ‘subsystem’ that the connection of their component parts and functions brings. Rather, these scholars believe that functional analysis offers the context in which the answers to relational questions and the solutions to relational problems become ‘meaningful’. This ‘open-ended approach to post-factum analyses’ of social phenomena was robustly criticised by Robert K. Merton in Social Theory and Social Structure (1957). Rather than rejecting structural functionalism in its totality, he suggested revising it on many points. According to Merton, functions in a system have consequences which are intended or recognised; but there are also ones that are neither intended nor recognised. Therefore, if functionalism is ‘expressed in the practice of interpreting data by establishing their consequences for larger structures in which they are implicated’, all consequences of all actions need to be taken into account (Merton, 1957: 56). Merton’s critical innovation lay in the acknowledgement that it is not only observed consequences that explain the adaptation or adjustment (i.e. ‘functions’) of any given system. Equally, he observed, consequences that lessen the adaptation or adjustment of the system (which he referred to as ‘dysfunctions’) play a role in system maintenance. Thus, the ‘manifest’ functions are those objective consequences that are intended to
contribute to the adjustment or adaptation of the system and are recognised as such by participants, while their correlate, ‘latent’ functions, are neither intended nor recognised. These manifest functions arise as a result of the cumulated effects of individual, intended action, typically a case of individual conformity to collective rationality. On the other hand, latent functions arise from the type of collective effects that emerge beyond the control of any individual, and when there is no matching of goals and consequences. Electoral outcomes offer a useful example in this context: they are an effect of multiple individual intended (casting a ballot for a distinct party) and unintended (abstaining from electoral participation) actions, while their manifest function is the political representation of the diversity of citizens’ opinions, and their latent function could be government – either by a single party or by a governing coalition. The distinction corresponds to the difference between ‘functional systems’ and ‘systems of interdependence’ suggested by the French sociologist Raymond Boudon in La logique du social (The Logic of Social Action, 1981). The first are goal-oriented systems of interaction that are regulated on the basis of reciprocal, specialised role expectations and are controlled by means of input–output assessments. Business organisations, universities, political parties, bureaucracies and so on are of this kind. On the other hand, systems of interdependence emerge wherever actors are interrelated, thereby losing (part of their) control over their actions, ultimately becoming involved in a complex structure that is non-transparent in its nature and its consequences. If, for instance, an election contested by three parties produces a clear ranking of votes, one would usually expect the leader of the most successful party, A, to become the next prime minister.
Yet under proportional representation, combined with an established culture of coalition formation, the prime minister could come from the second-placed party, B,
as well, after successful negotiations with party C, which won the fewest votes – on condition that B’s and C’s votes add up to more than 50%. In Merton’s terms, the ‘manifest functions’ of parties are the selection of candidates and the organisation and running of elections around them, yet their ‘latent functions’ do not necessarily follow if they are considered outside the ‘system of interdependence’. In electoral systems, the main feature of this ‘system of interdependence’ is that cabinet formation does not derive directly from the ranking of votes. Thus, the ideal ‘function’ of democratic elections, to provide for a freely elected government, is but a latent function. To understand this structure of emergence, it is necessary to find out the statistical as well as institutional and procedural rules that are interdependently responsible for the composition of collective consequences out of the individual effects of electoral action. Recognising the difficulty of functionally analysing systems of interdependence, including that of ‘society’, Merton was somewhat sceptical of the fruitfulness a functionalist approach could have for empirical research as long as it was not anchored in mid-range theories. He turned away from ‘big theory’, instead pleading for progress in developing less abstract, and less general, ‘theories of the middle range’. As a consequence, he proposed to decompose the general concept of function, and to make a distinction between ‘function’ (in the sense of adaptation or adjustment of a system), ‘dysfunction’ (as a consequence of an action destabilising a system) and ‘nonfunctional consequences’ (irrelevant for the state of a system).
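The arithmetic behind the three-party example above can be made concrete with a minimal sketch. The vote shares and the helper name majority_coalitions are illustrative assumptions, not figures or terms from the chapter; the sketch only shows that, once no party holds a majority alone, the set of viable governments is not determined by the ranking of votes.

```python
from itertools import combinations


def majority_coalitions(vote_shares, threshold=50.0):
    """Return every combination of parties whose combined share exceeds the threshold.

    vote_shares maps a party label to its share of the vote in percent.
    The helper and its name are hypothetical, introduced for illustration only.
    """
    parties = sorted(vote_shares)
    winning = []
    for size in range(1, len(parties) + 1):
        for combo in combinations(parties, size):
            if sum(vote_shares[p] for p in combo) > threshold:
                winning.append(combo)
    return winning


# Assumed vote shares: A ranks first, B second, C third; nobody has a majority alone.
shares = {"A": 45.0, "B": 30.0, "C": 25.0}
coalitions = majority_coalitions(shares)

# Majorities that leave out the plurality winner.
leader = max(shares, key=shares.get)
excluding_leader = [c for c in coalitions if leader not in c]

print(coalitions)
print(excluding_leader)  # for these shares, the only such coalition is ('B', 'C')
```

With these assumed shares, party A cannot govern alone, while B and C together reach 55%, so a prime minister from B is arithmetically possible even though A won the most votes.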
For a functional analysis of greater concreteness and specificity, he raised a number of criteria that, once met, would allow for a more parsimonious approach to empirical studies and easier up-scaling for the purposes of constructing a more encompassing mid-range theory, even if not a general theory per se. These criteria included the following:
• The phenomena under consideration should have an institutional character and be standardised;
• Subjective moments of action should be taken into account, as well as the distinction between positive, negative and neutral consequences;
• The functionally affected units should be specified, as well as the criteria of system maintenance;
• Functional alternatives should be specified with respect to their scope of application;
• Dysfunctions producing change should be identified;
• Comparative studies should be performed to better validate approaches of explanation;
• Features of actors, meanings, places and motivational and overt behaviour that are involved in the pattern of action should be specified.
It is difficult to imagine that this demanding programme of assessing society holistically as a system of interrelated parts could have been executed in a more or less coherent way as a whole. The usual restrictions on personal, material and temporal resources and on theoretical potentialities allowed only for the selective implementation of Merton’s approach in selected studies, which, however, have given inspiration to the segmental approaches in social and political studies on the unanticipated consequences of purposeful action. His focus on the organisation of social relationships, which implicate members of society differently at various points in time, without their always fully knowing what shapes their own thinking, has had a lasting impact on the work of his students, such as the American political scientists James S. Coleman and Seymour Martin Lipset. These have become leading figures in political studies and helped to legitimate the focus on the relational aspects of individual and collective action in politics. The social sciences, however, should be most indebted to Merton for his functional approach to the science of society overall, aptly summarised as ‘the functionally necessary demand that theories or generalisations be evaluated in [terms of] their logical consistency and consonance with facts’ (Merton, 1938: 326).
Equivalence Functionalism and Neofunctionalism
The work of Robert Merton was influential, yet it did not go far enough to shift functionalism towards theories of the middle range, and some notable scholars continued to work on functionalist grand theories of their own. Jeffrey Alexander (1998) and other scholars took the next step in the development of functionalism, calling their approach neofunctionalism. In accord with some of Merton’s suggestions, their revision sought to preserve the main building blocks of Parsons’ ‘analytic model’. However, Parsons’ work had given no explication either of the methodological status of the concept ‘function’ or of functional explanation. This meant that the assertion about the tendency in the system towards ‘equilibrium’ is hard to ascertain empirically. Alexander particularly sought to offset the inflexibility of the systems theory approach by including ‘contingencies’ in individual action: the interrelations between cultural order, as the frame of action, and the contingency of moments of action in themselves require methodological tools from political studies, ethnomethodology, symbolic interactionism, etc. in order for the empirical analysis of subsystems to be integrated with the functional assessment of the system. In so doing, Alexander critically reviewed Parsons’ ‘idealistic view’ that culture and values nest the social order, which in turn vaults the political order, arguing instead for an emphasis on the instrumental dimensions of social action and, as such, following suggestions from the cognate mid-range rational choice theories. Following these criteria, Alexander (1998) analysed change in society rather than societal stability, breaking with Parsons’ structural functionalism. His general contribution was that the established principles of social actions are better able to resolve any acute problem in the process of multistage
iterative interchange between conflictual actors and their practices that leads to the establishment of new institutional structures. This system-level, endogenous innovation can be usefully applied to justify the ongoing process of gradual societal change without the fundamental revision of societal norms already in place. As a result of pressure from this process, competing, contrasting interest groups form and stimulate one another, subsystems and the system as a whole into action, i.e. change. Thus, it is functional to optimise as a result of internal differences, so that the number of alternatives is reduced step by step and a generalisation of public discussion can emerge as a new consensus. If, further on, under the pressure of ‘institutional entrepreneurs’, an established pattern becomes illegitimate, then an alternative wins out and is accepted as final by the actors. At the same time, the general value system is never questioned as such. A similar concern for system stability in and of itself underpins the theory of autopoietic systems put forward by Niklas Luhmann in Soziale Systeme: Grundriß einer allgemeinen Theorie (1984; Social Systems, 1995). Luhmann claimed to have established a universal theory of social systems as well as proposed a new definition of ‘function’ that allowed for a larger scope of interpretations. In his approach, functions have both multiple meanings for different actors, as already suggested by Durkheim, and also an ‘equivalent’ one: functions serve as criteria for the comparison of the actually identified functions with the virtual ones – i.e. some of the theoretically possible relationships in a system under consideration. The openness, indefiniteness and complexity of this concept of function involve a comparative approach to functional explanations, distinguishing particular system levels as well as system–environment differences.
They are viewed as an expression of ‘unity and difference’, indicating contingent structures of the system. Underlying this approach sits Luhmann’s general commitment to forego
the reference to ‘action’ and preference for ‘communication’ in his equivalence functionalism to underline the interrelated and nested nature of system-immanent processes. Luhmann uses the research question on the mechanisms of regulating the shortage of goods to illustrate his approach: at first sight, it is easy to come to understand function as a combination of moral rules and economic mechanisms, in order to compare its particular effects. Yet, the question of why shortages ought to be regulated could only be dealt with in relation to the difference in system and environment (Luhmann, 1995: 404). The interface between the two is therefore what explains the particular function that action plays for specific systems and in specific contexts.
Criticisms of Functionalism
Functionalism as a tradition or a real ‘school’ culminated in the 1950s and 1960s and reflected a progressive move away from behaviourist explanations of political processes. From the beginning, however, critics of functionalism often argued that the model of analysis advanced by its different incarnations suggests something about the societal and political phenomena it studies, but that its explanatory value is limited by the quasi-mechanical, if not outright homeostatic, analogies that functionalist approaches employ to explain political processes. These analogies frequently use terms such as ‘pattern maintenance’, ‘adaptation to the environment’, ‘goal attainment’ and ‘integration’ with little effort made to specify the degree or measure these terms require to keep the system going before it finally fails. It did not help that functionalism’s aspiration for explanation was matched by its commitment to predict systems’ stability without the consequent verification of
the object itself, i.e. defining what stability actually entails in empirical terms. The substantive analogies offered by functional analyses often failed to convince scholars interested in cause-and-effect relationships and served mainly as heuristic devices that were useful for taxonomic purposes, but rarely (if ever) as explanations in themselves. Functionalism delivered many ‘working hypotheses’ intended as instruments for explanation, but never an explanation by itself. One of the central criticisms of the functionalist approach was its overreliance on analogy with the organic, and specifically homeostatic, physiology of political systems. More fundamentally, functionalism’s claim to be a grand theory, made without engaging with or indeed developing any supporting mid-range theories of its own, prevented its further development as a theory able to tackle increasingly specialist questions and offer parsimonious explications of narrowly defined political problems of the real world. The main criticisms launched against ‘grand theorising’, however, were too serious to permit its further development in academia as a ‘school’ rather than as an ‘approach’. Among the problems that weigh heavily on functionalism are the following:
• An inclination to holism, i.e. the view that societies as systems, as well as their properties, ought to be viewed not as collections of parts but as wholes, resulted in the neglect of the effects of action at the micro- and meso-levels on the overall instability of systems. This required a focus on broad social and political processes and an analytical disregard for the varied effects of individuals, civic groups, parties – in short, subsystems – on the persistence of polities and policies;
• An overemphasis on the equilibrium of systems, without explicating the potential for change as a feature of the structure itself.
This resulted in functionalist studies analysing continuities rather than changes and making them less appealing to the study of dynamics in the political system;
• A temptation to look at systems by analogy with organisms – i.e. institutions may in principle ‘function’ like organs of a living being – compounded by a tendency towards teleological assumptions, even with respect to systems whose structures were not created purposefully. Both these issues came into disrepute with the advance of constructivist methodologies, as well as anti- and post-positivist epistemologies, which called for a focus on contingencies in the systems rather than on their immanent certainties;
• Functionalism as a methodological approach that could be empirically tested and contribute to the development of more specific, mid-range theories for political analyses has been significantly challenged by the growing need for policy-relevant explanations of social and political processes, rather than of systems as a whole. Critics consistently pointed out that functionalism’s over-abstraction left unspecified the origins, places and ranges of manifestation of the functions the approach proposed to study;
• A lack of comparative studies for better validation of the functions detected, as well as for the identification of functional equivalences, has been somewhat remedied since the scholarship of Gabriel Almond gained a foothold in political studies. Yet, rather than being generalised into a grand theory in the footsteps of functionalism, many of these studies have retained their place as theories of middle range that explain some political processes but do not claim to offer comprehensive explanations of political systems, or of these as part of the social system.
Functionalism as either an analytic paradigm or a continuously developing research programme has had its day, gradually losing its attractiveness. This was due more to the growing specialisation of social studies than to functionalism’s own shortcomings. Since political studies from the 1960s onwards have become established as a science with a methodological preference for empirically testable theories of middle range, the view of the political system as a whole, made up of interdependent structures and mutually responsive processes that tend to maintain relative stability and distinctiveness, appeared too ‘grand’ to offer a research programme of its own.
Relevance for Political Studies
For some time, functionalism played one of the most stimulating roles in the study of society, bridging the disciplinary boundaries of social and political science. Building particularly upon the work of Talcott Parsons, David Easton and Marion Levy, Gabriel Almond contributed significantly to the popularisation of functionalism by using it in political development studies in The Politics of the Developing Areas (Almond and Coleman, 1960), and by consolidating the approach in Comparative Politics: A Developmental Approach (Almond and Powell, 1966). However, the ‘structural-functional approach’ in comparative political studies per se was relatively short-lived, owing to its limited ability to deliver a rigorous scientific method for an empirical study of some of the pressing problems of contemporary politics. Nevertheless, variants of (neo)functionalism have had an impact in parts of comparative politics. Here, the concept of function primarily refers to the instructional task, concern, responsibility or another aspect of purposeful action done by, and respectively expected from, actors, systems or strategies. Thus, ‘functions’ are understood as intentional rather than emergent processes of collective effects of action. Furthermore, approaches in political functionalism are often based on generalisations of empirical observations rather than on abduction from explicit theoretical or methodological principles. Common to these approaches is the conception that political systems or processes (e.g. international affairs or European unification) can be better controlled by crosscutting institutions (e.g. NGOs, experts or supra-regional networks of scientists) than by these closed systems themselves (i.e. governments operating within the limits of their territorially defined sovereignty).
Processes of political decision chiefly run along the actual activities (‘functions’) of these agents instead of those done by formally legitimated authorities, who only validate the results. Therefore, certain needs
or interests of groups, who are the truly operating agents, have become the focal point of research, making use of ground-up analyses as well as analyses of interrelations. In his 1966 paper, ‘Political Theory and Political Science’, Gabriel Almond provides an example of a kind of functional thinking in the ‘theoretical framework’ for comparative politics. After having subjected a great number of states to comparative analysis, functionalists generate a limited typology of functions, taken as inherent processes of system maintenance, in order to apply these to further cases. Then, the same procedure is applied with respect to the system units that are exercising functions, such as parliaments, executives, courts, bureaucracies and so on. According to their origin and direction, the functions are distinguished in the following way: ‘system functions’ are settled in the domestic environment, including socialisation, recruitment and communication. These are intervening conditions for the ‘process functions’, consisting of interest articulation and aggregation, policy making, policy implementation and adjustment. ‘Policy functions’, as a result, operate as the essential system outputs, subdivided into subsystem-specific ones, such as extraction, regulation and distribution. They react to the process functions, which form an interchangeable set of parameters for analysis of similar dynamic feedback loops to apply to political systems of other states. In order to empirically test their assumptions, functionalists first operationalise concepts to collect empirical evidence; these are then comparatively analysed in relation to the application of their functions in different real polities, i.e. systems; ultimately, conclusions are drawn about the implications of different functions with respect to policy development (e.g. in comparative politics), as well as to the kind of relationships between the states (e.g. in international studies).
Earlier, Almond insisted that to assess the ‘functions’, [w]e mean to include not just the structures based on law, like parliaments, executives, bureaucracies,
and courts, or just the associational or formally organised units, like parties, interest groups, and media of communication, but all of the structures in their political aspects, including undifferentiated structures like kinship and lineage, status and caste groups as well as anomic phenomena like riots, street demonstrations and the like. (Almond and Coleman, 1960: 8, emphasis added)
Beyond the studies on the interface of ‘society’ with ‘polity’, there are two assumptions that underlie this application of functionalism in multiple fields of political science. The first is that all societies share in the performance of a number of crucial political functions and the second is that there exists a core of significant political tasks, analogously interrelated in all societies. The existence of core functions which are political per se allows analyses of their interrelation across systems (i.e. polities that are subsystems of societies). This approach underpins comparative analysis of political systems’ dynamics and stability. Because the input and output sides of functions are interrelated, functionalists can study a significant component of the political process: this component is ‘political culture’, which appears as a function on both sides of the distinct, system-endogenous process as cause and effect of change. For analytical purposes, it can be studied by sampling public opinion for attitudes, beliefs, orientations, cognitions and suchlike at two distinct time-points: when aggregated as a part of the function of the political system and superimposed on political institutions, ‘political culture’ is located on the ‘input’ side of the political chart; at the same time, ‘political culture’ is a product of these same political institutions (or ‘structures’ in the jargon of structural functionalism) and can be located on the ‘output’ side of the political chart, where it is co-constituted with all other components of the system. In other words: as a result of this approach, ‘functions’ are not defined as ex post facto interpretations of relations observed. Rather, they are outcomes of institutions and agendas, which are a priori specified as indicators for data collection on the basis of foreknowledge
that is available to the researcher before sampling begins. Crucially, in response to the group of deterministic theories of politics and structural functionalism, political science has moved to historical institutionalism as one of the most productive approaches to study political phenomena in the long-term perspective. In seeking explanations of distinctly national political outcomes and in the assessment of political inequalities and of the institutional organisation of the polity, historical institutionalism borrowed heavily from functionalism’s view of the polity as a system of interacting parts. Although historical institutionalists accept system complexity and, to a degree, the interdependence of its parts, they do not view the social, psychological or cultural traits of individuals as the parameters centrally determining the political system’s operation. Rather, the institutional organisation of the polity is seen as the principal factor structuring the collective behaviour in society and generating opportunities for distinct outcomes for actors nested in the political system. Therefore, historical institutionalism puts the onus on the ‘structuralism’ of structural functionalism over the ‘functionalism’, which described political outcomes in terms of the system’s own response to system-internal adaptation needs. Historical institutionalists devote a great deal of attention to the central concepts of the school, ‘path dependency’ and ‘critical junctures’, where social and political systems experience external shocks. This builds directly on functionalism’s commitment to explaining stability over change in societies and polities. Thus, many suggestions of classical functionalism entered the mainstream of social and political science, such as the basic distinction between genetic-causal and functional actions, focusing one’s attention on the expected as well as on the unanticipated collective consequences of individual actions.
Foremost, the attention to interrelatedness of the political background, be it institutions, actors or norms, and situational
opportunities to pursue or forego action has been extremely inspirational for contemporary political studies. Despite the shortcomings that resulted from its commitment to assessing rather than analysing the relationships in political systems, functionalism has allowed political studies to view political interaction as something more than just simple and situational interpretations of actors’ and systems’ components. Being a ‘grand theory’, it offered a comprehensive set of impulses to think of the political system as a whole nested in its social environment, without precluding greater attention being paid by others to subsystems of politics, opening space for many mid-range theories with a more specific focus.
References
Alexander, Jeffrey C. 1998. Neofunctionalism and After. London: Blackwell.
Almond, Gabriel A. 1966. ‘Political Theory and Political Science’. American Political Science Review 60 (4): 869–79.
Almond, Gabriel A. and James S. Coleman. 1960. The Politics of the Developing Areas. Princeton, NJ: Princeton University Press.
Almond, Gabriel A. and Bingham Powell. 1966. Comparative Politics: A Developmental Approach. New York: Little, Brown.
Boudon, Raymond. 1981. The Logic of Social Action: An Introduction to Sociological Analysis. Boston: Routledge & Kegan Paul.
Durkheim, Emile. 1938 (1895). The Rules of Sociological Method. Chicago: University of Chicago Press.
Durkheim, Emile. 1997 (1892). The Division of Labour in Society. New York: Free Press.
Homans, George C. 1941. English Villagers of the Thirteenth Century. New York: W.W. Norton.
Homans, George C. 1950. The Human Group. New York: Harcourt College Publishers.
Luhmann, Niklas. 1995. Social Systems. Stanford, CA: Stanford University Press.
Malinowski, Bronislaw. 1944. A Scientific Theory of Culture and Other Essays. Chapel Hill, NC: University of North Carolina Press.
Merton, Robert K. 1938. ‘Science and the Social Order’. Philosophy of Science 5 (3): 321–37.
Merton, Robert K. 1957. Social Theory and Social Structure. New York: Simon & Schuster.
Montesquieu, Charles-Louis de Secondat. 1748. The Spirit of the Laws.
Parsons, Talcott. 1949. The Structure of Social Action. Volume 2: Weber. New York: Free Press.
Parsons, Talcott. 1951. The Social System. New York: Free Press.
Parsons, Talcott, Edward Shils, Kaspar D. Naegele and Jesse R. Pitts. 1961. Theories of Society: Foundations of Modern Sociological Theory. New York: Free Press.
Radcliffe-Brown, Alfred R. 1956. Structure and Function in Primitive Society. Free Press Edition.
Spencer, Herbert. 1851. Social Statics: Or, the Conditions Essential to Human Happiness Specified, and the First of Them Developed. J. Chapman.
6 Feminist Political Science Marian Sawer
A Short History
The normative underpinnings of feminist political science date back to the French Revolution and Mary Wollstonecraft’s A Vindication of the Rights of Woman (1792). This first major challenge to the exclusion of women from political citizenship was followed by William Thompson’s Appeal of One Half of the Human Race (1825), a book he described as the ‘joint property’ of himself and the Owenite feminist Anna Wheeler. By the middle of the 19th century John Stuart Mill, in collaboration with Harriet Taylor, was further developing the argument for why women needed the vote. The Subjection of Women (1869) described the abuse of male power in the family as evidence for the claim that women could not rely on men to represent their interests; for this they needed the vote. Moreover, the law had allowed the family to be a ‘school of despotism’, inculcating habits of mind incompatible with free and equal citizenship.
While the normative arguments for women’s political equality inspired the women’s suffrage movements of the late 19th and early 20th centuries, feminist political science only dates from the arrival of what is often called the ‘second wave’ of the women’s movement. Before this renewed mobilisation of women in the late 1960s, political science largely assumed that the absence of women from public life was a condition rather than a problem. While women had achieved full political rights in most democracies, it was still expected that their citizenship duties would be fulfilled mainly in the home. When Maurice Duverger undertook the first cross-national survey-based research on women’s political participation, under the auspices of the International Political Science Association, he noted that one of the difficulties was that political scientists asked to provide information often regarded its purpose ‘as a secondary one, of no intrinsic importance’ (1955: 8). From its beginnings in the 1970s, feminist political science introduced into the
discipline the values, cognitive frames and organisational philosophy being articulated by the women’s movement. It was critical of what was seen as the complicity of political science in keeping politics as a male domain (Bourque and Grossholtz, 1974: 225). Moreover, it challenged the way that political science had restricted the definition of politics to formal political institutions and had taken male politics and male political behaviour as the norm. Because political science had regarded women as a group as politically irrelevant it had not taken interest in the forms taken by women’s political activity, the sources of women’s subordination or the gendered nature of power.
Basic Theories and Concepts
One of the distinguishing characteristics of feminist political science has been its normative commitment to producing knowledge that will advance gender equality. In other words, there is a commitment not only to sharpen the focus of the discipline by introducing a gender lens, but also to contribute to the goal of greater social and political equality. This normative commitment is at odds with a behaviouralist emphasis on ‘value-free’ political science. Feminist political science emerged at a time when behaviouralism was largely determining (at least in its North American hub) what was regarded as authoritative knowledge production in political science. The explicit articulation of standpoint was regarded as exhibiting a lack of objectivity and hence feminist political science was discounted as a contribution to knowledge. In contrast, feminist standpoint theory argued for the epistemological advantages arising from a subordinate if essential position in the social division of labour (Hartsock, 1987). Those engaged in the marginalised work of caring have to negotiate the conflict between their own versions of reality and those of dominant groups, providing
new insights and knowledge. In terms of the research process, feminist political science has emphasised the need for reflexivity about power relations and the values the researcher brings to their research – sometimes referred to as the ‘feminist research ethic’. Reflexivity may entail disclosure of embodiment and standpoint, including lived experience of discrimination and marginalisation but also of relatively advantaged locations. The normative component and emphasis on reflexivity are unifying characteristics of feminist research, which otherwise now varies widely in its research methods, often combining quantitative and qualitative approaches. The one approach that has not been taken up is that of rational choice; the rational choice assumption, borrowed from economics, of autonomous utility-maximising individuals has seemed incompatible with gendered understandings of political life. Moreover, it is important to recognise that not all ‘gender’ research should be characterised as feminist political science: for example, sex-disaggregated research into electoral behaviour may lack any theoretical connection to the ‘collective endeavour of feminist scholarship’ (Ackerly and True, 2013: 137). On the other hand, such research may be theoretically informed and very aware of its ‘usefulness’ – for example the work of Ronald Inglehart and Pippa Norris (2000) on the development of the ‘modern gender gap’ whereby women were shifting leftwards of men in advanced industrial societies.
The Public/Private Divide
From the beginning, feminist political theory has challenged the traditional public/private divide that defined the limits of the public realm and the domain of political knowledge. Political science had inherited the public/private divide of classical liberal theory that separated the realm of politics and public regulation from the private realm of the family and civil society. The feminist
challenge preceded the arrival of the second wave of the women’s movement, having been central to ‘feminist writing and political struggle’ for some two centuries (Pateman, 1998: 281). Feminist theorists drew attention to the interdependence of the public and private and the political nature of the power relations found in the private sphere. John Stuart Mill, for example, was outspoken in The Subjection of Women about the inequality of the marriage contract upheld by both statutory and common law; its consequences were the abuse of male power in the form of domestic violence and marital rape. Marital inequality was also harmful as a preparation for democratic citizenship; children needed to experience relationships of equality within the family if they were to relate as equals outside. Domestic tyranny could prepare neither men nor women for democracy. This theme, of the need for egalitarian family relationships as the basis of the public virtues of citizenship had also been adumbrated by Mary Wollstonecraft (1792). The arrival of the second wave of the women’s movement saw the sharpening of the challenge to the public/private distinction under the widely used slogan ‘the personal is political’. Originally this was a title given to an essay by Carol Hanisch that was published in Notes from the Second Year: Women’s Liberation (1970). In the essay, Hanisch contested the characterisation of women’s liberation consciousness-raising groups as a form of therapy and argued that on the contrary they were a form of political action. Women’s liberation members had been ‘belittled’ for bringing so-called ‘personal problems’, like body issues or demands that men share the housework and childcare, into the public arena. Rather, she said, ‘One of the first things we discover in these groups is that personal problems are political problems. There are no personal solutions at this time. There is only collective action for a collective solution’ (Hanisch, 1969/2006). 
Feminist critique of the public/private distinction was associated with a critique of other
gendered dichotomies such as that between reason and emotion. Reason had been associated with the masculine public sphere, while intimacy and emotion were associated with the private sphere in which women were located. Such distinctions obscured the role of emotion in political life while at the same time suggesting that women were not suited for public roles. Moreover, the association of emotion with the private sphere was increasingly displaced by the recognition of how emotional labour was being required in many areas of paid work, particularly in female-dominated occupations such as flight attendants (Hochschild, 1983). Pateman continued to engage with the complexities of the public/private divide in classical liberal theory, where a public/private dichotomy, based on the state and civil society, had concealed the family within the latter. In her powerful revisioning of the story of the social contract and the genesis of the liberal state she argued that hidden in the social contract was a sexual contract. While patriarchal power was overthrown in the public realm and civil society was created, the fraternal contract upheld the power of men over women in the private realm (Pateman, 1988). While critical of the way in which the public/private dichotomy had shielded the structures of women’s oppression from public scrutiny, feminists had differing views concerning values that might be threatened by state interference with the private realm. For example, Jean Bethke Elshtain was wary of the ‘overpoliticization’ of intimate relations and ‘the total collapse of public and private as central categories of explanation’ (Elshtain, 1981: 217). On the other hand, the late Heather Brook examined the many forms of the regulation of intimacy to be found in the ‘conjugal body politic’ and the continuing role of marriage and marriage-like relations in producing political subjects (Brook, 2002).
Nonetheless, making the family the object of political analysis was to have a global policy impact as we shall see below in relation to domestic violence policy. Today violence
against women is the issue most often cited by Organisation for Economic Co-operation and Development (OECD) countries as their top gender equality issue (OECD, 2017: 85).
Feminist Institutionalism
Methodologically, extending the scope of political science to include the ‘private’ realm of emotions and intimate relations has meant drawing on different disciplines and research methods. One approach from within the discipline that has proved useful for this purpose is that of new institutionalism. Concepts derived from new institutionalism such as path dependence, critical junctures and the importance of timing and sequence were now given gendered constructions, as was the idea of institutional logics of appropriateness. The emphasis on the significance of informal rules and norms in shaping expectations and practices illuminated the gendered logics at work in different kinds of political, governmental and judicial institutions, as well as the institutions of everyday life (Chappell and Mackay, 2017). Institutional ‘stickiness’ and the resilience of hidden practices helped explain the subversion of institutional change, despite the introduction of new formal rules (Curtin, 2019: 125). This feminist engagement with new institutionalism gave rise to a new theoretical framework known as ‘feminist institutionalism’, which sought to identify how the interplay of formal and informal rules and the embedded nature of gendered codes of behaviour affected the possibilities for institutional change. The concept of ‘nested newness’ helped explain why even where feminists had helped design new political institutions, overarching institutional and cultural legacies could reassert their own logic of appropriateness (Mackay, 2014). Building on the emphasis on informal rules has been rich ethnographic observation of the rituals through which politics is performed. Such observation has illuminated the experience of those who are political outsiders in
terms of gender or other forms of difference and what happens when they enter public spaces such as parliaments (Puwar, 2004). Attention to the rituals and symbols through which politics and political representation is performed also highlights its emotional dimension and its intertwining with other aspects of culture and the social order (Rai and Spary, 2019: 14–15).
Gender-Responsive Budgeting
In terms of public policy, the contribution of feminist scholarship has been of two major kinds. First, since the 1970s there has been strong development in the applied analysis of gender impact on policy and budget decisions. This started from the premise that no policy or spending decision is likely to be gender-neutral in its effects, given the very different location of women and men in the social division of labour and the skewed distribution of paid and unpaid work. Ester Boserup’s book Woman’s Role in Economic Development (1970) was a foundational work, leading aid agencies to start moving away from the assumption of gender neutrality in development assistance. It compared the differing roles played by women in agriculture in different continents and how women’s work in subsistence agriculture and household production was left out of official statistics, leading it to be overlooked and undervalued. It showed that development policies designed to assist the movement of men into cash crops in Africa could result in increased burdens on the women left behind in the subsistence sector. In many countries, public policy had also been based on untested assumptions of pooling of income in the household. From the 1970s feminist political scientists contributed both basic and applied research on how to embed gender-based analysis into public policy making and to create national accounts that were more inclusive of non-market work and environmental values (Waring, 1988). From the time of the UN’s First World Conference
on Women in 1975 the World Plan of Action promoted national machinery to ensure the gender impact analysis of policy, and from the 1980s there was the adoption and diffusion of what became known as ‘gender-responsive budgeting’. This involved close analysis of the gender effects of budgetary decisions in order to pre-empt decisions that would increase gender inequality, by overlooking the gender segregation of the labour market or the relationship between paid and unpaid work. From the 1990s the embedding of gender analysis into the machinery of government became known as ‘gender mainstreaming’ and feminist scholars contributed valuable insights into the political opportunity structures, women’s movement strategies and evolving international and regional norms that contributed to successful mainstreaming (for example, Chappell, 2002; Walby, 2005). Research into political opportunity structures included the gendered opportunities and constraints posed by federalism and multilevel governance, meaning that political architecture was now conceived as extending above, beyond and below the analytic frame of the nation state (Haussman et al., 2010). While in the 1970s and early 1980s there had been lively debates over whether it was possible to promote a transformational agenda through patriarchal institutions, by the 1990s there was no longer much support for the view that state institutions were governed by a unitary logic. Instead, there was increasing recognition of the possibilities created by feminist institution-building within national and transnational institutions and the role of ‘inside agitators’ or ‘femocrats’ (Eisenstein, 1996; Banaszak, 2010). The term ‘femocrat’, invented as a term of abuse for feminists who took up women’s policy positions in Australian government, subsequently entered into international usage in a more neutral sense, to refer to feminists in government, whether in women’s policy positions or elsewhere (Sawer, 2016). 
The rise of feminist international relations contributed, as we shall see, to greater awareness
of the leverage for domestic actors provided by norm building within transnational institutions, while large-scale comparative research on women’s policy agencies highlighted the factors which made such agencies effective or ineffective in transmitting women’s movement policy frames into government. An important aspect of the political opportunity structure was that of discourse, and feminist scholars began paying increased attention to discursive struggles and the effects of policy framing in closing off or opening up policy options (Lombardo et al., 2009). For example, if domestic violence is framed in terms of family dysfunction or problematic individual behaviour it results in a very different policy response from the one that results when it is framed in terms of gender inequality (Johnson, 2019: 199). Similarly, policy responses to breastfeeding are likely to be very different depending on whether it is framed as a contribution to the economy and public health or as a lifestyle choice. Discursive framing has been analysed not only for its effects on policy choices, but also for its strategic role in social movement policy influence. A good example of a successful discursive strategy was the reframing of the absence of women from public decision-making as a democratic deficit (Sawer, 2019). A good example of unsuccessful policy framing was the women’s movement demand in the 1970s for ‘free community-controlled 24-hour childcare’. Intended to exhibit sensitivity to the needs of low-paid shift workers who needed childcare outside normal business hours, this slogan instead contributed to perceptions that those involved in women’s liberation were unnatural and wanted to get rid of their children.
Major Advances, Ongoing Debates, Critical Assessments

Key concepts that have informed much feminist political science and are the focus of attention here are those of descriptive and
substantive representation and the relationship between these. In the 1960s Hanna Pitkin had distinguished systematically between forms of representation in terms of ‘standing for’ (descriptive or symbolic representation) or ‘acting for’ (substantive representation) (Pitkin, 1967). While the justice arguments for women to have equal political rights played an important part in the suffrage movement, they were often coupled with expectations that women would make a difference to the masculine norms of politics. When the demand for equal representation of women in public decision-making began bearing fruit in the 1990s there was sometimes disappointment that it did not bring about more change in political culture. Pioneering work on the barriers to the legislative recruitment (descriptive representation) of women began appearing in the 1970s and a sophisticated supply-and-demand framework was developed by Pippa Norris and Joni Lovenduski (1995). This was applied by Norris and Mona Lena Krook (2014) to the new institutional context introduced by quotas, to examine the effects on both supply (willingness of women to run for office) and demand (constraints and incentives for gatekeepers). Other work adopted a feminist institutionalist framework to emphasise the interaction between formal and informal rules in institutions of recruitment and the gendered mechanisms of continuity and change (Kenny, 2013). The fall in the early 1990s in the average representation of women in national parliaments shook any assumption of an inevitable if gradual increase in the participation of women in public decision-making. The advent of democracy in post-communist states could not be guaranteed to advance the political equality of women or even to maintain existing levels.
Thanks to the successful discursive strategy already noted, the absence of women from public decision-making became a major concern for the proliferating democracy assistance programmes designed to assist transitional democracies.
The Inter-Parliamentary Union (IPU) had already institutionalised its continuous monitoring of the representation of women in national parliaments and this became a key indicator in the assessment of the quality of democracy. Its convenience meant it was also seized upon for the gender equality indices developed by international standard-setting institutions such as the UN Development Programme (UNDP) and the OECD. While the representation of women’s interests was sometimes cited as an additional indicator of the quality of democracy (Lijphart, 1999: 314–18) this was conceptually complex and much more difficult to operationalise than counting the number of women in parliament and the Cabinet. Nonetheless, feminist political scientists had begun investigating carefully the relationship between the descriptive presence of women in politics and substantive outcomes including the representation of women’s interests. Women’s interests were initially conceptualised as arising from women’s specific relationship to reproduction and the division of labour in private life. It was forcefully argued that these were political interests despite arising from outside what had been deemed the political sphere (Sapiro, 1981). Some made the rather different case that the presence of women and minorities in legislatures was needed to ensure that a diverse range of life experience, including experience of group-based discrimination, was brought to bear on public issues (for example, Phillips, 1995; Mansbridge, 1999). Iris Marion Young (2000) also stressed the importance of representing perspectives arising from the history of social group relations and structural oppression. She suggested that those belonging to a social group might share a perspective arising from social location while having differing ideas and interests; it was important for democracy to ensure the inclusion of such under-represented perspectives. 
Earlier, Young had written about how women convert from being a serial collective, defined by structural
constraints such as enforced heterosexuality or the sexual division of labour, to being a social collective or group, defined by a shared project such as changing those constraints (Young, 1994). Another approach was that of the French concept of parité, which suggested that the representation of women in politics was important not for the representation of different interests or perspectives, but because the abstract individual or republican universalism was universally either male or female. Hence, parity was required to preserve the principle of universalism rather than to represent a multiplicity of interests, as quotas were seen to do (Tremblay, 2018: 146–9).
Critical Mass

One explanation for the failure of newly elected women to change the nature of politics centred on the concept of ‘critical mass’. Rosabeth Moss Kanter (1977) had published a very influential work exploring the effects of relative proportions on group behaviour within corporations. She found that where there was only a token presence of a group there was overwhelming pressure to overcome the distrust associated with being visibly ‘different’ and to survive the additional scrutiny involved. Such pressures could lead to over-assimilation to dominant norms and dissociation from other members of the minority group, in the desperate attempt to be accepted. It could also affect both perceived and actual performance. Kanter suggested such pressures diminished as minorities grew larger and that when they reached somewhere between 15 and 40% they were able to exercise greater influence on the organisation. Kanter’s work prompted wide discussion among feminist activists. For example, in debates over why elected women were not ‘making a difference’ a proportion of 30% was suggested, as the minimum required before such impact was likely to occur. This proportion was labelled
‘critical mass’ – a term taken from nuclear physics and indicating the quantity that would trigger a chain reaction. Danish political scientist Drude Dahlerup was the first to explore the relevance of Kanter’s work and the related concept of critical mass to the role of women in parliamentary politics. She concluded by querying the relevance of the critical mass concept to the social sciences and suggesting that the willingness and ability of members of minority groups to engage in critical acts was perhaps more important than the mechanical effect of numbers (Dahlerup, 1988). She was also critical of what she called the ‘difference fallacy’ – the assumption that a key measure was whether differences could be identified in the political priorities of women and men. She pointed out that women’s influence might have changed party agendas and hence the priorities expressed by both men and women. Her criticisms have been taken up by others, who have emphasised the significance of critical actors and enabling institutional contexts, regardless of relative numbers. Nonetheless, the idea became popularised that when women move from being a small to a large minority in parliament constraints would be lessened and they would find it easier to engage in acts of substantive representation. This became an important argument for gender electoral quotas and was endorsed in international jurisprudence. For example, in 1997 it was reflected in a General Recommendation on Article 7 of the UN Convention on the Elimination of All Forms of Discrimination against Women:

Research demonstrates that if women’s participation reaches 30 to 35 percent (generally termed a ‘critical mass’), there is a real impact on political style and the content of decisions, and political life is revitalized. (CEDAW, 1997: para 16)
Gender Electoral Quotas and Policy Diffusion

A large quota literature emerged to evaluate quota strategies and the factors needed to
make them effective. This included the fit between quotas and electoral systems and the use of appropriate incentives and sanctions (for example, within the public funding framework) to increase the number of women candidates put forward by political parties. Caps on campaign expenditure and political donations could also help level the playing field for women candidates. Another issue that gained increasing attention was whether ‘quota women’ were less qualified or more dependent on male political leaders. The early leaders in the development of this literature were Drude Dahlerup (2006) and Mona Lena Krook (2009) and both were also involved in the setting up of the global database capturing the diffusion of legislative and other forms of quotas around the world (see ‘Empirical Databases’ at the end of this chapter). The rapid diffusion of electoral gender quotas (adopted by some 140 countries by 2018) helped inspire the growing feminist literature on transnational norm diffusion. This had begun with observation of the similarly rapid diffusion of the concept of gender mainstreaming, following the Fourth World Conference on Women at Beijing in 1995 (Keck and Sikkink, 1998; True and Mintrom, 2001). An international movement embodied in feminist officials and networks and drawing on feminist political science had been able to promote gender mainstreaming as an important element in evidence-based policy or of the modernising of public administration. Observation of such diffusion became an important part of feminist innovation in both international relations (IR) and political science. In particular, this new literature emphasised the dynamic nature of norm diffusion and the role of both transnational advocacy networks and local civil society actors in the process of norm evolution. The role of transnational advocacy networks and feminist officials in institutionalising norms at international and regional levels reinforced efforts by local actors. By moving away from
the notion of fixed content of norms, this new scholarship drew attention to how local actors adapted ‘the meaning of a norm to fit with prior norms and identities’ (True, 2019: 138). Feminist studies of policy diffusion threw light not only on the rapid spread of gender mainstreaming agendas around the world, but also the rapid diffusion of other policy agendas including electoral gender quotas, violence against women policy and climate change strategies. Special attention was given by IR scholars to the dissemination of the UN Security Council’s Women, Peace and Security policy agenda, particularly UN Security Council Resolution 1325 with its emphasis on the participation of women in post-conflict peace-building processes. In addition to the burgeoning feminist literature on policy diffusion and strategies to increase the descriptive representation of women in public decision-making, extensive empirical investigations were carried out into whether or not women do make a difference to legislative agendas. There were findings that women parliamentarians brought subjects onto the public agenda previously regarded as belonging to the private realm, such as gender-based violence, and that their policy priorities often reflected the priorities arising out of the everyday life of women, such as health and social policy (Wängnerud, 2009). Unsurprisingly, women parliamentarians have also been found to be more concerned about gender equality issues and representing women than their male colleagues. On the other hand, policy differences among legislators usually owe more to partisan than to gender identity and it is in the shaping of party policy programmes or participation in the core executive that feminist policy interventions may be most important (Annesley and Gains, 2010).
Women’s Interests

The issue of whether women legislators made a difference gave rise to a large
literature on ‘substantive representation’, the type of representation that involves ‘acting for’ rather than just ‘standing for’ (Pitkin, 1967). One difficult concept bound up with the concept of substantive representation was that of women’s interests. Substantive representation was generally viewed in terms of representing women’s interests although differing indicators of substantive representation were employed. Some operationalised it as congruence with gender differences in public opinion (Campbell et al., 2009) while others used indicators such as responsiveness to organised women’s movements and their policy framing. There were long debates over whether women’s interests had been defined too much in essentialist terms, relating to child-bearing and location in the social division of labour. Some saw these socially allocated caring roles as giving rise to a distinctive ethic of care or maternal ethics, to be contrasted with the abstract rights and duties of a justice-based ethics (Gilligan, 1982; Ruddick, 1989). This was part of the emphasis on standpoint in feminist theory and the epistemological advantages stemming from marginalised locations in the social division of labour. Many local women politicians have seen their caring roles as placing them at a disadvantage in political institutions geared to ‘privileged irresponsibility’, but also as a source of desirable skills and values (Mackay, 2001: 202). Caring roles are commonly associated with a less adversarial and more consensus-seeking approach to politics. An ethic of care suggested that these values, usually associated with the private sphere and with women, should be located at the heart of the public world of politics and be central to democratic citizenship. However, while the ethic of care presented the possibility of revaluing women’s work and moral perspectives, it may also reinforce stereotyped gender roles and be a source of burdensome expectations of moral superiority.
Rather than conceptualising women’s interests as being pregiven and arising from
distinctive caring roles, some suggested they should be seen in more dynamic terms as constituted in a political process of claims-making. As we shall see, Laurel Weldon (2002) argues that it is only when women come together in collectivities that common interests and claims become crystallised. She emphasised that the substantive representation of women needed to stem from group organisation and the development of collective perspectives rather than from the isolated experience of individual legislators. Others were producing evidence of a related point: that it was not the gender of legislators but openness to the women’s movement and feminist consciousness that was of more importance in substantive representation (Tremblay and Pelletier, 2000). Another challenge was whether the concept of women’s interests was able to encompass how such interests intersected with other identities relating to race, class, disability, ethnicity or other attributes. The concept of ‘intersectionality’ was introduced by Kimberlé Crenshaw (1989) to bring into focus the distinctive experience of African American women and the intersecting identities and privileges that complicate gendered power relations. This presented an apparent complication for the gender mainstreaming strategies adopted by governments. While attention had been paid to special issues faced by, for example, Indigenous or immigrant women, the primary focus was to track overall gender gaps and direct attention to them. Women with disabilities might well complain that findings of a general upward trend in labour force participation by women overlooked the very different statistics for their own participation. In a slightly different approach, Nancy Fraser highlighted the issue of class in her influential work distinguishing between the politics of recognition and the politics of redistribution.
The message was that claims for recognition based on nationality, race, gender, sexuality or ethnicity (‘identity politics’) should not ignore the claims of class
and the redistributive aspect of justice, particularly in the context of widening inequalities (Fraser, 1995). A concern for the recognition of ‘difference’ was no substitute for a concern with inequality. The concept of intersectionality was quickly adopted by gender experts within national and transnational governance institutions as well as gender scholars. From at least 1997 EU policies were ‘stretching’ the concept of gender equality to include other inequalities. Scholars and gender practitioners started referring to gender mainstreaming as policymaking from a ‘gender+’ perspective, to acknowledge that ‘other axes of inequality always intersect with gender’ (Lombardo et al., 2017: 2). In Canada the gender-based analysis (GBA) required in the federal bureaucracy was rebranded as ‘GBA+’ in 2011, to highlight the intersection of ‘gender and other diverse identity factors’. While identity groups might be at least co-constituted by such policy recognition, in an era of neoliberalism such recognition was not generally extended to the underclass. A third challenge relating to the concept of women’s interests emerged from a shift to the more flexible concept of gender, encompassing a range of sexual identities as well as the relationship between men and women. Judith Butler (2004) argued strongly for gender to be understood as performance rather than as the stable binary identities required by heteronormativity. If gender and sexuality were understood as performative, then concepts of gender identity and related interests become more complex. The new concept of gender as a spectrum raised questions over the mobilising of a collective identity as women, which had been the basis of women’s movements over time. It also drew attention to the way that strategies to increase the electoral representation of women might not necessarily favour the representation of those with diverse sexual identities.
As Manon Tremblay pointed out, while proportional representation was generally viewed as favouring the representation of women, it might be that given their
concentration in inner-urban areas, LGBT people were better served by single-member constituencies (Tremblay, 2019).
The Vectors of Substantive Representation

In addition to debates over the concept of women’s interests came new debates over the primary vector for the substantive representation of women. The international Research Network on Gender Politics and the State (RNGS) undertook a large-scale comparative examination of the role of women’s policy agencies in providing women’s movements with access to the policy process (McBride and Mazur, 2010). This research project was conducted over a period of 15 years, combining quantitative and qualitative elements in its research design and giving rise to a comprehensive dataset. The dataset covered the strength of women’s movements, their partnerships with women’s policy agencies, the policy environment, issue frame fit and left-wing party and union support. The role of women’s policy agencies was measured in terms of mediation of the policy frames and policy demands arising from society-based women’s movements. The assumption was that representation of women came about through government responsiveness to policy claims arising from autonomous women’s movements. The RNGS framework did not encompass frames generated by the collective work of ‘feminist insiders’, whether in the bureaucracy, political parties or parliaments. Another important contribution was Laurel Weldon’s ‘Beyond bodies: Institutional sources of representation for women in democratic policymaking’ (2002). Weldon was building on her first large-scale comparative study of violence against women policy which had found that women’s policy agencies in consort with strong women’s movements provided a more effective voice for women in policymaking than did individual women legislators. She sought to turn attention away
from the role of individual legislators, and towards the role of collectivities such as women’s movement organisations acting in conjunction with women’s policy agencies. In doing this, she was casting a critical eye on the idea that the substantive representation of women primarily came about through the presence of women in parliament. Subsequently, Mala Htun and Laurel Weldon (2012) expanded the initial study of violence against women policy to cover 40 years and 70 countries (including the Global South). They used panel data to show that it was the strength of autonomous women’s movements that led to comprehensive policy responses and to the institutionalising of feminist ideas in global and regional norms. They measured the strength of women’s movements by reference to organisations, protest events and public opinion and also looked at how women’s policy agencies could add to the work of women’s movements. Their dependent variable was the degree of policy scope. They again found factors such as number of women in the legislature to be relatively insignificant compared with the strength of the independent women’s movement and the leverage provided by international and regional norms. Research into the policy impact of autonomous women’s movements and its relationship to women’s agencies in government took the exploration of the substantive representation of women well beyond the role of women legislators. However, its inheritance from the social movement literature often meant an assumed dichotomy between movements and institutions. This gave rise to a new set of questions and some critique of the ontological privileging of autonomous women’s movements over various forms of ‘insider advocacy’ (for example, Banaszak, 2010). Insiders were not necessarily less radical in their thinking than those located in an external social movement, just facing a different set of constraints. 
Laure Bereni, for example, introduced the concept of ‘institutional sites of advocacy for
women’, arguing that women’s policy agencies, women’s sections in political parties, gender equality bodies in parliaments and gender research centres could all be viewed as fully-fledged components of the women’s movement rather than as its allies (Bereni and Revillard, 2012: 33–4). Another approach to conceptualising the substantive representation of women which does not accord ontological primacy to autonomous women’s movements is that of ‘velvet triangles’. The concept of velvet triangles, deriving from the ‘iron triangle’ literature on the relationship between parliament, bureaucracy and interest groups, gives equal weight to insider and outsider strategies on the part of the women’s movement. It looks at the informal networks linking, for example, ‘femocrats’ in the European Commission, feminists in the European Parliament, women’s NGOs and gender researchers. The research question is whether such informal networks and relationships of mutual support can counterbalance the more exclusive ‘iron triangles’ and corporatism found in many policy sectors (Woodward, 2003).
Parliaments as Representative Institutions

The development of feminist institutionalism, described above, also brought renewed attention to the role of parliamentary institutions in promoting substantive representation, as contrasted with either the role of individual legislators or the impact of autonomous women’s movements. Feminist political scientists investigated the gendered nature of institutional norms and expectations, such as those surrounding debate and interruption, as well as the formal rules and conventions of parliaments. This ‘institutional turn’ encompassed both the study of parliament as a workplace and the role of specialised parliamentary bodies for the promotion of gender equality. Because parliaments were historically designed for men able to leave family
responsibilities behind in the ‘private sphere’, the entry of women has been accompanied by a long struggle to accommodate caring responsibilities. Feminist political scientists such as Lenita Freidenvall (2017) and Sonia Palmieri (2019) have provided new conceptual understandings of parliament as a workplace, as well as undertaking applied work for both national and regional parliaments and international standard-setting bodies such as the IPU. Parliamentary terms are now more likely to be aligned with school terms, childcare is more likely to be provided and babies no longer removed as ‘strangers’ from the part of the chamber reserved for elected members. Nonetheless the complex issues of reconciling parliamentary work and family life are still far from resolved, particularly for those who have traditionally been the primary carers. Moreover, in some ways, parliaments are still less regulated than other workplaces to ensure women have equal opportunity to perform their roles; issues such as sexual harassment, bullying and sexualised hate speech continue to affect even women prime ministers (Summers, 2013). Specialised parliamentary bodies for the promotion of gender equality have generally been established after an influx of women into parliaments. They have received increasing recognition from practitioners as a significant element in national machineries for the promotion of gender equality and from scholars as a vector for the substantive representation of women. The Gender-Focused Parliamentary Institutions Research Network (GenParlNet) was established in 2013 and has promoted cumulative knowledge and comparative analysis of the workings and effectiveness of such bodies. Researchers have developed typologies of such bodies together with indicators of feminist characteristics and of the quality of their contribution to substantive representation (Celis et al., 2016). 
In general, dedicated parliamentary bodies have been found to provide space and legitimacy for women-centred deliberation, to act as a feminist reference group and a
gateway to the legislative process for women in the community. If they are a parliamentary committee, they may have a formal remit to apply a gender lens to legislative proposals, while a women’s caucus in a parliamentary party may subject front benchers to collective pressure over the gender impact of proposals (Sawer and Turner, 2016). Parliamentary committees on gender equality may perform different roles (executive or legislative scrutiny) in different institutional contexts (Holli and Harder, 2016) while a women’s caucus in a parliamentary party may both support critical actors in promoting substantive representation and be a critical actor themselves (Allen and Childs, 2019).
Regional and Policy Differences

While normative commitment and reflexivity are identifying badges of feminist research, there are regional differences in methods similar to those found in other parts of the discipline – for example, more emphasis on political economy in Latin America, more emphasis on quantitative methods in North America, on discourse (‘critical frame analysis’) and meaning-making in Europe or new institutionalism in the ‘Anglosphere’. In addition to differences in methods, there are also regional differences in the focus of feminist political science – for example, histories of state violence in Latin America have influenced feminist theorising of violence against women and policy responses to it. The concept of ‘femicide’ or ‘feminicide’ has become central to regional understandings of the issue, leading femicide to be a legally defined crime in almost 20 Latin American countries (Fregoso and Bejarano, 2010). But, within regions as well, feminists have sometimes adopted very different and competing approaches to specific policy issues. For example, the politics of prostitution law reform is a subject traditionally neglected by political science, but given a new centrality by
feminist research and activism. Some prominent feminist political scientists have taken up the argument long advanced by women’s movement organisations, that prostitution is of its nature exploitive of women, perpetuating men’s right to access women’s bodies in an inherently unequal power relationship (Pateman, 1988; Raymond, 2004; Sullivan, 2007; Jeffreys, 2009). Today, this argument, that prostitution is incompatible with gender equality, has been strongly promoted by women’s movements and the women’s wings of political parties in the Nordic countries as well as by the European Women’s Lobby and the Women’s Rights (FEMM) Committee of the European Parliament. It has been translated into a distinctive public policy solution intended to avoid the traditional scapegoating of prostitutes themselves. Sweden led the way in 1999 with its legislation criminalising the purchasing of sex by the client rather than its sale by the prostitute. The Swedish policy model was subsequently adopted in Norway and Iceland and then in Canada, France, Ireland and Israel. In contrast to those feminist political scientists arguing that prostitution is in and of itself exploitive of women, other feminist political scientists have argued that prostitution among consenting adults can be seen as a form of work. They have undertaken comparative research suggesting that decriminalisation or legalisation is the way to protect the rights, working conditions and safety of workers in the sex industry or ‘sex workers’ (Outshoorn, 2004). They maintain that sex work can be a free choice made by women and should have the same protection as other forms of employment; only forced prostitution should be illegal. In this policy model, implemented at the national level in the Netherlands in 2000, prostitution is subject to regulation like other forms of work but is not illegal in itself. 
We shall call this the Dutch model, although different forms of legalisation and decriminalisation have also taken place at the sub-national level in Australia and in New Zealand since the end
of the 1970s. In comparing the effectiveness of different types of prostitution law reform in promoting the safety and rights of sex workers, the Australian evidence suggests that a key factor is the representation of sex-worker advocacy organisations in the policy process, made possible by decriminalisation (Jeffrey and Sullivan, 2009). While both sides of the feminist prostitution debate seek to promote gender equality and to oppose trafficking in women, their policy solutions are very different and their studies of prostitution law reform come to very different conclusions. Either the Swedish or the Dutch model is shown to have ‘failed’; for example, legalisation is alternatively said to have significantly increased the safety of sex workers or to have led to an expansion of the sex industry and of sex trafficking. Feminist frame analysis has been applied to show either the influence of international ‘neo-abolitionist’ networks in framing prostitution as violence against women (Ward and Wylie, 2017) or the influence of sex-worker advocacy organisations and their allies in framing prostitution as a legitimate employment choice. Prostitution and pornography policy is one of the most polarising issues debated within feminist political science today. While concepts of the state no longer divide radical feminists and reformists in the way they did in the 1970s, with both sides wishing to use the state to introduce law reform, there are ongoing debates over the strategy of gender mainstreaming. These debates are not only about the ‘technocratic’ nature of the methodologies adopted but also about the relationship of gender mainstreaming to neoliberal policy agendas. Gender mainstreaming is seen by its critics as promoting women’s labour force participation (and the marketisation of services) at the expense of revaluing and supporting non-market caring work.
Hence, gender mainstreaming is seen as complicit in the abandoning of the transformational agenda originally so important to feminist political science. On the other
hand, gender mainstreaming tools such as gender-responsive budgeting are not necessarily looked on with favour by neoliberal policymakers – they reveal the disproportionate impact of neoliberal policies both on women as a whole and on particular groups of women. Moreover, neoliberal policy agendas also contribute to populist backlash, mobilising discontent with economic and cultural change, including gender equality and diversity projects (Verloo, 2018). While feminist political science has become highly professionalised since it first became a collective presence within the discipline, it is still identifiable by its commitment to producing knowledge that will advance gender equality. It has created an epistemic community around this normative ambition as well as the diverse conceptual tools needed to adequately address the gendered dynamics of political institutions.
Empirical Databases One of the richest sources of data on women’s policy, women’s policy agencies and the participation of women in public decisionmaking is to be found in the country reports and NGO shadow reports by the 189 state parties to the UN Convention on the Elimination of All Forms of Discrimination against Women (CEDAW). General Recommendation 23 of the CEDAW Committee on Article 7 of CEDAW encourages the use of temporary special measures such as quotas to realise women’s right to equal participation in political and public life. The IPU has published statistics on women’s representation in national parliaments since 1985, covering the period from 1945 onwards. The IPU has also published statistics on parliamentary bodies specialising in gender equality from 2006. The UN Division for the Advancement of Women (later merged into UN Women) monitored the establishment of national machinery
for the advancement of women that was part of the Plan of Action adopted at the First World Conference on Women, held in Mexico City in 1975. Regular surveys collected data on the nature of such machinery, finding that 127 countries had adopted such machinery by 1985; 165 by 2004. UN Women continues to collect such data; see, for example, its 2016 regional consultation with national machineries from across South East Asia. International IDEA (Institute for Democracy and Electoral Assistance), the IPU and the University of Stockholm jointly sponsor the Gender Quotas Database, which tracks the introduction around the world of legislated candidate quotas, party quotas and reserved seats from 2003. The OECD Gender Equality Data Portal presents much data relevant to feminist political science under the governance heading. This includes comparative data on the percentage of women parliamentarians and ministers, the mandates of parliamentary gender equality committees, legislative or other requirements for gender impact assessment and gender-responsive budgeting at the central/federal level of government. The Standing Group on Gender and Politics of the European Consortium for Political Research maintains a ‘Syllabus Bank’ of university courses on gender and politics to assist in the development of such courses. The dataset from the RNGS project is available on the Harvard Dataverse under the heading: ‘Women’s Movements and Women’s Policy Offices in Western Postindustrial Democracies, 1970–2001’. The Feminism and Institutionalism International Network (FIIN) promotes the synthesis of insights from new institutionalist theory and feminist scholarship on institutions. It has a website providing information on publications and events in this area. The Gender-Focused Parliamentary Institutions Research Network (GenParlNet) also has a website with information on publications and events in this emerging research area.
In addition to the more formal databases, important Facebook groups have been established to facilitate the sharing of news and resources on politics and gender – for example, the Electoral Gender Quotas group managed by Mona Lena Krook since April 2011, and the Violence against Women in Politics group co-created by Krook and Juliana Restrepo Sanín in February 2015.
References Ackerly, Brooke and Jacqui True (2013) ‘Methods and methodologies’ in Georgina Waylen, Karen Celis, Johanna Kantola and S. Laurel Weldon (eds) The Oxford Handbook of Gender and Politics, Oxford: Oxford University Press, 135–59. Allen, Peter and Sarah Childs (2019) ‘The grit in the oyster? Women’s parliamentary organisations and the substantive representation of women’, Political Studies, 67(3): 618–38. Annesley, Claire and Francesca Gains (2010) ‘The core executive: Gender, power and change’, Political Studies, 58(5): 909–29. Banaszak, Lee Ann (2010) The Women’s Movement Inside and Outside the State, New York: Cambridge University Press. Bereni, Laure and Anne Revillard (2012) ‘Un mouvement social paradigmatique? Ce que le mouvement des femmes fait à la sociologie des mouvements sociaux’, Sociétés contemporaines, 85(1): 17–41. Boserup, Ester (1970) Woman’s Role in Economic Development, London: George Allen & Unwin. Bourque, Susan C. and Jean Grossholtz (1974) ‘Politics as an unnatural practice: Political science looks at female participation’, Politics & Society, 4(2): 225–66. Brook, Heather (2002) ‘Stalemate: Rethinking the politics of marriage’, Feminist Theory, 3(1): 45–66. Butler, Judith (2004) Undoing Gender, London and New York: Routledge. Campbell, Rosie, Sarah Childs and Joni Lovenduski (2009) ‘Do women need women representatives?’, British Journal of Political Science, 40(1): 171–94.
CEDAW (Committee on the Elimination of Discrimination against Women) (1997) General Recommendation No. 23. http://www.un.org/womenwatch/daw/cedaw/recommendations Celis, Karen, Sarah Childs and Jennifer Curtin (2016) ‘Specialised parliamentary bodies and the quality of women’s substantive representation: A comparative analysis of Belgium, United Kingdom and New Zealand’, Parliamentary Affairs, 69(4): 812–29. Chappell, Louise A. (2002) Gendering Government: Feminist Engagement with the State in Australia and Canada, Vancouver: University of British Columbia Press, pp. 23–44. Chappell, Louise and Fiona Mackay (2017) ‘What’s in a name? Mapping the terrain of informal institutions and gender politics’, in Georgina Waylen (ed.) Gender and Informal Institutions, London: Rowman & Littlefield. Crenshaw, Kimberlé (1989) ‘Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics’, University of Chicago Legal Forum, 140: 139–67. Curtin, Jennifer (2019) ‘Feminist innovations and new institutionalism’, in Marian Sawer and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 115–33. Dahlerup, Drude (1988) ‘From a small to a large minority: Women in Scandinavian politics’, Scandinavian Political Studies, 11(4): 275–98. Dahlerup, Drude (ed.) (2006) Women, Quotas and Politics, London and New York: Routledge. Duverger, Maurice (1955) The Political Role of Women, Paris: UNESCO. Eisenstein, Hester (1996) Inside Agitators: Australian Femocrats and the State, Sydney: Allen & Unwin; Philadelphia: Temple University Press. Elshtain, Jean Bethke (1981) Public Man, Private Woman: Women in Social and Political Thought, Oxford: Martin Robertson. Fraser, Nancy (1995) ‘From redistribution to recognition? Dilemmas of justice in a postsocialist age’, New Left Review, 212: 68–93.
Fregoso, Rosa-Linda and Cynthia Bejarano (eds) (2010) Terrorizing Women: Feminicide in the Americas, Durham, NC: Duke University Press. Freidenvall, Lenita (2017) ‘The Swedish Parliament: A Gender-Sensitive Working Place?’, paper presented at the European Conference on Politics and Gender, Lausanne, 8–10 June. Gilligan, Carol (1982) In a Different Voice, Cambridge, MA: Harvard University Press. Hanisch, Carol (1969/2006) ‘The Personal is Political: The Women’s Liberation Movement classic with a new explanatory introduction by Carol Hanisch’, available at: http://www.carolhanisch.org/CHwritings/PIP.html Hartsock, Nancy (1987) ‘The feminist standpoint: Developing the ground for a specifically feminist historical materialism’, in Sandra Harding (ed.) Feminism and Methodology, Milton Keynes: Open University Press, pp. 157–80. Haussman, Melissa, Marian Sawer and Jill Vickers (eds) (2010) Federalism, Feminism and Multilevel Governance, Aldershot: Ashgate. Hochschild, Arlie Russell (1983) The Managed Heart: Commercialization of Human Feeling, Berkeley: University of California Press. Holli, Anne Maria and Mette Marie Staehr Harder (2016) ‘Towards a dual approach: Comparing parliamentary committees on gender equality in Denmark and Finland’, Parliamentary Affairs, 69(4): 794–811. Htun, Mala and S. Laurel Weldon (2012) ‘The civic origins of progressive policy change: Combating violence against women in global perspective, 1975–2005’, American Political Science Review, 106(3): 548–69. Inglehart, Ronald and Pippa Norris (2000) ‘The developmental theory of the gender gap: Women’s and men’s voting behavior in global perspective’, International Political Science Review, 21(4): 441–63. Jeffrey, Leslie A. and Barbara Sullivan (2009) ‘Canadian sex work policy for the 21st century: Enhancing rights and safety, lessons from Australia’, Canadian Political Science Review, 3(1): 57–76. 
Jeffreys, Sheila (2009) The Industrial Vagina: The Political Economy of the Global Sex Trade, London/New York: Routledge. Johnson, Carol (2019) ‘Gender research and discursive policy framing’, in Marian Sawer
and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 195–218. Kanter, Rosabeth M. (1977) Men and Women of the Corporation, New York: Basic Books. Keck, Margaret E. and Kathryn Sikkink (1998) Activists Beyond Borders: Advocacy Networks in International Politics, Ithaca, NY: Cornell University Press. Kenny, Merryl (2013) Gender and Political Recruitment: Theorising Institutional Change, Houndmills, UK: Palgrave Macmillan. Krook, Mona Lena (2009) Quotas for Women in Politics: Gender and Candidate Selection Reform Worldwide, Oxford: Oxford University Press. Lijphart, Arend (1999) Patterns of Democracy: Government Forms and Performance in Thirty-Six Countries, New Haven, CT: Yale University Press. Lombardo, Emanuela, Petra Meier and Mieke Verloo (2009) The Discursive Politics of Gender Equality: Stretching, Bending and Policymaking, London: Routledge. Lombardo, Emanuela, Petra Meier and Mieke Verloo (2017) ‘Policymaking from a gender+ equality perspective’, Journal of Women, Politics & Policy, 38(1): 1–19. McBride, Dorothy E. and Amy G. Mazur (2010) The Politics of State Feminism: Innovation in Comparative Research, Philadelphia: Temple University Press. Mackay, Fiona (2001) Love and Politics: Women Politicians and the Ethic of Care, London: Continuum. Mackay, Fiona (2014) ‘Nested Newness, institutional innovation, and the gendered limits of change’, Politics & Gender, 10(4): 549–71. Mansbridge, Jane (1999) ‘Should blacks represent blacks and women represent women? A contingent “Yes”’, Journal of Politics, 61(3): 628–57. Mill, John Stuart (1869) The Subjection of Women, London: Longmans. Norris, Pippa and Mona Lena Krook (2014) ‘How quotas work: The supply and demand model revisited’, in Rosie Campbell and Sarah Childs (eds) Deeds and Words: Gendering Politics after Joni Lovenduski, Colchester: ECPR Press, pp. 185–205. Norris, Pippa and Joni Lovenduski (1995) Political Recruitment: Gender, Race and Class in
the British Parliament, Cambridge: Cambridge University Press. OECD (2017) The Pursuit of Gender Equality: An Uphill Battle, Paris: OECD Publishing. Outshoorn, Joyce (2004) The Politics of Prostitution: Women’s Movements, Democratic States and the Globalisation of Sex Commerce, Cambridge: Cambridge University Press. Palmieri, Sonia (2019) ‘Feminist institutionalism and gender-sensitive parliaments: Relating theory and practice’, in Marian Sawer and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 173–94. Pateman, Carole (1983) ‘Feminist critiques of the public/private dichotomy’, in Stanley I. Benn and Gerald F. Gaus (eds) Public and Private in Social Life, London: Croom Helm, pp. 291–303. Pateman, Carole (1988) The Sexual Contract, Cambridge: Polity Press. Phillips, Anne (1995) The Politics of Presence, Oxford: Clarendon Press. Pitkin, Hanna Fenichel (1967) The Concept of Representation, Berkeley: University of California Press. Puwar, Nirmal (2004) Space Invaders: Race, Gender and Bodies Out of Place, Oxford: Berg. Rai, Shirin M. and Carole Spary (2019) Performing Representation: Women Members in the Indian Parliament, Oxford: Oxford University Press. Raymond, Janice G. (2004) ‘Prostitution on demand: Legalising the buyers as sexual consumers’, Violence against Women, 10(10): 1156–86. Ruddick, Sara (1989) Maternal Thinking: Towards a Politics of Peace, Boston: Beacon Press. Sapiro, Virginia (1981) ‘When Are Interests Interesting? The Problem of Political Representation of Women’, American Political Science Review, 75(3): 701–16. Sawer, Marian (2016) ‘Femocrat’, in Nancy A. Naples (ed.) The Wiley-Blackwell Encyclopedia of Gender and Sexuality Studies, Vol. II, Oxford: Wiley Blackwell, pp. 909–11. 
DOI: 10.1002/9781118663219.wbegss047 Sawer, Marian (2019) ‘How the absence of women became a democratic deficit’, in Marian Sawer and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 13–40.
Sawer, Marian and Alicia Turner (2016) ‘Specialised parliamentary bodies: Their role and relevance to women’s movement repertoire’, Parliamentary Affairs, 69(4): 763–77. Sullivan, Mary Lucille (2007) Making Sex Work: A Failed Experiment with Legalised Prostitution, Melbourne: Spinifex Press. Summers, Anne (2013) ‘The Prime Minister’s rights at work’, in Anne Summers (ed.) The Misogyny Factor, Sydney: NewSouth Publishing, pp. 104–36. Thompson, William (1825) Appeal of One Half of the Human Race, Women, against the Pretensions of the Other Half, Men, to Retain Them in Political, and thence in Civil and Domestic Slavery, London: Longman. Tremblay, Manon (2018) 100 Questions about Women and Politics, Montreal: McGill-Queen’s University Press. Tremblay, Manon (2019) ‘Uncovering the gendered effects of voting systems: A few thoughts on the representation of women and LGBT people’, in Marian Sawer and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 91–114. Tremblay, Manon and Réjean Pelletier (2000) ‘More feminists or more women? Descriptive and substantive representations of women in the 1997 Canadian federal elections’, International Political Science Review, 21(4): 381–405. True, Jacqui (2019) ‘Gender research and the study of institutional transfer and norm diffusion’, in Marian Sawer and Kerryn Baker (eds) Gender Innovation in Political Science: New Norms, New Knowledge, London: Palgrave, pp. 135–52. True, Jacqui and Michael Mintrom (2001) ‘Transnational networks and policy diffusion: The case of gender mainstreaming’, International Studies Quarterly, 45(1): 27–57. Verloo, Mieke (ed.) (2018) Varieties of Opposition to Gender Equality in Europe, New York: Routledge. Walby, Sylvia (2005) ‘Introduction: Comparative gender mainstreaming in a global era’, International Feminist Journal of Politics, 7(4): 453–70. Wängnerud, Lena (2009) ‘Women in parliaments: Descriptive and substantive
representation’, Annual Review of Political Science, 12: 51–69. Ward, Eilis and Gillian Wylie (eds) (2017) Feminism, Prostitution and the State: The Politics of Neo-Abolitionism, London and New York: Routledge. Waring, Marilyn (1988) Counting for Nothing: What Men Value and What Women are Worth, Wellington: Allen & Unwin. Weldon, S. Laurel (2002) ‘Beyond bodies: Institutional sources of representation for women in democratic policymaking’, The Journal of Politics, 64(4): 1153–74.
Wollstonecraft, Mary (1792/2004) A Vindication of the Rights of Woman, London: Penguin. Woodward, Alison E. (2003) ‘Building velvet triangles: Gender and informal governance’, in Thomas Christiansen and Simona Piattoni (eds) Informal Governance in the European Union, Cheltenham: Edward Elgar, pp. 76–93. Young, Iris Marion (1994) ‘Gender as Seriality: Thinking about Women as a Social Collective’, Signs, 19(3): 713–38. Young, Iris Marion (2000) Inclusion and Democracy, New York: Oxford University Press.
7
Marx and Marxism in Politics
Dingping Guo
Introduction Marxism has been defined and studied as a political theory, political ideology and political movement. In political studies, Marxism refers to a specific school of social and political theory about human life, historical development, capitalist crisis and communist revolution, which was developed by Karl Marx and Friedrich Engels during the mid-to-late 19th century, and subsequently elaborated on by their disciples from various backgrounds all over the world. Although there are inconsistencies and contradictions in Marx’s theory during the different periods of its development, and there are considerable debates and disputes over its nature and structure, some basic consensus can be reached based on an analysis of Marx’s works and the studies on the subsequent evolution of Marxist theory. Marxism is not only one of the most important social and political schools of thought, but also the guiding ideology in the communist revolution and the socialist construction of many countries worldwide.
Capitalist Development and the Birth of Marxism Karl Marx, one of the most famous and influential theorists of the modern historical age from whom the socialist or communist movements have derived their ideas, was not only a political thinker but also a social philosopher and economist whose research ranged widely over many fields. Marx has had a profound impact on the thoughts and actions of people in many countries since the mid-19th century, and in the 21st century he is still regarded as the greatest instructor by the political left, including the adherents of communist parties, and is derided as a source of political and social chaos by the political right. The ideas and programs developed by Marx and Engels have been generally called Marxism. Born in Trier, German Rhineland, into a Jewish family on May 5, 1818, Karl Marx received a good education and displayed great potential as an outstanding student. During his student days at the universities of Bonn and Berlin, Marx studied law and the
history of philosophy, took a strong interest in the works of Georg Wilhelm Friedrich Hegel and joined a student/professor group called the ‘Young Hegelians’. Marx submitted his doctoral dissertation at the University of Jena in 1840 and received a doctoral degree the following year. This dissertation is entitled ‘The Difference between the Democritean and Epicurean Philosophy of Nature’ and can be regarded as the starting point of his transition from idealism to materialism (Jessop, 1999: 98). After his initial failure to establish an academic career, his liberal political views led him to find employment as an editor of a radical newspaper in Cologne, Rheinische Zeitung. Because of his journalistic abilities and radical views, Marx was well received in liberal circles and quickly promoted to editor of the newspaper. This radical newspaper, under the guidance of Marx, had to face the problem of censorship by the authoritarian Prussian government, and was finally suppressed after the printing of Marx’s article on the poverty of farmers in the Mosel Valley. In 1843, Marx married Jenny von Westphalen and emigrated to Paris with her in order to escape political persecution. There he became acquainted with French socialist thinkers and began to witness firsthand the living conditions of people in poverty by socializing with working-class people. More importantly, he first encountered, and subsequently established a lifelong friendship with, Friedrich Engels, the author of the classic work The Condition of the Working Class in England in 1844. As a result of his economic and philosophical research, Marx wrote Economic and Philosophic Manuscripts of 1844, in which he showed great concern for the dignity and freedom of the individual. 
In February 1844, together with the philosopher and political writer Arnold Ruge, Marx published the first and only issue of a new journal, Deutsch-Französische Jahrbücher [German-French Annals], in which he published articles on a broad range of matters such as philosophy, politics and society. His radical ideas were not tolerated in France. After Marx published
an article on capitalism in German-French Annals, he angered his partner Arnold Ruge and their journal was banned in France and Germany. Based on his experiences of living among the working class and his comprehensive researches on history, economics, politics and philosophy, Marx became an ardent communist. He proposed his ideas about communism by criticizing the alienation of labor in a capitalist society. According to Marx, under capitalism, the working class invests its creative labor, while the capitalist class appropriates the results of this labor in exchange for wages. This means that the human world created by the proletariat does not belong to them, but is owned instead by a class of non-laboring owners. In January 1845, Marx was expelled from Paris by Premier François Guizot at the instance of the Prussian government and moved to Brussels. During his stay in Brussels, Marx exchanged polemics with the Hegelians, Feuerbach, Stirner and the ‘True Socialists’, and finished two important works in collaboration with Engels, The Holy Family and The German Ideology. In The German Ideology, Marx and Engels provided a historical and material basis for Marx’s radical views and insisted that the nature of individuals depended on the material conditions determining their own productions. In 1847, Marx started another polemical exchange with the French anarchist thinker Pierre-Joseph Proudhon and wrote The Poverty of Philosophy, in which he developed the fundamental propositions of his economic interpretation of history. By early 1846, Marx had established the Communist Correspondence Committee in order to connect all of Europe’s socialist leaders. The following year, the League of the Just held a congress in London where the two groups merged to form the Communist League. 
Marx and Engels attended the Second Congress of the Communist League, at which they were commissioned to write a manifesto for the League, which became the Manifesto of the Communist Party, inspired by Engels’ The Principles of Communism (1847). The
Communist Manifesto, originally written as the platform of the Communist League, has become one of the most radical and influential books since it was first published in February 1848. It begins with the famous proposition: ‘The history of all hitherto existing society is the history of class struggles’, and contains a summary of Marxist theory (Carver, 1996: 1–30). The first major point is to abolish private property and implement public ownership of the economy. The theory of the communists may be summed up in the single phrase: ‘abolition of private property’. The second point is to bring the proletariat to power and annihilate the exploiting class, especially the bourgeoisie, in politics. According to Marx and Engels, the first step in the revolution by the working class is to raise the proletariat to the position of ruling class in order to win the battle for democracy. (This was why Mao Zedong defined the nature of his new republic as the ‘people’s democratic dictatorship’, a hundred years later.) The Communist Party as the avant-garde of the proletariat then comes to power after winning the struggle against the old classes, the landowners and the bourgeoisie. The third is to envision a classless society in which ‘the free development of each is the condition for the free development of all’ (Carver, 1996: 20). Upon publication in 1848, The Communist Manifesto quickly became the credo of the poor and oppressed all over the world and led to the greatest political upheavals of the 19th and 20th centuries and the establishment of the communist governments that ruled half the globe for several decades. It is regarded as the most important classic text for Marxist political theory. After the Manifesto came to light, even the relatively tolerant Belgian government served Marx with an expulsion order and he returned to Paris. 
The revolutionary atmosphere in Germany in 1848 enabled him to return to Cologne where he persuaded some liberal industrialists to back a new version of his old newspaper, the Neue Rheinische Zeitung. It became extremely radical and
took an anti-government stance under the editorship of Marx and it was again suppressed by the authorities. In protest, Marx printed the last issue of the Zeitung in red ink. He was arrested for press offences and incitement to armed insurrection, but after a long and powerful speech delivered at his trial, Marx was acquitted by a jury in Cologne. Faced with expulsion from Cologne and suppression of his newspaper, Marx visited Paris again as a representative of German democracy before the Paris National Assembly, but similarly was served with an expulsion order from Paris. Following his expulsion, in 1849 Marx moved to London where he lived with his large but devoted family until his death. Although Marx was a correspondent to the New York Tribune from 1852 to 1861, for the most part he was dependent for his livelihood on the generous financial support of Engels. His typical day was spent in the Reading Room of the British Library, where from 10am to 7pm (when opening hours allowed) he wrote a number of volumes on different subjects. Sometimes he lacked money for the postage to send his manuscripts to his publishers. Afflicted with multiple health problems and a contentious and uncompromising temper, Marx was not a prepossessing figure during his final years. Although virtually unknown in England, he enjoyed great popularity on the Continent, especially in liberal circles and among working people, and in 1864 was invited to participate in the formation of the International Working Men’s Association. Also known as the First International, the organization was founded at a meeting in St. Martin’s Hall, London, and Marx was invited to draw up the guiding principles in the ‘Inaugural Address’. In 1867, the first volume of Marx’s greatest work, Das Kapital [Capital], was published. The second, third and incomplete fourth volumes did not appear until after Marx’s death in 1883 (McDonald, 1962: 347). 
In Capital, Marx analyzed the secret of capitalist production by focusing on the
concept of surplus value and formulated his revolutionary theory by revealing the injustice of the capitalist system. According to Marx, labor is a commodity like any other commodity; therefore, following the labor theory of value, it must be valued by the man-hours devoted to its ‘production’, that is, to feeding, clothing and sheltering the worker in order to maintain life at subsistence level. In the capitalist system, labor is bought just like any other commodity. But, unlike any other commodity, labor is not consumed in a clearly determined period of time. A laborer is bought for the price of sustaining him physically, prorated in hours or days or weeks. But he may produce the equivalent of the price in economic value in 6 or 8 hours of work, whereas the factories of Marx’s day kept men going for 10, 12 or 14 hours a day. The difference between what the worker does and what he is paid is surplus value, the source of all capitalist profits. In a capitalist society that is divided into the capitalist class with the means of production, and the proletariat without the means of production, the injustice heaped upon the workers is not the result of bad men, but of a particular system. Reform within the system, however well intentioned, is doomed to failure. Only revolutionary overthrow of the whole capitalist system can lead to the liberation of the working class. Marx died and was buried in Highgate Cemetery, London, with a tombstone epitaph reading ‘Workers of all lands, unite’, the final slogan in The Communist Manifesto. In the years following his death, Engels edited and translated his works, and in many ways continued their friendship until his own death in 1895. Since the intellectual activity of Marx and Engels was intertwined based on their close friendship and great collaboration, when most people today speak of Marxism they are speaking of the joint output of Marx and Engels (McDonald, 1962: 345). 
Although Marx spent most of his life in reading and writing as a student and scholar, he has had a strong influence not only on modern ideas,
but also on political practices and social movements in many countries around the world. Today, his works are reprinted and read widely, and his ideas discussed and debated in the fields of philosophy, sociology and political science.
Marxism as a Political Theory Marxism is the system of social and political theory about human life, historical development, the capitalist crisis and the communist revolution developed by Marx and Engels, and elaborated on by such disciples as Lenin and Mao. Although there are inconsistencies and contradictions in the Marxist development, the basic doctrine of Marxism can be outlined as follows.
Theoretical Sources Marx developed his eponymous theoretical system from many different sources, such as utopian socialist thought in France and England, classic philosophy in Germany, political economics in England and the Greek political and philosophical tradition (Xu, 2005: 277). Among these, three major sources are especially important: German philosophy, French politics and English economics. In his early works, Marx showed great interest in law and philosophy; and his later works were more concerned with political economy and political strategy. The German philosophy on which Marx drew was primarily that of Hegel but included the Young Hegelians and Ludwig Feuerbach’s materialism. During his student days at the universities of Bonn and Berlin, Marx studied history and philosophy, took a strong interest in the works of Hegel and joined a student/professor group called Young Hegelians. Many of Marx’s basic ideas, such as his critique of civil society and private
property, emerged when he was writing the Critique of Hegel’s Philosophy of Right. He asserts that religion is the ‘opium of the people’ and calls for an ‘uprising of the proletariat’ to realize the conceptions of philosophy, a point also made in Theses on Feuerbach (1845). Marx had been strongly influenced by Hegel’s Logic and dialectical method, and his great work Capital is imbued with intellectual categories derived from Hegel. As one scholar pointed out, ‘Hegelian dialectic is a permanent tool of Marxist thought’ (Jessop, 1999: 128). French politics and socialist movements played an important role in shaping Marx’s thoughts. Marx’s father-in-law, Baron von Westphalen, and his teachers were all strongly influenced by the French Enlightenment. Marx was also strongly influenced by the French Revolution and French thinkers such as Jean-Jacques Rousseau. After he emigrated to Paris together with his wife in order to escape political persecution by the German authorities, Marx became acquainted with French socialist thinkers and began to witness the living conditions of people in poverty by socializing with working-class people. French socialism, as expressed and explained in the works of Henri de Saint-Simon and Charles Fourier, enabled Marx to break with Hegel’s teleological approach to history, to develop a broad-ranging social economy, to understand the social and personal impact of modern industry and grasp the significance of socialism. After studying the development of Bonapartism and commenting on the nature and significance of the Paris Commune, Marx completed several political works (The Class Struggles in France, 1848–1850 and The Eighteenth Brumaire of Louis Bonaparte), and expounded his major political ideas about the State and revolution. The third major source of Marx’s theory is English (and Scottish) economics exemplified by writers such as Adam Smith, David Ricardo and Thomas Malthus. 
It was during his years in Paris that Marx began his study of English economics. From the early 1840s,
he made an increasingly detailed study of the works of English economists. After moving to London, Marx undertook deep and systematic research on the development of the capitalist mode of production in England. In writing the 1844 Manuscripts, Marx relied extensively on the work of Adam Smith, especially his views on the division of labor, rent, subsistence wages and the three stages of society. Once he became acquainted with Ricardo’s work, On the Principles of Political Economy and Taxation (1817), Marx abandoned the economic theory developed in the 1844 Manuscripts. His critique of Proudhon in The Poverty of Philosophy was Ricardian in character. By absorbing the ideas of the classical political economists and analyzing capitalist development in England, Marx established his own standing as a political economist.
Although Marx drew on various sources, he did not merely combine them mechanically. A distinctive feature of Marx’s theory is his creative ability to synthesize. By studying German philosophy, French politics and English economics, Marx was able to develop his own philosophical, economic, social and political theory.
Historical Materialism and Social Development
Marx’s unique contribution to the philosophy of history is his historical materialism and theory of social development. According to his explanations in The German Ideology (1846) and The Critique of Political Economy (1859), the nature of individuals depends on the material conditions determining their production. In the social production of their existence, people enter into definite relations that are indispensable and independent of their will. According to Marx in The Critique of Political Economy:
[These] relations of production correspond to a definite stage of development of their material forces of
production. The sum total of these relations of production constitutes the economic foundation of society on which there arise legal and political superstructures and to which correspond definite forms of social consciousness. (Preface)
As Marx and Engels put it in The German Ideology: ‘In direct contrast to German philosophy, which descends from heaven to earth, here we ascend from earth to heaven’ (O’Malley, 1994: 125). Returning to The Critique of Political Economy, we learn that:
[t]he mode of production of material life conditions the general process of social, political and intellectual life. It is not the consciousness of men that determines their existence, but, on the contrary, their social existence that determines their consciousness. At a certain stage of their development, the material forces of production come eventually into conflict with the existing relations of production. … From forms of development of the forces of production these relations turn into their fetters. Then begins an era of social revolution. With the change of economic foundation the entire immense superstructure … is more or less rapidly transformed. In considering such transformations the distinction should always be made between the material transformation of the economic conditions of production … and the legal, political, religious, aesthetic, or philosophical, in short, ideological transformation. (Marx, 1859: Preface)
All ideological transformations ‘must be explained from the contradictions of material life, from the existing conflicts between the social forces of production and the relations of production’ (ibid.). Therefore, the ‘legal relations as well as the forms of state could neither be understood by themselves, nor explained by the so-called general progress of the human mind; they are rooted in the material conditions of life’ (ibid.). Since every society is divided into various groups, a strong minority tends to use its economic power to exploit the mass of the population by appropriating the economic surplus for its own benefit. This inherently conflictual situation gives rise to a class struggle that centers on the ownership
and control of the means of production. The social group that controls the means of production forms the ruling class, and the group without the means of production constitutes the ruled class. All political institutions and cultural beliefs are shaped by the ruling class so as to bolster the unequal distribution of resources: The history of all hitherto existing society is the history of class struggles. Freeman and slave, patrician and plebeian, lord and serf, guild-master and journeyman, in a word, oppressor and oppressed, stood in constant opposition to one another, carried on an uninterrupted, now hidden, now open fight that each time ended, either in a revolutionary re-constitution of society at large, or in the common ruin of the contending classes. (Marx and Engels, 1848: Section I)
Based on the conflict between the forces of production and the relations of production, the history of mankind progresses through revolutions to the next higher stage. This theory of social development is usually called historical materialism. In the Marxist view of history, the primitive agrarian society was followed by the slave society of the ancient world, the feudal society, the capitalist society and finally by the communist society. Progress is driven by inevitable and ultimately uncontrollable material forces, rather than by human thought and initiative. This view is sometimes summarized as economic determinism or historical determinism. In fact, while Marx emphasized the crucial role of material forces in social development, he also analyzed the strong influence of the political superstructure and human initiative on history. Political institutions and political leadership play an important role in many cases of historical development. As Wang Huning explains, the political superstructure is not completely passive and inert; on the contrary, it may exert a decisive influence on the economic foundation in some special cases, especially during the preliminary stage of the socialist states (Wang, 2004: 60–1).
Capitalist Crisis and Proletarian Revolution
In February 1848, Marx and Engels published the Manifesto of the Communist Party, generally regarded as a public statement of the general theory and political ideology of Marxism, and a call for general cooperation among different workers’ organizations. According to their analysis, capitalism as a revolutionary mode of production was fundamentally changing the course of civilization. It introduced market relations into all spheres of society and throughout the world:
[T]he markets kept ever growing, the demand ever rising. … This market has given an immense development to commerce, to navigation, to communication by land. … The bourgeoisie cannot exist without constantly revolutionizing the instruments of production, and thereby the relations of production, and with them the whole relations of society. (Marx and Engels, 1848: Section I)
By continually modernizing the forces of production and promoting the division of labor, capitalism prepared the material conditions necessary for social cooperation and planned management in economic life. Despite the ever-increasing social character of capitalist production, or socialization of the forces of production, the capitalist system was operated for private profit under private ownership. The search for private profit imposed fetters on the further development of production. The capitalist relations of production finally came into conflict with the forces of production. While a huge sum of wealth was accumulated in the hands of capitalists, its direct producers were impoverished. Lack of demand coexisting with unsold goods produced an ever-worsening economic crisis of overproduction. This dynamic of capitalism created the conditions for its own overthrow. Moreover, capitalism was creating the industrial proletariat as ‘its own gravediggers’ (Manifesto, in Carver, 1996: 14). As capitalism destroyed pre-capitalist modes of
production at home and abroad, other classes were eliminated and the proletariat expanded. As a result, the whole capitalist society was increasingly divided into two major classes: capitalist and proletarian. According to Marx: With the development of industry the proletariat not only increased in number; it became concentrated in greater masses, its strength grew, and it felt that strength more. The various interests and conditions of life within the ranks of the proletariat were more and more equalized, in proportion as machinery obliterated all distinctions of labor, and nearly everywhere reduced wages to the same low level. (Marx and Engels, 1848: Section I)
As individual workers, then groups of workers in a factory or trade, and eventually all workers in a nation-state or even the world economy mobilized to resist capitalist exploitation, the proletariat would grow more conscious of their shared class position and their common interest in the overthrow of capitalism. When their economic struggles encountered the resistance of the State as well as individual capitalists and groups of employers, the working class would develop a revolutionary consciousness and move from trade unionism to a political party. With the economic crisis deepening and the proletariat gaining in strength, the revolution would be inevitable.
Since they regarded the revolution as the inevitable historical product of the conflict between the forces of production and the relations of production, and especially of the class struggle between the bourgeoisie and the proletariat in capitalist society, neither Marx nor Engels paid attention to the means of revolution, especially political leadership and political strategy, which would be expounded by followers such as Lenin and Mao Zedong. They also said little about what would happen after the revolution, believing that it would be absurd to predict the future society in detail. Nonetheless, some major ideas about socialism and communism can be found in the classic works of Marx and Engels.
Marx’s Socialism and Communism
When the revolution broke out, the proletariat would seize the power of the State and transform the means of production in the first instance into State property. As Marx and Engels suggested, the revolutionary measures in the most advanced countries would include the abolition of private property, a heavy progressive or graduated income tax, the abolition of all rights of inheritance, confiscation of the property of all emigrants and rebels, centralization of credit in the hands of the State and centralization of the means of communication and transport in the hands of the State (Carver, 1996: 19–20). By doing so, the proletariat puts an end to itself as the proletariat, to all class differences and class antagonisms, and to the State as the State. The government of persons is replaced by the administration of things and the direction of the process of production. The State is not ‘abolished’, it withers away. In a few places, however, Marx and Engels referred to the transitional ‘socialist’ stage as ‘the dictatorship of the proletariat’. As the avant-garde of the proletariat in the revolution, the Communist Party is established and then leads the struggle against the old classes: the landowners and the bourgeoisie. Therefore, the existence of classes is bound up with particular, historic phases in the development of production; the class struggle necessarily leads to the dictatorship of the proletariat after all other antagonistic classes are annihilated; this dictatorship itself only constitutes the transition to the abolition of all classes and to a classless society.
With regard to post-revolutionary politics, Marx cited the experience of the Paris Commune and talked about the possibility of bridging the gap between the State and civil society that had been opened up by capitalist democracy. 
As an instance of the abolition of the division of labor in politics, Marx welcomed the Commune’s proposal to have all officials, including judges, elected by universal suffrage and revocable at any time; to pay officials the same wages as
manual laborers; to replace the standing army by the armed people; and to divest the police and clergy of their political influence. The initiative of the Commune could yield a decentralized, federal political structure and an economy based on cooperatives united by a common plan.
According to Marx’s explanation and prediction, the fundamental features of communism include at least the following elements. The first is to eliminate private property and implement public ownership in the economy. The second is to limit free competition and carry out economic planning. Classical socialists believe that capitalist market competition may lead to economic disorder and increasing inequality. Only after all economic activities are placed under a comprehensive economic plan can economic development be promoted and economic crisis avoided. The third is to distribute the economic surplus based on labor and need. In contrast with capitalism, in which capital plays the most important role in the process of distribution, communists insist that labor and need are the most important factors in distributing social wealth. Finally, the State as a tool of rule by the ruling class would wither away and gradually be replaced by the administration of public affairs. As Marx described:
In a higher phase of communist society, after the subjection of individuals to the division of labour, and thereby the antithesis between mental and physical labour, has disappeared; after labour has become not merely a means to live but the foremost need in life; after the multifarious development of individuals has grown along with their productive powers, and all the springs of cooperative wealth flow more abundantly – only then can the limited horizon of bourgeois right be wholly transcended, and society can inscribe on its banner: from each according to his abilities, to each according to his needs! (Critique of the Gotha Programme, part I, in Carver, 1996: 214–15)
In place of the old bourgeois society, with its classes and class antagonisms, there shall be an association, in which the free development
of each is the condition for the free development of all.
Western Marxism after Marx
The classic Marxist theory was expounded and elaborated based on the historical developments in Western industrialized countries such as England, France and Germany. After economic, social and political changes took place, many social theorists and political leaders tried to redefine and develop Marxism based on the new situations. During the period of economic depression and political repression in the 1880s, Marxism became dominant in the German Social Democratic Party. Karl Kautsky ‘explained and defended the theories of surplus value, immiseration, class polarisation and capitalist crisis’ (Geary, 2003: 220). His works ‘defined Marxism for the generation after Marx and constituted the fundament of “orthodox Marxism”’ (Geary, 2003: 220). Another theorist, Eduard Bernstein, ‘launched the revisionist attack on “orthodox Marxism”’, and ‘refuted the theories of surplus value, impoverishment, capital concentration and crisis’ (Geary, 2003: 228). According to Bernstein, ‘[w]orkers were not becoming poorer; the numbers of peasants was not declining; a “new middle class” was growing in size and importance; share ownership refuted the claim of capital concentration; and capitalism was developing mechanisms to reduce competition and remove recurrent economic crisis’ (Geary, 2003: 228). Therefore, these revisionist Marxists such as Bernstein rejected the violent way of revolution against capitalism and proclaimed the possibility of a peaceful, gradual and legal transition to socialism, brought about through the adoption of the parliamentary road. During much of the 20th century, Marxism was thus divided into two different camps. Orthodox Marxism was adopted and developed by Lenin in Russia and Mao Zedong in China, and transformed
into socialism and communism. By contrast, the revisionists and reformists tended to accept the European economic and political system, and practiced parliamentary politics and social democracy. Meanwhile, a more complex and philosophical form of Marxism entitled ‘Western Marxism’ developed in Western Europe from the early 20th century. As McLellan tells us:
Unlike the previous generation of Marxist theoreticians, most of the thinkers grouped under the rubric of ‘Western’ Marxist were not important figures in political parties. They tended to be academics rather than activists, writing in a period of declining working-class activity [due to capitalist democratic and economic developments] and therefore in comparative isolation from political practice. … [T]he term ‘Western Marxism’ normally excludes orthodox communists of strict Marxist obedience … and is confined to the … collection of thinkers that centred around the work of Lukács and Korsch in central Europe, that of Gramsci in Italy, and perhaps above all, of the Frankfurt school in Germany. Western Marxism is thus a philosophical meditation on the defeat of Marxism in the West. … While some people might question whether these modes of thought were really compatible with anything recognisable as Marxism, they undoubtedly extended the horizons of Marxist discussion beyond the rather limited perspectives of the Second International and Leninist orthodoxy. Gramsci’s concept of hegemony and its consequences for political culture, the treatment of Freud by Marcuse, the drastic critique of the Enlightenment in Horkheimer and Adorno – all these attempts to remedy weaknesses or gaps in the classical Marxist tradition have produced a compelling, if sometimes rather convoluted, literature on philosophy, politics and society. (McLellan, 2003b: 282–3)
The first important thinker of Western Marxism is the Hungarian philosopher Georg Lukács, whose major ideas were expressed in his work History and Class Consciousness (first published in 1923). Lukács reevaluated the role of Hegel in the formation of Marx’s thought and reinterpreted Marxist dialectics. Unlike Engels, who emphasized the dialectics of nature, Lukács focused more on the dialectics of human history. For Lukács, the subject and object have become separated in
the long history of human development. Only with the rise of the proletariat in capitalist society would subjective thought and objective action finally be united. ‘This historical interaction of subject and object was for Lukács the basic form of the dialectic’ (McLellan, 2003b: 284). However, the class consciousness of the proletariat, its unification of the roles of subject and object, was blocked by the comprehensive process of ‘reification’ in capitalist society. According to Lukács, ‘as the product of capitalism the proletariat must necessarily be subject to the modes of existence of its creator. This mode of existence is inhumanity and reification’ (1971: 76). This process of reification originated in the commodity fetishism of modern capitalism and transformed the social relations between persons, both subjectively and objectively, into relations between commodities. The reified consciousness is ‘trapped in the two extremes of crude empiricism and abstract utopianism’ (Lukács, 1971: 77), which leads to the ideological crisis of the proletariat. The economic crisis will certainly increase the possibility of revolution in capitalist society, but ‘the fate of revolution (and with it the fate of mankind) will depend on the ideological maturity of the proletariat, i.e. on its class consciousness’ (Lukács, 1971: 70).
Another influential figure in Western Marxism is Antonio Gramsci. He participated in revolutionary activities, helped to found the Italian Communist Party in 1921 and was its leader for the two years before his arrest and imprisonment in 1926. His basic ideas and theoretical innovations are contained in his Prison Notebooks (1971), which he compiled in prison during 1929–36. First, Gramsci analyzed the role and function of intellectuals in society and made a distinction between the traditional and the organic. 
While the traditional intellectuals considered themselves to be autonomous of social class and have no substantial links with the social and economic changes, the organic intellectuals maintained close relations with their social class and considered themselves to be members. Second,
after discussing the organic quality of the intellectuals and their degree of connection with a fundamental social group, Gramsci fixed two major superstructural ‘levels’: ‘civil society’ and ‘political society’ or ‘the State’. According to Gramsci, these two levels correspond on the one hand to the function of ‘hegemony’ which the dominant group exercises throughout society and on the other hand to that of ‘direct domination’ or command exercised through the State and ‘juridical’ government; the intellectuals are the dominant group’s ‘deputies’ exercising the subaltern functions of social hegemony and political government (Gramsci, 1971: 12). Gramsci analyzed civil society in detail and developed a theory of it quite different from Marx’s. While, for Marx, civil society usually meant private spheres and economic relations, Gramsci tended to use the term to refer to the superstructure, that is, all the organizations and technical means that the ruling classes used to justify their ideology. Finally, based on his study of civil society, Gramsci drew a distinction between two different revolutionary strategies in the East and the West. In less-developed societies, such as Russia, where there was no active civil society and the ruling class tended to use State power to suppress public protest, the target of revolution was naturally the State and governmental bureaucracy. Gramsci called this kind of attack ‘a war of movement or manoeuvre’ (Gramsci, 1971: 233). By contrast, in more-developed societies, where there was an advanced civil society and intellectuals played an important role in supporting and legitimizing the ruling class, a ‘war of position’ was more effective because a longer period of cultural assault on the ideological support of the ruling class was needed. According to Gramsci, in the most advanced States, ‘the superstructures of civil society are like the trench-systems of modern warfare. 
In war it would sometimes happen that a fierce artillery attack seemed to have destroyed the enemy’s entire defensive system, whereas in fact it had only destroyed the outer perimeter; and at the moment of their advance and attack the assailants would find themselves confronted by a line of defence which was still effective’ (Gramsci, 1971: 235).
This is because the civil society has become a very complex structure. Gramsci’s analysis of cultural hegemony and revolutionary strategy represents a new development of revolutionary theory in the Western advanced societies.
Among the varieties of Western Marxism after Karl Marx, the Frankfurt School represented the latest and most philosophical developments. It took its name from the Institute of Social Research founded in Frankfurt in 1923. ‘Originally concentrating on a more orthodox form of Marxism, the Institute changed its orientation with the appointment of Max Horkheimer as its director in 1930’ (McLellan, 2003b: 289). He and his colleagues, Theodor Adorno and Herbert Marcuse, together with the most influential figure of the second generation of the School, Jürgen Habermas, contributed greatly to its development from the 1930s to the late 20th century. The results of their research were described as ‘critical theory’, perhaps originating from Horkheimer’s seminal article of 1937 entitled ‘Traditional and Critical Theory’. Critical theory was directed above all against positivism and empiricism, a source of reification and an endorsement of the status quo in Western capitalism. The works of the Frankfurt School were a blend of Marxist political economy, Hegelian philosophy and Freudian psychology, which focused on the comprehensive critique of the Western advanced societies and contributed greatly to the revival of Western Marxism. Moreover, thanks to the wide spread of Marcuse’s works and his ideas about one-dimensional man, the Frankfurt School had a considerable impact on the New Left in the 1960s and some substantial influences on political research (Jay, 1996: 9).
Perry Anderson is one of the most famous Marxist political scientists and has contributed greatly to the study of Western Marxism. He served as the editor of New Left Review for many years and published several books on Western
Marxism such as Considerations on Western Marxism (1976), In the Tracks of Historical Materialism (1983) and The Antinomies of Antonio Gramsci (2017). More importantly, Anderson applied historical materialism to his studies of historical sociology in his brilliant work, Lineages of the Absolutist State (first published in 1974). According to Anderson, there are ‘two different planes of Marxist discourse’: while Marxist historians have paid little attention to theoretical questions about historical materialism, Marxist philosophers have sought to solve the theoretical problems without engaging with the specific empirical issues posed by historians. Anderson designed his work as a Marxist study of Absolutism and tried to explore a mediated ground between the two by combining theoretical analysis and historical studies. Anderson argued that ‘[t]he arrival of Absolutism was never a smooth evolutionary process for the dominant class itself: it was marked by extremely sharp ruptures and conflicts within the feudal aristocracy’ (Anderson, 2013: 20). With the growth of commodity relations, ‘reorganization of the feudal polity as a whole and the dilution of the original fief system, landownership tended to become progressively less “conditional” as sovereignty became correspondingly more “absolute”’ (Anderson, 2013: 20).
Many others have conducted their research based on their understanding of Marxism and social development. For example, Barrington Moore Jr. studied the relation between political developments and social classes in Social Origins of Dictatorship and Democracy: Lord and Peasant in the Making of the Modern World (1966), and discussed the issues of authority, inequality and justice in Moral Aspects of Economic Growth (first published in 1998). 
Immanuel Wallerstein, the master of world-systems theory, analyzed the capitalist development and emergence of a world market, and argued that capitalism has brought about immiseration in the Global South in his book, Historical Capitalism, with
Capitalist Civilization (first published in 1983). Michael Burawoy examined the legitimization mechanism in capitalist societies in Manufacturing Consent: Changes in the Labor Process under Monopoly Capitalism (1979), and called for ‘bringing workers back in’ in The Politics of Production: Factory Regimes under Capitalism and Socialism (1985). Burawoy argues that ‘the industrial working class has made significant and self-conscious interventions in history. … [and] these interventions were and continue to be shaped by the process of production’ (Burawoy, 1985: 5).
The October Revolution and Russian Marxism
Contrary to Marx’s expectations, the first socialist countries were founded not in the advanced Western countries but in underdeveloped countries such as Russia. After the Bolshevik Revolution of October 1917, the first socialist country was established in Russia under the strong leadership of V. I. Lenin. To lead the proletarian revolution, Lenin contributed much to Marxist theory with his theory of the Party and his concept of capitalist imperialism. By reflecting on the theoretical points in the works of Marxist writers and studying the political and economic realities, Lenin developed his own theory of capitalism and imperialism in Imperialism: The Highest Stage of Capitalism in 1916. According to his analysis, imperialism is the monopoly stage of capitalism in which capitalist free competition is displaced by capitalist monopoly. Specifically, this definition of imperialism includes the following features: (1) the concentration of production and capital has developed to such a high stage that it has created monopolies which play a decisive role in economic life; (2) the merging of bank capital with industrial capital, and the creation, on the basis of this ‘finance capital’, of a financial oligarchy; (3) the
export of capital as distinguished from the export of commodities acquires exceptional importance; (4) the formation of international monopolist capitalist associations which share the world among themselves, and (5) the territorial division of the whole world among the biggest capitalist powers is completed (Lenin, 2008: 239).
At the stage of imperialism, monopolies have stimulated the seizure of the most important sources of raw materials and also produced fierce competition for resources and markets among the biggest capitalist powers. This intense struggle for economic territory, for the division and redivision of the world, will inevitably lead to militarism and imperialist war. Such war not only reveals the nature of parasitic and decaying capitalism, but also creates opportunities for proletarian revolution in some underdeveloped or colonial/semi-colonial societies. Based on this assessment, the socialist revolution could happen and succeed at the weak point of the global imperialist system.
Lenin insisted that a revolutionary vanguard party play a crucial role in the socialist revolution because he realized that the proletariat, easily deluded by bourgeois ideas for improving their working and living conditions, would not overthrow the whole capitalist system. A revolutionary party must be established on a very firm foundation of Marxist theory and be composed of professional and dedicated revolutionaries capable of exercising ideological leadership through political education. Its organization was to be based on rigorous, truly iron discipline and on the principle of democratic centralism, combining free discussion and efficient action. The Party was to serve as the vanguard of the working class and act in the interests of the proletarian class with the fullest and unreserved support of the entire mass of the working class.
In the process of building the first socialist country, Lenin tried several models of socialism, such as War Communism and the New Economic Policy,
in the face of foreign invasion and domestic hardship. In comparison with the comprehensive State direction and management of the economy in ‘War Communism’, the ‘New Economic Policy’ was a strategic retreat in which ‘the state withdrew from the ownership and management of small and medium enterprises, retaining only the very large-scale strategically important parts of industry and communications. Freedom for peasants and traders to market their goods was extended as the state withdrew’ (Harding, 2003: 261). However, after Joseph Stalin ascended to the leadership of the Soviet one-party State, he proceeded to announce radical plans for the rapid industrialization of the country and the collectivization of agriculture. When Stalin announced the First Five-Year Plan in 1928, it started the ‘second revolution’ of the 1930s, which brought about the swift and total eradication of private enterprises. All resources were brought under the control of the State, and a system of central planning dominated by the State Planning Committee was established. During the 1930s, empowered by his cult of personality, Stalin established an increasingly brutal authoritarian dictatorship through a series of purges that suppressed and destroyed all opposition. This system combining central planning and personal dictatorship has been called Stalinism or Stalin-style socialism. Although some mistakes were corrected and reforms implemented after the death of Stalin in 1953, the core principles of the Leninist Party and Stalin-style socialism stubbornly resisted pressure for reform and have had a huge impact on political and economic developments in many other Communist countries.
The Rise of China and Chinese Marxism The October Revolution in Russia accelerated the spread of Marxism and the development of the Communist Revolution in China.
During the long period of the communist revolution and socialist construction, the Chinese Communist Party (CCP) and its leaders applied Marxism and Leninism to the military struggle, economic development and political construction of China. Communist leaders, with Mao Zedong as their best representative, have developed many new ideas, concepts and theories, all of which embody Chinese Marxism. Mao Zedong is the principal Chinese Marxist theorist and a communist statesman who contributed to the founding of the CCP in 1921, the Communist Army in 1927 and the People’s Republic of China (PRC) in 1949. In contrast with Karl Marx, who spent most of his life reading and writing as an editor, reporter and scholar, Mao took part in almost all the military struggles, political conflicts and social movements of his time. He emerged as supreme leader of the CCP in 1935 because his ideas and tactics were adopted by most of the CCP leaders. He was chairman of the PRC from 1949 to 1959 and chairman of the CCP until his death in 1976.
Maoism as Sinicized Marxism Maoism, officially called Mao Zedong Thought in China, is composed of many ideologies, strategies and tactics believed to be the creative result of applying Marxism-Leninism to China, a semi-feudal and semi-colonial country without modern industrial development. Maoism developed along with the progress of the communist revolution and socialist movement. In a bloody split in April 1927, Chiang Kai-shek of the Kuomintang, or Nationalist Party, dismantled the united front with the Communist Party against the warlord government in Beijing and broke with his communist allies. In the following campaigns, many communist organizations were destroyed, and a large number of communist leaders and members killed. Mao led several peasant uprisings in Hunan and
Jiangxi provinces and established communist military bases during the early 1930s. Based on these experiences, Mao realized the importance of the Chinese peasantry in the Chinese communist revolution. In October 1934, Mao and the communists retreated from Jiangxi under strong military attack by Chiang’s Nationalist Army and started their epic Long March to the new base in Shaanxi province. By making use of the second united front with the Nationalist Party against the Japanese, Mao and his communist comrades not only consolidated their base and expanded their sphere of influence, but also formulated an ideology and methodology for the Chinese revolution which was officially described as Maoism in 1945. In a new civil war between the communists and nationalists from 1946 to 1949, the nationalist government ruled by Chiang was defeated and retreated to Taiwan. As a result, the Communist Party came to power and Mao declared the founding of the PRC in Beijing on October 1, 1949. During the long period of military struggle against the Nationalist Party and the Japanese, ‘one of the most original contributions to the theory and practice of contemporary Marxism was his conception of guerrilla warfare’ (McLellan, 2003a: 270). According to the logic of Marxist socialism, only after the capitalist commodity economy developed fully would socialism succeed. After the communist revolution succeeded and the CCP came to power in 1949, Mao and the communist leaders – dreaming that their country would immediately leap forward into communism – decided to implement and promote a socialist transformation of the Chinese economy and society, unfortunately ignoring the basic facts about the country’s underdevelopment. In order to build socialism and realize communism as soon as possible, they conducted radical and drastic reforms in socialist practice in the light of Marxist theory, especially Stalinist theory imported from the Soviet Union. 
The Party tried to eliminate private ownership and establish public ownership, reduce free
competition and develop a planned economy, limit the role of capital in the distribution of resources and promote egalitarianism in order to eradicate exploitation and oppression. To promote the socialist transformation and build a new socialist country, the Communist Party had always to keep power in the hands of the proletarian class, take the class struggle as the basic line and consolidate the central leadership. After eight years of socialist transformation, in 1957, Mao declared confidently: We are now building socialism.… The petty bourgeoisie in agriculture and handicrafts and the bourgeoisie in industry and commerce have both experienced changes.… [the] individual economy has been transformed into collective economy, and the capitalist private ownership is being transformed into socialist public ownership. (Mao, 1977: 403)
In search of its own socialist model, after Stalin was criticized at the 20th Congress of the Communist Party of the Soviet Union in February 1956, China, under Mao, adopted many radical policies and launched mass movements one after another. The most important were the ‘Great Leap Forward’ (GLF, 1958–60) and the ‘Cultural Revolution’ (1966–76). During the period of the GLF, in rural areas of China, 750,000 higher-stage cooperatives were merged into 25,000 multifunction people’s communes in order to bring private farmers into peasant collectives in the pursuit of egalitarian ideals. A huge amount of manpower was mobilized to build the famous backyard steel furnaces and extensive irrigation works. China’s leaders were eager to catch up with Britain, and eventually with the United States, and at least for a brief moment were willing to believe that utopian methods worked and would produce more tons of steel and food. The GLF caused a great deal of waste by mobilizing great manpower and huge resources to produce low-quality, even useless, steel in an unproductive way. What is more, owing to this policy placing emphasis on the
development of heavy industry rather than light industry, the meager development of heavy industry came at a huge cost with little or no improvement in people’s living standards. The GLF caused an immense famine and an unusually high death rate – the consequence not only of declining harvests but of excessive requisitioning of grain, based on false reports that far more grain had been produced than was actually the case. Many years later, official sources admitted that 8 million people had died of causes related to the GLF; unofficial sources estimated the figure at about 30 million (Saich, 2004: 41). During the period of the Great Proletarian Cultural Revolution, Mao took the class struggle as fundamental. He believed that the main goal of the Cultural Revolution was to save Chinese socialism from the threat of ‘revisionism’ by purging his lieutenants who, as in the case of Liu Shaoqi and Deng Xiaoping, attached greater importance to economic efficiency than to ideological purity. The Cultural Revolution placed emphasis on purifying the superstructure rather than changing the economic conditions, because Maoists believed that purification would give a push to economic development, in which the bureaucrats would engage directly and in which the masses, imbued with revolutionary ideals, would exert themselves on behalf of collective undertakings. Meanwhile, under the ideal of egalitarianism, a leftist ‘wind’ swept through the country, resulting in the curtailing of private landholding, the free market and other personal rights. However, the facts show that China ended up in 1976, at the end of the Cultural Revolution, with neither efficiency nor equity. Although Maoism contains many different and sometimes contradictory elements from the different stages of the Chinese Revolution and socialist construction, several salient features can be found in Mao’s works and experiences. 
The first important feature is his emphasis on the importance of peasant issues in the Chinese revolution and socialist construction. The Marxist-Leninist tradition treated peasants as
incapable of revolutionary initiative and only marginally useful in backing urban proletarian revolution. Based on his life experiences and his analysis of the rural situation in China, Mao came to recognize the potential power of China’s hundreds of millions of peasants and decided to establish his base in rural areas instead of big cities. The peasants constituted the vast majority of China’s population and most of them were hard-pressed and lived in extreme poverty. According to Mao, they were very receptive to revolutionary agitation and could become a revolutionary force if fully mobilized and properly guided. Proceeding from this belief, Mao proposed to instill in them a revolutionary consciousness and make their force alone suffice for the revolution. By so doing, Mao led the Chinese revolution to succeed and gradually formed a special sentiment for the peasants. During the Cultural Revolution, Mao sent many city workers, intellectuals and bureaucrats to rural areas and forced them to receive re-education through agricultural labor, working alongside the peasants, because Mao believed the big cities were a corrupting influence for many. In Mao’s thinking, there were long-standing populist elements and continuous emphasis on the ‘mass line’. The mass line means the emphasis on the interests and preferences of the common people and demands that the government be responsive to them. It was first created in the revolutionary period, based on the idea of class struggle. The idea of the disadvantaged classes overthrowing the privileged classes through interclass struggle naturally entailed the idea of mobilizing the oppressed masses to fight for their own interests against those of the oppressing classes. 
Thus, the notion of class struggle led to Mao’s stress on the importance of the masses and mass movements and to what Mao explicitly labeled the ‘mass line’ in the Yan’an period (late 1930s and early 1940s), which is extolled as one of the precious traditions in communist history. It is therefore common to assert that the mass line is a style of leadership, which is, at its best, a democratic style
of leadership, just as indicated in the slogans such as ‘Serve the People’ and ‘Everything depends on the Masses, from the Masses, to the Masses’. Mao believed that even the vanguard Party needed to be rectified and reformed through criticism from the people it led and that the masses of China should be encouraged to become involved in even the highest affairs of the State. In the Cultural Revolution of 1966–76, the masses had been mobilized so broadly and deeply that the country verged on the edge of anarchy. In spite of political disorder and economic depression in China, Maoism developed into a worldwide movement in the 1960s and thereafter. All Maoists expressed fidelity to the thought of Mao Zedong. But at a practical level, self-identified Maoist political formations differed considerably. In parts of Asia where conditions were similar to those that prevailed in China before 1949, Maoism was largely a peasant movement, engaging in guerilla warfare and establishing bases in rural areas, and if successful, surrounding the cities and seizing State power. Elsewhere in the Third World, especially in Latin America, facing very different conditions, Maoists had to modify classical Maoist forms of revolutionary struggle. In the developed capitalist countries, Maoism meant something very different. Western Maoism was particularly attractive to young people during the 1960s, if only for its ostensible purity and populist nature. Although Maoism has been upheld as one of the major guiding principles in the CCP and other left-wing parties, it has gradually lost its appeal in China and other parts of the world since China adopted a new ‘reform and opening’ policy in 1978.
Contemporary Chinese Marxism Socialism with Chinese characteristics is contemporary Chinese Marxism, as proposed by the communist leaders with Deng Xiaoping at their core. After the death of Mao and the end of the Cultural Revolution, Deng
Xiaoping emerged as the new supreme leader and began to review and revise the basic line adopted by Mao. He was dissatisfied with the poor performance that had been achieved in the preceding 30 years and made the painful discovery that Chinese socialism had produced only meager results in comparison with the other nations of capitalist East Asia. He believed that these were consequences of a rigid and dogmatic understanding of socialism; that is, taking classic socialist theory as the CCP’s guideline without considering China’s special circumstances and blindly believing that socialism equals public ownership plus planned economy. So Deng Xiaoping thought it imperative to give a new perspective on socialism and make a breakthrough in socialist theory under the banner of ‘emancipating the mind’ and ‘seeking the truth from the facts’ (Deng, 1994: 140). The new theory of socialism with Chinese characteristics has been expounded and enriched by Deng Xiaoping and other communist leaders during recent decades. In 1978, at the Third Plenary Session of the Eleventh Central Committee of the CCP, the Party with Deng Xiaoping as its core leader, restored the principle of ‘seeking the truth from the facts’, stopped using the slogan ‘politics taking command’ and shifted the major goals to socialist construction, focusing on economic development and modernization. This made a good start on ‘reconsidering’ socialism and brought about the historic decision on ‘reform and opening up’ – the Opening of China – which marked the beginning of China’s era of reform. At the time, China had a clear desire to increase productivity and raise living standards by reforming its economic system. However, it didn’t have a clear objective of what the new system would be like and thus proceeded with the reform as though ‘crossing the river by touching the stones’. 
In 1982, at the 12th National Congress of the CCP, Deng Xiaoping proposed the idea of constructing socialism with Chinese characteristics by combining the basic principles of Marxism
and China’s special national conditions. By 1987, at the 13th National Congress of the CCP, the Party had a new perspective on the progress of socialist development, proposing that as China was still in the primary stage of socialism, the private economy should be encouraged. At the beginning of 1992, Deng Xiaoping made a famous southern tour in which he spoke at length about the nature of socialism, redefining it as ‘liberating productivity, developing productivity, eradicating exploitation, getting rid of the polarization between rich and poor, and finally getting rich together’ (Deng, 1993: 373). He said: ‘The planned economy is not equal to socialism, and there are plans in the capitalist countries. The market economy is not equal to capitalism, and there are markets in the socialist countries.’ ‘The measure of socialism or capitalism is not that there are more markets or more plans’, ‘but whether it is helpful to develop productivity, enhance the comprehensive national power, and improve the living standard for common people’ (Deng, 1993: 372–3). In October, at the 14th National Congress of the CCP, Jiang Zemin, the new supreme leader as General Secretary of the CCP, attributed the fruits of ‘reform and opening up’ to the theory of socialism with Chinese characteristics proposed by Deng Xiaoping, and declared that the goal of economic structural reform would be to establish the socialist market economy (Jiang, 2006a: 210). In 1993, at the Third Plenum of the Fourteenth Central Committee of the CCP, the leaders adopted ‘the Decision on Issues Concerning the Establishing of a Socialist Market Economic System’, which was the turning point on China’s road to marketization. Thus, the Chinese government and Communist Party formally accepted and adopted the market economy, in which the market would play the fundamental role in allocating economic and social resources. 
In contrast with the traditional socialist theory, which rejected the commodity economy and market mechanism, the new theory insisted on the coexistence of socialism and the market economy because socialism has been
redefined in grand schemes for developing productivity and creating wealth, and the market is regarded as the mere means to organize and regulate economic relations. In 1997, at the Fifteenth National Congress of the CCP, the leaders made an official decision to adopt Deng Xiaoping Theory as their guiding principle along with Marxism-Leninism and Mao Zedong Thought (Maoism). In the process of developing the socialist market economy, the private economy has been playing a more and more important role and a new stratum of private entrepreneurs has become more powerful and influential. At the Meeting Celebrating the 80th Anniversary of the Founding of the CCP, Jiang Zemin said that most of these people, such as private entrepreneurs and technical personnel, have contributed to the development of productive forces and other social undertakings through honest labor or lawful business operations, and they are also working for building socialism with Chinese characteristics (Jiang, 2006b: 286). This officially opened the door for private entrepreneurs to join the CCP. For the Communist Party, which had insisted on the eradication of exploitation and the exploiting class for such a long time, this decision to admit private entrepreneurs as ‘new blood’ was revolutionary. In 2002, at the 16th National Congress of the CCP, the CCP leaders officially endorsed the important thought of the ‘Three Represents’ proposed by Jiang Zemin as their guiding principle along with Marxism-Leninism, Mao Zedong Thought and Deng Xiaoping Theory. The ‘Three Represents’ means that the CCP must represent the most advanced productivity, the most advanced culture and the most comprehensive interests of the Chinese people (Jiang, 2006b: 2). This marked the fundamental transformation of the CCP from the party of the working class to a catch-all party. 
Although China has achieved rapid growth and improved living standards for most of its residents, the developments in China are limited, imbalanced and at a low level in terms of GDP per capita and technological innovation,
and have also come at a high cost. There are widening gaps between the urban and rural areas, between the east and west and between the coastal and mountainous areas. Environmental degradation and the low efficiency of resource consumption as a result of fast industrialization and urbanization have all contributed to a bottleneck in sustainable development. After Hu Jintao came to power as supreme leader in 2002, the theory of scientific development and harmonious society was proposed as the new development of Chinese-style socialism. The concept of scientific development was first launched by Hu Jintao at the Third Plenum of the 16th Central Committee in 2003. It emphasized equitable, balanced and sustainable development. The building of a harmonious society aims to enable all the people to share the social wealth brought by reform and development, forge an ever closer relationship between the people and government and result in lasting stability and unity. Social harmony is an essential attribute of socialism with Chinese characteristics. Scientific development and social harmony are integral to each other; neither is possible without the other. In 2007, at the 17th National Congress of the CCP, the leaders decided to build the socialist harmonious society according to the notion of scientific development. Since Xi Jinping rose to be the supreme leader as the General Secretary of the CCP in 2012, enormous political changes have taken place in China. In contrast with Deng Xiaoping’s ideas about the limited role of the Communist Party, General Secretary Xi has called for the overall leadership of the CCP in all areas. On February 13, 2017, when delivering a speech at the study session attended by provincial and ministerial leaders, Xi Jinping emphasized that ‘the Party exercises overall leadership over all areas of endeavor in every part of the country’ (Xi, 2017: 20). 
In the political report delivered at the 19th National Congress of the CCP, Xi Jinping insisted that ‘the defining feature of socialism with Chinese characteristics is the leadership of the Communist Party of
China; the greatest strength of the system of socialism with Chinese characteristics is the leadership of the Communist Party of China; and the Party is the highest force for political leadership’ (Xi, 2017a: 20). In the revised Constitution of the PRC, endorsed by the National People’s Congress in 2018, the communist leadership has been included and reemphasized in an added article. This new definition of socialism and the CCP’s role represented the latest development in contemporary Chinese Marxism and will contribute to the party-centered governance in China.
Conclusion From the above reflections and discussions, we may conclude that Marxism has been developing along with historical development. During different periods of historical development, Marxism can be understood from very different perspectives. For example, the new perspectives on socialism during the reform era in China represented the theoretical transition from the pure, classic and traditional socialism characterized by public ownership, planned economy and class struggle, to socialism with Chinese characteristics emphasizing the reform and opening up, the development of the market economy and the building of a prosperous and harmonious society under the leadership of the CCP. Taking a long and broad view, global conflict between Marxism and liberalism, communism and capitalism, left and right has dominated world politics during the past two centuries. On the left there have been many controversies and disputes over the interpretation of Marxism and its application. Generally speaking, Marxism has been criticized and rejected when capitalist countries are prosperous and socialist countries decline and collapse; on the contrary, Marxism has been adopted and embraced when capitalist countries face crisis and socialist countries make progress.
References Anderson, Perry. 1976. Considerations on Western Marxism. London: Verso Edition. Anderson, Perry. 1983. In the Tracks of Historical Materialism. London: Verso Edition. Anderson, Perry. 2013. Lineages of the Absolutist State. London: Verso (New Left Books). Anderson, Perry. 2017. The Antinomies of Antonio Gramsci. London: Verso Edition. Ball, Terence and Richard Bellamy, eds. 2003. The Cambridge History of Twentieth-Century Political Thought. Cambridge University Press. Burawoy, Michael. 1979. Manufacturing Consent: Changes in the Labor Process under Monopoly Capitalism. Chicago: The University of Chicago Press. Burawoy, Michael. 1985. The Politics of Production: Factory Regimes under Capitalism and Socialism. London: Verso (New Left Books). Carver, Terrell, ed. 1996. Marx: Later Political Writings. Cambridge, UK: Cambridge University Press. Deng Xiaoping. 1993. Deng Xiaoping Wenxuan (Selected Works of Deng Xiaoping), Volume 3, Beijing: Renmin Chubanshe (People’s Press). Deng Xiaoping. 1994. Deng Xiaoping Wenxuan (Selected Works of Deng Xiaoping), Volume 2, Beijing: Renmin Chubanshe (People’s Press). Engels, Friedrich. 1993 (First published). 2009 (Reissued). The Condition of the Working Class in England. Edited with an Introduction and Notes by David McLellan. Oxford: Oxford University Press. Geary, Dick. 2003. ‘The Second International: Socialism and Social Democracy’, in T. Ball and R. Bellamy (eds) The Cambridge History of Twentieth-Century Political Thought. Cambridge University Press, pp. 219–238. Gramsci, Antonio. 1971. Selections from the Prison Notebooks of Antonio Gramsci. New York: International Publishers. Harding, Neil. 2003. ‘The Russian Revolution: An Ideology in Power’, in T. Ball and R. Bellamy (eds) The Cambridge History of TwentiethCentury Political Thought. Cambridge University Press, pp. 239–266. Jay, Martin (Mading Jie). 1996. 
Falankefu Xuepai Shi (The Dialectical Imagination: A History of the Frankfurt School and the Institute of Social Research, Chinese version). Guangzhou, China: Guangdong People’s Press. Jessop, Bob (with Russell Wheatley), eds. 1999. Karl Marx’s Social and Political Thought (Volume V: Marx’s Life and Theoretical Development). London and New York: Routledge. Jiang, Zemin. 2006a. Jiang Zemin Wenxuan Diyijuan (Selected Works of Jiang Zemin Volume 1). Beijing: Renmin Chubanshe (People’s Press). Jiang Zemin. 2006b. Jiang Zemin Wenxuan Disanjuan (Selected Works of Jiang Zemin), Volume 3, Beijing: Renmin Chubanshe (People’s Press).
Lenin, Vladimir I. 2008. Revolution Democracy Socialism: Lenin’s Selected Writings 1895-1923 (Edited and with an introduction by Paul Le Blanc). London: Pluto Press. Lukács, Georg. 1971. History and Class Consciousness. Cambridge, Massachusetts: The MIT Press. Mao Zedong. 1977. Mao Zedong Xuanji (Selected Works of Mao Zedong), Volume 5, Beijing: Renmin Chubanshe (People’s Press). Marx, Karl. 1973. Economic and Philosophic Manuscripts of 1844. Edited with an Introduction by Dirk J. Struik. London: Lawrence & Wishart Ltd. Marx, Karl. 1990. Capital: Volume 1: A Critique of Political Economy. Introduced by Ernest Mandel. Translated by Ben Fowkes. London: Penguin Books in association with New Left Review. McDonald, Lee Cameron. 1962. Western Political Theory: The Modern Age. New York: Harcourt, Brace & World. McLellan, David. 2003a. ‘Asian Communism’, in T. Ball and R. Bellamy (eds) The Cambridge History of Twentieth-Century Political Thought. Cambridge University Press, pp. 267–281. McLellan, David. 2003b. ‘Western Marxism’, in T. Ball and R. Bellamy (eds) The Cambridge History of Twentieth-Century Political Thought. Cambridge University Press, pp. 282–298. Moore, Barrington, Jr. 1966 (Reprint in 1993). Social Origins of Dictatorship and Democracy: Lord and Peasant in the Making of the Modern World. Boston, Massachusetts: Beacon Press. Moore, Barrington, Jr. 2018. Moral Aspects of Economic Growth and Other Essays. Ithaca and London: Cornell University Press. O’Malley, Joseph, ed. 1994. Marx: Early Political Writings. Cambridge University Press. Ricardo, David. 2004. On the Principles of Political Economy and Taxation. London: Dover Publications. Saich, Tony. 2004. Governance and Politics of China. Palgrave MacMillan. Wallerstein, Immanuel. 2011. Historical Capitalism with Capitalist Civilization. London: Verso Edition. Wang, Huning. ed. 2004. Zhengzhi De Luoji: Makesi Zhuyi Zhengzhixue Yuanli (The Logic of Politics: Marxist Principles of Political Science). 
Shanghai, China: Renmin Chubanshe (Shanghai People’s Press). Xi Jinping. 2017. The Governance of China (II). Beijing: Foreign Languages Press. Xi Jinping. 2017a. Juesheng Quanmian Jiancheng Xiaokang Shehui Duoqu Xinshidai Zhongguo Tese Shehuizhuyi Weida Shengli (Secure a Decisive Victory in Building a Moderately Prosperous Society in All Respects and Strive for the Great Success of Socialism with Chinese Characteristics for a New Era). Beijing: Renmin Chubanshe (People’s Press) Xu Datong, ed., 2005. Xifang Zhengzhi Sixiangshi (A History of Western Political Thought), Volume 4, Tianjin: Renmin Chubanshe (Tianjin People’s Press).
8 The New Institutionalism in Political Science B. Guy Peters and Jon Pierre
Institutionalism has been the foundation of political science. For many ancient political philosophers such as Aristotle, the assumption was that constructing institutions was the way in which to control human behavior and shape the outputs of governance. The same arguments for the importance of institutional structures persisted in political philosophers and analysts such as Hobbes, Locke, Montesquieu and numerous others. In the past, to think about politics and government was to think about institutions, often in the context of the institutions of the State. As political science began to differentiate itself from moral philosophy, another version of the ‘old institutionalism’ emerged. It was more empirical in its focus, and attempted to utilize formal institutions as the means for understanding and explaining differences among political systems. This old institutionalism was highly formal and legalistic, assuming that constitutions would be followed and were the primary, if not sole, basis for political action within a country. Further, this
style of institutionalism was holistic, assuming that all the institutions within a political system worked together smoothly, and were designed to produce effective governance, albeit governance of different types.1 For some good reasons the old institutionalism lost favor with scholars, especially as the emphasis shifted from structure to agency in political theory. The ‘behavioral revolution’, and later the development of rational choice approaches, shifted attention away from institutions and toward the decisions made by individuals within, and without, those institutions. But that emphasis on the individual has also been contested, and this became a principal motivation for developing the New Institutionalism. The argument, in brief, was that the emphasis on the atomistic individual gave insufficient attention to the mechanisms through which structures influenced the behavior of those individuals, both normatively and through incentives and disincentives for behavior. The New Institutionalism therefore reflects not only some roots in the distant past of
political science but also some contemporary debates. This approach, or perhaps set of approaches, does not reject many aspects of contemporary political science but, rather, argues that the emphasis on individual behavior must be understood in a context that is composed of structures. The New Institutionalism does not look at institutions as formal and immutable, but rather looks at the role of agency in creating and transforming the structures. Although there is certainly a strong emphasis on structure, the strength of contemporary institutional theory is that it emphasizes structure and agency, rather than emphasizing a stark dichotomy. Therefore, a definition of institutions would be ‘structural features which shape social behavior, persist over time and express shared values among members of the institution’ (Peters, 2019). Note, however, that the structures need not be formalized, as in the case of networks and other social institutions. In addition, the extent of the sharing of values need not be complete, and institutions may have some subcultures and internal divisions. The various strands of contemporary institutionalism (see below) all have their own perspectives on institutions, but there is a common thread among them that can be used as a definition of institutions. That is, institutions are structural features that shape social behavior and persist over time. These structures also express the shared values of their members. Some aspects of this definition may be more evident than others. For example, there is not always a tangible structure but sometimes more a structure of ideas. Nevertheless, this definition can encompass all the versions.
Varieties of New Institutionalism Although we have been discussing the New Institutionalism as an entity, perhaps the fundamental point about this body of political theory is that it does not appear to be an
integrated approach, but rather a collection of competitive, and complementary, approaches to institutions and their role in politics. As we will point out later in this essay there are many points of commonality among these various versions of New Institutionalism, but the differences often appear to outweigh the similarities. We will now proceed to discuss five versions of New Institutionalism, asking a common set of questions of each, and then later attempting to answer the rather difficult question of whether there really is a New Institutionalism.
Normative Institutionalism

James March and Johan Olsen (1984, 1989) issued the first call for the development of a New Institutionalism in political science. They argued that political science at the time they were writing, and still today, was dominated by a ‘logic of consequentiality’, meaning that individuals were assumed to calculate the consequences for themselves of any political action, and to behave according to the results of that calculus. Further, they were making those calculations as atomistic individuals, rather than as individuals embedded in a set of institutional structures that provided some guidance about behavior. The guidance given to individuals by membership in an institution was conceptualized by March and Olsen as the ‘logic of appropriateness’. That is, individuals within institutions learn the myths, symbols, routines and ideas of the institution. Those members of the institution utilize those institutional values for guidance, rather than relying upon their own ideas about the right thing to do or following their own self-interest. Stated differently, the preferences of individuals are endogenous to the institution, rather than coming from outside and being personal. The principal exception may be that individuals are already attracted to the values of an institution and that this attraction motivates their decision to join it.
The New Institutionalism in Political Science
Within normative institutionalism an institution is a collection of norms, values, routines and symbols. These may be arranged as a formal structure, or they may not, but the important thing is that the values tend to be relatively consistent and that they are accepted by the members of the institution. Further, a normative institution must invest a good deal of energy in socializing new members and maintaining the commitment of its current members. In this conception of an institution, the sanctions available to the leadership are also largely normative, rather than material. As well as needing to understand what an institution is, we also need to know how they are formed. It is rather easy to identify major existing institutions, but the process of formation may be less identifiable. The sociologist Philip Selznick (1957) argued that the process of creating institutions – institutionalization – was ‘infusing a structure with values greater than those required for the mechanical achievement of its tasks’.2 Any ordinary organization may be able to achieve certain tasks, but an institution will have meaning for its members and thereby tend to generate greater commitment from those members. Change in normative institutionalism therefore must come about through changing the normative structure of the institution. This normative change can happen in several ways. One is through leadership. If the leadership of the institution recognizes that the existing value structure is no longer functional, it may attempt to inculcate a new set of values or goals. This normative change may be significantly more difficult than change within the other forms of institutionalism to be discussed below, especially since a successful institution will have invested in creating the existing set of values. An institution or organization may face the need to introduce value change if they are so successful that they achieve their goals. 
In that case they must decide whether to set new goals or simply cease to exist. Or conversely, an institution may be forced into change
when it becomes clear that the actions of its members do not conform to the stated values of the institution. The institution would be confronted with the need to either manage behaviors to fit the norms more closely, or to change the norms. Depending upon how well institutionalized those norms have become, producing normative change may be difficult unless they are merely to conform to the informal working norms (see Azari and Smith, 2012). Change in the normative institutionalism can also occur through changing the principal input into the institution – the people who occupy roles within it. When major social and cultural change occurs outside their structures, institutions may have little option but to alter the ways in which they attempt to motivate the individuals within their structures. For example, during the 1960s the US Army had to change the ways in which it dealt with soldiers, being incapable of exercising the same strict discipline with troops who wanted things explained to them rather than following orders without question. In sum, normative institutionalism, as the name implies, assumes that institutions are formed around ideas and values. Given that basic premise, the formation and maintenance of an institution can be seen as inculcating those ideas into the members of the institution, and change involves changing the normative structure that guides the behavior of institutional members. Perhaps most fundamentally, normative institutionalism focuses on the logic of appropriateness within the institution as the guide for action for members, as opposed to their own self-interest.
Rational Choice Institutionalism

Although March and Olsen might consider the term to be a misnomer, there is a strong strand of rational choice institutionalism in political science (see Paine and Tyson, Chapter 11, this Handbook). As is true for all forms of rational choice theory, the
assumption here is that individuals are utility maximizers and that they will utilize any situation in order to maximize that utility. Participating in an institution will be no different from any other situation. In contrast to the normative institutionalism, preferences in rational choice institutionalism are assumed to be exogenous to the institution, and remain largely unaltered by the individuals participating in an institution. An institution in the rational choice approach is conceptualized as a set of incentives and disincentives for individuals. Assuming that individuals have relatively similar preferences, the designer of the institution can devise a set of incentives to produce the type of behavior desired. Institutions can also be conceptualized as sets of rules (Ostrom, 1990) that mandate that individuals do, or not do, certain things. And institutions have also been described as sets of decision points or veto points, that can delay or facilitate decisions, or that tend to produce certain types of decisions (Tsebelis, 2002). And finally, institutions can be conceptualized as a set of principal-agent relationships with the attendant problem of monitoring the possible shirking or sabotage of the agents (Brehm and Gates, 1997; Pierre and Peters, 2018). The literature on rational choice institutionalism sees institutions being formed in two dominant manners. One is through design. The individual or group responsible for designing an institution, whether they be writing a constitution or merely forming a small group, will attempt to design the incentives within the structure so that certain behaviors and certain outcomes are more probable. These designers are dealing with people so they cannot guarantee a specific outcome, but they often can be very effective. The framers of the constitution of the United States wanted to create a government in which bold action would be difficult, and they succeeded, perhaps too well. 
The other means through which institutions are formed within rational choice institutionalism is more evolutionary. Rather
than being designed, institutions evolve because individuals involved with the emergent institution negotiate among themselves to produce a set of effective rules, or at least expected patterns of behavior. Elinor Ostrom (1990) demonstrated how even self-interested individuals may be able to overcome collective action problems through devising rules that constrain their own individually rational actions. This insight then can link the study of some of the normative elements usually associated with the normative institutionalism with rational choice institutionalism.

The process of institutionalization of the institutional structures varies very much in line with the conception of formation. If the design model of creating the institution is assumed then institutionalization is not much of an issue. The individuals within the institution will make decisions according to their preferences as soon as they understand the consequences of those choices for their utility. For the more evolutionary approach to the formation of institutions, the process of institutionalization is that process of recognizing the need to overcome collective action, or perhaps transaction cost, problems and devising a solution. And for that solution to remain viable over any length of time some value must be attached to it in order to prevent defections.

Ostrom’s work, as well as that of many other rational choice scholars, demonstrates that one of the most important problems that institutions may be designed to overcome is the collective action problem, or the disjuncture between individual and collective rationality (Olson, 1965). In the tragedy of the commons, for example, the pursuit of individual gain leads to collective ruin. The difficulty of making social choices with inconsistent preferences is another rational choice problem which can be addressed through the development of institutions (Riker, 1980).
And finally, institutions can be very effective at minimizing transaction costs by creating structures that do not require the renegotiation of deals among parties, perhaps an
especially important role for international organizations. In the design mode of thinking about rational choice institutionalism change is relatively easy. All the designer must do is to alter the incentives, or rules, and the institution should produce different outcomes. This assumes, of course, that preferences are consistent and that the designer understands those preferences. In the more evolutionary mode of thinking about institutions, change would be substantially more difficult, being a process more like that in normative institutionalism. The incentives and rules would have to change more through negotiation and agreement than through imposition. In summary, it is easy to create a stereotype of rational choice institutionalism. It does make questionable assumptions about the clarity of preferences of individuals and about their presumably rather simplistic responses to incentives and disincentives. Despite that, this version of institutional theory does provide important insights into the possibilities of generating desired outcomes through the manipulation of incentives. Further, as the approach has evolved and integrated some concerns with norms associated with the institutions, it can provide a more complete picture of the ways in which political institutions function.
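The collective action problem and the role of incentives discussed in this section can be illustrated with a small numerical sketch. It is not drawn from Ostrom or Olson; it is a stylized public-goods game in which every name and number (four players, a multiplier of 2, a fine of 0.6) is an assumption chosen only for illustration. Each player holds one unit and either keeps it or contributes it to a common pool that is multiplied and shared equally. Without a rule, keeping the unit is always individually better, although universal contribution leaves everyone better off. A modest sanction on non-contributors, standing in for an institutional rule, reverses the individual calculation.

```python
# Stylized public-goods game. All parameters are illustrative assumptions,
# not values taken from the literature cited in the text.

def payoff(contributes, others_contributing, n=4, multiplier=2.0, fine=0.0):
    """Payoff to one player with an endowment of 1 unit.

    contributes          -- True if the player puts the unit into the pool
    others_contributing  -- how many of the other n - 1 players contribute
    fine                 -- sanction imposed on a non-contributor (the 'rule')
    """
    total = others_contributing + (1 if contributes else 0)
    share = multiplier * total / n          # equal share of the multiplied pool
    kept = 0.0 if contributes else 1.0      # a defector keeps the endowment
    penalty = 0.0 if contributes else fine  # only defectors are sanctioned
    return kept + share - penalty


def prefers_to_contribute(others_contributing, **rules):
    """True if contributing pays strictly more than defecting."""
    return (payoff(True, others_contributing, **rules)
            > payoff(False, others_contributing, **rules))


if __name__ == "__main__":
    for k in range(4):
        print(k, "others contribute:",
              "no rule ->", prefers_to_contribute(k),
              "| with fine ->", prefers_to_contribute(k, fine=0.6))
```

In this sketch the gain from defecting is 1 - multiplier/n = 0.5 whatever the others do, so any fine above 0.5 makes contribution the better individual choice. The example captures only the design view of institutions as incentives; it says nothing about how such a rule would be negotiated or enforced in the evolutionary view.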
Historical Institutionalism

Historical institutionalism is the third major approach to institutional theory in contemporary political science. In some ways, this approach is almost a stereotype of the usual thinking about public institutions. That is, the historical institutionalism’s fundamental argument is that, once created, institutions persist unless there is a strong force capable of changing them. Phrased somewhat differently, for critics of the public sector, public-sector institutions are hyper-stable and tend to continue doing what they have always done unless forced to do otherwise.
There is, however, more theoretical content to the historical institutionalism than the stereotypical critique of the public sector. First, this approach to institutions is based on ideas, but unlike the normative institutionalism the ideas are focused primarily on specific policies rather than on the general value orientation of the institution (Hall, 1986). The commitment to those policy ideas within the institution creates the path dependency that describes the persistence of policies even after their utility may have passed.3 Second, the historical institutionalism does have a clearly developed notion, or actually several notions, of change, again in contrast to the stereotypical view. The original idea of change was punctuated equilibrium. The idea here is that institutions will indeed continue in their initial paths until they are moved by some major shock and after some time a new equilibrium will be established. That new equilibrium will persist until there is another shock. The difficulty with this idea of change is that there is nothing in the theory that permits prediction of the punctuation. It is easy to argue that the punctuation has occurred after the fact, but little to explain when it will happen. We have argued elsewhere (Peters et al., 2005) that conflict among ideas is crucial for explaining the why of change, but still does not explain the when. The reliance upon radical change in historical institutionalism was also the target of major criticism of the approach. Although major change does occur, most change in public policy and public institutions appears to be incremental. Later conceptualization in historical institutionalism has identified four versions of gradual change. All of these versions of change provide for the continuing adjustment of the pattern of functioning within the institution, while maintaining the key aspects of the existing regimen. 
The analytic question which remains, however, is how much incremental change would accumulate to a major punctuation in the life of the institution? There is not a strong conception of institutionalization within historical institutionalism.
The idea of the ‘formative moment’ is not developed much beyond a simple description of the creation of the institution, meaning that if the idea of structure does persist then it has been institutionalized, but if it does not then it has not. The argument then is implicitly tautological. We could, however, link this version of institutionalism with the normative institutionalism to argue that the acceptance of the idea as the basis of the institution is in essence the infusing of values into the structure described by Selznick (1957). What the historical institutionalism has been developing, however, has been a set of ideas explaining the persistence of the institution that goes beyond simple inertia, although inertia can be a powerful explanation in itself (Rose, 1990). Pierson (2000), for example, argues that institutions persist because they create positive feedbacks and reinforcement for the actors involved – whether members of the institution or the clients. Sarigil (2015) has argued that the persistence of institutions can be explained merely by habit; certain ways of doing things become ingrained and are not questioned. In summary, like both the normative institutionalism and discursive institutionalism, the historical institutionalism has a strong connection with ideas, and specifically with policy ideas. The institution can be seen as the creation of some structure – whether programmatic or organizational – around the idea. The assumption is that once that idea has become accepted or institutionalized it will continue to be significant in the life of the structure until some forces produce change, no matter whether that change is gradual or extreme.
Empirical Institutionalism

A fourth version of the New Institutionalism has strong links with older ideas about institutions. What we refer to here as empirical institutionalism is based on the perhaps rather obvious notion that the formal structures
of government have consequences. What is different from the old institutionalism, however, is the more explicit attempts to build theory about the role of structure in shaping behavior. In addition, there has been some attempt to include some of the insights from behavioral political science and even rational choice models to understand the impact of structure on the performance and behavior of institutions. Although it has attempted to include more contemporary theoretical approaches, empirical institutionalism has addressed many of the classic institutional questions in political science. For example, Weaver and Rockman (1993; see also Pal and Weaver, 2003) asked the question of whether institutions matter, referring to the differences between parliamentary and presidential forms of governance. To that familiar dichotomy we can add the emergence of various forms of semi-presidentialism (Elgie et al., 2011). Empirical institutionalism has also addressed the nature of individual institutions within the public sector. For example, beginning with Nelson Polsby’s (1968) study of the US House of Representatives there has been a strand of research that focuses on the institutionalization of legislative bodies (see Palanza et al., 2012). This literature has argued that the extent to which a legislature is institutionalized will affect its capacity to legislate effectively, and to function as an effective foil to the powers of the executive (but see Judge, 2003). Yet another application of empirical institutionalism has addressed autonomous and quasi-autonomous organizations within the public sector. For example, a significant body of literature has addressed the autonomy of central banks and the importance of that autonomy for monetary policy (Fernández-Albertos, 2015). In addition, the administrative reforms associated with the New Public Management have created multiple autonomous and quasi-autonomous agencies within public administration. 
This body of institutionalist literature also raises
important questions about accountability for these autonomous institutions (Vibert, 2007). Finally, empirical institutional analysis has been used to address issues of political development. Beginning with Samuel Huntington (1968), students of development have linked the development of the capacity of political institutions with the overall development of the political system. This research has paid special attention to the development of the public bureaucracy, and its central role in creating more effective governance. Increasingly that work has focused on creating ‘islands of excellence’ (Roll, 2014) in administrative systems that otherwise may be less than excellent, and then building out from those to attempt to improve government performance more generally. We could add studies of other institutions within the public sector to this catalog of the work of empirical institutionalists. The basic point of this approach would, however, remain much the same. The argument is that the formal design of institutions matters for their performance, and for the performance of the public sector more generally (Choudhry, 2008). This approach is more allied with the rational choice perspective than with other contemporary approaches to institutionalism, if for no other reason than it assumes that it is indeed possible to alter the behavior of individuals, and the performance of institutions, simply through designing their formal structures.
Discursive Institutionalism

A fifth approach to institutionalism in political science has been described as ‘discursive institutionalism’ (Schmidt, 2010, 2011). Previously, scholars such as Colin Hay (1995, 2008) and Nicolas Jabko (2006) had discussed some of the same ideas in terms of constructivism. Vivien Schmidt, however, created a stronger conception of this logic in her discursive model. The basic logic of this approach is that institutions are defined by
ideas and the manner in which these ideas are communicated within the structure (see also Alasuutari, 2015). Unlike some, or indeed most, conceptions of institutionalism this version is not based on hierarchy or formal structures but is based more on shared patterns of communication.4 While explanations based on interest have been dominant in political science, there is also a significant strand of thought emphasizing the role of ideas (Béland, 2009) as independent sources of explanation. Although these two forms of explanation may be artificially distinct – ideas may be used to justify interests, and interests may grow out of ideas – they do represent alternative avenues for understanding the complexities of public action and of the institutions involved in that action. In discursive institutionalism, ideas dominate the explanation of institutional behavior, and constitute the foundations of institutions.

Although elaborated in somewhat different ways, the fundamental logics of discursive institutionalism and normative institutionalism are similar in several important respects. In both cases institutions are defined largely by their ideas and norms. Further, creating the institution in both approaches depends heavily on inculcating a set of values among the prospective members. In both cases the actors involved in an institution are members because of the values and ideas that the institution represents. And in both, institutional change comes through changing ideas and the associated norms, although the normative version tends to be somewhat more strongly dependent upon top-down processes for producing the change.

Although there is a basic similarity between these two versions of institutionalism, there are also several crucial differences. 
The most fundamental of these differences is that the normative institutionalism has strong roots in organizational theory and tends to take organizations as the fundamental locus for institutional activity.5 In contrast the discursive approach to institutions does not assume established organizational structures
but rather that the institutions emerge from the interaction of the members and their discourses (Hope and Raudla, 2015). Following from the interactive nature of institutions in its approach, the norms within the discursive institutionalism are more flexible and tend to be constructed through interactions, while those in the normative approach are more defined by the existing patterns of norms, symbols, routines and myths within an institution.
Informal Institutions

The idea of informal institutions may appear to many readers to be an oxymoron. After all, institutions are formal structures that persist over time, and often are too formalized and too rigid. But at the same time informal arrangements develop among political actors – individuals as well as other organizations and institutions – that have characteristics of institutions without the need to become formalized. Especially if one adopts the sociological perspective on institutions as being defined by ‘myths, symbols, routines’, etc. (Meyer and Rowan, 1977) then informal arrangements can easily be conceptualized as institutions.

Informal institutions can be as important as formal ones in defining social relationships and making governance function effectively. Many of the ‘rules’ that shape behavior in the public sector are not formal but represent the accretion of understandings among individuals across time. While the study of formal institutions has bridged public administration, comparative politics and political theory, informal institutions have been especially important for comparative politics. Informal institutions such as clans (Murtazashvili, 2016), clientelism (Stokes et al., 2013), corruption (Rose and Peiffer, 2016), consociationalism (Lijphart, 1985; Jarrett, 2016) and patrimonialism have been very important for understanding and explaining how governments work,
especially in developing and transitional societies. But even in industrial democracies informality can be important, and occurs in seemingly formal aspects of governing such as writing regulations within the public bureaucracy (Azari and Smith, 2012). And the emphasis on network governance in many northern European countries (Keast et al., 2014) can also be seen as the functioning of informal institutions linking the public and private sectors. The presence of informal institutions – political, economic and social – can be seen as providing alternative means of governing and reaching collective goals when the formal institutions of society are weak. Many of the aspects of informality are condemned by international organizations, by scholars and by many citizens in the countries affected, but they may still be essential for providing governance and supplementing weak formal institutions. These institutions may be described as working in the ‘twilight’ (Lund, 2007) or in the ‘shadows’ (Gore and Pratten, 2003; Peters, 2011) but they do function and they do provide some collective as well as individual benefits (Bratton, 2007).
What is an (Informal) Institution?

Potter Stewart, a justice on the US Supreme Court, famously said that he could not define pornography but he knew it when he saw it. Much the same appears to have been true in the study of informal institutions. The term is used widely, but often without a clear definition, and often simply to denote something that is not a formal institution. Especially in political science, the contrast is made between the formal institutions such as legislatures and courts and the more amorphous patterns of behavior that complement, or oppose, the behaviors expected from the formal structures. Although more amorphous, these patterns of behavior are institutionalized, with behavioral regularities, expectations and predictability.
Gretchen Helmke and Steven Levitsky (2004), in perhaps the most cited article on informal institutions and governance, point out that the term has been applied to a ‘dizzying array’ of phenomena in the world of politics. They do go on, however, to attempt to provide a more precise definition of the concept. They define informal institutions as ‘socially shared rules, usually unwritten, that are created, communicated and enforced outside of officially sanctioned channels’ (2004: 727). It is important to note that this definition is not hugely different from the definition of institutions in the normative institutionalism, with its emphasis on the ‘logic of appropriateness’ for defining individual behavior. The major difference from formal institutions, they argue, is the absence of official sanctions present in the formal structures, even though these patterns of behavior are still legitimate for the participants. The importance of interactions and informal modes of decision-making actually appears in Helmke and Levitsky’s own discussion of what they deem ‘complementary’ patterns of interaction between formal and informal institutions (2004: 728). They mention, for example, informal rules within bureaucracies for coordinating among organizations without having to revert to using formal, hierarchical mechanisms (see also Peters, 2015). The failure to achieve coordination may be seen as a sanction for organizations who cannot make these informal mechanisms function, but we would argue that patterns of interaction may be better conceptualized in terms of opportunities rather than as sanctions. We may therefore be able to introduce some of the ideas of the opportunity structures literature (Kitschelt, 1989) to explain
the emergence of informal institutions. While this literature has been concerned primarily with social movements (see Della Porta, 2013) it can also be used to explain the likelihood of other types of organizations and institutions forming. For example, Kitschelt’s discussion of the impact of open and closed input structures and weak and strong output structures could also be seen as making the formation of informal institutions more or less likely in particular settings. Following from the above, we can think of at least two types of informal institutions – those based on sanctions and those based on opportunities. Both of these variations of informal institutions will be able to shape the behavior of individuals and other organizations, although one will be based more on carrots and the other on sticks (Bemelmans-Videc et al., 1998). And somewhat similarly to Helmke and Levitsky we can ask whether the activities or goals of the informal institution are congruent with those of the formal institutions or not. The interaction of these two variables produces the typology presented in Table 8.1. As with the analysis of Helmke and Levitsky this typology demonstrates that informal institutions need not subvert the public sector, but rather in many instances they support the public sector and enable it to perform better than it could otherwise. Thus, informality in governance is not necessarily either positive or negative. Each case must be judged in its own context. Indeed, some patterns that are positive in some circumstances may be negative in others. For example, civil servants working informally to produce coordination in general would make positive contributions
Table 8.1 Types of informal institutions

Basis of institutions    Goals of institutions
                         Congruent             Incongruent
Sanctions                Shared enforcement    Alternative governance
Opportunities            Facilitating          Competitive

to governing, but if they collude to pursue goals other than those of their organizations then the interactions must be seen as having negative consequences. The formality or informality of institutions may also vary with context, and over time. Take for example political parties. Parties can have all the trappings of a formal institution in almost any definition. However, within the context of an institution such as a legislature the party may function more as an informal institution. Internal norms such as party unity may contribute to, or reduce, the capacity of the legislature to perform its tasks. Some aspects of clientelism may also move between formality and informality, as when the distributive nature of politics in clientelistic systems becomes more formalized. Informal institutions constitute a very real part of political life. As well as being important on their own, they are perhaps even more important as they interact with formal institutions and contribute to, or detract from, the achievement of collective goals through the processes of governing. In some instances, informal institutions can contribute to the achievement of governance goals, while in other situations they may pursue their own goals that undermine governance. Also, while we tend to think about formal institutions as involved in processes of governing through formal, legal means (Lepsius, 2013), informal institutions are more important for public participation and political democracy (Lauth, 2000).
Is There One Institutionalism?

The discussion to this point has been cataloging a variety of approaches to institutions and institutionalism. This catalog of approaches demonstrates the rich variety of ideas present in institutionalism, but it raises another significant question about contemporary institutional theory. If institutionalism is to be an alternative paradigm in political science, then this variety and the apparent internal contradictions must somehow be reconciled and some common understanding of institutions emerge. We will argue that there are some common elements in contemporary institutional theory, although the next section will point to some continuing challenges within this body of theory.

The first and most fundamental point is that structure matters. Although contemporary institutional theory is capable of integrating some aspects of agency, the underlying logic of the approach is that the structures within the political system – including informal structures – do matter. Further, individual behavior is not atomistic, but is shaped at least in part by membership in these structures. The shaping may occur because of ideas or because of incentives or through rules, but there is still some reduction in the variability of behavior.

Following from the first point, another commonality within these theories is that they are concerned with creating some regularity of behavior for individuals. Left to themselves, individuals can be rather unpredictable, and institutions are designed to reduce that unpredictability. Some level of predictability is necessary for governments to function, whether in democratic or authoritarian regimes, and institutions create some control over individuals. But the level of control that is necessary, and the level that is acceptable, will vary across political systems. And the level of control that is desirable and acceptable will also differ across institutions within the same political system.

A third defining feature of institutions is that they replicate behaviors over time, and create more predictability not just for individuals but also across time. Constitutions and law are major institutional constraints on rapid change within a political system, and thus provide some confidence for citizens and for businesses. 
That said, all institutions do change across time, responding to changing external conditions and the changing leadership within the institution itself. That gradual change, characterized in several ways in
historical institutionalism, may be endemic in institutions (Mahoney and Thelen, 2010). Finally, all versions of institutionalism require some means of understanding the interaction between individuals and the institution. There is some tendency in casual discussions of institutionalism to ignore the role of individuals, but that would be to misunderstand the fundamental nature of institutions. They are human creations and can be changed by individuals, but at the same time they shape the behavior of the individuals within them. This interaction between structure and agency is a crucial element in any understanding of the complexity of institutional life and institutional theory. In sum, there are some common elements in all versions of institutionalism in political science. But are these common elements sufficient to define a potential paradigm for the discipline, or even some sub-fields within the discipline? That is to some extent in the eye of the beholder. If that beholder does see that institutions are a pervasive part of political and social life then perhaps there is a sufficient common core to say that an integrated approach exists. But if one focuses only on the disparate approaches then there may well only be a set of partial approaches to the role of structure in politics.
What Does Institutionalism Do for Political Science? The above discussion has mentioned a number of impacts that institutionalism has had on the study of politics, but we should also identify those contributions more systematically, and especially note the ways in which institutional approaches may address issues that approaches based on ‘methodological individualism’ (Udehn, 2001) may be less adept at addressing. Indeed, institutionalism can provide some solutions to problems (theoretical as well as actual) that are posed by a focus on individual behavior.
The several discussions below address some of the main contributions of institutionalism to political science, but the contributions are in fact more pervasive. Thinking about institutions, especially the bureaucracy, has been very important in the study of public administration (Thoenig, 2003). In addition, institutional theory has made a significant contribution to the study of international relations, both for formal international institutions and for less formalized institutions such as regimes (Rittberger et al., 2012) (see Malone and Medhora, Chapter 77, this Handbook). At the other end of a scale of extensiveness, institutional theory has also contributed to urban and local governance studies (Papadopoulos, 1996). Only for those areas of the discipline that focus entirely on the micro level of behavior is institutional theory not of great relevance.
Collective Action Problems and Social Choice First, institutional analysis can be a means of addressing the familiar collective action problems arising from the disjuncture of individual and collective rationality (Olson, 1965). This literature, perhaps most famously through Garrett Hardin’s 1968 essay ‘Tragedy of the Commons’, has identified a number of situations in which the pursuit of individual utility can produce outcomes that are socially harmful (R. Hardin, 1985). The problem then becomes finding ways of curtailing the pursuit of individual maximization. This can, of course, be done by law and other authoritative institutions, but the more interesting solutions may arise from the development of institutions that do not depend so strongly on imposition. Elinor Ostrom’s work (1990) on the development of institutional answers to collective action problems is almost certainly the most famous. She demonstrated that individuals do have the capacity to develop rules and norms
among themselves that can curtail the pursuit of individual self-interest. Although the rules developed to solve these problems are created through interactions among those who are self-interested, they are still enforceable and do still constrain the actions of individuals. These rules operate at the constitutional, policy and operational levels, and through the interactions of the levels, generate effective governance. Individual rationality and preferences create other types of problems for governing as well. For example, different preference ordering of individuals may make reaching collective decisions difficult, as in the famous Arrow impossibility theorem (Arrow, 1951). In these situations, institutional rules can be created to constrain the pursuit of interests in voting and enable otherwise unstable voting patterns to create some stability (Riker, 1980). But we must be cognizant of the possible perverse effects of institutional rules that can contribute to gridlock and indecision, as in the case of American presidentialism. The above discussion of institutions has emphasized formal structures and rules, but informal institutions (see above) can perform some of the same tasks. By generating social norms, and understandings among the members of an institution, the informal institutions can supplement the formal, and produce more effective compliance. That said, however, informal institutions may have values that conflict with the formal and produce internal conflict and undermine the success of the institution. Thus, designing formal institutional structures must be married with an understanding of the norms of the participants, and some effort made to produce consistent values if there is to be consistent behavior.
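The unstable voting patterns mentioned above in connection with Arrow (1951) and Riker (1980) can be made concrete with a small sketch. It assumes three hypothetical voters with cyclical preference orderings over three options, and a hypothetical institutional rule (a fixed agenda of pairwise votes). Neither the ballots nor the function names come from the literature cited; they are illustrative only.

```python
# Hypothetical preference orderings for three voters (most preferred first).
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

def pairwise_winner(x, y, ballots):
    """Return the option a strict majority ranks higher, or None on a tie."""
    x_votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    y_votes = len(ballots) - x_votes
    if x_votes == y_votes:
        return None
    return x if x_votes > y_votes else y

def condorcet_winner(options, ballots):
    """Return the option that beats every other option pairwise, else None."""
    for candidate in options:
        if all(pairwise_winner(candidate, other, ballots) == candidate
               for other in options if other != candidate):
            return candidate
    return None

def agenda_outcome(agenda, ballots):
    """Assumed institutional rule: the survivor of each vote faces the next option.

    Assumes an odd number of voters, so no pairwise vote ends in a tie.
    """
    survivor = agenda[0]
    for challenger in agenda[1:]:
        survivor = pairwise_winner(survivor, challenger, ballots)
    return survivor

options = ("A", "B", "C")
print("Condorcet winner:", condorcet_winner(options, ballots))
for agenda in [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]:
    print("agenda", agenda, "->", agenda_outcome(agenda, ballots))
```

Without an agenda rule, majority voting here cycles and no option beats all others. With a fixed agenda there is a determinate outcome, but the outcome depends on the order of votes, which is one way of seeing both the stabilizing and the potentially perverse effects of institutional rules.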
Policy Choices Institutional theory also provides mechanisms for understanding the policy choices made by governments. Public policy emerges
from processes within institutions, and among institutions, and these structures help to understand better what choices emerge from those processes. While that understanding of the linkage between institutions and policies often has been more implicit than explicit, there is an emerging body of literature linking institutional design issues to policy design (Peters, forthcoming). The simplest version of the linkages between institutions and policy has been provided by the historical institutionalists (see Capano, Chapter 64, this Handbook). The argument is rather simply that institutions lock in the policy designs made at the time of the formation of the institution, and those policies will persist unless acted upon by some significant political force, and perhaps some conflict (Peters et al., 2005). Leaving aside the path dependency central to this argument, there is an assumption that these policy choices are shaped by the ideas held by the institution and its members, an assumption that is similar to that of the normative institutionalism. The empirical and the rational choice versions of institutionalism both assume greater importance for structures than for ideas in shaping policy. While the other versions of institutionalism may provide better explanations for the content of policy, these two approaches may provide better explanations for the capacity to make policy at all, and the level of innovation contained within the policy. For example, George Tsebelis’s (1990) concept of veto points within institutions provides a useful means of assessing how decisions could be made, as does Fritz Scharpf’s (1997) conceptualization of games within institutions and their impacts on decisions.
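The logic by which veto points constrain decisions can be sketched under strong simplifying assumptions: a single policy dimension, actors who prefer proposals closer to their own ideal point, and a rule that every actor holding a veto must prefer a proposal to the status quo. The function names and numbers below are hypothetical and are not taken from Tsebelis or Scharpf.

```python
def acceptable_interval(ideal, status_quo):
    """Open interval of proposals an actor strictly prefers to the status quo.

    Assumes one policy dimension and preferences that decline with distance
    from the actor's ideal point.
    """
    reflection = 2 * ideal - status_quo
    return (min(status_quo, reflection), max(status_quo, reflection))

def winset(ideals, status_quo):
    """Proposals acceptable to every veto holder, or None if there are none."""
    intervals = [acceptable_interval(i, status_quo) for i in ideals]
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low < high else None

status_quo = 0.0
print(winset([4.0], status_quo))             # one veto holder
print(winset([4.0, 1.0], status_quo))        # a second, more moderate veto holder
print(winset([4.0, 1.0, -1.0], status_quo))  # a third on the other side
```

In this sketch each additional veto holder can only shrink, never enlarge, the range of changes that can be adopted, and a veto holder on the opposite side of the status quo removes it entirely. That speaks to the capacity to make policy at all, rather than to the content of the policy chosen.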
Comparative Politics Finally, and perhaps most importantly, institutional theory is essential to the study of comparative politics and governance. As already noted, comparative politics had its
roots in the study of institutions, and despite some movement away from a focus on institutions, understanding how different political systems function requires an understanding of their institutions. That understanding may be more analytic than that of the traditional institutionalists, but it is nonetheless central to any comparative analysis. First, institutions help to explain the stability of political systems, at both a meso and a macro level. As was true for the study of collective action problems mentioned above, institutional rules within legislatures and among the array of institutions help to generate equilibrium (Gualini, 2001) when that might otherwise be difficult to attain. And at a more macro level of comparison, the development of institutional structures such as consociationalism (Lijphart, 1996) and elite pacts (Durant and Weintraub, 2014) can help create greater stability and comity than would be possible otherwise. Institutions also help explain how decisions are made in governments. As mentioned for the study of public policy, the structure of veto points, or clearance points as discussed in the study of implementation, can predict the relative levels of success in decision-making. And the literatures on electoral laws (Shugart and Taagepera, 2017) and coalition rules (Martin and Vanberg, 2011) demonstrate how institutions can alter the manner in which governments are formed and function after being formed. And institutions also structure the flow of information within governments, which in turn shapes decisions.
Challenges to Institutionalism To this point, we have been singing hymns of praise to institutionalism but we should also consider some of the challenges that are faced by institutional theory. One of the principal challenges is integrating the various versions of institutional theory, a question discussed above. In addition, some of the
challenges we will discuss are familiar critiques of institutionalism, but others are also possible avenues for advancing the power of the approach as a potential organizing frame for political science. Both of these types of questions about institutionalism, however, should be considered as we assess the capacity of this general approach to function as a potential paradigm for political science.
Change The conventional critique of institutional theory is that it is incapable of dealing adequately with change. The virtue of institutions is in part that they provide stability and predictability, but that virtue is also a vice if we consider the need to respond to changing environmental and political circumstances. This critique is in part valid, but also may be overstated. Each of the approaches to institutions discussed above does contain some idea about how an institution can change. Some of those ideas about change are better developed and more convincing than are others, but institutional theory has not ignored the issue of change. Perhaps the most challenging aspect of change in institutional theory is the capacity to create designed, purposive change. Many of the approaches to change describe how change occurs rather than offering means of producing intentional change. For example, the four versions of institutional change developed within historical institutionalism describe types of change which can be observed in institutions, but much less is said about how the institutional designer might be able to produce those changes. Likewise, change in normative institutionalism tends to occur through the emergence of value differences among members of the institution, or between the institution and the surrounding society. The above having been said, rational choice institutionalism and discursive institutionalism do have some ideas about more
purposive change. In rational choice institutionalism change occurs rather easily, with the designer being capable of altering rules and other incentives (and disincentives) and thus being able to produce changes in behavior. Change in discursive institutionalism is perhaps less designed, but it is still assumed to be more characteristic of institutions than in other versions of the theory. The continuing discourse among members of the institution is assumed to produce continuing change within that institution. The theoretical conceptions of change in institutions tend to correspond better to observations of the real world than does an insistence that institutions are stable and immutable, or that institutional theory is insensitive to change. Even institutions that persist for long periods of time are not really unchanged but adapt to their environments, or are changed by more formal means. The ‘unwritten’ constitution of the UK may be praised for its adaptability, but other constitutions also change and manage to keep up with changing political demands and values, or they are replaced.
Institutionalization Another challenge to institutional theory is not unrelated to the study of institutional change. Most institutional theory discusses institutions in dichotomous terms – they exist or they do not. In reality, however, institutions may actually be more or less in existence.6 The process of institutionalization, and its reverse of deinstitutionalization (Oliver, 1992), represent the movement back and forth between more inchoate, or more mechanical, states into that of being an institution. For example, Max Weber’s work on institutions (see Lepsius, 2017) emphasized the need to think of institutions as emerging and decaying, and Eisenstadt’s (1959) work on bureaucratization and debureaucratization had much the same emphasis. Selznick’s conceptualization of institutionalization, being the infusing of
values into a structure, provides a good starting point for considering the development of institutionalization. This view, shared by the normative institutionalism in political science, emphasizes that an institution must be more than a mere formal structure in which the members simply follow the formal rules. The structure must have some meaning for the participants. This is the difference between working in a university and working in a fast-food restaurant (Selznick, 1949). When discussing governance problems of developing countries, Samuel Huntington (1968) conceptualized institutionalization more in terms of building capacity within government structures. Thus, in this approach from empirical institutionalism, to the extent that the institutions were adaptable, autonomous, coherent and sufficiently complex they could be said to be institutionalized. The same logic has been applied to examining the development of other organizations and institutions, such as the Executive Office of the President in the United States (Ragsdale and Theis, 1997). Given the importance of institutionalization and deinstitutionalization for understanding the waxing and waning of institutions, the absence of theorizing about the process is significant. One of the few examples of conceptualizing these processes, by Christine Oliver (1992), argues that deinstitutionalization occurs for political, social and functional reasons. Although this conceptual scheme is directed at deinstitutionalization, it can be run backwards to help explain the process of institutionalization.
Measurement Thinking about institutionalization and deinstitutionalization does raise another challenge to institutional theory – measurement. If we say that a structure becomes more or less institutionalized over time, how can we measure that? An in-depth analysis by an
individual scholar may convince him or her of those changes, but how can the differences be conveyed to others in ways that are replicable? Oliver’s work on deinstitutionalization provides some inklings about measurement, and Ragsdale and Theis (1997) provide measures for the ideas that Huntington advanced about institutionalization, but for the most part we lack the means of assessing the extent to which any structure is really an institution. And other questions raised in institutional theory also may require greater attention to measurement. For example, in the discussions of change in historical institutionalism we discuss changes such as ‘drift’ and ‘conversion’ without any clear mechanisms for measuring those changes. If those categories of change are to be more useful for understanding policy, they will require some means of assessing the extent of the drift, as well as the extent to which the policies and practices that existed previously do still exist. And rational choice versions of institutionalism that focus on compliance with rules may want to have some assessment, quantitative or qualitative, of the degree of compliance. This list could be extended, but the basic point is that advancing institutional theory will require some greater attention to measurement.
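One way of picturing what a replicable measure of institutionalization might look like is to treat it as a matter of degree, in the spirit of the fuzzy-set analogy in note 6, scored on the four criteria attributed above to Huntington. The sketch below is an assumption-laden illustration: the scores are invented, and taking the minimum across criteria (the usual fuzzy-set rule for logical AND) is only one possible aggregation rule. It is not the measure used by Ragsdale and Theis (1997).

```python
# The four criteria named in the text; scores are hypothetical values in [0, 1].
CRITERIA = ("adaptability", "autonomy", "coherence", "complexity")

def institutionalization_score(scores):
    """Degree of membership in the set of 'institutions', between 0 and 1.

    Assumption: a structure is only as institutionalized as its weakest
    criterion, so the fuzzy-set AND (the minimum) is used to aggregate.
    """
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    values = [scores[c] for c in CRITERIA]
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("scores must lie in [0, 1]")
    return min(values)

# Invented scores for one hypothetical agency at two points in time.
agency_early = {"adaptability": 0.3, "autonomy": 0.4, "coherence": 0.5, "complexity": 0.2}
agency_late = {"adaptability": 0.8, "autonomy": 0.6, "coherence": 0.7, "complexity": 0.9}

print("early:", institutionalization_score(agency_early))
print("late:", institutionalization_score(agency_late))
```

The hard part remains the one identified in the text: assigning the underlying scores in a way that other scholars can replicate. The sketch only shows that, once such scores exist, statements that a structure has become more or less institutionalized can be made explicit and compared.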
Processes within Institutions A fourth challenge to institutional theory comes from the absence of theorizing about processes within institutions, an issue somewhat related to the study of change within institutions. Institutional theory is very good at describing and explaining structures, but contains little to describe or explain what happens within those structures. We know, for example, that rules or values may be important for shaping the behavior of individuals within a structure, but not how those rules or values produce outcomes. Veto points may, for example, make decisions
more difficult within institutions, but there is still relatively little said about the processes that do actually produce decisions. The above said, several of the approaches to institutionalism do have something to say about the policy process. For example, the empirical institutionalism has attempted to understand the consequences of different institutional choices for the policies being produced, especially between parliamentary and presidential systems (Weaver and Rockman, 1993). While not explicitly dealing with internal processes, the argument is based on the consequences of ‘separated’ versus integrated systems, and the consequent differences in the manner in which policy choices are made. In addition, the discursive institutionalism implies a process of dialogue within the institution that produces decisions. The actual content of that process is not clearly articulated, but there is some sense that institutional processes do matter for the choices made by institutions.
Individuals and Institutions Yet another challenge to institutional theory involves the linkage between individuals and the institutional structure. For some versions of institutionalism, notably the normative and the discursive versions, this linkage is crucial. But in all versions there is a question of how individuals shape the institutions of which they are members, and how the institutions shape the individuals who function within their confines. Given the argument above about the need to link structure and agency within institutional theory, these linkages will depend on a variety of causal mechanisms embedded in the institutions or in society (Beach, 2013). Robert Grafstein (1992) noted that institutionalism had an important paradox residing at its center. Institutions were assumed to constrain the behavior of individuals, but individuals created those institutions. Of course the institutions in question could
have been created decades or centuries earlier by different individuals, but they are still human creations that can be changed by other humans. But we do design institutions to place limits on individual behaviors, whether that behavior is of a president or prime minister constrained by a constitution, or a criminal constrained by legal institutions. As well as this paradox of constraint on behavior implied by institutionalism, it is important to examine the capacity of institutions to mold the ideas and values of individuals. In particular, normative institutionalism assumes that an institution will be capable of persuading, or even forcing, individuals to accept a ‘logic of appropriateness’ that will guide their behaviors.7 But these discussions of institutionalism do not provide a clear analysis of the mechanics through which values are inculcated into the members of the institution.
Explanation or Only Context? Finally, we should ask a fundamental question about the utility of institutional theory as a paradigm for political science. Do institutions, and does institutional theory, really provide an explanation for political phenomena, or do they only provide the context for the operation of other mechanisms that are more proximate explanations for behaviors? And if we are not able to explain behavior at an individual level, can we use institutions to explain the outcomes of policymaking or other major events in the public sector or, again, are we only capable of providing context for those outcomes? The discussion above concerning comparative politics and policymaking does appear to argue that institutional rules can produce outcomes. But the outcomes being explained are often in terms of the speed of decision-making or the size of a coalition, rather than the substance of the policies selected or the parties involved in the coalition. Those characteristics of the outcomes are important, but not as important perhaps as the actual substance of the policy and political choices being made.
Again, does institutionalism only explain the context of the actual decisions being made? The normative institutionalism comes the closest to this level of explanation, given that the values contained within the institution should influence policy choices. One can make both positive and negative arguments about the explanatory capacity of institutional theory. On the one hand establishing the link between institutional, contextual factors and outcomes does advance the understanding of causality in itself (see Pollitt, 2013). On the other hand, however, institutions and institutional theory do not provide an explanation of the dynamics involved in making decisions without invoking some other form of explanation. The question then may become what level of explanation is required to say that the requirements for an explanation have been fulfilled?
Summary and Conclusions Institutionalism is one of the oldest approaches to political science. It has, however, changed significantly over the centuries during which scholars and ordinary citizens have discussed the institutions of government and their role in governing. The ‘New Institutionalism’ discussed here contains a number of alternative conceptions of institutions, but all differ from older versions of institutionalism in their explicit concern for building theory about institutions, and in their linkages to other forms of theorizing in the social sciences. Despite the advances made by the New Institutionalism, there are a number of problems remaining within this approach. The most fundamental is that, rather than being an integrated approach, it remains a collection of alternative visions about the nature of institutionalism. Despite some common elements, the approaches also have fundamentally different perspectives on what an institution is and how it functions.
Contemporary institutional theory also faces a number of other challenges. But despite those problems it does manage to provide important insights into the way in which the public sector functions. By attempting to integrate understandings of individual behavior with an understanding of structure, the New Institutionalism does provide a viable bridge across the structure–agency divide in political science theory, and does provide insights into how governments actually function.
Notes
1 This is in contrast to contemporary students of institutions, who argue that institutions often are formed at different times and for different purposes, and do not necessarily work together well. See Skowronek (1982).
2 The normative institutionalism in political science is based heavily on the work of sociologists such as Meyer and Rowan (1977) and Selznick (1957). For other perspectives on institutionalism in sociology see Peters (2019, chapter 7).
3 The idea of path dependency comes from economics, where it was used to describe the lock-in of some products even though technically superior products may exist (Arthur, 1989). The costs of moving away from the sub-optimal solution are too great so the older system persists.
4 There is an obvious connection with the Habermasian ideas of communications and discourse within this approach (Habermas, 1984).
5 The original statement of the normative approach by March and Olsen (1984) discussed institutionalism as the ‘organizational basis of political life’.
6 In this sense, institutions may be analogous to variables in fuzzy set Qualitative Comparative Analysis (QCA). They may be in the set of institutions in part, but not entirely.
7 The normative institutionalism tends to focus on softer means of creating commitment to the institution, but the fundamental point is that the logic may be created through more means, for example, creating esprit de corps in military institutions.
References
Alasuutari, P. (2015), ‘The Discursive Side of the New Institutionalism’, Cultural Sociology 9(2): 162–84.
Arrow, K. J. (1951), Social Choice and Individual Values (New Haven, CT: Yale University Press).
Arthur, W. B. (1989), ‘Competing Technologies, Increasing Returns, and Lock-in by Historical Events’, The Economic Journal 99(394): 116–31.
Azari, J. R. and J. K. Smith (2012), ‘Unwritten Rules: Informal Institutions in Established Democracies’, Perspectives on Politics 10(1): 37–55.
Beach, D. (2013), ‘Taking Mechanisms Seriously?’, European Political Science 12(1): 13–15.
Béland, D. (2009), ‘Ideas, Institutions and Policy Change’, Journal of European Public Policy 16(5): 701–18.
Bemelmans-Videc, M.-L., R. C. Rist and E. Vedung (1998), Carrots, Sticks and Sermons: Policy Instruments and their Evaluation (New Brunswick, NJ: Transaction Publishers).
Bratton, M. (2007), ‘Formal Versus Informal Institutions in Africa’, Journal of Democracy 18(3): 96–110.
Brehm, J. and S. Gates (1997), Working, Shirking and Sabotage: Bureaucratic Response to a Democratic Public (Ann Arbor, MI: University of Michigan Press).
Choudhry, S. (2008), Constitutional Design for Divided Societies: Integration or Accommodation? (Oxford: Oxford University Press).
Della Porta, D. (2013), ‘Political Opportunity/Political Opportunity Structure’, Wiley-Blackwell Encyclopedia of Social and Political Movements (Oxford: Blackwell).
Durant, T. C. and M. Weintraub (2014), ‘How to Make Democracy Self-enforcing After Civil War: Enabling Credible and Adaptable Elite Pacts’, Conflict Management and Peace Science 10: 1–20. https://doi.org/10.1177/0738894213520372 [Accessed on: 20 December, 2019]
Eisenstadt, S. N. (1959), ‘Bureaucracy, Bureaucratization and Debureaucratization’, Administrative Science Quarterly 4(3): 302–20.
Elgie, R., S. Moestrup and Y.-S. Wu (2011), Semi-Presidentialism and Democracy (London: Macmillan).
Fernández-Albertos, J. (2015), ‘The Politics of Central Bank Independence’, Annual Review of Political Science 18: 217–37.
Gore, C. and D. 
Pratten (2003), ‘The Politics of Plunder: The Rhetorics of Order and Disorder
in Southern Nigeria’, African Affairs 102(407): 211–40.
Grafstein, R. (1992), Institutional Realism (New Haven, CT: Yale University Press).
Gualini, E. (2001), Planning and the Intelligence of Institutions (London: Routledge).
Habermas, J. (1984), The Theory of Communicative Action, Volume 1: Reason and Rationalization in Society (London: Heinemann).
Hall, P. A. (1986), Governing the Economy: The Politics of State Intervention in Britain and France (New York: Oxford University Press).
Hardin, R. (1985), Collective Action (Baltimore, MD: Johns Hopkins University Press).
Hay, C. (1995), ‘Structure and Agency’, in D. Marsh and G. Stoker, eds, Theory and Methods in Political Science, pp. 132–51. (New York: St. Martin’s Press).
Hay, C. (2008), ‘Constructivist Institutionalism’, in S. A. Binder, R. A. W. Rhodes and B. A. Rockman, eds, Oxford Handbook of Political Institutions, pp. 56–72. (Oxford: Oxford University Press).
Helmke, G. and S. Levitsky (2004), ‘Informal Institutions and Comparative Politics: A Research Agenda’, Perspectives on Politics 2(4): 725–40.
Hope, M. and R. Raudla (2015), ‘Discursive Institutionalism and Policy Stasis in Simple and Compound Polities: Estonian Fiscal Policy and United States Climate Change Policy’, Journal of Policy Studies 33(5): 399–418.
Huntington, S. P. (1968), Political Order in Changing Societies (New Haven, CT: Yale University Press).
Jabko, N. (2006), Playing the Market: A Political Strategy for Uniting Europe, 1985–2005 (Ithaca, NY: Cornell University Press).
Jarrett, H. (2016), ‘Consociationalism and Identity in Ethnically Divided Societies: Northern Ireland and Malaysia’, Studies in Ethnicity and Nationalism 16(3): 401–15.
Judge, D. (2003), ‘Legislative Institutionalization: A Bent Analytic Arrow?’, Government and Opposition 38(4): 497–516.
Keast, R., M. Mandell and R. Agranoff (2014), Network Theory in the Public Sector: Building New Theoretical Frameworks (London: Routledge).
Kitschelt, H. 
(1989), The Logics of Party Formation: Ecological Politics in Belgium and West Germany (Ithaca, NY: Cornell University Press).
Lauth, H.-J. (2000), ‘Informal Institutions and Democracy’, Democratization 7(4): 21–50.
Lepsius, M. R. (2013), ‘Institutionenanalyse und Institutionenpolitik’, in Lepsius, ed. Institutionalisierung politischen Handelns (Wiesbaden: Springer).
Lepsius, M. R. (2017), ‘The Institutionalization of Rationality Criteria and the Role of Intellectuals’, in C. Wendt, ed. Max Weber and Institutional Theory, pp. 187–212. (Bern: Springer).
Lijphart, A. (1985), Power-Sharing in South Africa (Berkeley: Institute of International Studies, University of California).
Lijphart, A. (1996), ‘The Puzzle of Indian Democracy: A Consociational Interpretation’, American Political Science Review 90(2): 258–68.
Lund, C. (2007), Twilight Institutions: Public Authority and Local Politics in Africa (Malden, MA: Blackwell).
Mahoney, J. and K. Thelen (2010), ‘A Theory of Gradual Institutional Change’, in Mahoney and Thelen, eds, Explaining Institutional Change: Ambiguity, Agency and Power, pp. 3–28. (Cambridge: Cambridge University Press).
March, J. G. and J. P. Olsen (1984), ‘The New Institutionalism: Organizational Factors in Political Life’, American Political Science Review 78(3): 738–49.
March, J. G. and J. P. Olsen (1989), Rediscovering Institutions: The Organizational Basis of Politics (New York: Free Press).
Martin, L. W. and G. Vanberg (2011), Parliaments and Coalitions: The Role of Legislative Institutions in Multiparty Government (Oxford: Oxford University Press).
Meyer, J. W. and B. Rowan (1977), ‘Institutionalizing Organizations: Formal Structure as Myth and Ceremony’, American Journal of Sociology 83(2): 340–63.
Murtazashvili, J. B. (2016), Informal Order and the State in Afghanistan (New York: Cambridge University Press).
Oliver, C. (1992), ‘The Antecedents of Deinstitutionalization’, Organization Studies 13(4): 563–88.
Olson, M. (1965), The Logic of Collective Action: Public Goods and the Theory of Groups (Cambridge, MA: Harvard University Press).
Ostrom, E. 
(1990), Governing the Commons: The Evolution of Institutions for Collective
Action (Cambridge and New York: Cambridge University Press).
Pal, L. A. and R. K. Weaver (2003), The Government Taketh Away: The Politics of Pain in the United States and Canada (Washington, DC: Georgetown University Press).
Palanza, V., C. Scartascini and M. Tommasi (2012), On the Institutionalization of Congress(es) in Latin America and Beyond (Washington, DC: Inter-American Development Bank).
Papadopoulos, A. G. (1996), Urban Regimes and Strategies (Chicago: University of Chicago Press).
Peters, B. G. (2011), ‘Governing in the Shadows’, Asia Pacific Journal of Public Administration 33(1): 1–16. doi: 10.1080/23276665.2011.10779375.
Peters, B. G. (2015), Pursuing Horizontal Management: The Politics of Public Sector Coordination (Lawrence: University Press of Kansas).
Peters, B. G. (2019), Institutional Theory in Political Science: The New Institutionalism, 4th ed. (Cheltenham: Edward Elgar).
Peters, B. G. (forthcoming), ‘Institutional Design and Policy Design: Making the Linkage’, Policy & Politics.
Peters, B. G., J. Pierre and D. S. King (2005), ‘The Politics of Path Dependency: Political Conflict in Historical Institutionalism’, Journal of Politics 67(4): 1275–1300.
Pierre, J. and B. G. Peters (2017), ‘The Shirking Bureaucrat: A Theory in Search of Evidence?’, Policy and Politics 45(2): 157–72.
Pierson, P. (2000), ‘Increasing Returns, Path Dependence, and the Study of Politics’, American Political Science Review 94(2): 251–67.
Pollitt, C., ed. (2013), Context in Public Policy and Management: The Missing Link? (Cheltenham: Edward Elgar).
Polsby, N. (1968), ‘The Institutionalization of the U.S. House of Representatives’, American Political Science Review 62(1): 144–68.
Ragsdale, L. and J. J. Theis, III (1997), ‘The Institutionalization of the American Presidency 1924–92’, American Journal of Political Science 41(4): 1280–1318.
Riker, W. H. 
(1980), ‘Implications from the Disequilibrium of Majority Rule for the Study of Institutions’, American Political Science Review 74(2): 432–46.
151
Rittberger, V., B. Zangl and A., Kruck (2012), International Organization, 2nd ed. (Basingstoke: Macmillan). Roll, M. (2014), ‘The State that Works: A “Pockets of Effectiveness” Perspective on Nigeria and Beyond’, in T. Bierschenk and Olivier de Sardan, eds, States at Work: Dynamics of African Bureaucracies, pp. 213–34. (Leiden: Brill). Rose, R. (1990), ‘Inheritance Before Choice in Public Policy’, Journal of Theoretical Politics, 2(3): 263–91. Rose, R. and C. Peiffer (2016), Bad Governance and Corruption (Basingstoke: Palgrave Macmillan). Sarigil, Z. (2015), ‘Showing the Path to Path Dependence: The Habitual Path’, European Political Science Review 7(2): 221–42. Scharpf, F. W. (1997), Games Real Actors Play: Actor-Centered Institutionalism in Policy Research (Boulder, CO: Westview Press). Schmidt, V. A. (2010), ‘Taking Ideas and Discourse Seriously: Explaining Changes Through Discursive Institutionalism as the Fourth “New Institutionalism”’, European Political Science Review 2(1): 1–25. Schmidt, V. A. (2011), ‘Reconciling Ideas and Institutions Through Discursive Institutionalism’, in D. Béland and R. H. Cox, eds, Ideas and Politics in Social Science Research (Oxford: Oxford University Press). Selznick, P. (1949), TVA and the Grass Roots: A Study in the Sociology of Formal Organization (Berkeley: University of California Press). Selznick, P. (1957), Leadership in Administration (New York: Harper and Row). Shugart, M. S. and R. Taagepera (2017), Votes from Seats: Logical Models of Electoral Systems (Cambridge: Cambridge University Press). Skowronek, S. (1982), Building a New American State: The Expansion of National Administrative Capacities, 1877–1920 (Cambridge: Cambridge University Press). Stokes, S. C., T. Dunning, M. Nazareno and V. Brusco (2013), Brokers, Voters, and Clientelism: The Puzzle of Distributive Politics (New York: Cambridge University Press). Thoenig, J.-C. 
(2003), ‘Institutional Theories and Public Institutions: Traditions and Appropriateness’, in B. G. Peters and J. Pierre, eds, The Sage Handbook of Public Administration, pp. 169–78. (London: Sage).
152
The SAGE Handbook of Political Science
Tsebelis, G. (1990), Nested Games: Rational Choice in Comparative Politics (Berkeley: University of California Press). Tsebelis, G. (2002), Veto Players: How Political Institutions Work (Princeton, NJ: Princeton University Press). Udehn, L. (2001), Methodological Individualism: Background, History and Meaning (London: Routledge).
Vibert, F. (2007), The Rise of the Unelected: Democracy and the New Separation of Powers (Cambridge: Cambridge University Press). Weaver, R. K. and B. A. Rockman, eds. (1993), Do Institutions Matter?: Government Capabilities in the United States and Abroad (Washington, DC: The Brookings Institution Press).
9
How to Understand Normative Political Theory
Furio Cerutti
What is this chapter about? The answer might not be as self-evident as many are inclined to think. Also, if we wish to rethink the relevant notions rather than conforming to the mainstream, it is convenient to make clear which language conventions we are going to follow. By norm let us understand a sentence that indicates in an imperative mode (‘ought’) how to behave in certain fields of action and that is linked to a general rule (‘principle’). Normativity is the effective presence of norms in a certain field (e.g. morality, politics, aesthetics) of human activity, accompanied by reasons or justifications for their validity. Normativity (‘ought’) is opposed to facticity (‘is’) and ‘normative’ is different from ‘evaluative’, in the sense that acknowledging a value does not translate immediately into an imperative to realize it. Whether ‘normative’ and ‘prescriptive’ are the same or not shall here remain open.1 We can look at instances of normativity either from a sociologically descriptive (as observers) or properly normative (as participants) vantage point, the latter being prevalent but not exclusive in this chapter. Trickier is lastly to define what
we mean by normativism: in my understanding the belief that an entire sphere of human activity can be and ought to be regulated by a norm system descending from a supreme principle of action such as justice or utility or God’s will. Plato (in Republic and Laws from Book IV on), Kant and Rawls are in various fashions champions of this normativism. Having clarified the key notions of this chapter, let us do the same in the First Section with the very notion of normative political theory, though in a way fairly different from the usual one.
1. A FIELD REDEFINED: WHAT IS NORMATIVE POLITICAL THEORY?
In conventional academic language, normative political theory consists of the thoughts an array of important philosophers – from Plato to John Rawls – have formulated with regard to the best possible shape to give to politics and the polity under the dominance of supreme principles; the Romans spoke of a reasoning
de optima republica (about the best possible commonwealth), while in German philosophy the contrast is Sollen vs Sein (which very roughly corresponds to ‘ought’ vs ‘is’). I will not follow this traditional path for the substantive reasons that will be illustrated below, but also because my primary interest goes to the role and the forms of normativity in real politics rather than the development of models, which may be of interest to the philosophers’ guild alone. After clarifying some basic notions in Section one, we remind ourselves that normativity exists in politics and in political thought also outside the majestic and pyramidal architectures of Platonic normativism – a pluralistic, less pretentious normativity that is however leaving its mark. In the second Section, we will look at the maxims, principles and theories or semi-theories that were or are effectively at work in political reality. This is to show how the really influential norms, along with their conceptual background, are born from political and legal developments rather than from the history of philosophy. In the same spirit, in the third Section we then turn to two substantive issues that have greatly stimulated the production of norms (war) or are now starting to do so (global and lethal challenges) – a bottom-up path, from facts to norms, as it were. In Section four we will address two well-known questions informing meta-normative debates: the relationship between politics and ethics and the much-debated role of ‘ideal theory’. Needless to say, in all this we will go back whenever necessary to past thinkers, but by no means identify normative political theory with its own history, which can (and must) be studied by using the appropriate tools.2 Why not? First, the history of a theory is not exhaustive of the substance of what the theory is about; a conceptual exploration of its very substance is needed, though accompanied by the historical awareness of former explorations and vocabularies. 
Second, one thing is the historical sequence of the normative models philosophers wanted the polity to conform to, quite another one the effective influence of norm systems on the behaviour
of political actors. Plato’s Republic did not shape Greek policy in the 4th century bce, nor Kant’s political writings Prussia’s policies in the late 18th and early 19th centuries. Nor can it be said that those works credibly mirrored the beliefs held by relevant political actors. Normative political philosophy can, in a word, have a very thin relationship to real politics as practised by politicians, movements, parties, institutions, alliances – a relationship that can border on irrelevance.3
2. A PARADE OF NORMATIVE FORMULAS AFFECTING POLITICS IN THE HISTORY OF WESTERN POLITICS AND POLITICAL THOUGHT
In Western political culture three major sources of normativity can be identified: the Judeo-Christian tradition, the Roman heritage, and modernity with its competing (realism–idealism) positions. We shall examine a selection of those principles or maxims that arguably had more influence in giving both formulation and justification to the course of action that actors intended to follow. We shall also try to sketch the more general implications, i.e. the implicit normative theories contained in those formulations. We are all aware of the fact that political actors do not usually act merely in application of the principles they claim to respect; nor do they, on the other hand, use lofty principles exclusively in order to surreptitiously justify their own self-interest, although they often do. They mostly act neither as fully principled players nor as cynical manipulators, but out of a mix of those extreme attitudes that cannot be expressed once and for all in a formula.
2a. The Judeo-Christian Tradition
The main Judeo-Christian normative statements are:
• Let us barely mention the Golden Rule (‘and as you wish that men would do to you, do so to them’, Luke 6:31), which highlights the nature of commutative/retributive justice as the pillar of civil coexistence, as well as the Ten Commandments.
• ‘Give back to Caesar what is Caesar’s and to God what is God’s’ (Mark 12:17): whatever the interpretation given to this statement made by Jesus, here lies the foundation of the exclusively Western/Christian distinction of civil and religious spheres and the legitimation of the political organization as different from the religious one.
• The Sermon on the Mount (Matthew, 5–7), with its unprecedented praise for self-restraint (‘turn the other cheek’) and love of the enemies as well as promises of heavenly reward to the meek, the poor in spirit, the pure in heart and the peacemakers. This revolutionary manifesto inspired the normative system of radical Christianity (Bogomils, Cathars, Anabaptists) and remains an important document of Liebesakosmismus (as Max Weber dubbed the estrangement from the world in the name of love).
• ‘Remota … iustitia quid sunt regna nisi magna latrocinia?’ (‘once justice is removed, what are kingdoms if not large larcenies?’ – City of God, Augustine of Hippo, 426 ce, 4,4). This is the original and powerful formulation of the (not only Christian) challenge to political power, which is asked to prove its legitimacy beyond mere facticity by proving its adherence to justice.
2b. Political Identity and Principles of Law in Greece and Rome
1 In a conversation with the Persian king, Xerxes, the self-image of Greeks – more precisely of Spartans – is defined by Demaratos, the former Spartan king, as relying on liberty and the dependence on a sole ruler, the law or nomos. This centrality of the abstract and impersonal norm is pitched against the personal servitude characterizing barbarian states (Herodotus, 440 bce, VII,104). To a large extent this definition of political identity was absorbed into Roman culture as well.
2 ‘Salus rei publicae suprema lex esto’ (‘the safety/welfare of the commonwealth should be the supreme law’ – De legibus, Cicero, 43 bce, III,3,VIII): this motto was later endorsed by Machiavelli, Hobbes, Spinoza and Locke as well as a swath of diverse political actors.
(a) ‘Dulce et decorum est pro patria mori’ (‘It is sweet and proper to die for the fatherland’). This verse from Horace’s Odes, III,2,13 (Horace, 23 bce), epitomized apologetically for centuries the existential link between (male) individual life and the state, which can legitimately ask a man to sacrifice his life. This sentence eventually lost any justification on the killing fields of the Great War, where it was dubbed ‘the old lie’ by the fallen English poet Wilfred Owen.
3 During the same long evolution of the Roman polity, an entire set of abstract principles – legal norms along with their justification and interpretation – was created with lasting effect. We refer to them in the formulation they found in the compilation made in the 6th century ce in the Corpus juris civilis (body of civil law) at the order of the Byzantine Emperor Justinian I (Justinian Code). Particularly after the rebirth of Romanist legal studies in Bologna (11th–12th centuries), Roman law deeply influenced the legal culture of Europe, the common law countries such as England not excluded (cf. Stein, 1999). Now, the opening statement of one of the main parts of the Justinian Code reads: ‘iuris praecepta sunt haec: honeste vivere, alterum non laedere, suum cuique tribuere’ (‘the precepts of law are these: to live honestly, to injure no one, to give to each his own’ – Institutiones 535 ce, 1,1,3). These are social and civil as well as political principles; the second of them represents the core value of civil coexistence, the premise of peace and conflict prevention as pillars of the polity. The third one was seen for two millennia (sporadically even in China) as the overall definition of justice, both distributive and commutative; undetermined and flexible as it is, it has been applied to justify all and everything, inscribed as it was (Jedem das Seine) on the entrance gate of the Nazi concentration camp at Buchenwald. Not even lofty principles are, in their lack of contextualization, sheltered from perversion. Two further principles or rather meta-principles must be at least mentioned: ‘nemo in causa sua iudex’ (‘no-one should be a judge in his own case’ – Codex, 535 ce, 3,5,1) and ‘pacta sunt servanda’ (‘agreements must be kept’ – attributed to Ulpian, d. 228 ce).
2c. Political Realism and Idealism in Modernity
By political modernity in Europe we understand the era marked by the vanishing of the medieval universal powers, the Roman Church and the Holy Roman Empire; the rise of absolutist monarchies (later to become nation states); and the establishment of the international Westphalian system. In this era (1648–1914) normativity, once detached from any religious roots, was reshaped in a bipolar – realist as well as idealist – way, both claiming to be inspired by reason or rationality. We look first at a founding topos of realism:4
• ‘If men were entirely good this principle [not to keep faith when this turns to one’s own disadvantage] would not hold, but because they are bad, and will not keep faith with you, you too are not bound to observe it with them’ (Machiavelli, 1513: chapter XVIII). In this very ‘Machiavellian’ passage from Machiavelli the legitimacy of a non-moral behaviour in politics finds its foundation in a pessimistic anthropology.
• Apart from realism, another well-known precept, ‘right or wrong, my country (or party)’, represents the essence of the xenophobic nationalism that has wreaked havoc in the entire world in the 20th century and beyond.
On the opposite pole, idealism, Kant’s categorical imperative has been the flagship in all (deontological) attempts to bring
politics back again under the sceptre of morality, as he forcefully argued, especially in the Appendix to Perpetual Peace (Kant, 1795). The specifically political principle of publicity formulated in the same work (‘all actions relating to the right of other human beings are wrong if their maxim is incompatible with publicity’) has also shaped the normative tradition of Kantian liberalism.
• Another brand of liberalism, prevailing in countries of Anglo-Saxon culture (remember John Stuart Mill), is connected to the basic principle of utilitarianism, which is also a normative philosophy, though consequentialist rather than deontological in its posture.5 This principle tells actors to seek the greatest good (or pleasure) for the greatest number of human beings.
• Marx mocked abstract universal ‘goddesses’ such as justice or liberty, but briefly reflected on the normative principles possibly guiding the time after the abolition of capitalism: ‘to each according to the amount of work he contributes’ was seen as the basis for the initial phase of communist society, which would then, in its full-blown configuration, shift to ‘from each according to his ability, to each according to his needs’ (Marx, 1875: §1). These two formulas became popular, which shows how mass participation in politics could be motivated by relying on a rational utopia.
• Max Weber’s point, made in Politics as a Vocation (1919), that a true vocation to politics can be found only in those politicians whose political ethics includes both intimate conviction in upholding one’s own beliefs (Gesinnungsethik) and a sense of responsibility for the consequences of action (Verantwortungsethik), later became a topos, indicating the necessary balance between the divergent normative approaches emerging in political life – against adventurism and fanaticism on the one hand and mere opportunistic behaviour on the other.

Niccolò Machiavelli
Of the relevant political theorists, Machiavelli (1469–1527) has been, like Thucydides and Cicero, one of the few who could look back on a career as a high-ranking official and diplomat – in Machiavelli’s case, under the short-lived republican regime that ruled Florence between the Medici’s expulsion in 1494 and their comeback in 1512. After his removal from government (he was even tortured as a possible conspirator) he wrote The Prince, the Discourses on the First Ten Books of [Roman historian] Titus Livy, The Art of War, the Florentine Histories as well as the comedy Mandragola (Mandrake). Machiavelli conceives of politics as an empirical field, whose rules and patterns can be understood from the study of history. Necessity, fortune and virtue (the ability to master one’s own destiny) are its determinants. As the realm in which human beings pursue their individual and collective self-interest, politics is different from morality, most evidently in the case of a new princedom like the Italy Machiavelli wants to be freed from foreign rule and unified. Under less exceptional conditions domestic political conflicts are conducive to freedom and were wisely mirrored in the institutional structure of ancient Rome. Vilified by many, Machiavelli was of great interest to the Founding Fathers of the United States of America, Hegel and Gramsci.

Immanuel Kant
Immanuel Kant (1724–1804) never left his native Königsberg in East Prussia (after World War II renamed Kaliningrad, Soviet Union/Russian Federation), where he was a professor at the local university (now Балтийский федеральный университет имени Иммануила Канта/the Immanuel Kant Baltic Federal University). His fundamental works are Critique of Pure Reason (1781), Critique of Practical Reason (1788) and Critique of Judgment (1790). Perpetual Peace (1795) and Metaphysics of Morals (1797) are his main contributions to political philosophy; relevant also are On the Old Saw: That May be Right in Theory But It Won’t Work in Practice (1793) and An Answer to the Question: What Is Enlightenment? (1784). Kant explores the principles that Reason dictates independently from any experience (i.e. in a transcendental way) with regard to political life. They take the form of principles of law, which can reshape politics only if morality rules over it instead of being subordinated to criteria of expediency. Peace among nations can be achieved as a perpetual, not temporary, good only by eliminating the causes of war in the domestic regime and international relations. This leads to republicanism (freedom and equality of all citizens under the law) and federalism (a league of states rather than a world republic); they are to be complemented by cosmopolitan law, which gives anybody the right to visit foreign territories.
This survey of relevant normative positions could be enriched with other recent examples: from Deng Xiaoping’s dictum (‘It doesn’t matter whether a cat is black or white, as long as it catches mice’, pronounced in 1962), which was later to legitimize the opening up of Chinese communism against ideological purity, thus redirecting Chinese and world history, to John Rawls’ second principle of justice (‘Social and economic inequalities are to be arranged so that they are both (a) to the greatest expected benefit of the least advantaged and (b) attached to offices and positions open to all under conditions of fair equality of opportunity’) (Rawls, 1999a: 72), which sociologically can be seen, in the aftermath of the New Deal (it was written in the 1950–60s), as a justification of redistributive policies and the welfare state. Two comments on this parade: First, as mentioned above, it is hardly possible to gauge the operative presence of these prescriptions in the actors’ behaviour across the centuries. There is no scholarly
ascertained methodology that is capable of assessing how far the claim of normative principles to influence the actors’ line of action is successful or vain. Only in the case of principles enshrined in a country’s or empire’s legal order can their effectiveness be assessed by looking at courts’ decisions. Otherwise normative theories, particularly in democratic polities, can at most let their recommendations trickle down into the scholarly conversation, then the public debate in the media and finally some corner of legislation. In a more extended time span, however, normative theories can play a greater role not so much in shaping the actors’ actual behaviour as in conferring or denying legitimacy to it. Prussian, German and European leaders in the 19th and 20th centuries did not act as Kant required them to do, but his ideas had influence on the minds of some elites and citizens, thus setting limits to what they regarded as a legitimate course of action. Second, what strikes the eye while taking note of the prescriptions for action issued in the history of Western politics is their multiplicity and variety, not to say disparateness, which is not just the consequence of the large number of actors and interests addressed. In the decisions taken and the justifications offered by the very same actor – state or party – in the last two centuries, the principles mentioned were present in a mix that has rapidly changed (after elections, wars and revolutions) and often contained conflicting aspects. This reminds us that political life relies on complex patterns of motivations and justifications, in which the lack of systematic consistency between rules does not in itself constitute an insurmountable obstacle to effective action. One has to grasp this complexity if one wants to introduce normative viewpoints into policy and have a chance of gaining influence; a one-size-fits-all system of norms may satisfy nothing other than some philosopher’s taste for perfect architectures.

Karl Marx
Karl Marx (1818–83) studied law and philosophy in Germany and went into exile in London after the failure of the 1848–49 revolutions. He published the Manifesto of the Communist Party (1848, co-authored with Friedrich Engels), A Contribution to the Critique of Political Economy (1859) and the first volume of Capital (1867), his economic and philosophical masterwork. After his death the second and third volumes as well as Theories of Surplus Value and Critique of the Gotha Programme were published; in the 20th century the Economic and Philosophical Manuscripts of 1844, The German Ideology (written with Engels in 1845–46) and the Grundrisse, containing the preparatory studies for Capital, also became accessible. Marx explored the whole capitalist mode of production as the most revolutionary in history due to its ability to exploit wage labour (labour theory of value), but saw it undermined by its own dynamics of concentration and competition. Its collapse, accelerated by the struggles of the working class, will clear the way for a society without the socially relevant division of labour and, in its final, communist stage, favourable to the full realization of the individual. In his late years Marx mentioned the possibility of a peaceful transition to a classless society by means of democratic elections in countries with stable representative institutions; he did not focus exclusively on the ‘dictatorship of the proletariat’.
3. NORMATIVITY AT WORK: TWO CASES
Should we make happiness or virtue or the common good or justice the true goal of politics? This is the fundamental question most normative theories have seen as defining their own business. Inspiring though this existential question may be, it is doubtful that the actual behaviour of states, leaders and parties, by which the freedom or happiness of societies and individuals is largely determined, has ever been shaped – at least to a relevant degree – by the answers given by philosophers to that question. Not so with
other, less supreme but still crucial indications regarding two cases that are very much overshadowing our communal life: war and, more recently, the endangering of civilized human life on the planet. These are two cases in which political life has been confronted with the advice of philosophy and ethics.
First Case: Just War
Normative questions arising from war are as old as the answers found in the Iliad or the Bhagavadgita and have transcultural character. The reflection on how to put an end to the ‘scourge of war’ (UN Charter, Preamble) by establishing Kant’s perpetual peace, or how to restrain it to a less unbearable measure, is fundamental for political philosophy, because politics contains violence as one of its constitutive elements, and can even be seen as an activity regulating violence along with its cousin, fear – as did Hobbes, Weber, Franz Neumann and Niklas Luhmann. On the other hand politics is an attempt to regulate human affairs in a peaceful way, be it by argumentative deliberation or negotiating and compromising, and the outbreak of civil or international war is always a defeat of politics, though it remains part of it. The alternative is then between keeping one’s own hands allegedly clean, by refusing any involvement in
a posture of radical pacifism, or preparing conceptual tools such as the just war doctrine, capable of telling us how to save what can be saved from unlimited violence (the laws of war or ius in bello), and under which restraining conditions the resort to arms can be justified or even necessary (ius ad bellum) for the preservation of basic civilizational values, as in the case of the war of 1939–45 against fascist dictatorships.6 The alternative we have just mentioned can be aptly illustrated by going back to the origin of the just war doctrine in the work of Augustine of Hippo, particularly in his De civitate Dei (The City of God). Previous theologians had focused on the message of unconditional love contained in the Gospels and denied that a Christian could legitimately be a soldier (Dyson, 2006: chapter 4). Augustine reversed this position by referring to other passages in the Gospels and justified war when necessary to re-establish order and stability in the earthly life of human beings – as some of the ‘just’ wars fought by Rome were, he acknowledged. Behind this reversal of Christian pacifism was his negative anthropology: after the original sin humans cannot be saved from vice, especially the libido dominandi (lust of mastery), which is at the origin of war, and only those touched by the grace of God can attain entrance into the City of God. Theological explanations apart, Augustine shares his sober view of man with political realism, in particular with chapter XVIII of Machiavelli’s The Prince (1513). Under these premises just war, waged by a lawful authority (a state based on justice, as seen above), can bring as much peace as is necessary to the continuation of human life, but this is nothing more than a remedy against even worse developments of human wickedness, not the true peace enjoyable only in the civitas Dei. 
The detailed just war doctrine, first sketched by Cicero in On Duties, I,11–13 (Cicero, 44 bce), was then developed by Aquinas and later by Francisco de Vitoria, Alberico Gentili and Hugo Grotius, and came
to be entangled with the fledgling international law (Emmerich de Vattel). In the 19th and 20th centuries, domestic (military) and international (treaties, conventions) legislation gave legal shape and some effectiveness to provisions of ius in bello elaborated across the centuries, such as the protection of noncombatants and prisoners of war and proportionality in the amount of violence used. The right to go to war (ius ad bellum), once a matter of endless dispute as it was based on specific motivations, was recognized as pertaining to all legitimate states (compétence de guerre) in the framework of the multipolar system established with the Treaties of Westphalia (1648). Since 1945 it has been excluded for all countries by the UN Charter, while military operations are allowed only in self-defence or, if permitted by the Security Council, to help other UN members repel an invader. Outside legal literature, the book that shortly after the Vietnam War revived the philosophical discussion on just war (Walzer, 1977) has been followed in recent years by attempts to rethink this notion in the light of novelties such as ethnic, religious, terrorist, drone and cyber wars, most of which escape the interstate scenario that defined the just war tradition. Further, the necessity of a ius post bellum, regulating the problems arising at the end of an armed conflict, has been advanced. Let us recapitulate: the normative attitude towards war has been around for more than 2,000 years, adapting to the big changes in warfare as well as in the political ground structure behind it. Attempts have been made to extend the grasp of this attitude to nuclear war, with little success however, as we will see in the next section. 
Whatever its foundation (Augustine’s theology, Aquinas’ notion of natural law, the idea of humanity in the European Enlightenment), the just war theory has always sought ways to make war less unbearable, not to eliminate it, in the persuasion that conflict and violence can and ought to be restrained, but for the time being cannot be expelled from the life of communities.
Thomas Hobbes
Thomas Hobbes (1588–1679) lived as a tutor in English patrician families and after the Restoration on a royal pension; he went into exile in Paris during the Civil War. While developing a philosophy of materialism and science, in political philosophy he wrote De Cive (1641), Leviathan (1651) and Behemoth or the Long Parliament (1668). Writing in a time of domestic upheaval and European wars, Hobbes looked into the contractarian origin of the state (or commonwealth, in his language) in order to find chances of peace. The polity is not an organic product of human nature, as Aristotle and the Aristotelians wanted to have it; it rather arises from the generalized will – motivated by fear – to get out of the state of nature, which is life-threatening for everybody. This happens by conferring all our rights upon an artificial creation, the sovereign or Leviathan, whose absolute power is the only guarantee against the Behemoth of rebellion and civil war. States, however, are not compelled to follow the same path as individuals are, thus remaining in a potential state of war with each other (a condition later called international anarchy).
This attitude has been classified as rationalist or reformist in the trilogy outlined by the English School of International Relations (Martin Wight and Hedley Bull), its counterparts in the trilogy being Kant’s revolutionary project to eliminate war altogether, as well as the realist acceptance of it, exemplified by Thomas Hobbes, as a ground element of (interstate) politics. Not to be overlooked is the built-in ambivalence of just war theory: it stems primarily from the intention to define and reduce the occasions in which war is justified (this is the true meaning of ‘just’), but it can lose its critical potential and slip into a catalogue of possible justifications for actors willing to wage war. Of the many normative attempts to give politics a different orientation (rather than to allegedly reshape it altogether) the just war doctrine, along with international and more recently humanitarian law, has been one of the most successful in practical terms. ‘War remains hell’, to quote Union General William T. Sherman’s dictum after the American Civil War, but it is not misplaced to assume that fewer civilians have been killed, fewer prisoners mistreated and fewer monuments destroyed thanks to the century-long work of philosophers, theologians and jurists. By no means, however, should the power of argument and norm-setting be overstated, as the tremendous disproportion between ius in bello precepts and the unlawful killing or hardship inflicted upon soldiers and civilians in the 20th century proves. Lastly, it is
difficult to say how many of the evils that were not committed were so thanks to the respect for the norms (‘for the right reason’, to use the language of the theory of justice) or the fear of reprisal or punishment in case they were not observed. Principled and instrumental (to one’s own self-interest) behaviour may go sometimes hand in hand, indistinguishably, and lead to good though impure ends.
Second Case: Future Generations and a Politics for the Future
In the second case we are looking at normative questions and theories arising from a state of the world that is coming up under our watch – often an absent-minded watch – but is far from self-evident, thus making a description necessary – readers may share it or not. Since 1945 or more exactly 1955 (as both East and West acquired the capacity to bring about mutual assured destruction) the worldwide balance of (nuclear) terror constitutes the innermost cell of international relations. All the rest – geopolitical and geoeconomic developments, regime change, international terrorism – is of minor relevance for philosophy and civilization compared with the dormant but very real possibility of the self-destruction of humankind (more exactly, of the civilizational pillars – agriculture, trade, communication, basic legal order – of its survival). This possibility is present also in the deterrence regime that some regard as stable enough to prevent things from turning to the worst. Arms control and partial disarmament accords have not changed this regime, though they have helped keep in check nuclear proliferation, which is but an epiphenomenon of the existence of nuclear arms. The fact that the Third (nuclear) World War has so far been avoided is no guarantee that future generations will be spared as well.
This is not the whole story. Since the 1980s a new universal threat has entered the field of political negotiations: man-made global warming due to the sky-rocketing emissions of greenhouse gases from human activities since the Industrial Revolution. This is the main component in a process of climate change whose speed is unprecedented, far exceeding our civilization’s capacity to weather it. If unchecked, climate change is likely to unsettle the life conditions of future generations by multiplying mass migrations and unleashing wars for the control of rapidly diminishing resources such as territory, water and food. The Paris Climate Agreement of 2015, even before the United States of America’s withdrawal, does not mandate legally binding emissions cuts capable of keeping the temperature increase under 2°C by the end of the century.
Let us call these two threats global and lethal, in the narrow definition of anthropogenic physical threats that put in danger the life conditions of all present and, even more, future generations and that can be seriously addressed only by the joint effort of nearly all economic and political actors. It has turned out that politics as usual or politics #1 (the activity allocating scarce but divisible material or relational resources among conflicting actors by means of legitimate power) is incapable of taking on these threats and transforming them into productive challenges for the political system.
This seems rather to be the business of the fledgling politics #2: the activity aimed at saving and managing our global commons, first of all the atmosphere, on behalf of present and primarily future generations. This new type of politics is not a project or a prophecy, but a reality struggling to emerge, and has inspired
documents such as the Kyoto Protocol (1997) and the Paris Agreement or some disarmament agreements, regardless of their effectiveness in solving those problems; it is destined to coexist in an adversarial tension with politics #1, but by no means to replace it. What is at stake is the chance to add politics as shaped by the will to protect generations of the far future to the politics defined by the decisions made by contemporaries on behalf of their present preferences (at best with a thought to their children and grandchildren). The rationale for this shift is not a somersault of generosity, but the debt we have contracted with posterity by spoiling the life conditions they will be born into as well as our now undeniable knowledge of this causal link. Should we repay this debt, or – out of metaphor – act in our institutions with the goal of reducing our spoiling activities (emissions mitigation) and helping those already affected come to terms with the worsened conditions (adaptation)? Do we have obligations towards people who do not yet exist and to whom we have no emotional relationship? Or, using a different, moral rather than political vocabulary largely shaped by feminism, do we have motivations for caring for them? This is – in variation – the fundamental normative question arising from the description of the state of the world sketched above. One can falsify and replace the description and/or reject any notion of politics not limited to the present and the self-interest of its dwellers. If these steps are not taken, answering that question appears to be the key task for political philosophy today, especially on its normative side, if philosophy’s ambition to peer in penetrating terms into humankind’s path is to be taken seriously.
This view is miles away from mainstream normative theory inasmuch as it:
1 makes the assessment of the state of the world (a cognitive move) the premise for identifying the normative questions we are confronted with in our life as citizens, instead of seeing the context-free and timeless construction of an ideal republic as the proper philosophical mission, and
2 does not deem the development of norms from a supreme principle such as justice to be the essential modus cogitandi of political philosophy, as it rather prefers to extract and reconfigure norms from substantive (rather than formal or procedural) questions such as the survival of human civilization, the defence of fundamental rights and liberal democracy, and the dilemmas raised by technological advances (e.g. biotechnology, Artificial Intelligence) affecting the integrity of human beings.
We do not have room here for arguing a full-blown political answer to lethal challenges, but let us have a sketch of it. The reason for pursuing politics #2 alongside the inevitable, but now constrained continuation of politics #1 lies primarily in the reconfiguration of the substantial aim of politics – building institutions that prevent disaster, regulate conflict and provide protection – under the new conditions defined by those challenges. Being ready to reform our present economy and way of life in a low-carbon direction or to restructure our (collective) security in a nuclear-free sense are costly undertakings for the present generations, but we have good reasons for taking upon ourselves those costs for the sake of future generations that are both close to and far away from us. These reasons aim at protecting the community of humans that is now endangered to an unprecedented degree by man-made devices, compensating our posterity for the damages already inflicted upon them, and extending into the future the obligations of solidarity, without which no polity can hold together (this could be called time universalism, a still little-known cousin of the space universalism or cosmopolitanism that is restricted to our fellow humans of the present). By the way, future states of affairs, concerning non-lethal issues and sufficiently known to us from physics or economics, also play into normative debates and decision-making: this is in Western countries the case with the future (reform or collapse?) of pension systems, which are
crucial for keeping liberal democracy accepted by broad masses. In this set of reasons political and civilizational or ethical aspects converge: we should, say, bring down greenhouse-gas emissions not only to enhance the chances of peace and well-being for future polities, but also to allow future parents to raise their children in an environment not unbearably worse than ours (more in Cerutti, 2007: 146–51). At the core of our normative preoccupations for posterity lies the meta-political question: how far are we able and willing to recognize men and women of the future as fellow humans, to respect their interests and rights? Far from being purely ethical, this question contains a set of moral, psychological, political and legal aspects. We can reformulate this question in concrete political terms: why should the basic values of liberal–democratic societies (liberty, equality, justice, solidarity – the latter in a normative rather than Durkheimian sense) be available only to our contemporaries, but not to the people living, say, in three centuries? An important aspect is the evolution of the notion of humankind, which, once only philosophical, theological or literary, has now attained something of a political profile, at least potentially: it means the community of all those who can be hit by nuclear war or disastrous climate change and have a common interest (if not yet a common will) in taking institutional steps against those threats. Being kept together by non-voluntary bonds such as existential threats is indeed an identifier of the political community as different from the social or cultural one – as Hobbes, Rousseau and Weber knew; a community in which norms are not just theorised, but implemented by (not necessarily legal) sanctions and incentives as well. 
If humankind is now recognized as a quasi-political community, including future beings as far as their life conditions clearly (as in the case of lethal threats) depend on our actions and omissions, then our attitude towards future generations implies political aspects such as the duty to
protect them and to respect their vital interests by adequate institutional arrangements. On a different path, mention should be made of the attempts to define our obligations towards posterity in terms of justice, as Rawls did with a limited scope (‘just savings’) in §44 of A Theory of Justice (1999a), or to expand them by assuming – differently from Rawls himself – that the people who in the original position decide on the principles of a well-ordered society are not contemporaries of one another (Cerutti, 2007: 141–3). No doubt a wide gap exists between the primarily political path described in this chapter and Rawls’ deontological approach, which frames the problem of future generations in a timeless and context-free theory developing from a supreme principle. Now, the utility of this approach in orienting our actions remains doubtful, all the more so since its supporters do not raise the preliminary question of understanding: how does politics work; how can we modify in our sense its workings?;7 nonetheless, its contribution to the conceptual clarification of notions and the discussion of alternatives can hardly be dispensed with. Accepting a plurality of approaches and experimenting with them is instructive, as long as eclecticism is avoided.
4. NORMATIVE POSITIONS IN POLITICS
In the preceding sections, we have seen in Section two a variety of normative assertions resulting from the most diverse sources and attitudes, while in Section three we have examined two sets of substantive issues (war and the attitude towards future generations), whose normative aspects we have tried to work out. It is now time to make explicit some conceptual alternatives underlying those issues, in particular two that sometimes overlap:
α. politics and ethics, or political and moral normativity, and β. the pros and cons of ideal theory.
4α. Politics and Ethics
From Niccolò Machiavelli on, modern political theory has been oscillating between recognizing politics’ autonomy from morality (the Florentine Segretario, Hobbes and Hegel, these two in very different versions) and trying to bring the former back under the sceptre of moral law as dictated by reason (Kant). The world however has changed a lot, and so has politics, as exemplified above with the twin notions of politics #1 and #2. This makes the old hard-realist position, for which ethics and more broadly universal norms have nothing to do with politics, look obsolete and obtuse. Even before politics #2 brought the attention to future generations to the fore, developments such as the universal institutionalization of legal human rights (from the Universal Declaration of Human Rights to the International Criminal Court) or the necessity for politics to legislate in matters of bioethics and biotechnology have made the borders between politics and morality more flexible. Some of these developments were stimulated by the fierce contrast between the dehumanized policies pursued by 20th-century totalitarianisms and the moral and political resistance of their opponents and victims. On a more theoretical level the complexity of politics–morality relations is indicated by their sharing of some essential categories, for example liberty and equality, which require, however, a fine-grained differentiation between their political and moral usage. A well-known example is the positive conception of political liberty or ‘liberty to’, to put it with Isaiah Berlin, and its links to the moral notions of autonomy and self-realization. While we cannot enter into a detailed examination of this relationship, it may be useful to do away with the widespread prejudices or idola fori that often surround the concepts we are talking about.
First comes the conventional view that sees politics as the crude realm of unconstrained self-interest and poorly regulated pursuance of material
or relational (e.g. prestige) goods; a realm that needs to be complemented or supplanted by ethics in order to give room to human values and dignity. This is not true even of politics #1, in which the need to regulate conflict and to legitimize power (see below) creates mental and legal institutions of universalistic nature such as the legal system, the Constitution, fundamental rights and the rule of law (not only in modern post-absolutist politics; remember what Demaratos tried to make clear to Xerxes). In politics #2 the reasons for acting against lethal threats on behalf of future generations are political as well as philosophical (the will to preserve civilization from its man-made destruction) and moral (the respect for humankind, the protection of and empathy with future parents and children). Political normativity needs to clarify its relationship to the moral one,8 not to be swept away or disciplined by superior deontological or utilitarian teachings. Another face of the same prejudice sees all emergence of principles or universalistic norms in political life as the intervention of ‘moral standards’ (as argued by Stemplowska and Swift, 2012) superseding the daily and undignified business of politics. Two remarks can be made in this respect. First, unlike in common language, in a scientific discussion the word ‘moral’ cannot be watered down to mean whatever goes just one step beyond the pragmatic wisdom used by actors in their daily affairs. Nor can it easily be transferred from its proper dimension, the individual as a person acting towards other such actors, to the realm of the relations between groups and institutions. Second, it is the very political context that in discrete evolutionary stages distils principles and duties (embodied in the institutions just mentioned), whose observance becomes over time the basic condition for membership in the polity or the international community.
These principles, often related to philosophical doctrines, can become as fundamental for individuals as to justify hardships and the loss of one’s own life suffered in their name, far away from the exclusively self-centred
attitude attributed to people acting politically. Also, attributing ‘ethical’ motivations to political goals perhaps enhances the rhetorical effect, but does not bring them an inch closer to realization, as the evolution of climate policy shows. Acting politically remains, however, different from acting morally, also because it is dedicated not only to theorizing and proclaiming collective goals, but to their attainment as well. In politics the very definition of goals is subject to something like a cost–benefit analysis, in the sense that ideally valuable goals may be dropped because the human and political (in terms of consensus) costs are too high. Besides, in the pursuance of political goals timing plays a role that is irrelevant to morality, since goals sought or attained are valuable or not depending not only on their content but also on the time they require until they are attained – the coalitions supporting them may in the meantime dissolve or change preferences. What is more, assuming responsibility for the effects and sub-effects, even if undesired, of our course of action – and factoring them in while choosing goals and strategies – is essential to our acting politically. Transferring into politics the simple imperative ‘obey the law’ (or the principles established by ideal theory, see below) in analogy to what we are doing in normative morality denies Weber’s ethics of responsibility and may result in righteousness or fanaticism. In politics normativity weighs in as an (often hidden) premise, setting goals (e.g. more equality; peace in freedom) and limits (no corruption; no violence); dismissing its authentic role, as Foucauldianism or historicism tend to do, misses the cement of the polity and cannot explain its existence. Another difference is that moral law can define and justify obligations (‘how’ to act), but not design motivations (‘why’ to act in a certain way).
It is part of politics as well as political theory to identify the collective actors, the interests, the kind of discourse and the long-term coalitions that can underpin projects and strategies.
Lastly, moral normativity addresses individual actors in their interiority, while political normativity deals with collective actors in their behaviour. There is no more irritating sign of the equivocations generated by morality’s pretence to rule over political theory than the examples brought by its followers and mostly featuring the acting of fictional individuals in fictional situations rather than groups and institutions, as these are shaped by history and anthropology. More attention should be given in this context to Jürgen Habermas’ reflection on the relationship between facticity and normativity. For his discourse theory of law, law’s legitimacy in allocating liberties to individuals is not conferred upon it by morality, but results from the normative principle that regards only those norms as valid to which all persons possibly affected could agree as participants in rational discourses (Habermas, 1996). This version of deontological normativity is alternative to Rawls’ (cf. Habermas, 1995 and Rawls, 1995).
4β. Normativity and Ideal Theory
The relationship of politics and ethics has many points of contact with the issue of ideal and non-ideal theory, which is in the theory of justice the leading path to normative construction. We will not highlight these points, leaving it to the reader’s perception, but we must point out that thinking in terms of ideal theory is not as such tantamount to thinking politics in terms of morality. Especially in Political Liberalism of 1993, Rawls insists on his ideal theory of a just society being a political theory. Now, what exactly is ideal theory? For Rawls, ideal theory ‘develops the conception of perfectly just basic structure and the corresponding duties and obligations of persons under the fixed constraints of human life and favourable circumstances’ (Rawls, 1999a: 216). Non-ideal theory deals with the principles that are to be adopted ‘under less
happy conditions’ and in the case of non-compliance with the norms issued by ideal theory, which assumes instead strict compliance with the principles of justice. This stance seems to be inspired by common sense, but let us heed Hegel’s warning: common sense can be the worst of all metaphysicians (der ärgste Metaphysiker). In this case, it is the assumption that life, political life in a democracy in particular, will be better if we redesign it thinking that people have in mind high values that are rationally established and can build institutions perfectly adequate to them, except that the latter are to be reconciled with people not always compliant with the models, or with circumstances that are not exactly favourable to their implementation. We know, however, that those perfect people never existed in history or were a tiny minority of saints, losers or fanatics, the latter sometimes with blood on their hands. We also know that models of justice or liberty or solidarity are effective only inasmuch as they are born from conflicts and movements in a particular country or area at a particular time, hence they are very much marked by history and anthropology; their importance does not stem from being the specification or re-adaptation of a systematic ideal honed by philosophers. Also, to become politically effective, the values we pursue and the concrete models of better institutions we may have in mind must be to a certain extent able to come to terms with the less ideal and rather self-centred interests of the groups that are to support the realization of the model. Innovative policy shifts can be performed not by a company of the stainless, but rather by coalitions in which angels are ready to walk for a while hand in hand with less noble creatures, if not with devils.
The idea that political philosophy deals with perfect institutions that all citizens can be loyal to, while non-compliance with those institutions is a matter for non-ideal theory as a sort of B-theory goes against the modern view that politics is first of all about conflicts, in which by no means the parties
are a priori on the right or the wrong side, because a perfectly just solution rarely exists, while the first and foremost problem is to develop institutions and policies that prevent conflicts from degenerating into war and disruption. The notion of ideal theory does not lack a certain naïveté paired with the philosopher’s presumption: on the one hand, the belief that possessing the perfect model is prior and conditional to redesigning reality ignores how the reality of political and social life comes about, in a way very distant from conceptual engineering. A suspicion of arrogance is, on the other hand, undeniable in the pretence that, if we want to find criteria capable of reordering to the best problem-laden areas of human life, we have to operate by philosophical deduction from principles – thus ignoring other types of knowledge, such as the reflection on history and human nature. This unhappy complex is what often makes the self-described ‘normative political philosophy’ appear futile and in its runaway fictional speculations intellectually unexciting, especially whenever it does not even attempt at preserving in the theory the fullness of the stuff it pretends to ideally regulate. What is missing in this mental attitude is the sense of the obstacle one should keep alive and bring to bear in the very moment in which one is looking for formulas capable of conceptualizing politics and society, instead of relegating the obstacles in the non-ideal theory closet.9 This is also the problem with the positive attitude of ideal theories towards utopia, which a priori dismisses theory’s chance to cognitively penetrate reality and give policymaking some orientation. Another trouble with ideal theory is that its very idea fails to recognize politics as the sphere in which ideas matter, but only do so if they can find cultural, social and political forces endorsing them and translating them into strategies and decision-making. 
The great ideas that have moved the world had each a bearer or protagonist, whom theorists identified as the principal agent for their ideas: the monarchy for absolutism, the bourgeoisie for
constitutionalism and liberalism, the working class and sections of the middle class for European socialism and the New Deal, and peasants and intellectuals in the liberation movements of the ‘Third World’. The attempt at locating this agent is lacking in the recent, pale appearances of ideal theory; the need to provide, along with speculative formulas, a Zeitdiagnose (diagnosis of the times) as their complement is disregarded and possibly felt as non-philosophical. Yet all this is not to deny the heuristic value of ideal theory whenever it contributes to define concepts and clarify conceptual alternatives ingrained in social and political life.10 Besides, on reading Rawls in comparison with his followers, a huge difference in the density and the usefulness of the theoretical argument becomes palpable. All of this can also be looked at from an evolutionary point of view. European modernity has already experienced a powerful endeavour to rethink the polity in the light of a morality shaped by the idea of Reason. This happened in the Enlightenment up to its philosophical culmination in Kant’s thought, but had hardly any influence on real politics, which continued to be better understood and managed on a realist path. Even the timid, and for a long while ineffective, efforts to build a collective security system were due to the reaction to the unprecedented bloodletting of 1914–18, rather than to the teachings of the idealist tradition, though the latter helped formulate legal proposals for a new international order (League of Nations). 
After this evolution, something different from an updated and greatly refined rerunning of Kant’s normativism, such as Rawls’ work in substance is, was to be expected, in the sense of a normative theory capable of integrating into its method the awareness of the real behaviour of actors and the role of the historical context.11 Also, for assessing justice in real politics one needs to compare cases of relative justice rather than to build a whole theory of perfect justice – to put it with Amartya Sen (2009: 16), who jokingly adds that knowing that the Mona Lisa
(La Gioconda) is the most perfect picture in the world does not help when the choice is between a Dali and a Picasso. Lastly, after the fall of grand narratives such as Marxism or positivistic progressivism, there is little reason for resurrecting grand speculations de optima republica, or to make the achievement of an ‘end state’ (of justice, happiness or what else) the research objective of political philosophy. Philosophy has in our time all reasons to lower its pretensions. A short but overdue clarification: anthropology has been introduced here as philosophical rather than cultural anthropology. This has to do with the scepticism about designing an ideal polity peopled with ‘free and rational persons’ (Rawls, 1999a: 10), while later proceeding to look at the actions of not-so-rational persons (e.g. war) under the lenses of non-ideal theory. Better chances of both understanding and transforming politics belong to the approaches that assume from the very beginning a complex and ambivalent structure of political actors (including the gender differentiation) and take due note of the secular changes that have, for example, marked their cognitive equipment and perception apparatus in the shift towards a globalized and highly interconnected world.
Conclusion
At the end of this journey through political normativity and its theoretical underpinnings the reader may ask: what is all this fuss about? Is politics not a power game going on every day and driven by interest, struggle and compromise, with little need of normative beacons? This is a refreshing down-to-earth view, which however forgets about an essential feature of power, institution and policy: legitimacy. Apart from being ‘immoral’ or sometimes inhuman, power that cannot show its credentials of legitimacy is increasingly unstable and at the end of the day powerless. A polity can prove its legitimacy only as far
as its members deem it to some extent capable not only of fulfilling their notions of good governance as resulting from the normative models they have in mind, but also of providing the essential goods (security, legality, minimal well-being) that citizens expect from institutions.12 In political philosophy the category of legitimacy is the connecting link between the conceptual reconstruction of how polities work and the categories giving them normative guidance (liberty, equality, justice and solidarity, in this author’s account). As is the case with this whole essay, this version of legitimacy has been designed so that its conceptual features remain as close as possible to the workings of real politics and its actors. This has been done in the conviction that normative political theory is more useful and productive when handled less as an inner-philosophical issue and rather as a necessary piece of any effort to conceptually grasp the core of politics.
Notes
1 On normativity in general, see Robertson (2009).
2 See the relevant entries in Klosko (2011) and Gaus and D’Agostino (2013) as well as in the online Stanford Encyclopaedia of Philosophy (https://plato.stanford.edu).
3 Rawls was aware of this gap, see ‘Remarks on Political Philosophy’, which constitute the introduction to Rawls (2007: 1–20).
4 For the present debate on political realism see Rossi and Sleat (2014), including its attention to realism in international relations.
5 In moral philosophy normative theories (determining what is right) are either deontological (actions are good or evil in themselves, depending on a supreme principle and regardless of the consequences) or consequentialist (they are good or evil depending on whether their effects enhance or harm the utility or happiness or pleasure of people). Still different are teleological theories, defining what is good and indicating how to best configure one’s own life in order to achieve this telos or aim. Kant, Jeremy Bentham and Aristotle are representative of the three theory types. Hegel and the communitarians of our time (Charles Taylor, Michael Walzer, Michael Sandel and others, all critical of deontological liberalism) can be seen as very specific versions of teleologism.
6 This aspect has been particularly discussed by Rawls in his rethinking of the just war notion in the light of the law of peoples, see Rawls (1999b: in particular 99–103), in which the legitimacy of both fire-bombing and nuclear bombing on Japanese cities in 1945 is questioned.
7 It is no mere ‘guidance problem’, as Valentini (2009) seems to believe.
8 The distinction of moral and political normativity has been once again formulated by Williams (2005). On ‘institutionalism vs moralism’ see Sangiovanni (2008).
9 ‘Climate ethics’ is an impressive instance of the futility of ideal theory whenever it addresses top-down a concrete political problem, as I have argued in Cerutti (2016). For a productive example of the theory of justice applied to a policy problem see Pogge (2008).
10 This is true also for Nozick (1974), the libertarian answer to Rawls.
11 Idealism and normativism are not exactly the same thing, but reasons of space prevent me from discussing their relationship.
12 More on legitimacy in Cerutti (2017: chapter 2).
REFERENCES
Augustine of Hippo. 426 ce. The City of God against the Pagans, available at https://www.gutenberg.org/files/45304/45304-h/45304-h.htm, p. 140.
Cerutti, Furio. 2007. Global Challenges for Leviathan, Lanham, Md: Rowman & Littlefield.
Cerutti, Furio. 2016. ‘Climate Ethics and the Failures of “Normative Political Philosophy”’, in Philosophy & Social Criticism, Vol. 42, No. 7, 707–726.
Cerutti, Furio. 2017. Conceptualizing Politics, London: Routledge.
Cicero, Marcus Tullius. 43 bce. On the Laws (De legibus), available at https://www.nlnrac.org/classical/cicero/documents/de-legibus
Cicero, Marcus Tullius. 44 bce. On Duties (De officiis), available at http://www.bostonleadershipbuilders.com/cicero/duties/book1.htm#11
Codex. 535 ce, available at http://www.thelatinlibrary.com/justinian/codex3.shtml
Deng Hsiao Ping. 1962. https://www.chinadaily.com.cn/china/2014-08/20/content_18453523.htm, accessed 24 December 2019.
Dyson, Robert W. 2006. Saint Augustine of Hippo: The Christian Transformation of Political Philosophy, London: Continuum. Gaus, Gerald F. and Fred D’Agostino, eds. 2013. The Routledge Companion to Social and Political Philosophy, especially ‘Part III: Normative Foundations’, New York: Routledge. Habermas, Jürgen. 1995. ‘Reconciliation Through the Public use of Reason: Remarks on John Rawls’s Political Liberalism’, in The Journal of Philosophy, Vol. 92, No. 3, 109–131. Habermas, Jürgen. 1996. Between Facts and Norms, tr. William Rehg, Cambridge, Mass.: The MIT Press. Herodotus. 440 bce. Histories, available at http://classics.mit.edu/Herodotus/history. html Horace. 23 bce. Odes, available at http://www. thelatinlibrary.com/horace/carm3.shtml [English: http://www.perseus.tufts.edu/hopper/ tex t?doc = Pers eus %3A tex t%3A 1999. 02.0025%3Abook%3D3%3Apoem%3D2] Institutiones. 535 ce (from Codex Juris Civilis), available at http://thelatinlibrary.com/law/ institutes.html Kant, Immanuel. 1795. Perpetual Peace, available at https://www.mtholyoke.edu/acad/ intrel/kant/kant1.htm Klosko, George, ed. 2011. The Oxford Handbook of the History of Political Philosophy, Oxford: Oxford University Press. Machiavelli, Niccolò. 1513. The Prince, available at https://www.victoria.ac.nz/lals/about/ staff/publications/paul-nation/PrinceAdapted2.pdf Marx, Karl. 1875. Critique of the Gotha Programme, available at https://www.marxists. org/archive/marx/works/1875/gotha/ch01. htm Nozick, Robert. 1974. Anarchy, State, and Utopia, New York: Basic Books. Plato. 2012. Republic, transl. by Chr. Rowe, New York: Penguin Books. Plato. 2016. The Laws, ed. by M. Schofield and transl. by T. Griffith, Cambridge: Cambridge University Press. Pogge, Thomas. 2008. World Poverty and Human Rights, 2nd ed., Cambridge: Polity Press.
Rawls, John. 1993. Political Liberalism, New York: Columbia University Press.
Rawls, John. 1995. ‘Political Liberalism: Reply to Habermas’, in The Journal of Philosophy, Vol. 92, No. 3, 132–180.
Rawls, John. 1999a. A Theory of Justice (Revised Edition), Cambridge, Mass.: Harvard University Press.
Rawls, John. 1999b. The Law of Peoples, Cambridge, Mass.: Harvard University Press.
Rawls, John. 2007. Lectures on the History of Political Philosophy, ed. Samuel Freeman, Cambridge, Mass.: Harvard University Press.
Robertson, Samuel, ed. 2009. Spheres of Reason: New Essays in the Philosophy of Normativity, Oxford: Oxford Scholarship Online.
Rossi, Enzo and Matt Sleat. 2014. ‘Realism in Normative Political Theory’, in Philosophy Compass, Vol. 9, No. 10, 689–701.
Sangiovanni, Andrea. 2008. ‘Justice and the Priority of Politics to Morality’, in The Journal of Political Philosophy, Vol. 16, No. 2, 137–164.
Sen, Amartya. 2009. The Idea of Justice, Cambridge, Mass.: Harvard University Press.
Stein, Peter. 1999. Roman Law in European History, Cambridge: Cambridge University Press.
Stemplowska, Zofia and Adam Swift. 2012. ‘Ideal and Nonideal Theory’, in David Estlund, ed., The Oxford Handbook of Political Philosophy, Oxford: Oxford University Press, available at DOI:10.1093/oxfordhb/9780195376692.013.0020, accessed 20 December 2019.
Valentini, Laura. 2009. ‘On the Apparent Paradox of Ideal Theory’, in The Journal of Political Philosophy, Vol. 17, No. 3, 332–355.
Walzer, Michael. 1977. Just and Unjust Wars, New York: Basic Books.
Weber, Max. 1919. Politics as a Vocation, available at https://archive.org/details/weber_max_1864_1920_politics_as_a_vocation/page/n39, accessed 15 December 2018.
Williams, Bernard. 2005. ‘Realism and Moralism in Political Theory’, in Geoffrey Hawthorn, ed., In the Beginning was the Deed, Princeton, N.J.: Princeton University Press, 1–18.
10 Political Anthropology and Its Legacy
Yves Schemeil
The goals, methods, and content of political anthropology (PA) vary across time and place. In some academic traditions it is conceived as ‘cultural anthropology’, which raises concerns about what ‘cultural’ really means. Other streams of research call it ‘ethnology’, to claim that it cannot be reduced to ethnography (the mere collection of data and artifacts). Generalizing from observation is the main ambition of political anthropologists – with a risk: discarding local and historical evidence to opt for a global comparative perspective. PA does not track politics as it is usually conceived, i.e. pertaining to the public sphere, but looks for it everywhere else. It focuses on the political underpinnings of social structures such as kinship and the family, a street gang, a banquet, etc. It looks for non-professional politics wherever and whenever it can be observed – be it in the remote past, far away from Europe where it was invented, or in modern structures and institutions driven by politics without politicians.1 This could be done even when institutions remain stealthy,
quasi-invisible to political scientists. It is also useful when more visible objects of political science (PS) (power interactions, social influence, and institutional regulation) are investigated. What does PA bring to PS? First, it embeds the political into a holistic set of social interactions as a ‘total social fact’, instead of studying it in isolation. Second, it focuses on the local language of power and influence (using concepts as expressed in autochthonous idioms). Third, it brings tools like the use of standardized field notebooks, the collection of artifacts, the drafting of sociometric graphs, participatory observation, etc. Fourth, it offers an alternative genealogy of the founding fathers that is not limited to philosophers (like Plato or Rawls), but is extended to famous political anthropologists (from Evans-Pritchard to Geertz). Fifth, it addresses in its own way the two big methodological problems that any social scientist must face: how to generalize from the singular (or abstract and theorize from the
factual), and how to compare heterogeneous cases to find true universality without losing specificity. Sixth, it has ‘a corrosive power’ (Balandier, 1970: viii) over established theories, which often take for granted the distinction between the West and the Rest (Gledhill, 2009) or ‘complex’ versus ‘simple’ polities. Seventh, it takes attention away from utilitarianism and rational decisions to focus on political will and humanistic vision (Szakolczai, 2018: 20).
A Long Genealogy
Visiting ‘strange’ countries (at a time when the notion of ‘foreign’ was still vague) is documented very early in history, and very far away in space. In ancient Egypt and Mesopotamia, palace walls were painted or engraved with images describing past explorations of unknown territories and peoples. Explorers, such as Ibn Battuta in the Arab world and Zheng He in China, reported faithfully and somewhat neutrally what they saw and heard, to such an extent that they earned the nickname of ‘logographers’. What is nonetheless striking in all these accounts is the attempt to render with precision a reality that would otherwise remain inaccessible to contemporaries. And here is the first principle on which PA is grounded: it is a standardized observation of the variety of political customs and institutions. The second principle matters even more to the building of scientific knowledge of other cultures: systematic comparisons between all peoples must be made. Herodotus did not visit only ‘civilized’ Greeks and their neighbors; 16th-century Japanese writers triangulated analyses of their compatriots, the Chinese, and the Westerners. One thing was still missing, though, until the British foundation of an academic discipline taught in universities: PA should look for the non-visible political institutions in countries without script, history textbooks or constitutions, let alone parliaments, elections,
and parties – and no identified political heads. Hence, PA must look for politics in spaces where it is not recognized as a professionalized activity. The British and the French conquered or traded with a number of non-official polities. Since their administrators had to understand the context in which they lived, seminal works on colonized areas were much needed. Evans-Pritchard and Fortes’ African Political Systems, Evans-Pritchard’s The Nuer (1940), as well as Leach’s Political Systems of Highland Burma (1954) did a lot to excavate power relations from the mud of networks of sociability from which power, influence, and manipulation were apparently absent. The ethnographic records of Malinowski or Boas also contributed to building functionalist and structural theories of politics in societies where power was hidden. Malinowski himself assumed that PA should look for ‘equivalents’, ‘substitutes’, or ‘alternatives’ to our political structures and functions (Malinowski, 1960: 120, 123, 137; Boas, 2006). The problem faced by the founding fathers was the lack of precedent. How could they make visible a ‘system’ of political relations that was not engraved into the marble of fundamental laws and made manifest in a ‘regime’? One way to proceed was to look for social hierarchies. British political anthropologists simply replaced compliance with deference; they found equivalents of legal punishment in social ostracism; they located real power in ‘big men’, sorcerers, and shamans or distinguished characters – like Evans-Pritchard’s ‘leopard-skin chief’ of the Nuer (Evans-Pritchard, 1940: 172–73), or Leach’s ‘thigh-eating chief’ of the Kachin (Leach, 1954: 121–22). They established kinship reciprocity and collective responsibility as substitutes for legal enforcement. When siblings must collectively assume guilt there is no need for courts, police forces, and jails. However, ‘segmentarity’ rules out state building (until now) or weakens colonial rule (in the 1930s).
Because blood segments aggregate only as lineage branches, no political institution
can emerge in societies that are nonetheless endowed with a sense of identity. What people get instead is a network of networks – another relevant concern in Internet-governed polities of today. Some anthropologists like Leach and Turner observed spatial and temporal variations of hierarchies.2 The former studied the oscillation of power in highland Burma – an unstable configuration of power called gumsa located between shan feudal relations and the gumlao-organized anarchy of equalitarian factions. The latter showed that ‘liminality’ was at the roots of any organized society in Africa (from East to West) and elsewhere, since transition from adolescence to adulthood as well as the passage from rank and file people to leadership positions required zones of undetermined status in which people could find out that it would be pointless to do to others things they would not like to experience themselves. Leach grounded social relations in space, and Turner anchored them in time. Both scholars insisted on informal relations, invisible structure, and flexible if not unstable behavior. Both noted that political communities existed outside Europe. Leach showed that the Kachin could understand each other’s twisted interpretation of common great narratives (as often stated by respondents, ‘if I were one of them I would think and act as they do’).3 Hence irrationality, inconsistency, instability, interactivity, and informality could well spread further across time and space than assumed in northern universities where the concept of institutionalization prevailed. The ultimate challenge to PA, as conceived, came from the treatment of conflict in informal political systems. Because the form and level of ‘political contention’ vary, social cohesion depends on varieties of feud regulation. However, if this is but a ‘variant’ of any form of peacekeeping, then politics is ‘relegated to accessory status’ as Easton once argued. 
On the contrary, PS considers regulation as paramount since it works on whole ‘systems’ as the most inclusive set
of interactions in any society. According to Easton the founding fathers of PA could only see politics as the result of the use of force plus a non-centralized organized authority – the only difference between ‘traditional’ and ‘modern’ political systems being this absence of centralization – whereas political scientists could paradoxically do without force, organization, and centralization. Anthropologists went too far in the other direction and equated the kinship structure with a political regime (Easton, 1959: 211–12, 219–20, 214–16). War was another sort of conflict that mattered to anthropologists in the founding years of the discipline, since it is a rare example of this ‘use of force’ associated with politics (see David and Rapin, Chapter 83, this Handbook). In war, strong fighters and smart strategists are needed, all sorts of weapons are used, and people are killed or enslaved. Even though there are no formalized borders, boundaries do exist between territories, and trespassing them is conducive to conflicts like anywhere else. To address such a threat, deterrence may sometimes suffice – as among Amazonians whose potential fighters simply extend their web of social relations to show rivals how difficult and illegitimate it would be to attack them (Descola, 1996). In both types of conflicts, some activities are more political than others: those that deal with collaboration and dispute-settlement among groups competing for power, and those which lead to binding decisions allocating or granting things of value to people (Easton, 1959: 226). Another cleavage, consanguinity versus contiguity, is debatable because it is either too evolutionist or excessively focused on kinship (Lewellen, 2003: 3). What is not debatable is the existence of power struggles and coercion everywhere.
Only the containment of violence varies (it can be achieved by parliaments, a group of elders, secret societies – as the hundred-membered Dogon families called ‘ginna’ – or age-groups and bands of young men). Since there must exist a system of arbitration everywhere,
which requires in turn deliberation among selected people who make binding decisions, then the major political difference between societies is not the degree of specialization of the arbiters but the permanence or lack of permanence of their political role. This is the end of the typologies duplicating the canonic oppositions between stateless and stately societies, kinship and kingship, etc. Difficulties in finding standards of classification made them worthless. Contrary to most British anthropologists, their French peers quickly got rid of taxonomies. Due to a durable Durkheimian imprint on ethnology (see Poggi, Chapter 3, this Handbook), they switched from political structures to political processes – how to create a national or communitarian ethics, how to alternate moments of collective enthusiasm or heroism with sheer routine, etc. Temporary coalescence replaces permanent structures. Continuity is striking from Mauss and Griaule to Lévi-Strauss and Godelier, or even Hellenists interpreting the meaning of the ‘beautiful death’ (Vernant, 1991). Despite the absence of central institutions and heads, ‘effervescence’ is nonetheless possible through particular moments of the social cycle (religious festivals, public banquets, coming of age ceremonies, celebrations of past events or heroes, etc.). They bring the necessary social glue to anarchical contexts. In other words, neither kinship nor territory may suffice to give people a political identity. A polity exists when it becomes visible to its members, as in complex societies with independence days, national celebrations, state visits, and even world championships. Also relevant for politics, a new insistence on valuable knowledge outside Western science paved the way for a more balanced vision of the relationships between European thought and local cosmogonies. When it emerged, anthropology was a sort of sociology applied to different objects (illiterate, ahistorical, and stateless societies).
The Chicago School and its heritage in current urban ethnography blurred this implicit division of labor, and turned certitudes
upside down. Authors such as Isaac Thomas, Park, and Wirth, not to mention latecomers like Foote Whyte (2004) and Alice Goffman (2009), spotted people who live in towns although not as other urbanites did – immigrants, hobos and other homeless people, bohemian artists, corner boys and members of street gangs, inhabitants of racial ghettos and city districts, etc. Decolonization brought two additional ingredients to the new PA: a necessity to know the authentic other, and the impossibility of carrying on studying groups living in isolation in recently freed societies. These two contradictory goals gave birth, respectively, to comparative politics centered on political cultures; and to the rural ethnography of Western societies with a focus on how village life could survive the nationalization of politics, the centralization of polities, and the urbanization of human settlements. If it did, resilience should be attributed to a specific factor, non-political as well as non-institutionalized: culture, again, was a good candidate for the status of an explanatory variable; and so were interactions and alliances, as in the southern community to which Wylie devoted his PhD thesis, where people used to see their relations with others as being split between those with whom they had good relationships, and those from whom they had been estranged at some point – or, translated into a political idiom, between members of the same party, hence customers of the same café and shops, and all the others (Wylie, 1957). In recent years, the field of PA has expanded dramatically. Corporations, sectarian brotherhoods, safety or security teams, industrial towns, international organizations, etc. are now part of the legitimate perimeter of anthropology because they have to solve the problems of social hierarchy, common rules, and collective action. Likewise, processes and protocols like religious conversions, state visits, or caricatures have entered the field of ethnology.
Therefore relations between actors have tended to replace social and political agents themselves.
Such versatility shows that politics was about coordination, conflict, and constraints even when political institutions were lacking.4 What eventually maintained the unity of PA was a continuous concern for cultures – i.e. shared values, moral beliefs, religious creeds, acceptable behavior, and confidence in central institutions – plus social networks (of siblings, friends, generations, or allies). But such a selection of concepts certainly evolved over time.
The Classics and Their Critics
From structuralism and functionalism, to culturalism, then interactionism, we have four funerals and a wedding. The birth of PA was characterized by a focus on structure as the main explanatory variable of both universality and specificity among human societies. The basic assumption was therefore ‘structuralist’: everything was arranged according to an inner structure because it had to fulfill a much needed function – hence, this approach was also ‘functionalist’. Structures and functions were allegedly visible in complex societies but not so much in isolated ones where they would gain visibility only when anthropologists could observe rules, then link them to a probable underlying organizational chart saying who can marry whom, dine with whom, hunt or till land with whom, build houses with whom, and communicate with ancestors through whom. Reproduction, nutrition, housing, and religion were assumed to be full of rules that reflected structures and responded to a physical and social necessity. Such structural functionalism (see Agarin, Chapter 5, this Handbook) can either derive from law (most founding fathers were trained in legal studies [Thomassen, 2008: 265]); biology (basic needs being allegedly ‘natural’ or material); or linguistics (social relations depend on communication skills and symbolization is language). The British
opted mostly for biology, and the French for linguistics. Both were then superseded by American culturalism and interactionism. Culturalism (i.e. ‘privileging cultural factors over more political variables such as social stratification, ideological and partisan cleavages, political attitudes, institutional constraints, and strategic choice’, Schemeil, 2011: 511) was once a promising avenue towards greater recognition of PA’s achievements, as a way of ‘providing contrary evidence that may test and question the overambitious claims of theories’ (Daloz, 2018: 179). Culturalism now suffers from decreasing legitimacy. On the one hand, political culture is viewed as an outdated way to analyze the political underpinnings of the more global relationships between citizens, either in democratic or illiberal polities. While ideology seems more relevant as a way of studying partisanship and political alignments, culture itself tends to be considered as an ideological construct – not to mention ‘civilizations’ and their alleged ‘clash’. On the other hand, culturalism presupposes the absence of significant change in beliefs and behaviors – societies remaining identical from one century to the next, which would also give them an identity. In short, and for most critics, culturalism relies on essentialism and eternity. However, there is much more to say about the use of culture in explanatory models. Studying cultural aspects of a social reality is like drafting the dictionary of a foreign language: not all idioms are in use or even known to the speakers of that language, but they offer a repertory of alternatives and options to make themselves plain with enough nuance to be distinguished from their interlocutors. Picking out this or that idiomatic expression depends on tactical goals – like defending one’s self-interest or hiding it behind a commitment to the general interest.
Since culture is mostly made of signs, symbols, and icons, what happens when people meet is a mutual activation of the symbols they have learned to understand in order to communicate peacefully – hence the privilege once given to static
culture should inevitably prefigure a coming attention to symbolic interaction. This is exactly how interactionism became the winner of the contest between all these streams of research. This is because there is a close connection between PA as an approach and interactionism as a method. Since mainstream anthropologists avoid excessive interference with the people they study, they analyze interactions between them – be they material or symbolic. Once displaced from isolated societies to communities living with many neighbors in a modern environment, like a department store or a public transportation terminal, any form of anthropology is even more clearly an example of interactionism. When done with a special focus on politics, the interactionist component is converted into a crystallized set of beliefs and behaviors, like partisanship, electoral campaigns, or fights against foreign enemies in which myriads of social interactions are aggregated and dichotomized (between pro and con, we and they, etc.). However, frontal oppositions may eventually hide the many different meanings of siding with some people against other people. At that point classical PA must be supplemented with new trends such as socio-anthropology and archeopolitics. Culturalism also declined because it gave way to cognitivism – since people see what they are trained to perceive (i.e. things as they are framed by their culture), their potential knowledge is bounded and limited. Whereas cognitivism anchors PA into culture or nature (the brain), socio-anthropology and archeopolitics anchor it within the other social sciences: sociology and psychology on the one hand, history and PS on the other hand.
Basic Concepts
Political Culture
In the absence of any boundary, identity document, or military mobilization, two ways
are left to make a society visible to foreigners as a distinct whole: diplomacy to keep others at bay, and political culture, a conception of the self as belonging to a polity (see Berg-Schlosser, Chapter 37, this Handbook). It is of note that, first, political culture cannot be conflated with ‘civic’ culture (contra Almond and Verba); second, it is no shortcut for ‘civilization’ (contra Huntington); third, it is not uniform. Uniformity of mores does not guarantee that a ‘culture’ will be recognized by outsiders. People have hyphenated identities. Subcultures can diverge. What insiders expect from a shared culture is not unanimous agreement, only a better understanding of their peers’ arguments and what makes sense, without endorsing the rationale for their decisions. Culture makes interactions easier; it also saves time and effort in anticipating the conduct of others because some things at least can be ‘taken for granted’, and are ‘commonsensical’, hence difficult to deny openly (like laïcité in France, parliamentary democracy in Great Britain, Islam in Saudi Arabia, or Marxism in the former Soviet Union). Political culture may also be viewed as a cake of which each piece has a specific flavor. According to PA there is some evidence that the separation of the political from other dimensions of life is a Western phenomenon. But there is also some certitude that even when embedded in the social, most political practices and beliefs differ from the Global North to the Global South, and even from country to country. For instance, to attract supporters, political leaders in Nigeria opt for ‘conspicuous display’ while in Scandinavia ‘conspicuous modesty’ is of the essence (Daloz, 2018: 187; Bayart, 2009).
In spite of this, representatives all benefit from a delegation of power, which has the same meaning in both cultures, and is therefore universal.5 Moreover, if there are as many registers of meaning as there are cultures, it is very unlikely that in a globalized world people could ignore what the others mean by ‘representation’. Hence, a 2011 Cairo demonstrator
would have been perfectly aware of the difference between gurus, patrons, and lobbyists.6 But she could have combined these three registers according to her self-interest, the current context, and her mood.
Myths and Rituals
Whether written or not, myths are much-respected ‘texts’ meant to guide people in their daily life, which give some meaning to their deeds and opinions. Rituals are social processes that translate statements into practices, and make ordinary people commit themselves to the collective vision of their identity and ethics. According to Lévi-Strauss, myths are what unconsciously structure behavior, while rituals are events in which behavior can be observed. Myths point out consistency; rituals express contradictions (Gluckman, 1963). In a way, going to the polls after a highly divisive electoral campaign is also a means of displaying national unity: elections are but civilized substitutes for civil wars. In the absence of texts, anthropologists have focused on indigenous accounts of Creation. Myths are non-written narratives that also tell us where the community comes from, and what prohibitions to respect. They also say what a correct or moral behavior is. Since there is no constitution on which to rely, myths additionally provide a scale of obligations towards the others, and, when it exists, a clear hierarchy of deference to whom and when. In modern PA, mythical narratives can be translated as ‘sacred books’ (Bromberger, 1995), i.e. texts to which members of a community refer on any occasion (the Universal Declaration of Human Rights, the Manifesto of the Communist Party, The Federalist Papers, the Quran, etc.). Such texts help people to memorize a particular mapping of social encounters, with zones of comfort, zones of benign neglect, and zones of conflict.
One of the most analyzed processes of initiation rituals is political socialization, an introduction into a new circle of sociability of people so far kept out of it because of age, sex, race, activity, or genealogy. It aims to put future heads into the shoes of those on whom they will some day impose their decisions, and transfer knowledge about secret societies’ proceedings, male-only meeting rules, magic, therapy, and prophetic skills. Joining gangs or remaining loyal to their leaders, being put on the roll call of voters, mobilized by the army, or invited as a guest of a host in an exclusive banquet, all require special rituals. Narratives are special myths, especially ‘Grand narratives’. They include other materials open to anthropological work such as stump speeches, lullabies, legends, fairy tales, and ‘national novels’ (history textbooks which reconstruct the past in order to make it consistent with an imagined community). Not all communities have a narrative: there is at least one culture in which history is ignored, the Amazonian Jívaro who have nothing to pass on to the next generation in terms of personal, family or collective heritage (Descola, 1996). Moreover, when such narratives exist, they are competing with each other. Members of such communities understand that each of its components has an equal right to stick to its own version of a reconstructed origin (every version is therefore considered as equally valid). Whereas complex, populated, and intermingled societies tend to produce a national history taught at school in accredited textbooks, members of small illiterate communities recognize the right of each person to choose among a plurality of views (this is well established among the Dogon in Africa, and the Kachin in Asia). Within the Classics, some authors used PA to show that tragedy could also be a political narrative which could help an audience to learn what is bygone (e.g. loyalty to kin) and what is a new way to think and act (e.g. 
loyalty to kings [Meier, 1993]). Drama has always been a convenient channel to tell
people what to do, so PA should include it within its perimeter. From drama to public speech there is some conspicuous continuity. This explains why public speech and specialized idioms that are adopted when negotiating with foes are major sources of identity building – a proxy for ideology in societies with no written language. People who will soon step in as leaders must learn the specific language of public speech, as do the Ashuar in Ecuador and Peru (Descola, 1996) or the Guayakí of Paraguay (Clastres, 1974/1987).
Language
Anthropologists have long been divided on the causal relationships between language and society: does language frame social interaction, or does it stem from previous intercourse? This ontological problem is known as the Sapir-Whorf hypothesis. The way people build sentences (with a subject, an object, and a verb), their numeration, and their vocabulary may stem from their way of life (urbanites do not need hundreds of quasi-synonyms for animals, meteorological conditions, and edible plants). However, it may also be true that once invented, language evolved and made people view the local world accordingly (if you have a lot of words to address ethical issues, then they become more prominent, and the same could be said for politics and war). In PS nowadays the conflict between structuralism and constructivism is still raging, and it lies mostly in the language/society option. Is language a natural source or a mere construct? Whatever the answer given to this question, language and translation issues remain central in PA. The following riddle is an enlightening example of language ambiguity and the necessity of knowing the context to understand it correctly. It is a convincing explanation of the difficulty of being understood plainly when talking (and delivering political speeches, for instance). Imagine a situation where the indigenous
partner of a foreign observer takes him to the outback to teach him the local terminology, when a rabbit suddenly appears from nowhere and starts running, prompting the interpreter’s shout (gavagaï!). The ‘derided anthropologist’ is then unable to decide on the exact meaning of the word (‘rabbit’? ‘look’? ‘beware’? ‘hunt’? ‘wonderful’? or just ‘wow’?), each possible meaning saying something about the value of the rabbit and the meaning of the situation to the indigenous interpreter (Quine, 2013). The lesson to be drawn is that context and intention matter even more than facts – a lesson we tend to forget in our world made of objective and precise description that must be free of any ambiguity.
Kinship: Real and Pseudo
Most of the comparative advantages of anthropology over the rest of the social sciences lie in its focus on marriage and filiation. Family trees are even considered a specific technique without which the social structure could never be understood. Consequently, every scientific study of men and women’s interactions and institutions worldwide should start from the basics – marriage, kinship, and family structures (as Egyptologists do). However, research published about modern polities often skips such fundamental relationships. Hence, there is plenty of pseudo-anthropological research borrowing its techniques from anthropological fieldwork without serious consideration of its objects. In PS, research on family links is not as technical and meticulous as in mainstream anthropology. However, personal connections between people in office and their challengers (who lives with whom, who is the godmother or the wedding witness of whom, etc.) are often studied. Of course, building solid knowledge about such webs of pseudo-kinship is no easy task, as such relationships may be drowned in an ocean of other connections (professional, educational, or incidental).
Since PA also targets societies in which marriage and filiation play only a marginal role, continuity between mainstream anthropology and the way political scientists use it presupposes an extension of the perimeter of kinship, to include fake blood ties – i.e. clientelist ties, allegedly similar to those analyzed by anthropologists, although void of any blood relationship (Leca and Schemeil, 1983). Anthropological findings on kinship are a source of inspiration for political scientists. They force them to take a closer look at: prohibited and preferential marriage or friendship (with whom one must be seen or not when dining out); ‘joking relationships’ among siblings, or just among peers (Radcliffe-Brown); social effervescence (Durkheim); mentorship of junior politicians by seniors, etc.
Leadership
To avoid notions that may look Western (like ‘rulers’), anthropologists use indigenous labels like ‘big men’, ‘prophets’, and ‘sorcerers’ (respectively translated into the modern idioms of political regimes as ‘leaders’,7 ‘spin doctors’, and ‘fixers’). PA needs words that imply neither full compliance with leaders’ public statements nor enforcement of their decisions. This points to the relative impotence of powerful people: to be heard by those over whom they rule they cannot rely on force, or on a majority in Parliament. PA shows that compliance depends on intangible perceptions, emotions, and symbols as much as on tangible resources. It also targets the volatility of personal influence, which depends on an endless maintenance of networks of sociability, plus a sincere understanding of reciprocity. Instead of taking for granted that institutions define the amount of power that someone can use when in authority, PA makes manifest the level of rulers’ vulnerability, even in modern political regimes (Abélès, 2002). It sheds light on deterrence and curse
rather than on the actual use of violence authorized by law. The paramount opposition of charismatic leaders to tricksters in most documented societies also underlines the fragility of power. Its double nature (Balandier, 1970: 37) is systematically made manifest through the opposition of two characters. In ancient monarchies, the King faces his substitute, or his jester; in tribal systems, the political chief is confronted by a ritual dignitary, or a ‘master of the land’; in modern regimes the ruler fights against the leader of the opposition. The former is meant to accumulate power in order to protect the people and provide for its needs, and to limit entropy, anomie, etc.; the latter is conceived as a means of preventing excessive accumulation of power, unacceptable to the people, through gossip and mockery or checks and balances (Wedeen, 2002). Some leaders are specialized in the secular, others in the sacred. Some inherit their status while feast-givers ‘buy’ it with gifts and free banquets (Abélès, 1980), as in modern electoral campaigns not so long ago in Japan or Italy. These differentiated ways of accessing a position of power guarantee that no influencer can ever become durably hegemonic.
Exchange
Since Mauss, then Lévi-Strauss, then Goody, the role of mutual exchanges in isolated societies has been carefully theorized. Although lacking laws and constitutions, such communities long ago designed sustainable processes of self-administration. Basic organizational principles are universal: circulation of property is required; some exchanges are prescribed and others prohibited; in the long run a strict equality of advantages must be achieved. With this in mind, the actual specificities of the modern world become obvious. Throughout the history of mankind, property rights have not been the rule but the exception. PA shows that in the long run humans
opt for commons that cannot be owned privately. Contemporary political anthropologists claim that such lineage shows that the economy is always embedded in the cultural and the political (Thomassen, 2008). The implication for contemporary societies is obvious: distributive justice is compelling; self-help is illegitimate. Income and property must pass not only from one generation to the next but from neighbor to neighbor. ‘Waste’ (as in Amerindian potlatch or Ethiopian banquets) may be a functional equivalent for modern rational investment tools such as campaign expenses, interest rates, transaction costs, or hedge funds. When redistribution or restitution is not possible, excessive wealth must be destroyed, something illustrated by the potlatch and New Guinean Kula ring (Boas, 2006/1895; Mauss, 1966; Malinowski, 1922). Exchange is also a proxy for alliance. Everywhere people are trying to find who is or could become an ally, and who is or could become a foe. Most people living in isolated societies express concerns about latent surrounding hostility. There is a scale of distrust, in which stepbrothers are common targets of defiance (Leach, 1954; Descola, 1996). The lesson to draw in politics is: be aware that most traitors are recent friends who defect when they consider that the structure of bilateral exchange has become imbalanced. Commons and redistribution, plus networks and alliance, are at the core of an anthropological view of exchange because they create and maintain more sociability than political institutions can.
Norms
This concept is not specific to PA but most anthropologists trace actual behavior to compelling norms with or without enforcement structures. In isolated societies norms organize every single detail of life. When there is no visible web of institutions, contrary to human groups studied by sociologists, or polities
observed by political scientists, then an exhaustive review of the hidden sources of compliance becomes a priority. Scholars must go beyond the surface and find the causes of deference for others, and respect for the rules. For instance, there is no enduring social organization without prohibitions – such as the prohibition of incest that applies even when shared interest or mutual love would call for exceptions. Likewise, there is no society without prescriptions – like the law of asylum even when fierce enemies or despised guests must be accommodated and protected willy-nilly. These paramount norms are the keystones of the whole social architecture. Second-tier norms, like loyalty, solidarity, and reciprocity, derive from them. For instance, political scientists know that no state apparatus could resist a sudden mobilization of the masses: the core question is which norm(s) can prevent such an uprising, and this can vary according to time and place. In 2010 in Egypt and Syria (just before the ‘revolutions’ in these two countries) two norms had long consolidated a tacit social contract: first, an imaginary ‘red line’ not to be crossed at the risk of personal and social disorders of a large magnitude; second, contempt for the political sphere, the realm of vice, ruse, corruption, and cheating. When support for these norms weakened, activists began to mobilize a less lethargic population. Although legal and even social rules had not changed at all, the hollowing out of these two political norms made the uprisings possible.
A Global Ambition
It is of note that the Global South and the Global North make differentiated use of PA. In the former, political scientists trace anthropological methods and problems to colonization. They rely on classical anthropology or its revival. In the latter, anthropological insights may help authors reveal sorcery,
prophecy, or magic hidden behind modern procedures of rulers’ endorsement. Additionally, classical anthropology may bring solutions to knowledge traps. When scholars are confronted with hostility, misunderstanding, and distrust towards invasive investigation, then taking stock of the works of past anthropologists is helpful. This can apply to urban gangs (A. Goffman, 2009), religious activists, corporations (Pudal, 2013), state visits (Mariot, 2011), citizens’ committees (Patsias et al., 2017), or war militias (Baczko et al., 2011). In the absence of seminal works in their own discipline that would help political scientists to find causes of the processes they study, they tend to refer to famous anthropologists. In recent years, two ethno-sociologists have emerged from a long list of big names: Erving Goffman and Nina Eliasoph. Both are celebrated for focusing on the avoidance of conflict in public venues, which in a pun alluding to ‘political science’ the latter labeled ‘political silence’. According to them, human beings live in a Hobbesian world, even after complex institutions have been invented to pacify social interactions. Encounters with strangers are potentially dangerous, worrisome, or embarrassing. Once conflicts break out from uncontrolled interactions, people must openly take sides. However, a dissenting opinion will tear the group apart. Cost- and risk-avoiding citizens in densely populated nations are therefore close to indigenous people who minimize conflict because they can neither exit from the small community to which they belong, nor voice their grievances too boldly. Turning to anthropology is also the best option when quantitative methodology is not feasible. But Goffman’s and Eliasoph’s achievements show that causation is not absent from qualitative research. PA can describe, illustrate, and understand without giving up on theorizing and explaining.
In Turner’s work (1969), ‘liminality’ explains much – as does Godelier’s survey of socialization processes among New Guinean Baruyas (Godelier and Strathern, 2008). Leach takes for granted that divided selves exist everywhere. People are politically flexible and they can understand deviant and selfish behavior that they may themselves occasionally adopt. Scholars all assume that context and meaning matter. But they are also looking for causes that could apply elsewhere – such as Abélès’s ‘eligibility’ (an implicit status that brings heirs of former leaders to the fore when candidates must be endorsed). It makes sense to look for genealogical credentials in a French town as well as in an Ethiopian village (Abélès, 1980, 1989), but it also shows that ‘selectorates’ exist everywhere (clerics in Iran, the military in Thailand, the Communist Party in China). In the end, most political scientists sincerely believe that in-depth interviews (not iterative conversations), accredited observation (not participant observation), and some attention to context (not field notebooks) are means of obtaining recognition from mainstream anthropologists. They simply skip the very tools that distinguish the latter from other social scientists: family trees, folk culture, and body language, which can be inferred from external signs such as garments, tattoos, and body piercing. Whereas anthropologists meticulously collect artifacts (and make drawings or photographs and even films that focus on objects), amateur ethnologists usually show disdain for ethnographic materials, since their quest for narratives takes a heavy toll in terms of time spent on the ground (Schemeil, 2006). To be sure, methodological tools matter much when what is at stake is the establishment or the legitimization of a science. No political scientist could, however, indulge in a fully inductive method. Living for years among people whose lifestyles are remote from theirs, in search of something that would make sense and consolidate their own explanation of a political process in a non-political society, is beyond reach, and possibly counterproductive.
PA is to PS what slow food is to fast food: a humble non-directed quest for those rare
occasions where the deep meaning of rituals and beliefs can be extracted from the relatively large amount of data with which professional and amateur anthropologists are confronted. While politics is quickly changing in political societies, authoritative assessment is required about what is going on there. It must be grounded in solid theory and exemplified by remarkable events. When hectic politics is hidden behind non-political processes and religious narratives whose pace of change is very slow, then patience and caution are of the essence. Because of such a gap between politics in formally politicized and non-politicized societies, selecting moments of exception offers a timely solution to this dilemma. Durkheim’s ‘effervescence’ in ‘primitive societies’ (Durkheim, 1915), Balandier’s ‘theatrocracy’ in Africa (Balandier, 1980), Geertz’s drama in Bali (Geertz, 1973), and Meier’s Greek tragedy (Meier, 1993) help us understand politicians’ performances in the media. This methodological shortcut brings together two assumptions: that visual representation of politics as a drama is enlightening; that big events make political virtues manifest. Here, popular support and participation prevail over what is actually represented – the uses and abuses of political structures. A similar remark could be made about Erving Goffman’s total institutions such as garrisons, hospitals, jails, and even schools, with their extensions in Parliaments and Courts: these are exceptional and reduced models of macro politics, whose social meaning is captured during special events (medical treatments, the hunt for fugitives, verdicts). It is as if, in order to reconcile PS (which tries to explain events) and ethnology (which unveils their underlying structure), PA should skip the fabric of politics to focus on what comes out of it. Because scholars are increasingly focusing on special events, we may wonder to what extent they can generalize from single case observations. Since events could
be exceptional, inferring kinship rules from one wedding, or political repertoires from one mass demonstration would be spurious. Rather than sound generalization this would be extrapolation, an epistemological sin. Case studies would be worthless if no general rule could be drawn from their particular lessons.
A Disputed Legacy
Major advances in PA may be considered as drawbacks in PS. A telling example is Leach’s tripartite distinction between kin, stepfamily, and foe, which diverges from Carl Schmitt’s frontal conflict between friends and enemies: because PA identifies people living in ego’s vicinity rather than foreigners as the main sources of conflict, it focuses on civil wars rather than international confrontations. PA also shows that introducing a third party between ‘us’ and ‘them’ substantially changes the nature of social conflict. It tells the story of humans fighting less fiercely against aliens than against allies, thus putting treason rather than reason at the core of their relationships with others. Another divergence from mainstream PS concerns the continuity of nature and culture. Despite variation between scholars over time, nature still matters more in PA than in PS, where it suffers from an early association with the much-debated sociobiology. Political scientists feel estranged from biology; hence animals, bodies, brains, and genes do not easily make their way as legitimate objects of study elsewhere than in ‘biopolitics’, a sidelined stream of research. On the contrary, anthropologists are less and less convinced of the necessity of distinguishing ‘physical’ from ‘cultural’ anthropology. They insist on our animal nature, the environment in which we live, and our relations with non-humans. In doing so, they are allegedly explaining variations between societies much better than they would, were they relying only on cultural differences (Descola, 2013).
There are methodological caveats, though. For instance, making successful efforts to popularize ‘grounded theory’ and ‘analytic narratives’ is a respectable attempt to combine qualitative and quantitative methods. Both have nonetheless to catch up with the mantra of a less theorized ‘thick description’ (Geertz, 1973), which looks more like a motto than a method: is this 1973 notion ‘thicker’ than observation as usually understood in ethnology (Hastings and Roux, 2018)? Because ‘thick’ description is selective, is it ‘richer’ than a detailed field description? How new is thick description compared to Lévi-Strauss’s (1955: 443) ‘slated structure’, which also focuses on a plurality of meanings for the same narratives or the same acts? According to Lévi-Strauss, multilayered mythical sets are made up of all successive versions, like the Oedipal myth from ancient Greece to Freudian Vienna (‘a myth is made up of all its variants’ and they all belong to a ‘permutation group’ [Lévi-Strauss, 1955: 435, 439, 443]). For Geertz, successive versions of a Balinese wedding from a 19th-century account to participant observation in the 20th century enrich the interpretation of this rite. The two approaches are similar, whatever Geertz could have said: both track the underlying resemblance concealed beneath manifest difference. However, Lévi-Strauss’s slated structure might be quantified whereas Geertz’s ‘thick description’ cannot be measured. General laws are less easily achievable with a methodology attached to idiosyncrasies. Because anthropologists focus on external symbols and artifacts of social life, they mainly work with cultural materials – and this may be of course conducive to culturalism. Geertz himself could step down from his iconic position in PA when we look at the way he addressed cultural issues. 
To get rid of any essentialism without removing specificity from his analysis he tries to find a global meaning behind the local accounts made of a cultural concept (like ‘person’, or ‘law’). Rather than his treatment of particularities, political scientists drawing inspiration from
his work should take stock of his obsessional generalization: to convey their intentions to others, people freely pick out tools in a constrained cultural repertoire. Lévi-Strauss and Balandier are not very far away, neither is J.-P. Daloz when he compares two strategic options, both culturally acceptable – opting for modest or agonistic behavior (Daloz, 2006). Beyond unequal achievements and doubts about best practices, recent trends in PA are being contested. Take the ‘interpretive turn’. In PA, people perform, they are acting, and they interpret a score. Therefore, PA is all about interpretation – theatrical interpretation by actors on stage (as in drama), or theoretical assumptions made by interpretive anthropologists. Does this add much to the epistemology of Lévi-Strauss, Balandier, and Geertz? Since Geertz displays a preference for an intuitive, inductive, and impressionist framework, the ‘interpretive turn’ in the social sciences is traced to his Interpretation of Cultures (1973). However, the meticulous depiction of Clouet’s portrait of Elisabeth of Austria in The Savage Mind was already an exercise in such kinds of empirical inference (Lévi-Strauss, 1966: 24–5). The description of the lace collar in miniature was an opportunity to infer a global law from a reduced model of reality,8 i.e. art is metaphorical and made up of events, while science is metonymic and made up of structures. But both require the collaboration of the informant/artist and the spectator/anthropologist – a rather Geertzian conclusion. This goes much beyond the famous ‘tour de force’ through which Geertz tries to draw global information about eye blinking by examining all possible ways to wink, blink, etc. and convey information with one’s body (Geertz, 1973: 6–10).
Either interpretation is a synthesis of explanation and understanding which encompasses them both; or, it is simply a literary facility used by scholars who are not novelists to freely depict reality as they see it, with no epistemological constraint to objectify it. And, of course, if the interpretive turn is
only a detour on the road to a fully assumed cognitivism, then PA merely underlines the potential of cognition instead of taking stock with regard to interpretation. Lastly, look at the ‘practice turn’. Although it is too early to say how new this actually is, there are some solid reasons to remain cautious about inferring pecking orders from actual behavior instead of deducing them from structural arrangements (Pouliot, 2016: 15). If, contrary to anthropologists, political scientists and internationalists lose contact with ontological issues (as stated by Joseph and Kurki, 2017), ‘practitionists’ a fortiori would remain on the surface of what they should explain in depth. However, turning to practice makes anthropologists vulnerable to empiricism, inductionism, and excessive empathy for the people observed. Here lies a big contradiction: a better understanding of the meanings of social behavior may be achieved at a high cost – loss of lucidity. Overall, the balance is nonetheless positive: PA brings more novelty than pseudo-innovation, and it opens up many more avenues than dead ends.
A Bright Future
Two alternatives are offered to PA. Either it opts for interactionism and cognitivism as the best ways to reconcile anthropology and sociology, theory and practice, interpretation and explanation.9 Or it politicizes methodology and epistemology in an attempt to offer new explanations, different from what mainstream PS takes for granted. PA surfs on the postcolonial wave and contributes to the debate about the colonization/decolonization process, on the ground or in the mind. Gender issues are less prominent, though. Because it reflects a reality to which it tries to be faithful, PA often relays gender imbalances that exist in the real world. PA could benefit from the recent popularization of ‘digital humanities’, hence making
true Lévi-Strauss’s dream of a computational anthropology. This would go much beyond the timid quantification of a qualitative science. For instance, we can imagine all the possible permutations of acts or processes meant in various spaces and times to convey the idea of a limitation of power. Such moves could well bring PA closer to PS – so close that in the end the boundary between the two would collapse. One question would nonetheless remain: to what extent does any ‘anthropology of…’ power, States, revolutions, insurgency, wars, migration, securitization, climate change, class, etc., to cite entries of one of the latest handbooks (Wydra and Thomassen, 2018a), differ from a political sociology or even a political theory of the same issues? Is it not true that power, defined as ‘resistance to intrusion’ (Horvath, 2018) rather than the capability of erecting borders in relation to others, is similar to Philip Pettit’s concept of ‘non-domination’ (1997)? Is it not true that revolutions, defined as ‘ritual processes, highlighting transformative characteristics that closely resemble liminal in-between periods and spaces known from rites of passage’ (Thomassen, 2018), belong to a broader category – the ‘political socialization’ of amateurs who become activists (Patsias et al., 2017)? Contrary to the growing popularity of PA, which is at the root of the so-called ‘anthropological turn’, or may be considered as the rescuing of post-positivism by anthropologists, ‘political anthropology’ cannot be a politicized sort of knowledge; it cannot be conflated with ‘the politics of anthropology’ (Thomassen, 2008: 1); it cannot indulge in tracing ‘the political stakes of practice’ (Rabinow and Stavrianakis, 2018). To remain meaningful and keep their recognized added value, political anthropologists must export their methodological skills and their own objects of observation (like interpersonal networks) rather than just importing political scientists’ concerns.
Mimicking or ‘challenging conventional wisdom in PS’ will not suffice. Admittedly, PA
is currently focusing on ‘process’ (instead of structure or function), ‘identity’ (instead of interest and organization), and ‘otherness’ (instead of a Western-centered view of the rest of the world) (Wydra and Thomassen, 2018b: 1–2). However, becoming post-positivist, post-modern, and postcolonial will not spare it the need to theorize about institutions and policy designs, the strategic manipulation of culture, and the intermingling of Western and non-Western powers. Conversely, political scientists cannot be mere amateurs in the realm of PA or, worse, hide their lack of competence in (and appetite for) quantitative methods behind an anthropologist-friendly attitude. To become a true anthropologist one should get enough ethnological training, and become familiar with the genealogy and the debates of the profession from the outset (which goes back to Maine, Morgan, and Tylor). Actually, professional anthropologists and political scientists who use PA may agree on one point: taming violence, inventing forms of self-restraint, and avoiding conflict are documented across time and space. Most people are risk-averse, in particular when they live or act within small groups and isolated societies. Encounters with strangers are possible sources of discomfort (Anderson, 2011; Schemeil, 2017). This may be true when someone tries to express a dissenting opinion, which will inevitably jeopardize the very existence of the group. Hence, nearly everywhere ‘avoiding politics’ is of the essence (Eliasoph, 1998). This may well be the paradoxical lesson that can be gleaned from the sudden popularity of PA outside anthropology: the more you do it, the less its objects remain ‘political’. To conclude, the added value of true PA is multifold. It turns politics upside down, and works bottom-up instead of top-down, horizontally instead of vertically, with networks as substitutes or rivals to governmental institutions.
It helps us to see the invisible (unframed institutions, latent norms, and underlying social structures) concealed by
the visible. It combines the universal and the specific. It is conducive to extended comparison across time and space with no regard for disciplinary insistence on ‘historicity’, ‘continuity’, and ‘incomparability’. In other words, it paves the way for a cognitive socio-anthropology of the political in comparative perspective, an interdisciplinary approach that is still to come.
Notes
1 PA ‘shows that all human societies produce politics and that they are all subjects to the vicissitudes of history’ (Balandier, 1970: viii).
2 ‘The very strong stress on social equilibrium, which was so evident in Evans-Pritchard’s approach, was quickly questioned in a series of works that focused more on conflict and change’ (Wydra and Thomassen, 2018b: 4).
3 ‘The Kachins choose, according to their particular situation, the most favourable mythical references for their present interest’ (Balandier, 1970: 72).
4 ‘Anthropology was of course always about politics’ (Thomassen, 2008: 265).
5 Otherwise, how could we explain that listeners from most parts of the world understand the following joke immediately and punctuate its ending by a loud laugh? (‘Paradise is a place where organization is German, police is British, cooking is French, and opera is Italian; now, what’s Hell? A place ruled by German police, British cooking, French opera, and Italian organization’).
6 While disciple-master relationships matter in Islam, clientele are unavoidable in the Middle East, and parliamentary work owes much to pressure groups in the United States.
7 Although Bøås (2018) relevantly points out that in civil wars militias are headed by local ‘big men’, which explains the extent and durability of fragmentation of states like Syria, and the replacement of state institutions by ‘networks’ of insurgents.
8 ‘Science would have worked on the real scale but by means of inventing a loom, while art works on a diminished scale to produce an image homologous with the object. The former approach is of a metonymical order, it replaces one thing by another thing, an effect by its cause, while the latter is of a metaphorical order’. (Lévi-Strauss, 1966: 25)
9 One excerpt from E. Goffman’s work (1983) is enlightening in this regard: ‘Social structures
don’t “determine” culturally standard displays, merely help select from the available repertoire of them. The expressions themselves, such as priority in being served, precedence through a door, centrality of seating, access to various public spaces, preferential interruption rights in talk, selection as addressed recipient, are interactional in substance and character; at best they are likely to have only loosely coupled relations to anything by way of social structures that might be associated with them’. (Goffman, 1983: 1)
References
Abélès, M., 1980. ‘Religion, traditional beliefs, interaction and change in a southern Ethiopian society: Ochollo (Gemu Gofa)’. In D. L. Donham and W. James (eds), Working Papers in Society and History in Imperial Ethiopia. Cambridge: Centre of African Studies, pp. 163–168.
Abélès, M., 1989. Jours tranquilles en 89: Ethnologie politique d’un département français. Paris: Editions Odile Jacob.
Abélès, M., 2002. ‘Political anthropology of European institutions: Tension and stereotypes. Multiculturalism in everyday life’. In K. Liebhart, E. Menasse, and H. Steinert (eds), Fremdbilder, Feindbilder, Zerrbilder: zur Wahrnehmung und Diskursiven Konstruktion des Fremden. Klagenfurt: Drava, pp. 241–254.
Almond, G. and S. Verba, 1980. The Civic Culture Revisited. Boston (Mass.): Little Brown.
Anderson, E., 2011. The Cosmopolitan Canopy: Race and Civility in Everyday Life. New York: W.W. Norton.
Baczko, A., G. Dorronsoro and A. Quesnay, 2011. ‘Mobilisations as a result of deliberation and polarising crisis: The peaceful protests in Syria’. Revue française de science politique, 63(5): 1–25.
Balandier, G., 1970. Political Anthropology. Harmondsworth: Penguin Books.
Balandier, G., 1980. Le pouvoir sur scènes. Paris: Balland.
Bayart, J-F., 2009. The State in Africa: The Politics of the Belly, 2nd Edition. Cambridge and Malden, MA: Polity.
Boas, F., 2006. Indian Myths & Legends from the North Pacific Coast of America: A Translation of Franz Boas’ 1895 Edition of
Indianische Sagen von der Nord-Pacifischen Küste-Amerikas. Vancouver, BC: Talonbooks. Bøås, M., 2018. ‘New war zones or evolving modes of insurgency warfare?’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London: Elgar, pp. 331–343. Bromberger, Christian (ed.), 1995. Le Match de football: Ethnologie d’une passion partisane à Marseille, Naples et Turin. Paris: MSH. Clastres, P., 1974. La Société contre l’Etat. Paris, Éditions de Minuit. (English version: 1987. Society Against the State. Cambridge: MIT Press.) Daloz, J-P., 2006. ‘Sur la modestie ostensible des acteurs politiques au nord du 55e parallèle’. Revue internationale de politique comparée, 13(3): 413–427. Daloz, J-P., 2018. ‘Comparative political analysis and the interpretation of meaning’. In H. Wydra and B. Thomassen, (eds), Handbook of Political Anthropology. London: Elgar, pp. 177–190. Descola, Ph., 1996. The Spears of Twilight: Life and Death in the Amazon Jungle. New York: New Press. Descola, Ph., 2013. Beyond Nature and Culture. Tr. Janet Lloyd. Chicago: The University of Chicago Press. Durkheim, E., 1915. The Elementary Forms of Religious Life: A study in religious sociology. London: G. Allen & Unwin; New York: Macmillan. Easton, D., 1959. ‘Political anthropology’. Biennial Review of Anthropology, 1: 210–262. Eliasoph, N., 1998. Avoiding Politics: How Americans Produce Apathy in Everyday Life. Cambridge: Cambridge University Press. Evans-Pritchard, E. E., 1940. The Nuer: A Description of the Modes of Livelihood and Political Institutions of a Nilotic People. Oxford: Clarendon Press. Evans-Pritchard, E. E. and Meyer Fortes, 1961. African Political Systems. Oxford: Oxford University Press. Foote Whyte, W., 2004. Street Corner Society: The Social Structure of an Italian Slum. Chicago: The University of Chicago Press, 1955. Fortes, M., and E.E. Evans-Pritchard, 1948. African political systems. London: Oxford University Press.
Geertz, C., 1973. The Interpretation of Cultures: Selected Essays. New York: Basic Books. Gledhill, J., 2009. ‘Power in political anthropology’. Journal of Power, 2(1): 9–34. Godelier, M., and M. Strathern, 2008. Big Men and Great Men: Personifications of Power in Melanesia. Cambridge: Cambridge University Press. Goffman, E., 1983. ‘The Interaction Order: American Sociological Association, 1982 Presidential Address’. American Sociological Review, 48(1): 1–17. Goffman, A., 2009. ‘On the run: Wanted men in a Philadelphia ghetto’. American Sociological Review, 74(3): 339–357. Gluckman, M., 1963. Order and Rebellion in Tribal Societies. London: Cohen & West. Hastings, M., and Ch. Roux, 2018. ‘Que faire de la ‘description dense’ de Clifford Geertz?’. In G. Devin and M. Hastings (eds), 10 concepts d’anthropologie en science politique. Paris: CNRS Editions, pp. 13–36. Horvath, A., 2018. ‘Charisma/trickster: On the twofold nature of power’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London: Elgar, pp. 51–64. Huntington, S.P., 1996. The Clash of Civilizations and the Remaking of World Order. New York: Simon & Schuster. Joseph, J., and M. Kurki, 2017. ‘The limits of practice: Why realism can complement IR’s practice turn’. International Theory, 10(1): 71–97. Leach, E. R., 1954. Political Systems of Highland Burma: A Study of Kachin Social Structure. London, LSE and Cambridge, MA: Harvard University Press. Leca, J., and Y. Schemeil, 1983. ‘Clientélisme et patrimonialisme dans le monde arabe’. International Political Science Review, 4(4): 455–494. Lévi-Strauss, Cl., 1955. ‘The structural study of myth’. The Journal of American Folklore, 68(270): 428–444. Lévi-Strauss, Cl., 1966. The Savage Mind. Chicago: The University of Chicago Press. Lewellen, C. T., 2003. Political Anthropology. Westport, CT: Praeger. Malinowski, B., 1960. A Scientific Theory of Culture and Other Essays. New York: Oxford University Press.
Malinowski, B., 1922. Argonauts of the Western Pacific: An Account of Native Enterprise and Adventure in the Archipelagoes of Melanesian New Guinea. London: G. Routledge and Sons. Mariot, N., 2011. ‘Does acclamation equal agreement? Rethinking collective effervescence through the case of the presidential “tour de France” during the twentieth century’. Theory and Society, 40(2): 191–221. Mauss, M., 1966. The Gift: The form and reason for exchange in archaic societies. London: Cohen & West. Meier, Ch., 1993. The Political Art of Greek Tragedy. Baltimore, MD: The Johns Hopkins University Press. Patsias, C., J., Durazo Herrmann, and S. Patsias, 2017. ‘The steep and slippery slope of politics: Civic spirit, empowerment and politicization in citizen committees’. European Journal of Cultural and Political Sociology, 6(1): 95–123. doi:10.1080/23254823.2018. 1496844. Pettit, Ph., 1997. Republicanism. A Theory of Freedom and Government. Oxford: Clarendon Press. Pouliot, V., 2016. ‘Hierarchy in practice: Multilateral diplomacy and the governance of international security’. European Journal of International Security, 1(1): 5–26. Pudal, R., 2013. ‘Politics in the fire station: An ethnographic approach to relations to politics in the firefighting world’. English version. doi:10.3917/rfsp.615.0917. [French: La politique à la caserne. Revue française de science politique, 61(5), 2011, 917–944]. Quine, W. V. O., 2013. ‘Translation and meaning’. In Quine, W. V. O. (ed.), Word and Object (New ed.). Cambridge, MA: MIT Press, pp. 23–72. Rabinow, P., and A. Stavrianakis, 2018. ‘Contemporary political stakes: After-lives of the modern’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London: Elgar, pp. 65–75. Schemeil, Y., 2006. ‘Une anthropologie politiste en France’. Raisons Politiques, 22(2), 49–72. Schemeil, Y., 2011. ‘Culturalism’. In B. Badie, D. Berg-Schlosser, and L. Morlino (eds), The International Encyclopedia of Political Science, vol. 1. 
London: Sage, pp. 511–514.
Schemeil, Y., 2017. ‘Le moment noir: pourquoi les études des rapports interethniques en Europe et en Amérique ne convergent pas’/ ‘The “nigger moment”: Why the studies of interethnic relations in Europe and America do not converge’. Revue française de science politique, 67(4): 695–714. (English version forthcoming). Szakolczai, A., 2018. ‘Recovering the classical foundations of political anthropology’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London: Elgar, pp. 19–36. Thomassen, B., 2008. ‘What kind of political anthropology?’. International Political Anthropology, 1(2): 263–274. Thomassen, B., 2018. ‘The anthropology of political revolutions’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London: Elgar, pp. 160–176.
Turner, V., 1969. The Ritual Process: Structure and Anti-Structure. Chicago: Aldine Press. Vernant, J.-P., 1991. ‘A “beautiful death” and the disfigured corpse in Homeric epic’. In Froma I. Zeitlin (ed.), Mortals and Immortals: Collected Essays. Princeton, NJ: Princeton University Press, pp. 50–75. Wydra, H., and B. Thomassen (eds), 2018a. Handbook of Political Anthropology. London: Elgar. Wydra, H., and B. Thomassen, 2018b. ‘Introduction: The promise of political anthropology’. In H. Wydra and B. Thomassen (eds), Handbook of Political Anthropology. London, Elgar, pp.1–18. Wedeen, L., 2002. ‘Conceptualizing culture: Possibilities for political science’. American Political Science Review, 96(4): 713–728. Wylie, L., 1957. Village in the Vaucluse. Cambridge, MA: Harvard University Press.
11 Uses and Abuses of Formal Models in Political Science
Jack Paine and Scott A. Tyson
The Use of Models
Formal political theory is a methodological approach – common in domestic politics, comparative politics, and international relations – that is characterized by its strong commitment to logical rigor as well as its conceptual and analytical clarity.1 One of formal political theory’s core strengths is that it confronts foundational questions about politics. For instance, who shapes policy? What strategies do they use? And what informational and incentive constraints affect political interactions? Pioneering contributions of formal political theory to each of the subfields go beyond these basic questions to precisely articulate the mechanisms responsible for the political outcomes we observe, often by untangling countervailing effects and isolating clear counterfactual comparisons.
A formal political theory usually comprises at least two components: first, a logical (often mathematical) structure representing the critical individuals, decisions, constraints, and information that make up the substantive question, and second, and just as important, an interpretation of that logical structure that gives substantive meaning to the aspects and results of the model. These two components are critical (Rubinstein, 2012), and they also introduce a flexibility in the questions formal political theory can address. The diverse ways in which formal political theory can contribute to understanding politics have also engendered considerable disagreement about how scholars can most productively use formal models as an analytical tool. In fact, there is a great deal of disagreement among formal theorists regarding what qualities make for a good (or insightful) model, the relationship between theory and empirical work, and what kinds of questions formal models are most appropriate for answering.
In this chapter, we present a novel distinction between two common approaches to formal models in political science. First is the phenomenon perspective, which seeks to relate a formal model to descriptive empirical
patterns, and the second is the experimental perspective, which views formal models as an explication of a causal mechanism.2 To illustrate the strengths of each of these perspectives (relative to the other), we consider the typical concerns a theorist confronts when developing a formal model from each perspective. We focus in particular on how each perspective approaches comparative statics. A comparative static analysis is an ‘all else equal’ comparison: it changes a single factor, holds all other aspects of the model fixed (‘static’), and examines the resulting change in some outcome (perhaps simply equilibrium strategies).
An ideal model from the phenomenon perspective addresses three empirical considerations. First, what patterns in the real world motivate the need for a formal model? Second, do real-world actors perceive tradeoffs that correspond with key assumptions in the model setup? Third, do the model’s comparative static predictions match empirical relationships? Although phenomenon-driven models are not realistic in the sense of providing a literal description of the real world, the setup and implications of these models do attempt to match attributes of the real world. Many approaches to model construction in political science draw elements from the phenomenon approach, whether they espouse combining models with quantitative evidence (Morton, 1999; Granato and Scioli, 2004), qualitative evidence (Bates et al., 1999; Goemans and Spaniel, 2016; Lorentzen et al., 2017), or a combination (Laitin, 2003). Furthermore, in practice, many scholars attempt to provide insight into real-world phenomena when writing models, therefore implicitly adopting at least some elements of the phenomenon approach. Lorentzen et al.
(2017) surveyed every game theory article in six prominent political science journals between 2006 and 2013 that examined topics in international relations or comparative politics. They found that of the 182 articles,
128 (70%) included either a quantitative or a qualitative empirical component. The extent of this evidence differs from article to article, ranging from brief anecdotes in the introduction, to regression analysis of experimental or other originally collected data, and detailed case studies. But even sparse discussions of empirical evidence aim to convince the reader that aspects of the model are ‘realistic’, and descriptively reflect substantive cases. The real world is messy and complicated, and sometimes the best approach to understanding how it works is to analyze things in isolation. But there are always substantive features which, although known to be important real-world considerations, are nevertheless superfluous for explaining the core political mechanism. This observation motivates the experimental approach to writing a formal model, which focuses on isolating and understanding substantive mechanisms. Ideally, an experimental-driven model is intentionally parsimonious because the priority is on viewing a particular causal mechanism in isolation. Consequently, introducing extraneous features into the model, although more descriptively realistic, is counterproductive because either such features add no additional insights, or worse, they create confusion. Instead, the more focused the model, the more focused the comparison, and the more general the insight (Banks, 1990). Comparing the experimental approach to formal political theory to actual experimental design highlights its goals and virtues (Haavelmo, 1944; Ashworth et al., 2015). The classic setup of an experiment considers different levels of a ‘treatment’ and compares average outcomes between a treatment group and control group. 
Holding all else equal is precisely the goal of models from the experimental perspective, and consequently, there is less concern with accounting for the full panoply of substantive factors because – from the theorist’s perspective – these additional things are not a critical part of the analysis.3 A key strength of this approach is that by focusing on a particular mechanism, the
analysis can reveal and understand the nuts and bolts of a substantive case, regardless of whether the mechanism of interest actually operates in isolation in the real world. A particularly important component of comparative static analysis from the experimental perspective concerns indirect effects. Often changing a single parameter can affect an outcome of interest through direct and informational channels. For instance, to understand the influence of political mobilization on government policies (through voting, protest, or other means), scholars generally study two effects. The first is a direct effect: mobilized dissent can create various problems that a government is forced to deal with regardless of the reason for the disruption to society. Second, mobilized dissent is generally considered to communicate dissatisfaction among members of the citizenry with the government’s policies. This leads to a conceptually distinct, informational, channel through which mobilized dissent influences government policy. From the phenomenon perspective, indirect effects can be a nuisance because they obstruct clean directional predictions from the model. However, from the experimental perspective, indirect effects are often the most interesting aspect of the model, because they demonstrate the character and importance of strategic considerations. Below we provide numerous examples of the phenomenon and experimental approaches in applied research, distinguished by model motivation, setup, and comparative statics. We then discuss common critiques of
formal models based on empirical applicability or lack thereof, and illustrate the differences in how the two approaches handle critiques. We discuss two influential debates. First, redistributive political transition models posit that economic inequality affects prospects for democratization by affecting demands for redistribution (Acemoglu and Robinson, 2000, 2001, 2006; Boix, 2003). Second, selectorate theory examines how institutional variation in leadership selection affects a range of outcomes, including public good provision and international war (Bueno de Mesquita et al., 2005). We conclude with implications for research and training. Specifically, we emphasize how graduate game theory courses, by incorporating crucial philosophical and conceptual issues, could demonstrate how models can address substantively interesting questions in addition to teaching the technical structure of models.
Table 11.1 summarizes the defining elements of both approaches. Importantly, these approaches are not mutually exclusive, and most published formal modeling articles contain elements of each. However, explicating this distinction is critical for understanding how to use formal models to advance knowledge of political phenomena, and how to avoid common critiques that may be pertinent to one approach but not the other. Our conceptual distinction between different perspectives has largely been overlooked and is useful for all political scientists who might otherwise neglect some contributions of formal models.
Table 11.1 Key differences between phenomenon and experimental approaches
Aspect | Phenomenon | Experimental
Motivation | Explain descriptive patterns | Isolate mechanisms
Model setup | Assumptions should correspond with tradeoffs perceived by real-world decision-makers | Assumptions should be parsimonious to yield conceptual clarity
Comparative statics | Sign of key comparative static predictions (usually the total effect) should match statistical relationship or actions/outcomes in empirical cases | Comparative statics are used to isolate substantive channels
The Phenomenon Perspective
Motivation
The phenomenon perspective is often motivated by empirical patterns or a set of observations, which can be either quantitative or qualitative, that existing research does not convincingly explain. Sometimes, the researcher presents a single pattern that raises strategic questions. For example, Slantchev asks a question about a particular case:
During the last days of September 1950, the US administration faced a momentous decision about what to do in Korea: should American forces stop at the 38th parallel, as originally planned, or should they continue into North Korea, and turn the conflict from a war of liberation into a war of unification? (Slantchev, 2010: 357)
He then presents a model in which an optimal response to such dilemmas depends on the opponent’s incentive to ‘feign weakness’. Miller and Schofield (2003) demonstrate that the US states won by the Republican presidential candidate William McKinley in the 1896 election correspond nearly perfectly with the states won by the Democratic presidential candidate Al Gore in the 2000 election, motivating their model of how party agents can push platforms that over time yield party realignment. Acemoglu and Robinson’s (2006) book begins with narratives from Britain, Argentina, Singapore, and South Africa to highlight four regime trajectories that differ on whether democratization occurs and its stability. Their model explains how economic inequality shapes the equilibrium behavior of elites and the masses, which creates varying regime trajectories.
Other articles juxtapose disparate patterns and argue that they share a common strategic logic. For example, Powell (2012: 620) posits ‘three striking features or stylized facts about both interstate and civil war’ based on quantitative and qualitative evidence in existing research: ‘(1) there are often periods of persistent fighting, (2) fighting commonly
ends in negotiated settlements as well as in militarily decisive outcomes, and (3) fighting sometimes recurs’. He argues that shifts in the distribution of power and actors’ strategic fighting decisions (to forestall adverse shifts) yield equilibrium behavior consistent with all three patterns. Paine (2018) contains a section before the model setup that presents regression tables to highlight a mixed empirical pattern: higher country-level oil production covaries with less frequent center-seeking civil wars, whereas higher regional-level oil production covaries with more frequent separatist civil wars. The model highlights two main countervailing effects of oil production on the likelihood of civil war onset, and explains why these mechanisms vary in magnitude depending on the opposition’s optimal civil war aims.
Model Setup
To explain empirical phenomena, the model setup should incorporate important tradeoffs that real-world actors perceive when making choices. Although explicitly motivating assumptions using real-world examples is somewhat less common than motivating examples or testing comparative statics predictions, Lorentzen et al.’s (2017) survey shows that 23% of game theory articles in their sample contained explicit evidence for assumptions. For example, Svolik (2009) studies an interaction between a dictator that seeks to concentrate power and a ruling coalition that attempts to maintain a power-sharing arrangement. He assumes that the dictator’s strategic action to concentrate power sends an informative (but imperfect) signal to the ruling coalition, who may react by staging a coup. Svolik demonstrates the empirical relevance of this assumption by providing examples in which leaders’ attempts to consolidate power generated observable signals to ruling coalition members. In the Soviet Union, Lavrenty Beria merged formal ministries after Joseph
Stalin’s death to concentrate power in his hands. In Iraq, Saddam Hussein gradually replaced earlier supporters with loyalists from his home town. In these cases, subordinates gained information that was consistent with attempted power concentration, but they were unsure of the true motives of the dictator – which corresponds with the core assumptions of Svolik’s (2009) model.
The motivating puzzle in Nalepa (2010) is that in the late 1980s, many communist regimes in Eastern Europe negotiated democratic transitions with the opposition. Having gained assurances that communist agents would not face punishment following a regime transition, they resigned peacefully in cases such as Poland, Hungary, and Czechoslovakia. This is puzzling when considering that the communists should seemingly have expected the opposition to break these promises. But, empirically, the new democratic leaders kept their promises, which is also puzzling given their widespread desire to punish the communists. Nalepa (2010) studies a signaling model and explains that these promises were credible because of communists’ private information about transgressions committed by the opposition as informants during communism – i.e., their ‘skeletons in the closet’. But this mechanism is only empirically relevant if the real-world actors did indeed perceive this information asymmetry, which she confirms using evidence from interviews. For example:
The communists attempted to exploit this informational advantage by trying to convince the opposition that it was highly infiltrated. One of the dissidents representing Solidarity in the roundtable negotiations recalled: ‘When I met Kwasniewski, he said, “Do not mess with those files, let them be – the agents were mostly your own people”’. (Nalepa, 2010: 349–50)4
Comparative Statics
Whatever the initial motivation for presenting and setting up a formal model, the analysis generates comparative static predictions that researchers can evaluate either with
statistical or qualitative evidence. This is a central element of the influential ‘Empirical Implications of Theoretical Models (EITM)’ approach to political game theory (Morton, 1999; Signorino, 1999; Signorino and Yilmaz, 2003; Granato and Scioli, 2004) and receives support from methodological research on combining game theory and qualitative methods (Bates et al., 1999; Goemans and Spaniel, 2016; Lorentzen et al., 2017). Lorentzen et al.’s (2017) aforementioned survey of game theory articles in political science shows that 63% of game theory articles provided either statistical tests, cross-case comparisons, or case studies to evaluate comparative static predictions. For example, Conrad and Ritter (2013) examine the effects of international human rights treaties on incentives for domestic leaders to exercise repression. First, these treaties increase the likelihood of domestic protests in reaction to repression, increasing the need to exercise repression to retain power. Second, international human rights treaties increase the probability that repressive rulers will face litigation, which increases the costs of repression. Their formal analysis shows that the magnitude of the first effect depends on other aspects of the leader’s job security. The first mechanism is relatively small in magnitude for secure leaders because they are unlikely to experience mass unrest regardless of the presence of an international treaty. However, the first mechanism is large in magnitude if the ruler is insecure, and dominates the second mechanism. This analysis yields a clear implication about an empirically observable interaction effect. Conrad and Ritter provide regression evidence that international human rights treaties are uncorrelated with repressive behavior in states with insecure leaders, but covary with lower repression in states governed by secure leaders. 
As another example, Paine (2016) examines two countervailing implications of oil production: it raises the value of capturing the state for a rebel group, but it also increases government revenues to spend on patronage
distribution and coercion. Untangling these distinct effects yields an implication about conventional practice in the empirical conflict literature. Standard conflict models include both oil production and income per capita on the right-hand side of the regression, and usually find that more oil production covaries with a higher frequency of civil war. The motivation for controlling for income per capita is that this is a strong predictor of civil war onset. However, the logic of the model highlights the problem with this control variable, which many argue proxies for government revenues. By controlling for income, the regression implicitly answers the largely irrelevant question of what the effect of discovering oil in countries like Saudi Arabia would have been if discovering oil did not increase government revenues. Revised regression specifications that incorporate this consideration demonstrate empirical results inconsistent with conventional wisdom about a conflict resource curse.
As an example of using qualitative evidence, Dunning (2008) highlights a set of conditions where resource wealth can promote democratic stability. High rents enable the government to provide public goods to the masses without needing to soak wealthy elites for tax revenues – mitigating class conflicts that would otherwise arise under a democratic regime. Using evidence from Venezuela, he shows that when oil rents were high in the 1970s, elites did not object to the high levels of public benefits provided to the masses because these public goods did not require high taxation (163–6). By contrast, as oil rents fell, Dunning (2008) shows that politics became polarized around classes and redistributive conflicts and ultimately facilitated the rise of the populist Hugo Chavez (166–83).
These examples also highlight the value added of the formal analysis for deriving empirically testable comparative statics.
In all three examples, the model analysis highlights two countervailing effects of a particular stimulus. The formal model facilitated
the rigorous examination of the interaction between the two mechanisms and the conditions in which one should dominate the other. In each case, the analysis yielded novel empirical predictions that the researcher could take to data and check for directional congruence.
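The logic of two countervailing effects can be illustrated with a deliberately stylized sketch. The linear functional form and the names `a`, `b`, and `m` are hypothetical and are not taken from any of the models cited above; the sketch only shows how writing the mechanisms down explicitly pins down the condition under which one dominates the other, and hence the sign of the total effect that would be compared with data.

```python
def total_effect(m: float, a: float = 1.0, b: float = 0.4) -> float:
    """Total marginal effect of a stimulus on an outcome (hypothetical form).

    a * m : first mechanism, raises the outcome, scaled by a moderator m in [0, 1]
    b     : second mechanism, lowers the outcome by a fixed amount
    """
    return a * m - b


def dominance_threshold(a: float = 1.0, b: float = 0.4) -> float:
    """Moderator value at which the two mechanisms exactly offset (a * m == b)."""
    return b / a


def predicted_sign(m: float, a: float = 1.0, b: float = 0.4) -> int:
    """Directional prediction for the total effect: +1, 0, or -1."""
    effect = total_effect(m, a, b)
    return (effect > 0) - (effect < 0)


if __name__ == "__main__":
    # Illustrative moderator values below, at, and above the threshold.
    for m in (0.1, 0.4, 0.9):
        print(m, total_effect(m), predicted_sign(m))
```

With these illustrative parameter values, the predicted direction flips once the moderator passes `b / a`; an interaction of this kind is what a researcher would then check for directional congruence in data.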
The Experimental Perspective
Motivation
The goal of experimental-driven models is to study specific attributes of strategic tradeoffs, such as individual motivations, information frictions, and other strategic issues that shape politics. For instance, institutional constraints like voting rules, the timing of elections, or the rules determining how legislation must be proposed dramatically influence various aspects of democracy (Diermeier and Krehbiel, 2003; Dewan and Shepsle, 2011). Other examples include how political accountability differs from standard contracting problems (Ashworth, 2012), and the importance of communication in bureaucracy (Gailmard and Patty, 2012).
As another example, Di Lonardo and Tyson (2018) study the interaction of domestic political threats and the logic of deterrence, which they approach from an experimental perspective. They first present the baseline crisis bargaining model of Fearon (1994) and Schultz (1998), and use this model to formally articulate the conventional logic of deterrence. Then they introduce domestic political threats into this framework similar to Bueno de Mesquita et al. (2005) and Baliga et al. (2011): domestic support is necessary for a leader to keep power. To isolate the effect of domestic political threats on the logic of deterrence, it is important when adding domestic politics to hold constant all other aspects from the benchmark model. In addition, although the benchmark crisis bargaining model suffers from some shortcomings, the contribution of Di Lonardo
and Tyson (2019) would not be clear had they started with a non-standard benchmark model of an international crisis.
Model Setup
The goal of experimental-driven models is not to attempt to approximate the real world, but instead to only include in the model elements needed to elucidate the core mechanism. For example, Tyson (2018) studies a central problem with exercising repression in authoritarian regimes: the dictator requires the cooperation of her security apparatus. However, the very need for a security apparatus creates an agency problem: if the leader loses power, then she cannot completely fulfill the promises made to members of the repressive apparatus. Tyson (2018) explicitly removes other agency problems from the model, like moral hazard and adverse selection, even though such features are unarguably present in reality. Tyson (2018) does not include these aspects in his model in order to study implications resulting exclusively from the agency problem that arises from the leader’s tenuous hold on power.
As another example, Banks and Duggan (2006) study determinants of public policy in legislatures with majority rules. They adopt a bargaining approach that assumes different members of the legislatures interact over time, and each can be randomly selected to make a policy proposal. If a majority adopts a proposed policy, then it becomes the new policy. By contrast, if a majority rejects a proposal, then the status quo remains in place. The goal of the model is to examine the implications of changing one key assumption from existing models: each legislator prefers any settlement to the status quo policy, i.e., the status quo is necessarily bad. This change implies that legislators may view the status quo policy favorably, making legislators more reluctant to vote for a new policy. Although real-world legislatures contain many additional features that Banks and Duggan (2006)
do not incorporate into their model, making the setup more realistic would distract from their goal of changing a single substantive feature from existing models.
Comparative Statics
The purpose of comparative static exercises in experimental-driven models is to highlight the distinct channels through which a single factor causes a change in an outcome of interest, including equilibrium actions or their substantively relevant consequences. In most cases, there are numerous channels that correspond to separate mechanisms. The primary goal of the experimental approach is to elucidate each mechanism. As a canonical example, suppose that different values of a treatment are represented by different values of x that directly influence an outcome, but may also provide information to decision makers, i.e., by changing their beliefs. In this case, the substantive outcome of interest, Y(x, β), depends on x and beliefs, β. Supposing that everything is differentiable, then the total derivative with respect to x equals the sum of the direct and informational effects:
\[ \frac{dY}{dx} = \frac{\partial Y}{\partial x} + \frac{\partial Y}{\partial \beta} \cdot \frac{d\beta}{dx} \]
The first term reflects the direct influence of x on Y. The second term combines the direct effect of beliefs on the outcome and the effect of x on beliefs. These distinct effects may pose a nuisance if the goal is to yield a predicted relationship between x and Y to take to the data. However, from the experimental perspective, the goal of the model is to untangle these distinct mechanisms. Sometimes these results are counterintuitive and ‘surprising’ from the perspective of existing theories, and this is a key strength of models emerging from the experimental perspective. For an experimental-driven formal theorist, perhaps the most interesting aspect of a comparative static relationship is the indirect effects that arise as a result of the strategic
context. To highlight the importance of informational effects, consider, for example, the classic jury problem, which studies whether jury verdicts reflect people’s sincere opinions gathered from the facts of the case (Austen-Smith and Banks, 1996; Feddersen and Pesendorfer, 1998; Persico, 2004). To clarify this point, suppose there are N jurors and two collective outcomes, guilty (G) and innocent (I). Suppose also that convicting (i.e., choosing G) requires unanimity. There are also two equally likely states of the world: the defendant is truly guilty, or she is truly innocent, represented by ω ∈ {G, I}, respectively. Jurors want to convict the guilty and to acquit the innocent, and their payoffs are represented by: u(G,G) = u(I,I) = 1 and u(G,I) = u(I,G) = 0. Each juror attends the trial, but despite their common preferences, each interprets the evidence and arguments slightly differently. To capture this, each juror receives an informative signal, where guilty signals are more likely to be seen when the defendant is guilty, and innocent signals are more likely to be seen when the defendant is innocent. Formally, juror i receives a signal, s_i, that equals either G or I, and Pr(s_i = ω | ω) = q ∈ (0.5, 1). Will all jurors vote sincerely in line with their signal? Consider the problem from the perspective of an individual juror, who truly wants to convict only the guilty and to acquit only the innocent. Imagine this juror has seen a signal suggesting that the defendant is innocent. However, she also knows there is some probability that her signal is wrong and the defendant is guilty, and, moreover, other jurors’ signals may differ from hers. The juror in this example is driven by an informational concern. There is a direct effect that follows from her signal, namely, an innocent signal suggests that the defendant is innocent. However, there is an important indirect effect that follows from the structure of the jury problem, namely, the voting rule. 
Specifically, a juror considering whether her
vote is pivotal in the ultimate verdict, and who is considering voting to acquit, knows that the only case in which her vote will make a difference is when all other voters have cast guilty votes. But if all these voters have voted sincerely, it means that they have all received guilty signals – an extremely unlikely event when the defendant is in fact innocent. Consequently, it is not a best response for the juror to vote sincerely. More broadly, this example illustrates how indirect informational concerns influence decisions in political contexts.
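The pivotal-voter logic above can be checked with a short calculation. The following is a minimal sketch, not code from any of the cited papers: it assumes that all other jurors vote sincerely, uses the equal prior and signal accuracy q from the setup above, and applies Bayes' rule to the event that the juror is pivotal (all N − 1 other jurors received guilty signals). The function names are hypothetical and chosen only for illustration.

```python
def posterior_guilty_when_pivotal(n_jurors: int, q: float, own_signal: str = "I") -> float:
    """Pr(defendant is guilty | own signal, all other jurors saw guilty signals).

    Assumes equally likely states, conditionally independent signals with
    accuracy q, and sincere voting by the other n_jurors - 1 jurors.
    """
    if n_jurors < 2:
        raise ValueError("need at least two jurors")
    if not 0.5 < q < 1.0:
        raise ValueError("signal accuracy q must lie in (0.5, 1)")
    if own_signal not in ("G", "I"):
        raise ValueError("own_signal must be 'G' or 'I'")

    others = n_jurors - 1
    # Likelihood of the juror's own signal in each state of the world.
    own_given_guilty = q if own_signal == "G" else 1.0 - q
    own_given_innocent = 1.0 - q if own_signal == "G" else q
    # Joint weight of (state, own signal, all others saw G), with prior 1/2.
    weight_guilty = 0.5 * own_given_guilty * q ** others
    weight_innocent = 0.5 * own_given_innocent * (1.0 - q) ** others
    return weight_guilty / (weight_guilty + weight_innocent)


def sincere_acquittal_is_best_response(n_jurors: int, q: float) -> bool:
    """True if a juror with an innocent signal still prefers to vote to acquit.

    With the symmetric 0/1 payoffs from the text, acquitting is optimal when
    the posterior probability of guilt, conditional on being pivotal, is at
    most one half (a small tolerance guards against rounding).
    """
    return posterior_guilty_when_pivotal(n_jurors, q, "I") <= 0.5 + 1e-12


if __name__ == "__main__":
    for n in (2, 3, 6, 12):
        p = posterior_guilty_when_pivotal(n, 0.7)
        print(n, round(p, 4), sincere_acquittal_is_best_response(n, 0.7))
```

Under these assumptions the posterior simplifies to q^(N−2) / (q^(N−2) + (1−q)^(N−2)), so for any jury with more than two members a juror holding an innocent signal believes, conditional on being pivotal, that the defendant is more likely guilty than innocent. With q = 0.7 the posterior is about 0.7 for N = 3 and above 0.999 for N = 12, which is why sincere voting fails to be a best response in the example.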
The Abuse of Models Most formal political theory articles contain elements of both the phenomenon and experimental approaches. A formal political theory formulated from one perspective is motivated by a distinct set of concerns that the other perspective does not necessarily share – nor should it. But because the distinction between phenomenon and experimental kinds of models has not been articulated previously, concerns that are important ingredients from one perspective are often unintentionally used to obstruct the other. For example, an experimental-driven model, on the surface, appears to be far more stylized than one written from the phenomenon perspective. However, it is important to stress that this superficial kind of ‘artificiality’ is intentional, and constitutes one of the key strengths of this theoretical approach. The experimental theorist is driven not by a desire to include as many factors as possible, but instead, needs to ensure that mitigating influences, with respect to the main factor of interest, are suppressed. To accomplish this theoretically, the theorist intentionally omits factors, even though they might be important in the real world. These omitted factors are precisely the things that an empiricist controls for, but for a formal model to keep such things fixed, the theorist must omit them from the model.
A common critique of game theoretic models is that their implications are unimportant because they rest on unrealistic assumptions (e.g., Green and Shapiro, 1996; Elster, 2000).5 To illustrate the difference between the phenomenon and experimental approaches, consider how a theorist from each perspective might respond to this criticism. A phenomenon-driven theorist should respond by modifying the assumptions to better reflect reality, whereas the experimental-driven theorist would allege that such a complaint reflects a misunderstanding of the question their model was designed to address. Scholars have also debated the role and importance of empirical evidence in validating a model’s predictions. At one extreme, the American Journal of Political Science briefly proposed in the early 2000s a submission policy in which the editors would desk-reject any formal modeling manuscript that lacked an accompanying empirical test (Fowler, 2005). At the other extreme, Clarke and Primo (2012) argue that empirically testing models misunderstands their purpose. Instead, they argue that the only purpose of models is what we call the experimental approach. With regard to this controversial debate, the difference between the phenomenon and experimental approaches to formal political theory is crucial for understanding the source and relevance of these different points of view. Confusing philosophical positions with quality judgments tends to obscure the discussion, leading scholars to talk past each other regarding things that are largely orthogonal to substantive issues. To illustrate our point, we present two examples from prominent models that exemplify these distinctions.
Redistributive Political Transitions The idea that inequality and prospects for economic redistribution affect incentives to seek or to resist democratization has a long
pedigree in political science. More recently, Acemoglu and Robinson (2000, 2001, 2006) present a parsimonious formal framework to explain these incentives in which a commitment problem is the key mechanism. Acemoglu and Robinson’s (2006) core model analyzes an interaction between a representative rich elite that sets policy under a dictatorship, and a representative agent of the poor masses that sets policy under democracy. Each actor seeks to maximize its own consumption by affecting the tax rate. Because of the assumed wealth disparity, elites prefer no taxes whereas the masses prefer a positive tax rate. Furthermore, economic inequality determines the extent to which the two actors disagree about taxes, as higher inequality causes the masses to prefer a higher tax rate. Although the elite unilaterally determine the tax rate under dictatorship (de jure power), the masses may be able to force higher tax rates by staging a revolution (de facto power). The elite has three options to stave off revolution: temporary concessions, repression, or democratization. One key mechanism that Acemoglu and Robinson’s (2006) model elucidates is the effect of economic inequality on the likelihood of democratization. They derive a non-monotonic relationship in which democratization only occurs if inequality is intermediate. At low levels of inequality, there is low demand by the masses for democracy because the amount of wealth held by elites that the masses could redistribute to themselves in democracy is low. At high levels of inequality, democratization does not occur because the elites use repression instead. The amount of redistribution under democracy would be so high that elites prefer to use costly repression to retain power. However, if inequality is intermediate, then mass demand for democratization is high enough that negotiated concessions are insufficient to prevent revolution, but the elites’ fate under democracy is not dire enough for them to use repression. 
Subsequent research criticizes numerous assumptions of the model setup. Some scholars allege that class differences between rich
and poor are usually not the primary political cleavage that drives political transitions (Epstein et al., 2012; Haggard and Kaufman, 2012; Ansell and Samuels, 2014). Others argue that, at least in the post-colonial world since 1945, economic elites do not usually exercise political control. For example, the military usually does not act as a proxy for the wealthy (Slater et al., 2014). Some posit that revolutionary threats rarely provide a stimulus for democratization and that other factors appear more important for explaining manhood suffrage in most European countries (Collier, 1999; Lizzeri and Persico, 2004; Llavador and Oxoby, 2005), womanhood suffrage (Przeworski, 2009), or internationally driven transitions in recent decades (Levitsky and Way, 2010; Haggard and Kaufman, 2012). Finally, many democracies do not redistribute en masse either because they lack infrastructural capacity (Slater et al., 2014), or because elites exert considerable influence even under democracy (Albertus and Menaldo, 2018).6 Are these critiques relevant? From the phenomenon perspective, many of these are pertinent critiques that require a sustained theoretical and empirical dialogue. Given Acemoglu and Robinson’s (2006) stated goal to explain empirical instances of democratic transitions, it is important for the model to incorporate key tradeoffs that real-world policy makers faced. Correspondingly, models written in response to these critiques have yielded numerous insights by altering aspects of the original setups to more closely capture particular empirical settings (Dower et al., 2018). 
From an experimental perspective, these critiques are less relevant because the key contribution of Acemoglu and Robinson (2000, 2001, 2006) was to build on existing non-formal theories of democratization to understand the strategic interaction among social classes.7 Moreover, perhaps the most important contribution of these models is in identifying how democratization can result from a commitment problem that arises when elites lose (even temporarily) de facto
political power. The models of Acemoglu and Robinson (2000, 2001, 2006) also generate several counterintuitive predictions. For example, Acemoglu and Robinson show that if the masses can only mobilize infrequently to stage a revolution, then eventual democratization becomes more likely. This result follows because infrequent mobilization enhances the masses’ bargaining leverage in periods they can organize for revolution, since their future valuation of the status quo regime is low. As another example, Boix (2003) shows that high inequality does not cause elites to resort to repression when asset liquidity is high. If elites can move their assets abroad, then they do not fear high taxes under democracy, hence highlighting a subtle mitigating effect in the inequality–democratization relationship. Furthermore, highlighting the value of mechanism-based contributions to spurring future research and empirical insights, Paine (2019a) extends the asset liquidity mechanism in a dynamic model to help explain the empirical relationship between oil production and separatist civil wars.
Selectorate Theory The experimental perspective to formal political theory asks a different question. Specifically, does the setup of the model, including the underlying assumptions, isolate clear causal mechanisms? Whereas ensuring that all relevant factors are incorporated into the model is a mark of quality from the phenomenon perspective, it is often a sign of conceptual confusion from the experimental perspective. Likewise, having a clean, streamlined, and focused model is ideal for the experimental approach, but a scholar motivated by the phenomenon perspective typically has a skeptical view of such a model’s conclusions. As an illustration, consider the selectorate theory presented in Bueno de Mesquita et al. (1999) and Bueno de Mesquita et al. (2005). A simple observation motivates selectorate
theory: every leader relies on the support of some specified set of individuals, called the selectorate, which is designated by a country’s institutions. As a result of this, leaders cannot sustain their hold on power without adequately compensating their winning coalition, the proportion of the selectorate needed to keep them in office. When the winning coalition is small, as in autocratic regimes, this is most effectively accomplished through providing private goods. By contrast, when the winning coalition is large, as in democracies, this is most effectively accomplished through public goods provision. Numerous implications follow from this core insight, including why democracies do not fight each other, which the authors confirm with numerous statistical tests. Like redistributive political transition models, selectorate theory has attracted considerable criticism. Gallagher and Hanson (2015) critique three main aspects, all of which reflect a phenomenon perspective. First, in reality, there is no clear distinction among winning coalition members, selectorate members, and non-selectorate members. Second, existing measures of these concepts are flawed, rendering Bueno de Mesquita et al.’s (2005) statistical tests invalid.8 Third, Gallagher and Hanson (2015) criticize selectorate theory’s core assumptions, arguing that the theory treats selectorate members as homogeneous, conflates rulers with regimes, and mischaracterizes the relationship between public goods and political rights. Once again, the response to these critiques depends on one’s philosophical perspective. From the phenomenon perspective, it is important to improve the descriptive accuracy of the assumptions and to incorporate additional elements into the original model. 
These considerations have motivated several extensions to the original model that include revolutions, purges, and other forms of authoritarian ruler turnover (Bueno de Mesquita and Smith, 2009, 2017), the effects of natural disasters (Flores and Smith, 2013), and leader health shocks (Bueno de Mesquita and Smith, 2018).
However, viewed from the experimental perspective, a deeper concern with the core selectorate theory model is that it may attempt to be too realistic. The baseline selectorate model presented in Bueno de Mesquita et al. (2005: chapters 2 and 3) contains more than ten choice variables, plus a number of exogenous parameters and an infinite horizon. The core mechanism of the model, however, can be expressed more clearly by removing most of these moving pieces. For instance, Bueno de Mesquita (2016: chapter 11) presents a simplified version of selectorate theory that isolates the effects of the core mechanism – winning coalition size – and shows how it affects public goods provision and foreign policy aggression.
Implications for Research and Training Many debates about specific formal models in political science, and the modeling enterprise more generally, draw from what we term the phenomenon and experimental approaches. But because scholars have not previously articulated these distinct perspectives, we often talk past each other – both among those actively engaged in the formal theory enterprise and those who are not. Perhaps the most important takeaway from our discussion is that neither the phenomenon perspective nor the experimental perspective is inherently flawed. Instead, scholars often combine them effectively, if only implicitly, and each has unique strengths that have improved the scholarly understanding of politics. Importantly, the phenomenon and experimental approaches to formal models are not mutually exclusive, and most published models contribute to both approaches. However, authors typically frame their contribution as emphasizing one perspective over the other, which generally leads the overall contribution to be overlooked. Compared to the experimental approach, for many, the phenomenon
approach is more intuitive when writing and thinking about models in political science because it more closely corresponds to historical and qualitative approaches. However, the experimental perspective has been gaining ground in all the social sciences – and political science is no exception.9 Consequently, the experimental approach to formal political theory will become more useful as it more naturally connects to research designs focusing on causal relationships as well as lends insight into the issues that are at the heart of these empirical studies. In addition to the direct implications for conducting and evaluating formal political theory research, the phenomenon and experimental distinction also carries important implications for future formal political theory training in graduate programs. Formal political theory’s key strengths lie in its ability to bring conceptual clarity to substantive issues by transparently articulating the relationships that drive broader scholarly debates. But introductory courses in formal political theory focus almost exclusively on ‘tools’ or ‘skill-building’, which has the unintended consequence of leaving some important philosophical and conceptual issues unaddressed. Of course, correctly solving a formal model is necessary, but it is not sufficient for making a contribution to political science using formal political theory. Instead, articulating the distinct virtues of the phenomenon and experimental approaches highlights the diverse contributions of formal political theory, and it is our hope that explicitly highlighting distinct philosophical perspectives to formal political theory can clarify the general discussion.
Notes 1 We interchangeably refer to ‘formal political theory’, ‘game theory’, and ‘formal theory’. 2 See Cox (1990) for a similar distinction applied to statistical models. 3 In experiments, all else equal is accomplished by randomization of treatment assignment (Kempthorne, 1977; Rosenbaum, 2017).
4 Less frequently, scholars motivate key model assumptions using statistical evidence (e.g., Paine, 2018). 5 The inherent complexity of the social world requires imposing some simplifying assumptions to construct a model of political behavior, and thus, all models simplify, formal or not (Clarke and Primo, 2012). Friedman (1966) presents an extreme view that models should be assessed solely for their predictive ability, and that the assumptions that generate these predictions are entirely unimportant. On the other end of the spectrum, Bates et al. (1999: 14) argue that ‘the assumptions [should] fit the facts’ for a model to have empirical applicability, which is perhaps also too extreme. 6 Others examine empirical contexts in which the core assumptions of Acemoglu and Robinson’s original redistributive political transition theories exhibit greater empirical plausibility. Paine (2019b) argues that post-1945 European settler colonies in Africa fit the scope conditions because a rich and politically dominant European elite feared the revolutionary potential of the non-European majority. He demonstrates statistical evidence consistent with Acemoglu and Robinson’s and Boix’s prediction that high inequality should yield high repression and revolution. 7 Some responses by the authors adopt a mechanism-based defense. Discussing the original model, Acemoglu et al. (2013: 2, 16) state that ‘once one relaxed the simple poor versus rich nature of political conflict in their original models as well as the restriction of policy instruments, the nature of the comparative statics with respect to inequality in the basic model changed. Put simply, if the groups in conflict were not rich versus poor, but for example based on ethnic, religious or regional cleavages, it was not necessarily true that increasing inequality, in the sense of a higher Gini coefficient, would exacerbate conflict between groups. It might just result in increased redistribution within groups’. 
8 This critique also relates to Clarke and Stone’s (2008) re-analysis of Bueno de Mesquita et al.’s (2005) data. 9 This movement gained substantial momentum following Leamer (1983).
References Acemoglu, Daron and James A. Robinson. 2000. ‘Why Did the West Extend the Franchise? Democracy, Inequality, and Growth in Historical Perspective’. Quarterly Journal of Economics 115(4): 1167–1199.
Acemoglu, Daron and James A. Robinson. 2001. ‘A Theory of Political Transitions’. American Economic Review 91(4): 938–963. Acemoglu, Daron and James A. Robinson. 2006. Economic Origins of Dictatorship and Democracy. New York, NY: Cambridge University Press. Acemoglu, Daron, Suresh Naidu, Pascual Restrepo and James A. Robinson. 2013. ‘Democracy, Public Policy, and Inequality’. Newsletter for the Comparative Democratization Section of the American Political Science Association 11(3): 2,16–20. Albertus, Michael and Victor Menaldo. 2018. Authoritarianism and the Elite Origins of Democracy. New York, NY: Cambridge University Press. Ansell, Ben W. and David J. Samuels. 2014. Inequality and Democratization: An Elite Competition Approach. UK: Cambridge University Press. Ashworth, Scott. 2012. ‘Electoral Accountability: Recent Theoretical and Empirical Work’. Annual Review of Political Science 15: 183–201. Ashworth, Scott, Christopher Berry and Ethan Bueno De Mesquita. 2015. ‘All Else Equal in Theory and Data (Big or Small)’. PS: Political Science and Politics 48(1): 89–94. Austen-Smith, David and Jeffrey S. Banks. 1996. ‘Information Aggregation, Rationality, and the Condorcet Jury Theorem’. American Political Science Review 90(1): 34–45. Baliga, Sandeep, David O. Lucca and Tomas Sjöström. 2011. ‘Domestic Political Survival and International Conflict: Is Democracy Good for Peace?’. The Review of Economic Studies 78(2): 458–486. Banks, Jeffrey S. 1990. ‘Equilibrium Behavior in Crisis Bargaining Games’. American Journal of Political Science 34(3): 599–614. Banks, Jeffrey S. and John Duggan. 2006. ‘A General Bargaining Model of Legislative Policy-Making’. Quarterly Journal of Political Science 1(1): 49–85. Bates, Robert H., Avner Greif, Margaret Levi, Jean-Laurent Rosenthal and Barry Weingast. 1999. Analytic Narratives. Princeton, NJ: Princeton University Press. Boix, Carles. 2003. Democracy and Redistribution. New York, NY: Cambridge University Press.
Bueno de Mesquita, Bruce and Alastair Smith. 2009. ‘Political Survival and Endogenous Institutional Change’. Comparative Political Studies 42(2): 167–197. Bueno de Mesquita, Bruce and Alastair Smith. 2017. ‘Political Succession: A Model of Coups, Revolution, Purges, and Everyday Politics’. Journal of Conflict Resolution 61(4): 707–743. Bueno de Mesquita, Bruce, and Alastair Smith. 2018. ‘Political Loyalty and Leader Health’. Quarterly Journal of Political Science 13(4): 333–361. Bueno de Mesquita, Bruce, James D. Morrow, Randolph M. Siverson and Alastair Smith. 1999. ‘An Institutional Explanation of the Democratic Peace’. American Political Science Review 93(4): 791–807. Bueno de Mesquita, Bruce, Alastair Smith, Randolph M. Siverson and James D. Morrow. 2005. The Logic of Political Survival. Cambridge, MA: MIT Press. Bueno de Mesquita, Ethan. 2016. Political Economy for Public Policy. Princeton, NJ: Princeton University Press. Clarke, Kevin A. and David M. Primo. 2012. A Model Discipline: Political Science and the Logic of Representations. Oxford, UK: Oxford University Press. Clarke, Kevin A. and Randall W. Stone. 2008. ‘Democracy and the Logic of Political Survival’. American Political Science Review 102(3): 387–392. Collier, Ruth Berins. 1999. Paths Toward Democracy: The Working Class and Elites in Western Europe and South America. UK: Cambridge University Press. Conrad, Courtenay R. and Emily Hencken Ritter. 2013. ‘Treaties, Tenure, and Torture: The Conflicting Domestic Effects of International Law’. Journal of Politics 75(2): 397–409. Cox, David R. 1990. ‘Role of Models in Statistical Analysis’. Statistical Science 5(2): 169–174. Dewan, Torun and Kenneth A. Shepsle. 2011. ‘Political Economy Models of Elections’. Annual Review of Political Science 14: 311–330. Diermeier, Daniel and Keith Krehbiel. 2003. ‘Institutionalism as a Methodology’. Journal of Theoretical Politics 15(2): 123–144. Di Lonardo, Livio and Scott A. Tyson. 2019. 
‘Political Instability and the Failure of Deterrence’. Mimeo: University of Rochester.
Available at https://drive.google.com/file/d/1ZXSEweCUMVOLQhka5wuiPZa5qnAsumiL/view. Accessed 12/17/19. Dower, Paul Castañeda, Evgeny Finkel, Scott Gehlbach and Steven Nafziger. 2018. ‘Collective Action and Representation in Autocracies: Evidence from Russia’s Great Reforms’. American Political Science Review 112(1): 125–147. Dunning, Thad. 2008. Crude Democracy: Natural Resource Wealth and Political Regimes. UK: Cambridge University Press. Elster, Jon. 2000. ‘Rational Choice History: A Case of Excessive Ambition-Analytic Narratives’. American Political Science Review 94(3): 685–695. Epstein, David, Bahar Leventoglu and Sharyn O’Halloran. 2012. ‘Minorities and Democratization’. Economics & Politics 24(3): 259–278. Fearon, James D. 1994. ‘Domestic Political Audiences and the Escalation of International Disputes’. American Political Science Review 88(3): 577–592. Feddersen, Timothy and Wolfgang Pesendorfer. 1998. ‘Convicting the Innocent: The Inferiority of Unanimous Jury Verdicts under Strategic Voting’. American Political Science Review 92(1): 23–35. Flores, Alejandro Quiroz and Alastair Smith. 2013. ‘Leader Survival and Natural Disasters’. British Journal of Political Science 43(4): 821–843. Fowler, James H. 2005. ‘AJPS rejection without review for theory papers’. Posted on H-POLMETH, 24 March 2005. https://lists.hnet.org/cgi-bin/logbrowse.pl?trx=vx&list=hpolmeth&month=0503&week=d&msg=tvD9N7rz/5zpnT1STs198A Accessed 12/13/13. Friedman, Milton. 1966. Essays in Positive Economics. Chicago, IL: University of Chicago Press. Gailmard, Sean and John W. Patty. 2012. ‘Formal Models of Bureaucracy’. Annual Review of Political Science 15: 353–377. Gallagher, Mary E. and Jonathan K. Hanson. 2015. ‘Power Tool or Dull Blade? Selectorate Theory for Autocracies’. Annual Review of Political Science 18: 367–385. Goemans, Hein and William Spaniel. 2016. ‘Multimethod Research: A Case for Formal Theory’. Security Studies 25(1): 25–33.
Granato, Jim and Frank Scioli. 2004. ‘Puzzles, Proverbs, and Omega Matrices: The Scientific and Social Significance of Empirical Implications of Theoretical Models (EITM)’. Perspectives on Politics 2(2): 313–323. Green, Donald and Ian Shapiro. 1996. Pathologies of Rational Choice Theory: A Critique of Applications in Political Science. New Haven, CT: Yale University Press. Haavelmo, Trygve. 1944. ‘The Probability Approach in Econometrics’. Econometrica: Journal of the Econometric Society 12: iii–vi, 1–115. Haggard, Stephan and Robert R. Kaufman. 2012. ‘Inequality and Regime Change: Democratic Transitions and the Stability of Democratic Rule’. American Political Science Review 106(3): 495–516. Kempthorne, Oscar. 1977. ‘Why Randomize?’ Journal of Statistical Planning and Inference 1(1): 1–25. Laitin, David D. 2003. ‘The Perestroikan Challenge to Social Science’. Politics & Society 31(1): 163–184. Leamer, Edward E. 1983. ‘Let’s Take the Con out of Econometrics’. The American Economic Review 73(1): 31–43. Levitsky, Steven and Lucan A. Way. 2010. Competitive Authoritarianism: Hybrid Regimes after the Cold War. New York, NY: Cambridge University Press. Lizzeri, Alessandro and Nicola Persico. 2004. ‘Why Did the Elites Extend the Suffrage? Democracy and the Scope of Government, with an Application to Britain’s “Age of Reform”’. Quarterly Journal of Economics 119(2): 707–765. Llavador, Humberto and Robert J. Oxoby. 2005. ‘Partisan Competition, Growth, and the Franchise’. Quarterly Journal of Economics 120(3): 1155–1189. Lorentzen, Peter, M. Taylor Fravel and Jack Paine. 2017. ‘Qualitative Investigation of Theoretical Models: The Value of Process Tracing’. Journal of Theoretical Politics 29(3): 467–491. Miller, Gary and Norman Schofield. 2003. ‘Activists and Partisan Realignment in the United States’. American Political Science Review 97(2): 245–260. Morton, Rebecca B. 1999. Methods and Models: A Guide to the Empirical Analysis of
Formal Models in Political Science. New York, NY: Cambridge University Press. Nalepa, Monika. 2010. ‘Captured Commitments: An Analytic Narrative of Transitions with Transitional Justice’. World Politics 62(2): 341–380. Paine, Jack. 2016. ‘Rethinking the Conflict “Resource Curse”: How Oil Wealth Prevents Center-Seeking Civil Wars’. International Organization 70(4): 727–761. Paine, Jack. 2018. ‘A Theory of Strategic Civil War Aims: Explaining the Mixed Oil-Conflict Curse’. Working paper, Department of Political Science, University of Rochester. http://www.jackpaine.com/work-in-progress.html Accessed 12/21/18. Paine, Jack. 2019a. ‘Economic Grievances and Civil War: An Application to the Resource Curse’. International Studies Quarterly 63(2): 244–258. Paine, Jack. 2019b. ‘Redistributive Political Transitions: Minority Rule and Liberation Wars in Colonial Africa’. Journal of Politics 81(2): 505–523. Persico, Nicola. 2004. ‘Committee Design with Endogenous Information’. The Review of Economic Studies 71(1): 165–191. Powell, Robert. 2012. ‘Persistent Fighting and Shifting Power’. American Journal of Political Science 56(3): 620–637. Przeworski, Adam. 2009. ‘Conquered or Granted? A History of Suffrage Extensions’. British Journal of Political Science 39(2): 291–321. Rosenbaum, Paul R. 2017. Observation and Experiment: An Introduction to Causal
Inference. Cambridge, MA: Harvard University Press. Rubinstein, Ariel. 2012. Lecture Notes in Microeconomic Theory: The Economic Agent. Princeton, NJ: Princeton University Press. Schultz, Kenneth A. 1998. ‘Domestic Opposition and Signaling in International Crises’. American Political Science Review 92(4): 829–844. Signorino, Curtis S. 1999. ‘Strategic Interaction and the Statistical Analysis of International Conflict’. American Political Science Review 93(2): 279–297. Signorino, Curtis S. and Kuzey Yilmaz. 2003. ‘Strategic Misspecification in Regression Models’. American Journal of Political Science 47(3): 551–566. Slantchev, Branislav L. 2010. ‘Feigning Weakness’. International Organization 64(3): 357–388. Slater, Dan, Benjamin Smith and Gautam Nair. 2014. ‘Economic Origins of Democratic Breakdown? The Redistributive Model and the Postcolonial State’. Perspectives on Politics 12(2): 353–374. Svolik, Milan W. 2009. ‘Power Sharing and Leadership Dynamics in Authoritarian Regimes’. American Journal of Political Science 53(2): 477–494. Tyson, Scott A. 2018. ‘The Agency Problem Underlying Repression’. The Journal of Politics 80(4): 1297–1310.
12 Postmodernism Past, Present and Future Richard Beardsworth
Postmodernism names an increasingly varied set of intellectual strategies that, in contradistinction to the rational empiricist method of mainstream political thought in general, and international relations in particular, has foregrounded the interpretive and political nature of all international relations scholarship (see Cerutti, Chapter 9, this Handbook). Postmodernism is, that said, not simply an academic exercise of knowledge (although, as I argue later, it can run the risk of theoretical abstraction). Postmodernism’s general aim is to test the limits and assumptions of our forms of knowledge of the political world in order to open up alternative ways of conceiving and practising the political (particularly to one side of the tenets of modernity). With its focus on contingency, plurality and reflexivity, postmodernism is often considered an appropriate form of political thinking for a globalizing world. In comparison to the schools of liberalism, realism and Marxism in both political and international theory, postmodernism does not present a
school of thought but no scholar of a postmodern disposition accepts the term uncritically; indeed, a wide variety of critical practices are covered by it. That said, the above brief description underscores, in Wittgensteinian terms, a basic ‘family resemblance’ between the arguments of these scholars. It is this resemblance that has allowed the concept of ‘postmodernism’ to acquire, in the last 30 years, intellectual consistency, particularly in the disciplinary field of International Relations (IR) and to designate something like a specific approach to international relations: an approach that, for younger scholars especially, may appear justified by recent history – the relative decline of the West, the concomitant changes in international order, global development, the political crisis of climate change, etc. In this light, the following account of postmodernism focuses on: (1) the intellectual origins of the term and its application to the discipline of IR; (2) postmodernism’s major strategies and commitments within the IR field;
(3) important problems it meets as a specific disposition in IR; and (4) future perspectives. My account of the past, present and future of postmodernism is neither unduly sympathetic nor unduly critical. It seeks rather to situate the postmodern IR disposition in relation both to its historical formation and to the complexity of contemporary and anticipated international politics.
Origins
Postmodernism in IR must be traced back to the major moves of French thought in the 1960s and 1970s. Amalgamating a set of arguments from modern linguistics, anthropology (see Schemeil, Chapter 10, this Handbook), psychoanalysis, post-Kantian philosophy and phenomenology, and literary theory, intellectuals like Jacques Derrida (2016), Gilles Deleuze and Félix Guattari (2013), Michel Foucault (1980, 1991, 2008), Jean-François Lyotard (1986a), Julia Kristeva (1984) and Roland Barthes (1972) launched from the late 1960s onwards a series of critiques of the major tenets of humanism in the humanities and social sciences. They criticized, in particular, the authority of the modern subject (as the ground of modern knowledge, ethics, politics and aesthetics), representative theories of truth and reality and progressive theories of history. Although these critiques are different in kind and, at times, mutually incompatible, they are all, to a greater or lesser degree, intellectually committed to the following: to look to difference, otherness and the non-human there, where the modern period and/or modernity (within a larger metaphysical movement or not) secures ethical, historical and political identity through reason; to undermine any straightforward correspondence between representation and reality, the knowing subject and the known object; and, by showing the constructed nature of our representations of the world, to prise open the limits of these identities and constructions in order to allow
marginalized and alternative voices to emerge. Postmodern authorships have been concerned, in sum, with transcending modern ontology, epistemology and ethics, with expositing forms of critique that do not rely on a final instance of truth – God, humanity, the modern subject, the proletariat, the West, etc. – and with opening up, from within prevalent structures of power, alternative conceptions of ethical and/or political determination. From the 1980s onwards, this thought began to influence the sub-discipline of IR within political science. Richard Ashley (Ashley and Walker, 1990), David Campbell (1998), William Connolly (2002), James Der Derian (2009), Jim George (1994), Michael Shapiro (1999) and Robert Walker (1993) are the foremost figures in this translation. The French critiques of language and representation helped, notably, to offer an alternative to the empirical rationalism of mainstream IR thought: the belief that theory describes the world and is validated through empirical evidence. The above refusal of an ultimate referent in language, together with the concomitant emphasis on the ‘performative’ nature of all knowledge, has allowed IR scholars, on the one hand, to historicize the dominant ‘objects’ of international relations thought – anarchy and collective action, sovereignty and interdependence, security and conflict resolution – contra the perceived abstractions of ‘rational choice theory’ and the empirical research methods of dominant forms of realism and liberalism. On the other hand, this refusal has also allowed them to reveal relations of interest and power within any theoretical construction of political reality. Here postmodernist critique has dovetailed with those of feminism and post-colonialism: or rather, the origins of IR postmodernism have equally informed the latter critiques.
Since knowledge cannot be neutral, re-presenting its object as such, it always already involves hierarchies of construction and interpretation.1 As a result of these two gestures, the field of IR is pulled away from explanatory analysis and related back to both
theory and power; and the objects of IR are, in principle, opened up to alternative possibilities of political knowledge, agency and organization. This account of the origins of postmodern IR does little justice, within critical IR theory in general, to the important normative differences between ‘modernist’ and ‘postmodernist’ understandings of international politics. I will take these differences up in the next sections.
Commitments
Since the establishment of these strategies within IR theory, postmodern IR thought has increasingly weighed theoretical understanding with empirical investigation. Postmodern topics in IR range from genealogies and deconstructions of the IR tradition (see Hellmann, Chapter 76, this Handbook), through focused attention on modern delimitations of sovereignty, to theorizations of contemporary modalities of warfare and security, of ‘liberal’ regimes of governance and surveillance, of American and British foreign policy, humanitarian interventions and migration. As said, since postmodern approaches to these topics simultaneously open up questions of patriarchy and western domination, they have necessarily dovetailed with respective transformations of IR through feminism and post-colonialism. This article will focus on the topic of ‘modernity’ in postmodern IR theory since all the above engagements presuppose a common approach to it. To understand the ‘how’ of postmodern commitments in the IR field entails understanding this presupposition. In its strongest philosophical formulations – one thinks here particularly of the works of Jacques Derrida and of Jean-François Lyotard – the ‘post’ of ‘postmodern’ designates neither an ‘after’ nor an ‘against’ of the modern period, of modernization processes or of the general descriptive ‘modernity’
(see particularly Lyotard, 1986b). It refers, rather, to a critical, ambiguous relation to the general tenets of the modern (see Derrida, 1982, 2004). These tenets consist in a belief in progress, in the emancipating powers of reason, in the political centrality of subjectivity and in ethical and political universalism (for example, and respectively, human rights and international/global institutions). Together with these tenets, commitment to ‘modernity’ affirms the exemplarity of a certain number of modernization processes that accompanied western capitalist development over the last 500 years: most importantly, the distinction between religion and the state; the separation of civil society from the state; the emergence of the individual and the categories of ethical, legal and political personhood; and liberal and social variants of democratization. For postmodern thought – and here the work of Michel Foucault has been by far the most influential in the social sciences in general and IR in particular – these processes dominate and discipline the individual as much as they individualize and empower him or her. Subjectivity implies, also, subjection. In Derrida’s terms, modern law covers over its own violent emergence and is consequently unable to reflect upon its ethical and political limits. Within the parameters of modern subjectivity and law, liberalism affirms, for example, universal tolerance, general assembly and historical progress based on specific understandings of individual and collective freedom. These parameters are, however, considered necessarily exclusive and violent since their core assumption of a unified subject ignores multiple identities and histories (within the modern subject, within western formations and among western, non-western and planetary formations). 
Finally, for postmodern thought, following the French ethical philosopher Emmanuel Lévinas (1969), the immanent relation between theoretical and practical reason in modernity codifies ethics and turns the Enlightenment principles of freedom into ones of abstract domination
with little eye to locality and contingency. In these three respects, modernity’s promise of a generalizable axiomatics of freedom hides (ethical and political) practices of discipline and domination. For postmodern thought, modernity can be neither avoided nor transcended. Its inherent limits are, however, to be constantly (re)negotiated in a larger understanding of human and non-human worlds. Within the field of IR this argument has meant the following. For many in IR – traditionalist or not – the practice and discipline of international relations turns on the ‘anarchy’ of relations that befall the state outside the parameters of its own sovereignty as well as on the collective action dilemmas that ensue from this structural anarchy. For postmodernists, sovereignty and anarchy are mutually constitutive, violent constructions, the respective delimitations of which suppress the multiple differences and pluralities of the real world. Modern political organization – the state and the state system – does wrong to these differences, and it is only through recognitions of these differences that other ways of thinking and practising political agency in international and global politics are possible. Contra contemporary cosmopolitanisms, however, postmodern IR thought does not affirm either the individual against the modern state or the trans-national human rights regime that underpins supra-national juridical personhood. As said above, the limits of the individual are as constructed as those of the modern state. Modern liberal individualism and its internationalist equivalent, cosmopolitan liberalism, codify subjects in particular ways as bearers of rights, as fully delimited and conscious persons, as beings prior to social interaction and meaning. 
While, therefore, aware of the freedoms gained through modernity in distinction to the hierarchies of the pre-modern age, postmodern IR thought has considered, until now, the state as an obstacle in the plural fields of global politics, and sees the individual and universalism as the legacies of the particular formation of western Enlightenment. This legacy excludes understanding of non-western
practices of value, social interaction and community. It cannot, consequently, theorize for humanity as a whole. Postmodern thought converges here with neo-Marxist and postcolonial critiques of western imperialism: modernity cannot be a straightforward counter-force to such imperialism since, however much it distinguishes itself from authoritarianism, it is deeply complicit with past and present practices of domination. What emerges from this postmodern analysis of modernity? Any counter-subject to replace western modernity would simply reproduce the tenets of modernity. Postmodernists have consequently not sought (to date at least) a new political subject. They have sought to untie and foreground what is repressed by political subjectivity as such, hence their practical concern with the excluded, the marginal, the dissenting and an often too implicit engagement with political organizations that are more inclusive. Postmodern thought’s prevalent political concern has been with non-state agency and/or with a radical ethics. This ethics is not that of rational universalism. Addressing, as reasonably as possible, what political reason and subjectivity will necessarily exclude, it is an ethics of contingency, of uncertainty, of humility in the interstices of political arrangement and institution. Given technological and economic globalization, postmodernism in IR befits, it is argued, a global age. For in this age, plurality and difference are increasingly critical values that require articulation under the homogeneity of capitalist globalization processes and with the pluralization of international power. The postmodern engagement with modernity leads, as a result, to critical, dissenting approaches to international topics.
Problems
I have said that postmodern thought considers itself to be a form of thinking that befits a
globalized age. Its aim to recognize difference and plurality from under the categories of Enlightenment reason, modernity and globalization would indeed seem an appropriate intellectual project in order to transcend the binaries of west/non-west or north/south and set up a more appropriate ‘global’ discourse and practice of international politics under contemporary processes of globalization (increasing interdependence and fragmentation). The dovetailing of postmodernist, feminist and post-colonialist concerns over the last 20 years in IR appears also to institutionalize this endeavour. There is, that said, a set of problems within postmodernism: ones affecting postmodern intellectual commitments as such; and ones that have emerged as these commitments have come to be rehearsed within the discipline of International Relations. This section considers two overriding problems; the next section considers the futures of IR postmodernism in a globalized, fragmented age in their light. One major problem met by the postmodern intellectual disposition is relativism. If, as postmodernism argues, all knowledge is constructed – and therefore informed by interest and power – it becomes difficult to posit hierarchies of knowledge either in the realms of ontology and epistemology or in the realms of ethics and politics. Without such weighted orders of knowledge, arguing for one set of knowledges and/or values against another becomes inherently tendentious. As a result, postmodernists can end up giving theoretical and/or moral equivalence to very different sets of claims in the name of reflexivity or diversity. The most evident example of this equivalence is in the postmodern argument against theoretical and moral universalism (considered as a basic tenet of modernity).
To argue that no theory can make universal claims, or rather that any theory must understand how its claims are located in time and space in order to have theoretical rigour in the first place, ignores the processes by which thought is universalized beyond the bounds of (a particular) time and space. A theory has
universal validity, for example, because the ‘peer review’ process of its particular community underpins the claims of that theory. Without this process of peer review, these claims would remain within the community and not transcend it. The emergence of the truth of climate change over the last 50 years provides, perhaps, the most important recent illustration of this process. As a result of worldwide review among scientists of climate data, the theory of anthropogenic climate change has been established beyond all reasonable doubt. No postmodernist would wish to give intellectual equivalence to climate science and climate denialism; and yet, postmodernism’s suspicion regarding modern rationalism has not helped avoid our ‘post-truth’ era. More importantly, it remains unclear how the commitments of postmodernism can help restore consensus around the priority of truth. The relativist risk of postmodernism is equally evident today in the realm of ethics. In a globalizing and fragmented world one must pay careful attention to pluralism and to the plural ways in which human and non-human worlds deserve recognition and respect. To argue against the modern norms of the European Enlightenment in the name of this pluralism can confuse, however, source and validity. To argue that all humans ‘possess’ individual rights that require institutional framing constitutes a claim that has its overt source in European thought at a particular moment of economic, social and political transformation. The claim is not, however, reducible to this moment, but transcends it (or not) through the way in which it outranks counter-claims (or not). This normative outranking requires, in turn, spaces of discussion and deliberation that are themselves formed by the very values that one seeks to discuss. 
Within this hermeneutic circle of moral interpretation (played out through institutions and through historical time) it is impossible to reduce to a moment in time or place the processes of universal value-formation. It is accordingly impossible to give – in the name
of reflexivity, plurality or diversity – moral equivalence as such to a universalist claim, on the one hand, and to a particularist claim, on the other. Both claims must be tested through the processes of value-formation; it is only through this testing that moral argument can advance. The hermeneutic circle is never of course resolved: neither ethics nor politics is a science. There is a difference, however, between rehearsing arguments within the circle and short-circuiting the circle in the name of a flat pluralism: postmodern critiques of the Enlightenment risk the latter step. This risk is unhelpful today in the context of both domestic and international disorder. The second problem that the postmodern intellectual disposition encounters and that I want to address here is the risk of marginalization. At least to date, and as the two earlier sections argued, postmodernism does not seek to posit a counter-subject against the modern subject; rather, it seeks multiple identities and dissenting, critical viewpoints from under the surface of modern subjectivity. For many, as mentioned above, this intellectual move fits a rapidly pluralizing world (and is therefore far from running the risk of marginalization). However, as I have argued elsewhere (Beardsworth, 2011), postmodern commitments confuse here aesthetic and political specificities. The realm of political behaviour and action is a realm of forces in which one force is necessarily counter-posed to another. The nonviolent disposition of ‘turning the other cheek’ constitutes, for example, a force in politics that was most efficient, for Mahatma Gandhi, in the context of the legal and political forces of English Law in early 20th-century South Africa and India. This disposition would, however, have been physical and political suicide against the forces of Nazi Germany. 
The realm of the political – however ethically mediated, whatever the exact form of the forces at play – is a realm of limits in which agents necessarily brush up against each other. To translate to this realm values that, precisely, blur distinctions and limits (within and among subjects) assumes that this realm is similar to the
aesthetic, where such blurring constitutes the very condition of good art. Postmodern commitments often commit this category error or confusion. As a result, postmodernism makes sophisticated progressive arguments without delimitation of the political limits and agents required to make such progress happen: it remains as a result absent from the political field. Other forces take control of the political realm in its stead – today, new forms of particularist politics. Postmodernism’s self-justifying argument that it does not seek an alternative to the modern hides, I suggest, its fundamental lack of political realism in this respect. This is not to suggest that postmodernism is the cause of our present political disorders (a nonsensical claim). It is to suggest, as with our post-truth era, that the postmodernist disposition has not helped to arrest the emergence of present political dilemmas. At the very least, one cannot argue that its commitments befit a globalized, fragmented age. The above two problems – relativism and marginalization – come to a head in postmodern approaches within the discipline of IR for obvious reasons. In contrast with the humanities or a social science like sociology, IR is focused on a field of limits, forces and agents. However complex this field is (and at the level of causation, it is obviously highly complex), questions of empirical political reality, questions of order and disorder and questions of value and value accommodation are outstanding. Without consistent traction on these questions, sophisticated arguments for self-reflexivity, plurality and diversity can become, at best, abstract and, at worst, meaningless for the discipline. So, what futures can postmodernism countenance as an intellectual disposition in IR?
Futures
As is clear from the above, I consider the tenets of postmodernism to be tested by present political realities, rather than confirmed
by them. The postmodern critique of modernity – of rationalism, progress, the modern subject and its ethical and political correlates – has certainly helped to make contemporary culture more careful of the limits of modern approaches to the world. Moreover, this care, as well as the humility that goes with it, can offer a framework within which to interpret, and help shape, a world undergoing important cultural and political transitions away from western hegemony. To do so, however, those calling upon postmodern argument in the discipline of IR need to make this framework more intellectually rigorous. In line with arguments rehearsed above, I would suggest at least four futures. First, to counter the risk of both relativism and marginalization, postmodern approaches to international relations should rehearse what an ethics and politics of the lesser violence entails. For postmodernism, all acts of theory and all practices of limitation are violent. The criteria by which both are either more or less violent require, however, formalization. From out of this formalization one can then argue what, in the realm of international politics, a politics of the lesser violence might entail and what institutions (or institutional reforms) at the international and global levels would be required to embody this politics of the lesser violence. In doing so, postmodern tenets may find a way not only of transcending our post-truth era and post-truth politics (with which I have argued postmodernism is partly complicit), but also of offering theoretical and ethical platforms on which to build a plural international politics. Second, postmodernism needs to look again at the liberal project, born within its articulation of modernity as a whole. Postmodern understanding of the liberal subject as well as of liberal ethics and liberal polity is impoverished (at least in IR) by its decline into anti-universalism and its concomitant refusal to address the formal, institutional conditions of difference.
With this refusal, difference inverts into a politics of unarticulated differences without common ground. To revisit critically these conditions
and inversions might allow the postmodern approach to national and international politics to rehearse a new liberalism for our globalized age. Third, where the postmodern critique of modernity is very precisely borne out by present and future realities is of course climate change and imminent climate catastrophes (see Voituriez, Chapter 85, this Handbook). It is here, perhaps, that the ‘post’ of postmodernism will have found its most important place of argument. The last 500 years of modernization processes – from industrialization to the modern subject and its freedoms – are the very years in which the human has had such structural impact on the earth system that these very processes are increasingly in doubt. Contemporary arguments for the Anthropocene pinpoint increasingly accurately how the legacy of modernity is unsustainable for the planet as a whole (see, for example, Hamilton, 2017). Here indeed, the postmodern blurring between the modern subject and its un-freedoms, emancipation and domination, the human and the non-human, etc. can find its truth in the modern, enlightened disrespect of the planet and of its inhabitants as a whole. For this critical argument to have rigour within postmodernism, two points must nevertheless be borne in mind: first, it must adopt a new ‘grand narrative’ of the planet; and, second and concomitantly, it must relinquish once and for all an ethics and politics of the marginal in order to develop collective strategies of sustainable life systems. Fourth and finally, these collective strategies could work, for postmodernists, with an overarching agenda of development that seeks greater equality and sustainability among, and within, the richer and poorer countries of the world. It is well known that climate change will reinforce regional inequalities.
Adaptation to it will require immense political will and effort so that something like global collective action offsets the new politics of disparity that is already emerging (see Morlino in this Handbook).
In sum, will postmodernism make these jumps that would, at one and the same time, both confirm and reorganize its critical commitments? Or will it – a child of the 1960s linguistic turn – cede to new intellectual formations that are more ready to respond to the needs of the future? To rehearse the origins, commitments and problems of postmodernism in IR is, perhaps, to suggest the coming fork in the road.
Note
1 I have not spoken of the similarities of commitment between postmodernism and constructivism in IR: suffice it to say here that, while both sets of thought have similar theoretical outlooks on the social construction of reality, IR constructivism is much less critical of the liberal project in general. This is particularly the case in the United States.
References
Ashley, R K and Walker, R B J (1990) Reading Dissidence/Writing the Discipline: Crisis and the Question of Sovereignty in International Studies. International Studies Quarterly 34 (3), 367–416.
Barthes, R (1972) Mythologies. Trans. Annette Lavers. London: Jonathan Cape.
Beardsworth, R (2011) Cosmopolitanism and International Relations Theory. Cambridge, UK: Polity.
Campbell, D (1998) National Deconstruction: Violence, Identity, and Justice in Bosnia. Minneapolis: University of Minnesota Press.
Connolly, W E (2002) Identity\Difference: Democratic Negotiations of Political Paradox. Minneapolis: University of Minnesota Press.
Deleuze, G and Guattari, F (2013) A Thousand Plateaus: Capitalism and Schizophrenia. Trans. Brian Massumi. London: Continuum.
Der Derian, J (2009) Critical Practices in International Theory: Selected Essays. London: Routledge.
Derrida, J (1982) Margins of Philosophy. Trans. Alan Bass. Chicago, IL: University of Chicago Press.
Derrida, J (2004) Rogues: Two Essays on Reason. Trans. Pascale-Anne Brault and Michael Naas. Stanford, CA: Stanford University Press.
Derrida, J (2016) Of Grammatology. Trans. Gayatri Chakravorty Spivak. Baltimore, MD: Johns Hopkins University Press.
Foucault, M (1980) Power/Knowledge: Selected Interviews and Other Writings, 1972–1977. Ed. Colin Gordon. Brighton: Harvester Press.
Foucault, M (1991) Discipline and Punish: The Birth of the Prison. Trans. Alan Sheridan. London: Penguin.
Foucault, M (2008) The Birth of Biopolitics: Lectures at the Collège de France, 1978–79. Trans. Graham Burchell. London: Palgrave Macmillan.
George, J (1994) Discourses of Global Politics: A Critical Reintroduction to International Relations. Boulder, CO: Lynne Rienner Publishers.
Hamilton, C (2017) Defiant Earth: The Fate of Humans in the Anthropocene. Cambridge, UK: Polity.
Kristeva, J (1984) Revolution in Poetic Language. Trans. Margaret Waller. New York, NY: Columbia University Press.
Lévinas, E (1969) Totality and Infinity: An Essay on Exteriority. Dordrecht: Martinus Nijhoff Publishers.
Lyotard, J-F (1986a) The Postmodern Condition: A Report on Knowledge. Trans. Geoffrey Bennington and Brian Massumi. Manchester: Manchester University Press.
Lyotard, J-F (1986b) Answer to the Question: What is the Postmodern? In Lyotard, J-F The Postmodern Condition, pp. 72–84. Manchester, UK: Manchester University Press.
Shapiro, M (1999) Cinematic Political Thought: Narrating Race, Nation and Gender. Edinburgh: Edinburgh University Press.
Walker, R B J (1993) Inside/Outside: International Relations as Political Theory. Cambridge: Cambridge University Press.
13 David Easton’s Political Systems Analysis
Henrik P. Bang
Introduction
David Easton introduced systems theory into political science. The theory emerged in the social sciences during the 1930s and sought to promote a global approach to social facts, considered as a whole that strives to maintain a certain order and identity. As such, Easton (1917–2014) is one of the founding fathers of modern American political science, even though he was born in Canada and retained his Canadian citizenship his entire life. However, it was after first taking up a position at the University of Chicago in 1947 that he began developing his specific approach to studying political systems. He describes his first 10 years in Chicago as ‘just one great intellectual high’. From the outset, he was completely taken aback by:
the intellectual excitement, the natural interdisciplinary exchanges, the value put upon knowing and understanding, the passion for the idea and the systematic subordination of bureaucratic necessities to scholarly needs.
In Chicago, Easton was drawn into a cross-disciplinary group of scientists, discussing the new concept of system in the natural sciences and its application to the study of human and social life. He became acquainted with general systems theory’s ambition to formulate a systems language which was not dependent on any particular science or discipline (Boulding, 1956; von Bertalanffy, 1962; Radcliffe-Brown, 1957). He also sensed how members of the group looked to psychology, anthropology and sociology as the core sciences out of which common variables might be identified. Finally, he noted how scholars from these sciences were especially attracted to studying human and social life by analogy with homeostasis (self-regulation) in individual organisms. Thus, the young Easton could conclude that the group lacked a special interest in understanding and explaining human and social behaviour in their political aspects. He then began his search for such a special political theory in a dialogue with political scientists but from the general vantage point of
open systems theory, stating that the aim of critical systems inquiry must be ‘to contribute to the long-term well-being of humanity in its historical process and global context’ (Wilden, 1972: xxvi). The coupling problem between a system and its environment and between the component parts of a system is central to this general approach. Like Karl W. Deutsch, who applied it to study political communication and control (1963), Easton employed it to look for the sources and the conditions common to whole sets of messages as such rather than to individual messages. The difference is that Easton, unlike Deutsch, does not focus solely on the ‘nerves of government’ (1963) but on politics and society viewed as an interrelated whole. To him the coupling problem concerns political authority as a communicative and symbolic linking mechanism between a political system and society on the one hand, and between political authorities and ordinary members inside a political system on the other hand. This in turn explains why Easton did not adopt the homeostatic model that Parsons (1951), Almond and Verba (1963) and Rokkan and Lipset (1967) adopted from biologically inspired general systems theory: had he done so, he would have had to place social stability and normative integration in the forefront. However, in his endogenous and interrelationist political model, society could not exist without a political system being capable of handling increasing complexity by continuously changing and differentiating itself and its relationship to its relevant environments (Bang, 2011; Crozier, 2010). Thus, Easton’s application of general systems analysis to politics and society contains four claims (1965a: 24–5):
1 System. It is useful to view political life as a system of behaviour.
2 Environment. A system is distinguishable from the environment in which it exists and open to influences from it.
3 Response. Variations in the structure and processes within a system may usefully be interpreted as constructive or positive alternative efforts by members of a system to regulate or cope with stress flowing from environmental as well as internal sources.
4 Feedback. The capacity of a system to persist in the face of stress is a function of the presence and nature of the information and other influences that return to its actors and decision-makers.
Claims 3 and 4 – response and feedback – point out two fundamental facts of political existence (1965a: 133): a. Members of a system are not passive transmitters of things taken into the system, digesting them in some sluggish way, and sending them along as outputs that influence other social systems or the political system itself. They are able to regulate, control, direct, modify, and innovate with respect to all aspects and parts of the processes involved.
Furthermore, such a transformative political capacity presumes that a political system can learn from its life-experiences (1965b: 369): b. Without being provided with an array of information about the consequences of present outputs, it could not add to its store of knowledge, its memory would forever remain static, and it would have no way of implementing its potentialities for learning and changing.
Hence, the persistence of a political system depends above all on its members’ capacities for learning and changing their system, its environment or both together. Therefore, the critical issue of political existence is how to provide coupling mechanisms between political authorities and ordinary members inside the political system (internal response and feedback), and between the political system and its relevant environments (external response and feedback). Without such coupling mechanisms, a political system could not become better at governing itself in the face of unceasing stress or risk. In this sense, political systems failure can mean one of two things: ‘[i] that [the system] has changed but continues to exist in some form; or [ii] that [it] has disappeared entirely’ (1965a: 82).
It follows that the basic presumptions presented above in (3) and (4), (a) and (b) and [i] and [ii] concern a political system’s transformative capacity to persist in some form even in the face of the most turbulent crisis situation, like a revolution or a rapidly developing high-consequence risk (like global warming). Considered in this light, is it any wonder that Easton’s critics nearly all begin their interpretations of his system from the conclusion that ‘the primary focus of David Easton’s work is what Talcott Parsons designates as the polity or the goal-attainment subsystem’ (Lewis, 1974: 676). ‘Like Parsons he is concerned with stability and order, with the “persistence” of political systems in a world of change and stress’ (Meehan, 1967: 174). ‘The result of this focus is to set aside the question of how the capacity of the polity is used to meet demands and to deal solely with how this particular subsystem maintains itself or persists’ (Lewis, 1974: 676). However, there is a simple reason why Easton’s political system from the onset was looked at as an offspring of Parsons’ structural, or normative, functionalism. This is how his model was introduced into comparative politics as an exemplar of modern political analysis. As Gabriel A. Almond puts it in his contribution to Easton’s Festschrift: Talcott Parsons’s formulation, coming out of a mix of sociological, psychological, and anthropological theory, stressed culture and personality, psychological orientation, and socialization processes… It was in this setting that Easton perfected his system framework. It was this systems model, much influenced by cybernetics, that I inherited from Easton’s work and that enabled me to codify and systematize the findings and conclusions of the five or six preceding decades of empirical research, primarily on American political processes. (Almond in Monroe, 1997: 224)
To Almond, and most other scholars in the mainstream, Easton was the one who fleshed out the missing diachronic and dynamic aspect of political history in Parsons’ synchronic and static scientific model. Thus, Easton’s political system became Parsons’
social system applied as the concrete, historical mechanism for ‘unfolding’ the social structure just as it is ‘in itself’. The irony is that Easton’s systems model constitutes a fundamental break with this modern dichotomization of space and time, the first being available for synchronic analysis in abstraction from the diachronic functioning of the latter. As Parsons argued: [P]olitical science [is] the discipline concerned with political power and its use and control, but because of the diffuseness of political power this makes it a synthetic science in the social system field, not one built about a distinctive analytical conceptual scheme. (Parsons, 1951: 551, emphasis in original)
However, already in The Political System, Easton plainly and directly rebuts Parsons’ claim: If this statement were true, then the development of political science would be so dependent upon the other social sciences that little blame could attach to it for the level of its insights, the nature of its methods, or its neglect of theory… I shall assume that political science does constitute a distinct field of research, not for problems of application alone, but what is more significant, for analytical and conceptual purposes as well. (Easton, 1953: 60–1)
As I shall show, Easton considers time-space inseparable from one another. To him, stability cannot be contrasted to change, because ‘stability is only a special example of change, not a generically different one’ (1965a: 106). His practical interest in politics is by no means prompted by a ‘technical’ interest in applying political authority to secure and sustain homeostasis in the social organism. To the contrary, it stems from his break with this identification of authorization with legitimate domination. In his doctoral thesis The Theory of the Elite: A Study of the Elitist Trends in English Thought, Easton raises the political claim that: [e]litism, the ideal of the reaction against democracy, yields political and social domination to the rulers. The democratic ideal, wherever it is found,
encompasses a credible political sociology by recognizing the superior role of the people. Since the elitist myth of the governing class seeks to eliminate the people in connection with the destiny of society, this myth explodes when confronted with the fact that without the people, the rulers are as free spirits wandering lonely, dejected and unemployed in an empty world. But without rulers dominating their existence, the people, on the contrary find that very freedom that calls forth their most creative efforts. Elitism places blind faith in an appropriate governing class. The democratic ideal incorporates a tempered trust in the wisdom and creative genius of the people. (Easton, 1947: 418, emphasis added)
Easton’s practical commitment to, and belief in, ‘the people’s’ creative capacities for governing and taking care of themselves stems from his early experiences in Canada where he was born. He grew up in Toronto in very modest circumstances in a family with no academic background. He was the first in the family to attend college (the University of Toronto) and soon became engaged in left-leaning political activism. He toured back and forth between Canada and the United States as a stowaway on freight trains to participate in various rallies and protests. He met his wife Sylvia in a Trotskian Friendship Association in Chicago; she remained a left-leaning activist her entire life. They discussed the connection between democracy and political science their entire life together, combining her deep sense of human suffering with his natural intellectual curiosity. Thus, the mature Easton’s systematic focus on power-knowledge as the key to understanding and explaining political systems persistence comes from a practical ambition to connect political authorities and laypeople in the political community in terms of reciprocal relations of power, knowledge, trust and respect. This is why the coupling problem is at the core of Easton’s political systems analysis: it is not only at the foundation of political existence but also of a democracy in which people accept and recognize their intrinsic differences – not as a barrier to, but as a condition of, identifying and handling their common concerns.
I shall begin by illuminating in a bit more detail how Easton became Parsons, and how this concealed the young Easton’s quest for finding a way to connect ‘facts’ and ‘values’, political science and democracy. Then I shall present Easton’s simple four-function model of the political system and shed light on how it makes the authority relationship between political authorities and laypeople in the political community inside the political system the heart of understanding and explaining: (a) how a political system is structured in time-space and (b) how it manages to persist in some form in its relevant environments through time-space. Then I shall show how Easton became inspired by the Chicago School – especially by Harold D. Lasswell’s technocratic and democratic approach to making and implementing policy – to find a way to overcome the twin dangers of evolutionism and historicism in his specification of the political relationship between democratic political authority and democratic political community. I shall conclude by encouraging young scholars to critically reassess Easton’s uniquely political macro-theoretical approach to the persistence and structuration of a political system in the light of current advances within the study of political institutions, political psychology, governance, governmentality and discursive and participatory democracy.
How Easton Became Parsons
Then, how did Easton become famous (not to say notorious) as a founding father of an American political behaviouralism, which has been widely accused of silencing ‘the reflective and critical voice of the discipline’ and of undermining its status ‘as the discursive home of political theory’ (Gunnell, 1993: 269)? How could Easton become an exemplar of Parsons’ non- and apolitical kind of normative functionalism in which a political system is assessed solely by its
capacity to attain goals that contribute to ‘pattern maintenance’, ‘normative integration’ and ‘economic adaptation’ in the social system (Alexander, 1984: 10)? This social systems approach tends to reduce the practical tasks of political theory and science to a technical issue of how to internalize norms of social stability into the unconscious parts of people’s personality, thereby creating a disposition in them to support ‘the system’ and obey the decisions and actions of their political authorities (Parsons, 1968). This position is flagrantly inconsistent with Easton’s own specification of the relationship between democratic theory and political science presented above. However, Almond adjusted Easton’s model to that of Parsons. Citing a lecture which he once conducted to show the bond between Eastonian and Parsonian systems analysis (1997: 225), Almond criticizes Easton for never providing ‘the full notions of multidirectional interaction and of equilibrium and disequilibrium which are implied in the concept of system’ (ibid: 225). When Easton does not provide any such notion, it is because in his model ‘[a] system may well seek other goals than those of reaching one or another point of equilibrium’ (1965b: 20). This should also be common sense, since many political systems in history have had to hobble along in continuing disequilibrium when seeking new goals to cope with the concrete high-consequence conflicts, risks and challenges that confront them. When Almond does not notice Easton’s distinction between goal-seeking and goal-attainment but instead imposes Parsons’ Freud-inspired notion of order vs. anomie on his political system, it may be because he overlooks two facts about Easton’s systems approach: (1) Easton’s political system has not been modelled after a biologically and structuralist-inspired mix of sociology, psychology and anthropology. 
It is more influenced by the kind of general but open systems thinking represented by scholars like von Bertalanffy (1962) and Vickers (1959), which explicitly
breaks with all equilibrium thinking. For example, as von Bertalanffy puts it: In contrast to equilibrium states in closed systems which are determined by initial conditions, the open system may attain a time independent state independent of initial conditions and determined only by the system’s parameters. (von Bertalanffy, 1962: 18, cf. Wilden 1972: 38)
Of course, in this framework, organisms are considered goal-seeking systems too, but ‘what they seek is stability, not change; what they reproduce is themselves, not novelties’ (Wilden, 1972: 363). This is exactly why it is a mistake to believe that what a political system seeks is ‘best described as a changing equilibrium’ (Almond, 1997: 227). Unlike an individual organism, a political system is not predetermined to seek equilibrium, homeostasis, order or stability, above all else. What it seeks is to exist in, and through, continuous change, whether incremental or revolutionary. Without such chronic changes, a political system simply could not articulate and handle the complex relationship within itself or to its relevant environments. As Easton himself puts it: What political systems as a type of social system possess uniquely, when compared to both biological and mechanical systems, is the capacity to transform themselves, their goals, practices, and the very structure of their internal organization. (Easton, 1965a: 99)
(2) Open systems thinking made Easton realize that (a) a social system is of a higher level of complexity, or order of organization, than are mechanical and biological systems (Buckley, 1968; Luhmann, 1995; Mingers, 1995; Bateson, 2002), and (b) a political system is distinct from all other kinds of social system as a general type of communicative decision and action, which can do something for society substantially different from what could be done without it (Deutsch, 1963). As he opens A Framework for Political Analysis – his first systematic investigation into the problem of political systems persistence:
[I have] not been able to lean on any ready-made model; and no eclectic borrowing from other varying kinds of systems approaches would do. A consistent structure of concepts had to be newly developed that would fit the kind of system that political life constitutes. (Easton, 1965a: xii)
Somewhat ironically, it was Almond’s Parsonian-inspired equilibrium approach to reading Easton which made his political systems model the most influential one in the field of comparative politics (Rokkan and Lipset, 1967; Blondel, 1990). Curiously, this Parsonian reading of a political system is still quite influential in comparative politics although it has long since been abandoned in all the other social sciences. As Mark Blyth notes, mainstream political science still suffers from ‘ELEN’ – a sickness making one study institutional formation in terms of concepts of equilibrium, linearity, exogeneity, and normality (2011: 99). Almond ‘infected’ Easton’s political system with this ELEN sickness to make it more useful for assessing the relationship between ‘rational man’ and ‘irrational society’ that liberal democracy has been designed to handle and mediate (Barry and Hardin, 1982). This relationship has until very recently been claimed to determine the input of specific (‘rational’) and diffuse (‘irrational’) support for a political regime and thereby its homeostasis or self-maintenance. Thus, the factor of exogeneity in the ELEN model turned Easton’s model into a black box for studying how to stabilize and harmonize the relationship between the democratic regime inside the political system and the civic culture outside. It was claimed to address only: beliefs, feelings, and values [that] significantly influence political behaviour, and that … [they] are the product of socialization experiences. (Almond, 1989: 29)
Thus, Easton became a common professional possession in mainstream political science, which, like Almond, claims that adequate socialization experiences generate the kind of specific and diffuse support for political
authorities and political regime without which no political system could maintain itself. Without such support, conflicting individual preferences and social interests could not be aggregated and integrated into binding decisions in an orderly and consensual manner. Nor would the political system be able to deliver outcomes that are experienced as fair and just by most citizens in society. Very few in the mainstream have challenged this input-driven interpretation of Easton’s political system, according to which the conversion of inputs into outputs inside a political system is determined by the clash between private interests in the market and public interests in civil society and the civic culture outside. American comparative politics has celebrated many successes and has enjoyed hegemony in the political discipline for more than 60 years with this ‘outside-in’ approach to the authoritative articulation and allocation of values for a society. However, in our world where all focus is on how a political system makes and implements binding decisions from the inside-out, it is difficult to see how American comparative politics is to sustain its politics-before-policy model of consensus, order and equilibrium. In our ‘brave new world’ where political elites and laypeople increasingly place policy-before-politics, and where conflict, chaos and disequilibrium are becoming the new ‘normality’, questions of power, knowledge and identity are becoming ever more central. As Blyth asked his mainstream colleagues some years ago: [W]hat if we live in a world that is actually disequilibrial and dynamic, where causes are endogenous and nonlinear, and where outcomes of interests are not normally distributed? (Blyth, 2011: 87)
There is no ‘what if’ any longer, since it has become obvious to most that chaos is the rule more than the exception, and that the best one can hope for politically today is to be able to create islands of order in this general maze of disorder. It is in this context that political systems analysis gains
new and immediate significance and relevance, since chaos is the very idea that enables Easton to argue that ‘a system can be said to persist even if it changes [completely]’ (1965a: 82). As a Canadian, Easton has never had the same relationship to ‘American exceptionalism’ as Almond and many other scholars in the mainstream. His ambition was from the outset to overcome the tendency in the social sciences to turn a blind eye to the phenomena of change and power. This compelled him to try to find a way to overcome the order vs. anomie distinction with its commingling of facts and values, science and history, authority and hegemony, a political system and a democratic regime. Easton was seeking a way to avoid making values into facts by identifying the general issue of how a political system manages to persist with the specific issue of how close to, or far away from, the democratic equilibrium different countries are, an equilibrium in terms of which a stable and consensual relationship between a political regime and a civic culture is established, maintained and developed. As he argues in A Systems Analysis of Political Life, his political model: helps us to prevent research from remaining exclusively and narrowly preoccupied, at least implicitly, with one type of system, namely, democracy as it has developed in the West. Even where nondemocratic systems are under scrutiny, it is seldom for the sake of understanding and explaining political systems as such, but through contrast with Western democracies, to shed a stronger light on the conditions surrounding the existence or emergence of democracies. The designation of exotic systems as developing or transitional suggests a norm toward which they are moving, and seldom does this standard represent anything other than Western democracies as we know them today. The prevalence of the concept of ‘modernity’ further reflects this marked cultural limitation. (Easton, 1965b: 15)
In fact, political systems analysis is as relevant to assessing the limits of liberal democracy today as to reimagining its future possibilities. In sharp contrast to the ELEN model
outlined by Blyth above, Easton has always insisted that in his model there is:
• no specification of stability as normal, and, hence, of change as the exceptional phenomenon to be understood and explained: There is never a social situation in which the patterns of interaction are absolutely unchanging (Easton, 1965a: 106, cf. Giddens, 1979);
• no evolutionary traits of rational and normative modernity that unfold themselves ‘behind the backs’ of members of a political system: At times members in a system may wish to take positive actions to destroy a previous equilibrium or even to achieve some new point of continuing disequilibrium (Easton, 1965b: 20, cf. Giddens, 1981);
• no teleology – the attainment of an already existing goal – but instead teleonomy, as manifestations of its members’ ongoing seeking for goals: It would be impossible to understand [systems persistence] if either the objectives or the form of the responses are taken for granted. A system may well seek goals other than those of reaching one or another point of equilibrium (Easton, 1965b: 20, cf. Wilden, 1972);
• no disciplined and obedient cultural ‘dopes’ blindly consenting to orders from above when ‘adapting’ to changing circumstances: Since [a political system] is composed of reflective human beings, it is capable of evaluating what is happening and of taking evasive action (Easton, 1965b: 225, cf. Bang, 2015);
• no ethnocentrism, claiming modern liberal democracy to be a rational Western development to which all modern social systems must adapt to make their members free and equal: Once we affirm that all political life in its varied manifestations may properly become our universe, the substance of theoretical inquiry would have to change radically. It would no longer suffice to assert some central value that is associated with an interest bred by the historical experience of the West (Easton, 1965b: 14).
Almond’s Parsonian reading of the political system made many believe that Easton’s
systems approach ‘belongs to the category of theories that come into vogue and then just vanish’ (Lane, 1978: 161). This has not been the case: Easton’s model has proved puzzlingly resilient. It has survived all major paradigmatic conflicts, and, unlike structural-functionalism and structural Marxism, it has not yet been put to rest at the ideological and theoretical scrapheap of history. Even today, basic elements of Easton’s systems approach inform much political research in the mainstream into political support, culture, citizenship and political parties (Abramson and Inglehart, 1970; Dalton, 1998, 2002; Klingemann, 1998; Norris, 1999, 2011). Although they only deliver some partial applications of his political theory, they demonstrate how, as Daniele Caramani notes in the introduction to his book on Comparative Politics: Easton’s work has been a victim of its own success. His concepts have impregnated the minds of political scientists, as well as those of the wider public, so deeply that, in a way, it goes ‘beyond citation’. (Caramani, 2008: 12)
Let me therefore instead begin from listening to, and learning from, what Easton himself says his political systems model is all about. This is the first necessary step towards a critical analysis of his original political macro-approach to studying how values of all kinds are authoritatively articulated and
allocated for a society, in and through a political system of some kind.
Easton’s Political Systems Analysis
It is difficult to imagine what political science would have looked like had David Easton not in 1957 introduced ‘An Approach to the Analysis of Political Systems’ in World Politics. His four-function model (see Figure 13.1) with its ‘inputs’ of demand and support and ‘outputs’ of decision and action has reached the stage of something learned at one’s mother’s knee. Critics from all major paradigmatic perspectives in political science have continually blamed Easton’s four-function model for being inconsistent and inadequate in its view of social structure, class relations, the nation-state, rational actors and political institutions. Although critics from his own behaviouralist tradition do consider his model useful, they also think it is badly flawed. For example, Lane opens his Parsons-inspired critique by stating: ‘THERE MUST BE LIMITS TO CONFUSION’ (1978: 161). Easton, he argues, never even tries to answer Parsons’ crucial question about what homeostasis or social order is about: ‘e.g. that the society persists biologically and socially and that it is not characterized by anarchy
[Figure 13.1 Easton’s four-function model. Labels: demands, support; POLITICAL SYSTEM (issues, problems); decisions, actions; feedback]
or anomie’. In contrast, Sorzano, a rational choice critic, argues that this failure stems from the fact that Easton’s model is not structural-functionalist at all. It ‘not only regards the actors as behaving in a maximizing fashion but it also characteristically distinguishes between the actor’s intentions and the objective consequences of his behavior for the system as a whole’ (1975: 98). Thus, Easton was entangled in the heated and ongoing battle between methodological collectivism and methodological individualism (O’Neill, 1973), against which state theory (Badie and Birnbaum, 1983) positioned itself in the 1980s. As a result, another reading of Easton’s political system appeared considering the core issue in his model as being historical and political in nature rather than sociological and economic. As Green argues, structural-functionalism and rational choice theory have got the problem raised by Easton’s political systems analysis all wrong: If politics is indeed the realm of the authoritative allocation of values for a society, then support, stability, and compliance behaviour are much less important in its analysis than many have thought. Rather, the existence of authority in a society depends on there being standards which function in a particular way in the practical reasoning of its members: they guide their subjects’ action without appeal to their view of the merits of the case. But to understand authority in this way, as a feature of practical reasoning, is to embark on a very different kind of political science than the one anticipated by systems theorists or by most of their critics. Where it ultimately leads is in the direction of the traditional theory of the state claims authority over its subjects, and whose claims subjects surrender their judgement. (Green, 1985: 141–2)
Notice that this statist interpretation of Easton’s political system stands in sharp contrast to the young Easton’s practical critique of elitism (Easton, 1949). However, Green’s criticism opens the black box, introducing the mature Easton’s own specification of political authorities, the political regime and lay members in the political community as
the three basic component parts of any going political system. Thus, the question of political authority becomes central to understanding Easton’s systems model as a whole.
Political Authority: The Heart of Political Systems Analysis
As Green notes above, political authority serves as the principal coupling mechanism between a political system and its relevant societal and non-societal environments on the one hand, and between political authorities and laypeople inside the political system on the other. Thus, the basic message Easton sends with his almost banal four-function model is very simple: a political system is identified by its general capacity to make and implement authoritative decisions that are accepted and considered binding by most members of society, at least most of the time. Therefore, political authority and people’s practically considered acceptance of it are at the core of any kind of political system from the tribal to the global (Easton, 1955b, 1958, cf. Bang, 2015). It is the political condition of living together as a group of people whose differences about how values are to be distributed often call for immediate political decision and action amidst, and cutting through, all debacles. When differences cannot be resolved customarily or privately, then someone must step up and say: ‘we must do this now’. If the members then accept this, after having recognized that messages received in this way must be acted on without raising further delaying questions about their validity, they bestow political authority on the sender. Their reasons for doing so may be manifold. Some may feel threatened into accepting the message; others may expect to be rewarded after having accepted it; some may consider the sender trustworthy or morally superior; others may consider that their acceptance is in their own best interest or contributes to developing their political identity:
But regardless of the particular grounds, it is the fact of considering the allocations as binding that distinguishes political from other types of allocations. (Easton, 1965a: 50)
Hence, the ‘test’ for what makes a message political in nature is: (1) that the message is issued clearly and directly by the sender to the receiver without involving any intent to distort communication (through manipulation and deception) or engage in a rational dialogue with the sender about its validity (‘deep’ persuasion oriented towards agreement); and (2) that the receiver accepts the message as binding for his or her acting or refraining from acting in the concrete situation, carrying it out without undertaking a deeper analysis of its instrumental or moral validity. These, in Easton’s view, are the two political conditions of group life which make political authority distinct from all other types of societal power-knowledge. It follows that political action is modelled neither on instrumental action (rational choice) nor on moral-legal action (communicative rationality) but on the practical experiences underlying its considered acceptance. As evidence of a distinctly political relationship, authority is simultaneously cognitive and affective, involving various mixes of emotions and sound practical judgments, depending on the situation – what Aristotle calls phronesis as distinct from techne and episteme (Eikeland, 2008; Flyvbjerg et al., 2012). Political authority ‘as such’ says nothing about the special form in and through which its message is communicated. It simply manifests what makes the political distinct from the economic, the social, the cultural, the religious and other real and significant differences that people living together have sorted out from each other to cope with the existential risks, conflicts, challenges and problems that they confront in time-space. Political authority can lean towards antagonism or consensus; be centred or decentred; based on hierarchy or networking; shaped as a command or a request; reveal the power of the
few over the many, or that power between the few and the many is equally distributed. It all depends on the political situation. As a politically communicated message, political authority is inherently open to reinterpretation, reconfiguration and transformation. Thus, the crucial existential issue for a political system of any kind concerns neither its stability nor its change. Nor is it about the extent and degree of conflict or consensus associated with its articulation and exercise. The primary existential issue for a political system is that of uncoupling – of its political authorities from laypeople in their political community, or of itself from its relevant environments (Easton and Dennis, 1969). Inversely, connectivity is what makes it possible for a political system to go on in multiple, irreducible forms in, and through, history (Bang, 2015). Today, we seem to have forgotten about uncoupling as the core issue of political existence. Liberal democracy was originally founded on the idea that a viable connection between political authorities, political regime and political community must exist to cope with conflicts, risks and problems in a democratic manner (Dahl, 1957; Almond and Verba, 1963; Putnam, 1993). However, party politics (national and transnational) has never been so uncoupled from laypeople inside the political community and from the population outside as today. The situation is not much better in the international political system, which is in great danger of losing the minimal interconnectivity between nation-states required to live together in peaceful coexistence. In fact, the situation resembles the kind of uncoupling in the democratic Weimar Republic that led to Hitler’s gradual seizure of total control from 1933. 
The response to political uncoupling has so far been dominated by nativist populism with its call for a ‘strongman’ to seize hegemony and form collective identity out of an imaginary ‘pure’ people (Müller, 2016; Mudde and Kaltwasser, 2017; Bang, 2018). The societal cleavage and conflict between globalism and nativism have taken over from established
party politics along the left–right axis. They have generated a virtual war, online as well as offline, between ‘the elite’ and ‘the people’, or ‘the professionals’ and ‘the amateurs’, propelled by nativist populism’s revolution against ‘the world elite’ and ‘the world elite conspiracy’ which in Hitler’s horrendous terminology became ‘the world Jew’ and ‘the world Jew conspiracy’ (Klemperer, 2011: 44). One would think that Easton’s four-function model would gain new importance and significance in a situation where uncoupling is accelerating even in Hitler’s demonizing form. After all, its crucial political message is that uncoupling is the biggest threat to political and societal existence, stemming from, and leading to, the refusal to accept and recognize that difference is at the heart of the political as well as of democracy.
The Issue of Coupling Inside the Political System
Easton claims that authority manifests the transformative capacity of any political system, whether shaped as a tribe, a wandering nomadic band, a city-state, a principality, a nation-state, a transnational political system or a world political system. Political authority is in any case what makes it possible for a political system to convert different, contested and sometimes profoundly opposed demands into collective decisions and actions. It provides the members inside the political system with the capacity to structure the system in multiple irreducible ways to cope with internal as well as external conflicts, risks and problems. Thus, authority does not only place limits and constraints on political decision and action; it also facilitates and enables new forms of political discourse and practice to occur, take root and develop. As such, political authority manifests the basic coupling mechanism between political authorities and lay members in the political community, conditioning their contingent structuration of the political regime as both a medium for, and outcome of, their situated interaction (Figure 13.2).
Figure 13.2 Regime structuration [diagram labels: Environment; Political Regime; Regime Structure = Values, Norms, Resources; Political Community; Political Authorities]
The Political Systems Model: Extended Version
Political authorities must conform to the following criteria: (1) they must engage in the
daily affairs of a political system; (2) they must be recognized by most members of the political system as having the responsibility for day-to-day decisions and actions and (3) their actions must be accepted as authoritative by most members, at least most of the time, as long as they act within the limits of their assigned role (Easton, 1965b: 193). The political regime refers to ‘the general matrix of regularised expectations within the limits of which political actions are usually considered authoritative, regardless of how or where these expectations may be expressed’ (Easton, 1965b: 193–4). It consists of values (goals and principles), norms and structures of authority. The values serve as broad limits concerning what can be taken for granted in the processing of demands into collective decisions and actions without violating deep feelings of important segments of the political community. The norms specify the kinds of procedures that are expected to be recognized as acceptable for making and implementing binding decisions for society. The structures of authority designate the formal and informal patterns in, and through, which resources (material, symbolic, imaginary and virtual) are accumulated, institutionalized, organized and brought into play for authoritatively articulating and allocating society’s scarce values (1965b: 193). The political community concerns that aspect of a political system that consists of its members seen as a group of feeling and sensible people who are drawn together by the fact that they participate in a common political structure and set of processes – however tight, or loose, their ties to these structures and processes may be. From the vantage point of sharing such a political division of labour it does not matter whether people form a community in the sociological sense of a group of members who share a deep feeling of community and a set of common traditions. 
A political community may well be composed of people with different, even antagonistic, nationalistic and other collective identities. The members may also experience deep
religious cleavages and conceive of their different cultural values and norms as entirely incompatible with one another. However, so long as they, at least implicitly, accept and recognize that they share in a political division of labour, they can be said to constitute a political community in this minimal sense as an action community (Easton, 1965b: 177). Regardless of their internal divisions, their situated interaction will then express Rousseau’s claim (1987: 153) that ‘political authority is one and single and cannot be divided without being destroyed’. Hence, political community is the final test of any political system. Political authorities can be overturned time and again, and the political regime can be in disequilibrium and revolutionary change without threatening systems persistence; but if the members of a given political community refuse to share a political division of labour any longer and give up collaborating even minimally in the situation, then the continued existence of not only the political system but also society is at stake. An illustrative example is what happened in the former Yugoslavia. Easton’s insistence on political authority as a necessary condition of political and societal existence is often seen as a sign of authoritarianism and elitism. As Dag Anckar puts it in his Easton critique: Outputs result in certain outcomes, functioning as stimuli resulting in certain responses from the masses. In other words, we are dealing with a society in which the rulers manipulate the ruled and where the ruled react mechanically to their manipulation. (Anckar, 1973: 82–3, my translation from Swedish)
This conclusion is obviously inspired by Anckar’s a priori conception of Easton’s political system as a small appendix to Parsons’ social system. However, some critics do sense that the young Easton operates from the presumption that political science and ethics must be anchored in people’s reciprocal acceptance and recognition of each other’s differences to be optimally
useful and valuable to science as well as to democracy and democratization. But again, Parsons stands in the way, making, for instance, Miller conclude that ‘[Easton’s] theoretical position does not favor the revival of serious inquiry about the ends of political life’ (1971: 223). This is despite the fact that Easton’s political systems analysis begins and ends with the member’s acceptance and recognition of political authority inside their political community. Political authority need neither be coercive and commanding, as Anckar seems to believe, nor does its acceptance imply that authorities are the only political actors in town, as Miller tends to suppose. As the mature Easton stresses: Unfortunately, in political research we have no convenient term for distinguishing the authorities from all other members in the system. Marx’s ruling class as against the ruled, Pareto’s elite, Mosca’s political class and Michels’ oligarchy versus masses are transparently not satisfactory for this purpose. They classify members of a system according to the power they hold whereas here we wish to point up the difference between those who are occupants of authority roles as against the occupants of all other roles. But however we might classify members in a systematic structural analysis of political systems, this much can be said here and has also been implied in all of the preceding discussion: the authorities need not be co-extensive with the politically relevant members. (Easton, 1965b: 214–15)
As evidence of a general political transformative capacity, authority is contingent on freedom and domination and presumes power and knowledgeability on the parts of both senders and receivers of its communicated messages. As an ordinary member, one can accept and recognize that political authorities can do something for the political system substantially different from what can be done without them, and yet actively resist those who misuse authority to try to dominate one’s political existence. This is the conclusion that the young Easton arrived at in his doctoral thesis (1947) written directly up against Parsons’ knowledge regime at Harvard. The younger Easton developed his
thesis partly in opposition to Parsons’ social evolutionism, partly in conflict with the historicism that also made its presence felt at Harvard in that period. Easton was intensely dissatisfied with the entire programme as well as with his advisor, which is why he moved on to the University of Chicago in 1947, accepting a joint position where he would be teaching Scope and Method of Social Science in the graduate division of social science and then doing research in the department of political science.
Easton’s Meeting with the Chicago School: Policy Before Politics
Chicago gave Easton the perfect start to his career, because it gave him the possibility to combine his interest in facts and quantitative method with his more politically and ethically oriented theoretical and philosophical thinking. He was also greatly attracted to the Chicago School, especially to Harold D. Lasswell and Charles E. Merriam, who became two of his methodological and theoretical exemplars. Lasswell’s democratic policy analysis in particular attracted Easton’s explicit attention, because he could sense how Lasswell struggled with the same issues as himself about global vs. national, science vs. history and elite vs. people. It was also through his critique of how Lasswell tends to oscillate between technocracy and ethics in his policy analysis for a democratic society that Easton found a preliminary road to avoiding the twin pitfalls of evolutionism and relativism (Easton, 1955a). The idea of a political system already appears in embryonic form in Easton’s thesis from Harvard, but it was only with his involvement in a multidisciplinary group of systems scientists in Chicago that his idea of political systems as irreducible to any other types of systems began to take shape. It is intriguing to note how this group already in the early 1950s
imagined a future based on the internet, robots and artificial intelligence. Easton was taken aback by the profound scientific curiosity driving this group. He developed a deep fascination with technology that he kept his entire life, though without forgetting his classical insights into how technocracy needs be connected to ethics in any democratic society worthy of its name (Baer et al., 1991: 195–215; Easton, 1969; Easton, 1973). Easton learned from Lasswell, how technocracy comprises a rationality, ideology and culture distinct from that of bureaucracy. Technocracy is oriented towards achieving success by competing and networking in finding ‘best practices’ for doing things efficiently and effectively. In contrast, bureaucracy has its foundation in the ancient King’s hierarchical, disciplinary and coercive power. With the occurrence, consolidation and development of representative democracy, state sovereignty was tied to popular sovereignty, and bureaucracy took on a new role as the rational, loyal and obedient servant of democratic government and ‘we the people’. However, its basic structure was, and still is, that of a centralized system of rank and grade which functions in, and through, a strictly hierarchized and disciplinary chain of commands to sustain political and social order. Thus, bureaucracy and democracy are in constant tension with one another (Habermas, 2015). The young Easton was fascinated by Lasswell’s policy-oriented, managerial and collaborative ethos, committing administration to continuous innovation and change through its knowledgeable employment of technological advances. Technocracy, as distinct from bureaucracy, carries no intention to exercise coercive power over individuals. Rather, it wants to manage or ‘nudge’ them to become more and more autonomous and thereby more and more functional to boosting competition and growth. 
The younger Easton agreed with Lasswell that both political science and democracy must be more policy oriented to handle the threat of new despotisms like fascism, Nazism and
authoritarian communism. However, already when writing his doctoral thesis, he sensed the new threat to democracy resulting from the new kind of domination intrinsic to technocracy. This was what he called science-based factual elitism: Elitism is an attack against democracy but it engages in battle in a subtle and disguised manner. The elitists would disavow any intention of combating democracy as such; they are simply interested in establishing a science of society. But in fact, they do try to undermine an especially strong pillar beneath democratic theory. This is the notion that the people have a vital part to play in creating a prosperous and viable community. (Easton, 1947: 2)
Easton’s overarching ambition was, from his days as a graduate student in Toronto, to find a way to avoid that the crucial political difference, and division of labour, between political authorities and laypeople in polis, turns into a bilateral either/or opposition between the elites and the masses. He critiqued the young Lasswell as adopting the elitist equation: ‘politics=the study of changing value hierarchy=influence and influential=elite or few’ (Easton, 1950: 462). Following Pareto, the young Lasswell considered policy a rational game for elites only. The young Easton sensed that the older Lasswell endeavoured to break away from this elitist equation by trying to get away from identifying authority with ‘power over’, hegemony and legitimate domination. The older Lasswell could see that although political authorities will always consist of the few, this does not in any way imply that the few must possess most power and knowledge. As the young Easton argues: If the same reasoning had been used about stealing, an example which Pareto himself cites, then from the fact that the thieving capacity is distributed in the form of a pyramid, the clearly invalid conclusion would have to be drawn that the greater part of the loot is necessarily obtained by the few best thieves. Of course, in contemporary society the few thieves at the top may quite easily accumulate the greater part of stolen wealth. But in the last analysis it is a question for factual investigation as to how much of the loot they actually do
get. Similarly, where actual supremacy lies, that is, who has the greatest power at any time, is a matter for empirical research. (Easton, 1950: 466–7)
Thus, the young Easton adopted Lasswell’s claim that ‘the political’ is about outputs before inputs, but he wanted to find a way to render the mature Lasswell’s insight that democratic authority must provide for reciprocal power, knowledge, trust and respect more analytically distinct. The young Lasswell revealed a tendency to consider technocratic expertise more important to handling policy problems and risks than democratic engagement. To the young Easton, making and implementing policy is not only about having expertise, but also about possessing the practical wisdom and intuition required to conduct significant and relevant actions that can do well for people. As Aristotle emphasizes: ‘A city [or system] is good in virtue of the goodness of the citizens who have a share in its constitution’ (1995: 282).
Bringing the Classical Greek Polis Back into Policy Analysis
The young Easton embraces Aristotle’s understanding of polis as a praxis involving everybody in its (re)constitution as subjects to a political authority relationship. The only difference is that the mature Easton distinguishes a political community in its general minimalist sense as partaking in a political division of labour from democratic political community as defined by Aristotle’s ethics of the good and happy life (Bang, 2009). Easton developed this distinction between political community as a general political condition of existence and the special, irreducible forms in which it occurs in time-space, in order to overcome the fact–value divide that he detected in the young Lasswell’s policy analysis. Easton saw Lasswell as a scholar struggling against himself. The technocrat and social scientist in Lasswell pushed him towards abjuring values, treating them either as objects of desire or as
amenable to scientific validation. Thus, the classical dream of self-government of and by the members of polis themselves vanishes in a puff of smoke. However, when the mature Lasswell begins to commit himself more seriously to the linking of collective decision-making to political communication in the public realm, he imperceptibly begins to undermine his earlier elitist equation. Instead he tries to complement his rational policy orientation with a more commonsensical everyday approach to democracy as being about the exercise of self-governance in, and through, a political action community: In place of his insistent attention to the arrival, survival, and composition of the elite, his interest now shifts to the parts of the political process that might make it possible for the people to challenge and limit the power of bureaucracy. (Easton, 1950: 468–9)
The mature Lasswell begins to acknowledge that democracy carries an ethical obligation to practising one’s freedom in, and through, joint involvement in identifying, pursuing and resolving one’s common concerns (Habermas, 2015). Democracy, Lasswell could suddenly see, is not only about rational decision and action but also about how laypeople are constantly challenging and problematizing expertise in, and through, their more sensuous and spontaneous everyday practices. Thus, the technocratic elitist in Lasswell begins to waver and revise his rational position. Firstly, he acknowledges that ‘there must be greater awareness of the role that conceptual principles play in research’: The assumptions of a framework, unavowed or explicit, may so compel research in one direction that the theories and facts discovered may not be relevant to the more urgent purposes of society. (Easton, 1950: 476)
Secondly, Lasswell becomes conscious that: If a social discipline is to contribute to the understanding of social policy it is not enough that it confines itself to the search for the ‘pure’ truth. So
long as the plain truth was the objective, values could and did creep in through the back door even though they appeared to have been suppressed. (Easton, 1950: 476)
Thirdly, Lasswell begins to recognize that a politics of truth hangs together with democratic policy-making and implementation, especially when facing a crisis situation: If the survival of society were guaranteed under any eventualities, then the social sciences could afford to tolerate research that was indifferent to the framework within which it was cast. But when the threat of self-destruction hangs over the world, the urgency of social problems demands a reconsideration of the link between the conceptual framework in each social science and the utility of the results of that science for the attainment, preservation, and extension of the goals upon which men have agreed. (Easton, 1950: 476)
Finally, the young Easton sensed a fourth implicit challenge in the mature Lasswell’s democratic policy analysis: [T]here appears in embryo the further claim that even the goals upon which social policy must be based can be established with the procedures of a fully developed science of man. (Easton, 1950: 476)
Mediating between Facts and Values
The young Easton began to ponder Lasswell’s last bold claim noted above in his emerging theoretical approach to political systems persistence. He realized that a political system cannot itself articulate, pursue and reach its goals; only its members can do so. However, because of its intrinsic transformative capacity, a political system can keep its ongoing processes of decision and action inherently open to its members’ new goals and directions. A political system, as the mature Lasswell also argues, does not, like an individual organism, regulate itself according to a biological code or formative principle. It is coded and given its formative principles through the situated and contingent
interventions of its members in its ongoing processes of decision and action. An ethically informed science of the political system is ultimately about demonstrating what makes democracy possible as government of, by, for and with ‘the people’. It follows that empirically oriented general political theory is not principally concerned with what the political is, or ought to be, but rather with what the political is not; what it could be. Only by establishing a creative alliance between facts and values, can the possibility of a democracy as founded on both episteme, techne and phronesis be theorized and applied to practice (Bang, 2015). This merely requires that people, as historically situated political subjects, reciprocally accept and recognize that just as a political system most generally is about handling differences of all kinds, democracy is most specifically about showing mutual trust in and respect of such difference. Hence, the democratic imaginary can be claimed to be the only one which has a potential for including everyone in a political system’s continuous constitution and reconstitution. All other political imaginaries exclude some members from partaking in this constitution, whether by reference to their culture, class, ethnicity, nationality, race, sex, gender or religion. A political system, which is founded on reciprocal acceptance and recognition of difference on the part of both political authorities and laypeople in the political community, would merely exclude those who do not accept and recognize difference.
The Tensions between Bureaucracy, Technocracy and Democracy
President Donald J. Trump’s conflict-driven policies for ‘Making America Great Again’ can serve to illustrate the young Easton’s critique of elitism. Trump’s nativist populism is principally targeted towards stopping the cosmopolitan global elite’s undermining of the nation as the home of the ‘pure’ people (Bang, 2018). According to ‘Trumpism’, this technocratic and science-based global elitism has, by allowing and facilitating the free movement of ‘professionals’ in the global knowledge economy, opened the sluice gates for uncontrollable immigration and illegal aliens into the homeland of ‘we the people’. Trump is using bureaucracy as well as his people’s movement as his ‘personal army’ to smoke out ‘the global elite’ from all its ‘policy holes’ within national policy domains like health, environment, education and, of course, immigration. The intention is to stop the undermining of central government and leadership by the global meritocracy of networking and collaborating policy elites. The young Easton’s critique of Lasswell indicates that this takes us out of the frying pan and right into the fire. Liberal democracy is besieged by these two opposed hegemonic forces, leaning on global, science-based technocracy and national, people-based bureaucracy, respectively. On the one side, Trump seems justified in critiquing the global policy elite for having uncoupled itself completely from laypeople in the political community. It makes only ‘the professionals’ count in the constitution of policy, which in turn has contributed to depoliticizing policy, atomizing the nation, and depriving ‘the people’ of its sovereignty. On the other side, in trying to stop this global elite domination by (re)politicizing major parts of bureaucracy and the citizenry in the pursuit of hegemony, the politics of the strongman that haunted democracy in the 1930s is once again attempting to take over from democratic law, morality and ethics. As the young Easton indicates above, a technocracy set free from the ethics of polis is indeed as great a danger to democracy as is a bureaucracy set free from democratic morality and law. In any case, what suffers is democracy as a political relationship between elites and laypeople founded on reciprocal acceptance and recognition of difference (see Table 13.1).
Table 13.1 Bureaucracy, technocracy and democracy
| | Bureaucracy | Technocracy | Democracy |
| Issue | Politicization | Depoliticization | Politicization/Problematization |
| Power | Hierarchy | Meritocracy | Circularity |
| Knowledge | Rational | Rational | Practical/Prudential |
| Domination | Strategic manipulation and deception | Strategic communication and management | Mob rule |
| Freedom | Autonomy to pursue own interests | Autonomy to achieve success | Emancipation from domination and autonomy to make a difference |
The mature Lasswell’s democratic policy analysis is primarily addressed to overcoming the kind of manipulation and deception that derives from a politicizing bureaucracy which is only loyal to itself and which only pursues its own special interests. The younger Easton moves a step further by emphasizing how democratic critique must also be addressed to problematizing the depoliticizing and evidence-based technocracy’s elitist presupposition that the few always know best. Technocracy does not dominate in, and through, deformations and reifications of democratic discourses and practices; it is much subtler in its exercise of domination, making use of competence development and nudging to get people to do what they otherwise would not have done. Technocracy is global and scientific in nature (Castells et al., 2006); it aims at ‘adjusting’ all individuals and institutions to seek instrumental success above all else. This is also why many technocrats consider bureaucracy as such a threat to improving policy efficiency and policy effectiveness. Bureaucracy is all about state
sovereignty, order, hierarchy and one-way commands; it is by nature inflexible and rigid in its organization. It must be depoliticized by all available means and subjected to evidencebased administration, technocrats hold, if it is not to become an authoritarian force, undermining global competition and growth. Had Easton been alive, he would surely have warned us against the hollowing out of discursive and practice-oriented democracy by the depoliticizing global technocracy and the politicizing national bureaucracy. Their internal conflict can only extend and deepen the opposition between globalism and nativism, innovation and tradition, achievement and equality, the reasonable and the emotional, and so forth. Following Easton means recognizing how a ‘freestanding’ technocracy or bureaucracy necessarily undermines the transformative capacity of authority in terms of which political authorities and laypeople in the political community are coupled together inside the political system. If laypeople in the political community did not share in this transformative capacity, political authorities and policy elites would be left in the dark as to how best to make use of laypeople to identify and handle the conflicts, risks, challenges and problems that confront them in their everyday practices. Political authority in its democratic sense is not only committed to protect and approximate people’s liberties equally; it is also dedicated to nourishing and expanding people’s capacities to govern and take care of themselves.
Democracy as Equal Freedom and as Self-Governance
The depoliticizing technocracy is widely discussed within the new institutionalisms (March and Olsen, 1989, 1995), governance analysis (Bang, 2003) and governmentality studies (Dean, 2007), just as the politicizing bureaucracy has continuously been a concern for advocates of critical theory (Habermas, 1987), deliberative democracy (Dryzek, 2002) and discursive democracy (Laclau,
1990). The young Easton’s double critique of bureaucracy and technocracy indicates how these more institutional steering perspectives should be connected to formulating new, more connective approaches to recoupling political authorities, the political regime and the political community (Bang, 2016). Both politicization and depoliticization should be more explicitly considered from the vantage point of laypeople in their diverse everyday practices in the political community. On this level, the young Easton seems to have much in common with Habermas, who also argues that ethical claims to concrete action by associated individuals go beyond what citizens would be obliged to do by institutionalized law and morality: Moral commands should be obeyed out of respect for the underlying [democratic] norm itself without regard to the future compliance of other persons, whereas the citizen’s obedience to the law is conditional on the fact that the sanctioning power of the state ensures general compliance. Fulfilling an ethical obligation, by contrast, can neither be enforced nor categorically required. (Habermas, 2015: 21)
Easton helps by illuminating how Habermas lacks the distinction between authority as legitimate domination and authority as facilitating communicative governance that he needs to accomplish his mediation between (1) what people in polis must accept to obey to enjoy their abstract liberties, and (2) what they can only voluntarily accept and recognize as required to practise their freedoms on their own everyday terms and conditions. People’s obedience to democratic law and morality is required to (re 1) hinder the politicizing bureaucracy from distorting and repressing the struggles for equal freedom. In contrast, (re 2) people’s voluntary acceptance and recognition of the communicated message of political authority is required to expand self-governance and prevent them from simply being nudged by technocrats to do what they would otherwise not have done. Habermas himself implicitly makes this distinction between equal freedom and self-governance when arguing that:
What differentiates both ethical expectations and appeals to solidarity from law and morality is the peculiar reference to a ‘joint involvement’ in a network of social relations. (Habermas, 2015: 23)
To Easton such joint involvement is not peculiar at all but characteristic of how members of a democratic political community are networking based on their mutual acceptance and recognition of their intrinsic differences. Even the vision of such a democratic political community is today dissolving in front of our eyes. Either the members of the political community are subjugated to a global technocracy arguing that ‘the professionals’ know best, or they are haunted by nativist strongmen making use of bureaucracy to impose their false imaginary of a ‘pure’ people on their everyday practices. The young Easton points to a way for democracy out of this morass: With a general empirical theory woven into a moral theory, we would then have completed the new or post-modern image of social science. We would then have returned to the kind of knowledge prevalent in all ages prior to the late nineteenth century, but at a higher level of understanding and empirical confirmation … We would be clearly aware of the responsibilities and potentialities of each kind of knowledge without fearing to utilize each separately or together as the situation seemed to demand. (Easton, 1955a: 18)
The demand for such a postmodern political analysis should be obvious in a political situation where fears of failure and of everything foreign are permeating every pore of society. It is time we begin to reconsider how a systems approach to political life as a whole can help to reinstall the vision of the good and happy life introduced by Aristotle in Classical Greece.
References
Abramson, Paul R. and Ronald Inglehart, (1970) ‘The Development of Systemic Support in Four Western Democracies’. Comparative Political Studies 2(4): 419–442.
Alexander, Jeffrey C., (1984) The Modern Reconstruction of Classical Thought: Talcott Parsons. London: Routledge & Kegan Paul. Almond, Gabriel A., (1989) ‘The Intellectual History of the Civic Culture Concept’ in Almond, Gabriel A. and Sidney Verba (eds), The Civic Culture Revisited. London: Sage, pp. 1–37. Almond, Gabriel A., (1997) ‘The Political System and Comparative Politics: The Contribution of David Easton’ in Monroe, Kristen R. (ed.), Contemporary Empirical Political Theory. Los Angeles: University of California Press, pp. 219–231. Almond, Gabriel A. and Sidney Verba, (1963) The Civic Culture: Political Attitudes and Democracy in Five Nations. Princeton, NJ: Princeton University Press. Anckar, Dag, (1973) ‘David Easton’s Politiska Teori’. Acta Academiae Aboensis 50(2): 1–101. Aristotle, (1995) Politics. Revised with an Introduction and Notes by R.F. Stalley. Oxford: Oxford University Press. Badie, Bertrand and Pierre Birnbaum, (1983) The Sociology of the State. Chicago: University of Chicago Press. Baer, Michael A., Malcolm E. Jewell and Lee Sigelman (eds), (1991) Political Science in America: Oral Histories of a Discipline. Lexington: University Press of Kentucky. Bang, Henrik Paul (ed.), (2003) Governance as Social and Political Communication. Manchester: Manchester University Press. Bang, Henrik Paul, (2009) ‘Political Community: The Blind Spot of Modern Democratic Decision-Making’. British Politics 4(1): 100–116. Bang, Henrik Paul, (2011) David Easton. Copenhagen: Djøf Forlag. Bang, Henrik Paul, (2015) Foucault’s Political Challenge: From Hegemony to Truth. Houndmills, Basingstoke: Palgrave Macmillan. Bang, Henrik Paul, (2016) ‘Interactive Governance: A Challenge to Institutionalism’ in Edelenbos, Jurian and Ingmar van Meerkerk (eds), Critical Reflections on Interactive Governance: Self-organization and Participation in Public Governance. Cheltenham, UK: Edward Elgar, pp. 66–93.
Bang, Henrik Paul, (2018) ‘The American Dream: Who Else but the Young Can Revive It?’. Policy Studies 39(3): 274–291. Barry, Brian and Russell Hardin (eds), (1982) Rational Man and Irrational Society? London: Sage. Bateson, Gregory, (2002) Mind and Nature: A Necessary Unity. Cresskill, NJ: Hampton Press. Blondel, Jean, (1990) Comparative Government. Hemel Hempstead, Hertfordshire: Simon & Schuster. Blyth, Mark, (2011) ‘Ideas, Uncertainty, and Evolution’ in Béland, Daniel and Robert Henry Cox (eds), Ideas and Politics in Social Science Research. Oxford: Oxford University Press: 83–101. Boulding, Kenneth E., (1956) ‘General Systems Theory – The Skeleton of Science’. Management Science 2(3): 197–208. Bourdieu, Pierre, (1992) Language and Symbolic Power. Cambridge: Polity. Buckley, Walter (ed.), (1968) Modern Systems Research for the Behavioral Scientist. London: Aldine Publishing Company. Caramani, Daniele, (2008) Comparative Politics. Oxford: Oxford University Press. Castells, Manuel and Gustavo Cardoso, (2006) The Network Society: From Knowledge to Policy. Baltimore, MD: Center for Transatlantic Relations, Johns Hopkins University Press. Crozier, Michael P., (2010) ‘Rethinking Systems: Configurations of Politics and Policy in Contemporary Governance’. Administration & Society 42(5): 504–525. Dahl, Robert A., (1956, 2006) A Preface to Democratic Theory. Chicago: University of Chicago Press. Dahrendorf, Ralf, (1969) Essays on the Theory of Society. Stanford, CA: Stanford University Press. Dalton, Russell J., (1998) ‘Political Support in Advanced Industrial Democracies’. UC Irvine: CSD Center for the Study of Democracy: 1–24. Dalton, Russell J., (2002) Citizen Politics: Public Opinion and Political Parties in Advanced Industrial Democracies. 3rd edn. New York: Chatham House. Dalton, Russell J., (2008) ‘Citizenship Norms and the Expansion of Political Participation’. Political Studies 56: 76–98.
Dean, Mitchell, (2007) Governing Societies. Maidenhead: Open University Press. Deutsch, Karl W., (1963) The Nerves of Government: Models of Political Communication and Control. New York: The Free Press. Dryzek, John S., (2002) Deliberative Democracy and Beyond: Liberals, Critics, Contestations. Oxford: Oxford University Press. Easton, David, (1947) The Theory of the Elite: A Study of the Elitist Trends in English Thought. Harvard University: Unpublished Doctoral Dissertation. Easton, David, (1949) ‘Walter Bagehot and Liberal Realism’. American Political Science Review 43(1): 17–39. Easton, David, (1950) ‘Harold D. Lasswell: Policy Scientist for a Democratic Society’. Journal of Politics 12(3): 450–477. Easton, David, (1953) The Political System: An Inquiry into the State of Political Science. New York: Alfred A. Knopf. Easton, David, (1955a) ‘Shifting Images of Social Science and Values’. Antioch Review 15(1): 3–18. Easton, David, (1955b), ‘A Theoretical Approach to Authority’. Office of Naval Research, Report 17: 1–59. Easton, David, (1957) ‘An Approach to the Analysis of Political Systems’. World Politics 9(3): 383–400. Easton, David, (1958) ‘The Perception of Authority and Political Change’ in Friedrich, Carl J. (ed.) Authority. Cambridge, MA: Harvard University Press: 170–196. Easton, David, (1965a) A Framework for Political Analysis. Englewood Cliffs, NJ: Prentice Hall. Easton, David, (1965b) A Systems Analysis of Political Life. New York: Wiley and Son. Easton, David, (1969) ‘The New Revolution in Political Science’. American Political Science Review 63(4): 1051–1061. Easton, David and Dennis, Jack, (1969) Children in the Political System: Origins of Political Legitimacy. New York: McGraw-Hill. Easton, David, (1973) ‘Systems Analysis and Its Classical Critics’. The Political Science Reviewer 3: 269–301. Easton, David, (1990) The Analysis of Political Structure. New York: Routledge, Chapman and Hall.
Eikeland, Olav, (2008) The Ways of Aristotle: Aristotelian Phrónêsis, Aristotelian Philosophy of Dialogue, and Action Research. Bern: Peter Lang. Evans, Michael, (1970) ‘Notes on David Easton’s of the Political System’. Journal of Commonwealth Political Studies 8(2): 117–133. Flyvbjerg, Bent, Todd Landman and Sanford Schram (eds), (2012) Real Social Science: Applied Phronesis. Cambridge: Cambridge University Press. Giddens, Anthony, (1979) Central Problems in Social Theory: Action, Structure and Contradiction in Social Analysis. London: Macmillan. Giddens, Anthony, (1981) A Contemporary Critique of Historical Materialism: Vol 1, Power, Property and the State. London: Macmillan. Green, Leslie, (1985) ‘Support for the System’. British Journal of Political Science 15(2): 127–142. Gunnell, John G., (1993) The Descent of Political Theory: The Genealogy of an American Vocation. Chicago: University of Chicago Press. Habermas, Jürgen, (1987) The Philosophical Discourse of Modernity. Cambridge: Polity Press. Habermas, Jürgen, (2015) The Lure of Technocracy. Cambridge: Polity Press. Klemperer, Victor, (2011) Det Tredje Riges sprog (The Language of the Third Reich). Copenhagen: Tekst og Tale. Klingemann, Hans-Dieter, (1998) ‘Mapping Political Support in the 1990s: A Global Analysis’. WZB, Discussion Paper FS III: 98–202. Laclau, Ernesto, (1990) New Reflections on the Revolution of Our Time. London: Verso. Lane, Jan-Erik, (1978) ‘There Must Be Limits to Confusion’. Eripainos Politiikka 3: 161–179. Leslie, Peter, (1972) ‘General Theory in Political Science: A Critique of Easton’s Systems Analysis’. British Journal of Political Science 2(2): 155–172. Lewis, Thomas J., (1974) ‘The Normative Status of Talcott Parsons’ and David Easton’s Analyses of the Support System’. Canadian Journal of Political Science 7(4): 672–686.
Luhmann, Niklas, (1995) Social Systems. Stanford, CA: Stanford University Press. March, James G. & Olsen, Johan P., (1989) Rediscovering Institutions: The Organizational Basis of Politics. New York: The Free Press. March, James G. & Olsen, Johan P., (1995) Democratic Governance. New York: The Free Press. Meehan, Eugene J., (1967) Contemporary Political Thought. Homewood, IL: The Dorsey Press. Miller, Eugene F., (1971) ‘David Easton’s Political Theory’. The Political Science Reviewer 1: 184–235. Mingers, John, (1995) Self-Producing Systems: Implications and Applications of Autopoiesis. New York and London: Plenum Press. Mudde, Cas and Kaltwasser, Cristobal Rovira, (2017) Populism: A Very Short Introduction. Oxford: Oxford University Press. Kindle Edition. Müller, Jan-Werner, (2016) What is Populism? Philadelphia: University of Pennsylvania Press. Norris, Pippa (ed.), (1999) Critical Citizens: Global Support for Democratic Government. Oxford: Oxford University Press. Norris, Pippa, (2011) Democratic Deficit: Critical Citizens Revisited. New York: Cambridge University Press. O’Neill, John (ed.), (1973) Modes of Individualism and Collectivism. London: Heinemann. Parsons, Talcott, (1951) The Social System. New York: The Free Press. Parsons, Talcott, (1968) Sociological Theory and Modern Society. London: Collier-Macmillan/New York: Free Press. Putnam, Robert D., (1993) NJ: Princeton University Press. Radcliffe-Brown, Alfred R., (1957) A Natural Science of Society. New York: The Free Press. Rokkan, Stein and Lipset, Seymour Martin, (eds) (1967) Party Systems and Voter Alignments: Cross-National Perspectives. New York: The Free Press. Rousseau, Jean-Jacques, (1987) The Basic Political Writings. Indianapolis, IN: Hackett.
Sorzano, J. S., (1975) ‘David Easton and the Invisible Hand’. American Political Science Review, 69(1): 91–106. Vickers, Geoffrey, (1959) The Undirected Society: Essays on the Human Implications of Industrialization in Canada. Toronto: University Press.
von Bertalanffy, Ludwig, (1962) ‘General Systems Theory: A Critical Review’. General Systems, VII: 1–20. Wilden, Anthony, (1972) System and Structure. London: Tavistock Publications.
14 Max Weber and the Weberian Tradition in Political Science Andreas Anter and Hinnerk Bruhns
Introduction
As one of the most influential political thinkers of the modern age, Weber offers starting points for the most diverse directions of political science. There is hardly a textbook that does not refer to him. With his positions on state and legitimacy, power and domination, parliament and government, he has had a lasting influence on the debate in political science. He is one of the classics of the discipline, and texts such as Politik als Beruf (Politics as a Profession) (Weber, 1992 [1919]) are among its canonical texts. Weber’s reception is not limited to positions and concepts, but also extends to theory formation and empirical research. It is therefore no coincidence that a ‘Weberian tradition’ developed in international political science. However, one cannot assume that this is a homogeneous direction or a ‘school’. On the contrary, the different representatives of this ‘Weberian tradition’ are as disparate as Weber’s work. Given this diversity, the question arises as to which varieties of Weberian analysis can be distinguished in political science or whether one can speak – as in sociology – of a ‘Weber paradigm’. With Lawrence Scaff, the question is: ‘are there essential characteristics attached to Weberian analysis?’ (Scaff, 2014: 3).
Early Weberians
From Weimar to America
Weber’s impact during his lifetime, as well as in the first decades after his death, was comparatively small. Only his writings on the Protestant ethic and the question of value judgement, his series of articles such as Parliament and Government in Germany under a New Political Order and the lecture Politics as a Profession met with greater resonance. Prominent figures such as the two German Chancellors Helmut Schmidt and Kurt Georg Kiesinger or a constitutional father such as Carlo Schmid later reported on the lasting effect of reading
Politics as a Profession during their student days. Theodor Heuss, the first German Federal President, found the impact of the lecture sensational already at the time and recommended that professional politicians read Politics as a Profession. The collected essays edited by Marianne Weber, including the Collected Political Writings (Weber, 1921), did not meet with much response in the Weimar years. Weber was quoted, but his work was hardly taken up, and where it was, not necessarily positively. An important exception from Weber’s own generation was Otto Hintze (Bruhns, 1996). The generation immediately following Weber was sceptical or hostile towards him. This applies not only to sociology and Weimar state theory, but also to political science, which slowly began to develop in Germany at that time. Among the exceptions are Sigmund Neumann with his party analysis (Neumann, 1932); Karl Mannheim, who in his draft of a political science in Ideology and Utopia repeatedly referred to Weber and the ‘truth of Max Weber’s words’ (Mannheim, 1929: 171, 67); and Siegfried Landshut, who engaged meticulously with Weber and was the first to describe ‘rationalization’ as ‘Max Weber’s research topic’ (Landshut, 1929: 35ff.; Landshut, 1930). Hermann Heller in particular presented an advanced draft of a political science with his posthumously published Staatslehre, in which he assigned Weber a central place (Heller, 1934: 13ff.). Heller can be called the founder of the ‘Weberian tradition’. This orientation can already be seen in his definition of the state as an order which could only endure if it could enforce its ‘order against other social orders’ (Heller, 1934: 268) and at the same time had the corresponding legitimacy: ‘Where a self-asserting state power is not wanted, there is no state’ (Heller, 1934: 229).
With this, he confirmed Weber’s finding that a state exists only as long as ‘people orient their actions on the idea that it exists or should exist in such a way’ (Weber, 2013: 161–2).
The Weberian approach of his state theory is also evident in the historical explanations, where Heller makes it clear that statehood is an achievement of modernity which must not be projected into earlier epochs, but only begins with the monopolization of violence, the streamlining of the bureaucratic apparatus and the unification of the legal order (Heller, 1934: 141ff.). His account reads for long stretches like a paraphrase of Weber’s positions. When he sees the monopolization of legitimate physical violence as the ‘hallmark of the modern state’ (Heller, 1934: 152), he adopts Weber’s concept of the state, which marks a turning point in the history of state theory and establishes a historical-sociological viewpoint that has asserted itself internationally over the decades (cf. Anter, 2014: 25ff.). Heller’s orientation is most evident when he defines his approach as ‘reality science’, which aims to understand the peculiarity of the life around us (Heller, 1934: 62). He thus also follows Weber in programmatic terms: Weber understands his approach as a ‘science of reality’ that aims to understand the ‘lived reality within which we are placed’ (Weber, 2004 [1904]: 374). The appeal to ‘reality’ is a permanent feature in Weberian literature. In the post-war period, the German-American political scientist and Weber student Karl Loewenstein took up this line when he saw it as the task of political science to make a contribution to the ‘reality of the power process’ (Loewenstein, 1957: 147). He saw political science as a science of reality, with a particular focus on its epistemological foundations. Weber had demanded that every scientist should disclose his own values, since every insight is always subjective (Weber, 2004 [1904]: 366, 383). This is the core of his often distorted value judgement postulate. Weber was therefore not concerned with an ‘objective’ reality, but with ensuring that science does not exhaust itself in sterile norm analysis.
That was entirely in line with Loewenstein’s intentions. He thought that political science should not do ‘mental acrobatics’, but should focus on the ‘reality’ of the political order (Loewenstein, 1952: 433). In
German political science, Weber’s perspective was referred to as ‘constitutional realism’ and represented by leading figures such as Ernst Fraenkel (1961: xv). Loewenstein’s Weberian attitude also led him to his passionate opposition to the thesis that Weber, with his positions on ‘plebiscitary leader democracy’, had been an intellectual precursor of Hitler. This ‘pioneer’ thesis, which the young Wolfgang J. Mommsen had put forward in 1959 in his dissertation (Mommsen, 1984), met with a strong response at that time, all the more so as Weber had until then been regarded as one of the few great thinkers by means of whom the young Federal Republic of Germany could continue a positive German tradition beyond National Socialist Germany and the failed Weimar Republic. In this sense, the first German Federal President, Theodor Heuss, had formulated his foreword to the new edition of Weber’s Political Writings in 1958. The ‘pioneer’ thesis that Karl Löwith had already advocated in 1939–40, however, was able to establish itself in academic discourse.1 The thesis was, of course, untenable, since it turned Weber’s political thinking into its opposite. Beyond the defensive attitude, a typical element of the ‘Weberian tradition’ can be recognized in the debate: that it mostly reacted to ideological distortions of reality. Like Weber, Loewenstein also regarded power as a central category of political science. For Weber it was the central criterion of the political: he generally saw political questions as questions of maintaining and distributing power and therefore defined politics as the striving for power in the state and between states. Loewenstein similarly devoted himself to the problem of power and made it clear that politics is ‘nothing else but the struggle for power’ (Loewenstein, 1957: 3). For him, political reality was decisively shaped by power processes. 
His conviction that society is ‘a system of power relations’ (Loewenstein, 1955: 247) has meanwhile condensed into certainty in the more recent theory of power.
German-Speaking Emigration
In Germany, the various strands of a Weberian tradition in the social sciences were cut off after 1933. The National Socialist policy towards university and science left no room for developing social sciences in the sense of Weber. Numerous scientists who were acquainted with his work emigrated, especially to America. Most of these emigrants did not return to Germany after the end of the war, for example Paul Honigsheim, Hans Gerth, Karl Mannheim, Karl Loewenstein, Alexander von Schelting and Alfred Schütz. Others, such as Ernst Fraenkel, returned and played a major role in the re-establishment of political science in Germany (Söllner, 2006: 160ff.). If it can be said for sociology that Weber returned to Germany via America after the end of the Second World War, this does not apply equally to political science. In the Weimar Republic political science did not yet exist at universities as an institutionalized discipline. Politics was taught at the faculties of Staatswissenschaften, together with economics and law. But German emigrants succeeded in integrating themselves into American political science and in influencing it, even if they were actually law graduates, like Arnold Brecht, Ernst Fraenkel, Otto Kirchheimer, Hans J. Morgenthau or Franz Neumann (Söllner, 1996). Franz Neumann emphasized the importance of the United States for the continuity of a Weberian tradition. In his structural analysis of the National Socialist regime published in 1944 (Behemoth: The Structure and Practice of National Socialism, 1933–1944) he combined Marxist and Weberian approaches. Neumann commented later:
It is characteristic of German social science that it virtually destroyed Weber by an almost exclusive concentration upon the discussion of his methodology. Neither his demand for empirical studies nor his insistence upon the responsibility of the scholar to society were heeded. It is here, in the United States, that Weber really came to life. (Neumann, 1953: 22)
The reception of Weber the ‘political economist’ and the ‘political sociologist’ in parts
of the emigrant community, especially at the New School for Social Research, was counterbalanced by the ‘non-reception’ in Critical Theory, transported from Frankfurt to Manhattan (Scaff, 2011: 241–2). Although elements of the political Weber found their way into the American discussion (the theory of bureaucracy, the concept of the ‘rational-legal authority’, the thesis of disenchantment, etc.), it was the sociologist Weber who made his career in America, the Weber of The Protestant Ethic and the Spirit of Capitalism (Weber, 1930 [1904/05, 2nd ed. 1920]). This work, translated by Talcott Parsons in 1930, tapped unintentionally ‘the most fundamental of American narratives; the possibility of emancipation in pursuit of a better life’. In America, it was read as ‘a story about ourselves’ (Scaff, 2011: 198). David Beetham summed up Weber’s paradoxical reception in the Anglo-American world as follows: profound impact on sociology and relative neglect within political science, but with an indirect influence ‘through the dissemination of his theory of competitive leadership democracy via the work of Schumpeter and others’ (Beetham, 1985: 2). In the founding years of the Federal Republic of Germany, German emigrants played an important role in the development of political science, especially in the American Zone of Occupation. The familiarity of emigrants such as Franz Neumann, Ernst Fraenkel, Karl Loewenstein or Hans Gerth, Albert Salomon and Heinrich Blücher with Weber’s work, however, had less significance for the revival of Weberian traditions in the new, American-influenced German political science than the anti-Weberian influence of others, above all Leo Strauss (Natural Right and History, 1953), who accused Weber of nihilism and indifference to the normative aspect of the political, and Eric Voegelin (The New Science of Politics, 1952).
Despite his initial proximity to Weber, Voegelin has conveyed a completely distorted image of Weber as an arch-positivist based solely on a misunderstanding of the concept of value
freedom: ‘The movement of methodology, as far as political science is concerned, ran to the end of its immanent logic in the person and the work of Max Weber’ (Voegelin, 1952: 13). Weber ‘participated in the destruction of science’ (ibid.: 12). Strauss and Voegelin had more influence than others like Brecht, whose Political Theory (1959), though entirely based on a critical assessment of the debate on values in science, radically rejected the Weber interpretation of Strauss and Voegelin.
Non-Weberian Political Science in Post-War Germany: A ‘Science for Democracy’
Weber’s marginal role in German political science in the 1950s and 1960s has various causes. On the one hand, it can be explained by the fact that as an academic discipline it ‘did not develop in an inner-scientific process of differentiation of subject areas and disciplines’. From 1949 on it was ‘imposed on the universities with a political-pedagogical intent and soft governmental power’ (Behrmann, 1998: 448). The re-establishment of democracy, under the influence of the Western victorious powers, and the re-establishment of political science coincided and gave the latter a normative orientation. On the other hand, most of the remigrants who were involved in the establishment of political science in the early Federal Republic now represented less of a German scientific tradition than an American one (cf. Söllner, 2006: 185–6); only a few – Fraenkel is an important example – continued to regard Weber as a ‘spiritual mentor’. However, in the general intellectual spectrum of the new German democracy, Weber was initially considered one of the very few great personalities who could serve as a bridge, beyond the Nazi regime, to honourable periods and traditions of German history. For the new political science in Germany, however, not only Weber’s distorted image as a methodologist and representative of a
radical pluralism of values posed a problem, but also recent interpretations of Weber’s political positions from his Inaugural Address as professor of economics in Freiburg in 1895 through to the First World War and the Revolution of 1918–19. Criticism of this had been voiced early on. Weber’s intense commitment to the democratization of Germany between 1917 and 1919 was obscured by interpretations of his political thinking that drew a direct line from his ideas about political leadership (plebiscitary leader democracy) to Hitler and National Socialism. However, this criticism, expressed by Karl Löwith as early as 1939–40 from Japanese exile, did not initially meet with much response. The decisive break came in 1959 with the publication of Mommsen’s book on Max Weber and German Politics 1890–1920 (1984). Mommsen placed heavy emphasis on Weber’s nationalism and his understanding of international relations (IR) in terms of power politics. In a political science that considered itself a ‘Demokratiewissenschaft’ and as being positively involved in the process of democratization of Germany, a Max Weber who had fought fiercely for the democratization of Germany during the First World War and the Revolution and had made important contributions to shaping the constitution of the Weimar Republic paradoxically had no place. His relationship to democracy was perceived as ambivalent: Weber’s commitment to parliamentary democracy was not based on normative and philosophical ideas, but on historically founded rational considerations about the form of government that on the one hand provided Germany as a modern industrial nation with the means to assert itself in the competition of the ‘power states’, and on the other hand guaranteed citizens a minimum of political equality and participation.
In the 1950s and 1960s, however, Weber’s concept of politics, his (alleged) ‘pluralism of values’ and his (misunderstood) ‘postulate of freedom of value judgement posed a threat to philosophical-normative political science’ (Hübinger et al., 1990: 184).
In the early 1960s in Germany, the actual Weber renaissance took place in sociology and social history. As to politics and the political, the discussion did not focus on his ideas about parliamentary democracy but almost exclusively on Mommsen’s interpretation of Weber’s nationalism, his attitude to imperialism and his alleged theory of plebiscitary leader democracy, which, according to the accusations, had helped to mentally attune the German people to the National Socialist dictatorship. The result of this was that at the Heidelberg Congress of the German Society of Sociology, organized in honour of Weber’s centenary in 1964, the discussion on the political dimension of Weber’s work was tailored entirely to the topic of power politics, for which Raymond Aron had been invited as a keynote speaker. Aron’s Weber interpretation ended in an aporia: ‘Weber after all betrayed himself in his theory of politics for power was never his aim, neither for himself, nor for the nation. […] The man and the philosopher leave us an inheritance undiminished by the mistakes of the theoretician of power politics’ (Aron, 1971: 100). The comments and responses by Carl J. Friedrich, Hans Paul Bahrdt, Wolfgang J. Mommsen, Karl W. Deutsch, Eduard Baumgarten and Adolf Arndt (Stammer, 1971: 101–30) to Aron’s lecture at the 1964 Congress of Sociology raised a series of highly interesting questions with regard to Weber that only later found their way into political science research. An interesting example is the observation made by Arndt: ‘We should not forget that [Weber] put three questions to himself and to us: How can we make the people political? […] How can we instruct parliament? How can we find politicians of the right calibre?’ (Arndt, 1971: 130). These questions were only taken up two decades later by Wilhelm Hennis as central dimensions of Weber’s work for political science (Hennis, 2000a, 2000b), while in the 1960s and 1970s Weber was deliberately received ‘one-sidedly’. 
Political scientists largely ignored the research on Weber and his work in the
neighbouring disciplines. Also, research controversies about Weber as a political theorist were more likely to be carried out in the leading sociological journal, the Kölner Zeitschrift für Soziologie und Sozialpsychologie, which René König had turned into a kind of Weber forum, than in the journals of political science (Hübinger et al., 1990: 189ff.). Strauss’ and Voegelin’s far-reaching critique of science at the interfaces of German and American political science, ‘from whose point of view Max Weber in particular had misled modern political science’ (Behrmann, 1998: 474), was reinforced by the effects of partial and biased, often second-hand, readings of Mommsen’s study on Weber and German politics. It was strengthened by the sharp attacks at the Heidelberg Congress in 1964 by Herbert Marcuse: ‘In the development of capitalist rationality, irrationality thus becomes reason’ (Marcuse, 1971 [1965]: 137, emphasis in the original) and by Jürgen Habermas, who turned Carl Schmitt into Weber’s ‘natural son’ and underlined that ‘viewed in the light of the history of influences, the decisionist element in Weber’s sociology did not break the spell of ideology, but strengthened it’ (Habermas, 1971: 66). Parallel to such normative and ideological oppositions to Weber, there was also, in political science, a real interest in Weber, especially in the ‘Sociology of rule’ in Economy and Society, and in political writings such as ‘Suffrage and Democracy’, ‘Parliament and Government’, ‘Politics as Vocation and Profession’ and ‘Germany’s Future Form of Government’. This interest was reinforced by the fact that Johannes Winckelmann had introduced a ‘Sociology of the State’ as a final chapter in the 4th edition (1956) of Weber’s Economy and Society. Winckelmann had compiled this ‘Staatssoziologie’ from Weber’s political writings of the years 1917–19 and from his lecture on economic history held in Munich in 1919–20.
Weber’s political writings, sometimes devalued as ‘political of the day’, found their way into other languages earlier via this ‘sociology of the state’,
translated as a part of Economy and Society, or through translations of the political writings. Treatises such as ‘Suffrage and Democracy in Germany’ or ‘Parliament and Government in Germany under a New Political Order’ were already ranked alongside the Contrat social by Loewenstein in 1920. Nevertheless, in political science and its neighbouring disciplines the separation between Weber’s ‘political’ and ‘scientific’ writings has long been elevated to a principle. Only Hennis, as late as 1987, made it clear that such a division of the work is untenable (Hennis, 1987: 224). Since then, the traditional separation has been tacitly abandoned. Political scientists have been similarly hesitant in their attitude to the historical and comparative dimensions in Weber’s sociologies of rule and state.
Weberians in Contemporary Political Science
Max Weber’s Central Question
It was Hennis who made a pioneering contribution to Weberian political science by examining Weber’s central question. As a young scholar, he had fought Weber vigorously, blaming him for the positivist departure of political science from the Sinnfrage [question of sense]. Thus, he condemned Weber’s political science as empty and worthless (Hennis, 1959: 20–1). In his criticism, he inexplicably followed the distorted image of Weber drawn by Strauss, who regarded Weber as a relativist (Strauss, 1953: 44). Hennis later recalled self-critically that political science at that time ‘under the opinion leadership of its authoritative “founding fathers” refused to accept Weber positively’ (Hennis, 2003: 5). In the 1980s, however, Hennis turned his image of Weber upside down: in his book Max Weber’s Central Question (first published 1987), he saw Weber in a completely new light, namely as an ‘authority’ of political science (Hennis, 2000a: 86). With an
unparalleled interpretive power and a refreshing polemic, he shook the prevailing doctrine that the ‘occidental rationalization process’ was Weber’s central theme. In contrast, he disclosed the anthropological question: the question of the fate of mankind [‘Menschentum’] in modernity. In doing so, Hennis opposed the unhistorical approach in the prevailing Weber orthodoxy – and returned the existential dimension of Weber’s work to political science. His gripping reinterpretation caused an international stir and catapulted him into the premier ranks of Weber research. Hennis vigorously protested against a de-historicized Weber reception, presented in diagrams, and demanded, ‘Weber has to be read afresh and “without prejudice”. And that means the entire corpus of his work’ (ibid.: 5). His appeal did not go unheard. It was particularly popular with those Weberians who could not make friends with a de-historicized and colourless Weber à la Habermas. In today’s literature of political theory and international relations, for example, Weber’s explicitly nationalist positions are discussed much more openly than before (cf. Lebow, 2017b).
Legitimacy: A Weberian Category
Weber’s interest focuses on the question: Why do people obey other people? His answer is that domination can only assert itself permanently if it is considered legitimate. He thus placed the category of legitimacy at the centre of political science. Today, the discipline is concerned with the question of why claims to political order are followed. In any case, claims to order cannot be based on naked violence, otherwise they would not be permanently observed (Weber, 2013: 449ff.). Weber’s concept of legitimacy has proved highly successful, since almost all investigations into the relationship between state and legitimacy have since been based on Weber, whether critically or approvingly. The debate reached its first peak in 1975, at the Congress of the German Political Science
Association, which was dedicated to the issue of the so-called ‘Legitimation Crisis of the Political System’. The topic of the Congress was inspired by Jürgen Habermas’ book on the ‘Legitimation Crisis of Late Capitalism’ [Legitimationsprobleme im Spätkapitalismus, 1973], which maintained that the Western political system was on the verge of decline. His opponent was Wilhelm Hennis, who argued that there is no such thing as a ‘legitimation crisis’ (Hennis, 2009: 77). Moreover, he showed that never before in recent German history had the foundations of the state ‘been so little questioned and contested as in the era of the Federal Republic’ (Hennis, 2009: 77). After the Congress, Hennis’ lecture was printed in several journals and books and was widely discussed, since it was perceived as a document of political realism against political ideology. The Weberian approach of this document becomes clear when Hennis explicitly points out: ‘Wherever the category of legitimacy turns up in modern social sciences it is at root Max Weber’s concept’ (ibid.: 89). Hennis’ lecture was a turning point in the evolution of his political thinking. Moreover, it was a starting point for a new Weberian tradition. In all his earlier writings he had criticized Weber, whereas he now regarded Weber’s state theory as ‘one of the most acute and earliest decodings of the internal law of development of the modern state’ (ibid.: 89). Since that time, political science’s debates on legitimacy have been shaped by Weber’s concepts. By now, this holds for various parts of international political science, such as International Relations (IR) theory, political philosophy, governance studies and European studies. The category of legitimacy is actually the Archimedean point in Weber’s theory of domination. From his point of view, the question of legitimacy is one of when, how and why the political and legal order is recognized and respected.
It was Weber who made legitimacy an elementary analytical category for the comprehension of political and legal order. Legitimacy is the twin of the modern state, and this holds particularly for the democratic constitutional state, which invokes legal
principles such as human dignity, liberty and equality, but cannot be based on higher values and principles. If a state order can only exist as long as it is regarded as legitimate, an intimate relationship emerges between state and legitimacy, a Weberian perspective that is deployed by theorists like Patrice Duran (2019). The fact that the stability of an order can be read off from the degree to which the actions of individuals are oriented towards the order is also evident in the field of international politics. In his investigation of the effects of state collapse, Daniel Lambach draws on Weber’s concept of legitimacy and comes to the conclusion that a population’s belief in legitimacy is a central factor ‘that determines the success and failure of a state: The state therefore exists only to the extent that people follow its rules’ (Lambach, 2008: 278). With this, he indeed agrees with Weber, for whom the existence of an order is based on the people’s belief that it should exist: it is based on legitimacy.
The ‘Weberian State’
The modern state is one of the prominent objects of today’s political science research (see Schlichte and Gaufman, Chapter 81, this Handbook). The discussion is as vital as it is controversial, since the reality of the modern state itself can only be described in paradoxical formulas. The varieties of state experience rely on the heterogeneous manifestations of the state. This causes a central problem of state theory, which is reflected by Weber, who described the notion of the state as ‘the most complex and interesting case’ of the problem of concept formation (Weber, 2004 [1904]: 394). Weber defines the state as a political institution that successfully claims the ‘monopoly of legitimate physical force’ (Weber, 2004 [1922a]: 356, emphasis in original). Weber occupies a central position in present-day state theory since he formulated more clearly than anyone else the monopoly of violence as
the elementary criterion of the state. It is no coincidence that he is recognized in the international scientific community as the theorist of the monopoly of force. Weber’s concept of the state seems to be well established (cf. Anter, 2019). As Duncan Kelly points out, it is ‘quite simply the most commonly used working definition found in contemporary historical and political writing’ (Kelly, 2008: 4). Weber’s theory of the state, however, is not limited to the monopoly of violence, but includes a series of criteria, among them the institutional and action-based character of the state. For Weber, the state is a ‘business enterprise’ (Weber, 1984 [1918]: 451), which he defines as ‘continuous purposive action of a particular kind’ (Weber, 2013: 209). This perspective was closely bound up with his action-based theory of the state: that the state ‘only covers a course of human action of a particular kind’ (Weber, 2013: 440). He goes a step further by placing the category of ‘chance’ before that of action, for the state ‘consists exclusively and solely of the chance that action occurred, occurs or will occur’ (Weber, 2013: 177). Weber concludes that a state ceases to exist sociologically with the disappearance of the chance that particular forms of action might occur (Weber, 2013: 177). Thus, the ‘chance’ becomes the condition of possibility of the state. The ‘chance’, which plays a central role in Weber’s writing (Anter, 2014: 88ff.), has an empirical payoff: if the chance of the state is quantifiable, there must be different degrees of the state, which can eventually be measured empirically. Such a ‘gradual’ conception of the state corresponds to the gradual validation of orders. Hence, an order depends on the chance that action ‘can be oriented by an actors’ conception of the existence of a legitimate order’, so that ‘there exists no absolute alternative between the validation and the non-validation of a particular order.
There are instead fluid transitions from one to the other’ (Weber, 2013: 183–4, emphasis in original). For current state research in comparative political science and international relations,
Weber’s gradualist approach offers a conceptual basis for an empirical analysis of the different degrees of statehood. For some time, a ‘Weberian approach’ has been established in international state theory (vom Hau, 2015: 135ff.). Even the state itself is referred to as a ‘modern Weberian state’ (Lemay-Hébert et al., 2014: 7). It is a rare case in modern social science that an object is labelled with the name of the theorist who analysed it. The Weberian approach is represented in international state theory by authors as diverse as Theda Skocpol (1985) and Gianfranco Poggi (2010). Stefan Breuer’s study ‘The State’ (Der Staat, 1998) with its typology and genealogy of the modern state, following Weber, is one of the important recent state analyses. Breuer is also among the important Weber interpreters of our time. The current debate turns particularly to the role of the monopoly of violence in the processes of state-building, its function in securing domestic peace, and the threatening fact that the monopoly is constantly endangered in the European states and does not even exist in many parts of the world. In particular, the global danger of Islamic terrorism has rendered the monopoly of violence in many states highly tenuous (cf. Laqueur, 2017). The monopoly of violence is essential not least for contemporary democratic states, since it guarantees that legitimate decisions have the chance to be enforced. This is particularly evident from a Weberian perspective (cf. Anter, 2019). Without a secure statehood, no stable democracy can emerge, as Wolfgang Merkel shows based on failed states, since without an efficient state administration, ‘democratic decisions could not be adequately implemented’ (Merkel, 2013: 300).
Arthur Benz also refers to Weber when he makes it clear that the modern state is not a static entity, but that today’s ‘multinational multi-level state’ is exposed to increasing dangers in view of the progressive dissolution of boundaries, in particular the abolition of the control function of state borders (Benz, 2013: 68). He argues in favour of strengthening the institutions of the
nation-state in relation to the multilevel structure of the EU, above all the courts and the administrations (Benz, 2013: 70).
Charisma, Patrimonialism and Bureaucracy: Weberian Issues in Political Science
Weber’s typology of forms of legitimate rule (legal-rational, traditional, charismatic) has become an important part of the Weberian tradition in the social sciences. This does not mean, however, that Weber’s scheme is generally made the basis of the analysis of political regimes, nor that Weber’s types, subtypes and subcategories have proved to be real stimuli for political science. This statement must be qualified with regard to the concepts of charisma, patrimonialism and bureaucracy. The concepts of charisma and charismatic domination have enjoyed a never-ending career from sociology through history, religious studies and social psychology to political science, but this has not really helped to clarify their instrumental value for the analysis of concrete political configurations. The abundance of works on the concept of charisma and on the type of charismatic rule has meanwhile developed into a field of its own in research on Weber. From the long list of political leader figures to whom the concept of charismatic rule has been applied, early works on Hitler and the Nazi regime are of particular methodological interest, beginning with ‘The Nazi Party: Its Leadership and Composition’ (1940) by the German émigré Hans Gerth, who played an important role in the reception of Weber in the Anglo-American world. Other early authors are Ernst Fraenkel (1941) and Talcott Parsons (1942). The analysis of the charismatic dimension of the National Socialist regime was later taken up by political scientists with reference to Weber (e.g. Nyomarkay, 1967). Most interesting from the viewpoint of methodology are the works of sociologists (Lepsius, 1993 and 2017) or historians
(Herbst, 2010). Lepsius examines the conditions for the exercise of charismatic leadership in a complex and differentiated territorial state, taking up suggestions made by Fraenkel. The administrative staff of the charismatic leader must also fulfil the everyday tasks of a society. With the dissolution of formal coordination procedures and the lack of institutionalization, the leader has a central role as a coordinating authority. Lepsius describes an only partially charismatic system of rule, the core of which lay in securing the personalized belief in legitimacy, which was syncretistically mixed with elements of formal legality, enriched by elements of traditional legitimacy (Lepsius, 1993: 115–17). Further examples of differentiated applications of the concept of charisma are Guenther Roth’s (1987) analyses of charismatic leadership patterns in authoritarian one-party systems and democratic presidential systems. In political sociology, the topic of charisma has been pursued from a theoretical angle by Edward Shils (1965), who treats charisma as a universal element of persons, strata and institutions which, intensively and in concentrated form in times of crisis but also diffusely in everyday life, gives social orders legitimacy through transcendence (cf. Utz, 2014). Breuer (1994) emphasized both the order-generating significance of charisma and its transformations for the evolution of archaic states and the significance of the ‘charisma of reason’ in the American and French revolutions. Early on (Friedrich, 1961), Weber was accused – under the impression of the often misunderstood concept of value pluralism – of not having sufficiently differentiated between charismatic structures within constitutional orders on the one hand, and dictatorial regimes which sought to acquire plebiscitary legitimacy on the other.
Yet for the 20th century in particular, historians, sociologists and political scientists have analysed a whole series of charismatic situations with the help of Weber’s model of charismatic rule and his concept of routinization of charisma, i.e.
ways of institutionalizing religious or political power after the death of a charismatic ruler. Weber’s ‘theory’ by no means focuses solely on phenomena such as personal charisma or charismatic legitimacy, the belief in legitimacy. Alongside these, central concepts for political science analysis are ‘routinization’, the question of succession, the ‘day-to-day interests of the administrative staff and the transformation of charismatic norms into traditional norms (Weber, 2013: 503–4, emphasis in the original). After the Second World War, the concept of charismatic rule was often applied to regimes in so-called developing countries characterized by strong personalities in order to distinguish them from rational-legal bureaucratic regimes (Theobald, 1982). In response to this undifferentiated use of terms from Weber’s sociology of rule, Roth (1968) suggested resorting to another term from the typology of domination: patrimonialism. The concept of patrimonialism has since been widely used, with reference to Weber, to analyse certain forms of state construction and governance, notably in Africa, Latin America and Asia. The intention was to take account of various facets of obstacles encountered in the process of democratization and political development in general. Political and social scientists emphasized aspects such as the confusion between the public and private spheres, the assimilation of administrative and domestic functions, and the recruitment of the agents of power and of the public bureaucracy from among those who are close to the ruler, who acts in such a manner as to make it impossible to distinguish between the function and the person holding that function (Fauré and Médard, 1995: 289–90). Shmuel N. Eisenstadt questioned the legitimacy of using the term ‘patrimonial’, a term derived from the analysis of traditional historic political systems, for analysing modern political systems.
He proposed to use the term ‘patrimonial’ not in order to describe a level of development or differentiation of political regimes, but only in order to designate a
specific way of coping with the major problems of political life ‘which may cut across different levels of “development” or structural complexity’ (1973: 59). Eisenstadt suggested distinguishing between traditional (Antiquity, Middle Ages) patrimonial regimes and modern forms of patrimonialism, thereby introducing the terms ‘neo-patrimonialism’ and ‘post-patrimonial regimes’ (1973: 13, 46). He saw the essential differences between patrimonial and neo-patrimonial regimes as being, first, ‘the political problems which were faced respectively by such traditional and modern regimes’, and, second, ‘in close relation to these problems, the constellations of conditions which could assure the continuity of any specific patrimonial regime’ (1973: 50). By introducing the term neo-patrimonialism, Eisenstadt was reacting to a development in the use of the concept of patrimonialism initially put forward by Roth (1968: 194–206), who had observed that in many new states, tradition had lost its force as a source of legitimacy without having been replaced by legal-rational legitimacy. As a consequence, forms of personal rule that did not correspond to any of the three Weberian ideal-types of legitimacy essentially owed their maintenance to material incentives and rewards, notably cronyism and corruption. To take account of this development, Roth suggested grasping these forms of domination conceptually by distinguishing between traditional patrimonialism and de-traditionalized, personalized patrimonialism, later referred to as neo-patrimonialism. Eisenstadt’s term ‘neo-patrimonialism’ has been widely used, but with different accentuations, as for example, by Jean-François Médard, who distinguishes between ‘rationalized’ neo-patrimonial states, i.e. those regulated by a form of specific regulation based upon particularist redistribution, and purely predatory and kleptocratic states leading to the criminalization and privatization of the state.
This ultimate state of neo-patrimonialism, which destroys the state which feeds it, recalls
‘sultanism’ as described by Weber (Médard, 1990). Other authors point to a general problem in the discussions on patrimonialism and neo-patrimonialism: the relationship between patrimonial domination on the one hand and legal-rational bureaucratic domination on the other. The hybrid phenomenon of neo-patrimonialism is seen as ‘a creative mix of two Weberian types of domination: of a traditional subtype, patrimonial domination, and legal-rational bureaucratic domination’ (Erdmann and Engel, 2007: 104). Important studies devoted to the concept of patrimonialism in Weberian sociology (e.g. Hermes, 2004; Breuer, 2006) have hardly resonated among contemporary analysts of so-called neo-patrimonial regimes. These are manifestly two completely distinct fields of research. In the latter, references to Weber are almost completely limited to a few paragraphs of Economy and Society. Weber’s major study of a patrimonial regime, China (Weber, 1989 [1915–1920]), seems to be read extremely rarely by researchers interested in contemporary (neo-)patrimonial regimes. Weber conceived his concepts and typologies, including their sub-types and hybrids, as analytic instruments. ‘Historically, there has never been an ideal-typical purely patrimonial “state”’ (Weber, 2013: 484, emphasis in the original). From a pragmatic vantage, we might well ask whether the list of sub-types of traditional domination elaborated by Weber (including patriarchalism, patrimonialism and sultanism) should be completed with other sub-types on the basis of new empirical data and configurations. Médard suggested reserving the concept of neo-patrimonialism for forms of the state which, in contrast to the European development, are the product of ‘two processes of bureaucratization and patrimonialization which otherwise have always gone hand in hand and are intimately linked’. He thus completed the Weberian idea of bureaucratic patrimonialism with that of a patrimonialized bureaucracy (Médard, 1998: 311–12).
Theobald (1982) had called on researchers to
pay more attention to historical and economic factors and to include the specific characteristics of administrative systems in developing countries in the analysis of patrimonial regimes. The intention to make Weber’s theoretical findings fruitful for research on the ‘patrimonial state’ characterizes the joint venture of Bach and Gazibo (2012). The question of bureaucracy occupies an important place in Weber’s sociology. Ancient Egypt, the late Roman Empire, the medieval Roman church and 18th-century China are major fields of his investigations. With the modern state and the legal-rational mode of rule the theme becomes absolutely central: ‘The whole history of the development of the modern state is identical with the history of modern bureaucracy and the bureaucratic enterprise’ (Weber, 2004b [1922b]: 135). He states: ‘The purest type of legal rule is that of bureaucratic administrative staff’. Weber constructs the ideal-type of bureaucracy by assembling a series of one-sidedly accentuated characteristics of this bureaucratic administrative staff, such as recruitment, qualification, advancement or organization, and of its mode of administration. This ideal-type is neither a description of reality nor a normative ideal, but an analytic instrument. The fact that purely bureaucratic rule is the most technically efficient and formally rational form of exercising power is not the central element of Weber’s theory of bureaucracy. What is decisive for him is the sociological observation that the development of modern forms of association such as state, church, army, party, business enterprise or university is ‘identical with the increase in bureaucratic administration’ (Weber, 2013: 463, emphasis in the original). Without such a bureaucratic administration, mass administration is not possible, and people’s everyday lives are bound to it. Weber observes the effects of bureaucratization mainly, but not only, from the perspective of domination.
Basic tenets like the principle of qualification, the enforcement of salary and the labour market regulated by rational law, put social relations on a new basis. Bureaucratization with the
aim of more rational action has the effect of weakening or eliminating social hierarchies. When Weber speaks of the transition from communal action (Gemeinschaftshandeln) to rational social action (Gesellschaftshandeln: Weber, 2005: 34), modern social sciences translate this as the coordination of individuals who differ by qualification, social, local or ethnic origin and other criteria. In this sense, the bureaucracy, as presented by Weber, is an answer to the problem of coordination in a mass society. The international reception of Weber’s theory of bureaucracy has largely been shaped by American sociology, in which Weber’s ideal-typical construction was often misunderstood as a representation of reality or as a normative ideal (Duran, 2011). As a result of this reading, the international sociology of organization has largely developed from the criticism of Weber’s thesis of bureaucracy, misunderstood in this way (Mayntz, 1965). In this respect, the sociology of organization initiated in France by Michel Crozier is also a continuation of American research. Crozier’s interest is limited to partial aspects of Weber’s theory of bureaucracy. For him, bureaucracy is only a formalization of social exchange by means of systems of rules and procedures that the actors must adhere to. Weber, however, was less interested in the organizational aspects than in the institutional ones and in the question of power relations (Meier and Schimank, 2014).2 Like organizational sociologists, theorists of New Public Management have used Weber as a punching bag. In light of the modernization of state administration by New Public Management, it has often been claimed that the functional principles of bureaucracy described by Weber have proved dysfunctional under today’s conditions (cf. Dunn and Miller, 2007).
Indeed, more recent reform concepts with their focus on competition, controlling, output management, privatization and deregulation seem to have marked a departure from the principles described by Weber. However, most observers in political science come to the conclusion
that a ‘post-Weberian’ administrative model is not in sight (Bogumil et al., 2006: 184). In the one-sided and partial reception of Weber’s works, it has largely been forgotten that Weber was above all interested in a genuinely political dimension. Breuer draws attention to the role of the bureaucracy in the frame of the Weberian type of ‘rational rule’ and explains why ‘Weber’s entire political theory is geared on the problem of how to produce sufficient political energy to keep the bureaucracy at the status of a mere instrument’ (Breuer, 1991: 110). ‘Parliament and Government in Germany under a New Political Order’ (Weber, 1994 [1918]) shows that for Weber the control of officials was not a question of theory, but of empirical investigation. The analysis is focused on the specific German case and is illuminated by comparison with other cases. Weber’s second concrete study concerns the development of the Chinese bureaucracy in the 18th century. Weber describes a system in which the bureaucracy as a whole – but not the individual civil servant whose position was completely precarious – was secured by a huge prebendary income. The interests of the top class of officialdom were not appropriated individually but collectively. Thus, officialdom as such resisted any intervention or reform (Zingerle, 1983). Weber raised the question of how the individual in a society subject to the growing process of bureaucratization can still preserve a remnant of freedom. Contrary to the tendency to interpret Weber here in terms of a philosophy of history (cf. Mommsen, 1974: 95–115), François Chazel pointed out that Weber’s work already contains a ‘theory of mass democracy as an oligopolistic competition between bureaucratic parties’ (Vincent 1998: 105). Public administration and private bureaucratic apparatuses of parties or business associations compete for the shaping of politics. This competition between the bureaucracies serves freedom, even if this is not their goal.
Normative interpretations of Weber speak here of ‘bureaucratic sabotage’ (Brecht, 1937: 53).
Weber and International Relations
Despite the current processes of change, states continue to be the key political actors in international relations. ‘The international community is essentially a community of states. The states continue to be the bearers of the international order, the actual creators and guarantors of international law’ (Isensee, 2003: 12). States therefore also play an important role in the theories of international relations. From Weber’s perspective, a differentiated answer must be given to the question of the function of today’s states, since the picture is very different in the different territories. Political science draws on Weber all the more, since IR theory has always been focused on questions of power and power relations (see Hellmann, Chapter 76, this Handbook). There is hardly any theory of power or domination that does not draw on Weber. So it is no coincidence that he is regarded as ‘the father of modern IR theory’ (Lebow, 2017a: 1). Hans J. Morgenthau is one of the most important pioneers of the ‘Weberian tradition’ in IR theory. He was one of those German emigrants in the United States who influenced university debates as well as public political discussions, especially during the Vietnam War, when he opposed American warfare as a ‘public intellectual’. His writings show that he oriented himself towards Weber, and this orientation reaches back to his student days. He reports that during the first state examination in Munich he attended a seminar on Weber’s political thinking, which became formative for him (Morgenthau, 1977: 7). His Weberian attitude can already be seen in the positions on the theory of power that he formulated in the 1940s. Like Weber, he regards power from an action-oriented perspective as an asymmetric structure between acting individuals: ‘When we speak of power, we mean man’s control over the minds and actions of other men’ (Morgenthau, 1948: 13).
In his understanding of politics, Morgenthau proves to be a Weberian. For him, political action, especially in international politics, is
not guided by idealistic goals, but rather by the pursuit of power, a position that Weber arguably held most firmly. The Weberian attitude also includes an anti-eudaimonistic attitude. Just as Weber thought it illusory to expect ‘that peace and happiness lie waiting in the womb of the future’ (Weber, 1994 [1895]: 14), Morgenthau also thought it absurd to expect international politics to create peaceful happiness: for him it was first and foremost an arena of power struggles (Morgenthau, 1948). Regarding today’s IR theory, one cannot speak of a uniform concept of power. The discussions here are just as controversial as they are in other parts of political science. Nevertheless, a dominance of the Weberian tradition can be observed. When Robert Keohane and Joseph S. Nye say that ‘[p]ower can be thought of as the ability of an actor to get others to do something they otherwise would not do’ (Keohane and Nye, 2012: 10), then a tradition becomes clear that extends from Morgenthau to today’s IR theory. According to the prevailing opinion in international relations, ‘those who are in a position to sustainably enforce their will against the ideas of others are considered powerful’ (Münkler, 2006: 847). The Weberian tradition can still be seen in Nye’s influential distinction between ‘soft power’, which asserts itself through persuasion, and ‘hard power’, which asserts itself through command (Nye, 2005), for this distinction largely follows Weber’s distinction between power and domination. While Morgenthau played an important role in Weber’s reception in American IR theory, Raymond Aron played a similar role in France. He had already encountered Weber during a study visit to Germany between 1930 and 1933 and had subsequently introduced him to the French public with his book La sociologie allemande contemporaine (Aron, 1935). Aron felt connected to Weber through a kind of elective affinity, though he felt distant from him in ‘important points’ (Aron, 1990: 60–1).
For example, he shared neither Weber’s political pessimism nor his assessment of the ‘Machtstaat’ [‘power state’]. Aron played a
major role in the canonization of Weber in the 1960s. This aspiration is clearly evident in the lectures he gave in Paris, which culminated in a ‘Weberian interpretation of the contemporary era’ (Aron, 1967: 222). Like Morgenthau’s IR theory, Aron’s view of international relations is strongly focused on the phenomenon of power. This becomes evident in his book Peace and War, where he distinguishes between different types of power:
An individual’s power is his capacity to act, but above all to influence the actions or feelings of other individuals. On the international scene I should define power as the capacity of a political unit to impose its will upon other units. In short, political power is not an absolute; it is a human relationship. (Aron, 2003: 47)
Here, as in other contexts, it becomes clear how strongly he is oriented towards Weber, from methodological questions to questions of international politics. Like Weber, he considered ethics of conviction to be incompatible with the laws of international politics (Aron, 1990: 49). He was by no means uncritical of Weber. In his lecture at the 1964 Heidelberg Congress, he raised the question of whether Weber ‘was not slipping towards some kind of nihilism’ by ‘setting the power-interests of the German nation as the ultimate goal’ (Aron, 1971: 98). Aron was not the only one who criticized a discrepancy between Weber’s national stance and his scientific positions (ibid.: 100). It was very common at that time to label Weber’s political positions as time-related and to separate them from the social science work. Thus, Aron ended his Heidelberg lecture with the statement that Weber had left us ‘an inheritance undiminished by the mistakes of the theoretician of Machtpolitik’ (ibid.: 100). He had thus found a formula that would enable later Weberians to put nationalist positions in the shadow of the master constructions. Having already played an important role for pioneers of IR theory such as Aron or Morgenthau, Weber also occupies a prominent position in the current IR discussion (cf. Lebow, 2017a). Not only his concept of
power and his view of the political, but also his understanding of the gradual character of statehood have prevailed among many theorists of international relations (see Lebow, Chapter 75, this Handbook). It is indeed methodologically advantageous to understand statehood as a gradual phenomenon. Early versions of this view can already be found in J. P. Nettl’s 1968 essay ‘The State as a Conceptual Variable’, which initially received little attention (Nettl, 1968), while Christopher Clapham was later able to establish the concept of ‘Degrees of Statehood’ (Clapham, 1998). Daniel Lambach rightly points out that it was Weber who made ‘an understanding of the state as a variable’ possible (Lambach et al., 2016: 21). Based on a gradual understanding of statehood, as introduced by Weber, it is possible to develop typologies of strong and weak statehood, to reconstruct state-building processes or to establish a ‘State Fragility Index’ (Marshall and Cole, 2014), which shows that weak statehood is primarily due to a perforated monopoly on violence, and that the strong Weberian state is the exception rather than the rule. A look at the global situation of political communities reveals a very heterogeneous picture: weak orders and failed states exist alongside stable political communities with a high degree of statehood. From a Weberian perspective, a policy of active state-building is crucial for many areas, since without statehood, a legitimate political system cannot emerge. It is not without reason that contemporary advocates of state-building policy refer to Weber (Fukuyama, 2005; Lemay-Hébert, 2013).
Political Thinking, Political Education
A central question in Weber’s political thinking was taken up only late in political science: how to create a reservoir of genuine, responsible leading politicians (leaders) and how to educate the Germans to political thinking (Weber, 1994 [1918]: 143). Scaff first explicitly investigated the relation between
‘Politics and Political Education’ in Weber: ‘for Weber fulfilment of both the scientific and political vocations required the cultivation of the role of “political educator”’ (Scaff, 1973: 128). Political education and life conduct are interrelated. Weber’s work on Protestant ethics and Puritan sects had a political dimension (Scaff, 1973). The conception of Weber as a political educator was then introduced into political science by Hennis (2000b: 85ff.). Hennis relates Weber’s ‘sociology’ to the tradition of classical political science. The ‘social orders and powers’ are ‘educational powers’, as are the formative realities and institutions of modern society: press, associations, factory work, universities, political parties, constitutional forms (Hennis, 2000b: 89). Hennis ranks Weber among the great educators, in the sense of nomothetes: Plato, Rousseau and Tocqueville. Scaff described Hennis’ search for the centre of Weber’s work as ‘in part an odyssey of discovery, a renewal of inquiry into a missing tradition of thinking about politics’ (Scaff, 2013: 320). A ‘political thought’ understood in this sense is not a theoretical enterprise, ‘a matter of “theorizing” in the abstract. […] Instead it is an invitation to dissect, diagnose, reflect upon and understand the conditions and relations of a state of affairs that is historically given to us’ (ibid.: 316). For Hennis, in this sense, Weber’s ‘Parliament and Government in Germany’ ‘has always been the pattern of a political-scientific situation analysis: what historical factors had shaped the situation, what problems did they raise, where did the tendency of further development go, where was it to go’ (Hennis, 2000 [1998]: 401).
The Future of Weberian Political Science
If one follows the development of political science over the last 100 years, then one can speak of the formation of a Weberian
tradition, even if some of the manifold lines of tradition do not really go back to the core of Weberian theories; sometimes they are only supposed or misunderstood traditions. What initially articulated itself only tentatively in small circles and in individual publications during the Weimar period experienced a sudden setback in Germany in 1933. But on the other side of the Atlantic, German emigrants developed a vital and influential reception, which at the same time also started in some European countries. In the post-war period, it had, albeit at first hesitantly, an impact back on Germany. Today, the Weberian tradition is no longer tied to individual places or centres, but has developed into an international, global community, from Berkeley to Tokyo. Weber’s concepts are used to discuss the phenomenon of rationalized modernity, its advantages and dark sides. If there is a leitmotif that emerges again and again, then it is that of realism. It was no coincidence that the Weberian tradition initially asserted itself in the various directions of political realism. Weber himself had coined the keyword ‘reality science’, which was immediately taken up by the first generation of Weberians in Weimar. Heller adopted the concept programmatically. This marks the beginning of a tradition that runs like a red thread through political science, from Loewenstein, Morgenthau and Aron to Hennis, New Institutionalism and today’s theory of International Relations. As far as Weber’s worldwide impact is concerned, his concept of ‘orders of life’, which Hennis so enduringly brought back into awareness, is particularly relevant. Weber’s theme, the tensions between the orders of life, provides an important starting point for reception in cultures as diverse as those of Japan and America. One cannot predict how the Weberian tradition will develop in political science in the future.
In the past, developments have always been unpredictable, and there is much to suggest that this will not be any different in the future. It would be desirable to strengthen communication between political science on
the one hand and Max Weber research, which is based on several disciplines, on the other.
Notes
1 On the dispute over Mommsen’s thesis, cf. Bruhns (2009).
2 Crozier even claimed that Weber had eliminated the dimension of power from his analysis (Crozier, 1961: 34).
References
Anter, Andreas, 2014: Max Weber’s Theory of the Modern State, Basingstoke: Palgrave Macmillan.
Anter, Andreas, 2019: ‘The Modern State and its Monopoly on Violence’, in: Edith Hanke, Lawrence Scaff, Sam Whimster (eds.), The Oxford Handbook of Max Weber, Oxford: Oxford University Press, 227–236.
Arndt, Adolf, 1971 [1965]: ‘Discussion on Max Weber and Power Politics’, in: Otto Stammer (ed.), Max Weber and Sociology Today, trans. Kathleen Morris, Oxford: Blackwell, 127–132.
Aron, Raymond, 1935: La sociologie allemande contemporaine, Paris: Presses Universitaires de France.
Aron, Raymond, 1967: Main Currents in Sociological Thought, Vol. II, New York: Basic Books.
Aron, Raymond, 1971: ‘Max Weber and Power Politics’, in: Otto Stammer (ed.), Max Weber and Sociology Today, trans. Kathleen Morris, Oxford: Blackwell, 83–100.
Aron, Raymond, 1990: Memoirs: Fifty Years of Political Reflection, New York: Holmes & Meier.
Aron, Raymond, 2003: Peace and War: A Theory of International Relations (1966), New Brunswick, NJ: Transaction.
Bach, Daniel C., Mamoudou Gazibo (eds.), 2012: Neopatrimonialism in Africa and Beyond, London/New York: Routledge.
Beetham, David, 1985: Max Weber and the Theory of Modern Politics, Cambridge: Polity Press.
Behrmann, Günter C., 1998: ‘Die Verselbständigung der Wissenschaft von der Politik’, in: Karl Acham, Knut Wolfgang Nörr, Bertram Schefold (eds.), Erkenntnisgewinne, Erkenntnisverluste, Stuttgart: Franz Steiner, 443–478.
Benz, Arthur, 2013: ‘Ein Gegenstand auf der Suche nach einer Theorie’, in: Andreas Voßkuhle, Christian Bumke, Florian Meinel (eds.), Verabschiedung und Wiederentdeckung des Staates im Spannungsfeld der Disziplinen, Berlin: Duncker & Humblot, 59–79.
Bogumil, Jörg, Stephan Grohs, Sabine Kuhlmann, 2006: ‘Ergebnisse und Wirkungen kommunaler Verwaltungsmodernisierung in Deutschland’, in: Jörg Bogumil, Werner Jann, Frank Nullmeier (eds.), Politik und Verwaltung, Wiesbaden: VS Verlag, 151–184.
Brecht, Arnold, 1937: ‘Bureaucratic Sabotage’, The ANNALS of the American Academy of Political and Social Science 189, 48–57.
Brecht, Arnold, 1959: Political Theory. The Foundations of Twentieth-Century Political Thought, Princeton, NJ: Princeton University Press.
Breuer, Stefan, 1991: ‘Rational Domination. A Category of Max Weber’, Law and State 44, 92–125.
Breuer, Stefan, 1994: Bürokratie und Charisma. Zur politischen Soziologie Max Webers, Darmstadt: Wissenschaftliche Buchgesellschaft.
Breuer, Stefan, 1998: Der Staat. Entstehung, Typen, Organisationsstadien, Reinbek: Rowohlt.
Breuer, Stefan, 2006: ‘Patrimonialismus’, in: id., Max Webers tragische Soziologie, Tübingen: Mohr Siebeck, 80–91.
Bruhns, Hinnerk, 1996: ‘Stato, economia e società: Otto Hintze e Max Weber’, in: Beatrice de Gerloni (ed.), Problemi e metodi della storiografia tedesca contemporanea, Torino: Einaudi, 209–233.
Bruhns, Hinnerk, 2009: ‘Max Weber et le politique: retour sur l’œuvre de Wolfgang J. Mommsen’, in: Hinnerk Bruhns, Patrice Duran (eds.), Max Weber et le politique, Paris: LGDJ/Lextenso éditions, 31–46.
Clapham, Christopher, 1998: ‘Degrees of Statehood’, Review of International Studies 24(2), 143–157.
Crozier, Michel, 1961: ‘De la bureaucratie comme système d’organisation’, European Journal of Sociology/Archives Européennes de Sociologie 2(1), 18–50.
Dunn, William N., David Y. Miller, 2007: ‘A Critique of the New Public Management and the Neo-Weberian State’, Public Organization Review 7(4), 345–358.
Duran, Patrice, 2011: ‘La bureaucratie a-t-elle un avenir?’, in: Charles-Henry Cuin, Patrice Duran (eds.), Le travail sociologique. Du concept à l’analyse, Paris: Presses de l’université Paris-Sorbonne, 119–131.
Duran, Patrice, 2019: ‘Entre conflit et entente: La théorie wébérienne de la légitimité comme théorie générale du politique’, Revue européenne des sciences sociales 57(1), 43–75.
Eisenstadt, Shmuel N. (ed.), 1968: Max Weber on Charisma and Institution Building. Selected Papers, Chicago/London: University of Chicago Press.
Eisenstadt, Shmuel N., 1973: Traditional Patrimonialism and Modern Neopatrimonialism, Beverly Hills/London: Sage.
Erdmann, Gero, Ulf Engel, 2007: ‘Neopatrimonialism Reconsidered: Critical Review and Elaboration of an Elusive Concept’, Commonwealth & Comparative Politics 45(1), 95–119.
Fauré, Yves-André, Jean-François Médard, 1995: ‘L’État-business et les politiciens entrepreneurs. Néo-patrimonialisme et Big men: économie et politique’, in: Stephen Ellis, Yves-André Fauré (eds.), Entreprises et entrepreneurs africains, Paris: Karthala; Orstom, 289–290.
Fraenkel, Ernst, 1941: The Dual State: A Contribution to the Theory of Dictatorship, New York/London: Oxford University Press.
Fraenkel, Ernst, 1961: ‘Geleitwort’, in: Karl Loewenstein (ed.), Beiträge zur Staatssoziologie, Tübingen: Mohr Siebeck, ix–xvi.
Friedrich, Carl J., 1961: ‘Political Leadership and the Problem of the Charismatic Power’, The Journal of Politics 23(1), 3–24.
Fukuyama, Francis, 2005: State-Building: Governance and World Order in the Twenty-First Century, Ithaca, NY: Cornell University Press.
Gerth, Hans, 1940: ‘The Nazi Party: Its Leadership and Composition’, American Journal of Sociology 45(4), 517–541.
Habermas, Jürgen, 1971 [1965]: ‘Discussion on Value-freedom and Objectivity’, in: Otto Stammer (ed.), Max Weber and Sociology Today, trans. Kathleen Morris, Oxford: Blackwell, 59–66.
Heller, Hermann, 1934: Staatslehre, 6th ed. 1983, Tübingen: Mohr Siebeck.
Hennis, Wilhelm, 1959: ‘Zum Problem der deutschen Staatsanschauung’, Vierteljahreshefte für Zeitgeschichte 7, 1–23.
Hennis, Wilhelm, 1987: Max Webers Fragestellung. Studien zur Biographie des Werks, Tübingen: Mohr Siebeck.
Hennis, Wilhelm, 2000 [1998]: ‘Politikwissenschaft als Beruf’, in: id., Regieren im modernen Staat, Tübingen: Mohr Siebeck, 381–415.
Hennis, Wilhelm, 2000a: Max Weber’s Central Question, trans. Keith Tribe, 2nd ed., Newbury: Threshold Press.
Hennis, Wilhelm, 2000b: Max Weber’s Science of Man, trans. Keith Tribe, Newbury: Threshold Press.
Hennis, Wilhelm, 2003: Max Weber und Thukydides, Tübingen: Mohr Siebeck.
Hennis, Wilhelm, 2009: ‘Legitimacy: On a Category of Civil Society’, in: id., Politics as a Practical Science, trans. Keith Tribe, Basingstoke: Palgrave Macmillan, 77–120.
Herbst, Ludolf, 2010: Hitlers Charisma. Die Erfindung eines deutschen Messias, Frankfurt/M.: Fischer.
Hermes, Siegfried, 2004: ‘Vom politischen Traditionalismus zum ökonomischen Rationalismus. Kapitalistische Wirtschaft und patrimoniale Herrschaft bei Max Weber’, Archiv für Kulturgeschichte 86(1), 179–213.
Hübinger, Gangolf, Jürgen Osterhammel, Wolfgang Welz, 1990: ‘Max Weber und die Wissenschaftliche Politik nach 1945’, Zeitschrift für Politik 37(2), 181–204.
Isensee, Josef, 2003: ‘Die vielen Staaten in der einen Welt – eine Apologie’, Zeitschrift für Staats- und Europawissenschaften 1(1), 7–31.
Kelly, Duncan, 2008: The State of the Political: Conceptions of Politics and the State in the Thought of Max Weber, Carl Schmitt and Franz Neumann, 2nd ed., Oxford: Oxford University Press.
Keohane, Robert and Joseph Nye, 2012: Power and Interdependence, 4th ed., Boston: Pearson.
Lambach, Daniel, 2008: Staatszerfall und regionale Sicherheit, Baden-Baden: Nomos.
Lambach, Daniel, Eva Johais, Markus Bayer, 2016: Warum Staaten zusammenbrechen, Wiesbaden: Springer.
Landshut, Siegfried, 1929: Kritik der Soziologie: Freiheit und Gleichheit als Ursprungsproblem der Soziologie, München/Leipzig: Duncker & Humblot.
Landshut, Siegfried, 1930: ‘Max Webers geistesgeschichtliche Bedeutung’, in: id., Zur Kritik der Soziologie und andere Schriften zur Politik, Neuwied/Berlin: Luchterhand 1969, 119–130.
Laqueur, Walter, 2017: A History of Terrorism, exp. ed., New York: Routledge.
Lebow, Richard Ned, 2017a: ‘Introduction’, in: id. (ed.), Max Weber and International Relations, Cambridge: Cambridge University Press, 1–7.
Lebow, Richard Ned, 2017b: ‘Max Weber and International Relations’, in: id. (ed.), Max Weber and International Relations, Cambridge: Cambridge University Press, 10–39.
Lemay-Hébert, Nicolas, 2013: ‘Rethinking Weberian Approaches to Statebuilding’, in: David Chandler, Timothy D. Sisk (eds.), The Routledge Handbook of International Statebuilding, London/New York: Routledge, 3–14.
Lemay-Hébert, Nicolas, Nicholas Onuf, Vojin Rakić, 2014: ‘Introduction: Disputing Weberian Semantics’, in: Nicolas Lemay-Hébert, Nicholas Onuf, Vojin Rakić, Petar Bojanić (eds.), Semantics of Statebuilding: Language, Meanings and Sovereignty, New York: Routledge, 1–18.
Lepsius, M. Rainer, 1993: ‘Das Modell der charismatischen Herrschaft und seine Anwendbarkeit auf den “Führerstaat” Adolf Hitlers’, in: id., Demokratie in Deutschland: Soziologisch-historische Konstellationsanalysen, Göttingen: Vandenhoeck & Ruprecht, 95–118.
Lepsius, M. Rainer, 2017: ‘Max Weber’s Concept of Charismatic Authority and its Applicability to Adolf Hitler’s “Führerstaat”’, in: id., Max Weber and Institutional Theory, ed. Claus Wendt, Springer Switzerland, 2017, 89–109.
Loewenstein, Karl, 1920: ‘Persönliche Erinnerungen an Max Weber’, in: René König, Johannes Winckelmann (eds.), Max Weber zum Gedächtnis, Köln/Opladen: Westdeutscher Verlag, 1963, 48–52.
Loewenstein, Karl, 1952: ‘Verfassungsrecht und Verfassungsrealität’, in: id., Beiträge zur Staatssoziologie, Tübingen: Mohr Siebeck, 1961, 430–480.
Loewenstein, Karl, 1955: ‘Über das Verhältnis von politischen Ideologien und politischen Institutionen’, in: id., Beiträge zur Staatssoziologie, Tübingen: Mohr Siebeck, 1961, 245–270.
Loewenstein, Karl, 1957: Political Power and the Governmental Process, Chicago: University of Chicago Press.
Löwith, Karl, 1939–1940: ‘Max Weber und seine Nachfolger’, Maß und Wert 3, 166–176.
Mannheim, Karl, 1929: Ideology and Utopia, London/New York: Routledge, 1963.
Marcuse, Herbert, 1971 [1965]: ‘Industrialization and Capitalism’, in: Otto Stammer (ed.), Max Weber and Sociology Today, trans. Kathleen Morris, Oxford: Blackwell, 133–151.
Marshall, Monty G., Benjamin R. Cole (eds.), 2014: Global Report 2014: Conflict, Governance and State Fragility, Vienna, VA: Center for Systemic Peace.
Mayntz, Renate, 1965: ‘Max Webers Idealtypus der Bürokratie und die Organisationssoziologie’, Kölner Zeitschrift für Soziologie und Sozialpsychologie 17, 493–502.
Médard, Jean-François, 1990: ‘L’Etat Patrimonialisé’, Politique Africaine 39, 25–36.
Médard, Jean-François, 1998: ‘Postface’, in: Jean-Louis Briquet, Frédéric Sawicki (eds.), Le clientélisme politique dans les sociétés contemporaines, Paris: Presses Universitaires de France, 311–316.
Meier, Frank, Uwe Schimank, 2014: ‘Bürokratie als Schicksal? – Max Webers Bürokratiemodell im Lichte der Organizational Studies’, in: Hans-Peter Müller, Steffen Sigmund (eds.), Max Weber-Handbuch, Stuttgart/Weimar: J. B. Metzler, 354–360.
Merkel, Wolfgang, 2013: ‘Staatstheorie oder Demokratietheorie’, in: Andreas Voßkuhle, Christian Bumke, Florian Meinel (eds.), Verabschiedung und Wiederentdeckung des Staates im Spannungsfeld der Disziplinen, Berlin: Duncker & Humblot, 285–305.
Mommsen, Wolfgang J., 1974: The Age of Bureaucracy: Perspectives on the Political Sociology of Max Weber, Oxford: Blackwell.
Mommsen, Wolfgang J., 1984: Max Weber and German Politics 1890–1920, trans. Michael S. Steinberg, Chicago: University of Chicago Press (1st German ed. 1959).
Morgenthau, Hans J., 1948: Politics Among Nations: The Struggle for Power and Peace, New York: Alfred A. Knopf.
Morgenthau, Hans J., 1977: ‘Fragment of an Intellectual Autobiography: 1928–32’, in: Kenneth W. Thompson, Robert J. Myers (eds.), A Tribute to Hans J. Morgenthau [Truth and Tragedy], Washington: New Republic Book Co., 1–17.
Münkler, Herfried, 2006: ‘Die selbstbewußte Mittelmacht’, Merkur 60, 847–858.
Nettl, J. P., 1968: ‘The State as a Conceptual Variable’, World Politics 20(4), 559–592.
Neumann, Franz, 1944: Behemoth: The Structure and Practice of National Socialism 1933–1944, 2nd ed., New York/London: Oxford University Press.
Neumann, Franz, 1953: ‘The Social Sciences’, in: id., et al., The Cultural Migration: The European Scholar in America, Philadelphia: University of Pennsylvania Press, 4–26.
Neumann, Sigmund, 1932: Die deutschen Parteien, Berlin: Junker und Dünnhaupt.
Nye, Joseph S., 2005: Soft Power: The Means to Success in World Politics, New York: Public Affairs.
Nyomarkay, Joseph, 1967: Charisma and Factionalism in the Nazi Party, Minneapolis: University of Minnesota Press.
Parsons, Talcott, 1942: ‘Max Weber and the Contemporary Political Crisis. I. The Sociological Analysis of Power and Authority Structures’, The Review of Politics 4(1), 61–76.
Poggi, Gianfranco, 2010: The State, 4th printing, Cambridge: Polity Press.
Roth, Guenther, 1968: ‘Personal Rulership, Patrimonialism, and Empire-Building in the New States’, World Politics 20(2), 194–206.
Roth, Guenther, 1987: Politische Herrschaft und persönliche Freiheit, Frankfurt/M.: Suhrkamp.
Scaff, Lawrence A., 1973: ‘Max Weber’s Politics and Political Education’, American Political Science Review 67(1), 128–141.
Scaff, Lawrence A., 2011: Max Weber in America, Princeton, NJ/Oxford: Princeton University Press.
Scaff, Lawrence A., 2013: ‘Wilhelm Hennis, Max Weber and the Charisma of Political Thinking’, in: Andreas Anter (ed.), Wilhelm Hennis’ politische Wissenschaft, Tübingen: Mohr Siebeck, 307–325.
Scaff, Lawrence A., 2014: Weber and the Weberians, Basingstoke: Palgrave Macmillan.
Shils, Edward, 1965: ‘Charisma, Order, and Status’, American Sociological Review 30(2), 199–213.
Skocpol, Theda, 1985: ‘Bringing the State Back In: Strategies of Analysis in Current Research’, in: Peter B. Evans, Dietrich Rueschemeyer, Theda Skocpol (eds.), Bringing the State Back In, Cambridge: Cambridge University Press, 3–37.
Söllner, Alfons, 1996: ‘From Public Law to Political Science? The Emigration of German Scholars after 1933 and their Influence on the Transformation of a Discipline’, in: Mitchell G. Ash, Alfons Söllner (eds.), Forced Migration and Scientific Change: Emigré German-Speaking Scientists and Scholars after 1933, Cambridge: Cambridge University Press, 246–272.
Söllner, Alfons, 2006: Fluchtpunkte: Studien zur politischen Ideengeschichte des 20. Jahrhunderts, Baden-Baden: Nomos.
Stammer, Otto (ed.), 1971 [1965]: Max Weber and Sociology Today, trans. Kathleen Morris, Oxford: Blackwell.
Strauss, Leo, 1953: Natural Right and History, Chicago: University of Chicago Press.
Theobald, Robin, 1982: ‘Patrimonialism’, World Politics 34(4), 548–559.
Utz, Richard, 2014: ‘Charisma’, in: Hans-Peter Müller, Steffen Sigmund (eds.), Max Weber-Handbuch, Stuttgart/Weimar: J. B. Metzler, 42–46.
Vincent, Jean-Marie, 1998: Max Weber ou la démocratie inachevée, Paris: Éditions du Félin.
Voegelin, Eric, 1952: The New Science of Politics: An Introduction, Chicago/London: University of Chicago Press.
vom Hau, Matthias, 2015: ‘State Theory: Four Analytical Traditions’, in: Stephan Leibfried, Evelyne Huber, Matthew Lange, Jonah D. Levy, Frank Nullmeier, John D. Stephens (eds.), The Oxford Handbook of Transformations of the State, Oxford: Oxford University Press, 131–151.
Weber, Max, 1930 [1904/05, 2nd ed. 1920]: The Protestant Ethic and the Spirit of Capitalism, trans. Talcott Parsons, London: Allen and Unwin.
Weber, Max, 1958: Gesammelte politische Schriften, ed. Johannes Winckelmann, 2nd ed., Tübingen: Mohr Siebeck.
Weber, Max, 1984 [1918]: ‘Parlament und Regierung im neugeordneten Deutschland. Zur politischen Kritik des Beamtentums und Parteiwesens’, in: id., Zur Politik im Weltkrieg. MWG I/15, ed. Wolfgang J. Mommsen with Gangolf Hübinger, Tübingen: Mohr Siebeck, 421–596.
Weber, Max, 1989: Die Wirtschaftsethik der Weltreligionen. Konfuzianismus und Taoismus. Schriften 1915–1920. MWG I/19, ed. Helwig Schmidt-Glintzer with Petra Kolonko, Tübingen: Mohr Siebeck.
Weber, Max, 1994 [1895]: ‘The Nation State and Economic Policy’, in: id., Political Writings, ed. Peter Lassman, Ronald Speirs, Cambridge: Cambridge University Press, 1–28.
Weber, Max, 1994 [1918]: ‘Parliament and Government in Germany under a New Political Order’, in: id., Political Writings, ed. Peter Lassman, Ronald Speirs, Cambridge: Cambridge University Press, 130–271.
Weber, Max, 2004 [1904]: ‘The “Objectivity” of Knowledge in Social Science and Social Policy’, in: Sam Whimster (ed.), The Essential Weber, London: Routledge, 359–404.
Weber, Max, 1921: Gesammelte politische Schriften, ed. Marianne Weber, München: Drei Masken Verlag.
Weber, Max, 1992: Wissenschaft als Beruf 1917/1919 – Politik als Beruf 1919. MWG I/17, ed. Wolfgang Mommsen and Wolfgang Schluchter with Birgitt Morgenbrod, Tübingen: Mohr Siebeck.
Weber, Max, 2004 [1922a]: ‘Basic Sociological Concepts’, in: Sam Whimster (ed.), The Essential Weber, London: Routledge, 311–358.
Weber, Max, 2004 [1922b]: ‘The Three Pure Types of Legitimate Rule’, in: Sam Whimster (ed.), The Essential Weber, London: Routledge, 133–145.
Weber, Max, 2005: Wirtschaft und Gesellschaft. Herrschaft. MWG I/22-4, ed. Edith Hanke with Thomas Kroll, Tübingen: Mohr Siebeck.
Weber, Max, 2013: Wirtschaft und Gesellschaft. Soziologie. Unvollendet 1919–1920. MWG I/23, ed. Knut Borchardt, Edith Hanke, Wolfgang Schluchter, Tübingen: Mohr Siebeck.
Zingerle, Arnold, 1983: ‘Max Webers Analyse des chinesischen Präbendalismus: Zu einigen Problemen der Verständigung zwischen Soziologie und Sinologie’, in: Wolfgang Schluchter (ed.), Max Webers Studie über Konfuzianismus und Taoismus. Interpretation und Kritik, Frankfurt/M.: Suhrkamp, 174–201.
PART II
Methods
15
The Survival and Adaptation of Area Studies
Rudra Sil
Most of what is truly useful for policy is context-specific, culture-bound, and non-generalizable.
Francis Fukuyama (2005: 22)
Introduction
Within the framework of the humanities, the significance of ‘area studies’ is largely unproblematic. The subject matter – be it the history, literature, art, or culture of a particular region – is presumed to be worthy of study in its own right. There are surely debates over the objectives and methods of an inquiry, with some more partial to critique or deconstruction and others more concerned with evoking the richness of human experience within a given society. Even so, area specialists housed in a humanities discipline typically feel no special need to justify their investments in expertise on a given country or region. The status of area studies within the social sciences, by contrast, has grown more tenuous over the past quarter-century.
Area specialists appointed in social science disciplines must contend with simultaneously engaging two kinds of scholarly communities, one representing the discipline or one of its subfields and one defined in terms of an abiding interest in a geographic region. The problem stems largely from the growing gap in assumptions about which skill-sets are most crucial and what constitutes ‘good’ or ‘useful’ scholarship. An area specialist’s efforts to generate social scientific knowledge on a given country or region are likely to run up against questions about whether and how that knowledge speaks to general theories or matches up with methodological ‘best practice’ within one’s home discipline. This chapter is concerned with the trajectory of area studies in relation to the discipline of political science. It begins with a short history of the emergence and evolution of area studies, stressing in particular the long shadow cast by the Cold War. The second section turns to some important shifts and challenges that have emerged in the wake of the Cold War, both in terms of the resources available to area
studies research and methodological currents in the discipline of political science. The section also examines the different ways in which area studies scholarship has survived as part of a more globalized political science within the United States, Europe, and other parts of the world. The next section addresses some of the problematic aspects of conceptualizing and demarcating ‘areas’, especially to cope with the more fluid processes and global challenges that have emerged in the past two decades. As the geopolitical agendas and theoretical frameworks of the Cold War era recede into the past, some of the newer intellectual and methodological currents that have taken root in the discipline are serving to intensify the trade-off between investing in approaches and theories touted within political science and accumulating contextual knowledge about socially constructed spaces called ‘areas’. The fourth section addresses efforts by scholars in comparative politics to manage both the methodological and practical dimensions of this trade-off. The section considers new ways to frame the contributions of single-area research so as to better resonate with disciplinary trends as well as the emergence of new rationales and designs for cross-regional qualitative research, including comparative area studies (CAS), qualitative comparative analysis (QCA), and sub-national comparisons within and across areas. The chapter concludes by noting that area studies, viewed from a global and long-term perspective, have not only survived and adapted but will likely remain a crucial element of the field of comparative politics.
Area Studies and Political Science in the Shadow of the Cold War
At the start of the 20th century, empirical research by Western scholars on non-Western societies was largely confined to historians delving into archives, or anthropologists
immersing themselves in distant lands, often isolated communities or colonized societies. The ‘father of modern anthropology’, Bronislaw Malinowski, published his study of the islands of Melanesia, Argonauts of the Western Pacific, in 1922; it came to epitomize the ethnographic study of the beliefs, rituals, and social relations of faraway communities. Although later anthropologists would criticize or refine Malinowski’s methods and interpretations, including his findings on Melanesia (Carrier, 1992), ethnographic practices would shape the general notion in sociology and political science that extensive field research was a necessary part of efforts to better understand foreign countries and areas. But it was only after World War II that area studies grew into a core component of the discipline of political science, primarily within the subfield of comparative politics. To a large extent, this development was propelled by the sense that both the building of a stable post-war order and the containment of communism depended on deepening our understanding both of the main ‘enemy’ – the Soviet bloc – and of the growing ranks of newly decolonized sovereign states. In short, it was within the context of navigating the Cold War that area studies centers and institutes began to proliferate throughout the United States, Western Europe, and later, the Soviet bloc as well. In the United States, the growing demand for knowledge among government agencies, such as the State and Defense Departments and the Central Intelligence Agency, came to be met by new streams of federal funding, notably the Fulbright Program enacted in 1946 and Title VI of the Higher Education Act – which grew out of the National Defense Education Act of 1958 – supported by programs set up by leading philanthropic organizations such as the Ford and Rockefeller Foundations.
This set the stage for massive investments in learning foreign languages, deepening the understanding of previously less familiar societies, and developing frameworks for tracking economic, social, and
political transformations in particular states. Two academic bodies founded in the aftermath of World War I – the Social Science Research Council and the American Council of Learned Societies – helped coordinate the activities and funding programs of universities, foundations, and government programs (Szanton, 2004). Across Western Europe, the leading universities became a natural focus for concentrating resources that could be used by rapidly burgeoning communities of researchers devoted to building up stocks of knowledge across the humanities and social sciences, spanning language study, historical research, and cultural studies, as well as analysis of the politics, society, and economic development of specific countries and regions. The University of Oxford’s School of Global and Area Studies1 developed graduate-level programs of study on entire continents (in the case of African Studies and Latin American Studies) alongside programs focused on either single countries (e.g. Japanese Studies) or sub-continental regions where one country clearly stood out (Modern South Asian Studies, and Soviet and East European Studies). In both the United States and Western Europe, national and international associations worked to promote research on various areas of the world, establish interdisciplinary area studies journals, and bring together scholars from different disciplines to area studies conferences each year. On the other side of the iron curtain, the impetus came from a Soviet leadership eager to carefully monitor trends in the West while increasingly seeking engagement with post-colonial countries in search of new candidates that might join, or at least cooperate with, the communist bloc. The most prominent area studies think tanks were the institutes set up as part of the Academy of Sciences of the USSR (now, once again, the Russian Academy of Sciences), a vast multidisciplinary organization built around the core of the Tsarist-era Russian Academy of
Sciences (Zhuk, 2017). The largest of these institutes, the Institute of Oriental Studies, was first established in 1818. During the Cold War, the institute produced research focused on the countries of Asia and North Africa. The Cold War saw greater attention to the Americas, with the founding of the Institute of Latin American Studies in 1961 followed by the founding of the Institute of the USA and Canada in 1967. These institutes, as well as numerous other research centers and think tanks, accounted for thousands of researchers in the USSR (Gottemoeller and Langer, 1983) and in other Eastern-bloc countries. Many of these area-focused institutes and centers remain active as part of reorganized academic bodies such as the Russian and Polish Academy of Sciences. What is interesting to note is that, despite differences among particular institutions and funding sources across regions, the overall organization and content of area studies scholarship during the period of the Cold War consistently reflected an unflinching interdisciplinarity. In fact, this interdisciplinary character of area studies, far from being a tangential attribute or a problem for cumulating disciplinary knowledge, was viewed as a distinctive asset for scholars housed in any of the social science disciplines. Political scientists, sociologists, and anthropologists regularly joined area studies associations and attended area studies conferences alongside scholars representing the humanities, and their pursuit of area-specific knowledge did not detract from their status within their home disciplines. Perhaps the persistent theme of the (de)construction of ‘modernity’ in the humanities and the growing interest in problems of ‘modernization’ across the social sciences helped to make narratives about continuity and change in various contexts and regions intelligible across disciplines. 
In the United States, modernization theory provided a common analytic framework and theoretical vocabulary for social scientists analyzing the trajectories of political and economic development in different regions
of the world (Stevens et al., 2018). For better or worse, modernization theory also provided an overarching theoretical foundation that made deep, multifaceted interdisciplinary knowledge about countries and regions seem reliable and useful to Western policymakers seeking clues about how to contain the spread of communism and to respond to the varied developmental imperatives across the so-called “Third World” (Packenham, 1973). With the end of the Cold War, area studies centers and associations started to become the sites for new debates over historiography, methodology, culture, politics, and economics. These debates began to take off in directions not anticipated in the initial programs of government agencies and foundations to build up policy-relevant knowledge (Mosley, 2009). These debates would pull area specialists within the humanities and social sciences in very different directions. Within the humanities, a growing segment within area studies communities came to see their role as one of ‘deparochializing US- and Euro-centric visions of the world in the core social science and humanities disciplines, among policymakers, and in the public at large’ (Szanton, 2004: 2). Building on, or perhaps chastised by, Edward Said’s (1978) rebuke of ‘Orientalism’, this strand of thought called for a fundamental reorientation of Western area studies scholarship so that it would reject or transcend the binaries of ‘us’ and ‘them’ underpinning overt and hidden relationships of domination between the West and the developing world (Sidaway, 2013). In historiography, this critique inspired a shift from a focus on national elites to a deeper consideration of the experiences and perspectives of ‘subalterns’, particularly in post-colonial countries (Chakrabarty, 2000).
Importantly, these pointed critiques of past scholarship implied not a rejection of area studies but a more expansive and open-ended engagement with diverse social groups and contending intellectual currents within the countries and areas being studied. A different set of challenges would be faced by social scientists, particularly
political scientists, who considered themselves to be area specialists. For scholars and policymakers alike, the end of the Cold War represented a critical turning point. The Cold War had provided the background condition for many of the arguments in favor of large-scale investments in the study of foreign languages, histories, and cultures, ostensibly to keep as much of the world safe from communism as possible (Clowes and Bromberg, 2015). With the Berlin Wall coming down in 1989, these heretofore substantial investments were scaled back sharply, creating new challenges for maintaining support for language training and the building of area expertise. This shift coincided with the crystallization of a more methodologically focused line of attack within political science. Although this critique was most explicitly articulated in the United States, there was a general sense that past efforts of area specialists had lacked the discipline and rigor that distinguishes the social sciences from the humanities. At least one oft-repeated indication of this shortcoming, according to some, was the presumed failure of area experts, in both scholarly and policy communities, to foresee the fall of communism and the end of the Cold War. In political science, while area-focused research was not dismissed out of hand, it was seen as useful for building disciplinary knowledge only insofar as it shed its humanistic side and was primarily motivated by the theoretical debates and methodological principles of contemporary political science (Hanson, 2009). The result of these trends was not the ejection of area studies from political science but an increasingly stark set of trade-offs, both methodological and practical, for scholars in comparative politics.
On the one hand, there was the imperative of leveraging extensive investments in area-focused expertise, fieldwork, and scholarly networks; on the other hand, there was the pressure to conform to research practices identified with ‘rigorous’ social science. This trade-off is not new, but in the context of the post-Cold War challenges faced by area studies, it has
given rise to new rationales for area-focused inquiry and new styles of within-area and cross-area comparative analysis (see below). The next section turns to the different ways through which area studies has managed to survive as part of a more globalized political science within the United States, Europe, and other parts of the world.
Adapting to Challenges: Emergent Regional Patterns
With the end of the Cold War, area studies research worldwide had to contend with new pressures and constraints. To some extent, this is the logical result of the thinning out of funding streams tied to the geopolitics of the Cold War, but it was also exacerbated by global financial crises in 1997–8 and 2008–9. There is also the impact of different intellectual currents tugging on different disciplines which put pressure on area specialists to reformulate the character and significance of their contributions to their respective disciplines. Yet the challenge has been just that – a challenge. It has not thus far led to any concerted effort to dismantle the organizational framework for area-based scholarship. Indeed, one lasting legacy of the Cold War era has been the durability of the various academic units created for the production and dissemination of area-specific knowledge (Stevens et al., 2018: 6). In addition, major events in the real world have played a crucial role in reminding political scientists and policymakers of the significance of deep contextualized knowledge about different parts of the world. Even so, in the United States, area specialists have faced mounting challenges in securing the resources needed for building language proficiency and carrying out sustained research in the field, as the government has steadily scaled back funding for area expertise except in the limited context of providing specific types of information deemed to be
reliable and useful for the purposes of policymakers and media (Clowes and Bromberg, 2015). Within political science, it is scholars in comparative politics who would be most affected by these cuts. For most areas, steady cutbacks in the Department of Education’s Title VI funding since the 1990s made it progressively more difficult to obtain federal support for language training and area expertise – except for select projects that are of high priority to US national security in a post-9/11 world (frequently focused on the Middle East and China). Additionally, in October 2013, the US State Department eliminated its Title VIII program, which had once been a massive source of funding for language training and area expertise for scholars studying Eastern Europe and the former Soviet Union (King, 2015). Despite the spirited effort to highlight the importance of area expertise by many political scientists (e.g. Fukuyama, 2004; Hanson, 2009; King, 2015; Pepinsky, 2015), area studies have remained under strain in the post-Cold War era. The strain has been magnified by a second trend: since the mid-1990s, leading political science departments in the United States have been placing less emphasis on the accumulation of area-focused knowledge in favor of methodological techniques consistent with the ‘causal inference revolution’. Driving this shift is a basic epistemic notion, most famously championed by King et al. (1994), that there exists a universal set of methodological principles and logics of inference that define rigorous scholarship in both quantitative and qualitative research. This idea is certainly not without its detractors, as evident in ongoing debates as to whether there exist distinct ‘cultures’ of quantitative and qualitative scholarship with distinct understandings of evidence and causation (Goertz and Mahoney, 2012).
Nevertheless, the growing attention to causal identification in the United States has reduced the space for stand-alone qualitative research, especially if focused on particular countries or areas. This is especially evident in flagship journals of the discipline (such as the
American Political Science Review), which rarely publish papers focused on one country or region, not counting the United States. And, at the very top political science departments in the United States, stand-alone qualitative research on a single country or area, no matter how compelling or original, is treated as lying outside the realm of ‘cutting edge’ scholarship in the discipline. At the same time, there are indications that area specialists in the United States, though facing new challenges, have managed to survive and evolve. For one, within comparative politics, where political scientists with area expertise are generally housed, the majority of articles published in the subfield’s leading journals are country- or area-focused, though increasingly incorporating at least some quantitative analysis or field experiments. Moreover, books published in comparative politics tend to be overwhelmingly and disproportionately focused on single countries or small-N studies confined to a single area (Köllner et al., 2018: 17). This has been buttressed by the growing awareness that cutbacks in area studies training and research are depleting much needed reservoirs of deep knowledge on areas where US policymakers have been confronting new conflicts and crises. As Clowes and Bromberg (2015: 2) note: ‘The world saw the impact of inconsistent funding and training, for example, in American adventures in Afghanistan and Iraq at the start of which the US government scrambled to identify well-trained language experts and reliable local and regional information’. More recently, problematic developments across the former Soviet Union – the Maidan crisis in Ukraine, the annexation of Crimea by Russia, and the steep downturn in US–Russia relations – prompted Charles King (2015) to point out the danger of ‘flying blind’ in an era where Title VIII funding is no longer available to support advanced language training and deep knowledge of a critically important part of the world. 
More broadly, Robert Gallucci, the former President of the MacArthur Foundation,
noted in a 2014 speech: ‘[T]his is a time when policymakers need more help than ever to understand the world not as an abstract set of generalities but as a finely-grained, complex, and unpredictable environment shaped by culture, language, religion, and history’. While resource constraints have also affected area studies in Europe, the value of country- or area-specific research does not appear to have diminished. This may be in part because methodological debates have been less acrimonious and stand-alone qualitative research has continued to be recognized as valuable in its own right. In fact, across much of Europe, there are indications of a continued commitment to area studies scholarship within the social sciences, with new strategies for continuing to develop area-focused training, research, and scholarship. Indeed, as the EU itself became more institutionalized and more responsive to the challenges of globalization in a post-Cold War era, pre-existing Europe-wide associations have been bolstered while new ones have been set up to pool resources and organize scholarly activities across different EU member countries. Examples of new associations include EASAS, the European Association for South Asian Studies, which began organizing Europe-wide conferences on South Asian Studies in the middle of the 1990s, and AEGIS, the Africa-Europe Group for Interdisciplinary Studies, which was set up in 1991 to expand research on Africa’s response to globalization and provide academic and policy-relevant knowledge to the Africanist institutions of the EU.2 In Britain, major schools set up long ago to study various regions of the world continue to draw distinguished scholars and support country- or area-focused research.
In addition to the aforementioned School of Global and Area Studies at Oxford, the School of Oriental and African Studies (SOAS) at the University of London remains a premier institution for the in-depth study of Africa, Asia, and the Middle East, boasting the largest staff of area experts (over 300) of any university
in the world.3 At University College London, the School of Slavonic and East European Studies (SSEES) remains the leading institution in the world focused on teaching about Russia, the Baltics, and Central and Eastern Europe. There are also a host of British area studies associations for various countries and regions, many of which have joined the United Kingdom Council for Area Studies Associations, founded in 2003. In addition, a joint initiative by the Economic and Social Research Council (ESRC), the Arts and Humanities Research Council (AHRC), and the Higher Education Funding Council for England (HEFCE) has provided funding for five new collaborative ‘Centres for Excellence in language-based Area Studies’ which are each housed at a lead institution and cover China, Russia, East Asia, Eastern Europe, and the Arabic-speaking world. This targeting of specific countries and regions may make it difficult in the future to devote resources for research on other regions, but overall, the commitment in British academia to area studies remains much more stable than is the case in the United States. In Germany, too, area studies scholarship appears to have not only survived the end of the Cold War but perhaps even progressed further, with expanded support for new graduate schools, research clusters, and collaborative networks devoted to various world regions. Since 2006, the German government has funded an ongoing competition among state universities with the express aim of creating several ‘universities of excellence’. Moreover, the German Council of Science and Humanities has also backed new programs to advance area studies, giving impetus to such initiatives as the 2009 effort by the Federal Ministry of Education and Research to enable a range of area studies centers, providing research networks to expand their capacities to conduct research within and across various world regions. 
A related program has expanded the fellowship-based collaborative research centers focused on South Asia, Latin America, China, and sub-Saharan
Africa. In addition, the German government has teamed up with private foundations to promote new think tanks and funding lines for research on various key countries and world regions. Examples include the Mercator Institute for China Studies (MERICS) and the Volkswagen Foundation’s funding initiative for research on Central Asia and the Caucasus (Köllner et al., 2018: 12). In Russia, the chaos of the post-communist transition took a toll on the production of knowledge on areas. The break-up of the Soviet Union – and along with it the USSR Academy of Sciences and various other institutes – meant that research on areas became fragmented and severely underfunded. Sharp cuts in budgets and salaries during the 1990s led thousands of Russian researchers to leave their positions in academia and/or emigrate in search of positions in other countries (Schiermeier, 2018). The situation improved markedly after the Russian economy began to rebound in 2000. In one global study of think tanks, several Russian area studies institutes rank among the top forty regional studies think tanks worldwide. Among them are the venerable Institute for the Study of USA and Canada as well as the Institute of Oriental Studies, which includes centers or departments focused on China, India, Japan, the Middle East, and Central Eurasia, among others. And, among university-affiliated regional studies centers, Moscow State University’s Institute for Asia and Africa Studies is among the world’s top thirty (McGann, 2019). More recently, the 2018 budget of the Russian government saw a 25% increase in the amount earmarked for research and development (R&D), while Russia climbed into the top ten in the number of research articles produced (Schiermeier, 2018). These shifts mark an improvement of the broader environment in which area studies research is being conducted in Russia now, at least compared to the early years of the post-Soviet transition.
The developing world has also begun to catch up in terms of both overall investments
in knowledge production and the generation of area and cross-area expertise. China has led the way and remains far ahead of the rest of the developing world or emerging economies, both in terms of the total amount invested in R&D and the rise in percentage of GDP invested in R&D, which doubled between 2001 and 2016 according to the World Bank. India and Brazil spend far less on R&D, but have moved into the world’s top ten in terms of total expenditures on R&D. This does not necessarily mean that research on different parts of the world is flourishing across the global south. But, certainly, there has been a growth in the scale of expert knowledge amassed in some developing countries at least about their own regional ‘neighborhood’. The Yusof Ishak Institute in Singapore, established in 1968 as the Institute of Southeast Asian Studies (ISEAS), has greatly expanded its visibility, activities, and resources in relation to research focused on Southeast Asia. Along similar lines, the Africa Institute of South Africa (AISA), first established in 1960, was restructured and expanded in 2001 and now purports to ‘produce some of the finest research on contemporary African Affairs by having its dedicated and highly qualified researchers conduct field research every year throughout the African continent’.4 AISA has been rated among the top fifty best-managed global think tanks (McGann, 2019: 183). In addition, while the economic trends that gave rise to the term ‘BRICs’ in 2001 are no longer evident (given the lower growth rates in Brazil and Russia over the last decade), research on BRICS (now including South Africa) has become a cottage industry in each of the member countries. Most notable among these is the BRICS Policy Center in Brazil (attached to the Pontifical Catholic University of Rio de Janeiro), which has made it to the list of the top-ten best university-affiliated think tanks (McGann, 2019: 202). 
There is also now a BRICS Think Tank Council (BTTC) that was established in 2013 to boost cooperation on BRICS-focused research being done
at major institutes or centers in each of the countries. On the whole, while research on faraway regions remains substantially underdeveloped and underfunded across the developing world, the growing scope for intensive research focused on the ‘regional neighborhoods’ of particular emerging economies or rising powers has helped to greatly expand the production of area-based knowledge worldwide.
The Concept of ‘Areas’ in Problem-Driven Research: Opportunities and Trade-Offs
Area expertise within political science has been generally thought to refer to knowledge about one or more nation-states presumed to occupy clearly demarcated geographic ‘areas’. This was not a problem for the kinds of questions that political scientists most commonly delved into within a global order dominated by the Cold War balance of power and by the pursuit of modernization by nation-states. Yet, within the actual organization of area studies research, there were always indications that what counted as an ‘area’ was not ‘natural’ or ‘fixed’. One of these indications is the fact that, alongside research focused on whole spaces that sometimes spanned entire continents, separate research communities and associations formed around the study of specific countries that were implicitly deemed worthy of focused attention. In the United States, for example, the fields of Latin American Studies or African Studies came to be accompanied by the rise of ‘Sovietology’ as a self-contained field during the Cold War – and now it is Chinese Studies that is the most rapidly growing field devoted to the study of a single country. Likewise, in China, following the reorganization of the leading area studies research institutes in the 1980s under the umbrella of the Chinese Academy of Social Sciences (CASS), the region-based Institutes
of European Studies and Asia-Pacific Studies (which now includes South Asian Studies) coexist with the single-country focused Institutes of American Studies and Japanese Studies (Sleeboom-Faulkner, 2007). The very fact that some research communities are organized around a single ‘large’ or ‘important’ country, while others are organized around spaces covering multiple religious and ethno-linguistic communities, is indicative of a long-standing issue: the concept of an ‘area’ or ‘region’, although often treated as self-evident, is in fact a reflection of geopolitical realities, particularly as experienced by those with the resources to fund and organize research in a given period. Simply put, what constitutes an ‘area’ is an imagined, socially constructed reality reflecting a mix of contingent factors (Acharya, 2014). These include availability of resources tied to policy imperatives, familiarity based on proximity or language, similarity in developmental levels, regional alliances, and geographic features (particularly where whole continents coincided with ‘areas’ to be studied). Nevertheless, ‘area’ boundaries that may have made sense for many of the questions and agendas in the Cold War era came to acquire an enduring and global significance (Stevens et al., 2018: 6). Just as the QWERTY keyboard has remained a fixture through all kinds of technological shifts, the basic organization of area studies persists even though the end of the Cold War has given rise to a different set of questions and priorities within shifting geopolitical contexts and resource environments. But this does not mean that the position of area specialists has remained unchanged, particularly within political science. One dilemma that has become increasingly apparent is how the construction of ‘areas’ might match up with exhortations to be ‘problem-driven’ in designing research. True, certain ‘areas’ and ‘problems’ are well suited to each other. 
This is obviously the case where a question is narrowly framed so as to be only relevant to particular regions,
as in a study of how sectarian divides affect political stability in the Middle East or of what economic conditions may have helped strengthen the appeal of right-wing populist parties in Europe. There is also a good match in the case of more broadly framed problems so long as the full range of variation happens within a given geographic area as a result of the intrinsic heterogeneity of that area. In the latter scenario, it makes sense to focus one’s attention on that region while designing a comparative study to construct or explore general causal inferences. For example, in his book on how contentious politics shaped authoritarian state-formation in Southeast Asia, Dan Slater (2010: 7) notes: ‘While selecting cases from a single region frequently entails selection bias, choosing cases in Southeast Asia helps avoid this inferential pitfall’. Yet, in the post-Cold War era, scholars of comparative politics have been increasingly researching a host of problems – from democratic transitions and the effects of globalization to the politics of economic reform and the rise of populism – where the distribution of relevant patterns and outcomes cuts across countries and locales that are situated on different continents and have little in common in terms of history or culture. For such broad questions, any qualitative study that purports to offer portable arguments will likely want to consider how the places they are studying match up with the range of outcomes that exists across the full population of relevant cases (Geddes, 2003). Therein lies a fundamental – and increasingly problematic – methodological trade-off with respect to the entrenched organization of institutions, research networks, journals, and associations around fixed ‘areas’. The conversations that may take place among political scientists specializing in a given area may be more ‘fluent’ given the common background knowledge and deep familiarity with locales within that area. 
But these conversations would not necessarily encompass the more widely sampled observations and modes of empirical analysis that would be necessary
to draw the attention of the discipline of political science writ large. This is actually a manifestation of a familiar dilemma for political scientists with area expertise: case selection principles derived from the basic logic of the comparative method yield very different kinds of advantages than those that emerge from cases within a single familiar area on which one has deep background knowledge. On the one hand, restricting one’s analysis to a space that one claims to have expertise on runs the risk of not having a representative sample of cases (Geddes, 2003). On the other hand, attempts to expand case selection beyond an area on which one has more in-depth knowledge and training come with their own pitfalls, and there is no guarantee that the set of cases will be subject to the same manner of treatment given the unevenness in language skills, background knowledge, and engagement with scholarly networks. Importantly, this methodological trade-off has a crucial practical dimension related to the enormous investment of time and effort required to acquire language proficiency, build a cross-disciplinary base of background knowledge, and accumulate experience in conducting research in a specific country or region. Yet the kind of skill-set required to establish credibility in the eyes of a multidisciplinary area studies community is quite different from that which will bring recognition and status from other scholars in political science, particularly those that see the discipline’s advancement as based mainly on demonstrating technical sophistication and/or theory accumulation (Sil, 2018: 226). That is, area specialists in political science, particularly in the field of comparative politics, have to balance the varying expectations of two very different kinds of audiences.
For the disciplinary audience, it may be better to prioritize methodological rigor by examining a representative sample of cases without the advantage of expert knowledge on each case (which risks losing one’s area studies audience); for the area studies audience, it would
make sense to leverage the stock of skills and knowledge acquired over years of study and research (which risks losing one’s disciplinary audience). Certainly, it is not inconceivable that, given one’s inclination and the availability of plentiful resources (including time), one could acquire expertise in new areas. One illustrious example of this is David Laitin, who was trained as an Africanist but set out to build enough area expertise to conduct fieldwork for his study of identity formation among Russian-speaking populations in four post-Soviet nations (Laitin, 1998). For most scholars, however, this is a daunting feat – one that requires the luxury of being able to redirect substantial amounts of time and energy away from regular professional obligations so as to invest anew in a new pocket of expertise. This hardly seems like a worthwhile investment, particularly if it is solely for the purpose of analyzing one additional case in the course of pursuing a single research project. This is perhaps why even those political scientists who engage in small-N comparative studies tend to choose cases within areas where they already possess the relevant expertise, particularly where a single language provides access to primary sources and fieldwork opportunities throughout the area (Köllner et al., 2018). The two areas where this pattern is most evident are Latin America (where Spanish provides access to a range of cases, leaving aside Portuguese-speaking Brazil) and the Middle East and North Africa region (where Arabic, despite the different dialects, provides access to a range of predominantly Islamic societies). For example, James Mahoney’s (2010) study of the long-term impact of Spanish colonialism employs comparative-historical analysis across a small number of cases from Spanish-speaking Latin America, cases that share certain core similarities yet combine differently with general causal forces to produce a range of trajectories for post-colonial development.
Similarly, in Middle Eastern and North African Studies, Amaney Jamal (2007) has examined
how top-down efforts affect the scope and character of civic engagement in the West Bank under the Palestinian Authority, with comparisons to similar dynamics evident in civil-society formation in Morocco (and, in less detail, in Jordan and Egypt). Such studies generate invaluable insights about their respective regions by leveraging deep knowledge and expertise, including language skills. But their research questions are necessarily limited in scope, given the focus on regionally specific attributes and outcomes. It is also worth noting that the advantages enjoyed by Arabic-speaking experts studying the Middle East or Spanish-speaking scholars studying Latin America do not extend to those researching other areas such as sub-Saharan Africa, Eastern Europe, and East and Southeast Asia. In cross-case studies within these latter regions, mastery of a single foreign language would not go very far if primary sources or immersive fieldwork are crucial to the design. Yet there are small-N studies where area expertise is assumed to be relevant and necessary even without the use of the local language for each and every case study. For example, Anna Grzymała-Busse’s (2007) Rebuilding Leviathan offers a comparative analysis of nine countries within Central and Eastern Europe in the process of analyzing different pathways to state reconstruction. This impressive study is rightly credited for demonstrating deep expertise on the region of Central and Eastern Europe – but without the author relying on language skills for each of the case studies. This is by no means a critique of the work; it is simply an illustration of a tacit understanding that one can leverage ‘area expertise’ to support cross-case comparisons within some regions (such as Eastern Europe or Southeast Asia) without necessarily deploying language skills for each case as one might expect for the Middle East and Spanish-speaking America.
This point is also crucial to understanding the relationship between area studies and cross-regional comparisons considered in the next section.
Re-positioning Area Studies, Advancing Cross-Area Studies
The challenge of balancing the standards and expectations of area studies communities with those of non-area specialists in political science is not about to disappear anytime soon. This challenge can, however, be managed through different strategies that yield distinctive payoffs – strategies that have in common a reliance on some contextual knowledge of particular areas. This knowledge cannot by itself fortify the status of area specialists within political science. It is also necessary for area specialists to take an active role in framing the knowledge they generate in relation to evolving theoretical and methodological debates in the discipline. This implies a need to be more explicit and self-conscious in describing the epistemological assumptions and methodological principles through which qualitative observations from one or more areas are interpreted in relation to general concepts and theories in political science. One common approach has been to present qualitative area-focused scholarship in the form of case studies within mixed-methods projects, usually alongside formal models and/or regression analyses (see Bergman, Chapter 26, this Handbook). The proliferation of mixed-methods research since the 1990s has been extremely rapid and has had some unanticipated consequences, including shrinking the space available for single-method qualitative research (Ahmed and Sil, 2012). At the same time, mixed-methods designs provide ‘cover’ for many scholars who remain deeply committed to studying particular areas and engaging area studies communities. For them, even if qualitative research is the primary objective, incorporating regression analyses or formal models can go a long way towards convincing non-area specialists to pay attention to the qualitative findings.
One type of mixed-methods research that has been gaining in popularity is the integration of qualitative research with field experiments (see Bassi, Chapter 22,
this Handbook). In this approach, research designs that most closely resemble laboratory experiments are seen as the most reliable path to improving causal inference; yet, in the field, area expertise is indispensable for supplying the contextual knowledge required for strong designs (Dunning, 2012). However, mixed-methods research, even where it showcases deep area expertise, is generally less likely to appeal to area studies communities than to other political scientists (many of whom will know little about the area in question). Thus, a second strategy may be preferable for political scientists who retain an abiding interest in engaging a particular area of the world. This strategy involves framing the main findings of area-focused qualitative research in a language associated with recognizable intellectual currents such as historical institutionalism and, to a lesser extent, interpretive research. Historical institutionalism has been evolving since the 1990s and has produced a sophisticated analytic toolbox – encompassing such notions as ‘process-tracing’, ‘critical junctures’, ‘contingency’, and ‘path dependence’ – for the design and presentation of qualitative research on a host of topics across the subfields of political science (Fioretos et al., 2016; Berntzen, Chapter 23, this Handbook). A somewhat smaller community of political scientists has gained recognition for the distinctive insights it has generated through interpretive styles of research such as ethnography (e.g. Schaffer, 2016; Wedeen, 2010). While those identifying with historical institutionalism are more likely to be considered ‘mainstream’ within the discipline, both of these intellectual traditions have helped to preserve some space for area specialists doing stand-alone qualitative research within political science.
These strategies, however, do not address the question of how to reconcile the pursuit of area-based knowledge with the analysis of phenomena typically manifested across broader expanses of time and space. For such problems, small-N comparisons frequently require cases drawn from different areas,
at least if they are to capture the full range of variation and trace the effects of various causal mechanisms under different settings. One example of this sort of problem is the study of the ‘resource curse’. Although the rentier states of the Middle East represent a reasonable focal point for tracing the political and economic consequences of oil rents, the top ten exporters of crude oil are situated on four separate continents within distinct institutional settings and political dynamics. Widening the range of comparable cases selected from different regions allows for a more open-ended analysis of how the ‘resource curse’, rather than invariably undermining institution-building in developing countries, can sometimes have the opposite effect depending on such intervening factors as the position of exporters in relation to the ruling coalition (Saylor, 2018). Similarly, in the comparative study of post-communist transitions, cases are frequently selected within the region of Central and Eastern Europe. This makes sense for a host of questions including, for example, the analysis of market reforms or constitutional changes related to accession to the EU by former communist countries. Yet, broader examinations of the causes and consequences of different pathways followed by former communist regimes can benefit from research designs that encompass countries in Central Asia, where post-communist politics has had to contend with the rise of Islam, as well as the East Asian regimes, where ‘communist’ party-states remain entrenched but have sought to expand the private sector and promote greater integration into the global economy (Chen and Sil, 2007). To address such problems of broad scope, there is certainly the long-standing tradition of small-N analysis designed around some version of Mill’s methods.5 This approach emphasizes the logic of the comparative method and the representativeness of selected cases. 
It does not, however, expect each of the case studies to rely upon area expertise or take heed of area-specific scholarly
debates. It is therefore worth highlighting a distinctive variant of small-N comparative analysis that is not limited to a single area but still seeks to leverage the sensibilities of a trained area specialist. This is precisely the approach touted in a recent volume on ‘comparative area studies’ (CAS), which affirms the importance of continued investments in single-country or single-area research while highlighting the benefits of cross-regional contextualized comparisons (Ahram et al., 2018). The editors recognize that trade-offs between the pursuit of deep contextualized knowledge of an area and the construction of broad causal generalizations can never be fully overcome. Nevertheless, they contend that there are distinctive intellectual gains to be had through research strategies that consciously split the difference between context-sensitive narratives that are attentive to area-specific debates and causal generalizations that depend on the quasi-experimental logic of the comparative method. In this approach, the use of comparable cases from different areas allows a researcher to set up an array of causal configurations that can yield portable inferences. At the same time, the sensibilities of an area specialist are crucial for recognizing the relevant context conditions and understanding how individual case studies relate to scholarly discourses within the relevant area studies communities. Thus, even as it seeks to generate middle-range theoretical propositions, CAS implies an active effort to identify which attributes of various spatial and temporal contexts matter in what ways for understanding how different kinds of mechanisms and processes produce a range of outcomes (Köllner et al., 2018). In practical terms, the CAS approach does not require that a researcher becomes an expert on every country or area to be analyzed.
And, it is certainly not reasonable to expect a researcher to keep learning new languages or finding new collaborators for each and every additional case in a small-N study. But, it is possible and worthwhile for anyone trained as an area expert to study cases
from a different area with an eye to regionally specific context conditions and with an awareness of how contending intellectual traditions and historiographic complexities shape discourses among the relevant area studies communities. In fact, there are a growing number of scholars who have taken on this challenge, and some of their work has been showcased in summary fashion in the aforementioned volume (Ahram et al., 2018). These studies highlight the payoffs of CAS in probing deeply into their cases and engaging area studies debates while also identifying portable concepts and causal linkages through cross-regional comparative analysis. In the process, CAS also serves an integrative function, expanding the channels of communication both between separate communities of area specialists interested in similar problems and between these communities and the discipline of political science writ large. A related strategy of cross-area research is QCA, an approach that also appreciates the complexity of causal configurations but is markedly more ambitious in seeking out broader logical inferences across a greater number of cases (Rihoux and Ragin, 2009; Berg-Schlosser, 2018). QCA follows the logic of Boolean algebra and the principles of set theory while seeking to simultaneously increase the number of cases being analyzed and the number of variables under consideration. The initial ‘crisp’ versions of QCA relied upon binary coding of a set of variables (low/high, absent/present) across a number of cases. In response to criticisms that this limited coding generates claims that are overly deterministic, fuzzy-set QCA has been designed to incorporate multidimensional and continuous variables that allow for a much wider range of potential configurations (see Wagemann, Chapter 20, this Handbook).
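The binary coding behind ‘crisp’ QCA, and the continuous membership scores of the fuzzy-set variant, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the case labels, the condition names (cond_a, cond_b, cond_c), the codings, and the helper functions are hypothetical and are not drawn from any study cited here. The sketch only groups cases into truth-table rows and applies the conventional fuzzy-set operators (minimum for intersection, maximum for union, one minus the score for negation); it does not perform the logical minimization that dedicated QCA software carries out.

```python
from collections import defaultdict

# Hypothetical crisp-set codings (1 = present, 0 = absent). The case labels,
# condition names, and values are invented purely for illustration.
CONDITIONS = ("cond_a", "cond_b", "cond_c")
CASES = {
    "Case 1": {"cond_a": 1, "cond_b": 0, "cond_c": 1, "outcome": 1},
    "Case 2": {"cond_a": 1, "cond_b": 0, "cond_c": 1, "outcome": 1},
    "Case 3": {"cond_a": 0, "cond_b": 1, "cond_c": 0, "outcome": 0},
    "Case 4": {"cond_a": 0, "cond_b": 1, "cond_c": 0, "outcome": 1},
    "Case 5": {"cond_a": 1, "cond_b": 1, "cond_c": 0, "outcome": 0},
}


def truth_table(cases, conditions, outcome="outcome"):
    """Group cases by configuration and report how often the outcome occurs."""
    rows = defaultdict(list)
    for name, coding in cases.items():
        # A configuration is the tuple of 0/1 codings across all conditions.
        configuration = tuple(coding[c] for c in conditions)
        rows[configuration].append(name)

    table = {}
    for configuration, names in rows.items():
        positives = sum(cases[name][outcome] for name in names)
        table[configuration] = {
            "cases": sorted(names),
            # Share of the cases in this row that display the outcome.
            "consistency": positives / len(names),
        }
    return table


# Conventional fuzzy-set operators on membership scores between 0 and 1.
def fuzzy_and(*scores):
    """Membership in the intersection of sets: the minimum score."""
    return min(scores)


def fuzzy_or(*scores):
    """Membership in the union of sets: the maximum score."""
    return max(scores)


def fuzzy_not(score):
    """Membership in the negated set."""
    return 1.0 - score


if __name__ == "__main__":
    for configuration, row in sorted(truth_table(CASES, CONDITIONS).items()):
        print(configuration, row["cases"], row["consistency"])
```

In this toy table the row (0, 1, 0) contains two hypothetical cases that disagree on the outcome. Deciding how to handle such a row, or how to code the conditions in the first place, is where contextual knowledge of the cases would come in.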
Given the interest in expanding the number of cases to cover the full range of causal configurations, it becomes difficult for a researcher using QCA to conduct in-depth process-tracing or ethnography for any single case (see Beach, Chapter 17, this Handbook). Moreover, given the focus on
assigning values to discrete variables across different sets of cases, QCA researchers are increasingly working with standardized algorithms and computer programs to analyze discrete matrices of cases and variables. Even so, area-based knowledge provides crucial contextual information needed to identify the relevant variables across a given set of cases and to assign appropriate values to these variables for each case. Thus, while QCA is not as dependent as CAS on the skill-set or sensibilities of an area specialist, it does provide justification for continuing investment in area studies research, without which it is not possible to identify plausible case-specific causal configurations. Although the discussion of cross-regional comparison up to this point has implicitly treated the main units of comparison as countries, it is also possible to ‘scale up’ or ‘scale down’ cross-regional qualitative research. Scaling up implies comparing entire regions as a whole or comparing countries treated as representing the regions in which they are situated. This mode of inter-regional comparison emphasizes the relevance of discrete region-wide attributes and processes that can play a crucial role in mediating causal forces thought to originate at the global or national levels. By decentering the nation-state and focusing on regions, inter-regional comparisons are in a position to shed light on how regional-level historical inheritances or transformational processes might mediate between global and local forces and influence the trajectories of discrete clusters of countries. Peter Katzenstein’s (2005) A World of Regions, for example, makes a powerful case for identifying and comparing regional orders within the larger international system. 
While such orders may be ‘porous’ vis-à-vis the forces of globalization and internationalization, they still retain a regionally distinctive combination of economic, cultural, and institutional features that shape the behaviors and relations among countries within a given region. Finally, scaling down to the local level offers the possibility of deploying what
Snyder (2001) has labeled the ‘subnational comparative method’. This approach encompasses within-country comparisons of cities or provinces as well as between-country comparisons of like units situated in different countries. The latter variant forfeits the possibility of controlling for national-level historical, societal, or institutional attributes; but it gains more traction in analyzing how similarities and differences in those attributes might produce similar patterns of subnational variation across different national settings. This fundamental design easily lends itself to cross-regional studies, where comparisons can focus on similar sets of sub-national units situated in different areas of the world. This is particularly useful for researching questions where the relevant sub-national dynamics are limited to countries that have certain common characteristics or face certain common challenges even though they are located in different areas. For example, Heller (2012) has illuminated local variations in the efficacy of reforms designed to decentralize aspects of policymaking and expand the scope for grassroots civic engagement in Brazil, India, and South Africa. Similarly, Smith (2018) has generated novel insights about the conditions that spur the emergence of separatist movements through context-sensitive comparisons of groups and locales in states formed out of post-imperial partitions in different areas. Both inter-regional and sub-national comparative studies bolster the argument that area expertise is an extremely valuable asset for identifying the contextual knowledge needed to design and execute cross-regional comparative studies.
Conclusion: Reports of the Death of Area Studies Have Been Greatly Exaggerated
Long before questions arose about the fate of area studies in the post-Cold War era, there
was ample evidence to suggest that the pursuit of area-specific knowledge and the advancement of political science had never constituted a zero-sum game. Key concepts – from ‘corporatism’ and ‘consociationalism’ to ‘developmental state’ and ‘rentier state’ – became cornerstones of major research programs in political science after initially having emerged out of area- or country-focused research by political scientists (Sil, 2018: 230). That is, area specialists in political science, far from being on the margins of the discipline, had compiled an impressive track record of introducing major conceptual and theoretical advances to the discipline. Nevertheless, the post-Cold War era has seen the emergence of new challenges for area specialists in the social sciences – in part related to resources and funding streams, in part related to methodological currents (particularly in the United States) that tend to discount stand-alone qualitative analyses of single countries or areas. These constraints have greatly intensified the pressures and trade-offs for area specialists in political science, widening the gap between the payoffs from the accumulation of area expertise and the rewards associated with adhering to disciplinary ‘best practice’. The result of these trends, however, is not an irreversible decline in area studies but rather an expansion of the variety of research products that area specialists are able to offer. As we saw above, some have reapportioned their time and effort to incorporate quantitative analysis or mathematical modeling within mixed-methods designs, while others have connected their area-specific research to recognized intellectual traditions such as historical institutionalism. We are also encountering scholars who have area expertise but are exploring context-sensitive cross-regional comparisons. Importantly, the latter are not viewed as subsuming or supplanting research produced by area specialists.
In fact, any comparative approach that is attentive to context conditions in various locales must necessarily rely upon area studies research and engage scholarly debates within area studies communities. Thus, area-focused and
cross-area qualitative research in the social sciences, rather than dueling with one another, are in a position to jointly affirm that their research output, far from being esoteric or idiosyncratic, has much to tell us about how global, regional, national, and local factors shape political outcomes worldwide. Indeed, discussions along these lines are taking place – both across ‘horizontal’ channels linking scholars embedded in different countries and area studies networks, and along ‘vertical’ channels linking various pockets of area-focused research to disciplinary debates over concepts, theories, and methods. It is true that shifting flows of resources and intellectual trends in various disciplines have affected the scope for area-specific training and field research in some places. At the same time, on a global scale, area studies research remains active and fruitful, and there are now more centers for regional studies as well as more conversations among scholars based in different regions. Moreover, the real world keeps throwing up surprises that require deep insight and contextual knowledge, making it abundantly clear to academics, policymakers, and foundations that the scholarly study of different regions of the world cannot be scaled back to the degree that some envisioned immediately after the Cold War. All of these developments taken together suggest that area studies are more likely to keep adapting than simply to peter out due to shifts in resource streams and disciplinary fashions at any given time. At a minimum, it seems safe to say that reports of the death of area studies have been greatly exaggerated.
Notes
1 Website of the University of Oxford’s School of Global and Area Studies (OSGA), available at https://www.area-studies.ox.ac.uk/about-us (accessed February 20, 2019).
2 As noted on the website of AEGIS: https://www.aegis-eu.org/why-aegis (accessed February 25, 2018).
3 As per the website of SOAS: https://www.soas.ac.uk/about/ (accessed February 27, 2017).
4 See the website of the Africa Institute of South Africa (AISA), at http://www.ai.org.za/aboutaisa-2 (accessed February 25, 2019).
5 These refer to the methods of induction outlined by John Stuart Mill in his A System of Logic (1843).
References
Acharya, Amitav. 2014. ‘Global International Relations and Regional Worlds: A New Agenda for International Studies’, International Studies Quarterly 58 (4): 647–659.
Ahmed, Amel and Rudra Sil. 2012. ‘When Multi-Method Research Subverts Methodological Pluralism – Or, Why We Still Need Single-Method Research’, Perspectives on Politics 10 (4): 935–953.
Ahram, Ariel I., Patrick Köllner, and Rudra Sil, eds. 2018. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications. New York: Oxford University Press.
Berg-Schlosser, Dirk. 2018. ‘Comparative Area Studies: The Golden Mean between Area Studies and Universalist Approaches?’ in Ahram, Köllner and Sil, eds. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications, pp. 29–44. New York: Oxford University Press.
Carrier, James G., ed. 1992. History and Tradition in Melanesian Anthropology. Berkeley: University of California Press.
Chakrabarty, Dipesh. 2000. ‘Subaltern Studies and Postcolonial Historiography’, Nepantla: Views from South 1 (1): 9–32.
Chen, Cheng and Rudra Sil. 2007. ‘Stretching Postcommunism: Diversity, Context and Comparative Historical Analysis’, Post-Soviet Affairs 23 (4): 275–301.
Clowes, Edith W. and Shelly Jarrett Bromberg. 2015. ‘Introduction: Area Studies after Several “Turns”’, in Clowes and Bromberg, eds. Area Studies in the Global Age: Community, Place, Identity, pp. 1–12. DeKalb, IL: Northern Illinois University Press.
Dunning, Thad. 2012. Natural Experiments in the Social Sciences: A Design-Based Approach. New York: Cambridge University Press.
Fioretos, Orfeo, Tulia G. Falleti, and Adam Sheingate, eds. 2016. The Oxford Handbook of Historical Institutionalism. New York: Oxford University Press.
Fukuyama, Francis. 2005. ‘How Academia Failed the Nation: The Decline of Regional Studies’, Journal of Management and Social Sciences 1 (1): 21–23.
Gallucci, Robert. 2014. ‘Academia and the Foreign Policy-making Process’, available at: http://www.macfound.org/press/speeches/academia-and-foreign-policy-makingprocess-speech-robert-gallucci/ (accessed August 24, 2019).
Geddes, Barbara. 2003. Paradigms and Sand Castles: Theory Building and Research Design in Comparative Politics. Ann Arbor: University of Michigan Press.
Goertz, Gary and James Mahoney. 2012. A Tale of Two Cultures: Quantitative and Qualitative Research in Social Science. Princeton: Princeton University Press.
Gottemoeller, Rose E., and Paul Fritz Langer. 1983. Foreign Area Studies in the USSR: Training and Employment of Specialists. Santa Monica: Rand.
Grzymała-Busse, Anna. 2007. Rebuilding Leviathan: Party Competition and State Exploitation in Post-Communist Democracies. New York: Cambridge University Press.
Hanson, Stephen E. 2009. ‘The Contribution of Area Studies’, in Todd Landman and Neil Robinson, eds. The SAGE Handbook of Comparative Politics, pp. 159–174. London: Sage.
Heller, Patrick. 2012. ‘Democracy, Participatory Politics, and Development: Some Comparative Lessons from Brazil, India and South Africa’, Polity 44 (4): 643–665.
Jamal, Amaney A. 2007. Barriers to Democracy: The Other Side of Social Capital in Palestine and the Arab World. Princeton: Princeton University Press.
Katzenstein, Peter J. 2005. A World of Regions: Asia and Europe in the American Imperium. Ithaca, NY: Cornell University Press.
King, Charles. 2015. ‘The Decline of International Studies: Why Flying Blind is Dangerous’, Foreign Affairs 94 (4) (July/August): 88–98.
King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
Köllner, Patrick, Rudra Sil, and Ariel Ahram. 2018. ‘Comparative Area Studies – What it is, What it can do’, in Ahram, Köllner and Sil, eds. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications, pp. 3–26. New York: Oxford University Press.
Laitin, David D. 1998. Identity in Formation: The Russian-Speaking Populations in the Near Abroad. Ithaca, NY: Cornell University Press.
Mahoney, James. 2010. Colonialism and Postcolonial Development: Spanish America in Comparative Perspective. New York: Cambridge University Press.
Malinowski, Bronislaw. 1922. Argonauts of the Western Pacific. London: Routledge & Kegan Paul.
McGann, James. 2019. The 2018 Global Go-To Think Tank Index Report, The Think Tanks & Civil Societies Program (TTCSP), University of Pennsylvania, Philadelphia. Available at: https://repository.upenn.edu/think_tanks/16/ (accessed February 25, 2019).
Mill, John Stuart. 1843. A System of Logic. London: John W. Parker.
Moseley, William G. 2009. ‘Area Studies in a Global Context’, The Chronicle of Higher Education (November 29), available at http://works.bepress.com/william_moseley/89/ (accessed January 06, 2020).
Packenham, Robert A. 1973. Liberal America and the Third World: Political Development Ideas in Foreign Aid and Social Science. Princeton: Princeton University Press.
Pepinsky, Thomas B. 2015. ‘How to Make Area Studies Relevant Again’, The Chronicle of Higher Education (February 12), available at https://www.chronicle.com/blogs/conversation/2015/02/12/how-to-make-area-studiesrelevant-again/ (accessed January 06, 2020).
Rihoux, Benoît and Charles C. Ragin, eds. 2009. Configurational Comparative Methods. Thousand Oaks, CA: Sage.
Said, Edward. 1978. Orientalism. New York: Pantheon.
Saylor, Ryan. 2018. ‘Gaining by Shedding Case Selection Strictures: Natural Resource Booms and Institutional Development in Latin America and Africa’, in Ahram, Köllner and Sil, eds. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications, pp. 185–203. New York: Oxford University Press.
Schaffer, Frederic C. 2016. Elucidating Social Science Concepts: An Interpretivist’s Guide. New York and London: Routledge.
Schiermeier, Quirin. 2018. ‘Russian Science Chases Escape from Mediocrity’, Nature (13 March), available at: https://www.nature.com/articles/d41586-018-02872-8 (accessed March 24, 2019).
Sidaway, James D. 2013. ‘Geography, Globalization, and the Problematic of Area Studies’, Annals of the Association of American Geographers 103 (4): 984–1002.
Sil, Rudra. 2018. ‘Triangulating Area Studies, Not Just Methods: How Cross-Regional Comparison Aids Qualitative and Mixed-Method Research’, in Ahram, Köllner and Sil, eds. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications, pp. 225–246. New York: Oxford University Press.
Slater, Dan. 2010. Ordering Power: Contentious Politics and Authoritarian Leviathans in Southeast Asia. New York: Cambridge University Press.
Sleeboom-Faulkner, Margaret. 2007. The Chinese Academy of Social Sciences (CASS): Shaping the Reforms, Academia, and China (1977–2003). Leiden: Brill.
Smith, Ben. 2018. ‘Comparing Separatism Across Regions: Rebellious Legacies in Africa, Asia and the Middle East’, in Ahram, Köllner and Sil, eds. Comparative Area Studies: Methodological Rationales and Cross-Regional Applications, pp. 168–184. New York: Oxford University Press.
Snyder, Richard. 2001. ‘Scaling Down: The Subnational Comparative Method’, Studies in Comparative International Development 36 (1): 93–110.
Stevens, Mitchell L., Cynthia Miller-Idriss, and Seteney Shami. 2018. Seeing the World: How U.S. Universities Make Knowledge in a Global Era. Princeton: Princeton University Press.
Szanton, David L. 2004. ‘Introduction: The Origin, Nature, and Challenges of Area Studies in the United States’, in Szanton, ed. The Politics of Knowledge: Area Studies and the Disciplines, pp. 1–33. Berkeley: University of California Press.
Wedeen, Lisa. 2010. ‘Reflections on Ethnographic Work in Political Science’, Annual Review of Political Science 13 (1): 255–272.
Zhuk, Sergei. 2017. Nikolai Bolkhovitinov and American Studies in the USSR: People’s Diplomacy in the Cold War. Lanham, MD: Lexington Books.
16
Big Data in Social Sciences
Uwe Wagschal and Felix Ettensperger
Definition of Big Data
The application of big data in social and political sciences is still in its infancy. However, big data analysis is no longer something fancy, but has reached all social science disciplines. In fact, the core idea of big data is not new: scientists, researchers and operators have always been interested in more and better data. Despite the fact that governments, social media and private digital data companies have begun to produce huge amounts of accessible data, interpretation and analysis of these data streams still constitute a technical and functional challenge. And even in the neighboring fields of economics, econometrics and finance – disciplines long accustomed to huge amounts of real-time data, for example in stock market analysis – the application of big data frameworks for huge-N analysis and automatically generated information is still considered new and innovative. The most common definition of big data is based on the so-called three Vs. Gartner, a
leading consulting firm in this sector, defines it as follows: ‘Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation’. This definition, however, goes back to Doug Laney (2001), a consultant who previously worked for another consultancy now owned by Gartner. The first ‘V’ refers to the volume of big data. Due to new methods of data gathering from various sources, the amount of data has increased vastly compared to former times when data collection was predominantly ‘small data’. Internet databases, data from social media and visualized data that can be gathered from videos and other sources have all led to the creation of larger data sets. A second aspect is the increased capacity of hard drives and new technologies like cloud computing enabling the storage of immense data sets – not surprisingly, the size of some applications has reached petabytes.
The second ‘V’ refers to velocity. The speed of data has also increased enormously. Real-time applications (for example, from social media, financial markets or public debates) are a new challenge to data analysis. The importance of velocity can especially be seen at the stock markets: today, almost two-thirds of trades are done by algorithms and automated computing. Another aspect of big data is the analysis of communication in real time, because real-time data also require faster storage of data. Finally, the variety of data is a new aspect. ‘In the old days’, data storage was organized in databases with specific structures, for example, variables in the first row and time or cross-sectional identifiers in the column. Basically, the type of data was numerical. In contrast, for many applications, big data is nowadays based on new types like audio, video or text analysis. This requires new approaches for social scientists to analyze this kind of data. Some new big data applications even refer to geographical identifiers: for example, surveys can be allocated to a geographical region via the IP address of the internet connection. In the literature there are some other ‘Vs’ mentioned which supplement the original three dimensions. The fourth ‘V’ is veracity, which concerns the uncertainty of data, especially possible inconsistencies and incompleteness. The fifth ‘V’ is the validity of the data. Validity is related to veracity and is an old criterion of data quality in statistics. Do the data really measure what should be measured? This is why validity is always a concern when collecting data. With big data, the validity problem has increased since the measurement is even further removed from the theoretical construct that the researcher is interested in. Data from the internet are far more blurred and fluid. A sixth aspect is the visualization of data.
The quest for good graphs in the social sciences is not new (Tufte, 1983), but new graphical tools like Shiny enable persistent data storage. Shiny apps offer the possibility to display data in a very timely and elaborate manner. Finally, because big data offers opportunities for businesses, the literature mentions the
‘V’ for value. Data and information are valuable resources in the 21st century. Companies like Facebook, Twitter, Google or Cambridge Analytica (CA) use information about their customers, clients or targets for advertising purposes. These companies do not produce ‘hard’ products; the value they add lies in a new way of looking at data. It is not only social media that create a huge and constant data stream: more importantly, in the future there will be many new data streams contributing images (for example, videos and photos) or signals from sensors (‘internet of things’) to the growing set of available real-life data. Interestingly, Mayer-Schönberger and Cukier (2013) argue that big data is not so much about causal modeling and theory testing as it is about finding patterns and correlations within data. Researchers using big data are, therefore, located not only in the traditional domain of testing theories and rejecting hypotheses, but simultaneously in a very different world in which they search for correlation patterns. For the task of pattern detection, simple correlations and descriptive statistics can be used. In addition, more elaborate methods like neural networks, clustering methods, machine learning (ML) algorithms and artificial intelligence can also be employed. boyd and Crawford offer a different definition:
We define Big Data as a cultural, technological, and scholarly phenomenon that rests on the interplay of:
1. Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.
2. Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.
3. Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy. (boyd and Crawford, 2012: 663)
Big data poses problems concerning the autonomy of individuals and the right to possess their own information. China, for example,
uses big data applications for the surveillance of citizens. The US National Security Agency (NSA) and the UK Government Communications Headquarters (GCHQ) have massively monitored telecommunication and the internet worldwide since 2007. The NSA used a program called PRISM to retrieve data from the internet. Partners of PRISM include, among others, Microsoft (with Skype), Google (with YouTube), Facebook, Yahoo and Apple. The revelations of Edward Snowden showed an unprecedented surveillance of citizens all over the world, violating their guaranteed civil rights. A specific profile showing individual preferences can be constructed for every internet user. However, some of these tech companies do not restrict their usage of data to their own business purposes; Facebook and Twitter, for example, use the collected information to feed their clients with fitting information based on their preferences. During the Brexit campaign and the election campaign of President Donald Trump, CA used available information to influence voters with targeted information adjusted to their profiles. O’Neil (2016: 3), in her instructive and informative article, gives several examples of such threats and calls big data a ‘weapon[ ] of math destruction’. The examples given above clearly show that big data can be a serious danger for civil and political rights and, therefore, for democracy itself. However, big data also offers huge possibilities to improve daily life and creates new opportunities not only for science, but also for politics. Mayer-Schönberger and Cukier (2013), for example, demonstrate how avian flu could have been predicted earlier and contained faster by scanning for search terms used on the internet. Combining the search data from Google with the data from the Centers for Disease Control and Prevention on the spread of the flu between 2003 and 2008, they were able to show a strong correlation between outbreak data and specific search patterns.
Search engines using big data and artificial intelligence are able to predict the best moment to buy a product, and in general, big data brings a competitive advantage for many companies. Experience from some recent elections reveals that a good big data strategy can be an advantage for political parties and candidates.
A Little History of Big Data
Governments, scientists and businesses have always been keen on more and better data. The nation building of modern societies was in many cases accompanied by the foundation of national statistical bureaus. However, the beginning of data collection was basically driven by socio-economic purposes, like a population census or the improvement of taxation. This generated huge data sets as well as problems regarding data storage and data analysis. The introduction of computers facilitated data collection and data analysis, albeit on a smaller scale than today. At the beginning of the computerization era, large mainframes – which had much less power than personal computers today – needed punch cards to process commands. Big data applications are therefore nothing new; the difference lies in the increase in size and complexity. Innovations like Hadoop, neural networks, artificial intelligence, elaborate cluster analysis and learning algorithms helped to cope with both problems mentioned above. In the social sciences, the use of computing lagged behind the neighboring discipline of economics. Early attempts had been made at implementing complicated econometric models to simulate the economy and the effects of policymaking; however, it was the sociologist Charles Tilly who first introduced the notion of big data in a scholarly article. He was quite skeptical about the possibility of huge data sets and big data applications being able to explain historical and sociological developments. Citing the historian Lawrence Stone, he considered the argument ‘that none of the big questions has actually yielded to the bludgeoning of the big-data
people, and that “in general the sophistication of the methodology has tended to exceed the reliability of the data”’ (Stone, 1979: 13 as cited in Tilly, 1984: 369). Developments in the field were driven by economics, finance and computer science. Around the year 2000, big data started to become a hot topic, but after Laney had formulated his three ‘Vs’ (volume, velocity and variety) definition in 2001, it still took some years for the sophisticated software necessary to use big data properly to be developed. A first important step was Hadoop, an open-source framework for the distributed storage and processing of large data sets, which grew out of the Nutch web-crawler project around 2005 and was developed further at Yahoo. In the following years, the applications for big data took off, creating new technologies and computer languages. Another important innovation was automated data collection with the programming language R, which became a standard tool within social science for collecting data from Twitter, Facebook, speeches, articles and online comments. Content analysis and text mining were taken to a higher level using a big data approach through which tremendous amounts of information could be retrieved relatively easily (Munzert et al., 2015).
Basic Theories and Concepts
Big Data Frameworks
One important aspect of big data is its focus on huge amounts of data. The standard definition usually implies that the quantity of data involved is no longer manageable by a single computer system and therefore requires cloud computing, data servers or sophisticated networks processing the information with process management systems like Hadoop. Technical advances are the main problem with this definition: what would have counted as big data research in 2000 due to technical limitations in data storage and processing power, can today be processed on modern computer systems without
splitting the data into separated objects and using various processing lines (boyd and Crawford, 2012). Although the continued improvements in computer hardware have to some extent slowed down in recent years, it can still be anticipated that more projects labeled ‘big data’ will eventually fall out of this narrow technical definition. The sources of big data offer a second aspect for analysis. Many projects looking at data from social media have been labeled ‘big data research’ because they use a source of information that clearly shows the characteristics of a big data structure, with data sources far beyond the processing power of any single consumer computational system. This usually does not indicate that the research itself is using data that are too big to compute on a single device; it merely implies that data are taken out of a big data ecosystem with huge amounts of accessible data. Rodgers argues in ‘Foundations of Digital Methods’: As noted, digital methods make use of online methods, by which I refer to an array of techniques from the computational and information sciences – crawling, scraping, indexing, ranking, and so forth – that have been applied to and redeveloped for the Web. They refer to algorithms that determine relevance and authority and thereby recommend information sources as in Google’s famed PageRank, but also boost all manner of items, from songs and ‘friends’ to potential ‘followers’. Many of the algorithms are referred to as ‘social’, meaning that they make use of user choices and activity (purposive clicks such as liking), and may be contrasted with the ‘semantic’, meaning that which is categorized and matched (as in Google’s Knowledge Graph). (Rodgers, 2017: 76)
A third aspect of big data research refers to the implications – the impacts and the outcomes of big data systems. While this is not big data research per se, but more likely research about big data, it might provide many interesting aspects, which could be investigated in the field of political science. One interesting observation regarding the usage of the ‘big data’ label is that all the types of research mentioned above have been labeled ‘big data research’: a clear distinction
is necessary in order not to conflate these very different approaches and to express more accurately which type of research is used in each case. For some researchers (Mayer-Schönberger and Cukier, 2013), big data research is a totally new way of looking at data. Causal explanations lose ground, whereas pattern detection becomes more important. New clustering techniques, as well as analyses of texts, photos and videos, ideally combined with other types of data, yield new insights. According to Mayer-Schönberger and Cukier, ‘Correlations are useful in a small-data world, but in the context of big data they really shine’ (Mayer-Schönberger and Cukier, 2013: 3). At first sight, this non-theorizing seems to be a problem for social science. However, the empirical findings may outweigh these problems and stimulate inductive theory-building. Couldry puts it this way: ‘Big data’ is only possible on two basic conditions (which actually are composites of many more detailed conditions): first, that data is collected continuously about the states of affairs in various domains (including not just what individuals do and say, but the state of their bodies); second, that data is aggregated and its patterns of correlation computed and ‘interpreted’. (Couldry, 2017: 235)
Semi-Automatic Approaches to Retrieving Data from Twitter and Facebook
It has been proclaimed that big data does not merely refer to vast data sets and sophisticated tools to manipulate and analyze tremendous amounts of data, but is also perceived as bringing about a transformation in how we generate and process knowledge. Some see it as a change as important as the invention of the industrial assembly line by Henry Ford (Burkholder, 1992; Baca, 2004; boyd and Crawford, 2012: 665). Big data is without doubt transforming many aspects of business, governance and science. But the hype around new technological options sometimes leads to a false impression of unmatched accuracy and
absolute objectivity in this new domain (boyd and Crawford, 2012: 665–7). In all regards, especially concerning the validity of the underlying research design and the research question applied, big data contributions should be checked with the same thoroughness as all other kinds of scientific research. Researchers, political leaders and company executives should not be distracted by bold claims that these new methods bring a general improvement in how data can be processed and evaluated. Examples from the sub-field of social media research are especially well suited to illustrating the pitfalls of big data-related projects in general. Some significant problems regarding big data research are easily demonstrated in relation to semi-automatically collected data from social media and other highly volatile sources. The three main problems here are the selection of data, the process of collecting it and transforming it into usable data, and the overall coverage and representativeness of this kind of source. In this context, semi-automatic means that the investigating human specialist(s) specify which data are relevant and set the research focus, while the retrieval of data is fully automated. The decisions of researchers about how to process data from social media, how to select and filter it, as well as how to view the extracted data in context, are as crucial as ever to gain meaningful results in this field. In many social media-related research projects, search terms for queries, or at least the rules of aggregation and analysis, are completely specified beforehand by the team of researchers. What to include and, more importantly, what to leave out is often left to human judgment, based on overly confident decisions made with limited information. If it is not considered and evaluated carefully, a specific selection of data can easily lead to the illusion of discovering solid evidence where relationships are in fact weak at best.
Due to the sheer size of available social media data, researchers are likely to encounter even
the most marginal topics and side issues and may easily overestimate the overall impact of these information chunks. Moreover, reviewing only certain segments of information can be highly misleading and can confirm pre-existing assumptions by neglecting counter-evidence or the real ratios and volumes of the investigated content in relation to the overall volume of data generated. All this can be considered a form of confirmation bias (Nickerson, 1998). Researchers also sometimes tend to over-interpret random patterns that appear in data by coincidence: ‘Too often, Big Data enables the practice of apophenia: seeing patterns where none actually exist, simply because enormous quantities of data can offer connections that radiate in all directions’ (boyd and Crawford, 2012: 668). The technical sources of potential errors when retrieving data are also manifold: data queries must often be repeated at certain intervals, and data sets must be accumulated over time, cleaned and compiled for further processing. All these steps are prone to intentional or unintentional distortions, to errors and manipulations. Due to the limitations of access via application programming interfaces (APIs) and other constraints on the accessibility of data sources, social media research often works with constrained samples of information. Companies like Facebook and Twitter limit or meter the amount of data that can be retrieved in a single session (especially by non-business accounts) at any given time (Twitter, 2019). In the end, information is their prime resource and they will not share it to the same extent as they are capable of collecting and processing it. Retrieving user timelines from Twitter via the standard API is usually limited to the most recent 3,200 tweets from a single user, and if the query is repeated later, the data foundation will probably have changed.
Paying premiums for unlimited access normally does not solve the need to extensively reformat, combine and accumulate vast amounts of different data frames and to query data at many different times.
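The bookkeeping that repeated, capped queries require can be illustrated with a minimal Python sketch. It assumes that each query returns a list of posts with a stable identifier; the class name, the field names and the snapshot labels are hypothetical and do not correspond to any real API client. The sketch only shows how successive snapshots can be merged while documenting what was new and what had disappeared.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineArchive:
    """Accumulates repeated snapshots of one account's timeline (hypothetical format)."""
    posts: dict = field(default_factory=dict)       # post id -> post text
    first_seen: dict = field(default_factory=dict)  # post id -> snapshot label
    log: list = field(default_factory=list)         # one audit entry per snapshot

    def add_snapshot(self, label, snapshot):
        """Merge one snapshot (a list of {'id', 'text'} dicts) into the archive."""
        archived_ids = set(self.posts)
        seen_ids = {post["id"] for post in snapshot}
        new_ids = seen_ids - archived_ids
        # Posts archived earlier but absent now may have been deleted, or may
        # simply have dropped out of the window the platform is willing to return.
        missing_ids = archived_ids - seen_ids
        for post in snapshot:
            if post["id"] in new_ids:
                self.posts[post["id"]] = post["text"]
                self.first_seen[post["id"]] = label
        self.log.append({"snapshot": label, "retrieved": len(snapshot),
                         "new": len(new_ids), "missing": len(missing_ids)})


# Invented example data: two weekly snapshots of the same timeline.
archive = TimelineArchive()
archive.add_snapshot("week-1", [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}])
archive.add_snapshot("week-2", [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}])
for entry in archive.log:
    print(entry)
```

Keeping such a log for every retrieval makes it possible to report later how much of the final data set stems from which query and how unstable the source was.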
To make things worse, the growing role of bots and other malign opinion entrepreneurs in social media can further distort the picture. While finding out which accounts are used to strategically spread misinformation and propaganda and what techniques are used to influence normal users can be fascinating, these disruptive agents in social media can often lead to wrong impressions about the real magnitude of certain issues. Bots publishing huge amounts of negative or one-sided comments can distort trends not only for users looking for confirmation of their attitudes, but also for researchers trying to evaluate the objective volume of content published by human actors. Recent research estimates that between 9% and 15% of active Twitter accounts are bots (Varol et al., 2017). Social media data gained from sources like Twitter or Facebook are not representative of the general population. In fact, they are not even representative of the growing fraction of digital citizens. ‘Twitter does not represent “all people”, and it is an error to assume “people” and “Twitter users” are synonymous: they are a very particular subset’ (boyd and Crawford, 2012: 669). Twitter metrics revealed that roughly 44% of all Twitter accounts never actually tweet (Koh, 2014). Taking this into account, active Twitter users are not even a perfect representation of the opinions and attitudes of all Twitter users, as many of them belong to the vast group of silent listeners. As Twitter users only account for a small fraction of online users in general, putting too much emphasis on social media analysis can be hugely misleading. Although the fraction of Facebook users is comparably higher, the same limitations appear to be in place. Of course, ongoing generational change and increasing digitalization will make online communication more widespread and will increase the importance of the internet as well as of data that can be derived from social media and online sources.
However, the representativeness of online data will remain an issue. Researchers must be transparent and honest about all transformations they apply to
their original data, and should consider as well as report all its inherent limitations. They should be critical regarding the validity and reliability of their research design and review it in detail. If certain topics of interest are selected, the observed volume of content should be made transparent in relation to general trends and overall behavior of users. Otherwise, big data research projects risk running into errors well known from conventional research, which will lead to false positive results or erroneous conclusions due to the process of data selection.
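How easily a large data set yields ‘patterns’ without any underlying relationship can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: all variables are independent random noise, and the threshold of 0.361 is the approximate value above which a correlation between 30 cases would conventionally count as significant at the 5% level. The function names and default parameters are chosen for this example only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_chance_findings(n_variables=100, n_cases=30, threshold=0.361, seed=1):
    """Count variable pairs whose |r| exceeds the threshold in pure noise."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_cases)] for _ in range(n_variables)]
    hits, pairs = 0, 0
    for i in range(n_variables):
        for j in range(i + 1, n_variables):
            pairs += 1
            if abs(pearson(data[i], data[j])) > threshold:
                hits += 1
    return hits, pairs


hits, pairs = count_chance_findings()
print(f"{hits} of {pairs} pairs of pure-noise variables look 'significant'")
```

With 100 unrelated variables there are 4,950 pairs, so roughly 5% of them, a few hundred, can be expected to pass the threshold by chance alone. A researcher who reports only the strongest of these relationships, without disclosing how many were examined, produces exactly the kind of false positive result described above.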
Application of Big Data in Social Sciences
Many books written on big data point out that better policymaking and the wider detection of diseases and different forms of fraud are supported by these new techniques. De Rosa noted: ‘The radical expansion of digital data is transforming the global evidence base and will lead to improved knowledge, understanding and decision making across the economy and politics’ (2017: 125). The following example, election fraud, illustrates the application of big data in the social sciences. Election fraud has predominantly been identified in authoritarian and totalitarian countries. However, even in democratic countries, problems with free and fair elections have been observed, as in the 2000 US presidential election, with an official recount leading to litigation and controversy surrounding anomalies within the electoral process. Election fraud can happen not only in the process of counting votes but at various stages of the electoral process: registration of voters, electoral districting, voting by mail, electronic voting machines and other features offer possibilities to influence the election outcome. New instruments have been developed to detect election fraud, especially election forensics: using large data sets of election results, election forensics relies on big data foundations and statistical analysis.
In particular, the distribution of votes can indicate election fraud, and by using Benford’s law (Mebane, 2006) or other statistical techniques, that fraud can be made visible. In election forensics, such tests focus on the second and last digits of the counted votes: according to Benford’s law, the second digits should follow a distinct, non-uniform distribution, while the last digits are expected to be roughly uniformly distributed. With a special computer program in R (election forensics toolkit) it is possible to check for these anomalies.
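The digit check mentioned above can be sketched in a few lines. The following Python example is not the R toolkit referred to in the text but a simplified illustration: it computes the second-digit distribution expected under Benford’s law and compares it with the observed second digits of a list of vote counts by means of a Pearson chi-square statistic. The function names and the example vote counts are invented for this illustration.

```python
import math


def second_digit_benford():
    """Expected probability of each second digit 0-9 under Benford's law."""
    return [sum(math.log10(1 + 1 / (10 * first + second)) for first in range(1, 10))
            for second in range(10)]


def second_digit_chi_square(vote_counts):
    """Chi-square statistic (9 degrees of freedom) comparing the observed
    second digits of the vote counts with the Benford expectation."""
    digits = [int(str(count)[1]) for count in vote_counts if count >= 10]
    if not digits:
        raise ValueError("need at least one vote count with two or more digits")
    expected = second_digit_benford()
    n = len(digits)
    observed = [digits.count(d) for d in range(10)]
    return sum((observed[d] - n * expected[d]) ** 2 / (n * expected[d])
               for d in range(10))


# Invented precinct results in which every count has been rounded to hundreds.
rounded_counts = [100, 200, 300, 400, 500] * 30
print([round(p, 3) for p in second_digit_benford()])
print(round(second_digit_chi_square(rounded_counts), 1))
```

A statistic above roughly 16.9, the 5% critical value for nine degrees of freedom, indicates that the second digits deviate from the Benford expectation. Such a deviation is a reason to look more closely at the data, not proof of fraud, since rounding, small precincts or strategic voting can produce similar anomalies.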
Real-Time Response Measurement as a Big Data Application
The following is another example of the potential of a big data application in the social sciences. The simultaneous measurement of direct responses to political debates, discussions and other forms of communication has been of interest to social scientists since the mid-1940s, when Paul Lazarsfeld started this kind of research in the United States, first in the context of radio transmissions and later during the presidential debate of 1952 (Levy, 1982; Lazarsfeld and Stanton, 1944). Surprisingly, technological advances in measuring the impact of television debates during US election campaigns were rather scarce. For several decades, physical dialers were used to measure the effects of political debates on viewers and respondents, although several different tools such as sliders, dialers or push buttons were tested. It was especially during the presidential campaign of 2012 (Boydstun et al., 2014) that new advances occurred and that researchers moved this kind of research from the lab setting, where experiments and research on political debates had traditionally been conducted, to environments more familiar to the audience. During the 2017 German federal election campaign, a new instrument for real-time measurement, the Debat-O-Meter (Metz et al., 2016), was applied to the debate between the two leading candidates, Angela Merkel and Martin Schulz, watched by roughly 16 million spectators on
Figure 16.1 Different graphical user interfaces of the Debat-O-Meter used in TV debate Source: Debat-O-Meter project (www.debatometer.com).
five TV channels simultaneously. Before the German federal election of 2017, the Debat-O-Meter team experimented with different GUIs (graphical user interfaces) like rubber bands, sliders, dialers and push buttons. The instrument was also used in TV debates like the third presidential debate between Trump and Hillary Clinton, the first debate during the French presidential campaign in 2017 and the Brexit debate (see Figure 16.1). Across these debates, the number of discussants varied between two and seven. In the case of the Merkel–Schulz debate, more than 44,000 spectators entered the website debatometer.com and roughly 28,000 users completed the pre-surveys and the RTR (real-time response) evaluation. In the end, more than 15,000 post-surveys were completed. These users were provided with an individual voting advice analysis (VAA). All in all, more than 1.52 million clicks were cast during the 90 minutes of the debate, registering votes in favor of or against Chancellor Angela Merkel or challenger Martin Schulz and their arguments. For the analysis, the geographical information of the IP address was used to cross-check the location of the participants (mainly to avoid fraud from abroad and to check geographical representation).
Who won the debate? According to the post survey it was close to a draw: 39% responded that Merkel won the debate while 40.3% declared Martin Schulz the winner. This promising research revealed new ways to measure the preferences of the electorate, especially the distribution within the group of undecided voters and party sympathizers. It was also possible to find correlations between attitudes of the participants and their clicking behavior. In fact, four different data sources were combined for the analysis: a pre- and post-survey, the RTR measures and geographical information from the IP address.
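How a stream of real-time clicks can be condensed into a curve of net approval over the course of a debate is sketched below in Python. The record format, a tuple of seconds since the start, candidate and a value of +1 or -1, is an assumption made for this illustration and is not the actual data format of the Debat-O-Meter.

```python
from collections import defaultdict


def net_approval_by_window(clicks, window_seconds=60):
    """Sum +1/-1 clicks per candidate within consecutive time windows.

    `clicks` is an iterable of (seconds_since_start, candidate, value) tuples,
    where value is +1 for approval and -1 for disapproval.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for seconds, candidate, value in clicks:
        window = int(seconds // window_seconds)
        totals[window][candidate] += value
    # Convert to plain dicts, ordered by window, for readable output.
    return {window: dict(totals[window]) for window in sorted(totals)}


# Invented clicks from the first two minutes of a debate.
clicks = [(5, "A", 1), (30, "A", 1), (59, "B", -1), (61, "A", -1), (119, "B", 1)]
print(net_approval_by_window(clicks))
# {0: {'A': 2, 'B': -1}, 1: {'A': -1, 'B': 1}}
```

In an actual analysis, each click would additionally carry a participant identifier, so that the aggregated curves can be split by the attitudes reported in the pre- and post-surveys.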
Major Techniques of Big Data Analysis
Data Mining and Related Techniques
As shown in previous examples, collecting and assembling data in automatic and semi-automatic ways is a core task necessary to enable big data research. This initial data-gathering process is, however, not what is commonly labeled ‘data mining’; it is not the
collecting and assembling of vast amounts of data, but the subsequent analytical process of finding relationships inside certain data sets. It is the task of scanning through already collected data using statistical methods to find new relations and patterns in the data (Leskovec et al., 2014). Data mining is thus somewhat of a misnomer (Han and Kamber, 1998: 6). Data mining is a very unspecific and broad description of processes, involving a vast repertoire of different statistical methods. It is by definition the ‘knowledge discovery in databases’. The difference between data mining and what is more conventionally called data analysis lies mostly in their different intentions: data mining is done with the clear objective to produce viable predictions based on available data, while data analysis tries conventionally to explain and analyze effects inside the data. Six different classes of tasks are generally associated with and applied to data mining: regression, clustering, anomaly detection, summarization, classification and rule learning (Fayyad et al., 1996). One potential example for a concrete technique is k-means clustering. It is a method used to discover different sub-groups in data. Thus, it can show which variables are occurring simultaneously within certain groups of cases. Such a clustering technique can identify different categories related by their corresponding variable patterns. It can also be applied to identify similarities in cases. In a marketing setting for example, it can be used to identify different types of customers and target them individually by specialized marketing and tailor-made advertisements. In political science, it enables us to differentiate between different sets of voters, countries, actors or institutions. Data mining can also involve anomaly detection using techniques for finding and identifying statistical outliers in a data set. 
These outliers can give an indication for data manipulation, for example, in election forensics and for the detection of election fraud by looking at specific anomalies. It can also be used to identify online bots or automated
systems spreading huge amounts of polarizing data on social media based on their posting behavior. Data-mining techniques are not necessarily complex, and summarizing vast amounts of data can already provide an advantage in the market or in elections if competitors are operating without any big data or data techniques at all. But as will be shown in the next section, big data also opens the door to much more complex applications.
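The k-means clustering described above can be illustrated with a compact Python sketch. It implements the standard iterative procedure (assign each case to its nearest cluster center, then recompute the centers) with a simple deterministic choice of starting points. The function names and the two-dimensional example data, which stand for invented voter characteristics, are assumptions made for this illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Deterministic initialization: start with the first point, then repeatedly
    # add the point that lies farthest from its nearest existing centroid.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(squared_distance(p, c)
                                                       for c in centroids)))
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda i: squared_distance(p, centroids[i]))
                      for p in points]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Invented data: (political interest, left-right position) for six voters.
voters = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = k_means(voters, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The number of clusters k has to be chosen by the researcher, which is one more point at which human judgment enters an apparently automatic procedure.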
Big Data as a Pre-Condition for Artificial Intelligence
The processing of huge, accessible and rapidly collected libraries of big data in itself promises more detailed insights into many disciplines, including economics, political science, sociology, medicine, biology and management (De Mauro et al., 2016). It can help to approach and better understand some of the pressing social and economic challenges of our time. Yet, the full impact of big data can only be understood if viewed in relation to the many new data science methods that are enabled and advanced through big data: ML, artificial intelligence, neural networks and advanced clustering methods. ML algorithms and deep neural networks can learn to identify patterns by processing, and ‘learning’ from, vast amounts of previously collected data (Hastie et al., 2009: 2). By adapting their internal weights and nodes to huge amounts of training data, these algorithms and networks can become extremely efficient and reliable in identifying cases and assigning them to learned categories, even in environments where cues are very subtle and human observers and other statistical methods would struggle to identify patterns and associations due to the sheer amount of data. The possible applications of this new technology are (almost) infinite: a team of biomedical engineers and ML experts programed a deep learning framework based on deep neural networks that can identify genetic diseases with astonishing accuracy (~96%) by analyzing only portrait
photos of patients. The framework is trained on 17,000 photos covering 216 different rare genetic conditions, with characteristic reference points in each face serving as the ‘variables’, in order to learn which combinations of facial characteristics correspond to which conditions (Gurovich et al., 2019). In another example, ML combined with text-mining techniques was used to identify the author of anonymously written content by comparing associations in the language and word usage of known authors with those of the unknown source (Chakrabarty, 2019). In medicine and biology, deep convolutional networks can predict how complex proteins fold and thus support the development of new tailor-made medicines that interact precisely with these proteins (Wang et al., 2016). Teams of ML specialists developed ML tools to identify icebergs in satellite photos in order to advance automatic alarm systems that make sea routes safer by preventing dangerous collisions (Mazur et al., 2017). In conflict studies, training an algorithm with social, economic, political and demographic data of previous years provides insights on how to predict the conflict propensity of certain regions (Colaresi and Mahmood, 2017). As a final example, advanced deep networks can learn from thousands of games of Go to master this ancient Asian board game and reach levels of perfection that are unachievable for humans (Silver et al., 2016). In all these examples, the foundation for training a network or algorithm is a suitable amount of data. Depending on the complexity of the research question, a minimum of a few hundred cases can sometimes be sufficient to develop a viable and relatively successful ML framework. But the capacity of such frameworks to surpass other statistical techniques and even humans in pattern recognition, and to reach exceptionally high accuracy in certain applications, is strongly connected to the amount of utilizable data.
It is not unusual for ML frameworks to incorporate millions of cases with hundreds or thousands of different variables. Unlike other methods, ML approaches
can be easily up-scaled (Hastie et al., 2009; Goodfellow et al., 2016). A high number of variables does not have to be counterbalanced by a certain number or ratio of cases, nor is there an existing upper input limit for elements used aside from computational limits that can arise during this process. One of the few limitations of ML and artificial intelligence research is the processing power on the hardware side. Especially, artificial neural networks (ANNs) require more processing power than simpler learning algorithms, but promise better results in the form of more complex learning patterns, which can lead to amazing breakthroughs in artificial intelligence. The reason why big companies are at the forefront of new discoveries in artificial intelligence is that they can combine huge processing power with enormous amounts of data collected from their customers, their search engines or online platforms. The amount of data necessary to make ANNs smart is staggering, but by opening up ever more data sources and connecting ever more devices and gadgets to our daily life, we can now be relatively certain that big data is about to fundamentally change many aspects of human life.
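What ‘adapting internal weights’ to training data means can be shown with the simplest possible learning unit, a single artificial neuron trained with the classic perceptron rule. The Python sketch below is far removed from the deep networks cited above and is only meant to illustrate the principle; the function names and the tiny training set are invented for this purpose.

```python
def train_perceptron(samples, labels, max_epochs=1000):
    """Learn weights w and bias b so that the sign of w.x + b matches the labels (+1/-1)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified: nudge the weights toward y
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training case is classified correctly
            break
    return weights, bias


def predict(weights, bias, x):
    """Classify one case as +1 or -1 using the learned weights."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Invented, linearly separable training data with two variables per case.
samples = [(2, 2), (3, 3), (3, 1), (4, 2), (0, 0), (-1, 1), (0, -1), (-2, 0)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
weights, bias = train_perceptron(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

Each wrongly classified case shifts the weights slightly, so the model is shaped entirely by the cases it has seen. Deep neural networks repeat this idea with millions of weights, which is why their performance depends so strongly on the amount and quality of the available training data.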
Problems of Big Data and Critical Assessment
The Case of Cambridge Analytica
The influential role of CA during the 2016 US election campaign, and during the Brexit referendum in the same year, highlights some of the fundamental dangers resulting from the application of big data. Although the real impact of CA’s data-driven approach is highly disputed among experts and hard to measure (Hersh, 2015; González, 2017: 10–11), initial evidence suggests that the impact of such actors on voting behavior can be far-reaching and constitutes a considerable and growing
challenge for democracies in the decades to come. Roberto González observed for the 2016 presidential election that myths are abundant in regard to CA and their role in the 2016 campaign. Along with other observers, González went so far as to claim that: Cambridge Analytica, played a pivotal role in Donald Trump’s victory by formulating new algorithmic techniques to influence the electorate during the final months of the 2016 US presidential campaign. […] Some described Cambridge Analytica’s tools as ‘mind-reading software’, a ‘weaponized AI [artificial intelligence] propaganda machine’ that ‘turned the world upside down’ by saturating voters with carefully crafted messages. (González, 2017: 9)
Evaluating the aspects of the 2016 election comprehensively, many researchers came to the conclusion that CA’s (self-)description as a pivotal instrument for the success of the Trump campaign was exaggerated. Yet, several political scientists are warning us not to underestimate the role of these new agents in the realm of ‘electoral management’ for future campaigns. So, how can companies like CA effectively influence millions of voters and help to secure narrow wins in crucial battleground states? By building on the Republican Party’s own data-collection efforts, it appears that CA established individual profiles, including about 5,000 individual data points, on more than 200 million US citizens (Bartlett, 2018). Building on this data foundation, the company managed to model and predict how ‘persuadable’ individual voters are and how to target them efficiently with tailor-made advertisements (Bartlett, 2018: 14). Some interesting connections within the data were discovered in this process. For example, analysts found that a preference for cars made in the United States was already a good first indication for a potential Trump voter. Of course, evaluating all 5,000 data points in the combined dataset allowed a much better identification of individual voter traits and a much more precise tailoring of content to the emotional needs and expectations of specific groups of recipients. By harnessing this big data-driven approach, political entrepreneurs suddenly gained an effective tool to identify and mobilize swing voters and convince them by using arguments customized to their personality and activating powerful emotional triggers (Cadwalladr, 2017). Although CA denied the use of psychometric profiling, it is evident that the company repeatedly used this technique to determine voter preferences and to improve its voter targeting (Bartlett, 2018: 14). The subsequent claim that CA was able to pinpoint 20 million potential swing voters in crucial states by developing psychological profiles is therefore highly probable (González, 2017: 9). Thus, by targeting the right group of persons in the right kind of states, it is likely that CA had an important impact on the election, despite the fact that it engaged late in the presidential race and worked with very limited time and resources. By targeting only voters that are susceptible to a certain message, these big data techniques can help to make campaigns much more efficient, and by avoiding voters that are extremely unlikely to change their mind, they can help to avoid wasting limited resources on futile attempts at swaying certain demographic and sociological groups. It is not only in the United States that the impact of big data-driven campaigns can be witnessed. It has emerged that Nigel Farage, then leader of UKIP (UK Independence Party), was also involved with CA. He was introduced to the firm and its techniques by Robert Mercer, a stakeholder of CA, back in 2016. There is growing evidence that CA provided, at minimum, expert advice to the ‘Leave Campaign’ and provided information on how to target uncertain voters in the Brexit campaign via Facebook (Cadwalladr, 2017).
Although CA has closed its doors and the real impact of the company remains disputed, the rise of new competitors on the market of ‘election management’ and ‘data-driven political consulting’ is well on its way. Winning elections in democracies will in the future
depend to an increasing degree on who can learn more about potential voters and effectively use big data techniques to target and mobilize them. But the legal and political limitations that these actors and companies face in their quest to collect and use information will determine how much big data may shape democracy in the 21st century.
Developing the Perfect Big Data Dystopia? The Chinese Social Credit System
In 2014, the Chinese government started one of the most colossal and controversial social experiments of our time. By 2020, the unified Social Credit System, currently under development and being implemented step by step, is supposed to be in place, ready for a comprehensive launch to ‘improve’ the ‘social management’ of all Chinese citizens (Meissner, 2017: 6). It is in fact a universal big data-based ranking system incorporating thousands of different data sources that trace the public activities of the entire Chinese population. It covers and affects almost all aspects of daily life. On the long list of alleged benefits that the government cites for this new system, the following points are especially important: the improvement of social cohesion and integrity, the promotion of honesty in governmental affairs, the positive influence on commercial integrity and the alleged positive influence on judicial credibility. A cure-all for social problems, as it seems (Hoffmann, 2017; Ohlberg et al., 2017: 5). Remarkably, the initiation and the planning phase of the system was barely noticed by Western governments and largely overlooked by many outside observers until very recently. The news coverage about this system has only gathered momentum lately, along with the advancing technical implementation of measures in certain parts of the country. It is currently being tested in over 70 different settings – some elements by commercial actors and some by local
governments, mostly concentrated around flagship projects in cities like Shanghai, Wuhan, Zhengzhou or Rongcheng (Ohlberg et al., 2017: 6; Kostka, 2018: 2). Up to now, most implementations are focused on testing parts of the system’s performance and applicability (Ohlberg et al., 2017). Some aspects of the system have attracted the attention of Western media – especially, some errors and curiosities surrounding the tests have been reported. For example, a CCTV camera and automated detection system repeatedly identified a Chinese businesswoman as a jaywalker because her face was printed in commercials on Shanghai city buses. The bus activated the automated face recognition system for pedestrians crossing the road while traffic lights were at red (Shen, 2018). Despite some minor setbacks and an increasingly unlikely implementation by 2020, the Chinese government leaves little doubt about their unwavering determination in eventually implementing this giant big data project comprehensively for every citizen and every region of China (Meissner, 2017: 8). How does the Social Credit System work? In the current test projects in Shanghai and Rongcheng every citizen starts with an equal score of 1,000 points. By collecting data from companies, banks, online shopping platforms, search engines, online games, CCTV cameras (for pedestrian and car-driving behavior), social media, official police databases, judicial process files and myriad other sources, all relevant events or activities of the individual citizen are considered, evaluated and finally scored. Each activity can increase or decrease the corresponding rating. Minor and major traffic offenses, fines and legal decisions are factored in to determine the score. Even repeatedly complaining in social media about grievances in the country may lead to the loss of points. 
Especially harsh is the treatment for political offenses in collision with party policy, which has reportedly led to a collapse of social scores down to the lowest rating possible. The implications if social scores fall below a certain threshold can be far-reaching:
for example, you can be denied high-speed train tickets and international business flights; loans and credits may be denied by banks; and visa applications can be delayed or denied. A low score may also drastically reduce the chance to secure a job in a state-run company or as a state employee. Lists of offenders were even made public in several instances and used for shaming individuals (Ohlberg et al., 2017: 10). International companies operating in China are forced to participate in data sharing, just as all Chinese companies are generally required to share their data whenever necessary. From a Western perspective, it is hard not to perceive this system as an accumulation of Orwellian dystopias and as a massive threat to any kind of privacy. Interestingly, in the Chinese public, as far as it can be measured with studies so far, the response to existing commercial and state-run types of social credit systems (SCSs) has been far less critical and mostly framed around the benefits of such a comprehensive system. Genia Kostka noted that ‘[r]ather than perceiving them as instruments of surveillance, they see them as a way to protect consumers from food scandals or financial fraud – and to access benefits connected to a high social credit score’ (Kostka, 2018). Respondents in anonymous online surveys reported approval rates of up to 80% for SCSs, and even reported having more trust in state-run systems than in commercially run schemes (ibid.: 22). Some respondents noted concerns regarding fairness and transparency in the system, but approved of the general idea to improve social cohesion by implementing these measures. Of course, people might be reluctant to report concerns with the existing system due to a real or perceived threat of potential repercussions. However, there was not a big difference between these online gathered attitudes and responses in safe, private personal interviews (ibid.: 23).
Together with other evidence of rather lukewarm opposition to the SCSs, it can be stated that the Chinese public does not fundamentally oppose the approaching implementation of these systems on any larger scale.
Looking at it from the Chinese Communist Party’s ideological point of view, it fits well into pre-existing concepts of social management, reaching back all the way to the Mao era (Hoffmann, 2017: 4). Of course, the implications for state-critical activists, human rights lawyers and NGOs operating on the edge of what the party and government may consider legitimate could be severe, as these measures constitute another set of tools for potential sanctions and repercussions. Besides the obvious implications for privacy, the implementation of such a system further raises questions about the process of judging the behavior of 1.3 billion citizens. Gathering big data alone is not sufficient to evaluate and judge the behavior of all citizens. Due to the enormous amounts of collected information, incidents can no longer be principally evaluated by human observers; the whole process must be highly automated and supported by artificial intelligence to correctly identify misbehavior, as the Shanghai jaywalking case shows. Because automated systems and ML are heavily used to evaluate the behavior of individual citizens, the question arises of how much control should and can be delegated to machines and neural networks in identifying misbehavior and how prone this system is to error and unfairness. Finally, there are very impactful long-term global implications if the Chinese government manages to establish a ‘successful’ big data-driven universal SCS, because it could easily become a blueprint for many other states. In this context, big data can be seen as a tool to secure state authority and power. China’s Social Credit System may be only the first step in bringing us closer to a ‘Brave New World’.
Perspectives
Big data analysis is on the verge of transforming many fields of academic research. The
interpretation of vast data sets and automated sources of data, once considered to be a procedure predominantly applied in natural sciences, has recently begun to open up new frontiers in economics and social sciences as well. New big data publications have shown innovative solutions to previously unanswerable questions and the great potential of these methods for future research. Complex and long-duration phenomena can finally be approached and analyzed by powerful toolkits of computation and data management. For political and social scientists unlocking the power of big data in their field can provide fascinating new possibilities for the discovery and observation of once undetectable effects, the creation of new hypotheses and the examination and study of vast pools of previously inaccessible data. New developments also concern the area of new methods, like artificial intelligence or ML. So far, the social sciences have not responded to these developments in their curricula. It is obviously necessary to enhance software education for social scientists in general to adapt to this new development (Munzert et al., 2015: 383). Concerning the veracity and validity of data, more checks will be required by experts who understand the technical and political implications of this technology. Big data in the hands of state and non-state actors will fundamentally change political developments and threaten or stabilize regimes as well as entire political systems. Social bots and fake news will shape public perception and shift debates. It is essential for the field of political research to engage with these developments early and profoundly.
References
Aragona B and De Rosa R (2018) ‘Policy Making at the Time of Big Data: Datascape, Datasphere, Data Culture’. Sociologia Italiana, 11: 173–185.
Baca G (2004) ‘Legends of Fordism: Between Myth, History, and Foregone Conclusions’. Social Analysis, 48(3): 169–178.
Bartlett J (2018) ‘Big data is watching you. The Cambridge Analytica row shows politics moving in a disturbing direction’. The Spectator, March 21, 14–15.
boyd d and Crawford K (2012) ‘Critical Questions for Big Data’. Information, Communication & Society, 15(5): 662–679.
Boydstun AE, Glazier RA, Pietryka MT and Resnik P (2014) ‘Real-Time Reactions to a 2012 Presidential Debate: A Method for Understanding Which Messages Matter’. Public Opinion Quarterly, 78(S1): 330–343.
Burkholder L (ed.) (1992) Philosophy and the Computer. Boulder, Colorado: Westview Press.
Cadwalladr C (2017) ‘Revealed: How US billionaire helped to back Brexit’. The Guardian, February 26. Available at: https://www.theguardian.com/politics/2017/feb/26/us-billionaire-mercer-helped-back-brexit (accessed December 19, 2018).
Chakrabarty N (2019) ‘A Machine Learning Approach to Author Identification of Horror Novels from Text Snippets’. Towards Data Science. Available at: https://towardsdatascience.com/a-machine-learning-approachto-author-identification-of-horror-novelsfrom-text-snippets-3f1ef5dba634 (accessed January 20, 2019).
Colaresi M and Mahmood Z (2017) ‘Do the Robot: Lessons from Machine Learning to Improve Conflict Forecasting’. Journal of Peace Research, 54(2): 193–214.
Couldry N (2017) ‘The Myth of Big Data’. In: Schäfer MT and Van Es K (eds) The Datafied Society: Studying Culture through Data, pp. 235–239. Amsterdam: Amsterdam University Press.
De Mauro A, Greco M and Grimaldi M (2016) ‘A Formal Definition of Big Data Based on Its Essential Features’. Library Review, 65(3): 122–135.
De Rosa R (2017) ‘Governing by Data: Some Considerations on the Role of Analytics in Education’. In: Lauro NC, Amaturo E, Aragona B, Grassia M and Marino M (eds) Data Science and Social Research: Epistemology, Methods, Technology and Applications, pp. 67–77. Heidelberg: Springer-Verlag.
De Rosa R and Aragona B (2017) ‘Unpacking Big Data in Education. A Research Framework’. Statistics, Policy and Politics, 8(2): 123–137.
Fayyad U, Piatetsky-Shapiro G and Smyth P (1996) ‘From Data Mining to Knowledge Discovery in Databases’. AI Magazine, 17(3): 37–53.
González R (2017) ‘Hacking the Citizenry? Personality Profiling, “Big Data” and the Election of Donald Trump’. Anthropology Today, 33(3): 9–12.
Goodfellow I, Bengio Y and Courville A (2016) Deep Learning. Cambridge, Massachusetts: MIT Press.
Gurovich Y, Hanani Y, Bar O, Nadav G, Fleischer N, Gelbman D, Basel-Salmon L, Krawitz P, Kamphausen S, Zenker M, Bird L, Gripp K (2019) ‘Identifying facial phenotypes of genetic disorders using deep learning’. Nature Medicine, 25(January 2019): 60–64.
Han J and Kamber M (1998) Data Mining: Concepts and Techniques. Waltham, MA: Morgan Kaufmann.
Hastie T, Tibshirani R and Friedman JH (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
Hersh ED (2015) Hacking the Electorate: How Campaigns Perceive Voters. Cambridge: Cambridge University Press.
Hoffmann S (2017) Programming China: The Communist Party’s autonomic approach to managing state security. December 12, MERICS China Monitor.
Koh Y (2014) ‘44% of Twitter Accounts Have Never Sent a Tweet’. The Wall Street Journal, April 11. Available at: https://blogs.wsj.com/digits/2014/04/11/new-data-quantifiesdearth-of-tweeters-on-twitter/ (accessed January 19, 2019).
Kostka G (2018) ‘China’s Social Credit Systems and Public Opinion: Explaining High Levels of Approval’. SSRN. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3215138 (accessed January 11, 2019).
Laney D (2001) 3D Management: Controlling Data Volume, Velocity, and Variety. Application Delivery Strategies, META Group, Inc., Stanford.
Lazarsfeld P and Stanton FN (eds) (1944) Radio Research 1942–43. New York: Arno Press.
Leskovec J, Rajaraman A and Ullman JD (2014) Mining of Massive Datasets, 2nd ed. Cambridge: Cambridge University Press.
Levy MR (1982) ‘The Lazarsfeld-Stanton Program Analyzer: An Historical Note’. Journal of Communication, 32(4): 30–38.
Mayer-Schönberger V and Cukier K (2013) Big Data: A Revolution That Will Transform How We Live, Work, and Think. London: John Murray.
Mazur AK, Wahlin AK and Krezel A (2017) ‘An Object-Based SAR Image Iceberg Detection Algorithm Applied to the Amundsen Sea’. Remote Sensing of Environment, 189: 67–83.
Mebane W (2006) Election Forensics: The Second-Digit Benford’s Law Test and Recent American Presidential Elections. Paper presented at the Election Fraud Conference, Salt Lake City, September 29–30, 2006.
Meissner M (2017) China’s Social Credit System. A big-data enabled approach to market regulation with broad implications for doing business in China. May 24, MERICS China Monitor.
Metz T, Wagschal U, Waldvogel T, Bachl M, Feiten L and Becker B (2016) ‘Das Debat-O-Meter: ein neues Instrument zur Analyse von TV-Duellen’. ZSE, 14(1): 124–149.
Munzert S, Rubba C, Meißner P and Nyhuis D (2015) Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. Chichester: Wiley.
Nickerson RS (1998) ‘Confirmation Bias: A Ubiquitous Phenomenon in Many Guises’. Review of General Psychology, 2(2): 175–220.
Ohlberg M, Ahmed S and Lang B (2017) Central Planning, Local Experiments. The complex implementation of China’s Social Credit System. December 12, MERICS China Monitor.
O’Neil C (2016) Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Penguin Random House.
Rodgers R (2017) ‘Foundations of Digital Methods’. In: Schäfer MT and Van Es K (eds) The Datafied Society: Studying Culture through Data, pp. 75–94. Amsterdam: Amsterdam University Press.
Shen X (2018) ‘Facial recognition camera catches top businesswoman “jaywalking” because her face was on a bus’. November 22. Available at: https://www.abacusnews.com/digital-life/facial-recognition-cameracatches-top-businesswoman-jaywalkingbecause-her-face-was-bus/article/2174508 (accessed January 22, 2019).
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, von der Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M (2016) ‘Mastering the game of Go with deep neural networks and tree search’. Nature, 529: 484–489.
Stone L (1979) ‘The Revival of Narrative: Reflections on a New Old History’. Past and Present, 85: 3–24.
Tilly C (1984) ‘The Old New Social History and the New Old Social History’. Review (Fernand Braudel Center), 7(3): 363–406.
Tufte ER (1983) The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press.
Twitter (2019) Rate limits – Standard API rate limits per window. Twitter Developer Platform. Available at: https://developer.twitter.com/en/docs/basics/rate-limits (accessed January 20, 2019).
Varol O, Ferrara E, Davis CA, Menczer F and Flammini A (2017) ‘Online Human-Bot Interactions: Detection, Estimation, and Characterization’. arXiv:1703.03107.
Wang S, Peng J, Ma J and Xu J (2016) ‘Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields’. Scientific Reports, 6(18962). doi:10.1038/srep18962
17 Case Studies and Process Tracing Derek Beach
Process tracing is an in-depth case study method for making within-case inferences about the operation of causal mechanisms that link causes and outcomes. While originally developed as an adjunct tool in psychological experiments to gain clues about potential causal mechanisms, process tracing has developed into a key case study method within the social sciences. The strength of process tracing methods, if properly conducted, is that they enable strong causal inferences to be made about how causal processes work in real-world cases. The downsides of process tracing include not only the large amount of analytical resources that are required to conduct it properly, but also the limited ability to generalize about mechanisms beyond small sets of relatively homogeneous cases. This chapter first explores the foundational ontological and epistemological underpinnings of process tracing as a case-based method. This is followed by a discussion of the three core elements of process tracing: theories of causal mechanisms, empirical
analysis through observable traces left in cases and case selection and generalization.
The Foundations of Case-Based Methods
Process tracing as a form of within-case causal research can be categorized as being a ‘bottom-up’ approach because the analytical point of departure is what is happening within an individual case. In case-based approaches, detailed process tracing is the ‘gold standard’ because it enables us to make direct causal inferences about the process linking causes and outcomes. Causal inferences are made about what is happening within a case, using the traces left by the operation of a causal mechanism to infer that an underlying theorized mechanism was present in the given case. Yet it is by no means certain that the mechanism we found operative in one case works in the same manner in other cases
because mechanisms are often very sensitive to contextual conditions (Falleti and Lynch, 2009; Goertz and Mahoney, 2009; Beach and Pedersen, 2019). Therefore, process tracing can be thought of as a form of ‘bottom-up’ analytical strategy in relation to making causal inferences within a population of cases because individual cases are the core unit of analysis. In contrast, variance-based approaches – such as experiments – that build on counterfactual comparisons (potential outcomes) can be categorized as ‘top-down’ because they enable the assessment of mean causal effects across a population (or sample thereof) of cases. By definition, however, they do not shed much light on what is going on within any individual case, because evidence takes the form of evaluating the difference that variation in the independent variable (cause) makes for values of the dependent variable (outcome), by comparing cases where everything else is held constant. In variance-based approaches, a controlled experiment is the ‘gold standard’. The foundational differences between case- and variance-based approaches are summarized in Table 17.1. It is important to note that despite many differences between within-case analysis using process tracing and cross-case analysis using variance-based approaches, logically the level at which causes are operative is always within a single case. A drug used to treat a sickness is operative in a single patient; it does not have causal effects across patients unless it can be administered to groups. Similarly, an increase in the number of veto players can produce deadlock and joint decision-traps within a political system, but a reform in one country would not produce deadlock across different countries unless there are diffusion
effects or other influences across cases. One can potentially learn about the effect that the increase in veto players has by comparing a case where this took place with one where it was absent, where all other things are equal. But at the end of the day, causation always occurs within cases. At its core, the types of causal claims that can be analysed with process tracing differ fundamentally from variance-based approaches. Variance-based approaches assess counterfactuals across cases, ideally using an experimental design in which there are no other differences between two groups of cases other than exposure to the cause. Because causal claims are made across cases, ontologically probabilistic claims are made about mean causal effects. In contrast, process tracing is used to study mechanistic causal claims about the causal process that links a cause (or set of causes) with an outcome within a case (Machamer et al., 2000; Illari and Williamson, 2011; Beach and Pedersen, 2019). A ‘mechanism explanation for some happening that perplexes us is explanatory precisely in virtue of its capacity to enable us to understand how the parts of some system actually conspire to produce that happening’ (Waskan, 2011: 393). Viewing causation in mechanism-based terms means that we explain why something occurred by analysing the productive causal processes that link a cause (or set of causes) with an outcome. Within a case, a causal relationship logically has either happened or not. This means that ontologically deterministic claims are made, understood as the claim that within a given case, a mechanism either links a cause
Table 17.1 Foundational assumptions of case- and variance-based approaches
Analytical point of departure
  Case-based approaches: Individual cases (bottom-up approach)
  Variance-based approaches: Population-level (or samples thereof) (top-down approach)
Understanding of causation
  Case-based approaches: Mechanistic, deterministic, asymmetric
  Variance-based approaches: Counterfactual, probabilistic, symmetric (usually)
Method at the top of evidential hierarchy
  Case-based approaches: Detailed process tracing (combined with bounded comparisons)
  Variance-based approaches: Controlled experiments
and outcome or it does not (Bhaskar, 1978; Mahoney, 2008). Ontological determinism is often misunderstood because scholars often conflate the ontological (nature of things) with epistemological issues (how we can learn about causal relationships). Epistemological determinism would mean that we could gain 100% certain knowledge about causal relationships, which is naturally an impossibility outside of trivial facts. In process tracing, we combine ontological determinism (something happens within a case) and epistemological probabilism (we have varying degrees of confidence in the validity of a causal claim) (Beach and Pedersen, 2019). The conclusions of process tracing research take the form of ‘in case X, there is considerable evidence that mechanism Y operated as we had theorized’. Note that we do not claim that we confirm or disconfirm a theory with 100% confidence; instead, depending on how much confirming or disconfirming mechanistic evidence is present, our confidence may be higher or lower.
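The idea of epistemological probabilism can be illustrated with a small worked example. The passage above gives no formula; the sketch below assumes Bayes' rule as one common way of formalizing how confidence in a within-case claim rises or falls with each piece of evidence. The function name and all probabilities are hypothetical and chosen only for illustration.

```python
def update_confidence(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return posterior confidence in hypothesis h after observing evidence e (Bayes' rule)."""
    numerator = prior * p_e_given_h
    denominator = numerator + (1.0 - prior) * p_e_given_not_h
    return numerator / denominator


# Hypothetical evidence for 'mechanism Y operated in case X'. Each pair is
# (probability of finding the trace if the mechanism operated,
#  probability of finding it anyway if the mechanism did not operate).
evidence = [
    (0.8, 0.4),  # weakly confirming trace
    (0.9, 0.1),  # strongly confirming trace
    (0.3, 0.6),  # disconfirming trace
]

confidence = 0.5  # assumed starting point: no prior leaning either way
history = [confidence]
for p_if_present, p_if_absent in evidence:
    confidence = update_confidence(confidence, p_if_present, p_if_absent)
    history.append(confidence)

for step, value in enumerate(history):
    print(f"after {step} pieces of evidence: confidence = {value:.3f}")
```

Under these assumed numbers, confidence rises with the two confirming traces and falls again with the disconfirming one, but it never reaches 0 or 1. The claim about the case stays deterministic (the mechanism either operated or it did not), while our knowledge of it stays a matter of degree.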
The Three Elements of Process Tracing
At its core, process tracing can be broken down into three elements: the process we are tracing (mechanisms), the evidence we use to trace it (mechanistic evidence) and whether we can make mechanistic claims across bounded populations of cases (case selection and generalization). In this section, we discuss how recent work on process tracing has tackled these three questions.
What is the Process?
When research is causally oriented, the process traced is a causal mechanism. This, of course, raises the question of how we can define the nature of causal mechanisms. Let us first discuss what causal mechanisms are
not. Causal mechanisms are not causes, but instead are the processes that are triggered by causes and that link them with outcomes in a productive relationship. Mechanisms are not merely series of events that occur in-between the occurrence of a cause and an outcome. A sequence of events tells us who did what and when, but it does not tell us why they did it, and most importantly, why the events were linked in a causal sense. Describing a series of events can provide a plausible descriptive narrative about what happened, but it does not shed light on the underlying causal linkage between a cause and an outcome. Intervening variables are also not mechanisms. Treating a mechanism as an intervening variable means that the ‘mechanism’ is transformed from a productive process into a counterfactual claim. This has two problems. First, analytically, if we want to study a counterfactual we have to assess the difference that variation makes for the outcome, meaning that we are no longer tracing how a mechanism works within a case, but instead are comparing variations across cases that are completely similar with the exception of the intervening variable being present or absent (Runhardt, 2015). This is, of course, an impossibility, meaning that there will always be a number of other differences between the cases that could plausibly account for the difference, turning process tracing into a poor little brother of ‘proper’ scientific research using experiments. Second, even if we could find the two most similar cases that vary only on the intervening variable, by comparing instead of tracing, we lose focus on the process that was actually operative within a case (Machamer, 2004: 31; Groff, 2011; Waskan, 2011). Even the perfect experiment tells us nothing about how a cause is linked to the outcome; only that variation in the treatment has an average causal effect on outcomes. 
In the words of Bogen, ‘How can it make any difference to any of this whether certain things that did not happen would have or might have resulted if other things that did not actually happen had happened?’ (2005: 415).
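The point that an average causal effect is silent about what happens within any single case can be shown with a toy calculation. The sketch below uses invented potential outcomes for two hypothetical populations; the names and numbers are assumptions for illustration only. In real research both potential outcomes of the same case are never observed together, which is why variance-based designs can only estimate the average.

```python
from statistics import mean

# Hypothetical potential outcomes per case: (outcome without cause, outcome with cause).
population_a = [(0, 1), (0, 1), (0, 1), (0, 1)]  # the cause has a modest effect in every case
population_b = [(0, 4), (0, 0), (0, 0), (0, 0)]  # the cause has a large effect in one case only


def case_effects(cases):
    """Effect of the cause within each individual case."""
    return [with_cause - without_cause for without_cause, with_cause in cases]


def average_effect(cases):
    """Mean causal effect across the population."""
    return mean(case_effects(cases))


print(average_effect(population_a), case_effects(population_a))
print(average_effect(population_b), case_effects(population_b))
```

Both populations have the same mean effect, yet the cause does something in every case of the first population and in only one case of the second. The average alone cannot distinguish the two situations, and it says nothing about how the cause produces the outcome where it does.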
The essence of mechanistic explanations is that we shift the analytical focus from causes and outcomes to the hypothesized causal process in-between. Unfortunately, even among scholars who take process tracing seriously, there is disagreement about the nature of causal mechanisms. There are two overall positions: a minimalist position that works with mechanistic ‘sketches’ that do not unpack in detail the workings of the mechanisms, and the productive account that unpacks a mechanism into constituent parts, composed of entities engaging in activities that link a cause (or set of causes) with an outcome in a productive relationship. Both positions are relevant for process tracing, but which one we should work with depends on the research situation. Early in a project, it can make sense to operate with a minimalist understanding, where mechanisms remain relatively black-boxed as mere ‘sketches’ in which the parts and the causal logics linking them are not specified. When there is considerable uncertainty about whether there is a mechanism linking a cause (or set of causes) and an outcome, or there are multiple plausible mechanisms that might link them, it makes sense to engage in a form of plausibility probe to see whether there is any evidence of a link before engaging in a detailed tracing of a full-fledged mechanism. In contrast, tracing a full-fledged mechanism makes sense when one has a strong hunch that a particular mechanism might be operative and one wants to test more rigorously whether the mechanism is operative, or when there is strong evidence of a causal relationship and one wants to find what mechanism is linking the causes and outcome. Despite having a superficial resemblance to the intervening variable understanding of mechanisms, scholars who use a minimalist definition understand mechanisms as what links causes and outcomes. However, the causal arrow between a cause and an outcome is not unpacked in any detail, either empirically or theoretically. 
Typically, scholars operating with a minimalist understanding are more focused on empirical analysis than understanding in more detail how a process
works. They concentrate, therefore, on searching for some form of within-case, mechanistic evidence of a link, understood as the within-case observables that could have been left by a causal mechanism operating within a case. Some scholars in political science use terms like ‘causal process observations’ or ‘diagnostic evidence’ to refer to the within-case traces of mechanisms (e.g. Brady and Collier, 2011; Bennett and Checkel, 2014: 7). Mechanisms in the minimalist understanding are typically theorized at a relatively high level of abstraction. The theoretical causal process that binds the cause, the mechanism and the outcome is not unpacked in any detail. This means that the theorized mechanistic explanation is either: (1) superficial, because neither the parts of the process nor the causal logics linking them are specified, or (2) incomplete, because the causal logics that link parts of the process are not specified (Craver and Darden, 2013: 83–95). A superficial, minimalist mechanism is typically depicted in the form of Cause → Causal Mechanism (M1) → Outcome, whereas an incomplete mechanism scheme would include more parts (e.g. Cause → part 1 → part 2 → Outcome), but the causal logics linking the parts together are not described; they are merely depicted as arrows that are assumed to link parts together in a relationship of conditional dependence. In both instances, the theorized minimalist mechanism does not provide us with enough information to answer fully the ‘how does it work?’ question (Craver and Darden, 2013: 90–1). An example of a theorized ‘minimalist’ mechanism can be found in Nina Tannenwald’s (1999) article on the impact of norms on US decision-making. 
She theorizes that norms against the use of atomic weapons (a nuclear ‘taboo’) (cause) contributed to US decision-makers’ avoidance of using them (outcome), but the mechanism remains firmly within a theoretical grey box because no causal mechanism is detailed whereby the cause is linked to the outcome.
The closest she gets to unwrapping causal mechanisms in the conclusion of the article is where she mentions three plausible links between norms and non-use in the form of minimalist ‘one-liners’: e.g. constraints imposed by individual decision-makers’ personal moral convictions, domestic or world opinion (Tannenwald, 1999: 462). Yet these brief descriptions do not describe the causal process that links norms with non-use; i.e., how does the existence of the taboo actually produce behavioural changes? The analytical result of not unpacking the mechanism in any detail theoretically means that the mechanistic evidence that we gain from a minimalist process tracing case study does not enable strong within-case causal inferences to be made. But early in a research process, given that there can be many different plausible mechanisms linking a cause and outcome, it makes sense to explore whether there is any within-case mechanistic evidence of particular causal processes before one engages in more encompassing, step-by-step tracing of mechanisms. When combined as a two-stage case study research design, the initial minimalist process tracing study acts as a form of plausibility probe of rough mechanism ‘sketches’, intended to narrow down the range of possible causal mechanisms, followed by more detailed process tracing of the mechanism(s) in a given case. Theorizing mechanisms in a minimalist fashion can also be appropriate as a follow-up to a number of successful process tracing case studies to see whether the same mechanism is operative in multiple cases (see below). At its core, the productive account means that the core elements of the causal mechanisms are unpacked theoretically and studied empirically in the form of the traces that the activities associated with parts of the process leave within cases. 
Mechanisms in this understanding are defined as systems of interlocking parts that transmit causal powers or forces from a cause (or a set of causes) to an outcome (examples of this understanding include Machamer
et al., 2000; Machamer, 2004; Beach and Pedersen, 2019). In the productive account, we are operating at a lower level of analytical abstraction because we are trying to capture how actual causal processes play out within cases. The level of abstraction can range from very detailed, case-specific models of processes to more abstract processes that can in theory be present within a bounded population of cases. The ambition is to unpack explicitly the causal process that occurs in-between a cause (or set of causes) and an outcome and trace each of its constituent parts empirically. Here the goal is to dig deeper into how things work: by tracing each part of the mechanism empirically using mechanistic evidence, and in particular observing the empirical fingerprints left by the activities of entities in each part of the process, we are arguably able to make stronger causal inferences about how causal processes actually work in real-world cases (Russo and Williamson, 2007; Illari, 2011). In comparison, in the ‘minimalist’ understanding we have less direct mechanistic evidence because we have not made the process explicit, resulting in weaker inferences about the operation of a causal process. When mechanistic explanations are viewed as systems, they are understood in a holistic way where the effects of the mechanism are more than the sum of its parts. Parts have no independent existence (i.e. they are not variables) in relation to producing an outcome; instead, they are integral parts of a system that transmits causal forces to the outcome. When a mechanism is theorized as a system, there is often a complex interrelationship between its parts, where the effects of individual parts often only manifest themselves fully together with the effects of other parts. In the words of Cartwright, ‘There are any number of systems whose principles cannot be changed one at a time without either destroying the system or changing it into a system of a different kind’ (2007: 239). 
This means that a mechanism-as-system explanation cannot be reduced to counterfactual dependencies.
Table 17.2 ‘Unpacked’ causal mechanism of nuclear taboo
Cause: Norm against use of nuclear weapons within group
Part 1: → believer in taboo uses speech act to shame proponents of use
Part 2: → proponents of use are silenced because they are unable to deploy counter arguments that clash with taboo
Outcome: → non-use of nuclear weapons
Each of the parts of the mechanism can be described in terms of entities that engage in activities (Machamer et al., 2000; Machamer, 2004). Entities are the factors (actors, organizations or structures) engaging in activities, whereas the activities are the manifestations of the causal powers of the entities in each part, or in other words, what transmits causal forces or powers through a mechanism. Given the focus on the productive nature of mechanisms, activities that link parts are the main analytical focus. Returning to the Tannenwald example above, the minimalist mechanism theorized by Tannenwald that linked individual moral convictions and behaviour could be unpacked theoretically by detailing parts of the process and the causal logics linking them. Here, we would first have to develop further the causal logics underpinning the links between parts of the process, making explicit what activities link parts and why. For example, we could draw on theories that focus on the constraining impact of norm-based speech acts. Using this causal logic, the mechanism could then be depicted as in Table 17.2. The theorized mechanism has two parts: (1) a believer (entity) in the taboo uses a speech act (activity) to attempt to shame proponents of use, and (2) the proponents (entity) are silenced (activity) because they are unable to deploy counterarguments, given the normative costs of doing so (clash with taboo).
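To make the vocabulary of entities and activities concrete, the mechanism in Table 17.2 can be written down as a small data structure. The following Python sketch is purely illustrative: the class and field names (`Part`, `Mechanism`, `entity`, `activity`) and the `is_unpacked` rule are assumptions introduced here, not notation taken from the process tracing literature.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Part:
    """One part of a mechanism: an entity engaging in an activity."""
    entity: str    # who or what acts (actor, organization or structure)
    activity: str  # what the entity does that transmits causal force


@dataclass
class Mechanism:
    """A cause linked to an outcome through an ordered sequence of parts."""
    cause: str
    parts: List[Part]
    outcome: str

    def is_unpacked(self) -> bool:
        # Hypothetical rule of thumb: a mechanism counts as 'unpacked' only
        # if it has parts and every part names both an entity and an activity.
        return bool(self.parts) and all(p.entity and p.activity for p in self.parts)

    def describe(self) -> str:
        # Render the chain in the arrow notation used in the chapter.
        steps = [self.cause]
        steps += [f"{p.entity} {p.activity}" for p in self.parts]
        steps.append(self.outcome)
        return " -> ".join(steps)


# The nuclear taboo mechanism as unpacked in Table 17.2.
taboo = Mechanism(
    cause="norm against use of nuclear weapons within group",
    parts=[
        Part("believer in taboo", "uses speech act to shame proponents of use"),
        Part("proponents of use", "are silenced because counterarguments clash with taboo"),
    ],
    outcome="non-use of nuclear weapons",
)

# A minimalist sketch leaves the parts unspecified (Cause -> M1 -> Outcome).
sketch = Mechanism(cause="norm against use", parts=[], outcome="non-use")

print(taboo.describe())
print(taboo.is_unpacked(), sketch.is_unpacked())
```

Writing the parts out this way forces the analyst to state, for each step, which entity is acting and through which activity, which is exactly the information a minimalist sketch leaves out.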
Mechanistic Evidence as Traces of Mechanisms
How can we trace mechanisms within cases? While many accounts of process tracing merely suggest that we go out and collect
observations, this tells us little about what actually constitutes relevant evidence in relation to tracing mechanisms in case studies. Because most existing terms in the social science process tracing literature like ‘causal process observations’ are imprecise about what we are tracing and how we can trace it, in this chapter I utilize the term ‘mechanistic evidence’ from the natural sciences to refer to the overall type of evidence that can be used to learn about the operation of causal mechanisms (Russo and Williamson, 2007). Mechanistic evidence can be defined as any type of empirical fingerprint that has probative value in relation to determining whether an activity is associated with a part of a mechanism. When we are working with a minimalist sketch of a mechanism, the activities still need to be considered, because otherwise we have no idea what relevant traces could be when we have no clue about what we are tracing. There are four distinguishable types of mechanistic evidence: pattern, sequence, trace and account. Pattern evidence relates to predictions of statistical patterns in the empirical record. For example, if we are testing a causal theory of racial discrimination in a case dealing with employment, then statistical differences in patterns of employment could actually be relevant evidence upon which we could make inferences. Sequence evidence deals with the temporal and spatial chronology of events that are predicted by a hypothesized causal mechanism. Trace evidence is evidence whose mere existence provides proof. For example, if we were testing a theory about lobbying, the existence of some record of a meeting being held between a decision-maker and a lobbyist would be proof that they had met. Finally, account evidence
deals with the content of empirical material, be it minutes that detail what was discussed in a meeting, an oral account of what took place in a meeting or in the form of a discourse present in speeches or other material. Bayesian reasoning can provide us with a set of logical tools for evaluating what finding particular pieces of mechanistic evidence tells us about our theories of causal mechanisms. As used here, empirical material acts as evidence that can either increase or decrease our confidence in a hypothesis being valid. Updating our confidence in a hypothesis about a causal mechanism (or a part thereof) operating in a case is a function of: (1) our prior confidence based on existing knowledge, (2) the confirming or disconfirming power of evidence in theory and (3) whether we can trust the sources for particular observations (empirical evaluation). After we have collected new mechanistic evidence, we update our degree of confidence in the presence of the causal mechanism and how it worked within a particular case (termed posterior probability). We need to ask two sets of Bayesian-inspired questions in order to make inferences about whether an activity was present or not in a case. First, at the theoretical level, we need to evaluate whether an activity has to leave a particular type of empirical fingerprint (theoretical certainty), and if found, whether there are alternative explanations for finding it (theoretical uniqueness). Theoretical certainty relates to the disconfirming power of evidence, whereas theoretical uniqueness deals with the confirming power. One common misunderstanding in social science applications of Bayesian reasoning in process tracing is that theoretical uniqueness of evidence is always relative to rival or competing theories that explain the outcome (e.g. Bennett, 2014; Fairfield and Charman, 2017: 366–8). 
However, what we are talking about here are alternative explanations for finding a particular piece of evidence – not alternative explanations of the outcome. These ‘predictions’ are still not evidence; therefore it is more proper to use the term
proposition about proposed empirical fingerprints. We only have actual mechanistic evidence when we also evaluate the actual sources of the observations (or lack thereof) of the propositions. At the empirical level, we have to ask ourselves whether a particular observation matches what we expected to find theoretically, whether we had access (if not found) and if found, whether we can trust it. We might only have very untrustworthy sources for observing the fingerprint, even though the fingerprint was in theory very confirming if found. In this instance, the actually found mechanistic evidence would not enable strong confirmation because we could not trust the source. Alternatively, we could be in the situation where we had a very theoretically certain fingerprint, but we did not have access to an empirical record that would enable us to observe whether it was actually present or not (e.g. because we did not have access to an archive). Here, absence of evidence would not be evidence of absence. Therefore, the overall probative value of evidence can be summarized as in Figure 17.1. At the theoretical level, we ask ourselves about the evidence-generating process in relation to a particular activity, speculating about what types of fingerprints may be left in a given case in theory. At the empirical level, one needs to consider a range of source-critical questions relating to access to sources, and whether we can trust particular sources (e.g. an interview with a stakeholder). Returning to the Tannenwald (1999) example, a proposition about an empirical fingerprint in her article was that she expected to find ‘taboo talk’, defined as ‘non-cost-benefit-type reasoning along the lines of “this is simply wrong” in and of itself because of who we are, what our values are, “we just don’t do things like this”, “because it isn’t done by anyone”, and so on’ (1999: 440). 
She then goes on to argue that the fingerprint is not theoretically certain because when norms really matter, they will not manifest themselves openly (i.e. taboo talk would then not be expected to be found despite norms really mattering).
[Figure 17.1: diagram. Theoretical level (evidence-generating process): an entity engaging in an activity leads to a proposition about evidence (the empirical fingerprint of the activity). Questions to be asked: Do we have to find the empirical fingerprint? If found, are there any alternative explanations for finding it? Empirical level (observational process): observations (sources). Questions to be asked: Have we found/not found the proposed fingerprint? Can we trust the source?]
Figure 17.1 A two-stage evidence-evaluation framework for turning empirical material into evidence of mechanisms in process tracing. Source: Beach and Pedersen (2019).
Table 17.3 Template for transparent evaluation of mechanistic evidence in process tracing
Theoretical level – part of a causal mechanism:
• Description of activity associated with a part of a causal mechanism
Proposition level – empirical fingerprint of activity, and theoretical evaluation (certainty and uniqueness):
• Description of proposition about empirical fingerprints
• Theoretical evaluation of proposition: do we have to find it (theoretical certainty), and if found, are there alternative explanations for finding the proposition (theoretical uniqueness)?
Actual sources and source-critical evaluation:
• Description of actual source of observation (e.g. statement by actor in interview, or extract from archival document)
• Empirical evaluation of observation: what does the observation mean in context? Can we trust it?
However, she claims that it is theoretically unique, writing that ‘Taboo talk is not just “cheap talk” as realists might imagine’ (1999: 440), and pointing to research suggesting that when people make normative statements like taboo talk, this gives insight into their underlying motives. Unfortunately, in the article she does not engage in the empirical evaluation of what finding particular observations in sources means. For instance, she claims that one example of taboo talk is ‘Dean Rusk, at the time assistant secretary of state for the Far East … [who] wrote that “We would have worn the mark of Cain for generations to come”’ (Tannenwald, 1999: 445). However, given that the quote is found in an autobiographical work written some time later, it might also be a post-hoc justification instead of reflecting an actual statement that he made during a decision-making process
about using nuclear weapons. Therefore, while this piece of mechanistic evidence might in theory be confirming, because we cannot trust the source, it provides precious little confirmation of the existence of the part of the theorized causal mechanism. Good process tracing should therefore explicitly link an activity with actual sources by evaluating, both theoretically and empirically, what inferences the mechanistic evidence makes possible. This can be done in table form, as depicted in Table 17.3.
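The Bayesian-inspired logic of this section can be illustrated with a minimal numerical sketch in Python. It rests on several assumptions that are not part of the framework itself: theoretical certainty is read as the probability of the fingerprint being left if the activity took place, theoretical uniqueness as one minus the probability of finding the fingerprint if the activity did not take place, and source trust (or access) as a hypothetical weight between 0 and 1 that shrinks both probabilities towards an uninformative 0.5. The function name `update_confidence` and all numerical values are invented for illustration.

```python
def update_confidence(prior: float, certainty: float, uniqueness: float,
                      found: bool, trust: float = 1.0) -> float:
    """Return the posterior confidence that an activity took place in a case.

    prior      -- confidence before looking at the evidence (0..1)
    certainty  -- assumed P(fingerprint left | activity took place)
    uniqueness -- assumed 1 - P(fingerprint found | activity did not take place)
    found      -- whether the proposed fingerprint was observed
    trust      -- 0..1 weight for source reliability or access (sketch assumption)
    """
    p_e_h = certainty
    p_e_not_h = 1.0 - uniqueness
    if not found:
        # Not finding the fingerprint is the complementary observation.
        p_e_h, p_e_not_h = 1.0 - p_e_h, 1.0 - p_e_not_h
    # Assumed discounting rule: with trust = 0 both likelihoods become 0.5,
    # so the observation leaves the prior unchanged.
    p_e_h = trust * p_e_h + (1.0 - trust) * 0.5
    p_e_not_h = trust * p_e_not_h + (1.0 - trust) * 0.5
    numerator = prior * p_e_h
    denominator = numerator + (1.0 - prior) * p_e_not_h
    if denominator == 0.0:
        raise ValueError("observation has zero probability under both hypotheses")
    return numerator / denominator


# Highly unique fingerprint, found in a trusted source: strong confirmation.
print(update_confidence(0.5, 0.9, 0.9, found=True))
# Same fingerprint from a source we cannot trust at all: no updating.
print(update_confidence(0.5, 0.9, 0.9, found=True, trust=0.0))
# Highly certain fingerprint not found despite full access: disconfirmation.
print(update_confidence(0.5, 0.9, 0.9, found=False))
# Not found, but no access to the archive: absence of evidence only.
print(update_confidence(0.5, 0.9, 0.9, found=False, trust=0.0))
```

The second and fourth calls correspond to the two situations described above: a theoretically confirming fingerprint from an untrustworthy source, and a theoretically certain fingerprint that could not be looked for. In both, the posterior stays at the prior under the discounting rule assumed here.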
Case Selection and Generalization
This section first discusses how we can select appropriate cases, followed by an elaboration of the challenges regarding generalizing about causal mechanisms given their contextual sensitivity.
The first task is to select cases that are appropriate for engaging in process tracing. If we are interested in tracing a causal mechanism linking a cause or set of causes and an outcome, we logically want to trace it in cases where it could have been present, at least in theory. Tracing a mechanism in a case where we knew a priori that it could not be present, because the cause that triggers it is absent, cannot tell us anything about how the mechanism works. Guidelines for case selection that are appropriate for variance-based designs – where by definition we need to investigate negative cases (absence of cause) to assess whether variation makes a difference – are therefore not relevant for a case-based method like process tracing. Selecting appropriate cases for process tracing requires that we first map a population of cases by scoring them according to whether the cause (or set of causes) that is theorized to trigger a mechanism and the outcome are present, along with contextual conditions that can potentially affect how the process works. Contextual conditions can be defined as any factor that could impact on how a process works. For example, the degree of ethnic heterogeneity in a country can conceivably impact how societal conflicts play out. Mapping cases on the cause(s), outcome and potentially relevant contextual conditions can be done using qualitative comparative analysis (QCA) (e.g. Ragin, 2000; Schneider and Rohlfing, 2013, 2016), or by hand using a table in the form of a similarity
graph (Berg-Schlosser, 2012: 111–59), as depicted in Table 17.4. Here, case scores for all potentially relevant conditions are assessed to determine the level of similarity between cases. Case membership is depicted as either present (+) or absent (−). Levels of similarity – or what Berg-Schlosser terms ‘Boolean distance’ – are determined by the number of conditions shared by cases with the same outcome. These procedures can be simplified by using a simple spreadsheet that compares the number of conditions shared across cases to find the one that has the most commonalities with other cases. Once we have mapped cases, we are able to distinguish between four types of cases: (1) typical cases where the hypothesized cause, outcome and contextual conditions are all present; (2) deviant consistency cases, where the known cause and contextual conditions are present but the outcome is not present; (3) deviant coverage cases, where the cause(s) is not present but the outcome is; and (4) irrelevant cases where neither the cause nor the outcome is present (Schneider and Rohlfing, 2013, 2016). Critical for distinguishing between types of cases in case-based research are the qualitative differences-in-kind in concepts that demarcate the presence or absence of causal and contextual conditions. These differences-in-kind, which separate cases that are members of the set of a condition from those that are not, are depicted as thresholds in Figure 17.2. Cases within cell I are typical cases that are in the set of both the cause(s) and known contextual conditions and the outcome. In process tracing, typical cases are used for building and testing theories about
Table 17.4 Mapping cases using a similarity graph
Case no. | Cause | Contextual conditions: C1 | C2 | C3 | # of similarities with case 3
1 | + | + | + | + | 2
2 | + | + | + | + | 2
3 | + | + | + | − | n/a
4 | + | + | − | − | 2
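[Figure 17.2: two-by-two diagram. Rows: outcome present / outcome not present. Columns: cause(s) and/or contextual conditions not present / cause(s) and known contextual conditions present. Cell I (outcome present; cause(s) and known contextual conditions present): typical cases (theory-testing or -building process tracing). Cell II (outcome present; cause(s) and/or contextual conditions not present): deviant case coverage. Cell III (outcome not present; cause(s) and/or contextual conditions not present): irrelevant cases. Cell IV (outcome not present; cause(s) and known contextual conditions present): deviant case consistency (theoretical revision process tracing).]
Figure 17.2 Four types of cases in process tracing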
mechanisms, whereas deviant consistency cases shed light on why a process did not work as expected, e.g. there was an omitted contextual condition that has to be present for the mechanism to work. While relevant for other case-based and variance-based methods, deviant coverage and irrelevant cases tell us nothing about the mechanisms linking causes and outcomes, and therefore have limited use for process tracing. What typical cases should we select when there are multiple cases that share the cause(s) and outcome? When we are uncertain about what contextual conditions have to be present for a given mechanism, we can start by selecting a typical case where as many relevant contextual conditions as possible are present. Alternatively, when there are empirical and/or theoretical reasons to believe that there is considerable mechanistic heterogeneity in the population, it can make sense to select a case that is similar to other typical cases on as many contextual conditions as possible. For example, if we are engaging in theory-building, we would not want initially to develop a theorized mechanism using a case that differs from other typical cases on many contextual conditions because we could reasonably expect that the found mechanism would not travel to many other cases. This is depicted in Table 17.4, where a comparison of similarity focusing on contextual conditions (C1, C2, C3) is depicted within a small population of cases. Here case 3 has the most similarities with cases 1 and 2, and then case 4. In contrast, case 4 only has one similarity with cases 1 and 2. If we had to choose only one case for process tracing, it makes sense
to select case number 3. If we have two cases to select, it would make more sense to select either case 1 or 2, and then case 4 in order to explore whether differences in C2 and C3 impact on how the mechanism works. If we then find the mechanism in a typical case, we cannot automatically infer to other cases where fewer or more contextual conditions are present. We would then want to study another case with other contextual conditions present, gradually becoming more confident about what conditions have to be present for the mechanism to function through repeated case studies using the snowballing-outwards case selection strategy discussed later in this chapter. When there is residual uncertainty either because of theoretical or empirical reasons about where the exact threshold of a condition (either cause, contextual or outcome) is, it is advisable to only select typical cases that are widely accepted by case experts as being within the set of the concept; i.e. they are almost ‘ideal-typical’ cases where there are clear substantive and theoretical arguments for set membership that can be documented in a transparent fashion (Goertz and Mahoney, 2012: 133). Cases within quadrant IV in Figure 17.2 are used in theoretical revision process tracing, in which a mechanism is traced until it breaks down in order to detect either omitted contextual conditions when a cause (or set of causes) was theorized to be sufficient, or omitted causal conditions when a cause (or set of causes) is not assumed to be sufficient. If we theorize that a cause is a sufficient cause of an outcome, deviant cases where the cause is present but where the outcome is
not present are useful to investigate the contextual conditions that have to be present to trigger the mechanism that will produce the outcome. If the cause is not theorized to be sufficient, deviant cases within quadrant IV can be used to detect omitted causal conditions that together with cause would be sufficient to produce the outcome. In both instances, we only employ this type of design after we have positive results when tracing a mechanism in one or several typical cases within quadrant I. The reasoning behind this suggestion is that there is no reason to investigate mechanism breakdown before we are relatively confident about the actual existence of a mechanism in one or more typical cases and have some knowledge about how it operates. If we do not know much about how a mechanism operates, it would be very difficult to investigate why it broke down in a deviant consistency case. Once we are confident about what is going on in typical cases, investigating deviant consistency cases is very important for developing better mechanism-based explanations. Using an analogy, once we are certain about the mechanisms that enable airplanes to fly, we would want to investigate very closely any accidents to develop a better understanding of the contextual conditions under which planes can fly safely, for obvious reasons. The results of process tracing case studies either shed light on how a given causal mechanism operates within a case, or tell us that no mechanism was operative in the case despite our insistent probing. How then can we generalize our findings about mechanisms to other cases without engaging in process tracing in all other cases? The literature on causal mechanisms frequently points out that the ways mechanisms unfold in a specific case are sensitive to the context which surrounds them (Steel, 2008; Falleti and Lynch, 2009). Contextual conditions are sometimes termed ‘scope’ conditions in the literature, but the terms mean the same thing. 
In this chapter, contextual conditions are defined as all ‘[…] relevant aspects of a setting (analytical,
temporal, spatial, or institutional)’ (Falleti and Lynch, 2009: 1152) in which the analysis is embedded and which might have an impact on the constitutive parts of a mechanism. Even when the same causes and outcome are present, different contextual conditions can create differences in the mechanisms linking them together (Steel, 2008; Falleti and Lynch, 2009). This means that two cases that look causally homogeneous based on sharing similar causal conditions and an outcome might be heterogeneous at the mechanism level because of contextual differences. These differences can be defined as mechanistic heterogeneity, which can mean either that in two (or more) cases the same cause triggers different processes, thereby resulting in different outcomes, or that the same cause is linked to the same outcome through different processes. The processes can differ either in the whole mechanism or only in one or more parts that diverge between two cases. A real-world example of mechanistic heterogeneity produced by differences in context can be found in White (2009). He describes a mechanism that links a cause (education of mothers in nutrition) with an outcome (improved nutritional outcomes for children) that was found to have worked in a case (the Tamil Nadu Integrated Nutrition Project in India). The unpacked mechanism can be described as: Cause (mother participates in programme) → (1) mother receives nutritional counselling → (2) exposure results in knowledge acquisition → (3) knowledge used to change child nutrition → Outcome (improved nutritional outcomes) (based on White, 2009: 4–5). Based on the success of the programme in the Tamil Nadu case in India, it was then attempted to use it in Bangladesh. However, the mechanism did not function as expected in the different context; instead it broke down. The reason for this was a key contextual difference. 
In Bangladesh, mothers were not the key decision-makers in households, with men doing the shopping, and mothers-in-law in joint households (a sizeable minority) acting
as decision-makers about what food went onto the table. The mechanism therefore ‘worked’ until part 3, but because of a contextual difference, it broke down in the Bangladesh case. Note that the problem of mechanistic heterogeneity cannot just be resolved by raising the level of abstraction of the theorized mechanism. Of course, logically the higher the level of abstraction of a theorized mechanism, the greater our ability to generalize about the mechanism across cases, other things equal. There can be different levels of abstraction of theorized mechanisms, ranging from mid-range mechanisms where the process is described in quite abstract terms but where we can still identify interlocking parts composed of entities engaging in activities, over contingent mechanisms at quite low levels of abstraction, to detailed, case-specific mechanisms that describe a causal process in a particular case. But to qualify as a mechanism-based explanation (i.e. the productive account), we have to be able to answer the ‘how does it work?’ question, which requires that the causal arrow in-between causes and outcomes has to be elucidated in enough detail so that the critical parts of the process are made clear in terms of entities engaging in activities that link one part to the next (Craver and Darden, 2013: 31). Unfortunately, existing guidelines for generalization of case study findings – be they variance-based (Gerring and Seawright, 2007) or case-based (Schneider and Rohlfing, 2013) – are blind to the risk of mechanistic heterogeneity. The basic logic in existing guidelines is that generalizations from one case to others deal with using the within-case evidence of the causal effect/association gained from the studied case(s) to make us more/less confident in the average causal effect (variance-based) or invariant association (case-based) in the rest of the population. 
Therefore, most of the literature that is relevant for generalizing about mechanisms is still trapped in thinking about cross-case causal relationships, be they causal effects in variance-based approaches, or invariant claims about necessity or sufficiency in case-based approaches.
Given the sensitivity of mechanisms to context, we should be averse to making any cross-case generalizing claims about similar mechanisms being present in other cases to be studied until we have tested whether it is reasonable to expect similar mechanisms operative in other cases in the population. This can be done using a snowballing-outwards strategy for multiple case studies, in which we first start with in-depth process tracing in the ‘most similar’ typical case (see above). This is then followed by selecting cases with more differences until the outer bounds of the operation of a given causal mechanism are found, incrementally enlarging or restricting the boundaries of our generalizing inferences about what mechanisms are operative. There are many potential sources of mechanistic heterogeneity that can be lurking in what otherwise looks at the level of conditions/outcomes like a causally homogenous set of cases (for more, see Beach and Pedersen, 2019). Here I illustrate the risks of two potential sources: known/unknown omitted contextual conditions, and differences of degrees in concepts. First, unknown conditions are the product of exclusion before the cross-case analysis because we either did not know about them or the literature suggested they were not relevant for a cross-case relationship. Known conditions are ones that are ignored when selecting cases because the cross-case analysis found they did not matter for the identified causal effect/invariant association. However, while unknown/known conditions might not matter for a cross-case relationship, given the contextual sensitivity of mechanisms, we cannot assume that they do not also matter at the level of mechanisms. Therefore, testing for unknown conditions can involve engaging in process tracing of two cases that appear to be identical to see whether there might be an unknown contextual condition that differs and that impacts on how the mechanism works. 
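The similarity comparison behind Table 17.4 and the ordering of cases in a snowballing-outwards design can be sketched in a few lines of Python. The condition scores are those of Table 17.4; everything else is an assumption made for illustration. In particular, the chapter does not state a formal selection rule, so `most_similar_case` uses a hypothetical one (pick the case whose smallest number of shared conditions with any other case is largest), chosen only because it reproduces the choice of case 3 discussed for that table. The function names are likewise invented.

```python
from typing import Dict, List

Scores = Dict[str, bool]

# Contextual condition scores from Table 17.4 (True = present, False = absent).
CASES: Dict[int, Scores] = {
    1: {"C1": True, "C2": True, "C3": True},
    2: {"C1": True, "C2": True, "C3": True},
    3: {"C1": True, "C2": True, "C3": False},
    4: {"C1": True, "C2": False, "C3": False},
}


def shared_conditions(a: Scores, b: Scores) -> int:
    """Number of contextual conditions on which two cases have the same score."""
    return sum(a[name] == b[name] for name in a)


def least_shared(case_id: int, cases: Dict[int, Scores]) -> int:
    """Smallest number of conditions the case shares with any other case."""
    return min(shared_conditions(cases[case_id], other)
               for cid, other in cases.items() if cid != case_id)


def most_similar_case(cases: Dict[int, Scores]) -> int:
    # Assumed rule: maximize the worst-case similarity to the other cases.
    return max(cases, key=lambda cid: least_shared(cid, cases))


def snowball_order(start: int, cases: Dict[int, Scores]) -> List[int]:
    """Remaining cases ordered from most to least similar to the starting case."""
    others = [cid for cid in cases if cid != start]
    return sorted(others, key=lambda cid: -shared_conditions(cases[start], cases[cid]))


# Similarities with case 3, as in the last column of Table 17.4.
print({cid: shared_conditions(CASES[3], CASES[cid]) for cid in (1, 2, 4)})
print(most_similar_case(CASES))
print(snowball_order(1, CASES))
```

Starting from a chosen case, `snowball_order` lists the remaining cases in the order in which they would be added if one moves outwards from the most similar to the most different case. With real data, ties and the choice of which conditions to count would need substantive justification rather than a mechanical rule.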
Testing for the impact of known conditions can be done by returning to the original data set (e.g. a QCA
truth table or similarity graph) and selecting cases that are progressively more different on selected contextual conditions to explore whether their presence/absence makes a difference for the mechanisms.
Second, mechanistic heterogeneity can also be lurking behind differences-of-degree in concepts, be they in the form of interval-scale variables or fuzzy-set concepts. In the following, I focus on the problem of differences-in-degree in fuzzy-set concepts, but it is important to note that the problem is even greater when using ordinal/interval-scale variables as the basis for mapping cases and generalizing because qualitative thresholds are not included. Therefore, we would first have to recalibrate the variable to include a categorical difference before we could proceed further, in effect transforming the variable into a fuzzy-set concept. The main analytic idea behind fuzzy-sets is that they enable us to capture both differences-of-kind and differences-in-degree in a single measure. The key categorical difference-in-kind – i.e. qualitative threshold – is the 0.5 fuzzy-set value, which determines whether a case is a member in a given set or not. Above or below this 0.5 crossover point, the membership of cases can range from fully in (1) to fully out of the set (0) with the option of introducing fine-grained gradations in-between to account for partial (non-)membership of cases. Following this, the causal rationale at the cross-case level is that the categorical point of difference at 0.5 establishes the threshold at which a concept develops or changes its causal character due to its change in conceptual status, e.g. switching from a unified to a divided government, or from public support to lack of public support. Differences-of-degree, on the other hand, are usually conceived as having a linear effect, i.e. 
a partial-set value of 0.6 in a condition is expected to result in a partial outcome of 0.6, whereas a full membership in a condition should fully produce the outcome (e.g. Schneider and Rohlfing, 2013: 581; 2016: 548). However, when we move to the level of mechanisms, it is by no means certain that
these differences-of-degree mean the same thing in causal terms as they do at the level of causes/outcomes. It is not much of a stretch to imagine that a mechanism might operate differently when a cause has a value of 0.6 in relation to full membership (1). One real-world example of this can be found in Samford’s (2010) study on trade liberalization in Latin America. His case narratives of a set of cases that QCA told him were causally similar illustrate that very different processes were at work in Peru (1990–95) and Uruguay (1972–85) although both are part of the found conjunction of causes (hyperinflation and unconstrained executive power). The condition inflation played a different causal role in the processes operative in the two cases, a fact which might be expected given that the inflation rate in Peru was at 7,481% while Uruguay’s rate was only(!) 58%. Therefore, the snowballing-outwards strategy of selecting multiple cases to explore whether mechanistic heterogeneity is present should also explore whether differences-of-degree mask important categorical differences in which mechanisms are operative in different cases.
One way to reduce the analytical costs of conducting multiple process tracing case studies is to focus only on critical parts of the identified mechanism. Critical stages of a mechanism can be determined by asking whether there are parts of the process that are particularly crucial from a causal perspective, and where we have theoretical reasons to expect that the processes might most plausibly differ across cases (Steel, 2008: 88–92). An alternative means to lighten the burden of engaging in a snowballing-outwards case selection strategy is to develop ‘signatures’ of a particular process that can be tested across a number of cases.
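The ordering logic behind a snowballing-outwards strategy can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a procedure given in the chapter: the `Case` class, the helper names, the tie-breaking rule, and all membership scores and contextual conditions are hypothetical. Cases that are members of both cause and outcome (membership above the 0.5 crossover) are ordered by how many contextual conditions separate them from the typical case, and then by how far their membership in the cause differs in degree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    cause: float      # fuzzy-set membership in the cause (0..1)
    outcome: float    # fuzzy-set membership in the outcome (0..1)
    context: tuple    # crisp contextual conditions (hypothetical)


def is_member(score: float) -> bool:
    """Qualitative threshold: a case is in the set only above the 0.5 crossover."""
    return score > 0.5


def context_distance(a: Case, b: Case) -> int:
    """Number of contextual conditions on which two cases differ."""
    return sum(x != y for x, y in zip(a.context, b.context))


def snowball_order(typical: Case, population: list) -> list:
    """Order positive cases from most similar to most different from the typical case.

    Ties on contextual distance are broken by the gap in membership in the
    cause, so that differences-of-degree are also explored step by step.
    (The tie-breaking rule is an assumption made for this sketch.)
    """
    candidates = [
        c for c in population
        if c != typical and is_member(c.cause) and is_member(c.outcome)
    ]
    return sorted(
        candidates,
        key=lambda c: (context_distance(typical, c), abs(typical.cause - c.cause)),
    )


# Hypothetical data, invented purely for illustration.
typical = Case("A", 1.0, 1.0, (1, 1, 0))
population = [
    typical,
    Case("B", 0.9, 0.8, (1, 1, 0)),
    Case("C", 0.6, 0.7, (1, 1, 0)),
    Case("D", 0.8, 0.9, (1, 0, 0)),
    Case("E", 0.7, 0.6, (0, 0, 1)),
    Case("F", 0.3, 0.2, (1, 1, 0)),  # not a member of cause or outcome
]

order = [c.name for c in snowball_order(typical, population)]
print(order)
```

In this toy population, a case such as C shares all contextual conditions with the typical case but sits close to the crossover point, which is exactly the kind of case where a difference-of-degree might mask a different mechanism.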
Using Process-Tracing in Practice
The four different variants of process tracing are depicted in Table 17.5. All four variants can be used both in a minimalist version in which the mechanism is not really unpacked, and a more in-depth version aiming to trace more detailed mechanisms that are viewed as systems composed of entities engaging in activities that bind parts of the process together.
Table 17.5 Four variants of process tracing
Variant | Research purpose | Analytical focus
Theory-testing process tracing | Is hypothesized causal mechanism present and does it function as theorized? | Theory-focused
Theory-building process tracing | What is the causal mechanism between the cause and outcome? | Theory-focused
Theoretical revision process tracing | Why did the mechanism break down in-case? | Theory-focused
Explaining outcome process tracing | What mechanistic explanation accounts for historical outcome? | Case-focused
Theory-testing process tracing starts with conceptualizing a plausible hypothetical causal mechanism based on existing theorization and empirical research. Theorized causal mechanisms then need to be operationalized in terms of developing propositions about potential empirical fingerprints that might have been left in a given case by the activities of a mechanism (or its parts). The predictions about mechanistic evidence should be as clear as possible, making it easier to determine whether they are then actually found in the subsequent case study or not. The researcher then collects and assesses the available empirical record to determine whether there is mechanistic evidence suggesting that the mechanism was present and worked as theorized, or whether the theory needs to be modified. If the predicted evidence is found, we can then infer that the hypothesized causal mechanism is present in the case and worked as we theorized. If evidence is not found for a given part (or for the overall mechanism if the minimalist understanding is used), the researcher should engage in a round of theory-building, using the insights gained from the empirical analysis of what went wrong as inspiration for building theories of new parts of the mechanism.
Theory-building process tracing is an empirics-first form of research that in its purest form starts with empirical material and uses a structured analysis of this material to build a plausible hypothetical causal mechanism whereby a cause (or set of causes) is
linked with an outcome that can be present in multiple cases, meaning it can be generalized beyond the single case. In effect, it involves using empirical material to answer the question ‘how did we get here?’. Theory-building process tracing is utilized primarily when we know that there might be a relationship between a cause and an outcome, but we are in the dark regarding potential mechanisms linking the two. In reality, theory-building process tracing is usually an iterative and creative process that goes back and forth between empirical probing and theorization. After key theoretical concepts (causes and outcome) are defined and operationalized, theory-building proceeds to investigate the empirical material in the case, using empirical material as clues about the possible empirical fingerprints of an underlying causal mechanism. This involves an intensive and wide-ranging search of the empirical record, with material collected without knowing what it is evidence for. Here it can also be helpful to develop a descriptive narrative of what happened in the case to shed light on potential mechanisms. The next step involves inferring that the found observable empirical material is actual evidence that reflects the empirical fingerprints left by the operation of a plausible causal mechanism in the case. Tentative hunches about potential mechanisms (and their parts in the productive account) are made based on the first round of empirical probing, after which the researcher proceeds to evaluate whether any of the collected material is actually evidence of the tentative mechanism (or parts of the mechanism). This evaluation of evidence proceeds in a
slightly different fashion than in theory testing, given that it is not relevant to discuss the theoretical or empirical certainty of evidence because it has already been found; instead one only evaluates uniqueness in relation to the tentative hypothesized mechanism or its parts. Evidence does not speak for itself. Often theory-building has a strong testing element, in that scholars seek inspiration from existing theoretical work and previous observations for what to look for. Here existing theory can be thought of as a form of grid to detect systematic patterns in empirical material, enabling inferences about predicted evidence to be made. In addition, one can look to research on mechanisms on similar research topics for inspiration for what parts of the mechanism might look like. In other situations, the search for mechanisms is based upon hunches drawn from puzzles that existing work cannot account for.
Theoretical revision process tracing involves a combination of tracing a mechanism in a deviant consistency case (see above), where a mechanism should have been operative but where it broke down, and using the information garnered about where in the process the mechanism broke down to engage in a focused comparison of what differs between a typical case where the mechanism worked and the studied deviant consistency case. The goal is to uncover unknown omitted conditions that have to be present for the mechanism to function properly.
Explaining outcome process tracing is an iterative research strategy that aims to trace causal mechanisms in order to produce a comprehensive explanation of a particular outcome. Cases here are not ‘cases of’ a particular causal relationship; instead they are viewed holistically as a particular historical outcome such as the peaceful resolution of the Cuban Missile Crisis. 
Research in explaining outcomes uses abductive analysis as a way of building explanations, where there is a continual and creative juxtaposition between empirical material and theories. The types of theoretical explanations constructed often
are eclectic combinations, viewing theories as heuristic tools. There are two different starting points for explaining outcome process tracing: either theory or empirics. The theory-first path follows the steps described above under theory testing, where an existing cause (or set of causes) and the associated mechanisms are tested to see whether they can account for the outcome. In most research situations, a single existing cause and mechanism cannot provide a sufficient explanation of an outcome, resulting in a second stage of research where either a testing or building path can be chosen, informed by the results of the first empirical analysis. If the testing path is chosen again, this would involve testing another theorized cause and associated mechanism as a supplemental explanation to see whether together they can account for the ‘big and important’ things going on in the case. Alternatively, the theory-building path can be chosen in the second iteration, using empirical evidence to build a new mechanism that can account for the elements of the outcome that were unaccounted for using the first mechanism, following the steps discussed above under theory-building. In both paths, theorized mechanisms and empirical tests are treated more pragmatically as heuristic devices to understand important events.
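The notions of certainty and uniqueness used above when evaluating mechanistic evidence are sometimes given a Bayesian reading in the literature. The following sketch illustrates that reading only; whether probative value should be quantified at all is contested, and the mapping used here is an assumption of this example rather than the chapter's own formalization: certainty is read as the probability of finding the evidence if the mechanism is present, and uniqueness as a low probability of finding it otherwise. The function name and all numbers are hypothetical.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' rule for one piece of found evidence E and a hypothesis H.

    prior            -- confidence in H before looking at the evidence
    p_e_given_h      -- chance of finding E if H holds ("certainty", assumed reading)
    p_e_given_not_h  -- chance of finding E if H does not hold (low = "unique")
    """
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)


# Hypothetical numbers, chosen only to show the contrast.
unique_evidence = posterior(prior=0.5, p_e_given_h=0.9, p_e_given_not_h=0.1)
non_unique_evidence = posterior(prior=0.5, p_e_given_h=0.9, p_e_given_not_h=0.9)

print(round(unique_evidence, 3), round(non_unique_evidence, 3))
```

Under these assumptions, finding evidence that is just as likely with or without the mechanism leaves confidence unchanged, which is one way of seeing why uniqueness is what matters once the material has already been found.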
Conclusion
This chapter described the three core elements of process tracing as a case-based social science method. Process tracing methods have developed considerably in the past two decades, but much work remains to be done. There is still considerable debate about how we should select cases and generalize about mechanisms (e.g. Mikkelsen, 2017; Beach and Pedersen, 2018; Rohlfing and Schneider, 2018). There is also considerable debate about whether the probative value of mechanistic evidence (‘causal process observations’) should be quantified or not (Humphreys and Jacobs, 2015; Fairfield and Charman, 2017; Beach and Pedersen, 2019). There is even debate about what we actually are tracing (e.g. Runhardt, 2015; Beach and Pedersen, 2019). Given these disagreements, it is vital to make clear exactly what interpretation of process tracing one is using and why. As with everything in life, different methodological choices involve tradeoffs.
References
Beach, Derek, and Rasmus Brun Pedersen. 2018. Selecting Appropriate Cases When Tracing Causal Mechanisms. Sociological Methods & Research 47(4): 837–871.
Beach, Derek, and Rasmus Brun Pedersen. 2019. Process-Tracing Methods – Foundations and Guidelines. 2nd edition. Ann Arbor: University of Michigan Press.
Bennett, Andrew. 2014. Appendix: Disciplining our conjectures: Systematizing process tracing with Bayesian analysis. In Process Tracing: From Metaphor to Analytic Tool, edited by Andrew Bennett and Jeffrey T. Checkel. Cambridge: Cambridge University Press, pp. 276–298.
Bennett, Andrew, and Jeffrey T. Checkel. 2014. Process Tracing: From Metaphor to Analytic Tool. Cambridge: Cambridge University Press.
Berg-Schlosser, Dirk. 2012. Mixed Methods in Comparative Politics: Principles and Applications. Houndmills: Palgrave Macmillan.
Bhaskar, Roy. 1978. A Realist Theory of Science. Brighton: Harvester Press.
Bogen, Jim. 2005. Regularities and Causality; Generalizations and Causal Explanations. Studies in History and Philosophy of Biological and Biomedical Sciences 36(2): 397–420.
Brady, Henry E., and David Collier, eds. 2011. Rethinking Social Inquiry: Diverse Tools, Shared Standards. 2nd edition. Lanham, MD: Rowman & Littlefield.
Cartwright, Nancy. 2007. Hunting Causes and Using Them: Approaches in Philosophy and Economics. Cambridge: Cambridge University Press.
Craver, Carl F., and Lindley Darden. 2013. In Search of Mechanisms: Discoveries across the Life Sciences. Chicago: University of Chicago Press.
Fairfield, Tasha, and Andrew E. Charman. 2017. Explicit Bayesian Analysis for Process Tracing: Guidelines, Opportunities, and Caveats. Political Analysis 25(3): 363–380.
Falleti, Tulia G., and Julia F. Lynch. 2009. Context and Causal Mechanisms in Political Analysis. Comparative Political Studies 42(9): 1143–1166.
Gerring, John, and Jason Seawright. 2007. Techniques for Choosing Cases. In Case Study Research: Principles and Practices, edited by John Gerring. Cambridge: Cambridge University Press, pp. 86–150.
Goertz, Gary, and James Mahoney. 2009. Scope in Case-Study Research. In The Sage Handbook of Case-Based Methods, edited by David Byrne and Charles C. Ragin. Thousand Oaks, CA: Sage, pp. 307–317.
Goertz, Gary, and James Mahoney. 2012. A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences. Princeton: Princeton University Press.
Groff, Ruth. 2011. Getting Past Hume in the Philosophy of Social Science. In Causality in the Sciences, edited by Phyllis McKay Illari, Federica Russo and Jon Williamson. Oxford: Oxford University Press, pp. 296–316.
Humphreys, Macartan, and Alan M. Jacobs. 2015. Mixing Methods: A Bayesian Approach. American Political Science Review 109(4): 653–673.
Illari, Phyllis McKay. 2011. Mechanistic Evidence: Disambiguating the Russo-Williamson Thesis. International Studies in the Philosophy of Science 25(2): 139–157.
Illari, Phyllis McKay, and Jon Williamson. 2011. Mechanisms are Real and Local. In Causality in the Sciences, edited by Phyllis McKay Illari, Federica Russo and Jon Williamson. Oxford: Oxford University Press, pp. 818–844.
Machamer, Peter. 2004. Activities and Causation: The Metaphysics and Epistemology of Mechanisms. International Studies in the Philosophy of Science 18(1): 27–39.
Machamer, Peter, Lindley Darden, and Carl F. Craver. 2000. Thinking about Mechanisms. Philosophy of Science 67(1): 1–25.
Mahoney, James. 2008. Toward a Unified Theory of Causality. Comparative Political Studies 41(4–5): 412–436.
Mikkelsen, Kim S. 2017. Fuzzy-Set Case Studies. Sociological Methods & Research 46(3): 422–455.
Ragin, Charles C. 2000. Fuzzy-Set Social Science. Chicago: University of Chicago Press.
Rohlfing, Ingo, and Carsten Q. Schneider. 2018. A Unifying Framework for Causal Analysis in Set-Theoretic Multimethod Research. Sociological Methods & Research 47(1): 37–63.
Runhardt, Rosa W. 2015. Evidence for Causal Mechanisms in Social Science: Recommendations from Woodward’s Manipulability Theory of Causation. Philosophy of Science 82(5): 1296–1307.
Russo, Federica, and Jon Williamson. 2007. Interpreting Causality in the Health Sciences. International Studies in the Philosophy of Science 21(2): 157–170.
Samford, Steven. 2010. Averting ‘Disruption and Reversal’: Reassessing the Logic of Rapid Trade Reform in Latin America. Politics & Society 38(3): 373–407.
Schneider, Carsten Q., and Ingo Rohlfing. 2013. Combining QCA and Process Tracing in Set-Theoretic Multi-Method Research. Sociological Methods & Research 42(4): 559–597.
Schneider, Carsten Q., and Ingo Rohlfing. 2016. Case Studies Nested in Fuzzy-Set QCA on Sufficiency: Formalizing Case Selection and Causal Inference. Sociological Methods & Research 45(3): 526–568.
Steel, Daniel. 2008. Across the Boundaries: Extrapolation in Biology and Social Science. Oxford: Oxford University Press.
Tannenwald, Nina. 1999. The Nuclear Taboo: The United States and the Normative Basis of Nuclear Non-Use. International Organization 53(3): 433–468.
Waskan, Jonathan. 2011. Mechanistic Explanation at the Limit. Synthese 183(3): 389–408.
White, Howard. 2009. Theory-Based Impact Evaluation: Principles and Practice. International Initiative for Impact Evaluation Working Paper 3. New Delhi: India.
18 Causation Michael Baumgartner1
A History of Disagreement
Causation is one of the most basic concepts regulating our interaction with the world. It is omnipresent both in science and in every-day life. Correspondingly, it is among the oldest topics in Western philosophical and scientific theorizing. Aristotle, Aquinas, Descartes, Hobbes, Galileo, Spinoza, Leibniz, Locke, Newton, Hume, Kant, Mill, Reichenbach – just to name some of the most prominent figures – have devoted important parts of their work to this topic. A multitude of different theories of causation, many of which are incompatible, have been proposed over the centuries. Even to this day, conflicting theories continue to co-exist – ultimately because they are embedded in, and draw their justification from, incompatible background metaphysics and ontologies, which are notoriously difficult to reconcile (see also Moses, Chapter 27, this Handbook). In consequence, despite in-depth and centuries-old investigations into the topic, no consensus has been reached on the most
fundamental question: what is causation? Is it an objective feature of our world or is it something we, as observers, project onto the world? Is it something that actually governs what occurs around us or is it a concept that merely facilitates theorizing about those occurrences? Is it a matter of the instantiation of regularities or laws, or of counterfactual dependence, or of probability-raising, or of manipulability, or of mechanisms? Does it only obtain between occurrences in space and time or does it also obtain between absences of such occurrences? And more specifically, is it a transitive relation or not; is it a deterministic relation or not?
It is beyond the scope of this chapter to review the abundance of different answers that have been given to these questions over time. Instead, I focus on establishing the importance of developing explicit theories of causation that render transparent the understanding of causation presupposed in a given research context. Moreover, the chapter provides systematic introductions to those theories with the most conceptual and methodological impact for political science: the regularity theory, the probabilistic theory, the counterfactual theory, the interventionist theory, and the mechanistic theory. Where appropriate, the historical roots of these theories will be discussed, but my focus will be on their modern versions. An overview of the history of theorizing about causation is provided in Beebee et al. (2009: part I). The chapter ends with an outlook identifying methodological frameworks suitable for uncovering causation as defined by the discussed theories.
Purpose of a Theory of Causation
Despite the omnipresence of causation in every-day life and science, it is far from pretheoretically (i.e. intuitively) clear under what conditions a causal relation obtains. Even in the most commonplace of scenarios it easily happens that our causal judgements are unclear, unstable, and inconsistent. To illustrate, consider Case 1:
Case 1. Angelo lives in Rome and goes on vacation. His neighbor agrees to water Angelo’s plants, but repeatedly forgets to do so. When Angelo returns two weeks later, his plants are dead.
When asked to identify the causes of the plants’ death, most people will, presumably, point to the neighbor’s negligence and contend that the ultimate cause of death was insufficient water supply. Of course, insufficient water supply did not only result from the neighbor’s negligence but also from the fact that everybody else, including, say, the Pope did not water regularly. Yet, even though it is uncontroversial that, had the Pope watered the plants, they would have survived, many people will deny that the Pope’s inaction is another cause of death. How are these discrepancies in the ascription of causal efficacy justifiable? One important difference between the neighbor and the Pope is that only the former has agreed to tend the plants. Yet, while the
violation of the agreement is a reason to hold the neighbor and not the Pope morally accountable, it does not seem to be a reason to ascribe causal efficacy to the neighbor’s but not the Pope’s inaction. After all, either one could have secured sufficient water supply; and causation is a relation that obtains in the world independently of judgements of moral accountability – or so it seems. But then, either the neighbor’s and the Pope’s (and everybody else’s) inactivity caused the plants to die or neither of them did. The list of unclear candidate causes connected to Case 1 can easily be extended. Is Angelo’s failure to give the plants to his (reliable) mother prior to departure also a cause of their death? Or what about Angelo’s purchase of the plants; or Angelo’s birth? There is a case to be made that, had he not been born, he would not have bought the plants, which, subsequently, would have been bought by some plant enthusiast. I leave it to the reader to contemplate possible answers to these questions. What matters for our purposes is that, even though Case 1 describes a most commonplace scenario, causally analyzing it in a consistent manner is far from trivial. And the difficulties are not due to lacking empirical evidence. We can safely assume that we have enough evidence on the neighbor’s, the Pope’s, and the mother’s watering behaviors and on possible alternative plant buyers to determine exactly what would have happened in relevant contrast scenarios. That does not help us to decide, for example, whether we should ascribe causal relevance to inaction, or whether we should take causation to be a transitive relation and count Angelo’s birth among the causes. These, and many others, are conceptual decisions that need to be taken before the question of what the available evidence entails for causal relations even arises. 
It is the purpose of a theory of causation to take those conceptual decisions, render them transparent in an explicit definition of causation and, thereby, provide a framework against the background of which empirical research into thus defined causation becomes possible.
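The conceptual point of Case 1 can be made concrete with a toy model. The sketch below is only an illustration under strong simplifying assumptions: it assumes the plants die if, and only if, nobody waters them, and it uses a simple but-for (counterfactual) test that is not offered here as the chapter's definition of causation. All names are hypothetical. What it shows is that the evidence alone treats the neighbor and the Pope alike; any asymmetry has to come from a prior conceptual decision.

```python
def plants_die(watering: dict) -> bool:
    """Toy structural assumption: the plants die iff nobody waters them."""
    return not any(watering.values())


def depends_counterfactually(actual: dict, agent: str) -> bool:
    """But-for test: would the outcome change had `agent` acted differently?"""
    flipped = dict(actual)
    flipped[agent] = not actual[agent]
    return plants_die(actual) != plants_die(flipped)


# Actual scenario of Case 1: nobody watered the plants.
actual = {"neighbor": False, "pope": False, "mother": False}

verdicts = {agent: depends_counterfactually(actual, agent) for agent in actual}
print(verdicts)
```

Every omission passes the test equally, so a theory that wants to single out the neighbor must add something beyond counterfactual dependence, and a theory that rejects causation by absence must exclude all three.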
A theory of causation provides necessary and sufficient conditions for a dependency to be of causal nature, or differently, it specifies the truth conditions of claims of the form ‘x causes y’. This is accomplished by replacing P with suitable analysis conditions in schema (Φ), where ‘iff’ is short for ‘if, and only if’:
x causes y iff P.    (Φ)
Causation is not a technical term of art that can be stipulatively defined in any way we please, rather, it is widely used in every-day and scientific language. A theory that wants to be taken seriously must take that existing usage into account and aim for a P that reproduces standardly accepted pre-theoretic causal judgements. However, as these judgements tend to be inconsistent, no (consistent) theory can possibly reproduce all of them. At best, a theory can do justice to a maximally large consistent proper subset of pre-theoretic causal judgements. The selection of these subsets can vary from theory to theory and there is no fact of the matter as to what is the true selection. Rather, the selection must be justified based on considerations concerning the purposes a particular theory is intended to serve. In that light, the position of causal pluralism, that is, the view that different theories of causation do not contradict one another but capture different concepts or variants of causation, has seen a rise in popularity in recent years (e.g. Psillos, 2010). Correspondingly, which theory to choose in a given research context depends on the exigencies of that context, which are determined, for instance, by the investigated research questions or by the nature of the available data.
Dimensions of Variance between Different Theories
Theories of causation differ along various dimensions. This section discusses the most important ones.
Reductionism vs. Non-Reductionism
All theorizing about the world – in any domain – needs to start with some conceptual inventory that is presupposed as being clear. The concepts in that inventory are called fundamental. The most general divide between different theories of causation concerns the question of whether the concept of causation should be considered one of those fundamental ones or not. A theory according to which causation is fundamental holds that it is impossible to define causation without recourse to some ‘causally loaded’ concept or other, meaning that it is impossible to substitute P in schema (Φ) by conditions that are entirely free of causal connotations. Such a theory is called non-reductionist, for it contends that it is impossible to reduce causation to non-causation. By contrast, a reductionist theory maintains that causation is not fundamental, that is, causation can be defined in terms of entirely non-causal concepts – it can be reduced to non-causation. Hence, a reductionist theory substitutes P by conditions without any causal connotations whatsoever. Overall, the divide between reductionism and non-reductionism partitions the currently available theories of causation into two roughly equal halves.
The Ontology of Causation
The second dimension of variance concerns the ontology of causation, that is, the question of what types of entities can stand in causal relations. The answers to that question are influenced by two conflicting intuitions. On the one hand, it seems that causes and effects are entities that occur in time and space. For example, elections are causes of policy changes or peasant revolts are causes of social revolutions. Causes and their effects are items we can point to. On the other hand, absences and omissions are often causally interpreted as well, even though their
characteristic feature is that they do not occur in time and space. For instance, the absence of water causes plants to die or the absence of political participation causes popular anger. Corresponding to these two intuitions, there are two main candidate ontologies of causation: according to the first, causes and effects are events (e.g. Davidson, 1967), according to the second they are facts (e.g. Mellor, 1995). There exist many theories of both events and facts, which cannot be reviewed here (cf. Casati and Varzi, 2015). What matters for our purposes is that events and facts constitute categorically different entities: events are spatiotemporally located concrete entities, facts are non-spatiotemporally located abstract entities. Obama’s election as the first African-American president is an event that occurred on November 4, 2008, in the United States, whereas the fact that Obama was elected as the first African-American president is not located in the year 2008 in the United States, rather it is a fact in contemporary Europe just as it will be a fact in Australia 100 years from now. The categorical difference between events and facts yields that event and fact ontologies cannot be combined in one theory of causation. Either a theory takes causes and effects to be events or to be facts (not both). Yet, neither ontology is clearly preferable; there are persuasive arguments for and against both ontologies (cf. Ehring, 2009). For that reason it has become customary to bracket the question as to the ontology of causation by remaining as non-committal as possible with regard to the nature of causes and effects. This is accomplished by referring to causes and effects simply as variables, factors, or values of variables/factors. Random variables and factors are flexible modeling devices that can be used to represent any types of entities. 
On the upside, theories of causation that are formulated in terms of variables or factors can easily be adapted for different ontologies; on the downside, what such theories have to say about causation
becomes relative to the choice of variables/factors, which introduces a subjective element into causal analyses (cf. Halpern and Hitchcock, 2010).
General vs. Singular Causation
Two kinds of causation must be distinguished: one on the type level, general or type causation (often also called causal relevance), and one on the token level, singular or token causation (often also called actual causation). An example of the first kind is ‘printing money causes inflation’; an example of the second is ‘Turkey’s money printing in 2018 causes Turkey’s inflation in 2018’. General causation relates types (of events or facts) that can be instantiated repeatedly and on various occasions by corresponding tokens, which themselves are related in terms of singular causation. General causation is the kind of causation that is traced in scientific theory-building and causal modeling and that figures in scientific laws. Knowledge of general causation is needed for prediction. Singular causation, by contrast, is traced in the course of explaining some event or fact of interest. Knowledge of singular causation is needed for retrodiction. Plainly, general and singular causation are not independent. If some type, A, is causally relevant to another type, B, there exists a token α of type A (in the right circumstances) that is a singular cause of a token β of type B; and if a token, α, is a singular cause of another token, β, there exists a type-level structure featuring the corresponding type A as the general cause of B. Accordingly, theories of general and singular causation are not independent either. The standard approach in theory development, therefore, is to first define either general or singular causation – as the so-called primary analysandum – and to then spell out the other kind of causation in terms of the primary analysandum. As we shall see below, some theories take general causation to be primary, others opt for singular causation.
Relational Properties of Causation
Causation, both on type and on token level, is usually analyzed as a relation ‘… is causally relevant to …’ or ‘… is a singular cause of …’, where the dots are filled by the corresponding causes and effects. Every relation can be characterized by the following relational properties: symmetry, reflexivity, and transitivity. To properly understand a relation, clarifying its relational properties is crucial. Some relational properties of causation are uncontroversial. For instance, general and singular causation are neither reflexive nor symmetric. That is, a cause is normally neither caused by its own effects nor by itself. Standardly, it is even assumed that, on the token level, there are no causal feedbacks at all. Token causes and effects are distinct entities such that the former temporally precede the latter. The tax raise at time t1 causes firms to lay off people at a later time t2, but at t2 the tax raise has already happened, such that the lay-offs cannot cause the initial tax raise. However, feedbacks may occur on the type level. Tax raises may be causally relevant to lay-offs, which cause the need to generate more revenue for social security, which, in turn, is causally relevant to further tax raises. That is, on the type level it is possible that a cause, via a sequence of intermediate links, is causally relevant to itself, thereby closing a causal cycle.
Concerning the third relational property, transitivity, matters are much less clear. While it certainly often holds that a cause, A, not only brings about its direct effect, B, but also (indirectly) causes B’s effects, the question as to the transitivity of causation is whether causal influence is always propagated along causal chains. Does it hold generally that, if A is a cause of B and B is a cause of C, then A is also a cause of C? Here again, there are conflicting intuitions. 
On the one hand, more often than not, our objectives when interacting with the world cannot be caused directly by one suitable intervention; rather, our ultimate objectives are typically far remote from
what we can cause directly. This holds in particular in politics. If our objective is, say, to reduce the unemployment rate, all we can do is to bring certain monetary, fiscal, or educational policy changes under way, which, via many intermediate links, may ultimately reduce unemployment. We would not induce these policy changes (with often unwanted side effects) if we were not convinced that they will cause the desired objective. Hence, much of our interaction with the world is driven by the assumption that we can rely on our actions not only having direct effects but also far remote indirect ones – which amounts to the assumption that causation is transitive. Yet, in the face of certain concrete examples, the intuition that causation is transitive becomes shaky. To illustrate, consider Case 2:
Case 2. Walter is a candidate in a presidential race. But his poll numbers are going down. In a nationally televised debate, he intensifies his populist campaign pledges. In the next poll, his numbers are going up again.
Walter fuels his populism as a reaction to his sinking polls, and populism tends to be conducive to rising polls. That is, there is a causal chain from sinking polls to intensified populism and on to rising polls. If causation is transitive, it follows that sinking polls cause rising polls, which seems highly counterintuitive. Some authors argue that transitivity can be maintained even in light of examples like Case 2 (e.g. Lewis, 2000); others contend that such examples prove that intuitions as to the transitivity of causation are misguided (e.g. Hitchcock, 2001). Correspondingly, some theories of causation provide transitive notions of causation, while others define causation in such a way that transitivity does not hold.
Realism vs. Anti-Realism Another dimension of variance between theories of causation is their stance on the question whether the causal relation is a real
constituent of the world. While it is beyond doubt that causes, effects, and their behavior patterns exist, it is a matter of controversy whether there additionally exists a causal bond connecting them and governing their behavior. The position of causal realism maintains that not only the entities related by the causal relation exist in the world, but also the relation itself; causes and effects are connected by a real causal bond (e.g. Tooley, 1987). By contrast, the position of causal anti-realism contends that only the causally related entities and their behavior patterns exist, not the causal relation itself; there are no causal bonds (Hume, 1748). Certain theories of causation endorse causal realism, others subscribe to causal anti-realism. According to an anti-realist theory, causation boils down to the regularities, the correlations, or other sorts of dependencies obtaining between the behaviors of causes and effects. To be causally related is nothing over and above behaving in a certain (yet to be specified) manner. According to a realist theory, causation is more than a certain behavior pattern. To be causally related means to be tied together by a causal bond. The difference between realism and anti-realism is not just of philosophical interest. It has a direct bearing on what counts as empirical evidence for causation. To establish causation, the anti-realist only has to demonstrate the existence of the required behavior pattern in the data – and behavior patterns are exactly what is contained in ordinary data. By contrast, the realist has to furnish evidence for the existence of the relevant type of causal bond. But contrary to behavior patterns, such bonds are not directly visible or measurable, meaning that ordinary data will not contain direct evidence on causal bonds. Instead, the existence of bonds must be indirectly inferred from the (behavioral) information contained in data.
In light of the fact that causation as defined by an anti-realist theory is more directly accessible in data, the theories with the most relevance in empirical science are of the
anti-realist type. The only theory with realist leanings resorted to in political science is the mechanist theory.
Production vs. Difference-Making The contrast between realist and anti-realist theories closely aligns with the contrast between so-called production and difference-making theories. On the one hand, a production theory contends that the characteristic feature of causes is their capacity or disposition to produce the effect, where production is taken to be a fundamental (i.e. non-reducible) causal notion expressing the physical bringing about of an effect, say, via the transfer of energy or momentum (e.g. Dowe, 2000). On the other hand, a difference-making theory maintains that the characteristic feature of causes is that they are difference-makers of their effects, which is to be understood in terms of the non-causal notion of association: A is a difference-maker of B iff there (possibly) exist homogeneous scenarios in which a difference in A is associated with a difference in B (e.g. Woodward, 2003). Production theories tend to subscribe to realism, difference-making theories to anti-realism. Correspondingly, causation as defined in difference-making terms can be more easily traced in data. While difference-making relations can be investigated on both coarse-grained macro and fine-grained micro levels, tracing production relations requires zooming in on the micro level in order to determine what is happening between causes and effects.
Determinism Until the development of quantum mechanics, causation was generally believed to satisfy the principle of determinism: whenever the same types of causes occur, the same types of effects occur as well – in slogan form, ‘same causes, same effects’. Of course
it had always been recognized that determinism is often not apparent in data. For example, although regular watering causes plant growth, some regularly watered plants die nonetheless. However, such indeterminacies were taken to be a result of insufficient control over background influences generating noise, meaning that scenarios with seemingly alike causes but different effects do not actually feature the exact same causes. Our world is of such enormous causal complexity that it is often impossible to ensure that nothing interferes with a cause or that all background conditions necessary for a cause to be efficacious are instantiated. But other things being equal, ceteris paribus, causation is a deterministic dependence relation. By contrast, various so-called no-hidden-variables theorems in quantum mechanics suggest that indeterminacies in data are not always due to noise but – at least on the level of fundamental particles – a result of the inherently indeterministic nature of the physical processes themselves (Albert, 1992). Hence, according to the standard interpretation of the mathematical machinery of quantum mechanics, the principle of determinism is false. Yet, even though quantum mechanics is one of the most successful scientific theories currently available, the indeterministic interpretation of its mathematical formalism has failed to convince many friends of determinism that the principle of determinism must be abandoned – for two main reasons. First, there also exist deterministic interpretations of the quantum mechanical machinery, e.g. Bohm’s interpretation (Albert, 1992: chapter 7). Second, even if it should turn out that there exist inherently indeterministic processes on the fundamental level, there are many open questions with respect to the causal interpretability of these processes (e.g. Healey, 2009).
Therefore, quantum mechanics notwithstanding, many current theories of causation either explicitly endorse the principle of determinism or remain non-committal as to its validity. The possibility of fundamental indeterminism induced by quantum
mechanics has been the main motivation behind the development of probabilistic theories of causation.
Main Theories This section reviews the theories of causation with the highest relevance for social and political science.
Regularity Theory So-called regularity theories of causation have the longest tradition among theories that continue to be developed today. They date back to Hume (1748). The most influential modern regularity theory is due to Mackie (1974) – with refinements by Graßhoff and May (2001) and Baumgartner (2013). Concerning the theoretical dimensions of variance, regularity theories make the following analytical choices. Their primary analysandum is general causation, which they reductively define in terms of redundancy-free regularities obtaining among factors taking on specific values, such as ‘whenever factor A takes value i (A=i), factor B takes value j (B=j)’. Moreover, they subscribe to anti-realism and assume that causation is a deterministic form of dependence that is not transitive. Finally, they contend that the characteristic feature of causes is that they are difference-makers of their effects. Factors analyzed by regularity theories can either be crisp-set, taking two possible values 0 and 1, fuzzy-set, taking continuous values from the unit interval [0,1], or multi-value, taking an open (but finite) number of non-negative integers as possible values. For simplicity of exposition, I focus on crisp-set factors here, which allows for conveniently abbreviating the explicit ‘Factor = value’ notation. As is conventional in Boolean algebra, I use ‘A’ as shorthand for A=1 and ‘a’ for A=0. Modern regularity theories moreover borrow
much of the formal machinery from Boolean algebra, in particular, the operations of conjunction, A*B (expressing ‘A=1 and B=1’), disjunction, A + B (‘A=1 or B=1’), implication, A → B (‘if A=1, then B=1’), and equivalence A ↔ B (‘A=1 if, and only if, B=1’) (see also Wagemann, Chapter 20, this Handbook). The implication operator allows for formally expressing regularities, more specifically, it allows for defining the notions of sufficiency and necessity, which are the two core Boolean dependencies exploited by regularity theories: A is sufficient for B iff A → B (i.e. whenever A is given, B is given); A is necessary for B iff B → A (i.e. whenever B is given, A is given). Many of these Boolean dependencies, however, have nothing to do with causation. For example, the sinking of a (properly functioning) barometer is sufficient for weather changes but it does not cause the weather; or whenever there is an election, votes are cast; so casting votes is necessary for the election – but it does not cause the election. Still, some Boolean dependencies are in fact due to underlying causal dependencies: rainfall is sufficient for wet streets and also a cause thereof, or winning an election is necessary for being sworn into office and also a cause thereof. That means the crucial problem to be solved by a regularity theory is to filter out those Boolean dependencies that are due to underlying causal dependencies and are, hence, amenable to a causal interpretation. The main reason why most structures of Boolean dependencies do not reflect causation is that they tend to contain redundancies, whereas structures of causal dependencies do not feature redundant elements. Every part of a causal structure makes a difference to the behavior of that structure in at least one context. Accordingly, to filter out the causally interpretable Boolean dependencies, they need to be freed of redundancies. 
Only those elements of sufficient and necessary conditions can be causally relevant which are indispensable to account for a scrutinized outcome in at least one
context. Or in Mackie’s words, causes are at least INUS conditions, viz. Insufficient but Non-redundant parts of Unnecessary but Sufficient conditions (1974: 62). Whatever can be removed from sufficient and necessary conditions without affecting their sufficiency and necessity is not a difference-maker and, hence, not a cause. The causally interesting sufficient and necessary conditions are minimal in the sense that they do not contain sufficient and necessary proper parts, respectively. Minimally sufficient and minimally necessary conditions can be combined in so-called minimal theories, which constitute the heart of contemporary regularity theories: a minimal theory of an outcome B is a minimally necessary disjunction of minimally sufficient conditions of B.1 An example might be:
A*e + C*d ↔ B    (1)
(1) being a minimal theory of B entails that A*e and C*d, but neither A, e, C, nor d alone, are sufficient for B, and that A*e + C*d, but neither A*e nor C*d alone, is necessary for B. Minimal theories directly mirror the complexity of causes (conjunctural causation) – causes do not bring about their effects in isolation but only in conjunction with other causes – as well as the principle of equifinality – outcomes can be caused along various alternative paths. Moreover, although minimal theories such as (1) have a main operator (‘↔’) that is symmetric, to the effect that both sides of (1) are mutually sufficient and necessary for one another, the fact that (1) features two minimally sufficient conditions of B yields that it nonetheless identifies a direction of determination: both A*e and C*d determine B, but B determines neither A*e nor C*d. Hence, A*e and C*d are the (alternative) causes of B, and not vice versa. Still, to define causal relevance by means of minimal theories, an additional constraint is needed, for not all minimal theories faithfully reflect causation. The reason is that complete redundancy elimination is relative
to the set of analyzed factors F, meaning that factor values contained in minimal theories relative to some F may fail to be part of a minimal theory relative to a superset of F (Baumgartner, 2013). In other words, by expanding factor sets, factor values that appear to be non-redundant to account for an outcome can turn out to be redundant after all. Therefore, only factor values that are not rendered redundant by expanding factor sets are causally relevant. These considerations yield the following substitution instance of schema (Φ):
(R) A is a type-level cause of B iff A is part of a minimal theory of B relative to a factor set F and remains part of a minimal theory of B across all expansions of F.
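The minimality conditions behind (1) can be checked mechanically for a fixed factor set. The following is a minimal sketch in Python, assuming idealised crisp-set data in which every configuration of A, E, C and D occurs and B is generated exactly as in (1). The function names and the data are illustrative assumptions, and the sketch does not test the further requirement in (R) that a factor value remain non-redundant across all expansions of F.

```python
from itertools import product

# Idealised data (an assumption): every configuration of the four factors,
# with the outcome B generated by A*e + C*d as in (1).
cases = []
for a, e, c, d in product([0, 1], repeat=4):
    b = int((a == 1 and e == 0) or (c == 1 and d == 0))
    cases.append({"A": a, "E": e, "C": c, "D": d, "B": b})

def holds(conj, case):
    """True if the case instantiates every factor value in the conjunction."""
    return all(case[f] == v for f, v in conj.items())

def sufficient(conj, outcome="B"):
    """Whenever the conjunction is given, the outcome is given."""
    matching = [c for c in cases if holds(conj, c)]
    return bool(matching) and all(c[outcome] == 1 for c in matching)

def minimally_sufficient(conj):
    """Sufficient, and dropping any single conjunct destroys sufficiency."""
    if not sufficient(conj):
        return False
    for f in conj:
        rest = {g: v for g, v in conj.items() if g != f}
        if rest and sufficient(rest):
            return False
    return True

def necessary(disj, outcome="B"):
    """Whenever the outcome is given, at least one disjunct is given."""
    return all(any(holds(conj, c) for conj in disj)
               for c in cases if c[outcome] == 1)

def minimally_necessary(disj):
    """Necessary, and dropping any single disjunct destroys necessity."""
    if not necessary(disj):
        return False
    return not any(necessary(disj[:i] + disj[i + 1:]) for i in range(len(disj)))

theory = [{"A": 1, "E": 0}, {"C": 1, "D": 0}]          # A*e + C*d
is_minimal_theory = (all(minimally_sufficient(c) for c in theory)
                     and minimally_necessary(theory))
print(is_minimal_theory)                                 # True
print(minimally_sufficient({"A": 1, "E": 0, "C": 1}))    # False: C is redundant
```

Checking only single-conjunct removals is enough here because, in non-empty data, any sufficient proper part of a conjunction makes some one-conjunct-shorter part sufficient as well.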
The main problem of (R) is that, since Boolean dependencies expressed in minimal theories are inherently deterministic, (R) incorporates determinism at its very core. But it may turn out that some of the indeterminism typically encountered in data is not due to noise but to the fact that some causal relations are inherently indeterministic. Hence, (R) is a viable theory only for contexts where causation can safely be assumed to be deterministic. While that assumption is dubious for certain micro-level areas explored by physics, it seems innocuous for macro-level areas investigated in social and political sciences.
Probabilistic Theory The restriction of regularity theories to deterministic contexts has prompted probabilistic theories to abandon the assumption that causation is deterministic. Still, like regularity theories, probabilistic accounts are difference-making theories that are committed to causal anti-realism, and they do not entail that causation is transitive. Apart from that, there is much variance within the probabilistic framework. Some probabilistic theories take general causation to be the primary
analysandum (Eells, 1991), others take singular causation to be primary (Glynn, 2011), still others contend that both general and singular causation can be analyzed in one go (Suppes, 1970). There are reductionist accounts (Suppes, 1970; Glynn, 2011), and non-reductionist ones (Eells, 1991). In what follows, I focus on theories that primarily analyze general causation and sketch the main ideas behind one reductionist and one non-reductionist account. Like regularity theories, probabilistic theories remain non-committal with respect to the ontology of causation by referring to causes and effects as variables or factors taking values. For simplicity, I continue to concentrate on the case of binary factors and to employ the Boolean shorthand notation introduced in the previous section. Probabilistic theories reject the regularity theoretic requirement that causes are parts of sufficient conditions of their effects. Instead, A can count as a cause of B if A merely raises the probability of B, where A is said to raise the probability of B iff the probability of B conditional on A is higher than the probability of B conditional on not-A, viz. a, or formally:
P(B | A) > P(B | a)    (2)
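Inequality (2) can be evaluated directly on frequency data. The sketch below, in Python, uses made-up joint counts for two binary factors and exact fractions; the counts and the helper name `cond_prob` are assumptions for illustration only. It also evaluates the converse comparison, which is relevant to the question of symmetry.

```python
from fractions import Fraction

# Hypothetical joint counts over two binary factors: (A, B) -> number of cases.
counts = {(1, 1): 30, (1, 0): 20, (0, 1): 10, (0, 0): 40}

def cond_prob(counts, target, given):
    """P(target | given); both are (index, value) pairs, index 0 = A, 1 = B."""
    t_idx, t_val = target
    g_idx, g_val = given
    given_total = sum(n for k, n in counts.items() if k[g_idx] == g_val)
    both = sum(n for k, n in counts.items()
               if k[g_idx] == g_val and k[t_idx] == t_val)
    return Fraction(both, given_total)

p_b_given_A = cond_prob(counts, (1, 1), (0, 1))   # P(B | A) = 30/50
p_b_given_a = cond_prob(counts, (1, 1), (0, 0))   # P(B | a) = 10/50
p_a_given_B = cond_prob(counts, (0, 1), (1, 1))   # P(A | B) = 30/40
p_a_given_b = cond_prob(counts, (0, 1), (1, 0))   # P(A | b) = 20/60

print(p_b_given_A > p_b_given_a)   # True: A raises the probability of B
print(p_a_given_B > p_a_given_b)   # True: B also raises the probability of A
```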
However, not all cases of probability-raising are also cases of causation, for several reasons. First, probability-raising is symmetric: if A raises the probability of B, then B also raises the probability of A. But causation is not symmetric. There are various ways to address that problem. For instance, in complex networks of probabilistic (in-)dependencies, so-called Bayesian networks, it is possible to infer the direction of causation from substructures featuring multiple independent paths to the same effect, so-called unshielded colliders (Spirtes et al., 2000: chapter 5). Another way to distinguish between causes and effects is via their temporal order: causes precede their effects. This can be formally captured by time indexing probabilistically related factor values, to the
effect that the probability-raiser At′ precedes the raised Bt:
P(Bt | At′) > P(Bt | at′), where t′ < t    (3)
Second, many cases of temporally ordered probability-raising in the vein of (3) are not due to a causal dependence between At′ and Bt but to a common cause of At′ and Bt. For example, the sinking of a barometer at t′ raises the probability of rain at a later time t without causing it. This probabilistic dependence is the result of an approaching low-pressure system at a time t′′ before t′ causing the barometer to sink on one path and the rain on another. Accordingly, conditional on the low-pressure system, a sinking barometer no longer raises the probability of rain; in other words, given that a low-pressure system is approaching, additional information about the behavior of a barometer has no bearing on the rain probability. More generally, the set of common causes, C, of two parallel effects, A and B, neutralizes or screens off the probabilistic dependence between A and B in the following sense (Reichenbach, 1956) (where ‘*’ again stands for conjunction):
P(B | A * C) = P(B | a * C)    (4)
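The barometer example can be reproduced numerically. The following Python sketch builds a joint distribution in which a common cause C (low-pressure system) influences both A (barometer sinks) and B (rain), with A and B independent given C. All probability values are invented for illustration, and time indices are omitted for brevity.

```python
from fractions import Fraction
from itertools import product

# Assumed parameters of a common-cause structure C -> A, C -> B.
p_C = Fraction(1, 4)
p_A_given = {1: Fraction(9, 10), 0: Fraction(1, 10)}   # P(A=1 | C=c)
p_B_given = {1: Fraction(4, 5), 0: Fraction(1, 10)}    # P(B=1 | C=c)

# Joint distribution over (a, b, c), with A and B independent given C.
joint = {}
for a, b, c in product([0, 1], repeat=3):
    pc = p_C if c else 1 - p_C
    pa = p_A_given[c] if a else 1 - p_A_given[c]
    pb = p_B_given[c] if b else 1 - p_B_given[c]
    joint[(a, b, c)] = pc * pa * pb

def prob(pred):
    """Probability of the set of (a, b, c) configurations satisfying pred."""
    return sum(p for world, p in joint.items() if pred(*world))

def cond(pred, given):
    """Conditional probability P(pred | given)."""
    return prob(lambda a, b, c: pred(a, b, c) and given(a, b, c)) / prob(given)

rain = lambda a, b, c: b == 1
# Unconditionally, the sinking barometer raises the probability of rain ...
raises = cond(rain, lambda a, b, c: a == 1) > cond(rain, lambda a, b, c: a == 0)
# ... but conditional on the common cause C the dependence vanishes, as in (4).
screened = (cond(rain, lambda a, b, c: a == 1 and c == 1)
            == cond(rain, lambda a, b, c: a == 0 and c == 1))
print(raises, screened)   # True True
```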
Relations of temporally ordered probability-raising only track causation if they are not screened off by antecedently instantiated factors. This idea is captured in the following reductionist theory (Suppes, 1970):
(P1) At′ is a type-level cause of Bt, where t′ < t, iff At′ raises the probability of Bt and there does not exist a set Ct″, where t″ < t′, that screens off the probabilistic dependence between At′ and Bt.
(P1) has two problems. First, not all causes raise the probability of their effects; some in fact lower it. To cite a classical example, wind gusts lower the probability that golfers make a hole-in-one. Nonetheless, it can happen that wind gusts deflect balls in such a way that they end up in the hole in one shot after all (e.g. by luckily bouncing off trees).
Being essential contributors to the trajectories of such balls, the wind gusts are causes of the holes-in-one, even though they lower their probability. So causes must not be required to be probability-raisers of their effects; rather, it suffices that a cause At′ is a probability-changer of its effect Bt in the following sense:
P(Bt | At′) ≠ P(Bt | at′), where t′ < t    (5)
The second problem of (P1) stems from the fact that probabilities are typically determined via relative frequencies in a studied population. Probabilistic dependencies inferred from relative frequencies in the whole population, however, may be reversed or neutralized in subpopulations, which is a very common phenomenon in statistics known as Simpson’s Paradox. To use another classical example, it can happen that, in the whole population of applicants to some university, X, the frequency of admission among male applicants is significantly higher than among female applicants, while in the subpopulations of X’s faculties, the admission rates among men and women are exactly equal (for illustrations see Eells, 1991, 62–80). Such frequency distributions could be the result of men more often applying to faculties that are easier to get into. But unequal ratios could reappear in even more fine-grained subpopulations; for instance, it could turn out that men are more frequently admitted than women to each of X’s departments. It is thus unclear whether being male should be regarded as a probability-changer of being admitted to university X or not. Many representatives of probabilistic theories of causation have taken such paradoxical frequency distributions to show that rigorous constraints must be imposed on populations suitable for inferring probability-changing relations, in particular, that such populations must be required to be homogeneous in causally relevant respects. More specifically, a probability-changing relation between At′ and Bt that tracks causation must obtain in a causal context, K, in which all causes of Bt
not on a causal path from At′ to Bt are constant. This leads to the following non-reductionist theory (Eells, 1991: 86, 106):
(P2) At′ is a type-level cause of Bt, where t′ < t, iff there exists a causal context K such that, in K, At′ changes the probability of Bt and there does not exist a set Ct″, where t″ < t′, that screens off the probabilistic dependence between At′ and Bt.
(P2) is non-reductionist because it does not define causal relevance in non-causal terms but in terms of causal contexts. Applying (P2) to concrete cases, say, in the course of identifying the causes of some outcome, B, hence, presupposes substantive prior causal knowledge about B’s causes. Overall, (P2) avoids the problems of (P1), but it does so at the price of abandoning the project of defining causation in non-causal terms.
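The admission example that motivates the homogeneity requirement in (P2) can be made concrete with a few numbers. The Python sketch below uses invented application counts for two faculties of a hypothetical university; it is meant only to show how a dependence in the whole population can vanish in subpopulations.

```python
from fractions import Fraction

# Hypothetical applications: (faculty, sex) -> (admitted, applicants).
data = {
    ("easy", "m"): (64, 80), ("easy", "f"): (16, 20),
    ("hard", "m"): (4, 20),  ("hard", "f"): (16, 80),
}

def rate(sex, faculty=None):
    """Admission frequency for one sex, overall or within one faculty."""
    rows = [v for (fac, s), v in data.items()
            if s == sex and (faculty is None or fac == faculty)]
    return Fraction(sum(a for a, _ in rows), sum(n for _, n in rows))

print(rate("m"), rate("f"))                   # 17/25 8/25
print(rate("m", "easy"), rate("f", "easy"))   # 4/5 4/5
print(rate("m", "hard"), rate("f", "hard"))   # 1/5 1/5
```

In these made-up data, men are admitted more often overall only because they apply more often to the faculty that is easier to get into; within each faculty the rates coincide.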
Counterfactual Theory A theoretical framework not prepared to give up reductionism is the so-called counterfactual one. Like regularity theories, counterfactual theories have their roots in suggestions by Hume (1748). Lewis (1973; 2000) developed Hume’s ideas into a full-blown theory. Further refining Lewis’ account is a field of ongoing research, incorporating various techniques from structural equation modeling (e.g. Halpern, 2016). As the technicality of these latest proposals is beyond the scope of this chapter, I will subsequently concentrate on Lewis’s original theory. It is a reductionist difference-making theory that subscribes to anti-realism and assumes causation to be deterministic. Contrary to the previously discussed theories, it stipulates that causation is transitive, it focuses on singular causation as primary analysandum, and it takes a determinate stance on the ontology of causation by assuming that causes and effects are spatiotemporally located events. In order not to confuse events with factors, I refer to events using Greek lower-case letters α, β, etc.
The main idea behind a counterfactual theory is to define singular causation between two occurring events, α and β, in terms of the truth of a counterfactual conditional of the form ‘had α not occurred, β would not have occurred’. Or more concretely, Turkey’s money printing in 2018 is a cause of Turkey’s inflation in 2018 if it is true that, had Turkey not printed money in 2018, there would have been no inflation in Turkey in 2018. Even though such counterfactual claims are very commonplace, it is difficult to state precisely under what conditions they are true. The main problem is that they refer to non-actual scenarios, meaning their truth cannot be determined by observing or conducting experiments. Their truth also does not depend on different scenarios of the same type that actually occurred (on different occasions). For example, that Turkey did not print money in 2014 and did not have an inflation in that year does not tell us what would have happened in 2018, had there been no money printing. To render the truth conditions of counterfactual statements precise, it is standard to draw on so-called possible world semantics, which was developed in modal logic to explicate the meanings of claims about possibility and necessity. A possible world, w, can be thought of as a (hypothetical) maximal state of affairs such that every state of affairs is either included in w or precluded by w. There is a possible world in which McCain (and not Obama) becomes president in 2008, Italy is in Africa (and not in Europe), Caesar (and not Armstrong) is the first man on the moon, Archduke Ferdinand is not assassinated but lives to be 80 years old, etc. Importantly, all possible worlds can be compared as to how similar or distant they are relative to one another, and there exists one distinguished possible world, the actual world, including all the states of affairs that obtain in the world we live in. 
Moreover, a possible world, w, is said to be an α-world if the event α occurs in w, and a non-α-world otherwise. Relative to that theoretical background, Lewis (1973) determines that the statement ‘had α not occurred, β would not have occurred’ (where α and β are events in the actual world)
is true iff some non-α-world where β does not occur is more similar to the actual world than any non-α-world where β occurs, or, put differently, iff it takes less of a departure from actuality to suppress α and β together than to just suppress α. If that counterfactual statement is true, α and β are said to be counterfactually dependent, and α is a cause of β. Before this account can be applied to identifying token-level causes, the conditions under which a non-α-world, w, counts as similar to the actual world need to be clarified. Such similarity, according to Lewis, does not require that w is governed by the same laws of nature as the actual world. The reason is that if world similarity required sameness of laws, α not occurring in w would presuppose that the causes of α are also absent from w, as well as their causes, and so forth. In that case, a multitude of states of affairs in w would be different from the actual world, meaning that the latter would be very dissimilar from w. Moreover, not only would effects counterfactually depend on their causes, but also causes on their effects, for if α causes β, the most similar non-β-world with the same laws would also be a non-α-world. To avoid these consequences, Lewis stipulates that a non-α-world, w, counts as similar to the actual world if all states of affairs in w coincide with the actual world up until the occurrence of α, at which moment a law of nature of the actual world is broken in w by a so-called divergence miracle suppressing α in w. If, and only if, after that miracle, β does not occur in w, α and β are counterfactually dependent. Counterfactual dependence is sufficient but not necessary for causation because counterfactual dependence is not transitive.
To ensure that causation is transitive, Lewis (1973) defines causation to be the transitive closure of counterfactual dependence:
(C) α is a token-level cause of β, where α and β are events in the actual world, iff there exists a sequence of events 〈α, σ1, …, σn, β〉, with n ≥ 0, such that each event in the sequence counterfactually depends on its predecessor.
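Once the relation of counterfactual dependence is taken as given, (C) amounts to a reachability check. The following Python sketch assumes the stepwise dependencies are supplied as input (the hard part, evaluating the counterfactuals themselves, is not modelled) and uses Case 2 as a toy example; the function name `is_cause` and the event labels are illustrative.

```python
from collections import deque

def is_cause(depends_on, alpha, beta):
    """(C): is there a chain alpha, s1, ..., sn, beta (n >= 0) in which each
    event counterfactually depends on its predecessor?

    depends_on maps an event to the set of events that counterfactually
    depend on it (hypothetical input; the dependencies are assumed given).
    """
    queue, seen = deque([alpha]), {alpha}
    while queue:
        event = queue.popleft()
        for nxt in depends_on.get(event, ()):
            if nxt == beta:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Case 2 as a chain of stepwise dependencies (assumed for illustration).
depends_on = {
    "sinking polls": {"intensified populism"},
    "intensified populism": {"rising polls"},
}
print(is_cause(depends_on, "sinking polls", "rising polls"))  # True
print(is_cause(depends_on, "rising polls", "sinking polls"))  # False
```

Under these assumed dependencies, (C) counts the sinking polls as a cause of the rising polls, which is exactly the verdict that critics of transitivity find counterintuitive.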
As indicated above, (C) is the object of ongoing refinement efforts because it is subject to various types of counterexamples, for instance, cases of overdetermination or preemption. To illustrate overdetermination, consider this case:
Case 3. President X issues an executive order. Two judges, independently of one another, overrule that order, which subsequently is suspended. (Each overruling is individually sufficient for the suspension.)
It seems (pre-theoretically) clear that the rulings of both judges cause the suspension of the order. However, there is no sequence of counterfactual dependencies from either of the rulings to the order’s suspension: had the first judge not overruled, the order would still have been suspended, due to the second judge’s ruling, and vice versa. That is, (C) erroneously entails that the suspension of the order is caused by neither overruling. While recent modifications of (C) can adequately handle many of these types of counterexamples, the more fundamental problem remains that possible worlds are not epistemically accessible to us. Determining what would have happened, had certain events been suppressed by a miracle is, to a large degree, a matter of speculation. On the face of it, anything might happen in worlds that feature miracles.
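The overdetermination problem in Case 3 can be illustrated with a toy structural model. The Python sketch below makes a simplifying assumption: instead of Lewis's similarity ordering over possible worlds, it suppresses one event and holds everything else fixed. The function and variable names are hypothetical.

```python
def suspended(judge1_overrules, judge2_overrules):
    """Structural assumption for Case 3: either ruling suffices."""
    return judge1_overrules or judge2_overrules

# What actually happens in Case 3: both judges overrule.
actual = {"judge1_overrules": True, "judge2_overrules": True}

def counterfactually_depends(outcome, actual, event):
    """The outcome occurs actually but not once `event` alone is suppressed."""
    altered = dict(actual, **{event: False})
    return outcome(**actual) and not outcome(**altered)

for judge in actual:
    print(judge, counterfactually_depends(suspended, actual, judge))
# judge1_overrules False
# judge2_overrules False
```

Since the model contains no intermediate events, there is also no chain of dependencies from either ruling to the suspension, so neither ruling qualifies as a cause under (C).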
Interventionist Theory The so-called interventionist framework provides another non-reductionist approach to defining causation. Interventionist theories have a lot of intuitive appeal and enjoy growing popularity. They tie the notion of causation directly to the way we commonly discover causation: by suitably manipulating factors of interest. Causes are those factors that can be manipulated in such a way that other factors, the effects, vary as well. While some theories in that framework anchor causation in human agency (e.g. Menzies and
Price, 1993) and, as a result, are subject to anthropocentricity worries, the most compelling interventionist theories define causation in terms of a technical notion of an intervention that is independent of human agency (e.g. Woodward, 2003; Pearl, 2009). This section focuses on the currently most widely used interventionist theory, which is due to Woodward (2003). Woodward chooses general causation as primary analysandum, develops an anti-realist difference-making theory, and assumes neither that causation is deterministic nor that it is transitive. He also remains non-committal with respect to the ontology of causation by referring to causes and effects as variables or factors – in contrast to, say, regularity theories that take causes and effects to be factor values. The difference between factors and factor values seems small but is substantial. A factor, A, is causally relevant for a factor, B, (according to Woodward) iff at least one of A’s values can make a difference to at least one of B’s values, but in order for A=1 to be causally relevant to B=1, not any value of A but the specific value, 1, must be the difference-maker for the specific value 1 of B. Put differently, a theory analyzing causation between factors applies to cases such as ‘the electoral system is causally relevant for women’s representation in parliament’, while a theory opting for factor values applies to ‘a PR electoral system is causally relevant for high women’s representation in parliament’. Hence, in what follows I no longer use the Boolean shorthand notation according to which upper-case letters A, B, etc. stand for factor values; rather, they now stand for variables or factors (simpliciter). The basic idea behind Woodward’s interventionist theory is to first specify an ideal test setup, T, such that interventions occurring in T can recover all and only the causal relationships, and to then define A to be causally relevant to B iff that causal relation can be recovered in a T-test.
The crucial requirement that T has to comply with in order to serve its designated purpose is non-confounding.
That is, if the causal relevance of A for B is investigated in T, it must be ensured that no uncontrolled causes of B other than the ones on a path from A to B, viz. no off-path causes of B, are operative in the background. If A is intervened on while some latent off-path cause produces B, the resulting data is confounded and A erroneously appears to make a difference to B. Hence, all off-path causes must be homogenized, that is, held fixed in a T-test. Furthermore, rigorous constraints must be imposed on the way A is intervened on. If the manipulation of A also influences B on a separate path (such that A and B are parallel effects of that manipulation), the resulting co-variation of A and B is confounded and cannot be taken as evidence for the causal relevance of A for B. Likewise, the manipulation of A must not itself be the effect of an off-path cause of B, for that, again, would introduce confounding. In that light, Woodward defines the following technical notion of an intervention (Woodward, 2003: 98): an intervention on A with respect to B is a surgical cause of A – meaning A is its only direct effect – that sets A to one specific value and is independent of all off-path causes of B. Against that background, Woodward’s (2003) interventionist theory then amounts to the following substitution instance of schema (Φ):
(I) A is a type-level cause of B iff there exists a possible intervention on A with respect to B that is associated with a change in B when all off-path causes of B are held fixed.
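The logic of (I) can be sketched on a toy structural model. In the Python example below, setting the argument `a` directly stands in for a surgical intervention on A, and `z` plays the role of a single off-path cause of B that is held fixed. Both structural equations and all names are hypothetical, and the sketch presupposes that the off-path causes are already known, in line with the non-reductionist character of the theory.

```python
from itertools import product

def b_model(a, z):
    """Hypothetical structural equation for B with on-path factor A and
    off-path cause Z: B occurs iff A and Z are both present."""
    return int(a and z)

def b_confounded(a, z):
    """Alternative model in which only Z matters; A is causally idle."""
    return int(z)

def is_type_level_cause(b_of, a_values=(0, 1), z_values=(0, 1)):
    """(I), simplified: some pair of interventions on A changes B
    while the off-path cause Z is held fixed at some value."""
    return any(b_of(a1, z) != b_of(a2, z)
               for z in z_values
               for a1, a2 in product(a_values, repeat=2))

print(is_type_level_cause(b_model))       # True: a difference shows when Z=1
print(is_type_level_cause(b_confounded))  # False: no intervention on A changes B
```

In the second model A and B could still co-vary in observational data if Z also influenced A; holding Z fixed while setting A is what removes that confounding.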
Some features of that theory deserve separate emphasis. First, it is non-reductionist because it defines causation in terms of interventions, which are certain designated causes. According to (I), determining whether A is a cause of B requires some prior causal knowledge about the causes of A and B, not however, about the relationship between A and B itself (i.e. the theory is not circular). That means (I) can only be brought to bear when that required causal knowledge is available.
An interventionist theory cannot analyze causal phenomena from scratch. Second, even though the notion of an intervention is commonly associated with human action, Woodward’s technical definition of the term carries no such connotations whatsoever. Any surgical cause of A, whether the result of human action in an experiment or occurring beyond the range of human action, can count as an intervention. That means, in particular, that an interventionist theory can be applied both in experimental research contexts and in purely observational ones. If observational data contain information about a surgical cause of some test factor, A, (I) can be brought to bear. Third, for A to count as a cause of B, (I) does not require that A co-varies with B as a result of an actual intervention, rather, it suffices that there exists a possible intervention inducing a co-variation of A and B. Whenever no intervention on A actually occurs, the question thus arises whether such an intervention is possible and what would happen to B were such an intervention to occur. Answering those questions calls for recourse to possible world semantics, which, in turn, has led to interventionist theories being characterized as (type-level) variants of counterfactual theories. The main problem of (I) stems from the fact that it not only treats recoverability by an interventionist T-test as sufficient for causation – which it uncontroversially is – but also as necessary for it. (I) entails that, if it is impossible to surgically cause A (i.e. to intervene on A), A is causally inert. Given the enormous causal complexity of the world, it might well be quite common for factors to be interconnected with so many other factors that they cannot be caused surgically. That such factors do not cause anything does not seem adequate.
Mechanistic Theory
Finally, contrary to all previously discussed theories, mechanistic theories subscribe to causal realism, that is, to the thesis that not only causes and effects exist in the world but also the relation connecting them. There are two types of mechanistic theories: process theories (e.g. Dowe, 2000) and complex systems theories (e.g. Glennan, 1996). Process theories define causation in terms of the transmission of energy or momentum from causes to effects. They are tailor-made to account for causation in physical systems and are only of minor relevance for social and political science. Accordingly, this section focuses on complex systems theories, which contend that causes are connected to their effects via a mechanism – a complex system of interacting parts – accounting for the production of the effect by the cause. According to this approach, the characteristic feature of causes is not that they are difference-makers of their effects but that they produce them. As production is a causally loaded notion, resulting theories are non-reductionist. Mechanistic theories typically remain non-committal as to whether causation is transitive or deterministic. By contrast, they take a decisive stance on the ontology of causation: causes and effects are spatiotemporally located entities, that is, events. Correspondingly, their primary analysandum is singular causation. The conceptual core of a mechanistic theory is the notion of a mechanism. As indicated above, complex systems approaches do not interpret that notion in a narrow physical sense. Rather, any complex system featuring suitably arranged and interacting parts can count as a mechanism. The following is the most frequently cited characterization of a mechanism in the wide sense; it is due to Machamer et al.:
Mechanisms are entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination conditions. (Machamer et al., 2000: 3)
The deliberately unspecific terms ‘entities’ and ‘activities’ referring to the constituents of a mechanism can be specified for various fields of application. For instance, Little stipulates that social mechanisms are ‘constituted by the purposive actions of agents within constraints’ (2011: 273). Against that background, Glennan (1996) puts forward the following substitution instance of schema (Φ): (M) α is a token-level cause of β, where α and β are non-fundamental events, iff there exists a mechanism connecting α and β.
In other words, α and β are causally related iff there is a sequence of intermediary events, each of which is caused (produced) by its predecessor and causes (produces) its successor. The causal relations among the elements of that sequence are again to be analyzed in terms of the existence of intermediary mechanisms, and so forth. That is, (M) non-reductively spells out causation between upper-level events in terms of causation between lower-level events, giving rise to a regress that continues until a fundamental level is reached where nothing intermediary exists any longer – if such a level exists at all; or, if no fundamental level exists, the regress continues ad infinitum. In any case, causation between events on a fundamental level cannot be understood in the vein of (M), which is why Glennan restricts the applicability of (M) to non-fundamental events. Causation between fundamental events must then either be analyzed on the basis of another theory of causation or treated as fundamental. As all other non-reductionist theories, (M) can only be applied if a significant amount of prior causal knowledge is already available. While other non-reductionist theories, however, presuppose clarity on causation in the background (i.e. outside) of a scrutinized causal relation, (M) presupposes knowledge about what is going on in between candidate causes and effects (i.e. inside of a causal relation).
Methodological Perspectives
I end this chapter by briefly connecting the above theories to available methods of causal inference and discovery. As documented in
this part of the Handbook, there exists a multitude of methods. Many of them are tailored to uncover causation as defined by different theories. Given the vast number of available methodological frameworks, I cannot render explicit for all of them what variant of causation they trace. Hence, without claim to completeness, the following paragraph links every theory discussed in this chapter to exemplary methods designed to uncover causation as defined by that theory. Regularity theoretic causation can be uncovered by configurational comparative methods (e.g. Ragin, 2008; Baumgartner and Ambühl, 2018). Bayesian network methods (e.g. Spirtes et al., 2000) and regression analytic methods (e.g. Gelman and Hill, 2007) trace the probabilistic variant of causation. The potential outcomes framework and structural equation modeling (e.g. Morgan and Winship, 2007; Halpern, 2016) provide useful tools to search for causation in the sense of counterfactual theories. Interventionist causation can be uncovered by experimental methods or by interventionist variants of Bayesian network methods or structural equation modeling (e.g. Pearl, 2009). And a paradigmatic framework to examine mechanistic causation is process tracing (e.g. Beach and Pedersen, 2013). According to the position of causal pluralism, the theories of causation discussed in this chapter define different concepts of causation by doing justice to divergent (and often incompatible) properties pre-theoretically ascribed to causation. There is no fact of the matter as to which is the true theory. Causal pluralism yields methodological pluralism. Methods tracking causation as defined by different theories have different search targets and, consequently, cannot be meaningfully pitted against each other. There is no fact of the matter as to which of them is more truth-conducive; rather, they complement one another. The same, however, does not hold for methods targeting the same variant of causation. 
Such methods can and must be rigorously benchmarked against each other.
If one of them turns out to more correctly and completely uncover the relevant variant of causation it is strictly superior. The choice of theory and the choice of method mutually constrain each other and are constrained by the exigencies of a given research context. If causally analyzed data are of a correlational nature, a method is called for that can process correlational data and causation must be understood in terms of a theory that connects correlational dependencies to causation. Appropriate choices in such a context are a probabilistic theory and a method from the Bayesian network or regression analytic framework. If a study is examining the complexity of the type-level causal structure underlying a phenomenon of interest, a theory and corresponding method are needed that are capable of reproducing conjunctural causation, equifinality, and causal sequentiality. In that case, a regularity theory might be chosen along with a pertinent configurational comparative method. Alternatively, if enough prior knowledge on how to intervene on an investigated phenomenon is available, the interventionist framework is a safe choice. Or if a study aims to explain the occurrence of some token-level effect, suitable choices might be a counterfactual theory in combination with the potential outcomes framework or process tracing underwritten by a mechanistic theory.
Notes
1. I thank the Trond Mohn Foundation (grant no. 811886) for generous support of this research.
2. The complete definition of the notion of a minimal theory is beyond the scope of this article (for the latest definition see Baumgartner and Falk, 2019).
References
Albert, D. Z. (1992). Quantum Mechanics and Experience. Cambridge: Harvard University Press.
Baumgartner, M. (2013). A regularity theoretic approach to actual causation. Erkenntnis 78(Suppl 1), 85–109.
Baumgartner, M. and M. Ambühl (2018). Causal modeling with multi-value and fuzzy-set Coincidence Analysis. Political Science Research and Methods. doi: 10.1017/psrm.2018.45 accessed 17 December, 2019.
Baumgartner, M. and C. Falk (2019). Boolean difference-making: A modern regularity theory of causation. British Journal for the Philosophy of Science, doi: 10.1093/bjps/axz047
Beach, D. and R. Pedersen (2013). Process Tracing Methods: Foundation and Guidelines. Ann Arbor: University of Michigan Press.
Beebee, H., C. Hitchcock, and P. Menzies (eds.) (2009). The Oxford Handbook of Causation. Oxford: Oxford University Press.
Casati, R. and A. Varzi (2015). Events. In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Winter 2015 ed.). Metaphysics Research Lab, Stanford University.
Davidson, D. (1967). Causal relations. Journal of Philosophy 64(21), 691–703.
Dowe, P. (2000). Physical Causation. Cambridge: Cambridge University Press.
Eells, E. (1991). Probabilistic Causality. Cambridge: Cambridge University Press.
Ehring, D. (2009). Causal relata. In H. Beebee, C. Hitchcock, and P. Menzies (eds.), The Oxford Handbook of Causation, pp. 387–413. Oxford: Oxford University Press.
Gelman, A. and J. Hill (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Glennan, S. S. (1996). Mechanisms and the nature of causation. Erkenntnis 44, 49–71.
Glynn, L. (2011). A probabilistic analysis of causation. British Journal for the Philosophy of Science 62(2), 343–392.
Graßhoff, G. and M. May (2001). Causal regularities. In W. Spohn, M. Ledwig, and M. Esfeld (eds.), Current Issues in Causation, pp. 85–114. Paderborn: Mentis.
Halpern, J. Y. (2016). Actual Causality. Cambridge, MA: MIT Press.
Halpern, J. Y. and C. Hitchcock (2010). Actual causation and the art of modelling. In R. Dechter, H. Geffner, and J. Y. Halpern (eds.), Heuristics, Probability, and Causality: A Tribute to Judea Pearl, pp. 383–406. London: College Publications.
Healey, R. (2009). Causation in quantum mechanics. In H. Beebee, C. Hitchcock, and P. Menzies (eds.), The Oxford Handbook of Causation, pp. 673–686. Oxford: Oxford University Press.
Hitchcock, C. (2001). The intransitivity of causation revealed in equations and graphs. Journal of Philosophy 98(6), 273–299.
Hume, D. (1999 [1748]). An Enquiry Concerning Human Understanding. Oxford: Oxford University Press.
Lewis, D. (1973). Causation. Journal of Philosophy 70(17), 556–567.
Lewis, D. (2000). Causation as influence. Journal of Philosophy 97(4), 182–197.
Little, D. (2011). Causal mechanisms in the social realm. In P. M. Illari, F. Russo, and J. Williamson (eds.), Causality in the Sciences, pp. 273–295. Oxford: Oxford University Press.
Machamer, P. K., L. Darden, and C. F. Craver (2000). Thinking about mechanisms. Philosophy of Science 67(1), 1–25.
Mackie, J. L. (1974). The Cement of the Universe: A Study of Causation. Oxford: Clarendon Press.
Mellor, D. H. (1995). The Facts of Causation. London: Routledge.
Menzies, P. and H. Price (1993). Causation as a secondary quality. British Journal for the Philosophy of Science 44(2), 187–203.
Morgan, S. L. and C. Winship (2007). Counterfactuals and Causal Inference: Methods and Principles for Social Research. Analytical Methods for Social Research series. Cambridge: Cambridge University Press.
Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge: Cambridge University Press.
Psillos, S. (2010). Causal pluralism. In R. Vanderbeeken and B. D’Hooghe (eds.), Worldviews, Science and Us: Studies of Analytical Metaphysics, pp. 131–151. New Jersey: World Scientific Publishers.
Ragin, C. C. (2008). Redesigning Social Inquiry: Fuzzy Sets and Beyond. Chicago: University of Chicago Press.
Reichenbach, H. (1956). The Direction of Time. Berkeley: University of California Press.
Spirtes, P., C. Glymour, and R. Scheines (2000). Causation, Prediction, and Search (2nd ed.). Cambridge: MIT Press.
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North Holland.
Tooley, M. (1987). Causation: A Realist Approach. Oxford: Clarendon Press.
Woodward, J. (2003). Making Things Happen: A Theory of Causal Explanation. New York: Oxford University Press.
19 Concept Regulation in Political Science Zachary Elkins
Introduction
In the early 1970s, a group of enterprising political scientists began to plot the future of their discipline. Their primary frustration was one of vocabulary; specifically, the lack of a standardized set of terms for political things. This frustration is undoubtedly a common one among scientists (or anyone seeking knowledge). Language and thinking are inextricably linked; we use words to think and think to use words. Why should we be surprised, then, at the many intellectual disagreements that are ‘just’ semantic? One solution to this state of affairs is to record common usage of the central terms of the debate. A debatable endpoint of such an exercise would be to legislate common meanings of terms, preferably with the backing – however implicit – of an authoritative body. Giovanni Sartori was at the center of just such a movement in Political Science, and he and collaborators championed the idea of a more conscious and attentive approach to concepts. Their group
constituted the first Research Committee (Research Committee-01) of the nascent International Political Science Association (IPSA). And what better mission for a global learned society than to coordinate language and develop international standards in this way? As we shall see, Sartori and others in this ‘conceptualist movement’ were successful over the next 50 years in accelerating interest in the systematic study and use of concepts, if not in legislating their meaning. Concept analysis is now a central part of the Political Science toolbox. It is important, of course, to situate Sartori and friends within the larger trajectory of what we might call ‘concept regulation’ in the sciences and humanities. Intellectual history is studded with similarly heroic attempts to structure, represent, and standardize knowledge. Even ordinary (‘folk’) language, in many societies, has undergone some sort of regulation. It is almost quaint to imagine a time in which human beings would deign to use words without reference to their ‘correct’,
or at least official usage as established by a published dictionary. Often, this regulation is the intellectual product of a driven and dedicated maven. And sadly, these public-spirited, library-bound souls were often not fully appreciated during their lifetimes. Noah Webster devoted 25 lonely years to the construction of a new American dictionary. He apparently did not garner much acclaim or affection among his contemporaries (Lepore, 2006), but the impact of his work, for those of us in the United States, on the meaning of words, our choice of words, and how we spell them is nothing short of seismic. And surely we can say something similar about the Académie française, the board of governors of the Oxford English Dictionary, and so on. But the scientific fields are no different, and perhaps even more suited to regulation. Leibniz, for one, recognized the value of a shared sense of terms, and pondered the development of a systematic language in which scientists could reason through their observations of the natural world together (Loemker, 1969). Within domains of study, certain entrepreneurs have succeeded in establishing conceptual schemes. Think, of course, of Linnaeus’ Systema Naturæ (1735); his taxonomy and Latin vocabulary still dominate our thinking about the natural world, even as other conceptual schemes to classify organisms evolved. And then there is the Diagnostic and Statistical Manual of Mental Disorders (DSM) of the American Psychiatric Association (2013), which in its various editions has (in)famously catalogued the ‘pathologies’ of the mind for nearly 70 years. It makes sense to review these efforts and insights of concept entrepreneurs across fields of study, in part to understand the consequences of regulation itself. No one would doubt the trade-offs and challenges involved in any regulation of human thought, and the costs and benefits are on display in each of these disciplines. Moreover, nothing about concept formation is unique to Political
Science, and insights from cognitive psychology, philosophy, and any of the substantive fields of inquiry are enlightening. Nowhere is the study of concept regulation more alive than in the fertile field of information science. The last two decades have ushered in a clear sense of how one might proceed in the realm of concept regulation. I think here about the use of formal ‘ontologies’ in the merging of data and concepts. ‘Ontology’ in this usage is not the philosophical study of being, but the information-science sense of a formalized set of concepts in a domain, as well as their properties and relationships with other concepts. What do these advances mean for concept monitors in Political Science? The second half of this chapter turns to these new tools of ontology formation and analysis, which I illustrate in the domain of comparative constitutional law.
The First 50 Years
Modern concept regulation, at least in Political Science, might be said to have started in 1970 with the publication in the American Political Science Review of Giovanni Sartori’s seminal article ‘Concept misformation in comparative politics’, and the launch of the IPSA and that organization’s flagship committee (Research Committee-01) on concepts.1 One can go back further, of course. Weber is a logical antecedent, not to mention a generation or two of German scholars such as Reinhart Koselleck who helped to fashion an entire field of ‘Conceptual History’ (Begriffsgeschichte). But within Political Science, Sartori and his many followers proceeded to illustrate and define what it means to do ‘systematic’ concept analysis. Pick up an article or book on political science and you are likely to find a section (or chapter) on (something like) ‘conceptualization and measurement’. These concept narratives often have an informal quality, at least in contrast to the more coordinated
concept regulation that would go on in other fields (say, formal taxonomies in biology or the medical sciences). But as we shall see, the concerns of these narratives fit logically with advances in information technology in which scholars digitalize (and, thus, formalize) ontologies to represent knowledge and to connect concepts to data. These formal methodologies of concept structuration – whatever form they take – very likely will occupy concept regulators over the next 50 years. An introduction to these approaches constitutes the second part of this chapter. But first, we must understand what it means to do ‘concept analysis’ in political science.
The Art and Science of Concept Reconstruction (and what is a concept, anyway?)
If the point of concept analysis is to understand and communicate the meaning of terms, it would be the height of hypocrisy not to begin with the ‘C’ word. And what better way to understand the science of concept analysis than to turn that analysis on itself? The concept of ‘concept’ is probably hazy for most. Ask students (or anybody) for the meaning of a concept, or even an example of one, and you are likely to be met with stony silence. As it happens, the question leads to a rewarding classroom discussion – as most conceptual questions do.2 And like all conceptual questions, the answer will probably not be decided by fiat, since we are interested in understanding the multiplicity (if there is such) in meanings. Indeed, we should subject the term to the same grueling treatment that Sartori would reserve for other troublesome concepts in political science (e.g., power, institution, democracy). By way of example, Sartori’s 1984 book Social Science Concepts includes separate chapters in which he and his collaborators reconstructed a set of important concepts. Another notable example is a special issue of the newsletter of the Comparative Politics section of the American
Political Science Association (APSA), in which leading comparativists marched through some 20 concepts that have troubled the discipline for years (APSA-CP, 2009). This is not to mention the many articles and monographs that reconstruct concepts of various kinds, such as Weyland’s (2001) on populism or Gerring’s (1997) on ideology, and so on. Most of these works, it is probably fair to say, borrow in some way from the special attention given to concept analysis in the 1970s. All this is to say that we should subject the concept of concept3 to the Sartorian concept-reconstruction process, which involves exploring the intension and extension of terms as well as their semantic field. The intension (or connotation) of a concept refers to its meaning, and extension (or denotation) to the application of the concept to cases (or, instances of the concept). The ‘semantic field’ is one of the resonant labels from Sartori’s (1984) work and refers to the network of concepts that are related to the concept under study. Entities in the semantic field might be synonyms, antonyms, broader variants (hypernyms), narrower variants (hyponyms), concepts that share one or more attributes, etc. Essentially, the semantic field is what one might hope to find in the entry in an exceptionally comprehensive and narrated thesaurus. Clearly, if one were to deconstruct (and reconstruct) a concept, mapping the semantic field while simultaneously exploring the intension and extension of both the central concept and its field mates makes for a useful process of disambiguation. One way to understand the meaning – both the intension and extension – of the term concept is to understand its cognitive utility. Concepts help us to navigate the blinding amount of stimuli by lumping ‘things’ into categories, which we then call an ‘instance’ of that category. 
So, if we could understand leaders such as Juan Perón and Donald Trump as instances of populism (or some other category), we might move about the political world with more cognitive ease.
Similarly, if I know that a piece of music is classified as jazz or, say, punk, I can know whether or not I can listen to it without distracting my writing. Notice that concept and category are more or less interchangeable here. Indeed, it seems to many that one can swap the two without significant loss in meaning (Murphy, 2004). For that matter, we might include class, classification, and taxon(omy) as near-enough synonyms, in the spirit of mapping the semantic field. Once we start thinking about concepts as categories, or ‘data containers’ (Sartori, 1970), their use becomes more clear – and more data oriented. David Collier perfected an ingenious catch-phrase – ‘what is that a case of?’ – which he used to encourage students to think more conceptually (and categorically). He used it periodically, and to great effect. If an unwitting student expressed interest in some particular political phenomenon – say, the Chiapas uprising – the Collier treatment would follow: ‘interesting … but, what is that a case of?’. Categorizing is the first step in concept development and that particular Collierism has very likely led many students to think more conceptually. Accordingly, a common summary definition of a concept is a ‘mental representation’. Suppose that we think of a mental representation as an idea that stands for something in the world. As such, a defining attribute is that a concept is abstract as opposed to concrete (observable or perceived). This is also to say that a concept is invented (constructed), as opposed to given. Hence the term construct, which we might also view as synonymous with concept, but one that reminds us that the mental representation is invented (not given) (see Moses, Chapter 27, this Handbook). 
Although here we must consider the possibility that some concepts are natural, or basic, that is, that some concepts seem almost to be hard-wired as opposed to learned or invented, the evidence for such a claim would seem to be the appearance of very similar concepts across a large set of
societies that are removed from one another. Research in child development and across cultures explores exactly this possibility of socially constructed, or learned, concepts as against something more innate. Rosch (1973), for example, found that some concepts and categories are more ‘natural’ than others. Specifically, the important insight from Rosch is that some levels of abstraction are more ‘basic’ than others (more below). This abstract/concrete distinction leads to the concern of reification – that is, that one’s mental representation may be mistaken for a thing unto itself. And here we must appreciate both the appeal and danger of concepts. Naming something can be a remarkably powerful act. Contrary to popular opinion, a rose by any other name does not smell just as sweet, except in blind anonymous tests. Just ask any recipient of one on Valentine’s Day; the category of things called rose carries all sorts of constructed meaning independent of the physical plant. Similarly, if we are to call Donald Trump a populist, we have imputed all sorts of associations that may or may not hold. Categorizing the world with official labels is undoubtedly one reason that some have been skeptical of the DSM. Some wonder whether certain behaviors are actually disorders at all. Bereavement syndrome (that emotional state brought on by grief) is now properly named and, apparently, a disorder, at least as of the latest edition of the DSM. Concerned practitioners and researchers in the field of health raise real questions about whether in the naming of symptoms we are medicalizing and exaggerating common and expected human behaviors. Of course, group identity (whether along race, ethnicity, nationality, or any other marker) is one of the powerful and controversial forms of reification. Some ways of categorizing people seem to be ‘basic’, in the sense of Rosch, but others are quite obviously products of scientific thinking. 
I recall a survey of street vendors in Lima, Peru, in which interviewers inquired about the respondent’s
satisfaction with being a member of the informal sector. If respondents had not considered themselves as part of that particular class of ‘informal workers’, they did now! Certainly, human beings classify each other in any number of ways, but it is worth recalling that these labels and categories are just that – human inventions. Still, while reification can strike somewhat pathological and prejudicial notes, it is worth understanding that it is that exact rhetorical and cognitive power that makes for successful scientific theorizing and conceptualization. Hence the power of concepts. And with great power comes great responsibility.
Classical Versus Prototypical Views
An important distinction among concepts has to do with the problem of borders and gradations across categories. In the ‘classical’ view of categorization (Murphy, 2004), categories are defined by characteristics that are necessary and jointly sufficient for membership. For example, parliamentary democracies may be defined as those in which the executive is elected and removed by the legislature (call this defining feature ‘assembly confidence’). This view of concepts (sometimes associated with the logic of the excluded middle) admits of no borderline cases, and each member of the category is treated as a full instance of the concept, with no significant distinctions drawn among members. Of course, the Linnaean taxonomy of the natural world follows this approach (thus, the duck-billed platypus is admitted as a full-fledged mammal on account of its mammary glands despite all sorts of attributes that are more characteristic of non-mammals). The modern view of concepts, associated closely with the work of Ludwig Wittgenstein and Eleanor Rosch, has led to a more probabilistic view of concepts. So, Wittgenstein’s (1953) idea of family resemblance threatens the idea that there is any common (much less necessary) attribute of category members.
Parliamentary systems, then, might be a family of systems whose members share a substantial number of characteristics (e.g., executive decree power, minimal legislative oversight of the executive, a figure-head for head of state), but vary in the degree to which they exhibit these attributes. Rosch’s (1973) large body of experimental work showed that most of us possess prototypical views of concepts (a chair is a highly typical instance of furniture, a bookcase less so, and a piano even less). Again, then, the idea is that in-group items are differentiated with respect to their degree of belonging to the group. More recently, David Collier’s (Collier and Mahon, 1993; Collier and Levitsky, 1997) elaboration on classical versus radial sub-typing and his application of these ideas to central political science concepts such as democracy introduced many of us to these more graded approaches to categorization. Collier’s work has left political scientists with a distinct appreciation for partial membership in classification, though he stopped short of recommending particular measurement instruments with which to assign scores.
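The contrast between the two views can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions: the attribute labels paraphrase the parliamentary example above, the three cases (system_x, system_y, system_z) are hypothetical rather than real countries, and scoring typicality as a simple share of family attributes is one of many possible choices, not a measurement instrument proposed in the literature cited here.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    name: str
    # Classical view: attributes that are necessary and jointly sufficient.
    defining: frozenset
    # Prototype / family-resemblance view: attributes members tend to share.
    typical: frozenset = frozenset()

    def is_member(self, case_attrs):
        """Classical membership: all-or-nothing, no borderline cases."""
        return self.defining <= set(case_attrs)

    def typicality(self, case_attrs):
        """Graded membership: share of the family attributes a case exhibits."""
        if not self.typical:
            return 0.0
        return len(self.typical & set(case_attrs)) / len(self.typical)


def extension(concept, cases):
    """The extension under the classical view: names of all full members."""
    return sorted(n for n, attrs in cases.items() if concept.is_member(attrs))


# Attribute labels are assumptions paraphrasing the example in the text.
parliamentary = Concept(
    name="parliamentary system",
    defining=frozenset({"assembly_confidence"}),
    typical=frozenset({
        "assembly_confidence",
        "executive_decree_power",
        "minimal_legislative_oversight",
        "figurehead_head_of_state",
    }),
)

# Hypothetical cases, not real countries.
cases = {
    "system_x": {"assembly_confidence", "executive_decree_power",
                 "minimal_legislative_oversight", "figurehead_head_of_state"},
    "system_y": {"assembly_confidence", "figurehead_head_of_state"},
    "system_z": {"executive_decree_power"},
}

if __name__ == "__main__":
    print("classical extension:", extension(parliamentary, cases))
    for case_name, attrs in cases.items():
        print(case_name, "typicality:", parliamentary.typicality(attrs))
```

In this sketch the classical view sorts the hypothetical cases into members and non-members with nothing in between, whereas the graded score differentiates among the members and assigns even the non-member a partial degree of belonging.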
The Idea of Concept Regulation
But ‘all is not well in the land of concepts’, begins Gerring (2001: 36), in an evaluation of the state of affairs. And indeed, frustration with the inattention and even resistance to shared meaning and understanding in political science motivated at least some conceptualists to act. Sartori, Riggs, and Teune – the ringleaders of the conceptualist movement in the 1970s – were very much of this opinion. The group’s manifesto may well be their manuscript of 1975, appropriately entitled Tower of Babel (the Sartori chapter of which is republished in Collier and Gerring, 2009). Sartori’s weighty chapter is sprinkled with warnings about the mounting trend toward non-cumulative research and debate, something that he sees in stark
Concept Regulation in Political Science
contrast to practices in the natural sciences and economics. He worries that ‘the soft sciences are sliding – if unawares – toward a vicious cycle of incommunicability and frivolous verbalism’ (Sartori, 1975 [2009]: 63). I should say that others before him had been similarly frustrated. Charles Titus (1931) – who would develop his own symbolic language for political concepts – reported his frustration at turning up 145 different definitions of the concept ‘state’. Titus described Political Science as a ‘guessing game’, in which ‘we are spending much of our time guessing what the sender means when he uses even technical words’ (1931: 45). He goes on: ‘Take, for example, our own situation. Do you know what I mean by the words and terms which I have been using? Or have you been guessing? Do you know whether you have successfully guessed the meanings I have tried to put into the terms used? What assurance have I that you, individually or collectively, have been successful in your guesses?’ (Titus, 1931: 45–6)
Plus ça change…, as they say. But, as Sartori later saw it in 1975, things were only getting worse due to three trends in scholarship – trends that have probably only escalated, or at least morphed, since then. One was the sharp decline of classical languages. Scholars were (are) no longer conversant in Latin or Greek. The problem for Sartori was not that they could not agree upon a lingua franca for naming (à la the Latin names in the biological taxonomy), but rather that the etymological roots of the vocabulary of modern languages were now obscured. So, in his example, we may not appreciate that consensus and cooperation – two terms in need of disambiguation – have their respective roots in ‘feeling’ as against ‘working’ together. There is, of course, much more to say here regarding the rise and fall of the world’s languages. One phenomenon that has evolved since the Tower of Babel is that English seems to have penetrated international scholarship and become effectively the lingua franca in the sciences and social sciences (although it is notable that rich
intellectual and scientific discussion continues across languages). How this trend affects concept regulation is interesting to ponder. It may be, of course, that heretofore heterogeneity in language has led not only to concept proliferation (and redundancy) but also to conceptual innovation. Undoubtedly, scholarship has been well served by rich concepts developed in other modern languages, especially German, which are often preserved as untranslatables.4 Sartori was also concerned – at least with respect to mutual intelligibility – with what he saw as a de-contextualized and ahistoric approach to Political Science evident in the ‘behavioral revolution’. He worried that a concept such as coercion, which might be understood in reference to a particular prototypical experience, would no longer register among scholars operating with a more general approach. Relatedly, Sartori saw the scholarly disciplines increasing in their specialization and, as a result, losing a common discourse. He notes that ‘words such as “structure” and “culture” are used in philosophy, ethnology, anthropology, psychology, sociology, and Political Science in a way which is chaotic, wasteful, and frequently conducive to cross-disciplinary bastards’ (Sartori, 1975 [2009]: 62–3). Finally, he observes what he describes as a ‘frenzy of novitism’, in which scholars are encouraged to invent new terms, without any real consultation with established usage. We should note that Sartori is not decrying any of these trends. Some of them, such as disciplinary specialization, he sees as distinctly positive developments. The point is that they throw up challenges to intersubjective agreement on terms, hence the rationale for concept regulation. Regulating (or legislating or rationalizing) concepts has a pejorative and perhaps futile sense to it, as one might expect of any efforts at social control. Some may see the effort as stifling the natural creativity and fluidity of speech.
Virginia Woolf (quoted at length in Gerring, 2001: 36) mounted a
full-throated defense of this position,5 but a sentence here provides a flavor: ‘[words] hate anything that stamps them with one meaning or confines them to one attitude, for it is their nature to change’ (Gerring, 2001: 36). Suffice it to say that Woolf was not concerned about cumulative theorizing and would decidedly not have been a card-carrying member of the IPSA’s Research Committee-01. But there is something of this appreciation of the indeterminacy and dynamism of ordinary language that permeates the work on concept regulation. Sartori himself makes clear that their effort was decidedly not about defining terms by fiat. And Calise and Lowi, in one of the more ambitious and creative efforts at concept rationalization in recent years, recognize their ‘need to set aside any pretense of having the ultimate word(ing), even though such an attitude may run contrary to the objective of all lexicographers as well as the ambition of most scholars’ (2010: 4). But one can find examples of seemingly quixotic efforts to regulate political science language. An extreme version might be that of the aforequoted Charles Titus, who in the late 1920s carried out an imaginative test experience with a synthetic language, in which he endeavored to replace many important political concepts with symbols. So, for example, in his dictionary, the Greek letters alpha, beta, and gamma would stand for assumptions, methods, and ‘control programs’, respectively. ‘A’ would indicate a human activity, ‘C’ a written constitution, and so on. Titus imagined this to be something like chemistry’s periodic table, or the basic operating notation in mathematics. Titus’s system may be on the extreme end of concept regulation, and has a heroic, Esperanto-like feel to it. However, one can easily imagine a very modest adoption of his system by formal theorists, in the interest of standardized notation.
The conceptualist movement of Sartori was decidedly not heavy handed, but it too was founded on the belief that one cannot capitulate to poetic license. Sartori, for his
part, conceptualizes the debate as one of Heraclitus versus Descartes. Is it, he asks, that phenomena are ‘unbounded, continuous, and in endless flux’ (Sartori, 1975 [2009]: 92) or that scholars can only gain any traction and understanding of a phenomenon if they can put some bounds on how they describe it? Sartori falls unambivalently into the Cartesian camp. Indeed, he asserts that ‘wherever we have arrived, as mental animals, in controlling the world of nature, we owe it to the Cartesian approach’ (1975 [2009]: 92). But the interesting question is what to do. When Gerring (2001: 37–9) muses on the options, he considers (in my paraphrasing): (1) doing nothing; (2) maximizing concreteness; (3) defining terms carefully; and (4) contextualizing every reference. The Sartori prescription may be an extended version of (3) and (4). It is a heavily analytical approach combined with an appreciation for precedent. The first, and core, Sartorian measure is something of a non-proliferation treaty: an injunction against neologisms (anti-novitism, Sartori calls it). The second, and highly interdependent, injunction is that scholars consult and describe the semantic field, especially when introducing a new term. Mapping related terms in the semantic field, it follows, might remind us to reduce, reuse, and recycle our terms. I think of these measures as something like the sign that hung outside the computer room of the Berkeley Political Science department in the 1990s, which read: ‘(1) Absolutely no food or drinks allowed; (2) If you do bring in food or drink, be careful’. Like a steaming cup of coffee next to one’s keyboard, neologisms are one of the joys of scholarship, and one in which Sartori himself indulged. But more so than anti-novitism and semantic mapping, Sartori’s prescription is toward a more conscious and analytical approach to concept formation and consumption.
His various books demonstrate a clear methodology and language for how to talk about concepts, analyze them, and develop them. He thought about his rules as ‘attention
sharpeners’ – methods of inducing more self-conscious conceptualization practices. Sartori, of course, understood the power of institutions and of coordination. In this sense, he saw the newly formed IPSA research committee as a powerful way to promote this more conscious approach to concepts. ‘My hope’, he writes, ‘is that [IPSA’s RC-01] will provide – as a joint endeavor – the missing linkage, or the missing bridge, between specialists in logic, methodology, and the philosophy of science on the one hand, and social science practitioners on the other’ (Sartori, 1975 [2009]: 93). And perhaps the committee has become such a link, to some extent. But certainly, the committee’s publications – inter alia, those by Sartori – were impactful and led a fair number of influential scholars to take conceptualization seriously and incorporate a conscious approach to concepts as an integral part of their work.
The Next 50 Years Many political scientists, then, have become quite self-conscious about their use of concepts. But one task that scholars have not attempted is that of mapping – in any comprehensive or formal sense – the use of terms across the discipline (as in biology). Nor have political scientists attempted, understandably, to legislate or standardize vocabulary. Some interesting formalizations have evolved, notably the aforementioned Calise and Lowi (2010). But technology for such coordination has evolved, in part as a method for attaching concepts to data, which is a primary incentive for coordination, along with the goal of representing knowledge. These methods are an integral part of the current information infrastructure online. Here, I introduce and describe these methods and illustrate their relevance for political scientists in the context of constitutional law.
The Domain of Constitutional Text Consider the study of national constitutions, historically a core concern of political science, and one that has undergone significant changes in data collection in recent years. Conceptual/classificatory questions are legion in this field. For example: some may see a national executive that is selected by the people to serve for a fixed term and label the system that enshrines those attributes as ‘presidentialist’, while another observer might look to still other properties, such as veto power or nominating power, and categorize the system quite differently. Those who study institutional and constitutional forms engage periodically in these classification exercises. For example, the authors of the Comparative Constitutions Project (CCP) (Elkins et al., 2005 [2019]) have identified, excavated, and interpreted each national ‘constitution’ that has come into force since 1789. They, of course, have their own conceptual preferences, which they have used to develop a survey instrument to read and interpret constitutions in order to record the content of the documents. These authors’ conceptual scheme – starting with the identification of what, exactly, a country’s Constitution is – represents merely their view of the world, albeit a view informed by scholars who have come before them. After all, the classification of constitutional elements goes back at least as far as Aristotle, who put together a ‘dataset’ of the constitutions of Greek city-states (Politics, especially Books IV and VI). My working assumption is that other researchers read texts differently from the way CCP Scholars do, and would record different properties about them (more on this assumption below). Still, projects such as the CCP that gain scholarly currency may inadvertently have the effect of discouraging the elaboration of alternative schemes. Ideally, CCP interpretations and concepts should speak to and with the voices of others.
So, for example, other scholars should
be able to see that CCP Scholars recorded Article 38, section 1(1) of the Cape Verdean Constitution of 1980 as expressing, among other things, the right to privacy. That particular section reads: ‘The right to personal identity, to civil rights, to a name, honor, and reputation, and to personal and family privacy shall be guaranteed’ (Cape Verde [1980] Article 38(1)). In this case, right to privacy is a concept that CCP Scholars find useful for capturing an element of that clause, but other researchers might find additional or other, perhaps more refined, concepts just as relevant. Indeed, the field of human rights has been particularly fertile with respect to concepts. And so it may be that the Cape Verdean clause speaks to evolving concepts such as the right to be forgotten (Rosen, 2012), or even the right to the city (Harvey, 2008). And, of course, the clause is a manifestation of multiple other concepts even in the CCP taxonomy itself.
Relationships Among Conceptual Schemes In order to understand how different conceptual schemes can ‘converse’, we need to understand how the schemes might vary, exactly. It is only then that we can understand how to translate and share them. It seems possible that these schemes could vary in any number of ways, but consider three kinds of relationships among datasets in the constitutional domain, each of them related to the data on constitutional elements (such as the CCP) in roughly three orders of proximity (see Table 19.1).
First-Order Extensions and Refinements In the first category, consider those projects in which scholars are essentially observing and coding the same thing, albeit with different purposes, different theoretical frameworks and, importantly, different levels of
generality. Some schemes are simply more refined than others, depending upon observers’ tastes and interests. This is the phenomenon of the proverbial Eskimos and their 60 kinds of snow. The CCP categorization scheme, in some areas at least, stops at a relatively high level of generality (maybe something comparable to the genus level in biology), but other researchers might prefer to make finer distinctions (say, at the species level). For example, in the CCP the authors identify six characteristics related to the treatment of women in constitutions. Lambert and Scribner (2009), however, have since coded a smaller set of constitutions across a much wider (and deeper) set of characteristics, a set that was motivated by their own theoretical agenda. Specifically, Lambert and Scribner are interested in comparing constitutions that ‘emphasize women’s different needs and provide gender-based protections (difference) as compared to countries with constitutional structures that emphasize equality or gender neutrality (equality)’ (Lambert and Scribner, 2009: 337). This particular distinction between their concepts of difference and equality may not be fully specified in the CCP data, though one might certainly measure something about equality provisions in the CCP data. The discussion above implies a sequential, generational relationship in which data collectors build on one another’s labor. It could be, of course, that analysts till the same soil independently (sometimes willfully so) of one another. We might even think of some projects as competing with one another – carried on in parallel but with sidelong glances. Stephen Jay Gould’s (1992 [2010]) account of Brontosaurus comes to mind. Gould’s story is one of two zealous paleontologists with two different visions of the same species. One coined the name ‘Brontosaurus’ and the other ‘Apatosaurus’.
One name, inevitably, would be the one that would become the industry standard and capture the imagination of seven-year-olds everywhere. For many years,
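Table 19.1 Selected datasets in the constitutional domain

Distance from CCP* | Unit of analysis | Topic(s) | Data source
First order | Constitution | Gender | Lambert and Scribner (2009)
First order | Constitution | Minority incorporation | Koenig and Tsutsui (2019)
First order | Constitution | Rights | Versteeg (n.d.) (2014)
First order | Constitution | Judicial authority | Brinks and Blass (2018)
First order | Constitution | Judicial independence | Ríos-Figueroa and Staton (2012)
First order | Constitution | Executive power | Shugart and Carey (1992)
First order | Constitution | Amendability | Ginsburg and Melton (2015)
First order | Constitution | Environment | Boyd (2011)
First order | Constitution | Judicial councils | Garoupa and Ginsburg (2009)
First order | Constitution | Constitutions c. 1975 | van Maarseveen and van der Tang (1978)
First order | Constitution | Constitutions c. 1998 | Harutyunyan et al. (1998)
Second order | Judicial decisions | Various | Carrubba et al. (2012)
Second order | Judicial decisions | Free speech | Keck (2015)
Second order | Judicial decisions | Gender | Women’s Link Worldwide (2019)
Second order | Judicial decisions | Rights | Cichowski and Chrun (2017)
Third order | Country | Development | World Development Indicators
Third order | Country | Political authority | Varieties of Democracy (V-Dem)
Third order | Country | Minority status | Minorities at Risk (MAR)
Third order | Country | Human rights enforcement | Cingranelli et al. (2014)
Third order | Country | Ethnic power relations | Wimmer et al. (2009)

* Comparative Constitutions Project
Source: Own elaboration.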
Brontosaurus was the standard, but some time in the 1980s, the powers that be designated Apatosaurus as such. Brontosaurus was to be the ‘junior synonym’. But then in 2015, long after Gould had quit the matter, two biologists wrote an apparently authoritative article that established two distinct species, thus allowing scientists to use both names. Who knows what parents are supposed to say to their kids about this now? Political science, for better or worse, does not have any sort of taxonomical police force or, it seems, even the same bitter lexical battles as those in biology. But an important part of the re-organization of anything is dealing with redundancy. We are more interested in translation (maintaining multiple useful names) than in standardization (retiring names). Both methods are forms of de-cluttering, but one is more neutral than the other. It would seem that the easiest approach to formalizing multiple perspectives on the world is simply to record (if not
maintain) these multiple perspectives. Of course, one faces a potential dilemma about whether to preserve infelicitous terms that do not capture the public. Still, one hopes that there are ways to privilege commonly held terms while recording ‘junior’ synonyms, if only as historical reference points. In short, there really seems to be no need to make a decision between ‘Brontosaurus’ and ‘Apatosaurus’ in political science, and many reasons to catalog the use of both. In the case of the CCP, the researchers were conscious of the prior, if esoteric, efforts of van Maarseveen and van der Tang (1978) to build an inventory of constitutional concepts and to code them across a set of constitutions. CCP Scholars sought to build on the thinking and even integrated some of the Dutch scholars’ concepts in the CCP. However, it was only much later that CCP Scholars discovered the work of a group of Armenian scholars (Harutyunyan et al., 1998), who had constructed their own
constitutional terminology for use in a project geared toward constitutional analysis in post-transition Armenia. In either case, it would be profitable to compare ontologies systematically.
Second-Order Extensions Another task of conceptual translation is to connect conceptualizations in one domain (or set of units) to conceptualizations in a related domain (or units). An example helps to clarify. Constitutional text can be inherently interesting (to some of us at least) and informative (again, sometimes). But text does not interpret itself and, as some would have it, text could be nothing more than a flexible vehicle that leaders and judges twist to their own ends. The indeterminacy (or not) and enforcement (or not) of text is a matter of dispute ad nauseam at nearly every forum on constitutional law, and not worth belaboring here. But regardless of where one comes down on these questions, a reader of constitutional text will wonder how the relevant courts or officials have interpreted and implemented the law. Conversely, a reader of a court’s decision regarding a particular law will want to read the accompanying law in question. By way of analogy, the same is true of religious texts and the doctrine surrounding them. In order to read scripture and its attending doctrine efficiently, one needs several basic pieces of information. One is, of course, a systematic recording of what the texts say (or, failing such digests, simply the texts themselves). Another is the decoding of the decisions (and opinions) regarding that text. But importantly – and the point of this chapter – one also needs a relatable set of concepts in order to connect these pieces of data. Relatability implies either a common (perhaps standard) lexicon, or translation tools that allow one to connect different conceptual vocabularies. Several empirical projects are in progress that offer some promise in this regard (see the second-order group in Table 19.1). These are
interesting studies of court opinions across different domains and samples. A basic goal might be to connect first-order projects with second-order projects. That is, text on the right to free speech in the South African constitution would be connected to that country’s high-court interpretation of that provision in cases.
Third-Order Extensions A third extension, even further from the constitutional text, is one to data about the jurisdiction itself (e.g., the country, which is the jurisdictional level of observation in the CCP data). So, one might want to connect constitutional data in a given year and country to country-year data on human rights enforcement, economic conditions, demographic characteristics, or whatever (how about happiness?). This kind of merging is, of course, common in any data-analytic paper on the origins or consequences of constitutions. Many of us compile such data as a matter of course in cross-national research. But sometimes it is helpful to have some systematic conceptual translation, especially if the data merge is machine mediated. Increasingly, one wants to gather data and information on a concept in a jurisdiction from across various unprocessed data sources, such as media reports. In order to do that efficiently, one would need a relatable set of concepts. Merging data with a concept map is certainly more efficient, but perhaps the real gain is in discovering the data in the first place by tracing concepts. Consider, for example, the concepts that underlie Ted Gurr’s (and collaborators’) Minorities at Risk (MAR) data, which include characteristics of groups that have experienced marginalization in their state of residency (Gurr, 1995). How do MAR concepts such as ‘political discrimination’ relate to some of the elements of ethnic accommodation in constitutions? It might be that we should think of a constitutional provision, such as ‘constraints on party formation’, as one application (or example) of ‘political
discrimination’. If so, the presumed relationship between those two concepts is worth recording.
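Recording such a relationship, and then using it to line up observations, might look something like the following sketch. Everything here is hypothetical: the relation name `is_application_of`, the country names, and the scores are invented for illustration and do not come from the CCP or MAR data.

```python
from __future__ import annotations

# Hypothetical crosswalk: (source scheme, source concept, relation,
# target scheme, target concept). The relation label is an assumption.
CROSSWALK = [
    ("CCP", "constraints on party formation", "is_application_of",
     "MAR", "political discrimination"),
]

# Invented country-year observations, keyed by (country, year).
CCP_DATA = {
    ("Freedonia", 2000): {"constraints on party formation": 1},
    ("Sylvania", 2000): {"constraints on party formation": 0},
}
MAR_DATA = {
    ("Freedonia", 2000): {"political discrimination": 3},
}


def linked_observations(ccp_concept: str) -> list[tuple]:
    """Pair CCP values with MAR values for concepts the crosswalk links.

    Only country-years present in both datasets are returned.
    """
    rows = []
    for src_scheme, src, _relation, tgt_scheme, tgt in CROSSWALK:
        if (src_scheme, src, tgt_scheme) != ("CCP", ccp_concept, "MAR"):
            continue
        for key, ccp_values in sorted(CCP_DATA.items()):
            if ccp_concept in ccp_values and tgt in MAR_DATA.get(key, {}):
                country, year = key
                rows.append(
                    (country, year, ccp_values[ccp_concept], tgt, MAR_DATA[key][tgt])
                )
    return rows


if __name__ == "__main__":
    for row in linked_observations("constraints on party formation"):
        print(row)
```

The crosswalk is kept separate from both datasets, so another researcher could contest or replace the presumed relationship without touching either set of observations.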
Concept Formation and Enrichment The basic insight from this discussion thus far is that we live in a world in which multiple scholars analyze roughly similar phenomena with roughly similar concepts. Any connections between these scholars’ works can be inferred, but the connections are rarely made explicit. This state of affairs is nothing new, though with the ever increasing mountain of information, it may be increasingly frustrating. However, it is useful to imagine a very different world, if only as a thought experiment. This would be a world in which researchers could see, understand, and connect to another researcher’s conceptual schema (or ontology). Understanding how one’s concepts fit with another’s concepts is tremendously valuable and enlightening in and of itself, as I will try to elucidate. However, building bridges among concepts has other, perhaps even more powerful, downstream effects. If I can connect my concept to another, I can then connect my observations about that concept’s manifestations (that is, my data) to those of another researcher. Mapping concepts allows us to combine data creatively, which is critical for the analysis of measurement and hypothesis testing. Imagine, following Sartori’s guidelines of concept analysis, one wished to formalize understandings of a concept. That is, for example, to map the semantic field, identify defining and elective properties, and point to any important dimensions or classes of the concept. Enumerating these three tasks is one thing, but how exactly are two researchers to construct such knowledge in a way that will map on to that of other researchers? And, more to the point, why would they do so? Of course, one could formalize one’s
concepts tabularly, the raw and basic form of such information. Imagine rows of concepts crossed with columns of information listing sub-concepts, properties, synonyms, antonyms, etc. The result is a table of information about concepts that any data scientist can make sense of. Of course, no one wants to stare at a data table of this sort. But analytic and visualization software applications can render such information in interesting ways. Indeed, the many ways that creative users and their applications will use the data are – like data of any kind – limitless and ever evolving. And, of course, one would like to combine that table with other tables, linked by any of the items in the data.
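A minimal sketch of such a table, assuming column names of my own choosing (`concept`, `parent`, `synonyms`, `antonyms`) and a handful of invented entries, might look as follows. It uses only the Python standard library and is meant to show the shape of the information, not any project’s actual schema.

```python
from __future__ import annotations

import csv
import io

# Illustrative rows only; the column names and entries are assumptions.
CONCEPT_TABLE = [
    {"concept": "right to privacy", "parent": "rights",
     "synonyms": "privacy right", "antonyms": ""},
    {"concept": "presidentialism", "parent": "form of government",
     "synonyms": "presidential system", "antonyms": "parliamentarism"},
]


def to_csv(rows: list[dict]) -> str:
    """Render the concept table as CSV text that any tool can read."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def lookup(rows: list[dict], term: str) -> dict | None:
    """Find a row by its label or by any semicolon-separated synonym."""
    wanted = term.strip().lower()
    for row in rows:
        labels = [row["concept"]] + row["synonyms"].split(";")
        if wanted in (label.strip().lower() for label in labels if label.strip()):
            return row
    return None


if __name__ == "__main__":
    print(to_csv(CONCEPT_TABLE))
    print(lookup(CONCEPT_TABLE, "Presidential system"))
```

Because synonyms sit in their own column, a second researcher’s table can be matched against this one on either the preferred label or an alternative name.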
One Promising Solution: Linked Open Data One such mapping solution is ‘linked open data’, sometimes summarized as the ‘Semantic Web’, and more generally, ‘Web 3.0’. It is a data structure that is increasingly common on the web, and it is the structure that makes possible the data panels and carousels that search engines now regularly surface above search results. Social science data are increasingly available in this format. As it happens, one of the first data projects to employ this structure in the social sciences was the CCP, and its indexed constitutional repository (Constitute). Both projects incorporate a systematized ontology along with their data (CCP Scholars 2014).
Linked Open Data in Action The progress of the CCP, again, illustrates some of the benefits and challenges of this structure. The core intellectual product of the CCP is a set of data on some 600 or so characteristics of the world’s constitutions (and revisions to those constitutions) since 1789. In 2013, CCP Scholars partnered with Google Ideas (now Jigsaw) to leverage these data in order to build a public repository of constitutional texts currently in
force. Importantly, the texts would be indexed by some 300 topics and made available through an accessible interface. The goal was to allow constitutional drafters to call up a set of representative excerpts on any provision of interest. Since 2013, the site has received some 5,000 visitors a day and serves constitutional drafting teams throughout the world. So, think of CCP as a standard tabular dataset on the content of constitutions. Across over 900 constitutional systems, CCP Scholars have coded 600 or so characteristics. People often confuse Constitute and CCP. Constitute exhibits the repository of texts (currently, only those in force), which are indexed by 300 or so topics. CCP is a dataset of codings about these texts and more (in fact, all constitutions and their amendments since 1789). If one wanted a quantitative sense of which country has what and when, one would consult CCP data. If one wanted to see the text associated with some topic, one would consult Constitute. The data files for CCP consist of a recognizable format used in the social sciences since the days of punch cards – a matrix of rows representing constitutional documents against columns of constitutional attributes. The data for Constitute, however, take a very different form built for the online environment – one that could be easily accessed by human beings and machines. Much more on that format shortly. But first, consider some early dividends of this data strategy.
From Linked Open Data to OneBoxes and Knowledge Panels As of noon (PDT) on September 17, 2015 (Constitution Day), a Google search on ‘us constitution’ returns the 4,500-word US Constitution on a card at the top of the search results (and a 3×5-inch card at that, at least on most screens). A drop-down menu on the card (a ‘one-box’, in Google-speak) allows the reader to jump through sections. The text and data on these cards comes directly from online Constitute data.
The US Constitution on an index card is, quite literally, a small thing, but something that represents a huge advance for information science, and the social scientists who depend on such. The data for these one-boxes, knowledge panels, and other infographics are stored in Google’s ‘knowledge graph’, a curated set of highly connected and machine-readable data (‘linked data’). Google began delivering results from the knowledge graph in 2012, but the concept of linked data had been simmering for some time: Tim Berners-Lee, known to many as the ‘inventor’ of the World Wide Web, was championing it as early as 2009 as the heart of his Web 3.0. Berners-Lee runs an organization devoted to building and standardizing the relevant technology. Linked data is also identified sometimes as ‘graph data’ (highlighting its interconnectedness) and is a core part of what technologists describe as the ‘Semantic Web’. Linked data are simple to understand and their utility is immediately obvious. One key feature is that each data element, whether a concept or a concrete ‘thing’ (e.g., the US Constitution), has its own unique location on the web (http://something…). These locations (Uniform Resource Identifiers, or URIs) are not necessarily web pages (the familiar Uniform Resource Locators, or URLs) that human beings read, but places where data reside for machines. Each of these entities is linked to other entities through some relationship, a relationship which itself is labeled with a unique URI. So, a typical linked data file comprises seemingly endless lines of subject-predicate-object ‘triples’, each of whose elements is a distinct URI. For example, http://constitute/constitution/sudan2005/article2 http://constitute/hastopic http://constitute/torture is one of many triples in the Constitute dataset. Linked data files in the N-Triples serialization have the suffix ‘.nt’. This particular triple tells us that Article 2 of the Sudanese Constitution deals with the topic of torture.
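To make the structure of a triple concrete, here is a minimal sketch that writes and reads one line in the N-Triples style (three URIs in angle brackets, closed by a period). It uses the chapter’s illustrative, non-resolvable addresses, handles only URI terms (no literals or blank nodes), and is a toy reader rather than a conforming parser.

```python
from __future__ import annotations

from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as an N-Triples line: <s> <p> <o> ."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


def parse_ntriples_line(line: str) -> Triple:
    """Parse a line made of exactly three <URI> terms and a closing period."""
    parts = line.strip().rstrip(".").split()
    if len(parts) != 3 or not all(p.startswith("<") and p.endswith(">") for p in parts):
        raise ValueError(f"not a three-URI triple: {line!r}")
    return Triple(*(p[1:-1] for p in parts))


# The example from the text; these addresses are placeholders, not live URIs.
SUDAN_TORTURE = Triple(
    "http://constitute/constitution/sudan2005/article2",
    "http://constitute/hastopic",
    "http://constitute/torture",
)

if __name__ == "__main__":
    line = to_ntriples(SUDAN_TORTURE)
    print(line)
    print(parse_ntriples_line(line) == SUDAN_TORTURE)
```

In practice one would rely on an established RDF library rather than hand-rolled string handling; the point here is only that each statement is a self-contained, three-part line.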
It should be clear that the Constitute dataset alone would have other
Figure 19.1 Results from a Google search: ‘us constitution’
links to each of these entities (that is, ‘Article 2’, ‘Sudanese Constitution’, and ‘torture’). As you can imagine, data files of this sort are utterly forbidding to browse directly. Editing and visualization tools allow analysts to work with and understand various relationships in these files. But regardless, the verbose code is extraordinarily intelligible to machines, which matters to us too. The beauty of linked data is that these ‘things’ (concepts, properties, data elements, etc.) can be linked to an infinite number of other things and concepts (hence the graph analogy, which refers to a network graph). And every ‘thing’ lives at a unique address on the World Wide Web. The consequence is that machines and their human analysts can draw connections easily and, as network ties grow, exponentially. Consider, again, the Constitution-on-index-card idea, which represents a very simple use of linked data. To produce these cards, Google’s ‘knowledge graph’ queries data on the world’s constitutions that our website Constitute makes available as linked data on a
SPARQL endpoint (a data hub that machines can consume). Google can then program its search engine to reproduce the textual data as an info-box with the text indexed by the section headers, which are also identified in the Constitute data. It is a simple application, but one could exploit the knowledge graph for more. Imagine a box that lists provisions on ‘cruelty’ in Constitutions from countries about which human rights organizations have made allegations of torture. Linked data on all of those elements exist; one needs only to put together a few lines of query-code to get the list.
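The logic of such a query can be sketched without any special tooling. The Python below stands in for what would, in practice, be a short SPARQL query sent to an endpoint: it matches triple patterns against a tiny in-memory graph and joins the results. The `ex:` identifiers, predicate names, countries, and the allegation record are all invented for illustration; none of them are drawn from Constitute or from any human rights dataset.

```python
from __future__ import annotations

from typing import Iterator

# A toy graph of (subject, predicate, object) triples. All names are hypothetical.
GRAPH = {
    ("ex:freedonia2001/art5", "ex:partOf", "ex:freedonia2001"),
    ("ex:freedonia2001/art5", "ex:hasTopic", "ex:cruelty"),
    ("ex:freedonia2001", "ex:constitutionOf", "ex:Freedonia"),
    ("ex:sylvania1999/art9", "ex:partOf", "ex:sylvania1999"),
    ("ex:sylvania1999/art9", "ex:hasTopic", "ex:cruelty"),
    ("ex:sylvania1999", "ex:constitutionOf", "ex:Sylvania"),
    ("ex:report42", "ex:allegesTortureIn", "ex:Freedonia"),
}


def match(graph: set, s: str | None = None, p: str | None = None,
          o: str | None = None) -> Iterator[tuple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def cruelty_provisions_where_torture_alleged(graph: set) -> list[tuple]:
    """Join provisions on 'cruelty' to countries named in torture allegations."""
    accused = {country for _, _, country in match(graph, p="ex:allegesTortureIn")}
    results = []
    for provision, _, _ in match(graph, p="ex:hasTopic", o="ex:cruelty"):
        for _, _, constitution in match(graph, s=provision, p="ex:partOf"):
            for _, _, country in match(graph, s=constitution, p="ex:constitutionOf"):
                if country in accused:
                    results.append((country, provision))
    return sorted(results)


if __name__ == "__main__":
    print(cruelty_provisions_where_torture_alleged(GRAPH))
```

The join works only because the constitutional data and the allegation data refer to the country by the same identifier, which is precisely the kind of shared address that linked data is meant to supply.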
Ontology Enrichment
But perhaps one of the most interesting aspects of the Semantic Web is its utility for concept formation and enrichment. In particular, Semantic Web methods have the distinct advantage of allowing researchers to share and collaborate on conceptual ‘schema’, or more
exactly, ‘ontologies’. With tools developed for the Semantic Web, one can edit a dataset’s ontology to update or expand any of this conceptual structure. So, for example, consider the 300 or so topics that are in the Constitute Ontology. These topics reflect some basic categories in the CCP dataset. CCP scholars developed a three-level hierarchy to organize the concepts – see Figure 19.2, a snapshot that shows the arrangement of just a few of these concepts on www.constituteproject.org. In this example, the first-level topic, ‘Culture and identity’, is expanded to depict several sub-topics, under which ‘Indigenous Groups’ is expanded to show a set of constitutional provisions including ‘Indigenous right not to pay taxes’. But, to return to conceptual coordination, this categorization is only one view of constitutional elements and by no means suitable or useful for all users. For example, none of the topics has ‘women’ in its label. There are
certainly topics that are related to women in our taxonomy; for example, inheritance laws and marriage equality. These are important concepts closely related to the status of women. Another researcher might have included them under the category ‘women’ or ‘gender’. So, how would one integrate this new category? In the case of Constitute, CCP scholars introduced the keyword ‘women’ and linked each relevant topic to that concept. So now if one types ‘women’ in the search bar, Constitute auto-suggests each of the topics related to that concept (see Figure 19.3). Adding keywords essentially allows Constitute to integrate different conceptualizations and, as such, enrich the underlying ontology. What this means practically is that one can access constitutional texts and data associated with concepts other than those stipulated by the CCP.
Figure 19.2 A snapshot of Constitute’s topic tree
Figure 19.3 Entering ‘women’ in Constitute’s search box triggers topics
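The keyword linking described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Constitute’s actual implementation: the predicate names `subtopicOf` and `isKeywordFor`, the parent topics ‘family’ and ‘elections’, and both function names are hypothetical.

```python
# Topic hierarchy, predicate names, and parent topics are assumptions
# made for this sketch; they do not reproduce the Constitute ontology.
def add_keyword(triples, keyword, topics):
    """Return a new set of triples in which `keyword` is linked to each topic."""
    return set(triples) | {(keyword, "isKeywordFor", topic) for topic in topics}


def suggest(triples, keyword):
    """List the topics that a search for `keyword` would surface, alphabetically."""
    return sorted(o for s, p, o in triples if s == keyword and p == "isKeywordFor")


ontology = {
    ("inheritance laws", "subtopicOf", "family"),
    ("marriage equality", "subtopicOf", "family"),
    ("gender quotas", "subtopicOf", "elections"),
}

# Before enrichment, 'women' appears nowhere in the ontology.
before = suggest(ontology, "women")

# Enrichment adds one triple per relevant topic and leaves the hierarchy intact.
enriched = add_keyword(
    ontology, "women", ["inheritance laws", "marriage equality", "gender quotas"]
)

print(before)
print(suggest(enriched, "women"))
```

Because enrichment only adds triples, the original three-level categorization survives unchanged alongside the new conceptualization.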
Tools for Editing and Viewing Ontologies
Enriching an ontology means, essentially, adding another data triple (subject-predicate-object). For example:
[Women] – [is a keyword associated with] – [gender quotas]
These relationships are encoded in a standardized ontology language (actually, the Web Ontology Language, or OWL) that is not fit for human editing or viewing (imagine lines and lines of code similar to the .nt files described above). Fortunately, rudimentary but powerful tools have evolved to allow for editing and viewing ontologies. The design of these tools is, of course, not a trivial matter. Working with a classification system could, in theory, be a delightful exercise. It could also be an organizational chore made even more painful by a clunky experience. Much of the success of the Semantic Web, I am afraid, depends upon these design issues.
Conclusion and Discussion
The practitioner of any profession or pastime faces an existential crisis at some point. After all, there are many good ways to spend one’s time. Accordingly, one wonders about the role of the conceptualist in the discipline and, relatedly, the value of conceptual work, especially the kind of conscious conceptual work that is not in direct service to a substantive research question. Undoubtedly, Linnaeus, Webster, and the authors of the DSM have had an enormous influence on the rest of us, at least in terms of what we call things, if not in terms of how we think about them. But one sometimes worries that what they (or we) are doing is ‘just’ semantic. One recalls Popper’s warning: ‘Never let yourself be goaded into taking seriously problems about words and
their meanings’ (1976: 19). And indeed, why talk so much about naming things, and not more about the things themselves, after all? Sartori himself (1975) was sensitive to the impatience that some might have with the methodologist who is ‘endlessly entangled in preliminaries and never gets to work’, something like C. Wright Mills’ ‘over-conscious’ scholar. Thus, one wonders whether to recommend to young scholars that they pick up, say, Sartori’s (1984) Social Science Concepts. Philippe Schmitter (2009) has wondered the same. In his wonderfully titled address ‘The Confessions of a Repeat Offending and Unrepentant Conceptualist’, he presents some well-meaning job advice:
For those of you who are just starting in the profession, your career prospects are not great should you choose this line of work. No department or faculty that I know of has a designated slot for a ‘conceptualist’. (Schmitter, 2009)
And yet, it often seems that those who have fretted about language, and who have invented a category and a term that captures some phenomenon, some group, or some cluster of attributes so vividly that the name has stuck, are exactly those who have made an outsized contribution. Schmitter is in good company. Indeed, for some, a self-conscious approach to names and meaning is at the very heart of science and knowledge. Recall our introductory premise that thought and words are interdependent. Or choose your favorite aphorism, whether it is the psychologists’ quip, ‘how do I know what I think if I haven’t said it yet?’ or the classic Latin formulation: nomina si nescis, perit et cognitio rerum (if one does not have the names, there is no knowledge of things) (Sartori, 1975 [2009]: 68). The overriding sense of those who have taken the time to organize and represent knowledge through conscious conceptual work is that the exercise is enlightening. Perhaps Weber said it best: ‘The history of the social sciences is and remains a continuous process passing from the attempt to order
reality analytically through the construction of concepts’ (1905 [1949]: 105). If so, what is the next step for the next Sartori? The basic question remains, always: how do we formalize our conceptual map of the world and, then, compare this map with that of others? Roughly two generations of political scientists have honed the craft of concept analysis. What remains for future generations is to formalize these understandings in ways that allow for clearer knowledge representation and inter-scholar coordination. Much of this work will need to go on within particular domains. Within constitutional law, I describe a promising solution that information scientists have developed in recent years. The solution provides a standardized infrastructure for mapping concepts. One significant advantage of this approach is that it has become the industry standard for online data environments. It therefore facilitates the consumption and exportation of data to and from an extraordinarily wide set of sources. Indeed, the solution has already paid dividends. Google and other search engines are already interacting productively with the social scientists who have disseminated data in this form. But I see other clear dividends as well. One is the systematic description of the conceptual map of any given concept, that is, knowledge representation. In any given inquiry, researchers are at various points interested in understanding the semantic relationships among concepts, particularly in the presentation and development of theory. Another, perhaps more immediately appreciated, dividend has to do with the retrieval of data. Connecting concepts allows one to find and surface data from associated concepts. This step is critical in any sort of empirical testing and can be important in crafting original empirical designs that merge data from different sources and domains. And it is this connection to concrete phenomena – i.e., data – which so entrances most of us about concepts.
After all, their beauty lies in their ability to represent the world.
Notes
1 Originally named the Committee on Conceptual and Terminological Analysis (COCTA), the committee – still as Research Committee-01 – was rechristened the Committee on Concepts and Methods (C&M).
2 By the way, anyone teaching concept analysis will want to peruse Gary Goertz’s list of classroom concept exercises, available from Goertz upon request (see Goertz, 2006).
3 It is a convention to letter concepts in small caps to distinguish them from perceived entities.
4 One of the initiatives of the Committee on Concepts and Methods is an online dictionary, of sorts, of ‘intranslatables’.
5 BBC radio broadcast made on April 29, 1937 entitled ‘Craftsmanship’ (part of a series called ‘Words Fail Me’) and later published in The Death of the Moth and Other Essays (1942).
References
American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders, 5th ed. (DSM-5). Arlington, VA: American Psychiatric Association.
Aristotle. 1995. Politics. Translated by Ernest Barker and revised by Richard Stalley. Oxford: Oxford University Press.
APSA-CP. 2009. ‘Concepts that hinder understanding … and what to do about them’. APSA-CP Newsletter 20/2: 5–18.
Boyd, David R. 2011. The Environmental Rights Revolution: A Global Study of Constitutions, Human Rights, and the Environment. Vancouver: UBC Press.
Brinks, Daniel M. and Abby Blass. 2018. The DNA of Constitutional Justice in Latin America: Politics, Governance, and Judicial Design. New York: Cambridge University Press.
Calise, Mauro and Theodore J. Lowi. 2010. Hyperpolitics: An Interactive Dictionary of Political Science Concepts. Chicago and London: The University of Chicago Press.
Carrubba, Clifford J., Mathew Gabel, Gretchen Helmke, Andrew Martin, and Jeffrey K. Staton. 2012. ‘An Introduction to the CompLaw Database’.
Cichowski, Rachel and Elizabeth Chrun. 2017. European Court of Human Rights Database, Version 1.0 Release 2017. Online at http://depts.washington.edu/echrdb/
Cingranelli, David L., David L. Richards, and K. Chad Clay. 2014. ‘The CIRI Human Rights Dataset’. Online at http://www.humanrightsdata.com. Version 2014.04.14.
Collier, David and John Gerring (eds). 2009. Concepts and Method in Social Science: The Tradition of Giovanni Sartori. New York: Routledge.
Collier, David and Steven Levitsky. 1997. ‘Democracy with adjectives: Conceptual innovation in comparative research’. World Politics 49(3): 430–451.
Collier, David and James E. Mahon. 1993. ‘Conceptual “stretching” revisited: Adapting categories in comparative analysis’. American Political Science Review 87(4): 845–855.
Constitution of Cape Verde. 1980 [2010]. Online at constituteproject.org
Elkins, Zachary, Tom Ginsburg, and James Melton. 2005 [2019]. Comparative Constitutions Project. Online at http://comparativeconstitutionsproject.org
Elkins, Zachary, Tom Ginsburg, James Melton, Robert Shaffer, Juan F. Sequeda, and Daniel P. Miranker. 2014. ‘Constitute: The world’s constitutions to read, search, and compare’. Web Semantics: Science, Services and Agents on the World Wide Web 27–28: 10–18.
Garoupa, Nuno and Tom Ginsburg. 2009. ‘Guarding the guardians: Judicial councils and judicial independence’. American Journal of Comparative Law 57(1): 103–134.
Gerring, John. 1997. ‘Ideology: A definitional analysis’. Political Research Quarterly 50(4): 957–994.
Gerring, John. 2001. Social Science Methodology: A Criterial Framework. New York: Cambridge University Press.
Ginsburg, Tom and James Melton. 2015. ‘Does the constitutional amendment rule matter at all? Amendment cultures and the challenges of measuring amendment difficulty’. International Journal of Constitutional Law 13: 686–713.
Goertz, Gary. 2006. Social Science Concepts: A User’s Guide. Princeton University Press.
Gould, Stephen Jay. 1992 [2010]. Bully for Brontosaurus: Reflections in Natural History. W.W. Norton and Company.
Gurr, Ted Robert. 1995. Minorities at Risk – A Global View of Ethnopolitical Conflicts. Washington, DC: United States Institute of Peace.
Harutyunyan G., G. Vahanyan, and V. Bleyan. ‘Electronic Dictionary of Basic Constitutional Concepts’. Online at http://www.concourt.am/armenian/legal_resources/world_constitutions/index.htm. Accessed January 7, 2020.
Harvey, David. 2008. ‘The right to the city’. New Left Review 53: 23–40.
Johnson, James. 2003. ‘Conceptual problems as obstacles to progress in political science: Four decades of political culture research’. Journal of Theoretical Politics 15(1): 87–115.
Keck, Thomas. 2015. ‘Comparative Free Speech Jurisprudence’. NSF [National Science Foundation] Project Description.
Koenig, Matthias and Kiyoteru Tsutsui. 2019. ‘Minority Rights and National Constitutions’. Presented at the annual meetings of the American Sociological Association, New York.
Lambert, Priscilla A. and Druscilla L. Scribner. 2009. ‘A politics of difference versus a politics of equality: Do constitutions matter?’ Comparative Politics 41(3): 337–357.
Lepore, Jill. 2006. ‘Noah’s mark’. The New Yorker, November 6.
Lindberg, Staffan I., Michael Coppedge, John Gerring, and Jan Teorell. 2014. ‘V-Dem: A new way to measure democracy’. Journal of Democracy 25(3): 159–169.
Linnaeus, Carolus. 1735. Systema naturæ, sive regna tria naturæ systematice proposita per classes, ordines, genera, & species. Vol. 1. Leiden: Theodor Haak.
Loemker, Leroy (ed. and trans.). 1969. Leibniz: Philosophical Papers and Letters. Synthese Historical Library. Dordrecht: D. Reidel.
Mills, C. Wright. 1959. ‘On Intellectual Craftsmanship’. In Llewellyn Gross (ed.), Symposium on Sociological Theory. New York: Harper and Row.
Murphy, Gregory L. 2004. The Big Book of Concepts. Boston, MA: MIT Press.
Popper, Karl. 1976. Unended Quest: An Intellectual Autobiography. La Salle, IL: Open Court.
Ríos-Figueroa, Julio and Jeffrey K. Staton. 2012. ‘An evaluation of cross-national measures of judicial independence’. Journal of Law, Economics, and Organization 30(1): 104–137.
Rosch, Eleanor H. 1973. ‘Natural categories’. Cognitive Psychology 4(3): 328–350.
Rosen, Jeffrey. 2012. ‘The right to be forgotten’. Stanford Law Review 64: 88.
Sartori, Giovanni. 1970. ‘Concept misformation in comparative politics’. American Political Science Review 64(4): 1033–1053.
Sartori, Giovanni. 1975 [2009]. ‘The Tower of Babel’. In Giovanni Sartori, Fred Riggs, and Henry Teune (eds), Tower of Babel: On the Definition and Analysis of Concepts in the Social Sciences. International Studies Association, Occasional Paper No. 6, University of Pittsburgh. [Reprinted in Collier and Gerring (2009)].
Sartori, Giovanni (ed.). 1984. Social Science Concepts: A Systematic Analysis. Vol. 1. Beverly Hills/London: Sage.
Schmitter, Philippe. 2009. ‘The Confessions of a Repeat Offending and Unrepentant Conceptualist’. Address at the Mattei Dogan Prize Ceremony, IPSA World Congress, Santiago de Chile. July 15, 2009. Online at https://www.eui.eu/Documents/DepartmentsCentres/SPS/Profiles/Schmitter/IPSATalk2009.pdf. Accessed January 7, 2020.
Shugart, Matthew Soberg and John M. Carey. 1992. Presidents and Assemblies: Constitutional Design and Electoral Dynamics. New York: Cambridge University Press.
Titus, Charles H. 1931. ‘A nomenclature in political science’. American Political Science Review 25(1): 45–60.
van Maarseveen, Henc and Ger van der Tang. 1978. Written Constitutions: A Computerized Comparative Study. Leiden: Brill.
Versteeg, Mila. ‘Cross-national data on Constitutional Rights’.
Weber, Max. 1905 [1949]. The Methodology of the Social Sciences. Glencoe, IL: Free Press.
Weyland, Kurt. 2001. ‘Clarifying a contested concept: Populism in the study of Latin American politics’. Comparative Politics 34(1): 1–22.
Wimmer, Andreas, Lars-Erik Cederman, and Brian Min. 2009. ‘Ethnic politics and armed conflict: A configurational analysis of a new global dataset’. American Sociological Review 74(2): 316–337.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. New York: Macmillan.
Woolf, Virginia. 1942. Death of the Moth and Other Essays. London: Hogarth Press.
Women’s Link Worldwide. 2019. Gender Justice Observatory. Online at https://www.womenslinkworldwide.org/en/gender-justice-observatory/court-rulings-database. Accessed January 7, 2020.
World Bank. 1978. World Development Indicators. Online at https://datacatalog.worldbank.org/dataset/world-development-indicators. Accessed January 7, 2020.
20 Configurative Methods Claudius Wagemann
1. Introduction
First, it is necessary to gain a closer understanding of what ‘configurative methods’ are, since the term is not very frequently used in social science methodology. We can most easily refer to Rihoux and Ragin (2009a), which has a very similar title (Configurational Comparative Methods). When looking at the single contributions to that volume, it becomes clear that most of the chapters¹ deal with a specific method, which has also become known as Qualitative Comparative Analysis (QCA) and is mainly connected to Ragin’s widely read contributions (Ragin, 1987, 2000, 2008; see also Schneider and Wagemann, 2012). While Rihoux and Ragin (2009b: xix) even explicitly refer to QCA in all its variants² when speaking about Configurative Methods, others apply a broader perspective and see them as equivalent to Boolean-based approaches in general (Thiem et al., 2016: 765). They include ‘all case-study methods that are
based on Boolean-algebraic principles’ (Thiem et al., 2016: 767, endnote 3) such as – again – QCA, but also Coincidence Analysis, Event-Structure Analysis, etc. This chapter follows Rihoux and Ragin (2009b: xviii) and understands Configurative Methods as methods of ‘systematic cross-case comparisons, while at the same time giving justice to within-case complexity, particularly in small- and intermediate-N research designs’. This definition is part of how QCA can be seen (see Schneider and Wagemann, 2012: 8ff.), but can also be extended to other methods, as will become clear later in this chapter. Admittedly, QCA applications have recently come to include a growing number of large-N studies (see the bibliographic overviews in Buche and Siewert, 2015; Rihoux et al., 2013; Wagemann et al., 2016), which has led to a discussion about this further diversification of QCA approaches (Fiss et al., 2013; Greckhamer et al., 2013). However, this amplification of the perspective is still
connected to the basic principles of configurative analysis that had been developed in the early years of QCA methods so that, nowadays, a definition makes less explicit reference to the N of a study (although the diffusion of large-N studies may have some impact on the case-orientation of Configurative Methods, as Wagemann et al. (2016) empirically demonstrate). Thus, for our purposes, the term Configurative Methods mainly refers to systematic cross-case comparisons with a strong case-orientation and a recognition of case complexity. In this sense, more general principles and logics of Configurative Methods are presented in section 2, before QCA as the most prominent variant is introduced with more detail in section 3. However, since QCA covers only a part of the universe of configurative methods, an attempt will be made in section 4 to see in which other methods configurative elements can be identified. Finally, in the conclusion (section 5), a brief reflection on current challenges for configurative methods and their position within the world of social science methods will be presented.
2. Principles and General Logic of Configurative Methods
2.1 Components of Configurative Methods³
As the term ‘configurative methods’ indicates, we first have to clarify what ‘configurations’ are. This is also called for since a methodology should necessarily conform to the ontology of one’s thinking (Hall, 2003). Or, in other words, if a researcher sees the social world as full of configurations or – and this would be more epistemologically inspired – thinks that the social world is best analysed in terms of configurations, then (s)he should also choose those methods which take the issue of configurations seriously.
At the heart of a configuration lies the idea that cases are made up of, and can be portrayed in terms of, their properties. These properties are usually highly standardized notions of concepts, in the sense of Sartori’s view on concepts as data containers. Ragin (2000: 64) himself proposes to ‘[study] cases as configurations’ as one of the main characteristics of QCA. Elsewhere he goes beyond this purely descriptive use of configurations and adds a causal component: ‘[understanding] causally relevant conditions as intersections of forces and events’ (Ragin, 2008: 109). In other words, cases as they exist in the real world are split into their constituent entities. Such entities can be defined through the components of typologies. When Lijphart (1999) organizes political systems along ten criteria, then every single (country) case can be grouped with regard to every one of these criteria. A country is then no longer called ‘Great Britain’, but is described as a parliamentary system with a (usually) single-party government, without a written constitution and without a strong central bank, whose electoral system is majoritarian. Yet another country, ‘Germany’, is also no longer called by its proper name, but is described using the same elements (which take the same value as Great Britain for the category of a parliamentary system, but different values for the other elements just mentioned). Such a procedure corresponds very closely to the original Ancient Greek meaning of ‘analysis’, namely to divide a whole into its constituent parts. The attempt to describe Great Britain and Germany with the help of Lijphart’s (1999) criteria shows something else: it may sometimes not be easy to decide whether a country (or, more generally, a case) belongs to a category or not. In the British case, the components of federalism and, increasingly, a two-party versus multi-party system are not easy to decide. In the German case, the electoral system is mainly proportional, but with strong majoritarian elements.
Thus, we have to differentiate between an ‘ideal type’ and a real case. Lazarsfeld (1937) has solved this very nicely by presenting a ‘property space’ where the cases could be located with regard to all their defining properties. If it could not be decided whether a case should be described with a given property or not, then it was placed somewhere close to that property, but not exactly with that property. In other words, cases were set into relation to various ideal types. Fuzzy sets (as we will later see, section 3) take up this idea of only partial belongingness of cases to ideal types. Thus, although a view on cases as configurations of properties and therefore as a combination of single elements risks contradicting the plea for holism which Ragin has made elsewhere (Ragin, 1987: 52, 85, 160, 166), we are still asked ‘to think holistically and to understand causally relevant conditions as intersections of forces and events’ (Ragin, 2008: 109). In the end, this means that we should be able ‘to investigate cases both as wholes and as parts’. However, configurative methods do not stop there. The goal is not (only) to arrive at typologies, although explanatory typologies or typological theories (Bennett and Elman, 2006; Elman, 2005) and ‘fuzzy-set Ideal Type Analysis’ (fsITA) (Kvist, 2007) are based on the same principles. Rather, the focus is on causal analysis. Rihoux and Ragin (2009b: xix; italics in the original) indeed define a configuration as the ‘specific configuration of factors ([…] we call these conditions in CCM terminology) that produces a given outcome of interest’. Following this, thinking about configurations always implies a reflection about what is implied (or ‘caused’) by these configurations. The central idea is that some configurations or patterns are linked to a given outcome, while other configurations of the same elements are not. The important aspect is that this is postulated about configurations, and not about individual factors. Neither can individual factors alone be considered as ‘causes’ nor is their aggregation automatically of any help.
If factor A cannot explain an outcome, and factor B cannot either, it can still be possible that a combination of
A and B (written as A*B or AB in Boolean algebra notation) can instead. One factor may need the presence of another factor in order to lead to a result. Neither the victory of the opposition party in an election nor the fact that electoral results are respected by the ruling party can alone overthrow a ruling government. Only if both elements occur must a ruling government leave office (needless to say, other components also have to be added). Configurative thinking in this sense seems to correspond especially well to the social sciences. Social phenomena are usually complex, and nobody would assume that one factor alone will imply stable peace agreements or a solution for global climate change. Nor does it correspond to our worldviews (and this is where the discussion on ontology comes in again) to think that factors work in an additive way, that is, that each one adds a bit more to the explanation of the observed variance. If we have an ontology which sees the social world as complex, causal processes as influenced by the coincidence of several factors, and causal outcomes as the result of highly complex intersections of causes, then we already think in configurations and need a method which takes this perspective into account. However, this alone does not yet represent the full range of what can be understood as a configuration. Two more characteristics are usually added for what is called ‘causal complexity’⁴ (Ragin, 2000: 88; Schneider and Wagemann, 2012: 78). One of them – equifinality – enlarges the notion of a configuration beyond a purely conjunctural view, which exists if the term ‘configuration’ is just used to describe the coincidence and the simultaneous presence or absence of more than one factor in order to arrive at a causal formula. Equifinality refers to those explanations where other, different configurations of the same or other factors are accepted as equally valid explanations.
This takes into account the fact, for example, that combinations of success strategies for centre-left political parties may not explain the success of
centre-right parties. Once all different combinations of strategy factors are identified, all of them can count as explanations, although for different cases or case types.⁵ This seems to correspond very well with many researchers’ ontology. There is no reason why more than one explanation could not be found to account for the phenomena to be explained. Configurative methods very often show a third characteristic, in addition to conjunctions and equifinality, namely asymmetry. However, there is no reason why asymmetry should be connected to the idea of configurations. The idea that the explanation of a positive outcome should not be used to explain a negative outcome (a central claim for asymmetry, Schneider and Wagemann, 2012: 81) is not part of the configurative logic. Instead, asymmetry is related to characteristics of set relations (see below). So, in brief, conjunctions, equifinality and asymmetry are important aspects of configurative methods. Before turning to QCA, we first look at the relations between configurative and other methods.
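The three characteristics just summarized can be illustrated with a small Boolean sketch in Python. It rests on loudly stated assumptions: the causal formula Y = A*B + C is hypothetical, the ‘cases’ are simply all eight logically possible combinations of three crisp conditions (real QCA works with empirical cases and limited diversity), and sufficiency is read as a subset relation, so that a condition counts as sufficient when every case showing it also shows the outcome.

```python
from itertools import product


def outcome(a, b, c):
    """Hypothetical causal formula Y = A*B + C (an assumption for illustration)."""
    return (a and b) or c


def is_sufficient(condition, result, cases):
    """A condition is sufficient if every case that shows it also shows the result."""
    return all(result(case) for case in cases if condition(case))


# All logically possible combinations of the crisp conditions A, B, and C.
cases = list(product([False, True], repeat=3))


def y(case):
    return outcome(*case)


def not_y(case):
    return not outcome(*case)


def a_alone(case):
    return case[0]


def a_and_b(case):
    return case[0] and case[1]


def c_alone(case):
    return case[2]


def not_a_and_b(case):
    return not (case[0] and case[1])


# Conjunction: A on its own is not sufficient, but the combination A*B is.
print("A sufficient for Y:", is_sufficient(a_alone, y, cases))
print("A*B sufficient for Y:", is_sufficient(a_and_b, y, cases))

# Equifinality: C is a second, equally valid path to the same outcome.
print("C sufficient for Y:", is_sufficient(c_alone, y, cases))

# Asymmetry: the absence of A*B does not explain the absence of Y,
# because C can still produce the outcome.
print("~(A*B) sufficient for ~Y:", is_sufficient(not_a_and_b, not_y, cases))
```

The last check is the set-relational point made above: knowing one sufficient path to the outcome says little about what explains its absence, which has to be analysed separately.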
2.2 Configurative Methods and Their Relations to Other Methods
Given the broad use of configurative methods,⁶ it is no wonder that they have become an established part of the methodological world. This also means, however, that we have to consider which position they occupy with regard to other methodological approaches and paradigms. Much of what has been said before (section 2.1) runs counter to mainstream quantitative methods. Although interaction effects in quantitative methods seem to be similar to conjunctures, configurations can be more easily assessed than complicated models with interaction terms (Bennett and Elman, 2006: 466). Similarly, cluster analysis seems to share some ideas with configurative methods. Nevertheless, configurative methods are substantially different from quantitative methods,
mostly because of the elements described above (for such a reasoning see Ragin, 2006; Thiem et al., 2016). While past discussions resulted in comparisons of one approach with the other, often attributing superiority to one of them (Seawright, 2005), a pluralist approach to methods has developed more recently, not least because of the attempt to convert the discussion into ‘tales’ and paradigms into ‘cultures’ (Goertz and Mahoney, 2012). In the same way, the mixed methods debate has shown that methods should be chosen because of their appropriateness for specific research questions and their ontological and epistemological premises. Configurative methods are usually grouped within the ‘qualitative’ realm of methods.⁷ In the case of QCA, this is even more evident, since the Q stands for ‘qualitative’. While earlier accounts saw QCA in a middle position between qualitative and quantitative methods (Ragin, 1987), more recent contributions firmly underline that it is more correct to classify QCA with the qualitative research tradition, as becomes manifest in set theory (Goertz and Mahoney, 2012),⁸ which is strongly connected to the idea of configurative reasoning (see section 3). The fact that the ‘qualitativeness’ of the procedure is embedded in set theory is important, since there are various understandings of what ‘qualitative methods’ actually are. Indeed, ‘the qualitative paradigm includes many divisions’ (Goertz and Mahoney, 2012: 4; for doubts about the dichotomy of qualitative and quantitative methods, see also Kühn and Rohlfing, 2016). Koivu and Kimball Damman’s (2015) presentation of four central variants clearly attributes a special role to what they call ‘empirical interpretivism’ (2624), not least because of this stream being based on different epistemological underpinnings (2619, 2624).
They even report that ‘the interpretivist epistemology is sometimes dismissed as not belonging within the field of qualitative methods’ (2624).⁹ While this is certainly not the place to shed further light on differences in the understanding of what qualitative methods are, it should be noted that many scholars who claim to work with qualitative methods do not include configurative methods when using the adjective ‘qualitative’. Indeed, there seems to be a very strong mutual ignorance of these variants of qualitative methods.¹⁰ Yet another variant of qualitative methods seems to be closer to configurative methods. Comparative ‘case study research’ (Gerring, 2007) has made important contributions to knowledge accumulation in the social sciences (Blatter and Haverland, 2012: 2ff.), but cannot be qualified as purely quantitative or qualitative-interpretive. Indeed, as Mahoney and Sweet Vanderpoel (2015) show, configurative thinking can also be found in case study research, although this does not always happen explicitly (see section 4). Therefore, it is not surprising that topics in case study research methodology and more formalized configurative approaches have recently been linked in the literature (Rohlfing and Schneider, 2018; Schneider and Rohlfing, 2013). The focus of these contributions initially has been on case selection, but this discussion shows the deep connections between conventional comparative case studies and configurative methods (see also Morlino, 2018: 93ff., who places the discussion of configurative methods within the broader spectrum of comparative case study methods). Therefore, configurative methods are close to other comparative case study methods. They are sometimes formalized in a similar way to quantitative methods, without wanting to replace them, and they do not share the epistemological assumptions of interpretive methods (see Carver, Chapter 24, this Handbook). Thus, they are methods in their own right.
3. QCA and Fuzzy Sets
It has already been mentioned that QCA is the ‘elephant in the room’, in the sense that this method can be seen as an elaborate,
highly systematized and broadly used variant of configurative methods. Therefore, a whole section is devoted to QCA, without, however, overlooking that there are other uses of configurations (see section 4). In older publications, ‘Fuzzy-Set Analysis’ is sometimes portrayed as a method of its own (e.g., Bennett and Elman, 2006: 468ff.), but, in the meantime, this is presented as a specific variant of QCA (fsQCA; Schneider and Wagemann, 2012). QCA originally goes back to Ragin’s (1987, 2000, 2008) influential writings.
3.1 Sets and Set Calibration
QCA works with sets and is usually presented as a set-theoretic method. Originally, Ragin framed his approach in terms of Boolean algebra (Ragin, 1987; see also Caramani, 2009), but this framing was relegated to the background in later works, and the set-theoretic underpinnings of QCA became more accentuated (Ragin, 2000, 2008; Rihoux and Ragin, 2009a; Schneider and Wagemann, 2012). This step was crucial, because it strengthened the linkages between QCA and the case-oriented research tradition (see section 4). In brief, in set-theoretic methods, the explanatory factors (‘conditions’) and the outcome to be explained are conceptualized in terms of sets. Unlike a measurement approach, in which countries would be attributed a value on how democratic they are, sets indicate the belongingness of a case to the set, such as, for example, the belongingness of a country to the set of all democracies. The process of assessing this belongingness is called ‘set calibration’ (for more on differences between measurement and calibration, see Ragin, 2013: 172) and leads to membership values of cases in sets. Often social science phenomena are characterized by dichotomies (presence or absence of a democracy, a war, a policy success, etc.), but these dichotomies can be empirically observable at different intensities. Two democracies can vary with
regard to the degree to which they belong to the set of democracies, that is, one country can be more democratic than another. The same holds for concepts such as war, policy successes, etc. In other words: when thinking about set memberships, the social sciences require an idea that allows us to capture both differences in kind (the dichotomy) and in degree (the grading). Fuzzy sets offer such a strategy. While being dichotomous in nature, they allow for different degrees of belongingness to sets. Fuzzy values between 0 and 1 indicate this degree numerically. A case with a fuzzy value of 0.8 in a given concept clearly belongs to the set described by that concept, but not perfectly. By contrast, a case with a fuzzy value of 0.4 does not belong to the set described by the concept (it is a difference in kind), but comes close to the threshold of 0.5 which would move it into the set. There are various calibration strategies, which cannot be described in detail here (but see Ragin, 2008: 85ff.; Schneider and Wagemann, 2012: 32ff.). However, all have in common a double requirement: first, considerable work has to be done in order to arrive at a concept definition that allows for the assignment of finely grained fuzzy values to empirical cases. It is not possible to define membership values in the set of democracies if there is no clear and explicitly spelled-out concept of what a democracy is. Therefore, conceptual work lies at the heart of calibration. Since this is a typical step in qualitative reasoning (just think about the etymological origin of the word ‘qualitative’, the Latin qualis, which refers to the fundamental characteristics defining an object), calibration and the importance of concept formation contribute to placing QCA in the world of qualitative methods. Second, case knowledge is of the highest importance. 
Even if researchers arrive at highly elaborate versions of concepts, if they do not know the cases at hand, they may attribute wrong fuzzy values to these cases. This also limits the validity of large-N QCAs, since calibration will usually not
occur case-by-case, but cases will have to be assessed on the basis of more general characteristics. Note that calibration in QCA has to be explicitly performed, while many other comparative case studies perform this step implicitly. Case studies which do not use QCA will usually not include a chapter on ‘calibration’, but they meticulously define the concepts which are needed for the argument, and they are also characterized by deep case knowledge. Using numbers, QCA just goes a step further. The numerical representation of set memberships makes decisions more transparent and forces the researchers to stick to their concept and their case classification throughout the research process. Next to fuzzy-set QCA (fsQCA), further variants of QCA exist (see Schneider and Wagemann, 2012: 13ff., 253ff.). Crisp-set QCA (csQCA) does not allow for the differentiation of set memberships and only works with dichotomies. Since this means that only the fuzzy values 0 and 1 are permitted in calibration, this is nothing other than a specific variant of fsQCA (Schneider and Wagemann, 2012: 15) for which some procedures are smoother, but not different from fsQCA. Therefore, it is also not a problem to mix crisp sets and fuzzy sets in an analysis, although, from a research design perspective, this means that some concepts are more differentiated than others (for a discussion on whether this is positive or negative for the analysis see Schneider and Wagemann, 2012: 24ff.). Multi-value QCA (mvQCA, Cronqvist and Berg-Schlosser, 2009) allows for concepts that are not dichotomous, but ordinal (low, medium, high, etc.) or nominal such as citizenship or religion. While acknowledging the elaboration of the algorithm and the usefulness of mvQCA, some authors doubt the set-theoretic nature of these concepts (Schneider and Wagemann, 2012: 258ff.; Vink and Van Vliet, 2009), a problem which is solved by seeing multi-value concepts as ‘multiple crisp sets’ (Schneider and Wagemann, 2012: 259).
As has been shown, such a multiple crisp-set analysis has the same limits and prospects as mvQCA (Schneider and Wagemann, 2012: 262, fn. 12), so that this variant is not different from QCA as such. A similar conclusion has been drawn for time-QCA (tQCA, see Caren and Panofsky, 2005), which uses the temporal order in which conditions occur for introducing a dynamic element to QCA. However, Ragin and Strand (2008) subsequently showed that when calibrating time itself as sets, conventional QCA arrived at the very same results as tQCA. Sets are the constituting units of what has been described as ‘configuration’ (see section 2.1). A conjunctural and equifinal configuration is composed of various sets, be they crisp or fuzzy. A conjunction, AB, for example, is the simultaneous occurrence of the sets A and B. For example, a federalist democracy is present when a country has a membership in both the set of federations and democracies. In other words, configurations can be seen as chains or compositions of sets. If this involves the use of fuzzy sets, then the configuration describes the ‘ideal type’ in the sense of Lazarsfeld’s (1937) property space (see section 2.1). In any case, in configurative methods, cases become set configurations.
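The two ideas of this subsection, calibration and the conjunction of sets, can be illustrated with a minimal sketch. It assumes a simple logistic transformation with three anchors (non-membership, crossover, full membership) and the common fuzzy-set convention that membership in a conjunction is the minimum of the component memberships. The function names, the anchors, the 0–10 index and the two countries are hypothetical and serve only as illustration; the sketch is not the calibration procedure of any particular software.

```python
import math

def calibrate(raw, non_member, crossover, full_member):
    """Map a raw score to a fuzzy membership value between 0 and 1.

    Assumed logistic transformation: the crossover anchor maps to 0.5,
    the outer anchors map to log-odds of -3 and +3 (about 0.05 and 0.95).
    """
    if raw >= crossover:
        log_odds = 3.0 * (raw - crossover) / (full_member - crossover)
    else:
        log_odds = -3.0 * (crossover - raw) / (crossover - non_member)
    return 1.0 / (1.0 + math.exp(-log_odds))

def in_set(membership):
    """Difference in kind: a case is 'more in than out' above 0.5."""
    return membership > 0.5

def conjunction(*memberships):
    """Fuzzy AND: membership in AB is the minimum of the memberships in A and B."""
    return min(memberships)

# Hypothetical raw democracy index (0-10) and hand-assigned federation memberships.
democracy_index = {"Country X": 7.5, "Country Y": 4.5}
federation = {"Country X": 0.6, "Country Y": 0.9}

for country, raw in democracy_index.items():
    democracy = calibrate(raw, non_member=2.0, crossover=5.0, full_member=8.0)
    federal_democracy = conjunction(democracy, federation[country])
    print(country, round(democracy, 2), round(federal_democracy, 2),
          in_set(federal_democracy))
```

In this sketch, a case can only be a member of the conjunction ‘federalist democracy’ if it is a member of both component sets, which is why Country Y, although clearly a federation, stays outside the configuration.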
3.2 Necessary Conditions
As mentioned above, configurative methods are usually not only descriptive or typologically oriented, but aim at causal explanation (Rihoux and Ragin, 2009b: xix; Schneider and Wagemann, 2012: 8). In QCA as a set-theoretic method, this is modelled through set relations. For this, the relation between the set describing the outcome and the set(s) describing the condition(s) is assessed. For example, the so-called Copenhagen criteria establish whether a country can become a member of the EU or not. In this respect, two sets can be defined, namely a set C (‘criteria’)
of all countries fulfilling the Copenhagen criteria and a set M (‘member’) of all EU member countries. Ideally, a situation results in which all countries that are in M are also in C. However, not all countries in C are also in M, since countries (outside of Europe) might also fulfil the Copenhagen criteria, but are not members of the EU. In other words, M is a subset of C, and C a superset of M. If our model now foresees that the fulfilment of the Copenhagen criteria is an explanatory factor for EU membership,11 then the outcome (M) is a subset of the condition (C). Such a scenario refers to C being a necessary condition for M. Set-theoretically speaking, if the outcome is a subset of the condition, then the condition can be defined as necessary for the outcome.12 From a logical perspective, it can be argued that for a necessary condition there can be no case (i.e., country) which shows the outcome, but not the condition. If such a case instead had existed (i.e., an EU member country which does not fulfil the Copenhagen criteria), then C cannot be considered to be always necessary for M. For such a situation, parameters can be defined in QCA which assess how tolerable such a deviance can be for the necessity assessment (see Schneider and Wagemann, 2012: 139ff. on the consistency measure for necessary conditions). In other words, set theory and the assessment of necessary conditions are closely linked, since set theory provides the set relations through which necessary conditions can be defined. Again, the question arises about which role configurative thinking plays in all of this. Indeed, there are two moments when configurations play an important role. First, taking up the example above, the fulfilment of the Copenhagen criteria is certainly not the only necessary condition for being a member of the EU. Switzerland or Norway would easily fulfil them, but their populations and the governing elites have decided not to join the EU. 
In an analysis of necessity, the researcher will try to come up with many potential necessary conditions, ideally with all of them. However, completely
saturated explanatory models which do not omit any potentially relevant explanatory factor are rare, if existing at all (for a discussion on the illusion of ever solving the omitted ‘variables’ problem in QCA, see Radaelli and Wagemann, 2018). However, if, in an ideal research world, a researcher were able to come up with all necessary conditions, their combination (= their ‘configuration’) would be a sufficient condition for the outcome, and the outcome would be explained without any error. Since it is not realistic to reach this goal, researchers can try to come as close as possible and to arrive at configurations which include many (hopefully theoretically relevant) necessary conditions. Second, a situation can be conceived where no single condition is identified as a necessary condition. However, researchers may detect that the presence of a given outcome always implies the presence of one out of two (or more) conditions. For example, the outcome of a highly developed welfare state may be connected to the presence of the condition of strong social-democratic parties or the presence of strong trade unions. However, social-democratic parties and trade unions alone do not show the required set relation. In such a case, configurations can count as necessary conditions, which are composed of two or more components that are alternatively necessary for one another. Certainly, mathematically speaking, this reasoning could be driven to the extremes: the more conditions are defined as ‘alternatively necessary’ in this sense the less meaningful is the necessity statement. The trivial statement could follow according to which it is necessary that any of the presumed conditions is present.13 In order to avoid such an artefact of alternative necessary conditions, it is required that the conditions making part of such a configuration are ‘functionally equivalent’ (Schneider and Wagemann, 2012: 74, 326). 
This means they have to refer to the same macro concept in order to be considered substitutable variants of one another. In the case of strong social-democratic parties and strong trade
unions, ‘strong left-wing political organizations’ could represent such a macro concept. It goes without saying, however, that such a decision requires the use of theory and conceptual clarity.14
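The subset logic of necessity can be sketched numerically. The example below assumes the consistency measure for necessary conditions commonly used in the QCA literature, namely the sum of the case-wise minima of condition and outcome divided by the sum of the outcome memberships, and the fuzzy OR (case-wise maximum) for alternatively necessary conditions. The membership values for the welfare-state example are invented for illustration.

```python
def necessity_consistency(condition, outcome):
    """Degree to which the outcome is a subset of the condition:
    sum(min(x_i, y_i)) / sum(y_i). A value of 1.0 means a perfect subset."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)

def disjunction(*conditions):
    """Fuzzy OR: case-wise maximum across alternative conditions."""
    return [max(values) for values in zip(*conditions)]

# Hypothetical fuzzy memberships of four countries.
strong_socdem_parties = [0.9, 0.2, 0.8, 0.1]
strong_trade_unions = [0.3, 0.9, 0.7, 0.2]
developed_welfare_state = [0.8, 0.9, 0.7, 0.2]

# Neither condition alone shows the required set relation ...
print(round(necessity_consistency(strong_socdem_parties, developed_welfare_state), 2))
print(round(necessity_consistency(strong_trade_unions, developed_welfare_state), 2))

# ... but their disjunction ('strong left-wing political organizations') does.
left_organizations = disjunction(strong_socdem_parties, strong_trade_unions)
print(round(necessity_consistency(left_organizations, developed_welfare_state), 2))
```

The sketch also shows why the triviality caveat matters: adding ever more alternatives to the disjunction can only raise the consistency value, so a high value alone does not establish a meaningful necessary condition.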
3.3 Sufficient Conditions
The analysis of sufficient conditions is even more clearly based on the notion of configurations. First, the general rules for defining sufficient conditions have to be introduced. Again, from a set-theoretic perspective, let us imagine a set of countries where right-wing populist parties (R) are part of a (coalition) government. In addition, let us imagine a set of countries that decide on tighter rules about immigration (I). Our theoretical model postulates that R is explanatory for I (although, interestingly enough, this relation could also be reversed). If R is a subset of I and I a superset of R (if, in other words, the condition is a subset of the outcome), then this subset relation gives us good reasons to claim R to be a sufficient condition for I (the caveat of note 12 also holds for sufficient conditions). This is the mirror image of the definition of a necessary condition, and various parameters of fit tell us how many deviations from the rule we may accept (for the consistency of a sufficient condition, see Schneider and Wagemann, 2012: 123ff.). The relation in size between the condition set and the outcome set informs us about the explanatory contribution of the condition (for the coverage of a sufficient condition, see Schneider and Wagemann, 2012: 129ff.). Certainly, such a straightforward finding of declaring the participation of right-wing populists in a government as sufficient for certain decisions in immigration policy is not very likely and ignores the complex reality of the social world, mainly for two reasons which are both connected to the issue of configurations. First, it may well be that right-wing populists are part of a coalition, but the coalition agreement can have fixed certain
immigration policies where the right-wing populist party did not prevail. This means that just being part of a coalition alone is not yet sufficient for arriving at changes in policies. Something else has to be added. This means that two or more elements have to be combined in a conjuncture, which is one basic unit of a configuration (see above, section 2.1). Second, there may also be situations in which no right-wing populist party is needed for tighter rules on immigration. Indeed, even governments without any right-wing populist participation may introduce tighter rules on immigration. Many reasons can be imagined for this. Above (section 2.1), we have referred to this as equifinality. There may be several conjunctures which imply the outcome. Taken together, conjunctural causation and equifinality contribute to a configurative approach to sufficiency. Indeed, the ‘truth table algorithm’, which is at the heart of most QCAs (Schneider and Wagemann, 2012: 178ff.), also makes strong use of configurations. A truth table is organized around all possible configurations which can be formed from the chosen conditions. If A, B and C are chosen as conditions, then these can combine as ABC, AB~C,15 A~BC, A~B~C, ~ABC, ~AB~C, ~A~BC or ~A~B~C. As can be seen, a truth table formed from three conditions results in eight configurations that are called ‘truth table rows’. Since every single condition can be part of a configuration either in its presence or absence and thus takes on two ‘truth values’ (‘true’ and ‘false’), the number of configurations is an exponential function of the number of conditions and can be calculated as $2^k$, with $k$ being the number of conditions. This also means that the configurative logic puts certain limits on the applicability of configurative methods. If Lijphart (1999) had not used quantitative methods for his assessment of different political systems, but configurative ones, he would have had to work with $2^{10} = 1{,}024$ configurations. 
Since some countries may have the same truth values for
all components,16 that is, belong to the same configuration, he would have needed at least 1,024 countries for his analysis to cover all possibilities. This means that QCAs cannot simply add ever more conditions to analytical models. Instead, QCAs have to be as parsimonious as possible regarding the relation between the number of cases and the number of conditions. In any case, in the truth table algorithm, all these $2^k$ combinations are assessed for whether they qualify as sufficient conditions for the outcome and whether they are part of the equifinal solution formula. In a further step, this solution formula is minimized (on the rules of and the discussion about minimization, see Baumgartner and Thiem, 2017; Schneider and Wagemann, 2012: 104ff.). While we cannot delve deeply into the technical details of this procedure, the analysis of sufficiency starts from an assessment of all possible configurations. The analysis of sufficiency is, therefore, very closely linked to the issue of configurations. As is obvious, it is certainly an illusion that the noisy and chaotic empirical social world provides researchers with empirical cases for all $2^k$ combinations. Even when the number of conditions is kept small (often at the expense of overly simplistic theoretical and analytical models), there may still be combinations (or parts thereof) which do not exist empirically. No woman has so far ever been in the most important leadership position of the current super-powers: the United States, Russia and China; no African welfare state has developed; etc. Sometimes these combinations cannot exist (the famous metaphorical ‘pregnant man’), but, more often, historical, social and political processes have just not produced all combinations of conditions which make up a truth table. In QCA, this phenomenon is called ‘limited diversity’. It cannot be ‘solved’ in the sense that researchers can make up for information that they do not have. 
However, various proposals have been developed to cope with this problem (Ragin, 1987: 103ff., 2008: 147ff.; Schneider
and Wagemann, 2012: 160ff., 197ff.). All of them take configurativeness seriously: they do not break up the configurations and base the arguments on single, isolated factors, but instead leave configurative thinking intact, even if this means making counterfactual arguments for selected configurations (Ragin and Sonnett, 2004; Schneider and Wagemann, 2012: 167ff., 197ff.). In short, configurations are at the heart of sufficiency analyses and are important ingredients of the analytical protocol of QCA.
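How a truth table is built from conditions, and how limited diversity shows up in it, can be sketched in a few lines. The sketch enumerates the $2^k$ rows for three conditions, assigns invented cases to the row in which their membership exceeds 0.5, and computes consistency and coverage of a row as a sufficient condition using the formulas common in the QCA literature (sum of case-wise minima divided by the sum of condition memberships or of outcome memberships, respectively). All case data and helper names are hypothetical, and the sketch stops before any logical minimization.

```python
from itertools import product

CONDITIONS = ("A", "B", "C")

def label(row):
    """Write a row such as (True, True, False) as 'AB~C'."""
    return "".join(n if present else "~" + n for n, present in zip(CONDITIONS, row))

def membership(case, row):
    """Fuzzy membership of a case in a row: minimum over the conditions,
    using 1 - value where the condition is negated."""
    return min(case[n] if present else 1.0 - case[n]
               for n, present in zip(CONDITIONS, row))

def consistency(x, y):
    """Sufficiency consistency: how far the row is a subset of the outcome."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)

def coverage(x, y):
    """Sufficiency coverage: how much of the outcome the row accounts for."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)

rows = list(product([True, False], repeat=len(CONDITIONS)))  # 2**k rows

# Hypothetical calibrated cases with outcome Y.
cases = [
    {"A": 0.9, "B": 0.8, "C": 0.7, "Y": 0.9},
    {"A": 0.8, "B": 0.7, "C": 0.2, "Y": 0.6},
    {"A": 0.1, "B": 0.2, "C": 0.9, "Y": 0.3},
    {"A": 0.7, "B": 0.9, "C": 0.6, "Y": 0.8},
]
outcome = [case["Y"] for case in cases]

observed, remainders = [], []
for row in rows:
    scores = [membership(case, row) for case in cases]
    if any(score > 0.5 for score in scores):
        observed.append(label(row))
        print(label(row), round(consistency(scores, outcome), 2),
              round(coverage(scores, outcome), 2))
    else:
        remainders.append(label(row))  # limited diversity: no empirical case

print(len(rows), "rows;", len(remainders), "without cases:", remainders)
```

With only four cases and three conditions, five of the eight rows already remain empty, which illustrates why the number of conditions has to be kept in proportion to the number of cases.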
3.4 Test of Theories and Robustness Tests
The fundamental paradigmatic differences between QCA as a configurative method and mainstream quantitative methods also mean that some procedures which are standard (and useful) in quantitative approaches cannot be directly applied or translated to QCA. Nevertheless, some possibilities exist which are frequently overlooked. Two of them, which seem to be particularly relevant when looking at QCA from a configurative perspective, are briefly presented here. The first concerns theory testing. Whether QCA can test theories is frequently misunderstood, and this misunderstanding goes back to its configurative nature. It is, of course, an illusion that a researcher will start his/her research with hypotheses that make valid assumptions about all configurations that may contribute to a given outcome. Hardly any hypothesis will state in detail that a combination of the presence of factor A and the presence of factor B, or the combination of the absence of factor A and the presence of factor C, or the absence of factor C and the presence of factor D, is sufficient for Y. The world and its causal processes are just too complex for such sophisticated hypotheses. Thus, the configurative nature of QCA that allows for so many complex patterns makes it hard to arrive at realistic hypotheses. Nevertheless, there is a possibility in QCA that allows for some theory
testing (Ragin, 1987: 118ff.; Schneider and Wagemann, 2012: 295ff.). It basically starts from a hypothesis in set-theoretic formulation, but one which is much less complex than the one just presented. For example, if two different theories claim the factors A and B, respectively, to be sufficient for Y, and a third (more elaborate) theory says factor C is sufficient, but only if D is not present, then this can be formulated as A + B + C~D → Y.17 This is undoubtedly a configurative expression. Subsequently, the result of the empirical QCA analysis is intersected with the formula of the theory. The intersection (for technical details, see Schneider and Wagemann, 2012: 295ff.) then identifies which parts of the theory are confirmed, which parts are not confirmed, and where the theory has to be extended. In sum, this is a creative way to make use of Boolean algebra to test whether empirical results of a configurative analysis confirm or contradict theoretical expectations. However, note that a difference between the theoretical assumptions and the empirical results is already expected at the outset. The second refers to robustness tests (Schneider and Wagemann, 2012: 284ff.; Skaaning, 2011). Several proposals have been made on how robustness can be assessed in QCA. Usually, they include changes in calibration, or changes with regard to the threshold of consistency values above which a given configuration still counts as sufficient, or changes with regard to the cases under research. It goes without saying that, if major changes are made, QCA results also change, and this is not a disadvantage, since major changes in the design should also imply major effects with regard to the results. Nevertheless, the question is how much the results change. Schneider and Wagemann (2012: 286), for example, check for the set-relational status of the results obtained with different design elements. 
They conclude: if different choices lead to solution terms that are not in a subset relation with one another, then results are not robust. If, however, there is a clear
subset relation between different solution terms, then results can be interpreted as robust, even if these solution terms look quite different on the surface (Schneider and Wagemann, 2012: 286).
This means if the main configurative structure of the argument is maintained (and this can be shown in the subset relation), then results can be interpreted as robust. Such a definition of robustness places a special emphasis on the configurative nature of QCA.
4. Using Configurative Thinking Beyond QCA
While QCA is a typical configurative method, this does not mean that configurative thinking is limited to QCA. There are various other (components of) methods that make use of configurations, sometimes explicitly, sometimes implicitly. We cannot give a complete overview here, but the following sections provide a first insight into the broad usage of configurations.
4.1 Descriptive Variants
First of all, configurations can be used not only when causal arguments are made, but also when social science phenomena are simply described. A fundamental prerequisite for communication in social science is a shared understanding about what the research objects actually are. In other words, clear concepts are needed. A whole literature has evolved around concept formation (Mair, 2008; Sartori, 1970). Configurative thinking begins when Sartori (1970: 1041) presents the idea of the ‘intension’ of concepts as ‘the collection of properties which determine the things to which the word applies’ (quoted from Salmon, 1963: 90–1). While in Sartori’s text the word ‘properties’ is in italics, it is better for our purposes to italicize ‘collection’. When we build a concept and derive
sub-concepts, properties are connected with one another to arrive at a more sophisticated concept. In other words, by adding characteristics to a concept, the concept will become more specific. In this way, for example, the concept of democracy can be developed into a ‘parliamentary democracy’ by creating a conjunction. Later writings on concepts took up this idea of configurative concept formation, but went beyond Sartori’s formulation. When radial concepts (Collier and Mahon, 1993: 848ff.; Mair, 2008: 193) were developed, conjunctions were interpreted in a different way, namely increasing the extension of a concept. Family resemblance approaches (Collier and Mahon, 1993: 846ff.; Goertz, 2006: 74–5; Mair, 2008: 194) go even further and propose sophisticated versions of configurative concept formation. The equifinal aspect here is expressed through the fact that various conjunctions of properties (e.g., not only ABC, but also ABD, or BCD) can imply the presence of the concept. At the same time, family resemblance approaches make use of conjunctions so that the resulting concept formation is truly configurative (for an extensive discussion see Goertz, 2006). Such concepts can be further used as ‘data containers’ (Sartori, 1970: 1039) and be combined in typologies. Indeed, the idea of typological theories (Bennett and Elman, 2006; Elman, 2005) is directly connected to this. Therefore, it is no surprise that ‘fuzzy-set Ideal Type Analysis’ (fsITA) has been developed (Kvist, 2007). This is similar to a QCA with fuzzy sets, but the algorithm is not completed. It rather stops when the truth table is constructed, that is when all potentially possible configurations are matched with real cases. The aim of fsITA is not causal, but classificatory: it asks which configurations do exist, and which cases can be attributed to them. 
This can help to control and eventually revise existing classificatory systems, such as typologies of welfare states, capitalist systems of production, party systems, etc. The truth table can then be minimized by only
accepting those combinations of properties that exist empirically. The result is a configurative formula that describes which combinations of properties exist in reality.
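The classificatory logic described for fsITA can be sketched as follows: all combinations of properties are formed, each case is attributed to the combination in which its fuzzy membership exceeds 0.5, and the descriptive formula lists only the combinations that exist empirically. The two welfare-state properties, the cases and their membership values are invented, and the helper names are hypothetical; the sketch does not reproduce any published fsITA analysis.

```python
from itertools import product

PROPERTIES = ("G", "U")  # hypothetical: generous benefits, universal access

def label(row):
    """Write a row such as (True, False) as 'G~U'."""
    return "".join(p if present else "~" + p for p, present in zip(PROPERTIES, row))

def membership(case, row):
    """Membership in an ideal type: minimum over properties, negation as 1 - value."""
    return min(case[p] if present else 1.0 - case[p]
               for p, present in zip(PROPERTIES, row))

def classify(cases):
    """Attribute each case to the ideal type in which its membership exceeds 0.5."""
    table = {}
    for row in product([True, False], repeat=len(PROPERTIES)):
        table[label(row)] = [name for name, case in cases.items()
                             if membership(case, row) > 0.5]
    return table

# Hypothetical calibrated cases.
cases = {
    "X1": {"G": 0.9, "U": 0.8},
    "X2": {"G": 0.7, "U": 0.3},
    "X3": {"G": 0.8, "U": 0.9},
    "X4": {"G": 0.2, "U": 0.1},
}

types = classify(cases)
formula = " + ".join(name for name, members in types.items() if members)

for name, members in types.items():
    print(name, members)
print("Empirically existing types:", formula)
```

A case with a membership of exactly 0.5 in a property would not be attributed to any ideal type in this sketch, which is one reason why such values are usually avoided in calibration.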
4.2 Causally Oriented Variants
As mentioned above (section 2.2), there are various related methods which, taken together, form the larger context within which configurative methods can be placed. We have termed them comparative case study methods. Often, a difference is made between cross-case methods and within-case methods (Gerring, 2007: 21ff.). While QCA (see above, section 3) and more conventional ways of comparing small numbers of cases (see Morlino, 2018) are typical examples of the former, the latter usually occur in n = 1 settings, where just a single case is analysed, either synchronically or diachronically (Gerring, 2007: 28). The central idea of within-case analysis is that variation can also occur within a single case, such as at different moments of time, or through different aspects of a process, or through different actors, etc. A literature on ‘process tracing’ has developed from this (Bennett and Checkel, 2015a; Blatter and Haverland, 2012: 97ff.; George and Bennett, 2005). However, the method has a buzzword problem (Bennett and Checkel, 2015b), since many users claim to perform process tracing without actually referring to the standards of the method. This runs counter to attempts to systematize process tracing concepts. Very early on, the idea of necessary and (not quite) sufficient conditions was built into the logic (Van Evera, 1997: 31–2). Subsequently, Mahoney and Sweet Vanderpoel (2015) enlarged this idea and showed how standard procedures of comparative case study research can be presented graphically (and thus logically) in set-theoretic terms. Most of the presentation is, indeed, through set theory, but, because of the close relation between set theory and
configurative thinking, these ideas can also be framed in terms of configurative methods. Finally, Coincidence Analysis (CNA) – developed in response to QCA methodology (Baumgartner, 2009, 2013) – is yet another method based on configurative thinking. CNA does not decide potential causal relations a priori and, therefore, does not distinguish between conditions and the outcome; these are equally treated as ‘factors’ (Baumgartner, 2013: 14). As a result, with CNA it is possible to detect more complex causal structures in the data such as causal chains and common causes (Baumgartner, 2009: 91). This approach drives the configurative method even further, since not only the potentially explanatory factors are considered to be part of a configuration, but also the explanandum. Configurations are examined independently from previously formulated hypotheses or assumptions about which elements could count as causes and which as effects.
5. Conclusions
In this chapter, a brief overview of configurative methods has been provided. Starting from the observation that the term as such does not enjoy much popularity in the social sciences methods canon, we first defined configurations through the ideas of conjunctural causality and equifinality. This made it possible to identify a group of configurative methods that are also known as set-theoretic methods. Configurative perspectives are a special feature of set relations. A main part of this contribution was devoted to QCA as the most widely used configurative method. Subsequently, other methods that use configurations to describe patterns or to make causal arguments were briefly presented. One important caveat has to be made concerning the ‘quality’ of the application of configurative methods (for empirical analyses on this topic with regard to QCA,
see Buche and Siewert, 2015; Wagemann et al., 2016; for conceptualizations of what analytical ‘quality’ means, see Schneider and Wagemann, 2012: 275ff.). Even the most basic requirements such as transparency standards are sometimes not fulfilled (Wagemann and Schneider, 2015). Methods gain visibility and are taken seriously by the scholarly community not only because their algorithms and procedures function ‘in theory’, but also because of how they are applied in actual research practice. High-quality research practice means, at the most basic level, that methods are appropriately executed. It also means that the execution of methods is well connected to the research design. Ideally, methods are not used for the mere pleasure of running procedures, but in order to accumulate knowledge about social science phenomena. Therefore, quality aspects of the application are of foremost importance for configurative methods, too. Certainly, by their very nature, configurative methods allow for much flexibility. For this reason, it is even more important to show that flexibility does not mean rulelessness and that it must not be (mis)used to justify superficial practices.
Notes 1 Of course, the book also includes chapters that touch upon other topics, such as Berg-Schlosser and De Meur (2009) with their broader approach on most similar different outcome and most different similar outcome designs. 2 The QCA family traditionally encompasses three types, namely crisp-set, fuzzy-set and multi-value QCA (Rihoux and Ragin 2009a; Schneider and Wagemann 2012). 3 Parts of the presentation of what configurations are have been developed together with Markus B. Siewert in preparation of a paper for the ECPR Joint Sessions 2017 in Nottingham (Wagemann and Siewert, 2017). 4 ‘Complex causality’ may be the better term. 5 It is often wishful thinking that the existence of different explanations can be traced back to clear groups of cases, that is, party families, geographical areas, types of political systems, etc. It will more frequently be the case that researchers
need some imagination in order to classify cases with their explanations. Often, there may be more than one equifinal explanation per case, so that the case is ‘over-explained’. 6 It is, of course, impossible to estimate how many studies have been published based on configurative methods. Buche and Siewert (2015), Rihoux et al. (2013) and Wagemann et al. (2016) present some numbers for QCA, while Blatter and Haverland (2012) and George and Bennett (2005) are good examples of (text)books in which path-breaking studies are named and selected publications intensively discussed. 7 Without any doubt, the classification of methods as ‘quantitative’ vs. ‘qualitative’ falls short of research reality, as do other dichotomies, such as case-oriented versus variable-oriented (Ragin, 1987). Nevertheless, we recur to these dichotomies, since they are often used in social science methodology. 8 Others emphasize that QCA uses mathematical symbols and procedures, which are more typical for quantitative methods (Schneider and Grofman, 2009). 9 If interpretive methods really should not belong to the qualitative camp, this means that they either constitute a third methodological paradigm, which would go beyond the quantitative– qualitative dichotomy, or that they are no valid social science methods at all (such an opinion is clearly rejected by Goertz and Mahoney, 2012: 5, fn. 2). 10 Goertz and Mahoney (2012: 4–5) explicitly exclude interpretive methods from their discussion of two cultures because of different ontological and epistemological premises. 11 In our example, it is difficult to imagine that the causal relation could be the other way around, namely that EU membership is an explanatory factor for the fulfilment of the Copenhagen criteria. 12 Very often, set relations are interpreted as causal relationships. If the described set relationship (the outcome is a subset of the condition) exists, then this is only a technical-set coincidence. 
The decision whether the condition really qualifies as a necessary condition needs theoretical arguments (for more on necessary conditions in the social sciences, see Goertz and Starr, 2003). This is comparable to the discussion in quantitative research, whether causation can be inferred from statistical correlation. Similarly, the mere existence of a set relation does not tell us much about the importance of such a presumed necessary condition. It can still be a trivial necessary condition. For example, without any doubt, the fact to be born is a
necessary condition in order to become a Russian president. But there are so many people who have been born, but only a very limited part of them becomes a Russian president. In set-theoretic terms, the set of the condition is many times larger than the set of the outcome, and we have a so-called coverage problem. It goes without saying that, while being born undoubtedly is and remains a necessary condition for becoming a Russian president, this statement is not particularly helpful in the analysis (for more on trivialness of necessary conditions, see Schneider and Wagemann, 2012: 144ff., 233ff.). 13 This can be assessed through measures of trivialness in QCA (Schneider and Wagemann, 2012: 144ff., 233ff.). 14 More elaborate versions of these configurations can also be thought of (Mahoney et al., 2009: 126). 15 The tilde ~ is one possible way to indicate the absence of a condition. 16 Note that, going back to Lazarsfeld’s (1937) ideas of the property space (see section 2.1), in case of fuzzy sets, truth table rows constitute the ideal cases to which real cases can more or less correspond. In other words, through fuzzy sets, cases become partial members of ideal types. By consequence, cases may have different membership values in configurations, but belong to the same ideal type, although at different intensities. 17 Read: A or B or the combination of the presence of C and the absence of D are sufficient for Y.
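The coverage problem described in note 12 can be made concrete with a small sketch. It assumes crisp sets and uses the consistency and coverage ratios commonly applied to necessary conditions in QCA; the case data and the function names are invented for illustration and do not come from the chapter.

```python
# Hypothetical crisp-set data: 1 = member of the set, 0 = non-member.
# Each tuple is (condition X, outcome Y) for one case; all values are invented.
cases = [(1, 1), (1, 1), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0)]


def necessity_consistency(cases):
    """Share of outcome cases that also show the condition: |X and Y| / |Y|."""
    both = sum(1 for x, y in cases if x == 1 and y == 1)
    outcome = sum(1 for _, y in cases if y == 1)
    return both / outcome


def necessity_coverage(cases):
    """Share of condition cases that also show the outcome: |X and Y| / |X|."""
    both = sum(1 for x, y in cases if x == 1 and y == 1)
    condition = sum(1 for x, _ in cases if x == 1)
    return both / condition


consistency = necessity_consistency(cases)  # every outcome case shows X
coverage = necessity_coverage(cases)        # but few X cases show the outcome
print(consistency, coverage)
```

In this toy data the outcome is a perfect subset of the condition, yet the condition set is four times larger than the outcome set. That combination of high consistency and low coverage is what the note calls a trivial necessary condition.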
References Baumgartner, Michael (2009). ‘Uncovering Deterministic Causal Structures: A Boolean Approach’. Synthese 170 (1): 71–96. Baumgartner, Michael (2013). ‘Detecting Causal Chains in Small-n Data’. Field Methods 25 (1): 3–24. Baumgartner, Michael, and Alrik Thiem (2017). ‘Model Ambiguities in Configurational Comparative Research’. Sociological Methods & Research 46 (4): 954–87. Bennett, Andrew, and Jeffrey T. Checkel, eds. (2015a). Process Tracing: From Metaphor to Analytical Tool. Cambridge: Cambridge University Press. Bennett, Andrew, and Jeffrey T. Checkel (2015b). ‘Process Tracing: From Philosophical Roots to Best Practices’. In Andrew Bennett and Jeffrey
T. Checkel (eds.), Process Tracing. From Metaphor to Analytical Tool. Cambridge: Cambridge University Press, pp. 3–38. Bennett, Andrew, and Colin Elman (2006). ‘Qualitative Research: Recent Developments in Case Study Methods’. Annual Review of Political Science 9: 455–76. Berg-Schlosser, Dirk, and Gisèle De Meur (2009). ‘Comparative Research Design: Case and Variable Selection’. In Benoît Rihoux and Charles C. Ragin (eds.), Configurational Comparative Methods: Qualitative Comparative Analysis. Thousand Oaks (Calif.): Sage, pp. 19–32. Blatter, Joachim, and Markus Haverland (2012). Designing Case Studies: Explanatory Approaches in Small-N Research. Houndmills, Basingstoke: Palgrave Macmillan. Buche, Jonas, and Markus B. Siewert (2015). ‘Qualitative Comparative Analysis (QCA) in der Soziologie. Perspektiven, Potentiale und Anwendungsbereiche’. Zeitschrift für Soziologie 44 (6): 386–406. Caramani, Daniele (2009). Introduction to the Comparative Method with Boolean Algebra. Los Angeles: Sage. Caren, Neal, and Aaron Panofsky (2005). ‘TQCA. A Technique for Adding Temporality to Qualitative Comparative Analysis’. Sociological Methods & Research 34 (2): 147–72. Collier, David, and James Mahon (1993). ‘Conceptual “Stretching” Revisited: Adapting Categories in Comparative Analysis’. American Political Science Review 87 (4): 845–55. Cronqvist, Lasse, and Dirk Berg-Schlosser (2009). ‘Multi-Value QCA (mvQCA)’. In Benoît Rihoux and Charles C. Ragin (eds.), Configurational Comparative Methods: Qualitative Comparative Analysis. Thousand Oaks (Calif.): Sage, pp. 69–86. Elman, Colin (2005). ‘Explanatory Typologies in Qualitative Studies of International Politics’. International Organization 59 (2): 293–326. Fiss, Peer C., Dmitry Sharapov, and Lasse Cronqvist (2013). ‘Opposites Attract? Opportunities and Challenges for Integrating Large-N QCA and Econometric Analysis’. Political Research Quarterly 66 (1): 191–8. George, Alexander L., and Andrew Bennett (2005). 
Case Studies and Theory Development in the Social Sciences. Cambridge (Mass.): MIT Press.
Gerring, John (2007). Case Study Research. Cambridge: Cambridge University Press. Goertz, Gary (2006). Social Science Concepts. Princeton and Oxford: Princeton University Press. Goertz, Gary, and James Mahoney (2012). A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences. Princeton (N.J.): Princeton University Press. Goertz, Gary, and Harvey Starr, eds. (2003). Necessary Conditions: Theory, Methodology, and Applications. Lanham (Md.): Rowman & Littlefield. Greckhamer, Thomas, Vilmos F. Misangyi, and Peer C. Fiss (2013). ‘The Two QCAs: From a Small to a Large-N set Theoretic Approach’. In Peer C. Fiss, Bart Cambré, and Axel Marx (eds.), Configurational Theory and Methods in Organizational Research. Bingley: Emerald Publishing, pp. 49–75. Hall, Peter A. 2003. ‘Aligning Ontology and Methodology in Comparative Research’. In James Mahoney and Dietrich Rueschemeyer (eds.), Comparative Historical Analysis in the Social Sciences. Cambridge (Mass.): Cambridge University Press, pp. 373–404. Koivu, Kendra L., and Erin Kimball Damman (2015). ‘Qualitative Variations: The Sources of Divergent Qualitative Methodological Approaches’. Quality & Quantity 49 (6): 2617–32. Kühn, David, and Ingo Rohlfing (2016). ‘Are There Really Two Cultures? A Pilot Study on the Application of Qualitative and Quantitative Methods in Political Science’. European Journal of Political Research 55 (4): 885–905. Kvist, Jon (2007). ‘Fuzzy Set Ideal Type Analysis’. Journal of Business Research 60 (5): 474–81. Lazarsfeld, Paul (1937). ‘Some Remarks on Typological Procedures in Social Research’. Zeitschrift für Sozialforschung 6 (1): 119–39. Lijphart, Arend (1999). Patterns of Democracy: Government Forms and Performance in Thirty-Six Countries. New Haven (Conn.): Yale University Press. Mahoney, James, and Rachel Sweet Vanderpoel (2015). ‘Set Diagrams and Qualitative Research’. Comparative Political Studies 48 (1): 65–100.
Mahoney, James, Erin Kimball, and Kendra L. Koivu (2009). ‘The Logic of Historical Explanation in the Social Sciences’. Comparative Political Studies 42 (1): 114–46. Mair, Peter (2008). ‘Concepts and Concept Formation’. In Donatella della Porta and Michael Keating (eds.), Approaches and Methodologies in the Social Sciences. Cambridge: Cambridge University Press, pp. 177–97. Morlino, Leonardo (2018). Comparison: A Methodological Introduction for the Social Sciences. Opladen/Berlin/Toronto: Barbara Budrich Publishers. Radaelli, Claudio M., and Claudius Wagemann (2018). ‘What Did I Leave Out? Omitted Variables in Regression and Qualitative Comparative Analysis’. European Political Science, online first, https://doi.org/10.1057/s41304017-0142-7 accessed 6 January, 2020. Ragin, Charles C. (1987). The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Berkeley (Calif.): University of California Press. Ragin, Charles C. (2000). Fuzzy-Set Social Science. Chicago: University of Chicago Press. Ragin, Charles C. (2006). ‘The Limitations of Net-Effects Thinking’. In Benoît Rihoux and Heike Grimm (eds.), Innovative Comparative Methods for Policy Analysis: Beyond the Quantitative-Qualitative Divide. New York: Springer, pp. 13–41. Ragin, Charles C. (2008). Redesigning Social Inquiry: Fuzzy Sets and Beyond. Chicago: University of Chicago Press. Ragin, Charles C. (2013). ‘New Directions in the Logic of Social Inquiry’. Political Research Quarterly 66 (1): 171–4. Ragin, Charles C., and John Sonnett (2004). ‘Between Complexity and Parsimony: Limited Diversity, Counterfactual Cases and Comparative Analysis’. In Sabine Kropp and Michael Minkenberg (eds.), Vergleichen in der Politikwissenschaft. Wiesbaden: VS Verlag für Sozialwissenschaften, pp. 180–97. Ragin, Charles C., and Sarah I. Strand (2008). ‘Using Qualitative Comparative Analysis to Study Causal Order’. Sociological Methods & Research 36 (4): 431–41. Rihoux, Benoît, and Charles C. Ragin (2009a). 
Configurational Comparative Methods: Qualitative Comparative Analysis and Related Techniques. Thousand Oaks (Calif.): Sage.
Rihoux, Benoît, and Charles C. Ragin (2009b). ‘Introduction’. In Benoît Rihoux and Charles C. Ragin (eds.), Configurational Comparative Methods: Qualitative Comparative Analysis and Related Techniques. Thousand Oaks (Calif.): Sage, pp. xvii–xxv. Rihoux, Benoît, Priscilla Álamos-Concha, Damien Bol, Axel Marx, and Ilona Rezsöhazy (2013). ‘From Niche to Mainstream Method? A Comprehensive Mapping of QCA Applications in Journal Articles from 1984 to 2011’. Political Research Quarterly 66 (1): 175–84. Rohlfing, Ingo, and Carsten Q. Schneider (2018). ‘A Unifying Framework for Causal Analysis in Set-Theoretic Multimethod Research’. Sociological Methods & Research 47 (1): 37–63. Salmon, Wesley C. (1963). Logic. Englewood Cliffs (N.J.): Prentice-Hall. Sartori, Giovanni (1970). ‘Concept Misformation in Comparative Politics’. American Political Science Review 64 (4): 1033–53. Schneider, Carsten Q., and Bernard Grofman (2009). ‘An Introduction to Crisp Set QCA, with a Comparison to Binary Logistic Regression’. Political Research Quarterly 62 (4): 662–72. Schneider, Carsten Q., and Ingo Rohlfing (2013). ‘Combining QCA and Process Tracing in Set-Theoretic Multi-Method Research’. Sociological Methods & Research 42 (4): 559–97. Schneider, Carsten Q., and Claudius Wagemann (2012). Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis. Cambridge: Cambridge University Press. Seawright, Jason (2005). ‘Qualitative Comparative Analysis vis-à-vis Regression’. Studies
in Comparative International Development 40 (1): 3–26. Skaaning, Svend-Erik (2011). ‘Assessing the Robustness of Crisp-Set and Fuzzy-Set QCA Results’. Sociological Methods & Research 40 (2): 391–408. Thiem, Alrik, Michael Baumgartner, and Damien Bol (2016). ‘Still Lost in Translation! A Correction of Three Misunderstandings Between Configurational Comparativists and Regressional Analysts’. Comparative Political Studies 49 (6): 742–74. Van Evera, Stephen (1997). Guide for Students of Political Science. Ithaca (N.Y.) and London: Cornell University Press. Vink, Maarten P., and Olaf Van Vliet (2009). ‘Not Quite Crisp, Not Yet Fuzzy? Assessing the Potentials and Pitfalls of Multi-Value QCA’. Field Methods 21 (3): 265–89. Wagemann, Claudius, and Carsten Q. Schneider (2015). ‘Transparency Standards in Qualitative Comparative Analysis’. Qualitative & Multi-Method Research 13 (1): 38–42. Wagemann, Claudius, and Markus B. Siewert (2017). Configurational Comparative Methods: Approaching a Chameleonic Label to Derive at Configurations of Configurational Methods. Paper for the ECPR Joint Sessions Workshop ‘Configurational Thinking in Political Science: Theory, Methodology, and Empirical Application’, Nottingham, 25–30 April 2017. Wagemann, Claudius, Jonas Buche, and Markus B. Siewert (2016). ‘QCA and Business Research: Work in Progress or a Consolidated Agenda?’. Journal of Business Research 69 (7): 2531–40.
21 Designing a Research Project1 Hans Keman
Introduction
The ‘art of doing research’ in political science is explored here by demonstrating how to relate a theory-guided Research Question to a properly founded Research Answer by developing an adequate Research Design. First, the role of theory and conceptualization will be examined. Second, the function of variables in social research will be highlighted. Third, the meaning of ‘cases’ and their selection will be discussed. These are important steps in any (comparative) Research Design. Fourth, the focus will turn to the ‘core’ of any type of research, whether qualitative or quantitative: linking evidence to argument in a verifiable and convincing fashion. Among the different methods and approaches (see e.g. Burnham et al., 2004; Keman and Woldendorp, 2016) the comparative approach is emphasized: the use of the logic of comparative inquiry to analyse the relationships between variables – representing theory – and the information contained in
the cases – the evidence. Finally, some problems common to relating theory to evidence will be discussed. The logic of comparison allows for more types of research than is often thought. In the past, many researchers were taught that ‘comparative politics’ meant the cross-national analysis of states, their institutional complexion and politics (e.g. the welfare state in Western European democracies). As will be elaborated below, this is no longer the state of affairs. Other systemic entities, for example transnational networks (like the EU) or societal developments over time, (sub-)systems (e.g. cities across the globe), many or few cases (be it countries, cities, institutions, actors or individuals) are now considered proper units of analysis. For example, if one wishes to understand the process of state formation or the development of democracies, one obviously needs to take into account the cross-case variation as well as the evolution over time, also called the historical approach.
The historical approach has long been neglected in political science (see also Berntzen, Chapter 23, this Handbook). Following Mahoney and Rueschemeyer, it can be defined as follows: ‘comparative historical analysis aims at the explanation of substantively important outcomes by describing processes over time using systematic and contextualized comparisons’ (Mahoney and Rueschemeyer, 2003: 6). There are basically two modes of analysis: if one compares sequences in a single or across several cases, it is called a diachronic analysis; if it concerns several cases at the same point in time or during the same historical period, it is called a synchronic analysis. The second element in the above definition concerns ‘important outcomes’. But what is important? For example, by hindsight the outcome of what we call today World War I was important because it meant a watershed in Europe in terms of democratization and is also considered as a cause of World War II (Hobsbawm, 1994). Hence, important outcomes of a historical process need to be formulated carefully by means of a Research Question to explain puzzles over time and must be specified spatially (i.e. where and when do the same or similar phenomena occur?) to compare these properly regarding their temporal variation (Pierson, 2000). A Research Design is a crucial step for developing and testing theories and for the verification of rival theories. Theory development and Research Design are closely interlinked in analysing political developments. Contrary to everyday practice, where most
people are often implicitly comparing situations, in comparative politics the issue of what and how to observe reality is explicitly part of the comparative method. Dogan and Pelassy, for example, remark: ‘to compare is a common way of thinking. Nothing is more natural than to consider people, ideas, or institutions in relation to other people, ideas and institutions. We gain knowledge through reference’ (1990: 3). Hence, the evolution of political science has moved on from implicit comparisons to explicit ways of comparing political systems and researching related processes. One of the major modern developments in political science consists of linking theory to evidence by means of comparative methods (Landman, 2008). The particular method to be used depends, however, on the Research Question (RQ) asked and the Research Answer (RA) to be given (see Box 21.1). The actual method chosen is what we call Research Design (RD) and this is what this chapter is about. A theory, in its simplest form, is therefore a meaningful statement about the relationship between two real world phenomena: X, the independent variable, and Y, the dependent variable. According to theory, it is expected that change in one variable will be related to change in the other. The conceptual and explanatory understanding of such a relationship is the point of departure for conducting research by comparing empirical evidence, for instance across systems or over time (see also Brady and Collier, 2004: 309; Burnham et al., 2004: 57). An example of how the triad works and has helped to answer a contested issue is the
Box 21.1: The Triad RQ → RD → RA The point of departure is that all Research Questions should be theory guided. The theoretical guidance is expressed in relating Research Questions (RQ) to Research Answers (RA) by means of logical relationships between a dependent variable (Y: what is to be explained) and the independent variables (X: the most likely causes, i.e. factors serving as an explanation). The ‘bridge’ between RQ and RA is called a Research Design (RD). Therefore, the (comparative) method is a means to an end: to make choices as to which of the potential mass of relevant empirical data (the evidence) and possible causes (X) explaining variations in Y can be used and are valid and reliable in arriving at an RA.
debate during the 1980s and 1990s on ‘does politics matter?’ (Castles, 1982; Hicks and Swank, 1992; Schmidt, 1996; Keman, 2002). The dependent variable or the outcome (Y) in this example is welfare state development, i.e. what the researcher seeks to explain. It is called dependent because we expect that the variation in welfare state provisions across systems also depends on one or more independent variables. As a tentative answer, the researcher comes up with a hypothesis. In this example, the variation in welfare state development (Y) is dependent on the relative strength of left-wing parties and trade unions (X) in various countries. This Research Answer, or hypothesis, is a conjecture about the relationship between the dependent variable and the independent variable and is supposed to explain the outcome, i.e. the development of the welfare state (Y). In this case, in a comparative Research Design a theoretical relationship is elaborated to account for the differences and similarities in welfare state development across a number of nations over time. Obviously, any type of ‘X–Y’ relationship in social science is an abstraction from the complexities of the real world. This is deliberate. By means of hypotheses or explanations (X) those factors are included that can account for the variation in Y. This procedure allows us to establish whether or not a meaningful relationship indeed exists, and whether this relationship can be qualified as ‘causal’.2 As we will discuss below, another factor to take into account is the role of economic growth (X2). It goes almost without saying that the growth of the ‘wealth of a nation’ may well depend on the financial revenues of the state. Hence, the Research Design is further extended to include this potential explanation alongside the strength of left-wing parties and trade unions (X1): X1 + X2 → Y.
This formal description then serves as the basis for developing a Research Design that can provide a convincing, reliable and valid Research Answer by means of developing the proper information and selecting the relevant cases (those where there is a welfare state)
and the period covering the observation in terms of development (e.g. after World War II). In more formal terms, a theory posits the dependent variable in the analysis – what is to be explained? Additionally, the researcher wishes to know: what are the most likely ‘causes’ of the phenomenon under investigation? Again, in formal terms: which independent variables, or explanatory factors, can account for the variation of the dependent variable across different systems (e.g. countries) or features of political systems (e.g. parties or policies) or by means of a time series? The answer to this question rests heavily on the development of a ‘correct’ Research Design. Comparative methods can thus be considered as a ‘bridge’ between the Research Question asked and the Research Answer proposed. This is what we label the ‘triad’ RQ → RD → RA (see Box 21.1). Developing a Research Design in political science requires careful elaboration (what is ‘political’?). First, the Research Design should enable the researcher to answer the question under examination. Second, the given answer(s) ought to meet the ‘standards’ set in the social sciences: are the results valid (authoritative), reliable (irrefutable) and generalizable (postulated) knowledge (King et al., 1994)? Third, are the Research Design, the methods and the data used indeed suitable for the research goals set? This chapter will elaborate on these issues and attempt to guide the reader towards linking research questions to research answers by presenting examples of extant research.
Key Points:
• The proper use and correct application of methods by means of a Research Design is essential in all contemporary political science research.
• A correct application implies that the method in use meets the professional ‘standards’ set in terms of validity, reliability and its use in a wider sense, i.e. generalizability of the results.
• The relationship between variables and cases in any social science research is crucial in order to reach empirically founded conclusions that will bring further knowledge in political science.
• The foundation of a Research Design rests with the proper planning of the ‘triad’: relating a Research Question to a Research Answer by means of a Research Design that allows for evidence-based Research Answers.
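The additive notation X1 + X2 → Y used in the welfare state example can be read, under one common interpretation, as a linear model in which Y depends on two explanatory variables. The sketch below is only an illustration of that reading, not a method prescribed by the chapter. All figures are invented, the variable names are hypothetical, and the outcome is constructed from known coefficients so that the estimate can be checked.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented data for six hypothetical countries (not real observations).
# x1: strength of left-wing parties and trade unions (index, 0-100)
# x2: economic growth (per cent per year)
# y:  welfare state effort, constructed here as 10 + 0.2 * x1 + 1.5 * x2
x1 = [10.0, 40.0, 25.0, 60.0, 5.0, 35.0]
x2 = [2.0, 3.0, 1.0, 2.5, 4.0, 1.5]
y = [15.0, 22.5, 16.5, 25.75, 17.0, 19.25]

rows = [(1.0, a, b) for a, b in zip(x1, x2)]  # leading 1.0 is the intercept term
intercept, b1, b2 = ols(rows, y)
print(round(intercept, 3), round(b1, 3), round(b2, 3))
```

Because the outcome was built without noise, the estimated coefficients reproduce the constructed values. With real data the fit would be imperfect, and the choice of cases and period discussed above would shape the result.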
The Substance of Political Science
Political science is – as are many of the social science disciplines – a multifarious discipline in terms of types of theoretical approaches and related diversity of methods. As Gabriel Almond (1990) once remarked: the development of political science can be characterized as having in essence the same menu; it is, however, consumed at different tables. Political science analysis is, therefore, characterized by its variety and often contested across the discipline. Obviously, this is by and large due to the nature of its core substance: the role of politics in society and its systemic processes. Political processes involve individuals, i.e. the political actors (be it individuals like political leaders, MPs, electors, party members, citizens and so on), organizations (parties, governments, movements, trade unions, business associations, NGOs and so on) and institutions (formal, like basic laws and a constitution or informal rules, conventions, traditions or temporary measures). However, both actors and institutions are interdependent. The game of politics is played by the rules (of course, not always), but the actors or players differ in weight and position. Hence, analytical tools must be used in terms of the units of the system in which they operate (the playing field, a democracy, an autocracy or some other system or subsystem). This defines the weight and position of the different actors given the extant sets of institutions and their occurrence at different levels of aggregation: inter- or transnational, national or intra-national. Let us give a few examples: in a democracy all electors are equal and all parties have the same platform to compete. Yet, some parties
are bigger than others and some are in government and other parties are not. This makes a difference for the actors involved and – depending on the systemic rules in use – gives some of them more leeway for action than others. For instance, in a presidential system with a two-party system one party takes all (having the majority), whereas in a parliamentary system, more often than not more than one party enters into a coalition to have a majority in parliament and to form the government. So, a Research Question could well be: does it make a difference whether a political system is presidential or parliamentary in view of representation and policy performance? Another example is: what is the difference between democracy and autocracy in terms of the quality of life? In the eye of the public of the so-called advanced democracies the answer is almost self-evident. Yet others have claimed that the answer is not so straightforward at all. Hence, you need to do (comparative) research to find out what the ‘truth’ is. For this you need a well-developed Research Question and a proper Research Design that is capable of giving a scientifically sound answer. These examples are typically formulated on the level of the nation-state and can only be researched on the macro-level comparing nations. Originally, nation-states were considered to be the main organizing unit of political processes. This is not always the case anymore. In addition to the fact that states were often assumed to be unitary governing bodies, we notice – on the one hand – more and more the emergence of transnational and international regimes (e.g. international governmental organizations – IGOs), and on the other hand, that the nation-state is not a closed unit of analysis, but has a varied intrasystem composition. Hence, political processes cannot and should not be studied by focusing on the nation-state per se.
Instead, the focus can well be a ‘systemic multi-level’ approach to understand the dynamics of politics in the contemporary world (Braun and Maggetti, 2016).
In other words, we need to consider the political world as a more or less organized system that is characterized by systemic (or: within-system) features that can be discerned at various levels (Kübler, 2016). In the real world, this may concern a federation and its constituting parts or the sub-national units within a centralized state (e.g. municipalities) or the members of an international governing body (like the UN or the EU). Hence, both theory and method in political science are characterized by complex interactions between political actors (like parties or organized interests) operating at different levels of the polity that vary in space and time and are in need of different methods and approaches to capture the political processes both descriptively and analytically (Mahoney and Rueschemeyer, 2003; Pennings et al., 2006). If this is the case, then researching political processes and their ramifications is a challenging and complex assignment for any political scientist. It must be made clear, however, that such a ‘systemic’ view should not be considered as a ‘paradigm’ or as a ‘unique selling point’. Rather, we consider this perspective to be a useful heuristic device to show how and when different approaches, methods and related applications can be used best to analyse political activities at different levels of a polity. In short: studying substantial political processes as a systemic activity is our framework of reference that directs the formulation of the relationship between a theory-guided Research Question and potential explanations, i.e. the Research Answer. Hence, before one starts collecting data and information, let alone analysing them, it is – first of all – paramount to elaborate the underlying or guiding theory and formulate a clear and meaningful set of research questions (or hypotheses).
These preliminary activities define the type of Research Design, the analytical tools and the empirical information needed (be it qualitative or quantitative), not the other way around (Brady and Collier, 2004).
Second, studying political processes implies that both change and continuity have to be taken into account as well as the patterned variation in terms of space and time (Keman, 2016a). This means that this approach to politics implies that the comparative method and its logic is seen as a central form of political analysis. However, we do not claim that it is the only way to go. One can make a valid distinction between ‘implicit’ and ‘explicit’ modes of comparison. Implicit comparing is almost always used intuitively by researchers in the social sciences (and by most people in general!) when assessing situations, developments and outcomes. By contrast, explicit comparisons are often used intentionally by academic researchers to explain horizontal (e.g. relations between parties) and vertical political processes (e.g. states vis-à-vis their citizens) controlling for their systemic differences over time and space (e.g. Pierson, 2000; Keman, 2016b: 91). Third, not only different approaches exist which are (sometimes strongly) contested, but this also applies to the mode of analysis and related types of empirical information.3 Our point of departure is that essentially any type of political research should be open to replication, ought to be reliable in terms of the sources and data used and should be valid as regards measurements in relation to the concepts used. This statement appears perhaps to be almost superfluous, but it is not. A major divide within the social sciences and political science in particular concerns qualitative versus quantitative types of data and analysis. Yet, as Brady and Collier (2004) argue, it is less the question whether the method and related techniques are more or less superior, but rather whether or not the scientific standards to be used are indeed honoured. We share this view and maintain that empirical information must be solid and responsibly reported. 
This applies to the historical approach and method as well as to constructivist types of research or quantitative data analysis (King et al. 1994; Pennings et al. 2006).
Key Points:
• Political processes can be analysed by using a systemic and multi-layered approach as a heuristic and organizing device, which postulates that ‘politics’ is manifested at various levels of governance and by various societal activities.
• Political actors and institutions are central to any type of ‘politics’ and ought to be studied in spatial terms and across time (depending on the Research Question).
• Both implicit and explicit types of comparison underlie most types of political research, and analytical inferences should be based on empirical information, be it qualitative or quantitative.
• Fundamental to any type of research is that the standards of empirical–analytical social science are shared and applied.
These ideas guide and structure the elaboration of any Research Design.
The Research Design: Bridging Theory and Outcome
Many students, at whatever level, have excellent theoretical ideas from which to derive a relevant and challenging Research Question. Some of them also have fine ideas about which Research Answer (or: hypothesis) would be viable, if not convincing. However, when it comes to developing a Research Design that indeed underpins the relationship between Research Question and Answer, many students tend to fall short. How come? A reason may well be that in many university courses the emphasis is either on theoretical or substantive questions only, or solely on the application of methods and the collection of evidence as such. There is, however, little training in relating the Research Question to the Research Answer. Yet it is crucial for any student to develop a proper Research Design that is viable and feasible. A Research Design is a crucial step for developing and testing theories and for the verification of rival theories. Hence, as Guy Peters emphasizes, ‘the only thing that should be universal in studying comparative politics … is a conscious attention to explanation and research design’ (1998: 26).
Theory development and Research Design are closely interlinked in political science. Let us elaborate this by means of the ‘history’ of analysing the impact of economic development (X) on the role of the state (Y). An example is ‘Wagner’s Law’, developed in the 1890s (Kohl, 1985). This ‘law’ says that the more the economy grows, the higher the level of state expenditures. Hence, the theory says that the ‘wealth of a nation’ or: X (e.g. expressed as the gross domestic product per capita) correlates positively with the growth of public spending (i.e. the more X changes, the more Y changes) and thus produces higher levels of taxation (revenues of the state). Although Wagner’s Law has been contested, at a general level it appears to work and has been conducive to additional Research Questions and alternative Research Designs to explore whether or not it is indeed a ‘law’ (i.e. it is evident under all circumstances and everywhere) and what the consequences are for a society. Peacock and Wiseman (1961), for example, adopted an alternative approach to this Research Question. Rather than comparing nations, their method focused on a single case (UK) over a long period using annual data and found that Wagner’s Law is more or less a general tendency. But they found, in addition, that particular circumstances – like the incidence of war or an economic depression – explained the observed stepwise increment of public spending and taxation. Therefore, the trend may be correct, but its development is diverse due to specific external factors. What is important to note is that this research design did confirm the basic tenet of Wagner’s Law, but at the same time offered an explanation of why and when the increment of public finances occurred. Manfred Schmidt (1989) also took Wagner’s Law as a point of departure and asked an additional Research Question, namely in what way and to what extent the higher levels of public expenditures were distributed by the state.
He found that the level of economic development makes a difference in
terms of functional spending (e.g. how much is spent on the military and on public welfare respectively). His follow-up argument was that the political institutionalization of decision-making should be included in the Research Design in order to understand the comparative variation in public spending. For this purpose, Schmidt chose to employ a cross-national Research Design in which he developed sub-sets of countries defined by their relative level of economic growth.
What this debate shows is that the relationship between Research Question and Research Answer defines the choices regarding the development of the Research Design. Wagner used economic evidence and applied statistics to underpin the relationship (by means of correlations). His analysis was implicitly comparative and aimed to show a general development in the industrializing world. Peacock and Wiseman employed annual data to measure change in various ways (long-term and short-term fluctuations) in order to observe the longitudinal deviations around the trend using time-series analysis within one country. This Research Design not only confirmed Wagner’s Law by and large, but enabled these researchers to refine the ‘law’ by including external factors. By contrast, Schmidt made explicit use of available global data, not so much to scrutinize Wagner’s
Law, but to analyse how state involvement extended beyond its core activities (like Law and Order, Justice, External Safety, Regulating social and economic relations and providing Infrastructure) in different sub-sets of nations by employing comparative standards of economic development and characteristics of the polity. His principal aim was to demonstrate that the so-called ‘modernization’ thesis4 could not be upheld. Instead, Schmidt showed that modernization was indeed dependent on economic growth but that the political institutionalization of state involvement affected this process in terms of policy choices allocating public revenues.
This example of an application of the ‘triad’ demonstrates that a relatively straightforward Research Question can well be conducive to further elaboration, building on previous Research Answers and related evidence by means of carefully developed Research Designs.
Key Points:
• Without a theory there cannot be a proper relationship between the Research Question and the Research Answer → there is no given Research Design.
• The Research Design must be directly related to the substance and the central concepts of the Research Question → variables and cases must be valid.
Box 21.2: Internal and External Validity
Internal validity refers to the degree to which descriptive or causal inferences from a given set of cases are indeed correct for most cases under inspection. External validity concerns the extent to which the research results can be considered valid for similar cases that were not included. Both types of validity are equally important, but it should be noted that there is a trade-off. The more cases included in the analysis that can be considered as representative, the more ‘robust’ is the overall result (external validity). Conversely, however, the analysis of fewer cases may well be conducive to a more coherent and solid conclusion for the cases that are included (internal validity). It should be noted that the concepts of internal and external validity are ideal-typical: in a perfect world with complete information the standards of both internal and external validity may be met, but in practice this is hard to achieve.
• The Research Design must deliver reliable evidence, which is open to replication5 → the empirical data employed have to be fully documented.
• The goal of a Research Design is to achieve generalizability of the results to advance knowledge → data analysis should be internally and externally valid.
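The correlational logic behind Wagner’s Law discussed in this section (the more X changes, the more Y changes) can be illustrated with a minimal Python sketch. The figures are invented for illustration only and are not real statistics; the function name `pearson` and the variable names are assumptions made for this sketch, not part of any of the studies cited.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical cases (invented values, not real data):
gdp_per_capita = [5, 10, 15, 20, 25, 30]     # X, in thousands
public_spending = [18, 22, 27, 30, 36, 41]   # Y, as % of GDP

r = pearson(gdp_per_capita, public_spending)
print(f"correlation between X and Y: {r:.2f}")
```

A strongly positive coefficient in such a cross-section would be in line with the ‘law’, but, as the Peacock and Wiseman example shows, it says nothing about why or when spending increased within a single country.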
Building the Bridge: Concepts – Variables – Cases
Bridging the gap between abstract concepts and specific measurements involves a tension between the concept as an object of analysis (its meaning) and the concept as a means to accomplish empirical analysis of political phenomena (its observation). This is difficult because many concepts are not directly observable. What is crucial is the extent to which the observations related to a concept are reliable and valid. Reliability is achieved if a measurement tool yields stable and consistent results when repeated over time (e.g. a distance should be measured with a solid tool and not a rubber band). Validity is achieved if we are measuring what we intend to measure (e.g. temperature should be measured with a thermometer and not a barometer). Both criteria should be met as much as possible, although there is often a trade-off between them (Pennings, 2016). This trade-off concerns the choice between the degree of precision with which the concept is transformed into observable terms (the variable) vis-à-vis the relative comprehensiveness of the unit of observation (the case). The more cases are included (covering the variable), the more likely is the generalizability of the research results. However, this always implies a ‘ceteris paribus’ situation or: assuming all other things being constant. Alternatively, one can keep the loss of precision in the conceptual transformation to a minimum, which would lead to one or only a few cases being analysed (recall Box 21.2). For a long time this trade-off has led to the dispute between ‘quantitative’ (i.e. the more cases, the better) and ‘qualitative’ (i.e. the fewer the number of cases, the more exact the observations) approaches (Box 21.3). This dispute is in our view rather quaint: first, there are methods that help to forestall this problem by means of ‘Fuzzy Set Analysis’ and Qualitative Comparative Analysis (QCA).6 Second, precise and comprehensive description may become recycling history and does not contribute to general knowledge (see Pierson, 2000, for how historical observations can instead be used for social science analysis). Third, an alternative Research Design might be feasible: combining ‘many cases’ with ‘few cases’ where the latter analysis can corroborate the former results or bring to light the ‘deviant cases’ or so-called ‘outliers’ (i.e. the ‘exception to the rule’; see Mahoney and Rueschemeyer, 2003 and Box 21.3).
There is a multitude of notions about what a case study is and which purpose it serves in the development of adequate research. However, the methodological discussion about case study research in political science has been
Box 21.3: Qualitative and Quantitative Approaches in the Social Sciences
Qualitative methods focus on understanding the meaning underlying an intention, action, object or phenomenon. In other words, researchers adopting qualitative methods aim to develop an interpretation of the way in which the actors they study understand their actions and the context in which they act. In political science research, qualitative methods are usually contrasted with quantitative methods, which typically deal with large amounts of data and statistical methods, establishing causal relationships between social phenomena. As such, qualitative methods are usually underpinned by an interpretivist position by means of case-based description, while quantitative methods are underpinned by numerical information and variable-based statistical analysis. Given these differences, qualitative and quantitative methods have often been seen as mutually exclusive modes of generating and analysing data (Bryman, 1988; Ercan and Marsh, 2016).
linked with comparative analysis. The dispute concerns how case-based analysis can contribute to the attainment of theoretical, generalizable knowledge (Seha and Müller-Rommel, 2016). Hence, while many conceptions of case study research exist, the understanding of case study research as an instrument for uncovering generalizations is commonly given prevalence over interpretative approaches and the notion of studying cases for their singularity (Yanow et al., 2010: 109). Given the intention to relate Research Questions to Research Answers, the case analysis is expected to enable the researcher to make statements that are general as well as testable; practitioners of case study research in political science have commonly not confined themselves to studying a single case on its own terms, but have instead placed the study of cases under the umbrella of comparative analysis.
The study of ‘democracy’ can serve here as an example. This is a concept that is for most people easy to understand in daily use, but quite complex to pin down in empirical-analytical terms. It has, therefore, to be specified as precisely as possible in order to reach a common understanding of what democracy entails and of its application in terms of observation. Since most, if not all, concepts are complex and abstract, they must be broken down into components that are measurable in order to generate data (Beetham, 1994). In political science most concepts are multidimensional, but the complexity (i.e. the number of dimensions and the number of levels of observation) differs (Goertz, 2005). The basic level entails a general abstract notion like democracy and its institutions (e.g. the constitution, electoral laws, type of parliament, state organization) that is considered to make up a democratic system. The secondary level divides the basic concepts into constitutive dimensions.
In the case of democracy these are, for example, participation by citizens and contestation between candidates for office (Beetham, 1994; Keman, 2002). The third level is the operationalization level. This
level is most detailed in order to enable data collection by means of indicators that allow empirical information to be contained in variables (i.e. measuring the values representing the constitutive dimensions across cases). There exist multiple approaches to conceptualizing and measuring democracy in the literature. They can be broadly divided into either minimalist (to increase the number of cases) or maximalist (including many factors to enhance precision) conceptualizations, each having strengths and weaknesses. A minimalist will opt for a few indicators like universal suffrage, regular elections and basic civil rights (Vanhanen, 1997; Freedom House, 2006). A maximalist will strive for an in-depth analysis by means of a comparable case analysis, taking into account many factors that contribute to the functioning of democracies, such as political equality and policy performance in actual practice. A comprehensive approach is the Varieties of Democracy project (V-Dem) that seeks to capture seven different concepts of democracy over time after 1900 up to the present (Coppedge et al., 2011).7 In order to bridge the gap between theory and data adequately, a number of requirements should be met. It is not sufficient to list a (large) number of dimensions of a concept (like the V-Dem Research programme). The generated data should enable the researcher to specify a causal relationship between those real world phenomena in a way that logically follows from the theory guided Research Question. In addition, the data must clearly establish the temporal and spatial dimensions by being applicable to certain places, actors, time periods, etc. (Pierson, 2000; Keman, 2016b). For example, a theory of democracy should not only describe what democracy is, but also question under which conditions it performs better or worse. If these causal factors are not equally important, their weight must be specified. 
For example, which cultural, political or economic factors are more decisive for the rise and endurance of democracies? (Lipset, 1959; Beetham, 1994).
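The three levels described above (basic concept, constitutive dimensions, indicators turned into variables across cases) can be sketched as a small data structure. This is only an illustrative sketch: the assignment of indicators to the dimensions ‘participation’ and ‘contestation’, the indicator names and the two cases are hypothetical and are not taken from Vanhanen, Freedom House or V-Dem.

```python
# Basic level: the concept of democracy.
# Secondary level: its constitutive dimensions (the dictionary keys).
# Third level: indicators per dimension (hypothetical names for illustration).
DEMOCRACY = {
    "participation": ["universal_suffrage", "regular_elections"],
    "contestation": ["multiple_parties", "basic_civil_rights"],
}


def dimension_scores(observations):
    """Share of indicators observed as present (True) within each dimension."""
    scores = {}
    for dimension, indicators in DEMOCRACY.items():
        present = sum(1 for name in indicators if observations[name])
        scores[dimension] = present / len(indicators)
    return scores


# Two invented cases coded on the four indicators.
cases = {
    "Country A": {"universal_suffrage": True, "regular_elections": True,
                  "multiple_parties": True, "basic_civil_rights": True},
    "Country B": {"universal_suffrage": True, "regular_elections": True,
                  "multiple_parties": False, "basic_civil_rights": True},
}

# The resulting variables: one value per dimension and case.
variables = {name: dimension_scores(obs) for name, obs in cases.items()}
for name, scores in variables.items():
    print(name, scores)
```

A minimalist design would keep the list of indicators short so that many cases can be coded; a maximalist design would add indicators per dimension at the cost of covering fewer cases.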
The basic Research Question underlying Lipset’s (1959) Political Man: The Social Bases of Politics concerned the question of how and under what conditions a democratic political system could be stable and prosperous. The Research Answer formulated by Seymour Lipset is that there is a direct relationship between economic development and democracy. This conclusion was founded upon a Research Design developing an extensive empirical examination, both quantitative and qualitative, by means of an analysis of the bases of democracy across the world. As the evidence showed, a strong causal relationship between economic development (prosperity) and democratic stability could be established.
Arend Lijphart (1968) employed another type of Research Design to study the question: what conditions advance the stability of democracy? Contrary to established ideas that a homogeneous society (in terms of culture, for example religion and language) and pluralist behaviour (seeking a balanced choice across politically organized cleavages) are the stepping stones towards a stable and peaceful democratic society, Lijphart demonstrated by means of a case-based analysis (the Netherlands) that a combination of certain institutions and behaviour of organized actors tended to lead to consensus rather than conflict. This Research Answer led to a new concept in political science, namely consociational democracies. This concept has been adopted by others and led eventually to the development of a more general theory of democracy (known as consensus democracy; Lijphart, 2012). This example shows how a Research Question can lead to a Research Answer derived from a case study, which is then developed into a different theoretical and empirical conceptualization and measurement of democracy by means of a Research Design.
Key Points:
• Proper conceptualization of the central elements of a Research Question and related Answer is crucial for the elaboration of a viable Research Design.
• The type of concept in use is closely related to the type of variables, level of measurement and number of cases to be included in the Research Design. There is always a trade-off between variables (many or few) and cases (many or few) in relation to the method to be used and the eventual analytical results.
• Developing variables from the concepts underlying the Research Question and related to the Answer requires a careful operationalization by means of indicators (real world observations). Essential in this regard is ‘reliability’ by replication.
• The relationship between variables and cases defines the internal and external validity of the initial research outcomes and opens avenues for further research by means of alternative research designs.
Putting Theory into Praxis: Caveats – Pitfalls – Problems
We argued throughout the chapter that the comparative approach in political science is considered to be the ‘royal way’ in linking theory to evidence, enhancing it as a ‘scientific’ discipline. Obviously, the comparative approach is not the only one, but we do contend that the underlying methodology offers a wide spectrum of options to develop a Research Design to carry out political science-related research. Of course, there are – as in other approaches in the social sciences – a number of constraints that may well affect its possibilities and can impair its usefulness in answering the Research Question under consideration. In this section, we discuss some of these limitations. While it is important to be aware of them, it is a challenge for researchers to find appropriate solutions to overcome them.
A major admonition for the social sciences is that there are many theories that fit the same data. You only have to go through the literature concerning research on, for instance, political parties or the welfare state, to understand this pitfall. There is a bewildering number of different research outcomes derived from the same sources (like OECD and UN data as regards
the welfare state research or the use of national or individual-based statistics concerning elections). Obviously, the caveat is that valid and reliable data for the cases we have selected to test theoretical relations are often applied to different Research Questions. If this problem is insufficiently understood, it will undermine the quality of advancing knowledge. Let us discuss a number of these problems.
Conceptual Stretching
Conceptual stretching is the bias occurring when a concept developed for one set of cases is extended to additional cases to which the features of the concept do not apply in the same manner. Sartori (1970) illustrated this problem by means of the ‘ladder of generality’ (i.e. enhancing a wider use of a theoretical concept by extension of its initial meaning). For instance, some researchers have applied a concept in a way that involves a loss of intension (intension meaning that the observations reflect the original features of the concept, i.e. remain close to the original Research Question). Intension will obviously reduce the applicability of a concept across more cases, but it enhances the internal validity of the cases included. Extension will have the opposite effect, and the question is then whether or not the wider use (i.e. in a greater number of cases) impairs the claim for external validity of the analytical results. The more the meaning of a concept moves in the process of operationalization from specific to general, the less relevant the information collected will be for each case, which signifies a loss of internal validity of the research results (see Box 21.2). At the end of the day, it is up to the researcher to make a decision: accepting less generalizable results or aiming for a result that is close to a real world situation.
Equivalence
A related problem of transforming concepts into empirically based indicators concerns the
question of whether the meaning of a concept stays constant across time and space. Landman (2008: 43–6) argues that this problem is not so much a matter of whether a concept is measured with identical results (which is a matter of reliability); whatever solution is chosen, it is up to the researcher to convince us whether or not the degree of equivalence between measured phenomena is acceptable. This pitfall is particularly relevant for quantitative studies. In conjunction with this hazard, it has been noted that reliability problems may arise from including (functional) equivalents that are used to widen the number of cases as a result of conceptual stretching. An example is the concept of the ‘political party’. What defines a political party, and how can such a concept be transformed into measurable and valid entities across a wider universe? Across Europe it may be feasible, but – as we know – in the United States the political parties are organized quite differently and do not perform in the same way as in Europe. Thus, the problem is to what extent a concept transformed into an empirical indicator has indeed an identical meaning across different settings or cultures (van Deth, 1998). However, attempts have been made to develop methods to cope with the problem of overstretching and travelling (Braun and Magetti, 2016). All this implies that a Research Design ought to take into account the selection of cases and, if relevant, the period or time series under investigation (Pierson, 2000; Keman, 2016a).
Family Resemblance
Another caveat with respect to the validity and reliability of data collection and analysis is the use of ‘family resemblance’ to extend the available data (Collier and Mahon, 1993). In its simplest fashion, this method extends the initial concept by adding features which share some of the attributes of the original concept. How far this type of extension can go depends on what the Research Question
is. For example, if we are investigating the behaviour of political parties and define these as any actor that is vote-seeking, office-seeking and policy-seeking, then the concept of a party can be used in a wider sense. If, instead of requiring that all three characteristics are present simultaneously, we allow the inclusion of parties that have fewer features in common, then these may show a mere resemblance. Examples are so-called ‘electoral democracies’, where people can vote but where elected parties cannot decide about the composition of the government in autocratic or ‘hybrid’ systems.
Overall, these potential solutions to enlarge information on more cases may at the same time affect the reliability and validity of the data in use. In part, these solutions create ‘artefacts’ and can easily lead to the above-mentioned caveat that too many data collections serve to underpin different theories in political science. This observation only shows once more that the development and elaboration of a proper Research Design is crucial for the further development of political science. However, even if we have a complete and convincing Research Design that is adequately executed, the actual outcomes need to be correctly interpreted in view of the Research Question that underlies the research results as such.
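The difference between the strict definition of a party (all three features present) and a ‘family resemblance’ definition (only some features shared) can be made explicit in a short Python sketch. The actors, their codings and the threshold of two shared features are hypothetical choices made for illustration; the source does not prescribe a particular threshold.

```python
# The three defining features of a party named in the text.
FEATURES = ("vote_seeking", "office_seeking", "policy_seeking")


def is_party_strict(actor):
    """Strict definition: all three features must be present simultaneously."""
    return all(actor[feature] for feature in FEATURES)


def is_party_family_resemblance(actor, min_shared=2):
    """Looser definition: at least `min_shared` features are present.

    The default threshold of 2 is an assumption made for this sketch.
    """
    shared = sum(1 for feature in FEATURES if actor[feature])
    return shared >= min_shared


# Invented actors coded on the three features.
actors = {
    "Actor A": {"vote_seeking": True, "office_seeking": True, "policy_seeking": True},
    "Actor B": {"vote_seeking": True, "office_seeking": False, "policy_seeking": True},
    "Actor C": {"vote_seeking": True, "office_seeking": False, "policy_seeking": False},
}

strict_cases = [name for name, a in actors.items() if is_party_strict(a)]
resemblance_cases = [name for name, a in actors.items()
                     if is_party_family_resemblance(a)]
print("strict definition:", strict_cases)
print("family resemblance:", resemblance_cases)
```

Lowering the threshold enlarges the set of cases, which is exactly where the risk for validity and reliability described above arises: the added cases share fewer attributes of the original concept.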
Interpreting Results
There are caveats and pitfalls to be taken into account when interpreting the generated data available – whether or not with numerical values. These are:
• Galton’s problem
• Individual and ecological fallacies
• Over-determination
Galton’s Problem
Galton’s problem refers to a situation where the observed differences and similarities may well be caused by exogenous factors that are
common to the cases selected for comparison, such as comparing fiscal policy-making across states in Europe after the introduction of the European Monetary Union requirements, or the choice of Westminster-style parliamentary governance in former British colonies (Burnham et al., 2004: 74). Another example is the process of ‘globalization’ (see Kübler, 2016). Obviously, diffusion among different cases or path dependence over time within a single or a few cases will affect the process of descriptive inference because the explanation is distorted by a common external cause or sequence of events. A possible way to detect such a common cause is either by triangulation (employing more than one method or more than one type of data) or by applying comparative historical analysis (Mahoney and Rueschemeyer, 2003; Keman, 2016a).
Individual and Ecological Fallacies
An ecological fallacy occurs when data measured on an aggregated level (e.g. at the country level) are used to make inferences about individual or group-level behaviour. Butt et al. (2016), for example, claim that survey techniques remain important in political science, but are in dire need of linking the information to contextual factors. Conversely, an individual fallacy is the result of interpreting data measured at the individual or group level as if they represent the ‘whole’ (e.g. using electoral surveys for party behaviour or national attitudes; Welzel, 2007). This type of fallacy occurs often in comparative politics, be it in case studies or in analyses of multiple cases, and shows the need for reflecting on a proper level of measurement in the Research Design.
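Why aggregate data cannot simply be read as evidence about individuals can be shown with a deliberately constructed example in Python. The data are invented so that the relationship between X and Y is positive across country averages but negative among individuals within each country; the names `pearson`, `survey`, `aggregate_r` and `within_r` are assumptions of this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean(values):
    return sum(values) / len(values)


# Invented individual-level observations (x, y) for three hypothetical countries.
survey = {
    "Country A": ([1, 2, 3], [3, 2, 1]),
    "Country B": ([4, 5, 6], [6, 5, 4]),
    "Country C": ([7, 8, 9], [9, 8, 7]),
}

# Aggregate level: correlation between the country averages of x and y.
country_mean_x = [mean(xs) for xs, _ in survey.values()]
country_mean_y = [mean(ys) for _, ys in survey.values()]
aggregate_r = pearson(country_mean_x, country_mean_y)

# Individual level: correlation within each country.
within_r = {name: pearson(xs, ys) for name, (xs, ys) in survey.items()}

print("across country averages:", aggregate_r)
print("within countries:", within_r)
```

Inferring individual behaviour from the positive aggregate relationship would here be an ecological fallacy, since within every country the relationship runs in the opposite direction.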
Over-Determination and Selection Biases
Over-determination and selection biases are risks that emanate from case selection. In particular, when a Research Design is used containing similar cases (assuming ceteris paribus), the chances are high that the dependent variable (Y) is over-determined by another difference that is not actually
accounted for in the Research Design. Conversely, if the cases included in the analysis are fairly homogeneous, there is a chance that a selection bias will go unnoticed. As King et al. (1994: 141–2) note, if the similarities among the cases affect the degree of comparative variation of the independent and dependent variables we cannot draw valid conclusions (Landman, 2008). In addition to these constraints and limitations, comparative methodology is often criticized for being a-theoretical, empiricist and solely country-oriented. However, we argue that the comparative approach may not offer solutions for all types of research problems, but by acknowledging them and finding ways to arrive at plausible answers a Research Design can be enhanced. Comparison is required to make research insightful and scientifically relevant. The alternative is to study a single case (Seha and Müller-Rommel, 2016). Several new approaches that are related to the ‘spatial turn’ in comparative politics can help to advance new insights. One example is multi-level governance (MLG) that criticizes methodological nationalism and other forms of centrism (like Europe or democracies) that often characterize comparative politics. The MLG approach analyses how decision-making competences are shared by policy-relevant actors at different levels of government (i.e. supranational, national and sub-national levels; Kübler, 2016: 63–80). It focuses on the dynamics of cross-level interactions between these actors in one or more policy areas. A second example examines interdependencies, policy diffusion and policy transfer between national states and international agencies (Nölke, 2016). This type of research argues that not only developments within nation-states matter, but also relations between nation-states and inter- and supranational forms of cooperation and bargaining (Nölke, 2016). 
Such a relational approach means that policy outcomes are accounted for by multiple interdependencies in conjunction with domestic factors.
Although these new developments are promising, we have to take into account an important caveat. The existing applications to transnational politics are descriptive case studies and often lack explanatory power and generalizability. Hence, case-based analyses should be regarded as useful additions to conventional approaches and not as replacements of the comparative approach as such (Seha and Müller-Rommel, 2016).
To conclude this section, the constraints and limitations of the comparative method need permanent attention. However, it would be wrong to conclude that – given the complexities and criticisms discussed in this chapter – the comparative approach to politics is therefore misdirected or fallacious. If we accept the fact that most political science is comparative, explicitly or implicitly, then it is one of the strengths of the comparative method that both the advantages and disadvantages are recognized and discussed in terms of its Research Design.
Key Points:
• There are many hazards and pitfalls in comparative methods that ought to be taken into account to link theory and evidence in a plausible fashion.
• Conceptual travelling is a sensitive instrument to widen the case selection as long as overstretching is avoided. The use of ‘family resemblance’ and ‘functional equivalence’ can remedy this, if used prudently.
• Interpretation problems are often due to biases like Galton’s problem and over-determination, as well as to individual and ecological fallacies. Avoiding these problems will reduce the probability of drawing invalid conclusions.
Conclusions
We began with the claim that a Research Design in political science is essential for the development of high-quality research. It is the crucial link between what the researcher aims to investigate and the ideas that may answer this. Essentially a Research Question ought to be ‘theory guided’. In fact, there is
one exception to this ‘rule’: a research project may be developed to test an existing research result in order to advance it by means of new data or cases. Yet, the intrinsic goal of research should be to advance and extend our ‘knowledge’. The foundation of a Research Design rests with the proper planning of the ‘triad’: relating a Research Question to a Research Answer by means of a Research Design that allows for evidence-based research answers (Box 21.1). Therefore, we have stressed throughout the importance of developing reliable and valid data that enable the researcher to bridge the question and answer under review. Additionally, the issue of replication was mentioned because it allows other researchers not only to replicate but also to amend and further existing results. The goal of a Research Design is to achieve generalizability of the results to advance knowledge by means of data analysis, which is based on well-documented and reliable information and is internally and externally valid (Box 21.2). Advancing theory and data is an intrinsic value in the academic community. It is a ‘collective good’! To make this work, a Research Design needs to be characterized by an adequate elaboration. First of all, the conceptualization of the central elements of a Research Question and the related Answer is crucial for the elaboration of a viable Research Design. The type of concept used is closely related to the type of variables, level of measurement and number of cases to be included in the Research Design. However, there is always a trade-off between variables (many or few) and cases (many or few) in relation to the method used and the eventual results. The analysed relationship between variables and cases defines the internal and external validity of the initial research outcomes and opens avenues for further research by means of alternative research designs. 
This contention applies to any type of Research Design, be it qualitative or quantitative, a multiple case analysis or a single case study. In our view this distinction is not relevant for developing a Research Design, as long as the design conforms to the shared rules and standards that are in use and have been accepted in the social sciences (Box 21.3). The comparative approach, implicitly or explicitly, is defined in this chapter as the ‘royal way’ to do research in political science. This position may be challenged, but it is nevertheless a useful point of departure for developing a proper, valid and reliable Research Design. It enables the researcher to conduct thoughtful research because it forces her to include context, time and dimension. Further, as we discussed, it makes her aware of the pitfalls and problems related to developing a viable, feasible and fruitful Research Design.
Notes
1 This chapter is an original text, but also makes use of earlier publications by the author. See Keman and Woldendorp (2016; Keman and Pennings, 2017).
2 Causality is a fraught concept in the social sciences and is hard to establish. Yet, it is now accepted that if the variation in the dependent variable (Y – here: more or less expansion of the welfare state) is evidently and systematically related to the variation in (one of) the independent variable(s) and a theory as to why this is the case (X1: political indicators and X2: economic factors), then we can assume causality and the data analysis is meant to demonstrate the relative weight of each variable (see also Baumgartner, Chapter 18, this Handbook).
3 By ‘empirical’ we mean all observations that can be related to the ‘real world’. Alternatively, it can be considered as ‘evidence’ be it quantitatively (as a numeric variable) or qualitatively (as an interpreted description). Yet, both approaches are subject to the shared rules and standards that apply within the social sciences. See King et al. (1994; Brady and Collier, 2004; Berg-Schlosser, 2012).
4 Modernization theory refers to a model of a progressive transition from a ‘pre-modern’ or ‘traditional’ to a ‘modern’ society. Modernization theory originated from the ideas of German sociologist Max Weber. The theory emphasizes those factors – e.g. economic growth – that can bring development in the ‘Third World’ in the same manner as for earlier developed countries in Europe or North America. Modernization theory was a dominant approach in the social sciences in the 1950s and 1960s.
5 Replication signifies the possibility for other researchers to repeat the original analysis independently using the same evidence. This implies that the researcher ought to make his or her data publicly available. Nowadays there is a development to make evidence available via ‘open data’ archives.
6 QCA attempts to cater for ‘multiple causalities’. This type of analysis allows for the handling of many variables in combination with a relatively high number of cases simultaneously. Ragin (2008) claims that this type of Research Design avoids the trade-off between many cases/few variables vs few cases/many variables. The logic of comparison employed is based on Boolean algebra or ‘Fuzzy Set Logic’ in which qualitative and quantitative information is ordered in terms of necessary and sufficient conditions as regards the relationship under investigation (see also Wagemann, Chapter 20, this Handbook).
7 The concepts concern: participatory, consensual, majoritarian, deliberative, egalitarian, electoral and liberal democracy in as many countries as possible using almost 400 specific indicators. These data are collected worldwide by country experts and their results are cross-validated and checked for reliability.
References
Almond, G.A. (1990) Discipline Divided. Schools and Sects in Political Science, London and Newbury Park: Sage.
Beetham, D. (1994) Defining and Measuring Democracy, London: Sage.
Berg-Schlosser, D. (2012) Mixed Methods in Comparative Politics: Principles and Applications, Basingstoke: Palgrave Macmillan.
Brady, H. D. and D. Collier (2010) Rethinking Social Enquiry: Diverse Tools, Shared Standards (2nd edition), Lanham, MD: Rowman & Littlefield.
Braun, D. and M. Maggetti (eds) (2015) Comparative Politics: Theoretical and Methodological Challenges, Cheltenham: Edward Elgar Publishing.
Braun, D. (2015) Between parsimony and complexity - system-wide typologies as a challenge in comparative politics. In: Braun, D. and M. Maggetti (eds) (2015) Comparative Politics: Theoretical and Methodological Challenges, pp. 90–124. Cheltenham: Edward Elgar Publishing.
Bryman, A. (1988) Quantity and Quality in Social Research, London: Routledge.
Burnham, P., K. Gilland Lutz, W. Grant and Z. Layton-Henry (2004) Research Methods in Politics, Basingstoke: Palgrave Macmillan.
Butt, S., S. Widdop and L. Winstone (2016) The role of high-quality surveys in political science research. In: Keman and Woldendorp (eds) pp. 262–280.
Castles, F. (1982) The Impact of Parties: Politics and Policies in Democratic Capitalist States, London and Beverly Hills CA: Sage.
Collier, D. and J. E. Mahon (1993) Conceptual ‘Stretching’ Revisited: Adapting Categories in Comparative Analysis, American Political Science Review, 87(4): 845–855.
Coppedge, M., J. Gerring, D. Altman, M. Bernhard, S. Fish, A. Hicken, M. Kroenig, S.I. Lindberg, K. McMann, P. Paxton, H.A. Semetko, S-E. Skaaning, J. Staton, J. Teorell (2011) Conceptualizing and Measuring Democracy: A New Approach, Perspectives on Politics, 9(2): 247–267.
Dogan, M. and D. Pelassy (1990) How to Compare Nations: Strategies in Comparative Politics (2nd edition), Chatham, NJ: Chatham House.
Ercan, S. A. and D. Marsh (2016) Qualitative methods in political science. In: Keman, H. and J. Woldendorp (eds) (2016) Handbook of Research Methods and Applications in Political Science, pp. 309–322. Cheltenham: Edward Elgar.
Freedom House (2006) Freedom in the World 2006: The Annual Survey of Political Rights and Civil Liberties, Lanham, MD: Rowman & Littlefield.
Goertz, G. (2005) Social Science Concepts: A User’s Guide, Princeton, NJ: Princeton University Press.
Hicks, A. M. and D. H. Swank (1992) Politics, Institutions, and Welfare Spending in Industrialized Democracies, 1960–82, American Political Science Review, 86(3): 658–674.
Hobsbawm, E. (1994) The Age of Extremes: The Short Twentieth Century, 1914–1991, New York: Vintage Books.
Keman, H. (ed.) (2002) Comparative Democratic Politics: A Guide to Contemporary Theory and Research, London: Sage.
Keman, H. (2016a) On Time and Space: The Historical Dimension in Political Science. In: Keman and Woldendorp (eds) pp. 64–78.
Keman, H. (2016b) Systems theory: the search for a general theory of politics. In: Keman and Woldendorp (eds) pp. 79–93.
Keman, H. and J. J. Woldendorp (eds) (2016) Handbook of Research Methods and Applications in Political Science, Cheltenham: Edward Elgar Publishing.
King, G., R. O. Keohane and S. Verba (1994) Designing Social Inquiry: Scientific Inference in Qualitative Research, Princeton, NJ: Princeton University Press.
Kohl, J. (1985) Staatsausgaben in West-Europa: Analysen zur langfristigen Entwicklung der öffentlichen Finanzen, New York/Frankfurt a.M.: Campus Verlag.
Kübler, D. (2016) De-nationalization and multilevel governance. In: Braun and Maggetti (eds) pp. 55–89.
Landman, T. (2008) Issues and Methods in Comparative Politics: An Introduction (3rd edition), London: Routledge.
Lijphart, A. (1968) The Politics of Accommodation: Pluralism and Democracy in the Netherlands, Berkeley, CA: University of California Press.
Lijphart, A. (2012) Patterns of Democracy: Government Forms and Performance in Thirty-Six Countries (2nd edition), New Haven, CT and London: Yale University Press.
Lipset, S. M. (1959) Political Man: The Social Bases of Politics, Garden City, NY: Anchor Books.
Mahoney, J. and D. Rueschemeyer (eds) (2003) Comparative Historical Analysis in the Social Sciences, Cambridge: Cambridge University Press.
Nölke, A. (2016) International relations and transnational politics. In: Keman and Woldendorp (eds) pp. 169–183.
Peacock, A. T. and J. Wiseman (1967) The Growth of Public Expenditure in the United Kingdom (2nd edition), London: Allen & Unwin. [First published 1961, Princeton University Press].
Pennings, P. (2016) Relating theory and concepts to measurements: Bridging the gap. In: Keman and Woldendorp (eds) pp. 54–63.
Pennings, P., H. Keman and J. Kleinnijenhuis (2006) Doing Research in Political Science: An Introduction to Comparative Methods and Statistics (2nd edition), London: Sage.
Peters, B. G. (1998) Comparative Politics: Theory and Methods, Basingstoke: Palgrave Macmillan.
Pierson, P. (2000) Politics in Time: History, Institutions, and Social Analysis, Princeton, NJ: Princeton University Press.
Ragin, C. C. (2008) Redesigning Social Inquiry. Fuzzy Sets and Beyond, Chicago, IL: University of Chicago Press.
Sartori, G. (1970) Concept Misformation in Comparative Politics, The American Political Science Review, 64(4): 1033–1053.
Schmidt, M. G. (1989) Social Policy in Rich and Poor Countries: Socio-Economic Trends and Political-Institutional Determinants, European Journal of Political Research, 17(6): 641–659.
Schmidt, M. G. (1996) When Parties Matter: A Review of the Possibilities and Limits of Partisan Influence on Public Policy, European Journal of Political Research, 30(2): 155–183.
Seha, E. and F. Müller-Rommel (2016) Case study analysis. In: Keman, H. and J. Woldendorp (eds) (2016) Handbook of Research Methods and Applications in Political Science, pp. 419–429. Cheltenham: Edward Elgar.
van Deth, J. (1998) Comparative Politics: The Problem of Equivalence, London/New York: Sage.
Vanhanen, T. (1997) Prospects of Democracy: A Study of 172 Countries, New York: Routledge.
Welzel, C. (2007) Are Levels of Democracy Influenced by Mass Attitudes?: Testing Attainment and Sustainment Effects on Democracy, International Political Science Review, 28(4): 397–424.
Yanow, D., P. Schwartz-Shea and M. J. Freitas (2010) Case study research in political science. In: A. J. Mills, G. Durepos and E. Wiebe (eds) The Encyclopedia of Case Study Research, vol. 1, Thousand Oaks, CA: Sage, pp. 108–113.
22 Experiments
Anna Bassi
Social sciences have long been thought to be non-experimental disciplines. However, as history has taught us, disciplines are not inherently experimental, but become so when their theoretical concepts develop in a way that makes them suitable for controlled manipulation. In the early 17th century, Galileo Galilei began to conduct physics experiments because theoretical concepts such as force and mass had become well defined and easy to control and manipulate. Similarly, in the early 20th century, psychology became an experimental science when psychologists started to develop theories about how different stimuli could affect individual behavior. With their similar focus on individual behavior, the disciplines of economics and political science soon followed. As was the case for psychologists, political scientists have turned to experimental settings with the purpose of creating artificial situations in which they can observe the behavior of subjects and analyze the factors that determine such behavior. Experiments in economics and political science are also
somewhat analogous to those in physics, chemistry, and other natural sciences, with the only difference being that the subject of analysis is the behavior of human beings. Common among these various disciplines is the very reason that scientists became motivated to turn to experiments; that is, the realization that knowledge can be gained in a controlled experimental setting regardless of time, place, and context. The first study that adopted experimental methods in political science appeared in the 1920s (Gosnell, 1926). Gosnell conducted a field experiment to investigate the effects of information on voter turnout in Chicago. He hypothesized that turnout on election day would increase if voters were provided with information about registration procedures and they were encouraged to vote. In the summer preceding the Coolidge versus Davis race, Gosnell tested his theory in 12 districts; constituents in six districts received information about registration procedures by mail, and those in the remaining six districts
did not. Using this methodology, Gosnell effectively manipulated the environment in six districts and controlled the effect of this manipulation by observing turnout in the six districts that were not provided with information. He observed that turnout was 8% higher in the districts with information than in the districts that did not receive it. After Gosnell’s study, however, experimental methods did not become widespread in political science until the early 1970s. Except for a few experimental studies of voter turnout (Eldersveld, 1956; Eldersveld and Dodge, 1954) and an early push toward the adoption of field experimentation (Campbell and Stanley, 1963), empirical research in political science focused primarily on surveys and archival studies. However, the inability of these methods to answer significant research questions, such as those that involve the direction of causality among variables, and the emergence of new questions that were particularly suitable for experimentation, such as those that sought information about the effect of institutions on political behavior, led to the reappearance of experimentation in political science. Many experimental laboratories began to take shape and contributed to the growth of the experimental methodology in different directions: Stony Brook laboratories were established at The State University of New York to perform political psychology experiments under the direction of Tanenhaus and Lodge; at Yale, the University of Michigan, and the University of California, Los Angeles (UCLA), experimental programs were pursued with the support of Kinder and Iyengar; while at the California Institute of Technology, Plott established experimental laboratories to run political economy experiments. By the beginning of the 1990s, experimental methods had achieved enough recognition to appear regularly in general audience journals, such as the American Political Science Review and the American Journal of Political Science.
Groundbreaking articles such as those by Gerber and Green (2000) provided tools to overcome many
methodological fallacies associated with earlier field experiments and generated the reemergence of field experimentation in political science. Technological advances such as computer-assisted and on-line surveys also helped spur the expansion of experimentation, leading the way to a notable growth in experimental studies. During the past 20 years, experimental studies have blossomed in the field of political science (see Druckman et al., 2009, 2011 for a full account of the development of experimental methodology in political science) and dozens of new subfields have opened to experimentation. In 2010, the American Political Science Association welcomed the addition of an Experimental Research section (Section 42) and, in 2014, the launch of the Journal of Experimental Political Science provided a stable platform for experimental studies of any type. Despite the growth of the experimental methodology, experimental studies are nonetheless the target of several criticisms. This chapter discusses the theoretical foundations, the advantages, and the challenges of the experimental methodology both in theory and in practice, with specific emphasis on studies that are commonly referred to as ‘political economy experiments’, which are incentivized, controlled laboratory experiments that are focused mainly on testing theoretical predictions and conjectures.
The Experimental Method
One reason that experimental methods are unique is that they provide the ability to study non-naturally occurring phenomena. The engine of the scientific process, which lies in the alternation of theory and empirical evidence, not only requires analysis of observed phenomena but also requires exploration of counterfactuals. However, counterfactuals do not necessarily occur in nature. The experimental laboratory provides a
unique environment to generate counterfactual situations and test theoretical conjectures to advance the scientific process. Experimentation is defined by the researcher’s intervention upon nature and the control that experiments encapsulate to help provide answers to causal questions. The experimenter exercises three types of control. First, she establishes an experimental setting that allows her to control for nuisance factors that could potentially interfere with the causal relationship under study. Second, she generates different treatments to isolate causal factors and identify the effect(s) of each individual factor. Third, she exercises control over potential contamination by scheduling observations in such a way that causal relationships are not affected by nuisance effects that might occur simultaneously with the treatment. When an experimenter is not able to exercise all three types of control, ‘quasi-experiments’ or ‘natural experiments’ may emerge. In quasi-experiments, at least one of the three types of control is missing. For example, experiments in educational settings often lack the first and third types of control. For ethical reasons, the experimenter cannot expose participants to the full set of treatments that she wishes to apply: for example, the experimenter cannot force participants to smoke or not smoke, be (un)athletic, eat (un)healthily, etc. Likewise, for either ethical or policy reasons, she might not be able to pick and choose participants to be exposed to a specific treatment as Gosnell did (Gosnell, 1926), being constrained to administer the same experimental treatment to all students in the same classroom or school. Natural experiments are a subset of quasi-experiments in that they lack the second type of control. In natural experiments, nature acts as the experimenter, generating an exogenous treatment and exposing only a subset of subjects to it. In other words, nature acts in a way that is close to how the researcher would have intervened.
Experiments can be divided into two large groups: field experiments and laboratory
experiments. Field experiments typically are conducted in settings where the phenomenon under study naturally occurs, providing a more realistic environment than a laboratory setting, but limiting the experimenter’s control over potential confounding effects. Laboratory experiments typically are characterized by a common physical location where subjects participate in the experiment. In this location, the laboratory, the experimenter exercises maximum control over all aspects of the phenomenon under study. Most experiments fall into either of these two broad categories, but some may fall somewhere between the two. For example, on-line experiments, which allow subjects to participate largely via the internet, are neither purely laboratory nor purely field experiments. In fact, subjects neither interact in the same controlled physical location nor in the natural location where the phenomenon occurs. As a result, the experimenter has greater control over the creation of settings than in a field experiment but loses control over potential confounding effects because of her inability to observe the subjects directly while they are exposed to the treatment. Details regarding the two broad categories of field and laboratory experiments are given in the following subsections.
Field Experiments
Field experiments typically are performed in settings with a high degree of naturalism. To control for unobservable confounding effects, the experimenter randomly assigns the treatment among participants. Random assignment, if performed with a large enough number of participants, balances observable and unobservable confounding variables across the treatment groups in expectation, thus allowing causal relationships to be established. The degree of naturalism in an experiment is established according to four criteria (Gerber and Green, 2012): (i) whether the experimental manipulation resembles the
phenomenon in the real world, (ii) whether experimental subjects resemble the actors who usually are involved in that phenomenon, (iii) whether the context in which subjects are exposed to the treatment resembles the context of interest, and (iv) whether the outcome measure resembles the actual phenomenon outcome. For example, Gosnell’s field experiment (Gosnell, 1926) tested a realistic get-out-the-vote campaign rather than a simulated campaign. It targeted eligible voters in certain Chicago neighborhoods where elections would actually take place instead of using a convenience sample of participants at, for example, a college campus. Finally, Gosnell’s experiment assessed actual voter turnout behavior rather than the mere intent to turn out or simulated voting behavior. One of the benefits of field experiments compared to laboratory experiments is that field experiments are considered to have high external validity. Because the manipulation of the environment, the participants, the setting, and the outcomes more closely resemble those that would occur in the real world, causal relationships established through field experiments are considered to have more validity and generalizability than those established in settings where these relationships are artificially created. As Gerber and Green argue, ‘field experiments strive to be as realistic and unobtrusive as possible’ (2012: 9), as the less aware subjects are of being part of a study, the more naturally they will behave and the less they will be biased by what they think the experimenter is trying to accomplish.
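The role that random assignment plays in this subsection can be illustrated with a small simulation. The following Python sketch is purely illustrative and is not taken from Gosnell or from Gerber and Green: the unobserved trait (`interest`), the baseline turnout probabilities, the sample size, and the assumed treatment effect of 0.08 are all hypothetical. It shows that, with many subjects, randomly formed groups end up nearly identical on a trait the experimenter never measures, so that a simple difference in mean turnout recovers the assumed effect.

```python
import random


def simulate_field_experiment(n_subjects=40_000, true_effect=0.08, seed=1926):
    """Simulate a randomized get-out-the-vote experiment (all numbers hypothetical)."""
    rng = random.Random(seed)

    # Hypothetical trait the experimenter cannot observe (say, political interest),
    # scaled to [0, 1]; it raises the probability of voting.
    interest = [rng.random() for _ in range(n_subjects)]

    # Random assignment: shuffle the subject indices and treat the first half.
    order = list(range(n_subjects))
    rng.shuffle(order)
    treated = set(order[: n_subjects // 2])

    # Turnout depends on the unobserved trait and, for treated subjects, on the treatment.
    voted = []
    for i in range(n_subjects):
        p_vote = 0.30 + 0.30 * interest[i] + (true_effect if i in treated else 0.0)
        voted.append(1 if rng.random() < p_vote else 0)

    treated_ids = [i for i in range(n_subjects) if i in treated]
    control_ids = [i for i in range(n_subjects) if i not in treated]

    def group_mean(values, ids):
        return sum(values[i] for i in ids) / len(ids)

    # Balance check on the unobserved trait, then the difference-in-means estimate.
    balance_gap = group_mean(interest, treated_ids) - group_mean(interest, control_ids)
    estimate = group_mean(voted, treated_ids) - group_mean(voted, control_ids)
    return balance_gap, estimate


if __name__ == "__main__":
    gap, est = simulate_field_experiment()
    print(f"difference in unobserved trait between groups: {gap:+.4f}")
    print(f"estimated treatment effect on turnout:         {est:+.4f}")
```

With a small `n_subjects` the balance gap can be sizeable by chance, which is why the text conditions the argument on a large enough number of participants.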
Laboratory Experiments
Laboratory experiments are experiments in which subjects are recruited to come to a common physical location (a laboratory) and the researcher controls all aspects of the environment at that location except for subject behavior. Subjects typically are given instructions (also called ‘scripts’ or ‘frames’) that
describe the choices they will make during the experiment. Laboratory experiments can be divided into two subcategories: political psychology experiments and political economy experiments. Although both categories share the similar goal of studying attitudes and behavior, they use different approaches to address similar questions. As Ariely and Norton (2007) suggest, the different approaches derive from the different methodologies by which the two disciplines recreate real-world phenomena in the laboratory environment; psychologists use deception and economists use incentives. After deciding which aspect(s) of the phenomenon to bring to the laboratory for testing, the experimenter must find a way to recreate this aspect within the experiment. This stage is the most important in designing an experimental study and, as discussed in the next sections, affects the study’s validity. It is at this stage that economists and psychologists take different routes. Economists bring a phenomenon to the laboratory by abstracting it from inessential features and focusing only on variables that the theory identifies as those that drive agents’ behavior. Because most political economy theoretical models are derived from assumptions found in rational agent theory, economists recreate in the experiment the incentives that theoretical models claim agents in the real world face. For psychologists, agents’ decisions are context-dependent, meaning that environmental factors that are bypassed by rational agent theory are claimed to affect agents’ behavior. As a result, psychologists expend much effort in recreating contextual aspects that are considered key factors in explaining real-world phenomena. As these contextual aspects might be perceived as preposterous by the experiment participants, experimenters need to use stories and deception to ensure that people react to the contexts as naturally as they would in the real world (Kimmel, 1998). 
For psychologists, the use of deception is essential to represent contextual cues in the experiment. By contrast, economists have no use for deception, as they aim
to abstract the phenomenon from specific contexts. For economists, deception only carries a potential risk to the validity of the experiment. In fact, the suspicion and mistrust that deception might evoke in participants could affect the trust in the experimenter and in the experimental institutions that the experimenter is trying to recreate. Dickson (2011) and Bol (2018) provide full accounts and discussions of the conceptual and methodological differences between political economy and political psychology experiments. The rest of this chapter is devoted primarily to political economy experiments and highlights the principles on which political economy experiments build and ways that political economy experiments differ from other types of experimental methods.
Political Economy Experiments
The first studies in the political economy experiment tradition were developed in the 1950s by Fouraker and Siegel (Siegel and Fouraker, 1960), who conducted a series of experiments aimed at analyzing agents’ bargaining behavior. Fouraker and Siegel combined, for the first time, methods belonging to economics (theoretical models of bargaining) with methods pertaining to psychology (experimental methods), while adopting real incentives to motivate subjects. Although these studies were later developed to test hypotheses from the psychology literature, Fouraker and Siegel laid the foundations for what became the distinguishing features of experimentation in economics: incentivized, controlled, laboratory experiments. After Siegel’s death in 1961 and a period of slow growth of experimentation in economics, Charles Plott became the forerunner of political economy experiments in the early 1970s. Plott believed that the similarity between social choice theory and economic theory, that is, that both types of theory build on fundamental assumptions such as rational
agents and equilibrium behavior, makes political science topics and questions like voting and committee behaviors ideal candidates to be studied in experimental settings. During the past five decades, political economy experiments have grown in number in both economics and political science. Groundbreaking theories, such as those that involve spatial models, agenda-setting models, models of turnout and political participation, and legislative bargaining, have been tested in various laboratory settings (see Palfrey, 2009, 2012, for a thorough survey of the development and contributions of political economy experiments to political science). Novel theoretical models and new phenomena have emerged and have been explored thanks to the experimental methodology. Notable recent examples are the studies of Grosser and Palfrey (2019), who propose and test a variation of the citizen candidate model; Merlo and Palfrey (2018), who use laboratory data to conduct a test on the predictive power of competing behavioral models of voter turnout (such as instrumental voting, altruistic voting, expressive voting, and ethical voting); and Casella et al. (2012), who test theoretical models of vote-trading. Importantly, new fields have started to open to the experimental methodology. Researchers in comparative politics have increasingly adopted what are referred to as ‘lab-in-the-field experiments’ to study political behavior and attitudes in theoretically relevant populations. Examples of this kind of laboratory experiment include the study conducted in Nepal on collective behavior in conflict-plagued regions by Gilligan et al. (2014), and the study on the effect of sanctioning on public goods contributions run in Uganda by Grossman and Baldassarri (2012). Political economy experiments have also started to develop in the field of international relations, from the early application of Pilisuk (1984), who studied an arms race dilemma, to the study of Tingley and Walter (2011), which analyzes the effect of costless threats on the probability of deterring opponents’ attacks.
Alvin Roth proposes a classification of political economy experiments as a function of the experiment’s primary goal (Roth, 1986, 1988, 1995). The first of three classes includes all experiments aimed at testing hypotheses derived from formal theoretical models. Experimental methods contribute to the advancement of science by discriminating among alternative theories, testing the predictive power and robustness of a single theory, and shedding light on elements that the theory has ignored. Examples of this type of experiment include ultimatum games experiments (Güth et al., 1982), coordination games (Van Huyck et al., 1990), and bargaining game experiments (Roth, 1983). These contributions constitute the classic and probably most common goal of experimental research. The second class includes all those experiments that investigate phenomena that cannot be explained by existing theories. Many experimental projects start with the goal of testing a theory and then extend the aim to include analysis of anomalies, that is, behaviors that violate well-established and prevailing theories. Examples of this type of experiment include the Ellsberg paradox (Ellsberg, 1961), the Allais paradox (Allais, 1953), and the preference reversal phenomenon (Grether and Plott, 1979). This category of experiments has given rise to the formulation of new theories that are capable of explaining behavioral regularities, and the creation of so-called ‘behavioral economics’. While experimentation represents primarily a method of investigation, behavioral economics is mostly concerned with the process of revising classic rational agent theories. The third class includes those experiments that are intended to support or guide policy choices. An example is the series of experiments that explore the conflict between individual interest and group interest. 
Studies that belong to this line of research appear in different disciplines (social dilemmas experiments in psychology, tragedy of the commons in political science, and public goods in economics) and they share the goal
of providing tools to attenuate the conflict between individual and group interests to achieve a common desirable outcome (Isaac et al., 1985; Plott, 1983). Experiments that aim at testing new voting procedures before they are adopted for real elections also belong to this class of studies. Examples of this type of experiment are the studies by Forsythe et al. (1996), who analyze the impact of polls on election outcomes; Blais et al. (2010), who test how well rational choice theory predicts individual behavior; and Bassi (2015), who analyzes the extent of strategic voting under different voting procedures.
Principles of Experimentation
A laboratory experiment needs to reproduce the abstract context of the theory that it is designed to test in the artificial setting of the laboratory. The experimenter creates a self-contained environment – a microcosm – that consists of individual agents together with an institution through which they interact. The experimental subjects are the agents who act in this microcosm and the institution specifies which interactions are allowed and the potential outcomes of each interaction. For example, to analyze the way in which different voting systems affect the likelihood that voters will vote strategically, the experimenter needs to create an environment where experimental subjects can act as voters (agents) in an election that is administered using a specific voting system (institution). Controlled laboratory experiments are uniquely positioned for studying the behavior of agents within different economic and political institutions, even those institutions that do not exist in nature. The researcher can test different theories about how these institutions affect human behavior. The subsequent analysis of agents’ behavior in an experimental setting can provide support for extant theories or serve as a tool to refine these theories.
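The microcosm described above, agents together with an institution, can be sketched in a few lines of Python. This is an illustrative toy and not a design taken from any study cited in this chapter: the three voter types, their payoffs, the group sizes, and the alphabetical tie-break are all hypothetical assumptions. The institution is a plurality rule that maps ballots to a winner, and each agent's payoff depends only on which candidate wins.

```python
from collections import Counter

# Hypothetical payoffs (in experimental currency units) that each voter type
# receives depending on which candidate wins the election.
PAYOFFS = {
    "type_A": {"A": 10, "B": 5, "C": 0},
    "type_B": {"A": 0, "B": 10, "C": 5},
    "type_C": {"A": 0, "B": 5, "C": 10},
}

# Hypothetical electorate: four type_A, three type_B, and two type_C subjects.
ELECTORATE = ["type_A"] * 4 + ["type_B"] * 3 + ["type_C"] * 2


def plurality_winner(ballots):
    """Institution: the candidate with the most votes wins (ties broken alphabetically)."""
    tally = Counter(ballots)
    return min(tally, key=lambda candidate: (-tally[candidate], candidate))


def sincere_ballot(voter_type):
    """Vote for the candidate with the highest payoff for this voter type."""
    payoffs = PAYOFFS[voter_type]
    return max(payoffs, key=payoffs.get)


def strategic_ballot(voter_type):
    """type_C voters abandon their trailing favorite for their second choice."""
    return "B" if voter_type == "type_C" else sincere_ballot(voter_type)


def run_election(ballot_rule):
    """Collect one ballot per subject and return the winner and payoffs by type."""
    ballots = [ballot_rule(voter_type) for voter_type in ELECTORATE]
    winner = plurality_winner(ballots)
    return winner, {t: PAYOFFS[t][winner] for t in PAYOFFS}


if __name__ == "__main__":
    print("sincere voting:  ", run_election(sincere_ballot))
    print("strategic voting:", run_election(strategic_ballot))
```

Under sincere voting, A wins with four of nine votes and type_C subjects earn 0; when type_C subjects switch to B, B wins five to four and they earn 5. Swapping `plurality_winner` for another rule would change the institution while leaving the agents and their payoffs untouched, which is the kind of comparison the text has in mind.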
As part of the scientific dialogue, theories organize our knowledge and help us predict behavior in new situations. Empirical analysis turns up regularities that are not explained by theory, thus leading to theory refinements. The experimental method provides a unique ‘ground zero’ empirical environment for testing a theory. If a theory fails to predict agents’ behavior in the simplified artificial environment of a laboratory, then the theory needs to be revised and improved before it is applied in empirical settings with higher degrees of naturalism. The use of data that are generated under controlled laboratory conditions has begun to play a key role in bridging the gap between the theoretical world in which the researcher can keep all the variables constant, except the focus variables (creating ceteris paribus conditions), and the real world in which no such control is possible. Hence, experiments are situated in a key intermediate place on the path of scientific dialogue between the formulation of theoretical conjectures about real-world phenomena and their final applications.
Induced-Value Theory
The first principle of experimentation is that, to provide a test for a theoretical claim, the experiment must share the same characteristics as the theory it is designed to test. Rational agent theory assumes that agents assign values to each outcome and that, given these values, the institutional differences have predictable effects on the subjects’ choices. To test a theory, the experimenter must induce subjects to have the same value orderings over outcomes as are assumed by the theory. As mentioned above, political economy experiments achieve this goal through the use of incentives. To motivate participants to behave as they would in real life, the experimenter must explicitly define the incentives as an integral part of the experimental design so that participants can fully evaluate the costs and benefits of each
decision, thereby mirroring the theoretical claims and predictions (Edwards, 1961; Hertwig and Ortmann, 2001). Vernon Smith organized the methodology of experimental political economy around a set of ‘precepts’ of the experimental design (Smith, 1976), which focused on the importance of incentives, following the lead of Siegel (Siegel and Fouraker, 1960). According to the induced-value theory, the proper use of a reward medium induces specific characteristics in experimental subjects such that the background characteristics of the subjects become irrelevant. Smith (1976, 1982) specifies four norms that a reward medium must satisfy to form an adequate incentive structure: monotonicity, salience, dominance, and privacy. Monotonicity (or non-satiation) is the first norm of induced-value theory. According to monotonicity, subjects must prefer a choice that provides more of the reward medium over a choice that provides less reward, assuming the two choices lead to an otherwise equivalent outcome. Salience prescribes that a reward is a function of the subject’s behavior (and that of other agents), as clearly defined by institutional rules. The relationship between a reward medium and institutional rules is well understood by and consequential to the subject; that is, the subject has a guaranteed right to claim the reward based on his or her actions in the experiment. Salience connects the outcomes in the laboratory microcosm to a reward medium that the subject cares about. This aspect of salience is the reason that many political economy experiments do not use subjects who have previously participated in experiments that employ deception. Subjects who have been deceived in the past might not trust the experimenter and her promise to compensate them according to the institutional rules. Dominance prescribes that the utility of the experimental subjects is a function predominantly of the reward medium while other factors are negligible. 
The reward medium dominates any subjective value that a subject
can bring to the experiment. In order to be dominant, a reward medium must be substantial enough to incentivize the subject to put forth maximum effort to understand the institutional rules and to behave in such a way as to maximize the reward, given the institutional rules. A common convention is to compensate experimental subjects approximately twice their opportunity cost. For example, if undergraduate students on a university campus earn on average $15 per hour by working in the library or tutoring other students, compensating them $30 per hour should be sufficient to dominate any other motivating factors. Another way to increase dominance is to design the experiment in a way that reduces the impact of any other background preference. This aspect is the reason that many political economy experiments use abstract and neutral terminology and keep the experimental goals as general as possible. For example, a voting experiment that uses real candidates’ names loses dominance when subjects’ pre-existing political preferences do not align with, and might therefore suppress, the preferences that the experimental design aims to induce. Privacy specifies the information that is provided to each subject. Each experimental subject is given information only about his or her own reward. Single-blind privacy prescribes that subjects may know the choices and the rewards of other subjects, but not the other subjects’ identities. Only the experimenter has complete information regarding the correspondence between rewards and subjects’ identities (usually for payment purposes). With double-blind privacy, not even the experimenter has knowledge of this correspondence, and the payment process is handled through a third party or anonymous entity.
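As a rough illustration of how the four norms might translate into a design, the sketch below induces preferences over three neutrally labelled alternatives in a hypothetical voting experiment. The payoff table, the token counts, and all function names are assumptions made for illustration and are not taken from any of the studies cited; only the $15 opportunity cost and the "approximately twice" convention come from the example above.

```python
# Hypothetical induced-value schedule for a three-alternative voting experiment.
# All numbers except the $15/hour opportunity cost are illustrative assumptions.

# Tokens each voter type earns if a given alternative wins. Neutral labels
# (A, B, C) replace real candidate names so background preferences stay weak.
PAYOFF_TOKENS = {
    "type_1": {"A": 120, "B": 80, "C": 20},
    "type_2": {"A": 20, "B": 80, "C": 120},
    "type_3": {"A": 80, "B": 120, "C": 20},
}

OPPORTUNITY_COST_PER_HOUR = 15.0  # campus-job example from the text
DOMINANCE_MULTIPLE = 2.0          # "approximately twice" convention


def target_hourly_pay() -> float:
    """Average hourly earnings the design aims for (dominance)."""
    return OPPORTUNITY_COST_PER_HOUR * DOMINANCE_MULTIPLE


def dollars_per_token(expected_tokens_per_hour: float) -> float:
    """Exchange rate that makes expected hourly earnings hit the target."""
    return target_hourly_pay() / expected_tokens_per_hour


def induced_ranking(voter_type: str) -> list[str]:
    """Preference order induced by the schedule, best alternative first.

    Relies on monotonicity: more tokens are assumed to be preferred to fewer.
    """
    row = PAYOFF_TOKENS[voter_type]
    return sorted(row, key=row.get, reverse=True)


def private_view(voter_type: str) -> dict[str, int]:
    """Privacy: a subject is shown only his or her own payoff row."""
    return dict(PAYOFF_TOKENS[voter_type])


def payment(voter_type: str, winner: str, rate: float) -> float:
    """Salience: the cash payment is a known function of the election outcome."""
    return PAYOFF_TOKENS[voter_type][winner] * rate
```

For instance, if a subject is expected to earn about 600 tokens per hour, `dollars_per_token(600)` gives an exchange rate of $0.05 per token, which puts average earnings near the $30 target.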
Although none of the four norms explicitly mentions monetary rewards, financial incentives have become the gold standard for political economy experiments and their use as a reward medium is considered a sufficient condition to satisfy all the induced-value theory precepts. The reason is twofold: on the
one hand, monetary rewards are considered the only reward medium that satisfies monotonicity and non-satiation. Although other reward media, such as lottery tickets, chocolate, or extra course credit, might satisfy monotonicity at least for a specific group of subjects, an experimenter can feel confident that every participant prefers more money to less without feeling satiated. By contrast, many people might feel satiated after accumulating large amounts of other reward media, such as chocolate or course credits. On the other hand, financial rewards reduce variation in subjects’ performance by reducing the variation of subjects’ preferences (or intensity of preferences) for the reward. For example, lottery tickets might satisfy monotonicity and non-satiation, but subjects might differ in terms of how much they value them. Participants who are more risk-averse would value lottery tickets less than participants who are risk lovers. Furthermore, even when controlling for risk attitudes, participants might have different beliefs regarding the probability of winning the reward. Subjects who do not have perfect and complete information about the probability of winning, or who have time preferences and discount the time when the winning ticket is drawn, would value lottery tickets less than the experimenter might expect. The reward of extra course credits shares a similar problem, as students who naturally perform better might value extra credit less than students who need extra credit to improve their grades. Hence, financial incentives are more likely to satisfy dominance precepts than other reward media. For these reasons, cash is the most commonly used reward medium for behavioral experiments. In addition to being dispensed to the subjects at the end of the experiment as compensation for their participation, cash also is sometimes used prior to the experiment to induce salience.
This form of reward, often referred to as an ‘on-time bonus’, serves primarily to establish ex ante credibility about the experimenter’s solvency, besides the secondary goal of reducing tardiness. Credibility is a key
feature that the experimenter needs to gain and maintain to satisfy salience. If subjects do not believe that they are going to be compensated in the manner that the experimenter has described at the beginning of the experiment, then their behavior will not be determined by the underlying theoretical model. However, compensating subjects solely with a fixed reward (as is common practice in political psychology experiments) does not satisfy induced-value theory because the theoretical values and order of preferences are not actually induced (the theoretical values are not salient). Furthermore, other confounding variables might become dominant and drive subjects’ behavior. For example, subjects might prefer to minimize the time it takes to finish the experimental task rather than exerting effort. Subjects might feel empathetic toward fellow subjects who perform worse than they do and consequently may then act in a way that helps the poorer performers. The use of financial incentives suppresses the variability in the data that may be caused by subjects who make choices because of considerations outside the realm of the theory. Multi-session experiments, which involve repeated interactions with the same subjects, also pose difficulties for induced-value theory. These experiments typically require the same subjects to participate in a series of experimental sessions conducted at different points of time, with the time between sessions allowing participants to assimilate, or forget, what they have been exposed to in earlier sessions. Even though compensating the subjects at the end of each session is ideal to satisfy dominance and salience, it might also generate attrition, since subjects might choose not to return to a new session of a multi-session study. Paying the subjects at the end of the last session (and thus for the entire experiment) is the only way to guarantee that all subjects will return for all subsequent sessions that the experimental design includes. 
However, this payment scheme can be implemented only in a well-established pool of subjects who have participated in
experiments before and who have never been deceived and, as a consequence, trust that the experimenter will keep her word and pay them at the end of the experiment. When this scenario is not the case, the experimenter can use a higher payout rate in the subsequent experimental sessions to reduce attrition and induce subjects to return and participate in the entire series of sessions. Conventional one-session experimental studies that include multiple rounds (or tasks) pose a similar problem. Even though attrition is not a major concern – as the probability that subjects drop out during a single session is usually very low – fatigue and wealth effects might decrease dominance. On the one hand, long experimental sessions might create boredom and tiredness that decrease the subjects’ attention and effort. On the other hand, the accumulation of earnings gained in the earlier rounds creates wealth that can reduce the dominance of the reward medium in subsequent rounds. Fatigue and wealth effects on late-round behavior can be mitigated by paying subjects for a randomly drawn round rather than for each round. In doing so, each round has the same probability of being selected to determine the participants’ compensation, which gives subjects a reason to pay the same attention to each round without being affected by accumulated wealth.
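The random-round payment scheme just described can be sketched in a few lines. This is a minimal illustration under assumed numbers: the earnings list, the seed, and the function names are hypothetical and do not come from any cited study.

```python
import random


def pay_every_round(round_earnings):
    """Cumulative scheme: earnings pile up, so wealth effects can build."""
    return sum(round_earnings)


def pay_random_round(round_earnings, rng=None):
    """Draw one round uniformly at random and pay only that round.

    Returns (index_of_selected_round, payment).
    """
    if not round_earnings:
        raise ValueError("at least one round is required")
    if rng is None:
        rng = random.Random()
    selected = rng.randrange(len(round_earnings))  # every round equally likely
    return selected, round_earnings[selected]


def expected_random_round_payment(round_earnings):
    """Expected payment under the random-round scheme (mean of the rounds)."""
    return sum(round_earnings) / len(round_earnings)


# Hypothetical earnings (in dollars) from a ten-round session.
earnings = [3.0, 5.0, 2.5, 4.0, 6.0, 1.5, 4.5, 5.5, 3.5, 4.5]
round_index, payment = pay_random_round(earnings, random.Random(2020))
```

Because the expected payment equals the mean of the round earnings rather than their sum, per-round stakes would usually need to be scaled up to keep the reward dominant; how much to scale is a design choice that the text does not specify.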
Controlled Experimental Design
The second principle of the experimental method is referred to as ‘perfect control’. The goal of the experimenter is to analyze the effect of focus variables on the variable of interest. However, other nuisance variables also might have an effect, and the experimenter must account for them to avoid reaching wrong conclusions. The point of experimentation is to be sure that the effect of the focus variable is clean and is not confounded by the effect of other variables. To separate the effects of focus and nuisance
variables, the experimental design must employ two key devices: controlled treatment and randomization. Controlled treatment is an intervention or artificial variation that is induced by the experimenter. Experimental subjects are divided into groups that receive different or no treatment (the group that receives no treatment is usually referred to as the control group). By comparing the level of the variable of interest across groups, the experimenter can test whether a specific intervention or treatment has caused a change in the behavior of the experimental subjects. This type of experimental design, in which the treatment variable varies only across groups of subjects, is called ‘between-subject’ design. The between-subject design builds on the so-called ‘controlled variation’ logic, which assumes that groups that are exposed to different treatments are identical, or at least similar enough to be deemed equivalent. This assumption is pivotal for the logic to work. In fact, should this equivalency not be the case, the experimenter cannot attribute a difference across groups to the different treatments to which the groups are exposed because the groups were already different before being exposed to the treatments. A treatment test works only if all other factors that might influence the subject’s behavior remain fixed or if these factors are uniformly distributed, or equally present, in all treatment groups. According to induced-value theory precepts, a dominant reward medium should suppress other factors that might affect a subject’s behavior. However, when the groups are different enough, even with a dominant reward medium, any difference in the groups’ behavior cannot be attributed solely to the difference in the treatment variable. 
For example, if all the subjects in a group are poorly educated and all the subjects in another group are highly educated, their behavior in response to a stimulus might differ simply because the subjects in the second group are better able to understand their optimal course of action. Hence, the experimenter needs to
create groups in a way that all possible factors that could affect the variable of interest are balanced. The most obvious way to balance these factors is by matching, which is the process of assigning subjects to each treatment group in a way that the groups are similar with respect to key characteristics. Another way to avoid confounding problems that may emerge due to potentially dissimilar groups without directly controlling for them via matching procedures is through randomization. Because direct control is essentially impossible to achieve, given the number of known and unknown factors that an experimenter needs to balance and that matching is always less than perfect, randomization has become the most commonly used method to guarantee pretreatment similarity of groups. With randomization, provided that the sample is large enough, the differences among groups are neutralized. Hence, a carefully designed experiment that is based on the logic of controlled comparison and uses randomization to assign subjects to different treatment groups achieves the property of perfect control that is essential in all experimental sciences. Friedman and Sunder (1994) classified four types of randomized design as functions of the direct control component they entail. These types of randomization, described in the following paragraphs, are: completely randomized design, factorial design, random block design, and within-subject design. The simplest type of randomized design is the completely randomized design in which each treatment is equally likely to be assigned to each subject. This type of randomized design is ideal when the design involves few treatments and a large number of participants. When the number of participants increases, the probability of any correlation between the treatment and the nuisance variables decreases until it becomes negligible. 
When the efficacy of a completely randomized experiment is hindered by the number of treatment variables that the experimental design entails, a factorial design type of randomization can be used. The difference
between a completely randomized design and a factorial design is that the former assigns each combination of treatments randomly to each participant, whereas the latter assigns the same number of participants to each combination of treatments. Even though the two types of design achieve the same goal as the number of participants increases, a factorial design is more efficient with a limited number of participants. When the number of participants is limited and the nuisance variables still appear to be correlated with the treatment variable, an optimal solution is to use the right combination of control and randomization. In a random block design, one or more nuisance variables are controlled in blocks; that is, they are held constant within a block but varied across blocks. Each block represents a set of participants. Independence (or no correlation) between treatment and nuisance variables can also be achieved using a within-subject design in which each subject serves as a block and the nuisance variables are fixed inside the block. In this type of design, each subject is exposed to different treatments and the subjects’ behaviors are compared across treatments.
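The four assignment schemes described above can be sketched as follows. This is an illustrative reading of the designs, not code from Friedman and Sunder (1994); the function names, the factor labels in the usage note, and the round-robin dealing used to equalize cell sizes are all assumptions.

```python
import itertools
import random
from collections import defaultdict


def completely_randomized(subjects, treatments, rng):
    """Each subject independently draws one treatment with equal probability."""
    return {s: rng.choice(treatments) for s in subjects}


def factorial_cells(factors):
    """All combinations of factor levels, one tuple per treatment cell."""
    return list(itertools.product(*factors.values()))


def balanced(subjects, treatments, rng):
    """Shuffle subjects, then deal them round-robin across treatments.

    Cell sizes differ by at most one, which mirrors the factorial design's
    requirement of the same number of participants per combination.
    """
    order = list(subjects)
    rng.shuffle(order)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(order)}


def blocked(subjects, block_of, treatments, rng):
    """Random block design: balanced assignment inside each block.

    `block_of` maps each subject to the value of the nuisance variable
    that is held constant within a block.
    """
    blocks = defaultdict(list)
    for s in subjects:
        blocks[block_of[s]].append(s)
    assignment = {}
    for members in blocks.values():
        assignment.update(balanced(members, treatments, rng))
    return assignment


def within_subject(subjects, treatments, rng):
    """Each subject is its own block and sees every treatment,
    in an independently shuffled order."""
    return {s: rng.sample(treatments, k=len(treatments)) for s in subjects}
```

A hypothetical 2 x 2 voting study could build its cells with `factorial_cells({"rule": ["plurality", "approval"], "poll": ["shown", "hidden"]})` and pass them to `balanced` together with a seeded `random.Random` instance so that the assignment can be reproduced.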
Experimental Subjects
The presence of human subjects makes behavioral experiments crucially different from other methodologies. Theoretical models often do not engage in questions about how people react to changes of situations, learn, form beliefs, and choose strategies, relying instead on simplifying assumptions of rationality. Theories generally assume a homogeneous abstract agent (the so-called ‘representative agent’), but human beings are heterogeneous and differ greatly. The experimenter must be the one to decide which differences among human beings matter for the experiment’s purpose and which do not. Most political economy experiments use college students as experimental subjects,
which may relate to the convenience and relative ease of recruiting undergraduate students, but also to the specific features of these students. Undergraduate students are readily accessible as a subject pool, they can be recruited easily on university campuses where laboratories typically are located, and they have a low opportunity cost, which means that dominance can easily be achieved through a reward medium. But most importantly, college students are on average more literate in language and math, and generally display an ability to learn and assimilate new concepts that often cannot be matched by other segments of the population. This sophistication and steep learning curve make college students well suited to participate in behavioral experiments where participants typically have only a limited time to become familiar with a new environment and try to discern the best course of action. Finally, undergraduate students usually lack the exposure to experience or information that might bias their behavior, which is the primary reason that graduate students or professionals typically are not recruited to participate in behavioral experiments. Graduate students are thought to respond more in terms of their understanding of possibly relevant theories than to direct incentives (the incentives become less dominant), whereas professionals may use decision rules with which they are familiar but that are not appropriate or useful for the specific experimental task. Hence, unless the skills necessary for the experiment cannot be acquired during the period of the experiment, the advantages of using student subjects are so great that switching to other subject pools usually is not justifiable. However, the usefulness of undergraduate students as subjects does not mean that experimenters should run experiments using their own students. On the contrary, experimenters should use caution before running experiments in their classroom.
Classroom experiments offer the advantage of effortless recruiting and scheduling, and the
opportunity for students to earn academic credits can produce dominance by itself or alongside the use of a financial reward medium. However, classroom experiments present many problems. First, the relationship between subjects and the experimenter can create validity problems; students may want to impress the experimenter/instructor by behaving in a way that they believe pleases her, and instructors may imprint personal points of view on their students that might bias their behavior during an experiment. Second, classroom experiments present ethical concerns; when class time or grades are used to motivate subjects, a conflict may arise between the pedagogical needs of the university’s program of instruction and the instructor’s research needs. Grades must measure or approximate the students’ learning accomplishment in the course. When no connection between pedagogical and research goals can be established, the experimenter should limit the experimentally determined component of the grade to a small fraction of the total grade. For these reasons, classroom experiments are best suited for pilot or exploratory studies.
Validity
In general, the validity of a research project is the extent to which the research is sound. The validity of an experimental study concerns both its design and methods and has three connotations: first, validity pertains to the extent to which the subject of evaluation reflects the phenomenon under study; second, validity refers to the degree to which the results of the study add to the understanding of the phenomenon; and third, validity indicates the extent to which the results and implications that are drawn from a study can be extended to other contexts. Cook and Campbell (1979) deconstruct validity into a typology of four concepts: construct validity, causal validity, statistical validity, and external validity.
Construct validity is the degree to which inference from the data is valid for the theory under study. In practice, a study has construct validity when the data-generating process used in the experimental setting matches the theoretical concept it is intended to study. The major hurdle to construct validity often lies in the complexity of the political concepts under study. Because political variables are often ambiguous, a unique way to recreate a theoretical concept in the laboratory is not always feasible. An example is testing the theoretical conjecture that the more aggressive the leader, the higher his or her approval ratings. Although approval rating is not an ambiguous concept, aggressiveness is, and it can be operationalized in several ways. For example, the construct of aggression could be operationally defined as action that leads to physical harm, physical or verbal threats, debate strategies that aim to hinder the opportunity of the opponent to raise his or her point, or simple obnoxiousness. Although problems of conceptualization and measurement are common to all empirical analyses, political economy experiments generally suffer from them to a lesser extent because the theoretical models that they typically test provide clearly specified and less ambiguous variables to operationalize; thus, it is easier to represent them in the abstract laboratory setting of political economy experimentation. Causal validity concerns whether the relationship that the researcher finds within the target population is causal or not. In experimental settings, the researcher manipulates a treatment variable (or independent variable) to determine its effect on the focus variable (or dependent variable). If the researcher can attribute the observed difference (or change) in the dependent variable to the independent variable, and if she can rule out other explanations or rival hypotheses, then the inference is said to be causally or internally valid.
Internal validity is one of the most important features that a study should be able to satisfy. That is, when factors that are extraneous to the concerns of the study can affect the
results of the study, the findings can simply be considered invalid. Controlling all possible factors that threaten internal validity is a primary responsibility of every study. Statistical validity assesses whether the covariance between the variables of interest is statistically significant and whether the relationship is sizeable. Experimental findings can support statistical validity also through reproducibility. When a study, if reproduced under the same conditions, produces the same findings, the statistical validity of its findings increases. External validity describes the extent to which findings can be generalized beyond the parameters of a specific study. Generalizability can be of two types: generalization to and generalization across. The former assesses whether the results can be applied beyond the sample of participants and to the entire population; the latter assesses whether the results can be applied beyond the particulars of time, place, and methodology. One of the main threats to the ‘generalization to’ type of external validity concerns the adequacy of the sample used in the study. Samples that are representative of the population are deemed to be better suited to generate findings that can be generalized to the entire population. The predominant use of convenience samples, typically composed of undergraduate students, has made the experimental methodology a constant object of criticism. The main threat to the second type of external validity, ‘generalization across’ (also called ‘ecological validity’), concerns the correspondence between the conditions under which a study is conducted and the real-world conditions in which the phenomenon occurs. When this correspondence is weak, the likelihood that the findings can be extended to other environments and settings decreases. Hence, ecological validity and internal validity are inversely related.
Increasing the artificiality of the research conditions increases the control of nuisance variables, increasing the study’s internal validity, but it reduces its correspondence with reality, thus decreasing ecological validity.
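The statistical-validity criterion described earlier in this section (a relationship that is both statistically significant and sizeable) can be made concrete with a small between-subject comparison. The sketch below uses a permutation test and Cohen's d; these are one common way to check significance and magnitude, chosen here for illustration and not prescribed by Cook and Campbell (1979). The function names and default settings are assumptions.

```python
import random
import statistics


def mean_difference(treated, control):
    """Difference in group means for the variable of interest."""
    return statistics.fmean(treated) - statistics.fmean(control)


def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for the difference in means.

    Treatment labels are reshuffled at random; the p-value is the share of
    reshuffles whose absolute mean difference is at least the observed one
    (with the usual add-one correction).
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(treated, control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean_difference(pooled[:n_treated], pooled[n_treated:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


def cohens_d(treated, control):
    """Standardized effect size using the pooled sample standard deviation.

    Assumes each group has at least two observations and that the pooled
    variance is not zero.
    """
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * statistics.variance(treated)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    return mean_difference(treated, control) / pooled_var ** 0.5
```

Fixing the seed makes the test itself reproducible, which is in the spirit of the reproducibility argument above: another researcher running the same code on the same data obtains the same p-value.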
Validity lies at the center of a debate about the usefulness and appropriateness of experimental methods in political science. The next section discusses the strengths and hurdles of the experimental methodology and suggests a path to overcome their challenges and broaden their appeal for scholars in the field.
Experimental Strengths and Weaknesses: Ongoing Debates and Perspectives
The two main advantages of experimental methodology compared to other methodologies are the ability to control the data-generating process and the possibility to reproduce and replicate data creation. Perfect control over the experimental microcosm allows the researcher to engage in types of analyses that are impossible when using naturally occurring data. For example, the ability to recreate theoretical settings and manipulate laboratory conditions in order to isolate key variables allows the experimenter to test theoretical predictions. This work is not possible when using data from natural settings, because real-world settings that correspond to theoretical assumptions (which would constitute a natural experiment) are extremely difficult to find in nature and the conditions to test theoretical conjectures are virtually impossible to manipulate. As argued by Davis and Holt (1993), the absence of control over natural data prevents the simplest prediction of any theoretical model from being verified. Furthermore, experimental methodology is ideally suited to test causal propositions. Because the experimenter varies one treatment variable at a time, any correlation between a treatment variable and a focus variable can be interpreted as a causal relationship with a clear direction of causality. Furthermore, the experimenter can engage in analytical decomposition and analyze the direct and interactive effects of multiple treatment variables. Last, by controlling and appropriately manipulating background
factors, the experimental method is an ideal tool to test the robustness and validity of causal relationships. Reproducibility and replicability describe the possibility to recreate the experimental findings using the same methods and conditions of the original study and using different methods from the original study, respectively. Because the experimenter has created the experimental environment, the results are expected to be reproducible as long as the environment can be reproduced in an equivalent manner. Reproducibility of an experiment allows a researcher to verify other researchers’ experimental findings in an independent manner. The main criticism of the experimental methodology is focused on validity. As discussed, the methodology used in a study is considered internally valid when it allows conclusions to be drawn on behavior within the study, and the methodology is considered externally valid when it allows conclusions to be drawn on behavior outside the study. An experimental study is not internally valid when the design is inappropriate for the phenomenon under study (no construct validity) or it lacks proper control (no causal validity), and it is not externally valid when the experimental findings do not give information about behavior outside the experimental study. An experimental study can have low external validity if the population sample that participated in the experiment responds to experimental manipulations differently than other samples would respond in other settings. Although internal validity concerns can be addressed with appropriate experimental design and implementation, concerns about external validity are more complex, as they do not pertain strictly to a specific study but to the entire experimental discipline. Detractors and critics of experimental methods claim that because the artificiality of the experimental setting does not mirror real-world situations, experimental findings cannot be generalized to other environments. 
Friedman and Sunder (1994) argue
that external validity issues are inherent in all empirical methodologies. Because all investigations are conducted in specific settings, no single study can produce general knowledge. Single observations have no phenomenological status unless they support a theoretical claim. Hence, general knowledge exists only in tandem with theoretical knowledge. As Friedman and Sunder (1994) claim, behavioral regularities will persist in new situations to the extent that the relevant underlying conditions remain substantially unchanged, where it is the theory that defines what is relevant and what is substantial. Via its abstractness, a theory applies across settings and populations. Hence, as noted by Plott (1982), for experimental studies that are aimed at testing hypotheses derived from formal theoretical models, the abstractness of the laboratory setting is an advantage because it better mirrors the abstractness of a general theory. If an experimental design recreates and manipulates every theoretically relevant variable and finds an effect that does not generalize to naturally occurring situations, then such an outcome is a criticism not of the experimental study but of the underlying theory. Hence, the key component of external validity is the underlying theoretical proposition. One way to increase external validity is to improve the confirmatory status of the theory by conducting ad hoc scientific replications. Through replications of the experimental study using different measures and settings, confidence in the theory is gained and each successive test increases the external validity of the experimental findings. Another criticism regarding the external validity of laboratory experiments pertains to the student samples that laboratory experiments traditionally use.
Because laboratory experiments do not draw probability samples from large populations – a sampling technique usually adopted in other methodologies – critics often argue that experimental findings cannot be generalized from a convenience sample to the
larger population. Although college students who participate in experiments represent a specific sample of a small and distinct population, their use as experimental subjects is not a downside for studies whose goal is to disprove or confirm a theory. In fact, homogeneous samples are better suited to this purpose than heterogeneous ones, as homogeneous samples increase the likelihood of identifying violations of a theory when the theory is false. The way to overcome all these criticisms, increase the external validity of experimental studies, and increase the generalizability of underlying theories is to employ replications. Replications should be conducted using non-random samples that differ significantly from the sample used in the original study on measurements of variables for which the theoretical prediction is expected to hold. For example, if a theoretical claim is expected to hold for different values of the ‘age’ variable, an experimental study that uses college student samples should be replicated using senior participants. Stress tests should also be conducted using samples that differ from the sample used in the original study along measurements or variables that are not described by the theory but are closer to real-world characteristics. As McDermott argues: ‘Conducting a series of experiments which include different populations, involve different situations, and utilize multiple measurements establishes the fundamental basis of external validity’ (2011: 38). Only by increasing the heterogeneity of the study population, the settings, and the circumstances of the situation in which a phenomenon occurs can the generalizability of both the theoretical construct and the empirical findings be established.
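One way to read the claim about homogeneous samples is statistical: if a more homogeneous subject pool shows less dispersion in the measured outcome, a deviation of a given size from the theoretical prediction is easier to detect. The sketch below illustrates that reading only. It assumes a two-sided, two-sample z-test with a known standard deviation, and the function name, effect size, standard deviations, and group size are hypothetical values chosen for illustration, not figures from any study discussed in this chapter.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal group sizes.

    effect      : true difference between treatment and control means
    sd          : outcome standard deviation within each group (assumed known)
    n_per_group : number of subjects in each group
    """
    z = NormalDist()  # standard normal distribution
    standard_error = sd * (2 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / standard_error
    # Probability that the test statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical illustration: same true effect and group size, different dispersion.
homogeneous = approx_power(effect=0.5, sd=1.0, n_per_group=30)
heterogeneous = approx_power(effect=0.5, sd=2.0, n_per_group=30)

print(f"power with homogeneous sample:   {homogeneous:.3f}")
print(f"power with heterogeneous sample: {heterogeneous:.3f}")
```

Under these assumptions the lower-variance sample yields the higher probability of rejecting a false prediction. This is only one possible mechanism behind the argument in the text, which does not itself specify a statistical model.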
Concluding Thoughts
This chapter is not intended to cover the extensive literature on political economy experiments. Instead, the goal is to provide readers with an understanding of the conceptual features that set political economy experiments apart from other types of experiments and to offer a glimpse of the development of this methodology. Political economy experiments are uniquely suited to test theoretical models of political interaction and to generate refinements and improvements of such theories. Their use in political science has grown so much in the past 20 years that this type of laboratory, controlled, incentivized experimentation is now regarded as one of the most valuable tools of empirical analysis. The issue of external validity, raised by critics of this methodology, can be overcome by complementing experimental results with ad hoc replications on theoretically relevant samples and settings. But what lies ahead for this type of experimentation in political science? As happened in the field of experimental economics, the future of political economy experiments will be primarily tied to developments in formal modeling in the fields of domestic politics, international relations, and comparative politics. As we noted at the beginning of this chapter, experimentation will grow in the fields of research that witness a growth of formal modeling suitable to this type of experimentation. Furthermore, the success of behavioral economic models in the field of economics, which improved upon the classical assumptions of the rational agent model, is increasingly influencing all fields of political science. Political scientists, more than economists, have always been wary of the restrictive assumptions of rationalist models and are well disposed toward behavioral revisions of the standard model of choice. 
Novel behavioral contributions on preferences, beliefs, and decision-making, such as preference reversal and bounded rationality, are bound to be incorporated into formal models and open the way to new laboratory, controlled, and incentivized experimental studies in the years to come.
References
Allais, M. (1953). Le comportement de l’homme rationnel devant le risque: critique des postulats et axiomes de l’école américaine. Econometrica, 21 (4), 503–546.
Ariely, D. and Norton, M. I. (2007). Psychology and experimental economics: A gap in abstraction. Current Directions in Psychological Science, 16 (6), 336–339.
Bassi, A. (2015). Voting systems and strategic manipulation: An experimental study. Journal of Theoretical Politics, 27 (1), 58–85.
Blais, A., Laslier, J. F., Sauger, N., and Van Der Straeten, K. (2010). Strategic, sincere and heuristic voting under four election rules: an experimental study. Social Choice and Welfare, 35 (3), 435–472.
Bol, D. (2018). Putting politics in the lab: A review of lab experiments in political science. Government and Opposition, 54 (1), 1–34.
Campbell, D. T. and Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago, IL: Rand McNally.
Casella, A., Llorente-Saguer, A., and Palfrey, T. R. (2012). Competitive equilibrium in markets for votes. Journal of Political Economy, 120 (4), 593–658.
Cook, T. D. and Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Dallas, TX: Houghton Mifflin.
Davis, D. D. and Holt, C. A. (1993). Experimental Economics. Princeton, NJ: Princeton University Press.
Dickson, E. S. (2011). Economics vs. psychology experiments: Stylization, incentives, and deception. In J. N. Druckman, D. P. Green, J. H. Kuklinski, and A. Lupia (Eds.), Cambridge handbook of experimental political science, pp. 58–70. Cambridge: Cambridge University Press.
Druckman, J. N., Green, D. P., Kuklinski, J. H., and Lupia, A. (2009). The growth and development of experimental research in political science. American Political Science Review, 100 (4), 627–635.
Druckman, J. N., Green, D. P., Kuklinski, J. H., and Lupia, A. (2011). Cambridge handbook of experimental political science. New York, NY: Cambridge University Press.
Edwards, W. (1961). Costs and payoffs are instructions. Psychological Review, 68 (4), 275–284.
Eldersveld, S. J. (1956). Experimental propaganda techniques and voting behavior. The American Political Science Review, 50 (1), 154–165.
Eldersveld, S. J. and Dodge, R. W. (1954). Personal contact or mail propaganda? An experiment in voting and attitude change. In D. Katz, D. Cartwright, S. J. Eldersveld, and A. M. Lee (Eds.), Public opinion and propaganda, pp. 536–540. New York, NY: Dryden Press.
Ellsberg, D. (1961). Risk, ambiguity and the Savage axioms. Quarterly Journal of Economics, 75 (4), 643–669.
Forsythe, R., Rietz, T., Myerson, R., and Weber, R. (1996). An experimental study of voting rules and polls in three-way elections. International Journal of Game Theory, 25 (3), 355–383.
Friedman, D. and Sunder, S. (1994). Experimental methods – A primer for economists. Cambridge: Cambridge University Press.
Gerber, A. S. and Green, D. P. (2000). The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment. American Political Science Review, 94 (3), 653–663.
Gerber, A. S. and Green, D. P. (2012). Field experiments: design, analysis, and interpretation. New York, NY: W.W. Norton & Company.
Gilligan, M. J., Pasquale, B. J., and Samii, C. (2014). Civil war and social cohesion: Lab-in-the-field evidence from Nepal. American Journal of Political Science, 58 (3), 604–619.
Gosnell, H. F. (1926). An experiment in the stimulation of voting. The American Political Science Review, 20 (4), 869–874.
Grether, D. M. and Plott, C. R. (1979). Economic theory of choice and the preference reversal phenomenon. The American Economic Review, 69 (4), 623–638.
Grosser, J. and Palfrey, T. R. (2019). Candidate entry and political polarization: an experimental study. The American Political Science Review, 113 (1), 209–225.
Grossman, G. and Baldassarri, D. (2012). The impact of elections on cooperation: Evidence from a lab-in-the-field experiment in Uganda. The American Journal of Political Science, 56 (4), 964–985.
Güth, W., Schmittberger, R., and Schwarze, B. (1982). An experimental analysis of ultimatum bargaining. Journal of Economic Behavior and Organization, 3 (4), 367–388.
Hertwig, R. and Ortmann, A. (2001). Experimental practices in economics: A methodological challenge for psychologists? Behavioral and Brain Sciences, 24 (3), 383–403.
Isaac, R. M., McCue, K. F., and Plott, C. R. (1985). Public goods provision in an experimental environment. Journal of Public Economics, 26 (1), 51–74.
Kimmel, A. J. (1998). In defense of deception. American Psychologist, 53 (7), 803–805.
McDermott, R. (2011). Internal and external validity. In J. N. Druckman, D. P. Green, J. H. Kuklinski, and A. Lupia (Eds.), Cambridge handbook of experimental political science, pp. 27–40. Cambridge: Cambridge University Press.
Merlo, A. and Palfrey, T. R. (2018). External validation of voter turnout models by concealed parameter recovery. Public Choice, 176 (1–2), 297–314.
Palfrey, T. R. (2009). Laboratory experiments in political economy. Annual Review of Political Science, 12, 379–388.
Palfrey, T. R. (2012). Experiments in political economy. In J. H. Kagel and A. E. Roth (Eds.), The handbook of experimental economics, Vol. 2, pp. 347–434. Princeton, NJ: Princeton University Press.
Pilisuk, M. (1984). Experimenting with the arms race. The Journal of Conflict Resolution, 28 (2), 296–315.
Plott, C. R. (1982). Industrial organization theory and experimental economics. Journal of Economic Literature, 20 (4), 1485–1527.
Plott, C. R. (1983). Externalities and corrective policies in experimental markets. Economic Journal, 93 (369), 106–127.
Roth, A. E. (1983). Toward a theory of bargaining: an experimental study in economics. Science, 220 (4598), 687–691.
Roth, A. E. (1986). Laboratory experimentation in economics. Economics and Philosophy, 2, 245–273.
Roth, A. E. (1988). Laboratory experimentation in economics: A methodological overview. Economic Journal, 98 (393), 974–1031.
Roth, A. E. (1995). Introduction to experimental economics. In J. H. Kagel and A. E. Roth (Eds.), The handbook of experimental economics. Princeton, NJ: Princeton University Press.
Siegel, S. and Fouraker, L. E. (1960). Bargaining and group decision-making: Experiments in bilateral monopoly. New York, NY: McGraw-Hill.
Smith, V. L. (1976). Experimental economics: Induced value theory. American Economic Review, 66 (2), 274–279.
Smith, V. L. (1982). Microeconomic systems as an experimental science. American Economic Review, 72 (5), 923–955.
Tingley, D. H. and Walter, B. F. (2011). Can cheap talk deter? An experimental analysis. Journal of Conflict Resolution, 55 (6), 996–1020.
Van Huyck, J. B., Battalio, R. C., and Beil, R. O. (1990). Tacit coordination games, strategic uncertainty, and coordination failure. The American Economic Review, 80 (1), 234–248.
23 Historical and Longitudinal Analyses Einar Berntzen
Introduction
Historical and longitudinal analyses constitute a prominent part of a research tradition in contemporary political science referred to as comparative historical analysis (CHA). This research tradition has three defining features. First, a macro-configurational orientation that entails a concern with complex, macro-level, large-scale outcomes, which are often aggregated combinations of multiple events and processes. Examples include state and nation building, democratic transitions, welfare state building, and revolutions, as well as the analysis of aggregate cases such as nation-states, political movements, welfare regimes, party systems, empires, and even whole civilizations and world systems. Second, an emphasis on empirically grounded, problem-driven, case-based research with a focus on explaining observed outcomes, that is, real-world empirical puzzles often anchored in particular times and places. Comparative historical analysis seeks to develop explanations that identify the causal mechanisms that enable and generate these outcomes. Third, a commitment to temporally oriented analysis and attention to the study of temporal processes and the temporal dimension of politics as essential for the valid understanding and explanation of real-world political outcomes. Comparative historical methods for temporal analysis reflect an ontology in which temporal location shapes the effects of individual variables that may be mediated by historical context, and the temporal structure of causes and outcomes matters for explanatory analysis and our understanding of process and time in politics (Mahoney and Rueschemeyer, 2003; Mahoney and Thelen, 2015).
Historical and Longitudinal Analyses: A Brief Genealogy
Historical and longitudinal analyses have a long and distinguished history in the social sciences and can trace their roots back to the founders of modern social science. The classics of modern social science, from Alexis de Tocqueville to Karl Marx and Max Weber, all pursued CHA as a central mode of analysis. The macro-configurational and temporal orientation of CHA links it to the classics of modern social science, with which it shares an abiding concern with explaining large-scale political and political-economic outcomes and historical processes, like the origins of modern Western capitalism in the cases of Marx and Weber. In the decades after World War II, comparative historical research and analysis was partially eclipsed by other analytical approaches. Ahistorical grand theories, Parsonian structural-functionalist systems theories, modernization theory, and large-N cross-national statistical research dominated sociology and political science in the 1960s and 1970s. However, after a period of neglect, history came into sharper focus. There was a resurgence of interest in what Mahoney and Rueschemeyer label ‘big’ questions, that is, ‘questions about large-scale outcomes that are regarded as substantively and normatively important by both specialists and nonspecialists’ (Mahoney and Rueschemeyer, 2003: 7). Modern CHA took shape in the 1960s and 1970s through the publication of a series of landmark and early agenda-setting books: Barrington Moore Jr., 1966; Seymour Martin Lipset and Stein Rokkan, 1967; Perry Anderson, 1974; Immanuel Wallerstein, 1974; Charles Tilly, 1975; and Theda Skocpol, 1979. In particular, Barrington Moore Jr. was a pivotal figure among the forerunners-cum-founders of modern CHA. His path-breaking Social Origins of Dictatorship and Democracy (1966) laid a solid foundation for the resurgence of comparative historical scholarship within sociology and political science. 
By the late 1970s and early 1980s, the revival of comparative historical research across the social sciences was well beyond its days as an isolated mode of analysis carried out by a
few older scholars dedicated to the classical tradition (Skocpol, 1984). Almost two decades later, Mahoney and Rueschemeyer noted that comparative historical research was again a leading mode of analysis, widely used throughout the social sciences (Mahoney and Rueschemeyer, 2003: 5). Comparative historical analysis occupies an important and influential place in contemporary comparative politics and political science. As CHA came of age over the past three decades, numerous books from this perspective have been published. These works focus on a wide range of topics, but they are united by a commitment to offering historically grounded explanations of large-scale and substantively important outcomes. Comparative historical scholarship includes a surge of important studies on social provision and welfare state development (e.g. Skocpol, 1992; Rothstein, 1998) and studies that explore processes of state formation and state restructuring in various world regions (e.g. Tilly, 1990; Ertman, 1997). In addition, the last decades have seen the publication of important comparative historical books on the origins of democratic, authoritarian, and hybrid regimes (e.g. Luebbert, 1991; Rueschemeyer et al., 1992; Yashar, 1997; Mahoney, 2001; Levitsky and Way, 2010), racial and ethnic relations and national identities (e.g. Marx, 1998), the causes and consequences of revolutions (e.g. Wickham-Crowley, 1992), and political parties and party systems (e.g. Collier and Collier, 1991; Roberts, 2014). A concern with ‘big structures, large processes, huge comparisons’ (Tilly, 1984) marked CHA from its beginnings with the classic founders of modern social science and remains its defining feature.
Critical Junctures and Historical Legacies
A central concept in historical and longitudinal studies is the concept of critical juncture,
an historical moment during which much greater change is possible than during the preceding and subsequent periods of high and often long institutional stability. The concept of critical juncture has played an important role in comparative historical and other macro-comparative scholarship since it was developed and first used by Lipset and Rokkan (1967), who traced the origins of party system formation in Western Europe to three critical junctures located far back in history (Lipset and Rokkan, 1967: 37–8). They focused on the connection between critical junctures and fundamental societal cleavages: center–periphery, church–state, rural–urban, and owner–worker. They argued that the cumulative legacy derived from three critical junctures (Reformation–Counter-Reformation in the 16th and 17th centuries, the National Revolutions after 1789, and the Industrial Revolution in the 19th century) was the ‘freezing’ of different types of European party systems in the first decades of the 20th century. Lipset and Rokkan’s approach to analyzing multiple critical junctures is to treat them as separate branching points on a tree in which countries gradually diverge from one another in the course of time. Lipset and Rokkan thereby established a three-step cleavage–critical juncture–legacy template. Their analysis, however, was largely embedded in a macro-structuralist approach that left little space for the analysis of decisions and strategic choices of political actors and reconstructed, instead, the origins of historical variation ex post, starting from variation in the outcome of interest (European party systems) and looking backwards in time (Berntzen and Selle, 1990). Collier and Collier’s Shaping the Political Arena (1991) helped crystallize and further develop the critical juncture approach (Mahoney and Rueschemeyer, 2003: 156). The critical juncture in Shaping the Political Arena is the period of labor incorporation in eight Latin American countries. 
Collier and Collier define a critical juncture as ‘a period
of significant change, which typically occurs in distinct ways in different countries (or in other units of analysis) and which is hypothesized to produce distinct legacies’ (1991: 29). In addition, Collier and Collier (1991) expanded Lipset and Rokkan’s three-step template by adding the antecedent conditions prior to the cleavage/shock, and the aftermath when the legacy takes shape. Thus, Collier and Collier’s (1991) critical juncture framework established a five-step template: antecedent conditions, cleavage or shock, critical juncture, aftermath, and legacy (Collier and Munck, 2018). Antecedent conditions are diverse socioeconomic and political conditions prior to the onset of the critical juncture that constitute the baseline for subsequent change. Some antecedents derive from earlier critical junctures. For example, in Lipset and Rokkan (1967) these include the structure of the party system as it evolved across multiple critical junctures. Knowledge of antecedent conditions is essential for explaining the distinct ways in which the critical juncture occurs across cases. Cleavages, shocks, or crises are triggers of critical junctures. Critical junctures often grow out of a fundamental societal or political cleavage, as in Lipset and Rokkan’s (1967) four cleavages. In other cases, the precipitating event can be called a shock or a crisis, as with the Latin American debt crisis of the 1980s (Roberts, 2014). However, the cleavage, crisis, or shock should not be confused with the critical juncture itself, the latter being specifically an episode of institutional innovation. It is the cleavage or shock that triggers a critical juncture, but cleavages or shocks do not always produce a critical juncture (Collier and Munck, 2017: 5). Critical junctures are major episodes of institutional change or innovation. The core claim in the critical juncture framework is that the episode of institutional innovation generates an enduring legacy. 
No legacy, no critical juncture. The strength of a critical juncture argument hinges to a large degree on
how well this claim can be sustained (Collier and Munck, 2017: 6). The aftermath is the period during which the legacy takes shape. The legacy does not always flow directly from the critical juncture. The zigzag path of change that often follows the critical juncture may therefore be important in shaping the legacy. The issue concerns the mechanisms of production that generate the legacy. The complex steps between the critical juncture and the legacy are what Collier and Collier called ‘reactions and counter reactions’ (1991: 37), and Mahoney calls a ‘reactive sequence’ (2001: 5). The legacy is an enduring, self-reinforcing institutional inheritance of the critical juncture that stays in place and is stable for a considerable period. Without the emergence of such a legacy, the prior episode does not constitute a critical juncture (Collier and Munck, 2017: 6). With this formulation, antecedent historical conditions define a range of alternatives available to actors during a key point of choice, or the critical juncture, which is characterized by the selection of a particular option (e.g. a specific policy, coalition, or government) from two or more alternatives. The selection made during a critical juncture is ‘critical’ precisely because it leads to the creation of new institutional or structural patterns that endure over time. In turn, in the aftermath institutional and structural persistence triggers a reactive sequence in which excluded actors challenge the prevailing institutional setup through a series of predictable reactions and counter reactions. These reactions then channel developments up to the point of a final outcome, which represents the legacy. 
By way of illustration, Collier and Collier (1991) compare eight Latin American countries to argue that labor-incorporation periods were critical junctures that set the countries on distinct paths of development that had major consequences for the crystallization of certain parties and party systems in the electoral arena. The way in which state
actors incorporated labor movements was conditioned by the political strength of the oligarchy, the antecedent condition in their analysis. Different policies towards labor led to four specific types of labor incorporation: state incorporation (Brazil and Chile), radical populism (Mexico and Venezuela), labor populism (Peru and Argentina), and electoral mobilization by a traditional party (Uruguay and Colombia). These different patterns triggered contrasting reactions and counter reactions in the aftermath of labor incorporation. Eventually, through a complex set of intermediate steps, relatively enduring party system regimes were established in all eight countries: multiparty polarizing systems (Brazil and Chile), integrative party systems (Mexico and Venezuela), stalemated party systems (Peru and Argentina), and systems marked by electoral stability and social conflict (Uruguay and Colombia) (Collier and Collier, 1991). The critical juncture approach has been widely used to study the historical development of both political regimes and party systems. Some further examples of recent comparative work (Yashar, 1997; Mahoney, 2001; Roberts, 2014) in this vein illustrate the analytical steps of the critical juncture framework. The critical juncture in Yashar (1997) is the democratic transition in Costa Rica and Guatemala in the 1940s and 1950s. Yashar compares the different political regime outcomes in Guatemala and Costa Rica to explore the question of why democracies are founded and endure. Her argument is that enduring democracy depends on whether emerging democratic forces seize the opportunity to forge cross-class coalitions during the democratic transition and pass redistributive reforms that weaken the power of rural elites, while at the same time allowing for the possibility that traditional social forces will continue to play a role in politics. 
The scope of reforms introduced during the critical juncture period triggered counter mobilization on the part of elites, but elite unity in Guatemala and split in Costa Rica shaped the different
decision-making institutions that emerged, establishing an enduring legacy of authoritarian rule in Guatemala and of democracy in Costa Rica. The level of development of civil society constitutes the antecedent condition, but agency is paramount: in the event that actors do not act at the moment of democratic transition, they will lose a historic window of opportunity to reshape institutions in an enduring way (Yashar, 1997). In Mahoney’s (2001) study of regime change in the five Central American republics, the critical juncture is the liberal reform period between about 1870 and 1920. The different choices that the liberal presidents and their political allies made in relation to agricultural development during the critical juncture were crucial in setting the political institutions of these countries on different paths of development: traditional dictatorships developed in Honduras and Nicaragua, military authoritarianism in Guatemala and El Salvador, and democracy in Costa Rica. The choices made by liberal presidents resulted in three patterns of liberal reform (different models of agricultural development): radical liberalism, reformist liberalism, and aborted liberalism. Under a pattern of ‘radical liberalism’ in Guatemala and El Salvador liberals enacted policies that saw the creation of a highly polarized agrarian economy and a militarized state apparatus. By contrast, under ‘reformist liberalism’ in Costa Rica, liberals promoted a less dramatic shift to commercial agriculture that saw the development of an advanced but non-polarized agrarian economy and a centralized but non-militarized state apparatus. Finally, under ‘aborted liberalism’ in Honduras and Nicaragua, foreign intervention undermined ongoing processes of liberal transformation. Hence, liberal efforts to promote development were not fully successful, allowing many traditional state and agrarian structures to survive into the postliberal period. 
As in the case of Yashar (1997), antecedent conditions (preexisting socioeconomic structures) were an important backdrop during the critical
juncture but did not determine the decisions of political actors (Mahoney, 2001: 13). In all countries, the liberal reform brought on significant reactions from actors who were excluded or marginalized during the reform process. Disadvantaged actors responded to the structural patterns established during the liberal reform period by leading democratizing episodes. In the cases of radical liberalism, Guatemala and El Salvador, these democratizing episodes ultimately failed when military and economic elites initiated powerful counter responses against democratic reformers, which led to the establishment of harsh military-authoritarian regimes (Mahoney, 2001: 14). By contrast, in the case of reformist liberalism in Costa Rica, important factions of the elite actively mobilized democratizing movements to generate electoral support in the course of political struggles. The political incorporation of previously excluded groups increased the stakes of political competition among the elite and ultimately convinced most factions that accepting full democracy as the type of political regime was not incompatible with their interests. Finally, in the two cases of aborted liberalism, Honduras and Nicaragua, where there was no basis for democratizing movements, the political reactions in the aftermath period were directed at US actors who had intervened during the liberal reform period. When the US presence ended, backward, 19th-century-style traditional dictatorships reappeared (Mahoney, 2001: 16). The contrasting political outcomes of military authoritarianism in Guatemala and El Salvador, democracy in Costa Rica, and traditional dictatorship in Honduras and Nicaragua constitute the regime heritage or enduring legacies of the three patterns of radical, reformist, and aborted liberalism of the critical juncture period. 
Roberts (2014) treats the crisis-induced transition from state-led development to market liberalism in Latin America as a critical juncture affecting party system transformation. Structural adjustment and market
policies either aligned or de-aligned Latin American party systems programmatically, depending on whether the market reforms were led by conservative actors and whether a major left party existed to channel societal resistance to market orthodoxy. The reactive sequences in the aftermath period were moderated where conservative-directed reforms aligned party systems programmatically, stabilized partisan competition, and channeled societal resistance towards institutionalized leftist parties. In countries like Brazil, Chile, and Uruguay, these left parties were strengthened and gained victories in national elections in the aftermath period, which produced relatively moderate ‘left turns’. By contrast, in countries like Venezuela, Ecuador, Bolivia, and Argentina, where market reform policies were implemented by traditional center-left or populist parties, the critical juncture de-aligned party systems programmatically. The fact that the market reforms were introduced by the traditional center-left or populist parties left them vulnerable to highly disruptive reactive sequences driven by social and electoral protest against market orthodoxy, culminating in the rise of more radical populist alternatives to the left of the traditional party systems (Roberts, 2014). In general terms, critical juncture analysis tests the hypothesis that the domestic social factors that correlate with the institutional outcome of interest may be endogenous to political decisions made much earlier in time and for reasons unrelated to such factors. Mahoney, for example, shows that patterns of agrarian class relations, which many consider to be key determinants of political regimes in Central America during the 20th century, were largely the consequence of the previous choice of a particular model of agricultural development on the part of liberal leaders, who were mainly interested in maintaining and expanding their own personal power (Mahoney, 2001: 19). 
Mahoney emphasizes the historical contingency of the choices made by political actors during the critical juncture. But is contingency a defining
feature of critical junctures? The definition of the concept has accordingly been the object of a debate over contingency versus determinism. Some scholars (Yashar, 1997; Mahoney, 2000, 2001; Capoccia and Kelemen, 2007; Roberts, 2014; Capoccia, 2015) view the uncertainty of outcomes and substantial degrees of freedom in actor choices as a defining feature of critical junctures. Whereas Collier and Collier defined a critical juncture rather loosely as ‘a period of significant change’ (1991: 29), Mahoney (2001) delivered a more precise and elaborated definition of the concept. According to Mahoney, critical junctures are points of choice when a particular option is adopted from two or more alternatives. These junctures are ‘critical’ because once an option is selected, it becomes increasingly difficult to return to the initial point of choice when multiple alternatives were still available. Both the alternatives available during critical junctures and the choices finally made by actors are typically rooted in antecedent conditions. The degrees of freedom in actor choices during critical junctures may vary, ranging from choices characterized by high levels of individual discretion to choices that are more determined by antecedent conditions. However, in many cases, critical junctures are moments of structural fluidity and institutional uncertainty allowing actors to shape outcomes in a more voluntaristic fashion than under normal circumstances. These choices illustrate the power of agency by showing how long-term patterns of development can hinge on distant actor decisions of the past. Not all points of choice are critical junctures. The status of critical juncture can be attained only by those choices that put countries (or other units) onto paths of development that lead to certain outcomes, different from others, and that are difficult to brake or reverse. 
Before a critical juncture, the range of possible outcomes is broad; after a critical juncture, enduring institutions and structures are created, which limit the range of
possible outcomes considerably (Mahoney, 2001: 7). Although the choice itself may have been caused by prior events, it is the factors unleashed by the choice, not the antecedent conditions that led to the choice, that determine final outcomes (Mahoney, 2001: 7). Capoccia and Kelemen (2007) and Capoccia (2015) share Mahoney’s emphasis on contingency during critical junctures and further refine the definition of the concept in order to capture the political dynamics of institutional path selection during critical junctures. Capoccia and Kelemen (2007) note that during moments of social and political fluidity such as critical junctures, the decisions and choices of key actors are freer and more influential in steering institutional development than during ‘normal’ times. In other words, by analytically linking the concept of contingency to the choices, strategies, and decisions of political decision makers, analysts are more likely to capture the dynamics that in most cases influence the selection of one institutional solution over others that were available during the critical juncture. On this theoretical basis, Capoccia and Kelemen define critical junctures as ‘relatively short periods of time during which there is a substantially heightened probability that agents’ choices will affect the outcome of interest’ (Capoccia and Kelemen, 2007: 348). By ‘relatively short periods of time’, they mean that the duration of the juncture must be brief relative to the duration of the path-dependent process that it instigates and that eventually leads to the outcome of interest. By ‘substantially heightened probability’, they mean that the probability that agents’ choices will affect the outcome of interest must be high relative to that probability before and after the critical juncture.
This definition captures both the notion that, for a brief phase, agents face a broader-than-typical range of feasible options and the notion that their choices from among these options are likely to have a significant impact on subsequent outcomes. The absolute duration of
a critical juncture has an impact on the possibility for actors to act more freely, and for the consequences of their actions to have a bigger impact than in normal times: the longer the juncture, the higher the probability that political decisions will be constrained by some reemerging structural constraint (Capoccia and Kelemen, 2007: 351). Linking contingency to the analysis of political choice and decision-making points to the plausibility of the counterfactual argument that actors could have made different decisions, and that had they done so, a different path of institutional development would have been selected. The range of plausible options during critical junctures (what could have happened) is not infinite: its boundaries are defined by antecedent conditions even though, within the limits of those conditions, actors have real choices (Capoccia, 2015: 158–9). According to Capoccia and Kelemen (2007), the analysis of agency and contingency in critical junctures also yields important methodological advantages. A logical consequence of stressing the importance of contingency as a defining element of critical junctures is that change is not a necessary element of a critical juncture. If change is possible and plausible, considered, and pursued, but narrowly fails to materialize in a situation of high uncertainty, there is no reason not to consider this situation a critical juncture (Capoccia and Kelemen, 2007: 352). Such ‘near-miss’ critical junctures or negative cases may occur due to two main sets of circumstances. First, the social and political bases for reform may exist, but political entrepreneurs may fail to mobilize the necessary coalition to achieve reform (Yashar, 1997: 22).
A second source of near-miss critical junctures is one in which the political forces in favor of institutional change narrowly lose their struggle to forces favoring stability, in other words, cases in which the political struggle over the choice of different institutional options during a phase of uncertainty and institutional fluidity results in the restoration
of the pre-critical juncture status quo rather than change (Capoccia, 2015: 166). Another endeavor to advance the precision of the concept of critical juncture is Capoccia and Kelemen’s (2007) attempt to operationalize the ‘criticalness’ of critical junctures by focusing on two components: the probability that at the end of the critical juncture the institution acquires its durable path-dependent characteristics observable for the duration of the legacy of the critical juncture, and the duration of the critical juncture relative to its legacy. The duration of a critical juncture should be shorter than the path-dependent process it initiates. They recommend that the analysis begin with those critical junctures that had the largest impact on the outcome and that occurred earlier in time (see Capoccia and Kelemen, 2007: 360–3 for a formalization of this operationalization). A focus on the ‘most critical’ critical junctures also provides analysts with a criterion for establishing a meaningful beginning point of analysis and overcoming the problem of infinite explanatory regress into the past. Such longitudinal comparisons of the ‘criticalness’ of candidate critical junctures can help scholars build arguments as to which candidate critical juncture was more ‘critical’ for their outcome of interest (see e.g. Mahoney, 2001: 26–7). Other scholars (e.g. Pierson, 2004; Slater and Simmons, 2010; Soifer, 2012) view critical junctures more deterministically and have criticized the focus on agency and contingency as key causal factors of institutional path selection during critical junctures. Antecedent conditions play an important role in debates about contingency versus determinism in the critical juncture. Slater and Simmons (2010) use the label ‘critical antecedents’ to underscore the influence of this earlier phase prior to the critical juncture on subsequent developments.
Antecedent conditions are, however, an important source of rival hypotheses for explaining the outcomes attributed to the critical juncture. For example, Capoccia (2015), in offering methodological advice for building a critical juncture argument, recommends that once candidate critical junctures are identified, the hypothesis that antecedent conditions, not agency, drive institutional change should be tested (Capoccia, 2015: 168). While not denying the causal importance of agency and contingency during critical junctures, Slater and Simmons (2010) nevertheless argue that a focus on the antecedent conditions of critical junctures is analytically more useful than a focus on political agency and contingency. According to Slater and Simmons (2010) and Soifer (2012), the defining feature of critical junctures is not contingency but divergence. Hence, Slater and Simmons define critical junctures as ‘periods in history when the presence or absence of a specified causal force pushes multiple cases onto divergent long-term pathways, or pushes a single case onto a new political trajectory that diverges significantly from the old’ (Slater and Simmons, 2010: 888). Since these authors do not deny that agency (Slater and Simmons, 2010: 890) and contingency (Soifer, 2012: 1573) play a potentially important causal role during critical junctures, Capoccia (2015) argues that such a role is logically inconsistent with a definition of critical junctures in terms of divergence. Divergence is a consequence of critical junctures in which agency and contingency are causal, and as such it cannot be used to define them (Capoccia, 2015: 157).
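The relative-duration component of ‘criticalness’ discussed in this section can be illustrated with a minimal sketch. It is not Capoccia and Kelemen's formalization (for that, see the pages cited above); the `Candidate` class, the `duration_ratio` measure, the ranking rule, and all numbers are hypothetical and serve only to show how candidate junctures might be compared once an analyst has estimated the two components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate critical juncture (all values are illustrative)."""
    name: str
    juncture_years: float  # duration of the candidate juncture
    legacy_years: float    # duration of the legacy attributed to it
    probability: float     # analyst's estimate, between 0 and 1


def duration_ratio(c: Candidate) -> float:
    """Legacy duration divided by juncture duration (hypothetical measure)."""
    if c.juncture_years <= 0:
        raise ValueError("juncture_years must be positive")
    return c.legacy_years / c.juncture_years


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Keep candidates whose juncture is shorter than its legacy, then
    order them by estimated probability and, on ties, by duration ratio.
    The ordering rule is an assumption made for this sketch."""
    eligible = [c for c in candidates if duration_ratio(c) > 1.0]
    return sorted(
        eligible,
        key=lambda c: (c.probability, duration_ratio(c)),
        reverse=True,
    )


# Hypothetical candidates; B is dropped because its juncture outlasts its legacy.
candidates = [
    Candidate("A", juncture_years=5, legacy_years=50, probability=0.8),
    Candidate("B", juncture_years=20, legacy_years=15, probability=0.9),
    Candidate("C", juncture_years=2, legacy_years=40, probability=0.8),
]
ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

The filter encodes only the requirement that a juncture be shorter than the process it initiates; how the probability component is estimated remains a matter of historical judgment that no code can supply.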
From Critical Juncture to Enduring Legacy: Path Dependence
Critical juncture analysis is a component of the study of path-dependent institutions, but path dependence as a mechanism of reproduction of the legacy comes into play only after critical junctures have performed their work. Path dependence is not simply a blanket label to describe all explanations that
employ a long time horizon or that examine temporally sequenced variables. James Mahoney provides a much more precise (and far more restrictive) definition: path dependence characterizes specifically those historical sequences in which contingent events set into motion institutional patterns or event chains that have deterministic properties (Mahoney, 2000: 507). Hence, Mahoney views the critical juncture in terms of contingency, but uses a framework of determinism for studying the self-perpetuating character of the legacy. The concept of path dependence is built around the idea that crucial choices during critical junctures may establish certain directions of change and foreclose others in a way that shapes development over long periods of time (Mahoney, 2001: 264). Therefore, a defining feature of path dependence is the idea that it is difficult for actors to reverse the effects of choices made during critical junctures. Critical junctures lock countries into particular paths of development. Once a country has started down a path, the costs of reversal are very high. There will be other points of choice, but the resilience of certain institutional arrangements prevents an easy reversal of the initial choice (Mahoney, 2001). Analyses of path dependence tend to encourage a rather strict separation of the issues of institutional innovation and institutional reproduction. With regard to the question of institutional genesis and change, analyses of path dependence emphasize moments of institutional innovation in which agency, choice, and contingency figure prominently. Such moments of ‘openness’ and rapid innovation are then followed by long periods of institutional reproduction, stasis, or ‘lock in’ where the importance of strategy and choice recedes relative to processes of adaptation to institutional incentives and constraints.
The reproduction of the institutional patterns created during the period of institutional innovation is marked by relatively deterministic causal patterns or what
Mahoney refers to as ‘inertia’: once a process is set into motion and starts tracking a certain outcome, it tends to keep on moving and tracking the outcome (Mahoney, 2000: 511). However, Mahoney makes an important analytical distinction between two types of path-dependent sequences: self-reinforcing sequences ‘characterized by processes of reproduction that reinforce early events’, and reactive sequences in which ‘initial events trigger subsequent development not by reproducing a given pattern, but by setting in motion a chain of tightly linked reactions and counter reactions’ (Mahoney, 2000: 510, 526–7). Self-reinforcing path-dependent sequences are characterized by the formation and long-term reproduction of a given institutional pattern (Mahoney, 2000: 508). Pierson contributed to clarifying the concept of path dependence by specifying the dynamic processes that sustain institutions over long periods of time by investigating the distinctive characteristics of social processes subject to what economists call increasing returns, which could also be described as self-reinforcing or positive feedback processes (Pierson, 2000: 251). Path dependence operates through a number of distinct mechanisms. Mahoney has contributed to analytic clarity by specifying various mechanisms of reproduction that may be sustaining different institutions, in a typology of path-dependent explanations of institutional reproduction: utilitarian, functionalist, power, and legitimation arguments (Mahoney, 2000: 517). Pierson analyzes path-dependent power dynamics by distinguishing five ways in which power may beget power (Pierson, 2015). In his analysis of path dependence, Mahoney (2000: 512) refers to Arthur Stinchcombe who noted that the factors responsible for the genesis of an institution may not be the same as those that sustain it over time (Stinchcombe, 1968). The second type of path-dependent analysis involves the study of reactive sequences. 
Reactive sequences are chains of temporally ordered and causally connected events.
In a reactive sequence, each event in the sequence is both a reaction to prior events and a cause of subsequent events (Mahoney, 2000: 526). Reactive sequence arguments follow a different logic from that of self-reinforcing sequences. Whereas self-reinforcing sequences are characterized by processes of reproduction that reinforce early events, reactive sequences are marked by backlash processes that transform and perhaps reverse early events. Reactive sequences are relevant for the aftermath period, the intervening steps that constitute the ‘mechanism of production’ of the legacy. A pattern of reactive sequences in the aftermath produces important methodological challenges. If the legacy does not emerge directly but in zigzag steps, how can the analyst know when the reactive sequence has come to an end? The implication is that the analyst might misinterpret one of the steps in the sequence as an enduring legacy. The debate between Roberts (2014, 2017) and Boas (2017) focuses precisely on the importance of temporal distance and how much historical hindsight is needed before the analyst can conclude that an enduring legacy has been established. Roberts (2014) uses the critical juncture framework to analyze the political and party system consequences of the transition to neoliberal market economies in Latin America in the 1980s and 1990s. Boas’ (2017) critical assessment concerns the inherent methodological difficulties in analyzing recent or ongoing transformations with a theoretical model of change that requires temporal distance from the events to be explained. Boas claims that a critical juncture argument requires making the case that a definitive legacy has emerged. Unless a definitive enduring legacy can be identified, the outcome being explained might ultimately turn out to be just one step in a larger sequence of reactions and counter reactions in the aftermath of the critical juncture. 
A critical juncture argument constitutes a causal hypothesis linking a major societal transformation to a
temporally distant outcome that represents the endpoint of a process of change, not something that happened while the process was still unfolding. In order to evaluate the hypothesis, process tracing is needed to connect the critical juncture to the legacy. Hence, a critical juncture argument requires that the enduring legacy be specified a priori in order to describe how countries vary with respect to this legacy. Specifying the legacy is crucial not only for establishing that the critical juncture produced distinct legacies, but also for connecting cause and effect (see Beach, Chapter 17, this Handbook). Extended analytical time horizons are crucial in the critical juncture framework since it is typically used to make arguments about processes that are hypothesized to play out over long periods of time. The point of departure for a critical juncture is typically a cleavage or crisis that calls into question the political status quo. Yet the critical juncture is analytically distinct from this cleavage or crisis, and it is often temporally removed as well. The emergence of the legacy may also be temporally removed from the critical juncture itself. This is particularly true if ‘the critical juncture is a polarizing event that produces intense political reactions and counter reactions’. These are intervening steps that constitute the ‘mechanisms of production’ of the legacy one seeks to analyze (Boas, 2017: 18). The importance of temporal distance and historical hindsight is also relevant with regard to the legacy. Given that an enduring legacy is a defining feature of a critical juncture, how much historical perspective is needed to establish that it has in fact endured? How long must the legacy last to count as the legacy of a given critical juncture? The length of the legacy in relation to the length of the critical juncture is also a consideration (Capoccia and Kelemen, 2007). 
Roberts (2014) defends the thesis that market reforms in Latin America in the 1980s and 1990s constituted a critical juncture that shaped key features of subsequent party systems.
By contrast, Boas points out that it may still be too early to identify the market reforms in Latin America in the 1980s and 1990s as a critical juncture. This debate highlights the challenge of finding an appropriate time horizon for studying processes of change that may still be unfolding.
Incremental Change and the Logic of Institutional Evolution
The critical juncture framework is premised on a model of historical change that emphasizes moments of ‘openness’ and rapid innovation followed by long periods of institutional stability. The implication is that institutions, once created, either persist or break down in the face of some kind of exogenous shock or crisis. But what about institutional changes that fall short of breakdown? While the theory of critical junctures and institutional path dependence provides scholars with a powerful conceptual tool for substantiating claims of distal causation, scholarship on models of endogenous institutional change has shown that the conceptual apparatus of path dependence may not always offer a realistic image of development. Theories of endogenous institutional change (Thelen, 2003) criticize theories of path dependence for displaying a ‘stability bias’, which relegates change to exogenous shocks or crises. In the effort to incorporate change in a theoretical account of institutional development, scholars in this tradition have conceptualized institutions as ‘arenas of conflict’ rather than as equilibria. These endogenous processes of institutional change do not unfold in short periods but evolve gradually, radically transforming institutions over the long run (Hacker et al., 2015). Thelen (2003) criticizes path-dependent and increasing returns arguments for
telling only part of the story. Such arguments are better at articulating the mechanisms of reproduction behind particular institutions than they are at capturing the logic of institutional evolution and change. She proposes to distinguish more clearly, at both an empirical and an analytic level, between the mechanisms of reproduction and the logic of change at work in particular instances, and to suggest modes of change that go beyond the empirically perhaps relatively rare cases of institutional ‘breakdown’ or wholesale replacement implied by the critical juncture approach. Thelen proposes two concepts: institutional layering and institutional conversion, as ways of conceptualizing the problem of institutional evolution that overcome the zero-sum view of institutional innovation versus institutional reproduction. Arguments about institutional change through layering and conversion highlight the processes through which institutional arrangements are renegotiated periodically in ways that alter their form and functions. These alternative modes of change share with path dependence perspectives a strong ‘historicist’ and temporal dimension and provide a way of understanding why, over time, institutional arrangements may come to serve functions that are quite remote from those originally intended by their designers, how they can affect, rather than just reflect or reinforce, the prevailing balance of power among social classes and groups, and how they can become resources for, rather than just constraints on, actors engaged in struggles over appropriate or desirable practices. The more gradual, cumulative patterns of institutional evolution analyzed by Thelen (2003) and Mahoney and Thelen (2010) often come nested within broader patterns of institutional continuity. 
Change, therefore, may occur incrementally through ‘periodic political realignment and renegotiation’ with institutions adapting to pressures for change through processes of gradual transformations and institutional discontinuity caused by incremental ‘creeping’ change: layering, drift, and conversion.
Institutional layering involves the partial renegotiation of some elements of a given set of institutions while leaving others in place: layering of new arrangements on top of preexisting structures or grafting new elements onto old ones. Layering is a mechanism used by proponents of change to work around institutions that have powerful vested interests. New coalitions may design novel institutional arrangements but lack the support, or perhaps the will, to replace preexisting institutions that were established to pursue other ends. Some aspects may be ‘locked in’ in the way path dependence theorists emphasize, by the power of the constituencies they have created. But institution builders have found ways to work around this opposition by adding new institutions rather than dismantling the old. Rothstein (1998) provides an example of this type of change. He identifies the potentially transformative effects of individualized private-sector social services that emerge alongside standardized universalistic welfare programs in Scandinavia. The growth of private alternatives can undermine support for universal programs among the middle class, on whose ‘contingent consent’ the entire public system rests (Rothstein, 1998). Institutional drift occurs when institutions or policies are deliberately kept in place while their contexts change in ways that significantly alter their effects. A simple example is the US minimum wage, which is not indexed to inflation. Hence, unless new federal legislation is enacted to change the rules, the minimum wage will keep losing value with rising prices. Interest groups that wish to promote change through drift need only prevent the updating of existing rules. 
Drift thus depends on the context sensitivity of the effects of an institution, and on whether policies and institutions are designed in such a way that it is possible to update them in accordance with changing circumstances, and on how easy or difficult it is to block such updating (Hacker et al., 2015: 180). Institutional conversion describes changes in implementation that occur without formal
policy revision, when political actors are able to redirect institutions or policies towards purposes, functions, and goals beyond their original intent. An example is the ability of business corporations to use the Sherman Antitrust Act of 1890 to fight labor unions. The antitrust legislation was passed with the purpose of fighting corporate collusion, as it was meant to break up business trusts that were ‘in restraint of trade’. Yet business corporations were successful in convincing federal courts that trade union organizing was ‘in restraint of trade’. In cases of conversion, actors who were not part of the coalition that originally created the formal rules are able to convert these rules in order to achieve their own quite different goals. Rule ambiguity is one source of conversion. The multiplicity of political arenas where ambiguous rules can be reinterpreted is another (Hacker et al., 2015: 181).
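The arithmetic behind the drift example of a non-indexed minimum wage, discussed earlier in this section, can be sketched as follows. The nominal wage and the constant inflation rate are hypothetical round numbers chosen for illustration, not actual US figures, and `real_value` is a name introduced here.

```python
def real_value(nominal: float, annual_inflation: float, years: int) -> float:
    """Purchasing power of a fixed nominal amount after `years` of
    constant annual inflation, expressed in year-0 prices."""
    return nominal / (1.0 + annual_inflation) ** years


# Hypothetical inputs: a nominal wage of 10.00 that is never updated,
# and a constant 3% annual inflation rate.
for years in (0, 5, 10, 20):
    print(years, round(real_value(10.0, 0.03, years), 2))
```

The rule itself never changes in this sketch; only the price level does. That is the sense in which blocking updates is sufficient to produce change through drift.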
Comparative Historical Methods: Tools for Comparing Temporal Processes
There is a close alignment between theories of institutional change and CHA (Hacker et al., 2015: 182). The study of historical and longitudinal processes has important methodological implications. Critical junctures and enduring legacies, drift, or conversion can be observed only in analyses that are both configurational and attentive to changes that unfold over long periods of time. The affinity between macro-historical questions and comparative research design is strong. The focus on theories of process and sequences of events rather than on the configuration or covariation of factors has triggered methodological advances with regard to the strategy of casing. The shift towards theories of historical processes required new strategies of casing and new comparative designs in which periods and events rather than systems are the primary units for comparison.
This is certainly true for any historical work that systematically compares two or more sequences within a given case. In such studies, the sequences are central units of analysis, not only the national or other spatial units in which they are located. In turn, when one treats sequences as central units of analysis, it is possible to revisit traditional comparative historical methods, which are often understood to apply mainly or exclusively to the macro-spatial unit under analysis. A new vantage point for thinking about comparative historical methods comes into being by treating sequences and processes as core units of analysis and comparison. Comparative historical analysis is often understood to imply the comparison of a small-to-medium number of cases (usually countries or other macro units). However, Falleti and Mahoney (2015) argue that it may be more precise to say that the field of CHA is more concerned with the systematic comparison of sequences. They suggest that a principal overarching methodology of CHA is the comparative sequential method (Falleti and Mahoney, 2015: 211). This method is defined by the systematic comparison of two or more historical sequences of events, and CHA causal claims rest upon the inferences derived from the analysis and comparison of those sequences. The comparative sequential method implies both longitudinal comparisons of sequences (e.g. comparisons of two or more critical junctures argued to explain an outcome in the same unit of analysis, assessing rival arguments) and cross-sectional comparisons of critical junctures in different units. The comparative sequential method is an overarching methodology. With regard to its temporal component, sequences are the central units of analysis. But the comparative sequential method also encompasses more specific cross-case and within-case methods for making causal inferences.
The most basic comparative methods for assessing causal hypotheses are J. S. Mill’s method of agreement and method of
difference (Mill, 1843/1972). The method of agreement matches cases that share a certain outcome and it eliminates any potential factor that is not shared by the cases in question. The factor is not necessary for the outcome. The method of difference compares a case where the outcome is present with a case where the outcome is absent and it eliminates any factor that is shared by these cases. The factor is not sufficient for the outcome (Mahoney, 2004: 86). When used in isolation, these methods are weak instruments for small-N causal inference. While these methods may be able to discover that an individual factor is not necessary/sufficient for an outcome, they are unable to establish that a given condition is necessary/sufficient. Small-N researchers thus normally combine Millian methods with process tracing in order to make a positive case for causality. These basic matching methods or controlled comparisons are widely used in comparative historical analyses (e.g. Collier and Collier, 1991; Luebbert, 1991; Yashar, 1997; Mahoney, 2001; Roberts, 2014). The best-known formal method for evaluating necessary and sufficient causes is Boolean algebra introduced by Ragin (1987). Boolean algebra in Qualitative Comparative Analysis (QCA) is particularly suited for the analysis of combinations of variables that are sufficient for the occurrence of an outcome. Since several different combinations of variables (conjunctural causation) may each be causally sufficient, this method allows for multiple paths to the same outcome. These more formal techniques are also used by comparative historical analysts (e.g. Wickham-Crowley, 1992; Berntzen, 1993; Berg-Schlosser and De Meur, 1994). Process tracing is a technique in which the analyst attempts to locate the causal mechanisms linking a hypothesized explanatory variable to an outcome. Process tracing is an especially important tool for studies in which explanatory and outcome variables are separated by long periods of time. 
Process tracing is also valuable for establishing the
features of the events that compose individual sequences (e.g. their duration, order, and pace) as well as the causal mechanisms that link them. Counterfactual analysis is used to evaluate causal arguments in a single case. For example: a critical juncture approach has more analytical leverage when controlled comparisons are used to identify the causal processes that produce divergent outcomes across a range of cases that are subjected to similar pressures for change. The basis for claims that a major change in a single unit of analysis constitutes a critical juncture, in isolation from patterns observed elsewhere, is more limited. A counterfactual thought experiment might increase analytical leverage by explaining potential variance in hypothesized outcomes in a single case (Capoccia and Kelemen, 2007: 355–7).
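The eliminative logic of Mill's methods of agreement and difference described above can be sketched in a few lines. This is an illustration under stated assumptions: the function names, the factor names, and the three cases are hypothetical, and each case is coded as a simple mapping from factor to presence or absence.

```python
def eliminated_by_agreement(positive_cases: list[dict[str, bool]]) -> set[str]:
    """Method of agreement: among cases that share the outcome, any factor
    not present in all of them is eliminated as not necessary."""
    factors = set().union(*positive_cases)
    return {
        f for f in factors
        if not all(case.get(f, False) for case in positive_cases)
    }


def eliminated_by_difference(
    positive_case: dict[str, bool], negative_case: dict[str, bool]
) -> set[str]:
    """Method of difference: any factor present both where the outcome
    occurs and where it does not is eliminated as not sufficient."""
    factors = set(positive_case) | set(negative_case)
    return {
        f for f in factors
        if positive_case.get(f, False) and negative_case.get(f, False)
    }


# Hypothetical coding of three cases on three factors.
pos_1 = {"strong_labor": True, "export_economy": True, "foreign_intervention": False}
pos_2 = {"strong_labor": True, "export_economy": False, "foreign_intervention": False}
neg_1 = {"strong_labor": False, "export_economy": True, "foreign_intervention": False}

not_necessary = eliminated_by_agreement([pos_1, pos_2])
not_sufficient = eliminated_by_difference(pos_1, neg_1)
print(sorted(not_necessary))
print(sorted(not_sufficient))
```

Both functions only eliminate. A factor that survives, such as `strong_labor` here, has not thereby been shown to be necessary or sufficient, which is why these methods are normally combined with process tracing.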
Conclusions
This chapter has presented an overview of historical and longitudinal analyses as part of the CHA tradition in comparative politics and political science. Historical and longitudinal analyses belong to a prominent research tradition that can trace its roots back to the classics of modern social science and their concern with addressing major questions of enduring significance. The three core defining elements of the CHA approach – its emphasis on macro-configurational explanation, its focus on problem-driven case-based research, and its attention to temporal processes and the temporal dimension of politics – are closely linked in empirical CHA research. Because the timing and sequencing of relevant events form part of the context that produces the outcome in question, macro-configurational explanations often have a strong temporal dimension. Likewise, empirically grounded case-based research facilitates the identification of causal mechanisms and interaction among different causal
variables and processes as these unfold over time. The critical juncture framework is widely used in historical and longitudinal analyses. The definition of the concept of critical juncture has been the object of the contingency versus determinism debate, in which antecedent conditions have played an important role. Some scholars emphasize the historical contingency of the choices made by political actors during the critical juncture, whereas other scholars have criticized the focus on agency and contingency as key causal factors of institutional path selection during critical junctures. The latter argue that a focus on antecedent conditions of critical junctures is analytically more useful. Since the critical juncture approach is a method and theoretical model that presupposes temporal distance from the events in question, extended analytical time horizons are crucial for developing a valid critical juncture argument. The core claim in the critical juncture framework is that the critical juncture generates an enduring legacy. No legacy, no critical juncture. The problem is that the legacy may be temporally distant from the critical juncture. Hence, the challenge of identifying critical junctures and assessing the legacy of a new critical juncture without the benefit of hindsight shows the importance of historical hindsight and temporal distance when developing a critical juncture argument. Critics of the conceptual apparatus of path dependence implied by the critical juncture model have developed alternative models of institutional change and evolution. Attention to incremental forms of political change such as layering (the introduction of new rules on top of or alongside existing ones) and conversion (the changed enactment of existing rules due to their strategic redeployment) may be relevant to major subjects of contemporary political systems. There is a close alignment between theories of historical processes and institutional change and CHA. 
Critical junctures and
enduring legacies, or incremental change (drift, layering, or conversion), can only be observed by analyses that are at once configurational, case-based, and attentive to changes that unfold over long periods of time. The focus on theories of temporal processes and sequences of events has brought about a change in the strategy of casing. Events and sequences of events have become the primary units of analysis. Hence, the principal overarching methodology for comparing temporal sequences is the comparative sequential method.
24 Interpretative Methods
Terrell Carver
Science and Society All scientific investigations require prior interpretative work to set them up, and all scientific results – whatever the methods deployed – require interpretation after the fact. Whether it is rocks or votes, scientists must be trained in a discipline which already interprets the world. In that way, phenomena can be identified and selected for investigation. Any object being studied is then subject to analytical methods, which themselves arise in a similarly interpretative way. Successive interpretation thus produces ongoing dialogical consensus in a scientific community about methods, as well as about results. However, novel methods – with suitable evaluative trials and shared agreement – can win adherents within a discipline, and may generate similarly novel results. Since the mid-20th century, political science that proceeds in this way has occupied a mainstream position in research and training (Easton, 1981).
The observations about interpretation stated above notably transcend the physical science/social science binary, viewing both as similar practices within human knowledge-making communities. In that way both communities are using a similar understanding of what science is, and what an object of study is. That commonality, though, does not exclude differences relevant to the particular sorts of objects involved. Thus, meteorology or geology will necessarily involve somewhat different methods from those of physics or biology. On this view, then, social studies, indeed political studies as a science, is dissimilar only in kind to various natural sciences. But it is not dissimilar in essence, since science itself is presumed to transcend the natural/social binary, and to offer methodological protocols and truth-criteria common to both. This understanding of science invokes a commonplace notion of interpretation, which is taken for granted and not itself much investigated (Yanow, 2014a).
Alternatively, social and political studies have sometimes been defined as wholly different from the natural/physical sciences: votes not being like sub-atomic particles; politics not being like weather-systems. There is thus no commonality in respect of presuppositions and methods, even if there are areas where similar language is deployed. Apparently similar terms, in relation to rigorous procedures and truth-criteria, are rather said to be mere analogies, but not to represent genuine similarities. From that binary perspective, then, the ‘science’ in political science seems a misnomer, so the locution political studies has sometimes been substituted. The interpretative methods under consideration below have been developed, but only in part, within one side of the binary just described. That side of the binary, positing social and physical objects as distinctly different in essence, is often identified as hermeneutics (Zimmerman, 2015). Interpretative scholars might bridle at being called either scientific or unscientific, since they understand social objects and appropriate methods to be radically different from those conceived in ‘hard science’. Nonetheless, they sometimes defend their work methodologically, claiming that it is just as rigorous as research done in natural/physical science, albeit in an alternative way. And they sometimes claim to achieve results that are just as objective, albeit derived from alternative methodologies, because the objects of investigation are so different in essence (Yanow, 2014b). Sometimes political scientists add an element of interpretative methods to the scientific ones, modelled on the natural/physical sciences. While those interpretative methods have been derived from scholars working within the hermeneutic demarcation of the physical from the social, that important distinction is most often simply disregarded by political scientists. 
They are incurious about this so as to make their appropriation of hermeneutically driven methods legitimate and to maintain an unproblematic notion
of interpretation. Thus, in apparently talking to each other on the basis of agreement, political scientists and hermeneutic scholars are often actually talking past each other (Hawkesworth, 2014). However, from the 1990s the consideration of interpretative methods in relation to political science has begun to generate a novel perspective on science itself, whether social or physical, and not merely on methods. Those developments have followed on from innovative work in philosophy, and particularly in philosophy and sociology of science, first begun some decades earlier (Kuhn, 2012). That unifying approach uses post-structuralist premises based on the implications of the ‘linguistic turn’, the terms of which will be explained below. That novel perspective goes well beyond commonplace truisms about knowledge-making in human communities. It does so by transcending the terms through which the ontological and epistemological binaries of familiar knowledge-making practices have been constructed in the first place (Belsey, 2002). For many political scientists the end-result of the discussion here will not look like science, or even like political phenomena as usually conceived. For those in political studies taking a hermeneutical approach, the end-result will not look like hermeneutics, either, or necessarily like political activity, even on a broad understanding of the term. Many if not most practitioners within the discipline of political science will probably prefer the uneasy but familiar metaphors and compromises through which they operate, and through which graduate training takes place. However, for a full understanding of the significance of the interpretative ‘turn’ in social and political studies, it is necessary to violate familiar presumptions and comfort zones so as to view the world differently (Culler, 2002). 
Besides obtaining clarity in understanding and cataloguing interpretative methods, a further upside to this exercise in transgressive thinking is the outreach it offers to other disciplines. This is because
post-structuralist premises and the ‘linguistic turn’ are based on a revised understanding of how knowledge of any kind, and therefore science of any kind, is constituted. To do this, our discussion will become more historical and thus centred in the ideas through which the ‘linguistic turn’ has taken place and post-structuralist premises have been formulated. These developments have undercut long-established traditions and distinctions in philosophy. Many philosophers are also opposed to the presuppositions that will be outlined here. Moreover, the standpoint in question is not coincident with traditional understandings of hermeneutics (Critchley, 2001). And many of the thinkers and academics who pursue these burgeoning studies come from other disciplines entirely, but often address political phenomena and ideas. Perhaps the best way to think about the ‘linguistic turn’ and post-structuralist premises is to consider interdisciplinary eclecticism a virtue, rather than a vice, and to treat the propositions expounded here as a lens or perspective, rather than as an overturning of valuable knowledge and academic practice (Carver and Hyvärinen, 1997). However, in order to understand how this perspective works, we need to consider in more detail the exact philosophical foundations through which science and hermeneutics were constituted as truth-searching forms of knowledge-making.
Empiricism and Factuality The study of politics as a science, using methods applicable to both the natural and social worlds, reaches back to Aristotle. Famously he advocated observation and data-collection, but perhaps now less memorably, he also based his work on a conceptual apparatus of essence and motion, teleology and hierarchy. Those constitutive principles have been under attack since the 17th century. Subsequently the argument for an
understanding of the physical world through an anti-metaphysical conceptualization of material presumptions and concepts, such as matter and energy, and a union of experimental and mathematical knowledge-creation, has developed and triumphed. That battle was not just a matter of abstract argumentation but was also an industrial practice through which commercial technologies and scientific research were intellectually intertwined and experientially validated. Revised philosophies of science followed those world-changing developments, often unhelpfully characterized as two distinct revolutions: scientific and industrial. New, materialist philosophies were projected into social studies, conceived on an analogous basis and therefore scientific in practice. German authorities were particularly, though not exclusively, influential in this process (Farr, 2003; Baumgartner, Chapter 18, this Handbook). Two important further developments, through which the ‘science’ in political science was constituted, date approximately to the later 1950s. Those were the behavioural protocols, typically applied to explaining and predicting phenomena associated with political participation, and the formal protocols, typically used to generate explanatory and predictive models for strategic interaction. For political science, such interactions could be between self-interested human individuals, or in the case of International Relations, the constituent units were states or similar collective actors (Weber, 2001). Both subdisciplines together constituted the self-styled and widely accepted scientific core of the discipline. And – despite obvious differences – both were constituted through observations of phenomena that could be reduced by abstraction such that mathematical, logical and/or statistical methods would apply. Those methods were easily borrowed from the defining core of post-17th-century science, a powerful union of observation, data-collection and reductive analysis. 
The apparent success of marginalist economics
in conceptualizing human behaviour and complex interactions in highly abstract, symbolically manipulable terms, was clearly a model. Unsurprisingly, voting was conceived by some political scientists as an intrinsically or analogously economic transaction based on strategic pursuit of self-interest. Self-interested activity by individuals was then understood paradigmatically as the kind of human interaction through which politics itself is constituted. What holds that view of science together – whether natural/physical or social/political – is an empiricism, that is, a view of the world as comprised of human individual subjects, such as political scientists, who ‘know’, and objects of knowledge, such as human interactions, which are ‘to be known’. In philosophical terms that is an ontology, an account of what exists in the world, which also presupposes an epistemology, a formalization of the ways through which objects can be known with accuracy and certainty. The understanding of what an object ‘is’, thus determines, in circular fashion, the kinds of ways through which it can ‘be known’, and vice versa. If human social action is conceived as behaviour that, through observation, can be reductively objectified by means of conceptual abstraction and symbolic representation, then the methods used to understand non-human physical objects, whether inanimate or animate, can be applied in explanatory and predictive ways. Thus, the ontology and epistemology of materialism is complete, forming a methodological unity within a comprehensive concept of science. In that way, scientific studies are said to be empirical, and vice versa, and subject-object ontology and epistemology is said to be empiricist (Moses, Chapter 27, this Handbook).
Text and Truth
Hermeneutics as the study of texts, scientifically pursued, arose within that context. Paradigmatically it posits a knowing subject and an object to be known, the former the scientist/reader and the latter the words-on-the-page. Originally those texts were classical, biblical and Egyptological. Analytical methods were devised, methodological protocols developed, and explanatory results were understood as the true meaning, derived scientifically from the text at hand. The premise and promise was that a scientific determination of meaning would add accuracy and certainty to what had otherwise been subjectively construed from texts as meaning. On the one hand, this model for hermeneutics can be applied reductively to any instance of language-use, no matter how ordinary, so the limited quantity of important texts was eventually transcended. This extended hermeneutic study to written and spoken words in general. Thus the study of texts broadened out to include the study of languages and language-users, the former becoming comparative linguistics, and the latter becoming empirical linguistics. Knowledge of meanings, and of meaning-making, in both realms of study presumed that formal structures, derived from analysis, would be both explanatory and predictive. Linguistic scientists would thus command knowledge of true meanings relevant to both everyday interactions and hermetically encoded texts. Moreover, they would eventually command knowledge of the properties common to all human languages anywhere. The former study would generate protocols of symbolic mapping for the myriad ways through which individual speakers communicate meaning to each other. The latter study presumed that the use of scientific methods would disclose a deep structure hidden within human language itself (Matthews, 2003).
Post-Structuralism and Its Premises
However, a number of later-20th-century developments reversed the empiricism described above, precisely by positing human
language, social interaction, practical activities and meaning-making, taken altogether, as a substitute for both ontology and epistemology. This ‘linguistic turn’ supplanted the subject-object/knower-known structure through which empiricism is defined. Post-structuralism thus constituted a revisioning of the human world, including sciences, technologies and all forms of human ‘being’ as meaning-making. Moreover, it posits that scientists, researchers, indeed all human ‘knowers’ therefore function wholly within this environment. There is thus no view-from-nowhere or otherwise disembodied or necessarily privileged point from which truth arises. Post-structuralism is thus a critique of structures that, following the protocols of empiricism, were presumed to be ‘there’ in the objects of knowledge, such that explanatory and predictive generalizations were validated as accurate reflections of how things really are. Rather, on the post-structuralist view, that situation is one of projection: human ‘knowers’ are finding what is ‘to be known’ as already ‘there’ in external structures, and so evidently discoverable. This is obviously a circular process. On the post-structuralist view, then, objects of knowledge are themselves human conceptual constructs, not ‘things’ which have a structure or fixed nature in themselves to be known. Post-structuralists have argued that objects of whatever kind cannot be presumed to be constituted in themselves in terms that map to human conceptual constructs. As just stated, the process of knowledge-creation must be working the other way round. Knowledge is necessarily humanly derived and socially driven, rather than ‘there’ as structures to be ‘discovered’. For that to be so, objects would have to have already come into existence in ways that do – or will – map to human conceptions. 
Thus for structuralists, certain knowledge of things as they really are – even if only gradually and asymptotically approached – requires a metaphysical presumption of coincidence between the human mind and everything else, or a creator-God,
Himself human-like, who made a universe that was founded on, but was mysteriously concealing of, singular truths that can be mirrored in human conceptual constructions (Rorty, 1989). The origins of the contrary post-structuralist arguments lie again in German intellectual achievements of the mid-19th century through which the truths of biblical revelation of God’s creation and His will were undermined, and religion was explained as a projection of human concepts and capacities onto imaginary beings. Physical and natural sciences were the next objects to fall within the post-structuralist critique, and for the same reasons: the truthful and certain coincidence between what material objects are, and the human capacities and conceptions that enable knowledge-construction, cannot be presumed at the outset or in finality. Rather, human-knowledge construction arises and proceeds within socially communicative practices of meaning-making. Of necessity those meaning-making activities include the ontological-epistemological protocols through which standards of validity are socially set in on-going ways. Thus the use of the natural/physical ‘hard science’ model in the social sciences, through which certitude could be obtained, was challenged by post-structuralists, though most political scientists have chosen not to engage (Hawkesworth, 2014). However, the above line of argument does not necessarily result in a nihilistic scepticism or judgemental vacuum. The fall-back and antidote is rather a pragmatic one: technologies that work and find a market, and research projects that engage participants and find funders. While the natural/physical sciences provide very obvious referents for those practices and successes (subject of course to varying judgements), the social sciences have in general been hugely successful in engineering social change and producing modern individuals (again, subject to varying judgements). 
While some ‘hard scientists’ and social scientists might distinguish their activities as pure or theoretical, and so create
a distinction and hierarchy relative to applied technologists and disciplines, nonetheless the logic of the deconstructive argument sketched above applies across the board. What is produced by humans is known and judged within human terms, and not in relation to anything that is somehow external to that.
The ‘Linguistic Turn’ and Speech-Act Theory The legacy of hermeneutics involves more than the ‘linguistic turn’ and the dissolution of certainties supposedly derived from the methodological protocols of science. Because hermeneutic researches focused on written texts, and therefore on words, and on language and languages, the relationship of word to object, and language to object, was made problematic. If words do not refer to things, such that the relationship is either correct (when words mirror things accurately) or incorrect (when they do not do so accurately), then to what do words, and therefore languages as such, refer? The answer to this question reversed the familiar referentiality of empiricism. Empiricism is itself reflected in numerous locutions in many languages. Speakers and writers use locutions that describe – rather than construct or create – a material world that is supposed to exist external to, or outside of, the conscious minds within which words and languages arise. The ‘linguistic turn’ and post-structuralist premises arose from the counter-declaration that language refers only to language, and that descriptive statements are a trope, not a mirror of things in thought. From that perspective, language is a closed system through which meanings arise in relation to other meanings, not through a relationship between language-user and ‘external’ world – however it is conceived or experienced. During the ‘linguistic turn’ this form of intersubjective idealism developed two
further postulates: the self-referential nature of language-use is founded on an ‘excess’ through which an open-ended instability in meaning-making is definitionally inherent; and within the linguistic system there are ‘performative’ speech-acts that are coincident and co-constitutive with social action. The former explains the creative powers through which cultural and social change, including natural/physical scientific and technological innovation, is possible yet always already unstable; the latter demonstrates the way that abstractions are made real and intelligible through practices that involve citation and repetition. The now commonplace examples of performativity in speech-acts include the marital statement ‘I do’, which does not refer to marriage as an abstract idea but rather realizes a specific social actuality when publicly performed. Thus a performative concept, and the corresponding illocutionary declaration, perform and construct the meaningful reality of marriage as an on-going social institution. More controversially, this view of human language users as meaning-makers dissolves familiar binary distinctions between the material (as invested with an inherent stability and predictable regularity from which certainties can arise) and the immaterial (contrarily invested with instability and unpredictability, hence uncertainty). The immaterial realm was thus in practice, and metaphorically, a home for mere opinion, whereas the material realm was in practice, and metaphorically, a home for knowledge and – within appropriate protocols – science. The dissolution of this familiar dichotomy arises from a practice-based and meaning-oriented concept of materialization as the repetitive, citational learning through which the distinction is made socially operative (Butler, 2011). Put very simply, we learn to talk that way, so materiality is projected as a matter of routine in order to generate for us ‘things’ to which words are said to refer. 
Similarly, we learn to reject locutions that fail to follow that particular repetitive pattern but are
instead incorporated into another, contrasting category. Thus, among many language-users the locutions ‘horse’ and ‘god’ map to ‘real’ and ‘unreal’; though it is easy to imagine, or indeed experience, cultures where the opposite repetitive citation would reverse the attribution. The history and sociology of natural/physical science, understood from that perspective, provides many similar illustrations, as do the social sciences.
Post-Empiricism and Power-Relations The reversal of empiricism identifies referentiality – as a truthful-or-not relationship between words internal to the mind and things said to be external to it – as a trope within language itself. That is, we are used to repetitive practices expressing an image of mind, world and knowledge suited to certainty. Thus the world as variously understood within different languages, in different cultures and by different individuals, will have such commonalities as human language users generate, and also perforce importantly such differences as they generate. But the world will not exist in a meaningful sense until meaning-makers work together in practical activities to make meanings that more or less suit themselves. Thus, humans are what they are in and through their activities as language-users. Because language has properties of excess and instability, it incites distinction-making and the institution of differences. Those properties are thus the origin of power-relations and hierarchies, and of knowledge-claims, whatever the terms through which these claims are expressed. From that basis arise disciplinary practices, however physically violent or verbally rhetorical. Power hierarchies and power-plays are thus inescapable among language-users, for two reasons: communication between language-users is always non-coincident; and power is
everywhere because language is everywhere. Or to put this in commonplace terms: human nature lies in the nature of language, not in any set of physical, moral or divine properties given to, or inherent in, human bodies, subjectivities or souls. If there were such things to be known, we would have to know them within language, apart from which there can be no meaning-making practices. Communication between language-users will always be miscommunication, no matter how strictly terms are defined and deviance is punished. Though meanings are made through the social activities of which individuals are constituents, it does not follow that each individual interprets the meanings that are made in precisely the same way. That is because individuals as language-users are necessarily active interpreters, rather than passive recipients, of verbal messages from one to another. Thus, the excess that is a property of language necessarily arises within language-users as individuals. In general terms, agreement in forms of words is conditional on agreement in forms of life, which is necessarily unstable, as meaning-making among individuals evolves (Wittgenstein, 2009: §241). As initially practised on texts, hermeneutics presumed authorship in individuals, even if they could not be identified authoritatively. Following a protocol that valued certainty, linguistic analysts aimed to identify meaning as a singularity located in an authorial mind, conceptualized as the author’s intended meaning. Truth in hermeneutics was therefore the revelation, by means of analytical methods, of this otherwise hidden or misinterpreted message. However, from the post-structuralist perspective, and following the ‘linguistic turn’, a reversal was in order: meaning-making is an active process of interpretation engaged in by readers of texts. It is of necessity uncontrolled by authors themselves or their hermeneutic avatars. 
It further follows – and this is another reversal of hermeneutics as initially practised – that social activities
are themselves texts to be read, whereas written texts were formerly considered to be the sole instance of textual artefacts. Many of the methods developed to de-code written texts, such as symbolic analysis or semiotics, were thus adapted to understand non-verbal modes of communication, whether gestural (e.g. bodily movements and expressions) or representational (e.g. visual and aural communicative modes) (Chandler, 2007). Those media, of course, may be more or less precise in conveying meanings than spoken or written words, depending on the respective circumstances of the communicators. Structuralist approaches to meaning work from a data-set or corpus of spoken or written words. Analysis presumes that meaning is located in individual words, groups of words, locutions, repetitive patterns and the like. The research goal is that revelatory understanding will emerge that bears on an author’s non-obvious intent or a hidden pattern in community meaning-making. The properly schooled and trained researcher is thus, like a scientist, licensed by protocols of certainty to state otherwise obscure or obscured truths such that knowledge is made available authoritatively. By contrast, post-structuralism and the ‘linguistic turn’ take meaning-making to be constitutive of humanness itself. Human understanding is therefore a capacity common to all language-users, an insight loosely derived from phenomenology (Inwood, 2000). Moreover, meaning-making arises through non-verbal as well as verbal activities, and communication occurs through signs, which are spoken or written words, and all other representational media. Communication is necessarily imperfect, so any claims to authority and thus control are matters of persuasion. Those claims may then succeed or fail as power-plays (Bateman, 2014). 
They succeed when meaning-making is institutionalized as education, training, law and convention and suchlike disciplinary methods that attempt to ensure agreement, uniformity and certainty. They fail, at least somewhat, when – given
the capacity of language-users to generate distinctions and differences – resistance and subversion arise as contrary power-plays. The art of rhetoric was a classical study in the art of persuasion. However, contemporary accounts of rhetoric are well placed to summarize what the post-structuralist perspective can do. Rhetoric has traditionally considered the speaker and concomitant stylizations or body-language that are deployed in public settings. Rather than merely verbal statements or responses of agreement, speakers often aim to stimulate action. We are thus close to the speech-acts considered by theorists of the ‘linguistic turn’ and to the post-structuralist view of communication as a realm of power-play. Thus, rhetorical analysis is a good place to begin the consideration of interpretative methods in detail. Moreover, the social context – public-speaking to persuade – is paradigmatic for those who study politics, political scientists or otherwise. It is also certainly a key to the careers of politicians and political activists. The American political scientist Richard E. Neustadt famously said, ‘The power of the President is the power to persuade’ (Neustadt, 1960: 1), and perhaps even more famously, Otto von Bismarck, from the practitioner side, remarked, ‘Politics is the art of the possible …’ (Oxford Essential Quotations, 2016).
Rhetorical Political Analysis

Classical rhetoric operated in a realm of instability rather than certainty, offering technique rather than knowledge, action rather than truth. It was thus marginalized as an applied art in relation to the more abstract accounts of human experience pursued by philosophers within which knowledge, and therefore criteria of truth, were the stated outcomes. Much the same marginalization and exclusion characterizes rhetorical political analysis today as a set of methods, unless post-structuralist premises and perspectives
are invoked, in which case those methods are foundational. At this point, we encounter the characterization of rhetors or speakers, engaged in power-plays, as necessarily or at least habitually untruthful. That characterization accords with the practical, power-related character of the activity, such as politics, but presumes a subject-object epistemology aiming at singular truths. The contrary, post-structuralist position resolves itself into the situation as already understood: the human world of meaning-making is itself about relative powers of persuasion, rather than a situation in which some humans make regrettable departures from truths that can be established with unarguable certainty. For students of politics the situation is already one of ‘doing things with words’, whether spoken or written, so classical protocols can be adapted and updated (Austin, 2018). Using rhetorical analysis, speech can thus be classified by genre, for example: forensic or judicial; epideictic or ceremonial; deliberative or political. Those three genres map very roughly to the past, present and future respectively, constructed as ideas in discourse. Genre is thus a shortcut, for speaker and audience alike, to intelligibility. And with intelligibility there is then the possibility of approximate and conditional alignments among active meaning-makers. Rhetorically, the issue at stake in a speech can be parsed as conjecture, in relation to a truth or falsehood; definition, in relation to what to think or not to think; quality, in relation to an action as good or bad; and circumstance, in relation to differentials in authority and power. A speaker endeavours to persuade by discovery, that is, revealing the argument to the audience, and so marshalling appeals that are typically of three kinds: logos, or appeals to reason; ethos, or appeals to authority; pathos, or appeals to emotion.
The dispositio or arrangement of these appeals was crucial to teaching rhetoric as an art, and can therefore be used to analyse political speech: exordium or introduction
to prepare the audience; narratio or narration to set out facts selectively; confirmatio or proof to present the argument; refutatio or refutation to reject alternatives; peroratio or conclusion, to sum up persuasively. From the speaker’s point of view, individual interpreters in the audience should be more aligned in their understanding, feeling and motivation towards agreement and action, and less aligned with alternative views, feelings of rejection and motivations of opposition. The rhetorician will employ aspects of style or elocutio as persuasive devices within and supervenient to speech. In appealing to logos or truth, speakers may affect a styleless mode of factuality, using the simple, direct language of the literal; in appealing to ethos or authority, they may deploy overt mannerisms that repetitively and therefore performatively reference legitimated power; in appealing to pathos or emotion, they may deploy any number of verbal images or tropes to elicit positive feelings about themselves or negative feelings about others. All those devices of style rely on denotative and connotative aspects of meaning-making: the former referentially specific and limited; the latter associative and suggestive (Martin, 2014). Classical rhetoricians identified hundreds of rhetorical schemes, too many to list here. Indeed, those lists were the core of their methodological contribution to political studies (Lanham, 1991). Rhetorical analysis considers repetition in various forms, antithesis and binary contrast, puzzle and resolution. Most famously, we have the rhetorical question, a device in which the speaker already knows the answer, and the audience either already knows it, too, or knows that the speaker knows it and will reveal it on the spot. Rhetorical analysis, while speaker-focused, is also founded on audience-reception.
Analytically that can be parsed down to individuals who may – or may not – interpret the devices as the speaker intends, and therefore may – or may not – be persuaded. Imagery or tropes are at the fine-grained end of rhetorical analysis. Figures of speech
are connotative practices of association, typically founded on metaphor and simile as ways of transferring meaning from one term to another. Tropological analysis extends to hyperbole, or exaggeration; irony, or saying one thing and meaning another; paradiastole, or a reversal of moral significance through deliberate misnaming; and numerous other ways of using language to effect persuasion (Carver and Pikalo, 2008).

With televisual and digital media, delivery or actio has become increasingly important, given that communication from speaker to audience can be expanded when recorded material is shared, sometimes across platforms and to world-wide audiences. Moreover, politicians are encouraged by that situation to access theatrical coaching and actor-appurtenances to make their performances persuasive. Voice, gesture, dress, embodiment, props, lighting and make-up are all important constituents not just of performance but of identity. Branding is understood as an image to which repetitive citation can refer performatively (Howells and Negreiros, 2018).

Rhetorical analysis, conducted this way, reveals how truths are constructed as persuasive practices in speaker-audience situations. Analytically the object is to show how this is done, and to construct plausible accounts of effectivity. Those results will be derived from observation in the full sense, rather than from limiting the object of study to a words-in-transcription, that is, a reduction of experiential data to verbal units.

Note the term ‘plausible’ in the paragraph above. As an interpretative method, rhetorical political analysis itself must persuade an audience that an analysis is meaningful and significant. But this persuasive communication can occur only in relation to the criteria that each interpreter brings to the communicative context. Interpreters are always at liberty to make individual judgements, even if they keep these to themselves and outwardly dissemble.
Post-structuralist premises are thus founded on the indeterminacy
of language and the inherent uncertainty in any one human about knowing another.
Discourse Analysis and Deconstruction

Discourse analysis and deconstruction arose within the development of the ‘linguistic turn’ and post-structuralist premises. Together these developments summarize the performative approach to meaning-making as embodied and enacted in social circumstances that are inherently imbued with power and hierarchy (Howarth, 2000). Other methodological approaches to communication, derived from Kantian philosophy, or from empirical linguistics and ideology-critique, will be considered separately below. There are thus profound differences in presuppositions underlying what might otherwise seem to be a common, indeed commonplace term: discourse. Discourse refers here to speech and text, though in another post-structuralist reversal, text has taken priority over speech. That priority is asserted notwithstanding the obvious fact that humans learn to speak before they can read and write, and that reading and writing postdate the existence of speaking humans. Moreover, text is understood here to include any mode of human expression that communicates meaning, even when no written or spoken words are involved. Thus, discourse in this broadly defined way includes images, both still and moving, sound, movement including dance, and all the symbology of semiotics (Mills, 2016). Understood that way, discourse analysis invites methodological eclecticism, borrowing insights, concepts and procedures arising in art history, aesthetics, photography, cinema-studies, media and cultural studies and communication theory. Given the global ubiquity and evident effectivity of digital social media, notably in political communication and campaigning, discourse analysis offers a powerful
framework that embraces real-world complexities (Weldes, 2014). From the post-structuralist and performative perspective, written texts are in the first instance physical objects, whether inscribed on paper or on a screen, or rendered visible in some other medium, or made aural through fluid vibration. Words in texts are thus not transparent windows used by ‘knowers’ to view meaning as objectively ‘there’ in a common conceptual space. Rather, texts are themselves objects with ‘surfaces’ presented ‘to be known’ by interpreters for whom meanings are variously ‘there’, but subjectively in individual consciousnesses. Written texts thus have surface properties which can be analysed using methodological concepts and protocols that reveal how objects are constructed so that meaning-making takes place. That approach does not look for an underlying meaning, but rather promotes an exploration of the textual surface by different interpreters. Therefore, a variety of meanings will emerge as readers engage with texts. Discourse analysis presumes that human communicative relations are inherently antagonistic and conflictual because they are constituted in and through articulatory practices of meaning-making undertaken by individuals as readers. Articulation is the construction of nodal points where meanings are partially and temporarily fixed or ‘sutured’. Logics of equivalence and difference can then be traced in written texts as sequences of nodal points. It is through those nodal points that social practices are defined, stimulated and promoted. Social practices as meaning-making activities, and written texts as meaning-making objects, are thus mutually constitutive. As with linguistic excess, those processes of meaning-making are open-ended and never-ending. The presumption of antagonism in social relations generates the concept of the discursive ‘other’, a moment of negation that sparks meaning-making as an articulation of differences into an unstable equivalence. 
Since that unstable equivalence already
contains differences, and therefore inherent antagonisms, articulations of equivalence are necessarily vulnerable to re-formation as further differences, and similarly along endless chains of signification. In more formal and even more abstract terms those procedures presume that any given concept has a constitutive ‘outside’, i.e. anything is what it is, only because it is not any number of other things. And they presume that human meaning-making arises from conscious and sub-conscious emotions of fear and anxiety that are never really resolved into stability but rather constitute inherently what we are. The former view is derived from the philosopher G. W. F. Hegel, and the latter borrowed from Sigmund Freud, the founder of psychoanalysis. As meaning-makers humans have subjectivities through which agency arises, though agency in this framework does not presuppose a stability of identity or a consistency in consciousness, rational or otherwise. Rather, subjectivities arise in and through subject positions, which are products of meaning-making activities into which individuals are interpellated, or ‘hailed’ (Montag, 2002). However dominated individuals may then be, or however constrained they feel, linguistic ‘excess’ is the medium through which complicit or subversive agency arises and is made meaningful. The concept of hegemony provides an explanatory framework through which, in political terms, individuals may be, or may be said to be, consenting to oppressive structures. Or, conversely, they may be working to rearrange power-relations as a matter of resistance. Conceived on this basis and in these terms, discourse analysis necessarily invokes deconstruction as a method. While the above protocols explain how meaning is performed in social activities, deconstruction mandates historical research in order to establish the conditions of possibility that enabled a text of that kind to appear as meaningful.
Research is directed genealogically towards preceding texts, and thus to former meaning-making activities. A given text only makes sense as a dialogical
successor to former ideas and practices, so deconstructive analysis embraces diachrony as well as synchrony. Deconstruction works from a hermeneutics of suspicion, enabling researchers to identify essentializing, naturalizing and universalizing logics in texts. Those logics work to persuade readers that meanings are thus secured as certain and moral. Deconstruction, which presumes the undecidability of concepts, and reveals the power-plays inherent in meaning-making practices, is an important, though non-formulaic, tool in political analysis. Rather than providing researchers with a method to follow formulaically, the analytical perspective arising from post-structuralist premises and the ‘linguistic turn’ promotes creativity and individuality in researchers, as well as innovation in methodology and novelty in results. However, under the general heading of discourse analysis there are contrasting approaches, which sometimes generate unproductive hostilities, or simply mutual misapprehension. Here are two further approaches, which also contrast with each other.
Communicative Action and Discourse Ethics

A theory of communicative action, or in some versions discourse ethics, uses a Kantian methodology to intuit, formulate and propound an ideal speech situation through which rational individuals communicate dispassionately in order to reconcile their differences and to achieve consensus. Reasoning from first principles, it is possible, on this view, to deduce transcendentally, i.e. abstractly from logic rather than empirically from evidence, a set of rules through which argumentative discourse should proceed. Using these rules as analytical tools, rather than as practical constraints, analysis can thus reveal defects and dysfunctions in real-life situations. The rules can be summarized in relation to dialogical participants as: inclusion and open admission; free questioning;
freedom of assertion; openness in expression; and exclusion of coercion. As a diagnostic tool those rules are ideals, but subject to revision. Counter-arguments can be shown, by invoking performative contradiction, to presuppose what they object to. The communicative action approach is thus the inverse of the post-structuralist discourse analysis discussed above. That view was to some extent argued in opposition to the presumption within the communicative action approach that there is a singular ideality of consensus to which speech and behaviour ought to conform. Post-structuralist premises are also in opposition to a view that antagonism and conflicts are necessarily defects in, rather than inherent constituents of, humanness. Where post-structuralism sees persuasion and power as constitutive of humanness, discourse ethics sees morality and values. Many researchers in political studies prefer to position their results positively in relation to morality and values, working from the premises of communicative action, rather than to put themselves into an ambiguous position in relation to political judgement and action, as post-structuralists do.
Critical Discourse Analysis

Critical discourse analysis, self-styled CDA, arose from the traditional hermeneutics that promised the discovery of hidden meanings. Those meanings were said to be concealed in texts, whereas properly trained researchers applying science-like protocols could reveal them accurately (Machin and Mayr, 2012). The ‘critical’ in CDA was derived from so-called critical theory, which posits a political realm of interest- and/or class-driven ideology (Freeden, 2003). Ideologies articulate an outlook or worldview systematically and persuasively. But they are in some sense – to be revealed by means of intellectual critique – misleading, selective, parti pris and suspect. Critical theory and ideology-critique derive
loosely from Marxism and the writings of Marx and Engels, who argued that the ruling ideas in politics, law, religion, morality and suchlike are not derivative of timeless truths, but rather historical products. Therefore, they are malleable effects of meaning-making. Despite overt claims to be in the general interest, or indeed everyone’s individual interest, such systems of ideas are covertly articulated so as to benefit some classes in a society to the detriment of others (Bronner, 2017). Although derived from empirical linguistics, within which claims to scientificity rely on a value-neutral stance of objectivity, CDA has reversed the enterprise in order to embrace an egalitarian political perspective. And like post-structuralist approaches to discourse as meaning-making, it has also embraced visuality. The tool kit elaborated in the how-to volumes generated within CDA overlaps considerably with the eclectic mix of concepts developed by discourse analysts working from post-structuralist premises. The difference between the two arises, however, in two things: the singularity of the critical project for CDA, that is revealing a hidden truth; and the necessary positioning therefore of the researcher as superior, through training and credentialization, to the ordinary reader. By contrast, the post-structuralist approach, as discussed above, embraces indeterminacy in meaning-making, rather than the singularity of the hidden truth. And it acknowledges uncertainty about meaning-makers, rather than proceeding from mere suspicion. Taking the meaning-making involved in discourse analysis itself to be persuasive, rather than definitive, thus relieves discourse analysts working from post-structuralist perspectives from charges of authoritarianism and elitism.
Critical Realism and Social Constructivism

So far we have pursued interpretative methods on post-structuralist premises related to
the ‘linguistic turn’. And we have briefly noted the two contrasting starting-points: discourse ethics and ideal-speech, on the one hand, and critical theory and ideology-critique, on the other. Two further alternatives are now on the horizon. Critical realism was a response to the way that post-structuralists embraced uncertainty in judgement and identified meaning-making with power. For critical realists this could be resolved with a conditional but Kantian argument, namely that humans should be acting as if certainty in ethical and scientific judgement is attainable, even if there is no unarguable basis for indubitable deduction. And similarly they argue that knowledge-making should proceed as if humans are capable of knowing how any phenomenon really is in itself, thus expecting human categories and all else in the world to arrive eventually at coincidence. That position is thus a heuristic presupposition, licensing certainty but on a conditional basis. That tactic, so it was argued, avoided the problems of relativism. Relativism implies that the ultimate absence of certainty in judgement necessarily disallows the invocation of any criteria at all on which judgement can be soundly based (Benton and Craib, 2001). A post-structuralist response is that resolving differences into relative powers of persuasion presumes that countervailing powers are possible and are themselves founded on judgements. For judgements to be effective, human agents must make meaningful some courses of action as opposed to others. Or put simply, any alleged foundational and thus indubitable certainties, whether of ethics or science, are themselves performatives that enact, via meaning-making communities, what they purport to describe (Chambers and Carver, 2008). Constructivism references post-structuralist premises and the ‘linguistic turn’, as described above, but only to the point that the existence and properties of the material world come into question.
Thus it is said that humans construct the social world in and
through performative concepts. Those meanings are projected into, and arise out of, social activities that are meaningful and intelligible, precisely because they are constructed in that way. However, the material world, and the sciences thereof, are typically bracketed off from consideration, thus referencing a social-/material-presumed binary which divides objects of knowledge that are different in essence. However, following the discussion of materialization provided above, the post-structuralist riposte to constructivists is that concepts of the material, and all other categories of science, arise performatively within human communities. Thus material technologies, as well as social ones, are seamless with processes of discovery and validation. As methodological premises and useful heuristics, both critical realism and constructivism function within political studies as approaches that incorporate many of the interpretative methods considered here.
Visuality and Communicative Objects

Theorizing from post-structuralist premises aims for complexity rather than simplicity. It thus rejects the scientific tradition which has, for some centuries, celebrated reductionism, parsimony and elegance. The orientation towards complexity is a clue to the inclusion of visual meaning-making not only within, but crucially important to, discourse analysis as practised on premises derived from post-structuralism and the ‘linguistic turn’. Words, whether written or spoken, are essential to meaning-making, and – through the preservation of written texts and repetitive intertextual citation – to communicative practices. But then so are visuality and aurality. Images and sounds are important meaning-makers, though they need not, and in many cases do not, occur in conjunction with written or spoken words. Modern cultures are logocentric in conflating writing and speech
with concepts, and thus finding images and sounds problematic as meaning-makers in communicating ideas. Like written and spoken texts, images mean different things to different people. As the commonplace saying goes: one picture is worth ten thousand words. Like words, images can be denotative, or representational, and connotative, or associational. Similarly, they evoke feelings and emotions. And rhetorically they can persuade or dissuade. Like people they ‘want’ to be looked at, to be engaged in dialogical meaning-making, and to be social creatures and political agents (Mitchell, 2005). Of course that is a projection of humanness into physical objects, but then that trope licenses an interpretative analysis of non-verbal communications. Non-verbal communications, perhaps because of their ubiquity and potency, are often even more effective meaning-makers than purely verbal media. Thus meaning-making does not have to come to humans only from other humans via the physical media as described. Rather, meaning-making within human social activities is done in conjunction with further physical objects and phenomena. Thus images and sounds do not merely represent concepts, albeit defectively and imprecisely. Nor are they merely vehicles for conveying meanings that are necessarily only verbal. Rather, images and sounds convey conceptual and emotional messages, which may or may not be easy to put into words. As communicators of concepts through which we experience sociality as meaning-making, they are indispensable to being human. This understanding of meaning-making extends even more to the built environment. That is because the instantiation of concepts, such that meanings are communicated more or less effectively, and then read and interpreted variously by individuals, is a constituent of architectural theory and practice, and similarly with respect to interior design (Yanow, 2014c).
Discourse analytical methods thus include picture-space, geometry, composition, colour, light, perspective, symbolism, culture,
audience, intention, economics, reception and any number of similar categories developed in art history and aesthetics (Rose, 2012). For photography, many of those apply similarly but with additions of ‘the gaze’, viewer-camera positioning, framing and cropping, the window-on-reality effect and similar technical considerations (Hand, 2012). For moving images, whether cinematic, animation or amateur video, a grammar of narrative meaning-making has been derived from literary studies to which technical terms are analogous: editing and cuts are similar to the ways that prose and poetry shift time and space; montage and fade-out mimic narrative devices that proceed episodically; mise en scène and set-dressing condense the continuous prose of narration and description. To the dramaturgy of theatrical performances cinema adds camerawork understood as close-up, long-shot, focus-pulling, panning, zoom, dolly shot, camera motion and many other techniques – taken as analytical tools – through which a film can be read. Reading a film is thus portraying it dialogically as a meaning-maker (Monaco, 2009). Narrative analysis is applicable to any medium in which someone tells a story, whether it is a novelist or film-maker. This includes authorial voice in a novel or voiceover in a movie, or an interviewee in response to an unstructured or semi-structured question from a researcher (Bevir, 2006). The method is also applicable to non-verbal communicative objects, such as pictures or photographs, individually or sequentially, when viewers construct a narration that puts images into words. In commonplace terms, a story has a beginning, a middle and an end, and is narrated in prose or poetry. In relation to living human subjects researchers will necessarily have a self-reflexive account of what they are doing, and research projects are themselves narrations with a narrator.
Over and above the rhetorical, symbolic and metaphorical or tropological considerations that are operative in discourse analysis, narrative
analysis requires genre-classification. This includes comedy, tragedy, satire, romance and the like, each of which will have defining features. It also requires consideration of the narrator’s point of view and reliability; tests of consistency and continuity in relation to space, time and character; and crucially, reconstruction of social mythologies, through which facts and fictions are understood. It also apprehends concepts of identity, through which self-understandings are pursued in dialogical relations of recognition and misrecognition. Narrative analysis offers a powerful way to explore meaning-making (Charteris-Black, 2005). Aural meaning-making, other than speech, is the area where there is the least consensus on analytical categories and methodological tools. So far it relies on rather banal staples of musical appreciation, such as the association of minor keys with sadness, or the use of evocative genre-distinctions, such as the association of march-time rhythms with militarism (Franklin, 2005). As recording and playback technologies have developed, music has become very widely accessible in text-less and disembodied modes. In that way, it is increasingly experienced apart from live performers or moving images with dialogue. In genres of pure sound, with increasing reduction of, and isolation from, background noise, listeners are encouraged to be meaning-makers independent of authorial or other instruction. Cinematic sound design is highly developed, including even subliminal and other aural effects. But even within multi-media studies, analysis of sound as itself meaning-making, rather than meaning-enhancing, represents a methodological opportunity (Sexton, 2007).
Conclusion

A perspective that follows post-structuralist premises and the implications of the ‘linguistic turn’, as we have done here, affords
researchers the most extensive array of interpretive methods, and the most promising possibilities, for knowledge-creation in the study of politics. However, knowledge-creation that proceeds from other premises, e.g. the subject-object empiricisms through which protocols of reduction, deduction and induction are deployed, is sometimes extended to include data-collection from non-verbal media. In that way, such media-specific methods can apply within political science (Banks and Zeitlyn, 2015). Empirical research into the effectiveness of political campaigns, for example, can take up visual and aural data with appropriate tools, thus improving on intuitive and untutored observations. The practice of obtaining data from non-verbal sources inevitably raises questions as to the comparative validity of results. The use of interpretative methods, appended to empiricist social science, will produce knowledge that always looks subjective in relation to the researcher. And it will look uncertain in relation to reproducibility of results, predictive power of models, or explanatory value of conclusions. Taken on other terms, namely those of post-structuralism and the ‘linguistic turn’, interpretative methods will open up a non-reductionist understanding of politics as human social interaction arising from power-differentials. They will also necessarily promote an all-round consideration of the meaning-making activities through which political relations actually operate (Yanow and Schwartz-Shea, 2012).
References
Austin, J. L. 2018 [1955]. How to Do Things with Words. Eastford, CT: Martino.
Banks, Marcus and David Zeitlyn. 2015. Visual Methods in Social Research. 2nd edn. Thousand Oaks: Sage.
Bateman, John A. 2014. Text and Image: A Critical Introduction to the Visual/Verbal Divide. Milton Park: Routledge.
Belsey, Catherine. 2002. Poststructuralism. Oxford: Oxford University Press.
Benton, Ted and Ian Craib. 2001. Philosophy of Social Science. Basingstoke: Palgrave.
Bevir, Mark. 2006. How Narratives Explain. In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 1st edn. Armonk: M.E. Sharpe, 281–90.
Bronner, Stephen Eric. 2017 [2011]. Critical Theory. 2nd edn. Oxford: Oxford University Press.
Butler, Judith. 2011 [1993]. Bodies that Matter. Milton Park: Routledge.
Carver, Terrell and Jernej Pikalo (eds). 2008. Political Language and Metaphor. Milton Park: Routledge.
Carver, Terrell and Matti Hyvärinen (eds). 1997. Interpreting the Political. London: Routledge.
Chambers, Samuel A. and Terrell Carver. 2008. Judith Butler and Political Theory. Milton Park: Routledge.
Chandler, Daniel. 2007 [2002]. Semiotics: The Basics. 2nd edn. Milton Park: Routledge.
Charteris-Black, Jonathan. 2005. Politicians and Rhetoric: The Persuasive Power of Metaphor. Basingstoke: Palgrave Macmillan.
Critchley, Simon. 2001. Continental Philosophy. Oxford: Oxford University Press.
Culler, Jonathan. 2002 [1983]. Barthes. Oxford: Oxford University Press.
Easton, David. 1981 [1953]. Political System: An Enquiry into the State of Political Science. New York: Alfred A. Knopf.
Farr, James. 2003. The New Science of Politics. In: Terence Ball and Richard Bellamy (eds), The Cambridge History of Twentieth-Century Political Thought. Cambridge: Cambridge University Press, 431–45.
Franklin, M. I. (ed.). 2005. Resounding International Relations. Basingstoke: Palgrave Macmillan.
Freeden, Michael. 2003. Ideology. Oxford: Oxford University Press.
Hand, Martin. 2012. Ubiquitous Photography. Oxford: Polity.
Hawkesworth, Mary. 2014. Contending Conceptions of Science and Politics. In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 2nd edn. Armonk: M.E. Sharpe, 27–49.
Howarth, David. 2000. Discourse. Buckingham: Open University Press.
Howells, Richard and Joaquim Negreiros. 2018. Visual Culture. 3rd edn. Oxford: Polity.
Inwood, Michael. 2000. Heidegger. Oxford: Oxford University Press.
Kuhn, Thomas S. 2012 [1962]. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Lanham, Richard A. 1991. A Handlist of Rhetorical Terms. Berkeley: University of California Press.
Machin, David and Andrea Mayr. 2012. How to do Critical Discourse Analysis. Thousand Oaks, CA: Sage.
Martin, James. 2014. Politics and Rhetoric. Milton Park: Routledge.
Matthews, P. H. 2003. Linguistics. Oxford: Oxford University Press.
Mills, Dana. 2016. Dance and Politics: Moving Beyond Boundaries. Manchester: Manchester University Press.
Mitchell, W. J. T. 2005. What do Pictures Want?: The Lives and Loves of Images. Chicago: University of Chicago Press.
Monaco, James. 2009 [1977]. How to Read a Film. 4th edn. Oxford: Oxford University Press.
Montag, Warren. 2002. Louis Althusser. Basingstoke: Palgrave.
Neustadt, Richard E. 1960. Presidential Power: The Politics of Leadership, p. 1. New York: John Wiley. Oxford Essential Quotations, 4th edn, ed. Susan Ratcliffe, Oxford University Press, 2016, available at: https://www.oxfordreference.com/view/10.1093/acref/9780191826719.001.0001/q-oro-ed4-00001699 (accessed 6 January 2020).
Rorty, Richard. 1989. Contingency, Irony, and Solidarity. Cambridge: Cambridge University Press.
Rose, Gillian. 2012. Visual Methodologies. 3rd edn. Thousand Oaks: Sage.
Sexton, Jamie (ed.). 2007. Music, Sound and Multimedia: From the Live to the Virtual. Edinburgh: Edinburgh University Press.
Weber, Cynthia. 2001. International Relations Theory. Basingstoke: Routledge.
Weldes, Jutta. 2014. High Politics and Low Data. In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 2nd edn. Armonk: M.E. Sharpe, 228–38.
Wittgenstein, Ludwig. 2009 [1958]. Philosophical Investigations. Oxford: Wiley-Blackwell.
Yanow, Dvora and Peregrine Schwartz-Shea. 2012. Interpretive Research Design: Concepts and Processes. Milton Park: Routledge.
Yanow, Dvora. 2014a. Thinking Interpretively. In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 2nd edn. Armonk: M.E. Sharpe, 5–26.
Yanow, Dvora. 2014b. Neither Rigorous nor Objective? In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 2nd edn. Armonk: M.E. Sharpe, 97–119.
Yanow, Dvora. 2014c. How Built Spaces Mean. In: Dvora Yanow and Peregrine Schwartz-Shea (eds), Interpretation and Method. 2nd edn. Armonk: M.E. Sharpe, 368–86.
Zimmerman, Jens. 2015. Hermeneutics. Oxford: Oxford University Press.
25 Methodology: Qualitative and Quantitative Approaches¹
Nathaniel Beck
Introduction
Political methodology deals with all issues related to how to do empirical political research (non-empirical work, such as pure formal or normative theory, is excluded here). Methodology, as it is understood here, simply refers to the ways in which we acquire knowledge and comprises a multitude of specific methods and techniques. As such, it is embedded in an epistemological tradition of ‘critical rationalism’ (Popper, 1963) and ‘scientific realism’ (Moses, Chapter 27, this Handbook). This has been summarized as the ‘twofold conviction that the world consists of causal mechanisms that exist independently of our study – or even awareness – of them, and that the methods of science hold our best possibility of our grasping their true character’ (Shapiro, 2005: 8–9). While it is often confused with narrower topics such as statistics, methodology is a broad area that deals with every aspect of political research, both quantitative and qualitative. While some
methodological issues are more relevant to certain subfields or types of research, all political science is subject to similar standards and logic. Political methodology is related to more general social science methodology, and a shared logic and common standards hold across the empirical social sciences; nevertheless, specific issues distinguish political methodology. Since political science is itself defined by substantive questions, there is much importing of methods from other disciplines into political science. Questions of what is imported and the relevance of imported methods are important issues in political methodology. This entry discusses some of the major advances in this field. While political methodology deals with empirical research, there cannot be any purely empirical research. Every empirical study involves some relationship between a theoretical concept and its empirical referent. Even very narrow empirical research, such as the measurement of electoral turnout
in a given locality in a given period, requires a theoretical assessment of what electoral turnout is. For example, are people who are of legal age to vote but excluded from the process because of a prior felony (as some are in the United States) counted in the denominator? While empirical studies of voting turnout are less complex theoretically than studies of, say, whether being a democracy causes a nation to be more pacific, both studies involve a mix of theoretical and empirical analysis and both can be assessed using the same logic. Thus, issues of measurement are always critical; such issues have become even more critical as technology makes new forms of data (video, blogs) available, or makes it possible to easily analyze data that we have always used but found hard to deal with (text). The ability to deal with new sources of very complicated data, and, with modern computers, the ability to code massive amounts of non-quantitative data, as well as the ability to collect individual data via the internet, are amongst the most exciting developments in political methodology (Wagschal and Ettensperger, Chapter 16, this Handbook). Similarly, there can be no difference in the underlying logic of qualitative and quantitative research. While obviously the specific tools will be different, if the question of interest is why countries have differing regulatory systems, we may pursue this in a number of ways. But, in the end, all such studies must be able to answer whether the evidence used leads to the conclusion asserted. While process tracing through official papers of decision makers is different from regressing regulatory rules on political variables, both are subject to the same fundamental logic. Likewise, while students of one subfield may use one type of quantitative method and students of cross-national comparative politics may use another type, these subfields are also subject to the same fundamental logic.
This point has been forcefully made in recent years in important books by Gary King, Robert Keohane and Sidney Verba (1994) and by Henry Brady and David Collier (2010). The interrelationship (both
similarities and dissimilarities) between quantitative and qualitative analysis, and how to combine both types to improve research, is another research area in political methodology that is seeing much discussion. Some political science research is purely descriptive: how many people vote, did more people vote last year than this year, and the like. As noted, even for this simple issue, serious measurement issues arise: which people vote and what does it mean to vote. Obviously, such issues are simpler than asking how many countries are democracies, but the methodological logic does not change. Most political science research, however, is not simply descriptive. While sometimes we only care about associations, much research uses causal language, metaphors and ideas. Thus, while we might only be interested in whether there is a correlation between being a democracy and the amount of public goods provided, we are often more interested in whether there is a causal relationship, such that being a democracy leads to more public goods being provided. In purely associational studies, the variables are all treated symmetrically; when using the language of causality there is always a variable that is being caused and one or more variables doing the causing. Issues of assessing causality are in the forefront of current discussions of political methodology, whether quantitative or qualitative. These issues can all be considered under the heading of research design. Finally, when the above issues have been dealt with, there is the issue of how to get the data to speak clearly, and how to assess what inferences can be drawn from the data (be it qualitative or quantitative). This is the realm of statistics (for quantitative studies). Obviously, the use of sophisticated statistical methods has mushroomed in political science, at least partly as a function of the increased computer power that is now commonly available. 
But while this part of methodology is often seen as highly mathematical and complicated, it is usually the simplest part of political methodology. It is also the
case that the most sophisticated statistics cannot save a poor research design or poor measurement; by contrast, good research design often leads to simple statistical analysis. This entry begins with developments in data and measurement, then goes on to discuss research design, and only then concludes with a discussion of statistical methods.
Data and Measurement
Empirical research deals with data (observations), and the earliest empirical work, that of Aristotle on constitutions, took as data the various Greek constitutions. Data comes in a multitude of forms, ranging from reading diaries or internal records of decision making, through field observation and (unstructured or semi-structured) interviews with local leaders, to the analysis of carefully collected economic statistics or a variety of highly structured survey data. Every type of research has certain types of data that are more commonly used, but all types of data are subject to the same standards and issues. There is a difference between journalism and political science, and much of that difference has to do with standards for data. Both journalists and political scientists may interview political leaders, but the way they collect and code that information is usually quite different. Modern technological developments have had an enormous impact on political science data. Much of the data we use is textual (laws, party manifestos, records of debate and deliberation, court decisions, newspaper accounts, minutes and records of administrative procedures, amongst many others). Our discipline has always used such data, but coding this data was extremely difficult and time consuming, leading to such data being underutilized. Now much of this data comes in machine readable form, and any data available in hard copy can be scanned and put into such a form. Thus it is now relatively easy
to code documents for the use of different types of words (or themes or tropes or whatever one likes). Machines can quickly search the records of various newspapers to code for various types of political events, a task that used to require a large team of graduate students and a very large budget. With modern computations, it is relatively easy for any investigator to code this textual data in a way that meets the needs of an individual research project. This is one of the most exciting advances in data collection in our discipline, an advance that is well under way. The possibilities for the analysis of textual data are almost limitless. While small bodies of text can be analyzed qualitatively, much modern text analysis works on enormous corpora (often generated by social media). Such corpora require quantitative analysis and recent years have seen many breakthroughs and new methods (Wilkerson and Casas, 2017). The analysis of surveys has been a mainstay in our discipline for the last half-century or more. Until recently, analysts were at the mercy of the survey organization; conducting a survey was a multi-million-dollar task. Thus electoral analysts in each country worked with the same standard survey collected by some national research organization; research could not go beyond the questions asked by that organization. With modern advances in communications and computers, it is now relatively easy for researchers to design their own surveys to suit the needs of a specific research project. We are also seeing more standardized surveys, so students of elections can now analyze a similar set of questions in almost any European country (and there are similar efforts in other parts of the world, with the Afrobarometer, the Latinobarometro and the World Values Survey). At the same time, a researcher wanting to study a specific event can put a survey in the field in a short space of time and at a reasonable cost. 
Larger survey houses can monitor populations over the course of an election, and even change or add questions as issues arise over the course of a campaign. Critical to this is the ability
to monitor a survey on a day-to-day basis. In past times, it might have been months or years until survey data became available; with the new technology such data is available on a daily basis (Cautrès, Chapter 28, this Handbook). Thus, for example, it was possible to track changes in the British electorate in 2010 and their responses to the introduction of debates into the British campaign. Modern technology also makes it easy to imbed experiments into a survey. Thus one can give different respondents different scenarios, or different question wordings or whatever one likes, and these different treatments can be chosen randomly. This technology has greatly increased the use of field experiments. The incredible growth of the internet has been enormously important here. Various survey organizations in many countries give the researcher access to a huge pool of respondents, as well as the tools to quickly design a survey instrument and to allow the researcher to take advantage of experimental manipulation in the questions (subject only to ethical constraints on such manipulation). Internet surveys can be undertaken at very low cost, and are within the budget of even a PhD student. While there are still many issues in the use of the internet in this manner, we are clearly seeing more and more use of the internet (both for reasons of cost and sample size, and because other modes of interviewing are becoming more problematic). Once the data has been collected, numbers must be assigned. This is the process of measurement, which relies both on concept formation and (for quantitative studies) various methods often associated with psychometrics. Qualitative scholars have paid much attention to concept formation, and, in conjunction with tools such as Charles Ragin’s (1987, 2008) Qualitative Comparative Analysis (QCA), much progress has been made. 
Students of comparative politics have paid much attention to how concepts generalize across geographic locations, and how we can generalize across locations without having too much ‘conceptual stretching’.
At the same time, statistical and computing advances have allowed researchers to move beyond the psychometric techniques that were available 20 years ago. Today there is vibrant activity in multidimensional analysis, and in recent years, scaling techniques, both uni- and multidimensional, have been put on a much firmer theoretical basis. The new textual data has brought to the forefront issues such as how to locate political parties in a multidimensional issue space, and new statistical methods have allowed for great advances in the location of individual legislators in such a space. The last decade has seen a great increase in the amount of data sharing (largely through the impetus of funding agencies) as well as journals requiring authors to make available replication data sets. While this works very well for quantitative data, it is more problematic for qualitative data (interviews, field observation notebooks and the like). However, as it becomes easier either to collect this data in digitized format or to convert it to such a format, it can be expected that it will be as easy to make qualitative data publicly available as it now is for quantitative data (though obviously there are more issues of confidentiality and the like).
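The machine coding of documents for the use of different types of words, described earlier in this section, can be illustrated with a minimal sketch. It assumes a hand-built coding dictionary; the categories, keywords, document names and the function `code_document` are all hypothetical and chosen only for illustration. Real projects would typically rely on validated dictionaries or on the statistical text models cited above.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: category -> keywords (illustrative only).
CODEBOOK = {
    "economy": {"tax", "taxes", "jobs", "budget"},
    "security": {"war", "army", "police", "border"},
}


def code_document(text, codebook=CODEBOOK):
    """Count how many tokens in `text` fall into each codebook category."""
    # Lower-case the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, keywords in codebook.items():
            if token in keywords:
                counts[category] += 1
    # Report every category, including those with zero matches.
    return {category: counts[category] for category in codebook}


# Two invented text snippets standing in for party manifestos.
documents = {
    "manifesto_a": "Lower taxes and more jobs. A balanced budget creates jobs.",
    "manifesto_b": "Secure the border, fund the police and support the army.",
}

coded = {name: code_document(text) for name, text in documents.items()}
for name, row in coded.items():
    print(name, row)
```

The resulting counts are a measurement in the sense discussed above: whether they capture the intended concept depends entirely on how well the dictionary reflects it, which is a question of concept formation rather than of computation.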
Research Designs
Political scientists have relied heavily on observational studies (whether quantitative or qualitative). However, while there is some interest in pure description, there is usually more interest in making causal interpretations from the data. Thus, while we begin with simply observing that pairs of democracies usually do not go to war, we are more interested in the question of whether, as countries democratize, they become less likely to go to war. Finally, we try to deepen the explanation, asking what facet of democracy makes democracies less likely to fight each other.
The question of how we can infer causality from observational data has vexed philosophers as long as there have been philosophers. The meaning of causality is a vibrant topic in modern philosophy (Baumgartner, Chapter 18, this Handbook). Applied researchers have attempted to find ways to assess causality, and, at a minimum, to attempt to rule out other, non-causal explanations for findings. In the above example, being a democracy may not really be the causal variable; perhaps, instead, the real causal variable is economic development, and richer countries are simply more likely to be democracies. Thus the observed association (correlation) between democracy and peacefulness could be artifactual or spurious. Both quantitative and qualitative researchers have devoted enormous attention to this issue. On the purely qualitative side, researchers have paid great attention to Mill’s Methods (Mill, 1843). Thus we see large numbers of comparative case studies, with researchers choosing the cases so as to obtain variation on the key dependent variable and the causal variable, but little or no variation on other variables. Researchers also choose cases for theory testing based on the cases that are likely to prove hardest for the theory to explain. Researchers are also taking advantage of difference-in-differences designs, where two cases are compared at two different times, where the cases were originally similar but a key variable (and hopefully only that key variable) has changed in one case but not the other. Moving to larger numbers of cases, researchers have used various configurational techniques to see how variables are related to each other, and to study complicated causal paths. Great attention has been paid to necessary and sufficient conditions and to designs which can distinguish whether a condition is necessary, sufficient, both or in some more complicated relationship to a variable of interest (Wagemann, Chapter 20, this Handbook).
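The distinction between necessary and sufficient conditions can be made concrete with a small sketch. It assumes crisp (0/1) codings and uses the simple proportion-based consistency scores familiar from configurational analysis: sufficiency is the share of cases with the condition that also show the outcome, and necessity is the share of cases with the outcome that also show the condition. The cases, the variable names `strong_unions` and `generous_welfare`, and the function `consistency` are hypothetical.

```python
def consistency(cases, condition, outcome):
    """Return (sufficiency, necessity) consistency scores for crisp-set data."""
    both = sum(1 for c in cases if c[condition] and c[outcome])
    with_condition = sum(1 for c in cases if c[condition])
    with_outcome = sum(1 for c in cases if c[outcome])
    # Sufficiency: of the cases showing the condition, how many show the outcome?
    sufficiency = both / with_condition if with_condition else float("nan")
    # Necessity: of the cases showing the outcome, how many show the condition?
    necessity = both / with_outcome if with_outcome else float("nan")
    return sufficiency, necessity


# Invented cases, coded 1 (present) or 0 (absent); not real countries.
cases = [
    {"name": "A", "strong_unions": 1, "generous_welfare": 1},
    {"name": "B", "strong_unions": 1, "generous_welfare": 1},
    {"name": "C", "strong_unions": 0, "generous_welfare": 1},
    {"name": "D", "strong_unions": 0, "generous_welfare": 0},
    {"name": "E", "strong_unions": 0, "generous_welfare": 0},
]

suff, nec = consistency(cases, "strong_unions", "generous_welfare")
print(f"sufficiency consistency: {suff:.2f}")
print(f"necessity consistency:   {nec:.2f}")
```

In these invented data every case with the condition shows the outcome, so the condition looks sufficient, but case C shows the outcome without the condition, so it is not necessary. With so few cases such scores are only a starting point for closer case-based inspection.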
Qualitative researchers have often chosen their cases based on issues of research design
or the importance of their cases; quantitative researchers, conversely, have often chosen cases to maximize generalizability, either via national surveys or large cross-country analyses. But trends in quantitative research are tending to blur the difference between qualitative and quantitative designs (with both designs subject to similar standards about inferring causality). All of these are attempts to bring experimental insights into political science. One relatively new development which shows the convergence of various approaches is the ‘analytic narrative’ (Bates et al., 1998). Here researchers attempt to show how some important development in history can be explained by modern analytic theory. While these analytic narratives cannot test analytic theories, they can make such theories more or less plausible. There is much controversy about whether this tool can really be used to either help validate a theory or to explain an important historical event; can a clever user of the tool explain anything, or, as Jon Elster (2015) put it, are they simply ‘just so’ stories? However, this joining of modern analytic (usually formal) theory and careful historical evidence shows how two very different traditions can be joined in a potentially fruitful manner. Political science studies are most frequently observational. Other fields, such as medicine, rely more heavily (though far from exclusively) on experiments. In medicine the gold standard is the clinical trial, where subjects in a selected pool are randomly assigned to either a ‘treatment’ or a ‘control’ group, and where, under appropriate conditions, one can infer whether the treatment causes a better outcome. Experiments are much harder in political science. We cannot assign countries randomly to be either democracies or not, nor can we assign voters to be randomly rich or poor. We can, however, set up a laboratory, and then randomly assign participants in an experiment to a treatment or control condition. 
Thus, for example, we can study whether people prefer ‘fair outcomes’ by
having pairs of subjects bargain, where each pair is randomly assigned to some initial endowment or price system. While this is an exciting new area of political science, issues of generalizing from experiments to the real world (external validity) will always limit the use of laboratory experiments in our discipline. However, where experiments are possible, we can be much more certain about assessing whether some political treatment had a causal (and not spurious) effect on an outcome of interest. Experiments may be particularly useful for testing formal theories of politics, since those theories are themselves highly abstract representations of the political universe. Experiments need not be limited to a laboratory; it is perfectly possible (and now with modern technology even easier) to conduct field experiments. The move from the laboratory to the field increases external validity at the cost of our being less certain about our causal inferences (internal validity). Perhaps the first examples of this came in conjunction with surveys, where different people could randomly receive different question wordings or question orders. It was easy to move beyond this to providing different people with different information randomly (subject of course to ethical guidelines about dealing with human subjects, which do not allow for, at a minimum, misleading subjects). Perhaps the most common field experiments have to do with the effect of various attempts to motivate people to vote, and to what extent voting turnout can be influenced by various communications strategies (Green and Gerber, 2008). Field experiments are now also common in the assessment of various interventions. 
Thus, if we want to know whether certain types of political interventions (say national-level monitoring of local corruption) are effective, and if the area of intervention is chosen randomly (because the state cannot monitor all localities), it is then relatively easy to study the impact of the anti-corruption intervention. Of course, this depends on the willingness of the state to intervene randomly, something
they are not always (or often) willing to do. On a simple level, we can often study such things as educational reforms by comparing students who were randomly selected for the reform with those who applied but were not selected in a lottery. New programs which are oversubscribed often choose participants in this way. Of course if we simply compare those in the program with those not in the program, we do not know if the program, or factors which led people to choose to be in the program, led to the observed outcome, and so no causal inference is possible. But there is more and more demand for careful evaluation of programs (such as aid programs sponsored by various large foundations), and so this type of approach will become more and more common. This is a major step for applied researchers who want to see whether innovations actually work (Bassi, Chapter 22, this Handbook). A somewhat different though related strategy is to keep laboratory control but move the laboratory from the research university to real world settings. In a university laboratory we can study how undergraduates in research universities bargain. In field laboratories (again made possible by technological innovations) we can study how a wider group of people bargain. Researchers can also imbed in these experiments more ‘real world’ features. Thus, in some particularly exciting experiments on the role of ethnicity and trust, some people bargained with people of their own ethnic group, while others simply bargained with a randomly selected person. Thus, we can now make advances in studying group trust with studies that are both at least somewhat externally valid while still allowing for reliable causal inference. But perhaps the strongest convergence of qualitative and quantitative thinking comes in what Donald T. Campbell and Julian Stanley called ‘quasi-experiments’ in their path-breaking 1963 book on research design. 
Unlike experiments, some external force (political or natural) has ‘assigned’ one group to a treatment and another to a control. Since
this is not a true randomized experiment, researchers must show that the treatment was assigned in such a way that the assignment process was independent of the outcome. Thus, for example, the British drew boundaries in Africa as if they were random (that is, following various geographical markers); thus one ethnic group might be divided between two countries, and one can then see whether there are differences in the behavior of the same ethnic group in the two countries. In the study of McCauley and Posner (2015) there were two ethnic groups each split across two countries, with the division into countries being ‘as if’ random; Posner could then see whether political and social rivalries between the two groups differed as a function of the larger political structures in each country. We are seeing more and more such designs. These research designs obviously have high external validity, though they lack the clear ability to show causal effects, since the assignment to groups was not totally random. But researchers are taking advantage of ‘almost random’ natural assignments to study the effects of changing laws (with laws cutting natural labor markets artificially, so the effect of, say, a change in the minimum wage law in half the labor market can be compared with what happened in the same labor market not subject to the change). This approach is very exciting, though of course one must work hard to show that the assignment process was effectively random. From a methodological standpoint, it does not matter whether the data collected from the two groups is quantitative or qualitative, and, in general, both types are collected. But even if the data from the two groups comes from large surveys, we are still comparing only two groups. A related research design, which brings together both quantitative and qualitative researchers, is the difference-in-differences design.
In a simple before and after comparison we do not know if the intervention in between the observations caused the observed change. If we simply compare two units, one with an intervention and one without, we do
not know if the intervention or something else caused the observed difference. The difference-in-differences design asks the researcher to find two similar units where one unit had an intervention (say a change in a law) and the other did not. We need to be able to observe both units both before and after the intervention in one unit. If the before-and-after change is bigger in the unit with the intervention than in the unit without it, then we have evidence that the intervention had a causal impact. As before, this design has good external validity but it does not rule out all other causal explanations. And, as before, we can compare many units, leading to a quantitative design (so long as the units were similar beforehand) or we can do a simpler two-case difference-in-differences design, allowing for more in-depth analysis of the two before and after cases. The study of causality has also been a big issue in statistical analysis. But even without statistical innovations, all empirical researchers have clearly been impacted by new thinking about using good research design to infer causality.
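The arithmetic behind the difference-in-differences design described above can be sketched in a few lines. The numbers are invented, and the function name `diff_in_diff` is hypothetical. The estimate is read causally only under the assumption that, without the intervention, the treated unit would have changed by the same amount as the comparison unit.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated unit minus change in the comparison unit."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Invented outcome scores (for example, an employment index) in two similar
# regions, observed before and after a law changed in the treated region only.
treated_before = [10, 12, 11]
treated_after = [15, 17, 16]
control_before = [9, 11, 10]
control_after = [11, 13, 12]

estimate = diff_in_diff(treated_before, treated_after,
                        control_before, control_after)
print("difference-in-differences estimate:", estimate)
```

Here the treated region rises by 5 points and the comparison region by 2, so the estimate is 3. A simple before-and-after comparison in the treated region would have attributed all 5 points to the intervention.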
Statistics
Multiple regression is clearly the workhorse of the quantitative political scientist. But political scientists have been quick to utilize related methods that better fit the data analyzed. A quarter of a century ago researchers still found methods like logit and probit for dichotomous dependent variables to be either just at or just beyond their grasp. Today these methods are commonplace. Similarly, researchers with ordered dependent variables, or event count dependent variables, or length of time dependent variables, typically know how to find the correct methods (and all commonly used software makes it easy to use these methods in practice). While the gains here are often, but not always, small, they usually come at no price, so there is no question that researchers should match their choice of method to the type of data being analyzed. These issues,
typically important in cross-sectional research, are, by and large, now solved problems (King, 1989; Ward and Ahlquist, 2018).
As computers have become more powerful, so have our methods. Thus, it is now possible to analyze more complex models, such as hierarchical models or spatial models of voting, using Markov chain Monte Carlo (MCMC) methods (Jackman, 2011). As with maximum likelihood before it, this method has moved from being an exotic state-of-the-art procedure to one commonly used by graduate students; as with maximum likelihood, current software developments have made this method much more accessible to applied researchers. The other new development, which also depends on powerful computers, is replacing the standard linear specification with highly complicated non-parametric models using machine learning (Hastie et al., 2009). While the jury is still out on whether these methods will replace workhorse linear methods, there is no question that machine learning is vital for measurement when using complicated text or visual data.
Similar strides have been made in the analysis of time series. While in the past the important issues of time series (which often have enormous consequences for results) were ignored, over the last quarter of a century the discipline has become much more sophisticated. Thus most researchers analyzing time series get the technical details right. Econometricians, at the same time, have made great strides in studying data that is trending (or, more technically, non-stationary). Political scientists have been quick to pick up on this, and we see far fewer spurious regressions. So while the issues here are more complicated, and there are still open issues about trending series, time series analyses in political science, like cross-sectional analyses, are now reasonably well done.
In comparative politics we have data that consists of time series observed over a number of countries: time series cross-sectional data. 
Many articles now analyze such data, and the discipline has gotten good at analyzing such
data. Similarly, we can have cross-sectional surveys with repeated observations on each individual: panel data. The analysis of such data has also become commonplace, with the appropriate methods often used. Finally, there have been great advances in data where individuals are observed over multiple units (say, common surveys in different countries): multilevel data. Again, there have been great strides recently in the analysis of such data, and the correct analysis of multilevel data has also become more common. This is not to say that all statistical issues have been solved. Most current methods assume that observations in one unit are independent of observations in other units. But this assumption is clearly false for political science. What goes on in one country affects its neighbors and trading partners; a dyad going to war must have impacts on a large number of other dyads. Recently political methodologists have been investigating methods for modeling spatially dependent data, and great strides are being made in this area. Another active area of research is on ecological data, that is, data where interest is on individuals but only aggregate data is observed. Political science is rich with aggregate data, particularly voting data collected at the precinct level. But often interest is at the individual level. For example, who voted for the National Socialist Party in Germany in the 1930s? We have lots of data on aggregate votes at the precinct level, and some knowledge of the social characteristics of such precincts. Obviously we would like to do surveys, but these are impossible for events in the past. Since William Robinson’s (1950) classic work on the ‘ecological fallacy’ we have known that it is not simple to make inferences about individuals from data collected at a higher level of aggregation. However, recent advances have shown that we can use such data to gain insights into individual data (and also to show when the data cannot support such insights). 
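Robinson's point about the ecological fallacy can be illustrated with a small constructed example. The precincts below are invented, and the helper `pearson` is written out by hand for illustration. The data are deliberately built so that no member of the group votes for the party, yet the precinct-level shares move together perfectly.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Three invented precincts of 10 voters each. Every voter is a pair
# (belongs_to_group, voted_for_party). By construction, NO group member
# votes for the party, yet both shares rise together across precincts.
precincts = []
for k in (1, 3, 5):
    voters = [(1, 0)] * k + [(0, 1)] * k + [(0, 0)] * (10 - 2 * k)
    precincts.append(voters)

# Aggregate (precinct-level) data: all that the analyst usually observes.
group_share = [sum(g for g, _ in p) / len(p) for p in precincts]
party_share = [sum(v for _, v in p) / len(p) for p in precincts]
aggregate_r = pearson(group_share, party_share)

# Individual-level data: normally unobserved.
individuals = [voter for p in precincts for voter in p]
individual_r = pearson([g for g, _ in individuals], [v for _, v in individuals])

print(round(aggregate_r, 3), round(individual_r, 3))  # 1.0 -0.429
```

The aggregate correlation is perfect while the individual-level correlation is negative, which is why inferences about individuals from precinct totals need the more careful methods mentioned above.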
An important issue that is the subject of much discussion is how to interpret statistical
results. Political science has been dominated by the null hypothesis testing framework, where we calculate the probability of obtaining data at least as extreme as the data observed if the null hypothesis (almost always that two or more variables are unrelated) is correct. If this probability is low enough we ‘reject’ the null hypothesis; otherwise we do not reject it. This approach is highly problematic, since rejecting the null hypothesis does not imply that there is a strong relationship between variables, and failing to reject the null hypothesis does not mean that there is no relationship between variables. There is now active discussion in the social sciences and amongst journal editors about moving beyond the hypothesis testing framework, and about the subsequent issue of what should replace it (see, e.g., McShane and Gal, 2017, and ensuing comments).
In the last few years, there has been much discussion of moving to a Bayesian paradigm. Much of this is driven by the computing power, rather than the interpretive possibilities, made possible by a Bayesian approach. Bayesian interpretation assumes we know something about the world, expressed as a statistical ‘prior distribution’ on some parameters of interest. This prior is combined with the information in the data, the ‘likelihood’, to produce a posterior distribution. Statements about the parameters of interest can be made based on this posterior distribution. There is much controversy on how to use prior information, and much controversy on the issue of different scholars having different priors. But this is a very active area of current research, both in political methodology and beyond, and Bayesian ideas (as well as computational methods) are making strong inroads in political science (Jackman, 2011).
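The way a prior and a likelihood combine into a posterior can be shown with the simplest conjugate case, a Beta prior on a proportion updated with binomial data. This is a minimal sketch: the data and the two priors are invented, and the function names are hypothetical. It also shows why differing priors are contentious, since the same data leave the two scholars with different posteriors.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Posterior Beta parameters after observing binomial data.

    A Beta(a, b) prior combined with a binomial likelihood yields a
    Beta(a + successes, b + failures) posterior (a standard conjugate result).
    """
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Invented data: 7 of 10 sampled voters support a proposal.
successes, trials = 7, 10

# Two scholars with different (hypothetical) priors about the support rate.
skeptic_posterior = beta_binomial_update(2, 2, successes, trials)    # prior mean 0.5
optimist_posterior = beta_binomial_update(8, 2, successes, trials)   # prior mean 0.8

print(skeptic_posterior, round(beta_mean(*skeptic_posterior), 3))    # (9, 5) 0.643
print(optimist_posterior, round(beta_mean(*optimist_posterior), 3))  # (15, 5) 0.75
```

With more data the two posterior means would move closer together, because the likelihood increasingly outweighs either prior.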
Statistics and Causality
Regression, and its maximum likelihood cousins (limited dependent variables, event counts, event history, time series), estimate a
model of a dependent variable conditional on an observed set of independent variables; these independent variables are assumed to be exogenous, that is, they are determined outside the system being modeled and hence can be taken as given. For pure description this is fine. But we generally want to make causal inferences. To take the simplest regression case (and all holds in the more complicated cases mentioned above), we believe the data was generated for unit i by the following process
$$y_i = \beta x_i + \varepsilon_i$$
where x refers to either a single independent variable or a vector of such variables and ε is a standard unknown error term. Clearly β can be interpreted descriptively, that is, as the slope of a line (or plane) that best fits the points. But can it be interpreted causally, that is, do we believe that if x increases by one point for a given unit, then y will increase by β points? (Obviously we will have to use an estimate of β, but this discussion holds even if we know the value of β for sure. This is not an issue of estimation.)
There are a number of reasons that the relationship between x and y could be non-causal. The simplest is that there is some other variable, z, that causes both x and y. For example, there may be a good-sized β in a regression of spending on public goods on democracy, but it may be that it is really how rich a country is that is causing both spending on public goods and democracy; there may be no causal relationship between democracy and spending on public goods, in the sense that simply making a country more democratic, while keeping everything else the same, may lead to no increase in spending on public goods. Traditionally this was dealt with by including z in the regression and seeing if the coefficient on x changed. This is not an unreasonable way to proceed. However, it can be problematic. First, it assumes that the effects of x and z on y are linear and additive.
For just the two variables, this means that we are assuming that
$$y_i = \gamma x_i + \delta z_i + \varepsilon_i$$
where x and z are now scalars. While the details are a bit more complicated, this procedure estimates the effect of x on y by subtracting δz_i from each observation. But if the effect of x on y varies with z, or if the linear additive model is otherwise incorrect, this correction is, alas, not correct. But this is not the only problem. Let us say x is binary (democracy/non-democracy) and let z be national income. Can we make a poor autocracy comparable (in terms of public goods spending) to a rich democracy by simply adding δz (z is income) to its y (spending on public goods)? Given that there are few rich autocracies or poor democracies, this approach depends a lot on extrapolation well outside the data and so depends on a strong belief that the linear additive assumptions are correct.
Recently researchers have proceeded in a different way, at least for the binary x case. For each democracy, they attempt to find one or more autocracies that are very close on various exogenous variables that might influence both x and y (variables lying on what Judea Pearl has called ‘back-door paths’ between x and y, and what Paul Rosenbaum has called ‘confounders’; Pearl, 2009; Pearl and Mackenzie, 2018; Rosenbaum, 2017). If one has eliminated all potential ‘confounders’ by matching on them, then the difference in means between the democracies and autocracies will give us the effect of democracy on y. Of course, this means that the various confounders must be observable and measured in the data set (and there are many technical issues that must be resolved by the researcher). What if we cannot match all the democracies with autocracies? These unmatched cases are simply dropped from the analysis. Thus we do not have to extrapolate well beyond the data, but this limits us to studying causal impacts in comparable cases; thus, for
example, we cannot say what would happen if Denmark were to become an autocracy. This is almost certainly the right degree of modesty.
This matching literature is undergoing rapid development at the current time (for a summary, see Morgan and Winship, 2014). Issues that must be studied include how to handle continuous (or multivalued) xs and how to deal with studies where we cannot focus solely on one independent variable of interest. There are also many technical issues that are continually being dealt with, such as what it means for two cases to match, and how many and which cases should be dropped from an analysis because they do not match. But clearly this approach is often superior to multiple regression (and when multiple regression is correct, it provides roughly the same answer).
Perhaps more importantly, even if one decides to continue to run regressions, the insights of the matching and causality literatures are of great value. There are two difficult issues in multiple regression on which statistics gives few insights: which independent variables to include in the regression and which cases should be studied. The matching approach suggests that only variables which are on back-door paths between the key independent variable and the dependent variable should be included in the regression. Equally important, variables on front-door paths, where x causes z and z in turn causes y, should not be included in the regression. Thus if some variable is a consequence of x and we include it in the regression, we may incorrectly conclude that x has no causal impact on y. In terms of which cases to include in a regression, the matching literature tells us that for any given potential causal variable, some cases give us no leverage because it is impossible to match cases where the causal variable is present to cases where it is absent. This is often not a problem in survey analysis but can be a major problem in the study of comparative and international politics. 
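The matching logic discussed above (compare democracies only with autocracies at the same level of the confounder, and drop cases with no counterpart) can be sketched as follows. The data are invented, the assumed effect of democracy is set to 2 by construction, and `exact_matching_estimate` is a hypothetical helper that performs exact matching on a single discrete confounder. Real applications match on many variables and use more refined methods.

```python
from collections import defaultdict
from statistics import fmean


def exact_matching_estimate(units):
    """Average treated-minus-control difference within confounder strata.

    `units` holds (treated, confounder, outcome) triples. Strata lacking
    either treated or control cases are dropped, as in the text.
    Returns (estimate, number_of_dropped_units).
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for treated, confounder, outcome in units:
        strata[confounder][treated].append(outcome)

    weighted_sum, n_treated, dropped = 0.0, 0, 0
    for groups in strata.values():
        if groups[0] and groups[1]:
            diff = fmean(groups[1]) - fmean(groups[0])
            weighted_sum += diff * len(groups[1])   # weight by treated cases
            n_treated += len(groups[1])
        else:
            dropped += len(groups[0]) + len(groups[1])
    return weighted_sum / n_treated, dropped


# Invented data: (democracy, income_level, public_goods_spending), with
# spending = 10 * income_level + 2 * democracy by construction.
units = (
    [(0, 1, 10.0)] * 3 + [(1, 1, 12.0)] * 1      # poor: mostly autocracies
    + [(0, 2, 20.0)] * 1 + [(1, 2, 22.0)] * 3    # middle income: mostly democracies
    + [(1, 3, 32.0)] * 2                         # rich: democracies only
)

# Naive comparison ignores income, which drives both democracy and spending.
naive_effect = (fmean(y for d, _, y in units if d == 1)
                - fmean(y for d, _, y in units if d == 0))
matched_effect, dropped = exact_matching_estimate(units)

print(round(naive_effect, 2), matched_effect, dropped)  # 11.17 2.0 2
```

The two rich democracies have no autocratic counterpart and are dropped, so the matched estimate says nothing about them, which is the modesty the text recommends.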
We often analyze a group of countries because they belong to a data reporting organization; the
matching approach gives a more principled way of starting to think about which cases should be included in an analysis. And, just as importantly, the cases to include vary with the causal variable being studied. But, as with simple matching, we then must remember that the causal effect that is estimated is a function of which cases are studied.
The matching approach (and multiple regression) assumes that we can observe the various confounders that impede causal inference. But what if they cannot be observed? There are several approaches that are promising, though, as with any method, they must be used with care. One is to model what is known as selection and the other is to use what are known as instrumental variables. These deal with issues of selection bias and endogeneity.
Selection bias is a critical empirical issue. In applied work, if we want to see if some new type of school provides better outcomes, and we compare outcomes of those who attend the new school against a sample from other schools, we may find that the new type of school seems to work either because better students choose to go to the new school or because students who have knowledge of themselves and who have good reason to believe that the new type of school will work for them choose the new school. The former problem is always critical, while the latter is critical if we wish to encourage everyone to utilize the new type of school. This problem was formalized by the Nobel Prize-winning economist James Heckman in the 1980s. He was interested in the returns to education (in terms of wages) of women; we only observe the wages of women who choose to enter the labor market. This may lead to underestimating the effect of education on women’s wages, since women with less education may only enter the labor market if they have some reason to know that they will do well in that market. Alas, such reasons are usually not observable in a data set. 
In political science, we may be interested in the effect of being involved in a scandal on the electoral success of incumbents running
for re-election. But those who see little chance of re-election may choose not to run, and those who were involved in a scandal but chose to run anyway may have private reasons to know that they are likely to do well. Are campaign ads effective? Perhaps people who already like a candidate are more likely to see that candidate’s ads. Similarly, in international relations, if we only study the outcomes of wars, many nations that are militarily weak may simply choose not to fight; thus the weak will fight only if they have some private information that they have a chance of winning, and so we may underestimate the effect of military strength on winning a war. Similarly, does international mediation actually help solve conflicts? Perhaps mediators only take on their task when they think success is likely. In comparative politics, autocrats that believe they can remain in power if they liberalize are perhaps more likely to liberalize. These are just a few examples, but selection bias is pervasive in observational studies. One solution is to match those who select some treatment (war, watching an ad or whatever) with those who chose not to do so. But if the data set does not contain enough information to match on critical variables (nations that are militarily weak but have other, unobserved, private reasons to believe that war is in their interests) this approach does not work. Heckman suggested a two-equation model, one for selection and one for the outcome given selection, with the errors in the two equations correlated, so that nations which should not have gone to war but did for unobservable (error term) reasons will also be more likely to do better, again for reasons that are in the error term. Note also that the various research design issues (experiments, quasi-experiments, etc.) can also be a critical tool for dealing with selection bias. 
But even if there is no statistical solution available, and we are not lucky enough to observe a good quasi-experimental situation, understanding the nature of the problem is critical for causal inference.
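A short simulation can make the selection problem concrete. It illustrates the bias only; it is not Heckman's two-equation correction. All quantities are assumptions chosen for illustration: the effect of strength is set to 1, and, as an extreme simplification, a nation fights exactly when its privately known outcome is favorable.

```python
import random


def ols_slope(xs, ys):
    """Slope from a bivariate least-squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


random.seed(0)
TRUE_EFFECT = 1.0  # assumed effect of military strength on the war outcome

strength, outcome = [], []
for _ in range(20000):
    x = random.gauss(0, 1)          # observed military strength
    private = random.gauss(0, 1)    # private information, unseen by the analyst
    strength.append(x)
    outcome.append(TRUE_EFFECT * x + private)

# Selection: a nation fights only if it privately expects a favorable outcome,
# so weak nations appear in the data only when their private draw is high.
fought = [(x, y) for x, y in zip(strength, outcome) if y > 0]

slope_all = ols_slope(strength, outcome)
slope_selected = ols_slope([x for x, _ in fought], [y for _, y in fought])
print(round(slope_all, 2), round(slope_selected, 2))
```

Under these assumptions the slope estimated from the wars that were actually fought falls well below the assumed effect, which is the underestimation of the role of military strength described above.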
The other issue is endogeneity. Does high income lead to good institutions or do good institutions lead to high income? Do voters who like some candidate assume his position is close to theirs, or does closeness on issues lead voters to choose that candidate? Disentangling whether x causes y or y causes x is critical to political science. And observational data on cross sections cannot help answer this question, since, as is well known, association tells us nothing about causation. For the above two examples, we would observe the same exact data regardless of the causal process. The research design tools discussed previously can sort out these issues; if x changed for some reason external to the system (perhaps because of some natural event) then we could study whether x causes y and is not simply associated with it. In the 1950s, economists associated with the Cowles Foundation at Yale thought they could solve the problem by estimating a series of equations, one for x and one for y. Of course, these had to be estimated jointly, and this technique came to be known as the estimation of simultaneous equations. Interest in this approach waned as it became obvious that we simply lacked a strong enough theory to allow the estimation of such equations. This theory, at a minimum, was necessary to tell us that there was some exogenous z that affected y and not x, and some other exogenous w that affected x and not y. At least in political science it seemed hard to find such exogenous variables with such asymmetrical effects. Interest in part of this approach, instrumental variables, started to reappear in the 1990s and now has become an extremely active area of research. The basic idea is that we are interested in the causal impact of, say, economic growth on having a civil war. However, in a cross-sectional study we would worry that civil wars hurt economic growth. 
The instrumental variable approach is to find some exogenous variable that affects growth but only affects civil wars through its link with economic growth. In an ingenious study,
Edward Miguel, Shanker Satyanath and Ernest Sergenti (2004) decided that, for sub-Saharan Africa, rainfall would be a good instrument. They had to convince themselves that rainfall only affected the outbreak of civil war through economic growth (since it is clear that rainfall is exogenous). The method of instrumental variables, in its simplest form, then regresses both civil wars and economic growth on rainfall (both regressions are fine since rainfall is exogenous). Having the effect of rainfall on both variables, they could divide the effect of rainfall on civil wars by its effect on economic growth and so obtain a good estimate of the impact of economic growth on civil wars without worrying about the reverse direction of causality. Research on instrumental variables, both on the theory of when they are useful and on various applications, is one of the most vibrant current research areas in political science. It is hard, but not impossible, to find good instruments. There is much current research on what properties a good instrument should possess and if and how empirical researchers can test whether a given instrument is a good one.
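The simplest form of the instrumental variable estimator described above, two regressions on the instrument followed by a division, can be sketched with simulated data. Nothing here reproduces the rainfall study: the coefficients `BETA`, `THETA` and `PI` are assumed values, and the data are generated from a hypothetical two-equation system in which conflict also feeds back on growth.

```python
import random


def slope(xs, ys):
    """Slope from a bivariate least-squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


random.seed(1)
BETA = -0.5    # assumed effect of growth on conflict (quantity of interest)
THETA = -1.0   # assumed reverse effect of conflict on growth
PI = 1.0       # assumed effect of the rainfall shock on growth

rain, growth, conflict = [], [], []
for _ in range(20000):
    z = random.gauss(0, 1)     # rainfall shock, exogenous
    eps = random.gauss(0, 1)   # unobserved causes of conflict
    eta = random.gauss(0, 1)   # unobserved causes of growth
    # Solution of the simultaneous system
    #   conflict = BETA * growth + eps
    #   growth   = PI * z + THETA * conflict + eta
    g = (PI * z + THETA * eps + eta) / (1 - THETA * BETA)
    c = BETA * g + eps
    rain.append(z)
    growth.append(g)
    conflict.append(c)

naive = slope(growth, conflict)                     # contaminated by reverse causation
iv = slope(rain, conflict) / slope(rain, growth)    # ratio of the two regressions
print(round(naive, 2), round(iv, 2))
```

In this set-up the naive regression overstates the negative effect of growth, while the ratio estimate recovers a value close to the assumed `BETA`. That holds only because the simulated instrument affects conflict solely through growth, which is exactly the condition the researchers had to argue for.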
Conclusion
Political methodology was not even considered a field for research 30 years ago. The world has changed remarkably. The American Political Science Association (APSA) has subfield groups for both quantitative and qualitative methodology; these groups are amongst the largest such groups in the Association. The last few years have seen a huge increase in specialized short courses in methodology, both quantitative and qualitative, in all parts of the world, including those regularly organized by the European Consortium for Political Research (ECPR) and the International Political Science Association (IPSA). Almost all departments of political science now require methods training of all of their PhD students.
Thirty years ago the focus was on statistical inference. While many still focus on that issue, the advances over the last three decades have made the estimation of many complicated models quite easy. Today there is an incredible revolution in data collection and measurement, and renewed interest in research design, especially as it relates to making causal inferences. There has been huge growth in thinking about experimental and quasi-experimental approaches to studying critical questions. While qualitative and quantitative researchers often go their separate ways, there is surely renewed interest in what these two approaches have in common methodologically (and where they appropriately differ). In short, political methodology has been one of the great success stories of our discipline over the last 30 years.
Note
1 This is a revised and updated version of my chapter in the International Encyclopedia of Political Science (Beck, 2011).
References
Bates, R. H., Avner Greif, Margaret Levi, Jean-Laurent Rosenthal and Barry R. Weingast (1998), Analytic Narratives. Princeton: Princeton University Press.
Beck, Nathaniel (2011), Methodology, in: Bertrand Badie, Dirk Berg-Schlosser and Leonardo Morlino (Eds.), International Encyclopedia of Political Science, pp. 1557–1567. Thousand Oaks: Sage.
Brady, Henry E. and David C. Collier (Eds.) (2010), Rethinking Social Inquiry: Diverse Tools and Shared Standards, 2nd ed. Lanham, Md: Rowman & Littlefield.
Campbell, Donald T. and Julian C. Stanley (1963), Experimental and Quasi-Experimental Designs for Research. Boston: Houghton Mifflin.
Elster, Jon (2015), Explaining Social Behavior, rev. ed. Cambridge: Cambridge University Press.
Green, D. P. and A. S. Gerber (2008), Get Out the Vote: How to Increase Voter Turnout, 2nd ed. Washington, D.C.: Brookings Institution Press.
Hastie, Trevor, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. New York: Springer Verlag.
Jackman, Simon (2011), Statistical Inference, Classical and Bayesian, in: Bertrand Badie, Dirk Berg-Schlosser and Leonardo Morlino (Eds.), International Encyclopedia of Political Science, vol. 8, pp. 2516–2521. Thousand Oaks: Sage.
King, Gary (1989), Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.
King, Gary, Robert O. Keohane and Sidney Verba (1994), Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
McCauley, John F. and Daniel N. Posner (2015), African Borders as Sources of Natural Experiments: Promise and Pitfalls, Political Science Research and Methods, Vol. 3, No. 2, pp. 409–418.
McShane, Blakeley B. and David Gal (2017), Statistical Significance and the Dichotomization of Evidence, Journal of the American Statistical Association, Vol. 112, No. 519, pp. 885–895.
Miguel, Edward, Shanker Satyanath and Ernest Sergenti (2004), Economic Shocks and Civil Conflict: An Instrumental Variables Approach, Journal of Political Economy, Vol. 112, No. 4, pp. 725–753.
Mill, John Stuart (1843), A System of Logic, reprinted 1967. Toronto: University of Toronto Press.
Morgan, Stephen L. and Christopher Winship (2014), Counterfactuals and Causal Inference: Methods and Principles for Social Research, 2nd ed. New York: Cambridge University Press.
Pearl, Judea (2009), Causality: Models, Reasoning, and Inference, 2nd ed. New York: Cambridge University Press.
Pearl, Judea and Dana Mackenzie (2018), The Book of Why: The New Science of Cause and Effect. New York: Basic Books.
Popper, Karl R. (1963), Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
Ragin, Charles C. (1987), The Comparative Method. Berkeley: University of California Press.
Ragin, Charles C. (2008), Redesigning Social Inquiry. Chicago: University of Chicago Press.
Robinson, William S. (1950), Ecological Correlations and the Behavior of Individuals, American Sociological Review, Vol. 15, No. 3, pp. 351–357.
Rosenbaum, Paul R. (2017), Observation and Experiment: An Introduction to Causal Inference. Cambridge, MA: Harvard University Press.
Shapiro, Ian (2005), The Flight from Reality in the Human Sciences. Princeton: Princeton University Press.
Ward, Michael D. and John S. Ahlquist (2018), Maximum Likelihood for Social Science: Strategies for Analysis. New York: Cambridge University Press.
Wilkerson, John and Andreu Casas (2017), Large-Scale Computerized Text Analysis in Political Science: Opportunities and Challenges, Annual Review of Political Science, Vol. 20, pp. 529–544.
26 Mixed Method and Multimethod Research and Design
Manfred Max Bergman
Introduction
According to Denzin and Lincoln:
Qualitative research is a situated activity that locates the observer in the world. Qualitative research consists of a set of interpretive, material practices that make the world visible. These practices transform the world. They turn the world into a series of representations, including field notes, interviews, conversations, photographs, recordings, and memos to the self. At this level, qualitative research involves an interpretive, naturalistic approach to the world. This means that qualitative researchers study things in their natural settings, attempting to make sense of or interpret phenomena in terms of the meanings people bring to them. (2018: 10)
In contrast, Flick suggests:
[The g]uiding principles of [quantitative research] have been used for the following purposes: to clearly isolate causes and effects, to properly operationalize theoretical relations, to measure and to quantify phenomena, to create research designs allowing the generalization of findings, and to formulate general laws. (2009: 13)
If we consider these influential conceptualizations of qualitative (QL) and quantitative (QN) methods, it is difficult to imagine they can be combined in a single research design. For example, Johnson and Onwuegbuzie proclaim:
The goal of mixed methods research is not to replace [qualitative and quantitative] approaches but rather to draw from the strengths and minimize the weaknesses of both in single research studies and across studies. If you visualize a continuum with qualitative research anchored at one pole and quantitative research anchored at the other, mixed methods research covers the large set of points in the middle area. If one prefers to think categorically, mixed methods research sits in a new third chair, with qualitative research sitting on the left side and quantitative research sitting on the right side. (2004: 14–15)
But how would we ‘interpret phenomena in terms of the meanings people bring to them’ and concurrently ‘isolate causes and effects … [and] formulate general laws’ in a single research design? How would such a combination constitute drawing from each other’s
strengths and minimizing weaknesses? Can QL and QN methods be represented as two points, implying that there are no within-group variations? And can they indeed be understood as opposites? Either combining QL and QN components in one research design is impossible due to their incompatibility, thus undermining claims about the potential of mixed method research (MMR) and designs, or QL and QN methods must be radically reconceptualized. In this chapter, I will conceptually disentangle MMR and multimethod research and designs, outline three developments since the early 20th century, and appeal for a careful revision of how we think of and work with mono-, mixed, and multimethod research and designs.
Conceptual Clarification
At the most general level, MMR and multimethod research combine at least two different research methods in one research design. MMR and multimethod research are related, although the academic literature proposes three distinct variants. Some use the terms interchangeably, often objecting to the term ‘mixing’, given that, strictly speaking, mixing is not part of any research design. Proponents of this position argue that, at best, methods are combined, blended, merged, or integrated into a single research design, but they are never mixed. According to this position, MMR is a misnomer and the term multimethod-, blended-, or integrated research should be preferred. While ‘mixing’ in this context is indeed misleading and arguably unaesthetic, I am unconvinced that the proposed alternatives represent clear improvements. A second group of authors argues that multimethod research designs include any combination of QL and QN, while MMR must include at least one QN and at least one QL component. From this perspective, combining two or more QN methods, combining two or more QL
methods, or combining at least one QN and at least one QL method in a single research design are variants of multimethod research, but only the last qualifies as MMR. The problem with this nomenclature is that it does not differentiate adequately between MMR and other multimethod variants. The third variant differentiates between MMR, which includes at least one QN and one QL component, and designs that combine only QN or only QL components. Here, only the latter two are considered multimethod designs. Thus, combining two or more QN components in one research design would be referred to as a quantitative multimethod research design, and combining two or more QL components in one design as a qualitative multimethod design. For this chapter, I will use this third variant, and I will focus mostly on MMR because it is the most challenging of all method combinations. However, most of what is covered in this chapter applies also to multimethod research.
Methods triangulation is another, although outdated, term that is often equated with MMR or multimethod designs. Borrowed from trigonometry, surveying, and navigation, triangulation is used metaphorically in the context of research designs. In navigation, for instance, triangulation refers to the process of determining or verifying a position in a given territory based on measurements relative to remote points. For example, if the distances to each of the two remote points are known, and if the distance between the two points is known, then it is possible to pinpoint an exact location because the three distances form the sides of a triangle. As I will show in a subsequent section on the purposes of and justifications for MMR, using triangulation metaphorically to describe MMR or multimethod designs has its limits and should, therefore, be avoided.
Finally, a few thoughts on the components to be combined in MMR and multimethod designs. One way to differentiate methods relates to data collection and data analysis. 
Examples of data collection methods include
non-participant observations, narrative interviews, focus groups, diary entries, a collection of newspaper articles or blogs based on specific selection criteria, responses to closed-ended survey questions, and experimental data. Examples of data analysis methods include regression analyses, structural equation modeling, multilevel modeling, social network analysis, multidimensional scaling, correspondence analysis, quantitative or qualitative content analysis, discourse analysis, narrative analysis, and dramaturgical analysis. When combining research components in MMR and multimethod designs, it is important to separate collection and analysis techniques. For example, statistically analyzing survey responses is neither MMR nor multimethod research, although we are combining two components: a data collection and a data analysis method. Instead, this is an example of quantitative monomethod research (sometimes also referred to as single method design). A Foucauldian discourse analysis based on open-ended responses from a survey would be considered a qualitative monomethod research design, although it combines non-numerical survey data with discourse analysis. MMR and multimethod designs are identifiable by the combination of two or more data collection methods or two or more data analysis methods. Thus, a quantitative multimethod design could include a regression analysis of data gleaned from a computer-assisted personal interview (CAPI) as well as experimental data from a subset of the survey participants, or it could include a factor analysis of a set of responses in a questionnaire, where factor scores were subjected to a regression analysis. Analogously, a qualitative multimethod design could be based on a thematic analysis of a set of focus groups and one-to-one interviews of the same persons, or it could include a preparatory thematic analysis of focus group transcripts before subjecting the results to a discourse analysis. 
Finally, an MMR design could combine a qualitative analysis of narrative interviews with an analysis of responses to standardized survey questions, or it could be based on a quantitative as well as a qualitative analysis of a collection of newspaper articles. Whether different data sets ought to be collected from the same source, whether different sources can provide data for the same project, or whether a given data set ought to be analyzed in multiple ways will depend on the substantive issues associated with the research project. However, given their added complexity and limitations, MMR and multimethod designs are not inherently better than monomethod designs. Selecting and justifying MMR and multimethod research and designs should be based on substantive considerations and the availability of resources and skills, rather than on principle.
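The nomenclature adopted in this section (the third variant) can be restated as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the names `Component` and `classify_design` are hypothetical, and tagging each component as QN or QL is a substantive judgement that the researcher makes by hand, not something the code decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One research component (hypothetical structure for illustration)."""
    name: str
    stage: str     # "collection" or "analysis"
    approach: str  # "QN" or "QL", assigned by the researcher


def classify_design(components):
    """Label a design following the third nomenclature variant.

    Assumed rule: QN and QL together give MMR; otherwise, more than one
    collection method or more than one analysis method gives a multimethod
    design; one collection plus one analysis is a monomethod design.
    """
    if not components:
        raise ValueError("a design needs at least one component")
    approaches = {c.approach for c in components}
    if not approaches <= {"QN", "QL"}:
        raise ValueError("approach must be 'QN' or 'QL'")
    if approaches == {"QN", "QL"}:
        return "MMR"
    label = "quantitative" if approaches == {"QN"} else "qualitative"
    collections = sum(c.stage == "collection" for c in components)
    analyses = sum(c.stage == "analysis" for c in components)
    if collections > 1 or analyses > 1:
        return f"{label} multimethod"
    return f"{label} monomethod"


# Examples taken from the text above.
survey_regression = [
    Component("closed-ended survey", "collection", "QN"),
    Component("regression analysis", "analysis", "QN"),
]
factor_then_regression = [
    Component("questionnaire", "collection", "QN"),
    Component("factor analysis", "analysis", "QN"),
    Component("regression on factor scores", "analysis", "QN"),
]
discourse_on_open_responses = [
    Component("open-ended survey responses", "collection", "QL"),
    Component("Foucauldian discourse analysis", "analysis", "QL"),
]
interviews_plus_survey = [
    Component("narrative interviews", "collection", "QL"),
    Component("qualitative analysis", "analysis", "QL"),
    Component("standardized survey", "collection", "QN"),
    Component("statistical analysis", "analysis", "QN"),
]

for design in (survey_regression, factor_then_regression,
               discourse_on_open_responses, interviews_plus_survey):
    print(classify_design(design))
```

The rule deliberately says nothing about whether a design is a good one; as argued above, that depends on substantive considerations rather than on the label.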
Justification for MMR The aims and benefits of MMR and multimethod designs appear simple: to take advantage of the strengths of each method and, thus, overcome their respective weaknesses, as stated in much of the literature on MMR (e.g. Johnson and Onwuegbuzie, 2004; Tashakkori and Teddlie, 1998, 2003, 2010). But what, precisely, are their respective strengths and weaknesses? For example, are the biases introduced by interviews expunged by adding a survey or experiment to a research project? Does the reporting of frequency counts and significance tests from themes embedded in newspaper articles render a narrative analysis more objective? Is heteroscedasticity eliminated by using the same variables in two different QN analyses within a single design? Of course not! Yet much of the debate on combining research components in one research design is based on arguments that often imply exactly that. This is due in part to the way in which, particularly from the early 1980s, QL methods were explicitly associated with constructivism and interpretive methods, while QN methods were associated, often by default, with positivism, implying incorrectly that there is only one kind of
constructivism, one kind of positivism, and that they ostensibly occupy opposite ends of some continuum. The excerpts at the beginning of this entry are well-established representatives of this historically evolved division. This state of affairs among QL and QN researchers led some academics to propose an ‘Incompatibility Thesis’, from which the ‘Paradigm Wars’ arose (e.g. Lincoln and Guba, 1985; Tashakkori and Teddlie, 1998). If, as it is argued even today, QL and QN methods are based on different ontological, epistemological, and axiological premises, then a mixed method approach would not be possible (Bergman, 2008; see also Moses, Chapter 27, this Handbook). For example, if QL researchers presume that all research endeavors are value-laden and thus subjective, and if QN researchers assume that all research endeavors must be value-free (as one of the preconditions for objectivity), then how can these incompatible positions be combined in one research design? Based on the presumption of a fundamental divide, even incompatibility, between QL and QN methods and the resulting need to justify their practices, mixed method researchers have devised theoretical, conceptual, and empirical strategies, which, as an unintended consequence, have led to three interrelated weaknesses: uncritical adoption of the QL–QN divide, theoretical fuzziness, and unnecessary formalizations of MMR designs. In this section, I will outline the reputed differences between QL and QN methods, review briefly how MMR has attempted to deal with these differences, assess the plausibility of these differences, and propose steps toward a new generation of theory and application relating to QL and QN methods, and, by extension, whether and how to combine them. 
Ultimately, I will argue that contemporary MMR and multimethod theory, design, and applications lack a sufficient grounding, and that an alternative grounding will not only improve the justification for MMR and multimethod designs, but it will also broaden their scope and applicability. In the process, a
third generation of MMR thinking will, serendipitously, have the potential to expand the scope and practice of QL and QN methods. In other words, I will argue that the growth and exploitation of MMR has been hampered considerably by how QL and QN have been ontologically, epistemologically, axiologically, and habitually constrained, especially by theory and, to a lesser extent, research applications. A careful re-examination of their possibilities and limits will reveal that research design possibilities are far richer than currently imagined, thus introducing new possibilities in relation to data collection and data analysis.
Qualitative vs. Quantitative? Before dealing with when, how, and why to combine methods, we should first reflect on what it is that we are combining. It is difficult to identify the origins of the idea that QL and QN methods represent fundamentally different approaches in research and knowledge production. It could be argued that the divide originated with epistemological divergences between Socrates, Plato, and Aristotle on what constitutes knowledge – as ‘justified true belief’ or ‘universal knowledge’ for example, and whether knowledge is based on observations or true premises. A study of the pre-Socratic philosophers Xenophanes, Heraclitus, or Parmenides reveals that the possibilities and limits of obtaining knowledge were debated even earlier. Knowledge was often presented as invariant across different observers or circumstances of observation, or as identical to perception. The degree to which information from our senses and, by extension, empirical observations, played a role in acquiring knowledge was widely debated long before and after Socrates. Another possible origin of the current division between QL and QN methods could be attributed to the systematic epistemological differences between British empiricists, such as Thomas Hobbes and John Locke, and the rationalist philosophers of
the 16th and 17th centuries, such as René Descartes, Baruch Spinoza, and Gottfried Leibniz. Although these debates are influential in understanding the foundations of modern social science, the current QL–QN divide is also driven by ideological, political, and strategic positions. The current mainstream position is to consider QL and QN methods as belonging to separate paradigms (e.g. Bryman, 2012; Denzin and Lincoln, 2018; Silverman, 2017; Tashakkori and Teddlie, 1998, 2010). Notable in this regard is that the focus on the fundamental differences between QL and QN approaches reached its zenith in the late 1980s and 1990s with the publication of an entire battery of influential texts (e.g. Brewer and Hunter, 1989; Danziger, 1990; Denzin and Lincoln, 1994, 1998; Flick, 1998; Lincoln and Guba, 1985; Maykut and Morehouse, 1994; Reichardt and Rallis, 1994; Silverman, 1993, 1997). While written for purposes other than assessing whether QL and QN methods are sufficiently compatible to be combined in one research design, they nevertheless dictate the MMR canon. Mainly for pragmatic reasons, texts on MMR from the 1990s have adopted this division with the advantage that MMR tapped into an existing narrative on the possibilities and limits of QL and QN research. However, integrating into the MMR architecture the conceptual constraints of QL and QN methods also hampers its adequate theoretical grounding. MMR theory and applications are unnecessarily bounded by historically evolved, incommensurate positions that are often anchored in ideological and political stances. Although many textbooks and handbooks written since 2000 argue that the Paradigm Wars have been overcome, a closer inspection of important texts on research methods in the past two years reveals that things have not changed much since the 1980s. Ironically, the obsession with truth and knowledge in methods debates itself rests on specific strategic, political, and ideological foundations. 
Based on this heritage, numerous ontological, epistemological, axiological, and
habitually ascribed qualities are attributed to QL and QN methods. According to excerpts from the methods literature, researchers from the QL tradition ostensibly embrace a constructed or co-constructed reality, accept the existence of multiple realities, or posit that reality does not exist. They embrace the interdependence between the researcher and the research subject, accept the value-ladenness of the research process and its output, prefer to work with a relatively small number of cases, and they thus accept the impossibility of generalizing QL findings – many claim that generalization is not possible in any case. They furthermore reject the possibility of identifying or distinguishing between causes and effects, and they insist that QL research is fundamentally inductive and exploratory (Carver, Chapter 24, this Handbook). QN methods, in contrast, are cast in opposition to QL methods: researchers engaging in QN research ostensibly believe in a single and empirically accessible reality, the necessity of separating the researcher from the research subject to avoid research bias or to maintain objectivity, aim to conduct value-free research, are able to generalize findings beyond the contextual limits of the researched units and research situation, work with large and representative samples, and, ultimately, identify universal, causal laws by testing falsifiable hypotheses via the hypothetico-deductive model of science (Beck, Chapter 25, this Handbook). Variants of such differentiations between the so-called paradigms can be found in a number of influential texts (e.g. Bryman, 1988, 2012; Creswell, 2003; Creswell and Plano Clark, 2007; Fielding and Fielding, 1986; Mertens, 2004; Tashakkori and Teddlie, 1998, 2010). They tend to reproduce previously published lists, often categorized according to ontological, epistemological, and axiological reflections (e.g. Crotty, 1998; Denzin and Lincoln, 1994; Lincoln and Guba, 1985). 
Considering the qualities attributed to QL and QN approaches more closely, however, theorists and researchers engaging in MMR have to maintain a
strangely incommensurable position toward the division of labor between QL and QN methods: On the one hand, they must accept and emphasize the divergent qualities attributed to each approach, which on ontological, epistemological, and axiological grounds are clearly incompatible. On the other, they put forward the proposal that the strengths of each paradigm can be combined fruitfully within one single research design. Interestingly, research practices have not followed the research methods literature. For example, many QN projects across all social science disciplines are based on small, non-representative data sets (e.g. Pett, 2002), and there are many types of statistical procedures that do not aim at testing hypotheses but, instead, are mainly exploratory. Examples of these are cluster analyses, factor analyses, multidimensional scaling, network analyses, or correspondence analyses. Even experiments, possibly the most rigorously defended QN methods due in part to their association with the natural sciences, tend to work with relatively small samples, and the units of analysis rarely originate from a random sample. By contrast, there are many QL researchers who do not embrace a constructivist paradigm but, instead, embrace materialist and realist perspectives. For example, a medical anthropologist studying interactions between an HIV-positive mother and her HIV-negative baby may not necessarily embrace constructivism, here in terms of the virus and risk of infection, to produce interesting and relevant research results. In contrast to habitual claims made in the methods literature, it is even possible to mix constructivist and materialist perspectives in QL, QN, or MMR. For example, a questionnaire may explore the construction of gender identity by a set of characteristics that respondents from different social classes attribute to themselves or others. 
The researcher may decide to treat class membership as a realist-materialist social category, and the characteristics attributed to gender as a social construction associated
with class membership. In this example, a QN study mixes constructivist and realist-materialist perspectives. It is also possible to mix constructivism and realism-materialism in qualitative research. For example, a QL researcher may be interested in how the concept of ‘terrorism’ is constructed differently across election cycles and events. Here, the researcher designates election cycles and specific events as realist-materialist categories, while exploring the content and context within which the term terrorism is used from a constructivist perspective. In sum, it is up to researchers to decide on whether and to what extent they want to employ a realist-materialist, constructivist, or other perspective. The decision on the ‘truth-value’ of the research categories, data, or results should be made based on the research purpose and theoretical framework; in short, on substantive grounds, rather than on principle. The point here is that, irrespective of whether researchers pursue QL, QN, or MMR, they need to decide whether, and which part of, their research needs to be constructivist, realist-materialist, or both. Drawing together the major distinctions between QL and QN approaches, one has to wonder why the characteristics attributed to them are so diametrically opposed in the methods literature, especially when considering their shared subject space. Should we not be more suspicious of such clear and clean distinctions and their mutual exclusivity, especially if we reflect on the complex, messy, and compromise-laden research process? I wonder whether such distinctions are made in an attempt to maintain an uneasy truce between two highly specialized, politicized factions, rather than to demarcate two kinds of methods and approaches. 
And if it is indeed a negotiated settlement between stakeholders, rather than a fair representation of the actual possibilities and limits of different research approaches, what are QL and QN methods losing as a consequence and how does this settlement affect the possibilities and limits of MMR?
The Response of the Second Generation of MMR Theory to the Presumed Differences between QL and QN Research The response of the first generation of mixed method researchers to the presumed difference between QL and QN can be summarized in one word: none. This is due in part to the fact that researchers had been routinely combining different data collection and analysis methods since the very beginning of the social sciences, well before the term ‘mixed methods’ was coined. Around the end of the 19th and the beginning of the 20th century, there was little orthodoxy about how to conduct social science research, mainly because of the relatively unsophisticated research components, including sampling, data collection, and data analysis. In contrast to today, it was not difficult to acquire adequate knowledge and skills on research methods to conduct empirical social science research (for a historical review, see Brannen, 1992; Tashakkori and Teddlie, 1998). The increasing influence on mainstream social science, from the 1970s, of the French sociophilosophical theory of Lyotard, Baudrillard, Deleuze, Barthes, Foucault, Derrida, and many others led many qualitative researchers and theorists to proclaim a link between constructivism and QL (Brannen, 1992; Tashakkori and Teddlie, 1998). Consequently, QL and QN researchers devised new justifications, while MMR adopted a shallow interpretation of philosophical pragmatism as a way to deal with this emergent incompatibility. A careful reading of pragmatism – founded by Charles Sanders Peirce, William James, and John Dewey, and rooted in Kant’s Critique of Pure Reason (1781) in which the latter attempts to bring together rationalism, Hume’s skepticism, and empiricism – reveals that pragmatism is itself incompatible with constructivism and positivism. Thus, and at least meta-theoretically, mixed method researchers have actually maneuvered themselves into contradictions because
pragmatism fails to resolve the tensions between the adopted QL-constructivist and QN-positivist positions. Pragmatism, philosophical or conventional, does not resolve incompatibilities between constructivism and positivism.
The Third Generation of MMR Theory as an Alternative Response to Presumed Differences between QL and QN Research There are more elegant and consistent ways to deal with the apparent contradictions without glossing over some of the central ideas in research methodology. But even if this hurdle has been successfully overcome, it remains unclear why and how methods should be mixed. Will MMR get us closer to objectivity? Should we mix different theories, types of data, analyses, interpretations, or all of the above? Considering such issues and anticipating the plethora of complexities introduced by them, one wonders whether mixing methods is indeed an improvement over monomethod designs. Instead of embracing MMR over QL or QN designs on principle, MMR should instead be considered based on the specificities of the research purpose as well as the available skills and resources. It is much easier and more convincing to justify and conduct MMR based on substantive grounds, rather than based on general principles. Given that the most successful researchers rarely, if ever, develop ontological, epistemological, or axiological defenses of their methods when presenting substantive findings, it is always surprising to me to find, first, the consistency with which these topics are covered in the methods literature and, second, how superficial this treatment is, especially if we consider that we have about 3,000 years of writings on epistemology and ontology at our disposal. Instead of reproducing classifications or dichotomies that are rarely convincing, that may be inconsistent with the purpose
and goals of a research project, and that are routinely ignored in research applications, researchers ought to make explicit which data collection and analysis methods best connect to the research question, theoretical grounding, and research purpose. In addition, skill, time, and funding should also be taken into account before embarking on MMR. The possibilities and limits of methods are far better conceptualized and integrated into a research design based on the qualities of specific sampling, data collection, and data analysis methods, rather than attributing overgeneralized qualities to QL, QN, and MMR.
Recasting the Justification for MMR Designs What may be more difficult to accept for proponents of the conventional QL–QN divide as outlined above may indeed be liberating for those engaged in MMR. Reconceptualizing MMR as a substantively justified research design also maintains the focus on the research purpose. From this vantage point, the justification and interest for MMR is also much easier to defend. Of the four justifications found among MMR projects (Bergman, 2008, 2010), the two least convincing are also the most frequently employed. The first of the less successful justifications can be called holism, where MMR designs are justified based on providing findings that are considered more complete, due to the combination and thus additive value of the QL and QN components. Whatever data was used and whatever analyses were conducted, however, any additional data set or analysis relevant to a project is likely to provide additional insights or further qualifications such that research findings will always remain incomplete because they will always remain conditional and partial, regardless of the number of research components employed in a single project. The second unconvincing justification for MMR can be called perspectivism, where a
researcher adds a QL or QN component to a QN or QL study to gain an additional perspective. Mixing methods in order to ‘produce a fuller picture of the empirical domain under study’ (Erzberger and Kelle, 2003: 469–70; see also Brewer and Hunter, 1989, 2006) is a necessary but insufficient justification for MMR. While related to holism, it is nevertheless an improvement because this justification acknowledges the conditionality and partiality of any research result. Thus, adding a further perspective may extend or qualify research findings in important ways. However, similar to holism, any additional data set or analysis may provide yet another perspective. Thus, just adding more detail to a research project is not yet sufficient to warrant an additional QL or QN component. The third justification for MMR is also the oldest. Here, MMR is used as a form of validation, a purpose that comes closest to the meaning of triangulation as discussed above. Initially developed in psychometrics for QN by Campbell and Fiske (1959), a matrix of intercorrelations between measures was designed for cross-validation, specifically in relation to convergent and discriminant validity of psychological constructs. Transferring this idea to MMR, convergence refers to the degree of overlap between the results from the QL and QN components, which may imply a cross-validation of results. In this variant, the more the results from different components of the MMR or multimethod design converge, the more they cross-validate each other. An interesting development in this regard is that divergent results may not be indicative of invalidity of at least one of the research results. Instead, divergence may give rise to important qualifications of results or instigate further investigation into a subject matter. 
For example, when the results of a questionnaire on voting intentions diverge from the results of a focus group on voting intentions, it may not necessarily mean that at least one of the results is invalid. Instead, it may give rise to a deeper understanding of the formation,
communication, and group dynamics associated with voting intentions. In my opinion, the most interesting justification for MMR is complementarity. Conducting exploratory, unstructured interviews to identify important underlying dimensions of thought among a target population before developing a questionnaire is an example of a sequential MMR design that is based on complementarity. The QN component is complemented by the initial, exploratory QL component, which may provide important insights into the knowledge base and sensitivity of the researcher or respondents, appropriate wording, and important underlying dimensions. The questionnaire is complemented by the findings from the initial QL component such that the final research results could have only been obtained from the interdependence between the two components. In sum, MMR is best justified not according to vague and untenable dichotomies between QL and QN methods, nor by a problematic appeal toward a greater range of evidence, but by substantive justifications that pertain directly to answering a research question in accordance with a theoretical framework, research purpose, and available skill sets and resources. And even then, the results of MMR will always remain conditional and partial.
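The validation justification discussed above borrows the logic of Campbell and Fiske's matrix of intercorrelations. As a rough illustration of that logic in its original QN setting, the Python sketch below separates correlations between measures of the same trait obtained by different methods (convergent) from correlations between measures of different traits (discriminant). Everything in it is an assumption made for illustration: the function names, the trait and method labels, and the scores, which are invented toy numbers and not research data. It also simplifies Campbell and Fiske's scheme by pooling all different-trait correlations, whatever the method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mtmm_correlations(measures):
    """Split pairwise correlations into convergent and discriminant sets.

    `measures` maps (trait, method) to scores for the same respondents.
    Convergent: same trait, different method. Discriminant: different traits.
    """
    convergent, discriminant = {}, {}
    for (key_a, xs), (key_b, ys) in combinations(measures.items(), 2):
        r = pearson(xs, ys)
        if key_a[0] == key_b[0]:
            convergent[(key_a, key_b)] = r
        else:
            discriminant[(key_a, key_b)] = r
    return convergent, discriminant


# Invented toy scores for six respondents (illustration only).
scores = {
    ("interest", "survey"): [1, 2, 3, 4, 5, 6],
    ("interest", "coded_interview"): [2, 1, 4, 3, 6, 5],
    ("trust", "survey"): [3, 4, 6, 1, 2, 5],
    ("trust", "coded_interview"): [4, 3, 5, 2, 1, 6],
}

convergent, discriminant = mtmm_correlations(scores)
for pair, r in convergent.items():
    print("convergent", pair, round(r, 3))
for pair, r in discriminant.items():
    print("discriminant", pair, round(r, 3))
```

In this toy case the same-trait correlations are high and the different-trait correlations are close to zero, which is the pattern the matrix was designed to detect. As the text notes, divergence in an MMR design need not signal invalidity, so a low convergent value would be a prompt for further investigation rather than a verdict.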
Conclusions The conventional divide between QL and QN research approaches is based on questionable premises. To justify its jurisdiction, MMR theory has absorbed the lines of demarcation at the cost of theoretical inconsistency and vagueness. Worse still are the thereby created unnecessary limitations that are imposed on the capabilities of different data collection and analysis methods, and by extension on MMR theory, designs, and capabilities. A more consistent and viable justification for MMR and multimethod research is to reconsider the assumptions underlying QL
and QN methods. The current assumptions embedded in the methods literature not only hamper a theoretically grounded integration of QL and QN research into one single research design, but also limit applications of different data collection and analysis methods. MMR and multimethod designs will need more elaborate explanations with regard to the purpose and use of each data collection and analysis method, as well as how and for what purpose the specific set of methods is combined. Thus, MMR cannot claim to bridge the unbridgeable gap between positivism and constructivism. However, if coherently and consistently applied, it is indeed possible to frame an MMR project within a constructivist, realist-materialist, or pragmatist perspective. In other words, these and other frameworks are all possible, once we become aware of the different theoretical and analytical levels that separate theory, data collection, and data analysis. Ultimately, MMR does not automatically or in principle provide better answers to research questions, and it is unlikely to replace well-designed monomethod research designs. But under specific substantive considerations, MMR will indeed produce findings that transcend the limits of monomethod research. As practical applications of research methods usually and successfully contradict or transcend strict meta-theoretical doctrine, I am convinced that we are at the beginning of a new generation of not only MMR and multimethod design, but also of monomethod research designs, all of which will turn out to be more capable and powerful once they have repositioned themselves in relation to current interpretations of the philosophies of science and knowledge.
References
Bergman, M. M. (2010). On concepts and paradigms in mixed methods research. Journal of Mixed Methods Research, 4, 3, 171–175.
Bergman, M. M. (2008). The straw men of the qualitative-quantitative divide and their influence on mixed methods research. In: M. M. Bergman (Ed.), Advances in Mixed Methods Research: Theories and Applications. Los Angeles, CA: Sage.
Brannen, J. (1992). Mixing Methods: Qualitative and Quantitative Research. Aldershot: Avebury.
Brewer, J. and Hunter, A. (2006). Foundations of Multimethod Research: Synthesizing Styles (2nd ed.). Thousand Oaks, CA: Sage.
Brewer, J. and Hunter, A. (1989). Multimethod Research: A Synthesis of Styles. Newbury Park, CA: Sage.
Bryman, A. (2012). Social Research Methods (4th ed.). Oxford: Oxford University Press.
Bryman, A. (1988). Quantity and Quality in Social Research. London: Routledge.
Campbell, D. T. and Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 2, 81–105.
Creswell, J. W. (2003). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches (2nd ed.). Thousand Oaks, CA: Sage.
Creswell, J. W. and Plano Clark, V. L. (2007). Designing and Conducting Mixed Methods Research. Thousand Oaks, CA: Sage.
Crotty, M. (1998). The Foundations of Social Research: Meaning and Perspective in the Research Process. London: Sage.
Danziger, K. (1990). Constructing the Subject: Historical Origins of Psychological Research. Cambridge: Cambridge University Press.
Denzin, N. K. and Lincoln, Y. S. (Eds.) (2018). The SAGE Handbook of Qualitative Research (5th ed.). Thousand Oaks, CA: Sage.
Denzin, N. K. and Lincoln, Y. S. (1998). Introduction: Entering the field of qualitative research. In: N. K. Denzin and Y. S. Lincoln (Eds.), The Landscape of Qualitative Research: Theories and Issues. Thousand Oaks, CA: Sage.
Denzin, N. K. and Lincoln, Y. S. (Eds.) (1994). The SAGE Handbook of Qualitative Research (1st ed.). Thousand Oaks, CA: Sage.
Erzberger, C. and Kelle, U. (2003). Making inferences in mixed methods: The rules of integration. In: A. Tashakkori and C. Teddlie (Eds.), Handbook of Mixed Methods in Social and Behavioral Research (1st ed.). Thousand Oaks, CA: Sage.
Fielding, N. G. and Fielding, J. L. (1986). Linking Data: The Articulation of Qualitative and Quantitative Methods in Social Research. Beverly Hills, CA: Sage.
Flick, U. (2009). An Introduction to Qualitative Research (4th ed.). Thousand Oaks, CA: Sage.
Johnson, R. B. and Onwuegbuzie, A. J. (2004). Mixed methods research: A research paradigm whose time has come. Educational Researcher, 33, 7, 14–26.
Lincoln, Y. S. and Guba, E. G. (1985). Naturalistic Inquiry. Beverly Hills, CA: Sage.
Maykut, P. S. and Morehouse, R. (1994). Beginning Qualitative Research: A Philosophic and Practical Guide. London: Falmer Press.
Mertens, D. (2004). Research and Evaluation in Education and Psychology: Integrating Diversity with Quantitative, Qualitative, and Mixed Methods (2nd ed.). Thousand Oaks, CA: Sage.
Pett, M. A. (2002). Nonparametric Statistics in Health Care Research: Statistics for Small Samples and Unusual Distributions. Thousand Oaks, CA: Sage.
Reichardt, C. S. and Rallis, S. F. (Eds.) (1994). The Qualitative–Quantitative Debate: New Perspectives. San Francisco: Jossey-Bass.
Silverman, D. (2017). Doing Qualitative Research. London: Sage.
Silverman, D. (1997). Qualitative Research: Theory, Method and Practice. London: Sage.
Silverman, D. (1993). Interpreting Qualitative Data: Methods for Analyzing Talk, Text, and Interaction. London: Sage.
Tashakkori, A. and Teddlie, C. (Eds.) (2010). Handbook of Mixed Methods in Social and Behavioral Research (2nd ed.). Thousand Oaks, CA: Sage.
Tashakkori, A. and Teddlie, C. (Eds.) (2003). Handbook of Mixed Methods in Social and Behavioral Research (1st ed.). Thousand Oaks, CA: Sage.
Tashakkori, A. and Teddlie, C. (1998). Mixed Methodology: Combining Qualitative and Quantitative Approaches. Thousand Oaks, CA: Sage.
27 Ontologies, Epistemologies and the Methodological Awakening1 Jonathon Moses
Introduction
At the close of the last century, Alexander Wendt provided some sage advice to practitioners of international relations. For Wendt, ontology is not something that most international relations (IR) scholars spend much time thinking about. Nor should they. The primary task of IR social science is to help to understand world politics, not to ruminate about issues more properly the concern of philosophers. Yet even the most empirically minded students of international politics must ‘do’ ontology, because in order to explain how the international system works they have to make metaphysical assumptions about what it is made of and how it is structured. (1999: 370)
Political scientists, more broadly, also ‘do’ ontology (and epistemology too!) – whether they recognize it or not. As with other methodological decisions, it is prudent to reflect on the consequences of these choices and to consider (explicitly) how they might affect the outcomes of our research. After all, key
theoretical and substantive differences turn critically on ontological issues. These differences manifest themselves in a smorgasbord of epistemological choices, many of which are unresolvable (given the different underlying ontological positions). Understanding how these (ontological and epistemological) challenges combine and influence our research project is the subject of methodology (Moses and Knutsen, 2019: 4–5). This chapter provides a basic introduction to these terms (ontology, epistemology, methodology) for the practising political scientist. My objective is to avoid the most woolly vocabulary and arcane sidebars often found in the literature, and focus on those aspects that are most relevant to the practising, empirically minded, political scientist. As we shall see, it is quite difficult to separate questions of ontology and epistemology, so I will briefly introduce each of them, before moving quickly to a discussion of how they are inter-related. While there is no agreement about the nature of the relationship
between epistemology and ontology (which came first: the chicken or the egg?), there is a robust consensus on the need to be aware of one’s epistemological and ontological assumptions (e.g. Hay, 2006: 8; Furlong and Marsh, 2007). This chapter aims to build that awareness.
Two Rising Stars
Ontology is the study of ‘being’, or the study of ‘what there is’. For philosophers, the potential solution set can be huge, and often very abstract: they wonder about what it means to exist, or why something came into existence (e.g. Quine, 1948, but see also Hofweber, 2018). For example, three classic questions of ontology are: ‘is there a god?’, ‘are there numbers?’ and ‘do universals exist?’. Maurizio Ferraris (2016) provides a useful summary: The task of ontology, but also aesthetics, logic, ethics and so on, is to make people think about and understand our world and, if possible, make it better by then interacting with more specialised forms of knowledge.
From these philosophical exchanges a rather specialized vocabulary has developed for established approaches (e.g. foundationalism, objectivism, but also metaphysical realism, internal realism, scientific realism, speculative realism, new realism and others), the names of which often migrate into social science. (Ferraris, I should note, is an exception: he has an accessible, ironic and playful style.) Most political scientists have a more down-to-earth approach to ontology. Rather than trying to establish the essence of some particular object or concept, our concern is focused on how such an object appears to us, and whether its ‘appearance’ might be affected by our subsequent investigation. This is because the ontological certainty of social facts (e.g. sovereignty) is quite different from that of
natural facts (e.g. hydrogen). Hence, when discussing ontological questions in political science we are usually concerned with the particular nature of our object of study, and whether those objects can be (or are) affected by our investigation. The practising political scientist’s concern with ontology can be mustered into two camps. The first camp contains a group of political theorists, who trade in ‘social ontologies’, where the subject of inquiry is the world (of objects, identities, relationships, etc.) investigated by social scientists (see, e.g. Schatzki, 2003; Hay, 2006; Lawson et al., 2007; Jackson, 2008; Michel, 2009; Cruickshank, 2010; Ikäheimo and Laitinen, 2011). This work is closest to that of traditional philosophy, and can be seen as part of a long tradition in political theory, where concepts and ideal-types are defined, explained and carefully considered. Hence, social ontologists ask questions such as: is there such a thing as false consciousness? Can ideas cause political phenomena? Are political preferences stable? What are human rights? What is the nature of sovereignty? Social ontologists examine the nature of complex political concepts and ideal-types, often as a harbinger to more empirical study. The second camp consists of more empirically minded political scientists, who have become aware of the methodological constraints associated with competing ontological viewpoints. It is here that we see the largest growth of interest in recent years. Many contemporary political scientists have been introduced to methods and approaches that grew out of the natural sciences. In doing so, we inadvertently inherited many of the ontological assumptions of natural science. As the discipline has grown, developed and matured, it has begun to examine these assumptions more critically. The result is what we might think of as a methodological awakening. 
For these political scientists, ontology is used to reference the fundamental assumptions we make about the nature of the world
we study: how we imagine our own social world to be (Hall, 2003: 374). For example, is the world that we study made up of atomistic, interchangeable parts, causally related to one another in generalizable patterns, but independent of the observer, or is it more reasonable to assume that the objects of our study are idiosyncratic, suspended in webs of human significance, where causal connections are complex and contingent? This is the sort of pressing ontological question that taunts the practising political scientist, and the answers to questions such as this have significant consequences for how we choose to design our studies. One main task of the methodological awakening has been to get political scientists to recognize when they are engaging with ontological issues, and to make them more aware of the consequences of their (often implicit) ontological choices. Following Blaikie (1993: 6), we should recognize that political scientists are ‘doing’ ontology when they adopt a particular approach (or any number of common assumptions or axioms) to their study of political behaviour. These assumptions include issues such as what exists (or doesn’t), what they will look like, how we can ‘capture’ their essence, how our subjects of study will interact, etc. More often than not, we simply inherit the ontological position of the literature we engage or follow; after all, we tend to approach a particular question from the standpoint of the existing literature. As we seek to go beyond, or add to, the existing literature, however, we need to consider first whether that literature provides an appropriate ontological foundation upon which we can build. The methodological awakening in political science is about drawing attention to these underlying ontological assumptions and considering how they affect or influence our research output. I think about this challenge every time I read (and/or lecture on) the contemporary literature on democratic theory. 
According to John Dryzek, ‘[t]he essence of democracy
itself is now widely taken to be deliberation, as opposed to voting, interest aggregation, constitutional rights, or even self-government’ (2000: 1). Although I am rather sceptical of the claim, it is not renegade among political theorists who work in the area. Yet, most political scientists that I know working on empirical studies of democracy rely on an indicator of democracy (e.g. POLITY IV or Freedom House) that largely ignores issues of deliberation2. If democracy is about deliberation, and our most common indicator for democracy does not measure deliberation, what are we actually measuring? At this point, ‘what is democracy?’ becomes an ontological question3. Consider now the way we study political preferences. Much contemporary work on political behaviour finds it convenient to assume that preferences are fixed and independent of our investigation. We try to predict and understand political preferences with reference to past (revealed) behaviour, and to do this we assume that preferences will be stable as we move into the future. Despite this common assumption, there is a multi-billion-dollar campaign industry and apparently extensive (albeit clandestine) international efforts afoot to change political preferences in a particular direction. Is it meaningful to assume that political preferences are fixed? If we assume preferences are fixed, will they stay that way? What does it mean when analysts assume that the nature of a given subject matter is different from that of the practitioners? A couple of pressing concerns arise from these two simple observations about democracy and preferences. First, when we begin to question the ontological status of the things that we study, we raise awkward questions about their independence (from the observer). At the most mundane level, this suggests a need to be concerned about investigator influence – that in studying the world we may be inadvertently changing it. 
For example, when we use survey methods to enquire about political preferences, we
need to be concerned that our questions are not somehow influencing the responses. In more spectacular cases, our research could even result in self-fulfilling prophecies or Pygmalion effects4. A second concern is a recognition that the nature of our inquiry should influence the way that we conduct that inquiry. In the same way that you would not measure the volume of a fuel tank with a ruler (a litre gauge would be more appropriate), political scientists need to be sensitive to the nature of our subject matter (and our objective, as analysts). For example, if the causal relationships that interest us are contingent on time and context, we cannot capture these with tools that assume causal relationships to have constant and independent effects across time and space. It is here that we step over into the realm of epistemology.
Epistemology
Epistemology is a more familiar concept than ontology and it has made greater inroads into our discipline. Political scientists are more comfortable employing the term. Indeed, among our tribe there are even those who study ‘epistemic communities’ (Haas, 1992). As with my introduction to ontology, I intend to bypass much of the philosophical discussion and focus on how practising political scientists employ the term, and what it means for our sundry research agendas. Epistemology, quite simply, is the study of knowledge, or how we know. It seems to surprise many students of political science that there is no settled path to truth. The power of history, a regression table or a formal equation may appear to offer unequivocal evidence of knowledge that we have about the world. When investigated more closely, however, we frequently find that the social relationships we study are not as settled as the conventions of mathematics, periodic tables or the laws of gravity. The world of social facts is often evasive.
Consider a simple example: how do we know if there are ‘universal human rights’? In considering this dicey question, we can clearly see how ontological and epistemological concerns are closely inter-related. The definition of human rights has varied from time to time, and from place to place, as has their practice. Like justice, it would seem that human rights are relative, not universal: their definition varies with the nature of political authority. But those who advocate for human rights do so on the basis of their universality. How, then, is it possible to establish the existence of ‘human rights’? Following Aristotle (1946 [c. 350 bce]) (and mainstream political science today), we might hope to pursue an inductive approach, in a firm belief that ‘observation shows us’. But observation can only reveal particular/actual examples of human rights, as practised (or not). It is not possible to observe a political concept or ideal-type such as the existence of ‘universal human rights’. If observation is insufficient to prove the existence of universal human rights, what other ways of knowing are there? The long tradition of Western political thought offers a clue. When Socrates argues for a just state in Plato’s Republic (1941 [c. 360 bce]), he engages with Thrasymachus along very similar lines: Thrasymachus argues that justice is relative and will vary from place to place, and over time. Socrates holds that justice is transcendental and unchanging (i.e. universal), and its essence can be discovered by means of reason (dialectics). By Plato’s account, a reason-based approach is superior for studying essential (universal) concepts; but Aristotle (and Thrasymachus) prefer a more inductive approach to studying the world. Over time, and across cultures, we find a wide and varying array of sources of knowledge. 
Depending on where and when you grew up, you might refer to a god (or gods), a myth(s), an elder or authority, to human reason, or to good old sensory perception. Philosophers interrogate these sundry epistemological sources to discern the
necessary and sufficient conditions of knowledge; the source of knowledge, its structures and limits, etc. Practising political scientists – like scientists in general – tend to prioritize rational and empirical sources of knowledge more than any of the others – at least in principle. It is because of this convention that political science departments have long encouraged the development of both deductive (formal/normative theory) and inductive approaches to the study of politics. But it is only recently that the discipline has begun to consider how these methodological choices might be framed by our underlying ontological assumptions about the world that we study. As with ontology, it is possible to distinguish between the philosophical literature on epistemology and the way the term is often employed by practising political scientists. Readers who are looking for an accessible introduction to the philosophy of science approach, should refer to Steup (2018). Political scientists embrace a pragmatic approach to epistemology. Here too, it is possible to discern two camps. In one camp, we find a group interested in the way that political actors can be motivated into action (and the role that knowledge plays in that motivation). We might refer to this group as political epistemologists. This group is perhaps most evident in the long tradition of work that considers the role of ideas in influencing outcomes, as well as in the recurring battles between materialists and idealists about the nature and influence of ideas in the political world (e.g. Weber, 1976 [1905]; but also Goldstein and Keohane, 1993; Hall and Taylor, 1996; Blyth, 2003). In short, political epistemologists study political ideas and knowledge – both as dependent and independent variables, and as products of political behaviour and institutions. In the other camp are a group of political scientists interested in epistemological issues by way of the methodological awakening introduced above. 
The concern in this
camp is perhaps broader, in that these scholars are questioning how we can secure reliable information about the political world we study, or the validity of information gathered by competing epistemological approaches. In this group, there is more explicit reflection on the tricky nature of the relationship between ontology and epistemology, as analysts question the utility of different epistemological tools, under varying ontological assumptions. The nature of the relationship between ontology and epistemology can be illustrated by reference to a couple of simple examples. Consider first, a glass of water and a freezing element. We know that by sufficiently cooling the water, we can transform it from a liquid to a solid state (i.e. ice). The hydrogen and oxygen molecules in the water are unaware of the natural laws under which they labour: they simply respond. Hence, it is easy to predict that water cooled to at or below 0°C will turn into ice. This is the world of natural science, out of which many of our approaches to social science have evolved. Testing for causation is simply a matter of manipulating or controlling for variation across the component parts (or ‘variables’) in a way that can account for reliable (consistent) patterns. But what if the hydrogen molecules in that glass of water were afraid of freezing? What if they used their knowledge about the laws of physics, and organized collectively in order to thwart those laws? If this could happen, it would make it difficult to speak with confidence about laws of nature, and we would need to begin thinking about relationships in terms of probabilities and strategic interactions, such as: what is the likelihood that the hydrogen molecules will revolt, under these conditions, and at this temperature? 
In a world where molecules can be radicalized, the real freezing point of water might actually be something other than 0°C, but we cannot witness this freezing point because under such conditions the hydrogen molecules have resisted effectively. What, then, is the real freezing point of water?
In the social sciences, we frequently experience the effects of will and reason, and the results should make us question the nature of the knowledge that we produce. For example, one of the most famous claims in modern social science is Marx and Engels’ 19th-century prediction of an imminent revolution, due to a conflict of material interests between increasingly exploited workers and wealthy capitalists (Marx and Engels, 1948 [1848]). As the conditions of workers deteriorated under capitalism, Marx predicted a period of short-term crises, culminating in a workers’ revolution that would establish socialism, and eventually a communist society. Was Marx right? There are three possible answers to this query. First, it is possible to see the absence of a proletarian revolution as evidence that Marx’s predictions were wrong: capitalism did not collapse under the weight of class conflict – case closed. Second, it is possible that capitalists employed Marx’s theory to change the underlying social conditions, deflecting the revolution. Like the molecules afraid of freezing, capitalists (and their lackeys in government, read Marx) were made aware of the real possibility that worsening worker conditions would prompt a revolution. In response, they improved worker conditions (and the surrounding welfare state) to stave off the promised revolution. In this answer, knowledge about the world was used to change the world in a way that undermined the truth (originally) generated by that knowledge. Here the lack of a revolution can be used to argue for the truth of Marx’s claim. Finally, it is possible that Marx’s theory is still in play, but only the timeline for testing has been extended: the effort to placate workers has proven to be relatively short-lived, and we are again witnessing increased inequality, falling wages and an increased concentration of capital. In this view, it is just a matter of time before Marx’s predictions are borne out. 
Here too, the lack of a revolution does not necessarily falsify Marx’s theory. The point of this example is not to convince readers about the likelihood of a revolution,
but to focus attention on how our knowledge might affect the world that we study. It also illustrates the intricate ways in which ontological and epistemological issues intertwine. In these two examples (radicalized molecules and workers), we avoided normative issues, and concentrated on theories that could generate empirical predictions. But there is an additional dynamic to much political analysis, one less familiar to natural scientists: the need to consider subjective and/or moral perspectives. This is necessary given the nature of human knowledge, agency and consciousness. Does political reality reveal itself in the same way to women and men? Workers and employees? Lefties and righties? Do political scientists have a moral responsibility to report (or not) findings that could negatively affect social conditions? If we know that a scientific finding reflects a specific (say vested) interest – and not a universal truth – should we reject it outright (even if it is evidently true)? If we know that a scientific finding will change conditions in unacceptable ways – say we can demonstrate that blue-eyed people are intellectually incapable of voting, or that ‘human rights’ is the ideological invention of a Western elite – how do we balance moral/subjective value against the empirical veracity of the claim? Should sapere aude (dare to know) be a motto we embrace when knowledge comes steeped in politics and/or delivers contentious lessons? These too are epistemological concerns. As Foucault reminded us: ‘We must not imagine that the world turns toward us a legible face which we would have only to decipher; the world is not the accomplice of our knowledge’ (1984: 27). When the world shows us many different faces, there can be no single way to interpret or know that world: different observers will see different things, depending upon where they stand. 
With this insight, it makes little sense to assume that knowledge can be objective or universal – epistemology will reflect particular social conditions and their relationships. Our particular experience (as women, or workers,
or ageing white academics) will shape what we know, but also how we know, or how we think about knowledge. It is from this realization that ‘standpoint epistemology’ grows, where ‘a standpoint is the perspective from which one views the world, social relations, and hence reality’ (Hirschmann, 1989: 1229). Consider the motivations for an individual voting. The legitimacy of representative democracy rests on the willingness of citizens to vote for candidates that can represent them. Democratic communities rely on a robust civic culture to check abuses of power, legitimize governmental power and authority, assess contending claims and choose their leaders. I do not think it is controversial to suggest that we should encourage citizens to vote in elections. At the same time, the rational choice tradition of political science has long recognized that it is not rational for individuals to invest the time and energy to vote, as the expected reward is far less than the effort exerted (e.g. Downs, 1957). When political scientists teach this view of the world (as I do every year), we encourage our students to see voting as irrational. In doing so, we implicitly discourage our students from voting. Consequently, the theory can become a self-fulfilling prophecy. Like the economics graduates that have been shown to be more selfish than their peers, we may be producing political science graduates that are less politically active than their peers (e.g. Frank et al., 1993). Is ‘voting is a waste of time’ the lesson we want to convey to our students? By arguing that the world appears to work in a particular way, and in convincing most practitioners that this is the way of the world, we can actually change the world to reflect that initial appearance (Popper, 2002 [1957]). How, then, can we weigh the benefits of the knowledge produced against the potential costs that such knowledge can inflict on our society? 
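The rational choice argument about voting can be written as a simple calculus. What follows is a common textbook rendering, offered as a sketch and not as a formula taken from this chapter or from Downs directly; the symbols R, p, B and C, and the numbers in the example, are assumptions chosen purely for illustration.

```latex
% R: expected net reward from voting
% p: probability that one's own vote decides the outcome
% B: benefit to the voter if the preferred candidate wins
% C: cost (time and effort) of casting a vote
R = pB - C, \qquad \text{vote only if } R > 0 .

% Illustrative (invented) values: p = 10^{-7}, B = 10\,000, C = 1
R = 10^{-7} \times 10\,000 - 1 = 0.001 - 1 = -0.999 < 0 .
```

Because p is tiny in any large electorate, the product pB will usually fall well short of C, which is why the theory casts voting as irrational for the individual.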
This series of questions and examples is meant to prompt the curious reader to think critically about the nature of the world s/he studies before deciding on the most
appropriate approach for that study. We have seen how ontological and epistemological issues are tightly intertwined, and difficult to separate. For that reason, the remaining discussion will address both concepts concomitantly.
A Methodological Awakening
I hope that you now have a basic understanding of ontology and epistemology, and that your interest in them has been piqued. If this is the case, you are not alone: in recent years, there has been an explosion of interest in these concepts – even though ontological and (especially) epistemological concerns have a very long history in political science. This methodological awakening can be seen in the three figures below, which are derived from a Web of Science topic search of their core collection (from 1945) for two terms (‘ontolog*’ and ‘epistemolog*’), conducted on 25 July, 2018. This search was limited to articles in the fields of ‘political science’ and ‘international relations’, and resulted in 674 hits for ‘ontolog*’ and 689 hits for ‘epistemolog*’. The aggregate pattern of use for each term, as it developed over time, is seen in Figure 27.1. Here we can see a conspicuous rise in the number of ‘recorded counts’ for each term after the turn of the millennium. Curiously, the number of references to ‘epistemolog*’ has dropped off in recent years. A very similar pattern is also evident in a parallel English book search that I ran on the same day on Google Books’ Ngram Viewer for ‘ontology+political science’ and ‘epistemology+political science’ (Ngram, 2018). Here too, a rapid drop in references to ‘epistemology+political science’ is evident, but starting much earlier (in the mid-to-late 1990s). As the Web of Science search draws from a larger universe, I analysed the aggregate results in hopes of uncovering possible
Figure 27.1 Aggregate Web of Science results (recorded counts per year, 1966–2017; series: Ontolog*, Epistemolog*)
regional patterns. Using the Web of Science indicator for country/region, which uses the nation-base for the host institutions of all associated authors, I created six world regions and compared developments across them, as seen in Figures 27.2 and 27.3. Not all countries are represented, only those which have hosted an author (institution) that is caught in my search net. These countries were then grouped by combining basic geography with a pinch of pragmatism5. From the figures, we can conclude that there has been a significant rise in interest in, and use of, these terms, and it appears to be driven mostly by European and North American authors. A smaller, and subsequent increase is evident in the regions from the rest of the world, suggesting a lagged effect – but I do not have any evidence to support this hypothesis. What I can say is that our interest in ontological issues continues to grow, while an interest in epistemological issues seems to be declining in recent years, except among European authors (see Figure 27.3). While the growing use of these terms is evident, an underlying explanation is not. Hay
(2006) suggests the reason may be that political science, as a discipline, has become less confident and certain of its capacity to access the truth. I wonder if the reason is not just the opposite: that political scientists are becoming more confident, allowing them to step out from under the shadows of natural science. In other words, political scientists have begun to explore alternative ways of thinking and knowing about our subject matter. Another, related, explanation for this rise in interest is the attention it is getting from the writing of an influential thought-leader in the United States and a series of new textbooks in Europe. With regard to the former, I am thinking of Peter Hall’s 2003 contribution, which was foreshadowed in a 1996 piece he wrote with Rosemary Taylor (Hall and Taylor, 1996), and finds its deeper roots in Charles Ragin’s (1987) ground-breaking book on The Comparative Method. With this work, a generation of young comparativists became more aware of the ontological foundations of their work. The focus of Peter Hall’s (2003) contribution was on recent developments in the field
Figure 27.2 Regional trends for ‘ontolog*’ search, Web of Science (recorded counts per year, 1974–2017; series: N. America, Europe, Mid East & Africa, Asia, S. America, Oceania)
Figure 27.3 Regional trends for ‘epistemolog*’ search, Web of Science (recorded counts per year, 1966–2017; series: N. America, Europe, Mid East & Africa, Asia, S. America, Oceania)
of comparative politics. In particular, Hall noted that ‘a substantial gap [had] opened up between the methodologies popular in comparative politics and the ontologies the field embraces’ (2003: 374). Comparative politics
had traditionally embraced an ontology, imported from the natural sciences, which assumed that the world we studied was ‘governed by causal relationships that take the form of lawlike regularities operative across
space and time’ (2003: 377). This ontology fuelled the development of variables-oriented research, searched for correlations by way of ‘disciplined configurative inquiry’ (à la Verba, 1967) and depended on effective typologies. Hall argued that the discipline was now negotiating an ‘ontological shift’ (2003: 379), where new theories of strategic interaction and of path dependence were challenging that (old) naturalist ontology. These new approaches draw from a different ontology, less reliant on timeless causal regularities and much more sensitive to constitutive influences. Consequently, Hall warned that our ontologies have outrun both our methodologies and standard views of explanations. Comparative politics has moved away from ontologies that assume causal variables with strong, consistent and independent effects across space and time toward ones that acknowledge more extensive endogeneity and the ubiquity of complex interaction effects. (ibid.: 387)
Coming from a scholar of Peter Hall’s calibre, this single chapter had a significant influence on the way that the next generation of American political scientists began to think about their work. It was now both legitimate and expected that scholars pay more attention to ontological issues, and to prioritize ontological considerations when deciding on an appropriate method. In Europe, the effort at focusing on these concerns has been more sustained, in that it has found an institutional foothold in what might be called the Birmingham school, and its influence on Macmillan/Palgrave’s ‘Political Analysis’ book series. In particular, two popular titles in this series have highlighted ontological and epistemological issues. The first of these is Theory and Methods in Political Science (Marsh and Stoker, 1995, 2002, 2010; and Lowndes et al., 2018), about which Bates and Jenkins trumpeted: ‘Marsh and Stoker are to be commended for virtually “double-handedly” introducing the importance of ontological and epistemological reflection
for political science students’ (2007: 58). The second leading title in this effort is Colin Hay’s (2002) Political Analysis, another product of the Birmingham curriculum. Hay takes ontological diversity as his point of departure to explore the boundaries of the political, questions of structure, agency, power and the dynamics of political change. Another Macmillan/Palgrave textbook, although not in this series, is Ways of Knowing, which I have co-authored along with Torbjørn Knutsen (2007, 2012, 2019). This text takes explicit inspiration from Peter Hall’s (2003) argument to narrow the gap that separates the implied ontologies and the methods employed by so many of today’s social scientists (Moses and Knutsen, 2019: 5). It does so by introducing two main ontological positions, naturalism and constructivism, and then showing how a handful of common methods (experiments, statistics, small-N comparisons, case studies) are used in different ways and toward different ends, when employed in different ontological contexts. With the success of these very different textbooks, European students are being introduced to political science by approaches that are explicitly sensitive to diverse ontological and epistemological concerns. When this impetus is combined with the sort of response we have seen to Peter Hall’s piece, there is reason to hope that the methodological awakening witnessed in the figures above will continue into the future.
Use Value
To demonstrate the utility of thinking about the underlying ontological and epistemological claims in contemporary political science, I propose to map some of the most familiar and influential arguments in contemporary political science onto a very simple two-dimensional space. Along one axis, I take the simplistic ontological dichotomy introduced
in Moses and Knutsen (2019: chapters 1, 2 and 8), which distinguishes between naturalist and constructivist ontologies, and stretch it into a continuum. I realize that this is a huge oversimplification, as it shrinks a cacophony of ontological positions onto a simple two-dimensional space. But let us suspend our critical disbelief for just a moment and assume that all work in political science can be placed somewhere along this simple ontological continuum. On the right-hand side of this continuum, we can locate authors who assume that there is a single Real World out there, independent of our experience of it, governed by laws or patterns that can be predicted, and that we can gain access to that World by thinking, observing and recording our experiences carefully. I think that a variant of this ontological view dominates contemporary social science, and political scientists rely on it to reveal patterns that exist in nature but are often obscured by the complexities of life. Knutsen and I call this ontological viewpoint naturalism, as it seeks to discover and explain patterns that are assumed to exist in nature. Others often refer to this ontological position as positivistic or foundational. At the opposite end of this continuum lie those who doubt the naturalist’s view of the world, as many of the patterns that interest them are seen to be ephemeral and contingent on human agency. For these political scientists, the patterns of interest are not firmly rooted in nature but are a product of our own making. There is not one Real World, but many: each of us sees different things, and what we see is determined by a complicated mix of social and contextual influences and/or presuppositions. It is for this reason that I refer to the left-hand anchor point of the imagined continuum as constructivist: it recognizes the important role of the observer and society in constructing or influencing the patterns we study as political scientists.
Others might refer to this ontological position as interpretivist or antifoundational.
Both of these endpoints constitute ideal types: individual authors who are willing to fully embrace a naturalist or constructivist ontology tend to be relatively few in number. Most of us find ourselves somewhere in the middle of this imagined continuum, adopting a somewhat more naturalist or constructivist worldview. Some of us commit to an ontological perspective as if it were like our own skin; others are more willing to change ontological viewpoints along with the underlying conditions, as if ontologies were a sweater (Marsh and Furlong, 2002; Marsh et al., 2018). I now propose that we make the same radical abbreviation to the epistemological space. Instead of the variety of epistemological approaches that I referred to at the outset of the chapter, let us focus on the two main ways of acquiring knowledge in contemporary political science: reason-based and empirically-based, or deductive and inductive, approaches. Here too, I ask for the reader’s indulgence, and recognize this to be a giant simplification. We can then imagine a vertical continuum that stretches from approaches that rely entirely on rational, formal and mathematical approaches (at the top), to those that are committed to observational and experiential approaches (at the bottom). In the middle, where most practising political scientists will be found, we find a balance or combination of deductive and inductive approaches. We have now imagined a 2×2 space, covering both ontological and epistemological positions (albeit admittedly simplified ones), upon which we can place individual pieces of scholarship (see Figure 27.4). Each of the four resulting quadrants describes an important theoretical tradition within contemporary political science. Each theoretical tradition provides something different and useful to the study of the political world. For example, we can position the rational choice tradition in the upper right quadrant, where scholars embrace a naturalist ontology and deductive approaches.
[Figure 27.4 A methodological mapping. Horizontal axis: Constructivism (left) to Naturalism (right); vertical axis: Deduction (top) to Induction (bottom). Quadrants: Political Theory (upper left: Rawls, Hirschmann); Rational Choice (upper right: Downs, Przeworski and Wallerstein); Scepticism (lower left: Cox, Preston); Behaviouralism (lower right: Inglehart, Rokkan).]
The behaviouralist tradition is located in the lower right quadrant, as behaviouralists tend to share a naturalist ontology, but seek evidence of this worldview by using more empirical means. On the left-hand side of this Cartesian space, we can distinguish between political theory (upper left) and sceptical (lower left) approaches, reflecting their embrace of either rational argument (Political Theory) or empirical evidence (Scepticism). In both cases, we find work that is more cognizant of the way that our knowledge can affect the world that we study. After all, the point of normative political theory is to convince us of the attractiveness (or not) of alternative (still possible) worlds, compared to the empirical real world, in which we live. Here we find
scholars who recognize that there are other worlds out there, better than our own (and waiting to be made). I will now position eight well-known examples in the four quadrants, to give an idea of how this sort of typology might be employed. My criteria for selection were simple: each of these pieces is representative of a particular theoretical tradition (quadrant); each is influential or well known; and they are personal favourites, for one reason or another. Obviously, one can argue over the exact positioning of any particular piece in this 2×2 space, but this general framework provides a parsimonious way of thinking about, and organizing, contemporary work in political science.
In the upper right quadrant, home to the Rational Choice tradition, we find scholars whose work aims to reveal stable and predictable patterns in the political world with reference to deductive (mostly mathematical) approaches. Anthony Downs’s (1957) economic theory of political action is exemplary of that tradition, in that he builds a reason-based argument, atop a list of axioms, in an effort to demonstrate that ‘the political structure of a democracy can be viewed in terms of a set of simultaneous equations’ (1957: 137). It is worth noting that there is not one single piece of empirical evidence in this remarkably influential piece. Przeworski and Wallerstein’s (1988) piece is similarly committed to a naturalist ontology, but it comes at it from a radically different political/ideological perspective. Przeworski and Wallerstein aim to show that the capitalist state is not structurally dependent upon capital, so that it is possible to change the distribution of consumption between wage earners and capitalists – even radically – without undermining continued private investment. While Przeworski and Wallerstein rely on a sophisticated formal model to make this argument, they also demonstrate a stronger willingness to engage with empirical studies and examples, and even entertain the possibility that the patterns they uncover may be contingent on social factors. For that reason, I have placed their piece closer to the centre of the diagram (relative to Downs). In the lower right quadrant, which captures the Behaviouralist tradition in political science, I have chosen one of my favourite descriptive pieces: Stein Rokkan’s (1967) chapter on the main cleavages in Norwegian politics. We can then compare this with Ronald Inglehart’s (1971) influential APSR article on post-materialist values.
As with the Rational Choice selections, these two pieces come from different eras in the history of political science, and a more nuanced methodological view is clearly evident in the more recent pieces. Rokkan’s chapter offers
a detailed description of Norway’s political development, a description he uses to build a case for his five stable and persistent dimensions (or cleavages) of political conflict (1967: 389). Rokkan offers the classic example of induction: his pages are filled with table after table, and figure after figure, all of which are aimed to reveal stable patterns, consistent with a naturalist ontology. Inglehart demonstrates a similar faith in the ability to reveal strong patterns by inductive means, but for Inglehart the data is survey-based, not official historical statistics, as it was for Rokkan. Inglehart (1971: 997) is explicitly ‘tapping’ into inherent and persistent patterns in political opinion, but he introduces a stronger dose of theory, and even seems to recognize the possibility of contingent influences that would play havoc with the relationships he hopes to predict (e.g. ibid.: 1008). It is for these reasons that I have placed Inglehart’s argument closer to the deductive and constructivist fields (relative to Rokkan). Political Theory is positioned in the upper left quadrant of Figure 27.4, in recognition of its embrace of rational approaches and its explicit willingness to propose and conceive of alternative worlds. John Rawls is perhaps the most famous political theorist of our time, and his 1985 article in the Philosophy and Public Affairs journal offers an accessible and succinct introduction to his thinking. Rawls also draws heavily from a rational choice tradition, and relies on an axiomatic (deductive) approach. This makes it relatively easy to place him in our imagined 2×2 space. Rawls aims to provide a practical, not metaphysical, argument for justice as fairness, while avoiding ‘claims to universal truths, or claims about the essential nature and identity of persons’ (1985: 223).
In this light, Rawls distances himself from a naturalist view of the world, and presents justice as fairness ‘not as a conception of justice that is true, but one that can serve as a basis of informed and willing political agreement between citizens viewed as free
and equal persons’ (ibid.: 230). For these reasons, this article belongs on the constructivist side of the ontological divide, but Rawls remains strongly committed to a deductive approach, and still signals strong links to the naturalist tradition. A very different sort of argument, but one that still belongs in the Political Theory quadrant, is Nancy Hirschmann’s (1989) ‘Freedom, Recognition and Obligation: A Feminist Approach to Political Theory’. Hirschmann unabashedly embraces a feminist standpoint approach (1989: 1229) in an insightful demonstration of how ‘choices exist in contexts. Indeed, choices are so deeply embedded in contexts of relationship, emotion, value and taught belief – all of which are social phenomena, deriving from relations with others and not from a purified or natural self’ (1241, emphasis in original). This is a masterful example of a deductive argument that recognizes the constructed/situational nature of the truths we see around us. But Hirschmann is also willing to reference more empirical research, especially when she draws from the field of psychology. For that reason, I have placed her piece deeper in the constructivist camp but closer to the inductivists (than Rawls). The final, bottom left-hand, quadrant is the most difficult to name. I have played with ‘post-modern’ and ‘constructivist’, but have decided to describe this space as belonging to a Sceptical tradition. In this quadrant, I have included two very different pieces: Robert Cox’s (1981) Millennium piece and one of the most readable pieces ever published in the APSR: Larry Preston (1995). Both articles avoid generalization and recognize the contingent nature of knowledge. Cox opens his piece by recognizing that ‘Academic conventions divide up the seamless web of the real social world into separate spheres, each with its own theorising’ (1981: 126).
This is the foundation, upon which he rests his most famous argument that ‘Theory is always for someone and for some purpose’ (ibid.: 128, emphasis in original). Cox’s approach is to
examine the existing literature in a way that appears very inductive (even if the material surveyed is itself largely theoretical). In this way, his account is much closer to Hirschmann’s than it is to Preston’s. Preston relies on a handful of personal histories and anecdotes to show how ‘[t]heoretical language passes over and distorts the differences it would understand. Theory more attentive to difference needs to gain access to the meanings that circulate within different lives, especially as reflected in literary writing of those who, themselves, speak and write from sites of difference’ (1995: 941). In describing the difficulty of capturing difference, and how often these descriptions are necessarily inauthentic, Preston demonstrates the strongest constructivist ontology among the pieces I have selected: ‘Dig all the way down into the identity or subjectivity and there isn’t any person-with-a-voice to be found’ (ibid.: 950).
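The mapping described above can also be written down as a small data structure, which may help readers who want to place further works in the same space. What follows is only a minimal sketch: the names `Work` and `quadrant` are hypothetical, and the coordinates are illustrative guesses chosen to respect the relative placements argued for in the text (for instance, Przeworski and Wallerstein closer to the centre than Downs). They are not measurements taken from Figure 27.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    label: str
    ontology: float      # -1.0 = constructivist (left) ... +1.0 = naturalist (right)
    epistemology: float  # -1.0 = inductive (bottom) ... +1.0 = deductive (top)


def quadrant(work: Work) -> str:
    """Name the theoretical tradition implied by the signs of the two coordinates."""
    if work.ontology >= 0:
        return "Rational Choice" if work.epistemology >= 0 else "Behaviouralism"
    return "Political Theory" if work.epistemology >= 0 else "Scepticism"


# Illustrative coordinates only (assumed for this sketch, not read off the figure).
WORKS = [
    Work("Downs (1957)", 0.8, 0.9),
    Work("Przeworski and Wallerstein (1988)", 0.4, 0.5),
    Work("Rokkan (1967)", 0.8, -0.9),
    Work("Inglehart (1971)", 0.5, -0.5),
    Work("Rawls (1985)", -0.3, 0.8),
    Work("Hirschmann (1989)", -0.6, 0.4),
    Work("Cox (1981)", -0.5, -0.2),
    Work("Preston (1995)", -0.9, -0.8),
]

# Group the eight examples by tradition.
by_quadrant: dict[str, list[str]] = {}
for w in WORKS:
    by_quadrant.setdefault(quadrant(w), []).append(w.label)

for tradition, labels in by_quadrant.items():
    print(f"{tradition}: {', '.join(labels)}")
```

Treating the two axes as continuous rather than binary keeps the point made earlier that most work sits somewhere in the middle of each continuum; the quadrant label is then only a coarse summary of a position.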
Conclusion
Whether we recognize it or not, ontology and epistemology are part and parcel of what we do as political scientists. In this regard, we are like the orator who was greatly surprised when told that she had spoken prose over all those years. Recognition of the importance of ontological and epistemological issues seems to be on the rise, as recent years have seen a veritable methodological awakening in the literature. This chapter aims to feed that awakening by sensitizing the reader to the often implicit methodological decisions we make about ontology and epistemology when designing and conducting our research projects. In particular, this entry has described and documented that methodological awakening, and provided some ideas about the forces that may be propelling it. Political scientists of all shades are increasingly aware of the importance of reflecting on
the consequences of adopting (or inheriting) a particular ontological or epistemological foundation. To help practising political scientists see the importance of these decisions, and to demonstrate the spread of ontological and epistemological choices that are available, I have devised a simple framework that can be used to compartmentalize the contemporary literature. Any and every piece of political science might be placed in this stylized space, and contrasted against others. To illustrate the utility of such an approach, I have compared eight influential examples from contemporary political science. With such a framework in hand, young scholars might think critically about the dominant ontological and epistemological positions in their own particular area of research. For example, students who are interested in voting behaviour might recognize the dominance of what I have called the Behaviouralist tradition in their field of study, and begin to wonder how that tradition might be challenged in the face of more deductive and/or constructivist approaches. Such a simple mapping allows us to see how our own particular area of research is often pigeonholed in a given ontological or epistemological tradition (for better or for worse). This realization might prompt us to consider alternative ontological and epistemological vantage points in an effort to break out of the status quo. It is even possible to map developments in the discipline over time, and to see how they gravitate to a particular quadrant or the other (or head toward the centre) at different points in history. Most importantly, I hope this framework can help us think critically about the sort of (too often implicit) assumptions that underlie the main traditions, within which we work. 
Such a framework may offer a useful and practical means to address the concerns voiced by Peter Hall (2003) about the need to narrow the gap that now separates the implied ontologies and the methods employed by so many of today’s social scientists.
Notes
1 I would like to acknowledge the helpful comments of Torbjørn Knutsen and Michael Alvarez. I, alone, am (obviously) responsible for the resulting product.
2 The exception to this rule is the Discourse Quality Index (see Steenbergen et al., 2003), but it tends to be used by a smaller sub-set of political scientists. I hasten to note that the newer ‘Varieties of Democracy’ index does include deliberation, along with electoral, liberal, participatory and egalitarian principles in its index. See V-Dem (2018).
3 It should be noted that someone who is not attuned to ontological issues will assume the ontological certainty of her subject matter, and recognize this to be a problem of validity: i.e. is she using an appropriate indicator to capture the essence of democracy (which is assumed to be solid)? A more methodologically aware political scientist will use this mis-match to question the ontological status of the democracy being measured.
4 For examples of the former, consider the Hawthorne effect, as described in Moses and Knutsen (2019: 56–7). For examples of the latter, see Moses and Knutsen (2019: 275–7).
5 In particular, North America includes United States, Canada, Cuba and Mexico; Europe includes Austria, Belgium, Bulgaria, Cyprus, Czech Republic, Denmark, England, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Netherlands, Northern Ireland, Norway, Poland, Portugal, Russia, Scotland, Serbia, Spain, Sweden, Switzerland and Wales; Middle East and Africa includes Ethiopia, Iran, Israel, Lebanon, Nigeria, Saudi Arabia, South Africa and Turkey; Asia includes India, Japan, People’s Republic of China, Singapore, South Korea, Taiwan and Thailand; South America includes Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Uruguay and Venezuela; while Oceania is limited to Australia and New Zealand.
References
Aristotle. (1946 [c. 350 BCE]). The Politics of Aristotle. Edited and translated by Ernest Barker. Oxford: Oxford University Press.
Bates, S. and Jenkins, L. (2007). Teaching and learning ontology and epistemology in political science. Politics, 27(1), pp. 55–63.
Blaikie, N. (1993). Approaches to Social Enquiry. Cambridge: Polity.
Blyth, M. (2003). Structures do not come with an instruction sheet: Interests, ideas, and progress in political science. Perspectives on Politics, 1(4), pp. 695–706.
Cox, R. W. (1981). Social forces, states and world orders: Beyond international relations theory. Millennium: Journal of International Studies, 10(2), pp. 126–155.
Cruickshank, J. (2010). Knowing social reality: A critique of Bhaskar and Archer’s attempt to derive a social ontology from lay knowledge. Philosophy of the Social Sciences, 40(4), pp. 579–602.
Downs, A. (1957). An economic theory of political action in a democracy. Journal of Political Economy, 65(2), pp. 135–150.
Dryzek, J. S. (2000). Deliberative Democracy and Beyond: Liberals, Critics, Contestations. New York: Oxford University Press.
Ferraris, M. (2016). Interview with Maurizio Ferraris. Figure/Ground. Interview conducted by Laureano Ralón and Mario Teorodo Ramírez on 12 May 2016. Available at: http://figureground.org/interview-with-maurizioferraris/ [Accessed 20 August, 2018].
Foucault, M. (1984). The Order of Discourse. In: M. Shapiro, ed., Language and Politics. Oxford: Basil Blackwell, pp. 108–139.
Frank, R., Gilovich, T. and Regan, T. (1993). Does studying economics inhibit cooperation? Journal of Economic Perspectives, 7(2), pp. 159–171.
Furlong, P. and Marsh, D. (2007). On ontological and epistemological gatekeeping: A response to Bates and Jenkins. Politics, 27(3), pp. 204–207.
Goldstein, J. and Keohane, R. O., eds., (1993). Ideas and Foreign Policy: Beliefs, Institutions, and Political Change. Cornell: Cornell University Press.
Haas, P. M. (1992). Introduction: Epistemic communities and international policy coordination. International Organization, 46(1), pp. 1–35.
Hall, P. A. (2003). Aligning ontology and methodology in comparative research. In: J. Mahoney and D. Rueschemeyer, eds., Comparative Historical Analysis in the Social Sciences. New York: Cambridge University Press, pp. 373–406.
Hall, P. A. and Taylor, R. C. R. (1996). Political science and the three new institutionalisms. Political Studies, 44(5), pp. 936–957.
Hay, C. (2002). Political Analysis. Basingstoke: Palgrave.
Hay, C. (2006). Political Ontology. In: R. E. Goodin and C. Tilly, eds., The Oxford Handbook of Contextual Political Analysis. New York: Oxford University Press, pp. 78–96.
Hirschmann, N. (1989). Freedom, Recognition and Obligation: A feminist approach to political theory. American Political Science Review, 83(4), pp. 1227–1244.
Hofweber, T. (2018). Logic and Ontology. In: E. Zalta, ed., The Stanford Encyclopedia of Philosophy, Summer 2018 ed. Available at: https://plato.stanford.edu/archives/sum2018/entries/logic-ontology/ [Accessed 5 February, 2019].
Ikäheimo, H. and Laitinen, A., eds., (2011). Recognition and Social Ontology. Leiden or Boston: Brill.
Inglehart, R. (1971). The silent revolution in Europe: Intergenerational change in postindustrial societies. American Political Science Review, 65(4), pp. 991–1017.
Jackson, P. T. (2008). Foregrounding ontology: Dualism, monism, and IR theory. Review of International Studies, 34(1), pp. 129–153.
Lawson, C., Latsis, J. and Martins, N., eds., (2007). Contributions to Social Ontology. London: Routledge.
Lowndes, V., Marsh, D. and Stoker, G. (2018). Theory and Methods in Political Science. 4th ed. Basingstoke: Palgrave.
Marsh, D. and Furlong, P. (2002). A Skin not a Sweater: Ontology and Epistemology in Political Science. In: D. Marsh and G. Stoker, eds., Theory and Methods in Political Science. 2nd ed. Basingstoke: Palgrave Macmillan, pp. 17–44.
Marsh, D. and Stoker, G., eds., (1995). Theory and Methods in Political Science. Basingstoke: Palgrave Macmillan.
Marsh, D. and Stoker, G., eds., (2002). Theory and Methods in Political Science. 2nd ed. Basingstoke: Palgrave Macmillan.
Marsh, D. and Stoker, G., eds., (2010). Theory and Methods in Political Science. 3rd ed. Basingstoke: Palgrave Macmillan.
Marsh, D., Ercan, S. and Furlong, P. (2018). A Skin, Not a Sweater: Ontology and Epistemology in Political Science. In: V. Lowndes, D. Marsh and G. Stoker, eds., Theory and Methods in Political Science. 4th ed. Basingstoke: Palgrave, pp. 177–199.
Marx, K. and Engels, F. (1948 [1848]). The Communist Manifesto. New York: International Publishers.
Michel, T. (2009). Pigs can’t fly, or can they? Ontology, scientific realism and the metaphysics of presence in international relations. Review of International Studies, 35(2), pp. 397–419.
Moses, J. and Knutsen, T. (2007). Ways of Knowing: Competing Methodologies in Social and Political Research. Basingstoke: Palgrave Macmillan.
Moses, J. and Knutsen, T. (2012). Ways of Knowing: Competing Methodologies in Social and Political Research. 2nd ed. Basingstoke: Palgrave Macmillan.
Moses, J. and Knutsen, T. (2019). Ways of Knowing: Competing Methodologies in Social and Political Research. 3rd ed. London: Red Globe/Macmillan.
Ngram (2018). Available at: https://books.google.com/ngrams/graph?content=ontology%2Bpolitical+science%2Cepistemology%2Bpolitical+science&year_start=1900&year_end=2008&corpus=15&smoothing=3&share=&direct_url=t1%3B%2C%28ontology%20%2B%20political%20science%29%3B%2Cc0%3B.t1%3B%2C%28epistemology%20%2B%20political%20science%29%3B%2Cc0 [Accessed 25 July, 2018].
Plato. (1941 [c. 360 BCE]). The Republic of Plato. Translated with Introduction and Notes by F. Cornford. Oxford: Oxford University Press.
Popper, K. (2002 [1957]). The Poverty of Historicism. London: Routledge.
Preston, L. M. (1995). Theorizing difference: Voices from the margins. American Political Science Review, 89(4), pp. 941–953.
Przeworski, A. and Wallerstein, M. (1988). Structural dependence of the state on capital. American Political Science Review, 82(1), pp. 11–29.
Quine, W. V. O. (1948). On what there is. Review of Metaphysics, 2(5), pp. 21–38.
Ragin, C. C. (1987). The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
Rawls, J. (1985). Justice as fairness: Political not metaphysical. Philosophy and Public Affairs, 14(3), pp. 223–251.
Rokkan, S. (1967). Geography, religion, and social class: Crosscutting cleavages in Norwegian politics. In: S. M. Lipset and S. Rokkan, eds., Party Systems and Voter Alignments: Cross-National Perspectives. New York: The Free Press, pp. 367–444.
Schatzki, T. R. (2003). A new societist social ontology. Philosophy of the Social Sciences, 33(2), pp. 174–202.
Steenbergen, M., Bächtiger, A., Spörndli, M. and Steiner, J. (2003). Measuring political deliberation: A Discourse Quality Index. Comparative European Politics, 1(1), pp. 21–48.
Steup, M. (2018). Epistemology. In: E. Zalta, ed., The Stanford Encyclopedia of Philosophy, Winter 2018 ed. Available at: https://plato.stanford.edu/archives/win2018/entries/epistemology/ [Accessed 5 February, 2019].
V-Dem (2018). Varieties of Democracy: Global Standards, Local Knowledge. Available at: https://www.v-dem.net/en/ [Accessed 5 February, 2019].
Verba, S. (1967). Some dilemmas in comparative research. World Politics, 20(1), pp. 111–128.
Weber, M. (1976 [1905]). The Protestant Ethic and the Spirit of Capitalism. Introduction by A. Giddens. New York: Charles Scribner’s Sons.
Wendt, A. (1999). Social Theory of International Politics. Cambridge: Cambridge University Press.
28 Survey Research Bruno Cautrès
Introduction
Survey research is among the most important methodological instruments of modern social sciences. Conducting a survey could even be qualified as the major activity of social scientists. From the early stages of modern social sciences to the latest contemporary developments, ‘surveying’ social and political phenomena is certainly the most unifying mode of investigating the world across the disciplines and the methodological paradigms of social sciences. Presenting an exhaustive panorama of ‘survey research’ is not only nearly impossible in a single chapter; any such panorama would also very quickly be out of date. There have been many rapid technical developments in the field of social science surveys in recent years, and the survey research field has become a gigantic one, composed of many different sub-topics, each of them highly specialized. Rather than proposing to cover everything but lightly, this chapter makes two choices:
first, to present some major issues in the design of surveys, restricting the word ‘survey’ to quantitative surveys; second, to extend the survey research issues to the analysis of survey data. This double substantive choice is driven by two intellectual motivations: first, we propose that something unifies the two most important processes of producing a survey, the ‘measurement’ process and the ‘representation’ process. Second, we propose that something also unifies the design and the analysis of a survey. A good survey is a complex set of procedures and protocols that requires both a division of labor between specialists and a very strong integration among them. The making of a survey is a chain of processes, but nothing would be more damaging to the quality of a survey than a strict separation between the different tasks corresponding to these different processes. The making of a survey is also linked to the ultimate objective: the statistical analysis of the data collected through the survey. Making a survey and analyzing a survey are actually one and the same task.
Designing a Survey with the ‘Total Survey Error’ Perspective
Typically, survey research targets a key scientific objective: to make possible ‘inferential knowledge’ about the world. Making ‘inferences’ has been part and parcel of survey instruments and techniques since the beginnings of the survey paradigm. A survey is actually a set of procedures, methods, and techniques targeting two types of inferences: a ‘measurement’ inference (measuring unobservable concepts) and a ‘statistical’ inference (the question of representation, in other words the possibility of generalizing the survey’s results to a larger population). Every step in the making of a survey, any technique used in this framework, serves one of these two inferential perspectives and often both at the same time. According to Groves, ‘a survey is a systematic method for gathering information from (a sample of) entities for the purposes of constructing quantitative descriptors of the larger population of which the entities are members’ (Groves et al., 2009: xv). For an organization or an institution, deciding to conduct a survey is always an important and strategic decision. Doing a survey is not without risk; a survey produces and delivers key indicators and any organization or institution expects these indicators to be reliable information. Recent trends in public policy, for instance, attribute to data collected by surveys a major role in evaluation. This importance is particularly the case with quantitative surveys. The outcomes of a quantitative survey are numbers and these numbers are ‘sampling estimates’, in other words estimations of the ‘true’ reality (the so-called ‘parameters’). Estimating accurately the parameters of an empirical reality is the final objective
of a survey and this inferential perspective has gained more and more relevance. But the making of a survey is also a question of costs. The cost–benefit balance is an integral part of the making of a survey. Contrary to the common wisdom according to which budget limits for a survey are only a negative constraint, we propose another interpretation. While organizations and institutions have to become aware that survey quality involves high expenditures and that good social and political surveys cost a lot, as in any other discipline, budget limits also have their good side. Without budget limits it is likely that surveys would be conducted with nonsensical fieldwork, with hours of interviews, overly long and complex questionnaires, and many repetitions of existing data. The obligation to think about the cost–benefit balance of a survey has certainly had another positive consequence: an increasing demand for the quality of data produced by surveys, particularly quantitative surveys. The concept of ‘total survey error’ reflects this new requirement. If producing a survey has become both more costly in terms of budget and more strategic in terms of the validity and reliability of its results, then every effort must be made to reduce the risk of error and bad estimation. While this was already the case before, and has always been a key issue in surveys, research into survey methodology has evolved considerably and progressed on one essential point. The ‘total’ error, the one that will lead to producing a poor estimate, can be broken down into a series of ‘partial’ and ‘local’ errors. In an ideal world, errors should not be made, but in the real world, the world of multiple constraints on survey production, some mistakes can be avoided, and others can be corrected even if they are made. The concept of ‘total survey error’ is nowadays considered as the theoretical and empirical reference framework for the quality control of data produced by surveys.
Jon Krosnick, Paul Lavrakas, and Nuri Kim define it as follows:
The total survey error perspective is based on the notion that the ultimate goal of survey research is to accurately measure particular constructs within a sample of people who represent the population of interest. In any given survey, the overall deviation from this ideal is the cumulative result of several sources of survey error. Specifically, the total survey error perspective disaggregates overall error into seven major components: coverage error, sampling error, non-response error, specification error, measurement error, adjustment error, and processing error. (Krosnick et al., 2014: 431)
According to Groves et al., the concept of total survey error can even be described as a ‘unified perspective’ of survey methods: Over the past two decades, a set of theories and principles has evolved that offer a unified perspective on the design, conduct, and evaluation of surveys. This perspective is most commonly labeled the ‘total survey error’ paradigm. The framework guides modern research about survey quality and shapes how practising survey professionals approach their work. The field arising out of this research domain can appropriately be called ‘survey methodology’. (Groves et al., 2009: xv)
Groves et al. argue that one way of learning about surveys ‘is to examine each type of error in turn, or studying surveys from a “quality” perspective’ (2009: 41). This modern and recent perspective on the quality of data produced by a survey can be synthesized using two figures proposed by Groves et al. (2009). The first graph shows the steps involved in conducting a survey: from the most general, defining research objectives, to the most specific, producing data and publishing an ‘educated’ version. It must be understood that the published and disseminated data are not necessarily the raw data collected. A real editing work of the raw collected data must be carried out: the coding or recoding of the data after their collection. The corrections and weighting procedures are essential steps before ‘survey statistics’ can be published. But the main interest of this graph is to show that before the fieldwork starts, the process of producing
a survey has two separate branches that gradually converge. These two branches are the bases from which the two types of inference (see Figure 28.1) are possible. On the right side of this graph we find what concerns inference in the classic sense of the term, statistical inference; on the left side, we find everything that concerns the other inference problem, that of ‘representation’ in the sense of ‘measurement’ (how well the survey items represent the underlying concepts). The seven components of the ‘total survey error’ mentioned above can be reframed in a slightly different, but very close, way by using the second figure from Groves et al. This second figure fully develops the ideas of the ‘total survey error’ (Figure 28.2). We now zoom in, in great detail, on the two parallel but converging processes of inference. Throughout these two processes,
Figure 28.1 The different steps of a survey Source: Figure slightly adapted from Figure 2.4 in Groves et al. (2009: 47).
Figure 28.2 The two parallel processes of controlling the ‘total survey error’ Source: Figure slightly adapted from Figure 2.5 in Groves et al. (2009: 48).
a set of seven types of error are the components of the ‘total error’: each error type is a ‘partial’ or ‘local’ source of the ‘total error’. As can be seen, correctly estimating a ‘survey statistic’ is the result of a relatively complex methodological process, which requires sustained attention at all stages. We can really talk about a ‘production chain’ of survey research, which aims at a double inferential objective (measurement and representation). In the very synthetic scheme proposed by Groves et al. the word ‘error’ regularly appears at all stages, as a threat to the validity and reliability of the statistics produced at the end. The main types of errors that the second graph identifies are the necessary steps for any reflection on the production of a survey.
Measurement Issues

The four blocks of the measurement process are the steps that lead to the production of what is called an ‘edited response’. This
vocabulary is particularly interesting and metaphorical: the data that are published at the end of the measurement process are ‘constructed’ data. Without being part of a constructivist epistemology, the modern approach to survey research assumes that at all stages of the development of a survey important choices are made and that the produced/published statistics are actually the result of an intellectual construction. ‘Construct validity’ is among the most important issues in the measurement process. According to Paul J. Lavrakas (2008), ‘in the context of survey research, construct validity addresses the issue of how well whatever is purported to be measured actually has been measured’. Construct validity implies ‘face validity’ (validity ‘at face value’) but the reverse is not necessarily true. A survey item that would measure the popularity of the President by an item about the way the country is run by the current administration would certainly have some face validity, but certainly not construct validity. Evaluating the strength of construct validity requires
attention to many empirical aspects: the wording, the formatting, the place of the item in the questionnaire, the mode of administration of the questionnaire, and the response style of the respondent. In other words, a high construct validity depends not only on the strict meaning of ‘validity’ (the instrument measures what it is supposed to measure), but also on the empirical context of the questionnaire and fieldwork. In the end, statistical analysis is the ultimate step in judging construct validity: if the construct is valid, its measurement must correlate with criterion variables. In other words, a high construct validity means that the items capture the variance of the phenomenon that is supposed to be measured. The basic idea of a measurement error is the difference between a measured quantity and its ‘true value’. Measurement error in itself is not the problem. When a valid construct is empirically measured, the operationalization of the measure can generate errors, as in any observation or experiment. The real concern is when the error is not a so-called ‘random error’, but has a ‘systematic’ origin, caused by a wrong calibration of the measuring instruments. While ‘random errors’ are unavoidable (due to the sampling of the respondents and to the empirical process of measuring), the ‘total survey error’ perspective aims at controlling and reducing the avoidable part of the measurement error, the systematic part. According to Paul Biemer, the measurement error: includes errors arising from respondents, interviewers, survey questions, and various interview factors. Respondents may (deliberately or unintentionally) provide incorrect information in response to questions. Interviewers can cause errors in a number of ways. By their speech, appearance, and mannerisms, they may undesirably influence responses, transcribe responses incorrectly, falsify data, or otherwise fail to comply with the survey procedures.
The questionnaire can be a major source of error, if it is poorly designed. Ambiguous questions, confusing instructions, and easily misunderstood terms are examples of questionnaire problems that can lead to measurement error. (Biemer, 2010: 823)
The set of errors described as ‘processing error’ typically comes from editing, coding, data entry, or programming errors. These kinds of errors can sometimes be difficult to detect, particularly when it comes to coding and data entry errors. If a numerical code has been wrongly attributed to one case, the error may be impossible to detect if that case is the only error among hundreds or thousands of observations. This kind of error arises during the data processing stage. For example, in coding open-ended answers related to economic or socio-demographic characteristics, coders may deviate from the coding protocol. The errors that occur in a particular survey are strongly influenced by survey planning, and to some extent by the survey’s resources (e.g. staff and budget) and constraints (e.g. elapsed time between data collection and publication). In general, resources and constraints weigh heavily on the choice of data collection mode, with each mode resulting in different types of processing errors.
Representation Issues

Because the major objective of a survey (at least for a quantitative survey) is to produce accurate estimates of the population parameters, the early stages of the representation process are of crucial importance. As can be seen in the second figure, the representation process is mostly about sampling issues. The sampling procedure itself comes only in third position in the sequential order of producing the survey statistics. Before drawing a sample, we must address two prior issues: defining the ‘target population’ and establishing its correspondence with the ‘sampling frame’. This may seem trivial to mention here, but it is a major issue. We must clarify, before anything else, which population the survey intends to cover. An example comes from election studies: they want to analyze the electoral population, the one having voting age and
voting rights. But an interesting perspective for election studies could also be to analyze the population of young citizens who will soon become voters, between the beginning of the survey and the election date. In a recent and large panel study of French voters, for example, the decision was taken to also cover the population of ‘first voters’, those young citizens who would become voters during the span of the panel study.¹ Once the target population has been correctly defined, the empirical implementation of this objective can be more complex than expected. A sampling frame, composed of the list of sampling units, must be available (for extracting the sample) but it must also correspond to the defined population. A series of gaps between the two can compromise the initial objective of representing the population. These gaps are called ‘coverage error’ problems. Coverage error is a major source of bias in a statistic: it arises when the target population does not coincide or correspond exactly with the population actually sampled. The sources of coverage error can be diverse but essentially result from ‘undercoverage’ or ‘overcoverage’ of segments of the population. ‘Undercoverage’ occurs when members of the target population are excluded (for instance if an age limit is set for the sampling when the objective is to cover the whole population). ‘Overcoverage’ is the opposite: a segment of the population is erroneously present in the sample when it should not be (for instance a national sample of residents when the objective was a national sample of voters). The net coverage error is the difference between the two.
As summarized by Herbert Weisberg, coverage error issues raise not only sampling questions but also considerations about the mode of administration of a survey: [C]overage error produces bias when a large part of the target population is omitted from the sampling frame, and when the mean of the sampling frame thus differs from that for those omitted
from it. Concern that many households did not have a telephone led to avoidance of telephone surveys until the 1970s when the coverage rate for telephones in the US finally went above the 90% level. Internet surveys still suffer from serious coverage problems. (Weisberg, 2008: 226)
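The mechanism described in this quotation can be made concrete with a little arithmetic. The following Python sketch is an illustration only: the function name `coverage_bias` and all figures are hypothetical and are not taken from any actual survey. It assumes that the statistic of interest is a mean (here a turnout rate) and compares the mean of the units covered by the sampling frame with the mean of the full target population.

```python
def coverage_bias(n_covered, mean_covered, n_omitted, mean_omitted):
    """Difference between the frame mean and the target-population mean."""
    n_target = n_covered + n_omitted
    # The target population mean pools covered and omitted units.
    target_mean = (n_covered * mean_covered + n_omitted * mean_omitted) / n_target
    return mean_covered - target_mean


# Hypothetical figures: 9,000 units are on the frame (turnout rate 0.60),
# 1,000 units are missing from it (turnout rate 0.40).
bias = coverage_bias(9000, 0.60, 1000, 0.40)

# Equivalent reading: omitted share times the gap between the two groups.
omitted_share = 1000 / (9000 + 1000)
gap = 0.60 - 0.40
print(round(bias, 4), round(omitted_share * gap, 4))
```

With these invented numbers, the bias is the omitted share of the population multiplied by the gap between covered and omitted units. It vanishes when either quantity is zero, which corresponds to the two conditions stated in the quotation.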
Coverage error issues are thus still a matter of major concern in the new context of web-survey design. Face-to-face interviews based on area probability frames are certainly still the best mode of administration of surveys. Telephone interviews have become obsolete (problems with directories, cell phone-only households, and dual frames of landline and mobile phones, with the overlap issues they raise in sampling) and so have mail surveys (a mail survey is possible only if we have an up-to-date and accurate list of the population, which is problematic in many circumstances due to the mobility of some segments of the population). For web-surveys, the coverage error issues are due to the fact that no definitive and satisfactory frame has been developed for sampling the internet population (digital divide; multiple email addresses), despite major efforts made recently in this field. Coverage errors can have a negative effect on the sampling error, particularly on one of the two facets of the ‘sampling error’. Basically speaking, ‘sampling error’ consists of two components: ‘sampling variance’ and ‘sampling bias’. The first one: [is] the part that can be controlled by sample design factors such as sample size, clustering strategies, stratification, and estimation procedures. It is the error that reflects the extent to which repeated replications of the sampling process result in different estimates. (Krotki, 2008)
The second one: results from a systematic source that causes the sampling estimates, averaged over all realizations of the sample, to differ consistently from their true target population values. Whereas sampling variance can be controlled through design features such as sample size, stratification, and clustering, we need to turn to other methods to control and reduce bias as much as possible. (ibid.)
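The distinction between the two components can be illustrated by simulation. The sketch below is a hypothetical example, not a procedure taken from the sources cited: the artificial population, the deficient frame, and the helper `replicate_means` are all assumptions made for the illustration. It draws repeated simple random samples and compares the spread of the estimates (sampling variance) with their average deviation from the true value (sampling bias).

```python
import random
import statistics


def replicate_means(frame, n, replications, rng):
    """Draw `replications` simple random samples of size n; return their means."""
    return [statistics.fmean(rng.sample(frame, n)) for _ in range(replications)]


rng = random.Random(42)

# Artificial target population: 10,000 units with values 0..99 (mean 49.5).
population = [i % 100 for i in range(10_000)]
true_mean = statistics.fmean(population)

# Hypothetical deficient frame: units with low values cannot be selected.
biased_frame = [v for v in population if v >= 20]

small = replicate_means(population, 50, 500, rng)
large = replicate_means(population, 500, 500, rng)
biased = replicate_means(biased_frame, 500, 500, rng)

# Sampling variance: shrinks when the sample size grows.
var_small = statistics.pvariance(small)
var_large = statistics.pvariance(large)

# Sampling bias: average deviation of the estimates from the true value.
bias_full = statistics.fmean(large) - true_mean
bias_frame = statistics.fmean(biased) - true_mean

print(var_small, var_large, bias_full, bias_frame)
```

In this artificial setting, increasing the sample size reduces the variance of the estimates, but the estimates drawn from the deficient frame remain off target by about the same amount however large the sample, which is the sense in which bias has to be handled by other means than design features such as sample size.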
This point is particularly critical for survey research: while the sampling variance of an estimate is inherent in sampling (by definition a sample generates estimates, which we call ‘sample statistics’, and by definition these estimates may fluctuate randomly from one sample to another), the sampling bias is a concern. The sampling variance is the random component of the sampling error, whereas the sampling bias indicates a systematic (non-random) error which is present whatever the sample size and however many times the sample is drawn. Correcting for both sampling errors (the random part and the possibly systematic one) is a major task in checking and validating the survey estimates. The correction requires us to know the population parameters and this concerns an essential point: the sampling strategy. Only probability samples (when the units of analysis are extracted from the population by random selection) guarantee that corrections can be applied to the estimated parameters. Applying weighting procedures to correct for sampling error is not impossible with non-probability samples such as quota samples, but it is more complex, less direct, and certainly less precise. Correction for sampling errors when the sample does not come from a random selection out of an exhaustive list of the population is less precise since the basis of correction is not the individual units of the population but (in the case of quota samples) the grouping of individuals into quota groups. But even if the coverage of the survey is good and the sampling error is restricted to the sampling variance of the statistics, another source of error can damage the quality of the estimate: the ‘non-response error’. This error occurs when the survey, particularly in its interview stage, fails to get responses to some indicators and questions.
The non-response error creates two types of problems for the survey: first, the decrease in sample size (due to non-response) results in larger standard errors for the estimated statistics; second, and perhaps more importantly, it
creates a systematic bias, since non-respondents often differ from respondents within a sample. In other words, it is very likely that non-response is not a random error. This error not only concerns the interview process (some questions, items, or indicators may generate non-response for different reasons: complexity or bad phrasing of the questions, or mis-specification of the items, for instance), but it also concerns the delicate stage of the contact with the respondents. Some segments of the population might be ‘reluctant’ to respond to surveys, particularly social and political surveys. Converting the ‘reluctant respondent’ into a ‘participant respondent’ is a critical issue. Important experiments have been conducted in the context of the European Social Survey (ESS) to convert the ‘reluctant respondent’ and to persuade her to participate and respond to the questionnaire. It has been shown that some respondents need to be re-contacted several times before agreeing to respond, which has a major effect on the distribution of the estimates. For instance, the sampling distribution of the measurement of authoritarian values is deeply affected by non-response errors and may vary very significantly according to the number of contacts needed with the respondents. A last, but not least, threat to the quality of survey data is what are called ‘adjustment errors’. The particularity of this source of error is that it concerns potential errors committed after the data collection. These ‘post-survey adjustments’ refer to a series of statistical adjustments applied to survey data prior to data analysis and dissemination, but after data collection. Typically this concerns technical tasks like ‘data editing’, missing data verifications, and, where necessary, imputation, weighting, and disclosure limitation procedures. As summarized nicely by Michael Young: [D]ata editing may be defined as procedures for detecting and correcting errors in so-called raw survey data.
Data editing may occur during data collection if the interviewer identifies obvious
errors in the survey responses. As a component of post-survey adjustments, data editing involves more elaborate, systematic, and automated statistical checks performed by computers (…) Data editing begins by specifying a set of editing rules for a given editing task. An editing program is then designed and applied to the survey data to identify and correct various errors. First, missing data and ‘not applicable’ responses are properly coded, based on the structure of the survey instrument. Second, range checks are performed on all relevant variables to verify that no invalid (out-of-range) responses are present. Out-of-range data are subject to further review and possible correction. Third, consistency checks are done to ensure that the responses to two or more data items are not in contradiction. (Young, 2008: 597)
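The three editing steps listed in this quotation (coding of missing data, range checks, consistency checks) can be sketched as a very small editing program. The example below is a minimal illustration under stated assumptions: the variable names (`age`, `eligible`, `voted`), the missing-data codes, and the admissible age range are all hypothetical and would in practice come from the structure of the survey instrument.

```python
# Hypothetical raw codes used for missing or 'not applicable' answers.
MISSING_CODES = {-9, "", "NA"}


def edit_record(record):
    """Return a cleaned copy of one record and a list of flags for review."""
    # Step 1: recode all raw missing-data codes to a single value (None).
    cleaned = {k: (None if v in MISSING_CODES else v) for k, v in record.items()}
    flags = []

    # Step 2: range check (assumed admissible ages: 18 to 110).
    age = cleaned.get("age")
    if age is not None and not 18 <= age <= 110:
        flags.append("age out of range")

    # Step 3: consistency check between two items.
    if cleaned.get("eligible") == "no" and cleaned.get("voted") == "yes":
        flags.append("voted but not eligible")

    return cleaned, flags


clean_a, flags_a = edit_record({"age": 34, "eligible": "yes", "voted": "yes"})
clean_b, flags_b = edit_record({"age": 340, "eligible": "no", "voted": "yes"})
clean_c, flags_c = edit_record({"age": -9, "eligible": "yes", "voted": "NA"})
print(flags_a, flags_b, clean_c)
```

The sketch only flags suspicious records instead of correcting them automatically, in line with the idea that out-of-range data are subject to further review and possible correction.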
But the ‘adjustment errors’ can have sources other than errors made when editing the data. Missing data imputation and weighting adjustments are also critical questions. Imputation and weighting adjustments are actually standard tools for dealing with missing data in surveys. As can be seen, ‘adjustment errors’ may have their origins in the ‘non-response errors’. The way the missing data are dealt with (for instance by eliminating the individual units of analysis having some missing data) and the choices made to impute values in replacement of the missing information can introduce errors rather than correct them. Normally, the statistical objective of weighting and imputation is to reduce the potential bias of survey estimates due to item non-response or sampling error. But this can be achieved only to the extent that we know and can correctly identify the mechanism which has produced these two errors. If the sampling error is not correctly identified, or if the process that generated missing data is not known, then the correction could be problematic.
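As an illustration of a weighting adjustment, the sketch below applies a simple post-stratification, one common technique among several and not one singled out by the sources cited here. Everything in it is hypothetical: the two age groups, the population shares, and the answers. It also rests on the assumption discussed above, namely that the mechanism producing the imbalance is known: here, that respondents within a group resemble the missing members of the same group.

```python
from collections import Counter


def poststratification_weights(groups, population_shares):
    """Weight of each respondent = population share / sample share of its group."""
    counts = Counter(groups)
    n = len(groups)
    return [population_shares[g] / (counts[g] / n) for g in groups]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample of 100 respondents: young people are under-represented
# (30% of the sample against an assumed 50% of the population).
groups = ["young"] * 30 + ["old"] * 70
answers = [1] * 12 + [0] * 18 + [1] * 42 + [0] * 28  # 1 = supports a policy
shares = {"young": 0.5, "old": 0.5}

weights = poststratification_weights(groups, shares)
unweighted = sum(answers) / len(answers)
weighted = weighted_mean(answers, weights)
print(round(unweighted, 3), round(weighted, 3))
```

If the assumption fails, for instance if the young people who respond differ from those who do not, the same weights would amplify the answers of an unrepresentative subgroup, which is how an adjustment can turn into an error.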
Analyzing Survey Data

Once the data have been collected and, in an ideal world, the ‘total survey error’ has been corrected in total or for some of its
components, then we may start the analysis of an ‘edited data set’. The analysis of survey data is an integral part of survey research. Not only is this not an independent step in the conduct of the investigation, it is part of it from the beginning. No survey questionnaire is produced independently of data analysis issues. It can even be said that the entire intellectual and methodological construction underlying the conduct of a social or political survey has no other objective than to allow data analysis. If the survey design aims to produce statistical estimates from which the collected data are subjected to a series of statistical inferences, the survey analysis also aims at inference: the survey analysis can indeed be assigned the same objectives as those of the survey design: representativeness and inference. The statistical analysis of survey data can be interpreted as an extension of survey design by other means: the objective is the same; it is to produce high-quality survey statistics. To achieve this objective, the statistical analysis of survey data employs different procedures, all of which fall under what can be termed ‘data quality control’. This data quality control is one of the most crucial steps in the phase following the completion of the field survey.
The Construction of Attitude and Measurement Scales

One of the most common practices for analyzing survey data is to create synthetic measures that group indicators together. The analysis of the data is thus based on a methodological paradox. On the one hand, the development of the questionnaire leads researchers to wish for a great richness of indicators: academic survey questionnaires (for example those of the European Social Survey or the European Values Study) are often very long, with interview times that can appear excessive (in wave 8 of the European Social Survey the average interview time was 65 minutes with a standard deviation of
27 minutes, which reveals a large variance in the styles of responses to the questionnaire). On the other hand, the objective of analyzing the data collected through this long questionnaire is in fact to reduce the measurements to large synthetic indicators. A methodological tension thus exists between the length and multidimensionality of the survey questionnaire and the reduction of this complexity to large synthetic indicators. This tension is a central element of the social science survey paradigm. This paradigm is based on an extremely solid and important methodological and intellectual foundation, bequeathed to survey analysis by one of its most important founders, Paul Lazarsfeld (Lazarsfeld and Rosenberg, 1955). In his work, he proposes that the main objective of sociological surveys is a measurement objective: the empirical transition from ‘concepts to indicators’. Lazarsfeld inscribes this objective in a four-step framework that continues to inspire the development of social science surveys and questionnaires. In Lazarsfeld’s scheme, the first step is for the researcher to develop an abstract and imaginative representation of a theoretical problem: the aim here is to sketch an abstract theoretical construction, an imagined representation called a ‘concept’ (Elkins, Chapter 19, this Handbook). The second step turns towards the operational decomposition of the concept into its different components. These components are referred to as ‘facets’ or ‘dimensions’. The third step is to define the type of data needed to empirically measure the selected ‘facets’ or ‘dimensions’. To do this, the researcher can rely on ‘indicators’. The latter are elements that can be collected in the empirical world and whose link with the concept is defined in terms of probability. This last point is crucial to understanding the plurality and multidimensionality of the indicators present in a survey questionnaire. 
Indeed, if the link between the dimensions of a concept and its indicators is probabilistic, it means that each indicator is only a potentially imperfect measure of each dimension.
This relationship is not ‘deterministic’, unless one could measure perfectly or (an even more demanding methodological condition) if one could perfectly summarize a dimension with a single indicator. The plurality of indicators that are proposed to measure a dimension of a concept is therefore based on the central assumption that the concept and its dimensions cannot be perfectly measured by a single indicator. The objective of a survey questionnaire is thus to define indicators each measuring a particular facet of a concept. If the indicators are plural, the choice of a particular indicator is based on a rule of ‘interchangeability’ of indicators: according to this fundamental rule, different indicators can be used to measure a dimension of a concept in an equivalent way. This rule, which is easier to establish as a theoretical principle than to apply in empirical reality, prevents the answers given to the questions from being directly comparable to the meaning of the concept. Here we find Lazarsfeld’s strong hypothesis relating to the ‘probabilistic’ nature of the relationship between the concept, its dimensions, and its indicators. The fourth and last step of Lazarsfeld’s analytical scheme is the construction of synthetic indicators, composed of several indicators, constituting the indices of the dimensions of the concept. This step is essential for the quality control of the data collected by the survey and even more so for the validation of the measurements. This is one of the most fundamental challenges of survey analysis and its practices: the analysis of correlations between items or between questions makes it possible both to construct synthetic indices and to verify whether the indicators work well according to the initial plan, i.e. in their ability to measure the different aspects of the dimensions of the concept.
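The fourth step can be sketched in a few lines of code. The example below builds a synthetic index as the mean of several indicators and checks how well the indicators hang together with Cronbach's alpha, a widely used internal-consistency coefficient. Note that alpha is chosen here for illustration and is not named in the text above; the response data and function names are likewise hypothetical.

```python
import statistics


def synthetic_index(responses):
    """One index score per respondent: the mean of that respondent's items."""
    return [statistics.fmean(row) for row in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents-by-items table of scores."""
    k = len(responses[0])  # number of items
    item_variances = [
        statistics.variance([row[i] for row in responses]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of six respondents to three 5-point items
# intended to measure the same dimension of a concept.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
    [2, 1, 2],
]

index = synthetic_index(responses)
alpha = cronbach_alpha(responses)
print(index, alpha)
```

A coefficient close to 1 indicates that the items vary together, which is consistent with the idea that they are interchangeable measures of one dimension; a low value would suggest that at least one indicator does not work according to the initial plan.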
The Measurement of the ‘True Score’

Survey analysis and its ‘Lazarsfeldian’ tradition on all these questions are related to
another discipline and research perspective of the social sciences: psychometrics and the theory of the ‘true score’. There is a very clear intellectual proximity and methodological familiarity between the development of synthetic indicators and what psychometricians call the ‘true score’ of a measure. To fully understand this familiarity and its methodological importance for the quality control of survey data, we must return to a number of key concepts in the analysis of survey data. By attempting to construct synthetic indicators, we actually assess their operational and empirical capacity in terms of ‘validity’ and ‘reliability’. The statistical analyses (see below) that make it possible to construct synthetic indicators aim to provide an essential response to two main concerns of survey analysis: do we measure what we claim to measure? Is the measurement obtained of sufficient quality? To answer these questions, which are essential from any empirical measurement perspective (for any scientific research device or program the question of measurement is probably the most central question), reference is made to two key concepts of measurement theory: ‘reliability’ and ‘validity’. Reliability (or fidelity) is confirmed when an instrument is used several times under the same conditions and produces the same results. Validity is high when an instrument really measures the phenomenon it is intended to measure. The example of a chemist recording a temperature is often used to explain these two concepts. If his thermometer always indicates 2 degrees more than other calibrated thermometers, there is a validity problem. If the chemist is short-sighted, has difficulty reading the thermometer, and records only approximate values, there is a reliability problem (Nunnally, 1978). Reliability and validity can be affected by many sources of errors that can be grouped into two main categories: ‘random errors’ and ‘systematic errors’.
The true score model is a framework that proposes to take into account these two main categories of errors. It breaks down the
result of a measurement into three essential elements: the ‘true value’ supposed to correspond to the reality of the phenomenon studied, random error, and systematic error. We can now see the intellectual proximity between the design of the survey, with the ‘total survey error’ perspective, and the analysis of the survey data, with the psychometric ‘true score’ perspective. Fighting against measurement or representation sources of errors is in fact looking for the best estimate of the ‘true score’. All the efforts to construct a survey questionnaire actually aim to reduce the importance of random error as much as possible and to fight resolutely against systematic error. If the random error is related to the observation and empirical measurement, the probability that it will be revealed during an observation should normally be small and randomly distributed. This is where the perspectives of psychometrics and Lazarsfeld’s original intuition about the probabilistic nature of the link between indicators and the facets or dimensions of concepts come together. By contrast, systematic error is a measurement bias that deviates from the ‘true score’ in a non-random way. An example provides a clear understanding of this distinction. Imagine that a survey questionnaire of political attitudes includes several indicators measuring different aspects of ethnocentrism. The objective of the measurement is to produce an ethnocentrism score characterizing the population studied: this ‘true score’ will be inferred from the score estimated by the synthetic indicator constructed by combining, for example, the different ethnocentrism indicators present in the questionnaire. It is possible that a discrepancy exists between the ‘true score’ and the ‘estimated score’: if the indicators that have been chosen are both ‘reliable’ and ‘valid’, this discrepancy is a simple random fluctuation and the score estimate does not systematically deviate from the ‘true score’. 
But a more dramatic error could have occurred in the questionnaire: one of the indicators (or, even worse, several of them) could be tainted
by a systematic measurement bias. Thus, if one of the indicators concerned the measurement of anti-semitic prejudice, it is not impossible that this would disturb the estimation of the ‘true score’ of ethnocentrism. Although ethnocentrism and anti-semitism are correlated, they constitute two different phenomena. The systematic bias here would be even more severe if one of the indicators in the construction of the synthetic ethnocentrism index actually measured a phenomenon only weakly correlated with ethnocentrism.
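The true score decomposition discussed in this section (observed value = true value + systematic error + random error) can be illustrated with the thermometer example. The simulation below is a sketch under stated assumptions: the true temperature, the 2-degree offset, the size of the reading noise, and the function `measure` are all chosen for the illustration.

```python
import random
import statistics


def measure(true_value, systematic_error, random_sd, n, rng):
    """Simulate n readings: true value + systematic error + random error."""
    return [
        true_value + systematic_error + rng.gauss(0.0, random_sd)
        for _ in range(n)
    ]


rng = random.Random(0)
TRUE_TEMPERATURE = 20.0  # hypothetical 'true score'

# Miscalibrated thermometer: always 2 degrees too high, read without noise.
miscalibrated = measure(TRUE_TEMPERATURE, 2.0, 0.0, 10_000, rng)

# Well-calibrated thermometer read imprecisely (random error, sd of 1 degree).
imprecise = measure(TRUE_TEMPERATURE, 0.0, 1.0, 10_000, rng)

bias_miscalibrated = statistics.fmean(miscalibrated) - TRUE_TEMPERATURE
bias_imprecise = statistics.fmean(imprecise) - TRUE_TEMPERATURE
spread_miscalibrated = statistics.pstdev(miscalibrated)
spread_imprecise = statistics.pstdev(imprecise)

print(bias_miscalibrated, spread_miscalibrated)
print(bias_imprecise, spread_imprecise)
```

In this sketch the miscalibrated readings are perfectly repeatable but wrong on average (a validity problem), while the imprecise readings scatter around the true value and average out over many observations (a reliability problem). Repeating the measurement helps with the second kind of error but not with the first.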
Manifest Indicators and Latent Variables

The construction of synthetic indices is not based solely on the assumption that they allow us to estimate the ‘true score’ of the dimensions or facets of concepts. This construction is also based on another hypothesis, which is the real pillar of the Lazarsfeldian design of the survey. Synthetic indices are in fact the ‘real variables’ sought. These cannot be measured directly for operational reasons. Let us take the example of the measurement of authoritarianism. An interviewee cannot be asked to define himself directly on an ‘authoritarian’ scale. If a survey questionnaire included an indicator of the type: ‘generally speaking, would you say you are an authoritarian person’, this indicator would produce responses with a ‘social desirability’ bias because few people probably want to be labeled as ‘authoritarian’. Moreover, the use of the word ‘authoritarian’ would undoubtedly cause misunderstandings given its both abstract and polysemous character (authoritarian in politics? In the relationship to others? In private life?). These are the reasons that actually led to the measurement of authoritarianism by indirect indicators. In France, for example, one of the indicators used to measure authoritarianism is an attitude towards the restoration of the death penalty (which was abolished in France in 1981).
In the Lazarsfeldian tradition, opinion indicators are referred to as ‘manifest variables’: they reveal (in a ‘manifest’ way) the hidden political attitudes of the people who respond to the survey. The questionnaire thus becomes a conversation in which the respondent is indirectly led to reveal his psycho-sociological attitude traits without being hurt, without being questioned in too intrusive a way. An intimate relationship unites this philosophy of measurement and the neutrality of observation and the observer towards the responding person. Thus, a ‘good’ questionnaire is not only characterized by the fact that the indicators are both ‘valid’ and ‘reliable’: their formulation (their wording), their number, the order in which they are presented to the respondent, are all elements that must facilitate spontaneous expression and be as faithful as possible to deep attitudes. The questionnaire and its indicators then make it possible to reveal ex post not only the ‘true score’ but also the ‘true score of the true attitude’, the so-called ‘latent variable’.
Statistical analysis plays a fundamental role here. Multivariate statistical analyses are used to test all these choices and hypotheses. Two main statistical schools coexist in the world of survey data analysis. The dominant one is certainly the ‘regression model world’ that aims at modeling the relationships between a so-called ‘dependent variable’ and its ‘explanatory factors’ (Beck, Chapter 25, this Handbook). Coming from the scientific tradition of Galton and Gauss, regression-based statistical analysis looks at the ‘true score’ question mostly through the issue of model specification. Is the model function an accurate representation of the causal chain driving the accurate estimate of the ‘dependent variable’? The ‘true’ score for regression-based analysis is the one estimated through a link function and a particular specification of the model.
A very ‘hard’ perspective in regression analysis could say that ‘truth’ is the estimation, provided that the link function and the model specification are valid and correct. The residuals of the regression
model would be like the random error in that case, respecting the important assumption of the regression model that error is stochastic and randomly distributed. Needless to say, this perspective and assumption need a lot of verification and empirical checks. Assuming the link function and the model specification to be correct is one thing; proving it is something else. The regression-based set of methods is in fact particularly exposed to the issues of selection bias and endogeneity, two critical points that can totally ruin the prospect of obtaining the ‘true score’ in the estimation.
A second statistical school or paradigm is the set of ‘data reduction techniques’, like principal component analysis and, broadly speaking, the ‘factor analysis world’. Separating it from the ‘regression model world’ is useful for analytical reasons, although the reality is more complex. Some factor analysis developments and methods are actually model-based (in psychometrics in particular or in structural equation models). But we can say that the main difference is that data reduction aims to reduce the dimensionality of the edited data base, i.e. to reduce the number of columns of the data base (the variables) to a smaller set of ‘components’, ‘dimensions’, or ‘factors’; or to reduce the number of rows of the data base (the units of analysis) to a smaller set of groups, or types, or clusters. The data reduction techniques share a closer affinity with the ‘true score’ perspective than the regression-based techniques. These methods are fully in line with Lazarsfeld’s perspectives about ‘manifest’ and ‘latent’ variables. Running a factor analysis is in fact searching for the ‘true score’ in the unobserved latent variable. The set of items or questions developed in the questionnaire are the manifest measures of the ‘secret’ latent variable, which can be accessed only by discovering it as a combination of the manifest indicators.
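The logic of recovering a latent variable from manifest indicators can be illustrated with a minimal sketch. It is not the procedure used in the survey discussed in this chapter: the data are simulated, the five items and their loadings are hypothetical, and a principal component extracted by power iteration stands in for a full factor analysis.

```python
import math
import random


def standardize(values):
    """Return z-scores (mean 0, population standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def correlation_matrix(z_columns):
    """Correlation matrix of columns that are already standardized."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in z_columns]
            for ci in z_columns]


def first_component_weights(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    k = len(matrix)
    weights = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(k))
                   for i in range(k)]
        norm = math.sqrt(sum(p * p for p in product))
        weights = [p / norm for p in product]
    return weights


# Simulated data: one latent attitude and five manifest items (all hypothetical).
rng = random.Random(0)
n_respondents = 500
loadings = [0.8, 0.7, 0.75, 0.6, 0.7]
latent = [rng.gauss(0, 1) for _ in range(n_respondents)]
items = [[lam * t + math.sqrt(1 - lam ** 2) * rng.gauss(0, 1) for t in latent]
         for lam in loadings]

z_items = [standardize(col) for col in items]
weights = first_component_weights(correlation_matrix(z_items))

# Component score per respondent: weighted sum of the standardized items.
scores = [sum(w * col[i] for w, col in zip(weights, z_items))
          for i in range(n_respondents)]

# How closely does the component track the simulated latent variable?
z_scores, z_latent = standardize(scores), standardize(latent)
recovery = sum(a * b for a, b in zip(z_scores, z_latent)) / n_respondents
print("weights:", [round(w, 2) for w in weights])
print("correlation with latent variable:", round(recovery, 2))
```

In real survey data the latent variable is of course never observed, so the final check is only possible here because the data are simulated.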
The research attitude of the researcher using this data reduction framework is very coherent with the logic of drafting the
questionnaire and going from the concepts/constructs to their indicators. The best survey research strategy, the optimal one, is indeed to anticipate in the questionnaire design the future data reduction. This is paradoxical: the same researcher develops a long and complex questionnaire and later uses statistical analysis to summarize and reduce it to a small set of latent variables. This paradox in the process of producing a survey sometimes makes it very hard to obtain funding: the accumulation of data that are not used is in fact impressive. It is not easy to convince funding agencies that a long interview is needed and that many items will not be used by themselves, but only as elements of composite scales or latent variables.
Conclusion
Survey research is today at a crossroads, as has probably never been the case before. The online accumulation and availability of data (part of which falls under the ‘big data’ perspective: Wagschal and Ettensperger, Chapter 16, this Handbook) pose a series of serious questions to the paradigm of survey research. Can we do survey research as before? After the end of telephone interviews are we going to see the end of face-to-face interviews? If web-surveys and online interviews become the dominant way of conducting surveys, what about the developments of data bases for sampling internet respondents? What are going to be the ‘data’ collected by surveys? Still numerical codes, or texts, images, emotions? These are only a few of the many challenging questions that the digitalization of societies and the social sciences is raising.
Note 1 The survey is a very large panel study of French voters, called ENEF2017 (the National Electoral
Survey 2017). This major national election study started in the Fall of 2015 and continues to observe the same panel of French voters up to the present. The study is conducted by CEVIPOF in Sciences Po; see: https://www.enef.fr/
References Biemer, Paul, ‘Total survey error. Design, implementation, and evaluation’. Public Opinion Quarterly, Vol. 74, No. 5, 2010, 817–848. Groves, Robert M., Floyd J. Fowler Jr., Mick P. Couper, James M. Lepkowski, Eleanor Singer, and Roger Tourangeau. Survey Methodology. Hoboken, NJ: Wiley, 2009 (second edition). Krosnick, Jon, Paul J. Lavrakas, and Nuri Kim. ‘Survey research’. In: Harry T. Reis, Charles M. Judd (eds.). Handbook of Research Methods in Social and Personality Psychology, Cambridge: Cambridge University Press, 2014 (second edition).
Krotki, Karol, ‘Sampling error’. In: P. J. Lavrakas (ed.), Encyclopedia of Survey Research Methods. Thousand Oaks, CA: Sage Publications, 2008. Lavrakas, P. J. (ed.). Encyclopedia of Survey Research Methods. Thousand Oaks, CA: Sage Publications, 2008. Lazarsfeld, Paul F. and Morris Rosenberg (eds.), The Language of Social Research. A Reader in the Methodology of Social Research. Glencoe, Ill.: The Free Press, 1955. Nunnally, Jum. C. Psychometric Theory. New York: McGraw-Hill, 1978 (second edition). Weisberg, Herbert F., ‘The methodological strengths and weaknesses of survey research’. In: Wolfgang Donsbach, Michael W. Traugott (eds.). The SAGE Handbook of Public Opinion Research. London: Sage, 2008. Young, Michael, ‘Post-survey adjustments’. In: Paul J. Lavrakas (ed.). Encyclopedia of Survey Research Methods. Thousand Oaks: Sage, 2008.
PART III
Political Sociology
29 Clientelism Herbert Kitschelt
Introduction
Political clientelism involves a specific form of coordination between an aspiring political office holder (the ‘patron’) and constituency supporters (the ‘clients’). To win or sustain political office, the patron bestows or promises to bestow targeted benefits on clients, or to avert losses from them, in return for their support of her political bid. These advantages accrue to clients as individuals or members of small groups, typically in a geographically narrowly confined locale. No club or collective goods are produced. Clientelism thus constitutes one of several varieties of electoral ‘accountability mechanisms’. Such mechanisms encompass any conceivable actions undertaken by aspiring political office holders or candidates in the expectation of mobilizing citizens’ support. Political clientelism involves an intertemporal relationship of double contingency, where either side may be tempted to defect once the other has made its move, even though
the capacities to exit may be asymmetrically distributed. There is no third-party enforcement of the exchange, and the participants’ efforts to create safeguards or turn clientelism into a self-enforcing relationship are always precarious. Clientelism therefore tends to be a ‘leaky bucket’ with plenty of losses incurred by suckers – patrons or clients – whose contributions are not reciprocated. But for want of feasible alternatives, time and again clientelism may persist, at least as one of several modes of political coordination. Early students of clientelism investigated diffuse political–economic relations of dependence between (large) landholders (as patrons) and agrarian producers (as clients). Clientelism became a subject of genuinely political research with some lag. Within the political domain, subspecies of clientelism involve administrative or electoral exchange. In the former, the patron may reach or maintain her desired position along a number of different pathways (e.g. kinship, personal allegiance, educational skills),
with the exception of formal election from a pool of competitors. In the latter, patrons seek office through mobilization of support in electoral competition. Administrative clientelism may occur in any form of political regime. Electoral clientelism involves competitive elections, but does not presuppose full-fledged democracy. Purely electoral democracies with limited civil and political liberties (‘illiberal’ electoralism), or even competitive authoritarian regimes, may practice electoral clientelism. This chapter builds on previous literature reviews of clientelism (especially Kitschelt and Wilkinson, 2007; Stokes, 2007; Hicken, 2011) and will not venture into a broader consideration of distributive politics (Golden and Min, 2013).1 By focusing on research since about 2005, it features areas of intellectual advance in the study of political clientelism, particularly in our understanding of the micro-logic and transactional process involved in clientelistic exchange. But it will also highlight continuing lacunae of clientelism research, particularly in understanding the organizational level of clientelistic transactions – such as ‘machine’ politics – as well as the forces that promote the rise or demise of clientelism as an important accountability mechanism. The first two sections concern conceptual issues of clientelism as an accountability mechanism. As a genus in a taxonomic hierarchy, how does clientelism relate to other genera within the family (or domain) of political accountability relations? And what is the conceptual relation between the genus electoral clientelism and its various expressions (‘species’)? The third section turns to organizational patterns of exchange between clients and patrons, and particularly their mediation by brokers within the frame of political party organization. The fourth section deals with inter-party competitiveness and the targeting of clientelistic benefits on “core” or “swing” voters. 
The fifth section addresses five different conditions and
mechanisms promoting the rise and decline of clientelism. It goes beyond the conventional focus on economic development. The sixth and seventh sections reverse the causal perspective: If electoral clientelism is a prominent accountability mechanism deployed by citizens and politicians, what sorts of consequences may this have for questions of economic distribution and social welfare?
Clientelism: Genus within the Family of Political Accountability Mechanisms
Students of clientelism typically mention local club goods, programmatic policies and personal candidate traits as further genera within the family of political accountability mechanisms. Stokes et al. (2013) offer a systematic typology of distributive accountability mechanisms that includes clientelism (targeted to individuals, contingent on client performance: ‘quid pro quo’), local club goods (pork: targeted, but without conditionality of award) and individual constituency service (also targeted, no conditionality), as well as programmatic politics (distribution under public general rules: neither targeted nor conditional). In the broader behavioral perspective on linkage mechanisms, this leaves out unreflective and affective modes of citizen–politician coordination that occur through party identification and candidate traits. It is also not clear where ascriptive socio-cultural group identification would come in, net of considerations of instrumental group payoffs. Unthinking party identification and rote habitual support may, in fact, be the quantitatively most important components of citizens’ vote choice functions, even in advanced knowledge societies. Cultivating party identification and brand recognition may therefore often be politicians’ best option to maintain the perception of political accountability among their constituents.
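The typology attributed to Stokes et al. (2013) in this section rests on two criteria, targeting and conditionality. The sketch below encodes the summary given in the text as a small decision rule. The function name, the `recipient` argument and the label for the residual cell are illustrative assumptions, not part of the original typology.

```python
def classify_mechanism(targeted: bool, conditional: bool,
                       recipient: str = "individual") -> str:
    """Classify a distributive mechanism by targeting and conditionality.

    `recipient` ("individual" or "group") is a hypothetical helper used only
    to separate the two targeted, non-conditional categories named in the text.
    """
    if targeted and conditional:
        return "clientelism"
    if targeted and not conditional:
        if recipient == "group":
            return "local club goods (pork)"
        return "constituency service"
    if not targeted and not conditional:
        return "programmatic politics"
    # Non-targeted but conditional: not described in the summary above.
    return "not covered by the typology as summarized here"


print(classify_mechanism(targeted=True, conditional=True))
print(classify_mechanism(targeted=True, conditional=False, recipient="group"))
print(classify_mechanism(targeted=False, conditional=False))
```

As the surrounding text notes, affective linkages such as party identification or candidate traits fall outside this two-criterion scheme altogether.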
Conceptual work, however, has not generated a comprehensive typology of political accountability mechanisms. Accordingly, investigations have usually focused on politicians’ and citizens’ use of a singular accountability mechanism, such as clientelism, or a pair of such mechanisms, as in the case of clientelism and programmatism. But the correct unit of analysis to understand citizens’ and politicians’ mutual coordination may be the comprehensive ‘linkage profile’ over all possible dimensions of political accountability. The linkage profile would be the N-tuple, or multi-dimensional vector, indicating the relative effort politicians and citizens make in coordinating with each available linkage strategy. It would require mapping the trade-offs, interdependence and interaction of political linkage strategies in the ‘portfolio’ of a politician (or party) in generating or sustaining a support constituency. So far, little research has mapped complex linkage profiles. Often theory and even empirical research instruments presume a simple trade-off between clientelistic and programmatic linkage (e.g. Keefer, 2007).2 But it is hard to nail down the mechanisms that may make this trade-off compelling (Kitschelt, 2000: 853–5).3 Which voters to target with clientelistic inducements may in fact depend on a party’s programmatic linkages to those voters: perhaps the most efficient targets of clientelistic benefits are voters mildly opposed to a party’s program, but receptive to clientelistic benefits that tip their balance of considerations in partisan vote choice (Gans-Morse et al., 2014). Parties may serve different constituencies with different linkage benefits (Luna, 2014). It may therefore be useful to think of parties as assembling ‘portfolios’ of linkage strategies to attract different voters (Diaz-Cayeros et al., 2016; Calvo and Murillo, 2019) or linkage diversification that may appeal to the same voters.
Historically, European ‘left’ working-class parties often attracted voters with clientelistic benefits (e.g. subsidized public housing in big cities) as well as political ideology. In early 21st-century
politics, however, parties programmatically on the ‘right’ tend to deploy clientelism with greater effort (Tzelgov and Wang, 2016). Behaviorally, clientelistic networks may persuade voters to adopt their benefactor party’s programmatic ideology. Politicians may even use their selective management of clientelistic exchange, for example in treatment of ostracized ethnic minorities, to send programmatic signals to their voters in the majority ethnic group (Mares and Young, 2019). Ignoring the interaction of clientelism with the deployment of other accountability mechanisms in citizen–politician linkages may become problematic when empirically appraising the impact of clientelism on voting behavior and the popular support of political parties. A similar effort in providing clientelistic benefits or sanctions may yield different electoral impact, contingent upon how politicians also engage in an array of other appeals (programmatic, charismatic, identity based, etc.). In vote choice functions as well, analysts would want to explore the interaction of clientelistic and other accountability efforts in shaping citizens’ support for a party and electoral choice. Given that alternative linkage mechanisms often do not come into view in clientelism studies, investigations of the electoral impact of clientelism may lack a baseline in assessing the magnitude – in relative and absolute terms – of clientelistic efforts on a candidate’s or party’s support. If voters who indicate they are recipients of clientelistic benefits have a 3–5% greater propensity to vote for one party over others, is that a small or a large effect? An answer to this question depends on the size of the effect that other mechanisms have on the same electorate, as well as on the interactions between the clientelistic effort and other efforts that may yield indirect pathways magnifying (or reducing) the overall substantive impact of clientelism on the distribution of partisan support. 
More sophisticated research in this regard is needed that considers accountability mechanisms as
complex ensembles (multi-dimensional vectors?) in which the value of the clientelistic effort is one term or dimension that interacts with other terms and dimensions. The electoral effect of clientelism, then, depends on the interaction with other linkage mechanisms, as well as the organization of partisan mobilization and the competitive context in which clientelistic linkage occurs.
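The argument that the electoral effect of clientelism depends on its interaction with other linkage mechanisms can be made concrete with a stylized vote choice function. The sketch below is only an illustration under stated assumptions: the logistic form, the two linkage dimensions and all coefficient values are hypothetical and are not estimates from any study cited in this chapter.

```python
import math


def vote_probability(clientelistic, programmatic,
                     b0=-0.5, b_c=0.2, b_p=0.6, b_cp=0.3):
    """Probability of supporting a party under a logistic vote function.

    The term b_cp * clientelistic * programmatic is the interaction: it lets
    the effect of a clientelistic benefit vary with programmatic appeal.
    All coefficients are hypothetical.
    """
    index = (b0 + b_c * clientelistic + b_p * programmatic
             + b_cp * clientelistic * programmatic)
    return 1.0 / (1.0 + math.exp(-index))


def clientelism_effect(programmatic):
    """Change in vote probability when a voter receives a benefit (0 -> 1)."""
    return (vote_probability(1, programmatic)
            - vote_probability(0, programmatic))


effect_low = clientelism_effect(programmatic=0)   # weak programmatic linkage
effect_high = clientelism_effect(programmatic=1)  # strong programmatic linkage
print(round(effect_low, 3), round(effect_high, 3))
```

With these hypothetical coefficients the same clientelistic benefit shifts the vote probability by roughly 0.05 where programmatic linkage is absent and by roughly 0.12 where it is present, which is why a single reported effect size is hard to interpret without a baseline for the other linkage mechanisms.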
The Impact of Clientelistic Appeals Relative to Other Modes of Political Accountability
Only a limited and fragmentary stock of empirical research has addressed the electoral impact of clientelism. First, some studies report how parties allocate clientelistic benefits – such as resources for targeted social programs – across constituencies in order to estimate effects on turnout or voting. The most sophisticated analysis in terms of estimating a rather substantial direct effect of clientelism relative to other linkage mechanisms on voting behavior is Diaz-Cayeros et al.’s (2016) investigation of the influence of targeted social policies on subnational voting in Mexico, albeit with district-level rather than individual-level data. This research typically yields statistically significant but substantively modest effects of clientelism on vote choice, but does not fully calibrate the relative effect compared to partisan effort regarding other accountability mechanisms. A second strategy to assess the impact of clientelistic effort comes from experiments to persuade voters not to sell their votes (Vincente, 2014). Persuasion tends to result in lower voter turnout, especially among poor people, and increased survival of office incumbents, as it is difficult to motivate voters in the absence of strong programmatic and other linkage appeals. Also, where public audits of local finances make targeted benefits risky for politicians, voter turnout drops substantially (Hidalgo and Nichter, 2016).
Third, in survey research on vote choice itself, Weghorst and Lindberg (2013) show for Ghana that clientelism may be a consideration in people’s decisions, but not the dominant one among the several included in their investigation. Unobtrusive survey list experiments where social desirability bias cannot interfere with respondents’ propensity to report clientelistic influence on their vote choice tend to yield higher estimates of the impact of clientelistic considerations, albeit with large contextual variance (Corstange, 2016; Kramon, 2017; Mares and Young, 2019). But programmatic considerations rarely outperform clientelistic motivations. Fourth, observation of party operatives engaging in clientelistic exchange suggests what is widely known, namely that clientelism – at least as an electoral strategy (see the following section of this chapter) – is a ‘leaky bucket’ in which politicians’ resources are dissipated, as clients opportunistically pocket resources without delivering support. Indonesia provides some of the starkest examples (Aspinall and Berenschot, 2019). Fifth, a final strategy to explore the effectiveness of clientelism asks country experts to assess both the efforts parties are making in transacting clientelistic benefits with targeted voters and the effects of such efforts (Kitschelt and Altamirano, 2015). In an 88-country/506-party comparison, experts attribute weak effectiveness of parties’ clientelistic efforts typically to parties that lack encompassing associational networks to sound out and informally monitor their constituencies. Overall, efforts to conceptualize the role of clientelism within a set of linkage mechanisms of political accountability, as well as the measurement and appraisal of the effect of clientelism on vote choice, often suffer from insufficient inclusion of the full array of linkage mechanisms and the relations between different mechanisms in whole profiles or ‘portfolios’ of party accountability.
For this reason, it is also difficult to appraise the magnitude of a clientelistic impact on voting.
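The list experiments mentioned in this section are usually analyzed with a difference-in-means estimator. The sketch below follows the standard design, which the chapter itself does not spell out: a control group reports how many of several non-sensitive items apply to them, a treatment group sees the same list plus the sensitive item (for example, having received a gift from a party), and the difference in mean counts estimates the share of respondents to whom the sensitive item applies. The response counts are invented for illustration.

```python
import math
import statistics


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means estimate of the sensitive item's prevalence.

    Returns the estimate and its standard error (unequal-variance formula).
    """
    estimate = (statistics.mean(treatment_counts)
                - statistics.mean(control_counts))
    standard_error = math.sqrt(
        statistics.variance(treatment_counts) / len(treatment_counts)
        + statistics.variance(control_counts) / len(control_counts)
    )
    return estimate, standard_error


# Hypothetical item counts reported by respondents in each group.
control = [1, 2, 2, 1, 3, 2, 1, 2]    # list of non-sensitive items only
treatment = [2, 2, 2, 1, 3, 3, 1, 2]  # same list plus the sensitive item

estimate, standard_error = list_experiment_estimate(control, treatment)
print(estimate, round(standard_error, 3))
```

Because no respondent reveals which items apply, the design trades individual-level information for protection against social desirability bias, and realistic samples need to be far larger than this toy example for the estimate to be precise.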
Varieties of Clientelism
If clientelism is the genus, what are the species of this political accountability mechanism? Classic political–anthropological investigations conceptualize clientelism as (1) broad-based support of a number of clients for a single patron that involves (2) durable, (3) dyadic and personalistic, as well as (4) asymmetrical–hierarchical and monopolistic, but also both coercive as well as norm-infused, allegiance to (5) a single patron in exchange for protection from existential economic threats. As long as all of these elements are in place, the risk of bilateral opportunism – that either patron or client would defect from their exchange – is remote. But clientelism now comes in multiple forms that vary along a range of dimensions, with different implications for political opportunism of clients and patrons. First, the activities profile of patrons’ support for political clients may be varied. Clientelism may concern just the buying of votes or of clients’ abstention from voting (Nichter, 2008). But it may also encompass a broad range of diverse activities to sustain political parties and their lead candidates. Then there may be more sublime, diluted forms of clients’ contributions. Material inducements may be just signals that purchase the attention of an audience to a candidate and update clients’ perceptions of a candidate’s credibility and competence (Kramon, 2017). By luring citizens to rallies with incentives, patrons may also enhance citizens’ receptiveness to persuasion by a candidate (Muñoz, 2019). Opportunism in the clientelistic exchange is no problem in attention and persuasion buying, or in abstention or participation buying, but it is a major concern in vote-buying activities, where vote secrecy and the Australian ballot may bar the patrons from direct observation of their clients’ actions.
Second, the range of goods and services provided by the prospective patrons varies widely. Patrons may provide simple services such as campaign entertainment and material gifts (e.g. food, clothing, building materials, appliances), including money. But they may also generate more substantial benefits such as patronage jobs, social benefits (subsidized housing, disability pensions, scholarships), procurement contracts or regulatory easements. Third, the temporality of the clientelistic exchange varies widely. While early students of clientelism focused on durable clientelistic exchange relations, more recent research has often focused on single-shot vote buying instead of the more ‘relational’ lasting modes of exchange (Nichter, 2018). Empirically, Indonesian clientelism tends to emphasize single-shot clientelism (Aspinall and Berenschot, 2019: 98–106), whereas in countries such as Argentina, Brazil or Mexico, sustained, relational clientelism may be more prominent (cf. Diaz-Cayeros et al., 2016; Nichter, 2018). Iteration and durability of exchange relations cut down on the participants’ opportunism, as they reduce the need for and the costs of direct monitoring of transactions in favor of indirect control through social networks and party organization. Normative commitment mechanisms to reciprocate a clientelistic gift may sometimes contain client opportunism in single-shot exchanges (Finan and Schechter, 2012; Lawson and Greene, 2014). Politicians may target different clientelistic constituencies with goods and services that involve differential temporality. A fourth attribute along which clientelism may vary has to do with scale, that is, whether the targeted clients are individuals (‘retail clientelism’) or (in)formally organized and spatially concentrated groups (‘wholesale clientelism’), such as firms with offices and factories, residential neighborhoods or economic or cultural associations (trades, unions, churches, ethnic networks).
In wholesale clientelism, patrons can delegate the task of monitoring compliance of individual citizens
to collective managers/organizers of the client group itself. Spatially concentrated group voting may allow indirect compliance checks by monitoring small-scale precinct and ballot station electoral results (Rueda, 2015). A fifth dimension of variance concerns the sources of clientelistic largesse. Is it asset-rich but vote-poor patrons who out of personal private assets channel resources to asset-poor but vote-rich clients (Kitschelt, 2000: 849), or is it politicians deploying public resources for clientelistic purposes? There is some theorizing about the consequences of private rather than public supplies of clientelistic resources (Medina and Stokes, 2007), but it is not highly elaborated, nor supplemented by empirical investigations. A sixth dimension of clientelism characterizes the social organization of clientelism to which the next section is devoted. It ranges from decentralized, candidate-centered social networks to formal ‘machines’ with external membership boundaries and internal division of labor, hierarchy of control and career patterns of mobility of individual actors across different offices. Finally, participants may perceive clientelistic exchange as a more coercive, affective or remunerative, disenchanted affair. The classical anthropological literature on clientelism prominently featured the combination of coercion conceived as clients’ lack of outside options to seek scarce resources intertwined with normative–affective bonds. Mares and Young (2019) emphasize that recent clientelism research has mostly focused on the remunerative–utilitarian benefit calculations of clients in accepting and acting on material inducements. Such research has thereby neglected the coercive, options-removing impact of clientelistic linkage mechanisms. It needs to be kept in mind, however, that what is a coercive threat or a material benefit may be a matter of perception. 
The prospect of losing a benefit (say, a subsidized apartment or a public sector job) triggers perceptions of coercion putting an actor into the domain of severe losses whereas the prospect of gaining
an identical benefit due to clientelistic allegiance puts an actor into the domain of gains and remuneration. Nevertheless, in line with prospect theory, the perceived threat of loss is probably greater than the gain experienced by anticipating a clientelistic benefit. Pushing a bit back against emphasizing coercion, however, some clientelism research has asserted the capacity of clients to shape the relationship and assert their interests as an under-appreciated aspect of the exchange (cf. Nichter, 2018; Pellicer et al., 2018; Taylor-Robinson, 2010). If these dimensions of clientelism vary broadly, do attributes of them ‘hang together’ in predictable clusters? The jury is out, but based on a cluster analysis of attributes reported in 60 fine-grained ethnographic descriptions of clientelistic transactions in the recent ethnological literature, Pellicer et al. (2018: 14) recover three of four types of clientelism one might distinguish in terms of duration, type of exchange and scale of actors. These types may also associate with different forms of funding sources and organization, but probably not with coerciveness, an ever-present attribute that may shape relations in any type of clientelism. First, there is individual-single-shot clientelism that targets individual voters as recipients of gifts and purchases, but also acts of regulatory forbearance (Holland, 2017: 14–20). It involves vote and abstention buying, but also the sublime forms of informational and persuasive clientelistic attention buying. It may be more likely to be funded by private patrons and involve little partisan organization in candidate-centered operations. Where single-shot vote-buying is most pronounced, party institutionalization tends to be empirically least entrenched (Driscoll, 2018). Political opportunism and problems with the ‘leaky bucket’ of clientelism may make this form of linkage only weakly effective in terms of generating votes. 
Second, individual–relational clientelism may also use gifts and vote-buying, but its comparative advantage lies in the deployment
of patronage jobs (e.g. Robinson and Verdier, 2013) and social benefits. To sustain the effort, patrons tend to rely on state resources more than private donations and build political machines to funnel resources in a predictable and resilient fashion. The durability of these clientelistic exchanges makes them approximately ‘self-enforcing’, that is, it minimizes the need for explicit monitoring. Third, in collective (‘wholesale’)–relational clientelism, public works procurement contracts establish an intertemporally extended relationship between politicians and corporate organizations of clients, such as producer firms, neighborhood organizations or professional groups. By virtue of collective organization, clients may have considerable leverage over (competing) politicians, while credibly vouching for their ability to deliver support and prevent defection in their own ranks. As a residual category, collective single-shot clientelism would involve political patrons and corporate clients exchanging one-time commitments. Patrons may deliver procurement contracts and forbearance, but most regulatory relations may involve iterative relations. And forbearance may rarely take the form of clientelistic conditionality (Holland, 2017: 307–8). This brief sketch of the relationship between modes of clientelism and the empirical activities they involve brings to the fore that the social organization of the exchange is critical to determine whether clientelism effectively organizes political accountability relations. This is exactly the subject on which much recent research has been performed in the clientelism literature.
Organizational Design of the Clientelistic Exchange
Because clientelism opens the door to bilateral opportunistic defection by clients and/or patrons (e.g. Corstange, 2016: 30), a great deal rides on the organization of the exchange.
Investigations have emphasized the centrality of social networks that link clients and patrons, and particularly of brokers as nodes in those networks, in explaining the micro-logic of how clientelistic exchange can be sustained. But the generic notion of broker hides great differences in the roles which investigations have assigned to intermediaries between politicians and voters. Brokers are conceived to be intermediaries who have specialized knowledge about prospective clients that allows candidates (patrons) to fine-tune the targeting of benefits. Conversely, clients may use brokers as competent specialists to press for benefits. Brokers may also serve as mediators between layers of governance. At the same time, brokers deploy their comparative advantage to advance their own personal welfare. Brokers may divert the patron’s resources to their own consumption and career aspirations (cf. Stokes et al., 2013: chapter 3; Szwarcberg, 2015). Let us distinguish six different brokerage roles. They are based on brokers’ differential insertion in their communities’ socio-economic and cultural fabric and their revenue dependence on one, or on several, political parties.
Chiefs and traditional notables: Often based on kinship and descent, and only sometimes on ethnic affiliation, these are people of high status, with nodes of dense social ties, and proven organizers of local club good production. In competitive party systems, such ‘development brokers’ may wish to keep an arm’s-length relationship to partisan politics and not become clientelistic ‘electoral brokers’ (Baldwin, 2016).
Economic brokers: Business people (and even criminal gangs) deliver votes of employees and captured neighborhoods in exchange for payments, procurement contracts, regulatory favors or forbearance. Such commercial brokers are indifferent to the politicians’ partisan brands and ideological constituencies.
Service brokers: These are facilitators of citizens’ dealings with public bureaucracies who intermittently also happen to favor
486
The SAGE Handbook of Political Science
political candidates. Particularly under competitive conditions, service brokers only weakly associate with parties on a case-bycase basis, as they are selected by their clients primarily because of their competence in getting things done (Auerbach and Thachil, 2018). Drawing on external service brokers and delivering targeted, contingent benefits may be a strategy for parties to broaden their electoral appeal beyond existing constituencies that can be mobilized based on more programmatic appeals (Thachil, 2014: chapters 4 and 5; Luna, 2014). Community organizers: They derive power from bargaining with competing politicians over provision of services in exchange for community-level vote blocs (e.g. Auerbach, 2016). Their community organization of citizens stages an iterative game with politicians. Multi-partisan brokers: They are service brokers for politicians and they have the primary material objective to deliver votes. Such brokers aim to arrange short-term vote buying or electoral ‘success teams’ (e.g. Aspinall and Berenschot, 2019: 102–6). These are also operatives who help politicians buy attention and receptiveness to persuasion. Machine brokers: These are card-carrying politicians in party machines dedicated just to one party. They may have static ambitions to preserve their office and a flow of assets into their personal coffers, or they may have progressive ambitions to move up in a hierarchy of party offices or a subsidiary network of stakeholders and ultimately in order to become principal electoral office holders themselves (cf. Gingerich, 2013). There are game theoretical accounts of how patrons’ efforts to restrain broker opportunism are empirically most plausible in case of machine brokers permanently embedded in political parties. 
Not by chance, many of the studies in which brokers inside parties loom large are preoccupied with the most institutionalized parties in developing countries, such as historically in Argentina (Stokes et al., 2013; Szwarcberg, 2015; Weitz-Shapiro, 2014) or Mexico (Diaz-Cayeros et al., 2016; Larreguy
et al., 2017). They also make a strong showing in post-communist countries that show resilient organizational continuity with previous Communist ruling parties, particularly as diversified democratic linkage strategies (cf. Kitschelt and Singer, 2018). What the literature on clientelistic party organization is missing, however, is an analytically thorough conceptualization and theoretical treatment of the diversity of brokerage relations. Attempts to venture in this direction are limited and confined to inferences drawn from comparative studies of two or three countries’ party systems (e.g. Gingerich, 2013). But there is no systematic analysis of the different contextual contingencies that would give rise to differential brokerage systems. There is also little effort to analyze the organization of parties that deploy and promote brokers or at least deal with them on an intermittent basis. What is broadly missing is an analysis of different political recruitment and career patterns of brokers and patrons, whether in integrated ‘machine’ parties or in more personalistic, decentralized horizontal political and social networks. Likewise, we know little beyond descriptive single-party monographs about the theoretical logic of clientelistic party governance and leadership in devising political strategies and allocating party resources. For example, clientelistic machines appear to concentrate power in small leadership cliques more often than parties with a heavily programmatic linkage profile. Middle-level operatives and activists appear to have little say in them. In other words, among the three major tasks which parties in mass democracies may tackle – to mobilize voters, to screen and groom citizens for recruitment to political leadership positions, and to devise partisan strategies that feature the ‘brand’ of the label in interaction with competitors – the clientelism literature has hitherto addressed only one, that of electoral mobilization.
Parties gain credibility with a following only if they institutionalize leaders that force brokers to deliver resources to clients, but
simultaneously incorporate mechanisms to remove non-performing leaders who deceive their followers (Keefer, 2018). Clientelistic parties rarely appear to satisfy this second criterion, as their leadership coalitions seem to face few checks and balances inside the machines. But no systematic studies have probed the hierarchical leadership structures of parties that deploy clientelism as one important linkage mechanism.
Clientelism, Competition, Competitiveness
An electoral race is highly competitive for an electoral candidate or party if small changes in their voter support translate into large changes to their probability of winning office and their bargaining leverage over authoritative executive decisions. Competitiveness thus involves electoral risk, and therefore incentives to deploy the utmost effort to win an extra margin of electoral support, even if that is a slender slice. Beyond two-party systems, where the winner’s past or prospective vote margin is a measure of competitiveness, the literature has been beset by the difficulty that no adequate measure for multi-party systems, and those with multiple voting districts, has been identified. Volatility or party system fragmentation will not do. For this overview, however, conceptual questions of measuring competitiveness have to be bracketed. Competitiveness may come into play in deciding (1) how much emphasis to give to clientelism in a party’s linkage profile and (2) which kinds of constituencies to target.
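One way to make this definition concrete is sketched below. The notation ($v_i$, $P_i$, $\varepsilon$, $F$, $f$, $C_i$) is an illustrative assumption introduced here, not taken from the literature cited. It restates, for the two-party case only, the idea that competitiveness is the sensitivity of the probability of winning to small changes in voter support.

```latex
% Illustrative notation (assumed, not from the chapter):
%   v_i          expected vote share of party i in a two-party race
%   \varepsilon  random shock to that vote share, with CDF F and density f
%   P_i          probability that party i wins office
%   C_i          competitiveness of the race as seen by party i
\begin{align}
P_i(v_i) &= \Pr\!\left(v_i + \varepsilon > \tfrac{1}{2}\right)
          = 1 - F\!\left(\tfrac{1}{2} - v_i\right), \\
C_i &\equiv \frac{\partial P_i}{\partial v_i}
     = f\!\left(\tfrac{1}{2} - v_i\right).
\end{align}
```

If $f$ is symmetric and single-peaked around zero, $C_i$ is largest when the expected margin $v_i - \tfrac{1}{2}$ is close to zero, which is consistent with using the winner’s vote margin as a measure in two-party systems. The sketch does not extend to multi-party systems or to systems with multiple voting districts.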
Competitiveness and the Choice between Clientelistic Effort and Alternative Linkage
The literature on agrarian landlord–peasant clientelism usually assumed monopolistic
relations between patron and clients, and this has slopped over into some formal modeling of electoral clientelism (e.g. Medina and Stokes, 2007). In a competitive electoral situation, so conventional reasoning goes, politicians tend to invest more in programmatic and other non-clientelistic linkage strategies, because the marginal voter they need to recruit in order to win elections is unlikely to be integrated into social networks that enable clientelistic exchange by countering voter opportunism through subtle monitoring and sanctioning. Building party organization to police defection may be too expensive, so programmatic appeals may be overall cheaper.4 Empirically, however, clientelism often thrives in competitive situations. Where poor voters constitute much of the electorate, competitive contests induce politicians to redouble their clientelistic efforts, because they may not be able to credibly commit to the provision of collective and club goods (cf. Kitschelt and Wilkinson, 2007: 28–35; Weitz-Shapiro, 2014; Corstange, 2016). Competitiveness may enable a party’s activists to extract more patronage as they become pivotal in a party’s victory (Driscoll, 2018). Likewise, clients and brokers can threaten patrons with defection (Berenschot, 2019).
Targets of Clientelistic Effort under Conditions of High Competitiveness: Core or Swing Districts and Voters?
Upon initial consideration, under competitiveness clientelistic resources should be funneled to marginal (‘swing’) voters and districts where they make a difference because voters are close to indifferent between alternatives (e.g. Corstange, 2016: chapter 6). But a long-standing debate about partisan resource allocation to ‘core’ or ‘swing’ constituencies and districts suggests a complex, contingent relationship (for literature reviews see Golden and Min, 2013: 77–82; Stokes et al., 2013: 32–6; 131–43).
A core rather than swing strategy may work best when party machines with encompassing social networks restrain voter opportunism and use clientelistic resources to mobilize voters. Conversely, where most voters are unaffiliated, parties have little choice but to target swing voters, especially in urban areas with high volatility. In a dynamic perspective, parties with large machines may develop sufficiently long time horizons to use clientelistic resources to prevent today’s core voters from becoming tomorrow’s swing voters. At the aggregate geographical level, these efforts may be concentrated on districts with eroding party hegemony, as compared to stable monopolistic and already fully competitive districts (Diaz-Cayeros et al., 2016: 81). A further contingency comes in through the presence of self-interested brokers who may help to attract voters, but also are prone to pocket resources. As Stokes et al. (2013) find in the case of Argentinian machine politics, patrons may target resources on marginal districts, but then let their brokers mobilize core voters within such locales. Patrons measure brokers’ performance in terms of their ability to turn voters out to rallies, and it is substantially cheaper for brokers to achieve this with core rather than swing voters. Only under conditions of most intense competitiveness may brokers be compelled to target expensive ‘swing’ voters.
Rise and Decline of Clientelism among Political Accountability Strategies
Recent clientelism research has advanced most in theoretically specifying the terms and the process of clientelistic exchange and close empirical analysis of transactions in specific settings. Relatively little has been achieved, however, in the realm of determining the causes and consequences of the contexts that shape the rise or decline of clientelism. A penchant for micro-level analysis makes
macro-phenomena recede from view. For example, localized studies have identified political mechanisms that make individual citizens prefer or steer away from clientelistic political exchange. But they suffer from a problem of ‘inheritability’. Politicians are likely to permit practices that erode or prevent clientelistic exchange only if it is in their interest. Where office-seeking politicians are incentivized to sustain clientelism, anyone who intends to deploy localized techniques to undercut it on a massive scale beyond sandbox randomized experimental designs will eventually run into severe political resistance from the powers that be. So it will be imperative to investigate the meso-party level and macro-institutional and political–economic conditions that give leverage to politicians who select non-clientelistic linkages or that ‘screen’ out politicians who stick to clientelistic strategies. Making headway in studying the macro-contexts that incentivize voters, brokers and patrons, however, is difficult because of a paucity of suitable macro-level data on clientelism, whether assembled through cross-national (and regional) voter surveys, expert surveys or other behavioral/experimental data. It would take large numbers of micro-level studies under different meso and macro-conditions to fully establish the micro–macro link.
Micro-Level Interventions: Experiments in Undercutting Clientelism
Some natural or field experiments in local settings have suggested conditions under which clientelism may recede. The critical mechanisms are informational and social network-related: new information and non-clientelistic messages about politicians’ activities may, at the margin, dissuade some voters from choosing among candidates for clientelistic considerations. However, the alternative messages that field experiments declare to be programmatic mostly amount to candidates’ advertising of local club goods. Such advertising involves valence
competition, not positional competition, and typically at a local level, without partisan brands organized around alternative positions to provide large-scale club goods that involve major trade-offs (e.g. taxes for national health care systems or education). In addition to informational treatments, material inducements or withdrawals may modify voters’ linkage conduct. Random distribution of water cisterns in drought-stricken Northeastern Brazil reduces recipients’ propensity to vote according to clientelistic inducements (Bobonis et al., 2017). Likewise, random local municipal budget audits by the national regulator in Northeastern Brazil drove down voter turnout among the poor, presumably because mayors no longer dared to allocate resources to clientelistic targeting (Hidalgo and Nichter, 2016). These experiments are insightful, but would have to be run in many iterations under massively varying meso and macro-level context conditions to be informative about the potential for aggregate shifts of large voter segments toward partisan linkage profiles that minimize clientelism. In many instances, politicians would fight against non-clientelistic practices. It is critical to establish where and when non-clientelistic linkage practices can gain dominance. Macro-relations of political power come into play that typically recede from view in micro-level studies.
Macro-Level Evidence I: Development and Political Institutions as Conventional Hypotheses
Conventional macro-level theory identifies the following conditions as detrimental to clientelism: economic development/high skill levels, universal suffrage, safeguards of the voting process against surveillance by party agents (secrecy, standardized Australian ballot, compulsory voting), electoral institutions favoring impersonal choice among partisan alternatives and the prevalence
of a professionalized high-capacity state apparatus. Evidence for all of these conditions comes primarily from case studies. But counterexamples suggest that these factors may only enhance the probability of a shift out of clientelism to a certain extent, at least when not occurring in concert. Stokes et al. (2013: chapter 7) have offered the most refined developmentalist account of the British transition away from clientelism in the late 19th century, detailing how supply-side costs of managing clientelistic exchange with agents became unsustainable through encompassing enfranchisement, while urbanization and social mobility made voters’ whereabouts and behavior intractable – and all of this happening while better-off voters demanded higher payoffs or the provision of large-scale club or collective goods. But in the 20th and 21st centuries, development may not be a sure recipe to defeat clientelism. Parties learned to build clientelistic machines and the state apparatus mobilized more resources to satisfy even exorbitant clientelistic demands. It is not by accident that empirical studies find a curvilinear relationship between affluence and clientelistic partisan linkages that begs for explanation (Kitschelt and Kselman, 2013; Diaz-Cayeros et al., 2016: 98). Institutionalist arguments run into problems, too. Suffrage extension and vote secrecy certainly have not eradicated clientelism in many 20th-century polities. British parties may simply have had insufficient time to build requisite organizations and social networks to preserve clientelism. Cases such as Britain may offer data points confirming the institutional argument, but many others do not (cf. Lawson and Greene, 2014 on Mexico). Party organization and social networks may compensate for inhospitable institutional incentives and preserve clientelism (Nichter, 2018). Electoral systems are a case in point. Change may sometimes, but not always, trigger a decline in clientelism.
Electoral systems – such as Brazil’s open-list proportional representation without vote pooling – look perfect for the entrenchment of clientelistic parties, but
then witness a surge of programmatic parties (cf. Hagopian et al., 2009). Changing popular linkage demands may induce electoral system change, rather than the other way around, as Japan (Rosenbluth and Thies, 2010) or Taiwan (Greene, 2007: 265–8) may suggest. If popular demand remains strong for clientelism, parties manage to adjust their organizations to compensate for any inconvenient electoral institutions, as Gingerich (2013: chapter 3) argues in a comparison of Bolivia, Brazil and Chile. Lack of state capacity, conceived as low professional insulation of the bureaucracy from political appointments, is another staple of the literature predicting clientelism. Low state capacity undercuts the credibility of politicians’ programmatic stances, and may thereby produce clientelism (Nathan, 2019). But the argument that the presence of high capacity before the advent of universal suffrage would protect democracies from clientelism does not even hold consistently in Western Europe (Kitschelt, 2007). Moreover, clientelism and patronage bureaucracies may be so intimately intertwined that it is often futile to assign one as the cause of the other (Kuo, 2018: 20). In more recent democratic polities, the establishment of professional bureaucracies may be endogenous to the fight of ‘progressive’, mostly middle class groups against clientelist machines. US civil service reform in the late 19th century may have undercut the spoils system of patronage (cf. Folke et al., 2011), but this process was intimately intertwined with the progressive movement. In many instances, therefore, the growth of state capacity and the decline of clientelism are part of the same phenomenon, not one providing an explanation of the other (Kuo, 2018: 20). Conversely, low state capacity is the cause and consequence of clientelism, as politicians resort to other linkage mechanisms when programmatic commitments are not credible (Nathan, 2019). 
Nevertheless, in the first encompassing cross-national analysis of more than 60 countries on the subject, Bustikova and Corduneanu-Huci (2017) establish that
polities with strong early state capacity formation, instrumented by infant mortality rates in the 1920s, show weaker clientelist linkages as recently as the early 21st century. The causal channel operates through trust in government institutions which is enhanced by early state capacity building. Also, mass-level public opinion data covering 31 countries confirm the robustness of the negative relationship between early state capacity before the advent of mass electoral politics and later electoral clientelism.
Macro-Level Evidence II: Ethnocultural Pluralism
Another predictor of clientelistic politics has been ethnocultural heterogeneity. Once again, there is little comprehensive comparative study of this question, and a plausible alternative, established in small-N country case studies (such as Corstange, 2016 or Nathan, 2019), is that it is really the resourcefulness of ethnic leaders as local chiefs, not their ethnic affiliation per se, that shapes partisan linkage strategies (Baldwin, 2016). Corstange’s (2016) evidence, based on small-N case studies and experiments, suggests that transactional mechanisms are more important than collective identity in shaping ethnic clientelism. It is primarily in situations of inter-ethnic competition, and with ethnic groups sustaining dense social networks that lower the transaction costs of monitoring and sanctioning, that clientelism is likely to take hold. Moving toward a broader confirmation of the general association between specific types of ethnocultural heterogeneity and clientelism, albeit in a purely observational study, Wang and Kolev’s (2018) 79-country comparison of clientelistic partisan efforts finds that ethnocultural pluralism boosts clientelistic linkage formation only if there are considerable disparities between the income averages of ethnic groups. And these disparities are associated with the presence of strong
within-group social networks that discourage opportunism in clientelistic exchange.
Undercutting Clientelism III: The Life-Cycle of Democracies and Parties
Proposed by Keefer (2007), a novel argument about clientelism’s rise and fall may be called the ‘life cycle’ theory. Young parties and party systems lack the track record of fulfilling programmatic commitments in past electoral terms. Without such credibility capital, therefore, voters discount parties’ programmatic appeals and prefer to be compensated with clientelistic inducements. This argument, however, may require modification, as clientelism cannot be created overnight: it requires deep organizational networks to raise resources, penetrate state bureaucracies and organize the distribution of targeted benefits to core constituencies. In such circumstances, parties with legacy roots in authoritarian regimes have an advantage in establishing clientelistic networks (Kitschelt and Singer, 2018). In ‘young’ democracies, new parties mobilize based on the personal name recognition or ascriptive qualities (ethnic, religious, linguistic…) of their leaders rather than clientelistic machine capacities. If anything, they may resort to diluted forms of clientelism such as information and persuasion buying (Kramon, 2017; Muñoz, 2019), as they lack the organizational capacities to establish meaningful material clientelistic exchange. Empirically, a modified life-cycle argument posits that young democracies abstain from clientelism because most young parties lack the social and organizational infrastructure, while old democracies back-pedal from clientelism because their brands have gained programmatic credibility. This version is at least more consistent with the empirical regularity of a curvilinear incidence of clientelistic partisan effort in poor, middle and high-income countries, a distribution that is also collinear with the democratic experience
of countries (Kitschelt and Kselman, 2013: 1469–70). But the underlying partisan strategies may be rooted more in political economy than a simple life-cycle model. Nevertheless, Keefer’s proposal to link age of democracy to linkage strategies has the important virtue of placing time and experience on the agenda of theorizing about democratic partisan linkage relations. It remains a fruitful template for future research.
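The curvilinear pattern referred to above can be illustrated with a small sketch. The data are invented for illustration and are not drawn from Kitschelt and Kselman (2013) or any other study; the names `log_income`, `clientelism` and `fit_inverted_u` are hypothetical. The sketch fits a quadratic by ordinary least squares and reports whether the fitted curve is an inverted U and where it peaks.

```python
def fit_inverted_u(xs, ys):
    """OLS fit of y = a + b*z + c*z**2 with z = x - mean(x).

    Returns (a, b, c, peak), where peak is the turning point on the original x scale.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    zs = [x - mean_x for x in xs]  # centering keeps the sums well conditioned
    s = [sum(z ** k for z in zs) for k in range(5)]  # sums of z^0 .. z^4
    t = [sum(y * z ** k for z, y in zip(zs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]  # normal-equation matrix

    def det3(mat):
        return (mat[0][0] * (mat[1][1] * mat[2][2] - mat[1][2] * mat[2][1])
                - mat[0][1] * (mat[1][0] * mat[2][2] - mat[1][2] * mat[2][0])
                + mat[0][2] * (mat[1][0] * mat[2][1] - mat[1][1] * mat[2][0]))

    d = det3(m)
    coefs = []
    for j in range(3):  # Cramer's rule: replace column j with the right-hand side
        mj = [[t[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]
        coefs.append(det3(mj) / d)
    a, b, c = coefs
    peak = mean_x - b / (2 * c)
    return a, b, c, peak


# Invented data: log GDP per capita and a clientelism score for 11 fictitious countries.
log_income = [6.0 + 0.5 * i for i in range(11)]
clientelism = [4.0 - 0.5 * (x - 8.5) ** 2 + (0.1 if i % 2 == 0 else -0.1)
               for i, x in enumerate(log_income)]

a, b, c, peak = fit_inverted_u(log_income, clientelism)
is_inverted_u = c < 0 and min(log_income) < peak < max(log_income)
print(f"quadratic term = {c:.3f}, peak at log income = {peak:.2f}, inverted U: {is_inverted_u}")
```

A negative quadratic term with a turning point inside the observed income range is a necessary but not a sufficient sign of an inverted U. With real data one would also need appropriate controls and a check that the slope is reliably positive below the peak and negative above it.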
Undercutting Clientelism IV: Political Governance of the Economy
Opponents of clientelism may hope that economic development will eventually generate a critical mass of educated professionals bent on eradicating clientelism. But this prospect postpones relief in many countries to the long run, and with detours along the way. Institutional remedies for clientelism appear to be inauspicious, as clientelistic politicians can work around them, when given time. If anything, institutional innovations tend to be endogenous to political and economic power relations. So why not focus on such power relations directly to uncover potential sources of change, net of long-term development? The curvilinear empirical relationship between development and the incidence of clientelistic partisan effort may provide an inductive clue (Kitschelt and Kselman, 2013 with cross-national evidence; Diaz-Cayeros et al., 2016 with subnational Mexican evidence). Middle-income countries may exhibit an economic structure and power configuration that is particularly conducive to clientelism, but also reveals its transitional nature that often results in sudden collapse. To reach middle-income status, polities typically build a rather strong extractive state that diverts resources from declining sectors (agriculture, natural resources) to growing industries (manufacturing, in the future services). Such developmental states broadly conceived are powerful
allocators of economic resources supported by an urban coalition of business and wage earners.5 As they grow fast, these states generate the resources to compensate declining sectors with clientelistic politics, while also providing the large-scale club and collective goods, such as physical infrastructure and social insurance systems, for the rising urban sectors. And the emerging urban professional class may be quite happy with clientelistic linkages, as long as developmental states generate a bounty of patronage jobs. It is thus not necessarily the rising proportion of poor people in developing states that may expand clientelistic demands (Stokes et al., 2013: 264), but the preferences of a state-centered middle class itself. This transitional window of middle-income development is thus the hour of diversified political party machines, sometimes emerging from authoritarian regimes (Kitschelt and Singer, 2018), that rely both on clientelistic and programmatic linkages. State capacity may be initiated and advanced even by leftist programmatic parties that push for social insurance (Grassi and Memoli, 2016). But also, such parties often combine clientelistic and programmatic appeals once in office. As developmental states, however, move closer to the global innovation frontier, they often experience extremely sharp economic and financial crises. Modern knowledge economies propelling sophisticated technologies do not thrive on patronage-based, heavy-handed political allocation of capital and labor, but require professionalism and market-based trial and error to succeed. The developmental state reaches its economic limits, and the ensuing crises undercut the foundations of clientelistic linkage building rooted in a middle-income stage of rapid catch-up economic growth. There is a cumulative case study literature that has associated the decline of clientelism with the crisis of developmental states, broadly conceived.
This literature covers the failure of import-substituting industrialization in Latin American countries
such as Mexico and Brazil starting with the Lost Decade of the 1980s (Greene, 2007; Hagopian et al., 2009), but also accounts for change in Japan after the bursting of the economic bubble (Rosenbluth and Thies, 2010) or the 1998 financial crisis in South Korea and Indonesia (Kenny, 2017). Even in today’s advanced capitalist democracies, the exigencies of market efficiencies at different points in time led to business demands for progressive reforms (Kuo, 2018) or the demise of statist business sectors with patronage governance, such as Italy and Austria in the 1980s (Kitschelt, 2007). Future research will tell whether the advent of business crises in countries situated at the upper ceiling of middle-income development provides an analytically powerful general argument to pinpoint the likely demise of clientelistic linkage practices more generally.
Undercutting Clientelism V: Partisan Competitiveness as Game Changer?
Where demand for clientelism is high in the linkage profiles of parties, intensifying competition will redouble parties’ requisite efforts. But can increasing competitiveness sometimes be a game changer to switch political strategies to emphasize other, and particularly programmatic, linkage strategies? An affirmative answer appears to be Geddes’ (1991) claim, at least for two-party systems. Intense competition means that both parties are uncertain whether they can retain power. Why not, when prospectively losing office, preemptively remove patronage from the toolkit of partisan struggle, and thus make it less likely that a successor can entrench himself in power? Geddes’ game theoretical model requires a slight exogenous ‘nudge’ to get the process under way, at the end of which politicians of rival incumbent and opposition parties jointly depoliticize and professionalize a government bureaucracy. Moreover, bureaucratic professionalization has empirically proved to
be reversible. Could it be possible that a firm exogenous ‘nudge’ away from clientelism has more to do with the crises developmental state governance reaches when it bumps into the ceiling of upper middle-income status? Or does it involve different, yet to be identified political mechanisms in future investigations of linkage mechanisms? In any case, it is unlikely that increased partisan competitiveness, by itself, is the deus ex machina that will tip the balance of changes in partisan linkage profiles. High competitiveness may enhance, but not fundamentally alter, existing profiles of partisan linkages in a polity.
Linkage Strategies and Political–Economic Outcomes: Security, Equality, Economic Growth
Clientelism in general, and patronage and social benefits specifically, provide economic hedging strategies against existential risks where comprehensive formula-based welfare state social programs do not exist (Abente Brun and Diamond, 2014). But is clientelism a substitute for social policy programs, compensating at least in part for encompassing welfare states, or does clientelism reduce welfare outcomes, net of social policy effort? It is common, but empirically largely unproven, folk wisdom that the prevalence of clientelistic partisan linkage mechanisms will lead to (1) an undersupply of collective goods; (2) worse social outcomes (child mortality and life expectancy, literacy/skill levels, social satisfaction); (3) comparatively weaker economic growth; and (4) deepening income and wealth inequality. Literature reviews report the corresponding hypotheses (Hicken, 2011: 302–5), but can point to precious little systematic empirical evidence to substantiate any of these claims. The presumed mechanism in all of them is that the rich treat clientelistic inducements as a cheap way to produce social compliance of
the poor. Formulaic, (un-)conditional social insurance programs (pensions, health care, unemployment), means-tested income support and universal social services (health, education, housing) would be more expensive and redistributive. The presence of clientelism may hence impact different aspects of social policy. First, it may weaken the mobilization of social policy preferences, particularly demand for redistribution and its collective mobilization. Second, clientelism may depress spending and administrative labor input in social policy programs. Third, it may affect the quality of the service outputs of social policy programs (e.g. percentages of school cohorts graduating, numbers of patients treated). Fourth, it may degrade the outcomes of policy in terms of material life chances of recipients. As an initial step, investigations may explore whether robust empirical correlations can be established between partisan linkage profiles and the different aspects of social policy, once appropriate controls for development, exogenous elements of state capacity, cumulative democratic experience and exogenous institutional features have been applied. Only then would it be worth probing into the causal direction that associates clientelism with social policy features. The trouble is that existing investigations have not even established the existence of robust correlations among the phenomena of interest in a comprehensive fashion. Only limited fragmentary observational and natural experimental evidence lends indirect support to some of the political–economic clientelism accounts. With regard to redistributive preferences, Holland (2018) finds that in truncated Latin American welfare states, citizens at all levels of income express only muted redistributive preferences. This suggests that a dominance of clientelistic politics over redistributive programmatic partisan contestation may lock in high inequality.
Taylor-Robinson (2010: 63) theorizes, however, that the inability of poor clients to extract benefits from the political process may be contingent upon institutional
features that shape the responsiveness of political elites to poor people’s demands. Under conditions of competitive elections with open-list proportional representation, poor voters may gain more leverage over their legislatures. Comparing voter satisfaction before and after an electoral system switch in Honduras establishes some circumstantial evidence for this proposition. Finding an empirical trade-off between policy preferences favoring social policy and/or welfare state programs, on one side, and the presence of clientelism, on the other, is insufficient to establish that clientelism worsens social policy outcomes. A negative correlation between redistributive preferences or social policy outlays and clientelism might simply suggest that clientelism is a ‘functional equivalent’ to achieve social risk reduction for the poor, who therefore have no reason to intensify their demands for political change. To prove the role of clientelism in undercutting social hedging facilities, a strong association of clientelism with the incidence of worse life chances (mortality, morbidity, literacy, female empowerment, etc.) and income or wealth distribution, as well as subjective correlates such as lack of trust in institutions and political elites, would need to be empirically established. Or, in the realm of schooling, it would have to be demonstrated that clientelism leads not just to lower budgets and worse teachers, but also to lower student performance and labor market aptitudes. Only very few studies examine the association between clientelistic practices, policy outputs and outcomes. In what is probably the methodologically most sophisticated analysis of the effect of clientelism on policy outcomes, Diaz-Cayeros et al.
(2016: chapters 5 and 6) establish that unlike non-clientelistic policies, the Mexican clientelistic Pronasol social infrastructure and family benefits program had no positive effects on the local presence of predictable utility services (water, power, sewer) and on change rates of infant mortality. The result remains robust once the different poverty levels of communities, their competitive
political circumstances and reverse causality mechanisms are taken into account. In the realm of education policies, at least two studies have established an association between clientelistic practices, outputs and outcomes. Hicken and Simmons (2008) show that where electoral systems are in place that promote parochial accountability and induce politicians to engage in localized club goods provision, countries spend less on education and experience higher illiteracy rates. With a more direct measure of partisan clientelism, Chen and Kitschelt (forthcoming) show for a set of 31 or more mostly middle-income developing countries that a strong clientelistic partisan effort coincides with a crowding out of educational expenses (outputs) and a degrading of educational outcomes, as measured by high school mathematics and science test scores, and net of educational expenses and outputs (degrees earned). The most important outcome that students of clientelism might want to identify is a close relationship between clientelistic partisan linkages and income or wealth inequality. Two studies comparing local or state level units in India claim to have found just that relationship (Anderson et al., 2015; Markussen, 2011), but they draw inferences from highly indirect evidence about the relationship between inequality of land holding and wage levels (Anderson et al., 2015) and party affiliation and social benefits access (Markussen, 2011). The one direct cross-national empirical analysis of the association between inequality and partisan clientelism by You (2015: 235–8) finds a robust correlation between the two, but actually aims to prove with instrumental variable analysis that inequality causally fosters clientelism. But the correlation between clientelism and inequality is vulnerable to model specifications. Moreover, using a recent, higher-quality measure of national income inequality makes that correlation fade away.
Furthermore, the correlation depends on the inclusion of rich post-industrial democracies. It does not hold among poor and middle-income countries.6
Overall, then, research on the welfare impact of clientelism is still in its infancy. In part this is due to the paucity of reliable and valid measures of partisan linkage patterns. Some evidence suggests that there is a trade-off between clientelism and social policy inputs and outputs. But this could just indicate a substitution effect in producing social hedging strategies. When it comes to social policy outcomes (equality, mortality, literacy, educational achievements, economic growth) the existence of a correlation with clientelism, let alone the causal direction of that relation, is highly uncertain, as only very few studies, with typically limited data, have probed that relationship.
Conclusion
The main messages of this survey may be briefly summarized in a set of theses. First, it is critical to think about clientelism not as an isolated linkage mechanism, but as a component of a complex profile of linkages deployed by partisan politicians. This recommendation is typically not heeded in the literature (see the first section of this chapter). Second, clientelistic exchange covers a highly diverse range of practices, and it may be time to disaggregate it into different subtypes, perhaps arrayed in a two-dimensional space, distinguishing whether clientelism involves more of a single-shot exchange or an iteration of exchanges and whether the exchange takes place between politicians and their brokers on one side, and individuals or groups and corporate entities as client recipients on the other (see the second section of this chapter). Third, recent years have seen great advances in studying the clientelistic exchange process between patrons, their agents (brokers) and clients, but the investigation of dyadic relations has lost sight of the broader organizational features and over-time development of political organizations and ‘machines’ (see the third section of this chapter). Fourth, an important element of research progress has been examination of the groups
of voters who are targeted by clientelism and the strategic configurations of competitiveness under which they are targeted (see the fourth section of this chapter). Whether enhanced competitiveness reinforces clientelistic or other (programmatic?) partisan appeals is contingent upon development and political economy. Fifth, relatively little headway has been made on the key subject of what causes clientelistic practices to become important components of parties’ linkage strategies and what makes clientelistic exchange lose that role. Established causal theories of the relationship between linkages and economic development, democratization, democratic institutions, state capacity and ethnic heterogeneity have proved to be of only limited use. Alternative (or complementary?) political–economic explanations of linkage mechanisms focusing on the interplay of economic development and political–economic governance structures have yielded insights at the level of case studies, but have not yet resulted in a convincing theoretical model or systematic comparative analysis of evidence (see the fifth section of this chapter). Sixth, the study of the welfare consequences of clientelist linkage mechanisms includes few investigations that rely on tractable observations of clientelistic transactions. A few tentative results, mostly based on indirect measures of partisan linkage, confirm the conventional wisdom that clientelism reduces social welfare outcomes, but this subject is wide open for more systematic and innovative analysis (see the sixth section of this chapter).
Notes
1 Because of constraints imposed by the Sage Handbook format, this chapter cannot cite a great number of valuable and pertinent recent contributions to the vibrant field of clientelism research. In fact, the author had to pare back the references from close to 200 – mostly contributions written since 2010 – to about 60 in the editing process. Apologies to all scholars on whose work this article draws, but who remain unmentioned.
2 In the authoritative Varieties of Democracy (V-Dem) dataset, the question directly pertaining to linkage mechanisms (question v2psprinkgs, *_osp, *_ord, in Codebooks 7 and 8, accessed on https://www.vdem.net/en/) even forces country experts to score entire party systems unidimensionally on a tradeoff between clientelism (score 0 = constituents are rewarded with goods, cash and/or jobs) and policy/programmatic (4 = constituents respond to a party’s positions on national policies, general party programs and visions for society).
3 A similar problem applies with regard to politicians’ capacity to deploy charismatic authority. Personalistic politics may sometimes fill the void left by the implosion of clientelistic party infrastructures (Kenny, 2017), but personalism may at other times complement clientelistic mobilization.
4 Of course, programmatism incurs its own waste, as it dissipates benefits to supporters and opponents alike (Stokes et al., 2013: 196–9).
5 The narrower conception of the developmental state attributed only to Southeast Asian, not Latin American, middle-income countries requires a core cadre of professional bureaucrats, in addition to a specific coalition of economic producer groups.
6 Author’s reanalysis of You (2015) estimation with SWIID data, available upon request.
References
Abente Brun, Diego, and Larry Diamond. 2014. (Eds.) Clientelism, Social Policy, and the Quality of Democracy. Baltimore: Johns Hopkins University Press.
Anderson, Siwan, Patrick Francois and Ashok Kotwal. 2015. ‘Clientelism in Indian Villages.’ American Economic Review. Vol. 105(6): 1780–1816.
Aspinall, Edward, and Ward Berenschot. 2019. Democracy for Sale: Elections, Clientelism, and the State in Indonesia. Ithaca, NY: Cornell University Press.
Auerbach, Adam. 2016. ‘Clients and Communities. The Political Economy of Party Network Organization and Development in India’s Urban Slums.’ World Politics. Vol. 68(1): 111–48.
Auerbach, Adam, and Tariq Thachil. 2018. ‘How Clients Select Brokers: Competition and Choice in India’s Slums.’ American Political Science Review. Vol. 112(4): 775–91.
Baldwin, Kate. 2016. The Paradox of Traditional Chiefs in Democratic Africa. New York: Cambridge University Press.
Berenschot, Ward. 2019. ‘Informal democratization: brokers, access to public services and democratic accountability in Indonesia and India.’ Democratization. Vol. 26(2): 208–24.
Bobonis, Gustavo J., Paul Gertler, Marco Gonzalez-Navarro and Simeon Nichter. 2017. Vulnerability and Clientelism. The National Bureau of Economic Research. Working Paper 23589.
Bustikova, Lenka, and Cristina Corduneanu-Huci. 2017. ‘Patronage, Trust, and State Capacity – The Historical Trajectories of Clientelism.’ World Politics. Vol. 69(2): 277–326.
Calvo, Ernesto, and Maria Victoria Murillo. 2019. Non-Policy Politics: Richer Voters, Poorer Voters, and the Diversification of Electoral Strategies. Cambridge: Cambridge University Press.
Chen, Haohan, and Herbert Kitschelt. Forthcoming. ‘Political Linkage Strategies and Social Investment Policies. Clientelism and Educational Policy in the Developing World.’ To be published in Julian L. Garritzmann, Silja Häusermann and Bruno Palier (Eds.) The World Politics of Social Investment. Oxford: Oxford University Press.
Corstange, Daniel. 2016. The Price of a Vote in the Middle East. New York: Cambridge University Press.
Diaz-Cayeros, Alberto, Federico Estévez and Beatriz Magaloni. 2016. The Political Logic of Poverty Relief: Electoral Strategies and Social Policy in Mexico. New York: Cambridge University Press.
Driscoll, Barry. 2018. ‘Why Political Competition Can Increase Patronage.’ Studies in Comparative International Development. Vol. 53(4): 404–27.
Finan, Frederico, and Laura Schechter. 2012. ‘Vote-Buying and Reciprocity.’ Econometrica. Vol. 80(2): 863–81.
Folke, Olle, Shigeo Hirano and James M. Snyder. 2011. ‘Patronage and Elections in U.S. States.’ American Political Science Review. Vol. 105(3): 567–85.
Gans-Morse, Jordan, Sebastián Mazzuca and Simeon Nichter. 2014. ‘Varieties of Clientelism: Machine Politics during
Elections.’ American Journal of Political Science. Vol. 58(2): 415–32.
Geddes, Barbara. 1991. ‘A Game-Theoretic Model of Reform in Latin American Democracies.’ American Political Science Review. Vol. 85(2): 371–92.
Gingerich, Daniel W. 2013. Political Institutions and Party-Directed Corruption in South America: Stealing for the Team. Cambridge: Cambridge University Press.
Golden, Miriam, and Brian Min. 2013. ‘Distributive Politics Around the World.’ Annual Review of Political Science. Vol. 16: 73–99.
Grassi, Davide, and Vincenzo Memoli. 2016. ‘Political Determinants of State Capacity in Latin America.’ World Development. Vol. 88(1): 94–106.
Greene, Kenneth. 2007. Why Dominant Parties Lose: Mexico’s Democratization in Comparative Perspective. Cambridge: Cambridge University Press.
Hagopian, Frances, Carlos Gervasoni and Juan Andrés Moraes. 2009. ‘From Patronage to Program: The Emergence of Party-Oriented Legislators in Brazil.’ Comparative Political Studies. Vol. 42(3): 360–91.
Hicken, Allen. 2011. ‘Clientelism.’ Annual Review of Political Science. Vol. 14: 289–310.
Hicken, Allen, and Joel W. Simmons. 2008. ‘The Personal Vote and the Efficacy of Education Spending.’ American Journal of Political Science. Vol. 52(1): 109–24.
Hidalgo, F. Daniel, and Simeon Nichter. 2016. ‘Voter Buying: Shaping the Electorate through Clientelism.’ American Journal of Political Science. Vol. 60(2): 436–55.
Holland, Alisha C. 2017. Forbearance as Redistribution: The Politics of Informal Welfare in Latin America. Princeton, NJ: Princeton University Press.
Holland, Alisha C. 2018. ‘Diminished Expectations: Redistributive Preferences in Truncated Welfare States.’ World Politics. Vol. 70(4): 555–94.
Keefer, Philip. 2007. ‘Clientelism, Credibility, and the Policy Choices of Young Democracies.’ American Journal of Political Science. Vol. 51(4): 804–21.
Keefer, Philip. 2018. ‘Organizing for Prosperity: Collective Action, Political Parties, and the Political Economy of Development.’ In Carol Lancaster and Nicolas van de Walle (Eds.) The Oxford Handbook of the Politics of Development. Oxford: Oxford University Press, pp. 432–57.
Kenny, Paul D. 2017. Populism and Patronage: Why Populists Win Elections in India, Asia, and Beyond. Oxford: Oxford University Press.
Kitschelt, Herbert. 2000. ‘Linkages between Citizens and Politicians in Democratic Polities.’ Comparative Political Studies. Vol. 33(6–7): 845–79.
Kitschelt, Herbert. 2007. ‘The Demise of Clientelism in Affluent Capitalist Democracies.’ In Herbert Kitschelt and Steven I. Wilkinson (Eds.) Patrons, Clients, and Policies: Patterns of Democratic Accountability and Political Competition. Cambridge: Cambridge University Press, pp. 298–321.
Kitschelt, Herbert, and Melina Altamirano. 2015. ‘Clientelism in Latin America. Effort and Effectiveness.’ In Ryan E. Carlin, Matthew M. Singer and Elizabeth J. Zechmeister (Eds.) The Latin American Voter: Pursuing Representation and Accountability in Challenging Contexts. Ann Arbor: University of Michigan Press, pp. 246–73.
Kitschelt, Herbert, and Daniel M. Kselman. 2013. ‘Economic Development, Democratic Experience, and Political Parties’ Linkage Strategies.’ Comparative Political Studies. Vol. 46(11): 1453–84.
Kitschelt, Herbert, and Matthew M. Singer. 2018. ‘Linkage Strategies of Authoritarian Legacy Parties under Conditions of Democratic Party Competition.’ In James Loxton and Scott Mainwaring (Eds.) Life after Dictatorship: Authoritarian Successor Parties Worldwide. Cambridge: Cambridge University Press, pp. 53–83.
Kitschelt, Herbert, and Steven I. Wilkinson. 2007. ‘Introduction.’ In Herbert Kitschelt and Steven I. Wilkinson (Eds.) Patrons, Clients, and Policies: Patterns of Democratic Accountability and Political Competition. Cambridge: Cambridge University Press, pp. 1–50.
Kramon, Eric. 2017. Money for Votes: The Causes and Consequences of Electoral Clientelism in Africa. New York: Cambridge University Press.
Kuo, Didi. 2018. Clientelism, Capitalism, and Democracy: The Rise of Programmatic Politics
in the United States and Britain. Cambridge: Cambridge University Press.
Larreguy, Horacio, Cesar E. Montiel Olea and Pablo Querubin. 2017. ‘Political Brokers: Partisans or Agents? Evidence from the Mexican Teachers’ Union.’ American Journal of Political Science. Vol. 61(4): 877–91.
Lawson, Chappell, and Kenneth F. Greene. 2014. ‘Making Clientelism Work: How Norms of Reciprocity Increase Voter Compliance.’ Comparative Politics. Vol. 47(1): 61–77.
Luna, Juan Pablo. 2014. Segmented Representation: Political Party Strategies in Unequal Democracies. New York: Oxford University Press.
Mares, Isabela, and Lauren E. Young. 2019. Conditionality and Coercion: Electoral Clientelism in Eastern Europe. Oxford: Oxford University Press.
Markussen, Thomas. 2011. ‘Inequality and Political Clientelism: Evidence from South Asia.’ Journal of Development Studies. Vol. 47(11): 1721–38.
Medina, Luis Fernando, and Susan C. Stokes. 2007. ‘Monopoly and Monitoring: An Approach to Political Clientelism.’ In Herbert Kitschelt and Steven Wilkinson (Eds.) Patrons, Clients, and Policies: Patterns of Democratic Accountability and Political Competition. Cambridge: Cambridge University Press, pp. 68–83.
Muñoz, Paula. 2019. Buying Audiences: Clientelism and Electoral Campaigns When Parties Are Weak. Cambridge: Cambridge University Press.
Nathan, Noah L. 2019. Electoral Politics and Africa’s Urban Transition: Class and Ethnicity in Ghana. New York: Cambridge University Press.
Nichter, Simeon. 2008. ‘Vote Buying or Turnout Buying? Machine Politics and the Secret Ballot.’ American Political Science Review. Vol. 102(1): 19–31.
Nichter, Simeon. 2018. Votes for Survival: Relational Clientelism in Latin America. New York: Cambridge University Press.
Pellicer, Miquel, Eva Wegner, Markus Bayer and Christian Tischmeyer. 2018. ‘Clientelism from the Client’s Perspective: A Framework Based on a Systematic Review of Ethnographic Literature.’ Ms. University of Duisburg, Germany.
Robinson, James, and Thierry Verdier. 2013. ‘The Political Economy of Clientelism.’ Scandinavian Journal of Economics. Vol. 115(2): 260–91.
Rosenbluth, Frances McCall, and Michael F. Thies. 2010. Japan Transformed: Political Change and Economic Restructuring. Princeton, NJ: Princeton University Press.
Rueda, Miguel R. 2015. ‘Buying Votes with Imperfect Local Knowledge and a Secret Ballot.’ Journal of Theoretical Politics. Vol. 27(3): 428–56.
Stokes, Susan C. 2007. ‘Political Clientelism.’ In Carles Boix and Susan C. Stokes (Eds.) The Oxford Handbook of Comparative Politics. Oxford: Oxford University Press, pp. 604–27.
Stokes, Susan C., Thad Dunning, Marcelo Nazareno and Valeria Brusco. 2013. Brokers, Voters, and Clientelism: The Puzzle of Distributive Politics. New York: Cambridge University Press.
Szwarcberg, Mariela. 2015. Mobilizing Poor Voters: Machine Politics, Clientelism, and Social Networks in Argentina. Cambridge: Cambridge University Press.
Taylor-Robinson, Michelle M. 2010. Do the Poor Count? Democratic Institutions and Accountability in a Context of Poverty. University Park: Pennsylvania State University Press.
Thachil, Tariq. 2014. Elite Parties, Poor Voters: How Social Services Win Votes in India. Cambridge: Cambridge University Press.
Tzelgov, Eitan, and Yi-ting Wang. 2016. ‘Party Ideology and Clientelistic Linkage.’ Electoral Studies. Vol. 44(3): 374–87.
Vicente, Pedro C. 2014. ‘Is Vote Buying Effective? Evidence from a Field Experiment in West Africa.’ The Economic Journal. Vol. 124(574): 346–87.
Wang, Yi-ting, and Kiril Kolev. 2018. ‘Ethnic Group Inequality, Partisan Networks, and Political Clientelism.’ Forthcoming, Political Research Quarterly. First View.
Weghorst, Keith R., and Staffan I. Lindberg. 2013. ‘What Drives Swing Voters in Africa?’ American Journal of Political Science. Vol. 57(3): 717–34.
Weitz-Shapiro, Rebecca. 2014. Curbing Clientelism in Argentina: Politics, Poverty and Social Policy. Cambridge: Cambridge University Press.
You, Jong-sung. 2015. Democracy, Inequality and Corruption. Cambridge: Cambridge University Press.
30 Elites Ursula Hoffmann-Lange
Introduction
Elites are a universal phenomenon of organized social life. Leaders emerge even in small groups. Complex organizations and, even more so, entire societies usually possess an internal division of labor involving a complex hierarchical system for fulfilling different tasks essential for their successful management. Elites are the people at the top of powerful political institutions and organizations in a society. Using the term elite suggests the existence of a cohesive elite formation with a common interest in preserving the current distribution of power, while using the term elites assumes an elite formation including a plurality of elite groups pursuing different or even conflicting interests. Hierarchies inevitably produce internal conflicts over the distribution of power and rewards. The chances to influence the existing distribution depend on an individual’s position in the hierarchy. The larger and the
more complex an organization or society, the larger the conflict potential usually is. This implies that elites who benefit most from the existing structure of privileges will always be viewed with suspicion by those who are less privileged. Elite research analyzing the structures, behaviors and networks of elites can help to dispel such suspicions (Zapf, 1961: 204). The following discussion of the basic theoretical questions and the results of empirical elite research is based primarily on the findings for elites in socio-economically developed democracies. This is not due to a lack of interest in the structures and behaviors of elites in other parts of the world. Unfortunately, however, empirical studies on elites in recent democracies of the Global South are still scarce. Studying elites in authoritarian systems is even more difficult. Researchers have to rely either on published materials that those elites provide voluntarily, which are usually rather limited, or on the testimonies of political dissidents and journalists. Therefore, their
validity is difficult to assess, since opportunities to corroborate them through independent sources are mostly unavailable. Nevertheless, many results of elite research will also hold true in non-democratic and non-western societies.
The Elite Concept and Elite Theory
Classic Elite Theories
Assumptions about elites can be traced back to the ancient Greek philosophers. Theories focusing on differences in the roles of elites and citizens were developed first by Machiavelli in the early 16th century and later, in the age of enlightenment, by political theorists such as John Locke, Thomas Hobbes, Charles de Montesquieu and Jean-Jacques Rousseau. At about the same time, theories of representative government began to take into account the political opinions of ordinary citizens, for instance in the Federalist Papers and in the writings of Edmund Burke. But it was only after the onset of industrialization, in the second half of the 19th century, that socialist thinkers developed the idea of democracy as a political system in which all citizens could participate as equals in political decision-making. Based on extensive historical analyses of elites and political regimes, the classic elite theories by Vilfredo Pareto (1935) and Gaetano Mosca (1939) explicitly refuted the socialists’ ideas as illusory and instead emphasized the universality of elites. Even though only Pareto used the term elite – Mosca preferred the term ruling class – both were preoccupied with demonstrating that socio-political inequalities and the emergence of elites are inevitable in human society. Pareto was primarily interested in the circulation of elites. Drawing on a dichotomy introduced by Machiavelli, he developed an elite typology with two types of
elites. To acquire and retain power, the types either use force (lions) or rely on persuasion and cunning (foxes). He further assumed that elites have a tendency to degenerate, thereby running into the problem that their system of power becomes vulnerable to the emergence of counter-elites of the other elite type. This is why he claimed that ‘history is a graveyard of aristocracies’ (Pareto, 1935: 1430; see also Pakulski, 2018). Robert Michels and Max Weber also made fundamental contributions to the development of elite theory. In his book on Political Parties, first published in 1911, Michels analyzed the emergence of a small professionalized leadership in political parties. He emphasized that in order to pursue their political objectives, political parties need to create an effective organization with a full-time staff. This introduces a division of labor between the party leadership and the rank-and-file members, thus allowing the party leaders to acquire superior knowledge and prestige, and sharply reduces the possibilities for effective control of the leadership by the ordinary members. Michels postulated an iron law of oligarchy governing all social organizations (2001: 26). In a similar vein, Max Weber argued that a concentration of power in elites and political leaders is inescapable in modern societies (Pakulski, 2018: 20). It has to be noted that these early theorists, who have frequently been denounced as antidemocrats by critical social scientists (e.g. Bottomore, 1993), were not fundamentally opposed to representative democracy. This is not even true for Pareto and Mosca, who severely criticized the existing democracies of their time; it is even less so for Weber, who believed in the merits of parliamentary democracy, which he considered to be more responsive to the demands of citizens (Pakulski, 2018: 22–3). 
In his famous treatise ‘Politics as a Vocation’, Weber discussed the qualities of a good political leader: they possess ‘passion, a feeling of responsibility, and a sense of proportion’. The contrasting type
is the demagogue, who is only interested in gaining votes and who behaves irresponsibly (Weber, 1946, first published 1919). Meanwhile, elite theory has moved beyond the assumption of a dichotomy of elites versus masses by acknowledging the hierarchical and at the same time pluralist character of modern societies. Most importantly, however, elite rule and democracy are no longer seen as irreconcilable concepts. Joseph Schumpeter (1942) was the first to develop a realistic theory of democracy by defining democracy as a political system based on the competition of political parties in free elections. Because the concept of democracy is associated with idealistic expectations, Robert Dahl (1971) suggested the term polyarchy instead, in order to emphasize that both elites and citizens have to be considered as legitimate political actors. He argued that modern polyarchy – despite its shortcomings, which he vividly analyzed – is the closest approximation to democracy that can be achieved in large territories. Therefore, Dahl is one of the prominent representatives of a strand of elite theory that has been labeled a theory of democratic elitism.
The Elite Concept and Policy Formation in Liberal Democracies
Elite research focuses on the characteristics and actions of individuals who, by virtue of their strategic positions at the helm of powerful institutions and organizations, are able to exert regular and substantial influence on nationally important decisions (Higley and Burton, 2006: 7). In contrast to power, which derives from the formal authority to take decisions on behalf of an organization or a political institution, influence is based on informal means of persuasion, inducement, activation of commitments or deterrence (Parsons, 1963a, 1963b). The distinction between power and influence is important for analyzing elites in modern democracies. These are characterized by a multiplicity of powerful political institutions
and civil society organizations, which enjoy considerable autonomy and participate in the process of policy formation. These elites typically act as agents of their organizations. The most important sectors studied by elite research are politics, public administration, the judiciary, the military, business, labor unions, civic associations, the media, academia and religious societies. While the monopoly for making binding decisions for the entire society rests with the constitutionally legitimated political institutions, the elites of the other sectors control important power resources to make decisions within and on behalf of their own organization. Because of the multidimensionality of resources controlled by different elites, there exists no overarching hierarchy of power among them. This implies that the outcomes of such decisions are contingent on the constellation of interests involved. The influence of sectoral elites depends on the subject matter, the resources they control, and their ability to mobilize public opinion in favor of their position and to stave off decisions they oppose (Mattina, Chapter 32, this Handbook).
Methods of Elite Identification
The general definition of elites as persons participating regularly in a society’s important policy decisions is insufficient for determining who exactly belongs to the elite for the purpose of empirical elite research. An operational definition needs to specify criteria of elite status. If the object is the elite of a country, these criteria have to take into account the institutional structure as well as the informal patterns of policy-making in that country. Their identification involves decisions taken by the researcher. In order to be universally applicable, this definition has to be parsimonious and must provide unambiguous criteria of elite status. Substantive characteristics of elites such as the degree of their cohesion, their expertise or the question of whether they act responsibly have to be treated as empirical questions.
Three basic methods have been developed for identifying elites: positional, decisional and reputational. The positional method is based on the assumption that political influence in complex societies is vested in formal leadership positions located in a broad range of societal sectors. It is the method most frequently used in empirical elite research, based on the assumption that modern elites have to command important power resources that are recognized by the other elites. The positional method requires several research steps. Each step involves purposive decisions based on prior assumptions and prior research about the structure of power and influence in a society, such as:
• the approximate number of elites to be included in the study
• the nature and number of relevant sectors
• the most powerful organizations within each sector
• the most powerful positions within each organization
• the current incumbents of these positions.
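The research steps of the positional method can be illustrated with a minimal sketch. It is not taken from any particular elite study: the class names, the `power_rank` field, the two cut-off parameters and the sample data are all hypothetical, and the rankings stand in for the researcher's purposive decisions, which the method itself does not supply.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    name: str
    power_rank: int            # hypothetical: 1 = most powerful in its sector (researcher's judgment)
    positions: dict[str, str]  # position title -> current incumbent, listed from the top down


@dataclass
class Sector:
    name: str
    organizations: list[Organization] = field(default_factory=list)


def positional_sample(sectors, max_orgs_per_sector, max_positions_per_org):
    """Return (sector, organization, position, incumbent) tuples.

    The two cut-offs are assumptions made by the researcher; together with
    the list of sectors they fix the approximate size of the elite sample.
    """
    sample = []
    for sector in sectors:
        # Keep only the most powerful organizations within each sector.
        top_orgs = sorted(sector.organizations, key=lambda o: o.power_rank)
        for org in top_orgs[:max_orgs_per_sector]:
            # Keep only the top positions within each organization.
            top_positions = list(org.positions.items())[:max_positions_per_org]
            for title, incumbent in top_positions:
                sample.append((sector.name, org.name, title, incumbent))
    return sample


# Hypothetical input: one sector, two organizations, placeholder incumbents.
sectors = [
    Sector("politics", [
        Organization("Party B", 2, {"Chair": "Person 3"}),
        Organization("Party A", 1, {"Chair": "Person 1",
                                    "Deputy chair": "Person 2",
                                    "Treasurer": "Person 4"}),
    ]),
]
elite = positional_sample(sectors, max_orgs_per_sector=1, max_positions_per_org=2)
```

With these cut-offs the sample contains only the chair and deputy chair of the highest-ranked organization. Changing either cut-off changes who counts as elite, which is why the choice has to be justified by the research question and by prior research.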
The positional method does not provide guidelines for specifying the horizontal and vertical boundaries of an elite. The inclusion of elite sectors and the choice of cut-off criteria for defining the range of elite positions must be based on the research question and the results of previous research. Although the positional method is useful for identifying and studying a wide range of elite groups, the different elite groups and the individual holders of elite positions cannot be assumed to be equally powerful or influential. Therefore, the method is primarily useful for studying important characteristics of elites such as their social backgrounds, careers, values and attitudes, and for comparing them to other elite groups. The decisional method identifies elites according to their active involvement in important policy decisions. It starts by selecting a representative sample of policy issues. The method requires extensive research on the policy formation for these issues through studying relevant documents and media
reports and conducting interviews with important actors. This will show which actors were involved at various stages of the policy-formation process and uncover the existence of informal elite networks and their internal coalitional structure, as well as the resources that were decisive for the final outcome. While the method has been successfully applied in small and medium-sized communities and single policy domains (Knoke et al., 1996), it is obvious that the complexity of national policy formation prevents its application to entire countries. The reputational method, finally, relies on experts to identify elites. It is applicable in community power research, where the number of politically influential actors is relatively small. Its usefulness at the national level is questionable, however. In complex national settings with a multiplicity of policy arenas, the identification of the core members of a country’s elite depends entirely on the subjective assessments of a limited number of experts. Therefore, both the validity and reliability of results provided by this method are questionable, as, for instance, Floyd Hunter’s controversial study on Top Leadership, U.S.A. (1959) showed. The choice of one of the three methods, or a combination of them, depends on the research question and implies different theoretical expectations about the loci of power and influence in modern societies (Hoffmann-Lange, 2018).
Elites and Regimes

Regimes and Elite Structures

Classic elite theories assumed a close relationship between elites and political regimes, although they disagreed about the underlying causality. Pareto believed that regimes reflected the character of elite configurations and that the degeneration of an elite formation enabled the emergence of counter-elites
that could exploit the political dissatisfaction of the population in their quest for political power and regime change. Mosca (1939: chapter 2) instead believed that changes in the socio-economic structure of society devalued previously essential elite qualifications and thereby contributed to the ascent of new types of elites that would change the character of the regime. Modernization theory (Lipset, 1959; Dahl, 1971; Vanhanen, 2003) finally assumes that socio-economic modernization fosters the development of a pluralist society, elite pluralism and democratization. Since it is not possible in the present context to discuss specific historical regime and elite types, it must suffice here to discuss two central dimensions characterizing the structure of elites and regimes. These are the degree of elite cohesion on one side, and the degree to which the diversity of sociopolitical interests is represented in the elites on the other side. It is obvious that traditional pre-democratic elites were set apart from the rest of society by a high degree of elite cohesion with respect to social background, social status and institutionalized social closure of access to the elite. Conflicts within and between elite factions were fights for power and not based on different ideas about the common good. Citizens were considered as subjects rather than a relevant social force. Resistance against elite decisions was kept at bay by repression. Increasing levels of socio-economic development and social differentiation produced a differentiation within both elites and populations (Keller, 1991: chapter 3). Rifts within the elite were no longer exclusively based on personal ambitions for power, but rather on the increasing elite heterogeneity and the emergence of different economic interests and political preferences. At the same time, the middle classes became more affluent and demanded civil rights protecting individual liberty and economic security in exchange for their taxes. 
Elite factions became able to mobilize segments of the population in support of
their political demands. Nascent political parties emerged and the quest for democratization became ever more widespread. Therefore, it is obvious that the increasing socio-political differentiation created a need for elite cooperation across different elite subgroups and political parties. Elite cohesion was no longer secured by the common interest of preserving the dominance of a traditional aristocracy. The necessity to secure both the existence of a pluralist society and stable patterns of elite integration has been discussed at length in the elite literature (Keller, 1991).
The Role of Elites in Democratic Transitions

The question of whether regime transitions are the result of mass protests or of conflicts among elites is controversial. Elite theorists tend to assume that regime changes always start at the elite level while public unrest and protest movements play a minor role. Such explanations run the risk of circularity, however. They have to assign elite status to the leaders of dissident movements that decisively contributed to regime change. It is obvious that a non-trivial explanation needs to distinguish between regime change which was initiated by parts of the established elite, for instance a military coup or an elite faction within a ruling party, and a change initiated by a grass-roots movement whose leaders did not belong to the established elite of the old regime before – such as, for instance, the leaders of the Solidarność movement in Poland in the 1980s. The third wave of democratization gave rise to a large body of studies into democratic transitions that identified groups of relevant actors, contributing factors and regularities in the dynamics of such transitions (Huntington, 1991; Linz and Stepan, 1996). One important strand relies on modernization theory and emphasizes the relevance of structural and cultural factors. Ronald Inglehart
(2018: chapter 7) and Christian Welzel have argued that value change has become the crucial determinant of democratization, which implies that citizens rather than elites were the prime drivers of the third wave of democratization (Inglehart and Welzel, 2005: 164–72). They have relied primarily on survey data showing the rise of emancipative values even in non-democratic countries and have analyzed the lagged relationship between the rise of emancipative values and the degree of democracy at the aggregate level. In doing so, they have ignored the political processes leading to the breakdown of non-democratic regimes and the transition to democracy: ‘Though elite bargaining was central when representative democracy first emerged, and still plays an important role, effective democracy increasingly emerges when ordinary people develop values and skills that enable them to put effective pressures on elites’ (Inglehart, 2018: 130). The opposite position is taken by Higley and Burton (2006), who claim that the behavior of elites is the crucial determinant of any successful democratization. They argue that successful democratization presupposes the formation of a consensually united elite characterized by a value consensus on democratic institutions, restrained partisanship and an inclusive elite network (ibid.: 11). They identified three distinct avenues that historically led to the formation of stable democracies: elite settlements, colonial origins or convergence among formerly disunited elites. Elite settlements are defined as agreements among previously divided elites to end civil strife, to accept the legitimacy of opponents and to implement rules containing conflicts by negotiations and compromises.
Such settlements are assumed to have been historically rare and contingent on a number of favorable conditions, as follows:
• the prior experience of costly, intense and inconclusive conflicts between elite factions;
• a durable elite alignment into two or three distinct and antagonistic camps;
• a triggering crisis;
• the presence of adept and flexible leaders;
• the consolidation of the agreement by elite habituation to power sharing and crisis management.
(Higley and Burton, 2006: 64–7)
An additional precondition was only implicitly mentioned by Higley and Burton, but was noted by Przeworski (1991: 87). This is an agreement on the introduction of constitutional safeguards for competitive elections and for the protection of political minorities. The chances for reaching such an agreement are higher if the new constitution is passed before holding the first competitive elections because the different elite camps do not know their popular strength at the outset. Higley and Burton further argued that elite settlements are likely to be self-reinforcing, thus leading to a stable democratic regime capable of managing political conflicts and securing a low level of domestic violence. The reliance on elections as a means of determining governmental majorities also secures a high degree of flexibility in absorbing conflicts arising from pressing political and economic problems (Lipset, 1959). If the initial introduction of democracy excludes extreme left-wing or right-wing political parties, Higley and Burton assume the possibility of a gradual elite convergence if the political parties of the center are strong enough to secure electoral majorities over a series of successive elections. Under this condition, the more extreme parties may be induced to move towards the political center in order to become more competitive. One could add that this additionally requires that the democratic institutions have proved their effectiveness and have gained legitimacy (Lipset, 1959). Moreover, the fringe parties must have accepted that they will not be able to undermine the legitimacy of the democratic political system. In this case, they may start considering participation in government as a more attractive alternative than being permanently excluded from the spoils of political office and unable to promise
tangible political results to their followers. If this is not the case, however, the presence of a bilateral opposition by extreme political parties of the left and the right may induce a centrifugal tendency pulling the moderate parties away from the center, as Sartori (1976) predicted. Most comparativists studying democratic transitions have taken an intermediate position between a top-down and a bottom-up model of democratization by emphasizing that usually both grass-roots movements and established elites are actively involved (Linz and Stepan, 1996: 71; Huntington, 1991; Schmitter, 2018). The assumption of elites’ crucial importance in democratic transitions goes hand in hand with the understanding that such transitions are historically contingent. While structural conditions may create a favorable environment for democratization, authoritarian elites are able to suppress demands for democracy for a considerable length of time. Inglehart’s claim that value change drives democratization is not incompatible with this assumption because value change can be assumed to affect parts of authoritarian elites as well. Reformist elite factions will usually wait, however, to start pressing for far-reaching reforms until the authoritarian regime’s inability to quell public dissatisfaction has become obvious. Although they pursue different research questions and are interested in different aspects of the previous authoritarian elite’s involvement in democratic transitions, two recent books by Stephan Haggard and Robert Kaufman (2016) and Michael Albertus and Victor Menaldo (2018) study systematic differences between elite-driven and citizen-driven democratization. They confirm the importance of elites in the transition process. Haggard and Kaufman were primarily interested in studying the relevance of distributional conflicts and their impact on the transition path, the quality of democracy achieved and its vulnerability to later democratic reversals.
They used a restrictive definition of elite-led transitions, which they
characterized as ‘the absence of pressures from below’ (142). Albertus and Menaldo studied the participation of the outgoing authoritarian elites in the transition process. They defined the ensuing regime as elite-biased democracy that preserves special privileges for those elites (9). They also included pacted transitions in this category. The differentiated and methodologically sophisticated analyses presented in the two books constitute an important step forward for understanding the relevance of socio-economic structures and different groups of actors in democratic transitions because they supersede the inconclusive and ultimately sterile dispute over elites versus masses as agents of democratization.
Elite Change and Regime Transformation

Elite Circulation and Elite Change

Elite circulation involves changes in the elite positions of a country. At the most specific level, it may signify the replacement of individual incumbents of elite positions by persons with similar characteristics and qualifications. Such replacement may be regular if it is due to the expiration of a temporary term of office or retirement rules. Irregular replacements occur when incumbents are forced out of their elite position, for instance because they are no longer perceived as being up to the demands of their office, because the selectorate concludes that a position requires a different set of qualifications than before, or for political reasons. Since elite status is always associated with high prestige, power, income and other amenities, incumbents will normally prefer to keep their elite positions for as long as possible. Therefore, it can be expected that only an opportunity for upward mobility to a better position (in terms of pay, prestige or power), term limits, personal reasons
(such as old age, failing health or death) or enforced resignation will terminate the incumbency of elites. Elite circulation may also involve groups of elite position-holders, for instance when a government agency is privatized, a major business corporation goes bankrupt or a political party experiences heavy losses in an election. In order to come to grips with the multi-faceted nature of elite circulation, it seems appropriate to distinguish between individual elite circulation, changes in the social composition of elites due to social change and elite change induced by regime change (Putnam, 1976: 168–72).
Social Change and Elite Change

Mosca believed that elite change has frequently been the result of a rise of new social forces (see also Putnam, 1976: 168–9). Industrialization is a prime historical example of such change, because it led to a decline in the importance of military elites and the landed aristocracy in favor of the new group of industrial entrepreneurs (Bottomore, 1993: 41–6). Since social change tends to proceed slowly, this type of elite change is not necessarily associated with abrupt regime change. If a regime has time to adapt to social change by opening new career channels for elite recruitment, elite and regime change may proceed over an extended period of time. British history is a telling example of such a dual transformation, by which a feudal system slowly evolved into an industrial and democratic society over several centuries. The longitudinal study of changes in the social composition of parliaments in a large number of European countries since the mid-19th century covered a wide range of social and professional characteristics. It shows, for instance, a drastic decline of the nobility as well as an increase in the levels of formal education and political professionalization (Cotta and Verzichelli, 2007; Putnam, 1976: 179–83).
Regime Change and Elite Change

Sudden and extensive elite transformations have historically been associated with the transitions to one-party totalitarian or ideocratic regimes in which a ruling party took control of the entire society. Totalitarian and ideocratic elites are also typically outsider elites with respect to their social and professional backgrounds. The most detailed analyses of the takeover of power by a totalitarian party were conducted on Germany under National Socialism (Neumann, 2009, first edition 1942; Zapf, 1961). When the National Socialist party came to power in January 1933, it swiftly took control of all crucial ministries and the state-owned radio network. Next, the legislative power of parliament was curbed in March 1933. Only a few months later, all other political parties were outlawed. Within its first year, the National Socialist regime also managed to install its loyalists at the top of most civil society organizations – exceptions were the business and financial elites, most of whom kept their positions. Such extensive elite change was possible because the party had a devoted mass membership working in many different sectors and ready to take over elite positions. Most of these elite changes could be accomplished without generating much open resistance from civil society. The party utilized a formally legalistic strategy and the elite changes were legally sanctioned by governmental decrees and implemented by an effective public service (Zapf, 1961: 51–6). The communist parties’ ascent to power in Russia in 1917, in the satellite states of the Soviet Union after World War II, in North Korea in 1948 and in China in 1949 resulted in an even more pervasive elite change because these regime changes also involved the more or less complete nationalization of industry and finance. While the National Socialist and communist takeovers were especially swift, most democratic backsliding in the past
two decades has proceeded more gradually and less visibly. In a number of cases, as in Russia or Turkey, this has led to the threshold being crossed from a (fragile) democratic to an authoritarian regime. It tended to start with a political party formed or captured by an outsider who came to power during an acute political or economic crisis, either through a regular election or upon appointment by the head of state (Levitsky and Ziblatt, 2018). ‘Although some elected demagogues take office with a blueprint for autocracy, many … do not. Democratic breakdown doesn’t need a blueprint’ (ibid: 75). It is sufficient that these parties pretend to pursue a legitimate public objective such as combating corruption or enhancing domestic security, and follow a piecemeal strategy of dismantling democratic institutions. Likewise, elite circulation in democratic backsliding proceeds gradually through modifications of the political institutions – for instance the shutdown of existing or the creation of new government agencies – and the replacement of individual politicians, high-ranking public servants and judges at high courts who refuse to cooperate. Elite change after democratization has usually been less pervasive than one should suspect. This was demonstrated by a host of elite studies in the post-communist central and eastern European democracies. Iván and Szonja Szelényi (1995) distinguished four types of elite circulation: elite continuity, horizontal reproduction, vertical reproduction and elite change. Elite continuity was limited to the ruling communist parties, which only changed their names and their top leadership. Horizontal reproduction implies that elites and sub-elites of the communist system continued their careers, but moved to a different sector. 
This was made possible because previous incumbents of top political positions in the communist party or in the bureaucracy could use their old networks to claim high positions in the business or finance sector during the privatization of the formerly state-owned companies.
Vertical reproduction is the most common pattern found after regime transitions and not limited to post-communist countries. Middle-level elites, who started successful careers during the non-democratic regime, benefit the most from the new window of opportunity because the top elites of the old regime usually lose their positions during or after a regime change. This pattern is particularly pronounced in politics, public administration and the media because their elites were the essential pillars of the old regime. Elite change, in the sense of the recruitment of new elites who had not held high-level positions under the communist regime, was mostly limited to the leadership of new political parties, civil society associations, media outlets and start-up businesses in fields that had not previously existed (electronics, computing, financial consulting, etc.) due to political restrictions. In the present context, it is not possible to provide country-specific information. In the introductory chapter to their comparative volume, Higley and Lengyel (2000: 1–21) summarize the findings by distinguishing two modes (gradual and peaceful versus sudden and coerced) and two scopes (wide and deep versus narrow and shallow) of elite circulation in the post-communist countries. The three Baltic states (Estonia, Latvia and Lithuania), Hungary, Poland, Czechoslovakia and Slovenia experienced pacted transitions with round-tables in which representatives of the former ruling party, of former dissident groups and of new parties participated. This led to what the authors called classic elite circulation. It involved extensive changes in the political sector, while a mixture of elite continuity and vertical reproduction prevailed in most of the other sectors and horizontal reproduction was rather widespread in the new business elites.
The circulation pattern in the other post-communist countries (including the EU members Bulgaria, Romania and Slovakia) was governed by what Higley and Lengyel termed reproduction circulation, which
implies a high degree of elite continuity, and where parties without roots in the communist system did not take a strong hold in society.
Elite Social Backgrounds and Elite Recruitment

Elites tend to come predominantly from privileged family backgrounds and do not constitute a mirror image of their society. This finding has been confirmed by a host of elite studies all over the world. It holds true even for socialist regimes propagating an explicitly egalitarian ideology, despite attempts to conceal the persistence of a considerable social bias in elite recruitment. It seems doubtful, however, that Mosca (1939) and critical social scientists in the tradition of Pierre Bourdieu (1996) are correct in assuming a general tendency towards the social closure of elite recruitment due to the deliberate attempt of elites to restrict access to the upper echelons of power to their own offspring. It is, of course, plausible to assume that incumbents of elite positions try to ensure that their children will achieve a similarly high social status. Because even the upper class is always much larger than the number of available elite positions, however, the reproduction of high social status in the following generation does not imply that elites are able or even try to secure elite positions for their offspring. While access to the elites used to be relatively closed in feudal societies or is controlled by guardians of ideological orthodoxy in ideocratic regimes, the same is not true in liberal democracies where no formal restrictions apply and the field of competitors is usually broad. Here, elite recruitment is primarily based on the presumed expertise necessary for an elite position rather than being passed on from one generation to the next. A long-standing and successful career and experience in equivalent positions
at lower levels are necessary but not sufficient conditions to be considered as a serious applicant for an elite position; personality, social skills and the field of competitors play a role as well. The overrepresentation of children from families with a high social status among elites is primarily the result of the prevailing patterns of social mobility (Putnam, 1976: 28–32). Today, a university degree has become a near universal precondition for elite recruitment. At the same time, it is well known that the opportunity to attend institutions of higher education is strongly influenced by the cultural capital of the family. This situation has not fundamentally changed with the expansion of higher education over the past 50 years. While the absolute numbers of people achieving higher secondary and tertiary degrees have greatly risen over time, the social selectivity of educational institutions has not decreased substantially (Teese, 2007). Putnam (1976: chapter 2) analyzed the disadvantages of people from lower-class backgrounds to achieve elite status and showed two regularities. First, he demonstrated the existence of what he called the agglutination model, which implies that different dimensions of social status, such as education, occupational status, income, prestige and power, tend to go hand in hand. Second, he stated a law of increasing disproportion (ibid: 33) by showing that the overrepresentation of persons with a higher class background increases at successively higher levels of institutional hierarchies. While the equality of educational opportunities and the openness of elite recruitment are important democratic principles, family background’s influence on the life chances of children is deeply rooted in the fabric of society and cannot be completely eliminated by political intervention. The available data do not support the assumption of closed elite recruitment in today’s liberal democracies, however. Examples of prominent families with several
members who achieved elite status – such as the Rockefellers, the Kennedys and the Bushes in the United States – are not sufficient to prove that elite status is predominantly passed on in families. They indicate primarily that family traditions may be a powerful incentive for the ambition to excel. Today, hereditary elite status is limited to a few owners of large fortunes, while the overwhelming majority of the elites are the first in their family to have achieved that status.
Political Representation: The Political Beliefs of Elites and Citizens

Bottom-Up or Top-Down? Models of Political Representation

Since democracy in large territories can only be organized as representative democracy, the question of how political representation can be measured has plagued elite research for a long time. Empirical elite research has mostly relied on comparing the political attitudes of elected representatives with those of the electorate at large, with their party’s electorate or – in systems with single-member constituencies – with the voters of their constituency. It is doubtful, however, that such comparisons provide valid insights into the quality of political representation. Elite theory leads us to expect considerable differences between the political beliefs of elites and citizens. These are not due to differences in social backgrounds and economic interests, as is frequently assumed, but primarily result from their different political roles. Political elites are full-time politicians intimately involved in legislation and executive acts. They participate in decisions on myriad political issues. While elected representatives are expected to be responsive to citizens’ preferences, they also have to observe constitutional limits and balance voter demands with those of their party, as
well as the diverse interests of civil society actors, the media, transnational organizations and the governments of other countries. Finally, they have to keep within budgetary limits and need to forge legislative majorities for passing new legislation. In their book Democracy for Realists, Christopher Achen and Larry Bartels (2016) severely criticize the idealistic notion that policy-making is a bottom-up process and the idea that the voting decisions of citizens are based on rational considerations regarding the issue positions of political parties and candidates. The authors take a decidedly elitist stance, arguing that policy-making is a job for specialists and not for ordinary citizens. They cite a host of empirical studies confirming that policy-makers are rarely responsive to public opinion. ‘Policies are made by political elites of one kind or another, including elected officials, government bureaucrats, interest groups and judges’ (ibid: 320). Instead, they claim, voters expect representatives to do their job (ibid: 318–19). Based on such premises, András Körösényi (2010) proposed an authorization model based on Schumpeter’s notion of democracy as electoral competition among political parties. ‘Voters’ choices depend primarily on the impact of the strategic interplay of rival political leaders and parties, which is to say on agenda-setting, framing, and priming political issues [rather] than forming judgments according to independent yardsticks’ (ibid: 54). This model is realistic and at the same time rational because voters can rely on party competition as a powerful incentive for political accountability. Putnam argued that the significance of elections for elite–mass linkages primarily derives from the ability of political elites to anticipate the (socio-economic) consequences of their decisions and is not a result of citizens making specific demands (1976: 151–2). 
Therefore, elite theory’s emphasis on the division of labor between voters and political elites in policy-making does not imply that the interests of citizens are not taken into
account, but rather that political accountability must not be equated with responsiveness. Democracy requires the free articulation of the diversity of interests existing in a pluralist society. Policy-making instead involves the aggregation of demands by diverse interests, which necessarily implies the search for compromise and can never be fully responsive to each and every position. Moreover, the political beliefs of citizens are neither consistent nor stable over time, as shown in Philip Converse’s (1964) seminal article on the belief systems of mass publics, where he demonstrated that the political knowledge and sophistication of most voters is rather low. However, the share of politically sophisticated voters is not invariant over time and across countries (e.g. Klingemann, 1979). Michael Delli Carpini and Scott Keeter found considerably more variation in levels of political knowledge within the citizenry. The best-informed quartile of citizens were able to ‘express a considered, genuine opinion on most issues’ which are both internally consistent and stable (1989: 19, 266). Converse’s basic claim of a steep gradient in the political knowledge, consistency and stability of citizens’ political beliefs has by and large been confirmed by many other studies. These differences are explained not by citizens’ lower levels of formal education, but primarily by their lower level of information about politics. As consumers of policies, they also do not have to consider whether their demands will add up to a consistent set of policies. They select from the electoral platforms of the different parties those promises which they find appealing and they react primarily to issues that are currently high on the political agenda. By contrast, electoral competition and differences in the basic ideological positions of political parties ensure congruence among party politicians and their voters with respect to some basic outlooks. 
In electoral campaigns, parties try to appeal to different social groups by emphasizing their familiar positions on salient socio-political cleavages.
Therefore, congruence tends to be highest for those issues that are most central to the programmatic outlook of the parties. Since the empirical results on the considerable elite–citizen differential in political knowledge do not imply that this differential is constant over time and across countries, regular monitoring of public opinion on current policy issues is essential for studying the accountability of political elites because political opinion formation is intrinsically a multi-directional process involving a multitude of political actors. Therefore, modern demo-elitist theories conceive of policy formation as a bi-directional process involving both top-down and bottom-up flows of opinion formation (Körösényi, 2018). New issues may be brought up by marginal groups, and dissatisfaction with the status quo may bring new demands to the fore that force the established elites to develop new strategies for coping with these challenges.
Political Value Orientations

The political culture of a society indicates whether the political value orientations, legitimacy beliefs and behaviors are congruent with the prevailing political regime. Likewise, studying the political culture of elites is important to learn about the mutual trust among elites and about intra-elite conflict over basic principles of the political order. Herbert McClosky’s (1964) classic study was the first to demonstrate that political activists and political leaders in the United States showed higher levels of support for democracy and more tolerance of deviating political opinions than ordinary citizens. McClosky concluded that elites have to be considered as the main carriers of the democratic creed. Based on these results, the theory of democratic elitism argued that the stability of democracy rests primarily on the existence of an elite consensus on democratic rules of the game. Later studies in the US and other democracies have largely confirmed this result.
In a more comprehensive study of political tolerance, which included several specialized elite groups – among them lawyers, leaders of the American Civil Liberties Union (ACLU) and police officers (McClosky and Brill, 1983) – the authors analyzed the impact of various independent factors on democratic value orientations. The results revealed the influence of four factors (McAllister, 1991). The first is the level of formal education, which raises the awareness of the political relevance of democratic principles and institutions. A second factor is the socializing effect of participation in public affairs, which explains why political activists are more tolerant of deviating opinions than their nonactive counterparts. Professional norms are a third factor. Defense lawyers who are professionally involved in protecting the rights of their clients were the second most libertarian group, only surpassed by the activists of the ACLU. Police officers were considerably more supportive of law and order policies. Finally, political ideology (conservative versus liberal) was also related to civil libertarianism. The latter result was confirmed by Paul Sniderman et al. (1991), who demonstrated considerable differences between the leaders of different political parties. While these results of course do not disprove that elites on average are more tolerant than the general public, Sniderman et al. pointed out that changes in government may lead to considerable shifts in policies with respect to civil liberties. Based on a thorough survey of the results of comparative research, Mark Pfeffley and Robert Rohrschneider (2007) found that some of the conclusions of the theory of democratic elitism have to be questioned. First, the sweeping assumption that elites are the guardians of democracy has to be qualified. 
While it is true that democratic elites tend to be more supportive of democracy and more tolerant of different (political) points of view, elite consensus may be far from perfect because national averages may conceal considerable differences between major parties
and elite groups (McClosky and Brill, 1983; Sniderman et al., 1991). Moreover, both support for democracy and political tolerance may be of little practical relevance once elites perceive certain political groups as a threat to democracy. In the latter case, political elites may be even less tolerant than the public and may actively engage in repressive policies, as for instance during the McCarthy era (Pfeffley and Rohrschneider, 2007: 71–2). Finally, it has to be noted that extant research on elite–citizen differences regarding regime legitimacy and the political value orientations of elites has been limited to democratic countries – including new democracies – but very little is known about the political value orientations of elites in countries with hybrid or authoritarian regimes. While it is certainly difficult to conduct elite surveys in such countries, qualitative studies are lacking in the literature as well.
Novel Challenges for Established Elites since the Beginning of the 21st Century
Since the mid-1960s, established democracies have experienced a considerable decline in institutionalized forms of political participation such as electoral turnout, party identification and party membership. This has been accompanied by a parallel surge of voter volatility and citizen protest against governmental decisions and changes in the party systems of established democracies. In his theory of value change, Ronald Inglehart (2018) has characterized this ongoing process as a change from materialist to postmaterialist values and from elite-directed to elite-challenging political participation. He holds that this value change is the result of a cognitive mobilization that has enabled citizens to act as political subjects rather than just as objects of elite decision-making. According to Inglehart’s theory, value change depends on the presence of
economically favorable and relatively secure socio-economic conditions, particularly in the developed democracies, but also among the rising middle classes in other parts of the world. In recent years, these conditions have been increasingly shattered by the effects of the ongoing globalization. Rapid changes in the labor markets and living conditions, the massive wave of global migration and the fallout of the global economic crisis after 2008 have nurtured feelings of insecurity and facilitated the rise of right-wing populist movements whose political demands are very different from post-materialist concerns. Populist movements and political parties, such as the French Front National, the Dutch Partij voor de Vrijheid, the Sweden Democrats and the German AfD (Alternative für Deutschland), articulate traditional materialist demands. They favor higher government expenditure for infrastructure and welfare, the creation of new jobs and protective measures against imports. They also mobilize against reforms such as gay marriage and immigration by people from different cultural backgrounds, which they denounce as violating traditional values and threatening the cultural identity of the country (Papadopoulos, 2013: 27–8, Kriesi, Chapter 90, this Handbook). This raises the question whether increased citizen mobilization and the advent of populist protest movements will contribute to an erosion of representative democratic institutions and impair the ability of elites to aggregate an increasingly heterogeneous spectrum of political demands. Many elite theorists have argued that a considerable degree of elite autonomy is necessary for pursuing coherent and sustainable policies (Schumpeter, 1942; Kornhauser, 1960; Higley and Burton, 2006). Therefore, the rising tide of grass-roots protest movements implies more public pressure, especially on elected political elites. Their power basis will become less stable and their careers more vulnerable to electoral defeat. 
This will force them to pursue short-term strategies in order
to secure their mandate. At the same time, it will make democratic politics more volatile and erratic in the future (Hoffmann-Lange and Kuklys, 2019). Heinrich Best (2009) conceptualized three basic theoretical dilemmas that are useful for analyzing the potential consequences of these developments. First, he replaced the assumption that elite cohesion is based on elite consensus by the notion of antagonistic cooperation among elites, assuming that relations at the elite level usually involve considerable policy disagreements and that elites will refrain from open clashes only as long as they perceive that mutual restraint in bargaining with each other will involve higher returns. Next, the principal-agent dilemma governing relations between elites and their constituents can be managed only as long as strong organizational linkages between elites and their followers secure the existence of sufficient trust in the accountability of the elites. The erosion of traditional elite–citizen linkages in the past decades and the widespread perception that effective party competition has declined has fueled increasing political discontent that has even reached consolidated democracies with relatively successful economies, such as Switzerland, the Netherlands, Germany and the Scandinavian countries. Finally, the challenge–response dilemma addresses the question of how elites will cope with the ongoing social and political change. In Europe, the ramifications of globalization and the economic crisis for standards of living were aggravated by the wave of immigration into the affluent European democracies in the past decade. So far, the established political parties have reacted rather defensively in tackling these challenges, which has contributed to the impression that they do not have a realistic strategy to cope with them. This, in turn, has contributed to the rise of populist political movements and parties in recent years (Best and Hoffmann-Lange, 2018). 
Populist parties aggressively reject the traditional ways of doing politics (Mudde,
2008; Mudde and Kaltwasser, 2017). They claim to represent the ordinary people, whose interests are purportedly neglected by the established political parties and their leaders. Ronald Inglehart and Pippa Norris (2017) have pointed out that such claims have the main purpose of mobilizing voters who feel that social change is threatening traditional values and their way of life. They assume that support for populist authoritarian parties is primarily ‘motivated by a backlash against the cultural changes linked with the rise of Postmaterialist and Self-expression values, far more than by economic factors’ (ibid.: 446), although they acknowledge that declining real incomes, the increase in precarious jobs and the rise in economic inequality constitute economic underpinnings of public discontent (ibid: 448–9). Huber and Schimpf have likened the appearance of populist parties to a drunken guest ‘spilling out the painful truths’ (2016: 119). Populists point out the shortcomings of the established ways of running politics and the erosion of representative democracy ‘due to the shift of decision-making functions to bodies remote from the representative process, collaborative governance frequently characterized by limited pluralism, transnational governance with weak democratic accountability and increased privatization’ (Papadopoulos, 2013: 242). By criticizing globalization and immigration as the most obvious symptoms, populist parties capitalize on the ‘degradation of democratic quality’ in today’s representative democracies. Their critique is ‘often extremely reductionist’, however, because ‘there is no mono-causal explanation’ and no simple solution for these major challenges (ibid: 243–4). The theoretical discussion about the relationship of populism and democracy is of considerable relevance for elite theory. 
The anti-elitist stance of populist movements shows that they exploit the ambivalent promises of democracy as an ideal and as a system of government (Canovan, 1999). Whenever the gap between the ‘haloed democracy and
the grubby business of politics’ becomes too wide, ‘populists tend to move on to the vacant territory, promising in place of the dirty world of party maneuvering the shining ideal of democracy renewed’ (ibid: 11). Benjamin Arditi discussed the potentially problematic effects of populism on liberal democracies. He argued that populist claims are unproblematic as long as they are articulated by the media as maverick contributions to the public discourse about shortcomings of the political system or as ‘a mode of participation that departs from the etiquette of political salons without apologising for its brashness’ (2004: 142). These two forms have the ‘potential to both disturb and renew the political process without necessarily stepping outside the institutional settings of democracy’ (ibid.). A third form, however, ‘comes to haunt political democracy and to endanger the very framework in which it can function’ (ibid.) when those making the claims exploit the public’s lack of trust in institutional procedures and in the legislative process to discredit the rule of law and to openly advocate authoritarian practices (ibid.). The latter danger is the reason why Simon Tormey, who has otherwise severely criticized the deficiencies of representative democracy, argues that populism can be ‘democracy’s deadly cure’ (2017). The empirical analysis of Huber and Schimpf takes into account this Janus-faced character of populism. The authors assume that populist parties can be beneficial by broadening the spectrum of issues debated in parliaments when they are in opposition, while they are apt to impair the quality of democracy when they participate in government (2016: 111). This hypothesis was confirmed in their analysis based on aggregate data including European cabinets for the period 2000 to 2012. 
These discussions have shown that populism is a phenomenon resulting from the unfulfilled promises of democracy, which is particularly likely to spread in times of rapid social change when established patterns of
political crisis management are confronted with new challenges. Theoretically, it is the opposite of elitism: ‘Elitism shares populism’s basic monist and Manichean distinction of society between a homogeneous “good” and a homogeneous “evil”, but it holds an opposite view on the virtues of the groups’ (Mudde and Kaltwasser, 2017: 7). By contrast, Moisés Naím (2017) argues that populism is not a new phenomenon at all, but rather a traditional strategy to obtain and retain power by ‘exacerbating sociocultural division and conflict’. It includes magnifying the nation’s problems, criminalizing the opposition, discrediting experts and delegitimizing the media. In his view, populists are simply counter-elites vying for power. Similarly, Mudde (2008) has argued that it is not sufficient to only take into account the demand side of populism, but that we should also look at the supply side and consider that it can be deliberately used to mobilize pre-existing grievances. Therefore, the rapid spread of populism cannot be explained exclusively by the failure of established elites in coping with current challenges. Although most of the recent literature has dealt with the current wave of populism in the established liberal and the post-communist democracies, it is not limited to these. It is also widespread in hybrid and authoritarian political systems, for instance in Turkey, Russia, the Philippines and Venezuela. The diffuse demands of populist parties for more and ‘true democracy’ conceal, however, that they only mobilize against the status quo. Changing things for the better would require mobilization for something, at least on the basis of a rudimentary operational program. In short, it would require political leadership in Max Weber’s sense. The populist governments of Greece and Italy have not achieved any institutional reforms so far; rather, they have continued the muddling-through of their predecessors.
Other successful populist leaders have replaced fledgling democracies and hybrid systems with authoritarian rather than democratic regimes. It cannot be ruled out
that citizens will quickly become disappointed with the populists and renew their linkages with traditional parties once they recognize that the populists have no political solutions to offer. For the time being, however, the advent of populist parties in the legislatures and the governments of liberal democracies will contribute to making policy formation more rather than less intricate and conflictual (Papadopoulos, 2013: chapter 7).
References
Achen, Christopher H. and Bartels, Larry M. (2016) Democracy for Realists: Why Elections Do Not Produce Responsive Government. Princeton: Princeton University Press.
Albertus, Michael and Menaldo, Victor (2018) Authoritarianism and the Elite Origins of Democracy. Cambridge: Cambridge University Press.
Arditi, Benjamin (2004) Populism as a Spectre of Democracy: A Response to Canovan. Political Studies. 52 (1), pp.135–43.
Best, Heinrich (2009) Associated Rivals: Antagonism and Cooperation in the German Political Elite. Comparative Sociology. 8 (3), pp.419–39.
Best, Heinrich and Hoffmann-Lange, Ursula (2018) Challenged Elites - Elites as Challengers. Towards a Unified Theory of Representative Elites. Historical Social Research. 43 (4), pp.7–32.
Bottomore, Tom (1993) Élites and Society. 2nd edition. London: Palgrave.
Bourdieu, Pierre (1996) The State Nobility. Oxford: Polity Press.
Canovan, Margaret (1999) Trust the People! Populism and the Two Faces of Democracy. Political Studies. 47 (1), pp.2–16.
Converse, Philip E. (1964) The Nature of Belief Systems in Mass Publics. In Apter, David E. (ed.). Ideology and Discontent (pp.206–61). New York: The Free Press.
Cotta, Maurizio and Verzichelli, Luca (2007) Paths of Institutional Development and Elite Transformations. In Cotta, Maurizio and Best, Heinrich (eds). Democratic Representation in Europe (pp.417–73). Oxford: Oxford University Press.
Dahl, Robert A. (1971) Polyarchy. New Haven: Yale University Press.
Delli Carpini, Michael X. and Keeter, Scott (1989) What Americans Know about Politics and Why It Matters. New Haven: Yale University Press.
Haggard, Stephan and Kaufman, Robert R. (2016) Dictators and Democrats: Masses, Elites, and Regime Change. Princeton: Princeton University Press.
Higley, John and Burton, Michael (2006) Elite Foundations of Liberal Democracy. Lanham: Rowman & Littlefield.
Higley, John and Lengyel, György (eds) (2000) Elites after State Socialism. Lanham: Rowman & Littlefield.
Hoffmann-Lange, Ursula (2018) Methods of Elite Identification. In Best, Heinrich and Higley, John (eds). The Palgrave Handbook of Political Elites (pp.79–92). London: Palgrave Macmillan.
Hoffmann-Lange, Ursula and Kuklys, Mindaugas (2019) European Citizens and Elites in Times of Economic Crisis and Citizen Unrest. In Vogel, Lars, Gebauer, Ronald and Salheiser, Axel (eds). The Contested Status of Political Elites: At the Crossroads. London: Routledge.
Huber, Robert A. and Schimpf, Christian H. (2016) A Drunken Guest in Europe? The Influence of Populist Radical Right Parties on Democratic Quality. Zeitschrift für Vergleichende Politikwissenschaft. 10 (2), pp.103–29.
Hunter, Floyd (1959) Top Leadership, U.S.A. Chapel Hill: University of North Carolina Press.
Huntington, Samuel P. (1991) The Third Wave: Democratization in the Late Twentieth Century. Norman: University of Oklahoma Press.
Inglehart, Ronald (2018) Cultural Evolution. Cambridge: Cambridge University Press.
Inglehart, Ronald and Norris, Pippa (2017) Trump and the Populist Authoritarian Parties: The Silent Revolution in Reverse. Perspectives on Politics. 15 (2), pp.443–54.
Inglehart, Ronald and Welzel, Christian (2005) Modernization, Cultural Change, and Democracy: The Human Development Sequence. Cambridge: Cambridge University Press.
Keller, Suzanne (1991) Beyond the Ruling Class: Strategic Elites in Modern Society. 2nd edition. New Brunswick: Transaction Publishers (first published in 1963).
Klingemann, Hans D. (1979) Measuring Ideological Conceptualization. In Barnes, Samuel H., Kaase, Max et al. Political Action: Mass Participation in Five Western Democracies (pp.215–54). Beverly Hills: Sage.
Knoke, David, Pappi, Franz Urban, Broadbent, Jeffrey and Tsujinaka, Yutaka (1996) Comparing Policy Networks. Cambridge: Cambridge University Press.
Kornhauser, William (1960) The Politics of Mass Society. London: Routledge and Kegan Paul.
Körösényi, András (2010) Beyond the Happy Consensus about Democratic Elitism. In Best, Heinrich and Higley, John (eds). Democratic Elitism: New Theoretical and Comparative Perspectives (pp.43–60). Leiden: Brill.
Körösényi, András (2018) Political Elites and Democracy. In Best, Heinrich and Higley, John (eds). The Palgrave Handbook of Political Elites (pp.41–52). London: Palgrave Macmillan.
Levitsky, Steven and Ziblatt, Daniel (2018) How Democracies Die. New York: Crown.
Linz, Juan J. and Stepan, Alfred (1996) Problems of Democratic Transition and Consolidation. Baltimore: The Johns Hopkins University Press.
Lipset, Seymour Martin (1959) Some Social Requisites of Democracy: Economic Development and Political Legitimacy. American Political Science Review. 53 (1), pp.69–105.
McAllister, Ian (1991) Party Elites, Voters and Political Attitudes: Testing Three Explanations for Mass-Elite Differences. Canadian Journal of Political Science. 24 (2), pp.237–68.
McClosky, Herbert (1964) Consensus and Ideology in American Politics. American Political Science Review. 58 (2), pp.361–82.
McClosky, Herbert and Brill, Alida (1983) Dimensions of Tolerance: What Americans Believe about Civil Liberties. New York: Russell Sage Foundation.
Michels, Robert (2001) Political Parties: A Sociological Study of the Oligarchical Tendencies of Modern Democracy. Kitchener (Ontario, Canada): Batoche Books (first published in 1911 in German). https://socialsciences.mcmaster.ca/econ/ugcm/3ll3/michels/polipart.pdf (Accessed January 7, 2020).
Mosca, Gaetano (1939) The Ruling Class. New York: McGraw-Hill.
Mudde, Cas (2008) The Populist Radical Right: A Pathological Normalcy. Working Paper 3/07. Malmö: Malmö Institute for Studies of Migration, Diversity and Welfare.
Mudde, Cas and Kaltwasser, Cristóbal Rovira (2017) Populism: A Very Short Introduction. Oxford: Oxford University Press.
Naím, Moisés (2017) How to Be a Populist. The Atlantic, April 21, 2017. https://www.theatlantic.com/international/archive/2017/04/trump-populism-le-pen/523491/ (Accessed November 9, 2017).
Neumann, Franz L. (2009) Behemoth. The Structure and Practice of National Socialism, 1933–1944. Revised edition with an Introduction by Peter Hayes. Chicago: Ivan R. Dee (first published in 1942/1944).
Pakulski, Jan (2018) Classical Elite Theory: Pareto and Weber. In Best, Heinrich and Higley, John (eds). The Palgrave Handbook of Political Elites (pp.17–24). London: Palgrave Macmillan.
Papadopoulos, Yannis (2013) Democracy in Crisis? London: Palgrave Macmillan.
Pareto, Vilfredo (1935) The Mind and Society: A Treatise on General Sociology. New York: Harcourt, Brace and Company (first published 1916 in Italian).
Parsons, Talcott (1963a) On the Concept of Political Power. Proceedings of the American Philosophical Society. 107 (3), pp.232–62.
Parsons, Talcott (1963b) On the Concept of Influence. Public Opinion Quarterly. 27 (1), pp.37–62.
Pfeffley, Mark and Rohrschneider, Robert (2007) Elite Beliefs and the Theory of Democratic Elitism. In Dalton, Russell J. and Klingemann, Hans-Dieter (eds). The Oxford Handbook of Political Behavior (pp.65–79). Oxford: Oxford University Press.
Przeworski, Adam (1991) Democracy and the Market. Cambridge: Cambridge University Press.
Putnam, Robert D. (1976) The Comparative Study of Political Elites. Englewood Cliffs: Prentice-Hall.
Sartori, Giovanni (1976) Parties and Party Systems: A Framework for Analysis. Cambridge: Cambridge University Press.
Schmitter, Philippe C. (2018) Democratization: The Role of Elites. In Best, Heinrich and Higley, John (eds). The Palgrave Handbook of Political Elites (pp.593–610). London: Palgrave Macmillan.
Schumpeter, Joseph A. (1942) Capitalism, Socialism and Democracy. New York: Harper & Brothers.
Sniderman, Paul M., Fletcher, Joseph F., Russell, Peter H., Tetlock, Philip E. and Gaines, Brian J. (1991) The Fallacy of Democratic Elitism: Elite Competition and Commitment to Civil Liberties. British Journal of Political Science. 21 (3), pp.349–70.
Szelényi, Iván and Szelényi, Szonja (1995) Circulation or Reproduction of Elites during the Postcommunist Transformation of Eastern Europe. Theory and Society. 24 (5), pp.615–38.
Teese, Richard (2007) Time and Space in the Reproduction of Educational Inequality. In Teese, Richard, Lamb, Stephen and Duru-Bellat, Marie (eds). International Studies in Educational Inequality, Theory and Policy (Volume 1, pp.1–21). Dordrecht: Springer.
Tormey, Simon (2017) Is Populism Democracy’s Deadly Cure? The Conversation, September 21, 2017. http://theconversation.com/is-populism-democracys-deadly-cure-82592 (Accessed November 7, 2018).
Vanhanen, Tatu (2003) Democratization: A Comparative Analysis of 170 Countries. London: Routledge.
Weber, Max (1946) Politics as a Vocation. In Gerth, Hans H. and Mills, C. Wright (eds). From Max Weber: Essays in Sociology (pp.77–128). New York: Oxford University Press (first published 1919 in German).
Zapf, Wolfgang (1961) Wandlungen der deutschen Elite. München: R. Piper.
31 Identities
Ireneusz Pawel Karolewski
Identities have experienced a new surge of interest in the social sciences in recent years (e.g. Karolewski, 2010; Kaina, Karolewski and Kuhn, 2016; Fukuyama, 2018). Whereas the notion of identity had been part of academic debates on nationalism (e.g. Miller, 1995a; Meyerfeld, 1998), liberal multiculturalism (e.g. Kymlicka, 1995), secessionism (e.g. Sorens, 2012), social movements (e.g. Bernstein, 2005) or liquid modernity (e.g. Bauman, 2000) for some time, it was often used with such a variety of diverging meanings that some scholars proclaimed ‘identity’ to be an obsolete and problematic concept, which serious scholarship should avoid. Brubaker and Cooper (2004: 28) argued famously that ‘identity … tends to mean too much (when understood in a strong sense), too little (when understood in a weak sense), or nothing at all (because of its sheer ambiguity)’. Others have argued, however, that we cannot avoid dealing with identity, since having an identity is a ‘psychological imperative’ as well as a ‘sociological constant’ (Greenfeld, 1999: 38).
As a result of a new wave of globally spreading populism and national chauvinism, the very concept of identity is experiencing a revival (e.g. Mounk, 2018; Fukuyama, 2018). Both Yascha Mounk and Francis Fukuyama point in their recent work to many people’s need to have ‘their’ identity recognized and to perform it through exclusion of others, which contrasts with the previous scholarly interest in ‘weak identity’ centered on fluidity and constructability of identities rather than resilient and stable forms thereof. The research on fluidity of identity has highlighted multiple or hybrid identities of individuals in the process of making and claiming identities, in which identities are not attributes but rather resources people can apply almost at will. In this sense, the individualization of identity would go hand in hand with ‘liquid’ modernity, in which the traditional strong identities have lost their grip on individuals. However, this ‘rootlessness’ and fluidity of identity appear recently to have been replaced by widespread claims for strong
identity, durable belonging and less ambivalence in both advanced societies and the so-called developing countries. This chapter proceeds in the following steps. First, it introduces the concept, or rather the variety of concepts, of identity. Second, it focuses on three clusters of identity research surrounding expected functions of identity: the cognitive perspective, the self-esteem perspective and the collective action perspective. Third, it deals with two ongoing debates on identity: identity in nationalism, and identity politics. The chapter draws on research from social psychology, sociology and political science in order to show the types of conceptual, methodological and normative problems with which research on identity has been confronted over the past 30 years.
The Concept(s)
When we use the term identity in the social sciences, we often refer to collective identity, connecting the individual to groups (Kaina and Karolewski, 2013). This, in turn, gives us insights into various forms of collective behavior. Still, there is little consensus among scholars on what exactly identity means: it can refer to (1) something groups and individuals have (e.g. Abrams and Hogg, 2004); (2) something groups and individuals are (e.g. Cross, 1985); and (3) something groups and individuals do or say (e.g. Wodak et al., 2009). However, certain aspects, borrowed from social psychology, can be found in many approaches to identity in the social sciences. One such aspect is self-categorization as a group member. Henri Tajfel (e.g. 1981) strongly influenced research on identity by focusing on social categorization. In his work, Tajfel defined a person’s perception of belonging to a group as one component of group identification, which can be unveiled in people’s attitudes (Tajfel, 1981: 132).
In Tajfel’s view, the necessary (sometimes even seen as sufficient) step in the process of identification is self-categorization, serving as a benchmark exhibiting commonalities between ‘me’ and the ‘others’ (members of the same group) and labeling distinctions between ‘me’ and the ‘other others’ (members of other groups). A further step would be social categorization, which is how individuals are categorized by others. However, some scholars argue that social categorization, even if central for identity formation, is insufficient for people to develop collective identity vis-à-vis a given group, as ‘being categorized does not automatically mean to take on this label as an aspect of self-identity or to see oneself as sharing something with others so categorized’ (Fuss and Grosser, 2006: 213f; Kaina and Karolewski, 2013). A further controversial aspect pertaining to the concept of identity is the question of whether identity is an artifact resulting from social construction, or rather ‘naturally’ evolved (Cederman, 2001: 141–3). Even though there seems to be a sort of consensus on the artificiality of identities, some students of collective identity question the manmade nature of collective identities, in particular national identities. There are, roughly, two major theoretical paradigms explaining collective identity formation: essentialism and constructivism. ‘Essentialists’ believe in the existence of cultural ‘raw material’ within a society, from which political collective identities develop. In contrast, ‘constructivists’ see the active role of intellectuals and political entrepreneurs as central, for instance, in manipulating cultural symbols and mobilizing cultural cleavages (Cederman, 2001: 142). Even more, collective identities can be conceptualized as narrative constructions (or ‘stories’) which are the objects of identification. 
In this context, some students of collective identity highlight that identities form around social constructions of difference, which also rest upon processes of categorization and attribution. The ‘stuff’ of these social constructions may vary, and cover
norms and values but also primordial features such as gender or race. Since identities appear to be more than just social categorizations, how are the cognitive perceptions of belonging to a group transformed into emotional bonds among community members? This question shows the relevance of the distinction between ‘belonging to’ and ‘belonging together’ (Kaina and Karolewski, 2013). While ‘belonging to’ represents mainly a cognitive category, ‘belonging together’ stresses feelings of commitment, one’s willingness to espouse solidarity, as well as readiness to make a personal sacrifice for the well-being of the fellow group members. Also with regard to this issue, there are different positions in the scholarly debate. Some scholars stress that people’s awareness of ‘belonging together’ is mainly constructed by elites (e.g. Cederman, 2001). Others argue that feelings of togetherness may be linked to individuals’ belief in the specific value of the collective and its relevance for the fate of its members. This position relies on Tajfel’s argument that individuals aspire to such group memberships, through which they develop (or maintain) some sort of psychological gratification (Tajfel, 1981). In other words, the preservation and improvement of individual self-esteem is a motivation for someone’s identification with a given collectivity. In large collectives, the individual gratification results from the conviction that one shares those precious commonalities with one’s fellow group members, rather than with people from outside the community. These ‘imagined communities’ (Anderson, 1991) are believed to transform ‘belonging to’ into ‘belonging together’. At the same time, there appears to be a consensus in the scholarship that collective identities require the delineation between in-group and out-group members. The valuable features of a collective matter because they are contrasted to the perceived dissimilarity of out-groups.
However, concerning the relationship between in-groups and out-groups, scholars also point to the potential ‘dark side’
of collective identity, which can generate contestation and conflict between different groups. Even though several scholars argue that collective identities do not have to generate aversion to others, the in-group/out-group antagonism seems to be a latent phenomenon. It can be mobilized, in particular, when insiders believe that outsiders present a threat to the in-group. Furthermore, there is a question of the relationship between the individual and collective elements in people’s overall identity. According to Norbert Elias (1991), individuals maintain a We–I balance, where We-feelings relate to collective identity of individuals and the I-component describes the personal and idiosyncratic aspect of individual identity. As a result, different constellations of individual identity can exist – some more collectivist, others more individualist. The We–I balance can lean towards the I-component while containing a minimal amount of We-feeling (weak collective identity). If there is a continuous long-term process shifting towards I-identity, we can label it individualization, during which the We–I balance turns to the I–We balance. Equally, the We-component can dominate the We–I balance, leading to a collectivization of identity (strong collective identity). Against this backdrop, collective identity can undergo long-term or medium-term changes in the process of identity construction. For Elias, the major source of We-identity has been the modern state, which advanced people’s identity by integrating the hitherto excluded classes of the population through the concept of membership equality in the modern state. However, this development did not occur before the 20th century, as the loyalty and mobilization of the population gained significance only as a result of mass warfare, which worked as an ‘equalizer’ (Elias, 1991: 207). 
The resilient and strong collective identity generated by modern states resulted from the integration of the modern state via national citizenship and its far-reaching role as a survival unit during wars, which in turn
underpins the emotional nature of the ‘we-ness’ (Elias, 1991: 219). According to the optimal distinctiveness theory, individuals define themselves as much in terms of their group membership as in terms of their individual achievements (Sherman et al., 1999). Since an individual’s self-concept is shaped by two opposing needs – the need for assimilation (collective identity) and the need for differentiation (individual identity) – people aspire to become part of larger social entities, while at the same time they want to feel unique and original. However, too much of either identity is expected to generate the opposite motivation, which provokes efforts to change the current level of social identification. In the ensuing sections, I focus on various functions of collective identity, as they are debated and explored in the vast body of identity literature. This will demonstrate the different perspectives from which collective identity can be looked at, and which specific questions are raised in current research. Against this backdrop, the research on identity will be discussed based on three clusters of identity functions that inform the following perspectives: (1) the cognitive perspective; (2) the self-esteem perspective; and (3) the collective dilemma perspective.
Functions of Identity
The Cognitive Perspective
A number of theories from social psychology expect relevant cognitive functions from collective identity. This cluster of identity research focuses on collective categorization and depersonalization processes in groups, through which individuals perceive themselves as members of groups. These cognitive processes have the function of reducing social complexity for individuals and thus decrease social uncertainty (e.g. Hogg, 2000). Once self-categorization as a group
member has occurred, the individual comes to perceive himself or herself as indistinguishable from other members of the group and increasingly different from members of contrasting categories. As a result of this depersonalization process, the importance of the individual’s personal identity is diminished, and the importance of the person’s social (collective) identity is increased. Accordingly, the perceived boundaries between self and other group members are weakened and a higher salience of the boundaries between in-group and out-groups occurs. Furthermore, due to depersonalization the successes and failures of the in-group are incorporated into the self-concept and perceived as personal outcomes, and in this process out-groups can come to be viewed as alien and threatening. Against this background, some authors argue that increased interactions among individuals due to increased multi-cultural experiences can fuel new identity conflicts, as identities clash with each other by reproducing ethnic, religious and socioeconomic cleavages (e.g. Huntington, 1993). This results from the globalization of the economy and human affairs that tends to make some individuals ontologically insecure and existentially uncertain. As a response to such insecurity, some students of identity expect more frequent attempts to reaffirm the self-identity by approaching groups ‘offering’ to decrease insecurity and existential anxiety. Consequently, a rise of ‘collectivism’ (or shift towards We–I balance, to use Elias’ concept) could be expected, particularly in forms of religion and nationalism, as a reaction to ontological insecurity associated with cultural fluidity, mobility and complexity of the outside world. Zygmunt Bauman (2017) discusses exactly this wish to return to ‘tribes’ in his book Retrotopia, in which he points out that people construct in their minds the safe and well-known world of the past that actually never existed.
Bauman – one of the leading scholars of liquid modernity, previously proclaiming the imminent end of the identity
grip of the nation-state – attempts to come to terms with the new surge of world-wide nationalism and concedes that even fluid identities might be prone to ‘chronicity’, as they can become ‘frozen’ and conflictive, rather than flexible and compatible. In an extreme form, the revival of ‘collectivism’ can foster prejudice against ‘others’, particularly if group membership is based on ascriptive and ethnic characteristics. In his seminal work on ‘cognitive prejudice’, Henri Tajfel (1969) indicated that the ‘blood-and-guts model’ of group membership can lead to dehumanization of out-groups and as a consequence to aggression against them, while the entire process occurs already at the cognitive level. In this sense, depersonalization occurs both in the in-group and vis-à-vis the out-group, since the out-group comes to be viewed as a depersonalized agglomerate of similar members. Since categorization as a boundary-making mechanism is conceptualized as a vital mental process, from which individuals cannot escape, it is also strengthened and forged through political practices. For instance, states classify people by assigning them to categories which are associated with group identities. As a result of censuses, states ‘freeze’ the salience of certain collective identities through political practices of categorization. Even if census categories may not coincide with the self-understanding of the categorized social groups, they are frequently utilized by cultural and political actors to draw the boundaries of collective identity. Therefore, states often impose ethnic or even racial categories on individuals, institutionalizing them in documents; this often has political consequences for these identities, including discrimination against minority identities by state authorities (Jenkins, 2000).
In sum, the cognitive perspective on identity suggests that ethnicities, nations, minority groups and other collectivities are mental and social constructs, which are necessary for individuals to develop self-concepts as group members in the context of social
complexity and uncertainty. However, the cognitive logic of collective identity gives political authorities a tool for drawing boundaries in society by recognizing and classifying people as group members. In this sense, state authorities construct and reconstruct both collective similarity and difference, using them as templates for organizing perceptions of belonging as well as frames for socio-political comparisons (Brubaker and Cooper, 2004: 47).
The Self-Esteem Perspective
A further cluster of identity literature points to yet another function of collective identity beyond the reduction of uncertainty and social complexity: the acquisition of a positive self-image from group membership. In this view, the drive for positive social identity fuels the process by which one’s self-esteem is produced and maintained through favorable comparison to an out-group. In this perspective, groups (but also larger collectivities, such as nations) are viewed as social resources used by individuals for their psychological benefits (Correll and Park, 2005). Against this backdrop, individuals can use various strategies for collective status improvement. For instance, they can leave the collectivity and become a member of a more positively evaluated group (this is more difficult with nations). The permeability of the group boundaries is a relevant criterion in this context, since some groups are difficult to abandon in light of their sanctioning mechanisms, such as attribution of the stigma of a traitor. In addition, individuals may attempt to redefine the inter-group comparison process by selecting other reference points or standards of comparison or devaluing out-groups, so their own group appears more attractive. In particular, the latter strategy is connected with so-called othering: an active strategy of demarcation and a juxtaposition of the in-group against the ‘other’, where the ‘other’ frequently acquires a more durable image than in the cognitive approach.
This is visible, for instance, in nationalism studies, where ‘othering’ is the primary tool of forging national identity. According to Michael Billig, nationalism tells ‘us’ who ‘we’ are by relating ‘us’ to ‘them’ (Billig, 1995: 78). Nationalism as a specific sort of collective identity, therefore, responds to the needs of a society to create and recreate its own ‘others’. Particularly in times of crisis, the significant ‘other’ becomes activated in the collective identity of individuals, since the construction of ‘us’ versus ‘them’ helps in overcoming crises by using blaming and scapegoating strategies. However, what matters is not only the construction of the ‘other’ as such, but also the perceived attributes of the ‘other’, since the constructed nature of representing the ‘other’ is significant for the consequences of collective identity. There can be significant and insignificant ‘others’, but only the significant ‘other’ becomes the relevant reference for collective identity formation. As Billig (1995: 80) argues, foreignness is subject to scrupulous distinctions between different groups of ‘others’, created and recreated in debates about the meaning to ‘us’ of various groups of foreigners. When the ‘other’ is constructed as threatening, it may produce xenophobia and violence against the ‘others’. Still, the ‘others’ can also be constructed as inferior, which boosts a given group’s feelings of supremacy and grandeur and might lead to a stigmatization of ‘others’. Moreover, the ‘others’ can assume positive attributes. Some students of nationalism argue that there can exist positive images of ‘others’, for instance, forged by the consideration of the minority as legitimate or unjustly discriminated against throughout history (Petersoo, 2007).
Once individual self-esteem is increased by lowering the perceived value of others, majorities in crisis might be prone to finding scapegoats for their perceived misery, which can result not only in discrimination but also in stigmatization or victimization in cultural terms. Goodey (2002: 135–58) argued more than 15 years ago that there is a visible
tendency in some European nation-states for political or public stereotyping of migrants as ‘undesirables’ or as potential criminals. This ‘criminalization’ of migrants is frequently based on the selective focus on some offences or crimes committed by migrants while simultaneously ignoring criminality committed among the majority group. Even more, in this view the EU societies seem to uphold a sort of hierarchy that exists with the citizens of the EU member states at the top, followed by other EU nationals, with non-EU nationals at the bottom of the ladder. Goodey points out that this hierarchy concerning non-EU nationals seems to be based more on color than on nationality. However, group identities do not have to cause inter-group conflicts, stigmatization or violence. For instance, the likelihood of identity clashes strongly depends on the type of inter-group comparison (Turner, 1999). When the in-group and out-group do not stand in a competitive relationship to each other with regard to a salient comparative dimension, conflicts are less likely to occur. In addition, scholars argue that even though identity affects conflict behavior, it is mediated by the degree of insecurity, implying that increased feelings of security correspond to more cooperative behavior.
The Collective Action Perspective
A further cluster of collective identity research relates to something well known in the social sciences as ‘dilemmas of collective action’, in which the individual rationality of interdependent actors leads to collectively irrational outcomes and is thus detrimental to the provision of collective goods. A number of collective dilemma situations are described in the literature, including free-riding and the prisoner’s dilemma, all of which problematize the clash between individual and collective rationality (e.g. Axelrod, 1980). Next to the rationality-oriented solutions to collective action dilemmas discussed in the debates on cooperation problems (e.g. Wu and
Axelrod, 1995), collective identity has increasingly been debated as one of the possible explanations for problems of collective action (e.g. Gavious and Mizrahi, 1999). For instance, group identification has been viewed as a significant factor in explaining political protest behavior. Klandermans (2002) emphasizes that of the three components of identity – cognitive, evaluative and affective – the latter appears to be the most relevant for explaining the readiness of individuals to engage in protest actions. At the same time, the relationship between identification and protest participation seems to be a double-edged one, since identity promotes participation, and participation in turn reinforces identification (Klandermans, 2002: 898). Moreover, identity is likely to reduce actors’ responses to the ‘greed component’ in the prisoner’s dilemma (the motivation to ‘free-ride’ on the cooperation of others), but it does not appear to have any effect on the responses to the ‘fear component’, that is, the motivation to avoid being ‘exploited’ (Simpson, 2006). Furthermore, the social movement literature stresses that collectively framed grievances can generate a sense of ‘we-ness’ which is directed at ‘them’, who are deemed responsible for the grievances in question (della Porta, Chapter 39, this Handbook). For instance, if political authorities are regarded as the culprits in a given context, it can politicize collective identity up to a point where the collective action dilemma is rendered meaningless. This is likely to be the case when political authorities are unresponsive or react in a repressive manner. Collective identities generated in a social movement can even produce a revolutionary mobilization, thus overcoming the collective action dilemma in revolutions (e.g. Kuran, 1989). 
In this sense, the concept of collective identity in the literature on social movements refers to shared representations of the group based on common interests and experiences, but it also pertains to a process of shaping and creating an image of what the group stands for and how it wants to be viewed by others.
As a result, collective identity is not merely a psychological reaction or state-governed categorization, but an achievement of collective efforts (Polletta and Jasper, 2001). Other scholars examining social movements stress the cultural background of collective action, as actors are believed to draw elements from their cultural repertoire and adapt them to their movement’s purposes. In this sense, collective grievances might not be the only source of collective identity, as individuals are likely to use their specific cultural repertoire as well as the wider cultural context of a social movement. Here, collective identity is explained less through individual rationality and more in a structuralist manner of the wider cultural context favoring collective identity (e.g. Williams, 1995). A further argument pertaining to the solution of collective dilemmas through collective identity can be found in neo-Marxist literature. Jon Elster uses Marxian class consciousness as the solution to the problem of collective action. Once a class is able to overcome the free-rider problem in realizing its own class interests, collective identity occurs at a class level (Elster, 1985: 347ff.). However, the form of class consciousness differs depending on the relations of production. Therefore, interests, identities and organization of workers and capitalists will vary, particularly regarding their demands on the state (e.g. Bottero, 2004). Nonetheless, in the Marxist approaches class consciousness is supposed to generate class solidarity, which also allows for effective organization of class interests against the interest of the opposing class. This transforms the class ‘in itself’ into a ‘class for itself’. The neo-Marxist account of collective identity is, however, divided regarding the issue of how collective identity comes about.
Whereas Marx assumed a creation of class consciousness by default depending on the critical degree of exploitation, some of his followers advocated an active role for the Communist Party in creating a collective identity of the working class (e.g. Lukács, 1971).
Ongoing debates on identity
Among the various contemporary debates on identity, I will focus on identity in nationalism and identity politics. There are other debates – for instance, gender and identity (Sawer, Chapter 6, this Handbook) – which are relevant and controversial, but these two appear to be the most encompassing ones, where several of the aspects of identity are integrated.
Identity in Nationalism
One of the major contemporary debates on collective identity can be found in the research on nationalism. There are various arguments pertaining to why national identity is pervasive in modern societies, but the approach of liberal nationalism emphasizes, in addition, the normative value of a national collective identity. Probably the most pronounced argument in favor of nationalism has been formulated by Liah Greenfeld (1999), for whom nationalism is a vehicle for dignity and equality in a modern society. Nationalism is, therefore, regarded as a unique form of social consciousness, which is historically anchored and hence not easily replaceable. In the same vein, authors such as David Miller (1995a) and Yael Tamir (1993) argue that national identity is conducive to individual enrichment in a moral and political sense. The collective bond of nationalism is supposed to deepen commitments and obligations between those who share it by providing an essential motivation for civic commitments. Hence, nationalism as a unique form of collective identity is expected to produce social trust, drawn primarily from the cultural layer of the community, in which deep obligations stem from perceived relatedness of the individuals. This argument has been applied to the conditions of modern (redistributive) economies, which require high levels of moral commitment in the form of mutual solidarity. Only
against the background of a high level of social trust can democracy function in a sustainable manner, since redistributive measures cannot be justified otherwise (Miller, 1995b: 26). The normative value of nationalism suggests a necessary strengthening of the nation-state against the pressures of globalization. It also entails granting national independence to communities striving for it, as nation-states are not only the basic but also the ‘natural’ organizational units for modern political communities. Even though scholars of nationalism agree on the historical contingency of national communities, this does not preclude the functional indispensability of nationalism in the context of modern statehood. Ernest Gellner (e.g. 1981) was probably the most prominent scholar of nationalism who stressed that nationalism is strongly associated with capitalism. Nationalism promoted the development of a homogenous language facilitating communication, and therefore enabled national sentiments to be constructed and preserved. However, the national communicative integration has been enforced by capitalism, which has made replaceability and equality between members of the national community necessary in terms of their functions for the cognitive and material growth of modern capitalist societies. The main distinction between the social psychology of collective identity and the nationalist perspective of collective identity can be found in the specific nature of national identity. Although most of the nationalism scholars agree on the constructed and contingent character of nationalism, they also agree that nationalism is not easily replaced, since it has been generated by strong forces of nation-building, often associated with violence, genocide and suppression of regional and local identities (Wimmer, 2013). Furthermore, national identity has become the dominant collective identity in modern human societies, as it went hand in hand with the establishment
of states as powerful identity-making agents. More strongly than any other political organizations, nation-states pursue policies of identity construction and reconstruction. Ernest Gellner (1995: 50) has argued that nation-states establish collective identity (a process he calls ‘exo-socialization’) mainly via standardized education systems. This allows for a communicative centralization of a modern society despite its cultural variety and complexity (Gellner, 1981: 753). The nationalism perspective not only frames national identity as ‘chronic’ in relation to other less resilient identities and as normatively valuable due to its trust-making potential, but also highlights its functionality with regard to modern industrialized societies. In this sense, national identity becomes the dominant and structurally ‘useful’ identity that is associated with the nation-state. Here, as long as nation-states remain the dominant units of political organization, national identity is bound to remain the primary and chronic form of collective identity. A further controversy in nationalism studies relates to the distinction between liberal and non-liberal nationalism. Non-liberal nationalism has played a prominent role in previously colonized nations of the Third World and in the nationalist conflict in former Yugoslavia. Scholars tend to expect non-liberal nationalism to be oppressive to minorities and marginalized members, since it views cultures as essentialist and static while rejecting the fundamental value of individual rights. Still, some authors defend the non-liberal nationalism of previously colonized nations by arguing that it can be justifiable. This communitarian argument constructs a moral agency of cultures, highlighting that even non-liberal cultures can hold emancipatory potential for individuals participating in the reconstruction of their national culture (Herr, 2006). 
In contrast, liberal nationalism highlights the significance of nationality for citizens and its role in the justification of liberal policies. Liberal nationalists (Tamir, 1993) argue that national
identity serves basic individual needs and is not only compatible with postulates of equality and individuality, but also should be fostered for the liberal-democratic state to function. Therefore, the dark side of nationalism is believed to be exaggerated, as many nationalism scholars tend to focus on the few cases of nationalist aggression and oppression and neglect the majority of benign cases of nationalism. Some argue that even the notion of ethnicity can be expressed in liberal terms, thus constituting a variant of the good life based on a synthesis of liberalism and ethnicity (Kaufmann, 2000). Others believe that there are sufficient reasons to distrust nationalism. For instance, Andrew Vincent (1997) points out that there is a distinction to be made between pragmatic (and reluctant) acceptance and a principled ethical esteem of nationalism. Nationalism can be empirically accepted as a currently pervasive form of group loyalty, but one should not bestow any ethical significance upon it, which legitimizes nationalism by transferring respect and dignity from individuals to nations.
Identity Politics
States, governments and political entrepreneurs are believed to apply identity technologies to citizens in an attempt to construct collective identity. These identity politics aim for collective identity in a top-down manner as citizens become receivers of a collective identity whose orientation is constructed ‘politically’. From the perspective of the elites, collective identity is an instrument to achieve two principal goals. On the one hand, political elites claim legitimacy in representing the collective concerns of the group; on the other, they control ‘deviant’ behavior of their political opponents within the group by interpreting what collective identity is (Castano et al., 2002). Identity politics relies on the fact that the majority of group members are unlikely
to have any direct contact with each other. Still, they draw on a horizontal feeling of belonging, which is expected to be sufficiently powerful to mobilize and legitimize political actions. This feeling of belonging is supported and stabilized by an ideology of ‘we-ness’, which holds that the defining characteristics of the group establish ‘authentic’ boundaries towards other groups. At the same time, these identity politics generate a legitimate basis for making political claims of either a special treatment or collective self-determination. In order to support their political claims, group leaders emphasize and manipulate shared myths, symbols and cultures associated with a particular territory or a particular way of life as factors which help to consolidate and maintain collective identity. Since there is no face-to-face communication among the majority of the group members, collective identity is transported via images and representations such as the national or regional soccer team, whose players most of the group members will never know in person. However, collective identity can have illiberal consequences in this context as well. A construction of fraternization and a development of communitarian loyalty have the downside of oppressive measures should a group member violate the norms of loyalty. Gellner (1995) labels this ‘tyranny of the cousins’. Frequently, myths of common ancestry are employed by political entrepreneurs because they have the capacity to modulate perceptions of self-interest and to add a moral dimension to political conflict. Therefore, political conflict (often of a redistributive nature) becomes one between the moral ‘us’ and the immoral ‘other’, and turns into an ideological battle. One of the controversies of identity politics is the possible scope of manipulation of collective identity by political leaders. This is particularly evident in the debate on nationalism as a particular case of legitimacy-seeking identity politics.
In contrast to modernists such as Ernest Gellner or Liah Greenfeld, primordialists represent the position that the
rise of the modern nation is related to the ethnic communities of the ancient and medieval world, rather than being the result of the making of modernity. For instance, Anthony D. Smith (1996) highlights how a nation’s emphasis on its own unique culture leads to a sense of ‘chosenness’ of its people, which develops historically as a result of ethnic conflicts. Consequently, national identities can be traced back to ethnic communities, which expanded to the detriment of other ethnic communities and, due to the emergence of a homogenous language and its origin in the ethnic past, gave birth to the political identity of nationalism. In this perspective, political leaders cannot freely manipulate the collective identity of nations, or, for that matter, of other groups with political claims, since ethnicities are the raw material for identity politics and cannot be easily ignored. In contrast, other scholars support the constructivist conception of identity politics as grounded in social interaction. In this view, collective identity develops through social processes, which encompass personal interactions with others but also symbolic exchanges of gestures and language, in which meanings are negotiated. In this sense, collective identity is a dynamic result of our everyday social interaction. Being socially shaped, collective identities are products of social processes – this means that they can change, but it does not mean that they are fluid. The classic symbolic interactionism puts a particular emphasis on interactions with ‘significant others’, who are emotionally important, parent-like figures, perceived as close. This perspective differs from the socio-psychological approach, in which ‘others’ are external representations of out-groups essential for boundary-making and in-group identification rather than for positive identification with the ‘significant others’ (Berger and Luckmann, 1966). 
In the framework of symbolic interactionism, ‘significant others’ are particularly powerful when they control emotional life, as parents do in early childhood. Therefore, identity politics frequently generates collective emotions
in order to legitimize political claims of given groups. In the process, collective identity becomes emotionally charged and relates to parental figures of political leaders (living or dead) or uses personification of the state as an emotional system of reference. A further aspect of identity politics relates to so-called cultural tacit knowledge, which includes habits and unconscious social patterns of behavior established by cumulative repetition and social routinization (Berg-Schlosser, Chapter 37, this Handbook). Against this background, identity politics is viewed as a long-term identity-creating project of governing elites with the aim of shaping citizens or future citizens from their early childhood. Identity politics delivers patterns with which to interpret the social environment and thus restricts possibilities of narrative identity building at the personal level. Usually it involves interpretation matrices, which are applied by individuals in the unconscious process of explanation of social reality. However, these matrices do not float freely, but are delivered by the political elites with the goal of fostering a certain collective interpretation of the community, relating both to its past and to its future (Marczewska-Rytko, Chapter 38, this Handbook). For this purpose, social and political events in communities are intentionally adapted in a retrospectively narrative manner. In the process, social events become parts of a composite system of interpretation, which offers citizens a grasp of both the inside and the outside world (Polkinghorne, 1996). Certainly, the best known methods of identity politics refer to the national identity construction perpetuated by nation-states, as they are in possession of institutionalized tools of official historiography, public education systems and mass media.
This kind of background process of identity construction was analyzed by Michael Billig (1995), who coined the term ‘banal nationalism’ to describe this phenomenon with regard to national identity. Billig observed that the everyday language used by the mass media, particularly in news and weather reporting, invokes ‘us’ and ‘we’ as a community on an
everyday basis, thus establishing a largely unconscious interpretation matrix for citizens who regard themselves as belonging to the same collective category. In contrast to traditional nationalism research, Billig argues that national community is not necessarily forged on special occasions such as national holidays and times of international crisis. Instead, it occurs first and foremost as a certain ideological habit of thought, which must be reproduced on a daily basis to be activated when needed. Only through piecemeal and subconscious construction can a stable and legitimacy-providing collective identity be developed.
Outlook
This chapter has focused on various facets of the research on identity. In addition to the sheer complexity of this research, there are various conceptual, methodological and normative challenges. First, there is divergence among scholars as to how to define collective identity. While the more skeptical scholars such as Brubaker and Cooper reject the concept altogether, there is a variety of definitions, ranging from categorization to identification and belonging. The vagueness of the term ‘identity’ is one of the greatest deficiencies when it comes to research on identity. Second, there are methodological problems concerning the measurement of collective identity. While social psychology has developed a sound toolbox of measurement based on the in-group and out-group distinction and perceptions connected with that, political science seems still to be preoccupied with questions of the origin of nations and their construction. In addition, there are unresolved normative issues, as some forms of collective identity – first and foremost national identity – are bestowed with ethical qualities by scholars, while the same identities are demonized by other scholars. Furthermore, there is the question of the normative assessment of identity politics or identity technologies administered in a top-down
manner, where citizens become ‘receivers’ of collective identity and the resulting identity construction might not be easily distinguishable from collective brainwashing, which raises the question of legitimacy.
References
Abrams, Dominic and Hogg, Michael A. (2004) Collective Identity: Group Membership and Self-conception, in M. B. Brewer and M. Hewstone (eds), Perspectives on Social Psychology: Self and Social Identity, Malden: Blackwell, 147–81. Anderson, Benedict (1991) Imagined Communities: Reflections on the Origin and Spread of Nationalism, London: Verso. Axelrod, Robert (1980) Effective Choice in the Prisoner’s Dilemma, Journal of Conflict Resolution 24(1): 3–25. Bauman, Zygmunt (2000) Liquid Modernity, London: Polity. Bauman, Zygmunt (2017) Retrotopia, London: Polity. Berger, Peter L. and Luckmann, Thomas (1966) The Social Construction of Reality: A Treatise in the Sociology of Knowledge, New York: Doubleday. Bernstein, Mary (2005) Identity Politics, Annual Review of Sociology 31: 47–74. Billig, Michael (1995) Banal Nationalism, London: Sage. Bottero, Wendy (2004) Class Identities and the Identity of Class, Sociology 38(5): 985–1003. Brubaker, Rogers and Cooper, Frederick (2004) Beyond ‘Identity’, in Rogers Brubaker, Ethnicity without Groups, Cambridge: Harvard University Press, 28–63. Castano, Emanuele, Coull, Alastair, Paladino, Maria Paola, and Yzerbyt, Vincent Y. (2002) Protecting the Ingroup Stereotype: Ingroup Identification and the Management of Deviant Ingroup Members, British Journal of Social Psychology 41: 365–85. Cederman, Lars-Erik (2001) Nationalism and Bounded Integration: What It Would Take to Construct a European Demos, European Journal of International Relations 7(2): 139–74. Correll, Joshua and Park, Bernadette (2005) A Model of the Ingroup as a Social Resource,
Personality and Social Psychology Review 9(4): 341–59. Cross, William E. (1985) Black Identity: Rediscovering the Distinction between Personal Identity and Reference Group Orientation, in Margaret Beale Spencer, Walter R. Allen, Geraldine K. Brookins (eds), Beginnings: The Social and Affective Development of Black Children, Hillsdale NJ and London: Lawrence Erlbaum Associates, 155–71. Elias, Norbert (1991) The Society of Individuals, New York: Continuum. Elster, Jon (1985) Making Sense of Marx, Cambridge: Cambridge University Press. Fukuyama, Francis (2018) Identity: The Demand for Dignity and the Politics of Resentment, New York: Farrar, Straus and Giroux. Fuss, Daniel and Grosser, Marita A. (2006) What Makes Young Europeans Feel European? Results from a Cross-Cultural Research Project, in Ireneusz Pawel Karolewski and Viktoria Kaina (eds), European Identity: Theoretical Perspectives and Empirical Insights, Muenster et al: LIT, 209–42. Gavious, Arieh and Mizrahi, Shlomo (1999) Two-Level Collective Action and Group Identity, Journal of Theoretical Politics 11(4): 497–517. Gellner, Ernest (1981) Nationalism, Theory and Society 10(6): 753–76. Gellner, Ernest (1995) The Importance of Being Modular, in John Hall (ed.), Civil Society, Cambridge: Cambridge University Press, 32–55. Goodey, Joanna (2000) Non-EU Citizens’ Experiences of Offending and Victimisation: The Case for Comparative European Research, European Journal of Crime, Criminal Law and Criminal Justice 8(1): 13–34. Goodey, Joanna (2002) Whose Insecurity? Organised Crime, Its Victims and the EU, in Adam Crawford (ed.), Crime and Insecurity: The Governance of Safety in Europe, Cullompton: Willan, 135–58. Greenfeld, Liah (1999) Is Nation Unavoidable? Is Nation Unavoidable Today?, in Hanspeter Kriesi et al. (eds), Nation and National Identity: The European Experience in Perspective, Zürich: Rüegger, 37–54. Herr, Ranjoo Seodu (2006) In Defense of Nonliberal Nationalism, Political Theory 34(3): 304–27.
Hogg, Michael A. (2000) Subjective Uncertainty Reduction through Self-categorization: A
Motivational Theory of Social Identity Processes, European Review of Social Psychology 11: 223–55. Huntington, Samuel P. (1993) The Clash of Civilizations?, Foreign Affairs 72(3): 22–49. Jenkins, Richard (2000) Categorization: Identity, Social Process and Epistemology, Current Sociology 48(3): 7–25. Kaina, Viktoria and Karolewski, Ireneusz Pawel (2013) EU Governance and European Identity, Living Reviews in European Governance 8(1): http://www.livingreviews.org/lreg-2013-1. Kaina, Viktoria, Karolewski, Ireneusz Pawel and Kuhn, Sebastian (eds) (2016) European Identity Revisited: New Approaches and Recent Empirical Evidence, London: Routledge. Karolewski, Ireneusz Pawel (2010) Citizenship and Collective Identity in Europe, London: Routledge. Kaufmann, Eric (2000) Liberal Ethnicity: Beyond Liberal Nationalism and Minority Rights, Ethnic and Racial Studies 23(6): 1086–1119. Klandermans, Bert (2002) How Group Identification Helps to Overcome the Dilemma of Collective Action, American Behavioral Scientist 45(5): 887–900. Kuran, Timur (1989) Sparks and Prairie Fires: A Theory of Unanticipated Political Revolution, Public Choice 61(1): 41–74. Kymlicka, Will (1995) Multicultural Citizenship: A Liberal Theory of Minority Rights, Oxford: Clarendon Press. Lukács, Georg (1971) History and Class Consciousness, Cambridge: MIT Press. Meyerfeld, Jamie (1998) The Myth of Benign Group Identity: A Critique of Liberal Nationalism, Polity 30(4): 555–78. Miller, David (1995a) Reflections on British National Identity, Journal of Ethnic and Migration Studies 21(2): 153–66. Miller, David (1995b) On Nationality, Oxford: Clarendon Press. Mounk, Yascha (2018) The People vs. Democracy: Why Our Freedom Is in Danger and How to Save It, Cambridge: Harvard University Press. Petersoo, Pille (2007) Reconsidering Otherness: Constructing Estonian Identity, Nations and Nationalism 13(1): 117–33. Polkinghorne, Donald E. (1996) Explorations of Narrative Identity, Psychological Inquiry 7(4): 363–7.
Polletta, Francesca and Jasper, James M. (2001) Collective Identity and Social Movements, Annual Review of Sociology 27: 283–305. Sherman, Steven J., Hamilton, David L. and Lewis, Amy C. (1999) Perceived Entitativity and the Social Identity Value of Group Memberships, in Dominic Abrams and Michael A. Hogg (eds), Social Identity and Social Cognition, Oxford: Blackwell, 80–110. Simpson, Brent (2006) Social Identity and Cooperation in Social Dilemmas, Rationality and Society 18(4): 443–70. Smith, Anthony D. (1996) Culture, Community and Territory: The Politics of Ethnicity and Nationalism, International Affairs 72(3): 445–58. Sorens, Jason (2012) Secessionism: Identity, Interest, and Strategy, Montreal: McGill-Queen’s University Press. Tajfel, Henri (1969) Cognitive Aspects of Prejudice, Journal of Biosocial Science, Supplement 1: 173–91. Tajfel, Henri (1981) Human Groups and Social Categories: Studies in Social Psychology, Cambridge: Cambridge University Press. Tamir, Yael (1993) Liberal Nationalism, Princeton: Princeton University Press. Turner, John C. (1999) Some Current Issues in Research on Social Identity and Self-categorization Theories, in Naomi Ellemers, Russell Spears and Bertjan Doosje (eds), Social Identity: Context, Commitment, Content, Oxford: Blackwell, 6–34. Vincent, Andrew (1997) Liberal Nationalism: An Irresponsible Compound?, Political Studies 45(2): 275–95. Williams, Rhys H. (1995) Constructing the Public Good: Social Movements and Cultural Resources, Social Problems 42(1): 124–44. Wimmer, Andreas (2013) Waves of War: Nationalism, State Formation, and Ethnic Exclusion in the Modern World, Cambridge: Cambridge University Press. Wodak, Ruth, de Cillia, Rudolf, Reisigl, Martin and Liebhart, Karin (2009) Discursive Construction of National Identity, Edinburgh: Edinburgh University Press. Wu, Jianzhong and Axelrod, Robert (1995) How to Cope with Noise in the Iterated Prisoner’s Dilemma, Journal of Conflict Resolution 39(1): 183–9.
32 Interest Group Systems in the Age of Globalization Liborio Mattina
Interest Groups: what they are, what they do
Introduction
Interest groups are the most numerous social actors in liberal democracies. Even more than political parties, they contribute to creating a vast network of links between civil society and the economic world on the one hand and political institutions on the other. Interest groups represent, in other words, the society that organizes itself to present its preferences to public institutions asking for answers. Therefore, interest groups feed a flow of solicitations towards policy-makers that are vital to the proper functioning of political systems. Yet, interest groups are less studied than parties or other classic sub-fields of political science. The reasons for this lack of attention derive from the conceptual difficulties inherent in the definition of the object of this research and from the methodological
problems related to the need to measure the activity of such groups and to assess their impact on the functioning of democratic regimes. In this chapter, we show the problems scholars face when they study interest groups and the solutions adopted to overcome them. Then we inspect the most important subject addressed in this sub-field, namely investigation of the biased – or unbiased – character of the group system in liberal democracies. In the second section of the chapter, similarities and differences between the group systems of liberal democracies and those of some of the most important countries committed to economic and social development will be examined. We will conclude with some remarks on the conditions that favor or hinder the formation of pluralist group systems.
A Definition
Interest groups are organizations of individuals who may come from homogeneous social
sectors (e.g. employees of metalwork companies) or from heterogeneous social realities, as with citizen associations that pursue diffuse interests. Associations of banks that try to influence policy-makers who regulate the financial sector are also interest groups, as are university consortia asking government for more investment to support higher education. Similarly, single industrial or financial companies asking policy-makers for specific measures in favor of their individual business must be considered interest groups. An interest group is, in other words, a label that can be applied to a large and heterogeneous population of social and economic actors, and to institutions. Thus, a wide range of groups have to be included in the same concept, so much so that ‘interest group’ takes in a large variety of expressions: political interest groups, interest associations, interest organizations, and so on. Many of these words are ‘tied’ to specific areas of research, going together with specific approaches and normative assessments. This heterogeneity of terms and objectives hinders the accumulation of results. This often leads to a fragmentation of research into non-communicating sub-groups that produce non-comparable case studies. To overcome these limits, the literature on interest groups needs a definition that circumscribes its field of application, but which also considers the great diversification that the object of research may assume. The task is not easy because, as we have seen, there are significant differences among the groups. An important difference that cuts the population of interest groups longitudinally is the distinction between organizations with membership and those without. The first includes those interest groups that are real associations formed by individuals or organizations (local authorities, churches, universities, hospitals, companies) in which members argue in support of reaching common positions to be presented to their counterparts. 
The literature on interest groups has paid much attention to this kind of association, developing classic topics such as the recruitment
of membership and its aggregation of shared objectives, the organization as a vehicle for political participation, the long-term political alliances established by groups with political parties, and lobbying activities. The interests that fall into the second category are not associations of individuals or organizations. They are single organizations that in some situations act as a lobby, but which do not share all the other activities of interest groups with membership. This, however, does not mean that this second type of interests – which can be labeled organized interests to distinguish them from interest groups with membership – can be overlooked, because they consist of municipalities, regions and states (in federal systems). In addition, the business community is part of this category of interests, as well as all other organizations that are active in the most important policy networks. To better define the scope of research on groups, it is also necessary to distinguish interest groups from political parties, to identify their customary ways of political initiative and their political targets. To this end, it may be useful to propose the following definition: ‘Interest groups are formal organizations, usually based on individual voluntary membership, which seek to influence public policies without assuming government responsibility’ (Mattina, 2011: 1219–20). This definition states that interest groups, unlike political parties, do not try to acquire direct control of public offices through electoral competition. Instead, interest groups limit themselves to influencing policy-makers. The main aim of interest groups is to ensure that the policies approved by public actors are in tune with their preferences. It is worth emphasizing that the approval of favorable public policies is the main objective of the groups’ activities because in this way it is possible to limit the scope of action in a sectoral or sub-sectoral dimension.
This makes it easier to delimit the number of groups to be considered in relation to the specific issue that is under examination.
This definition also links the study of political behavior of such groups to the sub-field of policy analysis. Moreover, it suggests not limiting research to the mobilization process and the lobbying activities, but also giving attention to the conduct of these groups during the implementation of policy-making. Finally, the above definition points out that activity is exercised through influence. This is the most important qualification of interest groups and at the same time their most controversial feature, because it is difficult to make a reliable empirical measurement of the degree of influence that each group can exercise within the decision-making process. This problem discourages research on interest groups because their effectiveness in policy-making is never fully ascertained. The research, therefore, tries to circumvent the obstacles inherent in measuring influence by focusing attention on access and lobbying in the attempt to present convincing empirical evidence.
How to measure group influence?
Influence is a form of indirect power, exercised through persuasion, which aims to change the conduct of individuals without apparent external signs. Influence is difficult to distinguish clearly from political power because the latter can also be exercised through persuasion. Political scholars therefore run into the difficulty of circumscribing neatly the perimeters of these two political processes. Moreover, it is difficult to distinguish influence from power, because neither is easily measurable (Baumgartner and Leech, 1998: 58–61). Regarding influence, in particular, it seems impossible to quantify a political process involving certain groups and the behavior of rival groups, politicians, public authorities, the bureaucracy and public opinion (ibid: 13–14). In fact, any attempt to identify the real impact of the influence exerted by
interest groups is unsatisfactory, because observing negotiations among the relevant actors in policy-making does not allow all of the really important issues at stake to be identified with certainty (Dür, 2008: 1216–19). Often certain items are not even included in the political agenda. And other factors – for example, other legislators, party leaders, the legislator’s personal convictions, the media – can influence the choices of policy-makers to an even greater extent.
Access and Lobbying
Faced with the problem of measuring influence, scholars of interest groups shifted their attention to access and lobbying, which became proxies for investigating the groups’ impact on policy-making. Access is the attempt to come close to the public venues where the relevant decisions are taken. But even when a group has a ‘seat at the table’, access does not necessarily translate into influence. Opposing groups may have equal access and political actors can reject the demands made by interest groups. Public actors may even use access as an instrument to co-opt societal interests. Taking access as a proxy for influence is thus likely to lead to erroneous results (Dür, 2008: 1213–14). Moreover, access to one governmental body is insufficient to exercise real influence in a decision-making process that requires the intervention of several institutional bodies, as is the case in the US federal government and in the EU.
Lobbying is generally understood as one or more face-to-face meetings between representatives of an interest group and legislators, sought by the former so as to influence the decisions of the latter in a way that benefits the group’s preferences … In its broader form lobbying involves a wide range of initiatives including contacts with bureaucratic bodies, the premier’s office, the courts and parliament, the use of mass media, preparation of memorandums, the forging of links with individual functionaries and so forth. (Mattina, 2011: 1226)
Research on lobbying suffers from the same methodological weaknesses found in the study of access. Scholars often tend to use their own methods for measuring the impact of lobbying, without any comparison with instruments used by their colleagues. Moreover, scholars often start their research with the optimistic assumption that lobbying always has some real impact on public decisions, underestimating the fact that in the real world such groups face several obstacles. These can derive from the reluctance of public decision-makers to meet groups’ demands or from the high political salience of the issues at stake, which makes it hard for politicians to find solutions acceptable to opposing interests. This optimistic bias inevitably leads to overestimating the effectiveness of lobbying. Finally, scholars usually identify lobbying with proactive pressure, while in fact lobbyists spend most of their time monitoring the work of different policy actors, to try to obtain the inclusion of their proposals on the political agenda (Heinz et al., 1993: 380). More generally, lobbying studies suffer from the absence of a shared theoretical basis, and the narrow analytical perspectives adopted have led scholars not to build on the results obtained by others and, accordingly, to make little use of comparisons (Baumgartner and Leech, 1998: 126–37).
Lobbying in Washington and Brussels
Several limits found in the research on lobbying were surmounted by a new wave of studies carried out in the United States since the second half of the 1980s. These studies have been able to count on a better quantity and quality of institutional data, on a greater methodological awareness and on higher attention paid by scholars to the political–institutional context in which groups decide to favor one lobbying strategy over another. These changes allowed for important
research on lobbying at the federal level and within the states that increased knowledge of the ways in which lobbying is carried out (Gray and Lowery, 2000). However, despite this progress, researchers were not able to find an effective assessment of lobbying accepted by the entire community of scholars. The methodological and analytical problems encountered by US scholars have been more evident among European scholars of interest groups. Research on lobbying is less developed in Europe than in the United States because European scholars have displayed less interest in the topic, due to the importance attributed to the neo-corporatist approach (Lehmbruch and Schmitter, 1982). Neo-corporatist scholars take for granted that there is collusion between policy-makers and the representatives of the main economic interests, while they largely ignore the lobbying activities of non-economic groups. However, there has been a turnaround in this area over the past two decades, mainly generated by the greater political importance assumed by the institutional system of the European Union: this is based on multilevel governance, which offers many direct access points for individual groups and national associations. The greater powers acquired by the European Union imposed on national interests the need to promote their causes by working both in the domestic and the supra-national arenas. Research on interest groups increased accordingly, with steady growth in the number of surveys devoted to investigation of lobbying in the European Union (Mahoney and Baumgartner, 2008).
A Biased group system
The issue concerning the functioning of the group system presents important normative implications because it calls into question the quality of liberal democracies and raises the question of their legitimacy. It is, therefore, a topic that presents a serious challenge
to scholarship, whose reputation depends to a large extent on the ability to give an empirical answer to the question: Does the group system favor balanced access within institutions to interests operating in society and provide an equal ability to influence policy-making, or are institutions more attentive to the demands of the few at the expense of the many? This issue has been addressed by several pieces of research dedicated to access and lobbying in the United States and European Union. The results on access indicate the groups that are physically at hand in a given policy network. The results on lobbying aim to go one step further because they do not take for granted that the groups permanently present in the policy arenas are also the most influential. Several factors can prevent even the groups most active in the policy-making venues from exercising real influence on policy-makers. Studies on lobbying must therefore be considered as the attempt that comes closest to identifying the actual influence of interest groups.
Group Access in Washington
The literature on interest groups in the US case registers a bias in favor of business and professional groups with regard to access. Walker (1983) and Schlozman and Tierney (1986) showed that about three-quarters of the groups represented in the mid-1980s in Washington were associated with economic and professional interests. The subsequent surveys carried out by Schlozman et al. (2012) on data concerning nearly 14,000 organizations registered in the 2006 Washington Representative Study confirmed the persistence of a bias in favor of business and professional groups, although the presence of institutions, especially representatives of state and local governments, universities and hospitals, also increased. Public interest groups, unions and groups representing the poor were the most
underrepresented. In the past 25 years, their modest capacity for access to Congress has stagnated or become worse. The difficulties of access for US unions are related to the more general weakening of the US workers’ organizations, whose membership in the private sector fell between 1981 and 2010 from 21.4% to 11.9% (ibid: 87). Echoing Schattschneider (1960), Schlozman and colleagues concluded from the persistence of the bias that, although in Washington the pluralist choir became larger, there was no change in either its accent or the assortment of the voices that compose it (2012: 345; Schlozman et al., 2018: 583).
Group Access in Brussels
The findings derived from research in the United States are confirmed to a large extent for interest groups in the EU. In fact, the results of the available research indicate that business and professional associations have easier access to the EU (Rasmussen and Carroll, 2013). The increased presence of these associations – which represent about 80% of the organizations present in Brussels (Eising, 2007: 393) – results from legislation that favors economic actors with interests in cross-border transactions (trade, investment, production, distribution) (Stone Sweet and Sandholtz, 1998). The development of the EU’s regulatory activity and the presence of economic interest groups in Brussels go hand in hand. This trend took root to the detriment of diffuse interests that often turn to the European Parliament to politicize certain issues (environment, health, consumption). By contrast, business groups prefer to deal with these issues through an exclusively technical approach within the committees that support the work of the European Commission and the Council. A survey carried out on the components of a sample of 124 expert committees assisting the work of the European Commission found that 72% of the representatives of groups participating in the committees’ activity were
representatives of business and professional associations (Mahoney, 2004: 450). Among business groups, multinational companies took a dominant position within the euro-groups, without renouncing individual access (Eising, 2007). The considerable availability of organizational and financial resources, as well as international experience and expertise, enabled multinationals to gain an advantage over euro-groups and national trade associations representing similar interests (European Parliament, 2003: 13–16). Multinationals have therefore become the privileged referent of the Commission, to which they offer ‘good’ information for the preparation of legislation (Broscheid and Coen, 2007: 25–9). The bias for access in favor of large business groups is not uniformly spread across all policy arenas. It is greater in some but less pronounced in others, because the disjointed nature of the EU decision-making process does not create a cumulative advantage for groups that prevail in some policy networks and leaves open the opportunity for many others to find the proper venue to promote their causes (Mazey and Richardson, 2001: 234). The imbalance in access is therefore mitigated by the pluralism of the institutional system of the EU.
Lobbying in Washington and Brussels
Are the dominant groups present in the most important policy arenas also the most influential? To answer this question, the literature on interest groups concentrated its efforts, as we have seen, on the study of lobbying.
Washington
With regard to the United States, some surveys – based on a vast amount of empirical data referring to the past 20 years – allow a satisfactory answer to the question of the
possible correspondence between the dominance of certain groups in policy arenas and their influence. Research by Baumgartner and Leech (2001) on the quarterly reports that companies involved in lobbying must submit to Congress in accordance with the law allowed detection of a greater presence in Washington of entrepreneurial and professional groups, which overall represent 65% of the total, compared to 10% represented by non-profit groups, citizens’ associations and trade unions. The scrutiny of the reports also made it possible to distinguish between a limited number of contentious issues, discussed in policy networks that attract many groups, and the more than 50% of total issues that are not contentious, on which barely 3% of lobbying was focused. The entrepreneurial and professional groups were active both in the crowded and contentious policy networks and in the non-contentious ones, where it is enough to suggest the inclusion of a few lines to exert a substantial effect on the outcome of the policy. In contrast, trade unions and groups representing widespread interests were active mainly in the first type of policy network. Moreover, the significant position of the business and professional groups in congressional policy-making was confirmed by the amount of money spent on lobbying. The data collected by Baumgartner and Leech show that entrepreneurial groups and professional associations spent nine times more than non-profit groups and citizen associations, equaling 85% of total expenditure in the time span considered by the research. Large companies spent more than half of the total money invested in lobbying. In conclusion, the research by Baumgartner and Leech shows that entrepreneurial and professional groups are the most present and the most active in Congress, where they have the most contact with decision-makers.
Moreover, they invest a far greater amount of financial resources in lobbying than other groups do, they are the best represented in the policy networks and they enjoy the advantage of protected positions within
different ‘niches’. They find in the status quo (Baumgartner et al., 2009) a powerful ally for the protection of their interests. In the end, these data may not be sufficiently exhaustive to establish unequivocally the greater influence of the business community in decision-making in Washington, but they get very close to this goal.
Brussels
The results of research on the EU offer more discordant indications than those reported for the United States. Some authors believe that in Brussels, much more than in Strasbourg, the business community exerts a significant influence on the institutions of the EU. Others doubt that it has this greater capacity for influence. Concerning the former, since the beginning of the 1990s, several authors have asserted that the EU decision-making system favors business interests. More recent studies confirm the evaluations of the previous generation’s authors (Bunea, 2014; Hermansson, 2016). They identify the expert groups that assist the European Commission in the preparation of legislative proposals as an effective vehicle through which business groups have greater opportunities to see their preferences reflected in the final text of policy proposals (Chalmers, 2014). The bias in favor of business groups through expert groups seems to have been confirmed to some extent by the results of a recent European Parliament inquiry concerning the censurable omissions of the European Commission – exploited to their advantage by the automotive companies – in monitoring the implementation of European regulations aimed at reducing harmful emissions of carbonic acid gas into the atmosphere (European Parliament, 2017).
Other authors propose a more problematic assessment of business groups’ ability to influence Brussels and Strasbourg. According to some, business groups can play an important role in drafting the reports that the European Parliament sends to the Council
only when the European business federations present unitary positions to the most important parliamentary committees, and on issues that have little political salience (Rasmussen, 2015). In any case, business groups often face coalitions of interests that include groups representing diffuse interests. In addition, business interests find allies in most of the member states. The outcome of such confrontations is usually a compromise, which is often beneficial to the coalition to which diffuse interests belong (Dür et al., 2013). Therefore, the most recent research on interest groups’ influence on EU policy-making is dissonant with regard to the assumption that business groups have a decisive influence on European legislation.
How to explain these divergent interpretations? Certainly, in the case of the EU, there are greater difficulties in collecting well-documented data sets than is the case in the United States. These difficulties in the United States, as regards the federal level, have largely been overcome thanks to the mandatory registration of lobbying activities. Scholars of groups active in Brussels must instead rely on data sets that are inadequate for producing shared results. Scholars also still use different methodologies for data processing. The time periods covered also differ. Without ignoring these divergent findings, we can perhaps consider them as discordant interpretations of a relationship that has evolved over time. This assessment is in line with the statement by Scharpf (2001) that business groups were among the main protagonists of the institutional evolution of the EU with the adoption of the Single European Act (1986) and the Maastricht Treaty, and were those who benefited most. However, the process through which the EU established methods for regulating business groups after Maastricht – particularly in relation to consumer rights and environmental protection – involves costs for the business community that the EU often refuses, at least in part, to support.
It is therefore not infrequent that attempts to obtain approval for such requests are frustrated.
To conclude, there is a bias, both in Washington and in Brussels, in the groups system that derives from the dominant position held by the economic groups close to the institutional places relevant for policy-making. This position frequently translates into a greater ability of the business community to influence the decisions taken by political actors. This trend is more pronounced in Washington and less so in Brussels.
Some explanations of the privileged role of business groups in democratic regimes
Moving from the results of empirical analysis to the theoretical contributions, the literature on interest groups explains the causes of the bias in the groups system, distinguishing between the political and the structural power of the business community. The first manifests itself through greater availability of money, expertise, financial and organizational resources, which make both access and lobbying easier. The second relates to their ability to decide when and where to invest. For this reason, according to Lindblom, political authorities strive with all means at their disposal to support large companies, consequently guaranteeing them a privileged position in policy-making (Lindblom, 1977). For Lindblom, however, the privileged status of business groups is incongruent with the founding principles of liberal democracies: ‘The large private corporation fits oddly into democratic theory and vision. Indeed, it does not fit’ (ibid: 356). Claus Offe (1977) comes to similar conclusions, although he adopts a neo-Marxist position that is alien to the liberal tradition to which Lindblom belongs. According to Offe, political actors can enjoy substantial autonomy from the dominant economic groups, but are forced to establish privileged relations with the capitalists because, not having their own resources, they have an interest in supporting
the economic conditions of reproduction of capital which constitute the material basis of public finances. The strength of structural conditioning exercised by business groups over political actors risks serious consequences for the legitimacy of liberal democracies. According to Dahl and Lindblom, ‘businessmen play a distinctive role in polyarchal politics that is qualitatively different from that of any interest group. It is much more than an interest group’ (1976: xxxvi). Going further, Dahl (1985) concludes that the influence exerted on public institutions by entrepreneurial groups affects the outcomes of the democratic process, creating social and political inequalities that nurture one another, reduce possibilities for citizen participation and prevent implementation of redistributive policies, while renewing the influence of the business community on political life. The drastic judgment of Dahl, echoed by Lindblom’s and Offe’s observations, is related to the idea that public institutions have little chance of countering business group preferences when they are accompanied by the threat of reducing investments or of re-allocating them to other countries to oppose unwanted measures such as high taxation or strict regulatory policies. However, these positions are questioned by scholars who adopt a neo-institutionalist approach.
Groups and institutions: historical institutionalism
The main assumption of historical institutionalism is that institutions are prior to interest groups and shape their ability to exercise influence on policy-making (Hall and Taylor, 1996; Immergut and Anderson, 2008). For example, Coen and Richardson (2009) argue that EU institutions substantially shape the patterns of lobbying. This tendency is due to the fact that institutions can impose institutional constraints, more or less relevant, deriving from
their history and their practices over time (path dependency in the sense of Paul Pierson), from their internal articulation (the more or less large number of institutional veto players involved in the decision-making), and, last but not least, from the greater public legitimacy they enjoy compared to interest groups. Based on these assumptions, historical institutionalism distances itself from a certain indeterminacy which pluralists often adopt to describe the characteristics of the state in liberal democracies, and rejects the assumption of neo-Marxists who classify the state as an instrument in the service of the ruling class. The state is, instead, conceived as an actor that can operate autonomously in the decision-making process and is able to make independent choices in times of economic crisis (Gourevitch, 1986). Therefore, when important institutional changes occur within a political system, interest groups adapt their strategies to the new conditions deriving from institutional transformations (Steinmo, 2008). Conversely, norms and institutions are particularly permeable to the influence of groups when they are not adequately equipped to regulate sectorial pressures. According to Hacker and Pierson (2010), this is the case for the United States’ institutional system, which discourages the formation of disciplined political parties at the federal level, offers many channels of access to public institutions and is articulated in autonomous central institutions that often compete. The institutional system also allows endemic parliamentary obstructionism.
The combination of these institutional characteristics allowed the groups of the business community – better organized and endowed with considerable financial resources – to engage successfully in promoting highly unequal tax legislation for the benefit of 0.1% of the population and prevented the approval of laws that effectively protect workers in their workplace and savers from the risks of stock market speculation. By contrast, in other cases (France and Sweden), when institutions favor the formation of governmental
autonomy from the influence of interest groups, they can launch important reforms – such as health care reform – which, despite the opposition of medical associations, create a public service that benefits the entire population (Immergut, 1992). In the end, according to historical institutionalism, institutions have different characteristics from one case to another, but always influence the behavior of interest groups. The general assumptions of historical institutionalism seem, however, inadequate to grasp the adaptations that business groups devised in the recent past in various national contexts following the transformations introduced by globalization.
Globalization and structural bias
Globalization reduces the powers of the territorial state in contemporary democracies and increases the power of the business community over governments. In particular, the integration of financial markets generated a huge increase in funds raised on the international capital market and increased companies’ ability to collect and transfer capital across national borders. For territorial states, it has become a priority to offer incentives and facilities to the business community to prevent private investment from migrating to more attractive shores. As a result, while firms’ ability to influence governments has increased greatly, national states have increasingly been forced to promote tax and labor policies that benefit companies and penalize large sectors of the population (Crouch and Streeck, 1996; Strange, 1988). Therefore, it seems appropriate to reconsider the relationship between states and interest groups, overcoming the rigidity of the neo-institutionalist assumptions. To this end, Wolfgang Streeck (2010) argues that the weakness of the neo-institutionalist approach derives from a static vision of political change, which is conceived as a set of occasional fluctuations that leave unchanged
the institutions and the relationships between them and the protagonists of capitalist development. On the contrary, Streeck supports the idea that change fueled by the transformations induced by contemporary capitalism can generate radical changes. Streeck’s criticism of the neo-institutionalist approach finds an empirical foundation in the research that has critically tested the assumptions of the literature dedicated to the ‘varieties of capitalism’. This literature pays special attention to the various institutional solutions adopted by liberal democracies to structure economic policy (Hall and Soskice, 2001). But the empirical findings of the comparative research undertaken by Baccaro and Howell (2011) show that the permanent divergence in existing institutional solutions in different countries in the relationship between capital and labor proved to be perfectly compatible with the increase in discretionary choices by entrepreneurs in the workplace. In other words, business communities everywhere gained power over salaried workers, regardless of the institutional set-up existing in different countries, while governments have not been able to play their traditional role as ‘neutral’ actors promoting balanced labor policies compatible with the preferences of the various stakeholders. It remains to be noted that the greater indulgence of political actors towards business groups risks overturning a basic principle of European democracies according to which people have acquired the right to welfare benefits (establishment of public services in several sectors such as health care, transport, water, education) by virtue of their status as citizens and not because they can buy them on the market (Crouch, 2004).
Is the interest group system pluralist?
Do the conclusions reached by the most recent research allow us to state that the
pluralist structure of the interest group system has been replaced by a system with a strong elitist connotation? The results of several pieces of research carried out by scholars who identify themselves with the neo-pluralist approach do not authorize such a drastic statement because they show that the interest system in advanced democracies offers many opportunities for groups to organize, and institutions still offer many points of access (Berry, 1999; Lowery and Gray, 2004). Furthermore, associations of diffuse interests and trade unions show a good capacity for mobilization when the issues on the table have a strong political salience because they concern problems that affect large sectors of the community and arouse strong passions based on opposing ideological preferences (Czada, Chapter 34, this Handbook). However, the pluralism of the interest groups system is biased in favor of business groups because the most important decision-making arenas (industry, finance, technology, international trade, agriculture, etc.) are poorly covered by associations that represent diffuse interests. Moreover, the bias, as we have seen, has increased in the past three decades, following the weakening of the state’s prerogatives vis-à-vis multinationals and large financial groups. Overall, the logic of pluralistic competition helps the dispersion of power and the opening up of the political system, but it cannot guarantee equal opportunities for all groups to influence public decisions. The possibilities to successfully influence public policies depend to a large extent on the different endowments of resources – economic, organizational, education, expertise, prestige – which citizens possess before entering the arena of pluralistic competition.
This problem was highlighted by Dahl and Lindblom many years ago: ‘We cannot move closer to greater equality in political resources without greater equality in the distribution of, among other things, wealth and income’ (1976: xxxii). In other words, democracies should offer all citizens some form of substantive equality, because mere equality of
opportunities offers only the equal opportunity to become unequal (Schaar, 1967).
Pluralist interest group system and its alternatives
Pluralism of opportunities and selective benefits enjoyed by business groups are two of the main characteristics that distinguish group systems in the advanced democracies with which we have dealt in the previous sections. Regarding the first, we must add that pluralism improves the quality of democracy because it helps to increase controls on the activity of policy-makers, contributes to the growth of democratic debate on the advancement of human rights and can promote a more egalitarian distribution of national resources through confrontation among the various stakeholders. Taken together, these qualifications do not reduce the existing imbalance in favor of the business community, but pluralist group systems can contain it, although the historical age of neoliberal globalization does not seem the best time for this to happen. In other words, pluralism seems to be the most important quality of the interest group system. It is therefore worth reflecting further on the conditions that make the pluralist interest group system achievable in the advanced democracies, while it seems infeasible in authoritarian (China) and hybrid (Russia) regimes and problematic in Latin American presidential democracies (Brazil) or in the oldest Asian democratic regime (India).
China
China is a civilian authoritarian regime characterized by the strict control that the Chinese Communist Party (CCP) exerts on the media and on any form of political activity. The political stability of the Chinese regime depends on the economic dynamism that the political elite has been able to create and the systematic use
of repression of any form of independent politics (Freedom House, 2018a). The economic diversification initiated by the CCP in the late 1980s was based on the privatization of several sectors of the economy. Today, the CCP includes several members of the business community who were previously senior cadres of the party (Li et al., 2008). The intertwining of relationships between party cadres and state officials on the one side and business groups on the other forms a lucrative alliance that fuels the already widespread corruption, which is one of the main obstacles to the government’s efforts to improve social equality while pushing for economic development (Xiangwei, 2006). As in other East Asian countries (Japan, Taiwan, South Korea), the Chinese political leadership has created a strongly corporatist system to support intensive economic development. But Chinese corporatism presents a significant difference. While the neighboring countries maintained a centralized state corporatism in the early decades of industrialization, China has reduced state seizures and controls the economy and society through CCP surrogates, namely the hundreds of thousands of social organizations that strive to interpret the demands of the complex Chinese society while maintaining the priority of promoting the state’s interest (Unger and Chan, 1995). None of the Chinese social organizations controlled by the Communist Party is committed to the promotion of human rights. Individuals attempting to promote human rights are imprisoned and often tortured until they confess that they have acted against ‘national security’ (The Guardian, 2017). In Chinese state corporatism, unions are called upon to carry out the double role of guaranteeing state interests and representing workers’ complaints over low pay, underpayment of social insurance and abusive management regimes.
The only state-permitted union – All China Federation of Trade Unions (ACFTU) – claims to have embarked on reforms for improving worker protection. But in redressing workers’ grievances, ACFTU
functions more as an agency for legal assistance using a strategy of problem-solving on a case-by-case basis than as a labor organization that defends workers’ collective interests in a proactive way. In other words, given its position in Chinese state corporatism, ACFTU operates mostly as a state instrument with the purpose of preventing or stopping any labor action, or simply disappears when conflicts occur (Chen, 2003). The Chinese experiment of state corporatism shows that an authoritarian regime does not contemplate a group system independent of state control, and prevents the development of a politically autonomous civil society by maintaining its repressive grip both over citizens who ask for promotion of human rights and over workers who claim rights for better work conditions and higher wages.
Russia
Russia is a hybrid regime characterized by a constitution that stipulates political pluralism, freedom of speech and the existence of multiple sources of information. In reality, what prevails are restrictions on political competition and interference in local and regional elections in ways that prevent citizens from invoking their right to change their government. The central government controls many forms of media and often infringes on freedoms of speech and expression, pressures major independent media to abstain from critical coverage and harasses and intimidates journalists into practicing self-censorship (Gelman, 2015). Within the regime, the interest group system is characterized by the dominant position taken by large economic groups loyal to the central power. These groups, with the support of the central government from 2001 onward, formed major associations of entrepreneurs and created a permanent coordination between the state’s interests (mostly concentrated on the control of natural resources) and the biggest corporations. This alliance
between the state elite and big business continues to define the development prospects of Russian society and economy (Ledeneva, 2013). Alongside the political–economic elite, a plethora of informal clan-based interest groups, which are firmly intertwined with the institutions of the regime, has developed. These clans use their influence on the state machine to secure benefits and privileges in a system that does not offer firm legal guarantees for protecting rights and property (Kimmage, 2009), while corruption continues to be widespread throughout the executive, legislative and judicial branches at all levels of public institutions (Freedom House, 2018b: 9). The work of nongovernmental organizations (NGOs) is arduous because it is hampered by obstructionist measures of the government. Restrictions are applied in a discriminatory manner, particularly to those NGOs that are receiving foreign funding or involved in issues of political opposition or human rights monitoring (Crotty, 2009). Moreover, security services and local authorities at times fabricate legal grounds for searches and raids on civil society groups. Independent unions are active in some industrial sectors and regions, but in practice worker rights are limited. The largest labor federation (FNPR) works in close cooperation with the Kremlin, while the right to strike is difficult to exercise (Olimpieva, 2012). As a matter of fact, the majority of strikes are considered technically illegal because they violate one or more of a complex set of procedures governing disputes. While the law prohibits anti-union discrimination, the police often use intimidation techniques against union supporters, including detention, interrogations and provocation of physical confrontation. Sometimes police pressure union activists to become informants (Kimmage, 2009). The Russian interest groups system is dominated by political–economic oligarchies and is repressive towards diffuse interests.
The group system is state-controlled and insensitive to the many pressures coming from an
expanding civil society. Given these characteristics, the Russian interest group system can move towards a more pluralist pattern only in the case of regime change; however, at present this seems unlikely.
Brazil
The group system in Brazil is a system of modified corporatism (Thomas, 2009). This system maintains some characteristics of the authoritarian state corporatism which was distinctive of Brazilian political life until the 1980s, while at the same time it has acquired some aspects of a pluralist group system. During the military dictatorship (1964–89), the state placed major controls on group organization. In particular, the authoritarian regime adopted legislation that transformed workers’ organizations into subordinate supporters of the central power. This structure of the group system has been a serious obstacle to the development of political pluralism since the demise of the authoritarian regime. Therefore, the Brazilian group system continues to provide a series of legislative facilities for the unions, but at the same time limits their freedom of action, making workers’ organizations often ineffective and prone to sectarianism (Lang and Gagnon, 2009). More generally, the Brazilian group system is characterized by a large number of cliques that are active within central and regional bureaucracies, as well as in government bodies. These cliques promote their interests through exclusive ties based on family and friendship relationships, fueling elitism and widespread political corruption. At the same time, the Brazilian group system has developed, mainly since the mid-1980s, several aspects of pluralist systems. In fact, in recent decades, a considerable increase in the number of groups has been registered (Oliveira Gozetto and Thomas, 2014). Moreover, the degree of institutionalization of many interest groups has increased, together with the range of active interests and
the tactics and strategies used. Because of its hybrid character, the current Brazilian interest group system appears similar to that of Mexico, Argentina and Peru. In other words, it is a system that is neither entirely free from dependence on the state nor adequately open to the demands coming from civil society, which, to a large extent, distrusts a system that is perceived as elitist, lacking in transparency and a generator of corrupt practices. The Brazilian group system’s inadequacy to aggregate the demands that come from the majority of the population could favor the emergence of populist phenomena similar to those that appeared in Venezuela and Bolivia in recent decades (ibid: 237).
India
India, the largest democracy in the world, offers many points of access to interest groups thanks to its federal structure based on the decentralization of power at the state, local and village levels (panchayat). India is also one of the countries with the greatest social heterogeneity, as well as strong inequalities and extensive poverty. The Indian economy has also shown a significant increase in economic growth since 1991 as a result of external pressures in favor of a modernization of the economy derived from neo-liberal globalization. The factors mentioned contribute to a pluralist characterization of the Indian group system. However, the great social heterogeneity – which nourishes territorial, linguistic, ethnic, religious, tribal divisions – contributes to the fragmentation of interests and the country’s institutional structure facilitates compartmentalization within the various levels of the federal system, hindering the effectiveness of these factors on a national scale. The political parties also contribute to the fragmentation of Indian groups, because groups are often ancillary associations of the many political parties that represent mainly local interests, while only a few have a federal dimension. In 2014, 1,600 parties were
registered to participate in the federal parliamentary elections. The compartmentalization of interests is one of the conditions of the stability of Indian democracy, because conflicts present at the decentralized levels of the institutional system are unlikely to reach the federal government, which remains stable thanks to the formation of coalition governments and a consociational style of decision-making (Hardgrave, 1993). In contrast, by staying subordinate to local parties, the compartmentalized groups contribute to feeding the separatist pressures that several Indian states cultivate to the detriment of national unity. The insufficient structuring of interests is also caused by the existence of strong social inequalities that hinder the political participation of large parts of the rural population, to which more than 60% of Indians belong. Rural India is mostly illiterate, segmented according to religion, caste and language, and often unaware of the welfare policies promoted by the federal government to alleviate poor social conditions (Sekhar, 2005). This situation benefits local farmers with large and medium-sized businesses, who control the patronage networks on which the rural masses depend, and ally with local governments to hinder or block federal policies of social equalization. The difficulties that the complex reality of Indian society poses to a modern structure of interests are to some extent surmounted by groups of the business community, which are, however, divided into competing organizations mainly on a territorial basis. The liberalization of the Indian economy at the beginning of the 1990s prompted the federal institutions – traditionally interventionist in the economy – to meet business groups’ demands to loosen the grip of regulations. Now the role of the federal government is more that of a facilitator than a regulator (Mitra and Singh, 2010: 35).
A consequence of this new approach is that Western-oriented policies often neglect sustainable development in the countryside, where many stakeholders remain cut off from the trilateral
decision-making process that involves politicians, bureaucrats and business groups. These trilateral interactions often generate corruption, which is a great disease of Indian administration and politics and deeply affects the daily relationships between citizens and politicians (Bhagwan, 2007). Besides the government and business groups, there are trade unions. In India, the top layers of unionized labor are the interests shaped in the tertiary sector by the economic pervasiveness of the Indian state. The majority of white-collar employees – teachers, professors, medical doctors, engineers, scientists and employees of governmental organizations – seem to be among the few who have the potential to act on an all-India basis (Charan, 1994: 152). By contrast, the heterogeneity of membership renders the unions of blue-collar workers unstable, fragmented and uncoordinated. Moreover, the blue-collar unions are restricted to the metropolitan areas and concentrated in large-scale factories; their number is negligible in the rural areas and in the informal sector of the economy, which comprises more than 90% of the Indian workforce. Finally, in the field of industrial relations, Indian trade unions are losing the battle for job preservation in the formal sector, against the neo-liberal trend that creates a shift to the informal sector with an accompanying increase in poverty (Jit, 2016). In conclusion, the Indian group system is pluralist but also fragmented and compartmentalized; what is worse, the most disadvantaged groups are excluded from it – and exclusion breeds protest. Therefore grievances against the government, especially in India’s rural eastern regions, are frequent, while in the urban areas thousands of protest groups are engaged in issues related to the environment, education, women’s rights, workers’ rights, protection from police abuses, religious persecution and gay rights.
These protest movements, which involve millions of people, are a permanent characteristic of the Indian political landscape. But the interest group system does not appear capable of aggregating these
interests and channeling them into institutionalized negotiations with the federal and state governments. This is the most important limitation of the Indian group system, and is also a structural weakness of the democracy of this great Asian country.
Conclusion
The discussion in the previous section leads to the conclusion that neither China nor Russia has a pluralist group system. The former lacks the two minimum conditions for its achievement: independence from the state of the channels which aggregate interests, and the autonomous development of civil society. In the latter, these conditions exist only in embryonic form and may regress. The Brazilian system of modified corporatism, unlike China and Russia, presents an incipient pluralism that nevertheless coexists with the survival of state controls on groups and opaque and discriminatory lobbying practices. India is the most interesting case because it signals that even a democratic regime may not be able to offer suitable opportunities for groups to organize the many demands that originate in civil society and in the economy. The limits of the Indian group system, which has apparent pluralist characteristics, derive from the extreme social heterogeneity and the enormous inequalities that still exist in the country. Both of these factors contribute to the formation of an interest group system that is fragmented and compartmentalized and which, moreover, shows a structural inability to integrate the needs of the countless poorest members of Indian society. The discussion of the different configurations that group systems assume in the contemporary world suggests that a group system has a greater chance of becoming pluralist when two requirements, one political–institutional and the other socio-cultural, are fulfilled. The first is related to the emancipation of the
group system from state control, which takes place when the institutional conditions that prevent its autonomous functioning are completely eliminated. The second is the robust reinforcement that group systems receive when civil society articulates itself in numerous ways that enter into the already existing organized channels of political demands. In general, it is wise to bear in mind that group systems have better chances of assuming a real pluralist configuration when both groups and governments operate within a system of effective checks and balances. This means that a long list of conditions is necessary to have a pluralist group system.
33
Parties
Daniel-Louis Seiler¹
The word ‘party’ refers to one of the oldest concepts used in modern political science. Its use in historical, philosophical or polemical vocabulary first appeared in the 17th century with the memoirs of Cardinal de Retz in France, followed by Viscount Bolingbroke in England and, above all, David Hume, who, in the mid-18th century, initiated what was to become the analysis of parties. Nonetheless, the word has been used since the Middle Ages to refer to the opposite sides in a civil war, for example. Even the etymology of the word is telling: ‘party’, ‘parti’ in French, ‘Partei’ in German, ‘partido’ in Spanish and even ‘partia’ in Russian and Polish, like their counterparts in many other languages, derive from the verb ‘partir’, which in medieval French meant to split into parts or divide. All European languages – including Slavic ones which use other terms, such as ‘strana’ in Czech or ‘stranka’ in Croatian or Serbian – use words that mean ‘side’. The idea is the same: to take sides or to choose one’s side or one’s camp in a political conflict. All definitions can be
grouped into three broad categories, which are sometimes combined. First of all, following Burke, parties can be defined according to the ideas they convey. Then, following Max Weber, Robert Michels and Maurice Duverger, one can insist on parties as organizations. Finally, the trend since the end of the 20th century has been to use the criterion of elections and the existence of a democratic, or at least representative, regime. The remark attributed to Max Weber that ‘parties are the children of democracy and universal suffrage’ is put forward to support this thesis. One should not, however, forget the classic definition given during the reign of George III by Edmund Burke: ‘A party is a body of men united for promoting by their joint endeavors the national interest upon some particular principle in which they are all agreed’ (Burke, 1770: 134). This definition remains the best, even if the subsequent evolution of political systems has left it incomplete and therefore imprecise. Here, we propose to use the term in the following way: a party is an
organization of individuals engaged in collective action, in order to mobilize as many individuals as possible against other equally mobilized individuals in order to accede, either alone or in coalition, to the exercise of government functions. This engagement and this claim for power are justified by a particular conception of the national interest. Below, we first discuss this definition in greater detail, then turn to the party’s historical origins and conceptualization in the European context, and finally come to contemporary forms of organization of the party in different parts of the world, and recent developments.
Definitional elements
As we have seen, (1) the party is the product of a collective organized action that is permanent and continuous in time. It is therefore intended to outlive the action of its founding fathers and continue throughout history for as long as it is able to mobilize the supporters that keep it alive. As institutions, parties present characteristics common to all organizations studied by organizational theory. Parties are in the category of association-type organizations, that is to say, based on voluntary membership and the choice of the actors: members, militants, elected representatives, leaders. If membership is automatically granted on the basis of birth, family or clan, the organization is not a party. (2) Any organization is structured according to an objective, which, in the case of a party, is to accede to the different functions of government: national, regional and local. Parties can exist which are limited to one or the other of these levels of government, as is often the case in Canada, for example. A political organization that does not strive for power, but merely for influence, is not a party. (3) Claiming power is not an end in itself: it is justified for the sake of the national interest, which the party intends to defend or promote depending on the
particular conception of the actors involved. Claiming power in the name of a particular conception of the national interest constitutes the raison d’être of a party and a condition sine qua non for a political organization to be a party. (4) The means of reaching the party’s objective, to which its organization is rationally geared, is the mobilization of as many individuals as possible. The most frequently used means is electoral mobilization and most parties were born with the establishment of more or less competitive representative political systems (see also Michels 1962). Democratization – whether gradual, as in the United Kingdom, or abrupt, as in France – gave rise to the development of parties. By contrast, in authoritarian systems non-electoral modes of organization and mobilization exist. These can be peaceful – meetings, demonstrations, strikes, petitions, such as with Chartism in Great Britain, Solidarity in Poland and the Civic Forum in former Czechoslovakia – or violent, as in uprisings, revolutions and so on. The common character shared by electoral mobilization and other forms of party mobilization is the appeal to the popular masses – that is to say, according to LaPalombara and Weiner (1966), that they are in some way striving for popular support. Partisan mobilization is carried out against individuals who are also organized with a view to acceding to government in the name of a different, often opposite, conception of the national interest. As we have seen, party means part (division) and therefore implies conflict. Jean Blondel (1978) sees behind every party ‘a protracted social conflict’ (pp. 137–141). As a corollary, it can be said that there are no parties without conflicts.
They always convey either a current, active conflict of which they are agents or a past conflict of which they are witnesses – an example is the opposition between Fianna Fáil and Fine Gael in Ireland, which corresponds historically to the struggle between anti-Treaty and Free State republicans in the bloody civil war of 1922–3. One of the major aspects of the seminal contribution by Seymour Martin Lipset
and Stein Rokkan (1967) is to have assigned to parties the function that summarizes all the others: agents of conflict and instruments of integration. The dialectic is the following: by expressing conflict, parties allow negotiation and contribute, at the end of a more or less long evolution, to pacifying political life. Conflict and integration, as well as the name party or part, imply ipso facto plurality and competition between parties. The term ‘parties’ means a system of parties, and consequently that there are at least two of them. A single party is a contradiction in terms: it is impossible to be at the same time a single entity and a part. This obvious fact was stated at the beginning of the 20th century by Max Weber. The contradictory concept of a ‘single party’ or a one-party system was nonetheless used at the time of the Cold War to designate the role of a party in communist or authoritarian systems. Almond and Powell (1966), for example, distinguish between parties that are ‘one-party pluralistic’, modeled on the PRI in Mexico before democratization, and those that are ‘one-party revolutionary, centralizing’ (p. 99). While contesting the logical pertinence of the ‘one-party system’ concept, Sartori (1976) nonetheless admits this category as the first level of his seven-rank typology. To clarify the debate, which has retained its historical pertinence, it is necessary to distinguish two totally different cases. On the one hand, as Max Weber presciently argued in his study of the ‘Parte Guelfa’ in the medieval republic of Florence, when one party eliminates its rivals and becomes incorporated in the apparatus of the state, it changes its nature and ceases to be a political party. It then falls under another sociological concept. 
Raymond Aron (1967), a disciple of Weber, applies this approach to the cases of fascism, Nazism and Stalinism and speaks of ‘monopolistic parties’, which hasten to eliminate all the other parties as soon as they come to power and thus change their nature. On the other hand, we encounter so-called parties founded after a military or non-military clique has taken over power.
These are sham parties set up in order to control the population, like those of certain authoritarian regimes, such as the Popular Movement of Zaire at the time of President Mobutu. With the changes of regime and the return to democratic forms of power, these so-called single parties disappeared along with the regimes that created them. By contrast, ‘monopolistic parties’ which had ‘changed their nature’ by eliminating their rivals have in most cases gone back to their original nature with the loss of power and the return to democracy. Thus, more recently, the Kuomintang in Taiwan alternated in government with the independence movement, while the PRI in Mexico and the Communist Party in the Russian Federation embodied the Loyal Opposition. Similarly, communist parties in Eastern Europe have been able to reinvent themselves. This was not the case for Salazar’s National Union in Portugal, or the Movement created by Franco to support his dictatorship.
Origins
The history of the appearance and development of parties corresponds to that of the scientific study of the phenomenon. Lipset and Rokkan (1967) note four thresholds in the evolution of a party: legitimization, incorporation, representation and majority power. One can apply these to every party and to every stage of the party system. Parties were not always considered to be legitimate. As a sign of conflict in societies seeking balance and harmony, they were associated with a form of evil. Political modernity, which developed with the disintegration of the feudal order in the Renaissance, was embodied in absolutist states and parties only emerged in times of crisis, civil or religious wars. Even with the establishment of a representative regime, the party was first perceived as something that divides and was equated with a faction. The timing of the legitimization of parties, the very idea
of a party system, depends on the country. Thus, three men of action who were also political thinkers reflect the same perplexity towards parties in three countries during three different periods: Bolingbroke in early 18th-century England, Madison at the time of the American Revolution, and de Gaulle in France in the mid-20th century. All three of them in their own way were deeply concerned about parties and their struggles and advocated national unity against all the divisions, yet finally themselves became involved in the struggles of parties. As a necessary evil, the party is always the party of the other, and the temptation to find other democratic paths was long present but doomed to failure. This was the case with the ‘era of good feelings’ desired by President Monroe after 1816, in order to put an end to the opposition between Federalists and Democratic-Republicans. In Great Britain, the phenomenon appears earlier and the writings of Hume bear witness to this fact; however, here it remains within the elite, unlike in the United States, where it concerned the masses. The distinction between parties and factions is the determining criterion and it is with Hume that this was clearly established. The origin of parties and their existence before a representative regime depends on the definition. If we retain the three criteria that we propose – (1) a particular conception of the national interest, (2) free organization and (3) mobilization – the Guelphs (13th century) were a party, even if their means of action were different from those of modern parties. Their fight against the Ghibellines, however, degenerated into a struggle between factions. Cavaliers and Roundheads, Whigs and Tories were also parties. When they were still badly organized, parties were discovered and studied by social scientists at first as carriers of ideas. Then, with the extension of the electoral franchise and civil rights, they were studied as organizations. 
Bryce, Michels, Ostrogorsky and, above all, Max Weber laid down the foundations in the late 19th and early 20th centuries.
Finally, the study of the mobilization of actors began in the 20th century with André Siegfried on electoral geography and Maurice Duverger on circles of participation in partisan activity. It branched out in many directions – militants, members, sympathizers and voters – and was favored by the development of the various forms of sociological survey research. These three approaches are necessary and have to converge if one is to understand (1) a given political party, (2) a national system of parties or (3), in a comparative manner, the classification of parties.
The raison d’être and the identity of parties
Because parties are devoted to the defense and the promotion of a particular conception of the national interest (Burke), many 19th- and 20th-century historians likened them to the great schools of political thought: conservatism, liberalism, socialism, Christian democracy, communism, fascism, and so on. The links between the two phenomena are obvious, except that parties – agents of conflict but also instruments of integration – are led in majority governments to betray their initial ambition and adapt themselves to the constraints of the exercise of power, to become institutionalized, to change their program and sometimes their ideology in order to become catch-all parties (Kirchheimer, 1965). One must therefore observe the social interests that are expressed through these ideas and justify the ‘particular conceptions of national interest’. Duverger saw in parties the translation of two successive class struggles: the conflict between the land-owning nobility represented by the conservatives and the capitalist bourgeoisie represented by the liberals, on the one hand, and that of the bourgeoisie against the proletariat organized by socialists, social democrats and labor parties, on the other. Liberals were confronted with the dilemma of whether to ally themselves with their former conservative
enemies against the peril represented by workers’ parties – the choice taken by liberals – or to accept alliances with social democrats – the case for the so-called radical parties. Duverger (1954) notes that some Christian Democratic parties in Catholic Europe or agrarian ones in some Nordic countries remained outside these class struggles, on which the dualism between left and right is founded. It is the superimposition of dualisms that generates multiparty systems. Duverger found that using the Anglo-American plurality vote – ‘first past the post’ – facilitates the establishment of a two-party system opposing conservative-liberals and social democrats. We find the same idea in Political Man (1960) by Lipset, who sees in parties the expression of social classes. For him, there are three such classes: (1) the upper class, supported by the Church, which is expressed in conservative parties; (2) the secular middle class, expressed in liberal parties; (3) the working class, expressed in labor, socialist and social democrat parties. The right, center and left form a democratic spectrum, to which corresponds an anti-democratic, extremist spectrum, including the communist extreme left and the authoritarian monarchist, clerical and reactionary extreme right as the expression of a refusal of change by a threatened upper class – illustrations include Salazar in Portugal, Horthy in Hungary, Dollfuss in Austria or Franco in Spain in the interwar period. The originality of Lipset is to show that the middle class also engendered an extreme center with Italian fascism and German Nazism, which was opposed to the extreme left and extreme right. Today, the concept of the extreme center is enlightening in the description of parties such as the FPÖ in Austria or the Rassemblement National in France and in the definition of a clear relationship between fascism and the former National Alliance in Italy. It is more precise and more scientific than the concept of populism. 
Lipset attenuated his position somewhat by remarking that certain traditional parties, such as Catholic parties, combine cultural conservatism with
Lipset attenuated his position somewhat by remarking that certain traditional parties, such as Catholic parties, combine cultural conservatism with
socio-economic reformism, and that new forces like the green parties mix cultural liberalism and anti-industrialist reaction, thus constituting a neo-bourgeois ideology. The approach set out in Political Man remains most fruitful but leaves aside the existence of interclass parties, which are nonetheless not catch-all parties. The most fitting example was Democrazia Cristiana, the Italian Christian Democrats, which was backed by a workers’ union (CISL) but also by certain employers, grouping together people with nothing in common other than the defense of the interests and values of the Catholic community. It included as many as nine tendencies (correnti) ranging from the pro-Marxist left to the traditionalist extreme right. This party, which was born out of antifascist resistance and anti-communism after World War II, broke up in an interesting way in the 1990s. The right was absorbed by Berlusconi; the center-right refused to join the People of Freedom party (PDL) created by Berlusconi in 2009; the center-left and left merged with the former members of the PCI to found the Democratic Party in 2007. In fact, the DC met the same fate as the French MRP 30 years earlier. The latter was also formed during the Resistance and dispersed its forces to the right, the center and the left. The Christian Democratic parties of the Benelux countries and Switzerland correspond to the same model as the Italian one. The same can be said about the German Zentrum party from 1871 to 1933, but not about the German CDU, which is no more clerical than the LR in France. By contrast, in Bavaria the CSU belonged to the tradition of classic, clerical conservatism, but clericalism also became contested in the party itself.
Historical Cleavages in Europe
Finally, if only in order to understand multiparty systems with more than six parties, it is necessary to use a multidimensional space,
as in the systematic model of the origin of parties in Europe set up by Lipset and Rokkan (1967). For them, cleavages are neither ephemeral oppositions nor contingent divisions, but structural effects which result from the political translation of profound traumatic changes that affected the history of a country or a group of countries. These conflicting effects are exerted along two axes: the functional axis and the territorial–cultural axis. In the case of Europe, originally marked by Catholic Christianity, two revolutions – the national revolutions and the Industrial Revolution – engendered four cleavages. The former broke religious unity, dividing countries born of the Reformation from those marked by the Catholic Counter-Reformation. Two cleavages were engendered: (1) along the territorial axis, center versus periphery, opposing unitary modernization to the resistance of the subject cultures of provinces and peripheral areas; (2) along the functional axis, Church versus State, opposing the modernizing, secular elite to the defense of the interests of the Church in the fields of education and values. The Industrial Revolution also generated two cleavages: (3) along the territorial axis, primary versus secondary economy, opposing landed interests to the rising class of industrial entrepreneurs; (4) along the functional axis, owners versus workers, opposing the interests of property, capital and business to the labor union movement which defended the interests of wage earners. These four cleavages are conveyed in the short term through the issues which oppose the parties, and in the long term through the party systems. In Catholic and Protestant Europe, on both sides of each cleavage, families of parties emerged that became established with the extension of suffrage and democratization. Since then, these cleavage-based party systems have been ‘frozen’ for a considerable time. 
Depending on the period, one may add that a cleavage can dominate electoral and parliamentary debate: Church versus State in Catholic countries in the 19th century; primary versus secondary sector in Sweden
at the same time; the center versus periphery cleavage in the Basque country or in Ireland. However, the most important cleavage in Western Europe since the crisis of 1929, and even before, has been owners versus workers: apart from the two cases cited previously, this forms the axis of the most frequent parliamentary constellations.
Socio-Economic Cleavages

Currently, the majority of parties are based on the functional–economic owners versus workers cleavage. In other words, on the one hand there are parties for the defense of owners, which have formal and/or informal links with employers, companies and the business community in general, but with a much broader electoral base which includes the middle classes. This family unites previously opposed parties, such as conservatives and liberals in Protestant countries, Switzerland and the Benelux countries. It also includes parties of other origins from former Christian Democratic groups, such as the CDU-CSU in Germany or former nationalists such as LR in France (this party includes former Gaullists, conservatives and liberals), as well as new parties such as the PSD in Portugal, the post-Franco Partido Popular in Spain, and Forza Italia, which merged with the post-fascist AN to form the PDL, a unified right-wing party. On the other hand, since the 19th century there has been a systematic development of parties for the defense of workers, which historically constitute the labor movement and maintain special links with labor unions. Their voters are salaried employees, mainly working class, but also some white-collar workers and civil servants. They were born in the wake of the Industrial Revolution from the convergence of four forces: two ideologies – the Jacobinism of the French revolution and radical philosophy, and social Christianity in Protestant countries – and two forms of political organization – the labor unions and cooperative movements,
and the internationalism of Marx, Engels and their disciples. The combination of these four ingredients, in variable proportions depending on the country, created three genetic models in the sense of Angelo Panebianco (1988) which developed into three traditions which are very visible today. The first is the labor tradition, born out of the failure of Chartism, which was translated into parties of ideological and religious pluralism, dominated organically by the labor unions in which, ideologically, social Christianity is slightly more important than radicalism, whereas the International and Marxism are minor or even marginal. The second is the social democratic tradition, born in Germany and dominated by the Socialist International and Marxism, in which trade unionism emanated from the party. These parties have kept a controlled and solid form of organization, which has particularly subsisted in the Swedish SAP and in the Austrian SPÖ. The third is the social democratic tradition of the French revolution of 1848 – marked by Jacobin radicalism and the republican and anticlerical struggle. From the start, these parties came up against distrust from the anarcho-syndicalist movement, which was hostile to any collaboration with parties. The anarcho-syndicalists combined theoretical anarchism founded on the rejection of the state with electoral politics, advocating a mutual benefits system, self-management and federalism with practical ‘bread and butter’ reformism within companies. Moreover, the union movement was divided by a new Christian labor movement, which was equally wary of party politics, and a communist labor movement which would ultimately supplant anarcho-syndicalism. The socialist-democratic tradition came to be embodied in weaker, intellectual parties that neither controlled the labor movement nor were controlled by it.
These socialist parties – French, Italian and Spanish – would practice ideological extremism and give more than their due to Marxism while practicing shortsighted reformism. The communist tradition is, in fact, a variant of the social democracy which was implanted in France and Italy, where it failed.
It is a kind of fighting social democracy which has accentuated its specific features: orthodox Marxism; centralized organization adapted to the political struggle in authoritarian regimes (democratic centralism); control over the unions which have become the ‘driving belts’; and above all the primacy of the International. With the Comintern, communist parties were the only parties with an international dimension, devoted for many years to the interests of the USSR, which was presented under Stalin as the fatherland of socialism. The similarity between the electorate of the French and Italian communist parties on the one hand and of the Austrian, German and Scandinavian social democrats on the other hand favored the incorporation of some communist parties into party systems, their unofficial social democratization including membership of the Socialist International. The PCI, the Italian communist party, was the first to take this path; it was soon to be joined, with the end of the USSR, by the Hungarians, Bulgarians and Lithuanians. The development of a Marxism sui generis adapted to Italy and the West, thanks to Antonio Gramsci and to the strategic intelligence and political savoir faire of its leaders, Palmiro Togliatti and Enrico Berlinguer, who were anxious to promote ‘a national path towards socialism’, helped this transformation of the PCI first into the PDS and then into the DS (left-wing democrats) and PD, with the gradual help of the Christian Democrats. The PCF (French Communist Party) remained Stalinist for a long time, headed by leaders of no great stature – mainly Thorez and Marchais – and did not take the opportunity to return to social democracy. In the Scandinavian countries where social democracy is strong, the extreme left is also of communist origin – the Danish SF (Popular Socialists) broke off links with Moscow in 1956, and in Sweden the communist party became the Left Party.
The situation has been the same in Finland since the end of the Soviet Union. By contrast, the left-wing socialists in Norway are a dissident Labor Party movement. The case
of Die Linke (the Left) in Germany is more ambiguous: the former communist party of East Germany, with its strong organization, merged with social democratic dissidents from the SPD. The most original of all parties situated ‘on the left of the left’ is the SP, the Dutch socialist party, an anti-capitalist protest party composed of former Maoists which has won up to 10% of votes.
Center–Periphery Cleavages

Outside the dominant socio-economic one, the other important cleavage is center–periphery, which relates to territorial defense and is divided into two opposed families. On the one hand, there are the parties of centralized state nationalism, which correspond historically to a unifying, imperialistic state nationalism which is economically protectionist and which, socially, carries policies that are favorable to a protective state. One is reminded of Bismarckism in Germany, which during the Weimar Republic became the German National Party (DNVP), and of Bonapartism in France, and then the republican current from which Gaullism originated. These parties, concerned about the authority of the state, are inclined to deviate towards authoritarianism, have engendered some extreme versions – the total state that identifies nation, state and leader in fascist totalitarianism in Italy; Nazism in Germany – and have been emulated by many others. These extreme center parties are wrongly assigned to the extreme right and characteristically attract not only voters from the working class, but also leaders from left-wing parties: Mussolini, a former socialist in charge of the newspaper Avanti; Jacques Doriot, a former communist deputy and the founder of the PPF (French Popular Party); the less conventional Oswald Mosley, a former Labour MP and the founder of the British Union of Fascists. After World War II, with its horror and crimes against humanity, fascism is no longer presented as such, except in the case of marginal groups which are not parties.
Nevertheless, the ideological ground has remained fertile and able to produce analogous parties, which out of caution tone down their discourse. Globalization, immigration and the crises since the end of the 1970s favored the rebirth of post-fascism in places where a previous tradition existed: the National Front of Jean-Marie Le Pen in France; the FPÖ – heir of the pan-Germanic nationalist current – in Austria; and the NPD (National Democrat Party) in some Länder in Germany, now challenged by the newly founded AfD. We may add, in Flanders, Vlaams Belang, the new name of Vlaams Blok, heir of the pro-Nazi VNV that existed between the two world wars. In other countries where there was no previous fascist tradition, new movements have emerged with a similar sociology and a less articulate discourse. Their creed resides in a radical xenophobia enhanced since the beginning of the 21st century by anti-Islamism. The longest-lasting case is the Danish popular party, DF, but the most spectacular is to be found in the Netherlands, in the LPF. That the founder and leader of the DF is a woman – Pia Kjaersgaard – and her Dutch counterpart in the LPF was a millionaire and a militant homosexual (who was assassinated by an animal rights activist) constitutes a break with traditional fascism; however, it is a mistake to consider them as new parties: their organization is new, yet the issues which fuel them – xenophobia and racism – have political roots that go back to the 19th century. Whether moderate or extremist, democratic or authoritarian, state-nationalist parties are – or were – authentic catch-all parties. The historical opponents of centralism are parties for the defense of the Periphery, sometimes regionalist and federalist, sometimes nationalist and separatist. They are the expression of ethnic or linguistic minorities and have a territory that is quite easily definable.
Their existence is not recent and most often corresponds to countries with an imperial structure: Austria-Hungary before 1919 had many such parties. Today, the oldest party to defend
the periphery is the Basque Nationalist Party (PNV), founded in 1895, which identifies with the cause of the language, culture and democratic traditions of the Basque country, where it is the main party. The Swedish Popular Party in Finland is also an old organization – early 20th century – which holds the monopoly of representation of the Swedish-speaking minority in Finland; for this reason it has participated in almost all government coalitions. The Scottish National Party, founded in 1925, only managed to break through in the 1960s, when it became alternately the second or third party in Scotland. Wales also has its nationalist organization, which is less strong and has fewer seats in Westminster: Plaid Cymru. Two parties which were created after World War II enjoy a majority in their region, though they are insignificant on a national scale: the popular party of South Tyrol in Alto Adige and the Val d’Aosta Union in Italy. In Flanders, the party defending the periphery, Volksunie, split in two, creating the more centrist New Flemish Alliance (NVA) and SPIRIT, which is more social-libertarian. Spain counts the largest number of parties of this type: in Euskadi, we must add to the PNV the nationalist, left-wing EA. In Catalonia, the moderate and centrist Convergència Democràtica de Catalunya moved quickly towards a more extreme stand. Under the new label of PDCat (Democratic Party of Catalonia) it became republican and separatist, just like its allies of the republican left of Catalonia. In Catalonia two parties have decided to fight in favour of independence: the left-wing Esquerra Republicana de Catalunya (which was the pre-civil war dominant party in the region) and the center-right PDCat, created by Jordi Pujol in 1977 under the name of Convergència de Catalunya; this party has ruled the Generalitat since the return of Spain among the democratic countries in 1977. Numerous Spanish regions have their autonomists.
The periphery also has extremists; sometimes these are violent ones but, unlike the terrorist movements of the 1970s such as the Red Brigades, they are endowed with a legal electoral voice which shows popular support. This was the case in Northern Ireland, where the IRA was linked to
Sinn Fein until Blair’s Good Friday Agreement, and in the Basque country, where ETA was linked to Batasuna until the party was prohibited by the Spanish courts. In Corsica, the situation is the same, but the nationalists are only represented on a regional and local level. There is one special case: that of Lega Nord – Northern League – in Italy, which has moved from defense of federalist positions to the instrumentalizing of xenophobia, and from enthusiastic Europeanism to staunch Euro-skepticism.
Church and State Cleavage

Another cleavage resulting from national revolution as defined by Rokkan is the cleavage between Church and State, which used to be of prime importance but now belongs to history. The Christian Democratic parties whose role was essential for European integration are in something worse than a crisis: the major such party, in Italy, broke up and now only exists as the Center Democratic Union, with a marginal role. The parties’ area of strength is limited to the Netherlands and the Benelux countries, where the Christian Democrats have lost a great deal of electoral weight; in Switzerland, the Christian Democrat party’s vote has collapsed dramatically. These parties were and still remain the best examples of inter-class, horizontal parties, that is to say, covering all the ground from the right to the left – from fundamentalism to progress through centrism. In fact, they reflected the sociological as much as the ideological structure of the Catholic subculture. They are not catch-all parties because, even when they are non-confessional, they embody the political will of believers and citizens steeped in Catholic culture. Due to dialogue between their bourgeois, agricultural and working class tendencies, Christian Democratic programs constitute a useful compromise enabling government coalitions either with liberals or social democrats. The latter explains their unparalleled longevity in government: Democrazia Cristiana participated in all the coalitions of the first Italian republic, but the record is held by the Luxembourg Christian Democrats, with over a century in power, followed closely by the Belgians and the
Dutch, at almost one hundred years. They are the axis both of center-right conservative coalitions in Belgium and the ‘Roman Blue’ in the Netherlands and of labor center-left coalitions in Belgium and the ‘Roman Red’ in the Netherlands. Even the most conservative party in Austria, the ÖVP (Austrian Popular Party), has since 1945 participated most often in so-called Red–Black governments alongside the social democrats, in spite of the former party’s questionable alliance with the xenophobic nationalists of the FPÖ. It is interesting to note that in the Czech Republic the only non-communist party to have lived through the Soviet era is the Czech Popular Party (CSL), which participates in all coalition governments as it did before 1938, its foundation dating back to the Austro-Hungarian empire (see also Seiler 2003).
New Cleavages

In addition to the old families of parties, there is a more recent one – the Greens. One thesis claims that the Greens are the embodiment of New Politics (Poguntke, 1993), based on postmaterialist issues such as quality of life, protection of nature and libertarian individualism, in opposition to the supporters of Old Politics, based on materialistic issues and values such as wage increases and ‘bread and butter’ issues in general. This idea was influenced by the work of Inglehart (1977), who developed his theory of post-materialism based on the proposition that the generation marked by the Great Depression followed by war and reconstruction was succeeded by a generation socialized in a context of prosperity, the ‘affluent society’ of the ‘golden sixties’. The new post-Industrial Revolution is said to have given rise to a new cleavage of materialists versus post-materialists, with the Greens occupying the ‘post-mat’ side and the ‘extreme right’ that of the materialists. The ecologists represent a new force, but the parties qualified as far right are as old as parliamentary democracy itself. The post-materialism adopted by social democratic parties partly explains the success of extreme center parties among workers. Stefano Bartolini and Peter Mair (1990) noted that the Greens participated
in what the two authors called intra-block mobility within the left, that is to say, the workers side of the owners versus workers cleavage. Later events seem to have proved them right, with the experiences of the Greens in government first in Finland and then in Belgium, and finally in Germany and France in coalitions with social democrats. In France, they owe the few seats they won in parliament to electoral alliances with the socialists. However, in Belgium and Germany, local coalitions with the right have been seen since the beginning of the 21st century. An intermediary hypothesis can be put forward, namely that the Greens stem from a restructuring of the territorial–economic cleavage – primary sector versus secondary sector – opposing the industrialized world to nature, which explains the mixture of postmodern and traditionalist features in the discourse. Their closeness to the left can be considered as the result of their hostility to capitalism, which destroys the balance of nature.
Parties as organizations

Parties are not biological organisms, nor do they have a lifetime association with a cleavage: they are autonomous forces. The German conservatives in the CDU (Christian Democratic Union) are an excellent example of a change of cleavage and of realignment. The CDU was created as a Christian Democratic party, heir of the pre-1933 Catholic Zentrum. Its founding programs defined a third way between Marxist collectivism and liberal capitalism through a Christian socialism founded on personalism and a respect for property, the quest for the common good and the principle of subsidiarity. The CDU and its Bavarian sister party, the CSU, had accepted anti-Nazi Protestant intellectuals from the ‘Confessing Church’. Under pressure from the Allies, who were concerned about the Soviet threat, and fearing the Marxism displayed by the SPD of 1945–6, the CDU opened its doors to Protestant conservatives who were not always former members of the Resistance. In Konrad Adenauer, the former mayor of Cologne and
a moderate, the CDU had a leader who turned out to be a true visionary both on the question of European integration and on the future of Germany. Certain that Germany would sooner or later be reunified and that in this context Catholics, who were roughly equal in number to the Protestants in West Germany, would once again become a minority, he reoriented the CDU along a more conservative line to take the place of the old Zentrum party and the German Right, which had been discredited for its support of Nazism. The CDU/CSU became a party with a majority vocation, inspired by the ‘catch-all party’ defined by Otto Kirchheimer (1965). As Peter H. Merkl (1963) noted, the CDU/CSU became a conservative party dedicated to defending the interests of industry, business and agriculture, with its popular voters (blue and white-collar workers) taking advantage of the spin-off from prosperity. As for Christian socialism, ‘it was a mere memory and to many of the party leaders not a pleasant one’ (Merkl, 1980: 32). However, the CDU was the first example of political realignment where the logic of organization took precedence over ideological considerations and programs. The CDU/CSU was able to considerably increase the number of its new voters, while to a great extent keeping its traditional electorate, and was not affected by the growing secularization of society. It increased its share from 31% in the 1949 elections to 50% in 1957. For the change to be successful, the cleavage had to be followed by lasting electoral alignment as defined by V. O. Key (1942). This requires good partisan organization. When, within a party, the requirements of organization – and therefore accession to power – are in contradiction with the preservation of its principles and identity, it either undergoes a re-foundation or it changes its identity and undergoes realignment towards an electorally more beneficial cleavage.
The organization of parties, therefore, conveys a different logic from ideas and cleavages, and needs to be studied separately. The concept most commonly used to classify partisan
organization is the opposition between mass parties and cadre parties. This is often attributed to Maurice Duverger, but he borrowed it from Max Weber and developed it. Duverger’s great contribution was to distinguish between parties of inside creation and parties of outside creation, depending on whether the founders were in parliament – a typical example is that of Whigs and Tories – or outsiders who had no access to power, not even to parliament. The groups that may exist before the organization of a party can be labor unions, associations, Masonic lodges, leagues – including terrorist ones. The cadre parties are therefore parliamentary parties resulting from the widening of the electorate, aimed at inciting new voters to enroll on the electoral register and support the party and the electoral committees of the candidates. Mass parties are parties created outside the spheres of power, whose only means of access is to have the largest possible number of voluntary activists and regular financial contributors. More than the number of members, it is the criterion of funding that distinguishes mass parties. The regularity and registration of contributions is very important in mass parties, whose internal legitimacy is embodied by their members rather than their electorate. It is necessary to add a characteristic that Duverger does not mention: the stability of leadership indifferent to the vagaries of the economic situation. Thus, during the entire 20th century, the Swedish Social Democrats had only five leaders, whereas other parties had many more. The Industrial Revolution and the expansion of means of transport, communication and propaganda leading to the development of cheap newspapers enabling national electoral campaigns and the running of centralized national bodies favored the action and development of mass parties until the 1960s.
Some cadre parties adapted to their competition – often parties formed later, like the British and Scandinavian conservatives who were organized by penetration from the
center to the periphery, becoming, according to Angelo Panebianco (1988), electoral parties: professionals endowed with strong leadership, recruiting large numbers of members and, above all, oriented towards their potential voters. The difference with mass parties is three-fold. First, and regardless of the number of members, the funding of the party is dependent on gifts from business or rich contributors, and not the members who sometimes do not pay their fees and are not excluded. Second, legitimacy and power belong to parliamentarians, entrusted by voters, whose opinions are more important than those of members. Finally, the survival of the leaders depends on their success in general elections. Some parties have kept a more archaic form of organization: a federation of electoral committees composed of local personalities, headed by a much more undisciplined parliamentary party and with a weak leadership. These less developed cadre parties are to be found in countries such as France, Spain, Portugal and, to a lesser degree Italy. Jean Charlot (1971: 201) suggests calling them partis de notables. They were the earliest parties to be organized and correspond to a model of organization by diffusion; that is to say, they were created on the initiative of constituency committees which, by coming closer and closer, ultimately became federated, hence from the bottom up. During the hundred years from 1860 to 1960, technical development favored mass parties which, in certain cases – Catholic Zentrum and social democrats in Germany; Catholic and socialist parties in Austria, Belgium and the Netherlands; French and Italian communists – managed, in the words of Sigmund Neumann (1956: 405), to ‘take charge of voters from the cradle to the grave’. These rigid mass parties had a very large membership involved in a network of parallel organizations for women, children, young people, cooperatives, travel agencies, sport clubs and choirs, not to mention a party press. 
According to Neumann, these are ‘social
integration parties’; in Germany during the Weimar Republic, they were even qualified as ‘social ghetto parties’. Not all mass parties reached such organizational perfection; some, what Duverger (1954) calls ‘flexible mass parties’, were content to rely on the voluntary work of their militants. But, as Jean Blondel (1978) remarks, not attaining a large membership is always a failure for a party. In fact, parties of social integration have a strong organizational culture, which generates ‘party patriotism’ and the attachment of members and even voters who are more concerned with the organization itself than the idea which it embodies. Flexible mass parties are in a perpetual debate over ideas, a source of divisions which are unproductive from an electoral point of view. Since the 1960s and the development of political communication centered on television and now the internet, mass parties – and especially the most powerful among them – have experienced a crisis of adaptation brought about by a decreasing membership. This has hit all the large social and political organizations, characterized by the omnipotence of image and entertainment, ensuring a domination of form and appearance over substance, style over ideas and the personality of the leader over the political party. Duverger thought – rightly at the time – that cadre parties were doomed but, thanks to a ruse of history, it is the mass parties which are the dinosaurs, and today the professional electoral parties (Panebianco, 1988) prosper in Europe. They are more reactive to variations in public opinion and more personalized, and, while easily disposing of leaders who fail, they are on the same wavelength as the media. As they lose more and more members, the old parties of integration tend to be reduced, to the advantage of their apparatus, with a large permanent staff paid by the party. The latter can be described by the concept of the ‘bureaucratic mass party’ proposed by Panebianco. 
Moreover, the increasingly high costs of election campaigns, along with the development
of the media and particularly television, have provoked a change in the organization of parties. First, whatever the social organization, individuals are less engaged in long-term action but are more willing to commit themselves in a limited way to a precise objective: parties lose members, but so do labor unions, and there are fewer practicing members of the Church. This poses a vital problem to mass parties confronted with a decrease in the number of their contributors on the one hand and the increasing cost of electoral campaigns on the other. They resort increasingly to funding by private enterprises, as other parties did, which changes their nature. As the mass parties are often unpopular with the business community, they are driven to use practices which in some countries are considered corruption and condemned by law. Consequently, the states develop legislation on parties, replacing or limiting their private funding with public funds. Lastly, as television does not lend itself to a deepening of political discourse or to nuances, parties are forced to put on performances – they have done this since the 19th century, but they now do it for television, which means they have to embody themselves in a leader, who has to build an image of a prime minister or in France of a president. Countries such as Belgium, the Netherlands or Italy, which were governed by coalitions of parties and appointed their prime ministers by a process of negotiation and arbitration between parties and currents within a party, have had to resign themselves to personalization. The leading German parties were the first to go down this path by designating their candidates for the chancellery. In the UK, around 1965, the Conservative Party decided to have its leader elected by the members of parliament – previously, the appointment had been the result of a secret process in which the outgoing leader played a substantial role. 
In the 1970s and 1980s more and more parties, which until then had had their leaders elected by delegates at their congress or party conference, changed to direct election by
paid-up party members. In 1995, the French Socialist Party used this method to choose its candidate for the presidential election and so did the British Conservatives. In Italy and France, the left-wing coalition and the socialists tried to import the American system of primaries. The model set up in Italy appears to be closest to the original, with the center, left and extreme left taking part – but the ballot was organized privately within the offices of the parties. For the French Socialist Party, it was an internal election enlarged to accommodate members admitted for the circumstance in return for a reduced financial contribution. In both cases, the analogy with the American primaries resides in the fact that there was an internal election and even debates between the candidates. The aim was to attract the attention of the media and thus favor the campaigns of the parties concerned. This way of functioning does not correspond to the ideal of mass parties. The distinction between professional electoral parties on the one hand and bureaucratic mass parties on the other has become blurred in numerous countries. They have become publicly financed institutions oriented towards the media and, according to Katz and Mair (1993), could even manage without members. These authors suggest that the reasons for maintaining the role of members in certain parties are of a symbolic nature. According to them, there is a new mode of partisan organization, the cartel party, which since the 1970s has succeeded the hegemony of catch-all parties, which had taken over from mass parties after replacing elitist parties. The fact is that what Mair (1999) calls ‘parties in opinion’ have taken precedence over the ‘central apparatus of the party’ in an arena of dialogue with the ‘party in public office’ – leader, government, parliamentary group, all political professionals. Are there any alternatives to cartel parties? 
There is one in Italy with Forza Italia, devoted to the promotion and the defense of the interests of its founder, Silvio Berlusconi. Such a phenomenon was only made possible thanks to
two ‘accidents’: the deregulation of television, which opened the doors to Berlusconi’s establishment of a television empire, and the collapse of the first Italian Republic, which freed some space on the right of the political spectrum, that is to say, on the side of the defense of liberal capitalism. We can see that both the right and the democrats in Italy, as well as a large number of parties in Europe, are greatly influenced by the organizational models of the United States. However, the latter remain different and are based on various autonomous strata both for the Democrats and the Republicans. Since the seminal contribution of V. O. Key (1942), scholars have analyzed American parties as tripartite structures: (1) the party in the electorate; (2) the party organization; (3) the party in government. The first of these refers to the loyalty and identification of the voters and the third to public office holders, from the President to local councilors. The second is structured in a manner defined by Sam Eldersveld (1982: 124) as a stratarchy: ‘an organization with layers, or strata of control, rather than centralized leadership from the top down.’ American party organizations are far older than the European ones, and, despite the decline in party identification, they have adapted to the various and numerous changes and evolutions affecting the practice of democratic government. This is obviously not the case in Europe, where we can speak of a ‘crisis of parties’.
Global perspectives
Whereas the study of parties and their political functions, social roots and organization has for a long time been confined to the Western world where they first developed, decolonization and democratization have led to new forms and experiences elsewhere. Some of the concepts and perspectives developed from the very varied European experience discussed above have now to be tested and
applied to different social and historical contexts. Not all such concepts and typologies can easily travel, however, and it remains to be seen (in the following paragraphs) which characteristics can be considered to be of a more universal nature and which are more context-specific. The more general definition of parties and their raison d’être in contemporary democracies still remains the same. Their social context and form of organization vary greatly, however. As a first step we will discuss, therefore, how far the cleavage patterns described by Lipset and Rokkan apply to other parts of the world. Whereas the Church–State relation in Europe is a very specific historical one, the center–periphery and socio-economic cleavages, in different variants, seem to be applicable in a number of cases elsewhere. A second step describes more specific patterns of organization in the broader world regions.
Latin America
When considering the origins of parties (and party systems) in the 22 independent Latin American states, some similarities stand out, but differences prevail. Thus, although with slightly different timing (end of the 19th century to the early decades of the 20th century), the elite origins of parties are in essence not different from the European cases. The differences come later, mainly because of the diversity of socio-economic developments and the institutional contexts in which mass politics emerged and parties became the necessary tools for shaping opinions, organizing voters, recruiting politicians, forming the government and carrying out policies. In fact, among other factors, the weaknesses of the industrialization process, the characteristics of highly unstructured immigration societies, the high level of economic and social inequalities, the very low unionization and the low level of literacy never created the conditions for well-organized parties with stable roots in society. Consequently, there was almost no chance to
build strong socialist parties characterized by a strong left–right cleavage, such as the one described by Lipset and Rokkan as a result of the European industrial revolution. The religious–secular cleavage became apparent with the creation of Christian Democratic parties in several countries, but a serious cleavage between a strong state and the Catholic Church was missing. Ethnic parties never blossomed, and when, more recently, ethnic interests were better represented this happened within parties that also articulated other interests. The absence of explicit separatist policies, and in several cases the mixing between the local people and the heirs of Spanish or Portuguese colonizers, account for this result. The different institutional contexts were due, on the one hand, to the different success rates of democratic regimes and military rule in the two former colonial areas, with all related legacies, and on the other hand to the traditional presidentialization of constitutional structures in Latin America, which was very different from the prevailing parliamentarization of European democracies. With this background, interest differences, ideological differences and the secular– religious cleavage paved the way for mass parties, some of them with a long history. Thus, for example, in Chile, where a more traditionally European-like party system developed, there was a Christian Democratic party with social programs as early as the 1960s, with strong links to European Christian Democratic parties, and a Socialist party going back to the 1930s. Chile’s President, Salvador Allende, and a leftist coalition were at the core of one of the most tragic political events of recent democratic history. Allende’s presidency ended with the installation of the highly repressive military regime led by General Pinochet after the coup d’état of September 1973. The Argentinian Justicialist Party and Radical party also have a long history. 
The former was created by Juan Perón in the mid-1940s and was again successful in the presidential elections of 2003, 2007 and 2011 (Néstor and Cristina Kirchner). It was also one of the few examples where unions played an important role in the support of the party. The Radical Party (‘radical’ in the sense of a European liberal party as in France, discussed above), whose foundation dates back to the end of the 19th century, played a crucial role in the transition to the new democratic regime under President Raúl Alfonsín and the end of military rule in 1983. The oldest Brazilian party, the Brazilian Democratic Movement (MDB), was created in 1965 under military rule as the official opposition party, and is a legacy of that period, although today a fully transformed one. It was the Workers’ Party (PT), founded in 1980 – one of the largest and most important socialist parties of Latin America – that was able to govern at the federal level for most of the first two decades of the 21st century (2003–16), with Luiz Inácio Lula da Silva and later Dilma Rousseff as Presidents. The Mexican Institutional Revolutionary Party (PRI) is not the oldest Latin American party, having been founded in 1929, but it is the party with the longest incumbency both in the authoritarian period and after a transition to competitive multi-party politics, without interruption until 2000. Through adaptation and leading regime change, the party was able to survive and even prosper for some years with the transition to democracy. Trying to put Latin American parties into a typology seems an impossible exercise. If we think about classic categories such as cadre parties or professional electoral parties (see above) or clientelistic or personalistic parties (Hellinger, 2011: chapter 15), no party can be exclusively classified in this way, but personalistic and clientelistic features are embedded in many of these parties, strengthened by the caudillistic military traditions, the presidential regimes and a widespread clientelistic culture (Kitschelt, Chapter 29, this Handbook).
Organizational changes in this area went along similar lines to the European ones and were brought about by technological transformations, with the dominant role of media and, later, of social networks within the party,
as evidenced by the successful presidential election campaign of Jair Bolsonaro in Brazil in 2018. Bolsonaro was originally nominated by the small Social Liberal Party (PSL) and later supported by other parties, and a basic reason for his victory is the transformation of the key tools of the electoral campaign through the adoption and exploitation of the much higher propaganda reach created by the new social media. Growth of distrust and indifference, increased authoritarian attitudes, party dealignment and the fading away of the parties’ already weak social roots, compounded by a stronger role for leaders in their actual work, are the main features of party developments during the first two decades of the 21st century. Thus, if we recall Mainwaring (1999: 22–39), who suggested analyzing the institutionalization of parties and party systems according to four dimensions (stability in the competition between parties; strong partisan roots in society and related strong voter attachment to parties; public legitimation of elections and parties by political actors; independent, stable party structures), we can reach the conclusion that in the first two decades of this century there was a progressive process of de-institutionalization (see also Mainwaring, 2018). Looking at the phenomenon of the emergence of independent candidates without any partisan attachment, especially in local elections in a number of Latin American countries – for example, in Mexico – we can conclude that the degree of party de-institutionalization has reached a point of no return. This widespread delegitimization of parties, favoring populism, personalism and extremism, has serious consequences for Latin American democracies.
Middle East and South Asia
The formation of parties in the Middle East and in South Asia was first dominated by the national liberation process. Instead of expressing domestic cleavages, they were
primarily organized to claim a national identity, as the Shabab al-Ba’ath al-Arabi did from 1942, and, one year after, the Hizb al-Ba’ath al-Arabi: both were the predecessors of the Baath Party in Syria and in Iraq (Batatu, 1999). The same development could be observed in India, with the Congress party created in 1885; in Indonesia, with the Indonesian National Party, created in 1923, or in Malaysia, with the Malay Union (1926) (Robison, 2011: chapter 1). When the liberation process was achieved through violence and war, the party was basically organized along military lines, as was the case in Algeria (with the NLF being created in October 1954, when the independence war began). This origin gave a specific orientation to these parties, which became ‘dominant parties’, strongly oriented towards national unity, personal and charismatic leadership, centralized organization and mass mobilization, with a very vague ideology focusing on sovereignty and development. Their first challenge came from the international context in the Cold War era and the creation of communist parties, which appeared as their first serious rivals in Malaysia and in Indonesia (where the Communist Party counted more than one million activists) – but also in India, where they played an important role as local forces (e.g. in Kerala), and in Egypt, Iraq and Syria, where their success hinged upon their capacity to mobilize cultural minorities. Later, religion, and particularly Islam, appeared as a second challenge, either by mobilizing the Muslim minority (as the Muslim League did in India around Mohammed Ali Jinnah, who founded Pakistan) or by promoting an alternative vision of politics, as in the case of the Indonesian Islamic Union Party (1923), the Nahdatul Ulama (1926) in Indonesia, or the BJP in India, created in 1980 and derived from the RSS (Rashtriya Swayamsevak Sangh), constituted in 1926 (Jaffrelot, 1996).
Today, these historical parties are often weakened by power erosion and the fading memory of the heroic liberation period. They are now moving more toward populist
orientations, a new religious rhetoric (around a radical vision of Islam, Hinduism in India or Buddhism in Sri Lanka) and a kind of depoliticization caused by a consensus around a form of state capitalism promoted in Indonesia, India and Malaysia. For this reason, recent debates are more often created inside civil society, marginalizing the role of party politics.
Sub-Saharan Africa
Except for the very special case of South Africa, parties have come into being only with political independence since the mid-1950s and early 1960s, or shortly before. Some parties were successors of earlier independence movements; others were created in the newly independent and constitutionally democratic states. In many cases, however, the new regimes soon degenerated into single-party states proclaiming some form of ‘African socialism’, as in Tanzania, or were instead taken over by military rulers. But even where some form of party competition persisted for a while – as in Kenya, for example – the regimes remained at best ‘semi-competitive’. The only exceptions worth speaking of were Botswana, with a dominant party, and the island state of Mauritius. It was only after the ‘second liberation’ and the end of the Cold War in the early 1990s that truly democratic and pluralist political parties emerged. Initially, newly founded parties proliferated and found many expressions, from tiny ‘telephone booth’ parties (where all members, it was said, would fit in a booth) to truly mass parties with a large and regular membership. After a while some clearer patterns became apparent. To get an initial indication of the overall order of magnitude, out of the 48 Sub-Saharan African states, 9 were classified as ‘free’ and another 23 as ‘partly free’ by Freedom House in 2018. Only there does some level of effective party competition exist. The free and partly free countries are mostly concentrated in Western, Eastern and Southern Africa; authoritarian states
predominate in Central Africa and in territories bordering the Sahara. A closer look reveals that in a number of countries, single dominant parties, often linked to the independence movement, prevail, as in Botswana, Namibia and South Africa. In some other countries some more stable competitive patterns have emerged, which show a strong regional/ethnic orientation, as in Côte d’Ivoire, Nigeria or Uganda. Elsewhere, party names and organizations change very quickly and show a very fluid pattern, as in Kenya, for example. More generally speaking, strong center–periphery cleavages are apparent, indicating dominant ‘neo-patrimonial’ personalistic and clientelistic relationships between those in power and their followers (Bratton and van de Walle, 1997; Temelli, 1999). Nevertheless, with emerging middle classes, socio-economic cleavages with corresponding party organizations also become stronger – as, for example, in Malawi and Zambia. In the longer run, therefore, a differentiation along a more common left–middle–right dimension may be observed (Langhans, 2013). In some of the more stable new democracies, as in Botswana, Ghana or Senegal, a certain amalgamation of traditional sources of power and forms of decision-making at the local levels (involving elders, chiefs, marabouts and similar authorities) has occurred. The level of institutionalization and organizational strength also varies greatly. At one extreme there are highly institutionalized parties such as Chama cha Mapinduzi (CCM, the ‘party of the revolution’ – the successor of the original Tanganyika African National Union, TANU) in Tanzania and the Botswana Democratic Party (BDP), both in power since independence. At the other extreme, there are many very fragile and loosely organized parties, often centered on a particular personality, such as the roughly 40 registered parties in Benin or the approximately 50 parties standing for election in Nigeria. 
More important are often shifting coalitions between parties and leading personalities, as in Kenya (Basedau and Stroh, 2008).
Conclusion
As this overview has shown, political parties, as essential elements of modern forms of democratic representative government, have a checkered history in various parts of the world. On the one hand, there have been oligarchic and cartel tendencies within parties, emphasizing control from above and losing touch with their membership and electorate. This has contributed to a widespread disillusionment with established parties, a more general dealignment and much greater electoral volatility. On the other hand, more direct forms of democracy, facilitated by the digital revolution and new social media, cannot really replace them. At the extreme, this could lead to purely plebiscitarian politics with a great danger of manipulation by demagogic leaders, disrespect for essential democratic rights and freedoms, discrimination against minorities and nationalist aggression. A proper balance must be found between these two tendencies, guaranteed by the rule of law, to ensure viable and effective forms of democracy, including political parties, in this century.
Note
1 This chapter is based in part on the author’s contribution to the International Encyclopedia of Political Science (Badie et al., 2011) and has been revised and supplemented by the editors of this Handbook.
References
Almond, G. A. and Powell, G. B. Comparative Politics: A Developmental Approach, Boston: Little, Brown, 1966.
Aron, R. Democracy and Totalitarianism, London: Weidenfeld and Nicolson, 1967.
Badie, B., Berg-Schlosser, D. and Morlino, L. (eds) International Encyclopedia of Political Science, Thousand Oaks (CA): Sage, 2011.
Bartolini, S. and Mair, P. Identity, Competition and Electoral Availability, Cambridge: Cambridge University Press, 1990.
Basedau, M. and Stroh, A. Measuring Party Institutionalization in Developing Countries: A New Research Instrument Applied to 28 African Political Parties, GIGA Working Paper, No. 69, February 2008.
Batatu, H. Syria’s Peasantry, the Descendants of Its Lesser Rural Notables, and Their Politics, Princeton: Princeton University Press, 1999.
Blondel, J. Political Parties: A Genuine Case for Discontent? London: Wildwood House, 1978.
Bratton, M. and van de Walle, N. Democratic Experiments in Africa: Regime Transitions in Comparative Perspective, Cambridge: Cambridge University Press, 1997.
Burke, E. Thoughts on the Cause of the Present Discontents, in Burke E., The Philosophy of Edmund Burke, Ann Arbor: The University of Michigan Press.
Charlot, J. Les partis politiques, Paris: Armand Colin, 1971.
Duverger, M. Political Parties, London: Methuen, 1954.
Eldersveld, S. J. Political Parties in American Society, New York: Basic Books, 1982.
Hellinger, D. C. Comparative Politics of Latin America: Democracy at Last? New York and London: Routledge, 2011.
Inglehart, R. The Silent Revolution, Princeton: Princeton University Press, 1977.
Jaffrelot, C. The Hindu Nationalist Movement and Indian Politics, London: C. Hurst & Co. Publishers, 1996.
Katz, R. and Mair, P. (eds) How Parties Organize: Change and Adaptation, London: Sage, 1993.
Key, V. O., Jr. Politics, Parties and Pressure Groups, New York: Thomas Y. Crowell Company, 1942.
Kirchheimer, O. Der Wandel des westdeutschen Parteisystems, in Politische Vierteljahresschrift, Band 6, 1965, pp. 20–41.
Langhans, J. Party Systems and Cleavage Structures in Southern Africa, Münster: Books on Demand, 2013.
La Palombara, J. and Weiner, M. (eds) Political Parties and Political Development, Princeton: Princeton University Press, 1966.
Lipset, S. M. Political Man, Baltimore: University Press, 1960.
Lipset, S. M. and Rokkan, S. (eds) Party Systems and Voter Alignments, New York: Free Press, 1967.
Mainwaring, S. Rethinking Party Systems in the Third Wave of Democratization: The Case of Brazil, Stanford: Stanford University Press, 1999.
Mainwaring, S. ‘Party System Institutionalization in Contemporary Latin America’, in S. Mainwaring (ed.) Party Systems in Latin America, Cambridge: Cambridge University Press, 2018, pp. 34–70.
Mair, P. Party Systems Change, Oxford: Oxford University Press, 1999.
Merkl, P. H. (ed.) Western European Party System, New York: Free Press, 1980.
Michels, R. Political Parties, New York: Free Press, 1962.
Neumann, S. Modern Political Parties: Approaches to Comparative Politics, Chicago: The University of Chicago Press, 1956.
Panebianco, A. Political Parties, Cambridge: Cambridge University Press, 1988.
Poguntke, T. Alternative Politics: The German Green Party, Edinburgh: Edinburgh University Press, 1993.
Robison, R. Routledge History of South East Asia, London: Routledge, 2011.
Sartori, G. Parties and Party Systems, Cambridge: Cambridge University Press, 1976.
Seiler, D.-L. Les Partis Politiques en Occident, Paris: Ellipses, 2003, pp. 105–184.
Temelli, S. Demokratisierung im subsaharischen Afrika, Hamburg: LIT-Verlag, 1999.
34
Pluralism
Roland Czada
Pluralist ideas and politics regard the diversity and autonomy of social groups not only as relevant but also as valuable. Pluralism, in its many ramifications, represents a particularly broad line of political and social thought as well as an approach to empirical analysis. The intellectual roots of the concept can be traced back over centuries. In modern political science, the term has been mostly associated with analyses of the influence of interest groups over executive political decision-making. As a paradigmatic theory and method, the approach was not fully elaborated until the mid-20th century. It then quickly developed into a classic, often dominant approach to the study of politics in the Western world. Originating from the American group school of political science (Bentley, 1908; Truman, 1951; Latham, 1952), pluralists of the 1950s and 1960s conceived of governmental policies as the result of countervailing pressures and lobbying exerted by a multiplicity of autonomous, more or less organized social groups competing for political influence.
A Short History of the Concept
A genealogy of pluralist thinking could begin with Greek philosophers and their teachings on how to live in groups side by side in tolerance and diversity, instead of on top of each other in a hierarchy. The image of a plurality of worlds, as it was taught and lived in ancient schools by Democritus, Epicurus, Herodotus and Xenophon, was curbed by Christian monotheism from late antiquity into the Age of Enlightenment. The concept was then revived during the early modern period. It influenced the American constitutional debate of the late 18th century, legal theories of corporate group personality of the 19th century, and political science theories of the 20th century in particular. His work on associations in politics earned Johannes Althusius great recognition as the founder not only of federalism but also of early modern pluralistic thought. Althusius (1563–1638) was the first to formulate a comprehensive theory of what he called a
‘consociationalist’ (associationalist) constitution, and thereby rejected the arguments of his contemporaries in favor of monarchic monism and indivisible territorial sovereignty. The question of how to reconcile social groups’ quest for autonomy with a government’s claim for sovereignty continues to permeate the discourse on pluralism to this day. In this debate, taming the Leviathan can be regarded as the overarching goal of pluralist thinking past and present. Since the early 20th century, pluralist thoughts and studies have contributed above all to justifying the role of interest groups in policy making. They were generally aimed against monism, autocracy, hierarchical statehood and elitist politics, and thus took center stage in many scholarly works on theories and operating principles of liberal democracy. From the very beginning, studies of pluralism focused on the power bases of governments, and modes of participation and equilibration – balancing of interests – in politics. Starting as a particularly North American political science approach, the modern notion of democratic pluralism spread globally. It influenced political science in Latin America, Africa and Asia. Simultaneously, its basic research theme stretched out into a number of subtopics. Since the 1960s, political systems based on party competition and institutional divisions of power have been referred to as ‘pluralist democracies’. Today, pluralist thinking inspires debates on the limits of principled universalism and on concepts of moral and value pluralism, democratic elitism, legal pluralism, religious governance, up to controversies on identity politics and cultural pluralism worldwide.
Basic theories and conceptual variations
The many faces of pluralism correspond with variations in terminology. Different names and emphases of pluralistic thinking can be recognized over time. Common to all
approaches is the association and action of individuals in groups as a starting point. The concept embraces terms such as interest group politics, associational governance and political power-sharing, advocacy, lobbying, pressure politics, collaborative governance, mutual partisan adjustment, corporate pluralism, consociationalism – from consociatio, the Latin word for association – societal interest intermediation, and corporativism and corporatism – from Latin corpora, meaning social organisms or corporate group personalities. Today advocacy has, in a way, replaced the former semantics of pluralism.
Early Forerunners of Modern Pluralism
The universal commonwealth (consociatio universalis) proclaimed by Althusius is a polity based on autonomous manifold social groups, rather than a concept of sovereign statehood as embodied in the evolving European absolutism of his time. A state or polity has to be understood – in his own words – as ‘an association inclusive of all other associations (families, collegia (i.e. guilds), cities, and provinces) within a determinate large area, and recognizing no superior to itself’ (Althusius, 1964 [1603]: 12). In conceiving the social contract as a real pact among corporate legal entities – semi-autonomous associations that compose society – he set himself against his near contemporary Thomas Hobbes, who in his famous book Leviathan considered a single agreement entered into by individuals who commit themselves to absolute subjection under a common power. Johannes Althusius had a notion of shared sovereignty that stands in deep contrast not only to Hobbes’ unitarism but also to Jean Bodin’s doctrine of monarchical sovereignty. Due to his emphasis on associational autonomy, the subsidiarity principle and the multilevel character of his constitutional system, Althusius is now
considered an early modern protagonist and forerunner of both federalism and pluralism. In Europe, the medieval notion of shared sovereignty became prominent again with the doctrine of the real personality of the association, as put forward by Otto von Gierke in his works on medieval law and political theory. In the second half of the 19th century, when he referred to and translated parts of Althusius’ works – originally published in Latin – to a wider German audience, Gierke’s pluralism played an important role in disputes between the Germanists and Romanists over what kind of law should be adopted in Germany. Pluralism, in addressing groups as legal personalities or semi-sovereign corporate bodies with their own will and capacity to act for their members and followers – as in medieval law – has had considerable significance in constitutional thought as well as for the political movements of the time (Dewey, 1926: 672). Gierke’s writings – and through them Althusius’ political philosophy – found a broad reception not only in the United States but also in Britain (Dreyer, 1993). His Political Theories of the Middle Ages (Gierke, 1900) paved the way for a newly emerging English school of academic and political pluralism, of which guild socialism had the most far-reaching impact. Frederic Maitland, George D. H. Cole, J. Neville Figgis and Harold Laski, the masterminds of English guild socialism (Glass, 1966), were greatly concerned with labor unions and self-government in industry. In search of a pluralist blueprint, they fought against the alienation of the individual under conditions of unrestrained capitalism. Their ideas moved toward a participatory democracy beyond individual citizens’ voting rights. 
Functional representation in voluntary associations should integrate the individual into communities that would complement or even replace the society of market participants, with its deprivations and social uprooting, as well as the state as an institution of compulsory membership and coercion. Against this backdrop, the English guild socialists belong to the early theorists of a ‘moral economy’.
In an attempt to diminish the discretionary exercise and unequal distribution of political and economic power, the proponents of socialist pluralism used the medieval structure of guilds, chartered cities, villages, monasteries and universities as a model for a worker-controlled economy. Their research and political activities came to an abrupt end soon after World War I. The ideas, however, continued to live on in Austromarxism and concepts of industrial democracy. They had a strong impact on Karl Polanyi’s conception of a socially embedded economy free of centralist command and market dominance. Early pluralists were focused on associational autonomy mostly in a legal and constitutional view. They rejected monistic theories of sovereignty endowing state institutions with supremacy over society. For them, sovereignty resides not exclusively with governments or parliaments but with many social, political, cultural and economic organizations in society. These community institutions are perceived as free and prior to state institutions.
Pre-Classic American Theories of Pluralism
In the United States, the history of pluralist reasoning begins with the debates on the constitution of the Union from 1787 onwards. The American constitutionalists worked out an embryonic theory of pluralism in an attempt to combine the best features of John Locke’s postulates of liberalism, Edmund Burke’s social conservatism and Jean-Jacques Rousseau’s thoughts on participation in politics (Connolly, 1969: 3). Among the founders, James Madison, in Federalist Papers No. 10 of 1787, states that the political mechanisms created by the new constitution were specifically designed to protect freedom of association and should simultaneously balance conflicts between factions and interests in domestic politics. The US Constitution stands out explicitly from the monistic traditions in Europe in this respect. Its social implications
were impressively described by Alexis de Tocqueville, who placed the activities of autonomous groups and their preeminence in public life at the center of his famous two volumes on Democracy in America, published in 1835 and still regarded as a groundbreaking contribution highly important to later academic works on pluralism (Tocqueville 2000). The American group school in political science, the beginnings of which go back to the 1920s, and its successors could draw on this national intellectual heritage. But there was another, similarly important academic influence coming from Europe. Otto von Gierke’s Political Theories of the Middle Ages attracted political scientists in the United States, among them Arthur Bentley, whose studies in Germany in 1893/4 are reflected in his later writings on the role of group associations in politics. Leading ideas from the work of Georg Simmel, whom he had met in Berlin, found expression in Bentley’s pluralistic view on society and politics, namely Simmel’s theory of ‘crosscutting social circles’ according to which modern societies consist of groups that cut across each other in many directions and hence forbid any classification of diverse societies into stationary and sharply divided classes or status groups. David Truman used this thought to great advantage in his path-breaking basic work on pluralism, The Governmental Process:
As Arthur Bentley has put it: ‘To say that a man belongs to two groups of men which are clashing with each other; to say that he reflects two seemingly irreconcilable aspects of the social life; to say that he is reasoning on a question of public policy, these all are but to state the same fact in three forms’. The phenomenon of the overlapping membership of social groups is thus a fundamental fact whose importance for the process of group politics, through its impact on the internal politics of interest groups, can scarcely be exaggerated. (Truman, 1951: 158)
Individual cross-pressures resulting from overlapping group affiliations in society became essential for pluralists, since they
tend to mitigate conflicts, foster a rationally motivated open-mindedness toward various interests in society and, thus, promote the reconciliation of clashes between social groups. Individual conflicts of preference that result from multiple overlapping memberships form integrative forces that bring the general interest to bear at the level of individual citizens – this was a grandiose discovery and principle that gave the theory of pluralism a firm base and finally caused its breakthrough in the North American political science community (Czada, 1991: 278–81).
Classic Empirical Theories of Pluralism
The proponents of classical pluralism widened the scope by searching for institutional power structures and channels of political influence in given societies. In this way, the concept developed into a regime type called pluralist democracy (Dahl, 1967). Robert Dahl, the first and most renowned proponent of the classic theory of pluralism, no longer conceived of civil society associations as a counterweight to a sovereign political majority, but insisted that a constitutional-cum-societal pluralism replaces rather than counters the sovereignty of the people or the majority of the people in a majoritarian democracy. Thus he returns to early modern approaches that are critical of sovereign supremacy:
• Instead of a single center there must be multiple centers of power, none wholly sovereign. Although the only legitimate sovereign in the perspective of American pluralism is the people, even the people ought never to be an absolute sovereign; consequently, no part of the people such as a majority, ought to be absolute sovereign.
Why this axiom? The theory and practice of American pluralism tend to assume, as I see it, that the existence of multiple centers of power, none of which is wholly sovereign, will help (may indeed be necessary) to tame power, to secure the consent of all, and to settle conflicts peacefully:
• Because one center of power is set against another, power itself will be tamed, civilized, controlled, and limited to decent human purposes, while coercion, the most evil form of power, will be reduced to a minimum.
• Because even minorities are provided with opportunities to veto solutions they strongly object to, the consent of all will be won in the long run.
• Because constant negotiations among different centers of power are necessary in order to make decisions, citizens and leaders will perfect the precious art of dealing peacefully with their conflicts, and not merely to the benefit of one partisan. (Dahl, 1967: 24)
In this interpretation, the idea of pluralism turns from a theory of political influence into a political system type that Dahl (1971) himself called ‘polyarchy’ (lit. rule of the many). He points to the American presidency, Congress, the Supreme Court, the states and ‘The Other Ninety Thousand Governments’ as being policy makers in their own right. ‘These territorial governments below the national level are of bewildering variety and complexity. The governments of the fifty states constitute a vast field of themselves. The thousands of towns and cities create a political tapestry even more complex’ (Dahl, 1967: 171–2). The benefits of such a horizontally and vertically differentiated political system, according to Dahl (1967: 172–3), are fourfold: (1) diversity in public governance reduces the workload of the national government and makes democracy more manageable; (2) it prevents conflicts accumulating at the national level and, thus, makes democracy more viable; (3) providing numerous semi-autonomous centers of power reinforces the principles of balanced authority and political pluralism; (4) facilitating self-government at the local level of administration creates opportunities for learning and practicing democracy. In his most influential empirical study of community power dynamics in New Haven, Connecticut, Dahl (1961) showed that no one could effectively monopolize political power in a pluralist society of groups free
from political control. Decision-making turned out to be shared instead among different groups and individuals in competition with each other. Dahl’s method was not based on reputation or positions in power networks, as in most contemporary analyses of political power structures (Hoffmann-Lange, Chapter 30, this Handbook). Rather, he compiled a number of empirical policy analyses. To trace how political decisions were made on certain issues and in certain areas of policy, he employed various observational means, among them lists of persons who were involved to a measurable degree in decision-making. The study identified a series of elite groups who dominated areas of public policy such as education, nominations to public office, urban renewal, and so on. While there was some overlap of names, particularly when elected public officials were concerned, its extent was surprisingly small. Empirical studies on pluralism certainly did not confirm the idea of equal opportunities for all groups to influence the political process. They rather showed a pluralist democracy without a single recognizable power elite. In concluding that there are ‘multiple centers of power, none of which is wholly sovereign’, Dahl (1967: 24) rejects the concept of parliamentary sovereignty based on majority rule, as enshrined in the British Westminster model of government. In its golden age of the 1960s, classical pluralism described an open, largely unpredictable competitive system of political power sharing with multifarious access routes to political decision-making. At the same time, the concept departed from earlier optimistic assumptions of equilibration among a great number of political forces neutralizing each other.
Deficits and Critique of the Classic Pluralist Model
Pluralism – societal, political and ethical – was not only the most prominent approach of the 20th century used to describe, understand
and explain the functioning of Western liberal democracies; it was also among the most criticized concepts. One finds numerous attempts at empirical refutation as well as some strong theoretical counterarguments. The strongest empirical critique could be seen on the streets of American cities in 1967, just as Robert Dahl’s major work Pluralist Democracy in the United States was published. Riots struck 56 American cities, among them New Haven, the ‘home of pluralism’, where, in late August, four rebellious nights put the city in a state of terror. ‘Substantial areas of twelve great cities lay in ruins … How could this happen in a society of slack resources, in which any active and legitimate group can make itself heard effectively? […] There must have been something fundamentally wrong with the theory of pluralist democracy or the analysis would not have gone so wide off the mark’. (Burtenshaw, 1968: 586–7)
Neo-Pluralism and the Corporatist Turn
Concepts of neo-pluralism and neo-corporatism departed from the notion of social groups operating independently from and outside the sphere of government. Neo-pluralism ‘is one of a class of research findings or social science models – such as elitism, pluralism, and corporatism – that refer to the structure of power and policy making in some domain of public policy’ (McFarland, 2007: 45). The term covers new concepts that criticize and succeed classic pluralist approaches, among them neo-corporatism, clientelism, consociationalism, advocacy coalitions, issue networks and policy niches. Theodore Lowi (1969) was among the first to reject Dahl’s concept of interest group liberalism since – according to his research – associational elites put their resources on the table without any moral or rationalist meaning. They would only exchange with bureaucrats instead of establishing democratic links between people and government.
Neo-pluralist thinking can be divided into at least four strands of argument. First, the classical school has been expanded to the extent that some interests – for instance, those of big businesses – are now being recognized as having a privileged influence, if not over single political decisions then in terms of an overarching political agenda that, according to neo-pluralists, is ultimately biased toward business power. In this sense, neo-pluralists no longer regard governments as neutral mediators, but merely as other players on the field who are in some ways connected to economic power holders. Second, neo-pluralist approaches include so-called sub-governments consisting of networks of members of parliament, their staff, ministry officials, experts and representatives of interest associations and firms that are linked by close and lasting relationships. Some other labels relate to the sub-government phenomenon, such as ‘iron triangles’ and ‘issue networks’ (Heclo, 1978), or even state capture. Regardless of their differences, these concepts are all based on empirical observations indicating that there is no open competition among interest groups and that only those with clientelistic relations get access to administrative departments or agencies (Kitschelt, Chapter 29, this Handbook). This view is obviously different from the classic idea of competitive laissez-faire pluralism. Third, a variety of neo-pluralist approaches refer to the state – politics and administration – as a relatively autonomous entity. They emphasize governments’ capacity to withstand pressures exerted by powerful economic groups or companies in pursuing their own policy agenda backed by parliamentary majorities. At the same time, Fraenkel (1964) insists that the whole of society, and not just the state, needs to be viewed as a complex constitution.
The state and civil society are linked through a compound of laws, practices and procedures that define the rights and roles of public and private institutions. The necessity for the state to counter the excessive influence of
oligopolistic, if not monopolistic, carriers of socio-economic power has been emphasized in this view. The democratic state must protect all those sections of the population that are unable to form and maintain sufficiently powerful associations, so that their interests are not neglected. This normative variety of neo-pluralism is reminiscent of Paul Hirst’s notion of an ‘associative democracy’ (Hirst, 1994) and its implicit assumption of a common good. It entails a paradigmatic turn, since classic pluralism abandoned any notion of a public interest or a common good. The modern classics replaced the search for the common good, which has shaped the history of political ideas over millennia, with a process of articulation, aggregation and integration of manifold interests to achieve a result that is subsequently considered to be in the public interest. A fourth distinctive concept refers to neo-corporatist patterns of interest intermediation based on close relations between governments and producer groups, highly centralized top associations of labor and capital in particular (Lehmbruch and Schmitter, 1982), and arrangements of sectoral self-regulation up to semi-autonomous ‘private interest governments’ (Streeck and Schmitter, 1985). In its most basic meaning, corporatism refers to a political power structure and practice of consensus formation and self-government based on the functional representation of professional groups. The corporatist paradigm has been deliberately placed against some central assumptions of mainstream pluralism. It overcomes the influence perspective that underlies all theories of pluralism and their empirical applications so far (Mattina, Chapter 32, this Handbook). Corporatist patterns of interest intermediation certainly do not comply with any conception of lobbying.
The latter addresses one-directional relations of influence and impacts on the formulation of policies, whereas the concept of corporatist intermediation emphasizes ongoing ‘exchange relationships’ between governments and well
organized interest associations representing important parts of the economy and society. Their participation and even integration concerns not only the formulation but also the implementation of policies. Corporatist interest intermediation has been mainly a European phenomenon that applies to smaller countries such as Austria, Switzerland, the Netherlands and Sweden in particular. Comparative public policy analyses indicate that policies coordinated between governments and top associations of labor and capital resulted in lower unemployment and inflation, enhanced industrial productivity and increased economic growth rates during the 1970s and 1980s (Calmfors and Driffill, 1988). The explanation lies in the comprehensive organization of interests: ‘encompassing’ functional groups who are organized in a centralized, hierarchical fashion have more incentives than small special interest groups to consider the common good (Olson, 1986). Corporatist interest intermediation declined in the wake of a neo-liberal turn in economic policy and major shifts from social democratic to conservative governments in Europe. At the same time, a large number of advocacy groups, social movements and new forms of activism have emerged worldwide. These include manifold idealistic groups that pursue non-commercial purposes, such as civil and human rights, environmental protection, gender equality, gay rights, food safety, grassroots lobbying, animal rights, and so on.
Advocacy and the Civil Rights Movement
Organizations that emerged from social and economic justice movements represent marginalized groups such as single mothers, racial minorities, gays and lesbians or the poor. They were born out of the ‘advocacy explosion’ (Andrews and Edwards, 2004: 479) of the late 1960s and 1970s. Civic activism has grown enormously since then.
Starting from worldwide political mass movements, student revolts and protests against the Vietnam War, civic initiatives, citizen groups and public interest groups established new methods of advocacy, lobbying and legal action. These civic activities seem so dissimilar from earlier forms of pluralist interest politics, as well as from corporatist concertation, that Tichenor and Harris (2005: 257) judged the older theories to ‘be of little or no theoretical utility’ in understanding policy making in such diverse activist pluralist democracies. A look at social movements of the time does indeed reveal some change. Citizens’ initiatives mushroomed and contributed not only to the expansion but also to the differentiation of interest politics worldwide (della Porta, Chapter 39, this Handbook). Contrary to widespread expectations, however, this did not replace the still powerful old-fashioned interest-group lobbies, nor do these movements refute the basic thoughts of pluralism. On the contrary, the strong and continuous rise of advocacy groups is reminiscent of Truman’s (1951) original theory according to which modern societies tend to generate more and more interest groups – all the more so if they are stimulated to organize because of dissatisfaction with governments and in view of social disturbances that alter their relationship with other groups or institutions. To the extent that latent groups associate in order to remedy grievances and discriminatory experiences, they contribute to pluralist power dynamics, and a new equilibrium may be reached. The advocacy explosion exposes multiple, diffuse, interacting groups and factions resembling the idea of pluralist interest politics as it was originally put forward by Bentley (1908) and Truman (1951). The rise of idealistic non-profit organizations posed new questions on the role, character and impact of groups in a society.
They induced research and debates about the benefits of social capital and civic engagement (Putnam, 2000). This line of research directed attention
to the local and regional level and sectoral dynamics, as well as cultural determinants of organizations and how they generate opportunities for and constraints on participation. Jenkins, Wallace and Fullerton (2008) identified a general global shift toward a ‘social movement society’ in which protests have become a routine part of political bargaining. Environmental risks, postindustrial values, gender equality and affluence went along with the growth of the state, sub-governments and corporatism in causing popular opposition and unconventional group activities. This development has gone hand in hand with the fragmentation of parties and party systems. Some analysts fear that the rise of assertive advocacy has fostered strong emotional, cultural, ideological and religious motivations and will eventually fragment polities, split societies, and lead to populism and crises of governability (Karolewski, Chapter 31, this Handbook). This could jeopardize pluralist democracies, understood as political and social systems of overlapping, mutually compensating cleavages among groups who leave passions and ideologies behind and focus mainly on material interests. Strolovitch and Forrest (2010) found that, compared to group organizations in general, those representing marginalized groups are far less likely to use professional lobbyists, employ legal staff or mobilize party donations. They also stress that advocacy for identity groups shows much less interest homogeneity than, say, narrowly defined business associations. The former groups are characterized by less clear-cut interests that overlap between class, race, gender and ways of life, coming together in one single organization. Marginalized constituencies within these groups often receive the least active representation (Strolovitch and Forrest, 2010: 475f.).
The Rational Choice Perspective
Pluralist group theories long neglected the rational motives of individuals to join and
become dues-paying members of interest associations. Mancur Olson (1965) revolutionized the earlier views on why individuals associate. According to Olson, it is not rational for an individual to voluntarily support an organization in pursuing a collective good that is indivisible, so that everyone can benefit from it whether a member or not. The beneficiaries of collective goods will, therefore, tend to avoid paying membership fees and act as ‘free-riders’ instead. Olson demonstrates that the conditions for organizing interests vary by group size: there is little incentive to join large interest organizations, because they act independently from an individual’s contribution. In small groups, however, individual membership may decide whether one can enjoy the fruits of lobbying or not. Thus, the organization of small groups is facilitated by their members’ individual material interests, whereas large groups suffer from opportunism and free-riding. These arguments refute the pluralist belief in equal opportunities to associate resulting in a balanced system of representation. Olson presents the most comprehensive critique of the pluralist group school so far. In pointing to problems of mobilization and internal maintenance, he posed a number of questions that the pluralists had wrongly taken for granted. His classic study (Olson, 1965) is based on six basic premises:
• The primary function of groups is to advance the interests of individuals.
• Groups seek to provide collective goods whose benefits can be limited to members only, or – if indivisible common goods are concerned – can be enjoyed by everyone in the field.
• As it will not be rational for self-interested individuals to contribute to the groups that deliver benefits to everyone, groups are facing a free-rider problem.
• In order to overcome the free-rider problem, groups will have to provide extra incentives or sanctions to get potential members to join.
• The larger the group, the smaller the value of participation by rational individuals.
• Non-material solidarity incentives are important only in small groups or sub-circles of large groups as long as interest trumps ideology.
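The premises above can be illustrated with a deliberately simple calculation. The following sketch assumes a linear public-goods setting in which the benefit created by one contribution is spread evenly over all group members; this is an illustrative simplification rather than Olson’s own notation, and all function names and figures are hypothetical.

```python
# Illustrative sketch only: a linear public-goods model, not Olson's own notation.
# All names and numbers below are hypothetical.


def private_return(group_size: int, benefit_per_contribution: float) -> float:
    """Share of the collective benefit that flows back to the contributor.

    The good is indivisible, so the benefit created by one contribution is
    spread evenly over all group members, contributors or not.
    """
    return benefit_per_contribution / group_size


def will_contribute(group_size: int,
                    benefit_per_contribution: float,
                    cost: float,
                    selective_incentive: float = 0.0) -> bool:
    """A self-interested member joins only if the private payoff exceeds the cost."""
    payoff = private_return(group_size, benefit_per_contribution) + selective_incentive
    return payoff > cost


if __name__ == "__main__":
    # Same benefit and cost, growing group size.
    for n in (5, 50, 5000):
        print(n, will_contribute(n, benefit_per_contribution=100.0, cost=10.0))
    # A selective incentive reserved for members can tip the balance in a larger group.
    print(will_contribute(50, 100.0, 10.0, selective_incentive=9.0))
```

Under these assumed figures, only the member of the smallest group finds it worthwhile to contribute, while an extra incentive reserved for members restores participation in the medium-sized group. The sketch leaves out solidarity and purposive motives entirely.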
‘Interests trump ideology’ has been a basic assumption of pluralist theories from Madisonian reflections on taming the moods and promoting a reasonable consensus through countervailing diversity up to the classic and neo-pluralist approaches to interest politics. James Q. Wilson (1995) casts doubt on this rationalistic view. He modified a widely held view on the role of material interests and their impact on the associability of individuals, as well as on their prospects of collective action. In maintaining an organization, political entrepreneurs may use different motivational resources. Groups can rely on any combination of four general types of associational incentives. Besides material incentives, which are at the heart of Olson’s theory, Wilson (1995) distinguishes between specific solidarity incentives that can be withheld from individual group members (honors, prizes, positions), collective solidarity incentives (friendship, fun, fellowship and conviviality) and purposive solidarity incentives (beliefs, ideological goals). These motivational forces vary in precision and goal specificity: material incentives can easily be decoupled from goals and directed in precise quantities, whereas purposive solidarity incentives are closely related to a group’s stated goals. It follows that groups based on material interests are more adaptive and flexible in their internal organization as well as in relation to their organizational environment, whereas idealistic groups are less able to compromise. Wilson’s theory, in reaction to the material interest bias in Olson’s rational choice approach, supports a widespread conviction that collective action is also motivated by ideals, without any expectation of material rewards. Even if one looks at all sorts of solidarity incentives as quasi-material payoffs from membership, such subjectively felt rewards cannot be calculated in a consistent and precise manner. In this respect, Olson’s
critique of the classical school of pluralism has become somewhat attenuated.
Cultural Pluralism
We find pluralistic diversity not only in conflicting interests and ideologies, but also in the area of group values, identities and cultural ways of life. Cultural pluralism and identity politics have been among the most flourishing research fields in the wake of minority and group rights discussions and as a result of increasing international migration movements. Especially with the end of the Cold War, there was a dramatic rise in the political significance of cultural pluralism and a change in scholars’ understandings of what drives and shapes ethnic identification in established Western democracies as well as in the successor states of the former Eastern Bloc and in the Global South (Young, 1993). Diversity of culture and values includes differences in group identities and lifestyles marked by religious, linguistic, ethnic and regional affiliations or along the lines of skin color, ancestry, caste, gender and sexual orientation. Cultural pluralists share some premises with classic pluralism, namely that societies are by no means homogeneous, nor are they determined by distributive social class conflicts. The main difference lies in their special consideration of value conflicts and of cultural differences. Cultural pluralism entails a twofold critique of assimilationist concepts as they prevail in classic interest group pluralism. Culturalist approaches replaced the image of a ‘melting pot’ of culturally amalgamated citizens with the new metaphor of a ‘salad bowl’, suggesting that social belongings or identities determined by oneself or others do not melt away but combine like the ingredients of a salad. In addition, there is a functional distinction: cultural pluralism works in other ways than interest pluralism. In culturally segmented societies the amount of overlapping membership seems to be restricted, if not completely absent. One cannot be a Muslim, a Catholic, a Jew and a Hindu at the same time.
Even if cultural communities maintain close
relationships, their members may not feel the same cross-pressures from overlap as, for example, a unionist and member of a shareholders’ club, consumer, motorist and nature lover when it comes to conflicting interests in high wages, high profits, low prices and a clean environment or – more specifically – members of a fishing club finding fisheries polluted by their workplace. The reassuring effects of overlapping membership – and thus of interest pluralism – appear to be less pronounced in culturally segmented social environments where identitary group loyalties outweigh interest. In societies characterized by strongly felt affiliations along ethnicity, skin color, language or religion, the integrative functions of interest pluralism may thus be weakened by cultural plurality (cf. Smits, 2005). Cultural pluralism is mostly a normative theory proposing protective group rights for minorities. Kymlicka (2003), for instance, argues that different groups within the same society should be eligible to receive different rights to protect their cultures, religions or worldviews against external pressures. This, however, should not support any attempts by organizational elites to limit their individual members’ freedoms in the name of culture. The proposal obviously reveals a dilemma between protective policies for group rights and the liberalist concern for equal rights of individuals, among them defensive rights against political interventions into the private sphere (Deveaux, 2000). Moreover, proposing political valuations of different rights not only leads to legal pluralism, as opposed to the idea of legal unity; it also reflects a normative split of ideals that seems inappropriate for pluralist liberal democracies. Protective group rights could also contribute to the segmentation and division of societies.
In this respect, multiculturalism as suggested by the proponents of cultural pluralism could intensify conflicts and would, thus, violate the ideal of social balance, peace and compromise to which the pluralistic idea was first and foremost committed. This seems particularly threatening if cultural, economic and social cleavages reinforce each other and this
eventually throws modern societies back to segmented tribal structures. Some aspects of cultural pluralism resemble the concept of ‘consociationalism’ that Arend Lijphart has vehemently advocated since the 1960s (Lijphart, 1971). The key elements of consociational democracies are cultural groups forming relatively closed social ‘pillars’ that are integrated through cooperative relations among their highest representatives at the elite level of societal sectors such as public media, religion, education, administration and political parties, in particular. Such systems, also known as ‘Proporzdemokratie’ (proportional democracy) or ‘Konkordanzdemokratie’ (concordance democracy), existed and still exist in somewhat looser versions in Austria, the Netherlands, Switzerland and Belgium. Initially, political camps were formed comprising parties that are linked with ideological (e.g. Austria, Switzerland), religious (e.g. the Netherlands) or language (e.g. Belgium) groups in those countries, resulting in a two-tier system of electoral and associational political participation (Lehmbruch, 1977). Such non-majoritarian democracies based on political power-sharing instead of majority rule are considered to solve conflicts in societies that are divided by deep cultural, ideological, religious or linguistic cleavages (Lijphart, 2004). They often occur together with corporatist interest intermediation and traditions of social partnership. In such cases, political camps are formed in which certain parties and the electoral channel of participation are linked with respective interest associations and subsystems of political interest bargaining. Political systems based on non-majoritarian consensual forms of political conflict regulation have also been labeled ‘consensus democracy’ or ‘negotiation democracy’.
Regional varieties of pluralism
The United States is the homeland of pluralism. Pluralist politics is anchored in its constitutional history just as variants of liberal
corporatism characterize policy making in some small European countries, whereas state corporatism prevails in parts of Latin America and in some states in Asia and Africa. This has to do with empirical realities, but also with traditions of political thought and regional academic legacies. The many political science approaches dealing with pluralism still lack a coherent understanding of interest politics. The American perspective remains focused on lobby groups influencing governmental decision-making. Research on European state–group relations emphasized bi-directional exchange relations between governments and organized groups instead. In Latin America and newly industrialized countries in Asia, the view prevails that governments use state–group relations to structure and guide national economies and societies in a top-down process. Pressure on governments, negotiations with governments and subordination to governments can be seen as three distinct major modes of interest politics. They differ in the direction of the influence and are known as pluralist pressure politics, liberal corporatism and state corporatism respectively. American researchers’ continuing and recently renewed obsession with one-directional lobbying is difficult to explain, as studies on sub-governments, issue networks and iron triangles have proven the existence of bi-directional collaborative relationships between state authorities and private interests in the United States. Most American studies on pluralism missed the realities of interest intermediation outside the United States. Similarly, after the 1970s, European researchers rarely took the North American perspective. Thus, theories as well as empirical work have been split along paradigms and continents so far. During the 1960s a number of case studies appeared in an attempt to apply the American perspective to some European and Latin American countries.
Skilling (1971) suggested that the ‘group theory’ might prove useful to examine Soviet politics, since he
found public manifestations of the influence of special interests on policy formulation in the post-Stalin phase. Apparently, even under authoritarian regimes in developing and communist countries, informal groups and various forms of pluralist pressure politics were identified as driving the political process (Linz and Stepan, 1978). Up to the 1970s, international debates were dominated by an ethnocentric view, which treated political systems and processes as a variety of American pressure group lobbying. Implicit in this analysis was a functionalistic optimism which attributed the modernization of societies and the postwar economic boom to the beneficial consequences of democratic pluralism. Many empirical analyses of political processes in African or Asian countries have been shaped, if not dominated, by liberal pluralist thinking. In assuming Euro-American value terms and working conditions, most of them tended to downplay or screen out the diversity of cultural viewpoints and conflicts, which differ fundamentally from interest group pluralism. As a consequence, the pluralist approach often went hand in hand with an assimilationist thrust in favor of modernization and Westernization. This may have contributed to the rise of anti-pluralist political attitudes in the guise of cultural nationalism and populism, which have, in different ways, become a significant feature of contemporary politics in some Asian countries (Mobrand, 2018). As in most parts of Africa, majoritarian and exclusionary policies and agendas with a strong emphasis on public order and security form the core of the ruling elite orientation. The intersection of social conservatism and populism is a key feature of present anti-pluralist politics. Anti-pluralism often builds on legacies of ‘authoritarian statism’ that once pushed back representative institutions and strengthened the authority of bureaucratic agencies not directly accountable to the public.
Research on interest politics in the EU showed that its institutions and policies
contribute to the transformation of interest intermediation in Europe (Streeck and Schmitter, 1991). Most research indicates that business associations do particularly well in promoting their agendas and preventing policies they do not want (Klüver, 2013). Others found that the EU’s multiple tiers of government offer opportunities for citizen groups to defend and advance their interests (Dür et al., 2015). In addition, the EU Commission regards citizens’ groups as allies in its efforts to improve its competences and legitimacy, and to establish a European public space. The European Parliament’s receptiveness toward citizen groups additionally supports these efforts. Besides, the ability of activists to expand public debates and conflict exceeds that of established business associations who prefer to shape the policy process quietly, avoiding open conflict (Dür et al., 2015: 958, 967, 975).
Major advances, ongoing debates, critical assessments
Research on interest groups has made great progress over the past century. Much has been learned about how they emerge and organize, aggregate and articulate demands, interact with each other, influence the legislative and executive branches of government, and how all this affects the outputs and outcomes of public policy-making in pluralist democracies. Following David Truman’s (1951) extension of Bentley’s (1908) seminal study on social pressures and their effects on governing, a large body of interest group research appeared, a considerable part of which led to disillusionment. Early notions of a balance among private interests have been repeatedly refuted. In most of the cases investigated, interest group influence was found to be unevenly distributed. However, most studies also confirmed that powerful single interests were not able to monopolize political decision-making.
In this respect, the central thrust of pluralist theories could be confirmed: there is no single private interest, nor any sovereign power, in a pluralist democracy commanding the outcomes of public policy. Olson’s (1965) rational choice approach to the study of interest groups demystified some long-established views on the functioning of pluralism. He demonstrated that there is no level playing field in pressure politics. Moreover, what early pluralists saw as equilibration appeared to be more of a series of distributive struggles among small special interest groups exploiting the general interest: in other words, a sort of wrestling in a china shop, damaging the common good (Olson, 1986: 173). Olson’s theory exposed some serious flaws of the pluralist model but did not falsify the approach as such, nor replace it with a new one. On the contrary, Olson affirmed the idea that policy-making is a multi-channel process of political influence. He even emphasized the key role of lobby groups for national welfare (Olson, 1986). Gary Becker, however, considered Olson’s condemnation of small interest groups to be exaggerated ‘because competition among these groups contributes to the survival of policies that raise output’ (Becker, 1983: 344). Small interest groups may be more efficient in controlling the negative effects of free-riding, but they are handicapped in taking advantage of economies of scale in the organization of pressure. Becker assumes that policies reducing social output stimulate more countervailing pressure from adversely affected groups than welfare-enhancing policies do. This is mainly because the potential to compensate cost-bearers decreases owing to the dead-weight losses of collusive cartelization or redistributive policies. Therefore, in democratic states, the rising marginal costs of socially destructive lobbying should mark the limits of an excessively unbalanced growth of narrow interest groups (Czada, 1991: 272).
After the turn of the millennium, new approaches were largely inspired by attempts to take into account policy attributes and thematic factors that affect the course and outcomes of interest politics. It has been established that the issue context, in terms of the number of actors involved and their public interest position, matters for the strategies used and for their political success (Mahoney, 2007). Baumgartner et al. (2009) found that groups defending the status quo usually do better in realizing their goals than groups seeking to change policies. Research on the effects of interest group action on policy outputs still suffers from a lack of data. It is much more difficult to measure the political weight of interest groups than that of political parties. Except for simple cases, the relationship between the stakes of groups and their political strength remains a mystery, largely because in nearly all studies neither stakes nor gains in regulation are directly measured. This is all the more lamentable as the relative power to influence has served as a key explanatory variable. The question of the causal impact of interest groups on outcomes remains unresolved. Theories of pluralism and most research contributions simply assume that groups have an influence on policy outcomes. In contrast, theories of corporatism turn the influence vectors around or assume bi-directional causation, in particular repercussions of governmental policies on group strategies. It seems at least reasonable to assume that redistributive policies in favor of certain groups make them stronger and more influential. Lehmbruch (1991), in an attempt to establish a developmental theory of interest systems, points to the fact that interactions between interest organizations and governments are shaped by long-established national administrative cultures. Corporatist relations prevail in Scandinavian and some other small European countries. Lasting close relationships between administrations and associations were also found in Germany, but to a much lesser extent in the UK, and hardly at all in the United States. They are practically absent in France due to the pronounced claim of autonomy of the French bureaucratic elite.
The closest state–society networks could be found in Switzerland, where - at the end of the 19th century - the federal government began to subsidize the formation of representational monopolies of top associations due to the peculiar structure of the Swiss state. Because of its institutional decentralization and, at that time, its extremely weak administrative capacities, the government of the federation found itself not well equipped to reconcile the conflicting interests in foreign trade and conduct successful international negotiations on tariffs. Therefore, it proposed to the ‘Vorort’ (hitherto an association run by leading businessmen in a honorary capacity) and to the Swiss Union of Articrafts and Trades (small business) financial grants to employ full-time secretaries for the establishment of trade statistics and other documentation needed by the government. (Lehmbruch, 1991: 137)
To be sure, state–society links, sectoral subgovernments, issue networks and iron triangles have been part of the American research agenda. But close state–group interactions have been interpreted more as an expression of inadmissible state capture than as two-sided exchange relationships. This, however, is just another indication of how strongly the focus on group pressures and lobbying shapes the American tradition of research on pluralism. This is not just a blind spot on the research agenda. Rather, it points to a possible tautology that lies in assuming a clear causality in the ambiguous relationship between political pressure and public policy. Since most policies have redistributive effects, researchers have been tempted to identify winning groups as the most powerful. Such a backward inference would only hold if one interprets public policies solely as the result of group pressure. It turns into tautology when other explanatory factors, such as factual constraints, scientific expertise, institutional imperatives, policy routines or strategies of governments toward particular groups, are taken into account, to the point that governments exert pressure or ask interest groups to exert pressure on them in favor of a particular policy.
Conclusion and prospects
In summary, it can be said that pluralism research has gone through several stages of development and branched out in many ways, but without abandoning its reference to the impact of interest group politics on political decision-making. Leaving the early history of ideas aside, research on democratic pluralism began with the American group school and its assumption that public policy-making was determined by the interaction of groups (Truman, 1951). The second stage focused on the concept of ‘pluralist democracy’ based on a decentralized, non-majoritarian political system called ‘polyarchy’ by Dahl (1967). Major contributions to the third stage rejected earlier assumptions that all groups have equal opportunities to organize and to deal with conflicts. McFarland (2010: 40) describes this stage as one of ‘multiple-elitism’ because of its particular focus on special-interest coalitions, sub-governments and issue networks. The fourth stage of neo-pluralism and corporatist intermediation extended the thematic range by emphasizing the role of governments and administrations and their exchange relationships with interest groups. In the course of this development, each subsequent variant of theory and research retained elements of earlier ones, but rejected others that were thought to be erroneous. This looks like an ideal case of cumulative research and discovery. However, it did not lead to a coherent theory of pluralism. On the contrary: in dealing with diversity, the research on pluralism has itself become very diverse. This is because one finds a multitude of democratic models of interest intermediation over time, across policy fields and in cross-national comparative perspective. The conclusion drawn long ago that group influence is fundamentally biased in favor of business and professional interests is still generally correct. Nevertheless, many studies point to a much more diverse interest group system in which weak groups are somewhat better represented now than in the 1970s and 1980s. Sustained distortions are mainly caused by barriers to collective action as explained by Olson (1965) and by a lack of resources on the part of underrepresented interest groups. The most promising new avenues point to the effects of government policies on the structure and development of interest groups. Increasing government activities seemingly led to a massive shift in interest group activism, creating a more diverse and densely packed political environment. Moreover, attributes of state institutions go hand in hand with access opportunities for groups, and specific policies shape their choices, opportunities and strategies, an observation that Eckstein (1960) made more than half a century ago with reference to the British case. This is reminiscent of two critical statements on pluralism research: first, Almond’s (1983: 252) comment that research on pluralism reveals signs of ‘professional amnesia … impairment of professional memory [that] has become common in political science and helps to explain its fragmented and faddish character’; second, LaPalombara’s (1960: 29, 34) warning not to simply transfer American approaches elsewhere, but to take other countries’ different traditions and structures of interest politics as a basis for cross-national comparisons. This task, suggested in a paper delivered at the 1959 Annual Meeting of the Midwest Conference of Political Scientists, is yet to be undertaken.
References
Almond, G. A. (1983), ‘Corporatism, Pluralism, and Professional Memory’, World Politics, Vol. 35 No. 2, pp. 245–60.
Althusius, J. (1964), The Politics of Johannes Althusius (F. Carney, Trans.), Boston: Beacon Press. (Original work published 1603).
Andrews, K. T. and Edwards, B. (2004), ‘Advocacy Organizations in the US Political Process’, Annual Review of Sociology, Vol. 30, pp. 479–506.
Baumgartner, F. R., Berry, J. M., Hojnacki, M., Leech, B. L. and Kimball, D. C. (2009), Lobbying and Policy Change, Chicago: University of Chicago Press.
Becker, G. S. (1983), ‘A Theory of Competition among Pressure Groups for Political Influence’, The Quarterly Journal of Economics, Vol. 98 No. 3, pp. 371–400.
Bentley, A. F. (1908), The Process of Government: A Study of Social Pressures, Chicago: University of Chicago Press.
Burtenshaw, C. J. (1968), ‘The Political Theory of Pluralist Democracy’, The Western Political Quarterly, Vol. 21 No. 4, pp. 577–87.
Calmfors, L. and Driffill, J. (1988), ‘Bargaining Structure, Corporatism and Macroeconomic Performance’, Economic Policy: A European Forum, Vol. 3 No. 6, pp. 13–61.
Connolly, W. E. (Ed.) (1969), The Bias of Pluralism, New York: Atherton Press.
Czada, R. (1991), ‘Interest Groups, Self-Interest, and the Institutionalization of Political Action’, in Czada, R. M. and A. Windhoff-Héritier (Eds), Political Choice: Institutions, Rules, Limits of Rationality, Frankfurt am Main/Boulder, CO: Campus/Westview Press, pp. 257–99.
Dahl, R. A. (1961), Who Governs? Democracy and Power in an American City, New Haven, CT: Yale University Press.
Dahl, R. A. (1967), Pluralist Democracy in the United States: Conflict and Consent, Chicago: Rand McNally.
Dahl, R. A. (1971), Polyarchy: Participation and Opposition, New Haven, CT: Yale University Press.
Deveaux, M. (2000), Cultural Pluralism and Dilemmas of Justice, Ithaca, NY: Cornell University Press.
Dewey, J. (1926), ‘The Historic Background of Corporate Legal Personality’, The Yale Law Journal, Vol. 35 No. 6, pp. 655–73.
Dreyer, M. (1993), ‘German Roots of the Theory of Pluralism’, Constitutional Political Economy, Vol. 4 No. 1, pp. 7–39.
Dür, A., Bernhagen, P. and Marshall, D. (2015), ‘Interest Group Success in the European Union: When (and Why) Does Business Lose?’, Comparative Political Studies, Vol. 48 No. 8, pp. 951–83.
Eckstein, H. (1960), Pressure Group Politics: The Case of the British Medical Association, Stanford, CA: Stanford University Press.
Fraenkel, E. (1964), Der Pluralismus als Strukturelement der freiheitlich-rechtsstaatlichen Demokratie, München: C.H. Beck.
Gierke, O. F. v. (1900), Political Theories of the Middle Ages, Cambridge: Cambridge University Press.
Glass, S. T. (1966), The Responsible Society: The Ideas of the English Guild Socialists, London: Longmans.
Heclo, H. (1978), ‘Issue Networks and the Executive Establishment’, in Beer, S. H. and King, A. (Eds), The New American Political System, Washington: AEI Studies, American Enterprise Institute for Public Policy Research, pp. 87–124.
Hirst, P. Q. (1994), Associative Democracy: New Forms of Economic and Social Governance, Cambridge: Polity Press.
Jenkins, J. C., Wallace, M. and Fullerton, A. S. (2008), ‘A Social Movement Society? A Cross-National Analysis of Protest Potential’, International Journal of Sociology, Vol. 38 No. 3, pp. 12–35.
Klüver, H. (2013), Lobbying in the European Union: Interest Groups, Lobbying Coalitions, and Policy Change, Oxford: Oxford University Press.
Kymlicka, W. (2003), Multicultural Citizenship: A Liberal Theory of Minority Rights, Oxford: Clarendon Press.
LaPalombara, J. (1960), ‘The Utility and Limitations of Interest Group Theory in Non-American Field Situations’, The Journal of Politics, Vol. 22 No. 1, pp. 29–49.
Latham, E. (1952), The Group Basis of Politics: A Study in Basing-Point Legislation, Ithaca, NY: Cornell University Press.
Lehmbruch, G. (1977), ‘Liberal Corporatism and Party Government’, Comparative Political Studies, Vol. 10 No. 1, pp. 91–126.
Lehmbruch, G. (1991), ‘The Organization of Society, Administrative Strategies, and Policy Networks: Elements of a Developmental Theory of Interest Systems’, in Czada, R. M. and A. Windhoff-Héritier (Eds), Political Choice: Institutions, Rules, and the Limits of Rationality, Frankfurt am Main/Boulder, CO: Campus/Westview Press, pp. 121–60.
Lehmbruch, G. and Schmitter, P. C. (Eds) (1982), Patterns of Corporatist Policy-Making, London and Beverly Hills, CA: Sage.
Lijphart, A. (1971), ‘Cultural Diversity and Theories of Political Integration’, Canadian Journal of Political Science, Vol. 4 No. 1, pp. 1–14.
Lijphart, A. (2004), ‘Constitutional Design for Divided Societies’, Journal of Democracy, Vol. 15 No. 2, pp. 96–109.
Linz, J. J. and Stepan, A. C. (1978), The Breakdown of Democratic Regimes, Baltimore, MD: Johns Hopkins University Press.
Lowi, T. J. (1969), The End of Liberalism: Ideology, Policy, and the Crisis of Public Authority, New York: Norton.
Mahoney, C. (2007), ‘Lobbying Success in the United States and the European Union’, Journal of Public Policy, Vol. 27 No. 1, p. 35.
McFarland, A. S. (2007), ‘Neopluralism’, Annual Review of Political Science, Vol. 10, pp. 45–66.
McFarland, A. S. (2010), ‘Interest Group Theory’, in Maisel, L. S. and J. M. Berry (Eds), The Oxford Handbook of American Political Parties and Interest Groups, Oxford: Oxford University Press, pp. 37–56.
Mobrand, E. (2018), ‘Limited Pluralism in a Liberal Democracy: Party Law and Political Incorporation in South Korea’, Journal of Contemporary Asia, Vol. 48 No. 4, pp. 605–21.
Olson, M. (1965), The Logic of Collective Action: Public Goods and the Theory of Groups, Cambridge, MA: Harvard University Press.
Olson, M. (1986), ‘A Theory of the Incentives Facing Political Organizations: Neo-Corporatism and the Hegemonic State’, International Political Science Review, Vol. 7 No. 2, pp. 165–89.
Putnam, R. D. (2000), Bowling Alone: The Collapse and Revival of American Community, New York: Simon & Schuster.
Skilling, H. G. (1971), ‘Group Conflict in Soviet Politics: Some Conclusions’, in Skilling, H. G. and F. Griffiths (Eds), Interest Groups in Soviet Politics, Princeton, NJ: Princeton University Press, pp. 379–416.
Smits, K. (2005), Reconstructing Post-Nationalist Liberal Pluralism: From Interest to Identity, New York: Palgrave Macmillan.
Streeck, W. and Schmitter, P. C. (1985), Private Interest Government: Beyond Market and State, London and Beverly Hills, CA: Sage.
Streeck, W. and Schmitter, P. C. (1991), ‘From National Corporatism to Transnational Pluralism: Organized Interests in the Single European Market’, Politics & Society, Vol. 19 No. 2, pp. 133–64.
Strolovitch, D. Z. and Forrest, M. D. (2010), ‘Social and Economic Justice Movements and Organizations’, in Maisel, L. S. and J. M. Berry (Eds), The Oxford Handbook of American Political Parties and Interest Groups, Oxford: Oxford University Press, pp. 468–84.
Tichenor, D. J. and Harris, R. A. (2005), ‘The Development of Interest Group Politics in America: Beyond the Conceits of Modern Times’, Annual Review of Political Science, Vol. 8, pp. 251–70.
Truman, D. B. (1951), The Governmental Process: Political Interests and Public Opinion, New York: Alfred Knopf.
Wilson, J. Q. (1995), Political Organizations, Princeton, NJ: Princeton University Press.
Young, C. (Ed.) (1993), The Rising Tide of Cultural Pluralism: The Nation-State at Bay? Madison, WI: University of Wisconsin Press.
35 Political Behavior Oscar Gabriel
The Origins of Behavioral Research
Compared to other sub-disciplines of political science, political behavior is a new field of research. It was not until the 1920s that scholars presented the first findings on the patterns and determinants of electoral choices. The contours of behavioral analysis first became visible in the path-breaking study The People’s Choice (Lazarsfeld et al., 1944), which embedded the analysis of electoral behavior in a broader context of political communication, information processing and attitude formation, and thus, much earlier than other studies, highlighted the dynamics of people’s political attitudes and actions. Although political behavior was broadly investigated in the following decades, behavioralism as a new paradigm in political science was not outlined systematically before the 1960s. Inspired by methodological and theoretical advances in neighboring disciplines, some scholars claimed the need for a more professional profile of political analyses. As a consequence, the demand to broaden the scope of political science by putting analyses of political behavior on the research agenda gained momentum (Almond, 1996: 64–78). In addition to sketches of the methodological creed of behavioralism, a number of empirical studies served as the blueprint of the evolving behavioral program. In electoral research, Lazarsfeld et al. (1944) established the sociological school. The socio-psychological model introduced by Campbell et al. (1954, 1960) has figured as the leading approach in electoral research up to now. Researchers around Verba and Nie (1972, 1978) have played a key role in research on political participation. The Civic Culture (Almond and Verba, 1963/1989) should also be mentioned in this context as a point of reference for large-scale cross-national studies using individual-level data (Berg-Schlosser, Chapter 37, this Handbook). Between 1950 and 1970, research on political behavior developed into an innovative and flourishing discipline in the United States. In other nations, the institutionalization of behavioral research was delayed by at least a decade. Aside from electoral behavior and political participation as the two main fields of behavioral political science (see also Dalton and Klingemann, 2007), the behavioral paradigm inspired research in diverse fields such as political representation, legislative and judicial behavior, coalition building, political socialization and, mostly outside of political science at first, political communication (Almond, 1996: 73–8).
Behavioralism in Political Science
A major step toward establishing behavioral research as a subfield of political science was made when the behavioralist revolution formed as an academic movement. Psychological behaviorism, as promoted by John B. Watson (1913), served as a blueprint for the behavioralist research agenda. Behaviorism limited psychological research to observable behavior and developed a stimulus–response model as an explanation of behavior. Sharing with behaviorism an interest in human behavior, the behavioral approach in political science focuses on the description and explanation of a wide range of individual and collective behaviors in politics. This goes alongside the assumption that macro-level characteristics such as stability and innovation, performance and responsiveness of political regimes are ultimately rooted in civic behavior (Almond and Verba, 1963/1989; Easton, 1965). The demand to shift the focus of political research from institutions and history to political behavior has always been accompanied by a plea for the use of quantitative empirical data as a means of theory testing. The methodological guidelines of behavioralism, as stated by Easton (1967: 16–17), can be summarized as follows:
1 Focusing research on testing hypotheses/theories that explain regularities of political behavior, that are validated by observing political reality, and used as a basis of scientific forecasting and social technology.
2 Using standardized research methods in order to gather quantitative data as a basis for describing political behavior and testing explanatory theories.
3 Strictly distinguishing empirical explanation from ethical evaluation and limiting research to the former.
4 Learning from the theoretical and methodological achievements of the more advanced neighboring disciplines of political science.
The principle of theory-oriented empirical research serves as the glue keeping the behavioral movement together. Starting from the psychological stimulus–organism–response paradigm, behavioral analyses describe how people perform their various roles in politics and how the interplay between the citizens’ social position, social context and political attitudes accounts for that (Smith, 1968). Theories are understood as (sets of) hypotheses stating a causal relationship between two kinds of empirical observables: dependent and independent variables. An explanation is seen as valid only if the dependent variable is empirically shown to be a consequence of the independent variable and cannot be attributed to the influence of third factors. The behavioral creed was put into practice in a variety of ways. Theoretical behavioralists such as Easton (1967) and Eulau (1969) elaborated the methodology of behavioralism and outlined the building blocks of a general theory of political action. Easton’s A Systems Analysis of Political Life (1965) stands out as a prominent example of a macro-level analytical framework explaining which factors contribute to the survival of political systems in a changing environment. Rational choice theories (Downs, 1957; Simon, 1957) stand for an intellectual tradition of methodological individualism and emphasize the ideal of a rational decider. This latter approach has evolved into the leading paradigm in modern political theory and has inspired more recent psychological theories of political behavior (Mutz, 2007). Instead of working on a general theory covering the whole array of civic activities, empirical behavioralism has proceeded by testing middle-range theories that explain specific forms of political behavior. Most empirical studies have focused on voting behavior and political participation and have generated a plurality of mutually compatible theoretical approaches.
Electoral Research
Electoral research is a good example of theoretical pluralism showing a co-existence of sociological and socio-psychological approaches. The sociological school dates back to a regional study of the presidential election of 1940 in Erie County, Ohio (Lazarsfeld et al., 1944), while nationwide studies of the 1952 and 1956 presidential elections (Campbell et al., 1954, 1960) were the starting point of the socio-psychological approach to voting behavior. The first American National Election Study (ANES) providing the data for electoral research in the United States was conducted in 1948 and has been repeated regularly since then. From its beginnings, the ANES inspired national, international and cross-national research on political attitudes and behavior in several respects. First, the project developed the design of representative population samples allowing inferences on the characteristics of national publics. Second, it drafted the key concepts aiming to describe and explain political behavior. Third, it translated the relevant constructs into survey questions allowing the measurement of behavioral, attitudinal, social and contextual characteristics of citizens. As a result of a debate continuing over a long time-span, the methods for sampling and data collection, as well as theoretical constructs and the explanatory models, were steadily improved. Continuous, cumulative empirical research conducted in an increasing number of democratic countries has successively generated a huge amount of comparable data for empirical research. Thus, the developmental work performed in the ANES context left its traces across the entire field of attitudinal and behavior studies (Arzheimer et al., 2017; Dalton and Klingemann, 2007: 9–13; Lewis-Beck et al., 2008: 10–18).
The Sociological Models of Voting Behavior
The sociological approach to electoral behavior was never elaborated as a coherent theory. Apart from sharing the view of social characteristics as key determinants of behavior, most studies analyzing the electoral choices of different social groups referred rather loosely to the work of Lazarsfeld et al. (1944: 27). In a different respect, this applies also to the second pioneering study of the sociological school, the social cleavage theory developed by Lipset and Rokkan (1967). Beyond their focus on social characteristics, these two variants of the sociological approach do not have much in common. While Lazarsfeld and his associates aim to explain individual voting decisions, the cleavage theory is a macro-sociological approach linking societal divisions to party system characteristics. It serves as a background theory for individual-level voting analyses rather than as a basis of systematic empirical studies testing the impact of conflict resolution on party system change throughout modern history (Berntzen, Chapter 23, this Handbook). Although the ideas of Lipset and Rokkan were never stringently operationalized, the cleavage theory has nevertheless inspired numerous national and cross-national studies of the impact of social change on party identification and voting behavior (von Schoultz, 2017). In a broader approach, Lazarsfeld et al. not only investigate the role of social backgrounds, but also include interpersonal and mass communication as determinants of electoral choice.
The Micro-Sociological Approach
According to the micro-sociological approach, a large number of individual social characteristics and contextual factors account for voter turnout and party choice. Among the social background variables, socio-economic status (SES), religion, residence, occupation and age were first separately analyzed as determinants of individual voting decisions and then combined into an Index of Political Predisposition (IPP). When charting the social basis of party support, it was found that the Republicans were preferred by around 75% of Protestant, high SES people living in rural areas. An even higher percentage of low status, Catholic, urban dwellers voted for the Democrats (Lazarsfeld et al., 1944: 16–27). Pro-business attitudes were other important antecedents of party preference (Lazarsfeld et al., 1944: 28–39). Political interest, the intensity and consistency of exposure to political stimuli, active search for political information and anticipation of the winner of the election, media consumption and contact with opinion leaders made a difference to the timing, stability and direction of the voting choice (Lazarsfeld et al., 1944: 52–158). In analyzing the role of these factors, the authors highlighted the dynamics of the processes of opinion formation and voting choice. Some of these assumptions returned to the research agenda when electoral studies sought to build bridges with the neighboring disciplines of communication research and psychology.
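How separate background characteristics can be combined into a single predisposition index may be illustrated with a minimal sketch. The additive scheme below is a hypothetical simplification and not the coding used by Lazarsfeld et al.; the names `POINTS`, `ipp_score` and `predisposition`, the category labels and the point values are all assumptions chosen for illustration, using only the three characteristics for which the text reports a direction of association.

```python
# Hypothetical additive index of political predisposition (IPP).
# The point values are illustrative assumptions, not the original coding.

# Each characteristic adds one point toward a Republican predisposition,
# following the direction of the associations reported in the text.
POINTS = {
    "religion": {"protestant": 1, "catholic": 0},
    "ses": {"high": 1, "low": 0},
    "residence": {"rural": 1, "urban": 0},
}


def ipp_score(respondent: dict) -> int:
    """Sum the points of a respondent's background characteristics."""
    return sum(POINTS[trait][respondent[trait]] for trait in POINTS)


def predisposition(score: int) -> str:
    """Map the summed score to a coarse predisposition label."""
    max_score = len(POINTS)
    if score == max_score:
        return "strongly Republican"
    if score == 0:
        return "strongly Democratic"
    return "mixed"


if __name__ == "__main__":
    voter = {"religion": "protestant", "ses": "high", "residence": "rural"}
    print(predisposition(ipp_score(voter)))
```

An additive index of this kind treats each characteristic as equally important, which is a design choice of the sketch; a study working with real survey data would have to justify its own categories and weights.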
The Macro-Sociological Approach
Compared to the micro-sociological school, the cleavage theory has far more strongly shaped empirical research on the social antecedents of electoral choices, particularly in Europe. The term ‘cleavages’ describes durable political divisions that are rooted in the structure of a society and find their expression in the party system. Cleavages form the basis of voter–elite coalitions and are stabilized by intermediary organizations (churches, unions) and anchored in ideologies (left–right) (von Schoultz, 2017: 33–6). Lipset and Rokkan (1967) distinguished between two kinds of cleavages: socio-cultural and socio-economic. The former separate center from periphery and state from church; the latter divide urban from rural areas and labor from capital. While socio-cultural conflicts center on values and group identities, socio-economic divisions relate to the organization of economic life (state versus markets) and to the shape and size of the welfare state. According to Lipset and Rokkan, the European party systems of the 1960s reflected the cleavage structures prevailing at a point at which the European societies had passed crucial thresholds in the processes of modernization and democratization (‘freezing hypothesis’). The cleavages of this period were assumed to reflect the nature, number and sequence of modernization conflicts. In one ideal constellation, the four modernization conflicts were resolved stepwise and generated a two-party system exclusively built on the last cleavage, the division between labor and capital (‘Left’ and ‘Right’). One of the competing parties represents labor (Social Democrats), the other business (Liberals). When, by contrast, the political system faced the four modernization conflicts more or less simultaneously and when there was no opportunity to resolve them successively, a multi-party system reflecting the citizens’ positions in the multiple cleavage structure was formed. In addition to Social Democrats and Liberals, religious, agrarian and ethnic or regional parties then competed for electoral support.
As demonstrated by empirical research, the citizens’ position in the cleavage structure does not fully determine voting choices. Nevertheless, the basic assumptions on the voter–party coalitions were broadly supported by empirical studies, mainly in Europe. Earlier than studies using the socio-psychological approach, sociological analyses of electoral behavior evolved as a field of cross-national and cross-temporal electoral studies. They present broad evidence on the social basis of partisan choices and, moreover, explain variations in electoral behavior over time, as well as between and within nations (von Schoultz, 2017; Dalton and Klingemann, 2007: 9–13).
The Socio-Psychological Model of Voting Behavior
Compared to the complexity of the microsociological approach, Campbell and his associates offer a parsimonious, coherent and powerful explanation of individual voting behavior. Although well aware of the manifold influences on electoral choices, the socio-psychological school focuses on a minimal number of political attitudes, that is, the citizens’ attitudes toward the parties and candidates running for public office. These are conceived as located close to the party choice and as genuinely ‘relevant conditions’. This clearly reflects the behavioral creed that exogenous (historical, social and contextual) factors will not become meaningful for behavior unless they are translated into political attitudes immediately preceding that behavior. All relevant components of the model were arranged in a funnel of causality, with the voters’ social position and their political experience as starting points, the attitudes toward parties and candidates in an intermediary position, and the voting behavior at the end of the causal chain (Campbell et al., 1960: 18–64, 523–38). Among the attitudes toward parties and candidates shaping the electoral choices, party identification plays the decisive role
(Campbell et al., 1960: 120–45). This concept is rooted in psychological group theories and characterizes a more or less stable affective link between individuals and a political party. It is seen as affectively charged, varying in direction and strength, persisting over time and serving as the prime direct determinant of party choice. Under normal conditions, most people cast their vote in favor of the party they identify with. At the same time, party affiliation shapes citizens’ attitudes toward candidates and issues and thus also exerts an indirect influence on the voting decision. Issue orientations depict the impact of public policy on voting choices (Campbell et al., 1960: 168). Stated simply, issue voting means that voters prefer the candidate or party they perceive as most competent in dealing with public policy matters, that is, domestic and foreign issues, and as best representing their policy stances. In order to become salient for party choices, political issues must be cognized (issue familiarity) and induce a minimal intensity of feeling (salience), and voters must perceive one of the parties to be closer to their positions or expectations than the others (Campbell et al., 1960: 170–87). Due to shifts in the political agenda, the citizens’ issue orientations may change from one electoral contest to the next. The perception and evaluation of the candidates running for public offices figures as the third component of the socio-psychological model. Campbell et al. (1960: 52–9) regarded voters’ attitudes toward the presidential candidates as a multidimensional phenomenon. Candidates are perceived as representatives of their parties and assessed according to their personality, qualification, experience and record. These attitudes can rest on diffuse (‘a good man’) as well as on more specific attributes, such as being strong and decisive or presenting themselves as kind, warm and likable. 
By conceptualizing party identification, issue and candidate orientations as main determinants of party choice, the socio-psychological model incorporates long-term and short-term factors. While the former
account for behavioral stability, the latter are a source of electoral change. The stabilizing function is attributed to party identification, while the attitudes toward issues and candidates are seen as fluctuating more strongly between elections. Due to its specific characteristic as a stable and influential attitude, party identification exerts a direct influence on the vote and has an indirect impact on it through shaping attitudes toward candidates and issues. When voters assess one of the competing parties positively on all three relevant dimensions, they will be strongly disposed to participate in the election and to cast their vote in favor of the respective party. However, party identification is not immune to changes and can, for instance, be altered over the long run by assessments of candidates and issues contradicting the existing party affiliation. Inconsistency between those factors makes abstention from voting probable and will also elicit changing party choices (Campbell et al., 1960: 523–38). The socio-psychological theory has served as a basis for innumerable studies of electoral behavior all over the democratic world. Due to variations in study design, time and context, we cannot derive general conclusions from existing empirical evidence. Nevertheless, party identification has proved the crucial determinant of voting behavior in the United States and in a large number of other democratic countries over the years. As expected, the impact of issue orientations and the attitudes toward candidates on voting choices differs strongly between elections, with no clear pattern over time or across nations (Dalton and Klingemann, 2007: 9–13; Lewis-Beck et al., 2008: 31–59, 60–81, 111–200; Thomassen, 2005a, 2005b).
Revising established concepts
Innovations
In the last quarter of the 20th century, electoral research was institutionalized as a subdiscipline of political science in many democratic countries. Nationwide survey data are used as the main source of electoral studies. The scope of electoral research has broadened by including analyses of so-called second order elections, such as those for the European Parliament. At the same time, scholars have sought to integrate the core assumptions of the traditional approaches, but have also opened the discipline to new psychological approaches. The search for a common theoretical frame of reference, the updating of the key concepts, the improved availability of national and cross-national data and the formation of international research networks have laid the ground for cross-national and cross-temporal comparison of voting behavior. Answering the question of how societal change was going to alter the role of party identification, issue and candidate orientations and their impact on electoral choices moved to the top of the research agenda. These conceptual and theoretical innovations went alongside the development of new methods of data gathering and data analysis. Finally, electoral research that originated in the United States and then was institutionalized in the OECD countries has expanded to other world regions such as Central and Eastern Europe, Latin America and parts of Asia and Africa, where social cleavages and party alignments do not play the same determining role in voting behavior as they do in the established democracies. As a consequence, electoral volatility and rapid party system changes seem to play a major role in the political process of the emerging democracies (Arzheimer et al., 2017: 5; Dalton and Klingemann, 2007: 12–13; in greater detail LeDuc et al., 2002, 2010, 2014).
Changing Views on Party Identification
Since the publication of The American Voter, research on party identification has been
mainly influenced by two factors. First, as electoral studies were conducted in an increasing number of democratic societies over an extended time-span, scholars observed considerable cross-national and cross-temporal variations in levels of party identification. This elicited a dispute as to whether party identification exists and how it affects voting choices in non-American, often multi-party systems. A follow-up problem referred to the measurement of party identification under varying institutional and cultural conditions. Second, findings on the variation of party identification over time stimulated broad research on the nature and causes of the respective attitude (Budge et al., 1976/2010; Lewis-Beck et al., 2008: 138–56). After a series of long and controversial debates, party identification is now mostly understood as a stable, affective link between voters and political parties that differs from formal party membership and also from party choice. It exists in most contemporary democracies, is rooted in processes of primary and secondary socialization and shapes voting choices as well as other political attitudes and behaviors. In broadening the concept, some scholars assume that people can develop stable affective links to more than a single party, particularly in multi-party systems. Under specific conditions, the voters’ relationship to political parties may also consist in rejecting one or more parties rather than in upholding stable affective linkages to others – what is called negative party identification. These conceptual refinements reflect variations in the institutional and cultural contexts in which party identification emerges and operates (Arzheimer et al., 2017; Lewis-Beck et al., 2008: 127–37; Thomassen, 2005a: 11–12). The broadly documented cross-national decline of party alignment gave rise to debate as to the changing nature of party identification and the impact this might have on electoral choices.
While some scholars plead for a new look at party identification as a volatile
tally in the evaluation of party performance, most uphold the view of party identification as a stable affective political attitude impacting a large number of less stable orientations to parties and politics. However, the traditional conceptualization was also challenged by findings that voting behavior could also affect party identification. In revising the prevailing view of party identification as being mainly transmitted by parental socialization and largely persisting over time, Converse (1969) emphasized the role of increasing electoral experience and repeated decisions in favor of a particular political party as stabilizing party attachments. While most scholars have dealt with party identification as an individual-level concept, there were also attempts to explore the distribution, source and impact of party identification at the macro level. While analyses of normal voting and critical elections highlight the impact of the distribution of party identification on long-term party system changes, the concept of macro-partisanship is preoccupied with (short-term) changes in the distribution of party identification as a reaction to fluctuating political conditions (Lewis-Beck et al., 2008: 131–6).
Refining the Concept of Issue Orientations
Research on issue voting developed as a particularly innovative field and yielded a series of conceptual and theoretical advances (Arzheimer et al., 2017; Lewis-Beck et al., 2008: 185–200). Among other reasons, attention to issue voting grew because of the increasing attractiveness of rational choice theories. The construct of a rational voter who carefully processes information about political parties and their issue positions and performance attributes a decisive role to political issues in the making of electoral choices (Downs, 1957). In addition, the finding of a decline in party identification stimulated scholarly attention to issue voting.
Long before the debate on a decline of party identification began, Stokes (1963) introduced a distinction between valence and position issues. Valence issues, such as employment, growth and social security, are not a matter of disagreement on goals. Instead, people often attribute varying levels of salience to these issues and have different perceptions of the competence of parties and candidates in handling these problems. By contrast, position issues refer to controversial goals such as liberalization of abortion or immigration. In these cases, the voters do not reason about priorities and competences, but ask whether and to what degree political parties share their own viewpoints. Spatial models of voting derive from the distinction between valence and position issues by addressing the nature of agreement between the issue positions held by voters and political parties. Directional voting means that voters prefer a party because they share its general views, such as a liberal immigration policy. In contrast, distance voting focuses on the perceived distance between the voters’ and the parties’ stances on issues, irrespective of whether both stay on the same side. This may be particularly relevant for moderate voters who do not take a position in the same sector of the political space as a moderate party, but are closer to it than to the standpoint of more extreme parties perceived as being on the right side. The concept of issue ownership rests on the idea that a specific political party is consistently seen as most competent in dealing with an issue or a set of issues. Therefore, this issue is given a prominent role in the party’s political agenda and will be strongly highlighted in its campaign activities in order to prime the voters’ judgments. Examples are the association of law and order with conservative parties and of social welfare measures with socialist ones.
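The contrast between distance and directional voting described above can be made concrete with a small numerical sketch. It is an illustration only and rests on assumptions that are not part of the chapter: a single issue scale with a neutral point at 0, a squared-distance utility for distance voting, and a scalar-product utility for directional voting (a form often used in the directional voting literature). The party names, positions and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    name: str
    position: float  # placement on one issue scale; 0.0 is the assumed neutral point


def proximity_utility(voter: float, party: float) -> float:
    """Distance voting (assumed form): utility falls with squared distance."""
    return -(voter - party) ** 2


def directional_utility(voter: float, party: float, neutral: float = 0.0) -> float:
    """Directional voting (assumed form): positive only if both are on the same side."""
    return (voter - neutral) * (party - neutral)


def preferred(voter: float, parties: list[Party], utility) -> str:
    """Return the name of the party that maximizes the given utility function."""
    return max(parties, key=lambda p: utility(voter, p.position)).name


# Hypothetical two-party example: a moderate party just left of the neutral
# point and a more extreme party on the right.
parties = [Party("moderate_left", -0.5), Party("far_right", 4.0)]
voter = 0.5  # moderate voter slightly to the right of the neutral point

print(preferred(voter, parties, proximity_utility))    # moderate_left
print(preferred(voter, parties, directional_utility))  # far_right
```

Under these assumptions the same moderate voter chooses the nearby moderate party when distance counts, but the party on their own side of the neutral point when direction counts, which is the situation the text singles out as relevant for moderate voters.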
Analyses of economic voting as a particular case of issue voting (Campbell et al., 1960: 381–401; Lewis-Beck et al., 2008: 365–89) are anchored in the tradition of the
rational choice approach. Accordingly, the individuals’ cost–benefit calculations serve as the decisive factor in their voting decisions. They calculate the economic return they had obtained or might expect when being governed by alternative parties and candidates. In the end, they prefer the alternative which appears most economically advantageous for them. In view of the ambiguous meaning of economic benefits, different approaches to economic voting have developed. One distinction is between pocketbook and sociotropic voting and refers to the personal economic situation versus the state of the national economy as the basis of calculating economic return. Finally, as an alternative to evaluating the handling of a large set of specific political issues, voters can consider the general record of the incumbent officials as a less costly alternative and either use past performance or expectation of future actions as the basis for their preference building.
Candidate Orientations
Research on candidate orientations took a strong impetus from the debate on the decline of party identification. By adopting psychological concepts, scholars have dealt in depth with the dimensions and the impact of candidate orientations. The use of schema theory marked one major advance and conveyed evidence on voters’ ability to organize their assessments of political leaders alongside broader classes of characteristics. The major dimensions of candidate evaluations found in a large number of empirical studies vary in detail, but many findings confirm the view of Campbell and colleagues (1960) that competence, integrity, strong leadership and problem solving capabilities are perceived as relevant candidate traits. A good deal of research on candidates has dealt with the question of whether the decline of party identification and changes in the system of mass communication have elicited an increasing personalization of electoral
choices. In the context of voting choices, personalization means that candidate orientations have become more influential as determinants of electoral behavior over the years and that performance counts less than personality in this respect. However, the assumption of an increasing personalization of the vote has not generally been supported by empirical research so far (Garzia, 2017; Lewis-Beck et al., 2008: 53–8).
New Perspectives on the Determinants of Voting
De-alignment and Re-alignment: The Impact of Traditional and New Cleavages on Voting Choices
Parallel to other domains of political life, changes of electoral behavior became obvious in the 1970s and scholarly interest turned to the ‘changing voter’ (Thomassen, 2005a, 2005b). The increased electoral volatility and corresponding processes of partisan de-alignment and re-alignment were traced back to worldwide societal changes. Accordingly, the transition from industrial to post-industrial society and a concomitant process of value change seemed to weaken the traditional social cleavages that had long shaped electoral behavior. Indeed, the share of blue collar workers and self-employed people has been shrinking in favor of an emerging new middle class in the post-industrial economies. The parallel processes of cultural secularization and the shift from materialist (law and order, economic growth and prosperity, welfare politics) to post-materialist (lifestyle, emancipatory) values undermine traditional values. In terms of cleavage politics, socio-economic and socio-cultural changes have a double impact on the traditional voter–party coalitions. As demonstrated by a large number of empirical studies, the social groups that have built the basis of the electorates of
socialist, liberal, conservative and Christian parties that have traditionally dominated the electoral process in Europe no longer make up the majority of the electorate, which decreases the support for traditional parties. Since these parties still emphasize traditional political concerns, they face difficulties in finding support from the growing segments of the electorate who prioritize new political values and issues – the young, well educated, new middle class post-materialists. Moreover, there are indications that identification with and electoral support for these parties is declining in their traditional social milieus, too. This applies particularly to social democratic parties, which are divided between a traditional materialist and a new post-materialist left. As a consequence of the decline of party identification, it is said that short-term factors (candidate and issue orientations) increasingly shape voting choices and nurture electoral volatility. Generational replacement plays a key role in the process of de-alignment (Inglehart, 1983; von Schoultz, 2017; Thomassen, 2005b; Dalton and Klingemann, 2007: 9–13). While the de-alignment concept explains how and why voters’ support for traditional political parties vanishes, the idea of realignment goes one step further by showing what accounts for the voters’ switch from well-established to new party alignments and for the resulting formation of new political parties. In the 1970s and 1980s, the success of Green parties was explained as induced by the diffusion of post-materialist values. More recent re-alignment processes account for the rise of right wing populist parties. As demonstrated by Kriesi and others, the conflicting attitudes toward globalization versus national identity build the ground for an emerging new cleavage that divides supporters of right wing populist parties from supporters of the traditional ones (see Kriesi, Chapter 90, this Handbook). 
The shift of party loyalties elicited by the conflict around globalization goes alongside strong anti-establishment and anti-system attitudes. It displays true
characteristics of re-alignment, since many working class socialists have joined the camp of anti-globalists and now support right wing populist parties. As traditional cleavages have weakened, assumptions of the micro-sociological approach were also revitalized. Thus, the voters’ position in their social environment, namely the local and regional context, patterns of interpersonal communication and integration into social networks such as neighborhood and primary groups, is seen as an important determinant of voting behavior. Moreover, analyses of issue and candidate voting benefited from the inclusion of mass communication in electoral studies. Agenda setting, framing and priming are concepts highlighting the indirect influence of mass media reporting on electoral choices. Attention to institutional factors, for example electoral laws, as determinants of voting choice also increased (Arzheimer et al., 2017; Dalton and Klingemann, 2007; Thomassen, 2005b).
Cognition, Emotion and the Vote
In the 1990s, attention to psychological paradigms rose among electoral researchers. Although concern with cognitive processing is not completely new, the inclusion of emotions brings a previously neglected element into the analysis of electoral choices. Cognitive approaches are preoccupied with political information processing as preceding voting choices – as was already addressed by classical rational choice analyses of the role of information costs in the making of political judgments. The idealized figure of a fully informed voter was contrasted to a rationally ignorant voter who made no effort to achieve well-informed and considered choices (Downs, 1957). Similarly, Campbell et al. (1960: 188–215) had broadly analyzed the role of belief systems as a basis of well-informed voting choices. Subsequently, this concept was more deeply elaborated by
Converse’s (1964) idea of ideological reasoning. Accordingly, when evaluating political objects, ideological reasoners rely on a hierarchically and horizontally organized belief system encompassing a few fundamental, stable and abstract principles together with a large set of specific preferences derived from these core values. By contrast, Campbell pointed to party identification as an information shortcut for those voters who were neither capable nor willing to consider in full detail the characteristics of the issue agenda and the positions taken by the competing candidates or parties (Campbell et al., 1960: 128). The impact of information processing on the vote has become a prominent theme in electoral research over the past decades. Aligning to psychological dual modes theories that distinguish between a central and a peripheral route of information processing, these approaches incorporate the established explanatory concepts of voting choices, party identification, issue and candidate orientations into the framework of cognitive theories. Accordingly, well-structured and informed reasoning on parties, issues and candidates on the one hand, and the use of information shortcuts on the other, is mainly a result of the characteristics of the decision situation. When facing salient political issues, having sufficient time and disposing of adequate cognitive resources, voters take a central route of information processing. This implies a careful search for and evaluation of information that is needed for a rational decision. By contrast, low salience of issues, short time and limited cognitive resources elicit the choice of the peripheral route and induce people to simplify the choice situation by the use of heuristics, shortcuts or some other ways of ignoring or eliminating alternative information. 
Reasoned choice relies strongly on values, ideologies, a careful evaluation of the issue positions and problem solving capacities of political parties and assessment of the personality and performance of the candidates running for public offices. By contrast,
heuristic judgments make use of party identification, group stereotypes, media reporting and physical appearance of candidates, or rely on the preferences of persons in the voters’ personal environment when it comes to making a voting choice. Easily available information and impressions as well as past behavior may also shape the voters’ preference building (Mutz, 2007). Similar to cognitive dual mode theories, the theory of affective intelligence distinguishes between a central and a peripheral route of information processing, but infers the difference in information processing from the kind of emotions prevailing in the decisional situation. When emotions such as joy or pride prevail and the circumstances are not perceived as threatening, people rely on heuristic processing. By contrast, threatening situations elicit fear and lead people to use a systematic method of information processing. In the electoral situation, the party choice of people who feel fearful was shown as more strongly influenced by candidate and issue orientations and less so by party identification, while the opposite held true for voters experiencing joy and pride (Mutz, 2007; Redlawsk and Pierce, 2017). A second approach, the theory of motivated reasoning, also starts from the premise that people handle information processing differently when asked to make a decision. According to Lodge and Taber (2000), decisions rest on a different weighting of accuracy (information) and directional goals (predisposition). In addition to the classical model of rationality (strong accuracy and weak directional goals), they consider the three other combinations of accuracy and directional goals as ideal types. Turning to real politics, they emphasize that all information is affectively charged.
Thus, voters often do not conform to the ideal model of a rational decider but rather follow their predispositions, select information, simplify the decisional situation and seek justification of decisions they have already made before evaluating the available alternatives. Online
processing of information and resorting to ‘how do I feel’ heuristics are other core elements of Lodge and Taber’s theory (Redlawsk and Pierce, 2017). The changing views on electoral choices and the processes preceding the final decision have fostered a series of methodological innovations. Experiments and the use of data gathered by panel surveys and rolling cross sections have become established instruments in dealing with the dynamics of electoral behavior. Moreover, multi-level analyses have facilitated research on the interplay of macro- and micro-level variables in shaping electoral choices (Arzheimer et al., 2017).
Political Participation
Since the act of voting rests on the double decision to go to the polls instead of abstaining from voting and to make a choice among the competing parties, analyses of turnout were included in electoral studies from the beginning (Lazarsfeld et al., 1944: 40–52; Campbell et al., 1960: 89–115). Early research equated political participation with activities embedded in the electoral process. As the citizens’ repertory of political action broadened, scholars turned to the analysis of additional forms of political engagement and to the way these activities related to each other. While early analyses of political participation remained mostly descriptive, interest in explaining the level and type of citizens’ political engagement increased in the 1970s. Individual as well as macro-level approaches tried to integrate social, attitudinal and institutional factors into middle range theories exploring why and how people make their voices heard in politics. However, a broadly accepted theory of political participation does not currently exist. Instead, multiple, but compatible, explanatory approaches trace political participation back to a variety of individual and systemic factors (Gabriel, 2012: 11–20).
Changing views on political participation
As an ambiguous and contested concept, political participation is defined in different ways. Most early conceptualizations did not go beyond listing a set of activities understood as political participation (Campbell et al., 1960: 90–3), and no major attempts to define and operationalize the concept were made before the 1960s. Even then, there was no common understanding of what should be subsumed under the heading of political participation (Milbrath and Goel, 1977: 2; Verba and Nie, 1972: 2). Avoiding the limitations of older approaches, Verba et al. (1995: 38) define political participation as an ‘activity that has the intent or effect of influencing government action – either directly by affecting the making or implementing of public policy or indirectly by influencing the selection of people who make those policies’. Meanwhile, this view has become widely accepted as distinguishing political participation from other kinds of social and political behavior. As the scope of activities people use in order to influence politics has increased, research has more intensely analyzed the distinctive characteristics of various forms of participation and the relationship existing between them. In a first encompassing report of existing research, Milbrath described political participation as a pyramid of activities mainly differing in terms of required effort. He identified three types of political actors: the inactive, who largely stay outside the political arena and take a position at the base; the spectators, who mainly limit themselves to political communication and are in an intermediary position; and the gladiators, who actively seek to exert influence in the political arena and make up the small top of the pyramid (Milbrath and Goel, 1977: 10–12). The varying levels of effort required by an activity have remained an important aspect in the classification of participation,
but were eventually not seen as sufficient. Instead, political participation was shown as encompassing a variety of qualitatively different systems serving different purposes and used by different people (Verba and Nie, 1972: 45). In addition to the effort caused by the activity, Verba and Nie introduced three further characteristics that distinguished between electoral participation, campaigning and contact because of collective or individual concerns: (1) information or pressure as the function of the respective activity, (2) the level of implied conflict, and (3) the individual or collective impact of the outcome of participation. With only minor modifications, these assumptions were supported by empirical analyses. To better understand this conceptualization, it has to be mentioned that the authors originally excluded political protest from their analysis. Similar to the Verba group, Barnes et al. (1979) claimed a qualitative difference between the observed systems of participation, conventional and unconventional activity, based on the criteria of institutionalization, legality and legitimacy. This distinction was also validated empirically (Kaase and Marsh, 1979a). A series of follow-up studies came to similar conclusions on the structure of political engagement. Participation in elections, other activities related to electoral protest, contacting politicians, and both legal and illegal protests were found to be the main groups of activities that people use to influence politics. In some studies, the system of participation appeared less complex than previously described; others found additional factors, such as consumer participation and political violence. Moreover, being involved in one of the systems of participation was shown as by no means excluding other activities.
Although many people are either inactive or use all available kinds of exerting influence on politics, a certain segment of the citizenry specializes in using one or several forums to exert influence (Kaase and Marsh, 1979b; Verba and Nie, 1972: 82–95; also Armingeon, 2007; Gabriel, 2012: 6–11; Teorell et al., 2007).
The approach to newer types of political participation is different: only a few encompassing studies of citizens’ initiatives and referenda, deliberative forms of democracy and electronic participation are theoretically and methodologically integrated into the mainstream of behavioral research; most are limited to mere description and do not systematically investigate the factors leading people to become active in politics. Although the study on political participation conducted by Verba et al. (1978) found similarities between the Western (USA, the Netherlands and Austria) and the four non-Western countries (India, Japan, Nigeria and Yugoslavia) under observation, far more research on civic engagement has subsequently focused on traditional democracies. Recent studies on participation in new democracies deal with traditional forms of political participation (Norris, 2002: 113–16, 119–32, 168–88), but more interestingly, some of them observe these countries as taking a pioneer role in developing innovative, deliberative modes of participation and citizen empowerment, for example participatory budgeting, referenda and e-participation (Dalton and Klingemann, 2007: 13–16; Norris, 2002: 194–202, 207–11; LeDuc et al., 2002, 2010; Weldon and Dalton, 2014).
Explanatory Approaches
Irrespective of the lack of a general theory, research has explored from early on what factors enhance political participation. Education, age, gender and political interest were determinants of electoral turnout analyzed by Lazarsfeld et al. (1944: 40–52). In investigating the attitudinal correlates of turnout, Campbell et al. (1960: 89–110) found that intensity of party affiliation, political interest, sense of political efficacy and the view of voting as a civic duty fostered participation in elections.
These findings were soon incorporated into research on political participation in general. Milbrath and Goel (1977) traced political participation back to macro-level characteristics of the social and political system, life position factors, the citizens’ personality and the stimuli set by the political environment. Broad empirical evidence underlines the role of these factors in citizens’ decisions to take an active role in politics. While attitudes such as political interest, party identification and political efficacy were part of individual-level analyses, the role of socio-economic factors as antecedents of political participation is analyzed in micro- and macro-level approaches (Gabriel, 2012: 11–20). Scholarly interest in integrating available empirical evidence on the determinants of political participation into more complex theories has highlighted different factors. Starting from an observed gap between the normative idea of political participation as a right that should be accessible to all citizens on the one hand, and the unequal use of this right by upper and lower status groups on the other, Verba et al. explored in depth the socio-economic antecedents of political participation. In line with previous research, they found that people who are well equipped with socio-economic resources are politically more active than those with fewer resources, not only in the United States but also in other countries. Against this background, the search for a way to achieve greater equality in democratic participation led the authors to state an additional assumption: institutional affiliation (party identification and politicized membership in voluntary organizations) could mitigate the social bias of participation. Depending on the strength of the mediating effect of citizens’ relationship to political institutions, the authors distinguished between weak, additive, dominant, restrictive and mobilizing institutional systems.
Unfortunately, the analysis failed to establish a clear and consistent impact of institutional affiliation on the relationship between the socio-economic
resource level and the level and type of political participation in the observed countries (Verba et al., 1978). Notwithstanding the inconsistent results, the key variables of the resource-institution model were incorporated into subsequent research and the question of the relationship between participation and political equality has remained a prominent theme to date (Schlozman et al., 2012). During the 1960s, research on political participation was challenged by the worldwide spread of protest and political violence, which contradicted conventional wisdom in several respects. First, protest was more common in socio-economically advanced post-industrial societies than in less developed ones, which were additionally characterized by strong socio-economic inequality. Second, with only a few exceptions, supporters of the protest movements in post-industrial societies were mainly recruited from among young, well-educated upper middle class people. Two different approaches addressed the problem of political protest in modern societies. The first, the theory of value change, explains the broadening of the citizens’ action repertory as a product of societal modernization (Barnes et al., 1979; Inglehart, 1983). The second approach was anchored in theories of alienation and understood political protest as an expression of a growing rejection of politics and society by the people (Schwartz, 1973; see also della Porta, Chapter 39, this Handbook). As mentioned above, the transition from industrial to post-industrial societies was accompanied by a shift from materialist to post-materialist values. This not only altered electoral behavior but also elicited elite-challenging forms of political participation in addition to the elite-directed activities that had prevailed so far. Two factors, value change and cognitive mobilization, account for the changing style of political action.
First, high levels of education and related cognitive capacities provide the necessary resources for playing an active role in politics. Second, post-materialists feel a need to rely
on protest as a means of exerting influence on politics because they perceive political elites as insufficiently open to new political concerns. These assumptions were broadly supported by empirical research (Inglehart, 1983). Additional findings on the new social movements underlined the strong role of post-materialist priorities in mobilizing collective protest (Norris, 2002: 188–212). Variants of alienation theory trace political protest, political apathy and conformist political participation back to an interplay of two facets of a complex syndrome of alienation: powerlessness (inefficacy) and normlessness (distrust). In a broader approach to the antecedents of disruptive political behavior, elements of alienation theory were combined with rational choice theories and assumptions derived from Easton’s (1965) concepts of diffuse and specific support (Muller, 1979). Empirical research yielded mixed findings on the impact of political alienation on political participation: political inefficacy made the greatest difference between becoming politically active or remaining passive, but regime support and trust in authorities proved poor predictors of political participation. Moreover, there is no evidence on the assumed interaction between political inefficacy and distrust in eliciting protest, conformist participation and political apathy. Together with social networks, the role of social and political trust as a determinant of political engagement was also emphasized in the theory of social capital (Putnam et al., 1993). While some types of social activity indeed seem to foster political participation, this mostly does not apply to social and political trust (Armingeon, 2007; Norris, 2002: 137–87). Although broad agreement on the determinants of political participation has developed over time, a theory integrating these various factors in a theoretically meaningful way is still missing. The ‘civic voluntarism model’ (CVM) of Verba et al. 
(1995) comes close to that ideal by including social background, political attitudes and the citizens’ integration
into social networks, and, moreover, by elaborating in detail how these factors influence political participation. Attributes such as time, skills and social background are seen as resources enabling people to become active. Attitudes such as political interest, political efficacy, party identification and a sense of civic duty work as factors motivating people to take an active role in politics. Finally, integration into social networks contributes to mobilizing people for political engagement. Although not all constructs included in the civic voluntarism model proved valid predictors of political participation, it gives a convincing explanation of why some people become politically engaged while others remain passive. Numerous current studies of political participation support the core assumptions stated in the CVM model. While much electoral research focuses on a single political system, a couple of studies of political participation entail cross-national comparisons (Barnes et al., 1979; LeDuc et al., 2002, 2010, 2014; Norris, 2002; van Deth et al., 2007; Verba et al., 1978). As the micro- and macro-level data available for cross-national research have improved considerably over time and new tools for simultaneous analyses of individual-level and systemic data have become available, research has recently turned to political institutions as determinants of political participation. As supply factors, institutional arrangements make some forms of participation available while incentivizing or restricting others. Due to variations in the selection of units of observation, variables and study designs, a valid general conclusion on whether, how and why institutions matter as determinants of political participation cannot be drawn from the available findings (LeDuc et al., 2002, 2010, 2014; Norris, 2002). Similar to voting choices, the types and levels of political participation show considerable variation over time and across nations. 
However, the determinants of political participation seem to be less dependent on prevailing circumstances than is the case for party
choices. Notwithstanding dissimilarities in detail, empirical research shows that political participation – of whatever type – is fostered by resources such as education and cognitive abilities, by motives such as interest, political efficacy, support of participative norms, and party identification, and by inclusion in informal and formal social networks.
Advances and Needs
In a time-span of almost a century, research on political behavior has been established as a productive sub-discipline of political science. Due to advances in the techniques of data collection and analysis, such studies have been conducted in an increasing number of countries. Starting from the empirical observation of voting behavior, they have successively included a broader array of elite and mass behaviors and made efforts to integrate different middle range theories. Recently, research has considerably progressed by turning to dynamics of political behavior. Voting behavior and political participation have evolved as the most advanced subfields of behavioral studies in all these respects, though a general and encompassing theory of political behavior is not yet in sight. Evidently, research on political behavior is still most advanced in stable democracies such as the United States, Canada, Australia, New Zealand and Western Europe. But due to a variety of changes in society, politics and academia, alongside an increasingly internationalized school of political science, political behavior in Central and Eastern Europe, Latin America, parts of East and Southeast Asia and some African nations has also become a more or less prominent topic of empirical research (Almond, 1996; Dalton and Klingemann, 2007). While there have been substantial advances, some blind spots have remained in the research agenda. While, in general, the patterns, trends and determinants of political behavior have been broadly
studied, the impact of electoral behavior and political participation on the quality of citizenship and democracy has largely remained outside the scope of empirical research. Little is known as to whether and how variations in political behavior contribute to elite responsiveness, strengthen systemic stability and promote performance and innovation. This is partly due to a lack of appropriate data, but it also has to be said that convincing assumptions on macro-level determinants of political behavior are missing. Finally, the inclusion of new forms of political behavior into mainstream behavioral research remains desirable.
References
Almond, G. A. (1996). Political Science: The History of the Discipline. In: R. E. Goodin and H.-D. Klingemann (eds): A New Handbook of Political Science. Oxford: Oxford University Press, 50–96. Almond, G. A., and S. Verba (1989). The Civic Culture: Political Attitudes and Democracy in Five Nations. New Edition. Beverly Hills: Sage (first published in 1963). Armingeon, K. (2007). Political Participation and Associational Involvement. In: J. W. van Deth, J. R. Montero and A. Westholm (eds): Citizenship and Involvement in European Democracies: A Comparative Analysis. London and New York: Routledge, 358–83. Arzheimer, K., J. Evans and M. S. Lewis-Beck (2017). Introduction. In: K. Arzheimer, J. Evans and M. S. Lewis-Beck (eds): The Sage Handbook of Electoral Behaviour. Los Angeles et al: Sage, 1–6. Barnes, S. H. et al. (1979). Political Action: Mass Participation in Five Western Democracies. Beverly Hills and London: Sage. Budge, I., I. Crewe and D. Farlie (2010). Party Identification and Beyond: Representations of Voting and Party Competition. New York: Wiley. (First published in 1976, ECPR Press.) Campbell, A., G. Gurin and W. E. Miller (1954). The Voter Decides. Evanston, IL: Row, Peterson, and Co.
Campbell, A., P. E. Converse, W. E. Miller and D. E. Stokes (1960). The American Voter. New York: Wiley. Converse, P. E. (1964). The Nature of Belief Systems in Mass Publics. In: D. E. Apter (ed): Ideology and Discontent. New York: The Free Press, 206–61. Converse, P. E. (1969). Of Time and Partisan Stability. Comparative Political Studies 2(2): 139–71. Dalton, R. J., and H.-D. Klingemann (2007). Citizens and Political Behavior. In: R. J. Dalton and H.-D. Klingemann (eds): The Oxford Handbook of Political Behavior. Oxford and New York: Oxford University Press, 3–26. Downs, A. (1957). An Economic Theory of Democracy. New York: Harper and Row. Easton, D. (1965). A Systems Analysis of Political Life. Chicago and London: University of Chicago Press. Easton, D. (1967). The Current Meaning of ‘Behavioralism’ in Political Science. In: J. C. Charlesworth (ed): The Limits of Behavioralism in Political Science. New York: The Free Press, 11–31. Eulau, H. (ed) (1969). Behavioralism in Political Science. New York: Atherton Press. Gabriel, O. W. (2012). Political Participation in France and Germany – Traditions, Concepts, Measurements, Patterns and Explanations. In: O. W. Gabriel, S. I. Keil and E. Kerrouche (eds): Political Participation in France and Germany. Colchester: ECPR Press, 1–32. Garzia, D. (2017). Voter Evaluation of Candidates and Party Leaders. In: K. Arzheimer, J. Evans and M. S. Lewis-Beck (eds): The Sage Handbook of Electoral Behaviour, Los Angeles et al: Sage, 633–53. Inglehart, R. (1983). Changing Paradigms in Comparative Political Behavior. In: A. W. Finifter (ed): Political Science: The State of the Discipline. Washington, DC: The American Political Science Association, 429–69. Kaase, M., and A. Marsh (1979a). Political Action: A Theoretical Perspective. In: S. H. Barnes, M. Kaase, K. R. Allerbeck, B. G. Farah, F. Heunks, R. Inglehart, M. K. Jennings, H.-D. Klingemann, A. Marsh and L. 
Rosenmayr: Political Action: Mass Participation in Five Western Democracies. Beverly Hills and London: Sage, 27–56.
Kaase, M., and A. Marsh (1979b). Political Action Repertory: Changes over Time and a New Typology. In: S. H. Barnes, M. Kaase, K. R. Allerbeck, B. G. Farah, F. Heunks, R. Inglehart, M. K. Jennings, H.-D. Klingemann, A. Marsh and L. Rosenmayr: Political Action: Mass Participation in Five Western Democracies. Beverly Hills and London: Sage, 137–66. Lazarsfeld, P. F., B. Berelson and H. Gaudet (1944). The People’s Choice: How the Voter Makes Up His Mind in a Presidential Campaign. New York: Duell, Sloan and Pearce. LeDuc, L., R. G. Niemi and P. Norris (eds) (2002). Comparing Democracies 2: New Challenges in the Study of Elections and Voting. London et al: Sage. LeDuc, L., R. G. Niemi and P. Norris (eds) (2010). Comparing Democracies 3: Elections and Voting in the 21st Century. London et al: Sage. LeDuc, L., R. G. Niemi and P. Norris (eds) (2014). Comparing Democracies 4: Elections and Voting in a Changing World. London et al: Sage. Lewis-Beck, M. S., W. G. Jacoby, H. Norpoth and H. F. Weisberg (2008). The American Voter Revisited. Ann Arbor: The University of Michigan Press. Lipset, S. M., and S. Rokkan (1967). Cleavage Structures, Party Systems and Voter Alignments: An Introduction. In: S. M. Lipset and S. Rokkan (eds): Party Systems and Voter Alignments: Cross-National Perspectives. New York: Free Press, 1–64. Lodge, M. and C. S. Taber (2000). Three Steps toward a Theory of Motivated Political Reasoning. In: A. Lupia, M. D. McCubbins and S. L. Popkin (eds): Elements of Reason: Cognition, Choice, and the Bounds of Rationality. Cambridge: Cambridge University Press, 183–213. Milbrath, L. M. and M. L. Goel (1977). Political Participation: How and Why Do People Get Involved in Politics? Chicago: Rand McNally. Muller, E. N. (1979). Aggressive Political Participation. Princeton, NJ: Princeton University Press. Mutz, D. C. (2007). Political Psychology and Choice. In: R. J. Dalton and H.-D. Klingemann (eds): The Oxford Handbook of Political Behavior.
Oxford and New York: Oxford University Press, 80–99.
Norris, P. (2002). The Democratic Phoenix – Reinventing Political Activism. Cambridge: Cambridge University Press. Putnam, R. D., R. Leonardi and R. Y. Nanetti (1993). Making Democracy Work: Civic Traditions in Modern Italy. Princeton, NJ: Princeton University Press. Redlawsk, D. P. and D. R. Pierce (2017). Emotions and Voting. In: K. Arzheimer, J. Evans and M. S. Lewis-Beck (eds): The Sage Handbook of Electoral Behaviour. Los Angeles et al: Sage, 406–32. Schlozman, K. L., S. Verba and H. E. Brady (2012). The Unheavenly Chorus: Unequal Political Voice and the Broken Promise of American Democracy. Princeton, NJ: Princeton University Press. von Schoultz, A. (2017). Party Systems and Voter Alignments. In: K. Arzheimer, J. Evans and M. S. Lewis-Beck (eds): The Sage Handbook of Electoral Behaviour. Los Angeles et al: Sage, 30–55. Schwartz, D. C. (1973). Political Alienation and Political Behavior. Chicago: Aldine. Simon, H. A. (1957). Models of Man. New York: Wiley. Smith, M. B. (1968). A Map for the Analysis of Personality and Politics. Journal of Social Issues 24(3): 15–28. Stokes, D. E. (1963). Spatial Models of Party Competition. American Political Science Review 57(2): 368–77. Teorell, J., M. Torcal, and J. R. Montero (2007). Political Participation: Mapping the Terrain. In: J. W. van Deth, J. R. Montero and A. Westholm (eds): Citizenship and Involvement in European Democracies: A Comparative Analysis. London and New York: Routledge, 334–57. Thomassen, J. J. (2005a). Introduction. In: J. J. Thomassen (ed): The European Voter: A Comparative Study of Modern Democracies. Oxford: Oxford University Press, 1–21. Thomassen, J. J. (2005b). Modernization or Politics? In: J. J. Thomassen (ed): The European Voter: A Comparative Study of Modern Democracies. Oxford: Oxford University Press, 254–66. van Deth, J. W., J. R. Montero and A. Westholm (eds) (2007). Citizenship and Involvement in European Democracies: A
Comparative Analysis. London and New York: Routledge. Verba, S. and N. H. Nie (1972). Participation in America: Political Democracy and Social Equality. New York: Harper & Row. Verba, S., N. H. Nie and J.-O. Kim (1978). Participation and Political Equality: A Seven-Nation Comparison. Cambridge: Cambridge University Press. Verba, S., K. L. Schlozman and H. E. Brady (1995). Voice and Equality: Civic Voluntarism
in American Politics. Cambridge, MA: Harvard University Press. Watson, J. B. (1913). Psychology as the Behaviorist Views it. The Psychological Review 20(2): 158–77. Weldon, S. and R. J. Dalton (2014). Democratic Structures and Democratic Participation: The Limits of Consensualism Theory. In: J. Thomassen (ed): Elections and Democracy: Representation and Accountability. Oxford: Oxford University Press, 113–31.
36 Political Communication
Gianpietro Mazzoleni and Cristopher Cepernich
Introduction
What is meant by ‘political communication’? The question is not an idle one, since the mass media and the digital media have come to play a pivotal role in the new communication ecosystems and have become vital to present-day polities. We have, in fact, reached a point at which it is difficult for scholars and commentators to make sound analyses of any political phenomenon or trend without giving due consideration to issues of communication. There are copious examples, including the rise and success of Donald Trump, the broad diffusion of populism, Brexit and international terrorism, all of which have been influenced by media activities, by the manipulation of news sources and by the rise of online communication platforms, which, to varying degrees, explain the nature and cause of those events. The role of communications in politics has been investigated since the early studies on voting in the 1940s (for example, Paul Lazarsfeld), including by political
thinkers such as Karl W. Deutsch (1963), who considered them ‘the nerves of government’. Scholarly literature on the subject, both theoretical and empirical, has proliferated in the past 70 years, and the field of political communication has attained a standing of its own, even though mainstream political science had long ignored the importance of media and communication in the political process. Besides – and before – being an academic discipline, political communication is the conveying of ideas and information by people engaged in politics, elections and government on the one hand, and in news production, news management, debate and entertainment on the other. The flow of political content posted and circulated via social media is also a tangible form of political communication. All of these activities involve senders and receivers, messages and feedback, and have a verifiable impact on political reality – that is, on election outcomes, the formation of government coalitions and policy-making – and also in shaping public opinion, political
views and citizen engagement. It is political communication that practitioners perform in assisting political actors running for office or in office, in devising marketing, advertising, PR strategies, in lobbying activities and the like. It thus has a wider meaning than that of an academic disciplinary field. Political communication as a discipline is, rather, the field in which communication dimensions of politics are investigated and interpreted from a scholarly perspective. This is the object of this chapter. One outstanding example, now a standard reference for innumerable studies, is the comprehensive outline of the state of political communication in post-war democracies written by Jay G. Blumler and Dennis Kavanagh (1999). They identified three ‘ages’ in the evolution of political communication, which they defined as the dynamic between political actors, news and the public. Age 1 corresponds to the first two post-war decades, in which the party system adhered to entrenched social divisions and the electorate related to politics via a strong party identification and voted according to group loyalties. The communication system was dominated by political parties and to a significant extent was also ‘partisan’: political messages were ‘substantive’; political leaders focused on issues and enjoyed ready access to mass media. Age 2 spans from the 1960s through the 1980s, when television strengthened its grip on politics. This was a time when traditional party allegiances based on the subcultural divisions of the immediate past weakened rapidly and politicians and their parties discovered television to be a powerful means to contact hitherto unreachable audiences. Political communicators therefore had to adopt the typical formats of television production in order to address wider, more varied audiences. Age 3 ran from the 1990s onwards, a challenging period in which the media environment became more fragmented and witnessed the rise of the internet. 
It seemed a bonanza
for established political players: although the multiplication of public communication channels made it hard to keep the media in check, the new, interactive platforms represented unprecedented resources for every kind of political communication. Blumler and Kavanagh highlighted the main features of this age as: (1) a professionalization of the relationship between political leaders and public opinion, making use of an array of systems, from spin doctoring to news management; (2) the populist nature of much political communication that yields to market imperatives, privileging ‘popular’ formats, transmitting soft news to the detriment of debate on serious political issues; (3) a ‘centrifugal’ communication by political players that was forced to ‘chase’ an audience distracted by the popular or the new social media. The first two decades of the 2000s may well be labeled Age 4, one in which the internet has become firmly rooted in everyday life, impacting on the polity as a whole. The web is changing the ecology of political communication, with important implications – first, for the working of established democratic institutions, which have been unsettled by the rise of political leaders astute in the use of all media; second, for the political empowerment that the new social media gives citizens who are now able to actively produce messages and be involved in a process of ‘disintermediation’ from both the traditional party-centered networks and from the gatekeeping of mainstream mass media; third, for traditional communication theory, built almost exclusively on decades-long research carried out within the mass communication paradigm. While most of this theory is still valid – given that the mass media, especially television, still occupy center stage in political communication – politics finds itself submerged in a ‘hybrid’ media environment (i.e. 
mass media together with new media) (Chadwick, 2013), in which new political dynamics are at work that scholars suspect cannot be fully understood and explained by mainstream theory.
The discipline
Political communication as a field of study is the object of several disciplines. While being a complex phenomenon that has politics at its core, it also involves a wide array of aspects and cannot be restricted to narrow theoretical confines. It is a multi-faceted phenomenon, multi-disciplinary in character on account of being established by contributions, direct and indirect, from disciplines such as sociology, anthropology, public opinion research, political philosophy, history, semiotics, discourse analysis, linguistics and psychology. This chapter concentrates on the crossroads between political science and the sociology of communication, since these have produced the largest amount of empirical research and theoretical reflection. However, the discipline has also been enriched by the ideas of such theorists as Hannah Arendt and Jürgen Habermas (for instance, their views on the role of informed citizens engaged in discussion in the ‘public realm’, an activity substantial for democracy yet clearly one of political communication). Habermas developed his theories while speaking explicitly about the centrality of the media for the functioning of the ‘public sphere’. Critics such as Robert Putnam (2000), meanwhile, have accused television of distracting millions of Americans from social and civic engagement. And new ground was broken by political thinkers such as Murray Edelman, who proposed an insightful typology of ‘political language’ in his book The Symbolic Uses of Politics (1976), which has inspired generations of political communication scholars. A further example is that of anthropologist David I. Kertzer, who, in his seminal book Ritual, Politics and Power (1988), sheds light on the non-rational dimensions of political action, where symbols and rituals are used by politicians seeking to gain (through those types of communication) their followers’ support.
In addition, the linguist and cognitive scientist George Lakoff (2008) used neuroscience to show how narratives, metaphors and frames can influence voting patterns.
Finally, as a discipline in its own right, political communication displays diverse degrees of academic institutionalization because of its connections worldwide. As a relatively new sub-field of the wider domain of communication science, it has found little room among the well-established disciplines that constitute the traditional pillars of university teaching. Departments of political communication are still rare in academic settings, and the discipline is often grouped with other curricula of either political science or social studies. The reasons for this negligible academic recognition are its inherent multidimensional nature – making it a territory colonized by other scientific interests – and, more importantly, its lack of a defined statute. Even so, intense and ever-growing international cooperation between scholars and innumerable comparative research efforts have produced a profusion of publication projects, and these have helped to show how essential political communication scholarship is for a clear understanding of some of the key political phenomena of our time.
The basic theories and research traditions
Campaign Communication
Scholars from a variety of disciplines have devoted considerable time to the study of election campaigns. These are moments in which the contest for power is planned and played out in front of the electorate, and their outcome will be of great consequence. The machinery of parties and candidates works at full speed, media coverage is at its highest and voters are the constant target of political propaganda: no wonder, then, that campaigns provide the best opportunities for academic analysts to investigate electoral communication, its unfolding and its potential impact. Research into electoral campaigns has focused first on the sum of communications
produced by the various players – politicians, parties, candidates, consultants, activists, media outlets, grassroots movements and the like – and then on the extent to which communication efforts affected the election outcome. Pippa Norris (2000) developed a useful typology of US election campaigns, taking a historical perspective to classify them as pre-modern, modern or post-modern. She also studied their various features, taking into account the development of democratic awareness, the diffusion of the media (especially television) and the growth of professional expertise in support of political contenders. Norris’ typology proved well suited for categorizing the development of campaigns in political contexts quite unlike that of the United States. Campaigns can in fact be either candidate-centered (as in the United States) or party-centered (as in most non-presidential systems), and this obviously has important implications for the organization of propaganda and communication strategies. The academic analysis of campaigns has contributed to the crafting of a sub-field of political communication, that of ‘political marketing’, which borrows much from the canons of established commercial marketing. The language of politics has quickly adopted its jargon – targeting, positioning, attack and defense strategies, political commercials, negative advertising and the like – and professional figures such as political consultants, spin doctors, image builders and media experts have become familiar to political hopefuls, government officials, party leaders, journalists and, of course, even scholars, who have produced a mass of studies on each of those new types of political communicators. The introduction of new media, especially social networks, into the campaign arsenal has itself attracted the attention of scholarly researchers in many countries.
Their countless publications explore the idiosyncratic uses that politicians make of such media, as well as the conditions required for effective communication, types of content, intensity of use and the profiles of ordinary citizens responding
to or addressing candidates. One of the main findings is that social media enable – at least potentially – a truly interactive exchange between candidates and voters. Research on the effects of election campaign communication is one of the oldest academic traditions in the field, going back to the pioneering studies of, first, the 1940s Columbia School of Lazarsfeld, who in The People’s Choice (1944) and Voting (1954) discovered that the social characteristics of target voters are powerful filters of campaign persuasion stimuli, and, second, the Survey Research Center of the University of Michigan, with Angus Campbell et al., who in The American Voter (1960) emphasized the psychological and affective determinants of voting behavior, especially party identification. The analysis of campaign effects stems from a wider question that refers to the impact of mass communication (and online media) on politics and political dynamics, from the perspective of both senders and receivers.
Media–Politics Interaction Patterns
The role of media in the political sphere is another key area of scholarly investigation in the field of political communication. Two of the many questions addressed by academics are worthy of particular attention: (1) To what extent has the presence and activity of various mass media affected the way politics is performed? (2) Taking account of the different and somewhat irreconcilable rules that govern them, how do the news media and political actors interact? Although the existence of specific differences between political systems is taken for granted, research has found that the media respond generally and primarily to market imperatives and to the determinants of their industry, as David L. Altheide and Robert P. Snow explained in their influential book Media Logic (1979). When political content, political figures and political stories are handled by the media – whether they be news
or entertainment outlets – politics undergoes a significant mutation, clearly mediated by the cultural environments in which the media function. This process is described as the ‘mediatization’ of politics (Mazzoleni and Schulz, 1999). The personalization of political communications is an example of it. In the media, the words and actions of party leaders and key politicians take precedence over abstract ideas or impersonal statements, and thus individuals take center stage at the expense of the party and its manifesto. Other effects of mediatization are to be seen in the growing tendency to spectacularize political communication, perceived by some as political action degenerated into the triviality of show business. A similar trend is that of composite political discourse, whereby complex issues are squeezed into a string of soundbites, mainly for the benefit of the breathless flow of news. This inevitably produces an oversimplification of political debates to the detriment of a healthy democratic process. A further effect of the mediatization of politics is the capacity of the media to set and shape the political agenda, either through journalistic investigations, or by directing the spotlight to certain critical events, or by taking canny editorial stances. The media play the role of key actors on the political stage, choosing what is to be on the public agenda and compelling politicians to follow rather than lead. The nature of the interaction between media and politics is an ongoing object of scholarly interest and speculation. There have been several attempts to define the role of the (news) media in the political process, from considering them as political institutions in themselves on a par with traditional political forces – a hypothesis that stems from the reflection on the ‘fourth estate’ – to highlighting a more obvious interdependence between the two independent entities. 
Two theoretical conceptualizations have been widely implemented by comparative research. One was postulated by Jay G. Blumler
and Michael Gurevitch (1995), who spoke of an interaction based on a continuum between a ‘pragmatic’ distance, which responds to a logic of independence, and a ‘sacerdotal’ closeness, where the news media reflect existing political stances. In the former case, the highest degree of independence is a conflict in which the media play the role of watchdog, and the medium level is an equal exchange based on a balance between opposed interests. In the latter case, the media are to a large extent subservient to political logics, thus producing biased journalism. The other attempt to explain the intricate, inevitable ‘cohabitation’ in democracy of media and politicians can be found in the tripartition developed by Daniel C. Hallin and Paolo Mancini (2004), who condensed the great variety of relations one observes at different political levels into a typology built on systemic parameters concerning political systems (i.e. political history, consensus or majoritarian government, level of pluralism, role of the state, rational-legal authority) and media systems (nature of newspaper industry, type of political parallelism, level of professionalization, degree of state intervention). The first model, the ‘Mediterranean/Polarized Pluralist Model’, reflects the state of government–news media relations prevalent in countries such as France, Greece, Italy, Spain and Portugal, where one finds low newspaper circulation, an elite-oriented press, high political parallelism, weaker professionalization of the news profession and generally strong state intervention through press subsidies. The second model, or the ‘Northern European/Democratic Corporatist Model’, shows opposed features such as high newspaper circulation, more neutral, commercially oriented media, strong professionalization and institutionalized self-regulatory policies, and state intervention which is strong but mitigated by equally robust protection of press freedom. 
The countries where these characteristics are most evident are Austria, Belgium, Denmark, Finland, Germany, Netherlands, Norway, Sweden and Switzerland.
The third, the ‘North Atlantic/Liberal Model’ – which encompasses the UK, the United States and Canada – reflects conditions of medium newspaper circulation, a strong commercial press, established models of broadcast governance, but also market-dominated media practices and a long tradition of the professional training of journalists. Hallin and Mancini’s models – recently revised to incorporate other geo-political experiences, such as those of Eastern Europe and Asian countries (2012) – are key to identifying the typology and temperature of politics–media interactions. For example, if one explores the degree of independent, critical, watchdog-like attitudes of the media, one cannot ignore the distinctiveness between diverse sets of historical, political and cultural traditions.
Media Effects on Voters’ Behavior
Empirical research on media effects on audiences has a long history in the social sciences. Many theorists and scientists have developed a large corpus of investigations and explanations, often conflicting, that spans from the early studies by Lazarsfeld’s team on the impact of commercial advertising in the 1940s to the current efforts of researchers to measure the influence of the use of new communication technologies on individual and collective behaviors. The political concern about the presumed powerful effects of media, especially if used for manipulative ends, even precedes commercial investigations, being associated with the rise of authoritarian regimes between the two World Wars. Radio and cinema were blamed by analysts for having helped dictators to brainwash the masses, although this was mostly theoretical speculation that remained an untested hypothesis since no empirical research was available to support what appeared to be a legitimate fear. The phase in question was named after the hypodermic needle, as it was based on
behavioristic views of the actions and conduct of people that considered mass media stimuli to be especially powerful. This deterministic explanation was countered by subsequent research employing methods used in empirical sociology (such as surveys and participant observation) that did not find significant changes in people exposed to certain media content, and led to the formulation of the opposite theory of ‘limited effects’. With the diffusion of more sophisticated investigative tools, research came to acknowledge that the media do have significant effects on the political behavior of voters, albeit ones that work in addition to other important determinants. Many theories have been developed by various schools, such as ‘uses and gratifications’, which emphasizes the active role of audiences and citizens in counteracting attempts at persuasion; ‘information processing’, which explains how people resort to ‘schemas’ to process and evaluate political information; ‘agenda setting’, which on the contrary points to the capacity of the media to influence people’s assessment of what is or is not important; and ‘framing’ and ‘priming’ hypotheses, which highlight how the way in which the media cover (political) events conditions people’s perceptions of reality and their subsequent attitudes. Finally, the most recent perspective is ‘emotion in politics’, which reappraises the workings of affective factors on voters’ choices: people tend to vote for the leader who voices the right sentiments, rather than for the candidate who uses the right argument, and ‘elections are decided in the marketplace of emotions’ (Westen, 2008: 35). Political communication scholarship contributed significantly to research on effects, exploring and explaining what is at work in the dynamics of media influence on political knowledge, political participation and voting. A major study by Michael X. 
Delli Carpini and Scott Keeter (1996) on what Americans know about politics highlighted a series of findings that have also been observed in other liberal democratic contexts.
They found that: (1) citizens who are well informed on political affairs make up a small minority of the general public, yet the knowledge they possess ensures a sufficient level of citizenship; (2) the inequality of citizens’ knowledge is correlated with the socioeconomic inequality between social groups; (3) motivation, interest, level of education and cultural environment are factors that condition the acquisition and processing of information and knowledge circulated by the media. It is a traditional normative tenet that democracy thrives with an informed citizenry. According to this view, the media perform the role of providing substantive knowledge to the citizenry. Scholars have long discussed what level that transmission of knowledge should aim for, and at what point it is sufficient for citizens to make (political) choices. Pippa Norris (2000) sums up the debate and addresses the various theoretical questions by speaking of two fallacies: the ‘civics fallacy’ of ‘regarding the news media as analogous to a third-grade civics teacher’, while ‘the capacity to make reasoned electoral choices does not require encyclopedic information’ (211); and the ‘relativist fallacy’ of assuming that ‘any beliefs that voters use to help them to come to judgements, whether true or false, are the equivalent to knowledge’ (212). The balanced view, according to Norris, is that of a ‘practical knowledge’ that the news media provide through sufficient information, which can help citizens ‘to connect their political and social preferences to the available options’ (213). One of the questions raised by analysts concerns the effect of non-informative media, like the entertainment media, on people’s political views. If most people are barely, if at all, sufficiently knowledgeable about politics because they prefer to be entertained rather than informed by the media, is this detrimental to their reasoned choices? 
Convincing research findings are still largely missing on this issue, but several authors (e.g. van Zoonen, 2005; Jones, 2005; Mazzoleni and Sfardini, 2009; Baym, 2010; Day, 2011;
Baumgartner and Becker, 2018) argue that fictional political stories; satirical, comedic TV shows; ‘engaged’ music; celebrities voicing political claims or advocating social apolitical causes; infotainment programs; and even the ‘soft news’ are to be seen as additional – if not unique – sources of information for a large part of the citizenry, with as much relevance to people’s choices as the ‘hard news’. Whether exposure to mass communication content has negative effects on civic engagement is also a matter of dispute among political scientists. Theorists of the ‘media malaise’ (Robinson, 1976; Putnam, 2000) have claimed that the legacy of media’s typical ways of covering public affairs spreads cynicism and disengages voters, and that more television weakens civic engagement. Norris (2000) found evidence from comparative, cross-national data that proved fears of negative media effects to have been ‘misplaced and exaggerated’, and indicated ‘that the people who pay the greatest attention to campaign coverage in newspapers and television, as well as the messages from the parties, are more likely to participate in the political process’ (277). In the new media environment that has risen in the past decade alongside the diffusion of the internet, people have more tools with which to become informed, get involved and make their voices heard. Recent research examining the impact of digital media on different types of political engagement suggests an overall positive effect. The case of the election campaign of 2008 that brought Barack Obama to the presidency is generally taken as proof that an intelligent use of Facebook successfully mobilized millions of supporters. The evidence, however, varies according to political cultures, and the interpretation is still contentious. Clearly, there is a need for more nuanced explorative studies. 
For example, the transfer of political expression via social media to ‘real-world political action’ appears to be less likely in non-democratic systems (Skoric et al., 2016). According to Dimitrova et al.,
the use of different forms of digital media, controlling for other factors, has little impact on political knowledge. As in the era before the Internet, what matters more for political learning is political interest, prior political knowledge, and attention to politics in traditional media formats. … In terms of effects on political participation … results show some support for the notion that use of digital media leads to increased political activity among the public at large. (2014: 16)
New directions and perspectives: The digital paradigm shift
Since the 1990s, widespread access to the internet and personal media has profoundly changed the relationship between the media and politics. The fourth phase of political communication, as defined above, coincided with a new paradigm. The mass ownership of smartphones used in everyday life as personal communication hubs, linked to the central role assumed by social networks and instant messaging and content sharing platforms, has produced a different, and in many ways speculative, media logic compared to that of the older media, despite often overlapping and interconnecting with them. The ‘networked media logic’ (Klinger and Svensson, 2015) is thus one of the basic aspects of a ‘mediatization without media’ that is integrated and interdependent with that of traditional media. Manuel Castells has called this new communication paradigm ‘mass self-communication’, to point out the state of post-medial individualism brought about by the restructuring of communication practices into networks. On the one hand, mass self-communication remains a form of mass communication because it allows political actors to reach a relatively huge audience. For example, the President of the United States, Donald Trump, can count on more than 57 million followers on Twitter. On the other hand, mass self-communication is ‘self-communication because the production of
the message is self-generated, the definition of the potential receiver(s) is self-directed, and the retrieval of specific messages or content from World Wide Web and electronic communication networks is self-selected’ (2009: 55). The most innovative element of contemporary political communication is therefore the centrality accorded by the digital paradigm to the individual, who is the producer, distributor and consumer of political communication. Every connected individual is an active hub of social interconnections, no longer limited to being just a passive viewer of televised politics. He or she operates primarily through the opportunities provided by the technological infrastructures that interconnect online communicative action and offline mobilization with different communication models: one-to-many (essentially websites), many-to-many (Facebook, Instagram, Twitter), one-to-one (WhatsApp, Telegram, Snapchat). By developing and enhancing connectivity and interactivity, personal media enable the rediscovery of the importance of interpersonal communication. The basic result is the recovery of the individual as the central piece of the digital mosaic that is created through the reciprocity of members of a social network. This networked media logic structures the communication dynamics in the new digital paradigm and directs political actors to a radical change in the way they communicate and present themselves in public. The engine of contemporary political communication is fueled by data. In its fourth phase, the importance of opinion polls is matched by that of individual-level voter data. Digital environments are first and foremost immense databases covering the consumption habits, values, attitudes and opinions of billions of individuals, built on a global scale. Thus, the communication strategies of political actors become data-driven, and big data are used primarily to optimize the diffusion of their message. 
The ability to use data provided by social media on a large scale allows them, for example, to direct messages with
unprecedented precision at voters who fit a particular profile, to measure the microclimates of opinion through advanced techniques of sentiment analysis and to observe how members of an interpretative community interact on the social media page of a leader (Wagschal and Ettensperger, Chapter 16, this Handbook). Barack Obama’s 2012 re-election campaign represented a turning point in the strategic use of data. The construction of a huge database of electors, the coalescing of the many databases built by local staff from as early as the 2008 primaries, allowed the campaign to carry out in-depth analyses of target voters. Operationally this enabled a massive micro-targeted mailing strategy and the compiling of efficient walk lists designed to direct door-to-door and phone-calling activities, identifying which doors to knock on in order to meet receptive voters. The same data-driven model has been used to increase online engagement, for example through the analysis of social network interactions between audience members during televised debates. The central role of data analytics was an even more important feature of the 2016 US elections, as is demonstrated, on the one hand, by the global impact of the Cambridge Analytica scandal and, on the other, by Hillary Clinton’s extensive digital media and data staff, which included 65 people involved in communication, advertising and media, 27 digital advisors, 32 data analysts and 18 field operations strategists. 
Data, then, constitute the prerequisite that permits political communication to implement micro-targeting methodologies, that is, to apply techniques for the super-segmentation of communication targets with four main goals in mind: first, to break through the wall of widespread disregard for political messages in a climate of profound distrust of the political system; second, to eliminate the information overload and the thunderous background noise of the entropic ecosystem of mass communication; third, to customize the message, identifying which voters to reach with
particular content through the processing of information on individual interests, prioritizing of issues and ethical, cultural and – most importantly – political values; fourth, to maximize the effectiveness of field operations, in other words, the campaigning of on-the-ground volunteers. From this point of view, political communication in the digital era is intrinsically scientific in the sense that party war rooms make decisions based on quantitative and qualitative research methods that can accurately identify key voters. The detailed map of key demographics and electoral micro-targets produces greater communication effectiveness both through traditional media, because during election campaigns advertising can be programmed according to TV schedules, and online, because it enhances the targeted effects of sponsored content through social media and the various techniques of mobile marketing. The media in fact retain a central role in strategies of awareness-building, for the task of positioning a campaign in relation to issues and for the branding of a political organization. In addition to these, however, are communication functions that belong to digital political marketing: the participation of the electoral base, the activation of undecided or abstaining voters and the mobilization of supporters. Digitalization has provided political communication with a high-tech infrastructure. As we have seen, it has profoundly transformed techniques and strategies, but in an even more radical way it has changed how it is organized. Today, in fact, communicating is organizing. With reference to ‘technology-intensive’ electoral campaigns, Kreiss (2016) has emphasized the impact that technological innovation and digital infrastructure (social media, smartphones, apps) have had on communication, mobilization and participation. 
High-tech organizational models applied to politics have completely restructured social relations, and have recalibrated the distribution of power within the decision-making centers of political organizations, making it more diffuse:
On one level, networked politics refers to electoral activities that take shape through the technical infrastructure of interlinked computer networks. On another, I refer to networked politics as a mode of organizing electoral participation. Networked politics involves sustained and coordinated collective action that occurs outside of direct managerial relationships and is premised on the voluntary contributions of supporters. (Kreiss, 2012: 6)
In this sense, the European political scene provides some interesting examples of the fluid evolution of the parties, which are now characterized by ever weaker post-bureaucratic apparatuses and by increasingly efficient systems of digital connectivity. Representative cases include En Marche!, the post-ideological movement of Emmanuel Macron victorious in the 2017 French presidential elections, and the Five Star Movement founded by the Italian comedian and blogger Beppe Grillo in 2009, which won the Italian general elections of 2018. Ultimately, communicating and organizing politics in digital ecosystems involves first of all the strategic use of technology. Chadwick (2007) considers the internet to be the main factor in the organizational hybridization of political actors collectively, as a consequence of which the classic differences between parties, interest groups and social movements are dissolving. There are four practices originating from the digital network that leave traditional parties looking more and more like social movements, resulting in less involved and more flexible forms of membership. First is the use of a growing number of online forms of action made available in communicative environments that make it possible to engage (for example in fundraising activities, which have become decisively important) and to organize offline mobilization locally, especially through the most popular channels such as social networks, chats, apps, blogs and emails. The second practice concerns building reciprocal trust – imperative for collective action – by means of the horizontal connection between groups of citizens. No
mobilization is possible without an ideological integration between the mobilized groups and a connection with external parties that are interested but not directly involved. The trustworthy relationship established between individuals and groups is the driving force of the networks’ interconnected mechanisms, making provision for the blend of coordination and decentralization that generates organized spontaneity. In contemporary democracies it is no longer possible to build trust in the political and media system merely by relying on a single authoritative source of legitimation, such as, for instance, the mass political party of the past. Instead, Chadwick stresses, ‘what emerges is what I have in the past termed “distributed trust” – a by-product of the discursive context of the issue network itself’ (2007: 290). The third practice is that of the insertion of subcultural codes into political discourse. The deployment of this stratagem is intensified during electoral campaigns, in parallel with an increased use of tactics aimed at popularizing political content and facilitating the spread of online content. The most commonly used form of this kind of cultural remixing is that between politics and satire, which generates memes. Finally, the fourth practice is that of sedimentation, a process through which membership of the movement is built. The base grows slowly in successive waves, especially after high-visibility promotional events that make an emotional impact. In short, digital technologies make possible new and more flexible ways of organizing political mobilization and participation. More and more often, these are alternatives to traditional approaches, born out of the self-evident crisis of representative democracy and of intermediary social bodies such as political parties. The emerging organizational model is referred to as ‘netroots’. 
This neologism was coined by Jerome Armstrong, a blogger, strategist and political consultant on the grassroots campaign staff of Howard Dean during the Democratic primaries of
2004, and describes those forms of political activism that are either self-organized or organized from the bottom up through the use of systems such as blogs, social networks and platforms for civic engagement. Karpf (2012) has identified three participation models that use the internet as their organizational infrastructure: advocacy organizations structured online around general themes or single issues, an example being America’s MoveOn during the early 2000s; the online communities composed of people with a common cause, such as campaigning against the indiscriminate spread of guns and gender violence; and finally, neofederated political associations based on the model of member-driven action committees like Howard Dean’s Democracy for America, which fought against corporate politics and policies responsible for social inequality. The ever more widespread presence of netroots organizational models in Western democracies confirms the theory of a long tail of the MoveOn Effect. In other words:
Those groups take a variety of forms, but all share similar membership and fundraising practices. They tend to be issue generalist, mobilizing citizen support around the pressing issues of the day. They are sedimentary organizations, developing their member lists by riding waves of public interest and offering an outlet for citizen action … Their advocacy work extends well beyond ‘clicktivism’, engaging supporters in large-scale, sustained collective action. Their work routines and campaign strategies are built around the Internet – these organizations would be impossible without e-mail and the World Wide Web – but they are far different from the ‘organizing without organizations’ often heralded in public discourse. (Karpf, 2012: 156)
In conclusion, if, on the one hand, in digital environments communicating means organizing, on the other hand, organizing means communicating. Technologies generate and structure political engagement that in the digital sphere evolves toward personalized forms based on social relations. As Nielsen (2012) has observed, people serve as media for messages that originate elsewhere.
Advancements: personalized and direct communication to re-connect with voters
Political communication is first and foremost a practical application in the field of advanced knowledge built on qualified professional skills. As we have seen, innovation in political communication is determined by the acquisition of scientific information, which allows for an analytical approach to the formulation of strategies and tactics, as well as by the technical use of communication infrastructures. It follows that sophisticated specialization in the requisite skills of political communication today necessitates the recruitment of staff with a high level of expertise. In this sense, election campaigns in the United States are the most advanced laboratory of skills: it is no coincidence that for a long time we have talked of the ‘Americanization’ of political communication (Nelson and Thurber, 2019). There are currently four processes that drive the sudden accelerations in the evolution of contemporary political communication. The first is the process of hybridization within the media system. Social TV, which enables the use of dual screens for debates between candidates and political talk shows, is an obvious example. In general, flows of communication are increasingly the product of hybrid media logics. A hybrid environment ‘is built upon interactions among older and newer media logics – where logics are defined as technologies, genres, norms, behaviors, and organizational forms – in the reflexively connected fields of media and politics’ (Chadwick, 2013: 4). As a result, the digital world renders invalid the classical dichotomy of old and new media because the privileged space for digital communication is that of ‘flux, in-betweenness, the interstitial, and the liminal’ (4). For example, during the 2016 presidential race, Trump’s concerted attempts to attract the attention of the media with events, unscheduled interactions and
even aggressive social media activity succeeded precisely because they exploited hybrid logics. The trend, however, is widespread and can be found in all political contexts: for example, election candidates who make maximum capital out of Facebook Live, a format that can disrupt the rigid hierarchies of television news and crystallized talk-show liturgies by imposing the political actor’s times and ways of communicating on the logic of journalistic mediation. The same is true of the tsunami of live tweeting during media events and personal appearances. A second fundamental process, which presents to us future scenarios of political communication, is the disintermediation of communication between political actors and the electorate. For the most part this function is carried out through the main social networks, principally Facebook and Twitter, but apps, text messages, instant messaging platforms and email may also be used for this purpose. Strategies of disintermediation concentrate their propaganda on the fan base of the leaders and parties, but do not overlook the broader boundaries of supporters and sympathizers. Disintermediating communication means strategically exploiting self-representation by speaking in the first person, bypassing the mediation of journalism and traditional media. 
On a practical level, this involves obtaining the following: (1) a direct relationship with the citizen, hence the reinsertion of public engagements into the leader’s agenda such as community events, meetings and rallies typical of pre-medial electoral campaigns; (2) the potential for spreadability or virality of content, an indispensable attribute for boosting the circulation of online messages and heightening their visibility; (3) the conditioning of the media agenda ‘upstream’ by which information and journalism are obliged to pursue the political actors, rather than the opposite; (4) the recovery of interpersonal communication within campaign communication devices, aimed at establishing ‘opinion leaders’ (or ‘influencers,’ as we say today) as transmitters of informal communication.
A third fundamental process is that of personalizing the flow of communication, in two ways. One is through the ‘leaderization’ of the communication strategy, to be construed as a general process and long-term effect of the mediatization of the political sphere: using a political personality is the most efficient cognitive shortcut to making a political offer visible and comprehensible. The celebrity politician is the ‘primary material’ of electoral branding and rebranding. The leaderization of political parties has led to the definitive leaderization of communication strategies. We must, however, note the central role played by social networks and the internet in enhancing and speeding up the leaderization of communication: social media in fact ‘represent semi-public, semi-private spaces for self-representation where borders between offline personal and online mediated relations are blurred. They allow politicians (and voters) to stage their public and private roles, and to shift between them seamlessly and more or less consciously and strategically’ (Enli and Skogerbø, 2015: 121). The second way in which the communication flow is personalized is subjectivization, that is, the humanization of the leader’s public role. The leader must present himself or herself to the people as an institutional figure, but also as a person. This is vital since his or her personal qualities as a human being are a significant part of the reputational and moral capital that is offered as a practical alternative to the weakness of trust relationships with the electorate that are based on ideology.
As James Stanyer (2013) has explained, the personalization of the public representation of a political figure involves, on the one hand, his privatization, in other words putting on view his subjective characteristics and ‘backstage’ life, and on the other hand, his intimization, allowing his private space (health, sex, sorrow), relationships (family, marital, extramarital) and living space (home, places of leisure) to be seen by all. In this sense, the ideological and political images evoked by a particular figure are inextricably linked to individual
and subjective matters, to psychological considerations and to his or her emotionality, all more or less openly displayed. Depending on the context, all of this becomes, to varying degrees, an object of observation and media representation. Thus, emotional judgments on profoundly private matters become an integral part of public political judgment. The fourth fundamental process is the increasing emotionalization of political communication. This trend is an emblematic reflection of the radical change of paradigm brought about by digital technologies.
From society to the brain: the emotionalization of political communication
In the current phase of political communication, a change of an affective kind is taking place that reflects the systemic changes in relations between the media and politics. This prospect opens up new, unprecedented scenarios for the discipline. It is rooted in the gradual but comprehensive emancipation of the social sciences from the paradigm of calculating rationalism and from the dichotomous conception of the relationship between reason and emotion (Cepernich, 2016). Today emotions are considered an integral part of rationality, in that they fulfil the necessary function of reducing the complexity induced by the environment, selecting the problems that reason is required to tackle. They support reason by delimiting a set of possible solutions when the quantity of available options seems overwhelming (Damasio, 1995). The crumbling of the myth of the rational social actor in the field of politics has come about through the demolition of the idol of the voter as a homo oeconomicus faced, for instance, with a choice in elections. Neuroscience, cognitive psychology and linguistics have confirmed the interdependence between rationality, emotion and affection, going so far as to record different modes
in the functioning of neural circuits in the light of political affiliation (Westen, 2008; Schreiber et al., 2013). A first body of research on this by now established subject examines the impact that emotionalized communication has on voting decisions. The reference point is the theory of affective intelligence, which has demonstrated how negative emotional states favor the search for and processing of information (Marcus et al., 2000; MacKuen et al., 2007). Paradoxically, anxious and fearful voters are more inclined to take a reflective attitude and make rational choices, since their emotional state prompts them to gather information on the state of their surroundings. By contrast, subjects who are enthusiastic about their leader or candidate are more likely to be partisan. This evidence has had implications for two interconnected fields. The first is the use of GOTV (Get Out The Vote) techniques designed to stimulate electoral participation. This has been systematized by social opportunity theory, according to which: ‘To mobilize voters, you must make them feel wanted at the polls. Mobilizing voters is rather like inviting them to a social occasion. Personal invitations convey the most warmth and work best’ (Green and Gerber, 2004: 92). For example, door-to-door mobilization campaigns carried out by militants, volunteers and activists are highly effective, with one additional vote cast for every 14 face-to-face conversations with a canvasser. The classic study of New Haven, Connecticut revealed an increase in turnout after door-to-door contact of 9.8% compared to the control sample (Gerber and Green, 2000). This confirms that the effectiveness of GOTV techniques is contingent on the personalization of electoral communication. In other words, it is a result of those strategies and communication practices that include direct interpersonal communication in their campaign action plans.
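The canvassing rule of thumb quoted above lends itself to a simple back-of-the-envelope calculation. The following Python sketch is illustrative only: the function names and the campaign sizes in the usage lines are hypothetical, and the constant merely restates the figure of 14 conversations per additional vote reported in the text. Actual yields are likely to vary with the election and the context.

```python
import math

# Figure reported in the text: about 14 face-to-face conversations
# with a canvasser per additional vote cast.
CONVERSATIONS_PER_VOTE = 14


def expected_additional_votes(conversations, per_vote=CONVERSATIONS_PER_VOTE):
    """Additional votes implied by a number of completed conversations."""
    if conversations < 0:
        raise ValueError("conversations must be non-negative")
    return conversations / per_vote


def conversations_needed(target_votes, per_vote=CONVERSATIONS_PER_VOTE):
    """Conversations required to reach a target number of additional votes."""
    if target_votes < 0:
        raise ValueError("target_votes must be non-negative")
    return math.ceil(target_votes * per_vote)


# Hypothetical campaign sizes, chosen only for illustration.
print(expected_additional_votes(7000))  # 500.0
print(conversations_needed(100))        # 1400
```

Read in this way, the rule of thumb shows why door-to-door work is labor-intensive: every additional vote has to be earned through more than a dozen completed conversations.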
No less fertile is the line of studies that investigate the strategic use of emotive appeal in election campaigns, particularly through
recourse to images and music (Brader, 2006). These studies agree that the use of certain highly emotionalized forms of content can have an effect on voter behavior, at both individual and aggregate levels. Nevertheless, (1) advertisements do not influence viewers directly, but instead change the way that choices are made through ‘set framing’, which defines the margins of meaning within which decisions mature; (2) there is no confirmation of the viability of the conventional interpretative scheme according to which positive messages lead to positive impressions of the promoter and negative messages lead to an aversion to a competitor. A second body of studies investigates empirically the means through which emotional states are transmitted between individuals on a social network, both as an effect of simply being connected to each other and as an effect of interacting directly. These analyses, carried out on a large scale, have revealed a high capacity for transmitting emotions through the mechanism of contagion (Kramer et al., 2014). On Facebook, for example, (1) the emotions expressed by a particular network influence the individuals that belong to it and direct interaction is not required for this transmission; (2) the positivity or negativity expressed on the newsfeed of users varies in line with the changing emotions of others; (3) people who are less exposed to emotive content for a period communicate less emotionally afterwards. The study carried out by Kramer and his colleagues thus provides further confirmation of the concept of ‘public mood’, construed as a widespread affective state that individuals experience through their membership of a community. This means that the emotional climates and microclimates produced by the major social networks can by themselves exert significant influence on the more general climate of public opinion (Noelle-Neumann, 1986).
This spill-over effect is also demonstrated by important later studies of other social networks. Predictably, not all users have the same ability to propagate the
contagion: opinion leaders and those who connect clusters with a low rate of information redundancy (‘structural hole spanners’) have greater influence (Yang et al., 2016). The third and final body of empirical investigations into emotionalization addresses the effects of emotionalized communication on online engagement. The use of emotively codified rhetoric by, for example, party leaders is a variable that can increase the level of polarization in the climate of opinions and, consequently, of cognitive mobilization. This is also true of offline communication, as the past decade of election campaigns has shown (Stromer-Galley, 2014; Johnson, 2018). The campaign for the re-election of Barack Obama in 2012 engaged around 2.2 million volunteers, with more than 30,000 of them taking responsibility for managing 10,000 neighborhood teams. It has been calculated that this produced around 24 million direct interactions and, it is estimated, 1.8 million votes (McKenna and Han, 2014). Emotive communication on social networks, on the other hand, has activation effects chiefly on subjects who are already politically engaged (Jones et al., 2012). In conclusion, all the factors considered contribute to defining a structurally partisan disposition of the political communication ecosystem. This tends to favor the effects of polarization between different communities in accordance with their respective loyalties and political points of view. Typically, this relates to support for a leader, candidate or party and resistance to its detractors and opponents. Online oppositions can be so radicalized that they lead to serious instances of negative campaigning, hate speech, confrontation and disrespect. Hate campaigns can even take an organized form, supported technologically by the use of bots that multiply their viral potential exponentially. And this naturally reinforces the symbolic meaning and proliferates the political implications. 
These types of tensions emerge as the product of the logic of opposition between ingroups and outgroups, on the basis of the
identification of a political adversary as an enemy that must be eliminated. At the root of the drive towards polarization is the process through which the digital public sphere splinters into echo chambers (Jamieson and Cappella, 2008). These are formed when individuals connect by reason of homophily – that is, the affinity between subjects who seek recognition and the sharing of their points of view – and relate to Fritz Heider’s balance theory, which can be simplified as ‘the friends of my friends are my friends, and the enemies of my friends are my enemies’. Social networks provide the technological infrastructure that brings these elementary dynamics into play. Thus, the echo chamber is a social space for political debate that is characterized by a high degree of internal homogeneity between its members, by the self-referentiality of the communication system and by informational redundancy. For this reason, it is also a place in which fake news and disinformation have the greatest chance of circulating and registering their desired effects. This brings powerfully distorting forces to bear on opinion-building processes on the internet. Moreover, the effects of selective exposure are increased, leading to the affirmation and reinforcement of pre-existing opinions. Like everywhere else, on the internet people seek information and data that are compatible with their current convictions and capable of delivering an adequate level of emotional coherence with their state of mind (Kahneman, 2011). Given the actor-centric structure of the web, this occurs more pervasively there than in other communication channels. This controversial and conflictual fragmentation of internet user-groups has been described by Sunstein (2017) as the ‘Balkanization’ of knowledge and understanding.
In the new paradigm of digital communication, one no longer asks the subject – accommodated and flattered by the ‘filter bubble’, and huddled in the comfort zone of his echo chamber – what he thinks, but rather what side he is on.
References
Altheide DL and Snow RP (1979) Media Logic. Beverly Hills, CA: Sage.
Baumgartner JC and Becker AB (eds) (2018) Political Humor in a Changing Media Landscape. Lanham, MD: Lexington Books.
Baym G (2010) From Cronkite to Colbert: The Evolution of Broadcast News. Boulder, CO: Paradigm Publishers.
Blumler JG and Gurevitch M (1995) The Crisis of Public Communication. London: Routledge.
Blumler JG and Kavanagh D (1999) The Third Age of Political Communication: Influences and Features. Political Communication 16(3): 209–30.
Brader T (2006) Campaigning for Hearts and Minds: How Emotional Appeals in Political Ads Work. Chicago and London: The University of Chicago Press.
Castells M (2009) Communication Power. New York: Oxford University Press.
Cepernich C (2016) Emotion in Politics. The International Encyclopaedia of Political Communication. Wiley-Blackwell, New York, vol. 1: 360–70.
Chadwick A (2007) Digital Network Repertoires and Organizational Hybridity. Political Communication 24(3): 283–301.
Chadwick A (2013) The Hybrid Media System. Oxford: Oxford University Press.
Damasio AR (1995) Descartes’ Error: Emotion, Reason and the Human Brain. London: Picador.
Day A (2011) Satire and Dissent. Bloomington: Indiana University Press.
Delli Carpini MX and Keeter S (1996) What Americans Know about Politics and Why It Matters. New Haven, CT: Yale University Press.
Deutsch KW (1963) The Nerves of Government: Models of Political Communication and Control. New York: The Free Press.
Dimitrova DV, Shehata A, Strömbäck J and Nord LW (2014) The Effects of Digital Media on Political Knowledge and Participation in Election Campaigns: Evidence from Panel Data. Communication Research 41(1): 95–118. (Published online in 2011).
Edelman M (1976) The Symbolic Uses of Politics. Champaign, IL: University of Illinois Press.
Enli GS and Skogerbø E (2015) Personalized Campaigns in Party-Centred Politics: Twitter and Facebook as Arenas for Political Communication in G Enli and H Moe (eds) Social Media and Election Campaigns: Key Tendencies and Ways Forward. London and New York: Routledge: 119–35.
Gerber AS and Green DP (2000) The Effects of Canvassing, Telephone Calls, and Direct Mail on Voter Turnout: A Field Experiment. American Political Science Review 94(3): 653–63.
Green DP and Gerber AS (2004) Get Out the Vote! How to Increase Voter Turnout. Washington, DC: Brookings Institution Press.
Hallin DC and Mancini P (2004) Comparing Media Systems: Three Models of Media and Politics. Cambridge, MA: Cambridge University Press.
Hallin DC and Mancini P (eds) (2012) Comparing Media Systems beyond the Western World. Cambridge, MA: Cambridge University Press.
Jamieson KH and Cappella JN (2008) Echo Chamber: Rush Limbaugh and the Conservative Media Establishment. New York: Oxford University Press.
Johnson DW (2018) Campaigning in the Twenty-First Century: Activism, Big Data, and Dark Money. New York and London: Routledge.
Jones JP (2005) Entertaining Politics: New Political Television and Civic Culture. Lanham, MD: Rowman and Littlefield.
Jones PE, Hoffman LH and Young DG (2012) Online Emotional Appeals and Political Participation: The Effect of Candidate Affect on Mass Behavior. New Media & Society 15(7): 1132–50.
Kahneman D (2011) Thinking, Fast and Slow. New York: Farrar, Straus & Giroux.
Karpf D (2012) The MoveOn Effect: The Unexpected Transformation of American Political Advocacy. New York: Oxford University Press.
Kertzer DI (1988) Ritual, Politics and Power. New Haven, CT: Yale University Press.
Klinger U and Svensson J (2015) The Emergence of Network Media Logic in Political Communication: A Theoretical Approach. New Media & Society 17(8): 1241–57.
Kramer ADI, Guillory JE and Hancock JT (2014) Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks. PNAS 111(24): 8788–90.
Kreiss D (2012) Taking Our Country Back: The Crafting of Networked Politics from Howard Dean to Barack Obama. New York: Oxford University Press.
Kreiss D (2016) Prototype Politics: Technology-Intensive Campaigning and the Data of Democracy. New York: Oxford University Press.
Lakoff G (2008) The Political Mind: Why You Can’t Understand 21st-Century Politics with an 18th-Century Brain. New York: Penguin Group.
MacKuen MB, Marcus GE, Neuman WR and Keele L (2007) The Third Way: The Theory of Affective Intelligence and American Democracy in WR Neuman, GE Marcus, AN Crigler and M MacKuen (eds) The Affect Effect: Dynamics of Emotion in Political Thinking and Behavior. Chicago, IL: The University of Chicago Press.
Marcus GE, Neuman WR and MacKuen M (2000) Affective Intelligence and Political Judgment. Chicago, IL: University of Chicago Press.
Mazzoleni G and Schulz W (1999) ‘Mediatization’ of Politics: A Challenge for Democracy? Political Communication 16(3): 247–61.
Mazzoleni G and Sfardini A (2009) Politica Pop. Bologna: Il Mulino.
McKenna E and Han H (2014) Groundbreakers: How Obama’s 2.2 Million Volunteers Transformed Campaigning in America. New York: Oxford University Press.
Nelson CJ and Thurber JA (eds) (2019) Campaigns and Elections American Style: The Changing Landscape of Political Campaigns. New York and London: Routledge.
Nielsen RK (2012) Ground Wars: Personalized Communication in Political Campaigns. Princeton, NJ: Princeton University Press.
Noelle-Neumann E (1986) The Spiral of Silence. Public Opinion: Our Social Skin. Chicago and London: University of Chicago Press.
Norris P (2000) A Virtuous Circle: Political Communications in Postindustrial Societies. Cambridge, MA: Cambridge University Press.
Putnam RD (2000) Bowling Alone: The Collapse and Revival of American Community. New York: Simon and Schuster.
Robinson MJ (1976) Public Affairs Television and the Growth of Political Malaise: The Case of ‘The Selling of the Pentagon’. American Political Science Review 70(2): 409–32.
Schreiber D, Fonzo G, Simmons AN, Dawes CT, Flagan T, Fowler JH and Paulus MP (2013) Red Brain, Blue Brain: Evaluative Processes Differ in Democrats and Republicans. PLoS ONE 8(2), e52970: 1–6.
Skoric MM, Zhu Q and Pang N (2016) Social Media, Political Expression, and Participation in Confucian Asia. Chinese Journal of Communication 9(4): 331–47.
Stanyer J (2013) Intimate Politics: Publicity, Privacy and the Personal Lives of Politicians in Media-Saturated Democracy. Cambridge: Polity Press.
Stromer-Galley J (2014) Presidential Campaigning in the Internet Age. New York: Oxford University Press.
Sunstein CR (2017) #Republic: Divided Democracy in the Age of Social Media. Princeton, NJ: Princeton University Press.
van Zoonen L (2005) Entertaining the Citizen: When Politics and Popular Culture Converge. Lanham, MD: Rowman and Littlefield.
Westen D (2008) The Political Brain: The Role of Emotion in Deciding the Fate of the Nation. New York: Public Affairs.
Yang Y, Jia J, Wu B and Tang J (2016) Social Role-Aware Emotion Contagion in Image Social Networks. AAAI’16 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence: 65–71.
37 Political Cultures
Dirk Berg-Schlosser
A short history of the subject
Perceptions (and stereotypes) of other peoples’ mentalities, mindsets, ways of life and cultures, or Tocqueville’s famous ‘habits of the heart’, are probably as old as humankind. Culture, however, is one of the most elusive concepts of the social sciences. Kroeber and Kluckhohn (1952), for example, found 164 different definitions ranging from words like agri-culture to very elaborate notions of ‘enlightenment and excellence of taste acquired by intellectual and aesthetic training’ (as per Merriam-Webster). Several distinctions are necessary at this point. First, the scope of the concept has to be defined. In a very broad sense, culture refers to a large cultural area (Kulturkreis) which in the past has been largely shaped by the major world religions (Weber, 1920). The term ‘civilization’ as used by Huntington (1996) and others is largely synonymous with this concept. The number of cultures identified in this way varies to some extent. Weber speaks
of six major world religions: Confucianism, Hinduism, Buddhism, Christianity, Islam and Judaism. Huntington, in his map of ‘The World of Civilizations: Post-1990’, lists nine, where he adds a Latin American, an African and a separate Japanese one. He further separates orthodox Christianity from the ‘Western’ and subsumes Judaism under the latter (Huntington, 1996: 27f.). This very broad concept thus neglects further important distinctions within these religions, such as those between Catholic and Protestant churches, or the Sunnite and Shi’ite divisions in Islam. It also does not take any existing political borders into account. Whether any concrete social identities are formed on this basis is an empirical question, which will be addressed below. Second, the content of what is covered by the term culture varies enormously as well. This ranges from very encompassing definitions as the customary beliefs, social forms and material traits of social groups or, in other words, their ‘way of life’ (Thompson
et al., 1990), to specific forms of ‘high culture’ in the arts and sciences. Quite often the term also has strong normative connotations, distinguishing those who are ‘cultured’ and ‘civilized’ from, at the other extreme, ‘barbarians’ (from Greek barbaros, meaning foreign or ignorant). In contemporary political science, the current use of the concept of political culture was coined in a seminal article by Gabriel Almond reflecting a ‘Weberian’ tradition in the social sciences. There, he defined it as ‘the particular pattern of orientations toward political actions in which every political system is embedded’ (1956: 396). In the by now classic ‘Civic Culture’ study, together with Sidney Verba, he first put this approach into empirical practice (Almond and Verba, 1963; see also Almond and Verba, 1980). Another pioneer of this approach, Lucian Pye, proposed a more elaborate definition:
Political culture is the set of attitudes, beliefs, and sentiments which give order and meaning to a political process and which provide the underlying assumptions and rules that govern behavior in the political system. It encompasses both the political ideals and the operating norms of a polity. Political culture is thus the manifestation in aggregate form of the psychological and subjective dimensions of politics. A political culture is the product of both the collective history of a political system and the life histories of the members of that system, and thus it is rooted equally in public events and private experiences. (Pye, 1968: 218)
Over the course of time, the concept has generated many criticisms and controversies. Almond himself, in retrospect, distinguished four main lines of criticism. One, advanced by Barry (1970), for example, questions the assumed causality: culture is not an independent but a dependent or, at best, ‘residual’ (Elkins and Simeon, 1979) variable. It is not socialization, attitudes and subsequent behavior that shape political institutions and decide the fate of a polity, as in Weimar Germany, but rather the other way round: institutions and performance influence attitudes and determine the eventual downfall of
a regime. This Almond considers a ‘straw man polemic’, because the concept has to be seen in a dynamic sense with feedback mechanisms working in both directions. The (orthodox) Marxist critique that the mode of production and the resulting social structures determine attitudes and behavior is similarly dismissed as one-sided and ‘monistic’. More recent neo-Marxists referring to Gramsci’s concept of cultural ‘hegemony’ discuss the complexity of the relationship between ‘basis’ and ‘super-structure’ and arrive at more balanced and (self-) critical accounts (Galkin, 1986). A third line of criticism puts into doubt the separation of political attitudes and actual behavior. This is a general problem for a ‘behavioralist’ perspective mainly based on survey research, such as is seen in election studies. Almond refutes this argument, saying that by separating the two, the complexities of the relation between political thought and political action – for example, including situational aspects – can be more fully explored. Finally, Almond dismisses the rational choice anti-culturalist critique that replaces historically shaped values and norms with mere calculations of (material) self-interest of political actors as reductionist and, at least again in a maximalist and scientistic variant, ‘monistic’. On the whole, thus, Almond considers his original concept to have withstood these criticisms quite well. Beyond these more immediate criticisms and further specifications and refinements which have taken place, distinguishing for example between system, process and policy aspects of political culture (Almond and Powell, 1978) and its cognitive, affective and evaluative components, some more fundamental critiques of Almond’s position still have to be mentioned. One questions the overall behavioralist and ‘attitude-centered’ perspective from a more holistic social– anthropological or semiotic point of view (Dittmer, 1977). 
The analysis of culture, referring to macro-level phenomena, should
not be reduced to individual attitudes and survey research. This amounts to an ‘individualistic fallacy’. Rather, the overall meaning of culture, its interpretation by relevant groups and actors and its expressions in certain symbols or rituals must also be taken into account. Another more specific criticism concerns not so much the political culture approach as such, but more the normative position implicit in the original ‘Civic Culture’ study. There, the authors not only described the political cultures of five countries, but also advocated, to some extent, the more balanced and mixed ‘civic’ Anglo-Saxon culture of the United States and Great Britain, consisting of a more or less harmonious blend of ‘parochial’, ‘subject’ and ‘participant’ elements. This was heavily criticized as an American and conservative bias by advocates of more participatory or ‘radical’ forms of democracy (Barber, 1984). Almond (2002) himself re-emphasized ‘Civic Culture as Theory’ by referring to Eckstein’s (1988) Theory of Democratic Stability emphasizing the congruence between a democratic political system and the supporting culture, and to Dahl’s (1989) more restricted concept of ‘polyarchy’. In this more restricted sense, ‘civic culture’ clearly refers to the embeddedness of modern democracies in deep-rooted popular attitudes. In new democracies, however, first a period of adaptation and learning – ‘habituation’, in Rustow’s (1970) sense – is required. Then strong discrepancies between respective elite cultures and the populations at large may occur (Higley and Gunther, 1992). In a broader sense, Almond himself saw the concept of political culture as not limited to democratic regimes only. In the following, we also use the concept in this broad sense, looking at cultural manifestations across a variety of contemporary political regimes. 
For this purpose, a more comprehensive analytical framework will be presented, in which the various approaches discussed so far can be located and, to some extent, integrated.
Basic theories and concepts
The concept of political culture, as it has been discussed so far, still suffers from two major flaws. One is its relative diffuseness and vagueness. The other concerns a more elaborate conceptualization of the micro–macro relationships. Both can be remedied by more general frameworks of sociological theory, one building on a broader ‘system’ perspective, the other incorporating James S. Coleman’s ‘general model of social explanations’ (Coleman, 1990). Almond and many others share a conceptualization of politics derived from cybernetics, where the interactions between a society and its political system as the central regulatory unit can be described as a flow of inputs, conversions and outputs which, if well functioning, maintain a dynamic equilibrium in the longer run (Easton, 1965). In the Parsonian tradition four major social subsystems and their interactions can be distinguished: the community system identifying the external boundaries, the socio-cultural system expressing its value orientations, the economic system providing its material basis and the political system as the major regulating body. These have been placed by Parsons in his well-known AGIL scheme (Parsons, 1951). With some modifications, this scheme can also be used to identify the major contents of a political culture and to locate the specific emphasis of some of the varying approaches discussed so far (Figure 37.1). This general taxonomic scheme has to be filled with a more differentiated pattern of variables, which, according to the cases and regions analyzed, can vary in time and space. Without claiming to be exhaustive, the following dominant features of each sub-system must be considered.
The Community System
First, the boundaries of each case have to be determined. In modern times, the nation-state
[Figure 37.1: AGIL scheme with four quadrants. G, goal attainment (specification): political system. A, adaptation (opening): economic system. I, integration (closure): community system. L, latent pattern maintenance (generalization): socio-cultural system. Labels within the figure: diffuse support, specific supports, political codes, consensual norms, political orientations (meanings and interpretations).]
Figure 37.1 Components of political culture in a system framework
Source: Adapted from Parsons (1951).
has become the most pertinent unit of analysis. Its objective borders today are defined by international law, but its actual impact can also be assessed with the help of communications research (Deutsch, 1966). In political–cultural terms, the extent and degree of a person’s sense of identity with his or her political community is the most relevant aspect (Karolewski, Chapter 31, this Handbook). This national identity tends to become a social ‘skin’ for the individuals concerned which, after a certain age, cannot be shed very easily. In this regard, a person’s social identity is linked to his or her ego-identity and personality system in general. Problems and crises at one level may easily affect the other, and an excessive sense of nationalism, for example, is not infrequently found in persons who experience other serious psychological problems. In many cases, the political community is not homogeneous and various sub-national identities persist. Each contemporary nation-state has been formed by specific historical developments, some of which – for
example, the drawing of boundaries by the colonial powers in Africa, but also the division of Germany after World War II – have been arbitrary and accidental. In this way, sometimes quite curiously composed units have come into being, which, however, over the course of time develop their own ‘life’ and weight. As far as the formation of nation-states in Europe is concerned, Stein Rokkan has developed the most explicit model (Rokkan, 1975; Flora, 1999). The most important cleavages discerned were those of center–periphery, state–church, countryside–town, capital–labor and their respective interactions and developments over time. In each case, then, a specific pattern of these cleavages with their respective ethnic/regional, religious and socio-economic identities emerged. Characteristically, it is the ethnic/regional identification – which is often reflected in linguistic variations as well – that tends to be articulated most strongly. Religious attachments similarly may be very intense, but they have become weaker in many instances in the
course of processes of secularization. A sense of identity based on socio-economic differentiations, on the other hand, often requires a specific context and more explicit forms of organization – farmers’ associations, trade unions, and so on. In this regard, the level of differentiation and development of the economic sub-system and the objective class distinctions which can be based on it are of particular importance. It must also be noted that the kind and degree of identification of a certain group should not be confounded with the actual content of its conflicts with others. Only very rarely, for example, do conflicts arise about ethnic or religious matters per se. Usually they concern the economic position and access to political power of these groups in the overall system and are articulated along the dominant ethnic or religious cleavage (as in Belgium or in Northern Ireland, to give only two of the numerous historical and contemporary examples: Horowitz, 1985). In certain instances, different aspects of objective group differentiations can be combined in a social milieu with a common subculture. In Imperial Germany, for example, a rural–catholic, a protestant–bourgeois, and a workers’ milieu, each with its specific regional concentrations, could be distinguished. These milieus can develop quite extensive internal structures and organizations (e.g. in the fields of education; common social and cultural activities; the media; economic and political organizations, and so on) and become largely autonomous from the wider community. It has been shown that such sub-milieus have been persisting in Germany, for example, since the territorial and religious divisions of the Peace Treaty of Westphalia in 1648, and still influence electoral behavior (more Catholic or Christian versus other parties) to a considerable extent (Rohe, 1992; Berg-Schlosser and Rytlewski, 1993). 
In more extreme cases, these milieus can ossify into certain ‘Lager’, which view each other as hostile camps as in Austria in the
1930s, and which, at best, cooperate only in a consociational manner at the elite level (Lijphart, 1977). In such cases it is not rare to see the other alternative – civil war or, if a group is more remote and regionally concentrated, secession. More often, however, multiple identifications, which need not necessarily be in conflict with each other, can be found within the larger community. Thus, a person can be a local, regional and national ‘patriot’ at the same time, with the kind and intensity of his attachment depending on the concrete circumstances. The respective scope and intensity of expressions of social trust may similarly vary in each instance. In extreme cases, it may extend only to members of a person’s immediate family or other narrow in-groups. In others, it may be quite pervasive and generalized (Banfield, 1958; Putnam, 1993). At the overall community level, certain often unconscious consensual norms are also at work, which accept and support the social system as such, even though individuals and groups may act mostly in a conflicting manner within it.
The Socio-Cultural System
The socio-cultural system reflects the basic values of each society and gives meaning to its existence. In traditional societies, the interpretation and internalization of these values was closely linked to a transcendental sphere that legitimized the existing social and political order. In modern societies, a certain secularization and rationalization of values has taken place. But even there, common rituals and symbols can be observed which give meaning to political life by referring to constitutive historical events in the light of some universally claimed values and their particular evolution in a certain society. Examples such as the American, French or Soviet revolutions and their respective value bases, but also more gradual evolutions (as in the UK) or more peaceful events elsewhere,
such as celebrations of independence in ‘new’ nations, are cases in point. Some authors have coined the term ‘civil religion’ in reference to this phenomenon (Bellah and Hammond, 1980), which is in part congruent with this aspect of political culture. Such values justify the place of individuals and groups in the society (e.g. in a more egalitarian or more hierarchical sense, but also concerning differentiations of age, sex, and so on), determine their scope of action (e.g. in a more dependent or more participatory way) and define the respective realms of solidarity, in particular when claims running counter to egotistically perceived or other more immediate material interests have to be made. These values also define the extent of the political sphere proper (in a more pervasive or more limited sense), in which authoritative common decisions have to be made. They include, basically, the rules for the resolution of conflicts in society (e.g. in a more consensual or more antagonistic way) and for decision-making (e.g. in an authoritarian or more democratic manner) in the political system. In this regard, they closely interact with the bases of legitimacy of the political system proper. Here again, close interactions between this social sub-system and individual personality characteristics can be observed. Cultural values are transmitted through the usual socializing agents of each society (families, peer groups, the educational system, the media, etc.), and are more or less internalized by each member (Marczewska-Rytko, Chapter 38, this Handbook). They are, in turn, shaped by collective historical experiences (in particular traumatic ones such as wars, intensive political or economic crises, assassinations of political leaders, etc.) and form the collective memory of each society. The strength and durability of this memory varies culturally, too, depending to a certain extent on the more specific orientation of each society towards its past and future. 
It seems that during long periods of external political suppression, such memories can become particularly keen (as in Ireland or Poland, for example).
In many communities, the interpretation of basic values has been the particular domain of priests and similar specialists. In modern societies, secular intellectuals and scientists have increasingly taken up this role (Poggi, Chapter 3, this Handbook). They reflect and justify such values in a discursive manner at a higher level of abstraction. In this sense, they contribute to a cultural meta-system (‘culture of culture’). Their role, however, is not limited to legitimizing the existing political order in a docile way; on the contrary, they may critically point to existing insufficiencies in the realization of certain values and inconsistencies and contradictions between them. The analysis of the more general distribution of attitudes and values in the socio-cultural sub-system is amenable to the usual tools of modern representative and quantifiable survey research, if a certain minimal technical infrastructure exists for this purpose and the general ‘climate’ of a particular regime permits it. In this way, on the one hand, overly ‘holistic’ generalizations, as in former ‘national character’ studies (Inkeles, 1997), can be avoided and, in a critical sense, existing stereotypes and prejudices concerning other communities which are assessed in the same manner can be refuted. It is important, however, that such overall distributions of certain characteristics are broken down by the major social structural categories and linked to the cleavages in the community system. On the other hand, the political codes and meanings, and their interactions with the socio-cultural sphere, have to be assessed by more qualitative methods and interpretations. In this respect, the complementary nature of quantitative and qualitative methods, requiring a certain in-depth knowledge and sensibility of the respective researcher, is of particular importance.
The Economic System
The economic system constitutes the material basis for the existence and development
of each society. Again, it is not so much its ‘objective’ side (i.e. the different modes of production, the concrete allocation of resources, the effects on social structure and their dynamics over time) with which we are concerned here, but more its ‘subjective’, political–cultural implications. This sub-system is determined by its own logic of instrumental–rational (zweckrational in Weber’s sense) thinking and behavior. To a certain extent, however, these orientations are also conditioned by interactions with the general socio-cultural sphere. They relate to individualistic versus more collective orientations, attitudes toward work, property, the accumulation of wealth, patterns of consumption, certain lifestyles, and similar ones. The interactions with the political sub-system consist of certain regulatory needs (with which we need not deal at this point) and concrete demands toward the public sphere. Their satisfaction may create ‘specific supports’ for the political authorities in Easton’s sense and may contribute in the longer run towards a ‘diffuse support’ for the political system as a whole (Easton, 1965).
The Political System
The ‘core’ of political culture can be found in the sources and the extent of legitimacy of the political system, the ‘diffuse support’ it enjoys in the political community. Whereas this support is always based to a certain extent on the customary acceptance of certain rules and institutions, if they have existed over a longer period of time, its value-base also has to be justified in terms of the more general discourse of the socio-cultural system. Again, in more traditional communities, this base is grounded in the transcendental sphere, as for example the believed divine origin or ‘gift of grace’ of certain dynasties or the consecration of political rulers by religious authorities. In modern societies, the major source of legitimacy is a ‘rational–legal’ one in Weber’s sense (1922: 122ff.),
based on a critical reflection of the institutionalized rules of political recruitment and decision-making. Open and fair elections involving the widespread participation of the population at large have become the major instrument in this regard. The decisive test of legitimacy of a democratic political culture in this sense is the acceptance of a political decision with which a person or group does not agree. Weber’s charismatic type of legitimacy, based on the personal appeal of a political leader, is a special case, which, by definition, is a relatively short-lived one. If the ‘routinization’ of charisma in terms of some more generally accepted principles fails, this type of legitimacy ends with the death of the political leader, at the latest. Where conflicting principles of legitimacy, for example monarchic and democratic ones, exist side by side, the stability of the system as a whole is undermined (Huntington, 1968). Such conflicts often lead to civil wars or revolutions. Where no more durable and pervasive forms of legitimacy can be established, rulers usually attempt to achieve compliance by coercion. In addition to the political–cultural foundation of the ‘polity’, there exist more specific orientations that are directed to the politics and the policies of the system in Almond and Powell’s (1978) terms. ‘Politics’ refers to the political processes at the input side of the political sub-system. This ‘process culture’ consists of the knowledge, feelings and evaluations which members of the political system have toward the self as a political actor and toward other political actors, in particular political parties and interest groups, but also more informal social movements and ‘citizens’ initiatives’. ‘Policy culture’ is directed toward the output of the system, its internal policies (extractive, regulative and distributive), but also its external (military, diplomatic and economic) ones. 
All these have to be specified more closely depending on the actual cases analyzed. This ‘systemic’ outline thus leads to a more complex conceptualization of political
culture. In contrast to some of the early protagonists of this approach, it is not only concerned with aspects of democratic stability in a ‘civic culture’, but points to possible tensions and sources of ‘system breakdown’ as well. Each sub-system possesses, to a certain extent, its own mechanisms and ‘logic’, and is ‘self-referential’ or autopoietic (Luhmann, 1984). If this internal logic becomes overriding (as, for example, with the pursuit of individual or particular group benefits in the economic system at the expense of certain collective goods), the overall system may disintegrate. It is also important to note that political culture should not be considered as being exclusively determined by the community, the socio-cultural and the economic sub-systems, but that it is constantly maintained and sometimes explicitly modified by the political system itself – its institutions, symbolic expressions and incumbents. This could be observed in the different processes of state formation in Europe, for example, but also in the attempts of ‘nation-building’ in post-colonial Africa and elsewhere. The very process of European integration similarly implies some of these ‘directed’ political–cultural changes. Interactions of this kind, where institutional aspects can be considered as independent variables affecting the other sub-systems and creating attitudinal and behavioral changes ‘from above’, are also considered by more recent ‘neo-institutional’ approaches (March and Olsen, 1989).
Linking Levels of Analysis
As has been mentioned before, political culture is a concept referring to the macro-level of society, which, however, in the behavioralist Almond/Verba tradition is usually only assessed by survey research at the micro-level. In this way, many important aspects may escape the attention of the observer. For example, expressions of a sense of patriotism and the use of the flag and the national anthem, as has vividly been demonstrated again after September 11, 2001, differ markedly between the United States and Germany. This can only be fully understood if the respective national ‘macro’-histories and, in particular, the traumatic German one are taken into account. Conversely, the use of such symbols and their relative levels of perception and acceptance can only be assessed by quantifiable methods. To bring these levels and their interactions into a coherent framework, the general model of explanations in social science, as proposed by James Coleman (1990) and further developed by Hartmut Esser (1993), is helpful (Figure 37.2). This model links the initial social situation at the macro-level (upper left-hand side) – including its historical, social structural, and so on conditions – to the micro-level of subjective perceptions, expectations, interests, preferences, values and so on (lower left-hand side), and the resulting individual actions (lower right-hand side). In order to become politically relevant, these actions
Figure 37.2 Levels of analysis (diagram labels: macro level – Social Situation, Explanandum; meso level – aggregation; micro level – Actor, Action; framing). Source: Adapted from Coleman (1990) and Esser (1993).
must be aggregated in different forms at the meso-level (right-hand side in the middle) to affect the actual political decision-making and its consequences again at the macro-level (upper right-hand side). This, of course, must be seen in a dynamic sense with various feedback mechanisms which produce changes in the course of time, but also within the context of the international and nowadays ‘global’ system with its continuous interactions. With the help of this model, some of the major fallacies and deficiencies of past approaches can again be made evident. Orthodox Marxists, for example, would draw their conclusions directly from the macro- (social structural) level on the left-hand side to the macro- (political) level on the right-hand side without taking into account the subjective perceptions and actual ‘consciousness’ of the most relevant social classes at the micro-level and the problems of their aggregation and acting together at the meso-level. Similarly, a purely semiotic analysis of some symbols and rituals at the macro-level is not able to assess their perception and impact at the micro-level (where, for example, a certain cynicism concerning official rituals, as in some of the former socialist countries, may prevail). A social–anthropological thick description in Geertz’s (1973) sense, usually obtained by ‘participant observation’ in small communities or groups, is subject to similar limitations. Even though the meaning of certain symbols and actions may be better understood (Verstehen in Weber’s sense), it remains difficult to generalize from the point of view of a particular observer and from a small group or village to the overall society. The restrictions of a pure ‘rational choice’ approach on the micro-level also become evident. 
Neither the social and historical context (upper left-hand side) nor the problem of meaningfully aggregating individual preferences on a large scale at the meso-level (middle right) is adequately taken into account by simple utility-maximizing assumptions of homo oeconomicus or the strict observation
of social roles and norms by homo sociologicus. For this reason, Hartmut Esser has proposed to conceive of socially integrated and active human beings as ‘restricted, resourceful, evaluating, expecting, maximizing men’ (RREEMM) and, of course, women. In this way, the restrictions and resources depending on the macro-level (upper left-hand side), but also cultural factors and traditions (‘expecting and evaluating’), come into play. Esser has further refined and supplemented this model with the concept of ‘framing’. This is derived from cognitive psychology, where the observation of a certain object (e.g. a certain material symbol, such as a flag) must fit into the ‘frame’ of a known or anticipated situation in order to be able to interpret it and to act accordingly (Shore, 1996). The subsequent action then follows a certain routinized and often unconscious ‘script’ which has been ‘programmed’ by the socializing experiences of a particular group or society (e.g. when you raise and place your right hand on your heart while listening to the national anthem during official ceremonies in the United States – a ‘frame’ and ‘script’ which are not practiced in this way in many other countries). These frames are often specific to particular sub-milieus in a larger society where the structural components of the community system and their specific identities again come into play. When these sub-milieus are also territorially segregated to a certain extent (e.g. in certain constituencies, precincts, city quarters, etc.), the long-term preferences for certain parties based on such cleavages can be better explained than by economic or other immediate utility considerations alone. In this sense, electoral geography, as originally developed by André Siegfried (1913), can in certain situations provide a better explanation (and predictor) of election results than the usual cross-national surveys where particular findings may not ‘fit’ and remain contradictory (e.g. 
Catholic workers voting for a conservative party). In addition, some ‘modernizing’, secularizing and
other individualizing influences are also at work and, at least as far as party preferences and voter identification are concerned, a certain ‘dealignment’ can be observed (Gabriel, Chapter 35, this Handbook). In this way, tensions may arise between a person’s ‘cultural’ group identity and her individual preferences. To supplement Esser’s terminology, these conflicting tendencies can be captured by adding two more Is (for identifying and individualizing) to his formula, making it ‘RREEIIMM’ (in German this even allows for a play on words, where sich einen Reim auf etwas machen means to make sense of something, which in this context may be quite appropriate; Berg-Schlosser, 2010).
Global/Regional Differentiation
After a certain lull in the 1980s, the political culture approach experienced a ‘renaissance’ in the 1990s, but also received renewed criticism. This renaissance was influenced both by epistemological considerations and concrete events. In a broader sense, the ‘cultural turn’ in the humanities brought with it a renewed emphasis on the more ‘subjective’ sides of human existence, aspects of meaning and understanding, but also a more ‘relativist’, less universal and more ‘post-modern’ rather than ‘scientistic’ epistemological orientation (Kellner, 1995). More specifically, in political science attempts to demonstrate the general post-materialist and, in this sense, also ‘post-modern’ value change were made by authors such as Inglehart (1988) and the broader cultural and deeper historical dimension of politics was (re-) emphasized in Eckstein’s (1988) approach and Putnam’s (1993) influential study. At about the same time, the events after 1989/90 in Central and Eastern Europe and the demise of the Soviet Union opened up a rapidly widening field of democratization and political culture studies more or less in the Almond/Verba tradition (Diamond,
1994; Rose et al., 1998). Similarly, in international politics, the perception of an ongoing and intensifying ‘clash of civilizations’, in Huntington’s (1996) terminology, replacing the Cold War of the former super-powers by intra-societal and international conflicts based on ethnic, religious and, in general, ‘cultural’ identities, was intensively discussed and seemingly confirmed, in the minds of some scholars, by the events of September 11, 2001, and after. As a first step, it is worth taking a closer look at the broader cultural areas (Kulturkreise) or civilizations in Huntington’s sense, which are largely based on the predominant religions. Max Weber (1920 [1905]), in a seminal but contested study, described a close linkage between Protestant religious beliefs and social and economic attitudes and behavior. In his later works on Confucianism, Hinduism, Buddhism and ancient Judaism he further elaborated this linkage, mostly with regard to the economic ethics of world religions and their consequences for social and economic development (Weber, 1920). By necessity, however, his ‘qualitative’ studies were based on the original religious documents and their interpretations by leading protagonists, and available contemporary sources. These reflected prevailing perceptions of such cultures and their ‘mentalities’, but could not be assessed in a more differentiated and representative way. In this sense, they also represented mostly elite cultures and their interpretations, and did not show the actual way of life of the populations at large. Such broad (over-) generalizations can also be found in Huntington’s book. Nevertheless, these religious and philosophical bases remain important for a deeper understanding of regional cultures. With regard to our conceptual scheme, Weber’s work was mostly concerned with the socio-cultural system and the prevailing codes and their consequences for the attitudes in the economic sub-system. 
Here, we look first at the broader political orientations in these cultural areas and the bases
Figure 37.3 Huntington’s ‘The world of civilizations: post-1990’. Map of civilizations, based on Huntington’s Clash of Civilizations (1996: 26). The nine ‘civilizations’ identified by Huntington are: Western, Latin American, Japanese, Sinic, Hindu, Orthodox, African, Buddhist and Islamic. Note: Huntington considers Turkey and Iran to be special cases, indicated by different shading on the map. Source: https://commons.wikimedia.org/wiki/File:Civilizations_map.png, last accessed November 12, 2018.
of legitimacy for the respective regimes. At this point, these can, of course, only be some broad cursory remarks. As will become apparent later, much more detailed country-by-country accounts of the respective political cultures are needed. A first clue in this respect can be gained from Huntington’s map ‘The World of Civilizations: Post-1990’ (Figure 37.3). In this map, the West consists of Northern and Western Europe, the United States and Canada, plus Australia and New Zealand. No further distinction (in contrast to Weber) between (Roman) Catholic and Protestant areas is made. For Huntington, Western civilization is characterized by the ‘classic legacy’ of Greek and Roman cultures, a separation of spiritual and secular authorities, social pluralism, the relatively early emergence of representative political bodies and a high degree of individualism and personal rights. All this resulted in a long-term process of ‘modernization’, including industrialization and widespread literacy, the ‘enlightenment’ and, finally,
democratization (for more detailed accounts see, e.g., Eisenstadt, 1987, vol. 1). By contrast, the Christian Orthodox cultural areas, after their split from Rome in the schism of 1054, remained more ‘traditional’ and authoritarian under large bureaucratic empires. Church and state were not separated for a long time, and modernization after the reforms of Peter the Great in Russia in the early 18th century remained largely confined to technological innovations, reforms of the army and the strengthening of the imperial power. Attempts at political liberalization remained weak and were faced with repression from above. This did not change after the Soviet revolution of 1917, even though the influence of the church was diminished (Ilyin, 2011). Further East, popular beliefs were strongly shaped by the teachings of Confucius (551–479 bc). He advocated harmony between the self, the family and the order of the state. If this is achieved, a peaceful and meaningful life in a harmonious and consensual society will follow. Holders of public offices should
be examples of moral quality. In the Chinese Empire, civil servants consequently had to undergo long periods of training and further qualifications. In a positive sense, this could lead to a benevolent form of authoritarianism, as practiced later to some extent by Prime Minister Lee Kuan Yew in Singapore (in this office from 1959 to 1990, later as Senior Minister and ‘Mentor’). In a negative sense, the Chinese bureaucracy under the Empire was very hierarchical and rigid and became inefficient in many ways. After the success of the 1949 revolution led by Mao Zedong, a strictly authoritarian Communist regime was established. Nevertheless, some Confucian traditions are still emphasized, which, as far as the moral quality of the leadership is concerned, can also be turned against some of the incumbents. As the successful cases of Taiwan (after the constitutional reforms of 1992) and, under somewhat different circumstances, South Korea (since the late 1980s) show, a liberal democracy is not incompatible with such a tradition, as already advocated by Sun Yat-sen as first president of the newly founded Republic of China in 1912 (He, 2016). Japan is a special case in this region, shaped by Shinto (‘Way of the Gods’) and Buddhist traditions. In the long history of the Japanese Empire, these beliefs underpinned the legitimacy and divine origins of the emperors. After the Meiji Restoration in 1868 this developed into a kind of ‘state shintoism’ mobilizing imperial and nationalist loyalties and modernizing the state, the economy and the military. After the defeat in World War II, a constitutional democracy was established under American auspices and the emperor (tenno) became a mere figurehead, no longer to be considered an akitsumikami (a deity in human form). Yet, the Yasukuni shrine, for example, commemorating those who died in service of Japan in wartime, including many convicted war criminals after World War II, still serves as a nationalist symbol. 
More generally, up to the present day, in Japanese society some of the more ‘allegiant’ (to use
Inglehart’s and Welzel’s term, see below) traditional values and consensual norms prevail (Murakami, 1987). In other parts of East and Southeast Asia, strong Buddhist influences exist as well, going back to the teachings of Gautama Buddha, a monk and sage who is believed to have lived in Northeast India between the 4th and 6th centuries bc. This today includes countries such as Thailand, Cambodia, Laos, Myanmar, Sri Lanka, Mongolia and former Tibet. Buddhists, today in many variations, believe in a cycle of birth and rebirth in different forms (samsara), which only can be ended by leading a ‘good’ meritorious life (karma) and the final attainment of ‘nirvana’. In daily practice, this is reflected by adhering to major ethical principles (no killing, no stealing, no lying, no intoxicants) and frequent meditation and prayers, leading to self-awareness and calmness of mind. This pronounced ‘otherworldly’ orientation does not mean, however, that passive and fatalist attitudes prevail. On the contrary, positive action is required to achieve karma (Powell, 1989). In India, Hinduism has become the prevailing religion – or, more precisely, dharma as the general way of life. This consists of a fusion of various Indian traditions with no clear founder or single religious script. Central concepts include, as in Buddhism, samsara as the continuous cycle of birth, life, death and rebirth, and karma, a person’s actions in current life or in the past. Traditional Hindu society had been divided into four major castes (varnas): the Brahmins (teachers and priests), the Kshatriyas (warriors and kings), the Vaishyas (farmers and merchants) and the Shudras (servants and laborers), to which one belonged by birth, with very limited chances of inter-marriage or social mobility. Politically the subcontinent was ruled for a long time by a variety of regional kingdoms and empires, but it was also subject to several invasions and to outside dominance by Mongol and Muslim rulers. 
After a period of colonization from the early 17th century by
the British East India Company, the subcontinent was finally unified and formally ruled by the British Crown after 1858. Over the course of time a new British-educated indigenous elite emerged, including personalities such as one of the major leaders of the independence movement, Mahatma Gandhi, and the first Prime Minister after independence in 1947, Jawaharlal Nehru. Gandhi also initiated a cultural revolution, turning against the strict religious and caste divisions, emphasizing the equal rights of the ‘untouchables’ and members of the ‘scheduled castes’ (dalits). He did not succeed, however, in preventing the secession of the Muslim-dominated parts of East (later Bangladesh) and West Pakistan. Independent India was, from the very beginning, a federal secular state and a parliamentary democracy. The long dominant Congress Party of Nehru and his successors was later replaced in government by the Bharatiya Janata Party (BJP), which emphasized Hindu nationalism. The secular constitution, however, has been maintained (Ashraf, 1995). The political influences of Islam have become the most contested in the contemporary world. It is most prominent in the Middle East and North African (MENA) region, but it also extends to parts of sub-Saharan Africa, Central Asia and Southeast Asia, including Bangladesh, Indonesia and Malaysia. In some of the remaining monarchies, as in Jordan or Morocco, the ruling dynasties claim their legitimacy to be based on the fact that they are descendants of the Prophet Muhammad (sharifs). In most of the other countries, whether under civil authoritarian or military rule, the influence of Islamic culture and beliefs is very strong. Basically, two major strands today can be distinguished. One remains strongly traditional and fundamentalist, accepting only the Qur’an (the holy book as revealed to the prophet) and the Shari’ah (the body of legal texts and interpretations derived from it) as foundations of social and political life. 
The Wahhabist doctrine in Saudi Arabia and the theocratic regime in Iran, but also parts of the Muslim Brotherhood in Egypt and the self-proclaimed ‘Islamic State in Iraq and Syria’ (ISIS), are examples of this orientation; the Islamic Republic of Iran, for its part, is committed to the Shiite doctrine of the velâyat-e faqih, that is to say, ‘governance of the jurists’ who are able to interpret the law. The other strand is more reformist and regards basic Islamic values and practices as compatible with many features of modern life and democratic rule. This came to the fore in the Arab Spring in 2010–11 in Tunisia and later in other parts of the MENA region, but only in Tunisia has it so far had more lasting consequences of greater democratization. The democratic regimes in Senegal and Indonesia are also examples of this kind (Fattah, 2006). Turkey is a special case: the revolution against the Ottoman rulers led by Kemal Atatürk established a secular republic in 1923, with, however, a mixed history of military and democratic rule after the first multi-party elections in 1950. The contrasts between the urban political elite and the rural, more traditional parts of the country remained strong. The Islamist Justice and Development Party (AKP) of Prime Minister Erdoğan, in power since 2002, has put the secular foundations of the state into question. After another attempt at a military coup in 2016 and the adoption of a new constitution, (now) President Erdoğan has turned the country in an authoritarian direction.
In many parts of sub-Saharan Africa today, Christian denominations, including many foreign or indigenous sects, prevail. Traditionally, most African ethnic groups had some form of animist religion, believing in a creator god and a transcendental world of ancestor spirits who could still influence their daily lives (Mbiti, 1969). In Ethiopia, a Coptic Orthodox church has existed since the 4th century.
Colonization by the European powers after the Berlin Conference of 1884–85 also led to intensive missionary activity by Catholic, Anglican and Protestant churches.
More recently, many evangelical and other sects have come into being, some of them linked to North American churches. In a country like Kenya, for example, more than 200 Christian denominations can be found today. In a number of countries, as in Uganda or Nigeria, religious divisions have also led to political conflicts. On the whole, the public influence of the various churches is relatively strong. Most early attempts at establishing democratic rule after independence in the early 1960s failed, leading to ‘big men’ politics and ‘neo-patrimonial’ rule in authoritarian one-party states or military dictatorships. Only after the ‘second liberation’ in the early 1990s and the end of the Cold War did some more durable democratic regimes emerge. Nevertheless, strong clientelistic ties, often on an ethnic and regional basis, still prevail in many countries (Bratton and van de Walle, 1997).
The Latin American region was colonized by Spain and Portugal in the 16th century, and the colonizers established the Catholic Church there, with strong ties to the Vatican. Traces of indigenous religions (Aztec, Maya, etc.) can only be found in some of the Andean states and Mexico. After independence from the colonial rulers in the early 19th century as a result of the Napoleonic wars in Europe (and later in Brazil), mostly authoritarian regimes were established, often led by military leaders (‘caudillos’). In the 20th century, many countries experienced a vacillating history, often with renewed military interventions. Only in the late 1980s did more stable democratic regimes emerge (O’Donnell et al., 1986). In the meantime, many syncretistic churches and evangelical sects have also become active. These can be found mostly in areas affected strongly by the slave trade from Africa, as in the Caribbean and Brazil. With regard to continuing strong social and economic inequalities, a ‘liberation theology’ within the Catholic Church has gained some ground.
This has also contributed to the emergence of Christian Democratic parties, as in Chile and Venezuela (Sigmund, 1994).
Overall, this mixture of influences justifies speaking of a distinct civilization in this region, in Huntington’s sense. These brief characterizations of the major civilizations only describe some of the long-term historical influences and cultural value patterns. For a ‘political’ culture in the narrower sense of the term, the specific contemporary political context and institutional set-up of each country has to be taken into account. This begins with the ‘community system’ of our conceptual scheme and the specific patterns of state formation underlying it. Thus, the borders of each unit (and the possible conflicts that go with them) have to be determined. Then the internal differentiation, again shaped by historical forces and events, has to be outlined, as, for example, in Rokkan’s ‘conceptual map of Europe’ (1975 and see above). In this way, a more detailed picture of each country and its religious, ethnic, regional and other political–cultural variations can be obtained. These then are also reflected in the respective electoral results and their changes over time.
Empirical databases
With regard to all this complexity, in the past most studies of political culture have been concerned with individual cases in a more qualitative and historical sense. But since Almond and Verba’s pioneering study and the communication revolution in the digital age, a huge wealth of data has been accumulated, most of it freely available on the internet. In the beginning, these were mostly concerned with economic or demographic developments, as pioneered by the United Nations and its sub-organizations or by the IMF and the World Bank. In the meantime, a vast amount of cultural and political data has also been collected on a more continuous basis by a variety of efforts and organizations. This applies mostly to different measures and indices of democracy at the macro-level, such as
those by Polity and Freedom House or, more recently, ‘Varieties of Democracy’ (V-Dem), but today these also include regular survey research on political values and attitudes in a large number of countries at the micro-level. This began with the Eurobarometer in Western Europe in the 1970s, initiated by Ronald Inglehart in the European Community framework and followed since the 1990s in all major regions of the world (Afrobarometer, Arab Barometer, Asian Barometer, Eurasian Barometer, Latinobarometro), now organized jointly in the framework of Global Barometer Surveys (https://www.globalbarometer.net/). The European Social Survey (ESS), covering a broad scope of issues and established in 2001, is the most methodologically sophisticated so far. Even ‘deeper’, in a cultural sense, is the probing of the World Values Survey (WVS), which since the early 1980s has attempted to assess the major world cultures and their changes over time (http://www.worldvaluessurvey.org/wvs.jsp). Again at the initiative of Inglehart, and originally testing his theory on post-materialism and generational change, up to 2014, six waves of these surveys had been completed. By now, they have covered almost 90 countries, but this coverage has been irregular since not all countries were included in each wave and funding has to be sought for each wave on an individual country basis. Nevertheless, the major findings are remarkable and have led to a refinement of modernization theory and assessment of cultural patterns worldwide and their changes over the past few decades (see below).
Major advances, ongoing debates, critical assessments
Historical Depth and Persistence of Cultures
Robert Putnam (1993) and his collaborators emphasize the deep historical dimension of
political cultures, contributing to the renaissance of this approach mentioned above. They analyzed the civic traditions in the newly created regions after the administrative reforms in Italy in the early 1970s, and the differential impact of these reforms. They attributed the relative success of these reforms in the North, compared to the persistence of more traditional structures and low institutional performance in the South, to the rich traditions of city-states and republics in Northern Italy since the Middle Ages. There, a dense network of organizations of civil society with high levels of social trust and political participation has emerged over the course of time. This ‘social capital’, as they call it, has been continuously reproducing itself and constitutes the basis for further economic and political developments up to the present day. By contrast, social relations in Middle and Southern Italy, in the areas of the former (much larger) Vatican state and the Kingdom of Sicily, have been characterized by a political culture of suspicion, feudal and more recent clientelistic relationships including the Mafia, economic backwardness and political apathy or cynicism. This is in line with what Banfield (1958) earlier called the ‘amoral familism’ of the South. Here too, self-reproducing mechanisms and game theoretical equilibria are at work, this time as a circulus vitiosus rather than a circulus virtuosus as in the North. Putnam’s study has found much acclaim. Nevertheless, some weaker points relating to certain aspects of the methodology (Morlino, 1995) and the overall approach (Jackman and Miller, 1996) have been pointed out as well. By going back in history almost indefinitely, such an explanation can easily become unfalsifiable and overlook important changes that have taken place as well.
A more elaborate specification of the overall model, linking and testing the arguments at each level, could have provided a more differentiated and less static picture.
Global Cultural Modernization
Modernization theories have been en vogue in the social sciences since the late 1950s and early 1960s. In political science, Lerner’s (1958) and Lipset’s (1963) early studies have been most influential. While Lipset based his assessment on broader socio-economic and more objectifiable indicators such as GNP per capita and literacy, Lerner explicitly identified a socio-psychological component: the increasing ‘empathy’ of persons in the transition from traditional to modern societies. In more recent times, Ronald Inglehart has become most prominent (and persistent!) in following this approach on a broad cultural basis. He first detected and proclaimed the existence of ‘post-materialist’ attitudes and values in the ‘silent revolution’ (1977) of the younger generation in the well-to-do Western countries. His concepts, methods and findings have been widely reviewed and criticized in the meantime (Jackman and Miller, 1996). More recently, Inglehart (1997) further broadened his approach to document ‘modernization and post-modernization’ in 43 societies, now including a number of non-Western ones, mostly based on the first waves of the World Values Survey (WVS) which were conducted in the early 1980s and 1990s. There, he expanded his concept to tap a large variety of orientations, ranging from religious beliefs and economic and political attitudes to sexual norms and changing gender roles. He argues that in the course of modernization, traditional, including religious, values are replaced by ‘rational–legal’ ones in Weber’s sense. But, as he puts it, ‘modernization is not the final stage of history’ (1997: 5). Increasingly in the younger generation in the advanced countries, ‘post-modern’ values, including ‘post-materialist’ ones, but also ‘a growing mass desire for participation and self-expression’ (ibid: 327) can be found.
In this way, he concludes, ‘economic, cultural, and political change go together in coherent patterns, and they are changing the
world in broadly predictable ways’ (ibid: 341). Such sweeping statements have again raised many criticisms, mainly concerning methodological aspects. Even more ambitious is subsequent work with Pippa Norris highlighting increasing gender equality across cultures (Inglehart and Norris, 2003), and with Christian Welzel showing the rise of democratic values and self-assertive citizens around the globe (Inglehart and Welzel, 2005). The most recent works along these lines are those by Welzel (2013) and Dalton and Welzel (2014), based on the sixth wave of the WVS and its predecessors. There they distinguish between the ‘allegiant model’ of the Almond/Verba ‘civic culture’ concept, with a general allegiance to the regime, pride in the political system and a modest level of political participation on the one hand, and an ‘assertive model’ of critical citizens emphasizing emancipative values, who have become ‘dissatisfied democrats’ in many instances, on the other. The distribution of countries around the world based on these two dimensions is shown in Figure 37.4. These cases can also be grouped into broader cultural zones resembling Huntington’s civilizations. The older established democracies can be found in the upper right-hand quadrant, Latin American cases somewhere in the center and East European cases on the left-hand side. The African and Asian cases are mostly on the lower right-hand side. Remarkable changes over time are revealed in Figure 37.5, where respondents in the latest survey from each society are broken down by age cohorts. Here the arrows follow the trace from the oldest cohort at the arrow tail to the youngest one at the arrow head. This figure clearly indicates changes from allegiant to assertive cultures in almost all regions, with the exception of Africa and the Indian subcontinent. In spite of some limitations of the data with changing sets of nations in each WVS wave, this is an important finding, showing overall
Figure 37.4 From allegiant to assertive citizens
Source: Dalton and Welzel (2014: 295), based on most recent WVS survey of each society.
trends and prospects for further democratization and more intensive political participation in major areas.
Cultural Conflicts on a Global Scale
Even more widely discussed than Inglehart’s ‘universal’ and Putnam’s ‘deep’ historically rooted concepts of political culture has been Huntington’s (1996) ‘Clash of Civilizations’. He distinguishes nine broad cultural zones, which are characterized mainly by common fundamental religious traditions and a common identity, which transcends existing political boundaries (see above). These identities, in his view, determine the most
significant cleavages in international politics after the end of the Cold War. The Islamic region, in particular, is seen to be characterized by ‘bloody borders’ and ‘fault line wars’, as for example in Bosnia or Kosovo, Chechnya, Kashmir, the Philippines, Northern Nigeria, and so on. In addition, in the longer run, ‘core states’ in the major regions – in particular China and, if a common center should emerge, in the Islamic world – would pose a major challenge to the present dominance of the West. At the extreme, a scenario of ‘the West against the rest’ may become possible. He concludes that ‘in … the global real clash between Civilization and barbarism the world’s great civilizations … will hang together or hang separately’ (1996: 321). He advocates, therefore, that core states refrain
Figure 37.5 From allegiant to assertive citizens (trajectories)
Source: Dalton and Welzel (2014: 297), culture zone cohort trajectories.
from intervening in conflicts in other civilizations (his ‘abstention rule’) and that core states negotiate with each other to contain or to halt fault line wars (his ‘mediation rule’) (1996: 316). This perception thus stands in stark contrast to Inglehart’s assessment of a universalizing ‘post-modern’ world culture, but also to the emerging international regimes based on commonly accepted charters within the framework of the United Nations, such as the International Criminal Court, and an increasingly universal perception (and practice) of basic human rights, democracy and ‘good governance’. Not surprisingly, this spectacular thesis has provoked much reaction and criticism. Among the favorable comments were those by hardliners on either side of the cultural divide, such as among conservative and military interests and security forces in the United States or in Mahathir’s Malaysia, speaking in defense of non-democratic ‘Asian values’. Others pointed out some major weaknesses in his concept. Mark Juergensmeyer, for example, also identified ‘religious nationalism’ as
a major more recent phenomenon, but saw it largely confined to smaller groups, including terrorist ones, in some cultures – as among Islamic fundamentalists or Hindu nationalists. Rather than the ‘apocalyptic vision of a worldwide conflict between religious and secular nationalism’, he sees ‘reason to be hopeful. It is equally as likely that religious nationalists are incapable of uniting with another, and they will greatly desire an economic and political reconciliation with the secular world’ (Juergensmeyer, 1993: 201). Huntington’s warnings, intensified after September 11, 2001, have stimulated a wide range of reactions both in international politics and in political science. Among the most balanced and sober accounts in this respect is Müller’s (1998) ‘Das Zusammenleben der Kulturen’ (literally: the living together of cultures). Müller questions some of Huntington’s major assumptions, such as the prevalence of cultural vis-à-vis state politics, the more or less uniform identities of the respective civilizations, their capability to become major coherent actors on the world scene, and the
neglect of global economic factors – such as the importance of world trade and energy supplies – and not merely ‘cultural’ ones, and so on. This cannot be discussed further here. It is important, however, that Huntington’s and similar concepts have to be checked against the framework of our initial model. First of all, it is important to note that his concept of ‘civilization’ is not identical to political culture. The latter is defined by the established territorial boundaries of the respective polity and shaped (in a dynamic feedback loop) by the existing type of political system. Only in cases where, in Huntington’s terminology, ‘core states’ coincide with the respective civilization, as in China, Japan and, to a relatively large extent, India, may the two notions be treated as synonymous. In all others, the diversity of states and regimes in each cultural zone may lead to significantly varying political cultures, whether more authoritarian or more democratic. Second, ‘macro’ preconditions (on the upper left-hand side) such as cultural and religious cleavages, but also economic discrepancies and so on, only play a role if they are perceived as such and acted upon at the ‘micro’-level, and then may be aggregated at the meso-level on the right-hand side in order to have political ‘macro’-effects (on the upper right-hand side, see Figure 37.2). If and to what extent this really is the case at each stage can only be answered empirically, but some of these (necessary) links seem to be rather weak in Huntington’s scenarios.
Perspectives
Taken altogether, there have been many advances in the study of political cultures around the world. The ongoing debates concerning deeper historical studies, global cultural and political trends and possible cultural conflicts and even violent clashes also show the liveliness and relevance of
this topic. Nevertheless, there are still many blank spots in our cultural maps in many regions. Moreover, even where we have well documented longer-term studies and detailed assessments, as in many of the older democracies, new developments bring about important and sometimes quite sudden and unexpected changes. A number of these tendencies have been lumped together under the broad rubric of ‘globalization’ (Milner, Chapter 73, this Handbook). These include many recent technological developments, most importantly in the area of digitalization and the new electronic media; an increased international division of labor; mass tourism; and new waves of international migration. In this way, many new inter-cultural contacts occur, which provide chances of learning and better mutual understanding, but which can also lead to new frictions, hateful populist reactions and xenophobia. Therefore, ‘multi-culturalism’, sometimes used in a polemical sense, is here to stay (see Amir-Moazami, Chapter 89, this Handbook). This, however, also requires clear regulations for international travel, refugee or asylum status, possible citizenship, and so on. On the whole, a greater ‘hybridization’ of cultures will occur (Katzenstein, 2010). This does not mean that all national or sub-cultural identities will disappear. As has been indicated above, multiple identities at different levels and in different situations are much more likely to become the rule in an increasingly cosmopolitan world. This also means that inter-cultural dialogues, not just among intellectuals or the international jet set, but in many daily life situations, will become much more frequent. For this, a certain amount of openness and curiosity to learn more about other people’s lives and thinking, but also a measure of mutual tolerance of disagreements, is required. 
In any case, in contemporary democracies, narrowly conceived ‘imagined’ identities based on perceived common origins and ethnos (Anderson, 1983) will have to be overcome
to make room for the demos of all citizens, if the plebs in the streets is not to prevail. In an increasingly global political science, open, eye-to-eye dialogues about basic concepts such as democracy, fundamental human rights, the social and political role of women, forms of social or sexual discrimination and many others, including our methodology, are required. In this way, ‘non-Western’ theories and philosophies can be confronted with the current ones, as in comparative political theory (Mallavarapu, Chapter 1, this Handbook), but also with the rich and ever expanding empirical data on social and political attitudes and political cultures worldwide (see, for example, Schubert and Weiß, 2016). Were he alive today, Max Weber would certainly welcome and embrace these possibilities.
References
Almond, G. A. (1956) Comparative Political Systems. Journal of Politics 18(3), 391–409.
Almond, G. A. (2002) Ventures in Political Science: Narratives and Reflections. Boulder, CO: Lynne Rienner Publishers.
Almond, G. A. & G. B. Powell (1978) Comparative Politics: System, Process, and Polity. Boston: Little, Brown.
Almond, G. A. & S. Verba (1963) The Civic Culture: Political Attitudes and Democracy in Five Nations. Princeton, NJ: Princeton University Press.
Almond, G. A. & S. Verba (1980) The Civic Culture Revisited. Boston: Little, Brown.
Anderson, B. (1983) Imagined Communities: Reflections on the Origins and Spread of Nationalism. London: Verso.
Ashraf, A. (ed.) (1995) The Emerging Political Culture in India. New Delhi: HIRA Publications.
Banfield, E. C. (1958) The Moral Basis of a Backward Society. New York: The Free Press.
Barber, B. R. (1984) Strong Democracy. Berkeley: University of California Press.
Barry, B. (1970) Sociologists, Economists and Democracy. London: Macmillan.
Bellah, R. N. & P. E. Hammond (1980) Varieties of Civil Religion. San Francisco: Harper & Row.
Berg-Schlosser, D. & R. Rytlewski (eds) (1993) Political Culture in Germany. London: Macmillan Press.
Berg-Schlosser, D. (2010) Political Culture at a Crossroads? In: S. K. Mitra, M. Pehl & C. Spiess (eds) Political Sociology – The State of the Art. Opladen: Barbara Budrich Publishers, 31–49.
Bratton, M. & N. van de Walle (1997) Democratic Experiments in Africa. Cambridge: Cambridge University Press.
Coleman, J. S. (1990) Foundations of Social Theory. Cambridge, MA: Harvard University Press.
Dahl, R. A. (1989) Democracy and Its Critics. New Haven, CT: Yale University Press.
Dalton, R. & C. Welzel (eds) (2014) The Civic Culture Transformed: From Allegiant to Assertive Citizen. Cambridge: Cambridge University Press.
Deutsch, K. W. (1966) Nationalism and Social Communication. Cambridge, MA: MIT Press.
Diamond, L. (1994) Political Culture and Democracy in Developing Countries. Boulder, CO: Lynne Rienner Publishers.
Dittmer, L. (1977) Political Culture and Political Symbolism: Toward a Theoretical Synthesis. World Politics 29(4), 552–83.
Easton, D. (1965) A Systems Analysis of Political Life. New York: Wiley.
Eckstein, H. (1988) A Culturalist Theory of Political Change. American Political Science Review 82(3), 789–804.
Eisenstadt, S. N. (ed.) (1987) Patterns of Modernity, 2 vol. London: F. Pinter.
Elkins, D. J. & R. Simeon (1979) A Cause in Search of Its Effect, or What Does Political Culture Explain? Comparative Politics 11(2), 127–45.
Esser, H. (1993) Soziologie: allgemeine Grundlagen. Frankfurt and New York: Campus.
Fattah, M. A. (2006) Democratic Values in the Muslim World. Boulder, CO: Lynne Rienner.
Flora, P. (ed.) (1999) State Formation, Nation-Building and Mass Politics in Europe – The Theory of Stein Rokkan. Oxford: Oxford University Press.
Galkin, A. A. (1986) Herrschaftselite, Politisches Verhalten, Politische Kultur. Frankfurt: IMSF.
Geertz, C. (1973) The Interpretation of Cultures: Selected Essays. New York: Basic Books.
He, B. (2016) Confucianism and Democracy: Testing Four Analytical Models in an Empirical World. Taiwan Journal of Democracy 12(2), 59–84.
Higley, J. & R. Gunther (1992) Elites and Democratic Consolidation in Latin America and Southern Europe. Cambridge: Cambridge University Press.
Horowitz, D. L. (1985) Ethnic Groups in Conflict. Berkeley: University of California Press.
Huntington, S. P. (1968) Political Order in Changing Societies. New Haven: Yale University Press.
Huntington, S. P. (1996) The Clash of Civilizations and the Remaking of World Order. New York: Simon & Schuster.
Ilyin, M. (2011) Democracy: Russian Perspectives. In: B. Badie, D. Berg-Schlosser & L. Morlino (eds) International Encyclopedia of Political Science. Thousand Oaks: Sage, vol. 3, 607–14.
Inglehart, R. (1977) The Silent Revolution. Princeton, NJ: Princeton University Press.
Inglehart, R. (1988) The Renaissance of Political Culture. American Political Science Review 82(4), 1203–30.
Inglehart, R. (1997) Modernization and Postmodernization: Cultural, Economic, and Political Change in 43 Societies. Princeton, NJ: Princeton University Press.
Inglehart, R. & P. Norris (2003) Rising Tide: Gender Equality and Cultural Change around the World. Cambridge: Cambridge University Press.
Inglehart, R. & C. Welzel (2005) Modernization, Cultural Change, and Democracy. Cambridge: Cambridge University Press.
Inkeles, A. (1997) National Character: A Psycho-Social Perspective. New Brunswick: Transaction Publishers.
Jackman, R. W. & A. Miller (1996) A Renaissance of Political Culture? American Journal of Political Science 40(3), 632–59.
Juergensmeyer, M. (1993) The New Cold War? Religious Nationalism Confronts the Secular State. Berkeley: University of California Press.
Katzenstein, P. J. (ed.) (2010) Civilizations in World Politics: Plural and Pluralist Perspectives. London: Routledge.
Kellner, D. (1995) Media Culture: Cultural Studies, Identity and Politics between the Modern and the Postmodern. London: Routledge.
Kroeber, A. L. & C. Kluckhohn (1952) Culture: A Critical Review of Concepts and Definitions. Cambridge, MA: Harvard University Press.
Lerner, D. (1958) The Passing of Traditional Society: Modernizing the Middle East. Glencoe, IL: Free Press.
Lijphart, A. (1977) Democracy in Plural Societies: A Comparative Exploration. New Haven, CT: Yale University Press.
Lipset, S. M. (1963) Political Man: The Social Bases of Politics. New York: Doubleday.
Luhmann, N. (1984) Soziale Systeme: Grundriss einer allgemeinen Theorie. Frankfurt am Main: Suhrkamp.
March, J. G. & J. P. Olsen (1989) Rediscovering Institutions: The Organizational Basis of Politics. New York: Free Press.
Mbiti, J. S. (1969) African Religions and Philosophy. London: Heinemann.
Morlino, L. (1995) Italy’s Civic Divide. Journal of Democracy 6(1), 173–7.
Müller, H. (1998) Das Zusammenleben der Kulturen. Ein Gegenentwurf zu Huntington. Frankfurt am Main: Fischer Taschenbuch Verlag.
Murakami, Y. (1987) Modernization in Terms of Integration: The Case of Japan. In: S. N. Eisenstadt (ed.) Patterns of Modernity. London: F. Pinter, vol. II, 65–88.
O’Donnell, G., P. Schmitter & L. Whitehead (eds) (1986) Transitions from Authoritarian Rule. Baltimore: Johns Hopkins University Press.
Parsons, T. (1951) The Social System. Cambridge, MA: Harvard University Press.
Powell, A. (1989) Living Buddhism. Berkeley: University of California Press.
Putnam, R. D. (1993) Making Democracy Work: Civic Traditions in Modern Italy. Princeton, NJ: Princeton University Press.
Pye, L. W. (1968) Political Culture. In: D. L. Sills (ed.) International Encyclopedia of Social Sciences. New York: Macmillan, 218–24.
Rohe, K. (1992) Wahlen und Wählertradition in Deutschland. Kulturelle Grundlagen deutscher Parteien und Parteiensysteme im 19. und 20. Jahrhundert. Frankfurt am Main: Suhrkamp.
Rokkan, S. (1975) Dimensions of State Formation and Nation-Building: A Possible Paradigm for Research on Variations within Europe. In: C. Tilly (ed.) The Formation of National States in Western Europe. Princeton, NJ: Princeton University Press, 562–600.
Rose, R., W. Mishler & C. Haerpfer (1998) Democracy and Its Alternatives: Understanding Post-Communist Societies. Cambridge: Polity Press.
Rustow, D. A. (1970) Transitions to Democracy – Toward a Dynamic Model. Comparative Politics 2(3), 337–63.
Schubert, S. & A. Weiß (eds) (2016) Demokratie jenseits des Westens – Theorien, Diskurse, Einstellungen. PVS – Sonderheft 51, Baden-Baden: Nomos.
Shore, B. (1996) Culture in Mind: Cognition, Culture, and the Problem of Meaning. Oxford: Oxford University Press.
Siegfried, A. (1913) Tableau Politique de la France de l’Ouest sous la Troisième République. Paris: Armand Colin.
Sigmund, P. E. (1994) Christian Democracy, Liberation Theology, and Political Culture in Latin America. In: L. Diamond (ed.) Political Culture & Democracy in Developing Countries. Boulder, CO: Lynne Rienner, 211–28.
Thompson, M., R. Ellis & A. Wildavsky (1990) Cultural Theory. Boulder, CO: Westview Press.
Weber, M. (1920) Gesammelte Aufsätze zur Religionssoziologie. 3 vol. Tübingen: J. C. B. Mohr.
Weber, M. (1922) Wirtschaft und Gesellschaft. Tübingen: J. C. B. Mohr.
Welzel, C. (2013) Freedom Rising: Human Empowerment and the Quest for Emancipation. Cambridge: Cambridge University Press.
38 Political Socialization
Maria Marczewska-Rytko
A short history of the subject
Studies on political socialization developed in the 1960s thanks to American sociologists. The points of reference for studies on political socialization are psychological and sociological theories. Psychological theories that provided special inspiration included the theory of social learning, psychoanalysis, the theory of social development and human ecological theory. Sociological theories, such as systems theory, action theory and the theory of social structure, also exerted a significant impact on the investigation into socialization processes. Social learning theory is based on the assumption that human behavior is directly associated with an individual’s process of experiencing and processing environmental elements because a person is born without the knowledge of behavioral principles. S/he has to acquire this knowledge in the learning process, in which an essential role is played by reinforcement. In classic learning
theories, reinforcement is a trained, automatic response to a stimulus. John Broadus Watson, the founder of behaviorism, emphasizes the fact that people direct most of their behaviors at other people and meet different reactions. These reactions can be positive or negative. Positive reactions occur when an individual observes the rules adopted in a community or meets social expectations. Negative reactions will be a consequence of actions that violate the adopted rules. Thus, when trying to avoid negative reactions, an individual learns to repeat the behaviors that are accepted in a community. In this way, an individual internalizes expectations that stem from a given culture. Modern learning theories treat reinforcement as a conscious decision by a person who takes the results of a specific action into account. Social learning theory, founded by Albert Bandura, occupies an important position in psychology. According to its tenets, one of the mechanisms of behavior acquisition is learning by observation and imitation
of persons perceived as representing standard behaviors. Such persons are chosen because of the functions they perform and are treated as social models. In other words, in the receiver’s perception, models have to be characterized by a high status or competencies. The process of learning through social models consists in the observer identifying with the model and imitating his/her behavior. In different stages of their life, individuals are more or less receptive to modeling.
Psychoanalysis focuses on analyzing personality from the angle of tensions between an individual’s drives/motives and the rules of social culture with its system of norms and sanctions. The founder of psychoanalytic theory is Sigmund Freud, who perceived personality as an outcome of a person’s relationship with other people who are emotionally important to him/her. Freud believed that human personality is determined by the Id system – genetically acquired needs or drives – and the Superego system – socially and culturally shaped ideas of what an individual should be. Tensions and conflicts occur between Id and Superego because Superego restricts a person’s natural desires, thereby producing a sense of guilt and shame. An individual develops a special way of conduct, a compromise between natural drives and the requirements of culture and society. The third system of personality, the Ego, mediates this compromise. In Freud’s conception, socialization is a process of achieving this compromise. An important role in the development of psychoanalysis is played by Erik H. Erikson’s theory of psychosocial development. He distinguishes eight developmental stages in human life: infancy, early childhood (toddlerhood), preschool age, early school age (pre-adolescence), adolescence, young adulthood, middle adulthood and old age (late adulthood). In each of the stages, an individual faces arising conflicts. Resolving them is a condition for advancing to the next stage.
Conflicts that are not resolved will dysfunctionally influence a person. To sum up, an individual’s development is perceived as a mutual adjustment
(reconciliation) of an individual and the environment in the process of resolving successively emerging conflicts.
The theory of cognitive development formulated by Jean Piaget assumes that intelligence is an advanced form of biological adaptation, which results in the structuring of cognitive processes. Piaget distinguishes several stages of development: the stage of sensorimotor intelligence, which lasts until the 18th month of age; the preoperational stage – from a year and a half until seven years; concrete operations – from 7 to 15 years of age; and the stage of formal operations (thinking), from 15 years up. Piaget’s interests cover the processes of active and multidimensional interactions between an individual and the environment. An individual’s behavior and his/her actions are functions of the organism; the environment can influence them only to a limited extent.
The ecological theory of human development (ecological psychology) assumes that human development occurs through an individual’s relations with the environment. An individual is influenced by the environment and in turn influences it. Urie Bronfenbrenner – the founder of the ecological theory of human development – distinguishes four levels that constitute the environment: the microsystem, mesosystem, exosystem and macrosystem. The microsystem consists of the family, a peer group, a professional or religious group. The mesosystem encompasses connections between individual elements that make up the microsystem and can exert a positive influence on human development. The exosystem refers to various social settings or structures, whose accumulation can negatively impact human development. The macrosystem refers to such dimensions of an individual’s functioning as politics, economy or culture. Bronfenbrenner also uses the term ‘niche’ (habitat, setting), denoting the place where close personal relationships are established.
Worth noting among the systems theories is the theory by Talcott Parsons, based on
the category of system that denotes functionally connected and specifically structured systems and features. Parsons distinguishes three types of systems: the organism system, social system and personality system. For Parsons, the socialization process is the process of acquiring and internalizing values and norms of the social environment. The process continues until the acquired values and norms become the motivations for an individual’s action and form the system of his/her personality. In the socialization process, the state of balance is thus achieved between the needs of the organism, personality and social structure. During the process of socialization, each human individual advances through a specific system of increasingly complex structures of social roles, in which s/he functions at different stages of his/her life. In the course of the socialization process, an individual acquires and gathers the means and abilities to function in the social environment.
Niklas Luhmann further developed systems theory. In his conceptualization, it assumes the form of a metatheory for constructing theories of individual systems and the general system. Luhmann believes that the signifying characteristic of the modern world is mainly the phenomenon of growing complexity and unpredictability (contingency). In a world composed of increasingly large numbers of partial systems, many opportunities for action, experiencing and constructing meanings appear. Luhmann emphasizes the role of mutually coupled and self-steering mechanisms. His conviction is that society is unable to totally control the results of the process of socialization because they are seen not so much in the social system as in an individual’s mental system. Significantly enough, the interconnection of these systems makes their constituents, like social communication or consciousness, go beyond the boundaries of their own systems in the process of merging.
Theory of action is associated first of all with the research by Charles Horton Cooley and George Herbert Mead. Cooley focuses on analyzing the relationship between the
functioning of an individual in a group and the functioning of a group in an individual. In a word, the two entities can be examined only in relation to one another rather than in isolation from each other. It is therefore essential to inquire into the essence of interaction, to seek an answer to the question about how an individual becomes a human being and what this process looks like. Cooley is convinced that this process takes place owing to communication with other people, that is, owing to the use of gestures, postures, facial expressions, words, writing, telephone, telegraph, and so on. He explains mental differences between individuals as caused by differences between communication systems in which they function. A substantial role is also played by institutions, social classes or public opinion.
According to G. H. Mead’s theory, personality develops as the internalization of social contacts with other people. At birth, there is no personality: it develops in the process of social experiencing and interaction. Personality is thus a social structure. It is one’s own social experience that determines to what extent personality participates in communication. Mead stresses that an individual does not experience him/herself directly but rather indirectly, by adopting the points of view of other members of a social group. He is convinced that children’s role-playing games fulfill in their experience an important function of getting to know the social world and shaping their personality owing to the experience of their own place in the whole of the social system. A self-aware individual accepts the organized social attitudes of the group or community to which s/he belongs. For example, in the sphere of politics, an individual identifies with a certain political grouping and adopts its attitude toward the rest of a given community and toward problems arising in a specific political situation.
The theory of social structure exerts a great influence on the study of socialization processes. Émile Durkheim focuses on relationships between an individual and society,
and seeks a balance between the aspirations of individual persons and the needs of society as a whole. An individual functions in a specific reality, which is the existing reality. That is why an individual has to adjust to the way or ways of life recognized in society. Otherwise s/he may encounter various reactions, such as legal sanctions or social exclusion.
The starting point for Jürgen Habermas is the development of the acting subject functioning in a definite social structure. He interprets socialization as the unity of the process of socialization and individualization, as a result of which an interaction occurs between an individual and the social environment. Habermas comprehensively explains the social process in which an individual develops. Society is interpreted by Habermas from the perspective of communication theory. Special importance is attributed to an individual’s linguistic skills perceived in the communication process. Autonomy and the ability to act focus on linguistic exchange between subjects.
From the standpoint of political socialization, of no small importance is Theodor Adorno’s conception of authoritarian personality. Adorno and his associates sought the sources of the individual’s mental and social mechanisms in the early stages of his/her life. As a result, they created the so-called F scale to measure the level of authoritarianism, and they distinguished the type of personality termed authoritarian.
Basic theories and concepts
The concept of socialization, denoting the process of learning, that is, acquiring social competencies that enable one to become a member of a specific social or cultural group, has a long history. The Oxford English Dictionary dates the term to 1828 and defines it as ‘to render social, to make fit for living in society’ (Renshon, 1992). Socialization encompasses the process of the emergence and development of personality taking place
in interrelationship with the socially conveyed social and material environment. Essentially, we seek the answer to the question of how a human being becomes a subject (actor) capable of acting socially. The essence of the socialization process comes down to the emergence, shaping and development of human personality. In the course of the socialization process, a human, with his/her biological and mental dispositions, becomes a socially mature individual, equipped with abilities to act effectively (dynamically sustained throughout his/her life) within the structures of a given society.
The term ‘political socialization’ denotes the process of individuals becoming accustomed to the framework of political culture that has been adopted as binding in the sphere of politics (Berg-Schlosser, Chapter 37, this Handbook). The process of political socialization thus takes place as an aspect of specific systems of norms and values. The political awareness and political behaviors of members of society are shaped within this process. In this way, the patterns of political behavior and ways of action are transmitted in the socialization process. Owing to socialization processes, an individual is introduced into a certain political culture or rules of the game that are binding in the world of politics (Langton, 1969; Dennis, 1973; Greenstein, 1975; Patrick, 1977; Dekker, 2014). Herbert H. Hyman uses the term ‘political socialization’ to refer to the process lasting throughout an individual’s life (Hyman, 1959). During childhood, attitudes develop that influence political behaviors in mature age. He emphasizes: ‘humans must learn their political behavior early and well and persist in it’ (Hyman, 1959: 17). As a result of the process of political socialization, an individual becomes a political person (Dawson and Prewitt, 1969; Ichilov, 1990; Hess and Easton, 1962; Easton and Dennis, 1969). For R. S.
Sigel, political socialization means ‘the learning process by which the political norms and behaviors acceptable
to an ongoing political system are transmitted from generation to generation’ (1965: 1). Edward S. Greenberg defines political socialization as ‘the process by which the individual acquires attitudes, beliefs, and values relating to the political system of which he is a member and to his own role as citizen, within that political system’ (Greenberg, 2017: 3). Individual political systems thus try to transmit to young people the systems of norms, values and convictions that are congruent with the persistence of a political system. For Almond and Powell (1978), political socialization is a process, as a result of which political cultures persist and evolve. During the process, individuals adopt cultural patterns and develop their orientations toward political phenomena. The two scholars emphasize the importance of political socialization for the functioning of political systems. Fred I. Greenstein offers a narrower and a broader conception of political socialization. According to the first, political socialization is ‘the deliberate inculcation of political information, values, and practices by instrumental agents who have been formally charged with this responsibility’ (Greenstein, 1968: 551). According to the second, political socialization would encompass ‘all political learning, formal and informal, deliberate and unplanned, at every stage of the life cycle, including not only explicitly political learning but also nominally nonpolitical learning that affects political behavior, such as the learning of politically relevant social attitudes and the acquisition of politically relevant personality characteristics’ (Greenstein, 1968: 551).
Two approaches have been developed in the studies on political socialization. According to the first one, the point of reference of socialization processes is an individual. According to the second position, the political system is the point of reference. With the individual-oriented approach, knowledge and values or convictions characteristic of an ordinary citizen are of significance. Emphasis is put here on the system of the connection
between socialization processes in primary groups, first of all in the family, and on the functioning social structures. Political behaviors characteristic of different cultures were associated with the patterns of parental authority or binding norms and values in the upbringing process. Such approaches can be found inter alia in Margaret Mead’s conceptions (1978). With the orientation toward the political system, scholars focus their attention primarily on the issues of the stability of political systems treated as those responsible for the course of the process of political socialization.
The counterculture movements of the 1960s exerted a crucial influence on the development of inquiries into political socialization. Owing to them, the principal research assumptions were revised. Scholars focused their interest on the questions of social change, opposition to the authorities and interactions between different subjects of socialization processes.
The process of internalization of social reality consists in both knowing the external world and sharing it with others. Peter L. Berger and Thomas Luckmann (1966) state that we have to do with the comprehensive introduction of an individual into the objective social world. An individual becomes human by entering relationships with the natural environment and with the social and cultural environment owing to significant others who exert an influence on that individual. Referring to G. H. Mead’s findings, the two scholars recognize that personality development is directly bound up with the development of the organism and social processes accessible through significant others. All social universes undergo changes because they are the result of human activity. Reality is socially defined in unambiguous terms because it is defined by both individuals and social groups.
Therefore, Berger and Luckmann (1966) emphasize that an individual is not born a member of a community/society, but s/he is born with predispositions to socialization. Consequently, the internalization process denoting both the
understanding of other people and perception of the world as a meaningful and social reality is of significance. According to the definition given by Piotr Sztompka (2002), socialization denotes a process owing to which an individual becomes accustomed to the way of life of his group and society at large. This happens through learning the rules and ideas contained in culture. The social origin of the behavior of human individuals is determined by at least two arguments. One of them applies to the situation when specific systems of norms, values, rules and ideas are self-evident in some societies but absent from others, with entirely different patterns functioning instead. The other argument refers to the cases of children brought up in an animal environment and to the resultant consequences. They show that the human way of life develops as a result of the child acquiring the culture of the community in which s/he is brought up. Biology, Sztompka (2002) emphasizes, provides specific predispositions, but whether they will fully develop is associated with the influence of society. That is why it can be concluded that, owing to socialization processes, an individual becomes a full-fledged member of social communities: s/he acquires competencies and skills indispensable for functioning in a specific group; s/he adopts the ideas and conceptions characteristic of a given culture and learns specific social roles s/he intends to perform. Margaret Mead (1978) makes identification the main point of reference for the socialization process. This provides the answer to the question of which past, present or future a young person can identify with. Mead stresses that possibilities of choosing an identification became available when competing lifestyles arising from religion and political ideologies were experienced. She distinguishes three types of cultures: postfigurative, cofigurative and prefigurative. 
A characteristic feature of postfigurative cultures is that children learn first of all from their parents. Changes are so slow here that
the past of the adults is in essence the future of the young. In other words, young people follow the path paved by the older generations. The feeling of invariable continuity is transmitted in the process of socialization. From this comes the feeling of the young child’s identity and the conviction that only one destiny is available. An example is the caste system in India, in which representatives of individual castes define their identity and their capabilities or lack of them by identifying with a caste. A similar situation occurs in the case of different patterns of upbringing depending on gender.
In cofigurative cultures, both children and adults learn from their peers. The prevailing pattern for members of societies is therefore the way their peers behave. Cofigurative cultures function based on such experiences of young people that have no equivalent in the experience of their parents or grandparents. What is more, grandparents are often not present in the physical sense because in the process of mobility, young people move to other parts of their country, emigrate abroad, start a family, or leave their parents in old people’s or nursing homes. In this way, representatives of the older generation cease to exert an influence on the experience of the growing children. In this type of culture, there is the belief that the years of children’s learning are only a partial preparation for functioning in different social groups outside of the family. The prevailing feeling is that of living in a constantly changing world. This is associated with consent to the existence of an intergenerational rift and the expectation that the subsequent generations will function in a new technological world.
In prefigurative cultures, the adults also learn from their children. Young people functioning in different parts of the world are facing situations that the older generations have not encountered.
This largely stems from the fact that the previous generations did not know, experience or take advantage of such rapid changes on a global scale, such as the extension of energy sources or availability of
new means of communication. Importantly, these changes happened over the span of the functioning of one generation. Mead (1978) observes that in many parts of the world, the older generations are still functioning in accordance with the postfigurative type of culture: children adopt their parents’ conviction that there are absolute, indisputable values and they seek to redefine these values in the course of their lives. Piotr Sztompka (2002) uses the term ‘reverse socialization’. In the traditional model, the older generation exerts an influence on the younger generation, whereas in the model of reverse socialization, elders learn from young people. In his view, two main factors influence the process of reverse socialization: social and cultural changes taking place in the age of globalization and the impact of the mass media. According to Almond and Powell (1978), the concept of a political system encompasses both government institutions and legislative bodies; courts; state administration agencies; all structures in their political dimension; traditional structures comprising family relationships; caste groups; and organizations such as political parties, interest groups or means of mass communication. A significant aspect from the standpoint of the functioning of political systems is the legitimization of political power. Political power is legitimized when citizens observe the rules adopted in a system not because they fear sanctions but because they believe in the legitimacy of authority. A significant point of reference for political socialization is political culture. The concept of political culture encompasses a set of attitudes of community members and assessments of the role of an individual in a given system. Therefore, Almond and Verba (1965), referring to the concept of political culture, point to the political system internalized by particular members of a community in the form of emotions and assessments. 
Specific political attitudes and behaviors of adults can be related to the experience acquired by a child in the process of political socialization.
This process of internalization encompasses cognitive, emotional and evaluative attitudes. In a democratic system, a citizen is expected to participate in political life, to be well informed and to be guided by reason rather than by emotions in the process of political choices (Almond and Verba, 1965). Therefore, for a democratic community, the most desirable model of political socialization is one in which an individual acquires the knowledge about how to perform various roles to overcome dissonance between them. Importantly, Almond and Verba lay strong emphasis on the character of the social environment, the binding patterns of social interactions, historical political memory and various experiences pertaining to the political structure and the ways of its functioning. In Easton’s (1957) considerations, political socialization is linked to the mechanisms of shaping support for a specific political system. It is the culmination of creating and granting legitimacy to the political system. Easton stresses that demands put forward toward a given political system are not sufficient to keep it operating. Specific attitudes and measures supporting the political system are indispensable. Easton distinguishes two categories of this support. The first entails direct support for actors involved in decision-making processes and is manifested inter alia in voting or participation in political campaigns. The second category is connected with indirect or passive support and it assumes the form of, inter alia, paying taxes, doing military service or abiding by the legal regulations in force. In order to persuade members of society to support the political system, a system of strengthening ties between members of the system, by means of positive incentives (rewards) and negative ones (sanctions), is used. As new members of society grow up, others transmit diverse social goals and norms through the system of incentives. 
In this way, individual members of society learn different social roles. This is especially important for what the society perceives as desirable in social life.
What is essential for political systems to function properly is that their members have a certain set of shared expectations regarding the criteria for political assessments. Easton emphasizes that learning does not end at some special moment in life. It is important that we have a system of rewards and punishments in order to adopt proper political attitudes and appropriate behaviors. An essential role in this process is played by myths, doctrines, philosophies and ideologies conveying a specific interpretation of social and political goals and norms.
Global/regional differentiation
Socialization is a result of learning consciously through upbringing and unconsciously by imitation. In the literature, there is a distinction between primary and secondary socialization. Primary socialization is defined as the first socialization the child encounters, and through which s/he becomes a member of society. At this stage, purely cognitive learning takes place in the conditions of emotional identification with significant others. In the process of primary socialization the child adopts the roles and attitudes of those significant others, internalizes them and identifies with them. According to Berger and Luckmann, the molding of the child’s identity means being assigned to a specific place in the world. The two scholars also emphasize that the child actually has no choice: s/he is placed in a compulsory situation, and the world internalized in the process of primary socialization is, therefore, far more rooted in human consciousness than the worlds that are internalized in the processes of secondary socialization. Of significance for the process of primary socialization is what culture the child grows in, what system of norms and values is inculcated into him/her, and what roles s/he is prepared for. Berger and Luckmann believe that although the crowning point of the process of primary
socialization is the consolidation of the concept of generalized other in a person’s consciousness, the socialization process does not end at this point because socialization is never complete and finished. Secondary socialization is the process of internalization of different institutional worlds and of learning roles connected with participating in these worlds. Importantly enough, there may be rivalry between representatives of various institutions that define the reality. This is effected through specialized agencies. In the process of primary socialization, influence on an individual is exerted by the parents and the child’s closest environment, teaching the individual mainly basic behaviors in a small social group, which is the family. Secondary socialization – essential from the social and cultural standpoint – begins when the subject broadens the scope of his/her social and cultural relations, when s/he has already undergone primary socialization. School, peer groups, the Church, the workplace and the media are some of those subjects/agents that exert an impact in the socialization process. The secondary socialization process often involves the comparison and revision of earlier views, ideas or values. During the socialization process an individual acquires the competencies that are fundamental to social interactions, including language and symbols, norms and behavioral patterns, values and the ability to use objects.
Socialization processes take place mainly in institutions that have developed over the years and guarantee the fulfillment of functions they are required to perform. Research into socialization distinguishes two groups of institutions (agencies): those in which socialization has been socially planned and organized, and those in which socialization occurs along with the execution of other tasks. Almond and Powell divided socialization into manifest or direct and latent or indirect, as well as into consistent and inconsistent aspects. Manifest socialization consists in information transmission and suggesting the values and assessments of political
Manifest socialization consists in information transmission and suggesting the values and assessments of political
phenomena. Civics courses in schools can serve as an example here. Latent socialization consists in transmitting non-political patterns, which essentially modify attitudes toward the elements of a political system. Consistent socialization takes place when all elements influencing an individual are consistent and are not in conflict with one another. We are dealing with inconsistent socialization when these elements are in conflict with one another, and when the systems of transmitted norms and values are inconsistent.
Family is treated as the basic structure of socialization. Here the child is introduced to social life, and attitudes toward authority are shaped. The decisions taken in the family are authoritative to the child. The inclusion of the child in the decision-making process may develop his/her ability to function in a social group and the feeling of his/her own competencies, and as a result, exert an influence on his/her activity in the political sphere. Studies show that the general family attitude toward the political system is also significant. For example, in the United States, research into party membership transmitted in the family shows that children tend to share their parents’ preferences regarding the choice between Republicans and Democrats (Key, 1961). Studies by LeVine (1960) reveal the relationship between personality patterns that are a consequence of the way of bringing up children and political behaviors as exemplified by two preliterate African tribes – the Gusii and the Nuer. In the former, children are entirely subordinated to adults, while in the latter there is high egalitarianism and openness in the child–adult relationships. The upbringing in the Gusii tribe translates into a high level of submissiveness to the British colonial authorities. Another example is the results of studies by Robert D.
Hess and David Easton (1960) on the identification by the child of the figure of the leader – in this case the US president – with the figure of the father. The scholars point to the evolution of family as an institution.
The traditional family, usually multigenerational, provided role models: the father as the breadwinner for the family, the mother as a housewife. In the process of evolution, various factors exerted an influence on the development of alternative patterns of this agency. We can refer here to the functioning of incomplete families, common-law marriage, patchwork families, occupational activity of women and changes in the roles traditionally performed by men and women. By comparison with family, school is a system of formalized roles. In accordance with Parsons’ conception, the socialization process is treated here as the internalization of roles, involving the constant assessment of students’ achievements. The assessment process leads to internal divisions among students. They learn how a status is achieved and how it is defended. The task of the school is to prepare students to function within the hierarchical social system. Teachers are expected to make the right assessment of individual achievements. The category of role, created by Parsons, is of crucial importance here. Relationships between teachers and students are not merely a pedagogical relationship, but a normalized activity between specific roles. School provides general and particular knowledge on many spheres of life, including the functioning of political systems. Owing to this, it can strengthen the awareness of specific values and facts (including historical facts) and loyalty to a specific political system, and impart basic symbols of emotional response to the system. Among socialization agencies, school is largely controlled by the political system, subjecting children to socialization in accordance with the mainstream of political life. Peer groups play a substantial role in shaping the system of values and political orientations, especially in the situation when the role of the family is weakening. 
The role of peer groups, both formal and informal, enables the development of attitudes of hostility and aggression, ability to collaborate within different social entities
or participation in the decision-making process (Kruger, 1993). Emphasis is also laid on peer collaborative learning. A special role is played in friendly peer groups by giving an individual information about him/herself and other people, providing emotional support in difficult life situations or forming a pattern of future social relations. Two types of relations between people are distinguished here: vertical and horizontal. The former refers to relations between the child and someone who has greater knowledge and authority than the child. The latter type involves contact with persons who have knowledge and authority similar to the child’s. This type of relationship is generally characteristic of contacts with peers, the partners essentially having a similar status. The studies demonstrate that children spend more time with other children in their first years of life. The research also shows that children feel closely attached to persons who are important in their life: members of the close family and persons regarded as friends. Mass media play an increasingly important role in socialization processes. This results mainly from the development of new means of mass communication. The media not only report political events but also interpret them. Certain events may be publicized while others may be ignored. A controlled system of mass media constitutes an important force in shaping political convictions. As early as the 1960s, Fagen (1966) stressed that media are often a dominant instrument of the socialization of members of particular societies. Since that time the modern means of communication have developed further. In the context of the above-mentioned socialization agencies, the question arises whether a failure in the process of socialization is possible. Berger and Luckmann (1966) use the concepts of successful and unsuccessful socialization.
Successful socialization takes place when there is a high degree of correspondence between objective and subjective reality. Conversely, when there is no correspondence between objective and subjective reality, then
we speak of unsuccessful socialization. In practice, both these forms of socialization are rather unlikely: the first because of the diversity of human features and complexity of social structures, the second because of various forms of backwardness. Problems also arise when, during the process of primary socialization, the child receives a message that comprises mutually exclusive worlds. Similar problems occur in a situation of incompatibility between primary and secondary socialization. Research on the processes of political socialization is undertaken by researchers from various parts of the world, although that conducted by Western scholars prevails. Research conducted by researchers from India focuses primarily on the social specificity of this state and its cultural conditions, especially those characteristic of the religion of Hinduism. The situation is similar in African and Latin American countries. Researchers emphasize the conditions associated with the construction of permanent democratic structures.
Major advances
Many empirical studies have been undertaken that aimed to verify formulated scientific hypotheses and theories and to gain insight into the process of political socialization. Several lines of inquiry addressing the question of how an individual’s political beliefs and attitudes are connected with his/her life can be distinguished in the literature (Skarżyńska, 2005). According to the assumptions of the first trend, scientists underline the importance of an individual’s early experiences from his/her childhood. Here, studies on the subject include those on the formation of ethnic and racial identity; on attitudes toward foreigners or, in broader terms, toward strangers; on supporting various political systems; on identification with a definite political party
or on the expressed support for a given type of political system. In this research current, it is assumed that identification with one’s own race or ethnic group and the consequent attitude toward foreigners develops between the ages of 7 and 13. Studies have revealed that Afro-American children generally show a preference for the white race as the one which enjoys higher status in the social structure (Annis and Corenblum, 1987). It follows from these investigations that children from minority groups prefer majority groups over their own because they want the power and material resources characteristic of higher status groups (Spencer, 1984; Cross, 1991). The family and environment in which a child is raised influence such behaviors. In reference to multicultural societies, it is emphasized that an individual does not have to identify either with his/her minority group or with the dominant group (Bernal and Knight, 1993). Scholars point to the fact that an individual may simultaneously identify with both groups. Ethnic identification developed in childhood influences political behaviors during adulthood. This is especially the case as far as identification with a particular political party is concerned (Campbell et al., 1960). Studies on the political socialization of three successive generations at various stages of their lives showed that the transmission of party identification differs in particular families. Transmitting one’s own preferences to children has greater chances of success in those families that are politically more involved and talk about politics, and where their attitudes are generally stable in this regard. It was shown that, regardless of cultural background, a greater correlation between parents’ and children’s political attitudes occurred in a situation when parents enjoyed the love and respect of their children and when they exercised mild control and repression in the educational process (Skarżyńska, 2005). 
Conversely, as the result of strict upbringing without love, studies point to a positive attitude toward authorities, the use of force
or reluctance regarding diversity (Greenstein, 1975). Attitude toward political leaders seems to be connected with the position of a family in the social and economic structure (Greenstein, 1975; Greenberg, 2017). The investigations highlight the importance in the process of political socialization of such elements as the position of the father in a family, inclusion of children in the process of making decisions and parents’ education. These variables largely shape the political attitudes of children. For example, Barbara Frątczak-Rudnicka (1990) studied the political attitudes of adolescents and their parents during the growing political and social conflict in Poland in 1981, before martial law was imposed. The studies showed that congruity between the political attitudes of teenagers and their parents was not very high at that time. Teenagers critically assessed the political system more often than their parents. In addition to family influence, the political attitudes of children are shaped by the above-mentioned agencies: school, peer groups, media. Detailed studies show a variety of effects of those factors. For example, in Poland, social confidence among young people is determined differently in towns and in the countryside. In towns it is a derivative of such factors as freedom of discussion in schools, importance attached to joint actions, the level of educational aspirations, mother’s education and reading of the press, whereas in the countryside significant factors consist, essentially, in freedom of discussion in school and in parents’ discussions on politics with children (Dolata and Frączek, 2002). However, the comparative studies showed that teenage boys are generally more interested in discussions on political subjects and have greater knowledge on these subjects than teenage girls, who identify with democratic values to a greater extent (Torney et al., 1975). 
With regard to the stability of political attitudes and behaviors learned in childhood, three approaches prevail: studies on the consequences of the change of social environment; cross-sectional surveys; and surveys
encompassing resistance against the change of one’s expressed attitude (Sears and Brown, 2013; Sears and Levy, 2003). The studies conducted as part of these three research approaches confirm the stability of political attitudes developed in childhood. Among researchers of this approach there is general agreement that political beliefs and attitudes are formed in childhood and are resistant to external influences in an individual’s later years. For example, Sears (1975) emphasizes that a child is open to the influence of other people, and finds it easy to learn and repeat the learned opinions. The concept of symbolic politics, proposed by Sears, is aimed at analyzing the process of development of political attitudes. Sears concludes that children learn specific concepts by means of positive or negative associations. Affective stimuli are of utmost importance during the process of acquiring symbolic politics. It is important, inter alia, what message is imparted by people significant for the child. The frequency and regularity of repetition of definite political attitudes and political behaviors appears to be an important factor that strengthens their stability. According to the assumptions of the second current, political beliefs and attitudes are not permanently formed during childhood, but are developed during an individual’s whole life and undergo changes in its various stages (Skarżyńska, 2005). We may find here a reference to Jean Piaget’s theory of cognitive and moral development. Political attitudes and beliefs in this interpretation are the consequence of the development of an individual’s cognitive process and may change under the influence of internal or external stimuli. The theory highlights, among others, the correlations between the level of cognitive and moral development and the way of thinking about justice. 
As the studies demonstrate, changes in the professional or economic situation of an individual’s family may impact his/her attitude to specific political problems or to the assessment of a given politician (Fiorina, 1981). Earlier studies showed a correlation between the age of white US citizens
and a racist attitude. Today, such correlations are not observable (Schuman et al., 1998). In contrast, the variable that shows stability is the low level of participation among the youngest voters. This tendency is characteristic of various parts of the world. Apparently, the change of convictions and attitudes in the subsequent stages of life is related to the change of life situation; inter alia, to migration, change of school or work and participation in social movements (Sigel, 1965; Sears and Brown, 2013; Sears and Levy, 2003). According to the assumptions of the third current, scientists emphasize that political convictions and attitudes are sometimes determined by the times in which an individual lives and by experiences in which s/he participates (Skarżyńska, 2005). Therefore, it is important whether an individual lives in times of stability in the social or political system, or whether s/he lives in a period of changes and crises. For example, successive generations in the United States were analyzed: people born before 1920 and those who took part in elections for the first time in the 1920s; people born after World War II who participated in activities of counterculture movements of the 1960s; and the generation of their children. According to those studies, the first generation was the most conservative, while the subsequent generations were more democratic. Particular attention should be paid to the generation of rebellion of the 1960s, because it is characterized by an invariably liberal attitude. With regard to the developed Western countries, the studies show a decrease in commitment to political matters and lower participation in elections by younger age groups (Pellegrini and Hilton, 2012). Scholars also highlight a decrease in social capital in successive generations. 
They emphasize the importance in this process of such events as the war in Vietnam, the Watergate scandal and the Bill Clinton sex scandal, which shape attitudes of distrust toward politics and politicians (Tolley, 1973). Additionally, emphasis should be laid on the emergence of new social, political or economic processes, which relate to the entire
globe on an unprecedented scale. The studies conducted in this dimension concern the issues of values perceived by young people as important indicators of their identity, development of patriotic attitudes and the readiness of young people to engage in civic activity (Amnå et al., 2009). According to the Center for Public Opinion Research, in Poland in 2015, the highest percentage of respondents aged 18 to 24 described their level of interest in politics as ‘medium’, consisting in ‘following only main events’ (Kazanecki, 2015). Nearly half of the respondents declared ‘negligible’ or ‘no’ interest in political matters. Politics is a quite important sphere of life only to a small percentage of young Poles (Czakon, 2016). Polish young people’s indifferent attitude to politics is also expressed in the level of participation in parliamentary elections. In the Sejm and Senate elections in 1997, 2001, 2005 and 2007, those in the age groups 18–25 and 26–35 were characterized by the lowest election attendance (Czakon, 2016). The lack of concrete political views among the majority of young people is also noticeable. This situation may be interpreted at least in two ways: as an expression of the lack of political competencies connected with not knowing the referents of such concepts as left/right wing and center, or as a manifestation of an aversion to the world of politics (Czakon, 2016).
Perspectives
The literature on the subject stresses that earlier studies on political socialization often used improper data or research methods (Neundorf and Smets, 2017). The future appears to belong to multistage panel studies, which allow researchers to study the stability or variability of political preferences in the same persons over a longer period of time. What appears to be significant in this context are internet panel studies. There are still too few studies on the relationship
between genetics and environmental determinants and their interaction during the entire lifetime. Emphasis is also laid on the necessity to take account of the macroeconomic system and social factors in investigations into intergenerational changes in political socialization. Among the main challenges to political socialization are processes of immigration and globalization. Individual socialization agencies require further in-depth studies, including, first of all, the most important ones: the family, school, peer group and mass media. Modern technologies have a great impact on the process of political socialization (Owen, 2008; Kudrnáč, 2015). We may speak about a revolution in this sphere. Mutual communication between people, especially young ones, is now carried out to a greater extent through modern technological means than through face-to-face contact. Text messaging, which often replaces telephone calls, makes it possible to send many messages. New communication technologies allow people, especially young ones, to form their own groups that meet and act in the virtual world. Social media are used to generate many virtual organizations. Young people develop their political identity online. They use social media such as Facebook and video streaming media such as YouTube to generate and disseminate messages through networks of friends. For that reason, studies on the influence of new media on the process of political socialization will be of considerable importance.
References
Almond, Gabriel A. and Powell, G. B. (1978) Comparative Politics: System, Process, and Policy. Boston: Little, Brown.
Almond, Gabriel A. and Verba, Sidney (1965) The Civic Culture: Political Attitudes and Democracy in Five Nations. Boston: Little, Brown and Company.
Amnå, Erik, Ekström, Mats, Kerr, Margaret and Stattin, Håkan (2009) ‘Political Socialization and Human Agency: The Development of Civic Engagement from Adolescence to Adulthood’, Statsvetenskaplig Tidskrift, 111(1): 27–40.
Annis, Robert C. and Corenblum, Barry (1987) ‘Effect of Test Language and Experimenter Race on Canadian Indian Children’s Racial and Self-Identity’, The Journal of Social Psychology, 126(6): 761–73.
Berger, Peter L. and Luckmann, Thomas (1966) The Social Construction of Reality: A Treatise in the Sociology of Knowledge. Garden City, NY: Doubleday.
Bernal, Martha E. and Knight, George P. (eds) (1993) Ethnic Identity: Formation and Transmission among Hispanics and Other Minorities. Albany: State University of New York Press.
Campbell, Angus, Converse, Philip E., Miller, Warren E. and Stokes, Donald E. (1960) The American Voter. New York: Wiley.
Cross, William (1991) Shades of Black: Diversity in African-American Identity. Philadelphia: Temple University Press.
Czakon, Piotr (2016) ‘Zaangażowani czy obojętni? Aktywność społeczna i polityczna młodych Polaków’, Zeszyty Naukowe Politechniki Śląskiej: Ś, 95: 75–87.
Dawson, Richard E. and Prewitt, Kenneth (1969) Political Socialization. Boston: Little, Brown and Company.
Dekker, Henk (2014) ‘Advances in Political Socialization: Theory and Research’, Politics, Culture and Socialization, 5(2): 217–23.
Dennis, Jack (ed.) (1973) Socialization to Politics: A Reader. New York: Wiley.
Dolata, Roman and Frączek, Adam (2002) ‘Wzorce i korelaty zaufania politycznego polskich nastolatków (w międzynarodowej perspektywie porównawczej)’, Ruch Pedagogiczny, 1–2: 48–69.
Easton, David (1957) ‘An Approach to the Analysis of Political Systems’, World Politics, 9(3): 383–400.
Easton, David and Dennis, Jack (1969) Children in the Political System: Origins of Political Legitimacy. New York: McGraw-Hill.
Fagen, Richard R. (1966) Politics and Communication. Boston: Little, Brown.
Fiorina, Morris P. (1981) Retrospective Voting in American National Elections. New Haven and London: Yale University Press.
Frątczak-Rudnicka, Barbara (1990) Socjalizacja polityczna w rodzinie w warunkach kryzysu. Warszawa: Wydawnictwa Uniwersytetu Warszawskiego.
Greenberg, Edward S. (2017) ‘Consensus and Dissent: Trends in Political Socialization Research’, in Edward S. Greenberg (ed.), Political Socialization. 1–18, London and New York: Routledge.
Greenstein, Fred I. (1968) ‘Political Socialization’, in International Encyclopedia of the Social Sciences vol. 14. New York: Macmillan and The Free Press.
Greenstein, Fred I. (1975) Personality and Politics: Problems of Evidence, Inference, and Conceptualization. New York: Norton & Company.
Hess, Robert D. and Easton, David (1960) ‘The Child’s Changing Image of the President’, Public Opinion Quarterly, 24(4): 632–44.
Hess, Robert D. and Easton, David (1962) ‘The Role of the Elementary School in Political Socialization’, The School Review, 70(3): 257–65.
Hyman, Herbert H. (1959) Political Socialization: A Study in the Psychology of Political Behaviour. New York: The Free Press.
Ichilov, Orit (ed.) (1990) Political Socialization, Citizenship Education and Democracy. New York: Teachers College Press.
Kazanecki, Wojciech (2015) Zainteresowanie polityką i poglądy polityczne w latach 1989–2015. Deklaracje ludzi młodych na tle ogółu badanych. Warszawa: CBOS.
Key, Valdimer Orlando Jr (1961) Public Opinion and American Democracy. New York: Knopf.
Kruger, Ann Cale (1993) ‘Peer Collaboration: Conflict, Cooperation, or Both?’, Social Development, 2(3): 165–82.
Kudrnáč, Aleš (2015) ‘Theoretical Perspectives and Methodological Approaches in Political Socialization Research’, Sociológia, 47(6): 605–24.
Langton, Kenneth P. (1969) Political Socialization. New York: Oxford University Press.
LeVine, Robert A. (1960) ‘The Internalization of Political Values in Stateless Societies’, Human Organization, 19(2): 51–8.
Mead, Margaret (1978) Culture and Commitment: A Study of the Generation Gap. Garden City, NY: Doubleday.
Neundorf, Anja and Smets, Kaat (2017) Political Socialization and the Making of Citizens. Oxford Handbooks Online. http://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199935307.001.0001/oxfordhb-9780199935307-e-98
Owen, Diana (2008) ‘Political Socialization in the 21st Century. Recommendations for Researchers’. https://www.researchgate.net/publication/228450234_Political_socialization_in_the_twenty-first_century_Recommendations_for_researchers
Patrick, John J. (1977) ‘Political Socialization and Political Education in Schools’, in S. A. Renshon (ed.), Handbook of Political Socialization. 190–222, New York: Free Press.
Pellegrini, James W. and Hilton, Margaret L. (eds) (2012) Education for Life and Work: Developing Transferable Knowledge and Skills in the 21st Century. Washington: The National Academies Press.
Renshon, Stanley Allen (1992) ‘Political Socialization’, in Mary Hawkesworth and Maurice Kogan (eds), Encyclopedia of Government and Politics, vol. 1. London and New York: Routledge.
Schuman, Howard, Steeh, Charlotte, Bobo, Lawrence D. and Krysan, Maria (1998) Racial Attitudes in America: Trends and Interpretations. Revised Edition. Cambridge: Harvard University Press.
Sears, David O. and Brown, Christia (2013) ‘Childhood and Adult Political Development’, in Leonie Huddy, David O. Sears and Jack S. Levy (eds), The Oxford Handbook of Political Psychology. 60–109, Oxford: Oxford University Press.
Sears, David O. and Levy, Sheri (2003) ‘Childhood and Adult Political Development’, in David Sears, Leonie Huddy and Robert Jervis (eds), Political Psychology. Oxford: Oxford University Press.
Sears, David O. (1975) ‘Political Socialization’, in Fred I. Greenstein and Nelson W. Polsby (eds), Handbook of Political Science Vol 2: Micropolitical Theory. 127–53, Reading, MA: Addison Wesley.
Sigel, Roberta S. (1965) ‘Assumptions about the Learning of Political Values’, Annals of the American Academy of Political and Social Science, 361(1): 1–9.
Skarżyńska, Krystyna (2005) Człowiek a polityka. Zarys psychologii politycznej. Warszawa: Wydawnictwo Naukowe Scholar.
Spencer, Margaret Beale (1984) ‘Black Children’s Race Awareness, Racial Attitudes and Self-Concept: A Reinterpretation’, Journal of Child Psychology and Psychiatry and Allied Disciplines, 25(3): 433–41.
Sztompka, Piotr (2002) Socjologia. Analiza społeczeństwa. Kraków: Wydawnictwo Znak.
Tolley, Howard B. (1973) Children and War: Political Socialization to International Conflict. New York: Teachers College Press.
Torney, Judith Vollmar, Oppenheim, Abraham Naftali and Farnen, Russell F. (1975) Civic Education in Ten Countries: An Empirical Study. New York: Wiley.
39
Social Movements
Donatella della Porta
Social movement studies: A short history of the subject
Social movement studies have grown especially since the 1970s, distancing themselves from previous research on collective behavior, on the one hand, and the labor movement, on the other. As for the former, especially in the United States, unconventional forms of political participation had been distinguished from ‘normal politics’. Within so-called mass politics, protest had been seen as an abnormal phenomenon: non-strategic, emotion-driven, often irrational. Similar to crowds, social movements were considered as populated by frustrated individuals, characterized by weak personalities and anomic behavior. With different accents, functionalist theorists singled out social movements as a symptom of systemic disequilibria and scholars of collective behavior as carriers of emergent norms. Especially in Europe, social movement scholars differentiated their subject of studies from the labor movement, often
addressed through Marxist approaches to new conflicts. While Marxist analyses had traditionally assessed the centrality of capital–labor conflicts, the transformations that followed World War II gave more weight not only to social characteristics, such as gender or generation, that did not overlap with class, but also to post-materialist concerns. Once a marginal area of concern in the social sciences, since the 1970s social movement studies has developed into one of the main fields of sociology and political science, also acquiring a significant space in disciplines such as anthropology, geography, history, psychology and philosophy, among others. As the field of study consolidated, a toolkit of concepts and middle-range theories was established as particularly useful to address the types of social movements that developed outside of the traditional social and political cleavages. While focusing on environmental, women’s or peace movements, social movement studies more rarely addressed labor or peasants’ movements, and ethnic or religious
movements. Especially concerned with social movements within established democracies, they devoted little attention to forms of resistance in authoritarian regimes, anticolonial movements in the periphery or poor people’s movements at the core. And, mainly interested in progressive social movements, they only occasionally addressed reactionary or conservative social movements. The consolidation of social movement studies as an established field was helped by the growth of specialized journals (such as Research in Social Movements: Conflicts and Change, Social Movement Studies, Mobilization: An International Quarterly, Contention, Partecipazione e conflitto and Forschungsjournal Soziale Bewegungen), book series, sections in main professional associations, research departments and PhD programs. A large number of introductory textbooks, handbooks and even encyclopedias on social movements have been published (Snow et al., 2004; della Porta and Diani, 2006; Tilly and Tarrow, 2007; Cefaï, 2007; Crossley, 2002; Goodwin and Jasper, 2009; Snow, della Porta, Klandermans and McAdam, 2013; della Porta and Diani, 2016; Fillieule and Accornero, 2016; Snow, Soule, Kriesi and McCammon, 2018), as well as methodological texts (the most recent is della Porta, 2014a). While contributing to the accumulation of knowledge, the institutionalization of social movement studies as a field also produced the risk of closure to new areas of research and innovative interpretations. Countering this risk, as each new wave of protests contributed to increasing attention to social movements, with new generations of scholars joining the field, attempts to innovate the established toolkit multiplied, bridging the study of social movements with that of other forms of contentious politics, considering the broad arenas in which movements are embedded, bringing back attention to the social bases of broad societal conflicts and taking into account dynamics triggered by eventful protests. 
There is also a growing literature on social movements in the global South (e.g. on
Latin America, Rossi and von Bülow, 2015; on the MENA region, Beinin and Vairel, 2019). While concepts developed in social movement studies have been mainly applied to progressive movements, they proved helpful also for understanding radical right-wing movements (see Caiani et al., 2012). What follows will present, first, the consolidated conceptual toolkit in social movement studies, and second, the innovative turns just mentioned.
Basic theories and concepts in social movement studies
Social movements are (1) mostly informal networks of interaction, based on (2) shared beliefs and solidarity, mobilized around (3) contentious themes through (4) the frequent use of various forms of protest (della Porta and Diani, 2006). They are made of networks of informal relations between a plurality of individuals and groups, which may be more or less structured. If political parties and interest groups have relatively well-defined organizational boundaries, often signaled by statutes and card-carrying members, social movements are instead made of scattered and weakly connected networks of individuals and groupings. They are not organizations but nets of relations between diverse actors, which often also include organizations with formal structures. Networks of relations constitute a social movement when they converge around a system of beliefs that nourishes reciprocal solidarity and a sense of collective identification. Social movements elaborate alternative norms and visions often promoting social change. Emerging norms are at the basis of conflicts around which actors mobilize. Finally, social movements are characterized by their adoption of disruptive repertoires of political participation relying, in order to exert pressure on decision-making, especially on repertoires of protest, defined as a nonconventional form of action that interrupts
daily routine in order to attract the attention of the public and influence elites. In the 1970s, social movement studies grew upon two considerations, basically summarized by the concepts of new social movements, particularly widespread in Europe, and resource mobilization, coined in the United States. In Europe, the students’, women’s and environmental movements were considered as examples of new social movements that were to substitute the institutionalized labor movement. Attention to emerging collective identities then supplanted the concern with social classes. In particular, since the 1970s, new social movements were defined as actors of the new conflict at the core of post-industrial or programmed societies. Scholars of new movements, such as Alain Touraine (1977), highlighted the declining relevance of the conflict between capital and labor and the parallel growing importance of conflicts around knowledge. According to Alberto Melucci (1996), in complex societies, which require increasing integration and extend their control over even the motivations for action, new social movements oppose the state and the market’s penetration into everyday social life. Therefore, activists reclaim individual identity and the right to choose one’s private life against the manipulation of the system. At the same time, in the United States, studies on the civil rights and student movements highlighted that their activities, rather than being irrational or anomic, were guided by strategic consideration and shared norms. The main explanatory dimensions were not grievances, but rather resources mobilizable by social movement organizations. Research on social movement organizations carried out by scholars such as John McCarthy and Mayer Zald (1987) underlined in particular the role of organizational entrepreneurs in the mobilization of resources within complex organizational fields. 
Considering social movements as a normal part of the political process and social movement organizations as strategic actors, linked to the resource
mobilization approach, the political process approach analyzed the sets of political institutions and alliances that might facilitate the mobilization of resources necessary for collective action. While conflicts are considered as always present, collective action emerges only when material and symbolic resources allow discontent to be transformed into action. The type of available resources explains the strategic choices of movements and the consequences of collective action for the social and political system, as solidarity links can sometimes compensate for the absence of material resources. Social movement studies between the 1970s and the following decade focused mainly on macro-level political opportunities for protest, organizational forms and strategies and the building of collective identities. Since the 1990s, this structural bias has been challenged by a renewed attention to various cultural aspects as well as to the causal mechanisms that intervene between structure and action in a field redefined as contentious politics and covering social movements as well as revolutions, democratization and other contentious phenomena. In this development, social movement studies contributed to several subfields in political science (della Porta and Diani, 2016).
Unconventional Political Participation
First of all, social movement studies have enriched research on political participation, broadening attention to the rapidly growing repertoire of non-conventional forms (such as petitions, sit-ins, boycotts, the occupation of buildings, blocking traffic), and the different styles of participation in different countries, social groups and generations. As Samuel Barnes, Max Kaase and others (1979) observed in the 1970s, ever wider groups of citizens were ready to resort to forms of action characterized by their nonconventionality in order to oppose laws and
public decisions seen as unjust or illegitimate. Typologies developed in this period were based on the individual propensity to use conventional or unconventional forms of action, or a combination of the two. In particular, as confirmed by later research (e.g. Norris, 2002; Dalton, 2004), the increase in non-conventional participation was not to be considered as an indicator of the decline of the legitimacy of liberal democracies, but rather as an expression of a lasting enlargement of the strategies available to citizens, deriving from the growth in political competence, in particular among young people. Later comparative research has revealed that while levels of traditional types of political participation have remained stable or (in some forms) declined in the 1990s and the new millennium, levels of non-institutional participation have greatly increased while differences in levels of participation linked to gender, age and educational level have lessened, leading some to speak of a ‘participatory revolution’.
Historical Evolution of Protest Repertoire
Moving from the micro to the macro level, social movement studies have contributed important knowledge on the historical evolution in repertoires of protest. According to the influential work of Charles Tilly (1978), the formation of the nation-state, the development of capitalism and the emergence of modern means of communication brought about the shift to a modern repertoire of collective action, formed by the set of means a group has at its disposal for making collective claims. Previously, protests were parochial in scope, addressing local actors or the local representatives of national actors, and relied on patronage, appealing to local power holders to convey grievances or settle disputes. In the 19th century, the emerging repertoire of protest was instead national and autonomous, utilizing forms such as strikes,
electoral rallies, public meetings, petitions, marches, insurrections and the invasion of legislative bodies. The new repertoire responded to a situation in which politics had become increasingly national, the role of communities had diminished and organized associations had spread instead.
Organizational Networks
Differing from ‘mass society’ theories, which had linked unconventional political behavior to the uprooting of individuals from primary groups and the social disaggregation stemming from modernization, the resource mobilization approach explains mobilization through the moral gratification intrinsic to the pursuit of a collective good and the existence of solidarity links. The intensity of relations within broad formal and informal networks in which social movement activists are embedded impacts upon the construction of collective identities. Through organizational networks, activists develop an alternative vision of the world, acquire information and skills necessary for collective action and develop norms of reciprocity. The density of the networks helps to explain different capacities for participation in various environments and among different social groups as their mobilization is influenced by their level of catnet-ness, a synthesis of characteristics linked to social categories and the density of social networks (Tilly, 1978). The development from a category (an aggregation of individuals who share specific characteristics) to a social group (defined as a community capable of collective action) requires a combination of categorical traits and networks of relations that link the subjects that share those traits. So, for the labor movement, the presence of both large numbers of workers that carried out similar tasks and socially homogeneous networks with intense social relations favored the choice to cooperate. Collective action then fueled the awareness of holding common interests – what Karl
Marx had defined as class consciousness – and the socialist ideology provided for a broad world-view. Besides the labor movement, modernization has strengthened organizational capacities for collective action as technological developments (especially those related to the expansion of means of communication) have increased the quantity of resources available. Furthermore, economic progress had a positive effect on the associative capacities of individuals by augmenting the quantity of resources available for collective action, thanks to increasing education levels but also growth in spare time. As these conditions tended to disappear, social movements needed to find different resources for their mobilization as well as new solidarities and means of communication.
Collective Identities
Social movement studies have also contributed to the analysis of collective identification as an important basis for political action. In general, politics refers to solidarity systems, in which the very definition of interests is rooted, as interests are recognizable only with reference to a certain value system. Identification with larger groups – the awareness of belonging to a collective ‘we’ – is what triggers a sense of belonging to sympathetic collectives inside which individuals consider one another as equals. The construction of a shared identity is a precondition for collective action, but also a product thereof, as participation in protests transforms individual identities through processes of collective identification (Karolewski, Chapter 31, this Handbook). While protest aims at influencing public decisions, it also has internal effects by creating solidarity between participants, making them feel part of a collective effort (della Porta, 2017). Collective action requires that participants elaborate a definition of themselves, of other social actors and of the content of the relations that connect them. Through framing
activities (Snow, 2013), a shared ‘us’ with whom to sympathize is singled out, as is a ‘them’ to whom the blame for negative conditions is attributed. Collective identification thus requires a positive definition of who is part of a certain group and a negative definition of who is excluded. The identities of emerging collective actors are to be recognized by external actors, so that recognition becomes part and parcel of identification itself. Collective action itself contributes to the building and consolidation of collective identity through the definition of boundaries between the actors involved in a conflict.
Comparing Political Opportunities
Focusing especially on liberal democracies, social movement studies have addressed the relations between institutional political actors and protest, contributing to the broad field of comparative politics. The widely used concept of political opportunity structure singles out some main contextual characteristics that influence social movements. As challengers in a given political system, social movements interact with the political incumbents, which have a consolidated position. Political opportunities affect both the form taken by that collective action and its probability of success. In particular, in his study of protest cycles in Italy, Sidney Tarrow (1998 [1994]) suggested that the degree of openness or closedness of formal political access, the degree of stability or instability of political alignments, the availability and strategic postures of potential allies and political conflicts between and within elites are the main dimensions of political opportunities. Given increasing interest in social movements among political scientists, European scholars in particular (della Porta, 1995, 2013a) have used the concept of political opportunities in cross-country comparative analyses of the development of protests. A main contribution from this research is that different types of democracies, based on different institutional
assets and cultural traditions, affect social movements, their extent and the forms and degree of their success. In general, an institutional system is more open (and less repressive) the more political decisions are dispersed (through functional differentiation of power, territorial decentralization and direct democracy). Formal institutions do interact with informal strategies to deal with opponents, within either inclusive or exclusive historical traditions that tend to reproduce themselves. Beyond institutional opportunities, political allies such as trade unions, religious institutions and political parties are also relevant for social movements. Dimensions that are susceptible to change in the short term include aspects such as electoral instability and elite divisions. Research has indicated that the more open political opportunities are, the more protests spread, but mainly in a peaceful form. Moreover, the greater the number of actors who share political power (the greater, therefore, the checks and balances within the system) and the more inclusive the strategies to deal with opponents, the greater the chance that social movements will gain access to the decision makers. A weak executive may ease access for outsiders to the decision-making process, but will have more difficulties in implementing its own decisions.
Impacting Policy
A growing number of analyses have recently addressed the effects of social movements on the policy process. In one of the first reflections on the topic, William Gamson (1990 [1975]) identified among the factors contributing to success a minimalist strategy (‘thinking small’), direct action and a centralized organization. Other scholars disagreed, however, pointing at the advantage of broad ideologies and utopian thinking, disruptive forms of protest and horizontal and loose networks (Piven and Cloward, 1977). In general, no particular strategy can be assessed
without taking into account the available resources and opportunities, as social movements are, as mentioned, part of alliances including political parties and, sometimes, even public agencies. Research has indicated that, under some circumstances, social movements may have relevant effects on different stages of public decision-making processes. Generally, social movements emerge to express dissatisfaction with an existing policy in a given area. Specific policy claims are relevant for the very self-definition of a social movement, symbolizing the movement’s identity. While non-negotiable demands are particularly important in the construction of collective identities, social movements also articulate broad requests for reforms. They have been considered as particularly important for the development of new ideas and norms, which emerge within critical communities and are then spread via mass mobilization, helping to thematize new issues in the public debate. Social movement outcomes can be evaluated by looking at the various phases of the decision-making process: the emergence of new issues; the proposal and implementation of new legislation; outcomes of public policies in alleviating the conditions of those mobilized by collective action. Levels of responsiveness to collective demands within the political system include the availability of the authorities to listen to the protestors, their willingness to put an issue on the agenda and the adoption and implementation of specific policies. While research on social movements initially concentrated on the production of legislation, it later also addressed the implementation of decisions as well as cultural transformations. 
Not limiting their interventions to single policies, social movements challenge the ways in which political institutions are organized, demanding decentralization of political power, consultation of interested citizens on particular decisions and appeals procedures against decisions by the public administration. Interacting with the public
administrations, they ask to testify before representative institutions and the judiciary, to be listened to as counter experts, to receive legal recognition and material support. Social movements certainly increase the possibilities for access to the political system, through ad hoc channels relating to specific issues or institutions that are open to all noninstitutional actors. In this way, social movements have changed political culture, with its norms that define the issues and means of action that are politically legitimate. They have increased the acceptance of once nonconventional – or even illegal – repertoires of collective action. They have contributed to the creation of new policy arenas on issues of concern for social movement activists (e.g. environmental or gender rights agencies). At different territorial levels, public bureaucracies established under the pressure of movement mobilizations at times became potential allies. Social movement activists also maintain direct contacts with decision makers – participating in epistemic communities, made up of representatives of governments, parties and interest groups. At the transnational level, advocacy networks – composed of activists, bureaucrats belonging to international organizations and politicians from many countries – have won significant gains in a number of areas, such as the protection of the environment and of human rights (Khagram et al., 2002). Most importantly, participatory and deliberative institutional innovations have developed over the past three decades. Open to the participation of ordinary citizens in public arenas of debate, they also aim at developing high-quality discursive arenas.
Local Protest
Social movement studies have addressed local politics especially through the analysis of urban movements as well as protests against locally unwanted land uses (LULUs), such as the construction of large infrastructure in both urban and semi-rural contexts.
Empirical research has challenged the narrative of local conflicts as motivated by a ‘not in my backyard’ (NIMBY) syndrome, associated with refusal to pay the necessary costs to attain public goods as well as with conservative behavior. Rather, it has singled out a complex reality, with protest campaigns characterized by a diverse capacity or will to present their particular claims within a more comprehensive framework (della Porta and Piazza, 2008; McAdam and Boudet, 2012). Local protests are often led by citizens’ committees, that is, organized but weakly structured groups of citizens that unite on a territorial basis and use forms of protest to oppose interventions that they claim would damage the quality of life in their territory. In the discourse of those who protest, the defense of local natural resources is often embedded in more general environmental claims that, however, rather than being ‘post-materialist’, often instead involve underprivileged groups in degraded areas. Local conflicts mobilize against urban ‘growth machines’, defined by formal and informal networks between public and private actors oriented toward increasing investment for economic growth. Under the frame of the Right to the City, urban social movements oppose the reduction of welfare resources destined for the most vulnerable social groups and propose a different model of development, bridging the opposition to specific land use with frames of social justice. Moreover, they criticize abuses of power and lack of transparency in public decision-making, as well as the collusive alliance between government and entrepreneurial interests, claiming the right to participate in decision-making.
Democratic Deepening
Long considered rather marginal, if not dangerous for democratic development, social movements have recently been recognized as important players in democratization processes. While not all social movements are
democratic, research has pointed to the leading role that some of them have occupied in the historical development of citizens’ rights by pushing for wider suffrage, the recognition of associational rights and social rights. Moreover, progressive movements have often been pivotal in triggering waves of democratization, demanding increased equality and protection for minorities. In Latin America, as well as in Eastern Europe and more recently in Africa and Asia, social movements asked for (different forms of) democratization, at times succeeding in triggering the breakdown of authoritarian regimes. While research on democratic transition has pointed at the importance of elite pacts for transition and at the demobilization of civil society during consolidation, the presence of a tradition of mobilization, as well as movements that are independent from political parties, can facilitate the maintenance of high levels of protest during transition and consolidation (della Porta, 2014b, 2017). Furthermore, democratic consolidation opens opportunities for social movements that in their turn contribute to democratic deepening (della Porta, 2013b). At the cognitive level, social movements have developed a fundamental critique of conventional politics, stating the value of participation beyond elections. Especially since the 1960s, progressive social movements have supported the direct participation of citizens in public decision-making, criticizing the delegation of decisions to representatives who can be controlled only at the moment of election. Moreover, they have called for more transparency and accountability in public decision-making. In their alternative conceptions of democracy, the people themselves must be recognized, along with their direct responsibility for intervening in the political decision-making process.
More recently, the normative debate on deliberative democracy has found resonance in some social movement organizations that have developed norms and practices of consensual decision-making and constructed high-quality discursive arenas, responding
to the need to improve the quality of communication. Social movements have in fact created alternative public spheres, free from state intervention. In the global justice movement as well as in anti-austerity protests, organizational formats such as social forums and protest camps provided spaces for experimenting with participatory and deliberative models of democracy both in their internal decision-making and in their interactions with political institutions (della Porta, 2009, 2015). Internally, they have attempted to develop an organizational structure based on participation (rather than delegation), consensus building (rather than majority votes) and horizontal networks (rather than centralized hierarchies). While the search for high-quality internal democracy is always ongoing, progressive social movements have helped to open new channels of access to the political system, contributing to the identification, if not the solution, of a number of democratic problems.
Transnational Politics
Social movement studies have contributed to research on international relations through the study of transnationalization of contentious politics. Taking into account the role of nonstate actors, some scholars have pointed out the effects of transnational protest campaigns upon developing international normative regimes. Research on environmental protection, human rights, peace and indigenous peoples pointed to the emergence of international norms that challenge the vision of international politics as an anarchic system of states. Studies of transnational social movement organizations targeting international organizations pointed at the politicization of international relations. Following an upward scale shift of power but also contributing to it, since the 1980s, transnational actors have grown enormously in terms of number, membership, material resources, public resonance and institutional access (della Porta and
Tarrow, 2005; Tarrow, 2005). Protest campaigns address different geographical levels in search of allies and channels of access to decision makers. Research on transnational processes and social movements distinguished two main paths of transnationalization: social movements with domestic political concerns (especially in authoritarian regimes) searching for external, international allies; and social movements addressing their own governments in order to influence international political decisions. Difficulties in protesting beyond national borders were stressed by research on protest events, which tended to target the domestic level even when addressing global problems. However, in campaigns for human rights, national actors suffering from repression in authoritarian regimes found allies abroad in epistemic communities, involving international organizations, national governments, experts and transnational social movements. The evolution of international organizations has also been pointed at as transnational social movements adapt to their specific set of opportunities and constraints, but also challenge their policy choices as well as their lack of democratic accountability. Exchanges of knowledge as well as potential reciprocal legitimization have been considered by research on the interactions between social movements and international organizations that pointed at the capacity of transnational social movements to sensitize public opinion to global problems. With the development of campaigns addressing various and diverse international organizations, research revealed the different opportunities that different international assets offered to different social movements (della Porta and Tarrow, 2005). 
In this way, opportunities for things such as a consensual culture and a reciprocal search for recognition were observed, for example, in the General Assembly of the United Nations, but not for the (much more closed) international financial institutions such as the World Trade Organization (WTO), the International Monetary Fund (IMF), the World Bank and the European
Central Bank, which became the targets of massive protests. As international organizations are very different in terms of channels of access, social movements that target them have in fact looked for specific leverage, such as in the unanimity rules of the WTO that make alliances with some states particularly relevant, or the links with international experts and the formal channels of consultation available at the International Labour Organization (ILO). International organizations are also often diverse in their internal structure. For instance, in the European Union, the Council, Commission, Parliament and Courts, but also the European Central Bank, are all targeted by social movements, but protest strategies vary with these bodies’ characteristics. Different movements can have a more difficult or an easier life in terms of gaining access to specific (sympathetic) Directorates General in the EC, or opposing powerful ones (della Porta and Caiani, 2009). Different opportunities therefore trigger different strategies: from the lobbying highlighted in previous research to the open protests in the street that characterized the alter-globalist movement. Going beyond the specific experiences of the parallel summits organized by the UN on environmental or women’s issues, transnational social movements organized counter-summits and social forums. Frustration over the results of more moderate forms of interaction brought about the development of broader coalitions, including religious groups, unions and social movement organizations, using petitions, marches and even direct action. After the WTO meeting at Seattle in 1999, interactions between protestors and the police often escalated during the contestation of international summits (della Porta, 2009). 
Even though transnational protests may remain a rare occurrence, the few transnational protests that have taken place (in the form of global days of action, counter-summits or social forums) have been highly visible and influential. Finally, social movements have contributed to the emergence of a global narrative – which
developed together with intense interactions during the mentioned transnational protest events – and indeed a growing acknowledgment of the roles and responsibilities of international organizations. Normatively oriented literature has pointed at the emergence of a global civil society. Research highlighted in fact the cognitive effects of globalization, for example, in the intensification of relations beyond borders in terms of the cross-national diffusion of movement frames and strategies for public order control. Social movement activists started to present their actions as part of a global justice movement, calling for global justice and global democracy. Although still deep-rooted in national political systems and movement families, cosmopolitan activists tend to bridge the local with the global and vice versa (della Porta, 2009; Tarrow, 2005). In doing this, they are contributing to the development of a transnational political system, as well as to transnational identities.
Major advances, ongoing debates, critical assessments
The expansion of social movement studies as a field has contributed to acquiring relevant knowledge about the functioning of political systems along the lines indicated in the previous section. While concepts such as resource mobilization, political opportunities, protest repertoires and frames became part of an established toolkit, they tended to produce selective attention to some particular empirical objects and theoretical perspectives. In the most recent debates, some new approaches have emerged as a partial correction of such risks. A first innovative turn is related to the conceptualization of contentious politics, defined as episodic, public, collective interaction among claims makers and their targets (when at least one government is a claimant, an object of claims, or a party to the claim and the claims would, if realized, affect the
interests of at least one of the claimants). In particular, Doug McAdam, Sidney Tarrow and Charles Tilly (2001) have promoted a broad research agenda that addressed similar mechanisms at play in social movements, revolutions, democratization processes and ethnic conflicts. While social movements are an important phenomenon, they interact with other types of social phenomena – some of them of a revolutionary type, as in the Arab Spring in 2010/11. Social movements need to struggle for democratization and the deepening of democracies. Ethno-nationalist conflicts are re-emerging, and old cleavages have been reactivated. These developments prompt the need to go beyond the established framework of social movement studies and look to other fields in order to get new ideas. Advocating a dynamic use of concepts, the scholars involved in this approach have tried to single out general mechanisms of contention. There is a second type of development that is important in connecting social movement studies with other fields of study. As mentioned, social movements interact with other actors. Concepts such as fields and arenas have recently been proposed to analyze the interactions of different actors, institutional or not. Within social fields, various actors cooperate, compete, negotiate and fight with each other (Fligstein and McAdam, 2012). In each field, relationships are characterized by unequal power with struggles for the transformation or preservation of the field. Similarly, reflections on arenas point at the multiplicity of strategic dilemmas experienced by various players in their continuous interactions (Duyvendak and Jasper, 2015). Third, attention to critical junctures emerged, in particular to the way in which some specific waves of protest succeeded in producing important changes.
Recent times have been defined as momentous: ‘great transformation’, ‘great recession’ and ‘great regression’ have frequently been used as shorthand to characterize the period following the financial breakdown of 2008. Moments of rupture are recognized as most important in
defining new paths. Bridged with the concept of crisis, these terms have been used to highlight sudden changes, in particular those related to the neoliberal turn in capitalism and its crisis. Reference to such moments can indeed be found in different approaches addressing them at the macro-, meso- and micro-levels. In neo-institutional approaches, critical junctures are defined as periods of ‘crisis or strain that existing policies and institutions are ill-suited to resolve’ – and therefore different from normal politics, when ‘institutional continuity or incremental change can be taken for granted’ (Roberts, 2015: 65). While neo-institutional approaches have looked at extraordinary times from a macro perspective, the Chicago School had long ago addressed changes from a micro perspective, looking at the sudden breaking of established paths, the reproduction of ruptures and their stabilization. Collective behavior was the concept used to define forms of social behavior in which ‘usual conventions cease to guide social action and people collectively transcend, bypass or subvert established institutional patterns and structures’ (Turner and Killian, 1987: 3). While research on protests has focused on long waves (Markoff, 2016) as well as on short(er) cycles (Tarrow, 1998) as analytic concepts, reflection on the relevance of some specific protest moments as catalyzers of change speaks to the capacity of social movements to contribute to emerging norms by breaking routine and introducing new ethical concerns. As Mark Beissinger noted in analyzing extraordinary times in the development of nationalism, protest events are in fact ‘contentious and potentially subversive practices that challenge normalized practices, modes of causation, or systems of authority’ (Beissinger, 2002: 14). 
Although still largely an exception in social movement studies, protests as momentous events have been reflected upon in particular by research looking at contentious politics as triggering an intensification of the perception of time (della Porta, 2017).
Another recent trend is a return to the analysis of the ‘why’ of social movements, that is, the way in which they are connected to societal conflicts and political cleavages. After a long period of neglect, research has therefore once again addressed the class bases of protest and the ways in which capitalist transformations affect collective action. After scholars lamented social movement studies’ lack of concern with capitalism, the bridging of social movement studies with a political economy approach has developed by singling out Marxist contributions to social movements and social movement analysis (Barker et al., 2013). Recognizing the role of agency, some debates developed on the characteristics of those actors that challenge capitalism in its various forms. In particular, Karl Polanyi (1957 [1944]) described those forces that resisted economic liberalism as counter-movements. Faced with economic crises, such as the Great Depression between the two World Wars, it was mobilization from below that pushed for a reversal of the economic paradigm, from liberalism to interventionist Keynesianism. According to the scholars of the so-called world system approach, it was indeed the task of anti-systemic movements to resist greedy capitalism, opposing the logic of the system. The concept of anti-systemic movements builds upon an analytic perspective of ‘the world-system of historical capitalism’ that gave rise to them, as ‘class and status groups were the two key concepts that justified these movements’ (Arrighi et al., 1989: 1). Building upon these concepts, recent research has looked at the structural basis of anti-austerity protests (della Porta, 2015; della Porta et al., 2017). Finally, while recognizing the importance of protests, research is also developing on other areas of movement activities.
First of all, research on meetings in movements allows reflection on the role of prefiguration, discourses and power in politics (della Porta and Rucht, 2013). Moreover, as social movement activists develop new ideas, the production
of knowledge by social movements is an important topic for research, as the Great Recession has been a sort of laboratory for movements as service providers. As the importance of social movements emerged also within authoritarian regimes, attention turned to the invisible forms of resistance by people who are deprived of their rights (Bayat, 2013). This happens in dictatorships and hybrid regimes, but also in struggles for the rights of migrants, refugees and other marginalized groups, for instance precarious workers, who cannot use traditional forms of protest because their civic and political rights are not recognized. The institutional outcomes of the recent waves of anti-austerity protests triggered theoretical and empirical attention to the role of civil society actors in institutional processes, especially through the construction of movement parties and referendums from below. In particular, in times of crisis, social movements have acquired constitutional power. The sociology of constitutions has pointed to the growing importance of constitutional processes as well as to changes in the conceptions and practices of constitution-making. In moments of democratic malaise, the direct involvement of citizens in the writing and rewriting of constitutions is a way to restore collective identities and solidarities. Reflecting the shift in constitutional thinking from a legalistic vision of constitutionalism as a technical process to a participatory one oriented toward creating a founding moment, progressive social movements have in fact been seen as triggering constitutional processes, in which citizens’ participation in deliberative arenas is considered necessary in order to construct normative agreements. In Iceland as in Ireland, citizens’ initiatives prompted some first experiments in constitution-making through the building of citizens’ assemblies, which have endowed themselves with instruments for interactions with the outside.
Plural information, civilized interactions, mutual respect, publicity and inclusive participation characterized the constitutional arenas in which citizens took part. In this
way, social movements acquired constituent power by calling for the constitutional protection of public goods against privatization, but also for the recognition of the value of the participation of citizens in public life. Similarly, referendums and other mechanisms of direct democracy have increasingly been called for in times of democratic malaise as a response to the increasing mistrust in representative institutions. While the uses of referendums have been most varied – as has been their democratic quality – progressive social movements have at times perceived the potential advantages (in terms of legitimacy, but also of efficiency) of handing the right to decide directly to the citizens. On issues as contested as the conception of public services or national independence, instruments of direct democracy can contribute not only to legitimating public decisions through participation, but also to increasing public participation and enriching public life. The participation of social movement organizations and campaigns in the referendum process can contribute to a substantial increase in not only the number of voters but also the plurality of the arguments, by multiplying the public arenas in which referendum issues are discussed. In participating in these processes, social movements increase the capacity for referendums to effectively extend the constitutional safeguards against the excessive power of the elected representatives, balancing delegation with participation. Protest activities organized by broad networks of social movement activists help the formation of citizens’ wills, contribute to improving the quality of public life and increase citizens’ ability to make considered judgments (della Porta et al., 2017). 
Referendums from below – involving a larger degree of extra-institutional mobilization, either through citizen-initiated referendums or by the wide-scale appropriation of state-endorsed ones – are in fact particularly conducive to broadening participation and improving deliberative quality. Besides their final outcomes, social movements’ engagement
in referendums can empower progressive ideas by making topics more relevant to the public and improving citizens’ awareness. Finally, in moments of crisis, movement parties have emerged and achieved broad support in a very short time-span. As late neoliberalism triggered a crisis of political legitimacy through public institutions’ evident lack of capacity (or willingness) to ensure citizenship rights, the ensuing electoral earthquakes opened spaces for new parties on the left as well as the right. On the left, progressive social movements have in fact triggered the development of political parties that not only represented their claims, but were also influenced by their conceptions and practices of democracy. As center-left parties moved toward the center, accepting the main tenets of neoliberal trust in free markets and abandoning social protection, and as calls for social justice and against inequalities spread among citizens through massive protest campaigns, new parties emerged on the left. Besides thawing once frozen political cleavages, the growing discontent with the dominant parties is expressed in the electoral arenas not only (or mainly) through abstention, but also through the emergence of movement parties. While increasing skepticism toward representative democracy also triggers right-wing populism, movement parties on the left attempt to bring into the representative institutions participatory and deliberative ideals nurtured by progressive social movements. As protest campaigns generate a contentious political culture and the strong critique of the political class brings about intense politicization, party alternatives develop, especially in the declining phases of the movements. The progressive movement parties have been fueled by a mix of institutional closure to movements’ demands and the opening of electoral opportunities given electoral de-alignment and increasing volatility.
In the cases of Podemos in Spain or MAS in Bolivia, given the crisis of trust in the center-left parties, movement activists and sympathizers, even if critical of representative forms of democracy, took the chance
to occupy a political and electoral space that had been left empty, and successfully built support by putting forward the anti-austerity claims developed during the protest cycle. Mechanisms of organizational appropriation allowed for the exploitation of these institutional channels, while first electoral victories galvanized the movement activists who were in search of new repertoires of action. In fact, their electoral successes were facilitated by a reverse reputational effect, as the attacks by political and mass media actors, which were considered part of the neoliberal elites, helped to focus attention and sympathies on the new movement parties (della Porta et al., 2017). In all cases, while adapting to representative political institutions, progressive movement parties supported a participatory vision at the organizational level, maintained disruptive and unconventional repertoires of action and reflected movements’ frames and claims.
Perspectives
As in many other fields in the social sciences, research on social movements has been very much influenced by two types of developments: one is development in the social sciences, in which research on social movements has been embedded; the other is development of the object of study, namely the social movements themselves. Social movement studies institutionalized around some main concepts, such as the political opportunities that social movements need at different geographical levels, the capacity of social movements to mobilize organizational resources and the framing processes used in assessing what it is possible to do and how it is possible to resist or produce changes. These have been important concepts in addressing what were for a long time the main types of movements that empirical research focused upon, from the environmental to the women’s movement, mainly within consolidated democracies. Therefore, to a certain extent,
social movement studies considered social movements as children of expanding democracies and increasing social welfare within nation-states that were of course interacting with each other, but could still autonomously implement citizenship rights – civil rights, political rights, social rights. The type of knowledge that came out of this tradition has been very interesting, contributing to research on social movements but also to several other fields of political science and political sociology that were up to that point very much focused on institutions. Social movement studies have been able to investigate and provide evidence on the importance of a vibrant civil society for the development of democracies, but also on the relevance of conflicts. They have pointed to the need to make representatives in democratic institutions accountable, but also to develop alternative public spheres in which citizens can take the lead in developing new ideas and political innovations. At the empirical level, knowledge was achieved through methodological pluralism, which is very widespread in the field, and which has in fact been particularly important in an area where data are not readily available but must to a large extent be collected by the researchers themselves. Methods such as protest event analysis, or surveys during demonstrations, have been truly innovative attempts to increase knowledge about social movements. Moreover, life history studies, participant observation and participatory action research have contributed to our understanding of social movements through middle-range theories, and historically bounded and contextually grounded theories.
In sum, social movement studies contributed a useful toolkit of concepts that have been exported to different geographical areas and different disciplines, becoming influential also in disciplines such as political geography, psychology, various types of historical studies, and even in economics, philosophy, democratic theory, urban studies and gender studies. By working around similar concepts, this has produced cumulative knowledge.
Yet the tradition of social movement studies has been largely restricted to the analysis of advanced democracies with developed forms of welfare states. Related theorizations have been mainly oriented toward explanation of the impact of social and political structures on collective action. Some of the most influential scholars in social movement studies have in fact even developed a sort of self-criticism about the structuralist bias of the most widespread approaches, which focus either on the macro-conditions for the development of social movements, or on the micro-motivations of social movement activism. With a focus on explaining when and how social movements expand as well as when and how people participate in different forms of protest, social movement studies were also very much influenced by the development in the social sciences, which was oriented toward attempts to test hypotheses by identifying crucial independent variables to explain the outcomes. As social movement studies expanded, the risk emerged that the institutionalization of the field could lead to analytic and empirical closure. We may need to go beyond the traditional toolkit of concepts, theories and methodologies in order to understand ongoing deep and rapid changes. While previous research was looking to the environment for resources and opportunities that social movements could exploit, in times of crisis attention moves toward understanding how protest itself can trigger and empower collective actions. As old patterns of behavior are challenged, and existing structures are less able to guide and constrain social movements, attention to emergent types of processes is developing – that is, attention to how social movements can become producers of their own resources, of their own empowerment.
In order to understand these transformations, social movement studies need to expand their theoretical perspective and go beyond consolidated democracies, being open to cross-fertilization with research traditions that scholars and activists have developed in contexts other than the European or American ones.
Empirical databases¹
Main cross-national and general population surveys with information on protest and social movements:
1 European Social Survey (ESS): 8 waves between 2002 and 2016; countries included changed over time (https://www.europeansocialsurvey.org/data/country_index.html), with information on:
• Non-electoral political participation (dummies): Worked in political party or action group//Worked in another organisation or association//Worn or displayed campaign badge/sticker//Signed petition//Taken part in lawful public demonstration//Boycotted certain products last 12 months.
• Posted or shared anything about politics online in last 12 months (only ESS8 2016).
• Participated in illegal protest activities in last 12 months (only ESS1 2002).
• Member of trade union or similar organization.
2 European Values Study (EVS): 5 waves (https://europeanvaluesstudy.eu/).
• Political actions: Sign a petition//join in boycotts//take part in lawful demonstrations//join unofficial strikes//occupy buildings or factories//damage things, break windows, street violence (only in EVS 1981)//personal violence (only in EVS 1981).
• Membership: Belong to social welfare service//religious organization//cultural activities//labor unions//political parties//local political actions//human rights//conservation, the environment (only in EVS 1990)//animal rights (only in EVS 1990)//professional associations//youth work//sports or recreation (not in EVS 1981)//women’s group (not in EVS 1981)//peace movement (not in EVS 1981)//organization concerned with health (not in EVS 1981).
3 World Values Survey (WVS) http://www.worldvaluessurvey.org/WVSDocumentationWVL.jsp: change of measurement (might do; have done & how often).
• Signing a petition//Joining in boycotts//Attending peaceful demonstrations//Joining strikes (not in WVS5)//occupying buildings or factories (in WVS1-2-3)//other (WVS4-5)//any other act of protest (only in WVS6).
4 International Social Survey Programme (ISSP), in 2004 and 2014. Citizenship modules; countries included specified here as well: https://www.gesis.org/issp/modules/issp-modules-by-topic/citizenship/2014/
• Political actions: sign a petition, boycott certain products, take part in a demonstration, attend political meeting or rally, donate money or raise funds, contact media, express views on the internet
• Trade union membership
5 Other large cross-national databases and projects with information on organizational membership and non-electoral participation include, e.g. LIVEWHAT: http://www.unige.ch/livewhat/wpcontent/uploads/2014/02/Integrated-Report.pdf
6 Going beyond Western countries, another very useful data source with rich information is the Latinobarometer: http://www.latinobarometro.org/latContents.jsp. The scope and type of questions, however, change from wave to wave.
Non-survey databases on social movement activities include:
7 Nonviolent and Violent Campaigns and Outcomes Data Project (NAVCO): https://www.du.edu/korbel/sie/research/chenow_navco_data.html
8 Global Nonviolent Action Database: https://nvdatabase.swarthmore.edu/content/nonviolentaction-defined
9 Media and Peacebuilding Project: https://mediapeaceproject.smpa.gwu.edu/
10 Ron Francisco’s comparative data on Protest Event Analysis across European countries: http://web.ku.edu/~ronfrand/data/
11 Dynamics of Collective Action (by McAdam, McCarthy, Olzak and Soule): https://web.stanford.edu/group/collectiveaction/cgi-bin/drupal/
12 POLCON, by Kriesi et al.: https://www.eui.eu/Projects/POLCON
13 MPEDS, by Pam Oliver et al.: https://www.ssc.wisc.edu/~oliver/protest-research/mpeds/
14 Regarding newswires, the most important and reliable cross-national and longitudinal source on events, including but not limited to protest events, is the Integrated Crisis Early Warning Systems (ICEWS): https://dataverse.harvard.edu/dataverse/icews. More info: http://www.lockheedmartin.com/us/products/W-ICEWS/iData.html
15 An alternative, quite popular among social movement scholars, is GDELT: http://gdeltproject.org/data.html#rawdatafiles. More info here: http://data.gdeltproject.org/documentation/ISA.2013.GDELT.pdf
16 On-site protest surveys in CCC/protest survey: http://www.protestsurvey.eu/index.php?page=project
17 Regarding labor struggles, the OECD has data on ‘union density’ and ‘union membership’: https://stats.oecd.org/Index.aspx?DataSetCode=TUD
18 ILO data on collective bargaining are available at: http://www.ilo.org/global/topics/collectivebargaining-labour-relations/WCMS_408983/lang--en/index.htm; additional data are provided in databases of Eurofound & ICTWSS (Hibbs)
19 The SM outcomes database by Jennifer Earl et al. is available at: http://yapdatabase-yppnetwork.net/
20 The Transnational SM Organizations dataset by J. Smith: https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/33863 and the SM Organizations Meta Data by J. Earl (that merges and expands the Dynamics of Collective Action + PONS datasets): https://jearl.faculty.arizona.edu/content/social-movement-organizations-meta-data
Note
1 I am thankful to Martin Portos for the information reported in this section.
References
Arrighi, Giovanni, Terence K. Hopkins and Immanuel Wallerstein. 1989. Antisystemic Movements. London: Verso.
Barker, Colin, Laurence Cox, John Krinsky and Alf Gunvald Nilsen (eds). 2013. Marxism and Social Movements. Leiden: Brill.
Barnes, Samuel H., Max Kaase, Klaus Allerbeck, Barbara Farah, Felix Heunks, Ronald Inglehart, M. Kent Jennings, Hans D. Klingemann, Alan Marsh and Leopold Rosenmayr. 1979. Political Action: Mass Participation in Five Western Democracies. London/Newbury Park, CA: Sage.
Bayat, Asef. 2013. Life as Politics: How Ordinary People Change the Middle East. Stanford: Stanford University Press; 2nd edition.
Beinin, Joel and Frédéric Vairel. 2019. Social Movements, Mobilization, and Contestation in the Middle East and North Africa. Stanford: Stanford University Press; 2nd edition.
Beissinger, Mark R. 2002. Nationalist Mobilization and the Collapse of the Soviet State. Cambridge: Cambridge University Press.
Caiani, Manuela, Donatella della Porta and Claudius Wagemann. 2012. Mobilizing on the Extreme Right. Oxford: Oxford University Press.
Cefaï, Daniel. 2007. Pourquoi se mobilise-t-on? Les théories de l’action collective. Paris: La Découverte.
Crossley, Nick. 2002. Making Sense of Social Movements. Buckingham: Open University Press.
Dalton, Russell J. 2004. Democratic Challenges, Democratic Choices: The Erosion of Political Support in Advanced Industrial Democracies. Oxford: Oxford University Press.
Della Porta, Donatella. 1995. Social Movements, Political Violence and the State. Cambridge/New York: Cambridge University Press.
Della Porta, Donatella (ed.). 2009. Another Europe. London: Routledge.
Della Porta, Donatella. 2013a. Can Democracy Be Saved? Cambridge: Polity Press.
Della Porta, Donatella. 2013b. Clandestine Political Violence. Cambridge: Cambridge University Press.
Della Porta, Donatella (ed.). 2014a. Methodological Practices in Social Movement Research. Oxford: Oxford University Press.
Della Porta, Donatella. 2014b. Mobilizing for Democracy: Comparing 1989 and 2011. Oxford: Oxford University Press.
Della Porta, Donatella. 2015. Social Movements in Times of Austerity: Bringing Capitalism Back into the Analysis of Protest. Cambridge: Polity.
Della Porta, Donatella. 2017. Where Did the Revolution Go? Contentious Politics and the Quality of Democracy. Cambridge: Cambridge University Press.
Della Porta, Donatella and Manuela Caiani. 2009. Social Movements and Europeanization. Oxford: Oxford University Press.
Della Porta, Donatella and Mario Diani. 2006. Social Movements: An Introduction. Oxford: Blackwell; 2nd expanded edition.
Della Porta, Donatella and Mario Diani (eds). 2016. Oxford Handbook of Social Movements. Oxford: Oxford University Press.
Della Porta, Donatella and Sidney Tarrow (eds). 2005. Transnational Protest and Global Activism. New York: Rowman and Littlefield.
Della Porta, Donatella and Gianni Piazza. 2008. Voices of the Valley, Voices of the Straits: How Protest Creates Communities. London: Berghahn Books.
Della Porta, Donatella and Dieter Rucht (eds). 2013. Meeting Democracy. Cambridge: Cambridge University Press.
Della Porta, Donatella, Joseba Fernández, Hara Kouki and Lorenzo Mosca. 2017. Movement Parties Against Austerity. Cambridge: Polity.
Della Porta, Donatella, Francis O’Connor, Martin Portos and Anna Subirats Ribas. 2017. Social Movements and Referendums from Below: Direct Democracy in the Neoliberal Crisis. Bristol: Policy Press.
Diani, Mario. 1992. The Concept of Social Movement. Sociological Review, 40(1), 1–25.
Diani, Mario. 2015. The Cement of Civil Society. Cambridge: Cambridge University Press.
Duyvendak, Jan Willem and James Jasper. 2015. Breaking Down the State. Protestors Engaged. Amsterdam: Amsterdam University Press.
Fillieule, Olivier and Guya Accornero (eds). 2016. Social Movement Studies in Europe. New York: Berghahn Books.
Fligstein, Neil and Doug McAdam. 2012. A Theory of Fields. Oxford: Oxford University Press.
Gamson, William. 1990. The Strategy of Social Protest. Belmont, CA: Wadsworth; 2nd edition.
Goodwin, Jeff and James M. Jasper (eds). 2009. The Social Movements Reader: Cases and Concepts. Oxford: Wiley-Blackwell; 2nd edition.
Khagram, Sanjeev, James V. Riker and Kathryn Sikkink (eds). 2002. Reconstructing World Politics: Transnational Social Movements, Networks and Norms. Minneapolis: University of Minnesota Press.
Markoff, John. 2016. Historical Analysis and Social Movements Research, in Donatella della Porta and Mario Diani (eds), Oxford Handbook of Social Movements. Oxford: Oxford University Press.
McAdam, Doug and Hilary Boudet. 2012. Putting Social Movements in Their Place: Explaining Opposition to Energy Projects in the United States, 2000–2005. Cambridge: Cambridge University Press.
McAdam, Doug, Sidney Tarrow and Charles Tilly. 2001. Dynamics of Contention. Cambridge: Cambridge University Press.
McCarthy, John D. and Mayer N. Zald. 1987. Social Movements in an Organizational Society. New Brunswick: Transaction.
Melucci, Alberto. 1996. Challenging Codes. Cambridge and New York: Cambridge University Press.
Norris, Pippa. 2002. The Democratic Phoenix. New York: Cambridge University Press.
Piven, Frances F. and Richard A. Cloward. 1977. Poor People’s Movements. New York: Pantheon.
Polanyi, Karl. 1957 [1944]. The Great Transformation: The Political and Economic Origins of Our Time. Boston: Beacon Press.
Roberts, Kenneth M. 2015. Changing Course in Latin America. Cambridge: Cambridge University Press.
Rossi, Federico M. and Marisa von Bülow (eds). 2015. Social Movement Dynamics: New Perspectives on Theory and Research from Latin America. Farnham: Ashgate.
Snow, David. 2013. Identity Dilemmas, Discursive Fields, Identity Work and Mobilization: Clarifying the Identity–Movement Nexus. In Jacquelien van Stekelenburg, Conny Roggeband and Bert Klandermans (eds), The Future of Social Movement Research: Dynamics, Mechanisms, and Processes. Minneapolis: University of Minnesota Press, pp. 263–80.
Snow, David, Donatella della Porta, Bert Klandermans and Doug McAdam (eds). 2013. The Wiley-Blackwell Encyclopedia of Social and Political Movements. Oxford: Blackwell.
Snow, David, Sarah Soule and Hanspeter Kriesi (eds). 2004. The Blackwell Companion to Social Movements. Oxford: Blackwell.
Snow, David, Sarah Soule, Hanspeter Kriesi and Holly McCammon (eds). 2018. The Blackwell Companion to Social Movements. Oxford: Blackwell; 2nd edition.
Tarrow, Sidney. 1998 [1994]. Power in Movement: Social Movements, Collective Action and Politics. New York/Cambridge: Cambridge University Press.
Tarrow, Sidney. 2005. The New Transnational Activism. New York/Cambridge: Cambridge University Press.
Tilly, Charles. 1978. From Mobilization to Revolution. Reading, MA: Addison-Wesley.
Tilly, Charles and Sidney Tarrow. 2007. Contentious Politics. Boulder, CO: Paradigm Press.
Touraine, Alain. 1977. The Self-Production of Society. Chicago: University of Chicago Press.
Turner, Ralph H. and Lewis M. Killian. 1987 [1974, 1957]. Collective Behavior. Englewood Cliffs, NJ: Prentice Hall; 3rd edition.
40 Social Structure
Manuel Antonio Garretón and Nicolás Selamé
Introduction
The concept of social structure has been one of the most important in the social sciences. Because of its relevance, its meaning has been the subject of debate between diverse disciplines. While political science tends to consider social structure insofar as it affects political dynamics, in sociology it has been a central object of study since the beginning of the discipline. This has led to several points of convergence and dialogue between the two fields, since the comprehension of social structures can partially help to understand political phenomena, but can also lead to mistakes due to the reduction of political conflicts to the influence of social structure, or their ‘sociologization’ (Sartori, 1969). Nevertheless, in its conceptualization of social structures, sociology has always considered implications of political phenomena in one way or another, without necessarily reducing the focus simply to the consequences of social structures. This chapter discusses problems of social structure, from the perspective of sociology, and their consequences for politics.
Social Structure: the sociological approach
The concept of social structure in sociology varies as widely as social theories in general. It has been of concern since the beginnings of the discipline, even though it has not always been used in the same way. Taking the most general approach, according to Giner et al. (2006: 311), it can be said that social structure in sociology refers to ‘the most permanent, the basic, the non-apparent and maybe hidden, the framework or maybe the logical shape of something’. After this first vague delineation, the authors point to at least five different concerns in sociology’s structural problematization: (1) the structure–agency relation; (2) the static and dynamic aspects of structures; (3) the distinction between analytical and concrete structures; (4) the descriptive–explanatory conception of structure; and (5) the structure–culture relation (ibid: 311). The first dimension of the problem concerns the micro–macro dichotomy, where it must be discerned whether a phenomenon depends on particular elements (the agents) or on long-lasting context characteristics (the structure). This tends to be the most important subject when talking about structure in sociology. The second dilemma refers to the conception of structure as an immutable, steady component, or as a dynamic factor of social changes (the motor of history of Marxism¹ is an extreme example). Third, it must be discerned whether a structure has distinguishable characteristics that allow its isolation from others in a concrete empirical way or whether it is a purely analytical category that cannot be separated from other elements (e.g. economic and political structures that can be differentiated from others only analytically but in fact interact in many ways). The fourth point refers to Lévi-Strauss’s concept of structure as a theoretical framework to understand the elements of society that determine the actions of subjects, but the objective existence of which cannot be assured. The fifth dimension deals with the contrasts and interactions of the concept of structure in relation to that of culture. The assimilation of both is present mainly in the Parsonian tradition, where it is assumed that culture assigns roles to actors and therefore determines structures. In the perspective that opposes structure and culture, the first is related to objective aspects (from demographic characteristics to social groups such as classes or nations), while culture is understood as a ‘subjective’ dimension of social life.
In some schools, the debate around these concepts is open – as in Marxism, where some oppose culture and structure while others consider culture as one of the structures of society. These are the problems most frequently dealt with in the treatment of social structure
in sociology. This does not mean that they are always considered in each theory, but rather that they are the most problematic issues of the concept. In most of the theories we discuss, these problematic dimensions of social structure can be traced.
Social Structure and Classical Sociology
One initial conception of the problem of structure in Durkheim’s ideas emerges in the basic rules which he explains as part of his sociological method. When he defines ‘social facts’, he refers to them as something that affects a subject ‘with a compelling and coercive power by virtue of which, whether he wishes it or not, they impose themselves upon him’ (Durkheim, 1982: 51). It can be noticed in this brief statement that in Durkheim’s sociology the subjects, their will or the agency that can emerge from them are not relevant – neither for the discipline nor for the course of society. What can help to understand society and its changes are the collective phenomena that transcend individuals and ‘impose themselves’. The main change in social structure to which Durkheim attends is the tendency toward specialization in the functions of the subject in modern societies, which creates new relations between individuals (Durkheim, 1965). In other words, he looks to a new type of society where the main processes of change are explained by that tendency toward the division of labor and the course this structure takes. The transition from traditional to modern societies that he describes is also an interesting example of the problem mentioned previously concerning the opposition of culture and social structure. In traditional societies, culture is the main cohesive element, while in modern societies the structural division of labor becomes most relevant for social cohesion. This is also a topic much studied by Karl Marx.
In the works of Marx, social structure is more consistently dealt with and his notions are sharper than Durkheim’s. For our purposes, the Marxist approach to social structure can be summarized in two premises: first, a critique of the liberal concept of human beings according to which politics is the representation of ideas not linked to economic interests, opposing to this the concept of class-structured interests (Marx, 2018); second, a concept of conflictual social progress determined by the dynamics of the social relations of production, which represents social structure (Marx, 2017). For Marx, politics is a matter of classes confronted by their opposing interests, which they develop because of their positions in the process of production (Marx, 1998). In this way, the economic sphere in which the process of production takes place works as the social structure that configures subjects, determining their position in politics and the overall political life, which he calls ‘superstructure’. The transcendent aspects of a subject’s existence are in this way all linked to this structural determinant: they condition his interest, the part he takes in the class conflicts, and the future of society. The State and politics are, then, ‘super-structural’ phenomena, or what we call today a dependent variable. In Marx, as in Durkheim, social reality is explained to a large extent by structural phenomena, or macro events determining micro ones. Also, there is an opposition between structure and culture: culture is a reflection of the structural relations of production, says Marx, and the liberalization of society is derived from a new social solidarity, as postulated by Durkheim. Although both authors take a different route in explaining how structures operate and sustain society, there is a shared acceptance of an imposed and even fatal nature of a part of social life for the subjects. Later theories took up this issue. 
The acceptance of this concept of social structure remained more or less unchanged in the course of the discipline: the idea that some elements of social phenomena
transcend the capability of subjects to change their reality, and that this structural nature is linked in a dichotomous relation with the cultural production of society. That concept underwent an important turn with the advent of structural–functional theory in the works of Talcott Parsons. This change is related to the question of what produces social structure, and how that process takes place. It led, to a large extent, to an identification of culture and structure. For Parsons, society is the interaction of individuals that seek maximum gratification in the context of both a cultural and a material world. However, he does not take the individuals or their interactions as the starting point. Instead, he defines ‘status roles’: social positions that subjects occupy in order to perform functions in the system, and which they internalize through the cultural guidelines that the society provides. Status roles for Parsons constitute social structures that are imposed on subjects through cultural guidelines, which determine their possibility of agency, their ambitions and rules for interaction. However, what is new in contrast to the previous structure theories mentioned is that here, structure emerges as a function of a cultural system. Status roles are an imposition on subjects in their socialization, but they do not exist before the social system; rather, they are a product of it, modeled by its functional needs. The needs of the social system in terms of reproduction, then, give rise to the cultural determination of people’s behavior that generates Parsons’ social structure as a set of ‘status roles’ (Parsons, 1999). Although Parsons’ disciple Robert Merton introduced notions of conflict to the structural functionalist conception of society, this previous turn in the understanding of the generation of social structures remained present in his works. An important part of Merton’s work is focused on the dysfunctions that exist within the social system. 
In this way, structural functionalism loses part of its unilinear understanding of social phenomena according to which social systems effectively
generate and reproduce complete harmony among their components. One of Merton’s most famous topics is social anomie, which treats the problems derived from the limits that social structures can impose on the achievement of the cultural guidelines accepted by subjects (Merton, 1968: 209–74). However, the major aspects of the structural functionalist tradition to which Merton belongs are those postulated by Parsons. In this sense Merton, Parsons and the functionalist school overall were interpreted by their opponents as a ‘consensus’ social theory, under which structural change is difficult to explain and to expect. This gap in the explanatory potential of functionalism is what conflict theories of the 20th century, usually under Marxian influence, opposed and tried to fill. The works of Dahrendorf were emblematic among these (see Dahrendorf, 1959). It can be seen that all the concepts of structure reviewed up to this point depend on the general theory of the authors mentioned. This complicates the dialogue between them, since it is a complete paradigm that is at stake. At the same time, it makes it difficult to open the debate to other disciplines and to problems different from those addressed in sociology.
Towards an Open Concept of Social Structure
Up to this point, social structure has been treated as an inherent problem for sociological theories. However, the argument developed so far lacks a concept of social structure around which a dialogue with political science can be created. According to Ritzer, this concept was revived during the 1970s as a reaction on the part of some sociologists to the culturalist turn in the discipline, highlighting the works of Peter Blau, which vindicate the study of social structure as the main purpose of sociology (Ritzer, 1993: 440). The particular relevance here lies in the explicit treatment of the concept of social
structure as ‘the discernible patterns of social life, the observable regularities, the detected configurations’ and, more precisely, as ‘the distributions of population according to diverse parameters in different social positions that influence the relations of roles of the people and social interaction’ (Blau, 1975a: 3). This idea of social structure, and the postulation of it as the principal topic of sociology, conflicts with the Parsonian and systemic approaches, since their main interest resides in cultural or idiosyncratic determinants of social phenomena (Blau, 1975b). The idea of social structure refers to what is beyond the will of subjects and imposes on them – consciously or unconsciously, in different ways – objective conditions that determine their possible actions as guidelines as well as limits. This is the notion that Blau emphasizes in his definition of social structure. Its advantage is that it is not bound to any long-range theory,² but that is precisely what makes it difficult to talk of ‘a’ social structure and to define a particular role for it in the unfolding of social reality. Instead, ‘the structural problem’ emerges as a dimension in all social phenomena which must be elucidated in every study. This seems to be a better concept for a dialogue between the concepts of structure in sociology and political science, because it helps to shield against the risk of domination of political problems by sociological explanations, which is Sartori’s worry (Sartori, 1969). In this dialogue between sociology’s elaborations of structure and political science, it is useful to conceive the problem as ‘the structural’ (instead of ‘the structure’). This allows the structural dimension to be considered in the study of any problem, but without a delimited concept of structure, which would probably also involve a complete social theory and the obligation to deal with most of its premises. 
In fact, returning to Blau, he refers to socioeconomic stratification, gender and race as structural determinants, but also gives definitions that make it possible to consider other structural variables in ways that
specific social formations may require. This allows the development of several linkages that sociological theories tend to establish between social structures and political dimensions, like the ones that we will consider later. Although in current studies sociologists still tend to assume the idea of structural conditioning of social phenomena, there are doubts about the possibility of isolating these structures, their nature and their consequences, due to the changes they have experienced. This means that in a social group affected by the same structures (class is the most typical example), several other variables, usually cultural ones, differentiate the community supposed to have similar characteristics, reducing the intensity of collective life and consequently changing collective action (Touraine, 1998). As the cause of this turn in the dynamics of social structure, we can point to the advent of post-industrial societies, with a more intense individual, cultural life, generating more distance between the individual subject and the collective (Garretón, 2015). One of the most powerful descriptions of this phenomenon refers to hybrid cultures and the instability that they exhibit in the globalized world (García Canclini, 2005). Although it can be said that cultures have always been ‘hybrid’, the instability and diversification that they currently show is one of the ways to explain the fragmentation that social structures today manifest, expressed in the diverse courses that groups under their influence can take. The idea developed here about social structure based on Blau’s concept seems to fit better with the emergent questions cited above, and to be more useful than those set out in classical theories. 
In this sense it opens the possibility that there is no universal and complete determinism of the structure over action, that there are different definitions of social structures according to each society and that other aspects of society must be empirically analyzed. This shows how social structure affects politics in very different ways, as we will see later. Finally, it takes into account what are presumably the same
problems of social complexity that today’s studies consider when talking of ‘postmaterialist values’ (Inglehart, 1985),³ globalization (Calderón, 2004), new social movements (Wickham-Crowley and Eckstein, 2017) and crises of representation (Garretón et al., 2003), among other current topics.
Agents, Subjects and Action
The main question that the concept of social structure must answer is about the relation between structure and actors or subjects, that is, how its explanations leave space for agency and subjects. Here lies the crucial importance of the concept for politics. In sociology, a social–structural approach could be opposed to those referring to individuals’ capacity for action. This has led to several debates as to when or how, in social phenomena, it is individuals or structures that hold the key to an explanation. Summarizing this debate, Margaret Archer points to typical solutions at which sociologists have arrived, highlighting what she calls a common ‘conflationist’ fallacy. She refers to theories that overestimate the importance of one dimension (the agent or the structure) in the explanation of social events. There, she points to three paths of ‘conflation’: ascendant, when agency is understood as the unique source of explanation; descendant, when each explanation lies in the structure; and central, when the specificity of structure and agency is ignored, and social events are comprehended as a diffuse amalgam that simplifies the complexity of social reality which includes these two different poles (Archer, 1995). Instead, Archer proposes a ‘realistic’ theory that recognizes the specificity and relative autonomy of both the structure and the subjects. This understanding assumes the existence of social structure as a concrete, relatively autonomous entity that conditions the possibilities for subjects to act. However, she also recognizes the autonomy of subjects,
given a certain structural context, to act according to a large spectrum of possibilities or certain degrees of freedom (Archer, 2003). This implies rejection of Durkheim’s postulates, according to which social facts must be studied as ‘objective’, ‘concrete’ phenomena that impose on subjects something from which they cannot escape (Durkheim, 1982). According to Archer, this is a ‘descendant conflation’ (Archer, 2003). The same applies to Marx’s precepts that locate in the social relations of production the configuration of the social being as well as the motor of social transformations (or history). The Parsonian response to the problem is also dismissed by Archer, as it proposes that systems are all guided by the same logic, no matter whether the individual system or the cultural one. ‘Central conflation’ emerges mainly as the attempt of theories to overcome the polarized solutions that focus on agents or structures. However, this also amounts, according to Archer, to a fallacy. A social theory should recognize that social reality is composed of a variety of elements, and that treating them as homologous in nature, as central conflation does, tends to obscure problems more than clarify them. Two examples of this are Bourdieu’s and Giddens’ proposals, two theories of action that became very relevant but are also criticized by Archer. In the case of Pierre Bourdieu’s theory, it is assumed that social structure carries a logic, a ‘doxa’,⁴ that agents (the author avoids the concept of ‘subject’) internalize in the process of socialization through practical experience (Bourdieu, 1998). After this, the acts of agents are guided by this ‘doxa’ and, later, new structures will be mainly elaborated by agents previously configured according to the ‘doxa’ of previous structures. If this is so, the possibility of agents acting independently of structures (not just reproducing inherited dynamics) is eliminated, and with it the difference between these entities. 
That is what Archer denounces as a fallacy of central conflation. The case is similar for Giddens. He declares his purpose to be that of overcoming
the agency–structure dualism, mainly with a theory of structures that conceives them as constructed through human actions and relations (Giddens, 1979). The problem remains how to delimit the constraints that structure places on agency without ignoring them, and how to account for the possibilities that agency has to affect structure, given the space of ‘freedom’ that structures leave for agents. When talking of structures, the ascendant conflation mentioned above does not apply, as it basically ignores the structural dimension. The relevance of Archer’s concept of structure and agency is emphasized here because of the connection it can make between social theories and political dimensions – particularly theories of cleavage, as we will see in the next section. The conflictive relation between structure and agency makes space for several explanations of political dynamics. In this way the role that structure plays in determining the configuration of politics and the role of agents can be considered, as well as the inverse case, in which agents (voters, social groups or political elites) are more relevant as an explanatory factor of the phenomena. These considerations acquire more relevance when taking into account the problems of diversification and fragmentation of social structure previously mentioned. In the context of fragmented social structure and hybrid cultures, mono-causal explanations are less likely to describe social reality adequately, and Archer’s proposals emerge as a better alternative. Accordingly, the middle-range sociological concept of the ‘socio-political matrix’, used for analysis and comparison of the Latin American configuration of national states, illustrates how sociological studies can connect social theory and political studies. 
The concept is an attempt to describe the dynamics of articulation generated in Latin American societies between the state, social groups and political parties with regard to structural constraints such as the socio-cultural and economic basis of the society, all of these being mediated by the political
regime (Garretón et al., 2003). Starting from that concept, several studies have considered the effects of globalization, hybrid cultures and fragmentary structure on politics (Garretón, 2002; García Canclini, 2005). As will be seen, this constraint on the type of link that society and politics sustain is also a conditioning of social structure (or its fragmentation) over the nature of cleavage politics. The perspective on social structure drawn here seems to be coherent with contemporary transformations in society, recognizing the limits and guidelines it imposes for agents, and it sheds light on the roles that particular agency events play. It is also a historically grounded framework, open to the role and relevance that every element acquires in different cases – in particular, the increasing complexity of society which today places in question sociological structure theories and also, as we will see, cleavage studies.
Social structure and politics
As we have said, the definition of social structure that we have adopted allows us to establish several linkages between the sociological perspective and political dimensions of society. In this sense, one of the main concepts that can establish the dialogue between sociology’s structural elaborations and political science has been the concept of cleavage. Lipset and Rokkan (1967) set out the concept of cleavage politics in their work ‘Party Systems and Voter Alignments’. Studying the democracies of Western Europe, the authors postulated as a hypothesis that party systems had ‘frozen’ around a few major social–structural cleavages. These were generated by historical events and processes that divided these societies. One example is the ‘class cleavage’ opposing capital and labor, or the bourgeoisie and the working classes in Marx’ terms, as a result of the industrial revolution. These became organized in employers’ associations and trade unions as well as
in conservative and socialist/social democratic parties. One expression of this is the classic left–right dimension in many contemporary party systems (Seiler, Chapter 33, this Handbook). In this first development of the concept, the link with social structure ideas is obvious, since social determinants – class or geographical distribution of the population, for example – are pointed to as key to understanding the behavior of the electorate. In that first elaboration by Lipset and Rokkan, political alignments were conceived as an epiphenomenon of conflictive structural processes, and the cleavage was the concept that explained the relation between the two. At the same time, as analyzed by Bartolini and Mair (1990), a cleavage was constituted by three components: first, a social (structural) fissure; second, social institutions that expressed and mobilized that conflict; and third, as a result of the previous two, party alignments. After a first moment of social unrest, political conflicts tend to become ‘frozen’ and work around that social fissure and its social institutional expression. However, the dynamic of this cleavage evolved into a more complex relation. Political parties were, after their freezing around this issue, capable of reviving the conflict or, in more theoretical terms, capable of exercising agency over the cleavage and its development. This last point operated in later studies as the door to a diversification of the cleavage concept. In fact, the first elaborations of the cleavage concept proposed by Lipset and Rokkan (1967) have undergone many changes that can be compared to those in the social structure debate. The authors first defined particular social structures that explain political confrontations (class, center–periphery, rural–urban⁵) as sources of socio-political conflicts. 
However, there are others that emerged in the later cleavage studies, such as socioeconomic and professional differentiation, gender, ethnic diversity and age groups, which only a flexible concept of social structure, and the considerations about its fragmentation, can cover.
The theoretical framework behind the cleavage concept has tended to disperse at the point at which the empirical foundations have become the main topic of debate, acquiring a more flexible definition for the characterization of each case studied (Lybeck, 1985). In later studies this framework proved to have shortcomings, particularly in post-dictatorship contexts where the political (not ‘structural’) cleavage of authoritarianism versus democracy birthed new party systems, as well as in its use for the Eastern European democracies after the end of the Soviet Union (Enyedi, 2005). But it was not only the incorporation of new cases that represented a challenge for cleavage studies. The social changes that both Western and Eastern European countries experienced created a new political dynamic, which was condensed in the concept of ‘postmaterialist values’ as a conflict that was no longer rooted in a social structure (or was not so in the same way) as the classical cleavages (Inglehart, 1977, 1985). This led to debate, based on empirical evidence, around the pertinence of the concept of cleavages for the study of political dynamics. In political science, this proposal had at first been an object of criticism, with rejection of the influence over politics that it attributed to social structure (Sartori, 1969). The empirical evidence, however, took the discussion to another level. Basically, there were too many new and old cases that had now experienced change, that offered little or no evidence of the close relation between social structure and political conflicts as postulated in cleavage theory. The difficulties of standardizing a methodology to measure cleavages across a diversity of cases and the historical nature of the theory made it difficult to remake the theoretical framework in order to apply it to this new context. Instead, an internal decomposition of its minor elements took place, alongside diversification in its application. 
In this way, studies opened up the option of cleavages based more on recent social fissures and divisions generated in society by the political parties (Przeworski and Sprague, 1986).
This could be called a politics-centered solution, as this gives a preponderant role to agency in the political sphere in order to explain the course that the cleavage takes. In addition, there can be a ‘societal’ or a ‘vis-à-vis’ solution (Torcal and Mainwaring, 2003). The first remains loyal to the sociologically based theory postulated by Lipset and Rokkan and continues to attribute the most important part of the explanation of political cleavage to the social conflicts and their mobilization (Rose, 1968). The latter gives more autonomy to the political sphere than the traditional Lipset and Rokkan analysis, but without rejecting the importance of social phenomena as conditioning the agency of politics (Knutsen and Scarbrough, 1998, 2003; Bartolini and Mair, 1990). There is, however, another dimension of this debate, which refers not to the role of each pole of the cleavage (the social structure and the political sphere), but to the components that constitute it. The politics-centered solution not only gives a special place to the political conflict in creating a social fissure, but can also lead to an explanation of party competition that is not based on societal dynamics. Therefore, the cleavage studies had to be opened to the analysis of cases where only evidence for ‘less than a cleavage’ can be found. That means that not all the basic components needed to describe a cleavage may exist. The most typical cases of this are the ‘position divide’, where politics and political behavior are in conflict around a topic without a basis in social institutions, and the ‘issue divide’, where a social institution does not correspond to a structural social fissure (Deegan-Krause, 2007). Given that opening to ‘less than a cleavage’, the traditional concept of cleavage, constituted by the components present in the works of Lipset and Rokkan, can be called a ‘full cleavage’ (Deegan-Krause, 2007: 3). 
In sum, studies tended to adhere more and more to the politics-centered explanation of cleavages. The evidence showed the growing importance of the political sphere’s
autonomy in the most socially articulated cases (Franklin, 1992). At the same time, there was evidence that, even when structural determinations of political preferences were found, these were expressed more in the form of differences of opinions or ideas than in compact isolated social groups, as with the classic cleavages (Kriesi, 1998). This shift is not isolated from reflections that have been made in sociology referring to the emergence of amplified social reflexivity (Giddens, 1979), the consequent difficulties for a general narrative that articulates politics as a form of social coexistence (Touraine, 1998) and its repercussions for national states (Garretón, 2015). Most of these can be framed within the problem of hybridization (García Canclini, 2005). All of this constitutes reflections about the complexity that social and political phenomena progressively acquire and the difficulty of finding linear relations between these spheres, as classical cleavage theory did. As mentioned at the beginning of this chapter, for sociology, social structure can, in some cases, be hard to isolate in time and space. The context described tends to reinforce that fact. While there is an understanding of how social structure is present and determines social dynamics, it is impossible to observe it acting by itself, not interlaced with other structures or phenomena. At the same time, there must be consideration of how the social transformations mentioned, which create new problems for cleavage studies and social theory, not only weaken traditional structures but can also make space for new ones. Globalization serves as a good example. While it intensifies the hybridization of cultures, weakening the expression of structural determinants in social collectives, it generates new ways of conditioning social phenomena because of the economic and political factors that become stronger in this process (Calderón, 2004). 
The case of Brexit and the rise of right-wing populism has generated a debate around its causes that gives great importance to this new type of constraint (Inglehart and Norris, 2016).
The same debate is also a good example of the crisis of representative democracies at the level of the nation State, due to the several identities and demands that they have to process and their limited range of action to confront global phenomena (Touraine, 1998; Rosanvallon, 2008). However, this is not sufficient to say, as the supporters of the ‘end of cleavages’ thesis do (Franklin, 1992), that social structure no longer has an influence on political phenomena. The diversification of social structures, which cannot be expressed in simple polarized groups, still affects political conflicts in more subtle ways (Enyedi, 2008). If the changes in social structure explain most of the political shifts that lowered the intensity of cleavages (Rose, 1968; Kelley et al., 1984), this is still a structural influence on the political sphere. Given the complexity that social structure progressively acquires, the task is not to give up its relation with politics but to open up the theory and search for more diversified measurements.
The challenge of new dimensions
For cleavage studies, the social structure dimension has become a problem as social structure has diversified and grown more complex, making its influence over politics harder to trace (Rose, 1968; Kelley et al., 1984). Besides the problems that these changes in social structure have brought up, scholars have also seen an enrichment of the cultural sphere of Western societies. This has given culture a preponderant place in the explanation of political conflicts and alignments, as happens with the already mentioned ‘postmaterialist values’ conflicts (Inglehart, 1985), all to the detriment of the classic, structurally-rooted cleavage concept (Oddbjörn & Scarbrough, 2003). But as we have seen, postmaterialist values can have a structural explanation for their emergence, and therefore at least the hypothesis of a structural determination
that they take in a given context must be admitted (Kriesi, 1998). We must also consider new issues apart from the central fissures that characterized traditional cleavages. The first topic that should be analyzed is the gender problem. This intersects the current cleavage problems on all its levels, affecting at least half of the population (Risman, 2004). The structural character of gender lies in its power to determine several aspects of life, from the way people access the job market (or not) to the roles assumed in the division of household labor. There is evidence of the relevance of gender in political alignments. During the 1980s, in Western democracies women tended to vote more conservatively, which was mostly explained by the different relationships that women and men had with labor and religion (de Vaus and McAllister, 1989). A few decades later, the transformation in gender roles and other changes inverted the gender gap in voting, showing women to have a more leftist preference than men in Western democracies (Inglehart and Norris, 2000). Given this evidence, and regarding the differences with the classic cleavage studies, it can be said that gender as structure can determine political alignments in every context. Going further, for the cases used as examples, it could be hypothesized that the changing relationship between women and the construction of nation (Yuval-Davis, 1998) can explain this change from conservative to left-wing voting. If the hypothesis is correct, it shows how, even for such a powerful structure as gender, other structures have to be highlighted to understand the configuration of a cleavage. It also indicates how a change in social structure does not necessarily directly or only affect the dynamic of cleavage, but also affects the context in which cleavages unfold. 
In this case, the way in which women relate the social sphere to politics by acquiring a different social status is relevant to understand not only a later change in the cleavage, but also the change in the type of political agencies of which they are capable.
However, as opposed to the case just explained, it must be remembered that opening a theoretical approach that considers this linkage between social structure and political alignments does not always mean looking at the constitution of a cleavage; it can also mean looking only at parts of it (Deegan-Krause, 2007), in which case the relation we are describing has much more explanatory power. In other words, discarding the cleavage hypothesis does not necessarily mean discarding structural determination altogether. This can help to understand why it is that, while gender structures explain much of the participation in gender struggles, as can be seen in the preponderantly female composition of feminist mobilizations or voting, they do not structure the entire party system as the classical cleavages studied by Lipset and Rokkan did. This can also be said of other dimensions, which have effects on what has been called a crisis of party representation. A different case of structural determination that must be addressed concerns indigenous populations or ethnic conditions. The first difference between this factor and gender lies in the differing weights that this condition acquires in a particular society, while gender always has the same importance in social structure (Hayes et al., 2000). The condition’s potential as a determinant of political alignments depends on this weight in a given society. An example of this can be seen by comparing the Ecuadorian and Bolivian cases. While in Ecuador around 7% of the population belong to an indigenous culture, this number reaches 40% among the population of Bolivia. The case of Bolivia is a powerful example of the potential for ethnic factors to generate a strong cleavage in local and national politics (Guzmán and Rodríguez, 2018). 
Here the indigenous population not only has unusual strength in explaining voting behavior and determining elections; it has also constituted a long-lasting social movement, which has sustained the government of Evo Morales over three terms and started several democratic reforms that reconfigured its relation with the State (Gamboa Rocabado, 2010).
In the context of the crisis confronted by ‘full cleavage’ theories, it is necessary to ask why indigenous movements are sufficiently strong to be configured as a cleavage of this type, as in the Bolivian case. The answer may lie in the more traditional, less hybrid structures of that society, and especially in the indigenous communities that serve as a basis for the movement. If it is assumed that the phenomena of globalization, for example, tend to fragment social structure and weaken its link with politics (Garretón, 2015), it is possible to expect greater strength of the ties that link this social basis with the political sphere. Besides, a more homogeneous community, without the social dispersion generated by hybridization (García Canclini, 2005) and modernity’s fragmentation (Touraine, 1998), can give more strength to social conflicts due to the strong membership of subjects in their social groups, and then give more substance to the cleavage. The ethnic condition has also given rise to an indigenous movement in Ecuador, but one with less strength than in Bolivia. In terms of electoral power and reform programs, the ethnic cleavage tends to occupy a less important place in Ecuador than in Bolivia (Lalander and Gustafsson, 2008). However, it is interesting to see that while the relevance of the two indigenous movements varies, they correspond in both cases to what has been called a ‘full cleavage’, in the sense that they are composed of a social fissure, generate a social closure and express themselves in political choices. This sheds light on the complexity of social processes that must be apprehended by political cleavage studies. They can identify the components of a determined cleavage – or ‘something less than a cleavage’ (Deegan-Krause, 2007: 26) – but an overview of a given political context must take account of the several cleavages that can compose it.
For example, although the indigenous cleavage can be studied in Ecuador, it obviously does not explain the most important part of the political alignments in the country. Another interesting debate refers to the place of social structure in post-materialist
political conflicts, which experienced a boom with the emergence of new social movements (Inglehart, 1990). The main theory concerning this problem explains that, even when post-materialist problems are in fact structurally rooted, they are disputed at discursive levels so abstract that only certain factions of class structures get involved in the conflicts (Habermas, 1981; Offe, 1985). Empirical studies have shown these tendencies in some cases, where a ‘new’ middle class shows more sympathy for new social movements articulated around post-materialist values (Kriesi, 1989). In this sense, ‘post-materialist conflict’ seems to need a ‘materialist’ or ‘structural’ base if it is accepted (as some evidence shows) that such conflicts tend to emerge where a strong educated middle class exists, especially if the traditional working class loses strength. For the same reason, the translation of this problematique to Latin America and other developing countries is not automatic and the causes tend to vary from those seen in the first world. In the Third World, post-materialist conflicts tend to mix with material and structural conflicts due to the hybridization of local cultures in the context of globalization, or due to cultural exclusion, which also carries material implications (Wickham-Crowley and Eckstein, 2017). Translating this to the cleavage politics problem, two relevant issues emerge. First is the question of whether a cleavage configured around these topics can acquire strength and involve a social basis, when the interest in conflicts of this kind depends on the educational level of the electorate. The risk of elitization of politics is obvious, and the configuration of the cleavage probably will not imply a social fissure. Second is the change in the social composition of the electorate for parties that appeal to this topic. In fact, for Kelley et al.
(1984) the decline of the working class forced left-wing parties to embrace post-materialist conflicts, recruiting members and voters of different social extractions. Similar tendencies are studied by Knutsen (1990) in the Norwegian case.
Given the described characteristics of these conflicts and the elitization of politics that they entail, it is not hard to imagine that here lies a clue to understanding the right turn of working classes in some countries, linked to conservative parties for ideological or clientelistic reasons, but distanced from the left (Oesch, 2008). A historical process, then, can be seen in the background of the dynamic highlighted by cleavage politics. Although the essential aspect of a cleavage study lies in identifying the roles that the three main components (i.e. a social-structural fissure, social institutions and party alignments) play, this also needs to be complemented by other cultural or historical considerations that allow a better understanding of the processes studied. In fact, this is what most studies do when trying to comprehend a concrete cleavage dynamic: they review the previous cleavages that took place in the political system already mentioned, the strategies of political elites (Enyedi, 2005) and/or structural changes experienced in the country (Kelley et al., 1984). This is why it can be said that while cleavage studies are necessary, they are not enough for the full understanding of a given political process. In this regard, Latin American societies, particularly in their 20th-century political developments, must be mentioned. When contrasted with Western democracies they exhibit characteristics that cannot be fully comprehended by cleavage studies. In particular, the State in Latin America played a different role: it also generated and modified social structures. This situation, which has created a symbiosis between the State and diverse subjects, requires broader study, as proposed in the socio-political matrix approach (Garretón et al., 2003).
In fact, some processes of cleavage disarticulation in this context can be well understood through this approach, which considers structural, cultural and political transformations that determine the ways in which political alignments are generated (Garretón, 2014).
Therefore, there is a need to complement cleavage studies with ad hoc approaches for a better understanding of the given phenomena.
Political Regimes, State, Globalization, Social Change
The described problem of social structure and the current discussion in sociology about its political implications exceed the concept of cleavage with which political science has opened this dialogue. There are several other concerns about politics that have attracted the attention of sociologists when studying social structure. An example of this is the relation in sociology between social structure and political regimes. This has been postulated since Marx’s studies of feudal economic and political regimes and his comparison with ‘bourgeois democracy’ (Marx, 1998), and is also present in Barrington Moore’s studies on the relation between political regimes and socioeconomic conditions (Moore, 1966). The long-lasting nature of this linkage in sociology can be traced even in Latin America. The most influential works of the region’s sociology during the 1970s, the dependency theories, studied how the economic configuration of the region gave rise to certain types of State, with the inclusion and exclusion of social actors in the rising democracies (Cardoso and Faletto, 1979). The linkage is therefore seen as a constraint that social structures, in class and economic terms, impose on the types of political regime. In fact, almost all the debates about the type of democracy during the 20th century in the region are traversed by the study of the type of economic regime, its duality (with a modern and an almost feudal type of production based in the agrarian and mining export sectors) and the range of action for change provided to the political regime based on those conditions (CEPAL, 1963). Those elaborations understood economic development not as linear progress, but as a conflicted process in
which actors were more or less favored by the possible types of development depending on their social status in the structure, which at the same time gave them different possibilities for political action and change. It was impossible, then, to understand the type of democracies and states without understanding social structure. Related to this is the problematization of populism as a way of incorporating new sectors of the population into modern society and democracy (Di Tella, 1965; Germani, 1973). Also, during the 1980s, discussion about democratic transitions considered the limits set out by the economic sphere as a structural determination acting over the desirable new regime (O’Donnell, 1979). In that period, even the tendencies that rejected Marxism and stopped arguing in favor of the predominance of the material conditions ‘in the last instance’ continued to ask about the relation between social structure and possible political regimes (Garretón, 1989). These complexities can be considered not only due to the diverse conditions that social structure tends to put in place over politics, but also for the discussion on the cleavage concept which it creates. In fact, the transition of democracies in Latin America has in some cases generated cleavages that articulate the problem of the type of political regime preferred after the end of dictatorship (Tironi and Agüero, 1999; Valenzuela, 1999). For a sociology that problematizes the dependency on social structure of the type of political regime, this is more than just a problem of political alignments, and opens a very rich and complex field of study. There are some examples of other political phenomena that cannot be understood without reference to social structure. 
Among them can be mentioned globalization and the generation of new local, national or transnational structures; the nature of the State as agent of development or as the space of domination or compromise between actors; social change as revolution or reforms; emancipation of the ‘oppression structures’ mainly, not exclusively, when it is considered as a criticism of patriarchal structures and
the need to overcome them (Yuval-Davis, 1998); and social movements emerging from a structure or as a political creation to overcome the structure, or installation of democracy as breakdown of dictatorship. However, structure by itself is unable to provide all the required explanations for these problems. In this sense, in the current social structure debate, the empirical discussion has discovered more complexities for which the Blau concept is still useful, but not enough for its comprehension.
Conclusion
The dilemmas reviewed here, which current social structure theories pose for political studies, point to the need for an extended concept of social structure and social agency on which these studies can draw, without their approaches being restricted by the limits that long-range social theories can imply. This is a central issue in seeking a dialogue between sociology, with its notions of social structure, on one side, and political science, with its studies of political alignments, for example, on the other. It is necessary, while considering the contributions made by the long-range social theories, to look forward, to rescue the possible forms that social structure can take along their precepts. In this way, one can avoid a definitive, rigid concept of social structure or social agency, but have general and diverse guidelines regarding the possible explanatory power of that concept in the study of a particular phenomenon. Although the criticism can be raised that this threatens the consistency of the theoretical approach, what is won is the possibility of dialogue with political science without adhering to the definitions of that discipline. In this sense, more than choosing a definitive concept of social structure or a definitive social theory that includes that concept, what is proposed here is to exploit the variations
that this problem has experienced in social theory as a toolbox. In that way, the conflictive notion of Marxist social structure can aid an understanding of some historical moments of rising social tension and consequent politicization, and the opposing ‘consensus theory’ of Parsons could allow an understanding of the adequacy or ‘discipline’ of certain social actors under a political regime. What seems to be the consensus around the problem of social structure is the idea of long-lasting elements, in relation to subjective wills that are imposed in a given society. This could be interpreted as material conditions as well as political institutions, cultural values or several other elements. With the flexible and open concept of structure in favor of which we have argued, a hypothetical social structure can have economic, social, cultural and political dimensions,6 or can refer only to one of these dimensions. In social dynamics, these dimensions penetrate one another and interrelate, while conserving degrees of autonomy depending on the case. The theoretical articulation between them under what has been called the ‘socio-political matrix’ (Garretón et al., 2003) can serve as a flexible concept of structure that rescues the diverse sociological contributions made around that concept and can also face the different, complex problems that emerge today in political sociology. This way of dealing with theory supports not only dialogue with political science, but also the study of contradictory, more heterodox and multi-conditioned phenomena in politics through the view of sociology. Currently, as seen in this chapter, globalization processes limit the agency of the State and, through the acceleration of cultural hybridization, diversify the identities that lie beneath it. This implies, on one side, the narrowing of national states’ margin of action and, on the other, the widening of the types of demands made by the population.
To these effects of globalization must be added the growing complexity of class structures, which gives rise to multipolar conflicts. In this context, the study of national politics becomes too complex for a systematic theoretical approach of the kind that the classical cleavage theories sought. Middle-range theory and flexible concepts are not a definitive answer, but are at least a better way of asking the questions. Last but not least, this seems to be the only way of rescuing a sociological approach to the current debate around cleavage politics. The first cleavage studies of Lipset and Rokkan relied on a determined, but not explicit, understanding of the relation between social structure and agency. In this regard, it was assumed that social structure had a great influence on political agency, expressed also in social life (through social closure and institutions kept active by structurally determined subjects) and in political expressions through party alignments. As explained above, this mechanism was later weakened. The big question, therefore, was and still is: If there exists a structural determination of political alignment, how does it work, and how do the changes in the way it influences political phenomena affect what was called cleavage politics? Obviously, this question gets more complex as social structures fragment and as the identities that they used to produce disperse in the context of globalization and the boom of new means of communication. The complexity acquired by social structures in most of the cases studied points to the weakening of previously strong cleavages. Better living standards and higher educational levels tend to decrease the direct determination of social structure over subjective aspects and, consequently, agency. However, in this respect some considerations in favor of structural factors must be made. First, the structural character of educational levels and well-being implies structural differentiation and the presence these factors can still have in parts of the population.
Second, post-materialist values themselves have a structural character, through the levels of education and the professions associated with them, which implies the presence of structural factors here as well. What has changed, then,
is the strength that structure had, in the first cleavage studies, to manifest direct influence on political agency. The hypothesis of influence in given cases must not be discarded, however, but rather modified. Finally, the opening of new structural aspects not previously considered has turned out to be necessary in this context. With reference to agency, the discussion of cleavages gave emphasis to the role of elites in cleavage formation and maintenance. This implies a return to the political science approach in the same way that the sociological concept was involved in Lipset and Rokkan’s works, where the dynamics of political parties and actors can be as powerful as social processes in determined contexts. That is why a concept of social agency according to current cleavage studies must be as open to the study of social movements as it is to the capacity of political elites. In this respect, the possibility must also be considered that, in the political dimension of cleavages and the dynamics that only involve their political elite’s agents, low democracy logics are generated (Gills and Rocamora, 1992), and, therefore, a lower strength of the sociopolitical articulation. The problem here is to study how social conflicts relate to politics that cannot always process them (Ruiz, 2015). This could also explain the de-politicization that some Latin American democracies experienced during the reconstruction of sociopolitical links after authoritarian rule – a reconstruction in which political elites usually played a major role (Garretón et al., 2003). There are more explanations for political conflicts than are expressed in the traditional cleavage scheme. This cleavage perspective, however, has important comparative potential and therefore can be employed for this purpose while being supplemented with other approaches for a deeper understanding of given cases. 
The diverse hypotheses discussed here regarding changes in social and political dynamics oblige us to consider new factors, not previously taken into account, that operate in and over cleavage politics.
And this, among other shifts, leads cleavage theories to articulate their scheme of political conflicts with post-materialist and new issues (Offe, 1985). Therefore, the question emerges whether it is necessary for cleavage studies to be open to this problem to apprehend the nature of today’s politics, as an expression of new social articulations that can give place to political alignments. If this option is chosen, then it is imperative to return to sociology for cleavage studies. In sum, the main conclusion of this chapter is that it is important to open the theoretical framework on social structure, agency and actors and cleavage politics to different historically situated hypotheses in every case, in opposition to the idea of the disappearing influence of social structure, or closure of the debate on some fixed political concepts.
Notes
1 However, it must be said that in Marx’s works structure is at the same time the steady component that limits the possible agencies and events in a given context, and the engine of the social processes of change.
2 In the words of Merton, Blau’s concept of social structure tends to be more like a ‘middle range theory’ (Merton, 1968), useful to study cases but without the background of a comprehensive social explanation.
3 The concept of post-materialist values or post-materialist conflicts points to the idea of values or conflicts not centered on material conditions of life or determined by them. However, the notion of conflicts without a materialist basis was already present almost a century ago in the work of Gramsci, among others, where conflicts not only based on the material world or immediately economic are mentioned (Gramsci, 1992). The ‘post’ prefix in the concept can imply an evolutionist perspective that is hard to share. Nevertheless, it will be used here because it has been part of the commonly used language of the reviewed debate, and its discussion would exceed the objective and extension of this chapter.
4 For a better comprehension of Bourdieu’s ideas here, it can help to point out that when talking of the ‘doxa’ structuring social camps, the author has affirmed that he was looking for a different concept than ‘ideology’ (Bourdieu and Eagleton, 1991).
5 Although Lipset and Rokkan propose a fourth cleavage, the one between Church and State, it is doubtful whether this can be considered as a social structural determinant as defined here.
6 That is the case, for example, for traditional economies where economic relations depend on cultural constraints that determine very specific roles in economic institutions (Habermas, 1975).
References
Archer, M. (1995). Realist Social Theory: The Morphogenetic Approach. Cambridge: Cambridge University Press.
Archer, M. (2003). Structure, Agency and the Internal Conversation. Cambridge: Cambridge University Press.
Bartolini, S., & Mair, P. (1990). Identity, Competition and Electoral Availability. Cambridge: Cambridge University Press.
Blau, P. (1975a). Introduction. In P. Blau, Parallels and Contrast in Structural Inquiries (pp. 1–20). New York: Free Press.
Blau, P. (1975b). Parameters of Social Structure. In P. Blau (ed.), Approaches to the Study of Social Structure (pp. 220–53). New York: Free Press.
Bourdieu, P. (1998). Practical Reason. Stanford: Stanford University Press.
Bourdieu, P., & Eagleton, T. (1991). ‘Doxa and Common Life’, Talking Ideas series. ICA, London.
Calderón, F. (2004). ¿Es Sostenible la Globalización en América Latina? Santiago, Chile: Fondo de Cultura Económica.
Cardoso, F. H., & Faletto, E. (1979). Dependency and Development in Latin America. California: University of California Press.
CEPAL. (1963). The Social Development of Latin America in the Postwar Period. Santiago, Chile: CEPAL.
Dahrendorf, R. (1959). Class and Class Conflict in Industrial Society. Stanford: Stanford University Press.
De Vaus, D., & McAllister, I. (1989). The Changing Politics of Women: Gender and Political Alignment in 11 Nations. European Journal of Political Research 17(3): 241–62.
Deegan-Krause, K. (2007). New Dimensions of Political Cleavage. In R. J. Dalton & H.-D. Klingemann (eds), Oxford Handbook of Political Behaviour (pp. 538–56). Oxford: Oxford University Press.
Di Tella, T. (1965). Populismo y Reforma en América Latina. Desarrollo Económico 4(16): 391–425.
Durkheim, E. (1965). The Division of Labour in Society. London: The Free Press.
Durkheim, E. (1982). The Rules of the Sociological Method. New York: The Free Press.
Enyedi, Z. (2005). The Role of Agency in Cleavage Formation. European Journal of Political Research 44(5): 697–720.
Enyedi, Z. (2008). The Social and Attitudinal Basis of Political Parties: Cleavage Politics Revisited. European Review 16(3): 287–305.
Franklin, M. N. (1992). The Decline of Cleavage Politics. In M. N. Franklin, T. Mackie, & H. Valen et al. Electoral Change: Responses to Evolving Social and Attitudinal Structures in Western Countries (pp. 383–405). Cambridge: Cambridge University Press.
Gamboa Rocabado, F. (2010). Transformaciones Constitucionales en Bolivia: Estado Indígena y Conflictos Regionales. Colombia Internacional 71: 151–88.
García Canclini, N. (2005). Hybrid Cultures: Strategies for Entering and Leaving Modernity. Minnesota: University of Minnesota Press.
Garretón, M. A. (1989). The Ideas of Socialist Renovation in Chile. Rethinking Marxism 2(2): 8–40.
Garretón, M. A. (2002). La Transformación de la Acción Colectiva en América Latina. Revista de la CEPAL 76: 7–24.
Garretón, M. A. (2014). Las Ciencias Sociales en la Trama de Chile y América Latina. Santiago: LOM.
Garretón, M. A. (2015). La Sociedad en que Vivi(re)mos. Santiago: LOM.
Garretón, M. A., Cavarozzi, M., Cleaves, P. S., Gereffi, G., & Hartlyn, J. (2003). Latin America in the Twenty-First Century: Towards a New Sociopolitical Matrix. North-South Center Press/University of Miami.
Germani, G. (1973). Democracia Representativa y Clases Populares. In G. Germani, T. S. Di Tella, & O. Ianni, Populismo y Contradicciones de Clase en Latinoamérica (pp. 12–37). México DF: Serie Popular Era.
Giddens, A. (1979). Central Problems in Social Theory: Action, Structure, and Contradiction in Social Analysis. California: University of California Press.
Gills, B., & Rocamora, J. (1992). Low Intensity Democracy. Third World Quarterly 13(3): 501–23.
Giner, S., Lamo de Espinoza, E., & Torres, C. (2006). Diccionario de Sociología. Madrid: Alianza Editorial.
Gramsci, A. (1992). Análisis de situaciones y correlaciones de fuerza. In A. Gramsci, Antología: Selección y traducción de Manuel Sacristán. Madrid: Siglo XXI.
Guzmán, G., & Rodríguez, F. (2018). Voto Étnico en Bolivia. Cohesión, Disgregación y Clivajes Étnicos. Política y Gobierno 25(1): 65–100.
Habermas, J. (1975). Towards a Reconstruction of Historical Materialism. Theory and Society 2(3): 287–300.
Habermas, J. (1981). New Social Movements. Telos 49: 33–7.
Hayes, B., McAllister, I., & Studlar, D. (2000). Gender, Postmaterialism and Feminism in Comparative Perspective. International Political Science Review 21(4): 425–39.
Inglehart, R. (1977). The Silent Revolution: Changing Values and Political Styles among Western Publics. Princeton, NJ: Princeton University Press.
Inglehart, R. (1985). Aggregate Stability and Individual-Level Flux in Mass Belief Systems: The Level of Analysis Paradox. American Political Science Review 79(1): 97–116.
Inglehart, R. (1990). Values, Ideology, and Cognitive Mobilization in New Social Movements. In R. J. Dalton & M. Kuechler (eds), Challenging the Political Order: New Social and Political Movements in Western Democracies (pp. 43–66). Cambridge: Polity Press.
Inglehart, R., & Norris, P. (2000). The Developmental Theory of the Gender Gap: Women’s and Men’s Voting Behavior in Global Perspective. International Political Science Review 21(4): 441–63.
Inglehart, R., & Norris, P. (2016). Trump, Brexit, and the Rise of Populism: Economic Have-Nots and Cultural Backlash. Paper for the roundtable on Rage against the Machine: Populist Politics in the U.S., Europe and Latin America. Harvard Kennedy School.
Kelley, J., McAllister, I., & Mughan, A. (1984). The Decline of Class Revisited: Class and Party in England, 1964–1979. The American Political Science Review 79(3): 719–37.
Knutsen, O. (1990). The Materialist/Post-Materialist Value Dimension as a Party Cleavage in the Nordic Countries. West European Politics 13(2): 258–74.
Knutsen, O., & Scarbrough, E. (1998). Cleavage Politics. In J. W. van Deth & E. Scarbrough, The Impact of Values (pp. 492–523). Oxford: Oxford University Press.
Knutsen, O., & Scarbrough, E. (2003). Cleavage Politics. In J. W. van Deth & E. Scarbrough, The Impact of Values (pp. 492–523). Oxford: Oxford Scholarship Online.
Kriesi, H. (1989). New Social Movements and the New Class in the Netherlands. American Journal of Sociology 94(5): 1078–1116.
Kriesi, H. (1998). The Transformation of Cleavage Politics. European Journal of Political Research 33(2): 165–85.
Lalander, R., & Gustafsson, M.-T. (2008). Movimiento Indígena y Liderazgo Político Local en la Sierra Ecuatoriana: ¿Actores Políticos o Proceso Social? Revista Venezolana de Estudios Territoriales 19: 57–90.
Lipset, S. M., & Rokkan, S. (1967). Cleavage Structures, Party Systems and Voter Alignments. New York: Free Press.
Lybeck, J. (1985). Is the Lipset-Rokkan Hypothesis Testable? Scandinavian Political Studies 8(1–2): 105–13.
Marx, K. (1998). The German Ideology. New York: International Publishers Company, Incorporated.
Marx, K. (2017). On ‘The Jewish Question’. San Francisco: Hebrew Union College – Jewish Institute of Religion.
Marx, K. (2018). Contribution to the Critique of Political Economy. New York: International Publishers.
Merton, R. K. (1968). Social Theory and Social Structure. New York: Macmillan.
Moore, B. (1966). Social Origins of Dictatorship and Democracy. New York: Penguin University Books.
O’Donnell, G. (1979). Tensions in the Bureaucratic-Authoritarian State and the Question of Democracy. In D. Collier (ed.), The New Authoritarianism in Latin America (p. 285). Princeton, NJ: Princeton University Press.
Oesch, D. (2008). Explaining Workers’ Support for Right-Wing Populist Parties in Western Europe: Evidence from Austria, Belgium, France, Norway, and Switzerland. International Political Science Review 29(3): 349–73.
Offe, C. (1985). New Social Movements: Challenging the Boundaries of Institutional Politics. Social Research 52(4): 817–68.
Parsons, T. (1999). El Sistema Social. Madrid: Alianza Editorial.
Przeworski, A., & Sprague, J. (1986). Paper Stones: A History of Electoral Socialism. Chicago: University of Chicago Press.
Risman, B. (2004). Gender as Social Structure: Theory Wrestling with Activism. Gender and Society 18(4): 429–50.
Ritzer, G. (1993). Teoría Sociológica Contemporánea. CDMX: McGraw-Hill.
Rosanvallon, P. (2008). Counter-Democracy: Politics in an Age of Distrust. Cambridge: Cambridge University Press.
Rose, R. (1968). Class and Party Divisions: Britain as a Test Case. Sociology 2(2): 129–62.
Ruiz, C. (2015). De Nuevo la Sociedad. Santiago: LOM.
Sartori, G. (1969). From the Sociology of Politics to Political Sociology. In S. M. Lipset (ed.), Politics and the Social Sciences (pp. 65–100). Oxford: Oxford University Press.
Tironi, E., & Agüero, F. (1999). ¿Sobrevivirá el Nuevo Paisaje Político Chileno? Estudios Públicos 74: 151–68.
Torcal, M., & Mainwaring, S. (2003). The Political Recrafting of Social Bases of Party Competition: Chile, 1973–95. British Journal of Political Science 33(1): 55–84.
Touraine, A. (1998). Can We Live Together? European Journal of Social Theory 1(2): 165–78.
Valenzuela, J. (1999). Reflexiones Sobre el Presente y Futuro del Paisaje Político Chileno a la Luz de su Pasado. Estudios Públicos 75: 273–90.
Wickham-Crowley, T., & Eckstein, S. E. (2017). Los Movimientos Sociales Latinoamericanos y la Ratificación de las Teorías Estructurales. In P. Almeida & A. Cordero, Movimientos Sociales en América Latina: Perspectivas, Tendencias y Casos (pp. 47–80). Buenos Aires: CLACSO.
Yuval-Davis, N. (1998). Gender and Nation. New York: Sage.
PART IV
Comparative Politics
41 Political Accountability
Yannis Papadopoulos
Introduction: Accountability as a normative desideratum and as a relational concept
Accountability penetrates many social spheres: children are accountable to their parents, spouses to each other, managers to shareholders, private employees to their bosses, public bureaucrats to their organisational superiors, suppliers to customers, students to teachers and researchers to funding bodies. This chapter will concentrate more narrowly on the accountability of the official public policy-makers, in other words ‘the whole personnel employed by the modern state’ (Schedler, 1999: 22) that wields power by producing collectively binding decisions. It will disregard (with a few exceptions) accountability issues involving other types of actors that participate – and are sometimes highly influential – in the policy process, such as academic experts. This chapter also focuses on accountability in democratic systems, although rule-makers in
autocratic regimes may not be completely unaccountable as they may need to seek formal or informal approval from the ruling party, powerful interest groups and other actors. The chapter will mainly highlight the political procedures related to accountability and will only tangentially devote attention to other forms of accountability to which rule-makers are subject, such as legal accountability to courts or financial accountability to auditing bodies. Few people would dare to stand against accountability. Accountability is a core positive value in public debates, and some even consider it ‘the über-concept of the twenty first century’ (Flinders, 2014: 661), or as a ‘chameleon-like’ term (Mulgan, 2000: 555) that is now equated with all kinds of aspects of ‘good governance’. Let us first note that accountability is not necessarily related to democracy: democratic accountability is a subset of all possible accountability relations. This is evident in the following definition from two scholars from the field of international
relations, in which standards of democratic accountability cannot easily apply: Accountability … implies that some actors have the right to hold other actors to a set of standards, to judge whether they have fulfilled their responsibilities in the light of these standards, and to impose sanctions if they determine that these responsibilities have not been met. (Grant and Keohane, 2005: 29)
Accountability is a multidimensional concept: Those who study it need to flesh out who is accountable to whom, for what,1 how (through what kind of processes and with what kind of standards),2 and possibly with what kinds of consequences. For analytical purposes it is helpful to view accountability as a social mechanism of relational and communicative nature that connects individual or collective policy actors to accountability ‘forums’ in deliberative (sometimes also bargaining) processes, usually under the threat of sanctions by the forums in case of the policy actors’ estimated misconduct or poor performance. Accountability is, therefore, closely related to power: being able to hold someone accountable indicates a privileged position (Waldron, 2014: 3), and the same applies if one can escape accountability. Further, even if the monitoring of actors by forums may be concomitant to their action and if policy-makers anticipate the accountability phase, accountability fundamentally takes place ex post. Thus, accountability is defined as a relationship between an actor and a forum, in which (1) the actor has an obligation to explain and justify his or her conduct to the forum by providing information about procedures, performance or outcomes (answerability); (2) a debate may ensue and the forum can pose questions, contest and pass judgement (the relationship may be more or less dialogical and confrontational); and (3) at the end of this (stylised) ‘time-line’ (Lindberg, 2013: 212) the actor may face positive or negative consequences, depending on the forum’s evaluation (enforceability)
(Bovens et al., 2014: 9). It is stating the obvious that ‘real-world’ accountability does not always function as formally prescribed ‘on paper’, which applies to all sequences of the accountability process. One should distinguish between de jure (in books) and de facto (in action) accountability, and forums endowed with formal oversight tasks may just be ‘paper tigers’, while forums that only informally perform an accountability role may prove not to be toothless. The effectiveness of accountability depends on the power balance between the involved actors. The properties and resources of forums can be decisive: on the one hand, a forum endowed with moral authority may induce compliance, even without coercion. On the other, actors may face collective action problems when establishing forums, and apart from formal competence and sanctioning power, forums may require expertise to process information and may also be bounded by the ‘bottleneck of attention’. There are basically two sources legitimizing a forum to exercise prerogatives with respect to political accountability. The first source is when the forum is a ‘principal’ that has previously delegated (some of its) prerogatives to an agent. Being in such a delegation relationship, the agent becomes accountable to the principal, and accountability is then based on ‘ownership’ (Bovens et al., 2014: 5). Typical examples are the accountability of elected officials to their constituencies, of members of the bureaucracy to their political superiors, of the leadership of interest group representatives to the rank and file, and so on. 
The second source is affectedness: those who convincingly argue that they are (deliberately or not) affected by the outcomes of policy outputs can claim – and even more if they have not participated in the policy process and are subject to externalities – that they have a legitimate right to hold output producers to account.3 As noted before, apart from these forms of political accountability that are respectively based on authority and stakeholder legitimacy (Bovens et al., 2014: 3),
public decision-makers may be held to account by (in principle) independent and impartial third parties, such as courts.
Representative government: the crucial function of competitive elections
In representative democracies the existence of competitive elections is the core mechanism that should ensure the accountability of the incumbents and, therefore, their responsiveness to the citizenry. They are at the same time a mechanism of authorisation to represent based on electoral pledges – ex ante ‘promissory’ representation (Mansbridge, 2003) – and a mechanism of ex post accountability that casts its shadow over decision-makers. The fear of the verdict of the ‘tribunal of public opinion’ (according to Jeremy Bentham’s famous metaphor) acts as a counterforce to the power of the rulers, and the risk of exposure of misconduct to that fictional tribunal is a disciplining device. The anticipation of future competitive elections induces representatives to behave responsively if they want to avoid being ‘thrown out’ as ‘rascals’ (Manin, 1997). In democracies agents are only temporarily and conditionally authorised to act in the name of their principals. Ex post accountability follows ex ante authorisation and acts as a safeguard for decision-makers’ responsiveness to the preferences of voters (although it is not always conceptually clear whether the preferences that matter are those of particular electoral constituencies, the majority, the median voter or the citizenry at large): During electoral campaigns, candidates make various promises and, in electing a candidate, constituents authorize the candidate to carry out these promises in his or her capacity as an elected official. If representatives violate their promises, acting contrary to what they have been authorized to do, then constituents can hold them accountable by voting them out of office in the next election. (Volmert, 2012: 299)
The normative attractiveness of this standard model of accountability relies on its assumption of a direct line upward from citizens to the government and downward from the government to society (Hupe and Edwards, 2012: 182). It is a circular model – from voters to representatives and vice versa – and assumes that representation is ‘thermostatic’: governments and citizens, respectively, adjust their policy choices and their policy preferences. For accountability to ensure responsiveness, decision-makers need to anticipate the risk of electoral sanction and integrate it in their calculations (an operation that those with negative views of party politics call ‘electoralism’ and associate with short-sightedness). It is the threat of being dismissed from office that forces the incumbents to be responsive to citizens’ preferences, especially if they believe that the outcome of the forthcoming election is uncertain. What is more, the political communication literature that highlights the increasing mediatisation of politics suggests that parties and politicians are nowadays under strong pressure to continuously justify their conduct to the public. Nevertheless, the role of elections as an accountability mechanism should not be idealised. Unlike policy referenda, electoral choice is a very crude indicator of voters’ policy preferences. Moreover, for electoral accountability to operate effectively, a condition is that voting is retrospective. Citizens need to be primarily motivated in their voting choice by their desire either to reward or punish the incumbent government. This may be a bold assumption: ‘How are voters, whose utility is derived from outcomes, to decide among parties that offer policies?’, questions Przeworski (1998: 143; italics in the original). 
Against too narrow a conception of voters’ instrumental rationality, one should also consider that ‘loyalists’ with a strong attachment to the incumbent party will be less likely to sanction it, even if they are dissatisfied. Accountability may also be undermined if voting is primarily prospective – that is, if voters do not make their choices based on their evaluation of
the past record of the power holders, but are forward-looking and trust above all candidates’ promises about the future. The mixture of voting motives may have as a consequence that a negatively evaluated government (retrospective vote) stays in power if voters find that the available alternatives (prospective vote) are worse. Furthermore, the nature and dynamics of the party system mediate the role of elections as an accountability mechanism because multiparty systems tend to lead to coalition governments that reduce the clarity of responsibility. The formation of such governments is the outcome of party negotiations that escape voters’ control.
Beyond electoral accountability: refining the principal–agent model of delegation and control
Accountability is not limited to the relation between elected officials and voters. Parliamentary systems are characterised by the existence of a chain of delegation: voters delegate their power to MPs; the parliament delegates some of its powers to the executive, which delegates some of its prerogatives to the bureaucracy, which has its own internal chain of command. This chain of principal–agent relations is paralleled by a chain of accountability that operates in the reverse direction: street-level bureaucrats are accountable to more senior civil servants and the latter to their political superiors; these officials are accountable to the prime minister, who is accountable to the political majority that supports the government; and the latter is accountable to the citizenry (Strom et al., 2006). Of course, the existence de jure of a formal chain does not guarantee that all elements of the chain operate as expected, as this depends on the resources possessed by the forums and the credibility of their sanctioning capacities. For example, it has been widely argued that a consequence of the increased complexity of
policy matters, as well as of the internationalisation of policy-making, is that parliamentary assemblies as institutions become less effective – not only in producing legislation, but also in holding the executive accountable. Some parliaments have more recently reacted to their loss of power and have chosen to ‘fight back’ against ‘deparliamentarisation’ (Raunio and Hix, 2000), but the capacity and willingness of parliaments to scrutinise their executives’ activities remain uneven, and this counter-trend generates its own accountability problems because parliamentary influence over governmental positions is primarily exercised informally (Auel and Benz, 2005). Another limitation results from the fact that the principal–agent model assumes that principals are able to express clear preferences and, therefore, formulate explicit mandates to their agents. The correct operation of the accountability chain also presupposes that agents’ conduct can easily be decoded by principals who are able to evaluate it unambiguously, whereas ambiguity obfuscates such relations in real life. The vertical chain of accountability may be supplemented by horizontal accountability mechanisms (O’Donnell, 1998), frequently to non-majoritarian institutions in order to safeguard the rule of law and to protect minorities and individuals from abuses of power and violations of their rights. ‘Horizontal’ refers to formal accountability relations between institutional actors with roughly equivalent power. In bicameral systems with two chambers endowed with similar competencies, for instance, each chamber is in a sense accountable to the other, as it needs to convince the other about the pertinence of its decisions in order to prevent a veto. Even if some horizontal accountability mechanisms may involve legal actors – such as accountability to courts – they frequently have a political impact, such as whenever courts act as ‘veto players’ by striking down pieces of legislation. 
The so-called judicialisation of politics means that judges become de facto co-legislators because the scrutiny
right of courts forces the legislator to anticipate their verdict. Although courts are usually considered as ‘negative’ legislators, thanks to their veto power, it appears that they are more than that because they make recommendations (which are often given due consideration by the official legislators) as to how the incriminated laws should be revised. In other words, like the shadow of future elections that induces the incumbents to be responsive to the voters’ preferences, the consequence of the shadow of court rulings is that the (alleged) preferences of the judiciary are endogenised by elected officials. As this may conflict with the preferences of democratic principals, elected officials may be caught in accountability dilemmas. These developments have given rise to controversies regarding the quality of democracy: they are welcomed by those defending a liberal–constitutionalist conception of democracy who criticise ‘defective democracies’ for not providing sufficient countervailing powers to unrestrained majority rule (Merkel and Croissant, 2004), while others are worried about the fact that the formal separation of power is undermined, and about the danger of ‘juristocracy’ (Hirschl, 2007). To what extent should unelected accountability forums be able – in the name of impartiality and the public interest – to hold elected politicians to account? This is an open question.
Fire alarms and monitory democracy
Diagonal accountability relations should be given consideration, too. They refer to situations in which an actor is formally accountable to a forum that has no formal sanctioning capacity, but that may report to another entity that is hierarchically superior to the actor and does have coercive power. This endows the first forum with a de facto power to redress misbehaviour, as in the case of the ombudsman institution. In that respect accountability
is mediated (Warren, 2014: 49) and the forums are characterised by resource interdependence. This is also visible in the fact that, although forums may perform their control function directly by actively and systematically overseeing the activity of their targets, this requires investing considerable efforts, so that it might be less costly for forums to delegate their tasks. Holding an actor accountable incurs monitoring costs, and coordination (even though not necessarily deliberate cooperation) between forums can reduce them. To use a classic distinction, ‘police patrol’ arrangements reduce the risk of moral hazard and opportunistic behaviour because systematic controls are feared, but it may be cost-saving for forums to rely on external ‘fire alarms’ (e.g. the media, NGOs or experts that have, for their own reasons, an interest in disclosing and blaming misconduct).4 Therefore, control also becomes delegated, though de facto and not through a deliberate act of delegation. Relying on ‘fire alarms’ instead of using ‘police patrols’ to ensure accountability requires that the forum places trust in external actors; shifting scrutiny tasks may therefore also entail perils, as does delegation in general. Reliance on fire alarms means that ‘social’ accountability to informal forums may be a ‘second-best’ or ‘surrogate’ form of accountability in the absence of formal accountability forums and procedures. Perhaps the most important informal accountability forum nowadays is the media system, which acts in the name of public opinion. Forums such as the media are devoid of formal sanctioning power, but ‘watchdog journalism’ may be a safeguard against misconduct because the media act as intermediaries that alert actors with sanctioning power: some bark and others bite. 
In other words, forums with strong sanctioning capacities, but limited time resources (such as legislative oversight committees), can benefit from forums without sanctioning capacities yet with strong informational capabilities (such as investigative media). Reliance on fire alarms for accountability
illustrates the advent of ‘monitory’ democracy, in which the action of political officials is under the continuous scrutiny of monitory organisations and networks (Keane, 2009), and of ‘audience’ democracy (Manin, 1997), in which going public and alerting public opinion becomes a key feature of politics. This is no doubt a positive development for the vigour of democracy, although one may also deplore excesses that contribute to the erosion of public support for politicians. This happens, for example, when ruthless media outlets, driven by their own commercial logic, privilege negative reporting on blunders and scandals and search for scapegoats, with the risk that this behaviour unfairly destroys the reputation of honest politicians (Papadopoulos, 2013: 49–64). More generally, the media may be biased in their role as accountability forums because their attention is primarily drawn to those issues that are sufficiently salient to be newsworthy. Further, a parallel can be drawn to some extent with the debate on ‘surrogate representation’ (Mansbridge, 2003), in which NGOs act as ‘surrogates’ for the populations whose wellbeing is of concern to them. In the case of surrogate representation, Saward emphasises the need for ‘chosenness’, which means the obligation to ‘successfully claim requisite degrees of representativeness’ (2011: 93), or, in other words, the acceptance by the supposedly represented constituencies (90) of aspirations to ‘authenticity’ as credible. In the case of surrogate accountability forums, they act in the name of other legitimate forums, such as public opinion, so there should be some (at least tacit) general consent on such a role devolved to surrogates. The problem is, however, that it is difficult for unorganized segments of the population to raise their voice against that of powerful surrogates such as the media. Moreover, surrogate forums also act in the name of intangible principles (e.g. the respect of human rights) and communities (e.g. 
future generations), and this prevents the authenticity test. As such forums are not formally mandated to
play their monitory role, they have no proper ‘licence to control’ from democratic principals; thus, their legitimacy to hold accountable is a matter of contention. Finally, the inherent accountability of monitory bodies, as in the case of NGOs, should be debated as well. Although their role as accountability forums has become more prominent, they are themselves not accountable to the general public and it is uncertain to whom they should be accountable: through the affectedness principle to those whose interests they claim to defend, or rather through the ownership principle and then primarily to their members or to their major funders?
The prevalence of complex and ‘messy’ forms of governance and accountability
The presence in contemporary political–administrative systems of horizontal and diagonal accountability relations along with vertical relations signals that accountability, even though considered as a correlate of delegation relations, is not confined to them. Therefore, the stylised principal–agent model is insufficiently equipped to capture the dynamics of really existing governance processes and, more specifically, the universe of significant accountability relations. The picture becomes even more complex if we consider the advent of ‘multi-level’ governance: decision-makers belong to different jurisdictional levels. Take the case of governance and accountability in the European Union system: democratically elected governments of EU member states are certainly accountable to their domestic constituencies, but they may also be held to account by supranational forums (for example, the Commission or the Court of Justice) and they need to explain and justify their behaviour to their European counterparts. These forums do not necessarily share the same preferences, which causes
accountability dilemmas: governments have to cope with ‘multiple accountability disorders’ (MAD) (Koppell, 2005). Moreover, some forums’ lack of sufficient democratic credentials may be of concern for normative reasons. More generally, contemporary systems of government are often characterised by high levels of complexity and even ‘messiness’. The process of governing is often interactive, entailing collaboration among various interdependent state and non-public agents. Political decisions are formulated or implemented through bargaining or deliberation in polymorphous networks, usually involving politicians, administrators, interest representatives, stakeholders and experts. In policy networks multiple principals with distinct preferences cohabit with multiple agents, while the latter are embedded in intricate accountability webs with many forums endowed with unequal resources and power. This complex ecology leads ‘to a more diversified and pluralistic set of accountability relationships’ (Bovens, 2007: 110), which especially affects informal (not legally codified) relations. Inside policy networks, ‘interdependence’ (Scott, 2000: 50–2) accountability tends to gain in relevance, sustained by the sharing of common norms and expectations (Romzek et al., 2012). Participants engaged in iterative cooperative relations need to trust each other that the commitments they make are credible and, subject to the disciplining power of ‘naming and shaming’ and the fear of loss of reputation, they accordingly adjust their behaviour to the expectations of their peers (Papadopoulos, 2010). Therefore, although informal and horizontal accountability relations are very different from formal political accountability ties, they are not less significant as regulators of one’s behaviour. 
However, again, the requirements of peer accountability may not coincide with those of accountability to democratic principals or to affected populations, whose preferences and expectations may diverge from those of the
network members. This scenario differs from a situation where fire alarms alert forums. In that case the scrutinising actors were interdependent and pooled resources to achieve accountability. In the case discussed now, there is a patchwork of overlapping and uncoordinated forums with different preferences, expectations and standards. One may expect that accountability dilemmas will be solved to the profit of the most powerful forum(s) or that the inflation of accountability demands will undermine effective accountability. Even if accountability to democratic principals remains a salient issue for participants in governance networks, network characteristics may prevent outsiders from appropriately evaluating what happens therein. Monitors have difficulty in penetrating such governance arenas situated at the ‘backstage’, and this is a limitation that is not given due consideration by those emphasising the advent of ‘monitory’ democracy. To be more specific, visibility is a precondition for accountability because it reduces informational asymmetries: if the accountability forums are confronted with hidden information, they will not be able to critically scrutinise the operation of governance processes. This does not mean that such processes are purposively concealed from the public, but that the structural conditions for visibility to outsiders may not exist. Network-like modes of governance are frequently informal and weakly codified and, perhaps more importantly, it is hard to judge in networks who is responsible for what, and how much (Considine and Afzal, 2011: 376), something that is referred to in the literature as the ‘problem of many hands’ (Thompson, 1980) or the ‘paradox of shared responsibility’ (Bovens, 1998: 45–52). The more policy networks are uncoupled from the arena of democratic politics, the higher the risk of difficulties or even errors in assigning blame. 
On the one hand, when responsibility is diluted, actors can more easily offload accountability and engage in ‘blame-shift games’ in order to avoid criticism or sanction for poor performance
(Hood, 2010). On the other hand, especially because of media attention, elected officials are highly visible targets for sanctions without necessarily being the key players in policy networks, in which their influence is challenged by interest group representatives, members of the bureaucracy and experts. In any case, the effectiveness of the democratic feedback loop decreases. Those who are formally accountable to the citizenry de facto delegate some of their power to actors who are not publicly visible and whose activities are not mediatised. Hence, the incumbents are held to account for outputs whose formulation or implementation escapes their control, at least partly, and those aspiring to office make pledges that will be difficult to fulfil because policy outputs also mirror the preferences of other influential actors who participate in the negotiation processes that take place in policy networks.
Administrative reforms and new accountability regimes
Political scientists are not usually very concerned about the components of the accountability chain beyond voters and elected officials. Administration scientists, by contrast, acknowledge that organisational complexity increased through the delegation of rule-making tasks to non-majoritarian institutions and expert bodies at arm’s length, and more generally with several layers of administrative reform under the banner of new public management and its epigones, such as ‘joined-up government’. There have been many debates as to whether the enhanced independence of public agencies entails their lack of accountability, or whether they are not uncontrolled since they remain accountable. Being typical examples of ‘output-oriented’ organisations, agencies need to justify their choices and convince various audiences about their contribution, so it would definitely be wrong to equate their
independence with a lack of accountability: reporting and auditing are important aspects of agencies’ agenda. Moreover, there seems to be an ‘autonomisation paradox’: autonomy is frequently accompanied by more stringent results-based controls, so that agencies perceive themselves as being more controlled than before, especially if they deal with salient topics such as food safety or the regulation of finance (Verhoest et al., 2010: 263). Even if the management of agencies is no longer directly accountable to the ministry, agencies have account-giving obligations to other forums. Although the accountability regime of each agency differs, agencies are part of an accountability web in which they are held to account by different forums on different matters. Most notably, they are subject to managerial surveillance by agency boards, to financial surveillance by auditing institutions and to legal surveillance by courts. In the European Union, national agencies become part of a multi-level system through their participation in European networks. In a sense they become parts of two administrations – the national and the European – which adds yet another layer to their accountability obligations, with the Commission frequently playing the role of the forum. One should add to these formal aspects a de facto obligation to justify policy to stakeholders (firms from the regulated sector, consumer associations) or to the media in the case of salient issues. As we know, such forums have no direct sanctioning power, but their support is necessary for agency legitimacy. Overall, as agencies do not seem to be unaccountable, the question is rather if their accountability regime is coordinated and does not induce accountability dilemmas, and to what extent and in what forms their accountability to democratic principals and to the public at large is safeguarded. 
Coming now to discussion of the impact on accountability of reforms of the traditional administration, we should first mention that many among them are also driven by an accountability agenda because they
aim at greater responsiveness to the needs of service users. A prominent example of new mechanisms of accountability related to administrative reform is the diffusion of the ombudsman institution, which has increased opportunities for citizens to oblige public organisations to justify their behaviour in decisions that directly affect them. With managerial reform, the senior staff of public organisations came to enjoy more discretion but also lost anonymity in a context of mediatisation, and their increased accountability became closely related to the evaluation of their performance: ‘Supported by quantitative techniques of evaluation, managerial accountability focuses on cost-effectiveness, output efficiency, results, and customer satisfaction, rather than institutional processes and formal procedures’ (Considine and Afzal, 2011: 376). Politicians in turn acquire better tools to make administrators accountable, while being able to blame managerial failure for their own errors. Accountability also supposedly moves ‘downwards’ and becomes ‘proximate’ (Warren, 2014: 49), with ‘users’ or ‘clients’ of public services becoming forums whose feedback should be valued. However, for such feedback to be effective, it often requires organisation and, unlike citizens’ votes, stakeholders’ ‘voice’ is not equally distributed. Moreover, when such feedback exists, its interpretation often lies in the hands of politicians and management who can use it as an instrument for their own strategic purposes. It should also be kept in mind that the layering reform process has not contributed to clarity regarding accountability because successive reforms have often been associated with different, if not contradictory, accountability requirements. Finally, administrative reforms have often led to the replacement of direct administration by contractual relationships with more or less independent suppliers of services. 
The accountability obligations of hybrid bodies endowed with public service tasks, such as partnerships involving for-profit or nonprofit organisations, are inevitably multiform,
conflicting and fuzzy, combining – for example – public with market, vertical with horizontal and formal with non-mandatory elements. In such cases, another issue is that it becomes difficult to resolve conflicts through hierarchical channels. What used to be internal bureaucratic disputes become ‘externalised’, leading to litigation. This is yet another driving factor of the empowerment of courts as accountability forums.
The accountability gaps in governance beyond the nation-state
Governance beyond the nation-state is often considered as the realm of intergovernmental organisations and negotiations. Such a negotiated decision-making process provides a window of opportunity to national executives to emancipate themselves from domestic constraints. For example, they can convince domestic forums – which are often poorly informed about intergovernmental negotiations conducted behind closed doors – that they had no other choice but to accept unpopular measures, shifting blame to unaccountable spheres beyond the national level and thereby weakening domestic governmental accountability. However, the issue of accountability obviously transcends national political systems, especially as the more prominent role of multi-level, supranational and even global (including private) forms of governance is insufficiently captured by traditional intergovernmental models, which fail to consider the complexity of the ecology of actors involved in governance processes. Beyond the state, the most sophisticated institutional creation is that of the European Union, which has become a supranational political force in its own right. Therefore, it cannot be reduced to its intergovernmental component, and this component – rule-making by democratically elected national governments – is not sufficient to secure
democratic accountability: each national government is only accountable to its domestic constituency, not to the citizenries of other countries affected by negotiations. This problem becomes acute in the case of asymmetric forms of intergovernmentalism, such as those that prevailed in the European Union for the management of the eurozone crisis. One needs, however, to go beyond the criticism that the European Union is plagued by a ‘democratic deficit’. Accountability must be disentangled according to the various loci of power of the EU decisional system because it is necessary to move ‘from assertions to assessments’ on the subject (Bovens et al., 2010: 174). Not only should the accountability regime of the core institutions, such as Commission, Council and Parliament, be considered; so should the regimes of more recent creations. Examples include the European Council of heads of state and government; a technocratic institution such as the European Central Bank, which substantially expanded its policy mandate during the crisis; or an informal governance body of the European Monetary Union such as the Eurogroup. One should add the policy role of less visible bodies: ‘backstagers’, such as the numerous comitology committees (composed of national experts and in charge of the implementation of European legislation), or ‘outposts’, such as the European agencies (Bovens et al., 2010). In a multi-layered governance system such as that in the EU, complexity ‘breeds opaqueness, indeterminacy, and creates incentives for executive improvisation, negotiation, and entrepreneurship’ (Bovens et al., 2010: 196). Hence, effective control procedures are a basic requirement, with the risk that they need to be themselves complex, informal, and perhaps also opaque. Bovens et al. 
(2010: 192) correctly suggest that ‘it takes a network to catch a network, but from an accountability perspective the simultaneous dispersal of both actors and forums into networks creates a whole set of new challenges’. With the growing role of rule-making bodies that have a global reach, the issue of
accountability has penetrated this level, too (Koenig-Archibugi, 2010).5 Most international organisations are hybrids, incorporating a global body acting autonomously and a negotiation system comprising representatives from national governments. Rules of international organisations are officially negotiated by governments, but frequently prepared by top-ranking administrators who are part of transnational sectoral networks (Slaughter, 2004), for example, in trade policy (in the WTO) or in financial policy (in the IMF). The accountability chain in the IMF is ‘tortuous’ (Sperling, 2009: 42) – from administrative staff, to the executive board, to the board of governors, to the governments of member states represented by a governor, and finally to national voters. In addition, members of the IMF executive board are not bound to follow the instructions of their state of origin, they cannot be removed before their term has expired, they are not subject to formal reviews or evaluation and their actions are not made public (Woods, 2006: 192). Moreover, upward accountability grounded on ownership may collide with outward accountability because of affectedness:

The IMF might be considered accountable to those whose money it is lending to take only reasonable risks, which leads to a policy of requiring structural adjustments. But it is also called to account for the effects of those structural adjustments within the countries accepting the conditions of IMF loans. (Grant and Keohane, 2005: 33)

Further, governance at the international level takes place not only in international organisations and functional regimes such as the UN and WTO, but also, notably, in the field of regulation in less formalised and specialised instances such as the Basel Committee on Banking Supervision (BCBS) or the International Organization of Securities Commissions (IOSCO). These organisations ‘tend to operate with a minimum of physical and legal infrastructure; most lack a foundational treaty and operate only along a few agreed upon objectives or bylaws’ (Slaughter, 2004: 48). They do not have the capacity to
issue binding decisions, but their soft norms of conduct are highly influential on collectively binding decisions issued by governments, which raises the issue of their accountability. Typically, the media and international NGOs scrutinise the deeds of transnational governance bodies and perform a monitory function based (for the media) on the claim that they represent public opinion and (for NGOs) on claims to represent the public interest (embodied in environmental protection for instance) or diffuse interests (such as those of consumers and even of ‘speechless’ communities, such as animal species or future generations). There is no doubt that such forums have become a countervailing power to transnational rule-making bodies. It can even be argued that, given a lack of democratic institutions at the global level that renders the domestic analogy irrelevant, accountability to surrogates tends to be considered the functional equivalent of the disciplining power of elections upon decision-makers at a national level. Such an analogy is, however, limited: surveillance by actors of the so-called global civil society remains ‘a rather soft mechanism of holding international network governance to account’ (Steffek, 2008: 15). What is more, NGOs do not escape problems of opacity, elitism, lack of representativeness and of accountability (Steffek and Hahn, 2010). NGOs often speak in the name of groups not represented in the organisation, but organisational leaderships do not have to seek approval from them for acting as surrogates, nor do they have to provide justification to them for their action. In addition, the internal accountability of NGOs to their members may be weak. Hence, in ‘monitory’ democracy, the monitors may themselves not be sufficiently accountable. At the global level, important trends towards the privatisation of regulation are also detected, which lead to major deficits in political accountability. One first observes various forms of unequally codified
public–private partnerships (PPPs), which are also en vogue at the national level. A prominent example is the World Commission on Dams, which incorporated national governments, the World Bank, NGOs and construction firms to make the construction of large dams compatible with sustainability requirements. Although it may not be legitimate to generalise, case studies reveal that ‘mechanisms through which affected actors can control the decision making in the supreme governance bodies of PPPs are largely absent’ (Beisheim et al., 2010: 377). To the development of PPPs should be added the thorough delegation of global governance functions to private regulatory regimes, which exist in fields as diverse as the regulation of the internet and intellectual property, international minerals, insurance, maritime transport industries and industrial production standard-setting. For example, together with the International Electrotechnical Commission (IEC), the International Organization for Standardization (ISO) accounts for about 85 per cent of all international product standards. Market globalisation gave a prominent role to ISO, and many of its standards become binding through official endorsement. Although membership is by country, ISO funding is private, states cannot be members and the organisation can best be described as a network of hundreds of technical committees and thousands of experts, whose institutional backbone is formed by private sector standards bodies at the national level (Büthe and Mattli, 2010). Clearly, the shift of de facto authority to private hands should not be attributed to a deliberate attempt to hollow out democratic procedures. Nevertheless, it contributes to the technocratisation of policy-making, which is a major issue at the global level. 
Similarly, there is at this level a stark contrast between the proliferation of (indeed frequently quite effective) accountability mechanisms and the non-democratic accountability standards upon which most of them draw (Goodhart, 2014: 292–5).
Recent advances in accountability research

Before concluding this chapter, I would like to highlight some major recent advances in accountability research that contribute to our empirical knowledge or are likely to enrich the normative debate. Regarding novel empirical findings, it is worth considering the contribution of research on the accountability of national and European regulatory agencies and its relation to more general considerations on political accountability. Although the research object is a segment of the public administration, it would make sense to test if empirical findings on agency behaviour can be extended to the behaviour of the state bureaucracy in general, and beyond that to the behaviour of elected officials as well (even though accountability forums differ). Much of accountability thinking is dominated by the principal–agent model, according to which the delegation chain should be paralleled by an accountability chain in order to avoid agency ‘drift’ if the preferences of agents differ from those of principals. In such a view, agents are inclined to opportunistic behaviour and would therefore have good reason to seek to evade accountability whenever possible. However, recent research contradicts such an assumption. The management of agencies not only does not evade accountability, but also is proactive in seeking it. Why is this so? First, a consequential logic is at work: agencies contemplate costs and benefits; anticipate that a benevolent attitude from the accountability forum, if it adopts the posture of an ally, will be a resource; and therefore come to estimate that being voluntarily accountable helps them accumulate a capital of trust on the part of forums that immunises them from threats. Second, a normative logic of ‘appropriateness’ (March and Olsen, 2008) is also at work, which is obscured by the principal–agent approach: agencies simply regard voluntary accountability as an appropriate practice (Koop, 2014).
Whatever logic is at work, it appears in the light of such findings that accountability relations between an actor and a forum should not be seen as merely adversarial and constraining for the actor who has to account. One could further hypothesise that willingness to account depends upon beliefs of the forum possessing expertise and not being toothless (consequential logic), but also on the perceived legitimacy of the forum (normative logic). We also know that the establishment of accountability forums requires effort. Their constitution may be impeded by collective action problems due to the heterogeneity of those with an interest in creating them. We have also seen that the advent of ‘monitory’ democracy finds its limits in the fact that many rule-making activities are – purposively or not – insulated from the highly mediatised sphere of politics. Hence, a lack of accountability may not be due to agency drift, but to forum paralysis or forum drift (Schillemans and Busuioc, 2015): accountability forums may not be able or willing to perform their functions. Forums may lack information or, conversely, may be flooded with too much information that they do not have the time or the cognitive ability to process, making the evaluation process extremely costly. Forums may also be reluctant to perform their monitoring role because they prefer to concentrate their attention on those issues that are salient to them, behaving thus as ‘rational ignorants’. Another recent body of research that also concentrates on the accountability of administrative agencies explores how individual (senior) members of agency bureaucracy perceive their own personal accountability. Again, political science would benefit from insights borrowed from research on organisational behaviour, and the study of subjective views on accountability deserves to be extended to other categories of the state personnel, such as elected officials. ‘Felt’ accountability is derived from experimental
psychology and other disciplines such as human resources management. It has been defined as ‘[t]he implicit or explicit expectation that one’s decisions or actions will be subject to evaluation by some salient audience(s) with the belief that there exists the potential for one to receive either rewards or sanctions based on this expected evaluation’ (Hochwarter et al., 2007: 227). A recent seven-nation study to which the author of this chapter has contributed (Schillemans et al., s.d.) reveals that the agency management feel strong accountability pressure from parent departments that are not always considered as highly legitimate, and even less deemed to possess adequate expertise. An interpretation is that accountability is felt in that case more as a source of irritation and less as a moral duty. It would be interesting to know how accountability would be considered if the forums were deemed to possess high expertise: would accountability merely be feared, or would it also be accepted because it is legitimate? Finally, given that the need for accountability is frequently associated with the issue of delegation, accountability is a blind spot when delegation is absent. What about the accountability of ordinary citizens? As their power to decide is not delegated, there is no normative need for accountability because of ownership (they ‘own’ their decision rights), but the same cannot be said for accountability because of affectedness. Should voters be accountable, and to whom, when they elect their representatives? Perhaps more importantly, given the recent increasing use of instruments of direct participation such as referendums, should citizens be held accountable for their policy choices (Warren, 2014: 45–6)? Even in the case of direct democracy, it is misleading to posit a congruence between policy-makers and policy-takers. 
Referendum outcomes may produce externalities affecting constituencies that are not powerful enough to influence the vote and, therefore, are not exempt from the risk of subjection to majority tyranny. Vatter
and Danaci (2011) have shown, for example, that referendum outcomes in Switzerland are more inimical to minorities – especially the foreign resident population that has no right to vote – than decisions taken in parliamentary votes. The generation of externalities affects not only groups that are not able to have a say, but also future generations: one may think of decisions on environmental matters or on the sustainability and intergenerational equity of pension schemes. Offe and Preuss (1991) argued in favour of the necessity for citizens to develop moral resources so that they display self-restraint and make decisions that are other- and future-regarding. How can this be achieved? Trechsel (2010) argues that in the context of direct democracy, citizens must develop reflexive accountability towards themselves. This is not without some resemblance to accountability to internalised norms and role prescriptions developed by professionals. The reflective turn becomes even more sophisticated in Goodin’s work, in which he pleads for people to become more decentred and empathetic by imagining themselves in the position of other people and asking what they would think about a policy proposal (Goodin, 2003). This may be desirable, but it is very uncertain if reflexive accountability alone suffices to lead to such virtuous attitudes. Brennan and Pettit (1990), for example, suggested in an influential theoretical piece that voters do not choose responsibly because the secrecy of the vote shields them from public scrutiny, and proposed as a remedy to unveil the vote. Publicity of the vote is not sufficient as such; rather, publicity is a means to make voters answerable for their decisions because it induces them to choose in a discursively defensible manner. 
This is not without dangers – publicity may discourage voters from justifying their preferences on purely self-interested grounds, but it may also favour group pressure and conformity – but, again, the costs and benefits of voters’ accountability should be considered simultaneously.
Critical assessment and ongoing debates: maximising or optimising accountability?

This chapter started with a reminder that accountability is usually a positive value. One could therefore logically expect that designers of democratic institutions seek to maximise the accountability of power-wielders. However, the ‘pathologies’ (Halachmi, 2014) of accountability include both deficits and overloads: accountability is a value among others and its maximisation might entail undesired trade-offs. The advent of an ‘audit society’ has been criticised for good reason because an ‘overload’ of accountability may be problematic in several respects. There are obvious privacy issues if the powerful can use a pervasive Benthamian (or Foucauldian) ‘panopticon’ to exert surveillance over their subjects. However, the political accountability that we study here has to do with the control and contestation of rule-makers (Heidelberg, 2017), not rule-takers, and the vulnerability of the powerful is an indicator of the empowerment of those who are otherwise powerless (Waldron, 2014: 27). Having said that, ‘excess’ accountability may negatively impact the conduct of those – even the powerful – subject to controls, which in turn may have deleterious consequences on the public interest. Over-exposure and too much stress on the culpability dimension may be felt negatively by the objects of accountability and psychological evidence reveals that if accountability is perceived as intrusive or insulting, then the pressures may boomerang (Lerner and Tetlock, 1999: 258–9). They may therefore induce blame-avoidance or risk-averse strategies, such as ritualistic ‘appearance of conformity’ (Philp, 2009: 43). They may also induce excessive proceduralism that inhibits thinking out of the box and, therefore, impede organisational learning, successful adaptation or innovation. The presence of ‘many eyes’ may lead to fatalism or indifference because it increases the randomness of control (Hood, 2010). 
Ultimately, an inflation of accountability mechanisms may fuel ‘mutual suspicion and intrusive surveillance’
(Warren, 2014: 44), undermine trust relationships that are necessary in governance and lead to the dominance of a ‘bad faith model of politics’ (Flinders, 2012). Hence, it is wiser to plead for an optimisation of accountability, although it is not easy of course to establish the adequate degree of accountability. This becomes even more difficult because accountability is multifaceted: choices need to be made about who exactly should be held to account, by whom, for what, through what means, and with what kind of consequences. A related issue is that of the role of sanctions in accountability. The existence of sanctions is a Damoclean sword that reduces the risk of misconduct, but excesses in the use of sanctions may also lead to a depletion of mutual trust. Sanctions are an instrument, not an end: therefore, maximising sanctions ex post may be less appropriate than combining them with ex ante mechanisms of socialisation – that, indeed, require investment in the long term – to inculcate an ethos of responsibility among the rule-makers. Control is also effective when it is internalised, leading to self-restraint not only because rule-makers fear sanctions, but also because they come to develop preferences that are aligned with values that promote the public interest. The consequential logic (anticipation of rewards and sanctions) should operate in combination with a logic of appropriateness that dictates the kind of behaviour that is normatively valued by rule-makers. Further, one should not assimilate an increase of accountability to an increase of its democratic component. Accountability to democratic principals may be limited, while other forms of monitoring, scrutiny, checking and auditing are steadily increasing (Papadopoulos, 2010), as in the case of international organisations that are subject to accountability regimes that involve several formal and informal third parties as forums. 
What we observe today is a proliferation of accountability mechanisms more closely related to the concrete activities of decision-making bodies, including accountability to ombudsmen and courts and to informal forums such as the media or NGOs. Does this offset the gap engendered by the growth of
democratically unaccountable policy-making, and illustrated most prominently by the rise of private governance bodies? This is debatable. We have seen that democratic accountability is not always possible, and that surrogate forms then need to be invented, but these forms are not without problems. Mutual accountability may be high between the actors involved in rule-making processes, but the democratic credentials of several of them may be disputable. There are also inequalities in actors’ capacities to effectively hold decision-makers to account. In addition, the accountability forums may be self-selected, or even sometimes selected by those that should provide accounts to them. Finally, although the existence of democratic accountability seems an obvious candidate for the provision of political legitimacy, one can also – a bit provocatively – ask to what extent it is desirable itself. We have seen ‘the rise of the unelected’ (Vibert, 2007), and Pierre (2009: 592) writes that ‘power and accountability have been divorced, if not de jure so de facto and we now need to assess what this means for democratic governance’. This is correct, but one should also consider the consequences of the opposite challenge to democracy, that is, the current dramatic rise of populism. Populism advocates a conception of politics according to which the ‘people’ are better served if they rule themselves instead of delegating their power. If this is unfeasible, then rule-makers should closely mirror the preferences of the ‘people’ (or, more precisely, of the majority of a given constituency) and be subject to close monitoring by them: delegation should be limited, veto points absent and accountability maximal. It is not without reason that such developments are perceived as threatening for the quality of liberal democracy, which also rests on the existence of checks and balances to unfiltered majority rule and on the protection of minority rights. 
Democratic accountability should not be liberticide.
Notes

1 One can think of responsiveness to voters’ preferences, the respect of procedural norms (such as fairness, impartiality or proportionality), the degree of performance and goal achievement, the adequate management of public funds, the personal qualities (such as probity) of politicians and so on.
2 Political (for elected officials), legal/judicial (for specific rules), administrative/managerial (for civil servants), and so on.
3 Defining affectedness is more likely to engender controversy than defining ownership because delegation relations are more visible than relations between outputs and outcomes.
4 The distinction is also inspired by the principal–agent model: members of the US Congress (principals) use different devices to monitor the bureaucracy (agent) and avoid agency drift (McCubbins and Schwartz, 1984). It can be extended to accountability relations beyond the principal–agent dyad.
5 This argument closely follows Papadopoulos, 2013: chapter 3.
References

Auel, Katrin and Arthur Benz (eds) (2005). ‘The Europeanisation of Parliamentary Democracy’, Journal of Legislative Studies (special issue), 11(3/4).
Beisheim, Marianne, Sabine Campe and Marco Schäferhoff (2010). ‘Global Governance through Public–Private Partnerships’, in Henrik Enderlein, Sonja Wälti and Michael Zürn (eds), Handbook on Multi-Level Governance. Cheltenham: Edward Elgar, pp. 370–82.
Bovens, Mark (1998). The Quest for Responsibility. Cambridge: Cambridge University Press.
Bovens, Mark (2007). ‘New Forms of Accountability and EU-Governance’, Comparative European Politics, 5(1): 104–20.
Bovens, Mark, Deirdre Curtin and Paul ’t Hart (2010). ‘The Real World of EU Accountability: Comparisons and Conclusions’, in Mark Bovens, Deirdre Curtin and Paul ’t Hart (eds), The Real World of EU Accountability: What Deficit? Oxford: Oxford University Press, pp. 174–97.
Bovens, Mark, Thomas Schillemans and Robert E. Goodin (2014). ‘Public Accountability’, in Mark Bovens, Robert E. Goodin and Thomas Schillemans (eds), The Oxford Handbook of Public Accountability. Oxford: Oxford University Press, pp. 1–20.
Brennan, Geoffrey and Philip Pettit (1990). ‘Unveiling the Vote’, British Journal of Political Science, 20(3): 311–33.
Büthe, Tim and Walter Mattli (2010). ‘Standards for Global Markets: Domestic and International Institutions’, in Henrik Enderlein, Sonja Wälti and Michael Zürn (eds), Handbook on Multi-Level Governance. Cheltenham: Edward Elgar, pp. 455–76.
Considine, Mark and Kamran Ali Afzal (2011). ‘Legitimacy’, in Mark Bevir (ed.), The Sage Handbook of Governance. London: Sage, pp. 369–385.
Flinders, Matthew (2012). Defending Politics. Oxford: Oxford University Press.
Flinders, Matthew (2014). ‘The Future and Relevance of Accountability Studies’, in Mark Bovens, Robert E. Goodin and Thomas Schillemans (eds), The Oxford Handbook of Public Accountability. Oxford: Oxford University Press, pp. 661–72.
Goodhart, Michael (2014). ‘Accountable International Relations’, in Mark Bovens, Robert E. Goodin and Thomas Schillemans (eds), The Oxford Handbook of Public Accountability. Oxford: Oxford University Press, pp. 289–304.
Goodin, Robert E. (2003). Reflective Democracy. Oxford: Oxford University Press.
Grant, Ruth and Robert O. Keohane (2005). ‘Accountability and Abuses of Power in World Politics’, American Political Science Review, 99(1): 29–43.
Halachmi, Arie (2014). ‘Accountability Overloads’, in Mark Bovens, Robert E. Goodin and Thomas Schillemans (eds), The Oxford Handbook of Public Accountability. Oxford: Oxford University Press, pp. 560–73.
Heidelberg, Roy L. (2017). ‘Political Accountability and Spaces of Contestation’, Administration & Society, 49(10): 1379–1402.
Hirschl, Ran (2007). Towards Juristocracy: The Origins and Consequences of the New Constitutionalism. Cambridge, MA: Harvard University Press.
Hochwarter, Wayne A., Gerald R. Ferris, Mark B. Gavin, Pamela L. Perrewé, Angela T. Hall and Dwight D. Frink (2007). ‘Political Skill as Neutralizer of Felt Accountability – Job Tension Effects on Job Performance Ratings: A Longitudinal Investigation’, Organizational Behavior and Human Decision Processes, 102(2): 226–39.
Hood, Christopher (2010). The Blame Game: Spin, Bureaucracy, and Self-Preservation in Government. Princeton, NJ: Princeton University Press.
Hupe, Peter and Arthur Edwards (2012). ‘The Accountability of Power: Democracy and Governance in Modern Times’, European Political Science Review, 4(2): 177–94.
Keane, John (2009). The Life and Death of Democracy. New York: W. W. Norton.
Koenig-Archibugi, Mathias (2010). ‘Accountability in Transnational Relations: How Distinctive Is It?’, West European Politics, 33(5): 1142–64.
Koop, Christel (2014). ‘Theorizing and Explaining Voluntary Accountability’, Public Administration, 92(3): 565–81.
Koppell, Jonathan G. S. (2005). ‘Pathologies of Accountability: ICANN and the Challenge of “Multiple Accountabilities Disorder”’, Public Administration Review, 65(1): 94–108.
Lerner, Jennifer S. and Philip E. Tetlock (1999). ‘Accounting for the Effects of Accountability’, Psychological Bulletin, 125(2): 255–75.
Lindberg, Staffan (2013). ‘Mapping Accountability: Core Concept and Subtypes’, International Review of Administrative Sciences, 79(2): 209–26.
Manin, Bernard (1997). The Principles of Representative Government. New York: Cambridge University Press.
Mansbridge, Jane (2003). ‘Rethinking Representation’, American Political Science Review, 97(4): 515–28.
March, James G. and Johan P. Olsen (2008). ‘The Logic of Appropriateness’, in Michael Moran, Martin Rein and Robert E. Goodin (eds), The Oxford Handbook of Public Policy. Oxford: Oxford University Press, pp. 689–708.
McCubbins, Mathew D. and Thomas Schwartz (1984). ‘Congressional Oversight Overlooked: Police Patrols versus Fire Alarms’, American Journal of Political Science, 28(1): 165–79.
Merkel, Wolfgang and Aurel Croissant (2004). ‘Conclusion: Good and Defective Democracies’, Democratization, 11(5): 199–214.
Mulgan, Richard (2000). ‘“Accountability”: An Ever-Expanding Concept?’, Public Administration, 78(3): 555–73.
O’Donnell, Guillermo A. (1998). ‘Horizontal Accountability in New Democracies’, Journal of Democracy, 9(3): 112–26.
Offe, Claus and Ulrich Preuss (1991). ‘Democratic Institutions and Moral Resources’, in David Held (ed.), Political Theory Today. Cambridge: Polity, pp. 143–71.
Papadopoulos, Yannis (2010). ‘Accountability and Multi-Level Governance: More Accountability, Less Democracy?’, West European Politics, 33(5): 1030–49.
Papadopoulos, Yannis (2013). Democracy in Crisis? Politics, Governance and Policy. Basingstoke: Palgrave.
Philp, Mark (2009). ‘Delimiting Democratic Accountability’, Political Studies, 57(1): 28–53.
Pierre, Jon (2009). ‘Reinventing Governance, Reinventing Democracy?’, Policy and Politics, 37(4): 591–609.
Przeworski, Adam (1998). ‘Deliberation and Ideological Domination’, in Jon Elster (ed.), Deliberative Democracy. Cambridge: Cambridge University Press, pp. 140–60.
Raunio, Tapio and Simon Hix (2000). ‘Backbenchers Learn to Fight Back: European Integration and Parliamentary Government’, West European Politics, 23(4): 142–68.
Romzek, Barbara S., Kelly LeRoux and Jeannette M. Blackmar (2012). ‘A Preliminary Theory of Informal Accountability among Network Organizational Actors’, Public Administration Review, 72(3): 442–53.
Saward, Michael (2011). ‘The Wider Canvas: Representation and Democracy in State and Society’, in Sonia Alonso, John Keane and Wolfgang Merkel (eds), The Future of Representative Democracy. Cambridge: Cambridge University Press, pp. 74–95.
Schedler, Andreas (1999). ‘Conceptualizing Accountability’, in Andreas Schedler, Larry Diamond and Marc F. Plattner (eds), The Self-Restraining State: Power and Accountability in New Democracies. Boulder and London: Lynne Rienner Publishers, pp. 13–28.
Schillemans, Thomas and Madalina Busuioc (2015). ‘Predicting Public Sector Accountability: From Agency Drift to Forum Drift’, Journal of Public Administration Research and Theory, 25(1): 191–215.
Schillemans, Thomas et al. (s.d.). ‘Understanding Manager’s Felt Accountability’, manuscript.
Scott, Colin (2000). ‘Accountability in the Regulatory State’, Journal of Law and Society, 27(1): 38–60.
Slaughter, Anne-Marie (2004). A New World Order. Princeton: Princeton University Press.
Sperling, Valerie (2009). Altered States: The Globalization of Accountability. Cambridge: Cambridge University Press.
Steffek, Jens (2008). ‘Public Accountability and the Public Sphere of International Governance’. RECON Online Working Paper 2008/03.
Steffek, Jens and Kristina Hahn (eds) (2010). Evaluating Transnational NGOs: Legitimacy, Accountability, Representation. Basingstoke: Palgrave Macmillan.
Strom, Kaare, Wolfgang C. Müller and Torbjörn Bergman (eds) (2006). Delegation and Accountability in Parliamentary Democracies. Oxford: Oxford University Press.
Thompson, Dennis F. (1980). ‘Moral Responsibility of Public Officials: The Problem of Many Hands’, American Political Science Review, 74(4): 905–16.
Trechsel, Alexander H. (2010). ‘Reflexive Accountability and Direct Democracy’, West European Politics, 33(5): 1050–64.
Vatter, Adrian and Deniz Danaci (2011). ‘Mehrheitsdemokratisches Schwert oder Schutzschild für Minoritäten? Minderheitenrelevante Volksentscheide in der Schweiz’, in Adrian Vatter (ed.), Vom Schächt- zum Minarettverbot. Zurich: Verlag Neue Zürcher Zeitung, pp. 215–37.
Verhoest, Koen, Paul G. Roness, Bram Verschuere, Kristin Rubecksen and Muiris MacCarthaigh (2010). Autonomy and Control of State Agencies: Comparing States and Agencies. Basingstoke: Palgrave Macmillan.
Vibert, Frank (2007). The Rise of the Unelected. Cambridge: Cambridge University Press.
Volmert, Andrew (2012). ‘The Puzzle of Democratic Authorization’, Political Studies, 60(2): 287–305.
Waldron, Jeremy (2014). ‘Accountability: Fundamental to Democracy’. New York University School of Law, Working Paper no. 14–13, Public Law & Legal Theory Research Paper Series.
Warren, Mark E. (2014). ‘Accountability and Democracy’, in Mark Bovens, Robert E. Goodin and Thomas Schillemans (eds), The Oxford Handbook of Public Accountability. Oxford: Oxford University Press, pp. 39–54.
Woods, Ngaire (2006). The Globalizers: The IMF, the World Bank, and Their Borrowers. Ithaca, NY: Cornell University Press.
42 Authoritarianisms and Authoritarianization
Oliver Schlumberger and Tasha Schedler
A Short History
Authoritarianism is today mostly viewed as contrary to democracy. Ever since humans settled and formed communities larger than the groups of a nomadic hunter-and-gatherer society, people have predominantly lived under ‘dictatorship’ – even though the term emerged much later than language. No matter where we look at historical forms of human association, patterns of authority have tended to be strictly hierarchical, and political participation, or active part-taking in political decision-making, has for the most part been restricted to a very narrow stratum of political elites, a small group, or even individual leaders: Egypt’s pharaohs, Chinese and East African emperors, Maya and Inca kings, ancient Sahel rulers such as the Jolof empire’s bours, the Mongol Khans, French emperors or post-independence rulers of sub-Saharan Africa may serve as examples. Political association under female leadership has not been an exception to this rule, as examples such as
Queen Zenobia of Palmyra, Cleopatra of Egypt and others demonstrate. While democracy is thus historically relatively recent and an exception rather than the rule, from the second half of the 20th century it came to be seen by some as a universal value. Recent trends, however, signal that the portion of humankind that suffers authoritarian rule has been increasing over the past two decades. Why ‘suffer’? Of course, not all subjects suffer under dictatorship. Rulers, their elites and circles of core supporters usually fare well or better under authoritarianism than in a polity where equal political rights and civil liberties are guaranteed even to opponents. But democracy arguably is the one political regime type that allows the largest possible number of people to not die of a violent, politically caused death. Since World War II, more people have died at the hands of dictators and their repression than died of war.1 Not only is democracy a recent phenomenon, but even during that half-century when
democracy became more widespread globally than ever, each wave of democratization was followed by so-called ‘reverse waves’ during which countries (re-)authoritarianized politically. Today, more than half of the planet’s humans live under authoritarian rule. Russia, after a brief phase of democratic experiment in the early 1990s, re-emerged as a great power under its seemingly omnipotent president Putin; contrary to previous expectations, the former ‘satellite states’ of the Soviet Union also did not generally democratize after gaining independence, but for the most part remained firmly authoritarian. The re-strengthening of Chinese single-party rule, as well as continued authoritarianism in other large and populous countries (such as Bangladesh, the Democratic Republic of Congo, Egypt, Ethiopia, Iran, Kazakhstan, Nigeria, Pakistan, Turkey, Vietnam and others), helped in fostering new and massive scholarly interest in authoritarianism. It is thus of little surprise that research on authoritarianism mushroomed and that it has become one of the hottest sub-fields in comparative politics.
Basic Concepts and Theories
Authoritarianism today refers to two related but distinct fields. First, in a broader sense, it is a social science concept that refers to traits of the individual; in this, the term is closely related to a seminal study conducted by Adorno and his co-authors (Adorno et al., 1950). Trying to explain how the horrors of the Nazi regime could occur, their main explanatory variable was the nature and behavior of the individual, for which their concept of the authoritarian personality remains a landmark in social psychology. The present contribution, however, focuses on the understanding of the term in political science. In this latter context, authoritarianism is understood as a form of political rule or a distinct type of political regime.2
In this understanding, authoritarianism has been most prominently defined by Juan Linz (1964). Writing on Spain under Franco, he found that this type of political regime neither fits with the established characterizations of totalitarian regimes (cf. Arendt, 1951; Friedrich and Brzezinski, 1956) nor is compatible with prevalent definitions of democracy (e.g. Dahl, 1971: 3; cf. also Chapter 43, this Handbook). To the extent that it is not used as a residual category, ‘authoritarianism’ emerges in political science as an ex-negativo class in between democracy and totalitarianism and, together with the latter two, forms the ‘classic triad’ of political regime types. While suggestions have been advanced to re-define authoritarianism in order to overcome weaknesses inherent in Linz’s initial definition, none have managed to achieve the broad scholarly consensus which Linz’s proposal still enjoys. According to Linz, authoritarian regimes (1) take a limited, non-responsible form of pluralism (as opposed to the political monism of totalitarian regimes and the principally unlimited pluralism of democracies); (2) have no elaborate ideology (as opposed to totalitarian regimes), but display distinct mentalities instead; (3) have neither extensive nor intensive mobilization [unlike totalitarian regimes], ‘except [at] some points in their development’ (Linz, 1964: 297), but are characterized by ‘political apathy’ of the populace (unlike in democracies, where citizens are expected to partake in public affairs and debates); and (4) are characterized by political rule – exerted by a single leader or a small clique – in which power is exercised ‘within formally ill-defined limits’ (unlike in democracies, where power is exerted within a restricted system of formally guaranteed rights and liberties and a system of checks and balances), but which are ‘actually quite predictable’ (in contrast to the unpredictability of state terror exerted by totalitarian regimes) (ibid.). 
For Linz, the first criterion (limited pluralism) was the most important one.
This definition, however, is not free from problems. First and foremost, it has been established in contradistinction to the other two basic regime types, in an effort to delimit it from both democracies and totalitarianisms. This is both a strength and a weakness. One core weakness is that Linz’s four classical criteria of authoritarianism are packed with soft language that is hard to operationalize. Therefore, they allow for numerous exceptions and require subjective interpretation; the criteria themselves thus remain ambiguous. Another key weakness lies in the process of how these basic regime types have been defined: since each of the triad of authoritarianism, totalitarianism and democracy was defined independently of the others and by different scholars at different points in time, they do not follow the basic principles of how to construct typologies, namely to build classes that are mutually exclusive and jointly exhaustive (cf. Sartori, 1991). Third, an additional problem emerges when ‘authoritarianism’ is understood broadly as anything deemed ‘undemocratic’. In this sense, there can well be ‘authoritarianism’, or elements thereof, in democratic regimes, which leads to ambiguity in language. A fourth issue, still on the typological level, arises from existing suggestions of constructing subtypes of authoritarianism. While the earlier work of Linz suggested a large number of often ad hoc subtypes, more recently a tripartite subtypology proposed by Geddes (1999) has become the most widely used. It distinguishes between single-party, personalist and military dictatorship (read: authoritarianism) while acknowledging that mixed types exist. This brings us back to the methodological and epistemological quest for mutual exclusiveness in class-building. It becomes problematic as soon as we would wish to use regime (sub)type as a variable to test. 
When it is impossible to unambiguously attribute a given empirical case to a specific (sub)type, it is equally impossible to test whether the assumed effects are attributable to that specific (sub)type.
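The typological requirement discussed above (classes that are mutually exclusive and jointly exhaustive) can be made concrete with a small audit of a coded case list. What follows is only an illustrative sketch: the three class labels are taken from the tripartite subtypology mentioned in the text, while the function name `audit_typology`, the placeholder case names and their codings are invented for the example and do not come from any dataset.

```python
from typing import Dict, List, Set

# Class labels from the tripartite subtypology named in the text;
# everything else in this sketch is hypothetical.
SUBTYPES: Set[str] = {"single-party", "personalist", "military"}


def audit_typology(codings: Dict[str, Set[str]], classes: Set[str]) -> Dict[str, List[str]]:
    """List the cases that violate either typological requirement.

    'unclassified' cases fit no class (the typology is not jointly exhaustive);
    'ambiguous' cases fit several classes (it is not mutually exclusive).
    """
    unclassified = sorted(case for case, assigned in codings.items()
                          if not (assigned & classes))
    ambiguous = sorted(case for case, assigned in codings.items()
                       if len(assigned & classes) > 1)
    return {"unclassified": unclassified, "ambiguous": ambiguous}


# Invented codings for placeholder cases (not real data).
codings: Dict[str, Set[str]] = {
    "case_a": {"military"},
    "case_b": {"single-party", "personalist"},  # a 'mixed type'
    "case_c": set(),                            # fits none of the classes
}

report = audit_typology(codings, SUBTYPES)

# Only unambiguously classified cases can carry subtype as a test variable.
problem_cases = set(report["unclassified"]) | set(report["ambiguous"])
testable = sorted(case for case in codings if case not in problem_cases)

print(report)
print(testable)
```

In this toy coding, only one of three cases could enter a test that uses subtype as a variable, which is the practical cost of mixed types that the paragraph above describes.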
Not within the family of authoritarian regimes, but on the blurred borders between democracy and authoritarianism, one suggestion has been to insert a fourth basic type, that is, hybrid regimes (see Gagné and Mahé, Chapter 47, this Handbook). A much more influential suggestion was to create so-called ‘diminished subtypes’, that is, subtypes that share some but not all traits of the root concept. That has led to a broad range of adjectives that distinguish between – often ad hoc generated – subtypes of both democracy (such as ‘illiberal’, ‘delegative’, ‘domain’, ‘enclave’ and even ‘authoritarian’ democracy) and authoritarianism (‘liberalized’, ‘electoral’, ‘competitive’ and so on). This has not only further blurred the boundaries of basic regime types, but also led to oxymoronic neologisms that make little sense. If and when, however, such adjectives were added in order to create classical subtypes (i.e. subtypes that fulfil all definitional characteristics of the basic type), such additions have mostly proven meaningless.3 This is because they seem to ignore basic elements of the definition (such as limited political competition): obviously, as long as we adhere to Linz’s definition of an authoritarian regime, we should not be surprised about contestation taking place under authoritarianism – as long as it is limited. All this has resulted in an unresolved basic debate about typologies: following Frantz’s (2018) wording, categorical typologies that view various types of regimes as equally authoritarian in nature (e.g. personalist, military, multiparty or monarchic authoritarianism) compete with continuous typologies that assess political regimes with respect to their presumed ‘distance’ to ‘full democracy’. Commonly used indices to ‘measure’ regimes or ‘democracy’ (such as Polity, Freedom House or the Bertelsmann Transformation Index) implicitly adhere to this continuous understanding. 
In conclusion, political science still works with Linz’s classic definition despite its shortcomings and despite the fact that to date, no
methodologically sound subtypology exists. Likewise, our knowledge about what produces (which subtypes of) authoritarianism and, vice versa, which consequences (what subtypes of) authoritarianism have on a variety of possible dependent variables remains limited despite major advances that have been made over the past two decades (for a more detailed discussion on these debates see below).
Regional and Global Trends and Developments
After World War II, the former ‘Soviet Bloc’ (Central Eastern, Eastern and South Eastern Europe), the Soviet Union itself and a number of at least nominally socialist or communist republics in Central, East and South East Asia existed under the single-party rule of local communist or socialist parties. East Germany, Poland, Hungary, Czechoslovakia, Bulgaria, Romania, Albania and Yugoslavia in Europe, alongside Vietnam, Cambodia and Laos, were among these. The most exciting member of that group, however, might be China, where the Communist Party has managed to transform ideologically and survive the end of communism as a ruling party. In fact, communist or socialist-oriented countries represent the single largest group of single-party regimes in the 20th century. But with the collapse of the Soviet Union and the accompanying delegitimization of social revolutionary collectivist ideologies, this changed. After the fall of the ‘Iron Curtain’ in Europe, Central and Eastern European countries – some of them, such as Latvia, Lithuania and Estonia, newly independent – democratized and became members of the European Union. After the Balkan wars that ended the state of Yugoslavia, South Eastern Europe gained a landscape of new states, of which Slovenia (2004) and Croatia (2013) acceded to the EU.
More recently, a group of regime hybrids has established itself in Central Eastern Europe: Hungary, Poland, Ukraine and Moldova. Hungary and Poland have witnessed rapid processes of wilful authoritarianization (see also below) despite EU membership. Moldova and Belarus maintained authoritarian rule under presidents who preserved close ties with Russia, while Ukraine wobbled between democratic and clearly autocratic tendencies during its independence, with periods of conflict similar to developments in Georgia, Moldova or Armenia. In Africa, when the age of colonialism ended in the decades after World War II, many of the then newly independent states emerged under authoritarian rule.4 It was only during the most recent wave of democratization that a number of them transited toward (formal) democracy (including, but not limited to, Botswana, Ghana, Mali, South Africa and, as a latecomer, Tunisia in 2011–12). Nominally, many of these post-independence African regimes were ruled by single parties of socialist leanings (such as in Algeria, Angola, Benin, Ethiopia, Mozambique, the former People’s Republic of the Congo or Somalia) or, alternatively, by right-wing and/or nationalist single parties (such as in Burundi, Cameroon or Chad). Often these parties formed out of liberation struggles against former colonial powers. Overall, post-independence Africa from the 1960s/70s until roughly 1990/91 thus provides the single most important reservoir of cases to study authoritarian single-party rule, and the largest group of Marxist-Leninist single parties outside the former ‘Soviet bloc’. However, formal single-party rule often only thinly veiled clan- or tribe-based personalist rule – neopatrimonial rule,5 where inclusion is often sacrificed for rewarding tribal loyalties. It is thus difficult to assess the type of authoritarianism when single-party rule on paper coincides with personalist rule in practice. An extreme case of such
potentially misleading formal single-party rule but de facto personalist control is North Korea, where formal single-party rule has regressed into an absolute monarchy-like personalist rule of the Kim dynasty. The Middle East and North Africa (MENA) region also illustrates this problem: while MENA cases have provided inspiration for the study of military authoritarianism and civil–military relations, several countries in that region, at least for some time between the 1950s/60s and the 1980s/90s, witnessed nominally socialist single-party rule, which, however, was for the most part coupled with nationalist ideas (Algeria, Tunisia, Egypt, Libya, Syria, Iraq, South Yemen). Over time, however, all these converged, in their organization of the centers of political power, alongside their monarchical counterparts in Jordan, Morocco and the Persian Gulf (Kuwait, Saudi Arabia, Oman, United Arab Emirates, Qatar, Bahrain). More clearly than large parts of sub-Saharan Africa, Middle Eastern states have been characterized not primarily by single-party or military rule, but by their neopatrimonial regimes.6 Similar characteristics have been shaping states in Central Asia and the Caucasus, such as Kazakhstan, Turkmenistan, Uzbekistan and Azerbaijan, since their independence at the turn of the 1990s. Most countries in that region have been ruled for many years by increasingly personalist leaders who already held important political positions in Soviet times and who assumed office shortly after their countries’ independence (Aliyev in Azerbaijan; Nazarbayev in Kazakhstan; Karimov in Uzbekistan; Akayev in Kyrgyzstan). Georgia and Armenia in the Southern Caucasus are the exceptions to this regional pattern, and Kyrgyzstan has reverted from a decade of personalist rule to a multiparty system. The dominant regional trend, however, clearly consists of personalist dictatorships. 
This stands in marked contrast to the – mostly military – dictatorships which Latin America saw in the 20th century. In fact,
the only Latin American country that has not experienced long authoritarian rule is Costa Rica. While decidedly personalist rule (e.g. the Somoza regime in Nicaragua or the Duvaliers’ regime in Haiti, and more recently Peru under Fujimori) also occurred in this world region, military rule was the dominant form of authoritarian governance in 20th-century Latin America (Peron acceded to power virtually immediately after World War II). Second, international factors, and particularly external involvement of the United States, are important in Latin America’s experience with authoritarianism. In part due to the Truman doctrine of containing communism, US involvement produced mainly right-wing (military) dictatorships, many of which gained power through coups d’état (such as in Guatemala in 1954, Paraguay in 1954, Brazil in 1964, Bolivia in 1971, Chile in 1973, Uruguay in 1973 and Argentina in 1976). Thus, at one point during the late 1970s, military rule seemed an almost defining characteristic of Latin American politics, but then the continent democratized quasi-collectively in the 1980s and 1990s. Again, mixed types of personal–military and bureaucratic civil–military regimes have existed in several cases, pointing to the need to re-think the differentiation of authoritarian subtypes. In academic works, it was mainly the Latin American experience that resulted in the first series of now classic in-depth studies on the working mechanisms of modern authoritarianism (most prominently O’Donnell, 1973; Collier, 1979) in its bureaucratic and military forms. In the current century, a renewed tendency toward left-wing authoritarian rule combined with anti-American populist rhetoric and policies can be found in Latin America, best exemplified by Venezuela under President Chavez and his successor Maduro. While personalist in style, the defining feature of their rule is its populism. 
While not all populist leaders in Latin America pursue clearly authoritarian politics (a counter-example is
the presidency of Evo Morales in Bolivia), Latin America’s neo-populism can be seen as having spearheaded what, in the late 2010s, had become a global trend toward populist politics. Alongside authoritarianization, that is, the deliberate dismantling of democratic rule and its transition to authoritarian governance, and the personalization of authoritarian rule, populism is one of the most pertinent current trends in the study of authoritarianism (see below).
Databases
There has been a proliferation of new datasets published as part of the recent wave of research on authoritarianism, all of which inquire into the question of the durability of various authoritarian (and democratic) subtypes. The underlying basic idea in five of them is to measure regime durability (or life expectancy), and to assess conditions of breakdown of both democratic and autocratic regimes (Anckar & Fredriksson and Lührmann et al., 2018; Cheibub et al., 2010; Geddes et al., 2014; Hadenius et al., 2012; Lührmann et al., 2018; Wahmann et al., 2013). The most prominent of these is probably the first, by Geddes et al. While the 154 countries these authors cover between 1946 and 2010 are fewer than the 195 and 1,999, respectively, covered by the latter two, Geddes et al.’s study includes more detailed information, for instance on the mode of transition (e.g. coup, popular uprising or electoral defeat), on the degree of violence that occurred during a given transition and on exact dates (instead of years) at which the authors see a regime as starting and ending. Also, it allows for longer time series than the one established by Hadenius et al.7 The Anckar–Fredriksson dataset goes back to 1800, but that brings only limited added value as several forms of authoritarianism have existed for a much shorter time
span (e.g. single-party authoritarianism); second, some of the definitional traits of authoritarianism (such as those relating to, e.g., political mobilization, repression, public communication) would seem to require a modern, post-industrial revolution world in and by themselves, so their applicability to regimes of the early 1800s seems at least highly questionable. Calculating average regime duration across more than 200 years does not, therefore, add too much relevant knowledge to today’s study of authoritarianism. Interestingly, the studies’ respective definitions of ‘political regime’ differ – not tremendously, but significantly – from one another.8 In that respect, too, then, the dataset provided by Geddes et al. is the one most easily compatible with the existing literature and its conceptual foundations. Geddes et al. organize their data along the question of who rules and create four authoritarian subtypes (dominant-party, personalist, military and monarchy). By contrast, Hadenius et al. base their set on the question of how leaders accede to power and arrive at three subtypes of authoritarianism (monarchy, military, electoral9). Cheibub et al., in their turn, look at who the leaders are and how they acceded to power. They arrive at six subtypes of regimes which include three democratic and three authoritarian subtypes (parliamentary, semi-presidential and presidential, plus monarchy, military and civilian). Thus, when looking at how differentiations among the group of authoritarian regimes are conceptualized, monarchies and military rule are included in all of them, but only Geddes et al. provide a category for ‘personalist’ regimes, roughly along the lines of what Juan Linz had earlier called ‘sultanism’,10 while Lührmann et al. remain lean in their four-category typology that covers both democracies and autocracies. 
Finally, Gothenburg’s V-Dem team has produced another dataset (Teorell and Lindberg, 2019) that reaches back to the French Revolution (1789–2016) and, like the first four datasets, measures
what they call ‘executive survival’, but also tests their subtypes of regimes – the same four basic types of autocracy and democracy as in Lührmann et al. – for properties such as corruption and repression. While the existence of these databases is a major advancement, a core underlying problem in all of them is that the typologies they establish mostly defy the principal rules of class-building efforts: that classes need to be not only jointly exhaustive, but also mutually exclusive. For instance, it is obviously not inconceivable that a monarchy might be ruled in a personalist manner. And would Iraq under Saddam Hussein come out as a single-party regime run by the Ba’ath Party, or as a personalist system run by Hussein? And would North Korea count as a multiparty electoral regime? A monarchy because power has become hereditary? Or personalist authoritarianism? The sixth dataset by von Soest and Grauvogel (2017) stands apart. Covering the relatively short time span from 1991 to 2010 only, it looks not primarily at regime durability but at legitimation strategies, which it assesses for only 98 countries. It differs from the ones discussed so far in that it does not look at institutions but at legitimation patterns, and tries to relate these to authoritarian subtypes. Six different claims to legitimacy are identified: foundational myth; ideology; personalism; procedures; performance; and international engagement. Their regime subtypology also differs from the other three: they suggest four types, three of which are authoritarian (closed authoritarian, hegemonic authoritarian, competitive authoritarian, democratic). This is an important addition to the mainstream databases as it zooms in on one particularly important field that is linked to the stability and life expectancy of political regimes, namely legitimacy (see more on this below). Yet Geddes et al. 
remains the most used, despite typological shortcomings – from which, it must be said, most of the other extant datasets suffer just as much.
Debates
Recent debates on authoritarianism have been strongly influenced by the fact that Fukuyama’s prediction of the ‘end of history’ was empirically wrong. Consequently, much attention in research was spent on questions of authoritarian resilience: what are the factors that give birth to authoritarianism in the first place? Which ones make it survive, thrive and resist challenges from within and abroad? While much attention has been devoted to institutional approaches, and to democratic-looking institutions in particular, an alternative structure of current debates would cover areas that are vital for the above key research questions. First, (1) the foundations, both social and economic, of authoritarianism are essential. But while the material origins and underpinnings of authoritarianism are important, the category of (2) legitimacy and legitimation strategies undoubtedly represents another ‘pillar of stability’, alongside (3) repression, which authoritarian regimes by definition rely on in their survival over time. But since, in a globalized period of world politics, national political regimes do not exist in isolation from one another but are more deeply influenced by outside developments than probably ever before, (4) international factors cannot be ignored in any analysis of political regime development. These four broader fields of inquiry thus merit a closer look.
Social and Economic Foundations of Authoritarianism
Regime Formation and Elite Constellations
First, important research has sought to explain the emergence and survivability of authoritarianism through its initial construction (regime-building), the constellations of rulers vis-à-vis elites and elite–society constellations. While a consensus exists that
authoritarianism’s various ways of emergence have given rise to divergent subtypes of rule (the dependent variable in these studies), dissent follows immediately after. ‘Selectorate theory’ (Bueno de Mesquita et al., 2003) builds on earlier findings by studies on democracy. It hypothesizes that the smaller the innermost circle of loyalists around an imagined leader (the winning coalition), the more likely authoritarian survival becomes, if the selectorate (those who have helped bring the ruler to power) is large; in principle, regime type does not matter here.11 By contrast, others have contended that the survival of ruling coalitions depends mainly on the initial purposes for which they came into being: Slater (2010) distinguishes ‘provision pacts’ from ‘protection pacts’, with the former guaranteeing, on average, better survival prospects for authoritarian regimes than the latter in the face of threats from the masses. Svolik (2012) identifies such threats from below as one out of two core challenges to autocratic survival, the other one being power-sharing among rulers and elites. In a different approach, Brownlee (2002) finds 15 cases in which extensive personalist neopatrimonial networks account for the re-stabilization of authoritarianism after crisis. What all these efforts have in common is that the number of variables omitted is usually large. Disaggregated empirical data that could specify concrete causal mechanisms that account for the overall claims is often absent or, if present, not generalizable. To capture more precisely who influences what decisions, and who sets what agendas, thus remains a challenge.
State–Society Relations and Civil Society
While for Bueno de Mesquita and other orthodox political economy approaches, the citizenry at large plays little if any role, a huge literature with a focus on institutions that regulate state–society relations has
emerged. UK and US-based scholars in particular have extensively examined the role of democratic-looking formal institutions in autocracies: political parties, party systems, competitive and non-competitive elections, voting behavior, parliaments, as well as formal and informal political oppositions, both ‘loyal’ and ‘anti-systemic’, and courts and constitutions.12 This literature has the merit of (re-)focusing on a long neglected and/or misread feature of authoritarianism, namely the existence of democratic-looking or ‘imitative institutions’ (cf. Schlumberger, 2004). We know now that such institutions do not usually signal a rapprochement to democracy but have been established to make autocracy more resilient because they fulfil functions that may range from patronage machines to opinion barometers, instruments of gaining non-democratic forms of responsiveness or facilitators of power-sharing, or may enhance both procedural security and legitimacy. Ultimately, they serve to control the public sphere, which under authoritarianism is key to political survival. The same applies to new media, which regimes actively use and manipulate to control public discourse and delegitimize potential challengers. In the realm of civil society, regimes have been able to create lookalike civil society organizations, frequently referred to as ‘GONGOs’ (governmentally organized NGOs), which mostly depend on the state in terms of finance, personnel and registration.
Resources and Distribution
Richness in exportable scarce resources such as oil or natural gas tends to strengthen authoritarian rule because it allows regime elites to realize rent income independently of domestic extraction, and to allocate these rents (Luciani, 1987; Ross, 2001, 2012; Smith, 2004; Ulfelder, 2007).13 While the richest of such countries have for a long time refrained from taxing their citizens at all, allocation is practiced in two broad
ways: first, strategically important elites on whose loyalty rulers need to rely benefit disproportionately through such allocation; second, rentier states also aliment the populace at large (through, e.g., subsidized basic foodstuffs, consumer goods or petrol; low-cost or free health and educational systems; other infrastructure). Other sources of such rent income include politically motivated external aid; tourism receipts; location rents due to important transit routes for ships, oil or gas pipelines; and even remittances by labor migrants back into their home economies (to the extent they can be taxed). Key, however, is the allocative power per person, so that countries with large populations and unequal distributional patterns (e.g. Nigeria) face a higher risk of violent conflict than those with small local populations and similar levels of rent income (e.g. the United Arab Emirates).
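The notion of ‘allocative power per person’ can be illustrated with simple arithmetic: the same rent income spread over very different population sizes. The sketch below is a deliberately crude illustration under stated assumptions. The class `RentierCase`, its field names and all figures are invented placeholders rather than data on any country, and the calculation ignores the distributional inequality that the text also identifies as a risk factor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RentierCase:
    """Hypothetical rentier state; all values are illustrative, not empirical."""
    name: str
    annual_rent: float  # total rent income in any consistent currency unit
    population: int     # residents among whom rents could be allocated

    def rent_per_capita(self) -> float:
        """Crude proxy for allocative power per person: total rent / population."""
        if self.population <= 0:
            raise ValueError("population must be positive")
        return self.annual_rent / self.population


# Two invented cases with identical rent income but different population sizes.
large = RentierCase("large_population_state", annual_rent=50e9, population=200_000_000)
small = RentierCase("small_population_state", annual_rent=50e9, population=1_000_000)

for case in (large, small):
    print(f"{case.name}: {case.rent_per_capita():,.0f} per person")
```

With these placeholder numbers the smaller state can allocate two hundred times as much per resident, which is the sense in which equal levels of rent income translate into very different capacities to provide for a population.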
Legitimacy and Legitimation
Sources of Legitimacy
For a long time, authoritarian rule was thought to be innately less legitimate than democratic rule because authoritarian regimes lack the kind of input legitimacy that is provided through regular free and fair elections. In democracies, such elections legitimize those who ascend to positions of political decision-making power, that is, incumbent elites. Under authoritarianism, by contrast, the masses have little to no say in the selection of their rulers or policies. But legitimacy can be derived from a multitude of sources. The literature has produced a broad spectrum of such potential sources of legitimacy for authoritarian regimes (cf., e.g., Schlumberger, 2010). Prominent among these are nationalism and other ideological sources, religion, foundational myths, war, avenues of political representation other than democratic elections, material benefits to specific target groups or to the population at large to demonstrate
effective regime performance (‘output legitimacy’, to borrow Scharpf’s term), procedural legitimacy through ‘imitative institutions’ that look like (and often carry the names of) their counterparts in democratic settings but fulfil different purposes, discursive strategies of othering, the construction of enemies, and other identity-based legitimacy claims. Legitimacy can thus be viewed as an aggregate category which, important as democratic elections are, is not usually built on any single source even in democracies. In that sense, then, there is no logical reason to consider authoritarian regimes as a priori less legitimate than democracies.
Output Legitimacy or Performance
An older debate about whether autocracies or democracies perform better at delivering desirable outcomes in the economic, social or cultural spheres has been complemented by a related and refined debate about the performance of various subtypes of authoritarianism. Once we acknowledge the importance of authoritarian institutions, the question is exactly which institutions and institutional arrangements produce which kinds of output on dimensions such as social performance, human development, the protection of property rights, environmental sustainability or responses to economic crises, with developmental performance being among the most prominent topics in that emerging literature. However, some of the findings generated by this work are less than coherent, as is evident, for instance, in the literature on the assumed impact of (authoritarian) elections on performance on a wide range of indicators. Some claim that if elections, however authoritarian, are competitive, this has positive effects on civil liberties (which might be a tautological assessment), on gender equality, on health and on education, whereas non-competitive elections would not have such beneficial side effects (Miller, 2015).
By contrast, McGuire (2013) finds that single-party regimes boast lower infant mortality than do multiparty regimes; Schedler (2013: 380ff.) argues that ‘bad elections are better than no elections’; and Little (2016) claims that elections benefit citizens under any condition, whether competitive or not. Overall, research on authoritarian performance remains a nascent strand of literature that needs to start adjudicating between a wide range of initially suggested findings that contradict each other and which, up to now, cannot be said to have produced consolidated knowledge. The operationalization of key variables as well as the quality and amount of data used leave a range of open questions as to generalizability.
Material Legitimacy, Social Inequality and Consequences
It has been said above that richness in scarce exportable resources can provide an economic basis for authoritarianism through the constant influx of rent revenues. This means authoritarian regimes that boast regular high influx of external rent income are better able to create legitimacy than others, so that the conclusion that ‘oil hinders democracy’ (Ross, 2001) seems valid. But we also know that bust periods in natural resource trade do not automatically lead to regime crisis in rentier states (Smith, 2006). Rather, as the political economy of the 2011 Arab revolts has demonstrated, it was the longer term relative societal deprivation which resulted from changes to the overall fiscal framework of the state that ultimately led to mass protests (Moore, 2015). Such fiscal reforms were pursued as part of neoliberal adjustment policies as prescribed by the international financial institutions and have, in the majority of cases, led to increased social inequalities. From here, there seems to be a potential link to ruptures in state–society relations and the breakdown of previously existing social contracts in which citizen acquiescence had been bought by material rewards. This, in turn, is related to regime performance, to repression
(of often ensuing protests) and to questions of waning regime legitimacy in the eyes of important segments of societies. While another consequence of global neoliberalism, the frequently seen amalgamation of regime interests with those of private business elites, can, on the one hand, constitute powerful underpinnings of authoritarian rule, it can also provide a cause for instability when inequality and social injustice become so big that citizens take to the streets in anger and frustration despite real risks of facing repression. In an opposite hypothesis, Solt (2012) contends that once citizens have been exposed to social inequality for longer periods, acceptance of authority comes more naturally and thus would lead to authoritarianism (‘relative power theory’). By contrast, Acemoglu and Robinson (2006) find that both very low and very high levels of inequality make authoritarianism more likely, while medium levels of inequality are more likely related to democracy.
Repression
Following Gerschewski’s (2013) reflections, repression complements legitimacy as another cornerstone for authoritarian rule to remain ‘stable’ and survive over time. But repression is notoriously hard to study empirically, as real dangers for scholars and informants are often prohibitively high. Methodological issues such as preference falsification add to these difficulties. It is for these reasons that repression, in comparison to quasi-democratic institutions, has received far less attention.14 There is a potentially inverse relationship between repression and legitimacy (Schlumberger, 2004; cf. also Gerschewski, 2013): while gains in legitimacy reduce the need to control oppositional actors by force, repression aims at preventing oppositional actors from building up their own sources of legitimacy. On the other hand, not only can the potentially very high costs of repression be
significantly lowered when repression is discursively legitimized by regime elites, as Edel and Josua (2018) demonstrate; in the extreme case, repression, if successfully framed, can even lead to gains in legitimacy among both domestic and international audiences. There is, thus, a direct, but complex, nexus between legitimacy, repression and dissent. Now, when repression is exerted, the results are not obvious: the effects of repression on public dissent can be positive, negative or not discernable at all. This so-called punishment puzzle (Davenport, 2007) still remains unresolved, even though several authors have hinted at potentially relevant variables that might explain the effects of repression. Yet, up to now most studies in the field of repression seem to treat the phenomenon as a dependent variable: when, why, how, to what extent, in which forms and to what avail is repression employed by authoritarian regimes? In order to answer such questions, the empirical phenomenon needs to be disaggregated into its spatial, temporal, typological and actor-related dimensions. A range of studies have differentiated not only between levels of repression, but also their various forms (hard versus soft, incapacitating versus constraining, and so on), in addition to the institutions of repression, its perpetrators, as well as its primary and secondary targets. Also, variation in repression can be seen as caused by the institutional setup of the regime, by the nature of the challenges and their perception by regime elites, and/or, in allusion to Weber’s definition of the state, by ‘state capacities’ (Josua and Edel, 2015). Further important aspects of repression consist in its internationalization and commercialization or privatization, which points to questions of agency, including for agents outside institutionalized regime actors themselves. 
Overall, then, the new research on repression has tackled an important and understudied field and it is quite natural that it has, as of yet, produced more open questions than definite answers.
International Factors
There are four main ways in which international factors impact on authoritarian resilience that can be distinguished analytically: (1) democracies promoting, consciously or unconsciously, authoritarian resilience abroad for either geostrategic or economic reasons or for lack of better understanding; (2) autocracies actively and consciously supporting and promoting authoritarianism elsewhere; (3) domestic authoritarian rulers engaging in processes of active and conscious learning from authoritarians abroad; and (4) structural international factors that operate on a regional or global basis and enhance the survival prospects of authoritarianism in a given country. An important early distinction in this field is that made in the various works authored by Levitsky and Way (e.g., 2002) on Western powers’ leverage over political developments elsewhere, including not only policy choices but also polity-related institutional choices on the one hand, and changes induced through an increasing density of interactions between two countries or increased linkages between them (the exchange of goods and persons; border-transgressing media including social media and the like; migration) on the other hand. The latter need not be accompanied by any intent by the influencing power but assumes that a higher density of interaction structurally influences the recipient country. The open question, however, is: who influences, and who adopts? In contrast to earlier assumptions, there is no inherent reason to believe that only democracies spread their norms and ideas. Autocracies fare no worse in spreading their mode of governance than do democracies. Not only have foreign policy efforts specifically aimed at aiding democracy abroad been notoriously ineffective; also, they can have unintended opposite effects. Earlier naive assumptions about European foreign policies spreading democratic values by providing a credible role model have probably never had
much empirical substance. Western powers have not, in actual policy, accorded the priority to democracy aid that is found in their political rhetoric (Youngs, 2010; Schlumberger, 2006). If, however, foreign aid is not regime-neutral, then Western policy-makers need to start reflecting more deeply than they have in the past about the usefulness of lending aid, lucrative arms contracts or legitimating support through rhetoric or symbol politics to closed dictatorships like Saudi Arabia. Large autocracies have in fact made inroads by appearing less paternalistic toward developing nations, particularly in sub-Saharan Africa, while the traditional colonial powers in Europe and the United States for the most part have a track record of acting condescendingly or clumsily in their foreign policies. By contrast, there is increased evidence of authoritarian learning from other autocracies’ experience, as well as a diffusion of strategies of regime maintenance and cooperation (e.g. Erdmann et al., 2013). Looking at the ‘recipient’ side, even smaller nations possess remarkable agency. They emphasize themes that are prioritized by donors and can thus result in international support. The surge of globalized terrorism along with the discourse of a ‘war against terror’ provided many autocracies with a pretext to repress at home – and even to be rewarded for doing so internationally. Alongside the ineffective democracy assistance policies of Western powers, autocracies promoting autocracy abroad has become a standard feature not only empirically but also in scholarly research.
So-called black knights (re-)stabilize autocracy (such as Russia in Syria, parts of Ukraine or Belarus); they pursue policies aimed at disrupting the political process in established democracies (such as that same actor’s meddling with the US presidential elections in 2016); and they spread false rumors and ‘fake news’ that aim at discrediting other countries’ governments, their domestic opponents, or democracy as such. What is less well examined to date is material support given by dictatorial regimes
to anti-systemic forces in established democracies; there is by now evidence of systematic material support lent by Russia’s leadership to extremist right forces (both parties and movements) in many European countries, underpinned by ideational partnerships (such as Germany’s right-wing party’s youth organization, which is twinned with the Putin Youth), mutual visits and invitations.15 Overall, then, the international arena has tilted considerably, over the past two decades, in favor of authoritarianism – a trend that has often been facilitated by established democracies. Whether and what consequences this has for (1) further trends in regime developments domestically and (2) international politics and peace is still unclear, but a range of perspectives remain to be discussed.
Perspectives
Authoritarianism has been on the rise over the past decades, as outlined above. So has its average life expectancy (Frantz, 2018: 120f.). The lesson from this is that the study of authoritarianism that has flourished over the past 20-odd years now needs to enter a new phase in which initial findings are adjudicated, knowledge is consolidated and new research questions are answered. Three fields in particular seem to merit deeper investigation: conceptual, methodological and epistemological issues; the global rise of populism and personalization of political rule; and the trend towards authoritarianization in formerly established democracies.
Methodological and Epistemological Challenges in Researching Authoritarianism
Despite the many advances the new research on authoritarianism has produced over the past two decades, our knowledge on the topic, when compared to our state of research
on democracy, is still in a state of adolescence at best. More new hypotheses have been generated than have been rigorously tested. The past 20 years of research have produced too many contradictory findings and claims, and too few established ways of adjudicating between contending results. There are competing explanations for core questions, with too little discrimination between various possible answers. In order to consolidate our knowledge, distinguishing solid from invalid findings remains probably the single most important challenge for research over the next decade. Even though the new authoritarianism research might recently have become the fastest growing area in comparative politics, political science still means, for a large part, ‘researching autocracy in a discipline of democracy’ (Ahram and Goode, 2016). This goes for the methodological, empirical and epistemological levels alike and has serious repercussions for research. On a methodological level, preference falsification is only one such repercussion. If, as is the case in authoritarian environments by definition, transparency is absent and data production and dissemination is controlled by the regime, adopting such data often means buying into the frames that have been deliberately produced as part of a legitimizing discourse, which in turn can represent an ethical challenge. Naively adopting whatever information is available is clearly not an option, but cross-checking information for accuracy is often equally impossible. A recent renewed boost of quasi-experimental methods has led, at times, to grossly invalid research; mainstream quantitative methods are for the most part equally problematic, as they must rely on data whose accuracy cannot be evaluated independently and for which the core assumption needs to be that they were produced in the interest of ruling elites.
Qualitative empirical studies, on the other hand, bear real danger for both scholars and their informants, and suffer from lesser generalizability. Overall,
thus, political scientists need to devote more explicit attention to how such challenges impact on their work and how they can be overcome. The resulting epistemological problems have only just begun to be realized.
Populism and Personalization
There is today an evident global trend towards increased populism in politics, policies and institutional developments; this trend exists across political regime types, that is, in both democracies and autocracies. While neither left nor right-wing populism is automatically associated with authoritarianism or a decline in ‘democraticness’,16 it empirically is when populism and its notion of who ‘the people’ are become exclusive instead of inclusive. The rhetoric employed by the proponents of exclusive versions of populist politics does not usually acknowledge political adversaries as legitimate contenders in the political game; rather, it depicts them as corrupt, rotten and illegitimate, and paints the present as apocalyptic and the future as threatened. Typically, thus, only the populists’ own movement or group can bring salvation while, in this narrative, an incumbent ‘establishment’ or ‘system’ is conspiring against ‘us’, against ‘the nation’ or against ‘the people’. While in some democratic countries, latently or manifestly anti-democratic populist politicians, movements and parties have gained their largest influence since World War II,17 such groups or individuals have even assumed government office through elections in certain instances (in the US, Hungary, Poland, temporarily in Austria, for instance). In those instances where they did assume governing responsibility, there is a clear trend toward democratic erosion (e.g. the United States, Austria) or even towards authoritarianization, that is, the deliberate dismantling of democratic rule and its transition to authoritarianism (e.g. Hungary, Poland, Venezuela).
But the trend is not confined to Europe or ‘the West’. Populist ruling styles have also become popular in places as different as Russia, Turkey, Egypt or Venezuela; they can thus also be observed in already authoritarian contexts. In Turkey, President Erdogan managed to re-organize the ownership structure of the entire media sector (and large parts of private business in general), as well as incarcerating more journalists per inhabitant than any other country on earth. President Putin of Russia has successfully brought under control or into exile the group of super-rich oligarchs on whom his predecessor largely depended and has concentrated more power in his hands than many former Soviet leaders. President Sisi of Egypt, during his 2013 campaign, displayed pictures of a young boy shaking the hand of legendary Arab leader Gamal Abdel Nasser, claiming the boy was himself in his youth and insinuating a continuation of the charismatic leadership for which Nasser had become famous, while disenfranchising the former ruling party to a greater degree than was seen even in Turkey.18 The rise of populism is thus equally dynamic in authoritarian regimes, which today tend to become more personalized than they were half a century ago – this, some claim, ‘spells trouble for global peace and democracy’ (Frantz, 2018: 103).
Authoritarianization
The birth of authoritarianism has been studied, but we still have little to no consolidated knowledge about where it comes from, by whom it is brought about, and how it becomes established as ‘the only game in town’. Apart from the above-mentioned regime formation processes after independence or secession, authoritarianism’s expansion in the 21st century has begun to replace established democracies predominantly not by the formation of new states, nor by coups d’état, military insurgency or revolutions from above. Authoritarianization today usually occurs as
a creeping encroachment on the discourses, norms, values and institutions of liberal democracy, up to the point that discourses have changed, norms have been replaced and institutions have turned into mere shells, while new authoritarian institutions, patterns of mobilization and policy-making have sprung up and become established. Democracy’s death and the birth of autocracy come creeping. To the surprise of many, authoritarianism (cf., i.a., Zielonka 2018 on Europe) has thus made its most successful inroads in precisely those contexts in which modernization theorists of the past (and present) would predict democracy to be most robust or ‘sustainable’, to borrow Lipset’s words: in reasonably established democracies that tend towards the ‘centers’, not the ‘periphery’, of the global system, where the political system supposedly is also embedded in a civic culture dominated by democratic values, and where regionally, risks of an externally induced black knight overthrow are usually assumed to be lesser than in developing regions. In these established democracies, a trend of ‘democratic backsliding’ was first decried by Freedom House in 2007 and has continued unabatedly until today; on a global scale, this has impacted negatively on the ‘quality of democracy’.19 Observers warn that democracies need to be more strongly defended at home and/or better aided abroad, while skeptics question the correctness of such figures, which they see as grounded more in a change in scholarly perception than in empirical facts.
However, two decades into the 21st century it has become clearer that, rather than a decrease in the quality of democracy, developments in places such as Hungary, Poland, Turkey and Venezuela are examples of the broader tendency toward authoritarianization.20 Unfortunately, the two processes are not distinguished by the main democracy indices due to the gradualist approach toward changes in/of political regimes which these indices take; in reality, they are qualitatively different, because one signals change within
regime whereas the other represents change of regime. Authoritarianization and personalization are, thus, arguably the two most striking current global tendencies in the study of authoritarian regimes, and they are interlinked. Their scholarly adoption as a research topic was triggered primarily by the advent of Donald Trump to the American presidency and the rapid rise of anti-democratic right-wing populist movements and parties in Europe. Reflections on ‘how democracies end’ (Runciman, 2018) abound. But scholars are likely just beginning to realize how dramatic might be the phenomenon that will likely occupy our minds for years to come – at least, for as long as we are politically able to research the issue.
Notes
1 Independent of how authoritarian regimes perform on other criteria, this is an important background for both politics and for academia. It is at the same time a good reason why studying authoritarianism will remain important in comparative politics.
2 A political regime can be defined as ‘the formal and informal organization of the center of political power, and of its relations with the broader society. A regime determines who has access to political power and how those who are in power deal with those who are not’ (Fishman, 1990: 428).
3 See, for instance, Schedler’s (2006) ‘electoral authoritarianism’: when virtually all authoritarian regimes hold elections, it is unclear what the adjective is needed for. The assumption that somehow authoritarian regimes with elections are more likely to democratize or are per se better than others (Schedler, 2013: 380) defies empirical and statistical counterevidence (cf. the work done by Gandhi and Przeworski (2007) or the studies by Geddes, Wright and Frantz (2014)). Cf. Snyder (2006) for a critical discussion of ‘electoral authoritarianism’.
4 Dorenspleet (2000: 395) counts 36 out of 44 new states in Africa that were run by authoritarian regimes between 1958 and 1972.
5 The concept dates back to Weber (1922) and has received broader attention in the study of developing countries since the 1970s. For an overview, cf. Erdmann and Engel (2007). As a subtype of authoritarianism, it is defined by – ideal-typically – a personalist leader who rules through an extensive patronage network. While formal state institutions such as a modern bureaucracy, military, security agencies and often also political parties or parliaments do exist, these formal institutions are penetrated by informal patterns of interaction that are decisive in determining Lasswell’s famous ‘who gets what how’.
6 The exceptions to this rule are, of course, Israel in its 1948 borders (democratic); Turkey between 1923 and, roughly, 2010 (single-party/military; later democratic), and the Islamic Republic of Iran (authoritarianism with factionalized elites and massive contestation despite the existence of a ‘Supreme Leader’).
7 The Swedish team only covers the time period from 1972 to 2010, while Cheibub et al.’s dataset ends in 2008 with no update in sight.
8 Roughly, both Hadenius et al. and Cheibub et al. rely on an understanding of ‘political regime’ as the ‘institutions on which elites rely in order to regulate the access to and maintenance of public authority’ (Hadenius et al., 2012: 21); Geddes et al.’s understanding of ‘regime’ comes closer to Fishman’s (1990) almost classical definition, namely ‘the set of basic formal and informal rules that determine who influences the choice of leaders – including rules that identify the group from which leaders can be selected – and policies’ (Geddes et al., 2014: 327). By contrast, Cheibub et al. define an ‘authoritarian regime’ as being non-democratic, which is a matter that would merit more discussion than is possible here.
9 Within electoral authoritarian regimes, they further differentiate between multiparty, no-party and single-party regimes.
10 To make things more complicated yet, Linz did not view Sultanism as a subtype of authoritarianism, but as a regime type.
11 This model has been criticized for omitting a multitude of known relevant variables. For instance, it ignores not only qualitative differences between democracy and autocracy such as the necessity, in democracies, to govern through the rule of law, but also a broad range of other factors such as any normative conviction that might lead leaders and followers to adhere to democratic norms in democracies.
12 As exemplified in the works by Gandhi and Lust-Okar (2009); Magaloni (2009); Brown (2001); Ginsburg and Simpser (2013); Albrecht (2005), and others.
13 Further refinement of the claims of the rentier state approach has been made in various fields and directions. For recent critical discussions, see Waldner and Smith (2015) or Brynen et al. (2012: chapter 9).
14 For an overview, cf. Davenport and Inman (2012).
15 This goes at least for the ruling parties in Hungary and Poland, for the French Front National and for the German Alternative für Deutschland. But similar links are said to exist to extreme right and/or anti-democratic forces in other European countries.
16 A counter example, arguably, was Bolivia under the presidency of Evo Morales.
17 Particularly in Western Europe, including France, the United Kingdom, Germany, Finland, Denmark, Sweden, Greece, the Netherlands and others. On Europe in particular, cf. Zielonka (2018).
18 Even China, where the leadership of the Communist Party has ruled collectively since the 1940s, has now abolished the terms of office of its leader, indicating that the global wave toward not only populist but personalized authoritarian rule might have reached the shores hitherto deemed by many to be the most unlikely.
19 On that concept and as a structured introduction, cf., e.g., Diamond and Morlino (2005).
20 ‘Authoritarianization’ can thus be defined as ‘one type of democratic backsliding that results in the establishment of a dictatorship’ (Frantz, 2018: 94).
References
Acemoglu, Daron and Robinson, James (2006). The Economic Origins of Dictatorship and Democracy. New York: Cambridge University Press.
Adorno, Theodor W., Frenkel-Brunswik, Else, Levinson, Daniel and Sanford, Nevitt (1950). The Authoritarian Personality. New York: Harper.
Ahram, Ariel and Goode, Paul (2016). ‘Researching Authoritarianism in the Discipline of Democracy’. Social Science Quarterly, 97(4), 834–49.
Anckar, Carsten and Frederiksson, Cecilia (2018). ‘Classifying Political Regimes 1800–2016: A Typology and a New Dataset’. European Political Science, 18(1), 1–13 [online].
Arendt, Hannah (1951). The Origins of Totalitarianism. New York: Harcourt.
Brown, Nathan (2001). Constitutions in a Non-Constitutional World: Arab Basic Laws and the Prospects for Accountable Government. Albany, NY: State University of New York Press.
Brownlee, Jason (2002). ‘… And Yet They Persist: Explaining Survival and Transition in Neopatrimonial Regimes’. Studies in Comparative International Development, 37(3), 35–63.
Brynen, Rex, Moore, Pete W., Salloukh, Bassel F. and Zahar, Marie-Joelle (2012). Beyond the Arab Spring: Authoritarianism and Democratization in the Arab World. Boulder, CO: Lynne Rienner.
Bueno de Mesquita, Bruce, Smith, Alastair, Siverson, Randolph and Morrow, James (2003). The Logic of Political Survival. Cambridge, MA: MIT Press.
Cheibub, José Antonio, Gandhi, Jennifer and Vreeland, James Raymond (2010). ‘Democracy and Dictatorship Revisited’. Public Choice, 143(1–2), 67–101.
Collier, David (ed.) (1979). The New Authoritarianism in Latin America. Princeton, NJ: Princeton University Press.
Dahl, Robert (1971). Polyarchy: Participation and Opposition. New Haven, CT: Yale University Press.
Davenport, Christian (2007). ‘State Repression and Political Order’. Annual Review of Political Science, 10, 1–23.
Davenport, Christian and Inman, Molly (2012). ‘The State of State Repression Research since the 1990s’. Terrorism & Political Violence, 24(4), 619–34.
Diamond, Larry and Morlino, Leonardo (eds) (2005). Assessing the Quality of Democracy. Baltimore, MD: Johns Hopkins University Press.
Dorenspleet, Renske (2000). ‘Reassessing the Three Waves of Democratization’. World Politics, 52(3), 384–406.
Edel, Mirjam and Josua, Maria (2018). ‘How Authoritarian Rulers Seek to Legitimize Repression: Framing Mass Killings in Egypt and Uzbekistan’. Democratization, 25(5), 882–900.
Erdmann, Gero and Engel, Ulf (2007). ‘Neopatrimonialism Reconsidered: Critical Review and Elaboration of an Elusive Concept’. Commonwealth & Comparative Politics, 45(1), 95–119.
Erdmann, Gero, Bank, André, Hoffmann, Bert and Richter, Thomas (2013). ‘International Cooperation of Authoritarian Regimes: Towards a Conceptual Framework’, GIGA Working Paper, No. 229 (July).
Fishman, Robert M. (1990). ‘Rethinking State and Regime: Southern Europe’s Transition to Democracy’. World Politics, 42(3), 422–440.
Frantz, Erica (2018). Authoritarianism: What Everybody Needs to Know. New York: Oxford University Press.
Friedrich, Carl J. and Brzezinski, Zbigniew (1956). ‘The General Characteristics of Totalitarian Dictatorship’. In Carl J. Friedrich and Zbigniew Brzezinski (eds) Totalitarian Dictatorship and Autocracy (pp. 15–27). Cambridge, MA: Cambridge University Press.
Gandhi, Jennifer and Lust-Okar, Ellen (2009). ‘Elections Under Authoritarianism’. Annual Review of Political Science, 12, 403–422.
Gandhi, Jennifer and Przeworski, Adam (2007). ‘Authoritarian Institutions and the Survival of Autocrats’. Comparative Political Studies, 40(11), 1279–1301.
Geddes, Barbara (1999). ‘What Do We Know about Democratization after Twenty Years?’ Annual Review of Political Science, 2, 115–44.
Geddes, Barbara, Wright, Joseph and Frantz, Erica (2014). ‘Autocratic Breakdown and Regime Transitions: A New Data Set’. Perspectives on Politics, 12(2), 313–31.
Gerschewski, Johannes (2013). ‘The Three Pillars of Stability: Legitimation, Repression, and Co-Optation in Autocratic Regimes’. Democratization, 20(1), 13–38.
Ginsburg, Tom and Simpser, Alberto (eds) (2013). Constitutions in Authoritarian Regimes. Cambridge: Cambridge University Press.
Hadenius, Axel, Teorell, Jan and Wahmann, Michael (2012). The Authoritarian Regimes Dataset. Retrieved from: https://sites.google.com/site/authoritarianregimedataset/.
Josua, Maria and Edel, Mirjam (2015). ‘To Repress or Not to Repress? Regime Survival Strategies in the Arab Spring’. Terrorism & Political Violence, 27(2), 289–309.
Linz, Juan J. (1964). ‘An Authoritarian Regime: Spain’. In Erik Allardt and Yrjö Littunen (eds) Cleavages, Ideologies, and Party Systems: Contributions to Comparative Political Sociology (pp. 291–341). Helsinki: The Academic Bookstore.
Linz, Juan (1975). Totalitarian and Authoritarian Regimes. Reading, MA: Addison-Wesley.
Little, Andrew T. (2016). ‘Are Non-Competitive Elections Good for Citizens?’ Journal of Theoretical Politics. Retrieved from: http://jtp.sagepub.com/content/early/2016/02/26/0951629816630436.full.pdf.
Luciani, Giacomo (1987). ‘Allocation vs. Production States: A Theoretical Framework’. In Hazem Beblawi and Giacomo Luciani (eds) The Rentier State (pp. 193–212). London: Croom Helm. Lührmann, Anna, Tannenberg, Marcus and Lindberg, Staffan (2018). ‘Regimes of the World (RoW): Opening New Avenues for the Comparative Study of Political Regimes’. Politics and Governance, 6(1), 60–77. Magaloni, Beatriz (2009). Voting for Autocracy: Hegemonic Party Survival and ist Demise in Mexico. Cambridge: Cambrigde University Press. McGuire, James (2013). ‘Regime and Social Performance’. Contemporary Politics, 19(1), 55–75. Miller, Michael (2015). ‘Electoral Authoritarianism and Human Development’. Comparative Political Studies, 48(12), 1526–62. Moore, Pete (2015). ‘Fiscal Politics of Enduring Authoritarianism’. In M. Lynch (ed.) The Arab Thermidor: The Resurgence of the Security State (pp. 24–29). POMEPS Studies, No. 11. O’Donnell, Guillermo (1973). Modernization and Bureaucratic-Authoritarianism: Studies in South American Politics. Institute of International Studies, Politics of Modernization Series No. 9. Berkeley: University of California Press. Ross, Michael (2001). ‘Does Oil Hinder Democracy?’ World Politics, 53(3), 325–61. Ross, Michael (2012). The Oil Curse: How Petroleum Wealth Shapes the Development of Nations. Princeton, NJ: Princeton University Press. Runciman, David (2018). How Democracy Ends. London: Profile Books. Sartori, Giovanni (1991). ‘Comparing and Miscomparing’. Journal of Theoretical Politics, 3(3), 243–57. Schedler, Andreas (ed.) (2006). Electoral Authoritarianism: The Dynamics of Unfree Competition. Boulder, CO: Lynne Rienner. Schedler, Andreas (2013). The Politics of Uncertainty: Sustaining and Subverting Electoral Authoritarianism. Oxford: Oxford University Press. Schlumberger, Oliver (2004). ‘Political Liberalization, Authoritarian Regime Stability, and Imitative Institution-Building: Towards a Formal Understanding’. Paper presented to
Authoritarianisms and Authoritarianization
the Fifth Mediterranean Social and Political Research Meeting of the Robert Schuman Centre, European University Institute, Florence, Italy. Schlumberger, Oliver (2006). ‘Dancing with Wolves: Dilemmas of Democracy Promotion in Authoritarian Contexts’. In Dietrich Jung (ed.) Democratization in the Middle East: New Political Strategies (pp. 33–60). New York: Palgrave. Schlumberger, Oliver (2010). ‘Opening Old Bottles in Search of New Wine: On Nondemocratic Legitimacy in the Middle East’. Middle East Critique, 19(3), 233–50. Slater, Dan (2010). Ordering Power: Contentious Politics and Authoritarian Leviathans in Southeast Asia. New York: Cambridge University Press. Smith, Benjamin (2004). ‘Oil Wealth and Regime Survival in the Developing World, 1960–1999’. American Journal of Political Science, 48(2), 232–46. Smith, Benjamin (2006). ‘The Wrong Kind of Crisis: Why Oil Booms and Busts Rarely Lead to Authoritarian Breakdown’. Studies in Comparative International Development, 40(4), 55–76. Snyder, Richard (2006). ‘Beyond Electoral Authoritarianism: The Spectrum of Nondemocratic Regimes’. In Andreas Schedler (ed.) Electoral Authoritarianism: The Dynamics of Unfree Competition (pp. 219–31). Boulder, CO: Lynne Rienner. Soest, Christian von and Grauvogel, Julia (2017). ‘Identity, Procedures and Performance: How Authoritarian Regimes
729
Legitimize Their Rule’. Contemporary Politics, 23(3), 287–305. Solt, Frederick (2012) ‘The Social Origins of Authoritarianism’. Political Research Quarterly, 65(4), 703–13. Svolik, Milan (2012). The Politics of Authoritarian Rule. Cambridge: Cambridge University Press. Teorell, Jan and Lindberg, Staffan (2019). ‘Beyond Democracy-Dictatorship Measures: A New Framework Capturing Executive Bases of Power, 1789–2016’. Perspectives on Politics, 17(1), 66–84. Ulfelder, Jay (2007). ‘Natural Resource Wealth and the Survival of Autocracy’. Comparative Political Studies, 40(8), 995–1018. Wahman, Michael, Teorell, Jan and Hadenius, Axel (2013). ‘Authoritarian Regime Types Revisited: Updated Data in Comparative Perspective’. Contemporary Politics, 19(1), 19–34. Waldner, David and Smith, Benjamin (2015). ‘Rentier States and State Transformation’. In Stephan Leibfried et al. (eds) The Oxford Handbook of Transformations of the State (pp. 714–29). Oxford: Oxford University Press. Weber, Max (1922). Wirtschaft und Gesellschaft. Tübingen: J. C. B. Mohr. Youngs, Richard (2010). The European Union and Democracy Promotion: A Critical Global Assessment. Baltimore, MD: Johns Hopkins University Press. Zielonka, Jan (2018). Counterrevolution: Liberal Europe in Retreat. Oxford: Oxford University Press.
43 Democracies Philippe C. Schmitter
In the real world of politics, democracy in the singular does not exist. There is only a large (recently growing) number of regimes (Whitehead, Chapter 52, this Handbook) that describe themselves as democracies and share a common core of principles – political equality, participation and accountability being the most important. They embody these principles through a wide variety of distinctive rules and practices. None of these configurations conforms strictly to the etymological meaning of the original Greek term: demos + kratos, or ‘rule of or by the people’. All modern versions are much more accurately described as regimes which are governed by politicians who claim to represent the people because they have competed in and won an election and can subsequently claim to rule on behalf of the people. It would be more accurate to call them ‘politocracies’, but the concept of democracy has stood remarkably firm against replacement, despite the great changes in its concrete embodiments.
By most accounts, the term ‘democracy’ first appeared in ancient Athens. Its birthdate is sometimes given as 508 BC, when Cleisthenes introduced major reforms that expanded the role of its (adult, male, native-born, tax-paying) citizens and changed the nature of their constituencies. It continued to evolve through a series of advances and reversals until Philip II of Macedon conquered the city-state in 338 BC. Moreover, it seems probable that similar elements of citizenship and popular participation in government were more widespread – not just in Greece, but elsewhere in the city-states of Phoenicia and Mesopotamia. What distinguishes the Athenian version is the articulacy with which its participants and observers reflected upon their respective experiences with democracy and wrote about them – often unfavorably. The surviving works of Plato, Aristotle and Thucydides provide the ‘classic’ basis of our understanding of what democracy is (or, better, was). Some of these same features of citizenship were present in the Roman Republic
and subsequently in various independent Italian city-states, but democracy by and large had disappeared from political practice by the 15th century – kept alive only in a few Swiss mountain cantons and the island of Iceland. It did not even survive as a popular aspiration, since virtually no one believed that this form of political domination was appropriate for any larger-scale polity – and even then, it was associated with persistent disorder and ‘mob rule’. This changed in the 18th century, when a series of institutional innovations were introduced which radically altered its practice and made it applicable to the larger political units – national states – that had been emerging during the previous century in Western Europe. Moreover, changes in the structure of developing European capitalist economies also produced a group – the bourgeoisie – that had important interests in challenging the authority of established regimes based on monarchy, aristocracy or theocracy and, thereby, in obtaining ruling positions for themselves or their representatives. What emerged was something that could be called ‘democracy with lots of adjectives’: liberal, constitutional, representative, national, electoral, and, of course, capitalist.
While these regimes differed considerably in their formal institutions of government – presidential versus parliamentary, unitary versus federal, uni-cameral versus bi-cameral, majoritarian versus consensual, monarchic versus republican – they shared a set of common principles:
• an exclusive emphasis on the individual person as the basis for citizenship and on individual motivations and choices as the basis for political action – substantive and procedural;
• a strong commitment to voluntarism in the form and content of political participation, as well as in the recruitment of politicians;
• an insistence on formal political rights and their protection by pre-established constitutional/legal norms that place these rights beyond political contention;
• a fixation with territorial representation and electoral competition for providing the primary legitimate link between citizens and public authorities and for ensuring the accountability of the latter;
• confinement to the boundaries of emerging national states, as well as a (tacit) complicity with nationalism, despite claims to cosmopolitan validity;
• an ingrained hostility to coercive public authority, especially when backed by large numbers of less privileged citizens and, therefore, an affinity for complex systems of ‘checks and balances’;
• a restriction of formal equality in political rights and obligations and an indifference to the systemic persistence of inequalities in the distribution of benefits, the representation of interests and the pursuit of influence produced by the capitalist economy in which they were all embedded.
While initially ‘liberal’ democracies varied considerably in the criteria for acquiring and exercising citizenship, participation and accountability, these practices gradually and fitfully became relatively standardized such that by the end of the 20th century all permanent residents holding the proper nationality by birth or naturalization, regardless of gender, religion, education, cultural origin or other sources of social or economic discrimination, had been granted equal political rights. Only age – 18 in most cases, 16 in a few – remains as a formal barrier, although in the contemporary context there is a continuing dispute over the status of legally resident but foreign-born nationals and over the voting rights of national citizens living outside their country of origin.
Definitions
The formal practice of democracy may have become increasingly standardized, but its meaning remains ‘essentially contested.’ Most definitions emphasize some or all of the institutional characteristics mentioned above, and these have been employed in most quantitative scholarly analyses to measure empirically whether a given regime is ‘really’ democratic and, if so, how democratic.
Moreover, the effort at definition is almost inevitably contaminated by the normative connotations that the concept carries – namely, that it is considered ‘good’ to be democratic and therefore it is good that the definition includes the peculiar traits of one’s own regime. At one time, the political marketplace was flooded with labels such as ‘people’s democracy’; ‘proletarian democracy’; ‘guided democracy’; ‘theocratic democracy’; ‘Asian’, ‘African’ or ‘Islamic democracy’ – not to mention such recent aberrations as ‘delegative democracy’, ‘hybrid democracy’, ‘ersatz democracy’, ‘defective democracy’ and, worst of all, ‘democradura’. Many scholars have simplified matters by concentrating on the absence or presence of only one (admittedly central) feature of all liberal democracies – namely, the presence of regular, ‘free and fair’ contested elections of uncertain outcome. Not infrequently they have justified this choice by citing the definition proposed by the economist Joseph Schumpeter (1942): democracy is ‘that institutional arrangement for arriving at political decisions in which individuals acquire the power to decide by means of a competitive struggle for the people’s vote’. The political scientist Robert Dahl (1971: 235–6) goes beyond the usual minimal electoral criteria by introducing a number of ancillary conditions that are necessary to ensure that these elections are not merely ritualistic, but actually empower citizens before and after making their choice:
1 Freedom to form and join associations.
2 Freedom of expression.
3 Right to vote.
4 Right of political leaders to compete for support.
5 Alternative sources of information.
6 Free and fair elections.
7 Institutions for making government policies depend on votes and other expressions of preference.
Dahl proposed calling the regimes respecting these norms ‘polyarchies’, rather than
democracies. His empirical criteria have been applied by many subsequent scholars, even if they have continued to use the venerable label ‘democracy’ for the subject of their study. A different approach is to define democracy according to the process that it is supposed to empower, and to leave unspecified the particular institutions and practices that are intended to ensure that this process is effective. Philippe Schmitter and Terry Karl (1991: 4) have proposed the following definition: ‘Modern Political Democracy is a regime or system of government in which rulers are held accountable for their actions in the public realm by citizens, acting indirectly through the competition and cooperation of their representatives.’ How, as well as when, this interaction between rulers, representatives and citizens takes place remains to be discovered and implemented in accordance with the social, cultural and economic conditions in each country. Many different institutional configurations can, at least potentially, fulfill the function of ensuring accountability. In the liberal orthodoxy, elections are supposedly unique in doing so. However, even if they are free and fair (but invariably biased in terms of the material resources available to competitors), it is possible for the political parties involved to collude to restrict the range of alternatives offered to voters and to form a de facto cartel in order to share power and its spoils (Katz and Mair, 2009). The Italians have even invented a word for this: partitocrazia. In the sort of polyarchy imagined by Dahl, citizens would have at their disposal other channels of representation and strategies of protest that might be more effective. Even public opinion, if reliably and convincingly known, can be effective in inhibiting rulers ex ante from proposing or pursuing policies that they anticipate will trigger a contrary mobilization among citizens.
If the distinction between procedures and processes were not enough, many definitions rely upon product. They stress the substantive
outcomes that democracies are presumed to produce. For liberal democrats, this means ‘freedom’, defined negatively as the avoidance of arbitrary or unreasonable constraints on individual behavior. Social democrats look for ‘equality’ in social and economic conditions for all citizens. For the less ideologically minded, what counts is a ‘civic political culture’, namely, the extent to which citizens believe that their government is democratic and that this is sufficiently important to them that they voluntarily obey its laws. Behind the diversity in procedures, processes and products are several competing but not conflicting generic ‘models’ of how democracy should operate, as illustrated in Table 43.1. Two are realistic and claim to describe actual rules and practices; two are idealistic and advocate potentially better rules and practices. Competitive democracy is the most familiar one. Citizens in territorial constituencies vote for candidates proposed by parties. Their votes are counted equally. The winners are determined by a simple majority (or even a plurality) and, together, they form the government pro tempore. Accountability is ensured by the possibility of voting for opposing parties in a subsequent election that can form an alternative government. Cooperative (aka consociational) democracy is less well known but extensively practiced, especially in small European states. Associations representing mainly functional interests (classes, sectors, professions, but
also sometimes religious and ethno-linguistic groups) express the intensity of their preferences, bargain with each other, reach a consensus and hand it over to a parliament which ratifies and makes it publicly binding by an overwhelming majority. Accountability is ensured by the potential for withdrawal of negotiators and the subsequent failure to reach a consensus and to propose a policy. Deliberative democracy is a persistent ideal form of decision-making in which actors in small groups (e.g. ‘committees’ or ‘forums’) communicate with each other about their preferences and their reasons for holding them, and seek to reach an active agreement (not just a passive consensus) by persuading initially reluctant or opposed actors to change their preferences. Accountability is ensured by an eventual failure to persuade and, hence, a non-decision or action by the group. Participatory democracy emerges usually as a result of dissatisfaction with one or both of the realistic models. Its core actors are collective movements which appeal to the solidaristic identities and commitments of individuals and social groups in order to mobilize them in favor of a substantive or procedural change in existing practices. Accountability is ensured by the presumed or actual capacity of these mobilized groups to disrupt entrenched rules or policies and impose new ones. All existing democracies have some combination of the first two ‘realistic’ models – without the competitive one they would not succeed in being recognized as democratic, and without the cooperative one they would have great difficulty in implementing policies. Deliberative and participatory ideals are latent in the very etymology of the original concept of ‘rule by the people’ and serve as reminders of what is missing and could be made present in its defective existing practice.

Table 43.1 Two realistic and two idealistic models of democracy
Dimensions      | Realistic: COMPETITIVE | Realistic: COOPERATIVE   | Idealistic: DELIBERATIVE | Idealistic: PARTICIPATORY
Procedure       | Voting                 | Negotiating              | Persuading               | Mobilizing
Actors          | Parties                | Associations             | Committees               | Movements
Structure       | Pluralist              | Corporatist              | Disciplinary             | Solidaristic
Resource        | Numbers                | Intensities              | Preferences              | Identities
Decision-making | Majority               | Consensus                | Agreement                | Acclamation
Representation  | Territorial            | Functional               | Ideational               | Motivational
Accountability  | Vote for opponent      | Withdraw from bargaining | Break off community      | Disrupt policy implementation

As Aristotle informed us long ago, all regime-types have their virtuous and their perverse configurations (Aristotle, 1984). For him, demokratia was likely to degenerate into okhlokratia (mob rule). Table 43.2 illustrates two contemporary perversions of this regime-type. While neither of these two perverse models yet exists in pure form, they are both latently present within the two realistic versions – competitive and cooperative – and could become an alternative to either or to both of them in the future.

Table 43.2 Two perverse models of democracy
Dimensions      | PLUTOCRATIC                 | TECHNOCRATIC
Procedure       | Buying                      | Proving
Key actors      | Firms                       | Guardian institutions
Structure       | Oligopolistic               | Monopolistic
Resource        | Money                       | Expertise
Decision-making | Cartelistic                 | Epistemic community
Representation  | Chief executives            | Professional ‘think tankers’
Accountability  | Market collapse; bankruptcy | Rivalry among experts; policy failure
Implications
Given its ‘essentially contested’ and ‘intrinsically normative’ nature, the use of any definition of democracy can have serious implications. The simple fact that the same concept is shared by politicians and scholars, with their different motivations, virtually ensures controversy over usage. And in recent decades, developments at the
supra-national level of politics have made this more consequential. Being a certified member of the ‘Club of Democratic Regimes’ has become an imperative for entry into prestigious (and profitable) regional and global institutions – first and foremost the EU. Not being a member, or being threatened with expulsion from the club, may be invoked by one’s democratic neighbors as an excuse for intervention in the name of promotion and protection of democracy. Immanuel Kant (1796) long ago argued that ‘perpetual peace’ was only possible in the relations between what he called ‘commercial republics’. In contemporary parlance, one of the most frequently cited maxims of international relations is that democracies do not go to war with each other. Moreover, acquiring the status of a well-established democracy (regardless of adjectives) can be of considerable economic advantage, since it is usually accompanied by the assumption that such regimes (and only such regimes) can be relied upon to observe ‘the rule of law’ – in particular, the rule of law regarding private property. This presumably leads to a competitive advantage as a recipient of foreign direct investment, an intermediary in managing international financial flows, a partner in free trade agreements, a site for cross-national production arrangements and/or a venue for foreign tourists. All of this presumes that – all things being
equal (but they are not always) – certified democracies will be in a better position to benefit from an increasingly globalized capitalist economy. Losing one’s certificate or never having obtained one can be a major obstacle to development and the capacity to fulfill the expectations of one’s subjects, which, in turn, has implications for the stability and durability of a regime.
Theories
There are almost as many theories about why democracies exist as there are definitions of what democracy is. Beginning with Solon and Cleisthenes in ancient Greece, the emphasis was upon the ‘lawgiver’, some enlightened soul who, for various reasons, was motivated to and capable of imposing new rules that expanded the rights of citizens. Ever since, the history of almost all democracies has some heroic founding figure(s), although more recent theories tend to stress why he (it is almost always a ‘he’) was so motivated and capable of doing so. Since they emerged primarily in Western Europe in the late 18th and 19th centuries, most theories focus on the unique conditions present there (and, secondarily, in their North American and Oceanic outposts). A ‘proper’ feudal past with its multiple ‘orders’ and representative institutions is one feature that distinguishes Western from Eastern Europe. And, of course, even better if the country had no feudalism at all, as was the case in Switzerland and some central Italian city-states. Another, stressed by Barrington Moore Jr (1972), involves the evolution of capitalism and, more specifically, the early spread of wage labor into agriculture. Once landowners no longer needed coercion or slavery to ensure their supply of labor, they were capable of envisaging an accountable form of parliamentary government (provided, of course, that they retained a privileged position in it).
There is also a suspicious correlation with religion. Christianity may have initially played a role, with its emphasis on the individual person as the locus of ethical responsibility, but this was very quickly displaced by its internal split between Roman Catholicism and various forms of Protestantism. Virtually all the early development of liberal democracies occurred in countries dominated by the latter – so much so that the former was long considered an active impediment to it, especially in Southern Europe and its Latin American, African and Asian colonies. Associated with the religious divide was the importance of popular literacy – much higher in the Protestant countries. And there have been innumerable ‘cultural explanations’. The simple positive correlation between Anglo-American and Scandinavian (plus Swiss) societies and long-term, stable democracy suggested to some that these countries had national cultures that were especially propitious for this sort of regime, and the inverse correlation between Southern European, Latin American, Asian and, especially, Arab Muslim societies and democracy seemed to indicate a basic cultural animosity to citizen-based, competitive and accountable politics. The circularity in such causal reasoning seems obvious. It might just as well prove that no culture is intrinsically democratic. All basic human institutions – the family, the clan, the shop, the factory, the church – are hierarchical and authoritarian. It is the successful historical practice of democracy that eventually produces a correspondingly favorable ‘civic’ political culture. It might be easier to start the process in some cultures, but none preclude it. 
There are many other such correlations: smaller countries are more likely to be democratic than larger ones, as are those with linguistic/cultural/religious homogeneity (unless they are Arab Muslim), those located in temperate climates (unless they are semi-tropical Costa Rica or Jamaica) and those colonized by the British (except for the former French colonies of Senegal
and, more recently, Tunisia). But by far the most frequently observed and explored correlation in the theoretical literature has been that between democracy and ‘modernization’. In a seminal article in 1959, the American political sociologist Seymour Martin Lipset concluded, after an extensive quantitative analysis of all existing polities at that time, that ‘the more well-to-do a nation, the greater the chances that it will sustain democracy’ (75). This unleashed a veritable avalanche of comparative research into what came to be called ‘the social pre-requisites of democracy’. Most of it has confirmed his observation, although the effect is not constant over time. No country can simply ‘buy’ democracy year by year by getting richer, nor do there seem to be insuperable thresholds to be crossed before becoming so. Lipset did not claim that only wealthy capitalist countries tried to become democratic, just that they had been more successful in sustaining it. There were and still are obvious deviant cases. Switzerland and Norway were among the poorest countries in Europe when they became democratic, and yet have sustained the effort to the present day (where they are among the richest). Poverty-stricken India advertises itself as ‘the World’s Largest Democracy’. Saudi Arabia has one of the highest per capita incomes in the world and has never made the slightest attempt to democratize. In the contemporary world, countries as ‘un-modernized’ as Albania and Mongolia have managed to hold regular competitive elections, respect citizen rights and even peacefully alternate parties in power. The crucial question behind these correlations and exceptions involves the ‘mechanism’ or ‘mechanisms’ that are presumed to connect successful capitalism and the emergence of democracies. To begin with, the wealth has to be earned, rather than extracted as rents from the export of some natural resource. So-called ‘petro-states’ have been notoriously difficult to democratize.
It also is important that behind the wealth lie a
number of co-variant achievements, mainly the development of an urban, literate and better educated population, a substantial proportion of which is middle class in income and status. It is certainly advantageous if the increase in wealth is accompanied by a more even distribution of this economic surplus, but this hardly seems to be a prerequisite. Two of the countries with the most unequal distributions of wealth in the contemporary world, Brazil and South Africa, have managed (admittedly with difficulty) to sustain democratic institutions for more than 25 years. This may be important, since very few democracies that have survived for such a long period subsequently revert to autocratic rule (the exceptions being Chile and Uruguay in the early 1970s). Often implicitly (but virtually unanimously), scholars presume that democracies have only emerged and will continue to emerge within political units that already have the status of sovereign national states. Without prior ‘stateness’, democracy could not exist and, to make matters worse, there is no ex post democratic means for determining what these units should be. The objective of ‘self-determination of nations’ cannot be accomplished democratically – only a complex and lengthy process of wars, marriages and accidents can do this. The new regime needs the fiscal resources and physical security that only an organization with an effective monopoly of violence over a given territory can provide. Even more problematic, however, is the assumption of a prior demos, that is, a population that already has an overriding single identity and sense of shared fate. A cursory glance at the history of many successful cases would reveal instances in which ‘nationhood’ was the product, not the producer, of eventual democracy. 
‘Sovereignty’ is an important component because democracies are presumed not only to be responsive to the preferences of their citizens, but also to be capable of acting on them without interference from other
‘imperial’ authorities. This has always been an elusive property of national states, and nowhere more so than in Western Europe, where a relatively compressed space was populated by a large number of them. Their rivalries and conflicts had a major influence on democratization. First and foremost, war and even the threat of war was a powerful incentive for autocratic or oligarchic rulers to be concerned with the loyalty of their subjects and to accept reforms that increased their rights as citizens. Second, the close proximity and competition among these units virtually guaranteed that political innovations in one state would affect the others. In other words, first in Europe, and later in Latin America and Africa, democracy has been proven to be contagious. This helps to explain why, if one plots the instances of attempted democratization over time, one will find ‘clusters’ or ‘waves’ occurring sequentially – usually in adjacent states. Where they were temporally but not physically near each other, the triggering factors were World Wars I and II. The theoretical quest for the ‘pre-requisites of democracy’ seems, in retrospect, to have been misguided. While there are certainly historical conditions that have made such an outcome more likely – wealth, size, location, climate, religion, feudalism, capitalism, colonialism, just to name the most obvious – none of these is imperative. Democracy is possible anywhere – but not necessarily either inevitable or immediate. What is more, as we shall see below, its very nature is constantly changing. What might have been an effective facilitator or impediment in the past may no longer be in the present and future.
Transformations
One of the reasons why theorizing about it has been so difficult is that democracy has always been a moving target. Not only has it not been possible to realize the ideal of ‘rule
by the people’, but the actual efforts to do so have repeatedly changed. Robert A. Dahl (1971) did not hesitate to call these changes ‘revolutions’, even though they were usually introduced gradually and non-violently by politicians who did not think of themselves as revolutionaries. Most often they were just responding to popular pressures, externally imposed circumstances or just everyday dilemmas of choice with incremental reforms and experimental modifications. These accumulated over time until citizens and rulers eventually found themselves in a radically different polity – but they still identified it with the same name. Dahl identified three early democratic revolutions. The first was in size. The American constitution, the first to make extensive use of territorial representation, federalist autonomies and presidential authority, broke with the previous limitation to city-states. The second was in scale. Early experiments with democracy were based on a very limited conception of citizenship. Sometimes gradually, other times tumultuously, these restrictions were removed until all adult ‘nationals’ eventually acquired this status. The third was in scope. Originally, democracies had a very restricted range of political tasks – mostly, external defense and internal order. Over time, they acquired responsibility for a vast range of regulatory, distributive and re-distributive matters. Subsequently, democracies have suffered through and benefited from several other revolutions. Two of them have exhausted their potential and become well-entrenched features of ‘modern, representative, liberal, political democracy’ – in Europe and North America, at least. Three others are more contemporary and still very active in their capacity to generate new challenges and opportunities. The first involved the displacement of individuals by organizations as the effective citizens of democracy. 
Beginning in the latter third of the 19th century, new forms of collective action emerged to represent the interests and passions of individual
citizens. James Madison (1787) and Alexis de Tocqueville (1835) had earlier observed the importance of a multiplicity of ‘factions’ or ‘associations’ within the American polity, but neither could have possibly imagined the extent to which these would become large, permanently organized and professionally run entities, continuously monitoring and intervening in the process of public decision-making. Moreover, the interests and passions they represent cannot be reduced to a simple aggregation of the individuals who join or support them. They have introduced, on a large scale, their own distinctive organizational objectives into the practice of policy-making and become democracy’s most effective citizens.
The second has to do with the professionalization of the role of politician. Earlier, democratic representatives and rulers were persons who might have been somewhat more affected by ‘civic’ motives, but who were otherwise not different from the restricted body of ordinary citizens. They would (reluctantly) agree to serve in public office for a period of time and then return to their normal private lives and occupations. At some time during the early 20th century, more and more democratic politicians began to live, not for politics, but from politics. They not only entered the role with the expectation of making it their life’s work, but they also surrounded themselves with other professionals – campaign consultants, fund-raisers, public relations specialists, media experts and – to use the latest term – ‘spin doctors’. Whether as cause or effect, this change in personnel has been accompanied by an astronomical increase in the cost of getting elected and of remaining in the public eye if one is so unfortunate as to become un-elected. Creating one’s own foundation seems to be the predominant way of coping with this dilemma.
The third, fourth and fifth transformations are contemporary and simultaneous – and their implications are uncertain. 
Over the past 20 or more years – indeed, much longer in the case of the United States – democracies have
ceded the authority to make binding public decisions to ‘guardian institutions’. The expression is taken from Plato and refers to specialized institutions – usually regulatory bodies – that have been assigned responsibility for making policy in areas which politicians have decided are too controversial or too complex to be left to the vicissitudes of electoral competition or inter-party legislative struggle. The locus classicus in the contemporary period is the central bank, but earlier examples would be the general staffs of the military, anti-trust agencies or civil service commissions. In each case, it is feared that the intrusion of ‘politics’ would prevent the institution from producing some generally desired public good. Only experts acting on the basis of (presumably) neutral intent and scientific knowledge can be entrusted with such a responsibility. A more cynical view would stress that these are often policy areas in which the party in power has reason to fear that if they have to hand over office in the future to their opponents, the latter will use these institutions to punish the former or to reward themselves. The net effect of guardianship has been to deprive contemporary democracies of discretionary action over issues that have a major impact upon their citizens. ‘Democracies without choice’ is the expression that has emerged to describe and to decry this situation. Even more potentially alienating is the fact that some of these guardians are not national, but operate at the supra-national – regional or global – level. The European Union currently has 32 such regulatory agencies. Which brings us to the fourth ‘revolutionary’ transformation: multi-level governance. It is particularly well developed in Europe, although similar developments are emerging elsewhere. 
During the post-World War II period, initially in large measure due to a shared desire to avoid any possible repetition of that experience, European national polities began experimenting with the level of aggregation at which collectively binding decisions would be made. The most visible
manifestation of this has been the emergence and subsequent expansion of a set of regional institutions, culminating in the EU. But paralleling this macro-level experiment has been a micro-level one, namely, the devolution of various political responsibilities to subnational units: provinces, regioni, Länder or estados autonómicos. As a result, virtually all European citizens find themselves surrounded by a very complex set of authorities, each with vaguely defined or concurrently exercised policy compétences. The oft-repeated assurance that only national states can be democratic is no longer true in Europe, even though in practice it is often difficult to separate the various levels and determine which rulers should be held accountable for making specific policies. European politicians have become quite adept at passing the buck, and especially at blaming the EU for unpopular decisions. New political parties and movements have even emerged blaming the EU for policies over which it has had little or no control – for example, the massive influx of migrants from non-EU countries. Multi-level governance could, of course, be converted into something much more familiar, namely, a federal state, but resistance to this is likely to remain quite strong for the foreseeable future – which means that the ambiguity over which of these multi-level institutions are appropriate for each of these multiple levels will persist. And, when it comes to the design question, there seems to be a general awareness that the rules and practices of democracy at different levels cannot and should not be identical. Especially when it comes to ensuring the accountability of a ‘layered’ polity of the size, scale, scope and diversity of the EU, democracy will have to be (yet again) reinvented (Schmitter, 2000). The last of these simultaneous transformations in the nature of democracy is both the most challenging and the most ambiguous: information and communications technology. 
Innumerable observers, theorists and pundits have declared that the new electronic means of direct communication are one of the – if
not the – revolutionary instruments that will substantially and irrevocably alter the practice of democracies. Most, at least initially, regarded these developments very favorably. As early as 1983, Ithiel de Sola Pool declared them ‘technologies of freedom’. More recently, this claim has been reiterated and expanded by Manuel Castells (2010, 2013), among many others. By making communication between individuals virtually effortless and costless, by eliminating the factor of distance and number and by ensuring equality of access and anonymity of users, the new social media seemed destined to realize the latent possibilities of both participatory and deliberative democracy. Citizens could participate virtually in the organization of eventual collective actions and the elaboration of policy proposals. They could use the new media to search for and discuss controversial issues with other interested persons – even without regard for national borders. As a result, they would become better informed about the preferences of others and more inclined to participate in the other available channels of political participation. Given the ability to make payments ‘online’, candidates for elected office or for leadership roles in associations or movements could reach much larger and more diverse audiences to request financial support – freeing them from dependence on big donors. Voting itself could become electronic – and perhaps expanded to a much wider set of issues and positions beyond and between the usual two, four or six year parliamentary or presidential cycles. ‘E-democracy’ would supersede the limitations of competitive and cooperative democracy and usher in new practices that would be both more participatory and more deliberative. Subsequent experience with the effects of this ‘revolution’ has dampened, if not dashed, much of the initial optimism. Individuals use the internet more to reinforce their existing opinions than to deliberate with those who have contrary ones. 
Anonymity seems to promote the dissemination of ‘fake news’ and
exaggerated claims. The cross-border advantage has been turned into a license to interfere in other people’s elections. Electronic voting does seem to have encouraged higher voter turnout but has had little or no discernible positive impact on party affiliation, electoral outcomes or citizen interest in politics. Security issues still plague the internet and its political exploitation. In short, e-democracy, with its alleged participatory and deliberative advantages, remains an idealistic aspiration.
Democracies – new and old – are presently facing the challenge of coping with the accumulated impact of these three multiple, overlapping and interacting transformations – even before they have absorbed completely the implications of having more organized intermediaries and professional politicians. The threat is especially critical since all of them focus attention on liberal democracy’s most vulnerable institution: representation. If citizens do not trust and follow those who win elections in specified territorial constituencies and/or if they do not trust and follow those who are selected by associations from designated functional categories, then the entire edifice of legitimate public authority becomes vulnerable.
Trajectories
When Robert Dahl (1971) conducted the first systematic empirical research on the then existing 114 national regimes in 1969, he found only 28 full polyarchies. Four others (Switzerland, Chile, the United States and Ecuador) came close, but still retained de jure or de facto restrictions on voter eligibility. Most of them were located either in Western Europe or in overseas territories colonized by the British or the Americans. The only exceptions were Japan, Israel, Lebanon, Costa Rica and Uruguay. To make matters worse, Chile, Uruguay, the Philippines and Lebanon soon lost this exalted status. The implication seemed clear to scholars: democracy with the proper adjectives was a rara avis, suitable primarily for those polities that already had an Anglo-American ‘civic’ culture or that had been fortunate enough to have been colonized by one or to have been defeated in war and occupied by one.
On April 25, 1974, a coup by junior military officers in Portugal unexpectedly brought down the longest surviving autocracy in Europe. This was closely followed by other regime transitions in Southern Europe and, with only a slight delay, in South America. In the ensuing 25 years, more than 60 countries distributed across all continents got rid of a variety of forms of autocracy and began to experiment with new forms of democracy.
And then a second unexpected event occurred: the autocratic regime of the Soviet Union collapsed, and with this came the democratization of all six of its former allies in Central and Eastern Europe, along with the three Baltic republics and Georgia. Yugoslavia’s transformation was more tumultuous and violent, but despite the ensuing civil wars, all of its component republics have eventually settled into some variant of democratic politics. So has even Albania, the most isolated and repressive of all the Communist autocracies.
Democracies can now be found almost everywhere across the globe. One recent tally lists 68 ‘politically transformed’, successful new democracies since 1974 (Bertelsmann, 2018). Whatever its historical origins, this form of political domination no longer has a unique geographical, cultural, ethnic or religious ‘site’. The only ‘requisite’ that all of these regimes have in common is a capitalist economy, but that is much more varied in terms of level of development and source of wealth than in the past. 
Even in the Middle East and North Africa, the world region which has been most resistant to democratization, Lebanon, Kuwait, Bahrain and, more recently, Tunisia have been supporting free and competitive elections; the political
freedoms of speech, assembly, press, association and petition; and governments that are potentially revocable according to the will of their citizens. More notable even than the number of attempts at democratization since 1974 is the proportion of them that have managed to remain democratic. In many cases, this constituted their very first attempt, and very few countries had previously succeeded the first time they tried to democratize. Countries in Latin America seem to hold the world record for failed attempts (Ecuador and Bolivia are the champions), but there have been many European, African and Asian failures as well. Today, all countries in both South and Central America (except for Cuba) have at least popularly elected governments, if not all of the liberal freedoms and rights associated with them. Africa can boast of several even greater accomplishments: Ghana, Mali, Benin, Botswana, Senegal, Namibia, Mauritius and South Africa. South Korea and Taiwan have become models of stable Asian democracy. The usual mechanism for reversion to autocracy has been the military coup, but the underlying motive is usually civilian, namely, the threat that governments elected by the non-property-owning majority presented to the property-owning minority previously protected by some civilian or military authoritarian regime. In other cases, it was cultural, ethnic or religious differences between minority rulers and their majority subjects that impeded successful democratization. Among the post-1974 attempts at regime change, very few have been overthrown by their respective militaries. Pakistan, Thailand and Egypt stand out as exceptions to this unprecedented rule. The spectre that threatens many of the other neo-democracies is decay rather than demise. The institutional overlay of constitutions, elections, associations, civil society organizations, sometimes even press freedom remains, but is distorted to ensure the tenure of incumbents and their allies. 
Russia is the prime example, but is only the most salient among a sizable group of
‘hybrid regimes’ (Gagné and Mahé, Chapter 47, this Handbook). Nicaragua, Venezuela and Honduras in Latin America; Liberia, the Congo, Togo, Cameroon, the Ivory Coast, Rwanda, Ethiopia and Angola in Africa; Morocco, Algeria, Jordan and Iran in the Middle East and North Africa; Belarus, Ukraine, Armenia and all of the Central Asian former republics of the Soviet Union (except for Kyrgyzstan); Indonesia, Malaysia, Sri Lanka, Bangladesh and Myanmar in Asia – all claim to have made the transition from autocracy, but all have failed to consolidate a regime that can plausibly claim to be institutionally accountable to its citizens. Part of the answer to this diffusion of ‘regime hybridization’ lies in what the recent literature on democratization has called ‘modes of transition’ (Karl and Schmitter, 1991). Historically, there were two models for getting from autocracy to democracy: the reformist path conveniently exemplified by Great Britain, and the revolutionary path less conveniently exemplified by France. In both cases, it was assumed that the impetus came from below – from a mobilization of those in the population that had been excluded, for various reasons, from full citizenship. In the former, ruling elites recognized the threat before it became widespread and violent and then responded by revising the rules of the game, using the pre-existing rules to incorporate the excluded. In the latter, they failed to do so and were subsequently deposed by a mass uprising, typically beginning in the capital city. When the post-1974 ‘wave of democratization’ began, it became immediately apparent that reform and revolution were not going to be the exclusive modes of transition. Leaving aside the very peculiar case of Portugal (whose revolution was triggered by a military coup), many of the other cases followed different patterns. There was almost always some degree of popular mobilization, but not enough to be threatening to the incumbents in itself. 
What made them vulnerable were divisions within their ranks, and this led to two relatively new modes of transition: the Pacted and the Imposed. In the first, a faction
of the ruling elites agreed to negotiate with their more moderate opponents and came to an agreement on the rules that would lead to an eventual democracy. In the second, some hegemonic group within the existing regime anticipated the emerging threat from below and initiated a change in regime from above, controlling the timing and content of the transition. In both cases, the key element was some degree of reassurance of the enduring status and interests of the autocratic elite – hence, the greater likelihood that the ensuing democracy would have ‘hybrid’ characteristics. The presumption – not unrealistic – was that the subsequent functioning of electoral competition, interest group negotiation and social movement pressure would gradually and consensually eliminate these ‘authoritarian enclaves’.
Prospects
When the post-1974 wave of democratizations struck – and, even more, when it was reinforced by the second, post-1989 wave – both practitioners of politics and students of political science tended to react euphorically. Democracy with all those adjectives had become ‘the only game in town’ (Linz and Stepan, 1996: 5) and ‘the final form of human government’ (Fukuyama, 1992: xi). Now that its main rival, the Soviet-style ‘People’s Democracy’, had been irremediably defeated, all the world’s regimes would converge toward the new norm – dependent only upon their economic development.
This has not happened. Those pundits were correct that a period of history had ended during the 1970s and 1980s. What they neglected to consider was that it would be followed by another period, and one which has proven to be much less tranquil and consensual than they predicted. The absence of a clearly inferior enemy deprived ‘real-existing’ democracies of one of their most important sources of legitimacy. Henceforth, they would be judged primarily according to their conformity to the enduring principles of democracy – equality, participation and accountability – not just by being marginally better than the alternative. To make matters worse, the increasingly globalized capitalist economy that had been performing so well in the aftermath of World War II started to falter and a full-scale financial crisis had come about by 2008. The threat of communism was gone, and the surplus of capitalism had diminished.
The great political paradox of this new period of history was that precisely at the moment when so many aspiring neo-democracies emerged with the declared intention of imitating pre-existing ones, these archeo-democracies were entering a compound crisis caused by the ‘revolutions’ we described above. Their citizens have been questioning the very same ‘normal’ institutions and practices that new democratizers were trying so hard to imitate, and finding them deficient – not to say, outright defective. The list of morbidity symptoms is well known: their citizens have become more likely to abstain from voting; less likely to join or even identify with political parties, trade unions or professional associations; more likely not to trust their elected officials or politicians in general; and much less likely to be satisfied with the way in which they are being governed and the benefits they receive from public agencies. Needless to say, it has not taken long for the citizens of most neo-democracies to become equally, or more, disillusioned with what they have so recently accomplished. The Spaniards have called this desencanto and there is hardly a democracy – new or old – that is not currently suffering from it.
Which is not to say that democracy – with or without all those adjectives – is doomed to extinction. 
Perhaps the strongest reason for its existence has been its repeatedly and uniquely demonstrated capacity to survive, and to do so by using pre-established institutions to change its rules and practices peacefully and consensually. So far, there is neither an obvious agent to promote the necessary reforms, nor a credible alternative ideology to
justify them, nor an obvious method for doing so. The working class has been replaced by a fragmented and anomic public; neo-liberalism has entrenched itself behind the plausible slogan of TINA – ‘There Is No Alternative’; and the mechanism of revolution, or even its specter, is no longer credible. Who will promote the necessary changes? Why will they promote them? How will they accomplish them? I am confident that it can and eventually will be done, but with what detours and delays I cannot say.
References
Aristotle, Politics: A Treatise on Government. In Jonathan Barnes, ed. (1984). The Complete Works of Aristotle: The Revised Oxford Translation. Princeton: Princeton University Press. 2 vols.
Bertelsmann Stiftung, ed. (2018). Transformation Index BTI 2018. Gütersloh: Verlag Bertelsmann.
Castells, Manuel (2010). Rise of the Network Society. New York: Wiley-Blackwell.
Castells, Manuel (2013). Communication Power. Oxford: Oxford University Press; 2nd edition.
Dahl, Robert A. (1971). Polyarchy: Participation and Opposition. New Haven: Yale University Press.
Fukuyama, Francis (1992). The End of History and the Last Man. New York: The Free Press.
Kant, Immanuel (1796). Project for a Perpetual Peace: A Philosophical Essay. Whitefish, MT: Kessinger Publishing.
Karl, Terry L. and Schmitter, Philippe C. (1991). ‘Modes of Transition in Latin America: Southern and Eastern Europe’. International Social Science Journal, 43(2), 269–84.
Katz, Richard S. and Mair, Peter (2009). ‘The Cartel Party Thesis: A Restatement’. Perspectives on Politics, 7(4), 753–66.
Linz, Juan J. and Stepan, A. (1996). Problems of Democratic Transition and Consolidation: Southern Europe, South America and Post-Communist Europe. Baltimore: Johns Hopkins University Press.
Lipset, Seymour M. (1959). ‘Some Social Requisites of Democracy: Economic Development and Political Legitimacy’. American Political Science Review, 53(1), 69–105.
Madison, James (1787). The Federalist Papers No 10. New York Daily Advertiser, November 22.
Moore, Barrington Jr (1972). The Social Origins of Dictatorship and Democracy. Boston: Beacon Press.
Schmitter, Philippe C. (2000). How to Democratize the European Union … and Why Bother? Boulder, CO: Rowman & Littlefield.
Schmitter, Philippe C. and Karl, Terry L. (1991). ‘What Democracy Is … And Is Not’. Journal of Democracy, Summer, 3–16.
Schumpeter, Joseph A. (1942). Capitalism, Socialism and Democracy. New York: Harper & Brothers.
Sola Pool de, Ithiel (1983). Technologies of Freedom. Cambridge, MA: Harvard University Press.
Tocqueville de, Alexis (1835 and 2010). De la démocratie en Amérique. Paris: Flammarion.
44
Electoral Systems
Bernard Grofman
Varieties of Electoral Systems
Perhaps the most important observation about voting rules is just how many different ways there are to implement democratic elections. Even when we confine ourselves to the election of a single candidate, we can readily identify dozens of ways to do so: from the simplest, where voters cast an X for a preferred candidate and the candidate with the most votes wins (the plurality rule, known in many English-speaking countries as first past the post); to a rule that asks voters to cast Xs for all the candidates they regard as acceptable, with the one receiving the most ‘approval votes’ declared the winner (Brams and Fishburn, 1983 [2007]); to a plethora of procedures requiring runoffs if no candidate receives a majority on the first ballot (Grofman, 2008), with the two-round majority runoff used in France being the best known of these; to rules requiring voters to rank order candidates, such as the alternative vote (Fraenkel and Grofman, 2006), the Borda rule (Saari, 1994), the Coombs rule (Grofman and Feld, 2004) and the median preference rule (Balinski and Laraki, 2011). Rules such as approval voting, the Borda count, the Coombs rule and the procedure developed by Balinski and Laraki can all be thought of as ways to assure the selection of a candidate who is generally acceptable.
Some procedures for selecting candidates that require a full ranking of alternatives by voters are intended to satisfy the majoritarian criterion (more often labeled the Condorcet criterion after its 1785 proposer, the Marquis de Condorcet), which requires that, if there exists an alternative that is preferred by a majority to each and every other alternative, then that is the alternative which should be chosen. Most common procedures for selecting a single winner fail to satisfy the Condorcet criterion (Black, 1958). Plurality not only fails the Condorcet winner criterion but even violates the Condorcet loser condition, which requires that no alternative which loses to each and every other alternative in pairwise competition should be chosen.1
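The contrast between plurality, the Borda rule and the Condorcet criteria can be made concrete with a small sketch. The nine-voter preference profile below is hypothetical, chosen only for illustration, and the function names are not drawn from any library. The Borda scoring used here (m - 1 points for a first place down to 0 for last) is one common convention among several.

```python
# Hypothetical profile: (number of voters, ranking from most to least preferred).
PROFILE = [
    (4, ["A", "B", "C"]),
    (3, ["B", "C", "A"]),
    (2, ["C", "B", "A"]),
]
CANDIDATES = ["A", "B", "C"]


def plurality_scores(profile):
    """Count first preferences only."""
    scores = {c: 0 for c in CANDIDATES}
    for n, ranking in profile:
        scores[ranking[0]] += n
    return scores


def borda_scores(profile):
    """With m candidates, award m - 1 points for first place down to 0 for last."""
    m = len(CANDIDATES)
    scores = {c: 0 for c in CANDIDATES}
    for n, ranking in profile:
        for position, cand in enumerate(ranking):
            scores[cand] += n * (m - 1 - position)
    return scores


def prefers(profile, x, y):
    """Number of voters who rank x above y."""
    return sum(n for n, ranking in profile if ranking.index(x) < ranking.index(y))


def condorcet_winner(profile):
    """Candidate who beats every other candidate head-to-head, or None."""
    for x in CANDIDATES:
        if all(prefers(profile, x, y) > prefers(profile, y, x)
               for y in CANDIDATES if y != x):
            return x
    return None


def condorcet_loser(profile):
    """Candidate who loses to every other candidate head-to-head, or None."""
    for x in CANDIDATES:
        if all(prefers(profile, x, y) < prefers(profile, y, x)
               for y in CANDIDATES if y != x):
            return x
    return None


def winner(scores):
    """Candidate with the highest score (first listed wins an exact tie)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print("plurality:", plurality_scores(PROFILE), "->", winner(plurality_scores(PROFILE)))
    print("Borda:    ", borda_scores(PROFILE), "->", winner(borda_scores(PROFILE)))
    print("Condorcet winner:", condorcet_winner(PROFILE))
    print("Condorcet loser: ", condorcet_loser(PROFILE))
```

On this profile A wins under plurality with 4 of 9 first preferences, even though A loses both of its pairwise contests by 4 to 5 and is therefore the Condorcet loser. B, who beats both rivals head-to-head, is the Condorcet winner and also the Borda winner (12 points against 8 for A and 7 for C).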
The range of options for electoral rule choices grows far larger when we move beyond elections for a single winner to consider ways to elect members of a legislature or other representative body. In addition to plurality rules applied to single-seat elections, the most important electoral rules worldwide are list forms of proportional representation (the most common rule for national parliamentary elections), and mixed electoral systems that combine plurality elections in individual districts with proportional representation rules for some of the seats (see Reynolds et al., 2005).
While, at the national level, plurality rules are most often found in use in single-seat elections, plurality rules can also be applied directly to multi-seat elections: the simplest form is called plurality bloc voting, in which M seats are to be filled, each voter has M votes to cast as Xs, and the M candidates with the most votes are declared the winners. With this rule, a majority bloc that votes in a cohesive fashion can fill all M of the openings with its preferred candidates. Other variants of plurality are similar in effects but place some restrictions on candidacies, such as involving voting for numbered places so that, although the majority can still control all M elections, they must do so by casting separate Xs in each of M single-seat contests. Usually the numbered places are given a specific geographic location, even though the voting electorate for each place is the set of voters as a whole, not just those who live within the given geography. A further variant of plurality voting with numbered places involves a geographic residency requirement for candidacy in each of the numbered places. Such rules, sometimes in conjunction with plurality elections in single seats, are common in US local elections (Davidson and Grofman, 1994). 
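The claim that a cohesive majority can sweep every seat under plurality bloc voting is easy to check with a minimal sketch. The electorate (55 voters backing one slate, 45 backing another), the candidate labels and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter


def bloc_vote(ballots, seats):
    """Plurality bloc voting: each ballot marks up to `seats` candidates and
    the `seats` candidates with the most marks win.

    `ballots` is a list of (number of voters, set of candidates marked).
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    tally = Counter()
    for count, choices in ballots:
        if len(choices) > seats:
            raise ValueError("a voter may cast at most one X per seat")
        for cand in choices:
            tally[cand] += count
    ranked = sorted(tally, key=lambda c: (-tally[c], c))
    return ranked[:seats], dict(tally)


# Hypothetical three-seat district with two cohesive blocs.
BALLOTS = [
    (55, {"A1", "A2", "A3"}),  # majority bloc votes its full slate
    (45, {"B1", "B2", "B3"}),  # minority bloc does the same
]

if __name__ == "__main__":
    winners, tally = bloc_vote(BALLOTS, seats=3)
    print(winners, tally)
```

With these numbers every candidate on the majority slate receives 55 votes and every minority candidate 45, so the 45 per cent minority wins none of the three seats.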
Multi-seat plurality rules of this kind have been used in the past for legislative elections in the United States in a substantial proportion of states (Niemi et al., 1985), although today only vestiges of this usage persist (e.g. the two-seat districts in the lower chamber of the New Jersey legislature). Moreover, multi-seat plurality can be combined with party lists to create the little-known plurality party bloc voting rule used for national legislative elections in Singapore (Tan and Grofman, 2019).
List proportional representation (PR) rules also have many variants. One dimension of variation involves the degree to which votes can affect the rankings of candidates on the party lists, with the key distinction being that between open lists and closed lists, with the latter far and away the most common. Another way to differentiate among list PR rules has to do with the specific aggregation rule used to translate votes into seats. List PR rules for translating votes to seats are mathematically equivalent to rules for allocating seats to geographic units within a legislature based on the population of those units. One important form of list PR is the greatest remainder method, which corresponds to the Alexander Hamilton method for legislative apportionment (Balinski and Young, 1982). But perhaps the most important set of rules for PR allocations are the divisor methods, in which we find a divisor such that the resulting quotients, once rounded under a particular rule, fill exactly the available seats. There are five basic forms of rounding: round up, round down, round in the usual way, round according to the geometric mean and round according to the harmonic mean (Balinski and Young, 1982). The rounding method involving the geometric mean is now used for apportioning the US House of Representatives. The D’Hondt rule, which is the most common form of list PR, corresponds to ‘always round down’. It is the equivalent of the rule proposed by Thomas Jefferson for apportioning the US House of Representatives. We can show that, under D’Hondt, any group with at least a share of the electorate equal to 1/(M + 1), where M is the number of seats in the district, can guarantee to elect a candidate of its choice if its affiliates vote cohesively as a bloc. 
The Sainte-Laguë rule, used in modified form in the Scandinavian countries, makes use of what we think of as ordinary rounding, that is, rounding based on whether the remainder is above or below .5. It is the equivalent of the Daniel Webster rule once used for apportioning the US House of Representatives (Balinski and Young, 1982). Similarly, we can show that, under pure Sainte-Laguë, it is still the case that any group with at least a share of the electorate equal to 1/(M + 1) can guarantee to elect a candidate of its choice if its affiliates vote cohesively as a bloc. Where D’Hondt and Sainte-Laguë differ is in how they generate values for obtaining seats beyond the first seat (Lijphart and Gibberd, 1977).
There are a number of different ways to mix plurality and PR in multi-seat elections for a legislature. One important consideration is the share of seats allocated to each of the two components, but even more important in determining overall effects is whether there is a linkage between the two components. The simplest distinction is between mixed member systems in which the single-seat plurality component and the PR component operate completely separately; and ones in which the level of electoral success in the single-seat component is taken into account in calculating the proportional share a party is to be given. Outcomes in the first type of mixed system can be thought of as semiproportional in that they combine some seats allocated according to proportional rules with some seats allocated by plurality; the latter type of mixed system in effect operates as a PR system even though it has a single-seat plurality component.2
Another important distinction among mixed-member systems has to do with ballot structure. 
In a single-ballot system the vote for the single-member candidate is credited to the party ticket (if any) on which that candidate is running; in two-ballot systems, voters may express separately their preference for a candidate in the election in the single-member district in which they are entitled to vote and their preference for a particular party list (Massicotte and Blais, 1999).
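The list PR allocation rules and the difference between unlinked and linked mixed systems discussed above can be sketched in a few lines. The code uses the standard sequential ‘highest averages’ formulation of the divisor rules, which yields the same allocation as searching for a common divisor and rounding: divisors 1, 2, 3, … for D’Hondt and 1, 3, 5, … for pure Sainte-Laguë. All vote totals, district results and function names are hypothetical. The linked calculation applies D’Hondt to the whole chamber and makes no adjustment for overhang seats, both of which are simplifying assumptions rather than a description of any particular country's law.

```python
def highest_averages(votes, seats, divisor):
    """Award seats one at a time to the party with the largest quotient
    votes / divisor(seats already won). Exact ties go to the party listed
    first, an arbitrary choice made for this sketch."""
    won = {party: 0 for party in votes}
    for _ in range(seats):
        leader = max(votes, key=lambda p: votes[p] / divisor(won[p]))
        won[leader] += 1
    return won


def dhondt(s):
    return s + 1        # divisors 1, 2, 3, ...


def sainte_lague(s):
    return 2 * s + 1    # divisors 1, 3, 5, ...


def largest_remainder(votes, seats):
    """Greatest remainder (Hamilton) method with the simple quota total/seats,
    computed in integers to avoid floating-point rounding."""
    total = sum(votes.values())
    won = {p: v * seats // total for p, v in votes.items()}
    by_remainder = sorted(votes, key=lambda p: (votes[p] * seats) % total, reverse=True)
    for p in by_remainder[: seats - sum(won.values())]:
        won[p] += 1
    return won


def unlinked_totals(list_votes, district_wins, list_seats):
    """Parallel system: list seats are allocated without regard to district results."""
    list_part = highest_averages(list_votes, list_seats, dhondt)
    return {p: district_wins[p] + list_part[p] for p in list_votes}


def linked_totals(list_votes, district_wins, list_seats):
    """Linked system: the list vote fixes each party's share of the whole chamber,
    and list seats top up the district seats. Overhang is not corrected here."""
    chamber = sum(district_wins.values()) + list_seats
    entitlement = highest_averages(list_votes, chamber, dhondt)
    return {p: max(entitlement[p], district_wins[p]) for p in list_votes}


# Hypothetical data: 8 district seats, 8 list seats, four parties.
LIST_VOTES = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 21_000}
DISTRICT_WINS = {"A": 6, "B": 2, "C": 0, "D": 0}

if __name__ == "__main__":
    print("D'Hondt:          ", highest_averages(LIST_VOTES, 8, dhondt))
    print("Sainte-Laguë:     ", highest_averages(LIST_VOTES, 8, sainte_lague))
    print("Largest remainder:", largest_remainder(LIST_VOTES, 8))
    print("Unlinked mixed:   ", unlinked_totals(LIST_VOTES, DISTRICT_WINS, 8))
    print("Linked mixed:     ", linked_totals(LIST_VOTES, DISTRICT_WINS, 8))
```

For eight seats, D’Hondt gives the four parties 4, 3, 1 and 0 seats, while Sainte-Laguë and the largest remainder method both give 3, 3, 1 and 1. In the mixed example the unlinked calculation leaves party A with 10 of 16 seats on about 43 per cent of the list vote, whereas the linked calculation yields 7, 6, 2 and 1, close to what a pure PR election for all 16 seats would produce.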
The electoral rules identified above, while arguably the most influential, do not even begin to exhaust the vast bestiary of electoral rule options. Here I mention only a handful of the better known rules.
The most important proportional representation alternative to list PR is the single transferable vote (STV).3 STV is the multi-seat variant of the alternative vote. Under STV, voters rank candidates, and any candidate who receives over one Droop quota of votes is elected. Expressed as a share of the votes cast, a Droop quota equals 1/(M + 1). Any cohesive group with at least a 1/(M + 1) share of the electorate can guarantee to elect a candidate of choice. In tallying votes under STV we first look to first preferences to check for winners. If there are none, we eliminate the candidate with the fewest first-place votes and reallocate each of that candidate’s ballots to the next candidate ranked on it. But once there is a winning candidate who receives more than a Droop quota share, we look to the ballots cast by the candidate’s supporters, and a share of those ballots representing votes in excess of what was needed for the election of that candidate is reallocated to the next candidate on each such voter’s ranked list.4 Then we check to see if that results in the election of a further winner. This process continues until M candidates are chosen. It is easy to see that it is mathematically impossible for more than M candidates to have greater than a 1/(M + 1) share of the vote.
STV, in principle, allows voters to express complicated preferences that take into account both preferences for parties and views of individual candidates but, in practice, voters may choose to report a truncated ballot in which they confine their preference rankings to candidates of the party with which they identify. STV is most famously used for parliamentary elections in Ireland; there it has proved resistant to attempts to do away with it. 
It was also briefly used in more than two dozen localities in the United States, but now only the city council elections for Cambridge, Massachusetts continue to use STV.
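The STV tally described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the counting rule of any particular jurisdiction: the function name `stv_count` is hypothetical, surpluses are transferred fractionally (every ballot of an elected candidate is passed on at reduced weight, rather than a sample of ballots being drawn), the quota is fixed at the start of the count, and ties are broken alphabetically.

```python
from fractions import Fraction

def stv_count(ballots, seats):
    """Return winners, in order of election, from ranked ballots.

    ballots: list of lists of candidate names, most preferred first.
    Assumptions (not from the text): fractional surplus transfers,
    a quota fixed at the start, alphabetical tie-breaking.
    """
    piles = [[list(b), Fraction(1)] for b in ballots if b]
    quota = Fraction(len(piles), seats + 1)  # Droop share of the votes cast
    hopeful = {c for prefs, _ in piles for c in prefs}
    elected = []

    def top_choice(prefs):
        # First candidate on the ballot who is still in the running.
        return next((c for c in prefs if c in hopeful), None)

    while len(elected) < seats and hopeful:
        tally = {c: Fraction(0) for c in hopeful}
        for prefs, weight in piles:
            top = top_choice(prefs)
            if top is not None:
                tally[top] += weight
        ranked = sorted(hopeful, key=lambda c: (-tally[c], c))
        if len(hopeful) <= seats - len(elected):
            # No more candidates than open seats: all remaining are elected.
            elected.extend(ranked)
            break
        leader = ranked[0]
        if tally[leader] > quota:
            # Pass on only the surplus: scale down the leader's ballots.
            keep = (tally[leader] - quota) / tally[leader]
            for pile in piles:
                if top_choice(pile[0]) == leader:
                    pile[1] *= keep
            elected.append(leader)
            hopeful.remove(leader)
        else:
            # Nobody exceeds the quota: eliminate the weakest candidate.
            hopeful.remove(ranked[-1])
    return elected

# Hypothetical two-seat contest with nine voters (quota share = 3 votes).
ballots = [["A", "B"]] * 5 + [["B"]] * 2 + [["C"]] * 2
print(stv_count(ballots, seats=2))  # ['A', 'B']
```

In this made-up contest A exceeds the quota on first preferences, and the transfer of A's surplus of two votes lifts B past C, who would otherwise have tied with B.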
There are also other rules that can, in principle, achieve proportional representation, even if in practice they do not do so perfectly. One of these is cumulative voting, a rule that allows voters to differentially weight their votes for different candidates to express intensity of preference. Another is the single non-transferable vote in which, even though there are multiple candidates to be elected, voters may vote for only one of these candidates, with those with the most votes being elected. In the United States such rules have been proposed as ways to assure the representation of racial and ethnic minorities at the local level. These rules allow a minority of a given size to elect at least one candidate of choice if they vote cohesively as a bloc (Brockington et al., 1998). SNTV (single non-transferable vote) has been used in the past in places such as Japan and Korea, and it is still in use in several countries, including Jordan and Afghanistan (Reynolds et al., 2005). The more general form of the single non-transferable vote, the limited vote, gives each voter k votes but with M candidates to be elected and M > k > 1. The limited vote can be thought of as a semi-proportional voting rule. It has been used in Spain.5 Other rules first developed for use in single-seat elections, such as the Borda rule and approval voting, can also be extended to allow for multi-seat elections.
There are a number of different ways to create a taxonomy of electoral rules. By making more fine-grained distinctions we can avoid the misleading nature of simple comparisons of PR and non-PR systems, since each category has multiple variants which can differ greatly in their implications for the behavior of voters, candidates and political parties, and in their consequences for the types of public policies that are most likely to be chosen (e.g. ones targeted to particular constituencies versus ones designed for broad appeal). 
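To make concrete how large a cohesive minority must be under SNTV and the limited vote, the sketch below uses the threshold-of-exclusion formula k/(k + M) that is standard in the literature for voters casting k votes in an M-seat contest. The formula and the function name are supplied here for illustration and are not stated in the passage above; a bloc strictly larger than this share cannot be shut out.

```python
from fractions import Fraction

def threshold_of_exclusion(votes_per_voter, seats):
    """Share of the electorate above which a cohesive bloc is sure of one seat.

    Assumes the standard limited-vote formula k / (k + M); k = 1 gives SNTV.
    """
    k, m = votes_per_voter, seats
    if not 1 <= k <= m:
        raise ValueError("need 1 <= votes_per_voter <= seats")
    return Fraction(k, k + m)

# Hypothetical five-seat district under increasingly generous ballots.
for k in (1, 2, 3, 5):
    print(k, threshold_of_exclusion(k, 5))
```

With k = 1 the threshold coincides with the Droop share 1/(M + 1); as k approaches M it rises toward one half, which is one way to see why the limited vote is only semi-proportional.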
One common approach to classification of electoral systems relies solely on ballot features (e.g. Xs versus rankings; party-based voting versus candidate-based voting versus
some mix of the two; number of rounds of election required or potentially required). These are all features that do not explicitly reference expected outcomes. The second approach classifies voting rules according to their expected (or actual) proportionality of outcomes, taking district magnitude (the number of seats per constituency) into account. This approach lends itself to using proportionality as a key variable in explaining other outcomes of interest, such as the expected number of parties. In turn, the structure of the party system can be used to understand other politically salient outcomes, such as the nature of governing coalitions.
History of the Study of Electoral Systems
Four Books that Defined the Modern Subfield of Electoral Studies
The political science study of electoral systems was heavily normative from the 1930s through the 1950s, with the beginnings of the modern study of electoral systems seen in Duverger (1948, translated into English in 1954). We may think of the early literature on electoral system effects as focused on three topics: effects on numbers of political parties seeking/gaining representation (Duverger, 1948, 1951); effects on proportionality of the translation of votes into seats (Lakeman and Lambert, 1955); and effects on the ideological structure of party competition, especially the likelihood that extremist parties will gain representation (Hermens, 1940). I view the birth of the modern comparative study of electoral systems as coming with the publication of Rae’s seminal Political Consequences of Electoral Laws in 1969. This book more or less does it all: it introduces new quantitative measurement tools, uses what were for the time sophisticated
methods of analysis, makes use of graphical display to make points clearly and identifies most of what became the defining issues in the subfield over the next several decades. But I also wish to flag for special attention three subsequent books published in the 20th century that have become classics. The 20th-century culmination of the mainstream study of electoral system effects in major democracies is Arend Lijphart’s Electoral Systems and Party Systems: A Study of Twenty-Seven Democracies, 1945–1990, published in 1994. With the help of a large team of country specialists, this work uses cross-national regressions, and some before and after analyses involving changes in electoral rules, to study electoral system effects, including proportionality of seats–votes relationships and party system fragmentation. It also provides an influential typology of electoral rules, and a discussion of time trends in electoral rule choice. Two other landmark books provide major methodological innovations in the study of electoral systems. Taagepera and Shugart (1989) introduce a theory-based approach that on the one hand restricts itself to a handful of structural variables, such as district magnitude (the number of seats elected from a constituency), and on the other looks not to a prediction of country-specific outcomes but at predictions of the expected average values (and distribution) of a handful of key outcome variables such as proportionality and the (effective) number of political parties.6 This work is based on the notion of logical constraints serving as boundary conditions on the range of feasible values, the principle of insufficient reason and insights into the behavior of aggregates of voters drawn from thermodynamics – ideas that stem from Taagepera’s training as a physicist. While Taagepera takes his inspiration from physics, Cox (1997) draws from contemporary economics. 
His book offers a magisterial synthesis of game theory-based models of electoral system effects, in which the aim
is to find equilibrium solutions to games involving the interaction of voters and political parties who take electoral system rules to provide incentives for and constraints on the creation of political parties and their ideological locations. In his ‘rational’ world, parties look for winning ideological niches, seeking to maximize seat share – though later models allow for parties to have policy goals as well.
The Development of the Subfield
There has been a remarkable growth in books and articles studying electoral rules from a theoretical and comparative perspective. That growth has, if anything, accelerated in the 21st century. The subfield of comparative electoral systems is now fully established as a major subarea within the broader field of comparative politics, and its theoretical underpinnings seem firmly in place. Moreover, it seems that at least every other issue of the top general-purpose political science journals carries an article on electoral systems. While the organizing issues of earlier work on electoral law effects – proportionality in the seats–votes relationship, number of parties, ideological range of party competition, coalition structure – remain central in the literature (see e.g. Carey and Hix, 2011 for recent work on proportionality), with a better theory of how electoral systems affect party systems, the literature has gone on to look at other types of effects of electoral rules. Here I mention only a few of the (relatively) newer topics, and provide only a few illustrative citations:
1 What are electoral systems’ effects on voter turnout? While research continues to confirm the long-established fact that countries with PR systems tend to have higher turnout than countries operating under plurality rules, research using within-nation comparisons disconfirms many of the standard theories as to why we might expect such a relationship to occur (Grofman and Selb, 2011; see also Grofman and Selb, 2009). For example, the notion that more parties increase voter choice and thus, ceteris paribus, the more parties there are, the more likely are voters to have a party that they are enthusiastic about supporting at the polls – so that turnout should be higher when there are more parties – is not well supported.
2 What are the consequences of electoral rules for racial, ethnic and gender representation and for the mitigation of ethnic conflict? With respect to gender representation, as in other areas of electoral systems research, a clear finding is that the devil is in the institutional detail (see e.g. Krook and Moser, 2012). In particular, the effects of gender quotas depend considerably on whether they are suggestive or mandatory, on the level of gender representation that is called for, on the enforcement mechanism and on such subtle details as whether a requirement for the alternating placement of women on party lists also requires random assignment by gender as to who is at the top of the list. Similarly, work on electoral rules and ethnic accommodation casts doubt on overly facile acceptance of the idea that either proportional representation rules or so-called vote-pooling methods necessarily mitigate ethnic conflict (compare Reilly, 1997, 2005 with Fraenkel and Grofman, 2006).
3 What are the effects of ethnic cleavages on party proliferation under different types of electoral rules? Work such as Amorim Neto and Cox (1997) and Li and Shugart (2016) demonstrates that social diversity can increase the size of the party system above what we might expect from the electoral rule taken in isolation. However, what happened in Belgium, where a polarizing linguistic cleavage in effect nearly doubled the number of parties by having a language split override ideological similarities among party supporters coming from different language communities, seems to be an extreme case.
4 What factors affect the likelihood that campaigns will be party-specific as opposed to candidate-centered? Here, we have learned that single-member districts are far from the only electoral rule to be compatible with candidate-centered politics (Carey and Shugart, 1995), and more recent work has shown that there are longitudinal trends toward personalized campaigning that extend well beyond the United States (Renwick and Pilet, 2016). A closely related question deals with the implications of electoral rule choice for the likelihood that campaigns will be centered on constituency-specific concerns, such as pork-barrel projects in the district, rather than policy and distributional or redistributional issues affecting broad segments of the voting public defined in terms of class interests (see Persson and Tabellini, 2002; cf. Grofman, 2005).
5 What types of voting rules foster the type of political manipulation that produces partisan bias, and, relatedly, how do electoral rules differ in their susceptibility to manipulation via the gerrymandering of constituency boundaries? Tan and Grofman (2019) argue that party bloc voting as it is used in Singapore and in some African nations is especially pernicious in its potential for favoring the plurality group, and that it has become what they call the ‘autocrat’s friend’. But they also note that, on the other hand, there is little actual evidence for partisan gerrymandering in Singapore. Similarly, Sauger and Grofman (2016) show that partisan gerrymandering effects in France’s two-round ballot system appear to be exaggerated.
6 How does choice of electoral rules affect the likelihood that the views of the median voter will determine the policies of the governing party or party coalition (Powell, 2000)? On this question there is ongoing controversy. The earliest work seemed to confirm the view that PR systems would be more responsive to the preferences of the median voter than would ones using plurality, but later work (involving a different time period and a slightly different set of countries) did not support this conclusion.7
Since a full review of even the literature of the past two decades is beyond the scope of this short essay, I would like to call particular attention to a few recent books that either extend our knowledge or provide a useful synthesis, some of which are monographs and some of which are edited collections. Turning first to the monographs: Moser and Scheiner (2013) compare data on party proliferation effects in the PR and plurality components of mixed electoral systems, and emphasize the importance of distinguishing new democracies with still evolving party systems from more established democracies; Reynolds (2011) looks at the relationship
between electoral systems and racial, ethnic and gender representation; Lublin (2014) also looks at ethnic representation but with a focus on ethno-regional parties, making use of a remarkably extensive data set which he has compiled; Bochsler (2010) brings to the political science literature on electoral systems a concern for electoral geography previously found almost solely among political geographers; Renwick and Pilet (2016) show how candidate-centric as opposed to party-centric competition is enhanced or retarded by particular types of electoral rules; Norris (2004) offers a synthesis of the literature on electoral engineering that covers a multiplicity of topics; Taagepera (2007) provides new models for the relationship between electoral systems and the distribution of party sizes; and, perhaps most importantly, Shugart and Taagepera (2018) offer a magisterial update of Taagepera and Shugart (1989), but now with both the order of authorship reversed and the causal arrow in the reverse direction, that is, from seats to votes rather than from votes to seats. Turning now to edited volumes: a series of five edited books sponsored by the Peltason Center for the Study of Democracy at the University of California, Irvine under a research scheme which I devised compares countries that use the same electoral rule to seek to identify electoral effects that generalize across cases, and to identify intervening factors that limit generalizability.8 This series spans the last decade of the last century and the first decade of this one and reviews five of the world’s major electoral systems: STV, SNTV, list PR, plurality and mixed-member systems. Grofman et al. 
(1999) look at the single non-transferable vote in Japan, Korea and Taiwan; Bowler and Grofman (2000) look at the single transferable vote in Australia, Ireland and Malta; Shugart and Wattenberg (2001) examine mixed-member systems in both western and eastern Europe, as well as in New Zealand; Grofman and Lijphart (2002) look at list PR elections in the Nordic
countries; and Blais et al. (2008) examine plurality-based elections in Canada, India, the UK and the United States, with a focus on the effects of single-seat plurality on the (effective) number of parties. I would also single out two other edited volumes that expanded our knowledge of electoral system effects: Colomer (2004a) and Gallagher and Mitchell (2005). While primarily containing country-specific analyses across a range of electoral rules, these volumes seek to go beyond case studies to develop genuine comparative theory. Finally, I would note the important compilation created by an APSA taskforce edited by Mala Htun and G. Bingham Powell, published in 2012, that looks at the relationship between electoral rules and democratic governance, and contains a number of very insightful essays.
Data Availability, Measurement Tools and Graphical Display
The tremendous growth in the electoral systems subfield has been facilitated by a number of factors – most notably the dramatic enlargement in the number of countries that could be treated as democratic and the creation of ever longer post-World War II longitudinal data sets, on the one hand, and the expansion in the number of political scientists doing empirically oriented research, especially those from newer democracies, on the other. While scholars located in the United States or the UK continue to play a large role, an increasing share of the work being published in the subfield is by younger scholars who are located at institutions outside the United States or other English-speaking countries, though many of these younger scholars received training outside their home countries. The increased use of English-language materials and instruction in the political science graduate programs of top European universities also plays a role in creating a truly international community of electoral systems scholars that fosters
research collaborations across national borders. But there are two other important changes that have greatly improved the quality of electoral system scholarship. The first is the increasing availability of data for elections in many countries at the level of individual districts, not merely at the national level, and in a format that allows for easy computer processing (Struther et al., 2018). The second is the existence of the Comparative Study of Electoral Systems (CSES) multinational project (cses.org/) that allows researchers to meld aggregate election data with survey data (and other information) on the same elections. Another reason for progress in electoral system research is the development of measurement tools that lend themselves to precise quantitative assessment and the development of long-run time series. Key articles appeared in the 1970s, introducing measurement ideas such as the Threshold of Exclusion and the Threshold of Representation (Rae et al., 1973), which provide a priori measures of the degree to which a given electoral rule allows for minority representation; the Index of Distortion (Loosemore and Hanby, 1971), which provides an ex post measure of disproportionality; and the Laakso–Taagepera (1979) measure of the Effective Number of Parties, which reflects the need to take into account party sizes when considering party constellations, rather than simply counting the number of parties gaining representation.9 Perhaps the most important technical refinement of the 1990s is the Gallagher Index (1991), an alternative way to measure ex post disproportionality in the form of a weighted measure that pays less attention to the disproportionality in the smaller parties, since these make up a smaller percentage of the votes or seats. All these tools, but especially the Laakso–Taagepera index, continue to be very widely used and are now more or less taken-for-granted features of much current work in the electoral systems and parties literature. 
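The three ex post measures just named are simple enough to compute directly. The sketch below is offered only as an illustration: the function names are hypothetical, shares are assumed to be proportions summing to one (published values of the two disproportionality indices are usually reported in percentage points), and the formulas are the commonly cited ones, namely the inverse of the sum of squared shares for the Laakso–Taagepera index, half the sum of absolute vote–seat gaps for the Loosemore–Hanby index and the square root of half the sum of squared gaps for the Gallagher index.

```python
from math import sqrt

def effective_number(shares):
    """Laakso-Taagepera effective number of parties: 1 / sum of squared shares."""
    return 1.0 / sum(s * s for s in shares)

def loosemore_hanby(vote_shares, seat_shares):
    """Index of Distortion: half the sum of absolute vote-seat gaps."""
    return 0.5 * sum(abs(v - s) for v, s in zip(vote_shares, seat_shares))

def gallagher(vote_shares, seat_shares):
    """Least-squares index: square root of half the sum of squared gaps."""
    return sqrt(0.5 * sum((v - s) ** 2 for v, s in zip(vote_shares, seat_shares)))

# Hypothetical three-party election (proportions, not percentages).
votes = [0.45, 0.40, 0.15]
seats = [0.55, 0.40, 0.05]
print(round(effective_number(votes), 2))        # effective number of electoral parties
print(round(effective_number(seats), 2))        # effective number of parliamentary parties
print(round(loosemore_hanby(votes, seats), 3))
print(round(gallagher(votes, seats), 3))
```

For these made-up figures the effective number of parties falls from about 2.6 in votes to about 2.2 in seats. Because the Gallagher index squares each gap, a few large deviations count for more than many small ones.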
There are, however, some more recent further advances. Both Blais and Lago (2009)
and Grofman and Selb (2009) offer generalizations of Cox’s (1997) measure of electoral competitiveness to multi-party situations operating under proportional representation rules. The Cox measure involves the gap between the vote shares of the largest and the second largest party. While that operationalization of level of competition makes sense for plurality settings, in PR settings even parties with relatively low seat shares may have an opportunity to gain seats. Thus, we would wish to know how far away from a seat gain each party’s share places it in the context of the overall distribution of votes across the parties. Feld and Grofman (2007) show how the Laakso–Taagepera (L-T) index can be reformulated in more standard statistical terms as regards the mean and variance of party seat or vote distributions. Dumont and Caulier (2003) and Kline (2009) offer an interesting way to extend the L-T index. They propose combining the Laakso–Taagepera index with power scores derived from game theory.10 This reformulation has a number of nice features, but in particular, unlike the L-T index, it generates a finding of one-party dominance whenever any party controls a majority of seats. However, perhaps the single most important contribution of the present century to an already solidly established measurement tool kit is the development of a new index of bipartism, looking exclusively at the extent to which a party constellation can be characterized as a two-party system (Gaines and Taagepera, 2013). This measure is of particular importance in testing Duverger’s (1948, 1954) claim that single-seat plurality-based elections foster two-party competition – at least at the local level.11 A quite different perspective has, however, also been recently proposed by Grofman and Kline (2012). 
They look at party constellations in terms of ideological propinquity rather than seat share and use a clustering model derived from earlier work on theories of cabinet formation to determine the ‘effective number of ideologically distinct parties’.
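The idea of combining the L-T index with power scores, mentioned above, can be illustrated with a short sketch. It assumes, purely for illustration, that the power scores are normalized Banzhaf shares computed under a simple-majority rule and that these shares replace seat shares in the L-T formula; the function names are hypothetical, and the brute-force enumeration is only practical for a modest number of parties.

```python
from itertools import combinations

def banzhaf_shares(seats):
    """Normalized Banzhaf power shares under simple majority (an assumed choice)."""
    majority = sum(seats) // 2 + 1
    n = len(seats)
    swings = [0] * n
    for size in range(1, n + 1):
        for coalition in combinations(range(n), size):
            weight = sum(seats[i] for i in coalition)
            if weight < majority:
                continue
            for i in coalition:
                # Party i is critical if the coalition loses without it.
                if weight - seats[i] < majority:
                    swings[i] += 1
    total = sum(swings)
    return [s / total for s in swings]

def power_weighted_enp(seats):
    """L-T formula applied to power shares instead of seat shares."""
    return 1.0 / sum(p * p for p in banzhaf_shares(seats))

def seat_share_enp(seats):
    """Ordinary L-T index on seat shares, for comparison."""
    total = sum(seats)
    return 1.0 / sum((s / total) ** 2 for s in seats)

# Hypothetical 100-seat chamber in which one party holds a bare majority.
chamber = [51, 30, 19]
print(round(seat_share_enp(chamber), 2), round(power_weighted_enp(chamber), 2))
```

In this hypothetical chamber the ordinary index reports roughly 2.6 parties, while the power-weighted version reports exactly one, since the majority party is critical to every winning coalition and the others to none.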
Graphical tools also have come to the fore. Taagepera and Shugart (1989) introduce the Index of Advantage, a ratio measure, and then plot it over the range of party sizes so as to provide a graphic measure of the extent to which disproportionality in the translation of votes into seats benefits or hurts larger as opposed to smaller parties. Display tools showing the structure of party competition, such as ternary diagrams and Nagayama diagrams, are described and made use of in Grofman et al. (2004), Taagepera (2004) and Taagepera and Allik (2006).
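As a rough illustration of the ratio behind such a plot, the sketch below assumes that the index is a party's seat share divided by its vote share (the passage above says only that it is a ratio measure). The function name and the election figures are hypothetical.

```python
def advantage_ratios(votes, seats):
    """Map each party to (seat share) / (vote share).

    Assumes the ratio is defined this way; parties with zero votes are skipped.
    """
    total_votes = sum(votes.values())
    total_seats = sum(seats.values())
    ratios = {}
    for party, v in votes.items():
        if v == 0:
            continue
        seat_share = seats.get(party, 0) / total_seats
        ratios[party] = seat_share / (v / total_votes)
    return ratios

# Hypothetical election: raw vote counts and seat counts per party.
votes = {"Large": 4500, "Medium": 4000, "Small": 1500}
seats = {"Large": 55, "Medium": 40, "Small": 5}
for party, ratio in advantage_ratios(votes, seats).items():
    print(f"{party}: {ratio:.2f}")
```

A ratio above one marks a party that is over-represented relative to its votes, and a ratio below one a party that is under-represented; plotting the ratios against vote share gives the kind of display the text describes.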
Origins of Electoral Rules
Issues involving choice of voting rules are found in many settings: from elections for a single executive (e.g. Baumgartner, 2006 on rules for electing the Pope; and see dueling perspectives on the Electoral College in Ross, 2019); to local elections for bodies such as city councils or school boards (Weaver, 1986); to elections for the European Parliament (Dolez and Laurent, 2010); to choice of voting rules within both governmental and non-governmental voting bodies, such as voting rules for the Council of the European Union or voting rules for corporate elections in the United States (Glazer, Glazer and Grofman, 1984). Here our focus will be on national elections for parliament (Rae, 1969; Lijphart and Grofman, 1984; Kreuzer, 2010). While the literature on voting rules stretches far back in time (McLean and Urken, 1995), a convenient starting point for this section is the invention of proportional representation (PR) methods in the 19th century. The invention of PR was eventually followed by a transformation of the electoral system usage map in the decades prior to World War I (Mackie and Rose, 1991), with PR rules adopted for parliamentary elections in most major non-English-speaking democracies.
There remains considerable dispute about the reasons why this change occurred (see e.g. Dunleavy and Margetts, 1995). The most common argument builds on ideas of Stein Rokkan (1970), to wit, that established elites saw PR as a way of holding on to power in the face of inexorable pressures for suffrage expansion that threatened to bring socialist parties to power (see especially Boix, 1999). While there is broad-based agreement that choice of electoral rules is intimately linked to issues of suffrage expansion and to broader processes of democratization (see e.g. Blais et al., 2005), the Rokkan thesis has been challenged in a number of different ways. For example, Calvo (2009: 255) reminds us that this argument ‘fails to explain PR reforms in countries with weak or nonexistent socialist parties, a list that includes most countries of the world in the early twentieth century’. Calvo also calls attention to an argument neglected in Rokkan about the importance of ethnic representation for choice of voting rules. Calvo’s own work emphasizes effects of electoral geography that can generate partisan biases that affect party preferences for electoral rule choices. Ahmed (2010) argues for the importance of strategic choices by conservative parties as to how best to fight socialism, choices which did not always generate a preference for proportional representation. Ziblatt (2017) similarly takes a party-centric approach, focusing on conservative parties’ adaptations to a new electoral environment and the ways in which they developed strategies to compete for the votes of the less well off. et al. 
(2007, 2010), however, take a quite different tack, and argue that the degree of adversarial relationships in worker–management interactions offers the key explanation for why not all major democracies adopted PR, and that this factor also explains the continued relationship between varieties of capitalism and choice of electoral rules.12 From World War II through the early 1990s, in most long-term major democracies, we observe largely frozen electoral
system usage at the parliamentary level – with English-speaking nations using plurality or the alternative vote and just about every other OECD country using some form of PR. France is an exception in that the perceived weakness of the proportional system in use in the 4th French Republic led to the adoption of a two-round single-ballot system for parliamentary elections under the 5th Republic.13 In the 1990s, however, there was a new wave of change, involving imitation of some features of the mixed-member electoral system that had been adopted by Germany: in Japan, the use of SNTV ended; in Italy, proportional representation was discontinued; and in New Zealand, a plurality system was replaced. In each case the replacement was a mixed-member system. For these cases of electoral system change, theories of electoral system choice based on western Europe in the late 19th and very early 20th centuries seem of limited relevance. Moreover, once we look beyond the long-term democracies, both the range of cases to study and the range of possible explanations for electoral choices widen considerably. Global changes came with the rise of new democracies after decolonization in the 1960s, and then again in the 1990s after the break-up of the Soviet empire, with new (or restored) countries adopting new electoral rules. Here we simply note four factors that can impact electoral system choice. None of these factors reflects the idealist vision of electoral reform as serving good government ends, though surely some participants in electoral reform debates are primarily motivated by exactly such sentiments.14
1 Preference-based rational choice calculations made by those with the ability to effect change can be based on their (long run or short term; accurate or mistaken) expectations of having electoral or policy outcomes (more) to their liking. 
In John Ferejohn’s classic phrasing (personal communication, 1971): ‘Preference for rules is conditioned by preference for outcomes.’ For example, as the potential threat to white
dominance from enfranchised black voters began to be taken seriously, white Democrats in power in the US South after World War II began to adopt majority run-off election rules for Democratic congressional primaries to prevent African Americans from being nominated as plurality winners.15
2 Borrowing from what is used in other countries – perhaps historically conditioned, perhaps more or less faddish. For example, Mozaffar et al. (2003) note the link between the choice of electoral rules in Africa and colonial heritage, with British colonies more likely to choose plurality and French colonies initially opting for either proportionality or a two-round ballot (reflecting the rules from the 4th and 5th French republics). But it is hard to see the 1990s attractiveness of mixed-member systems as reflecting anything other than a view in other countries that post-World War II Germany has been very successful both economically and politically, and that at least some of that success should be credited to its electoral institutions.
3 Decisions by outside actors after a country’s military loss, or negotiations taking place as part of a broader peace agreement after a civil war (perhaps brokered by outside mediators).
4 Sometimes there can be popular pressure for change, especially when the seats–votes relationship seems ‘out of whack’. But parties in power may be able to successfully resist pressures from below, unless accidents occur. For example, we apparently owe New Zealand’s change to a mixed-member system in part to a minister who misspoke and found his government committed to a referendum on electoral law change – one which his party later unsuccessfully opposed.
The Future of Electoral Systems Research
Looking to the future of electoral system research, we can say for sure that there will be continuing work on the role of various types of quotas and redistricting mechanisms in ethnic and gender representation. But there are a number of other developments in recent decades which I expect to see being
built upon and expanded in the future. Here I wish to highlight just three of these.
1 We will need to deal better with issues of causality. Understanding when particular types of electoral system effects will manifest remains key to future progress in the subfield. Because of overdetermination, disentangling causation can be quite difficult. For example, societies where socialist traditions are weak, or which have particular cultural features, such as British colonial heritage, have also been ones with plurality-based rules. These societies may also be expected to have major differences from one another in a variety of ways, and so we need to be careful about making claims about the causal effects of electoral system differences. Similarly, Colomer (2004b) emphasizes selection bias effects in which countries pick voting rules that are compatible with their preexisting cleavage structure. The embedded institutions approach, to whose edited volumes I have previously called attention, deals with causality in terms of most similar systems design, on the one hand, and natural experiments, on the other. As previously noted, the most similar systems design is intended to help tease out context effects (e.g. the intermediating role of institutions such as a president or a federal system). There is also a growing literature using experiments, both in the field and in the laboratory (Dolez, Grofman and Laurent, 2010; Muraoka and Barcelo, 2017) to better understand causality of electoral system effects. We can also expect that more will be done with the new methodological tools for empirical causal analysis, such as regression discontinuity design, difference-in-differences, and statistical matching of most similar cases.
2 Relatedly, we must better answer the question of how much we can use the levers of electoral system choice to achieve social change. 
Answering this question will require us to assess the independent effects of electoral rules as compared to other factors such as political culture and political history.
3 We should better acknowledge and build upon contributions from disciplines other than political science. In particular, work by leading economists uses very sophisticated econometrics as well as applied game theory to examine electoral system effects. Also, much of the most interesting experimental work on electoral system effects has
been done by economists – and economists are central in contributing to the highly mathematical work on voting rules that is published these days, largely in interdisciplinary journals such as Social Choice and Welfare. But the flow of ideas goes both ways. For example, the Canadian economist Stanley Winer has worked on applying electoral system ideas to the development of the Canadian party system (Winer et al., 2017), while the Italian economist Pietro Navarra, along with the British economist Ram Mudambi, has done work on electoral system effects on Italian party systems (Mudambi et al., 1996), and I have previously referenced work by the Spanish economist Josep Colomer (2004a, 2004b). I should also note that a new subfield of computational social choice has been developing in computer science, involving both axiomatic approaches to voting rule properties and simulations of election law effects. Because this work is being published largely in computer science and artificial intelligence journals and conference proceedings (see e.g. Betzler et al., 2013), and because it is highly technical, it has been virtually invisible to political scientists.
Notes
1 One famous example of this failure of plurality is the US Senate election contest in New York in 1970 involving the Republican candidate, James Buckley (brother of William F. Buckley); Richard Ottinger, the Democratic candidate; and Charles Goodell, a liberal Republican running on the Liberal Party line. Buckley won with a 39% plurality to 37% for Ottinger and 24% for Goodell. If Ottinger had not been in the contest, it is certain that most Democrats would have supported Goodell over Buckley, since in a choice between two Republicans most Democrats would have supported the more liberal of the two. But similarly, almost certainly the majority of Goodell supporters, who were voting for him on the Liberal line, would have supported Ottinger over Buckley, allowing Ottinger to win. Thus, Buckley was a Condorcet loser.
2 However, there may be ‘contamination’ effects across the two components which operate to make the constellation of parties in each similar to one another.
3 STV is also known as the Hare system, after its British exponent, but it was first proposed by the Danish mathematician (and politician) Carl Andræ, and was used for the first time in an election to the Danish Rigsdag in 1856.
4 Here we slough over technicalities involving the randomization of this allocation process.
5 An interesting historical footnote is that the limited voting rule was first proposed by the mathematician Charles Dodgson, who is better known to most of us under his pen name of Lewis Carroll (Black, 1969).
6 See below for definition of this term.
7 My own view is that electoral system representational effects depend heavily upon the degree of polarization between/across parties and on the ideological distribution of party strengths. This can vary across time. Thus, I am not that disturbed by seemingly contradictory findings.
8 I refer to examination of context effects in this most similar systems design as ‘embedded institution’ analysis (Grofman et al., 1999).
9 The Laakso–Taagepera index involves summing squared seat (or vote) proportions. This index is the inverse of the even better known Herfindahl–Hirschman index used in economics.
10 Writing in ignorance of the work of Dumont and Caulier (2003), Grofman (2006) independently suggested the possibility of creating such a combined index.
11 Duverger offers a three-part ‘law’ which asserts that plurality single-seat constituencies favor two-party competition, and that proportional representation and two-round ballot systems favor multi-partyism. Duverger proposes that electoral system effects occur through two mechanisms: one mechanical, tied to the district magnitude and the Threshold of Exclusion, and one more incentive-based, referred to by Duverger as a ‘psychological effect’. The latter impacts both voters and parties. Parties that expect to lose have little incentive to run, and voters have little incentive to vote for parties that have no chance of victory. These two sides of the coin interact to reduce the number of parties to what I have referred to as the ‘carrying capacity’ of the system. There are many articles arguing for the continued accuracy of Duverger’s Law, which has long been referred to as one of the handful of true ‘laws’ in the social sciences. This literature convincingly shows that, on average, the effective number of political parties is lower in countries with single-seat plurality elections than in other countries. However, there is a literature suggesting some important caveats. In particular, (a) Duverger’s law has been tested using ‘effective number of parties’, which is distinct from what Duverger had in mind; (b) for plurality elections, Duverger’s law can only be viewed as plausible in terms of district outcomes, not jurisdiction-wide outcomes; and (c) even at the district level, with the exception of a handful of tiny Caribbean island republics, the United States is virtually alone in looking like a true two-party system. For example, in many districts in the UK and Canada, third parties (often regional parties) persist, and competition is rarely limited to only two major parties. In India, the picture is also not straightforward. With the decline of the Congress Party, Indian political competition at the district level is only rarely limited to two parties, but also parties tend to come and go.
12 I make no attempt to arbitrate this controversy, other than to note that necessary or sufficient conditions for some type of outcome are unlikely to be reducible to a single factor and that explanatory factors may interact in complex ways; nor need we expect that all similar outcomes have the same cause.
13 That two-round ballot is also used for French presidential elections in the Fifth Republic.
14 Renwick (2011) offers both case-specific analyses and theory to better understand the nature of debates about electoral reform.
15 D’Alimonte (2012: 255–6) provides another example of an interest-driven explanation for electoral system choice: Italy in 1993. ‘On the choice between two round elections and simple plurality in the SMD component of the mixed system that was being proposed the DC succeeded in steering the decision away from the French system, which was the preferred option of the PDS and towards the plurality formula. Its preference for this choice was influenced by the results of the [June 1993] local elections held with a two-round electoral system. The poor performance of the DC candidates, due to the DC difficulty in forming alliances, convinced party leaders that the French system was not in its interest.’
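Note 9 describes the Laakso–Taagepera index as built from summed squared seat (or vote) proportions and as the inverse of the Herfindahl–Hirschman index. The following is a minimal sketch of that calculation, offered only as an illustration: the function names are hypothetical, the inputs are assumed to be raw seat counts or vote shares for each party, and the numbers are made up rather than taken from the chapter.

```python
def herfindahl_hirschman(shares):
    """Sum of squared proportions. `shares` may be raw counts or fractions;
    they are normalized here so that they sum to one (an assumption of this sketch)."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return sum((s / total) ** 2 for s in shares)


def effective_number_of_parties(shares):
    """Laakso-Taagepera index: the inverse of the summed squared proportions."""
    return 1.0 / herfindahl_hirschman(shares)


# Illustrative, made-up seat counts (not data from the chapter).
print(effective_number_of_parties([50, 50]))          # prints 2.0
print(effective_number_of_parties([25, 25, 25, 25]))  # prints 4.0
print(round(effective_number_of_parties([70, 20, 10]), 2))
```

With equally sized parties the index simply returns the number of parties; with one dominant party it falls below the raw count, which is why it is read as an ‘effective’ number.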
References
Ahmed, Amel. 2010. ‘Reading History Forward: The Origins of Electoral Systems in European Democracies.’ Comparative Political Studies 43(8/9): 1059–88. Amorim Neto, Octavio and Gary Cox. 1997. ‘Electoral Institutions, Cleavage Structures, and the Number of Parties.’ American Journal of Political Science 41(1): 149–74. Balinski, Michel L. and Peyton H. Young. 1982. Fair Representation: Meeting the Ideal of
One Man, One Vote. New Haven, CT: Yale University Press. Balinski, Michel L. and Rida Laraki. 2011. Majority Judgment: Measuring, Ranking, and Electing. Cambridge, MA: MIT Press. Baumgartner, Frederic J. 2006. ‘Creating the Rules of the Modern Papal Election.’ Election Law Journal 5(1): 57–73. Betzler, N., A. Slinko and J. Uhlmann. 2013. ‘On the Computation of Fully Proportional Representation.’ Journal of Artificial Intelligence Research 47: 475–519. Black, Duncan. 1958. The Theory of Committees and Elections. New York: Cambridge University Press. Black, Duncan. 1969. ‘Lewis Carroll and the Theory of Games.’ American Economic Review 59(2): 206–10. Blais, Andre, Agnieszka Dobrzynska and Indridi H. Indridason. 2005. ‘To Adopt or Not to Adopt Proportional Representation: The Politics of Institutional Choice.’ British Journal of Political Science 35 (January): 182–90. Blais, Andre and Ignacio Lago. 2009. ‘A General Measure of District Competitiveness.’ Electoral Studies 28(1): 94–100. Bochsler, Daniel. 2010. Territory and Electoral Rules in Post-Communist Democracies. London: Palgrave Macmillan. Bochsler, Daniel, Bernard Grofman and Miriam Hänni. 2017. ‘The Effects of Ethnic Fragmentation on Party Proliferation Revisited: The Intermediating Role of Ethnic Parties.’ Presented at the Annual Meeting of the European Political Science Association, Milan, June 22–24. Boix, C. 1999. ‘Setting the Rules of the Game: The Choice of Electoral Systems in Advanced Democracies.’ American Political Science Review, 93(3): 609–24. Bowler, Shaun and Bernard Grofman (eds) 2000. Elections in Australia, Ireland and Malta under the Single Transferable Vote. Ann Arbor: University of Michigan Press. Brams, Steven J. and Peter C. Fishburn. 1983 (2nd edition 2007). Approval Voting. New York: Springer. Brockington, David, Todd Donovan, Shaun Bowler and Robert Brischetto. 1998. ‘Minority Representation under Cumulative and Limited Voting.’ Journal of Politics 60(4): 1108–25.
Calvo, Ernesto. 2009. ‘The Competitive Road to Proportional Representation: Partisan Biases and Electoral Regime Change under Increasing Party Competition.’ World Politics 61(2): 254–95. Carey, John M. and Simon Hix. 2011. ‘The Electoral Sweet Spot: Low-Magnitude Proportional Representation Systems.’ American Journal of Political Science 55(2): 383–97. Carey, John M. and Matthew Shugart. 1995. ‘Incentives to Cultivate a Personal Vote: A Rank Ordering of Electoral Formulas.’ Electoral Studies 14(4): 417–39. Colomer, Josep M. (ed.). 2004a. Handbook of Electoral System Choice. New York: Palgrave Macmillan. Colomer, Josep M. 2004b. ‘The Strategy and History of Electoral System Choice.’ In J. M. Colomer (ed.) Handbook of Electoral System Choice. New York: Palgrave Macmillan, 1–80. Cox, Gary W. 1997. Making Votes Count. Cambridge: Cambridge University Press. Cusack, T. R., T. Iversen and D. Soskice. 2007. ‘Economic Interests and the Origins of Electoral Systems.’ American Political Science Review 101(3): 337–91. Davidson, Chandler and Bernard Grofman (eds). 1994. Quiet Revolution in the South: The Impact of the Voting Rights Act, 1965–1990. Princeton, NJ: Princeton University Press. D’Alimonte, Roberto. 2012. ‘Italy: A Case of Fragmented Bipolarism.’ In Michael Gallagher and Paul Mitchell (eds) The Politics of Electoral Systems. New York: Oxford University Press, 253–76. Dolez, Bernard and Annie Laurent. 2010. ‘La magnitude, facteur décisif? Les élections européennes de 2004 en France et les effets du changement de mode de scrutin.’ Revue internationale de politique comparée 17(3): 175–93. Dumont, Patrick and Jean-Francois Caulier. 2003. ‘The Effective Number of Relevant Parties: How Voting Power Improves Laakso-Taagepera’s Index.’ CEREC-FUSL Working Paper 2003/7. Dunleavy, Patrick and Helen Margetts. 1995. ‘Understanding the Dynamics of Electoral Reform.’ International Political Science Review 16(1): 9–29.
Duverger, Maurice. 1948. Les regimes politiques (1st ed.). Paris: Presses Universitaires de France. Duverger, Maurice. 1954. Political Parties. New York: Wiley. Feld, Scott L. and Bernard Grofman. 2007. ‘The Laakso-Taagepera Index in a Means and Variance Framework.’ Journal of Theoretical Politics 19(1): 101–6. Fraenkel, Jonathan and Bernard Grofman. 2006. ‘The Failure of the Alternative Vote as a Tool for Promoting Ethnic Moderation in Fiji.’ Comparative Political Studies 39(5): 663–6. Gaines, Brian J. and Rein Taagepera. 2013. ‘How to Operationalize Two-Partyness.’ Journal of Elections, Public Opinion and Parties 23(4): 387–404. Gallagher, Michael. 1991. ‘Proportionality, Disproportionality and Electoral Systems.’ Electoral Studies 10(1): 33–51. Gallagher, Michael and Paul Mitchell (eds), 2005. The Politics of Electoral Systems. New York: Oxford University Press. Glazer, Amihai, Deborah Glazer and Bernard Grofman. 1984. ‘Voting in Corporate Elections.’ South Carolina Law Review 35(2): 295–309. Grofman, Bernard. 2005. ‘Comparisons among Electoral Systems: Distinguishing between Localism and Candidate-Centered Politics.’ Electoral Studies, 24(4): 735–40. Grofman, Bernard. 2006. ‘The Impacts of Electoral Laws on Political Parties.’ In B. Weingast and D. Wittman (eds) The Oxford Handbook of Political Economy. New York and London: Oxford University Press, 102–18. Grofman, Bernard. 2008. ‘A Taxonomy of Runoff Methods.’ Electoral Studies 27(3): 395–9. Grofman, Bernard, Alessandro Chiaramonte, Roberto D’Alimonte and Scott L. Feld. 2004. ‘Comparing and Contrasting the Uses of Two Graphical Tools for Displaying Patterns of Multi-Party Competition: Nagayama Diagrams and Simplex Representations.’ Party Politics 10(3): 273–99. Grofman, Bernard and Scott L. Feld. 2004. ‘If You Like the Alternative Vote (a.k.a. the Instant Runoff) Then You Ought to Know about the Coombs Rule.’ Electoral Studies 23(4): 641–59.
Grofman, Bernard and Reuben Kline. 2012. ‘How Many Political Parties are There, Really? A New Measure of the Ideologically Cognizable Number of Parties/Party Groupings.’ Party Politics 18(4): 523–44. Grofman, Bernard, Shaun Bowler, and Andre Blais (eds). 2008. Duverger’s Law in Canada, India, the U.S. and the U.K. New York: Springer. Grofman, Bernard, Sung-Chull Lee, Edwin Winckler and Brian Woodall (eds). 1999. Elections in Japan, Korea and Taiwan under the Single Non-Transferable Vote: The Comparative Study of an Embedded Institution. Ann Arbor: University of Michigan Press. Grofman, Bernard and Arend Lijphart (eds). 2002. The Evolution of Electoral and Party Systems in the Nordic Countries. New York: Agathon Press. Grofman, Bernard and Peter Selb. 2009. ‘A Fully General Index of Political Competition.’ Electoral Studies 28(2): 291–6. Grofman, Bernard and Peter Selb. 2011. ‘Turnout and the (Effective) Number of Parties at the National and at the District Level: A Puzzle Solving Approach.’ Party Politics 17(1): 93–117. Hermens, Ferdinand A. 1940. Democracy and Proportional Representation. Chicago: The University of Chicago Press. Htun, Mala and G. Bingham Powell (eds). 2012. Report of the APSA Presidential Task Force on Electoral Rules and Democratic Governance. Washington, DC: American Political Science Association. Kline, Reuben. 2009. ‘How We Count Counts: The Empirical Effects of Using Coalitional Potential to Measure the Effective Number of Parties.’ Electoral Studies 18(4): 1–9. Kreuzer, Marcus. 2010. ‘Historical Knowledge and Quantitative Analysis: The Case of the Origins of Proportional Representation.’ American Political Science Review 104(2): 369–92. Krook, Mona and Robert Moser. 2012. ‘How Electoral Rules Affect Descriptive Representation.’ In Mala Htun and G. Bingham Powell (eds) Report of the APSA Presidential Task Force on Electoral Rules and Democratic Governance, 85–105. Laakso, Markku and Rein Taagepera. 1979.
‘Effective Number of Parties: A Measure with
Application to West Europe.’ Comparative Political Studies 12(1): 3–27. Lakeman, Enid and James D. Lambert. 1955. Voting in Democracies: A Study of Majority and Proportional Electoral Systems, 1st edition (2nd edition, 1959). London: Faber and Faber. Li, Yuhui and Matthew Soberg Shugart. 2016. ‘The Seat Product Model of the Effective Number of Parties: A Case for Applied Political Science.’ Electoral Studies 41(1): 23–34. Lijphart, Arend. 1994. Electoral Systems and Party Systems: A Study of Twenty-Seven Democracies, 1945–1990. Oxford and New York: Oxford University Press. Lijphart, Arend and Robert W. Gibberd. 1977. ‘Thresholds and Payoffs in List Systems of Proportional Representation.’ European Journal of Political Research 5(3): 219–44. Lijphart, Arend and Bernard Grofman (eds). 1984. Choosing an Electoral System. New York: Praeger. Loosemore, John and Victor J. Hanby. 1971. ‘The Theoretical Limits of Maximum Distortion: Some Analytic Expressions for Electoral Systems.’ British Journal of Political Science 1(4): 467–77. Lublin, David. 2014. Minority Rules: Electoral Systems, Decentralization and Ethnoregional Party Success. Oxford: Oxford University Press. Mackie, Thomas T. and Richard Rose. 1991. The International Almanac of Electoral History (fully revised 3rd ed). Washington, DC: Congressional Quarterly. Massicotte, Louis and Andre Blais. 1999. ‘Mixed Electoral Systems: A Conceptual and Empirical Survey.’ Electoral Studies 18(3): 341–66. McLean, Iain and Arnold B. Urken (eds). 1995. Classics of Social Choice. Ann Arbor: University of Michigan Press. Moser, Robert G. and Ethan Scheiner. 2013. Electoral Systems and Political Context: How the Effects of Rules Vary across New and Established Democracies. New York and London: Cambridge University Press. Mozaffar, Shaheen, James R. Scarritt and Glen Galaich. 2003. ‘Electoral Institutions, Ethnopolitical Cleavages and Party Systems in Africa’s Emerging Democracies.’ American Political Science Review 97(3): 379–90.
Mudambi, R., P. Navarra and C. Nicosia. 1996. ‘Plurality versus Proportional Representation:
Analysis of Sicilian Elections.’ Public Choice 86(3–4): 341–57. Muraoka, Taishi and Joan Barcelo. 2017. ‘The Effect of District Magnitude on Turnout: Quasi-Experimental Evidence from Nonpartisan Elections under SNTV.’ Party Politics 25(4): 1–7. Niemi, Richard, Jeffrey Hill and Bernard Grofman. 1985. ‘The Impact of Multimember Districts on Party Representation in U.S. State Legislatures.’ Legislative Studies Quarterly 10(4): 441–55. Norris, Pippa. 2004. Electoral Engineering: Voting Rules and Political Behavior. New York: Cambridge University Press. Persson, Torsten and Guido Tabellini. 2002. Political Economics: Explaining Economic Policies. Cambridge, MA: Harvard University Press. Powell, G. Bingham. 2000. Elections as Instruments of Democracy: Majoritarian and Proportional Visions. New Haven, CT: Yale University Press. Rae, Douglas. 1969 (2nd ed. 1972). Political Consequences of Electoral Laws. New Haven, CT: Yale University Press. Rae, Douglas, Victor J. Hanby and John Loosemore. 1971. ‘Thresholds of Representation and Thresholds of Exclusion: An Analytic Note on Electoral Systems.’ Comparative Political Studies 3(4): 479–88. Reilly, Ben. 1997. ‘The Alternative Vote and Ethnic Accommodation: New Evidence from Papua New Guinea.’ Electoral Studies 16(1): 1–11. Reilly, Ben. 2005. ‘Does the Choice of Electoral System Promote Democracy? The Gap between Theory and Practice.’ In P. G. Roeder and D. S. Rothchild (eds) Sustainable Peace: Power and Democracy after Civil Wars. Ithaca, NY: Cornell University Press, 159–71. Renwick, Alan. 2011. The Politics of Electoral Reform: Changing the Rules of Democracy. London: Cambridge University Press. Renwick, Alan and Jean-Benoit Pilet. 2016. Faces on the Ballot: The Personalization of Electoral Systems in Europe. Oxford: Oxford University Press. Reynolds, Andrew. 2011. Designing Democracy in a Dangerous World. Oxford: Oxford University Press. Reynolds, Andrew, Ben Reilly and Andrew Ellis. 2005. Electoral System Design: The
New International IDEA Handbook. Stockholm: IDEA. Rokkan, Stein. 1970. Citizens, Elections, Parties: Approaches to the Comparative Study of the Processes of Development. Oslo: Universitetsforlaget. Ross, Tara. 2019. Why We Need the Electoral College. Regnery Gateway Publications, Washington D.C. Saari, Donald G. 1994. Geometry of Voting. New York and Berlin: Springer Verlag. Sauger, Nicolas and Bernard Grofman. 2016. ‘Partisan Bias and Redistricting in France.’ Electoral Studies 44: 388–96. Shugart, Matthew and Rein Taagepera. 2018. Votes from Seats: Logical Models of Electoral Systems. New York: Cambridge University Press. Shugart, Matthew and Martin Wattenberg (eds). 2001. Mixed Member Systems: The Best of Both Worlds? Oxford: Oxford University Press. Struther, Cory L., Yuhui Li and Matthew S. Shugart. 2018. ‘Introducing New Multilevel Datasets: Party Systems at the District And National Levels.’ Research and Politics 5(4). Taagepera, Rein. 2004. ‘Extension of the Nagayama Triangle for Visualization of Party Strengths.’ Party Politics 10(3): 301–6. Taagepera, Rein. 2007. Predicting Party Sizes: The Logic of Simple Electoral Systems. Oxford and New York: Oxford University Press.
Taagepera, Rein and Mirjam Allik. 2006. ‘Seat Share Distribution of Parties: Models and Empirical Patterns.’ Electoral Studies 25(4): 696–713. Taagepera, Rein, Peter Selb and Bernard Grofman. 2013. ‘How Turnout Depends on the Number of Parties: A Logical Model.’ Journal of Elections, Public Opinion and Parties 24(4): 1–24. Taagepera, Rein and Matthew Shugart. 1989. Seats and Votes: The Effects and Determinants of Electoral Systems. New Haven, CT: Yale University Press. Tan, Netina and Bernard Grofman. 2020 forthcoming. ‘Electoral Rules and Manufacturing a Legislative Supermajority: Evidence from Singapore.’ Journal of Commonwealth and Comparative Politics. Winer, Stanley, Stephen Ferris and Bernard Grofman. 2017. ‘The Duverger-Demsetz Perspective on Electoral Competitiveness and Fragmentation: With Application to the Canadian Parliamentary System, 1867–2011.’ In Maria Gallego and Norman Schofield (eds) The Political Economy of Social Choices. New York: Springer, 93–122. Ziblatt, Daniel. 2017. Conservative Parties and the Birth of Democracy. Cambridge: Cambridge University Press.
45 Executive Power
Ferdinand Müller-Rommel and Michelangelo Vercesi
Introduction
Executive power exists in all polities. Although it is ubiquitous, the concept of executive power has received remarkably little scholarly attention, if key political science handbooks are taken as the reference point. The subject is not well studied because it is difficult to define and to measure. In most studies, executive power is defined as the power of the political executive to make and influence governmental policy. Empirically, it has often coincided with political power tout court (Finer, 1997). In this sense, executive power can be fragmented or centralized, it can variously interact with other forms of social power and it can be channeled through more or less strong institutions (Acemoglu and Robinson, 2012). This chapter attempts to conceptualize executive power in different political regimes.1 It starts with a historical review of the notion of executive power. The chapter then introduces various definitions of executive power and advocates that executive power should be studied in the context of the functioning of political institutions. The discussion paves the way to an overview of the organization of executives in authoritarian and democratic regimes. In a further step, we examine the internal structure of political executives in democratic regimes, particularly in light of their linkage to political parties and legislative support. Moreover, we discuss the issue of gender representation in the context of executive power. In the final section we tackle the pressing debate on how to measure executive power.
Executive Power in a Historical Context
If one traces the history of executive power back to the Greek philosophers, Plato and Aristotle have probably been the most influential thinkers. While the former tried to
define the best rulers’ profiles, the latter was more concerned with illustrating (normatively) how rulers should use their power. Later Roman writers extended these issues. Cicero, for example, argued that an ideal system should give wise people the chance to choose superior leaders, who govern based on goodwill and are loved by the governed themselves (Keohane, 2014: 26–30). Outside Western society, Confucius (551–479 bc) had already stressed, even before Plato, the need to lead people through specific virtues and moral qualities (Wong, 2011: 771–2). A new perspective was introduced by early Christian thinkers, who were interested in answering the question of what role governing plays in God’s creation and how Christians should govern accordingly (Lunn-Rockliffe, 2011: 142). The relationship between religion and politics was also a relevant topic in the Muslim world during the medieval period of Islam (about 850–1200): some philosophers sought to reinterpret Plato and Aristotle against the Islamic law; others provided advice to leaders about ethics and the ideal government. In this regard, the most famous treatise was the Book of Government, written by the Persian Nizam al-Mulk in the late 11th century. Later works on executive power simply described the caliphate as the sum of the functions of the caliph. After the abolition of the caliphate in the first half of the 20th century, Islamist theorists mainly focused on Islamic ideals, rather than providing empirical accounts of governing institutions (Akhavi, 2011). The classical Hindu tradition was another example of thought where ‘good’ government was related to the fulfillment of a sacred law: the individual’s spiritual sphere was seen as superior and government was understood as a necessary burden. An exception was Kautilya’s Arthashastra, a text of political realism from the 4th century bc (Dalton, 2011: 811–12).
Because of its focus on government as it is and not government as it ought to be, Weber ([1919] 1992: 75) compared Arthashastra to
Machiavelli’s later The Prince. The Prince’s publication (1513) can be considered a watershed for the passage from a normative to a realist study of executive power in Western thought. ‘The theme of the treatise is not guardianship or statesmanship, but the success of the individual prince in obtaining and retaining power.’ According to Machiavelli, these activities may well require behaving immorally (Keohane, 2014: 30). Only between the 17th and 18th centuries was executive power explicitly connected to the issue of constitutionalism. Authors such as Locke (1632–1704), Montesquieu (1689–1755) and Rousseau (1712–78) aimed at justifying the executive as a distinct power of the state and defining how it relates to other constitutional powers. Similarly, Hamilton (about 1757–1804) discussed the specific powers of the executive in the well-known Federalist Papers (Keohane, 2014). In 1848, Marx and Engels connected the notion of executive power to their materialist theory of history, by asserting that in modern states the executive is but a committee for managing the common affairs of the whole bourgeoisie.
Executive Power and Political Institutions
When political science as an academic discipline came into being at the end of the 19th century, old institutionalism focused on executive power, with a legalistic approach in terms of leaders’ formal powers in office (Helms, 2014: 196). Meanwhile, the main representatives of the Italian school of elitism (Pareto and Mosca) argued that politics is invariably about a minority who rules over a majority, irrespective of the political regime. These scholars tried to discover why executive power is always exerted by a relatively small number of individuals, citing factors such as superior personal qualities and oligarchical organizational principles (Blondel and Müller-Rommel, 2007: 820).
These contributions paved the way for the establishment of the modern empirical study of executive power in political science. Over the years, it has become conventional wisdom in the discipline that politics is characterized by the search for and the exercise of power. Since this power is disputed between political actors, we can simply assert that politics essentially refers to a ‘struggle for power’ (Weber [1919] 1922). A first systematic–empirical linkage between politics and power was introduced by the Chicago School of political science in the 1920s and 1930s. Its three main representatives (Catlin, Merriam and Lasswell) argued that power should be the key concept to understand politics. In this regard, Lasswell’s definition of politics as ‘who gets what, when, how’ is the most famous expression (Lasswell, 1936). In the early 1950s, Lasswell, together with Kaplan, distinguished between influence (control over valued material or immaterial resources) and power: power, the two argued, is an exercise of influence, which modifies others’ behavior through (potential) punishments or rewards. This phenomenon would denote the realm of politics (Lasswell and Kaplan, 1950). Similarly, de Jouvenel (1963) depicted the core of the political as a relation between an ‘instigation’ (to do something) and a corresponding (positive) ‘response’. Yet, other scholars have observed that it is not any form of power that denotes politics, but rather a very specific type. In his theory of social systems, Parsons (1969) stressed that only political power is crucial to understand politics. Easton (1953) criticized the Chicago School and proposed to connect the notion of political system precisely to the authoritative allocation of values. Finally, Sartori (1975: 132) rephrased Easton and argued that politics is about collectivized decisions, which are ‘(i) sovereign, (ii) without exit, and (iii) sanctionable’.
The theoretical viability of such definitions ultimately depends on how one defines the community that is affected by these
enforceable decisions. If one argues that this community is political since it is subjected to the political power itself, the conceptualization will appear tautological. If, instead, one does not define the community this way, one will conclude that other powers (for example, religious power) may also present the same characteristics. For these reasons, we assume that a sound definition of political power needs to address the very function of such power. Moreover, we argue that executive power can usually be equated to political power. Some authors have suggested that the function of political power differs from that of other types of social power (Stoppino, 2001; Poggi, 2014). Political power is a stable power, which integrates a form of authority and is valid for an entire social field. For those who are part of such a field, political power produces stable entitlements, whose enforcement is ultimately guaranteed by political authorities. The function of political power is thus to generate these entitlements through policies. For example, political power can provide public goods such as laws, physical protection, civil liberties and social rights. These theoretical arguments are particularly useful for understanding executive power across time and space. The function of political power (in other words, the governmental function) relies on specific authority positions, which endow rulers with the right to take binding decisions. In this sense, political executives are the central institutions that fulfill the functions of initiating, coordinating and implementing political decisions (Blondel, 2011: 866). These decisions are the outputs of the political process and are exchanged with political support from society (Easton, 1975). Executive institutions (the structural facet of the executive power) frame how power is produced and how it is exercised by political actors (the agential facet of the executive power).
Thus, the way in which executive power is channeled depends on the institutional organization of political executives.
Executive Power in Authoritarian and Democratic Regimes
Historically, executive power has always been concentrated. Before the birth of contemporary liberal democracies, executive power was wielded either by authoritarian governments or by relatively restricted and closed oligarchies in mutual competition. In particular, the ‘government by one’ was the rule, rather than the exception (Brooker, 2014). In the contemporary world, however, the emergence of new forms of authoritarian rule, hybrid regimes and varieties of democracies has made the picture of the extent of executive power more complex. The power of political executives differs extensively among the different regime types. In democratic regimes, for instance, the formal institutional setting defines and limits executive actors’ room for maneuver. The exercise of executive power in democracies derives from open competition and is based on stable expectations about the rules of the game (i.e. elections and constitutional provisions). In autocratic regimes, competition is often closed, not very permeable and ultimately based on the approval of the apical actors of the regime, such as the king or the leader to be succeeded. Power dynamics within the executive and between institutions are often uncertain and fickle (Stoppino, 2001: 368–71). These general differences are strictly connected to the very legitimation bases of executives in democracies and autocracies: legal–rational in the former and traditional/charismatic/ideological in the latter (Weber, 1921; Brooker, 2014).
Authoritarian Regimes
In spite of predictions about the ‘victory’ of liberal democracy over its alternatives after the end of the Cold War, authoritarian executives have remained numerous. Over the past decades, the third wave of democratization (Huntington, 1991) has been counterbalanced by reverse trends of autocratization (Mechkova
et al., 2017): between 1991 and 2001, the number of countries with an authoritarian government increased from 42 to 48 (Freedom House, 2001). From a more institutional perspective, the Polity IV project (Marshall et al., 2018) has estimated that in 2017 a total of 106 countries out of 165 (64%) were characterized by (more or less full-fledged) democratic systems, whereas 36% were under some sort of autocratic rule. Among non-democratic regimes, one can distinguish between competitive and non-competitive authoritarianisms (see Schlumberger and Schedler Chapter 42, this Handbook). Hybrid regimes (see Gagné and Mahé, Chapter 47, this Handbook) or competitive authoritarianisms can tend to either preserve the same institutional settings of democracies (or at least a façade) or reproduce characteristics of the other forms of non-democratic regimes. Executive power in non-competitive authoritarianism is basically channeled through two types of institutional settings: personal rule and organizational rule. The former implies that executive power is concentrated in the hands of ruling monarchs, military leaders or civilian dictators, whose power is hardly affected by institutional forms of checks and balances. Organizational rule refers to those cases where power is exercised by a collective organization such as the military or one ruling party (Brooker, 2014). Several sub-types exist; they differ based on how power is achieved, on the sources of legitimation and on the way of governing (Cheibub et al., 2010; Wahman et al., 2013; Geddes et al., 2014). We also find authoritarian regimes (particularly military dictatorships) in which the executive power is based on a combination of personal and organizational rule (mixed rule). Table 45.1 provides an overview of the different types of executive power in authoritarian regimes worldwide. 
According to the data, the most common way to organize executive power in contemporary authoritarian regimes has been civilian personal rule (54%), followed by party-based government (23%). Overall, the figures show
Table 45.1 Executive power in 55 authoritarian countries, 2010 (percentage of countries)
Personal rule
Monarchic (13): Jordan, Kuwait, Morocco, Oman, Saudi Arabia, Swaziland, United Arab Emir.
Military-personal: none listed
Civilian-personal (54): Afghanistan, Armenia, Azerbaijan, Belarus, Burkina Faso, Cameroon, Central African Rep., Chad, Congo, Cuba, Eritrea, Gabon, Gambia, Ivory Coast, Kazakhstan, Libya, Madagascar, Mauritania, North Korea, Russia, Sudan, Tajikistan, Togo, Turkmenistan, Uganda, Uzbekistan, Venezuela, Yemen
Organizational rule
Military (4): Algeria, Myanmar
One-party (23): Angola, Cambodia, China, Ethiopia, Laos, Mozambique, Namibia, Singapore, Tanzania, Tunisia, Vietnam, Zimbabwe
Party-military (2): Rwanda
Mixed rule
Party-personal-military (4): Egypt, Syria
Note: Iran is counted as an authoritarian regime sui generis.
Source: Geddes et al. (2014); Marshall et al. (2018), own elaboration. The dataset of Geddes et al. on the classification of authoritarian regimes provides information only until 2010. Countries are considered authoritarian if they scored below 6 on the democratic scale of the Polity IV dataset in 2010.
that executive power by the military is the exception: only in five countries (out of 52) was the military involved (10%). This number is even lower than that of countries with monarchic rule (13%). Thus, the data confirm the trend currently observed toward more civilian-oriented autocratic leadership where executive power is basically in the hands of one person.
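The shares quoted above can be checked against the country lists in Table 45.1. The following is a minimal sketch in Python, assuming the counts are read directly off the lists in the table (52 listed countries in total); the variable names are illustrative only.

```python
# Country counts per type of executive power, read off the lists in Table 45.1.
counts = {
    "monarchic": 7,
    "civilian-personal": 28,
    "military": 2,
    "one-party": 12,
    "party-military": 1,
    "party-personal-military": 2,
}

total = sum(counts.values())  # number of countries listed in the table

# Share of each type, rounded to whole percentages as in the column headings.
shares = {kind: round(100 * n / total) for kind, n in counts.items()}

# Types in which the military takes part in executive power.
military_types = ["military", "party-military", "party-personal-military"]
military_involved = sum(counts[kind] for kind in military_types)
military_share = round(100 * military_involved / total)

print(total, shares, military_involved, military_share)
```

Under these counts the rounded shares agree with the percentages in the column headings, and the five military-related cases amount to roughly 10% of the 52 listed countries.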
Democratic Regimes
Contrary to authoritarian regimes, democracies are based on the principle of inclusive political representation (see Schmitter, Chapter 43, this Handbook). To translate individual political demands into collective interests, all democracies have constitutionally established free elections. By voting for individual politicians or political parties, citizens (principals) delegate authority to executive political leaders (agents). These leaders in turn become accountable to their voters when it comes to implementing governmental policy (Strøm, 2003; Samuels and Shugart, 2010). The chain of democratic delegation and accountability exists in all parliamentary, presidential and semi-presidential democracies.
In parliamentary and presidential systems, the executive power derives from the relation between voters and the executive branch. Yet, the origin of executive authority differs in the two types of liberal democracies. In parliamentary democracies we find a ‘fused power system’ where voters elect a legislature, which in turn chooses (directly or indirectly) the cabinet, which consists of a prime minister and her ministers. The cabinet is (collectively) accountable to the (majority of the) legislature, which can withdraw its confidence from the executive. The head of state has a merely ceremonial role and can be either a monarch or a president. Clear-cut examples are Germany, Japan and the UK. On the other hand, presidential democracies are defined by a ‘separated power system’ where the executive cannot dissolve the legislature and is not accountable to it (Samuels and Shugart, 2010: 27). The presidential system is more likely to be conducive to gridlock between the president and the legislature, thus undermining presidential freedom of action in certain situations (Blondel, 2011: 867). Furthermore, in presidential regimes, the political executive is monocratic, represented by a president who selects her cabinet members. Both the president and the legislature are directly elected by voters for fixed terms and the legislature can remove the president only in exceptional cases. Thus, the president is not an agent of the legislative majority but of her voters. Presidential democracies exist primarily in the United States of America and in Latin America (Blondel, 2015). The executive power in semi-presidential systems differs from the one in parliamentary and presidential democracies. In semi-presidential systems, a popularly elected president coexists with a prime minister who is selected by the parliament and accountable to its majority.
Since a clear-cut definition of the separation of power within this dual executive remains vague, Shugart and Carey (1992) introduced the notion of premier–presidential and presidential–parliamentary sub-types of semi-presidential systems. In premier–presidential systems, the prime minister and
her cabinet are exclusively accountable to the parliamentary majority and not to the president, while in presidential–parliamentary systems the prime minister and her cabinet are accountable to the parliamentary majority and to the president. Table 45.2 provides an overview of the executive power structure in 103 democratic countries. The data show that parliamentary and presidential democracies exist in 60% of all countries under observation while semi-presidential systems are only present in 34% of all countries, with a clear majority of cases falling in the premier–presidential category (74%). Furthermore, the monarchical form of parliamentary systems is still in existence: 17 out of 30 parliamentary countries still have a monarch as (ceremonial) head of state. Finally, mixed executive power is only present in a few countries. How executive power is distributed and organized in both regime types has important implications for how executive institutions are internally structured. In this regard, democracies display more complexity than authoritarian regimes. In the following sections, we therefore focus only on the power dynamics of political executives in democratic countries. Moreover, we only discuss executive power in the two most straightforward examples of separation of powers (i.e. presidential) and fused power (i.e. parliamentary) systems. Semi-presidentialism is considered as a mixture of both types.
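The definitions above amount to a small decision rule over the origin and accountability of the executive. The following Python sketch makes that rule explicit under strong simplifying assumptions: the three boolean flags and the names `Executive` and `classify` are hypothetical and introduced only for illustration, and real cases (such as the directorial and other mixed systems in Table 45.2) need more information than these flags carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Executive:
    """Simplified, hypothetical description of an executive."""
    popularly_elected_president: bool         # president chosen by voters
    cabinet_accountable_to_legislature: bool  # legislature can withdraw confidence
    cabinet_accountable_to_president: bool    # president can dismiss the cabinet


def classify(e: Executive) -> str:
    """Map the three flags onto the regime types discussed in the text."""
    if not e.popularly_elected_president:
        # Fused power: the cabinet depends on the legislature alone.
        return "parliamentary" if e.cabinet_accountable_to_legislature else "other"
    if not e.cabinet_accountable_to_legislature:
        # Separated power: the executive does not depend on legislative confidence.
        return "presidential"
    # Dual executive: the sub-type depends on whether the president
    # can also hold the cabinet to account.
    if e.cabinet_accountable_to_president:
        return "president-parliamentary"
    return "premier-presidential"


print(classify(Executive(True, True, False)))
```

The order of the checks mirrors the text: the origin of executive authority is examined first, then accountability to the legislature, and only for dual executives the additional accountability to the president.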
DEMOCRATIC POLITICAL EXECUTIVES
Democratic political executives are complex institutions. Several institutional bodies, offices and individuals work together and relate to one another within the executives to make the whole machine work (King, 1975). Sometimes the relationship between the various individual and collective political actors is cooperative; sometimes it is conflictual.
Table 45.2 Executive power in 103 democratic countries, 2017 (percentage of countries)
Presidential (30): Argentina, Benin, Bolivia, Brazil, Chile, Colombia, Comoros, Costa Rica, Cyprus, Dominican Republic, Ecuador, El Salvador, Ghana, Guatemala, Guinea-Bissau, Guyana, Honduras, Indonesia, Kenya, Liberia, Malawi, Mexico, Nicaragua, Nigeria, Panama, Paraguay, Philippines, Sierra Leone, South Korea, United States, Uruguay
Semi-presidential (34)
Premier–presidential (25): Bulgaria, Cape Verde, Central African Rep., Croatia, Czech Republic, Finland, France, Georgia, Haiti, Ireland, Kyrgyzstan, Lithuania, Macedonia, Mali, Moldova, Mongolia, Montenegro, Niger, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Timor-Leste, [Tunisia]
President–parliamentary (9): Austria, [Burkina Faso], [Madagascar], [Mozambique], [Namibia], Peru, Senegal, Sri Lanka, Taiwan
Parliamentary (30): Albania (R), Australia (M), Belgium (M), Bhutan (M), Canada (M), Denmark (M), Estonia (R), Germany (R), Greece (R), Hungary (R), India (R), Israel (R), Italy (R), Jamaica (M), Japan (M), Latvia (R), Lebanon (R), Lesotho (M), Luxembourg (M), Malaysia (M), Mauritius (R), Nepal (R), Netherlands (M), New Zealand (M), Norway (M), Pakistan (R), Solomon Islands (M), Spain (M), Sweden (M), Trinidad & Tobago (R), United Kingdom (M)
Mixed rule (6)
Directorial (2): Suriname, Switzerland
Others (4): Botswana, [Myanmar], South Africa, Zambia
Note: (M) means monarchy; (R) means republic.
Source: Elgie (2018); Marshall et al. (2018), own elaboration. Countries are considered democratic if they score from 6 to 10 on the democratic scale of the Polity IV dataset in 2017. Some countries were authoritarian in 2010 (see Table 45.1). These countries are in square brackets.
A presidential government provides only a small amount of variation in terms of how the executive power is wielded (Müller, 2017). Presidential governments are openly hierarchical. The president, who is directly elected by the voters, is the most powerful figure of the executive. She selects the members of the executive based on her will and ministers are subordinated and responsible to her (Blondel, 2004: 285). A key characteristic of many presidential systems is that executive members are loosely connected to
one another: ‘the president may not have … a close relationship with at least a number of them, although some of the positions [… are] filled by the president with those … who are rewarded for their help, in particular during electoral campaign’ (Blondel, 2011: 864). Thus, presidential executives invariably function according to a model of government in which ‘the president is sovereign’ within the executive. Presidential political power is predominantly based on a list of formal constitutional prerogatives. Since the formal
constitutional power of presidents varies across countries, the executive power of presidents differs between countries (see below). Latin American presidents are, for example, constitutionally stronger and therefore more powerful than the president of the United States (Mainwaring and Shugart, 1997). The formal and informal organization of executives in the parliamentary government is fundamentally different to that in presidential systems. As stressed by Blondel (2004: 285), in parliamentary – or cabinet – systems, the chief executive (i.e. the prime minister) is formally embedded in a collegial context, meaning that she is nothing more than a primus inter pares. Ministers are also supposed to participate together in the decision-making process. From a principal–agent perspective, they are both agents of the cabinet and principals of their own ministry (Andeweg, 2000). Irrespective of countries’ idiosyncrasies, the highest echelon of the executive always comprises the prime minister and a number of heads of department, who form the cabinet. Junior ministers too are usually appointed, but they are hierarchically below the minister of their sector of competence (Barbieri and Vercesi, 2013). The principle of ‘collective responsibility’ binds cabinet members, by stating that all of them have to adapt to cabinet decisions. If these egalitarian principles are valid on paper, this does not normally apply in reality: they ‘are markedly eroded … in nearly all the countries [… and often] the cabinet ratifies decisions … de facto delegated to individual ministers … groups of ministers sitting in committee … or to the prime minister and some of the ministers’ (Blondel, 2004: 286). The comparative literature on cabinet decision-making in parliamentary systems has introduced a few types of government, which account for these variations (Vercesi, 2020). Cabinet government illustrates, for instance, the ideal-type of egalitarian and collective executive power as described above. 
In ministerial government, on the other hand, ministers have the power to decide over
policies within their own jurisdiction, without colleagues’ interference. Other types of government highlight issues such as the fragmentation of the decision-making process, the hierarchical nature of intra-cabinet dynamics or the impact of bureaucracy over the cabinet (Elgie, 1997). Perhaps the most controversially discussed model is the prime ministerial government, where the prime minister exercises a sort of monocratic power within cabinet by setting the agenda and controlling ministers’ actions (Rhodes, 1995; Strangio et al., 2013). The idea of an increasing executive power of prime ministers is also included in the concept of the ‘presidentialization of politics’ set out by Poguntke and Webb (2005). According to them, the power of prime ministers in parliamentary democracies has increased in three political arenas: in the prime ministerial office (executive face); in relation to their own party (party face); and in the direct impact of prime ministers on electoral campaigns (electoral face). In addition, Rhodes (2008: 328) claims that the central role of prime ministers in cabinet varies and depends on circumstances and policy areas. Prime ministers can be even stronger than presidents, if they are able to control their own party and a parliamentary majority.
Executive Power, Political Parties and Legislative Support
In modern democracies, parties play a crucial role in the recruitment, support and management of executives. In this context, Müller (2017) suggested that the political capacity of democratic executives and their way of functioning are deeply affected by their autonomy vis-à-vis parties as well as the partisan support they enjoy in the legislature. A first party-related aspect which has an enormous impact on executive power is the division between unified and divided government. This distinction typically applies to
presidential (and semi-presidential) systems, although Elgie (2001) has argued that this can be applied also to parliamentary systems with two parliamentary chambers. A divided government occurs when, in one (or both) of the two legislatures’ branches, the partisan majority differs from the partisan orientation of the elected president. Especially when the chief executive does not hold strong constitutional powers to force members of parliament to pass legislation (or to block it), a president’s chances of getting her policy decisions into force decrease substantially. To overcome this problem, presidents can employ a range of strategies (from consensual to more conflictual) to cope with legislatures (Cox and Morgenstern, 2002). A second party-related impact on executive power is linked to the form of party government in parliamentary democracies. In the literature, we broadly differentiate between majority and minority governments. In all democratic polities, political executives need the support of a majority in parliament in order to successfully implement their policy proposals. This is particularly true for parliamentary and semi-presidential systems, but enjoying a parliamentary party majority is also relevant for presidents who want the legislature to translate their will into laws. Müller (2017: 145–6) argued that a party’s majority status in parliament increases the political power of chief executives tremendously. However, he also stated that minority governments do not necessarily have a negative effect on the power of the executives. Although majority governments tend to last longer and thereby might have long-term political power over policy decisions, minority governments are the ‘policy viable’ outcome in those situations where a ‘core’ party is important enough – either in terms of parliamentary seats or ideological position – to be included in all possible coalition alternatives (Laver and Schofield, 1990; Laver and Shepsle, 1996). 
Moreover, in some countries, particularly in Scandinavia, the political executive does not need a positive vote of
investiture to enter office. Rather, it survives as long as a majority does not vote against. This rule favors minority governments when parties in opposition do not reach a common agreement to find an alternative government (Bergman, 1993). Finally, minority cabinets are more likely when parties can obtain policy concessions from outside the government and when elections are decisive in determining the winner, especially when cabinet participation produces a loss of votes in the following elections (Strøm, 1990). A further (albeit intertwined) party-related impact on executive power is based on the distinction between single-party and coalition governments in parliamentary systems. Party coalitions substantially circumscribe the freedom of political activities among members of the political executive (Blondel and Müller-Rommel, 1993). For example, a coalition government limits the power of the prime minister to control the ministers and may thereby lead to oligarchical arrangements. In single-party majority cabinets, it is easier to achieve the goals of the political executive, because the only party in government controls the majority in the parliament.2 This scenario is well known in the UK, where the prime minister is the chief of the executive and at the same time the party leader. In coalition governments, however, policy decisions are the result of compromises about different policy views and goals between political parties on the one hand, and chief executives on the other. In this situation, executives face severe challenges even to their stability, in particular when the coalition parties are programmatically heterogeneous. In order to avoid governmental destabilization, coalition parties mostly employ a set of mechanisms to control one another and to make the decision-making smoother (Bergman et al., 2013). 
Within the political executive, coalition parties can for instance agree on appointing junior ministers who come from a different party than the minister’s. These junior ministers serve as ‘watchdogs’ for senior ministers, because they are
screening the policy decision-making process in single ministries (Verzichelli, 2008). Potential conflicts between coalition parties can also be reduced by providing jointly formulated coalition agreements, which ‘guide’ policy decisions over the whole legislative term (Andeweg and Timmermans, 2008).
Executive Power and the Issue of Gender Representation
In a study on female government leaders around the world, Jalalzai (2013) found a general underrepresentation of women among executive rulers. Although in some countries women seem to have (nearly) broken the glass ceiling of representation in
national ministerial posts (Escobar-Lemmon and Taylor-Robinson, 2009; Annesley, 2015), the number of female presidents and prime ministers around the world has increased at a much slower pace. In late 2017, only 21 women were top leaders of the political executives. This corresponds to 11% of the available chief executive posts in the 194 national states around the world (see Table 45.3). Table 45.3 shows a striking variation of female representation in chief executive positions across regime types, party system types and geographical areas. Among the 21 female heads of government, only one person holds a prime ministerial post in a non-democratic state. All others are chiefs of democratic governments. The majority of them are elected in parliamentary and semi-presidential regimes where political parties are important actors in
Table 45.3 Women executives in office on December 31, 2017 by country, office and regime
Country | Leader | Office | Regime (sub-)type
Bangladesh | Hasina Wazed | Prime minister | Non-democracy
Chile | Bachelet | President | Presidential
Croatia | Grabar-Kitarović | President | Semi-presidential
Estonia | Kaljulaid | President | Parliamentary
Germany | Merkel | Prime minister | Parliamentary
Iceland | Jakobsdóttir | Prime minister | Semi-presidential
Liberia | Johnson Sirleaf | President | Presidential
Lithuania | Grybauskaitė | President | Semi-presidential
Malta | Coleiro Preca | President | Parliamentary
Marshall Islands | Heine | President | Democratic-other
Mauritius | Gurib | President | Parliamentary
Myanmar | Aung San Suu Kyi | Prime minister | Democratic-other
Namibia | Kuugongelwa | Prime minister | Semi-presidential
Nepal | Bhandari | President | Parliamentary
New Zealand | Ardern | Prime minister | Parliamentary
Norway | Solberg | Prime minister | Parliamentary
Peru | Aráoz | Prime minister | Semi-presidential
Serbia | Brnabić | Prime minister | Semi-presidential
Singapore | Yacob | President | Non-democracy
Switzerland | Leuthard | President | Directorial
United Kingdom | May | Prime minister | Parliamentary
Note: Territories under the formal rule of a third country or part of the Commonwealth are excluded. The same applies to royal heads of state.
Sources: See Table 45.2; Jalalzai (2018: 263), own update based on Worldwide Guide to Women in Leadership, https://guide2womenleaders.com/ (accessed on November 29, 2018).
daily politics. Only two women were chief executives in democratic presidential systems. Furthermore, we find ten female heads of government in European countries, four in Asia, three in Africa, two in Latin America and two in the Pacific. Finally, most women are chief executives in small states where the selection and recruitment processes to top political offices are less complex. In a nutshell, these empirical findings indicate, first, that women have to struggle more to reach chief executive positions. Second, democratic regimes foster the selection of women into chief executive offices. Third, parliamentary and semi-presidential regimes with strong multi-party systems help women to reach chief executive positions. Fourth, women have higher chances of entering executive office in countries that have undergone a democratic transition. Fifth, small countries provide greater access to chief executive positions for women than do large states. Sixth, the majority of the female chief executives are located in Europe, which indicates that the level of political empowerment of women is higher in this region of the world.
Measuring (Chief) Executive Power
The measurement of executive power is difficult to tackle. If we define executive power as policy-making power, then the political power in democratic countries lies in the hands of the government. It is certainly true that members of parliament in presidential and parliamentary systems may exercise some influence on policy-making. However, in reality, central political decisions are taken primarily by the chief executives rather than by the legislative chambers. This certainly holds true for presidential systems but increasingly also for parliamentary systems (Poguntke and Webb, 2005). Consequently, one – most prominent – way to examine executive power is to measure the
political strength of chief executives in liberal democratic systems. In this context, several diverse indexes on the political power of presidents and prime ministers have been proposed (Doyle, 2020). The power of chief executives is usually measured by considering the formal constitutional prerogatives for presidents and prime ministers. Informal aspects of executive power have not been studied systematically because of formidable problems in their conceptualization and operationalization. First, it seems unclear what the focus of analysis should be. Are we looking for informal executive power in the definition of policy areas, in the decision-making process or in the interaction among political actors? Second, even if one of these research subjects is specified, there are still serious problems in getting reliable information about informal decision-making processes among political executives, since most decisions are taken behind closed doors. Third, the few sources of information that are available on informal executive power structures are usually eclectic and difficult to quantify, particularly under a cross-national perspective. In the following, we therefore only introduce the existing measures for the formal political power of chief executives (i.e. presidents and prime ministers). Studies on presidentialism agree that the variety and degree of presidential power is defined differently in each country’s constitution. Shugart and Carey (1992) were the first to measure presidential power based on a cross-national examination of these written documents. They identified two dimensions of presidential power: legislative and non-legislative. Legislative power is conceptualized as the president’s power to veto legislation, to make new laws and suspend old ones, to exclusively introduce bills, to initiate the annual budget bill and to propose referenda. 
Non-legislative power is defined as the president’s power over cabinet formation, cabinet dismissal, the selection and de-selection of single ministers and the dissolution of
the assembly. The authors placed each item on a scale and summed the item scores into a measure of presidential power on both dimensions in 35 countries. As a result, the authors identified world regions with strong presidential power, regions where presidents hold great legislative power and regions with low presidential power (Shugart and Carey, 1992: 156). A few years later, Frye (1997) extended the checklist of power items to 27. One major disadvantage of this measurement is, however, that ‘it does not capture the dual authority structure of semi presidentialism’ (Metcalf, 2000: 667). Therefore, Metcalf suggested minor revisions of the existing checklists in order to apply the method to semi-presidential systems. Over the past two decades, major comparative studies have applied the ‘constitutional approach’ to operationalize the executive power of presidents. Measuring the power of prime ministers in parliamentary systems is more difficult because constitutions of parliamentary systems vary substantially in their definition of prime ministers’ powers. Furthermore, the power of prime ministers is not only dependent on constitutional prerogatives but also (and more) on their interaction with cabinet members and political parties. In a first systematic comparative assessment, Bergman et al. (2003) classified the power of prime ministers in a two-dimensional space that consists of institutional powers and power that derives from party system characteristics. The institutional power dimension is defined by nine items, most of which are related to the formal and informal behavior of prime ministers in cabinet. The party system dimension reflects the type of cabinet that exists in a given country (single-party cabinet; ‘bloc’ coalition cabinets; coalition cabinets in pivotal party systems).
The authors applied the items of both dimensions to 17 countries and found that British and Spanish prime ministers are comparatively powerful, while prime ministers in Iceland, the Netherlands and Norway are the weakest chief executives in Europe.
A second quantitative study on prime ministerial power focuses on survey data rather than on ‘objective’ hard evidence. O’Malley (2007) asked 249 experts in 20 democracies to rate each prime minister on a nine-point scale about their influence over the policy outputs of the government. The findings confirm that prime ministerial power tends to be higher in countries with single-member plurality electoral systems. Furthermore, in countries with fragmented party systems and proportional electoral laws, prime ministers are less powerful.
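The constitutional checklist approach described above is additive: each prerogative is scored and the scores are summed within each dimension. The following is a minimal Python sketch of that logic. The 0 to 4 scale per item, the item labels (paraphrased from the powers listed in the text) and the scores for an imaginary constitution are all assumptions made for illustration; they are not the published coding scheme or data.

```python
from typing import Mapping, Sequence, Tuple

# Item labels paraphrase the powers named in the text (illustrative only).
LEGISLATIVE_ITEMS = (
    "veto", "decree", "exclusive_bill_introduction",
    "budget_initiative", "referendum_proposal",
)
NON_LEGISLATIVE_ITEMS = (
    "cabinet_formation", "cabinet_dismissal",
    "minister_selection", "assembly_dissolution",
)
MAX_SCORE = 4  # assumed upper bound of the per-item scale


def dimension_score(scores: Mapping[str, int], items: Sequence[str]) -> int:
    """Sum the item scores of one dimension; a missing item counts as 0."""
    total = 0
    for item in items:
        value = scores.get(item, 0)
        if not 0 <= value <= MAX_SCORE:
            raise ValueError(f"{item}: score {value} outside 0..{MAX_SCORE}")
        total += value
    return total


def presidential_power(scores: Mapping[str, int]) -> Tuple[int, int]:
    """Return (legislative, non-legislative) power as two additive indices."""
    return (
        dimension_score(scores, LEGISLATIVE_ITEMS),
        dimension_score(scores, NON_LEGISLATIVE_ITEMS),
    )


# Hypothetical scores for an imaginary constitution.
example = {
    "veto": 2, "decree": 1, "budget_initiative": 3,
    "cabinet_formation": 4, "cabinet_dismissal": 4,
}
legislative, non_legislative = presidential_power(example)
print(legislative, non_legislative)
```

Keeping the two dimensions separate rather than collapsing them into one number preserves the distinction drawn in the text between presidents who are strong overall and presidents whose strength lies mainly in legislative powers.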
Conclusion and Research Outlooks
Executive power has been and will continue to be a prominent and widely used concept in political science. In most studies, executive power has been equated with political power, which – by its very nature – can be associated with the functioning of executive institutions in political regimes. The concept of executive power was easily applicable to countries under authoritarian rule, where political power is usually in the hands of one person. Its validity became markedly more complex in democratic societies, where political power is dispersed among many political actors. This is probably why studies on executive power in democratic regimes have been more numerous than in authoritarian regimes. The main challenges for future research in this field consist first in examining the effect of different forms of executive power on government performance, and second in the collection of more systematic empirical data on the different forms of executive power. The effect of executive power on governance varies, for instance, not only by the formal power of presidents and prime ministers (as described above), but also by their (rational) behavior within institutions. Future research therefore needs to examine in greater detail the individual behavior of chief executives
in decision-making processes. A behavioral measurement of executive power is surely not as objective as a formal analysis of constitutional rules, but it certainly reflects more accurately what Siaroff (2003: 303) has called the ‘actual political practice’. Furthermore, the impact of executive power on government performance depends strongly on the personality traits and the leadership styles of single presidents and prime ministers. Thus, future research on executive power has to compose more sophisticated theoretical assumptions and empirical measurements that investigate the effect of personality traits and leadership styles on the quality of governance. Future studies could, for instance, follow up the classical works on presidential personalities in the United States (Barber, 2009) or on prime ministers in Europe (King, 1994) and examine under which conditions different personality and leadership styles lead to different governmental performance. Finally, the discipline needs more comprehensive comparative data on various forms of executive power in different political regimes. So far, the literature on executive power has been characterized by a paucity of data outside Western democratic countries and Latin America. The collection of more information and data on other parts of the world, such as Asia and Africa, is a necessary condition for more global-oriented comparisons of the modes of wielding executive power and the consequences that they have on political outputs. It would be particularly useful to identify measures that can bridge the concept of executive power in parliamentary and presidential systems. The more scholars agree on a universal concept and on the measurement of executive politics, the greater will be our ability to compare and assess the practices of power in politics. In sum, the use of the concept of executive power can help to understand the functioning of political life worldwide. 
However, one should note that, at present, the concept of executive power remains very vague in terms of definition, empirical measurement and impact on governmental policy. Therefore, executive power should be treated as a flexible tool, taking into account the immense complexities of the political power relationships between political actors in different political regimes.
Notes 1 We exclude from our analysis both sub-national governments and supranational political organizations, such as the European Union. 2 This argument assumes that the party is united and not weakened by endemic factional internal conflict.
References Acemoglu, D. and J. A. Robinson (2012). Why Nations Fail: The Origins of Power, Prosperity, and Poverty. New York: Crown Business. Akhavi, S. (2011). The Muslim Tradition of Political Philosophy. In G. Klosko (ed.), The Oxford Handbook of Political Philosophy (pp. 789–802). Oxford: Oxford University Press. Andeweg, R. B. (2000). Ministers as Double Agents? The Delegation Process Between Cabinet and Ministers. European Journal of Political Research 37(3): 377–95. Andeweg, R. B. and A. Timmermans (2008). Conflict Management in Coalition Government. In K. Strøm, W. C. Müller and T. Bergman (eds), Cabinets and Coalition Bargaining: The Democratic Life Cycle in Western Europe (pp. 269–300). Oxford: Oxford University Press. Annesley, C. (2015). Rules of Ministerial Recruitment. Politics & Gender 11(4): 618–42. Barber, J. D. (2009). The Presidential Character. Predicting Performance in the White House. 4th ed. Abingdon: Routledge. Barbieri, C. and M. Vercesi (2013). The Cabinet: A Viable Definition in View of a Comparative Analysis. Government and Opposition 48(4): 526–47. Bergman, T. (1993). Formation Rules and Minority Governments. European Journal of Political Research 23(1): 55–66.
Bergman, T., W. C. Müller, K. Strøm and M. Blomgren (2003). Democratic Delegation and Accountability: Cross-National Patterns. In K. Strøm, W. C. Müller and T. Bergman (eds), Delegation and Accountability in Parliamentary Democracies (pp. 109–220). Oxford: Oxford University Press. Bergman, T., A. Ecker and W. C. Müller (2013). How Parties Govern: Political Parties and the Internal Organization of Government. In W. C. Müller and H. M. Narud (eds), Party Governance and Party Democracy (pp. 33–50). New York: Springer. Blondel, J. (2004). Executives. In M. Hawkesworth and M. Kogan (eds), Encyclopedia of Government and Politics. 2nd ed., Vol. 1 (pp. 283–93). London: Routledge. Blondel, J. (2011). Executive. In B. Badie, D. Berg-Schlosser and L. Morlino (eds), International Encyclopedia of Political Science. Vol. 3 (pp. 863–8). Los Angeles: Sage. Blondel, J. (2015). The Presidential Republic. Basingstoke: Palgrave Macmillan. Blondel, J. and F. Müller-Rommel (eds) (1993). Governing Together: The Extent and Limits of Joint Decision-Making in Western European Cabinets. New York: St. Martin’s Press. Blondel, J. and F. Müller-Rommel (2007). Political Elites. In R. J. Dalton and H.-D. Klingemann (eds), The Oxford Handbook of Political Behavior (pp. 818–32). Oxford: Oxford University Press. Brooker, P. (2014). Non-Democratic Regimes. 3rd ed. Basingstoke: Palgrave Macmillan. Cheibub, J. A., J. Gandhi and J. R. Vreeland (2010). Democracy and Dictatorship Revisited. Public Choice 143(1): 67–101. Cox, G. and S. Morgenstern (2002). Epilogue: Latin America’s Reactive Assemblies and Proactive Presidents. In S. Morgenstern and B. Nacif (eds), Legislative Politics in Latin America (pp. 446–68). Cambridge: Cambridge University Press. Dalton, D. (2011). Hindu Political Philosophy. In G. Klosko (ed.), The Oxford Handbook of Political Philosophy (pp. 803–20). Oxford: Oxford University Press. de Jouvenel, B. (1963). The Pure Theory of Politics. Cambridge: Cambridge University Press. Doyle, D.
(2020). Measuring Presidential and Prime Ministerial Power. In R. B. Andeweg,
R. Elgie, L. Helms, J. Kaarbo and F. Müller-Rommel (eds), The Oxford Handbook of Political Executives. Oxford: Oxford University Press (forthcoming). Easton, D. (1953). The Political System: An Inquiry into the State of Political Science. New York: Knopf. Easton, D. (1975). A Re-Assessment of the Concept of Political Support. British Journal of Political Science 5(4): 435–57. Escobar-Lemmon, M. and M. M. Taylor-Robinson (2009). Getting to the Top: Career Paths of Women in Latin American Cabinets. Political Research Quarterly 62(4): 685–99. Elgie, R. (1997). Models of Executive Politics: A Framework for the Study of Executive Power in Parliamentary and Semi-Presidential Regimes. Political Studies 45(2): 217–31. Elgie, R. (ed.) (2001). Divided Government in Comparative Perspective. Oxford: Oxford University Press. Elgie, R. (2018). The Semi-Presidential One: http://www.semipresidentialism.com/ (accessed on November 27, 2018). Finer, S. (1997). The History of Government from the Earliest Times. Volume I: Ancient Monarchies and Empires. Oxford: Oxford University Press. Freedom House (2001). Freedom in the World 2001. Annual Report. Frye, T. (1997). A Politics of Institutional Choice: Post-Communist Presidencies. Comparative Political Studies 30(5): 523–52. Geddes, B., J. Wright and E. Frantz (2014). Autocratic Breakdown and Regime Transitions: A New Data Set. Perspectives on Politics 12(2): 313–31. Helms, L. (2014). Institutional Analysis. In R. A. W. Rhodes and P. ‘t Hart (eds), The Oxford Handbook of Political Leadership (pp. 195–209). Oxford: Oxford University Press. Huntington, S. (1991). The Third Wave: Democratization in the Late Twentieth Century. Norman: University of Oklahoma Press. Jalalzai, F. (2013). Shattered, Cracked, or Firmly Intact? Women and the Executive Glass Ceiling Worldwide. Oxford: Oxford University Press. Jalalzai, F. (2018). Women Heads of State and Government. In A. C. Alexander, C. Bolzendahl and F. Jalalzai (eds), Measuring
Women’s Political Empowerment across the Globe: Strategies, Challenges and Future Research (pp. 257–81). Basingstoke: Palgrave Macmillan. Keohane, N. O. (2014). Western Political Thought. In R. A. W. Rhodes and P. ‘t Hart (eds), The Oxford Handbook of Political Leadership (pp. 25–40). Oxford: Oxford University Press. King, A. (1975). Executives. In F. I. Greenstein and N. Polsby (eds), Handbook of Political Science: Governmental Institutions and Processes. Vol. 5 (pp. 173–255). Reading, MA: Addison-Wesley. King, A. (1994). ‘Chief Executives’ in Western Europe. In I. Budge and D. McKay (eds), Developing Democracy: Comparative Research in Honour of J. F. P. Blondel (pp. 150–63). London: Sage. Lasswell, H. D. (1936). Politics: Who Gets What, When, How. New York: McGraw-Hill. Lasswell, H. D. and A. Kaplan (1950). Power and Society: A Framework for Political Inquiry. New Haven: Yale University Press. Laver, M. and N. Schofield (1990). Multiparty Government: The Politics of Coalition in Europe. Oxford: Oxford University Press. Laver, M. and K. A. Shepsle (1996). Making and Breaking Governments: Cabinets and Legislatures in Parliamentary Democracies. Cambridge: Cambridge University Press. Lunn-Rockliffe, S. (2011). Early Christian Political Philosophy. In G. Klosko (ed.), The Oxford Handbook of Political Philosophy (pp. 142–55). Oxford: Oxford University Press. Mainwaring, S. and M. S. Shugart (eds) (1997). Presidentialism and Democracy in Latin America. Cambridge: Cambridge University Press. Marshall, M. G., T. R. Gurr and K. Jaggers (2018). POLITY IV Project: Political Regime Characteristics and Transitions, 1800–2018. Dataset and User’s Manual. Vienna, VA: Center for Systemic Peace. Mechkova, V., A. Lührmann and S. I. Lindberg (2017). How Much Democratic Backsliding? Journal of Democracy 28(4): 162–9. Metcalf, L. K. (2000). Measuring Presidential Power. Comparative Political Studies 33(5): 660–85. Müller, W. C. (2017). Governments and Bureaucracies. In D. 
Caramani (ed.),
Comparative Politics. 4th ed. (pp. 136–54). Oxford: Oxford University Press. O’Malley, E. (2007). The Power of Prime Ministers: Results of an Expert Survey. International Political Science Review 28(1): 7–27. Parsons, T. (1969). Politics and Social Structure. New York: The Free Press. Poggi, G. (2014). Varieties of Political Experience: Power Phenomena in Modern Society. Colchester: ECPR Press. Poguntke, T. and P. Webb (eds) (2005). The Presidentialization of Politics: A Comparative Study of Modern Democracies. Oxford: Oxford University Press. Rhodes, R. A. W. (1995). From Prime Ministerial Power to Core Executive. In R. A. W. Rhodes and P. Dunleavy (eds), Prime Ministers, Cabinet and Core Executive (pp. 11–37). New York: St. Martin’s Press. Rhodes, R. A. W. (2008). Executives in Parliamentary Government. In R. A. W. Rhodes, S. A. Binder and B. A. Rockman (eds), The Oxford Handbook of Political Institutions (pp. 323–42). Oxford: Oxford University Press. Samuels, D. and M. S. Shugart (2010). Presidents, Parties, and Prime Ministers: How the Separation of Powers Affects Party Organization and Behavior. Cambridge: Cambridge University Press. Sartori, G. (1975). Will Democracy Kill Democracy? Decision-Making by Majorities and by Committees. Government and Opposition 10(2): 131–58. Shugart, M. S. and J. M. Carey (1992). Presidents and Assemblies: Constitutional Design and Electoral Dynamics. Cambridge: Cambridge University Press. Siaroff, A. (2003). Comparative Presidencies: The Inadequacy of the Presidential, Semipresidential and Parliamentary Distinction. European Journal of Political Research 42(3): 287–312. Stoppino, M. (2001). Potere e teoria politica. Terza edizione riveduta e accresciuta. Milano: Giuffrè. Strangio, P., P. ‘t Hart and J. Walter (eds) (2013). Understanding Prime Ministerial Performance: Comparative Perspectives. Oxford: Oxford University Press. Strøm, K. (1990). Minority Government and Majority Rule. Cambridge: Cambridge University Press.
Strøm, K. (2003). Parliamentary Democracy and Delegation. In K. Strøm, W. C. Müller and T. Bergman (eds), Delegation and Accountability in Parliamentary Democracies (pp. 55–106). Oxford: Oxford University Press. Vercesi, M. (2020). Cabinet Decision-Making in Parliamentary Systems. In R. B. Andeweg, R. Elgie, L. Helms, J. Kaarbo and F. Müller-Rommel (eds), The Oxford Handbook of Political Executives. Oxford: Oxford University Press (forthcoming). Verzichelli, L. (2008). Portfolio Allocation. In K. Strøm, W. C. Müller and T. Bergman (eds), Cabinets and Coalition Bargaining: The
Democratic Life Cycle in Western Europe (pp. 237–67). Oxford: Oxford University Press. Wahman, M., J. Teorell and A. Hadenius (2013). Authoritarian Regime Types Revisited: Updated Data in Comparative Perspective. Contemporary Politics 19(1): 19–34. Weber, M. ([1919] 1992). Politik als Beruf. Stuttgart: Reclam. Weber, M. (1921). Wirtschaft und Gesellschaft: Grundriss der verstehenden Soziologie. Tübingen: J. C. B. Mohr. Wong, D. (2011). Confucian Political Philosophy. In G. Klosko (ed.), The Oxford Handbook of Political Philosophy (pp. 771–88). Oxford: Oxford University Press.
46 Federalisms¹ Surinder Kler Shukla
Meaning of Federalism Under a federal system of government, each of the self-governing units is answerable to a central authority. In this way, the whole country is unified under a common constitution or other legal system, despite being separated into different regions. A written constitution divides the governing powers between the provincial or state governments and the central government. In other words, smaller entities (such as provinces or states) unite and forgo some of their powers in favor of the national government. One example of a federal system of government is that of the United States. Here, the country is divided into several different states, each of which has (to some extent) its own laws and self-governing ability. However, all the states are part of the same country and governed by the central authority in Washington, D.C.
The Constitution of India, like those of many other diverse countries with different cultures and histories, also provides for a federal system. In India, powers are distributed between the central government and the state governments. At present, the importance of federal structures is increasing, as they can accommodate differences of size, language, culture and socio-economic development within existing boundaries. Social identities based on such differences are often mobilized to obtain greater regional autonomy or even secession, as at present in Spain (Catalonia) or the United Kingdom (Scotland and Wales). Among the special rights claimed by regional identities are: a) special representation rights, devolution and national self-determination; b) accommodation of cultural and educational rights.
History of Federalism It was only in modern times, after Montesquieu’s The Spirit of Laws (1748), that a theory of separation of powers and the conceptualization of a federative republic were developed. Federalism in the modern sense then took form in the American experience of federalism as conceived in the Federalist Papers published by Alexander Hamilton, John Jay and James Madison under the name of ‘Publius’ (1787–8). It contrasts with a unitary government, in which a central authority holds the power, and a loose confederation, in which individual states retain much of their sovereignty. In the nineteenth century, the French theorist Pierre-Joseph Proudhon used the concept of federalism to question the legitimacy of the French Jacobin State. Instead of a centralized state, he argued for a federalism that promoted new modern principles such as autonomy, subsidiarity, cooperation and participation, in partial accordance with social Christianity and revolutionary unionism (Proudhon, 1863). Contemporary federalism is based on certain principles: the principle of partnership in place of central domination, the principle of union in place of separation, and the principle of concrete guarantees in place of vague assurances of future good conduct. The main pathways of federalisms are those seen in the United States, Canada, the UK, Switzerland, Germany and India. A new federalism has been progressively conceived as a conflict management strategy and was imposed by international actors in Bosnia and Herzegovina by the Dayton agreements in December 1995, and in Iraq by the US military occupation. It was also used as a state-building device when the decolonization process had to face an untraceable national background (in Nigeria, Comoros) or even as an attempt to overcome the Western model of the nation-state (e.g. the Mali Federation briefly including French
Sudan and Senegal as a first step to Pan-Africanism). All these constructions turned out to be fragile or even ephemeral. The concept has also been used in international relations to refer to regional integration processes (‘European federalism’, which tries to promote a ‘strong integration’ and goes beyond a simple association of states) or to international integration, describing the utopia of a ‘world government’. This has given a new meaning and an international relations dimension to the study of federalism. Moreover, in some countries like India the traditional concept of federalism has been colored by local nuances in contrast to Western claims of being universal (Chatterjee, 1994: 285).
Global/Regional Differentiation In the 20th century federalism was integrationist and accommodative in nature, such as in Indian federalism. India adopted a federal structure, without using the term in its constitution, as both an integration strategy and an accommodative strategy in a diverse multicultural society. The success of India’s federal structure has encouraged the adoption of this system in Afro-Asian states since the early 1960s.
United States: Modern Federalism as a Form of Cooperative Federalism The United States today is composed of 50 self-governing states and several territories. Since the founding of the country, and particularly since the end of the American Civil War, power has shifted away from the states and toward the national government. Although a cooperative form of federalism has its roots in the Civil War, it was the Great
Depression that marked an abrupt end to dual federalism and a dramatic shift toward a strong national government. President Franklin D. Roosevelt’s New Deal policies reached into the lives of US citizens like no other federal measure. The national government was forced to cooperate with all levels of government to implement the New Deal policies; local government earned equal standing with the other layers, as the federal government relied on political machines at the city level to bypass state legislatures. The formerly distinct division of responsibilities between state and national government had been described as a ‘layer cake’, but with the lines of duty blurred, cooperative federalism was likened to a ‘marble cake’ or a ‘picket fence’. In cooperative federalism, federal funds are distributed through grants-in-aid or categorical grants, which give the federal government more control over the use of the money (Lowi, 1979). A ‘new federalism’, characterized by a gradual return of power to the states, was initiated by President Ronald Reagan (1981–9) with his ‘devolution revolution’ in the early 1980s. Reagan’s administration introduced the practice of giving block grants, freeing state governments to spend the money at their own discretion.
Canada: A Reluctant Confederation Canada is a federation with 11 jurisdictions of governmental authority: the countrywide federal Crown and ten provincial Crowns. (Three territorial governments in the far north exercise powers delegated by the federal parliament, and municipal governments exercise powers delegated by the province or territory.) Each, generally independent of the others in its realm of legislative authority, derives its authority from the Canadian Crown. Most sectors fall under either federal jurisdiction (such as foreign affairs and telecommunications) or provincial jurisdiction (such as education and healthcare). The division of powers is outlined in the Constitution Act 1867 (formerly the British North America Act 1867), a key document in the Constitution of Canada (Smith, 2012). Federalism in Canada has been challenged by attempts in the mainly francophone province of Quebec to obtain greater autonomy or even secession. The primary mainstream political vehicle for the movement is the Parti Québécois, which has governed Quebec on multiple occasions. In 2012 it was elected to form a minority government, and its leader, Pauline Marois, became the first female Premier of Quebec. However, only 18 months later, the party was defeated by the Liberal Party of Quebec in the 2014 election. Other provinces, in turn, such as oil-rich Alberta, have raised similar claims. Altogether, a larger degree of regional autonomy and cultural diversity has been maintained than in the U.S.
Germany: Three Layers of Federalism Federalism in Germany is made up of states and the federal government. The central government, the states and the German municipalities have different tasks and partially competing areas of responsibility, governed by a complex system of checks and balances. Since reunification in 1990, the Federal Republic has consisted of 16 states: the ten states of the former West Germany, the five new states of the former East Germany, and Berlin. In several areas, this separation of powers has also been challenged, especially with regard to the educational system, where the major authority still lies with the federal states.
The EU and Economic Integration At the end of World War II, the political climate favored unity in Western Europe, seen by many as an escape from the extreme
forms of nationalism, which had devastated the continent. One of the first practical and successful proposals for European cooperation came in 1951 with the European Coal and Steel Community. Since then, the European Community has gradually evolved into a Union, with a range of policy areas in which its member states hope to benefit from working together. The federalization of the EU is seen as an institutional process by which the EU is transformed from a confederation (a union of sovereign states) into a federation (a single federal state with a central government, consisting of a number of partially self-governing federated states). There are ongoing discussions about the extent to which the EU has already become a federation over the course of decades, and, more importantly, to what degree it should continue to evolve in a federalist direction. Since the 1950s, European integration has seen the development of a supranational system of governance, as its institutions move further from the concept of simple intergovernmentalism and more toward a federalized system. With the Maastricht Treaty of 1992, new intergovernmental elements were introduced alongside the federal system, making it more difficult to define the EU as a polity ‘sui generis’. The ‘multi-speed Europe’ thesis envisions an alternative type of European integration, where the countries that want a more integrated EU can accelerate their integration, whereas other countries may go at a slower pace or cease further integration altogether. Specific current examples include the Eurozone and the Schengen area, which not all members have elected to join. According to Joseph H. H. Weiler (2003), ‘Europe has charted its own brand of constitutional federalism’. The EU lacks only two significant features of a federation. First, the member states remain the ‘masters’ of the treaties, i.e., they have the exclusive power to amend or change the constitutive treaties of the EU.
Second, the EU lacks a real ‘tax and
spend’ capacity; in other words, there is no fiscal federalism. Treaties must be agreed by all member states even if a particular treaty has support among the vast majority of the population of the EU. Member states may also want legally binding guarantees that a particular treaty will not affect a nation’s position on certain issues. In a referendum held in the UK on June 23, 2016, 51.9% of voters voted for the UK to exit the EU.
Bosnia and Herzegovina As noted previously, in Bosnia and Herzegovina, a federal structure has been used as a method of conflict resolution after the dissolution of the former state of Yugoslavia and an ensuing civil war in the 1990s – thereby giving a new impetus to the concept. Bosnia and Herzegovina has been an established federation since 1995. The contradictions in the Bosnian constitution are obvious. First, the Bosnian constitution is part of a peace treaty. In November 1995, the Dayton negotiations – peace negotiations between the states of Croatia, Serbia and Bosnia, moderated by the United States and the EU – were undertaken. Furthermore, the existence of two entities, of which one is named ‘Serb Republic’ and the other is a federation within a federal state, in a country with three main national identities (Bosniaks, Bosnian Serbs and Bosnian Croats) shows not only the complexity of the Dayton constitution but also its overfederalization and the dominant feature of ethnicity.
India India adopted a federal structure in 1950, in a multicultural society (Kashyap, 1969; Jennings, 1983) with diversity in religion, caste, language and geographical region and a large continental size. Indian federalism
was an attempt to disperse power, both constitutional and political. It was focused on economic regulation and planning, besides forging administrative relations with smaller states in a mode of equanimity. It was felt that a highly centralized structure of decision-making and resource allocation in a large geographic expanse would not be conducive to rapid development. At the same time, the center should not be overly powerful, to ensure that the plural society does not disintegrate. Federal features of the Indian Constitution include a written constitution, constitutional division of power and the provision of an independent judiciary. Despite the division of powers clearly laid down in the Constitution, there still exist palpable center–state tensions relating to issues of the federal structure. The center–periphery bias in the disbursement of funds is much resented by states. The classic example is that of the state of Punjab, which almost broke away from the Union. Although the states are empowered to make laws according to the State List, it is alleged that policies are generally devised at the center and sent down for implementation. This fact is resented by the states. Protagonists of state autonomy claim that there is an unfair allocation of resources. A number of areas, including education, have been brought under the control of the center to enlarge its financial, industrial, administrative and legislative powers vis-à-vis the states. Center–state tensions are not specific to India – in fact, in modern times, they are an international phenomenon, and most nations – including the United States – have failed to strike a perfect balance between the powers of the center and those of the states.
Center–State Relationship For administrative efficiency, either the nation state is divided into states (centripetal phenomenon), or a congregation of states pool their resources for common benefit (centrifugal phenomenon). Indian federalism is based on the former. Tensions between center–state and local level officials are at the core, as regulation interferes with the policy preferences of different state and local units, leading to a general feeling of discontent with the democratic set-up (Kohli, 1991). Three key areas of conflict are:
1 conflict of interest among federal units;
2 unavoidable interdependence and interpenetration between levels;
3 increased role of state at all levels.
Center–state relations run through three channels: (i) legislation and policymaking, (ii) administration and (iii) fiscal adjustment, which is crucial in most multi-sphere federal structures. Interactions in federal systems encompass center–state relations that operate in vertical and horizontal (or longitudinal and latitudinal) dimensions. Regional movements in Asia, and especially in India, have coalesced around the question of cultural identity (Kothari, 1970), and all have posed demands in relation to the central power structure located at the capital. However, the party in power at the center has flexed its muscle in channeling grants to the states/provinces. In the actual working of the constitution, the center and the states in India faced a number of problems, leading to tensions between them. The problem was compounded by the presence of the Congress Party at the center and non-Congress parties in the states. The tensions between the center and Punjab, with special reference to the Akali Dal (a regional political party in Punjab mostly representing the Sikh community), differ from those in other states, because the importance of the Akali Dal (whether in power or out of it) cannot be ignored. Terrorism also led to a lack of understanding between the center and Punjab (Puri, 2008). It had far-reaching consequences, including displacement of people and hesitation in the business class to invest.
The unitary tendencies of federalism in India and the powerful Congress system of the 1950s had to retreat in the face of popular movements in Andhra, Maharashtra, Gujarat, Punjab and Haryana. In Jammu and Kashmir, centralizing tendencies are thwarted by local dissidents, with the support of agents from neighboring countries.
Existing Data Sets In a field that is characterized by a high discontinuity of results and by research projects left uncompleted, two data sets deserve to be mentioned. The first is the Database of Political Institutions (DPI) and the second is the Comparative Constitutions Project (CCP). The DPI was designed and implemented by the World Bank Development Research Group and started in 2000. Since then, it has been widely cited. It has institutional data on government structures, measures of checks and balances, tenure and stability of the government, identification of party affiliation and ideology, and fragmentation of opposition and government parties in the legislature. It covers about 180 countries over 40 years (1975–2015), with updates until 2017. The CCP was conceived and implemented by Zachary Elkins, Tom Ginsburg and James Melton and includes more than 300 topics on almost every active national constitution in the world. All of the constitutions have been tagged by subject area, so that the researcher can look up the relevant constitutional provisions on particular subjects. Some other database projects focused on federal countries were not completed and were eventually discontinued.
An Overall Balance Effects of enforcement of federal structures can be both positive and negative. Among the
positives, first, governance becomes easier. It is very hard for a central authority to govern the entirety of a large country at once. Splitting the country into manageable chunks makes it easier to govern. Further, it is easier to give specific directions if you can narrow things down by state boundaries. Second, local issues are given more weight. Federalism enables local government to govern in a way that reflects the needs and interests of the specific region. Under a federal system, federal governments can pay attention to the issues that matter most to the people in the region, and then respond to these issues through the way in which they create and implement laws. Third, the federal system encourages diversity within a country. Different federations may have different ways of doing things, but they will all be respected equally. Fourth, it enables people to choose the best place for them to live. Citizens of a federal country can choose which region’s way of doing things suits them best, and then move to live or work in the region that best suits their lifestyle. Fifth, under a federal system, citizens feel a connection with their governors. It can be hard to feel connected to the people governing you when they live many hundreds of miles away in the capital of a central authority. Under a federal system, your federal governors will always live in your region. Sixth, a federal system is not one that aims to overthrow the central government. Rather, all federal governments are answerable to the country’s national legal system and/or constitution. In this way, a federal system provides unity for its people as well as diversity. Seventh, when people can vote to change federal laws, or when they can vote to replace federal governors, they have more immediate power over the issues that concern them and their region. Eighth, large countries may have entirely different climates in different regions. 
Thus, a federal system is useful, as federal laws relating to agriculture and similar issues can be tailored to suit the particularities of a region’s climate. Ninth, federal governors have a
better understanding of their region. It is very difficult for a centralized government to have an in-depth knowledge of all the regions. Tenth, when a centralized government is not the only power in a country, it can be better kept in check. Different federal states can act as checks and balances for each other and give each other ideas about how to govern. This also helps to balance the power of a central government. Eleventh, this system is ideally suited to big countries. It can be easier to govern several smaller states than one large nation alone. Large nations work better when they are divided up into smaller units as this makes governing them easier. Twelfth, recognizing distinct federal states helps to preserve the local character of each federation. The negative effects of enforcement of federalism include, first, parochialism. It can be argued that federal governments become overly parochial, putting the interests of a relatively small region of a country ahead of national interests in a counterproductive way. Second, when several federal states deeply disagree on legal matters, conflicts can arise. Conflict and disputes between states can cause friction. Third, in a federal government, governors and the central government may become engaged in conflicts of authority about what is best for a given region. Fourth, there may be a lack of unity: the federal system of government may become very fragmented – the country can appear to be broken into various parts, without real unity between them. Fifth, citizens may feel out of touch with other citizens in different parts of the country, making it harder for them to relate to people from different states. Sixth, when your most visible system of government is the federal state government, you may feel out of touch with the central government, especially when it is far away. Seventh, having different laws for different states can be confusing. 
It can be argued that it is better to have a consistent system for all areas of a nation. Eighth, federalism can become very complex to navigate.
Federalism can be a useful solution to the issue of governing a large country. It is a way of meeting each region’s needs without allowing those regions to lose touch with the central government. However, federalism can also carry with it the potential for conflict and confusion if it is not handled properly.
Perspectives
From these considerations we can conclude that:
• Federalism expands opportunities for citizens to participate at various levels of government.
• Federalism allows peripheral states to act as ‘laboratories’ of public policy that will show negative effects of bad policy and allow for diffusion of good policies.
• Expanded opportunities to participate can lead to ‘election fatigue’ and confusion over which level of government is responsible for addressing public problems.
These features may account for the relative success of federalism, which in the second decade of the 21st century covers about twenty-five countries and more than 40% of the world’s population.
Note 1 Shortened and revised by the editors.
47
Hybrid Regimes
Jean-François Gagné and Anne-Laure Mahé
Introduction
Traditionally, the study of political regimes (see Whitehead, Chapter 52, this Handbook) has been dominated by a dichotomy: democracy on the one hand, and autocracy on the other. Few contested cases could be located at the conceptual frontier and everything in between was considered an anomaly. This framework prevailed up until Linz’s (1964) work, which opened new research avenues. Later on, a closer look at political regimes in Latin America, sub-Saharan Africa, East Asia and the post-communist space gradually revealed the coexistence of contradictory rules, norms and practices. In some countries, it was found that multiparty elections were combined with targeted repression of opposition leaders. How best to describe this interaction between democratic and autocratic features? The concept of hybrid regimes is an answer to this question, and what makes it an interesting problem is the prevalence of
hybrid regimes and their resilience. Schedler claimed that ‘most regimes today are neither clearly democratic nor fully authoritarian’ (2002: 37). Moreover, many hybrid regimes settle durably in the middle of the continuum between democracy and autocracy. They are not a mere ‘optical illusion’, to use the words of Morlino (2011), a sort of temporary or transitional period. Ottaway (2003: 21) emphasizes their equilibrium nature and Merkel (2004: 33) asserts that ‘they are often seen by considerable parts of the elites and the population as an adequate institutional solution to the specific problems of governing effectively’. Hybrid regimes deserve scientific scrutiny and are worth the effort of delving into complex intellectual challenges. In this regard, the April 2002 issue of the Journal of Democracy represents a turning point. It basically sets the future research agenda. After this, publications increased significantly.1 Over the past two decades, hybrid regimes have gained substantial attention.
This chapter offers an overview of conceptual and theoretical contributions in the subfield of hybrid regimes. Concepts describe a phenomenon and define the boundaries of the object under study, which in turn serves to elaborate theories and develop explanations. The chapter also briefly takes a macro perspective, pointing to regional trends and exploring international diffusion mechanisms. It ends with a discussion of future lines of research, such as subnational politics and the impact of technologies on political regimes (see Schlumberger and Schmitter, Chapter 42, this Handbook).
A Conceptualization
Concept formation, classification and generalization are all part of the scientific enquiry process. Concept formation includes the need to differentiate hybrid regimes from democratic and autocratic regimes2 based on specific dimensions and derivatives that come within the hybrid umbrella: that is, democracy and autocracy with defects. It disaggregates the phenomenon and aims to add granularity to our understanding. The typologies offer a tentative synthesis by proposing ways to classify hybrid regimes, and cross-national time-series databases standardize and operationalize concepts on a large scale.
Describing the Regimes A simple definition of hybrid regimes consists of a combination of democratic and autocratic features that persists over time. Indeed, in order to be called a regime, the reproduction of some institutional arrangements is a necessary condition. The conceptual frontier between an autocratic and a hybrid regime rests upon the absence or existence of frequent direct multiparty national elections with universal suffrage. In the autocratic category, we find the Middle
East monarchies (e.g. Saudi Arabia), communist regimes (e.g. China, North Korea) and military dictatorships (e.g. Myanmar, at least until 2010). The distinction between a hybrid and a democratic regime turns on the degree of political competition. In western countries, elections are for the most part free and fair, which implies an environment conducive to respect for civil and political rights as well as the rule of law. In hybrid regimes, the electoral process is skewed in favor of the incumbent. Beyond this basic definition, the concept is a generic term, a class of phenomena that encompasses a wide variety of institutional arrangements in the developing world and groups together all types of intermediary regimes. Each derivative concept explores a specific dynamic related mainly to electoral institutions. The arbitrariness in the application of democratic rules certainly touches upon a crucial dynamic found in hybrid regimes and reflects autocratic behavior. In hybrid regimes, ruling elites often instrumentalize legislative provisions for political gain in electoral contests. Corruption charges against opposition leaders are a classic example. Similarly, nonelected actors, such as the military, religious groups and oligarchs, possess somewhat discretionary powers. They see themselves as above the law and often interfere in elections to push for their preferred candidate. The prevalence of powerful veto players is captured by concepts such as protected democracy (Loveman, 1994). Rules can be a priori democratic and their application relatively uniform, but if embedded in a specific context, they can become problematic and reveal autocratic tendencies. For example, if the electoral law bans ethnic parties to prevent polarization and foster national unity, it often de facto excludes many organizations in divided countries. It imposes a high barrier to entry into the party system that deprives the opposition of legal status and the ability to stand in elections.
It indicates a form of restricted democracy (Waisman, 1989). In theory, the rules look
consistent with the democratic spirit, but in practice the design reveals an intent to contain the political opposition. Furthermore, depending on structural inequalities in society, electoral laws are not necessarily undemocratic, but do not favor equal political participation. Although universal suffrage exists, and all citizens could register on official lists, segments of the electorate are prevented from doing so for economic or security reasons. In some countries, women are pressed to stay home or work for the survival of children. In other cases, widespread illiteracy restrains mostly impoverished people from being able to cast their vote according to the fundamental secret ballot principle. They become subject to outside influence. This prospect is encapsulated in concepts such as limited democracy (Archer, 1995). A regime might be somewhat democratic during the election period, but manifest autocratic traits the rest of the time. Political participation is a continuing process that requires the respect of civil liberties such as freedom of expression, association and so on. If media coverage of opposition parties is only allowed during elections, they enter the contest as unfamiliar candidates and most likely lose the election in anonymity. The idea that elections are not the sole and only standard to judge whether a regime is democratic or not is conveyed by O’Donnell’s (1994) concept of delegative democracy. Zakaria (1997) used the concept of illiberal democracy to describe the imbalance between political rights and civil liberties. He found that overall political rights such as free and fair elections as well as competitive and representative parties are less respected than civil liberties, including the freedoms of expression, assembly, association, education and religion. 
He suggests that this reveals a dangerous combination where the state is weak and autocratic and the society is strong and democratic: a recipe for frequent electoral boycotts and chronic political instability, as seen in many sub-Saharan African countries.
In addition, it can be helpful to focus briefly on authoritarianism with adjectives. In this vein, Schedler (2002), who proposes the notion of electoral authoritarianism, also introduces a distinction between competitive electoral authoritarianism (i.e. hybrid regime) and uncompetitive (or hegemonic) electoral authoritarianism. Levitsky and Way (2002, 2010) propose the concept of competitive authoritarianism, with detailed indications of how to determine unfair elections, violations of civil liberties and the uneven playing field. For instance, in some countries, the incumbent redirects state resources to partisan ends, such as buying media coverage. This provides significant funds, which give the dominant party an unfair advantage. In this regard, Svein-Erik Helle (2016: 54) develops additional criteria to assess the playing field – notably access to resources, which includes not only public but also private, illicit and international funding. Thus, in competitive electoral authoritarianism, political pluralism is effectively recognized and conflict about the rules is limited and regulated by legislative institutions. The competitive dimension consists not only of the ability of opposition parties to compete in elections, but above all of the presence of opposition parties in political institutions. Hegemonic authoritarianism denotes the absence of ‘suspense’ over the outcome of an election and stresses the systematic prevalence of extralegal activities (Howard and Roessler, 2006: 108; Simpser, 2013). The unusually high voting percentage and equally high voter turnout signal that the electoral process was probably tampered with in some way. Disclosure of irregularities also increases thanks to monitoring from civil society and international observers offering a better vantage point to gauge the fairness of elections.
Finally, given that opposition parties are de facto excluded from formal institutions, contestation moves to the streets and provides additional information about the hegemonic nature of the electoral authoritarian regime.
We should add that hybrid regimes include various subtypes. Some argue that a point is reached at which new concepts only add confusion instead of contributing to the accumulation of knowledge (Armony and Schamis, 2005). Nevertheless, an effort should be made toward building consensus and concept classification is a step in the right direction. Obviously, organizing these concepts within a coherent scheme represents a significant challenge.
Classification Where concept formation looks at the specific, concept classification takes a more inclusive approach. It intends to make sense of an eclectic literature and enmeshes diverse concepts by creating typologies of political regimes. On the one hand, typologies define the conceptual frontier of hybrid regimes compared to democratic and autocratic regimes; on the other, they untie the relation between various types of hybrid regimes. Few studies cope with these twin objectives simultaneously. Wigell (2008) constructs a framework with four main types of regime (i.e. democratic, constitutional oligarchic, electoral autocratic and authoritarian), with detailed classification schemes and sixteen over fifteen criteria along two axes: electoral and constitutional conditions. ‘Constitutional oligarchy’ and ‘electoral autocratic’ loosely correspond to hybrid regimes, although the author creates new concepts instead of concentrating solely on classification objectives. He adds novel aspects such as electoral irreversibility and the complicated issue of electoral boycott, as well as legal accountability of not only elected officials but all civil servants in the public administration at local, regional and national levels. Diamond (2002) offers a classification of hybrid regimes integrated into a typology of political regimes which encompasses liberal democracy, electoral democracies, ambiguous
regimes, electoral authoritarianism and politically closed authoritarianism. He stresses that ‘the distinction between electoral democracy and electoral authoritarianism turns crucially on the freedom, fairness, inclusiveness, and meaningfulness of elections’ (Diamond, 2002: 28). He further differentiates between two types of electoral authoritarianism: hegemonic and competitive. Merkel (2004) advances a typology of defective democracies. The article distinguishes between four subtypes according to five partial regimes: the electoral regime at the core, surrounded by dimensions of political liberties, civil rights, horizontal accountability and effective power to govern. The subtypes are: exclusive democracy with shortcomings on civil rights, mainly the absence of universal suffrage; domain democracy, limiting the effective power to govern; illiberal democracy, emphasizing weak horizontal accountability based on a lack of separation of powers; and finally, delegative democracy, combining complications connected to political liberties, civil rights and horizontal accountability. Hence, defective democracies are all electoral regimes, but show imperfection in one or more partial regimes. Morlino (2011) adopts a novel approach. He stresses the importance of considering the legacy of previous regimes. Hybrid regimes are a by-product not only of an authoritarian past, but also of a traditional or colonial heritage and a mostly neglected instance of democratic backsliding. This leads to three different types of hybrid regimes: (i) protected democracy, with powerful veto players; (ii) limited democracy, with serious issues pertaining to civil and political rights; and (iii) democracy without state, which reflects an inefficient democracy or a democracy without law. Based on a factor analysis and country by country investigation, protected democracy appears not to be an empirical reality. 
Furthermore, a new category emerges: quasi-democracy, which combines obstacles to rights and the rule of law.
Gilbert and Mohseni (2011) create a typology more explicitly using the concept of hybrid regimes. They argue that the competitive nature of multiparty elections sets hybrid regimes apart from other variants of authoritarianism because only then can executive turnovers occur and, in doing so, change the distribution of power (Gilbert and Mohseni, 2011: 285). Within the hybrid regimes category, they identify illiberal hybrid regimes, tutelary hybrid regimes and illiberal tutelary hybrid regimes. ‘Illiberal’ refers to problems with civil liberties such as freedom of speech and assembly, which create an unfair playing field for competing parties when it comes to elections. ‘Tutelary’ points to non-elected actors meddling in elections for or against a candidate/party while protecting their own ‘reserved’ domain from outside influence. Overall, Bogaards (2009: 404) points to the high degree of generality of these typologies, marked by the absence of clear indicators (including sources of information) and aggregation rules. As a consequence, the operationalization leaves room for subjective interpretation of what hybrid regimes are and are not. The reproducibility of regime classification schemes over time and space is precisely the mission and mandate of large databases.
Generalizing Only a few organizations are devoted to taking a cross-national and historical perspective on political regimes. Presented as a non-partisan organization located in New York, Freedom House acts as a reference among hybrid regimes experts. Regimes are catalogued as either free, partly free or not free.3 ‘Partly free’ symbolizes hybrid regimes. Criteria are grouped into two clusters: civil liberties and political rights. Freedom House also introduced two more indexes: freedom of the press and freedom of the Internet. In the United States, the NGO Center for Systemic Peace runs the Polity Project. It is
a unique source of information that covers more than two centuries of political development. It classifies polities as either democracy or autocracy and more recently created an additional category: anocracy, which refers to hybrid regimes. The database gives priority to executive recruitment procedures, the independence of executive authority and political competition and opposition. It also produces an event analysis of regime change. The Varieties of Democracy (V-Dem.net) database is managed by the University of Gothenburg in Sweden. It provides a very rich examination of political regimes in countries around the world going back to 1789. It generates indexes on electoral, liberal, participatory, deliberative and egalitarian democracy based on more than 350 indicators, notably with information about the level of democracy in local and regional government as well as gender issues. It classifies political regimes into four categories: liberal democracy, electoral democracy, electoral autocracy and closed autocracy. Within this perspective, electoral autocracy and electoral democracy correspond to hybrid regimes.4 Finally, other databases deal with some aspects discussed in the section on definition, but do not offer a systematic approach to political regime classification. The World Bank Political Institutions database produces detailed information on a wide range of institutional characteristics starting in 1975. The legislative and executive indicators of electoral competition help in determining the domination of the incumbent or party in power. Variables include, notably, parties of the government or opposition coalition, as well as the degree of fragmentation of legislatures. The Electoral Integrity Project is a joint program between Harvard University and the University of Sydney. The database assesses the quality of elections worldwide since 2012 and makes available observations at country level, election level and, uniquely, expert level. 
The latter significantly improves efforts toward increased reproducibility of the evaluation process. The Electoral Integrity
Project is valuable for gauging the playing field in elections, offering information on, among other things, political funding. These databases employ different criteria and weighting systems which generate estimations about the political regime of a given country in a given year. In many cases, the classifications diverge, sometimes considerably, even though the databases follow a similar evaluation process involving regional experts, career analysts and academics. According to Lueders and Lust (2018), database assessment of political regimes is overly sensitive to political events. For instance, a coup or an electoral turnover does not inevitably represent a change of regime. They point to the necessity of differentiating between institutional rupture and reforms in order to measure the actual impact of these events on the political regime. Indeed, this information is crucial to explain the emergence and resilience of hybrid regimes.
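The way in which these databases map their own categories onto the hybrid band, and the way their verdicts can diverge, may be illustrated with a minimal sketch. It uses only the category labels reported in this section (Freedom House's 'partly free', Polity's 'anocracy', and V-Dem's 'electoral democracy' and 'electoral autocracy'). The dictionary layout, the function names and the example country-year are assumptions introduced for illustration, not part of any database's actual interface or data.

```python
# Labels follow the chapter's description of each database; the data layout
# and function names are illustrative assumptions, not a real database API.
HYBRID_LABELS = {
    "freedom_house": {"partly free"},
    "polity": {"anocracy"},
    "v_dem": {"electoral democracy", "electoral autocracy"},
}


def is_hybrid(source, label):
    """Return True if `label` falls in the hybrid band of `source`'s scheme."""
    return label.strip().lower() in HYBRID_LABELS[source]


def sources_agree(labels):
    """Return True if all sources give the same hybrid/non-hybrid verdict."""
    verdicts = {is_hybrid(source, label) for source, label in labels.items()}
    return len(verdicts) == 1


# Hypothetical country-year, invented purely for illustration.
example = {
    "freedom_house": "Partly Free",
    "polity": "democracy",
    "v_dem": "electoral democracy",
}
print(sources_agree(example))  # False: the Polity label lies outside the hybrid band
```

Such a comparison only records whether sources disagree on the hybrid label. It says nothing about why they disagree, which depends on the criteria and weighting systems each database applies.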
Theory-Building As demonstrated, a large part of the literature on hybrid regimes is devoted to conceptual and categorization issues. In 2006, McMann emphasized that analysts dedicated more time to coining new expressions than to determining the proliferation of hybrid regimes. Fortunately, this criticism holds less relevance today. For the past 15 years, a rich corpus has yielded explanations for both the emergence and the resilience of hybrid regimes. Nonetheless, theory-building on those topics can appear limited and fragmented because not all authors use the term hybrid regimes, relying instead on other categories and concepts. For instance, a growing area of research is authoritarian resilience, which, despite its name, is overwhelmingly focused on electoral authoritarianism. Explanations of the emergence and resilience of hybrid regimes are therefore intrinsically connected
to the conceptual issues we highlighted, and providing an overview of theory-building in this field sometimes requires looking beyond the labels and mixing and matching research with very distinct starting points.
Explaining the Emergence of Such a Regime
The origin of political regimes is a well-explored topic in the social sciences, but few studies explicitly address it in relation to hybrid regimes per se. To explain their emergence, it is necessary to borrow from the literature on democratic transitions. A first category of explanatory factors focuses on the impacts of the historical context, looking at external determinants of the rise of hybrid regimes. Major global changes in the 1980s and 1990s posed a challenge to autocrats. The economic crisis that affected many of them in the 1980s threatened their capacity to sustain patronage politics and a coercive apparatus. It also reduced their legitimacy, since plenty of those regimes, especially in sub-Saharan Africa, were built during decolonization on the promise of development. Simultaneously, the end of the Cold War reduced the autocrats’ options. They could no longer benefit from superpower patronage and play on the opposition between the superpowers to increase the price of their alignment. Furthermore, the end of the Cold War, with the ‘victory’ of the Western democratic model, opened an era of active democracy promotion that came with the implementation of so-called democratic conditionality on foreign aid, which was a lifeline for many autocrats. Of course, not all autocrats were subjected to the same level of pressure. In their seminal work on competitive authoritarianism, Levitsky and Way (2010) demonstrate that regimes with the most links – economic, social and intergovernmental – with the West were the ones that democratized. They also find that leverage – that is, the pressure to
democratize exercised by foreign powers – does not have any effect without high linkage. This can explain why, contrary to the arguments of modernization theory,5 poorer countries sometimes democratized – or, more exactly, hybridized – more than their wealthier counterparts (Miller, 2017). Dependency on foreign aid provided by democratic states made those countries more likely to reform. Hybrid regimes consequently emerged because autocrats were able to face external pressures to democratize by carrying out the minimal institutional reforms needed to satisfy foreign powers, while still preserving – even prolonging – their rule and receiving financial aid. Many of them went for the ‘low hanging fruit’ of multiparty elections. In countries where a majority of poor voters could be co-opted through clientelism, this was an especially unthreatening change (Miller, 2017: 1), even more so because the conditionality was rarely rigorously implemented. But to better understand how hybrid regimes emerged from some transitional processes, internal factors also need to be included. O’Donnell and Schmitter’s seminal work (1986) emphasized the uncertainty of such periods by focusing on the games of actors, most notably the ruling elite. The strategies adopted depend on the evaluated cost and availability of strategies and the balance of power within those elites – for instance between softliners and hardliners – but also on their own appreciation of the situation. According to O’Donnell and Schmitter, during the second period of the transition, when disorder and uncertainty are at their highest level, conditions are favorable for a coup by the hardliners; they can also push softliners to join them if they consider the opposition too menacing or realize they might lose control of the transition. Either path can lead to moments of autocratic restoration.
Of course, other actors, such as the opposition and the military, can play a major role in determining the outcome of the transition (Bratton, 1997).
In contrast with this approach, which underplays the role of structures in the emergence of various regime types, other authors have looked into the impact of institutions, and, first of all, of the preceding regime type. Hadenius and Teorell (2007) find that while monarchical regimes often transition to a highly restricted form of electoral monarchism and never to democracy, one-party systems change at a similar frequency to dominant multiparty systems, non-dominant multiparty systems and pure military regimes. Some past institutions therefore shape the country’s trajectory more than others. Gelman further argues that the legacy of the past influences the path of the transition because it ‘defines the initial constellation of actors and the distribution of their resources at the beginning of a regime change’ (2008: 160). It is part of ‘the initial political opportunity structure’ (ibid). In situations where the cost of coercion is higher than the cost of cooperation, dominant actors can compel the subordinated actors into a ‘cartel-like deal’ (Gelman, 2008: 162), where the dominant actor shares some of its resources to co-opt the subordinate and retain control over major decisions. Those deals result in hybrid regimes, either electoral authoritarianism or defective democracy. The institutions adopted during the process of regime change can also decrease the likelihood of democratic consolidation and push regimes onto the hybrid path. Notably, there is a long-standing debate on the comparative dangers of presidential and parliamentary systems for young democracies (Linz, 1990; Mainwaring and Shugart, 1997). Lastly, informal institutions such as clientelism and patrimonialism can also disrupt and distort the workings of formal democratic institutions to produce hybridity (Merkel, 2004: 54).
A third internal factor that has been abundantly studied is the impact of the modes of transition, defined ‘in terms of the identity of the actors who drive the transition process and the strategies they employ’ (Munck
and Leff, 1997: 343). Munck and Leff (1997) identify five types of transition – reform from below, reform through transaction, reform through extrication, reform through rupture and reform from above – and argue that the last of these is the mode of transition least likely to progress toward the consolidation of democracy, because there is no counterbalance to the ruling elites who single-handedly set the transitional agenda. Reforms from below are likely to give birth to a specific type of hybrid regime: restricted democracy. Quite clearly, most explanations concerning the origins of hybrid regimes are connected to the specific historical moment of the third wave of democratization. Lately, a new literature has started to investigate a different dynamic: how hybrid regimes are born out of democracies. This refers to the notion of democratic backsliding, a gradual intra-regime change conceptualized as a series of incremental actions that restrict competition, participation and accountability (Waldner and Lust, 2018: 95). Its explanations are mostly actor-centered, though the impact of structural changes such as growing economic inequalities is acknowledged. According to Bermeo, the most common variety of backsliding ‘occurs when elected executives weaken checks on executive power one by one, undertaking a series of institutional changes that hamper the power of opposition forces to challenge executive preferences’ (2016: 10). It is therefore often done through legal action and formal institutions. Turkey’s evolution under the rule of the Justice and Development Party (AKP) is an illustrative example, with autocratic changes enacted by ‘elected officials with a strong popular mandate to rule’ (Bermeo, 2016: 11).
Explaining the resilience The increasing prevalence of a new road to hybridity starting in democracies demonstrates that hybrid regimes are enduring types of political regimes, despite initial assessments
emphasizing their transitional nature and depicting them as a necessary step on the path to democracy. The first to provide an analytical framework for this resilience were Levitsky and Way (2010), who discussed competitive authoritarianism and highlighted the role of a mix of international and internal factors. In cases where the organizational power of the incumbent was high, that is, where a cohesive or coercive party structure was in place, opposition challenges were thwarted and the regime remained stably authoritarian. Stability was enhanced by the support of a powerful authoritarian counterhegemonic force, such as China. In contrast with this approach, recent works emphasize how the appropriation of democratic traits plays a major role in the resilience and stability of hybrid regimes, challenging the theories arguing that the implementation of such institutions kickstarts a learning process, a gradualist path to democracy (Miller, 2015: 525; Lindberg, 2006: 139). Instead, formal and nominally democratic institutions, such as legislatures, electoral processes and multiparty systems, are actually important tools in the management of the threats against the incumbent coming from within the ruling elites or from outsiders. This would explain why regimes devoid of those institutions, such as personalist and military ones, are more likely to be overthrown by successful coups (Geddes, 2004: 19). To put it simply, those institutions enable durable and efficient co-optation because without them transfers remain occasional and discretionary, and are consequently plagued by commitment problems on both sides (Fjelde, 2010: 202). Formal institutions alleviate the moral hazard that characterizes dictatorships, where the co-opted individuals are never sure the leader will not gather more power, to their detriment (Magaloni, 2008; Svolik, 2009). By limiting their own action with institutions, autocrats increase the credibility of their power-sharing agreements.
Research on this topic focuses on three main institutions: legislatures, electoral processes and the ruling party. Regarding the first of these, Gandhi (2008) argues that when facing significant threats, rulers need to make concessions. These take the shape of policy compromises devised through institutions. Indeed, ‘for incumbents, these institutions are a way in which opposition demands can be contained and answered without appearing weak’ (Gandhi, 2008: xviii). They enable the rulers to broaden their support base, integrating new segments of the population into the regime’s coalition and thus preventing the formation of rebellion from below, especially when combined with elections. Elections allow popular discontent to be released in a controlled manner and also monitored. During elections, regimes can gather information on the constituencies where their support is weak and then use it to develop their patronage system (Brancati, 2014). Of course, in semi-authoritarian contexts elections are circumscribed and therefore provide little, and low-quality, information, but they have other important roles, such as contributing to the legitimacy of the regime, especially in the eyes of the international community (Schedler, 2002). Organizing successful electoral processes is also a device signaling the ruling party’s strength, generating a public image of invincibility that discourages potential opponents (Magaloni, 2006). Like legislatures, elections are a means of co-optation: they share power among party politicians, help manage conflict and provide moments at which new elites and groups are recruited into the ruling coalition (Gandhi and Przeworski, 2006). A well-structured and institutionalized party plays a similar role by providing explicit nomination procedures and rules for professional advancement. This makes the elite’s future less uncertain and consequently mitigates centrifugal forces by alleviating the arbitrariness of autocratic power (Brownlee, 2007).
While those analyses provide a compelling narrative to explain the durability of
hybrid regimes, they run the risk of creating a ‘hybridity trap’. Indeed, if hybridity itself provides longevity and stability, then is it possible to transition to something else? A major research avenue is therefore the identification of the conditions under which political institutions in hybrid regimes contribute to learning processes leading toward either democratic consolidation or autocratic resilience, keeping in mind that in most cases of regime change, countries transition from one hybrid regime to another (V-Dem Institute, 2018).
Regional Differentiation and International Diffusion
Hybrid regimes are common at the global level, and many share similar traits, like the implementation of elected legislatures. They nonetheless do not have the same prevalence in all regions of the world. Looking at the global and regional averages of liberal democracy, defined as the most complete form of democracy, V-Dem data shows that Western Europe and North America are at the highest level, far above the other regions. Latin America and the Caribbean, Eastern Europe and Central Asia, Asia Pacific, sub-Saharan Africa and the Middle East and North Africa region follow, in that order (V-Dem Institute, 2018: 17). The data also enables us to identify more precisely which types of hybrid regime dominate in those regions. For instance, electoral authoritarianism is prevalent in sub-Saharan Africa, with 24 cases out of the 49 countries included in the dataset (V-Dem Institute, 2018: 94). Though most of these regimes find their roots in the third wave, there have been episodes of change within the past ten years. Comoros, Tanzania and Zambia backslid from electoral democracy to electoral authoritarianism, and Mauritius and South Africa went from liberal to electoral democracy. By contrast, hybridity in Latin America is more on the democratic side of the continuum, with
16 electoral democracies out of 22 countries. Though the region’s political regime and levels of democracy have been relatively stable for the past 15 years, democracy has recently eroded in Venezuela, Honduras and Nicaragua, which are all now categorized under electoral authoritarianism (V-Dem Institute, 2018: 94). The data also shows that the two regions where waves of regime turmoil have recently taken place – with the ‘color revolutions’ of 2000–5 and the 2011 ‘Arab Spring’ – remain overwhelmingly autocratic. In Ukraine and Kyrgyzstan, autocratic restorations followed the initial moment of political liberalization, while in the Middle East only Tunisia jumped from electoral authoritarianism to liberal democracy (V-Dem Institute, 2018: 20). Consequently, those two waves of popular mobilization might have precipitated transitions, but this was usually from one autocratic regime to another. Furthermore, according to the V-Dem dataset, 6 out of 18 cases of democratic backsliding have taken place in eastern Europe: Hungary, Lithuania, Poland, Slovakia, Serbia and Turkey (see Mainwaring and Bizzarro, Chapter 91, this Handbook). The only region where democracy seems to advance is Asia Pacific, most notably Myanmar – which recently transitioned to a form of electoral authoritarianism where the army still fulfills a central role – but also Sri Lanka, Bangladesh and Bhutan. Yet, qualitative evidence about the Philippines and Thailand provides ground for debate. In the Philippines, the election of President Rodrigo Duterte in 2016 and the development of his strongman rhetoric and intransigent stance on drugs have eroded the rule of law. In Thailand, civilian rule collapsed in 2006 and 2014 (Croissant and Lorenz, 2017: 8). Attempting to clarify why specific types of hybrid regimes are prevalent in some regions and not others is an ambitious endeavor, which is beyond the scope of the present overview. Here, we focus rather on investigating the processes of international
diffusion that could explain this global trend of hybridization, both in autocratic and democratic polities. Brinks and Coppedge (2006) indeed find evidence for such processes, demonstrating that ‘countries tend to become more like their immediate geographic neighbours over time’ (2006: 464), matching the average degree of democracy or non-democracy of the other countries. However, this does not tell us how diffusion happens. One hypothesis, echoing Levitsky and Way’s work, is the domination of a powerful autocratic neighbor. It is, for instance, tempting to see Russia’s influence behind processes of democratic backsliding in eastern and central Europe, but there is no empirical evidence that linkage between electoral democracies and autocratic regimes fosters such processes (Brownlee, 2017). Latin America might be the sole region in which autocratization in different countries is clearly connected to the presence of a regional power and model: Venezuela. President Rafael Correa in Ecuador and Evo Morales in Bolivia followed Hugo Chávez’s script of ‘convening constitutional assemblies to create new constitutions that expanded rights while concentrating power in the executive, and his use of the state to control economy, the media, and civil society’ (De la Torre, 2017: 1272). Chávez himself actively promoted his playbook through mechanisms of economic support and soft power, for instance with the creation of the Bolivarian Alliance for the Americas (ALBA). ALBA ‘provided the institutional space for populist leaders to meet, gather information and devise common strategies’ (De la Torre, 2017: 1284). Nonetheless, the capacity of the other leaders to emulate his Bolivarianism was dependent on domestic incentives, most notably the weakness of democratic institutions, and on shared ideological goals. Many nations participated in ALBA only to guarantee access to cheap oil (De la Torre, 2017: 1284). 
Ironically, other recent cases of autocratization in the region are also connected to Chávez, but this time as a reaction. It was
fear of the implementation of a similar type of regime that pushed the Honduran opposition ‘to preemptively back the military ouster of President Manuel Zelaya in 2009 and the Paraguayan opposition to impeach and remove from office President Fernando Lugo in 2012’ (Mainwaring and Pérez-Liñán, 2015: 116). Looking at their ways of ruling, however, those regimes are not so different from the Bolivarian ones: they have hegemonic aspirations and popular legitimacy, derived from an electoral process organized on a heavily tilted playing field, with malfunctioning horizontal accountability and intolerance toward opposition (Mainwaring and Pérez-Liñán, 2015: 116). Ideology, it appears, does not necessarily and extensively shape ruling practices.
Major Advances and Ongoing Debates The literature on hybrid regimes deepened our comprehension of the phenomenon both conceptually and theoretically. Progress has been made on many fronts. We now have a better understanding of what hybrid regimes are, as well as why and how some hybrid regimes remain stable while others quickly fall apart. The concept has been dissected to cover a wide diversity of institutional arrangements. With more refined dimensions and indicators, the internal validity and precision of hybrid regimes’ conceptualization improved. Similarly, significant contributions on the theoretical front offer a comprehensive analysis of ruling elites’ strategies to control political institutions and factors explaining hybrid regimes’ emergence and resilience. Most of all, hybrid regimes foster a dialogue among scholars studying political regimes, a meeting point between two corpora that used to work in isolation: the literatures on democratic and autocratic regimes. This is what makes work on hybrid
regimes an exciting and creative endeavor: the propensity to borrow, merge, bridge and (re) frame ideas. In the process, it fuels conceptual and theoretical innovation, focusing on action and concrete events in the real world that the notion implies. What makes regimes hybrid is precisely the intertwined ramifications of political praxis and formal institutions, which defy traditional categorizations. Through the lens of hybridity, scholars can tackle the malleability of power practices in autocratic and democratic settings and question how it extends across time – with temporal processes of inter and intra-regime changes – and space. It is around this latter concept that recent debates on hybrid regimes have emerged. Looking at sub-national politics, authors working on Latin America and Russia highlight how autocratic enclaves with clientelist and repressive practices subsist in states that are democratizing at the national level (Gelman, 2010; Gibson, 2010). There is also the impact of the decentralization process in fragile democracies, which creates predatory and kleptocratic behavior at the local level (Gervasoni, 2018). In autocratic settings, a growing literature investigates how power can be contested or negotiated in cities and regions (Morelle and Planel, 2018). How regimes at different levels interact remains understudied. Outside of the electoral arena, government attempts to contain civil society have attracted greater attention (Rutzen, 2015). Instead of focusing only on formal opposition parties and how the ruling elites deploy various stratagems to control the political space in legislative institutions, the influence of non-governmental organizations is acknowledged, especially in creating new empowering and destabilizing social movements. 
Although there exists a huge literature on social movements in democracies, far more research needs to be done to understand under which conditions they have an impact on the emergence or resilience of hybrid regimes.
The role of information and communication technologies (ICT) in political regimes is yet another novel development. The governance of ICT is in many ways the new frontier along which political regimes nurture hybridity. More and more evidence points to the effects of internet platforms in the pushback against democracy, especially during elections, with propaganda tactics used by outside authoritarian regimes to influence voters and manipulate results in democratic ones (Bennett and Livingston, 2018). Social media can also be a tool for emancipation and liberalization by democratizing information, and enabling opponents to organize and mobilize more efficiently against autocratic government, as the Arab Spring suggests (Rød and Weidmann, 2015: 340). At the same time, in a striking demonstration of how hybridity is produced even in the context of strong authoritarianism (here China), MacKinnon (2011) shows how digital communication increases the regime’s responsiveness to its population, all the while enhancing its control over it. Finally, and ironically, many of the practices used to contain cyberspace in autocratic regimes originate in democratic countries, where opaque and secretive practices of surveillance are also at play (Glasius and Michaelsen, 2018).
Conclusion: Perspectives
More than two decades ago, the world seemed to have evolved into a hostile environment for authoritarian regimes. There were strong incentives to mutate, and many autocrats successfully adapted to the new context. It is by becoming hybrids that nondemocratic polities guaranteed their survival. To recall Tomasi di Lampedusa (1960: 40), for things to remain the same, everything must change. Recent research shows that even consolidated democracies are receding and developing autocratic features. Consequently, the question is whether
contemporary world developments favor hybridity instead. A first answer looks at the evolution of the global context since the early 2000s. The growing economic and political importance of autocratic counterhegemony, as in the rising international presence of China and Russia, and the switch away from democratization to security as a strategic priority for the United States and the EU give weight to the hybrid scenario. The fight against terrorism, which gained traction after 9/11 and more recently with the rise of the Islamic State and terror attacks in Europe, has pushed established democracies to support semi-authoritarian regimes. But terrorism is also used at home to justify infringements of the rule of law, the extension of state surveillance and control over new technologies. In addition, as Bermeo (2016) points out, it is harder to sanction those regimes because their leaders can claim that they are respecting the rule of law and formal democratic institutions, acting within the confines of their popular and legal mandates. Procedural legitimacy, often sanctioned by international monitoring missions, has provided many hybrid regimes with a veneer of democracy that can be used to counter internal and external criticisms. There is, however, a second answer to this question, one that is critical and epistemological. It is to argue that hybridity was always there and could even be the result of a change in our perspective rather than an objective empirical transformation. Expanding our definitions of democracy to include equal access to power and the protection of the law reveals how large parts of the population have lived in hybrid regimes for a long time, despite specialists labeling them liberal democracies (Brachet-Marquez, 2005: 481). Similarly, some individuals or social groups had more freedom of speech than others in autocratic states.
Hybridity, like all social science concepts, is a conceptual construct that mirrors the bias and subjectivity of its designers. Future research might therefore
benefit from reflecting on hybrid regimes in everyday experience and from incorporating a greater diversity of perspectives.
Notes
1 For instance, a citation search in Google Scholar with ‘hybrid regimes’ in the title or text and political science as an additional criterion shows 141 results (1990–9), 1,890 results (2000–9) and 9,410 results (2010–18).
2 In the following overview, autocratic regimes include dictatorship and authoritarianism.
3 It includes data starting in 1972.
4 Electoral democracy includes various forms of defective democracies as the threshold is ‘not exceedingly demanding’ (V-Dem Institute, 2018: 19).
5 In its early form, modernization theory argued that the emergence of democracy was more likely in countries with a certain level of development (Lipset, 1959). Later work by Przeworski and Limongi (1997), however, found no evidence of this causal relationship, though they did find support for the exogenous modernization theory asserting that democracy is more likely to survive in a developed country.
References Archer, Ronald P. (1995). ‘Party Strength and Weakness in Colombia’s Besieged Democracy.’ In Building Democratic Institutions – Party Systems in Latin America, edited by Scott Mainwaring and Timothy R. Scully, 164–99. Stanford: Stanford University Press. Armony, Ariel C. & Hector E. Schamis (2005). ‘Babel in Democratization Studies.’ Journal of Democracy 16 (4): 113–28. Bennett, Lance W. & Steven Livingston (2018). ‘The Disinformation Order: Disruptive Communication and the Decline of Democratic Institutions.’ European Journal of Communication 33 (2): 122–39. Bermeo, Nancy (2016). ‘On Democratic Backsliding.’ Journal of Democracy 27 (1): 5–19. Bogaards, Matthijs (2009). ‘How to Classify Hybrid Regimes? Defective Democracy and Electoral Authoritarianism.’ Democratization 16 (2): 399–423.
Brachet-Marquez, Viviane (2005). ‘Undemocratic Politics in the Twentieth Century and Beyond.’ In The Handbook of Political Sociology: States, Civil Societies, and Globalization, edited by Robert Alford, Thomas Janoski, Alexander Hicks & Mildred A. Schwartz, 461–81. New York: Cambridge University Press. Brancati, Dawn (2014). ‘Democratic Authoritarianism: Origins and Effects.’ Annual Review of Political Science 17: 313–26. Bratton, Michael (1997). ‘Deciphering Africa’s Divergent Transitions.’ Political Science Quarterly 112 (1): 67–93. Brinks, Daniel & Michael Coppedge (2006). ‘Diffusion is no Illusion: Neighbor Emulation in the Third Wave of Democracy.’ Comparative Political Studies 39 (4): 463–89. Brownlee, Jason (2007). Authoritarianism in an Age of Democratization. Cambridge: Cambridge University Press. Brownlee, Jason (2017). ‘The Limited Reach of Authoritarian Powers.’ Democratization 24 (7): 1326–44. Croissant, Aurel & Philip Lorenz (2017). Comparative Politics of Southeast Asia: An Introduction to Governments and Political Regimes. Basel, Switzerland: Springer International Publishing. De la Torre, Carlos (2017). ‘Hugo Chávez and the Diffusion of Bolivarianism.’ Democratization 24 (7): 1271–88. Di Lampedusa, Giuseppe (1960). The Leopard. Translation by Archibald Colquhoun. New York: Pantheon. Diamond, Larry Jay (2002). ‘Thinking about Hybrid Regimes.’ Journal of Democracy 13 (2): 21–35. Fjelde, Hanne (2010). ‘Generals, Dictators, and Kings: Authoritarian Regimes and Civil Conflict, 1973–2004.’ Conflict Management and Peace Science 27 (3): 195–218. Gandhi, Jennifer (2008). Political Institutions under Dictatorship. Cambridge: Cambridge University Press. Gandhi, Jennifer & Adam Przeworski (2006). ‘Cooperation, Cooptation, and Rebellion under Dictatorships.’ Economics & Politics 18 (1): 1–26. Geddes, Barbara (2004). ‘Authoritarian Breakdown.’ Manuscript. Department of Political Science, UCLA.
Gelman, Vladimir (2008). ‘Out of the Frying Pan, into the Fire? Post-Soviet Regime Changes in Comparative Perspective.’ International Political Science Review 29 (2): 157–80. Gelman, Vladimir (2010). ‘The Dynamics of Subnational Authoritarianism: Russia in Comparative Perspective.’ Russian Politics & Law 48 (2): 7–26. Gervasoni, Carlos (2018). Hybrid Regimes within Democracies: Fiscal Federalism and Subnational Rentier States. Cambridge: Cambridge University Press. Gibson, Edward L. (2010). ‘Politics of the Periphery: An Introduction to Subnational Authoritarianism and Democratization in Latin America.’ Journal of Politics in Latin America 2 (2): 3–12. Gilbert, Leah, & Payam Mohseni (2011). ‘Beyond Authoritarianism: The Conceptualization of Hybrid Regimes.’ Studies in Comparative International Development 46 (3): 270–97. Glasius, Marlies & Marcus Michaelsen (2018). ‘Authoritarian Practices in the Digital Age. Illiberal and Authoritarian Practices in the Digital Sphere – Prologue.’ International Journal of Communication 12: 3795–813. Hadenius, Axel & Jan Teorell (2007). ‘Pathways from Authoritarianism.’ Journal of Democracy 18 (1): 143–57. Helle, Svein-Erik (2016). ‘Defining the Playing Field: A Framework for Analysing Fairness in Access to Resources, Media and the Law.’ In Democratization and Competitive Authoritarianism in Africa, edited by Matthijs Bogaards & Sebastian Elischer, 47–78. Wiesbaden: Springer Fachmedien Wiesbaden. Howard, Marc Morjé & Philip G. Roessler (2006). ‘Liberalizing Electoral Outcomes in Competitive Authoritarian Regimes.’ American Journal of Political Science 50 (2): 365–81. Levitsky, Steven & Lucan A. Way (2002). ‘The Rise of Competitive Authoritarianism.’ Journal of Democracy 13 (2): 51–65. Levitsky, Steven & Lucan A. Way (2010). Competitive Authoritarianism: Hybrid Regimes after the Cold War. Cambridge: Cambridge University Press. Lindberg, Staffan I. (2006). 
‘The Surprising Significance of African Elections.’ Journal of Democracy 17 (1): 139–51. Linz, Juan J. (1964). ‘An Authoritarian Regime: The Case of Spain.’ In Mass Politics – Studies in
Political Sociology, edited by Erik Allardt & Yrjo Littunen, 251–89. New York: The Free Press. Linz, Juan J. (1990). ‘The Perils of Presidentialism.’ Journal of Democracy 1 (1): 51–69. Lipset, Seymour Martin (1959). ‘Some Social Requisites of Democracy: Economic Development and Political Legitimacy.’ American Political Science Review 53 (1): 69–105. Loveman, Brian (1994). ‘“Protected Democracies” and Military Guardianship: Political Transitions in Latin America, 1978–1993.’ Journal of Interamerican Studies and World Affairs 36 (2): 105–89. Lueders, Hans & Ellen Lust (2018). ‘Multiple Measurements, Elusive Agreement, and Unstable Outcomes in the Study of Regime Change.’ The Journal of Politics 80 (2): 736–41. MacKinnon, Rebecca (2011). ‘China’s Networked Authoritarianism.’ Journal of Democracy 22 (2): 32–46. Magaloni, Beatriz (2006). Voting for Autocracy: Hegemonic Party Survival and Its Demise in Mexico. Cambridge: Cambridge University Press. Magaloni, Beatriz (2008). ‘Credible Power-Sharing and the Longevity of Authoritarian Rule.’ Comparative Political Studies 41 (4–5): 715–41. Mainwaring, Scott & Matthew S. Shugart (1997). ‘Juan Linz, Presidentialism, and Democracy: A Critical Appraisal.’ Comparative Politics 29 (4): 449–71. Mainwaring, Scott & Aníbal Pérez-Liñán (2015). ‘Cross-Currents in Latin America.’ Journal of Democracy 26 (1): 114–27. McMann, Kelly (2006). Economic Autonomy and Democracy: Hybrid Regimes in Russia and Kyrgyzstan. Cambridge: Cambridge University Press. Merkel, Wolfgang (2004). ‘Embedded and Defective Democracy.’ Democratization 11 (5): 33–58. Miller, Michael K. (2015). ‘Democratic Pieces: Autocratic Elections and Democratic Development since 1815.’ British Journal of Political Science 45 (3): 501–30. Miller, Michael K. (2017). ‘The Strategic Origins of Electoral Authoritarianism.’ British Journal of Political Science, First view: 1–28. Morelle, Marie & Sabine Planel (2018). ‘Appréhender des situations autoritaires.
Lectures croisées à partir du Cameroun et de
l’Éthiopie.’ L’Espace politique. Revue en ligne de géographie politique et de géopolitique, no 35. Morlino, Leonardo (2011). Changes for Democracy: Actors, Structures and Processes. Oxford: Oxford University Press. Munck, Gerardo L. & Carol Skalnik Leff (1997). ‘Modes of Transition and Democratization: South America and Eastern Europe in Comparative Perspective.’ Comparative Politics 29 (3): 343–62. O’Donnell, Guillermo (1994). ‘Delegative Democracy.’ Journal of Democracy 5 (1): 55–69. O’Donnell, Guillermo & Philippe C. Schmitter (1986). Transitions from Authoritarian Rule: Tentative Conclusions about Uncertain Democracies. Baltimore: The Johns Hopkins University Press. Ottaway, Marina (2003). Democracy Challenged: The Rise of Semi-Authoritarianism. Washington, DC: Carnegie Endowment for International Peace. Przeworski, Adam & Fernando Limongi (1997). ‘Modernization: Theories and Facts.’ World Politics 49 (2): 155–83. Rød, Espen Geelmuyden & Nils B. Weidmann (2015). ‘Empowering Activists or Autocrats? The Internet in Authoritarian Regimes.’ Journal of Peace Research 52 (3): 338–51. Rutzen, Douglas (2015). ‘Civil Society under Assault.’ Journal of Democracy 26 (4): 28–39.
Schedler, Andreas (2002). ‘The Menu of Manipulation.’ Journal of Democracy 13 (2): 36–50. Simpser, Alberto (2013). Why Governments and Parties Manipulate Elections. Cambridge: Cambridge University Press. Svolik, Milan W. (2009). ‘Power Sharing and Leadership Dynamics in Authoritarian Regimes.’ American Journal of Political Science 53 (2): 477–94. V-Dem Institute (2018). Democracy for All? V-Dem Annual Democracy Report 2018. Sweden: V-Dem Institute. Waisman, Carlos H. (1989). ‘Argentina: Capitalism and Democracy.’ In Democracy in Developing Countries: Latin America, edited by Larry Diamond, Juan J. Linz & Seymour Martin Lipset, 71–129. Boulder: Lynne Rienner Publishers. Waldner, David & Ellen Lust (2018). ‘Unwelcome Change: Coming to Terms with Democratic Backsliding.’ Annual Review of Political Science 21: 93–113. Wigell, Mikael (2008). ‘Mapping “Hybrid Regimes”: Regime Types and Concepts in Comparative Politics.’ Democratization 15 (2): 230–50. Zakaria, Fareed (1997). ‘The Rise of Illiberal Democracy.’ Foreign Affairs 76 (6): 22–43.
48 Judicial Power Daniela Piana
Introduction The notion of judicial power refers to the decision-making processes enacted by a judge or by the system in which the judge operates, notably the court. This concept comprises both the semantics of ‘courts’ and those of ‘judiciary’. If ‘court’ is prevalently used to refer to the ‘agency’ dimension of the judicial power – under an ‘as if’ clause that assigns to the court the nature of a sole actor – the emphasis of the judiciary is on systems, branches and structures, in which are embedded organizational, professional and institutional norms conferring the authority to the agency – whether a single judge or a collective body, such as a court section – that adjudicates a dispute between private interests or between private and public interests (Raz, 1979). While there is a long tradition of scholarship addressing the ‘ought to be’ of a judicial branch in a constitutional State, empirical (and behavioral) analysis of the functioning of the judiciary dates back to the
early 1960s – especially in the United States, where courts have been innovatively framed within the broader spectrum of political institutions and, consequently, analyzed as actors (Dahl, 1957; Weingeist, 1997). Later, this approach was most prominent in analyses of supreme courts, interpreted as counter-majoritarian agents (Hirschl, 2004) or, in a different perspective, as devices adopted by political incumbents to secure the protection of the fundamental guarantees of freedom and equal access to power in democratic regimes (Ginsburg, 2003; Epstein et al., 2001). In the 1970s, and more widely in the 1980s, the judiciary began to be observed as an arena in which social groups, especially disadvantaged ones, mobilize resources to put pressure on rulers either prospectively, by influencing rule adoption in the legislative or executive branch, or retrospectively, by seeking to revise and mitigate the effects of previously adopted legal provisions (Epp, 1998; Commaille and Lacour, 2018). Comparative
perspectives since then have cast light both on the civil versus common law dichotomy – and the trespassing of its differences (Cappelletti, 1989) – and on the different models of judicial procedures in the civil and criminal justice fields. Several disciplines are interested in the behaviors, organization, institutional nature and cultural stances of judicial actors and judicial institutions. Beyond the many varieties of approach, to project scholarly attention toward the judiciary in the 21st century it is worth bearing in mind the following. The law and economics field addresses the impact that legal provisions have on the behavior of social and economic actors and stems from a rational choice model of agency (Posner, 1983); the law and society field – mostly developed in North America up to the end of the 1990s – addresses the interplay between social forces, groups, values and stakes within the law in action, that is, within the scope of action of the courts (Santos, 2006); and comparative politics is interested in the evolving patterns of the interplay of the branches within a State, or in the patterns of change featuring the judiciary as a system of multiple actors, such as prosecutors and judges, with judges at different ranks. Alongside a variety of epistemological positions on the role played by norms in sociopolitical systems and ultimately on the agency/norms matrix, all of these fields relate to legal scholarship, which covers both ordinary courts and supreme courts, as well as what has been called political jurisprudence (Shapiro, 1983).
One principle, many institutional designs Judicial power stands as the safeguard of individual rights against the potential abuse of power: ‘Preventing the abuses of powers means having in the legal system safeguards against arbitrariness; providing that the
discretionary power of the officials is not unlimited, and it is regulated by law.’1 This is to be taken as an ideal-typical position.2 Over the centuries, the development of modern States and the practices of Western liberal institutions gave birth to two different, but related, formal mechanisms to limit the power of the sovereign (and, broadly speaking, to limit the power of the executive branch). These are, first, its subjection at an early stage to natural law, and then afterward to parliament; and second, the separation of powers, based on the assumption that the three branches of government (legislative, executive and judicial) handle three different kinds of power (see Müller-Rommel and Vercesi, Chapter 45, and Patzelt, Chapter 49, this Handbook). These powers should perform their functions under the control of inter-institutional (inter-branch) accountability mechanisms, which ensure that no one power overrules the others. Independently of the way in which power has been bounded, judicial institutions have always been placed in a critical position: on the one hand, courts are of paramount importance in keeping public officials accountable to the law (Raz, 1979); on the other, the judicial branch is crucial in implementing the principle of separation of powers (Waldron, 2008). Within the scope of the jurisdiction, judicial functions interact with other key functions of the State, notably the prosecutorial function (Pakes, 2015). Beyond the vast range of models and policies relating to the functioning of the judiciary, judicial independence stands as the azimuth of all actions performed by the judicial function, either in discontinuing authoritarian traditions (see the example of the Southern European or Latin countries) or in strengthening the capacity to put into motion the principle of the rule of law (Larkins, 1996). 
In a way, the vast array of policies (setting up a Council of the Judiciary, reforming the mechanisms of judicial appointment, revising the mechanisms of judicial evaluation, incorporating into the court organization managerial
tools optimizing resource allocation, to offer just some examples) to structure or strengthen the judicial function are inspired by a clear desire for an effective rule of law (Garoupa and Ginsburg, 2009). A lack of guarantees of judicial independence (JI) may be fatal for the effectiveness of limitations of power and consequently for the effective implementation of the rule of law, as weak guarantees of judicial independence may leave judges unprotected from external influences. Judicial independence is, however, meant to refer to different aspects of the judiciary. Scholarly doctrine and institutional practices advocating, elaborating and putting into motion JI have also been extended to reflect upon the application of some key notions, such as external and internal independence, to the status of the prosecutor (see on authoritarian regimes Solomon, 2015). Scholars addressing the status of prosecutorial functions within liberal institutions have largely investigated the role of these functions within the criminal procedure, having in mind a typology based on the degree of autonomy enjoyed by public prosecutors in pursuing a crime and the structural stance public prosecutors take in relation to the judicial function. In countries where judges and prosecutors belong to the same body, a number of other aspects – such as career paths, or common versus differential cultures – are considered by comparative scholars (see Damaska, 2019). When scholars investigate the legal conditions of JI they refer to de jure JI, which in many countries is entrenched in the constitutional architecture of the State. This is the case in continental European, North American and Latin American countries and has been seen in recent transitions to democracy such as those in the Mediterranean region and in some of the MENA countries (Choudhry and Bass, 2014).
De jure JI is the primary goal for political regimes that shift from an authoritarian setting toward – even minimally – a set of constitutional guarantees, such as civil and political freedoms. The
example of the Balkan countries is telling in this respect. Despite the varieties of path followed to reach the standards of the so-called European model of rule of law, in all cases de jure JI was adopted in the constitutional setting of the State. The isolation of the judiciary from undue influences marked the shift from authoritarian rules to democratic or hybrid regimes during the Arab Spring. JI applies to the judicial function at the macro and the micro levels. It refers to the status of the judiciary in relation to the other branches of the State, and to the status of the individual judges within the judiciary, in relation to the highest ranked justice, and in relation to the external environment. The first of these is external JI (Russell and O’Brien, 2001), whereas the second is internal JI (Hayo and Voigt, 2003). We also need to refer to de facto JI, which relates to the actual functioning of the judiciary and to the actual behaviors adopted by judges. De facto JI is a category introduced into the scholarship to point to the gap between the guarantees entrenched in the legal systems and the contextual conditions where judges and judicial functions operate. Moreover, and especially with reference to the most recent experiences, this category allows for explanation of changes within the judiciary when formal norms do not change. The example of Italy is insightful. Without changing the constitutional provisions that ensure de jure JI, the Italian judiciary underwent two major waves of transformation, impinging upon the professional profile of judges – a de facto specialization – and the room for maneuver which lower judges enjoy in innovating jurisprudence in key sectors of social and political life. A survey of the models of judicial governance which coexist in Europe would reveal a spectacular variety of institutional solutions. 
Each solution combines a formal organizational design (identifying who does what in appointing, promoting, evaluating and checking judicial actors) with informal practices and ways of doing – organizational and legal cultures which differ from one country to
another. In the English system, legal norms took the form of jurisprudential decisions, through which ordinary judges paved the way for modern English law (Bell, 2006). Over the centuries, English lawyers have regularly been appointed as justices, whereas the promotion of judges has always depended on the reputation they gained at the bar. Usually the English model is explained in terms of the persistence of a common law-based legal system. However, a closer look points to other factors, including social pluralism, low fragmentation in the political system and stability of the institutional framework. This has remained largely true after the introduction of a new and highly significant institution, the supreme court, whose jurisdiction reproduces within the British system the model of last resort adopted in most Western countries and in several countries that have moved from an authoritarian legacy toward the implementation of a minimum core of constitutional guarantees of fundamental freedoms. In continental Europe, due to the more straightforward importance of the sovereign State, it is possible to identify two models – French and German (Guarnieri and Pederzoli, 2002). In the first model the main mechanism used to ensure that judicial decisions comply with the law is the creation of a bureaucratic model of judicial governance. Judges are selected according to the same model of recruitment used for civil servants, and so are driven to develop an esprit de corps (Troper, 1980). Both socialization and respect for seniority guarantee the coherence of judicial decisions. Judicial interpretations should be strictly residual, not creative. This model, which spread across Southern Europe and Belgium along with the establishment of Napoleonic rule, underwent a process of radical change after the end of World War II.
A pure model of self-government has been adopted by Southern European countries throughout their democratic transitions. After the fall of pre-war authoritarian regimes,
High Judicial Councils were introduced in Italy, Spain, Portugal and Greece in order to insulate judges from the influence of the executive (Toharia, 1975). In this model, the High Judicial Council appoints, promotes, evaluates and trains judges. The second model, of German tradition, relies on the idea of the Rechtsstaat. It interprets the legitimacy of the law in terms of procedural correctness, which ultimately depends on the respect legal norms exhibit for the Grundnorm, the fundamental rule of the State (Kommers, 1997). The State, which is the depositary of the Grundnorm, is endowed with the power to issue the norms of the legal system. The legal accountability of judges functions predominantly as a guarantee of judicial independence. In this system, undue interference with the judiciary is not expected from the executive, but rather from the legislative arm. The risk of an overwhelming majority which overrules the fundamental rule of the constitutional State is avoided by adopting a strong constitutional mechanism of judicial review. The review is operated by an ad hoc institution, specialized in monitoring the formal and substantial coherence of the statutory law with the Grundnorm. This institution is the constitutional court. The idea of a specialized court granted full and unique responsibility for reviewing the statutory laws and providing the last resort for the protection of fundamental rights is the result of a historical process marked by dramatic turning points, including World War II. Whereas over the centuries – from the introduction of the Magna Carta onward – the judiciary was meant to work at the level of the ordinary courts and to act in a bounded scope of action where the executive was not allowed to intervene, the 20th century saw growing distrust of the legislative branch and the power of the majorities.
Rules adopted in the parliaments were not automatically legitimated by their democratic roots: citizens have seen concrete possibilities of falling victim to rules adopted by oversized majorities subverting – under the label of democratic legitimacy – the rule
of law (Ferejohn, 1998; Stone Sweet, 2000). This is the rationale inspiring the introduction of the constitutional courts. The choice to give to the judge the ultimate protection of individuals’ rights is a mark of the democratization processes unfolding in many different regions, from Central and Eastern Europe to Latin America.
Judicial reforms beyond judicial independence
If we shift from a macro-systemic approach to a micro approach – that is, an actor-centered one – new and insightful dimensions may be found, notably in relation to the organizational and professional accountability of the judiciary. Once again, there are two models, both coexisting in the different continents. In one model, the appointment, selection and promotion of judges are based on (1) candidate justices’ general legal knowledge and (2) seniority. This yields a strictly bureaucratic model of judicial governance. Throughout their processes of socialization judges learn to comply with the rule of law, which imposes a strictly transparent and equal application of legal norms. In each model, ethical norms and legal ideologies are transmitted and enforced by different institutions that correspond to different social groups (Damaska, 1986). Judicial actors adjudicate according to the norms – interpretative principles and deontological standards – which they learn from senior judges, legal actors and legal scholars. In the UK model, the bar is the source of the behavioral standards with which justices comply. In the continental model, legal scholars and senior justices represent the group of reference for ordinary judges, even if the introduction of the High Judicial Council unbalanced this delicate mechanism. Indeed, the organizational innovation represented by the High Judicial Council carries a surreptitious change in the allocation of power within the judicial
branch. Whereas the High Judicial Council maximizes the external independence of justices from the other branches of the State, it challenges and ultimately weakens the internal hierarchy of the judiciary through the introduction of a democratic principle into judicial governance. Members of the High Judicial Council are elected by ordinary judges. This undermines the cohesiveness of the judicial hierarchy because of the equal status awarded to all judges on the basis of the principle ‘one man, one vote’ (Volcansek, 1992). With regard to the functioning of supreme courts, a rich scholarship has developed on the conditions that place these courts in an adequate position to exercise a proper review of the statutory laws. These conditions, which are primarily (even if not exclusively) instantiated in the patterns of judicial appointment and judicial tenure, impinge upon the balance between elected and unelected actors in the overall design of a constitutional democracy. A more specific but no less important dimension regards the supreme court's method of deliberation and decision making, where the device of the “dissenting opinion” introduces into the supreme court an “arena” dimension in which different visions and cultures find an echo and visibility (Baum, 1990). This is a bridge between a comparative analysis of the supreme courts and analysis of the dialogue among the courts (Dallara and Piana, 2015). Professional dialogue, reference to the external public and internal interactions among justices serving within the same jurisdiction, whether ordinary or supreme court, together belong to the field of judicial accountability. Dimensions of judicial accountability have been investigated in both comparative political studies and organizational analysis. The role played by the organizational dimension calls for a new scholarly approach which has been reflected both in the policy discourse and in the judicial reforms.
Speaking of the judiciary by referring to the concept of JI may lead scholars and practitioners to assess the way democracy and rule of law
are intertwined on a very superficial level. Formal institutional designs provide a fairly accurate view of how the judicial branch is inserted in the broader context of a constitutional system, but they do not say much about how, in reality, courts hold rulers and ruled accountable, or about the ways in which courts are themselves subjected to mechanisms of control and accountability. This point is strengthened further by empirical evidence provided by the processes of change triggered in democratizing countries by the reforms that have set up JI guarantees. The democratization waves unfolding in Southern Europe and Latin America represented a quasi-experimental context to assess the impact of the JI guarantees on the judiciary’s capacity to meet citizens’ demands. These countries opted for the encapsulation of the judiciary in order to isolate it from potentially undue interferences, such as from former elites (not fully won over to democratic rule, O’Donnell and Schmitter, 1986), from the army (Hilbink, 2007) or from clientelistic social groups (Couso, 2011). In Europe, this view has been supported by the Council of Europe in particular. Furthermore, the European Commission endorsed the same approach and promoted a spectacular range of policies within the pre-accession strategy designed to fill the institutional gap of the candidate countries from 1995 to 2007, all pivoting on the key idea that a self-governing judiciary may be able to come to terms with the totalitarian or post-totalitarian rule in the Central Eastern European countries (Selznick, 1999; Schwartz, 2000). Outside the EU a wave of self-governing judiciaries has swept across the Latin American countries, imitating the models set up in Southern European countries such as Spain and Portugal. Today, in the Balkans, the promotion of the rule of law incorporates sanguine prescriptions about the institutional design to be followed by judicial governance.
Yet, the evidence collected in the aftermath of the first wave of judicial reforms adopted in homogeneous regions – Central and Eastern Europe, the
Balkans, Latin America – revealed a relationship between JI and the rule of law less sanguine than had been expected or promised. This evidence offers strong and compelling reasons to shift the focus from JI to judicial accountability (JA). The idea underpinning this shift is that justice administration is a public sub-sector and should be held accountable from the point of view of the capacity of delivering a good service to users – citizens – and that of allocating money with a strict instrumental rationality. Remedies suggested come from best practices experienced in more advanced countries – countries that rank highly from the point of view of courts’ efficiency – and from the development of common standards which serve as common transnational reference points to assess the quality of national and sub-national judicial offices. Judicial offices respectful of the rule of law should be efficient in delivering judicial decisions in due time, should be transparent in the way they manage their resources and should introduce IT instruments to facilitate information processing and public communication. Innovation has become common sense when policy makers are being asked to resolve and remedy issues in the judicial sector, such as unreasonable timeframes, uneasy or unfair access to the courts, lack of confidence in the bench on the part of the general public, and so on. This has entailed a growing commitment to inject into the traditional systems of judicial governance new organizational practices and policies originating in other systems or offices. Having judicial accountability as an azimuth, the European member States underwent different waves of judicial reform. First along this path were the Scandinavian countries and the Netherlands. In the late 1990s and in the early 2000s a number of reforms targeting the efficiency and transparency of the judicial systems were enacted. 
In the Netherlands, for instance, comprehensive reform of the court system was adopted in 2002 and implemented through the decade
2002–12. This included the introduction of a completely new institution, the Council for the Judiciary. In a completely different context, the Singapore Supreme Court played a leading role in advocating for an efficiency-oriented reform of both procedural and organizational provisions structuring the civil justice system. Even though it may appear a simplification, JA-oriented measures seem to yield less effective and less coherent results when measured across the entire domestic territory. The judicial offices that feature more favorable conditions for implementing new public management tools score higher on performance than offices where the lack of managerial capacity of the chief justice, the lack of specialized clerks and the lack of robust daily work organization undermine the implementation of these tools. In this respect, the case of Italy shows that governmental staff – at the level of ministerial officers – are transformed by the diffusion of a new public management strategy to overcome resistance and inertial effects within the judiciary. In the aftermath of these new waves of judicial reform, some key aspects have emerged. Efficiency-oriented policies do not ensure equal treatment of citizens. In some countries, where local organizational contexts feature specific characteristics and retain a considerable degree of autonomy from the center, equal treatment still seems to be a prospective goal rather than an achievement. In a way, the quality of justice policy stream aims to provide an answer to this problem. Scholars and practitioners have been discussing the quality of justice since the early 1990s. This has come alongside the growing interest in court management and managerial and public accountability as they are applied to the judicial sector (Voermans, 2007; Piana, 2010). The reasoning behind this new policy stream is very simple.
A fair trial is not only expected to be respectful of the procedural codes and fundamental rights, but it should also comply with the standard of a reasonable timeframe. Moreover, it is desirable that
a judicial institution is not only independent, but also transparent and predictable in terms of results and resource management. This leads to the introduction of a conception of quality of justice that goes beyond the principle of the rule of law, while still incorporating it. A further critical aspect calls for a comparative assessment. Beyond the differential paths followed by judicial actors to interact with the sociopolitical systems where they operate, including the transnational level (Epp, 1998), the growth of the judicial power has catalyzed a process of judicialization of politics, which is mirrored in a vast and compelling variety of cases and examples, from settling conflicts within judicial arenas rather than legislative or executive ones to shifting the bulk of the rule making from statutory laws to case law and jurisprudence (Stone Sweet, 2000; Commaille and Dumoulin, 2010).
The judiciary under assessment: a comparative and international policy stream
Over the past decades, judiciaries have been experiencing recurrent and interlaced waves of reform (Vigour, 2018). There are a number of reasons for this. First and foremost, the scope of the judicial function has had to expand. Increasingly complex and intensive litigation has demanded a deeper and wider response from judicial institutions in many countries and, with higher significance, in countries featuring high levels of fragmentation or cultural polarization (Morlino and Sadurski, 2010). These phenomena have provoked overload of the judicial institutions and have called for a reallocation of resources within the administrative services attached to them. Second, the economic crisis which hit the eurozone in 2007 and 2008 forced public institutions to rethink their human resources endowment and rationalize their expenditure schemes
(Mascio and Natalini, 2014). The same is true of the judicial sector, which experienced a comprehensive process of rationalization in the budget allocation scheme. In many countries the role played by IT-based tools in the improvement of court management has been praised under the auspices of an efficiency-oriented approach (Frydman and Jeuland, 2011) warmly welcomed as a reaction to the crisis. In regions where politics shifted from authoritarian rule toward a hybrid regime (see Morlino, 2011: chapter 3, and Gagné and Mahé, Chapter 47, this Handbook) the status of the judicial branch within the constitutional design of the State gained a spectacular emphasis and ranked high among the priorities of the incumbent elites. For all these reasons, combined with the streamlining effect entailed by the promotion of efficiency-oriented policies and quantitative methods of performance assessment, the current state of the art in terms of data and information covering the judiciaries and their functioning is very promising and rich. Judiciaries have risen to the top of the international agenda on good governance, inclusive growth, poverty reduction and the promotion of equal treatment. This goes hand in hand with a comprehensive process of standard setting in the European area. For more than two decades, a vast repertoire of instruments, such as checklists, recommendations, monitoring and assessment tools, benchmarks and so on, has been developed and subsequently diffused across the countries that adhere to European institutions such as the EU and the Council of Europe. Judicial networks, particularly those created in the frame of the Council of Europe and given the responsibility of discussing and setting standards of rule of law, have been instrumental to this process.
During the judicial networks’ meetings, domestic representatives (in some cases appointed by the national governments, in others appointed by judicial institutions, such as the judicial schools or Councils for the Judiciary) embark on a process of data
collection, benchmarking, assessment and standard setting with respect to several key dimensions of the judicial systems, such as trial timeframe, access to courts, communication to the broad public and resource allocation per inhabitant. The avenue taken by the European institutions in the judicial sector is not new for the international and transnational setting. First of all, we observe a conceptual shift, from a policy discourse centered exclusively on JI toward a policy discourse centered on the quality of justice (Fabri et al., 2005). The latter seems to have rephrased the concept of rule of law by adding to impartial and lawful adjudication other principles, such as the actual possibility to access the court system, the transparency of the court management and the efficiency of the resource management scheme adopted by courts. In short, the addition consists of the conditions that ensure JA. A high number of non-legally binding norms has made its appearance in the EU as one of the most path-breaking outcomes of a transnational standard-setting process targeting the administration and the organization of domestic courts and public prosecutor offices. Several types of standards have been put forth: reasonable timeframe, equal access to justice, efficient financial management, effective public communication, and so on. In order to ensure both the measurement and (consequently) the quantitative assessment of the judicial systems, concepts such as timeframe, delays and fair trial have been unpacked and translated into indicators. The operationalization of the quality of justice was a new avenue to compare systems that proved to be resistant to mere integration or quite different and divergent in terms of their own strategies for dealing with court overload or challenging cases (involving children, refugees, or ethical and religious issues).
In general, if, by any chance, a European citizen had the opportunity to observe the European judicial systems from an external point of view, she would be in a position to spot huge differences in the way trials take
place and surely in the way the law is used, applied and enforced. ‘Differences’ do not refer here to legal norms (the European system is still a system of 28 national legal systems existing underneath the EU law). Rather, ‘differences’ refers to the organization, the staff, the services offered to users and the number of mechanisms of public and social accountability under which judicial staff are held. Despite the several different conceptions of rule of law that have been endorsed within these policies (Piana, 2010), the programs and projects financed by the World Bank, the Council of Europe, the European Commission, USAID and many other actors all converge on the pivotal roles played by the procedural dimension of the rule of law, by the impact that the legal system may have on institutional stability and social prosperity and by the entrenchment in constitutional or statutory laws of the guarantees of judicial independence. The available datasets reflect the process of reframing that has been described so far. From early-stage datasets focusing on legal provisions of JI – both external and internal, but mostly de jure rather than de facto – international organizations moved toward a more comprehensive approach centered on the enforcement of the right to a fair trial, measured in terms of trial timeframe, efficiency and effectiveness of the judiciaries. In the second decade of the 21st century, a new approach has appeared, bridging the United Nations’ Agenda 2030 and the previous positions by pointing to the role played by access to justice or, in other words, by the barriers encountered by citizens in their pathways toward justice. The engagement of the international organizations has been relaunched in the aftermath of the adoption of Agenda 2030, which includes the purpose of increasing the equality of everyone in accessing reliable, impartial and accountable mechanisms of rights enforcement.
The combination of these factors impinged upon the design of the judicial reforms and demanded that the national elites engage in long term agendas,
a condition whose fulfilment has depended on several factors closely related to the culture, the legacy and the distribution of veto players in each country. Four groups of assessment exercises are worth consideration for scholarly analysis:
a) Datasets covering judicial reforms and structure. This first type includes the Consultative Council of European Judges, operating in the framework of the Council of Europe (CoE), which monitors the 47 judiciaries of the CoE member States, and the Venice Commission dataset, where the official documents of the reforms and the advisory body assessment reports are available for all 47 CoE member States. Moreover, the Varieties of Democracy (V-Dem) dataset includes a set of indicators touching upon the balance of power – executive versus judiciary as well as legislative versus judiciary – for all States from 1900 to the present.
b) Datasets covering the organizational dimensions of the judiciary and the efficiency of the judicial systems. Most prominent in this group is the example of the CEPEJ (Commission européenne pour l'efficacité de la justice), once again framed under the rule of law program of the Council of Europe and a frontrunner in quantitative analysis of the judiciary. Data are collected over two years and are validated through a participative mechanism where domestic judicial institutions are directly involved.
c) Datasets addressing the key issue of fair trial protection with a comprehensive and context-sensitive approach. The best case available today is the World Justice Project. Set up in the early 2000s with the goal of assessing protection of the right to due process, it has become a leading voice in the assessment of the quality of justice. Indicators cover both structural and functional dimensions for all countries in the world.
d) Datasets framing the justice institutions within the broader spectrum of the public institutions.
This shift has played a major role in extending analysis of the judiciary. The OECD has played and still is playing the role of leader in this direction. Datasets focusing on the judiciary are therefore taken into consideration in several outlook exercises carried out on a regular basis by the OECD. In the same vein, the European Commission has launched a new exercise where the judiciaries are monitored. This is the EU
Justice Scoreboard, which combines data from the CEPEJ dataset and data provided by other important international platforms, such as the World Economic Forum and the World Bank.3
Pathways to quality of justice: standards, policy transfers, innovations
The avenue opted for in the judicial sector is not new for the international and transnational setting. The strategy, which consists in governing courts by standards, is largely accepted in most international organizations, such as the World Bank and the OECD (Genn, 1999). In Europe the development and the diffusion of standards of quality of justice have taken the shape of a cross-border and transnational policy operating through mechanisms of horizontal learning. The concept of ‘quality of justice’ seems to have reshaped the notion of rule of law by adding to impartial and lawful adjudication other principles, such as the actual possibility to access the court system, the transparency of the court management and the efficiency of the resource management scheme adopted by courts. A high number of non-legally binding norms has made its appearance in the European Union as one of the most path-breaking outcomes of a transnational standard-setting process targeting the administration and the organization of domestic courts and public prosecutor offices. Several types of standards have been put forth, such as reasonable timeframe, equal access to justice, efficient financial management and effective public communication. The operationalization of the quality of justice came as a new avenue to compare the domestic judicial systems. More prominently than in other sectors, the interplay between ordinary courts and supranational courts performs differently and has a different impact on the protection of fundamental rights of citizens. Comparison of the European Court of Human Rights and the
Inter-American Court of Human Rights operating in Latin America revealed that what matters for effective protection of rights beyond domestic borders is a combination of the institutional legacy and mutual trust among the judiciaries. In many cases differences in rights enforcement are not due to legal norms. Rather, they stem from the organization, the staff, the services offered to users and the number of mechanisms of public and social accountability with which the judicial staff has to comply. The quantitative turn went hand in hand with the managerial turn in the international discourse about the judicial branch. This has made it possible to address similar functional problems arising in different countries, even though the contexts may have differed in terms of domestic legal and political culture. In a way the core of the promotion of the rule of law shifted toward a more easily measurable strategy, such as the promotion of the quality of justice (Hammergren, 1998; Albers, 2001; Piana, 2010). The shift from a judiciary functioning as a system to a judiciary that works as a policy arena entails several important consequences. Among these are the appearance and the empowerment of new policy actors promoting judicial policies without creating pressure for any constitutional or statutory change. This holds also in contexts where the rulers and the change agents are currently addressing the issue of reforming the judiciary by referring to a multidimensional framework, such as the one adopted at the international level. This is the case, for instance, in countries such as Georgia, Kazakhstan and Moldova, where legal and management experts are supporting the agenda setting of the reforms under the auspices of the Council of Europe, the OECD and the World Bank.
The rise of a transnational policy discourse ensuring the quality of the justice system with a standardized set of notions paved the way toward what can be qualified as a process of networking within the rule of law institutions – the ordinary and
supreme courts (Ackerman, 1997; Tate and Vallinder, 1995). Consequently, the judicial power displays both the hallmarks of a closed system, characterized by the boundaries and the authority of the legal system, and the aspects of an arena where actors, notably judges, promote organizational strategies of quality of justice or engage in intensive dialogues with other courts across domestic borders. At the level of the ordinary court systems, the promotion of specific solutions is based on the endorsement of a user-oriented approach that frames the judicial reforms and leads them in an output-oriented direction. The actual entrepreneurship of the domestic institutions and the implementation of these types of normative inputs – that is, non-legally binding norms – depend mainly on the availability of capable actors, of legitimate and influential policy entrepreneurs, and of domestic facilitating conditions in terms of political competition and the organizational forces at work in the judicial field. This is why analysis of a critical case, the Italian one, may cast new light on the potential consequences – including the unintended effects – of the judicial reforms driven by the quality-of-justice mainstream. And yet, beyond the different patterns of interaction entailed by standards and legal provisions, the reasoning behind their implementation converges in at least one sense: implementation processes are deployed on the basis of a deductive rationality, which goes from an abstract principle – a norm – to a specific case – a practice or behavior that instantiates the norm. Most of the standards are worded in abstract terms. For example, ‘budget transparency’ is a principle. How to implement it and turn it into a practice depends on a few factors that progressively narrow the possible behavioral options that can be adopted consistently with the abstract principle and, possibly, with the specific context of application.
The same logic holds in the case of the principle of equal access to the judicial mechanisms of dispute resolution. The term ‘equal access’ is abstract: this way of
normative wording is suitable for a wide and differential implementation. This principle is put into motion by means of a specific set of tools, mechanisms, organizational solutions and policy designs. The judiciary as a policy arena represents more than a metaphor. Twinning and bilateral or multilateral cooperation projects have entailed large-scale socialization activity (Dallara and Piana, 2015; Paine, 2016). Socialization and training were thus not just specific objectives of cooperative projects, but also spill-over effects of other projects, most of which were based on peer review, cross-border discussion or teaching. The role played by experts in standard-driven policy transfer experiences can be summarized in the following way. Experts are expected to be aware of the standards and to formulate an abstract model of what should be done under general conditions. Moreover, they are selected and appointed in projects of judicial cooperation because of their knowledge of the system or of the organization from which the practice to be transferred originates. They visit the beneficiary or recipient organization, identify the conditions that can facilitate the transfer and/or address the barriers that can obstruct it. The expert operates in the recipient organization for a short or medium timeframe. A number of training sessions are planned to create awareness and to train the staff of the organization, which, once the project reaches its end, must be capable of managing the new practice and incorporating it into its own way of doing things. Despite the quality of the design of the transfer and the quality of the practices transferred, however, in a large number of cases the a posteriori audit showed that the internalization of the norms and their translation into routinized practices easily fail.
In countries where old models have been transferred – such as the Albanian case, where the Italian model of self-government has been adopted – the actual impact of the transfer remains a subject for further empirical analysis.
Perspectives
Patterns of change triggered by the processes of policy transfer and by the implementation of international standards of quality of justice also characterize the interplay between justice and technology. The interaction between technological innovation and the justice system is now an established fact for anyone working in the sector. At different levels of jurisdiction, lawyers, magistrates and administrative staff have integrated the technological component into the everyday functioning of the procedures and the organizations that respond to the need to concretely implement the right to due process. Today, however, the reality – and, above all, the potential for change linked to the diffusion of new technologies and instruments, both computational and digital – goes far beyond the phenomenon of the dematerialization of public administration, which has meant above all the transformation of civil procedure in the justice sector. From a general point of view, the introduction of technological tools in this sector has been interpreted and supported as one of the most important paths to ensuring the efficiency of the system. On the demand side, technology meets the world of justice in terms of the information available to the parties – for example, by making use of open government mechanisms that also affect the content of judicial decisions or the contents of the implementation documents – and at the level of the representation of the problem – for example, the lawyer or the mediator can be assisted by the results of research undertaken with computational devices that analyze massive databases, so-called big data, identifying trends, specificities and median points in terms of jurisprudential orientations or guidelines on compensation.
Again, on the demand side, but taking into account the first forms of interaction between the demand and supply of justice, the flow of documents can be profoundly transformed by the availability of digital platforms and digital devices. For these reasons – among which the search for
more efficiency and more accessibility is prominently highlighted – digital devices for justice administration and justice delivery are rapidly and relentlessly rising to the top of the priority lists outlined by policy makers, domestic and international institutions, regulators, legal experts and consultants. Changes triggered by the encounter between technological tools and justice institutions are countless and span a wide range of dimensions – procedural, substantive, organizational, communicational – all of which impinge upon the way in which rights are protected and enforced. Technology has often been welcomed by those who exposed the failures of the State in providing timely and efficient responses to the demands for services coming from citizens, social groups and companies. If we consider these new phenomena in a comparative perspective, e-justice, cyber justice, digital justice and predictive justice encompass a vast array of aspects, none of which is confined to the perimeter of simple efficiency, as important as it might be. Modernizing justice institutions, differentiating conflicts and settlement mechanisms, and redesigning the matrices where the demand for and supply of justice meet entail much more than simply increasing the speed of case management, reducing the costs of access to justice and – inasmuch as it becomes readable and open – making the justice system transparent to non-experts in law. In the second decade of the 21st century, the Los Angeles Police Department adopted a method for predicting the propensity for criminal behavior based on inductive analytics of a massive dataset on individual behaviors, typical situations and descriptive accounts of the mindsets of types of social actors. Back in 2013, Professor Joel Caplan of the Rutgers School of Criminal Justice highlighted that the approach described above is oriented toward short-term objectives.
Police officers can stop criminal activities in a particular area, only to allow them to occur elsewhere. Alternatively,
the potential criminals may return once the police officers leave. Therefore, a more sustainable method would be to perform risk terrain mapping. The crime history of a certain region is merged with local behaviors to define crime-prone areas, thus considering the impact of the environment as well. This goes far beyond the pure case management adopted by the courts to speed up the trial time frame, or simply to ensure that the latter is kept under the control of quality (i.e. efficiency and effectiveness-oriented) management. In the UK and the Netherlands, online dispute resolution has been adopted, and many scholars and practitioners are ready to argue for the great potential of these mechanisms. Once the documents associated with a procedure of civil litigation are scanned and available in digital format, analysis of a very large number of documents may qualify as a case of data analytics. In 2017 the Court of the 9th Circuit of California raised a critical case seeking to define who is responsible for a final ruling reached with the support of both human and artificial intelligence, human reasoning and big data analysis. The horizon opened up by the introduction of artificial intelligence into the courts offers empirical and theoretical reasons for scholars and practitioners to revisit the traditional notion of power. Is a ruling elaborated on the basis of both legal and mathematical elements and contents, providing insights about past trends in the jurisprudence for specific types of cases, still the instantiation of a judicial power? Whether this represents a new season for the judiciary or the first embryonic age of a new paradigm in judicial politics, only time will tell.
Notes
1 https://www.venice.coe.int/WebForms/pages/?p=02_Rule_of_law&lang=EN
2 We refer to the Weberian notion of the ideal-type to disentangle the functional core of all judicial powers, regardless of the context in which they operate, the structures they feature and the patterns of change they experience.
3 See https://www.coe.int/en/web/cepej/dynamic-database-of-european-judicial-systems; https://ec.europa.eu/info/policies/justice-and-fundamental-rights/effective-justice/eu-justice-scoreboard_en; https://worldjusticeproject.org/; https://www.v-dem.net/en/; https://www.imf.org/en/Publications/WEO/Issues/2019/03/28/world-economic-outlook-april-2019; http://www.oecd.org/gov/access-to-justice.htm
References
Ackerman, B. (1997) ‘The Rise of World Constitutionalism,’ Virginia Law Review 83(4), 771–97.
Bell, J. (2006) Judiciaries within Europe: A Comparative Review. Oxford: Oxford University Press.
Cappelletti, M. (1989) The Judicial Process in Comparative Perspective. Oxford: Clarendon Press.
Choudhry, S. and Bass, K. G. (2014) Constitutional Courts after the Arab Spring: Appointment Mechanisms and Relative Judicial Independence. New York: The Center for Constitutional Transitions at NYU Law & International IDEA Reports.
Commaille, C. and Dumoulin, L. (2010) ‘From Critiquing Capitalism to Realizing Democracy via the Law,’ [Présentation du dossier ‘De la critique du capitalisme à la réalisation de la démocratie par le droit?’] Droit et Société 76(3), 513–21. https://www.cairn.info/revue-droit-et-societe1-2010-3-page-513.htm
Commaille, C. and Lacour, S. (2018) ‘Legal Consciousness Studies as a Laboratory of a Renewed System of Knowledge about Law: Presentation of the Special Report,’ Droit et Société special issue ‘After Legal Consciousness Studies’, 100(3), 559–68.
Couso, J. (2011) Constitutional Law in Chile. Dordrecht: Kluwer.
Dahl, R. A. (1957) ‘Decision Making in a Democracy: Supreme Court as a National Policy Maker,’ Journal of Public Law 6, 279–95.
Dallara, C. and Piana, D. (2015) Networking the Rule of Law. London: Ashgate.
Damaska, M. (1986) The Faces of Justice and State Authority. New Haven: Yale University Press.
Damaska, M. (2019) Evaluation of Evidence: Pre-Modern and Modern Approaches. Cambridge: Cambridge University Press.
Epp, C. (1998) The Rights Revolution. Chicago: University of Chicago Press.
Epstein, L., Knight, J. and Shvetsova, O. (2001) ‘The Role of Constitutional Courts in the Establishment and Maintenance of Democratic Systems of Government,’ Law and Society Review 35(1), 117–64.
Fabri, M., Jean, J.-P., Langbroek, P. and Pauliat, H. (eds) (2005) L’administration de la justice en Europe et l’évaluation de sa qualité. Paris: Montchrestien.
Ferejohn, J. (1998) ‘Independent Judges, Dependent Judiciary: Explaining Judicial Independence,’ Southern California Law Review 72(2/3), 353–84.
Frydman, B. J. and Jeuland, E. (2011) Le nouveau management de la justice et l’indépendance des juges. Paris: Dalloz.
Garoupa, N. and Ginsburg, T. (2009) ‘Guarding the Guardians: Judicial Councils and Judicial Independence,’ American Journal of Comparative Law 57(1), 201–34.
Genn, H. (1999) Paths to Justice: What People Do and Think about Going to Law. Oxford: Hart.
Ginsburg, T. (2003) Judicial Review in New Democracies: Constitutional Courts in Asian Cases. Cambridge: Cambridge University Press.
Guarnieri, C. and Pederzoli, P. (2002) The Power of Judges. Oxford: Oxford University Press.
Hammergren, L. (1998) Fifteen Years of Judicial Reform in Latin America: Where We Are and Why We Haven’t Made More Progress. New York: USAID Global Center for Democracy and Governance.
Hayo, B. and Voigt, S. (2003) ‘Explaining De Facto Judicial Independence,’ Volkswirtschaftliche Diskussionsbeiträge 46/03, Dept. of Economics, University of Kassel.
Hilbink, L. (2007) Judges Beyond Politics in Democracy and Dictatorship: Lessons from Chile. Cambridge: Cambridge University Press.
Hirschl, R. (2004) Toward Juristocracy: The Origins and Consequences of the New Constitutionalism. Cambridge: Harvard University Press.
Kommers, D. P. (1997) The Constitutional Jurisprudence in the Federal Republic of Germany, 2nd ed. Durham: Duke University Press.
Larkins, C. (1996) ‘Judicial Independence and Democratization: A Theoretical and Conceptual Analysis,’ American Journal of Comparative Law 44(4), 605–30.
Mascio, F. D. and Natalini, A. (2014) ‘Austerity and Public Administration: Italy Between Modernization and Spending Cuts,’ American Behavioral Scientist 58(12), 1634–56.
Morlino, L. (2011) Changes for Democracy. Oxford: Oxford University Press.
Morlino, L. and Sadurski, W. (eds) (2010) Democratization and the European Union: Comparing Central and Eastern European post-Communist Countries. London: Routledge.
O’Donnell, G. A. and Schmitter, P. C. (1986) Transitions from Authoritarian Rule: Tentative Conclusions about Uncertain Democracies. Baltimore/London: Johns Hopkins University Press.
Pakes, F. (2015) Comparative Criminal Justice, 3rd ed. London: Routledge.
Piana, D. (2010) Judicial Accountabilities in New Europe: From Rule of Law to Quality of Justice. Farnham: Ashgate.
Piana, D. (2016) ‘Quality of Justice as an Institutional Game,’ Journal des Economistes et des Etudes Humaines 22(2), 165–89.
Posner, R. (1983) The Economics of Justice. Cambridge: Harvard University Press.
Raz, J. (1979) The Authority of Law. Oxford: Oxford University Press.
Russell, P. H. and O’Brien, D. M. (eds) (2001) Judicial Independence in the Age of Democracy: Critical Perspectives from around the World. Charlottesville: University of Virginia Press.
Santos, A. (2006) ‘The World Bank’s Use of the “Rule of Law” Promise in Economic Development,’ in D. M. Trubek and A. Santos (eds), The New Law and Economic Development: A Critical Appraisal (pp. 253–300). Cambridge: Cambridge University Press.
Schwartz, H. (2000) The Struggle for Constitutional Justice in Post-Communist Europe. Chicago: University of Chicago Press.
Selznick, P. (1999) ‘Legal Cultures and the Rule of Law,’ in M. Krygier and A. Czarnota (eds), The Rule of Law after Communism: Problems and Prospects in East-Central Europe (pp. 21–38). Aldershot: Ashgate.
Shapiro, M. (1981) Courts: A Comparative and Political Analysis. Chicago: University of Chicago Press.
Shapiro, M. (1983) ‘Recent Developments in Political Jurisprudence,’ Western Political Quarterly 36(4), 541–8.
Sieder, R. and Costello, P. (1996) ‘Judicial Reform in Central America: Prospects for the Rule of Law,’ in R. Sieder (ed.), Central America: Fragile Transition (pp. 169–211). London: Institute of Latin American Studies; Palgrave Macmillan.
Solomon, P. (2015) ‘Law and Courts in Authoritarian States,’ in the International Encyclopedia of Social and Behavioral Sciences, 2nd edition (online). Elsevier.
Stone Sweet, A. (2000) Governing with Judges: Constitutional Politics in Europe. Oxford: Oxford University Press.
Tate, C. N. and Vallinder, T. (eds) (1995) The Global Expansion of Judicial Power. New York: New York University Press.
Toharia, J. J. (1975) ‘Judicial Independence in an Authoritarian Regime: The Case of Contemporary Spain,’ Law and Society Review 9(3), 475–96.
Troper, M. (1980) La séparation des pouvoirs et l’histoire Constitutionnelle Française, I. Paris: LGDJ.
Vigour, C. (2018) Réformes de la justice en Europe: Entre politique et gestion. Louvain-la-Neuve: De Boeck Supérieur.
Voermans, W. (2007) ‘Judicial Transparency Furthering Public Accountability for New Judiciaries,’ Utrecht Law Review 3(1), 148–59.
Volcansek, M. L. (ed.) (1992) Judicial Politics and Policy-Making in Western Europe. London: Frank Cass.
Waldron, J. (2008) ‘The Concept and the Rule of Law,’ Georgia Law Review 43(1), 1–61.
Weingast, B. R. (1997) ‘The Political Foundations of Democracy and Rule of Law,’ American Political Science Review 91(2), 245–63.
49 Legislative Power Werner J. Patzelt
Introduction
In a narrow sense, ‘legislative power’ may pass as the power of a legislature to legislate, or of a parliament to exert control over a cabinet. Yet the formal power to legislate, or to oversee the executive, is only the tip of an iceberg of institutional and material resources that can make an assembly powerful and give it ‘a significant effect on policy’ (Arter, 2006b: 255). Therefore, we have to look at legislative or ‘parliamentary’ power in a broad perspective. And although there is no unchallenged consensus on terms, we can safely use the term ‘parliament’ for a legislature that has the function of forming and sustaining a cabinet in addition to an assembly’s ‘classical’ functions of representing people, controlling government and legislating.
History of parliamentary power
In the 20th century, a major topic of debates on assemblies was the alleged ‘decline of parliaments’ (for a classic statement, see Bryce, 1990 [1921]). In this view, legislatures have declined from a ‘golden age’ in the 19th century – in fact only an invented one – into something close to ‘rubber-stamp institutions’ that are controlled by a cabinet or a party. According to that narrative, in the past, individually elected members of parliament engaged in open-ended discussions, provided government with legislation, oversaw the budget and were not affected by a ‘bounded mandate’ or any ‘imperative of party cohesion’. From the perspective of ‘liberal parliamentary theory’ (see Schuett-Wetschky, 1984) it was an aberration from ‘parliamentarism proper’ when party
machines emerged as a result of periodic, competitive elections and started to influence parliament from outside. Things allegedly became even worse when cabinet members came to be generally selected from the ranks of parliamentarians, and when the need emerged for parliamentary majorities to support their ‘own’ cabinets in a disciplined way. Such a victory of prime ministerial power, or of cabinet power, looked like the executive’s victory over an institution that had failed to stabilize itself as a counterweight. Although there is more than a kernel of truth in this narrative, it is misleading. First, there has always been much more variety among parliaments than can be covered by the story of steady decline. Second, this narrative seems to reflect that period of rather stable small-n party systems in some Western countries when majority-building – usually by forming coalitions – was no major challenge, or when even minority cabinets could count on predictable support. Under such circumstances, parliamentary party groups may bow down to energetic prime ministers, or they may at least give the appearance of doing so. Yet today, with party systems splitting up and a new cleavage emerging between well-established parties and populist movements, cabinet cohesion has become vulnerable, because members of parliament try to avoid electoral defeat due to implausible policies set out by party leaders or chief executives. Under such circumstances, assemblies have an incentive to revitalize parliamentary power proper, unless they choose to fall back on authoritarian executive policy-making. Yet the history of parliamentary power is much longer than this (see Patzelt, 2007). It is true that parliaments in the modern understanding, namely institutions based on periodic and more or less free elections, have not existed for longer than roughly 200 years.
Yet much older, and unfortunately widely ignored by legislative research, is the ‘institution type’ of an assembly with its own authority in processes of deliberation and decision-making, with the ‘power of the purse’, and sometimes
even with the ‘power of the sword’. Skipping the long history of the ecclesiastical councils and of the chapters of religious orders, we have at least to look at the direct predecessors of modern parliaments. These are the estate assemblies, typical of European history since the 14th century (Bosl, 1977). Originally they consisted of the politically, economically and militarily most important nobles, present in person; later these were supplemented or replaced by representatives of the nobility, clergy and cities, and, in some rare cases, even farmers. Summoned by an overlord or monarch, these assemblies convened not regularly but depending on need – usually for money or for troops. Their power increased in line with the growing intensity of a government’s activities in terms of defense, maintenance of public infrastructure and promotion of economic activities. Policies had to be enacted, and rules had to be implemented, by the de facto lords over territories and their inhabitants, since a civil service or public administration did not yet exist. Therefore, the consent and the resources of these lords were required whenever issues such as security and justice, or commerce and infrastructure, needed common regulations. As a result, assembly power waxed or waned depending on whether a prince or king needed resources from the nobles or the wealthy in his territory. As a precondition for granting resources such as money and soldiers, the key figures of estate assemblies could hold the overlord responsible for their interests and claims. Usually he had to confirm, or even to extend, privileges or exemptions. Sometimes he had to agree on new rules, formally issued by him but demanded by his vassals; on some occasions he even found himself confronted with a ruling by the assembly that he was supposed to execute.
The chances of imposing an assembly’s will over even a reluctant overlord were particularly good when he was no longer able to pay the debts he had accumulated through warfare, buildings, bribery or a luxurious lifestyle, where only the estates might have remained
as trustworthy recipients of credit, or as payers stepping in for the monarch’s liabilities. Then an assembly’s power could surpass the effective power of the nominal suzerain. The first stage of the French Revolution, between the aristocratic revolt in 1787 and the self-declaration of the third estate as a ‘national assembly’ in 1789, is the most outstanding example of such a shift in power. Out of such processes emerged what now are the ‘parliamentary functions’ of budget control and legislation. Representation itself, as an assembly’s core function, had already been put into effect by convening the holders of real power over territories and people, or by summoning their representatives. One initial and consequential modification of this institutional setting was a stepwise change in the mode of recruitment of assembly members. Today, most representatives are elected, or re-elected, by those whom they are meant to represent. What began in the 19th century with a class suffrage based on income and limited to males has ended in free and fair elections, open to participation by all citizens above a certain age, at least in some dozens of democracies. This change provided a personal power base even to those representatives who lacked personal wealth – namely, support by voters or networks in civil society. And when parties started to play a decisive role in elections during the 19th century, thereby serving as new linkage institutions between society and government, they could hardly avoid becoming the central points of reference and political orientation for the citizens of most democracies. As a result, party leaders at least at the regional level usually ran for parliament, and party leaders at the national level made it into cabinet after victorious elections.
As a result, the personal power and political influence of a member of parliament usually stems very much from his or her position in a party, controlling it at least in the voting district, and – as a top politician – perhaps throughout the country. Understanding parliamentary power, therefore, requires an understanding of the
creation and social organization of party power. The second major modification of an assembly’s power position concerned whether it remained restricted to representation, deciding on a budget and drafting or approving legislation. Under such restraints, assemblies institutionalized themselves (as in Imperial Germany) as the ‘legislature’ of a constitutional monarchy, or (as in the United States) as the ‘legislative branch’ in a presidential system of government. Alternatively, assemblies could gain direct influence over staffing the executive and selecting or overthrowing the prime minister (as in the UK). In this way, a parliamentary system of government emerged. There, parliament as an institution – yet not necessarily any individual member of parliament, or each parliamentary party group – reaches the peak of its possible power (cf. Lijphart, 1992). But due to the unavoidable oligarchical structure of parties, their leaders – including the leaders of their parliamentary party groups – can become, and often are for years, the main holders of a parliament’s concrete political power (see, for instance, Keman and Müller-Rommel, 2012). In a presidential system, their power can be balanced, or even surpassed, by presidential power. Yet in a parliamentary system, the power of parliamentary party group leaders can actually be fused with the institutional power of the executive branch of government, either by party leaders assuming the position of prime minister or cabinet member, or by forging a coalition with subsequent party discipline. In this case, party leaders – in their capacity as chief executive – may control the parliamentary agenda on a top-down basis (as in the UK), or can benefit from what is euphemistically called ‘rationalized parliamentarism’ (which is in fact a ‘curtailed’ parliamentarism, as in the French Fifth Republic).
Under such circumstances, a parliament mainly serves as the institutional framework – or as the transmission belt – of party leaders’ policies, be the leaders in their positions because of election by other party
members, due to personal charisma or because they have reached top positions in the hegemonic party of a dictatorial regime. The latter was true for the parliaments of Europe’s former socialist states. The more parliamentary power is centralized around top parliamentarians and top executives – under the conditions of a coalition government in a parliamentary system – the more important, at least normatively, is the role of a well-organized parliamentary opposition. Based on public support, such an opposition can become the real counterpart of the usually well-integrated ‘action unit’ of coalition and cabinet. Thus, a power-limiting ‘new dualism’ is created and replaces that ‘old dualism’ between the legislative and the executive branches of government which is so typical of presidential systems. Parliamentary minority power, then, stretches as far as the opposition can effectively oversee governmental activities, can present plausible alternatives to the policies suggested by the cabinet and to the candidates presented by the governing coalition, and can raise its own issues for public debate on which the government would prefer to be silent or inactive. The rise of both the idea and the practice of such parts of a parliament, and of parties outside parliament, that are loyal to the constitutional order but effectively oppose those who are actually running a country is one of the best institutional inventions of humankind and has been a decisive step in the history of parliamentary power. On balance, there are ups and downs in that history. There has been neither a constant rise nor a constant decline of parliamentary power. What can really be detected are path-dependent developments into different systems of government, bringing about assemblies in the forms of ‘legislatures’ or ‘parliaments’ with different roles. In addition, we recognize periodic gains and losses in legislative or parliamentary power.
These seem to depend on time-specific circumstances, on politicians’ differing leadership skills and on the fact that institution-shaping leaders
may have been made responsive to changing groups of politically relevant actors. As soon as we analyze such power relations and their evolution carefully, we will easily recognize that the basic institutional demarcation line is not that between ‘old-fashioned estate assemblies’ and ‘modern parliaments’. Instead, the first and central difference is whether an assembly’s practically relevant power base is or is not created by a truly democratic political process. This means the use of periodic elections and the existence of active parties, both of them linking an assembly to the people and making it ‘representative’ in a very strong sense of that concept (see Pitkin, 1967 and Eulau and Karps, 1978). The second difference, then, originates in whether an assembly is a mere ‘legislature’ without the capacity to overthrow a cabinet, or whether it is a full-fledged ‘parliament’ with not only the authority but even the duty to create and to support a cabinet.
Basic theories and concepts for the analysis of parliamentary power
A first set of analytical concepts comes from constitutional theory. One among them is ‘separation of powers’. As to parliamentary power, such separation can be ‘horizontal’, establishing a system of checks and balances between the legislative and the executive branches of government. Yet ‘separation of powers’ can also be ‘temporal’, meaning that elections need to be held after a formally or informally fixed period. If a member of parliament can be re-elected and really desires re-election, he or she cannot act as he or she pleases during the legislative term, but instead must cultivate trust on the part of the future voters. In this way, the ‘re-election mechanism’ links democracy with representation. If, however, term limits are introduced, the re-election mechanism loses its
effectiveness for each parliamentarian’s last term and loosens the ties between the electorate and those elected. In addition, the personal power structures around the outgoing individual member of parliament dissolve. As a result, fixed term limits seem to curtail parliamentary power. Two other important concepts from constitutional theory depict the type, the scope and the limits of the mandate of an assembly member. Is there, as stipulated by liberal theory, a legally – or at least a practically – ‘free mandate’, allowing members of parliament to form teams among themselves – that is, ‘parliamentary party groups’ – along lines of political convictions? If so, the role of a possibly powerful ‘trustee’ is available to members of parliament, who – benefiting from such a role – can accumulate power for their party group and may even be able to engage optimistically in conflicts with party leaders. Yet having only ‘trustees’ in a parliament, without any concern for party cohesion, would hamper parliamentary team play, would thin out the chains of personal accountability between members of parliament and their voters and could thereby erode the democratic base of the assembly. This, in consequence, could hardly fail to decrease parliamentary power, at least under cultural circumstances where political power is expected to stem from policy-shaping elections. In contrast, the mandate of a parliamentarian may also be a legally – or practically – ‘binding’ one, making the assembly member a ‘delegate’ of a constituency which, in an extreme case, may even be endowed with imperative power (on trustees and delegates see Wahlke et al., 1962). The motivation for such a desire to impose binding mandates on members of parliament from the ‘bottom up’ is usually some version of democratic theory, which calls for serious attempts to link members of parliament closely to those represented. 
A variant is the binding ‘top-down’ mandate, which often follows from a party’s claim to represent all legitimate concerns of
a society. This is how communist parties justified their taking control over all members of parliament. In such a command system of parliamentary behavior, as practiced in most socialist states, an assembly becomes simply a ‘rubber-stamp institution’. Yet as soon as such an overarching role of a ‘leading party’ starts to fade away, as was the case in most of Europe’s socialist states in 1989, the power potential inherent in the very form of an assembly is reinvigorated, sometimes even at an amazing pace (see Schirmer, 2005 as a case study). A second set of concepts for the analysis of parliamentary power stems from party system analysis and coalition theory. In relation to parties, relevant concepts are ‘party system fragmentation’ (with its negative impact on the possibilities of forming a stable majority in parliament), ‘party system polarization’ (having a damaging impact on parliamentary culture and on the possibilities for compromising) and ‘electoral turnout’ (influencing debates on the legitimacy of coalitions, of cabinets and of non-majoritarian policymaking). In relation to coalitions and cabinets, useful concepts include ‘minimal winning coalition’, ‘oversized coalition’ and ‘minority government’ (Strøm, 1990), all of which cover factors of intra-parliamentary stability and, hence, of the power position of parliament as an institution. The notion of ‘consociational democracy’ needs to be added, since it depicts practical rules and mutual understandings that allow – either in a whole society or in its parliament – the building and maintenance of formal and informal alliances between social and political groups that are deeply divided along ethnic, religious or linguistic lines (see Lijphart, 1984). 
Depending on the concrete power distribution between an assembly – often split between a cabinet-supporting coalition and the parliamentary opposition – and the executive branch of government, further notions such as the ‘interparty mode’ of cabinet control, as opposed to the ‘intra-party mode’, are additional core concepts of legislative studies (for details see
King, 1976). In addition, the concepts with which electoral systems are classified (such as ‘proportional representation’ or ‘winner takes it all’) refer to immediate causal factors of concrete parliamentary power distributions and to possibilities to shape them at the discretion of power-holders. A third set of relevant concepts for the analysis of legislative power is provided by theories of power at large (see Patzelt et al., 2005; Patzelt, 2005a). Here, at least three basic forms of power need to be distinguished: (a) the power to act momentously; (b) veto power (on veto players see Tsebelis, 1995, 2011); and (c) discourse-shaping power. Parliament’s power to act momentously may be based on constitutional provisions, its veto power on inter-institutional or intra-institutional majority requirements and its discourse-shaping power on parliament’s effectiveness in getting media coverage for its top politicians and their issues. In addition, a distinction is needed between ‘gross’ power and ‘net’ power, with the latter concept meaning what is left from the power potential of actor A as soon as the countervailing power of the actors B through F is taken into account. The significant gross power of a parliament to overthrow a cabinet may be, for instance, significantly diminished by a prime minister’s right to dissolve the parliament in turn, thus stirring rebellious parliamentarians’ fear of losing their seats at the upcoming elections. We also need to look at the social organization of power structures (see e.g. Krehbiel, 1991; Strøm, 1995). There may be ‘transitive’ power relations between political players who can calculate their moves by reflecting their particular, yet intertwined, interests and resources, and thus use power ‘to do something’. This is typical of parliamentary decision-making based on the rule of quid pro quo. Yet there are ‘intransitive’ power relations as well. 
These usually consist of – more or less well-defended – rules for ‘correct thought, speech, and action’, and they will be felt as collective expectancies and judgments. Discrediting them may
end with exclusion from the ranks of ‘honest politicians’ or of ‘good citizens’, and is better avoided even before entering into the phase of strategic political behavior inside or outside parliament. Here we face ‘power over something’, in particular over the communicative resources of behavior and interaction. One more useful concept for power analysis is the ‘ex ante effect’ of power, which focuses on those behavior-changing anticipations that usually go along with the observation or the ascription of an opponent’s power resources. This ‘ex ante effect’ works basically as an ‘anticipation loop’, coming into effect by taking into account that such resources might be mobilized against oneself as soon as one’s opponent might want to use them for ‘punishment’ or ‘gratification’. Such anticipation loops are quite usual in politics and in no way restricted to the parliamentary arena. We notice them as evasive reactions to perceived veto power (cf. Huber, 1992), and we recognize their deeply game-changing effects as soon as we look at one of the most decisive turning points in legislative–executive relationships. This was the introduction of an obligatory countersignature of laws, still promulgated by a head of state, by a responsible parliamentary leader. With this rule set up, the developmental path toward parliamentary systems of government had been entered. A fourth set of concepts for analyzing parliamentary power becomes available when we look systematically at the sources of parliamentary power. Manifold as they are, they can be ordered conveniently by using Aristotle’s typology of ‘four causes’ (Bastit, 2002). 
These include the causa materialis (‘material cause’: out of what is something made, and what are the consequences thereof?); the causa efficiens (‘efficient cause’: what makes something tick?); the causa finalis (‘purpose cause’: what are the consequences of a goal to be reached?); and the causa formalis (‘formal cause’: what are the particular consequences of the structure or form of a system?).
Four sources of parliamentary power
‘Material Causes’ of Parliamentary Power
Through the lens of the causa materialis we recognize an assembly as the confluence of enormous social and cultural capital. This basic understanding of parliamentary power can be elaborated in four directions. First, an assembly possesses the more power the more powerful the persons are who are cooperating there. This is why estate assemblies became such important political institutions in so many different places. Of course, the concrete sources of personal power may vary widely. Today, an influential position in one’s party or public prestige serve as functional equivalents to, for instance, representing a wealthy territory ruled by oneself, as was so important in estate assemblies. It is true that the process of democratic institutionalization has made many parliaments rather independent of the individual personal resources of most of their members. Yet only under the – quite exceptional – circumstances of undisputed parliamentary authority may we expect, as a ‘normality’, an enormous asymmetry between the power of an assembly ‘as such’ and the average personal power of each individual member. Although parliamentary power may be derived from constitutional rules and a general respect for the assembly, the highly personal resources of members of parliament will make the difference between the assembly’s success and defeat when it comes to a serious conflict with the holder(s) of executive power over enforcing rules or defending respect for the assembly’s legitimate political role. Second, the potential for creating and shaping politically relevant communication that is inherent in the institution type of an assembly can make a parliament powerful. After all, this institution bears the very characteristic of talking (in French or Italian parler or parlare) already in its name. Many parliaments first
became places of fearless discussion about politics, and then a principal source of political information and of political opinions that could be published freely. ‘Free speech in the chamber’ made parliaments influential vis-à-vis rulers, and the subsequent ‘free public discourse on parliamentary debates’ allowed assemblies to mobilize public opinion. As a result, feedback loops emerged between intra-parliamentary and extra-parliamentary political discourse. This gave parliaments, still under pre-TV circumstances, the role of political ‘communication hubs’ and provided them with additional, widely de-personalized and hence less attackable power. With ever more mass media spreading what previously had been ‘secret topics’ of chamber debates, governments’ opportunities to withhold facts and political moves from the public were reduced. ‘Explaining policy’ became an imperative for exerting political leadership, whereas concealing executive actions or obstructing their public discussion started to pass as presumably or overtly illegitimate. As a result, the unchallenged maneuvering space of the executive was restricted by the combined effects of parliamentary communication and its resonance in the media. Yet with ‘the public’ separating – due to the multiplication of TV channels and the rise of echo chambers and filter bubbles on the internet – into many, barely interlinked ‘publics’, the communication role of parliament as a whole, although not of some individual members of parliament, has been reduced. In the meantime, it was possible to establish the political responsibility of the executive before parliament on a routine basis, and even in the most concrete sense of the word ‘responsibility’. 
Questions to the government by individual parliamentarians or by parliamentary groups, with subsequent parliamentary and public discussions on the answers provided by cabinet members, brought almost the full range of executive actions under parliamentary control (see Martin, 2011). Thereby, ‘political responsibility’ was established as a very effective institutional
mechanism, easily triggered by question time in the assembly (on institutional mechanisms see Patzelt, 2003: 66–82). First, assemblies gained the right to do ‘interpellations’, that is, to pause the process of legislative deliberations in order to discuss ongoing policies. Then, the rule was established that ministers, and even heads of government, had to answer even malicious questions in the assembly, thus making the replies reportable to the public. Both supporters and opponents of the cabinet, and both journalists and the broader public, were afterward free to evaluate these answers as they pleased, and to take action based on their evaluations. Therefore, rationally acting leaders of the executive anticipated that they could not avoid highly unwelcome reactions if they gave answers that could be considered insufficient or even implausible. As a result, cabinet members better avoided political moves that looked unjustifiable or ‘hard to sell’. In this way, the capacity to hold the executive accountable by organizing intra- and extra-parliamentary communication became a very important source of additional parliamentary power. Third, the infrastructure of a parliament is an obvious ‘material cause’ of its power. An assembly’s capacity to act influentially is greatly affected by whether it has qualified staff, disposes of enough office space and good technical equipment and is in control of a budget of its own, or whether its ability to work depends on the goodwill of the executive. After all, the government, with all its ministries and public agencies, has the best access to information and can use many tools for influencing public opinion. In particular, the executive is able to make – or hold – a parliament powerless for as long as it has the opportunity to decide on the assembly’s budget and infrastructure. 
Of course, parliamentary leaders such as committee and caucus chairs can do the same with rank-and-file members if they are in a position to monopolize resources around themselves. Therefore, strengthening the infrastructure of individual parliamentarians, and not only
of parliamentary leaders, is one further and important possibility to make parliament powerful as an institution, not only its elite. Fourth, the organization of time is a very important material resource for parliamentary power (see the case studies in Patzelt and Dreischer, 2009). It is obvious that rarely summoned estate assemblies could accumulate and exert much less power than those of today’s parliaments that meet on an almost continuous basis, with only a few off-weeks. After all, much power vis-à-vis the executive is created by interaction and networking activities going on between parliamentarians, interest groups and media. Another ‘temporal’ source of parliamentary power is an assembly’s success in synchronizing its own working cycles (gathering information, deliberation, compromising, decision-making) with the work-flow within the government. Otherwise, parliaments may find themselves reduced to the role of mere spectators of policy-making taking place in executive or international institutions that set their own rhythm and try to protect their procedures from early parliamentary interference. This is a particular challenge for national parliaments within a supra-national setting like the EU.
‘Efficient Causes’ of Parliamentary Power
In self-recruited parliaments and among self-selecting parliamentarians, as are typical of democracies, people will usually arrive who share, or develop, a certain degree of – even progressive – political ambition. Although a feeling of duty may be another incentive for working as a member of parliament, ambition seems to be an even more important causa efficiens. After all, considerable investments in time, energy and money are required to win a parliamentary seat. Successfully running for parliament is, therefore, highly selective for character traits. Only those with a consistently above-average capacity to work and to overcome manifold resistance, including
the willingness to complete even fatiguing work, will usually make it into parliament and stay there for more than a single term. Many of them will then strive for more resources and seek influential offices, including – in a parliamentary system of government – the position of a cabinet member. Consisting mainly of people with such characteristics, a well-equipped parliament can hardly fail to acquire power up to its constitutional limits, and to defend a power position once reached. Thus even parliamentary recruitment patterns have significant influence on parliamentary power, at least with a certain time lag.
‘Purpose Causes’ of Parliamentary Power
Much power stems from well-coordinated activities of actors going after common goals. In the case of institutions, such coordination is usually provided by a ‘guiding idea’, or by a set of guiding ideas, that expresses the ‘mission’ and the self-understanding of that institution. For parliaments, this set of guiding ideas includes the rendering of – single or combined – services (i.e. ‘functions’) for the surrounding political system. Such parliamentary functions typically include representing people, overseeing the government, setting out legislation or even staffing the executive. Having all or some of these functions in mind, this causa finalis motivates actors to willfully and reliably serve their institution, and thereby helps to generate institutional power. This growth of power along pre-defined purposes can be easily observed in the case of the European Parliament. In 1952, it came into being as the nameless ‘assembly’ of the European Coal and Steel Community (ECSC), consisting of delegates from each parliament of the six initial member states. No duty was given to this assembly other than discussing matters of the Community. Yet from the very beginning, the presidents
of that assembly – still without significant functions, let alone power – defined their situation as presiding over a ‘real parliament in being’ (see Patzelt, 2004). Together with most of the members, they decided to behave like the actors of a full-fledged parliament. Therefore, from the outset, they struggled for greater institutional authority. In 1970, they named their institution – presumptuously – ‘The European Parliament’. Doing so, they took a purely symbolic measure to legitimate further attempts at becoming a ‘real parliament’ one day. Afterward, they worked for direct elections to this parliament, which were introduced in 1979. Next, the members of the European Parliament successfully extended their powers in a series of conflicts with the European Commission on budget issues. Today, the European Parliament has unchallenged legislative authority over European law, shared only with the Council of the European Union, acting as something like a ‘federal first chamber’. The European Parliament can even overthrow the European Commission, which – at present – is something close to a cabinet drafting legislation and planning the budget. In addition, the European Parliament has veto power as to the nomination of European Commissioners, elects the president of the Commission, and now expects the leaders of the EU member states to suggest no one other than the leader of the biggest party group, or of a party coalition, in the European Parliament as a candidate for president of the European Commission. This tremendous rise in parliamentary power was effectuated within a period of no more than roughly 60 years, and there are attempts to push it still further (see Garben, 2015; Garzón Clariana, 2015; Haroche, 2018). It is easy to detect the ‘efficient secret’ of this impressive rise to power. 
At the beginning, the members of parliament acted along the – still counterfactual – guiding idea of ‘being a real parliament’, relying on the fact that having parliaments on all levels of government passes as desirable throughout the Western world. Then they made this idea
attractive also for the national parliaments and governments of the European Community member states. This could be achieved with the argument that extending the jurisdiction of Europe’s supra-national institutions would call for effective democratic control, which could be provided only by an additional ‘real parliament’. Following this claim, the whole spectrum of parliamentary functions became normatively available for the formerly nameless assembly: legislation, budget control, even influencing the composition of the Commission one day, and exactly this vision lent plausibility to the claim that the assembly should become a ‘truly representative’ assembly through direct elections as soon as possible. All these demands were gradually accepted by the member states of the emerging European Union, although it was absolutely clear that this new parliament’s rise to power would set ever closer limits on national policy-making. This, however, could pass as legitimate and even desirable only as long as there was widely shared consensus on the ‘finality’ of the whole enterprise, namely creating an ever ‘more perfect union’. With the waning of this ‘purpose cause’, first the ‘felt legitimacy’ of the Parliament would decrease, and afterward its real authority and power. These observations can be generalized. Parliamentary power widely stems from emotional identification with the institution type of a parliament in general, and with the mission of a specific parliament in particular. In oligarchies, in most of which estate assemblies once blossomed, it may have sufficed to give credible expression to the shared interests of the most important actors in the assembly. In democracies, however, much more is needed: plausibility of a parliament’s claim to really represent the people; undoubted integrity of the electoral process; a certain degree of fairness in intra-parliamentary behavior. 
Otherwise a democratic parliament falls short of its alleged purpose, can no longer count on institutional loyalty of its members and allows an important source of power to dry out.
‘Formal Causes’ of Parliamentary Power
Referring to parliaments, the concept of causa formalis covers the written and unwritten rules of parliamentary behavior, all the structures resulting therefrom, and the established procedures and intra- or inter-institutional mechanisms of assemblies. Since these are the most easily visible and understood causes of parliamentary power, most political and academic debates tend to focus on them. Research has shown that all of these ‘formal causes’ have resulted from, and are time and again changed by, the results of political conflicts, by the pursuit of constant interests on the part of steadily important actors, and by attempts at improvements of the status quo in processes of trial and error. First, whether an assembly usually meets only in plenary and acts as no more than a ‘debating parliament’ has a deep impact on parliamentary power (Steffani, 1979). This usually results in low parliamentary power proper, in particular if the government succeeds in setting the parliamentary agenda. On the other side, a parliament may use quite a differentiated, hierarchic structure of specialized committees and sub-committees, is thus able to practice detailed division of labor and subsequently can act as a real ‘working parliament’. In this case, significant parliamentary power will be generated by accumulation of knowledge and experience in parliament itself, that is, by breaking down the former information monopoly of the executive, and can be used for systematic oversight of government activities. Wherever parliamentary party groups become stable and highly differentiated organizations, such parliamentary power is augmented by using the same organizational pattern within the parliamentary party groups. Unfortunately, this structural source of parliamentary power is widely unknown to citizens. Therefore they have distorted perceptions of the power distribution between parliament and government in liberal democracies, often underestimate
parliamentary influence on policy-making, and call for reforms like term limits or banning parliamentary party discipline that would, contrary to what is hoped for, not increase parliamentary power, but make it fade away. Second, the power of a parliament depends on its ability to put energetic, power-hungry members of parliament in leadership positions such as chair of a parliamentary party group, committee or subcommittee. In doing this, top-down nomination – so typical of patronage systems – is apparently inferior to bottom-up elections. Parliamentary office-seekers may even have to run for election twice during a legislative term that lasts four years or longer, namely at the beginning and roughly halfway through the term. If so, they are under continuous pressure to perform well in their positions, at least according to the evaluation of their colleagues. Repeatedly achieving re-election provides them with a truly personal power base, and this will additionally strengthen – for merely procedural reasons – the power position of the parliament as a whole. Third, the availability and reliability of some historically detected intra- and inter-institutional mechanisms confers power on a parliament. An institutional mechanism is a chain of strategic actions, triggered by an actor’s interests, patterned by formal and informal rules and running between institutional positions endowed with specific power resources (see Patzelt, 2003: 66–82). Examples are the re-election mechanism (‘Whom do I have to treat how well if I want his support in a fair competition for re-election?’), or the mechanism of countersignature (‘Whose support do I need to get a bill signed into law; on whom does this actor depend?; who – therefore – actually sets the limits to my own power?’). Apart from being sources of power, institutional mechanisms are important because each parliamentary function is fulfilled by institutional mechanisms, working seriatim or conjointly. 
Their operations give ‘practical
answers’ to political questions like the following: How can a parliament influence the composition of a cabinet? How can a parliament exert concrete control over governmental activities with respect to both their general line and their specific details? What is the effective role of an assembly in the legislative process? To what extent can a parliament influence the executive in international negotiations? What possibilities of far-reaching communication has a parliament, in particular in comparison – or competition – with the executive’s channels of public communication? For all of these questions, manifold practical answers have been found in a large variety of political systems, depending on their individual experiences with institutional history, shaped by results of political conflicts and rooted in cultural patterns. Some of the established institutional mechanisms are similar across countries through adaptation to similar functional requirements on the part of political systems and their surrounding societies; others are similar by common origin or by institutional dissemination; several are similar for both reasons (see Patzelt, 2007, 2017). In addition, the same institutional mechanisms may differ in the degree of their effectiveness or relevance across countries. All of them also have – depending on their institutional environment – particular side-effects or even collateral damage. Comparative legislative research discloses both their shaping factors and their advantages or disadvantages. Therefore, practical conclusions for parliamentary reform can be drawn from such research. Fourth, much parliamentary power stems from how an assembly is linked to society. In democracies, electoral systems have a decisive effect on that. 
On average, the personal power of members of parliament seems to be greater under the following conditions: they have to run, after short legislative terms, in one-person voting districts, must cope with a first-past-the-post electoral system, and can enjoy – due to the lack of term
limits – an extended time horizon for creating and maintaining structures of personal influence. The main reason for this combined effect seems to be that, under such circumstances, new candidates and incumbents have a strong incentive to build up close contacts with the general citizenry and to maintain personal networks with elite groups in their constituencies. Therefore, parliaments with many incumbents are usually more powerful than assemblies with many novices.
Global differentiation and attempts at typology
Parliaments differ widely according to the sources of power they use and the different structures and institutional mechanisms through which they generate or channel power. As to parliamentary power, their varieties are neatly grasped in the well-known typologies of Blondel (1973), Polsby (1975) and Mezey (1979). Blondel (1973) ranked parliaments according to the range and the degree to which they could fulfill (some of) a parliament’s functions. ‘True legislatures’ are those that master legislation, oversee the government and practice two-way communication with citizens. Building and supporting a cabinet, as is typical of parliamentary systems, is unfortunately omitted in this typology. Next come ‘inhibited legislatures’, which may engage in political debates, but have no chance of influencing the executive in issues deemed essential by it. Even worse is the power position of those ‘truncated legislatures’ that, being embedded in dictatorships, are hindered even from openly discussing important political issues. Ranked lowest are ‘nascent or inchoate legislatures’ that still are ‘in being’, like the once nameless assembly of the European Coal and Steel Community in the 1950s. Polsby (1975) went beyond that one-dimensional typology. First, he distinguished entire political systems depending on their
degree of political division of labor. On the one side, there may be so little structural differentiation in a polity that no need for an integrating assembly has emerged as yet; or it may still suffice for a country to be run by an autocratic leadership group without any support from a representative assembly. On the other side, we can find much more structurally complex systems of government that make use of an assembly. Among them, Polsby distinguished ‘closed’ and ‘open’ political systems. In closed political systems we find ‘rubber stamp legislatures’ without a power base of their own, merely bringing into legal form such more or less informal decisions as have been previously taken in the ranks of the ruling elite or oligarchy. In open political systems, however, there are two types of legislature that are situated quite differently in the policy process. On the one side, we find parliaments in the form of an ‘arena’. They are basically formalized settings where the significant forces of a political system can interact. On the other side, we find assemblies in the form of a ‘transformative’ legislature. Such parliaments have accumulated so much institutional power that they can define their own agenda quite independently from actual external influences, and they are able to transform the results of internal debates into laws or public policy at their own discretion. Apparently, this distinction is quite close to that of Steffani (1979), where (less powerful) ‘debating parliaments’ are juxtaposed with (more powerful) ‘working parliaments’ who have built up an organizational power base of their own. Mezey (1979) has presented an even more differentiated, power-centered typology. In a two-way table, he uses the ‘policy-making power’ of parliaments as the vertical axis, distinguishing between ‘strong’, ‘modest’, and ‘little or none’ assembly power. 
He bases the ranking on constitutional regulations, mainly focusing on the parliament’s role in the legislative process, yet widely omitting any sources of power other than legal. In the horizontal dimension, Mezey looks at
‘popular support’ for the assembly, referring basically to the political culture of a society in which a parliament is playing its role. On the one side, there are societies where assemblies enjoy low esteem for whatever reason and, hence, get only little public support. If such a parliament is given strong institutional power, it will be ‘vulnerable’ in the case of conflicts with other political actors such as the executive. If a less supported legislature has only ‘modest’ power anyhow, it is nothing more than ‘marginal’. On the other side, there are countries with ‘more supported legislatures’, where the support can stem either from constitutional regulations and convictions, or from democratic procedures. Parliaments that are ‘supported’ in this sense may indeed have and exert strong power; then they are ‘active’ legislatures, setting their own agenda. Yet parliaments may also have only modest power; then they are called ‘reactive’ legislatures, since they must react to an agenda set by others, usually the executive. If, however, a parliament has little or no real power, although it may be symbolically supported by constitutional regulations or by – possibly even believed – political propaganda, then it is a ‘minimal’ legislature, just like all parliaments of the former socialist states. Mezey’s typology is more complex than Polsby’s, and it encompasses the assemblies of authoritarian regimes as well. Thus, it allows for far-reaching, even historic comparisons. Unfortunately, this typology is not very precise in the measurement of ‘support for parliament’, and it is not comprehensive when it comes to the forms and sources of parliamentary policy-making power (on that see Arter, 2006a, 2006b). Arter (2006a: 426) tried to move beyond that state of the art by offering an ‘anatomy of legislative influence’ driven by 15 questions that implicitly form a framework for comparative analysis. 
They run from ‘Do members of the legislature/committees of the legislature have the (unrestricted) right of legislative initiative?’ via ‘Do legislative and executive leaders consult
on strategic policy matters?’ and ‘Does the legislature convene regularly to engage in the deliberation of legislation?’ to ‘Does the legislature deliberate along party lines?’ and ‘Does the legislature oversee the executive along party lines?’ These questions operationalize some of the sources of parliamentary power discussed earlier in this chapter. A very complex taxonomy – not typology – for parliamentary power analysis has been presented more recently by Sebaldt (2009). Based on Patzelt (2005a) and on most of the concepts of power analysis introduced already in this chapter, Sebaldt (2009: 26) provides no fewer than 96 cells in an analytically straightforward matrix into which all the power phenomena of past and present assemblies can be entered. Afterward, patterns of (joint) frequency distributions – if based on valid measurements, and if reliably detectable – can open a path toward typology-building. In a first step, Sebaldt distinguishes power with respect to its purpose (thus defining the rows of the matrix) and power with respect to its degree of effectiveness (thus defining the columns of the matrix). As to the rows, there is a threefold purpose to using power: the power to act momentously, the veto power and the discourse-shaping power. The mode in which power works for each of these purposes is twofold: power can be used in a transitive way as ‘power to do something’, or in an intransitive way as ‘power over resources’, the latter usually being resources of communicative (inter-)action. As a result, there are three times two rows in this matrix, that is six. In the columns, Sebaldt distinguishes first ‘gross power’ and ‘net power’, and then – as the point in time when power starts to work – ‘anticipated power’ and ‘actually used power’. As a result, there are twice two columns. These four columns, together with six rows, create a matrix consisting of 24 ‘basic cells’. 
This taxonomy, then, allows analyzing the power of every kind of actor. In order to operationalize this general power matrix for the study
of parliamentary power in particular, Sebaldt finally suggests, just like Blondel (1973), focusing on an assembly’s right, and on its real capacity, to fulfill the range of, basically, four parliamentary functions: representation/communication, overseeing the executive, legislation and electing/supporting/overthrowing a cabinet. Integrating constitutional provisions, the rules of procedure and concrete political practice, these four times 24 cells of the matrix, 96 in sum, can then be filled with data on all assemblies chosen for a comparative analysis. Such a look at parliamentary performance and its prerequisites connects – possibly also along the items of the interrogatory framework in Arter (2006a) – parliamentary power analysis proper with both collecting new data and the many empirical (case) studies of parliaments present and past. In addition, this matrix allows for widespread comparisons, and it may stimulate theory-driven historical studies of how – and under what circumstances – parliamentary power is generated, maintained, used, modified, curtailed or even annihilated. The findings of such research could, in a next step, be integrated with the results of an evolutionary morphology of parliaments, comparing the guiding ideas and power resources of assemblies in corporations, federations, estate-based systems, liberal regimes and democracies (Patzelt, 2007; Patzelt, 2017).
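The arithmetic behind the matrix described above (three purposes times two modes for the rows, two by two distinctions for the columns, then four parliamentary functions) can be made explicit in a short enumeration. The following is only a minimal sketch: the identifiers are hypothetical labels that paraphrase this chapter's summary of Sebaldt (2009) and are not taken from Sebaldt's own notation or coding scheme.

```python
from itertools import product

# Hypothetical labels paraphrasing the chapter's summary of Sebaldt (2009).
PURPOSES = ("act_momentously", "veto", "shape_discourse")   # threefold purpose
MODES = ("transitive", "intransitive")                      # power to do / power over resources
SCOPES = ("gross", "net")                                   # gross power vs net power
TIMINGS = ("anticipated", "actually_used")                  # when power starts to work
FUNCTIONS = (
    "representation_communication",
    "oversight_of_executive",
    "legislation",
    "electing_supporting_overthrowing_cabinet",
)


def basic_cells():
    """Return the cells of the general power matrix as (row, column) pairs."""
    rows = list(product(PURPOSES, MODES))      # 3 x 2 = 6 rows
    columns = list(product(SCOPES, TIMINGS))   # 2 x 2 = 4 columns
    return [(row, column) for row in rows for column in columns]


def parliamentary_cells():
    """Repeat the general matrix once per parliamentary function."""
    return [(function, row, column)
            for function in FUNCTIONS
            for (row, column) in basic_cells()]


if __name__ == "__main__":
    print(len(basic_cells()))          # 24 basic cells
    print(len(parliamentary_cells()))  # 96 cells for parliamentary power analysis
```

Each tuple stands for one cell that would have to be filled with data for every assembly under comparison, which illustrates how demanding the taxonomy is in terms of measurement.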
Empirical data bases
Data on parliamentary power are available in publications by institutions such as the Inter-Parliamentary Union or the International Centre for Parliamentary Documentation. More data can be found in encyclopedias, such as Kurian (1998), or in monographs on parliaments and their legal foundations, structures, rules and behavioral patterns (such as Döring, 1995; Döring and Hallerberg, 2004; Strøm, 1990). Further data are available
in comparative volumes on parliaments in general (such as Arter, 2009; Loewenberg et al., 2002; Norton, 1990, 2013; Olsen, 2008) or on single parliamentary functions (such as Rasch and Tsebelis, 2011). Additional relevant information is contained in texts on those features of parliament that contribute to its power (such as Bräuninger et al., 2017) and in comparative publications on parliamentary power (such as Patzelt, 2005; Fish, 2006; vol. 12/2006 of the Journal of Legislative Studies; Sebaldt, 2009). In addition, most parliaments now have electronically accessible data bases on their own history, competences and performance.
Major advances and ongoing debates
In one sense, parliamentary power is the best researched feature of parliaments. Ongoing research about legislative–executive relations (see, for instance, the classics by King (1976) and Döring (1995)), about the rise and fall of political parties at elections, about the effects of electoral systems on the composition and behavioral patterns of parliaments and about the role of political communication and media is – by the very nature of these topics – research on parliamentary power as well. The same is true for analyses of legislative efficiency and legislative autonomy, or of legislative capacity and legislative performance, with the latter having a temporal, a quantitative and a qualitative dimension (see Arter, 2006b). Looking only at such work, scientific progress may appear as providing ever more knowledge about ever more parliaments and about their ongoing history. Yet such a definition of scientific progress would include no more than mere positivism. Major advances, however, are achieved by the development of new theories that cover ever more cases and features of parliaments, or that have more explanatory power. And even greater advance is made by setting up such
typologies that allow for ‘pattern recognition’ in the diverse world of assemblies of past and present times, looking at their roles within surrounding political systems and at their institutional power potentials and the manifold sources thereof. For a long while, Mezey’s typology was considered the last word in comparative parliamentary power analysis. More recently, Sebaldt has shown how to move beyond Mezey if parliamentary power is the main topic of analysis. In addition, the development of evolutionary institutionalism (Patzelt, 2012a) and its application to comparative parliamentary history (see the case studies in Patzelt, 2012) has opened new perspectives on the old topic of how, and why, parliaments have developed from modest origins into an institution type without which nearly no contemporary political system is run. Many debates in legislative research, however, are still reluctant to address such ‘big questions’ and prefer focusing on smaller issues. Often arguing within the conceptual framework of new institutionalism (cf. North, 1990), they deal with modeling principal–agent relationships and their inherent power structures (see the classic by Kiewiet and McCubbins, 1991), or they further explore the potential of rational-choice models (including spatial models) to gain a better understanding of how legislative power is organized and exerted (Shepsle, 1989; Shepsle and Weingast, 1994). Other studies look at the symbolic side of parliamentary representation and at the power potential residing there (Patzelt, 2006). Other debates address the role of parliament in (neo-) corporatist institutional settings or the reduced influence of national parliaments in multilevel systems of supra-national government (Abels, 2016; Abels and Eppler, 2015). More classical debates still cover issues of constitutional law and of its impact on parliamentary power (e.g. 
Shugart and Carey, 1992), or focus on the behavior of individual members of parliament, or of parties, regarding political power structures. In doing so, they give
answers to David Arter’s clear-cut question regarding parliamentary power: ‘How do legislators, both severally and collectively, work to perform their legislative roles in the three phases of the policy process – that is, in the formulation and deliberation of public policy and the oversight of the executive?’ (Arter, 2006b: 255).
Perspectives
What are the perspectives of parliamentary power? The growing literature on ‘post-parliamentarism’, initiated by Crouch (2004), suggests that parliaments and their machinery may be in a process toward becoming a mere formal shell, with political energy and real policy-making moving away from them. Among the alleged causes are the loosening ties between society and parties. This goes along with a loss of substance in political struggles, detaching many politically ‘fashionable’ debates from what, and how, ordinary people deal with political issues in daily life and at home. Consequently, much professional political communication gets quite close to either theatrical performance or mere advertising. All that leads to losses in political credibility and, therefore, in parliamentary power. In addition, we observe growing tensions between the ‘internal logic’ of policy-making in complex institutions and the ‘external logic’ of attempts by their leaders at exerting institutional influence (Benz, 1998). This may hamper parliament’s effectiveness and efficiency in policy-making, in particular if an assembly has to compete for attentiveness and support with organizations from the private sector. Finally, it became apparent that the ‘de-nationalization’ of political decision-making in an era of globalization with ever more international regimes, or in the process of ‘Europeanization’, will necessarily decrease the power of national parliaments. At the same time, it is uncertain whether, or to what degree in terms of power performance,
new supra-national parliaments may develop, at least aside from the ‘success story’ of the European Parliament. As a result, there are some serious arguments inviting even further use of the narrative about a ‘decline of parliaments’. Presumably, however, we should recognize no more than additional stages in the transformation of parliament as a context-dependent institution, and in the ups and downs of this type of institution. It is true that the remarkable evolution of parliaments since the late 18th century, and their successful spread around the world in the 20th century, have been among the most impressive episodes of worldwide institutional history. Nevertheless, there was no ‘golden age’, nor is there a predetermined future – neither for parliaments in general nor for their power.
References
Abels, Gabriele (2016). ‘Parlamentarismus im europäischen Mehrebenensystem – Niedergang, Renaissance, oder beides?’ [Parliamentarism in the European multi-level-system – decline, renaissance, or both?], Zeitschrift für Politikwissenschaft 26:S1, 165–77. Abels, Gabriele and Eppler, Annegret (eds) (2015). Subnational Parliaments in the EU Multi-Level Parliamentary System. Innsbruck: Studienverlag. Arter, David (2006a). ‘Conclusion. Questioning the “Mezey question”: an interrogatory framework for the comparative study of legislatures’, Journal of Legislative Studies 12:3–4, 462–82. Arter, David (2006b). ‘Introduction: comparing the legislative performance of legislatures’, Journal of Legislative Studies 12:3–4, 245–57. Arter, David (2009). ‘Comparing and classifying legislatures’, in David Arter (ed.) Journal of Legislative Studies, vol. 12. London: Routledge. Bastit, Michel (2002). Les quatre causes de l’être. Selon la philosophie première d’Aristote [The four causes of being, according to Aristotle]. Louvain-la-Neuve: Peeters. Benz, Arthur (1998). ‘Postparlamentarische Demokratie? Demokratische Legitimation im
kooperativen Staat’ [Post-parliamentary democracy? Democratic legitimation in a cooperative state], in Michael T. Greven (ed.) Demokratie – eine Kultur des Westens? 20. Wissenschaftlicher Kongress der Deutschen Vereinigung für Politische Wissenschaft. Opladen: Leske + Budrich, 201–22. Blondel, Jean (1973). Comparative Legislatures. Englewood Cliffs: Prentice Hall. Bosl, Karl (1977). Der moderne Parlamentarismus und seine Grundlagen in der ständischen Repräsentation [Modern parliamentarism and its foundations in estate representation]. Berlin: Duncker & Humblot. Bräuninger, Thomas, Debus, Marc and Wüst, Fabian (2017). ‘Governments, parliaments and legislative activity’, Political Science Research and Methods 5:3, 529–54. Bryce, James (1990) [1921]. ‘The decline of legislatures’, in Philip Norton (ed) 1990, Legislatures. New York: Oxford University Press, 47–56. Crouch, Colin (2004). Post-Democracy. Cambridge: Polity Press. Döring, Herbert (1995). Parliaments and Majority Rule in Western Europe. Frankfurt am Main: Campus. Döring, Herbert and Hallerberg, Mark (2004). Patterns of Parliamentary Behavior. Aldershot: Ashgate. Eulau, Heinz and Karps, Paul D. (1978). ‘The Puzzle of Representation: specifying Components of Responsiveness’, in Heinz Eulau and John C. Wahlke (eds) The Politics of Representation: Continuities in Theory and Research. Beverly Hills and London: Sage, 55–71. Fish, M. Steven (2006). ‘Stronger legislatures, stronger democracies’, Journal of Democracy 17:1, 5–20. Garben, Sacha (2015). ‘Confronting the competence conundrum: democratizing the European Union through an expansion of its legislative powers’, Oxford Journal of Legislative Studies 35:1, 55–89. Garzón Clariana, Gregorio (2015). ‘El Parlamento Europeo y la evolución del poder legislativo y del sistema normativo de la Unión Europea’ [The European Parliament and the evolution of legislative power and of the norm system of the European Union], Revista de Derecho Europeo 50, 43–83.
Haroche, Pierre (2018). ‘The inter-parliamentary alliance: how national parliaments empowered the European Parliament’, European Journal of Political Research 25:7, 194–216. Huber, John D. (1992). ‘Restrictive legislative procedures in France and the United States’, American Political Science Review 86:3, 675–87. Keman, Hans and Müller-Rommel, Ferdinand (2012). Party Government in the New Europe. London: Routledge. Kiewiet, D. Roderick and McCubbins, Matthew D. (1991). The Logic of Delegation. Chicago: University of Chicago Press. King, Anthony (1976). ‘Modes of executive-legislative relations: Great Britain, France, and West Germany’, Legislative Studies Quarterly 1:1, 11–16. Krehbiel, Keith (1991). Information and Legislative Organization. Ann Arbor: University of Michigan Press. Kurian, George (ed.) (1998). World Encyclopedia of Parliaments and Legislatures. Washington DC: CQ Press. Lijphart, Arend (1984). Democracies: Patterns of Majoritarian and Consensus Government in Twenty-One Countries. New Haven and London: Yale University Press. Lijphart, Arend (1992). Parliamentary versus Presidential Government. Oxford: Oxford University Press. Loewenberg, Gerhard, Squire, Peverill and Kiewiet, D. Roderick (eds) (2002). Legislatures: Comparative Perspectives on Representative Assemblies. Ann Arbor: University of Michigan Press. Martin, Shane (2011). ‘Parliamentary questions, the behavior of legislators, and the function of legislatures: an introduction’, Journal of Legislative Studies 17:3, 259–70. Mezey, Michael L. (1979). Comparative Legislatures. Durham, NC: Duke University Press. North, Douglass C. (1990). Institutions, Institutional Change, and Economic Performance. Cambridge: Cambridge University Press. Norton, Philip (1990). Legislatures. Oxford: Oxford University Press. Norton, Philip (2013). Parliaments in Contemporary Western Europe. Hoboken: Taylor and Francis. Olsen, David (2008). Post-Communist and Post-Soviet Parliaments: The Initial Decade. 
London: Routledge.
Patzelt, Werner J. (2003). ‘Institutionalität und Geschichtlichkeit von Parlamenten. Kategorien institutioneller Analyse’ [Institutionality and historicity of parliaments. Categories of institutional analysis], in Werner J. Patzelt (ed.) Parlamente und ihre Funktionen. Institutionelle Mechanismen und institutionelles Lernen im Vergleich [Parliaments and their functions. A comparative analysis of institutional mechanisms and institutional learning]. Opladen: Westdeutscher Verlag, 51–117. Patzelt, Werner J. (2004). ‘Identitätsstiftung durch Konstruktion fiktiver Kontinuität. Erfahrungsmanagement im frühen Europäischen Parlament’ [Building identity by the construction of fictitious continuity: experience-managing in the early European Parliament], in Gert Melville and Karl-Siegbert Rehberg (eds) Gründungsmythen – Genealogien – Memorialzeichen. Beiträge zur institutionellen Konstruktion von Kontinuität, Köln, Weimar and Wien: Böhlau, 187–205. Patzelt, Werner J. (ed.) (2005). Parlamente und ihre Macht. Kategorien und Fallbeispiele institutioneller Analyse [Parliaments and their power: categories and exemplary case studies of institutional analysis]. Baden-Baden: Nomos. Patzelt, Werner J. (2005a). ‘Phänomenologie, Konstruktion und Destruktion von Parlamentsmacht’ [Phenomenology, construction and destruction of parliamentary power], in Werner J. Patzelt (ed.) Parlamente und ihre Macht. Kategorien und Fallbeispiele institutioneller Analyse [Parliaments and their power: categories and exemplary case studies of institutional analysis]. Baden-Baden: Nomos, 255–302. Patzelt, Werner J. (2006). ‘Parliaments and their symbols: topography of a field of research’, in Emma Crewe and Marion G. Müller (eds) Rituals in Parliaments, Frankfurt: Lang, 159–82. Patzelt, Werner J. (2007). ‘Grundriss einer Morphologie der Parlamente’ [Outline of a morphology of parliaments], in Werner J. Patzelt (ed.) Evolutorischer Institutionalismus. 
Theorie und empirische Studien zu Evolution, Institutionalität und Geschichtlichkeit [Evolutionary institutionalism: theory
and empirical studies of evolution, institutionality, and historicity], Würzburg: Ergon, 483–564. Patzelt, Werner J. (ed.) (2012). Parlamente und ihre Evolution. Forschungskontext und Fallstudien [Parliaments and their evolution: research context and case studies]. Baden-Baden: Nomos. Patzelt, Werner J. (2012a). ‘Evolutorischer Institutionalismus in der Parlamentarismusforschung. Eine systematische Einführung’ [Evolutionary institutionalism in legislative research: a systematic introduction], in Werner J. Patzelt (ed.) Parlamente und ihre Evolution. Forschungskontext und Fallstudien [Parliaments and their evolution: research context and case studies]. Baden-Baden: Nomos, 47–110. Patzelt, Werner J. (2017). ‘Comparative politics and biology’, in Steven A. Peterson and Albert Somit (eds) Handbook of Biology and Politics. Cheltenham and Northampton: Edward Elgar, 181–205. Patzelt, Werner J. and Dreischer, Stephan (eds) (2009). Parlamente und ihre Zeit. Zeitstrukturen als Machtpotentiale [Parliaments and their time: time structures as potentials for power]. Baden-Baden: Nomos. Patzelt, Werner J., Demuth, Christian, Dreischer, Stephan, Messerschmidt, Romy and Schirmer, Roland (2005). ‘Institutionelle Macht. Kategorien ihrer Analyse und Erklärung’ [Institutional power. Categories for analysis and explanation], in Werner J. Patzelt (ed.) Parlamente und ihre Macht. Kategorien und Fallbeispiele institutioneller Analyse [Parliaments and their power: categories and exemplary case studies of institutional analysis]. Baden-Baden: Nomos, 9–46. Pitkin, Hanna F. (1967). The Concept of Representation. Berkeley: University of California Press. Polsby, Nelson W. (1975). ‘Legislatures’, in Fred I. Greenstein and Nelson W. Polsby (eds) Handbook of Political Science, vol. 5. Reading: Addison-Wesley, 257–319. Rasch, Bjørn Erik and Tsebelis, George (eds) (2011). The Role of Governments in Legislative Agenda Setting. London: Routledge. Schirmer, Roland (2005). 
‘Machtzerfall und Restabilisierung der Volkskammer im Lauf
der Friedlichen Revolution’ [Dissolution and re-stabilization of power in the People’s Chamber during the peaceful revolution], in Werner J. Patzelt (ed.) Parlamente und ihre Macht. Kategorien und Fallbeispiele institutioneller Analyse [Parliaments and their power: categories and exemplary case studies of institutional analysis]. Baden-Baden: Nomos, 171–215. Schuett-Wetschky, Eberhard (1984). Grundtypen parlamentarischer Demokratie. Klassisch-altliberaler Typ und Gruppentyp [Basic types of parliamentary democracy: the classical old-liberal type versus the group type]. Freiburg-München: Karl Alber. Sebaldt, Martin (2009). Die Macht der Parlamente. Funktionen und Leistungsprofile nationaler Volksvertretungen in den alten Demokratien der Welt [The power of parliaments: functions and performances of national representative assemblies in the ‘old democracies’ of the world]. Wiesbaden: VS Verlag für Sozialwissenschaften. Shepsle, Kenneth A. (1989). ‘Studying institutions: some lessons from the rational choice approach’, Journal of Theoretical Politics 1:2, 131–47. Shepsle, Kenneth A. and Weingast, Barry R. (1994). ‘Positive theories of congressional institutions’, Legislative Studies Quarterly 19:2, 148–79. Shugart, Matthew S. and Carey, John M. (1992). Presidents and Assemblies: Constitutional Design and Electoral Dynamics. Cambridge: Cambridge University Press. Steffani, Winfried (1979). ‘Das präsidentielle System der USA und die parlamentarischen Systeme Großbritanniens und Deutschlands im Vergleich’ [The presidential system of the USA and the parliamentary systems of the United Kingdom and of Germany compared], in Winfried Steffani (ed.) Parlamentarische und präsidentielle Demokratie. Strukturelle Aspekte westlicher Demokratien. Opladen: Westdeutscher Verlag, 61–104. Strøm, Kaare (1990). Minority Government and Majority Rule. Cambridge: Cambridge University Press. Strøm, Kaare (1995). ‘Parliamentary government and legislative organization’,
in Herbert Döring (ed.) Parliaments and Majority Rule in Western Europe. Frankfurt am Main: Campus, 51–81. Tsebelis, George (1995). ‘Decision making in political systems: veto players in presidentialism, parliamentarism, and multipartism’, British Journal of Political Science 25:3, 289–325.
Tsebelis, George (2011). Veto Players: How Political Institutions Work. Princeton: Princeton University Press. Wahlke, John C., Eulau, Heinz, Buchanan, William and Ferguson, LeRoy C. (1962). The Legislative System: Explorations in Legislative Behavior. New York: Wiley.
50 Legitimacy and Legitimation Hans-Joachim Lauth
Introduction and key concepts
The concept of political legitimacy is of key importance to political science. Beetham (1991: 41) called it ‘the central issue in social and political theory’. There are two basic questions associated with it: why should people obey their rulers, and why do people obey a particular political system? These two questions need two different types of answer, which has given rise to two distinct strands of research. A first step in explaining these two variant approaches is to distinguish between the concepts of legitimacy and legitimation. While legitimacy is a normative concept that evaluates grounds for acknowledging the authority of political systems or regimes, rules of power and the actions of rulers, legitimation or belief in legitimacy is an empirical concept that describes, rather than evaluates, the mechanisms by which a regime’s authority is, or comes to be, perceived as justified by its citizens. Hence, a dictatorship can possess legitimation despite
lacking legitimacy from a normative perspective. Although these approaches are sometimes said to reflect a shift from philosophy to sociology (Heywood, 2013: 81), and with it a shift from legitimacy to legitimation, it can be shown that both approaches remain current and operate in parallel. The concept of legitimacy traces its origins back to the Latin ‘legitimus’ or ‘legitimare’, meaning ‘rightfulness’, which thus only captures one aspect of the modern conception. The concept of legitimation/legitimization also derives from this Latin term, but diverges still further because it implies a process. Let us begin here by considering Max Weber’s seminal discussion of legitimacy, in which he considers several key aspects linked to the maintenance and justification of political power. Weber (1921 [1978]) was one of the first to systematically explore the fact that regimes cannot sustain their rule over the long term solely on the basis of violence and repression, but require acceptance from those over whom they rule. Only if the
principles upholding a regime’s authority are shared by the people is that authority legitimate. Weber distinguishes three ideal types of legitimate authority: traditional, charismatic and rational/legal. These three types are empirically based on specific grounds of legitimation that are regarded positively by the governed subjects: specifically, esteem for traditional authority; captivation with a ruler’s fascinating personality; or respect for the rational, legal basis underpinning a regime’s rule. This typology does not make any normative judgements about the rightfulness of the regime. Rather, Weber seeks to explain the reasons why governed subjects accept and support a regime’s authority.1 He therefore consistently speaks of belief in the legitimacy of political authority or, more succinctly, of belief in legitimacy, which in this chapter I treat as synonymous with legitimation/legitimization. This understanding of legitimation sees it as a process, acknowledging that empirical attitudes change. Weber regards the rational/legal type of authority as one of the defining characteristics of modern societies. We shall therefore consider it in more detail so as (inter alia) to clarify its relation to democratic legitimacy. Legal authority is closely linked to rule of law, but presupposes special qualifications that not every system of positive law will satisfy. Legal authority is based on enacted laws obeyed by everyone; even a country’s president is subject to the impersonal order (Weber, 1978: 217). Impersonal orders of this sort are obeyed because they are understood as an expression of rational authority. The fundamental categories of rational authority find their purest, ideal-typical form in bureaucracy, which is typified by a continuous, rule-bound, hierarchically ordered conduct, precisely delineated spheres of competence and clearly defined and regulated means of compulsion (Weber, 1978: 218). 
Weber emphasizes the importance of technical knowledge in bureaucratic administration – describing it as the feature which makes it
specifically rational (Weber, 1978: 225) – and the universal application of bureaucratic procedures in everyday affairs (Weber, 1978: 220). This makes clear that acceptance is based primarily not on the enactment of laws and constitutions, but on the character of social orders and their rational procedures. Accordingly, Weber regards as merely relative the distinction between orders established on the basis of agreement (i.e. democratically) and ones that are imposed (Weber, 1978: 37).2 For Weber, legal orders are fundamentally based on rationality: specifically, instrumental rationality rather than value-rationality. He is a proponent of legal positivism, which holds that no objective knowledge of moral values and norms is possible, and that law and morality should hence be considered independently (Baurmann, 1991: 113). In this tradition of jurisprudence, the source of legal norms is of secondary importance; the crucial point is that they conform to procedures. The key feature of a legal order is that it is an internally consistent, clearly structured system of rules, whose application in individual cases can be unambiguously deduced from abstract norms. The rules are universally and continuously valid; although they must be adapted to any changes in the environment, on the whole they remain fundamentally stable, so that their application remains calculable. The legal order is underpinned by the state’s monopoly on force. Legal certainty must also be guaranteed, which is why modern legal systems need a highly professionalized jurisprudence that helps to systematize the law and ensure consistent legal interpretation. It is not difficult to discern in these features the form of a formal constitutional state (Rechtsstaat), which is explicitly distinguished from a material constitutional state (Baurmann, 1991: 123). 
According to Weber, a ‘social law’ based on ethical postulates such as justice or human dignity would weaken the calculability of the law or even lead to wholly arbitrary, ‘irrational adjudication’ (Weber, 1978: 886). Thus, for
Weber, the purpose of the system of positive law is not to safeguard human rights or justice; rather, its central function is to provide a secure legal grounding for capitalism. The principle of legal authority eschews any normative foundation: ‘Today the most common form of legitimacy is the belief in legality, compliance with enactments which are formally correct and which have been made in the accustomed manner’ (Weber, 1978: 37). This is legitimation by way of procedures, an idea later taken up by Luhmann (1989), albeit reinterpreted in terms of decision procedures.3 These procedures are not necessarily democratic, but correspond to the principles enshrined in the constitution or fundamental legal order, which could also be, say, dynastic. This entails that law in the sense of legal authority can serve to legitimize both democratic and authoritarian regimes. As a result of this ambivalence, most political theorists regard it as insufficient to establish legitimacy solely on the basis of law (belief in legality), even with the special qualification of a formal Rechtsstaat. A strict distinction must be drawn between the legality principle and the Rawlsian constitutionality principle. Rawls’ proposed ‘liberal principle of legitimacy’ is based on a specific conception of constitutionality: ‘political power is legitimate only when it is exercised in accordance with a constitution (written or unwritten) the essentials of which all citizens, as reasonable and rational, can endorse in the light of their common human reason’ (Rawls, 2001: 41). A legitimate constitution not only rests on the rationality principle, but requires the endorsement of all citizens. This endorsement is in turn qualified, with the citizens required to exhibit something akin to Dahl’s ‘enlightened understanding’ (Dahl, 1989). This makes clear that for Rawls, the legitimation of a state’s authority requires a democratic regime form. 
Legitimacy is distinguished not just from the concept of legality, but also from that of stability. As Beetham correctly notes, the characteristics of legitimacy should not be
conflated with its consequences.4 It can be assumed (and has been empirically tested) that the stronger the belief in legitimacy, the more stable a regime will be. But stability also depends on other factors, such as the general economic and social situation or alternatives to the current regime, while a legitimation gap can be counteracted at least temporarily by other mechanisms, such as repression. Stability could, therefore, be the result of non-normative acceptance, which is distinct from legitimation. However, following Weber, belief in legitimacy is regarded as a significant contributor to the stability of political systems. Other functions are also attributed to it: for example, Scharpf (2004: 3) notes that the greater the compliance of citizens, the less disruption there will be and hence the more efficiently a government can operate: ‘Legitimacy is, therefore, the functional prerequisite for governments which aim to be simultaneously effective and liberal.’5 The two fundamental forms described here correspond to the terms ‘legitimacy’ and ‘legitimation’ (Garzón Valdés, 1988). Legitimacy is a normative category, referring to the justification of norms and the rightfulness of regimes. The exercise of political authority and state power is justified if there are good reasons for it. Legitimation – referred to by Weber as ‘belief in legitimacy’ and strictly distinguished from a normative sense – refers to belief in the rightfulness of a regime. It is thus a descriptive category, which assesses the extent to which rulers are accepted by the ruled. Do citizens believe in the rightfulness of their rulers’ authority? This idea is not linked to any universal normative standard: a triumphant dictator is just as capable of experiencing legitimation or acceptance as a traditional monarchy or constitutional democracy. The term ‘legitimation’ will henceforth be used to refer to the second idea. 
It will be treated as synonymous with ‘legitimization’, with both terms describing the process or act of providing legitimacy (Gaus, 2011: 4).
The term ‘legitimacy’ is also used in this descriptive sense in the literature. However, in this chapter I shall reserve ‘legitimacy’ for the normative sense to make the distinction clearer and avoid further confusion. Alongside these two main variants, some prominent theories also add a third alternative to the mix.6 Accordingly, legitimacy, trust and confidence must be clearly separated. Trust is not an expression of moral quality, but in its very essence refers to an interpersonal relationship (social trust). By contrast, the relationship between legitimation and trust is closely interwoven. This is particularly true when trust in people as representatives of political institutions is analysed. Here we can understand trust as an expression of legitimation. On the other hand, it seems difficult to speak of social trust in a type of regime. However, by way of contrast with the first form of trust, which is concrete and personalized, it is conceivable that there could be abstract institutional trust, which could also exist towards courts or the civil service. This notion of institutional-based trust is in essence very similar to the concept of ‘system trust’ (Luhmann, 1979) or ‘societal trust’. The term ‘confidence’ places the emphasis on the viability and functionality of organizations and institutions.
Legitimacy – the normative approach
In recent years, the question of the legitimacy of political action has been taken up with increasing intensity and for a variety of different reasons. On the traditional view, political science is chiefly concerned with the legitimacy of power/authority and different types of government/state (Connolly, 1984; Green, 1988).7 Who can legitimately exercise power, including the use of coercion, and morally compel individuals to obey; what are the limits to power? While political philosophy formerly concentrated on the justification
of state power in general, finer distinctions are now drawn according to different regime types and systems of rule. Researchers have analysed the conditions for legitimate authority. Work in recent decades has increasingly incorporated the supranational level: the European Union and advancing European integration, international organizations and global governance structures. At the same time, attention has also been directed to the inner workings of political systems, drawing distinctions between specific subdomains and individual decisions. What provides normative justification for political authority? Political philosophers have argued for various different answers (Green, 1988): justice, stability and security, peacekeeping, promotion of the common good, constitutional protection of individual rights. In recent debates, most of these goals are seen as integrally linked to the democratic regime type, which provides the fundamental argument to justify state authority: participatory processes that make citizens the ultimate authors of their own laws and guarantee them the ability to participate in the exercise of power and decision-making. Procedural rules concerning both participation and rule of law are seen as key foundations for legitimizing political authority. This creates pressure to justify even individual decisions; democracy is a political system in which important (non-)decisions must always be justified. In relation to democratic legitimacy, two principles are of particular significance: responsibility and responsiveness. The former is a measure of how responsibly decisions are taken: are common interests, possible consequences and fundamental rights taken into consideration? Weighing up such factors can lead to a decision that goes against prevailing majority opinion. The second principle, responsiveness to citizens’ preferences, is intended to prevent precisely this possibility. 
It requires that a government’s actions are suitably reflective of citizens’ preferences.8 However, if these preferences go against the fundamental normative underpinnings
of democracy – for example, if they would involve discriminating against minorities – they cannot be satisfied without violating the principle of responsibility. This potential for conflict between the principles shows the difficulty of setting a generally recognized standard for the legitimacy of democratic authority. The same conflict can be seen elsewhere in the dispute over constitutionalism, in particular concerning the role of a supreme court: should the supreme court protect constitutional rights, or should this be left to the people as the democratic sovereign? The extensive scope of these requirements for justification makes the standards for legitimation far more stringent and complex than in the three ideal types of Weberian provenance. In the contemporary debate, democracy serves as a normative benchmark or gold standard for the legitimacy of political authority. However, there are significant differences in how democracy is conceived (Peter, 2008, 2017), most crucially with respect to the status of participatory processes. Following Habermas (1996) and Bohman and Rehg (1997), forms of deliberative democracy are ascribed greater legitimacy than conventional representative democracy. This debate does not concern itself with the legitimacy of individual political decisions, but rather with whether the procedures used in such decisions are suitable or could be improved. One particular focus is innovating new democratic procedures, a discussion which also draws on empirical research. Other topics that are addressed are the limits of representative democracy and the opportunities offered by direct democracy and related deliberative procedures. Brexit is a good example of a case where procedures, including the conduct of the referendum itself, did not lead optimally to a deliberative solution. It is generally claimed that improving participatory and decision-making procedures increases the quality of decisions. 
Deliberative democracy combines the idea of public reason with the element of democratic participation.
Although the legitimacy of individual decisions is not usually questioned by public actors in democracies, there are exceptions to this rule. One such exception is the principle of civil disobedience, according to which illegal actions can be justified (Brownlee, 2012; Perry, 2013). This idea underscores a fundamental tension between legitimacy and legality: decisions that were properly reached in accordance with the law can be ruled illegitimate on the basis of overriding norms, which must themselves be compatible with democracy and cannot be ideologically rooted in anti-democratic values. Since democratic decisions can generally be revised by democratic means, civil disobedience must be justified by the claim that revising the decision by these means would take an unacceptably long time given the pressing nature of the issue. Examples of civil disobedience include protests against the introduction of nuclear power, which was regarded as posing incalculable risks with extremely long-term consequences, and the NATO Double-Track Decision in the early 1980s, or the more recent phenomenon of ‘church asylum’ where churches offer sanctuary to people threatened with deportation because they believe their cases have not been properly considered; if all legal remedies have been exhausted or the deportation is scheduled to take place before an appeal has concluded, civil disobedience is regarded as the only alternative. The aim of civil disobedience is not to resist democracy, but to improve its procedures and decisions. Another, competing principle for evaluating the legitimacy of political systems appeals to the concept of justice (Buchanan, 2002): only political systems that are also just can legitimately exercise power, and since democracies are not automatically just, their legitimacy must also be scrutinized. 
Rawls (1993), by contrast, opposes conflating the concepts of authority and justice, arguing that the exercise of political power can be unjust yet legitimate, though the illegitimate exercise of power cannot be
just. Regardless of how the relation between justice and legitimacy is conceived, the definition of justice itself remains a subject of dispute. Ultimately, basing legitimacy on justice would require combining a procedural with a substantive understanding of democracy. But there are good reasons for rejecting a substantive conception, according to which the quality or even the existence of a democracy can be discerned from its performance. Ultimately, what performance is called for is a matter for the democratic sovereign, meaning the outcome will be historically contingent and impossible to formulate in universal terms. By contrast, suitable procedures can be expressed in universal form, though it should be noted that it is not only the procedures themselves that are relevant, but also the possibility of using them appropriately. They are thus linked not just to certain minimum social standards, but also to cognitive capacities (‘enlightened understanding’; Dahl, 1989: 307). A number of other conditions and capacities have also been considered in studies on innovative procedures in democracies (Mayne and Geissel, 2018). Another intriguing question concerns the legitimacy of the European Union and its predecessors (Schmidt, 2013). How is the union legitimized if – as it is claimed – it lacks adequate democratic legitimacy? Various arguments have been made for this lack of legitimacy. One argument points to the long legitimation chains: members of key decision-making bodies such as the European Commission, European Court of Justice and European Central Bank are not directly elected, despite having more powers than the directly elected European Parliament. Critics also claim that there is an imbalance in favour of the executive, and that there is no collective European demos as the democratic sovereign. Although the Treaty of Lisbon has made the EU more democratic, many of the criticisms remain. 
Another alternative to legitimation based on democratic procedures is legitimation based on utilitarian considerations. On this
view, it is not the input processes that legitimize the political system of the EU but its performance, that is, the output side. Fritz Scharpf (2004) takes a position of this sort in his evaluation of the legitimacy of European integration, which he believes is not guaranteed on the input side. However, he also considers the possibility for output-based legitimacy in the EU to be limited to allocation decisions that satisfy the Pareto criterion: decisions that benefit one party at the expense of another lack legitimacy in the absence of a solidary community (though such decisions are, he concedes, unlikely given that the EU’s scope for decision-making is constrained by many layers of checks and balances9). Others, however, are critical of the possibility of utilitarian justification even in the case of solidary national communities. Peter (2017) summarizes the argument thus: ‘Rawls (1971: 175f.) and Jeremy Waldron (1987: 143f.) object that the utilitarian approach will ultimately only convince those who stand to benefit from the felicific calculus, and that it lacks an argument to convince those who stand to lose.’ There is also the question of whether the EU needs the same level of legitimacy as nation-states. Scharpf (2004) makes the case for a notion of gradated legitimacy, whereby the level of required legitimacy depends on the depth and significance of the decision in question. In positive-sum games with distribution conflicts or pure coordination games, the need for legitimacy is, he says, significantly lower than when dealing with zero-sum conflicts where the solution that satisfies the interest of one group will be at the expense of another. This criterion is particularly relevant to evaluating the legitimacy of international institutions. In recent years, the scope of the debate about legitimacy has expanded to include the international order (Hurrelmann et al., 2007; Zaum, 2013). 
What legitimacy is possessed by the United Nations and its bodies, or by special organizations such as the IMF? What decisions, and with what consequences, can be legitimized? How far can such institutions
intervene in the sovereignty of national governments (e.g. by imposing austerity programmes)? At a very general level, there is the question of what form the international order should take: should it be conceived as a global state, and/or what minimum democratic requirements should be established (Höffe, 2007; Nullmeier and Pritzlaff, 2010)? Many commentators are extremely sceptical of the possibility of an international or even global democracy, as there is no demos with a well-defined collective identity.10 In the absence of such an identity, however, it is difficult to acceptably set rules that impose special sacrifices on individual states or treat them worse than others. This does not exclude the possibility of international solidarity agreements, commendable examples of which exist between Scandinavian states and poorer countries. However, these agreements are not based on a communicatively formed global society, but on voluntary national decisions underpinned by public discourse in the countries in question; the legitimacy of governance at a level beyond the nation-state requires an influx of legitimacy from national societies. As well as governments, civil society organizations can also play a key role in this transformation. Hence, the legitimacy of international political structures and decisions remains closely interwoven with the national sphere. This is also evident in discussions on specific questions of international policy, which always touch on issues of legitimacy. When is it right or necessary to intervene by force in another country (Merkel and Grimm, 2009)? What kinds of emergency can only be dealt with in this way without incalculable risks? Questions are also asked about the economic activities of individual countries: how justified is the considerable global variation in resource consumption (Dobson, 1999; Agyeman et al., 2002)? 
The many different issues linked to sustainability can be boiled down to a single question: is economic activity at the expense of other nations and/or future generations normatively justified? As
these questions show, it is not just procedures but also concrete decisions whose legitimacy comes in for scrutiny; there has been a noticeable expansion in the focus of the normative legitimacy debate. A comparison of normative justificatory structures reveals a pattern that is also observable in the development of human rights. Originally, security and the guarantee of civil liberties were regarded as the central criterion of legitimacy; rights to political participation then became increasingly important, as can be seen in the normative standard of democratic authority; this was followed by a gradual increase in the significance of social rights, the interpretation of which is reflected in the wide-ranging discussion on justice and inclusion as foundations of legitimacy. Internationally, the progression through these three stages has been accompanied by a growth in the importance of human rights in general. Protecting human rights is now used to justify intervening in states’ domestic affairs, thus imposing limits on the centuries-old principle of the inviolability of national sovereignty.
Research on legitimation
Concepts
Alongside studies of normative justificatory procedures, another strand that has established itself in political science is empirical research on legitimation. This empirical research investigates the legitimation possessed by the rulers in a political system, looking at the factors that ground belief in legitimacy and support for regimes. It focuses on different sources of legitimation and their ability to sustain stable systems of rule. Within political science, this strand of research is situated in the fields of political culture research and political sociology. The focal point is the relation between rulers and ruled, and the extent to which the latter
regard the former’s authority as justified. This issue is relevant to all regime types and has been studied in relation to both democracies and autocracies. Even more so than the normative variant, this strand of research focuses on the stability of political systems. A high level of belief in legitimacy or legitimation is seen as key to stability. Originally, the empirical frame of reference for studies on legitimation mainly comprised democracies. The collapse of various democracies in the first and second waves of democratization made clear the importance of the role played by citizens’ attitudes. If they lack democratic beliefs or do not support the political system and its actors, there is a danger that democracy will collapse. This line of research was also motivated by the increasing democratization of states in the third wave. The focus on democracies has impacted significantly on the selection of investigative criteria. The sources of legitimation described by Weber have been restructured and expanded, with citizens’ attitudes becoming central objects of study. Almond and Verba (1965) investigated different objects and modes of political orientation. They began by distinguishing four objects of political orientation: the political system as a whole and its fundamental values and institutions; participatory processes (input objects); the performance of the political system (output objects); and the self as political actor. The attitudes towards these objects are broken down into cognitive, affective and evaluative modes of orientation. By combining these different dimensions, Almond and Verba categorized different types of political cultures, with the mixed type of civic culture considered the most conducive to democracy. In a civic culture, the citizens’ attitudes and value orientation help support the functioning and stability of a democracy. 
The study has a clear functional emphasis, with the congruence of political culture and political structure regarded as critical for the stability of a political system. Lipset (1960) also considers the issue of stability, but with the focus
on legitimation and effectiveness now taking stronger account of economic performance. Building on these ideas, mainstream research follows David Easton’s 1965 theory that the degree of legitimation depends on how closely the political order and the values inherent to it correspond to citizens’ personal moral principles and beliefs. Another of Easton’s ideas that has proved influential is his distinction between diffuse support, which is based on approval of political authorities’ fundamental principles, and specific support, which is based on these authorities’ performance. This model continues to be applied in empirical research to this day, though it has been supplemented by additional distinctions (Easton, 1975). Norris (1999) developed a fivefold classification of political support, which draws a line between political community, regime principles, regime performance, regime institutions and political actors. This approach enables a systematic analysis of different functional areas. Fuchs (2007) established a hierarchical model of democratic orientations towards regime type/democratic system, type of democratic regime/governmental system and specific governments. This differentiation is helpful in identifying the level of support. Distrusting government officials while believing that it is right to obey the state is not, as McMann (2016: 555) suggests, evidence that trust and legitimation are distinct, but rather that different political objects can achieve divergent degrees of support. Weatherford (1992) developed a broad theory of legitimacy orientations, which includes views from ‘above’ and ‘below’ and attempts to integrate the micro and macro levels of investigation. However, this distinction only applies to the object level (political versus personal). The data is still based on surveys. Gilley (2006) focuses only on the ‘diffuse’ support dimension by measuring state legitimacy. He excludes government and other actors from the analysis. 
His theory distinguishes three subtypes of legitimacy. While the first two cover the legitimacy of
the legal and normative side (justification), the third subtype (act of consent) concerns the degree of mere acceptance. In addition to surveys, this theory also includes patterns of behaviour. According to the examples, the following categories of items are typically distinguished: at the level of the general political system, identification with the political community, support for central democratic values (such as freedom, equality and the separation of powers); at the level of actors and performance, trust or confidence in key political actors (government, parliament, parties) and state institutions (civil service, courts, military). When measuring these attitudes, an attempt is made to separate general trust in political institutions from specific trust based on concrete everyday practice, though clearly this categorical distinction is not always straightforward to define. It might make sense to distinguish between concrete trust in political organizations (parties, government) represented by public persons as an expression of specific support and abstract trust in ‘faceless’ organizations (courts, civil service) or institutions as an expression of diffuse support. Attitudes are measured using representative surveys. There are now many datasets that also record developments in support over time.11 The analysis of legitimation in these studies appears to reduce it to the factor of support, though different subtypes of support are distinguished (Klingemann, 1999). One key assumption is that deep-rooted democratic values are more important for stability than high approval based on output performance, which can rapidly change. Despite the widespread use and high acceptance of survey research, the method has been subjected to a range of criticisms that put the validity of the measurements into question. 
They include the difficulty of precisely measuring short-term attitudes and long-term beliefs, and of controlling for distortion resulting from respondents’ seeing things in accordance with the desires and
expectations around them.12 There are also a number of pragmatic issues, such as how to properly translate question items into different cultural contexts or how to actually achieve representativity, as well as criticisms at the level of principle concerning the closed nature of the questionnaires and the neglect of historical context:

In survey research, respondents only react to stimuli provided by questionnaires that offer respondents a preselection of political institutions to be assessed and of evaluative benchmarks to be commented on. This approach is unlikely to shed much light on the actual contours of legitimacy beliefs. Even more importantly, it neglects the context-bound nature of legitimation processes. (Hurrelmann et al., 2005: 4)

Another problem consists in the selection of items and categories which are useful in comparative research. Findings may therefore be inaccurate because they ignore aspects relevant to legitimization in one case, while not in others. The criteria by which governments are legitimated may vary on a case by case basis.13 Therefore, to obtain a full picture of a single case, it is necessary to include all relevant aspects of legitimation in the study. One contrasting or complementary way of measuring political support or its decline consists in documenting political action such as protest. A distinction is drawn between active protest, expressed in conventional and unconventional forms of participation, and passive protest, such as voter abstention (Rucht et al., 1999).14 Active protest involves the dimension of action, thus expanding the scope of investigation. Through participatory behaviours, citizens can withdraw legitimation both from political actors and their decisions as well as from the current form or general idea of democracy. The same applies to the passive behaviour of non-voting, the study of which relies more strongly on survey research, though it needs to be assessed on a case by case basis whether non-voting is actually a form of protest and loss of legitimation, or whether
there are other reasons (e.g. because it is expected that the person’s preferred party will win, or due to generalized political apathy). Other categories of actions and behaviour highlight support measures such as tax payments or legal compliance, which are often measured by the degree of corruption. Nearly all these studies of political actions underscore the relevance of social interaction and collective action. They should be understood as calling for the inclusion of the intermediary level. This short overview underscores one problem which results from the different definitions. It is not always clear whether legitimation and support or mere acceptance are being measured. While the subjects are always the citizens, the selection of objects varies significantly (state, regime type, government, parties, civil service, courts, etc.), as do other aspects (trust, alienation, accountability, responsiveness, procedural and distributive fairness, efficacy and efficiency). Likewise, some concepts conflate the measurement of legitimacy with the identification of its causes and consequences. Sound empirical research would need to analyse orientations (attitudes at the micro level) as well as patterns of behaviour at the meso level. One should add, however, a further intermediate dimension, which is embedded in the public debate that is often dominated by the media. Citizens’ evaluations are always shaped by the framing of public arguments and issues. Thus, it is possible that very similar performances by governments will be judged differently depending on the communicative framing. In addition to different public relations strategies, the credibility of the actors (messengers) and the utility of the ideas play a significant role in this process. The degree of legitimacy thus also depends significantly on the ability of political elites or the opposition to introduce their own legitimacy criteria into the communication process. 
The analysis of legitimation is therefore always an empirical–hermeneutical task, too.
Strategies of Legitimation
Forms of legitimation can vary over the course of time and between different cases. This raises some crucial questions: Why do the findings differ? What reasons can be adduced for this variation? What effects does a loss or crisis of legitimation have on a political system, and how can such a loss or crisis be prevented? These questions are interrelated. For example, actions taken to prevent legitimation crises are also factors that help to explain the variation in the findings. Causes can be broken down into actor-specific factors (which usually form part of legitimation strategies) and structural, systematic factors. In the former case, the relevant legitimation strategies need to be identified and investigated. What strategies are distinguished and are they dependent on regime type? Let us first consider this aspect, which leads on to the idea of the politics of legitimacy and prompts the general question: ‘What are governments doing when they spend time, resources and energy legitimating themselves?’ (Barker, 2001: 2). Barker is assuming here that legitimation begins with rulers’ legitimation of themselves, but in democracies the chain of legitimation starts from below. Accordingly, we can ask: what can a government do to generate support and thus legitimation? Following Nullmeier et al. (2012: 24), I understand the politics of legitimacy as all efforts that are undertaken to produce and secure the normative worthiness of a political order, decision or actor to be recognized. These efforts are distinguished from those that are being directed purely at generating acceptance with no reference to normativity. A first legitimation strategy in democracies is based on performance. Lipset (1960: 77) argues that political systems can actively contribute to their being recognized as legitimate. He believes that the political system’s performance plays a key role: the more highly citizens rate the output, the higher their specific support. 
The longer this specific support
lasts, the more likely it is to transform into robust, diffuse support: West Germany in the 1950s and 1960s is one example of such a transformation. Legitimation qua output or performance is in principle also possible in authoritarian regimes, but in democracies this legitimation strategy utilizes the democratic principle of responsiveness, whereby citizens view outcomes more positively the more closely they correspond to their preferences. This brings about an alignment between the moral principles and values of citizens and rulers. Political parties attempt to formulate policies that reflect citizens’ preferences. Elections are the true testing grounds for these efforts to bolster legitimation; the success of these efforts is measured by the election results and turnout, though the latter can be distorted by various factors (such as compulsory voting). A second strategy is based on the appeal of political actors, and is distantly related to notions of charismatic authority. Surveys of politicians’ popularity attempt to measure this aspect, though it is difficult to predict what factors will affect popularity ratings; even scandals do not always have a negative effect, but can actually increase approval. However, falling approval ratings are often attributed to politicians. Anti-politician attitudes are based on a negative view of politicians’ conduct and character; they are seen as only interested in looking after themselves and their careers. A third variant is institutional legitimation strategies, by means of which changes are made to a political system’s institutional framework. Such strategies can be applied to various building blocks of democracy: for example, opportunities for participation can be increased by introducing direct democracy procedures, the political process can be made more transparent and open to scrutiny, or quotas can be used to address issues of equality. 
The use of mediation and other deliberative procedures also falls within the scope of these strategies. A fourth legitimation strategy is based in the realm of political discourse and relies
on a government’s capacity for communication: not just letting the public know what it is doing, but providing comprehensible justifications for its decisions, either by drawing on existing normative standards or else by reinterpreting or replacing them. This is not a simple strategy, since in pluralistic media landscapes the government does not have a dominant role and must compete against alternative narratives. Coming across too slick, by acting in a way that bears the clear hallmark of spin doctors, can actually prove counterproductive as it can damage the credibility of politics. The growth of social media is also making it increasingly hard to manage public perception. It seems easier to spread fake news and mistrust than nuanced, rational arguments. A final category that should be mentioned is symbolic politics, which can arouse or reinforce positive attitudes. Little research has been carried out on this category; the studies that do exist are primarily in the fields of sociology and ethnology (Schlichte, 2018). Autocracies also make use of a diverse array of legitimation strategies. One reason for this is that they lack democracy as a key normative source of legitimacy, and thus need to draw on many different sources to achieve legitimation (Burnell, 2006; Gerschewski, 2013). It should be noted that although repression and other coercive measures can contribute to stability, they are not forms of legitimation: ‘the acceptance of a justification does not count if the acceptance itself is produced by the coercive power which is supposedly being justified’ (Williams, 2005: 6). In his analysis of autocracies’ stabilization mechanisms, Gerschewski (2013) specifically notes the wide range of legitimation strategies that are used alongside measures such as repression and co-option. These legitimation strategies are mainly structured around categories of diffuse and specific support. 
Specific support is operationalized primarily in terms of economic and social indicators, as well as the aspects of corruption, law and order and quality of
bureaucracy. Law is interpreted with a focus on its contribution to domestic security; its other functions and qualities are ignored. The law thus plays only a limited role in the legitimation of authoritarian regimes, even though they are structured by legal systems and despite the centrality of legal authority in Weber’s (1921 [1978]) account of ‘types of legitimate domination’. Unlike in democracies, autocracies’ legitimation strategies are strongly tied to the type of autocracy in question. Classifications of autocratic systems of rule (or dictatorships) need to distinguish between authoritarian and totalitarian regimes, since the two types are based on different fundamental principles that mean it is not possible to regard one ‘merely’ as a subtype of the other.¹⁵ One key strategy of totalitarian regimes is to legitimize their authority through the use of ideologies; these ideologies can be fascist/National Socialist, communist or theocratic, according to the nature of the regime in question. By contrast with the communicative strategies used in democracies, totalitarian regimes operate with methods of indoctrination and manipulation. There are also subtypes of autocratic regime with specific legitimation strategies. Modernizing regimes base their legitimation on their output performance; military regimes on creating security and order; dynastic regimes on the legitimation patterns of traditional authority; post-colonial dictatorships and one-party regimes on their performance in the war for liberation or on claims that they are warding off imperialist domination or some other external threat. Individuals, such as Fidel Castro, are also able to draw on charismatic resources. Personality cults, by contrast, are an institutionalized form of charismatic authority that go to great lengths in trying to imitate the real thing (as seen, for example, in North Korea).
Due to the rising global acceptance of democracy as a system of rule, authoritarian governments imitate democratic elements (electoral autocracies or competitive authoritarianism: Schedler,
2006; Bogaards and Elischer, 2015) or even attempt a redefinition that presents their own authoritarian regime as the true democracy.¹⁶ In some of these forms of legitimation, a significant role is played by the use of symbols and national myths. Dukalskis and Gerschewski (2017) argue that depoliticization measures should also be understood as legitimation strategies, but this is unpersuasive; such measures are clear-cut cases of attempts to generate non-normative acceptance. The main way in which authoritarian regimes can legitimize themselves based on what Weber called legal authority is by reference to a specific legal structure: the formal constitutional state. Historical examples of this are Prussia or the German Empire, while a modern-day example is Singapore, though these regimes also made or make reference to their modernizing reforms. Other legal structures can also serve in various ways to support and legitimize authoritarian regimes. One strategy seeks to win support from elites who benefit from flawed constitutional states or hybrid legal systems. Perverting the rule of law through corruption, clientelism and state capture can provide a stabilization mechanism specifically geared towards regime-supporting elites, who are more important for stability in autocracies than in democracies. Another legally based legitimation strategy tries to win support from other sections of the population by explicitly utilizing traditional systems of norms and rules that enjoy high acceptance. Using these two categories could also help to give structure to the diverse findings in the context of legal pluralism (Shah, 2014). Empirical research on the dynamics and stability of authoritarian regimes should take greater account of these multilayered, formal and informal interactions between law and governance (Lauth, 2017).
This brief outline of legitimation strategies in autocracies has shown that these regimes attempt to legitimize themselves by a diverse range of different means, since they lack democracies’ fundamental input
legitimation. The considerable effort autocracies put into legitimizing themselves further underscores the importance of legitimation in order to maintain power. Empirical studies of legitimation in democracies and autocracies concentrate on different aspects. Research in autocracies is less able to rely on survey methods than comparable studies in democracies and also refers to legitimation strategies that capture different groups: they are not only directed at all citizens but also at regime-supporting elites. To respond to the diverse range of legitimation strategies found in different types of regimes, it is necessary to draw on an equally diverse methodological repertoire that goes far beyond the methods used in traditional research on legitimation in democracies: first, inductive survey approaches; second, discourse analysis methods, for studying public communication; third, methods that take account of the dimension of action, which allows the empirical legitimacy puzzle to be resolved (Booth and Seligson, 2009).
Legitimation Crises in Democracies

Although nowadays autocratic regimes generally need to compensate for a legitimacy deficit that does not affect democracies, it is democratic regimes that appear to be particularly prone to legitimation crises. There is a wealth of literature on legitimation crises in democracies. Two fundamental patterns can be distinguished, both of which are primarily based on systemic factors and can manifest in a variety of forms. First, the problem is seen in the excessive expectations that democracy itself generates (King, 1975; Rose, 1980). During election campaigns, parties attempt to outdo each other with promises that, once in government, they can only deliver with difficulty or by taking on ever increasing debts. At the same time, citizens expect more and more of the political system, and it becomes less and less possible
to satisfy these expectations. The result is an immanent legitimation crisis. Second, legitimation crises are understood as expressions of capitalist dynamics (Habermas, 1973). According to this view, in order to maintain acceptance from citizens a political system must make concessions to them, most notably by expanding the welfare state. However, this curbs the free market and redirects profit from companies to the state, which dampens capitalist dynamism. But in the face of growing pressure from globalization, this dynamism needs to be sustained, which in turn forces cuts in state benefits. Over time, it becomes increasingly difficult for capitalist states or democracies to maintain a balance between these antagonistic interests, resulting in a legitimation crisis. In certain respects, Colin Crouch’s theory of post-democracy can be seen as a continuation of this idea (Crouch, 2004). No general empirical confirmation has yet been found for either of these two crisis theories. State spending generally remains high, even if small reductions have been made in some countries. Nor is a rejection of democracy discernible; rather, support for democracy as a general regime type is high in all established democracies. However, in many countries approval is dramatically lower when it comes to specific political institutions and actors. Recent decades have seen trust in governments and parliaments declining in many democracies. Election turnout has also fallen. Political parties have suffered a particularly sharp loss of trust, with membership numbers collapsing almost everywhere. This is undoubtedly a legitimation crisis. In terms of political sociology, the loss of trust in politics is based on exogenous factors. Central to this crisis is the transformation of society, manifested in the breakdown of overarching unities and social differentiation. 
This dynamic is driven by economic factors that emerge from global markets, and is linked to a decline of traditional worldviews and shift in values that has been described – not without basis – as the ‘silent revolution’
(Inglehart, 1977). The consequences for the legitimation of democracies are considerable, complex and contradictory. Various different interpretations have been put forward. On Dalton and Welzel’s (2014) positive account, the result of the changes has been not a rejection of politics, but rather a move towards new forms of political participation. The authors argue that although there is a continued trend of dealignment and a decline in support for mainstream political parties, people are engaging in non-electoral forms of participation and assuming greater political responsibility. Other authors also allow for the possibility of a modified realignment. Negative interpretations come in a number of variants. According to one of them, neoliberalism’s permeation of society is reinforcing a focus on individual benefits and consumption. This is fuelling the above-noted rise in expectations, but without people being willing to contribute themselves (something known as the free-rider problem). Another variant holds that in an increasingly pluralistic society, individual groups are rarely able to satisfy their interests in undiluted form; in a culture of compromise, everyone is ultimately dissatisfied. What is interesting about this interpretation is that one of democracy’s greatest achievements – resolving conflicting interests without violence by means of compromises – is now undergoing a negative reinterpretation that cannot be resolved within the system. The ancient cynical argument, which is undergoing something of a revival, runs along similar lines. Politicians are now commonly lumped together as a self-interested political class that exists separately from ordinary people (Allen and Cairney, 2017). The difficulty of finding adequate political solutions in globalized contexts is conceived in terms of the ineffectiveness of this caste. There is widespread discontent with the transformation of society, which is understood as the result of failed politics.
Visions of the future therefore reach back to the national past, in line with the programmes of right-wing populist parties, which are
regarded as a clear expression of the political system’s legitimation problems. What all this makes clear is that modern democracies face myriad legitimation problems that are difficult to resolve, for two main reasons: first, because they are rooted in systemic, structural factors that can only be changed slowly, if at all, by political means; second, because they are based on different constructions of social reality that it is increasingly difficult to mediate between, as evidenced by the increasing polarization of political culture in countries such as the UK and, especially, the United States. Though it is not possible to explore this topic in depth here, it is clear that there remains a pressing need for research on the legitimacy of political systems. Moreover, there has thus far been no discussion of what happens if the legitimation problems persist or grow. Although some plausible, reasonable suggestions have been made based on facilitating and expanding political participation and education, it is an open question how effective these would be.
Conclusion: open questions and avenues for future research

Research on legitimacy is divided into two main strands: a normative one based on the concept of legitimacy, and an empirical one based on the concept of legitimation. The normative strand is a vibrant field of study, whose scope has significantly broadened from the original focus on the justification of national governments to also include international institutions and actors. Furthermore, research on legitimacy is no longer confined to the political and social spheres, but also encompasses the capitalist economic order, its actors (banks, corporations and trusts) and their activities. Consequently, the number of grounds and motives for legitimacy has increased. Finer-grained distinctions are also drawn between different aspects of democracy – not just the overall concept but individual elements of it,
such as participation, transparency and separation of powers, are used for purposes of justification. The reasons for this vast proliferation may lie outside political science: it can perhaps be attributed to the rising standards of justification demanded in modern enlightened societies, which has given rise to a need for a more systematic approach to the topic of legitimacy. One idea that merits further exploration is that of gradated legitimacy, according to which standards for the justification and grounding of legitimacy become higher in proportion to the scope of an institution’s or actor’s powers and its ability to impose sanctions; nation-states would thus have to satisfy higher standards of justification than, say, international organizations. However, this idea cannot be used to develop a materially coherent theory of legitimacy, as it does not take account of the logic of different fields. For example, ideas about how the market can be justified according to criteria of efficiency and effectiveness cannot simply be transposed to the political domain, although there are attempts to establish relations between different subsystems (for example, the social market economy or public and private regulation (Wolf et al., 2017)). It would therefore make sense to initially concentrate on developing a theory of political legitimacy, even if merely clarifying the concept ‘political’ would raise fresh controversies. Extensive, wide-ranging work has also been carried out within the empirical strand of research on legitimation. Although this field was long dominated by Almond and Verba’s theory of political culture, their approach has been supplemented by some significant additions, including more inductive survey methods (instruments with open questions), constructivist and discursive approaches specifically designed to identify patterns of legitimation in the public sphere, and perspectives and methods from media sociology. 
One productive approach is the research being carried out into the politics of legitimacy, which draws on some of the
distinctions from the normative debate to identify the different legitimation strategies used in national and international contexts and in democracies and autocracies. The discussion of the two strands should have made clear that, despite their difference of emphasis, they both involve empirical and normative elements. The normative debate reflects empirical changes, while the empirical studies focus on the normative grounds for recognizing political authority. Would it therefore make sense to try to integrate the two strands? This would certainly require more than simply adding them together. The impulse to integrate is inherent to politics itself: ‘politics is a matter of establishing relations of justification in which those who were subjected to rule can be the justification authorities of this rule’ (Forst, 2014: 674). It is not just philosophers and theorists who engage in the justification of political authority, but also rulers and ruled themselves. Any adequate study of legitimacy will be conscious of this dual construction of reality and combine the different aspects in a logical manner. It will also link universally justifiable norms to concrete manifestations in specific historical situations. To conclude, more realism in the study of legitimacy means – somewhat counter-intuitively – to overcome the empirical focus on beliefs, attitudes and compliant behaviour. It means to understand political legitimacy as a dynamic concept referring to a normatively structured societal practice of legitimation, the analysis of which requires the systematic combination of the perspectives of political theory, sociology and the history of ideas. (Gaus, 2011: 17–18)
Notes

1 ‘A populace’s belief in legitimacy is not based on an absolute normative standard, but pluralistically on heterogeneous worlds of meaning [Sinnwelte] and relationally by comparison with historical or contemporary social realities’ (Nohlen, 1998: 352). Easton’s definition of legitimacy also belongs to this tradition (1965: 278).
2 The relativization of types of regime is also evident in Weber’s remark on how the ‘supreme chief’ of an organization acquires their position: either through appropriation, an election or being designated as a successor (Weber, 1978: 220).
3 Luhmann understands legitimation as ‘the general willingness to accept substantially still undetermined decisions within certain limits of tolerance’ (Luhmann, 1989: 28; italics in original; translation from Gaus, 2011: 3). While Luhmann thus rejects a normative definition, Habermas conceives of legitimacy in explicitly normative terms: ‘Legitimacy means that there are good arguments for a political order’s claim to be recognized as right and just; a legitimate order deserves recognition. Legitimacy means a political order’s worthiness to be recognized’ (Habermas, 1976: 178; italics in original; cited in Gaus, 2011: 3).
4 ‘Against those who equate legitimacy with stability or efficiency, I argue that legitimacy should not be confused with the effects it produces on a system of power through the enhanced obedience of its subordinates’ (Beetham, 2013: 38).
5 Translation adapted from Scharpf (2009).
6 Beetham (2013: 37) distinguishes two central dimensions or axes of legitimacy: the justificatory principles and the conventions of consent embodied in different rules or systems of power. He thereby incorporates the normative variant into empirical research. Habermas (1973) also combines the two main strands.
7 Given this narrowness of focus in the discipline, it might be more accurate to speak of political legitimacy and political legitimation. However, this more precise way of speaking is not adopted in the literature, and so that is the convention I follow in this chapter, in the knowledge that the adjectives are tacitly implied.
8 Dahl (1989) and Lijphart (1984) go so far as to regard responsiveness as the core of democracy. On problems with this position, see Lauth (2013).
9 He regards the cases of the ECJ and ECB as more problematic, since their power is relatively unconstrained by treaties.
10 Scharpf (2004: 6) is sceptical of the possibility of democratic orders at a level beyond the nation-state: ‘the condition of a real and robust collective identity [is] the Achilles’ heel of attempts to apply input-based legitimation arguments to governance structures “beyond the nation-state”.’
11 Examples include the World Values Survey and similar regional survey instruments.
12 Carrying out empirical research on legitimation in autocracies poses certain additional problems, including difficulties in gaining access to the field and methodological issues that affect the validity of the research. There is also the possibility of distorted response behaviour or a distorted understanding of key concepts such as democracy (Welzel and Kirsch, 2017).
13 Empirical research indicates that corruption of political elites can either decrease or increase their legitimation, even in democracies, depending on the evaluative standards that prevail in public debate.
14 Some studies combine the different approaches into a complex measure. For example, Gilley (2006: 510) considers not just support and protest, but also normative concepts: ‘As both Beetham (1991) and Habermas (1975) have argued, the moral justification of state power (as opposed to its legality or consent) is particularly important because that power underwrites the laws and rules that govern so much of the rest of social and economic life. It is, so to speak, the uber-power and without moral justification, its negative consequences are just too hard to bear. I thus believe that justification should be weighted more heavily for a fully theorized measure of legitimacy.’
15 Totalitarian regimes are characterized by the complete absence of political freedoms and political equality, with power concentrated in the hands of a small elite so that the vast majority of citizens are utterly powerless and have no control over how they are governed. Authoritarian regimes, by contrast, do afford some limited political freedoms and power to their citizens, though not to the full extent of democracies.
16 Attempts at such redefinitions were observable in ‘real socialist’ regimes such as the German Democratic Republic, and are now appearing in China (Lu and Shi, 2015).
References

Agyeman, J., Bullard, R. D. and Evans, B. (2002) Exploring the Nexus: Bringing Together Sustainability, Environmental Justice and Equity. Space and Polity 6(1): 77–90.
Allen, P. and Cairney, P. (2017) What Do We Mean When We Talk about the ‘Political Class’? Political Studies Review 15(1): 18–27.
Almond, G. A. and Verba, S. (eds) (1965) The Civic Culture: Political Attitudes and Democracy in Five Nations. Boston: Little, Brown and Company (1st edition 1963 Princeton University Press).
Barker, R. (2001) Legitimating Identities: The Self-Presentations of Rulers and Subjects. Cambridge: Cambridge University Press.
Baurmann, M. (1991) Recht und Moral bei Max Weber. In: Jung, H., Müller-Dietz, H. and Neumann, U. (eds) Recht und Moral. Beiträge zu einer Standortbestimmung. Baden-Baden: Nomos: 113–38.
Beetham, D. (1991) The Legitimation of Power. Houndmills et al.: Macmillan.
Beetham, D. (2013) The Legitimation of Power, 2nd edition. Houndmills: Palgrave Macmillan.
Bogaards, M. and Elischer, S. (eds) (2015) Competitive Authoritarianism in Africa Revisited. Zeitschrift für Vergleichende Politikwissenschaft/Comparative Governance and Politics Special Issue 6. doi: 10.1007/s12286-015-0257-6.
Bohman, J. and Rehg, W. (eds) (1997) Deliberative Democracy: Essays on Reason and Politics. Cambridge, MA: MIT Press.
Booth, J. A. and Seligson, M. A. (2009) The Legitimacy Puzzle in Latin America: Political Support and Democracy in Eight Nations. Cambridge: Cambridge University Press.
Brownlee, K. (2012) Conscience and Conviction: The Case for Civil Disobedience. Oxford: Oxford University Press.
Buchanan, A. (2002) Political Legitimacy and Democracy. Ethics 112(4): 689–719.
Burnell, P. J. (2006) Autocratic Opening to Democracy: Why Legitimacy Matters. Third World Quarterly 27(4): 545–62, available at http://wrap.warwick.ac.uk/900/1/WRAP_Burnell_7270220180609-autocraticopeningtwq.pdf
Connolly, W. (ed.) (1984) Legitimacy and the State. Oxford: Blackwell.
Crouch, C. (2004) Post-Democracy. Cambridge: Polity Press.
Dahl, R. A. (1989) Democracy and Its Critics. New Haven: Yale University Press.
Dalton, R. J. and Welzel, C. (eds) (2014) The Civic Culture Transformed: From Allegiant to Assertive Citizens. Cambridge: Cambridge University Press.
Dobson, A. (1999) Justice and the Environment: Conceptions of Environmental Sustainability and Dimensions of Social Justice. Oxford: Oxford University Press.
Dukalskis, A. and Gerschewski, J. (2017) What Autocracies Say (and What Citizens Hear): Proposing Four Mechanisms of Autocratic Legitimation. Contemporary Politics 23(3): 251–68.
Easton, D. (1965) A Systems Analysis of Political Life. New York: Wiley.
Easton, D. (1975) A Re-Assessment of the Concept of Political Support. British Journal of Political Science 5(4): 435–57.
Forst, R. (2014) Justice and Democracy in Transnational Contexts: A Critical Realistic View. Social Research 81(3): 667–82.
Fuchs, D. (2007) The Political Culture Paradigm. In: Dalton, R. J. and Klingemann, H.-D. (eds) The Oxford Handbook of Political Behavior. Oxford: Oxford University Press: 161–84.
Garzón Valdés, E. (1988) Die Stabilität politischer Systeme. Analyse des Begriffs mit Fallbeispielen aus Lateinamerika. Freiburg and München: Verlag Karl Alber.
Gaus, D. (2011) The Dynamics of Legitimation: Why the Study of Political Legitimacy Needs More Realism. ARENA Working Paper 08.
Gerschewski, J. (2013) The Three Pillars of Stability: Legitimation, Repression and Co-Optation in Autocratic Regimes. Democratization 20(1): 13–38.
Gilley, B. (2006) The Meaning and Measure of State Legitimacy: Results for 72 Countries. EJPR 45(3): 499–525.
Green, L. (1988) The Authority of the State. Oxford: Clarendon Press.
Habermas, J. (1973) Legitimation Crises. Boston: Beacon Press (orig.: Legitimationsprobleme im Spätkapitalismus. Frankfurt am Main: Suhrkamp).
Habermas, J. (1996) Popular Sovereignty as Procedure. In: Between Facts and Norms, translated by William Rehg. Cambridge: MIT Press: 463–90.
Habermas, J. (1975) Legitimation Crisis. Cambridge MA: Beacon Press.
Heywood, A. (2013) Politics. Houndmills: Palgrave Macmillan.
Höffe, O. (2007) Democracy in an Age of Globalisation. Wiesbaden: Springer.
Hurrelmann, A., Krell-Laluhová, Z. and Schneider, S. (2005) Mapping Legitimacy Discourses in Democratic Nation States: Great Britain, Switzerland, and the United States Compared. TranState working papers No. 24 Univ. SFB 597. Staatlichkeit im Wandel. Bremen.
Hurrelmann, A., Schneider, S. and Steffek, J. (eds) (2007) Legitimacy in an Age of Global Politics. Houndmills: Palgrave Macmillan.
Inglehart, R. (1977) The Silent Revolution: Changing Values and Political Styles among Western Publics. Princeton: Princeton University Press.
King, A. (1975) Overloaded: Problems of Governing in the 1970s. Political Studies 23(2–3): 284–96.
Klingemann, H.-D. (1999) Mapping Political Support in the 1990s: A Global Analysis. In: Norris, P. (ed.) Critical Citizens: Global Support for Democratic Governance. Oxford: Oxford University Press: 31–56.
Lauth, H.-J. (2013) Core Criteria for Democracy: Is Responsiveness Part of the Inner Circle? In: Böss, Michael, Møller, Jørgen and Skaaning, Svend-Erik (eds) Developing Democracies: Democracy, Democratization, and Development. Aarhus: Aarhus University Press: 37–49.
Lauth, H.-J. (2017) Legitimation autoritärer Regime durch Recht. In: Legitimationsstrategien von Autokratien. Zeitschrift für Vergleichende Politikwissenschaft/Comparative Governance and Politics Special Issue 2. Wiesbaden: VS-Verlag: 247–73.
Lijphart, A. (1984) Democracies: Patterns of Majoritarian and Consensus Government in Twenty-One Countries. New Haven: Yale University Press.
Lipset, S. M. (1960) Political Man: The Social Bases of Politics. Garden City, NY: Doubleday.
Lu, J. and Shi, T. (2015) The Battle of Ideas and Discourses before Democratic Transition: Different Democratic Conceptions in Authoritarian China. International Political Science Review 36(1): 20–41.
Luhmann, N. (1979) Trust and Power. Chichester: Wiley.
Luhmann, N. (1989) Legitimation durch Verfahren. Frankfurt am Main: Suhrkamp.
Mayne, Q. and Geißel, B. (2018) Don’t Good Democracies Need ‘Good’ Citizens? Citizen Dispositions and the Study of Democratic Quality. Politics and Governance 6(1): 33–47.
McMann, K. (2016) Developing State Legitimacy: The Credibility of Messengers and the Utility, Fit and Success of Ideas. Comparative Politics 48(4): 538–56.
Merkel, W. and Grimm, S. (eds) (2009) War and Democratization: Legality, Legitimacy and Effectiveness. London: Routledge.
Nohlen, D. (1998) Legitimität. In: Nohlen, D., Schultze, R.-O. and Schüttemeyer, S.S. Lexikon der Politik (7th ed.). Politische Begriffe. München: Beck: 350–2.
Norris, P. (ed.) (1999) Critical Citizens: Global Support for Democratic Government? Oxford: Oxford University Press.
Nullmeier, F. and Pritzlaff, T. (2010) The Great Chain of Legitimacy: Justifying Transnational Democracy. TranState working papers. No. 123. Univ., SFB 597 Staatlichkeit im Wandel. Bremen.
Nullmeier, F., Geis, A. and Daase, C. (2012) Der Aufstieg der Legitimitätspolitik. Rechtfertigung und Kritik politisch-ökonomischer Ordnungen. Leviathan Special Issue 27: 11–38.
Perry, L. (2013) Civil Disobedience: An American Tradition. New Haven: Yale University Press.
Peter, F. (2008) Democratic Legitimacy. New York/London: Routledge.
Peter, F. (2017) Political Legitimacy. In: Zalta, N. (ed.) Stanford Encyclopedia of Philosophy (Summer 2017 edition), available at https://plato.stanford.edu/archives/sum2017/entries/legitimacy/.
Rawls, J. (1993) Political Liberalism. New York: Columbia University Press.
Rawls, J. (2001) Justice as Fairness: A Restatement. Cambridge: Harvard University Press.
Rose, R. (ed.) (1980) Challenges to Governance Studies in Overloaded Politics. London: Sage.
Rucht, D., Koopmans, R. and Neidhart, F. (eds) (1999) Acts of Dissent: New Developments in the Study of Protest. Lanham: Rowman & Littlefield.
Scharpf, F. W. (2004) Legitimationskonzepte jenseits des Nationalstaats. MPIfG Working Paper 04/6: Köln.
Scharpf, F. W. (2009) Legitimacy in the Multilevel European Polity. European Political Science Review 1(2): 173–204.
Schedler, A. (2006) Electoral Authoritarianism. Boulder: Lynne Rienner Publishers.
Schlichte, K. (2018) A Historical–Sociological Perspective on Statehood. In: Draude, A., Börzel, T. A. and Risse, T. (eds) The Oxford Handbook of Governance and Limited Statehood. Oxford: Oxford University Press: 49–68.
Schmidt, V. A. (2013) Democracy and Legitimacy in the European Union Revisited: Input, Output and ‘Throughput’. Political Studies 61(1): 2–22.
Shah, P. (2014) Legal Pluralism in Conflict. Milton Park: Routledge.
Weatherford, M. S. (1992) Measuring Political Legitimacy. APSR 86(1): 149–66.
Weber, M. (1978) Economy and Society: An Outline of Interpretive Sociology. Berkeley and Los Angeles: University of California Press. Original: (1980 [1921]) Wirtschaft und Gesellschaft. Grundriß der verstehenden Soziologie. 5th ed. Tübingen: J. C. B. Mohr.
Welzel, C. and Kirsch, H. (2017) Democracy Misunderstood: Authoritarian Notions of Democracy around the Globe. World Values Research 9(1): 1–29.
Williams, B. (2005) In the Beginning Was the Deed: Realism and Moralism in Political Argument. Oxford and Princeton: Princeton University Press.
Wolf, K. D., Collin, P. and Coni-Zimmer, M. (eds) (2017) Legitimization of Private and Public Regulation: Past and Present. Politics and Governance 5(1): 15–25, available at https://www.cogitatiopress.com/politicsandgovernance/issue/view/58
Zaum, D. (ed.) (2013) Legitimating International Organizations. Oxford: Oxford University Press.
51 Political Competition
Jennifer Cyr and Alexis Work
Studies across comparative politics address political competition as a central factor in their theories. Referred to in myriad ways, the phenomenon plays a role in many processes, from the more obvious ones such as democratization to less intuitive areas, including in autocratic regimes. This review characterizes political competition as that which occurs among political parties within a system of governance, where parties vie to place candidates in positions of political power. With political parties at the center, competition can be evaluated from different angles and at different levels of analysis. In this overview, we first outline how the concept has been treated historically. We then address, in a non-exhaustive way, research that treats competition as both an outcome to explain and as an explanans in its own right. Following this, we review more recent veins of research that consider political competition. We look first at the nuances of the relationship between competition,
representation and democracy. Competition, we show, need not necessarily bolster representation or the quality of a democratic regime. Among other things, it can drown out minoritized voices. Competition has also become a near sine qua non of nondemocracy, and can even help in sustaining an authoritarian regime. We then examine different heuristics for political competition, including political fragmentation and electoral volatility. We use the latter to show how competition varies across the globe. Finally, we review some of the empirical databases that provide us with the metrics for assessing political competition. In all, we acknowledge that competition is fundamental to our understanding of regimes and probably one of the most studied phenomena in the political sciences. Yet, one overwhelming conclusion of this overview is that we still have much to learn about how it operates and what its consequences are.
A (Brief) History of Political Competition
Scholars of democracy began making reference to political competition in their efforts to conceptualize various features of democratic and democratizing societies. The first accounts provide the foundation of the contemporary study of political competition, and later works have successfully built upon those original theories. We focus here on early, and perhaps some of the most prominent, understandings of political competition and its relationship, in particular, to democracy. Some of the most influential early texts on democracy considered competition as fundamental to the regime. Schumpeter (1942) and Dahl (1971) viewed competition as necessary for democracy, although only the former considered it sufficient. Indeed, Schumpeter’s theory of democracy challenged prior accounts by emphasizing that citizens need not always agree on a common good. In doing this, he located a ‘competitive struggle for the people’s vote’ at the center of democratic governance. Dahl also values competition, making it one of two dimensions whose variation helps to distinguish partial from full polyarchies (for Dahl, democracy was an unattainable ideal). If competition is the foundation of democracy, then the question we must ask is: Who contests power? In modern-day regimes, there is one predominant institution that both constitutes and channels competition: the political party. Given this, scholars of competition typically focus on political parties, including what they are and how they are constituted. Duverger’s classic book (1954), for example, viewed parties as reflections of the social structures present in society and a nation’s political and institutional history. Sartori (1976) built on Duverger in important ways, presenting a more straightforward idea of the party as an organization with predefined values that posts candidates for election. His definition required that parties be ‘relevant’,
that is, that they have an effect on political competition. (Today, this distinction is often made quantitatively, by considering only those parties that achieve some electoral threshold.) Schattschneider (1960) noted that political competition was crucial for ensuring the stability of a ruling party; without party pluralism, divisions within the ruling party could destabilize democratic governance. A focus on political parties drove early scholars to study their nature. These works focused initially on party development in Western Europe and the United States (e.g. Ostrogorski, 1910; Michels, 1911; Duverger, 1954; Kirchheimer, 1990). One major tendency from these texts was to interrogate the inherent tensions between political parties as institutions that seek power while also existing to (democratically) represent and govern. The oligarchic (Michels, 1911) nature of the internal dynamics of the party itself inevitably contradicted its democratic function. Indeed, Ostrogorski, an early analyst of these tensions, noted that ‘As soon as a party, even if created for the noblest object, perpetuates itself, it tends to degeneration’ (1910: 441). These early works identified a recurring contradiction inherent to party politics: that one’s private goals (e.g. re-election, a longstanding career in politics, party longevity) might clash with the very public end of serving one’s constituents. Second, scholars sought to analyze how political party structures changed over time and in response to structural changes, including, for example, suffrage extension and economic crisis. Parties were founded initially as cadres of elites. With the expansion of voting rights, however, mass-based parties emerged to capture the support of recently politicized groups (Duverger, 1972). The scramble for votes over time transformed parties into catch-all organizations (Kirchheimer, 1990). 
Finally, as membership and citizen participation generally declined, parties became increasingly dependent upon state resources and patronage for survival, leading to cartelization (Katz and Mair, 1995). In this (largely
European-centric) view, party structure evolved as ties to society changed. Parties gained prominence elsewhere in the world as countries transitioned to democracy in the mid to late 20th century. Yet, the ties binding citizens to parties – to the extent that they did at all – tended to be more diverse (Roberts, 2002) and, in some cases, much more fragile than in the early European context (see, e.g., Mainwaring and Torcal, 2006). Consequently, successful adaptation and survival (e.g. Ishiyama and Quinn, 2006; Cyr, 2017) required parties to adopt unique strategies and rely on a diverse set of resources. Discussions of political competition emerged as scholars cited it as a key factor in democratic governance. Since then, political competition has grown into a key concept of its own. To explore this, in the next two sections we ask: What explains political competition, and what outcomes does it help to explain?
Explaining Political Competition
Competition manifests in democratic settings primarily as competition among political parties. Party systems vary in terms of the number of parties and their degree of fragmentation (Sartori, 1976). What accounts for these variations? Scholars have explained the dynamics of party competition using a variety of perspectives. We examine each of these here to understand why political competition looks like it does across space and time. One longstanding take on competition views it through the lens of the institutional environment in which political parties operate. Most famously, perhaps, Duverger argued that electoral institutions shape party systems. Single-member plurality districts, he found, tended to produce two-party systems, whereas proportional representation was more likely to generate multi-party systems. His so-called law evoked multiple reactions, sparking a longstanding debate – one
that we still consider today – on what factors shape the dynamics of party competition. Some authors, in effect, built upon and further specified Duverger’s Law. Rae (1967), for example, found that different electoral institutions – including, as per Duverger, the average district magnitude, but also the electoral formula and the ballot structure – worked together to shape the degree of multipartyism. Some decades later, Cox (1997) re-assessed Duverger’s Law, taking as a point of departure that citizens will not waste their votes on parties that are unlikely to win. Given this, Cox re-affirmed the impact of district magnitude on the effective number of parties. His ‘rule’ asserts that the number of parties will tend to be one greater than the district magnitude in single-member plurality and proportional representation systems, or one greater than the number of candidates that advance in contests with runoffs (Cox, 1997: 271). Strom (1989) finds that, while proportional representation fosters lower levels of disproportionality, two-party systems have greater government responsiveness. A second approach to competition finds that party systems are more likely to embody the structural dynamics of the societies they represent. Spatial theories of political competition (e.g. Downs, 1957; Adams, 2001), for example, contend that the number and positions of parties on an ideological continuum will broadly reflect the distribution of the public along that continuum. For Downs (1957), party competition could be explained in great part by simply looking at the left–right dispersion of the voting population. In two-party systems, both party platforms would converge to the position of the median voter. This ‘median voter theorem’ concluded, therefore, that policy platforms would be quite similar, yet unstable, as voting predispositions vacillated among different party alternatives. 
Adams (2001), in turn, noticed that, despite Downsian incentives for similarity and instability, political parties in practice often held divergent and quite stable policy positions.
Why, he asks, is this the case? The author finds that voters are motivated by policy concerns and also by their (partisan) preference toward one versus all other parties. Parties therefore direct their policy appeals to partisans whose support is grounded in non-policy concerns. Party systems, as a consequence, include parties with stable and distinct platforms. To be sure, voters care about much more than their policy or ideological predispositions. They exhibit differences along ethnic, religious and regional lines. For many, these social cleavages leave a lasting impact on the dynamics of party systems. For Lipset and Rokkan (1967), for example, parties give voice to and crystallize social conflicts. What matters, then, is which cleavages gain political salience in any given place. Once parties emerge to politicize a cleavage, however, Lipset and Rokkan find that the resulting party system is unlikely to change. Their ‘freezing’ hypothesis applied to early to mid 20th-century patterns of party system development in Western Europe. Later research (Bartolini and Mair, 1990) extended the finding through much of the century. Subsequent work added nuance to the relationship between social diversity and political competition. The extent of party system fragmentation, that is, the number of parties in a party system, is shaped by the interaction of societal heterogeneity and the permissiveness of the electoral rules. Systems where smaller parties are more likely to win seats, such as proportional representation, will increase fragmentation; more restrictive systems, such as first past the post (FPTP), will constrain party competition (Amorim Neto and Cox, 1997). Milazzo et al. (2018), however, use district-level data to problematize this finding. They argue that greater social diversity leads to more fragmentation even under FPTP. 
As data access improves and our methodological toolkit becomes more extensive, the nuances of these relationships will continue to be fleshed out. The research evaluated thus far tends to be rooted in studies of developed countries, and especially Western Europe and the United
States. Research on other global regions further problematizes the relationship between diversity and political competition. Madrid (2005: 3), for example, has found that social cleavages have not translated into salient political dimensions in much of Latin America. The presence of the former does not, therefore, necessarily imply the existence of the latter. Roberts concurred, noting that ‘the incongruence between the social “fault lines” of Latin American societies and their institutionalized forms of political representation is unusually pronounced, and it appears to be growing’ (Roberts, 2002: 3). What, then, explains patterns of political competition outside of Europe? Here, of course, the literature is too varied and extensive to properly summarize. Often, however, the response includes an interaction of multiple different factors. Party systems embody, for example, social cleavages as a function of institutional constraints and other barriers to representation (e.g. Birnir, 2007; Madrid, 2005). Some authors find that the incentives to create parties can be noninstitutional and derived, for example, from expectations regarding how government is formed (Zeigfield, 2012). Indeed, incentive structures and structural constraints should be distinct in countries where the extension of the suffrage (Dahl’s [1971] participation dimension) and the liberalization of competition (Dahl’s contestation dimension) occur simultaneously. Pre-authoritarian party systems may condition post-authoritarian patterns of competition. The strength of authoritarian incumbents (Riedl, 2014), and their ability to adapt to the new democratic context, can shape the opportunity structure for new party entrants.
What Does Political Competition Help Explain?
At the same time as different factors shape political competition within a polity, the
nature of political competition imparts effects of its own. Theoretical and empirical work alike posits strong relationships between political competition and governance that is both more responsive to voters and representative of their interests. Scholars have long argued that competition has a positive impact on the quality of democracy and the integrity of democratic institutions. More recently, however, and perhaps paradoxically, research has also highlighted the potentially stabilizing role of competition in non-democratic settings. By offering more than one political platform to choose from, competition engenders responsiveness both from politicians in office and from voters themselves. When different political parties have a reasonable chance of winning elections, voters are less likely to waste their vote (Cox, 1997). Supporters are more likely to seat ‘their’ representatives. Citizens in general are more likely to expect governance that caters to what they perceive is best for the public (Aldrich and Griffin, 2018). Generally, where party competition is open and effective, citizens are more likely to feel enfranchised. Incumbents, for their part, will understand that they can lose elections. Consequently, their responsiveness to the public should increase in kind (Bartolini, 1999). In addition to a political process that is more responsive, voter representation should increase where competition reflects the level of societal diversity in a polity. When, for example, voters are presented with candidates and party platforms that reflect their political views, they are more likely to vote than is the case when they do not recognize any such political party. They are also more likely to generate stronger affinities with a particular party. 
It follows that citizens who are able to vote for a candidate that represents them will afford a higher level of legitimacy to government institutions, suggesting that they feel represented (Morlino, 2011; see Bingham Powell Jr, 2004 for a comprehensive review). By contrast, bias toward whiter, richer, male
leaders has historically resulted in a political discourse that is exclusive of the poor and minoritized (Schattschneider, 1960). With regard to democratic quality, researchers credit political competition with the production of more effective governance. Morlino (2011) uses ‘democratic deepening’ to evaluate the normative goodness of democracies. Political competition is a core feature of this. Indeed, competition is interrelated with other aspects of governance, including the rule of law and electoral accountability. As a result, higher quality competition is typically associated with overall quality of governance. Other authors (Aldrich and Griffin, 2018) highlight the role of strong opposition parties in particular in effective governance. Opposition parties pressure incumbent parties to perform or risk being voted out of office. Incumbent parties, as a consequence, are more likely to deliver both in terms of general public goods and also on those issues related to their political platform. Competition also has a hand in keeping political parties in check. For example, given the possibility of voter defection, incumbent parties should be more inclined to adhere to a set of ideological preferences that foster a loyal voting base (Aldrich and Griffin, 2018). Relatedly, party scholars emphasize the ineffectiveness of intra-party competition as a substitute for inter-party competition, especially when it comes to single-party regimes. Even if competition exists within the party in single-party systems, the voters’ inability to defect from that particular party precludes actual competition (Sartori, 1976; Bartolini, 1999). Intra-party competition is not necessarily sufficient for partisans either. Leadership positions are singularly filled. Intra-party competition over a particular post will only satisfy the fraction of the party that is represented (Rahat et al., 2008). 
It is important that competition exist outside of political parties, where democracy provides some guarantee of procedural fairness (Bartolini, 1999). Other institutions also benefit as a result of increased political competition. Studies
of federal court behavior, for example, demonstrate that political competition leads to increased judicial independence (Helmke, 2002). As competition increases the uncertainty of future political leadership, judges alter their strategies so as not to appear loyal to any particular party. Competition may also increase governmental transparency. Higher party competition translates into more competitive, and therefore more democratic, public procurement processes (Broms et al., 2019). In this last example, less party competition makes government vulnerable to corruption, even in ‘old democrac[ies] with a meritocratic bureaucracy’ (Broms et al., 2019: 1). Greater competition can also translate into policy outcomes that benefit a broader sector of the population. More highly contested elections are also associated with a higher level of capital investment and lower taxation (Besley et al., 2010) and less bloat in government salaries (Chamon et al., 2018). These results hold in terms of economic development as well; Indian states with higher political participation and competition typically have higher Human Development Index (HDI) scores (Dash and Mukherjee, 2015). These studies provide evidence that political leaders in competitive situations are more likely to invest in broad social development in an effort to garner the support of a larger electorate. To be sure, the positive effects of political competition that these studies elucidate are not seen everywhere in practice. Earlier theorists, for example, did not fully anticipate the unique challenges faced by developing countries or, as we will consider later, the more recent proliferation of hybrid regimes. 
Where democratic institutions are newer or weaker, additional variables, such as economic crisis (Pacek and Radcliff, 1995), inequality (Rueschemeyer, 2004) and natural resource dependence (Jensen and Wantchekon, 2004), have altered the presumed effects of competition on governance, representation and institutional stability. Moreover, from the
fledgling Weimar Republic to Venezuela at the turn of the 21st century, populism, nationalism and fascism have derailed competition. More recent research has found, perhaps paradoxically, that party competition can have a negative impact on democracy under certain circumstances. For example, in countries with a strong presence of organized crime, party competition and electoral turnover can result in more violence. In Mexico, regional victories by the Partido de Acción Nacional (PAN) increased the violence between drug cartels by interrupting government complicity in the dominant cartel’s dealings (Snyder and Duran-Martinez, 2009). Furthermore, when competition becomes polarized, people may begin to favor partisan goals over democratic principles. Svolik (2018) recognizes this as a potentially significant threat to democracy in the post-Cold War era. Using survey data from Venezuela, his study finds that the more intense a respondent’s partisan views are, the more likely that respondent is to trade democratic principles for policy concessions.
Recent Advances, Ongoing Debates and Critical Assessments
Knowledge of political competition is in no way complete or even settled. Our understanding of even the most primary relationships – such as that between political competition and democracy – is constantly being challenged. The more we research, it seems, the less we know conclusively. Recent works on competition reflect this conundrum, suggesting that, despite its importance to the study of politics, political competition remains an elusive topic of investigation. For example, recent works suggest that there is still much to learn about how parties emerge and how they function internally, especially in the global South. A new research agenda highlights the role that resources play
in, first, the successful emergence of parties (e.g. Tavits, 2013) and, second, their survival over time (e.g. LeBas, 2013; Cyr, 2017). Organizationally strong parties, or parties with active members, a widespread set of branches and a professional central bureaucracy, have the people and the places necessary to successfully coordinate elections across a territory. The role of organizational resources has been correlated with successful party formation in post-Communist democracies in Russia, Latin America and sub-Saharan Africa. They have also been correlated with electoral survival even in the absence of other, more conventional material resources, such as money and patronage (Cyr, 2017). Although scholars have long purported that democracy is unthinkable without political parties, it is not the case that competition is necessarily or inherently beneficial to voters. Indeed, one concern is that party competition disproportionately favors certain sectors over others. Economic elites and prominent interest groups are more effective at shaping policy in the United States, even when their preferences contradict those of the majority of voters (e.g. Page et al., 2013). Furthermore, individuals who make larger campaign donations are more likely to be granted access to their representatives (Kalla and Broockman, 2016). Ultimately, unless representation directly includes disadvantaged or minoritized groups, competition can drown out the voices of resource-poor but vote-rich factions. In all, one major point of this burgeoning research agenda is that representation is not guaranteed by effective competition (Rahat et al., 2008). Indeed, in countries marked by high levels of difference and inequality, the two objectives may actually work at cross-purposes, leading countries to ‘careen’ (Slater, 2013) between different modes of politics. This is especially the case when national debates center on conflicts over liberal constraints and inclusion. 
Democratic quality need not, therefore, necessarily follow from increased competition and representation.
Democracy may be unthinkable without competition, but competition alone, however effective or stable, is not sufficient for the stability of democracy. Still, a new set of works finds renewed importance for the role of competition, and elite-centered organizations in particular, in stimulating a country’s turn to democracy. Weyland (2017) compares waves of democratization and asks why some periods of democratic diffusion were more successful than others. The answer lies in the nature of political organization at the time of the wave. Where the political agents charged with channeling societal momentum for change were tied to well-developed political organizations, they had access to information and institutional support that enabled rational decision-making processes. These, in turn, were more likely to lead to successful democratization than transitions grounded in snap decisions. Ziblatt (2017) returns to democracy’s origins in Western Europe. He posits a newly important role for conservative parties in orchestrating successful democratic transitions. Where these could re-define themselves as mass democratic parties, liberal democracy was more likely to take root. He compares the UK’s success story to the failure of Germany. Madrid (2019) re-examines the earliest democratic reforms in Latin America. He finds that, against the existing literature’s expectations, pressures from below did not instigate early turns to liberalization. Instead, it was the presence of opposition parties that, accompanied by key fissures within the ruling party, turned the tide and allowed for the implementation of liberalizing measures. Bernhard et al. (2019) examine more contemporary concerns of democratic backsliding. They find that organized groups, including political parties, are able to ‘credibly threaten sanctions against anti-system behavior’, shoring up democracy even in periods where backsliding seems likely (Bernhard et al., 2019: 5). 
Each of these studies revitalizes the role that political parties have in
institutionalizing uncertainty and channeling key, strategic interests. The presence of (certain types of) competition in key moments, such as elections, suffrage expansion, and when there are threats to the regime, can bolster democracy over other regime types. While one stream of literature has dived deeper into the relationship between competition and democracy, another contemplates the role of competition in democracy’s absence. The recent literature on hybrid regimes throws a wrench into the Schumpeterian idea that competition is a necessary and sufficient condition for democracy. Indeed, elections have become an ostensible sine qua non of any regime. As Schedler notes, rather wryly, ‘The modal dictator in the contemporary world holds multiparty elections’ (2015: 1). In other words, the relationship between competition and democracy is not necessarily linear or straightforward. Levitsky and Way (2010) develop the notion of competitive authoritarianism to consider the hybrid regimes that emerged after the end of the Cold War. Competitive authoritarian regimes are host to formal democratic institutions. Yet, incumbents exploit their power to garner significant advantage when elections are held. This literature notes that elections can be an important tool for authoritarian incumbents. They are key sources of information on incumbent legitimacy (Slater and Wong, 2013). Incumbents can bargain with opposition elites in authoritarian legislatures, trading spoils and concessions for support and resources (Svolik, 2012). An entire literature on authoritarian legislatures (see Schuler and Malesky, 2014 for a review) posits that, even under autocratic rule, certain advantages can be gained by allowing legislators to publicly debate policy, even as the presence of opposition views poses a risk to the incumbent. Competition may be restricted in these circumstances. 
Still, just as authoritarian legislatures can lead to contested dictatorships, they can also help bolster the authoritarian regime. These new research agendas cohere with and challenge our normative understanding
of the role that competition plays in promoting plurality, checks and balances and democratization in general. Undoubtedly, they reinforce the notion that we still have much to learn about political competition and its relationship to governmental outcomes and regime endurance. Happily, data on political competition are constantly evolving, allowing us to answer older questions in new and different ways. The next two sections examine the empirical data on competition, discussing, first, how metrics of competition change across the globe, and, second, the sources that exist for measuring competition.
Political Competition around the Globe
Common measures of competition examine the nature of inter-party competition in a polity. Most prominently, scholars ask: How many competitors are there? How steady is competition? To what extent do competitors interact in stable ways? Metrics to help answer these questions exist. Research on the nature, causes and consequences of competition has proliferated as a result of the development of these measures. In what follows, we examine three different measures of competition: party system fragmentation, party system institutionalization and electoral volatility. We examine this last measure of electoral volatility in great detail, exploring the (wealth of) literature that has flourished across the globe.
Party System Fragmentation
A party system is fragmented if it contains more than two parties and none of these captures an absolute majority in the representative assembly. Fragmentation is, thus, a result of both the number of parties in the system and their relative size (Karvonen, 2011: 1822). The Laakso–Taagepera Index of the
Effective Number of Parties is typically used to measure fragmentation. Broadly speaking, fragmentation shapes how governing coalitions are formed, including whether coalitions are necessary at all. It also affects the nature and quality of representation. Where governments are constituted by many parties, the capacity to hold officials accountable becomes increasingly complex. In all, fragmentation has long been seen as one marker of the potential strength and success of government (Karvonen, 2011). Recent studies on party fragmentation typically adopt either a cross-national (e.g. Lublin, 2017) or a subnational approach (e.g. Rozenas and Sadanandan, 2018) and seek to evaluate the role that different social cleavages (e.g. ethnicity or religion), or social heterogeneity more generally, might have on driving up the number of viable parties in a country. Theoretical innovations suggest that the relationship between diversity and fragmentation may not be linear: greater social diversity tends to correlate, at more moderate stages, with greater party system fragmentation. At more extreme levels of diversity, however, that positive, linear relationship disappears (Raymond, 2015).
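As a rough illustration of how the effective number of parties weights parties by their relative size, the index is conventionally computed as N = 1 / Σ pᵢ², where pᵢ is each party's share of votes or seats. The sketch below assumes that standard formula; the function name and the seat shares are hypothetical and chosen only for illustration.

```python
def effective_number_of_parties(shares):
    """Laakso-Taagepera effective number of parties: N = 1 / sum(p_i ** 2).

    `shares` may be raw vote counts, seat counts, percentages or proportions;
    they are normalised to proportions before the index is computed.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    proportions = [s / total for s in shares]
    return 1.0 / sum(p * p for p in proportions)


# Hypothetical seat shares, for illustration only (not real election data).
two_equal_parties = effective_number_of_parties([50, 50])
one_dominant_party = effective_number_of_parties([70, 10, 10, 10])
print(round(two_equal_parties, 2), round(one_dominant_party, 2))
```

In this sketch, two evenly matched parties yield an effective number of 2, whereas a four-party assembly dominated by one large party yields a value below 2, which is why the index is preferred to a simple count of parties.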
Party System Institutionalization and Electoral Volatility
A party system is institutionalized when it is composed of a ‘stable set of parties’ that ‘interacts regularly in stable ways’ (Mainwaring et al., 2018: 19). Institutionalized party systems allow agents to develop expectations about the nature of competition in a particular polity, including their likelihood to endure into the future. Although the concept has been around since the 1990s, party system institutionalization has increasingly become tied conceptually to the stability of competition among different political parties (see, e.g., Mainwaring et al., 2018). In practice, this stability has most typically been denoted
through the metric of electoral volatility.1 Electoral volatility is typically measured as a change in vote shares obtained by individual political parties in a given political system across consecutive elections (Roberts and Wibbels, 1999: 576). Where these vote shares are relatively consistent across time, then a party system exhibits low electoral volatility. Party systems with consistently low electoral volatility are viewed as highly institutionalized (Kuenzi et al., 2017: 1). The same set of parties are, by and large, governing over time, allowing actors to draw reasonably valid conclusions and expectations regarding coordination and policy-making. Higher levels of electoral volatility, by contrast, are markers of a weakly institutionalized party system. Electoral volatility has typically been measured using Pedersen’s index of volatility (1979), which captures the percentage change in either vote or (legislative) seat shares between parties from one election to the next. More recently, scholars have differentiated between two types of volatility: one that captures the emergence of new parties and, consequently, votes away from incumbent parties (called Type A volatility), and a second that measures vote shifts between existing parties (called Type B volatility) (Powell and Tucker, 2014). Type A volatility in particular is a sign of lower party system institutionalization. In this case, voters are not simply changing their vote from one (existing) party to another. Instead, they are choosing to defect from all existing parties and vote for something new (Powell and Tucker, 2014). Scholars rely heavily on electoral data, based at the country, regional or crossnational level (see, e.g., the IPU Parline Database on National Parliaments) to measure electoral volatility. Some scholars have examined trends in volatility across the globe (see, e.g., Mainwaring and Zoco, 2007). Most typically, however, volatility is examined at the regional level. 
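The volatility measures described above can be illustrated with a short sketch. Pedersen's index is conventionally computed as half the sum of the absolute changes in party shares between two consecutive elections. The split into Type A and Type B below is a simplified reading of the distinction rather than Powell and Tucker's exact operationalization: it assumes that parties appearing in only one of the two elections (entries and exits) contribute to Type A, that parties contesting both contribute to Type B, and it ignores any minimum vote threshold. All party labels and vote shares are hypothetical.

```python
def pedersen_volatility(prev, curr):
    """Pedersen index: half the sum of absolute share changes across all parties.

    `prev` and `curr` map party names to vote (or seat) shares in percent;
    a party missing from one election is treated as having a share of 0.
    """
    parties = set(prev) | set(curr)
    return 0.5 * sum(abs(curr.get(p, 0.0) - prev.get(p, 0.0)) for p in parties)


def split_volatility(prev, curr):
    """Return (type_a, type_b) under the simplifying assumptions stated above.

    Type A: shares of parties present in only one election (entry or exit).
    Type B: share changes among parties that contested both elections.
    """
    stable = set(prev) & set(curr)
    entered_or_exited = set(prev) ^ set(curr)
    type_a = 0.5 * sum(prev.get(p, 0.0) + curr.get(p, 0.0) for p in entered_or_exited)
    type_b = 0.5 * sum(abs(curr[p] - prev[p]) for p in stable)
    return type_a, type_b


# Hypothetical results: party C exits and party D enters between elections.
election_1 = {"A": 45.0, "B": 35.0, "C": 20.0}
election_2 = {"A": 40.0, "B": 30.0, "D": 30.0}

total = pedersen_volatility(election_1, election_2)
type_a, type_b = split_volatility(election_1, election_2)
print(total, type_a, type_b)
```

In this invented example most of the overall volatility comes from party entry and exit rather than from vote shifts among the established parties, which is the pattern the text associates with weaker institutionalization. Under these assumptions the two components sum to the overall Pedersen score.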
Electoral volatility is, in practice, highly volatile. Not only does
it vary across regions; it also varies considerably within and across the countries that constitute regions. Table 51.1 provides an overview of volatility trends across several different regions toward the end of the 20th century. Western Europe, perhaps not surprisingly, has the lowest electoral volatility of any region under investigation. Its average score is just about half that of Southern Europe, which has the second lowest volatility score. In general, scholars have underscored the relative stability of party competition across time in Europe (see, e.g., Bartolini and Mair, 1990). The post-Communist countries, alternatively, exhibit volatility scores that are higher than other regions, including Latin America and Africa. Powell and Tucker (2014) find that Type A volatility, that is, switching votes from incumbent to new parties, drives overall volatility in post-Communist countries. This differs from, for example, Western Europe, where a great majority of volatility is driven by changes of votes within the same group of parties (p. 131). Some of the earliest works on volatility outside of the global West examined electoral volatility in Latin America (see, e.g., Roberts and Wibbels, 1999; Madrid, 2005). Decades
into the study of volatility in the region, scholars have noted that volatility has remained, by and large, quite stable across time. That is, the higher levels of volatility that marked the post-transition period (1980s–1990s) have persisted well into the 21st century (Mainwaring et al., 2018). Research on Asian countries shows that volatility levels track, on average, those in Latin America. They also tend to be considerably lower than in post-Communist countries, although there is considerable variation across Asian regions (Hicken and Kuhonta, 2011: 581). Scholars of Africa note that volatility is a problem in much of the African continent, although here too the within-region variance is significant. The region includes some of the most volatile party systems ever recorded (Weghorst and Bernhard, 2014: 1707), even as, on average, it fares better than the Baltic states, post-Communist Europe and Latin America during certain decades (Table 51.1). More recent work has broken down the study of volatility into its Type A and B forms, with some discrepancy on how these distinct measures of volatility have evolved. While Weghorst and Bernhard (2014) find that Type A volatility has decreased with time – a marker of growing party system
Table 51.1 General trends in electoral volatility in different regions
Region                   Period             Mean volatility   Type
Western Europe           1885–1995          8.6               Vote
Southern Europe          1970s              16.3              Vote
Latin America            1980s              19.6              Seat
Balkans                  1990s              20.3              Vote
East Central Europe      1990s              20.5              Vote
Post-Communist Europe    1989–2002          25.6              Vote
Sub-Saharan Africa       1990s              28                Seat
Sub-Saharan Africa       1960s–2007         26.0              Seat
Latin America            1980s and 1990s    28.1              Seat
Latin America            1990s              28.2              Vote
Baltic States            1990s              31.4              Vote
Post-Communist Europe    1989–2006          37.0              Seat
Former USSR              1990s              41.9              Vote
Source: Weghorst and Bernhard (2014: 1710).
institutionalization – Kuenzi et al. (2017) conclude that both types of volatility (i.e. A and B) remain quite high over time, with little indication of institutionalization (p. 6). In all, there are notable differences between older party systems, such as those in Europe, and newer ones, such as those in the global South, with respect to overall volatility trends. Mainwaring and Zoco (2007) confirm this pattern and find that the age of democracy is an important predictor of levels of electoral volatility. Older democracies tend to have party systems that are less volatile than newer democracies. Sequentially, where parties formed before the onset of mass media, then linkages between parties and citizens tended to be stronger. But there is also a period effect, wherein parties become less central for integrating citizens into politics as time passes (Mainwaring and Zoco, 2007: 166–7). Additionally, the authors find that, in line with empirical trends noted above, the duration of a democratic (‘competitive’) regime has little to no impact on the stabilization of electoral competition (Mainwaring and Zoco, 2007: 169). They attribute this to the generally low levels of confidence that post-1978 democracies inspire with respect to their party systems. Where ‘citizen disgruntlement with parties is rampant’, then one can hardly expect that the utility of party labels will grow (Mainwaring and Zoco, 2007: 169). This means that, empirically, we do not see volatility trends decreasing over time. In all, the age of a democracy seems to determine a volatility baseline that, thereafter, party systems struggle to improve. Political competition in the form of interparty competition is incredibly important for shaping the quality of the democratic regime. Research on volatility suggests that structural factors, and in particular world-historical time, may delimit the capacity of parties to perform that job effectively. On this point, citizen confidence clearly matters. 
Luckily for scholars of political competition, ample public opinion databases exist to explore this relationship. We turn to these sources now.
Empirical Databases
Two types of empirical data are most salient for understanding the nature, causes and consequences of political competition. First, electoral databases provide the vote counts and seat shares necessary to determine, for example, the effective number of parties or the level of electoral volatility. Often, countries will make their own electoral data available, although whether these online sources are complete and fully up to date tends to vary, at times quite widely, from country to country. Institutions such as the Inter-Parliamentary Union (IPU, https://data.ipu.org/), and even individual scholars (e.g. Bormann and Golder, 2013, http://mattgolder.com/elections), also help to make electoral data publicly available.
A second set of databases addresses political competition indirectly, through surveys of country experts or of a representative sample of citizens. Here, individuals are asked to provide their opinion on a wide range of questions associated with party politics and, in the case of citizens, their own political and voting behavior. To what extent are political parties organized subnationally? What is the ideology, if any, of a political party? How disciplined are parties in government? These are the kinds of questions that expert surveys seek to answer. Varieties of Democracy (https://www.v-dem.net/en/) makes its data publicly available and allows individuals to compare the nature of party politics across countries and across time. With respect to, for example, the reception of parties by the citizenry, their perceived effectiveness and their strength of ties, nationally representative surveys can be extremely valuable. Additionally, certain clearinghouses of regional data are available, including the Global Barometer Surveys (GBS, https://www.globalbarometer.net/), which carries out nationally representative surveys in six different regions of the world: Africa (Afrobarometer), East and Southeast
Asia (Asian Barometer), South Asia (South Asia Barometer), Central and South America (Latinobarómetro), the Middle East (Arab Barometer) and countries of the former Soviet Union (Eurasia Barometer). There is a paywall attached to this service, and so freely accessible alternatives are becoming increasingly common, including, for example, the Latin American Public Opinion Project (LAPOP, https://www.vanderbilt.edu/lapop/).
Conclusion
While the study of competition has a long and distinguished trajectory, we nonetheless lack clear, systematic conclusions regarding the nature, causes and consequences of party competition. This is true even for one of the oldest posited relationships: that of competition and democratic regimes. Our brief review of the state of the literature suggests that context matters when it comes to understanding how competition shapes (and is shaped by) political institutions and the quality of regimes. Timing also seems to matter, including, in particular, when competitive institutions emerge. Moving forward, scholars will likely exploit more and better data on elections, public opinion and expert assessments, as well as increasingly sophisticated methods, to better understand political competition and how it operates across space and time. On this point, a turn to the subnational, and the dynamics of competition across local or regional levels of government, may help elucidate nuances while controlling for country effects. Moreover, continued research on competition in non-democratic regimes, especially as these endure over time, may help to establish the relationship between competition and regime type. If nothing else, this overview demonstrates that competition remains a variable worthy of interrogation. Indeed, we must not take competition for granted. This is true even in
the United States, where new research has exposed the oligarchic nature of the country’s stable, two-party system (Bartels, 2016). Competition has been, and will continue to be, vital to our understanding of how politics operates and evolves in the real world.
Note
1 This tendency has not gone without criticism. Indeed, in his most recent operationalization of party system institutionalization, Mainwaring supplements the measure of electoral volatility with two additional metrics: cumulative electoral volatility, or a comparison of recent electoral results with those from 1990, and the vote share in the most recent election of those parties that existed in 1990 (Mainwaring et al., 2018: 5). His goal is to capture stability over time with these additional measures, since electoral volatility is essentially a snapshot view of electoral shifts from one election to the next.
References
Adams, J. (2001). Party Competition and Responsible Party Government: A Theory of Spatial Competition Based upon Insights from Behavioral Voting Research. Ann Arbor, MI: University of Michigan Press.
Aldrich, J. & Griffin, J. (2018). Why Parties Matter: Political Competition and Democracy in the American South. Chicago, IL: University of Chicago Press.
Amorim Neto, O. & Cox, G. (1997). Electoral Institutions, Cleavage Structures, and the Number of Parties. American Journal of Political Science, 41(1), 149–74.
Bartels, L. (2016). Unequal Democracy: The Political Economy of the New Gilded Age. Princeton, NJ: Princeton University Press.
Bartolini, S. (1999). Collusion, Competition and Democracy: Part I. Journal of Theoretical Politics, 11(4), 435–70.
Bartolini, S. & Mair, P. (1990). Policy Competition, Spatial Distance and Electoral Instability. West European Politics, 13(4), 1–16.
Bernhard, M., Hicken, A., Reenock, C. & Lindberg, S. I. (2019). Parties, Civil Society, and the
Deterrence of Democratic Defection. Studies in Comparative International Development. https://doi.org/10.1007/s12116-019-09295-0
Besley, T., Persson, T. & Sturm, D. M. (2010). Political Competition, Policy and Growth: Theory and Evidence from the US. The Review of Economic Studies, 77(4), 1329–52.
Bingham Powell Jr., G. (2004). Political Representation in Comparative Politics. Annual Review of Political Science, 7, 273–96.
Birnir, J. K. (2007). Divergence in Diversity? The Dissimilar Effects of Cleavages on Electoral Politics in New Democracies. American Journal of Political Science, 51(3), 602–19.
Bormann, N. & Golder, M. (2013). Democratic Electoral Systems Around the World, 1946–2011. Electoral Studies, 32, 360–69.
Broms, R., Dahlström, C. & Fazekas, M. (2019). Political Competition and Public Procurement Outcomes. Comparative Political Studies (Online first), 1–34.
Chamon, M., Firpo, S., de Mello, J. M. P. & Pieri, R. (2018). Electoral Rules, Political Competition and Fiscal Expenditures: Regression Discontinuity Evidence from Brazilian Municipalities. The Journal of Development Studies, 55(1), 19–38.
Cox, G. (1997). Making Votes Count. Cambridge: Cambridge University Press.
Cyr, J. (2017). The Fates of Political Parties: Institutional Crisis, Continuity, and Change in Latin America. Cambridge: Cambridge University Press.
Dahl, R. (1971). Polyarchy: Participation and Opposition. New Haven: Yale University Press.
Dash, B. B. & Mukherjee, S. (2015). Political Competition and Human Development: Evidence from the Indian States. The Journal of Development Studies, 51(1), 1–14.
Downs, A. (1957). An Economic Theory of Democracy. New York: Harper & Row.
Duverger, M. (1954). Political Parties, Their Organization and Activity in the Modern State. New York: Wiley.
Duverger, M. (1972). Party Politics and Pressure Groups: A Comparative Introduction. New York: Crowell.
Helmke, G. (2002). The Logic of Strategic Defection: Court-Executive Relations in Argentina under Dictatorship and Democracy. American Political Science Review, 96(2), 291–303.
Hicken, A. & Kuhonta, E. M. (2011). Shadows from the Past: Party System Institutionalization in Asia. Comparative Political Studies, 44(5), 572–97.
Ishiyama, J. & Quinn, J. J. (2006). African Phoenix? Explaining the Electoral Performance of the Formerly Dominant Parties in Africa. Party Politics, 12(3), 317–40.
Jensen, N. & Wantchekon, L. (2004). Resource Wealth and Political Regimes in Africa. Comparative Political Studies, 37(7), 816–41.
Kalla, J. L. & Broockman, D. E. (2016). Campaign Contributions Facilitate Access to Congressional Officials: A Randomized Field Experiment. American Journal of Political Science, 60(3), 545–58.
Karvonen, L. (2011). Party System Fragmentation. In Badie, B., Berg-Schlosser, D. & Morlino, L. (eds) International Encyclopedia of Political Science (pp. 1823–4). Thousand Oaks, CA: Sage.
Katz, R. & Mair, P. (1995). Changing Models of Party Organization and Party Democracy: The Emergence of the Cartel Party. Party Politics, 1(1), 5–28.
Kirchheimer, O. (1990). The Catch-All Party. In Mair, P. (ed.) The West European Party System (pp. 50–60). Oxford: Oxford University Press.
Kuenzi, M., Tuman, J. P., Rissmann, M. P. & Lambright, G. M. (2019 [online, 2017]). The Economic Determinants of Electoral Volatility in Africa. Party Politics, 25(4), 621–31.
LeBas, A. (2013). From Protest to Parties: Party-Building and Democratization in Africa. Oxford: Oxford University Press.
Levitsky, S. & Way, L. (2010). Competitive Authoritarianism: Hybrid Regimes after the Cold War. Cambridge: Cambridge University Press.
Lipset, S. M. & Rokkan, S. (1967). Party Systems and Voter Alignments: Cross-National Perspectives. New York: Free Press.
Lublin, D. (2017). Electoral Systems, Ethnic Heterogeneity and Party System Fragmentation. British Journal of Political Science, 47(2), 373–89.
Madrid, R. (2005). Ethnic Cleavages and Electoral Volatility in Latin America. Comparative Politics, 38(1), 1–20.
Madrid, R. (2019). Opposition Parties and the Origins of Democracy in Latin America. Comparative Politics, 51(2), 57–178.
Mainwaring, S., Bizzarro, F. & Petrova, A. (2018). Party System Institutionalization, Decay, and Collapse. In Mainwaring, S. (ed.) Party Systems in Latin America: Institutionalization, Decay, and Collapse (pp. 17–33). Cambridge: Cambridge University Press.
Mainwaring, S. & Torcal, M. (2006). Party System Institutionalization and Party System Theory after the Third Wave of Democratization. In Katz, R. S. & Crotty, W. J. (eds) Handbook of Party Politics (pp. 204–27). Thousand Oaks: Sage.
Mainwaring, S. & Zoco, E. (2007). Political Sequences and the Stabilization of Interparty Competition. Party Politics, 13(2), 155–78.
Michels, R. (1911). Political Parties: A Sociological Study of the Oligarchical Tendencies of Modern Democracy. Eastford, CT: 2016 edition from Martino Fine Books.
Milazzo, C., Moser, R. G. & Scheiner, E. (2018). Social Diversity Affects the Number of Parties Even under First-Past-the-Post Rules. Comparative Political Studies, 51(7), 938–74.
Morlino, L. (2011). Changes for Democracy: Actors, Structures, Processes. Oxford: Oxford University Press.
Ostrogorski, M. (1910). Democracy and the Party System in the United States (Scholar’s Choice Edition). Sacramento, CA: 2015 reprint by Creative Media Partners LLC.
Pacek, A. & Radcliff, B. (1995). The Political Economy of Competitive Elections in the Developing World. American Journal of Political Science, 39(3), 745–59.
Page, B. I., Bartels, L. M. & Seawright, J. (2013). Democracy and the Policy Preferences of Wealthy Americans. Perspectives on Politics, 11(1), 51–73.
Pedersen, M. N. (1979). The Dynamics of European Party Systems: Changing Patterns of Electoral Volatility. European Journal of Political Research, 7(1), 1–26.
Powell, E. & Tucker, J. (2014). Revisiting Electoral Volatility in Post-Communist Countries: New Data, New Results, and New Approaches. British Journal of Political Science, 44(1), 123–47.
Rae, D. W. (1967). The Political Consequences of Electoral Laws. New Haven: Yale University Press.
Rahat, G., Hazan, R. Y. & Katz, R. S. (2008). Democracy and Political Parties: On the Uneasy Relationships between Participation, Competition and Representation. Party Politics, 14(6), 663–83.
Raymond, C. D. (2015). The Organizational Ecology of Ethnic Cleavages: The Nonlinear Effects of Ethnic Diversity on Party System Fragmentation. Electoral Studies, 37, 109–19.
Riedl, R. B. (2014). Authoritarian Origins of Democratic Party Systems in Africa. Cambridge: Cambridge University Press.
Roberts, K. M. (2002). Social Inequalities without Class Cleavages in Latin America’s Neoliberal Era. Studies in Comparative International Development, 36(4), 3–33.
Roberts, K. M. & Wibbels, E. (1999). Party Systems and Electoral Volatility in Latin America: A Test of Economic, Institutional, and Structural Explanations. American Political Science Review, 93(3), 575–90.
Rozenas, A. & Sadanandan, A. (2018). Literacy, Information, and Party System Fragmentation in India. Comparative Political Studies, 51(5), 555–86.
Rueschemeyer, D. (2004). The Quality of Democracy: Addressing Inequality. Journal of Democracy, 15(4), 76–90.
Sartori, G. (1976). Parties and Party Systems: A Framework for Analysis. Cambridge: Cambridge University Press.
Schattschneider, E. E. (1942). Party Government. New York: Holt, Rinehart and Winston.
Schattschneider, E. E. (1960). The Semisovereign People: A Realist’s View of Democracy in America. New York: Holt, Rinehart and Winston.
Schedler, A. (2015). Electoral Authoritarianism. In Scott, R. & Kosslyn, S. (eds) Emerging Trends in the Social and Behavioral Sciences [online] (pp. 1–16). Chichester: John Wiley & Sons.
Schuler, P. & Malesky, E. (2014). Authoritarian Legislatures. In Martin, S., Saalfeld, T. & Strøm, K. (eds) The Oxford Handbook of Legislative Studies (pp. 676–95). Oxford: Oxford University Press.
Schumpeter, J. A. (1942). Capitalism, Socialism and Democracy. New York: Harper and Brothers.
Slater, D. (2013). Democratic Careening. World Politics, 65(4), 729–63.
Slater, D. & Wong, J. (2013). The Strength to Concede: Ruling Parties and Democratization
in Developmental Asia. Perspectives on Politics, 11(3), 717–33.
Snyder, R. & Duran-Martinez, A. (2009). Does Illegality Breed Violence? Drug Trafficking and State-Sponsored Protection Rackets. Crime, Law, and Social Change, 52(3), 253–73.
Strøm, K. (1989). Inter-Party Competition in Advanced Democracies. Journal of Theoretical Politics, 1(3), 277–300.
Svolik, M. (2012). The Politics of Authoritarian Rule. New York: Cambridge University Press.
Svolik, M. (2018). When Polarization Trumps Civic Virtue: Partisan Conflict and the Subversion of Democracy by Incumbents. SSRN.
Tavits, M. (2013). Post-Communist Democracies and Party Organization. Cambridge: Cambridge University Press.
Weghorst, K. R. & Bernhard, M. (2014). From Formlessness to Structure? The Institutionalization of Competitive Party Systems in Africa. Comparative Political Studies, 47(12), 1707–37.
Weyland, K. (2017). Autocratic Diffusion and Cooperation: The Impact of Interests vs. Ideology. Democratization, 24(7), 1235–52.
Ziegfeld, A. (2012). Coalition Government and Political Party Change: Explaining the Rise of Regional Political Parties in India. Comparative Politics, 45(1), 69–87.
Ziblatt, D. (2017). Conservative Political Parties and the Birth of Modern Democracy in Europe. Cambridge: Cambridge University Press.
52 Regime Change
Laurence Whitehead
A Foundational Topic
The study of regime structures and regime types, and therefore also of regime change, is an essential theme within the field of comparative politics. Such basic normative categories as despotism, oligarchy, dictatorship, democracy and mixed constitutionalism (see Schlumberger and Schedler, Chapter 42; Gagné and Mahé, Chapter 47, this Handbook) can be traced over millennia to the very beginning of theorizing about politics. Aristotle also originated the comparative empirical analysis of the Greek polities of his time. Since then, both normative and empirical comparisons of varieties of political rule have extended from city states to agrarian empires, pre-industrial civilizations, European colonial systems and eventually modern nation states. After the end of the Cold War, the metrics of regime classification became much more refined, while on the normative side there was increased celebration of liberal democracy as the superior model.
Contemporary political science therefore draws upon a deep reservoir of prior scholarship informed not just by the rich and diverse array of political regimes currently in existence, but also by an even wider and more heterogeneous set of historical experiences. In the 21st century, however, more (allegedly scientific) work on regime change has narrowed the focus of comparative analysis to the study of a small set of precisely defined regime types, and in particular to the binary and normative comparison of democracies (understood in a very different way from those of classical Greece) and authoritarian regimes (conflating all the many variants of non-democratic rule). The normative connotations of this binary schema are also quite recent, since order had traditionally been valued above mob rule. In the post-1989 setting liberal democracy gained normative ascendancy on the grounds that with good institutional design strong rule of law and accountability procedures could generate legitimate leadership and public
choice outcomes that were also flexible and grounded in spontaneous citizen consent. By contrast, authoritarian regimes were too rigid and would periodically resort to coercion to cover up their lack of citizen legitimacy. In practice, however, ‘really existing’ regimes never conform as neatly as these idealized normative terms would require, with the result that students of comparative politics have invoked a variety of intermediate possibilities, including ‘illiberal democracies’, hybrid regimes, electoral authoritarianism, and so forth. In response to recent experiences, empirically minded scholars have also reintroduced other transversal considerations, such as the wide sub-national variations in political rule that can be experienced by citizens within a single regime and the strong distortions that can arise from systemic corruption, organized crime and even state failure in a considerable range of national cases. At the normative level, the superiority of liberal democracy has been challenged by growing evidence of short-termism, gridlock and kleptocratic capture in even the most longstanding and well regarded of democratic regimes, while certain authoritarian regimes have wrapped themselves in nationalist justifications and have sometimes achieved considerable degrees of performance legitimacy. Notwithstanding such real-life complications, mainstream political science continues an intense research program to generate measurements, indicators and rankings that reconcile the complexities of national political structures with the parsimony of available categorizations. Such scientific regime type analysis is supposed to explain (perhaps even predict) not just cross-national similarities and differences but also the pathways and outcomes of processes of regime change, to an improbable degree of mechanical necessity and statistical accuracy.
The provisional assessments and tentative generalizations of previous scholarly generations have been considered too hesitant and fuzzy, since modern techniques can establish such strong
and categorical findings as that democracies do not go to war with each other, and that above some threshold level of income per capita once a nation has democratized it will not backslide. In the light of such recently discovered academic truths it became easy to justify the adoption of more insistent and even interventionist methods to replace those regimes classified as non-democracies. The expectation both of political science and of the self-proclaimed leading world democracies was that the resulting substitute regimes would be more peaceful, more stable and in every way more desirable to their citizens and to the international community of democracies. In this way the venerable concept of regime change was transformed from a provisional category of observation into an instrument of great power imposition.
A Recently Contaminated Concept
Since the invasion of Iraq and the overthrow of Saddam Hussein in 2003, the term regime change has acquired notoriety as a code word for a controversial and perhaps extra-legal variant of great power interventionism. Dan Reiter (2017) calls this FIRC (foreign-imposed regime change) and provides both a general definition and a measure of its frequency. O’Rourke (2018) opts for the alternative designation, covert regime change, in her historical account of all such episodes by the US government during the Cold War.¹ But in many of these cases the United States did not act alone, and in some it was not even the dominant player, while other prominent western powers also promoted regime change by coercive and/or covert means, notably the British and the French. In addition, over the course of the 20th century a range of other leading powers (including some authoritarian regimes) engaged in more undisguised but still FIRC-like activities, including the Nazis in Czechoslovakia and Norway before World
War, and the USSR in Eastern Europe after 1945. There have followed a miscellany of further episodes since the Cold War that might be considered either FIRCs or attempts at covert regime change, for example in Afghanistan, Bolivia, Georgia, Haiti, Iraq, Libya, Panama, Somalia, Syria, Ukraine, Venezuela and Yemen. As this illustrative list indicates, a great variety of regimes are involved, and the possible sources of external intervention are multiple and diverse. Moreover, the coding of these cases is imprecise, and still subject to conflicting expert interpretations of key issues. Thus, the boundary between foreign influenced and foreign imposed cases can be elastic, and where covert action is involved the scope for mis-specification is magnified. Polemical features of 21st-century political turbulence have thus contaminated popular understandings of the meaning of ‘regime change’.
Standing Back
The post-Cold War experience shows we need a more disaggregated and nuanced grid of ‘regime change’ trajectories. Explanations need to distinguish between internal regime features and external drivers of regime change. Some of these drivers are applicable in both directions; others are unilinear. Early literature recognized the possibilities for regime type overlap, as evidenced by the concepts of democradura and dictablanda. Since then, more sophisticated classifications of contemporary ‘really existing’ regimes have highlighted the persistence of intermediate or ‘hybrid’ categories. The major national ranking and rating indicators tend to work with continuums rather than dichotomies. Multi-dimensional mapping indicates that many regimes evolve gradually over extended periods, often in an unbalanced manner. Tracking changes over time reveals considerable evidence of ‘oscillation’ and ‘backsliding’, contradicting early assumptions of
‘irreversible’ one-step regime change. Subnational studies have helped to uncover large territorial variance within many national regimes. The drivers of change are likely to vary between different subsets of a regime type (e.g. absolute monarchy, theocracy, military rule and one-party systems may all be counted as autocracies, but change according to distinctive logics). Another step is to separate out structural forces that may precipitate regime change over the long run (generational change, economic decline or demographic shifts) from more short-term factors (external military defeat, succession crisis or ideological polarization). The trajectory of the change process will depend in part upon the time horizon under consideration. A third aspect concerns the isolated or region-wide scope of the destabilizing impulse (most regime changes affect neighboring states, rather than being confined to one country). Finally, there is a broader set of contextual factors that have sometimes been invoked as causally determinant (state ‘failure’, religious dogma, or post-conflict memories and legacies). With these considerations in mind, and in order to decontaminate the concept and return it to use as an academic tool in the study of comparative politics, this chapter first examines the two component parts, ‘regime’ and ‘change’, before combining them and focusing on the scope conditions required to make political ‘regime change’ an effective analytical instrument. Much of the contemporary usage of the term concerns change from authoritarian rule to democracy (i.e. democratization), but as a general concept in comparative politics it cannot be restricted to that fashionable area alone. It must also apply to change in the reverse direction, and if the artificial assumption of a binary divide in regime types is relaxed it should also extend to changes between multiple variants (e.g. 
from ruling monarchy to constitutional theocracy in Iran in 1979, or from totalitarian communism to one-party authoritarianism
in China in the same period). While externally imposed change requires attention (and the international relations literature on this aspect is highly developed), internal regime dynamics can never be disregarded, and in most cases occupy the center stage.
Regimes Our first step should be to specify what is meant by a political regime, and how this relates to adjoining concepts. How do contemporary scholars use the term? A political regime is a durable and coherent system or structure of rules and norms (both formal and conventional) regulating the relations between the rulers and their subjects in a given jurisdiction (normally, but not necessarily, a recognized sovereign state). If so, a number of distinctions follow. If the core matrix of rules is too ephemeral or unstable there is no regime (as in Somalia, perhaps). When considering only formal and explicit regulations, one is dealing with a ‘constitution’ rather than a whole regime (which can be highly arbitrary, corrupt or personalist so long as informal norms support such features). ‘Government’ is another category that overlaps with ‘regime’, without equating to it, since governments are not necessarily durable or coherent, and they may relate to their subjects in an erratic and transactional manner. A regime is typically lodged within a state, but state formation is a separate and often a very long-term process that precedes and underpins regime formation, and the state tends to persist even when major regime change occurs (Fishman, 1990). So, in standard social science parlance, a regime outlasts successive governments but is less durable than a state. At the margin, this schema may not always hold (Prince Sihanouk’s government in Cambodia persisted through several changes of regime, at least nominally, and the Saudi regime seems to have preceded the Saudi state). Stephen Krasner has argued that
it is principles and values, rather than rules and norms, which best distinguish one regime type from the next. Stephanie Lawson (1993) adds the suggestion that the more democratic the regime, the sharper its differentiation both from government and from state. Conversely, in a deeply authoritarian regime neither government nor state can be strongly separated out. Note that all this discussion concerns national ‘political’ regimes – not to be confused with international regimes, or various other orders where the word is also applied (electoral, party, even dietary regimes).
Genealogy The past three decades have seen an explosion of regime typologies and empirical indicators that purport to measure and rank nations according to these criteria. All member states of the United Nations qualify for consideration under this rubric, but in practice it is the post-colonial states of the ‘global south’ that provide the bulk of the evidence reviewed here. Contemporary usage of the term is reasonably clear and stable, but it is also instructive for the purposes of this chapter to delve back into the genealogy of ‘regime’ as a concept derived from the history of European political thought. So for the genealogy of the concept it is worth reaching back to the ancien régime prevailing in France until 1789, and, although there are problems involved in tracing genealogical traditions too far back, comparing what the leading thinkers of Florence said about the shift from a Medici-dominated ‘narrow’ regime to a more ‘open’ regime in their city state in 1494.2
France References to the ancien régime in France appeared as early as 1789, but they initially
concerned fairly precise and narrow aspects of the previous dispensation, such as the old fiscal rules. Once the national assembly claimed full and indivisible sovereignty the meaning of the term broadened to encompass monarchical rule as a system, but even then, the divide between past and future forms of government was soft-edged. It was not until the following year that the wholesale ‘abolition’ of the Old Regime came to embody the purpose of the Revolution, making way for a tabula rasa with all the resulting radical consequences. The elasticity of the term ‘regime’ ended in the crystallization of an all-encompassing concept of a ‘Regime’, identified with all the negative practices, assumptions and values that the revolutionaries were sworn to uproot forever. The polysemy of the term was thereby ‘decontested’ (Freeden, 2013: chapter 2) by the imposition of its strongest and most emotionally polarizing variant, with consequences that have divided French politics (and have structured European political thought) ever since.3 Even today, much of the contemporary regime change literature rests on a moralizing and totalizing binary model of reasoning that reflects similar tabula rasa assumptions. But this is not the only way that regime discontinuities can be (or always have been) modeled.
Florence The perspective of Guicciardini’s 15th-century Dialogue offers a sharp contrast to the binary outlook of 19th-century French analysts of regime change, and this goes far to explain their very different standpoints.4 The Dialogue opens with an airing of the case for the departed regime of Piero de Medici before his expulsion from Florence. As the exchanges unfold the Dialogue contains more arguments supportive of a cautious version of the incoming republican form of government. But taken as a whole, the debate is unresolved, with the drift of the argument pointing rather toward a hardheaded realism about
the problems of both alternatives, as compared to the severe challenges facing rulers of whichever type. On this reading, whatever the theoretical merits of one regime or another, the greater need is to find a practical way of managing political dangers, with voluntaristic regime change pictured as more of a risk than a remedy. The lived experience of Florence in the four decades after 1494 lends extra authority to this line of argument. A comparison between the texts of Guicciardini and Furet points to a more general lesson about perspectivalism in the selection and deployment of concepts used to compare complex historical processes. Retrospection (France after the Revolution) generates a different angle of vision from prospection (Florence ahead of the last Medici experiment). This is so even when the cases under study are very similar. For example, when the ‘transitions’ project studied prospective transitions to democracy in the 1980s we already had access to substantial research on, say, post-Franco Spain, but in advance of what was about to take place, we were bound to adopt a more tentative position (O’Donnell et al., 1986). But the lesson is not confined to differential judgments separating known past outcomes from uncertain future ones. Perspectives on future regime changes can range from hubristic confidence that an eventual success is predetermined (Blair and Bush on Iraq) to implacable division between those favoring the outcome (regardless of the drawbacks) and those opposed (whatever the merits), as in the current British divide over exiting the European Union. When polarization of this kind takes hold it can divide and distort expert and academic opinion as well as broader public views, not only in the near term but also for generations to come, as the ideological impact of 1789 on French political culture demonstrates so clearly. 
But how relevant are such country-specific debates from centuries long past for the study of regime changes in current comparative politics? The negative view would be that they are only single country episodes;
they may not really resemble 20th and 21st-century conditions; they are hard to code; and the material presented here is subjective and impressionistic. On the other hand, the alternative case is that they are central to the process tracing of our concept formation; they alert us to the perspectival distortions that can enter into all case studies (modern quite as much as historical); and they demonstrate that the topic of regime change is inherently value-laden. To establish and compare narratives of this kind, observers should be aware of the background ‘framing’ that provides a good part of their meaning, both to the actors involved and to the scholars aiming to explain them.
Three Dimensions of Regime Change These glimpses of the genealogy of thinking about ‘regimes’ also introduce the term ‘change’ – the second term we must define and deconstruct. In the French case the relevant master concept is ‘revolution’, whereas in the Florentine example there are two key words: alterazione and mutazione. A further Italian contribution, named after the 1958 novel and referring to Garibaldi’s arrival in Sicily, is gattopardismo.5 The classical Chinese concept of tianming is translated as ‘losing the Mandate of Heaven’.6 Standard terms in the modern literature on comparative democratization include liberalization, transition, consolidation, backsliding, degeneration and ‘careening’ (Slater, 2013). All refer to change of a single regime, but there are also collective linear processes (such as modernization) and cyclical movements. For example, many authors still lazily invoke the metaphor of ‘waves’ – more accurately thought of as ‘clusters’. And I have recently proposed ‘oscillations’ (Whitehead, 2018), which could apply both to single regime changes and to wider demonstration effects. The most ambitious (and misguided) conceptualization
of global change was a once-and-for-all liberal democratic terminus as presented in Fukuyama’s The End of History (1992). In short, the scholarly literature contains an embarras de richesses as regards terms available to categorize macro-political processes of regime change. Although the main focus of all these alternative phrases is to characterize discontinuities between successive political orders within a given sovereign territory, regimes can also change through secession (South Sudan’s independence in 2011) or incorporation (annexation of Goa by India in 1961); decolonization or military incursion – either successful (Russia and Crimea) or thwarted (the Greek attempt to seize Cyprus or Argentina’s Falkland invasion). In short, both internal and international routes to political regime change can come in many variants, and the concept therefore encompasses a wide diversity of processes. Depending on the issue under consideration, various strategies of disaggregation may therefore be required. Alternatively, taking all versions of regime change as a set, separate dimensions of these processes may be distinguished. In what follows, all such processes can be analyzed and classified along three basic dimensions: the direction of change, the scale of change and the duration of the changes in question.
Direction of Change For a couple of decades after the fall of the Berlin Wall in 1989, regime change was a term generally equated with transitions to democracy (Rustow, 1970). This was the dominant direction of movement in the 1990s and was still assumed by many academics to be the natural or underlying tendency in the new millennium, given the assumed dominance of democratic polities in the post-communist world system. But as contrary experience accumulated, scholarly opinion began to shift (albeit with a time-lag), and by the end of the Obama administration not only the resilience
but also even the resurgence of ‘authoritarian’ regimes became an academic growth industry. Thereafter, a new priority abruptly gained prominence in the comparative politics literature: the perceived vulnerability of even the most secure and long-established democratic regimes to destabilization by their undemocratic rivals. Thus uni-directional models of change gave way to bi-directional thinking. In practice this was really a return to an older tradition of analysis: the ‘breakdown’ of democracy studies of the early 1970s, which drew lessons from Weimar and reflected on the South American experiences of that period (Linz and Stepan, 1978). At a more theoretical level, it should have been obvious that so long as there are multiple regime types in operation in world politics, analysis must contemplate the possibility that changes between them might go in any direction.7 In a world with only two regime types the directional possibilities are binary, but as we shall see below, both theory and measurement of 21st-century experience have shifted the discussion toward more plural typologies, with various intermediate or mixed possibilities that are proving sufficiently stable and coherent to count as hybrid ‘regimes’ (electoral authoritarianism, etc.). The essential point for comparative theory is that three regime types require the examination of 6 possible directions of change, while four types generate 12 possible pathways, five types raise the theoretical options to 20 possible routes between any one and any other, and so on. For that reason, any refinement of the initial binary schema generates much larger shifts in the content and dynamics of regime change theory. In practice, the norm is regime continuity rather than change, and where multiple shift options become available in theory, in reality most are ‘empty box’ hypotheticals. 
For example, it is hard to imagine any contemporary case of migration from liberal democracy to royal absolutism,8 whereas movement in the opposite direction is fairly standard (recently in Bhutan and Nepal, for example).
But royal absolutism can equally well transition to electoral authoritarianism (Kuwait), or to military rule (Libya), or to theocracy (Iran), or to a one-party state (Iraq), rather than to a constitutional democracy. And if all these alternatives are recognized as variants of regime change, one must also entertain the possibility that one such transition may also belong in a sequence (such as from royal absolutism to hybrid constitutionalism to military rule to liberal democracy – and, perhaps, back again). But with multiple regime options one also needs to keep in mind that our definition of the term includes the attribute of durability. That can be reconciled with the idea of sequential regime changes, but only if they are major discontinuities that must also be spaced out over a considerable period of time. With that caveat it becomes logical to model regime changes as not only bi-directional but also potentially reversible, and as sometimes involving intermediate stages and at other times including the possibility of skipping stages (moving from one pole of a continuum directly to the opposite pole, for instance). For example, France in 1789 can be considered to have transited within three years directly from royal absolutism to constitutional republicanism, and thereafter for two centuries to have shifted back and forth between various intermediate (but less categorical) versions of these two pure types. The UK, over the same period, can be seen as pursuing a more linear and gradualist progression, as the franchise was extended step by step, and as more inclusionary and egalitarian features were added on incrementally. This contrast leads us to the second dimension of regime change.
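The combinatorics of direction discussed in this section can be checked with a short enumeration. The sketch below is illustrative only: the regime labels are placeholder assumptions rather than a typology proposed in the chapter, and `directed_changes` is a hypothetical helper name. With n regime types there are n × (n − 1) ordered origin-destination pairs, which gives 6 for three types, 12 for four and 20 for five.

```python
from itertools import permutations


def directed_changes(regime_types):
    """Return every ordered (origin, destination) pair of distinct regime types."""
    return list(permutations(regime_types, 2))


# Placeholder labels, assumed purely for illustration.
types = [
    "liberal democracy",
    "electoral authoritarianism",
    "military rule",
    "theocracy",
    "royal absolutism",
]

for n in range(2, len(types) + 1):
    pairs = directed_changes(types[:n])
    print(f"{n} regime types -> {len(pairs)} possible directions of change")
```

The count grows roughly with the square of the number of types, which is why each refinement of the binary schema multiplies the hypothetical pathways, even though most of them remain ‘empty boxes’ in practice.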
Scale of Change Two centuries ago the UK was not a democracy. Today (by general consent) it is. If so, when did Britain’s political regime change from constitutional oligarchy to democracy?
Was it at the time of the 1832 Reform Act (Ertman, 2010), the Second Reform Act of 1867, the showdown over the House of Lords in 1910, the granting of universal suffrage in 1928, the defeat of the old order in 1945 or ‘UK Independence’ from the anti-democratic shackles of the EU in the referendum of 2016? Or will it not be accomplished until the hereditary element of the House of Lords is eliminated, and Britain acquires its own written constitution? These are all valid steps that, when taken together, could be counted as democratic ‘regime change’ in standard coding exercises in comparative politics. But they were not taken together in this case – they were conceded bit by bit over almost two centuries. So what scale of discontinuity is required for political scientists to authorize the use of the term ‘regime change’? Perhaps the UK’s three centuries of constitutional monarchy and continuity make it a case apart, outside the classifications that encompass all other political systems? Not really, for gradual evolution is a more general feature of regime trajectories. Hence, in 2004 the Democracy Coalition Project compared ten new democracies under the heading ‘regime change by the book’. The study concerned four categories of what it called ‘legal regime change’ outside of national elections that might serve to ensure democratic continuity and avoid backsliding. These changes within a democratic regime (recall and votes of no confidence, impeachment, succession and criminalization of unconstitutional power seizures) were all visualized as partial legal regime changes that might reinforce regime development in a democratic direction (Piccone, 2004). 
So although some (probably most) regime changes can be dated precisely as total systemic transformations, it is also necessary to contemplate the possibility of smaller scale but linked innovations stretching out over generations, or even centuries, that can cumulatively and incrementally give rise to long run system change. If so, we need some criteria to identify which smaller scale but linked innovations should be included
under this rubric, since there are also many examples of partial and potentially incremental measures that do not culminate in full spectrum regime change. Indeed, a majority of such reforms may more plausibly be understood as attempts to pre-empt and indeed avert ‘real’ democratization. Temporality enters in here as well: how long is it possible to delay before the next step in the sequence confirms that the real direction is cumulative? Even when the formal institutions of a regime are of sufficient scale and cohesion to guide a systemic change, the ‘informal’ institutions that can also regulate political behavior may well thwart this objective, in particular at the local and community levels. For example, some of the Arab Spring initiatives to introduce more democratic politics to the Middle East after 2011 were blocked in part by informal and traditional power structures that operated most strongly away from the main cities, and especially in many rural areas (as in Egypt). More generally, recent scholarship on regime change has shifted from an initial focus on questions of institutional design toward paying more attention to ‘informality’ and formal institutional weaknesses as major factors shaping trajectories of political change (Levitsky and Murillo, 2009).9 There is also more scholarly attention paid to various sectoral components of regime resilience or instability, such as the role of youth organizations (see, for example, Krawatzek, 2018). The dimension of scale not only relates to sectoral and informal responses to a given institutional innovation; it can also concern the geographical scope of intentional regime change: for example, the establishment of the Scottish Assembly may only directly affect one-tenth of the UK’s citizenry, but even so it could prove a critical step in the democratization of Scotland (possibly in contradistinction to its impact on the other 90%). 
Direct election to the European Parliament (also connected with other provisions granting it budgetary, legislative, treaty approving and supervisory functions) can be regarded as a significant step toward the establishment of a
more pooled sovereignty regime in the great mass of continental Europe as a whole. Thus, in addition to steps taken at the nation-state level, both sub-national and supra-national power transfers can qualify as features of incremental regime change.10 This does not only apply to peaceful change in democratic regimes – the secession of South Sudan or the incorporation of Zanzibar into Tanzania were also ‘regime changes’ on a regional scale in non-democratic contexts. The same may apply to Hong Kong’s gradual absorption into the PRC. These examples serve as a lead-in to the third dimension of regime change to be considered here.
Duration of Change A good starting point here is the step-change model of regime transition, well exemplified by the case of Spain. For four decades Franco and his military imposed a brutally repressive and centralized authoritarian regime, until the caudillo’s death from natural causes in 1975. There followed a brief and intense process of transition, but then from the constitution of 1978 (approved by 92% of the votes cast in a referendum), for a further four decades Spain was governed as a standard liberal democracy. Here we have one durable regime succeeded by a very different type of successor of at least equal durability, with only the briefest of intervals separating the two. This experience provided an influential model for many other countries to replicate and has captured the scholarly and public imagination as a ‘normal’ pattern of democratization. Even so, closer inspection of the Spanish case raises a few complications. In 1981 a military contingent seized the Cortes with the evident intention of restoring a Franquista-style system. It was promptly suppressed owing to the action of the king, but the fragility of the new disposition had been exposed to public view. For the first three decades of the democratic regime an armed resistance operated in the Basque country, led by ETA,
the regional separatist movement that in some ways had precipitated the transition when it assassinated General Franco’s chosen successor and thereby opened the way for monarchical arbitrage. In the fourth decade, as the 1978 regime overcame the armed opposition of some Basques, a (bare) majority of Catalan voters migrated to support for independence under a separatist republic. In a similar vein, Franco’s authoritarian rule encompassed an early period of utter ruthlessness, but also a final two decades of relative pluralism (within some strict limits). Moreover, before the civil war of 1936–9 a variety of more short-lived and partial regime forms had succeeded one another in dizzying succession – absolute monarchy, constitutional monarchy, first republic, monarchical restoration, but with a military interlude (the dictablanda of General Berenguer and Admiral Aznar in 1930–1, contrasting with the previous dictadura of Primo de Rivera), a second republic (including a democradura episode during the 1930s depression) and then the election of a Popular Front government. Taking all these features into consideration, a heroic degree of schematization is required to picture Spain’s experiences of regime change simply in terms of a single step-change between two very durable and opposed regime types. At the other extreme in terms of duration are those fleeting political experiments that do not last long enough to qualify as regimes (Chile’s hundred-day first ‘socialist republic’ only lasted from June 4 to September 13, 1932, for example). Four decades later, Juan Linz (1975) suggested that some South American military takeovers were to be understood as corrective interludes rather than permanent power seizures, and so coined a distinction between authoritarian ‘situations’ and more durable ‘regimes’. In synthesis, models of regime change are only usable within certain ‘scope conditions’ that do not apply to all cases in all time periods. 
The scale of the political system can be too small/too subordinate; its duration can be too brief; its internal coherence can be too
unstable (causing directional unpredictability); the territory in question may contain too many ‘ungoverned’ spaces; and so on. In addition to these three general dimensions of regime change (direction, scale and duration) there are, of course, other useful sub-classifications to consider. One transversal category of particular interest is the speed of the relevant change or changes. From the fall of the Bastille to Ten Days that Shook the World or certain episodes within the Arab Spring sequence, we know that intensely drastic, comprehensive and unexpected political changes can be compressed into extremely short time spans. (This can apply regardless of the direction, scale or eventual duration of the regime change in question.) On the other hand, a slow buildup of relentlessly advancing forces can also exert such pressure for change that even a very robust regime will be compelled to give way. So another consideration, related but not reducible to speed, is timing. For example, there are also voluntarist possibilities, in which change can be decreed from the apex of a strong regime, which uses all its powers of control to speedily and proactively rewrite the rules of the game ahead of events (Ataturk in the 1920s; Taiwan in the 1980s). When great speed is involved resistance may be too late, but collective understanding may lag well behind the momentum of change, whereas when relentless long-term forces have built up, the society may be psychologically prepared for the new regime well before the rulers discover what is happening. In summary, speed can be a highly relevant variable shaping the trajectory of regime changes, regardless of their direction or magnitude.11
Hybrid and Borderline Cases In general, it is easy to grasp the idea that all ‘really existing’ examples of regime change must fit within the three dimensions of direction, scale and duration, and that they must fall
between some upper and some lower limit on each account. But when it comes to placing any individual process within this comparative framework it turns out that many contemporary episodes are hard to place with much precision, and that crucial but awkward features of key examples often fall outside neat boundaries. To put the same point in different language, careful and contextually sensitive examination of the empirical evidence tends to throw up an abundance of borderline cases and ‘hybrid’ experiences. Recent discussions of ‘backsliding’ by long-established (allegedly ‘consolidated’) democratic regimes highlight the difficulties that can arise when attributing directionality to regime changes. How long must a regression last, and what scale must it display, before it should be reclassified as a de-democratization? Does the same threshold (in terms of duration and scale) apply when examining an exemplary case of democratic consolidation (Switzerland or the United States, for example) as when considering a more recently and less solidly implanted counterpart (say Brazil, or the Philippines)? Likewise, when a totalitarian regime (say Mao’s China) undergoes far-reaching liberalization on multiple dimensions (economic, institutional, communicational and even cultural), as under the Deng reforms, and therefore migrates to the ‘authoritarian’ category, the concept of regime change may seem applicable – but this is only clear so long as the three dimensions operate steadily and conjointly. Once Xi Jinping starts countermanding these liberalization processes, a ‘hybrid’ or borderline classification may be needed, or perhaps the reversal of direction may even merit the label ‘restored totalitarianism’. Current scholarly uncertainties about both the United States and the China examples underline the importance and sensitivity of such judgments, and therefore challenge overconfident labeling exercises in the literature on contemporary regime change. 
More generally, as already noted, when more regime types
are singled out for separate consideration (for example, Juan Linz (1975) identified seven sub-types of authoritarian regime, which he also distinguished from totalitarianism; or for military regimes see Geddes, 2014), that can greatly expand the scope for mismatch between cases and categories.
The Explosion of Large-N Statistical Analysis Notwithstanding these complications, since the end of the Cold War empirical work on regime change has become far more focused on the analysis of large-N datasets. The construction of such series began well before 1989, with the Freedom House annual ratings of the nations of the world on a 1–7 scale dating back to 1972. But the proliferation of ‘objective’ annual measurements of regime type outcomes really took off in the 1990s. Polity IV used a ten-point scale to measure both autocracy and democracy in all countries stretching far back over time. Bertelsmann constructed two ‘transformation indices’ to track the progress of both political and economic reform in major transforming states since the Soviet collapse; and the Varieties of Democracy project probably constitutes the most elaborate and carefully specified catalogue of regime trajectories currently available. Each series adopts its own specific criteria and protocols. In broad terms a large degree of co-variance is reported, although much is also made of second order divergences, and differences of coverage. The two main methods involved are expert ratings and public opinion surveys. The annual Barometer Surveys for Europe, Latin America, Africa and Asia are increasingly coordinated to generate comparable regional outputs, which can also be checked against such global exercises as the World Values Survey. Other supportive datasets are provided on more specific themes, such as the Electoral Integrity Project and Transparency International’s Corruption Perceptions Index.
Gasiorowski (1996) produced an early political regime change dataset covering 97 ‘third world’ countries of more than one million inhabitants, from independence to 1992. This has been superseded by later work using extensive post-Cold War evidence, with many more observations and more sophisticated coding procedures. However, it is worth revisiting as a clear and foundational large-N exercise. He found high correlation between his data and nearly all the other extant series available. He collapsed ‘totalitarian’ into ‘authoritarian’ regimes on the grounds that there were so few of the former. Nevertheless, he identified not two but four separate regime types – democracy, semi-democracy, authoritarian and transitional. The last of the four was surprising, given that he recorded so few cases, and also that they were (self-evidently) all short-lived. The most significant finding was that only 5% of the 6,842 regime-years in his tabulation recorded some form of regime change. Regime continuity was very much the norm in this third world universe (Gasiorowski, 1996: 477).12 A decade later, using more careful concepts, coding and sources, Mainwaring et al. (2007: Appendix Table 5.3) conducted a similar exercise, restricting their coverage to the 20 republics of Latin America between 1945 and 2004, and limiting themselves to only three regime types (authoritarian, semi-democracy and democracy). From this set of 1,200 regime-year observations they identified in total 90 regime changes (12 in Peru, 10 in Argentina, 9 in Ecuador, 7 in Venezuela, 6 in Honduras, 5 in Bolivia, 5 in Brazil). Overall there were 52 changes in a ‘positive’ direction (4.3% of all years) compared with 38 regressions from democracy (3.2%). But it is permissible to query whether these were really comparable episodes. The average Peruvian regime lasted five years (almost too short to qualify as durable), whereas in Cuba and Paraguay only one change was recorded in six decades. 
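The regime-year figures reported for the Latin American exercise can be reproduced with simple arithmetic. The sketch below uses only the counts quoted above; the function names are hypothetical, and the spell-length calculation rests on the assumption that each recorded change closes one regime spell and opens another, so that k changes imply k + 1 spells.

```python
def change_rate(changes, regime_years):
    """Share of regime-years (in percent) in which a change was recorded."""
    return 100.0 * changes / regime_years


def mean_spell_length(years_observed, changes):
    """Average regime duration, assuming k changes split the period into k + 1 spells."""
    return years_observed / (changes + 1)


countries = 20
years = 2004 - 1945 + 1            # 60 annual observations per country
regime_years = countries * years   # 1,200 regime-years in total

print("positive changes (%):", round(change_rate(52, regime_years), 1))
print("regressions (%):", round(change_rate(38, regime_years), 1))
print("all changes (%):", round(change_rate(90, regime_years), 1))
print("mean Peruvian spell (years):", round(mean_spell_length(years, 12), 1))
```

On these assumptions the Peruvian average comes out slightly under five years, consistent with the rounded figure given in the text.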
Moreover, neither the Cuban Revolution of 1959 nor the Nicaraguan Revolution of 1979 were coded as regime
changes of any kind: unbroken authoritarian rule is what this classification system decrees for these profound political transformations. Many further large-N regime studies have since accumulated, with broader coverage and more refined coding procedures. For example, Jan Teorell (2010: 142) provided a useful overview and a range of causal models and estimations of democratic regime change (he accepts that this is not the only possible variant) for 165 countries between 1972 and 2006, using a continuum and separating ‘upturns’ from ‘downturns’. Probably his most robust finding was that ‘during the third wave, modernization and economic freedom prevented de-democratization’. However that may be, a major preoccupation in the literature since his study was published has been to explain the proliferation of de-democratization tendencies (or ‘backsliding’) in modern and ‘marketized’ regimes following the 2008 global financial crisis. Two key features of all these exercises require particular attention here: they generate comparative data in the form of regime-years; and they take for granted the relative homogeneity of the universe of cases under consideration. From a ‘regime change’ standpoint the first feature means that major discontinuities in national political trajectories should show up as dateable breaks in a numerical series – the scale and direction of each break should therefore be quantified and capable of comparison with equivalent movements in parallel series. The second feature means that this measurement of ‘equivalent movement’ is an artifact of the coding procedures adopted, rather than the product of contextually sensitive paired comparison. It is simply taken for granted or imposed as a convenient assumption, rather than established through careful observation, that a one-year movement of so many points in the scoring of country A equates, in regime change terms, to the same outcome as in country B when the same movement is recorded. 
(Thus, for example, the breakup of the USSR counts as one observation unit; the ‘rescue mission’ to
Grenada rates as an equivalent other). Both these features merit some critical exploration. Consider the regime-year criterion as it operates in one of the more interesting recent episodes of regime change. For three decades after 1955, Argentine politics followed the path of an ‘impossible game’ (O’Donnell, 1973), with frequent and often dramatic changes not only in political leadership but also in the rules governing political competition and the goals and values of the system as a whole. Eventually, in 1984, a more durable regime change (transition to democracy) took place, but even then military rebellions, the forced early retirement of an elected president and a disputable constitutional change intervened. Still, a decade and a half of fairly normal democratic politics followed – until 1999, after which a new and extended period of more polarized confrontation ensued. At best, the ‘regime-year’ method of coding might more or less succeed in tracking the key turning points, but it would also pick up a great deal of turbulence and noise that, standing back a little, can be seen to have been leading nowhere. Although Argentina is an especially disorienting case, similar distortions can arise when relying on a sequence of annual observations in other leading cases as well, such as Thailand or Turkey, for example (Turan, 2015). At the other end of this continuum there are also very gentle and incremental progressions from one regime to another, with no abrupt annual discontinuity, but a clear shift over time nonetheless (post-Mao China, for example). Even very well constructed indicators may therefore miss important features of regime change if they focus too heavily on calendar years. (And most such annual series are not that well honed.) 
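The two distortions just described can be made concrete with a small worked example. What follows is a minimal sketch in Python, not a procedure taken from any of the datasets discussed: the function names `annual_breaks` and `net_change`, the threshold of 3 points, and both score series are hypothetical and chosen purely for illustration. The sketch flags every calendar year in which a score moves by at least the threshold, and sets that against the overall movement across the whole period.

```python
from typing import Dict, List, Tuple


def annual_breaks(scores: Dict[int, int], threshold: int) -> List[Tuple[int, int]]:
    """Return (year, change) for each year whose score differs from the
    previous calendar year's score by at least `threshold` points."""
    breaks = []
    for year in sorted(scores):
        previous = year - 1
        if previous not in scores:
            continue  # no observation for the preceding year, so no comparison
        change = scores[year] - scores[previous]
        if abs(change) >= threshold:
            breaks.append((year, change))
    return breaks


def net_change(scores: Dict[int, int]) -> int:
    """Return the difference between the last and the first observed score."""
    years = sorted(scores)
    return scores[years[-1]] - scores[years[0]]


# Hypothetical series on an arbitrary 0-10 scale (not real country data).
turbulent = {2000: 2, 2001: 6, 2002: 2, 2003: 6, 2004: 2}  # repeated reversals
gradual = {2000: 2, 2001: 3, 2002: 4, 2003: 5, 2004: 6}    # slow, steady shift

print(annual_breaks(turbulent, threshold=3), net_change(turbulent))
print(annual_breaks(gradual, threshold=3), net_change(gradual))
```

Under these assumed figures the turbulent series registers a break in every year after the first yet ends exactly where it began, while the gradual series registers no break at all despite a cumulative shift of four points. The choice of threshold is itself a coding decision, which is part of the point being made in the text.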
One reason why such numerical proxies are so loose an approximation of reality is that they rely on standardized criteria taken out of their configurative context and assumed to be uniform in their significance (electoral turnover, constitutional reform, media pluralism, judicial independence and so on). Large-N
tabulations do not discriminate between exemplary regime changes (France in 1789, Spain in 1976) and marginal episodes. This might make sense if the universe of processes under inspection was homogeneous (if there was a worldwide convergence toward a single liberal democratic end-state of regime, say), but it will distort and mislead if in fact not only the starting points but also the trajectories and the eventual outcomes of regime changes prove diverse or divergent. A country gets included in these databases if it is recognized as a sovereign state (since 1945 the critical test is membership of the UN General Assembly), and perhaps also if it has a minimum population size (more than one million inhabitants is the usual cut-off point). In major respects the universe of cases under study is quite heterogeneous, as are both the form of regime that each starts with and the available pathways for subsequent political development. For example, although the parliamentary democracies of Iceland, India and Israel may be similar in terms of their constitutional structures, other key indicators measuring core features of their respective political orders are far from fully comparable. The first is an ethnically homogeneous and geographically secure pluralist regime with a very small population and a very high per capita income. The second is a vast, linguistically and religiously diverse low-income sub-continental nuclear power with bitter internal divisions and hostile neighbors. It has more than three thousand times as many citizens as Iceland, with far lower average levels of education and productivity. The third is a beleaguered state recently constructed as the homeland of a previously scattered Jewish people, with occupied territories containing a large and embittered non-Jewish population not yet reconciled to their diminished legal status.
In order to compare regime change in such contrasting situations it is essential to factor in the features that are unique to each of these contexts, as well as those that are common to all three. Enlarging the universe of countries in the database may
distract attention from this requirement, but it only adds to the range of non-standard features screened out by the indicators available for all cases. The 193-strong current membership of the General Assembly ranges from China with 1.4 billion inhabitants to Nauru with 9,000. So instead of treating each member state as an equal-weight reporting unit for regime change purposes, it would make sense to test¹³ large-N comparative political databases on population-weighted criteria. For example, Corbett and Veenendaal (2018) examine the 39 states with fewer than one million inhabitants as a separate cluster. They find that whereas GDP per capita was a significant predictor of democracy in the 154 states with more than one million inhabitants, there was no such relationship among the 39 smallest states. These small states also differ from the rest with regard to cultural fragmentation and colonial legacy, and also on other dimensions. Overall they not only establish that small states constitute a separate set of cases with distinctive empirical characteristics, they also provide a credible theory to account for this contrast. Stated simply, regimes with small populations depend more on face-to-face personal relationships, and therefore display distinctive political traits. It is also worth disaggregating further to compare regime features in small Caribbean, Pacific Island and other settings. Too indiscriminate a universe of reporting units would blot out these behaviorally critical determinants of regime dynamics. In addition, some major large-N studies have excluded such small states – perhaps on the plausible basis that few experts are available to provide the codings, and that it is uneconomic to conduct opinion surveys in such settings.
But if a database is constructed on the ‘one country, one unit’ principle, the omission of 20 per cent of all cases may distort the overall picture, especially if the omitted cases prove atypical.¹⁴ Alternatively, instead of using country weightings, political regimes could be measured and compared in terms of the number of
subjects governed under each dispensation. On that basis, for example, just six large federal democracies encompass about half the voters existing in the world’s electoral democracies (Behrend and Whitehead, 2016). India provides the pre-eminent weight in this set: a single state within India (Uttar Pradesh) has almost as many citizens as Brazil, the third most populous member of the set. Within these huge federal regimes one can find major variations in sub-national political conditions. Kashmir with more than 12 million inhabitants, for example, is partially occupied by Pakistan (which supports an endemic insurgency), and the Indian-governed core of the state is subject to repeated episodes of ‘presidential rule’ imposed from New Delhi. So the ‘regime’ in this state is very different from that of the overall nation, and its rhythms of ‘regime change’ are highly distinctive. Counting the whole of India as a single reporting unit in a regime change database would screen out such considerations.
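The contrast between counting each state as one unit and weighting by population can be illustrated with simple arithmetic. The sketch below is a hypothetical illustration in Python rather than a method used by any of the studies cited: the function names and the two regime scores are invented for the purpose, and only the two population figures (1.4 billion and 9,000) are taken from the discussion above.

```python
from typing import List, Tuple

# Each unit is (regime score, population). Scores are hypothetical values on an
# arbitrary 0-10 scale; populations are the two extremes quoted in the text.
Unit = Tuple[float, int]


def unweighted_mean(units: List[Unit]) -> float:
    """Average score when every state counts as one equal reporting unit."""
    return sum(score for score, _ in units) / len(units)


def population_weighted_mean(units: List[Unit]) -> float:
    """Average score when each state counts in proportion to its population."""
    total_population = sum(population for _, population in units)
    weighted_sum = sum(score * population for score, population in units)
    return weighted_sum / total_population


units = [
    (2.0, 1_400_000_000),  # very large state, hypothetical score
    (8.0, 9_000),          # very small state, hypothetical score
]

print(unweighted_mean(units))
print(population_weighted_mean(units))
```

With these assumed scores the unweighted average is 5.0, whereas the population-weighted average stays very close to 2.0, the score assigned to the larger state. Neither figure is offered as the correct one; the example only shows how strongly the choice of reporting unit can shape an aggregate picture.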
Multiple Drivers

This survey has emphasized the polysemy of the term ‘regime’; the many and varied possible forms of regime discontinuity; and the importance of a range of contextual conditions that set boundaries to the validity of comparisons and that generate diverse hybrid and borderline cases. In light of all these considerations, the inescapable inference is that no single factor can be expected to trigger all regime changes. To the contrary, researchers will have to consider a multiplicity of drivers, some of them associated with the observable diversity of pathways and of outcomes. Some basic divisions can be outlined here. One can distinguish between mainly external and predominantly internal precipitants of change (e.g. war and decolonization in the first case, versus succession crisis and fiscal disorder among the second set of causes). Internationally driven discontinuities can
confirm pre-existing foreign alignments (so that only the internal political order is changed), or they can overturn them. Then on the domestic side there are top-down versus bottom-up drivers of regime change. The impetus can originate in the capital city, but it can also flow in from the periphery. The transition can be negotiated (e.g. through an elite pact), or it can be imposed. Sometimes there is a clear and short interregnum (a step-change), but fitful, incremental and protracted processes are also possible – notably when bridging institutions are weak or absent, and when incoming rulers are inexperienced and divided. In addition to these actor-driven pathways to regime change, scholars must also assess longer term and more ‘structural’ factors, such as socio-economic trends, demographic and occupational shifts and even changes in underlying norms and values. The latter can be internally driven but also produced by external demonstration effects. Historical legacies can prove very long lasting, and may therefore shape or constrain the trajectories available in any given case. In summary, there is a large inventory of possible drivers of regime change, and comparative work needs to sift and contextualize these factors if it is to account for individual processes or for related clusters of them.
Conclusion

Let us return to the foundational assumptions of ‘regime change’ theory, and consider some reformulations in the light of recent experience and empirical study, taking into account the more complex ‘grid’ of trajectories outlined above. As a serious concept in comparative politics, ‘regime change’ has a venerable genealogy and a complex morphological structure. However, since the US invasion of Iraq this term has also entered into popular currency as shorthand for a unilateral and arrogant style of external interventionism, dressed up
in pretentious language but probably really serving concealed and unworthy objectives. In general usage it is associated with unintended and undesirable consequences (‘collateral damage’), so if it is to be restored to scholarly use it must first be detoxified. This chapter has attempted that task. In the process it has also uncovered a challenging degree of complexity. The result is not to render the notion unusable, but this exercise does highlight both the conceptual and the empirical difficulties that need to be faced and overcome in order for it to perform a constructive function in the analysis of post-Cold War comparative politics. (For earlier periods, when fewer instances were available for comparative inspection, the traditional usage was more adequate.) To determine which type of regime changed on what dimensions and as a consequence of which drivers requires careful conceptualization and refined disaggregation and contextualization of the cases. This does not eliminate the need for comparative work, but it does call for caution against crudely reductionist substitutes for the real thing. In the absence of comparative framing, individual experiences of political discontinuity may be almost impossible to interpret: it becomes unclear where to begin, or which lines of enquiry deserve priority. But pre-established explanatory models can be equally misleading, if they are used to override the diversity and nuances of the contemporary case experiences. In order to capture case-specific contextual realities, not one but a variety of comparative methods should be considered – exemplary models, paired comparisons, cluster analysis within precise scope conditions, as well as wide-focus search for recurrent regularities. The most appropriate mix of methods will depend upon the precise hypotheses selected for investigation. To conclude, good explanations of regime change need to consider the following general guidelines.
First, these are not reducible to single events but involve complex multifaceted processes. Second, they do not necessarily follow predetermined pathways, but often
involve erratic and contested trajectories, perhaps leading to uncertain outcomes. Third, such trajectories may be open-ended, but they are also heavily constrained by structural conditions, and by the limited range of coherent and stable outcomes that are possible. Fourth, ideas and imagination also matter. The social construction of a new regime is partly the product of dialogue and persuasion between actors experimenting with previously untested solutions and attempting to institutionalize collective values and aspirations (Owen, 2010).
Notes

1 Kinzer (2007) extends the historical record back to 1893, while also only considering Washington-directed examples.
2 Two of the key words in Guicciardini’s Dialogue on the Government of Florence are reggimento and mutazione. Is it anachronistic to equate these to regime and regime change? I rely on the Dialogue’s (1994) editor and translator, Alison Brown, for comfort on this score. Note that she also translates stato strictly as ‘regime’ in some contexts. The evidence on this takes the form of a dialogue in which friends exchange arguments on both sides of the issue, and although the author was well placed to present these issues as they might have been seen at that time, he was in fact writing 30 years later, with the benefit of the hindsight that neither side had prevailed, and in the knowledge that the stand-off between these regime alternatives was still generating instability and public danger.
3 My account of the early changes in usage and of the long-term reverberations of the ancien régime–revolution binary divide draws heavily from François Furet (1992).
4 A further factor to consider is the Florentine need for disguise and plausible denial, given the extreme sanctions risked by those too closely identified with the losing side in a still uncertain seesaw of violent conflict. By contrast, Alexis de Tocqueville knew he was an ‘out’ in the Second Empire, but that he was free to write under his own name, when he used his internal exile to pen The Old Regime and the French Revolution (Herr, 1992).
5 Giuseppe Tomasi di Lampedusa, Il Gattopardo (1958). The key sentence reads ‘Si vogliamo che tutto rimanga come prima, bisogna che tutto cambi’. Some attribute this idea to Machiavelli in chapter 25 of Discorsi, libro. It applies well to the
regime change in 1989 in Paraguay, when the longtime dictator Stroessner was ousted in what gets classified as a ‘transition to democracy’.
6 According to Anne Cheng it was the Zhou dynasty, in the 11th century BC, that introduced the notion of tianming to justify their overthrow of the preceding Shang dynasty. Thereafter, for three thousand years, this ‘Mandate of Heaven’ concept provided the basis for all Chinese political theory (Cheng, 1997: 56).
7 Were Fukuyama’s End of History ever to be realized, the topic of regime change would either have become redundant, or could only involve change away from liberal democracy.
8 The only case known to me is Nepal’s constitutional interlude of 1959–60, which was followed by four decades of royal absolutism.
9 On informality and judicial politics, see Dressel et al. (2018).
10 Recently, Behrend and Whitehead (2016) examined the variability of sub-national trajectories in large federal democracies. Although we did not exclude ‘sub-national authoritarian regimes’ for some of these cases, we were reluctant to adopt the term ‘regime’ as the norm on the grounds that this would tend to overstate the autonomy of the sub-unit compared to the federal regime within which it is embedded. So, for comparative purposes, we preferred to analyze ‘illiberal structures and practices’ at the sub-national level.
11 This paragraph was prompted by a verbal communication with Michael Freeden.
12 His article is also helpful in providing an appendix where all the coding decisions can be inspected. It is noteworthy that he does not count the Mexican Revolution of 1910 or the Cuban Revolution of 1958 as regime changes (see Teorell, 2010).
13 Although large-N studies can, in principle, always add a dummy variable to test for any missing factor, good reality checks require a good grasp of the relevant realities, whereas in practice most such exercises rely on excessively crude and unrealistically standardized indicators.
14 Another source of distortion would be the omission of territories that were non-members of the United Nations. Taiwan, Hong Kong, Puerto Rico and Occupied Palestine are significant political regimes with distinctive trajectories that do not get included in most datasets.
References

Behrend, Jacqueline and Whitehead, Laurence, eds (2016). Illiberal Practices: Territorial Variance within Large Federal Democracies. Baltimore, MD: Johns Hopkins University Press.
Cheng, Anne (1997). Histoire de la pensée chinoise. Paris: Editions du Seuil.
Corbett, Jack and Veenendaal, Wouter (2018). Democracy in Small States: Persisting against All Odds. Oxford: Oxford University Press.
Di Lampedusa, Giuseppe Tomasi (1958). Il Gattopardo. Milan: Feltrinelli.
Dressel, Björn, Sánchez-Urribarri, Raul and Stroh, Alexander, eds (2018). ‘Information Networks and Judicial Institutions: Comparative Perspectives,’ International Political Science Review, 39(5), November: 573–84.
Ertman, Thomas (2010). ‘The Great Reform Act of 1832 and British Democratization,’ Comparative Political Studies, 43(8–9), August–September: 1000–22.
Fishman, Robert (1990). ‘Rethinking State and Regime: Southern Europe’s Transition to Democracy,’ World Politics, 42(3): 422–40.
Freeden, Michael (2013). The Political Theory of Political Thinking: The Anatomy of a Practice. Oxford: Oxford University Press.
Fukuyama, Francis (1992). The End of History and the Last Man. New York: Penguin.
Furet, François (1992). Dictionnaire critique de la Révolution française. Paris: Flammarion.
Gasiorowski, Mark J. (1996). ‘An Overview of the Political Regime Change Dataset,’ Comparative Political Studies, 29(4), August: 469–83.
Geddes, Barbara, Frantz, Erica and Wright, Joseph (2014). ‘Military Rule,’ Annual Review of Political Science, 17, May: 147–62.
Guicciardini, Francesco (1994). Guicciardini: Dialogue on the Government of Florence, edited and translated by Alison Brown. Cambridge: Cambridge University Press.
Herr, Richard (1992). Tocqueville and the Old Regime. Princeton: Princeton University Press.
Kinzer, Stephen (2007). Overthrow: America’s Century of Regime Change from Hawaii to Iraq. New York: Times Books.
Krawatzek, Félix (2018). Youth in Regime Crisis: Comparative Perspectives from Russia to Weimar Germany. Oxford: Oxford University Press.
Lawson, Stephanie (1993). ‘Conceptual Issues in the Comparative Study of Regime Change and Democratization,’ Comparative Politics, 25(2), January: 183–205.
Levitsky, Steven and Murillo, Victoria (2009). ‘Variation in Institutional Strength,’ Annual Review of Political Science, 12, June: 115–33.
Linz, Juan (1975). ‘Totalitarian and Authoritarian Regimes,’ in Handbook of Political Science, Volume III, edited by Fred I. Greenstein and Nelson W. Polsby. Reading, MA: Addison-Wesley, pp. 175–328.
Linz, Juan and Stepan, Alfred (1978). The Breakdown of Democratic Regimes. Baltimore, MD: Johns Hopkins University Press.
Mainwaring, Scott, Brinks, Daniel and Pérez-Liñán, Aníbal (2007). ‘Classifying Political Regimes in Latin America, 1945–2004,’ in Regimes and Democracy in Latin America, edited by Gerardo L. Munck. Oxford: Oxford University Press, pp. 123–60.
O’Donnell, Guillermo (1973). Modernization and Bureaucratic-Authoritarianism. Berkeley, CA: University of California Institute of International Studies.
O’Donnell, Guillermo, Schmitter, Philippe and Whitehead, Laurence, eds (1986). Transitions from Authoritarian Rule: Prospects for Democracy. Volumes I–IV. Baltimore, MD: Johns Hopkins University Press.
O’Rourke, Lindsey A. (2018). Covert Regime Change: America’s Secret Cold War. Ithaca, NY: Cornell University Press.
Owen, John M. (2010). The Clash of Ideas in World Politics: Transnational Networks, States, and Regime Change. Princeton: Princeton University Press.
Piccone, Theodore J., ed. (2004). Regime Change by the Book. Washington, DC: Democracy Coalition Project.
Reiter, Daniel (2017). ‘Foreign-Imposed Regime Change,’ in Oxford Research Encyclopedia of Politics. Oxford: Oxford University Press (online), March.
Rustow, Dankwart (1970). ‘Transitions to Democracy: Toward a Dynamic Model,’ Comparative Politics, 2(3), April: 337–63.
Slater, Dan (2013). ‘Democratic Careening,’ World Politics, 65(4): 729–63.
Krasner, Stephen D. (1983). International Regimes. Ithaca, NY: Cornell University Press.
Teorell, Jan (2010). Determinants of Democratization: Explaining Regime Change in the World, 1972–2006. Cambridge: Cambridge University Press.
Turan, Ilter (2015). Turkey’s Difficult Journey to Democracy: Two Steps Forward, One Step Back. Oxford: Oxford University Press.
Whitehead, Laurence (2018). ‘Temporal Models of Political Development: In General and of Democratization in Particular,’ in Ursula Van Beek (ed.) Democracy under Threat: A Crisis of Legitimacy? New York: Springer, pp. 23–44.
53 Religion and Politics Jeffrey Haynes
Introductory historical remarks

Around the world, religions have left their assigned place in the private sphere, becoming politically active in various ways and with assorted outcomes. Religion’s re-emergence from political marginality dates from the last decades of the 20th century. Subsequently, religious actors of various kinds became involved in various political changes, including democratisation, terrorism and demands for societal changes, such as improved human rights. Prior to the 18th century and the subsequent formation and development of the modern (secular) international state system, religion was a key ideology that frequently stimulated political conflict between societal groups. Following the Peace of Westphalia in 1648 and the subsequent development of centralised states first in Western Europe and then, via European colonisation, to most of the rest of the world, the political importance
of religion significantly declined both domestically and internationally. In the late 20th century, there was a resurgence of – often politicised forms of – religion. This trend was especially noticeable in the post-Cold War era (that is, from the late 1980s), among the so-called world religions (Buddhism, Christianity, Confucianism, Hinduism, Islam and Judaism). In terms of important events in this context, many observers point to the Iranian revolution of 1978–9, as it marked the ‘reappearance’ of religion (in this case, Shia Islam) as a significant political actor in Iran – a country that, like neighbouring Turkey decades before, had adopted a Western-derived, secular development model. The past few decades have seen increased political involvement of religious actors within many countries, as well as internationally. Much attention is often focused upon so-called Islamic fundamentalists or Islamists,¹ particularly in the Middle East and North Africa, to the extent that a casual observer
might assume that the entire region is polarised religiously and politically between, on the one hand, Islamic fundamentalists and secularists, and, on the other, Jews and (Palestinian) Muslims, both of whom claim to be the rightful controllers of Jerusalem, a holy city for both. However, it is not only Islamists who pursue political goals related to religion. In officially secular India, militant Hinduism has been a feature of politics since the early 1990s. In Israel, Jewish religious parties serve in successive governments. The Roman Catholic Church was a leading player in post-Cold War democratisation in various parts of the world, including eastern Europe, sub-Saharan Africa and Latin America. In sum, there are numerous examples of recent religious involvement in politics in various parts of the world, in both domestic and international contexts. This is a relatively recent turn of events, having largely taken place since the Cold War ended in the late 1980s. Today, it is difficult to find any countries where the relationship between religion and politics is not a controversial and often fraught issue.
Basic theories and concepts

The nature of the relationship between religion and politics is controversial. Although scholars disagree about their nature and scope, there is widespread concern in many countries regarding the role of religious actors in: (1) helping underpin or support authoritarian regimes; (2) inter-communal clashes; and (3) transnational extremist networks. In Europe, for example, such phenomena today represent a dual challenge: first, religious communities must effectively integrate into democratic institutions; second, policy-makers must work out and implement new policies and forms of cooperation to cope with previously unexpected threats and issues, some of which come from religious extremist actors.
Debates about the current political importance of religion also include a focus upon various issues that can be grouped together under the rubric ‘Religion, Security and Development’. What unites them is a common concern with the impact of religion on conflict and development issues and outcomes. Among them can be noted Samuel Huntington’s (1996) controversial thesis about ‘clashing civilisations’, with religion and culture key factors, while others stress the potential of religion to help resolve political conflicts and be a major component of peacebuilding. Scholars also focus upon the influence of religion on various manifestations of terrorism and, more generally, the post-9/11 ‘Global War on Terror’, as well as the significance of religion in relation to the developmental position of females. In sum, a variety of religious actors and factors are now involved in various political issues and controversies. Examining the relationship between religion and politics in the contemporary world, we can note that, apparently irrespective of which religious tradition we are concerned with, many religious ideas, experiences and practices are all significantly affected by the impact of globalisation on both politics and international relations. The impact of globalisation is encouraging many religions to adopt new or renewed agendas in relation to a variety of religious, social, political and economic concerns. It is also stimulating many religious individuals, organisations and movements not only to look at local and national issues and contexts but also to focus on regional and international environments. In many cases, such concerns focus on the relationship of religion to conflict and conflict resolution.
Conflict and Conflict Resolution

A second issue is religion’s global role in conflict and conflict resolution. A starting point for this analysis is to note that
globalisation both highlights and encourages religious pluralism. But religious responses may well be different. This is because some religions, including Judaism, Christianity and Islam (sometimes known as the ‘religions of the book’, because in each case their authority emanates principally from sacred texts – actually, similar texts) claim what Kurtz (1995: 238) calls ‘exclusive accounts of the nature of reality’, that is, only their religious beliefs are judged to be true by adherents. As globalisation results in increased interaction between people and communities, the implication is not only that there are increasing encounters between different religious traditions but also that there are variable outcomes: some are harmonious, others are not. Sometimes, the result is what Kurtz (1995: 168) has called ‘culture wars’. Culture wars occur when religious worldviews encourage differing allegiances and standards in relation to various areas, including the family, law, education and politics. Resulting conflicts between people, ethnic groups, classes or nations can be framed in religious terms. Such religious conflicts seem often to ‘take on “larger-than-life” proportions as the struggle of good against evil’ (Kurtz, 1995: 170). This may be noted in relation to certain religious minorities who may regard their own existential position – for example, Muslim minority communities in Thailand, the UK, France, the Philippines and India – as being unacceptably weakened because of actual or perceived pressure from majority religious communities – Buddhists in Thailand; Christians in Britain, France and the Philippines; and Hindus in India – to conform to the norms and values of the religious and cultural majority. There are many examples of religious involvement in recent and current national and international conflicts. 
For example, stability and prosperity in the Middle East is a pivotal goal, central to achieving general peace and the elimination of poverty in the region. Yet the Middle East is particularly emblematic in relation to religion – in part because the region was the birthplace
of the world’s three great monotheistic religions (Christianity, Islam and Judaism). This brings with it a legacy not only of shared wisdom, but also of conflict – a complex relationship that has impacted in recent years on countries as far away as Thailand, the Philippines, Indonesia, the United States and Britain. Key to peace in the region may well be achievement of significant collaborative efforts among different religious bodies, which, along with external religious and secular organisations – for example, from Europe and the United States – may through collaborative efforts work towards developing a new model of peace and cooperation to enable the Middle East to escape from what many see as an endless cycle of religiously based conflict. Overall, this emphasises that religion may be intimately connected – and not only in the Middle East – both to international conflicts and their prolongation and to attempts at reconciliation of such conflicts. In other words, in relation to many international conflicts, religion can play a significant, even a fundamental role, contributing to conflicts in various ways, including how they are intensified, channelled or reconciled. In addition, religion may have a key part to play in resolution of conflicts, including in South Asia (notably India–Pakistan), Israel–Palestine and sub-Saharan Africa (for example, in relation to Sudan’s long-running civil war). It is important not to over-stress religion’s involvement in and propensity to conflict. To do so would mean that we would be likely to overlook the many recent and current examples of religious involvement in attempts at conflict resolution. On the other hand, the fact remains that many current international conflicts have religious aspects that can exacerbate both hatred and violence, and make the conflicts themselves exceptionally difficult to resolve. 
Hans Küng, a Swiss Catholic priest and theologian, claims that the most fanatical, the cruelest political struggles are those that have been colored, inspired, and legitimized by religion. To say this is not to reduce all political conflicts to religious ones, but to take
seriously the fact that religions share in the responsibility for bringing peace to our torn and warring world. (Hans Küng, quoted in Smock, 2004)
Such concerns are echoed in Huntington’s (1993, 1996) thesis of a ‘clash of civilisations’ – a controversial topic, especially since 9/11. Huntington argues that there is a serious ‘civilisational’ threat to global order that became apparent after the Cold War. It is rooted in the idea that there are competing ‘civilisations’ that engage in conflict that affects outcomes in international relations in various ways. On the one hand, there is the ‘West’ (especially North America and Western Europe), with values and political cultures deemed to be rooted in liberal democratic and Judeo-Christian concepts, understood to lead to an emphasis on tolerance, moderation and societal consensus. On the other hand, there is supposedly a bloc of allegedly ‘anti-democratic’ Muslim countries, believed to be on a collision course with the West. A key problem with Huntington’s thesis, however, is that there are actually no ‘civilisations’ that act politically or in international relations in uniform and single-minded ways. Instead, wherever we look – for example, the United States, Europe, Israel, the Muslim countries of the Middle East – what is most notable is the plurality of beliefs and norms of behaviour that are apparent even in allegedly cohesive and uniform civilisations. It is useful to bear these concerns in mind when thinking about the role of religion in relation to conflict in both domestic and international contexts. It is important not to overestimate religion’s potential for and involvement in large-scale violence and conflict – especially if that implies ignoring or underestimating its involvement and potential as a significant source of conflict resolution and peacebuilding. It is also important to recognise that, especially in recent years, numerous religious individuals, movements and organisations have been actively involved in attempts to end conflicts and to foster post-conflict
reconciliation between formerly warring parties (Bouta et al., 2005). This emphasises that various religions often play key roles in international relations and diplomacy, helping to resolve conflicts and in some cases build peace. The ‘clash of civilisations’ thesis oversimplifies causal interconnections between religion and conflict, in particular by disregarding important alternative variables, including the numerous attempts from a variety of religious traditions to help resolve conflicts and build peace. When successful, religion’s role in helping resolve conflicts is a crucial component in wider issues of human development. As Ellis and ter Haar (2004) note: ‘Peace is a precondition for human development. Religious ideas of various provenance – indigenous religions as well as world religions – play an important role in legitimising or discouraging violence’ (my emphasis).
Global/regional differentiation: Religion, politics and democracy

The question of how religious actors² might affect political change, including through issues of democratisation and authoritarianism, is a topical and controversial issue. Scholars of comparative politics stress the importance of political culture in explaining the success or failure of democratisation in various countries influenced by US policy after World War II, including West Germany, Italy and Japan (Linz and Stepan, 1996; Stepan, 2000; Huntington, 1991). Religious traditions – for example, Roman Catholicism in Italy and Christian democracy in West Germany – were said to be important in the (re)making of these countries’ political cultures following the traumatising effects of totalitarian regimes in the 1930s and early 1940s (Casanova, 1994; Madeley, 2009). In addition, during the ‘third wave of democracy’ (mid-1970s–early 2000s), much attention was paid to the role of religious actors in
political changes (Huntington, 1991). For example, in Poland, the Roman Catholic Church, in tandem with a Polish pope, John Paul II, played a key role in undermining the existing communist government, helping establish a post-communist, democratically accountable regime (Weigel, 2005, 2007). In addition, the democratic impact of the Roman Catholic Church had a wider political effect beyond Poland, extending to Latin America, sub-Saharan Africa and parts of Asia from the 1980s. Contemporaneously, there was the rise of the Religious/Christian Right in the United States and, since then, its considerable and continuing impact on the electoral fortunes of both the Republican and Democratic parties. Add to this the emergence, growth and spread of various kinds of Islamist movements across the Muslim world, from Morocco to Indonesia, which has had significant ramifications for electoral outcomes in various countries, including Algeria, Egypt and Morocco; electoral successes for the ‘Hindu fundamentalist’ Bharatiya Janata Party in India; and substantial political influence over time for various Jewish political parties in Israel, and we have clear and sustained evidence of religion’s political importance in relation to political changes, including democracy. At the same time, it is important not to overemphasise religious actors’ importance in these political events and developments. Focusing on the East European democratising experience in the 1980s and 1990s, Juan Linz and Alfred Stepan argue that religion was not generally the – or even a – key explanatory factor in democratisation outcomes (Linz and Stepan, 1996). In relation to Muslim countries in the Middle East, Fred Halliday argued that apparent barriers to democracy were primarily linked to extant social and political, not religious, features (Halliday, 2005).
These include in many cases long histories of authoritarian rule and weak civil societies and, although some or all of those features might be legitimised by the state in terms of ‘Islamic doctrine’, there is in fact nothing specifically
‘Islamic’ about them. On the other hand, for Samuel Huntington, religions have a crucial impact on democratisation (Huntington, 1996). He claims that Christianity, in both Protestant and Catholic forms, has a strong propensity to be supportive of democracy while other religions, such as Islam, Buddhism and Confucianism, do not. To understand the overall political importance of religious actors, and by extension how they involve themselves in political changes, including democratisation, it is necessary first to comprehend what they say and do in their relationship with the state. I mean something more than ‘mere’ government when referring to the state. The state is the continuous administrative, legal, bureaucratic and coercive system that attempts not only to manage the various state apparatuses, but in addition to ‘structure relations between civil and public power and to structure many crucial relationships within civil and political society’ (Stepan, 1988). As a result, almost everywhere in the world, apparently regardless of the nature of political systems and/or the level of economic development in a country, states have over time sought to reduce and control religion’s political importance and involvement. That is, around the world states have sought to privatise religion, and thus considerably to reduce its political impact. Sometimes, for example in Poland and Italy (Roman Catholicism) and Turkey (Sunni Islam), states have attempted to erect a ‘civil religion’ arrangement, whereby a designated religious format effectively ‘functions as the cult of the political community’ (Casanova, 1994: 58). The declared purpose is to try to create and develop forms of consensual, corporate religion, claiming to be guided by general, culturally appropriate, specific religious beliefs of intrinsic societal significance. 
In short, when states seek to develop ‘civil religions’ the strategy is intended to avoid social conflicts and promote national coordination and cohesion. Yet, religious actors’ relationships with the state are by no means limited to attempts by the
latter to build civil religions. In fact, in many countries, relations between religious entities and the state are not only now more visible, but also increasingly problematic. Why is this the case? First, recent increases in religious challenges to the authority of the state may merely be transitory reactions in the context of the onward march of secularisation. Second, even if the modern state is particularly vulnerable to legitimation crises, it does not necessarily mean that religion is again becoming automatically relevant to state functioning and policy making. Third, religion-based challenges to state hegemony have roots in endeavours by the latter to assert a monitoring role vis-à-vis religion, in effect to control it. We can see such a development at three levels: political society, civil society and the level of the state itself. In many countries religion is being liberated from providing sometimes slavish legitimacy to secular authority. Many religious actors are now willing routinely to criticise and challenge the state in various ways in relation to a variety of issues and themes. Yet, even if heightened concern about the state’s policies can be held up as evidence of the regeneration of the socio-political power of religion, we still need to ask further questions. The issues are themselves secular and in so far as religious agencies are active in these areas, this is a radical shift of concern from the supernatural, from devotional acts, to what are largely secular goals pursued by secular means. However, a note of caution is in order: we need to bear in mind that when religious interests act as ‘pressure groups’ – rather than as ‘prayer bodies’ – they are not necessarily going to be effective in what they seek to achieve. This is because the more secular society is, the less likely it is that religious actors will play politically significant roles (Wilson, 1992).
Religion and Political Society

At the level of political society – that is, the arena in which the polity specifically arranges
itself for political contestation to gain control over public power and the state apparatus – we can note a range of religious responses that are in part dependent upon the degree of secularisation. These include (1) resistance to disestablishment and to the differentiation of the religious from the secular sphere – the goal of many so-called religious ‘fundamentalist’ groups; (2) religious groups and confessional political parties’ mobilisations and counter-mobilisations against other religions or secular movements and parties; and (3) religious organisations’ mobilisation in defence of religious, social and political freedoms – that is, demanding the rule of law and the legal protection of human and civil rights, protecting mobilisation of civil society and/or defending institutionalisation of democratically elected governments. In pursuit of such goals in recent times, we can note Roman Catholic transnational political mobilisation in and between various countries, as well as activities of Islamist groups in various countries, including Egypt, Syria and Tunisia.
Religion and Civil Society

Civil society is the arena in which various social movements – including neighbourhood associations, women’s groups, religious entities, and intellectual currents – join with civic organisations, including lawyers’, journalists’, trade unions’ and entrepreneurs’ associations, to constitute themselves into an ensemble of arrangements to express themselves and seek to advance their interests. Sometimes, the concept of civil society is used in contrast to political society. Unlike the latter, civil society refers to organisations and movements – not political parties – formally uninvolved in both the business of government and overt political management. Note, however, that this does not necessarily prevent civil society organisations from sometimes seeking to or actually exerting political influence on various matters,
including democratic outcomes and the content of national constitutions. Regarding religion at the level of civil society, one can distinguish between hegemonic civil religions – such as evangelical Protestantism in 19th-century America – and the recent public intervention of religious entities, concerned either with single issues such as anti-abortion or with morally determined views of wider societal development, for example, in relation to homosexual rights or appropriate days for shops to open, currently a highly controversial issue in Israel. In trying to influence public policy, without themselves seeking to become political officeholders, religious actors may employ a variety of tactics, including, in no particular order: (1) lobbying the executive apparatus of the state; (2) going to court; (3) building links with political parties; (4) forming alliances with like-minded groups, both secular and from other religious traditions; (5) mobilising followers to lobby and/or protest; and (6) working to sensitise public opinion via mass media. The overall point is that religious actors may use a variety of methods to try to achieve their objectives.
Religion and the State

Interactions between the state and religious entities are often referred to as ‘church–state relations’. Yet, one of the difficulties in seeking to survey the nature of contemporary ‘church–state’ relations in many countries around the world is that the very concept of church is a somewhat parochial, Anglo-American standpoint with direct relevance only to explicitly Christian traditions. It is derived primarily from the context of British establishmentarianism – that is, maintenance of the principle of ‘establishment’ whereby one church is legally recognised as the only established church. In other words, when we think of ‘church–state’ relations we may assume a single relationship between two clearly distinct, unitary and solidly but
separately institutionalised entities. In this implicit model built into the conceptualisation of the religion–political nexus there is but one state and one church; both entities’ jurisdictional boundaries need to be carefully delineated. Both separation and pluralism must be safeguarded, because it is assumed that the leading church – like the state – will seek institutionalised dominance over rival religious organisations. For its part, the state is expected to respect individual rights even though it is assumed to be inherently disposed towards aggrandisement at the expense of citizens’ personal liberty. In sum, the conventional concept of state–church relations is rooted in prevailing Christian conceptions of the power of the state of necessity being constrained by forces in society – including those of religion. Expanding the problem of church–state relations to non-Christian contexts necessitates some preliminary conceptual clarifications – not least because the very idea of a prevailing state–church dichotomy is culture-bound. As already noted, church is a Christian institution, while the modern understanding of state is deeply rooted in the post-Reformation European political experience. In their specific cultural setting and social significance, the tension and the debate over the church–state relationship are uniquely Western phenomena, present in the ambivalent dialectic of ‘render therefore unto Caesar the things which be Caesar’s and unto God the things which be God’s’ (Matthew 22:21). Overloaded with Western cultural history, these two concepts cannot easily be translated into non-Christian terminologies. The differences between Christian conceptions of state and church and those of other world religions are well illustrated by reference to Islam. In the Muslim tradition, mosque is not church. The closest Islamic approximation to ‘state’ – dawla – means, as a concept, either a ruler’s dynasty or his administration.
Only with the specific Durkheimian stipulation of church as the generic concept for moral community, priest
for the custodians of the sacred law and state for political community can we comfortably use these concepts in Islamic and other non-Christian contexts. On the theological level, the command–obedience nexus that constitutes the Islamic definition of authority is not demarcated by conceptual categories of religion and politics. Life as a physical reality is an expression of divine will and authority (qudrah). There is no validity in separating the matters of piety from those of the polity; both are divinely ordained. Yet, although both religious and political authorities are legitimated Islamically, they invariably constitute two independent social institutions. They do, however, regularly interact with each other. Yet, as recent political conflicts have shown in relation to, inter alia, Egypt and Turkey, there are on occasion serious tensions between Islamist actors of various stripes and the state in regard to democratisation and political outcomes more generally. The overall point is that tensions widely exist between secular power holders and religious actors of various kinds in the modern world. It is often the case in some European countries, for example, that religious actors, apparently regardless of their religious persuasion, may work individually or collectively towards reducing the ability of the state to sideline them. We can see this in relation to France, where recent years have seen a campaign by some Muslim women to wear Islamic headscarves, despite the ban since 2004 on such attire. While these women regard it as their fundamental human right to be allowed to dress as they wish, the French state sees things differently – Muslim women’s efforts to dress as they wish are regarded as a direct contravention of a core French post-revolutionary principle: subjugation of religion by the state.
In effect, such religious challenges reflect a wider development: a wish on the part of some religious actors to reverse religious privatisation, a course of action which impacts on a variety of political and social concerns.
Ongoing debates and critical assessments

Religion’s political and social impact is universal. While impact varies from country to country and in different international contexts, it is clear that overall religion is much more socio-politically significant today compared to half a century ago. How and why is religion now so ‘significant’? It is largely because religion can encourage, or help resolve, often interlinked political, social, economic and developmental conflicts or disputes. This is because religion has important functions serving to engender and/or significantly influence individual and group values which, in turn, impact upon common existential issues, which affect people everywhere regardless of ethnicity or culture. Ongoing debates and critical assessments of the role of religion in politics, in both domestic and international contexts, typically focus on two overlapping, but conceptually distinct, issues: security and governance. Significant issues include:

• Politically assertive religion will continue to affect governance and security outcomes in many countries around the world, as well as internationally;
• Globalisation and associated technology, including satellite television channels and social media, will play a considerable role in spreading sectarian and inter-faith mistrust;
• Factional divisions within religious traditions will exacerbate tensions;
• High levels of economic and developmental inequality, derived from religion, ethnicity and/or class, will be prominent sources of tension;
• Sectarian and other inter-religious tensions reflecting often longstanding socio-economic disparities will escalate when governments do not address fundamental issues of socio-economic adversity;
• Sectarian conflicts which deepen pre-existing religious divides will in some cases escalate into serious domestic and/or international conflict. When this happens, they will deleteriously impact upon both governance and political and social stability and, in some cases, encourage anti-US/Western extremism and terrorism.
Focusing on the next 20 years, we can identify and briefly examine emerging trends in the relationship between religion and politics both generally and in terms of the relations between the West and the non-West. Centrally informed by the current importance of religion’s influence – affecting individual identity, society and governance – we note that for billions of people religion is highly likely to remain a very significant issue, in both developing and developed countries. But religion does not act in isolation, and in recent years two key developments have led to increased religious responses in many parts of the world. On the one hand, the recent expansion of representative governments to all global regions, with the important exception of the Middle East and North Africa (MENA), has provided new political and social space for religion to be assertive. On the other hand, however, because religion is so fundamental to many people’s identity, opening political and social space in some cases encourages existing or new tensions to surface, leading to inter-group conflict, which can sometimes develop into war. Today, two of the world religions – Christianity and Islam – are showing particularly strong growth. Christianity, especially evangelical Christianity, is currently growing annually by 1.47%, implying 30% growth by 2035. Christianity’s current growth is particularly swift in South and East Asia and sub-Saharan Africa (Martel, 2013). The growth of Islam between 2010 and 2020 is estimated at 1.7% a year, linked mainly to ‘high’ birthrates among existing Muslims in Asia, MENA and Europe (ibid.). A 2010 report by the Pew Research Center’s Forum on Religion & Public Life estimated that, on present trends, the global Muslim population will grow by about 35% by 2030, rising from 1.6 to 2.2 billion (Pew Research Center, 2011).
Given that some countries – such as Nigeria, Kenya, Tanzania and Russia – are already experiencing growing tensions between followers of Christianity and Islam, then it may be that swift expansion of both
Islam and Christianity over the next few decades will exacerbate such tensions, with implications for global security and governance (Pew Research Center, 2010).
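The growth figures quoted above can be checked with simple compound-growth arithmetic. The sketch below is illustrative only: it assumes constant annual rates over the whole period, and the function names are hypothetical rather than drawn from the cited sources.

```python
import math


def cumulative_growth(annual_rate: float, years: float) -> float:
    """Total proportional growth after compounding annual_rate for the given years."""
    return (1 + annual_rate) ** years - 1


def years_to_reach(annual_rate: float, total_growth: float) -> float:
    """Years of compounding at annual_rate needed to reach total_growth."""
    return math.log(1 + total_growth) / math.log(1 + annual_rate)


# Christianity: 1.47% a year, 30% cumulative growth (figures from the text).
christian_years = years_to_reach(0.0147, 0.30)

# Islam: 1.7% a year over the decade 2010-2020 (figures from the text).
muslim_decade = cumulative_growth(0.017, 10)

# Islam: growth implied by the rounded endpoints of 1.6 and 2.2 billion.
muslim_total = 2.2 / 1.6 - 1

print(f"Years for 30% growth at 1.47% a year: {christian_years:.1f}")
print(f"Cumulative growth at 1.7% a year over 10 years: {muslim_decade:.1%}")
print(f"Growth implied by 1.6 to 2.2 billion: {muslim_total:.1%}")
```

On these assumptions, 30% growth at 1.47% a year takes roughly 18 years, which is consistent with a 2035 horizon. The rounded population endpoints imply growth of 37.5%, slightly above the ‘about 35%’ quoted; the gap presumably reflects rounding in the population figures.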
Politics, Religion and Security

Following the end of the Cold War, both Christianity and Islam showed increasing political assertiveness, a development which seems likely to continue (Haynes, 2016). While such political assertiveness has been manifested in many countries domestically, it has also shown itself internationally and transnationally. Central to this development is the phenomenon of globalisation and associated developments in communications technology. The latter permits religious entities’ messages to unite or divide real or imagined communities, physically separated by international borders and thousands of kilometres. In particular, it enables diaspora populations to feel a closeness otherwise denied them and appeals to a far wider audience than previously possible. Technology is also likely to contribute to diaspora communities being increasingly affected by intra-faith discord in countries of origin, such as Pakistan and India. In addition, some governments may have to address new challenges from religious groups at home. For example, it is posited that over the next two decades, China will be home to some of the world’s largest Muslim and Christian populations. China’s internal politics and its global attitude and focus are likely to be influenced significantly by the manner in which these two faith groups pursue their goals and seek enhanced religious freedoms. A wider point is that as religion is so fundamental to many people’s identity, where tensions between different groups exist or develop, they are likely to be exacerbated by religious differences. Some of these will impact upon US and Western security, as detailed below. Post-Cold War globalisation led to dramatic, continuing increases in interactions
between people and communities, which were no longer dependent on geographical closeness to easily enable such connections. Globalisation encourages religions to adopt new, revised or reformed social, moral and/or political agendas. It stimulates many religious individuals, organisations and movements not only to look at local and national issues and contexts but also to focus on regional and international environments, which often link into or exacerbate pre-existing negative perceptions of foreign – including US and Western – cultural, political and economic hegemonies. Moreover, encounters between different religious traditions, both within faiths and between them, are increasingly common and not always harmonious. Sometimes the result can be extreme hostility, captured in the term ‘culture war’. Continuing culture wars, for example in Israel and the United States, occur in relation to serious differences between secular and religious groups regarding the appropriate positions of religious and secular norms, values and behaviour. Culture war occurs when differing religious worldviews encourage different allegiances and standards in relation to various areas, including the family, law, education and politics. As a result, conflicts involving, inter alia, gender, ethnicity, class and nations can be framed in religious terms and can impact on security, sometimes dramatically, both within countries and internationally.
This is also the case with some religious minorities who may regard their own existential position – for example, Muslim minority communities in Thailand, the UK, France, the Philippines and India, and Christian minorities in many countries in the MENA – to be unacceptably weakened because of actual or perceived pressure from majority religious communities, such as Buddhists in Thailand, Christians in the UK, France and the Philippines, and Hindus in India, which encourage religious minorities to conform to the hegemonic norms and values of the religious and cultural majority.
This issue has recently affected a region long thought to be immune to the public impact of religion and culture: Western Europe. There, governments long ago chose the path of secularisation, with a linked ‘downgrading’ of religion from public realm to privatised belief. Today, however, many urban areas across Western Europe contain areas of pronounced social deprivation, often home to many migrants. Recent extensive immigration to Western Europe, coupled with enhanced mobility of people within the countries of the region due to the expansion of the European Union, has led to increasingly multicultural societies, albeit often within a wider trend towards secularism. Yet, local communities with strong religious beliefs continue to exist and, due to natural expansion, are growing in size. Recent political developments, such as the rise of UKIP (the UK Independence Party), have highlighted that many, perhaps most, Western Europeans tolerate, rather than embrace and welcome, integration and immigration into their countries and the region more generally. This is particularly apparent in times of economic stress – for example, since 2008 – when it appears that many Western Europeans revert to older forms of identity, including reference to cultural models of Christianity, feeling that it exemplifies and underlines two key components of (Western) European culture: the liberal and individualistic values underpinning modern (Western) European culture. For some, this sets Western European culture apart from what are seen as the different – that is, less liberal, more conservative – values and norms of Europe’s Muslim immigrants, who hail originally from the MENA, South Asia and sub-Saharan Africa. To date, this issue has not emerged as a clear security threat to American or more generally Western interests.
Future projections are that the population growth of non-Muslims in Europe is slow, while the Islamic population of Europe is expected to continue to grow, exceeding 58 million by 2030 (that is, approximately 8% of the total
population), but with Christian traditions likely to remain dominant (Pew Research Center, 2011). Reflecting the impact of globalisation and communications technology, diaspora Muslim communities in Western Europe are likely to be increasingly affected by intra-faith and intra-Islamic discord focused in the countries of the MENA. In particular, tension between Sunni and Shia Islam could spread. For example, in 2012, Belgium’s largest Shiite mosque was fire-bombed by hard-line Sunnis. However, while there may be an increase in such incidents, particularly in response to events in the MENA, it is unlikely that large-scale violence between the different sects will occur in Europe itself, not least because of the small number of Shias. In the MENA itself, the past few decades have seen the region emerge as a global focal point of increased political involvement of religious actors both within countries and internationally. On the one hand, religious minorities across the region are being squeezed and their security compromised. While ‘Islamic fundamentalism’ or ‘Islamism’ attracts much attention in this context, we can also observe serious sectarian divisions and linked conflicts across much of the MENA, including in Syria, Iraq and Yemen. The situation was exacerbated by the 2011 Arab Spring and its aftermath, where state weakness or breakdown combined with the impact of politically assertive religious actors saw increasing pressure on religious minorities to convert to the dominant religious tradition or, failing that, to flee for their lives.³ Religious actors like Islamic State thrive on sectarian division.
Given the widespread diminution of state capacity in the MENA following the Arab Spring and the linked expansion of aggressive Sunni entities, such as Islamic State, it seems highly likely that the short and medium term will feature many sectarian conflicts in the MENA, which will cause significant friction and, in some cases, result in out-and-out conflict between warring sectarian groups. Tensions between Shiite
Iran and the Sunni Gulf Cooperation Council (GCC) are likely to remain high in the next few years – not least because each is seen to support one sect of Islam only. However, not all Shia movements will necessarily be pro-Iranian and not every Salafi or Wahhabist Sunni movement kowtows to Saudi Arabia. Indeed, there are significant Shiite minorities in GCC countries, as well as a growing (Sunni) Salafi movement in Iran. Sectarian tensions also reflect socio-economic disparities and are likely to escalate if governments do not address these fundamental issues. For example, Bahrain and Saudi Arabia, where economic inequality between Sunni and Shia is greatest, are more likely to see tensions rise than other countries in the region. Globalisation, represented by influential satellite television channels and social media, will play a growing, perhaps pivotal, role in spreading anti-government rhetoric and sectarian mistrust. In addition, over the next few years, we are likely to see growing tensions within Sunni and Shiite communities. Sunni Islam is particularly likely to become increasingly factionalised. As Salafist groups grow in prominence around the world, a backlash may emerge from moderate Sunnis. Correspondingly, Shiite Islam contains a number of internal divisions. The countries in the MENA that have suffered most from decades of systematic political, sectarian and racial repression and mass killings – that is, Iraq and Syria – made possible the foundation, emergence and development of Islamic State. What makes these countries’ situation more dire is the world’s failure to condemn this oppression, turning a blind eye to the roots of radicalisation, while failing to help deal with the existential threat that Islamic State poses due to political considerations at home. Yet, it is no longer about a choice between countering terrorism and respecting human rights.
It is impossible to win the fight against terror in this region without addressing the oppression and lack of opportunity that spawns it. Defending human rights and confronting religious extremism,
and working to end the discrimination against Syrian and Iraqi Sunni populations, as well as against Bedouins of Sinai, would be the necessary first steps in a long journey to deal with human rights violations in the MENA. Whereas in Western Europe, Muslim minority populations question their social and cultural position, and in MENA, state breakdown encourages sectarian strife and the persecution of religious minorities, in ‘secular’ Central Asia, Islamist movements represent a challenge to the status quo. This is not because they are especially powerful: today they stand almost no chance of overpowering state institutions or gathering substantial support in urban areas. Yet, regional governments have sought to combat what they see as extremism in a heavy-handed manner which has exacerbated the problem which Islamist movements see themselves fighting against: poor, corrupt and repressive ruling regimes. Many Central Asian governments are Western-friendly and, while Islamism is likely to remain a long-term (if low-level) threat to stability, it does highlight to many ordinary Central Asians that the West is a friend to their often highly disliked governments. Continued socio-economic adversity and growing animosity towards an overbearing, monopolistic state are likely to increase the number of instances of instability across Central Asia. Social discontent may result in support for underground religious movements, rather than opposition parties, while strengthening anti-Western feeling in many Central Asian countries.
Politics, Religion, Governance and Global Order
Many recent analyses of religion and politics highlight the relevance of economic, social and/or cultural issues, including the economic range and social and cultural significance of transnational business corporations (TNCs) (Haynes, 2013). This often leads to the perception that TNCs are taking economic power
both from governments and from citizens. This comes in the context of what is popularly understood as a significant downside to economic globalisation: the apparent mass impoverishment of already poor people, in both developing and developed countries. These circumstances have encouraged numerous religious organisations, including, for example, the 350-member World Council of Churches, to focus on these economic imbalances, and suggest ways to ameliorate them using the power of religious organisation and community. This focus is manifested in various ways, including: new religious fundamentalisms; support for anti-globalisation activities, such as recent anti-globalisation and anti-World Trade Organisation protests; and North/South economic justice efforts, including the Millennium Development Goals (2000–15) and their successor, the Sustainable Development Goals (2015–30). In short, recent religious responses to what is perceived as the unacceptable face of economic globalisation highlight again the emergence of religion as a public actor and its (potential or actual) impact on political outcomes, within countries or internationally. This observation is based on a recognition that around the world, many religious organisations and (secular) development agencies share similar concerns: (1) improving the lot of materially poor people, (2) improving the societal position of those suffering from social exclusion, and (3) addressing unfulfilled human potential in the context of glaring developmental polarisation within and between countries, which the World Bank now accepts has arisen in part because of the polarising impact of globalisation. These developmental concerns focus upon, but are not confined to, issues linked to the impacts of poverty, HIV/AIDS, conflict, gender concerns, international trade and global politics.
These issues explicitly link all the world’s countries and peoples – rich and poor – into a global community, and how to resolve them poses a challenge to governance and global order.
These challenges are manifested in the actions of some extremist religious organisations whose impact upon US and Western interests is explicitly hostile and very difficult to counter. They are likely to get worse over the next 20 years – unless concerted efforts are made to blunt their impact by ameliorating the conditions which give rise to them. For example, al Qaeda still has a stronghold in Yemen, while Islamic State established a hold over large parts of Syria and Iraq and a so-called State of Sinai in Egypt. Despite its setbacks, Islamic State still has a considerable presence in both Libya and Mali, has developed a transcontinental alliance with Boko Haram in Nigeria, and is no doubt still plotting to carry out terrorist attacks in European and American cities. Premised upon the idea that ‘Western education is forbidden’, Boko Haram is vehemently anti-Western, although it is also an apparently indiscriminate killer of anyone who disagrees with the organisation and its ideology. Such horrific suffering is the result of decades of severe social injustices, Islamic extremism, sweeping human rights violations and the absence of good governance. Boko Haram violence underpins rising death rates among Nigerian citizens, while thousands are also killed in conflicts in the Central African Republic, South Sudan and Somalia. Such conflicts highlight how religion, along with culture, ethnicity and identity, is an important component in understanding governance and global order issues, while contextualising current Western counterinsurgency efforts. Following 9/11, first al Qaeda and its affiliates and then Islamic State and its allies sequentially posed serious threats to governance in many countries, and, by extension, global order and Western security. While it is well known that al Qaeda perpetrated multiple attacks against US and Western targets in the 1990s and early 2000s, these outrages raised questions about the ideological assumptions and goals of al Qaeda.
While Bin Laden was personally committed to the fight against the ‘far enemy’ – the
United States – Islamic State fights the ‘near enemy’: ideologically ‘un-Islamic’ governments and populations in the MENA. However, given that many of the dead in the attacks were not Western Christians or Jews but local Muslims, this raised the question of what exactly the perpetrators are seeking to achieve. What today are the ideological assumptions and goals of what is left of al Qaeda and its temporarily more powerful successor, Islamic State? Al Qaeda first emerged in the late 1980s to challenge the incumbency and authority of rulers in various Middle Eastern countries, including Saudi Arabia, with the objective of replacing them with plausibly ‘Islamic’ leaders. Over time, however, a lack of success in achieving these objectives led al Qaeda strategists to shift attention to regional and global goals, including taking the fight, on 9/11, to the ‘far enemy’ (Gerges, 2009). The result was a continuing ‘anti-Western’ war, which sought to utilise various ‘weapons of terror’ – a campaign now taken up and adapted to today’s specific conditions by Islamic State. Both al Qaeda and Islamic State share concerns about spreading the ‘right’ religion by jihad, and about the global balance of power currently dominated by the United States and the West. Over time, wars in Iraq and Afghanistan, as well as more recent, and in some cases continuing, conflicts in Mali, Nigeria and Syria, indicate that religion, culture and identity are continuing concerns in many conflicts. In each case, there are explicit links to long-term and systemic governance shortfalls, which have to be ameliorated before the threat from extremist Islam can be nullified and the threat to the West’s security significantly reduced.
Conclusion
Examples of religion’s recent political impact abound in countries at varying levels of economic and political development. For example, there was the crucial role of Christian, especially Catholic, churches in
the ‘third wave’ of democracy in southern and eastern Europe and Latin America and Africa from the early 1970s to the early 2000s; the overthrow of the Shah of Iran in 1979 and the sequential growth of Islamist movements across the Muslim world from Morocco to Malaysia; the (New) Christian Right in the United States, which emerged in the 1980s and is still today demanding fundamental political, social, and moral changes; long-running hostility between Protestants and Catholics in Northern Ireland and between Muslims and Christians in Africa; Hindu and Sikh political radicalism in India; Buddhist activism in South East Asia, including anti-Rohingya activities in Myanmar; and Jewish extremism and ‘normal’ political activities in Israel. As a result, around the world, the mass and social media, social scientists, professional politicians and policymakers, and many ‘ordinary’ people, now feel compelled to pay greater attention to religion as a significant, albeit variable, socio-political actor. What they have in common is that today religious organisations of various kinds openly reject the previously uncontroversial secular ideals dominating many polities, especially in the West, appearing instead as champions of alternative, confessional options. Often keeping faith with what they interpret as divine decree, such religious actors refuse to render to non-religious power holders either material or moral tribute. Increasingly concerned with various political issues, they challenge the legitimacy and autonomy of the primary secular spheres: the state, political organisation and the market economy. In addition, religious leaders now refuse to restrict themselves to the pastoral care of individual souls; instead, they raise questions about, inter alia, the interconnections of private and public morality and the claims of states and markets to be exempt from extrinsic normative considerations. 
Intent on retaining or increasing their social and religious importance, they seek to elude what they regard as the
cumbrous constraints of temporal authority, while threatening to usurp constituted political functions. In short, refusing to be condemned to the realm of privatised belief, religion strongly reappears in the public sphere, thrusting itself into issues of societal, social, moral and political contestation. As Casanova (1994: 6) puts it, ‘what was new and became “news” in the 1980s was the widespread and simultaneous’ refusal of the so-called ‘world religions’ – that is, Islam, Christianity, Hinduism and Buddhism – to be ‘restricted to the private sphere’. The most eye-catching and societally concerning aspect of the ‘return’ of religion to the public sphere is its connection to political extremism and terrorism. In particular, terrorism encouraged and fuelled by Islamic extremism poses a threat to global security and stability, not only in the West but also in many other parts of the world, such as the countries of the MENA, where sectarian rivalries are egregious. Around the world, high – and in many cases growing – levels of inequality contextualised by differences based on religion, ethnicity and/or class are highly likely to endure; and this will encourage and, in some cases, exacerbate the spread of religion-based terrorism. This will continue to be a serious source of tension in many countries and regions and will impact on the overall governance and stability of many countries, while also affecting the ability of global governance to work smoothly and harmoniously. It is clear that areas of considerable sectarian tension exist across the world, especially in many of the 20 or more countries that comprise the MENA. 
If there were a prolonged period of escalation, perhaps underpinned by further deteriorations in political and developmental wellbeing, then campaigns of terrorist attacks could be carried out on a previously unseen scale, further plunging the MENA region into chaos, with knock-on effects felt in neighbouring regions: Western Europe and sub-Saharan Africa. It is possible that attacks at such a heightened level could cause
a hitherto relatively stable major power, such as Egypt, to descend into civil war, just as in neighbouring Syria. In addition, pre-existing religious and sectarian divides, including intra-Islamic and Islamic–Christian and/or Islamic–Jewish conflicts, could coalesce, rapidly escalating into transnational conflicts affecting both the MENA and neighbouring regions. In such circumstances, it is conceivable that some countries would be drawn into a wider war, as pressure from their populations, existing treaty obligations and allegiances compel them to take sides. Should the United Nations, typically deadlocked, weak and hamstrung, and regional security organisations, unsure of how and why to act, be unable to take up the challenge, then widespread and serious conflicts could result. While the scenario sketched out in the previous paragraph represents an extreme outcome, it is clear that both terrorism and sectarian and inter-religious tensions and conflicts are at the centre of Western security concerns, and have been since at least 9/11 and, arguably, as far back as the late 1970s and the unexpected success of the Iranian revolution. As we have seen in recent years in relation to the Arab Spring events and political developments in many countries in the MENA more generally, governance problems are at the heart of religion’s involvement in regional and transnational conflicts which also directly and seriously affect Western security interests. The starting point for our analysis in this regard was to note that globalisation can both highlight and encourage religious pluralism, or it can encourage intrafaith and inter-religious hostility and conflict. Several world religions, namely Christianity, Islam and Judaism (the ‘religions of the book’), claim ‘exclusive accounts of the nature of reality’, that is, only their religious beliefs are judged to be true by adherents. This is not to say that conflict is somehow inevitable.
Rather, religious responses can aim to be both constructive and ameliorative. But whether this turns out to be the case can only be answered with empirically derived evidence over time.
Notes
1 An Islamist is a believer in or follower of Islam who may be willing to use various political means to achieve religiously derived objectives.
2 In this chapter, a religious actor is one that undertakes action as a consequence of religious faith and beliefs. Religious actors include: churches and comparable religious organisations in non-Christian religious contexts; social movements whose main motivating factor is their members’ religious beliefs; and political parties whose ideology has roots in identifiable religious beliefs and traditions.
3 Of the more than 20 countries in the Middle East and North Africa, only Tunisia has undergone a post-Arab Spring transition to democracy which endures at the time of writing (October 2018).
References
Bouta, T., Kadayifci-Orellana, S. A. and Abu-Nimer, M. (2005) Faith-based Peace-Building: Mapping and Analysis of Christian, Muslim and Multi-Faith Actors, The Hague: Netherlands Institute of International Relations.
Casanova, J. (1994) Public Religions in the Modern World, Chicago: University of Chicago Press.
Ellis, S. and ter Haar, G. (2004) Religion and Development in Africa. Unpublished background paper prepared for the Commission for Africa, 9 December. Available at https://openaccess.leidenuniv.nl/bitstream/handle/1887/12909/ASC-071342346-17401.pdf?sequence=1 (last accessed 10 October 2018).
Gerges, F. (2009) The Far Enemy: Why Jihad Went Global, 2nd ed., Cambridge: Cambridge University Press.
Halliday, F. (2005) The Middle East in International Relations: Power, Politics and Ideology, Cambridge: Cambridge University Press.
Haynes, J. (2013) An Introduction to International Relations and Religion, 2nd ed., London: Pearson.
Haynes, J. (ed.) (2016) Routledge Handbook of Religion and Politics, 2nd ed., London: Routledge.
Huntington, S. (1991) The Third Wave: Democratization in the Late Twentieth Century, Norman: University of Oklahoma Press.
Huntington, S. (1993) ‘The clash of civilisations?’, Foreign Affairs, 72, 3, pp. 22–49.
Huntington, S. (1996) The Clash of Civilizations and the Remaking of World Order, New York: Simon and Schuster.
Kurtz, L. (1995) Gods in the Global Village: The World’s Religions in Sociological Perspective, New York: Sage.
Linz, J. J. and Stepan, A. (1996) Problems of Democratic Transition and Consolidation: Southern Europe, South America, and Post-Communist Europe, Baltimore, MD: Johns Hopkins University Press.
Madeley, J. (2009) ‘E unum pluribus: The role of religion in the project of European integration’, in J. Haynes (ed.), Religion and Politics in Europe, the Middle East and North Africa (pp. 114–35), London: Routledge/ECPR.
Martel, F. (2013) ‘Christianity booming in Asia and sub-Saharan Africa’, Breitbart, 19 December. Available at http://www.breitbart.com/national-security/2013/12/19/christianity-booming-in-asia-and-sub-saharan-africa/ (last accessed 10 October 2018).
Pew Research Center (2010) ‘Tolerance and Tension: Islam and Christianity in sub-Saharan Africa’, 15 April. Available at http://www.pewforum.org/2010/04/15/executive-summary-islam-and-christianity-in-sub-saharan-africa/ (last accessed 10 October 2018).
Pew Research Center (2011) ‘The Future of the Global Muslim Population’, 27 January. Available at http://www.pewforum.org/2011/01/27/the-future-of-the-global-muslim-population/ (last accessed 10 October 2018).
Smock, D. (2004) ‘Divine intervention: Regional reconciliation through faith’, Religion, 25, 4. Available at http://hir.harvard.edu/article/?a=1190 (last accessed 10 October 2018).
Stepan, A. (1988) Rethinking Military Politics: Brazil and the Southern Cone, Princeton, NJ: Princeton University Press.
Stepan, A. (2000) ‘Religion, democracy and the “twin tolerations”’, Journal of Democracy, 11, 4, pp. 37–57.
Weigel, G. (2005) Witness to Hope: The Biography of Pope John Paul II, 1920–2005, New York: Harper Collins.
Weigel, G. (2007) Faith, Reason, and the War against Jihadism, New York: Harper Collins.
Wilson, B. (1992) ‘Reflections on a Many-Sided Controversy’, in S. Bruce (ed.), Religion and Modernization (pp. 195–210), Oxford: Clarendon Press.
54 Responsiveness Jeeyang Rhee Baum1
Introduction
Robert Dahl famously asserted that ‘[a] key characteristic of democracy is the continuing responsiveness of the government to the preferences of its citizens, considered as political equals’ (1971: 1). Following the democratic guidelines established by Dahl, political theorists describe two fundamental principles of democracy. The first principle states that ‘people [must] have a substantial influence over which policies a government passes and which it does not pass’ (Peters and Ensink, 2015: 577). The second principle adds that ‘government should not just be responsible overall, but they should also be equally responsive to their citizens’ (Peters and Ensink, 2015: 577). Leonardo Morlino defines responsiveness as ‘the capacity to satisfy the governed by executing the policies that correspond to their demands’ (Morlino, 2004: 15) and identifies it as one of the principal dimensions along which good democracies may vary. Responsiveness is a central
quality that cannot be detached from representative democracies, and it is of great significance as it ‘connects citizens to authoritative decision making and not only to the selection of leaders in elections’ (Esaiasson and Wlezien, 2016: 699). Interestingly, while traditional studies of responsiveness have primarily focused on the relationship between opinion and public policy in democratic systems, an increasing number of scholars have begun to devote their attention to responsiveness in supranational institutions and in authoritarian systems. Disentangling responsiveness from the type of regime is critical in understanding the political dynamics between the government and its citizens. As Dahl (1971: 14) explains, when ‘hegemonic regimes and competitive oligarchies move toward polyarchy they increase the opportunities for effective participation and contestation and hence the number of individuals, groups, and interests whose preferences have to be considered in policy making’. However, in
the perspective of the incumbents who currently govern, this transformation can be problematic as it may bring potential conflict, with the opponents of the government having greater opportunities to translate their goals into policies. Dahl (1971: 15) argues that the greater the conflict between government and its opponents, the more costly it is for each to tolerate the other. He advances three axioms regarding the likelihood of democratic change in a non-polyarchy. These are, first, that government tolerance of opposition is inversely related to the costs of toleration; second, that tolerance of opposition increases with the costs of suppression; and third, that the likelihood of a competitive regime increases as the costs of suppression rise relative to the costs of toleration. In short, seeking consensus is cheaper than suppression. Through an analysis of responsiveness in the United States, the EU, several East Asian democracies, the People’s Republic of China and Russia, this chapter seeks to gain a better understanding of the cross-national advances and differences in responsiveness.
Theories and Concepts
The concept of responsiveness is complex. Although there is a general acceptance that responsiveness may be perceived as ‘the “congruence” between citizens’ interests and political outcomes’ (Morlino and Quaranta, 2014: 334), the ‘chain of responsiveness’ (Powell, 2005) helps demonstrate that identifying such congruence between citizens and policies can be empirically difficult. Not only do citizens lack coherent and consistent policy preferences, but studies on responsiveness must be conducted under the assumption that ‘citizens are able to identify and evaluate their desires and preferences’ (Morlino and Quaranta, 2014: 334). The complexity of the concept helps explain why different studies have utilized divergent
approaches when measuring responsiveness. For instance, Arend Lijphart (1999) analyzed the distance between the government’s and citizens’ policy preferences to measure responsiveness, while Stuart Soroka and Christopher Wlezien (2010) relied on their examination of public preferences and policy outputs. Sara Binzer Hobolt and Robert Klemmensen (2008), alternatively, used executive policy promises (speeches) and policy actions (public expenditure) to determine policy responsiveness. Understanding the difficulty of measuring actual responsiveness, Morlino and Quaranta assert that progress in the study can be made ‘if a distinction is drawn between a more general “political” responsiveness and a more specific “economic” responsiveness’ (2014: 334). Political and economic performance are both argued to have a significant influence on responsiveness. Political responsiveness is based on the notion that ‘political institutions can be determinants of the functioning of a democratic regime’ (Morlino and Quaranta, 2014: 336). Analyses of electoral systems (Singh, 2013) and party system fragmentation (Anderson, 1998) are examples of the indicators of political responsiveness that have been used. Scholars (Criado and Herreros, 2007) have also long argued that the type of democracy – majoritarian or proportional – affects citizens’ perceptions of the political institutions. Lijphart (1999) and Powell (2005) find that consensual regimes perform better than majoritarian regimes, as citizens in majoritarian democracies showed less trust in their government. Kees Aarts and Jacques Thomassen (2008), however, reach contradictory conclusions in their research, asserting that majoritarian regimes demonstrated higher levels of satisfaction with democracy. The level of political freedom should also be taken into consideration in the study of responsiveness, as it is one of the core elements of a democratic regime (Morlino, 2011).
Although measures of freedom have mostly been used as control
variables in past studies, it is essential to recognize their theoretical importance. Studies have shown that ‘citizens living in countries with lower ratings of freedom are those more critical of the way democracy performs in their country (Norris, 1999)’, since ‘consolidated democracies, where political and civil rights are present, show better political performance, which in turn affects citizens’ evaluation of the regime’ (Morlino and Quaranta, 2014: 337). Economic responsiveness is based on the conception that economic performance can foster legitimation for democratic regimes (Anderson, 1998), and consequently affect citizens’ perceptions of the government. Economic performance in areas such as inflation, unemployment and GDP growth has demonstrated considerable influence on citizens’ satisfaction with democracy. Scholars (Wagner et al., 2009) have established that economic growth is positively associated with citizens’ satisfaction with democracy, while poor economic performance (i.e. unemployment and inflation) is negatively correlated with democratic satisfaction (Di Tella et al., 2003). Furthermore, economic inequality is another indicator that should be considered in the study of responsiveness, as a ‘high level of [economic] inequality is in contradiction to the democratic principles’ (Morlino and Quaranta, 2014: 337). One study (Anderson and Singer, 2008) indicated that economic inequality was negatively correlated with democratic satisfaction, and another (Solt, 2008) added that economic inequality depresses voter turnout, political interest and discussion and general participation. There are, however, other studies that find no evident relationship between economic inequality and democratic satisfaction (Wagner et al., 2009). Further research is necessary in this area, as comparatively little attention has been given to the study of the relationship between inequality and democratic regimes.
Analysis of Empirical Data in Different Systems
The United States
A 2017 report by the Economist Intelligence Unit established that democratic norms in 89 countries have deteriorated since 2016, and that the percentage of the world’s residents living in fully functional democracies decreased from 8.9% to 4.5% between 2015 and 2017. It is interesting to note that this precipitous drop was triggered by a decline in democratic norms in the United States. For a country that has long regarded itself as exceptional on the basis that it discovered at its inception the definitive form of a free society with core values such as liberty and equality, this result is a matter of great mortification. Nevertheless, considering the degree of economic inequality and the recent rise of populism in the United States, it is difficult to dispute the result. Sheri Berman (2017) asserts that ‘[i]nstitutional decay, in short, has weakened the responsiveness of American democracy, exacerbating the impact of its already many undemocratic features’. Both the Democratic and Republican parties now have ‘less ability to transmit voter preferences to politicians and into policies’ and ‘gerrymandering has also warped the translation of voter preferences into political outcomes while the ease and efficiency of the American voting process have also declined’ (Berman, 2017). Scholars have noticed a general declining trend in the public’s trust in the US government in the past decade. A 2007 survey by Tyler Schario and David Konisky disclosed that while the public expressed some confidence in the responsiveness of local governments, they tended to have ‘negative views of state and, especially, [of the] federal government’ (Schario and Konisky, 2008: 4). A 2018 report by the Pew Research Center supports the view that the public’s confidence in the responsiveness of the US government is in decline, with 61% stating that it is unlikely
that their representative would be responsive to their problems, and 76% saying that the government is run by a few big interests looking out for themselves. With an increasing number of reports suggesting that there is an overall trend among the public to perceive the government as unresponsive to their preferences, this chapter will look into whether this is also true in state and local governments.
The Local Level in the United States
Jeffrey Lax and Justin Phillips examined the quality of democratic government at the state level in the United States using national surveys and advances in subnational opinion estimations. Their research established that state governments were generally responsive to voter preferences across a wide range of issues. In particular, state governments were more likely to be responsive to voter preferences concerning issues of high salience, ‘even after controlling for the ideology of state voters and elected elites’ (Lax and Phillips, 2012: 164). Julianna Pacheco’s research supports the view that the degree of policy representation and public responsiveness has been ‘dependent on issue saliency, [suggesting] that representation is higher for some issues’ (Pacheco, 2013: 327). Although state governments seemed to be responsive to public preferences on issues of high salience, it is important to realize that there still exists considerable evidence of democratic deficit at the state level. Lax and Phillips establish that state policies were congruent with majority opinion only about half of the time, ‘a clear “failing” grade on the congruence test’ (2012: 164). A more recent study by Devin Caughey and Christopher Warshaw (2018) supports previous findings that state policymaking responds to mass policy preferences. Examining state policies in the United States from 1936 through 2014, they find that state policies have become more aligned with
public opinion in recent decades. They further note that the early 1970s was ‘a key inflection point’, in which governmental responsiveness appeared to be more pronounced. While the results of their study showed an almost equal degree of responsiveness throughout US states prior to the 1970s, government responsiveness in areas outside of the South became higher after the 1970s, especially concerning economic issues. One noteworthy discovery in Caughey and Warshaw’s research was that, contrary to the popular perception of partisan state-level politics, there was little evidence to support the claim that responsiveness was channeled through turnover in political parties. Changes in policy were not driven by selection of officials from parties, but instead occurred ‘in large part through the adaptation of incumbent officials’ (2018: 249).
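As a rough illustration of the congruence test that Lax and Phillips apply to state policy, the following Python sketch computes the share of state-policy pairs in which policy matches majority opinion. It is a minimal sketch under stated assumptions, not their estimation procedure: the `StatePolicy` record, the function names, the simple 50% majority threshold and the sample values are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StatePolicy:
    """One state-policy pair. All values used below are hypothetical."""
    state: str
    support: float  # estimated share of residents favoring the policy (0 to 1)
    adopted: bool   # whether the state currently has the policy in place


def is_congruent(item: StatePolicy) -> bool:
    """A pair is congruent when policy matches the majority preference.

    Assumption: a majority means support strictly above 0.5, so an exact
    tie is treated as no majority in favor.
    """
    majority_wants_policy = item.support > 0.5
    return majority_wants_policy == item.adopted


def congruence_rate(items: list[StatePolicy]) -> float:
    """Share of state-policy pairs in which policy matches majority opinion."""
    if not items:
        raise ValueError("no observations")
    return sum(is_congruent(i) for i in items) / len(items)


# Hypothetical illustration, not data from any study.
sample = [
    StatePolicy("A", 0.62, True),   # majority in favor, policy adopted
    StatePolicy("B", 0.58, False),  # majority in favor, policy absent
    StatePolicy("C", 0.41, False),  # majority opposed, policy absent
    StatePolicy("D", 0.47, True),   # majority opposed, policy adopted
]
rate = congruence_rate(sample)
print(f"congruence rate: {rate:.2f}")
```

In this made-up sample two of the four pairs are congruent, which mirrors the 'about half of the time' pattern only by construction; real applications would first need opinion estimates for each state and policy.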
Cities
Studies of responsiveness at the state level provide us with some reassurance regarding the health of American democracy. At least at the state level, the public’s voice seems to make a difference. But is this also true of responsiveness in local governments? Katherine Einstein and Vladimir Kogan (2015: 3) use ‘a comprehensive cross-sectional database linking voter preferences to local policy outcomes in more than 2,000 midsize cities and a new panel covering cities in two states’ to conclude that cities tend to be responsive to the needs of their constituents. The results from their study established that voter preferences helped explain why cities adopted certain policies. For example, when the residents demanded greater spending, the cities responded by choosing to apply for aid. Chris Tausanovitch and Christopher Warshaw (2014: 605) support their argument, finding that ‘policies enacted by cities across a range of policy areas correspond with the liberal-conservative positions of their citizens on national policy issues’.
Paul Schumaker and Russel Getter (1977), however, note that responsiveness bias is indisputably present in US cities. Through an analysis of 51 American communities, they find that responsiveness bias was present overall. Cities were more likely to ‘bias their policy responses in favor of the advantaged than in favor of the disadvantaged, thus suggesting that in most cities upper-SES and white citizens receive more of what they prefer in public policy than do lower-SES and black citizens’ (p. 274). They further determined that cities with great bias toward the advantaged shared the following characteristics: (1) larger population size, (2) greater levels of population wealth, (3) power structures in which governmental officials have comparatively lower levels of influence than private elites in matters of public affairs, (4) lower levels of Democratic party strength, and (5) interest groups which are well-organized, and which exhibit low levels of black representation (Schumaker and Getter, 1977: 275). These characteristics suggest that wealthier cities with lower minority representation typically showed greater policy bias in favor of the advantaged.
Income and Responsiveness in the United States
A vast literature concerning responsiveness in the United States has focused on determining the relationship between income and responsiveness (e.g. Bartels, 2017; Gilens, 2005, 2012). It is interesting to note that there is a general consensus in the results, even though the studies were independent and employed different data and research designs. Larry Bartels found in his analysis of US senators’ roll call votes that they were ‘influenced by the preferences of constituents in the top one-third of the income distribution but not at all by the preferences of low-income constituents’ (Bartels, 2017: 5–6). Research conducted by
Elizabeth Rigby and Gerald Wright (2013) also established that state governments were generally unresponsive to low-income preferences. Furthermore, Martin Gilens (2005, 2012) used an original data set of nearly two thousand survey questions on proposed policy changes between 1981 and 2002 to determine that responsiveness was strongly tilted toward the preferences of the most affluent citizens, while the preferences of low-income citizens had virtually no impact on the policies that the government adopted.
The EU
Like the United States, Europe has been experiencing an overall decay in democratic values. In a 2014 survey by the European Election Study, 47% of Europeans responded that the European Parliament did not represent their preferences, while 44% asserted that they did not trust the institutions of the European Union (Schmitt et al., 2015). Simon Hix (2008) finds that popular support for the EU has been in decline since the 1990s, ‘when the age of “permissive consensus” ended, following the establishment of the internal market’ (Sorace, 2017: 4). Sheri Berman (2017) identifies two major causes for this democratic deficit in the EU. First, the European political parties have become considerably weaker over the past decades. ‘[M]embership has declined, activist networks have withered, and voter loyalty has diminished, which has translated into higher rates of vote switching and greater political disengagement’ (n.p.). Peter Mair supports this view, asserting that those in the European party system ‘have become so disconnected from the wider society, and pursue a form of competition that is so lacking in meaning, that they no longer seem capable of sustaining democracy’ (cited in Berman, 2018: n.p.). Second, Berman and many other scholars view the EU as a factor that undermines democracy, as critical decisions are made by
‘unelected technocrats’ without any direct input from the citizens. Many Europeans today perceive the EU as being run by distant and unaccountable political elites, and this disconnect between Brussels and many European voters has enabled populist and Euroskeptic parties to gain political ground. Euroskeptic parties assert that ‘domestic institutions can’t prevent governments, businesses and international bureaucrats from trampling over the welfare of the citizens in pursuit of their own interests’ (Schneider, 2018b). Nevertheless, further research is necessary to determine whether it is accurate to perceive supranational organizations such as the EU as unable to be responsive to the preferences of their citizens. There is growing scholarly attention to the difference between short-term and medium-term responsiveness, particularly as it relates to the behavior of neo-populist leaders in Europe. Morlino and Quaranta (2014) find that responsiveness is associated more strongly with macro-economic conditions (debt, unemployment, growth and inflation) than with the substantive dimensions of democracy (freedom and equality). In other words, they argue that responsiveness is a dimension that is partially unrelated to the type of regime and to the level of inequality. This implies two things: first, partially democratic regimes can have high responsiveness and, as a result, if responsiveness becomes a priority, some aspects of democracy become less of a priority; second, if employment or growth improve, the level of inequality is partially irrelevant and responsiveness could improve, as well as the level of freedom and the type of democracy (pp. 351–2). They conclude that this result is relevant for party leaders, insofar as it indicates that they should primarily focus on macro-economic conditions.
Can International Organizations Be Responsive? Although the standard argument is that supranational institutions fail to be responsive to
the concerns of the public, Sara Hagemann et al. (2016) argue that governments are able to respond to domestic public opinion even when acting at the international level. Using a data set on all legislative decisions adopted in the Council of the EU since 1999, the authors contend that ‘government opposition to legislative proposals is shaped by public opinion on European integration’ (Hagemann et al., 2016: 869). It is, however, important to realize that the research by Hagemann et al. focuses on signal responsiveness rather than policy responsiveness. The authors note that this distinction matters because ‘while the presence of signal responsiveness indicates that citizens’ views are heard, it does not guarantee that they are represented’ (Hagemann et al., 2017: 870). Christina Schneider (2017) supports the findings of Hagemann et al., arguing that EU governments signal responsiveness to their national electorates and that voters respond favorably to these signals. Using data on ‘the bargaining behavior and negotiation success of the 28 EU members in European legislative, and original data from a survey experiment in Germany’, Schneider finds that ‘EU governments are more likely to defend positions that favor their domestic constituents’ (Schneider, 2017: 1). Like the research conducted by Hagemann et al., however, Schneider’s (2018a) research focused on signal responsiveness rather than policy responsiveness. Future research is therefore needed to examine whether governments’ continuous use of the international stage to signal responsiveness can help ensure that more of the public’s preferences and concerns are represented in actual policies.
Income and Responsiveness in Europe Much literature concerning the relationship between income and responsiveness has concentrated on the United States, but recent
studies have expanded their scope to the European context (Fortin-Rittberger and Eder, 2013; Rosset, 2013). Paralleling results from the United States, Jan Rosset finds that ‘representatives are better aligned with the ideology of the higher-income than with the lower-income group’ in Switzerland (cited in Peters and Ensink, 2015: 579). A subsequent 2013 study by Rosset et al. examining differential responsiveness in 24 European democracies found that the tendency to under-represent low-income citizens was not unique to Switzerland, but instead was a widespread phenomenon in European democracies. Yvette Peters and Sander Ensink support this view, asserting that there is a tendency for lower-income groups to be under-represented while higher-income groups are over-represented. Peters and Ensink’s research, which used time-series cross-sectional methods to analyze data on 25 European countries from 2002 to 2010, suggested that ‘differential responsiveness is more pronounced in situations where the preferences of the rich and the poor diverge more’ (Peters and Ensink, 2015: 596). In such situations, governments ultimately responded to the preferences of high-income citizens rather than those of low-income citizens.
East Asian Democracies A critical link in the chain of democratic accountability occurs when elected officials delegate sweeping policy making authority to unelected bureaucrats. After all, as the US government’s own definition of democracy implies, accountability in governance is a cornerstone of stable democracy. Yet, observers worry that elected officials may be unable to control these bureaucratic policymakers and that citizens may have difficulty holding them accountable. This is especially troublesome given the extraordinary breadth and depth of responsibilities that elected policymakers routinely delegate to administrative agencies in modern nation states, including
authority to interpret health, safety, environmental and economic regulations, as well as to allocate public investments (such as transportation and energy). To increase accountability, some countries require that bureaucratic agencies provide the public with information on their past performance (e.g. public disclosure) and direct opportunities to participate in future bureaucratic policymaking (e.g. notice, public hearing and comment procedures). They also bestow on civil society the right to monitor bureaucratic compliance with legislative directives (e.g. citizen lawsuits and judicial review). Yet other democracies appear to exclude civil society from directly participating in bureaucratic policymaking. Though not all new democracies have implemented administrative procedural reform, and though Administrative Procedure Acts (APAs) do not appear to be a necessary feature of democracy, Baum (2011: 237) suggests that enactment of APAs in new democracies does appear to facilitate movement from ex post toward ex ante democracy. This, in turn, can make the state more responsive to its citizens. After all, by making government decision making more transparent and giving citizens voice in the policy implementation process, procedural openness reduces arbitrariness in governance while increasing the ability of citizens to influence policy outcomes. In other words, transparency and participation – among the primary effects of an APA – essentially define institutional democracy. In most cases, it is too early to know with certainty whether statutory transparency and participation (via an APA) have contributed toward the success of new democracies. 
However, if public trust is a valid gauge of citizen satisfaction with democracy and thus a useful indicator of the state of democracy within a given nation or ‘perceived responsiveness’ (Morlino, 2011; Powell, 2005), there is suggestive evidence that citizens in South Korea and Taiwan – and especially those not closely affiliated with the former,
authoritarian regime – appear to hold greater trust in their democracy in the post-APA period (Baum, 2009, 2011). Just as elections are an important pillar of democratic accountability, statutory transparency and participation procedures are equally important for the maturation of new democracies. Only time will tell whether and to what extent APAs ultimately facilitate the transition from ex post institutional democracy to ex ante responsive democracy in other developing democracies around the world (Baum, 2011: 239–40).
The People’s Republic of China
It is not surprising that comparatively little attention has been given to the study of responsiveness in authoritarian regimes, as this is a concept fundamentally tied to democracy. Nevertheless, a growing number of recent studies examine responsiveness in these regimes. Tianguang Meng et al. (2014) stress that scholars must be cautious when applying the concept of responsiveness to an authoritarian context. This is because in countries such as China, where the legacy of totalitarianism is deeply rooted in its culture, ‘the willingness of citizens to express their preferences undoubtedly differs from that in a consolidated democracy’ (Meng et al., 2017: 402). Although the Chinese government has developed new formal and informal channels for citizens to express their preferences, it remains inevitable that information or speech deemed inappropriate from the perspective of the government will be curtailed through censorship or physical repression. Considering that theoretical predictions of Western political science assert that authoritarian governments are generally not responsive to public preferences, this chapter subsequently investigates why authoritarian governments feel the need to respond to the preferences of their citizens.
Why Do Authoritarian Governments Respond?
Contrary to the predictions of the standard theory of democracy, recent research on responsiveness in authoritarian systems reveals surprisingly high responsiveness. But why do authoritarian governments feel the need to be responsive? Jidong Chen et al. (2015: 29) argue that citizen engagement may in fact contribute to regime survival in authoritarian regimes, ‘or at the very least, citizen engagement is not necessarily a harbinger of the collapse of institutionalized single-party regimes’. This finding is in direct contrast to previous research by Daron Acemoglu and James Robinson (2006), who asserted that citizen engagement and protest act as ‘catalysts for regime change’. Using an online experiment among 2,103 Chinese counties, Chen et al. find that ‘approximately one third of county governments in China are responsive to citizens’ requests related to social welfare’ (Chen et al., 2015: 383). The results showed that threats of collective action increased responsiveness overall in local governments by approximately 30%, while increasing public responsiveness by almost 50%. Surprisingly, however, the study did not find that the Chinese government demonstrated any bias toward longstanding members of the Communist Party.
A Comparison with Democracies Recent studies on responsiveness in China have, surprisingly, established that the Chinese government exhibits a considerably high level of government responsiveness. In an Asian Barometer Survey conducted in 2008, approximately 78% of mainland Chinese citizens responded that the government is responsive to their needs. In contrast, only 36% of Taiwanese citizens answered that their government was responsive. The results are even more alarming in other East Asian democracies that have modeled the
Western liberal democratic system, such as South Korea (21%) and Mongolia (25%). J. Yingnan Zhou and Ray Ou-Yang (2017) assert that authoritarian countries may appear to have a higher level of government responsiveness than democracies because of three key differences. First, unlike in democracies, where failing to elect one’s preferred candidate predisposes voters to critical assessment of government responsiveness, such predisposition does not exist in authoritarian countries ‘where elections are nonexistent or nominal’ (Zhou and Ou-Yang, 2017: 283). Second, elections create incentives for democratic leaders to ‘over-respond to certain groups’, yet such a mechanism does not exist in authoritarian countries either. Third, ‘the solid and clear legitimacy established by electoral victories shield democratic leaders from particularistic demands made through unconventional channels’ (Zhou and Ou-Yang, 2017: 283). Therefore, without solid and clear legitimacy, authoritarian governments ‘are compelled to cement legitimacy by increasing responsiveness’ (Zhou and Ou-Yang, 2017: 283). The authors conclude that low responsiveness in democracies may, in fact, be an inevitable result of having democratic elections and government transparency.
Quasi-Democratic Institutions as Window Dressing?
An increasing number of scholars are finding that authoritarian countries employ quasi-democratic institutions in an effort to retain their power. China has adopted quasi-democratic institutions ranging from ‘village elections to people’s congresses to public participation mechanisms’ (Meng et al., 2017: 403). Using a list experiment of 1,377 provincial and city level leaders in China, Meng et al. find that receptivity is possible at the subnational levels. Despite recent evidence that China has high perceived government responsiveness, however, it is imperative to view this argument with some skepticism. Zhou and Ou-Yang (2017) assert that authoritarian governments may be merely creating a ‘perception’ of a responsive government in the wider world. In their words, ‘[p]erception may not correspond to reality’ and ‘[w]e cannot infer that authoritarian governments are more responsive to public opinion by any objective measure’ (Zhou and Ou-Yang, 2017: 297). On the other hand, Meng et al. assert that quasi-democratic institutions may be more than mere window dressing for local leaders. Nevertheless, they also remain cautious about making any hasty judgments, as it remains unclear whether responsiveness at the local level ultimately leads to actual policy decisions. Further research is therefore necessary to better evaluate whether China’s use of quasi-democratic institutions is mere window dressing.
Russia Russian electoral authoritarianism is a classic case where responsiveness is a key element of the regime, but is complemented by manipulation, corruption and other similar aspects. Andreas Schedler writes: ‘Electoral authoritarian regimes play the game of multiparty elections by holding regular elections for the chief executive and a national legislative assembly. Yet they violate the liberal-democratic principles of freedom and fairness so profoundly and systematically as to render elections instruments of authoritarian rule rather than “instruments of democracy”’ (2006: 3). Electoral authoritarian (EA) regimes set up a whole institutional landscape of representative democracy by establishing ‘constitutions, elections, parliaments, courts, local governments, subnational legislatures, and even agencies of accountability’ (Schedler, 2006: 12). By allowing competitive multiparty elections, the EA regime recognizes subjects as citizens and ‘endow[s] them with the ultimate controlling power over who shall occupy the summit of the state’ (Schedler, 2006: 13). It is,
however, imperative to realize that electoral contests in EA regimes are ‘subject to state manipulation so severe, widespread, and systematic that they do not qualify as democratic’ (Schedler, 2006: 3). Authoritarian manipulation can exist in many forms: [r]ulers may devise discriminatory electoral rules, exclude opposition parties and candidates from entering the electoral arena, infringe upon their political rights and civil liberties, restrict their access to mass media and campaign finance, impose formal or informal suffrage restrictions on their supporters, coerce or corrupt them into deserting the opposition camp, or simply redistribute votes and seats through electoral fraud. (Schedler, 2006: 3)
No matter the form, different guises of authoritarian manipulation all tend to serve the purpose of ‘containing the troubling uncertainty of electoral outcomes’ (Schedler, 2006: 3). In EA regimes, as opposition parties are designed to lose elections, electoral contests are ‘a profoundly ambiguous affair for opposition parties’ (Schedler, 2006: 14). Schedler notes that authoritarian elections ‘do not provide any of the normative reasons for accepting defeat losers have under democratic conditions’, and also ‘fail to display the procedural fairness and substantive uncertainty that makes democratic elections normatively acceptable’ (Schedler, 2006: 14). In a more recent book, Schedler finds that manipulating the public perception of political realities through media censorship is one of the most effective ways in which ruling parties cope with problems, with exclusion and fraud also seen as effective instruments for authoritarian stabilization (Schedler, 2013).
Major Advances in the Field
Public Opinion and Political Decisions
The seminal works of political theorists Robert Dahl and Hanna Pitkin inspired a
great deal of empirical research ‘examining the relationship between citizens’ policy preferences and the policy choices of elected officials’ (e.g. Miller and Stokes, 1963; Page and Shapiro, 1983; Stimson et al., 1995; Soroka and Wlezien, 2010). Shapiro and Jacobs (2001) assert that there has been ‘a decline in democratic responsiveness at the very top of American national government’ (p. 150). Although it may look as though American presidents are paying increasing attention to public opinion, the authors note that this does not necessarily mean that policy making has become more responsive to public opinion. In fact, they argue that public opinion has little bearing on most political decisions and that, if anything, politicians aim to change the public’s perceptions rather than their own agendas. Only when politicians feel strongly pressured by public sentiment do they seem to follow the non-confrontational, neutral and compromising approach of choosing what would satisfy the majority.
Ideological/Partisan Preferences Lax and Phillips asserted that partisanship and interest groups affect the ideological balance of incongruence (2012: 148), leading to policy decisions which are much more responsive to ideological and party preferences than to policy preferences of the majority. This view has been standard for many political scientists, yet Caughey and Warshaw found contrasting results in their research. They argue that ‘partisan selection is a comparatively minor mechanism of responsiveness not because party control has no policy effects, but rather because mass policy preferences explain relatively little of the variation in party fortunes’ (Caughey and Warshaw, 2018: 263). This argument is ‘consistent with Erikson et al.’s “statehouse democracy” model, in which the platforms of Democratic and Republican parties in a given state diverge from one another (resulting in
partisan effects on policy) but are roughly centered on the state’s median voter’ (Caughey and Warshaw, 2018: 263). In addition, incumbents pay attention to voter preferences and adapt to changes in public opinion even without changes in party control (Caughey and Warshaw, 2018: 263).
Substantive Representation in the European Parliament
With an increasing number of scholars arguing that the EU is able to be responsive to the needs of its citizens, Miriam Sorace (2018: 3) compared ‘voters’ preferences in economic policy to political parties’ economic written parliamentary questions during the 2009–14 term of the European Parliament. Unlike previous studies that have concentrated on data from the Council, Sorace’s research focused on data from the Parliament. The results indicated that the European Parliament provided good institutional representation of the position of the average European voter, with EP political parties demonstrating strong tendencies to be drawn toward the preferences of the average citizen. One interesting finding is that, unlike previous studies which have argued that responsiveness was biased toward people of high income, data from the Parliament did not demonstrate any bias against the elites and the working class. Sorace holds that despite the popular perception that there is a democratic deficit in Europe, evidence from the Parliament suggests that it is ‘at most a pluralism deficit in the European Parliament, since substantive representation in the European Parliament is successful as far as the majoritarian norm is concerned’ (2018: 3).
The Salience of Political Contestation
Sara Hobolt and Robert Klemmensen (2008) examine data on ‘rhetorical responsiveness’ – how politicians respond to the public’s preferences through statements and speeches – and ‘effective responsiveness’ – the response offered by politicians in terms of actions. The scholars compared data from Britain, Denmark and the United States in the period 1970–2005 and determined that while rhetorical responsiveness is highest in Denmark, effective responsiveness is highest in the United States. One interesting finding from their study is that all three countries tended to display greater responsiveness when under pressure, suggesting that the degree of democratic responsiveness is highly correlated with the degree of political contestation present in a country.
Increasing Responsiveness Online in China
In China, there has been increasing use of online forums as new channels for political participation. Using big data analytics of full records of citizen–government interactions from 2008 to early 2014, Zheng Su and Tianguang Meng (2016) observed ‘a dramatic increase in both the expressed demands from citizens and the responses from government through the internet channel’ (Su and Meng, 2016: 65). Their study further finds that approximately 33% of online public demands received a response, signaling a rapid growth of the response rate in China. One important thing to note in the study is that authoritarian government responsiveness seemed highly selective, ‘conditioning on actors’ social identities and the policy domains of their online demands’ (Su and Meng, 2016: 52). Demands from local citizens that were collective and related to economic growth were most likely to receive a response.
Recommendations for the Future
Potential Negative Externalities Political equality and responsiveness to citizens are two fundamental values in representative
democracy, and thus a great deal of previous empirical research has focused on understanding actual responsiveness processes (e.g. Burstein, 2010; Wlezien and Soroka, 2010). However, comparatively little research has been done on whether there are negative externalities of government responsiveness. Marcia Grimes and Peter Esaiasson (2014) use a data set of the siting of unwanted facilities in two Swedish cities to argue that government responsiveness may have some undesirable consequences. They find that citizens with strong political resources have greater success in impressing their preferences upon decision makers, ‘meaning that government responsiveness may possibly exacerbate inequality in policy outcomes, especially if participatory democratic arrangements are prevalent’ (p. 758). This concern was demonstrated in the results of their study as it found that participatory decision making had contributed to inequality in policy outcomes, because citizens with stronger political resources were able to participate to a greater extent and sway the decision makers to their favored position. More research on potential negative externalities of government responsiveness is necessary in order to ensure that every citizen’s preferences are being represented equally.
Challenges and Limitations of the Study Many challenges and limitations exist in the study of responsiveness. As Bingham Powell notes, evaluating the quality of democratic responsiveness is a very difficult task. Powell argues that ‘correspondence between citizens’ policy desires and government’s policy outcomes is not a sufficient measure of democratic responsiveness’, and is doubtful as to whether there are any simple measures to adequately assess countries’ relative democratic responsiveness (Powell, 2005: 2). This chapter concurs with Powell’s assertion that there are several problems with the approach of measuring democratic responsiveness by
examining the relationship between public preferences and policy outcomes. First, as recent data suggest, voters lack coherent, consistent policy preferences. Consequently, it is extremely difficult for legislators to decide which preferences to respond to. Second, as Bartels argues in his critique of democratic theory, voters tend to ‘take their policy preference cues from political elites’ (Hill, 2018). Thus, responsiveness may not be democratic, as the elites shape public preferences and ‘use those preferences as justification for their policies’ (Hill, 2018). Third, voters tend to be highly influenced by their cognitive and emotional biases, which prevent them from making rational choices. Even when voters can overcome their biases, they still cannot effectively express their attitudes through elections because ‘America’s current electoral processes constrain voter choice such that citizens cannot adequately express their political attitudes through the election mechanism’ (Hill, 2018).
Shared Concerns of Democratic Deficit in the United States and the EU An interesting phenomenon is that contemporary liberal democracy is in crisis in both Western Europe and the United States. Saskia Brechenmacher (2018) observes that the decline in democratic norms in democracies on both sides of the Atlantic reveals a new sense of political convergence. For example, Europe and the United States share the concern of ‘high levels of citizen distrust in democratic institutions, alienation from establishment political actors, and unease about an increasingly fragmented and incoherent public information space that is vulnerable to polarization’ (Brechenmacher, 2018: 30). Unresponsiveness of political power is a salient issue which both Europe and the United States must confront in order to recover from the current state of
democratic crisis. To improve government responsiveness, both Europe and the United States are exploring new mechanisms to improve public trust in media institutions – ranging from ‘fact-checking and greater transparency measures to programs centered on media and civic literacy and investments in local newspapers and reporting’ (Brechenmacher, 2018: 31). Timothy Besley and Robin Burgess (2002) support this method of expanding the role of democratic institutions and mass media to ensure that citizens’ preferences are represented in policy. Besley and Burgess assert that the media can be utilized to produce a more informed and politically active electorate, which is crucial for giving the government strong incentives to be responsive. It is imperative that the United States and Europe perceive the current moment of democratic crisis as an opportunity to develop new local experiments in democratic innovation and identify more effective mechanisms for democratic responsiveness. Although it may not be enough, expanding the role of democratic institutions and mass media can be a good starting point for both the United States and Europe.
Note
1 The author thanks David Chai for excellent research assistance.
References Aarts, Kees, and Jacques Thomassen (2008). ‘Satisfaction with Democracy: Do Institutions Matter?’ Electoral Studies, 27(1): 5–18. Acemoglu, Daron and James Robinson (2006). The Economic Origins of Dictatorship and Democracy. New York: Cambridge University Press. Anderson, Christopher J. (1998). ‘Parties, Party Systems, and Satisfaction with Democratic
Performance in the New Europe’. Political Studies, 46(3): 572–88. Anderson, Christopher J., and Matthew M. Singer (2008). ‘The Sensitive Left and the Impervious Right: Multilevel Models and the Politics of Inequality, Ideology, and Legitimacy in Eurency ope’, Comparative Political Studies, 41(4–5): 564–599. Barabas, Jason (2007). ‘Measuring Democratic Responsiveness’. Unpublished. Florida State University, May 25. Revised version of paper presented at the 2007 Annual Meeting of the Midwest Political Science Association. Chicago, IL, USA. Barabas, Jason (2016). ‘Democracy’s Denominator: Reassessing Responsiveness with Public Opinion on the National Policy Agenda’. Public Opinion Quarterly, 80(2): 437–59. Bartels, Larry M. (2017). ‘Political Inequality in Affluent Democracies: The Social Welfare Deficit’. Unpublished. Vanderbilt University Center for the Study of Democratic Institutions. Presented at a Workshop on Political Inequality and Democratic Innovations. Baum, Jeeyang Rhee (2009). ‘The Impact of Bureaucratic Openness on Public Trust in South Korea’. Democratization, 16(5): 969–97. Baum, Jeeyang Rhee (2011). Responsive Democracy: Increasing State Accountability in East Asia. Ann Arbor: University of Michigan Press. Berman, Sheri (2017). ‘Populism Is a Problem. Elitist Technocrats Aren’t the Solution’. Foreign Policy, December 20, 2017. Berman, Sheri (2018). ‘Populists Have One Big Thing Right: Democracies Are Becoming Less Open’. The Washington Post, January 8, 2018. Besley, Timothy and Robin Burgess (2002). ‘The Political Economy of Government Responsiveness: Theory and Evidence from India’. The Quarterly Journal of Economics, 117(4): 1415–51. Bickerton, Christopher J. (2012). European Integration: From Nation-States to Member States. Oxford: Oxford University Press. Brechenmacher, Saskia (2018). ‘Comparing Democratic Distress in the United States and Europe’. Carnegie Endowment for International Peace, June 21, 2018. Burstein, Paul (2010). 
‘Public Opinion, Public Policy, and Democracy’, Kevin T. Leicht and J. Craig Jenkins Eds., Handbook of Politics:
State and Society in Global Perspective. New York: Springer. 63–79. Camyar, Isa (2014). ‘Institutions, Information Asymmetry and Democratic Responsiveness: A Cross-National and Multi-Level Analysis’. Acta Politica, 49(3): 313–36. Caughey, Devin and Christopher Warshaw (2018). ‘Policy Preferences and Policy Change: Dynamic Responsiveness in the American States, 1936–2014’. American Political Science Review, 112(2): 249–66. Chen, Jidong, Jennifer Pan and Yiqing Xu (2015). ‘Sources of Authoritarian Responsiveness: A Field Experiment in China’. American Journal of Political Science, 60(2): 383–400. Criado, Henar, and Francisco Herreros (2007). ‘Political Support: Taking into Account the Institutional Context’. Comparative Political Studies, 40(12): 1511–32. Dahl, Robert A. (1971). Polyarchy: Participation and Opposition. New Haven: Yale University Press. Dahl, Robert A. (1998). On Democracy. New Haven: Yale University Press. Di Tella, Rafael, Robert J. MacCulloch, and Andrew J. Oswald (2003). ‘The Macroeconomics of Happiness’. Review of Economics and Statistics, 85(4): 809–827. Dizikes, Peter (2018). ‘People Power’. MIT News, May 11, 2018. Einstein, Katherine L. and Vladimir Kogan (2016). ‘Pushing the City Limits: Policy Responsiveness in Municipal Government’. Urban Affairs Review, 52(1): 3–32. Erikson, Robert S., Gerald C. Wright and John P. McIver (1993). Statehouse Democracy: Public Opinion and Policy in the American States. New York: Cambridge University Press. Erikson, Robert S. (2015). ‘Income Inequality and Policy Responsiveness’. Annual Review of Political Science, 18(1): 11–29. Esaiasson, Peter and Christopher Wlezien (2017). ‘Advances in the Study of Democratic Responsiveness: An Introduction’. Comparative Political Studies, 50(6): 699–710. Fortin-Rittberger, Jessica, and Christina Eder (2013). ‘Towards a Gender-equal Bundestag? The Impact of Electoral Rules on Women’s Representation’. West European Politics, 36(5): 969–985.
Gilens, Martin (2005). ‘Inequality and Democratic Responsiveness’. Public Opinion Quarterly, 69(5): 778–96. Gray, John, and G.W. Smith eds. (1991). J.S. Mill On Liberty in Focus. London and New York: Routledge. Grimes, Marcia and Peter Esaiasson (2014). ‘Government Responsiveness: A Democratic Value with Negative Externalities?’ Political Research Quarterly, 67(4): 758–68. Hagemann, Sara, Sara B. Hobolt and Christopher Wratil (2017). ‘Government Responsiveness in the European Union: Evidence From Council Voting’. Comparative Political Studies, 50(6): 850–76. First pub. online 2016. Hill, Charlotte (2018). ‘Is Legislative Responsiveness a Good Measure of Democracy?’ Medium, February 22, 2018. Hix, Simon (2008). What’s Wrong with the European Union and How to Fix It. Cambridge: Polity Press. Hobolt, Sara Binzer and Robert Klemmensen (2008). ‘Government Responsiveness and Political Competition in Comparative Perspective’. Comparative Political Studies, 41(3): 309–37. Hong, Seung-Hun and Jong-sung You (2018). ‘Limits of Regulatory Responsiveness: Democratic Credentials of Responsive Regulation’. Regulation and Governance, 12(3): 413–27. Kang, Shin-Goo and G. Bingham Powell Jr. (2010). ‘Representation and Policy Responsiveness: The Median Voter, Election Rules, and Redistributive Welfare Spending’. The Journal of Politics, 72(4): 1014–28. Lax, Jeffrey R. and Justin H. Phillips (2012). ‘The Democratic Deficit in the States’. American Journal of Political Science, 56(1): 148–66. Lijphart, Arend (1999). Patterns of Democracy: Government Forms and Performance in Thirty-Six Countries. New Haven: Yale University Press. Mair, Peter (2014). On Parties, Party Systems and Democracy: Selected Writings of Peter Mair. Colchester: ECPR Press. Manza, Jeff and Fay L. Cook (2002). ‘A Democratic Polity?: Three Views of Policy Responsiveness to Public Opinion in the United States’. American Politics Research, 30(6): 630–67. Meng, Tianguang, Jennifer Pan and Ping Yang (2017). 
‘Conditional Receptivity to Citizen
Participation: Evidence From a Survey Experiment in China’. Comparative Political Studies, 50(4): 399–433. Miller, Warren E. and Donald E. Stokes (1963). ‘Constituency Influence in Congress’. American Political Science Review, 57(1): 45–56. Morlino, Leonardo (2004). ‘“Good” and “Bad” Democracies: How to Conduct Research into the Quality of Democracy’. Journal of Communist Studies and Transition Politics, 20(1): 5–27. Morlino, Leonardo (2011). Changes for Democracy: Actors, Structures, Processes. Oxford: Oxford University Press. Morlino, Leonardo and Mario Quaranta (2014). ‘The Non-Procedural Determinants of Responsiveness’. West European Politics, 37(2): 331–60. Norris, Pippa, ed. (1999). Critical Citizens: Global Support for Democratic Government. Oxford: Oxford University Press. Pacheco, Julianna (2013). ‘The Thermostatic Model of Responsiveness in the American States’. State Politics & Policy Quarterly, 13(3): 306–32. Page, Benjamin and Robert Y. Shapiro (1983). ‘Effects of Public Opinion on Policy’. American Political Science Review, 77(1): 175–190. Palus, Christine K. (2010). ‘Responsiveness in American Local Governments’. State & Local Government Review, 42(2): 133–50. Peters, Yvette and Sander J. Ensink (2015). ‘Differential Responsiveness in Europe: The Effects of Preference Difference and Electoral Participation’. West European Politics, 38(3): 577–600. Pew Research Center (2018). ‘The Public, the Political System and American Democracy’. April. Powell, Bingham (2000). Elections as Instruments of Democracy: Majoritarian and Proportional Visions. New Haven: Yale University Press. Powell, G. Bingham Jr (2005). ‘The Chain of Responsiveness’, in Larry Diamond and Leonardo Morlino (eds), Assessing the Quality of Democracy. Baltimore, MD: Johns Hopkins University Press, 62–76. Robinson, James A. and Daron Acemoglu (2006). Economic Origins of Dictatorship and Democracy. Cambridge, UK: Cambridge University Press.
Rigby, Elizabeth, and Gerald C. Wright (2013). ‘Political Parties and Representation of the Poor in American States’. American Journal of Political Science, 57(3): 552–565. Rosset, Jan, Nathalie Giger and Julian Bernauer (2013). ‘More Money, Fewer Problems? Cross-Level Effects of Economic Deprivation on Political Representation’. West European Politics, 36(4): 817–35. Schario, Tyler and David Konisky (2008). ‘Public Confidence in Government: Trust and Responsiveness’. Report 9-2008. University of Missouri Institute of Public Policy. Schedler, Andreas. (2006). Electoral Authoritarianism: The Dynamics of Unfree Competition. Boulder, CO: Lynne Rienner. Schedler, Andreas. (2013). The Politics of Uncertainty: Sustaining and Subverting Electoral Authoritarianism. Oxford: Oxford University Press. Schmitt, Hermann, Sara Hobolt and Sebastian Adrian Popa (2015). ‘Does Personalization Increase Turnout? Spitzenkandidaten in the 2014 European Parliament Elections’. European Union Politics. 16(3): 347–368. Schneider, Christina J. (2017). ‘Signals of Responsiveness in the European Union’. Unpublished. University of California, San Diego. Schneider, Christina J. (2018a). The Responsive Union: National Elections and European Governance. Cambridge: Cambridge University Press. Schneider, Christina J. (2018b). ‘People Think that the EU Is Run by Unelected Technocrats. They’re Wrong’. The Washington Post, September 26, 2018. Schumaker, Paul D. and Russel W. Getter (1977). ‘Responsiveness Bias in 51 American Communities’. American Journal of Political Science, 21(2): 247–81. Shapiro, Robert Y. and Lawrence R. Jacobs (2001). ‘Source Material: Presidents and Polling: Politicians, Pandering, and the Study of Democratic Responsiveness’. Presidential Studies Quarterly, 31(1): 150–67. Singh, Shane P. (2013). ‘Not All Election Winners Are Equal: Satisfaction with Democracy and the Nature of the Vote’. European Journal of Political Research, first published online July 5, 2013. 
doi:10.1111/1475-6765.12028.
Solt, Frederick (2008). ‘Economic Inequality and Democratic Political Engagement’. American Journal of Political Science, 52(1): 48–60. Sorace, Miriam (2018). ‘The European Union Democratic Deficit: Substantive Representation in the European Parliament at the Input Stage’. European Union Politics, 19(1): 3–24. Soroka, Stuart N., and Christopher Wlezien, eds (2010). Degrees of Democracy: Politics, Public Opinion, and Policy. New York: Cambridge University Press. Stimson, James A., Michael B. MacKuen, and Robert S. Erikson (1995). ‘Dynamic Representation’. American Political Science Review, 89(3): 543–565. Su, Zheng and Tianguang Meng (2016). ‘Selective Responsiveness: Online Public Demands and Government Responsiveness in Authoritarian China’. Social Science Research, 59: 52–67. Tang, Wenfang (2018). ‘The “Surprise” of Authoritarian Resilience in China’. American Affairs, 2(1): 101–17. Tausanovitch, Chris and Christopher Warshaw (2014). ‘Representation in Municipal
Government’. American Political Science Review, 108(3): 605–41. Wagner, Alexander F., Friedrich Schneider and Martin Halla (2009). ‘The Quality of Institutions and Satisfaction with Democracy in Western Europe – A Panel Analysis’. European Journal of Political Economy, 25(1): 30–41. Wlezien, Christopher and Stuart N. Soroka (2010). ‘Federalism and Public Responsiveness to Policy’. Publius: The Journal of Federalism, 41(1): 31–52. Wratil, Christopher (2015). ‘Democratic Responsiveness in the European Union: The Case of the Council’. LEQS Paper No. 94. Wratil, Christopher (2018). ‘Modes of Government Responsiveness in the European Union: Evidence from Council Negotiation Positions’. European Union Politics, 19(1): 52–74. Zhou, Yingnan J. and Ray Ou-Yang (2017). ‘Explaining High External Efficacy in Authoritarian Countries: A Comparison of China and Taiwan’. Democratization, 24(2): 283–304.
55 Political Performance and State Capacity Edeltraud Roller
Introduction Political performance and state capacity refer to activities of public officials as individual and above all as collective actors. The general concept of political performance, comprising all public officials such as governments, institutions and administrations, was suggested for the first time in the 1970s; the more specific concept of state capacity, focusing on the administrative activities of a political regime, was introduced in the mid 2000s. Political performance and state capacity is not a well-established field of research with a common terminology and well-defined concepts, a set of well-formulated theories and proven empirical findings. Research is divided into several heterogeneous areas, some of which are still in their infancy. In principle, political performance is defined in broad and narrow ways. Broadly defined, it encompasses the activities of public officials (descriptive aspect) as well as the evaluation of these activities and their
outcomes: how well public officials do or how successful they are (evaluative aspect). Hence, it concerns the description of particular activities, such as passing bills and collecting taxes. In addition, it includes the evaluation of whether public officials make public policies (the activity), for example, in an efficient way and without corruption, plus whether they achieve intended goals (outcomes) such as welfare of citizens or liberty. Narrowly understood, political performance is only an evaluative concept, referring to the evaluation of what public officials do and what the outcomes of their actions are. This narrow definition, which dominates in political science, is the one adopted here. Three stages of the study of political performance, as well as a preliminary stage, can be distinguished.
A Short history of the subject The classical political science literature can be interpreted as the preliminary stage in the
debate on political performance. In fact, the idea of evaluating political systems is a core concern in this literature. Criteria of political performance such as liberty and equality dominated the writings of classical political theorists (e.g. John Locke and Jean-Jacques Rousseau) in their search for good types of government. Additionally, performance criteria such as common good and political stability served as central yardsticks for scholars of comparative government, from Aristotle to Karl Loewenstein, in describing and comparing different nondemocratic and democratic regimes. This preliminary stage, however, is characterized by the fact that its protagonists did not make explicit use of the term political performance. Furthermore, they mainly relied on theoretical arguments and unsystematic empirical observations to assess the merits of different regimes.
First Stage The first stage of systematic conceptual and empirical study of political performance started in the 1970s. At this point, two crucial obstacles had been overcome. The first, the availability of cross-national data, became less of an obstacle with the collection and documentation of a wide range of social and political indicators for many independent countries of the world. First editions of several data handbooks appeared during the 1960s (e.g. World Handbook of Political and Social Indicators). The second obstacle, the deliberate avoidance of evaluation by empirically oriented political scientists, was removed when they dared to make explicit and systematic appraisals of concrete political systems based on empirical data. The outset of this phase can be dated to 1971, when Harry Eckstein developed and justified theoretical criteria for evaluating political systems and when Ted Gurr and Muriel McClelland (1971) made a first systematic attempt to operationalize these criteria and applied measures to a sample of
democratic and nondemocratic countries. Eckstein, stressing the fact that measuring political performance is an evaluative task, distinguishes between two types of performance: goal-oriented or substantive performance, which aims at the attainment of particular goals such as welfare or liberty; and procedural performance, independent of particular goals, which helps to promote the attainment of any particular goal. Focusing on procedural performance, Eckstein (1971) identifies four dimensions on which polities must perform well if they are to effectively attain any specific goal: durability, civil order, legitimacy and decisional efficacy. In 1978, Gabriel Almond and G. Bingham Powell developed their concept of political productivity, the most influential and lasting concept of political performance emerging from this phase. The political system itself is seen as the producer of (political) goods that are ‘commonly sought by or expected of political systems … and are widely acknowledged as the legitimate obligation of political systems’ (Almond and Powell, 1978: 394). The authors distinguish between two types of goods: goods that satisfy the needs of the state and goods that satisfy the needs of the citizens. Their concept of political productivity comprises eight political goods: system maintenance, system adaptation, participation, compliance and support, procedural justice (needs of the state; the first two originating in Eckstein’s work) and welfare, security and liberty (needs of the citizens). The list incorporates both of Eckstein’s types of performance; the five needs of the state refer to procedural performance, whereas the three needs of citizens refer to substantive performance. Aside from suggesting this list, the authors empirically study the attainment of some of these goods in several democratic and nondemocratic countries. 
The studies of the first stage of performance research, mainly conducted within systems theory, were interested in finding an external and relatively unbiased evaluation of political systems and presented primarily
descriptive information on the performance of several democratic and nondemocratic regimes. This early work, however, provoked only a few isolated studies.
Second Stage The rise of new institutionalism in the 1980s, with its central premise that different institutional arrangements produce different results (institutions matter), was the starting point for the second stage. Research on political performance advanced significantly and resulted in the development of an independent research area in the 1990s. The focus was no longer on identifying and studying generally accepted performance criteria but on the empirical analysis of a wide range of consequences of political regimes, interpreted as political performance. At the beginning of this second stage, research focused on the performance of democracies. Arend Lijphart’s Patterns of Democracy (1999, [2012]), one of the most important and influential works, is devoted to the performance of majoritarian and consensus democracy. Lijphart asks which type works best and offers a comprehensive empirical analysis, covering 36 established democracies and 32 performance indicators categorized into four areas: macroeconomic management; control of violence; quality of democracy and democratic representation; and what he calls kinder and gentler policy areas (encompassing welfare state, environmental policy, criminal justice and foreign aid). This study stimulated further research on the performance of different types of democracy, such as parliamentary and presidential systems, unitary and federal systems, majoritarian and proportional electoral systems or different types of negotiation democracy (e.g. Armingeon, 2002; Roller, 2005; Gerring and Thacker, 2008). Research comparing democracy and authoritarianism is now accelerating. The revival of scholarly interest in the classic
question whether democracy or autocracy performs better originates in historical developments. Despite the worldwide triumph of democracy in the aftermath of the third wave of democratization, a great number of nondemocratic regimes continue to persist, some of which display great economic success (e.g. China, Singapore). Thus far, a considerable number of empirical studies comparing the performance of democracy and authoritarianism have been presented. These studies, however, mainly focus on two dimensions of policy performance: economic policies and/or social policies (e.g. Przeworski et al., 2000; Baum and Lake, 2003; Ross, 2006). These are the two policy areas for which comparative data is available. In this second phase, the latest studies compare the performance of different types of authoritarian regimes. Confronted with a multitude of different nondemocracies, political scientists identified different types of contemporary authoritarian regimes and, what is more, collected corresponding comparative data starting at the end of the 20th century (e.g. Cheibub et al., 2010; Geddes et al., 2014; Wahman et al., 2013). The most important types are electoral authoritarianism and closed authoritarianism; the latter is further divided into monarchies, military regimes and one-party regimes. As a result, the question of which type of authoritarian regime performs better has been moving to the top of the research agenda. To date, the number of empirical studies is of a manageable size. They focus on economic and/or social policies, but they have also started to examine whether types of authoritarian regimes differ in terms of basic political freedoms (e.g. Gandhi, 2008; McGuire, 2013; Miller, 2015; Stier, 2015). 
Guided by the ‘so what?’ question of comparative government (Lijphart, 2012: 255), the majority of empirical studies which are part of this second stage are not interested in investigating specific types of performance or concrete performance criteria but rather in analysing the general question whether
political regimes make any difference in performance at all. That is why many scholars choose performance indicators without any further theoretical justification (for exceptions see, e.g., Putnam, 1993; Roller, 2005). Very often, the selection of indicators is guided solely by the availability of comparative performance data. As a result, heterogeneous performance indicators and criteria are studied. What is more, even measurements such as expenditure data, which simply describe but do not evaluate the performance of political regimes, are used.
Third Stage While this second stage is still ongoing, a third stage of the study of political performance is on the way. The defining characteristic of this stage is the introduction of state capacity as a specific concept of political performance. It is difficult to identify the beginning of this stage, but it seems reasonable to choose the year 2008, when Hanna Bäck and Axel Hadenius presented a first empirical analysis of ‘Democracy and state capacity’, and Bo Rothstein and Jan Teorell a theoretical contribution entitled ‘What is quality of government? A theory of impartial government institutions’. Their point of origin is the observation that in many cases democratic transitions do not result in improved welfare or human well-being (e.g. poverty, inequality, infant mortality, life expectancy). This raises doubts as to whether the democratic regime is an important performance-enhancing factor. The proponents of the state capacity concept (Rothstein and Teorell use the term quality of government) argue that besides democracy, a state is needed that ‘functions well in an administrative sense’ (Bäck and Hadenius, 2008: 1). The idea that a high quality administration is a necessary condition for welfare is not new. In the mid 1990s, the promotion of ‘good governance’ by international aid and development organizations such as the
World Bank and the United Nations was motivated by similar reasons. However, the concept of good governance and the associated Worldwide Governance Indicators (WGI) (Kaufmann et al., 2009) have been criticized for being too broad and lacking conceptual clarity and precision. The idea behind the state capacity concept is to provide a precise definition of this administrative factor so that it can eventually be operationalized and measured. The introduction of the state capacity concept leads to the study of several new research questions. On the one hand, empirical studies examining the relationship between political regimes and state capacity ask whether political regimes differ in terms of state capacity and what type of political regimes perform better. As in the second stage, studies compare different types of democracy (old and new), different types of authoritarian regimes and democracy with authoritarianism (e.g. Bäck and Hadenius, 2008; Charron and Lapuente, 2010, 2011). On the other hand, research addresses the pivotal question whether the political regime (authoritarianism or democracy), or state capacity, or both determine welfare. At issue is how both factors interact in determining welfare – whether democracy and state capacity are complements or substitutes (e.g. Knutsen, 2013; Hanson, 2015). The identification of state capacity as a determinant of welfare has far-reaching theoretical consequences. State capacity, referring to administration, is conceptualized as a causal factor influencing welfare, covering specific goals such as economic growth and infant mortality. Applying Eckstein’s typology (1971) of political performance to this chain of causation, state capacity as a specific type of procedural performance is postulated as a determinant of various forms of substantive (goal-related) performance. Hence, the third stage of the study of political performance focuses on the relationship between different types of performance. 
More specifically, it seeks to test the basic hypothesis of whether the performance of
the administration (procedural performance) helps to promote the attainment of particular goals (substantive performance). So far, there is only scattered evidence on this central issue; research is just beginning to burgeon.
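The complements-or-substitutes question raised in this stage can be made concrete with an interaction specification. The sketch below is one common way to formalize such a question and is an assumption for illustration, not the model used in the studies cited above. The symbols are hypothetical names: $W_i$ is a welfare outcome for country $i$ (scaled so that higher values are better), $D_i$ a measure of democracy, $S_i$ a measure of state capacity, and $\varepsilon_i$ an error term.

```latex
W_i = \beta_0 + \beta_1 D_i + \beta_2 S_i + \beta_3 \,(D_i \times S_i) + \varepsilon_i ,
\qquad
\frac{\partial W_i}{\partial D_i} = \beta_1 + \beta_3 S_i .
```

Under this reading, $\beta_3 > 0$ would suggest that democracy and state capacity are complements (the marginal contribution of democracy to welfare grows with state capacity), $\beta_3 < 0$ would suggest that they are substitutes, and $\beta_3 = 0$ would suggest that the two factors operate independently of each other.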
Basic concepts Concepts of political performance define the research object, and classify and justify criteria for evaluating political performance; theories of political performance explain political performance. In the following, the general concept of political performance and the specific concept of state capacity are discussed.
Political Performance
Political performance is defined as the evaluation of what public officials (as individual and above all collective actors) do and what the outcomes of their activities are. For an explicit and systematic evaluation of political performance, normative criteria are required; they function as yardsticks against which the activities of public officials and their outcomes can be assessed. For an external and relatively unbiased evaluation of different political regimes, these normative criteria should represent universal values that are commonly expected of political systems. As stated earlier, research on political performance is characterized by the study of heterogeneous performance indicators and criteria that are seldom justified. Because scholars are mainly interested in whether political institutions and political regimes have any effect, they tend to disregard the fact that they are examining different types of performance. With the help of a typology of performance criteria, the most important types of criteria can be identified. The typology, initially developed for the performance of liberal democracy (Roller, 2005) and thereafter adjusted to all political regimes, is based on two dimensions:
• The first dimension, going back to Eckstein (1971), distinguishes between substantive and procedural performance. Substantive performance aims at the attainment of particular goals such as welfare or liberty. Procedural performance helps to promote the attainment of any particular goal: examples are stability or decisional efficacy.
• The second dimension, differentiating between systemic and democratic performance, takes into account different normative expectations that exist either with regard to all political systems or only with regard to democratic regimes. Systemic performance refers to achievements every political system must generate for society, such as economic growth and stability. Democratic performance refers to achievements that are to be ensured by democratic regimes, such as liberty, responsiveness and accountability; they are intrinsic to democracy.
By combining these two dimensions, four types of performance criteria can be established. Both substantive and procedural political performance can be distinguished according to whether they are to be provided by all political regimes or only by democratic regimes. Table 55.1 shows definitions and examples for the four types of performance criteria.

Table 55.1 Typology of performance criteria

| | Substantive performance | Procedural performance |
| --- | --- | --- |
| Systemic performance | Effective realization of goals valid for all political systems (e.g. security, welfare) | Characteristics of all political processes that promote the attainment of particular goals (e.g. efficiency, stability) |
| Democratic performance | Effective realization of democratic values (e.g. liberty, equality, responsiveness) | Characteristics of democratic political processes that promote the attainment of democratic values (e.g. accountability, participation) |

Source: Author.

The typology can be applied to classify the few elaborated concepts of political performance. While one-dimensional concepts rely on a single type of performance criteria, multidimensional concepts combine different types of performance criteria. For example, Harry Eckstein (1971) suggested a one-dimensional concept of procedural performance valid for all political regimes. It includes durability, civil order, legitimacy and decisional efficacy. Edeltraud Roller’s (2005) Normative Model of Political Effectiveness is also a one-dimensional concept aiming at a complete list of substantive, systemic performance criteria. It covers international security, domestic security, wealth, socioeconomic security and socioeconomic equality
(goals of the welfare state) and environmental protection. Larry Diamond and Leonardo Morlino (2005) developed a two-dimensional concept of democratic performance. Their quality of democracy concept covers two procedural (rule of law and accountability) and three substantive criteria (responsiveness, liberty and equality). The typology of performance criteria can also be used to describe the main research activities. Research on the performance of different types of democracy embraces the most comprehensive range of performance criteria. For example, Lijphart’s (1999 [2012]) influential Patterns of Democracy studies all four types of performance criteria. He considers substantive, systemic performance (e.g. inflation rates, unemployment); substantive, democratic performance (e.g. gender inequality index); procedural, systemic performance (e.g. corruption); and finally procedural, democratic performance (e.g. accountability). The majority of his indicators, however, refer to substantive performance, indicating Lijphart’s dominant understanding of performance. Research comparing the performance of democratic and authoritarian regimes as well as performance of different types of authoritarian regimes is much more limited. The most widely used measures are economic growth and child mortality, referring to substantive, systemic performance (e.g. Ross, 2006; Knutsen, 2013; Miller, 2015). There are also many studies on corruption, representing procedural, systemic performance (e.g. Bäck and Hadenius, 2008; Charron
and Lapuente, 2010). Recently, some studies have investigated basic freedoms such as media freedom – that is, substantive, democratic performance (e.g. Miller, 2015; Stier, 2015) – and ecological performance, that is, substantive, systemic performance (e.g. Wurster, 2013). As far as the theoretical justification of performance criteria is concerned, most scholars tend to select their criteria arbitrarily without any further theoretical justification. At most, they just attach importance to their criteria. If theoretical justifications are being made, scholars refer to different theories depending on the type of performance criteria. In the case of democratic criteria, whether substantive (e.g. equality) or procedural (e.g. accountability), normative democratic theory (e.g. Dahl, 1989) is used. In the case of systemic criteria, especially substantive criteria such as security, wealth, socioeconomic security and socioeconomic equality (welfare state), and environmental protection, empirical political theories, especially theories on the development of policies or the expansion of the role of government (e.g. Rose, 1976), are used to justify the policy-performance criteria.
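The two-dimensional typology and its use for classifying concepts can be sketched as a small lookup structure. The following Python fragment is only an illustration: the criteria and their classification are taken from the text above, while the class names, the dictionary CONCEPTS and the helper functions are hypothetical names introduced here, not part of any cited work.

```python
from enum import Enum
from typing import Dict, Set, Tuple


class Content(Enum):
    SUBSTANTIVE = "substantive"  # attainment of particular goals
    PROCEDURAL = "procedural"    # process qualities that help attain goals


class Scope(Enum):
    SYSTEMIC = "systemic"        # expected of all political systems
    DEMOCRATIC = "democratic"    # expected only of democratic regimes


CriterionType = Tuple[Content, Scope]

# The four cells of the typology (cf. Table 55.1).
SUBST_SYS: CriterionType = (Content.SUBSTANTIVE, Scope.SYSTEMIC)
SUBST_DEM: CriterionType = (Content.SUBSTANTIVE, Scope.DEMOCRATIC)
PROC_SYS: CriterionType = (Content.PROCEDURAL, Scope.SYSTEMIC)
PROC_DEM: CriterionType = (Content.PROCEDURAL, Scope.DEMOCRATIC)

# Illustrative encoding of the three concepts discussed in the text.
CONCEPTS: Dict[str, Dict[str, CriterionType]] = {
    "Eckstein (1971)": {
        "durability": PROC_SYS,
        "civil order": PROC_SYS,
        "legitimacy": PROC_SYS,
        "decisional efficacy": PROC_SYS,
    },
    "Roller (2005)": {
        "international security": SUBST_SYS,
        "domestic security": SUBST_SYS,
        "wealth": SUBST_SYS,
        "socioeconomic security": SUBST_SYS,
        "socioeconomic equality": SUBST_SYS,
        "environmental protection": SUBST_SYS,
    },
    "Diamond and Morlino (2005)": {
        "rule of law": PROC_DEM,
        "accountability": PROC_DEM,
        "responsiveness": SUBST_DEM,
        "liberty": SUBST_DEM,
        "equality": SUBST_DEM,
    },
}


def criterion_types(concept: str) -> Set[CriterionType]:
    """Return the distinct types of performance criteria a concept uses."""
    return set(CONCEPTS[concept].values())


def describe(concept: str) -> str:
    """Summarize a concept by the typology cells it occupies."""
    labels = sorted(f"{c.value}, {s.value}" for c, s in criterion_types(concept))
    return f"{concept}: " + "; ".join(labels)


if __name__ == "__main__":
    for name in CONCEPTS:
        print(describe(name))
```

A concept that occupies a single cell corresponds to what the text calls a one-dimensional concept; a concept that occupies several cells combines different types of criteria.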
State Capacity State capacity is not yet a well-established concept. For all five attributes of the concept – designation, definition, criteria, justification and measurement – there are alternative
specifications, even sometimes highly controversial ones. The most important specifications for the first four attributes will be discussed in what follows (for measurement issues, see section on empirical databases). Starting with the designation issue, alternative terms for ‘state capacity’ are ‘quality of government’ and ‘good governance’. Authors treat either all three terms or at least ‘state capacity’ and ‘quality of government’ as synonyms. When authors reject the term ‘good governance’ as a synonym, they refer to the popular World Bank (2019) definition:
Governance consists of traditions and institutions by which authority in a country is exercised. This includes the process by which governments are selected, monitored and replaced; the capacity of the government to effectively formulate and implement sound policies; and the respect of citizens and the state for the institutions that govern economic and social interactions among them.
This definition of governance has been criticized for being too broad and including all aspects of politics (e.g. Rothstein and Teorell, 2008; Agnafors, 2013), and that is why some authors disclaim the term ‘good governance.’ When authors prefer the ‘state capacity’ term over both synonyms, they argue that the administrative dimension that is at stake here is traditionally part of the classic state-ness concept in the sense of Max Weber. State capacity or state-ness is defined both narrowly and broadly. Typically, Michael Mann’s (1993: 59) famous conception of infrastructural power as ‘the institutional capacity of a central state, despotic or not, to penetrate its territories and logistically implement decisions’ is used as the starting point for both types of definition. Subsequently, Bäck and Hadenius (2008: 3), favouring a narrow concept, have contended that state-ness or a functioning state covers at least two dimensions. First is ‘the capacity of state organs to maintain sovereignty over a geographical territory’, that is, controlling the territory with the monopoly of violence. Second is the administrative or implementation capacity of
how well the state organs are able to carry out their tasks. Furthermore, Bäck and Hadenius argue that the implementation capacity occurs only after the effective establishment of sovereignty, and it is (only) this dimension of administrative or implementation capacity that promotes welfare. The administrative or implementation capacity is rooted in the Weberian tradition regarding the existence of a professional and insulated bureaucracy. Scholars favouring broad concepts of state capacity (e.g. Carbone and Memoli, 2015) criticize this one-dimensional, administrative definition of state capacity as too narrow. They suggest a multidimensional, broad conception of state capacity that, in addition, incorporates the above-mentioned dimension of sovereignty (also called political order) as well as the dimension of legitimacy.
There seems to be an obvious explanation for the use of either a narrow or a broad definition of state capacity. Proponents of broad concepts are mainly interested in explaining the overall state capacity of different political regimes. Hence, the use of narrow or broad definitions of state capacity varies with the research questions at hand. When examining state capacity as a determinant of political performance (state capacity as independent variable), scholars are inclined to work with a narrow definition of state capacity focusing on the administrative dimension. When examining determinants of state capacity (state capacity as dependent variable), scholars are inclined to employ a broad definition of state capacity.
Criteria, the third attribute of the state capacity concept, refers to the yardsticks used for evaluating the administration. What is a well-functioning administration, or when do bureaucrats do a good job? Regarding this attribute, there are different, sometimes contradictory ideas. 
Bäck and Hadenius (2008: 3) list different features characterizing a well-functioning administration: a bureaucracy that recruits and promotes persons on professional grounds; a bureaucracy that applies clear rules for decision-making
such as impartiality, openness and accountability; and a bureaucracy that enjoys a high degree of autonomy. In contrast, Rothstein and Teorell (2008: 170) focus on a single normative criterion, namely impartiality. They define impartiality in the exercise of public power as follows: ‘When implementing laws and policies, government officials shall not take into consideration anything about the citizen/case that is not beforehand stipulated in the policy or the law.’ Impartiality is conceptualized as a basic norm on the output side of the political system (referring to the way in which authority is exercised), which complements the basic norm of political equality on the input side of the political system (relating to the access to public authority). The authors assume that impartiality, as a basic norm, will result in increased state capacity. Furthermore, they conceive of impartiality as a universally accepted procedural principle that promotes the substantive criterion of effectiveness. Rothstein and Teorell have suggested the most elaborate concept, but this concept is not without dispute. For example, Francis Fukuyama (2013: 349) argues that an impartial state could still lack the capacity and/or autonomy to deliver welfare in an effective way. Consequently, he suggests capacity and autonomy as criteria for well-functioning bureaucracies. Capacity consists of resources (e.g. extractive capacity measured in terms of tax extraction) and the degree of professionalization of bureaucratic staff.
At first glance, a consensus seems to exist with regard to the fourth attribute of justification. Advocates of the state capacity concept (e.g. Bäck and Hadenius, 2008; Rothstein and Teorell, 2008; Hanson, 2015) share the basic functional premise that state capacity is good for welfare, or in other words, that state capacity is a determinant of substantive performance. In fact, this is an empirical proposition that may or may not be true. 
Hence, it is a hypothesis that has to be put to an empirical test. However, depending on the use of a narrow or broad definition of state capacity, two different hypotheses are at hand. Proponents
of a narrow definition of state capacity claim that (only) a well-functioning administration is necessary to promote welfare (e.g. Bäck and Hadenius, 2008; Rothstein and Teorell, 2008). On the other hand, proponents of a broad definition of state capacity hypothesize that a well-functioning administration is not sufficient; further state dimensions such as political order are required to promote welfare (e.g. Hanson, 2015).
To summarize, there are currently a large number of different concepts of state capacity competing for dominance. Research is in its infancy and it is unclear which concept or which concepts will prevail.
Theories explaining political performance
Research on political performance has been shaped in large part by the new institutionalism, with its central premise that institutions matter. Therefore, theories explaining political performance focus on political institutions as the central determinant. Four theoretical explanations can be distinguished: explanations for the performance of different types of democracy, for the performance of democracy and authoritarianism, and for the performance of different types of authoritarian regimes, as well as theories on the interaction between political regime and state capacity as determinants of welfare. In the following, these theories and approaches, as well as related research, are reviewed.
Different Types of Democracy
Even though systematic research on political performance started with a comparison of different types of democracy, this research was not guided by well-formulated theories. Lijphart’s Patterns of Democracy (1999 [2012]), comparing the performance of majoritarian and consensus democracy, is a good
example. Majoritarian democracy is defined by the principle ‘government by the majority’ and, on the institutional level, by power-concentrating institutions such as the two-party system and unicameral legislature. Consensus democracy is defined by the opposite principle, ‘government by as many people as possible’, and by power-dispersing institutions such as the multi-party system and bicameralism (Lijphart, 2012: 2). Lijphart identifies in total ten institutions (each contrasting between the majoritarian and consensus models) that cluster in two separate dimensions. The first, executives-parties dimension covers five institutions (executive, relationship between executive and legislative, party system, electoral system, interest groups) and the second, federal-unitary dimension covers the same number (federal and unitary government, legislature, constitution, judicial review, central bank). To derive hypotheses on the performance of both types of democracy, Lijphart refers to a so-called conventional wisdom of PR versus plurality and majority elections, which he extends to the broader contrast between consensus and majoritarian democracy. The conventional wisdom assigns to consensus democracy more ‘accurate representation, and, in particular, better minority representation and protection of minority interests, as well as broader participation in decision-making’ and to majoritarian democracy more decisive and hence more effective policy-making (Lijphart, 2012: 255). Lijphart seems to be content with the conventional wisdom as a theoretical basis for his comparison. Even after his empirical analyses confirm the conventional wisdom only for the executives-parties dimension – that is, only as far as the executives-parties dimension is concerned do consensus democracies outperform majoritarian democracies – he is not interested in further theorizing the findings. 
With the help of veto player theory (Tsebelis, 1995), some authors tried to give the executives-parties and the federal-unitary
dimensions a more profound meaning (e.g. Birchfield and Crepaz, 1998; Roller, 2005). Veto players are actors whose consent is necessary before a policy can be changed. Veto player theory distinguishes between institutional veto players (set by the constitution) and partisan veto players (set by the party system and governing coalition). In a first step, the authors reinterpret the federal-unitary and the executives-parties dimensions as measures of institutional and partisan veto players. The central feature of institutional veto players is that different political actors operate through separate institutions with mutual veto powers (e.g. federalism, strong bicameralism, presidentialism), while partisan veto players emerge from institutions where different political actors operate in the same body and whose members interact with each other on a face-to-face basis (e.g. multi-party coalition government, multi-party legislatures). In the next step, the authors describe the different veto players’ consequences for the decision-making process. With an increasing number of institutional veto players, there will be blockage of the executive, whereas in the case of increasing partisan veto players there is a structural need for the government to negotiate and find consensus. To characterize the different effects of the two kinds of veto players, Vicki Birchfield and Markus M. L. Crepaz (1998) suggested the terms ‘competitive veto points’ for the institutional and ‘collective veto points’ for the partisan veto players.
The study by John Gerring et al. (2009) comparing parliamentary and presidential systems is another example of the lack of well-formulated theories on the performance of different types of democracy. In a parliamentary system, the executive is chosen by and responsible to an elective body (legislature), building a single locus of sovereignty at the national level. 
In the presidential system, the policy-making power is divided between two separately elected bodies (legislature, president). Gerring et al.’s (2009: 336) literature review on the performance of both types
of democracy concludes that ‘a genuinely coherent, operational, and plausible theory of how democratic institutions work’ does not exist. Consequently, the authors reverse the conventional sequence of scientific inquiry. They do the testing first and theorize afterwards. Their empirical analysis demonstrates that parliamentary systems are associated with higher levels of performance in policy areas of economic and human development. At the end of their analysis, they explain this pattern through the capacity of parliamentarism to function as a coordination device. Parliamentarism offers better tools for resolving coordination problems ‘because parliamentarism integrates a diversity of views while providing greater incentives for actors to reach agreement’ (Gerring et al., 2009: 354). To conclude, types of democracy vary with respect to two dimensions (Roller, 2005: 99): distribution of power (constraining or dispersing) and the type of veto player (constitutional or partisan). Theoretical explanations for the performance of different types of democracy focus on these two dimensions and ask whether they promote negotiations and the finding of consensus. They claim that democratic institutional arrangements promoting negotiations and the finding of consensus show higher levels of performance.
Democracy and Authoritarianism
Research comparing the performance of democracy and authoritarianism is also not guided by well-formulated theories. Typically, scholars list several causal mechanisms to explain the superiority of democracy. The designation, type and number of causal mechanisms vary; they depend partly on the performance indicator at hand. Two instructive examples are presented. Michael Ross (2006) examines the welfare of the poor in democratic and authoritarian regimes. He suggests three mechanisms accounting for a better performance of
democracy: (a) competitive elections in democracies allow the poor to penalize governments; (b) freedom of the press in democracies transmits information from the poor to the central government; (c) democratic governments produce more public goods and more income redistribution to meet the demands of a larger group of supporters. The latter mechanism is based on the classic Meltzer–Richard model of redistribution (1981) arguing that in democracies the decisive median voter earns a below average income and prefers more economic redistribution.
In their study on democracy and human development, Gerring et al. (2012: 2) discuss ‘four of the numerous possible pathways’ linking democracy with human development. In particular, they are interested in a possible time-dependent nature of the relationship. The pathways are: (a) competitive elections produce accountability; (b) institutions of democracy such as political rights tend to foster a well-developed civil society, which is instrumental in providing services for the poor; (c) democracy may serve to inaugurate a culture of equality that empowers oppressed groups; (d) older democracies will benefit from greater institutionalization in the political sphere.
The postulated causal mechanisms follow a simple logic. Focusing on various democratic institutions (competitive elections, political rights, freedom of the press, and so on) that do not exist in autocracy, they describe effects of these institutions that are supposed to promote political performance. Remarkably, the authors do not explain the selection of the causal mechanisms; quite obviously, there are sometimes more mechanisms at work than those discussed (see the above-mentioned quote from Gerring et al., 2012: 2). What is more, they are not interested in exploring the question of which causal mechanisms or democratic institutions can account for the better performance of democracy. 
Hence, the main function of these lists is to generate a number of plausible arguments (the more the better) that speak
in favour of democracy. The theoretical basis of research comparing the performance of democracy and authoritarianism tends to be of an eclectic character.
Different Types of Authoritarian Regimes
Research on the performance of different types of authoritarian regimes is still in its infancy. Available studies work with two different typologies. They compare either electoral with closed authoritarianism (e.g. Miller, 2015) or several subtypes of authoritarian regimes, namely one-party authoritarian regimes, limited multi-party authoritarian regimes, military regimes and monarchies (e.g. McGuire, 2013). Depending on the typology used, two theoretical approaches can be identified. Electoral authoritarianism, in contrast to closed authoritarianism, is characterized by multi-party competition in legislative elections, but these elections are not of a democratic type – they are unfree and unfair. Michael K. Miller (2015) suggests that electoral authoritarianism displays better welfare than closed authoritarianism. He justifies his hypothesis by referring to three causal mechanisms. The first two are commonly associated with democracy and provide channels for popular pressure: (a) electoral pressure (although authoritarian elections are contested on an uneven playing field, they involve elements of uncertainty and the risk of electoral turnover); (b) civil liberties and political openness (civil liberties such as the rights to speech and association, as well as the openness of the political space to rival groups, allow for direct popular engagement of citizens); (c) governmental capacity (electoral authoritarianism implements similar institutions as democracies). Hence, Miller takes causal mechanisms from the contrast between democracy and autocracy. He argues that ‘autocratic elections can be sufficiently competitive to contribute to
popular pressure and governmental capacity’ (Miller, 2015: 1528). While hypotheses on the performance of electoral and closed authoritarianism are explained by effects of democratic institutions, the situation is more complex in the case of the second typology of authoritarian regimes, distinguishing between one-party and limited multi-party authoritarian regimes, military regimes and monarchies. In this case, the types of authoritarian regimes are defined mainly by different groups that establish the authoritarian regime, and, in addition, by the degree of party competition. James W. McGuire (2013) uses the selectorate theory (Bueno de Mesquita et al., 2003) to deduce hypotheses on the social performance of subtypes of authoritarian regimes. The selectorate theory argues that the performance of political regimes can be explained with two concepts: the size of the selectorate and the size of the winning coalition. The selectorate (S) is the set of people who have a legitimate say in the selection of the leader. The winning coalition (W) is the subset of the selectorate whose support is necessary for the leader to stay in power. The size of W determines whether the leaders distribute private goods (e.g. business or export licenses) or public goods (e.g. health care, education, and infrastructure). The larger the W, the more expensive it is for the leader to stay in power by providing private goods to his supporters rather than public goods (for a similar argument on the ratio W/S, measuring the loyalty of the members of the winning coalition to the incumbent leader, see Clark et al., 2018: 391). In order to deduce hypotheses on the performance of the subtypes of authoritarian regimes, information on the size of each regime’s winning coalition is required (and in the case of W/S also information on the size of the selectorate). Bueno de Mesquita et al. 
(2003) provide such information, and consequently McGuire (2013: 59) suggests that after democracy, the limited multi-party regimes should have the highest level of social performance, followed by one-party
authoritarian regimes, monarchies and military regimes, in that order. The selectorate theory is particularly qualified for deducing hypotheses on the performance of different political regimes. On the one hand, the majority of performance criteria studied are conceptualized as collective goods. On the other, the elegant and parsimonious character of the theory allows a ranking of multiple political regimes along one dimension (W or W/S) supposed to be decisive for the political performance of the regimes. Although McGuire’s (2013) findings on social performance can only partly confirm the hypothesized ranking of political regimes, further research will assess the merits of selectorate theory.
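As a purely illustrative sketch of how the two selectorate-theory quantities can order regimes along one dimension, the following Python snippet ranks three hypothetical regimes by W and by W/S. The regime labels, the sizes of S and W, and the helper names (`Regime`, `rank_by`) are assumptions made for illustration only; they are not taken from Bueno de Mesquita et al. (2003) or McGuire (2013).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    """A hypothetical regime described by its selectorate and winning coalition."""
    name: str
    selectorate: int        # S: people with a say in selecting the leader
    winning_coalition: int  # W: subset of S whose support keeps the leader in power

    def __post_init__(self):
        # W is a non-empty subset of S, so it can never exceed S.
        if not 0 < self.winning_coalition <= self.selectorate:
            raise ValueError("W must satisfy 0 < W <= S")

    @property
    def w_over_s(self) -> float:
        """Ratio W/S of winning coalition size to selectorate size."""
        return self.winning_coalition / self.selectorate


def rank_by(regimes, key):
    """Return regime names ordered from the largest to the smallest key value."""
    return [r.name for r in sorted(regimes, key=key, reverse=True)]


# Invented sizes, chosen only so that the two rankings differ.
regimes = [
    Regime("A", selectorate=1000, winning_coalition=500),
    Regime("B", selectorate=1000, winning_coalition=50),
    Regime("C", selectorate=100, winning_coalition=10),
]

by_w = rank_by(regimes, key=lambda r: r.winning_coalition)
by_ratio = rank_by(regimes, key=lambda r: r.w_over_s)
```

With these invented numbers the two orderings disagree on regimes B and C, which suggests why a study needs to state whether W or W/S is treated as the decisive dimension.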
Interaction between Political Regime and State Capacity
The political regime and state capacity are supposed to be central determinants of welfare, but the nature of the interaction between both factors is unclear. Research on this question is also still in its infancy. Jonathan K. Hanson (2015) tried to sort out two alternatives – complements or substitutes – and suggested an interesting hypothesis. According to Hanson (2015: 305), a complementary relationship exists when democracy creates ‘the incentives to expend greater resources on public services’ and state capacity provides ‘the means to deliver these services effectively’. Democracy and state capacity would substitute for each other when they ‘work independently and through different mechanisms to create the incentives and means to deliver public services’. Hanson hypothesizes a partial substitution. In a first step, he explains why democracy creates incentives to expend greater resources on public services. Here he refers to the well-known causal mechanism of accountability, namely the accountability of policymakers to citizens. In a similar way, he explains why state capacity provides the means to deliver
these services effectively. In this case, the key causal mechanism is also accountability, but it refers to the relationship between policymakers and public service providers. Services are delivered effectively when policymakers hold public service providers accountable for delivery. In the next step, Hanson (2015) formulates the decisive argument for his partial substitution thesis. He argues that high state capacity provides alternative mechanisms to accomplish some functions assigned to democracy. High state capacity can improve the flow of information regarding public needs (information can come from a well-organized data collection), facilitate the design of better policies and help set the policy agenda. Hanson’s theory is one example of conceptualizing the interaction between political regime and state capacity in explaining welfare (or substantive performance). Further theoretical explanations are on the way. Nevertheless, his theory is instructive in that it shows what such an explanation might look like. First, he identifies causal mechanisms that account for the advantages of democracy as well as causal mechanisms that account for the advantages of high state capacity. Second, he studies the relationship (substitutive or complementary) between both of them.
Empirical databases
In order to evaluate the performance of political regimes, comparative data on substantive and procedural performance over a long time span are required. The measurement of political performance raises several issues, which are discussed below.
Substantive and Procedural Measures
Substantive performance aims at the attainment of particular goals such as welfare.
Policy analysis distinguishes between two types of indicators gauging activities of public officials: output measures, referring to actions or efforts to reach goals, and outcome measures, referring to the actual results of these actions. In the case of redistribution, for example, social expenditures measure the effort (output) and poverty rates the results of this effort (outcomes). Inasmuch as political performance is an evaluative concept aiming at assessing whether the intended goals are achieved, outcome measures should be used. They are the real test as to whether the outputs have produced the intended results or not. A common argument raised against the use of outcome measures claims that usually public officials cannot directly control outcomes. Consequently, these scholars doubt whether one can speak of political performance here at all. To avoid this problem, they consciously decide to use outputs rather than outcomes as measures of substantive political performance. For example, Putnam’s Index of Institutional Performance (Putnam, 1993) primarily employs outputs such as the number of day care centres and health expenditures rather than outcomes such as mortality rates. This is convincing only if outputs can serve as valid proxies for outcomes, but this is a much-disputed assumption. An argument for the use of outcome measures can be derived from Almond and Powell’s (1978: 394) dictum that substantive performance refers to goals that are commonly expected of political systems. If there is empirical evidence that the goals studied guide the actions of politicians and politicians assume responsibility for achieving these goals, scholars can deliberately rely on outcomes. For example, Roller’s Index of Effectiveness of Democracies (Roller, 2005), which is a pure measure of outcomes covering indicators such as poverty, infant mortality and unemployment rate, satisfies these conditions. 
In the case of procedural performance of the administration, or state capacity, the question of valid indicators is also pressing.
Procedural performance, narrowly defined, refers to how well the administration can carry out its tasks. The most frequently used indicator is the ‘bureaucracy quality’ rating of the International Country Risk Guide (ICRG). It is based on expert ratings of whether ‘the bureaucracy has the strength and expertise to govern without drastic changes in policy or interruptions in government services’ and ‘whether the bureaucracy tends to be somewhat autonomous from political pressure and to have an established mechanism for recruitment and training’ (PRS, 2019). This measure unambiguously focuses on an evaluation of the administrative capacity. However, some scholars instead use outcomes of public policy such as infant mortality rates. There are several arguments against using outcomes as measures for state capacity. The most persuasive counterargument refers to the fact that state quality itself has been introduced to explain just these outcomes. Thus, state quality and substantive performance would coincide (Fukuyama, 2013: 356). Scholars working with a broad definition of state capacity consider measures for the administrative capacity and additionally measures for the extractive capacity and/or coercive capacity of the state (for an overview see Hanson and Sigman, 2013). Typically, the extractive capacity is measured with tax revenues as a proportion of GDP (e.g. Fortin, 2012). The coercive capacity is gauged with an expert rating from the Bertelsmann Transformation Index (BTI) considering the degree to which the state has a monopoly on the use of force (e.g. Carbone and Memoli, 2015).
Factual and Evaluative Measures
In principle, performance measures are based either on facts (e.g. poverty rate, infant mortality) or on evaluations of experts (e.g. ICRG’s bureaucratic quality rating, Freedom House’s (2019) rating of freedom
of the press). Generally, the type of measure varies with the performance dimension at hand. Typically, substantive, systemic performance such as welfare is measured by facts, whereas substantive, democratic performance such as freedom of the press, as well as procedural performance of the administration, is measured by expert ratings. However, there are exceptions to the rule. Sometimes, scholars are particularly interested in the evaluation of substantive performance by the citizens themselves. They use citizens’ life satisfaction or citizens’ satisfaction with the way democracy works as measures. Hence, sometimes it is a matter of theoretical decision whether factual or evaluative measures are used. Very often, the validity of expert ratings is put into question and expert ratings are dismissed as merely subjective. However, such critics fail to recognize that in certain cases the perception or evaluation of a phenomenon is the only way to measure it. The state-of-the-art method to address the problem of reliability is to ensure a sufficient number of experts per country in the data collection process (e.g. Coppedge et al., 2018) and/or to check robustness with alternative indicators in the process of data analysis.
Single and Composite Measures
Sometimes, studies on political performance are interested not only in the assessment of single specific criteria (e.g. infant mortality) but also in the overall level of regime performance. The most efficient method to gauge the overall level of performance is to construct a composite measure aggregating several performance measures. A widely used composite measure of substantive performance is the Human Development Index (HDI) of the United Nations Development Programme (UNDP, 2019). It subsumes three dimensions of welfare – a long and healthy life, knowledge and a decent standard of living – and relies on four indicators (life
expectancy at birth, expected years of schooling, mean years of schooling and gross national income per capita). It assesses the average achievements in a country on these dimensions of welfare. The Worldwide Governance Indicators (WGI) of the World Bank (2019) cover the most widely used composite measures focusing on procedural performance. They encompass measures for six dimensions – voice and accountability; political stability and the absence of violence; government effectiveness; regulatory quality; rule of law; control of corruption – and an overall composite measure comprising these six dimensions. The WGI, based on several hundred individual variables from multiple sources, measures perceptions of governance by experts and ordinary citizens (e.g. law and order, citizens’ satisfaction with education system). Very often, scholars use the dimension ‘government effectiveness’ as an indicator for administrative capacity. This dimension is defined as follows: ‘Government effectiveness captures perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulations and implementation, and the credibility of the government’s commitment to such policies’ (World Bank, 2019). Hence, government effectiveness does not focus on administrative capacity, and includes more components than the ICRG bureaucratic quality index (thus the ICRG bureaucratic quality index constitutes only one of multiple World Governance Indicators measuring government effectiveness). Constructing composite measures implies decisions about the techniques of standardization, weighting and aggregation of individual values. Different statistical procedures of standardization (e.g. z-score transformation and indexing), weighting (e.g. equal or unequal) and aggregation (e.g. arithmetic mean, multiplication and unobserved-components model) are used, and their relative merits are discussed intensively.
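The three construction steps named above (standardization, weighting, aggregation) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure of the HDI or the WGI: it uses a z-score transformation, equal weights by default and a weighted arithmetic mean, and the function names (`z_scores`, `composite_index`), the indicator names and all data values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one indicator across countries (z-score transformation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("indicator has no variation across countries")
    return [(v - mu) / sigma for v in values]


def composite_index(indicators, higher_is_better, weights=None):
    """Aggregate standardized indicators into one score per country.

    indicators: dict mapping indicator name -> list of values, one per country,
                with countries in the same order for every indicator.
    higher_is_better: dict mapping indicator name -> bool; indicators where a
                lower value means better performance are sign-flipped.
    weights: optional dict of weights; equal weights are assumed by default.
    """
    names = list(indicators)
    n_countries = len(indicators[names[0]])
    if any(len(indicators[n]) != n_countries for n in names):
        raise ValueError("all indicators must cover the same countries")
    if weights is None:
        weights = {n: 1 / len(names) for n in names}

    scores = [0.0] * n_countries
    for n in names:
        sign = 1.0 if higher_is_better[n] else -1.0
        for i, z in enumerate(z_scores(indicators[n])):
            scores[i] += weights[n] * sign * z  # weighted arithmetic mean
    return scores


# Invented values for three hypothetical countries.
data = {
    "life_expectancy": [60, 70, 80],
    "infant_mortality": [50, 30, 10],
}
direction = {"life_expectancy": True, "infant_mortality": False}
scores = composite_index(data, direction)
```

Swapping the z-score for another standardization, or the mean for a multiplicative rule, changes only one function each, which is one way to probe how sensitive country rankings are to these construction choices.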
Availability of Data
The feasibility of research on political performance depends on the availability of data. Empirical studies comparing substantive performance of democracy and authoritarianism, as well as the performance of different types of authoritarian regimes, suffer from incomplete factual data. The willingness of governments to disseminate policy-relevant data to international organizations (e.g. United Nations, World Bank) is considerably lower for authoritarian than for democratic countries (Ross, 2006; Hollyer et al., 2011). These differences in ‘policy transparency’ (Hollyer et al., 2011: 1192) lead to selection bias; authoritarian regimes are more likely to be missing in samples of countries than democratic ones. Furthermore, there are hints that missing data are not randomly distributed within the group of authoritarian regimes. For example, authoritarian regimes with better performance regarding incomes and child mortality tend to have more missing data. They may have fewer incentives to report their data to international organizations (Ross, 2006: 864). Given the high probability of selection bias within the group of authoritarian regimes, scholars are well advised to check, if possible, whether their sample of countries is biased (Roller, 2013). In order to correct for these flaws, fixed effects in time-series cross-sectional analysis and multiple imputation to replace missing data have been suggested.
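A first, simple check for the kind of sample bias described above is to compare the share of missing observations across regime types. The following Python sketch assumes a hypothetical data layout (one dictionary per country-year with the keys `regime` and `infant_mortality`) and invented records; it is a descriptive diagnostic only and does not replace the fixed-effects or multiple-imputation corrections mentioned in the text.

```python
from collections import defaultdict


def missing_share_by_group(rows, group_key, value_key):
    """Return the share of rows with a missing value, separately per group.

    A value counts as missing when the key is absent or set to None.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row.get(value_key) is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Invented country-year records; None marks an unreported value.
rows = [
    {"regime": "democratic", "infant_mortality": 4.1},
    {"regime": "democratic", "infant_mortality": 6.3},
    {"regime": "democratic", "infant_mortality": None},
    {"regime": "democratic", "infant_mortality": 9.8},
    {"regime": "authoritarian", "infant_mortality": None},
    {"regime": "authoritarian", "infant_mortality": 31.0},
    {"regime": "authoritarian", "infant_mortality": None},
    {"regime": "authoritarian", "infant_mortality": 22.5},
]

shares = missing_share_by_group(rows, "regime", "infant_mortality")
```

A markedly higher share in one group, as in this invented example, would be a warning sign that results from complete cases alone may not generalize to that group.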
Major advances, ongoing debates, critical assessments
Major Advances
Research on political performance has seen three major advances in the past decade. The first is an expansion from the comparison between types of democracy to the comparison of democratic with authoritarian regimes, as well as between different authoritarian regimes. Research on the performance of authoritarian regimes is now a significant part of the emerging field of authoritarianism, which has gained momentum since the millennium. Since political competition between democratic and (some) authoritarian countries is increasing, research on the performance of democracy and authoritarianism is of political importance too. It provides empirical evidence as to whether authoritarianism is superior or at least equal to democracy on any performance measure. So far, most of the available evidence is in favour of democracy. As for the performance of different types of authoritarian regimes, there is no clear pattern yet. However, it must be noted that the comparison between democracy and authoritarianism, as well as between different authoritarian regimes, involves a notable narrowing of the performance measures. Infant mortality is the most commonly used indicator, interpreted as a measure either of health (e.g. Miller, 2015) or of poverty (e.g. Ross, 2006). On the one hand, this limitation is a result of the lack of available data for nondemocratic regimes; they produce and publish less policy-relevant data. On the other hand, this limitation is a result of the lack of comparable data for poor countries that are overrepresented among nondemocratic regimes. Since the living conditions of the poor countries differ greatly from those of rich countries, many performance measures that work well for rich countries either do not exist or are not valid in poor countries (e.g. household income in a subsistence economy).
The identification of state capacity as a determinant of welfare is a second major advance of performance research. There is evidence to suggest that state capacity and political regime each exert an independent effect on welfare.
The inclusion of state capacity helps to explain the poor performance of some democratic regimes and the good performance of some authoritarian regimes. In theoretical terms, state capacity as a specific type of procedural performance is introduced as a causal factor for various forms of substantive (goal-related) performance. The third major advance refers to the empirical analysis. Scholars are increasingly using elaborate statistical methods for analysing the time-series cross-sectional data. For example, they control for global trends in welfare, they use fixed-effects models to control for country-specific effects, and they address the endogeneity problem with the help of the method of instrumental variables (e.g. Ross, 2006; Miller, 2015; Wang et al., 2019).
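The logic of controlling for country-specific effects can be sketched with the within transformation, in which each country's own mean is subtracted from its observations before the slope is estimated. The example below is a simplified illustration under stated assumptions: the panel is hypothetical and constructed so that infant mortality equals a country-specific constant minus twice the democracy score, and the sketch covers neither global time trends nor instrumental variables.

```python
from collections import defaultdict

# Hypothetical panel: (country, year, democracy score, infant mortality).
# Values are constructed so that mortality = country effect - 2 * democracy.
panel = [
    ("A", 2000, 1.0, 48.0), ("A", 2005, 2.0, 46.0), ("A", 2010, 4.0, 42.0),
    ("B", 2000, 6.0, 18.0), ("B", 2005, 7.0, 16.0), ("B", 2010, 9.0, 12.0),
]


def demean_by_country(rows):
    """Within transformation: subtract each country's own mean from x and y."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for country, _year, x, y in rows:
        acc = sums[country]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return [
        (x - sums[c][0] / sums[c][2], y - sums[c][1] / sums[c][2])
        for c, _year, x, y in rows
    ]


def demean_overall(rows):
    """Pooled alternative: subtract the grand mean, ignoring country effects."""
    n = len(rows)
    mean_x = sum(r[2] for r in rows) / n
    mean_y = sum(r[3] for r in rows) / n
    return [(r[2] - mean_x, r[3] - mean_y) for r in rows]


def slope(pairs):
    """OLS slope through the origin for already-demeaned (x, y) pairs."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


beta_fe = slope(demean_by_country(panel))    # within (fixed-effects) estimate
beta_pooled = slope(demean_overall(panel))   # pooled estimate for comparison
```

In this constructed example the within estimate recovers the slope built into the data, whereas the pooled estimate also absorbs the difference between the two countries' baseline levels and is therefore more strongly negative.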
Ongoing Debates
That political regimes matter is the basic premise of research on political performance. A follow-up question is which aspect of the political regime matters for political performance. One debate addresses the question of whether it is the current regime type or the regime type history. Studies on the performance of democracy and authoritarianism test whether it is the current level of democracy or democratic experience (stock) that matters (e.g. Gerring et al., 2012; McGuire, 2013; Miller, 2015). So far, the available empirical findings indicate that both factors matter, and that democratic experience seems to be the more important. A second recent debate refers directly to various components of democracy. This debate is made possible by the new Varieties of Democracy measure (Coppedge et al., 2018). Whereas former measures of democracy (e.g. Polity IV, Freedom House) are highly aggregated, Varieties of Democracy allows for differentiating between multiple institutional components of democracy (e.g. electoral, liberal, participatory, deliberative and egalitarian democracy). Based on this new measure, it is possible to examine the question of which components of democracy are relevant for performance.
A first empirical analysis suggests that electoral competition reveals a stronger relationship with infant mortality than citizen empowerment; the latter comprises individual liberty, political equality, female empowerment, civil society and deliberation (Gerring et al., 2015). With the help of this new democracy measure it will be possible to test some of the postulated causal mechanisms that have been suggested to explain the superior performance of democracy (see section on theories explaining political performance).
Critical Assessment
The dominance of new institutionalism in performance research leads to a bias in favour of political regimes and their effect on political performance, that is, the independent variable. Scholars pay little attention to the dependent variable of performance. Performance measures are simply used as a test case. Consequently, different performance dimensions and criteria (cf. Table 55.1) seem to be almost irrelevant for theorizing on the effects of political regimes. This is a major drawback of existing theories explaining performance. Another neglected problem refers to the quality of performance data for authoritarian regimes. Empirical studies document low policy transparency for authoritarian regimes, that is, a low willingness of governments to disseminate policy-relevant data to international organizations (Hollyer et al., 2011). Moreover, it is very plausible to assume that low policy transparency also implies that the available factual performance data of authoritarian regimes are less reliable than those provided by democratic regimes. It is striking that scholars are not interested in discussing this fundamental issue – they seem to rely uncritically on the data collected by international agencies. More information and empirical studies about the quality of performance data for authoritarian regimes are urgently needed.
Perspectives
Although research on political performance and state capacity has made substantial progress in the last decade, its achievements are far from being solid, comprehensive and cumulative. In order to become a well-established field of research, it might be helpful to try to integrate the present concepts, theories and findings, to check the quality of the available performance data and to make use of new data on political regimes.
References
Agnafors, M. (2013). Quality of government: Toward a more complex definition. American Political Science Review, 107(3), 433–45.
Almond, G. A., & Powell, G. B. (1978). Comparative politics: System, process, and policy (2nd ed.). Boston: Little, Brown.
Armingeon, K. (2002). The effects of negotiation democracy: A comparative analysis. European Journal of Political Research, 41(1), 81–105.
Bäck, H., & Hadenius, A. (2008). Democracy and state capacity: Exploring a J-shaped relationship. Governance, 21(1), 1–24.
Baum, M. A., & Lake, D. A. (2003). The political economy of growth: Democracy and human capital. American Journal of Political Science, 47(2), 333–47.
Birchfield, V., & Crepaz, M. M. L. (1998). The impact of constitutional structures and collective and competitive veto points on income equality in industrialized democracies. European Journal of Political Research, 34(2), 175–200.
Bueno de Mesquita, B., Smith, A., Siverson, R. M., & Morrow, J. D. (2003). The logic of political survival. Cambridge, MA: The MIT Press.
Carbone, G., & Memoli, V. (2015). Does democratization foster state consolidation? Democratic rule, political order, and administrative capacity. Governance, 28(1), 5–24.
Charron, N., & Lapuente, V. (2010). Does democracy produce quality of government? European Journal of Political Research, 49(4), 443–70.
Charron, N., & Lapuente, V. (2011). Which dictators produce quality of government? Studies in Comparative International Development, 46(4), 397–423.
Cheibub, J. A., Gandhi, J., & Vreeland, J. R. (2010). Democracy and dictatorship revisited. Public Choice, 143(1), 67–101.
Clark, W. R., Golder, M., & Golder, S. N. (2018). Principles of comparative politics (3rd ed.). Thousand Oaks, CA: Sage.
Coppedge, M., Gerring, J., Knutsen, C. H., Lindberg, S. I., Skaaning, S.-E., Teorell, J., Krusell, J., Marquardt, K. L., Medzihorsky, J., Pemstein, D., Pernes, J., Stepanova, N., Tzelgov, E., Wang, Y., & Wilson, S. (2018). V-Dem Methodology Version 8. Varieties of Democracy (V-Dem) Project. Available at: www.v-dem.net (accessed April 6, 2019).
Dahl, R. A. (1989). Democracy and its critics. New Haven, CT: Yale University Press.
Diamond, L., & Morlino, L. (eds). (2005). Assessing the quality of democracy. Baltimore, MD: The Johns Hopkins University Press.
Eckstein, H. (1971). The evaluation of political performance: Problems and dimensions. Beverly Hills, CA: Sage.
Fortin, J. (2012). Is there a necessary condition for democracy? The role of state capacity in postcommunist countries. Comparative Political Studies, 45(7), 903–30.
Freedom House (2019). Freedom in the world. Available at: https://freedomhouse.org/ (accessed April 12, 2019).
Fukuyama, F. (2013). What is governance? Governance, 26(3), 347–68.
Gandhi, J. (2008). Political institutions under dictatorship. Cambridge: Cambridge University Press.
Geddes, B., Wright, J., & Frantz, E. (2014). Autocratic breakdown and regime transitions: A new data set. Perspectives on Politics, 12(2), 313–31.
Gerring, J., & Thacker, S. C. (2008). A centripetal theory of democratic governance. Cambridge: Cambridge University Press.
Gerring, J., Thacker, S. C., & Alfaro, R. (2012). Democracy and human development. The Journal of Politics, 74(1), 1–17.
Gerring, J., Thacker, S. C., & Moreno, C. (2009). Are parliamentary systems better? Comparative Political Studies, 42(3), 327–59.
Gerring, J., Knutsen, C. H., Skaaning, S.-E., Teorell, J., Coppedge, M., Lindberg, S. I., & Maguire, M. (2015). Electoral democracy and human development. Working Paper Series 2015: 9 New Version. University of Gothenburg. Available at: www.v-dem.net (accessed April 6, 2019).
Gurr, T. R., & McClelland, M. (1971). Political performance: A twelve-nation study. Beverly Hills, CA: Sage.
Hanson, J. K. (2015). Democracy and state capacity: Complements or substitutes? Studies in Comparative International Development, 50(3), 304–30.
Hanson, J. K., & Sigman, R. (2013). Leviathan’s latent dimensions: Measuring state capacity for comparative political research. Available at: www-personal.umich.edu/~jkhanson/resources/hanson_sigman13.pdf (accessed April 6, 2019).
Hollyer, J. R., Rosendorff, B. P., & Vreeland, J. R. (2011). Democracy and transparency. The Journal of Politics, 73(4), 1191–1205.
Kaufmann, D., Kraay, A., & Mastruzzi, M. (2009). Governance matters VIII: Aggregate and individual governance indicators 1996–2008. Policy Research Paper, The World Bank Development Research Group.
Knutsen, C. H. (2013). Democracy, state capacity, and economic growth. World Development, 43, 1–18.
Lijphart, A. (1999 [2012]). Patterns of democracy: Government forms and performance in thirty-six countries. New Haven, CT: Yale University Press.
Mann, M. (1993). The sources of social power. Vol. 2: The rise of classes and nation states, 1760–1914. Cambridge: Cambridge University Press.
McGuire, J. W. (2013). Political regime and social performance. Contemporary Politics, 19(1), 55–75.
Meltzer, A. H., & Richard, S. (1981). A rational theory of the size of government. Journal of Political Economy, 89(5), 914–27.
Miller, M. K. (2015). Electoral authoritarianism and human development. Comparative Political Studies, 48(12), 1526–62.
PRS (2019). ICRG Methodology. Available at: www.prsgroup.com (accessed March 16, 2019).
Przeworski, A., Alvarez, M. E., Cheibub, J. A., & Limongi, F. (2000). Democracy and development: Political institutions and well-being in the world, 1950–1990. Cambridge: Cambridge University Press.
Putnam, R. D. (1993). Making democracy work: Civic traditions in modern Italy. Princeton, NJ: Princeton University Press.
Roller, E. (2005). The performance of democracies: Political institutions and public policy. Oxford: Oxford University Press.
Roller, E. (2013). Comparing the performance of autocracies: Issues in measuring types of autocratic regimes and performance. Contemporary Politics, 19(1), 35–54.
Rose, R. (1976). On the priorities of government: A developmental analysis of public policies. European Journal of Political Research, 4(3), 247–89.
Ross, M. (2006). Is democracy good for the poor? American Journal of Political Science, 50(4), 860–74.
Rothstein, B., & Teorell, J. (2008). What is quality of government? A theory of impartial government institutions. Governance, 21(2), 165–90.
Stier, S. (2015). Democracy, autocracy and the news: The impact of regime type on media freedom. Democratization, 22(7), 1273–95.
Tsebelis, G. (1995). Decision making in political systems: Veto players in presidentialism, parliamentarism, multicameralism, and multipartyism. British Journal of Political Science, 25(3), 289–325.
UNDP – United Nations Development Programme (2019). Human Development Reports. Available at: hdr.undp.org (accessed April 6, 2019).
Wahman, M., Teorell, J., & Hadenius, A. (2013). Authoritarian regime types revisited: Updated data in comparative perspective. Contemporary Politics, 19(1), 19–34.
Wang, Y.-t., Mechkova, V., & Andersson, F. (2019). Does democracy enhance health? New empirical evidence 1900–2012. Political Research Quarterly, 72(3), 554–69.
World Bank (2019). Worldwide Governance Indicators. Available at: info.worldbank.org/governance/wgi/#doc (accessed April 6, 2019).
Wurster, S. (2013). Comparing ecological sustainability in autocracies and democracies. Contemporary Politics, 19(1), 76–93.
56 State Formation and Failure
I. William Zartman
Introduction
A state is the authoritative political institution that is sovereign over a recognized territory and its inhabitants. Like any fundamental concept, it has been defined by many people in many ways, but essentially they all boil down to the definition above (Dawisha and Zartman, 1988; Luciani, 1990; McIver, 1926; Service, 1975; Laski, 1935; Weber, 1947; Lasswell and Kaplan, 1950; Nettl, 1968; Tilly, 1975; Evans et al., 1985; Schatzberg, 1987; McLennan, Held and Hall, 1984; Worster, 2009). Weber (1958), who is often considered the reference authority on the state, defined it as ‘an association that claims the monopoly of the legitimate use of violence, and it cannot be defined in any other manner’ (334), but he then went on to add ‘a compulsory association’, ‘a relation of men dominating men’, ‘(successfully) claims’, and ‘within a given territory’ (82, 78). More than just an association or a relation, a state is an institution, that is, it has a structure and organization that can accomplish its associated functions. Beyond its functional institutions, the state as an institution itself contains a perception of corporate interests, an autonomy that distinguishes it from society at large (Rothchild, 1987).
As an institution, the state enjoys autonomy, a quality that relates to many agencies and functions. Institutionalization gives structure and longevity to the state; leadership may wobble but the civil service organization carries on. The state becomes larger than its ruler, with whom it was identified in the pre-modern period (and remains so in many contemporary cases). The human agent may disappear, physically or intangibly as a discredited symbol, but the state continues, awaiting capture by its next leader or government (as illustrated in Indonesia after the fall of Sukarno and then Suharto, Tunisia after the fall of Habib Bourguiba and then Zine El Abidine Ben Ali, Egypt after Gamal Abd al-Nasser and then Hosni Mubarak, Libya after the fall of Muammar Qaddafi, Yemen after the fall of Ali Abdullah Saleh, Zaire (Congo) after the fall of Mobutu Sese Seko and Somalia after the fall of Siad Barre, to give 20th- and 21st-century examples). Its capabilities are not just ad hoc but are built into bodies to perform them, well or badly. Some of these functions may be provided informally or spontaneously, but it takes an institutionalization of them into a state design to make a state (Call, 2008). Procedurally, the functions have come to be divided into rule-making (legislation), rule-executing (executive), rule-applying (judicial) and rule-enforcing (police/military). Substantively, the state performs functions that no other body can handle – if one could, it would be a state. These include security, both domestic and external; welfare; adjudication; conflict management; allocation; and regulation and coordination.
Successful claim over the monopoly of force is a core element but it also involves qualifications; the claims must be not only operative but legitimate. Weber never defined legitimacy very clearly, but a good succinct definition is ‘the right to rule’ (Lasswell and Kaplan, 1950). But who gives that right? The source of legitimacy can be both internal and external. External legitimization comes as international recognition by other states, but it comes as a confirmation that the institution has embodied the other characteristics. Under international law, there is a bit of circularity. A state must exist to be recognized but it must be recognized in order to exist in law (Aynete, 2012). Thus, recognition is not the same thing as existence. The 1933 Montevideo Convention on the Rights and Duties of States (art. 1), based on the International Court of Justice Statute (art. 38), declares that in order to qualify as a person in international law, a state must possess a permanent population, a defined territory, a government capable of ruling over the population in the territory and a capacity to enter into relations with the other states.
However, there are two schools of thought on this circular relationship (Worster, 2009). The declaratory school regards recognition as merely an acknowledgment of the existing state status. Thus, recognition should be automatic, based on objective criteria. De facto existence precedes de jure existence through recognition by other states (Montevideo arts 3 and 6). The constitutive approach holds that recognition is the final criterion for state existence and is therefore a matter of judgment and even of politics; UN membership is granted, on the recommendation of the Security Council, to peace-loving states who are willing and able to carry out UN Charter obligations (UN Charter 4.1). Thus Somaliland, a functioning and democratic state in all aspects for a quarter century, has not been recognized. Then there are puppet states: Abkhazia and South Ossetia were removed from Georgia and set up as states by neighboring Russia but are not recognized; the Sahrawi Arab Democratic Republic has little territory and its population is in Algeria, but it is recognized as a member by half of the African Union. Yet for the UN, once a state, always a state. Somalia joined the UN in 1960 and Libya in 1955, and both are still members; Somalia has had no government since 1991 and only a temporary one since 2015, and Libya has had no government since 2010 (or two since 2015).
Legitimization by the population is basic, although the concept is hard to operationalize. Optimally, internal legitimization comes voluntarily from the full support and identification of the population with the state, but in many cases it stops at the door of the monopoly of violence, making the requirement circular (Papagianni, 2008). Internal legitimization refers to an alternative definition of the state, related to Durkheim’s approach rather than Weber’s (Lemay-Hébert, 2013), and one that runs counter to the notion of autonomy. For Durkheim (1957: 79–80; 1964: 79; 1986: 54), the state ‘is the very organ of social thought … comprising the sentiments, ideals, beliefs that the society has worked out collectively and with time’.
As ‘more an idea held in common by a group of people than it is a physical organism’ (Buzan, 1991: 63), the state exists beyond its institutions and reaches into an additional capacity, ‘the capacity to command loyalty – the right to rule’ (Holsti, 1996: 82). Carried to an extreme, this behavioral sense of a state can be seen as opposed to the institutional definition, but the two are in fact compatible. Sentiments, ideals and beliefs, for all their intrinsic content, relate to a governed, populated territory with an institution that has not only the right to rule but the right to fight (Goemans, 2006). The behavioral notion expands the institutional understanding, allows for the legitimization of the state through popular feelings – difficult though this may be to operationalize – and invokes the notion of a social contract.
A social contract is the ‘bargain’ that ties together state and society. It is often simply understood but occasionally written. Its written form can be a constitution but also a series of commentaries and ‘op-eds’, like The Federalist Papers, which seek to form and express a consensus (Arjomand, 1992). The idea comes out of the 17th- and 18th-century political philosophers’ groping for an expression of the relationship between ruler and ruled (Barker, 2008). For Thomas Hobbes there was no social contract if not erected by a domineering state, but for John Locke it meant the voluntary pooling of individual liberties for security and for Jean-Jacques Rousseau it was the obligatory melting of individual interests into a common will. For established nations, the social contract is contained in the underlying spirit of the constitution, legislation and judicial hearings, but new nations have a headier challenge in collecting authoritative statements and building an effective consensus. Their political values are concentrated on the struggle for independence to the exclusion of other interests, in what has been termed the ‘funnel phase’ of political life, and on defining themselves as ‘not the Other’ rather than in positive values and identities.
For this reason, they are often prey to ideologies that provide already packaged answers, explaining the unpleasant past and laying out the glorious future and the path to it in the present. When the path to Arab or African unity or Arab or African socialism or the Asian Way does not produce promised results, new efforts are required – pragmatic or ideological – to fashion an appropriate social contract for the nation and the state. The political wanderings of the South African state after apartheid under Nelson Mandela, Thabo Mbeki, Jacob Zuma and Cyril Ramaphosa exemplify this search. Thus, the social contract is an important element in defining the state but is an elusive subject of interpretation. Legitimacy can be equated with sovereignty, which is the legal parallel to the monopoly over violence; sovereignty is the (successful) claim to monopoly over the legitimate use of civilian authority to perform the functions expected of it. It is significant in the UN policy of Responsibility to Protect, which holds states responsible for the welfare of their people (Deng et al., 1995; ICISS, 2001; UN, 2005; Evans, 2008).
Basic Theories and Concepts
A number of concepts associated with the state and its various definitions are also used in attempts to identify different types of state, often associated with their location. Territorialization refers to the fact that the state is an entity in control of a bounded piece of land and its inhabitants. It denotes both the state as an area of land control and one whose territory is a focus of its inhabitants’ identity. The two are related, as territorialization strengthens the modern state by fixing identity on a physical ‘mappable’ piece of land (and water) as a tangible embodiment of home – a sacred homeland (Goemans, 2006; Casey, 1997). The alternative is a demographic state, whose citizens are determined by their national, ethnic or religious identity wherever they are, without regard to territory. The demographic state has been associated with Islam and with African or American Indian tribal entities, especially in mobile or transhumant societies. Nation-state, like territorialization, bears a double meaning, which causes confusion.
It refers to the coincidence of nation and state, but also to the location of legitimization in the populace rather than in the monarch, as in a royal state. In the first sense, the community of identity and affection, the nation, is assumed to be the basis of the state. Nation, often confused with state, is the imagined communality derived from shared language and religion, history and ethnicity, and ancestry; it is rare where these elements of commonality are not mythical (Anderson, 1983/2006). The assumption of a nation-state subsumes all diversity under a common identification. Where the nation is not the result of a long history of tradition and traditional nations or tribes are the basis of identity, or where new populations bring their own identity, states often seek to create a new, encompassing nation by decree, reversing the process by integration into a state-nation. Such efforts are often counter-productive since they work to reinforce adherence to the traditional nation for solidarity, particularly when democratization or rebellion has pushed leaders to seek supporters for their promotion (Arnson and Zartman, 2005). In the second sense, the actions of the state, notably in interaction with other states, are claimed to represent the general will of the population. In the tradition of the French Revolution, the state is the population mobilized and policy is legitimized as the people’s will. This can lead to problems when policy goes awry and is repudiated as really only representing the regime, a problem faced by post-Nazi Germany. National self-determination is the policy based on the notion of the nation-state, which says that every nation should have its state and the justification for the existence of the state is the pre-existence of the nation. The idea has followed a complex evolution. 
It came out of Woodrow Wilson’s Fourteen Points at the end of World War I and was applied to the multi-national empires of Germany, Austria-Hungary and Ottoman Turkey; Egypt, for example, was liberated from the Turks and placed under the British, giving rise to protests from the Wafd party at the Versailles Peace Conference. In each case, ethnic nationalities other than the determining one were included in the new state, sometimes with destabilizing effects, such as the inclusion of ethnic Germans in Czechoslovakia. Only after World War II was the concept applied to the colonial possessions of Britain, France and Portugal, in Asia and Africa. But here it was the colonial unit – the colonially created territorial state – that was the determining self, since almost all colonial units were multi-national. In the biggest exception, Pakistan broke away from multi-national India to form a homogeneously Muslim state in 1947 but was then split in two between ethnically diverse (west) Pakistan and Bangladesh in the east. The principle of state – not national – self-determination continued in the determination of new states’ boundaries, so that new states contained pieces of some neighboring nation in addition to the legitimizing nation, as in South Sudan, Ukraine, Azerbaijan and Timor-Leste. Nationalities’ self-determination would simply pose too many insurmountable problems to be practicable.
Types of States
Weberian state refers to two characteristics discussed by Weber (1958): the claim to the legitimate monopoly of force, as discussed above, and the routinization of legitimacy by law as exercised by bureaucracy. The latter combines issues that were important to Weber: legal legitimization as the end stage of an evolution from traditional and charismatic sources (78–9) and bureaucratic authority in fixed and official jurisdictional areas ordered by laws and regulations of modern states (196–259). Together, the characteristics encompass the functions of a contemporary, developed state. A 19th-century doggerel declared: ‘As forms of government may fools contest. Whate’er is best administered is best.’1
Administrative state in a sense refers to a Weberian state gone too far, in which legislative functions are taken over by executive agencies, aided by judicial lawmaking by appointed officials, without involvement of elected officials as policymakers. The term has been applied to the administration of Barack Obama in the United States but even more pointedly to the functioning of the EU. ‘Policy made in Brussels’ was cited as the major underlying reason for Britain’s withdrawal from the EU (‘Brexit’).
Democratic state is one where regularly elected officials are in charge of policymaking and execution within the state institutions. Democracy – ‘the ability to choose and the ability to repent’ – is a procedural requirement that does not guarantee good government, but only the occasion to reject bad government; the democratic state emphasizes accountability as the judgment of performance and by extension carries with it the idea that state officials are public servants. In reality, democracy is not a fixed condition which, once achieved, remains permanent; democracy is a process, continually pursued, and even the most democratic states are always democratizing, well or badly, although some states of course are not democratizing at all (such as 21st-century China, Cambodia, Vietnam and Saudi Arabia). Elections are not sufficient; they must be held periodically as established by law and be free and fair, which implies an open opportunity for candidatures and for election campaigns.
Stateless societies – also termed acephalic, for an absence of chieftain – are those which lack the institutional structure of a state but cohere through a sense of established community and moral economy (Evans-Pritchard, 1940; Lewis, 1961). Institutions are rudimentary but other functions are performed by the community as a whole, and community solidarity is strong. Nonetheless, acephalic societies are also often associated with segmentation, following the rule of action: ‘Myself against my brother, my brother and I against our cousins, my cousins and I against the outside world.’
Civil society is the subject of widespread debate over a proper definition. Generally and most usefully, its essence is captured by G. W. F. Hegel’s view ‘as a historically produced sphere of ethical life … positioned between the simple world of the patriarchal household and the universal state’ (Keane, 1988: 50), or that area of public activity between the individual family and the state. For Alexis de Tocqueville (1835), it is the self-organized ‘eye of society’ exercising vigilance and calling for accountability from the state, but in addition to accountability it also performs functions of its own, as Antonio Gramsci (1967) develops. Civil society is the result of the pluralization and complexity of modern society, alive with multiple demand-bearing groups beyond the capability of the state to satisfy. Its area of action is flexible; in a well-functioning state there is a balance in civil society’s performance of things it does best, both lower level and aggregating, from neighborhood clubs to town meetings and from food kitchens to parent–teachers’ associations. An overactive civil society indicates that the state is not functioning up to capacity; a shriveled civil society indicates a repressive state. However, in an oppressive state, civil society may surge with activity to challenge official dominance. In Zaire (Congo) in the 1980s, villages and regions built their own schools and roads in the face of government neglect, but in Iraq at the same time there was little space for non-governmental organizations (NGOs) independent of state control. Even in Iran and Russia of the early 21st century, civil society is present to carry out civic functions and contest government.
Historical Evolution

States grew out of monarchial tradition in Western Europe, but an argument can be made for their roots as institutionalized
bodies for regulation and allocation in the imperial regimes of ancient Greece and Rome and their Asian competitors. Even feudal systems were based on an exchange of security for allegiance (and taxes, often in kind), an arrangement that still can be found in the ‘proto-states’ of the mafia, rebels, national liberation movements and gangsters of modern times (Clunan and Trinkunas, 2010; Magnuson, 2018). Importantly, they were bounded and territorialized, usually with borders or, if not, with marches or border zones. Not until the development of the nation-state in the French Revolution did democracy and national identification begin to appear as legitimization; however, on one hand, national identification began to appear as early as the Sicilian Vespers in 1282 (Runciman, 1958), and on the other, the state was still identical with its head alone at the Congress of Vienna more than five hundred years later. History does not come in sharply bounded epochs. Similar states existed elsewhere – Africa, Mideast, Asia, South America – primarily as empires over multiple and non-identifying populations without entering the democratization stage. Nevertheless, the governance qualities of the regime often approximate the functions of the state (Kautilya, 1960; Darke [ibn Ali al-Tusi], 1074/2002). In much of the non-European world, the notion and organization of a state were revived or implanted by 19th–20th-century colonialism. Often, a national liberation movement came into being as a proto-state, embodying what was termed the ‘real country’ against the ‘legal country’ as denominated in French colonies. Upon independence, the two became one as the movement became a single (‘state’) party. The former colonies have striven to maintain the colonial institutions, sometimes also continuing traditional institutions to help them fill out state functions. 
While formally recognized, these creations have frequently been inadequate for territorial control, functional performance, institutional coherence, popular accountability and effective security
(Jackson, 1990; Herbst, 2000/2015). Since the implant was artificial, some states collapsed, many were fragile, weak or failing, and state-building became an international priority. However, it is likely that time and experience in both success and failure will be the most important element in the attempts of ‘new nations’ to catch up with the history of the West. State formation is a historic process, both deliberate and inherent (MacIver, 1926; Service, 1975; Badie, 1987). When the indicated functions need to be performed, they create pressure for a state to come into being. Although a core element, successful claim of monopoly over the use of force is not enough; insurgent groups enforce such a claim but may have no other functions. The need to claim and defend a territory is an ingredient, not just for security reasons but more importantly to respond to other needs that it takes a state to meet. Money needs to be raised to fund security, and services need to be provided to undergird efforts to raise funds. The judgment of Tilly (1975: 42; 1978: 3–4) that ‘war made the state, and the state made war’ encapsulates (too briefly) the idea that the effort to sustain the claim to the monopoly of the use of force in a given territory requires additional functions to fund and legitimize that claim. When the feudal notion of security for service was enhanced by notions of welfare (service to, not from the population) and national identification with a particular land (popular territorialization of security), the need for a body to perform functions associated with a state came into being. These requirements imposed the development of a final element of stateness: institutionalization. Thus, the growth and articulation of social organization in a particular era created its own demands and efforts to meet them. 
This is not to say that purposeful action by individuals and groups had no role in the process; to the contrary, human agency could seriously hobble or accelerate the process. Henry II in England, Richelieu and Louis XIV and
then Napoleon in France, Mohammed Ali in Egypt, Senghor in Senegal and Houphouët-Boigny in Ivory Coast are among the many state-builders who could be cited from many eras and places. It is not only through the leadership of notable statesmen that individuals are important in the process of state formation. Collective roles are important too, notably in the formation of political parties. Parties are the collective political component of state-building, and again are a natural evolution of participation. The tendency of a polity to establish a governing elite has been emphasized often enough, as the Iron Law of Oligarchy (Michels, 1949), but the equal tendency toward fractioning has received too little attention (Coser, 1956). Participation requires organization, and even if there is a broad consensus, as for example in a social contract, there will always be differences over implementation, among social or age groups, between groups for change (from the left) and those for constancy (from the right), from different regional attachments, or simply over personalities (Zartman, 1980). Parties tend to arise either from differences within a governing institution (legislature, council, incumbent followers) or against a governing authority (opposition) (Tilly, 1978; LaPalombara, 1966, 2015). The results of state formation have been graded by a number of different criteria. Strong versus weak state provides a basic spectrum based on effectiveness (Myrdal, 1968; Migdal, 1987). However, each contains a number of subcategories. States can be strong in various ways: hard states have a solid structure with a heavy hand of control over participation, but brittle states have a repressive shell and little or no participation. Weak states come in various gradations, sometimes reaching the level of failed and collapsed. 
After an early introduction of the notion of collapse (Helman and Ratner, 1993; Zartman, 1995), attention rapidly shifted to the broader phenomenon of state failure and failing, and finally simply political instability (Rotberg, 2003, 2004; Clement, 2004; PITF, 2007).
State ‘un-building’ or failure can be a much more rapid process, although frequently unnoticed roots go back much further. In accordance with the definitions of a state given above, state failure involves a breakdown of the state security function, eroding its monopoly of violence. It can lose its control over dissident regions, usually over ethnic neglect, as in Sudan over South Sudan, or its claim to the right to govern the entire nation in the face of revolutionary rebellion, as in Tsarist Russia confronted by the Communists, or simply its ability to provide civic order against organized crime and local armed groups, as in the Central African Republic and Democratic Republic of Congo. Much of the attention to state failure has focused on the deeper structural problem of the state’s inability to provide for its citizens’ welfare. The state may depend on failing sources of revenue and lose the ability to deliver to its population, as in Algeria when oil prices dropped in the 1980s and 2010s or in Haiti, or it may overspend so badly that it falls into inextricable debt, as in Argentina in the 1980s or Venezuela in the 2010s, or it may fall prey to deep corruption where a small group intervenes between supply and demand to keep state resources for itself, as in Gabon, Tunisia or Brazil in the early 21st century (Arnson and Zartman, 2005). Deeper still in the state–society nexus is a state’s loss of resonance with its population, so that they feel alienated and unrepresented and the social contract is ruptured or eroded. This may be because the state apparatus is monopolized by a small group for their own benefit, as in Syria under the Alawite Assad dynasty; or the state may follow policies that tear the population apart, as in the UK under Prime Ministers Cameron and May; or the state may simply lose contact with its population, who break down into smaller groups of more meaningful identity, as in the Central African Republic or Yemen or the former Yugoslavia. 
In reality, security, economic and social causes of state failure (to return to Weberian categories) can coexist
and reinforce each other, leading to complete state collapse (Esty et al., 1995; Esty et al., 1998; Fragile States Index 2018; Rotberg, 2003, 2004). State collapse is a process that may take many paths as a result of many causes, with the component characteristics reinforcing each other. As a result, a threshold of collapse is difficult to establish, just as it is difficult to declare when the government mechanism of a society has qualified as a state. Collapse means the state can no longer perform the functions required for it to qualify (Zartman, 1995; DFID, 2005; OECD, 2007). It has lost the ability to enact meaningful policy, solve salient problems and manage internal conflicts; laws are not made, order is not preserved and societal cohesion is not maintained (Badie and Birnbaum, 1983; Howad, 2014; Schlichte, 1998). The breakdown of order involves rising challenges beyond internal security control and ‘power deflation’, where force is increasingly deployed with decreasing effect (Johnson, 1966). Not only do social issues arise without solutions to break down social cohesion, but regional and other identity divisions pull parts of the country out of central control. The center rules for itself; the state becomes brittle (Acemoglu and Robinson, 2012). Institutional incapacity to meet these challenges carries with it a loss of legitimacy and of symbolic representation of society, as people look elsewhere for their identity. Despite the attention that has been given to the incidence of state failure and collapse, there has been little work analyzing collapse as a process. While it is clear that failing is a slippery slope that begins well before the final arrival of collapse, if it is not stopped beforehand, it is also clear that the slide is marked by bumps and slips: conditions may improve or worsen, sudden changes or improvements may be brought on by policies or by external events, components of collapse may vary at different rates. 
It is difficult to discern a starting point for these interrelated processes and because of its compound nature, components of the collapsing process may begin at any
point, dragging the others with it, all amounting to overcoming the efforts of the institution to persist as a state. Ultimate signposts are identifiable, even though it is not clear whether they are in the right order or not (or whether there is an order at all) (Zartman, 1995: 10):
1 Power devolves to the peripheries because the center fights among itself. Those in central power are too busy defending themselves against attacks from their colleagues to hold onto the reins of power over the countryside, as in Somalia in the 1970s. Local authority is up for grabs and local power-grabbers – rebels and eventually warlords – grab.
2 Power withers away by default at the center because central government loses its power base. It no longer pays attention to the needs of its social bases and they withdraw their support, as the center relies instead on its innermost trusted circle – an ethnic or regional base, or a functional group such as an army officers’ clique, as in Congo in the 1980s. Attention to the needs and demands of the support group diverts allocations from the broader social sources of support.
3 Government malfunctions by avoiding necessary but difficult choices on pressing problems. Such measures mount in urgency and difficulty as a result, facing the state with a governing crisis, as in Algeria in the 1990s and 2010s. Decisional avoidance can take place either because of institutional incoherence, in which the mechanisms of government are inadequate to their challenges, or because of political flabbiness, in which the politicians themselves are incapable of biting the bullet. The effect is the same.
4 The incumbents practice only defensive politics, fending off challenges and reducing threats, concentrating on procedural rather than substantive measures. Such measures include both repression and concession, both taken to get the opposition off their back. Absent is a political agenda for participation and programs, as in Venezuela in the 2010s. Elections are postponed; platforms are absent. Exclusion rather than inclusion is the way of handling demand-bearing groups.
5 Probably the ultimate danger sign is when the center loses control over its own state agents, who begin to operate on their own account. Officials exact payments for their own pockets, and law and order are consistently broken by the agents of law and order, the police and army units becoming gangs and brigands, as in Zaire in the last days. Local society handles local needs that were formerly part of the state’s responsibilities.
Some of these may appear as harbingers of revolution (Johnson, 1966), but in the case of collapse, society is too tired and disorganized to produce a revolutionary movement and elite. It is true that revolutions tend to be occasioned by state weakness (Brinton, 1965), but the alternative outcome is a society set back on its local resources with no state to be taken over. The French state in the late 18th century and the Iranian state in the late 20th century were failing, but there was a state to be overthrown; the state in the Central African Republic or Somalia or Yemen in the late 20th century did not exist. In such cases, without a strong, dedicated revolutionary movement to take over, the revival of a state (of any kind) is a slow process of putting the pieces together when they tend to weigh each other down instead of building on each other. Collapsed state rebuilding in the early 21st century in Sierra Leone and Liberia has been a slow process; it was more rapid in the later 20th century in Germany and Japan because of the strong state tradition and the enormous international effort. State-building, commonly mistermed nation-building, is the process and policy of restoring functionality to weak, failing, failed and collapsed states, and is a major focus of developed states’ foreign policy, a testimony to the importance accorded to the state as the organizer of the world’s people and territory. The attention has come about because weak states of all varieties have proven unable to handle their own affairs and this incapacity has given rise or at least locus to disputes and rebellions that threaten other states’ security. There is an urgency in state rebuilding after conflict, since untended formerly conflicted states have a 50% chance of recidivism within five years (Collier, 2003). State rebuilding then becomes key to conflict prevention (Zartman, 2015). Where a people
has not enjoyed (in the literary sense) a history dedicated to the consolidation of their state, efforts must be expended to make up for lost experience. By the nature of the challenge, national efforts are unlikely to be sufficient and require international assistance; the relation between the two is important for the eventual solidity of the building (Tahiri, 2010; Doyle and Sambanis, 2006). Instant institutions have to be installed and manned; political life, including parties and free and fair elections, has to be created; functioning and respected judicial systems need to be installed; and national budgets and taxation and fiscal policy need to be established. State-building programs have been advocated in many forms. Some proposals look for the level most appropriate for strengthening; the need for building from the local base contrasts with international programs that seek institutionalization from the top down (Kaplan, 2008; USAID, 2005). Others have divided failing into vertical components and categorized the process into major gaps – in legitimacy and security – so as to be able to develop by functional segment (Call, 2010). Another approach sees state-building as primarily an economic matter, since the unemployed youth bubble is vulnerable to alienation and delegitimization (Collier, 2003; USAID, 1995). In any case, rebuilding efforts need to be coordinated to be effective; the World Bank has been accused of rushing to provide economic rehabilitation before political reconstruction has been achieved in conflict-ridden states, thereby funding irresponsible and unaccountable rulers. State-building has been discussed primarily as institutionalization, with much less attention paid to the behavioral approach, admittedly more difficult and closer to the real sense of nation-building.
Regional Variations

State is a universal concept, but states are not universally distributed, at least as defined. If
any other entity were to perform its functions, it would be a state. Although the concept is associated with the Western European cultural area with a claim of conceptual universality, developing areas of the world sometimes claim their own brand of state. They do not qualify as some other type of institution, since they aspire to filling state functions, but they do so badly, simply because they are young in their statehood and only developing in the given direction. As new states, their greatest source of weakness is their unsteady legitimacy. Coming out of a traditional era, they depend on two of Weber’s sources of legitimization: either charismatic, depending on the attractiveness of an individual, or revolutionary, a concept that Weber named but never got around to developing. States in Africa, Latin America and the Arab world often claim their own identity, largely based on their historical heritage rather than on any predominant characteristics from their societies. All three cultural areas lean heavily on central personal leadership – the Big Man or chief in Africa (Schatzberg, 2001); the jefe or caudillo in Latin America; the za’im (leader) in the Arab world. Since the central figure tends to lean on his own extended family and act as the father figure of the nation, combining traditional with charismatic or even revolutionary legitimization, the notion of a patrimonial state is appropriate. The routinization of charismatic (and, again, especially revolutionary) legitimization is a long and difficult task involving jolts of transition and pressures of backsliding (Weber, 1968: chapter 6). The Latin American state is an identifying designation rather than a specific analytical concept. Latin America had gotten over the post-colonial problems it experienced after independence (by state self-determination) two centuries ago, a century and a half before Africa. 
The result has been the creation of a small landed elite, supported by the Catholic church (despite a small 20th-century breakaway protest movement of liberation theology). Militant protest from Soviet-supported
agrarian movements in the 1970s and 1980s only reinforced the gap between them and the state, but in the quarter century since then, some more moderate leftist movements have been elected into power. In this evolution, the Latin American state came to be viewed as an extension of international economic domination and was analyzed as an autonomous state, located between its masses and its international capitalist dependency in a ‘Janus-faced’ position (Evans et al., 1985). Dependency ‘theory’ became the neo-Marxist lens for analyzing the Latin American state during the Cold War (Frank, 1967; Cardoso and Faletto, 1979), but the development of local capital and of bureaucratic–authoritarian regimes thereafter provided material for the analysis of more complex relations. However, the evolution of the state as an autonomous actor with its own interests separate from both its population and its international ties left it strong but limited in its capabilities, and so allowed the growth of unregulated corruption and a return of equally autonomous bandit movements (bacrims, for criminal bands) of private advantage, primarily in drugs, rather than of popular protest. The Latin American state is a strong state, democratically functioning in many cases, with a few military states appearing from time to time; Chavez–Maduro Venezuela in the 2010s has been the only state to show great fragility. The African state is a term used to characterize the outgrowth of the nationalist movement into a party-state whose writ is concentrated on major cities where political life, social services, education and public utilities are concentrated (Herbst, 2015; Villalón and Huxtable, 1997). The African state is like an egg: yolk in the middle and a fragile shell outside, but weak fluidity between the two. The contemporary African state was born with its politics concentrated in a post-World War II national liberation movement (violent or simply political) under a George Washington figure. 
The movement became a single party centered on a Great
Coalition of modern and traditional forces and trained only in colonial state overthrow. After the first decade of independence in the 1960s, military coup became the most common means of succession, since civilian electoral opposition to the charismatic figure of the president was still considered treasonous. The embryo state was weighted from birth with a series of burdens. It came into being on the wave of nationalist state self-determination, designed to return sovereignty to its citizens and answer their needs (the story is told of a Malian who went into a bank and wanted to get money ‘because that’s where money is’). It was also accompanied by the developmentalist thesis, according to which new states would take off like a plane and develop into the world economy. It also had no local capital, which obliged government to create a state-centralized economy that soon developed a foreign aid dependency. With the fall of Soviet support and inspiration, the ‘African Spring’ beginning in 1993 rose against single-party rule and the centralized economy, the latter aided by the World Bank’s Structural Adjustment (Zartman, 1997). Since the 1990s, elections have provided a succession of party-cum-business coalitions. Except for Rwanda, a unique case of rupture with traditional structures, government in the capital is aided by the incorporation of traditional authorities and leaders, coopted as ruling party members. Marginal groups uncovered by the specific ethnic business party coalition are outside the scope of service and participation in the state; seeing this neglect as targeted discrimination, they enter dissidence. This is the story of regional rebellions against the state in Senegal, Mali, the Central African Republic, Sudan, Chad, Nigeria, Sierra Leone, Liberia, Ivory Coast, both Congos, Uganda, Kenya, Somalia, Zimbabwe, Angola, Mozambique and South Africa, among others. 
In a few cases – Chad, Ivory Coast for a decade in the early 2000s, Rwanda, Burundi, South Africa – the marginalized group was able to
take over the state; elsewhere the dominant group maintains control, and constitutional revisions are increasingly frequent from the 1990s for presidential terms extended to life. The African state south of the Sahara is typically a weak state, totally collapsed in a few places such as Central Africa, South Sudan, Democratic Republic of Congo and Somalia, followed at the next level by Mali, Chad, Sudan and Botswana (Fragile States Index 2018). The Arab state has long been regarded as based on an authoritarian regime riding on a sham democracy and a military intelligence (mukhabarat), as confirmed, paradoxically, by the events of the Arab Spring of 2010–15. Although the region also saw national liberation movements and parties after World War II, the heritage of sultanates and janissaries (armies) remained (Badie, 1987; Luciani, 1990; Mello, 2018). References to the population (masses) and prevalence of single parties continued; most of the countries’ rentier economies’ dependence on oil revenues removed the need for effective taxation and underwrote inefficient state industries. These token political and economic ties with the population allowed for plebiscite elections and entrenched leaders claiming revolutionary legitimization. In many places – Libya, Somalia, Egypt, Algeria, Sudan, Yemen; Iran and Afghanistan outside the Arab world; and in the 21st century Syria and Iraq – popular protest against corrupt, uncaring governments found expression in Islamist movements that challenged the secular state. When, finally, the unemployed youth, supported by the middle classes, exploded in secular protest in 2010–11 in the Arab Spring, the popular movement did not have enough organization and experience to take over power and create a state based on a new legitimacy, and the old state forms returned; in three cases – Syria, Libya and Yemen – already brittle states collapsed in civil war (Zartman, 2015). 
The Arab state is indeed a hard state, with executive succession by inheritance, designation or coup, although collapsed in Yemen and
Libya, with Iraq and Afghanistan (non-Arab) following not far behind in 2019 (Fragile States Index 2018). The Islamic state is a newcomer to the modern scene but with historic antecedents. It is based on a purported resurrection of the khilafa (caliphate) as embodied in the golden age of the first four successors (khalifa) of the prophet Mohammed; in historic reality, the period of the seventh century was an age of enthusiasm, expansion and instability. The notion of the Islamic state, also referred to by its Arabic acronym da’esh (Islamic State of Iraq and the Levant), was revived most recently by an extremist Muslim group launched by former followers of Saddam Hussein in Iraq and rapidly establishing a rulership in Syria and Iraq in the 2010s. It acted as a transnational movement rather than a territorial state, although it did administer areas that it occupied, and it extended its claim to other self-affiliating groups throughout the Muslim world. Similar ventures were launched at the end of the 20th century in Afghanistan by the Taleban (‘students’) after the eviction of the Soviet occupation, and then by al-Qaeda (The Base), which had similar characteristics but, as its name indicates, a fixed location in Pakistan and Afghanistan (Kendall, 2018). Al-Qaeda’s base was dispersed to other regions – Arabian Peninsula (Yemen) (AQAP), North Africa (AQMI), Syria (under a succession of names), Somalia and Indonesia, among others. Al-Qaeda operates as a terrorist movement with targets in the region and abroad, with little pretension at governance (al-Suri, 2005), but da’esh made efforts to govern its region. It provided services, offered payments and salaries, administered justice and enforcement of its strict fundamentalist regulations and lived largely on extensive oil revenues, under a khalifa (Abu Bakr al-Baghdadi [Ibrahim al-Badri]) elected by a selected consultative council. 
Legitimacy comes from a self-declared designation from God, and actions are based on a narrow interpretation of the Qoran. The territorial base in eastern
Syria was finally destroyed around 2019, but the hydra’s tentacles continue to operate.
Advances, Debates and Assessments

Conceptualization of the state advances as new types of cases appear from inductive analysis and then deductive typology and process development. The independence of former colonies in Africa and Asia after World War II (much more than the independence of Latin American states after the fall of Napoleon) brought out new problems and new concepts. Debate is still ongoing over whether African and Arab (more than Asian) states are a phylum of their own, or merely an early (how early?) stage of development toward the Western archetype. In the process, they raise basic questions about the components of states and strategy of state-building, and their universality. The debate is emotionalized by the protests of those in the developing world who do not want to be seen as following the Western path (Mamdani, 1996). State-building is nonetheless the current thrust of dealing with new states in the 21st century, much as development theories were the theme for analysis of new states in the postwar period. Interestingly, the al-Qaeda attacks of September 11, 2001 and the Islamic State attacks of 2015 have had little impact on the conceptualization of the state (cf. Rotberg, 2003; Badie, 1987: chapter 8). Despite much attention, many of the processes described above still elude well-developed conceptualization and analysis. In many cases, opposed approaches are more content to draw lines exclusionarily instead of looking for complementarity. The legal criteria for recognition as a sign of external legitimization remain circular. The declaratory and constitutive approaches provide the material for debates but in fact each is dependent on the other. Proof of internal legitimization is still elusive: how does one
measure the degree of legitimacy accorded by the population (homogeneously or by groups and types – Horowitz, 1985 versus Lijphart, 1977; Montville, 1990), and what is the threshold – how much legitimacy does a state have to have to be legitimate? The process of state failure is still unexplored; the characteristics of each stage are well studied, but the way in which states fail and eventually collapse (while still retaining their legal existence, of course) is not fully understood. That process may be completely idiosyncratic, but it is more likely that there are patterns that remain to be discovered and explored. The debate over remedies to state weakness is widely open, but needs more study of the process of weakness itself (Patrick and Brown, 2007; Kaplan, 2008; Lund, 2010; Zartman, 2015). While admittedly there must be a local component, pure bootstrapping is rare and takes time. Therefore, the temptation for foreign states and international organizations to step in and set it all right is great; trillions of dollars are spent in these efforts, and because there is no control case, it is impossible to know how necessary or useful they were (USAID, 2005). The composition of R2P gives an answer by identifying its three pillars of responsibility: states are responsible for the welfare of their people; external states are responsible for giving help to the task when asked; external states are responsible for stepping in when the first pillar state does not exercise its responsibilities. But the trilogy does not give any indication about when to pass from one pillar to the next. The debate over R2P is a necessary and engrossing challenge for the definition of new international norms of conduct and responsibility.
Perspectives

States will continue to be built and to fail. The wave of new states produced by decolonization and the end of empire is essentially over and the experience of the few
latecomers shows that many lessons remain unapplied, as in South Sudan, Eritrea, Ukraine and Georgia (Miller, 2010; Beissinger and Young, 2002; Barrington, 2006). But the striking fact is that stateness is a universal aspiration and the number of qualifying entities, despite all their problems and challenges, is only increasing. Debates, both academic and policy-oriented, rage over how to prolong their life and increase their health. Arguably, the widest ranging substratum of those debates is whether such measures must be home-grown by those who know the territory or imported by those who know the answers. It is of little help to insist that both must be combined because this only leads to the second question of the proper measures in the combination and the skills in making the mix (Leader and Waever, 2019). The two questions are not new but badly need some resolution; this is complicated by the fact that that resolution involves a meeting of the minds of analysts and practitioners (George, 1993). Since building and failure is ultimately a policy problem, it requires repeated attention to the challenges of prevention and the lessons of the innumerable past successes (attested by the number of functioning states and their ability to overcome recurrent challenges). A new element of focus brought on by the rise of ‘new’ internal and asymmetric wars (Kaldor, 2012; Walter, 2017), is the conflict trap, which indicates that once caught in an internal conflict, it is difficult for states to emerge because the conflict has done great damage to their capabilities for recovery. By this logic, rebuilding is no longer a policy challenge, but a struggle against inherent forces (Collier, 2003). Partially in response, another concept has drawn attention to a revised focus: the matter of resiliency or the ability of state-society to roll with the punches and recover its balance when confronted with the prospect of failure (Aall and Crocker, 2017). 
It is not clear whether the concept refers to the ability to regain the status quo ante or to absorb the challenge and emerge
strengthened – restoring versus renovating resilience (Martin-Breen and Anderies, 2011). The concept of the state itself is the latter, since the world does not stand still, and that means that it will face new challenges and find new answers in order to maintain and improve its essential existence.
Note 1 As a personal note, on my MA comprehensive exam, I wrote on the matter: ‘On forms of government may wise men disagree: Whate’er is best administered may verge on tyranny.’
References Aall, Pamela & Crocker, Chester, eds. 2017. The Fabric of Peace in Africa: Looking beyond the State. Waterloo, ON: CIGI. Acemoglu, Daron & Robinson, James 2012. Why Nations Fail. New York: Crown. Al-Suri, Abdu Musab (Mustafa Setmariam Nasar) 2005. Call for World Islamic Resistance. Anderson, Benedict 1983/2006. Imagined Communities. New York: Verso. Arjomand, Saïd Amir 1992. ‘Constitutions and the struggle for political order,’ Archives Européenes de Sociologie, 33(1): 39–82. Arnson, Cynthia J. & Zartman, I William, eds. 2005. Rethinking the Economics of War: The Interface between Need, Creed, and Greed. Washington, DC: Woodrow Wilson Center. Aynete, Abebe 2012. ‘Unclear criteria for statehood and its implications for peace and stability in Africa,’ Conflict Trends, 1 (January): 42–8. Badie, Bertrand and Birnbaum, Pierre 1983. Sociologie de l’État. Paris: Fayard. Badie, Bertrand 1987. Les deux États: pouvoir et société en Occident et en terre d’Islam. Paris: Fayard. Barker, Ernest, ed. 2008. Social Contract. Oxford: Oxford University Press. Barrington, Lowell, ed. 2006. After Independence: Making and Protecting the Nation in Postcolonial and Post-Communist States. Ann Arbor, MI: University of Michigan Press.
Beissinger, Mark R. & Young, Crawford, eds. 2002. Beyond State Crisis? Post-Colonial Africa and Post-Soviet Eurasia in Comparative Perspective. Washington, DC: Woodrow Wilson International Center. Brinton, Crane 1965. Anatomy of Revolution. New York: Vintage. Buzan, Barry 1991. People, States and Fear. London: Harvester Wheatsheaf. Call, Charles, ed. 2008. Building States to Build Peace. Boulder, CO: Lynne Rienner. Call, Charles 2010. ‘Beyond the “failed state”: Toward conceptual alternatives,’ European Journal of International Relations, 17(2): 303–26. Cardoso, Fernando Henrique and Faletto, Enzo 1979. Dependency and Development in Latin America. Berkeley, CA: University of California Press. Casey, Edward S. 1997. The Fate of Place. Berkeley, CA: University of California Press. Clement, Cathy 2004. State Collapse: A Common Causal Pattern? A Comparative Analysis of Lebanon, Somalia and the Former Yugoslavia, Doctoral Thesis, University of Louvain. Clunan, Anne & Trinkunas, Harold, eds. 2010. Ungoverned Spaces. Stanford, CA: Stanford University Press. Collier, Paul et al. 2003. Breaking the Conflict Trap. Washington, DC: World Bank. Coser, Lewis 1956. The Functions of Social Conflict. New York: Free Press. Darke, Hubert, ed. 2002. Nizam al-Mulk: The Book of Government or Rules for Kings, 3rd ed. New York: Persian Heritage Foundation. Dawisha, Adeed & Zartman, I William, eds 1988. Beyond Coercion: The Durability of the Arab State. London: Croom Helm. Deng, M., Kimaro, Sadikiel, Lyons, Terrence, Rothchild, Donald and Zartman, I. William et al 1995. Sovereignty as Responsibility: Conflict Management in Africa. Washington, DC: Brookings Institution Press. De Tocqueville, Alexis 1835/1955. Democracy in America. Englewood Cliffs, NJ: Doubleday. Doyle, Michael W. & Sambanis, Nicholas 2006. Making War and Building Peace. Princeton, NJ: Princeton University Press. DFID 2005. We Need to Work More Effectively in Fragile States.
London: UK Department for International Development.
Durkheim, Émile 1957. Professional Ethics and Civic Morals. London: Routledge. Durkheim, Émile 1964. The Division of Labor in Society. New York: Free Press. Durkheim, Émile 1986. ‘The concept of state,’ in A. Giddens, ed., Durkheim on Politics and the State (pp. 32–72). Cambridge: Polity. Esty, Daniel C, Goldstone, Jack A, Gurr, Ted R, Surko, Pamela T, and Unger, Alan N 1995. State Failure Task Force Report, I. McLean, VA: CIA. Esty, Daniel C, Goldstone, Jack A, Gurr, Ted R, Harff, Barbara, Levy, Marc, Dabelko, Geoffrey D, Surko, Pamela T, and Unger, Alan N 1998. State Failure Task Force Report, II. McLean, VA: CIA. Evans, Gareth 2008. The Responsibility to Protect. Washington, DC: Brookings Inst. Press. Evans, Peter, Rueschemeyer, Dietrich & Skocpol, Theda, eds. 1985. Bringing the State Back In. Cambridge: Cambridge University Press. Evans-Pritchard, EE 1940. The Nuer. London: Clarendon Press. Fragile States Index 2018, published by Fund for Peace, Washington, DC. Frank, Andre Gunder 1967. Capitalism and Underdevelopment in Latin America. New York: Penguin. George, Alexander L. 1993. Bridging the Gap: Theory and Practice in Foreign Policy. Washington, DC: US Institute of Peace. Goemans, Hein 2006. ‘Bounded communites: Territory, territorial attachment and conflict,’ in Miles Kahler & B Walter, eds, Territoriality and Control in an Era of Globalization (pp. 25–61). Cambridge: Cambridge University Press. Gramsci, Antonio 1967. The Modern Prince and Other Writings. New York: International. Helman, Gerald & Ratner, Steven 1993. ‘Saving failed states,’ Foreign Policy, 89(winter): 3–20. Herbst, Jeffrey 2000/2015. States and Power in Africa: Comparative Lessons in Authority and Control. Princeton, NJ: Princeton University Press. Holsti, Kalevi J. 1996. The State, War and the State of War. Cambridge: Cambridge University Press. Horowitz, Donald 1985. Ethnic Groups in Conflict. Berkeley: University of California Press.
Howard, Tiffany 2014. Failed States and the Origins of Violence. New York: Ashgate. ICISS 2001. The Responsibility to Protect. Ottawa: International Commission on Intervention and State Sovereignty. Jackson, Robert H. 1990. Quasi-States: Sovereignty, International Relations and the Third World. Cambridge: Cambridge University Press. Johnson, Chalmers 1966. Revolutionary Change. New York: Little, Brown. Kaldor, Mary 2012. New and Old Wars: Organised Violence in a Global Era, 3rd ed. Cambridge, UK/Malden, MA: Polity Press. Kaplan, Seth 2008. A New Paradigm for Development. New York: Praeger. Kautilya Polity 1960. Arthashastra. Mysore: Mysore Printing & Publishing House. Keane, John, ed. 1988. Civil Society and the State. London: Verso. Kendall, Elisabeth 2018. Contemporary Jihadi Militancy in Yemen. Washington, DC: Middle East Institute. LaPalombara, Joseph and Weiner, Myron, eds. 1966, 2015. Political Parties and Political Development. Princeton, NJ: Princeton University Press. Laski, Harold J. 1935. The State in Theory and Practice. Harmondsworth: Penguin. Lasswell, Harold D & Kaplan, Abraham 1950. Power and Society: A Framework for Political Inquiry. New Haven, CT: Yale University Press. Leader, Anna & Waever, Ole, eds. 2019. Assembling Exclusive Expertise: Knowledge, Ignorance and Conflict Resolution in the Global South. London: Routledge. Lemay-Hébert, Nicolas, ed. 2013. Semantics of State-Building. London: Routledge. Lewis, I M 1961. A Pastoral Democracy: A Study of Pastoralism and Politics among the Northern Somali of the Horn of Africa. New York: Oxford University Press. Lijphart, Arend 1977. Democracy in Plural Societies. New Haven, CT: Yale University Press. Luciani, Giacomo, ed. 1990. The Arab State. Abingdon: Routledge. Lund, Michael 2010. Engaging Fragile States. Washington, DC: Woodrow Wilson Center. MacIver, Robert M. 1926. The Modern State. London: Oxford University Press.
Magnuson, Salamah 2018. Non-State Armed Groups in Social Conflict. PhD Dissertation, Baltimore, Maryland: Johns Hopkins University. Mamdani, Mahmood 1996. Citizen and Subject: Contemporary Africa and the Legacy of Late Colonialism. Princeton, NJ: Princeton University Press. Martin-Breen, Patrick & Anderies, J. Marty 2011. ‘Resiliency: A Literature Review,’ Institute for Development Studies. http://opendocs.ids.ac.uk/opendocs/bitstream/handle/123456789/3692/Bellagio-Rockefeller%20bp.pdf?sequences McLennan, Gregor, Held, David & Hall, Stuart 1984. The Idea of the Modern State. Milton Keynes: Open University Press. Mello, Brian 2018. ‘The Islamic State: Violence and ideology in a post-colonial revolutionary regime,’ International Political Sociology, 12(2): 139–55. Michels, Robert 1949. Political Parties: A Sociological Study of the Oligarchical Tendencies of Modern Democracy. Glencoe, IL: Free Press. Migdal, Joel S. 1987. Strong Societies and Weak States. Princeton, NJ: Princeton University Press. Miller, Laurel, ed. 2010. Framing the State in Times of Transition. Washington, DC: USIP. Montville, Joseph V., ed. 1990. Conflict and Peacemaking in Multiethnic Societies. Lexington, MA: D. C. Heath. Myrdal, Gunnar 1968. Asian Drama: An Inquiry into the Poverty of Nations. New York: Pantheon. Nettl, J P 1968. ‘The State as a Conceptual Variable,’ World Politics, 20(4): 558–92. OECD 2007. Principles for Good International Engagement in Fragile States and Situations. Paris: Organization for Economic Cooperation and Development. Papagianni, K 2008. ‘Participation and state legitimation,’ in Charles T. Call, ed., Building States to Build Peace. Boulder, CO: Lynne Rienner. Patrick, Stewart & Brown, Kaysle 2007. Greater Than the Sum of Its Parts? Assessing “Whole
of Government” Approaches to Fragile States. Boulder, CO: Lynne Rienner. PITF 2007. Political Instability Task Force. Available at: http://globalpolicy.gmu.edu/pitf/index.htm (accessed November 3, 2008). Rotberg, Robert, ed. 2003. State Failure and State Weakness in a Time of Terror. Cambridge, MA/Washington, DC: World Peace Foundation/Brookings Institution Press. Rotberg, Robert I, ed. 2004. When States Fail. Princeton, NJ: Princeton University Press. Rothchild, Donald 1987. ‘Hegemony and state softness: Some variations in elite responses,’ in Zaki Ergas, ed., The African State in Transition (pp. 117–48). London: Frank Cass. Runciman, Steven 1958. The Sicilian Vespers. New York: Pelican. Schatzberg, Michael 2001. Political Legitimacy in Middle Africa: Father, Family, Food. Bloomington, IN: Indiana University Press. Schlichte, Klaus 1998. ‘Why States decay,’ Arbeitspapier 2. Universität Hamburg IPW. Service, Elman 1975. Origins of the State and Civilization. New York: Norton. Tahiri, Edita 2010. International Statebuilding and Uncertain Sovereignty, PhD Thesis, Pristina: University of Pristina. Tilly, Charles, ed. 1975. The Formation of National States in Western Europe. Princeton, NJ: Princeton University Press. Tilly, Charles 1978. From Mobilization to Revolution. New York: McGraw-Hill. UN General Assembly 2005. World Summit Outcome, Resolution A/RES/60/3, 24 October. USAID 1995. Program for the Recovery of the Economy in Transition (PRET). Washington, DC: Agency for International Development. USAID 2005. Fragile States Strategy. Washington, DC: US Agency for International Development. Villalón, Leonardo A. & Huxtable, Phillip A., eds. 1997. The African State at a Critical Juncture. Boulder, CO: Lynne Rienner. Walter, Barbara 2017. ‘The New New Civil Wars,’ Annual Review of Political Science, 20: 469–86. Weber, Max 1958. Essays in Sociology. New York: Galaxy Books (Oxford University Press). Weber, Max 1968. On Charisma and Institution Building. S. N. 
Eisenstadt, ed. Chicago: University of Chicago Press.
Worster, W 2009. ‘Law, Politics and the Conception of the State in State Recognition Theory,’ Boston University International Law Journal, 27 (115). Zartman, I William, ed. 1980. Elites in the Middle East. New York: Praeger. Zartman, I William, ed. 1995. Collapsed States. Boulder, CO: Lynne Rienner.
Zartman, I William 2015. Preventing Deadly Conflict. Cambridge: Polity. Zartman, I William, ed. 2015. Arab Spring: Negotiating in the Shadow of the Intifadat. Athens, GA: University of Georgia Press.
PART V
Public Policies and Administration
57 Bureaucracy and Bureaucratic Effectiveness B. Guy Peters
The Nature of Bureaucracy The term bureaucracy is usually associated with the German political sociologist Max Weber and is taken to mean a formal structure governed by hierarchy and rules and populated by career officials recruited on merit principles. Although the Weberian concept of bureaucracy was based on the experiences of, or aspirations for, Western countries, bureaucracies of various sorts have existed in traditional and transitional regimes for centuries (Eisenstadt, 1963). The formal bureaucratic model also has been largely adopted in most non-Western countries but may be adopted in form only, with the practice differing markedly from the Weberian ideal (see Nef, 2007). Formally, a bureaucracy (following Weber) is an organization that has the following features:
1 hierarchical organization;
2 fixed lines of authority;
3 division of labor and specialization;
4 governance through published regulations;
5 maintenance of files;
6 full-time, career employees;
7 recruitment, retention and promotion by merit principles.
It is crucial to remember that for Weber, the bureaucracy is an Ideal Type rather than an organizational form that would necessarily be manifested in the real world of public administration. That is, the bureaucratic form of organization was intended to be a manifestation of the achievement of the rational legal form of authority in a developed society (Lepsius, 2013). However, a number of empirical studies have demonstrated that the bureaucratic model is rarely fully achieved in the real world of public or private administration (Hall, 1963; Hinings and Meyer, 2018; Olsen, 2008). While bureaucracy has come to have strong negative connotations (Kaufman, 1981), at the time the concept was developed it could be seen as a significant advance
over the amateur, patrimonial administrative structures that existed in many countries during the time Weber was writing. In addition to the rationality embedded in this form of organization, the assumption was that a bureaucracy would perform more efficiently than previous structures. As well as the efficiency gained through the bureaucratic form, this mode of administration was also more transparent for citizens as they could learn of any rules in advance and could also find out what stage of processing their file had reached. Finally, the career structure, the training of individual bureaucrats, and authoritative controls within the organization were intended to make the administrative system less corrupt and more competent. The concept of bureaucracy also implied a construct of the bureaucrat who inhabits a bureaucracy. These individuals are meant to demonstrate substantial abilities to perform their tasks, but to also suspend their personal judgment when performing their tasks. The good bureaucrat acts sine ira et studio.1 That is, they would make their decisions without concern for the particular circumstances of the individuals involved in the issue, but only on the basis of the law and the directions of superiors within the organization. This behavior should create greater equality between citizens as their cases would not be judged on ascriptive criteria but instead on the universalistic criteria of the law.2 Given that bureaucracy is an intellectual construct, the characteristics of comprehensive rationality and the complete divorce of the bureaucrat from their own ideas and values were not likely to ever appear in reality. This construct was intended largely as a methodological tool against which to compare real-world organizations (Eliason, 2000), thus measuring the extent to which the rational-legal model had been achieved. 
Rather than being a guide for building better organizations, the concept of bureaucracy may be better understood as an aid to comparative analysis (Dahlström and Lapuente, 2017).
As well as being perhaps unachievable, the Weberian bureaucratic form of organization might not even be desirable, especially in a democracy, were it ever to be achieved fully. As just noted, this form of organization would imply that the bureaucrat would become something of an automaton who was making decisions strictly according to the rules and the hierarchical commands of their superiors. The experiences of numerous authoritarian regimes have made it very apparent that public bureaucrats must be prepared to exercise some independent moral judgments about their actions and the actions of their organizations (Du Gay, 2005; Goodsell, 2015). The same sort of separation between individual values and decisions on behalf of the State can be identified when bureaucrats are excessively loyal to the regime or to a particular political leader (Pierre, 2019). Thus, a strict bureaucratic organization can be dehumanizing both for the employees of the organization and for its clients. For the employees, working for such an organization provides little opportunity for personal growth or the use of talents: decisions are governed by rules and hierarchical command and do not involve personal judgment. Likewise, for the client of the organization there is little room for consideration of the individual characteristics of the case. The rules are the rules and the case will be judged according to those impersonal rules.3 Although there is a formal definition of bureaucracy, the term also is used more informally and almost always negatively. One set of connotations of bureaucracy is of organizations that are bound by red tape, inefficiency, and corruption. That image of bumbling incompetence can be contrasted with an alternative set of connotations of bureaucracy as organizations that aggressively seize power from the more democratic institutions of the public sector (Niskanen, 1971). 
Those images existed long before the advent of contemporary populism and the attempts to undo the ‘administrative state’ (Metzger, 2017). The connotations
of bureaucracy mentioned above are obviously contradictory, but they may exist side by side in the media and even in academic discourse. Whether viewed in more formal terms or through such negative lenses, the bureaucracy remains a central institution for governance. Much of the work of governing is carried out by the bureaucracy, and in some places it is the most competent and modernized institution for governing. Variations in the form of bureaucratic institutions will have a profound effect on the performance of government and the economy (Acemoglu et al., 2015). This article will address the concept of bureaucracy and examine the nature of bureaucratic institutions in comparative perspective, together with the influence of those institutions on the governing capacity of political systems.
Bureaucracy as Attitudes and Values as Well as Structure The discussion above, and indeed most discussions of bureaucracy, focus on the structural features outlined by Weber and by other scholars such as Woodrow Wilson in the United States. It is crucial to remember, however, that bureaucracy also depends upon the values of the individuals who populate those structures. Further, as with most aspects of bureaucracy, those attitudes held by the bureaucrats may be both positive and negative. Indeed, the same value held by a public employee may have both positive and negative consequences for the clients of government organizations and for the government itself. The negative conceptions of the attitudes of bureaucrats usually focus on patterns such as rigidity, legalism, and risk aversion. These values are, in turn, related to classic pathologies of bureaucracy such as red tape and buck-passing. The traditional rules within a public bureaucracy guarantee tenure for the
employee of a public organization unless he or she does something wrong – perhaps egregiously so. These rules therefore generate behaviors, and the attitudes supporting those behaviors, that make it difficult for citizens to get prompt decisions from government. However, those same values of legalism and rigid adherence to rules can also enable bureaucracies to create equality in the services provided to citizens. In some settings bureaucrats have been noted to have an attitude of superiority and to consider themselves different from the society they serve. Historically, this was the case in many European societies, but even where the role of public sector bureaucrat remains respected, the democratization of public life has reduced or eliminated those attitudes. However, they may persist in some settings. This attitude has been noted in a number of African countries (Olivier de Sardan, 2008), with the bureaucrats lacking a service orientation and inheriting the sense of superiority from the former colonial administrators. But there are also more positive values associated with the public bureaucracy. Most importantly, the literature on ‘public service motivation’ (Perry and Hondeghem, 2008) has demonstrated that many, if not most, civil servants in developed democracies are motivated by the opportunity to do the service for the public and their country that is offered by working in the bureaucracy. Rather than being concerned about defending their own positions and paychecks, these bureaucrats see the public sector as a locus for service to their society. The problem for individual bureaucrats becomes how to balance their service motivation with the formal rules of government. They are responsible for upholding the rule of law and following the rules for delivering public services. 
Even if they are sympathetic with their clients, as some of the literature on street-level bureaucracy has indicated (Evans, 2016), they may still be constrained by formal rules and supervision from organizational superiors.
The Tasks of Bureaucracy Bureaucracies are large and multi-faceted institutions. Any simple notion of them as Kafkaesque, paper-processing institutions will quickly disappear when one examines their tasks and their responsibilities. While they certainly do move a great deal of paper – or now digital – files, they also provide a wide range of public services, including education, healthcare, social services, environmental regulation, and a host of others. We can name these various types of public services provided through public administration, or the public bureaucracy, but we can also think of them as being provided through several fundamental activities.
Implementation The most fundamental task of the public bureaucracy is to implement public policy. This has been their role for centuries, but the concept of implementation has been developed more fully over the past half-century (Pressman and Wildavsky, 1974). In addition to the analytic developments in this literature, there has been a great deal of emphasis on the difficulties inherent in implementing policies in a timely and effective manner. Largely unintentionally, this emphasis on failures in implementation has reinforced the generally negative stereotypes of bureaucracy held both by the public and many academics alike. Implementation remains the principal task for public bureaucracies, and the large majority of the individuals working for the public sector are engaged in implementing policies and are working at the ‘street level’, having direct contact with the public (Lipsky, 2010). Therefore, not only are these public employees crucial for the actual performance of public programs, they are also the primary contact between the State and society. Most citizens see their political leaders infrequently (except on television)
but they are in frequent contact with postal workers, social workers, policemen, teachers, and a host of other street-level bureaucrats (Zacka, 2017). While the bulk of the bureaucracy is engaged in implementation, it is also here where there are numerous attempts to reduce the size of the public sector. The implementation process is increasingly performed by non-governmental actors, whether from the market or from social actors such as churches and voluntary organizations. These collaborative forms of policy implementation have the advantage of reducing the apparent size of government, and they also can leverage the skills and resources of the private sector for public purposes (Donahue and Zeckhauser, 2011). Further, in more democratic terms, collaboration also permits more participation by social actors in policy decisions, albeit at the implementation stage. But we must also recognize that these collaborative forms of implementation can have their pitfalls as well. Perhaps most importantly, the implementation process can, in a manner analogous to regulation, be captured by the partners in collaboration. That is, the goals of the public program may be subverted (by design or by gradual change) by the goals of the partner(s) from the private sector. Capture has been especially important in public–private partnerships but can be an issue in almost any form of collaboration.
Policy Advice A second major function of the public bureaucracy is to provide policy advice to politicians. The public bureaucracy is a major repository of expertise about the policies they administer, while many politicians – especially in a parliamentary system – have little or no experience with the policies for which they are responsible. Therefore, bureaucracies become important sources of advice for their political ‘masters’, and the interface between bureaucratic and political
officials can be extremely important in defining the success of policymaking. However, this conventional role of policy advice for the bureaucracy has become reduced by changes in both the prevailing theories about public administration and the attitudes of politicians. First, the New Public Management (NPM) (see below) emphasized the managerial role of senior public servants and downplayed the policy advice role. The assumption was that political leaders and their own personal advisors should think about policy, while the bureaucracy got on with the task of implementation. In addition to the ideology of NPM, the policy advice function of the bureaucracy was reduced by the increasing ‘presidentialization’ of political executives (Poguntke and Webb, 2007). Parliamentary democracies increasingly became dominated by their cabinets, and especially by their prime ministers. This domination included increasing the size of personal staffs, including policy advisors. The advice that might have come from the bureaucracy then came from those personal staffs that were considered more loyal, albeit perhaps less expert, than the career bureaucracy.
Rule Making Although most citizens might think of the legislature as the source of legal rules, the major source of rules in most countries, even democratic societies, is the public bureaucracy. The rules made by the bureaucracy are a secondary form of legislation made in pursuit of the primary rules issued by the legislature, but they far exceed in volume the laws coming from the political institutions. For example, in the United States, the federal bureaucracy makes thousands of regulations each year to elaborate the rather broad legislation passed by Congress (Kerwin and Furlong, 2019). There are very good reasons for the importance of the bureaucracy in making legal
rules for the society. For example, governments tend to have relatively little capacity to make detailed legislation. This capacity is limited both by restraints on time and by limited technical information. Governments therefore have chosen to delegate a good deal of the rule-making role in society to bureaucracies. In democratic societies, there are always legal and political controls over the actions of bureaucracies in making rules (Page, 2001), but the administrative organizations of government still exercise a great deal of power over public policies.
Rule Adjudication Just as bureaucracies make more rules than legislatures, so too do they try more legal cases than do the formal court systems. These legal cases arise from other activities of the administrative system. Whenever a bureaucrat makes a decision about an individual client, that decision may be subject to a legal proceeding, especially if the claim for benefits is denied or a fine is imposed. Likewise, in many cases where a bureaucratic agency makes a new regulation, that regulation can be contested – for example: did the agency act in an ‘arbitrary and capricious manner’ when they made the decision?4 Many of the administrative law courts are managed and staffed by the same agencies that are making the decisions being contested (Adler, 2003). This appears to create some conflict of interest, although in most instances steps are taken to insulate the legal from the decision-making components of the organization. And there may also be independent administrative law courts and organizations such as the Conseil d’Etat in France which assesses the legality of administrative actions. Further, it may be possible for a citizen or a corporation to appeal decisions in the regular court system if there is a substantial legal or constitutional question involved.
Alternative Forms of Bureaucracy There is some tendency to assume that all public bureaucracies are the same, in part because the Weberian model posits a certain form of bureaucracy. Although the organization of public administration may be more similar than that of legislatures or political executives, there are still marked differences. These differences are found in the organization of the public sector, but more importantly in the underlying patterns of thinking about public administration and its role in governing. Superficially, bureaucracies are the same but, when viewed more closely and more carefully, there are important differences.
Organization The first important dimension of difference among public bureaucracies is their organizational formats. In his discussion of bureaucracy, Weber assumed a hierarchical pattern, and the usual portrayal of bureaucracy is a neat pyramid with some political leader(s) on top and then an orderly hierarchical structure beneath those leaders. There is a unified structure that goes from the top of the organization all the way down to the lowest level field staff who generally are in contact with the clients – or objects – of the organization, whether they be individual citizens, regulated industries, or foreign governments. But there are many other ways in which public administration may be organized. For example, the Scandinavian countries, especially Sweden, have delegated a great deal of the implementation of public policies to autonomous agencies, leaving the ministries to make policy. This plan has been copied widely as a component of NPM (below). In federal systems, much of the implementation function is delegated to the sub-national units, again separating policy and administration. The extent of that delegation varies across federal systems, being extremely high in countries such as Belgium and Germany but low in Australia. Another option in the organization of public bureaucracy is to build supervisory structures into the system in order to ensure conformity by the implementing organizations. The use of prefets in France and other Napoleonic systems to supervise implementation at the local level ensures greater uniformity in the treatment of citizens, albeit involving some potential loss of local autonomy and local democratic control. Communist systems have used the party as a parallel structure to supervise government officials, again to ensure compliance.
Culture and Values
As noted throughout, there is some tendency to think of public bureaucracies, or public administration, as being very similar across countries and levels of government. That, however, tends to assume away a number of important differences among administrative structures and practices. Some of those we have mentioned above when discussing the structures of administration, but the differences go deeper than just structure. At one level, bureaucracy should be understood as a concept that is the same across cultures. If we begin with the basic ideas advanced by Weber, then a bureaucracy can be seen as a standard against which to compare organizations existing in a particular country. Given that bureaucracy was intended to be an Ideal Type rather than a description of any particular organization, the comparative use of the concept arises from the contrast between a real-world organization and the standards contained in the ideal model. That is, a bureaucracy in any setting should be much the same as in any other setting. If, however, we adopt a less strict conception of how to proceed with using the concept, then the various forms of public administration that have been developed
across the world must be compared among themselves, rather than necessarily with the Ideal Type model itself.5 That form of comparison is more common than one based strictly on the Ideal Type, and although it lacks a conceptual standard against which to compare real-world administrative systems, the more direct comparison of those systems may reveal more about them as functioning public bureaucracies. Such a comparison reveals how the administrative systems function in their own terms, rather than those imposed by the researcher. One of the principal sources of difference among administrative systems is the administrative tradition from which they have emerged. Although contemporary administrative systems are heavily influenced by the changing demands of governance, and changing ideas about what constitutes good public administration, they also reflect legacies from the past. In some cases, the inheritance may extend over centuries. Some scholars (Daly, 1968) argue, for example, that patterns of public administration in contemporary Latin America reflect the style of administration of the Spanish conquistadores. There may be as many administrative traditions as there are countries, but there are also some clear groups or families of national patterns. Among Western countries there appear to be four major groups: Napoleonic, Germanic, Scandinavian, and Anglo-American. We can also identify a clear administrative pattern in Islamic countries (Samier, 2017) as well as in Confucian countries (Chau, 1996). Interestingly, although it is to some extent an amalgam of the traditions of the member states, administration in the EU appears to be developing a distinctive pattern of its own (Kassim, 2018).
Levels of Development There are legacies of the past for less developed countries, just as there are for other countries. The legacies of the past are often
the vestiges of their colonial experiences. The post-colonial states in Africa (Young, 2012) reflect in many ways the administrative styles of the previous colonial power, but they also reflect the difficulties of attempting to create a modern bureaucratic system in societies and economies with a shortage of available talent and deeply entrenched patterns of patronage, nepotism, and corruption. Further, the demands on bureaucracy in developing societies may be greater, given the role of the public sector in fostering and managing economic development. One common pattern of administration in less developed countries is the dual existence of modern bureaucratic organizations and practices along with more traditional patterns of administration, especially at the local level. This dualistic structure contains two alternative sources of legitimation for the actions of the public sector. The formal bureaucracy has a rational-legal foundation for action while the use of existing structures in villages maintains traditional sources of authority.
Bureaupathologies As noted at the beginning of this chapter, bureaucracies tend to be unloved, and part of that absence of affection or respect results from the pathologies that have been identified as being endemic in public administration. These pathologies are too numerous to detail (one scholar has a list of over 100; Caiden, 1991) but they can be placed into some broader clusters and discussed in that way. There are also marked differences in the types of pathologies experienced in more or less developed administrative systems. Red tape is one of the most commonly mentioned negative consequences of the bureaucratic style of organization (Bozeman and Feeney, 2011; Gupta, 2012). The complaints regarding red tape argue that decisions within public organizations involve
an excessive amount of time because of the numerous checks and clearances required. Seemingly simple decisions will require multiple officials to agree to them. As already noted, this red tape is both a barrier to effective bureaucracy and a protection both for citizens and also the bureaucrats making the decision. ‘Passing the Buck’ is a pathology within bureaucracies that is similar to red tape. Often for fear of making an incorrect decision, members of the bureaucracy will refer a decision to their superior, and they will refuse to make a decision, passing it on to an even higher level within the organization. One official will eventually have the courage, or at least the responsibility, to make a decision, and the case will be passed back down through the hierarchy for implementation. However, by that point, a great deal of time and energy will have been wasted before a decision is made. Another commonly mentioned pathology of organizations in the public bureaucracy is that they tend to survive long after their utility has been exhausted. There are a number of amusing examples of the survival of organizations for decades longer than needed (Kaufman, 1974) but these examples are not amusing to critics of the public sector. Although the extent of survival of organizations is not as great as sometimes assumed (Peters and Hogwood, 1993), there are certainly good examples of this pathology in most bureaucracies. For all the pathologies mentioned here, and many more, a positive feature of the bureaucratic model of public administration has created unintended consequences (Merton, 1936; Baert, 1991). The greatest single source of unintended consequences is the protection against dismissal from office provided that the individual public servant follows the rules. The fear of not following the rules, therefore, may result in not making any decisions. Bureaucrats are not meant to be risk-takers and are, indeed, supposed to be rule followers. But rules can be more
ambiguous than those who write them think they are. When faced with any ambiguity, therefore, the safest thing for the public employee to do is to do nothing. Even if the individual bureaucrat does make a decision, they are more likely to make a safe one that follows the letter of the law.
Threats to Bureaucracy Bureaucracies tend not to be popular with politicians or with the public, although, as noted above, the civil service as an institution is often more trusted than are politicians and legislatures. Since the inception of public administration there have been attempts to reform the institution. Many of these reforms have attempted to make government more like the presumably more efficient organizations within the private sector, while others have sought to make bureaucracies friendlier to the ordinary citizen and to minimize the pathological elements of bureaucracy mentioned above. In addition to reforms directed at making the public sector more businesslike, there have been attempts to make the bureaucracy less autonomous from political institutions. While the presumed value of a bureaucracy comes in part from its autonomy from political pressures, political leaders generally want to have greater control and to have the bureaucracy conform to the policy preferences of the government of the day. These political pressures on bureaucracy may be exacerbated by the managerial reforms given that a more managerialist bureaucracy will be more concerned with efficiency than with policy preferences of the incumbent government.
General Rejection of the Bureaucratic Model Although the bureaucratic model rarely appears in its pure form, the pattern of public
administration in many countries has contained many of its features. The dominant pattern of public administration depended for decades on hierarchy, authority, and career public servants who were relatively immune to sanctions or dismissal. This model was considered the best way to govern and did provide advantages both for government and for citizens. But it also presented major problems for individuals working in it, as well as for citizens attempting to receive service from it. Internally, individual bureaucrats have seen bureaucracy as an impediment to the development of their careers. Many people want jobs that give them more latitude to make decisions on their own and that offer a lot of variety. Many public employees have found their lives in bureaucracy less fulfilling than they had hoped, perhaps especially those with a strong public-service motivation. Some reforms of the public sector have sought to increase participation by lower level officials and to increase their opportunities for making decisions, but in many public sector organizations there is still little option for individual autonomy. In addition to the internal critique within the system, beginning as early as the 1970s, there have been other, more political, calls for a ‘post-bureaucratic’ model of public sector organizations. On the political right, the continuing critique has been primarily that the bureaucracy engages in empire-building for itself and attempts to evade political controls. The bureaucracy profits from a large state and high levels of public spending (Niskanen, 1971) and therefore has little incentive to be frugal with funds – indeed, it even has incentives to expand its budget. These claims about the bureaucracy have returned with substantial force in the early 21st century, with populist claims in a number of settings, for example the United States and Brazil, about the existence of a ‘Deep State’ that defends its own interests against the people.
Critiques on the political left also argue that the bureaucracy and the bureaucrats defend their own interests. But in these critiques the
complaint is less about the growth of public expenditure than about the failure to provide needed services to the public. The rigid, rulebound nature of bureaucracy is assumed to prevent public servants from serving the public, especially the disadvantaged (Piven and Cloward, 1993). Further, class and cultural differences between bureaucrats and their clients limit the quality of the services provided to those citizens who most need public services.
Managerialism The NPM has perhaps been the major challenge to the traditional form of public bureaucracy over the past several decades (Hood, 1991; Christensen and Lægreid, 2001, 2007). We can date the beginning of this movement to the end of the 1970s, with political leaders such as Thatcher, Reagan, and Mulroney (Savoie, 1994), and there are numerous elements of the reforms associated with NPM that persist. Indeed, in some cases the ideas associated with NPM are still being implemented anew. The fundamental idea behind the NPM is that public management should be more like management in the private sector and, indeed, that the public sector in general should be more influenced by market ideas. The diagnosis for these reforms was that the hierarchy and monopoly of public organizations produced inefficiency and high costs (Peters, 2000). Therefore, to make public administration perform better, some of the defining features of these institutions should be weakened. For example, it was argued that the personnel system should become more like the market, with civil service protections being weakened or eliminated. Likewise, employment in the public sector should cease to be a distinctive career, and individuals should be hired from outside as needed. Following on from that, there has been an emphasis on performance management (Bouckaert and Halligan, 2007) to assess the level of performance of
individuals and organizations, with rewards and sanctions associated with the level of performance. While these managerialist changes might be seen as merely enhancing the efficiency of public administration, they may also threaten some important values in the public sector. Perhaps most fundamentally, as Wallace Sayre argued, public and private management are alike in all unimportant [emphasis added] respects. For example, the threats to the civil service may undermine the independence and probity of administration, and the capacity for managers or political leaders to appoint individuals from outside the career service opens the system up to patronage and politicization (see below). In addition to the emphasis on management, in the NPM there was also a strong emphasis on altering the structure of the public bureaucracy.6 The agency model – small policymaking ministries with autonomous agencies implementing policy – was a part of the NPM prescriptions for the public sector (Lægreid and Verhoest, 2010). Although in many implementations of the agency model the agencies were not as autonomous as sometimes assumed, there was still some separation from the direct political control of the minister. Further, the emphasis on management and therefore on implementation, tends to undervalue some of the other activities of the public bureaucracy. In particular, the NPM de-emphasized the policy advice role of the bureaucracy rather significantly. The assumption was that politicians and their political advisors should make policy, while the bureaucracy would implement those policies. This separation of policy and administration is a classic idea in public administration but tends to underestimate the knowledge about policy that exists within the bureaucracy. It also tends to over-estimate the policy knowledge of most political leaders and many of their advisors (Rose, 1976). Although the NPM seriously challenged the bureaucratic form for organizing the
public sector, the post-NPM reforms from the 1990s onwards have tended to rediscover some of these elements of bureaucracy. The creation of the ‘Neo-Weberian State’ (Lynn, 2008) was intended to bring together the best features of NPM with the formal accountability and the merit-based administration associated with Weberian bureaucracy. In addition, there was the need to restore the center of government after the pursuit of decentralization and deconcentration in the implementation of the NPM reforms (Peters, 2004).
Politicization I noted above that managerialism may lead to some weakening of the merit principles that have been central to the development of civil service systems and public administration. The pressures from managerialism are, however, only part of a larger set of pressures that seek to undermine merit-based civil service systems in favor of more politicized patterns of public employment (Peters and Pierre, 2004; Neuhold et al., 2013). Even in countries with deeply ingrained merit systems, there have been changes that allow more political appointments to public office and more political influence over the behavior of all public administrators. Of course, in some instances, more politicized systems of public administration have been in place for decades or centuries, despite the advocacy of merit systems by reformers within the governments and by external donor organizations. The administrative systems of Africa (Olowu, 2003) and Latin America (Panizza et al., 2019) continue to be characterized by high levels of patronage. In some instances, this can be functional for the performance of the political system, enabling government to hire – even if for just a short period – more talented individuals than they might be able to, given the relatively low wages in the public sector. But in other cases, these patronage appointments are just ‘Jobs for the Boys’ (Grindle, 2012).
There has also been some continuing increase in the politicization of the public service in Europe and Asia, which have previously depended largely on merit appointments. The level of ‘democratic backsliding’ in the public bureaucracy has been especially noticeable in former socialist countries such as Hungary and Poland (Kopecky et al., 2012) but there has also been some increased politicization in other European countries. In Asia, Japan has been moving away from its bureaucratically dominant style of politics to a more politicized style of governing. With that changing style has come a greater openness to patronage appointments in the public sector. Contemporary versions of politicizing the bureaucracy have perhaps been more fundamental than simply placing political appointees in positions that might be thought to be more appropriately filled by a career civil servant. The various populist regimes that have come to power in Europe, as well as North and South America, have taken aim at the bureaucracy and the ‘administrative state’ as a central barrier to a government that serves ‘the people’. These attacks on the bureaucracy, and the desire for more direct political control, are politicizing the public sector more intensely. The point above about the capacity to hire more talented individuals raises a more general point that can be overlooked when considering merit systems and patronage. This is simply that patronage is not always as evil as some believers in the strict bureaucratic model might argue. In addition to the capacity to recruit talent into government for a short time – even when civil servants are talented and well-paid – patronage can provide other benefits. In democratic terms, patronage permits elected officials to have greater control over the administration of their policies and over the policy advice they receive.
Further, using patronage appointments rather than a formal merit system may enable governments to accelerate attempts to make the public sector more representative (Peters, 2015).
Corruption The third major challenge to the bureaucratic model of public administration is corruption. The legal foundations and the emphasis on, and the adherence to, rules imply that bureaucracy will function with a great deal of probity. The presence of a strict hierarchy within bureaucratic organizations also helps to control the behavior of individuals within it and hold them accountable for their actions. The problem is that these controls may in practice be weak, and that there are also strong incentives for individuals within the government to enrich themselves through the misuse of their offices. Despite continuing efforts to eliminate corruption, it remains endemic in many public bureaucracies (Berman, 2011). In many cases, this is because of very low wages paid to public employees who must then find ways to make a living wage. This petty corruption by lower level officials can be contrasted with the large-scale corruption that occurs at the top of the hierarchy by political and administrative leaders alike. These officials may have a reasonable wage yet want more, desiring incomes more like those found in the private sector. Both forms of corruption, however, erode public confidence in public administration, and reduce the legitimacy of the public sector in general. Although corruption is often economically based, it can also be cultural. That is, despite the ideas of honesty and probity usually found in bureaucracies, the acceptance of corrupt practices is endemic in some cultures (Rose-Ackerman, 2010). This is especially true for forms of corruption such as nepotism that involve benefitting family members of the public official. When there is a strong commitment to the family, then not providing a cousin or a niece a public job can be seen as corrupt. It is very easy to condemn corruption, but it is sometimes more difficult to understand the roots of that apparent misuse of public power.
Given that corruption can be deeply ingrained in administrative systems, how can it be overcome, and a more bureaucratic system of administration established? If we look at some of the more successful cases in rooting out corruption, several strategies become apparent. One is to pay members of the civil service adequately, so they have less incentive to augment their income through accepting bribes. Second, vigorous enforcement of anti-corruption laws is essential and can be achieved through autonomous anti-corruption organizations. Finally, an emphasis on training and on creating an esprit de corps among civil servants can also help to overcome the temptations to receive extra income.
Bureaucratic Effectiveness and Policy Capacity Although widely denigrated in popular literature and also in some academic literature, the bureaucracy is key to the governing capacity of political systems. Legislatures and political executives may be capable of making policy and perhaps monitoring the effects of those policies, but to make government work effectively, they require a public bureaucracy. More precisely, they require a public bureaucracy that can take rules and policy ideas coming from the political actors and make them work in the economy and society. To do this requires a bureaucracy that can perform the rather daunting tasks mentioned above. Bureaucracies in developing and transitional societies tend to face the most significant problems in creating policy and governance capacity. Those bureaucracies are often working in societies with well-established patterns of clientelism and patronage that make developing a professional and effective civil service difficult. Those societies may also have a history of endemic corruption that siphons off scarce resources and
increases public discontent with, and distrust of, the public sector. Moreover, they are often working with limited resources – human and financial – that make the achievement of policy goals difficult or impossible. Further, donor organizations have imposed many of the same reform programs, such as NPM, on bureaucracies in these countries with apparently little concern about whether they can implement them effectively (Manning, 2001), often with the effect of exacerbating problems rather than ameliorating them. A number of empirical analyses have established a link between the form of the public bureaucracy and the capacity of governments to manage economic development. For example, Evans and Rauch (1999) argued that the ‘Weberianness’ of public bureaucracies in developing societies was closely associated with the level of growth. They focused on the personnel aspects of the Weberian model, notably merit recruitment and the existence of a career system for public employees. In particular, countries such as the Asian ‘Little Tigers’ that have been able to create extremely well-qualified public bureaucracies have been able to prosper while growth in many other countries has faltered. More recently, scholars have tended to discuss the role of a strong state in development more than focusing on the bureaucracy per se (Acemoglu and Robinson, 2012; Bardham, 2016). However, in these analyses, the bureaucracy is an essential element of that strong state. International organizations such as the World Bank have also emphasized the need to reduce corruption in the public bureaucracy and to create a more Weberian state as essential to development and good governance. In these analyses the Weberian bureaucracy is significant not only for its capacity to process the work of the public sector but also as a means of controlling corruption and establishing the rule of law.
Other scholars have extended the analysis to link the bureaucracy with the capacity to deliver good governance. Dahlström and Lapuente (2017) demonstrate the linkage
between the Weberian model of bureaucracy – again conceptualized primarily in terms of the personnel system – and several measures of good governance. The evidence does provide substantial support for the idea that a Weberian bureaucracy is associated with better performance by government as a whole. The merit system for recruiting bureaucrats is often justified in terms of producing political neutrality, but it can also be related to the performance of government. But why should a better bureaucracy necessarily produce better government performance? First, the bureaucracy may be central in advising the political elites on what policies to make and in helping them avert policy disasters. The task of the policy analyst to ‘speak truth to power’ (Wildavsky, 1987)7 is also a task for the public administrator. Members of the bureaucracy have experience and often have expert knowledge which politicians may lack, and hence their role in policy advice can be essential to creating effective policy. Unfortunately, as already noted, many contemporary reforms of the public sector have tended to remove, or at least diminish, the policy role of the bureaucracy. These reforms may also tend to reduce the quality of the policies being produced (Page, 2001). The central role of the bureaucracy in implementation is also essential for effective governance. Implementation is not a simple, almost mechanical, enforcement of the law but also requires good judgment and, increasingly, political skills. Some policies can be enforced in the manner of a ‘machine’ organization (Mintzberg, 1995), applying the same rules again and again to relatively similar cases. While many of these repetitive functions in government organizations are now largely managed by information technology, the image of the bureaucrat sitting at a desk endlessly stamping forms still abounds, especially in less developed countries. 
That mechanical form of implementation is not where the real skills of the implementing bureaucrat come into play. Implementation
increasingly involves collaboration with private actors, whether they be market-based or not-for-profit (Ansell and Gash, 2007). And this collaboration in turn requires some bargaining and negotiation among the parties involved. For example, the use of public–private partnerships involves the capacity of the public sector to work with private actors in programs involving large amounts of money and significant levels of discretion in the use of those funds. Even when implementation does not involve non-State actors, it may still involve a great deal of judgment on the part of the public employees. Public bureaucrats have a great deal of discretion, and how they exercise that discretion is crucial for the success or failure of programs and for a more pervasive effect on public trust of government. If we assume, for example, that the police are uniformed, and perhaps armed, bureaucrats, then they must exercise very high levels of discretion, often in dangerous situations and with limited time. Even when not in extremis, public employees such as social workers and school teachers make largely autonomous decisions that have profound consequences on the lives of citizens – those decisions also determine how effective government is as an institution. Third, bureaucracies make rules using the powers delegated to them by legislatures, and this power can enhance their effectiveness in governance. However, this power also heightens the possibilities of conflict with other institutions in the public sector, especially in an era of distrust of government in general and bureaucracy in particular. Although governed by legal constraints such as the Conseil d’Etat in France and the Administrative Procedure Act in the United States, and bringing a great deal of expertise to bear on the writing of the regulations, rule making through the bureaucracy appears undemocratic and a usurpation of powers.
Likewise, the bureaucracy also adjudicates millions of cases each year concerning their own application of the law (Bruff, 1991).
Much as rule making can be seen as usurping legislative powers, the use of these powers within the bureaucracy has been seen as usurping the powers of the courts. Further, as administrative courts may be located within the same organizations that are parties to the adjudication, these courts can be seen as having a conflict of interest. But the power of the bureaucracy to manage so much of the business of governance not only enhances its own power, but also makes the overall conduct of public business more efficient. For example, the Department of Veterans Affairs in the United States tries millions of cases each year, a number that would overwhelm the regular court system. In addition to these specific tasks performed by the bureaucracy, the presence of a permanent and professionalized institution can promote the policy capacity of a government. The very factors that cause critics to bemoan the place of the public bureaucracy in governance – permanence, stability, insulation from immediate political pressures – may also be the factors that are essential for it to play the important role that it does in governing. Political leaders are not necessarily elected because of their knowledge about public policy or public management, so having the permanent expert bureaucracy influencing policy choices will tend to benefit governance. The danger in thinking about linking bureaucracy to the performance of the public sector is that this may appear to assume that better performing governments will be dominated by the executive – that is, be perhaps less democratic. Certainly, while many less democratic regimes do rely more on their bureaucracies than do more democratic governments in more economically developed countries, even in democratic regimes a more capable bureaucracy is related to better governance.
Despite the importance of the bureaucracy for governance, many contemporary populist governments have been attacking the role of the bureaucracy and the remainder of the ‘Deep State’, albeit with no
alternatives for improving the performance of government (Peters and Pierre, 2018).
Conclusion: Bureaucracy as Virtue and Vice As should be apparent from reading this chapter, bureaucracy can be a virtuous form of organization, but it also has the potential for being pathological. The form of organization may not be the cause of these more or less virtuous outcomes per se, but it is rather the leadership of the organizations and the contexts within which they function that are the principal causes. Bureaucracy as a form of organization is relatively value free, but in practice it can have significant normative consequences, both positive and negative. Moreover, the presence of an effective, Weberian public bureaucracy appears to be related to the capacity for economic development. The ability of the public bureaucracy to manage the economy and to impose the rule of law creates conditions in which businesses can thrive, but there is the danger that too much bureaucracy can stifle economic development. For example, if there is too much red tape and too many rules, this will make managing businesses difficult and slow the development process. What appears a virtue at one level of institutional development may be a vice when taken too far. The difficulty, of course, is recognizing when the bureaucratic model of organization has been taken to the extreme. In summary, bureaucracy is a powerful weapon for government and a potential impediment to the effective delivery of services to citizens. It can be conceptualized as the highpoint of modernity or as the barrier to creating more humane and effective public programs. The form of organization is in and of itself neutral, but the way public sector bureaucracies are managed, and the popular reactions to public organizations, can have a major impact on performance.
Notes
1 Without malice or consideration.
2 As already noted, the concept of bureaucracy had a strong element of modernization just as did the pattern variables of Talcott Parsons, including ascription vs. achievement, and particularistic vs. universalistic criteria.
3 Fortunately, as I will discuss below, there are very few organizations with this degree of rigidity in contemporary governments and the tendency has been to expand the discretion exercised by public employees.
4 This is the language of the Administrative Procedures Act in the United States, but other systems of administrative law contain similar statements.
5 This represents, to some extent, the move from classical categorization to the ‘family resemblance’ style of conceptualization (Collier and Mahon, 1993).
6 The first major implementation of the agency model was in the UK. This was meant to be a copy of a long-standing model of administration in Sweden, but the version developed in the UK did not provide nearly as much autonomy to the agencies as the original model.
7 Although now applied to policy analysis, this term was originally used by the Quakers (the Society of Friends) to describe their commitment to non-violent resistance to evil.
References
Acemoglu, D. and Robinson, J. A. (2012) Why Nations Fail: The Origins of Power, Prosperity, and Poverty (New York: Crown Business). Adler, M. (2003) A Socio-legal Approach to Administrative Justice, Law and Policy 25, 323–52. Ansell, C. and Gash, A. (2007) Collaborative Governance in Theory and Practice, Journal of Public Administration Research and Theory 18(4), 543–571. Baert, P. (1991) Unintended Consequences: A Typology and Examples, International Sociology 6, 201–10. Bardhan, P. (2016) State and Development: The Need for a Reappraisal of the Current Literature, Journal of Economic Literature 54, 862–92. Berman, E. M. (2011) Public Administration in Southeast Asia: Thailand, Philippines,
Malaysia, Hong Kong and Macao (Boca Raton, FL: CRC Press). Bouckaert, G. and Halligan, J. (2007) Managing Performance: International Comparisons (Abingdon: Routledge). Bozeman, B. and Feeney, M. K. (2011) Rules and Red Tape: A Prism for Public Administration Theory and Research (Armonk, NY: M. E. Sharpe). Bruff, H. H. (1991) Specialized Courts in Administrative Law, Administrative Law Review 43, 329–66. Caiden, G. E. (1991) What Really is Public Maladministration, Indian Journal of Public Administration 37, 1–16. Chau, D. M. (1996) Administrative Concepts in Confucianism and their Influence on Development in Confucian Countries, Asian Journal of Public Administration 18, 45–69. Christensen, T. and Lægreid, P. (2001) New Public Management. The Transformation of Ideas and Practice (Aldershot: Ashgate). Christensen, T. and Lægreid, P. (2007). Transcending New Public Management. The Transformation of Public Sector Reform. (Aldershot: Ashgate). Collier, D. and Mahon, J. E. (1993) Conceptual ‘Stretching’ Revisited: Adapting Categories to Comparative Analysis, American Political Science Review 87, 845–55. Dahlström, C. and Lapuente, V. (2017) Organizing Leviathan: Politicians, Bureaucrats and the Making of Good Government (Cambridge: Cambridge University Press). Daly, G. (1968) Prolegomena on the Spanish American Political Tradition, Hispanic American Historical Review 48, 37–58. Donahue, J. D. and Zeckhauser, R. G. (2011), Collaborative Governance: Private Roles for Public Goals in Turbulent Times (Princeton, NJ: Princeton University Press). Du Gay, P. (2005) Bureaucracy and Liberty: State, Authority and Freedom, in P. Du Gay, ed., The Values of Bureaucracy (Oxford: Oxford University Press), pp. 104–22. Eisenstadt, S. N. (1963) The Political Systems of Empires (New York: Free Press). Eliason, S. (2000) Max Weber’s Methodology: An Ideal Type, Journal of the History of the Behavioral Sciences 36, 241–63. Evans, P. and Rauch, J. E. 
(1999) Bureaucracy and Growth: A Cross-National Analysis of
the Effects of ‘Weberian’ State Structures on Economic Growth, American Sociological Review 64, 748–65. Evans, T. (2016) Professional Discretion in Welfare Services: Beyond Street Level Bureaucracy (London: Routledge). Goodsell, C. T. (2015) The New Case for Bureaucracy (Washington, DC: CQ Press). Grindle, M. S. (2012) Jobs for the Boys: Patronage and the State in Comparative Perspective (Cambridge, MA: Harvard University Press). Gupta, A. (2012) Red Tape: Bureaucracy, Structural Violence and Poverty in India (Durham, NC: Duke University Press). Hall, R. H. (1963) The Concept of Bureaucracy: An Empirical Assessment, American Journal of Sociology 69, 32–40. Hinings, B. and Meyer, R. E. (2018) Starting Points: Intellectual and Institutional Foundations of Organization Theory (Cambridge: Cambridge University Press). Hood, C. (1991) A Public Management for all Seasons?, Public Administration 69, 3–19. Kassim, H. (2018) The European Commission as an Administration, in E. Ongaro and S. Van Thiel, eds., The Palgrave Handbook of Public Administration and Management in Europe (London: Palgrave), pp. 978–1013. Kaufman, H. A. (1974) Are Government Organizations Immortal? (Washington, DC: The Brookings Institution). Kaufman, H. A. (1981) Fear of Bureaucracy: A Raging Pandemic, Public Administration Review 41, 1–9. Kerwin, C. M. and Furlong, S. (2019) Rulemaking: How Government Agencies Write Law and Make Policy, 5th ed. (Washington, DC: CQ Press). Kopecky, P., Mair, P. and Spirova, M. (2012) Party Patronage and Party Government in European Democracies (Oxford: Oxford University Press). Lægreid, P. and Verhoest, K. (2010) Governance of Public Sector Organizations: Proliferation, Autonomy and Performance (Basingstoke: Palgrave Macmillan). Lepsius, M. R. (2013) ‘Institutionenalyse und Instituionenpolitik’ [Institutional analysis and institutional policy], in M. R. Lepsius, ed., Institutionaliserung politischen Handeln (Wiesbaden: Springer).
Lipsky, M. (2010 [1970]) Street-Level Bureaucracy: Dilemmas of the Individual in Public Services (New York: Russell Sage Foundation). Lynn, L. E. (2008) What is a Neo-Weberian State?, NISPACEE Journal of Public Administration and Policy 1, 17–30. Manning, N. (2001) The Legacy of New Public Management in Developing Countries, International Review of Administrative Sciences 67, 297–312. Merton, R. K. (1936) The Unintended Consequences of Purposive Social Action, American Sociological Review, 1, 894–904. Metzger, G. E. (2017) 1930s Redux: The Administrative State Under Siege, Harvard Law Review 131, 1–96. Mintzberg, H. (1995) Structure in Fives: Designing Effective Organizations (Englewood Cliffs, NJ: Prentice Hall). Nef, J. (2007) Public Administration and Public Sector Reform in Latin America, in B. G. Peters and J. Pierre, eds., The Handbook of Public Administration, 2nd ed. (London: Sage), pp. 476–89. Neuhold, C., S. Vanhoonacker, S. and Verhey, L. (2013) Civil Servants and Politics: The Delicate Balance (Basingstoke: Macmillan). Niskanen, W. (1971) Bureaucracy and Representative Government (Chicago, IL: Aldine/ Atherton). Olivier de Sardan, J. P. (2008) State Bureaucracy and Governance in Francophone West Africa: An Empirical Diagnosis and Historical Perspective, in G. Blundo and P.-Y. Le Meur, eds., The Governance of Daily Life in Africa (Leiden: Brill), pp. 78–99. Olowu, D. (2003) African Governance and Civil Service Reforms, in N. Van de Walle, N. Ball, and V. Ramachandran, eds., Beyond Structural Adjustment: The Institutional Context of African Development (London: Macmillan), pp. 134–62. Olsen, J. P. (2008) The Ups and Downs of Bureaucratic Organization, Annual Review of Political Science 11, 13–37. Page, E. C. (2001) Governing by Numbers: Delegated Legislation and Everyday Policymaking (Oxford: Oxford University Press). Panizza, F., Ramos, C. and Peters, B. G. (2019) Party Professionals, Programmatic Technocrats, Apparatchiks and Agents: A Typology
of Modalities of Patronage in Comparative Perspective, Public Administration (forthcoming). Perry, J. L. and Hondeghem, A. (2008) Motivation in Public Management: The Call of Public Service (Oxford: Oxford University Press). Peters, B. G. (2000) The Future of Governing, 2nd ed. (Lawrence, KS: University of Kansas Press). Peters, B. G (2004). Back to the Centre? Rebuilding the State? Political Quarterly 75, 130–40. Peters, B. G. (2015) Political Patronage, Machine Politics and Ethnic Representativeness in the Public Sector, in P. Von Maravić, B. G. Peters and E. Schröter, eds., The Politics of Representative Bureaucracy (Cheltenham: Edward Elgar) pp.113–122. Peters, B. G. and Hogwood, B. W. (1993) The Death of Immortality: Births, Deaths and Marriages in the US Federal Bureaucracy, American Review of Public Administration 18, 119–33. Peters, B. G. and Pierre, J. (eds) (2004), The Politicization of the Civil Service: The Quest for Control (London: Routledge). Peters, B.G. and Pierre, J. (2018) Populism and Public Administration. Paper presented at Annual Conference of the American Political Science Association. Boston, MA, September 2018. Pierre, J. (2019) Institutions, Politicians or Ideas?: To Whom or What are Public Servants Expected to be Loyal?, British Journal of Politics and International Relations. 21(3): 487–93.
Piven, F. F. and Cloward, R. A. (1993) Regulating the Poor: The Functions of Public Welfare, updated ed. (New York: Vintage). Poguntke, T. and Webb, P. (2007) The Presidentialization of Politics: A Comparative Study of Modern Democracies (Oxford: Oxford University Press). Pressman, J. L. and Wildavsky, A. (1974) Implementation (Berkeley, CA: University of California Press). Rose, R. (1976) The Problem of Party Government (London: Macmillan). Rose-Ackerman, S. (2010) Corruption: Greed, Culture and the State, Yale Law Journal Online 120, 125–40. https://papers.ssrn. com/sol3/papers.cfm?abstract_id=1648859. Accessed 9/13/2019 Samier, E. (2017) Islamic Public Administration Tradition: Theoretical and Practical Dimensions, Administrative Culture 18, 53–71. Savoie, D. J. (1994) Thatcher, Reagan and Mulroney: In Search of a New Bureaucracy (Pittsburgh, PA: University of Pittsburgh Press). Wildavsky, A. (1987) Speaking Truth to Power: The Art and Craft of Policy Analysis (Boston, MA: Little, Brown). Young, C. (2012) The Postcolonial State in Africa: Fifty Years of Independence, 1960– 2010 (Madison: University of Wisconsin Press). Zacka, B. (2017) When the State Meets the Street: Public Services and Moral Agency (Cambridge, MA: Harvard University Press).
58 Corruption Bo Rothstein
Corruption and the Political Science Discipline Since the mid 1990s, many international aid and development organizations have become interested in issues related to the problem of corruption. Since corruption tends to be a sensitive issue, the ‘coded language’ for this policy re-orientation has been to stress the importance of ‘good governance’. In academic circles, concepts such as ‘institutional quality’, ‘quality of government’ and ‘state capacity’ have also been used. However, a central problem in this discussion is a serious lack of conceptual precision (Rothstein and Varraich, 2017). Moreover, until the late 1990s, the interest in researching political corruption in political science and related disciplines, such as public administration and policy analysis, was very modest. As shown in Figure 58.1, the total number of articles published in journals listed in the major bibliographical database, Thomson ISI, that had the term ‘political corruption’ in the title, as
a keyword or in the abstract for the year 1992, was 87. Since the database covers about 1,700 scholarly social science journals, each publishing about fifty articles per year, this is a surprisingly low number. As stated as late as 2006 by one of the most prominent political scientists in this field, Michael Johnston (2006: 809): ‘American political science as an institutionalized discipline has remained steadfastly uninterested in corruption for generations’. This lack of an interest in issues about corruption can also be seen from a look at the many handbooks in political science that have been published during the last decade. Looking at ten important Oxford Handbooks in areas related to political science, none has a chapter, a section of a chapter or even an index entry on the term ‘corruption’.1 This lack of interest in research about corruption in political science stands in sharp contrast to what seems to be the opinion of the ‘general public’. According to a BBC poll in 2010, surveying 13,353 respondents in 26 countries, corruption is the most talked
Figure 58.1 Corruption as keyword in scientific articles (bar chart; x-axis: year, 1992 to 2016 in four-year steps; y-axis: number of articles, 0 to 2,000; bar labels: 87, 197, 252, 334, 715, 953, 1813)
about issue globally, surpassing issues such as climate change, poverty and unemployment (Katzarova, 2011). One reason why political science should pay attention to corruption has to do with results that are unexpected and, for many – including this author – also normatively unwelcome. The problem pertains to the effects of democratization. The waves of democracy that have swept across the globe since the mid 1970s have brought representative democracy to places where it seemed inconceivable 50, 30 or even 10 years ago. This is certainly something to celebrate but there are also reasons to be disappointed. One example is South Africa, which miraculously managed to end apartheid in 1994 without falling into a full-scale civil war. As Nelson Mandela said in one of his speeches, the introduction of democracy would not only liberate people but also greatly improve their social and economic situation. The slogan that his political party (ANC) used in the first democratic elections was ‘a better life for all’ (Mandela, 1994: 414). Available statistics give a surprisingly bleak picture of this promise. Since 1994, the country has not managed to increase the time that children on average go
to school by even a single month. Economic inequality remains at a world record level, life-expectancy is down by almost six years and the number of women that die when they give birth has more than doubled.2 Simply put, for many central measures of human well-being, the South African democracy has not delivered. Another example has been provided by Amartya Sen in an article comparing ‘quality of life’ in China and India. His disappointing conclusion is that on most standard measures of human well-being, the communist-autocratic People’s Republic of China now clearly outperforms liberal and democratically governed India (Sen, 2011). Using a set of 30 standard measures of national levels of human well-being, and also some variables known to be related to human well-being such as the capacity for taxation, including data from between 75 and 169 countries, Holmberg and Rothstein (2014) find only weak, no, or sometimes even negative, correlations between these standard measures of human well-being and the level of democracy as defined above. Maybe the most compelling evidence about the lack of positive effects of democracy on human well-being comes from a study about child deprivation
by Halleröd et al. (2013) using data measuring seven aspects of child poverty from 68 low- and middle-income countries for no less than 2,120,734 cases (children). The result of this large study shows that there is no positive effect of democracy on the level of child deprivation for any of the seven indicators (access to safe water, food, sanitation, shelter, education, health care and information3). The picture that emerges from the available measures is this: representative democracy is not a safe cure for severe poverty, child deprivation, economic inequality, illiteracy, being unhappy or unsatisfied with one’s life, infant mortality, short life-expectancy, maternal mortality, lack of access to safe water or sanitation, gender inequality, low school attendance for girls, low interpersonal trust or low trust in parliament. Why is this so? One explanation was given by Larry Diamond in a paper presented at the National Endowment for Democracy in the United States as it celebrated its first twenty-five years of operations:
There is a specter haunting democracy in the world today. It is bad governance – governance that serves only the interests of a narrow ruling elite. Governance that is drenched in corruption, patronage, favoritism, and abuse of power. Governance that is not responding to the massive and long-deferred social agenda of reducing inequality and unemployment and fighting against dehumanizing poverty. Governance that is not delivering broad improvement in people’s lives because it is stealing, squandering, or skewing the available resources. (Diamond, 2007: 119)
If we follow Diamond’s shift of focus from representative democracy and turn to measures of corruption and the quality of government (henceforth QoG), the picture of what politics can do for human well-being changes dramatically. For example, the above-mentioned study on child deprivation finds strong effects of measures of quality of government on four out of seven indicators on child deprivation (lack of safe water, malnutrition, lack of access to health care and lack of access to information), controlling for GDP per capita and a number of basic
individual-level variables (Halleröd et al., 2013). Other studies largely confirm that various measures of control of corruption and quality of government have strong effects on almost all standard measures of human well-being, including subjective measures of life satisfaction (aka ‘happiness’) and social trust (Norris, 2012; Holmberg and Rothstein, 2012; Helliwell et al., 2018). Recent studies also find that an absence of violence, in the form of interstate and civil wars, is strongly affected by levels of quality of government and more so than by the level of democracy (Lapuente and Rothstein, 2014; Teorell, 2015; Norris, 2012). As Sarah Chayes (2015) has pointed out, corruption is an important cause behind the rise of terrorist and insurgent military groups that has hitherto been ignored both by research and in the academic analyses of security policy. Some may argue that the normative reasons for representative democracy should not be performance measures like the ones mentioned above, but political legitimacy. If people have the right to change their government through ‘free and fair elections’, they will find their system of rule legitimate (Rothstein, 2011). Here empirical research delivers perhaps an even bigger surprise, namely that democratic rights do not seem to be the most important cause behind people’s perception of political legitimacy. Based on comparative survey data, several studies show that variables such as control of corruption, the integrity of public officeholders and the rule of law trump democratic rights and representation in explaining political legitimacy (Gilley, 2009; Gjefsen, 2012; Dahlberg and Holmberg, 2014; Murtin et al., 2018). The argument is certainly not that representative democracy is unimportant, but without a reasonably competent, impartial, uncorrupt and effective public administration, representative democracy is unlikely to deliver human well-being.
Thus, if the relevance of research in political sciences is understood as to how it may improve human well-being and/or improve political legitimacy, research
has to a large extent been focusing on the least important part of the political system, namely how the ‘access to power’ is organized (that is, electoral and representative democracy and processes of democratization). This focus on ‘input’ variables ignores the more important part of the state machinery for increasing human well-being, namely how power is exercised or, in other words, the quality of how the state manages to govern society. As argued by Fukuyama (2013), this seems to have been driven by an underlying ideological view, inspired by neo-classical economics and particularly strong in the United States, that emphasizes the need to limit, check and control (and also minimize) the state, which is basically seen as a ‘predatory’ organization. In other words, how to ‘tame the beast’ has been the central focus, not what ‘the animal’ can achieve.
Corruption as Taboo
In the late 1960s, the Swedish economist (and Nobel Laureate) Gunnar Myrdal pointed out that the term corruption was ‘…almost taboo as a research topic and is rarely mentioned in scholarly discussions of the problems of government and planning’ (Myrdal, 1968: 937–51). According to Myrdal, there were different reasons for the lack of an academic focus on corruption, especially for research concerning developing countries – one being a general bias of ‘diplomacy in research’. This ‘diplomacy’ stems from the historical setting of when Myrdal’s article was published, that is, in the midst of the Third Wave of Democratization. The fact that Myrdal’s essay formed part of a lengthy book titled ‘Asian Drama: An Inquiry into the Poverty of Nations’, which focused on the Asian continent, is evidence of the prevalent prejudice of the time, which may explain the need felt by many academics to remain ‘neutral’ or ‘diplomatic’, thereby avoiding touching upon a sensitive issue like corruption. Matters were of a similar nature
on the policy front, where this type of reasoning was also utilized by international organizations, such as the World Bank, effectively avoiding research and discussion of the topic. The official stance of these organizations was that problems related to corruption constituted ‘a national issue’ that was beyond the purview of the organization’s mandate, which stated that interference into national political issues was not allowed. As Pearson points out, the reluctance of these institutions to address corruption can also be attributed to their ‘perception of themselves as politically neutral, the limitations of their charters and because of the sensitivities of many of their member States’ (Pearson, 2013: 31). This all changed when former World Bank President James D. Wolfensohn redefined corruption as an economic problem in the mid 1990s. In an interview in 2005, he stated the following: ‘Ten years ago, when I came here, the Bank never talked about corruption, and now we are doing programs in more than a hundred countries, and it is a regular subject for discussion’ (cited in Holmberg and Rothstein, 2012: 287). It is noteworthy that it was a policy organization that broke the ‘taboo’ surrounding corruption and that academia, and not least the political science discipline, came around later.
Definitions of Corruption
Corruption is an age-old concept, perhaps as old as human civilization. The first chapter in the recent volume titled ‘Anticorruption in History’ starts with the sentence ‘In 323 BCE, a corruption scandal erupted across Athens’ (Taylor, 2017: 21). Another chapter in this volume points to the many anti-corruption measures established in the later Roman Republic (Arena, 2017). The stability over time of the general understanding of corruption is presented in another chapter in the following way: ‘In the later Middle Ages, as today, behavior identified as
corrupt characteristically involved two things: the promotion of one’s own interests above those of the public and the bending of rules or official powers under the influence of bribery or affection’ (Watts, 2017: 94). Analyses of what counted as corruption in very distant pasts, such as the Roman Empire or 13th-century France, give the impression of it not being different from contemporary notions of the concept (MacMullen, 1988; Jordan, 2009). This line of thought indicates that the concept of corruption is not specifically Western or new, and reinforces the view that it is very much universal and not limited to the modern, liberal West. The classic conception of corruption as a general disease of the body politic was also central to the thinking of political theorists such as Machiavelli, Montesquieu and Rousseau, aptly described by Friedrich (1972) in mapping the historical evolution of the concept.
A Universal Definition of Corruption?
As pointed out by the Council of Europe, ‘no precise definition can be found which applies to all forms, types and degrees of corruption, or which would be acceptable universally as covering all acts which are considered in every jurisdiction as contributing to corruption’ (Pearson, 2013: 36). This poses many problems, one of which is bringing together the different forms of corruption such as, for example, clientelism, patronage, nepotism and patrimonialism into one comprehensive analytical landscape. Philosophically, concepts such as these all share a ‘core’ with corruption, which appears to be why they are often examined hand in hand. It is perhaps in light of this that some scholars have attempted to identify a ‘core’ that can be pinned down and which binds these different forms of corruption together, thereby going beyond the cultural or relativist understandings that tend to dominate
within much of the empirical research. It is true that using the same concept for a situation when a policeman demands a small sum in return for not giving a speeding ticket and the huge sums that are reported to be paid for securing governments’ arms deals can seem a bit awkward. However, biologists call hummingbirds, hens, eagles and ostriches ‘birds’, but they are, to say the least, quite different ‘birds’. This analogy may indicate that it is not the size of the matter that is important, but some qualitative core or aspect of the phenomenon we want to define. The development of the international anticorruption regime since the late 1990s has not been without its critics. One point that has been stressed in this critique is that the international anti-corruption agenda represents a specific Western, liberal ideal that is not easily applicable to countries outside that part of the world (Bukovansky, 2006; Bratsis, 2003; Hindess, 2005; de Maria, 2010). There are at least two arguments against this type of relativistic conceptual framework. The first is normative and based on the similar discussion in the areas of human rights and democracy; the right not to be discriminated against by public authorities, the right not to have to pay bribes for what should be free public services and the right to be treated with ‘equal concern and respect’ by the courts are in fact not very distant from what counts as universal human rights. For example, for people that do not get the health care they are entitled to because they cannot afford the bribes the doctors demand, corruption can result in a life-threatening situation. The same can occur for citizens that do not get protection by the police because they do not belong to the ‘right’ group. The second reason against a relativistic definition of corruption is empirical.
Although the empirical research in this area is not entirely unambiguous, most of it points to the quite surprising result that people in very different cultures seem to have a very similar notion of what should count as corruption. Survey results from regions in India and in sub-Saharan Africa show that
people in these societies take a very clear stand against corruption. For example, the Afrobarometer survey asked whether respondents considered it ‘not wrong at all’, ‘wrong but understandable’ or ‘wrong and punishable’ if a public official: (1) decides to locate a development project in an area where his friends and supporters live; (2) gives a job to someone from his family who does not have adequate qualifications; and (3) demands a favor or an additional payment for some service that is part of his job. The result was that between 60 and 74% of the more than 25,000 respondents in the 19 sub-Saharan countries deemed all three acts both ‘wrong and punishable’ (Rothstein and Varraich, 2017: 47). Widmalm (2005, 2008) finds similar results in a survey study of rural villages in India. Widmalm finds that the Weberian civil servant model (impartial treatment of citizens, disregarding income, status, class, caste, gender and religion), although such a civil servant is an absent figure in these villages, has surprisingly strong support among the village population for handling common issues. In other words, the idea that the public acceptance of what is commonly understood as corruption varies significantly across cultures does not find much empirical support (see also Rotberg, 2017). Especially in so-called post-colonial theory, the existence of a universal understanding of corruption has been questioned (Apata, 2018; de Maria, 2010). However, it can be pointed out that in Frantz Fanon’s classic book ‘The Wretched of the Earth’, which in many ways is ideologically the most important and founding text for the post-colonial approach to development issues, the author himself points to corruption among the new political elite as a serious malady for West Africa.
In Fanon’s words: ‘Scandals are numerous, ministers grow rich, their wives doll themselves up, the members of parliament feather their nests and there is not a soul down to the simple policeman or the customs officer who does not join in the great procession of corruption’ (Fanon, 1967: 67).
The reluctance of many scholars in the post-colonial approach to look at corruption as a serious problem is thus difficult to understand. In sum, there are both normative and strong empirical grounds for opting for a universal understanding of corruption and its opposite. However, this does not exclude that there are different types of corruption and that the connection between corruption and the political system can differ. This is not different from saying that while we can have a universal definition of what constitutes representative democracy, the specific institutional configuration of democracies varies a lot. The Swiss democracy is institutionally very different from the Canadian version, which in turn is different from what they have in Denmark. The reason why people, although condemning corruption, participate in corrupt practices seems to be that they understand the situation as a ‘collective action’ problem where it makes little sense to be ‘the only one’ that refrains from using or accepting bribes and other kick-backs (Persson et al., 2010). As Gunnar Myrdal stated in his analysis of the ‘soft state’ problem, in relation to developing countries in the 1960s, ‘Well, if everybody seems corrupt, why shouldn’t I be corrupt’ (Myrdal, 1968: 409). In his anthropological study of corruption in Nigeria, Jordan Smith (2007: 65) concludes that ‘although Nigerians recognize and condemn, in the abstract, the system of patronage that dominates the allocation of government resources, in practice people feel locked in’. It makes little sense to be the only honest policeman in a severely corrupt police force, or to be the only one in the village who does not pay the doctor under the table to get one’s children immunized if everyone else pays.
This may also be caused by a distinction pointed out by Bauhr (2017) between need corruption, which she defines as paying a bribe to get a service (like health care) that you are legally entitled to, and greed corruption, which is demanding a bribe for a service that you otherwise would not give even though it is your legal obligation to do so.
Measuring corruption
Unsurprisingly, there has been a very long debate about the possibility and meaningfulness of measuring corruption. The very first comparative measure of corruption was produced by the international anti-corruption organization Transparency International (TI) in the mid 1990s and was mainly based on surveys of experts. TI’s corruption perception index is compiled annually. Many other research institutes and organizations are producing similar indexes of either corruption or related issues such as the rule of law, government effectiveness, quality of government, the level of ‘black market’ activity and fairness in the system for public procurement. While corruption is certainly difficult to conceptualize in a way that makes it possible to operationalize and measure its frequency and prevalence, so are many other central concepts in the social sciences (democracy, power, conflict, equality etc.) and there is no reason why corruption should be more complicated than other important concepts. The main impression from the discussion is that at the country level, the existing measurements of corruption and related concepts, such as the rule of law etc., correlate at a surprisingly high level (Fazekas and Kocsis, 2017; Holmberg et al., 2009). Moreover, measures mainly based on expert surveys and surveys with representative samples of the population also correlate at a very high level (Svallfors, 2013; Charron, 2016). The problem is that in some countries, for example Italy, there are surprisingly large differences between different regions (Charron et al., 2018). That implies that giving a country like Italy one single figure for its level of corruption seriously underestimates the corruption problem in some parts of the country (and of course overestimates it in other parts of the country). The same seems to be the problem when corruption in various sectors, such as education, health care and the police, is measured.
Some sectors in a country are hurt much more by corruption than others. The conclusion is that while corruption, like other central concepts in the social sciences, is difficult to measure, the strong correlations between different measures give us some assurance that the problem should not prevent us from using these data, though with the necessary caution. The regional and sectoral variation that exists is mainly not a measurement problem but a question of how finely grained a measure it is possible to produce.
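The two points made above, that different measures can correlate strongly at the country level while a single national figure hides regional variation, can be illustrated with a minimal Python sketch. All numbers, index names and region names below are invented for illustration and are not taken from TI, the QoG data or any other real source; the `pearson` helper is simply the textbook Pearson correlation coefficient.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six countries (0 = very corrupt, 100 = very clean).
expert_index = [88, 75, 62, 40, 33, 21]    # invented expert-survey index
citizen_survey = [85, 70, 66, 45, 30, 25]  # invented population-survey index

# Hypothetical regional scores inside one country (same 0-100 scale).
regions = {"North": 72, "Centre": 58, "South": 31}

r = pearson(expert_index, citizen_survey)
national = mean(regions.values())
spread = max(regions.values()) - min(regions.values())

print(f"correlation between the two measures: {r:.2f}")
print(f"national average: {national:.1f}, regional spread: {spread}")
```

In this invented example the two indexes rank countries almost identically, yet the single national average sits far from the score of the lowest-scoring region. That is the sense in which the problem is one of granularity rather than of measurement validity.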
The public goods approach to corruption

One way to understand why there seems to exist a universal understanding of what should count as corruption, despite its enormous variation in type, frequency and location, is what we would call a public goods approach to this problem. In all societies and cultures, in order to survive, all groups of people have had to produce at least a minimal set of public goods, such as security measures, a basic infrastructure or organized/collective forms for the provision of food. The very nature of a good being ‘public’ is that it is to be managed and distributed according to a principle that is very different from that of private goods. The public good principle implies that the good in question should not be distributed according to the private wishes of those who are given the responsibility of managing it. When this principle for the management and distribution of public goods is broken by those entrusted with the responsibility for handling the public goods, the ones that are victimized see this as malpractice and/or as corruption. This is why corruption is a concept that is related to the political and not the private sphere, and why it is different from (or a special case of) theft and breaches of trust in the private sector. Much of the confusion about cultural relativism in the discussion about what should count as corruption stems from the issue
that what should count as ‘public goods’ differs between societies and cultures. For example, in an absolutist feudal country where the understanding may be that the central administration is the private property of the lord or king, the state is not seen as a public good. In many indigenous societies with non-state political systems, local communities have usually produced some forms of public goods, for example to take care of what Ostrom (1990) defined as ‘common pool resources’: natural resources that are used by members of the group but which risk depletion if overused. Such resources are constantly faced with a ‘tragedy of the commons’ problem and are thus in need of public goods in the form of effective regulations to prevent overuse leading to depletion. Our argument departs from the idea that it is difficult to envision a society without some public goods. The point is that when these public goods are handled as, or converted into, private goods, this is generally understood as corruption, independent of the culture. A conclusion that follows is that we should not expect people in developing countries, whether indigenous or not, to have a moral or ethical understanding of corrupt practice that differs from, for example, the dominant view in Western societies. An example could be the case in which there is no system of taxation, yet certain individuals have been selected to perform functions as arbitrators or judges. These functions are to be understood as public goods because they make it possible to solve disputes between village members and/or families in a non-violent way. These arbitrators may, in several cases, receive gifts from the parties involved for their services. Such gifts may, to a Westerner, look like bribes, but they are usually not seen as bribes by the agents, who make a functional distinction between bribes and gifts (Sneath, 2006; Werner, 2000).
The reasons why they are not seen as bribes by the local villagers are: (1) the gifts are publicly given, and (2) there is a culturally defined level for how big such a gift can be. This implies that the gift
is to be seen as a fee for a service, not a bribe. It would only be a bribe, and seen as such by the local populace, if it was given in a way to influence adjudication by favoring one party over another. In this case the public good is converted into a private one and it is this which is perceived as corruption. To support this argument, Rothstein and Torsello (2014) have used data from the Human Relations Area Files (HRAF), which is the single most comprehensive and largest ethnographic database of world cultures. The HRAF database has been compiled by Yale University and includes data on 258 world cultures and over 600,000 pages of ethnographic descriptions made by professional anthropologists. The cultures covered are divided among 8 world regions: Africa, Asia, Europe, Middle America and the Caribbean, the Middle East, North America, Oceania and South America. Their analysis shows that the word ‘bribe’ is found in 113 of the 258 cultures – that is 48% of the whole HRAF sample when excluding European countries. It is also found in all four general types of societies (foragers, horticulturalists, pastoralists and agriculturalists). The agriculturalists’ societies/cultures (which are also monetarized and commercial) contain the largest number of bribery entries, which supports the thesis that corruption is widely spread where public and private arrangements for the use of and access to resources and goods can be expected to vary. Even more interesting is their finding that pastoralist societies are apparently the least exposed to corruption among the subsistence types. This also supports the ‘public goods’ theory since it is in this economic type of society that one should expect to find the least ambiguity between private goods (herds and land) and public goods.
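The two figures quoted above for the word ‘bribe’ (113 of 258 cultures, and 48%) rest on different bases, which is easy to miss. The short Python check below makes the arithmetic explicit. It assumes, as the text indicates, that the 48% refers to the sample with European cultures excluded; the size of that reduced sample is not given in the text, so the value computed here is only the base implied by the two reported numbers, not a figure from Rothstein and Torsello (2014).

```python
# Figures reported in the text.
cultures_total = 258       # cultures in the HRAF database
cultures_with_bribe = 113  # cultures in which the word 'bribe' is found
reported_share = 0.48      # share stated for the sample excluding Europe

# Share of all 258 cultures.
share_of_all = cultures_with_bribe / cultures_total

# Assumption: back out the sample size that would make 113 cultures equal 48%.
implied_base = cultures_with_bribe / reported_share

print(f"share of all 258 cultures: {share_of_all:.1%}")
print(f"sample size implied by the 48% figure: about {implied_base:.0f}")
```

On this reading, roughly 44% of all cultures in the database contain the word, and the 48% figure corresponds to a somewhat smaller, non-European subsample.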
Anti-corruption policy

From a policy perspective, anti-corruption policy is looking quite good compared to
the situation in the 1990s. The ‘institutional turn’ in the social sciences, embodied for example by Nobel Laureates Douglass C. North and Elinor Ostrom, paved the way for studying why some societies had ‘good’ and others ‘not so good’ institutions. Thanks to their work, we now have quite good theoretical models of why collectively dysfunctional institutions (formal and informal) are common, stable and detrimental to prosperity and human well-being. There is also now a large amount of comparative, and to some extent longitudinal, data as well as many good case studies and historical studies of corruption and anti-corruption. Compared to the 1990s, we have more stringent laws against corruption in many countries and more countries with special anti-corruption units, and the United Nations Convention Against Corruption, signed in 2003, has now been ratified by more than 170 countries. In addition, many national and international development and aid organizations have put anti-corruption high on their agenda. Thus, compared to the situation 20 years ago, there is room for quite some optimism since many of the ‘weapons’ needed in this conflict now seem to be in place. However, the results ‘on the ground’ from the many anti-corruption policies and initiatives that have been launched so far have not been that impressive. It seems to be very difficult to trace any major results from the many ‘good governance’ programs that the World Bank and many other international development organizations have launched since the mid 1990s. Alina Mungiu-Pippidi (2015: 207) summarizes the situation in this way: ‘By and large, the evaluations piling up after the first fifteen years of anti-corruption work showed great expectations and humble results’.
Francis Fukuyama (2014: 25) adds that the international development and aid community ‘would like to turn Afghanistan, Somalia, Libya and Haiti into idealized places like “Denmark” but it doesn’t have the slightest idea of how to bring this about’. In
a recent book, Dan Hough (2017: 123) adds that ‘success stories are depressingly thin on the ground’. There are countries that have improved, but not much of the change can be attributed to any donor-led programs or initiatives (Rotberg, 2017). A particularly troubling result is that democratization seems not to be a safe cure against corruption. Keefer and Vlaicu (2008: 378) show that: ‘…in 2004 more than one-third of all democracies exhibited as much or more corruption than the median non-democracy’. Their argument builds on the fact that in a country that has recently democratized, politicians have little or no reputation and thus no means of making credible electoral promises to the citizenry. Politicians must therefore rely on local patronage networks and provide targeted goods to their supporters in a direct exchange for votes. In other words, in order to attain office and to stay in power, they undermine the quality of public institutions by, for example, handing out public sector jobs and/or targeting benefits directly to their presumed political supporters. Consequently, a young and fragile democracy will typically overprovide targeted goods, such as public sector jobs and public works projects, while at the same time underproviding non-targeted goods, such as universal health care, education, the rule of law and the protection of property rights (Keefer, 2007). This argument is supported by a recent study by D’Arcy showing that between 1985 and 2008 scores for measures of corruption for countries in sub-Saharan Africa increased to a considerable extent and that this negative development is ‘primarily driven by the 38 countries which have experienced increased levels of democracy’ (D’Arcy, 2015: 111).
Systemic corruption, which is when citizens and public officials take corrupt practice as being the default position when they interact, thus seems to be a hardened and difficult enemy, implying that current anti-corruption strategies are in need of some serious rethinking (cf. Persson et al., 2010).
What should be the opposite of corruption?

One way to handle the definitional problem is to figure out what we should think of as the opposite of corruption. More precisely, what state of affairs are we looking for when we can say that corruption is under control? One should be clear that a society free from corruption is as likely as a society free of crime. But as with crime, there are important degrees. The intentional homicide rate in Brazil is more than 30 times higher than in the Nordic countries.4 As mentioned above, the problem with the standard definitions is that what should count as ‘abuse’ or ‘misuse’ of public office is not specified. Or to put it otherwise, the norm that is transgressed when we can say that an act is corrupt is not defined. One solution is ‘good governance’, but this becomes problematic because the term ‘governance’ is used very differently in different approaches in the social sciences (Rothstein and Varraich, 2017: 125). An alternative has been put forward by Rothstein and Teorell (2008) in what they call ‘the quality of government’ (QoG) approach. In thinking about what should count as the opposite of corruption, Heywood and Rose (2015) have made an important point, namely that we would not be satisfied with just ‘no-corruption’. The reason is that this is far too low a threshold for what we ought to demand from those who are entrusted to handle our ‘public goods’. Our demand is not just that these agents should refrain from corruption, but that we as citizens (and taxpayers) are entitled to expect something more than the absence of corruption. This could of course just be competence, but we can argue that there should be a more basic normative standard for how people entrusted to provide public services ought to behave. There are four reasons a normative definition is necessary. First, terms like ‘good’ (as in ‘good governance’) or ‘quality’ (as in ‘quality of government’), not to mention corruption,
are inherently normative. Something is ‘good’ or has high/low ‘quality’ in relation to a certain norm (or norms) and it is therefore necessary to specify this norm. To state that something or someone is corrupt is doubtless a normative judgement. Trying to define good governance while ignoring the normative issue of what should constitute ‘good’ simply defies logic. Second, the empirical results show that when people make up their mind on whether they find their governments legitimate, how a state’s power is exercised turns out to be more important for them than their rights pertaining to the ‘access’ side of the political system. Moreover, as mentioned above, the procedures at the ‘output side’, such as the rule of law and the absence of corruption, turn out to be more important for political legitimacy than the ‘outcomes’ in the form of public services or benefits. Since perceptions of political legitimacy are inherently normative, we have to conceptualize this norm (or norms). It should be noted that the legitimacy of how the access side of a democratic system should be organized is, according to Robert Dahl, based on a single basic norm, namely political equality, which in practice is equal voting rights and an equal right to stand for office (Dahl, 1989, 2006). Thus, if the procedures that take place at the ‘output side’ of the political system are more important for citizens when they make up their mind on whether their government is to be considered legitimate than the procedures at the ‘input side’, we should be able to find the parallel basic norm for this part of the political system. Obviously, it cannot be ‘political equality’ since most laws and public policies entail that citizens should be treated differently (pay different taxes, get different benefits, subsidies and services dependent on their specific situation and circumstances).
Third, the risk with empirical definitions is that they have a tendency to become equal to the outcome we want to explain so that, in practice, they become tautological. One example is the definition of good institutions provided by Acemoglu and
Robinson (2012). Their well-known argument is that it is institutions of a certain kind that promote economic prosperity. Such institutions should ‘secure private property, an unbiased system of law, and a provision of services that provides a level playing field in which people can exchange and contract’. Moreover, such institutions ‘also must permit the entry of new business and allow people to choose their careers’. The list goes on: the institutions that are needed for economic prosperity should also ‘distribute power broadly in society’ and ensure that ‘political power rests with a broad coalition or plurality of groups’ (Acemoglu and Robinson, 2012: 73, 80). The problem with this definition is that it is very close to what the theory intends to explain. It seems self-evident that a society with such ‘inclusive’ institutions will be a good and prosperous society. What they are saying is that a good society will produce a good (or prosperous) society. The central issue is this: if a society decides to organize its public sphere according to a certain norm (or set of norms) which states, for example, who will work in this administration and according to which principle(s) civil servants and professionals will make decisions, will this result in higher organizational capacity and competence? Furthermore, will this make it more likely that the politicians will entrust this administration with a certain degree of autonomy? The empirical answer to these questions seems to be in the affirmative. For example, if civil servants are recruited based on the norm of impartiality, which means that merits relevant to the job in question are what decide recruitment and promotion, this will lead to higher competence and thus to higher state capacity, which in turn is likely to lead to increased levels of human well-being (Dahlström and Lapuente, 2017).
Thus, the procedural principle of impartiality translates in practice into meritocracy which, inter alia, leads to increased competence and capacity in the public sector. The question raised by Fukuyama (2013: 349) – that is, if impartiality as the basic norm for how the
state interacts with its citizens will result in increased state capacity – is thus no longer only ‘simply asserted’, but empirically grounded. Simply put, there are now a number of reasonably good empirical indicators showing that impartiality as the procedural norm leads to better outcomes in terms of lower corruption and higher state capacity. A fourth reason for a normative definition of QoG, instead of pointing at specific empirically existing institutions, is that if we look at countries that are judged to have high levels of QoG, their political and legal institutions, as well as their systems of public administration, show remarkable variation (Andrews, 2013). This implies that simply exporting such institutions (or a specific state’s institutional configuration) from high QoG to low QoG countries will not work to improve QoG. When this has been tried, the results have not been encouraging (Andrews, 2013). The reason seems to be that it is not the specific institutional configuration of the state and the public administration, but the basic norm under which the institutions operate, that is the crucial factor.
Political procedures or policy substance?

Is QoG something that should be defined by reference to a set of political procedures or should it be defined by reference to certain policies or outcomes? An example of the latter is the well-known definition of ‘good governance’ provided by Daniel Kaufmann and colleagues at the World Bank which, among other things, includes ‘sound policies’ (Kaufmann et al., 2009). Political philosophers, on the other hand, have argued for including the ‘moral content’ of the enacted laws or policies in the definition (Agnafors, 2013). The well-known problem with any substantive definition of democracy, and thereby of QoG, is why people, who can be expected to have very different views about
policies, should accept them. Since we are opting for a definition that can be universally accepted and applied, including specific policies becomes problematic. To use Rawls’ terminology, political legitimacy requires an ‘overlapping consensus’ about the basic institutions for justice in a society so that citizens will continue to support them even when they have incommensurable conceptions of ‘the meaning, value and purpose of human life’ and even if their group would lose political power (Rawls, 2005: 135). This is of course less likely to be the case if specific (sound) policies or moral content of the laws are included in the definition of QoG. Including, as the World Bank does, ‘sound policies’ in the definition also raises the quite problematic question of whether international (mostly economic) experts really can be expected to be in possession of reliable answers to the question of what ‘sound policies’ are. For example, should pensions, health care or education be privately or publicly funded (or some mix of these)? To what extent and how should financial institutions be regulated? Second, such a definition of QoG, which is not restricted to procedures but includes the substance of policies, raises what is known as the ‘Platonian-Leninist’ problem. If those with superior knowledge decide policies, the democratic process will be emptied of most substantial issues. The argument against the ‘Platonian-Leninist’ alternative to democracy has been put forward by one of the leading democratic theorists, Robert Dahl, in the following way: ‘its extraordinary demands on the knowledge and virtue of the guardians are all but impossible to satisfy in practice’ (Dahl, 1989: 85). All this implies that a strictly procedural definition of QoG is to be preferred. 
This also follows from the ambition to strive for a definition of QoG that is parallel to how the ‘access side’ for liberal representative democracy is usually defined, which speaks for a strictly procedural definition. The system known as liberal representative democracy should not in itself favor any specific
set of policies or moral standards (except those that are connected to the democratic procedures as such). There is, however, a well-known drawback to all procedural definitions of political processes for decision-making, namely that they cannot offer a guarantee against morally bad decisions. As is well known, there is no guarantee that perfectly democratically made decisions in a representative democracy will not result in severe violations of the rights of minorities and individuals. As Mann (2005) has argued, there is a ‘dark side’ to democracy. This is also the case for any procedural definition of QoG, be it ethical universalism (Mungiu-Pippidi, 2015), impersonal rule (North et al., 2009), bureaucratic autonomy and capacity (Fukuyama, 2013) or impartiality in the exercise of public power (Rothstein and Teorell, 2008). In this procedural approach, we think the strategy suggested by John Rawls is the right one. His central idea is that if a society structures its procedures for making and enforcing collective decisions in a fair way, this will increase the likelihood that the outcomes are normatively just. As Rawls stated: ‘… substantive and formal justice tend to go together and therefore that at least grossly unjust institutions are never, or at any rate rarely, impartially and consistently administered’ (Rawls, 1971: 59).
Conclusions: The Opposite to Corruption as Quality of Government

A state regulates relations to its citizens along two dimensions. One is the ‘input’ side which relates to the access to public authority. This is where we in democracies find rules about elections, party financing, the right to stand for office and the formation of cabinets. The other side of the political system is the ‘output’ side and refers to the way in which that political authority is exercised. On the input side, where the access to
power and thereby the content of policies is determined, as stated above, the most widely accepted basic regulatory principle has been formulated by Robert Dahl (1989) as being that of political equality. This is also John Rawls’ (2005) basic idea on how to construct a non-utilitarian society based on his well-known principles of justice. Political equality certainly implies impartial treatment on the input side of the system, and this makes political equality and impartiality partially overlapping concepts (Rawls, 2005). Elections have to be administered by the existing government, but if they are to be considered free and, in particular, fair, the ruling party must refrain from organizing them in a manner that undermines the opposition’s possibilities to obtain power. That is, in order to be seen as legitimate, free and fair, elections must be administered by impartial government institutions (Norris, 2012; Schedler, 2002). But again, the impartial organization of elections does not imply that the content or outcome of this process is impartial. On the contrary, the reason why many, if not most, people are active in politics is that they are motivated by very partisan interests. A working democracy must thus be able to implement the partisan interests produced by the input side of the system in an impartial way. In this context, impartiality is not a demand on actors on the input side of the political system, but first and foremost an attribute of the actions taken by civil servants, various professional groups in public service, law enforcement personnel and the like. In order to effectuate this ideal, it may, however, also be necessary to inscribe impartiality as an ideal into the mindset of these actors. Empirically, this is actually what many governments try to do.
In an analysis of ethical codes for the public administration from 22 countries (of which a majority are non-Western countries), it was found that these codes show remarkable similarity regarding their core values, and that impartiality is the most important value throughout these 22 codes (Rothstein and Sorak, 2017).5
To see why the impartiality definition of QoG is universal, it is useful to compare it to Dahl’s idea of political equality as a basic norm for democracy. Every particular democratic state is, in its institutional configuration, different. It should suffice to point at the extreme variation in the electoral systems in, for example, the Swiss, Danish and British democracies. There are, in fact, innumerable ways of organizing a national democracy (presidentialism vs. parliamentarism, uni- vs. bicameralism, proportional vs. majoritarian electoral systems, variation in the power of the courts, federalism vs. unitarism, the role of referendums, the strength of local governments, etc.). As long as the principle of equality in the access to power is not violated (for example by giving one specific political party the right to rule, or by refusing to give some specific group of citizens the right to stand for office or take part in the public debate), we call political systems as different as those of Finland and the United States democracies. The reason is that all institutional arrangements on the input side in a representative democracy should be possible ways to ensure ‘political equality’. Impartiality as the parallel legitimizing and defining principle for the ‘output’ side can in a similar way also encompass various administrative practices. As shown by Andrews (2013), the specific administrative and organizational configurations of governments deemed to be of high quality are in effect quite different. QoG as impartiality is of course in line with the idea of a procedural definition, which means that it can encompass very different policies and does not rule out support for specific groups or interests.
Notes

1 They are: The Oxford Handbook of Political Science, The Oxford Handbook of Comparative Politics, The Oxford Handbook of Public Policy, The Oxford Handbook of Comparative Institutional Analysis, The Oxford Handbook of Political Theory, The Oxford Handbook of the Welfare State, The Oxford Handbook of Political Institutions, The Oxford Handbook of Law and Politics, The Oxford Handbook of Political Psychology and The Oxford Handbook of Political Leadership.
2 Data from the Quality of Government Data Bank, www.qog.pol.gu.se. Accessed 3 March 2019.
3 Children with no access to radio, television, telephone or newspapers at home are defined as lacking information.
4 United Nations Office on Drugs and Crime, Statistics 2017, https://www.unodc.org/unodc/en/data-and-analysis/statistics.html. Accessed 14 December 2018.
5 The others are Openness, Integrity, Legality, Loyalty, Equal Treatment, Reliability, Service and Professionalism.
References

Acemoglu, Daron, and James A. Robinson. 2012. Why Nations Fail: The Origins of Power, Prosperity and Poverty. London: Profile.
Agnafors, Marcus. 2013. Quality of Government: Towards a More Complex Definition. American Political Science Review 107(3): 433–55.
Andrews, Matt. 2013. The Limits of Institutional Reform in Development: Changing Rules for Realistic Solutions. Cambridge: Cambridge University Press.
Apata, Gabriel O. 2018. Corruption and the Postcolonial State: How the West Invented African Corruption. Journal of Contemporary African Studies 37(1): 34–56.
Arena, Valentina. 2017. Fighting Corruption: Political Thought and Practice in the Late Roman Republic. In Anticorruption in History, ed. R. Kroeze, A. Vitória and G. Geltner. Oxford: Oxford University Press, pp. 35–49.
Bauhr, Monika. 2017. Need or Greed? Conditions for Collective Action against Corruption. Governance: An International Journal of Policy, Administration and Institutions 30(4): 561–81.
Bratsis, Peter. 2003. The Construction of Corruption, or Rules of Separation and Illusions of Purity in Bourgeois Societies. Social Text 21(4): 9–33.
Bukovansky, Mlada. 2006. The Hollowness of Anti-corruption Discourse. Review of International Political Economy 13(2): 181–209.
Charron, Nicholas. 2016. Do Corruption Measures Have a Perception Problem? Assessing the Relationship Between Experiences and Perceptions of Corruption among Citizens and Experts. European Political Science Review 8(1): 147–71.
Charron, Nicholas, Victor Lapuente, and Bo Rothstein. 2018. Mapping the Quality of Government in Europe. Stockholm: Swedish Institute for European Policy Studies. SIEPS report 2018: 2.
Chayes, Sarah. 2015. Thieves of State: Why Corruption Threatens Global Security. New York: W.W. Norton & Company.
D’Arcy, Michele. 2015. Rulers and Their Elite Rivals: How Democratization Has Increased Incentives for Corruption in Sub-Saharan Africa. In Elites, Institutions and the Quality of Government, ed. C. Dahlström and L. Wängnerud. New York: Palgrave Macmillan, pp. 111–128.
Dahl, Robert A. 1989. Democracy and Its Critics. New Haven: Yale University Press.
Dahl, Robert A. 2006. On Political Equality. New Haven: Yale University Press.
Dahlberg, Stefan, and Sören Holmberg. 2014. Democracy and Bureaucracy: How their Quality Matters for Popular Satisfaction. West European Politics 37(3): 515–37.
Dahlström, Carl, and Victor Lapuente. 2017. Organizing the Leviathan: How the Relationship Between Politicians and Bureaucrats Shapes Good Government. Cambridge: Cambridge University Press.
de Maria, William. 2010. Why is the President of Malawi Angry? Towards an Ethnography of Corruption. Culture and Organization 16(2): 145–62.
Diamond, Larry. 2007. A Quarter-Century of Promoting Democracy. Journal of Democracy 18(4): 118–20.
Fanon, Frantz. 1967. The Wretched of the Earth. Harmondsworth: Penguin Books.
Fazekas, Mihály, and Gábor Kocsis. 2017. Uncovering High-Level Corruption: Cross-National Objective Corruption Risk Indicators Using Public Procurement Data. British Journal of Political Science: 1–10. Published online by Cambridge University Press: 24 August 2017.
Friedrich, Carl J. 1972. The Pathology of Politics: Violence, Betrayal, Corruption, Secrecy and Propaganda. New York: Harper & Row.
Fukuyama, Francis. 2013. What is Governance? Governance: An International Journal of Policy, Administration and Institutions 26(3): 347–68.
Fukuyama, Francis. 2014. Political Order and Political Decay: From the Industrial Revolution to the Globalization of Democracy. First ed. New York: Farrar, Straus & Giroux.
Gilley, Bruce. 2009. The Right to Rule: How States Win and Lose Legitimacy. New York: Columbia University Press.
Gjefsen, Torbjorn. 2012. Sources of Legitimacy: Quality of Government and Electoral Democracy. Department of Political Science, University of Oslo, Oslo.
Halleröd, Björn, Bo Rothstein, Adel Daoud, and Shailen Nandy. 2013. Bad Governance and Poor Children: A Comparative Analysis of Government Efficiency and Severe Child Deprivation in 68 Low- and Middle-income Countries. World Development 48: 19–31.
Helliwell, John F., Haifang Huang, Shawn Grover, and Shun Wang. 2018. Empirical Linkages Between Good Governance and National Wellbeing. Journal of Comparative Economics 46: 1332–1346.
Heywood, Paul M., and Jonathan Rose. 2015. Curbing Corruption or Promoting Integrity? Probing the Hidden Conceptual Challenge. In Debates of Corruption and Integrity, ed. P. Hardi, P. M. Heywood and D. Torsello. New York: Palgrave Macmillan, pp. 102–119.
Hindess, Barry. 2005. Investigating International Anti-corruption. Third World Quarterly 26(8): 1389–98.
Holmberg, Sören, and Bo Rothstein, eds. 2012. Good Government: The Relevance of Political Science. Cheltenham: Edward Elgar.
Holmberg, Sören, and Bo Rothstein. 2014. Correlates of The Level of Democracy. Gothenburg: The Quality of Government Institute, University of Gothenburg. QoG Working Paper 2014: 18.
Holmberg, Sören, Bo Rothstein, and Naghmeh Nasiritousi. 2009. Quality of Government: What You Get. Annual Review of Political Science 13: 135–62.
Hough, Dan. 2017. Analysing Corruption: An Introduction. Newcastle upon Tyne: Agenda Publishing.
Johnston, Michael. 2006. From Thucydides to Mayor Daley: Bad Politics, and a Culture of Corruption. PS: Political Science and Politics 39(4): 809–12.
Jordan, William Chester. 2009. Anti-corruption Campaigns in Thirteenth-century Europe. Journal of Medieval History 35: 204–19. Jordan Smith, Daniel. 2007. A Culture of Corruption: Everyday Deception Popular Discontent in Nigeria. Princeton, N. J.: Princeton University Press. Katzarova, Elitza. 2011. The National Origin of the International Anti-Corruption Business. Montreal: International Studies Assocation Annual Convention. Kaufmann, Daniel, Art Kraay, and Massimo Mastruzzi. 2009. Governance Matters VIII: Aggregate and Individual Governance Indicators 1996–2008. Policy Research Working Paper Series 4978. In Policy Research Working Paper Series 4978. Washington, D.C.: The World Bank. Keefer, Philip. 2007. The Poor Performance of Poor Democracies. In The Oxford Handbook of Comparative Politics, ed. C. Boix and S. Stokes. Oxford: Oxford University Press, pp. 886–910. Keefer, Philip, and Razvan Vlaicu. 2008. Democracy, Credibility, and Clientelism. Journal of Law Economics & Organization 24(2): 371–406. Lapuente, Victor, and Bo Rothstein. 2014. Civil War Spain versus Swedish Harmony: The Quality of Government Factor. Comparative Political Studies 47(10): 1416–41. MacMullen, Ramsay. 1988. Corruption and the Decline of Rome. New Haven: Yale University Press. Mandela, Nelson. 1994. Long Walk to Freedom: The Autobiography of Nelson Mandela. London: Little Brown. Mann, Michael. 2005. The Dark Side of Democracy: Explaining Ethnic Cleansing. New York: Cambridge University Press. Mungiu-Pippidi, Alina. 2015. The Quest for Good Governance: How Societies Develop Control of Corruption. New York: Cambridge University Press. Murtin, Fabrice, Lara Fleischer, Vincent Siegerink, Aarstein Aassve, Yann Algan, Romina Boarini, Santiago González, Zsuzsanna Lonti, Gianluca Grimalda, Rafael Hortala, Vallve Soonhee Kim, David Lee, Louis Putterman, and Conal Smith. 2018. Trust and its Determinants. Evidence for the Trustlab experiment. Paris: OECD Statistics Papers 2018/02.
Corruption
Myrdal, Gunnar. 1968. Asian Drama: An Enquiry into the Poverty of Nations. New York: Twentieth Century Fund. Norris, Pippa. 2012. Democratic Governance and Human Security: The Impact of Regimes on Prosperity, Welfare and Peace. New York: Cambridge University Press. North, Douglass C., John J. Wallis, and Barry R. Weingast. 2009. Violence and Social Orders: A Conceptual Framework for Interpreting Recorded Human History. Cambridge: Cambridge University Press. Ostrom, Elinor. 1990. Governing the Commons: The Evolution of Institutions for Collective Action. New York: Cambridge University Press. Pearson, Zoe. 2013. An International Human Rights Approach to Corruption. In Corruption and Anti-corruption, ed. P. Larmour and N. Wolanin. Canberra: Asia Pacific Press, pp. 30–62. Persson, Anna, Bo Rothstein, and Jan Teorell. 2010. The Failure of Anti-Corruption Policies. A Theoretical Mischaracterization of the Problem. QoG Working Paper 2010:19. Gothenburg: The Quality of Government Institute, University of Gothenburg. Rawls, John. 1971. A Theory of Justice. Oxford: Oxford University Press. Rawls, John. 2005. Political Liberalism (expanded edition). New York: Columbia University Press. Rotberg, Robert I. 2017. The Corruption Cure: How Leaders and Citizens Can Combat Graft Princeton: Princeton University Press. Rothstein, Bo. 2011. The Quality of Government: Corruption, Social Trust and Inequality in a Comparative Perspective. Chicago: The University of Chicago Press. Rothstein, Bo, and Nicholas Sorak. 2017. Ethical Codes for the Public Administration (QoG Working Paper 2017:12). Gothenburg: The Quality of Government Institute, University of Gothenburg. Rothstein, Bo, and Jan Teorell. 2008. What is Quality of Government: A Theory of Impartial Political Institutions. Governance: An International Journal of Policy, Administration and Institutions 21(2): 165–90.
985
Rothstein, Bo, and Davide Torsello. 2014. Bribery in Pre-industrial societies: Understanding the Universalism–Particularism Puzzle. Journal of Anthropological Research 70(2): 263–82. Rothstein, Bo, and Aiysha Varraich. 2017. Making Sense of Corruption. Cambridge: Cambridge University Press. Schedler, Andreas. 2002. The Menu of Manipulation. Journal of Democracy 13(2): 36–50. Sen, Amartya. 2011. Quality of Life: India vs. China. New York Review of Books LVIII (2011:25): 44–47. Sneath, David. 2006. Transacting and Enacting Corruption, Obligation and the Use of Money in Mongolia. Ethnos 71(1): 89–122. Svallfors, Stefan. 2013. Government Quality, Egalitarianism, and Attitudes to Taxes and Social Spending: A European Comparison. European Political Science Review (online preview) 5(3): 363–80. Taylor, Clare. 2017. Corruption and Anticorruption in Democratic Athens. In Anticorruption in History, ed. R. Croeze, A. Vitória and G. Geltner. Oxford: Oxford University Press, pp. 21–34. Teorell, Jan. 2015. A Quality of Government Peace? Explaining the Onset of Militarized Interstate Disputes, 1985–2001. International Interactions 41: 648–73. Watts, John. 2017. The Problem of the Personal. Tackling Corruption in Later Medieval England, 1250–1550. In Anticorruption in History, ed. R. Croeze, A. Vitória and G. Geltner. Oxford: Oxford University Press, pp. 91–102. Werner, Cynthia. 2000. Gifts, Bribes and Development in Post-Soviet Kazakhstan. Human Organization 59(1): 11–22. Widmalm, Sten. 2005. Explaining Corruption at the Village Level and Individual Level in India. Asian Survey XLV (5): 756–76. Widmalm, Sten. 2008. Decentralisation, Corruption and Social Capital: From India to the West. Thousand Oaks: Sage.
59 Governance
Carlos R. S. Milani
INTRODUCTION
In 1975, the Trilateral Commission published a major report on the governability of democracies. The report was based on the hypothesis that governability problems, at least in Western Europe, Japan and the United States, were related to the gap between the increase in social demands on the one hand, and the state’s lack of financial, managerial and human capacities and resources on the other. More generally, the report argued that the political crisis in developed societies was due to the acceleration of technological progress and the complexity of their social fabric, conditions to which traditional public management could not respond properly. As a result, the authors recommended that state institutions and governmental policies should change in view of this new reality, but also that individuals should adapt their behavior and expectations towards the state and its capacity to provide public goods (Crozier et al., 1975).
The Trilateral Commission Report aroused interest in international organizations, which have since then broadened the scope from ‘governability’ to ‘governance’, often associating it with adjectives such as ‘good’ and ‘democratic’. Particularly after the end of the Cold War era, governance has also been presented as complementary to global market regulation mechanisms, without pointing out contradictions between market interests, state regulation, social policies and human rights. In addition, the reference to questions of citizens’ participation and public management, without necessarily mentioning the pertinence of contextual politics and the role of states, has frequently made governance a malleable instrument for international organizations and policy experts. Therefore, during the transition between the 1980s and the 1990s, a normative vision of governance was disseminated in World Bank (WB) reports and in action plans or final declarations of United Nations (UN) conferences, programs and agencies.
This international circulation of the concept has not only resulted in biased uses and abuses of governance; it has also reinforced its potential mystification as a social science construct and created the risk of its manipulation and decontextualized use as a world-wide policy recipe. The fact that I acknowledge these biases and abuses related to the diffusion of governance does not mean, however, that contemporary societies are not complex or that the world-system is not composed of many more self-governing subsystems. Nor does it imply that societies are not currently experiencing a deep crisis in their democratic models, especially in terms of representativeness, participation, responsibility, legitimacy, recognition, redistribution and social cohesion. In fact, debates on governance stemming from its global circulation in the field of development cooperation and its policy diffusion by international organizations have seldom reflected the in-depth epistemological and methodological efforts of renewal within political science, in particular in the fields of public policy and international relations, when analyzing problems of collective action. Policy-wise, governance presented normative and prescriptive dimensions related to the world visions supported by national and international agencies, whereas in their scholarly work political scientists attempted to carve out new concepts that would take into consideration both public and private actors, and both governmental and non-governmental interests, in the construction of public goods from local to global levels. In this chapter, I argue that debates on governance, in spite of the effects resulting from its international circulation by interest groups and regional and global organizations, have had the merit of reopening fundamental discussions about the public space from local to international levels.
Such qualified debates have revealed that, in complex societies, the rationale of procedures has become as important as the substance of public decisions and governmental policies. In
fact, governance as a political science concept goes beyond government because it encompasses control mechanisms that are outside the strict jurisdiction and regulatory sphere of the state. It goes beyond liberal democracy because it implies notions of efficiency, sustainability and social justice that should concomitantly feed debates on democracy and development. Governance refers not only to the political system as a whole, but also to subsystems in which social groups control the process or parts of the process from which the actions of diverse stakeholders result. Bearing such origins of the concept and these complexities in mind, this chapter is organized around the following topics: (i) a short history of the subject; (ii) the international circulation of governance and regional differentiation; and (iii) visions, definitions and typologies related to the subject. In the conclusion, I wrap up the main argument of the chapter and present some research implications for political science and international relations.
A short history of the subject
Etymologically, the word ‘governance’ derives from the ancient Greek kybernetes, a term used to define how to conduct, drive or navigate something. The first appearance of the word dates back to the 15th century in French (gouvernance); it came into the Anglo-Saxon world at the end of the 17th century, and since then it has been used in relation to the exercise of power or the activity of government. In the modern world the term has been used to imply the provision of means of collective steering for markets, societies and politics. In 1904, S. Low used the word governance, but gave it the same definition that one could then find for government (Low, 1904). In 1949, J. Fesler defined local governance as linked in many ways with processes of decentralization, since the author considered that decentralization dealt with the distribution of power on a
territorial base (Fesler, 1949). In 1975, M. Crozier, S. Huntington and J. Watanuki emphasized the governability of contemporary market–state relations as a central issue for Western societies (Crozier et al., 1975). In the 20th century, many other publications in the transition between the 1970s and the 1980s stressed the need to improve governance as related to the economic costs of bureaucratic management, to the decentralization of governmental activities, to risk analysis, as well as to the need to increase efficiency in public affairs (Baden and Stroup, 1981; Fleishman, 1982; Henderson and Abney, 1977; Mead, 1977; Mirtimer and Richardson, 1977; Zimmerman, 1982). In the field of economics and management, the term ‘corporate governance’ was also introduced in the 1970s and used as a synonym for a better management of businesses in their relations with public and private stakeholders (Williams, 1975; Sommer, 1977). In this context, it is therefore clear that at the national level, the term has historically been very much associated with public and private efficiency within institutional spheres. In the field of Administrative Law, national and local governments have often been responsible for implementing public policies and programs. Bureaucracies play a central role in shaping the implementation process but may also share it with non-governmental actors, such as local associations and service providers. New Public Management (NPM) has been the main subfield within Public Administration to capture the new meanings of public service provision and public–private partnerships. In some versions of NPM, not only would the managers resume a central role in public policy implementation, but citizens (often conceived as customers) could also be called upon to participate in the process, either as co-implementers or as evaluators of public services. 
Unlike experiences of participatory democracy, in which citizens also play a decision-making role, NPM conceptions and practices of citizens’ or customers’ participation have tended to be
restricted to results-based management and service quality evaluation (Mayntz, 1993; Peters, 2011). In the field of International Relations, the term has varied in its usages and meanings. In general, governance has been presented as a combination of cooperation and coordination efforts among interdependent units (mostly states, but not only). Interdependence would give rise to cooperation or coordination for purposes of economic development, prosperity, international trade, reduction of transaction costs and regional interaction or integration among units (Keohane and Ostrom, 1995). However, in international politics, global governance has also meant that conflicts of interest could arise when units had to define goals or whenever they had to debate its norms (Finkelstein, 1995). The Commission on Global Governance, established by UN Secretary-General Boutros Boutros-Ghali in 1992 and co-chaired by Ingvar Carlsson and Shridath Ramphal, made global governance widely popular in the field of International Relations thanks to the publication in 1995 of the Our Global Neighborhood report. While some scholars considered that global governance implied the pluralization of actors, which have in increasing numbers injected unexpected voices into international discourse about numerous problems of global scope (Gordenker and Weiss, 1995), others criticized the Commission’s excessive ideological confidence in market-oriented public policymaking to solve distributive conflicts and political problems (Broadhead, 1996; Falk, 1995). Indeed, the use of global governance both as a policy tool and as a concept was highly controversial in the 1990s, and the debate around its applications attracted a variety of criticisms stemming from defenders of the sovereignty principle, supporters of a world-federalist viewpoint or those who considered it a rebirth of traditional domination patterns in world politics (Senarclens, 2001; Uvin and Biagiotti, 1996; World Bank, 1994).
In this regard, L. Finkelstein (1995) stated that global governance was presented
as the (new) field of (future) international politics, although its conceptual contours had not clearly been identified and its empirical implications remained under-demonstrated. The author recalled that, due to the interconnectedness of the decision processes among and within states in international politics, there would, in fact, be many new actors, such as non-governmental organizations, which tend to play increasingly significant roles in international negotiations that involve what had once been termed ‘two level games’ (Putnam, 1988) or ‘double edged diplomacy’ (Evans et al., 1993). However, because governance overlapped categories of functions performed internationally (information creation and exchange, formulation and promulgation of principles, good offices and mediation in conflict resolution, regime formation, maintenance of peace and order, etc.), L. Finkelstein wondered what conceptual differences would actually exist between global governance and international regimes in the field of IR. As the author concludes, ‘global governance appears to be virtually anything’ when it comes to its conceptual development (Finkelstein, 1995: 368). Irrespective of such critical remarks, one must recognize that at national, international and global levels, governance is a concept that can be used to respond to economic, social and political changes that have occurred world-wide. Concepts have history and can present secondary layers of meaning in addition to their explicit or primary meaning; however, these changes from local to global levels are all related to what is now encompassed by the multidimensional phenomenon of globalization.
The steering activity implied by the term ‘governance’, whether local, national, regional or global, is usually associated with relationships among governmental actors (together with the formal institutions of the public sector), social actors and market actors, relationships which have changed deeply in the last 30 years. In a nutshell, governance owes its initial success to the diagnosis of post-Second World
War forms of government or public policymaking in Western countries that galvanized its conceptual development. Very often the welfare state and Keynesian-inspired policies were criticized for their inability to cope with the budgetary difficulties arising from the crisis of the years 1975–1990. These essentially redistributive policies consisted of taxing economic sectors to provide benefits to organized groups that were socially and economically disadvantaged. Globally, the collapse of the dollar-gold standard in 1971 and the growing financialization of the economy, among other factors, rendered such welfare policies extremely difficult or impossible to implement within the context of macroeconomic stability precepts. Led by a relatively centralized administrative apparatus, welfare policies functioned in developed countries from the end of the Second World War to the mid 1970s in an economic environment marked by significant growth and in the political context of the Cold War era. In the developing world, these welfare policies were extremely limited in scale (% of GDP) and scope (% of population), and only semi-industrialized countries such as Argentina, Brazil, India, Mexico and Turkey could partially implement such policies. The economic crisis of the 1970s and the end of the Cold War in the transition between the 1980s and the 1990s halted this pro-welfare wave first in the West and later world-wide: inflation continued while growth went down, so that the interplay between equality and efficiency in Western and industrialized countries was no longer a functional equation according to the emerging macroeconomic parameters (Hirschman, 1980; Leca, 1996; Venkataraman, 1994). As a result, there would be at least four difficulties related to governance, as listed by R.
Mayntz (1993): (i) an implementation problem: the inability to implement the rules; (ii) a motivation problem: the refusal of groups to recognize the legitimacy of governmental action; (iii) a knowledge problem: the misunderstanding of causality links between means
and ends; and (iv) a governability problem: the absence or lack of instruments of governmental action. The term governance, whose content was relatively vague from an analytical point of view when the first articles and reports were published, therefore appeared within the framework of a critique of the welfare state and governmental policies. According to the defenders of governance, the solution to the various problems then encountered in developed and developing countries would stem from the recognition of roles played by organized social actors and economic operators that do not necessarily go through the government filter. To speak of governance has therefore meant to emphasize the role of organized groups of actors, be they more or less formal (NGOs, communities, networks, foundations), in the definition and implementation of collective actions aimed at a certain common good which was not comparable to public actions aimed at the public interest implemented by governments. Governance, at both national and global levels, has challenged the central role of states and governments by referring to a set of cognitive and organizational capabilities and techniques to improve decision-making processes and, more broadly, ‘public’ policy-making. Along these lines, it is possible to argue that the actual definition of governance supposes developing a given theory of the state and its role in the public realm – an aspect which some political scientists frequently tend to disregard in their analysis of governance issues.
International circulation and regional differentiation
Aside from the 1995 report published by the Commission on Global Governance, the UN promoted a series of summits in the 1990s, the action plans or final declarations of which regularly called for new governance structures, mechanisms and procedures to fight
against global problems: that was the case of Rio de Janeiro on environment and sustainable development, Copenhagen on social development, Cairo on population, Vienna on human rights, Istanbul on human settlements and Beijing on women. Put together, the UN conferences, their follow-up plans and the Our Global Neighborhood report steered the international circulation of governance as a policy tool that should reflect the different processes and methods through which individuals as well as public and private institutions manage collective problems. The Commission’s report forthrightly stated that governance corresponds to a continuous process by which conflicting interests are regulated and cooperation can be developed. The process comprises the constitution of formal institutions and regimes capable of strengthening the relations of subordination, it also includes informal agreements that people and institutions establish or intend to establish in the protection of their interests. (Commission on Global Governance, 1995: 2)
The United Nations Development Program (UNDP) embraced governance and spread it out in the developing world as the exercise of political, economic and administrative authority in the management of collective problems at different levels. According to the UNDP, ‘sound’ governance would correspond to the complex set of mechanisms, processes, and institutions through which citizens and social movements articulate their interests, exercise their rights, fulfill their duties and resolve differences. So as to be effective, governance should be, inter alia, participatory, transparent, effective, equitable, and founded on legal principles. In addition, sound governance would encompass the state, civil society and the private sector, and be understood in three main perspectives, namely: economic governance, comprising decision-making processes that affect national and international economic activities; political governance, which refers to the design and implementation of development policies; and administrative governance, which includes what the UN frames as innovative management systems (UNDP, 1996).
In the work of the WB, governance was first launched in the late 1980s in several reports dealing with failures of structural adjustment programs (particularly in Africa) and introducing new political conditionalities for international aid. Requiring beneficiary countries to accept and implement ‘good’ governance programs became a first condition for these countries to have access to loans and credits from the Bank (Nunnenkamp, 1995). Afterwards, the WB refined its approach and started setting up surveys, indicators and performance measures to deal with governance in the developing world. It attempted to support performance in developing countries mainly through two types of activities: (i) what the Bank called ‘civic engagement’, which focused on a bottom-up process by which NGOs defined performance outcomes and were involved in assessing the achievement of those outcomes; and (ii) based on New Zealand’s experience, a top-down effort emphasizing efficiency and market-based solutions (Radin, 2007). Therefore, the WB associated governance with social capital and civic culture. By using these concepts as policy instruments in its function as a knowledge bank, the WB sought to explain in general terms the reasons why some societies are more developed than others, often mechanistically and in a neo-Darwinian fashion (Nelson, 2000). One of the issues related to this knowledge-policy bridge-building function deployed by the Bank is that the academic discourse has also concerned itself with both the causes and consequences of quality of government or quality of democracy (Altman and Perez-Liñan, 2002; Meny and Surel, 2002; Morlino, 2011; Putnam, 1993). Morlino et al. (2011), for instance, proposed a framework which avoids not only ranking countries according to their alleged democratic virtue (in the singular), but also assuming a priori that all democratic elements must go together.
The authors therefore consider that countries may perform differently in distinct domains and
dimensions of democracy. Domains would include procedure, content and outcome, whereas dimensions would comprise procedure (rule of law, electoral accountability, inter-institutional accountability, political participation and political competition), content (freedom) and outcome (equality and responsiveness). The theory underlying the WB’s proposals posits that governance (in the singular) and social capital are fundamental pieces to promote economic development and to shape incentives for different actors in society. Again, academic research has also highlighted the importance of civic culture and social capital for a democratic development. The concept of social capital, for example, was used by Robert D. Putnam in order to understand the developmental differences between North and South Italy. According to Putnam, social capital refers to the set of norms of mutual trust, cooperation networks, sanction mechanisms and rules of behavior that can improve the effectiveness of society in solving problems that require collective action. Social capital would thus be a public good, a by-product of other social activities founded on horizontal networks and trust relationships (Putnam, 1993). On the basis of Putnam’s work, the WB argued that any social structure based on vertical networks, hierarchical relations, neo-clientelist practices or different forms of submission would produce a non-optimal and non-cooperative equilibrium. Any society founded, however, on a complex structure of horizontal relations of associations, social movements, cultural entities and interactive professionals would produce a good level of social capital, mutual trust and civic commitment. It is not taken into account, in the WB’s attempt to understand governance, that practices of democratic governance differ according to existing democratic systems in contemporary states. 
An essential part of the democratic vision of governance is precisely the commitment to individual freedoms and personal responsibilities in the exercise of
such freedoms. Moreover, popular sovereignty and political equality are also at the center of democratic governance, as well as individual and collective reason in the management of social problems. Ultimately, democratic governance emphasizes the stability and effectiveness of institutional procedures and the regulation of arbitrary powers in exchange for socio-historical and cultural mechanisms (March and Olsen, 1995). On the one hand, it is true that particularly at the end of the Cold War era, many countries have displayed an increasing similarity in governance mechanisms and procedures. The WB and other agencies have played a role in this process of ‘locking in’ national political and policy contexts. Nevertheless, there are important variations in terms of regional and national traditions of governance, collective action and statehood. That is the case of the welfare state in Europe, the developmental state in Asia or the ongoing debates on neo-developmentalism in Latin America. As there are varieties of capitalism, it seems appropriate to speak of qualities of democracy and varieties of governance models world-wide. Through its international policy diffusion programs, the WB has tended to ignore contextual variations when disseminating the governance agenda, its initiatives being paralleled or followed by other development banks (such as the Inter-American Development Bank or the African Development Bank), the OECD and main bilateral aid agencies. The WB would not explain, for instance, that concepts such as social capital and NGO participation in local governance aim to explain why it becomes very difficult to transpose models of collective action from one society to another. Moreover, by disseminating good or sound governance as a policy solution, UN agencies and the WB did not specify what governance should be good for, or to whom. 
Implicit in their practices, one could consistently find the assumption that, for governance to be good or sound, it should guarantee stability in market terms. The concept of governance
was then restricted to the quality and capacity of governments and the governed peoples to manage social transformations according to the criteria set up by market economy organizations. Criteria would include, inter alia, governmental efficiency, trade liberalization, privatization, financial openness and transparent regulatory mechanisms (Crane, 2010; Prats Catala, 1996). The 1997 World Bank Report, for example, focused on rebuilding the state as a sine qua non condition for a market economy. Three main elements in the Bank’s conception about governance can be summed up as follows: (i) the economic role of the state, which was restricted to setting up the legal foundations, maintaining the macroeconomic stability, investing in basic social services and infrastructure, protecting the most vulnerable people and safeguarding the environment; (ii) the role of the Bank in promoting policy reforms in several key sectors (fiscal, financial, exchange rate, trade, investment, privatization); and (iii) the need for states to promote electoral democracy, as well as transparency and accountability policies (Guhan, 1998). In fact, in their efforts in favor of policy transfer, international organizations also produced a certain confusion between means and ends, as if development were necessarily and exclusively aimed at a liberalized market economy, as if privatizing and opening up national markets constituted an end in itself (Hadjinsky et al., 2017). In this sense, governance would be limited to the notion of governmental stability and the means used to guarantee it. There would be little room for other social values, ethics, behaviors and parameters that could be proposed and legitimized by civil society organizations, unions, environmental movements, human rights organizations and social justice networks. 
In this perspective, the international policy diffusion promoted by UN agencies and the WB reduced governance to isolated dimensions (economic, social and political institutions) and sectoral reforms (governance in the area
of education, environmental governance, decentralization and provision of justice, etc.) in the executive, legislative and judicial capacities of states in developing countries.
Visions, definitions and typologies
There are at least two broad world visions which underlie assumptions used to construct governance as a concept. The first one is usually described as ‘liberal’ or ‘minimalist’ because of the role it attributes to governance as a set of rules and institutions for managing voluntary exchanges between citizens and political actors. Under this vision, it is assumed that governance is based on actions depending upon individual rationality and voluntary consent from political actors. The individual enterprise – more than the public action – constitutes the source of innovations and social transformations. Therefore, governance does not play a role in the construction or modification of identities, capacities and mechanisms of adaptation of the actors. This first vision implies two main corollaries: (i) voluntary exchanges among actors aim at the constitution of coalitions according to the preferences and interests of the individuals, who establish formal and informal mechanisms of bargaining that are considered exogenous to the political system; and (ii) individual actors are directly responsible for social transformations, and in this connection governance can be considered as a process of converting individual wills and resources into collective action through coalitions that tend to follow Pareto’s model of social exchanges based on a search for optimum satisfaction for each of the individual actors. In fact, this liberal view of governance prescribes that the satisfaction of the common good comes from the freely agreed exchange between the actors involved, frequently known as stakeholders. As far as global governance is concerned, private companies,
NGOs and international experts are seen as key agents in the quest for global governance, together with international organizations (mainly the OECD and the WB). Such a perspective follows utilitarian reasoning that places all the actors around the negotiating table without establishing hierarchy among them, without worrying about asymmetry or phenomena of domination and exclusion of the weaker actors. A second broad world vision is generally called ‘democratic governance’. Building on institutional learning processes and notions of identity and ownership by the actors themselves, this vision of governance goes beyond coalitions and negotiations by individual actors within the limits established by law, institutions, preferences or availability of resources. The main assumption or normative belief of this vision is that individual agents bear different capacities to transform norms, institutions and society. Democratic governance would also admit the influence of the governed on the process by which boundaries are established, but not all of them weigh the same in negotiations. In this connection, democratic governance would consist of four main stages: the development of democratic identities (by citizens and institutions); the development of capacities for political action among social groups, associative movements and formal institutions; the development of options for historical and democratic control of the institutional outcomes; and the development of a political system capable of questioning itself and, as a result, able to adapt to the different historical and cultural environments (March and Olsen, 1995). 
This democratic vision of governance would therefore entail a series of on-going tensions between participation and representation, freedom and control, social demands and institutional autonomy, social needs and state capacities, bureaucratic politics and relations between politicians and bureaucrats, as well as between globalization and national autonomy. Under such a vision, governance can be defined as a complex and continuous
process through which self-organizing networks, collective action mechanisms and institutions shield and uphold local, national and global public goods. These networks, mechanisms and institutions are formal and informal settings that create legitimate regimes and reinforce allegiances among and within state and non-state actors. In this connection, governance would allow citizens, social groupings, economic operators and the state to articulate their interests, defend their rights and fulfill their duties, solve their problems and attempt to avoid the de-stabilizing effects connected with power relations. Hence, the state and the public authorities play a key role in organizing and setting up basic rules within the public realm; however, this vision of governance presupposes the democratic division of responsibilities and the decentralization of decision-making. The ‘public’ is not equivalent in meaning to ‘government’: non-governmental organizations and associative movements can also engage in the complex process of building the public good. Therefore, democratic governance implies acknowledging the interdependence of organizations, contextual variations, as well as the dialectic relationship between and within governmental and non-governmental forces in the public realm – which is broader than the institutional system. Under this vision of democratic governance, it is crucial to understand the ways through which the challenges of collective action are met, to analyze how the tensions mentioned above evolve contextually, but also to monitor the processes associated with the shift in the traditional patterns of government. Nonetheless, in between such broad visions of governance, which correspond to the two extreme positions on a continuum, there are several possible combinations or models of governance. As J. Pierre and B. G.
Peters analyzed in their proposed theory of governance, typologies can be set up in between a state-centric governance model and, at the other end of the spectrum, a network-based governance model. Therefore, the authors
believe that there are variations in governance models around the world that may attribute to the state, to economic operators and to social actors largely different roles and degrees of relevance (Pierre and Peters, 2000). In this regard, and based on a review of specialized literature available in 1996, R. Rhodes established a typology based on six uses of governance (Rhodes, 1996):
(i) Governance as the minimal state: this use of governance refers to a new form of public intervention and the use of markets and quasi-markets to deliver public services. Regulation replaces ownership as the preferred form of public intervention, with the government creating regulatory bodies.
(ii) Governance as corporate governance: corporate governance is not concerned with running the business of a company, but with giving overall direction to the firm. That is the reason why corporate governance foresees transparency, accountability, oversight and controlling measures to satisfy interests and expectations of stakeholders beyond the corporate boundaries.
(iii) Governance as New Public Management: NPM merges managerialism and new institutional economics. Whereas managerialism refers to introducing private sector management methods to the public sector, new institutional economics refers to introducing incentive structures (such as market competition) into public service provision. R. Rhodes recalls that to understand how the transformation of the public sector involves less government and more governance, one can think of the following metaphor: if the public sector were a boat, it would need less ‘rowing’ and more ‘steering’.
(iv) Governance as good governance: the term ‘good governance’ was originally coined by the WB in order to define its lending policy towards developing countries. Since good governance involves public services efficiency, the WB encourages competition and markets, privatization of public enterprises, the introduction of budgetary discipline, as well as the decentralization of administration services.
(v) Governance as a socio-cybernetic system: R. Rhodes cites Jan Kooiman (1993), according to whom governance can be seen as the pattern or structure that emerges in a socio-political system as a common result or outcome of the interacting intervention efforts of all involved actors. This use of governance stresses the multiplicity of governance actors, wherein government is no longer supreme and its main responsibility should be restricted to enabling socio-political interactions in the fields of self-regulation and co-regulation, public–private partnerships or cooperative management. If government refers to activities backed by a formal authority, then this use of governance will refer to activities backed by shared goals and interests.
(vi) Governance as self-organizing networks: governance is about managing networks that are self-organizing. Since government is only one of the actors that play a role in a societal system, it does not have enough power (or should not have power) to exert its will on other actors. Integrated networks should thus develop their own policies and mold their environments.
Governance can also be classified according to the different levels of its policy application, such as in the cases of local, regional and global governance. First, governance has very often been thought of as a useful tool for decision-makers at the local level. D. Wilson (1998: 90) defined local governance as ‘the division of decision-making authority and service provision between local authorities and a range of non-elected organizations’. When analyzing local governance models, J. Fesler argued it was important to distinguish between devolution and de-concentration. Devolution would refer to the distribution (or re-distribution) of authority to allow for decision-making by local governments which could then act in a more independent way from a central administration. Central (national) governments might retain overall legal control (equal protection under the laws, voting eligibility, allocating authority to raise revenue, ensuring general law and order, and regulating fraud and corruption) and the authority to alter local government powers. Within those boundaries, devolution would exist if local entities had substantial authority to hire, fire, tax, contract, expend, invest, plan, set priorities and deliver the services they chose. De-concentration, in contrast, would occur when local entities acted largely as the
local agents of central governments, thus bearing authority to manage personnel and to spend resources allocated to them by central authorities. De-concentration would essentially refer to the redistribution of central resources to localities under a looser coordination of central government (Fesler, 1949). Another level of policy application is regional governance, which has been used to refer to regions within states and to regional integration processes such as the EU (and to a lesser extent the MERCOSUR, the SADC and ASEAN). Within states, regional governance has regularly been used as a synonym for regional planning, and its policy application would be justified by the improved efficiencies generated by a regional management of public goods (transportation, education, health). Authorities such as mayors or governors would benefit from regional economies of scale, for instance, through regulation of land use, common implementation, control and monitoring of policies; however, on their way to regional governance they would be challenged by institutional coordination needs and cooperation tradeoffs related to responsibility sharing throughout the public policy cycle. Regional governance can denote social demands in regions for greater autonomy from the central institutions of their state. However, the unfolding story of designing appropriate governance structures for the region illustrates both the changes and the continuities in the linkages between local, national and global politics of globalization. This means that the challenges of regional governance may not be confined only to national borders, since regions show an increasing tendency to identify and pursue interests that can be divergent from those expressed in international or European organizations by their respective central authorities (Evans, 2003). That is one of the reasons why, among states, regional governance has become an important catchphrase on the global economic and political scene. 
The reemergence
of regions has also been part of the new debates on international political economy and international security. In the EU two concrete manifestations of regional governance are the institutional construction of the Union itself (council, parliament, courts, programs, networks, etc.) and a non-anticipated effect of regional integration, i.e. what Europeans themselves call the ‘Europe of regions’ (Catalonia, Brittany, Lombardy, etc.). In the case of the EU, regional governance mechanisms have also been set up to deal with the resurgence of these regions within states but inserted in a broader regional integration process, thus changing the scale of policies, potentially intervening in the relations between European central states and subnational spaces, but also rejuvenating old cultural meanings and identities on local and regional territories. Finally, a third level of policy application has been global governance, which has gained currency in recent years, although it is sometimes shorn of realistic aspirations due to the reemergence of nationalisms and far-right movements both in the global North and South (Dingwerth and Pattberg, 2006; Groom and Powell, 1994; Halliday, 2000). Global governance can be seen as having several components: the strengthening of existing international and regional institutions, the evolution of law and norms at the international level, as well as the protection and promotion of global public goods (such as the environment, climate change, space, the high seas). At the normative level, global governance has also enshrined human rights, democracy and anti-corruption as global norms, but the repeated inconsistency between norms and enforcement, between discourse and practice, and the politics of selectivity by Western powers in their interventions and denunciations of practices in the developing world, among other factors, have compromised this moral basis and political justification of global governance. 
Another risk associated with the policy use of global governance stems from the fact that global
civil society organizations, global governance scholars, global governance norms, etc. come mainly from the North. Yet as another fallout of global asymmetries, the overwhelming majority of international NGOs (especially advocacy NGOs) are European and North American. Therefore, it is inevitable that the concept of global governance tends to be characterized almost exclusively by Western ethics, which also brings about a debate on how to adjust ‘global’ governance to a truly universally accepted vision of governance, North and South, West and East.
Concluding remarks: research implications for political science and international relations
In this chapter I have discussed how debates on governance within governments, policy networks, international organizations and academia have re-signified the public realm and its political relevance at local, regional and international levels. Based on the assumption that theoretical work on governance stems from the interest that social scientists have in changing patterns and new methods of governing from local to regional and global levels, these concluding remarks posit some research implications that such governance debates may have for political science and international relations. Despite the different visions and the multifarious usages of governance as a concept, its emergence in academia has suggested a series of relevant on-going research agendas. To conclude this chapter, I focus on two of these research agendas: (i) public policy and bureaucracy; and (ii) new actors in a post-intergovernmental conception of IR. First, as G. Stoker mentioned, discussions about governance in public policy have contributed to structuring five relevant propositions for policy-makers and scholars: (i) governance refers to a set of institutions
and actors that are drawn from but also beyond government; (ii) governance identifies the blurring of boundaries and responsibilities for tackling social and economic issues; (iii) governance identifies the power dependence involved in the relationships between institutions involved in collective action; (iv) governance is about autonomous self-governing networks of actors; and (v) governance recognizes the capacity to get things done, which does not rest on the power of government to command or use its authority. It sees government as able to use new tools and techniques to steer and guide (Stoker, 1998). Within public policy studies, one of the main contributions coming from critical scholars has been to challenge a functionalist conception of governance which tended to emphasize markets as an efficient mechanism for making individual choices, somehow neglecting that there is also a need to make some decisions for the society as a whole. In addition, as far as societal decisions are concerned, decision-makers need to consider a range of human values – from social justice to freedom, from solidarity to individualism. That means that decision-makers must consider not only the economic dimension of development policies but also their social, cultural and environmental dimensions, thus trying to accommodate influences and interests stemming from a broader scope of political, economic, social and environmental actors (Auer, 2000; Paterson, 1999; Winchester, 2009). This compromise among different stakeholders within public administration is particularly discernible when it comes to the implementation of public policies. Policy implementation is of great relevance for the success of any government: the political system may be legitimate and equitable, its goals may be virtuous, but its organizational structure and capacities must be robust and rigorous for the implementation of a public policy to succeed.
Shaping public policy implies a complex and multifaceted process in which numerous individual and collective actors
compete and cooperate to influence decision-making according to their interests. Politics matters in the success or failure of a public policy, but effectiveness in policy implementation also depends on the way policies are shaped and on the state capacities to deliver them. Implementing public policies and programs is part and parcel of any development strategy, wherein apart from political and socioeconomic actors, public bureaucracies play significantly expanding roles. Public bureaucracies have expertise, bear institutional memories and are often those who are more regularly in direct contact with citizens; in spite of this, at the beginning of the 21st century, political science tends to underestimate their roles and forms of organization (Pierre and Peters, 2000; Souza, 2015). Second, debates on governance have also contributed to the further development of a sociology of the actors, agendas and networks in the world order, thus moving beyond traditional conceptions of international relations rooted in the monopoly of the state as a global player. Irrespective of intellectual and ideological choices, such debates gave social visibility to actors who tended to be ignored or underestimated in their capacities to influence and intervene in the global arena (Badie, 2008). Classical-realist IR theory was so firmly based on the exclusive analysis of states as rational actors living in an anarchical system that it tended to give little priority to international organizations, international law, changes in capitalism and the role of non-governmental agents in decision-making processes. Liberalism emphasized non-governmental actors and their agendas, but its conception of the state’s roles in trade, regional integration and international cooperation tended to downplay issues related to asymmetry, hierarchy and inequality in world politics.
Constructivism, critical theory, Marxism, neo-Gramscian and Durkheimian approaches to IR were among the main schools of thought to benefit from global governance debates in the 21st century. Therefore, even though IR theory was a late-comer in acknowledging the significant
presence of NGOs in world politics, let alone in beginning to conceptualize their influence on and even direct involvement in governance both within and across state boundaries, in the past two decades, global governance has been a research field that has become increasingly impossible to ignore (Rosenau and Czempiel, 1992). Global governance debates have drawn attention to concepts such as the international society, the public international arena, as well as global and international mechanisms of dialogue and participation, thus making IR specialists connect with national political sociology traditions of research, comparative politics, area and cultural studies. IR specialists have also engaged more rigorously with field and comparative research. However, as M-C. Smouts recalls, one cannot neglect that the concept itself of global governance is based on a certain representation of social life which tends to disregard phenomena of domination, the unequal distribution of power world-wide, the fragmentation of territories and the increasing social exclusion in developed and developing societies. Due to its attachment to public choice theories, global governance’s underlying ontological criterion is effectiveness, i.e. it is interested in analyzing problems that are to be managed and solved, as well as conflicts of interest that are to be accommodated for the benefit of consensus building (Smouts, 1998).
References
Altman, G. and Perez-Liñan, A. (2002), Assessing the quality of democracy: Freedom, competitiveness and participation in eighteen Latin American countries. Democratization, v. 9, n. 2, pp. 85–100.
Auer, M. R. (2000), Who participates in global environmental governance? Partial answers from international relations theory. Policy Sciences, v. 33, n. 2, pp. 155–180.
Baden, J. and Stroup, R. L. (1981), (eds.). Bureaucracy vs. environment: The environmental costs of bureaucratic governance. Rexdale: John Wiley and Sons Canada Ltd.
Badie, B. (2008), Le diplomate et l’intrus, l’entrée des sociétés dans l’arène internationale. Paris: Fayard.
Broadhead, L.-A. (1996), Commissioning consent: Globalization and global governance. International Journal, v. 51, n. 4, pp. 651–668.
Commission on Global Governance (1995), Our global neighbourhood: The report of the Commission on Global Governance. Oxford: Oxford University Press.
Crane, A. (2010), From governance to governance: On blurring boundaries. Journal of Business Ethics, v. 94, Supplement 1: Cross-Sector Social Interactions, pp. 17–19.
Crozier, M., Huntington, S. P. and Watanuki, J. (1975), The crisis of democracy. Report on the governability of democracies to the Trilateral Commission. New York: NY University Press.
Dingwerth, K. and Pattberg, P. (2006), Global governance as a perspective on world politics. Global Governance, v. 12, n. 2, pp. 185–203.
Evans, A. (2003), Regional dimensions to European governance. The International and Comparative Law Quarterly, v. 52, n. 1, pp. 21–51.
Evans, P., Jacobson, H. and Putnam, R. (1993), Double-edged diplomacy: International bargaining and domestic politics. Berkeley: University of California Press.
Falk, R. (1995), Liberalism at the global level: the last of the independent commissions? Millennium, v. 24, n. 3, pp. 563–576.
Fesler, J. (1949), Area and administration. Tuscaloosa: University of Alabama Press.
Finkelstein, L. (1995), What is global governance? Global Governance, v. 1, n. 3, pp. 367–372.
Gordenker, L. and Weiss, T. G. (1995), Pluralising global governance: Analytical approaches and dimensions. Third World Quarterly, v. 16, n. 3, pp. 357–387.
Groom, A. J. R. and Powell, D. (1994), From world politics to global governance. A theme in need of a focus, in A. J. R. Groom and M. Light (eds.). Contemporary International Relations: A Guide to Theory. London: Pinter, pp. 81–90.
Guhan, S. (1998), World bank on governance: A critique. Economic and Political Weekly, v. 33, n. 4, pp. 185–190.
Hadjinsky, M., Pal, L. and Walker, C. (2017), (eds.). The micro-dynamics and macro-effects of policy transfers: Beg, borrow, steal or swallow? Cheltenham: Edward Elgar.
Halliday, F. (2000), Global governance: prospects and problems. Citizenship Studies, v. 4, n. 1, pp. 19–33.
Henderson, T. and Abney, G. (1977), The state legislator and intergovernmental relations: the job of local governance. Publius, v. 7, n. 2, pp. 85–100.
Hirschman, A. (1980), The Welfare State in trouble: Systemic crisis or growing pains? The American Economic Review, v. 70, n. 2, Papers and Proceedings of the Ninety-Second Annual Meeting of the American Economic Association, pp. 113–116.
Keohane, R. and Ostrom, E. (1995), (eds.). Local commons and global interdependence, heterogeneity and cooperation in two domains. London: Sage.
Kooiman, J. (1993), ‘Social-political governance’, in J. Kooiman (ed.). Modern governance: New government–society interactions. London: Sage.
Leca, J. (1996), Gouvernance et institutions publiques. L’Etat entre sociétés nationales et globalisation, in R. Fraisse and J. B. De Foucauld (eds.). La France en prospectives. Paris: Odile Jacob, pp. 317–350.
Low, S. (1904), The governance of England. London: Fisher Unwin.
March, J. and Olsen, J. P. (1995), Democratic governance. New York: The Free Press.
Mayntz, R. (1993), Governing failures and the problems of governability: Some comments on a theoretical paradigm, in J. Kooiman (ed.). Modern governance: New government–society interactions. London: Sage, pp. 9–20.
Meny, Y. and Surel, Y. (2002), (eds.). Democracies and the populist challenge. London: Palgrave.
Mirtimer, K. and Richardson, R. (1977), Governance in institutions with faculty unions: six case studies. Center for the Study of Higher Education, Research report, Pennsylvania State University.
Morlino, L. (2011), Changes for democracy. Actors, structures and processes. Oxford: Oxford University Press.
Morlino, L., Dressel, B. and Pelizzo, R. (2011), The quality of democracy in Asia-Pacific: Issues and findings. International Political Science Review, v. 32, n. 5, pp. 491–511.
Nelson, P. (2000), Whose civil society? Whose governance? Decision making and practice in the New Agenda at the Inter-American Development Bank and the World Bank. Global Governance, v. 6, n. 4, Civil Society and Multilateral Development Banks (Oct.–Dec.), pp. 405–431.
Nunnenkamp, P. (1995), What donors mean by good governance. IDS Bulletin, v. 26, n. 2, pp. 9–16.
Paterson, M. (1999), Interpreting trends in global environmental governance. International Affairs (Royal Institute of International Affairs 1944–), v. 75, n. 4, pp. 793–802.
Peters, B. G. (2011), Governance, administration policies, in B. Badie, D. Berg-Schlosser and L. Morlino (eds.). International encyclopedia of political science. Thousand Oaks: Sage, pp. 994–1010.
Pierre, J. and Peters, B. G. (2000), Governance, politics and the state. New York: St. Martin’s Press.
Prats Catala, J. (2005), De La Burocracia Al Management, Del Management A La Gobernanza. Madrid: Instituto Nacional de Administración Pública.
Putnam, R. (1988), Diplomacy and domestic politics: The logic of two-level games. International Organization, v. 42, n. 3, pp. 427–460.
Putnam, R. (1993), Making democracy work: Civic traditions in modern Italy. Princeton: Princeton University Press.
Radin, B. A. (2007), Performance measurement and global governance: The experience of the World Bank. Global Governance, v. 13, n. 1, pp. 25–33.
Rhodes, R. A. W. (1996), The new governance: governing without government. Political Studies, v. 44, n. 4, pp. 652–667.
Rosenau, J. and Czempiel, E-O. (1992), (eds.). Governance without government: order and change in world politics. Cambridge: Cambridge University Press.
Senarclens, P. de. (2001), International organizations and the challenges of globalization. International Social Science Journal, v. 53, n. 170, pp. 509–522.
Smouts, M-C. (1998), The proper use of governance in international relations. International Social Science Journal, v. 50, n. 155, pp. 81–89.
Sommer, A. (1977), The impact of the SEC on corporate governance. Law and Contemporary Problems, v. 41, n. 3, pp. 115–145.
Souza, C. (2015), Capacidade burocrática no Brasil e na Argentina: quando a política faz a diferença. Brasilia: IPEA (Institute for Applied Economic Research), Discussion paper series, n. 2035.
Stoker, G. (1998), Governance as a theory: five propositions. International Social Science Journal, v. 50, n. 155, pp. 17–28.
UNDP (1996), Governance for sustainable human development, a UNDP policy document, management development and governance division, Bureau for Policy and Program Support.
Uvin, P. and Biagiotti, I. (1996), Global governance and the ‘new’ political conditionality. Global Governance, v. 2, n. 3, pp. 377–400.
Venkataraman, A. (1994), The crisis of the welfare state. The Indian Journal of Political Science, v. 55, n. 2, pp. 159–165.
Williams, T. (1975), Governance is the real issue: a management manifesto. The Phi Delta Kappan, v. 56, n. 8, pp. 561–562.
Wilson, D. (1998), From local government to local governance: re-casting British local democracy. Democratization, v. 5, n. 1, pp. 90–115.
Winchester, N. B. (2009), Emerging global environmental governance. Indiana Journal of Global Legal Studies, v. 16, n. 1, pp. 7–23.
World Bank (1994), Governance and development. Washington: The World Bank.
Zimmerman, J. (1982), Regional governance models. National Civic Review, v. 71, n. 2, pp. 84–90.
60
Implementation
Harald Sætren
Introduction
The aim of this chapter is to provide an updated, state-of-the-art assessment of policy implementation research since its early days, based on bibliometric data. This implies charting the evolution of this research field over time as well as analysing how that evolution has affected its composition and content along a number of salient dimensions. We will use a ‘research generations’ metaphor to describe and analyse how this field of study has evolved over close to half a century towards a ‘third generation research paradigm’. The indicators and motivation of the latter will be used as a benchmark in our assessment of scientific progress in more qualitative terms. The findings and conclusions reported here may be unexpected to many, especially those who have thought of implementation research as ‘yesterday’s’ research topic: almost extinct, largely irrelevant and hence also quite unfashionable. To readers interested in the academic discipline to which this Handbook is devoted, we think and hope that what we have to say about the state of affairs with regard to research on policy implementation is pleasantly surprising and comfortingly good news.
Research methodology and data sources
Our state-of-the-art assessment is based on a systematic review (Cooper, 2010) which entails detailing several explicit, transparent and reproducible steps with respect to identifying the relevant implementation research literature as well as extracting a manageable sample for synthesizing results in a reliable and valid manner. Implementation has become an immensely popular research topic in most academic fields. According to the Web of Science Core Collection, by late 2018 some 81,000 journal articles had implementation or implementing as a title word
1002
The SAGE Handbook of Political Science
(see chart 60.1). At least two-thirds of these are published in Computer Science, Engineering, Physics and Medicine journals, where implementation and even policy refers to mathematical and technical solutions as well as medical treatment procedures. We deem these micro-level fields of study to be not very relevant for our more holistic policy analytical purposes. This includes an intervention-oriented strand of implementation research in life sciences like health, medicine and psychology called implementation science, which has its own journal and biannual conferences with the same name (cf. Nilsen et al., 2013). Briefly stated, the database for our systematic review was limited to the Web of Science Core Collection journals in the academic fields of political science, public policy and public administration and management. A challenge here is that even these journal categories are very broadly defined academically, thus including to some extent area studies, business, economics, law and sociology journals. Weeding out the latter type of irrelevant journals reduced the initial sample of journals from approximately 1,600 to well over 800, thus producing the more refined, genuine journal sample we aimed for. Furthermore, we limited ourselves to examining only articles in these core journals as we call them that used the title words ‘implementation’ or ‘implementing’. See appendix 1 in Sætren (2005) for a listing of the journals we classified as belonging to the core, which we later supplemented with some public management journals. The selection of journals and related articles was premised on the assumption that they would be more committed to a general theory development of the policy-making process, of which implementation is an integral part, than journals outside this core field and articles not using implementation or implementing as title words. 
This selection procedure generated a sample of 836 articles published between 1952 and 2017, which was subjected to content analysis with respect to a number of features salient for our analytical purposes. These features were registered and coded in an electronically retrievable database program called Cardbox. Thus, while our sample clearly has some limitations in a larger sense, it is justified in terms of our political science, public administration/management and public policy orientation as well as consistency, manageability, simplicity, transparency and replicability. It should also be noted that this chapter is an expanded and updated version of a similar empirical assessment of the implementation research literature conducted half a decade ago by Sætren (2014). The three generations of implementation research identified so far (Goggin et al., 1990; Sætren, 2014) will each be analysed in terms of how they have dealt with conceptual, methodological, regional and theoretical issues challenging implementation scholarship since its infancy.
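The selection rules described above (core-field journals, a title-word criterion and a publication window) can be sketched as a simple filter. This is a minimal illustration only: the `Article` record, its field names, the category labels in `CORE_CATEGORIES` and the three example records are hypothetical, and the actual coding was done in Cardbox rather than with code like this.

```python
import re
from dataclasses import dataclass

# Hypothetical labels for the journal categories treated as the core field.
CORE_CATEGORIES = {"political science", "public policy", "public administration"}

# Title-word criterion taken from the text: 'implementation' or 'implementing'.
TITLE_PATTERN = re.compile(r"\b(implementation|implementing)\b", re.IGNORECASE)


@dataclass
class Article:
    """Hypothetical bibliographic record used only for this illustration."""
    title: str
    journal_category: str
    year: int


def in_core_sample(article, first_year=1952, last_year=2017):
    """Return True if the article meets all three selection criteria."""
    return (
        article.journal_category.lower() in CORE_CATEGORIES
        and first_year <= article.year <= last_year
        and TITLE_PATTERN.search(article.title) is not None
    )


# Invented example records, not drawn from the actual sample.
records = [
    Article("Implementing welfare reform in the states", "Public Policy", 1998),
    Article("Implementation of a sorting algorithm", "Computer Science", 2005),
    Article("Street-level discretion revisited", "Public Administration", 2011),
]

sample = [a for a in records if in_core_sample(a)]
print([a.title for a in sample])
```

Only the first record passes: the second is outside the core field and the third lacks the title word, which mirrors how the title-word rule trades coverage for consistency and replicability.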
First-generation implementation research: 1970s and earlier
The period from the mid to late 1970s into the early to mid 1980s was in many ways the golden era of implementation research, accompanied by great enthusiasm and excitement. Perhaps there was even some hubris among policy scholars claiming to have discovered implementation as the hitherto missing link in the study of the policy process (e.g. Hargrove, 1975). Previous evaluation studies had documented that many ambitious welfare programmes in the United States, enacted under political slogans like ‘War on Poverty’ and ‘Great Society’ in the mid 1960s, did not work out in practice as intended by policy makers. However, valuable as these studies were, they could not say much about how and why this happened, as they did not focus on the intervening transformative process between policy objectives and programme outputs and outcomes: the implementation part of the policy process. The seminal case study by Pressman and
Wildavsky of one such programme, published in a book pointedly titled Implementation in 1973, is usually and rightfully credited with turning policy scholars’ research attention towards opening this hitherto black box in the policy process. Nevertheless, before we detail this sudden surge of interest in implementation research during the 1970s, we must point out that there are some other, earlier seminal studies, for instance in the fields of public administration and sociology of law, that focused on the execution stage of public policies without ever using implementation as a key term. Philip Selznick’s (1949) famous case study of the Tennessee Valley Authority (TVA), where co-optation of local farmers by the TVA impacted on its programme implementation in a manner not anticipated or intended, is just one example. Another is the well-known study of the corps of forest rangers by Herbert Kaufman (1960). A last example here is the sociology of law research tradition dating back to the 1950s that examined both the execution and impact of laws (e.g. Aubert, 1969). Valuable as these and other similar classical studies still are with respect to their observations and findings, they were not devoted to theorizing about the execution stage of the policy process. A significant number of journal articles published before 1970 used the title word implementation without having the same seminal impact as the 1973 book by Pressman and Wildavsky (Sætren, 2005).
First-generation implementation research was important as it brought about a shift of focus in policy research from how a bill becomes law to problematizing how laws are translated into implemented programmes. These scholars gave detailed accounts of how a single authoritative decision was carried out, usually at a single location, or more rarely at multiple sites, and their focus was on identifying the numerous barriers to effective policy implementation (Linder and Peters, 1987).
The seminal pioneering
work of Pressman and Wildavsky (1973) is a classic example. It was a study of a federal economic development programme aimed at reducing unemployment among ethnic minority groups in Oakland, California. Pressman and Wildavsky identified a fairly large number of ‘clearances’ needed at each decision point of the implementation process, which they interpreted to be a major source of the observed policy/programme failure. They thus concluded that the more critical clearance points there are in the implementation process, the smaller the chance of overall policy/programme success. These complexities of joint action, as the authors called them, are fairly common in the implementation of most policy programmes (Hall and O’Toole, 2000). Another pioneering implementation scholar, Derthick (1970, 1972), chronicled and tried to explain the dismal fate of seven federally funded public housing projects. There were a few exceptions to this bias in research focus towards failed policy programmes, e.g. Eugene Bardach (1977), but even he found that policy success required quite extraordinary individual efforts and skills by a dedicated policy entrepreneur during both policy formation and implementation. No wonder, then, that Pressman and Wildavsky (1973) concluded that most federal policy programmes were doomed to fail.
Some basic conceptual issues related to policy implementation were dealt with by pioneering scholars. Thus, when first presenting a more comprehensive and analytical framework intended to guide the research focus in future implementation studies, Van Meter and Van Horn (1975: 447) offered the following definition:
policy implementation encompasses those actions by public and private individuals (or groups) that are directed at the achievement of objectives set forth in prior policy decisions. This includes both one-time efforts to transform decisions into operational terms, as well as continuing efforts to achieve the large and small changes mandated by policy decisions.
The authors also refer to Pressman and Wildavsky (1973: xiii), who said it even more clearly:
Implementation to us means just what Webster and Roget say it does: to carry out, accomplish, fulfil, produce, complete. But what is being implemented? A policy naturally. There must be something out there prior to implementation: otherwise there is nothing to move towards in the process of implementation. A verb like ‘implement’ must have an object like ‘policy’.
Many later authors would offer similar but slightly different definitions of implementation (e.g. Sabatier and Mazmanian, 1980), yet, according to Hill and Hupe (2014: 7–8), they all seem to boil down to ‘what happens between the establishment of a policy and its impact in the world of action’ (O’Toole, 2000: 273). While most definitions are relatively clear about where implementation begins, some are less clear about where the process ends. Nevertheless, Pressman and Wildavsky (1984) and Van Meter and Van Horn (1975), like most later policy scholars (e.g. Winter, 2012; Hill and Hupe, 2014), seem to agree with O’Toole (2000) above that the logical end point is when all actions to carry out a policy as intended have been completed
(i.e. implementation outputs) and before efforts to evaluate their societal effects and impacts have started (implementation outcomes) (Hill and Hupe, 2014: 12). Finally, and not least importantly, it should be noted that, although the definition of the implementation process by Van Meter and Van Horn (1975) is relatively open with respect to who might participate in this process (e.g. non-governmental actors), there is little doubt that the criteria for judging implementation outputs and outcomes by the authors and their contemporaries were primarily premised on the goals and values of democratically elected policy makers at the top of the political-administrative system in question.
Chart 60.1 Web of Science Core Collection articles in all academic fields published in 5-year increments with title words implementation or implementing. Total N = 81,182
[Chart not reproduced. Horizontal axis: publication periods from 1956–59 to 2015–18. Vertical axis: number of articles, 0 to 30,000.]
The main contribution of the first wave of implementation studies (during the 1970s) was to describe and conceptualize policy implementation as a complex and dynamic process that involved multiple participants with a wide range of interests and interpretations regarding authoritative decisions. They were predominantly (69%) studies of a single policy programme at a single geographical site (i.e. single case studies) by overwhelmingly male authors (78%) located for the most part in North America (79%) and relying overwhelmingly on qualitative data (84%) and non-statistical methods (97%) of analysis. Furthermore, only a slight majority of these had some sort of theoretical reference or framework, and very few of these in turn had derived and explicitly formulated hypotheses to guide their research, as such constructs only started to emerge from the mid 1970s onwards (see tables 60.1–60.4). After a slow and gradual start with journal articles predating even the seminal book by Pressman and Wildavsky (1973), a rapid and exponential growth in articles published took place from the mid 1970s and continued unabated through the remaining decade (see chart 60.2).
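The clearance-point argument attributed to Pressman and Wildavsky earlier in this section can be made concrete with a small worked calculation. This is only an illustrative sketch under two assumptions that are not stated in the text above: every clearance is obtained independently of the others, and each has the same probability $p$ of being obtained. The symbols $p$, $n$ and $P_{\text{success}}$ are introduced here purely for illustration.

```latex
% Probability that all n required clearances are obtained,
% assuming independence and a common per-clearance probability p.
P_{\text{success}} = p^{\,n}, \qquad 0 < p < 1

% Worked example with p = 0.9:
%   n = 10 clearances:  0.9^{10} \approx 0.35
%   n = 30 clearances:  0.9^{30} \approx 0.04
```

Under these assumptions, even clearances that are individually very likely to be granted compound into a low overall chance of success as their number grows, which is the intuition behind the ‘complexities of joint action’.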
Chart 60.2 Web of Science Core Collection and Cardbox core journal articles by 5-year increments of publication. Total N is 1,219 and 748 respectively
[Chart not reproduced. Series: Web of Science Core articles and Cardbox Core articles. Horizontal axis: publication periods starting with 1953–59. Vertical axis: number of articles, 0 to 350.]

Table 60.1 Type of data in core journal articles by time period published. Percentage base: empirical articles
 | 1953–1979 | 1980–89 | 1990–99 | 2000–09 | 2010–17 | Total
Single case study | 69% | 40% | 32% | 30% | 27% | 37%
Only qualitative data | 84% | 63% | 55% | 60% | 54% | 59%
Only quantitative | 10% | 19% | 27% | 25% | 33% | 26%
Questionnaire data | 10% | 12% | 15% | 19% | 15% | 15%
Comparative research | 31% | 51% | 52% | 50% | 47% | 49%
Longitudinal study | 25% | 17% | 21% | 17% | 17% | 19%
(N)=100% | (32) | (147) | (108) | (156) | (205) | (649)
Table 60.2 Level of empirical analysis in core journal articles by time periods. Percentage base: empirical articles
 | 1953–1979 | 1980–89 | 1990–99 | 2000–09 | 2010–17 | Total
Some form of hypotheses testing | 16% | 37% | 42% | 52% | 54% | 46%
Statistical analysis | 3% | 17% | 29% | 24% | 30% | 24%
Regression analysis | 0% | 12% | 24% | 18% | 23% | 17%
(N)=100% | (32) | (147) | (108) | (156) | (205) | (649)
Table 60.3 Core journal articles by regional focus/origin and time period published. Percentage base: all sample articles
 | 1953–1979 | 1980–89 | 1990–99 | 2000–09 | 2010–17 | Total
US & Canada | 79% | 69% | 61% | 48% | 32% | 54%
Europe | 16% | 24% | 23% | 43% | 48% | 35%
Latin America | 0% | 1% | 1% | 2% | 5% | 2%
Africa | 0% | 1% | 3% | 2% | 2% | 2%
Asia | 4% | 5% | 7% | 4% | 9% | 7%
Middle-East | 2% | 1% | 3% | 1% | 2% | 2%
Oceania | 2% | 1% | 5% | 3% | 6% | 4%
(N)=100% | (57) | (211) | (141) | (193) | (234) | (836)
Note: Percentages add to more than 100% since some articles have authors and/or data from more than one region.
Table 60.4 Core journal articles by most frequent type of policy studied and time period published. Percentage base: articles with a policy focus
 | 1953–1979 | 1980–89 | 1990–99 | 2000–09 | 2010–17 | Total
Administrative change & reform | 19% | 7% | 18% | 25% | 29% | 22%
Environmental & Energy policy | 5% | 24% | 22% | 12% | 12% | 17%
Social policy | 7% | 16% | 21% | 20% | 13% | 16%
Health policy | 9% | 9% | 11% | 12% | 19% | 14%
Economic policy | 14% | 12% | 14% | 13% | 11% | 12%
(N)=100% | (43) | (179) | (125) | (174) | (218) | (739)
Second generation of implementation research: 1980s
The transition in implementation research from the 1970s to the 1980s was large in many respects, especially during the first half of the latter decade. The increase in empirically oriented articles and the advances in their applied research methodology are especially noticeable. This implied a substantial weakening of the dominant position of the single case study research design from the 1970s, down to 40% of all empirical studies in the 1980s. Conversely, approximately half of all studies now had a comparative research design, compared to less than one-third in the 1970s. Similarly, the use of statistical analysis of a more sophisticated nature, regression analysis, increased even more from the 1970s (0%) to the 1980s (12%). The latter trend implied a substantial weakening of the predominant role of purely qualitative data from the 1970s (84%) to the 1980s (63%) and a corresponding strengthening of the use of both quantitative and mixed-method data (tables 60.1–60.4).
The transition from the 1970s to the 1980s was no less marked in theoretical terms. We have already mentioned the pioneering analytical framework offered by Van Meter and Van Horn (1975). Fairly soon, this was followed by a sudden burst of other publications during a relatively short time period lasting less than five years. Thus, in the year 1980 alone, no fewer than 22 books were published focusing explicitly on policy implementation. Most of these were efforts to offer, in book format, an initial analytical and theoretical framework to guide implementation research. It is quite telling of the US dominance at this time that less than a handful of all the authors of these books from the early 1980s were from Europe. Thus, in view of these watershed events in terms of advances in new analytical-theoretical constructs and
research methodologies, it seems very appropriate to refer to what happened from the 1970s to the 1980s as a transition from a first generation to a second generation of implementation research.
The proliferation of analytical and theoretical constructs in the early 1980s also entailed a change from a relatively harmonious earlier period, with little apparent disagreement among implementation scholars on conceptual, normative/theoretical and methodological issues, to a new decade characterized by much more contention in these respects. Thus, some new implementation scholars criticized the pioneering analytical implementation models for being overly top-down in nature, thereby privileging the empirical and normative position of formal national policy makers at the expense of local-level political-administrative units of government and other more informal network-related actors (Elmore, 1980; Lipsky, 1980; Hjern and Porter, 1982). These critics advocated an alternative bottom-up approach in implementation research to identify networks of actors in service delivery at local levels as well as their strategies, goals, activities and contacts. This approach was founded on the assumption that successful policy programmes must be compatible with the desires, wishes, interests and behavioural dispositions of lower-level political-administrative bodies and target groups. Thus, its conception of implementation was that of bargaining and negotiation processes within networks of multiple formal and informal implementers (Linder and Peters, 1987; Hill and Hupe, 2014). However, this alternative analytical approach to implementation research was also challenged for its implicit or explicit normative-democratic model, which attributed too much autonomy to implementing agencies and networked non-public organizations and interest groups. Furthermore, the inductive nature of this research approach led to few theoretical generalizations and conclusions (Linder and Peters, 1987; Matland, 1995).
This debate between a top-down and a bottom-up approach in implementation research turned out to be less positive in the long run, in the sense that it came to dominate the discourse among policy scholars during most of the 1980s, with a tendency to coalesce many of them into two opposing schools of thought later referred to as ‘top-downers’ versus ‘bottom-uppers’. Although perhaps necessary and useful at the outset, this unfortunately developed into a protracted, entrenched and unproductive debate where ideological-normative, methodological and epistemological-theoretical issues were intertwined (Sætren, 2005).
Nevertheless, the 1980s, especially the first half, were in many ways the golden era of research on policy implementation. The rapid exponential increase in research publications since the previous decade suggested this had become a new ‘growth industry’ in academia. Furthermore, the most influential analytical and theoretical frameworks for understanding the policy implementation process were published and debated vigorously at this time, and substantial advances in research methodology were made in order to subject theoretical constructs to empirical testing.
Efforts to bridge the top-down/bottom-up approaches had started even before the debate about them became heated, premised on the notion that each approach may be legitimate and relevant depending on somewhat different circumstances, and hence that they are more supplementary than competing, both methodologically and normatively. Thus, scholars like Berman (1980) and Elmore (1980) offered useful suggestions to settle the top-down and bottom-up debate. Berman’s (1980) contribution, which has not received much attention, focused on ‘designing strategies to improve implementation’. Two strategies for policy makers, namely ‘programmed implementation’ and ‘adaptive implementation’, were outlined. Programmed implementation (i.e. top-down) was suggested as appropriate and effective given some beneficial policy-relevant
circumstances, while the adaptive implementation approach was assumed to be more appropriate and effective given similar but less beneficial circumstances. Thus, he attempted to integrate and synthesize the top-down and bottom-up approaches by suggesting some kind of contingency-theoretic approach to policy implementation. Elmore (1980), who is a better-known ‘synthesizer’, also proposed a two-pronged strategy for policy implementation, similar to Berman’s (1980). His main thesis is that policy makers need to consider the policy instruments and other key resources at their disposal (forward mapping) and then consider the incentive package or structure of the ultimate target groups (backward mapping). Like Berman (1980), Elmore claims that policy success depends on the blend of the two considerations or strategies. However, he was less explicit in providing an analytical implementation model that scholars could use in their research to explain the policy implementation process (Matland, 1995). Sabatier’s (1986) response to criticism levelled at the analytical-theoretical framework he developed with his colleague (Mazmanian and Sabatier, 1983) is often erroneously credited with being another attempt to bridge the top-down/bottom-up divide in implementation research. In fact, Sabatier heralded his departure as an implementation researcher by abandoning the implementation concept in favour of the less precise concept of policy change, premised on an interest-group-theoretic construct termed the advocacy coalition framework.
The golden era ended with a systematic and comprehensive assessment of the state of affairs in the policy implementation research literature by one of the by then most seasoned implementation scholars, O’Toole (1986). His assessment, based on more than 300 publications covering approximately 40 research journals, concluded that despite the substantial advances noted above, there were nevertheless still many unsolved challenges to address.
Thus, despite early efforts to define clearly what implementation is, O’Toole (1986: 183) noted that implementation researchers are not in agreement about what constitutes the subject of their inquiry. Some take implementation to refer to all that is part of the process between initial statement of policy and ultimate impact in the world. Others restrict implementation to the actions of those charged with handling a policy. [Furthermore, he observed:] Researchers do not agree on the outlines of a theory of implementation nor even the variables crucial to implementation success. Researchers, for the most part implicitly, also disagree on what should constitute implementation success, especially in a multi-actor setting. (Ibid.: 184).
As to the causes of this state-of-affairs, he referred to leading contemporary implementation scholars who had recently observed that: ‘the field thus far has a strong inductive orientation with numerous case studies and most of the conceptual work consisting of loosely linked hypotheses and identification of critical variables derived from the case material’ (Mazmanian and Sabatier, 1983: 152). O’Toole (1986: 189) followed up this quote with his own diagnosis saying: ‘The typical situation in implementation research, even more so than for other types of social science, has been for there to be very little conscious efforts to develop and test systematically the insights generated in previous work, and thus to separate the promising from the merely plausible but unproductive’. Despite this lack of knowledge cumulation plaguing the observed multitude of analytical-theoretical frameworks, O’Toole nevertheless saw some possibilities for theoretical convergence in this field of research. Thus, based on his investigation of the research literature he, much like Van Meter and Van Horn (1975) intuitively had done a decade earlier, identified a similar handful of clusters of variables that authors tended to mention as important explanatory devices regarding policy implementation performance (O’Toole, 1986: 189). As the account above strongly suggests, the dominance of male North American
scholarship remained strong during the 1980s even though it had been reduced somewhat since the 1970s (to 69% from 74%), and European scholars now accounted for approximately one-quarter of all articles published. The most important input from Europe to the implementation research literature in the 1980s was undoubtedly the work by Hjern and associates in some seminal articles commented upon above. Their work was, in turn, greatly influenced by other European-based leading policy scholars (Hanf and Scharpf, 1978; Barrett and Fudge, 1981), who advocated a more network-oriented form of policy research.
Also in terms of the volume of research publications, the golden era of implementation research ended during the first half of the 1980s. After a fairly rapid increase through the late 1970s, the number of publications stagnated in the mid 1980s and then declined for a few years up until the mid 1990s (see Chart 60.2). The latter trend suggested something even more sinister: that the golden years of policy implementation as a popular and ‘hot’ research topic had ended.
Third-generation implementation research: 1990s and later
Visions and Ambitions
We can date the beginning of the third generation of research to 1990. In that year Goggin et al. (1990) published a book with the subtitle ‘Toward a Third Generation’, in which they called for a new, improved and more scientific mode of implementation studies. This was meant as a remedy for the shortcomings O’Toole (1986) had identified in the second generation of implementation research, especially with respect to the observed lack of theoretical cumulation. Hence, it is hardly a coincidence that the latter scholar was one of the co-authors of the book in question.
The third-generation implementation research paradigm must be understood more as an ambition and ideal for researchers to work towards, rather than something the authors should expect to find fully implemented in any single research project at any point in time. This raises the question of what this more scientific mode of research was all about. According to its advocates, the quintessential characteristic of third-generation research should be its rigorous research design. Two issues in particular were emphasized strongly: (1) addressing the conceptual and measurement problems related to variables commonly identified as important for policy implementation; and (2) the manner in which hypotheses were formulated and tested. Goggin et al. (1990: 19) defined the unique trait of third-generation research in this way: ‘… an explicit theoretical model; operational definitions of concepts; an exhaustive search for reliable indicators of implementation and predictor variables and the specification of theoretically derived hypotheses with analysis of data using appropriate qualitative and statistical procedures as well as case-studies for testing them’. Furthermore, the principal aim of third-generation research was to ‘shed new light on implementation behaviour by explaining why that behaviour varies across time, policies and units of government’ (Goggin et al., 1990: 17). Thus, we can sum up the defining features of the third-generation research paradigm in the following six points (see Goggin et al., 1990: 15–19):
• key concepts and related variables must be clearly defined and operationalized;
• hypotheses derived from theoretical constructs should guide empirical analysis;
• more use of statistical analysis using quantitative data to supplement qualitative analysis;
• more comparison across different units of analysis within the same policy sector;
• more comparison across different policy sectors;
• more longitudinal research designs (i.e. research timeframe of at least 5 to 10 years).
Last, but not least importantly, Goggin et al. (1990: 29–41) presented their own dynamic analytical model of policy implementation in the US multi-level governmental system that was meant to bridge the divide between the top-down and bottom-up schools of thought. Here they tried to exemplify how their more scientific methodological approach can be applied with respect to studying the implementation of three selected federally enacted policies at state and sub-state level, and using what they call a communication theory or model to derive empirically testable hypotheses (Goggin et al., 1990: 171–197). As for the future of implementation research, these scholars were quite upbeat in stating that: ‘… interest is likely to grow during the 1990s and continue well into the twenty-first century. In fact, the field of public policy in the next decade will very likely be defined by its focus on implementation. The nineties are likely to be the implementation era’ (Goggin et al., 1990: 9).
What Happened to the Third Generation of Implementation Research?
The 1990s did not turn out to be the implementation era that Goggin et al. (1990) had envisioned, at least not in quantitative terms, as articles published in core field journals fell by one-third compared to the 1980s. The latter trend did not escape the attention of some implementation scholars towards the end of the 1990s who, a little prematurely, started to worry about the fate of their field of study and to call for its rediscovery and revival (see Lester and Goggin, 1998 and responses to their comments in Policy Currents in the same publication in 1999 and 2000). This negative trend in published volumes did not translate into a similar trend in qualitative terms relative to the third-generation research paradigm; on the contrary. The use of quantitative data and statistical techniques of analysis to test theoretically derived hypotheses all increased substantially from the 1980s, with a doubling of the share of research articles using multivariate regression analysis. Conversely, implementation studies based solely on qualitative data dropped between the 1980s and the 1990s to account for only a slight majority (55%). Use of the single case study research design followed a similar, even stronger, trend from 40% in the 1980s to 32% in the 1990s. With respect to the comparative and longitudinal research designs, there was little or no change in their application from the 1980s onwards. As to the regional origin or focus of implementation research, the dominance of North American articles was reduced further, to 61% during the 1990s compared to 69% during the decade before. The gender balance among journal authors was still quite uneven in the 1990s (69% male/31% female), though female authorship had improved slightly from the 1980s. At the same time, regions other than Europe increased their share of research articles, while Europe’s contribution remained at around 25%.
Another attempt at bridging the top-down and bottom-up divide in implementation scholarship, and contributing towards a synthesized theory of policy implementation, happened in the mid 1990s when Matland (1995: 160) presented his ‘ambiguity-conflict’ model. The core idea here was that these two dimensions of policies, the degree of ambiguity and conflict associated with them, affected their implementation. Thus, he hypothesized that high or low levels of policy ambiguity and conflict, in various combinations, would tend to result in a specific type of implementation style, termed administrative, political, experimental or symbolic. This is one of the last theoretical constructs we are aware of that has been launched to reconcile the top-down and bottom-up approaches in implementation research.
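The two-by-two logic of the ambiguity-conflict model can be written out as a small lookup. This is a hedged sketch rather than a statement of the model itself: the function name and the boolean encoding are hypothetical, and the pairing of each combination with a style follows the usual reading of Matland (1995), which the passage above lists without spelling out.

```python
def implementation_style(high_ambiguity: bool, high_conflict: bool) -> str:
    """Map a policy's ambiguity and conflict levels to an implementation style.

    The pairing below is an assumption based on the common reading of
    Matland (1995); it is not spelled out in the chapter text.
    """
    styles = {
        (False, False): "administrative",  # low ambiguity, low conflict
        (False, True): "political",        # low ambiguity, high conflict
        (True, False): "experimental",     # high ambiguity, low conflict
        (True, True): "symbolic",          # high ambiguity, high conflict
    }
    return styles[(high_ambiguity, high_conflict)]


# Example: a clearly specified but contested policy.
print(implementation_style(high_ambiguity=False, high_conflict=True))
```

Treating both dimensions as simple high/low switches is a simplification for illustration; in empirical work each would need an operational definition of its own.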
Interestingly, it is also at this time that the heated debate between the two contending approaches seems to dissipate, perhaps because many leading voices in this respect had already exited this field of
study and the remaining scholars started to worry more about its survival.
The 2000s, compared to the 1990s, present a different and largely opposite pattern of implementation research. There was a rebound in published volumes almost back to the level of the 1980s, but the decade was characterized mostly by stagnation or even some slight setbacks in terms of some salient third-generation research indicators (see tables 60.1–60.3). Most noticeable in the latter respect is a reduction of 5–6 percentage points from the 1990s in research articles employing more advanced statistical techniques of analysis. More use of qualitative data, back to the level of the 1980s, is another aspect of the same trend. On most other third-generation indicators, changes from the 1990s are quite small. By far the largest change during the 2000s had to do with the regional focus/origin of all articles published. In this respect, Europe leaped ahead from 23% in the 1990s to 43%, while the United States and Canada experienced a somewhat weaker reverse trend, from 61% in the former period to 48% in the 2000s. Another very important and noteworthy change here was that by the 2000s, male-only authorship of journal articles had been reduced to 51% from 69% the decade before.
Our most recent data, concerning 2010–2017, offer another variation of previously observed trend patterns. First, there is a rebound back to the level of the 1990s in terms of the use of advanced statistical techniques of analysis and application of only qualitative data. Second, the surge in published volumes from the 2000s has continued at an even stronger pace, with an increase of more than 50% from the previous decade. On two other salient third-generation indicators, we observe opposite trends. The first is positive in the sense that we now find the single case study design in only one out of four research articles. The other is less positive in the sense that the comparative research design in the same type of articles has dropped from 50% to 47%. The decline in publications from North American scholars has continued at an even stronger pace than before, now accounting for only 32% of the total since 2010, while the same figure for their European colleagues has risen to 48% (see table 60.8). Last, but not least, the positive trend towards gender parity has continued: women, either alone or together with men, are now represented as authors in 55% of all sampled core journal articles (see table 60.7).
Table 60.5 Core journal articles by region of focus/origin and research methodologies. Percentage base: empirical articles

                        US & Canada   Europe   Other Regions
Single case studies     33%           37%      49%
Only qualitative data   51%           67%      70%
Only quantitative       33%           20%      13%
Questionnaires          20%           10%      9%
Comparison              54%           51%      30%
Cross-national          7%            20%      11%
Cross-state             26%           3%       6%
Cross-local systems     11%           10%      4%
Cross-agency site       7%            9%       0%
Cross-program/policy    6%            7%       2%
Longitudinal            19%           15%      29%
Statistical analysis    32%           16%      11%
Regression              24%           10%      9%
(N)=100%                (327)         (240)    (97)
Table 60.6 Core journal articles by regional focus/origin and research methodologies before and after the mid 1990s. Percentage base: empirical articles

                        US & Canada             Europe                  Other regions
                        1953–1995   1996–2017   1953–1995   1996–2017   1953–1995   1996–2017
Single case studies     40%         26%         41%         36%         59%         44%
Comparative studies     50%         59%         48%         52%         37%         27%
Qualitative data only   53%         48%         75%         65%         78%         67%
Only Quantitative       29%         38%         7%          24%         7%          16%
Questionnaire data      15%         24%         9%          11%         7%          10%
Longitudinal data       21%         16%         27%         12%         22%         31%
Statistical analysis    26%         39%         4%          20%         7%          13%
Regression              17%         31%         2%          13%         7%          10%
(N)=100%                (167)       (160)       (56)        (183)       (27)        (70)
Table 60.7 Core journal articles by regional focus/origin and gender profile of authors before and after the mid 1990s. Percentage base: all sampled articles

                         US & Canada             Europe                  Other regions
                         1953–1995   1996–2017   1953–1995   1996–2017   1953–1995   1996–2017
Male authors only        72%         45%         82%         57%         82%         67%
Female authors as well   28%         55%         18%         43%         18%         33%
Total (N)=100%           (242)       (194)       (76)        (285)       (27)        (61)
Missing data (N)         (2)         (4)         (0)         (6)         (10)        (24)
Table 60.8 Articles by type of core field journal published in and time period. Percentage base: all sample articles

Type of core field journal:            1953–1995   1996–2017   Total
Public Administration and management   32%         51%         43%
Public Policy                          49%         30%         38%
Political Science                      18%         19%         19%
(N)=100%                               (349)       (487)       (836)
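The regional figures in Table 60.6 can be pooled to show how a change in the regional mix of articles affects an aggregate indicator. The following is only an illustrative sketch and not part of the original analysis: it takes the regression shares and article counts from Table 60.6, computes a pooled share for each period, and then a hypothetical counterfactual in which the 1996–2017 regional rates are weighted by the 1953–1995 regional mix. The names `pooled_share` and `counterfactual` are introduced here for illustration. The table percentages are rounded, and some articles cover more than one region, so the pooled figures are approximate.

```python
# Share of empirical articles using regression, by region and period
# (transcribed from Table 60.6; rounded percentages).
regression_share = {
    "1953-1995": {"US & Canada": 0.17, "Europe": 0.02, "Other regions": 0.07},
    "1996-2017": {"US & Canada": 0.31, "Europe": 0.13, "Other regions": 0.10},
}

# Number of empirical articles (N) by region and period, also from Table 60.6.
n_articles = {
    "1953-1995": {"US & Canada": 167, "Europe": 56, "Other regions": 27},
    "1996-2017": {"US & Canada": 160, "Europe": 183, "Other regions": 70},
}


def pooled_share(rates, weights):
    """Weighted average of regional rates, using article counts as weights."""
    total = sum(weights.values())
    return sum(rates[region] * weights[region] for region in weights) / total


# Pooled regression share in each period, using that period's own regional mix.
early = pooled_share(regression_share["1953-1995"], n_articles["1953-1995"])
late = pooled_share(regression_share["1996-2017"], n_articles["1996-2017"])

# Hypothetical counterfactual: later regional rates, earlier regional mix.
counterfactual = pooled_share(regression_share["1996-2017"], n_articles["1953-1995"])

print(f"Pooled share 1953-1995:           {early:.1%}")
print(f"Pooled share 1996-2017:           {late:.1%}")
print(f"1996-2017 rates, 1953-1995 mix:   {counterfactual:.1%}")
```

Under these assumptions the pooled regression share rises from roughly 13% to roughly 19%, whereas it would have reached roughly 25% had the regional mix stayed as it was before the mid 1990s. The gap between the last two figures is one rough way to express how much the growing weight of European and other-region articles dampens the aggregate trend even though every region improved.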
Summary and discussion
Four main findings stand out in our analysis. The first is that, since the 1970s, implementation research has made no small strides towards the application of the more rigorous, scientific research methodologies prescribed by the third-generation research paradigm, although progress in this respect has been more uneven and mixed from the mid 1990s onwards. The second finding is that this positive development has resulted in little or
no progress towards a theory of policy implementation that motivated these advances in research methodology. Third, we have for over half a century observed a major shift in the regional focus of implementation studies from North America towards Europe and other regions as well as a major development towards an increased gender parity in terms of journal authorship. Finally, we see a significant rebound of implementation research publications starting from approximately the mid 1990s and continuing more or less
steadily and exponentially until the present time. The latter trend strongly suggests that implementation research has now become fashionable again after its demise as a research topic from the mid to late 1980s. In this final section, we will comment on these main observations, discuss their inter-relationships and links to other simultaneously occurring changes, as well as their impacts on contemporary scholarship and prospects for theory building in implementation research.
As to the first finding, we think advances in this respect (see tables 60.1–60.3) have been substantial enough to justify the claim that implementation studies conducted after 1990 have made the transition into the third-generation research paradigm as outlined above, based on Goggin et al. (1990). A somewhat puzzling observation here was that progress in this respect has been more uneven and mixed across some salient research methodologies since around the end of the last century. A closer investigation of our data reveals that the latter observation is probably in no small part due to the influence of our third main finding: the shift in regional origin and focus of implementation studies that
increased substantially in pace from the early to mid 1990s and onwards (see chart 60.3) away from North America, first towards Europe and then other regions as well. The downside of this otherwise positive trend was that it meant that implementation studies from the latter mentioned regions with less advanced research methodologies increasingly replaced those from North America which employ more sophisticated research techniques. Table 60.5 clearly demonstrates these regional differences in application of third-generation research methodologies. Furthermore, the trend towards gender parity in journal authorship has been stronger among North American implementation scholars than their colleagues from Europe and other regions of the world (see Table 60.7). Thus, we see how one main positive developmental trend may have some adverse impact on another equally important and positive trend. The good news here is that these regional differences in application of third-generation research methodologies and gender disparity are decreasing over time as tables 60.6 and 60.7 demonstrate. This, again, suggests that, in the years to come, implementation studies
Chart 60.3 All sampled Cardbox core journal articles by region of origin/focus and time of publication in 5-year increments. The absolute number here (763) is larger than the total number of journal articles (602) since some of these, due to country comparison, cover more than one region
[Chart not reproduced. Series: US & Canada, Europe, Other regions. Horizontal axis: 1950–79, 1980–84, 1985–89, 1990–94, 1995–99, 2000–04, 2005–09, 2010–14. Vertical axis: 0–90.]
from Europe, but particularly also from less developed regions, will play a more important role in advancing the third-generation research paradigm than they have in the past.
The next important question concerns why so much progress in terms of more advanced research methodologies has resulted in no visible progress towards what motivated their promotion: a theory of policy implementation. Instead of convergence, we observe more, rather than less, diversity with respect to theoretical constructs and derived hypotheses subjected to some form of empirical testing. One simple explanation for this apparent paradox may lie in the fact that quite a few new theoretical approaches in relevant academic fields like organization theory, public administration/management and public policy date their origins back to the 1970s and 1980s but did not become more widely known and applied until the 1990s and later. Thus, implementation scholars may have found it more interesting and exciting to pursue some of these new theoretical constructs and ideas than to provide another test of ones that are more established but not so new. Another explanation of why so much comparative implementation research has been insufficient in this respect is that comparison, per se, is no panacea, as its value for theory building critically depends upon the way in which it is done. All too often, selection of units of analysis for comparison – e.g. in comparative public policy research – is made on the basis of convenience and other pragmatic criteria rather than those promulgated in textbooks as ideal for theory building purposes (Dierkes et al., 1987). There is no reason to think that the situation in comparative implementation research in general has been much better in this respect.
Nevertheless, despite this disappointing situation, but contrary to what O’Toole (1986) said, we would argue that we are closer to an outline of a theory of implementation in the sense that there is more agreement now than then on what constitutes some of the potentially crucial variables in
explaining implementation performance and outputs. In fact, O’Toole (1986) himself indicated this based on his comprehensive review of the implementation literature by summing up his findings and pointing to the five most common clusters of variables that implementation scholars had found to be important in their studies. These were: (1) policy characteristics, (2) resources, (3) implementation structure(s), (4) attitudes, perceptions and actions of implementing personnel, and (5) target group characteristics. This range of clusters of variables that bridges the top-down and bottom-up divide is supposed to function as a ‘road map’ guiding implementation researchers in the direction of potentially important explanatory variables in their research projects. It has appeared in several state-of-the-art review textbooks since then (e.g. Winter, 1990, 2012). Obviously not all these clusters of variables are equally important regardless of context and type of policy issue. Hence, the only way to sort out which variables are important, when, where and under what conditions, is to carry out comparative studies that are sufficiently well designed with respect to case selection to establish some sort of control for endogenous and exogenous model variables, and thereby narrow down the range of plausible explanatory variables. The good news here is that there is a separate strand of systematic comparative implementation research whose focus, until recently, has made it largely irrelevant for the implementation research community more generally but is now not only highly relevant, but also quite promising with respect to research findings for developing a more parsimonious theory of policy implementation.
The tentative theoretical conclusions from this first strand of implementation research are to a large extent supported by another strand, which is very different from the first in many respects but whose cumulative research findings and conclusions nevertheless point in the same direction. What they have in common
are two things. First, their theorizing tends to build on well-established theoretical constructs within their respective specialized fields, which is conducive to theoretical knowledge accumulation. Second, they both try to integrate more levels of analysis regarding the multi-layered governance structure of modern states into their research designs but from opposite directions in that same context. Until now, these two strands of implementation research have not communicated much with each other and may not even be aware of what is going on outside of their respective specialized research fields. This is unfortunate because a careful reading of recent state-of-the-art reviews by seasoned scholars within each strand of implementation research clearly indicates a trend of convergence towards a common type of theoretical explanation with respect to policy implementation. This provides a good rationale for describing the two, their backgrounds, differences and accomplishments, towards a common goal in more detail.
EU Implementation Studies
Until recently, these studies have had little or no connection to other types of implementation research in our sample of that universe. This seemingly odd fact may, in addition to the familiar myopia of specialized scholars, be due to two other factors. First, these studies started to take off in the late 1980s and 1990s when implementation research tapered off more generally and became unfashionable, as detailed above. Second, they defined policy implementation in a manner that is more generally at odds with this field of research. Thus, implementation was understood to be about whether and to what extent EU legislation and directives were followed up and integrated into national laws in member states (legal implementation) without paying much attention to whether these new laws and directives were actually implemented domestically (practical implementation). In the usual policy
studies parlance, this legal implementation concept could of course be said to entail implementation in the true meaning from the central EU level point of view, but it is nevertheless a case of policy adoption, and not implementation, as seen from the member states’ viewpoint. Despite impressive cross-national comparative research design, innovative theorizing and sophisticated statistical multivariate methods to analyse quantitative data, these studies and their findings have had limited value for other implementation scholars because their dependent variable was not really about implementation. They were also faulted by poor data quality in measuring their already awkward dependent variable. The good news here is that, during the last decade or so, some leading scholars in this field of research have become quite critical of the predominant legal implementation orientation among their colleagues and have argued that much more research needs to be devoted to the systematic study of the next stage of EU policy execution, i.e. practical domestic implementation at the member state level (see e.g. Treib, 2014). One important fact bolstering this argument is that there are different types of actors with different logics of action involved at the two policy stages in question here, with national-level politicians dominating legal implementation and administrative actors dominating the practical implementation stage. Hence, it cannot axiomatically be assumed that the type of theoretical constructs found to explain legal implementation will do the same with respect to practical implementation (Treib, 2014: 29). On the other hand, it is argued by the same critic that the practical implementation of EU policies in member states is not fundamentally different from the implementation of laws that have a national origin, thus making theoretical insights from traditional domestic implementation research highly relevant (Treib, 2014: 32). 
One such theoretical insight is sometimes called the ‘goodness of fit’ or alternatively
the ‘misfit’ hypothesis. Its origin can be dated back to ‘top-down’ domestic implementation scholars in the early 1980s (Mazmanian and Sabatier, 1983 [1989]) who then labelled it ‘the statutory coherence thesis’. The argument here is that the degree of fit between the content and design of enacted laws and the existing administrative and institutional arrangements designated to implement them is crucial to subsequent policy success or failure. The misfit hypothesis is consistent with the neo-institutional approach in political science, which assumes that political-administrative institutions are characterized by path dependency and hence inertia towards substantial change that is not compatible with established policy legacies and practices (March and Olsen, 1989). Efforts to test the goodness of fit hypothesis by EU implementation scholars, mostly focusing only on legal implementation, have revealed mixed support for it. By contrast, Meier and McFarlane (1996) found much more support for the same goodness of fit or statutory coherence thesis by studying the implementation outputs and outcomes of four US federal family policies in all 50 states using quantitative data and a longitudinal (7 years) comparative research design. May (1993), in a less comprehensive cross-state comparative study in another US policy area also using quantitative data, did not find empirical support for the statutory coherence hypothesis as originally formulated. He concluded that while clear and coherent policy goals might be a sufficient condition for successful implementation, they are not a necessary one, as other local factors might compensate for the adverse, confusing impacts of incoherent policy goals on policy implementation. The latter conclusion supports a criticism that the misfit hypothesis has often been interpreted in too deterministic, rather than probabilistic, a manner (Treib, 2014).
We will conclude this section on the contribution of EU implementation studies to the broader field of implementation research by
referring to a recently published study whose research design is probably the most rigorous and advanced in terms of the third-generation research paradigm. It was conducted by Zhelyazkova et al. (2016). Their database seems to remedy the shortcomings of previous research, not only by including practical implementation in addition to legal implementation as dependent variables, but also by using more valid and reliable indicators in this respect. Their findings and conclusions also shed new light on the relationship between the legal and practical implementation of 24 EU directives within four policy areas in 27 member states. First, the gap between legal and practical implementation observed in previous research (mostly case studies) is confirmed here to be far from rare, though not excessively widespread either. Second, and more interestingly and perhaps more surprisingly, they find more variation in practical implementation performance between policy sectors and issues than between member states. Third, their findings are also interesting in terms of the goodness of fit theoretical construct that they also tried to test. This is how they summarize their study:
The findings from this study suggest that practical implementation is mostly shaped by the institutional capacities of member states (management) and the perceived legitimacy of EU policies by societal actors while the policy preferences of political actors (enforcement) have little impact on practical implementation. In other words, political actors are generally unable to fully steer the implementation process in their most desired direction, even if their preferences are broadly aligned with the goals of internationally agreed policies. (Zhelyazkova et al., 2016: 843)
In a follow-up study by some of the same authors (Schrama and Zhelyazkova, 2018) using more refined data on civil society organization and activities, they find that high levels of civil society participation and routine consultation with relevant implementing agencies also improve policy implementation, though this effect is not observed in
countries with low administrative capacities. In other words, in the practical implementation stage, and contrary to legal implementation, it appears that institutional and societal factors trump the political ones. Another recently published study, which used fuzzy-set qualitative comparative analysis to examine the implementation in Swiss cantons of federal labour market policies aimed at integrating asylum seekers, arrived at a similar conclusion, namely that institutionalized policy paths trump politics (Sager and Thomann, 2017: 287). We will end this section by mentioning an important caveat formulated in the exemplary study by Zhelyazkova et al. (2016), stating that their analysis ‘does not directly capture the preferences or ideas of administrative actors regarding legislative outcomes’ and acknowledging that ‘in-depth case study approaches are better suited to analyzing implementers’ ideas and preferences towards EU policies’ (ibid.: 843). Fortunately for us, there is another strand of implementation research that has worked for some 40 years to shed light on this uncharted area of EU implementation studies. Hence, it can help us to assess how much it matters that data on street-level bureaucrats’ preferences are not included when explaining policy implementation performance and outcomes.
Street-Level Bureaucracy Research
Contrary to the previous strand of implementation studies, this one is very much rooted in traditional domestic policy implementation research. It emerged as a reaction to the first-generation implementation scholarship and its predominantly top-down orientation, both empirically and normatively, as outlined earlier in this chapter. The most elaborate, and by far the most influential, formulation of the alternative bottom-up approach to understanding the implementation process was Michael Lipsky’s classic study published in 1980, titled Street-Level Bureaucracy: Dilemmas of the Individual in Public Services. To this day, core ideas and concepts from this
book have continued to guide research on what happens at the lowest level of governance and in a diversity of academic fields beyond political science and public administration (e.g. law, economics, criminology). Initially the field developed in the United States, but over time it has spread to other regions, thus reducing substantially the initial predominance of North American scholarship. In that sense the field and its empirical basis is implicitly cross-national in character even though very few studies have an explicit cross-national comparative research design. More common is comparing public service agencies at the lowest level within and across counties/municipalities. Only one-third of published articles seem to be based on only qualitative data, and a similar amount use quantitative data and statistical techniques of analysis (usually multivariate regression). The main dilemma facing street-level bureaucrats, according to Lipsky (1980), is that they often are stuck with inadequate resources and excessive workloads forcing them to resort to informal coping mechanisms in order to handle their work in a satisfactory, if not optimal, manner. The fact that street-level bureaucrats have and need some discretionary space to apply rules and make judgements on their application in individual cases facilitates this coping behaviour. Thus, as Lipsky (1980) famously claimed, street-level bureaucrats, in the final analysis, become de-facto policy makers by exercising their discretion through coping behaviour in their client/citizen interaction. An extensive recent review of the literature revealed that three broad families of coping behaviour could be identified (Tummers et al., 2015): moving towards clients, moving away or moving against them. Moving towards clients may involve rule bending, rule breaking or use of other means, all aimed at helping the clients. Moving away from clients, as the concept suggests, is about avoiding meaningful contact with clients, e.g. 
through dealing with them in standardized ways (routinization) or making their access to public services more
difficult (rationing). Moving against clients can imply coping such as ‘rigid rule following’ or hostile, even aggressive, responses to difficult clients as a way to relieve frustrations. The good news here is that negative coping behaviour seen from the clients’ point of view seems to be infrequently used (Tummers et al., 2015; Radlick, 2018). Recent reviews of the street-level research literature seriously challenge some not uncommon interpretations of Lipsky’s aforementioned, famous claim with respect to the street-level bureaucrats’ role in the implementation process, namely that their exercise of discretion through coping behaviour tends to subvert the good intentions of formal policy makers. Thus, Brodkin (2015: 32–33), a seasoned scholar in this field of research, says that a striking thing emanating from a growing body of relevant studies is that they ‘rarely with only some exceptions, indicate opposition or resistance to policy aims, at least as street-level practitioners understand them’. Furthermore, she says that ‘practices that appear “deviant” or “subversive” from a principal-agent perspective may have little to do with practitioners’ personal preferences’ and ‘are better understood as adaptions to conditions of work’. Finally, her clinching summary conclusion in this respect: ‘These studies also point to the limitations of seeking to explain street-level behavior as a consequence of individual level phenomena (e.g. preferences, training etc.) without accounting for organizational conditions that affect what individual can and are likely to do under certain conditions’ (ibid.: 33). Summing up research on the use of discretion in this context from management studies, the same scholar reaches a similar conclusion by saying that ‘…a growing body of management studies clarifies that discretion operates within limited degrees of freedom, embedded in an organizational context that shapes the possibilities of its use’ (Brodkin, 2015: 36). 
The theoretical insight that emerges from the two strands of implementation studies is that the implementation process at the
domestic and practical level is not influenced much at all by the preferences of either political actors at the top or at the bottom of the political-administrative hierarchy. Rather, it is the capacities and malleability of the intermediate institutions and administrative structures, combined with societal legitimacy and participation from civil society actors, that seem to be most important in this context. As political scientists, some of us may find this conclusion about politicians playing only a relatively minor role in the execution process of policies not only surprising but also somewhat disturbing from a normative point of view unless we believe strongly in a sharp distinction between politics and administration. An explanation for this, suggested by the EU implementation scholars themselves, could be that despite the diversity of policy areas studied, a large part of their related directives are usually of a technical nature that are dealt with by the national bureaucracies in a routine fashion (Treib, 2014: 23). Furthermore, they suggest that in policy issues that are more party politically salient, and perhaps more contentious but also rarer, politicians may play a much more important role. To test the latter assumption, this author (Sætren, 2015) carried out a cross-national and longitudinal comparative case study of efforts by three Scandinavian governments to relocate some of their central agencies from their respective capitals to other more peripheral regions. It involved comparing seven national relocation programmes (two in Finland and Sweden and three in Norway) launched over a period of five decades with more failures in implementation than success (one success in Norway and Sweden and only failures in Finland). 
The selection of policy cases was premised on Lowi’s (1964) famous dictum that policies determine politics and his insight that redistributive policies are among the most contentious and difficult of all to adopt and implement as they, by their very nature, tend to create winners and losers. This type of policy usually means that resources of various kinds are redistributed
from those who have plenty (losers) to those who have less (winners). The challenge for governments in implementing such policies is that the losers in this context have not only stronger motivation but also more resources to mobilize in attempting to thwart such policies. In this context the perceived losers constituted a powerful and broad-based coalition of actors centrally located relative to policy makers, such as the personnel and leaders of affected central agencies, their trade unions and superior ministries, as well as the biggest newspapers located in the capital. On the other hand, the potential winners had fewer strategic resources, constituted a more fragmented and peripheral constellation of actors relative to policy makers, were poorly organized and often did not know what was at stake. Thus, the odds were stacked heavily against any policy success, an expectation that proved to be correct. What this author found to be the more intriguing research question in this context was not why a majority of these relocation programmes failed, but rather why a few succeeded against very unfavourable odds. It was assumed that the answer to this research question could also shed some light on the more crucially contingent factors involved in policy implementation more generally. The interest group politics so important to explaining policy failures turned out not to play a significant role at all in explaining policy success. The same was the case with respect to public opinion, level of macro-economic resources or the severity of the policy problem the relocation programmes were supposed to address.
Instead, it was the political will of central policy makers, a beneficial geographical distribution of agencies to be relocated relative to legislators’ constituencies (programme design), the internal cohesion and stability of their type of government, the number of institutional veto points in the political system more generally and the way the relationship between central agencies and their mother ministries were organized that turned out to be most important in this respect. Thus, even apparently
strongly articulated political will to ensure the relocation of central agencies by policy makers crumbled rapidly when their governments had low internal cohesion and stability (due to oversized majority coalitions), faced multiple institutional veto points (e.g. in the parliament) and affected agencies had easy access to their respective ministries. Again, the goodness of fit or misfit theoretical construct seems highly relevant here as well. The latter study also underscores the critical role played by the duality of institutions as both constraining and enabling the policy-making process depending on their design and the policy issue at stake. To sum up, even in our very politically contentious policy case we end up concluding, as others have done before us, that institutions are more important than politics and related interest group activities as the former structure and regulate the latter (Steinmo et al., 1992).
Returning to our summary of major findings previously in this section, the steady and exponentially increasing volume of articles published since the mid 1990s, mostly by European scholars, was among the least expected. This latter trend has been accompanied by a significant change among two types of core field journals over time in terms of their share of all articles published (cf. Table 60.8). Thus, while public administration and management journals published one-third of all articles up to the mid 1990s, and public policy journals published nearly 50%, that position has been more than reversed since, with public administration and management journals now publishing 51% of all articles and public policy journals less than one-third. On the other hand, political science journals stand out here with the smallest and most steady share of implementation articles over time, hovering just below 20% both before and after the mid 1990s. What may have caused this strong resurgence of interest in implementation research in Europe? 
A few factors may be worth mentioning here. First, the expansion of the EU’s policy agenda since the early 1990s has triggered increasing scholarly interest in
how its policies are not only formulated, but also implemented. Thus, while only 2% of empirical articles in our sample focused on the implementation of EU policies up to the mid 1990s, that portion has risen to 13% since then. Virtually all of these studies have been conducted by European scholars. Second, a new, updated general textbook on policy implementation authored by two European scholars, Michael Hill and Peter Hupe, published in 2002 with two subsequent reprints (2009 and 2014), titled Implementing Public Policy, soon became a standard reference for students and younger policy scholars. Third, at approximately the same time (2003), a new Handbook of Public Administration with a European co-editor (B. G. Peters and J. Pierre) contained no fewer than three separate chapters on implementation. Fourth, for the first time since the late 1980s, several international journals such as Public Administration (2004), Public Management Review (2012), Public Policy and Administration (2014) and Journal of Comparative Policy Analysis (2015) had special issues with European guest editors devoted to policy implementation topics. Last, but not least importantly in this respect, policy implementation has also regained its popularity as a regular workshop theme at international conferences in political science, public administration and public policy. To conclude, policy implementation research has made great strides towards implementing the more rigorous research methodologies prescribed by the third-generation research paradigm, but not enough, and not in a manner that has resulted in a more parsimonious theory of policy implementation. However, recent third-generation studies in some of the strands of implementation research discussed above are promising in this respect, as they investigate more systematically and cumulatively the dynamic, contingent relationships among the constituent parts of such a theory, whose contours have been known for quite some time.
Implementation research is a multidisciplinary enterprise (Sætren, 2005) which has grown exponentially in volume during
the last 20 years in all academic fields, but interestingly less so in our defined core fields than in most others, as charts 60.1 and 60.2 clearly indicate. Furthermore, our core fields constitute a fairly small part relative to other relevant academic fields in this respect. Thus, Sætren (2005: 565) found that in 2000, their share accounted for only 11%. Considering the more rapid expansion of implementation research in neighbouring academic fields (e.g. education, health and management), this number is no doubt even smaller close to 20 years later. The downside of this fact is that implementation research in these other, larger academic fields seems to be guided more by area- and sector-specific practical issues than by theory development more generally (see e.g. Nilsen et al., 2013). A revival of interest in implementation research in our core fields started among public administration and management scholars in Europe and later Asia, and has more than compensated for a declining interest in this type of research in North America. Conversely, political science scholars have shown the least interest in implementation research, judging by the relative volume of articles published in political science journals and the absence of this research topic in previous handbooks of political science. Why this is the case is a mystery to this author, considering the crucial role of implementation in any governmental policy-making process. The fact that this is the first chapter on implementation in any handbook of political science will hopefully signal a growing interest in, and recognition of, the salience of this field of research among political scientists as well.
References
V. Aubert (ed.) (1969) Sociology of Law. Selected Readings (Harmondsworth, UK: Penguin Books Ltd). E. Bardach (1977) The Implementation Game (Cambridge, MA: MIT Press).
S. Barrett and C. Fudge (1981) Policy and Action (London: Methuen). P. Berman (1980) ‘Thinking about Programmed and Adaptive Implementation. Matching Strategies to Situations’, pp. 205–227 in H. Ingram and D.R. Mann (eds) Why Policies Succeed or Fail (Beverly Hills/London: Sage). E.Z. Brodkin (2015) ‘The inside story: street-level research in the US and beyond’, pp. 25–42 in P.L. Hupe, M.J. Hill and A. Buffat (eds) Understanding Street-Level Bureaucracy (Bristol: Policy Press). H. Cooper (2010) Research synthesis and meta-analysis: A step-by-step approach (Thousand Oaks: Sage). M. Derthick (1970) The Impacts of Federal Grants (Cambridge, MA: Harvard University Press). M. Derthick (1972) New Towns In-Town (Washington, DC: The Urban Institute). M. Dierkes, H.N. Weiler and A.B. Antal (1987) Comparative Policy Research: Learning from Experience (New York: St. Martin’s Press). R.F. Elmore (1980) Backward Mapping. Implementation Research and Political Decisions, Political Science Quarterly, 94, 601–616. M.L. Goggin, A. Bowman, J.P. Lester and L.J. O’Toole Jr. (1990) Implementation Theory and Practice: Toward a Third Generation (Glenview: Scott, Foresman/Little, Brown). T.E. Hall and L.J. O’Toole Jr. (2000) Structures for Policy Implementation. An Analysis of National Legislation 1965–1966 and 1993–1994, Administration & Society, 31, 667–686. K.I. Hanf and F.W. Scharpf (eds) (1978) Interorganizational Policy Making: Limits to Coordination and Control (London: Sage). E. Hargrove (1975) The Missing Link (Washington, DC: The Urban Institute). M.J. Hill and P.L. Hupe [2002] [2009] (2014) Implementing Public Policy: An Introduction to the Study of Operational Governance (London: Sage). B. Hjern and D.O. Porter (1982) Implementation Structures. A New Unit for Administrative Analysis, Organization Studies, 2, 211–237. H. Kaufman (1960) The Forest Ranger (Baltimore: Johns Hopkins University Press). J.P. Lester and M.L. 
Goggin (1998) Back to the Future: The Rediscovery of Implementation Studies, Policy Currents, 8, 1–9. S.H. Linder and B.G. Peters (1987) A Design Perspective on Policy Implementation: The
Fallacies of Misplaced Prescription, Policy Studies Review, 6, 459–475. M. Lipsky (1980) Street-Level Bureaucracy. Dilemmas of the Individual in Public Services (New York: Russell Sage Foundation). T. Lowi (1964) American Business, Public Policy, Case Studies and Political Theory, World Politics, 16(4), 677–715. J.G. March and J.P. Olsen (1989) Rediscovering Institutions: The Organizational Basis of Politics (New York: The Free Press). R.E. Matland (1995) Synthesizing the Implementation Literature: The Ambiguity-Conflict Model of Policy Implementation, Journal of Public Administration Research and Theory, 5, 145–174. P.J. May (1993) Mandate Design and Implementation: Enhancing Implementation Efforts and Shaping Regulatory Styles, Journal of Policy Analysis and Management, 12(4), 634–663. D.A. Mazmanian and P. Sabatier [1983] (1989) Implementation and Public Policy (Glenview: Scott, Foresman). K. Meier and D.R. McFarlane (1996) Statutory Coherence and Policy Implementation: The Case of Family Planning, Journal of Public Policy, 15(3), 281–298. P. Nilsen, C. Ståhl, K. Roback and P. Cairney (2013) Never the Twain Shall Meet? A Comparison of Implementation Science and Policy Implementation Research, Implementation Science, 8(63), 1–12. L.J. O’Toole Jr. (1986) Policy Recommendations for Multi-Actor Implementation: An Assessment of the Field, Journal of Public Policy, 6, 181–210. L.J. O’Toole Jr. (2000) Research on Policy Implementation: Assessment and Prospects, Journal of Public Administration Research and Theory, 10, 263–288. B.G. Peters and J. Pierre (eds) [2003] (2012) Handbook of Public Administration (London: Sage). J.L. Pressman and A.B. Wildavsky [1973] (1984) Implementation. How Great Expectations in Washington Are Dashed in Oakland (Berkeley: University of California Press). R.L. Radlick (2018) Implementing integration programs: Accountability, coping and the pursuit of performance. PhD dissertation, University of Bergen. P. Sabatier and D.A. 
Mazmanian (1980) The Implementation of Public Policy.
A Framework of Analysis, Policy Studies Journal, 8, 538–560. P.A. Sabatier (1986) Top-down and bottom-up approaches to implementation research. A critical analysis and suggested synthesis, Journal of Public Policy, 6, 21–48. F. Sager and E. Thomann (2017) Multiple streams in member state implementation: politics, problem construction and path dependency in Swiss asylum policy, Journal of Public Policy, 37(3), 287–314. R. Schrama and A. Zhelyazkova (2018) ‘You can’t have one without the other’: The differential impact of civil society strength on the implementation of EU policy, Journal of European Public Policy, 25(7), 1029–1048. P. Selznick (1949) TVA and the Grass Roots (Berkeley: University of California Press). S. Steinmo, K. Thelen and F. Longstreth (1992) Structuring Politics. Historical institutionalism in comparative analysis. (Cambridge: Cambridge University Press). H. Sætren (2005) Facts and myths about research on public policy implementation: Out-of-fashion, allegedly dead, but still Very Much alive and relevant. Policy Studies Journal, 33, 559–582. H. Sætren (2014) Implementing the third generation research paradigm in policy implementation research: An empirical assessment, Public Policy and Administration, 29, 84–105.
H. Sætren (2015) Crucial factors in implementing radical policy change: A comparative longitudinal study of Nordic central agency relocation programs, Journal of Comparative Policy Analysis, 17(2), 103–123. O. Treib (2014) Implementing and complying with EU governance outputs, Living Reviews in European Governance, 9(1), 1–47. L. Tummers, V. Bekkers, E. Vink and M. Musheno (2015) Coping during public service delivery. A conceptualization and systematic review of the literature, Journal of Public Administration Research and Theory, 25, 1099–1126. D.S. Van Meter and C.E. Van Horn (1975) The Policy Implementation Process, Administration & Society, 6, 445–488. S.C. Winter (1990) ‘Integrating Implementation Research’, pp. 19–38 in D.J. Palumbo and D.J. Calista (eds) Implementation and the policy process. Opening up the black box (New York: Greenwood Press). S.C. Winter (2012) ‘Implementation Perspectives: Status and Reconsideration’, pp. 264–278 in B.G. Peters and J. Pierre (eds) Handbook of Public Administration, Chapt. 16 (London: Sage). A. Zhelyazkova, C. Kaya and R. Schrama (2016) Decoupling practical and legal compliance: Analysis of member states’ implementation of EU policy, European Journal of Political Research, 55, 827–846.
61 Informal Governance and Participatory Institutions Leonardo Avritzer
Introduction
Governance is a concept that emerged to study the post-Cold War consensus on the organization of the state and public policies (see Milani, Chapter 59, this Handbook). During most of the 20th century, definitions of the state were based on the Weberian notion of coercion and administrative monopoly (Tilly, 1975; Weber, 1946: 78). Coercion is the state’s capacity to employ physical force in a legitimate way. Alongside its coercive capacity, the state also has an administrative monopoly. Weber’s focus was on an administrative staff with hierarchy, management by rules and a specialized division of labor (Weber, 1978). For him, and for some subsequent authors such as Tilly, all these characteristics were ‘universal’ in the sense that all states shared the need for coercion but organized it differently. This was the justification of both authoritarian and top-down forms of government. They both have a similar hierarchical organization and
assume that bureaucracy could play the role of getting access to information and organizing policy. This is the key idea at stake in the process of informal governance. After the end of the Cold War, the reorganization of the international order took place through a series of changes that called into question the Weberian notion of coercion and administrative monopoly. This conception was weaker in the United States, where several different institutional arrangements for the organization of public policy were already in place during the post-war period (Dahl and Lindblom, 1953). After the end of the Cold War, this structure of different designs for the organization of public policy became widespread and gave rise to an important body of literature. The foundation of the new approach is the acknowledgment of the incapacity of top-down institutions to steer democratic processes or to democratically organize public policy. Democracy needs the input of citizens in the organization of policy for several reasons: the bureaucracy alone cannot cope with
the gathering of information (Fischer, 2006); bureaucracy is not a neutral institution and has established its own interests in top-down processes of policy elaboration (Lowy, 1970); and there are social actors whose integration into policy elaboration democratizes the policy and furthers democratic legitimacy. Informal governance, as a field of research, focuses on the adaptation of the issue of governance and the organization of public policy in the new democratic era. It associates a new understanding of the flaws of top-down models of bureaucratic organization (Fischer, 2012; Fung, 2006) with a new valorization of citizens’ input toward democracy. It is important to differentiate the emerging literature on informal governance from the literature on informal institutions (Helmke and Levitsky, 2004). Informal governance is a literature within democratic theory whereas informal institutions is a Latin American theory about informal arrangements that go around the formal institutions and have specific rules and practices (O’Donnell, 2000). In their work on informal institutions, Helmke and Levitsky (2004) argue that rule-based incentives interact or compete with informal rules. In order to fully grasp this problem, it is important to first of all understand what is meant by rules. According to Hall and Taylor (1996: 949), ‘institutions are …formal and informal procedures, routines, norms and conventions embedded in the organizational structure of the polity…’. This definition broadens the traditional approach to institutions. Yet, it still falls short of recognizing the different forms that rules and norms of reciprocity can take in different contexts that admit informal rules or norms (O’Donnell, 1996). Networks of trust, for instance, operate with rules and norms that are not always embedded in the organizational structure of the polity. 
In addition, the specific way that new rules are fixed or negotiated at the micro-level, which was not a major concern for new institutionalism, is of key importance in understanding a few phenomena in Latin America, such as human rights violations or
the formation of political majorities in some countries. Thus, though this literature is radically different from the literature of informal governance, it is still important to point out that informal rules operate within multilevel arrangements among social actors and governance institutions for the production of public policy. In this chapter, I will discuss the emergence of the concept of informal governance and its main characteristics from the perspective of democratic theory, trying to differentiate the existing experiments according to their capacity to further democracy. My aim will be to show the reasons for the emergence of the so-called ‘participatory arrangements’ in different global contexts. I will discuss five models of informal governance: the North American model centered on environmental issues; the Latin American model centered on urban issues; the European model centered on subsidiarity and urban issues; the Chinese model centered on the differentiation between national and local political organization; and the African model centered on accountability to international donors. My point is that, due to its high level of flexibility, informal governance adapts itself to the needs or demands that exist in specific contexts, even if they are non-democratic.
Informal governance and the emergence of participatory institutions
The end of the Cold War led to a new focus on processes of collaboration between social actors and the state beyond the so-called top-down model of bureaucratic organization that prevailed during the post-war period. A few experiences anticipated the participatory models that emerged in Latin America, Europe, Asia and Africa. The oldest model of informal governance and participation emerged in the United States in the area of environmental policies. Habitat Conservation
Plans are part of the Endangered Species Act of 1973, which develops a few specific policy designs for cases of risk to animals and wildlife. The very aggressive stance of the Endangered Species Act, which was still elaborated under a top-down understanding of governance, led economic interests to withdraw investments from areas in which species were declared endangered, because of the excessive economic risks. This led to new legislation in 1982, which allowed several actors to submit Habitat Conservation Plans that involved multiple partners, could use large parts of public land and could lead to either market-based controls or zones of restricted use (Thomas, 2003: 158). The idea behind these plans involved two important aspects. The first is that a coalition of multiple actors generates superior knowledge of the conditions for governance on the ground. The second is that bureaucracy can be either replaced by or combined with local actors’ input of information that would otherwise not be available. Habitat Conservation Plans are an interesting way of introducing the issue of informal governance because they leave it to the applicants to determine the institutional design that best fits their aims. Thus, they involved multiple actors – economic actors, scientists, NGOs, among others – in order to deal with issues that the top-down Weberian model could not deal with because it could only work with the dual code of allowing/forbidding. They also introduced a horizontal way of dealing with policy issues that largely anticipated approaches adopted in other parts of the world after the fall of the Berlin Wall. However, Habitat Conservation Plans in the United States, from the very beginning, adapted themselves to the contextual conditions of the country in the sense that they were local and thematic and did not involve changing or democratizing broader political structures. 
On the contrary, they adapted themselves to the flexible logic of American federalism. The 1990s saw the broadening and the diffusion of informal governance experiences
in recently democratized countries in Latin America, Southern Europe, China and Africa. Latin America took the lead in this process with the democratization of most of its countries between 1983 (Argentina and Bolivia) and 2000 (Chile). Informal governance in Latin America varies from country to country but has a common framework based on the need to make public policy flexible and to incorporate social actors into decision-making. Most countries that democratized rewrote their constitutions and approved new ones in which broad frameworks of collaboration between state and civil society emerged. Brazil, Colombia and Bolivia are such cases, and Argentina and Mexico also qualify at the sub-national level as experiences of informal governance and local participation. I will discuss some of these cases.
Constitutionalism, participation and informal governance
The constitutions in Brazil, Colombia and Bolivia introduced incentives to social participation and to both new participatory drives and old institutionalized ones. Democratic design involved introducing a participatory element between organizations, citizens and the state. These institutions have specifically been designed to increase and deepen citizen participation in the decision-making process (Smith, 2009: 1). Policies of income distribution, democratization of access to urban land, gender inclusion, racial integration and environmental policies gained participatory elements and established informal arrangements for the governance of public policies. Policy councils in Brazil are the result of a constitution-making process that accepts participation and accountability and gives social actors an important role in the implementation of health and social assistance policies. During the last two decades, the proliferation of health
and social assistance councils in Brazil greatly improved the quality of these policies at the local level, particularly in cities with more than 100,000 inhabitants. In all successful cases, technical knowledge was democratized, and efficiency was produced through the participatory control of power holders. Participatory budgeting (PB) emerged in Porto Alegre in 1990 as the result of structural characteristics present only in Rio Grande do Sul and of a space for innovation opened by the 1988 Constitution. PB changed the process of budget-making in Brazil (Abers, 2000; Avritzer, 2009; Baiocchi, 2005; Wampler, 2015). In its most important application, in Porto Alegre (1990–2003), PB introduced new types of institutions, three of which involve deliberation: regional and thematic assemblies, the Participatory Budgeting Council (COP) and the establishment of rules for the decision-making process itself. Each of these created a new type of interface between the state and civil society. However, the most important characteristic of all these deliberations is that they are integrated with policy decisions in urban politics. In this sense, this deliberative institution is an attempt to combine collective action logic with democratic innovation in one specific area of social policy. Thus, PB is clearly an institution that challenges technical control over urban decisions and democratizes the debates on the budget. To understand it better, it is necessary to comprehend the emergence of democratic innovation in Brazil. PB is the result of the Brazilian process of democratization as well as of its impact on the region of the country that developed the most democratic tradition. It took advantage of the new legislation that emerged in the constitution-making process as well as of the productive overlapping of a tradition of democratic associations in the South with a specific proposal for participation that emerged within the Workers’ Party (PT). 
PB is a local innovation that responded to the drive to overcome the prevalence of private interests in urban policy. Both social actors (namely poor urban
dwellers) and political actors (the local branch of the PT that supported the initiative) participated in the process. This case was successful because of the positive convergence of actors, social and political, in the elaboration of a bottom-up process of policy-making. Success reflects both a democracy-deepening (Fung and Wright, 2003) and a distributive capacity (Avritzer, 2009). Together they constitute a challenge to the bureaucratic and technical conception of budget elaboration. However, the question of the replicability of the policy needs to be answered separately, since the same conditions may also function as limits to expansion and policy replicability. Thus, PB is part of a process of introducing informal governance institutions in Latin America. PB design is a response to top-down policies that did not work well previously; it also brings new actors into policy-making and creates new institutions that deeply democratize urban policy in Brazil. Comités de vigilancia in Bolivia are a consequence of the implementation of the Ley de Participación Popular (LPP), which was the result of the long process of resistance of indigenous and rural populations that needed to negotiate public policy demands with the central state. The Bolivian LPP was a key aspect in the process of change of the Bolivian state. In its first article, the law recognizes the need to integrate isolated indigenous and rural populations into the country’s political life (Bolivia, 1994). The Bolivian 2009 Constitution maintained all the elements of the LPP in Articles 271 and 272. The law decentralized local government in Bolivia and transferred 20% of the federal budget to more than 300 municipalities where comités de vigilancia have been active in the control of health and education policies. 
Comités de vigilancia performed the role of accountability and have allowed health and education policies to become more efficient than during the period of top-down technical implementation. The LPP created approximately 16,000 territorial organizations (OTBs) to establish
community priorities. The law represented many advances, but the most important one was the creation of administrative references for rural communities. Prior to the LPP, most rural communities had to relate to an abstract and distant state in La Paz that was very centralized and difficult to reach unless there were huge mobilizations toward the center. The LPP changed this situation by decentralizing the state prerogatives on health, education and culture and creating incentives for individuals to control government at the local level. In addition to creating OTBs, the LPP established comités de vigilancia (monitoring or oversight committees) (Oxhorn, 2001). These committees are made up of canton representatives (the canton being the smallest administrative unit in Bolivia) and have their members elected by all grassroots territorial organizations. The LPP introduced two new designs that were key in the democratization of local politics: the first is the creation of a mechanism to coordinate the actions of the new local authority with the local community; the second is the creation of the concept of a socio-cultural unit that gave those not represented the opportunity to create their own form of representation (Halkyer, 2000). Today, the LPP and the comités de vigilancia are perhaps the key participatory and monitoring institutions in Bolivia because of the role that they play in health and education policies, which were also decentralized. Thus, we have a second model of informal governance with the presence of participatory institutions in Latin America. According to this model, informal governance emerges in situations in which there is a strong capture of the state¹ through normal bureaucratic top-down policies. The introduction of horizontal bottom-up policies responded to the need to change public policy priorities. This has been done in different ways in different countries: in Brazil, PB emerged as the most important among the democratic responses to unequal access to public goods. 
However, it should be noted that despite its world-famous position, PB has shown success in decision-making
processes related to small infrastructure works in the outskirts of large Brazilian cities. It did not involve, or could not carry, successful decision-making on social policies. Other less participatory designs were more successful in relation to social policies. It is also interesting to discuss the relation between PB, as an informal governance arrangement, and democratic participation. The model of PB that emerged in Porto Alegre was highly participatory. It involved a system of open-entry regional assemblies that took place twice during the budgeting cycle. These assemblies were highly democratic for at least two reasons: first, because they involved a significant part of the poor population of the city; second, because they were mandatory and, as such, bound the public administration to the assemblies’ decisions. The results of 14 years of PB were positive because they integrated a system of informal governance with a process of democratic participation, each reinforcing the other. However, it would be a mistake to assume that every other form of democratic participation and informal governance is also democratic or has a democratic effect, as I will show in regard to the European and African models. The European model of informal governance and participation is different from both the North American and the Latin American models. The North American model of informal governance is more topic-based or thematic, but it also builds a lot on an existing tradition of flexible public administration which allows new modes of governance in order to deal with emerging issues. In this case, as I have shown above, informal governance connects with a federalist and local democratic tradition. With regard to the Latin American case, there is a tenuous connection with the European system, depending on the model adopted. 
Some European cases strongly resemble the Brazilian case of informal participation, whereas other cases seem to draw upon the publicity of PB in order to establish a very limited process of
horizontal participation with almost no governance involved. I will differentiate three cases of informal governance and participation in Europe in order to make this point: the first is the German/French cases of PB; the second is the Portuguese case; and the third is the Madrid and Barcelona cases of PB. Their occurrence will be approached in terms of waves of informal governance and participation. The first wave of informal governance and democratic participation in Europe dates to the late 1990s and was influenced by the success and visibility of PB in Brazil. It was also influenced by the emergence of global alternatives for new forms of governance, such as the World Social Forum of Porto Alegre, and of networks of participatory cities that emerged in Europe and Latin America, among them the URB-AL network. In this wave, PB emerged as a new solution to offset the lack of participation in Europe. Sintomer et al. (2008) systematized this first wave, pointing out two models of participation introduced in Europe, one that they call ‘Porto Alegre adapted for Europe’ and a second one they call ‘Participation of organized interests’. In the first model, mainly left-wing administrations in Europe tried to reproduce the Porto Alegre model of participation. In this case, the experiences dealt with investments and projects, whereas in the second model, city councils continued to have decision-making power over the budget (Sintomer et al., 2008). Most of the experiences of participation and informal governance in this first wave have been marginal and did not have the democratizing and political influence that PB has had in Porto Alegre. A few conclusions can be drawn in relation to this first wave, which, according to the authors, also included a few other marginal experiences. 
The first conclusion is that the reproduction of a very wide participatory and informal governance model in Europe would be very difficult unless some sort of strong political renewal took place, as has happened in Spain with Podemos.
Before the 2008 economic crisis, most of the experiences belonged to left-wing administrations that did not have much influence on the European systems. A second issue that already emerged, even in these very early experiences, is linked to the fact that a very well-established top-down form of political administration already exists in the European continent, and a mode of informal governance would be at best marginal in this scenario unless a country were struck by a major crisis. This has been the case in Portugal and Spain after 2008. Portugal and Spain became highly participatory after the 2008 crisis, and yet participation in the two countries assumed different configurations. In the case of Portugal, there has been a widespread emergence of participatory experiences across the political spectrum (Allegretti and Alves, 2012). After an initial unsuccessful drive for participation that resembled the other European cases, Portugal developed what has been called a second wave of PB. The main characteristics of these informal governance mechanisms are: (a) a large range of means of participation in order to overcome low levels of face-to-face participation; (b) a strong concentration on training; and (c) a strong presence of external consultants interacting with the cities’ technical personnel. Portugal has had 64 experiences of PB, but only 18 survived after a 10-year period. Among the 18, there is one truly meaningful experience, in Cascais, that places Portugal in the list of countries with a strong top-down informal governance orientation with a mostly symbolic meaning. Participation and informal governance have little influence in the design of key policies, but they do have a piecemeal presence in all policies as a way of encouraging organized actors and key members of civil society to interact with the government. 
The last case in Europe that we are going to discuss is the most meaningful among the experiences of informal governance that emerged in Spain with the rise of Podemos. In contrast with Portugal, Spain developed
many new forms of political mobilization after the 2008 crisis, among them the Indignados and the Platform for People Affected by Mortgages. However, it was after the rise of Podemos (Bringel, 2015) that several new experiences of participation emerged, including a new wave of PB in which Madrid has played a key role since the election of Ahora Madrid. This local party unleashed a process of public consultation that included, among other things, the vote on the pedestrianization of the Gran Via, a process that saw the participation of more than 210,000 people, close to 10% of all eligible voters (Ramos, 2018). Madrid also introduced a PB with the participation of 45,000 people and the approval of 188 proposals for public works in the city. When we look more closely at the Madrid model of participation, which resembles participation in other large cities governed by Podemos, such as Barcelona, we see an important difference vis-à-vis other European models. The Madrid experience transfers effective decision-making prerogatives to the participants of its informal governance experiences, whereas in most of the other European cases the models of participation
are both restricted and top-down. The Madrid and Barcelona Podemos models differentiate themselves from the restricted European model for reasons similar to the ones that triggered participation in Latin America, namely that there is a left-wing party that wants to differentiate its governance model from those of the other existing parties. Because there is a repressed demand for public goods, it is best approached through mechanisms of informal participation that transfer the pressure for delivery to the informal governance mechanisms.
The three models discussed above – North American, Latin American and European – allow us to propose a first typology of models of participation and informal governance (see Table 61.1). This typology starts with the North American case, which is the oldest and has very specific characteristics, such as the fact that it draws on federalism and a local model of participation that goes back to the early 19th century (Rehfeld, 2005). This model made forms of informal governance viable at a time when most international experiences of public policy implementation were top-down and bureaucratically oriented.
Table 61.1 Models of informal governance
| Country or region | Type of informal governance | Features of participation | Administrative result | Levels of governance integrated |
| --- | --- | --- | --- | --- |
| North America | Local and based on federalism | Horizontal and thematic | Production of administrative efficiency in thematic policies | The state and NGOs at the local level |
| Latin America | Local and based on a relation between participation and representation | Horizontal and comprehensive at the local level | Production of administrative efficiency in local policies | Local but related to a new constitutional tradition |
| Europe | Variation from country to country. In the Spanish case of Madrid and Barcelona, it is based on the empowerment of citizens at the local level | Horizontal and comprehensive at the local level in the case of Spain | Production of efficiency in local policies in the case of Spain (Madrid and Barcelona) | Local government with informal EU support |
Non-democratic forms of participation
During the last decade, informal models of governance made their way to other parts of the world, in particular China and Africa. However, it is important to keep in mind that these experiences need to be sharply differentiated from the three models above because they are not integrated with democratic aims. China is the political model that, following the end of the Cold War, did not make an attempt to pursue a transition to democracy (Tsai, 2006). However, the fact that China did not pursue a clear strategy of democratization did not preclude a strategy of using institutions of informal governance in order to deal with local demands. China introduced many informal institutions or informal governance experiences in order to pursue much-needed administrative reforms that would cope with existing problems in the area of budgets and the struggle against corruption. Even before PB had been introduced in China, Chinese villagers or village representatives had been monitoring the budget with what we may call accountability aims. They wanted to ensure ‘that village leaders collect money for public goods, distribute village income in a fair way and invest village money effectively’ (He, 2011: 123). Thus, we may say that one of the main advantages of informal governance institutions is their flexibility: they can either be associated with participatory democracy and democratization or be dissociated from this broader aim, as has been the case in China.
PB in China was introduced mostly at the town level, though there have been cases at the city and province levels. In 1998, Heibi province introduced PB meaning that partial budgets were disclosed to the people’s deputies of the People’s Congress for examination and deliberation. In 2004, Huinan township in Shanghai undertook an experiment in public budgeting. Similar experiments in Xinghe and Zeguo townships were conducted in 2005. They subsequently spread to 8 neighboring townships in Weiling in 2009 and to 79 townships in Taizhou in 2010. (He, 2011: 123)
These experiments all share similar characteristics, namely that they create local mechanisms for participation on the budget, but they stop at this level and do not interfere with national politics (see Table 61.2). Finally, we have the case of informal governance and participation in Africa. Informal governance in some African cases has had a different configuration because it does not emerge from the initiatives of the government or civil society but rather through top-down policies of international institutions, in particular the World Bank, acting in association with national governments. In the cases of Cabo Verde and Mozambique, there is very little connection between the practice of informal governance and the introduction of participatory institutions. PB was introduced by the World Bank in Africa with the intention of expanding the Brazilian experiences of PB. The detailed story of this incorporation of a left-wing practice by an international institution is yet to be told and it is linked to the visit to Porto Alegre of a World Bank team for the
Table 61.2 Expansion of informal governance to semi- or non-democratic contexts
| Country or region | Type of informal governance | Relation with participation | Administrative result |
| --- | --- | --- | --- |
| China | Local and based on concessions from the central government | Horizontal and thematic | Production of administrative efficiency |
| Portuguese-speaking Africa (Mozambique and Cabo Verde) | Local and unleashed from the top as part of the international agencies agenda | Selective and insulated from the rest of the government | Production of accountability to international agencies |
overseeing of a loan and its approval by the PB Council. The association between PB and the World Bank indicates an additional dimension of informal governance, namely the way it connects different governance levels to the international institutions. However, the type of adaptation needed in order to tackle the issue of informal governance in these African countries has been larger than in other settings because this process intertwines with the huge need for international financial aid, which, most of the time, comes tied to administrative conditions (Goldfrank, 2018; Nylen, 2014). The best-known cases of PB in Africa are the ones in the Portuguese-speaking countries, mostly Mozambique and Cabo Verde. In Mozambique, PB emerged as part of a broader attempt at decentralization and deconcentration. This plan has led to experiences of PB in some cities such as Quelimane, Beira and Nampula. These experiences involve a strong adaptation both of the meaning of informal governance and of its relation to participatory experiences. These cases are driven by incentives produced by the World Bank and UN urban program resources (Jamal, 2017). Independent analysts of these experiences suggest that, in spite of some goodwill of their carriers – particularly in Beira and Maputo – these experiences show the impossibility of moving beyond the meaningful inclusion and distribution of resources due to the contextual politics where they are inserted (Jamal, 2017; Nylen, 2014). In relation to Mozambique’s case, Jamal (2017) points to the weakness of local democracy and the low quality of citizen participation in the local decision-making forum.
For him, ‘the participatory process in Mozambique is no more than a local community representation-based platform in which citizens are represented through their traditional chiefs, religious leaders and influential individuals to form the local consultative councils to interact with local governments’ (Jamal, 2017: 240). Even in cases where there is a strong willingness to
implement horizontal informal participation, as has been the case in Maputo, this attempt clashed with the darker side of the party-state system he [Comiche] hoped to reform: the propensity of party-based clientelism to easily morph into corruption and exclusion. Like many if not most noncompetitive systems, the Frelimo party-state is constructed upon and maintained through the continuation and even expansion of this ‘darker side’, all at the ultimate expense of a more generalized or universalized public service’. (Nylen, 2014: 55)
Thus, the cases of informal governance in Africa are similar to the cases in Asia and they involve a topical attempt to introduce informal governance at the bottom, keeping formal governance at the top (see Table 61.2).
Conclusion
To draw a line under this analysis, we can say that informal governance is a solution that emerged after the fall of the Berlin Wall to integrate social actors into public policy in contentious cases where the top-down model failed. Its most important characteristic is the way it intertwined with horizontal forms of political participation, opening a new avenue for tackling social problems. The specific democratic cases which have taken place in North America, Europe and Latin America have significantly contributed to a democratic deepening in these regions, despite limitations in some areas of public policy and in the territorial scope of the experience. In these cases, informal governance is a democratizing tool as long as it is effective. However, as I have pointed out in this chapter, informal governance, despite its democratizing intentions, can also be expanded to non-democratic settings with different consequences. The case of China is emblematic in this respect because the introduction of participation in the budget has been associated with the lack of intention to introduce
democracy or representative institutions at the core. The cases of informal governance in Africa that have been discussed in this chapter are somewhat different because they belong to authoritarian electoral settings (Levitsky and Way, 2010; Nylen, 2014) and participation is part of the attempt of international institutions to have transparency over their funds rather than to introduce horizontal relations. The results in each of the cases are different because informal governance, per se, cannot democratize undemocratic settings. It can only deepen democracy in contexts where it is already in place.
Note
1 Under the topic of ‘state capture’, there is important literature on state regulatory policy through the insertion of interest groups in policy arrangements. Though this literature devoted more attention to state development, it has also been applied to state urban policy (Evans, 1995).
References
Abers, R. (2000). Inventing Local Democracy: Grassroots Politics in Brazil. Boulder, CO: Lynne Rienner Publishers.
Allegretti, G. and Alves, M. L. (2012). (In)stability, a Key Element to Understand Participatory Budgeting: Discussing Portuguese Cases. Journal of Public Deliberation, 8(2), Article 3.
Avritzer, L. (2009). Participatory Institutions in Democratic Brazil. Washington, DC: Wilson Press/Johns Hopkins University Press.
Baiocchi, G. (2005). Militants and Citizens. Stanford, CA: Stanford University Press.
Bolivia (1994). Ley de Participación Popular, Pub. L. No. 1551.
Bringel, B. (2015). 15-M, Podemos e os movimentos sociais na Espanha. Novos Estudos Cebrap, 34(103), 59–76.
Dahl, R. and Lindblom, C. E. (1953). Politics, Economics and Welfare: Planning and Politico-economic Systems Resolved into Basic Social Processes. Chicago: University of Chicago Press.
Evans, P. (1995). Embedded Autonomy: States and Industrial Transformation. Princeton: Princeton University Press.
Fischer, F. (2006). Participatory Governance as Deliberative Empowerment: The Cultural Politics of Discursive Space. The American Review of Public Administration, 36(1), 19–40.
Fischer, F. (2012). Participatory Governance: From Theory to Practice. (D. Levi-Faur, Ed.). Oxford University Press. Retrieved from http://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199560530.001.0001/oxfordhb-9780199560530-e-32
Fung, A. (2006). Varieties of Participation in Complex Governance. Public Administration Review, 66(s1), 66–75.
Fung, A. and Wright, E. (2003). Deepening Democracy. London: Verso Press.
Goldfrank, B. (2018). Brazil’s Participatory System in Comparative Context. Presented at the Latin American Studies Association Annual Meeting, Barcelona.
Hall, P. A. and Taylor, R. C. R. (1996). Political Science and the Three New Institutionalisms. Political Studies, 44(5), 936–957.
Halkyer, R. O. (2000). Municipalización de pueblos indígenas en Bolivia: impactos y perspectivas. In W. Assies, G. van der Haar, and A. Hoekama, El reto de la diversidad: pueblos indigenas y reforma del estado en America Latina. Zamora, Mexico: El colegio de Michoacan, 315–340.
He, B. (2011). Civic Engagement through Participatory Budgeting in China: Three Different Logics at Work. Public Administration and Development, 31(2), 122–133.
Helmke, G. and Levitsky, S. (2004). Informal Institutions and Comparative Politics: A Research Agenda. Perspectives on Politics, 2(4), 725–740. https://doi.org/10.1017/S1537592704040472
Jamal, S. (2017). The Role of Participatory Budgeting in Promoting Urban Development in Mozambique (Dissertation (PhD)). University of Coimbra/Centre for Social Studies, Coimbra. Retrieved from https://estudogeral.uc.pt/bitstream/10316/80763/1/The%20Role%20of%20Participatory%20Budgeting%20in%20Promoting%20Urban%20Development%20in%20Mozambique.pdf
Levitsky, S. and Way, L. A. (2010). Competitive Authoritarianism: Hybrid Regimes after the Cold War. New York, NY: Cambridge University Press.
Lowy, T. (1970). The End of Liberalism. New York: Norton Press.
Nylen, W. R. (2014). Participatory Budgeting in a Competitive Authoritarian Regime: A Case Study (Maputo, Mozambique). Cadernos IESE, (13E).
O’Donnell, G. (1996). Illusions about Consolidation. Journal of Democracy, 7(2), 34–51.
O’Donnell, G. (2000). Poliarquias: a (in)efetividade da lei na América Latina. Uma conclusão parcial. In G. O’Donnell, J. E. Mendez, and P. S. Pinheiro, Democracia, violência e injustiça. O não estado de direito na América Latina. São Paulo: Paz e Terra, 337–373.
Oxhorn, P. (2001). La construcción del estado por la Sociedad civil. La ley de participación popular de Bolivia y el desafio de la democracia local. Washington, DC: Interamerican Development Bank.
Ramos, A. (2018). Citizen Participation in Madrid: Interaction or Disconnection. Paper presented at the conference Systemic Approaches to Democracy. Cascais, Portugal, Feb., 2018.
Rehfeld, A. (2005). The Concept of Constituency: Political Representation, Democratic Legitimacy, and Institutional Design. https://doi.org/10.1017/CBO9780511509674
Sintomer, Y., Herzberg, C. and Röcke, A. (2008). Participatory Budgeting in Europe: Potentials and Challenges. International Journal of Urban and Regional Research, 32(1), 164–178.
Smith, G. (2009). Democratic Innovations: Designing Institutions for Citizen Participation. Theories of Institutional Design. Cambridge: Cambridge University Press.
Thomas, C. (2003). Habitat Conservation Planning. In A. Fung, and O. Wright, Deepening Democracy. London: Verso, Chapter 5, 157–187.
Tilly, C. (1975). The Formation of National States in Western Europe. Princeton: Princeton University.
Tsai, K. S. (2006). Adaptive Informal Institutions and Endogenous Institutional Change in China. World Politics, 59(1), 116–141.
Wampler, B. (2015). Activating Democracy in Brazil. Notre Dame: Notre Dame Press.
Weber, M. (1946). Politics as a Vocation. In H. H. Gerth and C. W. Mills, From Max Weber: Essays in Sociology. Oxford: Oxford University Press, 77–128.
Weber, M. (1978). Economy and Society: An Outline of Interpretive Sociology (Vol. 1). Berkeley: University of California Press.
62 Local Politics
Hellmut Wollmann
Scope and focus
Crucial Institutional Conditions of Local Politics
In taking up the issue of local politics, the article shall focus on and single out three institutional dimensions upon which the viability and vitality of local politics crucially hinge. First, decentralization: the position of local government in a country’s multi-level governmental system essentially depends on whether, and to what degree, relevant powers, competencies and responsibilities are transferred (‘decentralized’) from the upper levels to the lower (local government) levels. Second, local territoriality: the size (‘scale’) of the local territory essentially influences the democratic and operational base and potential of local politics. Third, democratization: the democratic quality of local politics essentially hinges on the choice of institutions, rules and procedures of representative and/or direct democracy and on the forms of politico-administrative leadership in local government.
Local Politics in a ‘Global Perspective’
In line with the ‘global perspective’ of this Handbook, the article aspires to attain a global coverage. In doing so it will address, in a much used dichotomous classification, ‘developed’ as well as ‘developing’ countries or, in another dichotomous parlance, ‘Global North’ as well as ‘Global South’ countries. However, preference will be given to identifying and grouping the countries by their geographical allocation to continents and ‘global regions’, largely in line with the ‘mapping by continents’ method that has been used in the Global Reports initiated, composed and published by United Cities and Local Governments (UCLG) (UCLG, 2008).
Conceptual Scheme of Analysis
When attempting to identify the factors and events that have influenced the course of decentralization, territoriality and democratization in the respective countries, two sets of factors should be highlighted at the outset. First, emblematic of the ‘developed’ versus ‘developing’ country dichotomy, the countries differ greatly as to whether local-level decentralization and democratization have long been, to a larger or lesser degree, part and parcel of the country’s history and fabric (which applies particularly to North American and European countries) or whether they have only relatively recently been ushered in as crucial elements of the country’s political transformation (such as in Asia-Pacific and African countries at the end of colonial rule, in Latin American countries after the termination of military rule and in Central Eastern European countries following the collapse of the communist regimes). These differences in their historical ‘starting conditions’ and political contexts should be given particular attention in the analysis of the respective countries. Second, the socio-economic, politico-cultural, ethnic and other discrepancies and differences, which exist not only between the ‘developed’ and ‘developing’ countries but also within them, need to be given prime attention as driving factors and forces that entail varied trajectories of local-level decentralization and democratization.
At this point it should suffice, using GDP per capita¹ as an indicator, to highlight the glaring socio-economic disparity between the United States with $59,530 and the (EU) European countries with $33,715, on the one side, and Latin American/Caribbean countries with $8,313, Asia-Pacific countries with $7,127 and Sub-Saharan African countries with $1,552, on the other.² But great differences appear within these global regions as well, for instance between India with $1,939 and the entire Asia-Pacific region with $7,127, or between the Sub-Saharan African countries with $1,552 and South Africa with $6,160.
Towards a Comparative Analysis?
The intended global outreach of this account could, at first sight, suggest that this article should be carried out as a methodologically sophisticated piece of comparative analysis, perhaps even along the lines of a ‘most similar’ or ‘most different cases’ research design (Przeworski and Teune, 1970; Lijphart, 1975). However, on closer inspection, the multitude of variables (methodologically speaking, both dependent and independent) with which such a globally far-reaching investigation is bound to come to grips virtually rules out such a methodologically demanding and rigorous analysis, not to mention the lack of time and resources. Yet a methodologically less elaborate and, as it were, softer comparative approach appears to be worth attempting. Instead of treating the three proposed institutional dimensions country by country, this article addresses each of them in a separate section in a cross-country comparative manner. Dealing with these three dimensions in separate sections might allow a narrowing of the range of relevant variables to a ‘researchable’ size and thus enhance the analytical potential of identifying the (causal) relation between, say, the rate of decentralization or the profile of local autonomy on the one side and the influencing factors (say, the specific historical ‘starting conditions’, for instance the end of colonial, military, etc., rule) on the other. Such a methodologically ‘softer’ approach promises to reach (alluding to the distinction between ‘comparative’ and ‘comparable’ research aptly proposed by Derlien, 1992) ‘comparable’ results and insights even if (methodologically rigorous) ‘comparative’ analyses do not appear to be feasible. Finally, a caveat needs to be voiced.
In view of the great number of countries ‘around the globe’ (amounting, in the official counting of the UN, to 195) and of the ensuing multitude of local government systems, the following attempt at venturing a global perspective cannot help being selective, fragmentary and ‘broad brush’ in scope.
Decentralization
In addressing decentralization, this chapter deals with the transfer of powers, competencies and responsibilities, etc., in multi-level government systems to subnational government levels, with a focus on the (lower tier) local government level (municipalities etc.), its autonomy and competencies. Most European countries, being unitary States, are made up of three government levels, i.e. the central government, the regional/meso and the local government levels, the latter often having a two-tier structure (for instance county and municipality). In federal countries, the meso level has a federal status in its own right (i.e. Länder in Germany and cantons in Switzerland, as well as the ‘comunidades autónomas’ in Spain and the ‘regioni’ in Italy, both ‘quasi-federal’ countries) (Kuhlmann and Wollmann, 2019: 146). The legal regulation of local government lies at the central government level or (in federal countries) with the regions (Länder, cantons). Since the 1980s, in EU member states, the supra-national level has been added to the existing multi-level structure of the member states, with its norm-setting and financial funding influencing (‘Europeanizing’) the local level as well (Guderjan, 2019: 396). Traditionally, in Continental European countries, local government has been provided, often constitutionally, with a ‘general competence’, which is at the core of the typical ‘multi-function’ model of local government (Wollmann, 2004a). In the European Charter of Local Self-Government, which was adopted by the Council of Europe in 1985 and subsequently ratified by all European governments, the common understanding of the ‘general competence’ has been reiterated and confirmed in the formulation that the local authorities have ‘the right and ability, within the limits of the law, to regulate and manage a substantial share of public affairs under their own responsibility and in the interests of the local population’.
By contrast, in the UK, historically the ultra-vires doctrine reigned, under which the
local authorities could carry out only those functions assigned to them by Parliament. However, in 2000, legislation granted the local authorities a so-called ‘well-being power’ (‘to promote the economic, social and environmental well-being of their area’), which now comes close to a general competence clause (Wilson and Game, 2011: 169). During the 1960s and 1970s, in some European countries, in particular in the UK, Sweden and Germany, the political and functional role of the local authorities was significantly strengthened along with the post-war expansion of the national welfare state, which climaxed in the 1970s when the implementation of public policy tasks was increasingly transferred (decentralized) to the local authorities. Typically, the decentralization of tasks went hand in hand with (in part massive) local-level territorial reforms that were meant to enhance the capacity of the local authorities to cope with these tasks (see below in the section ‘Territoriality’). As a result, Sweden, being a unitary state, stands out as an exceptionally decentralized country marked by the politically, functionally and financially strongest local authorities in Europe. Exemplifying Sweden’s peculiar ‘local Welfare State’, local-level personnel amounts to 83% of all public sector employees, with the lion’s share (70%) of local government spending paid from local taxes (Kuhlmann and Wollmann, 2019: 102). During the 1980s and 1990s some countries embarked upon decentralizing their hitherto (‘Napoleonic’) centralist states. In 1982, France adopted a major decentralization act (Kuhlmann and Wollmann, 2019: 164) and in the early 1990s Italy followed suit (ibid.: 169).
In both cases, the decentralization measures implied a significant political and functional upgrading of the regional levels (départements and regioni respectively) and, albeit to lesser degree, some political and functional strengthening of the local government levels (communes and comuni respectively) as well.
In Central Eastern European (CEE) countries, after 1989/1990, following the collapse of the communist regimes, the previously centralist communist/socialist states were dismantled and ‘transformed’ by rampant decentralization and the introduction of local self-government (for country reports see Baldersheim et al., 2003; Marcou and Wollmann, 2008: 139). Hungary presents a doubly peculiar case. After 1989, this country became the most decentralized one in the region, with the politically and functionally strongest local authorities among the transformation countries. By contrast, since 2011, under the right-wing government led by Viktor Orbán, Hungary has experienced a radical re-centralization of the entire politico-administrative system; the local authorities were stripped of many of their functions and put under stringent central government control (Kuhlmann and Wollmann, 2019: 171). As a result of different state traditions and different rates of decentralization, the scope of local government functions varies significantly between European countries. Judging by municipal expenses as a percentage of GDP in 2007 (Baldersheim and Rose, 2010b: 3, table 1.1), three ‘Nordic’ countries, Denmark, Sweden and Finland (with 32.2%, 24.5% and 19.3% respectively), unsurprisingly top the ranking, while Italy and France (with 15.1% and 11.2%) hold a middle rung and Greece (with 2.5%) comes out at the low end (for another largely congruent 2005 dataset, see Marcou and Wollmann, 2008: 143, figure 1). Until the collapse of the Communist regime in 1990, the Soviet Union was marked by the post-Stalinist model of extremely centralist one-party rule under which, premised on the doctrine of the ‘Unity of the State’, all subnational (regional and local) levels (as well as societal spheres) were bereft of any autonomy and meant to serve as subnational and local cogs in the centralist state machinery (Wollmann, 2004b).
In the wake of the reforms (perestroika) initiated by Mikhail
Gorbachev in the late 1980s, and following the break-up of the Soviet Union in late 1991, Russia emerged as a federal country made up of 85 federal regions (‘federal subjects’), including two ‘federal States’ (the capital Moscow and St. Petersburg). The new Constitution adopted in 1993 gave the regions (which, during the turbulent transition period under President Yeltsin, had unfolded almost ‘secessionist’ dynamics) far-reaching powers. At the same time, article 12 of the Constitution laid down that ‘the bodies of local self-government shall not be part of the State power bodies’ (Gel’man, 2008: 71; Wollmann, 2004b: 112; Khabrieva et al., 2008: 97), thereby conspicuously breaking with the Soviet doctrine of the ‘unity of the State’. Since becoming President in March 2000, Vladimir Putin has resolutely moved towards re-centralizing the country’s intergovernmental system and establishing a top-down ‘vertical power’ structure, in order to bring the regions back under central government (‘presidential’) control and accordingly to subject the local authorities to the central government’s rigid command and supervision. Flying in the face of the constitutional promise of article 12, Russia has turned back to ‘statelize’ (ogosudarstvlenie) local government (Gel’man, 2008: 71), thus reviving the centralist legacy and imprint of the Soviet past. In the United States, the constitution (made effective in 1789) is based on the concept of ‘dual sovereignty’, according to which the (now) 50 states and the federal government are deemed to have separate and independent spheres, each having some sovereignty in its own affairs (Savitch and Vogel, 2005: 219). Hence, legislation on local government lies with the individual states, each having its own legislative framework on local government. Consequently, there are practically ‘some fifty American local government systems’ (Sellers, 2008: 238).
Under the so-called Dillon’s rule, which was historically inherited from the British ultra-vires doctrine, the local authorities can exercise only the functions explicitly assigned to them by
their state. However, in the meantime, all but three states have granted the local authorities so-called ‘home rule’ powers, by virtue of which local governments have, in practice, reached a great deal of legislative, operational and fiscal autonomy (Sellers, 2008: 237; Savitch and Vogel, 2005: 220). The Latin American countries gained independence from colonial rule in the 1810s and 1820s and fell under military dictatorship during the 1970s and 1980s. After ending military rule during the 1980s, they embarked upon decentralization and democratization. This process was significantly propelled by the Inter-American Development Bank and the World Bank, which emphasized and financially fostered the role of local government in the promotion of economic development (Rosales and Carmona, 2008: 171; Kersting et al., 2009: 77). A prominent example is Brazil (with some 200 million inhabitants, it is the largest Latin American country). Following the end of the military regime, a new constitution was adopted in 1988, which defined the newly established federal system as an ‘indissoluble union of states and municipalities and the federal district’ (i.e. the capital), thus constitutionally recognizing the federal status of the municipalities (besides the 26 federal States) and granting them full autonomy (Rosales and Carmona, 2008: 175, also on Mexico’s similar development). Strengthening the political autonomy of local government has been a major feature of the recent decentralization process in Latin American countries (Nickson, 2019: 140). However, while national legislation has usually granted local government a general competence-type autonomy, the municipalities have rarely taken the initiative to expand their own mandate due to their continuing financial, technical and political weakness (Kersting et al., 2009: 83; for an overview of all Latin American countries see Rosales and Carmona, 2008, and Kersting et al., 2009, with country reports on Bolivia, Chile and Paraguay).
In (South) Asia-Pacific countries, following the end of colonial rule and their often conflict-ridden independence process between
1945 and 1947, institution-building in these countries was typically confronted with the problem of what has been aptly called ‘internal de-colonisation’ (Vajpeyi, 2003a: 11). That is, with the task of coping with and overcoming the institutional and mental legacies left by their respective (British, French or Dutch) colonial past. Thus, the new national elites took over not only many of the institutions inherited from the colonial era, but also the mental attitude and pattern to govern from the centre and to treat the local level and its native leaders as subject to a paternalistic and top-down rule (Baldersheim and Wollmann, 2006b: 117). Consequently, when moves towards decentralization and democratization got under way during the 1990s – also promoted by international donors – they showed a great variance in timing, scale and modality still shaped by the country’s specific colonial legacy (Nickson et al., 2008: 53). India (with about one billion inhabitants, it is the second largest country in the world – second to China with some 1.4 billion people) is a case in point. Historically, rural India was marked by ‘panchayats’ (village assemblies) that date back to ancient times and which after 1857, when India came under British colonial rule, were largely side-lined or instrumentalized for central British colonial administration (Vajpeyi and Arnold, 2003b: 34). The Federal Constitution of 1950, adopted after India’s independence and laying the ground for the federal system, required that ‘the (federal) States shall take steps to organize village panchayats and endow them with such powers and authority as may be necessary to enable them to function as units of local self-government’. But, during the years to come, the (federal) States failed to live up to the constitutional mandate, particularly with regard to the bulk of rural villages. 
Finally, in 1992, an amendment to the Constitution was adopted which explicitly recognized a three-tier local government structure as a third government level (below the national and State levels) and stipulated that in every State ‘at the village, intermediate and district levels…panchayats’ shall be
constituted and elected ‘from territorial constituencies in the panchayat area’ (Vajpeyi and Arnold, 2003b: 39). Thus, the institutionalization of local government and local democracy has made a significant step forward. However, the constitutional mandate to implement administrative and fiscal decentralization has still not been applied to the same extent in all (federal) States (Nickson et al., 2008: 57). Besides, as has been critically observed, the ‘panchayat system is often unable to function effectively due to the embedded nature of bureaucracy, the low level of political consciousness and the feudal and patriarchal structures of the society’ (Tremblay, 2003: 54), in other words, due, not least, to the still unfinished business of ‘de-colonialization’. In Indonesia (with 260 million people it is another major Asian country), the Regional Development Law of 1999 triggered a sweeping (‘big bang’) decentralization process which has significantly shifted resources and responsibilities from the central and provincial levels to the urban and rural municipalities (Nickson et al., 2008: 58; Kersting et al., 2009: 171; on the development of decentralization and local democracy in other areas of South Asia, see the country chapters in Vajpeyi, 2003b and Baldersheim and Wollmann, 2006b: 117). In the (some) 50 countries of Sub-Saharan Africa that, with the end of colonial rule, became independent states during the 1960s and 1970s, decentralization reforms have been attempted since the 1980s.
They, too, showed marked differences depending on their (French or British) colonial past: Francophone countries, such as Mali and Burkina Faso, tended to treat decentralization as a technique for managing a centralized unitary state, while in Anglophone countries, such as South Africa and Nigeria, federal structures have been adopted and decentralization has been associated with the constitutional recognition of the local government level (Letaief et al., 2008: 24; Kersting et al., 2009: 130). In the majority of African countries, decentralization was imposed from the
top down, making it more a tool used by the central government to control the territory and population. A crucial problem of making local institutions work was partly to reconcile traditional (tribal) community structures (chiefdoms) with modern national institutions. Subsequently, despite constitutional and legislative provisions and safeguards, the autonomy of local government has remained restricted by a central government oversight of local government bodies and their actions (Letaief et al., 2008: 46). The latter obstacle has been compounded by the spread of the one-party state in many African countries (Baldersheim and Wollmann, 2006b: 116). Throughout the African countries, decentralization has often been fraught with political instability and, most notably, with ethnic and tribal conflicts. In the meantime, the process of decentralization and local-level government reforms has often been stalled and even reversed (Kersting et al., 2009: 128). However, noteworthy progress in decentral and democratic institution-building has been made in South Africa (with some 55 million inhabitants, it is the second largest Sub-Saharan African country). Following the end of the Apartheid regime, the Constitution adopted in 1997 has installed, within a three-tier federal system (central, provincial and local), a local government level (Cameron, 2007: 316; Kersting et al., 2009: 139). Consequently, local government has undergone a significant transition from the previous apartheid setting towards a more democratic system (Cameron, 2007: 322). Local elections that were peacefully held in 1994 and 2000 signalled a viable future development (for a cautioning assessment, see Baldersheim and Wollmann, 2006b: 117).
Territoriality
Seeking and defining the (optimal) territorial size (scale) of local government has been
marked in many countries by the choice and balancing (‘trade-off’) between enhancing the operational efficiency of local government and buttressing local democracy (Baldersheim and Rose, 2010b: 8). In European countries, the local-level territorial structure was historically characterized by predominantly small-sized municipalities. In the more recent territorial development, two country groups can be discerned (De Ceuninck et al., 2019; Kuhlmann and Wollmann, 2019: 185). On the one side, in the UK, Sweden and in some German federal regions (Länder) during the 1960s and 1970s, a first wave of territorial reforms got under way that aimed to territorially and demographically enlarge (‘upscale’) the municipalities by way of consolidation (amalgamation) and, if need be, through ‘coercive’ legislation against local opposition (see the country reports in Baldersheim and Rose, 2010a). This reform drive was premised on the then dominant ‘rationalistic’ assumption that local-level territorial consolidation could enhance the economic and operational efficiency of the local authorities and their capacity to cope with the ever growing scope of tasks that were transferred to them by the expanding national welfare state. Thus, there was a close conceptual and operational tie between local territorial reforms and decentralization/‘functional’ reform (see above in the section ‘Decentralization’; Marcou and Wollmann, 2008: 136). The UK went furthest, in 1974, in this strategy of large-scale amalgamation (often identified as the ‘North European pattern’; Norton, 1994: 40) by reducing some 1,300 lower tier local (district and borough) governments to 369, resulting in the unparalleled average of 129,000 inhabitants (Wilson and Game, 2011: 78; Kuhlmann and Wollmann, 2019: 187).
In Sweden, too, in 1974 the number of municipalities (kommuner) was cut back from 2,282 (averaging some 2,800 inhabitants) to 290 (averaging some 31,000) (Kuhlmann and Wollmann, 2019: 189). Since the 1990s, in
some countries a second wave of large-scale territorial ‘upscaling’ has taken place – with Denmark’s territorial reform of 2007 bringing about a veritable ‘revolution in local government’ (Mouritzen, 2010: 21) by cutting the number of municipalities from 275 to 98 with an average of 55,480 inhabitants (now the second largest average among European countries). On the other side, in some European countries, exemplified by France, Italy and Switzerland, local-level territorial consolidation has not been undertaken, in what has been identified as the ‘South European pattern’ (Norton, 1994: 40). In these countries, the politically and culturally rooted principle of ‘voluntariness’ has prevailed, according to which the boundaries of municipalities can be redrawn only with local consent. An example is France, where the boundaries of some 36,000 municipalities, averaging some 1,500 inhabitants, have remained largely unchanged since the Revolution of 1789 or even earlier (Kuhlmann and Wollmann, 2019: 191). Another striking case is Switzerland, which has very small municipalities (half of them having fewer than 840 inhabitants) whose boundaries have remained unchanged over the last 150 years (Kübler and Ladner, 2003: 140). In CEE countries, after 1990, the fragmented structure of small-size municipalities inherited from the communist era has been largely retained, often due to the political motive of not impairing the newly restored small-scale local democratic arenas. In addition, the number of small-scale municipalities has even increased as local communities were given the right and opportunity to undo territorial amalgamations imposed under the previous communist regime (see country reports in Swianiewicz, 2010). For instance, in Hungary the number of municipalities jumped from some 1,600 before 1990 to 3,170 after (Kuhlmann and Wollmann, 2019: 198; on the Czech Republic see ibid.: 199).
Depending on whether, or on which scale, local-level territorial reforms have been carried out, the average population size varies greatly, ranging between 139,000 (UK) and
1,700 (France) (for 2008 data covering 30 European countries on the number, average population size and percentage of municipalities with fewer than 5,000 inhabitants, see Baldersheim and Rose, 2010b: 3, table 1.1). While the purpose and ‘logic’ of (large-scale) amalgamation can be seen as an attempt to ‘internalize’ the coordination of multiple functions and actors (intra-municipally) within the (enlarged) jurisdiction of local government (Wollmann, 2010), in countries without amalgamation an alternative strategy (and ‘logic’) to achieve such coordination of functions and actors can be seen in the creation of institutional forms of intermunicipal cooperation. In the European context, France is, in view of the historically small-sized pattern of its municipalities, unsurprisingly an example of such an alternative strategy. Dating back to the 1890s, intermunicipal bodies (établissements publics de coopération intercommunale (EPCI)) have been established which are designed, be it as single-purpose or as multiple-purpose entities, to assist their member municipalities in the provision of services. They are directed by councils that are not directly elected but appointed by the member municipalities (for details see Borraz and Le Galès, 2005; Wollmann, 2010). The complex system and network of intermunicipal bodies (intercommunalité) was (in 2015) made up of a total of 2,133 EPCIs of different types that are endowed with taxing power (à fiscalité propre), comprising 36,588 member municipalities and 62.9 million inhabitants (for details and data, see Kuhlmann and Wollmann, 2019: 195, table 4.7). Likewise, in other countries which, in the absence of territorial reforms, are marked by a multitude of small-scale municipalities, similar intermunicipal bodies abound (Marcou and Wollmann, 2008: 138).
Still another variant of intermunicipal bodies has been embarked upon in France since 2018 with the creation of metropolitan entities (so-called métropoles) in and around France’s 21 major cities and urban
areas. Constituting the most integrated organizational form of intermunicipal cooperation yet, the métropoles vertically combine most functions of their (continuing to exist) member municipalities, as well as some functions of the territorially connected départements, and levy most of the municipal taxes. However, their decision-making councils are still indirectly elected by the councils of their member municipalities. The largest métropole is ‘Grand Paris’, in and around the capital city of Paris, with some 7 million inhabitants and 131 member municipalities (Kuhlmann and Wollmann, 2019: 193; for a similar development in Italy, with the formation of 14 metropolitan intermunicipal entities: città metropolitane, see ibid.: 170). The recent development of intermunicipal bodies, including metropolitan entities, hints at a trend towards some gradual territorial consolidation as the institutions of intermunicipal cooperation appear, in the absence of formal consolidation, to prepare the political and mental ground for fully fledged consolidation (for an overview and differentiated conclusions, see Baldersheim and Rose, 2010b). The Russian Federation, which resulted from the break-up of the Soviet Union in 1990, is still the territorially largest country in the world with some 17 million square kilometres (and counting some 144 million inhabitants). By 2005, the map of Russia’s two-tier local government system had been extensively redrawn by the regions (federal subjects). The number of municipalities almost doubled from 12,215 to 24,079 as formal local government was extended to all settlements of more than 1,000 inhabitants (Khabrieva et al., 2008: 104). At the same time, the municipal districts (munitsipal’nye raiony) were introduced as units of a new (upper) local government level whose decision-making bodies are made up of the mayors of the member municipalities.
While the multitude of settlements is included in the local government structure, it is organizationally ensured that they serve as key links in
the hierarchy of administration and its vertical power mechanism (Gel’man, 2008: 81), thus reinforcing the vertical and centralist (presidential) power grip. In the United States, the two-tier local government structure is made up of 3,043 counties as well as 19,372 municipalities and 16,629 townships and towns (Sancton, 2002: 186 et seq.; Sellers, 2008: 239). Despite the great number of often small municipalities, hardly any territorial consolidation by way of amalgamation has occurred over the years (Sellers, 2008: 239). This applies even to the larger cities and their mushrooming suburban hinterland. The reason for this plausibly lies essentially in a racial and ethnic divide between ‘white’ suburbs and ‘ethnically diverse’ central core cities, with either side politically blocking amalgamation (Sancton, 2002: 190). As an alternative strategy to provide local-level public services, typically ‘special purpose’ bodies (school districts and special purpose districts) have been created outside the (‘general purpose’) local authorities, often territorially transcending them. Comparable to the intermunicipal bodies frequent in European countries, they are meant to perform specific (primarily single purpose) functions (schools, public utilities, services etc.), sometimes possessing a local taxing power of their own (Sellers, 2008: 237). In some, the directing boards are chosen by separate direct elections; in others they are composed of representatives of member local authorities. Amounting to a total of 50,432 units (in 2007), they by far outnumber the some 36,000 (general purpose) local governments. With the rapid rise of special purpose districts from some 18,000 in 1962 to some 34,000 in 1997 (figures from Sancton, 2002: 186), they have gained growing importance in US local government besides, and also in place of, the ‘general purpose’ local authorities.
Moreover, in order to cope with the demographically and functionally mushrooming metropolitan areas, various strategies of intermunicipal cooperation have been embarked upon, among which different forms
of overarching metropolitan governments have come to loom large (Sellers, 2008: 239; Sellers and Hoffmann-Martinot, 2008: 259). Typically, such (sometimes still informal but also formally institutionalized) metropolitan arrangements, geared to specific functions, consist of multiple municipalities and counties mostly along with, and revolving around, the respective central city. For instance, the city region of San Francisco comprises 90 municipalities and 9 counties with a total of 6.78 million inhabitants, of whom 10.9% live in San Francisco as the central city proper.3 Reflecting the characteristic aversion to amalgamation, these processes of ‘metropolitization’ have gone without annexation or merger of municipalities. In Latin American countries, the local level is marked by a somewhat paradoxical territorial structure. On the one hand, it shows a high degree of territorial fragmentation, even called ‘atomization’ (Rosales and Carmona, 2008: 179), with around 90% of all municipalities in the region having fewer than 50,000 inhabitants (Nickson, 2019: 134). The countries have typically refrained from territorial consolidation (amalgamation) as redrawing local boundaries has been politically resisted on the basis of the culturally embedded understanding of local autonomy (Nickson, 2019; Kersting et al., 2009: 79). In addition, in some countries, the number of (small-sized) municipalities has even increased and, as in the case of Argentina, even doubled.4 On the other hand, a small number of municipalities of the region count among the largest cities in the world, such as Mexico City (with 26 million inhabitants) and Sao Paulo (with 24 million inhabitants).
While the city boundaries, notwithstanding the demographically exploding metropolitan areas, have hardly been redrawn (rescaled) (Nickson, 2019: 134), many forms of intermunicipal cooperation, such as mancomunidades, have emerged (ibid.: 140), particularly around central cities and adjacent metropolitan areas (Rosales and Carmona, 2008: 180). Remarkably, in Brazil, under the military regime, nine metropolitan
cities were created, such as the (Metropolitan) Municipality of Sao Paulo, which, with some 13 million inhabitants, is the largest South American city, while Sao Paulo’s metropolitan area comprises 22 million people. However, in only a few of the metropolitan areas in Latin American countries has a functioning metropolitan system of government been put in place that would allow them to manage their territory in an integrated manner (for details see Rosales and Carmona, 2008: 179; Kersting et al., 2009: 80). In the Asia-Pacific region there is a great variance in the number and population size of municipalities (for an overview and list of the territorial organization of local government in the Asia-Pacific region, see Nickson et al., 2008, tables 3 and 4). Japan stands out as having carried out large-scale local-level territorial reforms that, in two waves (1953 and 2001), reduced the number of municipalities from some 6,000 to 1,820 (Norton, 1994: 457; Nickson et al., 2008: 64). In South-Asian countries, there is often a mix of modern local government structures and traditional or customary village institutions (Nickson et al., 2008: 60, table 3). Due to rampant urbanization (the population dwelling in urban areas has multiplied by seven since 1950) the countries have experienced a rapid expansion of metropolitan areas (Sellers and Hoffmann-Martinot, 2008: 261). The top ten of the hundred largest metropolitan areas in the world are all located in Asia-Pacific countries; of these, Tokyo Metropolis (or the Greater Tokyo Area) is the most populous (with some 38 million inhabitants), consisting of Tokyo (with 9 million) and some 20 neighbouring cities. In most Sub-Saharan African countries hardly any amalgamations have been carried out at the local level (for an overview see Letaief et al., 2008, table 1).
However, following the end of the Apartheid regime in the early 1990s, in South Africa, a large-scale local-level amalgamation has been effected which, as an essential component of its decentralization programme of 1998 and as a crucial political goal, aimed at abolishing
the racially separated settlement structure and at creating racially mixed and integrated municipalities (Cameron, 2007; Kersting et al., 2009: 139; Baldersheim and Wollmann, 2006b: 116): the number of municipalities was drastically reduced from some 1,000 to 284, averaging 62,000 and 6,000 inhabitants (Kersting et al., 2009: 141). At the same time, six metropolitan municipalities were created by amalgamation, for instance, the City of Johannesburg Metropolitan Municipality (with some 960,000 inhabitants) as the core city of the adjacent metropolitan area (counting some 7.8 million people; Sellers and Hoffmann-Martinot, 2008: 271).
Local democracy
The theme of local democracy is taken up in the following section along two institutional tracks: the first addresses the (representative democratic as well as direct democratic) rights and procedures empowering local citizens to influence local political decision-making and the second focuses on determining local politico-administrative leadership. In Europe, in a tradition dating back to the 19th century, Switzerland stands out singularly in giving citizens at all (federal, cantonal and local) levels of government the (direct democratic) right to determine political decision-making (Kübler and Ladner, 2003). In 80% of Swiss municipalities, local political decision-making lies with (direct democratic town-meeting type) assemblies of citizens instead of elected local councils. Moreover, local referendums on a broad scope of issues (local taxes etc.) abound. On average, Swiss citizens are invited to vote in local, cantonal and federal referendums on up to 30 subjects every year (ibid.: 144; for a cautioning assessment, such as on low voter turnout and socio-economically ‘biased’ interest assertion, see ibid.: 144). By contrast, in the other European countries, historically, the principle of
representative democracy has prevailed, according to which the local citizens are entitled (and restricted) to elect the local councils as the supreme (parliament-type) local decision-making bodies. However, since the 1990s, in an increasing number of countries, as a complement to local representative democracy, local referendums – both consultative and binding – have been introduced. In some countries, such as in Germany, binding local referendums have gained increasing frequency and considerable impact on local decision-making, at times revoking decisions made by the elected council (Vetter et al., 2016: 277). Tellingly, even in countries in which the principle of representative democracy has been firmly entrenched in the country’s political tradition and culture, direct democratic procedures have been adopted. A striking case is the UK, where in 1998 regional referendums were held in Scotland and Wales, resulting in the establishment of regional parliamentary assemblies and ushering in a ‘quasi-federal’ regional autonomy (Wilson and Game, 2011: 98), and in 2014 a regional referendum was conducted in Scotland on its becoming independent (rejected by a thin majority). The most conspicuous example was the recent national referendum held on June 23, 2016 on whether the UK should leave (‘Brexit’) or remain in the EU (with a thin majority of 52% voting to leave). Similarly, on local-level matters, direct democratic procedures have advanced, for instance in the 2000 legislation giving local citizens the right to decide, by local referendum, to opt for the direct election of mayors (ibid.: 113). As for local leadership in Europe, historically, two schemes have evolved, both based on the representative democracy principle (see country reports in Berg and Rao, 2005; Reynaert et al., 2009; for comparative overviews, see Wollmann, 2009; Copus et al., 2016; Lidström et al., 2016).
In the ‘government by committee’ variant, which was in place in the UK and in Sweden, sector-specific committees of the elected local councils were
in charge of making the relevant decisions as well as directing and overseeing the respective administrative units. By contrast, in the ‘council mayor systems’ that have been practised throughout continental European countries, the elected councils are the prime decision-making bodies while, in a kind of local parliamentary system, a council-elected executive (mayor) carries out the council’s decisions alongside some responsibilities of his own. In reaction to growing criticism over democratic (accountability) and operative deficits, both systems have, since the 1990s, been reformed. In the ‘government by committees’ scheme, the politico-administrative leadership has been, grosso modo, shifted to a (parliamentary system-type) political council leader. By contrast, the traditional council mayor systems have been changed, visibly leaning on the US example, towards a kind of local presidential system by introducing the directly elected executive (‘strong’) mayor. Germany took the lead as, since the beginning of the 1990s, all Länder have introduced the directly elected (executive) mayor – in some Länder this process has been accompanied by recall procedures. Moreover, the direct election of the mayors has been adopted in other continental European countries (with the exception, for instance, of France) as well as, in the wake of post-communist transformation, in Central Eastern European countries (Copus et al., 2016; Marcou and Wollmann, 2008: 155). In the Russian Federation, after the collapse of the Communist regime, a first major step towards creating democratic local government and to ensuring its autonomy was set by the adoption of the law ‘on local government’ on July 6, 1991, in which, among others, the election of the local councils as well as the direct election of a mayor (‘head of administration’) was laid down (Wollmann, 2004b). 
However, following the abortive putsch by Communist Party hardliners in August 1991 and the ensuing, fierce power struggle, the already scheduled direct election of the mayors was postponed; instead
the mayors were appointed by President Yeltsin in a move to consolidate the (presidential) ‘vertical power’ grip. Later on, in 2005, the mayors were to be directly elected throughout the country’s municipalities; since then, and particularly once Putin became President in 2012, the position of the mayors as pivots of local democracy has been gradually undermined on two scores. First, borrowing from the US example, new legislation has introduced the position of a city manager who, as the newly appointed local chief executive, rivalled and eventually side-lined the elected mayor; in the meantime, the city manager scheme has been installed in most municipalities, with the city manager directly or indirectly appointed by the regional governor. Second, under new legislation, the direct election of the mayors can be abolished, and the sitting mayors can be dismissed by the regional governor (as an arm of the President) (Bucklay et al., 2014). As of 2018, in only seven out of 85 regional capital cities are the mayors still directly elected. It appears that the elected mayors are about to vanish, signalling the political and functional degradation local government has undergone in Putin’s era. In the United States, direct democratic local citizen rights have long been part and parcel of the country’s democratic tradition, making the United States, besides Switzerland, virtually the homeland of direct democracy. Since the 1820s, the direct election of mayors, and even judges, sheriffs and other local position-holders, has been introduced (Norton, 1994: 394). Currently, only three out of the 50 States do not have legal provisions on some type of local-level direct democratic procedures, including binding local referendums. Moreover, ‘home rule’ provisions have been adopted in some States that give the local councils the right to adopt direct democratic procedures on their own (Svara, 2005: 131). In determining local leadership, two main variants have historically been in place (Norton, 1994: 421).
First, the directly elected (executive, strong) mayor form, which is common in larger cities, and, second, the
(council-appointed) city manager system, which is being increasingly adopted (Sellers, 2008: 248). In the last decade, a gradual mix or ‘hybridization’ of both forms has taken shape with a trend towards ‘executive-centred governance’ (Savitch and Vogel, 2005: 213). Besides, recall procedures for local officials are legally provided for in about half of the States. When the Latin American countries, after having fallen under military and authoritarian rule between the 1960s and early 1980s, returned to democratic government during the mid-1980s, they moved to reinstall local democracy as well (Rosales and Carmona, 2008: 171). By 2008, all countries in the region, except Cuba, had free local-level elections (Kersting et al., 2009: 79). Influenced by the US example, the directly elected (executive, strong) mayor form made its entry in the Latin American countries, along with recall procedures in some of them. However, the political practice is often still marred by the centralist legacy of caudillismo (‘political bossism’) and a personalist political culture that tends to accentuate one-person leadership and to marginalize the role of the elected councillors (Nickson, 2019: 144). As for direct democratic citizen rights in Brazil, the Constitution of 1988 laid down various forms of direct popular participation besides regular voting, such as referendums and the right of citizens to propose new laws, including direct democratic and participatory local citizen rights. The local participatory budget process, which directly involves the local community in formulating the plans for municipal investments, was first applied between 1989 and 2004 in the city of Porto Alegre.
In attracting world-wide attention and recognition as a remarkable direct democratic innovation, participatory budgeting has since spread to other Latin American countries and beyond (Rosales and Carmona, 2008: 196; Kersting et al., 2009: 112; on the spill-over of participatory budgeting into European countries, see Kersting et al., 2016). However, by the 2010s, there has been a widening gap between the envisaged and
proclaimed citizen participation in local government and the sobering reality throughout the region (Nickson, 2019: 146). In the (some 50) Sub-Saharan African countries, from the 1970s onward, steps have been undertaken, prodded by international financial donors, to establish democratic local government as an important lever for economic development (Kersting et al., 2009: 128). In a positive assessment of the development, ‘local elections are being held with a regularity unprecedented in the history of Africa’ (Letaief et al., 2008: 44).5 However, with regards to the political reality, observers speak of a ‘systematic decline and death of local democracy’ (Kersting et al., 2009: 128) as ‘national as well as local, tribal and family loyalties and traditions impact on civic behaviour at the local level’ (ibid.: 150). Again, South Africa stands out as a positive example. The municipal legislation of 1998 laid down a parliamentary-type local government system made up of an elected council and a council-elected executive mayor. At the same time, it emphasized the direct democratic participation in local decision-making (for details, see Kersting et al., 2009: 153; Cameron, 2007: 318). The latter can be seen particularly in the provision, within each municipality, of ‘ward committees’, which are considered crucial in allowing different (social, ethnic, etc.) groups to participate, on a voluntary basis, in local decision-making (Kersting et al., 2009: 158; for a cautioning assessment, see Kersting et al., 2009: 159).
CONCLUSION
Brief Notes on the Development and State of Research on Local Politics and Government
The following remarks on the ‘state of the art’ of the research on local politics and government are, again, bound to be sketchy and ‘broad brush’ (for a more detailed and
referenced discussion, see the chapters in Baldersheim and Wollmann, 2006a, particularly the historical overview by Goldsmith, 2006 and the summarizing article by Baldersheim and Wollmann, 2006b). From the 1940s and well into the 1960s, political science research on local politics and government was primarily led by US scholars teaching and working in numerous academic institutions and benefiting from the abundant research funding that was then available in the United States. Since the 1970s, the relevant research networks and topics have both internationally and thematically expanded. In 1970, the European Consortium for Political Research (ECPR) was formed and has come to encompass more than 350 institutions throughout Europe, with associate members across the world and its Standing Group on Local Government and Politics (Hoffmann-Martinot, 2006: 92). In 1972, the Research Committee on the Comparative Study of Local Government and Politics, first chaired by Franco Kjellberg, was formed within the International Political Science Association (IPSA) (the latter having been founded in 1950). Similarly, in 1975, the Research Commission on Urban and Regional Development, first chaired by Terry Clark, was established within the International Sociological Association (founded in 1952). Thus, by widening the international outreach and composition of researchers, the ‘Anglo-Saxon-centricity’ that marked the earlier research discourse and agenda has been significantly recalibrated and shifted. The annual conferences and workshops of the ECPR and its Standing Groups, and the ensuing publications, have greatly contributed to the European research community developing a profile and self-confidence of its own (Goldsmith, 2006: 15). 
In addition, in the wake of the collapse of the communist regimes, an albeit relatively short-lived stream of publications emerged that focused on the subnational/local political and administrative transformation in Central-Eastern European countries and in the Russian Federation.
The research on subnational/local politics and government in European countries has recently been significantly promoted by an EU-funded international research consortium (within the EU’s COST programme) that focused on local public sector reforms in Europe. It has generated a ‘wave’ of publications, among others, on ‘Local Public Sector Reforms in Times of Crisis’ (Kuhlmann and Bouckaert, 2016) and on ‘Public and Social Services in Europe. From Public and Municipal to Private Sector Provision’ (Wollmann et al., 2016). Thus, while the focus and arena of the ongoing political science debate and research on local politics and government have become somewhat ‘Europeanized’, research and publications in and on the ‘developing countries’ appear to have remained quite scarce. An important impulse to close this gap and to open up a global perspective, particularly on the ‘developing’ countries, has recently come from the United Cities and Local Governments (UCLG), the world-wide umbrella organization for cities, local and regional governments, and municipal associations. The UCLG has initiated, organized and edited a series of ‘global reports’ that are researched and authored by scholars and experts both from and on these ‘global’ regions. Tellingly, the UCLG’s first ‘global report’ was on ‘decentralization and democratization’ (GOLD I, see UCLG, 2008). Its ‘regional chapters’ make for a valuable research and information source, which has accordingly been amply drawn on and quoted in the writing of this article. In the meantime, other equally important and substantive reports have been published by UCLG, on ‘local government finance’ (GOLD II), ‘basic services’ (GOLD III) and ‘co-creating the urban future’ (GOLD IV) (UCLG, 2010, 2014, 2017 respectively).
Convergence or Divergence in a Global Perspective?
In conclusion, the question shall be addressed as to whether, to what degree and why the
institutional development highlighted in the preceding account shows convergence or divergence in a global perspective. Although varying in phase, dynamics and contents, world-wide, there have been convergent general (‘macro’) trends since 1945 towards decentralization and a strengthening of local government and local democracy. The trends have been triggered and carried forward by politically, historically and contextually different factors and events: as for the ‘developing’ countries, in Latin American countries (during the 1980s), following the end of military rule; in Central Eastern European countries and Russia (after 1990), on the heels of the collapse of communism; and in South Asian (after 1945) and Sub-Saharan African countries (from the 1970s onwards), following the end of colonial rule. In other, ‘developed’ countries, decentralization and the strengthening of local government gained traction with the post-war expansion of the national welfare state and the related transfer of tasks to the local authorities, while further democratization has, since the 1970s, been promoted by the rising demands for enhancing (local) citizens’ rights to both influence policymaking and hold it accountable. However, while falling in line with these ‘macro trends’, the countries have shown significant variance in the course and outcomes of these developments due to different political, historical and cultural givens and forces. In South Asian and Sub-Saharan African countries, decentralization and local-democracy-targeted reforms have, after promising start-ups, been slowed down or stalled by persisting institutional and mental legacies of their colonial past, by continuing tribal, ethnic and religious conflicts, by their often dismal socio-economic and financial conditions and, last but not least, by corrupted political elites (Baldersheim and Wollmann, 2006b: 121).
In post-communist countries, such as Russia and Hungary, the decentralization and democratization process has been reversed by the return to quasi-authoritarian and centralist government under Putin and Orbán.
The rampant urbanization and massive rural-urban in-migration that transcend the existing territorial structure of local government have become a crucial world-wide challenge. This applies particularly to Asian, African as well as Latin American countries, where, along with rampant population growth, urbanization and agglomeration in and around mushrooming metropolitan areas have continued (Sellers and Hoffmann-Martinot, 2008). Hence, in the global perspective, the number and size of the metropolitan regions have experienced explosive growth. While by 1975 there were five metropolitan regions with more than 10 million inhabitants (nota bene: three of them in developing countries), by 2000 the number of metropolitan areas of this mega-size had jumped to 16 (notably, twelve of them in developing regions) (Sellers and Hoffmann-Martinot, 2008: 259). At the same time, world-wide, the number of middle- and smaller-sized metropolitan regions with big and larger municipalities as core cities in their midst has been increasing as well. One of the strategies to cope with the challenges of the urbanization process has been to territorially and functionally consolidate (amalgamate, fuse) the local authorities. This strategy has been embarked upon only in a relatively small number of countries and cases (in some European countries during the 1970s, in Japan in 1999 and in South Africa in 1999). Instead, different forms of intermunicipal cooperation and coordination have been sought and experimented with. These various modalities range from informal voluntary and flexible agreements among municipalities to the creation, by way of amalgamating the municipalities concerned, of a new territorially and functionally integrated metropolitan city (Sellers and Hoffmann-Martinot, 2008: 272). However, in any case, the ensuing formation mostly covers only part of the actual extension of the surrounding metropolitan region.
Internationally, in the face of a world-wide, ‘globalized’ competition of transnational
corporations within both national and local markets, these ‘mega-cities’, which because of their economic and financial potential and global outreach are also identified as ‘global cities’ (Sassen, 1991), are eager to attract international companies to invest in and relocate to them. In turn, ‘globalized’ companies experience a ‘dialectical process (in that) global economic activities become more dispersed but also more embedded in particular territorial settings’ (Clarke, 2006: 39) – i.e. ‘localised’ in certain urban sites. This dialectical logic, which both ‘global’ cities and globally operating companies have in common, has been dubbed ‘glocalization’ (linguistically fusing global and local). In other words, global (politics) becomes local and vice versa.
Notes
1 To view data for 2017, see https://data.worldbank.org/indicator/NY.GDP.PCAP.CD.
2 Nota bene: not counting the high-income OECD countries such as Japan, Australia and New Zealand.
3 For a list of the 20 US city regions with over 2 million inhabitants, see Sancton, 2002: 186.
4 See data on all Latin American countries in Nickson, 2019: 134, table 10.1 and Kersting et al., 2009: 3.2.
5 For an overview of voting systems in municipal councils and of appointment modes of local executives in 15 Sub-Saharan countries, see Letaief et al., 2008: 45, table 5.
References
Baldersheim, H., Illner, M. and Wollmann, H. (eds) 2003, Local Democracy in Post-Communist Europe, Leske+Budrich: Opladen. Baldersheim, H. and Rose, L. (eds) 2010a, Territorial Choice. The Politics of Boundaries and Borders, Palgrave Macmillan. Baldersheim, H. and Rose, L. 2010b, Territorial Choice. Rescaling Governance in European States, in: Baldersheim, H. and Rose, L. (eds), Territorial Choice. The Politics of Boundaries and Borders, Palgrave Macmillan, pp. 1–20.
Baldersheim, H. and Wollmann, H. (eds) 2006a, The Comparative Study of Local Government and Politics. Overview and Synthesis, Barbara Budrich Publishers. Baldersheim, H. and Wollmann, H. 2006b, Assessment of the Field of Comparative Local Government and a Future Research Agenda, in: Baldersheim, H. and Wollmann, H. (eds), The Comparative Study of Local Government and Politics. Overview and Synthesis, Barbara Budrich Publishers, pp. 109–132. Berg, R. and Rao, N. (eds) 2005, Transforming Local Political Leadership, Palgrave. Borraz, O. and LeGalès, P. 2005, France: the intermunicipal revolution, in: Denters, B. and Rose, L.E. (eds), Comparing Local Governance, Basingstoke: Palgrave Macmillan, pp. 12–38. Buckley, N., Garifullina, G., Reuter, O.J. and Shubenkova, A. 2014, Elections, Appointments, and Human Capital. The Case of Russian Mayors, Demokratizatsiya, 22(1), 87–117. Cameron, R. 2007, South African Local Government, in: Lazin, F., Evans, M., Hoffmann-Martinot, V. and Wollmann, H. (eds), Local Government Reforms in Countries in Transition, Rowman & Littlefield, pp. 311–327. Clarke, S. 2006, Globalization and the Study of Local Politics: Is the Study of Local Politics Meaningful in a Global Age? in: Baldersheim, H. and Wollmann, H. (eds), The Comparative Study of Local Government and Politics. Overview and Synthesis, Barbara Budrich Publishers, pp. 33–66. Copus, C., Iglesias, A., Hacek, M., Illner, M. and Lidström, A. 2016, Have Mayor Will Travel. Trends and Developments in the Direct Election of the Mayor. A Five-Nations Study, in: Kuhlmann, S. and Bouckaert, G. (eds), Local Public Sector Reforms in Times of Crisis, Palgrave Macmillan, pp. 301–315. De Ceuninck, K., Valcke, T. and Verhelst, R. 2019, Local Government Outside Local Boundaries. Re-scaling Municipalities, Redesigning Provinces and Europeanisation, in: Kerley, R., Liddle, J. and Dunning, P. (eds), The Routledge Handbook of International Local Government, London: Routledge, pp. 377–393. 
Derlien, H.U.
1992, Observations on the State of Comparative Administration Research in Europe: Rather Comparable than Comparative, Governance, 5(3), 279–311.
Gel’man, V. 2008, The Politics of Local Government Reform in Russia. From Decentralization to Recentralization, in: Lazin, F., Evans, M., Hoffmann-Martinot, V. and Wollmann, H. (eds), Local Government Reforms in Countries in Transition, Rowman & Littlefield, pp. 71–90. Goldsmith, M. 2006, From Community to Power and Back Again?, in: Baldersheim, H. and Wollmann, H. (eds), 2006, The Comparative Study of Local Government and Politics. Overview and Synthesis, Barbara Budrich Publishers, pp. 11–34. Guderjan, M. 2019, Local Government in the European’s Multilevel Polity, in: Kerley, R., Liddle, J. and Dunning, P. (eds), 2019, The Routledge Handbook of International Local Government, London: Routledge, pp. 394–404. Hoffmann-Martinot, V. 2006, The Infrastructure of Research and Academic Education, in: Baldersheim, H. and Wollmann, H. (eds), The Comparative Study of Local Government and Politics. Overview and Synthesis, Barbara Budrich Publishers, pp. 83–108. Kersting, N., Caulfield, J., Nickson, R.A., Olowu, D. and Wollmann, H. (eds) 2009, Local Governance Reform in Global Perspective, Wiesbaden: VS Verlag. Kersting, N., Gasparikova, J., Iglesias, A. and Krenjova, J. 2016, Local Democratic Renewal by Deliberative Participatory Instruments: Participatory Budgeting in Comparative Study, in: Kuhlmann, S. and Bouckaert, G. (eds), Local Public Sector Reforms in Times of Crisis, Palgrave Macmillan, pp. 317–332. Khabrieva, T.Y., Andrichenko, L.V. and Vasiliev, V.A. 2008, Eurasia, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 96–111, Barcelona, e-book. Kübler, D. and Ladner, A. 2003, Local Government Reform in Switzerland, in: Kersting, N. and Vetter, A. (eds), Reforming Local Government in Europe, Wiesbaden, pp. 137–155. Kuhlmann, S. and G. Bouckaert (eds.) 2016, Local Public Sector Reforms in Times of Crisis, London: Palgrave Macmillan. Kuhlmann, S. and Wollmann, H. 
2019, Introduction to Comparative Public Administration, 2nd edition, Edward Elgar.
Letaief, M.B., Mback, C., Mbassi, J.P. and Ndiaye, B. 2008, Africa, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 15–49, Barcelona, e-book. Lidström, A., Baldersheim, H., Copus, C., Hlynsdottiv, E.M., Kettunen, P. and Klimanovsky, D. 2016, Reforming Local Councils and the Role of Councillors. A Comparative Analysis of Fifteen European Countries, in: Kuhlmann, S. and Bouckaert, G. (eds), Local Public Sector Reforms in Times of Crisis, Palgrave Macmillan, pp. 287–300. Lijphart, A. 1975, The Comparable-Cases Strategy in Comparative Research, Comparative Political Studies, 8(3), 158–177. Marcou, G. and Wollmann, H. 2008, Europe, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 130–167, Barcelona, e-book. Mouritzen, P.D. 2010, The Danish Revolution in Local Government. How and Why, in: Baldersheim, H. and Rose, L. (eds), Territorial Choice. The Politics of Boundaries and Borders, Palgrave Macmillan, pp. 21–41. Nickson, A. 2019, Local Government in Latin America, in: Kerley, R., Liddle, J. and Dunning, P. (eds), The Routledge Handbook of International Local Government, pp. 131–146. Nickson, A., Devas, N., Brillantes, A., Dabo, W. and Calestino, A. 2008, Asia-Pacific, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 52–92, Barcelona, e-book. Norton, A. 1994, International Handbook of Local and Regional Government, Aldershot, UK: Edward Elgar. Przeworski, A. and Teune, H. 1970, The Logic of Comparative Social Inquiry, New York: John Wiley. Reynaert, H., Steyvers, K., Delwit, P. and Piler, J. (eds) 2009, Local Political Leadership in Europe, Baden-Baden: Nomos. Rosales, M. and Carmona, S.V. 2008, Latin America, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 170–203, Barcelona, e-book. Sancton, A. 2002, Local Government in North America. Localism and Community Governance, in: Caulfield, J. and Larsen, H.
(eds), Local Government at the Millennium, Opladen, pp 186–201. Sassen, S. 1991, The Global City, Princeton University Press.
Savitch, H.V. and Vogel, R.K. 2005, The United States: Executive-Centred Politics, in: Denters, B. and Rose, L. (eds), Comparing Local Governance Trends and Developments, Palgrave, pp. 211–227. Sellers, J. 2008, North America, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 237–255, Barcelona, e-book http://www.citiesalliance.org/node/422. Sellers, J. and Hoffmann-Martinot, V. 2008, Metropolitan Governance, in: UCLG 2008 (ed.), Decentralization and local democracy in the world (GOLD I), pp. 259–293, Barcelona, e-book http://www.citiesalliance.org/node/422. Svara, J.H. 2005, Institutional Form and Political Leadership in American City Government, in: Berg, R. and Rao, N. (eds), 2005, Transforming Local Political Leadership, Palgrave, pp. 131–149. Swianiewicz, P. (ed.) 2010, Territorial Consolidation Reforms in Europe, Budapest: Open Society Institute. Tremblay, R.C. 2003, Recent Developments and Debates in Local Governments in India, in: Vajpeyi, D. (ed.), Local Democracy and Politics in South Asia, Opladen: Leske + Budrich, pp. 47–67. UCLG (ed.) 2008, Decentralization and local democracy in the world (GOLD I), Barcelona, e-book http://www.citiesalliance.org/node/422. UCLG (ed.) 2010, Local Government Finance: The Challenges of the 21st Century (GOLD II), Barcelona. UCLG (ed.) 2014, Basic services for all in an urbanizing world (GOLD III), Routledge. UCLG (ed.) 2017, Co-creating the urban future: the agenda of metropolises, cities and territories (GOLD IV), Barcelona http://habitat3.org/wp-content/uploads/4th-Global-Reporton-Local-Democracy-and.pdf. Vajpeyi, D. and Arnold, J. 2003, Evolution of Self-Government in India, in: Vajpeyi, D. (ed.), Local Democracy and Politics in South Asia, Opladen: Leske + Budrich, pp. 33–46. Vajpeyi, D. 2003a, Introduction, in: Vajpeyi, D. (ed.), Local Democracy and Politics in South Asia, Opladen: Leske + Budrich, pp. 9–22. Vajpeyi, D.
(ed.), 2003b, Local Democracy and Politics in South Asia, Opladen: Leske + Budrich. Vetter, A., Klimanovsky, D., Denters, B. and Kersting, N. 2016, Getting Citizens More to
Say in Local Government. Comparative Analyses of Change Across Europe in Times of Crisis, in: Kuhlmann, S. and Bouckaert, G. (eds), Local Public Sector Reforms in Times of Crisis, Palgrave Macmillan, pp. 273–286. Wilson, D. and C. Game 2011, Local Government in the United Kingdom, 5th edition, Basingstoke: Palgrave Macmillan. Wollmann, H. 2004a, Local Government Reforms in Great Britain, Sweden, Germany and France. Between Multi-Function and Single-Purpose Organizations, Local Government Studies, 30(4), 639–663. Wollmann, H. 2004b. Institution Building of Local Self-Government in Russia. Between Legal Design and Power Politics, in Evans, A.B. and
Gel’man, V. (eds), The Politics of Local Government in Russia, Rowman, pp. 104–127. Wollmann, H. 2009, The Ascent of the Directly Elected Mayor in European Local Government, in: Reynaert, H., Steyvers, K., Delwit, P. and Piler, J. (eds), Local Political Leadership in Europe, Baden-Baden: Nomos. Wollmann, H. 2010, Comparing Two Logics of Interlocal Cooperation. The Cases of France and Germany, Urban Affairs Review, 46(2), 263–292. Wollmann, H., Kopric, I. and Marcou, G. (eds) 2016, Public and Social Services in Europe, From Public and Municipal to Private Sector Provision, Palgrave Macmillan.
63 Policies beyond the State Eva G. Heidbreder and Daniel Schade
Definition
Policies beyond the state are the product of steering the public realm in contexts other than a territorially delineated, hierarchically ordered state. Even though policies have never been exclusively contained within the modern state, the explicit concern with policy-making beyond the state is relatively recent. Logically, it follows from the shift to governance perspectives in International Relations (IR) and Public Policy research. In IR, debates about globalisation and the independent impact of international organisations have widened the perspective to questions of global governance short of a hierarchical ordering system on the global scale (Rosenau and Czempiel, 1992). In Public Policy research, the turn to governance widened the notion of steering as a hierarchical and state-led process to an understanding of a more permeable state whose policy-making includes non-state actors of various kinds (Rhodes, 1996). The theoretical introduction of non-hierarchical modes of
governance, both in the international and within-state context, entails the logical consequence that policy-making is no longer limited to the state represented by elected and administrative actors. At the same time, hierarchical, state-limited policy-making continues to exist. Governance by no means replaces such traditional forms of steering. Therefore, the traditional understanding of policy-making needs to be complemented by the notion of policies beyond the state as the product of policy-making by non-state actors and with an effect across borders or fully detached from state structures. Accordingly, we introduce the following definition: Policies beyond the state are the product of policy-making processes that are conducted in networks spanning beyond or detached from state-limited territories and jurisdiction, that are prone to involve non-state actors and have a significant impact beyond territorially defined states. Like the notion of governance (for a conceptual overview: Kjær, 2004), this definition
remains somewhat blurry and needs specification to capture its theoretical core. First, and most relevantly, policy-making actors perform in specific network formations that can take varying forms (Marsh and Rhodes, 1992, first introduce the term policy networks). Policies can accordingly be produced by a wide range of interaction modes that do not exclude hierarchical steering but rely mostly on alternative modes of policy-making. The notion of varying modes of governance (Benz, 2007) captures these diverse forms of producing policies. Second, the decisive networks are not limited to state actors. On the contrary, they are prone to include or be fully made up of non-state actors such as civil society groups, non-governmental organisations (NGOs), companies, experts and epistemic communities, but also supranational or international bureaus and their officials. What makes the outcome of policy-making involving such actors a policy spanning beyond the state is both their status as non-state actors and their linkages and interactions beyond the territory and jurisdiction of the state. In IR, this has sparked a wide literature on the question of whether international organisations matter and, if so, how (Keohane and Nye, 1974; Barnett and Finnemore, 1999), as well as questions regarding the global impact of non-state networks such as social movements (Mattli and Büthe, 2005; Hafner-Burton and Montgomery, 2006). Inside the state, there is equally the recognition that public policies are not exclusively shaped by state officials or elected decision-makers (Marsh and Rhodes, 1992; Rhodes, 1996), in part linked to the normative claim that governance should include civil society and private actors (Bennett, 2005; Keck and Sikkink, 2014). Exemplary research is offered by studies on the highly institutionalised policy-making context of the European Union (EU), which offers a set governance framework for policy-making.
The EU is in fact a governance system beyond the state that allows us to study the influence of non-hierarchical steering (Héritier
and Lehmkuhl, 2008), lobbying (Quittkat and Kotzian, 2011) and civil society involvement (Kohler-Koch, 2013b; Heidbreder, 2012). Finally, the outputs and outcomes of such de-territorialised and non-hierarchical policy-making span beyond the boundaries of the single (unitary) state. Again, this feature is not entirely new. Strictly speaking, all foreign policy-making meets this criterion. Yet, in combination with networked actor interaction and the openness to non-state actors, the notion of policies beyond the state captures precisely policies that are not explicitly directed at external actors but inevitably produce externalities that go beyond a clearly defined territory that coincides with a delineated political system and jurisdiction. In tandem, this produces new venues ranging from the global to the regional and local levels with varying degrees of institutionalisation. Arguably the most researched alternative venue that meets this definition is the EU, inside which policy-making beyond the state has been strongly institutionalised on the supranational level as well as through interstate cooperation, offering policy-dependent levels and scopes of integration (Zürn, 2003; Archibugi et al., 2011). In sum, following the observation that the ‘political boundaries that have structured many examinations of policy-making are less definitive than at any time since the mid-twentieth century’ (Perl, 2013: 45), policies beyond the state become a main concern in order to understand, explain and shape present policy-making. Notably, due to the specific features of policy-making beyond the state, we are also faced with significant new normative concerns about democratic legitimacy due to the new logic of (self-)delegation and control that such policy-making entails. To unfold the core of the notion of policies beyond the state, the next section first offers an overview of explanatory approaches in IR and policy research to ask how these may apply to policies beyond the state.
In addition, we contrast the de-territorialised notion of policies with the standard theoretical
heuristic of the policy cycle to highlight the parallels and differences to traditional policy-making. To render these theoretical elaborations more accessible, the subsequent section provides an exemplary selection of empirical research. Due to its bottom-up emergence on the global scene and its truly global reach, climate policy has attracted special attention in academia. Hence, this policy serves to illustrate how multiple research questions and approaches are applied in studying policies beyond the state. Finally, we offer a short outlook based on ongoing debates and remaining gaps in the research of policies beyond the state.
Explaining and conceptualising policy-making beyond the state
Policies beyond the state relate strongly to the sub-disciplines of IR and Public Policy, and theories originating in both fields can contribute to explaining the origin and functioning of policies beyond the state. Accordingly, underlying research questions and causal explanations vary widely. As pointed out above, a main concern of IR literature has been the very question of whether political organisation and administration beyond the state exist in their own right and with their own independent actorness, or whether they are a mere expression of (unitary) states’ negotiated wills. Accepting that international organisations matter raises further questions about how they can influence and steer policies up to the critical step of effective implementation and enforcement. These questions have been tackled with various kinds of IR approaches. This variance in approaches reflects the plurality of explanatory theories that range from rationalism, which views IR as interactions between mostly unitary states, to constructivist and transnationalist theories that start from the assumption of interdependencies. Due to the core question of whether international organisations ‘matter’ at all as independent actors,
new institutionalist approaches have been a key analytical framework (Hall and Taylor, 1996). Rather than being a theory itself, institutionalism can be applied with a rationalist, sociological or historical explanatory core, which makes the approach compatible with the dominant realist and constructivist schools in IR. Similarly, the notions of a logic of appropriateness – linked to rather sociological explanations – and a logic of consequences – linked to rational explanations – have been applied in Policy Analysis (for the distinction, see March and Olsen, 2004). Reflecting the plurality of these different theories, processes and outcomes of international, global or transboundary policy-making cannot be explained by one approach. Yet, certain approaches have been particularly useful in capturing processes beyond borders because they are suited to capture actor interactions and the production of policy outcomes short of the ordering structures of the state. Amongst the more frequently used approaches are liberal theories that underline global interdependencies, constructivist theories that uncover global communities and values, sociological and rational institutionalist approaches that offer different explanations for the impact of institutionalisation, and rational approaches that have brought forth rational choice and principal-agent approaches to explain actor interactions as a matter of strategic calculation by rent-seeking actors. This non-exhaustive list could be extended by critical theories – such as feminist approaches – and their respective focus on underlying power relationships. In essence, reviewing the variety of theoretical lenses that inform our understanding of policies beyond the state underpins that the subject area is in fact already a significant issue across different scientific agendas rather than a free-standing research field in its own right.
While explanatory theories remain largely committed to the respective theoretical sources instead of adding to a genuine ‘theory of the policy beyond the state’, heuristics that serve the systematic description of
policy-making have been more consistently adapted. However, expanding the definition of policies beyond the state raises the question as to how far established conceptualisations and theories of policy-making still serve their descriptive and explanatory purpose. The most prominent heuristic in policy analysis is the policy cycle or policy stages (Howlett et al., 2009). It sub-divides policy-making into stages – mostly problem definition, policy-formulation, decision-making, policy implementation and evaluation. Each one of these stages bears its respective functioning logic and regularities, although their empirical appearance may diverge from the theoretical stages. Even though this heuristic has been criticised for its limited applicability to actual overlapping policy processes, it still remains a useful and widely applied ideal-typical device to depict and analytically disentangle policy-making. In addition, the separate stages have been analysed in their own right with research agendas that provide insights about the respective functioning logics. Accordingly, policy research has developed different streams that focus exclusively on separate stages, such as decision-making or implementation. There is no overarching translation of these stages to policies beyond the state. Obviously, such a direct transfer is hardly possible because the single stages must play out differently according to the mixed actor constellations, modes of governance and venues at stake in a particular process. To illustrate how the stages play out differently if we remove the framework of the delineated state-territory and closed jurisdiction, we take policy-making in the policy cycle of the EU as an example (cf. Heidbreder and Brandsma, 2017).
As the EU has highly institutionalised supranational policy-making structures, it is a useful illustration for policy-making beyond the state that exposes regularities that can be assumed to hold equally for policy-making in other highly institutionalised regional organisations or trans-border networks. Drawing on the parallels between
traditional policy-making and policy-making beyond the state, it is assumed that the inner logic of policy-making does not change. Independent of actor constellations, action modes and venues, the cycle heuristic therefore remains useful. Moreover, certain critical research questions about policy-making remain the same: how is a political problem defined and decided upon; how can a policy that reacts to it be implemented and evaluated? The way these questions are treated reveals the specific differences in the application of the heuristic beyond the state. This can be exemplified by the selected research contributions from EU research and beyond that tackle specific policy-related questions. Thus, the agenda-setting (Princen, 2009), policy-formulation (Hartlapp et al., 2013) and decision-making (Best, 2016; Tsebelis, 2002) stages have attracted particular research interest. In all stages, analyses pinpoint the implications of the more permeable governance structures that lead to less fixed actor networks and the inclusion of non-state actors. Veto-player analyses for decision-making have been particularly prominent because of the extended circle of decision-makers that render decision-making more complex and reduce the capability of actors to change decisions once they have been institutionalised (Scharpf, 2006). At the same time, the explicit lack of hierarchical steering authority has evoked extensive work on the EU’s use of so-called new modes of governance (Eberlein and Kerwer, 2004) that are based on mechanisms such as competition, bargaining, incentive setting and persuasion (Treib et al., 2005) – or combinations thereof (Benz, 2009). Notably, while scholars have extensively studied the EU’s agenda-setting and decision-making stages, these concepts also hold for other non-state contexts. 
A remarkable example of an international scientific research network is the Comparative Agendas Project, which collects agenda-setting data in five countries and in the EU (see: https://www.comparativeagendas.net). Likewise, veto-player and new modes of governance
perspectives have not remained limited to the EU, but the shift to an actor-focus allows an analysis of non-hierarchical policy-making in non-state contexts, as illustrated below for the case of climate policy. Like agenda-setting and decision-making, policy implementation represents a large and well-established research focus in its own right. Policy implementation beyond the state has the particular twist to it that decisions have either been taken in an international venue and need to be voluntarily transposed into national law, or that they depend on voluntary action altogether. Only rarely are policies beyond the state truly subject to a hierarchical order. This eventually also applies to EU policy-making that ultimately depends on the willingness of national governments to comply with joint EU-decisions (Börzel and Heidbreder, 2017). Accordingly, the question of ‘compliance’ of national actors with EU-law has brought forth a whole stream of literature (Treib, 2014). While initial compliance studies focused very much on the formal transposition of EU regulation into national law, the agenda has widened to direct implementation by specialised EU agencies (Egeberg and Trondal, 2011) and ‘beyond compliance’ to more general questions of implementation and enforcement (Thomann and Sager, 2018). These studies go as far as studying the relevance of low-level bureaucrats for the implementation of policies that come from ‘outside’ their immediate political context (Dörrenbächer, 2017). These perspectives allow the generalisation more broadly of the regularities and dynamics of implementation beyond the state along long-standing dimensions (Heidbreder, 2017, 2015). 
Such questions have raised increasing attention to the underpinning administrative structures and processes of administration because policies beyond the state naturally imply the need to either create new administrations that cut across borders (mainly international secretariats), or ways for national administrations to cooperate across borders given that
policy enforcement largely remains purely state-based (Heidbreder, 2011). The studies on multilevel administrations thus offer deeper insights for policy implementation beyond the state more generally (Bauer and Trondal, 2015). Hence, the particularities of EU policy-making have been systematically compared to secretariats of international organisations, revealing striking similarities in policy initiation, formulation and implementation (Bauer et al., 2017) and link to other research on international bureaucracies more generally (Knill and Bauer, 2016). In sum, these studies that treat policies beyond the state highlight more than the conceptual usefulness of traditional Public Policy heuristics. The theoretical extension has been applied intensely in EU studies but also in other international, transnational or global policy-making. Studying these non-state policy-making venues and actors implies a shift of attention to the relevance of policy networks, the interaction between different jurisdictions, the role of veto-players and the importance of the varying degrees of institutionalisation of procedures and rules. In addition, the focus on supranational policy-making has brought forth new topics. The so-called Europeanisation literature asks how EU policy-making shapes national polity, politics and policies and vice-versa (Green Cowles et al., 2001). In addition these EU-specific findings have been extended to a comparative research agenda to analyse transnational policy-making in other sub-national regional units (Börzel and Risse, 2015).
The empirical research: climate policy as an exemplary case
Given the different analytical guises and concepts related to policies beyond the state, empirical research into this phenomenon has been the subject of varying research agendas that do not make for a single, homogenous field of study. Therefore, this
section does not offer a review of all the policy fields where it occurs, but instead uses the well-researched, yet complex, field of environmental and climate change policy to exemplify the variety of dimensions of the phenomenon and to illustrate the plurality of research concerned with it. In particular, international climate change policy-making demonstrates the relevance of policy-making in networks and venues that depart from traditional notions of territory and hierarchy, while the impact of this policy-making similarly permeates the notion of boundaries. The advent of climate change as a policy issue departs from traditional notions of hierarchy in policy-making, as it was primarily scientists’ concerns and their collective activity as ‘knowledge brokers’ (Bodansky, 2001: 27) that helped frame climate change as an international political issue, ultimately leading to the set-up of the Intergovernmental Panel on Climate Change (IPCC) under the auspices of the United Nations Environment Programme in the late 1980s. Initial agenda-setting on climate change thus followed a bottom-up approach that was enabled from the outset by the networked activity of non-traditional actors, namely scientists and their allies. While the IPCC was indeed designed as an intergovernmental organisation, its set-up included both governments and non-governmental organisations as observers. Scientists formally involved in this body also influence it through their activity as an epistemic community around this issue (Newell, 2006: 41–43), thereby demonstrating one of the ways in which governmental and non-governmental actors can interact in policy-making beyond the state. As the more recent example of epistemic communities involved in advocacy for so-called carbon capture and storage technology shows, these can include unusual constellations of actors such as businesses and civil society organisations working hand-in-hand (Mariussen, 2010). 
The United Nations Framework Convention on Climate Change (UNFCCC) and the associated Conferences of the Parties (COPs)
have emerged as the main venues for international climate change policy-making, including the groundbreaking Kyoto Protocol of 1997 and the 2015 Paris Agreement. These international agreements set out reductions in greenhouse gas emissions. Accordingly, a lot of research has focused on the dynamics of these gatherings, which once more include both governmental and international non-governmental actors. The most recent studies on the 21st meeting in Paris serve as a telling example to understand how a transnational consensus on a single policy, namely climate change policy targets, could be found and which roles were played by the individual actors both in the run-up to and during the meeting (Dimitrov, 2016). This meeting in particular is well-researched as it led to the conclusion of the Paris Agreement with its associated target and measures to limit climate change. Similar to early agenda-setting and the activity of the IPCC, various kinds of non-governmental actors ranging from businesses to civil society organisations have also continued to play a role in global climate change policy-making in this context (Bäckstrand et al., 2017). For the notion of policy-making beyond the state in this example, it is particularly relevant that governmental actors and NGOs should not be seen as antagonists in this process, as a large part of climate change policy-making relies on partnerships stretching across these distinctions (Pattberg, 2010). Research into the COPs as the most formalised venue for climate governance is useful when gathering insights into one particular dimension of climate change policy-making. However, as is part of the underlying dynamics of policy-making beyond the state, the COPs are only one of multiple venues within which climate change policy-making occurs. 
Various traditional and non-traditional actors are able to use different fora for different kinds of discussions on climate change policy, which ultimately contributes to a fragmentation of the global climate change regime (Zelli, 2011). For instance, at the global level, climate change policy-making
also forms part of the activities of the regular gatherings of the G7/G8 industrialised nations, or indeed the G20 of leading global economies (Kirton and Kokotsis, 2015). It is also relevant to note that while important decisions on policy are taken globally, these need to be complemented by policy-making in other contexts, such as at the national level, to achieve the targets set in a global context. This ultimately shows that climate change is a policy issue that is decided upon concurrently at the global, national and other levels (Rabe, 2007). An important example of the complexity that ensues from this multilayered structure is the climate policy of the EU as a dedicated venue for transboundary policy-making, which is both influenced by decisions at the global level and shaped by the positions of its member states (Skjærseth and Wettestad, 2008). Beyond that, climate change policy-making also occurs at levels and by governmental actors not traditionally considered in international policy-making. For instance, both regional governments and city authorities have developed trans-border climate change policies on their own, often in the absence of or despite policy-making at the international and national level. This has been studied with regard to the role of networks between US states and Canadian provinces in North America (Boyd, 2017), within networks of globalised cities (Bulkeley, 2013), or in the more institutionalised context of transnational municipal networks within the EU (Giest and Howlett, 2013). In addition to the diversity of decision-making venues and actor constellations, climate and environmental policy-making also demonstrates the advent of new modes of policy-making, such as the set-up of market-based mechanisms or environmental certification of products to shape outcomes. 
In the realm of climate policy-making, the issue of greenhouse gas emissions markets is a particularly prominent policy tool, and one extensive such emissions market exists within the EU (Skjærseth and Wettestad, 2008). Similarly, product or practice certification aims at
setting incentives, and this has occurred for multiple environmental issues (Bartley and Smith, 2010) including the management of forests (Chan and Pattberg, 2008). The latter phenomenon is driven not by traditional state actors, but by businesses, NGOs and other stakeholders working together. Climate change policy-making thus occurs in parallel in different venues uniting different actor constellations at different levels of decision-making and with different modes of policy-making. In addition to this, there are also important bottom-up interactions between these levels and venues. Once more the example of the EU can be used, as it has aimed to internationalise its environmental and climate change policies by advocating for defining its policies as global standards (Kelemen, 2010). Even in the absence of an active push for the global adoption of climate change policy-making in one venue, such policy-making can also have indirect effects on others. For instance, the adoption of stringent car emissions standards in one jurisdiction may have important effects in others, including on ensuing policy-making in those jurisdictions (Perkins and Neumayer, 2012). Environmental and climate change policy-making is a particularly well-researched field that can serve to illustrate the complexity and prevalence of policies beyond borders, including policy-making processes that involve networks of non-traditional actors which permeate established boundaries, multiple and often parallel venues for decision-making, innovative policy-making mechanisms as well as complex policy impacts at different levels. Climate policy is by no means the only policy area with such elementary policy-making features that reach beyond the state. Research into other global challenges such as economic and financial stability, terrorism, or international migration can serve as the basis for similar enquiries into the phenomenon. 
Given the persistent multitude of theoretical angles and specific research interests policies beyond the state are treated under, the empirical studies cover a wide and very dispersed
literature that is mostly accessed through the specific streams rather than an overarching agenda on policies beyond the state.
Future challenges for research and practice
The conceptual, theoretical and empirical overview of policies beyond the state presented here points to relevant scientific, applied and normative questions for future research. The on-going debates indicate the concern for policy-making beyond the state in many, very diverse research agendas. Despite its academic richness, this diversity indicates that there is a continued gap between researchers interested in policy-making research and those that study policies rather indirectly, through questions focused on issues such as the possibility and dynamics of international cooperation. Notwithstanding the very encompassing research on policy-making processes as well as its practical and normative implications, IR and Public Policy research have still not fully established the policy-making side as a joint standard analytical perspective. Although recent research has expanded to international administration as a subject of research in its own right, a consideration of policy-making approaches is often rather a side-aspect when studying other phenomena such as regionalism or the role of international organisations in globalised politics. This concluding section will therefore spell out the remaining scientific, applied and normative gaps. First, we can identify multiple scientific desiderata. These follow mainly from the still by-and-large separate research fields of IR and Policy Analysis. Although both subdisciplines tackle policies beyond the state from their respective analytical angles, the specific policy-related insights are rarely directly linked. In EU research, this inter-disciplinary divide has in some part been overcome by the ‘governance turn’ (Kohler-Koch and
Rittberger, 2006), which has led to research that considers the EU as a polity ‘beyond the state’ in its own right and, accordingly, its specific policy-making system (Wallace et al., 2014). The introduction of the governance framework allows analysing systematically how an explicit non-state polity operates and produces policies. However, while the EU is often depicted as a laboratory for non-state policy-making processes, findings are hardly ever generalised beyond the EU or subject to serious comparative research (legal scholars more regularly draw parallels between EU and international law and governance, see e.g. Mendes and Venzke, 2018). Therefore, to advance the study of policies beyond the state, we are faced with multiple challenges. On the one hand, much basic conceptual and theoretical research on how to transfer established concepts and explanations to the transboundary context, especially at a generalisable level, is still missing. On the other hand, studying phenomena beyond national jurisdictions and institutionally established structures implies significant empirical challenges of how to identify, measure and compare the complex processes of policy-making, especially given the interlinkages with established institutions and decision-making models. The existing studies on regional policy-making and international bureaucracy offer valuable starting points for these endeavours. Second, in the face of policy problems beyond states, the study of the conditions under which policies beyond borders can be produced and effectively executed gains preeminent relevance. Borderless challenges such as global warming, economic crises, migration or organised crime and terrorism can hardly be resolved within the territorial boundaries of a single state (interestingly, opponents of multilateralism and cross- or sub-national governance often negate the existence of these threats altogether). 
In contrast, assuming interdependencies between states, communities and individuals, as well as the ecosystems we live in, makes research
on policy-making beyond the state preeminent. Exemplary questions include how global climate agreements can be effectively implemented, or if completely different instruments at various levels are needed; how global economic networks can be stabilised or should be reshaped; and how and why humans migrate and how these movements can be steered in a mutually beneficial manner. However, even if the interdependency assumption is relaxed, policies beyond the state remain a significant practical issue, also for representatives of rational theories who consider international politics essentially as an interaction between unitary states, because they raise the question of how to treat and use international venues. Given recent political attempts to reverse established practices of policy-making beyond the state, in particular in institutionalised multilateral venues such as the EU, the United Nations or NATO, it is in turn necessary to ask whether and under what conditions such policy-making is undone, and if decision-making logics stratify over time. In addition, all these trends take place in a context marked by radical changes in information and communication technologies that offer previously unknown means to manage and develop governance structures beyond boundaries, for instance in the administration of cross-border policies (Heidbreder, 2015). The impact of digitalisation, the new opportunities it creates for effective transboundary policy-making and its wider implications remain largely under-researched. Third, normative implications of policy-making beyond the state follow directly from these applied and empirical issues and revolve mainly around questions of the legitimation of de-territorialised policy-making. Not only is the traditional state order based on hierarchical command-and-control systems of governance challenged through an at least partial dissolution thereof. In the case of little or non-institutionalised processes and networks, legitimation of policy-making is also rendered more complex. 
The standard model for democratic participation
remains built around the assumption of hierarchical and territory-based decision-making and legitimation chains of a closed and independent jurisdiction. It is precisely these features that are permeated or even dissolved by policy-making beyond the state. A debate on the possibility for liberal democratic policy-making in an internationalised context has developed in parallel to these observations (Plattner, 2007), though it has recently become more pronounced. In the context of the EU, this has led to an intensive debate on the presence of a ‘democratic deficit’ of the non-state-like governance system (Føllesdal and Hix, 2006; Moravcsik, 2002). A parallel discussion on the legitimacy of the EU’s supranational decision-making emphasises that despite its institutional innovations, it is only the persistent relevance of traditional democratic institutions that can provide legitimacy in ordinary supranational decision-making or bottom-up processes involving civil society (Kohler-Koch, 2010, 2013a). The question of how the mismatch between the decision-making and the decision-taking actors, which is inevitable in policy-making beyond the state, can be mitigated and ‘democratised’ remains a continued, pressing issue for all policy-making beyond the state. While, at least theoretically, authors have pointed to possibilities for (democratic) incipient legitimation strategies beyond the state, awareness of the problem has risen in the public discourse in recent years because both ‘globalisation’ and ‘Europeanisation’ have entered the domain of political contestation. Notable opponents of these processes claim that the aim of ‘globalist’ elites involved in policy-making beyond the state is ultimately to circumvent nationally delineated modes of democratic legitimation. 
The policies of the Donald Trump administration, or the attempts to repatriate policy-making powers to a sovereign UK after the Brexit referendum are but a few examples of the results of the very questioning of policies beyond borders as a new societal cleavage. Beyond empirical research
agendas into the potential for policy-making beyond borders in an increasingly politicised context, such debates also raise important normative questions that research needs to be concerned with. To name but a few: How can personal freedoms and rights be protected in a globally interlinked digital space? How can citizens influence inter- or sub-state decision-making that happens between, but not within, a democratically ordered territory of a state? Who is ultimately responsible for externalities of policy-making that span beyond the space, the group or the time-span the decision-makers have been mandated for? And who are legitimate decision-makers after all? These questions of how we should and want to be governed are essential if we assume global challenges signal a growing need for policy-making beyond the state. Given on-going social, economic and technological developments, policy-making beyond the state will remain a vital phenomenon that deserves to be studied more closely. Socially and politically, actions and lives themselves have become more mobile and indeed ‘transboundary’. In the realm of the economy, an increasing decoupling between sites of the knowledge economy and the production of physical goods has also increased the degree of international exchanges, and the linkage of the physical economy with digital aspects has altered power relationships in this domain. Technological changes, in particular, have radically altered the way in which individuals can communicate with one another and gain insights into developments in faraway places, yet they have also created the possibility for faraway actors to influence public opinion and political processes elsewhere. These innovations add thrust to the scope of policies beyond the state, both in terms of problem definitions and policy-making tools to reply to these challenges. These fundamental dynamics imply not least a reallocation of social, economic and power resources. 
Hence, economic inequality, international conflicts, and the consequences of climate change have created larger linkages
between societies, leading to actual mobility across borders and an increasing politicisation of migration. At the same time, we observe not only an increase of policy-making beyond the state, but also massive resistance based on policy solutions to roll back policy-making beyond the state in favour of traditional, territorially delineated national policy-making within the hierarchically ordered state. Against this backdrop, a further consideration of policy-making beyond the state not only faces significant research gaps, but also creates practical challenges for policy-makers and raises important normative questions that deserve more intense efforts and attention.
References
Archibugi, D., Koening-Archibugi, M. and Marchetti, R. (eds.) (2011) Global Democracy: Normative and Empirical Perspectives (Cambridge: Cambridge University Press).
Bäckstrand, K., Kuyper, J. W., Linnér, B.-O., et al. (2017) Non-state Actors in Global Climate governance: From Copenhagen to Paris and Beyond. Environmental Politics 26(4): 561–79.
Barnett, M. N. and Finnemore, M. (1999) The Politics, Power, and Pathologies of International Organizations. International Organization 53(4): 699–732.
Bartley, T. and Smith, S. N. (2010) Communities of Practice as Cause and Consequence of Transnational Governance: The evolution of Social and Environmental Certification. In: Djelic, M.-L. and Quack, S. (eds.) Transnational Communities: Shaping Global Economic Governance (Cambridge: Cambridge University Press), pp. 347–74.
Bauer, M. W., Knill, C. and Eckhard, S. (eds.) (2017) International Bureaucracy: Challenges and Lessons for Public Administration Research (London: Palgrave Macmillan).
Bauer, M. W. and Trondal, J. (eds.) (2015) The Palgrave Handbook of the European Administrative System (Houndmills: Palgrave).
Bennett, L. W. (2005) Social Movements Beyond Borders: Understanding Two Eras of Transnational Activism. In: della Porta, D. and Tarrow, S. (eds.) Transnational Protest and Global Activism (Lanham: Rowman & Littlefield), pp. 203–26.
Benz, A. (2007) Governance in Connected Arenas – Political Science Analysis of Coordination and Control in Complex Rule Systems. In: Jansen, D. (ed.) New Forms of Governance in Research Organizations: Disciplinary Approaches, Interfaces and Integration (Dordrecht: Springer), pp. 3–22.
Benz, A. (2009) Combined Modes of Governance in EU Policymaking. In: Tömmel, I. and Verdun, A. (eds.) Innovative Governance in the European Union (Boulder: Lynne Rienner), pp. 27–44.
Best, E. (2016) EU Decision-Making: An Overview of the System. In: Best, E. (ed.) Understanding EU Decision-Making (Cham: Springer), pp. 5–24.
Bodansky, D. (2001) The History of the Global Climate Change Regime. In: Lutterbacher, U. and Sprinz, D. F. (eds.) International Relations and Global Climate Change (Cambridge, MA: MIT Press), pp. 23–40.
Börzel, T. and Heidbreder, E. G. (2017) Enforcement and Compliance. In: Harlow, C., Leino-Sandberg, P. and della Cananea, G. (eds.) Research Handbook on EU Administrative Law (Cheltenham: Edward Elgar), pp. 241–62.
Börzel, T. A. and Risse, T. (2015) The EU and the Diffusion of Regionalism. In: Telò, M., Fawcett, L. and Ponjaert, F. (eds.) Interregionalism and the European Union: A Post-Revisionist Approach to Europe’s Place in a Changing World (Farnham: Ashgate), pp. 51–65.
Boyd, B. (2017) Working Together on Climate Change: Policy Transfer and Convergence in Four Canadian Provinces. Publius: The Journal of Federalism 47(4): 546–71.
Bulkeley, H. (2013) Cities and Climate Change (Abingdon: Routledge).
Chan, S. and Pattberg, P. (2008) Private Rule-Making and the Politics of Accountability: Analyzing Global Forest Governance. Global Environmental Politics 8(3): 103–21.
Dimitrov, R. S. (2016) The Paris Agreement on Climate Change: Behind Closed Doors. Global Environmental Politics 16(3): 1–11.
Dörrenbächer, N. (2017) Europe at the Frontline: Analysing Street-level Motivations for the Use of European Union Migration Law. Journal of European Public Policy 24(9): 1328–47.
Eberlein, B. and Kerwer, D. (2004) New Governance in the European Union: A Theoretical Perspective. Journal of Common Market Studies 42(1): 121–42.
Egeberg, M. and Trondal, J. (2011) EU-level Agencies: New Executive Centre Formation or Vehicles for National Control? Journal of European Public Policy 18(6): 868–87.
Føllesdal, A. and Hix, S. (2006) Why There is a Democratic Deficit in the EU: A Response to Majone and Moravcsik. Journal of Common Market Studies 44(3): 533–62.
Giest, S. and Howlett, M. (2013) Comparative Climate Change Governance: Lessons from European Transnational Municipal Network Management Efforts. Environmental Policy and Governance 23(6): 341–53.
Green Cowles, M., Caporaso, J. and Risse, T. (eds.) (2001) Transforming Europe: Europeanisation and Domestic Change (Ithaca: Cornell University Press).
Hafner-Burton, E. and Montgomery, A. H. (2006) International Organizations, Social Networks, And Conflict. Journal of Conflict Resolution 50(1): 3–27.
Hall, P. A. and Taylor, R. C. R. (1996) Political Science and the Three New Institutionalisms. MPIFG Discussion Paper 96(6). Available at https://www.mpifg.de/pu/mpifg_dp/dp96-6.pdf Accessed 9 January, 2020.
Hartlapp, M., Metz, J. and Rauh, C. (2013) Linking Agenda Setting to Coordination Structures: Bureaucratic Politics inside the European Commission. Journal of European Integration 35(4): 425–41.
Heidbreder, E. G. (2011) Structuring the European Administrative Space: Policy Instruments of Multi-level Administration. Journal of European Public Policy 18(5): 709–26.
Heidbreder, E. G. (2012) Civil Society Participation in EU Governance. Living Reviews in European Governance 7(2). doi:10.12942/lreg-2012-2.
Heidbreder, E. G. (2015) Multilevel Policy Enforcement: Innovations in How to Administer Liberalized Global Markets. Public Administration 93(4): 940–55.
Heidbreder, E. G. (2017) Strategies in Multilevel Policy Implementation: Moving Beyond the Limited Focus on Compliance. Journal of European Public Policy 24(9): 1367–84.
Heidbreder, E. G. and Brandsma, G. J. (2017) The EU Policy Process. In: Ongaro, E. and van Thiel, S. (eds.) The Palgrave Handbook of Public Administration and Management in Europe (London: Palgrave), pp. 805–821.
Héritier, A. and Lehmkuhl, D. (2008) The Shadow of Hierarchy and New Modes of Governance (Introduction). Journal of Public Policy 28(1): 1–17.
Howlett, M., Ramesh, M. and Perl, A. (2009) Studying Public Policy: Policy Cycles and Policy Subsystems (Oxford: Oxford University Press). (3rd edition).
Keck, M. E. and Sikkink, K. (2014) Activists beyond Borders: Advocacy Networks in International Politics (New York: Ithaca).
Kelemen, R. D. (2010) Globalizing European Union Environmental Policy. Journal of European Public Policy 17(3): 335–49.
Keohane, R. O. and Nye, J. S. (1974) Transgovernmental Relations and International Organizations. World Politics 27(1): 39–62.
Kirton, J. J. and Kokotsis, E. (2015) The Global Governance of Climate Change: G7, G20, and UN Leadership (Farnham: Ashgate).
Kjær, A. M. (2004) Governance (Cambridge: Polity Press).
Knill, C. and Bauer, M. W. (2016) Policy-making by International Public Administrations: Concepts, Causes and Consequences. Journal of European Public Policy 23(7): 949–59.
Kohler-Koch, B. (2010) Civil Society and EU Democracy: ‘Astroturf’ Representation? Journal of European Public Policy 17(1): 100–16.
Kohler-Koch, B. (2013a) Civil Society and Democracy in the EU: High Expectations Under Empirical Scrutiny. In: Kohler-Koch, B. and Quittkat, C. (eds.) De-Mystification of Participatory Democracy (Oxford: Oxford University Press), pp. 1–17.
Kohler-Koch, B. (2013b) Civil Society Participation: More Democracy or Pluralization of the European Lobby? In: Kohler-Koch, B. and Quittkat, C. (eds.) De-Mystification of Participatory Democracy (Oxford: Oxford University Press), pp. 173–91.
Kohler-Koch, B. and Rittberger, B. (2006) The ‘Governance Turn’ in EU Studies. Journal of Common Market Studies 44(Annual Review): 27–49.
March, J. G. and Olsen, J. P. (2004) The Logic of Appropriateness. ARENA Working Papers WP 04/09.
Mariussen, Å. (2010) Global Warming, Transnational Communities, and Economic Entrepreneurship: The Case of Carbon Capture and Storage (CCS). In: Djelic, M.-L. and Quack, S. (eds.) Transnational Communities: Shaping Global Economic Governance (Cambridge: Cambridge University Press), pp. 327–46.
Marsh, D. and Rhodes, R. A. W. (1992) Policy Networks in British Government (Oxford: Clarendon Press).
Mattli, W. and Büthe, T. (2005) Global Private Governance: Lessons from a National Model of Setting Standards in Accounting. Law and Contemporary Problems 68(3 & 4): 225–62.
Mendes, J. and Venzke, I. (eds.) (2018) Allocating Authority: Who Should Do What in European and International Law? (Oxford: Hart Publishing).
Moravcsik, A. (2002) In Defence of the ‘Democratic Deficit’: Reassessing Legitimacy in the European Union. Journal of Common Market Studies 40(4): 603–24.
Newell, P. (2006) Climate for Change: Non-State Actors and the Global Politics of the Greenhouse (Cambridge: Cambridge University Press).
Pattberg, P. (2010) Public–Private Partnerships in Global Climate Governance. Wiley Interdisciplinary Reviews: Climate Change 1(2): 279–87.
Perkins, R. and Neumayer, E. (2012) Does the ‘California effect’ Operate Across Borders? Trading- and Investing-up in Automobile Emission Standards. Journal of European Public Policy 19(2): 217–37.
Perl, A. (2013) International Dimensions and Dynamics of Policy-making. In: Araral Jr., E., Fritzen, S., Howlett, M., et al. (eds.) Routledge Handbook of Public Policy (Milton Park: Routledge), pp. 44–56.
Plattner, M. F. (2007) Democracy Without Borders?: Global Challenges to Liberal Democracy (Lanham: Rowman & Littlefield).
Princen, S. (2009) Agenda-Setting in the European Union (Houndsmill: Palgrave).
Quittkat, C. and Kotzian, P. (2011) Lobbying via Consultation – Territorial and Functional Interests in the Commission’s Consultation Regime. Journal of European Integration 33(4): 401–18.
Rabe, B. G. (2007) Beyond Kyoto: Climate Change Policy in Multilevel Governance Systems. Governance 20(3): 423–44.
1064
The SAGE Handbook of Political Science
Rhodes, R. A. W. (1996) The New Governance: Governing without Government. Political Studies 44(4): 652–67. Rosenau, J. and Czempiel, E.-O. (eds.) (1992) Governance without Government (Cambridge: Cambridge University Press). Scharpf, F. (2006) The Joint-Decision Trap Revisited. Journal of Common Market Studies 44(4): 845–64. Skjærseth, J. B. and Wettestad, J. (eds.) (2008) EU Emissions Trading: Initiation, Decision-Making and Implementation (London: Routledge). Thomann, E. and Sager, F. (eds.) (2018) Innovative Approaches to EU Multilevel Implementation: Moving Beyond Legal Compliance (London: Routledge). Treib, O. (2014) Implementing and Complying with EU Governance Outputs. Living Reviews in European Governance 9(1).
Treib, O., Bähr, H. and Falkner, G. (2005) New Modes of Governance: A Note Towards Conceptual Clarification. European Governance Papers (EUROGOV): N-05-02. Tsebelis, G. (2002) Veto Players: How Political Institutions Work (Princeton, NJ: Princeton University Press). Wallace, H., Pollack, M. A. and Young, A. R. (eds.) (2014) Policy-Making in the European Union (Oxford: Oxford University Press). Zelli, F. (2011) The Fragmentation of the Global Climate Governance Architecture. Wiley Interdisciplinary Reviews: Climate Change 2(2): 255–70. Zürn, M. (2003) Globalization and Global Governance: From Societal to Political Denationalization. European Review 11(3): 341–64.
64 Politics and Policy Giliberto Capano
Policy and Politics: same object, two words, two disciplines?
The definition of concepts and the adoption of specific terms represent the first steps of any scientific undertaking. Defining the phenomenon to be studied in one way rather than in another may radically influence the theoretical perspective through which politics is analyzed. Different definitions correspond to different ways of conducting the Science of Politics (Lowi, 1992). In Political Science, the ‘mother’ of all definitions is the definition of the political phenomenon, and this intrinsically involves the dyad represented by politics and policy. The distinction between politics and policy is not a simple one to establish. There is not only a linguistic problem – that is, that these two words exist only in English, while in other languages just one word is used to express the political phenomenon – but also the historical legacy with which each country addresses the social construction of what political affairs are. We could quickly resolve this question by following Heinz Eulau, who underlined that, all in all, the fact that only the English language has two words covering the same semantic field, due to a specific historical development of the UK, while all other languages only have one word for the two concepts, ‘suggests that there is not politics apart from policy and no policy apart from politics. The differentiation that can be made is analytic and does refer to something concrete’ (Eulau, 1977: 420). However, by following this perspicacious observation, something would remain totally unexplained, namely the emergence of public policy as a specific object of study accompanied by a number of specific theoretical (sub-)disciplines devoted to analyzing this object. The question of why the policy sciences exist would remain unanswered, as would the question of what justifies policy analysis and why Public Policy permits so many political scientists to call themselves policy scholars.
The analytical distinction suggested by Eulau should probably be the beginning of every reflection on the topic. In fact, precisely because Eulau underlines that the distinction is only analytical and not concrete, it permits different definitions of the same object to be offered or different dimensions of the same phenomenon to be focused on. All in all, analytical distinctions are one of the pillars of the analysis of ‘reality’, whereby that reality is circumscribed and ordered, and its relevant dimensions are established. Moreover, different analytical definitions permit different characteristics of the same phenomenon to be grasped. This is definitely true in the case of the politics–policy dyad. Let me here recall that in the endless debate about the conceptualization of political phenomena, this dual dimension is always present. If we assume, for now, that politics is a continuous struggle for power and the exercise of that power, while policy is the complex process whereby solutions to collective problems are pursued, then these two dimensions have been present in the theorization of politics since ancient times. Consequently, it is not surprising that in contemporary political science, too, the definitional debate continues to rumble on. The never-ending nature of this debate clearly shows what the implications are, in terms of scientific inquiry, of focusing either on politics or on policy: depending on the definition of the political phenomenon adopted, different disciplinary perspectives are at stake. This underlines the importance of the politics–policy dyad with all of its analytical, theoretical and empirical consequences. In this chapter I shall try to disentangle the politics–policy conundrum by using an analytical framework, starting (in the section ‘Concepts and Words: A Brief Historical Overview’) with a brief historical overview of the evolution of the concept(s) of the political phenomenon. In the section ‘Different Words, Different Meanings, Different Disciplines?’, I shall focus on the antagonistic rise of a policy-centered science of politics, while in the section ‘Reconnecting Policy and Politics. The State of the Empirical Study of Political Phenomena’ I shall show how the two dimensions of the political phenomenon can be reconnected through the public policy approach.
Concepts and words: a brief historical overview
The Politics–Policy Dyad Prior to the Advent of Political Science
Like other concepts that change their meaning according to the context and the time when they are used (as witnessed, for example, by the historical development of the meaning of the concept of ‘family’), the concept of politics has also undergone significant changes over time. Moreover, different words have been used to cover its semantic field. The Ancient Greeks’ concept of politics differs considerably from current definitions of the same notion (or at least from the definition of politics most commonly adopted by political scientists, since, as I shall show below, one element of that ancient definition has reappeared with the emergence of the policy perspective). The word ‘πολιτική’ looks the same, but the meaning is very different, and above all, the context has completely changed. In fact, the Greek conceptualization of politics was based on the indistinctness of politics, society and ethics. Politics was conceived as the collective dimension of communal living, as the specific dimension that distinguishes human beings from animals. Furthermore, the Greek word means, unsurprisingly, ‘“the things concerning” the πόλις (the city)’. As Aristotle (Politics 1252a) has written:
it is evident that a city is a natural production, and that man is naturally a political animal, and that whosoever is naturally and not accidentally unfit for society, must be either inferior or superior to
man: thus the man in Homer, who is reviled for being ‘without society, without law, without family’. Such a one must naturally be of a quarrelsome disposition, and as solitary as the birds. The gift of speech also evidently proves that man is a more social animal than the bees, or any of the herding cattle: for nature, as we say, does nothing in vain, and man is the only animal who enjoys it. Voice indeed, as being the token of pleasure and pain, is imparted to others also, and thus much their nature is capable of, to perceive pleasure and pain, and to impart these sensations to others; but it is by speech that we are enabled to express what is useful for us, and what is hurtful, and of course what is just and what is unjust: for in this particular man differs from other animals, that he alone has a perception of good and evil, of just and unjust, and it is a participation of these common sentiments which forms a family and a city.
Thus, politics was conceived as the totalizing dimension of the Greek citizen: it did not define the political phenomenon but the human being (Sartori, 1973, 1987). Seen from this perspective, the ultimate political good was the collective good of wealth. In some ways this feature echoes one fundamental dimension of Lasswell’s (1951, 1971) proposal for policy orientation. This Ancient Greek conception of the political was necessarily characterized by those horizontal relationships that are typical of living as a community (koinonia politiké) where the exercise of power is collectively managed. The Aristotelian conception of politics has come to the fore, notwithstanding the variety of perspectives that characterized Ancient Greek tradition. These include Plato’s hierarchical, profoundly ethical conception, the utilitarian vision of the Sophists, and the definition of politics as the art of persuasion as proposed by Protagoras. However, the Aristotelian conception of politics (with its ‘policy’ dimension) was to prevail, thanks to the Christian doctrine, and particularly to the Thomistic interpretation, which by bringing together religion, ethics, law and politics, institutionalized politics’ focus on what needed to be done to achieve the common good. Until Machiavelli, what for us would constitute real politics – and thus all questions
relating to power, its distribution and its exercise – was not covered by the word politics itself, but by words like ‘regnum’, ‘dominium’, ‘gubernaculum’ and ‘principatus’. From this point of view, although applying contemporary categories to the past is always difficult and methodologically questionable, it could be said that until the Middle Ages, different words were used to define political phenomena and (in one way or another) the policy–politics dimension. It is not surprising that Machiavelli, whom many consider to be the first modern political thinker (politics being perceived as an activity strictly related to power and the exercise of it) and the founder of modern political science (Mansfield, 1981), titled his fundamental work The Prince precisely because he wanted to emphasize that vertical dimension of politics that was not expressed by the contemporary use of the word. In terms of the politics–policy dyad, Machiavelli’s thought may be considered double-sided. In fact, on the one hand, the Florentine thinker is considered the founder of modern political science, not only for his primary focus on power and its exercise, but also, and above all, for his realistic perspective. On the other hand, he may also be considered a policy analyst. Indeed, thanks to his realistic (because power-centered) focus and empirical approach, Machiavelli develops what we might call an applied vision of politics: his main goal is to offer the Prince the best possible policy advice, meaning what instruments to adopt in order to pursue his chosen ends. In doing so, Machiavelli analyzed the decisional and policy-making process by paying specific attention to the instruments utilized for the implementation of decisions and policy. Thus, he focused on the various aspects of the art of government, in order to render it more effective (Friedrich, 1963; Regonini, 1995), and not on power alone. 
This aspect of Machiavellian thinking has been seen as a way by which to carry out policy analysis (Baakman, 1997) and even to conduct public administration (Tholen, 2016).
However, this double-sided nature of Machiavelli’s thought is often forgotten, whereas the aforementioned persistence of the politics–policy dyad in his thought is of crucial importance here. Unsurprisingly, the policy dimension of Machiavelli’s thought also emerged in translations of his works over the course of the following two centuries. In England, for example, ‘policy’ and ‘practice’ were used ‘to denote both the practical art of politics and the theory or doctrine of that art’, and became in the late 16th and early 17th century ‘the technical terms of Machiavellianism in England’ (Orsini, 1946: 122). Moreover, in Germany, as Heidenheimer (1986) reminds us, certain translations rename Machiavelli’s Prince as ‘Policey’. This interpretation of Machiavelli’s thought shows exactly how, from the 16th century on, European countries started to shift towards increasing Stateness, albeit at different rates and with differing density (Tilly, 1975). What is interesting is that the process of establishment of the absolute state was accompanied by the word ‘policy’, in its various guises – police in France, policey in Germany, policy in England – being linked to the action of the State. In all three variants, the meaning was almost the same: administration, system and organization of governing, and domestic rule-making. While in the UK, and thus in English, the word has survived, in French and in German, as well as in other continental languages, it has disappeared, with only one specific meaning (police/polizey) surviving. The prevalence of the ‘police’ concept over the policy concept is the product of the complex dynamics of the State’s evolution, which during the 19th century led the word ‘politics’ to embrace the entire semantic field previously covered by various different terms. Here the dividing line is clearly represented by the changed role of the State. 
Unsurprisingly, one of the most oft-cited definitions of politics is from Max Weber (1919: 78), who, on the basis of his in-depth, innovative study of the process of the modern State’s formation, proposed his power-centered – thus also State-centered – definition of politics:
Hence, ‘politics’ for us means striving to share power or striving to influence the distribution of power, either among states or among groups within a state. This corresponds essentially to ordinary usage. When a question is said to be a ‘political’ question, when a cabinet minister or an official is said to be a ‘political’ official, or when a decision is said to be ‘politically’ determined, what is always meant is that interests in the distribution, maintenance, or transfer of power are decisive for answering the questions and determining the decision or the official’s sphere of activity. He who is active in politics strives for power either as a means in serving other aims, ideal or egoistic, or as ‘power for power’s sake’, that is, in order to enjoy the prestige-feeling that power gives.
However, Weber’s hyper-realistic definition is also qualified by the requirement of ‘the service of a ‘cause’’ (ibidem). Obviously, this ‘cause’ is linked to the concept of the vocation of politics and thus has a strongly normative, and in some way tragic, meaning. Nonetheless, this often forgotten detail of the Weberian definition once again calls for recognition of the duality of the political phenomenon. Power is central to the definition of politics, but it cannot stand alone. This point is fundamental to an understanding of the politics–policy dyad. Very often in contemporary political science, there is a common belief that everything is about power, and very often Weber’s thoughts are considered a cornerstone of this belief. However, not only did Weber not say as much, but if power is to be considered a constitutive dimension of politics, it needs to be contextualized exactly in those arenas and processes where the political phenomenon emerges.
The Dyad in Political Science
The two sides of the definition of politics (power and ‘something else’) have constantly characterized the debate within political science. Here the clash between the ‘major thinkers’ of political science is evident.
In this work, I am going to look at the contributions made by David Easton, Robert Dahl, Harold Lasswell and Giovanni Sartori. Apparently the ‘power’ dimension of politics also characterizes David Easton’s (1953: 129) famous definition of politics as ‘the authoritative allocation of values for the whole society’. However, in this case too, the exercise of power within society is not enough on its own to define politics. In fact, when referring to values – be they symbolic or material – Easton focuses on all those activities by which such values are assigned. Furthermore, he looks at what these values are, how they are established, and which design is followed to deliver them. In Easton’s (1953: 355) view, power is not the only relevant variable to be used to circumscribe and define politics, and thus he points out that ‘the interest for political science on power is only educed from its preoccupation on how policy is made and executed’ (Easton, 1953: 144). Dahl and Stinebrickner clearly abandoned the recurrent double-sided definition of politics in favor of a purer, more parsimonious definition: ‘politics is simply the exercise of influence’ (Dahl and Stinebrickner, 2003: 24). This definition strictly delimits the meaning of politics, and can be seen as a natural evolution of Dahl’s relational definition of power, according to which ‘A has power over B to the extent that he can get B to do something that B would not otherwise do’ (Dahl, 1957: 202–203). Through this conceptualization, politics appears to be everywhere, not only in political institutions and behavior, but also, for example, in private undertakings, thus leading political scientists to also study the politics of ‘private clubs, business firms, trade unions, religious organizations, civic groups, primitive tribes, clans, perhaps even families’ (Dahl, 1970: 6). 
Dahl’s main contribution to the theory of government and democracy can be considered to be based on an attempt to analyze how the battle for power (the power game) is organized and works in democratic systems. From this point of view, Dahl can be
considered the first political scientist to have clearly ‘solved’ the question of the policy–politics dyad by simply defining the political phenomenon without making any significant reference to the policy dimension, and this has become the core definition of mainstream political science (Klingemann and Goodin, 1996; Goodin, 2011).1 Giovanni Sartori has tried to find a way of isolating the nature of the political phenomenon, thus avoiding the risk, intrinsic in Dahl’s approach, of basing it on an all-inclusive definition of influence. He clearly points out that what matters is the ‘verticalized’ dimension of politics and proposes that it be defined in terms of the specific arena of politics. Thus, politics is defined as ‘the arena of collectivized, sovereign, sanctionable, inescapable decisions’ (Sartori, 1987: 257). Politics is clearly defined in a very specific way. Its constitutive dimension is power, but ‘power’ here means its exercise in a very specific arena and with a specific output. Is that all? The answer is no, since to better clarify the specific nature of politics as collectivized decisions, Sartori (1987: 258) specifies that
It is obvious that political decisions may deal with very different matters: they may concern economic policy, legislative policy, social policy, religious policy, education policy, etc. If all these kinds of decision are in principle political, is because that they are collectivized sovereign decisions taken by people situated in a political arena.
Thus, the connection with policy is also clearly present in Sartori’s thought: even when defined in a very restricted manner, some reference to policy is called for, meaning that specific basis on which collectivized decisions are taken. However, within the context of this dispute, Lasswell (1936) adopts a relevant position. His definition of politics as ‘who gets what, when and how’ has been criticized for its ‘distributional’ meaning; and years later, when he had refocused his scientific interests, he explained that the study of politics is not just ‘the study of influence and the influential’ (Lasswell, 1936: 3) because politics also
involves political actors ‘striving … for the attainment of various values for which power is a necessary (and perhaps also sufficient) condition’ (Lasswell and Kaplan, 1950: 240). Lasswell also believed that politics was an activity where goals matter as much as valued outcomes, and it was precisely this focus on valued outcomes that led Lasswell to take more interest in the analysis of policy-making, and above all in launching the policy sciences. In doing so, Lasswell refocused his analysis of the political phenomenon by striving for a policy approach – that is, a way of analyzing politics where the focus ‘is upon the fundamental problems of man in society, rather than on the topical issues of the moment’ (Lasswell, 1951: 3). By doing so, Lasswell pushed the politics–policy dyad towards a clearer division. Substantially, in Lasswell, there is a clear definition of the second side of the political phenomenon. On the one hand, he focuses on power/influence (and thus politics), while on the other, his focus is on public problems (policy). It is rather peculiar to see a highly reputed politics scholar becoming the (re-)founder of a policy-oriented approach by redefining the object of his research and, in some way, by clearly emphasizing the ‘policy’ dimension of politics. The importance Lasswell placed on the problem-centered nature of the political phenomenon dramatically shifted attention from its definition as power ‘plus something’ to policy, thus completely overturning Dahl’s perspective. On the whole, political science’s major thinkers failed to agree clearly on a definition of the nature of politics. Basically, the focus on power has survived and continues to dominate the debate, but it has never resolved the question altogether, since the other aspect of politics (what is done through/thanks to the exercise of power) is always present, at least latently. 
In order to better understand the problem of the politics/policy dyad, the key point here is to decide whether the definition of power as a fundamental element of politics is an absolute concept, meaning that what matters is to have power ‘over’ something – that is, the capacity to be influential (according to Dahl) – or a more processual definition whereby the exercise of power is the first step in a more complex process, a step where the only thing that is decided is ‘who gets what’ (Lasswell and Kaplan, 1950; Lindblom, 1965) while the ‘how’ dimension remains marginalized. From this second perspective, the most important dimension of power is not power ‘over’ but power ‘of’ (doing, having, saying, etc.). Power ‘over’ is related to something that is an intrinsic part of politics – that is, doing things designed to maintain the social order by resolving collective problems. This is succinctly expressed by Hugh Heclo (1974: 305–306) when he points out that:
Politics finds its sources not only in power but also in uncertainty – men collectively wondering what to do…. Governments not only ‘power’ … they also puzzle. Policy-making is a form of collective puzzlement on society’s behalf… Much political interaction has constituted a process of social learning expressed through policy.
All in all, what this brief overview of the definitional debate regarding the politics–policy dyad shows is that while the evasive nature of the political phenomenon, and above all of power, is important when distinguishing it from all other spheres of collective human behavior (such as society and the economy), it alone is not enough to explain such a distinction. Politics is not only about power but is also about the ideas and knowledge of actors pursuing their own interests and also their own values through a complex process of interaction designed to deal with collective uncertainty.
Different words, different meanings, different disciplines?
Separating the Meanings of Politics and Policy
We know perfectly well that in mainstream political science the ‘power-centered’
definition has prevailed to date, and that this accounts for the analytical focus on all of those processes in which what matters is the different dimensions of power. Those processes are: the pursuit and maintenance of power, and the process of its legitimation. This choice has been made, notwithstanding the major thinkers’ doubts and differences. Said choice accounts for political research’s specific focus on:
1 electoral behavior (a fundamental step in obtaining and legitimizing power within a democratic system);
2 political parties (considered to be the best organizational form for obtaining power or influencing it);
3 political institutions (the tools and goals of the power battle);
4 political élites (the key actors in the decision-making process).
Political research has thus focused on politics as a competition for power rather than as an activity linked to the solution of collective problems. According to this power-centered mainstream, policies are a product/output of the political process conceived precisely as the battle for power/influence. However, a policy-centered school of thought has been slowly but inexorably gaining ground in American Political Science thanks, among other factors, to the influence that Dewey and his pragmatism had on its development. ‘The public and its problems’ (Dewey, 1927) represented a pivotal driver of political research, which even before the Second World War had focused on the policy-making process and its characteristics. See, for example, the concept of the whirlpool proposed by Griffith in 1939, designed to grasp the complexity of ‘social interests and problems’ (Griffith, 1939: 183). Then, there was Easton’s (1953: 129–130) important focus on the complexity of policies:
the essence of a policy lies in the fact that through it certain things are denied to some people and made accessible to others. A policy, in other
words, whether for a society, for a narrow association, or for any other group, consists of a web of decisions and actions that allocates values. A decision alone is of course not a policy; to decide what to do does not mean that the thing is done. A decision is only a selection among alternatives that expresses the intention of the person or group making the choice. Arriving at a decision is the formal phase of establishing a policy; it is not the whole policy in relation to a particular problem. A legislature can decide to punish monopolists; this is the intention. But an administrator can destroy or reformulate the decision by failing either to discover offenders or to prosecute them vigorously. The failure is as much a part of the policy with regard to monopoly as the formal law. When we act to implement a decision, therefore, we enter the second or effective phase of a policy. In this phase the decision is expressed or interpreted in a series of actions and narrower decisions which may in effect establish new policy.
This long quote from Easton helps us to understand that while politics represents the allocation of values, these allocated values are realized and implemented through the policy-making process and, consequently, policy-making can be considered to be the very essence of it. This could also be said of the power-centered definition of politics. If politics is about power or influence, where are the characteristics of such power or influence revealed? Almost certainly in typical political and institutional behavior, but, above all, in policy-making processes. Thus, even if politics were considered simply as a question of power, where would it be visible if not in policy-making? Policies are not simply the product of politics, but are also, significantly, the real world where politics plays out. Policies cannot be considered simply as an output, but also need to be understood as a process, and also as an independent variable, or as capable of endogenous change (Lowi, 1964; Wildavsky, 1979). Thus, policies could be considered as politics in action. The Lasswellian redefinition of politics – in terms of policy – has not only paved the way for the policy approach in political science, but, above all else, it has led to the need to define policy as such. This need represents
the other aspect of the problem of defining politics and has tended to suffer from the same problems. The lack of any self-evident meaning (Heclo, 1972) has resulted in different definitions of what policy is, and the dual nature of the political phenomenon has impacted the definition of policy. For example, two very famous, commonly adopted definitions of policy are based on the specific relevance of power and/or authority. This is the case of Dye’s (1972: 1) definition of policy as ‘whatever governments choose to do or not to do’, which echoes the Sartorian definition of politics in terms of the compression of policy to highly verticalized decisions, and it is also the case of Lowi’s (1970: 315) assumption that ‘policy is deliberate coercion – statements attempting to set forth the purpose, the means, the subjects, and the objects of coercion –’. Once again, the vertical dimension of the political phenomenon is also applied to policy, and the overlap with the Sartorian definition is quite evident. In other words, in defining policy, both Dye and Lowi see the same vertical dimension through which political power is exercised while deciding on behalf of the collectivity. Thus, paradoxically, the Lasswellian focus on problems is disregarded and the dual dimension of the political emerges once again. Moreover, those adopting the problem-centered perspective have focused their attention on certain characteristics such as: the presence of real actors, and not necessarily public or governmental ones; the presence of certain common/collective values or goals; and the presence of a processual dimension. From this perspective, policies are processes or courses of actions through which different actors interact with one another in order to resolve collective problems or problems perceived to be collective (Friedrich, 1963; Ranney, 1968; Anderson, 1975; Dunn, 1981). 
Thanks to these definitions, the political phenomenon is clearly otherwise defined as an on-going process where policy-makers operate in order to address and resolve socially important problems.
Different Words, Different Analytical Approaches
When the definition of policy is power/authority-centered, it engenders misunderstanding and rivalry between scholars of politics and scholars of policy. When problem-centered, the concept of policy is apparently purified of the ‘political’ power-centered dimension. Definitions of policy like those offered by Dye and Lowi, in maintaining the connection with the ‘political’ dimension, have engendered rivalry among scholars of politics/policy. In this regard, the often-misunderstood theoretical proposal of Theodore Lowi is very interesting. Its apparently provocative assumption that ‘policy determines politics’ is well-rooted in the enduring debate over pluralism and elitism, and thus over different theories of power, postulating that governments and states’ actions (policies) are epiphenomena of the battle for power. By rediscovering the State, Lowi (1964: 688) postulates exactly the opposite, starting from the exercise of power by governments and by the way they address people’s expectations:
In politics, expectations are determined by governmental outputs or policies. Therefore, a political relationship is determined by the type of policy at stake, so that for every type of policy there is likely to be a distinctive type of political relationship. If power is defined as a share in the making of policy, or authoritative allocations, then the political relationship in question is a power relationship or, over time, a power structure.
Thus, policies design specific arenas of power, with specific necessary political games and resources at the disposal of the concerned political actors. Lowi’s argument has led to conflicts between political scientists and policy scholars about the primacy of politics and policy, while Lowi’s framework simply underlines how policy can structure the characteristics of political dynamics by encouraging/discouraging conflict, bargaining and logrolling. The problem-centered definition originally led to a focus on policy science for
policy-making, in which the scientific mission is based on the normative assumption that the best policy solution is to be found based on the context (Nagel, 1983). This stream of research substantially abandoned the ‘political’ dimension of policy and attracted the interest, and became the focal point, of welfare economics and public management scholars. What Lasswell defined as policy science for policy-making has become ‘policy analysis’, and has essentially forsaken any political dimension. Unsurprisingly, the policy approach has been defined as an old public administration in a refurbished wardrobe (Eulau, 1958; Schick, 1975). However, it should be said that by placing collective problems at the center of scientific inquiry, a real policy science of policy-making has also emerged (particularly from the 1970s onwards). This research stream (see also below) has been flourishing and has developed a number of theoretical frameworks and significant empirical research focused on trying to understand and explain ‘who gets what, when and how’. This stream of scholarship may be called that of ‘Public Policy’, and it was developed by analyzing political phenomena from a policy perspective, thus considering policy not simply as an output of the battle for power, but as a complex process allocating values in society through the interaction of different policy actors, their interests and their ideas. This perspective has conflicted with mainstream political science due to its different definition of political phenomena as driven by a different research design. For example, on the one hand, political science usually defines policies simply as the by-product of one of the following: elections; interest group politics; the ruling coalition; the ruling élites; ideological politics. 
However, on the other hand, starting with the problem in question, Public Policy assumes a ‘realistic’ perspective regarding who really influences policy-making, how policies develop from one stage to the next, and, above all, ‘how’ policies are formulated and implemented,
thus overcoming the structural limitations of mainstream political science, which usually focuses on policy decisions. In concentrating on policy dynamics, public policy does not exclude the relevance of power/influence but considers it as one of the drivers of the policy-making process, which interacts with other drivers over the course of time.
The Evolution of the Academic Discipline
Distinguishing between the terms in question has encouraged the emergence of a number of different sub-disciplines, or, rather, the potential separation of mainstream political science from the different streams of the policy approach. Policy scholars have been concentrating on the characteristics of policy dynamics and change by designing specific theoretical approaches like the Advocacy Coalition Framework (Weible and Sabatier, 2017); the Multiple Stream Approach (Kingdon, 1984); the punctuated equilibrium theory (Baumgartner and Jones, 2009); the characteristics of policy design and of the chosen policy instruments (Howlett, 2011; Capano and Lippi, 2017; Peters, 2018); the relevance of ideas and knowledge as drivers of policy dynamics (Béland, 2009); the role of specific key actors – entrepreneurs, brokers, leaders – (Capano and Galanti, 2018); and the evolution of governance arrangements (Pierre and Peters, 2000; Capano et al., 2015a). At the same time, according to an authoritative review on the state of Political Science (Katznelson and Milner, 2003), mainstream political science has continued to focus on the State (now in a globalized era), political institutions, citizenship, political participation and behavior, international institutions, ideology, populism and the state of democracy. It is no coincidence that in this book, published to celebrate the 100 years of the American Political Science Association, there is not one single chapter devoted to public policy (although a
number of public policy scholars are cited in various chapters). This further indicates the division between the academic areas of mainstream political science and public policy, at least in the United States. Obviously, this may depend on the fact that Public Policy (meaning the study of policy-making and the components thereof) is currently developed to a greater extent outside of the United States than within (where the prevalence of rational choice and the strength of economic-oriented policy analysis have definitively restricted the growth of public policy, at least relatively). However, this is also confirmed by the main journals published in the Political Science and Public Administration fields: here a clear divide persists, even in general journals.2 This could be due not only to the definitional divide, but also to a methodological asymmetry, since while political science has become increasingly quantitative and oriented towards large-n research design, public policy right now is mostly devoted to case studies or small-n research design – although the adoption of QCA has permitted an increase in the number of cases dealt with. Overall, the distinction made between the two terms has implied not only disciplinary specialization, but also the institutionalization of (sub-disciplinary) borders. This institutionalization has been characterized by certain virtues (deeper knowledge of political phenomena as well as the concrete operationalization of such), as well as certain vices, such as: (1) the risk of an increasing separation between political factors (elections, political institutions, ideologies, party system, etc.) 
and policy factors (stages, network of actors, problems, ideational solutions, knowledge, policy legacies, etc.); and (2) the depoliticization of policy (the battle for power risks remaining hidden or marginalized) and the de-problematization of politics (the risk of the focus on social problems becoming irrelevant, in favor of a simplified partisan perspective on policies). These vices could exceed the virtues. For example, if we think of the rise of populism
in Europe, it is clearly based on the diachronic intersection of policy and politics. However, at present, the main interpretations of this rise are based on political factors (and on economic ones), whereas an analysis that also takes certain policy dimensions (not only policy performance but also the way in which policies have been framed and implemented over time) into account could help gain a better understanding of the roots (and potential implications) of the phenomenon. Another typical example regards the explanation of political/policy change, where both sides tend to develop theoretical frameworks that reciprocally underrate one another. Thus, the specialized divide between political science and public policy not only weakens the explanatory capacity of the ‘discipline’, but also the public’s perception of its social usefulness.
Reconnecting policy and politics. The state of the empirical study of political phenomena
As I have tried to explain above, the story of the emergence of the politics–policy dyad is a complex one, a kind of forced marriage, due to the dual nature of political phenomena. However, over the last three decades, there has been a process of progressive institutionalization of a ‘forced’ divide. This notwithstanding, there have also been recent signs of a re-convergence, if not of a new possible ‘marriage’ of the two. This is specifically due to the increasingly common use of certain theoretical approaches, to the focus on selected common topics of research and to the increased willingness of the policy approach to consider the politics, or the political aspects, of policy-making. As regards the aforesaid theoretical approaches, it is quite clear that the diffusion of historical neo-institutionalism – which is widely adopted, in particular by those
working on political change and on policy change – has led to a certain rapprochement between mainstream Political Science and Public Policy. More specifically, the focus on institutions that tends to characterize both political science and public policy has fostered a shared awareness of the important role played by the process of institutionalization and institutional variables in the analysis of both political and policy processes. Thus, for example, historical neo-institutionalism has been adopted as an approach to the study of regime change (Mahoney, 2001; Mahoney and Thelen, 2010) and of paradigm shifts in public policy (Capano, 2003; Hogan and Howlett, 2015). Furthermore, it is evident how the ideational tide-change in political science has led to a crossing of boundaries by reconsidering political power in the public policy field (Béland et al., 2016) or by connecting rational choice theory to contingentist approaches when explaining political and policy actors’ preferences (Katznelson and Weingast, 2005). With regard to shared topics of interest, what first emerges is the attempt to explain change, which has pushed political scientists towards the consideration of both political and policy factors. For example, in public policy, certain political variables are of pivotal importance to the multiple stream approach (Kingdon, 1984) as well as to the advocacy coalition framework (where a change in the ruling coalition dramatically impacts how policy dynamics and change are addressed). At the same time, policy variables (like policy actors’ freedom of action, policy legacy, and policy instruments) are taken into consideration when analyzing political-institutional change in welfare policy (Trampusch, 2010; Gingrich and Häusermann, 2015; Lee et al., 2017). 
Governance is another topic over which the focus of political science and that of public policy have been overlapping, and where a common interest has emerged, but without resulting in any significant reciprocal influence of the one on the other. From a political science point of view, however, the focus has
been more on the ‘quality’ of governance, which is often operationalized by emphasizing the quality of government (Rothstein and Teorell, 2008; Rothstein, 2011; Holmberg and Rothstein, 2012; Fukuyama, 2013) or, in a broader sense, the quality of democracy (Diamond and Morlino, 2005). It is interesting here how the focus of public policy has been quite different not only in terms of the definition of the word governance, but also in terms of the analytical perspective adopted. In fact, there has been more in-depth reflection on the definition of governance in public policy due to the implications of the characteristics of governance arrangements for the structuring of policy-making (Capano et al., 2015b). Furthermore, policy scholars have focused their attention on the dynamics and nature of governance shifts (in terms of the changes in the adopted policy instruments) rather than on the improved performance and policy effectiveness of those changes. There has been greater overlapping with political science’s focus on better governance and the quality of democracy on the part of the recently reformulated policy design perspective in public policy studies (Howlett, 2014). This new perspective takes into consideration not only the characteristics traditionally considered relevant for good policy design (the possession of technical expertise and good knowledge), but also the role of governmental will or political capacity (Howlett et al., 2015; Capano, 2018). Furthermore, political science studies of governmental capacity increasingly overlap with the emerging literature on policy capacity (Wu et al., 2018). Thus, the issue of how to design effective political institutions and public policy could in fact be a common research topic on which the two sides could converge in a highly profitable way. Another important topic of common interest is that of agenda setting, which represents a core research topic not only for policy scholars but also for political scientists. 
In this case, political scientists are more interested in the role of political parties, political
institutions and the media with regard to agenda setting, while policy scholars are more interested in the framing of the problem, and the way in which it is channeled towards the decisional agenda. In this case, the role of specific actors (policy entrepreneurs) and contingencies (focusing events) are particularly emphasized. Finally, some of the most commonly adopted theoretical frameworks in the public policy field pay due attention to the political variables concerned. This is true, for example, of the Multiple Stream Approach, in which the role of politics is strategic to agenda setting and decision-making; of the Advocacy Coalition Framework (ACF), as previously mentioned, since a change in government is considered one of the drivers of significant policy change; and of the punctuated equilibrium theory, where concepts that are also very important for political science, such as institutional venue, attention cycle and asymmetric information (Baumgartner and Jones, 2009; Jones and Baumgartner, 2012) are fundamental to an understanding of policy change and stability. Then, there is the aforementioned attention to political variables in policy design. In addition, the literature on policy advisory systems pays substantial attention to the political-institutional structure and to the role of politicians in using and addressing policy advisory systems (Craft, 2015). Overall, public policy has progressively abandoned, from the theoretical point of view, its original separation from political science by progressively including political dimensions and factors within its scope. This represents an asymmetric situation, since political science does not seem to be doing likewise to the same degree or with the same conviction.
A common future?
The politics/policy dyad continues to be present. The two terms represent the two
intrinsically different dimensions of political phenomena. The terminological distinctions can help not only to separate the semantic fields in question, but also to circumscribe certain specific theoretical and empirical fields for political science scholars. At the same time, however, the two dimensions are structurally intertwined, and even if one of them is analytically compressed or marginalized, it is nevertheless always present. The terminological distinction should be treated as an analytical tool, according to a precise and specific research design, and not as a kind of epistemological or scientific divide. Focusing on the ‘policy’ dimension of political phenomena does not mean abandoning the realistic tradition of mainstream political science. On the contrary, it means adopting a similarly realistic perspective, since power is not marginalized but is instead considered from a dynamic perspective whereby it is not only a scarce resource, but is also something that can be reproduced as a result of the diachronic interaction of policy-makers. Furthermore, power is observed not only in relation to actors who pursue their own interests, but also in terms of their need to legitimize their actions by showing that they are closely involved in working towards solutions to collective problems. Furthermore, the focus on the policy dimension of political phenomena allows political science to bridge the cognitive gap separating it from the ‘black box’ of decisional processes. This is because the focus on the policy dimension allows politics to be studied in action. At the same time, the focus on politics permits a better understanding of the battle of values and interests that design the premises of the actions of political actors as well as of those actors wishing to influence politics. The complex interaction of politics and policy is never-ending, and thus any semantic separation, made for analytical reasons, can only be partial. 
In fact, if the analytical separation is not designed to better formulate theoretical frameworks or empirical research,
but is instead pursued for purely hegemonic reasons, then there will be a strongly negative trade-off both in scientific terms and in terms of the social perception of the relevance of political science. From the scientific point of view, the divide separating policy and politics could lead to increasing specialization, and, above all, to missing aspects of what should be explained. For example, the emphasis on politics excessively limits the possible answers to the ‘what’ questions posed by Lasswell with regard to power and influence, while missing the policy side of the ‘what’ (what is delivered to people? what is the outcome for the collectivity?). On the other hand, the emphasis on policy too often tries to answer the ‘what’ question by focusing on output and outcome, while neglecting the power dimension. Furthermore, the ‘how’ question is answered by those who focus on politics by analyzing structural and political-institutional factors, while policy scholars focus on the process in which the problem is framed, knowledge is constructed and used, network relationships develop, and policy capacities are engaged. The aforesaid divide produces in-depth, yet marred, scientific knowledge. From the point of view of the social perception of the relevance of the discipline, the divide is in danger of further weakening a discipline that is undervalued, not only due to its intrinsic characteristics (Stoker, 2010), but also because the policy/politics divide weakens the capacity of the discipline to present a solid, compact, unified image to the outside world (also, in terms of the scientific core, shared theories and shared assumptions are fundamental if both advocacy and the recommended prescriptions are to be reliable). It is probably time to re-think, in a more modern vein, Lasswell’s call for a new way of studying politics and to take Heclo’s suggestion seriously. 
If politics is not only the exercise of power but also the questioning of that power, then it can finally reconnect these two dimensions.
Notes
1 It should be pointed out that O’Key (1942: 2) defined politics in terms of power, but only in terms of the relationship between governors and governed: ‘[T]he essence of politics lies in power…of relationships of super-ordination, or dominance and submission, of the governors and the governed’.
2 There is the exception of the research deriving from the Comparative Agenda Project, which, due to its intrinsic characteristics, gets published in both types of journal.
References
Anderson, J. (1975) Public Policy Making. New York: Praeger.
Aristotle. Politics. Translated by Benjamin Jowett. http://classics.mit.edu/Aristotle/politics.html Accessed on 23 December 2018.
Baakman, N. (1997) The Legacy of Machiavelli and Policy Analysis, The European Legacy: Toward New Paradigms, 2(2), pp. 252–257.
Baumgartner, F.R., and Jones, B.D. (2009) Agendas and Instability in American Politics. 2nd ed., Chicago: University of Chicago Press.
Béland, D. (2009) Ideas, Institutions, and Policy Change, Journal of European Public Policy, 16(5), pp. 701–718.
Béland, D., Carstensen, M.B., and Seabrooke, L. (eds.) (2016) Ideas, Political Power, and Public Policy. Abingdon: Routledge.
Capano, G. (2003) Administrative Traditions and Policy Change: When Policy Paradigms Matter. The Case of Italian Administrative Reform During the 1990s, Public Administration, 81(4), pp. 781–801.
Capano, G. (2018) Policy Design Spaces in Reforming Governance in Higher Education: The Dynamics in Italy and the Netherlands, Higher Education, 75(5), pp. 675–694.
Capano, G., and Galanti, M.T. (2018) Policy Dynamics and Types of Agency: From Individual to Collective Patterns of Action, European Policy Analysis, 4(1), pp. 23–47.
Capano, G., Howlett, M., and Ramesh, M. (eds.) (2015a) Varieties of Governance: Dynamics, Strategies and Capacities. London: Palgrave.
Capano, G., Howlett, M., and Ramesh, M. (2015b) Bringing Governments Back In: Governance and Governing in Comparative Policy Analysis, Journal of Comparative Policy Analysis, 17(4), pp. 311–321.
Capano, G., and Lippi, A. (2017) How Policy Instruments are Chosen: Selecting Between Legitimacy and Instrumentality, Policy Sciences, 50(2), pp. 269–293.
Craft, J. (2015) Conceptualizing the Policy Work of Partisan Advisers, Policy Sciences, 48(2), pp. 135–158.
Dahl, R.A. (1957) The Concept of Power, Behavioral Science, 2(3), pp. 201–215.
Dahl, R.A. (1970) Modern Political Analysis. 2nd ed., Englewood Cliffs: Prentice Hall.
Dahl, R.A., and Stinebrickner, B. (2003) Modern Political Analysis. 6th ed., Upper Saddle River: Prentice Hall.
Dewey, J. (1927) The Public and its Problems. New York: Holt.
Diamond, L.J., and Morlino, L. (2005) Assessing the Quality of Democracy. Baltimore: Johns Hopkins University Press.
Dunn, W.N. (1981) Public Policy Analysis: An Introduction. Englewood Cliffs: Prentice Hall.
Dye, T. (1972) Understanding Public Policy. Englewood Cliffs: Prentice Hall.
Easton, D. (1953) The Political System: An Inquiry into the State of Political Science. New York: Alfred A. Knopf.
Eulau, H. (1958) H. D. Lasswell’s Developmental Analysis, Western Political Quarterly, 11(2), pp. 229–242.
Eulau, H. (1977) The Interventionist Synthesis, American Journal of Political Science, 2(2), pp. 419–423.
Friedrich, C.J. (1963) Man and His Government. New York: McGraw Hill.
Fukuyama, F. (2013) What is Governance, Governance, 26(2), pp. 347–368.
Gingrich, J., and Häusermann, S. (2015) The Decline of the Working-class Vote, the Reconfiguration of the Welfare Support Coalition and Consequences for the Welfare State, Journal of European Social Policy, 25(1), pp. 50–75.
Goodin, R.E. (2011) ‘The State of the Discipline, the Discipline of the State’, in Goodin, Robert E. (ed.), The Oxford Handbook of Political Science. New York: Oxford University Press, pp. 3–57.
Griffith, E.S. (1939) Impasse of Democracy. New York: Harrison-Hilton Books, Inc.
Heclo, H. (1972) Review Article: Policy Analysis, British Journal of Political Science, 2(1), pp. 83–108.
Heclo, H. (1974) Modern Social Politics in Britain and in Sweden. New Haven: Yale University Press.
Heidenheimer, A. (1986) Politics, Policy and Policey as Concepts in English and Continental Languages: An Attempt to Explain Divergences, The Review of Politics, 48(1), pp. 3–30.
Hogan, J., and Howlett, M. (eds.) (2015) Policy Paradigms in Theory and Practice: Discourses, Ideas and Anomalies in Public Policy Dynamics. Basingstoke: Macmillan.
Holmberg, S., and Rothstein, B. (eds.) (2012) Good Government: The Relevance of Political Science. Cheltenham: Edward Elgar.
Howlett, M. (2011) Designing Public Policies: Principles and Instruments. London: Routledge.
Howlett, M. (2014) From the ‘Old’ to the ‘New’ Policy Design: Design Thinking Beyond Markets and Collaborative Governance, Policy Sciences, 47(2), pp. 187–207.
Howlett, M., Mukherjee, I., and Woo, J.J. (2015) From Tools to Toolkits in Policy Design Studies: The New Design Orientation Towards Policy Formulation Research, Policy & Politics, 43(2), pp. 291–311.
Jones, B.D., and Baumgartner, F.R. (2012) From There to Here: Punctuated Equilibrium to the General Punctuation Thesis to a Theory of Government Information Processing, Policy Studies Journal, 40(1), pp. 1–19.
Katznelson, I., and Milner, H. (eds.) (2003) Political Science: The State of the Discipline. New York: W.W. Norton.
Katznelson, I., and Weingast, B.R. (2005) Preferences and Situations. New York: Russell Sage Foundation.
Kingdon, J.W. (1984) Agendas, Alternatives and Public Policies. New York: Harper Collins College Publishers.
Klingemann, H.D., and Goodin, R.E. (eds.) (1996) A New Handbook of Political Science. Oxford: Oxford University Press.
Lasswell, H. (1936) Politics: Who Gets What, When, How. New York: Whittlesey House, McGraw-Hill.
Lasswell, H. (1951) The Policy Orientation, in Lerner, D., and Lasswell, H. (eds.), The Policy Sciences: Recent Developments in Scope and Method. Palo Alto: Stanford University Press, pp. 3–15.
Lasswell, H. (1971) A Preview of The Policy Sciences. New York: Elsevier.
Lasswell, H., and Kaplan, A. (1950) Power and Society. New Haven: Yale University Press.
Lee, S., Jensen, C., Arndt, C., and Wenzelburger, G. (2017) Risky Business? Welfare State Reforms and Government Support in Britain and Denmark, British Journal of Political Science, published online 2 October, doi:10.1017/S0007123417000382.
Lindblom, C.E. (1965) The Intelligence of Democracy. New York: Free Press.
Lowi, T. (1964) American Business, Public Policy, Case-Studies, and Political Theory, World Politics, 16(4), pp. 687–713.
Lowi, T. (1970) Decision Making vs. Policy Making: Toward an Antidote for Technocracy, Public Administration Review, 30(3), pp. 314–325.
Lowi, T. (1992) The State in Political Science: How we Become What we Study, American Political Science Review, 86(1), pp. 1–7.
Mahoney, J. (2001) Path Dependent Explanations of Regime Change: Central America in Comparative Perspective, Studies in Comparative International Development, 36(1), pp. 111–41.
Mahoney, J., and Thelen, K. (eds.) (2010) Explaining Institutional Change: Ambiguity, Agency, and Power. New York: Cambridge University Press.
Mansfield, H. (1981) Machiavelli’s Political Science, The American Political Science Review, 75(2), pp. 293–305.
Nagel, S. (ed.) (1983) Encyclopedia of Policy Studies. New York: Dekker.
O’Key, V. (1942) Politics, Parties and Pressure Groups. New York: Thomas Y. Crowell Company.
Orsini, N. (1946) Policy: Or the Language of Elizabethan Machiavellianism, Journal of the Warburg and Courtauld Institutes, 9(1), pp. 122–134.
Peters, G. (2018) Policy Problems and Policy Design. Cheltenham: Edward Elgar.
Pierre, J., and Peters, B.G. (2000) Governance, Politics and the State. Basingstoke: Macmillan.
Ranney, A. (ed.) (1968) Political Science and Public Policy. Chicago: Markham.
Regonini, G. (1995) Politiche pubbliche e potere, in Regonini, G. (ed.), Politiche pubbliche e democrazia. Napoli: ESI, pp. 21–102.
Rothstein, B. (2011) The Quality of Government: Corruption, Social Trust, and Inequality in International Perspective. Chicago: University of Chicago Press.
Rothstein, B., and Teorell, J. (2008) What is Quality of Government: A Theory of Impartial Political Institutions, Governance, 21(2), pp. 165–190.
Sartori, G. (1973) What is Politics, Political Theory, 1(1), pp. 5–26.
Sartori, G. (1987) The Theory of Democracy Revisited. Chatham: Chatham House.
Schick, A. (1975) The Trauma of Politics: Public Administration in the Sixties, in Mosher, F.C. (ed.), American Public Administration: Past, Present, Future. Tuscaloosa: University of Alabama Press, pp. 142–180.
Stoker, G. (2010) Blockages on the Road to Relevance: Why has Political Science Failed to Deliver?, European Political Science, 9(1), pp. 72–84.
Tholen, B. (2016) Machiavelli’s Lessons for Public Administration, Administrative Theory & Praxis, 38(2), pp. 101–114.
Tilly, C. (ed.) (1975) The Formation of National States in Western Europe. Princeton: Princeton University Press.
Trampusch, C. (2010) Employers, the State and the Politics of Institutional Change: Vocational Education and Training in Austria, Germany and Switzerland, European Journal of Political Research, 49(4), pp. 545–573.
Weber, M. (1919) Politik als Beruf, Gesammelte Politische Schriften. Munich: Duncker & Humblot. Translation: H.H. Gerth and C. Wright Mills, Max Weber: Essays in Sociology, pp. 77–128, New York: Oxford University Press, 1946.
Weible, C., and Sabatier, P. (eds.) (2017) Theories of the Policy Process. Boulder: Westview Press.
Wildavsky, A. (1979) Speaking Truth to Power. Boston: Little Brown.
Wu, X., Howlett, M., and Ramesh, M. (eds.) (2018) Policy Capacity and Governance. London: Palgrave.
65 Policy Evaluation1
Evert Vedung
Definition
Evaluation is the process of determining the merit of an entity and the product of that determination. This statement captures the two universal meanings of evaluation as a phenomenon in all intellectual and practical human endeavors (Scriven, 2013: 178). In the present context, evaluation will be restricted to the scrutiny and judgment of public policies or, in more general wording, public interventions. In the spirit of the eminent Italian political scientist Giovanni Sartori (2009a: 89–91, 84–85; 2009b: 139–140), I will start with a minimal definition of the term evaluation, which is a contested concept, a semantic magnet (Vedung, 1997: 3). To cut a long and convoluted story short, for public governance, I propose the following minimal definition: evaluation is the activity of careful assessment of the merit of content, administration, output, effects, and organization of ongoing or abolished government interventions and the resulting product of this
activity, both of which are intended to play a role in future, practical action situations. Evaluation is assessment (appraisal, judgment) concerned with ascertaining the merit, i.e., the worth or value, of government interventions. The term refers to valuing in two senses: the process of valuing and the product of such valuing. A person may wave a written report in the air and exclaim: ‘this is an evaluation!’. Simultaneously, she may also affirm: ‘the evaluation for this report required 12 months of hard work!’. To qualify as evaluation, an assessment must be carefully applied, i.e., meticulously and systematically executed without necessarily being scientific or scholarly. Evaluation objects are confined to public sector interventions, normally interpreted as adopted policies, programs, program ingredients, initiatives, schemes, projects, services, front-line practices, etc., as well as the organization of such measures. Interventions are judged by their results in society and nature, but their pathways toward results are
also appraised in the sense of how adopted intervention content influences implementation organization and processes all the way through immediate target group responses. In addition, the term interventions is sometimes also interpreted as commissions to create policy proposals. In that case, the primary expected result is new policies, and implementation refers to the creative, often intricate, processes involved in the shaping of these new policies. To simplify, I will illustrate my upcoming reasoning by evaluation of adopted interventions, not by evaluation of commissions to create new interventions. Evaluation is about terminated or ongoing interventions, but not about interventions contemplated and discussed but without any public decision to start a planning process. Finally, at minimum, a study meeting all of the above-mentioned criteria must also be intended to play a role in practical action situations to be an evaluation. It is not necessary for the study to actually play such a role. It is sufficient that it is intended to play it. In other words, a study fulfilling all of the criteria, the intention to play a role included, but which is never put to use, is an evaluation anyway but simply unused. To end, meta-evaluation in both of its senses (systematic summary reviews of findings of several evaluations, evaluative audits of evaluations and evaluation systems) is also evaluation, because evaluations by themselves are also interventions, albeit second-order ones.
System view of public interventions
As might be gleaned from my minimal definition, evaluators tend to view the public sector as a system; a system is a whole in which the component parts are dependent upon each other. In its simplest form, a system consists of input, conversion, output, and feedback from output to conversion and input.
When the simple system model is used in public sector evaluation, more functions are added and the terminology is changed. Input is renamed intervention (between-coming, which I will use as a more covering term than policy). The conversion function is retitled administration. Output means phenomena such as prohibitions, grants, subsidies, taxes, exhortation, moral suasion, services, or goods that leave government bodies and reach target groups. A new outcome phase is tacked onto the output function. Outcome is what happens on the target group side, the actions of the target group individuals included, but also what occurs beyond the immediate target individuals in the chain of potential influence. We may distinguish between immediate, intermediate, and ultimate outcome. Effects are a subgroup of outcome, i.e., the portion of the outcome that is, at least to some minimal extent, generated by the intervention and the administrative activities. Another term for effects is impacts. Results is used as a summarizing term for either outcomes or effects or both. In addition, results sometimes also covers outputs. The term implementation covers administration, output, and the early activities in the effect area that front-level bureaucrats may influence in their encounters with target group individuals. All of this is surrounded by context, indicating the environment within which everything occurs. Figure 65.1 simplifies and summarizes this reasoning.
Figure 65.1 The general system model adapted to public intervention
Source: Adapted from Vedung, 1997: 5 via Vedung, 2006: 398 by Tage Vedung and EV.
Eight questions approach to public policy evaluation
Questions are the fountains of all research, policy evaluation research included. The problems to be attacked in policy evaluation can be presented as eight questions with subquestions, each covering a specific area. I shall call this the eight questions approach to public policy evaluation (Table 65.1).
Table 65.1 Eight questions approach to public policy evaluation
1. Comprehensive purposes. What are the overall aims of launching the evaluation?
2. Organization (evaluator). Who should perform the evaluation and how should the endeavor be structured?
3. Valuing. What merit criteria can and should be applied in assessing the worth of the evaluand, i.e., the public sector intervention to be appraised? By what standards of performance on the merit criteria can and should success/failure be judged? And what are the actual merits and demerits of the intervention?
4. Intervention selection and description. How is the evaluand to be chosen and represented?
5. Implementation. Are there any obstacles or malfunctions in the execution phase, from the formal instigation of the intervention through the earliest target group responses in the immediate outcome area? How can such problems be mitigated?
6. Outcome. What are the possible relevant outcomes – immediate, intermediate, and ultimate, intended, unintended – of the intervention?
7. Effect. To what extent is the outcome impacted by the intervention and its implementation in given contexts? Besides the intervention, what other contingencies and factors (causal forces, mechanisms) contributed to the outcome in given contexts?
8. Utilization. How is the evaluation to be used? How is it actually used? How can use be improved?
Source: Adapted from Vedung, 1997: 93 f. and Vedung, 2006: 398; also Shadish et al., 1991.
The immediately following eight sections will describe the specific areas that the eight questions cover.
Comprehensive Purposes
Evaluation is performed for accountability, development, or basic knowledge. In addition, evaluation is laced with strategic aims to gain time, cover up failures, whitewash, or display fronts of rationality. I shall call this fourth purpose strategy, or, tongue-in-cheek, Potemkin villages and white elephants. The key rationale of accountability evaluation is to find out whether agents have exercised their delegated powers and discharged their duties properly so that principals can judge their work, take remedial action, or grant discharge. Accountability (after-the-fact control) involves two parties, the principal and her/his agent, and three functions: (1) the
principal’s downward delegation of duties; (2) the agent’s upward account-giving; and (3) the principal’s downward assessment. Due to lack of time and competence, the principal cannot carry out everything by herself/himself but has to commission the job to some agent – the agent agrees to carry out the job. After some time, the agent (the accountor) is expected to account for her/his handling of the job commission to the principal (the accountee). On the basis of this, the principal can monitor and evaluate whether the agent has followed the agreement to satisfaction, issue praise and grant discharge, or criticize and take corrective action. The essence is that accountability evaluation is intended to serve the supervising needs of an external overseeing body.
In the development (improvement, promotion) mode, an evaluation is genuinely devoted to the collection of solid facts in order to deliver a valid and reliable information base for policy amelioration. Ideally, no critical principals harboring shady intentions lurk in the background. The comprehensive development purpose is either formative or summative. In formative development, evaluators do not contest goals and underlying problems but concentrate their assessment on means, methods, strategies, and actions to solve given problems and attain given goals. Evaluation aspires to single-loop learning, that is, minor adjustments, while leaving the essential properties of the appraised intervention intact. The point is to guide the fine-tuning of the intervention’s implementation. The fundamental question is: ‘how can implementation be improved?’ Summative development evaluation, on the other hand, involves wholesale intervention reconsideration. The purpose is double-loop learning, in the sense that not only means, methods, and strategies are questioned but also goals and sometimes even the problem that motivated the intervention in the first place. Both accountability and development are worthy aims of evaluation. Several experts, the author of this entry included, hold development as the major one. A grim challenge to all evaluation, development evaluation included, is people’s penchant for game-oriented, self-serving strategic action. In the strategic mode, evaluatee action can take the shape of Potemkin villages. The metaphor stands for devices used by an agent to camouflage his/her dubious performances to his/her principal. The expression refers to the famous 1787 visit by Russia’s Empress Catherine II to Crimea, where Prince Potemkin, Governor General of the Ukraine, built artificial house fronts along her route to give her a false impression that the province under his governorship flourished.
Grigorij Potemkin – the agent – a favorite and former lover of the Empress, who had conquered and governed the province, attempted
to use his information superiority over the Empress – the principal – who, in far-away St. Petersburg, had little chance of informing herself about the problems of the area. The motive lurking behind the attempt to mislead Catherine was his self-interest, to keep her favor, and avoid any negative retaliation. Also in present-day evaluatee behavior, the inclination for window dressing to cover up red tape, sloppy work practices, or low productivity may surface. Evaluation is intimidating to agents. They fear criticism that might hurt their reputations and threaten their jobs. They may act deceitfully toward their principals due to an information asymmetry in their favor. Those social workers (agents) who interact with clients day in and day out know infinitely more about the work actually being done than the managers of the social bureau (principals) who never meet a client. This information asymmetry tilts the playing field in favor of agents’ self-interested behavior. Self-interested behavior is strategic in the sense that people act on the basis of what they believe others will do. Strategic action is grounded in anticipation, i.e., people foresee other people’s future actions and their consequences and act on the basis of what this will mean to their own present action. Action is driven by prescient beliefs about future realities, not by values. In their upward accounts, the agents doctor the reality because they believe that their bosses will put some blame on them otherwise. They also believe that their colleagues in other units, who report to the same boss, will do the same. They trust neither their colleagues nor their bosses. Their disloyal behavior is facilitated by the information asymmetry in their favor. The existence of hidden, strategic purposes behind agents’ sweet accounts of situations indicates the limits of evaluation as a producer of rational information to support decision-making in the public sector.
Evaluation takes place in political settings. Politics is a value-laden activity. Instead of delivering objective facts in their accounts
to their principals, agents may doctor their databases and deliver Potemkin villages and white elephants. Strategic considerations reduce the value of evaluation as a deliverer of unbiased knowledge. In basic knowledge, the third higher-level purpose, evaluation is seen as fundamental research seeking to increase the general understanding of actions and events in the public sector. It augments the amalgamated body of knowledge in some academic field of study. In meta-evaluation or systematic review, typical of the evidence movement in public policy to which I shall return later, the basic knowledge ambition is adamant. It is also the major aim of evaluative investigations initiated and carried out by academics as fundamental research. In evaluations commissioned by public sector agents to be swiftly reported, the basic knowledge purpose is secondary to accountability and development, and best regarded as an accidental by-product in the pursuit of the other two.
Organization (Evaluator)
Who should perform evaluation and how should it be organized? Is self-evaluation a reasonable thing to do? Or should evaluation be carried out by external bodies, detached from and independent of the evaluand and its stakeholders? It might be argued that there is no real choice here. Evaluation should always be performed externally, or at least by people not directly involved in the formation, adoption, implementation, and reception of the evaluand. Internal evaluation usually results in bragging, swaggering, self-advertisement, and cover up; and from this, nobody will learn. This view is inaccurate. Let us first illuminate what self-evaluation stands for. When the operator in her energy agency, the inspector in her municipal environment administration, or the professor in her state university political science department carefully assesses their own practice with a view to furthering their own
professional improvement, they are certainly performing self-evaluation. When the said individuals assess their common activities to promote their organization’s performance, they are also performing self-evaluation, but at a higher level. Yet, self-evaluation cannot solve all appraisal needs. The pervasiveness of self-regarding behavior in organizations indubitably calls for external evaluation. The wisest solution, perhaps, is to choose with respect to the comprehensive purpose of the assessment. When the comprehensive purpose is accountability to some outside principal and objectivity is important, a strong case can be made for external evaluation. Generally, evaluations by some autonomous body carry greater credibility as objective enterprises than do internal evaluations by the agents. A case can also be made for combining internal and external accountability evaluation. In results-oriented management (to which I shall return), subordinate agents are supposed to account for their own results and performance to their higher-level principals. The ideal-type rationale for this must be that agents know much more about the state of program implementation, outputs, and outcomes than principals. Yet since information asymmetry may tempt them to cheat – with impunity, because it is hidden – the final responsibility should be with the principal, i.e., external evaluation. When development is the comprehensive purpose, self-evaluation is commendable. Actually, it may be performed as braided evaluation.
Braided evaluation, a neologism I coined, refers to a model whereby the evaluation is interwoven into the activities under evaluation in the sense that, e.g., the leader of the evaluand is also the leader of the evaluation of the evaluand, operators of the evaluand are also charged with collecting some performance data for the evaluation of the evaluand, and the evaluators are immediately and repeatedly reporting back their evaluative findings to the operators of the evaluand.
If accountability evaluation is deemed necessary in cases like this, it should be carried out by external auditors who are entirely isolated from, and independent of, the development evaluation. The rationale for this quarantine is to prevent strategic self-serving behavior from destroying the promotion purpose of the development evaluation.
Valuing
A key process in evaluation is determining the merit of the public intervention under appraisal. The quandary is: what constitutes a valuable public intervention and how can it be appraised? Valuing is a four-step procedure:
1 Identification of appropriate criteria of merit to be used in the assessment.
2 On the chosen merit criteria, selection of performance standards that constitute success or failure.
3 Ascertaining the actual performance of the evaluand on each criterion and comparing it to each standard.
4 Deciding whether to keep the various judgments strictly apart in the reporting or to integrate them into a single, overall appraisal of the worth of the intervention.
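The four steps can be illustrated with a minimal sketch. Everything in it is a hypothetical simplification introduced for illustration only: the names Criterion and appraise, the numeric scores, the assumption that a higher score is better, and the use of the share of standards met as one possible integrated appraisal. Real merit criteria are rarely reducible to single numbers.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One merit criterion with its standard and the observed performance.

    All fields are illustrative assumptions, not part of the source text.
    """
    name: str        # step 1: the chosen criterion of merit
    standard: float  # step 2: performance level that counts as success
    observed: float  # step 3: actual performance of the evaluand

    def meets_standard(self) -> bool:
        # Step 3: compare actual performance to the standard
        # (assumes higher scores are better).
        return self.observed >= self.standard


def appraise(criteria, integrate=False):
    """Step 4: report judgments separately or as one overall figure."""
    judgments = {c.name: c.meets_standard() for c in criteria}
    if integrate:
        # One simple way to integrate: the share of standards met.
        return sum(judgments.values()) / len(judgments)
    return judgments


# Hypothetical example values.
example = [
    Criterion("goal attainment", standard=0.8, observed=0.9),
    Criterion("client satisfaction", standard=0.7, observed=0.6),
]
print(appraise(example))                  # separate judgments per criterion
print(appraise(example, integrate=True))  # single overall appraisal
```

Keeping the judgments apart preserves the information that the intervention succeeds on one criterion and fails on another; the integrated figure hides that pattern, which is precisely the trade-off raised in the fourth step.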
Valuing can be descriptive or prescriptive. In descriptive valuing, the evaluator chooses the values of others as merit criteria and standards. In prescriptive valuing, the evaluator herself advocates the primacy of particular values, such as justice, equality and diversity, regardless of whether these values are adopted by any decision-making body or held by some stakeholder constituency. Furthermore, criteria may be established before the start of the evaluation (ex ante), during the process of performing the evaluation (ex nunc), or once findings are available (ex post). Aside from adopted general orientations (descriptive/prescriptive, ex ante/ex nunc/ex post), the particular values preferred in each orientation must be justified. Table 65.2 lists some commonly used substance-only criteria and substance/cost criteria. I will present the merit criteria, starting with the five substance-only yardsticks.
Table 65.2 Substance-only and Substance/Cost value criteria
I Substance-only criteria of merit
1 Intervention goals (effectiveness evaluation)
2 Criteria for side effects appraisal
3 Client criteria
4 Professional criteria: self-evaluation/peer review
5 Stakeholder criteria
II Substance/Cost criteria of merit
1 Productivity (substantive output to cost)
2 Efficiency: cost-effectiveness, cost/benefit (substantive outcome to cost)
Source: Adapted from Vedung, 1997: 93 f. and Vedung, 2006: 400.
Intervention Goals as Merit Criteria
Assessment by intervention goals is the classic in evaluation. Early literature constructed evaluation by definition as assessment of interventions against their own goals. Figure 65.2 depicts the goal-attainment evaluation of intervention results.
Figure 65.2 Goal-attainment evaluation focused on goals-results
Source: Own elaboration, Tage Vedung and EV in cooperation.
The fundamental reason in favor of goal-attainment is the argument from representative democracy. In a democracy, power belongs to the demos, the people. Yet, people, at large, do not have the necessary time or competence to participate in hundreds upon hundreds of decisions on, for instance, public support of individuals in social and economic dire straits, or day-to-day care for ailing senior citizens in public sector homes for the elderly. For these reasons, the citizenry must elect political representatives to make the decisions for them. However, representatives in political assemblies do not have time or competence to make all decisions. They must delegate their power to governments to make decisions for them. But governments do not have time or competence, so they, in turn, have to delegate to civil servants and professionals to make decisions, etc. The public sector is made up of long chains of principal–agent relationships. If an agency adopts a program to reach some goals, these stated goals derive their legitimacy from the fact that the agency’s decision-making authority has been delegated to it by the government and that the government, in turn, has received its authority to do so from parliament, and parliament, in turn, from the people. It is a strength of the goal-attainment approach that it recognizes the representative-democracy aspect of public sector goals.
On the other hand, goal-attainment suffers from weaknesses. The haziness argument maintains that policy goals are deficient as virtue criteria due to their obscurity. One difficult problem with policy haziness is produced by so-called goal catalogues. Most large social reforms contain impressive directories of diverse goals. While a single goal may be hailed as the major one, often, it is also maintained that this one must be balanced against all others, including potentially conflicting ones. However, the necessary trade-offs between the several goals are not indicated, which makes it impossible to elicit from such lists of goals one distinct, transparent, expected outcome. Thus, an array of goals of this type is not lucid enough to be usable as value criteria against which to measure intervention successes and failures. Another compelling rebuttal emanates from goal-attainment’s blindness to side effects and, most importantly, unintended side effects. This issue is addressed through side effects evaluation.
Merit Criteria for Side Effects
Mapping and assessing side effects is an obligatory duty of policy evaluation, especially when conducted in the academic world. The side effects approach implies a widening of the subject matter of the goal-attainment model in that searching for results in the target area is supplemented by searching outside of the target area for side effects. From the policymaker’s point of view, a side effect can be defined as at least a partial consequence of the intervention and its implementation, which is not included among the main effects. Main effects are the actual, expected and desired consequences which, at least partly, are produced by the intervention. Side effects can be unanticipated and not considered, or anticipated and considered in calculations preceding decisions to adopt policies. They may be beneficial as well as detrimental. See the section ‘Outcome Effects’ for further details. Moreover, in terms of merit criteria, the side effects model engenders an extension of goal-attainment in that intervention goals stated in advance are supplemented with merit criteria for side effects. If some side effects are unexpected and unforeseen, the criteria and standards for valuing these side effects are also not pre-specified. Evaluators guided by premeditated stated goals will encounter problems when tracing unforeseen spillovers. And if they discover such things, there will be no prestated goals that can serve as merit criteria for judging them. As a result, pre-specified goals are insufficient as instruments of judging unanticipated side effects. Consequently, the evaluator can chart side effects but may leave it to the commissioners and other users of the evaluation’s findings to ascertain their value ex post facto. Figure 65.3 presents a possible scenario describing the structure of the evaluation on the outcome side if the totality of the effects of a government intervention inside and outside its targeted area were to be investigated.
Figure 65.3 Side effects model with specified pigeonholes for side effects
Source: Own elaboration, Tage Vedung and EV in cooperation.
I strongly commend supplementing goal-attainment evaluation with side effects evaluation. Indeed, the major rationale for doing public sector evaluation in the first place is that beyond-implementation outcomes of state actions are not fully predictable and may result in side effects not originally foreseen. It is an important duty of research evaluation to map and assess the worth of these side effects. By-products, whether anticipated or unanticipated, detrimental or beneficial, are crucial factors in every inclusive judgment of the operation of an intervention.
Client-Oriented Merit Criteria
Responsiveness to values espoused or suggested by intervention clients (targets, participants, addressees, customers) has been
proposed as an alternative or supplement to intervention goals and merit criteria for side effects. Client criteria such as desires, wishes, interests and expectations can be weakly, moderately and strongly included in the evaluation. In the weakly included case, participants respond to data-gathering instruments by providing information and judgments but nothing more. Studies of client satisfaction with public sector services belong to this category. Being strongly included connotes that the evaluation is initiated, funded, designed, carried out, and reported by the clients themselves. Let me reason from the moderately included case, wherein the evaluation is (1) commissioned by non-clients but (2) charted to involve service users in important decisions regarding service planning and execution. Intervention clients are asked to select intervention dimensions to be judged and value criteria to be used. For instance, clients may be asked to judge service quality. Is the core service tailored to the demands of the clients? In addition, clients may be asked to judge service process features; do the service employees encounter the clients with respect and correctness? Finally, clients may also be asked to estimate service impacts on themselves or on the client community in general. In such effects-on-clients evaluations, targets
try to determine the relative change in themselves or in the overall client body as a result of their participation in specific treatment modalities. Currently, client-based merit criteria are employed in appraisals of services like child care, nursing homes for the elderly, public housing, mental health, public utilities, parks and recreation, etc., wherein clientele participation is crucial to program operation. Client-based criteria are used to evaluate library services, museums, halls for indoor badminton, bandy, tennis and workouts, trash hauling, street cleaning, snow removal, traffic noise, and urban transit. The approach is a favorite with educators. At universities, students are routinely requested to share their opinions on courses, reading lists, and lectures. They are asked to rate their teachers’ abilities to organize course content, to stimulate debate and discussion, to stir student interest, foster critical thinking, and to show concern and enthusiasm for their students. Occasionally, these evaluations are used to rank faculty and courses with the stated purpose of enabling prospective future students to make better-informed choices. Application of client merit criteria departs from the notion that public administration produces goods and services for customers in the marketplace. In buying a commodity, ideal-type customers pay no attention to producer goals. The value of the good to the consumer is what counts. Conclusion: public services should also be geared to consumer tastes only. Yet, the customer-in-the-market parallel cannot be pushed that far. In representative democracies, public sector targets are also citizens and, in that capacity, principals of last resort. From this citizen perspective, the client notion includes participatory and deliberative aspects, which are absent from the ideal-type customer concept.
The participatory feature suggests that clients/citizens may voice their complaints and desires in interactive, dialogical encounters with the evaluators and service providers and, to some extent, influence and
take responsibility for service content. The deliberative trait engenders a discursive, reasoning, discussing, learning-through-dialogue countenance, which may engender service-influencing and responsibility-taking features but also, as a side effect, perhaps an element of client-education to become better future citizens. Client influence has its limits, though. Since representative democracy is the dominant form of government by the people, decisions on various sectoral policies cannot be left to targets/clients and their service providers. The major power belongs to the demos as a whole, its representatives, and their delegates. Client governance must work within the frames fixed by representative democracy.
Professional Merit Criteria: Self-Evaluation/Peer Review Combinations
Professional or collegial evaluation models imply that members of a profession are entrusted to evaluate their own and their colleagues’ performances with respect to the profession’s own standards of quality. The most celebrated collegial model is the peer review approach, in which the evaluation is conducted by an external collegium, i.e., an assembly of professional equals somewhat more prominent in their area of expertise than the colleagues they are invited to assess. In this way, political scientists evaluate political scientists, lawyers evaluate lawyers, surgeons evaluate surgeons, social workers evaluate social workers, and so on. In evaluation of research and higher education, a combination of self-evaluation and external peer review is used. This model is partly based upon dialogue, discussion, and deliberation. The procedure starts with self-evaluation. The professionals to be evaluated carry out an appraisal of their own performance, their research projects, and their graduate education program. What are the strengths and weaknesses of the content of our research and graduate education? What opportunities
do we have to make it better? What threats are confronting us that may turn it worse? In sum, they perform a SWOT analysis. Then comes the peer review. Renowned external scientists and educators of the particular field are assigned to assess the quality and relevance of the evaluatee’s work content. These peers base their preliminary assessment on the written self-evaluation combined with face-to-face evaluator–evaluatee dialogues during site visits. The evaluators inscribe their preliminary judgments in a draft report, often structured according to the SWOT-scheme. Then, the dialogical process continues in that evaluatees are invited to advance comments before the draft report is finalized. The self-evaluation/peer review approach is interactive. The evaluators meet the evaluatees, ask questions that the evaluatees answer, which spawn new questions and new answers, etc., from which both parties learn. Dialogue, deliberation, and interaction among the evaluatees themselves and between evaluatees and evaluators are crucial. Collegial evaluation is applied in profession-driven areas of the public sector. In these areas, quality criteria are complex, liquid, mutable, and continually changing. Often they are communicated orally only in the relevant professions as tacit knowledge. This tacit knowledge must be brought forward through penetrating dialogues among groups of colleagues with some scientific education and clinical training and experience in the pertinent field. Architects, judges, professors, doctors, veterinarians, engineers, and forest rangers would be cases in point. Hence, it is also considered reasonable to delegate appraisals to the professions. However, since these professionals work in the public sector, financed by taxpayers’ money, collegial evaluation must be regarded as a model on a par with the other models used in public life. Collegial evaluations may produce questionable results.
Studies with matched panels show that peers use different merit criteria and reach divergent conclusions. However, in complex fields, interactive, discursive
collegial evaluation is probably the finest method available to judge the content quality of what is produced.
Stakeholder Criteria of Merit Concerns and issues of stakeholders serve as evaluative merit criteria. Stakeholders might be defined as actors (groups or individuals) who are affected by or have an interest vested in the evaluation or in the evaluand, its implementation, or its outcome effects. Interest may be measured in terms of money, status, power, face, reputation, prestige, esteem, respect, chance, opportunity, promotion, advancement, or other value criteria, and may be large or small, as constructed by the groups in question. This is quite different from using predetermined objectives as merit criteria, as in goal-attainment evaluation. Stakeholder evaluation, however, does resemble the clientoriented model, the major difference being one of scope: while the client-driven model is concerned with one group of affected interests, the stakeholder model is geared to all of them. Figure 65.4 presents an overview of conceivable stakeholders in local social welfare interventions (Hansen and Vedung, 2010). The stakeholder approach starts with evaluator mapping of the major groups who are affected by the intervention or by its evaluation. The evaluator identifies the people who initiated, hammered-out, funded, and adopted the intervention. She pinpoints those who are charged with its implementation: managers, staff, and front-line operators. She singles out the intervention’s primary target group, i.e., its clients and the clients’ associations. She identifies relatives and relatives’ associations. She may also include lay people. As to data gathering, self-observation and sustained dialogical methods are mixed with surveys, questionnaires, and statistics. In-depth interviewing of individual targets, admitting interactive interviewer–interviewee dialogue, is one favored technique. Selfreporting instruments are used that clients,
Figure 65.4 Potential stakeholders in local social welfare interventions Source: Adapted from Vedung, 2013: 395 by Tage Vedung and EV.
parents, relatives, and other stakeholding networks can easily complete. In some cases, stakeholder evaluators endorse focus-group interviewing, which allows for group deliberations among participants and between participants and evaluators. After data are amassed and processed, the reporting of findings, which might vary from one stakeholder to another, will commence. The key word seems to be portrayals, i.e., information-rich characterizations using statistics, pictures, anecdotes, thick descriptions, and quotes.
The use of stakeholder value criteria has several advantages. Fundamental is the argument from expertise. Stakeholders are supposed to contribute evaluative (and substantive) insights that enhance evaluation quality. For political scientists cum evaluators, democratic aspects are crucial, although secondary in relation to expertise. Democratic arguments depart from participative, deliberative, and representative points of view. True, democracy means that citizens in general elections vote for competing elites that are supposed to make decisions on their behalf (representative democracy). Yet, the citizenry should also be able to partake in final public decision-making between elections (participative democracy). Furthermore, discussion,
dialogue, and interactive discourse are also important democratic values because they help people to form and refine their beliefs and preferences (deliberative democracy). Stakeholder models satisfy participative and deliberative values within the frames set by representative democratic bodies of the system, it is often maintained.
Substance/Cost Criteria of Merit
Paying no heed to costs is characteristic of all evaluations using substance-only criteria in their appraisals. To remedy this omission, economists have devised substance/cost criteria, two of which will be touched upon here: productivity and efficiency. Productivity is the ratio of outputs to costs, i.e., outputs:inputs. A study of productivity in municipal libraries can use cost productivity as a measure, i.e., number of books borrowed:costs in euros. In addition, work productivity can be used: number of books borrowed:number of hours worked. In both cases, output is indicated in physical terms – the number of books borrowed. The difference is that costs in the former case are indicated in monetary terms; in the latter case, in number of hours worked,
that is, costs are indicated in physical entities. Costs can be computed in both ways in productivity measurement. Productivity is not an ideal measuring rod for assessing the merit of public sector activities. The public institution may do the wrong things, i.e., the outputs may not produce the desired outcome effects. Therefore, efficiency, as a yardstick for effects, is a better metric. Ideally, efficiency presupposes two things. First, the evaluator must distinguish between gross and net outcomes, the latter taken to be intervention effects in the sense of being produced, at least partly, by the intervention and its implementation and not by something else (see Figure 65.1). Second, the value of these intervention effects must be calculated by using some merit criterion, such as intervention goals. If efficiency is measured in cost–benefit analysis, it can be expressed as the ratio of the monetized value of the intervention effects to the monetized costs, i.e., value of intervention effects (in euros):costs (in euros).
This will end our discourse on the questions concerning valuing, the third area covered in our eight questions approach to public policy evaluation (Table 65.1). It is time to turn to questions concerning areas 4–8 in the list, starting with number 4: Intervention selection and description.
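The productivity and efficiency ratios defined in the section on substance/cost criteria can be made concrete with a small worked example. The sketch below is illustrative only: the library figures, the monetized value of net effects, and all variable names are hypothetical assumptions, not data from the chapter.

```python
# Minimal sketch of the substance/cost ratios; all figures are hypothetical.

def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a non-positive denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator

books_borrowed = 120_000       # output in physical terms (assumed figure)
costs_eur = 300_000.0          # costs in monetary terms (assumed figure)
hours_worked = 8_000.0         # costs in physical terms (assumed figure)

# Cost productivity: books borrowed per euro spent.
cost_productivity = ratio(books_borrowed, costs_eur)

# Work productivity: books borrowed per hour worked.
work_productivity = ratio(books_borrowed, hours_worked)

# Efficiency in cost-benefit form: monetized value of the net
# intervention effects divided by monetized costs (assumed value).
net_effect_value_eur = 450_000.0
efficiency = ratio(net_effect_value_eur, costs_eur)

print(cost_productivity, work_productivity, efficiency)
```

Note that the two productivity figures share a numerator but are not comparable with each other, since one denominator is in euros and the other in hours; only the efficiency ratio is unit-free.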
Intervention Selection and Description
What can be evaluated? The answer is: anything. This article is about public intervention evaluation. A public intervention under evaluation (evaluand for short) might be an assignment to develop a draft intervention proposal. It might be an adopted intervention, such as a policy, a reform, a program, a plank in a program, or a project. It might be adopted administrative strategies, like results-oriented management. It might be a policy instrument or a mix of policy instruments. And it might be practices, like childcare service provision.
Public intervention evaluands vary on many other dimensions as well. Some are broad, others narrow. Some are regular programs that go on for decades, others projects to be finished in a couple of weeks. Some are local in their scope, others intraregional, national, interregional (EU), or global (17 sustainable development goals, SDGs, to be attained in 2030). Yet, however wide or narrow, long-lasting or short-lived, permanent or provisional, they must be described – the fourth item in the eight questions approach. To ensure usability and utilization, these descriptions should not render the evaluands as too idiosyncratic and situation-bound. For reasons of space, I will illustrate the general reasoning with one example only: representing the evaluand in terms of its policy instruments. By definition, public policy instruments are a set of techniques by which public sector authorities wield their power in attempting to effect social change or eliciting support. Apart from organizing, there are three, and only three, basic instruments that governments have recourse to: sticks, carrots, and sermons. Figure 65.5 illustrates this policy instrument triad. Governments can either force us to do what they want (the stick, regulations), reward us or charge us materially to make us do what they want (the carrot, subsidies, taxes, i.e., economic means), or preach to us about what we should do (the sermon, encouragements, warnings, i.e., information). In characterizing interventions in this general, elaborated terminology, evaluations will be thematic, focusing on several interventions and may become more relevant, attended to, and used.
Implementation
The fifth issue in the eight questions approach includes two types of activities focusing on intervention execution. Qualified monitoring checks the various stages of the complete chain of implementation – from formal intervention adoption through outputs and the
Figure 65.5 The policy instruments triad with affirmatives and negatives Source: Adapted from Bemelmans-Videc and Vedung, 1998: 250 by Tage Vedung and EV.
initial target response to outputs – in pursuit of identifying malfunctions and hindrances. The monitor searches for and verifies difficulties, uncovers the mechanisms that give rise to them, and suggests solutions. Is the intervention as delivered to the clients in accordance with the intentions in the formal intervention? Is it reaching all prospective participants? In addition, qualified monitoring also focuses on processes preceding delivery. Qualified monitoring is actually formative process evaluation in that the full course of execution is scrutinized for possible trouble. The point is to examine whether an intervention decision is carried out according to plan in order to correct any mistakes and omissions identified during execution.
Simple monitoring, as opposed to qualified monitoring, entails only the collection of data on some variable, without tying this activity to merit criteria or a search for trouble. True, the data gathered may be used for qualified monitoring and impact assessment, but in itself, simple monitoring is not evaluation.
Outcome Effects
The issue of outcome-effects evaluation concerns intervention impacts on outcomes, whether immediate, intermediate, or ultimate, intended or unintended. Impact assessment
attempts to determine to what extent outcomes are influenced by the intervention or by something else operating besides the intervention. Impact assessment addresses the seventh area – and indirectly also the sixth – in the eight questions approach. Are the outcomes – once the pertinent data are collected and organized, inside or outside the targeted areas – at least indirectly and to some extent triggered and shaped by the intervention and its implementation or are they brought about by something else? Impact assessment, in other words, differentiates between gross outcomes and net outcomes, the latter being caused, at least minimally, by some factor in the intervention and its execution (Figure 65.1). To capture intervention effects, the evaluation guild has fashioned an analytical language. Main effect is defined as an actual substantive outcome impact of the adopted intervention, corresponding to the intended goals embedded into the said intervention. In addition to transpiring in the target area, main effects are, by definition, anticipated as well as positively valued in the intervention. A side effect is defined as an intervention consequence outside of the target area(s) of the intervention. Side effects can be anticipated and unanticipated, negative or positive. Perverse effects are intervention impacts that run counter to the purposes stated in the said intervention. These impacts may occur in the
target area(s) or outside of the target areas and are side effects. Perverse effects are different from null effects. Null effects occur when interventions generate no impacts at all inside or outside their targeted areas. The effects tree in Figure 65.6 shows which aspects of effects might be studied in evaluation research.
A formidable school of evaluation methodologists maintains that research approaches can be rank-ordered with respect to their capacity to provide sound and persuasive evidence on the intervention impact, with randomized controlled, two-group experimentation at the top and qualitative process tracing, executed on one case only without a comparison case, at the bottom. Table 65.3 exhibits a condensed ranking list of this type, often found in textbooks on evaluation. Opponents demur at the rank ordering in Table 65.3, with RCTs as the best approach to causality in policy evaluation and process tracing as the worst; instead they argue that intervention-theory informed, mechanism-seeking explanatory process tracing of one case only, as well as shadow controls and generic, reflexive, and statistical controls, are more applicable and relevant to evaluation in public sector contexts. A majority of political scientists tend to align with the protesters, at least in their practices. The controversy continues.
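The distinction between gross and net outcomes drawn above can be illustrated with simple arithmetic for a two-group design. The sketch below is one possible operationalization under stated assumptions, not the chapter's own procedure: it assumes that the change observed in a control group captures everything operating besides the intervention, and all outcome scores and names are hypothetical.

```python
# Minimal sketch: gross vs. net outcome in a two-group comparison.
# All scores are invented for illustration.
from statistics import mean

def gross_and_net(treated_before, treated_after, control_before, control_after):
    """Return (gross outcome, net outcome).

    gross: mean change observed among targets who received the intervention.
    net:   gross change minus the mean change in the control group, i.e. the
           part not attributed to "something else" (assumption of this sketch).
    """
    gross = mean(treated_after) - mean(treated_before)
    change_without_intervention = mean(control_after) - mean(control_before)
    return gross, gross - change_without_intervention

# Hypothetical outcome scores measured before and after a tryout.
treated_before = [10, 12, 11, 13]
treated_after = [15, 17, 16, 18]
control_before = [10, 11, 12, 13]
control_after = [12, 13, 14, 15]

gross, net = gross_and_net(
    treated_before, treated_after, control_before, control_after
)
print(gross, net)
```

With randomized controls, the assumption that the control group's change stands in for outside influences is backed by the assignment procedure; with matched, generic, or reflexive controls the same subtraction can be performed, but the assumption carries more of the argumentative weight.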
Utilization
Findings from evaluations should be used, period! This was a dogma among pioneering evaluators in the 1960s. Since then an array of distinctions has been conceived to probe more deeply into use and usefulness. Fundamental is the distinction between process use and product use, departing from the aforementioned observation of evaluation as a process-product concept. The final report and written, oral, and visual information efforts around the final report, such as press releases, seminars, and special sections drawn out from the final report and transmitted to particular audiences, are included in product use. The product use category also includes the utilization of findings and recommendations gleaned from draft final reports and from interim oral, written, audiovisual, or electronic reporting during evaluation processes, i.e., before evaluation activities have been brought to an end (Vedung, 2020; Højlund, 2014; Stockmann and Meyer, 2013; Henry and Mark, 2003; Patton, 2012).
Process use covers utilization prompted by activities of announcing, planning, and performing of evaluation. The sheer announcement of an upcoming evaluation as well as broadcasting of information on evaluation subject-matter, questions to be asked, and
Figure 65.6 Main effects, side effects, perverse effects, and null effects
[Effects tree. Effects (intended – unintended; anticipated – unanticipated) branch into effects in the target area(s), namely beneficial (main effects), null effects, and detrimental (perverse effects), and effects outside the target area(s), namely beneficial (side effects), null effects, and detrimental (side effects). Beneficial and detrimental effects are each subdivided into primary, secondary, tertiary...]
Source: Vedung, 2006: 409 and adapted from Vedung, 1997: 54.
Table 65.3 Positivist ranking list of research designs for causal impact
Experiments with Randomized Controls: In a provisional tryout before the permanent intervention is adopted, targets are randomly divided into an experimental group, to whom the considered intervention is administered and a control group – randomized controls – from whom the considered intervention is withheld (classic experiments).
Experiments with Matched Controls: Targets to whom a provisional tryout is given or who have been exposed to the permanent intervention are compared to a theoretically equivalent group, created nonrandomly through matching – matched controls – from which the intervention is withheld or which has been exposed to other intervention(s) (quasi-experiments).
Generic Controls: Effects of a provisional or permanent intervention among targets are compared with established norms about typical changes occurring in the larger population not covered by the intervention.
Reflexive Controls: Targets who receive or have received the provisional or permanent intervention are compared to themselves, as measured before, during, and after the intervention.
Statistical Controls: Participant and nonparticipant targets of the provisional or permanent intervention are compared, statistically holding constant differences between participants and nonparticipants in the intervention.
Shadow Controls: Targets who receive or have received the provisional or permanent intervention are compared to the judgments of experts, program managers, and staff and of the targets themselves on what changes they believe would have happened should there have been no intervention.
Explanatory Process Tracing: The implementation of the permanent intervention in its natural settings is traced step by step from decision to outcome to find out facilitating and constraining factors in the process and its surrounding contexts.
Note. A permanent intervention is a ‘real’ intervention in contrast to a provisional tryout intervention.
Source: Adapted from Vedung, 1997: 170 via Vedung, 2006: 409.
methods to be applied may provoke use. This means that an evaluation may be used well before an evaluation team has been formed and started to work. Moreover, potential users may surface outside of the evaluation team and the range of stakeholders to be contacted by the team. Second, answering distributed survey items, being interviewed by the evaluators, working as a team member, helping to determine and fine-tune the evaluation’s key questions, contributing to data analyses, and aiding in the development of recommendations or action plans are other activities that may stimulate use.
Four broad strategies for enhancing the use of evaluation findings will be briefly touched upon here. Taking evaluative findings as a given, the diffusion-oriented strategy is concerned with making their availability and linguistic costume as user-friendly as possible. Table 65.4 presents a collection of some diffusion-oriented recommendations.
Timely, repeated reporting and linguistic fashioning are not enough. The production-oriented strategy suggests that findings should be made recipient-friendly through
indirect efforts targeted at the evaluation process preceding the discoveries. Evaluators should take user worries seriously. The responsive evaluator should answer the users’ questions, not the questions of academic interest only. Plausible users, with different positions in the chain of implementation and at different hierarchical levels working with interventions in different stages of maturity, have dissimilar information needs. Preferably, evaluators and the likely recipients should frame the questions together before they are left to the evaluators for investigation. Alternatively, prospective recipients and evaluators should work together throughout the entire assessment process. In addition to problem identification, planning for data collection, actual data collection, data processing, conclusion drawing, and report writing, cooperation should spill over into dissemination and utilization as well. Evaluations should be demand-led, not supply-led. The evaluator/recipient consultation approach displays advantages. The chances of providing the right kind of information to recipients will increase. This will enhance the probability
Table 65.4 Diffusion-oriented strategy for improved use of evaluation processes and products
I Draft and interim reporting (before final reporting)
1 Potential evaluation clients and users should be located in advance and then continually;
2 Preliminary findings, insights and recommendations ought to be disseminated to core audiences and others rapidly and continually, before final reporting;
II Primary final reporting: designing and shaping of final report
3 A report should contain a highly visible abstract to enable potential users to decide whether or not to continue reading;
4 A report should contain an executive summary, somewhat longer than an abstract, but still short and sharp; the summary should start with the major substantive findings;
5 Reports should display some startling fact that makes people sit up and think;
6 Substantive findings ought to be presented first, methods afterwards; the major substantive results should be stated in unequivocal terminology prior to reservations, not the other way around;
7 Reasoning on methods should be minimal in the bulk of the report, but appended as attachments;
8 Reports should include recommendations for action and lessons learned;
III Primary final reporting: handling of final report
9 Final reports should be on time;
IV Secondary final reporting
10 Evaluators should make efforts to disseminate their findings after the fact;
11 Evaluators should tell stories and performance anecdotes to illustrate their points;
12 Evaluators should engage in public discourse;
13 Evaluation findings should be disseminated in syntheses.
Source: Adapted from Vedung, 1997: 281 and Vedung, 2006: 414.
that the users will become committed to the findings, which will make them more prone to using the findings or recommending that others do so. Another merit is that learning may occur in the evaluation process, i.e., long before the publication of the final report.
An indirect production-oriented strategy would be meta-evaluation, in the sense of auditing of the evaluation function. Instead of actually carrying out substantive evaluations, senior management should concentrate on auditing the evaluation function in subordinate bureaus. While lower-level branches are instructed to do self-evaluation/peer evaluation of their own performance and summarize the findings in an evaluation essay, higher authorities assume the task of conducting evaluations of their subordinates’ evaluation work based upon the report supplemented with other information. Meta-evaluation, in the second sense of summarizing all kinds of discoveries from several evaluations, is another strategy. When summarized, findings across many assessments may cumulate to a surprising extent. Summaries may seem more useful
to decision-makers than single evaluation efforts and lead to improved utilization.
The third major approach to utilization improvement, the user-oriented strategy, entails making potential evaluation clients more receptive to utilization. Users might be educated in evaluation through evaluation capacity building.
Fourth, and finally, the policy formation process may be adjusted to the demands of evaluation research. This intervention-oriented strategy could be realized through a two-step style of public policymaking: first a provisional, small-scale tryout, accompanied by stringent evaluation, and second, inauguration across the board of the best alternative elicited in the tryout and pointed out through the evaluation of the tryout.
Evaluation History and its Profound Controversies
Policy evaluation, in the prescriptive sense of providing advice to power, was established in pioneering countries in the middle of the
1960s. It was born as a supplement to prescriptive policy analysis ex ante, i.e., providing advice by calculating beforehand the probable outcome effects of various suggested interventions. This left outcome effects of actually adopted interventions, their actual outputs, and actual implementation outside the realm of analysis, a lacuna that policy evaluation was devised to address. Public policy evaluation came to focus on adopted, extant, or abolished interventions and their implementation and empirical outcome effects in society or nature in order to suggest improvements. Policy evaluation can also be descriptive in the sense of just portraying various aspects of actually performed evaluations of policies without aiming at improvement. This article has focused on the prescriptive variant, with the proviso that everything said might be relevant also for descriptive policy evaluation.
Since its inception more than 50 years ago as a prescriptive analysis, evaluation has expanded enormously and become a worldwide movement. Much evaluation is carried out. National evaluation associations have been founded in practically all countries and continents of the world. Particular journals and special conferences are thriving. Yet, evaluation is not a university discipline like political science, economics, psychology, sociology, education, public health, and social work. Evaluation is a transdiscipline, taught and practiced as one field among many in most social science departments, and, importantly, evaluation is controversial, sometimes dramatically so, politically and academically. This is revealed when we take a look at its history.
Four Major Evaluation Waves and their Depositions, 1960–2018
Over the last half-century, policy evaluation as a practice, field of interest, study, and discourse has been in constant flux. Sometimes, one previously very popular form of evaluation has been heavily criticized and lost its
attractiveness, prestige, and support in favor of new forms. However, after some time, the new forms have, in their turn, been viciously attacked and lost their attractiveness and backing in favor of new, contrasting, and seemingly promising approaches (Furubo and Sandahl, 2002; cf. Wollmann, 2003). The whole situation can be likened to ocean waves swelling in and subsiding, swelling in and subsiding. In subsiding, silt from earlier waves has not entirely disappeared but solidified into layers of left-behind sediment. In due time, the extant evaluation landscape in some public sectors in the West has come to consist of layers upon layers of evaluative sediment. To capture these vicissitudes in policy evaluation, I have used the metaphor of waves leaving layers of sediment.² Four major waves of evaluation have swept ashore in the Atlantic world between 1960 and 2018:³
1 scientific wave;
2 dialogical wave;
3 New Public Management wave; and
4 evidence-based wave.
The Scientific Wave
Starting in the early 1960s, evaluation emerged as part of a large stream of ideas purporting to make public policy more scientific. It was maintained that the public sector would perform much better with a proper dose of trustworthy scientific findings about the real results of adopted policies and programs. The coveted scientification was fashioned according to the engineering model. The engineering model implied that public policy decision-making should proceed in two stages. The first, preliminary stage suggested that conceivable means to reach given ends should be rigorously tested in carefully designed, small-scale pilot trials carried out by academic researchers. Acting as distanced, neutral observers and armed with the best available scientific method in the form of randomized two-group experimentation,
researchers should empirically test which means are most effective in attaining the given ends of the pertinent interventions. In a second stage, on the basis of the means findings from these scientific trials, the political system should arrive at decisions on the full-scale introduction of the most effective means to achieve the stated ends. Provided the intervention decision is faithfully submitted by decision-makers to managers and operators who, in their turn, faithfully implement it, the desired outcome will occur. The two last steps of implementation were assumed, not empirically tested in the pilots. The engineering model posited that evaluative findings are used instrumentally. Instrumental use implies that evidence on methods (means, instruments) in evaluation products (final and draft final reports) is accepted as trustworthy by primary users and transformed into binding decisions on the best means to reach the goals in the intervention under scrutiny. The science-driven wave rested upon means–ends rationality, emanating from the thinking of the great German sociologist Max Weber. Given that goals were set by bodies outside of the scientific community and expressly recognized as subjective and falling outside the realm of science, the ability of various means to reach these externally set goals could be empirically and objectively ascertained in experimental settings by scientists. In political science, behavioralism (and positivism) are designations for this train of thought concerning the centrality of observing the means–ends distinction in academic research (Simon, 1976: 37).
The Dialogical Wave
In the early 1970s, faith in methods-driven scientification of government started to languish. Mistrust of experimental evaluation gained momentum. Demands were voiced for more participation by diverse groups and more dialogue and communication in evaluations.
Evaluation should be pluralistic and democratic, not scientific, it was argued. Evaluation should be performed by the Common Man, not the Academic Man. All groupings having some stake in an intervention should be activated (Guba and Lincoln, 1989: 51; Karlsson, 2001, 1996). Although considerably older than evaluation, the dialogue-among-stakeholders idea was incorporated into policy evaluation discourse and practice at about this time, and it has stayed there ever since. Evaluations should be set up with significant stakeholder audiences represented (see the earlier section ‘Stakeholder Criteria of Merit’). They should continue to be a concern for politicians and top-level managers but now as members of larger groups of stakeholders communicating among each other and with the evaluators. The claims, concerns, issues, and goals of the various stakeholders should serve as points of departure for evaluations. Far from being carried out as rigorous scientific two-group experimentation, stakeholder evaluation was conducted by discussion and dialogue among equals, even deliberation avant la lettre (Guba and Lincoln, 1989: 56). Thus, the dialogical wave is an appropriate designation. Actually, for this wave of