Causality and Objectivity in Macroeconomics (Routledge INEM Advances in Economic Methodology) [1 ed.] 036755724X, 9780367557249


English Pages 218 [219] Year 2023


Table of contents :
Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Table of Contents
Preface
1 Introduction
1.1 Preliminary Considerations
1.2 The Agenda
1.3 A Note on Interdisciplinarity and the Philosophical Methods Employed
Part I: Causality
2 What is Macroeconomic Causality?
2.1 Introduction
2.2 Causal Modeling in Macroeconomics
2.3 An Interventionist Account of Macroeconomic Causality
2.4 Macroeconomic Causality as Privileged Parameterization
2.5 The Potential Outcome Approach to Macroeconomic Causality
2.6 An Adequate Account of Macroeconomic Causality
3 The Ontology of Macroeconomic Aggregates
3.1 Introduction
3.2 Reduction, Supervenience, and Emergence
3.3 The Canonical Macroeconomic DSGE Model
3.4 Do Macroeconomic Aggregates Emerge?
3.5 Causality and Manipulability
3.6 Toward a Program of Empirical Microfoundations
4 The In-Principle Inconclusiveness of Causal Evidence in Macroeconomics
4.1 Introduction
4.2 Randomized Controlled Trials
4.3 A Natural Experiment in Macroeconomics
4.4 The General Case
4.5 The Hoover Test
4.6 The AK Test
4.7 Conclusion
5 Causality and Probability
5.1 Introduction
5.2 Suppes on Genuine Causation
5.3 Granger Causality
5.4 Zellner on Causal Laws
5.5 Causal Bayes Nets Theory
5.6 Policy or Prediction?
5.7 Common Effects and Common Causes
Part II: Objectivity
6 Scientific Realism in Macroeconomics
6.1 Introduction
6.2 Newton or Kepler?
6.3 Truth-to-Economy in the “Measurement without Theory” Debate
6.4 Truth-to-Economy in Contemporary Macroeconomic Policy Analysis
6.5 Scientific Realism in Macroeconomics: Given as a Problem
7 The Role of Non-Scientific Values in Macroeconomics
7.1 Introduction
7.2 Structural Objectivity and Causal Modeling in Macroeconomics
7.3 Longino on Values and Empirical Underdetermination
7.4 Simply Walras or Non-Simply Marshall?
7.5 Ideologies, Value Judgments, and Group Interests
8 Macroeconomic Expertise
8.1 Introduction
8.2 Trained Judgment and Expertise in Economics
8.3 Macroeconomic Expertise: How Does It Work?
8.5 Expert Intuition and Scientific Objectivity
9 Macroeconomics in a Democratic Society
9.1 Introduction
9.2 Popper on Scientific Objectivity and the Critical Method
9.3 Myrdal on Scientific Objectivity and Inferences from Value Premises
9.4 Kitcher on the Ideal of Well-Ordered Science
9.5 Well-Ordered Macroeconomics
Index


There is a set of questions at the heart of both philosophy of science and macroeconomics that relate to the issue of causality. What is a causal relationship? How can we learn about it from data? How can we use it for policy analysis? In Causality and Objectivity in Macroeconomics, Tobias Henschen guides the reader in this fascinating but difficult territory with analytical rigour and deep knowledge of both the philosophical debate and macroeconomic practice. The reader may be hurt when learning that macroeconomics with its causal inference tools is fragile but will also find many inspirations for the challenges ahead.
Alessio Moneta, Sant'Anna School of Advanced Studies, Pisa

The Ukraine War? The pandemic? The interruption of supply chains? The massive expansion of central bank balances? The recent surge in inflation has made abundantly clear that our understanding of causation in macroeconomics is wanting. Tobias Henschen's wonderful new book goes a long way towards explaining our predicament and makes valuable suggestions for improvements. No macroeconomic analyst, policy maker or methodologist interested in the foundations of economic policy can afford to miss it.
Julian Reiss, Institute for Philosophy and Scientific Method, Johannes Kepler University Linz

Tobias Henschen has written the go-to work on the crucial issues for causality in macroeconomics raised by the new classical modelers' insistence on microfoundations and the endogeneity of expectations. Anyone relying on DSGE economic models for objective policy guidance needs to read this book.
Alex Rosenberg, R. Taylor Cole Professor of Philosophy, Duke University

Henschen's very interesting book sheds new light on why the effectiveness of macroeconomic policies is unremittingly challenged both within and outside the discipline. The author's well argued answer, to put it abruptly, is that this is the consequence of the missing scientific objectivity of the causal mechanisms supposedly at work.
Andrea Salanti, University of Bergamo

Causality and Objectivity in Macroeconomics

Central banks and other policymaking institutions use causal hypotheses to justify macroeconomic policy decisions to the public and public institutions. These hypotheses say that changes in one macroeconomic aggregate (e.g. aggregate demand) cause changes in other macroeconomic aggregates (e.g. inflation). An important (perhaps the most important) goal of macroeconomists is to provide conclusive evidence in support of these hypotheses. If they cannot provide any conclusive evidence, then policymaking institutions will be unable to use causal hypotheses to justify policy decisions, and the scientific objectivity of macroeconomic policy analysis will be questionable. The book analyzes the accounts of causality that have been or can be proposed to capture the type of causality that underlies macroeconomic policy analysis, the empirical methods of causal inference that contemporary macroeconomists have at their disposal, and the conceptions of scientific objectivity that traditionally play a role in economics. It argues that contemporary macroeconomists cannot provide any conclusive evidence in support of causal hypotheses, and that macroeconomic policy analysis doesn't qualify as scientifically objective in any of the traditional meanings. The book also considers a number of steps that might have to be taken in order for macroeconomic policy analysis to become more objective. It addresses philosophers of science and economics as well as (macro-) economists, econometricians and statisticians who are interested in causality and macro-econometric methods of causal inference and their wider philosophical and social context.

Tobias Henschen is a principal investigator in a research project that is hosted by the University of Cologne, funded by the German Research Foundation (DFG), and devoted to an investigation of the philosophical foundations of complexity economics. Previously, he held temporary positions as full professor of epistemology and philosophy of science at University College Freiburg (2018–2020) and as assistant professor in the Philosophy Department at the University of Konstanz (2013–2018). From 2011 to 2013 he was a postdoctoral researcher at the Hebrew University of Jerusalem, and from 2009 to 2011 a postdoctoral research and teaching fellow at the Philosophy and Economics Departments of the University of Heidelberg. He holds degrees in economics and philosophy: a Licence (or BSc) in economics from the University of Toulouse 1 (2000), an MA in philosophy and economics (2001), and a PhD in philosophy from the University of Heidelberg (2009). He has published a book on Heidegger's philosophy of science and language in 2010 and various articles in the fields of general philosophy of science, philosophy of economics, and the philosophy of Kant.

Routledge INEM Advances in Economic Methodology
Series Editor: Esther-Mirjam Sent, University of Nijmegen, the Netherlands

The field of economic methodology has expanded rapidly during the last few decades. This expansion has occurred in part because of changes within the discipline of economics, in part because of changes in the prevailing philosophical conception of scientific knowledge, and also because of various transformations within the wider society. Research in economic methodology now reflects not only developments in contemporary economic theory, the history of economic thought, and the philosophy of science; but it also reflects developments in science studies, historical epistemology, and social theorizing more generally. The field of economic methodology still includes the search for rules for the proper conduct of economic science, but it also covers a vast array of other subjects and accommodates a variety of different approaches to those subjects. The objective of this series is to provide a forum for the publication of significant works in the growing field of economic methodology. Since the series defines methodology quite broadly, it will publish books on a wide range of different methodological subjects. The series is also open to a variety of different types of works: original research monographs, edited collections, as well as republication of significant earlier contributions to the methodological literature. The International Network for Economic Methodology (INEM) is proud to sponsor this important series of contributions to the methodological literature.

The Positive and the Normative in Economic Thought
Edited by Sina Badiei and Agnès Grivaux

Methodology and History of Economics: Reflections with and without Rules
Edited by Bruce Caldwell, John Davis, Uskali Mäki and Esther-Mirjam Sent

Causality and Objectivity in Macroeconomics
Tobias Henschen

For more information about this series, please visit: www.routledge.com/Routledge-INEM-Advancesin-Economic-Methodology/book-series/SE0630

Causality and Objectivity in Macroeconomics
Tobias Henschen

First published 2024 by Routledge
4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
605 Third Avenue, New York, NY 10158
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2024 Tobias Henschen
The right of Tobias Henschen to be identified as author of this work has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-367-55724-9 (hbk)
ISBN: 978-0-367-55725-6 (pbk)
ISBN: 978-1-003-09488-3 (ebk)
DOI: 10.4324/9781003094883
Typeset in Times New Roman by codeMantra

To Katarina, Martha, and Raphael

Contents

Preface  xv

1 Introduction  1
  1.1 Preliminary Considerations  1
  1.2 The Agenda  8
  1.3 A Note on Interdisciplinarity and the Philosophical Methods Employed  11

PART I: Causality  19

2 What is Macroeconomic Causality?  20
  2.1 Introduction  20
  2.2 Causal Modeling in Macroeconomics  21
  2.3 An Interventionist Account of Macroeconomic Causality  24
  2.4 Macroeconomic Causality as Privileged Parameterization  29
  2.5 The Potential Outcome Approach to Macroeconomic Causality  33
  2.6 An Adequate Account of Macroeconomic Causality  37

3 The Ontology of Macroeconomic Aggregates  41
  3.1 Introduction  41
  3.2 Reduction, Supervenience, and Emergence  43
  3.3 The Canonical Macroeconomic DSGE Model  47
  3.4 Do Macroeconomic Aggregates Emerge?  51
  3.5 Causality and Manipulability  56
  3.6 Toward a Program of Empirical Microfoundations  59

4 The In-Principle Inconclusiveness of Causal Evidence in Macroeconomics  69
  4.1 Introduction  69
  4.2 Randomized Controlled Trials  70
  4.3 A Natural Experiment in Macroeconomics  74
  4.4 The General Case  77
  4.5 The Hoover Test  80
  4.6 The AK Test  85
  4.7 Conclusion  90

5 Causality and Probability  94
  5.1 Introduction  94
  5.2 Suppes on Genuine Causation  95
  5.3 Granger Causality  97
  5.4 Zellner on Causal Laws  99
  5.5 Causal Bayes Nets Theory  100
  5.6 Policy or Prediction?  103
  5.7 Common Effects and Common Causes  105

PART II: Objectivity  111

6 Scientific Realism in Macroeconomics  115
  6.1 Introduction  115
  6.2 Newton or Kepler?  116
  6.3 Truth-to-Economy in the "Measurement without Theory" Debate  122
  6.4 Truth-to-Economy in Contemporary Macroeconomic Policy Analysis  127
  6.5 Scientific Realism in Macroeconomics: Given as a Problem  131

7 The Role of Non-Scientific Values in Macroeconomics  141
  7.1 Introduction  141
  7.2 Structural Objectivity and Causal Modeling in Macroeconomics  143
  7.3 Longino on Values and Empirical Underdetermination  146
  7.4 Simply Walras or Non-Simply Marshall?  151
  7.5 Ideologies, Value Judgments, and Group Interests  155

8 Macroeconomic Expertise  162
  8.1 Introduction  162
  8.2 Trained Judgment and Expertise in Economics  163
  8.3 Macroeconomic Expertise: How Does It Work?  166
  8.5 Expert Intuition and Scientific Objectivity  170

9 Macroeconomics in a Democratic Society  175
  9.1 Introduction  175
  9.2 Popper on Scientific Objectivity and the Critical Method  176
  9.3 Myrdal on Scientific Objectivity and Inferences from Value Premises  180
  9.4 Kitcher on the Ideal of Well-Ordered Science  183
  9.5 Well-Ordered Macroeconomics  186

Index  193

Preface

The book you are reading is the revised version of a manuscript that I submitted as Habilitationsschrift (or postdoctoral thesis) to the Philosophy Department of the University of Konstanz in December 2016. The idea for this book occurred to me about eight years earlier, in October 2008, soon after submitting my doctoral thesis. Lehman Brothers Inc. (the famous American financial services firm) had just filed for bankruptcy. Stock markets had plunged and millions of US citizens had lost their homes. Employment and gross domestic product had fallen and continued to fall in Europe and the US for another year. The blame game in politics was in full swing. Relatively new, however, was the blame game within the economics profession. The Economist Magazine (not exactly a magazine with Marxist leanings) wrote that macroeconomists “misread the economy on the way up, misread it on the way down and now mistake the right way out,” and that they “missed the origins of the crisis; failed to appreciate its worst symptoms; and cannot now agree about the cure” (July 18th 2009 edition). The accusation was that macroeconomists had failed to provide policymakers with the scientific knowledge they needed to recognize the emerging crisis in time and to take measures that could have prevented or at least mitigated the crisis. Although the macroeconomists and econometricians I had studied with had always avoided the concept of causality, I was enough of a philosopher to understand that the scientific knowledge that macroeconomists had failed to provide was causal knowledge. I began to study what little macroeconomists and econometricians have had to say about causality and methods of causal inference. I also devoted attention to the work of macroeconomists who are outside the macroeconomic mainstream (especially to the work of agent-based macroeconomists). And I turned to the work of theorists who before me had tried to make sense of causality and causal inference methods in (macro-) economics (especially to the work of Nancy Cartwright, Kevin Hoover, and Julian Reiss). Reading the work of these theorists has taught me that there are more things in the philosophy of economics than are dreamt of in my economics. In 2013, I was very lucky to be offered a position of assistant professor at the chair of Wolfgang Spohn in Konstanz. There is hardly anyone on earth who knows as much about causality as Wolfgang. The elegance and clarity of his philosophical system have always driven me to sharpen my own philosophical positions and

arguments. Wolfgang later accepted me into his DFG-funded research group for two semesters so that I was exempt from teaching and could visit the Center for Philosophy of Science at the University of Pittsburgh. My second important mentor in Konstanz was Thomas Müller. Like Wolfgang, he looks at scientific practice with the critical eye of the logician, mathematician, or theoretical physicist. Both Wolfgang and Thomas have guided my research through its various stages and repeatedly encouraged it with astute comments and criticism. I am extremely grateful for their intellectual and institutional support.

Besides Wolfgang and Thomas, Julian Reiss and Jan Schnellenbach have been supportive of my research throughout the years. Julian was always available when I sent him emails with questions about his work, and it was always a pleasure to speak to him at conferences or meetings. Jan is one of the few economists in Germany with genuine philosophical and methodological interests. At the time, he was among the first scholars I was able to talk to about the accusations that were being made against macroeconomists. Like Julian, he later read and commented on earlier drafts of most chapters of this book. I am grateful to both of them for their help and many good conversations.

Other scholars who have read and commented on earlier drafts of some of the chapters include Matthew Brown, Anjan Chakravartty, Kevin Hoover, Andreas Hüttemann, Kareem Khalifa, Edouard Machery, Alessio Moneta, and Winfried Pohlmeier. I would like to thank them all for their comments and suggestions. To Anjan, Edouard, Kareem, and Matthew I am also grateful for the friendly and inspiring atmosphere they helped create at the Center for Philosophy of Science in Pittsburgh, while I was visiting (and beyond).

I have benefited from philosophical conversations with many other scholars. My interest in philosophy of economics was first sparked by Malte Faber, who is now Professor Emeritus of Economic Theory at the University of Heidelberg. I spoke with him frequently about the causes of the financial crisis and his book on Marx. I have been fortunate a few times to be able to explain my positions and arguments to Nancy Cartwright. I have rarely experienced someone who can point out the gaps in a philosophical argument as quickly as Nancy. In Pittsburgh, I was able to talk at length with John Norton about causality and causal inference. On my way back to Europe, I stopped in New York to talk with Philip Kitcher about the prospects and the ideal of well-ordered macroeconomics. Back in Konstanz, Christina Zuber (who is a political scientist, but also a clever philosopher) gave me interesting insights into how randomized controlled trials are conducted in political science. When I recently held a position in Freiburg, I was fortunate to get some first-hand insights from Lars Feld into the adventures of economic policy consulting in Germany. I was also able to have some illuminating discussions with Andreas Buchleitner about the problems of economic policy in the face of the complexity of economic systems. Many thanks to all these scholars for the inspiring conversations. Earlier versions of some of the chapters were presented at conferences or department seminars in Durham, Freiburg, Cologne, Pittsburgh, Konstanz, Munich,

Madrid, Tilburg, Exeter, Paris, and Rotterdam. I am grateful to the various audiences for many good questions and critical comments.

Research on this book was financially supported by the German Research Foundation (grant number HE 6845/3-1), the Center for Philosophy of Science at the University of Pittsburgh, the Zukunftskolleg of the University of Konstanz, and the Fritz Thyssen Foundation. I gratefully acknowledge this financial support. The DFG grant is for a research project that is still ongoing. The project allows me to continue my study of the philosophical and economic literature on causality and causal inference. It also allows me to discover the philosophical and scientific literature in a different but related field: that of complex systems. I experience my work on this project as an enormous privilege, but a certain drawback is that I find it increasingly difficult to draw a line. The more I read, the more problems seem to require attention, and the less I feel able to bring the book to a close. I have now decided to draw a somewhat arbitrary line and not to consider any further material. The resulting book inevitably expresses views and positions that will solidify, develop, or perhaps even change.

I thank Andy Humphries, the publisher at Routledge who has supervised the printing of the book from start to finish with a lot of patience and good humor. I'm grateful to Quentin Fietzek for preparing the index. Chapters 2, 4, and 5 of the book are reprinted with few alterations from the following articles:

"What is macroeconomic causality?", Journal of Economic Methodology 25 (2018/1): 1–25.
"The in-principle inconclusiveness of causal evidence in macroeconomics", European Journal for Philosophy of Science 8 (2018): 709–733.
"Causality and probability", in C. Heilmann and J. Reiss (eds.), Handbook for Philosophy of Economics (London: Routledge, 2021).

I dedicate this book to my wife and our two kids.

1 Introduction

1.1 Preliminary Considerations

On 15 January 2009, the European Central Bank's (ECB) Governing Council decided to reduce the interest rate on the main refinancing operations of the Eurosystem by 50 basis points to 2%. That reduction brought the total reduction since October 2008 to 225 basis points. Jean-Claude Trichet, the former president of the ECB, commented on the decision as follows:

Today's decision takes into account that inflationary pressures have continued to diminish, owing in particular to the further weakening in the economic outlook. […] Looking further ahead, […] the euro area should over time reap the full benefit from the effects of policy measures announced over recent weeks (ECB Press Conference 15 May 2009).

Trichet expected the interest rate reduction to fully benefit the Euro area "over time." He didn't specify what he meant by "over time," but let's say that what he had in mind was a period of 2 to 3 years. Trichet also expected the interest rate reduction to lead to an increase in inflation. He expected that increase to be moderate, however, because commodity prices continued to fall at the time. A look at the data suggests that Trichet's expectations were fulfilled: inflation remained under control, and by the end of 2010, GDP regained the level it had lost when Lehman Brothers failed in September 2008.

On 13 September 2012, the Federal Open Market Committee (FOMC) of the US Federal Reserve System (Fed) announced a third round of quantitative easing (or QE3):

To support a stronger economic recovery and to help ensure that inflation, over time, is at the rate most consistent with its dual mandate, the Committee agreed today to increase policy accommodation by purchasing additional agency mortgage-backed securities at a pace of $40 billion per month (Board of Governors of the Federal Reserve System, Press Release 09/13/2012).

By its large-scale purchase of mortgage-backed securities (or quantitative easing) the FOMC hoped to lower the interest yield of the securities, to counteract deflation, and to influence banks' liquidity preferences in a decisive way (to stimulate economic activity by allowing private banks to increase lending). Again, a look at the data suggests that the FOMC's hopes were justified: inflation increased only slightly, GDP continued to grow, and unemployment fell (if only slightly).

On 27 July 2022, the FOMC of the Fed decided to raise the federal funds rate (the interest rate at which depository institutions trade federal funds) by 0.75 percentage points to 2.25%–2.5%. The decision was commented on as follows:

Recent indicators of spending and production have softened. Nonetheless, job gains have been robust in recent months, and the unemployment rate has remained low. Inflation remains elevated, reflecting supply and demand imbalances related to the pandemic, higher food and energy prices, and broader price pressures. Russia's war against Ukraine is causing tremendous human and economic hardship. The war and related events are creating additional upward pressure on inflation and are weighing on global economic activity. The Committee is highly attentive to inflation risks. The Committee seeks to achieve maximum employment and inflation at the rate of 2 percent over the longer run. In support of these goals, the Committee decided to raise the target range for the federal funds rate to 2-1/4 to 2-1/2 percent and anticipates that ongoing increases in the target range will be appropriate (Board of Governors of the Federal Reserve System, Press Release 07/22/2022).

The FOMC decided to raise the federal funds rate by 0.75 percentage points at three subsequent meetings and by 0.5 percentage points at its final meeting in 2022. The federal funds rate now sits at 4.5%, and the FOMC has declared that it is not going to lower the rate before 2024. The FOMC "anticipates that the ongoing increases in the target range will be appropriate": that these increases will decrease the level of inflation to the target level of 2%, and that the increase in unemployment associated with a decline in inflation will be moderate and tolerable because "job gains have been robust in recent months." Will the ongoing increases in the federal funds rate be appropriate? It depends. The level of inflation will probably approximate the 2% target level at some point in the future, while unemployment is sufficiently low. But the question is whether it will be plausible to attribute that outcome to the monetary policy of the FOMC in the second half of 2022.

The ECB's decision to reduce the interest rate and the Fed's decisions to purchase mortgage-backed securities and to increase the federal funds rate represent macroeconomic policy decisions, as taken by central banks and other policymaking institutions on a regular basis. European and US law determines the key macroeconomic objectives of the ECB and the Fed: price stability in the case of the ECB and the "dual mandate" of price stability and maximum employment in the case of the Fed. European and US law also structures the ECB and the Fed in a way that is meant to ensure that their decisions do not become subject to political pressures that could compromise the achievement of the key macroeconomic objectives. But like other policymaking institutions, the ECB and the Fed are ultimately accountable to the general public and public institutions (such as the European Parliament, the European Commission, and the US Congress). And a question of considerable interest is how policymaking institutions can account for macroeconomic policy decisions.

The wording of press releases or transcripts of press conferences suggests that policymaking institutions account for macroeconomic policy decisions by citing some kind of research or (empirical) evidence that supports the claim that the respective decision is conducive to the respective macroeconomic objective. Consider, for instance, the following passage from a statement that Ben Bernanke, the former chairman of the Federal Open Market Committee, made when asked about the efficiency of QE3:

I've spent a lot of time, as all of my colleagues have, looking at the evidence, and, of course, the staff here have done a great deal of work on the question. And the bottom line for most of it, most of the research, is that while these tools are not so powerful that they can solve the problem, they are at least able to provide meaningful support to the economy. Our job is to use the tools we have to meet our mandate, which is maximum employment and price stability. So if we have tools that we think can provide some assistance and we're not meeting our mandate, then I think that our obligation is to do what we can (Fed Press Conference 13 September 2012).

Bernanke suggests that "most of the research" provides evidence in support of the claim that QE3 can "provide meaningful support to the economy": that most macroeconomists agree that QE3 isn't powerful enough to "solve the problem" (i.e., reduce unemployment significantly) but that it is powerful enough to contribute a little to the achievement of the dual mandate (and therefore worth trying).

But of what kind are the research and the evidence that Bernanke cites? The kind of research that he cites can be described as macroeconomic causal modeling, i.e., as the practice of using empirical data to specify, estimate, and test macroeconomic models that represent the causal relations that macroeconomists believe hold between aggregate quantities: relations that permit manipulations of one quantity (e.g. the real interest rate) to influence another (e.g. inflation). A particular model that supports Bernanke's claim that QE3 can provide meaningful support to the economy is, for instance, the model that Lawrence Christiano, Roberto Motto, and Massimo Rostagno (2014) develop to analyze the causal relations that they think obtain between aggregate quantities in the US and the Euro area. And a particular model that supports Trichet's claim that the interest rate reduction will fully benefit the Euro area over time is the model that Frank Smets and Rafael Wouters (2002) develop to investigate the causal relations that they believe hold between aggregate quantities in the Euro area. A causal model is, of course, only a set of mathematical equations. But as a set of mathematical equations, it expresses a set of causal hypotheses. A causal hypothesis that Christiano, Motto, and Rostagno (2010, p. 6), for instance, defend says that "[o]n rare occasions, changes in banks' liquidity preferences can become a major cause of disruption for the broad economy." And a causal hypothesis that Smets and Wouters (2002, p. 6) defend says that "a temporary monetary policy tightening, associated with a temporary increase in the nominal and real interest rate, has a […] negative effect on both output and inflation."

Another question of considerable interest is what it means to say that X "can become a major cause of" Y, or that X has a positive or "negative effect on" Y, when X and Y stand for macroeconomic aggregates. Economists, econometricians, and macroeconomists have remained remarkably silent on that question. Probably the only macroeconomist who has ever dealt with that question in substantial philosophical detail is Kevin Hoover (2001, pp. 59–61, 2011, 2013). And the only economist or econometrician who has ever come up with an explicit definition of 'X causes Y' is Clive Granger (1980). Textbooks and commentators usually agree that Granger's definition doesn't capture what economists, econometricians, or macroeconomists have in mind when developing causal models or using causal vocabulary. But hardly any economist, econometrician, or macroeconomist has tried to propose an alternative definition.

If the kind of research that Bernanke cites is macroeconomic causal modeling, then the kind of evidence that he cites is causal evidence. Causal evidence is meant to determine a causal hypothesis: to allow for the selection of that hypothesis and for the rejection of competing hypotheses. Causal evidence is not to be confused with the kind of evidence that derives from simple comparisons of the development of aggregate quantities before and after the respective policy measure. Before March 2008, for instance, GDP had been continuously growing in the Euro area even though there hadn't been any noteworthy monetary operations since 2000. Similarly, GDP had been continuously growing in the USA since 2002 even though the first round of quantitative easing was started only in November 2008 (and even though the Fed increased the federal funds rate steadily from 1% to 5.25% between 2004 and 2006). These comparisons indicate that GDP might have been growing even in the absence of the respective policy measure (QE3 or the reduction of the interest rate on the main refinancing operations of the Eurosystem). They therefore do not sufficiently determine a causal model that implies that the policy measure causes a change in GDP.

If causal evidence doesn't derive from simple comparisons of the development of aggregate quantities before and after the respective policy measure, how does it derive? Economists, econometricians, and macroeconomists have again remained remarkably silent on that question. But while the proposal of an explicit definition can be regarded as an exercise in causal semantics or metaphysics (a field that arguably isn't part of the business of economists, econometricians, or macroeconomists), the analysis of causal inference methods seems to belong to the core of the scientific activity that any macroeconomist who aims to defend causal hypotheses that policymaking institutions cite when accounting for macroeconomic policy decisions should engage in. Again, the only macroeconomist who has ever come up with a detailed account of the logic of causal inference in macroeconomics is Hoover.
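Since Granger's proposal is the only explicit definition just mentioned, it may help to record a rough rendering of it here. The notation is mine and simplifies Granger's (1980) formulation:

\[
X \text{ Granger-causes } Y \;\Longleftrightarrow\; P\big(Y_{t+1} \in A \mid \Omega_t\big) \neq P\big(Y_{t+1} \in A \mid \Omega_t \setminus \overline{X}_t\big) \ \text{for some set } A,
\]

where \(\Omega_t\) denotes all information available up to time \(t\) and \(\overline{X}_t\) denotes the history of \(X\) up to \(t\). The definition is purely probabilistic: it says that the history of X helps predict Y, not that a manipulation of X would change Y, and this is one reason why it is usually held not to capture the kind of causality that matters to policy analysis.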

The importance of causal modeling and causal evidence for macroeconomic policy and the neglect of semantic and inferential accounts of causality among macroeconomists call for a semantic and methodological analysis of macroeconomic causality: for an analysis of the meaning that causal expressions need to have when figuring in causal hypotheses that are meant to justify macroeconomic policy decisions and for an analysis of the causal inference methods that macroeconomists can use to support causal hypotheses. This book is supposed to heed that call. It is going to discuss no fewer than six accounts that have been or can be proposed to capture the meaning of causal expressions in macroeconomics: two accounts that have been explicitly proposed (Hoover's and Granger's accounts); two general accounts that might be applicable to macroeconomics (the probability account proposed by Patrick Suppes 1970, and the interventionist account proposed by James Woodward 2003); and two accounts that can be reconstructed from the causal Bayes nets approach developed by Peter Spirtes, Clark Glymour, and Richard Scheines (2000), and from the potential-outcome approach that Joshua Angrist and Guido Kuersteiner (2011) have introduced into macroeconomics more recently.

Each of the six accounts stands in one of roughly two traditions: in the tradition of understanding efficient causes as raising the probability of their effects, or in the tradition of understanding them as causally dependent on an intervention variable (IV) (or instrumental variable or "instrument"), i.e., on a variable on which not only the putative cause causally depends but also the putative effect (via the putative cause), and which doesn't causally depend on the putative effect or any other variable on which the putative effect causally depends. While the first tradition goes back to Hume, the second tradition has its roots in some of the work of the early econometricians (most notably of Haavelmo 1944/1995 and Simon 1953). The second tradition is younger than the first; but unlike the first tradition, the second tradition is at least compatible with Aristotelian approaches to efficient causation: with approaches that involve firm ontological commitments to powers, tendencies, or capacities. Hoover's account, for instance, stands in the second tradition. He characterizes it as "not inconsistent" (Hoover 2001, p. 100) with that of Nancy Cartwright, who is sympathetic to probability theories of causality, but holds that (high) conditional probabilities only manifest capacities or "nomological machines."1 Like Hoover's account, Woodward's interventionist account and the potential outcome approach of Angrist and Kuersteiner stand in the second tradition. The probability account proposed by Suppes, the Granger causality account, and the causal Bayes nets approach, by contrast, stand in the first tradition.

The present book will defend variants of the accounts standing in the second tradition. What speaks against the accounts standing in the first tradition is that knowledge of causes that raise the probability of their effects can be employed for purposes of prediction, but less so for purposes of policy analysis. The book will also analyze the inference methods that come along with the three accounts that stand in the second tradition: natural experiments and the empirical procedures that Hoover (2001, Chapters 8–10) and Angrist and Kuersteiner (2011) propose to test for causal hypotheses in macroeconomics. What these methods have in common is that they derive empirical evidence from the conditions of the IV method, i.e., from conditions requiring that there be no confounders of I and X and X and Y when Y is supposed to causally depend on X (where I is an IV on which X causally depends).

One final question of considerable interest is whether the evidence that macroeconomists provide when conducting causal inference is perhaps too inconclusive to support specific causal hypotheses. Asking that question is not beside the point. Broad scientific consensus is strikingly absent from macroeconomic policy analysis, and the inconclusiveness of causal evidence would explain some of that absence. The absence of consensus is particularly striking in the case of the monetarist position that nominal interest rate manipulations have real effects, i.e., that inflation and aggregate output and demand causally depend on the nominal interest rate. This is the position that will be at the center of discussion for much of the book. But there are other positions about which macroeconomists disagree: the position that inflation causally depends on aggregate demand, the position that aggregate demand causally depends on inflation expectations etc. It will turn out that the evidence that macroeconomists provide when conducting causal inference is indeed too inconclusive to select specific hypotheses from pools of competing causal hypotheses. The evidence is too inconclusive because it derives from the conditions of the IV method (i.e., from conditions requiring that there be no confounders of I and X and X and Y), and because in macroeconomics, there are confounders of which we cannot know whether they can be controlled: aggregates of expectations that individuals form with respect to the behavior of all kinds of variables that matter to them (expectational aggregates, or 'hidden variables,' as I will say more generally in Section 4 of Chapter 4). We cannot know whether they can be controlled because we cannot measure them.

It is important to understand that the position that macroeconomists cannot provide any conclusive evidence in support of specific hypotheses from pools of competing causal hypotheses is not a nihilistic one. The position says that the inconclusiveness of causal evidence in macroeconomics is non-sporadic in the sense that the confounders, of which we cannot know whether they can be controlled, are likely to be active whenever there is an attempt to control for them (this is an implication of the famous Lucas critique). But the position also says that macroeconomists are currently unable to provide conclusive evidence: that macroeconomists are currently unable to measure individual expectations, that their inability to measure individual expectations has to do with their current inability to understand how individual expectations are formed, and that empirical science (for instance, experimental economics) might at one point be able to find out how individual expectations are formed.

This book is going to draw an important conclusion from the current inconclusiveness of causal evidence in macroeconomics: the conclusion that macroeconomic causal models cannot be understood as 'scientifically objective' in any of the traditional meanings of the term. In macroeconomics, scientific objectivity has been traditionally conceived of as scientific realism, independence of non-scientific values, or expertise. One may say that a causal model used to justify macroeconomic policy decisions is objective in the sense of scientific realism if what it represents or refers to (causal relations between aggregate quantities) really exists, that it is objective in the sense of value-independence if it is selected independently of the non-scientific values (i.e., ideologies, value judgments, or group interests) that macroeconomists happen to endorse, and that it is objective in the sense of expertise if its specification relies on the intuitions of macroeconomic experts. If causal evidence is inconclusive in macroeconomics, then scientific realism and expertise will be problematic, while independence of non-scientific values will be impossible. Scientific realism and expertise will be problematic because there will be no way for us to decide whether there are causal relations connecting aggregate quantities, or whether there is anything like macroeconomic expertise. And independence of non-scientific values will be impossible because macroeconomists won't be able to select macroeconomic causal models (or the causal hypotheses they express) on the basis of empirical evidence alone. If they select a causal model, then the non-scientific values that they happen to endorse will inevitably play a role.

Another important conclusion that the book is going to draw relates to the political organization of macroeconomic policy analysis as a scientific research field. If causal evidence is inconclusive in macroeconomics, then non-scientific values will dominate others unless the scientific practice of macroeconomic policy analysis is organized democratically. The democratic organization of a scientific research field largely coincides with what Philip Kitcher (2011) calls "well-orderedness." According to Kitcher, a scientific discipline is well-ordered if significance (the question of what problems research should be conducted on) is dealt with in an ideal discussion under mutual engagement; if certification (the acceptance of research results as providing evidence for scientific claims) results from applications of methods that an ideal deliberation endorses as reliable, and that conform to the ideal of transparency; and if application (the use of public knowledge to solve urgent problems) is the topic of an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available. In macroeconomics, the problem arises that the methods that an ideal deliberation endorses as reliable are currently unavailable. In order for these methods to be forthcoming, progress needs to be made especially in experimental economics: experimental economists need to understand the formation of individual expectations, and macroeconomists need to develop reliable methods of measuring these expectations. If these methods fail to be forthcoming, however, then macroeconomic policy analysis might turn out to lack secure scientific-empirical foundations: then macroeconomic policy might be led astray when trying to manipulate macroeconomic quantities to influence others, and then it might have to focus on more moderate goals like that of the enhancement of the resilience of the economic system.

1.2 The Agenda

The book divides naturally into two parts. Part I is concerned with macroeconomic causality and is divided into four chapters. Chapter 2 (the first chapter of Part I) aims to develop an adequate account of macroeconomic causality. It discusses the definition that is central to Woodward's (2003) interventionist account and the definitions that can be extracted from Hoover's (2001, 2011, 2013) remarks on privileged parameterization and from the potential outcome approach that Angrist and Kuersteiner (2011) have introduced into macroeconomics more recently. The definition that it defends can be regarded as the gist that is common to all three definitions when they are relieved of overly restrictive conditions. It says (roughly) that Y causally depends on X if and only if there is a possible intervention on X that changes Y, where X and Y stand for macroeconomic aggregates, where an intervention is understood as a manipulation of an IV I that satisfies conditions requiring that X causally depend on I, and that there be no confounders of X and Y, and where an IV is either a variable or a parameter.

Chapter 3 analyzes the ontology of macroeconomic aggregates and draws conclusions for the type of program of microfoundations that I think macroeconomists should adopt. The ontological analysis shows that macroeconomic aggregates fully reduce to the microeconomic quantities of which they are composed, that they supervene on these quantities, and that they take positions in relations of probabilistic dependence that emerge from the direct interactions between heterogeneous agents in dynamic non-equilibrium. The ontological analysis shows further that we cannot identify the relations of direct causal dependence that generate the relations of probabilistic dependence because in macroeconomics, empirical procedures of causal inference cannot provide any conclusive evidence in support of causal relations. The conclusion for the program of microfoundations says that the program should be rejected if it is understood as the program of reducing macroeconomics to general equilibrium theory (which is essentially the theory of homogeneous agents who interact indirectly via a price mechanism in dynamic equilibrium), that it should be accepted if it is understood as a program of empirical microfoundations, and that emerging relations of probabilistic dependence between macroeconomic aggregates cannot be investigated independently of any microfoundations.

Chapter 4 analyzes the methods of causal inference that come along with the three definitions discussed in Chapter 2: the IV method and econometric causality tests. It argues that the evidence that macroeconomists can provide when using these methods is in principle too inconclusive to support the hypothesis that X causes Y, where X and Y stand for macroeconomic aggregates like the key interest rate and aggregate demand. The evidence provided by the IV method is too inconclusive because it derives from conditions requiring that there be no confounders of I and X and X and Y (where I is an instrumental variable or IV on which X causally depends), and because in macroeconomics, confounders that cannot be controlled for or measured (in particular, variables relating to inflation expectations, demand expectations, and so on) are likely to be present. The evidence provided by econometric causality tests (Hoover's testing procedure and the potential outcome approach) is too inconclusive because they can be shown to rely on the conditions of the IV method at least tacitly.

Chapter 5 discusses three probability approaches to causality: the probability approach proposed by Suppes (1970), the Granger causality approach (Granger 1969; 1980), and causal Bayes nets theory (Spirtes et al. 1993). What motivates this discussion is the negative result of Chapter 4: if the methods of causal inference that come along with the three definitions discussed in Chapter 2 fail to provide conclusive evidence, then a more adequate definition might be extractable from one of the three probability approaches. But Chapter 5 will ultimately dismiss these approaches. The essential reason is that knowledge of causes that raise the probability of their effects cannot be employed for purposes of policy analysis. Chapter 5 also discusses some of the problems that appear to be inherent in attempts to derive knowledge about causality in the sense of the definition defended in Chapter 2 from knowledge about the probabilistic dependencies and independencies that are of specific interest in causal Bayes nets theory.

Part II deals with scientific objectivity in macroeconomics and is also divided into four chapters. Chapters 6–8 analyze the various conceptions of scientific objectivity that are traditionally thought to be pertinent to macroeconomics: scientific realism in macroeconomics (the position that macroeconomic causal hypotheses have a counterpart in reality), value-independence, and expertise. Scientific realism, value-independence, and expertise are associated with the traditional meanings of 'scientific objectivity' that Daston and Galison (2010) describe when investigating the historical development of the concept of scientific objectivity: scientific realism exemplifies what Daston and Galison call "truth-to-nature," value-independence an amalgamation of what they call "mechanical" and "structural" objectivity, and expertise what they call "trained judgment."

Chapter 6 argues that scientific realism is problematic in macroeconomics because we cannot know whether macroeconomic causal hypotheses adequately represent the causal relations that exist in the economy at hand. We know that causal relations exist if we can demonstrate that the common cause principle obtains: the principle that X and Y are correlated if and only if X causally depends on Y, Y causally depends on X, or there is a common cause Z on which both X and Y depend. Demonstrating that the common cause principle obtains is more difficult than it might seem: the macroeconomic time series that provide values to X and Y are often non-stationary (Hendry 1980); non-stationary time series are not subject to the common cause principle unless they are co-integrated (Hoover 2003); and the empirical procedure that can be applied to test for co-integration (Johansen 1988) is prone to several shortcomings (Cheung and Lai 1993, Pagan 1995). But even when assuming that the common cause principle obtains, we cannot know whether macroeconomic causal models adequately represent causal relations because of the negative result of Chapter 4: because these models are empirically underdetermined (because 'X and Y are correlated' underdetermines 'X causally depends on Y,' 'Y causally depends on X,' and 'X and Y causally depend on (variable set) Z'). Sometimes the correlation between X and Y combines with an additional piece of evidence to rule out 'X causally depends on Y' (as in applications of the potential outcome approach), but in macroeconomics, it is currently impossible to provide empirical evidence to rule out the possibility of confounding.

Chapter 7 argues that the value-free ideal is currently unattainable in macroeconomics because hypotheses about causal relations between macroeconomic aggregates are underdetermined by empirical evidence; because in macroeconomics, specific combinations of values that are traditionally held to be of the scientific kind (the combination of simplicity with conservatism in the sense of consistency with Walrasian general-equilibrium theory, and the combination of non-simplicity with conservatism in the sense of consistency with non-Walrasian general-equilibrium theory) select only competing hypotheses from pools of competing and empirically underdetermined causal hypotheses; and because macroeconomists are likely to prefer one combination to another on the basis of the non-scientific values that were known to Marx (1857/²1994, ⁴1890/1990, 1894/1981) and have been analyzed subsequently by Hausman and McPherson (²1992) and Schumpeter (1949): on the basis of ideologies, value judgments, and group interests. The argument developed in Chapter 7 represents a macroeconomic application of the argument from empirical underdetermination. Longino (1996, 2008) develops a generalization of that argument when endorsing the well-known thesis of empirical underdetermination (cf. e.g. Quine 1975). But Norton (2008, pp. 17, 18) is right when saying that "the underdetermination thesis is little more than speculation based on an impoverished account of induction," and that the underdetermination of a theory or hypothesis by empirical evidence needs to be decided "by direct study on a case-by-case basis." There is another popular argument that is often cited in support of the unattainability of the value-free ideal: the argument from inductive risk (Douglas 2000, Rudner 1953). It can be shown, however, that this argument is not particularly strong: that there is an ambiguity in its premises, and that it misrepresents the procedure by which scientists should conduct statistical hypothesis tests (cf. Henschen 2021).

Chapter 8 describes and points to the problematic character of the conception of trained judgment or expert knowledge in macroeconomics. Expert knowledge is not (only) propositional knowledge but to a substantial degree, non-propositional knowledge or 'know-how' (Dreyfus and Dreyfus 1986). In macroeconomics, expert knowledge is accordingly not only propositional knowledge in the shape of evidential propositions that can be derived from theory, natural experiments, or econometric tests, but to a substantial degree, also non-propositional knowledge (or 'know-how'), i.e., knowledge how to specify a causal hypothesis in an intuitive response to the situation at hand (a hypothesis that may deviate from any prevalent models in certain respects). One may accordingly think that in macroeconomics, a causal hypothesis is objective if its specification relies on the intuitions of a macroeconomic expert. But the problem with that conception is that in macroeconomics, it is difficult to decide which of the macroeconomists who deny each other the capacity of expertise qualify as experts. Alvin Goldman (2001) lists five sources of evidence that a competent non-expert might have for deciding whether someone qualifies as an expert: (a) arguments presented by the contending experts to support their own views and critique their rivals' views, (b) agreement from additional putative experts on one side or other of the subject in question, (c) appraisals by 'meta-experts' of the experts' expertise (including appraisals reflected in formal credentials earned by the experts), (d) evidence of the experts' interests and biases vis-à-vis the question at issue, and (e) evidence of the experts' past 'track record.' Chapter 8 argues that none of these sources can be exploited: arguments presented by the contending experts underdetermine the hypothesis they wish to defend; agreement from additional putative experts and appraisals by 'meta-experts' can be found on all sides (Keynesians, new Keynesians, new classicists, Austrian economists etc.); according to the result of Chapter 7, evidence of the experts' interests and biases can also be found on all sides; and evidence of the experts' past 'track record' is unavailable because it is impossible to decide whether the hypotheses they defend have a counterpart in reality.

Chapter 9 argues that non-scientific values will dominate others unless the scientific practice of macroeconomic policy analysis is organized democratically. The democratic organization that it recommends largely coincides with what Kitcher (2011) calls "well-orderedness." Kitcher maintains that a scientific discipline is well-ordered if significance is dealt with in an ideal discussion under mutual engagement; if certification results from applications of methods that an ideal deliberation endorses as reliable, and that conform to the ideal of transparency; and if application is the topic of an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available. Chapter 9 also argues that the ideal of well-orderedness has implications for the sort of progress that macroeconomics will have to make in order to become scientifically objective. Longino (1996, p. 40) believes that critical interaction is what transforms "the subjective into the objective." But if objectivity in the sense of scientific realism, value-independence, or expertise is to be restored, there will need to be certification instead of critical interaction. There will need to be methods that an ideal deliberation endorses as reliable; and these methods are most likely to be forthcoming if progress can be made in experimental economics. If these methods fail to be forthcoming, however, then macroeconomic policy analysis will lack secure scientific-empirical foundations.

1.3 A Note on Interdisciplinarity and the Philosophical Methods Employed

As an interdisciplinary work in the fields of macroeconomics, macroeconomic methodology and the philosophy of causality, causal inference and objectivity, this book is supposed to appeal to economists and philosophers alike: to theorists who belong to either of these two disciplines and not necessarily to both. Preparing such a piece of research runs the obvious risk of presupposing too much theoretical, methodological, and terminological knowledge on both sides. In order to minimize that risk to a tolerable degree, the book aims to provide an analysis that is theoretic and specific enough to make a convincing case for its core theses but that at the same time is mundane and general enough not to hamper the understanding of theorists who have a decent background in either of the two disciplines but not in both. An analysis of this sort will have to introduce philosophical terminology carefully and to avoid complicated formal proofs of propositions that might play an important role in the philosophy of causation or causal inference. It will also have to explain macroeconomic theory and macroeconomic methods in a way that allows theorists who are not well-versed in macroeconomics or econometrics to understand them without major difficulties.

I am honestly not 100% sure that the analysis in the following chapters is always theoretic and specific enough and at the same time mundane and general enough. I would therefore like to mention some sections that may not be accessible to the same extent to all readers of this book. Philosophers and non-economists who are not interested in every detail of the potential outcome approach and Hoover's account of causality and causal inference might want to skip Sections 2.4–2.6 of Chapter 2 and Sections 4.5 and 4.6 of Chapter 4. The remaining sections of these chapters make it sufficiently clear why I defend a macroeconomic variant of the interventionist account, and why I think that the empirical evidence that macroeconomists can provide in support of causal hypotheses is too inconclusive. But (macro-) economists – especially those who have doubts about the soundness of my skeptical argument – will have to read the four chapters of Part I (Chapters 2–5) in detail. They may, on the other hand, be satisfied with a merely cursory look at the four chapters of Part II (Chapters 6–9). The general idea that the skeptical argument of Part I has negative implications for the scientific objectivity of macroeconomic policy analysis is clear enough. And perhaps the philosophical analysis of arguments for and against scientific realism, of the distinction between scientific and non-scientific values, of scientific expertise, and of the political organization of scientific practice is not the cup of tea of every economist.

As an interdisciplinary work in the fields of macroeconomics, macroeconomic methodology and the philosophy of causality, causal inference and objectivity, the book is supposed to appeal to economists and philosophers alike. But ultimately, the book is a philosophical (and not a macroeconomic) piece of research. It takes macroeconomic theory and macroeconomic methods of causal inference as its object of investigation and applies philosophical methods to investigate that object. The philosophical methods that it applies are, more specifically, of roughly three kinds. The first method is that of semantic analysis: a method that starts off with a question, that carefully balances reasons that speak for or against the various claims that can be made in response to that question, and that (if possible) comes out in favor of at least one of these claims. Consider e.g. the following question: what does it mean to say that Y causally depends on X in the context of macroeconomic policy analysis? A possible claim that can be made in response to that question states that Y causally depends on X if and only if there is a possible intervention on X that changes Y while all other variables in a set of pre-selected variables remain unchanged.
A reason that speaks for this claim says that it does justice to the policy context: under this claim, a

Introduction  13 causal relation between X and Y is one that can be exploited for policy purposes. A reason that speaks against this claim is that the notion of “intervention” is relatively unclear (and that it turns out that this notion is defined in terms of a condition that is violated in macroeconomics as a matter of principle). But semantic analysis can come out in favor of this claim when the claim is suitably modified (e.g. when the condition used to define “intervention” can be dropped). The second method is descriptive methodology: the method of describing the methods that scientists use as a matter of fact. The methods that the book describes comprise Hoover’s and Angrist and Kuersteiner’s empirical procedures, randomized controlled trials (RCTs) and natural experiments. It looks at the famous natural experiment that Milton Friedman and Anna Schwartz (1963, especially Chapter 13) observe to test the hypothesis that economic changes causally depend on monetary changes; at the procedure that Hoover (2001, Chapters 9 and 10) applies to test for the hypotheses that prices causally depend on money, and that taxes causally depend on government spending; and at the procedure that Angrist and Kuersteiner (2011) apply to test for the hypothesis that changes in real GDP (or aggregate output) causally depend on changes in the intended federal funds rate. The book also looks at the two RCTs that Brian Graversen and Jan van Ours (2008) and Gerard van den Berg and Bas van der Klaauw (2006) have conducted in labor economics to test for the hypothesis that the exit rate from unemployment causally depends on activation programs. The essential reason for describing these RCTs is to allow for a comparison between RCTs in microeconomics and fictitious RCTs in macroeconomics: for a comparison that is supposed to underline why RCTs can be carried out in microeconomics but not in macroeconomics. The third method that the book applies is prescriptive methodology: the method of evaluating scientific methods on the basis of purely theoretical considerations (‘from the armchair’). The methods that the book evaluates are the same as the ones that it describes: econometric causality tests and natural experiments. It evaluates these methods by investigating whether the applications of these methods deliver what they promise: whether the evidence that macroeconomists can provide when using these methods is conclusive enough to support specific hypotheses. The upshot of this evaluation will be that this evidence is too inconclusive, and that macroeconomic policy analysis needs to be organized democratically: that it needs to be open to critical interaction, and that it needs to become well-ordered if it is supposed to be scientifically objective. I am aware that this upshot is not going to meet with a lot of sympathy among practicing macroeconomists. Perhaps practicing macroeconomists feel that it is impertinent if not impudent for a philosophical outsider to evaluate their methods negatively. But I can preempt possible accusations of impertinence or impudence by adopting a strategy that I have learned from Alexander Rosenberg (1992, Chapter 5): the strategy of backing up negative evaluations of specific methods by citing what practicing scientists think about these methods. 
Adopting that strategy to back up my negative evaluation of macroeconomic methods of causal inference is not especially difficult because criticism of these methods is relatively widespread among practicing econometricians and macroeconomists.

14 Introduction In econometrics, Christopher Sims (1980) argued early on that the “incredible” exclusion restrictions used to estimate macro-econometric models undercut the reliability of policy advice based on these models. A few years later, Edward Leamer (1983) attacked both the “whimsical” nature of the assumptions used to justify inferences in regression studies and the sensitivity of results to arbitrary decisions about the choice of control variables. The two articles targeted different audiences and proposed different techniques for solving the perceived problems in econometric practice. But they shared the common message that far more attention needed to be paid to the identification of causal effects, and that econometric inference should not hinge on subsidiary modeling assumptions. The econometric research that has been conducted in response to these articles has led to important advances that make empirical work look far more “credible” than four decades ago. Nowadays, empirical work is based on more careful design, including both actual and natural (or “quasi”-) experiments. Some econometricians go as far as to suggest that the identification of causal effects is no longer a problem. Angrist and Pischke (2010, pp. 3–4), for instance, are “happy to report” that empirical microeconomics “has experienced a credibility revolution, with a consequent increase in policy relevance and scientific impact.” But other econometricians, most notably Sims and Leamer, remain skeptical. Leamer (2010, p. 33) issues “warnings about the willingness of applied economists to apply push-button methodologies” and points to “the inevitable limits of randomization” and “the need for sensitivity analysis in this area.” Sims (2010, p. 59) claims that what Angrist and Pischke say “about macroeconomics is mainly nonsense.” Criticism of causal inference methods is just as widespread among macroeconomists as among econometricians. Robert Lucas (1976) directed his famous critique primarily against the causal modeling practice of Keynesian macro-econometrics. Much of the criticism that macroeconomists like Willem Buiter (2009), Bradford DeLong (2012) and Paul Krugman (2009) have expressed in the wake of the crisis of 2008–2009 can be interpreted as targeting the inability of dynamic-stochastic general-equilibrium models to justify the macroeconomic policy measures that have been implemented to mitigate the crisis. Lawrence Summers (1991, p. 130) claims that no econometric test ever decided an economic question. He explicitly exempts natural experiments (and especially the one observed by Friedman and Schwartz) from the charge of never deciding any economic question. He also points out, however, that natural experiments decide economic questions by persuasion, and that they lack the scientific pretension of an explicit probability model (cf. Summers 1991, pp. 139–40). Buiter, DeLong, and Krugman voice their concern mainly through the public media, while Summers (1991, especially section III) argues for his claim primarily by way of example. He argues convincingly that specific applications of the deep parameter and vector autoregression approaches do not decide particular economic questions. But he doesn’t show that other applications of these approaches or applications of other approaches do not decide particular economic questions. He does not provide any general argument in support of his claim. This lack of argument is the primary motivation of this book: it aims to provide the critics with a detailed

and general argument and to increase the pressure on practicing econometricians and macroeconomists. The argument to be provided is similar in spirit to the arguments of Leamer and the other skeptics. But it focuses more on macroeconomics, places more emphasis on the problem of confounding, and is more philosophical in nature (it operates on the basis of the interventionist account of causality and the associated theory of causal graphs).

Note

1 A nomological machine is “a fixed (enough) arrangement of components, or factors, with stable (enough) capacities that in the right sort of stable (enough) environment, give rise to the kind of regular behavior that we represent in our scientific laws” (Cartwright 1999, p. 50).

References

Angrist, J. D. and Kuersteiner, G. M. (2011). “Causal Effects of Monetary Shocks: Semiparametric Conditional Independence Tests with a Multinomial Propensity Score.” The Review of Economics and Statistics 93(3), 725–47.
Angrist, J. D. and Pischke, J.-S. (2010). “The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics.” Journal of Economic Perspectives 24(2), 3–30.
Buiter, W. (2009). “The Unfortunate Uselessness of Most ‘State of the Art’ Academic Monetary Economics.” VoxEU (6 March).
Cartwright, N. (1999). The Dappled World. Cambridge: CUP.
Cheung, Y.-W. and Lai, K. S. (1993). “Finite-Sample Sizes of Johansen’s Likelihood Ratio Tests for Cointegration.” Oxford Bulletin of Economics and Statistics 55, 313–28.
Christiano, L. J., Motto, R. and Rostagno, M. (2010). “Financial Factors in Economic Fluctuations.” ECB Working Paper Series No. 1192 (May).
Christiano, L. J., Motto, R. and Rostagno, M. (2014). “Risk Shocks.” American Economic Review 104(1), 27–65.
Daston, L. and Galison, P. (22010). Objectivity. New York: Zone Books.
DeLong, B. J. (2012). “This Time, It is not Different – The Persistent Concerns of Financial Macroeconomics.” In Blinder, A. et al. (eds.), Rethinking Finance. New York: Russell Sage Foundation.
Douglas, H. (2000). “Inductive Risk and Values in Science.” Philosophy of Science 67, 559–79.
Dreyfus, H. L. and Dreyfus, S. (1986). Mind Over Machine. New York: The Free Press.
Friedman, M. and Schwartz, A. J. (1963). A Monetary History of the United States, 1867–1960. Princeton: PUP.
Goldman, A. (2001). “Experts: Which Ones Should You Trust?” Philosophy and Phenomenological Research 63(1), 85–110.
Granger, C. W. J. (1969). “Investigating Causal Relations by Econometric Models and Cross-Spectral Methods.” Econometrica 37(3), 424–38.
Granger, C. W. J. (1980). “Testing for Causality: A Personal Viewpoint.” Journal of Economic Dynamics and Control 2(4), 329–52.
Graversen, B. K. and van Ours, J. C. (2008). “Activating Unemployed Workers Works. Experimental Evidence from Denmark.” Economics Letters 100, 308–10.
Haavelmo, T. (1944/1995). “The Probability Approach in Econometrics.” In Hendry, D. F. and Morgan, M. (eds.), The Foundations of Econometric Analysis. Cambridge: CUP, 477–90.
Hausman, D. M. and McPherson, M. S. (21992). “Economics, Rationality, and Ethics.” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP, 252–77.
Hendry, D. (1980). “Econometrics – Alchemy or Science?” Economica 47(188), 387–406.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Hoover, K. D. (2003). “Nonstationary Time Series, Cointegration, and the Principle of Common Cause.” The British Journal for the Philosophy of Science 54, 527–51.
Hoover, K. D. (2011). “Counterfactuals and Causal Structure.” In McKay Illari, P., Russo, F. and Williamson, J. (eds.), Causality in the Sciences. Oxford: OUP, 338–60.
Hoover, K. D. (2013). “Identity, Structure, and Causal Representation in Scientific Models.” In Chao, H.-K., Chen, S.-T. and Millstein, R. (eds.), Towards the Methodological Turn in the Philosophy of Science: Mechanism and Causality in Biology and Economics. Dordrecht: Springer, 35–60.
Johansen, S. (1988). “Statistical Analysis of Cointegration Vectors.” Journal of Economic Dynamics and Control 12, 231–54.
Kitcher, P. (2011). Science in a Democratic Society. Amherst, NY: Prometheus Books.
Krugman, P. (2009). “How Did Economists Get It So Wrong?” The New York Times 09/06/2009.
Leamer, E. (1983). “Let’s Take the Con Out of Econometrics.” American Economic Review 73(1), 31–43.
Leamer, E. (2010). “Tantalus on the Road to Asymptopia.” Journal of Economic Perspectives 24(2), 31–46.
Longino, H. (1996). “Cognitive and Non-cognitive Values in Science.” In Nelson, L. and Nelson, J. (eds.), Feminism, Science, and the Philosophy of Science. London: Kluwer, 39–58.
Longino, H. E. (2008). “Values, Heuristics, and the Politics of Knowledge.” In Carrier, M., Howard, D. and Kourany, J. (eds.), The Challenge of the Social and Pressure of Practice: Science and Values Revisited. Pittsburgh: University of Pittsburgh Press, 189–216.
Lucas, R. E. (1976). “Econometric Policy Evaluation: A Critique.” Carnegie-Rochester Conference Series on Public Policy 1, 19–46.
Marx, K. (41890/1990). Capital. A Critique of Political Economy. Volume I. Fowkes, B. (transl), Mandel, E. (intr). London: Penguin Books.
Marx, K. (1894/1981). Capital. A Critique of Political Economy. Volume III. Fernbach, D. (transl), Mandel, E. (intr). London: Penguin Books.
Marx, K. (1857/21994). “Grundrisse: Foundations of the Critique of Political Economy (excerpts).” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP, 119–42.
Norton, J. D. (2008). “Must Evidence Underdetermine Theory?” In Carrier, M., Howard, D. and Kourany, J. (eds.), The Challenge of the Social and Pressure of Practice: Science and Values Revisited. Pittsburgh: University of Pittsburgh Press, 17–44.
Pagan, A. (1995). “Three Methodologies: An Update.” In Oxley, L. et al. (eds.), Surveys in Econometrics. Oxford: Basil Blackwell, 30–41.
Quine, W. V. O. (1975). “On Empirically Equivalent Systems of the World.” Erkenntnis 9, 313–28.
Rosenberg, A. (1992). Economics: Mathematical Politics or Science of Diminishing Returns? Chicago: University of Chicago Press.
Rudner, R. (1953). “The Scientist qua Scientist Makes Value Judgments.” Philosophy of Science 20, 1–6.
Schumpeter, J. A. (1949). “Science and Ideology.” American Economic Review 39, 345–59.
Simon, H. A. (1953). “Causal Ordering and Identifiability.” In Hood, W. C. and Koopmans, T. (eds.), Studies in Econometric Methods (Cowles Commission Monograph No. 14), 49–74 (chapter III).
Sims, C. A. (1980). “Macroeconomics and Reality.” Econometrica 48(1), 1–48.
Sims, C. A. (2010). “But Economics is Not an Experimental Science.” Journal of Economic Perspectives 24(2), 59–68.
Smets, F. and Wouters, R. (2003). “An Estimated Stochastic Dynamic General Equilibrium Model of the Euro Area.” ECB Working Paper Series No. 171 (August).
Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search. New York: Springer.
Summers, L. H. (1991). “The Scientific Illusion in Empirical Macroeconomics.” The Scandinavian Journal of Economics 93(2), 129–48.
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland, chaps. 1&2.
Van den Berg, G. J. and Van der Klaauw, B. (2006). “Counseling and Monitoring of Unemployed Workers: Theory and Evidence from a Controlled Social Experiment.” International Economic Review 47(3), 895–936.
Woodward, J. (2003). Making Things Happen: A Causal Theory of Explanation. Oxford: OUP.

Part I

Causality

An important (or perhaps the most important) goal of macroeconomics is to provide policymakers with the kind of models that they need to justify macroeconomic policy decisions to the general public and public institutions. To justify macroeconomic policy decisions, these models need to represent relations that can be exploited for purposes of manipulation and control: relations that allow for manipulations of one relatum (e.g. the federal funds rate) to influence another (e.g. inflation). Relations that can be exploited for purposes of manipulation and control are causal relations. Therefore, the models that policymakers need to justify macroeconomic policy decisions are causal models, and a question of significant philosophical and methodological interest relates to the nature of the causality of these models. I will deal with this question in Chapters 2 and 5 of this first part of this book. If causal models are supposed to justify macroeconomic policy decisions, then these models need to refer to (a) quantities that can be manipulated for policy purposes and to (b) relations of causal dependence that connect these quantities with some target quantity, and then there needs to be evidence in support of our belief that these models refer to (a) and (b). Are macroeconomists capable of providing evidence of that sort? Most macroeconomists believe that the answer is positive. This book, by contrast, will defend a negative answer. It’s going to defend that answer by arguing that in macroeconomics, causal models often refer to quantities, of which we cannot know whether they can be manipulated because we cannot measure them (Chapter 3), and that the evidence that macroeconomists can provide in support of these models is too inconclusive to support the hypotheses they express (Chapter 4). The argument for the inconclusiveness of this evidence has a bearing for the kind of realism that present-day macroeconomists seem to endorse when developing their models. I will argue in the first chapter of Part II (Chapter 6) that this realism is problematic. But I will begin this book with a definition of what it means for Y to causally depend on X when X and Y stand for macroeconomic aggregates.
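To make the notion of a relation that can be exploited for purposes of manipulation and control slightly more tangible, here is a deliberately crude simulation sketch. The toy economy, its functional form, its coefficients, and its noise terms are invented for the illustration and are not taken from any of the models discussed in this book; the only point is that setting the policy rate by intervention changes mean inflation, which is the sort of dependence a policymaker needs a causal model to represent.

```python
# A schematic toy economy (not a model from this book): inflation is assumed to
# depend causally on the policy rate, so manipulating the rate changes inflation.
# All coefficients and distributions are invented for the example.
import random

random.seed(0)

def simulate(periods=10_000, policy_rate=None):
    """Average inflation when the rate is left alone (None) or set by intervention."""
    total = 0.0
    for _ in range(periods):
        shock = random.gauss(0.0, 0.5)
        rate = random.gauss(2.0, 1.0) if policy_rate is None else policy_rate
        # Assumed causal relation for the illustration: a higher rate lowers inflation.
        total += 4.0 - 0.8 * rate + shock
    return total / periods

print(f"mean inflation, no intervention:   {simulate():.2f}")
print(f"mean inflation, rate set to 4.0:   {simulate(policy_rate=4.0):.2f}")
```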


2 What is Macroeconomic Causality?

2.1 Introduction

A striking fact about the title question of this chapter is that macroeconomists or econometricians hardly pay any attention to it. There are, as far as I can see, only four accounts of macroeconomic causality that have been proposed in response to that question. Clive Granger (1980, pp. 330, 336) is the only econometrician who comes up with an explicit definition of “cause.” Nancy Cartwright (2009, p. 415) notes that David Hendry sometimes suggests that causality is super exogeneity. Perhaps the most elaborate account of macroeconomic causality can be found in Kevin Hoover (2001, 2011, 2013). And probably the most recent account can be reconstructed from the potential outcome approach that Joshua Angrist and Guido Kuersteiner (2011) apply to test for the causal effects of monetary policy shocks. The aim of the chapter is not to speculate about the reasons why macroeconomists and econometricians tend to refrain from confronting the question of the nature of the causality of the relations that they think obtain between aggregate quantities.1 The aim of the chapter is to develop an adequate account of macroeconomic causality. In order to develop that account, I’m not going to discuss the Granger causality and super exogeneity accounts. I will postpone a discussion of the Granger causality account to Chapter 5 and refer the reader to Cartwright (2009, pp. 415–6), Hoover (2001, pp. 167–8), and Judea Pearl (22009, pp. 165–70) for a discussion of the super exogeneity account. In order to develop an adequate account of macroeconomic causality, I’m going to discuss Hoover’s and Angrist and Kuersteiner’s accounts. I will, in addition, analyze the extent to which James Woodward’s interventionist account of causality can be applied to macroeconomics. Woodward (2003, pp. 18, 258, 321) acknowledges on numerous occasions that his account is firmly rooted in the pioneering work of some of the early econometricians. With the exception of Hoover (cf. e.g. 2011, pp. 339–43; 2013, pp. 45–55), however, macroeconomists have largely ignored that account (or work that is similar in spirit, such as Pearl’s). I’m going to argue against all three accounts that they are too restrictive: that Woodward’s account makes reference to a condition that doesn’t need to be satisfied in the case of macroeconomics, that Hoover unnecessarily restricts the notion of an intervention to interventions on parameters, and that Angrist and Kuersteiner’s

account relies on at least two conditions that don’t need to be satisfied in the case of macroeconomics. The account that I will defend can be regarded as the gist that is common to all three accounts when they are relieved of these overly restrictive conditions. The definition that is central to that account says that X directly type-level causes Y (i.e., that Y causally depends on X directly) if and only if there is a possible intervention on X that changes Y while all causal parents of Y excluding X (i.e., all variables that directly type-level cause Y except X) remain fixed by intervention, where X and Y stand for macroeconomic aggregates, where an intervention is understood as a manipulation of an intervention variable I that satisfies conditions requiring that I type-level cause X, that manipulating I break all arrows directed into any parent of Y except X (i.e., into any variable other than X that directly type-level causes Y), and that there be no confounders (i.e., type-level causes of both X and Y), and where an intervention variable is either a causal structure variable or a parameter. I will begin by introducing the notion of a causal model and by presenting macroeconomic models that can be regarded as instances of that model (Section 2.2). I will then discuss a macroeconomic application of the definition that is central to Woodward’s interventionist account of causality (Section 2.3), Hoover’s definition of ‘causality’ in terms of privileged parameterization (Section 2.4), and a definition that can be extracted from the potential outcome approach that Angrist and Kuersteiner (2011) have introduced into macroeconomics more recently (Section 2.5). In the final Section 2.6, I will summarize the arguments of Sections 2.3–2.5 and spell out in detail what I think is an adequate definition of ‘direct type-level causation’ in macroeconomics.

2.2 Causal Modeling in Macroeconomics

When trying to justify macroeconomic policy decisions, macroeconomists typically employ causal models. A causal model can be defined as a triple M = 〈V, U, Π〉, where

i U is a set of background variables,
ii V is a set of causal structure variables,
iii Π is a set of parameters assigning a probability measure P(ui) to each Ui (i = 1, … , n) and a function yi = fi(dci(yi)), i = 1, … , n, to each Yi ⊆ V, where DC(Yi) ⊆ U ∪ V \ Yi is the set of variables representing direct causes of Yi ⊆ V, and where a direct cause is to be understood as causally relevant to (or a type-level cause of) and not as an actual (or token-level) cause of Yi.2

The background variables in U encompass the influence of variables that represent direct causes of variables in V but have been omitted from V. The variables in V are measured variables that directly cause other variables in V, are directly

caused by other variables in V or both. Like the variables in U and V, the parameters in Π are variables that represent sets of potential values that are measurable or quantifiable. When time-indexed, the variables in U ∪ V represent sets of ordered pairs that assign each possible value to each possible point in time. When Yi is time-indexed, lagged values of Yi may also show up among the direct causes of Yi. Since the parameters assign a probability measure only to each of the variables in U, it is only the variables in U that supply a stochastic element. The underlying causal relationships between the variables in V are assumed to be deterministic. Macroeconomists are interested in relations of direct type- and token-level causation alike: they are not only interested in a relation of direct type-level causation between, for instance, the real interest rate and aggregate output but also in relations of direct token-level causation between the real interest rate in, say, 2008Q4 and aggregate output in 2009Q2, between the real interest rate in 2016Q3 and aggregate output in 2019Q2 (where aggregate output in 2019Q2 is the future value that aggregate output will attain in the second quarter of 2019) etc. But they tend to understand direct token-level causes as instantiations or realizations of direct type-level causes, and not relations of direct type-level causation as generalizations of relations of direct token-level causation. The models that they use to justify macroeconomic policy decisions are therefore supposed to represent relations of direct type-level causation and not relations of direct token-level causation.

As an example of a macroeconomic causal model, consider the rational expectations model that Hoover (2001, pp. 64–6; 2011, pp. 349–50; 2013, pp. 42–3) discusses at various points in his work. The model consists of two equations, of which the first describes the demand for real money:

m_t − p_t = μ − α(_tp^e_{t+1} − p_t) + ν_t,  (1)

where subscripts index time, M_t is the nominal money stock, P_t the general price level, _tP^e_{t+1} the expectation at time t of the price level at time t + 1 so that (_tP^e_{t+1} − P_t) approximates the rate of inflation between t and t + 1, μ a constant encompassing the influence of real GDP and the real interest rate,3 α a regression coefficient greater than zero, and Ν_t an independent random error.4 The second equation describes a monetary policy rule:

m_t = λ + m_{t−1} + ε_t,  (2)

where λ stands for the constant growth rate of the money stock and Ε_t is an independent random error. The errors Ν_t and Ε_t are assumed to be serially uncorrelated, to be uncorrelated with each other, and to have means of zero. If expectations of future inflation are formed rationally, i.e., if

_tp^e_{t+1} = E(p_{t+1} | Ξ_t),  (3)

where E(⋅) is the expectations operator and Ξ_t the information available at time t (which is assumed to include the structure of the economy and the current and lagged values of the variables included in the model), then the inflation rate conforms to the rate of growth of the money stock:

_tp^e_{t+1} − p_t = λ.  (4)
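For readers who like to see the stochastic structure of such a model in executable form, the following minimal sketch simulates the policy rule (2); the numerical parameter value and the error distribution are illustrative assumptions only. Under the rule, the average growth of the money stock is λ, and the best forecast of next period's money stock exceeds the current one by λ, which is what licenses the rational-expectations step from (3) to (4).

```python
# A minimal simulation sketch of the monetary policy rule (2), m_t = lambda + m_{t-1} + eps_t.
# The value of lambda and the error distribution are illustrative, not taken from the text.
import random

random.seed(1)

lam = 0.02          # constant growth rate of the money stock (a parameter in Pi)
periods = 50_000

m = [0.0]
for t in range(1, periods):
    eps = random.gauss(0.0, 0.1)     # the independent error of equation (2)
    m.append(lam + m[t - 1] + eps)   # equation (2)

avg_growth = sum(m[t] - m[t - 1] for t in range(1, periods)) / (periods - 1)

# Under rational expectations the forecast of m_{t+1} - m_t made at t is lambda,
# so by (3) and (4) expected inflation equals lambda as well.
print(f"lambda:                {lam:.4f}")
print(f"average money growth:  {avg_growth:.4f}")
```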

Substituting (4) into (1) and rearranging yields:

p_t = m_t − μ + αλ + ν_t,  (5)

which is the solution to the money-price model under rational expectations.

In macroeconomics, examples of causal models also include the dynamic-stochastic general-equilibrium (DSGE) models that central banks and other policymaking institutions use to justify macroeconomic policy decisions. Macroeconomic DSGE models are dynamic models built from microeconomic foundations with explicit assumptions about the behavior of the variables in U. As an example of a macroeconomic DSGE model, consider the following canonical DSGE model:5

y_t = E_t y_{t+1} − (i_t − E_t π_{t+1}) − δ_t,  (6)

π_t = ξy_t + βE_t π_{t+1} + u_t,  (7)

i_t = ρi_{t−1} + (1 − ρ)[r*_t + π*_t + φ_π(π_{t−1} − π*_t) + φ_y(y_t − y*_t)] + ε^i_t,  (8)

where y_t ≡ log Y_t is the logarithm of aggregate demand (which in equilibrium equals aggregate output), E the expectations operator, i_t ≡ log I_t the continuously compounded nominal interest rate, and π_t ≡ log P_t/P_{t−1} the quarterly inflation rate; R*_t, Π*_t, and Y*_t are the monetary policy targets for the real interest rate, inflation, and output, respectively; ∆_t, U_t, and Ε^i_t represent shocks to aggregate demand, inflation and the nominal interest rate, respectively, and are assumed to follow independent first-order autoregressive processes. The various parameters of the model are identified in microeconomic theory: β in the utility function of the representative household, ξ in the price-setting behavior of the representative firm, and ρ, φ_π and φ_y in the (forward-looking) interest-rate rule followed by the central bank. The model is obviously stylized: the dynamics of aggregate demand and inflation are very simple, the empirical performance of (7) is poor, everything is linear, all behavior is forward-looking etc. But the model is “canonical” in the sense that it serves as a key reference point in macroeconomic DSGE modeling. Various modifications and extensions of it are used in central banks and other policymaking institutions.6

As a final example of a macroeconomic causal model, consider the model that Angrist and Kuersteiner (2011, p. 727) use to analyze the causal effects of changes in the intended federal funds rate on changes in real GDP:

∆gdp_{t+j,ψ}(∆ff_t).  (9)

This expression describes changes in real GDP (∆GDP_{t+j}) as a potential outcome: as the value that ∆GDP_{t+j} would assume if ∆FF_t (changes in the intended

federal funds rate) were ∆ff_t. The model that Angrist and Kuersteiner think describes the process determining changes in the intended federal funds rate, i.e., the process that leads the members of the Federal Open Market Committee to intend changes in the level of the federal funds rate, looks as follows:

∆ff_t = ψ(∆ff(z_t, t), ε_t, t),

where ψ is a general mapping and ε_t an independent regression error or monetary policy shock, i.e., an error reflecting the reaction of policymakers to idiosyncratic information. The set Z_t of observed random variables includes lagged, present, and predicted values of variables standing for changes in real GDP, inflation, and the unemployment rate. The assumption that allows Angrist and Kuersteiner (2011, pp. 727–9) to understand ∆GDP_{t+j} as a potential outcome is the selection-on-observables assumption (SOA), i.e., the assumption that potential outcomes of ∆GDP_{t+j} are probabilistically independent of ∆FF_t, given Z_t.7 Potential outcomes of changes in real GDP are probabilistically independent of ∆FF_t, given Z_t, if Z_t is an admissible (or de-confounding) set of covariates (for more on the property of admissibility cf. end of Section 2.5 below). If Z_t is an admissible set of covariates, then a potential outcome of changes in real GDP can be identified from the product of the probability of changes in real GDP, given ∆FF_t and Z_t, and the probability of Z_t:

Σ_{z_t} P(∆gdp_{t+j} | ∆ff_t, z_t) ⋅ P(z_t).

2.3 An Interventionist Account of Macroeconomic Causality

The decisive question is, of course, how membership in DC(Yi) is defined, i.e., what it means for a variable to be a direct type-level cause of Yi. The present and the following two sections will consider three accounts that have been or can be proposed to answer that question. The main purpose of that consideration is to show that some of the conditions that these accounts make reference to are too restrictive in the case of macroeconomics. The first of the three accounts is Woodward’s interventionist account. In the case where i = 1, it says that (Ia) X directly type-level causes Y relative to V if and only if there is a possible intervention on X that changes Y (or its probability distribution) while all other variables in V remain fixed by intervention (cf. Woodward, 2003, pp. 55, 59). What is striking about that definition is that it relativizes the notion of direct type-level causation to a variable set. According to Woodward (2003, p. 56), the selection of a variable set V depends on what an epistemic subject is prepared to accept as serious possibility. But what an epistemic subject is prepared to accept as serious possibility might be intersubjectively different. Does Woodward believe that relations of direct type-level causation do not exist independently of the mind? No: he endorses a “kind of realism that […] is metaphysically modest and

noncommittal” (Woodward, 2003, p. 121). And this kind of realism can be reconciled with a de-relativized variant of (Ia) that says that (Ib) X directly type-level causes Y if and only if there is a possible intervention on X that changes Y (or its probability distribution) while all other variables in- and outside V remain fixed by intervention.8 What it means to say that all other variables in- and outside V remain fixed by intervention becomes clear when Woodward’s (2003, p. 98) definitions of the terms ‘intervention’ and ‘intervention variable’ are considered. His definition of ‘intervention’ says that I’s assuming some value z_i is an intervention on X with respect to Y if and only if I is an intervention variable for X with respect to Y and I = z_i is an actual (token-level) cause of the value taken by X. Note that this definition is in line with Woodward’s (2003, pp. 103–4) “nonanthropomorphism”: I’s assuming some value z_i doesn’t necessarily mean that I is set to z_i by policy or (more generally) human intervention. Woodward’s definition of the term ‘intervention variable,’ by contrast, states that I is an intervention variable for X with respect to Y if and only if the following four conditions hold:

(I1) I type-level causes X,
(I2) certain values of I are such that when I attains these values, X is no longer determined by other variables that type-level cause it but only by I,
(I3) any directed path from I to Y goes through X, and
(I4) I is statistically independent of any variable Z that type-level causes Y and is on a directed path that does not go through X.

That all other variables in- and outside V remain fixed by intervention means that conditions (I1)–(I4) are satisfied: that there isn’t any variable in- or outside V such that any of the propositions expressed by (I1)–(I4) becomes false. Satisfaction of conditions (I1)–(I4) guarantees the presence of a causal graph connecting I to X and X to Y by directed paths or the presence of a causal structure represented by that graph if there is a possible intervention on X that changes Y or its probability distribution. Note that according to Woodward, an intervention on X that changes Y (or its probability distribution) while all other variables in- and outside V remain fixed only needs to be possible in order for a relation of direct type-level causation to hold between X and Y. Woodward is not entirely clear about the exact modality captured by the term ‘possible’ in ‘possible intervention.’ In one passage, he rejects the notion of an intervention on variables for which there is no well-defined notion of change (e.g. on variables standing for race, sex, or species) (Woodward, 2003, p. 113). In another passage, he admits interventions that even involve violations of physical laws (cf. Woodward, 2003, p. 132). But what needs to be asserted at present is only that in order for X to directly type-level cause Y, it is not necessary that an intervention on an intervention variable satisfying (I1)–(I4) be actually present.
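The graph-theoretic content of these conditions can be made explicit in a small sketch. The code below checks (I1), (I3), and a purely graphical stand-in for (I4) on two hypothetical graphs; the graphs, the variable names, and the simplification of (I4), which Woodward states in terms of statistical independence rather than graph structure, are assumptions made only for the illustration.

```python
# A schematic sketch (not from the book) of checking Woodward-style conditions on a DAG
# given as {node: [children]}. The reading of (I4) below is a graphical simplification.

def directed_paths(graph, start, goal, path=None):
    """Enumerate all directed paths from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for child in graph.get(start, []):
        if child not in path:
            paths.extend(directed_paths(graph, child, goal, path))
    return paths

def ancestors(graph, node):
    """All nodes with a directed path into `node`."""
    return {n for n in graph if n != node and any(directed_paths(graph, n, node))}

def check_intervention_variable(graph, I, X, Y):
    i1 = any(len(p) > 1 for p in directed_paths(graph, I, X))   # (I1): I causes X
    i3 = all(X in p for p in directed_paths(graph, I, Y))       # (I3): every I-to-Y path runs through X
    # Simplified stand-in for (I4): no cause Z of Y that bypasses X is an ancestor of I,
    # a descendant of I, or a node sharing an ancestor with I.
    off_path_causes = {Z for Z in ancestors(graph, Y) - {X, I}
                       if any(X not in p for p in directed_paths(graph, Z, Y))}
    connected_to_i = ancestors(graph, I) | {n for n in graph
                                            if any(len(p) > 1 for p in directed_paths(graph, I, n))}
    i4 = all(Z not in connected_to_i and not (ancestors(graph, Z) & ancestors(graph, I))
             for Z in off_path_causes)
    return i1, i3, i4

graph_ok = {"I": ["X"], "X": ["Y"], "Y": []}
graph_bad = {"W": ["I", "Z"], "I": ["X"], "Z": ["Y"], "X": ["Y"], "Y": []}
print(check_intervention_variable(graph_ok, "I", "X", "Y"))    # (True, True, True)
print(check_intervention_variable(graph_bad, "I", "X", "Y"))   # (True, True, False)
```

In the second hypothetical graph, the common cause W correlates I with a cause Z of Y that bypasses X, which is precisely the kind of situation (I4) is meant to exclude.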

26 Causality One might perhaps wonder why (Ia) and (Ib) make reference to “a” possible intervention on X that will change (the probability distribution of) Y when one holds fixed all other variables in- (and outside) V by intervention. Isn’t a relation of direct type-level causation a lawful relation, and shouldn’t (Ia) and (Ib) therefore say that X is a direct type-level cause of Y if all interventions on X will change (the probability distribution of) Yi when one holds fixed all other variables in- (and outside) V by intervention? Woodward (2003, pp. 68–70) answers this question in the negative. He points out that in order for a relation between X and Y to qualify as causal, this relation doesn’t need to remain intact under all interventions on X. He cites examples from physics and the medical sciences to substantiate his point. But one might equally draw on examples from macroeconomics, including the models introduced in Section 2.2 above. Interventions, under which the relations represented by the equations of these models fail to remain intact, are of essentially four kinds. Interventions of the first kind set some of the variables of these equations to values that lie outside the domain of these equations. An example for an intervention of that kind is an intervention that sets Mt in equation (2) of the rational-expectation model from Section 2.2 to zero. Interventions of the second kind result in a breakdown of the relation represented by an equation even though they set a variable that figures in that equation to a value that belongs to the domain of that equation. An example for an intervention of that kind is an intervention that in a new classical model, sets wages to values that are high enough to result in values of supplied labor that are sufficiently high to deprive a representative agent of the leisure she needs to recreate. Interventions of the third kind lead to a breakdown of the relation represented by an equation by changing the value of a variable that is included in the set V of preselected variables but not included in that equation. An example for an intervention of that kind is an intervention that sets Eti in equation (8) of the canonical DSGE model from Section 2.2 to a value that yields a value for It that, given values for the remaining variables in the equation, is sufficiently high to yield a negative value for Yt in equation (6). Finally, interventions of the fourth kind lead to a breakdown of the relation represented by an equation by changing the value of a variable that is not even included in V. An example for an intervention of that kind is an intervention that changes the value of an omitted financial-market related variable in such a way that it counteracts the causal influence that Yt is supposed to have on Πt according to equation (7). Interventions of that kind are what Woodward refers to as “changes in the background circumstances” (cf. especially Hitchcock and Woodward, 2003, pp. 187–8). They are arguably the most interesting because they are the ones that typically get in the way of predictions of the behavior of aggregate quantities or of the consequences of macroeconomic policy manipulations. Woodward holds that in cases where i > 1 (where we are dealing with systems of equations), the condition of modularity needs to be satisfied. 
Modularity requires that (i) each equation in the system satisfy the condition of invariance, and that (ii) for each equation, there be a possible intervention on the dependent variable that changes only that equation while the other equations in the system remain unchanged (cf. Woodward, 2003, p. 329). Clause (i) doesn’t create any

What is Macroeconomic Causality?  27 major difficulties: invariance just requires that an equation in which X figures as an independent variable and Y as a dependent variable remain invariant under some intervention on X so that Y will change under this intervention (cf. Woodward, 2003, p. 69). Invariance is of utmost importance for Woodward: he says that it “is the key feature a relationship must possess if it is to count as causal or explanatory” (Woodward, 2003, p. 237). Invariance also comes in degrees: the wider the range of possible interventions under which the equation expressing the relation of direct type-level causation remains invariant, the greater the degree of invariance of that equation (cf. Woodward, 2003, p. 257). Clause (ii), by contrast, does create difficulties: Cartwright (2007, pp. 15–6) shows that there are systems of equations that represent causal structures even though they do not satisfy that clause. Consider, first, an example of a system of equations that clearly satisfies that clause: the system of equations (6)–(8), i.e., the canonical DSGE model from Section 2.2. An intervention on the interest rate in the sense of Woodward is a manipulation of an intervention variable I that sets It in equation (8) to a particular value: that breaks all arrows that are directed into It and depart from any variables other than I (e.g. from Πt − 1 or Yt) in accordance with condition (I2). The modularity condition requires that setting It in equation (8) to a particular value doesn’t disrupt any of the other equations in the system, i.e., equations (6) and (7). If the modularity condition is satisfied, condition (I3) is satisfied: the only path from I to Yt will go through It. If, moreover, condition (I4) is satisfied, then It can be inferred to directly type-level cause Yt. Consider next an example of a system of equations that represents a causal structure even though it doesn’t satisfy clause (ii): equations (2) and (5), i.e., Hoover’s money-price model under rational expectations from Section 2.2. Hoover (2013, p. 49) argues that an intervention in the sense of Woodward amounts to setting Mt to a particular value by breaking the causal arrow from Mt − 1 to Mt, and that breaking that arrow has the unwelcome consequence of rendering Λ meaningless, i.e., of undercutting any basis for forming rational expectations of the path of Mt. What is special about equations (2) and (5) is that they form a system that is subject to a nonlinear cross-equation restriction: Λ cannot be associated exclusively with either (2) or (5). It seems that Woodward’s account doesn’t hold for such systems: that a system of equations is non-modular whenever it is subject to a nonlinear cross-equation restriction. One might object that nonlinear cross-equation restrictions result only if expectation variables are solved out, and that expectations variables don’t need to be solved out because they can be observed: because macroeconomists can measure them by conducting surveys. Hoover (2001, p. 137) explains, however, why that objection is misguided: True, people form expectations and act upon them […], but such expectations do not exist independently of the actions they affect; they are not palpable, like so many pounds of rice bought by a consumer […]. Of course, one could ask people to state their expectations. That, however, would be simply their guess about how they would act or would have acted in a situation that was

28 Causality not yet at hand or had already passed. Such expectations are no more directly observable than their own preferences and are subject to the same whimsy, arbitrariness, and adjustment to subtle changes in background conditions. Hoover suggests that expectations fall into the same category as preferences. In revealed-preference theory, a consumer’s preference is reconstructed from her behavior (from her “revealed” preference). Statements about what she thinks she prefers are to be dismissed as neither verifiable nor trustworthy.9 Similarly, expectation variables cannot be measured because a subject’s statement about what she expects can neither be verified nor trusted. Expectation variables therefore need to be solved out. If they are not solved out (as in the case of the canonical DSGE model), they need to be interpreted as attaining whatever value is required to render the model consistent. It is important to see, however, that Woodward’s account can be modified in order to accommodate systems of equations that are subject to nonlinear crossequation restrictions. In order to see this, note that such systems violate clause (ii) because they violate condition (I2). Equations (2) and (5) do not violate clause (ii) in the sense that changing equation (2) disrupts equation (5): setting Mt to mt while breaking the causal arrows from Λ and Mt − 1 to Mt renders Λ “meaningless” but doesn’t disrupt equation (5). Equations (2) and (5) violate clause (ii) because for equation (2), there isn’t any possible intervention on Mt that changes that equation: because for equation (2), there isn’t any possible intervention that satisfies condition (I2), i.e., that breaks all arrows directed into Mt and departing from Λ and Mt − 1. If systems of equations that are subject to nonlinear cross-equation restrictions violate clause (ii) because they violate condition (I2), then dropping condition (I2) from the conditions that Woodward lists to define ‘intervention variable’ will modify his account in such a way that it accommodates such systems. It will modify his account in essentially two ways. It will, first, lead to a reformulation of clause (ii) of his modularity condition: to a formulation requiring that for each equation, there be a possible intervention on the dependent variable that satisfies (I1), (I3), and (I4) and does not disrupt any of the other equations in the system. Cartwright’s example of a carburetor is one that violates even this reformulation. But in macroeconomics, systems of equations generally satisfy this reformulation or (as was shown earlier) even clause (ii). They even always satisfy this reformulation or clause (ii) if the notion of an intervention is not restricted to either interventions on parameters (variables in Π) or interventions on variables in V (as I shall argue in the following section). The second modification relates to the phrase “while all other variables … remain fixed by intervention” in definitions (Ia) and (Ib). Definitions (Ia) and (Ib) would no longer define ‘direct type-level causation’ unless interventions that hold fixed all other variables were understood as manipulations of intervention variables that satisfy conditions (I1)–(I4) including (I2). In order to see this, consider the case of X causing Y only indirectly via W, i.e., the case of a directed path from X to Y through W. We would not be able to rule out that case unless there was an intervention variable certain values of which are such that when it attains these values,

W is no longer determined by other variables that type-level cause it but only by that intervention variable. X might, after all, be among those other variables. But once we start tampering with the phrase “while all other variables …,” we should also note that in order for definitions (Ia) and (Ib) to define ‘direct type-level causation,’ it suffices to hold fixed only the causal parents of Y excluding X, i.e., all variables in DC(Y)\X. Woodward (2003, p. 55) suggests that the phrase “while all other variables …” is needed to ensure that the causal relation between X and Y is direct. But Pearl (22009, pp. 127–8) is right when pointing out that in order to ensure that the causal relation between X and Y is direct, it suffices to hold constant the parents of Y excluding X. The phrase “while all other variables … remain fixed by intervention” should therefore be restated as “while all causal parents of Y excluding X … remain fixed by intervention,” where an intervention that holds fixed any of the variables in DC(Y)\X is to be understood as different from an intervention on X. I am going to spell out that difference in greater detail in Section 2.6.

2.4 Macroeconomic Causality as Privileged Parameterization

The second of the three accounts that have been or can be proposed to define membership in DC(Yi) is Hoover’s account. What is the definition that is central to that account? That question is a bit difficult to answer because Hoover nowhere states any definition that is as explicit as Woodward’s, and because he in fact offers two definitions. A condition that figures in both definitions requires that in a self-contained (or soluble) system of equations, X and Y belong to different selfcontained subsystems, that the self-contained subsystem including X be of a lower order than the self-contained subsystem including Y (i.e., that the former needs to be solved in order to solve the second), and that there be no self-contained subsystem intervening between the two. Hoover (2012, p. 782) refers to this condition as “Simon’s hierarchy condition” in order to emphasize the indebtedness of his approach to the work of Herbert Simon. In Hoover’s first definition, Simon’s hierarchy condition combines with the condition that the parameterization of the self-contained system of equations (i.e., that the set Π of parameters in a causal model M) be privileged, where a parameterization is privileged if it (and the functional form of the causal model equations) is invariant “in the face of specific interventions,” i.e., in the face of interventions on the parameters that we can directly control to control for the variables in V indirectly, and where a parameterization is invariant in the face of such interventions if the parameters in Π are variation-free, i.e., if they are mutually unconstrained. Hoover (2011, pp. 344–5; 2012, pp. 781–2; 2013, pp. 40–1) points out that Simon introduces this condition to solve the problem of the observational equivalence of different self-contained systems of equations. This condition may therefore be referred to as “Simon’s condition of privileged parameterization.” In Hoover’s second definition, by contrast, Simon’s hierarchy condition combines with Hoover’s “parameter-nesting condition” (cf. Hoover, 2012, p. 782), i.e., with the condition that Π(X) be a proper subset of Π(Y), where Π(X) and Π(Y) are the sets of parameters that can be controlled directly to control X and Y indirectly.
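The parameter-nesting idea can be illustrated with a toy recursive system; the equations and numerical values below are invented for the illustration and are not one of Hoover's own examples. Here Π(X) = {a1} is a proper subset of Π(Y) = {a1, a2, a3}: whoever controls a1 controls x directly and y only indirectly, which is the asymmetry the account exploits.

```python
# A toy self-contained recursive system (invented for illustration):
#   x = a1
#   y = a2 + a3 * x
# The x-subsystem is of lower order (it must be solved before y), and the parameters
# controlling x form a proper subset of those controlling y.

def solve(a1, a2, a3):
    x = a1              # lower-order self-contained subsystem: fixed by a1 alone
    y = a2 + a3 * x     # higher-order subsystem: requires x to be solved first
    return x, y

print("baseline     (x, y):", solve(a1=1.0, a2=0.5, a3=2.0))
print("a1 changed   (x, y):", solve(a1=2.0, a2=0.5, a3=2.0))   # changes x and, through x, y
print("a2 changed   (x, y):", solve(a1=1.0, a2=1.5, a3=2.0))   # changes y but leaves x alone
```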

30 Causality Hoover (2012, p. 782) suggests that the parameter-nesting condition extends Simon’s approach, while also inheriting the property that causal order is uniquely defined by the functional relations among variables, as long as we know the privileged parameters. Hoover’s first definition can be reconstructed from what he says about privileged parameterizations (cf. Hoover 2011, p. 345; 2013, p. 41). It states that (IIa) X directly type-level causes Y if and only if Simon’s hierarchy condition and Simon’s condition of privileged parameterization hold. Hoover’s second definition, by contrast, can be extracted from his set-theoretic formalization of causal structure (cf. Hoover, 2001, pp. 61–3). It says that (IIb) X directly type-level causes Y if and only if Simon’s hierarchy condition and Hoover’s parameter-nesting condition hold. Which definition should we prefer? It seems to me that (IIa) is preferable. In one passage, Hoover (2012, p. 782) suggests that (IIb) is required if systems of equations that are subject to nonlinear cross-equation restrictions are to be accommodated. But in the same passage, he shows that the parameter-nesting condition is not necessary: that the equations A = αA and B = αAA might represent a causal structure in which A directly type-level causes B even though Π(A) = Π(B) = {αA}, i.e., even though Π(A) is not a proper subset of Π(B). There are, moreover, at least two passages in which Hoover (2011, p. 350; 2013, p. 42) claims that (IIa) can successfully deal with the money-price model under rational expectations from Section 2.2.10 And I agree. In the case of that model, Simon’s hierarchy condition and his condition of privileged parameterization hold: the parameters of that model are variation-free; and equations (2) and (5) form a self-contained system in which (2) is of lower order and no subsystem intervenes. There are, however, at least two problems with Hoover’s characterization of the notion of intervention, as it is basic to both (IIa) and (IIb). The first problem is that Hoover (2001, p. 61; 2011, pp. 346–7; 2013, pp. 52–3) restricts that notion to interventions on parameters (or variables in Π). He shares Woodward’s nonanthropomorphism: he doesn’t think that direct parameter control is necessarily control through policy manipulation; he accepts e.g. war-induced changes of parameter values as cases of direct parameter control (cf. Hoover 2001, pp. 229–30). Hoover (2011, p. 348) is also prepared to understand variation-free parameters as intervention variables. But an intervention for Hoover is not the same as for Woodward: while for Hoover it changes the value of a variable in Π, for Woodward it changes the value of a variable in V. The problem with this restriction is that in macroeconomics, there are obviously interventions that cannot be understood as interventions on parameters. Take the canonical DSGE model from Section 2.2: how does the Federal Reserve intervene on It in equation (6)? It changes It by supplying or removing reserves from the banking system. A change of that kind is not an intervention on any of the parameters

What is Macroeconomic Causality?  31 in the model. Instead, it is an intervention on It in the sense of Woodward. Woodward’s characterization of the notion of intervention is, of course, likewise too restrictive. If Hoover’s money-price model under rational expectations is taken as a point of departure, it is legitimate to understand a monetary policy intervention as an intervention on Λ. The general lesson to be drawn states that in macroeconomics, the notion of an intervention is not to be restricted to either interventions on parameters (variables in Π) or interventions on variables in V. The second problem with Hoover’s characterization is his insistence that in order for X to directly type-level cause Y, there doesn’t need to be a possible intervention on a parameter-intervention variable that satisfies all of Woodward’s conditions (I1)–(I4) (cf. Hoover, 2011, pp. 348–9). Hoover agrees that in order for X to directly type-level cause Y, there needs to be a possible intervention on a parameter-intervention variable that satisfies (I1). But he denies that there needs to be a possible intervention on a parameter-intervention variable that satisfies any of (I2)–(I4). And while this denial is justified in the case of (I2), it is unjustified in the cases of (I3) and (I4). It is justified in the case of (I2) because, as we have seen toward the end of Section 2.3, there are macroeconomic systems of equations for which there is no possible intervention satisfying (I2), i.e., systems of equations that are subject to nonlinear cross-equation restrictions. But it is unjustified in the cases of (I3) and (I4) because a parameter-intervention variable is causally relevant to X if it satisfies (I1), because a parameter-intervention variable corresponds to an arrow-emitting node if it is causally relevant, and because X might not qualify as a direct type-level cause of Y if Y is connected to the parameter-intervention variable by a path that doesn’t go through X. Conditions (I3) and (I4) are, of course, meant to rule out the existence of such a path, as it becomes clear in the figure that Woodward (2003, pp. 101–2) uses to illustrate conditions (I3) and (I4):

Figure 2.1 The four cases ruled out by Woodward’s conditions (I3) and (I4) (cf. Woodward 2003, pp. 101, 102).
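To make conditions (I3) and (I4) concrete, here is a minimal sketch (in Python) of how condition (I3) — every directed path from the intervention variable I to Y must pass through X — can be checked mechanically on a causal graph. The two example graphs are hypothetical illustrations of my own, not reconstructions of Woodward's four cases.

```python
# Minimal sketch: check condition (I3) -- every directed path from the
# intervention variable I to Y goes through X -- on a causal graph given
# as a list of (parent, child) edges. The graphs are hypothetical examples.

def directed_paths(edges, start, end, path=None):
    """All directed paths from start to end (assumes an acyclic graph)."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    nexts = [b for (a, b) in edges if a == start and b not in path]
    return [p for n in nexts for p in directed_paths(edges, n, end, path)]

def satisfies_I3(edges, i="I", x="X", y="Y"):
    """True if every directed path from I to Y passes through X."""
    return all(x in p for p in directed_paths(edges, i, y))

# Case 1: I affects Y only through X -- (I3) holds.
g1 = [("I", "X"), ("X", "Y")]
# Case 2: I also affects Y via Z, bypassing X -- (I3) fails.
g2 = [("I", "X"), ("X", "Y"), ("I", "Z"), ("Z", "Y")]

print(satisfies_I3(g1), satisfies_I3(g2))  # True False
```

An analogous check for (I4) would test whether I is statistically independent of every cause of Y that lies on a path bypassing X.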

32 Causality Hoover insists that it is impermissible “to draw causal arrows from parameters to other parameters and variables” (Hoover, 2012, p. 781), and that drawing such arrows is against his “representational conventions” (Hoover, 2010, p. 391). Hoover refers to these conventions when discussing an example that Cartwright (2007, p. 207) thinks is a counterexample to his parameter-nesting condition. I’m not sure whether these conventions make sense: if we can control a parameter directly to control a variable indirectly, then that parameter is causally relevant to (or a typelevel cause of) that variable (Hoover admits that much when agreeing that an intervention on a parameter-intervention variables satisfies Woodward’s condition (I1)); and if that parameter is causally relevant to that variable, then why shouldn’t we be allowed to draw a causal arrow from that parameter to that variable (i.e., to understand that parameter as corresponding to an arrow-emitting node in a causal graph)? I would also like to emphasize, however, that the above argument does not rely on a violation of Hoover’s conventions. When avoiding the language of causal graphs it says that on Hoover’s account, an intervention on a parameterintervention variable needs to satisfy (I3) and (I4) because that parameterintervention variable is causally relevant if it satisfies (I1), and because X might not qualify as a direct type-level cause of Y if the parameter-intervention variable type-level causes or is type-level caused by a variable (or set of variables) Z that type-level causes Y. If an intervention on a parameter-intervention variable needs to satisfy (I1), (I3), and (I4) in order for X to directly type-level cause Y, then an important question relates to the exact relationship between Hoover’s account and the macroeconomic variant of Woodward’s account. In this book, I can only offer a conjecture (instead of a formal proof). Both accounts consist of essentially three components: one component guaranteeing that the relation of type-level causation between X and Y is direct (the reformulation of the phrase “while all other variables …” and Simon’s hierarchy condition), one component specifying modularity (the reformulation of Woodward’s modularity condition and Simon’s condition of privileged parameterization), and one component characterizing the notion of an intervention (Woodward’s definitions of ‘intervention’ and ‘intervention variable’ in terms of (I1), (I3), and (I4) and Hoover’s characterization of direct parameter control that he mistakenly believes can get along without conditions (I3) and (I4)). The conjecture is that both accounts are equivalent, once the notion of an intervention is not restricted to either interventions on parameters (variables in Π) or interventions on variables in V. I sufficiently motivated that conjecture with respect to the third component. It is also sufficiently motivated with respect to the first component: while Simon’s hierarchy condition is meant to guarantee direct causation in the case of systems of equations, Woodward’s clause “while all other variables …” is more general and guarantees direct causation also in more garden-variety cases. How about the modularity-specifying component? Hoover (2012, p. 782) agrees that Simon’s condition of privileged parameterization is a modularity condition. He says, more specifically, that the condition that parameters be variation-free is a modularity condition. But the condition that parameters be variation-free is a precondition

What is Macroeconomic Causality?  33 for their being privileged. Simon’s condition of privileged parameterization may therefore be understood as a modularity condition. Hoover (2012, p. 782) also suggests that his (or Simon’s) and Woodward’s modularity conditions are different. But while Hoover’s (or Simon’s) modularity condition says that parameters are unconstrained, Woodward’s modularity condition says that functions (or the equations expressing them) are unconstrained. And remember from the definition of the notion of a causal model in Section 2.2 that the parameters of that model assign these functions to each of the Yi. So how can Woodward’s and Hoover’s modularity conditions be different? 2.5 The Potential Outcome Approach to Macroeconomic Causality The third definition of membership in DC(Yi) can be extracted from the potentialoutcome approach that Angrist and Kuersteiner (2011) have introduced into macroeconomics more recently. It says that (III) X is a direct type-level cause of Y if and only if a ΣZP(y⎹x, z)⋅P(z) ≠ 0, b potential outcomes of Y are probabilistically independent of X given Z, and c all variables in DC(Y)\X remain fixed by intervention, where the expression in (a) measures the causal effect of X on Y, where (b) is the selection-on-observables (SOA), and where (c) is meant to ensure that (III) defines the notion of direct type-level causation. It is to be conceded that Angrist and Kuersteiner’s approach contains only conditions (a) and (b), and that conditions (a) and (b) combine to define the notion of total cause, not that of direct type-level cause. But the purpose of the present section is to consider whether Angrist and Kuersteiner’s approach might pass for a valid account of direct type-level causation in macroeconomics. And that consideration requires that their account be translated into an account of direct type-level causation. Fortunately, condition (c) is exactly what is needed to achieve that translation (cf. end of Section 2.3). In his discussion of the literature on potential outcomes, Pearl (22009, p. 80) refers to the conditioning set Z of variables in the SOA as “admissible” or “deconfounding.” He argues that researchers need a workable criterion to guide their choice of the covariates that need to be included in that set. And he claims that the concept of the SOA “falls short of providing researchers with a workable criterion to guide the choice of covariates” (Pearl, 22009, p. 79). Pearl’s claim is put forth as a general criticism of the potential outcome framework. But it clearly applies to the monetary-policy model from Section 2.2: all that Angrist and Kuersteiner (2011, p. 729) remark with respect to the SOA that they think holds for Zt, ∆FFt and potential outcomes of ∆GDPt+j is that it holds “after appropriate conditioning”; and all they do to justify their choice of covariates is to conduct a series of diagnostic (or misspecification) tests on a number of causal models for the process determining ∆FFt. The criterion that Pearl (22009, p. 79) himself provides is the “back-door criterion.” He says that Z satisfies the backdoor criterion relative to an ordered pair

of variables (X, Y) in a causal graph if Z (i) doesn't include any descendants of X and (ii) blocks every path between X and Y that contains an arrow into X, where Z is said to "block" a path p if p contains at least one arrow-emitting node that is in Z or at least one collision node that is not in Z and has no descendants in Z (cf. Pearl, 2009, pp. 16–7). Pearl (2009, p. 80) uses the following graph to illustrate the backdoor criterion:


Figure 2.2 {Z3, Z4} and {Z4, Z5} satisfy the backdoor criterion; {Z6} and {Z4} don’t (cf. Pearl 2009, p. 80).
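To make the criterion concrete, here is a minimal sketch (in Python) that checks the backdoor criterion for the conditioning sets discussed in the next paragraph. The edge list is my reconstruction of the graph from the paths mentioned in the text; it should be read as an assumption rather than as Pearl's original figure.

```python
# A small sketch that checks Pearl's backdoor criterion on the graph of
# Figure 2.2. The edge list is reconstructed from the paths discussed in
# the text; no causal-inference library is assumed.

EDGES = [("Z1", "Z3"), ("Z1", "Z4"), ("Z2", "Z4"), ("Z2", "Z5"),
         ("Z3", "X"), ("Z4", "X"), ("Z4", "Y"), ("Z5", "Y"),
         ("X", "Z6"), ("Z6", "Y")]

PARENTS, CHILDREN = {}, {}
for a, b in EDGES:
    CHILDREN.setdefault(a, set()).add(b)
    PARENTS.setdefault(b, set()).add(a)

def descendants(node):
    out, stack = set(), [node]
    while stack:
        for c in CHILDREN.get(stack.pop(), ()):
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

def skeleton_paths(x, y):
    """All simple paths from x to y, ignoring edge direction."""
    paths, stack = [], [[x]]
    while stack:
        path = stack.pop()
        if path[-1] == y:
            paths.append(path)
            continue
        nbrs = CHILDREN.get(path[-1], set()) | PARENTS.get(path[-1], set())
        stack.extend(path + [n] for n in nbrs if n not in path)
    return paths

def blocked(path, Z):
    """Pearl's blocking rule applied to a single path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = prev in PARENTS.get(node, set()) and nxt in PARENTS.get(node, set())
        if is_collider:
            if node not in Z and not descendants(node) & Z:
                return True   # collider outside Z with no descendant in Z
        elif node in Z:
            return True       # non-collider (chain or fork) inside Z
    return False

def backdoor(Z, x="X", y="Y"):
    Z = set(Z)
    if Z & descendants(x):                       # condition (i)
        return False
    for p in skeleton_paths(x, y):               # condition (ii)
        if p[1] in PARENTS.get(x, set()) and not blocked(p, Z):
            return False
    return True

for Z in [{"Z3", "Z4"}, {"Z4", "Z5"}, {"Z6"}, {"Z4"}]:
    print(sorted(Z), backdoor(Z))
# Expected: {Z3, Z4} and {Z4, Z5} -> True; {Z6} and {Z4} -> False
```

The printed results match the verdicts defended in the following paragraph.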

In this graph, {Z3, Z4} and {Z4, Z5} satisfy the backdoor criterion because they do not include any descendants of X, and because they block every path between X and Y that contains an arrow into X. It is clear that {Z6} does not satisfy the backdoor criterion because it is a descendent of X. Note, however, that {Z4} by itself doesn’t satisfy the back-door criterion either: it blocks the path X ← Z3 ← Z1 → Z4 → Y because the arrow-emitting node is in Z; but it does not block the path X ← Z3 ← Z1 → Z4 ← Z2 → Z5 → Y because none of the arrow-emitting nodes (Z1, Z2) is in Z, and because the collision node Z4 is not outside Z. Pearl (22009, pp. 80–1) then proves the proposition that P(y⎹do(x)) = ΣZP(y⎹x, z)⋅P(z) if and only if Z satisfies the backdoor criterion, where P(y⎹do(x)) is the probability that Y = y if X is set to x by intervention. For the case of Angrist and Kuersteiner’s monetary-policy model from Section 2.2, all this means that ∆FFt (changes in the intended federal funds rate) cannot directly type-level cause ∆GDPt + j (changes in real GDP) unless ΣZtP(∆gdpt + j ⎹∆fft, zt)⋅P(zt) ≠ 0 and ∆GDPt + j are probabilistically independent of ∆FFt given Zt; that ∆GDPt+j cannot be probabilistically independent of ∆FFt given Zt unless Zt is admissible; and that Zt cannot be admissible unless the variables included in Zt (lagged, present, and predicted values of ∆GDPt + j; inflation; and the unemployment rate) satisfy the back-door criterion: unless Zt doesn’t include any descendants of ∆FFt and blocks every path between ∆FFt and ∆GDPt + j that contains an arrow into ∆FFt. It is true that Angrist and Kuersteiner do not explicitly endorse the backdoor criterion as a criterion to guide their choice of the variables in Zt. But Pearl proves

What is Macroeconomic Causality?  35 that ΣZP(y⎹x, z)⋅P(z) cannot measure the causal effect of X on Y unless the variables in Z satisfy the backdoor criterion. It is accordingly fair to say that (III) cannot qualify as an adequate definition of ‘direct type-level causation’ unless the variables included in Z satisfy the backdoor criterion. Since Angrist and Kuersteiner do not explicitly endorse the backdoor criterion and condition (c) in (III) does not figure explicitly in their approach, it is questionable whether (III) can be “extracted” from their approach. But the approach to which (III) is central should be given a name. And since Angrist and Kuersteiner (2011) are the ones to have introduced the potential outcome approach into macroeconomics, I will continue to refer to (III) as central to Angrist and Kuersteiner’s approach. It is important to understand that the backdoor criterion implies Woodward’s conditions (I1)–(I4). For Pearl (22009, pp. 70–1), do(x) amounts to setting X to x by manipulating an intervention variable I and by breaking all arrows directed into X and departing from variables other than I. For Pearl, that is, do(x) requires that Woodward’s conditions (I1) and (I2) be satisfied. Condition (ii) of the backdoor criterion, moreover, rules out the same cases as Woodward’s conditions (I3) and (I4). The cases that Woodward’s conditions (I3) and (I4) are meant to rule out correspond to the four graphs in Figure 2.1 (cf. end of Section 2.4). These are cases in which it is impossible to infer that X directly type-level causes Y if Z is unknown. These are also cases in which Z blocks the paths between X and Y that contain an arrow into X if it is known. Therefore, conditioning on Z (knowing the value of Z) rules out the same cases as conditions (I3) and (I4). The backdoor criterion is a bit stronger than Woodward’s conditions (I1)–(I4) since condition (i) of the backdoor criterion rules out all cases in which arrows are directed into Z and depart from X. But the backdoor criterion represents a set of conditions that includes Woodward’s conditions (I1)–(I4). One may accordingly say that the backdoor criterion implies these conditions. If the backdoor criterion implies Woodward’s conditions (I1)–(I4), then an important question relates to the exact relationship between the macroeconomic variant of Woodward’s interventionist account and Angrist and Kuersteiner’s potential-outcome approach. As in the case of the relation between that variant and Hoover’s account, I can only offer a conjecture. Angrist and Kuersteiner’s potential-outcome approach consists of essentially two components: one component guaranteeing that the relation of type-level causation between X and Y is direct (condition (c) in definition (III)), and one component characterizing the notion of an intervention (conditions (a) and (b) together with the equality of P(y⎹ do(x)) and ΣZP(y⎹ x, z)⋅P(z), given that Z satisfies the backdoor criterion). A component specifying modularity is missing because (9) expresses the only functional relationship considered. But such a component could easily be added by requiring e.g. like Woodward that setting X to x by intervention do not disrupt any other functional relationships. The conjecture is that the potential-outcome approach will reduce to the macroeconomic variant of Woodward’s account if two conditions are dropped: condition (i) of the backdoor criterion and Woodward’s condition (I2). 
The loss of (I2) can be compensated if an intervention that holds fixed a variable in DC(Y)\X is

interpreted as a manipulation of an intervention variable for W ∈ DC(Y)\X with respect to Y that breaks all arrows directed into W and departing from variables other than that intervention variable. And the loss of condition (i) of the backdoor criterion can be compensated if acyclicity is accepted as a general property of the causal graphs that can be taken to represent relations of direct type-level causation in macroeconomics. Pearl (2009, p. 339) points out that estimates of ΣZP(y⎹ x, z)⋅P(z) may be biased if the variables in Z do not satisfy condition (i) of the backdoor criterion. He uses the following graph to illustrate that point:


Figure 2.3 (notation slightly modified). Violating condition (i) of the backdoor criterion when fine-graining (cf. Pearl 2009, p. 339).

This graph is supposed to be a fine-grained variant of X → W1 → W2 → W3 → Y. If Z1 is included in Z, then estimating ΣZP(y⎹ x, z)⋅P(z) will be biased because W1 is a collider, i.e., because there will be a dependency between X and U1 that in the coarse-grained variant of the graph, acts like a backdoor path: X ↔ U1 → W1 → W2 → W3 → Y. Pearl (22009, p. 339) suggests that condition (i) of the backdoor criterion is meant to rule out such backdoor paths. But he also admits that the presence of descendants of X does not necessarily bias estimates of ΣZP(y⎹ x, z)⋅P(z): that e.g. a descendant of X that is not also a descendant of some Wi or Ui (with i > 0) can safely be conditioned on without introducing bias. If the presence of descendants of X does not necessarily bias estimates of ΣZP(y⎹ x, z)⋅P(z), then condition (i) of the backdoor criterion is unnecessarily strong. That condition becomes even superfluous once acyclicity is accepted as a general property of the causal graphs that can be taken to represent relations of direct type-level causation. If acyclicity is accepted as a general property, then a causal graph like X ↔ U1 → … is inadmissible from the start. In macroeconomics, the causal graphs that can be taken to represent causal models are typically acyclic. Simultaneous equation models require representation by cyclical causal graphs (cf. Pearl, 22009, pp. 28, 215, for an example from economics) but do not qualify as causal in the sense of the definition stated at the beginning of Section 2.2. Condition (i) of the backdoor criterion is therefore not a necessary condition for direct type-level causation in macroeconomics.
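The point about conditioning on descendants of X can be illustrated with a small simulation. The sketch below assumes a linear Gaussian chain X → W → Y plus a descendant D of X that lies off the causal path; this data-generating process is an illustrative assumption of mine, not a model taken from Pearl or from the macroeconomic literature.

```python
# Hedged simulation sketch: in a chain X -> W -> Y, conditioning on D, a
# descendant of X that is not a descendant of W, leaves the estimated total
# effect of X on Y essentially unbiased, whereas conditioning on the
# mediator W itself destroys it.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=n)
D = X + rng.normal(size=n)            # descendant of X only (off the causal path)
W = X + rng.normal(size=n)            # mediator on the causal path
Y = W + rng.normal(size=n)            # total effect of X on Y is 1

def ols_coef_on_x(controls):
    """OLS coefficient on X in a regression of Y on X plus controls."""
    Z = np.column_stack([X] + controls + [np.ones(n)])
    beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return beta[0]

print(round(ols_coef_on_x([]), 2))    # ~1.0: unadjusted estimate
print(round(ols_coef_on_x([D]), 2))   # ~1.0: conditioning on D is harmless
print(round(ols_coef_on_x([W]), 2))   # ~0.0: conditioning on the mediator removes the effect
```

The harmlessness of conditioning on D is exactly the concession Pearl makes; the last line shows why descendants that sit on the causal path (the Wi) must nonetheless be excluded.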

What is Macroeconomic Causality?  37 2.6 An Adequate Account of Macroeconomic Causality The analysis of the preceding three sections suggests that a macroeconomic variant of Woodward’s interventionist account qualifies as an adequate account of macroeconomic causality: that in macroeconomics, an adequate definition of ‘direct type-level causation’ says that (*) X directly type-level causes Y if and only if there is a possible intervention on X that changes Y (or its probability distribution) while all causal parents of Y except X remain fixed by intervention, where - IX’s assuming some value zi is an intervention on X with respect to Y if and only if IX is an intervention variable for X with respect to Y and IX = zi is an actual (token-level) cause of the value taken by X. - IX is an intervention variable for X with respect to Y if and only if the following three conditions hold: (IX1) IX type-level causes X, (IX3) any directed path from IX to Y goes through X, and (IX4) IX is statistically independent of any variable Z that type-level causes Y and is on a directed path that does not go through X. - IW’s assuming some value zi is an intervention on W ∈ DC(Y)\X with respect to Y if and only if IW is an intervention variable for W with respect to Y and IW = zi is an actual (token-level) cause of the value taken by W. - IW is an intervention variable for W with respect to Y if and only if the following four conditions hold: (IW1) IW type-level causes X, (IW2) certain values of IW are such that when IW attains these values, W is no longer determined by other variables that type-level cause it but only by I W, (IW3) any directed path from IW to Y goes through W, and (IW4) IW is statistically independent of any variable Z that type-level causes Y and is on a directed path that does not go through W. - an intervention variable is either a parameter (i.e., a variable in Π) or a variable in V. It is easy to see that this definition coincides with definition (Ia) or (Ib), except that “all other variables …” is replaced with “all causal parents of Y except X,” that interventions on X and interventions on variables in DC(Y)\X are interpreted differently, and that the notion of an intervention is no longer restricted to interventions on variables in V. I argued in Section 2.3 that interventions on X and interventions on variables in DC(Y)\X need to be interpreted differently because Woodward’s account cannot

38 Causality accommodate cases of macroeconomic models that are subject to nonlinear crossequation restrictions unless (I2) is dropped from the conditions that Woodward lists to define the term ‘intervention variable for X with respect to Y.’ I also argued in Section 2.3 that dropping condition (I2) has the unwelcome consequence of rendering the relation of type-level causation between X and Y indirect unless Woodward’s phrase “while all other variables …” is replaced with “while all causal parents of Y except X …,” and unless interventions that hold fixed all causal parents of Y except X are interpreted as manipulations of intervention variables that satisfy conditions (IW1)–(IW4). I pointed out in Section 2.4 that allowing intervention variables to include parameters is necessary because in macroeconomics, it is sometimes legitimate to understand interventions as interventions on parameters. Definition (*) is an adequate definition of ‘direct type-level causation’ in macroeconomics. I argued toward the end of the preceding section that it doesn’t coincide with the definition that can be extracted from the potential-outcome approach that Angrist and Kuersteiner (2011) have introduced into macroeconomics more recently: that the latter is inadequate as a definition of ‘direct type-level causation’ in macroeconomics, and that it reduces to the former if Woodward’s condition (I2) and condition (i) of the backdoor criterion are dropped. But I also conjectured toward the end of Section 2.4 that definition (*) coincides with Hoover’s definition (IIa), as long as the notion of intervention that is basic to Hoover’s definition is not restricted to interventions on parameters (or variables in Π). By way of conclusion, I’d like to point to a common feature of all three definitions: their circularity. Causal vocabulary (‘cause,’ ‘intervention,’ ‘control,’ and ‘arrow’) shows up on both sides of the biconditionals expressed by (IIa), (III), and (*). While this is obvious in the case of (*), it is less obvious in the cases of (IIa) and (III). Hoover (2013, p. 52) claims at one point that “the structural account does not actually use the notion of direct control to define causal order.” But that claim is true only as long as Simon’s condition of privileged parameterization is not spelled out. Hoover is aware, of course, that spelling out that condition requires the notion of direct control. In a different context, he even says that “[i]nvariance of the functional forms in the face of specific interventions is […] the hallmark of a true causal representation” (cf. Hoover, 2011, p. 345). One might likewise claim that the potential-outcome approach doesn’t actually use any causal vocabulary to define the notion of direct type-level causation. But similarly, this claim is true only as long as the backdoor criterion isn’t spelled out as a condition that guides the choice of covariates. Definitional circularity is arguably a problem for philosophers who aim to develop reductive accounts of causality: theories that Glymour (2004, pp. 779–80) refers to as “Socratic” (not without a sense of irony, of course) and that he distinguishes from Euclidean theories. Woodward (2003, pp. 104–5), however, makes it clear that he aims to develop a Euclidean, and not a Socratic theory: that he is interested in the conceptual entanglement between causation and intervention, and not in any non-circular definition or reductive account of direct type-level causality.11 And similarly, Hoover (2001, p. 
42) claims that “[circularity] is less troubling epistemologically than it might seem to be ontologically.”

What is Macroeconomic Causality?  39 Notes 1 Hoover (2004, p. 161) mentions one such reason when stating that this tendency “is a corollary to the rise of mathematical formalism – especially the dominance of Walrasian general-equilibrium models.” 2 In this book, I will sometimes use uppercase letters for variables and lowercase letters for their values. This convention is a bit unusual for (macro-) economists but widespread in the philosophical literature on causality. 3 Hoover (2001, p. 64) points out that the case in which the influence of real GDP and the real interest rate is encompassed by a constant is a case of hyperinflation, i.e. a case in which the inflation rate is so large relative to changes in the real interest rate and real incomes that it is reasonable to defer them as constants to the causal background. The case of hyperinflation isn’t the norm but will be assumed here for purposes of simplification. 4 In Hoover’s model, Mt, Pt and tPet + 1 are in natural logarithms. In macroeconomics, variables are usually lowercased when written in logarithms. In order to maintain the distinction between variables and their values, however, uppercase letters will be reserved for variables (no matter whether or not they are in logarithms) and lowercase letters for their values throughout the book. 5 An exposition of the model can be found in Sbordone et al. (2010). I will analyze the assumptions of the model in closer detail in Section 3.3 of Chapter 3. 6 For extensions of the canonical model, consider e.g. the DSGE models mentioned in the introduction: Smets and Wouters (2003) and Christiano, Motto, and Rostagno (2010). 7 Angrist and Pischke (2009, p. 54) refer to the same assumption as “conditional independence assumption.” In the present context, however, it is more appropriate to use the term “selection-on-observables assumption” because the assumption in question might otherwise be confused with the assumption that Angrist and Kuersteiner (2011, p. 729) refer to as “the key testable conditional independence assumption,” i.e., with an assumption that involves actual, not potential outcomes. 8 For a more complete argument in favor of this de-relativized variant, cf. Henschen (2015, Section 3). 9 Sen (1973: p. 242) provides a long list of quotations that testify to the worries about introspection and verifiability that motivated revealed-preference theory. 10 In these passages, the terms “Simon’s framework” and “structural account” must refer to (IIa) because neither article is concerned with the parameter-nesting condition. 11 I should mention, however, that Glymour (2004, pp. 785, 790) also argues that the notion of a causal graph is basic, and that the notion of an intervention should be defined in terms of it.

References

Angrist, J. D. and Kuersteiner, G. M. (2011). "Causal Effects of Monetary Shocks: Semi-Parametric Conditional Independence Tests with a Multinomial Propensity Score." The Review of Economics and Statistics 93(3), 725–47.
Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics. Princeton: PUP.
Cartwright, N. (2007). Hunting Causes and Using Them. Cambridge: CUP.
Cartwright, N. (2009). "Causality, Invariance, and Policy." In Kincaid, H. and Ross, D. (eds.), The Oxford Handbook of the Philosophy of Economics. Oxford: OUP, 410–23.
Christiano, L. J., Motto, R. and Rostagno, M. (2010). "Financial Factors in Economic Fluctuations." ECB Working Paper Series No. 1192 (May).
Glymour, C. (2004). "Critical Note On: James Woodward, Making Things Happen." British Journal for the Philosophy of Science 55, 779–90.

Granger, C. W. J. (1980). "Testing for Causality: A Personal Viewpoint." Journal of Economic Dynamics and Control 2(4), 329–52.
Henschen, T. (2015). "Ceteris paribus Laws and the Interventionist Account of Causality." Synthese 192(10), 3297–311.
Hitchcock, C. and Woodward, J. (2003). "Explanatory Generalizations, Part II: Plumbing Explanatory Depth." Nous 37(2), 181–99.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Hoover, K. D. (2004). "Lost Causes." Journal of the History of Economic Thought 26(2), 149–64.
Hoover, K. D. (2010). "Causal Pluralism and the Limits of Causal Analysis: A Review of Nancy Cartwright's Hunting Causes and Using Them." In Emmett, R. B. and Biddle, J. E. (eds.), Research in the History of Economic Thought and Methodology. Bingley, UK: Emerald. Vol. 28A, 381–95.
Hoover, K. D. (2011). "Counterfactuals and Causal Structure." In McKay Illari, P., Russo, F. and Williamson, J. (eds.), Causality in the Sciences. Oxford: OUP, 338–60.
Hoover, K. D. (2012). "Causal Structure and Hierarchies of Models." Studies in History and Philosophy of Science Part C 43(4), 778–86.
Hoover, K. D. (2013). "Identity, Structure, and Causal Representation in Scientific Models." In Chao, H.-K., Chen, S.-T. and Millstein, R. (eds.), Towards the Methodological Turn in the Philosophy of Science: Mechanism and Causality in Biology and Economics. Dordrecht: Springer, 35–60.
Pearl, J. (2009). Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge: Cambridge University Press.
Sbordone, A. M., Tambalotti, A., Rao, K. and Walsh, K. J. (2010). "Policy Analysis Using DSGE Models: An Introduction." FRBNY Economic Policy Review 2010 (October), 23–43.
Sen, A. (1973). "Behavior and the Concept of Preference." Economica 40(159), 241–59.
Smets, F. and Wouters, R. (2003). "An Estimated Stochastic Dynamic General Equilibrium Model of the Euro Area." ECB Working Paper Series No. 171 (August).
Woodward, J. (2003). Making Things Happen: A Causal Theory of Explanation. Oxford: OUP.

3

3.1

The Ontology of Macroeconomic Aggregates

Introduction

After defending a definition of the causality of the relations that macroeconomists believe obtain between macroeconomic aggregates, I now turn to the ontology of the relata of these relations: the ontology of macroeconomic aggregates. Macroeconomic aggregates like employment, the average rate of interest on bonds, the money stock, GDP, inflation, aggregate demand, and so on are calculated from hours worked, the rate of interest on particular bonds, assets held by individual agents (as cash, checking accounts, or closely related assets), quantities and prices of specific commodities, wealth levels of individual agents and so on according to some algorithm. If expectations could be measured directly, inflation expectations, aggregate demand expectations, and so on could in principle also be calculated from the expectations that individual agents form with respect to inflation, aggregate demand, and so on in accordance with some algorithm. For the ontology of macroeconomic aggregates, this means that they fully reduce to the microeconomic quantities that agents in ‘flesh and blood’ work, hold, produce, pay, or purchase. It is true that macroeconomic aggregates “supervene” on the microeconomic quantities, of which they are composed, and that they take positions in the relations of probabilistic dependence that “emerge” from the direct interactions of these quantities (or from the direct interactions of the heterogeneous agents who work, hold, produce, pay, or purchase these quantities). But the supervenience of macroeconomic aggregates or the emergence of relations of probabilistic dependence between them does not imply that macroeconomic aggregates are ontologically independent of their microeconomic constituents to any extent. The present chapter has three objectives. Its first objective is to analyze the extent to which the leading model used for macroeconomic policy analysis (the canonical dynamic-stochastic general-equilibrium (DSGE) model) can be said to capture the ontology of macroeconomic aggregates. It goes without saying that this model cannot be expected to capture the ontology of macroeconomic aggregates to its fullest extent. But there is a difference in quality between models that model relations of probabilistic dependence between macroeconomic aggregates as emerging from the direct interactions of heterogeneous agents, and models that DOI: 10.4324/9781003094883-4

42 Causality interfere with the ontology of macroeconomic aggregates by paying no attention to the heterogeneity of individual agents or to the directness of their interactions. Models of macroeconomic fluctuations that pay no attention to the heterogeneity of agents or to the directness of their interactions are the DSGE models, a canonical version of which had been presented in Section 3.2 of the previous chapter. DSGE model equations are (log-linear approximations of) reformulations of the optimality conditions deriving from the solutions to the optimization problems of representative agents (households and firms) and a monetary authority. Representative agents are very different from the flesh-and-blood agents that work, hold, produce, pay, or purchase microeconomic quantities. They have identical and homothetic utility functions and homogeneous technologies and interact only indirectly via a price mechanism. The microeconomic quantities that these agents work, hold, produce, pay, or purchase in fact coincide with the macroeconomic aggregates that policymakers try to manipulate. Macroeconomists engaging in DSGE modeling thus eliminate the level of macroeconomic aggregates by reducing it to the level of microeconomic quantities that representative (and optimizing) agents work, hold, produce, pay, or purchase. But the relations of causal dependence that obtain between these quantities are not the ones that policymakers exploit as a matter of fact. The relations that policymakers exploit as a matter of fact include relations of supervenience and relations of downward causation between macroeconomic aggregates and microeconomic quantities. And macroeconomic DSGE models cannot represent these relations because they model macroeconomic aggregates as microeconomic quantities. Nonetheless, macroeconomists use DSGE models to justify policy interventions: interventions that manipulate aggregate quantity X to influence aggregate quantity Y while leaving all of the other aggregate quantities (largely) unchanged. Aggregate quantity X cannot be manipulated in this way unless X directly typelevel causes aggregate quantity Y. So how can macroeconomists use DSGE models to justify surgical policy interventions? In macroeconomic DSGE modeling, the assumption of relations of direct type-level causation between aggregates derives from the assumption that the parameters that figure in DSGE model equations are “deep”: that these parameters are identified in microeconomic theory (in the utility function of a representative household, in the price-setting behavior of firms, and in the interest-rate rule followed by the central bank). Identifying these parameters in microeconomic theory is meant to ensure that the model equations remain invariant to manipulations of independent variables: that setting X to a particular value does not break the equation connecting X to Y. But the invariance of the model equations will be questionable if microeconomic theory turns out to be out of touch with the ontology of macroeconomic aggregates. The second objective of the chapter is to discuss Hoover’s claim that macroeconomic aggregates emerge: that they do not fully reduce to the microeconomic quantities, of which they are composed (that they are ontologically independent of these quantities to some extent). Hoover derives his claim from a disjunction of four premises: (1) the most important macroeconomic aggregates are dimensionally

The Ontology of Macroeconomic Aggregates  43 distinct from the microeconomic quantities, of which they are composed; (2) complete characterizations of the microeconomic must include characterizations of the macroeconomic; (3) macroeconomic aggregates can be manipulated; and (4) the strategy of idealization in macroeconomic model construction is empirically successful. I will argue against (3) that there are aggregates, of which we cannot know whether they can be manipulated (aggregates of the expectations that individuals form with respect to the behavior of all kinds of variables that matter to them), and against (4) that the strategy of idealization is empirically unsuccessful if the strategy is supposed to result in a model representing relations of direct type-level causation between macroeconomic aggregates. But the more general problem with the four premises is that they do not imply Hoover’s claim, either alone or together. The dimensional distinctness that he diagnoses for the general price level and its microeconomic constituents can be found at the microeconomic level when only two commodities and their prices are considered. Macroeconomic aggregates are calculated from their microeconomic constituents according to some algorithm; thus, even macroeconomic aggregates, characterizations of which need to be included in complete characterizations of the microeconomic, fully reduce to their microeconomic constituents. And the properties of manipulability and direct typelevel causation are likely to be found among the microeconomic constituents of a macroeconomic aggregate if they are found at the level of the aggregate. The third objective of the chapter is to draw conclusions about the program of microfoundations that I think macroeconomists should adopt. The program of microfoundations is typically understood as the program of eliminating macroeconomics by reducing it completely to microeconomic theory and the representativeagent assumption. If it is understood in this sense, it should be rejected. But it should be accepted if it is understood as the program of providing macroeconomics with empirical microfoundations. The principal reason why I think that macroeconomists cannot get along without empirical microfoundations is that up to now the interplay between macroeconomic aggregates and individual expectations is poorly understood. The chapter will pursue its three objectives by first analyzing the notions that are central to the ontology of macroeconomic aggregates: the notions of reduction, supervenience, and emergence (Section 3.2). The chapter will then investigate the extent to which DSGE models can be said to capture the ontology of macroeconomic aggregates (Section 3.3). Next, the chapter will discuss Hoover’s claim that macroeconomic aggregates do not fully reduce to their microeconomic constituents (Sections 3.4 and 3.5). The chapter will finally argue in favor of a program of empirical microfoundations (Section 3.6). 3.2

Reduction, Supervenience, and Emergence

The national bureaus of statistics calculate macroeconomic aggregates from the microeconomic quantities, of which they are composed, according to some algorithm: aggregate employment is the sum of the hours that employed workers

worked; the average rate of interest on bonds is the percentage that bonds or loans of similar maturity (3 months, 6 months, 1 year etc.) yield on average; and nominal GDP and aggregate demand are the sum of the final goods and services produced in a specific period, multiplied by their respective prices, or alternatively, of the incomes of each individual in the economy. The money stock is the sum of assets held as cash, checking accounts, and closely related assets. In the Euro area, there are three main money stocks (or monetary aggregates): M1, M2, and M3. M1 (or narrow money) comprises currency (i.e., banknotes and coins) and overnight deposits (i.e., balances that can be immediately converted into currency or used for cashless payment). M2 (or intermediate money) includes M1 and deposits with a maturity of up to two years and deposits redeemable at notice of up to three months. M3 (or broad money) comprises M2 and marketable instruments issued by the monetary financial institutions sector. As we move from M1 to M3, the liquidity of the assets decreases, while their interest increases. Since M3 is less affected by substitution between various liquid assets than M1 and M2 (and therefore more stable), the ECB targets M3 as an instrument of monetary control. The general price level is the weighted average of percentage rates of changes in disparate prices. There is an irreducible degree of arbitrariness in one's choices of weights, and different choices of weights lead to different general price level definitions. In these definitions, it is somewhat natural to choose the quantities consumed in a given period as weights of the respective percentage rates of change. But quantities from which consumption period are we supposed to select as weights? If we choose only quantities from the base period, we subscribe to the Laspeyres index:

PtL = Σni=1 pit ⋅ xit−1 / Σni=1 pit−1 ⋅ xit−1,

where pit is the price of commodity i in current period t, pit−1 the price of commodity i in base period t − 1, and xit−1 the quantity of commodity i in the base period. If, by contrast, we choose only quantities from the current period, we go along with the Paasche index:

PtP = Σni=1 pit ⋅ xit / Σni=1 pit−1 ⋅ xit,

where xit is the quantity of commodity i in current period t. There is, moreover, a potentially infinite number of indices lying between the Laspeyres and Paasche indices: one forming the arithmetic mean, another the geometric mean of both indices; a third weighting prices with the arithmetic mean of the quantities of the base and current period; a fourth weighting prices with the arithmetic mean of the quantities of all periods etc. Inflation, real GDP, the real interest rate, and so on are defined in terms of the general price level. Inflation is the rate of change in the general price level: πt ≡ (Pt − Pt−1)/Pt−1,

The Ontology of Macroeconomic Aggregates  45 where πt is inflation, Pt the general price level of today’s period, and Pt-1 the general price level of the last period (of last year, for instance); real GDP is nominal GDP divided by the general price level; the real interest rate is the nominal interest rate minus the rate of inflation, and so on. The definitional arbitrariness of the general price level spreads, of course, to all aggregates that are defined in terms of the general price level. The case of inflation makes it clear that definitional arbitrariness can be exploited for policy purposes. In many countries, many payments (e.g. social security payments) are “indexed to inflation” (which means that nominal payment is adjusted for inflation to keep the real payment constant). Suppose that a government announces that the official price index overstates inflation, and that it legislates a correction of 1%. This government can claim to be paying the same real benefits while reducing indexed payments by 1% less the first year, by 2% less the second year, and so on. And it is difficult to see how this claim could be shown to be false.1 Expectations cannot be measured directly. But if they could, then the national bureaus of statistics could also calculate inflation expectations, aggregate demand expectations, and so on from the expectations that individual agents form with respect to inflation, aggregate demand, and so on in accordance with some algorithm: then aggregate demand and inflation expectations would be the arithmetic mean of the expectations that each individual agent forms with respect to aggregate demand and inflation. So macroeconomic aggregates are calculated from the microeconomic quantities, of which they are composed, in accordance with some algorithm. For the ontology of macroeconomic aggregates, this means that they reduce to the microeconomic quantities, of which they are composed. Whether they fully reduce to these quantities, or whether they are ontologically independent of these quantities to some extent, is a question that will be discussed in Section 3.5. In the present section, I want to list the notions that are central to the ontology of macroeconomic aggregates. Besides the notion of reduction, these are the notions of (asymmetric) supervenience and emergence. To say that Y supervenes (asymmetrically) on X = {X1, …, Xn} means that (i) X and Y are not spatiotemporally distinct, that (ii) X determines Y, and that (iii) changes to Y necessitate changes to X but not vice versa.2 Macroeconomic aggregates clearly supervene on their microeconomic constituents in this sense. Consider nominal GDP. Nominal GDP and the final goods and services, of which it is composed, are not spatiotemporally distinct: nominal GDP in the Euro area at t is the sum of the final goods and services produced in the Euro area at t. The final goods and services determine nominal GDP. If nominal GDP changes, this must be because there are more or less final goods and services. But the values of final goods and services can be changed without changing the value of nominal GDP (the value of GDP at t doesn’t change if pit ⋅ xit = $1M and pjt ⋅ xjt = $2M, and if the values of xit and xjt are changed, such that pit ⋅ xit = $2M and pjt ⋅ xjt = $1M). The same holds true for the general price level and the macroeconomic aggregates that are defined in terms of the general price level. Consider the general price level. The general price level is indexed by ‘t’ but spans the whole period from the

46 Causality base period (t − 1) to the current period (t). Therefore, the general price level and the microeconomic quantities, of which it is composed, are not spatiotemporally distinct: the general price level in the Euro area at t is the average of percentage rates of changes in disparate prices paid in the Euro area at t and t − 1, weighted by quantities of commodities purchased in the Euro area at t and t − 1. The disparate prices and quantities determine the general price level. If the general price level changes, this must be because these quantities and prices have changed. But the values of quantities and prices can be changed without changing the value of the general price level (the value of Pt P doesn’t change if pit ⋅ xit = $1M and pjt ⋅ xjt = $2M, and if the values of quantities and prices are changed, such that pit ⋅ xit = $2M and pjt ⋅ xjt = $1M). To say that Y emerges from X means that Y is the order that a complex system produces and that X is the order of that system itself.3 The order of a complex system itself is characterized by the direct interactions between numerous system components that lack correlations (or are hardly correlated) over space and time at any scales. The direct interactions between system components are direct exchanges of energy or matter. These direct interactions feed back into the system in the sense that later interactions depend on earlier ones. They, moreover, occur in thermal or dynamic non-equilibrium, where a system is in thermal nonequilibrium if it imports energy or matter, and in dynamic non-equilibrium if it is not in a ‘steady state’ (if system behavior continues to change significantly over time). The order that a complex system produces is, by contrast, order in the sense of the dynamics, properties, laws, or forms of invariance and universal behavior that are absent from the order of the system itself, and that arise from this order “spontaneously,” i.e., without external control. Condensed matter physics is often described as the scientific study of how the macroscopic properties of liquids and solids emerge from the basic physics that describes their constituent parts. Perhaps the most telling example is water. The order of water itself is characterized by the direct interactions that take place between myriads of uncorrelated (or hardly correlated) H2O molecules, depend on earlier interactions and occur in thermal and dynamic nonequilibrium.4 The order that water produces is, by contrast, order in the shape of its macroscopic properties (or aggregate states) and the forms of invariance and universal behavior that it exhibits when passing from one aggregate state to another. These forms of invariance and universal behavior are, in particular, the phase transitions that water undergoes when passing through critical points5, and the scaling laws that describe the configuration of water at precisely these critical points.6 Like water, the economy needs to be understood as a complex system. The order of the economy itself is characterized by the direct interactions that take place between myriads of heterogeneous and uncorrelated (or hardly correlated) agents, occur in dynamic non-equilibrium,7 and feed back into the system in the sense that later interactions depend on earlier ones. The order that the economy produces is, by contrast, order in the shape of the conditional and unconditional probability distributions of the values of specific micro- or macroeconomic variables. These

The Ontology of Macroeconomic Aggregates  47 distributions include the cross-correlations between macroeconomic variables, the phase transitions that some macroeconomic aggregates (such as the bad debt of firms or the value of stocks) undergo when passing through critical points, and the scaling laws that describe the behavior of these aggregates at precisely these critical points. In this book, I won’t be able to elaborate on the complexity aspects of the economic system any further. In Section 3.5, I will discuss whether the order that the economy produces can be inferred to contain relations of direct type-level causation between macroeconomic aggregates. What I would like to emphasize at this point is that it is not aggregates of system components that emerge: not aggregates of H2O molecules, and not aggregates of the microeconomic quantities that fleshand-blood agents work, hold, produce, pay, or purchase. What emerges are dynamics, properties, laws, or forms of invariance and universal behavior: macroscopic properties (or aggregate states), phase transitions and scaling laws in the case of water, and specific shapes of the probability distributions of specific micro- and macroeconomic variables, cross-correlations between macroeconomic variables, phase transitions and scaling laws in the case of the economy. I would also like to emphasize that emergence does not imply that emerging entities are ontologically independent of the system components from which they emerge to any extent. Metaphysicians may feel free to assume that emerging entities are ontologically independent to some extent. But we ultimately cannot know whether that assumption is true or false. What emergence does imply in some scientific disciplines is that the dynamics of the emerging entities can be studied, while the dynamics of the underlying system components are largely ignored. This implication applies, for instance, to condensed matter physics, where lower-dimensional equations describe the dynamics of macroscopic properties, while the dynamics of the microscopic properties can be largely ignored. The ideal gas law, for example, uses only three degrees of freedom (pressure, volume, and temperature) to describe a system that really has degrees of freedom in the order of 1023. The question is whether the same implication applies to macroeconomics: whether the dynamics of macroeconomic aggregates can be studied, while the dynamics of the direct interactions between heterogeneous agents in dynamic non-equilibrium are largely ignored. I will defend a negative answer in Section 3.6. 3.3 The Canonical Macroeconomic DSGE Model I now want to revisit the canonical DSGE model that I presented in Section 3.2 of the preceding chapter. Remember that macroeconomic DSGE models are dynamic models built from microeconomic foundations with explicit assumptions about the behavior of the underlying shocks. They are dynamic-stochastic to the extent that present levels of macroeconomic aggregates are largely determined by expectations about the future levels of these aggregates, and that the fluctuations of these aggregates are generated by random events (so-called ‘exogenous shocks’). The general-equilibrium aspect of the models implies that the economy is in dynamic

48 Causality equilibrium, or that it quickly returns to a dynamic equilibrium whenever it has been kicked out of equilibrium temporarily by an exogenous shock. The DSGE models used by central banks and other policymaking institutions tend to be sophisticated.8 But what they have in common is a basic structure that is built around three blocks: a demand block, a supply block, and a monetary policy block. While the canonical DSGE model captures this basic structure, the DSGE models used by policymaking institutions extend this structure in various ways: by adding another dimension to the demand block (the process of capital accumulation), by incorporating a labor or financial market block, and so on. The canonical DSGE model presented in Section 3.2 of the preceding chapter is the model that Sbordone et al. (2010, Sections 2 and 3) describe essentially for expository purposes. Its target system is an economy that is populated by a representative household, a continuum of firms, and a monetary authority. The members of the household work for the firms and consume the goods produced by the firms. Each of the firms is a monopolist in the production of a particular good, which means that it is able to set the price of that good. The monetary authority sets the nominal interest rate. At the core of the demand block is a negative relationship between the nominal interest rate and aggregate demand, the so-called “new Keynesian IS curve”: yt = Et yt + 1 − (it − Et πt + 1) − δt, where yt ≡ log Yt is the logarithm of aggregate demand, E is the expectations operator, it ≡ log It is the continuously compounded nominal interest rate, πt ≡ log Pt /Pt − 1 is the quarterly rate of inflation, and Δt is an exogenous shock to aggregate demand. The new Keynesian IS curve is a log-linear approximation of the so-called Euler equation, in which consumption is replaced with aggregate demand (Yt) because consumption is assumed to be the only source of aggregate demand. The Euler equation results from a combination of first-order conditions if there is no habit formation (i.e., if consumers do not form the ‘habit’ of being unhappy whenever their current consumption is low or falls much below the level of their consumption in the recent past). The first-order conditions, in turn, solve the optimization problem of the representative household: the problem of maximizing its expected discounted lifetime utility at t. At the center of the supply block is a positive relationship between inflation and aggregate output (or GDP), the so-called “new Keynesian Phillips curve”: πt = ξyt + βEt πt+1 + Ut, where πt is inflation, yt is the logarithm of aggregate output (which in equilibrium equals aggregate demand), ξ measures the sensitivity of inflation to changes in aggregate demand, β discounts profits, and Ut is an exogenous shock to inflation. The new Keynesian Phillips curve is a log-linear approximation to the firstorder condition, which solves the optimization problem of each price-setting firm:

the problem of setting an optimal price by maximizing the discounted stream of expected future profits. In each period, the nominal interest rate it is given by the following policy rule, which defines the monetary policy block of the model:

it = ρ it−1 + (1 − ρ)[rt* + π*t + φπ (πt−1 − π*t) + φy (yt − yt*)] + εti,

where Rt* , Π*t and Yt* are baselines for the real interest rate, inflation, and output, respectively, and where Eti is a ‘monetary policy shock’ or random variable with mean zero that captures any deviation of the observed nominal interest rate from the value suggested by the rule. The baselines for the real interest rate, inflation, and output are, more specifically, the levels of the real interest rate, inflation, and output that would obtain in an economy, in which firms were forced to engage in perfect competition. The baselines for inflation and output are the targets of monetary policy (i.e., levels of inflation and output that are consistent with the mandate of the US Federal Reserve). If inflation and output rise above (fall below) these baselines, the nominal interest rate is lifted (cut) over time by amounts determined by the parameters ϕπ and ϕy and at a speed determined by the parameter ρ. It is important to understand that macroeconomic DSGE models have been developed in response to the Lucas critique.9 Lucas (1976) directed his critique primarily against the causal modeling practice of the Keynesian econometricians. Considered generally (and in its most uncontroversial form), the critique says that each of the equations of models representing relations between macroeconomic aggregates needs to be “derived from decision rules […] of agents in the economy,” that “some view of the behavior of the future values of variables of concern to them […], in conjunction with other factors, determines their optimum decision rules,” and that the assumption that this view remains invariant under alternative policy rules is an “extreme assumption” (Lucas 1976, p. 25). According to Lucas, the equations of the models representing relations between macroeconomic aggregates are of “no value in guiding policy” (Lucas and Sargent 1979, p. 50) unless they are “autonomous” (or invariant to policy interventions). And they are unlikely to be autonomous because policy interventions target variables that are of concern to agents (i.e., variables representing aggregate demand, inflation, and so on), because agents form expectations with respect to these variables, and because agents are (at least partially) successful in predicting the actions of policymakers when forming these expectations. DSGE models respond to the Lucas critique because the parameters that figure in DSGE model equations are “deep,” i.e., identified in microeconomic theory. That they are identified in microeconomic theory is supposed to mean that the model equations remain invariant to manipulations of independent variables: that setting X to a particular value does not break the equation connecting X to Y. Consider the parameters that figure in the equations of the canonical DSGE model: β is identified in the utility function of the representative household, ξ in the pricesetting behavior of the representative firm, and ρ, ϕπ and ϕy in the (forward-looking)

50 Causality interest-rate rule followed by the central bank. That they are identified in microeconomic theory is supposed to mean that setting e.g. the nominal interest rate to a specific value won’t disrupt the new Keynesian IS curve: that setting the nominal interest rate to a specific value will lead to a change in aggregate demand, while all other variables included in the canonical model remain largely unchanged. It is also important to understand, however, that the assumptions (or microeconomic foundations) of macroeconomic DSGE models conflict with the ontology of macroeconomic aggregates on a number of points. The economy is populated with homogenous agents (one representative household and firms with identical technologies) according to DSGE models, and with heterogeneous agents according to the ontology of macroeconomic aggregates. The economy is in dynamic equilibrium according to macroeconomic DSGE models, and in no such equilibrium according to the ontology of macroeconomic aggregates. While the ontology of macroeconomic aggregates is marked by a difference between the order that the economic system produces and the order of that system itself, macroeconomic DSGE models fail to capture that difference. Agents interact directly according to the ontology of macroeconomic aggregates, and indirectly via a price mechanism according to macroeconomic DSGE models.10 Macroeconomists engaging in DSGE modeling might think that the ontology of macroeconomic aggregates is the foundational premise of an alternative research program and that they don’t need to worry about conflicts between that premise and the microeconomic assumptions of DSGE models. But these assumptions have been criticized on their own terms. What speaks against the dynamic-equilibrium assumption is that firms have no incentive to innovate in a dynamic equilibrium (or ‘steady state’), while they are known to innovate continuously to remain competitive (Arthur 2006). What also speaks against that assumption is that there needs to be a process of tâtonnement that takes the economy back to equilibrium whenever it has been kicked out of equilibrium temporarily by an exogenous shock, that there is no empirical evidence that has been provided in support of such a process, and that it seems that this process must be imagined to occur in meta-time (Kirman 2006). Perhaps the most devastating blow to the dynamic-equilibrium assumption comes from a famous microeconomic result: the Sonnenschein-Mantel-Debreu (SMD) theorem (Debreu 1974, Mantel 1974, Sonnenschein 1973). The SMD theorem says that for any function with the properties of continuity, homogeneity of degree zero, and compliance with Walras’ law,11 there is at least one price vector that corresponds to an economy, in which the aggregate demand function is the function in question. The theorem implies that while the properties that the aggregate excess demand function inherits from the individual excess demand functions guarantee the existence of an equilibrium, they are not sufficient to guarantee its uniqueness; and Sonnenschein (1973) shows that they are also insufficient to guarantee its stability. The implication of the SMD theorem can be avoided if the aggregate excess demand function is assumed to be the excess demand function of one representative household. Since the canonical DSGE model relies on the assumption of one

representative household, it is strictly speaking not affected by the SMD theorem. Note, however, what Alan Kirman (1992, p. 118) has to say about the assumption of a representative agent:

    [W]hatever the objective of the modeler, there is no plausible formal justification for the assumption that the aggregate of individuals, even maximizers, acts itself like an individual maximizer. Individual maximization does not engender collective rationality, nor does the fact that the collectivity exhibits a certain rationality necessarily imply that individuals act rationally. There is simply no direct relation between individual and collective behavior.

Kirman concludes that "the 'representative' agent deserves a decent burial, as an approach to economic analysis that is not only primitive, but fundamentally erroneous." The representative-agent assumption is also the assumption that eliminates the macroeconomy by reducing it to the microeconomy: if the aggregate excess demand function is the excess demand function of one representative household, then the other macroeconomic aggregates that the national bureaus of statistics calculate from microeconomic quantities turn out to coincide with the microeconomic quantities that representative (and optimizing) agents work, hold, produce, pay, or purchase.
The assumption of indirect agent interaction is a direct consequence of the assumption that agents solve optimization problems. Since they solve optimization problems, their behavior is determined by the (equilibrium) prices that show up in the budget constraints of households and the profit functions of firms. These prices are set by firms under imperfect competition and taken as given under perfect competition. But the behavior of one agent does not affect the behavior of any other agent directly. Akerlof (2002) criticizes the assumption that agents solve optimization problems when arguing that reciprocity, fairness, identity, money illusion, loss aversion, herding, or procrastination play a non-negligible role when agents make decisions.
Especially since the financial crisis of 2008, macroeconomists engaging in DSGE modeling have gone out of their way to try to respond to these criticisms. But their response remains either unconvincing or partial. They have responded to criticism of the homogeneity assumption, for instance, by introducing "heterogeneous agent models."12 That response is inadequate, however, because macroeconomic aggregates continue to reduce to the microeconomic quantities that representative (and now heterogeneous) agents work, hold, produce, pay, or purchase.
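To make the reduction claim concrete, consider how aggregates of the kind just mentioned are calculated from microeconomic quantities. The following sketch (in Python) is a toy illustration only: the goods, prices, weights, and index formula are invented for the example and do not correspond to the procedures of any national bureau of statistics; it merely shows that the general price level and real GDP, which figure prominently in the next section, are arithmetic functions of disparate prices and quantities.

    # Invented micro data: prices (currency per unit good) and quantities of two
    # final goods in a base period and a current period.
    prices_base     = {"bread": 2.0, "fuel": 1.0}
    prices_now      = {"bread": 2.2, "fuel": 1.3}
    quantities_base = {"bread": 100.0, "fuel": 50.0}
    quantities_now  = {"bread": 105.0, "fuel": 48.0}

    # Weights: base-period expenditure shares (an invented weighting scheme).
    base_expenditure = {g: prices_base[g] * quantities_base[g] for g in prices_base}
    total = sum(base_expenditure.values())
    weights = {g: base_expenditure[g] / total for g in base_expenditure}

    # General price level as a weighted average of price relatives (one plus the
    # percentage changes of the disparate prices): a pure number, dimensionally
    # distinct from the prices and quantities it is computed from.
    price_level = sum(weights[g] * (prices_now[g] / prices_base[g]) for g in weights)

    # Nominal GDP (in currency units) and real GDP (nominal GDP deflated by the index).
    nominal_gdp = sum(prices_now[g] * quantities_now[g] for g in quantities_now)
    real_gdp = nominal_gdp / price_level

    print(price_level, nominal_gdp, real_gdp)

However the official algorithms differ in detail, the aggregates on the left-hand sides are computed from the microeconomic quantities on the right-hand sides; whether macroeconomic aggregates are nevertheless ontologically independent of these quantities to some extent is the question of the next section.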

3.4 Do Macroeconomic Aggregates Emerge?

Unlike macroeconomists engaging in DSGE modeling, Hoover denies that macroeconomic aggregates reduce to the microeconomic quantities that representative agents work, hold, produce, pay, or purchase. He believes that macroeconomic aggregates do not fully reduce to microeconomic quantities and that their existence is ontologically independent of these quantities to some extent. He offers

52 Causality two accounts of this ontological independence. According to his first account, macroeconomic aggregates are ontologically independent of their microeconomic constituents to the extent that the former “supervene” on the latter (Hoover 2001, pp. 119–24). The problem with this account is that supervenience is a specific type of full reduction (cf. Section 3.2). Reiss (2008, p. 232) points out accordingly that Hoover’s nonreductive account of supervenience is “odd.” Hoover (2009, p. 393) later acknowledges this point but responds to it by dropping the supervenience aspect of his account (“the term supervenience ought to be dropped”), not by dropping its nonreductive aspect.13 According to Hoover’s second account, macroeconomic aggregates emerge: they “could be seen as emergent properties of the macroeconomy” (Hoover 2009, p. 390). To make sense of the way they emerge, Hoover (2009, Section 2.3) refers to Searle’s (1995) account of collective action or intentionality. According to Searle, a peculiarity about collective action or intentionality is that it cannot be the outcome of the interactions of atomic intentionality exclusively because atomic intentionality is typically neither necessary nor sufficient for collective intentionality, and because the rules that are constitutive of collective action or intentionality are usually not the object of the conscious (or unconscious) intentions of individuals. Searle cites money and the actions and intentions of governments and universities as examples of collective action or intentionality. But what he calls “money” is a set of constitutive rules that determine purchasing power or the exchange of goods. It doesn’t have a lot to do with what macroeconomists have in mind when thinking of money (what they have in mind is typically a monetary aggregate like M2 or M3, cf. Section 3.2). And unlike the actions and intentions of governments and universities, macroeconomic aggregates are multivalued quantities and not coordinated.14 Hoover (2009, p. 397) claims that what macroeconomic aggregates and the examples cited by Searle have in common is that they “ontologically transcend individual economic agents.” Let us assume that Searle’s examples transcend individual economic agents ontologically. How does Hoover argue in support of the claim that macroeconomic aggregates transcend economic agents ontologically: that macroeconomic aggregates emerge from the microeconomic quantities, of which they are composed (which economic agents work, hold, produce, pay, or purchase); that they do not fully reduce to these quantities and so on? There is, as far as I can see, a disjunction of no less than four premises, from which Hoover aims to derive his claim: 1 “The most important aggregates in macroeconomics” (Hoover 2001, p. 119) are dimensionally distinct from the microeconomic quantities, from which they are computed (Hoover 2001, Section 5.2). 2 “Complete characterizations of the microeconomic must include characterizations of the macroeconomic on the part of individual agents” because these agents must use estimates and expectations of macroeconomic aggregates “to form any practical assessment of their situations” (Hoover 2001, p. 123). 3 Macroeconomic aggregates can be manipulated (Hoover 2001, pp. 124–6).

The Ontology of Macroeconomic Aggregates  53 4 The strategy of idealization in macroeconomic model construction is empirically successful; it cannot be empirically successful unless macroeconomic models manage to identify and isolate “the real essences or causally effective capacities of economic reality” (Hoover 2001, pp. 126–7). I will argue against premises (3) and (4) in the following section. In the present section, I will argue that the four premises do not imply Hoover’s claim, neither alone nor together. By “the most important aggregates in macroeconomics,” Hoover (2001, p. 119) understands the general price level and real GDP. The general price level is dimensionally distinct from the microeconomic quantities, of which it is composed. While the general price level is a weighted average of percentage rates of change of disparate prices, the dimensions of the microeconomic quantities, from which it is computed (disparate prices and final goods), are currency per unit good and quantities of final goods and services. Dimensional distinctness spreads to real GDP and other macroeconomic aggregates that (like real GDP, inflation etc.) are defined in terms of the general price level. While real GDP is the sum of final goods and services divided by the general price level, the dimensions of the microeconomic quantities, from which it is computed (final goods and their prices) are currency per unit good and quantities of final goods. By itself, however, dimensional distinctness does not demonstrate that macroeconomic aggregates emerge. One reason is that many macroeconomic aggregates (nominal GDP, the nominal interest rate, the money stock etc.) are not dimensionally distinct from the microeconomic quantities, of which they are composed (cf. Section 3.2). Another reason is that dimensional distinctness occurs at the level of the order of the economic system itself, and that it is not a novelty produced by the system (not a novelty that occurs at the level of the produced order exclusively). In order to form a weighted average of percentage rates of change of disparate prices or the sum of final goods and services divided by that weighted average, one needs only two disparate final goods and prices. Hoover is right when claiming that “complete characterizations of the microeconomic must include characterizations of the macroeconomic on the part of individual agents.” There are at least two reasons why characterizations of the macroeconomic need to be included in complete characterizations of the microeconomic. The first reason is that agents must form expectations of macroeconomic aggregates “to form any practical assessment of their situations” (Hoover 2001, p. 123). Consider, for instance, the expectation that an individual agent (household, or firm) forms at t with respect to inflation at t + 1. The agent must form that expectation to form any practical assessment of her situation at t (perhaps her situation is one, in which she wonders whether she should invest in stock or buy real estate). The second reason why characterizations of the macroeconomic need to be included is that macroeconomic aggregates clearly influence the expectations that agents (households, or firms) form with respect to the behavior of all kinds of micro- or macroeconomic variables that matter to them. Consider, again, the

54 Causality expectation that an individual agent (household, or firm) forms at t with respect to inflation at t + 1. Clearly, inflation at t influences the formation of that expectation. I will argue in Sections 3.5 and 3.6 that the interplay between macroeconomic aggregates and individual expectations is little understood: that we currently don’t understand how agents form individual expectations, or how macroeconomic aggregates affect the formation of individual expectations, and that consequently, a program of empirical microfoundations will be required. What I want to point out in the present section is that the necessity of including characterizations of the macroeconomic in complete characterizations of the microeconomic doesn’t show that macroeconomic aggregates emerge. Inflation at t is calculated from disparate prices and quantities according to some algorithm, and so will inflation at t + 1. When defending premise (3), Hoover (2001, p. 125) lists a number of “irreducible macroeconomic aggregates” that he thinks can be manipulated. He says that nominal GDP and the general price level can be manipulated by changes in government spending, that the Fed can manipulate the federal funds rate by supplying or removing reserves, that the yield curve can be manipulated by changing the general price level or the federal funds rate, that the real interest rate can be manipulated by changing the general price level, and so on. I will argue in the following section that Hoover’s list fails to include aggregates, of which we cannot know whether they can be manipulated (expectational aggregates). But at this point, I want to explain why the emergence of macroeconomic aggregates does not follow from their manipulability. There are two explanations that I have at my disposal. The first explanation refers to the difference between the order of the economic system itself and the order that it produces. Manipulability does not only occur at the level of the order that the system produces, but also at the level of the order of the system itself. It is not a novelty that the system produces (not a novelty that occurs at the level of the produced order exclusively). Just as the general price level, for instance, can be manipulated by changes in government spending, its microeconomic constituents (final goods and their prices) can be manipulated by changes in individual demand or supply. The second explanation makes reference to supervenience. Remember from Section 3.2 that macroeconomic aggregates supervene on their microeconomic constituents and that the supervenience of macroeconomic aggregates implies that macroeconomic aggregates cannot be changed (or manipulated) independently of their microeconomic constituents. If macroeconomic aggregates cannot be manipulated independently of their microeconomic constituents, then the manipulability of macroeconomic aggregates does not imply that they emerge (that they are ontologically independent of their microeconomic constituents to any extent). Hoover believes that macroeconomic aggregates supervene on their microeconomic constituents.15 But that belief does not imply the belief that macroeconomic aggregates emerge; if anything, it implies that macroeconomic aggregates fully reduce to their microeconomic constituents. Premise (4) consists of two premises. The first premise says that (4a) the strategy of idealization in macroeconomic model construction is empirically successful. Hoover (2001, pp. 126–7) maintains that this strategy aims to identify and

The Ontology of Macroeconomic Aggregates  55 isolate “the essence of the matter,” and that “the real essences” are the “causally effective capacities of economic reality” (Hoover 2001, pp. 126–7). One may accordingly understand premise (4a) as saying that the strategy of idealization in macroeconomic causal modeling is empirically successful. The second premise says that (4b) the strategy of idealization in macroeconomic causal modeling cannot be empirically successful unless the strategy manages to identify and isolate “the real essences”: unless what is essential are the macroeconomic aggregates (with their “causally effective capacities”), and not the microeconomic quantities, of which they are composed. I will argue against premise (4a) in the following section. Right now, I want to argue against premise (4b) by making a similar point as in the case of premises (1) and (3). The point is that, by itself, the causality of the relations between macroeconomic aggregates does not demonstrate that macroeconomic aggregates emerge. To show that macroeconomic aggregates emerge, Hoover needs to show that a relation of direct type-level causation between macroeconomic aggregates cannot be understood as chains of relations of direct type-level causation that obtain between the microeconomic quantities, of which the aggregates are composed. He needs to show, in other words, that the causality of a relation between macroeconomic aggregates is a novelty that is part of the order that the economic system produces, and that the causality of this relation is not part of the order of the system itself (in the sense that it does not reduce to the causality of chains of relations between the microeconomic constituents of the aggregates). Few macroeconomists believe that the causality of relations between macroeconomic aggregates represents a novelty that is part of the order that the economic system produces. Most of them believe that it reduces to the causality of chains of relations between the microeconomic constituents of these aggregates. Consider, for instance, the relation of direct type-level causation that many macroeconomists believe obtains between the money stock and aggregate demand. Macroeconomists who believe that there is such a relation tend to believe that there is a monetary transmission mechanism underlying that relation. According to the traditional (Keynesian) version of the mechanism, the central bank sells or purchases bonds in an open market operation, thus changing bond prices and yields. Portfolio holders respond to this change by buying or selling other assets, thus changing interest rates (asset prices and yields). Consumers respond to this change by increasing or decreasing spending, thus changing their demand for final goods and services. According to the traditional version of the monetary transmission mechanism, that is, there are chains of relations of direct type-level causation between the microeconomic quantities, of which the money stock and aggregate demand are composed: relations of direct type-level causation between the fraction of wealth that people hold in the form of money and interest rates, and between interest rates and individual consumption. Hoover (2001, p. 112) thinks that his “position is analogous to that of condensed matter physicists.” Remember from Section 3.2, however, that according to condensed matter physicists, it is not aggregates that emerge, but dynamics, properties, laws, or forms of invariance and universal behavior. Thus, Hoover’s position

56 Causality cannot be analogous to that of condensed matter physicists. The position that is analogous to that of condensed matter physicists says that (specific shapes of) the conditional and unconditional probability distributions of the values of micro- or macroeconomic variables emerge from myriads of direct interactions that take place between myriads of heterogeneous agents in dynamic non-equilibrium and feed back into the system in the sense that later interactions depend on earlier ones. Hoover’s position that macroeconomic aggregates emerge is an ontological one: it states that “aggregates […] are among the fundamental units from which economic reality is constructed” (Hoover 2001, p. 112). Remember from Section 3.2, however, that condensed matter physicists often endorse the methodological (or epistemological) position that the dynamics of macroscopic properties can be investigated, while the dynamics of microscopic properties are largely ignored. The economic analog of this position says that relations of probabilistic or causal dependence between macroeconomic aggregates can be investigated independently of the direct interactions between heterogeneous agents. This methodological position is implied by Hoover’s ontological position and would in principle be open to him even if he were to renounce his ontological position. I will discuss this methodological position in Section 3.6. But before discussing this position, I will argue against Hoover’s premises (3) and (4). My argument against these premises will have repercussions for the analysis of Section 3.6. 3.5

Causality and Manipulability

Hoover defends premise (3) by drawing on a point that Ian Hacking (1983, pp. 22–4) makes with respect to the existence of theoretical entities like electrons (“If you can spray them, then they are real”). Drawing on this point, Hoover (2001, p. 125) wonders whether “irreducible macroeconomic aggregates” can be manipulated. He states that the “answer seems to be clearly yes” and defends his positive answer by citing a kind of scientific consensus among macroeconomists about the manipulability of certain aggregates: a scientific consensus that relies on “overwhelming” empirical evidence, and that even finds expression in the Wall Street Journal. Hoover is right when saying that there is a scientific consensus among macroeconomists about the manipulability of these aggregates. But he is wrong when suggesting that the consensus is one about the manipulability of “irreducible macroeconomic aggregates”; hardly any macroeconomist would agree that macroeconomic aggregates can be compared to electrons in terms of irreducibility (cf. previous section). Hoover also fails to mention that there is an important group of macroeconomic aggregates, of which we cannot know whether they can be manipulated: (aggregates of) expectations of inflation, of aggregate demand, and so on. Scientific consensus about the manipulability of these aggregates is strikingly absent. While some macroeconomists believe that these aggregates can be manipulated (that central banks, for instance, can manipulate inflation expectations by ‘anchoring’ them, i.e., by specifying an inflation target, by promising to the public that they will stick to that target, and by then really sticking to that target), other macroeconomists

The Ontology of Macroeconomic Aggregates  57 disagree, and empirical studies continue to come up with conflicting empirical evidence.16 The problem is that macroeconomists cannot find out about the manipulability of expectational aggregates because these aggregates cannot be measured. The claim that these aggregates cannot be measured may come as a surprise to theorists who conduct surveys to measure expectations on a regular basis. But remember from Section 3.3 of the previous chapter that Hoover himself explains why surveys cannot be accepted as an adequate means of measuring expectations: they cannot be accepted because a subject’s statement about what she expects can neither be verified nor trusted. Hoover (2001, p. 139) puts his explanation into perspective when saying on the next page that “expectational aggregates […] based on surveys […] may well improve causal models.” He does not explain, however, how these aggregates can improve causal models if they cannot be measured. Assume for the sake of argument that a subject’s statement about what she expects could be verified or trusted. Manski (2018, Section VII) lists a number of problems that would continue to obtain if that assumption were true: the expectations of firms are difficult to measure; the practice of rounding complicates the interpretation of responses; probabilistic expectations may be ambiguous (to the extent that they express, for instance, ignorance or uncertainty); respondents sometimes confound beliefs and preferences etc. Manski (2018, p. 423) believes that these problems can be solved. But even if they could be solved, there would be the additional problem that agents (e.g. firms) are likely to revise or update their expectations after a survey and before taking action (e.g. before shrinking or expanding production). And if they revise or update their expectations after a survey and before taking action, we won’t be able to measure the expectations, on the basis of which they act, unless we understand the way, in which they form (i.e., revise or update) expectations. One could argue that the way, in which agents form individual or aggregate expectations, can be investigated in laboratory experiments. But laboratory experiments are unlikely to teach us anything about the formation of individual or aggregate expectations. They are unlikely to teach us anything about the formation of individual expectations because in real life, the formation of individual expectations requires that expectations be revised and updated in light of information that (like government announcements, media reports, personal observations etc.) is often generated or collected in obscure ways, and because these obscure ways are difficult to mimic in laboratory experiments. These experiments are also unlikely to teach us anything about the formation of aggregate expectations because the dynamics of the formation of aggregate expectations in small groups and economic systems have little in common. Expectations, therefore, cannot be measured. There is currently no way to tell whether the expectations that agents form with respect to the behavior of macroeconomic variables (or indeed any variable) should be modeled as “rational,” “adaptive,” or in some alternative way. Expectations are rational if whatever information is available is completely exploited (Lucas and Prescott 1971, Muth 1961). Expectations are adaptive if agents correct their expectations on the basis of mistakes

58 Causality they made when forming expectations in the past. Critics of the theory of rational expectations argue that agents rarely exploit the available information completely, or that they tend to make systematic mistakes when exploiting that information. But proponents of the theory respond that systematic mistakes cancel out each other at the aggregate level and that expectations are correct on average. The critics point out that the response remains unsupported by experimental studies (Anufriev and Hommes 2012) or observational evidence (Gennaioli et al. 2016). But the point does not imply that agents never form rational expectations. One may assume, for instance, that in the face of competition, firms have a strong incentive to exploit the available information completely. Since individual expectations cannot be measured, expectational aggregates cannot be measured; since expectational aggregates cannot be measured, we cannot tell whether they can be manipulated; and since we cannot tell whether they can be manipulated, Hoover’s defense of premise (3) leaves out an important group of aggregates: expectational aggregates – aggregates of expectations that agents (households or firms) form with respect to the behavior of macroeconomic variables. I will now demonstrate that the presence of expectational aggregates has important consequences for premise (4a): for the premise saying that the strategy of idealization in macroeconomic causal modeling is empirically successful. I will argue against premise (4a) in much greater detail in the following chapter. Right now, I want to anticipate that argument by briefly summarizing it. In order for the strategy of idealization in macroeconomic causal modeling to be empirically successful, there needs to be empirical procedures that provide conclusive evidence in support of the causality of the relations among macroeconomic aggregates. The procedures that have been proposed to provide empirical evidence in support of the causality of these relations are important and sophisticated. They include the instrumental-variable method, potential-outcome research, and the procedure that Hoover (2001, Chapter 8) himself proposes. I will analyze these methods carefully in the following chapter. At this point, I must limit myself to mentioning their most important shortcoming. In order to provide conclusive evidence in support of a relation of direct type-level causation between variables X and Y (or the quantities, for which they stand), one needs to rule out that there is a third variable (or set of variables) Z that type-level causes both X and Y. In order to rule out that there is such a variable (or set of variables) Z, one needs to control for Z, i.e., include Z in the set of pre-selected variables and observe that Z remains (largely) unchanged, while X and Y change. There are a variety of problems that can get in the way of an efficient control for Z. But one problem that is especially pertinent in the case of macroeconomics occurs when Z cannot be measured. If Z cannot be measured, then we won’t be able to tell whether Z can be controlled for, and then there won’t be any conclusive evidence in support of a causal relation between variables X and Y. The problem is especially pertinent in macroeconomics because in macroeconomics, expectational aggregates form an important group of aggregates, of which we cannot know whether they can be controlled (or manipulated) because

The Ontology of Macroeconomic Aggregates  59 they cannot be measured. Consider, for instance, the claim that the aggregate demand directly type-level causes inflation. In order to provide conclusive evidence in support of that claim, one would need to rule out that inflation expectations directly type-level cause both aggregate demand and inflation. That inflation expectations directly type-level cause both aggregate demand and inflation is, however, exactly what one would expect when understanding the new Keynesian IS curve and the new Phillips curve of the canonical DSGE model of Section 3.3 as expressing causal hypotheses. Thus, in order to rule out that inflation expectations directly type-level cause both aggregate demand and inflation, one would need to control for inflation expectations. The problem is that inflation expectations cannot be measured and that we cannot tell whether we are able to control them. In other words: the empirical evidence that we can provide in support of the claim that aggregate demand directly type-level causes inflation remains inconclusive. One might think that the case of expectations is an exotic one and that the nonmeasurability of expectations does not get in the way of causal inference in macroeconomics in general. But remember that the Lucas critique says that the relations that macroeconomists believe obtain between macroeconomic aggregates are unlikely to remain invariant to policy interventions because policy interventions target variables that are of concern to agents (i.e., variables representing aggregate demand, inflation, and so on), because agents form expectations with respect to these variables, and because agents try to predict the actions of policymakers when forming these expectations. The Lucas critique says, in other words, that variables Z change whenever policy interventions target X to influence Y, where Z represents an expectational aggregate like inflation or aggregate demand expectations (or a ‘hidden variable,’ as I will say more generally in Section 3.4 of the following chapter). Since Z cannot be measured, the Lucas critique denies that there are reliable empirical procedures that can be employed to provide conclusive empirical evidence in support of causal relations between X and Y, when X and Y stand for macroeconomic aggregates. Hoover (2009, p. 400) says that the “Lucas critique appears […] to be compatible with” his claim that macroeconomic aggregates emerge. But if expectations cannot be measured, the Lucas critique implies that premise (4a) is false: that the strategy of idealization in macroeconomic causal modeling cannot be empirically successful. And if Hoover’s claim cannot be derived from any of the other premises, his claim will appear to be incompatible with the Lucas critique. 3.6 Toward a Program of Empirical Microfoundations The lesson that many macroeconomists have drawn from the Lucas critique says that there needs to be a program of microfoundations: that the parameters figuring in macroeconomic model equations need to be identified in microeconomic theory, where microeconomic theory is essentially general-equilibrium theory, i.e., the theory, according to which all agents solve optimization problems (households maximize utility under budget constraints, and firms maximize profits under

technological and demand constraints) and all markets clear. The SMD theorem necessitates the assumption that the aggregate excess demand function is the individual excess demand function of one representative household. Thus, the "microfoundations" of the program of microfoundations can be said to encompass general-equilibrium theory and the representative-agent assumption. Remember from Section 3.3 that the representative-agent assumption eliminates the level of macroeconomic aggregates by fully reducing it to the level of microeconomic quantities that representative agents work, hold, produce, pay, or purchase. Lucas (1987, p. 108), who is himself an advocate for the program of microfoundations, says accordingly: "the term 'macroeconomic' will simply disappear from use and the modifier 'micro' will become superfluous," if the program is successful.
But is the program successful? The analysis of Section 3.3 has shown that there is reason for doubt. The microfoundations of the program not only conflict with the ontology of macroeconomic aggregates; they have also been criticized on their own terms. One might think that the "success" of the program is more of the instrumentalist kind: that the program is successful as long as the policy interventions that macroeconomists justify when using DSGE models are successful. But the success of these interventions must be doubted at least since the financial and economic crisis of 2008–2009.
How are macroeconomists supposed to respond to the failure of the program of microfoundations? It would be convenient for them if they could adopt the methodological position I described toward the end of Section 3.4: if they could proceed like condensed matter physicists and investigate relations of probabilistic or causal dependence, independently of the direct interactions between heterogeneous agents in non-equilibrium. Hoover, for instance, could argue that there are empirical procedures of causal inference (his own procedure, the instrumental-variable method, potential-outcome research etc.), which can be used to investigate relations of direct type-level causation between macroeconomic aggregates independently of the microeconomic quantities, of which they are composed.
But there are at least two reasons why macroeconomists cannot proceed like condensed matter physicists. The first reason is that the lower and upper levels are not as neatly separated in economics as they are in condensed matter physics. In economics, there is a delicate interplay of individual expectations and macroeconomic aggregates: individual expectations (like the expectation that an agent forms at t with respect to inflation at t + 1) functionally depend on macroeconomic aggregates (like inflation at t); expectational aggregates (like inflation expectations) form the arithmetic mean of the expectations that individual agents form with respect to aggregate quantities (like inflation); and expectational aggregates (like inflation expectations) are believed to directly type-level cause nonexpectational aggregates (like inflation or aggregate demand). In condensed matter physics, there is no comparable interplay between system components at the lower level (molecules) and system components at the upper level (macroscopic properties of liquids and gases).
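The interplay just described can be made vivid with a minimal simulation. The sketch below (in Python) is purely illustrative: the adaptive updating rule, the coefficients, and the Phillips-curve-like feedback from the expectational aggregate to inflation are assumptions made for the example, not claims about how agents actually form expectations (which, as argued in the previous section, is precisely what we do not understand).

    import random

    random.seed(0)
    N, T = 100, 20                 # number of agents and number of periods (invented)
    inflation = 0.02               # inflation at t = 0
    expectations = [0.02] * N      # each agent's expectation of inflation at t + 1

    for t in range(T):
        # (i) Individual expectations depend on the current aggregate: here a simple
        # adaptive rule with agent-specific adjustment speeds (invented).
        expectations = [e + random.uniform(0.2, 0.8) * (inflation - e)
                        for e in expectations]
        # (ii) The expectational aggregate is the arithmetic mean of the individual
        # expectations.
        expected_inflation = sum(expectations) / N
        # (iii) The aggregate feeds back into the economy: next-period inflation
        # depends on the expectational aggregate plus a small shock (a crude
        # stand-in for the causal link posited by a Phillips curve).
        inflation = 0.5 * expected_inflation + 0.01 + random.gauss(0.0, 0.002)

    print(expected_inflation, inflation)

The point of the sketch is purely structural: individual expectations respond to the aggregate, the expectational aggregate is their arithmetic mean, and the aggregate in turn helps to determine the next value of inflation. The lower and upper levels are entangled in a way that has no analog in the relation between molecules and the macroscopic properties of liquids and gases.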

The Ontology of Macroeconomic Aggregates  61 The second reason is that the direct interaction between system components at the lower level is much better understood in condensed matter physics than in economics. Condensed matter physicists understand very well the average motions and energies of the many molecules in liquids or gases. By contrast, economists fail to understand important aspects of the direct interaction between system components at the microeconomic level. They currently don’t know, for instance, how agents form (i.e., revise or update) individual expectations. I argued in the previous section that they cannot measure individual and aggregate expectations if they don’t know how agents form individual expectations, and that they accordingly don’t know whether expectational aggregates can be controlled for. I also argued that this is a problem because expectational aggregates are believed to act as confounders and because confounders need to be controlled for if causal inference is supposed to work. The program that macroeconomists should adopt in response to the failure of the program of microfoundations is a program of empirical microfoundations. The program can be described as one of using the results of empirical economic disciplines like econometrics and behavioral or experimental economics to model the chains of relations of causal dependence and supervenience that policymaking institutions can exploit. These chains of relations include relations of causal dependence between microeconomic quantities, relations of downward causation, and relations of supervenience. Relations of causal dependence or downward causation cannot be shown to obtain unless confounders can be shown to be absent, and confounders cannot be shown to be absent unless progress is made in the study of the formation of expectations. Therefore, the study of the formation of expectations is an integral part of the program of empirical microfoundations. The program of empirical microfoundations largely coincides with the program of microfoundations that has been underway in agent-based macroeconomics for a while, and that Dosi and Roventini (2019, pp. 19–20) describe when speaking of “definitions of aggregate variables” that “closely follow those of statistical aggregates,” and of “sound microfoundations based on realistic assumptions as far as agent behaviors and interactions are concerned, where realistic here means rooted in actual empirical microeconomic evidence.” But the program of empirical microfoundations is more specific than the program of “sound microfoundations,” as I will call it henceforth (even though both programs are empirical and sound to the same degree). The kind of research that the program of empirical microfoundations prescribes is also somewhat different from the kind of research that is carried out in accordance with the program of sound microfoundations. Dosi and Roventini (2019, p. 9) take the “actual empirical microeconomic evidence” to speak in favor of a macroeconomy “populated by heterogeneous agents (e.g. firms, workers, banks) whose […] local interactions yield some collective order.” They also accept the theoretical considerations that speak in favor of the “far-from-equilibrium” nature of the interactions between agents and against “any isomorphism between micro- and macroeconomic levels” (it is because of these considerations that Dosi and Roventini require that “the definitions of aggregate

62 Causality variables closely follow those of statistical aggregates”). Dosi and Roventini (2019, p. 9) finally agree that “higher levels of aggregation can lead to the emergence of new […] statistical regularities.” They fully endorse, that is, the ontology of macroeconomic aggregates that I described in Section 3.2. But the program of empirical microfoundations is more specific than the program of sound microfoundations. While the functional form of agent-based model equations typically underdetermines whether they stand for relations of causal dependence or supervenience, the program of empirical microfoundations requires that these relations be modeled explicitly. These relations need to be modeled explicitly because the simulated experiments that agent-based macroeconomists conduct to provide evidence in support of the effectiveness of policy interventions presuppose that agent-based model equations stand for relations of causal dependence or supervenience. Consider one of the leading agent-based models – the “Keynes meets Schumpeter” (K+S) model – and the simulated experiments that Dosi et al. (2015, Section 4; 2017, Section 5) conduct to demonstrate the effectiveness of innovation, fiscal, and monetary policies. In the case of innovation policies, a change resulting from an intervention on a parameter denoting the search capabilities of firms propagates to a change in a macroeconomic variable denoting GDP via changes in microeconomic variables standing for labor productivities and quantities produced. That propagation would be impossible unless the macroeconomic variable were connected to the parameter by a chain of relations that include relations of causal dependence between the parameter and the other microeconomic variables and a relation of supervenience between the microeconomic variables standing for quantities produced and the macroeconomic variable. In the case of fiscal policies, the change brought about by a manipulation of a parameter referring to the degree to which an unemployment subsidy is proportional to the current wage propagates to a change in a variable denoting GDP via changes in variables standing for the unemployment subsidy, individual consumption, and aggregate consumption. Fiscal policies, in other words, exploit a chain of relations that include a relation of causal dependence between microeconomic variables denoting the unemployment subsidy and individual consumption, and relations of supervenience between the parameter and the microeconomic variable referring to the unemployment subsidy, and between the macroeconomic variables standing for aggregate consumption and GDP. In the case of monetary policies, a change resulting from a central bank intervention on a variable referring to the discount rate propagates to changes in the real economy via changes in variables standing for microeconomic quantities (the discount rate, the interest rate that firms pay for a bank loan, the profits of these firms, the stock of liquid assets of these firms). That propagation would be impossible unless there were relations of causal dependence between the variables standing for the microeconomic quantities and a relation of supervenience between the number of firms or variables denoting the quantities produced and a macroeconomic variable standing for GDP (firms are assumed to exit the market if their stock of liquid

The Ontology of Macroeconomic Aggregates  63 assets is smaller than zero). It is also worth noting that in the K+S model there are relations of downward causation between inflation and employment, on the one hand, and the discount rate, on the other, when the Taylor rule that Dosi et al. (2015, p. 171; 2017, p. 71) use to describe the behavior of the monetary authority is interpreted causally. If the program of empirical microfoundations requires that relations of causal dependence and supervenience be modeled explicitly, then the kind of research that this program prescribes will differ from the kind of research that is carried out in accordance with the program of sound microfoundations. While the program of empirical microfoundations requires that macroeconomists engage in causal inference to provide empirical evidence in support of the relations of causal dependence their models express, agent-based macroeconomists engage primarily in stylized fact reproduction to validate models. Stylized facts are the conditional and unconditional probability distributions of the values of specific micro- or macroeconomic variables: cross-correlations between macroeconomic variables, the phase transitions that some macroeconomic aggregates undergo when passing through critical points, and the scaling laws that describe the behavior of these aggregates at precisely these critical points (cf. Section 3.2). Agent-based macroeconomists aim to reproduce these facts by selecting initial values for variables and parameters, by coding their model equations in a structured or object-oriented programming language, and by running the model on a computer an arbitrary number of times. Guerini and Moneta (2017, Section 1) explain why the method of stylized fact reproduction is not sufficiently rigorous: Economics, as any scientific discipline intended to inform policy, has inevitably addressed questions related to identification and measurement of causes and effects. […] The quality of ABMs has been up to now evaluated according to the ex-post ability in reproducing a number of stylized facts […]. [S]uch an evaluation strategy is not rigorous enough. Indeed the reproduction, no matter how robust, of a set of statistical properties of the data by a model is quite a weak form of validation, since, in general, given a set of statistical dependencies there are possibly many causal structures which may have generated them. Thus models which incorporate different causal structures, on which diverse and even opposite practical policy suggestions can be grounded, may well replicate the same empirical facts (Guerini & Moneta 2017, section 1). The method of stylized fact reproduction is not “rigorous enough” because for any stylized fact, there is an indefinite number of relations of causal dependence that might have generated that fact. An example of a stylized fact that the K+S model is able to reproduce is the correlation (and pro-cyclicality) of consumption and inflation. That correlation is compatible with three relations of causal dependence: consumption causally depends on inflation; inflation causally

64 Causality depends on consumption; or both inflation and consumption depend causally on a third variable (or set of variables). The third relation (with the set of variables acting as a confounder) divides into an indefinite number of relations, depending on the relations of causal dependence that obtain between the variables in the set. Since agent-based macroeconomists engage primarily in stylized fact reproduction, they also underestimate the need to study the formation of individual expectations. Dosi and Roventini (2019, p. 20) suggest that the “actual empirical microeconomic evidence” is largely conclusive with respect to the behavior of individual agents, but largely inconclusive with respect to their interactions. They say that these interactions “concern what happens within organizations […] and across organizations and individuals, that is the blurred set of markets and networks,” and that we are “very far from any comprehensive understanding” of these interactions. Thus, the program of sound microfoundations is largely a program of using evidence from sociology and network theory to improve our understanding of the interactions between individual agents. But if agent-based models are to be employed for purposes of policy analysis, then the “actual empirical microeconomic evidence” should also include results obtained from a more systematic study of the behavior of individual agents, including a study of the formation of individual expectations. The analysis of the preceding section explains why the study of the formation of individual expectations is important. Macroeconomists won’t be able to decide whether they can control for expectations unless they can measure expectations; and they won’t be able to measure expectations unless they understand how agents form expectations. Dosi and Roventini (2019, p. 20) take the “actual empirical microeconomic evidence” to speak against the “Olympic rationality” of agents with optimizing behavior and rational expectations and in favor of agents with adaptive behavior and expectations. I argued in the previous section, however, that we cannot rule out that some individual agents form rational expectations about the behavior of microeconomic quantities or macroeconomic aggregates at least some of the time. It is also worth noting that the presence of rational expectations is an implication of the simulated experiments that Dosi et al. (2020, Section V) conduct to show that the stability of macroeconomic system dynamics increases with the degree to which the heuristics of forming adaptive (demand) expectations is “naïve.” If the stability of macroeconomic system dynamics increases with the degree to which the heuristics of forming adaptive (demand) expectations is “naïve,” then the instability of macroeconomic system dynamics (output volatility and the number of economic crises) will increase with the degree to which agents form rational expectations. Periods of economic crises and high output volatility are among the stylized facts that emerge from the myriads of direct interactions between myriads of economic agents. Therefore, the simulated experiments imply that these agents form rational expectations at least some of the time. For a program of “sound microfoundations” that is meant to back up macroeconomic models that can be employed for purposes of policy analysis, this means

The Ontology of Macroeconomic Aggregates  65 that the study of the formation of individual expectations should be a top priority. I argued in the previous section that currently, survey methods or laboratory experiments are unlikely to teach us anything about the formation of individual or aggregate expectations. But maybe survey methods or lab experiments can be improved in such a way that at one point, macroeconomists can successfully use these methods to study the formation of expectations. Or perhaps we can develop alternative methods: conduct field experiments, collect survey data with a panel structure, or follow individuals over time through their digital footprints. Successful causal inference will, in any case, depend crucially on scientific progress in the study of the formation of expectations. Notes 1 Cf. Reiss (2008, Chapter 2) for an illuminating discussion of the policy implications of using different aggregating procedures to derive the general price level. 2 This is the definition that Stern and Eva (2021, Section 3) propose. They argue (correctly, to my mind) that conditions (i)–(iii) are necessary and sufficient for supervenience. 3 This is the characterization of emergence that Ladyman and Wiesner (2020, pp. 73–5) propose in a recent contribution to the philosophy of complexity science. I adopt that characterization because of its markedly close proximity to scientific practice, i.e., because it accounts for one of the “truisms” of complexity science: for the truism that “there is a difference between the order that complex systems produce and the order of the complex systems themselves” (Ladyman and Wiesner 2020, pp. 8–9). Hooker (2011, p. 28) proposes a similar characterization of emergence. 4 The interactions between the H2O molecules occur in dynamic non-equilibrium unless there is a net influx and outflux of water, as in the case of a lake that receives water through precipitation and loses water through evaporation and infiltration into the underlying soil (cf. Ladyman and Wiesner 2020, p. 28). 5 Phase transitions are sometimes described as transitions from one equilibrium to another. But such a description holds only under ceteris paribus conditions. As long as pressure and temperature vary continuously, the system is moving progressively toward a critical state, which means that it cannot be described as moving from one equilibrium to another. 6 At boiling point, for instance, water is both liquid and steam: it is made up of liquid water droplets, made up of myriads of bubbles of steam, themselves made up of myriads of droplets etc. 7 The notion of thermal non-equilibrium does not apply to the economy because physical parameters like temperature and energy are irrelevant to its dynamics (cf. Ladyman and Wiesner 2020, p. 72). 8 For more sophisticated variants, consider e.g. the two DSGE models mentioned in the introduction: Christiano, Motto, and Rostagno (2014), Smets and Wouters (2003). 9 Kydland and Prescott (1982) have responded to the Lucas critique by developing a “real business cycle” (RBC) model, which later evolved into the canonical DSGE model. 10 Another point of conflict is that in the canonical DSGE model, agent interactions do not depend on earlier interactions. But the conflict disappears when the assumption of no habit formation is given up. 11 Continuity is clear; homogeneity of degree zero means that demand remains invariant to a simple rescaling of all prices, while Walras’s law requires that the sum of all excess

demand be zero (where excess demand is the quantity that an individual agent or household demands minus the level of her individual wealth).
12 Cf. Christiano et al. (2018, pp. 131–2) for a list of these "heterogeneous agent models."
13 In a similar vein, Epstein (2014) argues that macroeconomic aggregates fail to supervene on microeconomic quantities. His example is that of government expenditures that include payments made to households to compensate them for storm damages, and that differ in their levels, depending on whether or not catastrophic storms in fact occur (Epstein 2014, Section 4.1). The problem with the example is that government expenditures are defined as a proportion of the budget of the government, and that the government's budget is composed of microeconomic quantities (taxes, customs duties, borrowings, fees etc.). Epstein also fails to make plausible how the more important macroeconomic aggregates (GDP and inflation) might possibly fail to supervene on microeconomic quantities.
14 Note in this context that List and Pettit (2011, p. 12) explicitly exempt markets from being collective actions or "group agents," as they call them.
15 Hoover says that "the term supervenience ought to be dropped" (cf. above). But this means that emergence should not be understood in terms of supervenience. It does not (or should not) mean that macroeconomic aggregates do not supervene.
16 Beechey et al. (2011), for instance, use survey data to show that inflation expectations are anchored in the Euro area and (less firmly) the USA. Kumar et al. (2015), by contrast, use survey data to show that inflation expectations are not anchored in New Zealand.

References

Akerlof, G. A. (2002). “Behavioral Macroeconomics and Macroeconomic Behavior.” The American Economic Review 92, 411–33.
Anufriev, M. and Hommes, C. H. (2012). “Evolutionary Selection of Individual Expectations and Aggregate Outcomes.” American Economic Journal: Microeconomics 4(4), 35–64.
Arthur, W. B. (2006). “Out-of-Equilibrium Economics and Agent-Based Modeling.” In Tesfatsion, L. and Judd, K. L. (eds.), Handbook of Computational Economics, Vol. 2. Amsterdam: Elsevier.
Beechey, M. J., Johannsen, B. K. and Levin, A. T. (2011). “Are Long-Run Inflation Expectations Anchored More Firmly in the Euro Area than in the United States?” American Economic Journal: Macroeconomics 3(2), 104–29.
Christiano, L. J., Eichenbaum, M. S. and Trabandt, M. (2018). “On DSGE Models.” Journal of Economic Perspectives 32(3), 113–40.
Christiano, L. J., Motto, R. and Rostagno, M. (2014). “Risk Shocks.” American Economic Review 104(1), 27–65.
Debreu, G. (1974). “Excess Demand Functions.” Journal of Mathematical Economics 1(1), 15–21.
Dosi, G. and Roventini, A. (2019). “More is Different … and Complex! The Case for Agent-Based Macroeconomics.” Journal of Evolutionary Economics 29, 1–37.
Dosi, G., Fagiolo, G., Napoletano, M., Roventini, A. and Treibich, T. (2015). “Fiscal and Monetary Policies in Complex Evolving Economies.” Journal of Economic Dynamics and Control 52, 166–89.
Dosi, G., Napoletano, M., Roventini, A. and Treibich, T. (2017). “Micro and Macro Policies in Keynes+Schumpeter Evolutionary Models.” Journal of Evolutionary Economics 27, 63–90.
Dosi, G., Napoletano, M., Roventini, A., Stiglitz, J. E. and Treibich, T. (2020). “Rational Heuristics? Expectations and Behaviors in Evolving Economies with Heterogeneous Interacting Agents.” Economic Inquiry 58(3), 1487–516.
Epstein, B. (2014). “Why Macroeconomics does not Supervene on Microeconomics.” Journal of Economic Methodology 21(1), 3–18.
Gennaioli, N., Ma, Y. and Shleifer, A. (2016). “Expectations and Investment.” In Eichenbaum, M. and Parker, J. (eds.), NBER Macroeconomics Annual 2015. Chicago: University of Chicago Press.
Guerini, M. and Moneta, A. (2017). “A Method for Agent-Based Models Validation.” Journal of Economic Dynamics and Control 82, 125–41.
Hacking, I. (1983). Representing and Intervening. Cambridge: CUP.
Hooker, C. (2011). “Introduction to Philosophy of Complex Systems: A + B.” In Hooker, C. (ed.), Philosophy of Complex Systems. Oxford: North Holland, 3–90, 841–909.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Hoover, K. D. (2009). “Microfoundations and the Ontology of Macroeconomics.” In Kincaid, H. and Ross, D. (eds.), The Oxford Handbook of the Philosophy of Economics. Oxford: OUP, 386–409.
Kirman, A. (1992). “Whom or What Does the Representative Individual Represent?” Journal of Economic Perspectives 6(2), 117–36.
Kirman, A. (2006). “Demand Theory and General Equilibrium: From Explanation to Introspection, a Journey Down the Wrong Road.” History of Political Economy 38, 246–80.
Kumar, S., Afrouzi, H., Coibion, O. and Gorodnichenko, Y. (2015). “Inflation Targeting Does Not Anchor Inflation Expectations: Evidence from Firms in New Zealand.” NBER Working Paper 21814.
Kydland, F. E. and Prescott, E. C. (1982). “Time to Build and Aggregate Fluctuations.” Econometrica 50, 1345–70.
Ladyman, J. and Wiesner, K. (2020). What Is a Complex System? New Haven: Yale University Press.
List, C. and Pettit, P. (2011). Group Agency: The Possibility, Design, and Status of Corporate Agents. Oxford: OUP.
Lucas, R. E. (1976). “Econometric Policy Evaluation: A Critique.” Carnegie-Rochester Conference Series on Public Policy 1, 19–46.
Lucas, R. E. (1987). Models of Business Cycles. Oxford: Blackwell.
Lucas, R. E. and Prescott, E. C. (1971). “Investment Under Uncertainty.” Econometrica 39(5), 659–81.
Lucas, R. E. and Sargent, T. (1979). “After Keynesian Macroeconomics.” Federal Reserve Bank of Minneapolis Quarterly Review 3(321), 49–69.
Manski, C. F. (2018). “Survey Measurement of Probabilistic Macroeconomic Expectations: Progress and Promise.” NBER Macroeconomics Annual 32, 411–71.
Mantel, R. (1974). “On the Characterization of Aggregate Excess Demand.” Journal of Economic Theory 7(3), 348–53.
Muth, J. F. (1961). “Rational Expectations and the Theory of Price Movements.” Econometrica 29, 315–35.
Reiss, J. (2008). Error in Economics: Towards a More Evidence-Based Methodology. London/New York: Routledge.
Sbordone, A. M., Tambalotti, A., Rao, K. and Walsh, K. J. (2010). “Policy Analysis Using DSGE Models: An Introduction.” FRBNY Economic Policy Review 2010 (October), 23–43.
Searle, J. R. (1995). The Construction of Social Reality. New York: Free Press.
Smets, F. and Wouters, R. (2003). “An Estimated Stochastic Dynamic General Equilibrium Model of the Euro Area.” ECB Working Paper Series No. 171 (August).
Sonnenschein, H. (1973). “Do Walras’ Identity and Continuity Characterize the Class of Community Excess Demand Functions?” Journal of Economic Theory 6(4), 345–54.
Stern, R. and Eva, B. (2021). “Antireductionist Interventionism.” British Journal for the Philosophy of Science 74(1), 241–67.

4 The In-Principle Inconclusiveness of Causal Evidence in Macroeconomics

4.1 Introduction

In Chapter 2, I presented an account of macroeconomic causality, according to which X directly type-level causes Y if and only if there is a possible intervention on X that changes Y, where X and Y stand for macroeconomic aggregates, where an intervention is understood as a manipulation of an intervention variable I that satisfies conditions requiring that I be a cause of X and that there be no confounders of X and Y, and where an intervention variable is either a variable or a parameter. In Chapter 3, I investigated the ontology of macroeconomic aggregates: of the quantities that take positions in relations of direct type-level causation in macroeconomics. I argued in Section 3.5 of Chapter 3 that the empirical evidence that macroeconomists can provide in support of these relations is too inconclusive. In the present chapter, I will develop that argument in greater detail. I will argue that the empirical evidence is too inconclusive because it derives from the conditions of the instrumental variable (IV) method, i.e., from conditions requiring that there be no confounders of I and X and X and Y (where I is an instrumental or intervention variable that type-level causes X), because in macroeconomics, confounders that cannot be controlled for or measured are likely to be present, and because econometric causality tests can be shown to rely on the conditions of the IV method at least tacitly. Of the econometric causality tests that macroeconomists can use to provide evidence in support of causal hypotheses, the chapter will be dealing with exactly two: with the procedure designed by Kevin Hoover (2001, Chapters 8–10) and with the procedure that comes along with the potential outcome approach that Joshua Angrist and Guido Kuersteiner (2011) have introduced into macroeconomics more recently. It is true that on occasion, macroeconomists employ the methods of Granger causality tests or causal Bayes nets to provide evidence in support of causal hypotheses. But these methods come along with probability approaches to causality and will be discussed (together with the underlying probability approaches) in the following chapter.1 The present chapter will begin by examining two possible instantiations of the IV method: randomized controlled trials (RCTs) and natural experiments. Section 4.2 is going to look at an RCT, as it is typically conducted in microeconomics, and DOI: 10.4324/9781003094883-5

70 Causality at the fictitious case of an RCT conducted in macroeconomics. Section 4.3 will analyze the famous natural experiment that Milton Friedman and Anna Schwartz (1963) observe to test their hypothesis that monetary changes directly type-level cause economic changes. It is going to argue that the evidence provided by that experiment is too inconclusive because it derives from the conditions of the IV method, and because confounders that violate these conditions and cannot be controlled for or measured are likely to be present in that experiment. Section 4.4 will defend the general case. Sections 4.5 and 4.6 are going to deal with the econometric procedures that Angrist and Kuersteiner (2011) and Hoover (2001, Chapters 8–10) propose to test for causal hypotheses in macroeconomics. These sections will argue that the evidence provided by these tests (the ‘Hoover test’ and the ‘AK test’) is too inconclusive because they tacitly rely on the conditions of the IV method. The final Section 4.7 will summarize the main argument and argue against Friedman (1953) that the inability to conduct RCTs reflects a basic difference between macroeconomics and many of the other special sciences. 4.2 Randomized Controlled Trials Imagine we would like to find out whether activation programs (programs consisting of job search activities, intensive counseling, and job training) directly typelevel cause the rate of exit from unemployment. We won’t be able to find out about this relationship if we assign all unemployed individuals to an activation program and check, after a certain period of time, whether the exit rate has changed. The exit rate might, after all, change as a result of the influence of all sorts of causes (including an increase or decrease in economic activity). In order to find out about the relationship, we will have to proceed in roughly two steps: we will first have to select a randomizing procedure that assigns roughly half of the unemployed to a group of individuals undergoing the activation program (the treatment group) and the other half to a group of individuals not undergoing that program (the control group). The principal purpose of this randomizing procedure is to avoid bias resulting from the influence of confounders, i.e., of possible type-level causes of which we don’t know anything: perhaps the long-term unemployed (men; 50+ etc.) are less inclined to exit unemployment than the newly unemployed (women; 30− etc.); results would accordingly be biased if both groups contained long-term unemployed (men; 50+) etc. in unequal numbers. In the second step, we will have to control for type-level causes that we think are likely to bias the results of our experiment. We will have to make sure, for instance, that once individuals are assigned to the treatment and control groups, participation in the program is mandatory. Otherwise individuals that were assigned to the program and have a low inclination to exit unemployment might choose to withdraw from the program. We will also have to rule out that the unemployment insurance agencies pay special attention to the individuals in the treatment group. Otherwise they could revert to every possible means to increase the exit rate (they, after all, have a substantial interest in getting a positive evaluation for their program and in receiving more public funds). We will finally have to make sure that the individuals

The In-Principle Inconclusiveness of Causal Evidence  71 in the treatment and control groups believe that they belong to the same group. Otherwise the (psychologically well-understood) role obligations of being an experimental subject might bias the results of our experiment.2 If both steps are taken, we will be dealing with a binary intervention variable I that is set to ‘yes’ for the individuals assigned to the treatment group and to ‘no’ for the individuals assigned to the control group, and that is likely to satisfy the following set of conditions: (I1) I type-level causes X, (I2) certain values of I are such that when I attains these values, X is no longer determined by other variables that type-level cause it but only by I, (I3) any directed path from I to Y goes through X, and (I4) I is statistically independent of any variable Z that type-level causes Y and is on a directed path that does not go through X. I is likely to satisfy (I1) because I is likely to type-level cause X (activation program). I is likely to satisfy (I2) because I is likely to break the arrow that is directed into X and departs from a variable standing for voluntary participation. I is likely to satisfy (I3) because I is likely to break the directed paths from I to Y that don’t go through X but through the variables standing for the special attention paid to the individuals in the treatment group and the role obligations of being an experimental subject. And I is likely to satisfy (I4) if the sample of the unemployed assigned to the treatment and control groups is large enough: if it is large enough, then I is likely to be probabilistically independent of any (nuisance) variable Z that type-level causes Y and is on a directed path that doesn’t go through X. Remember from Chapter 1 (Section 1.3) that conditions (I1)–(I4) are the conditions that James Woodward (2003, p. 98) spells out to define the term ‘intervention variable.’ The term ‘intervention variable’ figures in his definition of the term ‘intervention.’ And the term ‘possible intervention’ is used to define the term ‘direct type-level cause’ (cf. Woodward 2003, pp. 55, 59). One may accordingly say that X directly type-level causes Y if I satisfies conditions (I1)–(I4), if there is a possible intervention on X that changes Y or its probability distribution, and if all direct type-level causes of Y except X remain fixed by intervention.3 One may further say that X is likely to directly type-level cause Y, i.e., that we may believe with some confidence that X directly type-level causes Y, if I is likely to satisfy conditions (I1)–(I4), if there is likely to be a possible intervention on X that changes Y or its probability distribution, and if all direct type-level causes of Y except X are likely to remain fixed by intervention. There are, of course, important criticisms that have been advanced against the use of RCT methodology in economics (cf. Reiss 2013, pp. 202–06): (a) randomization ensures that treatment and control groups are identical with respect to all confounders only in the limit; (b) in economics, neither subjects nor experimenters can be blinded; (c) RCTs may introduce new confounders; (d) there is no guarantee that RCTs generalize to other settings. But while (d) relates to a difficult problem that all social policy faces (to the problem of the external validity of experiments),

there are effective means of dealing with criticisms (a)–(c) in labor economics: the random sample drawn from the population of unemployed individuals can be large enough to ensure that treatment and control groups are at least nearly identical with respect to possible confounders; role obligations can be controlled for by lying to the individuals in the control group that they would undergo a treatment too (which is, of course, ethically questionable but not impossible in principle); and newly introduced confounders (like withdrawals from the program or the special attention paid to the individuals in the treatment group) can be controlled for by rendering participation mandatory, and by creating several treatment groups of which only one is monitored.

Imagine next that we would like to find out whether the nominal interest rate is a direct type-level cause of aggregate demand, and that the canonical DSGE model of Chapter 2 (Section 2.2) and Chapter 3 (Section 3.3) expresses the hypothesis that we’ve formed about the relations of direct type-level causation that obtain among aggregate quantities in the economy in question:

yt = Etyt+1 − (it − Etπt+1) − δt,   (1)

πt = ξyt + βEtπt+1 + ut,   (2)

it = ρit−1 + (1 − ρ)[rt* + πt* + φπ(πt−1 − πt*) + φy(yt − yt*)] + εti,   (3)

where yt ≡ log Yt is the logarithm of aggregate demand (which in equilibrium equals aggregate output), E the expectations operator, it ≡ log It the continuously compounded nominal interest rate, and πt ≡ log Pt/Pt−1 the quarterly rate of inflation; Rt*, Πt*, and Yt* are the monetary policy targets for the real interest rate, inflation, and output, respectively; δt, ut, and εti represent shocks to aggregate demand, inflation, and the nominal interest rate, respectively, and are assumed to follow independent first-order autoregressive processes. The various parameters of the model are identified in microeconomic theory.

How are we supposed to find out whether the nominal interest rate (It) is a direct type-level cause of aggregate demand (Yt)? If we were supposed to find out about that relationship by conducting an RCT, we would first draw a random sample from a population of economic systems with reserve or central banks. We would secondly use randomization techniques to assign the systems in the sample to a treatment and a control group. We would thirdly ask governments (or any other responsible and competent bodies) to control for demand and inflation expectations because we believe them to directly type-level cause aggregate demand or the nominal interest rate. We would fourthly ask the central banks of the systems in the treatment group to take measures that increase or decrease It in (3). We would finally check whether the ensuing change in the nominal interest rate would be followed by a change in aggregate demand only in the treatment group.

Why is it that we would never carry out any RCT like that? The first reason is that we cannot know whether we can control for demand and inflation expectations through direct human intervention. In the example from labor economics, the

The In-Principle Inconclusiveness of Causal Evidence  73 variables that are believed to type-level cause X or Y can be controlled for through direct human intervention. In the fictitious RCT described above, by contrast, we cannot know whether some of the variables that are believed to type-level cause X or Y (demand and inflation expectations) can be controlled for through direct human intervention. We cannot know whether demand and inflation expectations can be controlled for because the information on which their formation is based might already include the information that there is an attempt to control for them. An attempt to control for demand expectations resembles a government’s attempt to sugarcoat the indicators of economic activity. Economic agents won’t be taken in by such an attempt as long as their expectations are formed rationally. An attempt to control for inflation expectations, by contrast, resembles a central bank’s attempt to “anchor” inflation expectations, i.e., to influence inflation expectations in a way that renders them largely invariant over time or even invariant to monetary policy interventions. And there is considerable disunity among macroeconomists whether or not a central bank can influence inflation expectations in this way. Recall from the previous chapter (especially Section 4.5) that even if we were able to control for demand and inflation expectations, there would be no way to find out about that ability. There would be no way to find out about that ability because expectations variables cannot be measured. It is true that researchers sometimes use survey data to provide evidence for or against that capability. But strictly speaking, an agent’s statement about what she expects can neither be verified nor trusted. Even if it could be verified or trusted, there would be the additional problem that she might revise or update her expectation soon after answering the survey. This problem suggests that in order to be able to measure individual expectations, one would have to understand the way in which agents form (i.e., revise or update) these expectations. But we are far from understanding the way in which agents form expectations. There is, however, a second reason why we would never carry out an RCT to find out whether It is a direct type-level cause of Yt. In the example from labor economics, the random sample drawn from the population of unemployed individuals can be large enough to ensure that treatment and control groups are at least nearly identical with respect to possible confounders. In the case of our fictitious trial, by contrast, the random sample that we would draw from the population of economic systems with a central bank is necessarily small. Our sample might include Bolivia, Moldova, Slovakia, Thailand, the USA, and Zimbabwe, and some randomizing procedure (like flipping a coin) might assign Moldova, Thailand, and Zimbabwe to the treatment, and Bolivia, Slovakia, and the USA to the control group. Even if institutions were able and willing to fully control for demand and inflation expectations and to change nominal interest rates as requested, and even if a change in the nominal interest rate were followed by a change in aggregate demand only in the treatment group, there would be no guarantee that the change in aggregate demand is attributable to the change in the nominal interest rate. 
We would rather take the change in aggregate demand as an invitation to search for unknown type-level causes of aggregate demand that are present in (perhaps only one or two of the systems in) the treatment group but not in the control group. And we wouldn’t

74 Causality be surprised if we learned that the values of variables standing e.g. for investment, government purchases or net exports in the systems of the treatment group were significantly different from the values of those variables in the systems of the control group. 4.3 A Natural Experiment in Macroeconomics The famous natural experiment that Friedman and Schwartz (1963) observe can be read as an attempt to avoid the two difficulties. A natural experiment is by definition an experiment in which control over the variables that potentially type-level cause X and Y isn’t exercised by direct human intervention but by nature. And a peculiarity about Friedman and Schwartz’s experiment is that it turns from a plurality of economic systems monitored during a particular time interval to a plurality of time intervals during which one particular system is monitored. Friedman and Schwartz (1963, p. 676) argue, more specifically, that a look at the monetary history of the USA from 1867 to 1960 teaches three things: that “[c]hanges in the behavior of the money stock have been closely associated with changes in economic activity,” that “[t]he interrelation between monetary and economic change has been highly stable,” and that “[m]onetary changes have often had an independent origin; they have not been simply a reflection of changes in economic activity.” The most important case of what Friedman and Schwartz (1963, p. 692) believe is an independent occurrence of a monetary change is the contraction of the money stock that followed the death of Benjamin Strong (the president of the Federal Reserve Bank of New York) in 1928. With respect to Strong’s death, Friedman and Schwartz (1963, p. 693) speak of a “quasi-controlled experiment.” Friedman and Schwartz rarely employ the terms ‘cause’ or ‘causes’ in the Monetary History, and Friedman even explicitly disapproves of the use of these terms.4 But Hoover (2009, p. 306) points out that the Monetary History is full of causatives (terms like ‘influences,’ ‘increases,’ ‘engenders,’ ‘affects’ etc.). It is also clear that Friedman and Schwartz aim to derive a policy conclusion: the conclusion that contractionary (expansive) monetary policies lead to monetary contractions (expansions) and that monetary contractions (expansions) lead to economic contractions (expansions). Their quasi-controlled experiment may accordingly be interpreted as an experiment that is meant to provide evidence in support of a causal hypothesis: the hypothesis that monetary changes directly type-level cause economic changes. A closer look at their argument (Friedman and Schwartz 1963, especially pp. 686–95) reveals, moreover, that the experiment that is meant to provide evidence in support of that hypothesis can be regarded as an experiment in which a binary intervention variable I (contractionary monetary policy: yes or no) is set to ‘yes’ or ‘no’ by a procedure (Strong’s death) that is at least quasi-randomizing: in which I is set to ‘yes’ in the USA for the period from 01/1929 to 03/1932, and to ‘no’ in the USA for most other periods,5 and in which an important economic contraction is observed only for the period from 01/1929 to 03/1932. That contraction, Friedman and Schwartz (1963, p. 694) conclude is “strong evidence for the economic

The In-Principle Inconclusiveness of Causal Evidence  75 independence of monetary changes from the contemporary course of income and prices” and thus for the hypothesis that monetary changes directly type-level cause economic changes. That conclusion is an overstatement, however. In order to see why, consider the case of rational expectations and the case that Robert King and Charles Plosser (1984) make to support their hypothesis of “reverse causation.” In the case of rational expectations, agents make the best use of whatever information is available to them to form expectations of key variables (such as money supply, GDP, and prices) in a manner consistent with the way the economy actually operates. A typical rational expectations model (cf. e.g. Dornbusch et al. 71998, pp. 166– 68) predicts that monetary changes directly type-level cause economic changes unless agents fully anticipate the monetary policy measures leading to the monetary changes. In the case of rational expectations, one may accordingly say that I doesn’t only type-level cause X (monetary contraction: yes or no) but also Z, while Y (economic contraction: yes or no) isn’t only type-level caused by X but also by Z, where Z denotes rational expectations under full anticipation of monetary policy changes (yes or no). In this case, I satisfies conditions (I1) and (I2) but not conditions (I3) and (I4): I type-level causes X and breaks any other arrow that is directed into X; but there is also a variable Z that type-level causes Y, is on a directed path that doesn’t go through X, and is correlated with I because I type-level causes Z. King and Plosser (1984), by contrast, argue that aggregate measures of the money stock (such as M2) aren’t set directly by the Federal Reserve but are determined by the interaction of the supply of high-powered money with the behavior of the banking system and the public and that changes in the values of both the money stock and aggregate output result from the decision of firms to shrink production and to decrease their money holdings accordingly. If they are right, then I doesn’t satisfy any of conditions (I1)–(I4): I doesn’t satisfy (I1) because I doesn’t type-level cause X at all; I doesn’t satisfy (I2) because X isn’t determined by I at all; and I doesn’t satisfy (I3) and (I4) because (I3) and (I4) are vacuous (there isn’t any directed path from I to Y). The important economic contraction that Friedman and Schwartz observe for the period from 01/1929 to 03/1932 therefore cannot represent strong evidence in support of the hypothesis that monetary changes directly type-level cause economic changes. In the case of rational expectations, it’s conceivable that rational agents who had a great deal at stake fully understood the “active phase of conflict” that Friedman and Schwartz (1963, p. 692) argue was unleashed by Strong’s death, that these agents correctly anticipated the contractionary monetary policy that was characteristic of that phase, and that the monetary contraction caused by that policy therefore didn’t cause the economic contraction. And in King and Plosser’s case of “reverse causation,” it is not implausible that policy measures didn’t play any major role, and that it was a general pessimistic outlook that led firms to decide to shrink production, i.e., to increase their money holdings (thereby inducing the monetary contraction) and to reduce production (thereby effectuating the economic contraction). 
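The observational equivalence at issue can be made vivid with a small simulation. The following sketch (all variables and parameter values are purely illustrative; it is not part of Friedman and Schwartz’s, King and Plosser’s, or the rational expectations theorists’ analyses) draws samples from three linear-Gaussian structures — one in which X causes Y, one in which Y causes X, and one in which a hidden Z causes both — that have been tuned to imply the same covariance matrix for X and Y; data on X and Y alone cannot tell them apart.

```python
# A minimal sketch (illustrative parameters, not estimates from any dataset): three
# linear-Gaussian structures -- X -> Y, Y -> X, and X <- Z -> Y with Z hidden --
# are tuned to imply the same covariance matrix for (X, Y). Samples drawn from them
# are therefore indistinguishable at the level of the joint distribution of X and Y.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def model_x_causes_y():
    x = rng.normal(0, 1, n)
    y = 0.6 * x + rng.normal(0, np.sqrt(0.64), n)    # Y = 0.6 X + error
    return x, y

def model_y_causes_x():
    y = rng.normal(0, 1, n)
    x = 0.6 * y + rng.normal(0, np.sqrt(0.64), n)    # X = 0.6 Y + error
    return x, y

def model_common_cause():
    z = rng.normal(0, 1, n)                          # hidden common cause
    x = np.sqrt(0.6) * z + rng.normal(0, np.sqrt(0.4), n)
    y = np.sqrt(0.6) * z + rng.normal(0, np.sqrt(0.4), n)
    return x, y

for label, draw in [("X -> Y", model_x_causes_y),
                    ("Y -> X", model_y_causes_x),
                    ("X <- Z -> Y", model_common_cause)]:
    x, y = draw()
    print(label, np.round(np.cov(x, y), 2))          # ~[[1.0, 0.6], [0.6, 1.0]] in all three cases
```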
The economic contraction that Friedman and Schwartz observe for the period from 01/1929 to 03/1932 represents a piece of evidence that is too inconclusive

76 Causality to disentangle a set of competing and observationally equivalent hypotheses: the hypothesis that X directly type-level causes Y, the hypothesis that Y directly typelevel causes X (King and Plosser’s hypothesis of “reverse causation”) and the hypothesis that there is a variable (or set of variables) Z that directly type-level causes both X and Y (the hypothesis that obtains in the case of rational expectations). In Friedman and Schwartz’s (1963, p. 686) discussion, the observational equivalence of these hypotheses is expressed as follows: “The monetary changes might be dancing to the tune called by independently originating changes in the other economic variables; the changes in income and prices might be dancing to the tune called by independently originating monetary changes; […] or both might be dancing to the common tune of still a third set of influences.” Friedman and Schwartz go on to claim that “a wide range of qualitative evidence […] provides a basis for discriminating between these possible explanations of the observed statistical covariation.” The above considerations suggest, however, that the “wide variety of qualitative evidence” is wanting. And if it is wanting, then the three competing hypotheses cannot be disentangled. The three competing hypotheses could be disentangled if Z could be controlled to ‘no’ for the period from 01/1929 to 03/1932 or to identical values for all periods between 1867 and 1960; if randomization techniques could be applied to ensure that Z is evenly distributed over all these periods; or if the decisions of firms to shrink or expand production could be controlled for in effective ways. It is unclear, however, whether rational expectations or decisions of firms to shrink or expand production can be controlled for through direct human intervention. And the time periods between 1867 and 1960 are probably just as diverse and small in number as the economic systems that could be assigned to treatment and control groups in a fictitious RCT like the one considered in the preceding section (remember that two world wars and the Great Depression occurred between 1867 and 1960). Therefore, the three competing hypotheses are bound to remain entangled. Christina Romer and David Romer (1989) attempt to provide evidence along similar lines as Friedman and Schwartz. They search the records of the Federal Reserve for the postwar period to find evidence of policy shifts that were designed to lower inflation, not motivated by developments on the real side of the economy, and followed by recessions. They identify six such shifts, the most prominent being the monetary contraction that occurred shortly after Paul Volcker became chairman of the Federal Reserve Board in October 1979, and that was followed by one of the largest recessions in postwar US history. Romer and Romer (1989) argue that the monetary contraction was motivated by a desire to reduce inflation, and not by the presence of other forces that would have caused output to decline in any event. 
But their argument remains open to the sort of objections that can be raised in the case of rational expectations and in King and Plosser’s case of “reverse causation.” In the case of rational expectations, it’s conceivable that rational agents who had a great deal at stake fully understood that Volcker was going to fight inflation, that these agents correctly anticipated the contractionary monetary policy that ensued shortly after Volcker became chairman of the Federal Reserve Board, and that the monetary contraction caused by that policy therefore didn’t cause the economic

The In-Principle Inconclusiveness of Causal Evidence  77 contraction. And in King and Plosser’s case of “reverse causation,” it is not implausible that the contractionary monetary policy didn’t play any major role, and that it was a general pessimistic outlook that led firms to decide to shrink production, i.e., to increase their money holdings (thereby inducing the monetary contraction) and to reduce production (thereby effectuating the economic contraction). 4.4 The General Case The conclusion to be drawn states that the evidence deriving from these natural experiments is too inconclusive to turn the belief that monetary changes directly type-level cause economic changes into knowledge. Note that this conclusion holds in principle, and not just in the case of these experiments. It holds in principle because hidden variables operate whenever there is an attempt to control for an intervention variable I. A hidden variable, as I understand it, is a variable that denotes some macroeconomic aggregate, that cannot be measured, that might be incapable of manipulation through (policy or experimental) intervention, and that type-level causes Y. King and Plosser (1984, p. 363) refer to their hypothesis as one of “reverse causation.” But their hypothesis may also be read as one of confounding: the decisions of firms to shrink or expand production act as a common cause of both monetary and economic changes. These decisions undoubtedly cannot be measured. We might conduct a survey and ask firms how much they think they are going to produce so and so many quarters ahead. But as in the case of demand or inflation expectations (or expectation variables more generally), responses are to be dismissed as neither verifiable nor trustworthy. We therefore won’t be able to find out whether these decisions are capable of manipulation through (policy or experimental) intervention. In King and Plosser’s case of “reverse causation,” the hidden variable in question (i.e., decisions of firms to shrink or expand production) is assumed to be causally independent of I (i.e., of monetary policy interventions). But one of the lessons of the Lucas critique states that interventions on I always token-level cause changes in hidden variables. Remember from Section 4.3 of the previous chapter that according to the Lucas critique, the assumption of the invariance of the expectations that agents form under alternative policy rules is extreme. If variables denoting these expectations are hidden in the sense indicated above, and if (a switch to) an alternative policy rule amounts to a policy manipulation of an intervention variable I that type-level causes X, then the Lucas critique can also be read as saying that I type-level causes hidden variables. I take this reading of the Lucas critique to be rather uncontroversial. The only point I’d like to add is that it doesn’t make any difference whether I is manipulated for policy or experimental purposes and that the Lucas critique therefore has a negative bearing on the effectiveness of the IV method in macroeconomics. In order to show that the hypothesis that X directly type-level causes Y is true, researchers need to show that there is an intervention variable I that satisfies conditions (I1)– (I4). They cannot show that there is such a variable unless a set Z of variables can

78 Causality be controlled for in effective ways: unless it is possible to distribute the variables in Z evenly over all subjects of investigation or to control for these variables through nature or direct human intervention. In many special sciences such as microeconomics or pharmacology, Z can be controlled for in effective ways. In macroeconomics, however, Z includes hidden variables that in the case of the Lucas critique (in cases like King and Plosser’s) are type-level caused by I (are not type-level caused by I but type-level cause X). In macroeconomics, moreover, subjects of investigation (a plurality of economic systems monitored during a particular time interval or a plurality of time intervals during which one particular system is monitored) are too diverse and small in number for randomization to lead to even distributions of Z. In macroeconomics, the evidence that can be provided in support of conditions (I1)–(I4) is therefore in principle too inconclusive to disentangle a set of competing and observationally equivalent hypotheses, i.e., to support the hypothesis that X directly type-level causes Y (or to turn belief in the truth of that hypothesis into knowledge). There are no less than four objections that it seems can be raised against the general case. The first objection is that in natural experiments in macroeconomics, evidence not only springs from correlations or co-occurrences of events but also from the temporal order of events and the largeness of effects and that all pieces of evidence combine to disentangle competing and observationally equivalent hypotheses. In response to that objection, one needs to point out that temporal order or largeness of effects rarely plays any prominent role in natural experiments in macroeconomics. The monetary policy contraction that Friedman and Schwartz (1963, pp. 688–9) observe for 10/1931 isn’t followed but rather accompanied by a monetary and economic contraction in the same month; and the economic contraction that they observe for that month is relatively small because industrial production had been declining already for more than a year. But even if the order of events were such that the monetary contraction was followed by the economic contraction, new classical macroeconomics like King and Plosser could still defend the case of reverse causation: the monetary contraction occurred before the economic contraction because firms first decided to shrink production, then decreased their money holdings (inducing the monetary contraction), and then shrank production (effectuating the economic contraction). And even if the economic contraction were sharp and distinctive, rational expectation theorists could still deny that it was token-level caused by a monetary contraction: the monetary contraction occurred because of the contractionary monetary policy, but the economic contraction didn’t occur because of the monetary contraction because economic agents fully anticipated the decisive steps that the Federal Reserve was going to take. The second objection that it seems can be raised against the above generalization says that conditions (I1)–(I4) are too strong and should be replaced with a weaker set of conditions. Julian Reiss (2005, pp. 973–5), for instance, claims that skipping (I2) “is more in line with econometric practice.”6 And he cites wellknown examples from the econometric literature on instrumental variables and natural experiments in order to support his claim. 
But in macroeconomics, even Reiss’s weakened set of conditions is unlikely to be satisfied. His set still includes

The In-Principle Inconclusiveness of Causal Evidence  79 conditions (I1), (I3), and (I4). And remember from the preceding section that conditions (I3) and (I4) are violated in the case of rational expectations under full policy anticipations and that conditions (I1), (I3), and (I4) are violated in the case of reverse causation. The third objection is directed against the in-principle modality of the above generalization: perhaps evidence supporting conditions (I1)–(I4) is currently too inconclusive to disentangle competing and observationally equivalent hypotheses, but why should we think that it is too inconclusive as a matter of principle? Shouldn’t we expect scientific progress to render it sufficiently conclusive in the end? At this stage, it is impossible to surmise whether there will be any progress of that sort. Derivations of more conclusive evidence require greater numbers of or less diverse subjects of investigation or effective ways of controlling for individual expectations and decisions through direct human intervention. And at present, it is unclear how macroeconomics could ever meet these requirements. But we cannot rule out that at one point, it will be able to meet these requirements. Experimental economists, for instance, might at one point be able to measure the variables that currently appear hidden. Or perhaps researchers conducting surveys might at one point be able to develop the methodologies that render subjects’ statements about what they expect more trustworthy or verifiable. My use of the term ‘in principle’ should therefore be restricted to the inconclusiveness of causal evidence in macroeconomics, as we know it today. The fourth objection says that in order to show that X directly type-level causes Y, researchers don’t need to show that there is an intervention variable that satisfies conditions (I1)–(I4); they can also carry out the Hoover or AK test. This objection is arguably the most important one. Nowadays hardly any macroeconomist doubts that RCTs cannot be conducted in macroeconomics, or that the evidence deriving from natural experiments in macroeconomics is too inconclusive to disentangle competing and observationally equivalent hypotheses. Most macroeconomists believe that causal inference must proceed from “a statistical relevance basis” (Hoover 2001, p. 149). Since the Hoover and the AK test proceed from such a basis, they appear to represent promising alternatives to the IV method. The generalization defended in Section 4.4 doesn’t hold as long as the Hoover or AK test can be upheld as a method of inferring causal evidence that is sufficiently strong to turn causal belief into knowledge. But the aim of the following two sections is to show that the evidence provided by these tests is in principle too inconclusive to turn causal belief into knowledge. I will argue that a statistical relevance basis determines causal structure only insufficiently and that the additional steps that the Hoover and AK test take do not sufficiently determine that structure either. I will also argue that in order to determine that structure sufficiently one needs to assume the validity of conditions (I1), (I3), and (I4), i.e., the validity of conditions of which the present and the preceding two sections have shown that the evidence that can be provided in support of them is in principle too inconclusive to support the hypothesis that X directly type-level causes Y, where X and Y stand for macroeconomic aggregates. 
My argument is going to rely substantially on the work of Pearl (2009, especially Chapters 1, 3, and 5).
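To make the role of conditions (I3) and (I4) concrete, the following sketch compares a case in which an intervention variable I affects Y only through X with a Lucas-critique case in which I also shifts a hidden expectations variable Z that type-level causes Y. The variable names and parameter values are hypothetical, and the sketch is a minimal illustration rather than an implementation of any of the tests discussed below; the simple IV estimate cov(I, Y)/cov(I, X) recovers the true coefficient in the first case and is biased in the second.

```python
# A stylized sketch (hypothetical variables and parameters) of why conditions (I3) and
# (I4) matter for the IV method: when the instrument I also shifts a hidden variable Z
# that type-level causes Y -- the situation suggested by the Lucas critique, with Z
# read as an expectations aggregate -- the IV estimate no longer recovers the true
# coefficient of X in the Y equation.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha = 0.5                        # true direct effect of X on Y

def iv_estimate(i, x, y):
    # simple Wald/IV estimator: cov(I, Y) / cov(I, X)
    return np.cov(i, y)[0, 1] / np.cov(i, x)[0, 1]

i = rng.normal(0, 1, n)            # intervention/instrumental variable

# Case 1: (I1)-(I4) hold -- I affects Y only through X.
x1 = i + rng.normal(0, 1, n)
y1 = alpha * x1 + rng.normal(0, 1, n)

# Case 2: I also moves a hidden variable Z that affects Y directly, so there is a
# directed path from I to Y that does not go through X (violating (I3)/(I4)).
z = 0.8 * i + rng.normal(0, 1, n)
x2 = i + rng.normal(0, 1, n)
y2 = alpha * x2 + 1.0 * z + rng.normal(0, 1, n)

print("valid instrument:    ", round(iv_estimate(i, x1, y1), 3))   # ~0.5
print("Lucas-critique case: ", round(iv_estimate(i, x2, y2), 3))   # ~1.3, not 0.5
```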

4.5 The Hoover Test

The Hoover test is a testing procedure that Hoover (2001, pp. 214–7) says consists of three steps. The first step is to look at time-series data and use non-quantitative and extra-statistical historical and institutional insights to assemble a chronology of interventions. This step is supposed to serve two purposes. The first is to divide history into periods with and without interventions so that the periods without interventions (the tranquil periods) can form the baseline against which structural breaks (i.e., changes in any of the parameters of the process of a particular variable) can be identified. The second purpose is to provide cross-checks to statistical tests: a structural break detected at a time when no interventions can be identified may indicate econometric misspecification.

The second step of the Hoover test is to apply LSE methodology to specify a statistical model for each of the tranquil periods separately. LSE methodology operates by (i) specifying a deliberately overfitting general model, by (ii) subjecting the general model to a battery of diagnostic (or misspecification) tests (i.e., tests for normality of residuals, absence of autocorrelation, absence of heteroscedasticity and stability of coefficients), by (iii) testing for various restrictions (in particular, for the restriction that a set of coefficients is equal to the zero vector) in order to simplify the general model, and by (iv) subjecting the simplified model to a battery of diagnostic tests. If the simplified model passes these tests, LSE methodology continues by repeating steps (i)–(iv), i.e., by using the simplified model as a general model, by subjecting that model to a battery of diagnostic tests etc. Simplification is complete if any further simplification either fails any of the diagnostic tests or turns out to be statistically invalid as a restriction of the more general model.

The third step of the Hoover test is to use the simplified models for the baseline periods to identify structural breaks. By way of example, imagine that the simplified model resulting from the second step of the Hoover test is the following two-equation model (cf. Hoover 2001, pp. 192–3):

y = αx + ε, ε ~ N(0, σε²),

x = β + η, η ~ N(0, ση²),

where X and Y may stand for any of the aggregate quantities referred to above (the nominal interest rate and aggregate demand, respectively, the money stock and aggregate output, respectively etc.), taxes and government spending, respectively (as in the example that Hoover 2001, Section 8.1 and Chapter 9, discusses), or prices and money, respectively (as in the example that Hoover 2001, Chapter 10, discusses), where N(⋅,⋅) indicates a normal distribution characterized by its mean and variance, and where cov(ε, η) = E(εtεs) = E(ηtηs) = 0, for t ≠ s. The reduced form equations of that model run as follows:

y = αβ + αη + ε,

x = β + η.

The reduced form equations describe the joint probability distribution of X and Y, P(x,y), that can be partitioned into conditional and marginal distributions in two distinct ways: P(x,y) = P(y⎹ x)⋅P(x) = P(x⎹ y)⋅P(y). Using the reduced form equations, the conditional and marginal distributions can be calculated as follows:

P(y⎹ x) = N(αx, σε²),

P(x) = N(β, ση²),

P(x⎹ y) = N([αση²y + βσε²]/[α²ση² + σε²], [ση²σε²]/[α²ση² + σε²]),

P(y) = N(αβ, α²ση² + σε²).

The third step of the Hoover test will identify a structural break in {β, ση²}, i.e., the parameters of the X process, if the simplified model that results from the second step is characteristic of two adjacent tranquil periods, if the first step manages to identify interventions on X that occur in between these periods, and if the parameters of P(x) and P(x⎹ y) break statistically in between these periods. Mutatis mutandis, the third step will identify structural breaks in {α, σε²}, i.e., the parameters of the Y process. The Hoover test will conclude that X directly type-level causes Y if the parameters of P(y⎹ x) remain invariant to changes in {β, ση²} and the parameters of P(x) invariant to changes in {α, σε²}.

The parameters of P(y⎹ x) and P(x) will remain invariant to changes in {β, ση²} and {α, σε²}, respectively, if and only if the parameters of P(x,y) are identified (or structural) and {β, ση²} and the parameters of P(y⎹ x), on the one hand, and {α, σε²} and the parameters of P(x), on the other, are variation-free, i.e., mutually unconstrained.7 A look at the above calculations of conditional and marginal distributions shows that {β, ση²} and the parameters of P(y⎹ x), on the one hand, and {α, σε²} and the parameters of P(x), on the other, are indeed variation-free. But are the parameters of P(x,y) identified? Pearl (2009, pp. 149–50) points out that in order for α in y = αx + ε to be identified, there must be no variable (or set of variables) Z that d-separates X from Y, where Z is said to d-separate X from Y or “block” a path p between X and Y if and only if (i) p contains a chain X → M → Y or a fork X ← M → Y such that the middle node M is in Z, or (ii) p contains an inverted fork (or collider) X → M ← Y such that the middle node M is not in Z, and such that no descendant of M is in Z (cf. Pearl 2009, pp. 16–7). The basic problem with the Hoover test is that its three steps do not sufficiently guarantee that there is no path-blocking Z, and that it therefore cannot show that α is identified.

The same problem can be restated by noting that the three steps of the Hoover test do not sufficiently guarantee that the parameters of P(y⎹ x) remain invariant to

changes in {β, ση²}. In order for the parameters of P(y⎹ x) to remain invariant to changes in {β, ση²}, P(y⎹ do(x), do(z)) = P(y⎹ do(x)) needs to hold for all Z that d-separate X from Y, where do(x) is the operator that Pearl (2009, p. 70) introduces to denote the intervention that sets X to x, and where do(z) denotes the intervention that controls for any path-blocking variable Z.8 Again, the problem with the Hoover test is that its three steps do not sufficiently guarantee that there is no path-blocking Z, and that it therefore cannot show that the parameters of P(y⎹ x) remain invariant to changes in {β, ση²}.

Hoover might respond that interventions on Z won’t escape the researcher’s attention in the first step, that Z can be included in the deliberately overfitting general model that in the second step is subjected to LSE methodology and simplified to a statistical model that might be less parsimonious than the exemplary model cited above. But what if Z is a hidden variable like the ones mentioned in Sections 4.2–4.4: an unobservable (and possibly uncontrollable) variable denoting inflation, demand, or GDP expectations, or the decisions of firms to shrink or expand production? If Z is a hidden variable, then the simplified model will provide a statistical relevance basis for an arbitrary number of competing causal models, i.e., then the following causal graphs will be observationally equivalent: X → Y, X → Z → Y, X ← Z → Y.

Perhaps Hoover believes that hidden variables are not causally relevant in all areas of macroeconomics and that the areas in which they are relevant exclude the areas to which he applies his three-step testing procedure. It is important to see, however, that this belief would be unjustified. Hoover applies his three-step procedure to provide evidence in support of two hypotheses: the hypothesis that taxes directly type-level cause government spending (cf. Hoover 2001, Chapter 9) and the hypothesis that prices directly type-level cause money (cf. Hoover 2001, Chapter 10). Hoover is aware, of course, that it is impossible to say that these hypotheses are true a priori and that it is easy to construct credible hypotheses about other causal structures. He seems to be unaware, however, that some of the aggregates that fill positions in these structures cannot be measured. In the case of his first hypothesis, Hoover seems to underestimate the implications of the constant-share model that he analyzes at the outset of his case study (cf. Hoover 2001, pp. 228–9). According to the constant-share model, taxes and government spending are causally independent because GNP type-level causes both taxes and government spending. But when carrying out the first step of his procedure, Hoover doesn’t assemble a chronology of interventions on GNP but only a chronology of interventions in the shape of changes in military and federal spending and tax bills and tax reforms (cf. Hoover 2001, pp. 229–31). Hoover might respond that assembling a chronology of interventions on GNP wouldn’t be exceedingly difficult, and that GNP can be included in the deliberately overfitting

The In-Principle Inconclusiveness of Causal Evidence  83 general model that in the second step of his procedure is subjected to LSE methodology and simplified to a statistical model that might be less parsimonious than a simple bivariate model for taxes and government spending. Remember from Section 4.3, however, that GNP is type-level caused by a hidden variable, i.e., the decisions of firms to shrink or expand production. The second of the three case studies that Lucas (1976, pp. 30–5) discusses to support his critique suggests, moreover, that this hidden variable is type-level caused by tax policy.9 In the case of his second hypothesis, Hoover (2001, pp. 260–1) runs money regressions with and without Federal Reserve policy instruments (reserves, the discount rate, and the Federal funds rate) in order to show that these instruments don’t need to be included among the regressors of money regressions. But the rationale for including these instruments is the possible presence of a causal chain (a “Federal Reserve reaction function”) that runs from prices through the Federal Reserve policy instruments and their effects on the banking system and the public to the stock of deposits. Aggregates that cannot be measured (such as inflation expectations or the decisions of firms to shrink or expand production) are likely to fill positions in that chain. In order to show that these aggregates don’t need to be included among the regressors of money regressions, one would have to run money regressions that do and do not include these aggregates among their regressors. And the problem is, of course, that these aggregates cannot be included because they cannot be measured. Hoover (2001, pp. 213–5, 276) is certainly aware of most of the difficulties that have been mentioned. He notes that the first step of assembling a chronology of interventions can be exceedingly difficult and that the range of possible causal interactions could have “been expanded to include more fully the role of interest rates or the role of real variables and so forth.” He also suggests that the LSE methodology to be applied in the second step has the important drawback of explicitly designing regressions that have desirable properties like normality of residuals, absence of autocorrelation etc. But he doesn’t believe that his testing procedure must fail as a matter of principle, or that the evidence provided by that procedure is in principle too inconclusive to support specific causal hypotheses. All he believes is that his test is “not necessarily easy to implement,” that it “may work sometimes” etc. (cf. Hoover 2001, p. 213). And that it does work, he thinks he can show with his two case studies about the causal relations between taxes and government spending and between money and prices, respectively. In the remainder of this section, I’d like to go a bit further than Hoover and argue that his testing procedure must fail as a matter of principle. It must fail as a matter of principle because it relies on conditions (I1), (I3), and (I4) from Woodward’s definition of ‘intervention variable,’ and because Sections 4.2–4.4 have shown that the evidence that can be provided in support of these conditions is in principle too inconclusive to support the hypothesis that X directly type-level causes Y, where X and Y stand for macroeconomic aggregates. 
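The dependence of the invariance pattern on hidden variables can be illustrated with a small simulation (parameter values are hypothetical, and the sketch is not a reconstruction of Hoover’s case studies). In the exemplary two-equation model, a break in β leaves the regression of y on x untouched — the pattern the test reads as evidence that X causes Y. If a hidden variable Z responds to the intervention on β and type-level causes Y, the regression of y on x shifts with β, so the verdict of the invariance check turns on facts about hidden variables that the three steps cannot establish.

```python
# A minimal sketch (hypothetical parameter values) of the invariance pattern the Hoover
# test looks for, and of how it depends on hidden variables. Regime 1 and regime 2
# differ by an intervention on beta, the parameter of the X process.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
alpha = 0.5

def fit(x, y):
    slope, intercept = np.polyfit(x, y, 1)     # y-on-x regression (slope, intercept)
    return round(slope, 2), round(intercept, 2)

for beta in (1.0, 3.0):                        # structural break in the X process
    x = beta + rng.normal(0, 1, n)

    # Case A: the exemplary model y = alpha*x + eps, no hidden variables.
    y_a = alpha * x + rng.normal(0, 0.5, n)

    # Case B: a hidden variable Z (say, an expectations aggregate) responds to the
    # policy parameter beta and type-level causes Y.
    z = 0.6 * beta + rng.normal(0, 0.5, n)
    y_b = alpha * x + 1.0 * z + rng.normal(0, 0.5, n)

    print(f"beta = {beta}: y-on-x without hidden Z {fit(x, y_a)}, with hidden Z {fit(x, y_b)}")
# Without Z, the y-on-x regression is stable across the break (the pattern read as
# 'X causes Y'); with Z, the intercept shifts with beta although X still causes Y.
```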
In order to see that the Hoover test relies on condition (I1), i.e., on the condition that I type-level causes X, note that a structural break in any of the parameters of the X process must be understood as an

intervention on a parameter-intervention variable that type-level causes X. Hoover (2011, p. 348) agrees with this understanding when noting that with respect to condition (I1), there is no fundamental difference between his structural account of causality and Woodward’s interventionist account.10 In order to see that the Hoover test also relies on conditions (I3) and (I4), i.e., on the condition that any directed path from I to Y goes through X, and on the condition that I is statistically independent of any variable Z that type-level causes Y and is on a directed path that does not go through X, remember from Chapter 2 (Section 2.4) that these conditions are meant to rule out the following cases (cf. Woodward 2003, pp. 101–02):

Figure 4.1 The four cases ruled out by Woodward’s conditions (I3) and (I4) (cf. Woodward 2003, pp. 101, 102).

These cases represent forks that contain path-blocking middle nodes: {I}, {I, Z} or {I, Z, W}. Conditions (I3) and (I4) are therefore among the conditions that need to be satisfied in order for α to be identified, or for the parameters of P(y⎹ x) to remain invariant to changes in {β, ση²}. And the Hoover test relies on these conditions in the sense that its three-step procedure does not sufficiently guarantee that they are satisfied.

One might wonder why the Hoover test relies on conditions (I1), (I3), and (I4) but not on condition (I2), i.e., on the condition that certain values of I are such that when I attains these values, X is no longer determined by other variables that type-level cause it but only by I. The reason is that condition (I2) is not necessary for inferring that X directly type-level causes Y. In order to assess whether X directly type-level causes Y, one might find it convenient to check whether intervening on I breaks all arrows that are directed into X and depart from variables other than I. But Reiss is right when suggesting that in general, checking whether intervening on I breaks all these arrows is not necessary for assessing whether X directly type-level causes Y (cf. Section 4.4 above).11

4.6 The AK Test

The AK test is a testing procedure that derives the hypothesis that X is a total type-level cause of Y from the following two conditions:

(a) ∑Z P(y⎹ x, z)⋅P(z) ≠ 0,

(b) potential outcomes of Y are probabilistically independent of X given Z,

where the expression in (a) measures the causal effect of X on Y, where (b) is a condition that Angrist and Kuersteiner (2011, p. 729) refer to as “selection-on-observables assumption” (or SOA, for short),12 and where Z is an admissible (or de-confounding) set of variables. It is true that so far, the chapter has been concerned with direct type-level causation, and that Angrist and Kuersteiner aim to derive a hypothesis of total type-level causation. Note, however, that the AK test would turn into an econometric procedure that tests for direct type-level causation if the following condition were added:

(c) all direct type-level causes of Y except X remain fixed by intervention.

This condition is meant to ensure that the relation of type-level causation between X and Y is direct (cf. Pearl 2009, pp. 127–8).13

The hypothesis that Angrist and Kuersteiner (2011) aim to derive states that changes in the federal funds rate that the Federal Open Market Committee (FOMC) intends at time t (∆FFt) directly type-level cause changes in real GDP at time t + j (∆GDPt + j), where j is the number of quarters ahead of t. It is true that they do not derive that conclusion from the inequality

(a′) ∑Zt P(∆gdpt + j⎹ ∆fft, zt)⋅P(zt) ≠ 0.
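A small worked example (with purely hypothetical probabilities) may help to see what the expression in (a) measures and how it differs from the raw conditional probability P(y⎹ x):

```python
# A small worked example (purely hypothetical probabilities) of the quantity in (a):
# the adjustment formula sum_z P(y | x, z) P(z) for a binary treatment X, outcome Y,
# and a single admissible covariate Z, contrasted with the raw conditional P(y | x),
# which mixes in the confounding through Z.
p_z = {1: 0.5, 0: 0.5}                        # P(z)
p_x_given_z = {1: 0.8, 0: 0.2}                # P(x=1 | z)
p_y_given_xz = {(0, 0): 0.1, (1, 0): 0.4,     # P(y=1 | x, z)
                (0, 1): 0.5, (1, 1): 0.8}

def adjusted(x):
    # sum_z P(y=1 | x, z) * P(z): the causal-effect measure in condition (a)
    return sum(p_y_given_xz[(x, z)] * p_z[z] for z in (0, 1))

def naive(x):
    # P(y=1 | x): weights P(y=1 | x, z) by P(z | x) instead of P(z)
    p_x = sum((p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]) * p_z[z] for z in (0, 1))
    return sum(p_y_given_xz[(x, z)]
               * ((p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]) * p_z[z] / p_x)
               for z in (0, 1))

print("adjusted contrast:", round(adjusted(1) - adjusted(0), 2))   # 0.3
print("naive contrast:   ", round(naive(1) - naive(0), 2))         # 0.54
```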

The causality test they use is a lot more sophisticated than a simple test for (a’) (cf. Angrist and Kuersteiner 2011, Sections III + IV and Table 3). But they are aware that their test relies on the assumption that (b’)

potential outcomes of ∆GDPt + j are independent of ∆FFt given Zt.

And that assumption is invalid unless Zt is admissible (or de-confounding). Angrist and Kuersteiner believe that Zt is admissible if the variables in Zt figure on the right-hand side of a causal model that adequately describes the process determining ∆FFt. Angrist and Kuersteiner (2011, p. 736) concede that they “do not really know how best to model the policy propensity score [i.e. the process determining ∆FFt]; even maintaining the set of covariates, lag length is uncertain, for example.” They therefore propose to specify a multiplicity of causal models for the process determining ∆FFt, to submit these models to a number of diagnostic (or misspecification) tests, and then to submit these models to their causality test.
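The general idea can be sketched as follows (the sketch is deliberately simplified: Angrist and Kuersteiner’s actual procedure is semiparametric and built around a policy propensity score, whereas the series and coefficients below are simulated and purely illustrative): regress the intended policy change on the covariates in Zt, treat the residual as the idiosyncratic policy shock, and ask whether future changes in real GDP respond to it.

```python
# A deliberately simplified, simulated sketch of the selection-on-observables idea.
# All series and coefficients are made up for illustration; nothing here reproduces
# Angrist and Kuersteiner's estimator or their data.
import numpy as np

rng = np.random.default_rng(3)
T = 1_000                                        # sample length (illustrative only)

z = rng.normal(size=(T, 3))                      # stand-in for the covariates in Z_t
policy_rule = z @ np.array([0.5, -0.3, 0.2])     # systematic response to Z_t
shock = rng.normal(0, 0.25, T)                   # unobserved 'idiosyncratic information'
dff = policy_rule + shock                        # intended federal-funds-rate change

# stand-in for the change in real GDP several quarters ahead: responds to Z_t and
# (with coefficient 0.4) to the policy shock
dgdp_ahead = z @ np.array([0.6, 0.1, -0.2]) + 0.4 * shock + rng.normal(0, 0.2, T)

# first stage: estimate the policy rule and recover the shock as a residual
X1 = np.column_stack([np.ones(T), z])
b_policy, *_ = np.linalg.lstsq(X1, dff, rcond=None)
shock_hat = dff - X1 @ b_policy

# second stage: do future output changes respond to the estimated shock?
X2 = np.column_stack([np.ones(T), shock_hat])
b_outcome, *_ = np.linalg.lstsq(X2, dgdp_ahead, rcond=None)
print("estimated response to the policy shock:", round(b_outcome[1], 2))  # close to 0.4
```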

But the models they propose do not differ a lot. They all include variables standing for lagged changes in the intended federal funds rate, predicted changes in real GDP, predicted inflation and predicted unemployment innovation,14 past changes in real GDP and past inflation, and changes in predictions since the previous meeting of the FOMC. And differences between these models essentially relate to the number of lagged or predicted values that they include. The model that Angrist and Kuersteiner (2011, pp. 736–7) say performs best in terms of statistical adequacy (with respect to the diagnostic tests) runs as follows:

∆fft = α + Βzt + εt,

where α is an intercept and Β a vector of parameters for the variables in Zt. The error term εt represents the “idiosyncratic information” to which policymakers are assumed to react. Angrist and Kuersteiner (2011, p. 727) note that this information cannot be observed and that it needs to be modeled as a stochastic shock (cf. part I, Section 1). Zt includes

- the change in the intended federal funds rate in t − 1: ∆fft − 1,
- changes in real GDP, inflation, and unemployment innovation in t, t + 1, and t + 2 that the FOMC predicts in t: ∆GDPt, Πt, ∆GDPt + 1, Πt + 1, ∆GDPt + 2, Πt + 2, Ut, and
- past changes in real GDP and inflation: ∆GDPt − 1, Πt − 1.

Zt also includes the changes in the predictions of inflation and changes in real GDP since the previous meeting of the FOMC. But I will drop these changes in order not to complicate matters too much. So according to Angrist and Kuersteiner, Zt is an admissible (or de-confounding) set of covariates if it includes the ten variables listed above, i.e., if

Zt = {∆FFt − 1, ∆GDPt − 1, ∆GDPt, ∆GDPt + 1, ∆GDPt + 2, Πt − 1, Πt, Πt + 1, Πt + 2, Ut}.

If Zt is an admissible (or de-confounding) set of covariates, then the conditional independence assumption (b′) will hold: then we will be allowed to ignore changes in the intended federal funds rate if we wish to determine potential outcomes of changes in real GDP. Intuitively, this makes a lot of sense: if the ten variables in Zt sufficiently determine the FOMC’s decision to set the federal funds rate at a particular level, then we don’t need to look at that level if we are interested in potential outcomes of changes in real GDP. The interesting and somewhat counterintuitive point is that we may nonetheless be allowed to say that changes in the intended federal funds rate directly type-level cause changes in real GDP. If changes in the intended federal funds rate are ignorable, conditionally on the ten variables in Zt, and if Angrist and Kuersteiner’s causality test leads to positive results, then changes in the intended federal funds rate can be said to directly type-level cause changes in real GDP.

Now, Angrist and Kuersteiner’s causality test does lead to positive results. In the case of their preferred model (the “baseline Romer model”), these results say that seven to twelve quarters ahead, there is a causal effect of changes in the intended

Now, Angrist and Kuersteiner's causality test does lead to positive results. In the case of their preferred model (the "baseline Romer model"), these results say that seven to twelve quarters ahead, there is a causal effect of changes in the intended federal funds rate on changes in real GDP at a significance level of 1% or 5%: that the probability that real GDPt + j changes as a result of a change in FFt lies somewhere between 0.040 and 0.092 for j = 7 … 12 (cf. Angrist and Kuersteiner 2011, Table 3).

But the problem with these results is that they are likely to be biased. They are likely to be biased because Angrist and Kuersteiner are likely to be misguided in their choice of covariates. Pearl (2009, pp. 79–80) emphasizes that Z doesn't qualify as admissible unless it satisfies the "backdoor criterion." He says that Z satisfies the backdoor criterion relative to an ordered pair of variables (X, Y) in a causal graph if Z (i) doesn't include any descendants of X and (ii) blocks every path between X and Y that contains an arrow into X. Remember from the preceding section that Z is said to "block" a path p if p contains at least one arrow-emitting node that is in Z or at least one collision node that is not in Z and has no descendants in Z (cf. Pearl 2009, pp. 16–7). And remember from Chapter 2 (Section 2.5) that Pearl (2009, p. 80) uses the following graph (Figure 4.2) to illustrate the backdoor criterion:

Figure 4.2 {Z3, Z4} and {Z4, Z5} satisfy the backdoor criterion; {Z6} and {Z4} don’t (cf. Pearl 2009, p. 80).

In this graph, {Z3, Z4} and {Z4, Z5} satisfy the backdoor criterion because they do not include any descendants of X, and because they block every path between X and Y that contains an arrow into X. It is clear that {Z6} does not satisfy the backdoor criterion because it is a descendant of X. It is worth noting, however, that by itself {Z4} doesn't satisfy the backdoor criterion either: it blocks the path X ← Z3 ← Z1 → Z4 → Y because the arrow-emitting node is in Z; but it does not block the path X ← Z3 ← Z1 → Z4 ← Z2 → Z5 → Y because none of the arrow-emitting nodes (Z1, Z2) is in Z, and because the collision node Z4 is not outside Z. Pearl (2009, pp. 80–1) then proves the proposition that P(y | do(x)) = ∑z P(y | x, z)⋅P(z) if and only if Z satisfies the backdoor criterion, where P(y | do(x)) is the probability that Y = y if X is set to x by intervention. For Angrist and Kuersteiner's analysis of monetary policy shocks this means that Zt is an admissible (or de-confounding) set of covariates if and only if Zt satisfies the backdoor criterion, i.e., if and only if Zt (i) doesn't include any descendants of ∆FFt and (ii) blocks every path between ∆FFt and ∆GDPt + j that contains an arrow into ∆FFt.

It is true that Pearl proves his theorem for variables that aren't time-indexed. But no matter if time-indexed or not, variables represent sets of potential values that are measurable or quantifiable. When time-indexed, they just represent sets of ordered pairs that assign a possible value to each possible point in time. There is accordingly no reason why Pearl's theorem shouldn't be applicable to time series data. Angrist and Kuersteiner are likely to be misguided in their choice of covariates because Zt is unlikely to satisfy the backdoor criterion. Here is the causal graph that corresponds to Angrist and Kuersteiner's preferred model:

Figure 4.3 Angrist and Kuersteiner’s preferred model (the “baseline Romer model”)

If the members of the FOMC believe that monetary policy has real-economy effects (as most of them do), then their predictions of changes in real GDP and inflation will depend on the federal funds rate that they intend to set now: then there will be causal arrows from ∆FFt (changes in the intended federal funds rate) to changes in real GDP and inflation in t + 1 and t + 2 (though perhaps not in t) that the FOMC predicts in t (violating condition (i) of the backdoor criterion)15:

Figure 4.4 If the members of the FOMC believe that monetary policy has real-economy effects, their predictions of changes in real GDP and inflation will depend on the federal funds rate that they intend to set now.

And if the idiosyncratic information to which policymakers are assumed to react is also the sort of information that makes firms shrink or expand production (a general pessimistic or optimistic outlook on the economy), then there will be arrows from that kind of information to ∆FFt and ∆GDPt + j (violating condition (ii) of the backdoor criterion):

Figure 4.5 If the idiosyncratic information to which policymakers are assumed to react is also the sort of information that makes firms shrink or expand production, there will be arrows from that kind of information to ∆FFt and ∆GDPt + j.

More generally speaking, the problem with the AK test is that results from probabilistic causality tests will be biased unless the set of covariates satisfies the backdoor criterion, that the backdoor criterion implies Woodward's conditions (I1)–(I4), and that Sections 4.2–4.4 have shown that the evidence that macroeconomists can provide in support of these conditions is too inconclusive in principle. Why does the backdoor criterion imply Woodward's conditions (I1)–(I4)? For Pearl (2009, pp. 70–1), do(x) amounts to setting X to x by manipulating an intervention variable I and by breaking all arrows directed into X and departing from variables other than I. For Pearl, that is, do(x) requires that Woodward's conditions (I1) and (I2) be satisfied. Condition (ii) of the backdoor criterion, moreover, rules out the same cases as Woodward's conditions (I3) and (I4) (cf. Figure 4.1 in Section 4.5 above): it requires that Z block the paths between X and Y that contain an arrow into X, and these are precisely the confounding paths that conditions (I3) and (I4) are meant to exclude. Therefore, conditioning on Z (knowing the value of Z) rules out the same cases as conditions (I3) and (I4). The backdoor criterion is a bit stronger than Woodward's conditions (I1)–(I4) since condition (i) of the backdoor criterion also rules out cases in which arrows are directed into Z and depart from X.16 But the backdoor criterion represents a set of conditions that includes Woodward's conditions (I1)–(I4). One may accordingly say that the backdoor criterion implies these conditions.
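To make the role of the adjustment formula concrete, here is a minimal numerical illustration of how P(y | do(x)) = ∑z P(y | x, z)⋅P(z) differs from the naive conditional probability P(y | x) when a single binary confounder Z satisfies the backdoor criterion. The probabilities are entirely made up; this is not a model of the monetary-policy case.

```python
import itertools

# Generate a joint distribution P(z, x, y) from the graph Z -> X, Z -> Y, X -> Y,
# so that {Z} satisfies the backdoor criterion relative to (X, Y).
p_z1 = 0.6                                   # P(Z=1)
p_x1_given_z = {0: 0.2, 1: 0.7}              # P(X=1 | Z=z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3,   # P(Y=1 | X=x, Z=z)
                 (0, 1): 0.4, (1, 1): 0.8}

joint = {}
for z, x, y in itertools.product([0, 1], repeat=3):
    pz = p_z1 if z == 1 else 1 - p_z1
    px = p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]
    py = p_y1_given_xz[(x, z)] if y == 1 else 1 - p_y1_given_xz[(x, z)]
    joint[(z, x, y)] = pz * px * py

def p(event):
    """Probability of an event, computed from the observational joint distribution."""
    return sum(prob for outcome, prob in joint.items() if event(*outcome))

# Naive conditional probability P(Y=1 | X=1): still confounded by Z
p_naive = p(lambda z, x, y: x == 1 and y == 1) / p(lambda z, x, y: x == 1)

# Backdoor adjustment: P(Y=1 | do(X=1)) = sum_z P(Y=1 | X=1, Z=z) * P(Z=z)
p_do = 0.0
for z0 in [0, 1]:
    p_y_xz = p(lambda z, x, y: z == z0 and x == 1 and y == 1) / p(lambda z, x, y: z == z0 and x == 1)
    p_do += p_y_xz * p(lambda z, x, y: z == z0)

print(f"P(Y=1 | X=1)     = {p_naive:.3f}")
print(f"P(Y=1 | do(X=1)) = {p_do:.3f}")
```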

As long as I, Z and W are known and measurable, there will be no problem: one of I, Z and W can simply be added to the set of admissible (or de-confounding) covariates. But in macroeconomics, hidden variables (i.e., variables that are causally relevant, though unobservable and possibly incapable of manipulation through direct human intervention) are always likely to be present: variables standing for decisions of firms to shrink or expand production, idiosyncratic information that guides these decisions, inflation expectations, etc. And if a hidden variable is present, then causal inference based on the potential-outcome approach will be defective. The evidence that the AK test can provide is therefore too inconclusive to disentangle competing and observationally equivalent causal hypotheses in macroeconomics.

4.7 Conclusion

In Sections 4.2–4.6 I have argued that the evidence that macroeconomists can provide in support of the hypothesis that X directly type-level causes Y is too inconclusive in principle. Sections 4.2–4.4 have shown that the evidence provided by the IV method is too inconclusive because it derives from conditions (I1)–(I4) of Woodward's definition of 'intervention variable,' and because in macroeconomics, hidden variables that violate these conditions are likely to be present. Section 4.5 has argued that the evidence provided by the Hoover test is too inconclusive because it cannot show that the parameters of P(x,y) are identified, or that P(y | x) remains invariant to changes in the parameters of the X process, and because conditions (I1), (I3) and (I4) are among the conditions that need to be satisfied in order for the parameters of P(x,y) to be identified, or for the parameters of P(y | x) to remain invariant. Finally, Section 4.6 has tried to show that the evidence provided by the AK test is too inconclusive because the results of that test will be biased unless the backdoor criterion guides the choice of covariates, and because the backdoor criterion implies conditions (I1)–(I4). An important conclusion to be drawn from Sections 4.2–4.6 states that macroeconomics cannot do justice to its ultimate justification if the ultimate justification for the study of macroeconomics is to provide knowledge on which to base policy, and if knowledge on which to base policy is causal knowledge (cf. introduction to Part I).

By way of conclusion, I'd like to point out that another conclusion that needs to be drawn from Sections 4.2–4.6 conflicts with a claim that Milton Friedman makes in a famous passage from his 1953 paper. In that passage, Friedman (1953/1994, p. 185) claims that "[t]he inability to conduct so-called 'controlled experiments' does not […] reflect a basic difference between the social and physical sciences […] because the distinction between a controlled experiment and uncontrolled experience is at best one of degree." It is true that the inability to conduct RCTs doesn't reflect a basic difference between the social and physical sciences: in many social sciences (including microeconomics), RCTs are conducted on a large scale; and in some physical sciences (such as astronomy), RCTs cannot be conducted. It is also true that the distinction between a controlled experiment and uncontrolled experience is "at best one of degree."

Note, however, that the inability to conduct RCTs does reflect a basic difference between macroeconomics and many of the special sciences (including microeconomics and pharmacology). Unlike researchers in many of the special sciences, macroeconomists cannot provide conclusive evidence in support of hypotheses of direct type-level causation. They cannot provide that evidence because they cannot conduct RCTs; and they cannot conduct RCTs because in macroeconomics, hidden variables (variables that are causally relevant, though unobservable and possibly incapable of control through direct human intervention) are always likely to be present. Note further that the immediate context suggests that Friedman's claim is primarily concerned with macroeconomics. It is true that most of the more detailed examples he discusses are drawn from microeconomics (and especially the field of industrial organization). But in the more immediate context of his claim, Friedman (1953/1994, p. 186) refers to "the hypothesis that a substantial increase in the quantity of money within a relatively short period is accompanied by a substantial increase in prices." He also maintains that "experience casts up […] direct, dramatic, and convincing […] evidence" in support of that hypothesis. His claim is, moreover, repeated almost literally in Friedman and Schwartz (1963, p. 688). If the claim is that the inability to conduct RCTs doesn't reflect a basic difference between macroeconomics and many of the special sciences (including microeconomics and pharmacology), then that claim conflicts with an important conclusion that needs to be drawn from Sections 4.2–4.6 of this chapter.

Notes
1 There is one procedure that I won't discuss in this book: the super exogeneity test designed by Hendry (1988) and Engle and Hendry (1993). I will assume, however, that the same negative conclusion that I will defend with respect to the other procedures holds for the super exogeneity test. For the super exogeneity test, the negative conclusion is foreshadowed in Pearl (2009, pp. 165–70), Hoover (2001, pp. 167–8), and Cartwright (2009, pp. 415–6).
2 A study that comes close to a full implementation of that two-step procedure is an experiment that Van den Berg and Van der Klaauw (2006) conduct for two Dutch cities and a time-interval between 08/1998 and 02/1999. The only drawback of that study is that it fails to control for role obligations. Its main finding is that the exit rate in the monitored treatment group wasn't significantly higher than in the control group.
3 Woodward's definition in fact requires that all variables in a variable set V, except X and Y, remain fixed by intervention. But this requirement is meant to ensure that X is a direct type-level cause of Y. And as such, it can be replaced with the weaker requirement that all direct type-level causes of Y except X remain fixed by intervention (a requirement that can be found e.g. in Pearl 2009, pp. 127–8). Cf. Chapter 2 (Section 2.3) for an elaboration of this point.
4 Perhaps under the influence of Popper or positivism, Friedman says that he tries "to avoid the use of the word 'cause' … it is a tricky and unsatisfactory word" (cited from Hoover 2009, p. 306).
5 Friedman and Schwartz (1963, pp. 688–9) identify only two further short periods that were also characterized by contractionary monetary policies and associated contractions in the money stock and industrial production.
6 He argues, more precisely, that skipping (I2) and replacing (I4) with the condition that I and Y do not have causes in common (except those that might cause Y via I and X) is more in line with econometric practice. But (as he himself observes) his condition that I and Y do not have causes in common (except those that might cause Y via I and X) is equivalent to (I4) as long as the common cause principle holds.

7 When defining the notion of direct type-level causation, Hoover suggests that the property of being variation-free is necessary and sufficient for structural invariance. He says, for instance, that a privileged parameterization "is the source of the causal asymmetries that define causal order," and that "a set of parameters is privileged when its members are […] variation-free" (Hoover 2013, p. 41). When defining the notion of direct type-level causation, however, Hoover refers to structural (not statistical) parameters, i.e., to parameters that figure in a structural or (better) causal model. For a more detailed analysis of Hoover's definition, cf. Chapter 2 (Section 2.4).
8 Pearl (2009, p. 160) in fact claims that this equation needs to hold for all Z disjoint from {X ∪ Y}, but that claim is unnecessarily strong.
9 According to Lucas, it is investment decisions that are type-level caused by tax policy. But investment decisions are likely to type-level cause production decisions.
10 For a more detailed analysis of the exact relationship between Hoover's account and a macroeconomic variant of Woodward's account, cf. Chapter 2 (Sections 2.3 and 2.4).
11 Also remember from Chapter 2 (Section 2.3) that condition (I2) doesn't figure in the definition that is central to a macroeconomic variant of Woodward's interventionist account.
12 Angrist and Pischke (2009, p. 54) refer to the same assumption as "conditional independence assumption." In the present context, however, it is more appropriate to use the term "selection-on-observables assumption" because the assumption in question might otherwise be confused with the assumption that Angrist and Kuersteiner (2011, p. 729) refer to as "the key testable conditional independence assumption," i.e., with an assumption that involves actual, not potential outcomes.
13 In a probabilistic context, this condition corresponds to the condition that Y be 'unshieldable' from X, i.e., to a condition that follows from two results that Spohn (1980, pp. 77, 84) derives.
14 By 'unemployment innovation,' Angrist and Kuersteiner (2011, p. 736n) mean the unemployment rate in the current quarter minus the unemployment rate in the previous month.
15 It would be implausible to say that the real effects of ∆FFt would materialize suddenly in t + j, and not continuously over j quarters.
16 Condition (i) of the backdoor criterion is meant to ensure acyclicity (cf. Pearl 2009, p. 339). Remember from Chapter 2 (Section 2.5) that a potential-outcome approach to macroeconomic causality reduces to a macroeconomic variant of Woodward's interventionist account if condition (I2) and condition (i) of the backdoor criterion are dropped.

References
Angrist, J. D. and Kuersteiner, G. M. (2011). "Causal Effects of Monetary Shocks: Semi-Parametric Conditional Independence Tests with a Multinomial Propensity Score." The Review of Economics and Statistics 93(3), 725–47.
Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics. Princeton: PUP.
Cartwright, N. (2009). "Causality, Invariance, and Policy." In Kincaid, H. and Ross, D. (eds.), The Oxford Handbook of Philosophy of Economics. Oxford: OUP, 410–23.
Dornbusch, R., Fischer, S., and Startz, R. (1998). Macroeconomics. 7th ed. Boston: McGraw-Hill.
Engle, R. F. and Hendry, D. F. (1993). "Testing Super Exogeneity and Invariance in Regression Models." Journal of Econometrics 56, 119–39.
Friedman, M. (1953/1994). "The Methodology of Positive Economics." In Hausman, D. M. (ed.), The Philosophy of Economics: An Anthology. 2nd ed. Cambridge: CUP.

Friedman, M. and Schwartz, A. J. (1963). A Monetary History of the United States, 1867–1960. Princeton: PUP.
Hendry, D. F. (1988). "The Encompassing Implications of Feedback versus Feedforward Mechanisms in Econometrics." Oxford Economic Papers 40(1), 132–49.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Hoover, K. D. (2009). "Milton Friedman's Stance: The Methodology of Causal Realism." In Mäki, U. (ed.), The Methodology of Positive Economics: Milton Friedman's Essay Fifty Years Later. Cambridge: CUP, 303–20.
Hoover, K. D. (2011). "Counterfactuals and Causal Structure." In McKay Illari, P., Russo, F. and Williamson, J. (eds.), Causality in the Sciences. Oxford: OUP, 338–60.
Hoover, K. D. (2013). "Identity, Structure, and Causal Representation in Scientific Models." In Chao, H.-K., Chen, S.-T. and Millstein, R. (eds.), Towards the Methodological Turn in the Philosophy of Science: Mechanism and Causality in Biology and Economics. Dordrecht: Springer, 35–60.
King, R. G. and Plosser, C. I. (1984). "Money, Credit and Prices in a Real Business Cycle." American Economic Review 74, 363–80.
Lucas, R. E. (1976). "Econometric Policy Evaluation: A Critique." Carnegie-Rochester Conference Series on Public Policy 1, 19–46.
Pearl, J. (2009). Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge: Cambridge University Press.
Reiss, J. (2005). "Causal Instrumental Variables and Interventions." Philosophy of Science 72, 964–76.
Reiss, J. (2013). Philosophy of Economics. London/New York: Routledge.
Romer, C. D. and Romer, D. H. (1989). "Does Monetary Policy Matter? A New Test in the Spirit of Friedman and Schwartz." NBER Macroeconomics Annual 4, 121–70.
Spohn, W. (1980). "Stochastic Independence, Causal Independence, and Shieldability." Journal of Philosophical Logic 9, 73–99.
Van den Berg, G. J. and Van der Klaauw, B. (2006). "Counseling and Monitoring of Unemployed Workers: Theory and Evidence from a Controlled Social Experiment." International Economic Review 47(3), 895–936.
Woodward, J. (2003). Making Things Happen: A Causal Theory of Explanation. Oxford: OUP.

5 Causality and Probability

5.1 Introduction

When investigating efficient causes, economists stand in one of roughly two traditions: in the tradition of understanding efficient causes as raising the probability of their effects, or in the tradition of understanding them as causally dependent on an instrumental variable (or "instrument"), i.e., on a variable on which not only the putative cause but also the putative effect (via the putative cause) causally depends, and which doesn't causally depend on the putative effect or on any other variable on which the putative effect causally depends. While the first tradition goes back to David Hume, the second tradition has its roots in some of the work of the early econometricians (Trygve Haavelmo, Herbert Simon). The second tradition is younger than the first; but unlike the first tradition, the second tradition is at least compatible with Aristotelian approaches to efficient causation: with approaches that involve firm ontological commitments to powers, tendencies, or capacities.1

In Chapter 4, I argued that the evidence that macroeconomists can provide when conducting natural experiments or econometric causality tests is too inconclusive to support the hypothesis that X directly type-level causes Y, where X and Y stand for macroeconomic aggregates like the nominal interest rate and aggregate demand. This is a negative result that raises the question whether I have misunderstood macroeconomic causality all along. James Woodward's interventionist account, Kevin Hoover's account of privileged parameterization, and the potential outcome approach stand in the second of the two traditions. Could it be that an adequate account of macroeconomic causality needs to be located in the first tradition instead? The present chapter will answer this question negatively. It will argue that knowledge of causes that raise the probability of their effects can be employed for purposes of prediction, but less so for purposes of policy analysis (Section 5.6). But before presenting the argument, it will introduce and discuss the various probability accounts of causality: the probability theories of causality of Patrick Suppes and Clive Granger (Sections 5.2 and 5.3), Arnold Zellner's idea of using causal laws to decide about the relevance of the variables and lags to be included in a model representing relations of Granger causality (Section 5.4), and causal Bayes nets theory (Section 5.5). The chapter will conclude by mentioning a number of problems that are potentially inherent to attempts to infer causality in the sense of the second tradition from probabilities (Section 5.7).

5.2 Suppes on Genuine Causation

While Hume required constant conjunction of cause and effect, probability approaches to causality are content to understand causes as raising the probability of their effects. They say that X = x causes Y = y if the conditional probability of Y = y given X = x is greater than the unconditional probability of Y = y, formally: P(Y = y | X = x) > P(Y = y), where X, Y … are random variables, i.e., functions from a sample or state space into a range of values, where lower-case letters x, y … denote the values that X, Y … can take, and where P is a probability measure over the power set of that sample space, i.e., a function from that power set into the set of real numbers such that the Kolmogorov axioms are satisfied. The power set of the sample space may also be understood as the set of propositions saying that X = x, Y = y … Instead of propositions, probability approaches to causality usually speak of events: of the event A of X attaining the value x, of the event B of Y attaining the value y … Suppes (1970, p. 12) interprets events as "instantaneous," i.e., as occurring at a particular point in time; and he includes their time of occurrence in their formal characterization. So for him, 'P(At)' refers to the probability of the event A occurring at time t, where 'A occurs at time t' means as much as 'X attains value x at time t.' Suppes (1970, p. 24) understands "cause" as "genuine cause" and defines "genuine cause" as "prima facie cause that is not spurious." Thus, in order to understand his definition of "cause," one needs to understand his definitions of "prima facie cause" and "spurious cause." His definition of "prima facie cause" runs as follows (cf. Suppes 1970, p. 12):

(CPF) Bt′ is a prima facie cause of At iff (i) t′ < t, (ii) P(Bt′) > 0, and (iii) P(At | Bt′) > P(At).

Condition (iii) requires that causes increase the probability of their effect, and condition (ii) is needed because in the definition of conditional probability – P(At | Bt′) = P(At ∧ Bt′)/P(Bt′) – P(Bt′) is the denominator, and because the denominator must not be equal to zero. Condition (i) implies that Bt′ occurs earlier than At in time. Why does Suppes introduce that condition? One answer is that the relation '… increases the probability of …' is symmetric because P(At | Bt′) > P(At) is equivalent to P(Bt′ | At) > P(Bt′), that the relation '… causes …' is asymmetric, and that temporality is capable of turning the relation '… increases the probability of …' into an asymmetric one. A second answer is that the Humean tradition, in which Suppes stands, holds that causality is intrinsically linked to temporality. His definition of "spurious cause" runs as follows (Suppes 1970, pp. 21–2):

(CS) Bt′ is a spurious cause of At iff Bt′ is a prima facie cause of At, and there is a t″ < t′ and an event Ct″ such that (i) Ct″ precedes Bt′, (ii) P(Bt′ ∧ Ct″) > 0 and (iii) P(At | Bt′ ∧ Ct″) = P(At | Ct″).

In other words: Bt′ is a spurious cause of At iff Bt′ is a prima facie cause of At, Ct″ precedes Bt′, and Ct″ "screens off" At from Bt′. The notion of a spurious cause is needed to rule out cases in which prima facie causes do not represent genuine causes. A falling barometer, for instance, is a prima facie cause but not a genuine cause of an upcoming storm. Atmospheric pressure that precedes both the falling barometer and the upcoming storm screens off the upcoming storm from the falling barometer. (CS) is not the only definition of spurious causation that Suppes (1970, pp. 21–8) brings into play, and besides "prima facie cause" and "spurious cause" he defines "direct cause," "sufficient cause" and "negative cause." But a consideration of these definitions lies beyond the purposes of this chapter. More immediately relevant to these purposes is a consideration of the problems that Suppes' account of genuine causation faces, and that require specific solutions. These problems are of essentially two kinds; they both suggest that condition (iii) of (CPF) cannot be a necessary condition for Bt′ causing At.

The first problem is that it seems that Bt′ can turn out to be a cause of At even though P(At | Bt′) < P(At). This problem can be illustrated by an example that Suppes (1970, p. 41) himself discusses. The example is that of a golfer with moderate skill who makes a shot that hits a limb of a tree close to the green and is thereby deflected directly into the hole, for a spectacular birdie. If At is the event of making a birdie and Bt′ the earlier event of hitting the limb, we will say that Bt′ causes At. But we will also say that P(At | Bt′) < P(At): that the probability of his making a birdie is low, and that the probability of his making a birdie, given that the ball hits the branch, is even lower. Does the example show that condition (iii) of (CPF) cannot qualify as a necessary condition for Bt′ causing At? Suppes (1970, pp. 42–3) argues for a negative answer. He argues that definition (CPF) can be defended if condition (iii) is relativized to background information Kt′:

(CPF′) Bt′ is a prima facie cause of At iff (i) Bt′ ∧ Kt′ precedes At, (ii) P(Bt′ ∧ Kt′) > 0 and (iii) P(At | Bt′ ∧ Kt′) > P(At | Kt′).

Thus, if Kt′ is e.g. the event of the shot being deflected at a specific angle, then the probability of the golfer's making a birdie, given that the ball hits the branch and is deflected at a specific angle, might well be higher than the probability of his making a birdie. Suppes (1970, p. 42) adds that such relativization to background knowledge "can be useful, especially in theoretical contexts."

The second problem is a fact about probabilities that is well known to statisticians and is often referred to as "Simpson's paradox." The fact is that any association between two variables that holds in a given population – P(Y = y | X = x) > P(Y = y), P(Y = y | X = x) < P(Y = y) or P(Y = y | X = x) = P(Y = y) – can be reversed in a subpopulation by finding a third variable that is correlated with both. Consider, for instance, the population of all Germans. For the population of all Germans, the conditional probability of getting a heart disease, given that an individual smokes, is higher than the unconditional probability of getting a heart disease. But for a subpopulation of Germans, in which all smokers exercise, the conditional probability of getting a heart disease, given that an individual smokes, is lower than the unconditional probability of getting a heart disease – at least if exercising is more effective at preventing heart disease than smoking is at causing it.

The fact itself is not a paradox. But Nancy Cartwright (1979, p. 421) points out that the paradox arises if we define causation of Y = y by X = x in terms of P(Y = y | X = x) > P(Y = y). If we define causation of Y = y by X = x in terms of P(Y = y | X = x) > P(Y = y), then causation of Y = y by X = x will depend on the population that we select when establishing P(Y = y | X = x) > P(Y = y). At the same time, we have the strong intuition that causation should be independent of specific populations. Cartwright (1979, p. 423) proposes to dissolve the paradox by conditioning Y = y on the set of "all alternative causal factors" of Y = y. Conditioning Y = y on such a set would render Suppes' definition of genuine causation circular: causal vocabulary would show up in both the definiendum and the definiens. But some philosophers (e.g. Hoover 2001, p. 42; Woodward 2003, pp. 104–5) hold that non-circularity is not absolutely necessary.
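The screening-off idea behind (CS) can be checked numerically with a toy version of the barometer example above. All probabilities are invented; the only point is that B raises the probability of A while C renders A and B conditionally independent.

```python
# A toy numerical check of Suppes' definitions: the falling barometer B is a prima facie
# cause of the storm A, since P(A|B) > P(A), but it is spurious, since the earlier drop
# in atmospheric pressure C screens A off from B.
p_c = 0.3                               # P(C): pressure drops
p_b_given = {True: 0.9, False: 0.05}    # P(B | C) and P(B | not C): barometer falls
p_a_given = {True: 0.8, False: 0.1}     # P(A | C) and P(A | not C); A is independent of B given C

# Marginals by the law of total probability
p_b = p_b_given[True] * p_c + p_b_given[False] * (1 - p_c)
p_a = p_a_given[True] * p_c + p_a_given[False] * (1 - p_c)

# P(A | B) = P(A, B) / P(B), exploiting the conditional independence of A and B given C
p_ab = p_a_given[True] * p_b_given[True] * p_c + p_a_given[False] * p_b_given[False] * (1 - p_c)
p_a_given_b = p_ab / p_b

print(f"P(A) = {p_a:.3f}, P(A|B) = {p_a_given_b:.3f}  -> B is a prima facie cause of A")
print(f"P(A|B,C) = P(A|C) = {p_a_given[True]:.3f}      -> C screens A off from B: B is spurious")
```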

5.3 Granger Causality

Perhaps the most influential explicit approach to causality in economics is that of Granger (1969; 1980). Like Suppes, Granger stands in the Humean tradition of understanding causes as raising the probability of their effect; and like Suppes, he believes that causality is intrinsically linked to temporality. But unlike Suppes, Granger (1980, p. 330, notation modified) defines 'causation' as a relation between variables:

(GC) Xt Granger-causes Yt + 1 if and only if P(Yt + 1 = yt + 1 | Ωt = ωt) ≠ P(Yt + 1 = yt + 1 | Ωt − Xt = ωt − xt),

where Ωt is the infinite universe of variables dated t and earlier. The temporal ordering of Xt and Yt + 1 guarantees that the relation between Xt and Yt + 1 is asymmetric, and conditioning on Ωt = ωt immunizes (GC) against circularity, spuriousness and the problem that causes might lower the probability of their effects. Wolfgang Spohn (2012, p. 442) points out that it is "literally meaningless and an abuse of language" to speak of variables themselves as causing other variables. 'Xt causes Yt + 1' may either mean 'Xt = xt causes Yt + 1 = yt + 1' or 'Yt + 1 causally depends on Xt,' where the causal dependence of Yt + 1 on Xt is to be understood as a relation that obtains between Xt and Yt + 1 if some event Xt = xt causes some event Yt + 1 = yt + 1.2 The context of time series econometrics suggests that (GC) is to be read in the first sense (cf. Spohn 1983, pp. 85–6). The phrase 'Xt Granger-causes Yt + 1' will be retained in the remainder because economists and econometricians have become accustomed to its use. It should be kept in mind, however, that the phrase is to be understood as synonymous with 'Xt = xt causes Yt + 1 = yt + 1.' Granger (1980, p. 336) points out himself that (GC) is not "operational" because practical implementations cannot cope with an infinite number of variables with an infinite number of lags.

But econometricians think that in order to test for "Granger causality," they need to select only the relevant variables and only the relevant number of lags. Christopher Sims (1972), for instance, uses two variables (for money and GNP) and four future and eight past lags to show that money Granger-causes GNP, and not the other way around. Later, as part of a general critique of the practice of using a priori theory to identify instrumental variables, Sims (1980a) advocates vector autoregression (VAR), which generalizes Granger causality to the multivariate case. The following two-equation model, for instance, is a VAR model for two variables and one lag:

yt + 1 = α11yt + α12xt + ε1t + 1,   (1)
xt + 1 = α21yt + α22xt + ε2t + 1,   (2)


where the αij are parameters and the Εit + 1 random error terms. Xt is said to Granger-cause Yt + 1 if α12 ≠ 0; and Yt is said to Granger-cause Xt + 1 if α21 ≠ 0. While definition (GC) avoids some of the problems that competing definitions face (circularity, spuriousness, the problem of causes that lower the probability of their effects), objections have been raised to implementations of (GC), i.e., to empirical procedures of testing for Granger causality. Hoover (1993, pp. 700–5) lists three problems that stand in the way of a temporal ordering of cause and effect in macroeconomics. The first problem is that in macroeconomics, it is difficult to rule out contemporaneous causality because data are reported most often annually or quarterly. Hoover (1993, p. 702) cites Granger as suggesting that contemporaneous causality could be ruled out if data were sampled at fine enough intervals. But Hoover (1993, p. 702) responds that "such finer and finer intervals would exacerbate certain conceptual difficulties in the foundations of economics"; and he cites GNP as an example: "There are hours during the day when there is no production; does GNP fall to nought in those hours and grow astronomically when production resumes? Such wild fluctuations in GNP are economically meaningless." The second problem is that there are hidden variables that (like expectations) cannot be included among the regressors in VAR models even though they are likely to be causally relevant. And the third problem is that economic theory (no matter which) provides reasonably persuasive accounts of steady states, i.e., of hypothetical economic configurations that feature constant rates, quantities, and growth rates, and that are timeless in the sense that they result if time is allowed to run on to infinity. Hoover (1993, p. 705) admits that proponents of Granger causality might respond that "if macroeconomics cannot be beat into that mold [of temporal ordering], so much the worse for macroeconomics." But Hoover (1993, p. 706) also argues that in macroeconomics, causal questions like 'Will interest rates rise if the Fed sells $50M worth of treasury bonds?' are sensible and well formulated and that our concepts of causality need to be suitable for their formulation and interpretation.

Another prominent objection that has been raised to Granger-causality tests says that it is impossible to select the relevant number of variables and lags without (explicit or implicit) reliance on economic theory or background knowledge.
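For readers who want to see what such a test looks like in practice, the following sketch simulates data from the two-variable, one-lag VAR of equations (1) and (2) with invented parameter values and uses Python's statsmodels library to test the null hypotheses α12 = 0 and α21 = 0. It illustrates only the mechanics of the test; it does not address the problem of choosing the relevant variables and lags.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(1)
T = 500
y = np.zeros(T)
x = np.zeros(T)
a11, a12, a21, a22 = 0.5, 0.4, 0.0, 0.6   # alpha_12 != 0: X Granger-causes Y; alpha_21 = 0

# Simulate the two-equation, one-lag VAR of equations (1) and (2)
for t in range(1, T):
    y[t] = a11 * y[t - 1] + a12 * x[t - 1] + rng.normal(scale=0.5)
    x[t] = a21 * y[t - 1] + a22 * x[t - 1] + rng.normal(scale=0.5)

data = pd.DataFrame({"y": y, "x": x})
res = VAR(data).fit(maxlags=1)

# H0: x does not Granger-cause y (alpha_12 = 0); with these parameter values it is rejected
print(res.test_causality(caused="y", causing="x", kind="f").summary())
# H0: y does not Granger-cause x (alpha_21 = 0); with these parameter values it is typically not rejected
print(res.test_causality(caused="x", causing="y", kind="f").summary())
```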

Sims's subsequent work on the relation of Granger-causality between money and GNP indicates why this objection is important. When he included four variables (money, GNP, domestic prices, and nominal interest rates) and twelve past lags in a VAR model, the above-mentioned result of money Granger-causing GNP no longer obtained.3 A relation of Granger-causality thus crucially depends on the number of variables and lags that are deemed to be relevant. And who is to decide about the relevance of variables and lags, and how?

5.4 Zellner on Causal Laws

Zellner (1979; 1988) can be read as responding to that question when defining 'causality' in terms of "predictability according to a law or set of laws."4 He claims that laws may be deterministic or stochastic, qualitative or quantitative, micro or macro, resulting from controlled or uncontrolled experiments, involving simultaneous or non-simultaneous relations, and so on, and that the only restrictions that need to be placed on laws relate to logical and mathematical consistency and to the ability to explain past data and experience and to predict future data and experience. While these restrictions are not "severe," they imply that statistical regressions or autoregressions cannot qualify as laws because "they generally do not provide understanding and explanation and often involve confusing association or correlation with causality" (Zellner 1988, p. 9). Similarly, theories cannot qualify as laws if they are "based on impossible experiments or on data that can never be produced" (Zellner 1988, p. 9).

Sketching a rudimentary theory of the psychology of scientific discovery, Zellner (1988, pp. 9–12) suggests that the discovery of laws proceeds in roughly three steps. In the first step, "the conscious and unconscious minds interact […] to produce ideas and combinations of ideas using as inputs at least (1) observed or known past data and experience, (2) a space of known theories, and (3) future knowable data and experience." In a second step, "the conscious mind […] decides the general nature or design of an investigation." This means that it selects a specific "phenomenon" from its pool of ideas or combinations of ideas and that it develops "an appropriate theory or model that is capable of explaining the phenomenon under investigation and yielding predictions." With respect to the development of that theory or model, Zellner remarks that it requires "hard work, a breadth of empirical and theoretical knowledge, consideration of many possible combinations of ideas, luck, and a subtle interaction between the conscious and unconscious minds." He also argues that "focusing attention on sophisticatedly simple models and theories is worthwhile." The third and final step is that of demonstrating that "the suggested model or theory actually does explain what it purports to explain by empirical investigations using appropriate data." That demonstration requires the frequent use of new data to test not only the model or theory itself, but also its implications, such as predictions about as yet unobserved phenomena. Whenever new data is used to test the model or theory or its implications successfully, the degree of reasonable belief or confidence in the model or theory increases. The degree of reasonable belief in the model or theory corresponds to the posterior probability that can be assigned to that model or theory and computed using Bayes' theorem: P(H | E) = P(E | H) ⋅ P(H)/P(E), where H is a proposition summarizing the model or theory and E an 'evidential proposition' referring to new data that can be used to test H.

Zellner (1988, p. 16) says that "a theory can be termed a causal law" if the posterior probability that can be assigned to it is "very high, reflecting much outstanding and broad-ranging performance in explanation and prediction." Zellner can be read as responding to the question of how to decide about the relevance of variables and lags because causal laws may include well-confirmed theories about the strength of the parameters that can be included in a VAR model. If that strength happens to be among the phenomena that the conscious mind decides to investigate, the mind aims to develop an appropriate model or theory H that is capable of quantifying and explaining that strength. Once H is developed, it can be subjected to a Bayesian updating procedure in which a prior probability is assigned to H, in which the likelihood of E given H is evaluated, and in which data E is collected to compute the posterior probability of H in accordance with Bayes' theorem. The posterior probability of H then serves as its prior probability when new data E is collected to test H (or any of its implications) again. Once the posterior probability of H is "very high," H can be viewed as a causal law that supports decisions about the relevance of the variables and lags to be included in a VAR model: the greater the strength of a parameter, the more relevant its corresponding (lagged) variable.

Zellner's definition of "causality" combines with his rudimentary theory of the psychology of scientific discovery to imply an interesting response to the question of how to decide about the relevance of variables and lags. But problems pertain to the Bayesian updating procedure that marks the third of the three steps that scientific discovery takes according to his theory.5 And even if his theory were accurate, the question arises whether it doesn't just make manifest how difficult it is to decide about the relevance of the variables and lags to be included in a VAR model. Zellner (1988, pp. 17–19) cites Milton Friedman's theory of the consumption function as a theory with a very high posterior probability. But the very high posterior probability of that theory may well be exceptional.
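A minimal sketch of the sequential updating that Zellner describes: the posterior probability of H computed after one body of evidence E serves as the prior probability for the next round. The likelihood values are purely illustrative assumptions.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) = P(E|H) * P(H) / P(E), with P(E) obtained by total probability."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

posterior = 0.2   # initial degree of reasonable belief in the model or theory H
for test_round in range(5):
    # assume each new data set is three times as likely under H as under not-H
    posterior = update(posterior, p_e_given_h=0.6, p_e_given_not_h=0.2)
    print(f"after test {test_round + 1}: P(H | E) = {posterior:.3f}")
```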

5.5 Causal Bayes Nets Theory

One final probability approach to causality is causal Bayes nets theory. Causal Bayes nets theory was first developed outside economics (substantially foreshadowed in Spohn 1980, and then developed in detail by Judea Pearl 2000 and Peter Spirtes, Clark Glymour, and Richard Scheines 1993), but was applied in economics and econometrics soon after (Bessler and Lee 2002, Demiralp and Hoover 2003). Unlike the approaches of Suppes and Granger, causal Bayes nets theory is primarily interested in relations of causal dependence and analyzes causal relations irrespective of any temporal ordering. It consequently focuses on relations of direct causal dependence more explicitly. At the center of causal Bayes nets theory6 is the notion of a directed acyclic graph (DAG). A DAG is a tuple 〈→, V〉, where V is a non-empty finite set of pre-selected variables and → an acyclic relation on V: there are no variables X, …, Y ∈ V such that X → ... → Y and Y → X.

A DAG is a causal graph if the arrow → can be interpreted as representing a relation of direct causal dependence between the variables in V. In order to understand the notion of direct causal dependence that is involved here, one needs to become acquainted with a bit of graph theoretical notation and the three axioms that determine the relation between causality and probability according to causal Bayes nets theory. Consider the graph theoretical notation first. In a DAG, X ∈ V is said to be
- a parent of Y ∈ V if and only if X → Y (the set of parents of Y is denoted by pa(Y)).
- a child of Y ∈ V if and only if Y is a parent of X.
- an ancestor of Y ∈ V if and only if there are X, …, Y ∈ V such that X → … → Y (the set of ancestors of Y is denoted by an(Y)).
- a descendant of Y ∈ V if and only if Y is an ancestor of X.
- a non-descendant of Y ∈ V if and only if X ≠ Y and X is not a descendant of Y (the set of non-descendants of Y is denoted by nd(Y)).
Now turn to the three axioms (cf. Spirtes et al. 1993, pp. 29–32). Let 〈→, V〉 be a causal graph and P a probability measure over the power set of the sample space, and let ⊥P stand for probabilistic independence. Then P satisfies the so-called
- Causal Markov Condition if and only if for all X ∈ V: X ⊥P nd(X) − pa(X) / pa(X).
- Causal Minimality Condition if and only if for all X ∈ V: pa(X) is the smallest set Y of variables such that X ⊥P nd(X) − Y / Y.
- Causal Faithfulness Condition if and only if for all subsets X, Y, Z of V: X ⊥P Y / Z holds only if X ⊥P Y / Z is entailed by P's satisfaction of the causal Markov and minimality conditions.
Informally, the causal Markov condition says that the parents of X screen off X from all other non-descendants of X. The causal minimality condition says that P would no longer satisfy the causal Markov condition if any of the parents of X were excluded from pa(X); it requires that there be exactly one minimal set of parents of X that screens off X from all its other non-descendants. Finally, the faithfulness condition says that there are no accidental conditional independencies: that all the conditional independencies that the causal Markov and minimality conditions make reference to reflect relations of causal dependence. If P satisfies the causal Markov, minimality and faithfulness conditions and 〈→, V〉 is a causal graph, then P combines with 〈→, V〉 to form a causal Bayes net. As an example of a causal Bayes net, consider the following graph (Figure 5.1) and imagine that you are worried about the competitiveness (X4) of your firm and that you ponder about it in terms of productivity (X1), cost reduction (X2), and value creation (X3) and the probabilistic independencies that you think obtain between these variables. Then the causal Markov condition entails that X2 ⊥P X3/X1 (that

Figure 5.1 A causal Bayes net for the competitiveness of a firm.

X2 and X3 are probabilistically independent, given their common cause X1), and that X4 ⊥ P X1/{X2, X3} (that X2 and X3 screen off X4 from X1). The minimality condition entails that it is not the case that X2 ⊥ P X3 (otherwise {X1} would not be the minimal set given which X2 is independent of its non-descendant X3, and vice versa), and that it is not the case that X4 ⊥ P X2/X3 or X4 ⊥ P X3/X2 (X2 and X3 must make a difference, given the other). Finally, the faithfulness condition requires that probabilistic dependencies do not disappear when there are causal chains: that it be not the case that X4 ⊥ P X1/X2, X4 ⊥ P X1/X3, or X4 ⊥ P X1. Spirtes, Glymour, and Scheines make it clear that they do not expect the three axioms to hold universally. They point out that the causal Markov condition might be violated in quantum physics, and that the causal faithfulness condition is violated on occasion (in the foregoing example it would be violated if X4 ⊥ P X1 because the direct influences of X2 and X3 cancel out each other). But they also say of the three axioms that “their importance – if not their truth – is evidenced by the fact that nearly every statistical model with a causal significance we have come upon in the social scientific literature satisfies all three” (Spirtes et al. 1993, p. 53). What they could have stated more clearly is that in order to satisfy the three axioms, a statistical model or set of variables needs to be causally sufficient: it needs to include each proximate common cause of any two variables in V; otherwise the probabilistic independencies will inadequately reflect relations of direct causal dependence. Spohn (2012, p. 501) points out that causal sufficiency might be difficult to achieve. The common causes of any two variables in V might go back as far as the big bang or simply slip off our radars, especially when they are hidden, i.e., non-measurable and causally relevant. Hoover (2001, p. 168) analyzes the repercussions of this difficulty for the case of economics. He argues that the faithfulness condition might be violated whenever expectations operate because expectations are hidden and take positions in causal relations that might fail to be

Causality and Probability  103 reflected by conditional independencies. Hoover (2001, p. 167) argues, moreover, that macroeconomics poses “systematic threats to the Causal Markov Condition” because the “search for an unmeasured conditioning variable may end in the crossing of the micro/macro border before an appropriate conditioning variable could be located.” Spirtes, Glymour, and Scheines refrain from defining ‘causation’ explicitly and prefer to understand conditional independencies as reflecting relations of direct causal dependence. Spohn (2012, pp. 508–9), by contrast, proposes to define direct causal dependence in terms of the conditional independencies. He proposes, more specifically, to say that Y causally depends on X directly if and only if not Y ⊥ P X/ nd(Y) − X, i.e., if and only if it is not the case that Y is probabilistically independent of X, given all the non-descendants of Y except X. He emphasizes that this definition is problematic because it relativizes the notion of direct causal dependence to that of a set V of pre-selected variables: change that set, and you will change the conditional independencies, and with them relations of direct causation. But he also suggests that the problem can be solved by de-relativizing the notion of direct causal dependence, i.e., by defining it for a “universal frame” or universal set of variables. 5.6

Policy or Prediction?

Sections 5.3 and 5.4 called attention to the problem that economists cannot establish the claim that Xt Granger-causes Yt + 1 unless they manage to include in a VAR model only the relevant number of variables and lags. Assume that despite this problem, they manage to establish the claim that Xt Granger-causes Yt + 1. Can they now predict the value that Yt + 1 is going to attain if they know the value of Xt? Can they predict, for instance, the value that GNP will attain in t + 1 if they know the value that money takes in t? Most economists believe that the answer is positive. Granger (1969, p. 428) suggests that prediction is in fact the principal purpose of searching for relations of Granger causality. And in statistics, there are standard procedures for computing the expected value of Yt + 1 when the values of the other variables and lags in the model are given. Economists might not be able to predict the exact value of Yt + 1 (e.g. GNP), but they can state the probability with which Yt + 1 can be expected to attain a specific value. An entirely different question is whether economists can predict the value that Yt + 1 would attain if they were to control Xt to a specific value. Assume again that they know that Xt (standing e.g. for money) Granger-causes Yt + 1 (standing e.g. for GNP); does that imply that they know the value that Yt + 1 would attain if they managed to set Xt to a specific value? Most economists agree that the answer is negative. In order to see that the answer is negative, consider again equations (1) and (2) of Section 5.3. In order to be able to predict the value that Y is going to attain in t + 1, one needs to condition equation (1) on the observations of X and Y in t and take the expectation of Yt + 1: E(Yt + 1| yt, xt) = α11yt + α12xt + E(Ε1t + 1| yt, xt).

(3)

104 Causality But in order to be able to predict the value that Yt + 1 would attain if Xt were controlled to xt, one would need to condition equation (1) on the observations of X and Y in t and take the expectation of a counterfactual quantity. The expectation of that quantity is calculated in the same way as in (3). But in order to understand that quantity as counterfactual, one would need to understand the relation between Xt and Yt + 1 as causal in the sense of the second tradition mentioned in the introduction: one would need to assume that there is an instrumental variable It (standing e.g. for the federal funds rate) that causes Xt, that causes Yt + 1 only via Xt, and that isn’t caused by Ε1t + 1; one would need to interpret equation (1) as a structural equation (and not as a regression equation) and Ε1t + 1 as encompassing omitted variables that cause Yt + 1 (and not as a regression error).7 Thus knowledge that Xt Granger-causes Yt + 1 is not sufficient for (does not imply) knowledge of the value that Yt + 1 would take if Xt were controlled to xt. Might one perhaps say that knowledge that Xt Granger-causes Yt + 1 is necessary for knowledge of the value that Yt + 1 would take if Xt were controlled to xt? Unfortunately, the answer is still negative. In order to see that the answer is negative, consider the following model of structural equations: yt + 1 = θxt + 1 + β11yt + β12xt + ν1t + 1, xt + 1 = γyt + 1 + β21yt + β22xt + ν2t + 1,

(4) (5)

where θ, γ and the βij represent parameters and the Νit + 1 structural errors, i.e., errors encompassing omitted variables that are causally relevant. Solving the current values out of these equations yields the reduced form equations, which coincide with equations (1) and (2) such that α11 = (β11 + θβ21)/(1 − θγ), α12 = (β12 + θβ22)/(1 − θγ), α21 = (γβ11 + β21)/(1 − θγ), α22 = (γβ12 + β22)/(1 − θγ), ε1t = (ν1t + θν2t)/(1 − θγ), ε2t = (γν1t + θν2t)(1 − θγ). In order for Granger causality (or knowledge thereof) to qualify as a necessary condition of causality in the sense of the second tradition (or knowledge thereof), α12 in equation (1) would need to be unequal to zero. But Rodney Jacobs, Edward Leamer, and Michael Ward (1979, pp. 402–5) show (for a similar model) that there are cases in which α12 is equal to zero: cases in which e.g. β12 = − θβ22. And Hoover (2001, pp. 152–3) points out that these cases are not among the exotic ones that economists can neglect with a clear conscience. While the question relating to the value that Yt + 1 is going to attain if the value of Xt is reported to be xt arises in contexts of forecasting, the question relating to the value that Yt + 1 would attain if Xt were set to xt by intervention arises in contexts of policy analysis. It goes without saying that complementing knowledge of causality in the sense of the second tradition with knowledge of Granger causality is likely to be helpful in both contexts. And perhaps knowledge of causality in the sense of the second tradition yields better predictions than knowledge of Granger causality (cf. Pearl 2000, p. 31). But the decisive point of the foregoing analysis is that policy analysis requires knowledge of causality in the sense of the second tradition. Many economists believe that policy analysis is the ultimate justification for the study of economics (cf. e.g. Hoover 2001, p. 1); and that belief might explain

Causality and Probability  105 why some of them hold that only the second tradition deals with causality in the strict sense of the term. Thomas Sargent (1977, p. 216), for instance, states that “Granger’s definition of a causal relation does not, in general, coincide with the economists’ usual definition of one: namely, a relation that is invariant to interventions in the form of imposed changes in the processes governing the causal variables.” In econometric textbook expositions of the concept of causality, one is likewise likely to find the observation that “Granger causality is not causality as it is usually understood” (Maddala and Lahiri 2009, p. 390). 5.7

Common Effects and Common Causes

The result of the preceding section has been that (knowledge of) Granger causality is neither a necessary nor sufficient condition of (knowledge of) causality in the sense of the second tradition. Does that result generalize to the claim that (knowledge of) causality in the sense of the second tradition can never be inferred from (knowledge of) probabilities? Hoover (2009, p. 501) defends a negative answer when suggesting that “some causal claims may be supported by facts about probability models that do not depend on assumptions about the truth of these very same causal claims.” The causal claims that he discusses include the claim that Z causally depends on both X and Y and the claim that X and Y causally depend on Z. The primary purpose of the present and final section is to point to the problems that are potentially inherent to attempts to infer these claims from probability models. It is, of course, impossible to observe the relations of causal dependence that might (or might not) obtain between X, Y, and Z directly. But one might be able to observe realizations of X, Y, and Z. And Hoover thinks that it is possible to specify an adequate probability model for these realizations independently of any assumptions about the causal relations that might (or might not) obtain between X, Y, and Z. In some of his work, Hoover (2001, pp. 214–7) advocates the application of LSE methodology to specify adequate probability models. LSE methodology operates by (i) specifying a deliberately overfitting general model, by (ii) subjecting the general model to a battery of diagnostic (or misspecification) tests (i.e., tests for normality of residuals, absence of autocorrelation, absence of heteroscedasticity and stability of coefficients), by (iii) testing for various restrictions (in particular, for the restriction that a set of coefficients is equal to the zero vector) in order to simplify the general model, and by (iv) subjecting the simplified model to a battery of diagnostic tests. If the simplified model passes these tests, LSE methodology continues by repeating steps (i)–(iv), i.e., by using the simplified model as a general model, by subjecting that model to a battery of diagnostic tests etc. Simplification is complete if any further simplification either fails any of the diagnostic tests or turns out to be statistically invalid as a restriction of the more general model. Let us assume that the application of LSE methodology has resulted in the following normal model of X, Y, and Z (cf. Hoover 2009, p. 502, notation modified):

( X , Y , Z ) ~ N (µ X , µY , µ Z ,σ 2X ,σ Y2 ,σ 2X , ρXY , ρXZ , ρYZ ),

106 Causality where μX, μY, and μZ are the three means, σ 2X , σ Y2 , and σ 2X the three variances, and ρXY, ρXZ, and ρYZ the three covariances or population correlations of the model. Hoover argues that the model supports the claim that Z causally depends on both X and Y if it satisfies the antecedent of the common effect principle, and that it supports the claim X and Y causally depend on Z if it satisfies the antecedent of the common cause principle. The two principles can be restated as follows (cf. Hoover 2009): - Principle of the Common Effect: If X and Y are probabilistically independent conditional on some set of variables (possibly the null set) excluding Z, but are probabilistically dependent conditional on Z, then Z causally depends on both X and Y (then Z forms an unshielded collider on the path XZY). - Principle of the Common Cause: If X and Y are probabilistically dependent conditional on some set of variables (possibly the null set) excluding Z, but are probabilistically independent conditional on Z, then X and Y causally depend on Z. Hoover argues, more specifically, that the normal model of X, Y, and Z supports the claim that Z causally depends on both X and Y if ρXY = 0 and ρXY⎹ Z ≠ 0, and that it supports the claim that X and Y causally depend on Z if ρXY ≠ 0 and ρXY⎹ Z = 0. There are three problems that are potentially inherent to attempts to infer these claims from probability models. The first problem is that in practice, LSE methodology might be incapable of implementation without data mining, which Deborah Mayo (1996, pp. 316–7) characterizes as data use for double duty, i.e., as the use of data to arrive at a claim (e.g. at the claim that Z causally depends on both X and Y, or that X and Y causally depend on Z) in such a way that the claim is constrained to satisfy some criteria (e.g. absence of misspecification), and that the same data is regarded as supplying evidence in support of the claim arrived at. Aris Spanos (2000), however, argues that there are problematic and non-problematic cases of data mining. The second problem is that Hoover’s claim that an adequate probability model can be specified independently of any causal assumptions might not be accurate. If there are hidden variables (i.e., variables that cannot be measured and are known to be causally relevant), then these variables cannot be included in a deliberately overfitting general model, and then the model resulting from the application of LSE methodology cannot be said to be adequate.8 But even if the probability model can be said to be adequate, there will be the third problem that neither principle obtains in cases in which ρXY ≠ 0 denotes a nonsense correlation like that between higher than average sea levels and higher than average bread prices (Sober 2001, p. 332), or that between cumulative rainfall in Scotland and inflation (Hendry 1980, pp. 17–20). It would be absurd to ask for the variable, which causally depends on X and Y, or for the variable, on which X and Y causally depend if X and Y were correlated in a way that doesn’t make any sense. Hoover (2003; 2009) responds to this problem by distinguishing stationary and non-stationary time series that provide values to X and Y, and by arguing that

Notes

1 Hoover (2001, p. 100), for instance, stands in the second tradition and characterizes his "structural account" of causality as "not inconsistent" with Nancy Cartwright's account of causes as capacities. Cartwright is sympathetic to probability theories of causality, but holds that (high) conditional probabilities (or regularities) only manifest "nomological machines," where a nomological machine is "a fixed (enough) arrangement of components, or factors, with stable (enough) capacities that in the right sort of stable (enough) environment, give rise to the kind of regular behavior that we represent in our scientific laws" (Cartwright 1999, p. 50).
2 Instead of relations of 'causal dependence,' theorists sometimes speak of relations of 'type-level causation' (cf. Chapter 2). Both ways of speaking refer to relations between variables (e.g. to the relation between income and consumption in general), and not to relations between events (i.e., not to relations like that between the event of US income attaining a specific value in Q4 2019 and the event of US consumption attaining a specific value in Q4 2019).
3 The new result stated that money accounted for only 4% of the variance in GNP (cf. Sims 1980b).
4 Zellner (1979, p. 12; 1988, p. 7) points out that he adopts that definition from Herbert Feigl.
5 Cf. Norton (2011) for a particularly concise and thorough discussion of these problems.
6 Much of the notation that the present section uses to describe causal Bayes nets theory is borrowed from Spohn (2012, Section 14.8).
7 One would need to interpret E1t+1, more specifically, as encompassing omitted variables that adopt certain values and cause Yt+1 in t + 1 or earlier.
8 Cf. Chapter 4 (Section 4.5) for an elaboration of this second problem.
9 For a more thorough and critical discussion of Hoover's response, cf. Reiss (2015, Chapter 8).

References

Bessler, D. A. and Lee, S. (2002). "Money and Prices: U.S. Data 1869–1914 (A Study with Directed Graphs)." Empirical Economics 27(3), 427–46.

Cartwright, N. (1979). "Causal Laws and Effective Strategies." Nous 13(4), 419–37.
Cartwright, N. (1999). The Dappled World. Cambridge: CUP.
Cheung, Y.-W. and Lai, K. S. (1993). "Finite-Sample Sizes of Johansen's Likelihood Ratio Tests for Cointegration." Oxford Bulletin of Economics and Statistics 55, 313–28.
Demiralp, S. and Hoover, K. D. (2003). "Searching for the Causal Structure of a Vector Autoregression." Oxford Bulletin of Economics and Statistics 65, 745–67.
Granger, C. W. J. (1969). "Investigating Causal Relations by Econometric Models and Cross-Spectral Methods." Econometrica 37(3), 424–38.
Granger, C. W. J. (1980). "Testing for Causality: A Personal Viewpoint." Journal of Economic Dynamics and Control 2(4), 329–52.
Hendry, D. (1980). "Econometrics – Alchemy or Science?" Economica 47(188), 387–406.
Hoover, K. D. (1993). "Causality and Temporal Order in Macroeconomics or Why Even Economists Don't Know How to Get Causes from Probabilities." The British Journal for the Philosophy of Science 44(4), 693–710.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Hoover, K. D. (2003). "Nonstationary Time Series, Cointegration, and the Principle of Common Cause." The British Journal for the Philosophy of Science 54, 527–51.
Hoover, K. D. (2009). "Probability and Structure in Econometric Models." In Glymour, C., Wei, W. and Westerståhl, D. (eds.), Logic, Methodology, and Philosophy of Science. London: College Publications, 497–513.
Jacobs, R. L., Leamer, E. E. and Ward, M. P. (1979). "Difficulties with Testing for Causation." Economic Inquiry 17, 401–13.
Johansen, S. (1988). "Statistical Analysis of Cointegration Vectors." Journal of Economic Dynamics and Control 12, 231–54.
Maddala, G. S. and Lahiri, K. (2009). Introduction to Econometrics. 4th ed. Chichester: Wiley & Sons.
Mayo, D. G. (1996). Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.
Norton, J. D. (2011). "Challenges to Bayesian Confirmation Theory." In Bandyopadhyay, P. S. and Forster, M. R. (eds.), Handbook of the Philosophy of Science. Vol. 7: Philosophy of Statistics. Amsterdam: Elsevier.
Pagan, A. (1995). "Three Methodologies: An Update." In Oxley, L., George, D. A. R. and Roberts, C. J. (eds.), Surveys in Econometrics. Oxford: Basil Blackwell, 30–41.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge: Cambridge University Press.
Reiss, J. (2015). Causation, Evidence, and Inference. London: Routledge.
Sargent, T. J. (1977). "Response to Gordon and Ando." In Sims, C. A. (ed.), New Methods in Business Cycle Research. Minneapolis: Federal Reserve Bank of Minneapolis.
Sims, C. A. (1972). "Money, Income and Causality." American Economic Review 62(4), 540–52.
Sims, C. A. (1980a). "Macroeconomics and Reality." Econometrica 48, 1–48.
Sims, C. A. (1980b). "Comparison of Interwar and Postwar Business Cycles: Monetarism Reconsidered." The American Economic Review 70(2), 250–57.
Sober, E. (2001). "Venetian Sea Levels, British Bread Prices, and the Principle of the Common Cause." The British Journal for the Philosophy of Science 52, 331–46.
Spanos, A. (2000). "Revisiting Data Mining: 'Hunting' with or without a License." Journal of Economic Methodology 7(2), 231–64.
Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction and Search. New York: Springer.

Spohn, W. (1980). "Stochastic Independence, Causal Independence, and Shieldability." Journal of Philosophical Logic 9, 73–99.
Spohn, W. (1983). "Probabilistic Causality: From Hume via Suppes to Granger." In Galavotti, M. C. and Gambetta, G. (eds.), Causalità e modelli probabilistici. Bologna: Cooperativa Libraria Universitaria.
Spohn, W. (2012). The Laws of Belief. Oxford: OUP.
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland.
Woodward, J. (2003). Making Things Happen: A Causal Theory of Explanation. Oxford: OUP.
Zellner, A. (1979). "Causality and Econometrics." Carnegie-Rochester Conference Series on Public Policy 10(1), 9–54.
Zellner, A. (1988). "Causality and Causal Laws in Economics." Journal of Econometrics 39, 7–21.

Part II

Objectivity

Tobias Henschen

In Part I of this book, I showed that macroeconomists (need to) use causal models to justify macroeconomic policy decisions. A causal model is a system of equations representing the relations of direct type-level causation that macroeconomists believe obtain between macroeconomic aggregates. To say that macroeconomic aggregate X directly type-level causes macroeconomic aggregate Y means (roughly) that there is a possible intervention on X that changes Y, where an intervention is understood as a manipulation of an intervention variable I that satisfies conditions requiring that X causally depend on I, and that there be no confounders of X and Y, and where an intervention variable is either a variable or a parameter. The most important result of Part I of this book is a negative one. It states that the empirical evidence that macroeconomists can provide when employing empirical procedures of causal inference is in principle too inconclusive to support any specific relations of direct type-level causation. This negative result raises the following question: can the causal models, on which macroeconomic policies rely, be said to be scientifically objective in any sense? Part II of this book is supposed to answer this question by analyzing various meanings that are traditionally ascribed to “scientific objectivity,” by checking whether macroeconomic causal models qualify as scientifically objective in any of these meanings, and by exploring the steps that might need to be taken to make these models scientifically objective in any of these meanings. The traditional meanings of “scientific objectivity” are the ones that Lorraine Daston and Peter Galison (2007) describe when investigating the historical development of the concept of scientific objectivity: truth-to-nature, mechanical objectivity, structural objectivity, and trained judgment. Truth-to-nature is objectivity in the sense of the truth of a scientific theory, model, or image. Since theories (like Newtonian mechanics), models (like macroeconomic causal models) and images (like that of a plant species) necessarily preselect, approximate, and idealize, the truth in question is clearly of the approximate kind. But the approximate truth of a theory, model, or image in general suffices to guarantee that what the theory, model, or image refers to or represents (physical body motion, relations of direct type-level causality, or a plant species) really exists.


Mechanical or structural objectivity, by contrast, is the objectivity of a theory, model, or image that scientists select independently of the non-epistemic values (ideologies, value judgments, group interests, etc.) that they happen to endorse. The difference between mechanical and structural objectivity is mainly that the former relates to the presence of empirical evidence that is received by passively registering "machines" (e.g. cameras), while the latter relates to the presence of empirical evidence that is provided by applications of various epistemic methods (proofs, experiments, randomized controlled trials, statistical tests, self-registering instruments, etc.). Finally, trained judgment is objectivity in the sense of "expertise": theories, models, or images are thought to be objective in this sense if they result from some kind of expert activity (theory construction, model specification, or image production). Daston and Galison (2007, p. 36) point out that "[o]bjectivity and subjectivity define each other, like left and right or up and down. One cannot be understood, even conceived, without the other." The four types of objectivity, therefore, each have a different subjective counterpart: "the sage, whose well-stocked memory synthesizes a lifetime of experience with skeletons or crystals or seashells into the type of that class of objects", in the case of truth-to-nature; "the indefatigable worker, whose strong will turns inward on itself to subdue the self into a passively registering machine," in the case of mechanical objectivity; "all human beings – indeed, all rational beings, Martians and monsters included," in the case of structural objectivity; and "the intuitive expert, who depends on unconscious judgment to organize experience into patterns in the very act of perception," in the case of trained judgment (cf. Daston and Galison 2007, pp. 44, 46). Daston and Galison (2007, p. 36) argue that "[f]irst and foremost, objectivity is the suppression of some aspect of the self, the countering of subjectivity," and that truth-to-nature and trained judgment, therefore, do not qualify as types of objectivity in the strict sense of the term. They also defend the claim that scientific objectivity in the strict sense of the term "first emerged in the mid-nineteenth century" (Daston and Galison 2007, p. 27), and that, in their words, "[b]efore objectivity, there was truth-to-nature; after the advent of objectivity came trained judgment" (Daston and Galison 2007, p. 28). Daston and Galison (2007, pp. 29–31) provide a lot of etymological evidence in support of that claim. But they also concede that "[t]he emergence of objectivity […] in the mid-nineteenth century did not abolish truth-to-nature, any more than the turn to trained judgment in the early twentieth century eliminated objectivity" (Daston and Galison 2007, p. 18). Perhaps they are likewise prepared to refer to truth-to-nature and trained judgment as types of objectivity in the wider sense of the term. Chapters 6–8 are supposed to show that in causal modeling in macroeconomics, scientific objectivity has been traditionally conceived of as truth-to-economy (which is, of course, just the macroeconomic equivalent of truth-to-nature), structural objectivity, and trained judgment. One may say that a causal model used to

justify macroeconomic policy decisions is objective in the sense of "truth-to-economy" if what it represents and refers to (relations of direct type-level causation and their relata, aggregate quantities) really exists, that it is objective in the sense of "structural objectivity" if it is selected independently of the non-epistemic values that macroeconomists happen to endorse, and that it is objective in the sense of "trained judgment" if its specification relies on the intuitions of macroeconomic experts. Mechanical objectivity drops out because empirical evidence in support of relations of direct type-level causation cannot be received by passively registering machines. If the empirical evidence that macroeconomists can provide when employing empirical procedures of causal inference is in principle too inconclusive to support specific relations of direct type-level causation, then scientific realism and expertise will be problematic, while the independence of non-scientific values will be impossible. Scientific realism and expertise will be problematic because there will be no way for us to decide whether there are causal relations connecting aggregate quantities or whether any of the causal models that macroeconomic experts specify adequately represent the relations of direct type-level causation that obtain in the situation at hand. And independence of non-scientific values will be impossible because macroeconomists won't be able to select macroeconomic causal models (or the causal hypotheses they express) based on empirical evidence alone. If they select a causal model, then the non-scientific values that they happen to endorse will inevitably play a role. What are the steps that need to be taken to render macroeconomic causal models scientifically objective in any of the traditional meanings? These steps become visible if macroeconomics is imagined to be "well-ordered" in the sense of Philip Kitcher. According to Kitcher (2011), a scientific discipline is well-ordered if significance (the question of what problems research should be conducted on) is dealt with in an ideal discussion under mutual engagement; if certification (the acceptance of research results as providing evidence for scientific claims) results from applications of methods that an ideal deliberation endorses as reliable, and that conform to the ideal of transparency; and if application (the use of public knowledge to solve urgent problems) is the topic of an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available. In macroeconomics, the problem arises that the methods that an ideal deliberation endorses as reliable are currently unavailable. In order for these methods to be forthcoming, progress needs to be made especially in the study of the formation of individual expectations: macroeconomists won't be able to measure expectations unless they understand the formation of expectations, and they won't be able to provide conclusive evidence in support of causal hypotheses unless they manage to measure expectations. If these methods fail to be forthcoming, then macroeconomic policy analysis will turn out to lack secure scientific-empirical foundations:

then macroeconomic policy will be led astray when trying to manipulate macroeconomic quantities to influence others.

References

Daston, L. and Galison, P. (2010). Objectivity. New York: Zone Books.
Kitcher, P. (2011). Science in a Democratic Society. Amherst, NY: Prometheus Books.

6 Scientific Realism in Macroeconomics

6.1 Introduction

Truth-to-nature has been characterized above as (approximate) truth of a scientific theory, model, or image and as existence of what that theory, model, or image represents and refers to. This characterization suggests that truth-to-nature is the type of objectivity that is to be ascribed to theories, models, and images according to the position that philosophers refer to as 'scientific realism.' As such, however, it stands in need of relativization, clarification, and defense: in need of relativization because realism in macroeconomic policy analysis might not coincide with the type of realism that philosophers have in mind when discussing scientific realism in general (hence my use of the expression 'truth-to-economy' in the title to the present chapter); in need of clarification because it is not so clear (even controversial) just what the terms '(approximate) truth,' 'existence,' 'reference,' or 'representation' are supposed to mean in the contexts of scientific realism in general and in macroeconomics in particular; and in need of defense because anti-realists have advanced forceful arguments against general scientific realism that might also apply to scientific realism in macroeconomics. The negative result of Part I makes it clear that a defense against these arguments cannot guarantee that scientific realism is entirely unproblematic in macroeconomic causal modeling. If macroeconomic dynamic-stochastic general-equilibrium (DSGE) models conflict with the ontology of macroeconomic aggregates, and if the empirical evidence that macroeconomists can provide in support of relations of direct type-level causation is inconclusive in principle, then we simply don't know whether a given macroeconomic causal model is (approximately) true, or whether the relations of direct type-level causation that it represents exist. But a defense of scientific realism in macroeconomics can at least defend the problematic character of truth-to-economy, i.e., emphasize that objectivity in the sense of truth-to-economy is given to us as a problem: that macroeconomic causal models might be (approximately) true, and that the relations of direct type-level causation that they represent might exist. The implication of the problematic character of truth-to-economy is that at one point in the future, macroeconomists might arrive at more secure knowledge of the relations of direct type-level causation that can be exploited for purposes of policy analysis.

The present chapter is to defend the problematic character of truth-to-economy in four sections. Sections 6.2 and 6.3 will look at the history of scientific realism in economics from David Ricardo to the measurement-without-theory debate and work toward a characterization of the type of scientific realism that is relevant in macroeconomic policy analysis today. Section 6.4 will provide a characterization of scientific realism, as it is relevant in macroeconomic policy analysis today, and defend that realism against its main rival in economics, instrumentalism. Section 6.5 will defend the problematic character of scientific realism in macroeconomics. It will argue with Uskali Mäki (1996) and others that what is problematic in macroeconomics is the (approximate) truth of causal models and the existence of what they represent, and that the existence of macroeconomic aggregates (what the variables of macroeconomic causal models refer to) is relatively unproblematic. But Section 6.5 will also defend the problematic character of scientific realism in macroeconomics against two arguments that anti-realists have advanced against the (approximate) truth of scientific theories, and that also seem to apply to scientific realism in macroeconomics: against the argument from pessimistic meta-induction and the argument from skepticism about truth. For most of this chapter, I will assume that the principle of common cause is a valid principle in macroeconomics (cf. Section 5.7 of Chapter 5): that the time series that provide values to macroeconomic variables are either stationary or non-stationary and co-integrated (that the correlations that obtain between macroeconomic aggregates do not amount to "nonsense" correlations). If the principle of common cause is a valid principle in macroeconomics, then relations of direct type-level causation between macroeconomic aggregates will exist in the economic system: then the correlation between macroeconomic aggregate X and macroeconomic aggregate Y is materially equivalent to a relation of direct type-level causation between X and Y (with X being a direct type-level cause of Y or Y being a direct type-level cause of X), or to a macroeconomic aggregate (or set of aggregates) Z being a direct type-level cause of both X and Y. In this chapter, then, I will not question the existence of relations of direct type-level causation between macroeconomic aggregates in general. I will investigate whether the specific relations of direct type-level causation that macroeconomic causal models represent can be said to exist. I will explore, in other words, whether these models can be said to be (approximately) true. But I will not question the truth of the more general claim that relations of direct type-level causation between macroeconomic aggregates exist in the economic system.

6.2 Newton or Kepler?

In their historical account of the development of the concept of scientific objectivity, Daston and Galison (2010, p. 42) describe truth-to-nature as follows: Eighteenth-century and early nineteenth-century anatomists and naturalists and their artists worked in a variety of media […] and with a variety of methods […]. But almost all the atlas makers were united in the view that what the

image represented, or ought to represent, was not the actual individual specimen before them but an idealized, perfected, or at least characteristic exemplar of a species or other natural kind. To this end, they carefully selected their models, watched their artists like hawks, and smoothed out anomalies and variations […]. They defended the realism – the 'truth-to-nature' – of underlying types and regularities against the naturalism of the individual object, with all its misleading idiosyncrasies. They were painstaking to the point of fanaticism in the precautions they took to ensure the fidelity of their images, but this by no means precluded intervening in every stage of the image-making process to 'correct' nature's imperfect specimens. Daston and Galison (2010, p. 19) narrow their sights to image- and atlas-making "because scientific atlases have been central to scientific practice across disciplines and periods; and […] because atlases set standards for how phenomena are to be seen and depicted." They make it clear, that is, that the purview of truth-to-nature "encompasses far more than images." A good example of what that purview encompasses is Newtonian mechanics. Consider Newton's first law of motion: every body continues in its state of rest or of uniform motion in a straight line unless it is compelled to change that state by force impressed upon it. Uniform body motion in a straight line is nothing to be observed in nature. But Newton would say that this is because every body is compelled to veer out of its straight line as a result of the external force impressed on it. What the first law of motion represents is therefore "not the actual individual specimen […] but an idealized, perfected […] exemplar of" physical body motion. And Newton can be seen as defending "the realism – the 'truth-to-nature' – of" that idealized, perfected exemplar "against the naturalism of the individual object, with all its misleading idiosyncrasies." Truth-to-nature of the Newtonian kind found its way into economics when Ricardo proposed to specify simplified models to analyze policy problems. The best-known model that he specified for that purpose is the comparative-advantage model of international trade from Chapter 7 of his Principles (1817). The model says that two countries benefit from international trade if they produce two commodities, if labor is the only production factor in both countries, if the unit cost of production of each commodity (expressed in terms of labor) is constant in both countries, and if there's a difference in comparative costs (i.e., if the ratio between the absolute unit costs of the two commodities in the one country is different from the ratio of these costs in the other country). The countries referred to in the model (countries that produce only two commodities with only one production factor under constant unit costs of production) cannot be observed in reality. So what the model represents is "not the actual individual specimen" of a two-country economy "but an idealized, perfected, or at least characteristic exemplar of" that economy. Despite the idealism of that model, Ricardo was a passionate proponent of free trade. He may therefore be regarded as defending "the realism – the 'truth-to-nature' – of" the ideal two-country economy "against the naturalism of" the real two-country economy "with all its misleading idiosyncrasies."
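The comparative-cost condition can be illustrated with a minimal sketch; the labor-cost figures are the ones standardly used to render Ricardo's example, and the Python formulation is mine, not the book's:

```python
# Hypothetical constant unit labor costs (hours per unit), two countries, two goods.
costs = {
    "England":  {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90,  "wine": 80},
}

# Opportunity cost of one unit of cloth, measured in units of wine forgone.
opportunity_cost = {country: c["cloth"] / c["wine"] for country, c in costs.items()}
print(opportunity_cost)  # England ~0.83, Portugal ~1.125: the ratios differ

# Comparative advantage: the country with the lower opportunity cost of cloth
# specializes in cloth, the other in wine. Gains from trade require only that
# the two cost ratios differ, not that either country is better at everything.
cloth_producer = min(opportunity_cost, key=opportunity_cost.get)
wine_producer = max(opportunity_cost, key=opportunity_cost.get)
print(f"{cloth_producer} specializes in cloth, {wine_producer} specializes in wine.")
```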

In the nineteenth century, Ricardo-style simplified models were, of course, not only used for purposes of policy analysis. Consider, for instance, the model that John Stuart Mill discusses in chapter 11 of book 2 of his Principles of Political Economy (1848). The model says that the nominal wage falls if laboring population growth is positive, if capital expended in direct purchase of labor either decreases or remains constant, and if there is competition (i.e., no upward wage rigidity because of monopolies deciding to demand less labor than necessary to produce the optimum output, no downward wage rigidity because of trade unions impeding a decrease in wages). The model is not immediately useful for policy purposes. But it is clear that what it represents is "not the actual individual specimen" of a labor market "but an idealized, perfected, or at least characteristic exemplar of" that market. Wages with no upward or downward rigidity cannot be observed in reality, and causes that (like technological progress or an excessive supply of unqualified labor) potentially counteract positive laboring population growth are assumed to be absent. Mill can also be seen to provide the first philosophical defense of truth-to-economy of the Newtonian kind. He famously defines political economy as the "science which traces the laws of such of the phenomena of society as arise from the combined operations of mankind for the production of wealth, in so far as those phenomena are not modified by the pursuit of any other object" (Mill 1836/1994: 54). He goes on to distinguish an inductive (or a posteriori) and a deductive (or a priori) method: "By the method à posteriori we mean that which requires, as the basis of its conclusions, not experience merely, but specific experience. By the method à priori we mean […] reasoning from an assumed hypothesis" (Mill 1836/1994: 56). And he concludes that the deductive method is the only legitimate method in political economy: In the definition we have attempted to frame of the science of Political Economy, we have characterized it as essentially an abstract science, and its method as the method à priori. […] It reasons, and, as we contend, must necessarily reason, from assumptions, not from facts. It is built on hypotheses, strictly analogous to those which, under the name of definitions, are the foundation of the other abstract sciences. […] But we go farther than to affirm that the method à priori is a legitimate mode of philosophical investigation […]; we contend that it is the only method (Mill 1836/1994, pp. 56, 58). He derives this conclusion, of course, from his observation that political economy cannot conduct any crucial experiments: How, for example, can we obtain a crucial experiment on the effect of a restrictive commercial policy upon national wealth? We must find two nations alike in every other respect, or at least possessed, in a degree exactly equal, of everything which conduces to national opulence, and adopting exactly the same in all their other affairs, but differing in this only, that one of them

adopts a system of commercial restrictions, and the other adopts free trade. This would be a decisive experiment, similar to those which we can almost always obtain in experimental physics. Doubtless this would be the most conclusive evidence of all if we could get it. But let anyone consider how infinitely numerous and various are the circumstances which either directly or indirectly do or may influence the amount of the national wealth, and then ask himself what are the probabilities that in the longest revolution of ages two nations will be found, which agree, and can be shown to agree, in all those circumstances except one (Mill 1836/1994, p. 59). Note that the reason that Mill gives to his claim of the impossibility of conducting crucial experiments in political economy coincides with the second of the two reasons that I give in Section 4.2 of Chapter 4 to emphasize the impossibility of conducting RCTs in macroeconomics. In economics, truth-to-economy of the Newtonian kind probably reached its climax when Léon Walras laid the foundations for general equilibrium theory in his Éléments d'économie politique pure (1874). According to the definition that is central to that theory, a price vector and an allocation of consumed and produced goods constitute a competitive equilibrium if firms maximize profits, households maximize utility and all markets clear. The equilibrium referred to in that definition (an equilibrium in which all firms maximize profits, all households maximize utility and all markets clear) is nothing to be observed in reality. So what that definition represents is "not the actual individual specimen" of an economy "but an idealized, perfected, or at least characteristic exemplar of" that economy. The same definition, however, has important policy implications. The first welfare theorem from welfare economics says that an allocation of consumed and produced goods is Pareto-efficient (or -optimal) if this allocation combines with a price vector to constitute a competitive equilibrium (where an allocation counts as Pareto-efficient if there is no alternative allocation that makes some consumer better off without making another consumer worse off). If an allocation of consumed and produced goods is Pareto-efficient if it combines with a price vector to constitute a competitive equilibrium, then policy interventions disrupting competition can, of course, only reduce welfare. So anyone referring to general equilibrium theory when advocating laissez faire policies can be seen as defending "the realism – the 'truth-to-nature' – of" an ideal economy "against the naturalism of" a real economy "with all its misleading idiosyncrasies." A strong commitment to truth-to-economy of the Newtonian kind can also be found among the representatives of the Austrian School of Economics. Consider e.g. what Carl Menger (1884, pp. 18–9) has to say about the "exact" branch of theoretical research in economics: [D]er exacten Richtung der theoretischen Forschung […] [fällt die Aufgabe zu], die realen Erscheinungen der Volkswirthschaft auf ihre einfachsten streng typischen Elemente zurückzuführen und uns, auf der Grundlage

des Isolirungsverfahrens, die (exacten) Gesetze darzulegen, nach welchen sich complicirtere Erscheinungen der Volkswirthschaft aus den obigen Elementen entwickeln, um uns auf diesem Wege, zwar nicht das Verständnis der socialen Erscheinungen 'in ihrer vollen empirischen Wirklichkeit', wohl aber jenes der wirthschaftlichen Seite derselben zu verschaffen. Menger thinks that there is a full empirical and an economic reality of social phenomena, that the "exact" branch of theoretical research in economics has to deal with their economic reality, and that it has access to that reality when applying what Menger calls the 'procedure of isolation.' When truth-to-economy is characterized as Newtonian, there are at least two points that need to be observed. The first point is that Newtonian truth-to-economy has never been uncontested. It can be distinguished from truth-to-economy of a Keplerian kind: from "the realism – the 'truth-to-nature' – of underlying types and regularities" that do not represent ideal and unobservable entities and relations but observable or measurable entities and empirical relationships. And truth-to-economy of the Keplerian kind has been defended, first and foremost, by the members of the German historical school. In his review of Menger's Untersuchungen über die Methode der Sozialwissenschaften, Gustav von Schmoller (1883/1998, pp. 161–2), for instance, urges that in economics, descriptive disciplines like statistics and history need to be used to derive the underlying types and regularities: Die Scheidung der Erkenntnisrichtungen […] ist unzweifelhaft von einer gewissen Berechtigung. […] [S]o können Statistik und Geschichte […] den Arbeiten entgegen gesetzt werden, welche das generelle Wesen der volkswirthschaftlichen Erscheinungen darstellen wollen. Aber dieser Gegensatz darf nicht als eine unüberbrückbare Kluft aufgefaßt werden. Die […] deskriptive Wissenschaft liefert die Vorarbeiten für die allgemeine Theorie; diese Vorarbeiten sind um so vollendeter, als die Erscheinungen nach allen wesentlichen Merkmalen, Veränderungen, Ursachen und Folgen beschrieben sind. […] Jede vollendete Beschreibung ist also ein Beitrag zur Feststellung des generellen Wesens der betreffenden Wissenschaft. According to Schmoller, the description of the underlying types and regularities can be so accomplished that the descriptive disciplines and the general theory (what Menger refers to as the 'exact' branch of theoretical research in economics) coincide. The second point to be observed is that hardly any of the above-mentioned theorists defends a pure variant of truth-to-economy of either the Newtonian or Keplerian kind. A more careful reading of his philosophical defense of truth-to-economy reveals that Mill (1836/1994, p. 58) believes that the inductive method "forms an indispensable supplement" to the deductive method. The inductive method forms an indispensable supplement to the deductive method because it is needed

to reduce the uncertainty that arises from the presence of potentially counteracting causes when the principles of political economy are applied to particular cases: When the Principles of Political Economy are to be applied to … diverted from any of them. […] [T]he à posteriori method […] is of great value […] as a means of […] reducing to the lowest point that uncertainty […] alluded to as arising from the complexity of every particular case, and from the difficulty (not to say impossibility) of our being assured à priori that we have taken into account all the material circumstances (1836/1994, pp. 60–1). For Mill (1836/1994, p. 63), the "true practical statesman" is therefore "he who combines […] experience with a profound knowledge of abstract political philosophy." On closer inspection, it also becomes clear that Menger (1884, pp. 18, 26) believes that the "exact" branch of theoretical research is only one of two branches of equal importance, that the other branch is the "empirical" one, and that the study of history is beneficial to understanding economic phenomena: "Niemand leugnet […] den Nutzen, welchen die Geschichte der Volkswirthschaft an sich für das Verständnis der volkswirthschaftlichen Erscheinungen hat." Similarly, his nemesis, Schmoller (1911/1998, p. 322), sounds more conciliatory when emphasizing that as a method of scientific thought, the inductive method is on a par with the deductive method: "Seit Jahren pflege ich den Studierenden zu sagen, wie der rechte und linke Fuß zum Gehen, so gehöre Induktion und Deduktion gleichmäßig zum wissenschaftlichen Denken." Why is it that Mill, Menger, and Schmoller first passionately defend a pure variant of truth-to-economy of either the Newtonian or the Keplerian kind and then, more soberly, commit themselves to a mixed variant? One reason is probably merely sociological. At least in the case of Menger and Schmoller, there had been a struggle for influence between the representatives of the Austrian School of Economics and the German Historical School. Menger (1884, p. 31) suggests that his passionate defense of Newtonian truth-to-economy might also have to do with his worry that the share of economists with historical leanings in the totality of university positions in Germany is unduly high. The other and philosophically more important reason is that Mill, Menger, and Schmoller believe that in the mixed variant of truth-to-economy either the Keplerian or the Newtonian kind predominates. Mill (1836/1994, p. 58) emphasizes the primacy of the Newtonian kind when saying that the a posteriori method "admits of being usefully applied in aid of the method à priori." Similarly, Menger (1884, pp. 29–30) takes the "exact" branch of theoretical research in economics to predominate when referring to statistics, history, and other descriptive disciplines as "auxiliary sciences." Schmoller (1883/1998, pp. 161–2), by contrast, insists that descriptive methods need to predominate: "Je unvollkommener noch in einer Wissenschaft der deskriptive Theil ist, je mehr die Theorie nur in einer Summe vorläufiger, noch zweifelhafter,

theilweise verfrühter Generalisationen besteht, desto größer muß der Abstand sein. Und das scheint mir die Lage […] der Nationalökonomie […] zu sein. Der Weg der Abhülfe liegt darin, daß zunächst und vor Allem die Beobachtung vermehrt, verschärft, verbessert wird […]. Es ist keineswegs eine Vernachlässigung der Theorie, sondern der nothwendige Unterbau für sie, wenn in einer Wissenschaft zeitweise überwiegend deskriptiv verfahren wird." Schmoller thinks that it is not necessarily illegitimate for the "exact" branch of theoretical research in economics to predominate. But he also thinks that descriptive methods need to predominate as long as the theoretical generalizations of a discipline are as provisional, tentative, and doubtful as he thinks they are in economics.

6.3 Truth-to-Economy in the "Measurement without Theory" Debate

What unites theorists from Ricardo to Schmoller is the assumption that probability models cannot be applied to economic data: histograms of economic data rarely display a bell curve, and economic behavior doesn't seem to be sufficiently uniform to be cast into the shape of a probability model. Today, however, it has become natural to assume that probability models can be applied to economic data. The work that is widely acknowledged to have paved the way for that shift in assumptions is Trygve Haavelmo's The Probability Approach to Econometrics (1944). Haavelmo's insight was that economic theory can be used to account for the naturally occurring variations in economic variables: that economic theory can be used to specify theoretical models about "autonomous" relationships (i.e., about relationships that are causal in the sense of the definition defended in Chapter 2); that theoretical models can be used to specify statistical models, i.e., models that do not only rely on theoretical assumptions about autonomous relationships but also on statistical assumptions about functional form, the number of regressors, the behavior of the error terms and so on; and that statistical models can be shown to be (or not to be) statistically adequate, i.e., to conform (or not to conform) to the economic data chosen. If a statistical model is statistically adequate, then a statistical (or probability) model applies to economic data. Is this supposed to mean, however, that statistical adequacy is the touchstone of the truth or falsity of the theoretical model? Haavelmo (1944/1995, p. 484) answers the question in the negative: statistical adequacy could also indicate "that we might be trying out the theory on facts for which the theory was not meant to hold." Do we then have to assume that statistical (in-)adequacy doesn't have any implications for the truth or falsity of the theoretical model whatsoever? Haavelmo (1944/1995, p. 484) again provides a negative answer: "In order to test a theory against facts, […] either the statistical observations have to be 'corrected', or the theory itself has to be adjusted." What Haavelmo (1944/1995, p. 480) affirms is that the statistical adequacy of a statistical model indicates that the theoretical model that is used to specify the statistical model is 'almost true': "the question of whether or not an exact model is

'almost true' is really the same question as whether or not some other model that claims less is actually true in relation to the facts, or at least does not contradict the facts." That the theoretical model is almost true is not supposed to mean that it is sufficiently true to represent autonomous relationships. That it is almost true is supposed to mean that it is included in the set of theoretical models that are a priori admissible, and of which only one qualifies as true or sufficiently true to represent autonomous relationships. Statistical adequacy, to repeat, is not a touchstone of the (sufficient) truth or falsity of a theoretical model. But it may serve as a criterion that can be "used to narrow down the set of a priori admissible theoretical models" (Spanos 1989, p. 411). Theoretical models represent, to repeat, "autonomous" relationships: relationships that are causal in the sense of the definition defended in Chapter 2. Recall from Section 2.2 of Chapter 2 that causal models preselect in the sense that only a limited number of variables is included in the set V of causal structure variables. This strategy of preselection can be understood as a strategy of approximation: parameters that measure the strength of the causal influence of one variable on another are set to zero if their influence is deemed to be nonzero but negligible. Remember from Section 3.4 of Chapter 3, however, that the strategy of preselection can also be understood as "strategy of idealization": as the strategy of identifying and isolating "the real essences" or "causally effective capacities of economic reality" (Hoover 2001, pp. 126–7). Let us say that the strategy of preselection idealizes if theory is used to determine the number of variables to be included in the set V of causal structure variables, and that it approximates if only a minimum of theory is used to determine the number of variables to be included in V.1 Then Haavelmo can be said to endorse a variant of truth-to-economy, in which the Newtonian kind predominates. This variant is clearly of the mixed kind: it is of the Newtonian kind because theoretical (or causal) models are used to specify statistical models; and it is of the Keplerian kind because the statistical adequacy of a statistical model indicates that the theoretical model that is used to specify the statistical model is 'almost true.' But in the mixed variant that Haavelmo defends, the Newtonian kind predominates: without a theoretical model, there is no statistical model or test for statistical adequacy; and without economic theory, there is no theoretical model. The Newtonian predominance in Haavelmo's variant of truth-to-economy has been challenged early on. Arthur Burns and Wesley Mitchell (1946, p. 4), for instance, think they can observe business cycles closely and systematically to better understand and explain them: "[A]ll investigators cherish the ultimate aim – namely, to attain better understanding of the recurrent fluctuations in economic fortune that modern nations experience. This aim may be pursued in many ways.
The way we have chosen is to observe the business cycles of history as closely and systematically as we can before making a fresh attempt to explain them.” If the sort of systematic and careful observation that Burns and Mitchell undertake is supposed to lead to observable or measurable entities and empirical relationships, then the realism they endorse is a mixed variant of truth-to-economy, in which the Keplerian kind predominates.

Koopmans (1947/1995, p. 494), by contrast, sides with Haavelmo when reviewing Burns and Mitchell's 1946 study of business cycles and arguing "that even for the purpose of systematic and large-scale observation of such a many-sided phenomenon [as a business cycle], theoretical preconceptions about its nature cannot be dispensed with, and the authors do so only to the detriment of the analysis. […] The choices as to what variables to study cannot be settled by a brief reference to 'theoretical studies of business cycles'." Koopmans' review kicks off the "measurement without theory" debate: the intellectual encounter between econometricians who like Koopmans belong to the Cowles Commission (CC) and econometricians who like Burns and Mitchell are associated with the National Bureau of Economic Research (NBER). Both groups of researchers argue in favor of truth-to-economy of either the Newtonian or Keplerian kind. Neither group, however, wishes to defend a pure variant of truth-to-economy. Vining (1949/1995, p. 509), Koopmans' main opponent in the debate, explicitly avows himself to Haavelmo's approach when referring to it as a "more complete account of the philosophy or theory of the method of Koopmans' group." And Koopmans (1947/1995, pp. 491–2) strikes a conciliatory tone when, in one passage, distinguishing a "Kepler stage" and a "Newton stage" of the development of a scientific theory, and when arguing that "in research in economic dynamics the Kepler stage and the Newton stage of inquiry need to be […] intimately combined and to be pursued simultaneously." It's in fact from that passage that I borrow the terms 'Keplerian' and 'Newtonian' to distinguish a Keplerian and Newtonian type of truth-to-economy. Again, the question is why the CC and NBER associates first defend a pure variant of truth-to-economy of either the Newtonian or the Keplerian kind and then commit themselves to a mixed variant. Again, one reason appears to be merely sociological: Hendry and Morgan (1995, p. 69) point out that the "measurement without theory" debate was an argument between CC and NBER associates "against a backdrop of both groups seeking funding for their work." And again, another, philosophically more important, reason is that either side believes that in the mixed variant of truth-to-economy either the Keplerian or the Newtonian kind predominates: while Koopmans believes that the Newtonian kind predominates, Vining holds that the Keplerian kind predominates. To support his belief, Koopmans provides three arguments. His first argument is the one that is directed against Burns and Mitchell's proposal to observe business cycles closely and systematically in order to better understand and explain them, i.e., the argument that says that systematic and large-scale observations of economic fluctuations presuppose theoretical conceptions (cf. above). Koopmans' (1947/1995, p. 498) second argument says that "[w]ithout resort to theory […] conclusions relevant to the guidance of economic policies cannot be drawn." Without resort to theory, policy conclusions cannot be drawn because in order for an economist to draw policy conclusions, she needs to specify a system of structural equations in which an equal number of relevant variables are determined by the simultaneous validity of these equations. The simultaneous validity of these equations, however, is nothing to be inferred from mere observations of regularities or

time series. The simultaneous validity of these equations could be inferred if crucial experiments could be conducted: if causes and effects could be separated by varying causes one at a time, studying the separate effect of each cause. But while such experiments can be conducted in the natural sciences, they are unavailable to economists because the object studied in economics (an economy) is too complex to allow for isolations of separate causes and surgical interventions. According to Koopmans the simultaneous validity of a system of structural equations needs to be inferred from theory. Vining (1949/1995, p. 506) will later complain that Koopmans doesn't give the hypotheses of his theory "specific economic content." But from what he explains in his review and rejoinder to Vining, it becomes clear that these hypotheses relate to "the behavior of individual consumers, laborers, entrepreneurs, investors, etc., in the markets indicated by these terms" (Koopmans 1947/1995, p. 516). The variables that figure in these hypotheses are of three types: they are (a) determined directly through individual decisions, (b) determined by government or bank officials acting under law or conventional rule, and (c) determined by a productive process as described by a transformation function. If variables are of type (a), then the hypothesis in question may but need not be of the utility- or profit-maximizing type. What is important for Koopmans (1947/1995, p. 516) is also that variables representing aggregates of variables of types (a)–(c) must be regarded as "dependent on (or deducible from) these." It goes without saying that the evidence that can be provided in support of a theory of this kind isn't as strong as the evidence that can be provided in support of the fundamental hypotheses of the natural sciences. Koopmans (1947/1995, p. 496) believes, however, that economists do possess more elaborate and better established theories of economic behavior than the theories of motion of material bodies known to Kepler. These economic theories are based on evidence of a different kind than the observations embodied in time series: knowledge of the motives and habits of consumers and of the profit-making objectives of business enterprise, based partly on introspection, partly on interview or on inferences from observed actions and individuals – briefly, a more or less systematized knowledge of man's behavior and its motives. Koopmans believes, that is, that the evidence that can be provided in support of his economic hypotheses is strong enough to allow him to forego any further testing for these hypotheses and to go over to problems of estimation instead. The third argument that Koopmans (1947/1995, pp. 501–2) advances relates to "[t]he greater wealth, definiteness, rigor, and relevance to specific questions of such conditional information, as compared with any information extractable without hypotheses of the kind indicated." Koopmans points out that unlike celestial mechanics, where the phenomenon studied can be treated as a deterministic process (with some randomness entering only through measurement errors), dynamic economics needs to treat a phenomenon as a stochastic process because of the great number of factors at work. The main problem in dynamic economics is accordingly

the choice of an adequate test statistic: of that function of the observations that is to be used for parameter estimation or hypothesis testing. It is hardly surprising that Koopmans believes that this problem cannot be solved unless an explicit dynamic theory of the formation of economic variables is presupposed. The dynamic theory builds on the economic theory described above in that it represents the system of equations as stochastic difference equations: as equations describing responses to time lags (past values of economic variables affect current actions of individuals), and as equations describing the aggregate production process or the behavior of groups of individuals as determined in part by many minor factors, further scrutiny of which is either impossible or unrewarding. Of systems of that kind, Koopmans claims that they may possess a tendency for the variables to evolve in cyclical movements, but that these cycles need not be very regular or similar in duration or amplitude. In his rejoinder to Koopmans' review, Vining raises objections to each of the three arguments brought up by Koopmans. With respect to Koopmans' first argument, Vining (1949/1995, p. 506) suggests that "it appears not unfair to regard the formal economic theory underlying his approach as being in the main available from works not later than those of Walras." Vining (1949/1995, p. 507) further admits that this theory "might be just the conception that we need for accounting for and analyzing the uniformities discoverable among human individual and population phenomena." But he also points out that "such has not been demonstrated and until evidence of the adequacy of this model is made available, it is an unnecessary restriction upon economic research to insist that the method used shall be essentially that adopted and developed by Koopmans and his associates." In his rejoinder to Vining's rejoinder, Koopmans (1949/1995, p. 519) later rejects this objection and suggests that a strategy of holding onto a theory until it is disconfirmed is preferable to a strategy of not holding onto a theory until it is confirmed. He claims that examples from fundamental physics and the special sciences "suggest a scientific strategy of not discarding basic theoretical notions and assumptions before they 'cause trouble', that is, before observations are made which come in conflict with these basic notions and assumptions." What might perhaps be mentioned in response to this suggestion is that "trouble" caused by basic theoretical notions and assumptions in economics is usually associated with the relatively high social cost of a real economy crisis that couldn't be prevented or at least predicted on the basis of these notions and assumptions. A better strategy might accordingly be the one endorsed by Vining: try to find as many theories as possible that can be empirically confirmed to a certain extent. With respect to Koopmans' second argument, Vining (1949/1995, p. 511) claims that knowledge in economics is too scant to be put to use in economic policy counseling: [W]e have next to no knowledge at all in the field of economic variation. But we do not lack exponents of the application of knowledge in economics. […] Now, it would be clearly unjust and inappropriate to speak of the relation between economics as currently used in discussions of positive policy and

economics as a study of a field of variation as being similar to the relation between astrology and astronomy. But fortune telling provided the principal support of many of the prominent early astronomers, and is fortune telling too hard an expression for much of what we do? Note that in light of the negative result of Part I, Vining's claim that knowledge in economics is too scant to be put to use in economic policy counseling remains true to the present day. With respect to Koopmans' third argument, Vining (1949/1995, p. 513) maintains that "[s]ampling and estimation theory is important to a student of economic variation, but in a sense it is secondary." Of primary importance is statistical work comparable to that conducted by Burns and Mitchell, i.e., statistical work that aims at hypothesis seeking and gets along without many of the untestable assumptions needed for parameter estimation and hypothesis testing. When responding to this objection in his rejoinder, Koopmans (1947/1995, p. 518) admits that problems of parameter estimation and hypothesis testing "have a derived interest only." But he also holds that "this derived interest is strong."

6.4 Truth-to-Economy in Contemporary Macroeconomic Policy Analysis

The two positions defended in the "measurement without theory" debate are very much alive in contemporary macroeconomic policy analysis. Macroeconomists engaging in DSGE modeling have inherited the CC position of truth-to-economy, in which the Newtonian kind predominates. In macroeconomic DSGE modeling, theory is used to specify a theoretical model, which in turn is used to specify a statistical model, which in turn is used to estimate the parameters of the theoretical model. The theory used to specify the theoretical model (a DSGE model) is general equilibrium theory plus the representative-agent assumption. The statistical model used for purposes of parameter estimation is a structural vector autoregression (SVAR) model. Christopher Sims (1980) argued that vector autoregression (VAR) models (equation systems in which one variable is regressed on its own lags and the lags of all other variables in V) can be used to model the responses of variables to shocks. But Sims (1986) later came to accept that different causal structures have different implications for the shock-response functions and that accordingly SVAR models (VAR models incorporating restrictions deriving from economic theory) are needed to model the responses of variables to shocks. If estimations of SVAR model parameters are unbiased, they will narrow down the set of a priori admissible theoretical models. But they will not allow for the selection of one specific theoretical model from a pool of competing and observationally equivalent models. They will not allow, for instance, for the selection and preference of a new Keynesian DSGE model over a new classical competitor (a new classical DSGE or "real business cycle" model), or vice versa. While Walrasian general equilibrium theory is used to specify new classical DSGE models, non-Walrasian general equilibrium theory is used to specify new Keynesian DSGE models. According to Walrasian general-equilibrium theory, firms act as price takers, there is a single all-encompassing market, monetary policy doesn't have any real effects, and markets are perfectly competitive; according to non-Walrasian (or "Marshallian") general-equilibrium theory, firms act as price setters, there is a juxtaposition of separate markets, monetary policy might have real effects, and markets aren't perfectly competitive (cf. De Vroey 2004).
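For readers unfamiliar with the statistical model just mentioned, the following is a minimal sketch of a reduced-form VAR and its impulse responses on simulated data. The variable names, lag choice, and the use of statsmodels are illustrative assumptions of mine, not the specification of any of the DSGE or SVAR models discussed in this chapter.

```python
# A reduced-form VAR: each variable is regressed on its own lags and the lags
# of every other variable; impulse responses trace out reactions to shocks.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
n = 300
e = rng.normal(size=(n, 2))
y = np.zeros((n, 2))
for t in range(1, n):
    # Hypothetical stationary series standing in for, say, output growth and inflation.
    y[t, 0] = 0.5 * y[t - 1, 0] + 0.1 * y[t - 1, 1] + e[t, 0]
    y[t, 1] = 0.2 * y[t - 1, 0] + 0.4 * y[t - 1, 1] + e[t, 1]

data = pd.DataFrame(y, columns=["output_growth", "inflation"])
results = VAR(data).fit(maxlags=4, ic="aic")
print(results.summary())

irf = results.irf(10)  # responses of both variables to shocks over 10 periods
# The orthogonalized responses (irf.orth_irfs) depend on a variable ordering,
# i.e. on identifying restrictions -- the point at which "structure" (and hence
# theory) enters an SVAR, and the reason competing DSGE models can remain
# observationally equivalent with respect to the estimated reduced form.
```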

models. According to Walrasian general-equilibrium theory, firms act as price takers, there is a single all-encompassing market, monetary policy doesn’t have any real effects, and markets are perfectly competitive; according to non-Walrasian (or “Marshallian”) general-equilibrium theory, firms act as price setters, there is a juxtaposition of separate markets, monetary policy might have real effects, and markets aren’t perfectly competitive (cf. De Vroey 2004). Macroeconomists employing empirical procedures of causal inference (Hoover and econometricians conducting potential outcomes research) have inherited the NBER position of truth-to-economy, in which the Keplerian kind predominates. An empirical procedure like the LSE principle of encompassing is used to narrow down the set of a priori admissible theoretical models. Empirical information about structural breaks or statistical adequacy is then used to select one theoretical model from a pool of competing and observationally equivalent models (cf. Sections 4.4 and 4.5 of Chapter 4). Theory (or some kind of theoretical background knowledge) plays a role when chronologies of interventions are assembled, when non-measurable confounders are assumed to be absent, or when the selection-on-observables assumption is made (cf. Sections 4.4 and 4.5 of Chapter 4). But the role of theory is reduced to a minimum, and thus the Keplerian kind predominates in the position of truth-to-economy that Hoover and like-minded macroeconometricians endorse. The NBER position of truth-to-economy is also alive in the discipline of agent-based macroeconomics that I briefly described in Section 3.6 of Chapter 3. Agent-based macroeconomists use the results of empirical disciplines like econometrics or behavioral economics to model the direct interactions between heterogeneous agents in dynamic non-equilibrium. I noted in Section 3.6 of Chapter 3 that agent-based models underdetermine whether the modeled relations qualify as relations of causal dependence or supervenience. But I also suggested that their interpretation as causal or constitutive is, in each case, relatively straightforward. Like macroeconometricians conducting causal inference, agent-based macroeconomists do not get along without theory. Theory (or some kind of theoretical background knowledge) plays a role when they try to validate their models (when selecting initial values for variables and parameters to reproduce stylized facts), when they model specific aspects of the economic system and disregard others, or when they revert to network theory to understand the interactions between agents. But compared to the idealizing theory used in DSGE modeling, the kind and amount of theory used in agent-based macroeconomics is minimal. Therefore, the position of truth-to-economy agent-based macroeconomists endorse is adequately described as one in which the Keplerian kind predominates. There is an important objection that has been in the air since the very beginning of my historical analysis. The objection says that Ricardo and the other theorists that I say endorse a Newtonian variant of truth-to-economy do not qualify as realists but instrumentalists. Their use of theoretical models to justify policy decisions doesn’t imply that they believe in the (approximate) truth of these models. The fact that these models idealize to a substantial degree rather suggests that they think these models are false.
The falsity of these models doesn’t pose any challenges if they can be employed successfully for purposes of policy analysis. But if

Scientific Realism in Macroeconomics  129 theorists believe them to be false and successful, they qualify as instrumentalists, not realists. It is not implausible that many contemporary macroeconomists qualify as instrumentalists, and that they have inherited their instrumentalism from Friedman (1953, p. 153) who is famous for claiming that “[t]ruly important and significant hypotheses will be found to have ‘assumptions’ that are wildly inaccurate descriptive representations of reality,” and that “in general, the more significant the theory, the more unrealistic the assumptions.” But textual evidence speaks against the objection that earlier generations of economists endorse instrumentalist positions. Mill (1836/21994: 54), for instance, says that the hypothesis that man is a being who desires to possess wealth is “of all hypotheses equally simple, […] the nearest to the truth.” Realist statements can also be found in Haavelmo. Consider e.g. the following passage: A theoretical model […] is, as it stands, void of any practical meaning or interest. […] The model attains economic meaning only after a corresponding system of quantities or objects in real economic life has been chosen or described, in order to be identified with those in the model. […] The model thereby becomes an a priori hypothesis about real phenomena, stating that every system of values that we might observe of the ‘true’ variables will be one that belongs to the set of value-systems that is admissible within the model (Haavelmo 1944/1995, p. 485, emphasis in the original). It is also questionable whether instrumentalism is a sensible position to adopt in macroeconomics. The importance and significance that Friedman has in mind are predictive or explanatory success. He is not concerned with successful policy analysis or causal inference, and predictive or explanatory success does not imply successful policy analysis or causal inference. Are the theoretical models of macroeconomists engaging in DSGE modeling, causal inference, or agent-based modeling successful in terms of policy analysis? Macroeconomists engaging in DSGE modeling tend to answer the question affirmatively. But the eliminative program of the new classical or new Keynesian macroeconomics doesn’t allow them to model the chains of relations of causal dependence and supervenience that central banks and other policymaking institutions exploit as a matter of fact. These chains of relations include relations of supervenience (between microeconomic quantities and the macroeconomic aggregates supervening on them) and downward causation (between macroeconomic aggregates and microeconomic quantities). New classical or new Keynesian macroeconomics cannot model these relations because their program of microfoundations is one of eliminating macroeconomics by fully reducing it to microeconomics (general equilibrium theory). Macroeconomists engaging in causal inference likewise provide an affirmative answer (cf. Hoover 2001, pp. 59, 213–4). They think that central banks and other policymaking institutions exploit relations of causal dependence between

130 Objectivity macroeconomic aggregates and that empirical procedures can be successfully employed to provide conclusive evidence in support of the causal hypothesis that refers to these relations. I argued in Chapter 4, however, that the evidence that macroeconomists can provide in support of this hypothesis is too inconclusive in principle because the Lucas critique implies that expectations are likely to confound the relations of causal dependence that macroeconomists believe obtain between macroeconomic aggregates, and because macroeconomists cannot measure expectations or decide whether they can be controlled for. Agent-based macroeconomists are on the right track, but in order for policy analysis in agent-based macroeconomics to be successful, they need to make progress in the study of the formation of expectations. They won’t be able to tell whether variables standing for expectations or expectational aggregates can be controlled for unless they manage to measure expectations, and they won’t be able to measure expectations unless they understand how agents form expectations (cf. Section 3.5 of Chapter 3). Currently, reliable methods of studying the formation of expectations are unavailable, but they might be forthcoming in the future. Successful causal inference will, in any case, depend crucially on scientific progress in the study of the formation of expectations. In a more recent defense of instrumentalism in economics, Julian Reiss (2012, p. 367) explicitly concentrates on the case of causal models. He points out that in economics, the truth and usefulness (or significance) of causal models often come apart, and that economists are prepared to accept a false model, as long as that model can be successfully employed for purposes of explanation, prediction, or control (Reiss 2012, p. 371). Reiss (2012, p. 372–5) then goes on to argue that economists have good reasons to accept the false but useful model: whenever causal models are successfully employed for purposes of explanation, prediction, or control, what allows them to be employed for these purposes is an additional fact about these models; and “building causal models has enormous informational requirements.” Reiss has a point, of course, when suggesting that no one is interested in causal models that are true but useless, or that causal modeling in economics requires an enormous amount of information that might not be available. But economists hardly ever use causal models when aiming at purposes of explanation, statistical inference (parameter estimation or specification testing), or prediction (remember from Section 5.6 of Chapter 5 that they tend to use statistical models when aiming at purposes of statistical inference or prediction). The primary purpose of causal modeling in economics is control in the sense of policy analysis. And whenever economic causal models are used for purposes of policy analysis, it is indeed an additional fact that allows them to be used for such purposes: invariance under interventions. Reiss (2012, p. 374), however, claims that once we have discovered that additional fact, “it is irrelevant whether the relation at hand is causal or not.” This claim may be true in cases where the purpose at hand is explanation, statistical inference, or prediction. But I don’t think it is true in cases where the purpose at hand is policy analysis. Whenever we would like to correctly predict the consequences

of a policy manipulation, we need a causal model that adequately represents the relation of direct type-level causation that obtains between the manipulated variable and the target variable, and we need to assume that the equation representing that relation remains invariant to the intervention on the manipulated variable. It is not the case that we can dispense with the causal model once we have discovered that our invariance assumption is true. If we dispense with the causal model, we will no longer be committed to policy analysis: we no longer hold that the change in the target variable is to be attributed to the change in the manipulated variable.

6.5 Scientific Realism in Macroeconomics: Given as a Problem

In a paper on scientific realism in economics, Mäki (1996, p. 431) suggests that there is “an interesting difference between economics and physics regarding their respective focus on the various aspects of the issue of realism. In physics, existence and reference constitute a major issue, whereas in the context of economics, existence and reference would appear to be relatively unproblematic and […] the emphasis seems to be on truth.” I’m not sure if Mäki’s suggestion is accurate. In physics, not only is the existence of entities like particles, quarks, and black holes a major issue; so too is the truth of the hypotheses that physicists hold with respect to these entities. Physicists generally acknowledge that fundamental physics is incomplete: that the standard model of particle physics, for instance, cannot explain a whole range of phenomena (gravity, dark matter, neutrino masses, and so on). And the argument from pessimistic meta-induction (that I’m going to discuss below) is not only directed against the idea that entities like particles and so on exist but also against the idea that the hypotheses that physicists hold with respect to these entities can be true. I do believe, however, that Mäki is right when claiming that in economics, more broadly understood, existence and reference are relatively unproblematic, while truth and representation are problematic. And I want to emphasize in the present section that Mäki’s claim is to be taken literally: that in contemporary macroeconomics, causal models cannot be shown to represent relations of direct type-level causation, and that they cannot be shown to fail to represent such relations. I want to argue, that is, that in contemporary macroeconomics, truth-to-economy is given to us as a problem: that macroeconomists will have to make progress in the empirical program of microfoundations (cf. Section 3.6 of Chapter 3) in order to convince themselves of the (approximate) truth of their causal models. In discussions of general scientific realism, existence is often characterized as mind-independent. But Mäki (1996, p. 433) is right when pointing out that in economics, existence has to be characterized as independent of any particular mind. And the same is true for macroeconomics. The existence of aggregate quantities and the causal relations between them is not independent of any human mind: the existence of aggregate quantities (inflation expectations, the general price level etc.) depends on the minds or mental states (beliefs, intentions, fears etc.) of many people, and the existence of relations of direct type-level causation depends on

132 Objectivity the reality of interventions that may also be carried out by human beings (with minds). But what one might say is that the existence of aggregate quantities and the causal relations between them is independent of any particular mind. One might also follow Hoover (2001, p. 23) and say that these quantities and relations exist “independently of any (individual) human mind.” In a more recent paper on scientific realism, Mäki (2011, p. 7) characterizes the existence of aggregate quantities and the causal relations between them as independent of science. And this characterization is perhaps particularly felicitous because it determines the individual or particular mind, of which the existence of aggregate quantities and causal relations has to be independent: the mind of the scientist. Mäki (1996, pp. 433–9) is also right when arguing that in economics, more broadly understood, the existence of entities like firms, households, preferences, goods, prices, and so on is relatively unproblematic.2 If I understand him correctly, he advances an initial-baptism theory of reference for economic terms: entities like firms, preferences etc. are known from general folk views and folk economics, where they are termed ‘firms,’ ‘preferences,’ and so on; in scientific economics, these terms are supplemented with technical vocabulary (‘indifference curves,’ ‘elasticity’ etc.), and their meanings are modified (‘firm,’ for instance, becomes synonymous with ‘representative profit-maximizing firm’); but even when used in scientific economics, these terms refer to the entities known from general folk views and folk economics, where their existence is relatively unproblematic. The entities known from general folk views and folk economics are aggregated to quantities, the existence of which inherits the unproblematic character of the existence of the entities known from general folk views and folk economics. The existence of the aggregate quantities is surely unproblematic when they result from applications of unique aggregating procedures, as in the case of “summing up unemployed people to yield the rate of unemployment” (Mäki 1996, p. 435). One might think that the existence of aggregate quantities is problematic, once unique aggregating procedures are unavailable, as in the case of the general price level and all aggregate quantities that (like inflation, real GDP, and so on) are defined in terms of the general price level (cf. Section 3.2 of Chapter 3). But the truth is that in the case of the unavailability of unique aggregating procedures, there is an indefinite number of aggregate quantities. There is a general price level calculated from prices and quantities as a Laspeyres index, another general price level calculated from prices and quantities as a Paasche index, and an indefinite number of additional general price levels. Reiss (2012, p. 366) is making a similar point with respect to inflation when saying that “[o]ne of the main bones of contention in the CPI controversy was whether the U.S. Consumer Price Index should be modeled as a so-called ‘cost-ofliving index’ which assumes that all consumers make their purchasing decisions in such a way as to maximize utility given their budget constraints […]. Whether or not inflation exists is not an issue in this debate.” The question is, of course, which of the general price levels should be used to calculate inflation, and which of the definitions of inflation should be used to measure the variable denoting inflation in

Scientific Realism in Macroeconomics  133 a causal model. But this is a question of convention or politics (cf. Section 3.2 of Chapter 3) and does not change the fact that the general price level and inflation (and all other aggregate quantities defined in terms of the general price level) exist and that their existence is relatively unproblematic because they are composed of entities that are known from general folk views and folk economics. Mäki (1996, p. 439) is finally right when pointing out that in economics, more broadly conceived, “[f]or the most part, the issue is that of the truth […] of the many representations in scientific economics.” In macroeconomic causal modeling, “the issue” is clearly that of the (approximate) truth of causal models. Of course, the issue is not that causal models cannot be true or false, literally speaking. Causal models express claims of direct type-level causation, and these claims can be true or false. I therefore take the liberty to refer to the causal models themselves as true or false. The issue of the truth of macroeconomic causal model is the problem of the absence of any touchstone of the truth or falsity of these models. The considerations in part I have shown that neither empirical methods of causal inference nor the identification of causal model parameters in microeconomic theory can serve as a touchstone of that truth or falsity. The problem is that we simply cannot know whether a macroeconomic causal model is (approximately) true or false. Some anti-realists go further and claim that existence (or reference) and truth (or representation) are not problematic but impossible. They advance essentially two arguments in support of that claim. The first argument, the argument from pessimistic meta-induction points to the multiplicity of theories and models that have turned out to be false in the history of science and concludes that the same fate is likely to befall present and future theories and models. The second argument, the argument from skepticism about truth, points to the high degree of idealization that usually characterizes scientific theories and models and suggests that the high degree of idealization of these theories and models is irreconcilable with their truth or their terms or variables referring. Both arguments appear to have applications in macroeconomics. I will therefore discuss them in turn. In its original version, the argument from pessimistic meta-induction is implicit in an argument that Laudan (1981) advances against the ‘no-miracles argument.’ The no-miracles argument is an abductive inference of two empirical hypotheses from the apparent success of science. The argument is called ‘no-miracles argument’ because it suggests that the apparent success of science would be a miracle if the following two empirical hypotheses weren’t true: “(1) Terms in a mature science typically refer. (2) The laws of a theory belonging to a mature science are typically approximately true” (Putnam 1975–6, p. 179). Laudan (1981, p. 23) characterizes the apparent success of science more closely when saying that “a theory is successful if it makes substantially correct predictions, if it leads to efficacious interventions in the natural order, if it passes a battery of standard tests.” Laudan also brings up a number of arguments that are directed against (1) and (2) and the way of abductively inferring them. 
But perhaps his most influential argument is the one that boils down to the list of theories that he thinks were successful, part of a mature science and yet non-referring and false (cf. Laudan 1981, p. 33). In its original version, the argument from pessimistic meta-induction is just an extrapolation

134 Objectivity of that list: an extrapolation that is taken to warrant the conclusion that theories that presently or in the future are or will be held to be successful and part of a mature science are or will be non-referring and false. Mäki (1996, p. 441) is right when pointing out that “economics does not count as a mature science in the same way and degree as physics.” And the same is, of course, true of macroeconomics. But if we concede that macroeconomics is not a mature science, then the macroeconomic variant of the argument from pessimistic meta-induction will be even stronger. That variant says that so far every causal model that had been successfully used to inform macroeconomic policy failed to be successful at some point (in the sense that the consequences of the policy measures recommended by that model failed to materialize at some point); that a macroeconomic causal model turns out to be non-referring and false if it fails to be successful; and that therefore, every causal model that is successfully used to inform macroeconomic policy in the present or future will likewise turn out to be non-referring and false. On the one hand, that variant appears to have a straightforward application. It seems that the causal models at the center of Keynesian, new classical, and new Keynesian macroeconomics had all been successfully used to inform macroeconomic policy for a while, and it seems that all these models failed to be successful at some point. It seems that the model at the center of Keynesian macroeconomics (the IS-LM model) had been successfully used to inform the type of fiscal and monetary policy that appears to be needed to fight major recessions like the Great Depression, and that the same model failed to be successful when inflation and unemployment increased simultaneously in the early 1970s (when it seemed that an expansionary monetary policy increases inflation but not output and employment). It also seems that the model at the center of new classical macroeconomics had been successfully used to inform the type of policy that appears to be needed to combat stagflation (situations in which inflation and unemployment increase simultaneously), i.e., a policy of monetary restraint promoting technological innovation and that the same model failed to be successful when inflation continuously increased from business cycle to business cycle throughout the 1970s despite the absence of any noteworthy monetary policy measure. And it seems that the model at the center of new Keynesian macroeconomics had been successfully used to inform the type of policy that appears to be needed to bring inflation under control and that the same model failed to be successful when the crisis of 2008–2009 resulted in unusually high rates of decline in GDP across many countries. There are, on the other hand, several problems with the macroeconomic application of the argument from pessimistic meta-induction. One problem is that the ultimate failure of the IS-LM model and the models at the center of new classical and new Keynesian macroeconomics doesn’t imply that they are non-referring and false. Perhaps they can be viewed as being applicable only locally and temporarily, i.e., as being applicable only to those spatio-temporal periods in which the type of problem occurs that the type of policy that they inform is meant to solve: output and employment declines in the case of the IS-LM model, stagflation in the case of the model at the center of neoclassical macroeconomics, and high levels of

Scientific Realism in Macroeconomics  135 inflation in the case of the model at the center of new Keynesian macroeconomics. The model at the center of new Keynesian macroeconomics might also be able to inform the type of policy that appears to be needed to solve the type of problem that occurred in the wake of the crisis of 2008–2009: an expansionary fiscal policy and a policy of quantitative easing, i.e., large-scale purchase of toxic assets by the Federal Reserve (cf. Section 1.1 of Chapter 1). Another problem is that the very success of the use of the IS-LM model or the models at the center of new classical and new Keynesian macroeconomics for purposes of policy analysis is questionable. The use of these models for purposes of policy analysis cannot be said to be successful unless there is evidence that can be provided in support of causal chains that connect policy measures and desired outcomes. I argued in Chapter 4, however, that in contemporary macroeconomics, evidence of such chains is unavailable. It is, strictly speaking, unknown whether the IS-LM model or the models at the center of neoclassical and new Keynesian macroeconomics have been successfully used for purposes of policy analysis. Let’s assume in accordance with the macroeconomic application of the pessimistic meta-induction that these models have been successfully used for purposes of policy analysis and that they have turned out to be non-referring and false. Then a third problem relates to the inductive nature of that application. We cannot rule out that certain macroeconomic models superseding the IS-LM model and the models at the center of new classical and new Keynesian macroeconomics will turn out to be genuinely referring and (approximately) true. The hope that the superseding models will turn out to be genuinely referring and (approximately) true is in fact the primary motivation of many macroeconomists who aim to provide policy analysis with more secure foundations. A macroeconomic application of the argument from skepticism about truth points to the high degree of idealization that characterizes causal models in macroeconomics. Reiss (2012, p. 376), for instance, claims that the argument that false models may be approximately or sufficiently true “is to a large extent confused.” He concedes that “there are cases in which a false assumption can indeed be regarded as ‘approximately true,’ namely when the assumption concerns the value of a quantitative causal factor” (Reiss 2012, p. 376). But he also claims that “economics’ idealizations are seldom of this kind”: that, “[t]ypically, economic models ascribe properties to actors and institutions that these don’t have and explain outcomes by way of causal processes that don’t exist.” And in support of that claim, he refers to the examples that Friedman (1953/1994, pp. 191–3) discusses in his classic paper on economic methodology. I think that Reiss is right when claiming that models can be approximately true if false assumptions concern the values of quantitative causal factors, and that models cannot be (approximately) true if assumptions ascribe properties to actors and institutions that these don’t have. Reiss’s claim is consistent with a distinction that is sometimes made between approximation and idealization. Teller (2009, p. 
239), for instance, makes that distinction when saying that one “speaks of approximation for a use of a mathematical expression or quantity when there has been substitution of a simpler expression that is close enough to the correct expression not to

136 Objectivity spoil the intended application,” and that “an idealization involves some radical misdescription.” Norton (2012, pp. 207–8) makes the same distinction when saying that “approximations merely describe a target system inexactly,” while idealizations “refer to a new system” and “carry a novel semantic import not carried by approximations.” I think that Reiss is also right when suggesting that economic models typically idealize. Macroeconomic DSGE models are cases in point. They are the workhorse of policy analysis in mainstream macroeconomics and rely on idealizing theory: general equilibrium theory plus the representative agent assumption. The representative agent assumption idealizes to the extent that it refers to purely fictional entities: representative agents and the quantities that they work, hold, produce, pay, or purchase (cf. Section 3.3 of Chapter 3). It is not so clear, however, whether general equilibrium theory idealizes. Reiss follows Friedman when maintaining that the assumption of profit-maximizing firms is an idealization. But that assumption can be regarded as a special case of the assumption of utility maximization, and is that assumption an idealization and, consequently, false? It appears false to say that agents first assign numerical probabilities and utilities to outcomes and then maximize expected utility. Note, however, that the assumption that agents maximize expected utility is not supposed to mean that they maximize expected utility consciously. One might point out that the assumption that agents maximize expected utility is logically equivalent with the assumption that the set of preferences of agents is consistent with the axioms of completeness, transitivity, independence, and continuity.3 And one might argue that the assumption that the set of preferences of agents is consistent with these axioms is false. But is that assumption false? One might justify a positive answer by appealing to the various experiments that have been conducted to the effect that there appear to be important empirical counter-instances to transitivity (May 1954) or independence (Allais 1953). But do these experiments rule out that these axioms are satisfied at least some and perhaps even most of the time? I won’t be able to decide this question at this point, but I do want to endorse Reiss’s claim that economic models can be (approximately) true if they approximate, that they cannot be (approximately) true if they idealize, and that they typically idealize. What are the implications of this claim for causal modeling in macroeconomics? One implication is that the DSGE models of the macroeconomic mainstream cannot be (approximately) true. Their falsity doesn’t pose any challenges if instrumentalism is a sensible position that mainstream macroeconomists could adopt. But I argued in the previous section that in macroeconomic policy analysis, instrumentalism is not a sensible position to adopt. A second implication is that the models of agent-based macroeconomists and of macroeconomists conducting causal inference qualify as potential candidates for approximate truth. The bits and pieces of theory or theoretical background knowledge that these macroeconomists use to specify their models do not necessarily idealize or render the models false. A third implication is that Hoover’s empirical procedure (cf. Section 4.4 of Chapter 4) cannot be regarded as employing the strategy of idealization: the strategy of identifying and isolating “the real essences”

or “causally effective capacities of economic reality” (Hoover 2001, pp. 126–7). His procedure must, by contrast, be regarded as employing the strategy of approximation: the strategy of setting parameters that measure the strength of the causal influence of one variable on another to zero whenever that influence is deemed to be nonzero but negligible, without invoking idealizing theory. But Hoover might be able to live with that implication, as it puts his procedure ahead of the idealizing procedure used in macroeconomic DSGE modeling. Theorists have tried to formally define or informally explicate the notion of approximate truth. Popper (1972, pp. 231–6), for instance, compares the true and false consequences of scientific theories to formally define relative orderings of “verisimilitude.” But Miller (1974) points to a technical problem with Popper’s definition: in order for one theory to have greater verisimilitude than another, the first theory must be true simpliciter, and this means that Popper cannot explain how theories that are strictly false can differ with respect to approximate truth. Post (1971), on the other hand, advances an informal explication when suggesting that a theory is more approximately true than an earlier one if the earlier theory can be understood as a limiting case of the later one. But the pessimistic meta-induction analyzed above seems to show that a theory that includes an earlier one as a limiting case cannot be closer to the truth than the earlier one. Teller (2009, pp. 236–7) offers another informal (in fact pragmatic) explication when identifying approximate truth with sufficient truth, and when tying sufficient truth to present needs and interests. He illustrates his explication by the example of the statement that the circumference of the earth is 40,000 km (while it is in fact 40,075.16 km). Whether or not we accept that statement as sufficiently true depends on our present needs and interests: we accept it as sufficiently true if our present needs and interests include the interest to please someone in an informal and everyday conversation; we do not accept it as sufficiently true if our present needs and interests include the interest to grade a geography term paper. Teller’s pragmatic explication of approximate truth is the one that is relevant to macroeconomic policy analysis. In macroeconomic policy analysis, the sufficient truth of a causal model is tied to present needs and interests, and these needs and interests relate to secure knowledge, on which to base policy. I argued in part I of this book that secure knowledge, on which to base policy, is currently unavailable in macroeconomics: that variables standing for (aggregates of) expectations operate as potential confounders to relations of direct type-level causation between variables denoting non-expectational aggregates, that variables standing for (aggregates of) expectations cannot be measured, and that consequently macroeconomists don’t know whether these variables can be controlled. I also argued, however, that secure knowledge might become available if macroeconomists turn to a program of empirical microfoundations: if they study the formation of expectations, and if they use the results of empirical economic disciplines like econometrics and behavioral or experimental economics to model the chains of relations of causal dependence and supervenience that policymaking institutions can exploit.
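To make the confounding problem just rehearsed concrete, consider a toy simulation (a minimal, purely illustrative sketch: the variable names, coefficients, and data-generating process are hypothetical and carry no empirical content). It merely shows that if an unmeasured expectations aggregate drives both a policy variable and a target aggregate, a regression of the target on the policy variable alone misstates the direct effect, and that the distortion disappears only if the expectations aggregate can be measured and controlled for.

```python
# Purely illustrative sketch; all names, coefficients, and the data-generating
# process are hypothetical and are not taken from any macroeconomic data set.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

E = rng.normal(size=n)                      # unmeasured (aggregate) expectations
X = 0.8 * E + rng.normal(size=n)            # policy variable, partly driven by E
Y = 0.5 * X + 1.0 * E + rng.normal(size=n)  # true direct effect of X on Y is 0.5

ones = np.ones(n)
# Regression of Y on X alone (expectations cannot be measured or controlled for):
b_naive = np.linalg.lstsq(np.column_stack([ones, X]), Y, rcond=None)[0][1]
# Regression that also conditions on E (possible only if expectations were measurable):
b_controlled = np.linalg.lstsq(np.column_stack([ones, X, E]), Y, rcond=None)[0][1]

print(f"estimated effect of X on Y without controlling for E: {b_naive:.2f}")       # roughly 0.99
print(f"estimated effect of X on Y when E is controlled for:  {b_controlled:.2f}")  # roughly 0.50
```

The sketch is, of course, only a statistical cartoon of the in-principle problem: the point of part I is precisely that the analogue of the third regressor is unavailable in macroeconomics, so that the size and even the direction of the distortion remain unknown.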
I noted in Section 3.6 of Chapter 3 that the program of empirical microfoundations largely coincides with the program of microfoundations that has

138 Objectivity been underway in agent-based macroeconomics for a while. There is accordingly a sense in which one may say that the models of agent-based macroeconomists are (or will be) more approximately true than the models of macroeconomists engaging in causal inference. In conclusion, one may say that in macroeconomic causal modeling, scientific realism is problematic: that we currently don’t know whether the causal models that macroeconomists use to justify policy decisions genuinely refer, or whether they are (approximately) true. One may also say, however, that scientific realism about these models is not impossible: that the conclusions of (the macroeconomic applications of) the arguments from the pessimistic meta-induction and skepticism about (approximate) truth do not necessarily hold. In his more recent paper on scientific realism in economics, Mäki (2011, pp. 5–6) suggests that economists should endorse a type of realism that he calls “minimal realism.” He says that “minimal realism does not require concluding that an entity Y exists. It is enough that Y might (or might not) exist. […] [M]inimal realism does not require concluding that theory T is true about Y. It is enough that T might be true (or might be false).”4 Reiss (2012, p. 364) has a point when arguing that it would be hard not to be a realist if realism were understood as minimal realism. But if instrumentalism is not an option, then minimal realism will be the only option left. A more fullblown version of truth-to-economy is given as a problem: in macroeconomics, we currently don’t know whether causal models genuinely refer, or whether they are (approximately) true, but there is hope that at one point, we will be able to attain that knowledge. Notes 1 This way of speaking has important consequences that I will analyze in the final section of this chapter. One consequence is that theoretical models cannot be “almost” (or approximately) true unless theory doesn’t refer to fictional entities like representative agents. Another consequence is that Hoover’s empirical procedure (cf. Section 4.5 of Chapter 4) cannot be understood as employing the strategy of idealization. 2 In a similar vein, Hausman (1998, p. 201) argues that aggregate quantities “should not […] give rise to empiricist qualms similar to those to which quarks and neutrinos gave rise.” 3 Von Neumann and Morgenstern (1947) prove that any agent with a set of preferences obeying the axioms of completeness, transitivity, independence and continuity acts in accordance with the principle of maximizing expected utility. 4 For a similar passage, cf. Mäki (2005, p. 238).

References

Allais, M. (1953). “Le Comportement de l’homme rationnel devant le risque: Critique des postulats et axiomes de l’école américaine.” Econometrica 21, 503–46.
Burns, A. F. and Mitchell, W. C. (1946). Measuring Business Cycles. New York: National Bureau of Economic Research.
Daston, L. and Galison, P. (2010). Objectivity. New York: Zone Books.
De Vroey, M. (2004). “The History of Macroeconomics Viewed Against the Background of the Marshall-Walras Divide.” History of Political Economy 36(5), 57–91.

Friedman, M. (1953/1994). “The Methodology of Positive Economics.” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP.
Haavelmo, T. (1944/1995). “The Probability Approach in Econometrics.” In Hendry, D. F. and Morgan, M. S. (eds.), The Foundations of Econometric Analysis. Cambridge: CUP, 477–90.
Hausman, D. (1998). “Problems with Realism in Economics.” Economics and Philosophy 14(2), 185–213.
Hendry, D. F. and Morgan, M. S. (1995). “Introduction.” In The Foundations of Econometric Analysis. Cambridge: CUP, 1–82.
Hoover, K. D. (2001). Causality in Macroeconomics. Cambridge: CUP.
Koopmans, T. C. (1947/1995). “‘Measurement without Theory’ Debate.” In Hendry, D. F. and Morgan, M. S. (eds.), The Foundations of Econometric Analysis. Cambridge: CUP, 491–502.
Laudan, L. (1981). “A Confutation of Convergent Realism.” Philosophy of Science 48(1), 19–49.
Mäki, U. (1996). “Scientific Realism and Some Peculiarities of Economics.” In Cohen, R. S., Hilpinen, R. and Qiu, R. Z. (eds.), Scientific Realism and Anti-Realism in the Philosophy of Science. Boston Studies in the Philosophy of Science, Vol. 160. Dordrecht: Kluwer, 425–45.
Mäki, U. (2005). “Reglobalising Realism by Going Local, or (How) Should Our Formulations of Scientific Realism be Informed about the Sciences.” Erkenntnis 63, 231–51.
Mäki, U. (2011). “Scientific Realism as a Challenge to Economics and Vice Versa.” Journal of Economic Methodology 18(1), 1–12.
May, K. O. (1954). “Intransitivity, Utility, and the Aggregation of Preference Patterns.” Econometrica 22, 1–13.
Menger, C. (1884). Die Irrthümer des Historismus in der deutschen Nationalökonomie. Wien: Hölder.
Mill, J. S. (1836/1994). “On the Definition of Political Economy and the Method of Investigation Proper to It.” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP.
Miller, D. (1974). “Popper’s Qualitative Theory of Verisimilitude.” British Journal for the Philosophy of Science 25(2), 166–77.
Norton, J. D. (2012). “Approximation and Idealization: Why the Difference Matters.” Philosophy of Science 79(2), 207–32.
Popper, K. R. (1972). Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
Post, H. R. (1971). “Correspondence, Invariance and Heuristics: In Praise of Conservative Induction.” Studies in History and Philosophy of Science 2, 213–55.
Putnam, H. (1975–6). “What is ‘Realism’?” Proceedings of the Aristotelian Society 76, 177–94.
Reiss, J. (2012). “Idealization and the Aims of Economics: Three Cheers for Instrumentalism.” Economics and Philosophy 28(3), 363–83.
Schmoller, G. v. (1883/1998). “Zur Methodologie der Staats- und Sozialwissenschaften.” In Nau, H. H. (ed.), Gustav Schmoller: Historisch-ethische Nationalökonomie als Kulturwissenschaft. Marburg: Metropolis, 159–83.
Schmoller, G. v. (1911/1998). “Volkswirtschaft, Volkswirtschaftslehre und -methode.” In Nau, H. H. (ed.), Gustav Schmoller: Historisch-ethische Nationalökonomie als Kulturwissenschaft. Marburg: Metropolis, 215–368.
Sims, C. A. (1980). “Macroeconomics and Reality.” Econometrica 48(1), 1–48.

Sims, C. A. (1986). “Are Forecasting Models Usable for Policy Analysis?” Federal Reserve Bank of Minneapolis Quarterly Review 10(1), 2–15.
Spanos, A. (1989). “On Rereading Haavelmo: A Retrospective View of Econometric Modeling.” Econometric Theory 5(3), 405–29.
Teller, P. (2009). “Fictions, Fictionalization, and Truth in Science.” In Suárez, M. (ed.), Fictions in Science. London: Routledge, 235–47.
Vining, R. (1949/1995). “‘Measurement without Theory’ Debate.” In Hendry, D. F. and Morgan, M. S. (eds.), The Foundations of Econometric Analysis. Cambridge: CUP, 503–13.
Von Neumann, J. and Morgenstern, O. (1947). Theory of Games and Economic Behavior. Princeton: PUP.

7 The Role of Non-Scientific Values in Macroeconomics

7.1 Introduction

There is the idea that the problematic character of scientific objectivity in the sense of scientific realism shouldn’t strike terror into our hearts. What distracts us from the true causal model is subjectivity in the sense of an amalgamation of non-scientific values; and don’t we dispose of the effective means of keeping subjectivity of that kind in check? Non-scientific values are values like ideologies, moral judgments, or material interests. They are traditionally thought to contrast with scientific values like simplicity or conservatism (consistency with other or previous scientific models and theories). And the effective means of keeping non-scientific values in check are the different causal inference methods that macroeconomists can use to provide empirical evidence in support of or against specific causal models. The idea that what distracts us from the true causal model is subjectivity, and that macroeconomists can keep subjectivity in check by using causal inference methods, is a macroeconomic variant of the type of scientific objectivity that Daston and Galison refer to as “structural objectivity.” The negative result of part I of this book suggests that I’m going to reject that idea. The empirical evidence that can be provided by use of causal inference methods in macroeconomics is too inconclusive to support specific causal hypotheses (or the macroeconomic models expressing them); so how could these methods be used to keep non-scientific values in check? But the idea of structural objectivity (or value-independence) is an old and venerable one, and the claim that structural objectivity is impossible in macroeconomic causal modeling requires careful argument. The argument that the present chapter will develop derives the impossibility of structural objectivity in macroeconomic causal modeling from four premises. Premise 1 merely restates the conclusion of Chapter 4: it says that causal hypotheses (hypotheses about relations of direct type-level causation between macroeconomic aggregates) are non-sporadically underdetermined by empirical evidence. Premise 2 states that scientific values will influence selections of hypotheses from pools of competing causal hypotheses if causal hypotheses are underdetermined by empirical evidence. Premise 3 says that the scientific values (e.g. simplicity or conservatism), on the basis of which


macroeconomists select hypotheses from pools of competing and empirically underdetermined causal hypotheses, have no epistemic priority over the opposing ones (e.g. non-simplicity or novelty). Premise 4 states that macroeconomists prefer one scientific value over another on the basis of non-scientific values. The argument to be developed in this chapter is to some extent a macroeconomic application of a more general argument developed by Helen Longino: of the argument from empirical underdetermination. There is a second popular argument that is often cited in support of the impossibility of value-independence: the argument from inductive risk (Douglas 2000, Rudner 1953). But the argument from inductive risk is not particularly convincing: there is an ambiguity in its premises, and it misrepresents the procedure by which scientists should conduct statistical hypothesis tests (Henschen 2021). Thus, applying this argument to the case of macroeconomics (or indeed, any scientific discipline) is not particularly helpful.1 The chapter will proceed in four steps. Section 7.2 will take a brief look at Daston and Galison’s concept of structural objectivity and the history of that concept in economics. It will characterize structural objectivity as objectivity of a causal model that is selected independently of any non-scientific values. And it will argue that this characterization is well in line with a widespread conception of economic causal modeling as a two-stage procedure: as a procedure that is model specification at its first stage and specification testing at its second. According to that conception, causal modeling is open to influence by non-scientific values at its first stage. That influence isn’t held to pose any major challenges, however, because at its second stage, causal modeling is thought to succeed in filtering out causal model misspecifications. Section 7.3 will defend premise 1 of the argument for the impossibility of structural objectivity in contemporary macroeconomic causal modeling. Section 7.3 will also analyze the extent to which this argument does and does not qualify as a macroeconomic application of Longino’s more general argument. Section 7.4 will defend premises 2 and 3 and speculate about the scientific values, on the basis of which mainstream (new classical and new Keynesian) macroeconomists select hypotheses from pools of competing and empirically underdetermined causal hypotheses. It will argue that they select these hypotheses on the basis of specific combinations of scientific values: on the basis of a combination of simplicity and a specific type of conservatism (consistency with Walrasian general-equilibrium theory) in the case of new classical macroeconomics, and on the basis of the combination of non-simplicity and another type of conservatism (consistency with Marshallian general-equilibrium theory) in the case of new Keynesian macroeconomics. Section 7.5 will defend premise 4 and speculate about the non-scientific values, on the basis of which mainstream macroeconomists prefer one combination of scientific values over the other. These non-scientific values are the values that Karl Marx (1857/1994, 1890/1990, 1894/1981) was the first to investigate in theoretical detail, and that theorists like Joseph Schumpeter (1949) and Daniel Hausman and Michael McPherson (1994) have analyzed subsequently: ideologies, value (or moral) judgments, and group interests.

7.2 Structural Objectivity and Causal Modeling in Macroeconomics

After their characterization of truth-to-nature, Daston and Galison (2010, pp. 42–3) continue their historical account of the development of the concept of scientific objectivity by a description of what they call ‘mechanical objectivity’: “In the middle decades of the nineteenth century, at different rates and to different degrees in various disciplines, new, self-consciously ‘objective’ ways of making images were adopted by scientific atlas makers. These new methods aimed at automatism: to produce images ‘untouched by human hands’, neither the artist’s nor the scientist’s. Sometimes but not always, photography was the preferred medium for these ‘objective images’. Tracing and strict measuring controls could also be enlisted to the cause of mechanical objectivity, just as photographs could conversely be used to portray types. What was key was neither the medium nor mimesis but the possibility of minimizing intervention, in hopes of achieving an image untainted by subjectivity. The truth-to-nature practices of selecting, perfecting, and idealizing were rejected as the unbridled indulgence of the subjective fancies of the atlas maker.” According to Daston and Galison (2010, p. 43), mechanical objectivity was soon “to undermine the primary aim of all scientific atlases, to provide the working objects of a discipline.” Mechanical objectivity was soon to undermine that aim because the new methods of producing images ‘untouched by human hands’ were soon applied to the new, self-consciously ‘objective’ ways of making images themselves, and because applications of the new methods showed that the new, self-consciously ‘objective’ ways of making images were in fact very different: “mid-nineteenth century research in history, anthropology, philology, psychology, and, above all, sensory physiology […] underscored how differently individuals reasoned, described, believed, and even perceived” (Daston and Galison 2010, p. 256). As a result, mechanical objectivity was soon abandoned in favor of a still intensified type of objectivity, i.e., a type of objectivity that Daston and Galison (2010, p. 45) refer to as “structural objectivity”: Structural objectivity waged war on images in science. […] Confronted with results showing considerable variability in all manner of sensory phenomena, some scientists took refuge in structures. These were, they claimed, the permanent core of science, invariant across history and cultures. Just what these structures were – differential equations, the laws of arithmetic, logical relationships – was a matter of some debate. But there was unanimity among thinkers […] that objectivity must be about what was communicable everywhere and always among all human beings – indeed, all rational beings, Martians and monsters included.

144 Objectivity Like mechanical objectivity, structural objectivity is a type of objectivity in the strict sense of the term: it is characterized by “the suppression of some aspect of the self, the countering of subjectivity.” Daston and Galison (2010, p. 257) argue, however, that [m]echanical and structural objectivity […] countered different aspects of subjectivity. Mechanical objectivity restrained a scientific self all too prone to impose its own expectations, hypotheses and categories on data – to ventriloquize nature. This was a projective self that overleaped its own boundaries, crossing the line between observer and observed. The metaphors of mechanical objectivity were therefore of manful self-restraint, the will reined in by the will. The metaphors of structural objectivity were rather of a fortress self, locked away from nature and other minds alike. Structural objectivity addressed a claustral, private self menaced by solipsism. The recommended countermeasures emphasized renunciation rather than restraint: giving up one’s own sensations and ideas in favor of formal structures accessible to all thinking beings. Daston and Galison refer to the different aspects of subjectivity (the projective self and the private self) to justify their distinction of mechanical and structural objectivity. In macroeconomics, what I’d like to call ‘structural objectivity’ represents a specific intersection of what Daston and Galison call ‘mechanical’ and ‘structural’ objectivity. On the side of objectivity, structural objectivity in macroeconomics coincides with structural objectivity in Daston and Galison’s use of the term: what is supposed to be objective is a causal model, i.e., a model of relations of direct type-level causation; and these relations are nothing to be observed, photographed, or measured according to the terms of mechanical objectivity, but rather something that is supposed to be communicable everywhere and always among all human beings, the permanent core of macroeconomics (that is invariant across history and cultures). On the side of subjectivity, by contrast, structural objectivity in macroeconomics rather coincides with what Daston and Galison call ‘mechanical’ objectivity: the self that it counters is the projective self that is too prone to impose its own expectations, hypotheses, and categories on data, and not so much the self that has sensations and ideas. I take it that in this sense, structural objectivity is continuous with a widespread conception of economic causal modeling as a two-stage procedure: as a procedure that is model specification at its first stage and specification testing at its second. Schumpeter (1949, pp. 228–9) describes that procedure as an instance of scientific procedure in general when saying that “scientific procedure […] starts from the perception of a set of related phenomena which we wish to analyze and ends up – for the time being – with a scientific model in which these phenomena are conceptualized and the relations between them explicitly formulated, either as assumptions or as propositions (theorems).” On the surface, that description reads as if it related only to one step, i.e., that of model specification. But Schumpeter (1949,

The Role of Non-Scientific Values in Macroeconomics  145 p. 229) makes it clear that he intends “to give the term ‘model’ a very wide meaning. The explicit economic model of our own day and its analoga in other sciences are of course the product of late stages and scientific endeavor”: the products of “factual’ and ‘theoretical’ research that go on in an endless chain of give and take, the facts suggesting new analytical instruments (theories) and these in turn carrying us toward the recognition of new facts.” What Schumpeter means by ‘model’ is a scientific model that relies on well-confirmed theory and has undergone a sufficient number of empirical tests. Therefore, the scientific procedure that he describes is a two-stage procedure of original vision or perception of related phenomena (model specification) and empirical (specification) testing. With respect to that two-stage procedure, Schumpeter (1949, p. 230) maintains that while the first stage is ideology by nature the second stage is scientific treatment, and that only the second stage is capable of objective control: [T]he existence of the ideological bias in ourselves and others, we can trace […] to a simple source. This source is in the initial vision of the phenomena we propose to subject to scientific treatment. For this treatment itself is under objective control in the sense that it is always possible to establish whether a given statement, in reference to a given state of knowledge, is provable, refutable, or neither. […] [I]t does permit the exclusion of that particular kind of delusion which we call ideology because the test involved is indifferent to any ideology. The original vision, on the other hand, is under no such control. […] [T]he original vision is ideology by nature. I’m going to analyze Schumpeter’s conception of ideology more closely in Section 7.5. What is of immediate interest at this point is that Schumpeter’s conception of scientific procedure is continuous with the type of structural objectivity that I said above is relevant to macroeconomics. If ideology is an aspect of the projective self that is too prone to impose its own expectations, hypotheses, and categories on data, then the second stage of the scientific procedure that Schumpeter describes (scientific treatment or empirical specification testing) serves the purpose of suppressing that aspect of the self. Friedman (1953, pp. 186–7) maintains a similar conception of scientific procedure when distinguishing “two different […] stages: […] constructing hypotheses and […] testing their validity. […] Given that the hypothesis is consistent with the evidence at hand, its further testing involves deducing from it new facts capable of being observed but not previously known and checking these deduced facts against additional empirical evidence.” Friedman (1953, p. 187) concedes that the “two stages of constructing hypotheses and testing their validity are related in two different respects. […] The facts that serve as a test of the implications of a hypothesis might equally well have been among the raw material used to construct it, and conversely.” But like Schumpeter he holds that the “construction of hypotheses is a creative act of inspiration, intuition, invention” (Friedman 1953, p. 208). And his claim that empirical evidence can be “direct, dramatic, and convincing” (cf. Section 4.7 of Chapter 4) suggests that he also believes that the stage of hypothesis

testing is capable of filtering out the ideologies (or, more generally, the non-scientific values) that are present at the stage of hypothesis construction. If the second stage of hypothesis or specification testing is capable of filtering out the non-scientific values that operate at the first stage of hypothesis construction or model specification, then scientists can be said to be able to select hypotheses or models independently of the non-scientific values that they happen to endorse. What allows them to select hypotheses or models independently of non-scientific values are, of course, empirical procedures of “testing their validity” or “factual” or “theoretical” research. Factual or theoretical research will eventually (perhaps after “an endless chain of give and take”) be able to select the true hypothesis or model from a pool of competing hypotheses or models. For the case of macroeconomic causal modeling, this means that non-scientific values operate at the first stage of causal model specification and that the second stage of specification testing is capable of filtering out these values. This second stage of specification testing is “factual” or “theoretical” research: empirical procedures of causal inference, or the identification of model parameters in microeconomic theory. Factual or theoretical research will eventually be able to select the true model from a pool of competing causal models. Thus, the true model is objective in the sense of being selected independently of the non-scientific values that macroeconomists happen to endorse.

Perhaps factual or theoretical research will eventually be able to select the true model from a pool of competing causal models. But the results of Part I have shown that it is currently unable to select the true causal model: that microeconomic theory conflicts with the ontology of macroeconomic aggregates (cf. Section 3.2 of Chapter 3), and that empirical procedures of causal inference are incapable of providing conclusive evidence in support of relations of direct type-level causation (cf. Chapter 4). This means conversely that non-scientific values will influence the selection of causal models whenever one model is selected from a pool of competing causal models. Or so I will argue in the remaining three sections.

7.3 Longino on Values and Empirical Underdetermination

In a series of books and papers, Longino endorses (among others) the claims that

a “[t]he full content of a theory outreaches those elements of it (the observational elements) that can be shown to be true (or in agreement with actual observations)” (Longino 1996, p. 39);
b scientific values “are quite frequently invoked as factors closing the gap between evidence and hypotheses revealed by underdetermination arguments” (Longino 1997, p. 23);
c particular values that many philosophers of science believe to be scientific have no epistemic priority over certain other values that oppose them;
d scientists prefer values from either group on the basis of subjective preferences.

The Role of Non-Scientific Values in Macroeconomics  147 The first claim is a statement of the well-known thesis of the empirical underdetermination of scientific theories or hypotheses. The most famous statement of that thesis is probably found in Willard V. O. Quine (1975, p. 313). Longino’s statement is different from Quine’s in an important respect: what Quine has in mind is the empirical underdetermination of universal scientific theories or “systems of the world”; Longino, by contrast, refers to the empirical underdetermination of an arbitrary number of scientific theories and hypotheses. Like Quine, however, Longino (2008, p. 70) thinks of the empirical underdetermination of theories or hypotheses as a semantic gap: “The underdetermination with which I am concerned is produced by a semantic gap between most hypotheses and the observational data adduced in evidence for them.” John Norton (2008, p. 18) states the conditions under which there will be such a gap: “No body of data or evidence, no matter how extensive, can determine the content of a scientific theory […]. But there is universal agreement on the content of mature scientific theories. Therefore, there is a gap: at least a portion of the agreement cannot be explained by the import of evidence.” There will be a semantic gap between the content of mature scientific theories or hypotheses and their empirical content if there is (near) universal agreement on the content of these theories or hypotheses, and if that content is underdetermined by their empirical content. Norton (2008, p. 20) also points out that interesting versions of the underdetermination thesis say that underdetermination is persistent and non-sporadic. Underdetermination is persistent if it persists “no matter how long and ingeniously evidence collection may proceed”; and it is non-sporadic if “it asserts that all theories are beset with this problem.” Longino nowhere explicitly says that underdetermination is persistent and non-sporadic. But she expresses a similar view when making fairly general claims such as the following: “Empirical adequacy […] is not a sufficient criterion for theory choice because of the philosophical problem known as the underdetermination of theory by data” (Longino 2008, p. 69). Longino (2008, sections 2 and 3) backs up her second claim by quoting philosophers who invoke scientific values “as factors closing the gap between evidence and hypotheses revealed by underdetermination arguments,” and by listing additional values that one might think could close the gap. She quotes Thomas Kuhn and Quine who have invoked scientific values like consistency (with other scientific theories), conservatism (or consistency with previous scientific theories), and fruitfulness; and she lists additional scientific values like simplicity and explanatory power. Longino (2008) endorses claim (c) when arguing that the listed values have no epistemic priority over certain other values that exactly oppose them. None of the values from the opposing groups is especially capable of selecting the true hypothesis from a pool of competing and empirically underdetermined hypotheses. Values from either group can be used to close the semantic gap between empirical content and theoretical or hypothetical content. But values from neither group are able to discriminate between true and false or more and less probable hypotheses (cf. Longino 2008, p. 72).

148 Objectivity The values from the first group are consistency, conservatism, simplicity, and fruitfulness; the values from the second group are novelty, ontological heterogeneity, and applicability to current human needs (according to Longino, the values from the second group operate especially in feminist scientific research). While consistency and conservatism favor the hypothesis consistent with the highest number of theoretical and observational sentences in the web of belief, novelty favors the hypothesis that is consistent with the lowest number of theoretical and observational sentences. While simplicity favors the hypothesis that stipulates the most parsimonious ontology (where ontologies are understood as characterizing what is to count as ontologically real or primitive), ontological heterogeneity favors the hypothesis that stipulates the least parsimonious ontology. Finally, fruitfulness favors the hypothesis that generates the highest number of normal science problems that can be solved when the hypothesis is accepted. As such, it distracts from the possibilities of the social or technological application of scientific research (cf. Longino 1996, p. 54). Longino shows for the values from the first group that they are incapable of selecting the true (or most probable) hypothesis from a pool of competing and empirically underdetermined hypotheses. She argues that consistency and conservatism cannot select the true hypothesis unless the theoretical and observation sentences with which the hypothesis is consistent are true (cf. Longino 2008, pp. 72–3). How are we supposed to find out about the truth of the theoretical sentences? These sentences are likely to be empirically underdetermined if most theories and hypotheses are empirically underdetermined (as Longino maintains). If they are empirically underdetermined, then consistency, conservatism, or some other value will be needed to close the gap between their theoretical and empirical content. If consistency or conservatism is used, we will enter a regress that is potentially infinite. If any other value is used, this value will need to be capable of selecting the true theoretical sentences. Longino (2008, p. 73) argues that simplicity cannot select the true hypothesis because we “have no a priori reason to think the universe simple, that is, composed of very few kinds of thing (as few as the kinds of elementary particles, for example) rather than of many different kinds of thing. Or – as Kant teaches us – we can give a priori arguments for both theses, nullifying the probative significance of each. There is no empirical evidence for such a view, nor could there be.” Longino (2008, p. 76) finally argues that fruitfulness cannot select the true hypothesis because it is oriented toward utility and not toward truth. Longino shows for the values from the first group that they are incapable of selecting the true hypothesis from a pool of competing and empirically underdetermined hypotheses. But if the values from the first group are incapable of selecting that hypothesis, then the values from the second group will also be incapable. In the case of novelty, we will need to find out about the truth of theoretical sentences too. According to Longino, ontological heterogeneity represents the antithesis of a Kantian antinomy in which simplicity figures as thesis. And like fruitfulness, applicability to current human needs is geared toward utility, and not toward truth.

One might think that explanatory power could do the job: that explanatory power can close the gap between evidence and theory because strong explanatory power has epistemic priority over weak explanatory power. But following Nancy Cartwright, Longino (2008, p. 74) argues that there is a tradeoff between explanatory power and truth: that “explanatory strength is purchased at the cost of truth,” and that “the greater the explanatory power […] of a theory, that is, the greater the variety of phenomena brought under its explanatory umbrella, the less likely it is to be […] true.” Thus, explanatory power won’t be able to do the job either.

It is true that Longino nowhere explicitly endorses claim (d): the claim that scientists prefer values from either group on the basis of subjective preferences. But she concludes from claims (a)–(c) that “critical interaction among scientists of different points of view [is] required to mitigate the influence of subjective preferences on […] theory choice” (Longino 1996, p. 40).2 She accordingly seems to believe that subjective preferences influence preferences for values from the first or second group and that preferences for values from the first or second group lead to selections of hypotheses from pools of competing and empirically underdetermined hypotheses.

Compare claims (a)–(d) to the four premises of the argument for the impossibility of structural objectivity in macroeconomic causal modeling (cf. Section 7.1):

1 Causal hypotheses (hypotheses about relations of direct type-level causation between macroeconomic aggregates) are non-sporadically underdetermined by empirical evidence.
2 Scientific values will influence selections of hypotheses from pools of competing causal hypotheses if causal hypotheses are underdetermined by empirical evidence.
3 The scientific values, on the basis of which macroeconomists select hypotheses from pools of competing and empirically underdetermined causal hypotheses, have no epistemic priority over opposing ones.
4 Macroeconomists prefer one scientific value over another on the basis of non-scientific values.

Do premises (1)–(4) represent mere macroeconomic applications of claims (a)–(d)? Superficially, that seems to be the case. Claim (a) relates to theories (or scientific hypotheses) in general and premise (1) to instances of such hypotheses (causal hypotheses in macroeconomics); and what is true of the general case needs to hold of its instances. Claims (b) and (c) refer to consistency, conservatism, simplicity, and fruitfulness and to the respective opposing values, premises (2) and (3) to subgroups of these values: to simplicity and consistency with Walrasian general-equilibrium theory (call it “conservatism 1”) in the case of new classical macroeconomics, and to non-simplicity and consistency with Marshallian general-equilibrium theory (call it “conservatism 2”) in the case of new Keynesian macroeconomics. And what is true of groups of values needs to hold of subgroups of these values. The term “subjective preferences,” as used in (d), can be taken to be synonymous with “non-scientific values,” as used in (4).

On closer inspection, however, premises (1)–(4) turn out to be more than macroeconomic applications of claims (a)–(d). It is true that “subjective preferences” can be taken to be largely synonymous with “non-scientific values.” But the opposition between conservatism 1 and 2 and the opposition between simplicity and non-simplicity are quite different from the respective oppositions between conservatism and novelty and between simplicity and ontological heterogeneity (cf. Section 7.4). The most obvious difference relates to claim (a) and premise (1): unlike premise (1), claim (a) faces a problem that Norton (2008, p. 17) describes when saying that “the underdetermination thesis is little more than speculation based on an impoverished account of induction.” The account of induction that Norton is referring to is an “impoverished version of hypothetico-deductive confirmation.” There are, according to Norton, no less than three alternative accounts of induction: accounts he refers to as “inductive generalization,” “hypothetical induction,” and “probabilistic induction.” He argues that all three accounts are superior to the hypothetico-deductive account, and that at least one reason for their superiority lies in the fact that unlike the hypothetico-deductive account, they do not rule out that the observational consequences of empirically equivalent theories or hypotheses supply differing evidential support.

Premise (1) doesn’t face the same problem because the inductive method that needs to be applied in order to provide empirical evidence in support of causal hypotheses (in macroeconomics or elsewhere) is a subtype of the method that Norton (2008, p. 31) calls “hypothetical induction,” and that he takes to be a respectable and improved version of “the impoverished account of hypothetico-deductive confirmation.” For evidence E to confirm hypothesis H, the impoverished account only requires that H entail E, while the improved version requires, in addition, that it be shown that E would be unlikely to obtain if H were false, or that the confirmed hypothesis be produced by a method known to be reliable. The subtype in question is the instrumental variable (IV) method analyzed in Chapter 4. The IV method requires that the hypothesis that X directly type-level causes Y entails the evidence E that X and Y are correlated. It requires, in addition, that there be no confounders of I and X or of X and Y, where I is an instrumental or intervention variable that type-level causes X. But while the additional requirement can be shown to obtain in scientific disciplines like pharmacology or labor economics, it cannot be shown to hold in macroeconomics. It cannot be shown to hold in macroeconomics because in macroeconomics, there are confounders of which we cannot know whether they can be controlled for, because they cannot be measured: expectational aggregates (cf. Section 3.5 of Chapter 3). Thus, while the IV method may be capable of providing conclusive evidence in support of relations of direct type-level causation between X and Y in disciplines like pharmacology or labor economics, it is unable to provide conclusive evidence in support of such relations in macroeconomics. Causal hypotheses are, in other words, empirically underdetermined in macroeconomics.

I argued in Chapter 4 that in macroeconomics, causal evidence is inconclusive in principle, and I took the in-principle inconclusiveness of causal evidence to be an implication of the Lucas critique. With Norton, one may also say that in

macroeconomics, the inconclusiveness of causal evidence is non-sporadic. I did not say, however, that the inconclusiveness of causal evidence in macroeconomics is persistent. I allowed explicitly for the possibility of there being progress in the empirical microfoundations of macroeconomics that will eventually lead to successful measurement and control of expectational aggregates. The negative result of Chapter 4 may accordingly be restated as premise (1): as the claim that in macroeconomics, causal hypotheses are non-sporadically underdetermined by empirical evidence.
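The requirement that the IV method imposes can be made concrete in the standard linear setting. What follows is only an illustrative sketch in textbook econometric notation; the symbols are mine and are not meant to reproduce the formalism of Chapter 4. Suppose the causal hypothesis is expressed by the structural equation

$$Y = \beta X + U,$$

where $U$ collects the omitted direct causes of $Y$ (in macroeconomics, these include expectational aggregates). A variable $I$ can serve as an instrumental or intervention variable for $X$ only if

$$\operatorname{Cov}(I, X) \neq 0 \quad \text{and} \quad \operatorname{Cov}(I, U) = 0,$$

in which case the causal parameter is identified as

$$\beta = \frac{\operatorname{Cov}(I, Y)}{\operatorname{Cov}(I, X)}.$$

The first condition can be checked against the data; the second cannot. In pharmacology or labor economics, randomization or institutional knowledge can make it credible that nothing type-level causes both I and X or both X and Y. In macroeconomics, unmeasured expectational aggregates contained in $U$ may well respond to $I$ (or to whatever causes $I$), in which case the second condition fails and the ratio above no longer measures the causal influence of $X$ on $Y$.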

7.4 Simply Walras or Non-Simply Marshall?

As just seen, Longino lists a series of values that philosophers and scientists believe to be scientific and to be able to close the gap between empirical evidence and theory or hypothesis: conservatism (or consistency with other or previous scientific theories), fruitfulness, simplicity, and explanatory power. Friedman (1953/21994, p. 185) – to name a prominent economist – favors simplicity and fruitfulness: The choice among alternative hypotheses equally consistent with the available evidence must to some extent be arbitrary, though […] relevant considerations are suggested by the criteria ‘simplicity’ and ‘fruitfulness’. Can any of these values close the gap between empirical evidence and hypothesis in macroeconomic causal modeling, i.e., select the true hypothesis from a pool of competing and empirically underdetermined causal hypotheses in macroeconomics? The present section is going to answer that question by arguing that only three of these four values operate in macroeconomic causal modeling and that none of these values is capable of selecting the true causal hypothesis. The three values that operate in macroeconomic causal modeling are conservatism, simplicity, and fruitfulness. While conservatism comes in two opposing types (call them “conservatism 1” and “conservatism 2”), simplicity is opposed by a further value, viz. non-simplicity. When guiding causal hypothesis selection, conservatism 1 typically combines with simplicity and conservatism 2 with non-simplicity. These combinations usually lead to selections of competing hypotheses that are equally fruitful. Conservatism 1 and 2, simplicity and non-simplicity may accordingly be said to play a more dominant role in macroeconomic causal modeling than fruitfulness. In the context of macroeconomic causal modeling, explanatory power needs to be understood as the depth with which causal hypotheses are explanatory. But explanatory power (or depth) cannot play any role in macroeconomic causal modeling because the depth with which causal hypotheses are explanatory is a function of the range of the interventions that change an intervention variable I (Hitchcock and Woodward 2003, pp. 184, 198): the greater that range of interventions, the greater the explanatory power of the hypothesis that X directly type-level causes Y. We won’t be able to measure that range of interventions unless we know that I is an intervention variable: that I type-level causes X, and that there are no confounders that type-level cause both I and X or both X and Y. And the result of Chapter 4 is, of course, that we cannot find out whether I is an intervention variable.
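Hitchcock and Woodward’s idea can be rendered schematically as follows (the rendering is mine, not their own formalism): a causal generalization $Y = f(X)$ is explanatorily deeper, the larger the range

$$\mathcal{R} = \{\, x : Y = f(x) \text{ would continue to hold under an intervention that sets } X = x \,\}.$$

Estimating the size of $\mathcal{R}$ already presupposes that the variable $I$ used to carry out these interventions satisfies the conditions on an intervention variable. That is why, given the negative result of Chapter 4, appeals to explanatory depth cannot get off the ground in macroeconomics.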

152 Objectivity Conservatism, by contrast, does play a role in macroeconomic causal modeling. In macroeconomic causal modeling, however, conservatism comes in at least two different types: while conservatism 1 favors the causal hypothesis that is most consistent with Walrasian general-equilibrium theory, conservatism 2 favors the causal hypothesis that is most in line with Marshallian general-equilibrium theory. In general equilibrium theory, a price vector and an allocation of consumed and produced goods constitute a general equilibrium if firms maximize profits, households maximize utility and all markets clear. That equilibrium is of the Walrasian (or competitive) kind if households and firms act as price takers (i.e., if they are small relative to the size of the market) and markets are perfectly competitive. It is, by contrast, of the non-Walrasian (or Marshallian) kind if firms act as price setters and markets are not perfectly competitive (De Vroey 2004). If an allocation of consumed and produced goods combines with a price vector to constitute a Walrasian (or competitive) equilibrium, then the first welfare theorem holds: then the allocation of consumed and produced goods is Paretoefficient (or -optimal), where an allocation counts as Pareto-efficient if there is no alternative allocation that makes some consumer better off without making another consumer worse off. If, by contrast, an allocation of consumed and produced goods combines with a price vector to constitute a non-Walrasian (or Marshallian) equilibrium, then the first welfare theorem from welfare economics does not hold: then the allocation of consumed and produced goods is not necessarily Pareto-efficient. If the hypothesis that X directly type-level causes Y, where Y stands for economic changes (in GDP or inflation), is consistent with Walrasian general-equilibrium theory, then the relation of direct type-level causation that it refers to represents Pareto-optima: then economic changes are driven exclusively by technology shocks, and then policy interventions carried out to mitigate these changes could only reduce welfare. If, by contrast, the same hypothesis is consistent with Marshallian general-equilibrium theory, then the relation of direct type-level causation that it refers to does not represent Pareto-optima: then economic changes reflect market failures, and then policy interventions carried out to mitigate these changes are capable of increasing welfare.3 There are arguably further types of conservatism operating in macroeconomic causal modeling: types of conservatism that favor causal hypotheses that are not necessarily consistent with general-equilibrium theory. These types include types of conservatism favoring causal hypotheses that are most in line with Keynesianism (roughly, the theory that aggregate demand determines economic changes, and psychological motives or propensities determine aggregate demand) or Austrian economics (roughly, the theory that markets determine output changes, and that markets self-organize even though individuals act, judge, and evaluate in noncalculable ways). But conservatism 1 and 2 clearly dominate the mainstream of current macroeconomic research. The present chapter will therefore concentrate on conservatism 1 and 2. In macroeconomic causal modeling, simplicity is just as relevant as conservatism. In the context of macroeconomic causal modeling, simplicity relates to the number of variables included in the model equation that expresses a given causal

hypothesis.4 These variables are of essentially two kinds (cf. Section 2.1 of Chapter 2). Causal structure variables are variables that directly type-level cause other causal structure variables, are directly type-level caused by other causal structure variables, or both. Background variables (or ‘shocks’) encompass the influence of variables that represent direct type-level causes of causal structure variables but have been omitted from the model. Since parameters represent (like causal structure and background variables) sets of potential values that are measurable or quantifiable, they may also be understood as variables. Simplicity favors the causal hypothesis that is expressed by the causal model equation with the smallest number of variables. In macroeconomic causal modeling, however, the counterpart of simplicity, viz. non-simplicity, likewise plays a role: a role that becomes manifest in a tendency of new Keynesian policy analysis toward ever-greater non-simplicity. Frank Smets and Raf Wouters (2003) develop a new Keynesian model investigating the causal relations between seven variables and ten shocks; Lawrence Christiano, Martin Eichenbaum and Charles Evans (2005) a new Keynesian model analyzing the causal relations between nine variables and ten shocks; and Lawrence Christiano, Roberto Motto, and Massimo Rostagno (2010) a new Keynesian model assessing the causal relations between 16 variables and 16 shocks.

That new Keynesian policy analysis tends toward ever-greater non-simplicity is no coincidence. New Keynesians typically endorse conservatism 2: they prefer causal hypotheses that are most in line with (indeed derive from) Marshallian general-equilibrium theory. And conservatism 2 usually combines with non-simplicity when guiding causal hypothesis selection in macroeconomic policy analysis. I think that the following statement by Jesús Fernández-Villaverde (2010, p. 5) nicely illustrates how that combination typically operates: Most macroeconomists, myself included, have always had a soft spot for nominal or real rigidities. A cynic will claim it is just because they are most convenient. […] At least since David Hume, economists have believed that they have identified a monetary transmission mechanism from increases in money to short-run fluctuations caused by some form or another of price stickiness. It takes much courage, and more aplomb, to dismiss two and a half centuries of a tradition […] going through Marshall, Keynes, and Friedman. […] Moreover, […] it must be admitted that those who see money as an important factor in business cycle fluctuations have an impressive empirical case to rely on.

New Keynesian macroeconomics assumes that economic changes reflect market failures (or nominal or real rigidities, as Fernández-Villaverde calls them). Modeling market failures necessitates the introduction of a greater number of causal structure variables and shocks. Unless the introduction of these variables is justified by reference to microfoundations, the introduction of these variables appears merely “convenient.” In order to pre-empt charges of mere convenience, new Keynesians often claim that they “have an impressive empirical case to rely on.”
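To see in miniature what the contrast between simplicity and non-simplicity amounts to, consider a stylized new Keynesian Phillips curve (the example is mine and deliberately minimal; it is not one of the models cited above):

$$\pi_t = \beta\,\mathbb{E}_t[\pi_{t+1}] + \kappa\, y_t + u_t,$$

where inflation $\pi_t$ and the output gap $y_t$ are causal structure variables, $\mathbb{E}_t[\pi_{t+1}]$ is an expectational aggregate, $u_t$ is a background variable (a cost-push shock), and $\beta$ and $\kappa$ are parameters. Counting parameters as variables in the wide sense introduced above, this single equation already contains six variables. Each additional rigidity or friction that is modeled brings further causal structure variables, shocks, and parameters with it, which is the sense in which models of the Smets-Wouters type exemplify non-simplicity.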

154 Objectivity That they have an impressive case to rely on, however, is questionable in light of premise (1) or the negative result of Chapter 4. One should also note that when pressed, new Keynesians often admit that the empirical case they rely on is not as impressive as it seems. Consider, for instance, the case of David Romer’s textbook on advanced macroeconomics. When describing the merits of new Keynesian macroeconomics, Romer (2012, p. 195) states that “there is strong evidence” that monetary disturbances have real effects. But later, after critically discussing Friedman and Schwartz’s natural experiment, Romer (2012, p. 224) concludes that “the evidence provided by natural experiments may be the best we can obtain.” New Keynesians are, of course, not the only macroeconomists who oscillate between claims of the strength of the empirical evidence for causal hypotheses and more sincere admissions of their actual weakness. Friedman himself, for instance, oscillates in this way when claiming, on the one hand, that there is “strong evidence for the economic independence of monetary changes from the contemporary course of income and prices,” and when admitting, on the other, that the “choice among alternative hypotheses equally consistent with the available evidence must to some extent be arbitrary.” New Keynesians then resort to some sort of tradition: in the case of FernándezVillaverde, to two and a half centuries of a tradition from Hume to Friedman. And they sometimes caution their colleagues that it takes “much courage, and more aplomb, to dismiss” that tradition. What new Keynesians often suppress is that there is an alternative tradition from, say, David Ricardo through Léon Walras to Robert Lucas.5 This is the tradition that new classical macroeconomics identify with, i.e., macroeconomists who endorse conservatism 1: who prefer causal hypotheses that are most in line with (indeed derive from) Walrasian general-equilibrium theory. In macroeconomic policy analysis, conservatism 1 usually combines with simplicity when guiding causal hypothesis selection. The following statement by V. V. Chari, Patrick Kehoe, and Ellen McGrattan (2008, p. 2) highlights how that combination typically operates: The tradition favored by many neoclassicals (like us) is to keep a macro model simple, keep the number of its parameters small and well-motivated by micro facts, and put up with the reality that no model can, or should, fit most aspects of the data. Recognize, instead, that a small macro model consistent with the micro data can still be useful in clarifying how to think about policy. It is worth mentioning that Chari, Kehoe, and McGrattan (2008) explicitly criticize that four of the shocks included in the Smets-Wouters model are not well motivated by micro facts. Finally, fruitfulness is likely to play a role in macroeconomic policy analysis as well. In a footnote, Kuhn (1977, p. 322) argues that fruitfulness “deserves more emphasis than it has yet received. A scientist choosing between two theories ordinarily knows that his decision will have a bearing on his subsequent research career. Of course, he is especially attracted by a theory that promises the concrete

successes for which scientists are ordinarily rewarded.” One may accordingly say that in macroeconomics, fruitfulness is just as relevant as in any other scientific discipline. But while there might be disciplines in which fruitfulness is able to discriminate between competing theories or hypotheses, macroeconomics is unlikely to be among them. The tasks that new classical and new Keynesian macroeconomics address themselves to sometimes look different: while new classicals pay attention to sound microfoundations and apply calibration techniques to estimate model parameters, new Keynesians are oriented toward the needs of central banks and other policymaking institutions and use Bayesian techniques to estimate model parameters. But new classical macroeconomics hardly promises more concrete successes than new Keynesian macroeconomics, and vice versa. It is therefore fair to say that in macroeconomic policy analysis, fruitfulness plays a less dominant role than conservatism 1 and 2, simplicity and non-simplicity.

Recall from Section 7.3 that conservatism, simplicity, and fruitfulness are incapable of selecting the true hypothesis from a pool of competing and empirically equivalent hypotheses. The same result obtains, of course, when conservatism, simplicity, and fruitfulness are considered in the context of macroeconomic causal modeling. Conservatism 1 and 2 cannot select the true causal hypothesis because it is impossible to tell whether general-equilibrium theory is true, or (for that matter) whether Walrasian or Marshallian general-equilibrium theory is true.6 Simplicity cannot select the true causal hypothesis because even in specific cases, it is impossible to decide whether the simple or non-simple hypothesis is true. Finally, fruitfulness cannot select the true causal hypothesis because it is oriented toward utility, and not toward truth (in macroeconomics as in any other scientific discipline).7

7.5 Ideologies, Value Judgments, and Group Interests

The preceding two sections have argued that in macroeconomics, empirical evidence underdetermines causal hypotheses non-sporadically, and that combinations of conservatism 1 and simplicity or of conservatism 2 and non-simplicity select hypotheses from pools of competing and empirically equivalent causal hypotheses. What remains to be seen is why mainstream macroeconomists prefer one combination over the other: why some of them prefer to derive new classical models from the assumptions of Walrasian general-equilibrium theory, while others prefer to derive new Keynesian models from the assumptions of Marshallian general-equilibrium theory. The explanation that the present and final section is going to offer makes reference to non-scientific values. It is going to argue, more specifically, that the ideologies, value judgments, or group interests that mainstream macroeconomists happen to endorse are likely to determine the combination that they prefer.

The modal qualifier ‘likely’ will be required for essentially two reasons. The first reason is that the operation of non-scientific values explains why macroeconomists prefer one combination to the other, and that there might be alternative explanations of that preference. There is the problem that hardly any macroeconomist would happily

156 Objectivity admit that she prefers one combination to the other because of non-scientific values. But the (tacit and unconscious) operation of non-scientific values explains that preference very well. And I cannot even think of any other (let alone, better) explanation of that preference.8 The second reason why the modal qualifier will be required is that my method of identifying ideologies, value judgments, and group interests as the non-scientific values that determine preferences for one combination or the other is not (and probably cannot be) solidly grounded. I am not going to look at particular cases to identify these values. I will instead proceed in two steps. I will first retrieve these values from what Karl Marx and Joseph Schumpeter have to say about the influence of non-scientific values in economics. I will secondly suggest how these values could possibly operate in contemporary macroeconomic policy analysis. The conclusion of the present section therefore won’t relieve critical researchers of the challenging task of identifying the non-scientific values that operate in particular cases of macroeconomic policy analysis. I am going to look at what Marx has to say about the influence of non-scientific values in economics because he is arguably the economist-philosopher who analyzes the influence of non-scientific values in economics for the first time and in a theoretical manner. The economists whose work he believes is vitiated by the influence of non-scientific values are the classical economists of the eighteenth and nineteenth century, i.e., the “bourgeois” economists, in Marxian parlance. And the non-scientific values that Marx believes vitiate the work of the bourgeois economists are of essentially three types. Marx (1857/21994, p. 122) refers to non-scientific values of the first type when accusing the bourgeois economists of presenting “production […] as encased in eternal natural laws independent of history” and of quietly smuggling in “bourgeois relations […] as the inviolable natural laws on which society in the abstract is founded.” The bourgeois relations that Marx has in mind obtain between capitalists (or owners of the means of production) and laborers (individuals who merely own their labor-power). The Marxian theory of these relations is one of exploitation: a theory of capitalist surplus that results from a relative or absolute prolongation of the working day of the laborer. But what Marx criticizes about the theory of the bourgeois economists is that private ownership of the means of production is presented as an “eternal” or “inviolable natural law.” According to Marx (1857/21994, p. 121), “production […] is always production at a definite stage of social development.” Production at the capitalist stage of social development has merely replaced production at earlier stages and will be replaced with production at later stages. And Marx believes that at these later stages, ownership of the means of production is no longer concentrated in the hands of the members of a particular social class. Marx refers to non-scientific values of the second type when criticizing the theory of the bourgeois economists for characterizing the laborer as free in an unqualified sense. Marx (1867/1990, p. 280) says that according to that theory “both buyer and seller of […] labour-power are determined only by their own free will. They contract as free persons, who are equal before the law. 
Their contract is the final result in which their joint will finds a common legal expression.” Marx (1867/1990,

The Role of Non-Scientific Values in Macroeconomics  157 p. 183) objects to that theory that the freedom of the laborer is ambiguous: the laborer is not only free in the sense that “he can dispose of his labor-power as his own commodity” but also in the sense that “he has no other commodity for sale, i.e., he is rid of them, he is free of all the objects needed for the realization of his labour-power.” Therefore, the laborer is not free to sell his labor-power at all. He needs to sell it in order to reproduce and to provide for his family. And “the realm of freedom actually begins only where labour which is determined by necessity and mundane considerations ceases; this in the very nature of things it lies beyond the sphere of actual material production. […] The shortening of the working-day is its basic prerequisite” (cf. Marx 1894/1999, p. 593). Marx (1867/1990, p. 333) refers to non-scientific values of the third type when accusing the British economist Nassau Senior of unilaterally serving the interests of the manufacturers in Manchester: “One fine morning, in the year 1836, Nassau W. Senior […] was summoned from Oxford to Manchester, to learn in the latter place, the Political Economy that he taught in the former. The manufacturers chose him as their prize-fighter, not only against the newly passed Factory Act, but against Ten-Hours’ Agitation which aimed to go beyond it.” Senior issued a statement in which he argues that prolonging the working day of teenagers (“persons under 18”) from 11 ½ to 13 hours would more than double the net profit of a cotton manufacture, while reducing it to 10 ½ hours would destroy its net profit and reducing it to 10 hours would destroy its gross profit. Marx ridicules that statement. He offers an analysis that is meant to show that reducing the working day of the teenagers to 10 ½ hours would not destroy the net profit of the cotton manufacturers. And by fictitiously addressing himself to the Manchester manufacturers he says that “this fateful ‘last hour’ about which you have invented more stories than millenarians about the day of judgment, is ‘all bosh.’ If it goes, it will cost neither you your ‘pure profit’, nor will it cost the boys and girls you employ, their ‘pure minds’” (cf. Marx 1867/1990, p. 336). What is of interest at this point is not whether Marx’s attacks on the work of the classical economists are indeed justified. What is of interest is rather whether the non-scientific values that Marx believes influence the work of the classical economists can be given names and definite descriptions. Schumpeter (1949, p. 230) refers to non-scientific values of the first type as “ideologies,” and he says that an ideology “may contain any amount of delusions traceable to a man’s social location, to the manner in which he wants to see himself or his class or group and the opponents of his own class or group.” Schumpeter fails to mention that it is Engels (cf. especially 1893/1968, p. 476), and not Marx, who uses the term ‘ideology’ in this sense. But Schumpeter adequately characterizes the type of non-scientific values that Marx thinks operates when production in the bourgeois sense is presented as natural law or independent of history. Non-scientific values of the second type can be referred to as “value judgments” and non-scientific values of the third type as “group interests. Value judgments may be understood as subjective interpretations: propositions that are neither analytically true nor founded on specific empirical observations. 
Group interests, by contrast, can be defined as the material benefits that members of a specific social

158 Objectivity group receive when a particular proposition is accepted or rejected. I now turn to the question of how ideologies, value judgments, and group interests could possibly operate in contemporary macroeconomic policy analysis. Schumpeter (1949, pp. 231–5) analyzes the influence of ideologies on the work of Smith, Marx and Keynes: he first looks at their social locations and then at the class connotations that may have formed or helped to form their “vision” or “perception of a set of related phenomena.” In a move that might itself be influenced by non-scientific values, Schumpeter affirms that ideology is brought under control only in the case of Smith but not in the cases of Marx and Keynes. But what is of interest at this point is not Schumpeter’s analysis of the particular cases of Smith, Marx and Keynes. What is of interest is the way in which his analysis proceeds. And that way of proceeding may inspire attempts to identify the ideologies that might play a role in macroeconomic policy analysis today. One may say, for instance, that there are roughly three social classes: (a) the class of civil servants, i.e., of social individuals who are neither wealthy nor penniless, keep up some standard of education, behold the economic process with a critical eye and instinctively think of policy measures that can be applied to manage that process; (b) the business class, i.e., class of social individuals who run businesses to make profits, consider their businesses to animate the economic process and instinctively think of ways of deregulating that process; (c) the working class, i.e., class of social individuals who tend to receive low incomes, believe the economic process to lead to unfair goods allocations at least sometimes and instinctively think of policy measures that can be applied to restore fairness. And perhaps one may say that macroeconomists who affiliate themselves with (a), (b), and (c) tend to hold monetarist or new Keynesian, new classical, and Keynesian beliefs, respectively. Daniel Hausman and Michael McPherson (21994, pp. 260–4) demonstrate that one of the assumptions of Walrasian general-equilibrium theory, i.e., that of perfect competition, relies on a value judgment: on the value-laden identification of wellbeing and preference satisfaction. They argue that the identification of well-being and preference satisfaction is derived in three steps: start with standard decision theory, add the standard assumption that individuals are exclusively self-interested (that they will never prefer x to y if they think that y is better for them than x), and add the assumption that individuals have perfect knowledge (that they prefer x to y only if x is in fact better for them than y). How well satisfied an individual’s preferences are is then the same as how well off that individual is. The identification of well-being and preference satisfaction, however, is philosophically controversial and false: it is philosophically controversial because well-being might also have to do with chances, virtues, or duties; and it is false because what individuals prefer is not always good for them (they make mistakes and sometimes prefer to sacrifice their well-being in pursuit of some other end). 
The assumption of perfect competition is derived in four steps: start with the identification of well-being and preference satisfaction; add the assumption that it is impossible to make interpersonal comparisons of the well-being of different individuals (i.e., rule out that it can be a morally good thing to improve one

The Role of Non-Scientific Values in Macroeconomics  159 individual’s well-being at the expense of another individual’s well-being); add the principle of “minimal benevolence” (the principle saying that it is a morally good thing to search for Pareto improvements, i.e., to render at least one individual better off without rendering anyone worse off); and add the first welfare theorem, i.e., the theorem saying that a goods allocation is Pareto-efficient (or -optimal) if it represents a perfectly competitive equilibrium. In a perfectly competitive equilibrium, that is, it is impossible to improve anyone’s well-being without rendering anyone worse off. This result is remarkable, especially in light of the fact that the assumption of perfect competition is widespread in new classical macroeconomics. Finally, Ha-Joon Chang (2010, pp. 51–61) argues that an inflation target of 2% or less “is mainly geared towards the interests of the holders of financial assets”: that below 8%–10%, inflation is not correlated with economic growth; that usually high real interest rates are needed to keep inflation below 2%, that real interest rates of 8% or more mean that potential investors would not find any non-financial investments attractive (as few such investments bring profit rates higher than 7%), and that in this case, financial assets are the only profitable investment so that prices of financial assets go up. An inflation target of 2% or less is a policy measure that especially new Keynesian models imply. One may accordingly say that Chang thinks that “the interests of the holders of financial assets” (at least partially) explain why a specific group of macroeconomists prefers the combination of conservatism 2 and non-simplicity. The aim of this section has not been to argue that ideologies, value judgments, or group interests determine preferences for the combination of conservatism 1 and simplicity or for the combination of conservatism 2 and non-simplicity as a matter of necessity. The aim has been to show that ideologies, value judgments, or group interests are likely to determine preferences for either of these combinations. If ideologies, value judgments, or group interests are likely to determine preferences for either of these combinations, and if these combinations guide the selection of causal hypotheses in macroeconomics, then the values that ultimately guide the selection of these hypotheses are likely to be ideologies, value judgments, or group interests. I am aware that this result is not going to meet with a lot of sympathy among mainstream macroeconomists. But it suggests itself if the central claims of this chapter are correct. Notes 1 Cf. Reiss (2017, Section 3.5), however, for the opposite view. 2 Longino (1996, p. 40; 2008, p. 80) argues that in order for critical interaction to be possible, four conditions need to be satisfied: the provision of venues for the articulation of criticism, uptake (rather than mere toleration) of criticism, public standards to which discursive interactions are referenced, and equality (or tempered equality) of intellectual authority for all members of the community. 3 In the second part of his paper on empirical underdetermination, Norton (2008, p. 35) argues for the interesting claim that “pairs of theories that can be demonstrated to be observationally equivalent are very strong candidates for being variant formulations of the same theory.” Walrasian and Marshallian general-equilibrium theory have,

160 Objectivity









of course, a common core: general-equilibrium theory. But they have very different implications for the existence of Pareto-optima and policy. I take these implications to indicate that they are very poor candidates for being variant formulations of the same theory. 4 Simplicity, that is, is not opposed to the ontological heterogeneity of economic agents, as Longino’s analysis of simplicity might suggest (cf. Section 2). The (macro-) economic value of simplicity is also embodied in the LSE methodology of “simplifying” a deliberately overfitting model (cf. Section 4.5 of Chapter 4). Perhaps Haavelmo (1944, p. 25) was the one to elevate simplicity to the status of a value when stating that “[n] ature may limit the number of factors that have non-negligible factual influence to a relatively small number.” 5 They also suppress that associating a whole variety of theorists with one and the same tradition might not be wholly justifiable. It is not so clear, for instance, whether Hume can be read as a proponent of nominal or real rigidities. A similar point can, of course, be made with respect to the alternative tradition. 6 General equilibrium theory (of the Walrasian or Marshallian kind) might, of course, also be false, as Austrian economists or proponents of agent-based computational economics like to claim (cf. Section 3.3 of Chapter 3). 7 For the sake of completeness, I should note that alternative values that are sometimes proposed to fill the gap between evidence and theory cannot select the true causal hypothesis either. Modesty (as proposed by Quine and Ullian 1978) cannot select the true causal hypothesis because a hypothesis that is more humdrum than others or implied by others without implying them is not necessarily true. Predictive precision (as proposed by Popper 1967, p. 97) cannot select the true causal hypothesis because a model that is predictively precise may be causally misspecified. 8 It is also worth noting that macroeconomists are often less hesitant to admit that nonscientific values determine the combination that colleagues prefer. In general, Joseph Schumpeter (1949, 228) might have a point when saying that there “are a few writers who have in fact denied that there is such a thing in economics as accumulation of a stock of ‘correctly’ observed facts and ‘true’ propositions. But equally small is the minority who would deny the influence of ideological bias entirely. The majority of economists […] are ready enough to admit its presence though […] they find it only in others and never in themselves.”

References

Chang, H.-J. (2010). 23 Things They Don’t Tell You About Capitalism. London: Penguin Books.
Chari, V. V., Kehoe, P. and McGrattan, E. R. (2008). “New Keynesian Models: Not Yet Useful for Policy Analysis.” Federal Reserve Bank of Minneapolis. Research Department Staff Report 409.
Christiano, L. J., Eichenbaum, M. and Evans, C. (2005). “Nominal Rigidities and the Dynamic Effects of a Shock to Monetary Policy.” Journal of Political Economy 113, 1–45.
Christiano, L. J., Motto, R. and Rostagno, M. (2010). “Financial Factors in Economic Fluctuations.” ECB Working Paper Series No. 1192 (May).
De Vroey, M. (2004). “The History of Macroeconomics Viewed against the Background of the Marshall-Walras Divide.” History of Political Economy 36, 57–91.
Douglas, H. (2000). “Inductive Risk and Values in Science.” Philosophy of Science 67, 559–579.
Engels, F. (1893/1968). “Brief an Mehring.” In Marx-Engels-Werke Bd. 39. Berlin: Dietz.
Fernández-Villaverde, J. (2010). “The Econometrics of DSGE Models.” NBER Working Paper No. 14677.
Friedman, M. (1953/21994). “The Methodology of Positive Economics.” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP, 180–213.
Haavelmo, T. (1944/1995). “The Probability Approach in Econometrics.” In Hendry, D. F. and Morgan, M. (eds.), The Foundations of Econometric Analysis. Cambridge: CUP, 477–90.
Hausman, D. M. and McPherson, M. S. (21994). “Economics, Rationality, and Ethics.” In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP, 252–77.
Henschen, T. (2021). “How Strong is the Argument from Inductive Risk?” European Journal for Philosophy of Science. https://doi.org/10.1007/s13194-021-00409-x.
Hitchcock, C. and Woodward, J. (2003). “Explanatory Generalizations, Part II: Explanatory Depth.” Nous 37(2), 181–99.
Kuhn, T. S. (1977). “Objectivity, Value Judgment, and Theory Choice.” In The Essential Tension. Chicago: University of Chicago Press.
Longino, H. E. (1996). “Cognitive and Non-Cognitive Values in Science.” In Nelson, L. H. and Nelson, J. (eds.), Feminism, Science, and the Philosophy of Science. London: Kluwer, 39–58.
Longino, H. E. (1997). “Feminist Epistemology as a Local Epistemology.” Proceedings of the Aristotelian Society 71, 19–35.
Longino, H. E. (2008). “Values, Heuristics, and the Politics of Knowledge.” In Carrier, M., Howard, D. and Kourany, J. (eds.), The Challenge of the Social and Pressure of Practice: Science and Values Revisited. Pittsburgh: University of Pittsburgh Press, 189–216.
Marx, K. (1857/21994). “Grundrisse: Foundations of the Critique of Political Economy” (Excerpts). In Hausman, D. M. (ed.), The Philosophy of Economics. An Anthology. Cambridge: CUP, 119–142.
Marx, K. (1894/1981). Capital. A Critique of Political Economy. Volume III. Transl. D. Fernbach. Intr. E. Mandel. London: Penguin Books.
Marx, K. (41890/1990). Capital. A Critique of Political Economy. Volume I. Transl. B. Fowkes. Intr. E. Mandel. London: Penguin Books.
Norton, J. D. (2008). “Must Evidence Underdetermine Theory?” In Carrier, M., Howard, D. and Kourany, J. (eds.), The Challenge of the Social and Pressure of Practice: Science and Values Revisited. Pittsburgh: University of Pittsburgh Press, 17–44.
Popper, K. R. (1967). “The Logic of the Social Sciences.” In Adorno, T. W. (ed.), The Positivist Dispute in German Sociology. New York: Harper and Row, 87–104.
Quine, W. V. O. (1975). “On Empirically Equivalent Systems of the World.” Erkenntnis 9, 313–28.
Quine, W. V. O. and Ullian, J. S. (1978). The Web of Belief. New York: Random House.
Reiss, J. (2017). “Fact-value Entanglement in Positive Economics.” Journal of Economic Methodology 24(2), 134–149.
Romer, D. (42012). Advanced Macroeconomics. New York: McGraw-Hill.
Rudner, R. (1953). “The Scientist qua Scientist Makes Value Statements.” Philosophy of Science 20, 1–6.
Schumpeter, J. A. (1949). “Science and Ideology.” American Economic Review 39, 345–59.
Smets, F. and Wouters, R. (2003). “An Estimated Stochastic Dynamic General Equilibrium Model of the Euro Area.” ECB Working Paper Series No. 171 (August).

8 Macroeconomic Expertise

8.1 Introduction

There appears to be one final way out for those who seek to defend the scientific objectivity of macroeconomic causal models. One might argue that causal modeling in macroeconomics is not just that two-stage procedure of model specification and specification testing, but that the specification of a causal model depends on macroeconomic expertise. Macroeconomic expertise is not only propositional knowledge in the shape of evidential propositions that can be derived from theory, natural experiments, or econometric tests. To a substantial degree, it is also non-propositional knowledge (or ‘know-how’), i.e., knowledge of how to specify a causal model in an intuitive response to the situation at hand (a model that may deviate from any prevalent models in certain respects). The argument in favor of this way out is an argument in favor of a conception of scientific objectivity as expertise, i.e., in favor of a conception according to which a macroeconomic causal model is objective if its specification relies on the intuitions of a macroeconomic expert.

The aim of the present chapter is to analyze that conception in four sections. Section 2 will take a brief look at Lorraine Daston and Peter Galison’s (2010) concept of trained judgment (or expertise) and the history of that concept in economics. Section 3 will refer to the work of Hubert Dreyfus to state more clearly how expertise might function in general and in macroeconomics in particular. The final Section 4 will point to the problematic character of scientific objectivity in the sense of macroeconomic expertise. Perhaps one of the causal models that macroeconomic experts specify adequately represents the relations of direct type-level causation that obtain in the situation at hand. But there is no guarantee that any of these models adequately represents these relations. The problematic character of scientific objectivity in the sense of macroeconomic expertise can be understood as a special case of the more general “problem with experts” that Stephen Turner (2001) claims is present in a democracy. If the empirical underdetermination of macroeconomic causal models is non-sporadic (if the evidence that macroeconomists can provide in support of relations of direct type-level causation is inconclusive in principle), then macroeconomic expertise will be our last hope: then macroeconomists won’t be able to arrive at causal model specifications that adequately represent the relations of type-level causation that

obtain in the situation at hand unless they are guided by expert intuition and non-propositional knowledge when specifying these models. But heeding the advice of people who cite intuition and non-propositional knowledge in support of causal model specification violates one of the most fundamental democratic principles: that of the equality of all citizens.

8.2 Trained Judgment and Expertise in Economics

After their characterization of mechanical and structural objectivity, Daston and Galison (2010, p. 46) continue their historical account of the development of the concept of scientific objectivity with a description of what they call 'trained judgment':

Around the turn of the twentieth century, many scientists […] proposed recourse to trained judgment […]. These self-confident experts were not the seasoned naturalists of the eighteenth century, those devotees of the cult of the genius of observation. It did not take extraordinary talents of attention and memory plus a lifetime's experience to discern patterns; ordinary endowments and a few years of training could make anyone an expert. […] Far from flexing the conscious will, the experts relied explicitly on unconscious intuition to guide them. In place of the paeans to hard work and self-sacrifice so characteristic of mechanical objectivity, practitioners of trained judgment professed themselves to be unable to distinguish between work and play – or, for that matter, between art and science.

What trained judgment has in common with the other types of scientific objectivity is that it aims to discern patterns (in the case of macroeconomic policy analysis: relations of direct type- or token-level causation between aggregate quantities). But trained judgment differs from truth-to-nature in that it doesn't require extraordinary talents, a large memory, or a lifetime's experience (it does require experience, of course, but not a "lifetime's" experience). And it differs from mechanical and structural objectivity in that it doesn't presuppose the suppression of subjectivity. Instead, trained judgment relies on subjectivity in the shape of intuitions, instincts, hypotheses, and hunches. It is, as Daston and Galison put it, as much an art as a science.

Daston and Galison (2010, pp. 312–3) observe that subjectivity in the shape of intuitions etc. becomes especially important in applied statistics:

At the juncture of hypothesis and data, that crossroads at which the nineteenth-century researchers had confronted the choice between objective virtue and subjective vice, a wide range of mid-twentieth-century successors counseled trained judgment and trained instincts. Hypotheses, like hunches, were universally acknowledged as essential guides to research and explanation. Yet mistakes of interpretation were accepted as inevitable. How to know when a hypothesis was not a beacon but a fata morgana?

164 Objectivity The same observation can be made in macroeconomic causal modeling. Econometricians who follow in the footsteps of Haavelmo and Koopmans believe that a theoretical model is needed to specify a statistical model. But there is typically a multiplicity of theoretical models that are statistically adequate, and how are researchers supposed to single out the one theoretical model that adequately represents the relations of direct type-level causation that obtain between macroeconomic aggregate? Some researchers use microeconomic theory to single out that model. Remember from Section 7.4 of the previous chapter, however, that mainstream macroeconomists are dealing with two conflicting microeconomic theories: Walrasian and Marshallian general equilibrium theory. And remember from Section 3.3 of Chapter 3 that general equilibrium theory as such (as the theory of homogenous agents who solve optimization problems in dynamic equilibrium) conflicts with the ontology of macroeconomic aggregates on a number of points. Other researchers use empirical procedures of causal inference to single out the theoretical model. I have pointed out repeatedly, however, that these procedures are incapable of providing conclusive evidence in support of specific models. Objectivity in the shape of intuitions or know-how might qualify as a more reliable guide to the selection of the theoretical model. But there is, of course, nothing that assures us that objectivity in the shape of expert intuitions is a more reliable guide: that the theoretical model that is selected on the basis of expert intuitions adequately represents relations of direct type-level causation that obtain between macroeconomic aggregates. Daston and Galison’s observation can be said to remain valid to the present day. Many econometricians share the view that objectivity in the shape of expert intuitions is needed to guide the selection of the theoretical model that is required to specify the statistical model. Statistician Roy Welsch (1986, p. 405), for instance, points out that “[e]ven with a vast arsenal of diagnostics, it is very hard to write down rules that can be used to guide a data analysis. So much is really subjective and subtle. […] A great deal of what we teach in applied statistics is not written down, let alone in a form suitable for formal encoding. It is just simply ‘lore’.” In economics more generally, expertise or trained judgment had been associated for a long time with the “art” of economic policy. Menger (1883, p. 7), for instance, refers to economic and financial policy as mere “doctrines of art” (Kunstlehren). But Schmoller (1883/1998: 167–8) was among the first to explicitly question the confinement of artful expertise to the policy domain. He also provides what is perhaps the most detailed account of the role of expertise in economics. Recall that Schmoller belongs to a generation of economists who couldn’t imagine that probability models can be applied to economic data. But what he has to say about expertise in economics is nonetheless pertinent to macroeconomic causal modeling today. The specification of a macroeconomic causal model corresponds to what Schmoller calls ‘observation of economic processes.’ He begins his analysis by conceding that observations of economic processes (i.e., discernments of patterns in the shape of relations of direct type-level causation) are not as exact as observations in the natural sciences. They are not as exact as observations in the

Macroeconomic Expertise  165 natural sciences because economic processes are infinitely complex, because a great number of relevant causes and agents operate in economics, and because observations of economic processes necessarily rely on “condensation and pre-selection”: So kommen wir zu dem Ergebnis, daß bei der unendlichen Kompliziertheit der volkswirthschaftlichen Vorgänge, bei der großen Summe mitwirkender Ursachen und Personen die entstehenden Bilder, schon weil sie auf Kondensierungs- und Ausleseprozessen beruhen, nicht leicht die Genauigkeit naturwissenschaftlicher Beobachtung erreichen können (Schmoller 1911/1998, p. 279). But Schmoller (1911/1998, pp. 279–80) also believes that expertise or trained judgment can help economists idealize and pre-select adequately, and that expertise or trained judgment can be acquired through both a long-lasting experience in business administration and economic policy and a scientific education in economics: [S]eit es eine höhere geistige Kultur mit Schulbildung, Presse und Lektüre gibt, erreichen zahlreiche Geschäftsmänner und Beamte durch jahrelange praktische Lebenserfahrung und Uebung eine gewisse Fähigkeit, volkswirtschaftliche Erscheinungen im großen und ganzen richtig zu beobachten. Und daneben hat die Wissenschaft und der Unterricht in ihr, die regelmäßige Schulung im wissenschaftlichen Beobachten […] es […] dahin gebracht, daß […] wir […] aus der großen Masse des Beobachtungsmaterials […] die richtige Auswahl zu treffen gelernt haben. Expertise or trained judgment can help economists condensate and pre-select adequately. But like Daston and Galison, Schmoller (1911/1998, p. 280) emphasizes that expertise or trained judgment cannot guarantee that observations idealize and pre-select adequately. The more the respective process is ramified and complex, the more the observation of that process is prone to error: “Aber immer bleibt die Beobachtung der volkswirtschaftlichen Tatsachen eine schwierige, von Fehlern um so leichter getrübte Operation, je größer, verzweigter, komplizierter die einzelne Erscheinung ist.” The above quotations indicate that both Welsch and Schmoller believe that expertise or trained judgment can be taught: while expertise or “lore” in applied statistics is taught in classes on applied statistics or by mentoring early-career applied statisticians, expertise in the observation of economic processes is taught in economics classes and by mentoring novices in business administration and economic policy. Daston and Galison (2010, p. 312) point out that its ability of being taught is what distinguishes trained judgment from truth-to-nature: “Although brilliance could not be taught, intuitive thinking could.” They add that intuitive thinking could be taught, “even if no one understood exactly how it functioned.” It has to be emphasized, however, that the functioning of expertise or trained judgment is at least roughly understood. And it will be necessary to outline that understanding

if the conception of scientific objectivity as trained judgment is to be evaluated in a manner that is somewhat informed.

8.3 Macroeconomic Expertise: How Does It Work?

The best-known and most lucid account of the functioning of expertise can be found in the work of Dreyfus. Dreyfus develops that account in a series of papers and books. But the most complete and detailed version of it is contained in Mind Over Machine (1986), a book that he co-authored with his brother Stuart, a computer engineer. In that book, Dreyfus proposes a developmental account of expertise. He suggests that the acquisition of expertise has to run through four ascending stages – (1) the stage of the novice, (2) the stage of the advanced beginner, (3) the stage of competence, and (4) the stage of proficiency – in order to reach (5) the final stage of expertise. In the first stage, the novice learns a context-free set of rules for determining action and tends to act slowly as she pays a lot of attention to their correct application (Dreyfus and Dreyfus 1986, p. 21). In the second stage, the advanced beginner has considerable experience in coping with real situations and marginally improves by identifying meaningful additional aspects of the situation that are not captured by rules (Dreyfus and Dreyfus 1986, pp. 22–3). In the third stage, the number of recognizable context-free and situational elements present in a real-world circumstance becomes “overwhelming” (Dreyfus and Dreyfus 1986, p. 23). The competent performer therefore begins to narrow down these elements: he chooses a plan in order to selectively address only those elements that appear relevant from his perspective. At the same time, he feels responsible for, and thus emotionally involved in, the product of his choice (Dreyfus and Dreyfus 1986, p. 26). While the actions of the novice and advanced beginner are characterized by a detached attitude, the competent performer sometimes finds himself “on an emotional roller coaster” (Dreyfus 2000, p. 160). In the fourth stage, the proficient performer begins to transcend what Dreyfus and Dreyfus (1986, p. 28) call the “Hamlet model of decision making,” i.e., the model of “the detached, deliberative, and sometimes agonizing selection among alternatives.” But only the fifth and final stage of expertise completes the transcendence of the Hamlet model. That expertise completely transcends the Hamlet model means essentially two things. It means, first, that experts are no longer bound by any rules that dictate what they need to do (cf. Dreyfus 2000, p. 162). It means, secondly, that expert knowledge is non-propositional. Expert knowledge is non-propositional because experts are deeply involved in the tasks they face. Certain features of the situation stand out as salient while others recede into the background, and the salient features change as the situation changes. But none of this is brought to the conscious attention of the expert. When she is deeply involved, her actions respond to the present situation immediately and intuitively. The situations in which she is going to take those actions trigger unconscious memories of similar situations in which similar actions were successful (Dreyfus and Dreyfus 1986, pp. 28, 35).

Macroeconomic Expertise  167 Dreyfus believes that his developmental account of expertise is phenomenologically justified, i.e., that all subjects are capable of recognizing it as complying with their own experience. He expresses that belief when extending the following invitation to his readers: “You need not merely accept our word but should check to see if the process by which you yourself acquired various skills reveal a similar pattern” (Dreyfus and Dreyfus 1986, p. 20). Dreyfus also claims that his account is universal in scope, i.e., that it applies to all sorts of motor and intellectual skills (riding bicycles, flying planes, playing chess, doing astrophysics etc.). If it is universal in scope, it should also apply to the case where the action in question is the specification of a macroeconomic causal model. But what could Dreyfus’s account look like if it is applied to that particular case? Perhaps one may say that in the first stage, the novice takes note of important research findings on the behavior of individual aggregates, such as consumption, investment, employment, and inflation. She also makes the acquaintance of the macroeconomic causal models that the macroeconomic mainstream or tradition has used to justify macroeconomic policy decisions: a baseline real-business cycle model, the canonical DSGE model, and some extensions of the canonical DSGE model. She tries to apply these models to particular economies at particular moments in time in order to improve her understanding of the appropriateness of particular policy measures: monetary and fiscal policies, government bailouts, quantitative easing etc. In the second stage, the advanced beginner meets with situations to which these models do not appear to apply. Perhaps these situations are such that a nominal interest rate decrease is unable to fight deflation, that an expansive fiscal policy leads to (high levels of government debt, a sovereign default and, as a result) high levels of unemployment, that a government bailout provides financial help to financial institutions but makes the government impossible to reelect in the eyes of the electorate, that the large-scale purchase of toxic assets by a central bank leads to higher inflation and to no increase in output or employment, and so on. Or perhaps these situations are such that consumers have non-homothetic utility functions, firms have heterogeneous productivities, agents do not form rational expectations, agents interact directly and in dynamic non-equilibrium, and so on. In the third stage, the competent performer is employed by a central bank or other policymaking institution and assigned a set of tasks the accomplishment of which is important and in need of a considerable amount of responsibility. In that stage, the number of potential type-level causes that the competent performer needs to take into account when assessing the appropriateness of a particular policy measure becomes “overwhelming.” She responds to that overwhelming number by asking macroeconomic researchers to develop ever more non-simple and sophisticated causal models. But she also begins to narrow down the number of typelevel causes by selectively addressing only those causes that appear relevant to the respective situation from her own perspective. In the fourth stage, the proficient performer begins to transcend the “Hamlet model of decision making.” She begins to understand that the “the detached, deliberative, and sometimes agonizing selection” among competing and empirically

168 Objectivity underdetermined causal models serves no purpose: that she must take responsibility and develop her own model. At this stage, she begins to incorporate more than just (macro-) economic insights into her thinking. And she will make the most progress if she has the “rare combination of gifts” referred to by John Maynard Keynes (1925, p. 12): that is, if she is not only an economist, but also a mathematician, historian, stateswoman, and philosopher. In the fifth stage, the macroeconomic expert no longer pays any explicit attention to the causal models that the macroeconomic mainstream or tradition has used to justify macroeconomic policy decisions. She continues to rely on these models implicitly. But the causal model that she specifies does not necessarily coincide with any of the prevalent causal models. It also lacks the degree of mathematical sophistication of the models that she asked macroeconomic researchers to develop. It probably isn’t even put on paper. And it is held to be very limited in scope, i.e., to relate exclusively to the situation at hand: to the situation in which the institution for which she works has to decide whether or not to carry out a particular policy measure. Her specification of that model responds immediately and intuitively to that particular situation: that particular situation triggers unconscious or conscious memories of similar situations in which taking the respective measure did or did not help to achieve the targeted end. When asked for a justification of her recommendation (not) to carry out the respective measure, she refers to the causal model that she specifies and to the high probability that she assigns to the hypothesis expressed by that model. And when asked for a justification of that specification and probability assignment, she might refer to evidence deriving from theory, natural experiment, or econometric tests. It should be clear, however, that evidence of that kind is in principle incapable of sufficiently determining her specification and probability assignment. Her specification and probability assignment will also rely substantially on expert intuition and non-propositional knowledge, i.e., on knowledge of how to specify a causal model in an intuitive response to the situation at hand. The point of applying Dreyfus’s account to the particular case of causal model specification in macroeconomics is not to claim that experts in Dreyfus’s sense typically work in central banks or other policymaking institutions. The point is rather to suggest that scientific objectivity in the sense of trained judgment or expertise could be achieved in macroeconomic causal modeling if there were experts in Dreyfus’s sense in macroeconomics. What needs to be assessed is of course whether we have any good reasons to believe that there are experts in Dreyfus’s sense in macroeconomics, and that these experts have the capacity of specifying causal models that adequately represent relations of direct type- or token-level causation that obtain in the situation at hand. Before coming up with a definite answer, one should note that Dreyfus’s account of expertise has an important normative implication. Since the expert transcends the Hamlet model of decision making, he cannot be expected to describe his process of decision making in propositional terms: “[E]xpert response is immediate […]. Also, […] since there are no rules that dictate […] what […] is the correct thing to do in that type of situation, the […] expert cannot explain why he did what

Macroeconomic Expertise  169 he did” (Dreyfus 2000, p. 162). Here is how Evan Selinger and Robert Crease (2002, p. 256) describe that normative implication: Rational reconstruction of expert decision making, Dreyfus argues, inaccurately represents a process that is in principle unrepresentable. When nonexperts demand that experts walk them through their decision making process step by step so that they can follow the expert’s chain of deductions and inferences (perhaps hoping to make this chain of inferences for themselves), they are, according to Dreyfus, no longer allowing the expert to function as expert, but instead, are making the expert produce derivative, and ultimately false, representations of his or her expertise. Hence Dreyfus argues that too much pressure should not be placed upon experts to ‘rationalize’ their ‘intuitive’ process of decision making to nonexperts. Selinger and Crease (2002, pp. 262–3) refuse to accept that normative implication. They argue that Dreyfus’s assumption of the autonomy of expert training suggests a naivete in his counsel to ‘trust experts’. […] [E]xperts will never be able to free themselves a priori from the suspicion that prejudices, ideologies, or hidden agendas might lurk in the pre-reflective relation that characterizes expertise. Not only is this suspicion to be expected; its absence would also be socially dangerous. […] He leaves no grounds for understanding how an expert might be legitimately challenged (or instructed, for that matter, as in the case of sensitivity training, nonexpert review panels, etc.). But Selinger and Crease (2002, p. 262) not only refuse to accept the normative implications of Dreyfus’s account. They also seek to replace that account with an account that represents experts as “culturally or situationally embedded with prejudices, ideologies, or hidden agendas. […] The acquisition of expertise is not a transcending of embeddedness and context, but a deepening and extension of one’s relationship to it.” There are, however, at least two points that need to be made in response to Selinger and Crease’s account of the “embeddedness” of the expert. The first point is that experts do not necessarily need to transcend their being embedded with prejudices, ideologies, or hidden agendas in order to respond adequately to the situation at hand. Dreyfus nowhere analyzes the role that prejudices etc. play when experts take actions. But his account doesn’t seem to require that prejudices operate only on stages (1)–(3) of the acquisition of expertise. In the case of macroeconomics, a macroeconomist’s being embedded with prejudices etc. will of course lead us to entertain doubts about his expertise. It doesn’t seem to be impossible, however, that an expert arrives at adequate causal model specifications, even though his intuitive responses to the situation at hand are influenced by prejudices etc. The second point is that the undesirability of its normative implications does not imply that Dreyfus’s account is false. It is true that experts often fail to communicate

reasons for their actions to other experts or laypeople, and it's of course regrettable that they fail to communicate these reasons. But their failure to communicate these reasons does not imply that it is false to say that experts cannot communicate these reasons because their actions are based on tacit intuition and non-propositional knowledge. In macroeconomic policy analysis, experts often do communicate reasons in order to justify causal model specifications: evidential propositions deriving from theory, natural experiments, or econometric testing procedures. In light of the negative result of Part I, however, one cannot say that these reasons ever fully justify particular causal model specifications. One will therefore have to say that in macroeconomic policy analysis, expertise in the sense of Dreyfus is in fact desirable. If the reasons that can be communicated to justify causal model specifications determine these specifications only insufficiently, then expertise in the sense of Dreyfus is the last hope: then macroeconomists cannot arrive at causal model specifications that adequately represent the relations of type-level causation that obtain in the situation at hand unless they are guided by expert intuition and non-propositional knowledge when specifying these models.

8.5 Expert Intuition and Scientific Objectivity

A more serious problem is that there are cases in which it is difficult to judge whether there is anything like expert intuition or non-propositional knowledge. This problem is not inherent to Dreyfus's account and might have to do with the nature of expertise itself. But while the cases that Dreyfus analyzes (riding bicycles, nursing, flying planes, playing chess etc.) clearly reveal expert intuition or non-propositional knowledge, there are cases in which it is not so clear whether there is anything like expert intuition or non-propositional knowledge. Cases in which this is not so clear are typically cases in which putative experts deny each other the capacity of expertise, and in which advanced beginners or competent performers find it difficult to decide whether any of the putative experts qualifies as an expert. A brief look at the following passage from Paul Krugman's 2009 article on the state of the macroeconomic discipline in the immediate wake of the crisis of 2008–2009 indicates that macroeconomic policy analysis is one such case:

[I]n the wake of the crisis, the fault lines in the economics profession have yawned wider than ever. Lucas says the Obama administration's stimulus plans are 'schlock economics', and his Chicago colleague John Cochrane says they're based on discredited 'fairy tales'. In response, Brad DeLong of the University of California, Berkeley, writes of the 'intellectual collapse' of the Chicago School, and I myself have written that comments from Chicago economists are the product of a Dark Age of macroeconomics in which hard-won knowledge has been forgotten.

This passage from Krugman's article can be read as implying that Keynesian macroeconomists like Brad DeLong and Krugman qualify as experts, while new classical macroeconomists like John Cochrane and Robert Lucas do not. The

Macroeconomic Expertise  171 problem is that Cochrane and Lucas would most probably return the compliment. So how are competent nonexperts supposed to decide whether any of them qualifies as an expert in macroeconomic policy analysis? Goldman (2001, p. 93) provides the following list of “five possible sources of evidence that a novice might have […] for trusting one putative expert more than another”: A Arguments presented by the contending experts to support their own views and critique their rivals’ views. B Agreement from additional putative experts on one side or other of the subject in question. C Appraisals by ‘meta-experts’ of the experts’ expertise (including appraisals reflected in formal credentials earned by the experts). D Evidence of the experts’ interests and biases vis-à-vis the question at issue. E Evidence of the experts’ past ‘track record.’ The list is meant to give five possible sources of evidence that a novice might have for trusting one putative expert more than another. But lack of trust in a putative expert can also be understood as skepticism about her expertise. And novices will find it difficult to evaluate the arguments mentioned under (A). Therefore, Goldman’s list may also be read as giving five possible sources of evidence that a competent nonexpert might have for deciding whether any of the putative experts who deny each other the capacity of expertise qualifies as expert. Goldman aims to show that there are cases in which at least some of the five possible sources of evidence can be exploited. But he also affirms that there are cases in which none of these sources can be exploited. His conclusion says that his “story’s ending is decidedly mixed, a cause for neither elation nor gloom. […] There are a few silver linings […]. There is no denying, however, that the epistemic situations […] are often daunting” (Goldman 2001, p. 109). The case of macroeconomic policy analysis is, unfortunately, a case with no silver linings. Arguments presented by macroeconomists to support their own views and critique their rivals’ views relate to evidence deriving from theory, natural experiment, or econometric testing procedures. It has been repeatedly pointed out that evidence of that kind is in principle incapable of determining the specification of a macroeconomic causal model sufficiently. Agreement from additional macroeconomists won’t do because for every macroeconomist who agrees on one side of the subject in question (who e.g. agrees on the Obama administration’s stimulus plans), there is another macroeconomist who agrees on the other side of that subject. By “appraisals by meta-experts,” Goldman (2001, p. 97) refers essentially to credentials like academic degrees, professional accreditations, work experience etc. Macroeconomists who deny each other the capacity of expertise typically have similar credentials. Therefore, these credentials cannot help the competent nonexpert decide whether a particular macroeconomist qualifies as an expert. Evidence of a macroeconomist’s interests and biases is evidence of the operation of ideologies, value judgments, and group interests in specifications of macroeconomic

172 Objectivity causal models. Evidence of that operation can be obtained in the way described in section 5 of the previous chapter. But the evidence obtained typically indicates that ideologies etc. operate in specifications of all the competing causal models. Finally, evidence of a macroeconomist’s past track record is evidence of the success or failure of the macroeconomic policy measures that she recommended or advised against in the past. The problem with evidence of that kind is that it is difficult or currently impossible to assess. In order to tell whether a macroeconomic policy measure is successful, one would have to be able to attribute the desired outcome to the policy measure; and in order to have that ability, one would have to be able to provide evidence in support of a relation of type-level causation between a variable that the policy measure manipulates, and a variable the change of which represents the desired outcome. I argued in Chapter 4, however, that in contemporary macroeconomics, evidence in support of such relations isn’t forthcoming. It is therefore difficult (if not impossible) to provide evidence in support of a macroeconomist’s past track record. There are additional sources of evidence that a novice might have for trusting one putative expert more than another: sources that Goldman does not take into consideration. One such source lies in the developmental account of expertise that Dreyfus thinks is phenomenologically justified and universal in scope. If it is obvious that a putative expert has gone through all five stages of the developmental account, the novice will have evidence for trusting that the putative expert is an expert. In the case of macroeconomic policy analysis, her evidence won’t be as strong as in the case of, say, bicycling: in the case of bicycling, her evidence will include evidence of a successful bicycle ride; in the case of macroeconomic policy analysis, by contrast, evidence of the success of the policy measures that a macroeconomist recommended in the past will be missing. But she will have evidence for trusting that the putative expert is a macroeconomic expert. The problem is that many macroeconomists have gone through all five stages of the developmental account and that many of them specify competing causal models to justify the policy measure that they think should be implemented in response to the situation at hand. The problem is, in other words, that the evidence that the novice has for trusting that a putative expert is an expert is not sufficiently strong to support the objectivity of a particular causal model. Cochrane, DeLong, Krugman, and Lucas all qualify as macroeconomic experts (and in their more sincere moments, they would never deny the others the status of experts), but Cochrane and Lucas favor a causal model specification that competes with the causal model specification that DeLong and Krugman favor. It follows that in macroeconomics, scientific objectivity in the sense of expertise is just as problematic as scientific objectivity in the sense of scientific realism. It is possible that one of the causal models that macroeconomic experts specify adequately represents the relations of direct type-level causation that obtain in the situation at hand. But there is no guarantee that any of these models adequately represent these relations. 
The problematic character of scientific objectivity in the sense of macroeconomic expertise can be understood as a special case of the more general “problem with experts” that Stephen Turner (2001) claims is present in a democracy. The

Macroeconomic Expertise  173 more general problem with experts is that the democratic principle of equality will be violated if expert advice is heeded. In epistemic contexts, the principle of equality requires that a person must not be coerced into accepting a particular judgment without being given the reasons that justify that judgment. Expertise is needed to determine the policy measure that should be implemented in response to the situation at hand. But experts won’t be able to fully justify the policy measure that they say should be implemented. They will be able to cite bits and pieces of theoretical and empirical evidence. It should be clear, however, that opposing experts will cite alternative evidence, and that the cited evidence will be unable to fully determine the policy measure that should be implemented in response to the situation at hand. Thus, the principle of equality will be violated if expert advice is heeded. Julian Reiss (2021, p. 219) points out that there are three political approaches that it seems can be adopted in response to the problems with experts. The populist approach chooses to ignore expertise and to make political decisions “a matter of mob psychology.” The technocratic approach subjects “citizens to rule by experts.” And the democratic approach maintains that “the process of democratic deliberation can balance the two demands more fairly”: the demand that expert advice be heeded, and the demand that citizens not be subjected to the rule of experts. I will argue in the following chapter that the approach to be adopted in the case of macroeconomic policy analysis is the democratic one. For now, I wish to explain why the problematic character of scientific objectivity in the sense of macroeconomic expertise can be understood as a special case of the problem with experts in a democracy. If the empirical underdetermination of macroeconomic causal models is nonsporadic (if the evidence that macroeconomists can provide in support of relations of direct type-level causation is inconclusive in principle), then macroeconomic expertise will be our last hope: then macroeconomists won’t be able to arrive at causal model specifications that adequately represent the relations of type-level causation that obtain in the situation at hand unless they are guided by expert intuition and non-propositional knowledge when specifying these models. Thus, ignoring the advice of macroeconomic experts and making macroeconomic policy analysis a matter of mob psychology cannot be an option. But Dreyfus is probably right when pointing out that experts cannot be expected to rationalize their intuitive process of decision making to non-experts. For experts in macroeconomic policy analysis, this means that they cannot be expected to fully justify their causal model specifications in propositional terms. They can be asked to state the evidential propositions on which their causal model specification relies (evidential propositions deriving from theory, experiment, and econometric testing procedures). But these propositions combine with expert intuitions and nonpropositional knowledge to sufficiently determine the causal model specification. Consequently, heeding the advice of macroeconomic experts violates the democratic principle of equality. Giving up that principle (subjecting citizens to the rule of macroeconomic experts) cannot be an option either. 
Is there a way for citizens of democratic societies to heed the advice of macroeconomic experts in a qualified manner: to heed the advice of macroeconomic

experts without giving up on the (epistemic version of the) democratic principle of equality? I will argue in the upcoming chapter that there is such a way: that the circle of macroeconomic experts needs to be augmented by small groups of informed outsiders. Informed outsiders are scientists or laypeople who are advanced beginners or competent performers in macroeconomic policy analysis, who (like causal theorists) are aware of the limitations of causal inference methods in macroeconomics, and who (like philosophers or historians) are experts with respect to the non-scientific values that might influence the specification of causal models in macroeconomics. If the circle of macroeconomic experts is augmented by informed outsiders, citizens will regain epistemic equality with experts at least partially, and deliberators in the augmented circle will mutually engage and come to a better understanding about the methods that need to be developed to provide macroeconomic policy analysis with more secure foundations.

References

Daston, L. and Galison, P. (²2010). Objectivity. New York: Zone Books.
Dreyfus, H. L. (2000). "Could Anything be more Intelligible than Everyday Intelligibility? Reinterpreting Division I of 'Being and Time' in the Light of Division II." In Faulconer, J. and Wrathall, M. A. (eds.), Appropriating Heidegger. Cambridge: CUP, 155–74.
Dreyfus, H. L. and Dreyfus, S. E. (1986). Mind Over Machine: The Power of Human Intuition and Expertise in the Era of the Computer. New York: The Free Press.
Goldman, A. (2001). "Experts: Which Ones Should You Trust?" Philosophy and Phenomenological Research 63(1), 85–110.
Keynes, J. M. (1925). "Alfred Marshall, 1842–1924." In Pigou, A. C. (ed.), Memorials of Alfred Marshall. London: Macmillan.
Krugman, P. (2009). "How Did Economists Get It So Wrong?" The New York Times 09/06/2009.
Menger, C. (1883). Untersuchungen über die Methode der Sozialwissenschaften und der politischen Oekonomie insbesondere. Leipzig: Duncker u. Humblot.
Reiss, J. (2021). "Why Do Experts Disagree?" Critical Review 32(1–3), 218–41.
Schmoller, G. v. (1883/1998). "Zur Methodologie der Staats- und Sozialwissenschaften." In Nau, H. H. (ed.), Gustav Schmoller: Historisch-ethische Nationalökonomie als Kulturwissenschaft. Marburg: Metropolis, 159–83.
Schmoller, G. v. (1911/1998). "Volkswirtschaft, Volkswirtschaftslehre und -methode." In Nau, H. H. (ed.), Gustav Schmoller: Historisch-ethische Nationalökonomie als Kulturwissenschaft. Marburg: Metropolis, 215–368.
Selinger, E. M. and Crease, R. P. (2002). "Dreyfus on Expertise: The Limits of Phenomenological Analysis." Continental Philosophy Review 35, 245–79.
Turner, S. (2001). "What is the Problem with Experts?" Social Studies of Science 31(1), 123–49.
Welsch, R. E. (1986). "Comment." Statistical Science 1, 403–5.

9 Macroeconomics in a Democratic Society

9.1 Introduction

Chapters 6–8 have shown that in contemporary macroeconomics, scientific objectivity in the traditional sense of scientific realism, value-independence, or expertise is either impossible or problematic. Scientific objectivity in the sense of scientific realism is problematic because in contemporary macroeconomics, the evidence that can be provided in support of the (approximate) truth of causal models remains inconclusive. Scientific objectivity in the sense of value-independence is impossible because the empirical underdetermination of macroeconomic causal models is non-sporadic, and because non-scientific values will necessarily influence the acceptance or rejection of scientific hypotheses (or the specification of the scientific models expressing them) if these hypotheses are empirically underdetermined. Finally, scientific objectivity in the sense of expertise is problematic because there is no guarantee that any of the causal models that macroeconomic experts specify adequately represents the relations of direct type-level causation that obtain in the situation at hand.

The present and final chapter will argue for a democratic organization of the scientific practice of macroeconomics (or macroeconomic policy analysis). There appear to be no fewer than three reasons why a democratic organization of the practice of macroeconomics is called for. The first reason is that some non-scientific values will dominate others unless the scientific practice of macroeconomics is organized democratically. The second reason is that a democratic approach to macroeconomic expertise is recommended in view of the agony of choice that would otherwise exist between a technocratic and a populist approach (cf. the end of Chapter 8). The third reason is that the steps that need to be taken to approximate the ideal of scientific objectivity in any of its traditional meanings will become visible if the scientific practice of macroeconomics (or macroeconomic policy analysis) is organized democratically.

The question is, of course, what a democratic organization of scientific practice is supposed to look like. The present chapter will consider the three conceptions of democratic organization that Karl Popper, Gunnar Myrdal, and Philip Kitcher can be seen to propose in response to that question. Popper believes that scientific practice is organized democratically if it is competitive and tolerant of

free discussion. Myrdal thinks that the scientific practice of the social sciences is organized democratically if everyone is invited to challenge a particular piece of research that she finds to be founded on what she considers to be the wrong value premise. Kitcher holds that scientific practice is organized democratically if it is "well ordered," i.e., if significance (the question of what problems research should be conducted on) is dealt with in an ideal discussion under mutual engagement; if certification (the acceptance of research results as providing evidence for scientific claims) results from applications of methods that an ideal deliberation endorses as reliable, and that conform to the ideal of transparency; and if application (the use of public knowledge to solve urgent problems) is the topic of an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available. The present chapter will defend a macroeconomic variant of Kitcher's conception of well-orderedness.

The problem with Popper's and Myrdal's conceptions is that they do not allow for the identification of clear-cut steps that need to be taken in order to approximate the ideal of scientific objectivity. According to Popper, the only path to scientific objectivity leads through the critical method, which becomes "situational logic" in the social sciences. However, the critical method can be shown to be inapplicable to causal modeling in macroeconomics (or in any other discipline). For Myrdal, the only way for research in the social sciences to become scientifically objective is to expose to full light the value premises that are usually left implicit and vague. For him, there is accordingly no scientific objectivity in any of the traditional meanings. Moreover, the organization of scientific practice that he proposes might not be democratic in the right sense. Kitcher's specification of well-orderedness, by contrast, allows for the identification of clear-cut steps that need to be taken in order to approximate the ideal of scientific objectivity.

The chapter will begin by presenting Popper's conception of a democratic organization of scientific practice, and by discussing the critical method as a means of attaining scientific objectivity in macroeconomics (Section 9.2). It will then present Myrdal's idea of a democratic organization of research in the social sciences and criticize his interpretation of scientific objectivity as exposition of value premises (Section 9.3). Next, the chapter will present Kitcher's conception of a democratic organization of scientific practice: the ideal of well-orderedness (Section 9.4). The chapter (and the book as a whole) will conclude by describing two scenarios that might be characteristic of the current state of macroeconomic policy analysis (Section 9.5). While one of the two scenarios is relatively close to the ideal of well-orderedness, the other is relatively distant from it. The chapter will refrain from judging which of the two descriptions is more accurate. What is decisive is that the descriptions make visible the steps that macroeconomists need to take to approximate the ideal of scientific objectivity.

9.2 Popper on Scientific Objectivity and the Critical Method

In a passage from "The Logic of the Social Sciences," Popper (1967a, p. 90) identifies scientific objectivity with the objectivity of the critical method: "The so-called

Macroeconomics in a Democratic Society  177 objectivity of science lies in the objectivity of the critical method.” In another passage from “The Logic of the Social Sciences,” Popper (1967a, pp. 95–6) claims that scientific objectivity depends on a specific social or political organization of scientific practice: What may be described as scientific objectivity […] depends, in part, upon a number of social and political circumstances […]. Objectivity can only be explained in terms of social ideas such as competition (both of individual scientists and of various schools); tradition (mainly the critical tradition); social institution (for instance, publication in various competing journals and through various competing publishers; discussion at congresses); the power of the state (its tolerance of free discussion). The social or political organization that Popper describes can be referred to as ‘democratic’: it enables the mutual criticism, the friendly-hostile division of labor, the co-operation and competition among scientists; and it cannot enable that criticism etc. unless it embodies social ideas like competition, the critical tradition, and tolerance of free discussion. Popper may accordingly be said to hold the belief that scientific objectivity depends on a democratic organization of scientific practice and that the only path to scientific objectivity leads through the critical method. In the social sciences, the critical method coincides with what Popper calls “situational logic” or “situational analysis.” And there are essentially three tasks that Popper assigns to situational analysis. Its first task is to solve problems of explaining or predicting certain kinds or types of events with the help of constructing models of typical social situations (cf. Popper 1967b, pp. 357–8). The kinds of events that it attempts to explain or predict may be the topic of education theory, sociology, economics, or any other social science discipline. And the typical social situations that these models are models of are situations in which the types of event in question result from the behavior of agents given their environment, goals, and beliefs and the validity of what Popper (1967b, p. 359) calls the “rationality principle,” i.e., of the principle that says that all agents behave rationally, and that in economics, coincides with the principle according to which all agents solve optimization problems. The second task that Popper (1967a, p. 97) assigns to situational analysis is “to fight against the confusion of value-spheres.” In accordance with the contemporary distinction between scientific and non-scientific values (cf. Chapter 7), Popper (1967a, pp. 96–7) distinguishes between scientific values such as truth, relevance, interest, significance, fruitfulness, explanatory power, and simplicity and “extrascientific values and disvalues” like human welfare, national defense, national policy, industrial expansion and the acquisition of personal wealth. Popper (1967a, p. 96) concedes that it is “impossible to eliminate such extra-scientific interests and to prevent them from influencing the course of scientific research.” But he also thinks that it is possible and important to differentiate “the interests which do not belong to the search for truth and the purely scientific interest in truth. […] This cannot, of course, be achieved once and for all […]; yet it remains one of

178 Objectivity the enduring tasks of […] scientific criticism” (Popper 1967a, pp. 96–7). Popper believes, in short, that situational analysis manages to roll back the influence of extra-scientific (or non-scientific) values at least to some extent. The third and final task that Popper (1967a, p. 98–9) assigns to situational analysis is “to show that unacceptable conclusions can be derived from the assertion we are trying to criticize. If we are successful in deriving, logically, unacceptable conclusions from an assertion, then the assertion may be taken to be refuted.” This passage is reminiscent of passages that in earlier writings (cf. especially Popper 1959, p. 66) describe the method of falsification: the method of falsifying a theory by deriving from it a basic statement that contradicts a low-level empirical hypothesis that has been corroborated repeatedly. One may accordingly say that the third task of situational analysis is to apply the method of falsification. Can macroeconomics (or macroeconomic policy analysis) be expected to progress toward the ideal of scientific objectivity if it is organized democratically in Popper’s sense, and if the method of macroeconomics is situational analysis? At first sight, the answer seems to be positive. If the scientific practice of macroeconomic policy analysis is competitive and tolerant of free discussion, then non-scientific values will be unlikely to dominate others, and then people will be unlikely to adopt populist or technocratic approaches with respect to macroeconomic expertise. And if the method of macroeconomics is situational analysis, then macroeconomists will be able to roll back the influence of non-scientific values at least to some extent. At closer scrutiny, however, the answer turns out to be negative. There is, first of all, the problem that the list of social ideas that Popper provides is too unspecific to allow for the identification of clear-cut steps that macroeconomists could take in order to approximate the ideal of scientific objectivity. The list is intended to be incomplete (Popper says that objectivity can only be explained in terms of social ideas “such as …”). But it also remains unclear which ideas would have to be added in order to complete the list; and most of the items on the list are ill-defined. Is tolerance of free discussion, for instance, supposed to imply that anyone should be allowed to criticize or even reject expert assessments of empirical matters? Next, there are no less than four problems with the critical method, which Popper thinks scientists cannot avoid on the way to scientific objectivity. The first of these problems relates to the status of the rationality principle: is it falsifiable, and is it necessary for explanations and predictions in the social sciences? With respect to its falsifiability, Popper (1967b, pp. 360–1) endorses the position that it “is not treated as subject […] to any kind of tests,” and that it “is not universally true” and therefore “false.” I think that Hands (1985, p. 87) is right when arguing that this position is only seemingly paradoxical: that the rationality principle “can be false as a universal principle and yet unfalsifiable in any particular application.” But Popper (1967b. pp. 102, 103) appears to be ambivalent on the question of the necessity of the rationality principle. 
He states on the one hand that “[a] social science oriented towards […] situational logic can be developed independently of all subjective or psychological ideas” (my emphasis); he states on the other that “[t]he method of situational analysis is certainly […] not a psychological one; for it

Macroeconomics in a Democratic Society  179 excludes, in principle, all psychological elements and replaces them with objective situational elements. […] The explanations of situational logic […] are rational, theoretical reconstruction.” Popper’s second statement can be read as a plea for the new classical and new Keynesian programs of micro-foundations. But it conflicts with the psychological principles underlying Keynesian macroeconomics (for instance, with the three psychological motives for holding money). It’s also hard to reconcile with the program of empirical microfoundations that I argued needs to be in place for macroeconomists to be able to make progress in causal modeling (cf. Chapter 3, Section 3.6). The second problem relates to the type of explanation or prediction that Popper has in mind. Numerous passages in his work (cf. 1957, pp. 162–4, 1959, pp. 38–40, 1967a, p. 100) suggest that what he has in mind is explanation or prediction in the sense of the deductive-nomological (DN) account of scientific explanation: in the sense of an account that understands the explanation or prediction of an event as the deductive inference of a statement describing that event from premises that include at least one universal law and at least one singular statement or initial condition. In the social sciences, the rationality principle and a situational model play the role of universal law and singular statement, respectively (cf. Popper (1967b, p. 100n). The rationality principle is not a universal law because it “is not universally true,” but it “is not treated as subject […] to any kind of tests” either (cf. above). But the DN account is inadequate as an account of scientific explanation or prediction in causal modeling. A causal model that includes an equation expressing the statement that X directly type-level causes Y explains Y = y if that statement is true; and this statement is true if the equation expressing it is invariant under interventions, i.e., if this equation continues to hold in at least some situations in which a hypothetical or actual intervention changes the value of X (cf. Woodward and Hitchcock 2003, pp. 13, 15). Similarly, a causal model that includes an equation expressing the statement that X directly type-level causes Y predicts Y = y if that statement is true; and this statement is true if the equation expressing it is invariant under interventions. Note that in this sense, ‘prediction’ is not to be confused with the kind of prediction that is possible if X directly type-level causes Y in the sense of Granger-causality (cf. Chapter 5, Section 5.3). But it is clearly the kind of prediction that is relevant for purposes of policy analysis: the kind of prediction that we make when assessing the consequences of policy interventions on X. The third problem with the critical method is that the method of falsification is not applicable to macroeconomic policy analysis. It is not applicable because according to Popper (1959, p. 66), the acceptance of a basic statement that contradicts a theory is only a necessary condition of its falsification, because a sufficient condition of its falsification requires that the basic statement corroborate a low-level empirical hypothesis, and because corroboration of a low-level empirical hypothesis is impossible as long as the hypothesis in question states that there is a relation of direct type-level causation between macroeconomic aggregates. 
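The interventionist notion of invariance invoked above in connection with the second problem can be made concrete with a schematic example. What follows is a minimal sketch; the linear functional form, the symbols, and the reading of the variables as a policy rate and inflation are illustrative assumptions of my exposition here, not a formalization taken from Woodward and Hitchcock. Suppose a causal model contains the structural equation

\[ Y = \alpha + \beta X + \varepsilon , \]

where $X$ and $Y$ are macroeconomic aggregates (say, a policy rate and inflation) and $\varepsilon$ summarizes omitted influences. The equation expresses a relation of direct type-level causation between $X$ and $Y$ only if it is invariant under interventions: only if, for at least some hypothetical or actual interventions that set $X$ to a new value $x$, it remains true that

\[ Y = \alpha + \beta x + \varepsilon . \]

An equation that merely summarizes the observed association between $X$ and $Y$ need not satisfy this condition: it may break down as soon as $X$ is manipulated. In that case it supports at best Granger-type forecasts, not assessments of the consequences of policy interventions on $X$.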
The fourth and final problem is that Popper’s conception of scientific objectivity does not proceed beyond that of structural objectivity (or value independence). If the critical method involves the corroboration of low-level empirical hypotheses,

if 'corroboration' means as much as 'value-independent confirmation of a hypothesis,' if it is possible to roll back the influence of non-scientific values, and if scientists approximate an ideal if they roll back the influence of non-epistemic values, then the ideal in question will be scientific objectivity in the sense of structural objectivity. But there will be no role for scientific objectivity in the sense of scientific realism or scientific expertise.

9.3 Myrdal on Scientific Objectivity and Inferences from Value Premises

I now turn to Myrdal's conception of the democratic organization of research in the social sciences. His conception begins and ends with the conviction that non-scientific values ("valuations") influence research in the social sciences as a matter of necessity: "every study of a social problem, however limited in scope, is and must be determined by valuations. A 'disinterested' social science has never existed and, for logical reasons, can never exist" (Myrdal 1970, pp. 52, 55). Non-scientific values necessarily influence research in the social sciences because the most central concepts of the social sciences necessarily have value connotations: "Words like, for instance, 'equilibrium,' 'balance,' 'stable,' 'normal,' 'adjustment,' 'lag,' or 'function' have in all the social sciences served as a bridge between presumably objective analysis and political prescription." Myrdal (1970, p. 3) also holds that the "most fundamental problems facing the social scientists are […], what is objectivity, and how can the student attain objectivity in trying to find out the facts and the causal relationships between facts?" But how can objectivity be attained if valuations necessarily influence research? If valuations necessarily influence research, then one cannot attain objectivity "by 'keeping to the facts' and refining the methods of dealing with statistical data," or "by stopping short of practical or political conclusions" (Myrdal 1970, p. 51). If valuations necessarily influence research, then it's altogether impossible to understand scientific objectivity "in the conventional sense of independence from all valuations" (Myrdal 1970, p. 55).

But how is scientific objectivity to be understood if not in the conventional sense of independence from all valuations? In order to understand his answer to this question, it is important to see that according to Myrdal (1970, p. 55), "the value premises that actually and of necessity determine social science research are generally hidden. The student can even remain unaware of them. They are […] left implicit and vague." Myrdal (1970, pp. 47–8, 53) points to interesting manifestations of that hidden and implicit influence of value premises. One such manifestation is "that the scientists in any particular institutional and political setting move as a flock, reserving their controversies and particular originalities for matters that do not call into question the fundamental system of biases they share." Another manifestation is that the "cue to the continual reorientation of our work in the social sciences" normally comes from only a few fundamental systems of biases. That second manifestation leads to the unwelcome consequence that scientists usually don't have "foresight enough to read the writing on the wall: why should our

Macroeconomics in a Democratic Society  181 societies usually be taken by surprise by events, be caught unprepared, and forced to improvise?” The question mark at the end of the last paragraph suggests that Myrdal believes that there is a way for social scientists to have foresight enough to read the writing on the wall. He believes that the “only way in which we can strive for ‘objectivity’ […] is to expose the valuations to full light, make them conscious, specific, and explicit, and permit them to determine the theoretical research” (Myrdal 1970, pp. 55–6). He believes, more specifically, that there are three steps by which the social scientist “can better assure objectivity in his research” (Myrdal 1970, p. 5). The first step is “to raise the valuations actually determining our theoretical as well as our practical research to full awareness.” The second step is “to scrutinize them from the point of view of relevance, significance, and feasibility in the society under study.” And the third step is “to transform them into specific value premises for research, and to determine approach and define concepts in terms of a set of value premises which have been explicitly stated.” Myrdal (1970, pp. 53–4) elaborates on the first step when pointing out that it aims at turning a non-sequitur into a sequitur: “When the unstated value premises of research are kept hidden and for the most part vague, the results presented contain logical flaws. […] [T]here is found to be a non-sequitur concealed, leaving the reasoning open to invasion by uncontrolled influences from the valuation sphere. This element of inconclusiveness can be established by critical analysis.” Myrdal (1970, pp. 73–4) refers to the second step when pleading that “the social sciences should be opened more effectively to moral criticism. […] anyone who finds a particular piece of research to have been founded on what he considers wrong valuations can challenge it on that ground. He is also invited to remake the study and remodel its findings by substituting another, different set of value premises for the one utilized.” Myrdal (1970, p. 47), finally, refers to the third step when suggesting that it “blazes the way towards new perspectives.” If the cue to the continual reorientation of the social sciences no longer comes from the dominating political interests but from a variety of valuations, then the social science profession might gain “foresight enough to read the writing on the wall.” The above exposition of Myrdal’s conception of scientific objectivity is arguably a bit cursory.1 But it’s sufficiently detailed to allow for a balanced evaluation of the advantages and disadvantages of Myrdal’s conception in light of the particular case of macroeconomic policy analysis. An obvious advantage of Myrdal’s conception is its clear vision of the manifestations of a hidden and unacknowledged influence of non-scientific values. In the case of macroeconomic policy analysis, the picture of scientists that “move as a flock” or fail to “read the writing on the wall” appears to be particularly pertinent. The fundamental systems of biases that macroeconomists share may come in different shapes. Immediately before the crisis of 2008–2009, the fundamental system was arguably the program of microfoundations. 
If we distinguish two groups of mainstream macroeconomists (new classical and new Keynesian), we may also say that the "continual reorientation" of their work came from two fundamental systems of biases: Walrasian and non-Walrasian general equilibrium theory. And in light of the analysis of Section 7.5 of Chapter 7, we may say that macroeconomists selected these systems on the basis of ideologies, value judgments, or group interests. Because they moved "as a flock" (because the "continual reorientation" of their work came from essentially one fundamental system of biases, i.e., the program of microfoundations), they didn't have "foresight enough to read the writing on the wall." And when the crisis arrived, societies were "taken by surprise by events," "caught unprepared, and forced to improvise."

Another advantage of Myrdal's conception is that it proposes a method that, by raising value premises to full awareness, turns a non-sequitur into a sequitur. In the case of macroeconomic policy analysis, that method combines statistical evidence of a correlation between X and Y with an explicit value premise to derive the hypothesis that X directly type-level causes Y, the hypothesis that Y directly type-level causes X, or the hypothesis that there is a variable Z that type-level causes both X and Y. Consider, for instance, the "highly stable" correlation that Milton Friedman and Anna Schwartz say obtains between monetary changes and economic changes (cf. Section 4.3 of Chapter 4) and two competing value premises: the ideology of the civil servants, and the ideology of the business class (cf. Section 7.5 of Chapter 7 above). If the value premise is the ideology of the civil servants, then the hypothesis derived by Myrdal's method will state that monetary changes directly type-level cause economic changes. By contrast, if the value premise is the ideology of the business class, then the hypothesis will be the hypothesis of "reverse causation": the hypothesis stating that economic changes directly type-level cause monetary changes.

But Myrdal's conception also has at least four important disadvantages. The most important disadvantage is his claim that valuations influence research in the social sciences necessarily or "for logical reasons." Myrdal's argument for that claim states that the most central concepts of the social sciences (the concept of equilibrium, the concept of function etc.) necessarily have value connotations. But it is not the case that they have these connotations necessarily. They can be read as having these connotations. But they may also be read as technical concepts that lack these connotations. John Dupré (2007) proposes a more sophisticated variant of Myrdal's argument. He concedes that the central concepts of the social sciences do not necessarily have value connotations. But he argues that these concepts (his examples include the concepts of Pareto optimum and inflation) would not have practical applications unless they also had value connotations. I think the point to be made in response to that argument is that social scientists can use these concepts without considering their practical applications. Dupré is right when saying that it "seems disingenuous" when economists "deny that normative questions are part of their discipline." But disingenuousness of that kind doesn't rule out that economists can in principle use the concepts of Pareto optimum, inflation etc. without considering their practical applications.

Another important disadvantage of Myrdal's position is that it is unclear why research in the social sciences should become scientifically objective by exposing to full light the valuations that usually remain hidden, implicit, or vague. Consider again the highly stable correlation that Friedman and Schwartz say obtains between monetary changes and economic changes. A macroeconomist belonging to the social class of civil servants can be thought to expose his ideology to full light, but so can a macroeconomist who belongs to the social class of businesspeople. While the macroeconomist belonging to the social class of civil servants derives the hypothesis that monetary changes directly type-level cause economic changes, the macroeconomist belonging to the social class of businesspeople derives the hypothesis that economic changes directly type-level cause monetary changes. There appears to be no way to tell whether either of the two hypotheses is scientifically objective, or whether the research leading up to either of them is scientifically objective.

Another disadvantage of Myrdal's position is that it underestimates the role of expertise. The preceding chapter has concluded that the judgment of the macroeconomists who are arguably best at specifying causal models cannot be dismissed as irrelevant to macroeconomic policy analysis. There is no way for us to find out about the adequacy of that judgment. But we can hope that one of the models that macroeconomic experts specify adequately represents the relations of direct type-level causation that obtain in the situation at hand. Since Myrdal does not even allow for the possibility of value-independent research, he won't allow for the possibility of scientific objectivity in the sense of expertise either.

The fourth and final disadvantage of Myrdal's conception is that the organization of scientific practice that he recommends might not be democratic in the right sense. If really anyone "who finds a particular piece of research to have been founded on what he considers wrong valuations" is invited to "challenge" it and "to remake the study and remodel its findings by substituting another, different set of value premises for the one utilized," then the result might well be what Kitcher (2011, p. 51) refers to as a "vast cacophony, in which the divisions and distortions produced in our history would doom any chance of serious discussion."
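The underdetermination at issue here can be made vivid with a small simulation. The sketch below is not from the book or from Friedman and Schwartz; the coefficients, noise scales, and variable names are purely illustrative assumptions. It generates data under the three competing causal structures and shows that each of them can yield an equally "stable" correlation between X and Y, which is why the correlation alone cannot single out one of the three hypotheses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # sample size; purely illustrative

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

# Structure 1: X directly type-level causes Y.
x1 = rng.normal(size=n)
y1 = 0.8 * x1 + rng.normal(scale=0.6, size=n)

# Structure 2: Y directly type-level causes X ("reverse causation").
y2 = rng.normal(size=n)
x2 = 0.8 * y2 + rng.normal(scale=0.6, size=n)

# Structure 3: a common cause Z drives both X and Y.
z = rng.normal(size=n)
x3 = 0.9 * z + rng.normal(scale=0.44, size=n)
y3 = 0.9 * z + rng.normal(scale=0.44, size=n)

# All three structures produce a correlation between X and Y of roughly 0.8.
for label, (x, y) in [("X -> Y", (x1, y1)),
                      ("Y -> X", (x2, y2)),
                      ("Z -> X, Z -> Y", (x3, y3))]:
    print(f"{label:>14}  corr(X, Y) = {corr(x, y):.2f}")
```

On Myrdal's picture, it is the explicit value premise, not any further statistical fact, that selects among these hypotheses, which the correlation alone cannot distinguish.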

9.4 Kitcher on the Ideal of Well-Ordered Science

The political organization that Kitcher recommends in his Science in a Democratic Society is democracy in the sense of "informed citizen involvement": an expert consensus "incorporating the judgments of small groups of informed outsiders" (Kitcher 2011, p. 12). Kitcher (2011, p. 25) argues that an expert consensus on its own is unsuitable as a political organization of scientific practice because it tends to marginalize dissenters with an equal or even better claim to expertise. He likewise dismisses "vulgar democracy" because research would become myopic and unfruitful and favor short-term over long-term interests if it were simply a matter of majority vote (Kitcher 2011, p. 112). According to Kitcher, only informed citizen involvement is capable of remedying altruism-failures: failures that lead to breakdowns in the social lives of our present and future global communities (just like the altruism-failures that led to breakdowns in the social lives of the bands of our pre-historic ancestors).

In scientific practice, an expert consensus "incorporating the judgments of small groups of informed outsiders" needs to be reached with respect to significance (the question of what problems research should be conducted on), certification (the acceptance of research results as providing evidence for scientific claims), and application (the use of public knowledge to solve urgent problems). If

- significance is dealt with in an ideal discussion under mutual engagement (Kitcher 2011, Chapter 5),
- certification results from applications of methods that (a) an ideal deliberation endorses as reliable, and that (b) conform to the ideal of transparency (Kitcher 2011, Chapter 6), and
- application is the topic of an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available (Kitcher 2011, Chapter 7),

then scientific practice will be well-ordered. Kitcher (2011, pp. 115, 151, 178) refers to the well-orderedness of scientific practice as an ideal. This means that scientific practice cannot fully attain, but only approximate, well-orderedness. But a description of the ideal of well-orderedness also allows for the identification of steps that scientists would need to take in order to approximate the ideal of scientific objectivity (in the traditional sense of realism, value-independence, or expertise).

In the context of significance, the ideal of well-orderedness implies that the question of what problems research should be conducted on is decided in an ideal discussion under mutual engagement. Kitcher (2011, pp. 114–5) describes that ideal discussion as running through three stages. At its first stage, representatives of the various points of view gain a clear sense of what has so far been accomplished in the various sciences. At its second stage, these representatives voice their own preferences: preferences that already reflect their newly achieved awareness of the current state of the sciences. These preferences are further modified in order to meet the requirements of mutual engagement. Mutual engagement requires that each of the representatives be determined to accommodate the preferences of the others as far as possible and to avoid outcomes that leave some of the other representatives completely unsatisfied. At the third stage of the discussion, one of three possible outcomes obtains. The best outcome is for all deliberators to reach a plan that all perceive as best. The second best is for each deliberator to specify a set of plans that she considers acceptable, and for the intersection of these sets to be nonempty. The third outcome obtains when the choice is made by majority vote because there is no plan acceptable to all.
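The logic of that third stage can be rendered schematically. The sketch below is my own illustrative rendering, not Kitcher's formalization; the function name, the data structures, and the toy input are assumptions made for the example. It simply encodes the fallback from a unanimously best plan, to a plan in the nonempty intersection of acceptable plans, to a majority vote.

```python
from collections import Counter

def third_stage_outcome(best_plan, acceptable_plans):
    """Pick a research plan from the deliberators' views.

    best_plan:        dict mapping each deliberator to the plan she perceives as best
    acceptable_plans: dict mapping each deliberator to the set of plans she finds acceptable
    """
    # Best outcome: every deliberator perceives the same plan as best.
    if len(set(best_plan.values())) == 1:
        return next(iter(best_plan.values())), "unanimously best plan"

    # Second-best outcome: the intersection of the acceptable sets is nonempty.
    common = set.intersection(*acceptable_plans.values())
    if common:
        return sorted(common)[0], "plan acceptable to all"

    # Third outcome: no plan is acceptable to all, so the choice is made by majority vote.
    winner, _ = Counter(best_plan.values()).most_common(1)[0]
    return winner, "majority vote"

# Toy input: three deliberators, two candidate research plans.
best = {"A": "rational choice models", "B": "barriers to price adjustment", "C": "barriers to price adjustment"}
acceptable = {"A": {"rational choice models"}, "B": {"barriers to price adjustment"}, "C": {"barriers to price adjustment"}}
print(third_stage_outcome(best, acceptable))  # ('barriers to price adjustment', 'majority vote')
```

Nothing in the sketch captures the substantive requirement of mutual engagement; it only makes explicit how the three outcomes are ordered.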

In the context of certification, the ideal of well-orderedness implies that the acceptance of research results as providing evidence for scientific claims results from applications of methods that (a) an ideal deliberation endorses as reliable, and that (b) conform to the ideal of transparency. According to Kitcher (2011, p. 148), methods qualify as reliable if they give "rise to conclusions that are true enough, at a frequency that is high enough." His idea that methods need to be endorsed in an ideal deliberation is motivated by the observation that there might be "instances in which a particular ideological agenda or pervasive prejudice inclines members of a scientific subcommunity to favor particular hypotheses, to overrate certain kinds of evidence, or to manufacture evidence" (Kitcher 2011, p. 141). Kitcher (2011, p. 139) points out that philosophers like Feyerabend respond to that observation by proposing to introduce democracy directly into the process of certification, and that philosophers and scientists who defend the autonomy of science reject that proposal for fear of "mob rule." But Kitcher (2011, p. 140) also argues that "[f]ear of the tyranny of ignorance should not blind us to the possibility of a tyranny of unwarranted expertise," and that there is a third way to respond to the observation: a circle of experts that is augmented by small groups of informed outsiders should decide in an ideal deliberation whether to accept or reject methods as reliable.

According to Kitcher (2011, p. 151), methods conform to the ideal of transparency if they "accord with the ideas about proper acceptance or rejection current within the larger society." He emphasizes that ideal transparency requires that really "all people, outsiders as well as researchers, can recognize the methods […] used in certification […] and can accept those methods." He concedes that this requirement is extremely strong. But he also explains why the requirement is important:

Were you to come to believe that, in some areas of inquiry, the methods […] used in certifying submissions do not accord with the standards you view as appropriate in your own activities of belief acquisition, your confidence in the system, at least with respect to the areas in question, would be undermined. You want it to be the case that you could, in principle, probe any part of the public system, and that, were you to do so, you would disclose processes of certification that conform to your standards (Kitcher 2011, pp. 151–2).

The requirement is important because everyone wants to be able to probe any part of the public system at least in principle, and because no one is special: "You are not special. Public knowledge is set up for everyone and should therefore satisfy the same condition for all."

Kitcher (2011, pp. 152–3) points out that there are four possible constellations in which the two ideals (well-ordered certification and ideal transparency) might or might not be satisfied. The "happy state" is the constellation in which both are satisfied. In that state "matters are as good as they could possibly be." If only well-ordered certification is satisfied and not the ideal of transparency, then "public knowledge does well at storing up significant truths, but […] can no longer play its proper role […] in guiding public policy. Scientific authority is eroded." If only the ideal of transparency is satisfied but not well-ordered certification, then "public confidence in science is high," even though "the certification procedures used in building up public knowledge accept and reject submissions in unreliable ways, so […] the policies flowing from them are not altogether successful." If, finally, both well-ordered certification and the ideal of transparency are unsatisfied, then the "certification procedures within public knowledge are unreliable, and at variance with the standards of many people." According to Kitcher (2011, p. 153), it "is not impossible that humanity has spent most of its existence in this last state – and that, with respect to some potentially important areas of investigation, we remain in it." But he also says that we might be in the second or third state, and that we are definitely not in the happy first state.

In the context of application, the ideal of well-orderedness implies that the application of public knowledge to solve urgent problems is the topic of "an ideal discussion under conditions of mutual engagement at the time and in the circumstances when the knowledge for the application becomes available" (Kitcher 2011, p. 170). Kitcher emphasizes the time and the circumstances because both the overall state of public knowledge and the environment in which the problems originally arise can change. He points out that there are two potential sources of complications that an ideal discussion under conditions of mutual engagement has to deal with when trying to reach an agreement about the application of public knowledge (cf. Kitcher 2011, pp. 170–1). The first complication obtains when public knowledge is insufficient to solve a specific problem, and when there is a consensus about the urgency of the problem, a consensus about the non-urgency of the problem, or a debate about the urgency of the problem. The second complication obtains when people outside the scientific community or outside a small subfield within the scientific community cannot understand (or understand as correct) crucial parts of the information pertaining to a specific problem, to its urgency, or to the need to postpone its solution. Both complications frequently interact with one another (cf. Kitcher 2011, pp. 171–2). In biomedical research, for instance, scientists often agree that the insufficiency of knowledge requires the solution of a problem to be postponed, while patients (the prime beneficiaries of a solution to that problem) often lack access to the information pertaining to that postponement. Similarly, climatologists seem to be in agreement about the urgency of a problem, while many people lack access to the information pertaining to that urgency.

9.5 Well-Ordered Macroeconomics

Kitcher (2011, p. 115) argues that "[u]nderstanding an ideal […] can sometimes help us to improve our practice." Does understanding the ideal of well-orderedness help us to improve the scientific practice of macroeconomic policy analysis? I think that the answer is positive: no matter where we stand in macroeconomics, understanding the ideal of the well-orderedness of macroeconomics helps us to improve the scientific practice of macroeconomic policy analysis. I also think that the improvement of scientific practice is an improvement in the sense of an approximation of the ideal of scientific objectivity. Thus no matter where we stand in macroeconomics, understanding the ideal of well-orderedness will help us to identify clear-cut steps that macroeconomists would need to take in order to approximate the ideal of scientific objectivity.

Just where we stand in macroeconomics is, of course, a matter of some controversy, and I'm not going to tie myself down to any definite judgment. I would instead like to describe two scenarios that are unlikely to reflect the current state of macroeconomic policy analysis. The true current state probably lies somewhere in between the two scenarios or close to one of them.

My only purpose in describing the two scenarios is the identification of clear-cut steps that macroeconomists can take in order to approximate the ideal of scientific objectivity.

The first scenario is one in which a circle of macroeconomic experts convenes to decide on questions of significance (what problems should research be conducted on?), certification (which research results should be accepted as providing evidence for scientific claims?), and application (can public knowledge be used to solve urgent problems?). Fortunately, these experts largely agree on what has been accomplished so far. They all believe that macroeconomic policy analysis relies on firm microfoundations. They disagree a little on the exact shape of these microfoundations: while some believe that they are those of Walrasian general-equilibrium theory, others believe that they are those of Marshallian general-equilibrium theory (cf. Section 7.5 of Chapter 7). But all experts agree that the program of microfoundations doesn't require any major reorientation.

In this scenario, the experts voice preferences that reflect their respective views of what has been accomplished so far. Some prefer research to be conducted on rational choice models of expectations or household demand, on calibrations of real-business-cycle (RBC) models etc. Others prefer research to be conducted on barriers to price adjustment, inflation inertia etc. But there is no one who prefers research to be conducted on problems in behavioral economics or neuroeconomics, or on the development of macro-econometric testing procedures and their application to relations of direct type-level causation that are believed to obtain between macroeconomic aggregates. All experts mutually engage to a certain extent: they are prepared to modify (some of) their preferences to accommodate the preferences of the other experts, and to avoid outcomes that leave some of the other experts completely unsatisfied. They then agree to conduct research according to a set of plans that in the eyes of each expert contains at least one plan that she finds acceptable: they agree to conduct research, say, on rational choice models of expectations and barriers to price adjustment. Perhaps the mutual engagement of the experts becomes manifest in the operations of national research agencies that watch over this second-best outcome of the discussion on questions of significance in macroeconomic policy analysis.

In the first scenario, both the ideal of well-ordered certification and the ideal of transparency are largely satisfied. The methods that the experts use to infer evidence in support of causal hypotheses (hypotheses about relations of direct type-level causation between macroeconomic aggregates) are methods of identifying model parameters in microeconomic theory, and these methods are held to be reliable: they are believed to give rise to conclusions that are true enough at a frequency that is high enough (where the conclusions in question are of course causal hypotheses). And the macroeconomic policies that are based on these conclusions often succeed. Outsiders, moreover, put a lot of trust in the system of macroeconomic knowledge. They tend to understand and accept the inference methods that the experts use to infer evidence in support of their causal hypotheses.

In the first scenario, the information needed to solve macroeconomic problems simply coincides with the system of macroeconomic knowledge.

Outsiders agree with the experts about the degrees of urgency with which macroeconomic problems require attention: while cases of real-economy crises or rising inflation require immediate action, projects like major tax reforms may be postponed until better information pertaining to them becomes available. And outsiders typically appreciate the solutions that the experts recommend on the basis of their causal hypotheses. Since outsiders tend to understand and accept the inference methods that the experts use to infer evidence in support of their causal hypotheses, they have no reason to question the adequacy of the experts' policy recommendations.

This first scenario reflects a state of macroeconomic policy analysis that is arguably quite desirable: experts mutually engage, there is a secure system of macroeconomic knowledge, macroeconomic policies often succeed, and the scientific authority of the macroeconomic discipline is in good shape. Note, however, that there is still room for improvement. Steps that macroeconomists could take in order to further approximate the ideal of well-ordered macroeconomic policy analysis include an intensified effort to gain a clearer sense of what has been accomplished so far, a stronger commitment to mutual engagement, and a further improvement of macroeconomic policies. It is true that the experts already largely agree on what has been accomplished so far. But if the microfoundations of macroeconomics are either those of Walrasian general-equilibrium theory or those of Marshallian general-equilibrium theory, then the experts would get an even clearer sense of what has been accomplished so far if they found out, for instance, that the microfoundations of macroeconomics are indeed those of Marshallian general-equilibrium theory. If they got an even clearer sense of what has been accomplished so far, they could reach a plan that all experts perceive as best (the best possible outcome of an ideal discussion of questions of significance): e.g., a plan according to which research is no longer to be conducted on rational choice models etc. but only on barriers to price adjustment etc. If they got an even clearer sense of what has been accomplished so far, they could also improve macroeconomic policies: policies would become even more successful because they would no longer rely on conclusions that derive from the Walrasian 'misconception' of the microfoundations of macroeconomics.

The second scenario is one in which a circle of experts convenes to decide on questions of significance, certification, and application. Unfortunately, the experts do not agree on what has been accomplished so far. Let's say there are roughly four competing groups. Experts in the first two groups believe that macroeconomics relies on sound microfoundations: while experts in the first group maintain that these microfoundations are those of Walrasian general-equilibrium theory, experts in the second group hold that they are those of Marshallian general-equilibrium theory. Experts in the third group believe that the program of microfoundations requires substantial reorientation. And experts in the fourth group believe that the program of microfoundations is misguided and that the relations of causal dependence that they think obtain between macroeconomic aggregates can be investigated independently of any microfoundations.

In this scenario, experts also voice preferences that reflect exclusively their respective views of what has been accomplished so far.

Experts in the first group prefer research to be conducted on rational choice models of expectations or household demand, on calibrations of RBC models etc. Experts in the second group prefer empirical research to be conducted on barriers to price adjustment, inflation inertia etc. Experts in the third group prefer research to be conducted on problems in behavioral and experimental economics, or on the behavior of macroeconomic aggregates in phase transitions. And experts in the fourth group prefer research to be conducted on the development of macro-econometric testing procedures and their application to the relations of probabilistic or causal dependence that are believed to obtain between macroeconomic aggregates. Unlike their counterparts in the first scenario, the experts in the second scenario even fail to mutually engage: they are not prepared to modify any of their preferences to accommodate the preferences of the other deliberators, and they don't mind putting up with outcomes that leave some of the other deliberators completely unsatisfied. Questions of significance are decided by majority vote. Perhaps the experts in the first two groups forge an alliance against the experts in the remaining two groups. And perhaps the majority vote is expressed by the policy of the most important journals to favor the methods applied by the experts in the first and second groups.

The second scenario further differs from the first in that neither the ideal of well-ordered certification nor the ideal of transparency is satisfied. The methods that macroeconomists use to infer evidence in support of causal hypotheses (methods of identifying model parameters in microeconomic theory or empirical procedures of causal inference) are unreliable: they do not give rise to conclusions that are true enough at a frequency that is high enough (where the conclusions in question are again causal hypotheses). Macroeconomic policies often fail: nominal interest rate cuts lead to more deflation, expansive fiscal policies increase unemployment, quantitative easing leads to higher inflation and lower output, and so on. And outsiders have no trust in the system of 'knowledge' of the experts. The causal inference methods that the experts use are either opaque to them or at variance with the standards of at least some of them.

The second scenario also differs from the first in that the system of macroeconomic 'knowledge' does not necessarily coincide with the information needed to solve macroeconomic problems. In the second scenario, the information needed to solve these problems is insufficient, and outsiders lack access to that information. As in the first scenario, experts and outsiders tend to agree about the degrees of urgency with which macroeconomic problems require attention. But unlike their counterparts in the first scenario, outsiders now tend to favor solutions that differ from the solutions that macroeconomists recommend on the basis of their causal hypotheses. Perhaps large numbers of them are opposed to policies of quantitative easing (of large-scale purchases of mortgage-backed securities by the central bank), or of bailing out banks that are said to be too big to fail. Perhaps many of them favor expansive fiscal policies over policies of fiscal restraint. Perhaps some of them believe that the current Fed policy of steadily raising the federal funds rate won't be able to bring down the rate of inflation significantly. And so on.

What are the steps that the experts in this scenario need to take in order to approximate the ideal of well-ordered macroeconomic policy analysis? The decisive first step is to form a circle of experts that is augmented by small groups of informed outsiders. When this step is taken, three further steps will naturally follow: deliberators in the augmented circle of experts will gain a clearer sense of what has been accomplished so far; they will mutually engage; and the trust of citizens in the system of macroeconomic knowledge will be restored. In order for the trust of citizens to be restored, certification procedures do not necessarily need to be reliable (recall from the analysis of the previous section that the ideal of transparency can be attained in the absence of well-ordered certification). It goes without saying, however, that the trust of citizens increases with the reliability of these procedures.

What are the qualities of the informed outsiders by whom the circle of experts is to be augmented? The outsiders need to be informed in the sense that they perform competently with respect to all the theories that the experts of the four groups are primarily concerned with: new classical and new Keynesian macroeconomics, behavioral and experimental economics, and macro-econometric causality. With respect to these theories, they need to be competent non-experts in the sense of Section 8.3 of Chapter 8: they need to fully understand their respective textbook expositions, their basic mathematics, and the most important problems that might get in the way of their successful application. Further qualities derive from the nature of the four steps that need to be taken. If the experts disagree on what has been accomplished so far, they will get a clearer sense of what has been accomplished so far when the informed outsiders share a lack of investment in winning the fight. If the outcome of their ideal deliberation on questions of significance is to be at least second best, the informed outsiders should, moreover, have an understanding of the non-scientific values that might drive the experts to push their preferences. That understanding may help them to weigh the preferences that experts voice, and that need to be taken into account when questions of significance are decided. If mutual engagement and a clearer sense of what has been accomplished so far are to result in more reliable certification procedures, then the informed outsiders should also be able to think of methods that accord with the ideas about proper acceptance current within the larger society.

Informed outsiders who have an understanding of the non-scientific values that might drive the experts to push their preferences combine competence in economics with competence or expertise in history and philosophy. Marx, Schumpeter, Hausman, and McPherson have that understanding and combine competence or expertise in economics with competence or expertise in history and philosophy (cf. Section 7.5 of Chapter 7). Informed outsiders who can think of methods that accord with the ideas about proper acceptance current within the larger society combine competence in economics with expertise in philosophy. Causal theorists have a clear understanding of the limitations of the methods that macroeconomists use to derive evidence in support of their causal hypotheses.

Philosophers also understand that the methods that accord with the ideas about proper acceptance current within the larger society do not necessarily coincide with methods that aim to provide conclusive empirical evidence in support of causal hypotheses in macroeconomics. They understand that these methods might also coincide with Myrdal's method of combining statistical evidence of a correlation between X and Y with an explicit value premise to derive a causal hypothesis.

If the information that experts use to solve problems of macroeconomic policy analysis is to be sufficient, then macroeconomists need to become experts in Dreyfus's sense: they need to combine expertise in macroeconomics with sufficient work experience in central banks or other policymaking institutions (cf. Section 8.3 of Chapter 8). If they manage to become experts in Dreyfus's sense, then the information they use to solve macroeconomic problems will be partly propositional and partly non-propositional: its propositional part will consist of conclusions derived from theory, natural experiments, and econometric tests, and its non-propositional part will rely on expert intuition. But it will be sufficient in the sense that it gives rise to causal models that adequately capture the situation at hand. The problem remains that in the augmented circle of experts, nobody knows whether any of these models really captures the situation at hand. But the problem can be solved if certification procedures (the methods that macroeconomists use to infer evidence in support of causal hypotheses) become more reliable: if they give rise to conclusions that are true enough at a frequency that is high enough. In Section 3.6 of Chapter 3, I suggested that certification procedures become more reliable if macroeconomists implement the program of empirical microfoundations: if they use the results of empirical disciplines like econometrics and behavioral or experimental economics to model the chains of relations of causal dependence and supervenience that policymaking institutions can exploit, and if they pay special attention to the study of the formation of expectations. But ultimately, the development of more reliable certification procedures will depend on the individuals who make up the augmented circle of experts: on their mutual engagement, and on the convergence of their respective views of what has been accomplished so far.

In conclusion, one may say that the augmented circle of experts is to be made up of macroeconomists with sufficient work experience in central banks or other policymaking institutions and of informed outsiders with expertise in history and philosophy. These macroeconomists and informed outsiders need to be prepared to mutually engage and to consider opposing views of what has been accomplished so far. It is to be conceded that this conclusion derives from considerations that are somewhat speculative. But perhaps it helps to see that the augmented circle of experts combines the "rare gifts" of the "master-economist" that John Maynard Keynes (1925, p. 12) describes when saying that "the master-economist must possess a rare combination of gifts. He must reach a high standard in several different directions and must combine talents not often found together. He must be mathematician, historian, statesman and philosopher – in some degree. He must understand symbols and speak in words."

Note

1 Details that are interesting but irrelevant for the purposes of the present chapter relate to the idea that the social sciences have a capacity for self-healing, i.e., a capacity that might lead to an at least partial purge of biases in the social sciences (cf. Myrdal 1970, pp. 35–8, 40, 43), and to the idea that there might be value premises (the moral principles of respect for life and egalitarianism) that cannot be subjected to moral criticism but also do not have much significance as valuations that operate in the formation of economic policies.

References

Dupré, J. (2007). "Fact and Value." In Kincaid, H. (ed.), Value-Free Science: Ideal or Illusion? Oxford: OUP, 27–41.
Hands, D. W. (1985). "Popper and Economic Methodology: A New Look." Economics and Philosophy 1(1), 83–99.
Keynes, J. M. (1925). "Alfred Marshall, 1842–1924." In Pigou, A. C. (ed.), Memorials of Alfred Marshall. London: Macmillan.
Kitcher, P. (2011). Science in a Democratic Society. Amherst, NY: Prometheus Books.
Myrdal, G. (1970). Objectivity in Social Research. London: Duckworth.
Popper, K. R. (1936/1959). The Logic of Scientific Discovery. Oxford/New York: Routledge.
Popper, K. R. (1957/1984). "The Aim of Science." In Miller, D. W. (ed.), Popper Selections. Princeton: PUP, 162–70.
Popper, K. R. (1967a). "The Logic of the Social Sciences." In Adorno, T. W. (ed.), The Positivist Dispute in German Sociology. New York: Harper and Row, 87–104.
Popper, K. R. (1967b/1984). "The Rationality Principle." In Miller, D. W. (ed.), Popper Selections. Princeton: PUP, 357–66.
Woodward, J. and Hitchcock, C. (2003). "Explanatory Generalizations, Part I: A Counterfactual Account." Nous 37(1), 1–24.

Index

Note: Page numbers followed by "n" denote endnotes.

abduction 133 activation programs 70–1 adequacy: empirical 147; statistical 86, 122–3, 128 Afrouzi, H. 66n16 aggregate: demand 6–7, 23, 41, 45, 48, 50, 55, 59, 72–3; macroeconomic 4, 8, 41–3, 45, 47, 49–66, 111, 116, 129–30, 149, 164; employment 41, 43; expectations 6, 41, 43, 45, 52–3, 57–9, 130 AK test 70, 79, 85–91 Akerlof, G. 51 Allais, M. 136 Angrist, J. 5, 8, 13–14, 20–1, 23–4, 33–5, 38, 69–70, 85–8, 92n12 antirealism 115–16, 133 Anufriev, M. 58 application 7, 10–11, 13–14, 105–6, 112–13, 134–5, 142, 149–50, 176, 184, 186 approximation 42, 48, 135–6 Aristoteles 5, 94 Arthur, W. B. 50 Austrian School 11, 119, 121, 152, 160n6 back-door criterion 33–6, 38, 87–90, 92n16 baseline Romer model 86–8 Bayes net: causal 5, 9, 69, 94, 100–2, 107n6 Bayes theorem 100 Beechey, M. 66n16 Berg, G. J. v. d. 13, 91n2 Bernanke, B. 3–4 Bessler, D. 100 bias 11, 36, 70–1, 87, 89, 127, 145, 160n8, 171, 180–2, 192n1

Brown, M. xvi Buchleitner, A. xvi Buiter, W. 14 Burns, A. 123–4, 127 Cartwright, N. 5, 15n1, 20, 27–8, 32, 97, 107n1, 149 causal: asymmetry 92n7, 95, 97; Bayes net (see Bayes net); dependence (see type-level causation); evidence 4–7, 69, 79; graph 25, 32, 34, 36, 39n11, 82, 87–8, 101; hypothesis 4, 10, 74, 130, 151–5, 160n7, 191; inference (see inference); knowledge (see knowledge); law 26, 100; model 3–4, 6–7, 9, 19, 21–3, 29, 33, 36, 57, 82, 85, 92n7, 111–13, 115–16, 123, 130–1, 133–5, 137–8, 141–6, 153, 162–4, 167–75, 179, 183, 191; parent 21, 29, 37–8, 101 Causal Markov Condition 101–3 causality: downward (or top-down) 42, 61, 63, 129; and counterfactuals 104; genuine 95–7; Granger (see Granger causality); interventionist account of 5, 20–1, 24–9, 35, 37, 84, 92n11; in Keynesian econometrics 49; macroeconomic (see macroeconomic); as privileged parameterization 29–33, 38, 92n7; and probability 9, 94–107; reverse 75–9, 182, token-level 21–2, 25, 37, 77–8, 163, 168, type-level 21–2, 24–38, 42–3, 47, 55, 58–60, 69–79, 81–6, 90–1, 91n3, 92n7, 92n9, 94, 107n2, 111, 113, 115–16, 131, 133, 137, 141, 144, 149–53,

194 Index 162, 164, 167, 170, 172–3, 175, 179, 182–3, 187 causation see causality cause: common 9, 77, 92n6, 102, 105–7, 116; prima facie 95–6; negative 96; spurious (see spuriousness); sufficient 96 CC see Cowles Commission central bank 1–2, 23, 42, 48, 50, 55, 62, 72–3, 129, 155, 167–8, 189, 191 certification 7, 11, 113, 176, 184–91 Chakravartty, A. xvi Chari, V. 154 Cheung, Y.-W. 9, 107 Christiano, L. 3–4, 39n6, 65n8, 66n12, 153 circularity: definitional 38, 97–8 Cochrane, J. 170–2 cointegration: of time series 107 collective intentionality see intentionality common cause principle 9, 92n6, 106–7, 116 complex system 46–57, 65n3 complexity: science 65n3 confirmation 126, 150, 180 confounding 10, 15, 24, 77, 85–7, 90 conservatism 10, 141–2, 147–55, 159 consistency 99, 141–2, 147–9, 151 constant-share model 82 correlation 9, 46–7, 63, 78, 99, 106–7, 116, 182–3, 191 counterfactual 104 Cowles Commission (CC) 124, 127 Crease, R. 169 critical method 176–9 DAG see directed acyclic graph Daston, L. 9, 111–12, 116–17, 143–4, 162–5 De Vroey, M. 128, 152 Debreu, G. 50 deduction: method of 118, 120–1 deductive-nomological (DN) account of scientific explanation 179 DeLong, B. 14, 170, 172 Demiralp, S. 100 demand: aggregate (see aggregate); expectations 8, 41, 45, 59, 64, 73, 77, 82 democracy 162, 172–3, 183, 185 democratic organization of scientific practice 7, 11, 175–7, 180, 183 democratic society: macroeconomics in a 175–91

determinism 22, 99, 125 directed acyclic graph (DAG) 100–1 DN account see deductive-nomological account of scientific explanation Dornbusch, R. 75 Dosi, G. 61–4 Douglas, H. 10, 142 Dreyfus, H. L. 10, 162, 166–8, 169–70, 172–4, 191 Dreyfus, S. 10, 166–8 DSGE model see dynamic stochastic general-equilibrium model Dupré, J. 182 dynamic stochastic general-equilibrium (DSGE) model 23, 26–8, 30, 39n6, 41–3, 47–51, 59–60, 65n8, 65n9, 65n19, 72, 115, 127–9, 136–7, 167 ECB see European Central Bank 1–3, 44 econometrics; and causality 14, 61, 137; and Granger causality 97; Keynesian 14; time series 97 economics: behavioral 61, 128, 137, 187, 189–91; experimental 6–7, 11; 61, 79, 137, 189–91; labor 13, 72, 150; welfare 119, 152; see also macroeconomics; microeconomics Eichenbaum, M. 153 emergence 41, 43, 45, 47, 54, 65n3, 66n15, 112 empirical: evidence 3, 6–7, 10, 12, 50, 56–9, 60, 63, 69, 111–13, 115, 141, 145, 148–51, 154, 173, 191; success 43, 53–5, 58–9 empirical underdetermination: argument from 10, 142, 146–9; non-sporadic 141, 147, 149, 151, 155, 162, 173, 175 employment 2–3, 41, 43, 63, 134, 167 Engels, F. 157 Engle, R.F. 91n1 Epstein, B. 66n13 equilibrium: dynamic 8, 48, 50, 164; see also non-equilibrium Euler equation 48 Evans, C. 153 evidence: causal (see causal evidence); conclusive 6–9, 13, 58–9, 64, 79, 91, 113, 119, 130, 146, 150, 164, 191; empirical (see empirical: evidence) European Central Bank (ECB) 1–3, 44

Index  195 exogeneity 20, 47–8, 50, 91n1; see also super-exogeneity expectations: adaptive 57, 64; individual 6–7, 41, 43, 45, 52n2, 54, 57–61, 64–5, 73, 79, 113; manipulability of 43, 54, 56; rational 22–3, 26–7, 30– 1, 57–8, 64, 73, 75–6, 78–9, 167 experiment; crucial 118–19; laboratory 57, 65; natural 5, 10, 13–14, 69–70, 74, 77–9, 94, 154, 162, 168, 170–1, 191; simulated 62, 64 expertise: macroeconomic 7, 162, 166, 175, 178, 183, 191; and democracy 173, 185 explanation 54, 57, 76, 99–100, 109, 130, 155–6, 178–9 explanatory power 149, 151, 177

Great Depression 76, 134 gross domestic product (GDP): nominal 22–3, 44–5, 53–4; real 13, 22–4, 34, 39n3, 44–5, 53, 85–8, 132 gross national product (GNP) 82–3, 98–9, 103, 107n3 group interests 7, 10, 112, 142, 155–9, 171, 182 Guerini, M. 63

Faber, M. xvi Faithfulness Condition 101–2 Falsification 178–9 Fed see US Federal Reserve System federal funds rate 2, 4, 13, 19, 23–4, 34, 54, 83, 85–8, 104, 189 Federal Open Markets Committee (FOMC) 1–2, 85–8 Feld, L. xvi Fernández-Villaverde, J. 153–4 FOMC see Federal Open Markets Committee Friedman, M. 13, 70, 90, 100, 182 fruitfulness 147–9, 151, 154–5, 177

Haavelmo, T. 5, 94, 122–4, 129, 160n4, 164 Hacking, I. 56 Hands, D. 178 Hausman, D. 10, 138n2, 142, 158, 190 Hendry, D. 9, 20, 91n1, 106, 124 Henschen, T. 10, 39n8, 142 heterogeneity 41–2, 46–7, 50–1, 56, 60–1, 66n12, 128, 148, 150, 160n4, 167 Hitchcock, C. 26, 151, 179 Hommes, C. 58 Homogeneity 8, 42, 50–1, 65n11, 164 Hooker, C. 65n3 Hoover test 70, 79–4, 90 Hoover, K. 4–5, 8–9, 12–13, 20–2, 27–33, 35, 38, 39n1, 39n3, 39n4, 42–3, 51–60, 66n15, 69–70, 74, 79–84, 90, 91n1, 91n4, 92n7, 92n10, 98, 100, 102–7, 107n1, 107n9, 123, 128–9, 132, 136–7, 138n1 Hume, D. 5, 94–5, 97, 153–4, 160n5 Humphries, A. xvii Hüttemann, A. xvi

Galison, P. 9, 111–12, 116–17, 141–4, 162–5 GDP see gross domestic product general price level 22, 43–6, 53–4, 65n1, 131–3 general-equilibrium theory: Marshallian 128, 142, 149, 152–3, 155, 159n3, 160n6, 187–8; Walrasian 10, 39n1, 128, 142, 149, 152, 154–5, 158, 159n3, 160n6, 187–8 Gennaioli, N. 58 German Historical School 120–1 Glymour, C. 5, 38, 39n11, 100, 102–3 GNP see gross national product Goldman, A. 10–11, 171–2 government spending 13, 54, 80, 82–3 Granger causality 5, 9, 20, 94, 97, 99, 103–5, 179; test 69, 98 Granger, C. 4–5, 9, 20, 94, 97, 100, 103 Graversen, B. 13

ideal: of transparency 7, 11, 176, 184, 185, 187, 189–90; value-free 10; of well-orderedness 11, 176, 184, 186; of well-ordered certification 185, 187, 189–90 idealization 43, 53–5, 58–9, 123, 133, 135–6, 138n1 ideal gas law 47 ideology 145, 157–8, 182–3 inconclusiveness: of causal evidence 6, 19, 79, 150–1 independence: conditional 39n7, 86, 92n12, 101; ontological 52; probabilistic 9, 101; value 7, 9, 11, 113, 141–2, 175, 179, 184 induction 10, 150 inductive risk 10, 142 inference: abductive 133; causal 4–6, 8–9, 11–14, 59–61, 63, 65, 79, 90, 111, 113, 128–30, 133, 136, 138, 141,

196 Index 146, 164, 174, 189; deductive 179; from value premises 180; statistical 130 inflation: expectations 6, 8, 22, 41, 45, 53–4, 56, 59–60, 66n16, 73, 83, 90, 131 informed outsiders 174, 183, 185, 190–1 initial conditions 179 instrumental-variable (IV) method 6, 8–9, 58, 60, 69–70, 77, 79, 90, 150 instrumentalism 116, 129–30, 136, 138 intentionality: collective 52 interest rate: nominal 6, 22–3, 45, 48–50, 53, 72–3, 80, 94, 99, 167, 189; real 3–4, 22–3, 39n3, 44–5, 49, 54, 72, 159 inter-subjectivity 24 intervention: controlled 72–8; experimental 77; human 25, 72–4, 76, 78–9, 90, 91; possible 8, 12, 25–8, 31, 37, 71, 111; on parameters 20, 28–32, 38; variable (see variable); on variables 28, 31–2, 37–8 interventionist account of causality 5, 8, 12, 15, 20–1, 24, 35, 37, 84, 92n11, 94 intuition: expert 7, 10, 113, 163–4, 168, 170, 173, 191 invariance 26, 27, 42, 46–7, 55, 77, 92n7, 130–1; to interventions 130 IS-LM model 134–5 isomorphism 61 Jacobs, R. 104 Johansen, S. 9, 107 Keynes-and-Schumpeter (K+S) model 62–4 Kant, I. 148 Keho, P. 154 Kepler, J. 120–1, 123–5, 128 Keynes, J. M. 158, 168, 191 Keynesianism 152 Khalifa, K. xvi King, R. 75–8 Kirman, A. 50–1 Kitcher, P. 7, 11, 113, 175–6, 183–6 Klaauw B. v. d. 13, 91n2 knowledge: causal xv, 90; nonpropositional 10, 163, 166, 168, 170, 173, 191; propositional 10, 162; public 7, 113, 176, 184–7; scientific vi, xv; secure 115, 137 Koopmans, T. 124–7, 164

Krugman, P. 14, 170, 172 Kuersteiner, G. 5, 8, 13, 20, 23–4, 33–5, 38, 39n7, 69–70, 85–8, 92n12, 92n14 Kuhn, T. 147, 154 Kydland, F. 65n9 Ladyman, J. 65n3, 65n4, 65n7 Lahiri, K. 105 Lai, K. 9, 107 Laspeyres index 44, 132 Laudan, L. 133 law: deterministic 99; natural 156–7; physical 25; scaling 46–7, 63; scientific 15n1, 107n1; stochastic 99; universal 179; see also causal law Leamer, E. 14–15, 104 Lee, S. 100 Lehman Brothers Inc. 1 linearity 23, 27–8, 30–1, 38, 42, 48, 107 List, C. 66n14 Longino, H. E. 10–11, 142, 146–9, 151, 159n2, 160n4 LSE methodology 80, 82, 105–6, 160n4 Lucas, R. E. 14, 49, 57, 60, 83, 92n9, 154, 170–2 Lucas critique 6, 49, 59, 65n9, 77–8, 130, 150 Machery, E. xvi macroeconomic: aggregate 4, 8, 19, 21, 41–3, 45, 47, 49–56, 59–60, 62–4, 66n13, 66n15, 69, 77, 111, 116, 129–30, 146, 149, 164, 179, 187–9; causality 8, 20, 24, 29, 33, 37, 92n16, 94; methodology 11–12; models 7, 9, 38, 42, 141, 162; policy decisions 2–5, 7, 19, 21–3, 111, 113, 167–8; theory 12 macroeconomics; agent-based 61, 128, 130, 138; Keynesian 129, 134–5, 142, 153–5, 179, 190; new classical 78, 129, 134–5, 142, 149, 154–5, 159, 170, 181, 190; see also aggregate; causality; microfoundations; new Keynesian macroeconomics; policy Maddala, G.S. 105 Mäki, U. 116, 131–4, 138 Manipulability 43, 54, 56 Manski, C. 57 Mantel, R. 50 Marshall, A. 153; see also generalequilibrium theory: Marshallian

Index  197 Marx, K. 10, 142, 156–8, 190 May, K. 136 Mayo, D. 106 McGrattan, E. 154 McPherson, M. 10, 142, 158, 190 measurement-without-theory debate 116, 122, 125, 127 mechanism: monetary transmission 55, 153; price 8, 42, 50, 55, 153 Menger, C. 119–21, 164 meta-expert 11, 171 metaphysics 4 methodology: descriptive 13; economic 11–13, 135; prescriptive 13 microeconomic quantities 8, 41–3, 46–7, 51–5, 60–2, 64, 66n13, 129 microeconomics 13–14, 69, 90–1, 129 microfoundations: empirical 43, 54, 59, 61–3, 137, 151, 179; program of 8, 43, 54, 59–64, 129, 131, 137, 179, 181–2, 187–8 Miller, D. 137 Minimality Condition 101–2 Mitchell, W. 123–4, 127 modality 25, 79 modularity 26–8, 32–3, 35 Moneta, A. 63 monetarism 158 monetary changes 13, 70, 74–7, 154, 182–3 monetary policy 2, 4, 20, 23, 72–8, 88, 128, 134; rule 22–3, 49, 77 money-price model 23, 27, 30–1 money stock 22, 23, 41, 44, 53, 55, 74–5, 80, 91n5 Morgan, M. 124 Morgenstern, O. 138n3 Motto, R. 3–4, 39n6, 65n8, 153 Müller, T. xvi Muth, J. 57 Myrdal, G. 175–6, 180–3, 191, 192n1 National Bureau of Economic Research (NBER) 124, 128 Necessity 45, 54, 60, 153, 157, 159, 178, 180 Neumann, J. v. 138n3 new Keynesian: IS curve 48, 50, 59; macroeconomics 129, 134–5, 142, 149, 153–5, 181, 190; Phillips curve 48, 59 Newton, I. 117–24, 127–8 Newtonian mechanics 111, 117 Nihilism 6

no-miracles argument 133 nomological 5, 15n1, 107n1, 179 non-anthropomorphism 25, 30 non-equilibrium: dynamic 8, 46–7, 56, 65n4, 65n7, 128, 167; see also equilibrium nonlinearity 27–8, 30–1, 38 Norton, J. 10, 107n5, 136, 147, 150, 159n3 novelty 53–5, 142, 148, 150 Obama, B. 170–1 objectivity: scientific 6, 9, 12, 111–12, 116, 141, 162–3, 166, 168, 172–3, 175– 81, 183, 186–7; see also expertise; truth-to-nature; value independence observation 57–8, 82, 103–4, 118, 122–6, 146–8, 150, 157, 163–5, 184–5 observational equivalence 29, 76, 78–9, 90, 128, 159n3 ontology: of macroeconomic aggregates 8, 41–3, 45, 50–1, 60, 62, 69, 146, 164 Ours, J. C. v. 13 output 4, 6, 13, 22–3, 48–9, 64, 72, 75–6, 80, 118, 134, 152, 167, 189 Paasche index 44, 132 Pagan, A. 9, 107 parameter: causal 21–2, 28–3, 37–8, 133, 187, 189; deep 14, 42, 49, 72; identification 133, 146; -nesting 29–0, 32, 39n10; structural 81 parameterization: privileged 8, 21, 29, 30, 32–3, 38, 92n7, 94 Pareto-efficiency 119, 152, 159 Pareto-optimum 119, 152, 160n3, 182 Pearl, J. 20, 29, 33–6, 79, 81–2, 85, 87–9, 91n1, 91n3, 92n8, 100, 104 pessimistic meta-induction 116, 131, 133– 4, 137, 138 Pettit, P. 66n14 philosophical method 11–13, 118 physics: astro- 167; condensed matter 46, 55–6, 60–1; experimental 119; particle 131; quantum 102 Pischke, J.-S. 14, 39n7, 92n12 Plosser, C. 75–8 Pohlmeier, W. xvi policy: advise 172; analysis 5–7, 9, 11–13, 41, 64, 87, 94, 104, 113, 115–16, 118, 127, 129–31, 135–7, 153–6, 158, 170–6, 178–9, 181–3, 186–91; economic 164–5; fiscal 62, 134–5, 167, 189; innovation 62;

198 Index intervention 31, 42, 49, 59–60, 62, 77, 119, 152, 179; monetary (see monetary policy) policymaking institutions 3–4, 23, 48, 129, 167–8, 191 Popper, K. 91n4, 137, 160n7, 175–9 Post, H. 137 potential outcome: approach to causality 5, 8, 9, 12, 20–1, 33, 35, 38, 69, 90, 92n16, 94; research 58, 60, 128 preferences: revealed 28, 39n9; subjective 146, 149–50 Prescott, E. 57, 65n9 price: mechanism (see mechanism: price); setters 128, 152; takers 152; stability 2–3; stickiness 153; vector 50, 119, 152; see also general price level; money-price model principle of common effect 106–7 principle of common cause 9, 92n6, 106–7, 116 probabilistic independence 9, 101–3 probability: approach to causality 5, 9, 69, 94–5, 100; distribution 24–6, 37, 46–7, 56, 63, 71, 81; model 14, 105–6, 122, 164 Putnam, H. 133 quantitative easing 1, 4, 135, 167, 189 quantity: microeconomic 8, 41–7, 51–5, 60–2, 64, 66n13, 129 Quine, W.V.O. 10, 147, 160n7 randomized controlled trials (RCTs) 13, 69–73, 76, 79, 90–1, 119 rational 51, 57, 169, 173, 177–9; see also rational expectations RBC models 65n9, 187, 189 RCTs see randomized controlled trials realism: minimal 138; scientific 7, 9, 11–12, 113, 115–16, 131, 138, 141, 172, 175, 180, 184 reduction 1, 3–4, 43, 45, 52, 101–2 reductive accounts of causality 38 regression: coefficient 22; equation 98, 104, 127; error 24, 104; studies 14 (see also vector autoregression) Reiss, J. 52, 65n1, 71, 78, 84, 107n9, 130, 132, 135–6, 138, 159n1, 173 representative agent 26, 42–3, 51, 60, 127, 136, 138n1 resilience 7 Ricardo, D. 116–18, 122, 128, 154

Richard, J.F. 5, 100 Romer, C. 76 Romer, D. 76, 154 Rosenberg, A. 13 Rostagno, M. 3–4, 39n6, 65n8, 153 Roventini, A. 61–2, 64 Rudner, R. 10, 142 Sargent, T. 49, 105 Sbordone, A.M. 39n5, 48 Scheines, R. 5, 100, 102–3 Schmoller, G. v. 120–2, 164–5 Schnellenbach, J. xvi Schumpeter, J. 10, 142, 144–5, 156–8, 160n8, 190 Schwartz, A. 13–14, 70, 74–6, 78, 91, 91n5, 154, 182–3 scientific: antirealism 115–16, 133; authority 185, 188; expertise; explanation 179; method 13, 121; objectivity 6, 9, 12, 111–12, 14, 14, 162–3, 166, 168, 170, 172–3, 175–81, 183, 186–7; practice 7, 12, 65n3, 117, 175–8, 183–4, 186; realism (see realism); value (see value) screening off 96, 101–2 Searle, J. 52 selection-on-observables assumption (SOA) 24, 33, 39n7, 85, 92n12, 128 Selinger, E. 169 semantic analysis 5, 12–13 Sen, A. 39n9 Senior, N. 157 Simon, H. 5, 29–30, 32–3, 38, 39n10, 94 Simon’s condition of privileged parametrization 29–30, 32–3, 38 Simon’s hierarchy condition 29–30, 32 Simplicity 10, 141–2, 147–55, 159, 160n4, 177 Simpson’s paradox 96 Sims, C. 14, 98–9, 107n3, 127 Simultaneity 36, 99, 124–5, 134 skepticism: about truth 116, 133, 135, 138 SMD theorem see Sonnenschein-MantelDebreu theorem Smets-Wouters model 154 Smets, F. 3, 4, 39, 65n8, 153–4 Smith, A. 158 SOA see selection-on-observables assumption Sober, E. 106

Index  199 Sonnenschein, H. 50 Sonnenschein-Mantel-Debreu (SMD) theorem 50–1, 60 Spanos, A. 106, 123 Spirtes, P. 5, 9, 100–3 Spohn, W. 92n13, 97, 100, 102–3, 107n6 spuriousness 97–8 steady state 46, 50, 98 see also dynamic equilibrium Stern, R. 65n2 Strong, B. 74–5 structural break 80–1, 83–4, 128 structural vector autoregression (SVAR) 127 stylized facts 63, 64, 128; reproduction of 63 subject: epistemic 24, experimental 71 Summers, L. 14–15 super-exogeneity 20, 91n1 supervenience 41, 43, 45, 52, 54, 61–3, 65n2, 66n15, 128–9, 137, 191 Suppes, P. 5, 9, 94–7, 100 Suppes’ account of genuine causation 96 SVAR see structural vector autoregression tax 13, 66n13, 80, 82–3, 92n9, 188 Taylor rule 63 Teller, P. 135, 137 temporal order 78, 97–8, 100 time series; econometrics 97; nonstationary 9, 106–7, 116 token-level causation see causation trained judgment see expertise Trichet, J.-C. 1, 3 truth: approximate 111, 115–16, 128, 131, 133, 135–8, 175; sufficient 123, 135, 137 truth-to-economy 112, 115–16, 118, 120–4, 127–8, 131, 138 truth-to-nature 9, 111–12, 115–20, 143, 163, 165 Turner, S. 162–3, 172–3 type-level causation see causation Ullian, J. 160n7 underdetermination see empirical underdetermination

unemployment 2, 3, 13, 24, 34, 62, 70, 86, 92, 132, 134, 167 US Federal Reserve System (Fed) 1, 2, 30, 75–6, 78, 83 validity: external 71; simultaneous (of equations) 124–5 value: free ideal 10; independence 7, 9, 11, 113, 141–2, 175, 179–80, 184; judgment 7, 9–10, 112–13, 142, 155–9, 171, 182; non-epistemic 112–13, 180; non-scientific 7, 10–12, 113, 141–2, 146, 149–50, 155–8, 160n8, 174–5, 177–8, 180– 1, 190; scientific 141–2, 146–7, 149, 177 VAR see vector autoregression variable: background 21, 153; confounding 24, 85–7, 90; hidden 6, 59, 77–9, 82–3, 90–1, 98, 106; instrumental 69, 78, 94, 98, 104, 150; intervention 5, 21, 25, 27, 28, 29, 31–2, 35–6, 37–8, 69, 71, 77, 79, 83–4, 90, 111, 150–1; omitted 21, 26, 104, 107n7, 153; structural 21, 153 vector autoregression (VAR) 14, 98, 127; see also structural vector autoregression Vining, R. 124–7 Volcker, P. 76 Walras, L. 119, 126, 154 Walras’ law 50, 65n11 Ward, M. 104 welfare theorem: first 119, 152, 159 well-ordered: science 176, 183–6; macroeconomics 11, 13, 113, 186–90 Welsch, R. 164–5 Woodward, J. 5, 8, 20–1, 24–33, 35, 37–8, 71, 83–4, 89, 91n3, 92n10, 92n11, 92n16, 94, 97, 151, 179 Wouters, R. 3–4, 39n6, 65n8, 153–4 Zellner, A. 94, 99–100 Zuber, C. xvi